Added Eigen results to performance graphs.
Details:
- Updated the Haswell, SkylakeX, and Epyc performance graphs in
  docs/graphs to report on Eigen implementations, where applicable.
  Specifically, Eigen implements all level-3 operations sequentially;
  however, of those operations, it provides a multithreaded
  implementation only for gemm. Thus, mt results for symm/hemm,
  syrk/herk, trmm, and trsm are omitted. Thanks to Sameer Agarwal for
  his help configuring and using Eigen.
- Updated docs/Performance.md to note the new implementation tested.
- CREDITS file update.
CREDITS
@@ -9,6 +9,7 @@ The BLIS framework was primarily authored by

 but many others have contributed code and feedback, including

+Sameer Agarwal @sandwichmaker (Google)
 Murtaza Ali (Texas Instruments)
 Sajid Ali @s-sajid-ali (Northwestern University)
 Erling Andersen @erling-d-andersen
docs/Performance.md
@@ -194,6 +194,13 @@ size of interest so that we can better assist you.
 * Requested threading via `export OPENBLAS_NUM_THREADS=1` (single-threaded)
 * Requested threading via `export OPENBLAS_NUM_THREADS=26` (multithreaded, 26 cores)
 * Requested threading via `export OPENBLAS_NUM_THREADS=52` (multithreaded, 52 cores)
+* Eigen 3.3.7
+* Prior to compilation, modified top-level `CMakeLists.txt` to ensure that `-march=native` was added to `CXX_FLAGS` variable (h/t Sameer Agarwal).
+* configured and built BLAS library via `mkdir build; cd build; cmake ..; make blas`
+* Requested threading via `export OMP_NUM_THREADS=1` (single-threaded)
+* Requested threading via `export OMP_NUM_THREADS=26` (multithreaded, 26 cores)
+* Requested threading via `export OMP_NUM_THREADS=52` (multithreaded, 52 cores)
+* **NOTE**: This version of Eigen does not provide multithreaded implementations of `symm`/`hemm`, `syrk`/`herk`, `trmm`, or `trsm`, and therefore those curves are omitted from the multithreaded graphs.
 * MKL 2019 update 1
 * Requested threading via `export MKL_NUM_THREADS=1` (single-threaded)
 * Requested threading via `export MKL_NUM_THREADS=26` (multithreaded, 26 cores)
@@ -251,6 +258,13 @@ size of interest so that we can better assist you.
 * Requested threading via `export OPENBLAS_NUM_THREADS=1` (single-threaded)
 * Requested threading via `export OPENBLAS_NUM_THREADS=12` (multithreaded, 12 cores)
 * Requested threading via `export OPENBLAS_NUM_THREADS=24` (multithreaded, 24 cores)
+* Eigen 3.3.7
+* Prior to compilation, modified top-level `CMakeLists.txt` to ensure that `-march=native` was added to `CXX_FLAGS` variable (h/t Sameer Agarwal).
+* configured and built BLAS library via `mkdir build; cd build; cmake ..; make blas`
+* Requested threading via `export OMP_NUM_THREADS=1` (single-threaded)
+* Requested threading via `export OMP_NUM_THREADS=12` (multithreaded, 12 cores)
+* Requested threading via `export OMP_NUM_THREADS=24` (multithreaded, 24 cores)
+* **NOTE**: This version of Eigen does not provide multithreaded implementations of `symm`/`hemm`, `syrk`/`herk`, `trmm`, or `trsm`, and therefore those curves are omitted from the multithreaded graphs.
 * MKL 2018 update 2
 * Requested threading via `export MKL_NUM_THREADS=1` (single-threaded)
 * Requested threading via `export MKL_NUM_THREADS=12` (multithreaded, 12 cores)
@@ -309,6 +323,13 @@ size of interest so that we can better assist you.
 * Requested threading via `export OPENBLAS_NUM_THREADS=1` (single-threaded)
 * Requested threading via `export OPENBLAS_NUM_THREADS=32` (multithreaded, 32 cores)
 * Requested threading via `export OPENBLAS_NUM_THREADS=64` (multithreaded, 64 cores)
+* Eigen 3.3.7
+* Prior to compilation, modified top-level `CMakeLists.txt` to ensure that `-march=native` was added to `CXX_FLAGS` variable (h/t Sameer Agarwal).
+* configured and built BLAS library via `mkdir build; cd build; cmake ..; make blas`
+* Requested threading via `export OMP_NUM_THREADS=1` (single-threaded)
+* Requested threading via `export OMP_NUM_THREADS=32` (multithreaded, 32 cores)
+* Requested threading via `export OMP_NUM_THREADS=64` (multithreaded, 64 cores)
+* **NOTE**: This version of Eigen does not provide multithreaded implementations of `symm`/`hemm`, `syrk`/`herk`, `trmm`, or `trsm`, and therefore those curves are omitted from the multithreaded graphs.
 * MKL 2019 update 1
 * Requested threading via `export MKL_NUM_THREADS=1` (single-threaded)
 * Requested threading via `export MKL_NUM_THREADS=32` (multithreaded, 32 cores)
Binary image files changed: 12 (file names and dimensions not preserved). File sizes before -> after, in KiB:
102 -> 107, 105 -> 114, 66 -> 76, 91 -> 95, 90 -> 97, 69 -> 81, 98 -> 102, 94 -> 100, 75 -> 83, 92 -> 92, 100 -> 100, 70 -> 70
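
The rule stated in the commit message and in the Performance.md note (Eigen curves appear in multithreaded graphs only for gemm) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the script used to produce the graphs: the names `THREAD_ENV_VAR`, `LEVEL3_OPS`, `EIGEN_MT_OPS`, `thread_env`, and `plotted_ops` are hypothetical, while the environment variable names and the operation list are taken from the text above.

```python
# Hypothetical helper names; only the environment variable names and the
# operation list come from the commit message and the Performance.md hunks.

# Thread-count variable read by each implementation in the tested setups.
THREAD_ENV_VAR = {
    "openblas": "OPENBLAS_NUM_THREADS",
    "eigen": "OMP_NUM_THREADS",
    "mkl": "MKL_NUM_THREADS",
}

# Level-3 operations named in the commit message.
LEVEL3_OPS = ("gemm", "symm", "hemm", "syrk", "herk", "trmm", "trsm")

# Per the note, Eigen 3.3.7 provides a multithreaded implementation of gemm only.
EIGEN_MT_OPS = frozenset({"gemm"})


def thread_env(impl, nthreads):
    """Return the environment setting that requests `nthreads` threads."""
    return {THREAD_ENV_VAR[impl]: str(nthreads)}


def plotted_ops(impl, nthreads):
    """Return the operations whose curves would appear for this run."""
    if impl == "eigen" and nthreads > 1:
        # Omit curves for operations Eigen only implements sequentially.
        return [op for op in LEVEL3_OPS if op in EIGEN_MT_OPS]
    return list(LEVEL3_OPS)
```

For example, `thread_env("eigen", 26)` yields the same setting as `export OMP_NUM_THREADS=26`, and `plotted_ops("eigen", 26)` keeps only `gemm`, whereas single-threaded Eigen runs and all MKL or OpenBLAS runs keep every operation.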