Updated Eigen results in docs/graphs with 3.3.90.

Details:
- Updated the level-3 performance graphs in docs/graphs with new Eigen results, this time using a development version cloned from their git mirror on March 27, 2019 (version 3.3.90). Performance is improved over 3.3.7, though still noticeably short of BLIS/MKL in most cases.
- Very minor updates to docs/Performance.md and matlab scripts in test/3/matlab.
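The Eigen build steps recorded in the docs/Performance.md notes touched by this commit (obtain Eigen from the git mirror, make sure `-march=native` reaches the C++ flags, then `mkdir build; cd build; cmake ..; make blas`) can be sketched as a small script. This is only an illustration under stated assumptions: the `DRY_RUN` switch, the `run` helper, and the checkout directory name are hypothetical, and passing the flag through CMake's standard `CMAKE_CXX_FLAGS` variable is a swapped-in alternative to the manual `CMakeLists.txt` edit described in the notes. The notes do not name a specific commit, so the sketch does not pin one, and the mirror URL may have moved since 2019.

```shell
#!/bin/sh
# Sketch of the Eigen BLAS build described in the Performance.md notes.
# Assumptions (not from the commit): DRY_RUN, run(), EIGEN_DIR, and the use
# of -DCMAKE_CXX_FLAGS instead of hand-editing the top-level CMakeLists.txt.
set -eu

DRY_RUN="${DRY_RUN:-1}"        # 1 = only print the commands, do not execute
EIGEN_URL="https://github.com/eigenteam/eigen-git-mirror"
EIGEN_DIR="eigen-git-mirror"   # hypothetical local checkout directory

# Print a command and, unless in dry-run mode, execute it.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" != "1" ]; then "$@"; fi
}

build_eigen_blas() {
    run git clone "$EIGEN_URL" "$EIGEN_DIR"
    run mkdir -p "$EIGEN_DIR/build"
    # Only change directory when the checkout actually exists.
    if [ "$DRY_RUN" != "1" ]; then cd "$EIGEN_DIR/build"; fi
    # Assumption: the flag is passed on the command line rather than by
    # modifying CMakeLists.txt as the notes describe.
    run cmake -DCMAKE_CXX_FLAGS=-march=native ..
    run make blas
}

build_eigen_blas
```

By default the script only prints the commands it would run; setting `DRY_RUN=0` executes them, which requires network access plus `git`, `cmake`, and a C++ toolchain.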
@@ -194,7 +194,8 @@ size of interest so that we can better assist you.
 * Requested threading via `export OPENBLAS_NUM_THREADS=1` (single-threaded)
 * Requested threading via `export OPENBLAS_NUM_THREADS=26` (multithreaded, 26 cores)
 * Requested threading via `export OPENBLAS_NUM_THREADS=52` (multithreaded, 52 cores)
-* Eigen 3.3.7
+* Eigen 3.3.90
+* Obtained via the [Eigen git mirror](https://github.com/eigenteam/eigen-git-mirror) (March 27, 2019)
 * Prior to compilation, modified top-level `CMakeLists.txt` to ensure that `-march=native` was added to `CXX_FLAGS` variable (h/t Sameer Agarwal).
 * configured and built BLAS library via `mkdir build; cd build; cmake ..; make blas`
 * The `gemm` implementation was pulled in at compile-time via Eigen headers; other operations were linked to Eigen's BLAS library.
@@ -214,7 +215,7 @@ size of interest so that we can better assist you.
 * Hardware limits: 1.0GHz - 2.0GHz
 * Adjusted minimum: 2.0GHz
 * Comments:
-* MKL yields superb performance for most operations, though BLIS is not far behind except for trsm. (We understand the trsm underperformance and hope to address it in the future.) OpenBLAS lags far behind MKL and BLIS due to lack of full support for AVX-512, and possibly other reasons related to software architecture and register/cache blocksizes.
+* MKL yields superb performance for most operations, though BLIS is not far behind except for `trsm`. (We understand the `trsm` underperformance and hope to address it in the future.) OpenBLAS lags far behind MKL and BLIS due to lack of full support for AVX-512, and possibly other reasons related to software architecture and register/cache blocksizes.
 
 ### SkylakeX results
 
@@ -262,7 +263,8 @@ size of interest so that we can better assist you.
 * Requested threading via `export OPENBLAS_NUM_THREADS=1` (single-threaded)
 * Requested threading via `export OPENBLAS_NUM_THREADS=12` (multithreaded, 12 cores)
 * Requested threading via `export OPENBLAS_NUM_THREADS=24` (multithreaded, 24 cores)
-* Eigen 3.3.7
+* Eigen 3.3.90
+* Obtained via the [Eigen git mirror](https://github.com/eigenteam/eigen-git-mirror) (March 27, 2019)
 * Prior to compilation, modified top-level `CMakeLists.txt` to ensure that `-march=native` was added to `CXX_FLAGS` variable (h/t Sameer Agarwal).
 * configured and built BLAS library via `mkdir build; cd build; cmake ..; make blas`
 * The `gemm` implementation was pulled in at compile-time via Eigen headers; other operations were linked to Eigen's BLAS library.
@@ -328,7 +330,8 @@ size of interest so that we can better assist you.
 * Requested threading via `export OPENBLAS_NUM_THREADS=1` (single-threaded)
 * Requested threading via `export OPENBLAS_NUM_THREADS=32` (multithreaded, 32 cores)
 * Requested threading via `export OPENBLAS_NUM_THREADS=64` (multithreaded, 64 cores)
-* Eigen 3.3.7
+* Eigen 3.3.90
+* Obtained via the [Eigen git mirror](https://github.com/eigenteam/eigen-git-mirror) (March 27, 2019)
 * Prior to compilation, modified top-level `CMakeLists.txt` to ensure that `-march=native` was added to `CXX_FLAGS` variable (h/t Sameer Agarwal).
 * configured and built BLAS library via `mkdir build; cd build; cmake ..; make blas`
 * The `gemm` implementation was pulled in at compile-time via Eigen headers; other operations were linked to Eigen's BLAS library.
[Image diffs omitted: 12 binary image files changed; file names and dimensions were not captured. Before/after sizes in KiB: 107/108, 114/115, 76/78, 95/96, 97/96, 81/81, 102/104, 100/101, 83/88, 92/92, 100/100, 70/70.]
@@ -144,8 +144,8 @@ if rows == 4 && cols == 5
 'Location', legend_loc );
 end
 set( leg,'Box','off','Color','none','Units','inches','FontSize',fontsize-3 );
-%set( leg,'Position',[11.20 12.81 0.7 0.3 ] ); % (0,2br)
-set( leg,'Position',[ 4.20 12.81 0.7 0.3 ] ); % (0,0br)
+set( leg,'Position',[11.20 12.81 0.7 0.3 ] ); % (0,2br)
+%set( leg,'Position',[ 4.20 12.81 0.7 0.3 ] ); % (0,0br)
 elseif nth > 1 && theid == 4
 if with_eigen == 1
 leg = legend( [ blis_ln open_ln eige_ln vend_ln ], ...
@@ -158,6 +158,7 @@ if rows == 4 && cols == 5
 end
 set( leg,'Box','off','Color','none','Units','inches','FontSize',fontsize-3 );
 %set( leg,'Position',[7.70 12.81 0.7 0.3 ] ); % (0,1br)
+%set( leg,'Position',[11.20 12.81 0.7 0.3 ] ); % (0,2br)
 set( leg,'Position',[10.47 14.17 0.7 0.3 ] ); % (0,2tl)
 end
 end
@@ -9,9 +9,9 @@ plot_panel_4x5(2.20,8,56,'2s','../results/tx2/20190205/jc8ic7','tx2_jc8ic7','ARM
 %plot_panel_4x5(2.00,32,26,'1s','../results/skx/20190306/jc2ic13','skx_jc2ic13','MKL'); close; clear all;
 %plot_panel_4x5(2.00,32,52,'2s','../results/skx/20190306/jc4ic13','skx_jc4ic13','MKL'); close; clear all;
 % with eigen:
-plot_panel_4x5(2.00,32,1, 'st','../results/skx/merged20190306_0326/st', 'skx', 'MKL',1); close; clear all;
-plot_panel_4x5(2.00,32,26,'1s','../results/skx/merged20190306_0326/jc2ic13','skx_jc2ic13','MKL',1); close; clear all;
-plot_panel_4x5(2.00,32,52,'2s','../results/skx/merged20190306_0326/jc4ic13','skx_jc4ic13','MKL',1); close; clear all;
+plot_panel_4x5(2.00,32,1, 'st','../results/skx/merged20190306_0328/st', 'skx', 'MKL',1); close; clear all;
+plot_panel_4x5(2.00,32,26,'1s','../results/skx/merged20190306_0328/jc2ic13','skx_jc2ic13','MKL',1); close; clear all;
+plot_panel_4x5(2.00,32,52,'2s','../results/skx/merged20190306_0328/jc4ic13','skx_jc4ic13','MKL',1); close; clear all;
 
 % has
 % pre-eigen:
@@ -19,9 +19,9 @@ plot_panel_4x5(2.00,32,52,'2s','../results/skx/merged20190306_0326/jc4ic13','skx
 %plot_panel_4x5(3.00,16,12,'1s','../results/has/20190206/jc2ic3jr2','has_jc2ic3jr2','MKL',1); close; clear all;
 %plot_panel_4x5(3.00,16,24,'2s','../results/has/20190206/jc4ic3jr2','has_jc4ic3jr2','MKL',1); close; clear all;
 % with eigen:
-plot_panel_4x5(3.25,16,1, 'st','../results/has/merged20190206_0326/st', 'has', 'MKL',1); close; clear all;
-plot_panel_4x5(3.00,16,12,'1s','../results/has/merged20190206_0326/jc2ic3jr2','has_jc2ic3jr2','MKL',1); close; clear all;
-plot_panel_4x5(3.00,16,24,'2s','../results/has/merged20190206_0326/jc4ic3jr2','has_jc4ic3jr2','MKL',1); close; clear all;
+plot_panel_4x5(3.25,16,1, 'st','../results/has/merged20190206_0328/st', 'has', 'MKL',1); close; clear all;
+plot_panel_4x5(3.00,16,12,'1s','../results/has/merged20190206_0328/jc2ic3jr2','has_jc2ic3jr2','MKL',1); close; clear all;
+plot_panel_4x5(3.00,16,24,'2s','../results/has/merged20190206_0328/jc4ic3jr2','has_jc4ic3jr2','MKL',1); close; clear all;
 
 % epyc
 % pre-eigen:
@@ -29,7 +29,7 @@ plot_panel_4x5(3.00,16,24,'2s','../results/has/merged20190206_0326/jc4ic3jr2','h
 %plot_panel_4x5(2.55,8,32,'1s','../results/epyc/merged201903_0619/jc1ic8jr4','epyc_jc1ic8jr4','MKL'); close; clear all;
 %plot_panel_4x5(2.55,8,64,'2s','../results/epyc/merged201903_0619/jc2ic8jr4','epyc_jc2ic8jr4','MKL'); close; clear all;
 % with eigen:
-plot_panel_4x5(3.00,8,1, 'st','../results/epyc/merged20190306_0319_0326/st', 'epyc', 'MKL',1); close; clear all;
-plot_panel_4x5(2.55,8,32,'1s','../results/epyc/merged20190306_0319_0326/jc1ic8jr4','epyc_jc1ic8jr4','MKL',1); close; clear all;
-plot_panel_4x5(2.55,8,64,'2s','../results/epyc/merged20190306_0319_0326/jc2ic8jr4','epyc_jc2ic8jr4','MKL',1); close; clear all;
+plot_panel_4x5(3.00,8,1, 'st','../results/epyc/merged20190306_0319_0328/st', 'epyc', 'MKL',1); close; clear all;
+plot_panel_4x5(2.55,8,32,'1s','../results/epyc/merged20190306_0319_0328/jc1ic8jr4','epyc_jc1ic8jr4','MKL',1); close; clear all;
+plot_panel_4x5(2.55,8,64,'2s','../results/epyc/merged20190306_0319_0328/jc2ic8jr4','epyc_jc2ic8jr4','MKL',1); close; clear all;
 