Updated Eigen results in docs/graphs with 3.3.90.

Details:
- Updated the level-3 performance graphs in docs/graphs with new Eigen
  results, this time using a development version cloned from their git
  mirror on March 27, 2019 (version 3.3.90). Performance is improved
  over 3.3.7, though still noticeably short of BLIS/MKL in most cases.
- Very minor updates to docs/Performance.md and matlab scripts in
  test/3/matlab.
Field G. Van Zee
2019-03-28 17:40:50 -05:00
parent 20ea7a1217
commit 7bc75882f0
24 changed files with 19 additions and 15 deletions


@@ -194,7 +194,8 @@ size of interest so that we can better assist you.
 * Requested threading via `export OPENBLAS_NUM_THREADS=1` (single-threaded)
 * Requested threading via `export OPENBLAS_NUM_THREADS=26` (multithreaded, 26 cores)
 * Requested threading via `export OPENBLAS_NUM_THREADS=52` (multithreaded, 52 cores)
-* Eigen 3.3.7
+* Eigen 3.3.90
+* Obtained via the [Eigen git mirror](https://github.com/eigenteam/eigen-git-mirror) (March 27, 2019)
 * Prior to compilation, modified top-level `CMakeLists.txt` to ensure that `-march=native` was added to `CXX_FLAGS` variable (h/t Sameer Agarwal).
 * configured and built BLAS library via `mkdir build; cd build; cmake ..; make blas`
 * The `gemm` implementation was pulled in at compile-time via Eigen headers; other operations were linked to Eigen's BLAS library.
@@ -214,7 +215,7 @@ size of interest so that we can better assist you.
 * Hardware limits: 1.0GHz - 2.0GHz
 * Adjusted minimum: 2.0GHz
 * Comments:
-* MKL yields superb performance for most operations, though BLIS is not far behind except for trsm. (We understand the trsm underperformance and hope to address it in the future.) OpenBLAS lags far behind MKL and BLIS due to lack of full support for AVX-512, and possibly other reasons related to software architecture and register/cache blocksizes.
+* MKL yields superb performance for most operations, though BLIS is not far behind except for `trsm`. (We understand the `trsm` underperformance and hope to address it in the future.) OpenBLAS lags far behind MKL and BLIS due to lack of full support for AVX-512, and possibly other reasons related to software architecture and register/cache blocksizes.
 ### SkylakeX results
@@ -262,7 +263,8 @@ size of interest so that we can better assist you.
 * Requested threading via `export OPENBLAS_NUM_THREADS=1` (single-threaded)
 * Requested threading via `export OPENBLAS_NUM_THREADS=12` (multithreaded, 12 cores)
 * Requested threading via `export OPENBLAS_NUM_THREADS=24` (multithreaded, 24 cores)
-* Eigen 3.3.7
+* Eigen 3.3.90
+* Obtained via the [Eigen git mirror](https://github.com/eigenteam/eigen-git-mirror) (March 27, 2019)
 * Prior to compilation, modified top-level `CMakeLists.txt` to ensure that `-march=native` was added to `CXX_FLAGS` variable (h/t Sameer Agarwal).
 * configured and built BLAS library via `mkdir build; cd build; cmake ..; make blas`
 * The `gemm` implementation was pulled in at compile-time via Eigen headers; other operations were linked to Eigen's BLAS library.
@@ -328,7 +330,8 @@ size of interest so that we can better assist you.
 * Requested threading via `export OPENBLAS_NUM_THREADS=1` (single-threaded)
 * Requested threading via `export OPENBLAS_NUM_THREADS=32` (multithreaded, 32 cores)
 * Requested threading via `export OPENBLAS_NUM_THREADS=64` (multithreaded, 64 cores)
-* Eigen 3.3.7
+* Eigen 3.3.90
+* Obtained via the [Eigen git mirror](https://github.com/eigenteam/eigen-git-mirror) (March 27, 2019)
 * Prior to compilation, modified top-level `CMakeLists.txt` to ensure that `-march=native` was added to `CXX_FLAGS` variable (h/t Sameer Agarwal).
 * configured and built BLAS library via `mkdir build; cd build; cmake ..; make blas`
 * The `gemm` implementation was pulled in at compile-time via Eigen headers; other operations were linked to Eigen's BLAS library.
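
The Eigen build steps listed in the notes above could be scripted roughly as follows. This is only a sketch: it prints the commands rather than running them, `EIGEN_SRC` is a hypothetical checkout directory, and passing `-DCMAKE_CXX_FLAGS=-march=native` on the `cmake` command line is an assumed stand-in for the manual `CMakeLists.txt` edit the notes describe, which may not be equivalent for this Eigen version.

```shell
#!/bin/sh
# Sketch only: prints (does not execute) the Eigen BLAS build steps.
# EIGEN_SRC is a hypothetical directory name, not taken from the notes.
EIGEN_SRC="${EIGEN_SRC:-eigen-git-mirror}"

print_eigen_blas_steps() {
    # Clone the mirror referenced in the notes.
    echo "git clone https://github.com/eigenteam/eigen-git-mirror $EIGEN_SRC"
    # Out-of-source build directory, as in the notes.
    echo "cd $EIGEN_SRC && mkdir build && cd build"
    # Assumption: the flag is passed via CMAKE_CXX_FLAGS instead of
    # editing the top-level CMakeLists.txt by hand.
    echo "cmake -DCMAKE_CXX_FLAGS=-march=native .."
    # Build only Eigen's BLAS library target.
    echo "make blas"
}

print_eigen_blas_steps
```

Review the printed commands before running them; piping the output to `sh` executes them in order.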

Binary files not shown (15 files).


@@ -144,8 +144,8 @@ if rows == 4 && cols == 5
 'Location', legend_loc );
 end
 set( leg,'Box','off','Color','none','Units','inches','FontSize',fontsize-3 );
-%set( leg,'Position',[11.20 12.81 0.7 0.3 ] ); % (0,2br)
-set( leg,'Position',[ 4.20 12.81 0.7 0.3 ] ); % (0,0br)
+set( leg,'Position',[11.20 12.81 0.7 0.3 ] ); % (0,2br)
+%set( leg,'Position',[ 4.20 12.81 0.7 0.3 ] ); % (0,0br)
 elseif nth > 1 && theid == 4
 if with_eigen == 1
 leg = legend( [ blis_ln open_ln eige_ln vend_ln ], ...
@@ -158,6 +158,7 @@ if rows == 4 && cols == 5
 end
 set( leg,'Box','off','Color','none','Units','inches','FontSize',fontsize-3 );
 %set( leg,'Position',[7.70 12.81 0.7 0.3 ] ); % (0,1br)
+%set( leg,'Position',[11.20 12.81 0.7 0.3 ] ); % (0,2br)
 set( leg,'Position',[10.47 14.17 0.7 0.3 ] ); % (0,2tl)
 end
 end


@@ -9,9 +9,9 @@ plot_panel_4x5(2.20,8,56,'2s','../results/tx2/20190205/jc8ic7','tx2_jc8ic7','ARM
 %plot_panel_4x5(2.00,32,26,'1s','../results/skx/20190306/jc2ic13','skx_jc2ic13','MKL'); close; clear all;
 %plot_panel_4x5(2.00,32,52,'2s','../results/skx/20190306/jc4ic13','skx_jc4ic13','MKL'); close; clear all;
 % with eigen:
-plot_panel_4x5(2.00,32,1, 'st','../results/skx/merged20190306_0326/st', 'skx', 'MKL',1); close; clear all;
-plot_panel_4x5(2.00,32,26,'1s','../results/skx/merged20190306_0326/jc2ic13','skx_jc2ic13','MKL',1); close; clear all;
-plot_panel_4x5(2.00,32,52,'2s','../results/skx/merged20190306_0326/jc4ic13','skx_jc4ic13','MKL',1); close; clear all;
+plot_panel_4x5(2.00,32,1, 'st','../results/skx/merged20190306_0328/st', 'skx', 'MKL',1); close; clear all;
+plot_panel_4x5(2.00,32,26,'1s','../results/skx/merged20190306_0328/jc2ic13','skx_jc2ic13','MKL',1); close; clear all;
+plot_panel_4x5(2.00,32,52,'2s','../results/skx/merged20190306_0328/jc4ic13','skx_jc4ic13','MKL',1); close; clear all;
 % has
 % pre-eigen:
@@ -19,9 +19,9 @@ plot_panel_4x5(2.00,32,52,'2s','../results/skx/merged20190306_0326/jc4ic13','skx
 %plot_panel_4x5(3.00,16,12,'1s','../results/has/20190206/jc2ic3jr2','has_jc2ic3jr2','MKL',1); close; clear all;
 %plot_panel_4x5(3.00,16,24,'2s','../results/has/20190206/jc4ic3jr2','has_jc4ic3jr2','MKL',1); close; clear all;
 % with eigen:
-plot_panel_4x5(3.25,16,1, 'st','../results/has/merged20190206_0326/st', 'has', 'MKL',1); close; clear all;
-plot_panel_4x5(3.00,16,12,'1s','../results/has/merged20190206_0326/jc2ic3jr2','has_jc2ic3jr2','MKL',1); close; clear all;
-plot_panel_4x5(3.00,16,24,'2s','../results/has/merged20190206_0326/jc4ic3jr2','has_jc4ic3jr2','MKL',1); close; clear all;
+plot_panel_4x5(3.25,16,1, 'st','../results/has/merged20190206_0328/st', 'has', 'MKL',1); close; clear all;
+plot_panel_4x5(3.00,16,12,'1s','../results/has/merged20190206_0328/jc2ic3jr2','has_jc2ic3jr2','MKL',1); close; clear all;
+plot_panel_4x5(3.00,16,24,'2s','../results/has/merged20190206_0328/jc4ic3jr2','has_jc4ic3jr2','MKL',1); close; clear all;
 % epyc
 % pre-eigen:
@@ -29,7 +29,7 @@ plot_panel_4x5(3.00,16,24,'2s','../results/has/merged20190206_0326/jc4ic3jr2','h
 %plot_panel_4x5(2.55,8,32,'1s','../results/epyc/merged201903_0619/jc1ic8jr4','epyc_jc1ic8jr4','MKL'); close; clear all;
 %plot_panel_4x5(2.55,8,64,'2s','../results/epyc/merged201903_0619/jc2ic8jr4','epyc_jc2ic8jr4','MKL'); close; clear all;
 % with eigen:
-plot_panel_4x5(3.00,8,1, 'st','../results/epyc/merged20190306_0319_0326/st', 'epyc', 'MKL',1); close; clear all;
-plot_panel_4x5(2.55,8,32,'1s','../results/epyc/merged20190306_0319_0326/jc1ic8jr4','epyc_jc1ic8jr4','MKL',1); close; clear all;
-plot_panel_4x5(2.55,8,64,'2s','../results/epyc/merged20190306_0319_0326/jc2ic8jr4','epyc_jc2ic8jr4','MKL',1); close; clear all;
+plot_panel_4x5(3.00,8,1, 'st','../results/epyc/merged20190306_0319_0328/st', 'epyc', 'MKL',1); close; clear all;
+plot_panel_4x5(2.55,8,32,'1s','../results/epyc/merged20190306_0319_0328/jc1ic8jr4','epyc_jc1ic8jr4','MKL',1); close; clear all;
+plot_panel_4x5(2.55,8,64,'2s','../results/epyc/merged20190306_0319_0328/jc2ic8jr4','epyc_jc2ic8jr4','MKL',1); close; clear all;