Details:
- Made several updates to test/1m4m/runme.sh, including:
  - Added missing handling for 1m and 4m1a implementations when setting
    the BLIS_??_NT environment variables.
  - Added support for using numactl to run the test executables.
  - Several other cleanups.
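
The two runme.sh changes above can be illustrated with a minimal sketch.
This is not the actual contents of test/1m4m/runme.sh. The function names
set_blis_nt and build_cmd, the implementation labels, the thread
factorization arguments, and the choice of --interleave=all are all
assumptions made for illustration. BLIS_JC_NT, BLIS_IC_NT, and BLIS_JR_NT
are the BLIS loop-parallelism variables matched by the BLIS_??_NT pattern.

```shell
#!/bin/bash
# Hypothetical sketch; NOT the real test/1m4m/runme.sh.

# Set the BLIS_??_NT variables for the BLIS-based implementations
# (including 1m and 4m1a), and clear them for anything else.
# Arguments: implementation label, then JC, IC, and JR thread counts.
set_blis_nt() {
    local impl="$1" jc="$2" ic="$3" jr="$4"
    case "${impl}" in
        1m|4m1a|asm)
            export BLIS_JC_NT="${jc}" BLIS_IC_NT="${ic}" BLIS_JR_NT="${jr}"
            ;;
        *)
            # Other libraries do not read these variables.
            unset BLIS_JC_NT BLIS_IC_NT BLIS_JR_NT
            ;;
    esac
}

# Print the command line for a test executable, optionally prefixed
# with numactl. The second argument ("yes" or "no") is an assumed switch.
build_cmd() {
    local exe="$1" use_numactl="$2"
    if [ "${use_numactl}" = "yes" ]; then
        echo "numactl --interleave=all ${exe}"
    else
        echo "${exe}"
    fi
}
```

A caller might run `set_blis_nt 4m1a 2 3 2` and then evaluate the string
printed by `build_cmd ./test_gemm.x yes`, where the executable name is
also a placeholder.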
Details:
- Added a new standalone test driver directory named '1m4m' that can
build and run performance experiments for BLIS 1m, 4m1a, assembly,
OpenBLAS, and the vendor library (MKL). This new driver directory
was used to regenerate performance results for the 1m paper.
- Added alternate (commented-out) cache blocksizes to
config/haswell/bli_cntx_init_haswell.c. These blocksizes tend to
work well on a 12-core Intel Xeon E5-2650 v3.