Commit Graph

4 Commits

Author SHA1 Message Date
RuQing Xu
f44149f787 Armv8 Trash New Bulk Kernels
- They didn't make much improvements.
- Can't register row-preferral and column-preferral ukrs at the same time.
  Will break 1m.
2021-10-08 02:35:58 +09:00
RuQing Xu
4bfadf9b56 Firestorm Block Size Fixes 2021-10-06 01:51:26 +09:00
RuQing Xu
abc648352c Armv8 Fix 6x8 Row-Maj Ukr
- Fixed for 6x8 only, 4x4 & 4x8 pending;
- Installed to config firestorm as benchmark seems to show better perf:
   Old:
blis_dgemm_ukr_c                     6     8   320    36.87   2.43e-17   PASS
blis_dgemm_ukr_c                     6     8   352    40.55   1.04e-17   PASS
blis_dgemm_ukr_c                     6     8   384    44.24   5.68e-17   PASS
blis_dgemm_ukr_c                     6     8   416    41.67   3.51e-17   PASS
blis_dgemm_ukr_c                     6     8   448    34.41   2.94e-17   PASS
blis_dgemm_ukr_c                     6     8   480    42.53   2.35e-17   PASS

   New:
blis_dgemm_ukr_r                     6     8   352    50.69   1.59e-17   PASS
blis_dgemm_ukr_r                     6     8   384    49.15   5.55e-17   PASS
blis_dgemm_ukr_r                     6     8   416    50.44   2.86e-17   PASS
blis_dgemm_ukr_r                     6     8   448    46.92   3.12e-17   PASS
blis_dgemm_ukr_r                     6     8   480    48.08   4.08e-17   PASS
2021-10-03 13:14:19 +09:00
RuQing Xu
e38ca28689 Added Apple Firestorm (A14/M1) Subconfig
- Use the same bulk kernel as Cortex-A53 / ThunderX2;
- Larger block size;
- Use gemmsup kernels for double precision.
2021-08-13 03:21:19 +09:00