Files
blis/kernels
Rayan, Rohan e85be22da0 Adding tiny path for SGEMM (#237)
Adding SGEMM tiny path for Zen architectures.
Needed to cover some performance gaps seen wrt MKL
Only allowing matrices that all fit into the L1 cache to the tiny path
Only tuned for single threaded operation at the moment
Todo: Tune cases where AVX2 performs better than AVX512 on Zen4
Todo: The current ranges are very conservative, there may be scope to increase the matrix sizes that go into the tiny path

AMD-Internal: CPUPL-7555
Co-authored-by: Rohan Rayan rohrayan@amd.com
2025-10-24 13:14:33 +05:30
..
2021-10-08 02:35:58 +09:00
2024-08-05 15:35:08 -04:00
2025-09-04 17:14:06 +01:00
2024-08-05 15:35:08 -04:00
2024-08-05 15:35:08 -04:00
2025-10-24 13:14:33 +05:30
2023-11-23 08:54:31 -05:00
2020-07-22 18:24:26 +05:30
2025-10-24 13:14:33 +05:30