blis/kernels
Mangala V 245fdf072c AVX-512 based col-preferred kernels for ZGEMM in native path
- Kernel block size is 12x4
- Updated the zen4 config to enable these kernels in zen4 path.
- Tuned MC, KC, and NC for better performance when m, n, and k are larger than 500.
- Updated CMakeLists.txt with the ZGEMM kernels for the Windows build.
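
For orientation, the computation a 12x4 micro-tile performs can be sketched in scalar C. This is only an illustrative reference under stated assumptions, not the AVX-512 kernel from this commit: the function name `zgemm_ref_12x4`, the packed layouts of `a` and `b`, and the column-major layout of `c` are all hypothetical. One way to read the block size is that a 512-bit register holds four double-complex values, so a 12-element column of C fits in three registers and a 12x4 tile needs 12 accumulator registers.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

enum { MR = 12, NR = 4 };  /* micro-tile size stated in the commit message */

/* Hypothetical scalar reference, not the actual kernel:
 * C(MR x NR) += A_panel(MR x k) * B_panel(k x NR).
 * a: assumed packed, MR contiguous elements per k step.
 * b: assumed packed, NR contiguous elements per k step.
 * c: assumed column-major with leading dimension ldc. */
void zgemm_ref_12x4(size_t k,
                    const double complex *a,
                    const double complex *b,
                    double complex *c, size_t ldc)
{
    assert(ldc >= MR);

    /* One accumulator column per column of the tile. */
    double complex acc[NR][MR] = {{0}};

    for (size_t p = 0; p < k; ++p) {
        for (size_t j = 0; j < NR; ++j) {
            const double complex bj = b[p * NR + j];  /* "broadcast" of B */
            for (size_t i = 0; i < MR; ++i)
                acc[j][i] += a[p * MR + i] * bj;      /* rank-1 update */
        }
    }

    /* Add the accumulated tile into C, column by column. */
    for (size_t j = 0; j < NR; ++j)
        for (size_t i = 0; i < MR; ++i)
            c[j * ldc + i] += acc[j][i];
}
```

Walking C column by column in the final loop is what "col-preferred" is taken to mean here: the tile is accumulated and stored along columns, so column-stored C is the contiguous case.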

Kernel supports:
1. Preload and prebroadcast of A and B
2. Prefetch of the C matrix
3. The K loop is subdivided into multiple loops to maintain distance between C prefetches.
4. Special case when the imaginary component of alpha or beta is zero
5. Row, column, and general stride of matrix C
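
Items 4 and 5 can be illustrated with a small scalar sketch of the final update C := beta*C + alpha*AB. It is a simplified illustration under stated assumptions, not the kernel's code: the helper name `zscale_update` and the dense column-major `ab` tile are hypothetical, and for brevity the shortcut is only taken when both imaginary parts are zero. The strides `rs_c` and `cs_c` follow the usual BLIS convention, where `rs_c == 1` means column storage, `cs_c == 1` means row storage, and anything else is general stride.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Hypothetical helper, not part of BLIS: C := beta*C + alpha*AB
 * for an m x n tile.
 * ab:         assumed dense column-major m x n tile of accumulated products.
 * rs_c, cs_c: row and column strides of C, in elements. */
void zscale_update(size_t m, size_t n,
                   double complex alpha, const double complex *ab,
                   double complex beta, double complex *c,
                   size_t rs_c, size_t cs_c)
{
    assert(rs_c > 0 && cs_c > 0);

    /* If both scalars are purely real, each update needs only
     * real-by-complex products instead of full complex products. */
    const int real_only = (cimag(alpha) == 0.0 && cimag(beta) == 0.0);
    const double ar = creal(alpha);
    const double br = creal(beta);

    for (size_t j = 0; j < n; ++j) {
        for (size_t i = 0; i < m; ++i) {
            double complex *cij = &c[i * rs_c + j * cs_c];
            const double complex t = ab[j * m + i];
            if (real_only)
                *cij = br * (*cij) + ar * t;       /* real * complex */
            else
                *cij = beta * (*cij) + alpha * t;  /* complex * complex */
        }
    }
}
```

A caller would pass `rs_c = 1, cs_c = ldc` for column-stored C and `rs_c = ldc, cs_c = 1` for row-stored C. A vectorized kernel would typically branch on the stride case once, outside the loops, rather than computing a general address per element as this sketch does.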

AMD-Internal: [CPUPL-2998]
Change-Id: I62e3c352d475b1add3f43270805fbcee00e2e440
2023-03-28 23:05:06 -04:00