Files
blis/kernels/zen4
Mangala V 245fdf072c AVX-512 based col-preferred kernels for ZGEMM in native path
- Kernel block size is 12x4
- Updated the zen4 config to enable these kernels in the zen4 path.
- Tuned MC, KC, NC for better performance for m/n/k sizes > 500.
- Updated CMakeLists.txt with the ZGEMM kernels for the Windows build.
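
For orientation, a scalar reference of what a 12x4 ZGEMM micro-kernel computes is sketched below. It is not the AVX-512 code from this commit: the function name zgemm_ref_12x4, the packed-panel layout (a[p*MR + i], b[p*NR + j]) and the argument order are assumptions made for illustration. A 512-bit register holds 4 double-complex values, so a 12-row column of the tile would fit in 3 such registers, which is one plausible reason for the 12x4 shape.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

enum { MR = 12, NR = 4 }; /* block size quoted in the commit message */

/* Scalar reference (hypothetical name and layout, not the real kernel):
 *   C[MR x NR] = beta * C + alpha * A_panel * B_panel
 * a: packed panel, k slices of MR elements, a[p*MR + i]   (assumed layout)
 * b: packed panel, k slices of NR elements, b[p*NR + j]   (assumed layout)
 * c: element (i, j) lives at c[i*rs_c + j*cs_c]                        */
void zgemm_ref_12x4(size_t k, double complex alpha,
                    const double complex *a, const double complex *b,
                    double complex beta, double complex *c,
                    size_t rs_c, size_t cs_c)
{
    assert(a && b && c);

    /* Column-major accumulator tile, mirroring a column-preferred kernel. */
    double complex ab[MR * NR] = {0};

    /* One rank-1 update of the tile per k iteration. */
    for (size_t p = 0; p < k; ++p)
        for (size_t j = 0; j < NR; ++j)
            for (size_t i = 0; i < MR; ++i)
                ab[j * MR + i] += a[p * MR + i] * b[p * NR + j];

    /* Scale and write back; rs_c/cs_c cover row, column and general stride. */
    for (size_t j = 0; j < NR; ++j)
        for (size_t i = 0; i < MR; ++i) {
            double complex *cij = &c[i * rs_c + j * cs_c];
            *cij = beta * (*cij) + alpha * ab[j * MR + i];
        }
}
```

A column-stored tile would be passed with rs_c = 1 and cs_c equal to the leading dimension of C.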

Kernel supports:
1. Preload and prebroadcast of A and B
2. Prefetch of the C matrix
3. The K loop is subdivided into multiple loops to maintain distance between C prefetches.
4. Special-case handling when the imaginary component of alpha/beta is zero
5. Row/Col/General stride of Matrix C
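
Items 4 and 5 can be illustrated with a small scalar sketch. The helper names scale_by and zupdate_tile are hypothetical and do not come from the commit; the real kernel presumably does this with vector instructions. The idea shown is that when a scalar's imaginary part is zero, scaling a complex value needs 2 real multiplications instead of the 4 multiplications and 2 additions of a full complex product. Overwriting C when beta is zero is included as an assumption, following the usual BLAS-style convention.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Hypothetical helper: s * x with a cheaper path when s is purely real. */
static inline double complex scale_by(double complex s, double complex x)
{
    if (cimag(s) == 0.0) {
        double sr = creal(s);
        /* 2 real multiplications instead of a full complex product. */
        return sr * creal(x) + sr * cimag(x) * I;
    }
    return s * x; /* general case: 4 multiplications, 2 additions */
}

/* Hypothetical helper: C = beta*C + alpha*AB for an m x n tile.
 * ab is column-major (ab[j*m + i]); C element (i, j) is at
 * c[i*rs_c + j*cs_c], so the same code handles column-stored
 * (rs_c == 1), row-stored (cs_c == 1) and general-stride C.   */
void zupdate_tile(size_t m, size_t n, double complex alpha,
                  const double complex *ab, double complex beta,
                  double complex *c, size_t rs_c, size_t cs_c)
{
    assert(ab && c);
    int beta_is_zero = (creal(beta) == 0.0 && cimag(beta) == 0.0);

    for (size_t j = 0; j < n; ++j)
        for (size_t i = 0; i < m; ++i) {
            double complex *cij = &c[i * rs_c + j * cs_c];
            double complex t = scale_by(alpha, ab[j * m + i]);
            /* Assumption: beta == 0 overwrites C without reading it. */
            *cij = beta_is_zero ? t : scale_by(beta, *cij) + t;
        }
}
```

In a vectorized kernel the check on the imaginary parts would be done once per call, outside the loops, so that each variant runs a branch-free update.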

AMD-Internal: [CPUPL-2998]
Change-Id: I62e3c352d475b1add3f43270805fbcee00e2e440
2023-03-28 23:05:06 -04:00