Files
blis/kernels/zen/3
vignbala 236d092656 AVX512 optimizations for ZGEMM to handle k = 1 cases
- Implemented bli_zgemm_16x4_avx512_k1_nn( ... ) AVX512 kernel to
  be used as part of BLAS/CBLAS calls to ZGEMM. The kernel is built
  for handling the GEMM computation with inputs having k = 1,
  with the transpose values being N(for column-major) and T(for
  row-major).

- Updated the zgemm_blis_impl( ... ) layer to query the architecture
  ID and invoke the AVX2 or AVX512 kernel accordingly.

- Added API level tests for accuracy and code-coverage, as well as
  micro-kernel tests for verifying functionality and out-of-bounds
  memory accesses.

AMD-Internal: [CPUPL-5249]
Change-Id: Id1f8bebff3e0da83c7febe86299564fd658b2e84
2024-07-09 07:07:24 -04:00
..