blis/kernels/zen5/aocl_smart
Shubham Sharma 7695561f4e Tuned DGEMM blocksizes for ZEN5
- In the existing code, blocksizes for shapes where M >> K, N >> K and
  K < 500 were not tuned properly for cases where an application runs
  more than one instance of BLIS in parallel.
- Improved DGEMM performance for shapes where M, N >> K by retuning
  blocksizes. Such shapes are used by applications like HPL.

AMD-Internal: [SWLCSG-3338]
Change-Id: Iec17ecc53a6fabf50eedacaf208e4e74a4e21418
2025-02-03 05:40:07 -05:00