mirror of
https://github.com/amd/blis.git
synced 2026-07-02 13:17:16 +00:00
- Blocksizes for sizes where M >> K, N >> K and K < 500 were tuned by
  running blis bench on only one MPI rank. Blocksizes tuned this way do
  not perform well for all configurations.
- Retuned the blocksizes so that performance is good for such skinny
  sizes.

AMD-Internal: [CPUPL-6362]
Change-Id: I89c61889df2443ef6bf0e87bf89263768b5c00c1
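
For illustration only, the shape condition described in this commit (M >> K, N >> K and K < 500) could be expressed as a small predicate. This is a minimal sketch and not code from the BLIS tree: the function name `is_skinny_k` and the ratio cutoff `SKINNY_RATIO` are hypothetical, and reading ">>" as "at least 8x" is an assumption, since the commit does not quantify it. It also assumes M, N and K are the usual matrix-multiply dimensions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* K limit taken from the commit message: K < 500. */
#define SKINNY_K_LIMIT 500

/* Assumption: ">>" is read as "at least 8 times larger".
   The commit does not state an actual ratio. */
#define SKINNY_RATIO 8

/* Hypothetical helper: returns true when the problem has the skinny
   shape the commit describes, i.e. a small K with both M and N much
   larger than K. */
static bool is_skinny_k(int64_t m, int64_t n, int64_t k)
{
    /* Dimensions are expected to be positive. */
    assert(m > 0 && n > 0 && k > 0);

    return k < SKINNY_K_LIMIT
        && m >= SKINNY_RATIO * k
        && n >= SKINNY_RATIO * k;
}
```

With these assumed cutoffs, a problem with M = N = 8000 and K = 100 counts as skinny, while the same M and N with K = 500 does not.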