mirror of
https://github.com/amd/blis.git
synced 2026-06-30 03:07:23 +00:00
- Replaced the switch-case with an if-else chain. The lookup table for the switch-case is placed at the end of the .text section, which causes a long jump.
- Reduced the number of branches for tiny sizes.
- The cpuid query is slow, so added a new if statement that avoids the cpuid query for tiny sizes (<200).
- Redirected tiny sizes to the AVX2 kernel.

AMD-Internal: [CPUPL-5407]
Change-Id: I8e73777b2f00c9dcff9775ddfcb7ca3f74fa901c
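
The dispatch order described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from the change: `select_kernel`, `query_arch`, the `kernel_id` and `arch_id` enums, and the architecture-to-kernel mapping are all hypothetical names and assumptions. Only the tiny-size cutoff of 200 and the AVX2 redirect come from the message above. The stub `query_arch` stands in for the slow cpuid-based query and counts its calls so the early exit is observable.

```c
#include <stddef.h>

/* Hypothetical kernel and architecture identifiers (not BLIS names). */
typedef enum { KERNEL_AVX2, KERNEL_AVX512, KERNEL_GENERIC } kernel_id;
typedef enum { ARCH_ZEN3, ARCH_ZEN4, ARCH_OTHER } arch_id;

/* Cutoff taken from the "<200" in the commit message. */
#define TINY_SIZE_THRESHOLD 200

/* Counts how often the (assumed slow) architecture query runs. */
static int cpu_query_count = 0;

/* Stand-in for a cpuid-based query; this stub always reports ARCH_ZEN4. */
static arch_id query_arch(void)
{
    cpu_query_count++;
    return ARCH_ZEN4;
}

/* Pick a kernel for problem size n. */
kernel_id select_kernel(size_t n)
{
    /* Tiny sizes return first: one branch, no architecture query. */
    if (n < TINY_SIZE_THRESHOLD)
        return KERNEL_AVX2;

    /* Larger sizes pay for the query once, then walk an if-else chain
       instead of a switch-case (assumed mapping, for illustration). */
    arch_id arch = query_arch();
    if (arch == ARCH_ZEN4)
        return KERNEL_AVX512;
    else if (arch == ARCH_ZEN3)
        return KERNEL_AVX2;
    else
        return KERNEL_GENERIC;
}
```

Putting the size check ahead of everything else means a tiny call takes a single compare and branch before reaching its kernel. Whether a compiler would actually emit a jump table for the equivalent switch-case depends on the compiler, the optimization flags, and the number of cases.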