mirror of
https://github.com/amd/blis.git
synced 2026-05-13 02:25:39 +00:00
- Added a conditional check to determine whether the vectorized kernels
  for DNRM2_ and DZNRM2_ can be called directly, without incurring any
  framework overhead.

- This fast path is taken when the vector is small enough that the ideal
  number of threads is 1 and the vector has unit stride (so that packing
  at the framework level can be avoided).

AMD-Internal: [CPUPL-4045]
Change-Id: Ie37e86f802ada0e226dff88e74f0341e97ebfe28