Mirror of https://github.com/amd/blis.git (synced 2026-04-20 07:38:53 +00:00)
Add vzeroupper to Haswell microkernels. (#524)
Details:
- Added vzeroupper instruction to the end of all 'gemm' and 'gemmtrsm' microkernels so as to avoid a performance penalty when mixing AVX and SSE instructions. These vzeroupper instructions were once part of the haswell kernels, but were inadvertently removed during a source code shuffle some time ago when we were managing duplicate 'haswell' and 'zen' kernel sets. Thanks to Devin Matthews for tracking this down and re-inserting the missing instructions.
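As a rough illustration of the pattern described above, here is a minimal sketch in C of an AVX routine that clears the upper halves of the ymm registers before returning. It is not taken from the BLIS sources: the actual microkernels are hand-written assembly, and the names `axpy_avx` and `axpy` are hypothetical. The sketch assumes GCC or Clang on x86 (for `__attribute__((target("avx")))` and `__builtin_cpu_supports`) and falls back to a scalar loop elsewhere.

```c
#include <stddef.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <immintrin.h>
#define HAVE_AVX_PATH 1
#endif

#ifdef HAVE_AVX_PATH
/* Hypothetical AVX routine (not a BLIS kernel): y[i] += alpha * x[i]. */
__attribute__((target("avx")))
static void axpy_avx(size_t n, double alpha, const double *x, double *y)
{
    __m256d va = _mm256_set1_pd(alpha);
    size_t i = 0;

    /* Main loop: four doubles per iteration in 256-bit ymm registers. */
    for (; i + 4 <= n; i += 4) {
        __m256d vx = _mm256_loadu_pd(x + i);
        __m256d vy = _mm256_loadu_pd(y + i);
        vy = _mm256_add_pd(vy, _mm256_mul_pd(va, vx));
        _mm256_storeu_pd(y + i, vy);
    }

    /* Remainder elements that do not fill a whole vector. */
    for (; i < n; ++i)
        y[i] += alpha * x[i];

    /* Emit vzeroupper as the last step, so that SSE code executed by the
       caller afterwards does not see dirty upper ymm halves. */
    _mm256_zeroupper();
}
#endif

/* Hypothetical dispatcher: use the AVX path only when the CPU reports AVX. */
void axpy(size_t n, double alpha, const double *x, double *y)
{
#ifdef HAVE_AVX_PATH
    if (__builtin_cpu_supports("avx")) {
        axpy_avx(n, alpha, x, y);
        return;
    }
#endif
    for (size_t i = 0; i < n; ++i)
        y[i] += alpha * x[i];
}
```

For intrinsics code like this, a compiler may already insert vzeroupper at function exit on its own. In hand-written assembly kernels such as the ones this commit touches, the compiler cannot see the ymm usage, so the instruction has to be placed explicitly.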