mirror of
https://github.com/amd/blis.git
synced 2026-05-05 06:51:11 +00:00
- Added a new GEMV_AVX2 5-loop path for BF16 inputs, covering the n = 1 and m = 1 cases.
- Modified the reorder and un-reorder functions to handle the default n = 1 reorder conditions.
- Added bf16 beta and store support to the F32 GEMV N AVX2 and 256_512 kernels.
- Added bf16 beta support to the F32 GEMV M kernels and modified their bf16 store conditions.
- Modified the n = 1 reorder guards in the reference bf16 reorder API.
- Added a path in the un-reorder case for handling n = 1 vector conversion.

AMD-Internal: [ SWLCSG - 3602 ]
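
For readers unfamiliar with the n = 1 case, the following is a minimal scalar sketch of what a GEMV with bf16 inputs, bf16 beta scaling, and bf16 store computes. It is not the AVX2 kernel from this change and does not use any BLIS API: the names `bf16_t`, `bf16_to_f32`, `f32_to_bf16`, and `gemv_bf16_n1_ref` are hypothetical, and the row-major layout, f32 accumulation, and round-to-nearest-even store are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical alias: raw bfloat16 bits (upper 16 bits of an IEEE-754 float). */
typedef uint16_t bf16_t;

/* Widen bf16 to f32 by placing the 16 bits in the high half of a float. */
static float bf16_to_f32(bf16_t v)
{
    uint32_t bits = (uint32_t)v << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Narrow f32 to bf16 with round-to-nearest-even.
 * Assumption: NaN inputs are not handled specially in this sketch. */
static bf16_t f32_to_bf16(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    bits += 0x7FFFu + ((bits >> 16) & 1u);
    return (bf16_t)(bits >> 16);
}

/* Reference for the n = 1 case: y = alpha * A * x + beta * y.
 * A is m x k, row-major with leading dimension lda (assumed layout);
 * x has k elements, y has m elements; all three are stored as bf16.
 * Products are accumulated in f32, and y is read and written as bf16. */
static void gemv_bf16_n1_ref(size_t m, size_t k, float alpha,
                             const bf16_t *a, size_t lda,
                             const bf16_t *x, float beta, bf16_t *y)
{
    assert(a != NULL && x != NULL && y != NULL);
    for (size_t i = 0; i < m; ++i) {
        float acc = 0.0f;
        for (size_t p = 0; p < k; ++p) {
            acc += bf16_to_f32(a[i * lda + p]) * bf16_to_f32(x[p]);
        }
        float out = alpha * acc;
        if (beta != 0.0f) {
            /* bf16 beta support: widen the existing y before scaling. */
            out += beta * bf16_to_f32(y[i]);
        }
        /* bf16 store: narrow the f32 result back to bf16. */
        y[i] = f32_to_bf16(out);
    }
}
```

A scalar reference of this kind is mainly useful as a correctness oracle when comparing against vectorized output; the m = 1 case is the same computation with the roles of the matrix rows and columns exchanged.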