blis/kernels
mkadavil 42e539b878 Quantization (scale + zero point) updates/fixes for BF16 LPGEMM api.
-The _mm512_cvtpbh_ps intrinsic is not supported by older versions of gcc
(before 12.2) and causes a compilation error there. This is fixed by
replacing the intrinsic with a macro that performs the bf16 to f32
conversion via shift operations.
-Fixed bugs in the vector scale factor load in fringe kernels.

AMD-Internal: [SWLCSG-2945]
Change-Id: I8eac4c4b34b043e7a8116dc465723d8f85b28018
2024-07-23 04:39:14 +05:30
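
The shift-based bf16 to f32 conversion mentioned in the commit message can be sketched as follows. This is a minimal illustration and not the code from the commit: the names `bf16_bits`, `bf16_to_f32` and `CVT_BF16_F32_SHIFT` are hypothetical, and the vector macro is only an assumption about what such a replacement could look like. It relies on the fact that a bf16 value is the upper 16 bits of an IEEE-754 binary32 value, so widening each element to 32 bits and shifting left by 16 gives the f32 bit pattern.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical type name: raw storage for one bf16 value. */
typedef uint16_t bf16_bits;

/* Scalar reference: place the 16 bf16 bits in the upper half of a
 * 32-bit word and reinterpret that word as a float. memcpy avoids
 * strict-aliasing problems. */
static inline float bf16_to_f32(bf16_bits h)
{
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

#if defined(__AVX512F__)
#include <immintrin.h>
/* Hypothetical macro name. `in` is assumed to be a __m256i holding 16
 * bf16 values. Each 16-bit lane is widened to 32 bits, shifted left by
 * 16 (which discards the sign-extension bits), and the result is
 * reinterpreted as 16 floats. Only AVX512F intrinsics are used, so no
 * AVX512_BF16 compiler support is needed. */
#define CVT_BF16_F32_SHIFT(in) \
    _mm512_castsi512_ps(_mm512_slli_epi32(_mm512_cvtepi16_epi32(in), 16))
#endif
```

For example, the bf16 bit pattern 0x3F80 maps to 0x3F800000, which is 1.0f. The vector macro is guarded so the sketch still compiles on targets without AVX-512.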