blis/kernels
Sharma, Arnav ee3d250b7a Fix for F32 to BF16 Conversion and AVX512 ISA Support Checks
- Fixed register assignment bug in lpgemv_m_kernel_f32_avx512 where zmm3
  was incorrectly used instead of zmm4 in BF16_F32_BETA_OP_NLT16F_MASK macro.

- Replaced hardware-specific BF16 conversion intrinsics with manual
  rounding and bit manipulation built on F32 instructions, so the
  conversion also works on hardware without native BF16 support.

- Added AVX512_BF16 ISA support checks for s8s8s32obf16 and u8s8s32obf16
  GEMM operations to ensure processor compatibility before execution.
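
The ISA check described in the last item can be illustrated with a minimal sketch. It is not the code from this commit: `cpu_has_avx512_bf16` is a hypothetical helper name, and the sketch assumes a GCC- or Clang-compatible compiler that provides `<cpuid.h>`. It reads only the CPUID feature bit (leaf 7, subleaf 1, EAX bit 5); a production check would normally also confirm AVX512F and that the OS saves the AVX-512 register state.

```c
#include <stdbool.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Hypothetical helper (not a BLIS API): returns true when CPUID reports
 * the AVX512_BF16 feature flag. Returns false on non-x86 targets. */
static bool cpu_has_avx512_bf16(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

    /* Leaf 7, subleaf 0: EAX holds the highest supported subleaf. */
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx) || eax < 1)
        return false;

    /* Leaf 7, subleaf 1: EAX bit 5 is the AVX512_BF16 feature flag. */
    if (!__get_cpuid_count(7, 1, &eax, &ebx, &ecx, &edx))
        return false;

    return ((eax >> 5) & 1u) != 0;
#else
    return false;
#endif
}
```

A caller would test `cpu_has_avx512_bf16()` once before dispatching to a kernel that uses native BF16 instructions, and return an error or take a fallback path otherwise.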

AMD-Internal: [CPUPL-7410]
2025-09-19 18:49:33 +05:30
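
The manual F32 to BF16 conversion mentioned in the commit message can be sketched in scalar C. This is an illustration under stated assumptions, not the kernel code: the helper names `f32_to_bf16_rne` and `bf16_to_f32` are hypothetical, the rounding mode is assumed to be round-to-nearest-even (the mode used by the native conversion instruction), and the actual kernels presumably apply the same idea across AVX-512 vector lanes rather than one value at a time.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: convert one float to its BF16 bit pattern using
 * round-to-nearest-even, with integer bit manipulation only. */
static uint16_t f32_to_bf16_rne(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* reinterpret the float as raw bits */

    /* NaN: plain truncation could clear every mantissa bit and produce
     * infinity, so keep the upper half and force a quiet-NaN bit. */
    if ((bits & 0x7FFFFFFFu) > 0x7F800000u)
        return (uint16_t)((bits >> 16) | 0x0040u);

    /* Add 0x7FFF plus the lowest kept bit: ties round to the even value. */
    uint32_t lsb = (bits >> 16) & 1u;
    bits += 0x7FFFu + lsb;

    return (uint16_t)(bits >> 16);    /* keep sign, exponent, top 7 mantissa bits */
}

/* Hypothetical helper: widen a BF16 bit pattern back to float (exact). */
static float bf16_to_f32(uint16_t h)
{
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

For example, `f32_to_bf16_rne(1.0f)` yields `0x3F80`, and the exact tie `1.00390625f` (bit pattern `0x3F808000`) also rounds down to `0x3F80` because that result has an even low bit.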