Balasubramanian, Vignesh 54ac36c8bc Bugfix: BF16 to F32 conversion in AVX2 F32 codepath
- Updated the BF16-to-F32 conversion function to use the
  correct strides when storing, for the case where the
  inputs are column-stored.

- The conversion of B may be multithreaded, using the
  threads meant for IC compute. With the wrong strides
  in the kernel, this led to incorrect writes to the
  miscellaneous buffer.
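
A minimal scalar sketch of the idea behind the two points above, not the
actual AVX2 kernel: a BF16 value is the upper 16 bits of an IEEE-754
float, so widening is a 16-bit left shift, and the store must index the
destination with the destination's own strides. The helper names
(bf16_to_f32, convert_bf16_to_f32) and the stride parameters (rs_src,
cs_src, rs_dst, cs_dst) are hypothetical and chosen for illustration only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Widen one BF16 value to F32. BF16 holds the upper 16 bits of an
   IEEE-754 single, so the low 16 mantissa bits are filled with zeros. */
static inline float bf16_to_f32(uint16_t v)
{
    uint32_t bits = (uint32_t)v << 16;
    float out;
    memcpy(&out, &bits, sizeof out); /* bit-cast without aliasing issues */
    return out;
}

/* Hypothetical helper: convert a rows x cols tile from BF16 to F32.
   The source is read with (rs_src, cs_src) and the destination is
   written with (rs_dst, cs_dst). Reusing the source strides for the
   store is the kind of mistake that would write outside the intended
   region of the destination buffer. */
static void convert_bf16_to_f32(const uint16_t *src, size_t rs_src, size_t cs_src,
                                float *dst, size_t rs_dst, size_t cs_dst,
                                size_t rows, size_t cols)
{
    for (size_t i = 0; i < rows; ++i)
        for (size_t j = 0; j < cols; ++j)
            dst[i * rs_dst + j * cs_dst] = bf16_to_f32(src[i * rs_src + j * cs_src]);
}
```

If several threads each convert a disjoint tile of B into a shared
buffer, as assumed here, correct destination strides are what keep each
thread's writes inside its own tile.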

AMD-Internal: [CPUPL-7675]

Co-authored-by: Vishal-A <Vishal.Akula@amd.com>
Co-authored-by: Vignesh Balasubramanian <vignbala@amd.com>
2025-12-01 15:06:13 +05:30