blis/kernels/zen
Vignesh Balasubramanian 06f23c4fd4 Bugfix: Functional correctness of DNRM2_ and DZNRM2_ APIs
- Updated the final reduction of partial sums (AVX-2 code section)
  to use scalar accumulation entirely, instead of the
  _mm256_hadd_pd( ... ) intrinsic. This changes the order in which
  the partial sums are associated in the reduction step.

- Reverted to scalar code for the fringe cases in the AVX-2 kernels
  for DNRM2 and DZNRM2, to improve functional correctness.
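
The effect of the reduction order can be illustrated with a small portable
sketch. It is not the kernel code from this commit: the four lanes of an AVX-2
accumulator are modelled as a plain double[4], and both function names are
hypothetical. The first function adds the lanes strictly left to right, as a
scalar loop would; the second uses the pairwise grouping that a typical
hadd-based reduction would produce. Because floating-point addition is not
associative, the two can round differently.

```c
#include <stddef.h>

/* Hypothetical helper (not from the BLIS sources): reduce four partial
 * sums strictly left to right, ((l0 + l1) + l2) + l3, the way a scalar
 * accumulation loop does. */
double reduce_lanes_scalar(const double lanes[4])
{
    double sum = 0.0;
    for (size_t i = 0; i < 4; ++i)
        sum += lanes[i];
    return sum;
}

/* Hypothetical helper: reduce the same four partial sums pairwise,
 * (l0 + l1) + (l2 + l3). This is assumed to mirror the grouping of a
 * horizontal-add reduction followed by adding the two 128-bit halves. */
double reduce_lanes_pairwise(const double lanes[4])
{
    return (lanes[0] + lanes[1]) + (lanes[2] + lanes[3]);
}
```

With lanes such as { 1e16, 1.0, -1e16, 1.0 }, the two groupings absorb the
small terms at different points and return different results under IEEE 754
double arithmetic, while for exactly representable inputs such as
{ 1.0, 2.0, 3.0, 4.0 } they agree.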

AMD-Internal: [CPUPL-4049]
Change-Id: I9d320b39d23a0cbcc77fb24d951fced778ea5ea5
2023-11-07 10:21:41 -05:00