Files
blis/gtestsuite/testinghelpers
Vignesh Balasubramanian a6a67fea2d ZAXPBYV optimizations for handling unit and non-unit strides
- Updated the bli_zaxpbyv_zen_int( ... ) kernel's computational
  logic. The kernel performs two different sets of compute based
  on the value of alpha, for both unit and non-unit strides. There
  are no constraints on beta scaling of the 'y' vector.

- Updated the logic to support 'x' conjugate in the computation.
  The kernel supports conjugate/no conjugate operation through the
  usage of _mm256_fmsubadd_pd( ... ) and _mm256_addsub_pd( ... )
  intrinsics.

- Updated the early return condition in the kernel to adhere to
  the standard compliance.

- Updated the scalar computation with vector computation(using 128
  bit registers), in case of dealing with a single element(fringe case)
  in unit-stride or vectors with non-unit strides. A single dcomplex
  element occupies 128 bits in memory, thereby providing scope for
  this optimization.

- Added accuracy and extreme value testing with sufficient sizes
  and initializations, to test the required main and fringe cases
  of the computation.

AMD-Internal: [CPUPL-3623]
Change-Id: I7ae918856e7aba49424162290f3e3d592c244826
2023-10-12 06:31:08 -04:00
..