Files
blis/kernels/zen
Harihara Sudhan S 03fa660792 Optimized xGEMV for non-unit stride X vector
- In GEMV variant 1, the input matrix A is in row major. X vector
  has to be of unit stride if the operation is to be vectorized.
- In cases when X vector is non-unit stride, vectorization of the GEMV
  operation inside the kernel has been ensured by packing the input X
  vector to a temporary buffer with unit stride. Currently, the
  packing is done using the SCAL2V.
- In case of DGEMV, X vector is scaled by alpha as part of packing.
  In CGEMV and ZGEMV, alpha is passed as 1 while packing.
- The temporary buffer created is released once the GEMV operation
  is complete.
- In DGEMV variant 1, moved problem decomposition for Zen architecture
  to the DOTXF kernel.
- Removed flag check based kernel dispatch logic from DGEMV. Now,
  kernels will be picked from the context for non-avx machines. For
  avx machines, the kernel(s) to be dispatched is(are) assigned to
  the function pointer in the unf_var layer.

AMD-Internal: [CPUPL-3475]
Change-Id: Icd9fd91eccd831f1fcb9fbf0037fcbbc2e34268e
2023-08-08 01:01:22 -04:00
..
2023-08-06 01:51:47 -04:00
2022-10-14 12:43:35 +05:30
2021-11-12 08:58:51 +05:30