mirror of
https://github.com/amd/blis.git
synced 2026-05-21 17:08:17 +00:00
- In GEMV variant 1, the input matrix A is in row-major order, so the X vector must have unit stride for the operation to be vectorized.
- When the X vector has non-unit stride, the GEMV operation inside the kernel is still vectorized by packing X into a temporary buffer with unit stride. Currently, the packing is done using SCAL2V.
- For DGEMV, the X vector is scaled by alpha as part of packing. For CGEMV and ZGEMV, alpha is passed as 1 while packing.
- The temporary buffer is released once the GEMV operation is complete.
- In DGEMV variant 1, moved the problem decomposition for the Zen architecture to the DOTXF kernel.
- Removed the flag-check-based kernel dispatch logic from DGEMV. Kernels are now picked from the context on non-AVX machines. On AVX machines, the kernel(s) to be dispatched are assigned to the function pointer in the unf_var layer.

AMD-Internal: [CPUPL-3475]
Change-Id: Icd9fd91eccd831f1fcb9fbf0037fcbbc2e34268e
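
The packing step described above can be sketched in plain C as follows. This is an illustrative sketch only, not the BLIS implementation: `pack_scal2v` and `gemv_rowmajor_sketch` are hypothetical names, the scalar loops stand in for the vectorized SCAL2V and DOTXF kernels, and positive strides are assumed. It shows the real-valued (DGEMV-like) case, where alpha is folded into the packed copy of X so the dot-product loop runs over unit-stride data with alpha treated as 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a SCAL2V-style pack:
   y[i] = alpha * x[i * incx], where y has unit stride. */
static void pack_scal2v(size_t n, double alpha,
                        const double *x, ptrdiff_t incx, double *y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] = alpha * x[(ptrdiff_t)i * incx];
}

/* Sketch of y := beta*y + alpha*A*x for an m x n row-major A with
   leading dimension lda. Returns 0 on success, -1 if the temporary
   buffer cannot be allocated. Assumes incx > 0 and incy > 0. */
static int gemv_rowmajor_sketch(size_t m, size_t n, double alpha,
                                const double *a, size_t lda,
                                const double *x, ptrdiff_t incx,
                                double beta, double *y, ptrdiff_t incy)
{
    assert(incx > 0 && incy > 0);

    const double *xp = x;      /* vector actually read by the dot loop */
    double *buf = NULL;        /* temporary unit-stride copy of x */
    double alpha_eff = alpha;  /* alpha still to be applied after the dot */

    if (incx != 1 && n > 0) {
        buf = malloc(n * sizeof *buf);
        if (buf == NULL)
            return -1;
        /* Pack x to unit stride and scale by alpha in the same pass. */
        pack_scal2v(n, alpha, x, incx, buf);
        xp = buf;
        alpha_eff = 1.0;       /* alpha is already folded into buf */
    }

    /* One dot product per row; both operands are contiguous here, which
       is what makes the inner loop straightforward to vectorize. */
    for (size_t i = 0; i < m; ++i) {
        const double *row = a + i * lda;
        double dot = 0.0;
        for (size_t j = 0; j < n; ++j)
            dot += row[j] * xp[j];
        double *yi = y + (ptrdiff_t)i * incy;
        *yi = beta * (*yi) + alpha_eff * dot;
    }

    free(buf);                 /* release the temporary buffer; free(NULL) is a no-op */
    return 0;
}
```

Folding alpha into the pack avoids a separate scaling pass over the data in the real case. For the complex cases, the commit passes alpha as 1 during packing, so the scaling would stay in the kernel rather than in the pack.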