mirror of
https://github.com/amd/blis.git
synced 2026-05-04 06:21:12 +00:00
1. Added Trans A feature to handle column major inputs
for A matrix.
2. Trans A is enabled by on-the-go pack of A matrix.
3. The on-the-go pack of A converts a column storage
MCxKC block of A into row storage MCxKC block as
LPGEMM kernels are row major kernels.
4. New pack routines are added for conversion of A matrix
from column major storage to row major storage.
5. LPGEMM Cntx is updated with pack kernel function
pointers.
6. Packing of A matrix:
- Converts column major input A to row major
in blocks of MCxKC with newly added pack A
functions when cs_a > 1.
7. Pack routines are added for AVX512 and AVX2
INT8 LPGEMM APIs.
8. Trans A feature is now supported in:
1. u8s8s32os32/os8
2. u8s8s16os16/os8/ou8
3. s8s8s32os32/os8
4. s8s8s16os16/os8
AMD-Internal: SWLCSG-2582
Change-Id: I7ce331545525a9a09f3853280615b55fcf2edabf