Packed A matrix stride update to account for fringe cases.

-When the A matrix is packed, it is packed in blocks of MRxKC to form a
whole packed MCxKC block. If m is not a multiple of MR, the remaining
m % MR rows (the fringe block) are packed differently from the full MR
blocks. As a result, the stride of a packed MR block differs from the
stride of the m % MR block, and the stride must be updated accordingly
when calling the GEMV kernels with a packed A matrix.
-Fixes to address compiler warnings.
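
The stride difference described above can be illustrated with a minimal
sketch. It is not the kernel code touched by this commit: the function
names pack_a and gemv_packed_a are hypothetical, and the layout is an
assumption made for illustration (inside a panel, element (i, k) sits
at k * rows + i, with rows == MR for full panels and rows == m % MR for
the fringe panel). The point is only that the consumer has to pick the
stride per panel instead of assuming MR everywhere.

```c
#include <stddef.h>

/* Pack an m x kc row-major block of A into panels of up to mr rows.
 * ASSUMED layout (hypothetical, for illustration only): inside a panel,
 * element (i, k) is stored at k * rows + i, where rows == mr for full
 * panels and rows == m % mr for the fringe panel. */
void pack_a(const float *a, size_t lda, size_t m, size_t kc,
            size_t mr, float *packed)
{
    size_t off = 0;
    for (size_t i0 = 0; i0 < m; i0 += mr) {
        size_t rows = (m - i0 < mr) ? (m - i0) : mr;
        for (size_t k = 0; k < kc; ++k)
            for (size_t i = 0; i < rows; ++i)
                packed[off + k * rows + i] = a[(i0 + i) * lda + k];
        off += rows * kc; /* the fringe panel occupies fewer elements */
    }
}

/* y = A * x, reading A from the packed buffer produced by pack_a.
 * The stride between consecutive k values is chosen per panel:
 * mr for full panels, m % mr for the fringe panel. */
void gemv_packed_a(const float *packed, size_t m, size_t kc,
                   size_t mr, const float *x, float *y)
{
    size_t off = 0;
    for (size_t i0 = 0; i0 < m; i0 += mr) {
        size_t rows = (m - i0 < mr) ? (m - i0) : mr;
        size_t k_stride = rows; /* using mr here would misread the fringe */
        for (size_t i = 0; i < rows; ++i) {
            float acc = 0.0f;
            for (size_t k = 0; k < kc; ++k)
                acc += packed[off + k * k_stride + i] * x[k];
            y[i0 + i] = acc;
        }
        off += rows * kc;
    }
}
```

With m = 7 and MR = 4 under this assumed layout, the first panel is
read with a stride of 4 and the fringe panel with a stride of 3.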

AMD-Internal: [SWLCSG-3359]
Change-Id: I7f47afbc9cd92536cb375431d74d9b8bca7bab44
Mithun Mohan
2025-01-21 11:45:28 +00:00
committed by Nallani Bhaskar
parent 66461b8df3
commit 39289858b7
8 changed files with 55 additions and 18 deletions

@@ -1,3 +1,4 @@
r t n n r 1 128 64 1 128 128 *:none
c n t n n 32 128 2 32 128 32 bf16bf16f32of32:bias=na,swish
r n n n r 6 1 4 4 16 16 bf16s4f32of32:pre_op_scale=scalar,pre_op_scale_type=bf16,group_size=2
r n n n r 6 1 4 4 16 16 bf16s4f32of32:pre_op_zp=vector,pre_op_scale=scalar,pre_op_scale_type=bf16,group_size=2