mirror of
https://github.com/amd/blis.git
synced 2026-03-27 20:57:21 +00:00
- The existing row-preferred reference kernels for the GEMM SUP path did not take the packing state of matrices A and B into account. Whenever A, B, or both were packed, the kernels could not iterate through the matrices correctly and computed incorrect values, resulting in failures.
- Although the SUP path is disabled by default for the generic configuration, the Pack and Compute Extension APIs use these kernels, so this issue caused those APIs to fail as well.
- With this patch, the loops in these kernels iterate in steps of MR and NR and also account for the fringe cases. Within the updated loops, temporary pointers that point to the current block/panel of each matrix are incremented by the panel stride of the respective matrix.

AMD-Internal: [CPUPL-5674]
Change-Id: Ic3939877c79ebb9ccf9e53b1d1672cea4b8c5959
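
The loop structure described above can be sketched roughly as follows. This is a minimal illustration, not the actual BLIS kernel code: the function names, the fixed `MR`/`NR` values, the `double` datatype, and the packed layouts (MR-row panels for A, NR-column panels for B, zero-padded at the fringe) are all assumptions made for the example. The point it shows is that the same loops work for packed and unpacked operands once the panel pointers advance by a per-matrix panel stride.

```c
#include <assert.h>
#include <stddef.h>

enum { MR = 4, NR = 3 };  /* hypothetical register block sizes */

static size_t min_sz(size_t x, size_t y) { return x < y ? x : y; }

/* C += A * B. rs_x/cs_x are the strides inside one panel of X and ps_x is
 * the stride between consecutive panels. For an unpacked matrix, pass
 * ps_a = MR * rs_a and ps_b = NR * cs_b. */
static void gemm_ref_panels(size_t m, size_t n, size_t k,
                            const double *a, size_t rs_a, size_t cs_a, size_t ps_a,
                            const double *b, size_t rs_b, size_t cs_b, size_t ps_b,
                            double *c, size_t rs_c, size_t cs_c)
{
    assert(a && b && c);
    const double *a_panel = a;  /* temporary pointer to the current panel of A */
    for (size_t i0 = 0; i0 < m; i0 += MR, a_panel += ps_a) {
        size_t mr_cur = min_sz(MR, m - i0);  /* fringe: fewer than MR rows left */
        const double *b_panel = b;  /* temporary pointer to the current panel of B */
        for (size_t j0 = 0; j0 < n; j0 += NR, b_panel += ps_b) {
            size_t nr_cur = min_sz(NR, n - j0);  /* fringe: fewer than NR columns left */
            for (size_t i = 0; i < mr_cur; ++i)
                for (size_t j = 0; j < nr_cur; ++j) {
                    double acc = 0.0;
                    for (size_t p = 0; p < k; ++p)
                        acc += a_panel[i * rs_a + p * cs_a] * b_panel[p * rs_b + j * cs_b];
                    c[(i0 + i) * rs_c + (j0 + j) * cs_c] += acc;
                }
        }
    }
}

/* Pack row-major A (m x k) into MR-row panels; fringe rows are zero-padded. */
static void pack_a(size_t m, size_t k, const double *a, size_t lda, double *ap)
{
    for (size_t i0 = 0; i0 < m; i0 += MR, ap += MR * k)
        for (size_t p = 0; p < k; ++p)
            for (size_t i = 0; i < MR; ++i)
                ap[p * MR + i] = (i0 + i < m) ? a[(i0 + i) * lda + p] : 0.0;
}

/* Pack row-major B (k x n) into NR-column panels; fringe columns are zero-padded. */
static void pack_b(size_t k, size_t n, const double *b, size_t ldb, double *bp)
{
    for (size_t j0 = 0; j0 < n; j0 += NR, bp += NR * k)
        for (size_t p = 0; p < k; ++p)
            for (size_t j = 0; j < NR; ++j)
                bp[p * NR + j] = (j0 + j < n) ? b[p * ldb + j0 + j] : 0.0;
}

/* Returns 1 if the packed and unpacked calls produce the same C. */
int packed_matches_unpacked(void)
{
    enum { M = 5, N = 4, K = 3 };  /* both dimensions leave a fringe */
    double a[M * K], b[K * N], c_ref[M * N] = {0}, c_pk[M * N] = {0};
    double ap[2 * MR * K], bp[2 * NR * K];  /* two panels each */

    for (int i = 0; i < M * K; ++i) a[i] = (double)(i + 1);
    for (int i = 0; i < K * N; ++i) b[i] = (double)(i % 5 - 2);
    pack_a(M, K, a, K, ap);
    pack_b(K, N, b, N, bp);

    /* Unpacked row-major operands. */
    gemm_ref_panels(M, N, K, a, K, 1, MR * K, b, N, 1, NR, c_ref, N, 1);
    /* Packed operands: only the strides change, not the loops. */
    gemm_ref_panels(M, N, K, ap, 1, MR, MR * K, bp, NR, 1, NR * K, c_pk, N, 1);

    for (int i = 0; i < M * N; ++i)
        if (c_ref[i] != c_pk[i]) return 0;
    return 1;
}
```

In this sketch the caller encodes the packing state entirely in the stride arguments, so a 5 x 4 x 3 problem with `MR = 4` and `NR = 3` exercises the fringe handling in both dimensions for packed and unpacked inputs alike.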