mirror of
https://github.com/amd/blis.git
synced 2026-04-19 23:28:52 +00:00
- In the single-threaded configuration of BLIS, packing of the A and B matrices is enabled by default. However, packing of A is only supported for RV kernels, where elements of matrix A are broadcast. RD kernels load elements of A instead, so packing A results in failures. Packing of matrix A is therefore disabled for RD kernels.
- Fixed an issue where the c_i index pointer was incorrectly reset when exceeding the MC block, resulting in failures for certain Post-Ops.
- Fixed the FP32 reorder case where, for n == 1 and rs_b == 1, the code incorrectly used sizeof(BLIS_FLOAT) instead of sizeof(float).

AMD-Internal: [SWLCSG-3497]
Change-Id: I6d18afa996c253d79f666ea9789270bb59b629dd