mirror of
https://github.com/amd/blis.git
synced 2026-05-11 09:39:59 +00:00
Details:
- Restructured the herk_l and herk_u macro-kernels in the image of trmm and trsm, so that edge cases are handled by the main loop rather than by separate "cleanup" sections that split the code into four distinct parts (interior, bottom edge, right edge, bottom-right edge).
- Fixed the way b_next was computed in the non-gemm level-3 macro-kernels (herk, trmm, trsm). It is now computed the same way as in gemm.
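
The two changes can be illustrated with a minimal sketch in C. This is not the BLIS source. The block sizes MR and NR, the tile_stats struct, and the functions visit_tiles and next_b_panel are hypothetical names chosen for illustration. The first function shows a single loop nest in which partial tiles along the bottom and right edges are handled by shrinking the tile dimensions, with no separate cleanup sections. The second shows one plausible way to pick the next micro-panel of B to prefetch, assuming the rule is "stay on the current panel until the last row tile, then advance, wrapping to the first panel after the final tile".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical register block sizes, chosen only for illustration. */
enum { MR = 4, NR = 4 };

/* Summary of the tiles visited while sweeping an m x n block. */
typedef struct {
    size_t full_tiles;  /* tiles of exactly MR x NR */
    size_t edge_tiles;  /* tiles clipped by the bottom and/or right edge */
    size_t elems;       /* total elements covered by all tiles */
} tile_stats;

/* Sweep an m x n block with one loop nest. Edge tiles are handled inside
   the main loop by reducing m_cur/n_cur on the last iteration, instead of
   in separate bottom, right, and bottom-right cleanup sections. */
tile_stats visit_tiles(size_t m, size_t n)
{
    tile_stats s = {0, 0, 0};
    size_t m_iter = (m + MR - 1) / MR;  /* row tiles, counting a partial one */
    size_t n_iter = (n + NR - 1) / NR;  /* column tiles, counting a partial one */
    size_t m_left = m % MR;
    size_t n_left = n % NR;

    for (size_t j = 0; j < n_iter; ++j) {
        size_t n_cur = (j == n_iter - 1 && n_left != 0) ? n_left : NR;
        for (size_t i = 0; i < m_iter; ++i) {
            size_t m_cur = (i == m_iter - 1 && m_left != 0) ? m_left : MR;
            if (m_cur == MR && n_cur == NR)
                s.full_tiles++;
            else
                s.edge_tiles++;
            /* A real macro-kernel would call the micro-kernel here. */
            s.elems += m_cur * n_cur;
        }
    }
    return s;
}

/* Index of the B micro-panel to prefetch after tile (i, j).
   Assumed rule, for illustration only. */
size_t next_b_panel(size_t i, size_t j, size_t m_iter, size_t n_iter)
{
    assert(i < m_iter && j < n_iter);
    if (i + 1 < m_iter)
        return j;       /* more row tiles remain: the same B panel is reused */
    if (j + 1 < n_iter)
        return j + 1;   /* last row tile: the next B panel comes next */
    return 0;           /* very last tile: wrap around to the first panel */
}
```

With these assumed 4 x 4 blocks, a 10 x 6 block is covered by two full tiles and four edge tiles, all visited by the same loop nest.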