mirror of
https://github.com/amd/blis.git
synced 2026-05-11 09:39:59 +00:00
Details:
- Changed the BLIS_HEAP_STRIDE_ALIGN_SIZE in the configurations from 16 to
  BLIS_CACHE_LINE_SIZE (typically 64).
- Changed the use of nr in sizing of the bd buffer to packnr in the level-3
  macro-kernels.
- Reformulated gemm_ker_var2 to look more like the other level-3
  macro-kernels, in that the interior and edge-case handling is expressed
  once inside the loops in the n and m dimensions, rather than the edge-case
  handling being "unrolled" and expressed as distinct code regions. The
  previous macro-kernel now lives in retired form in the subdirectory
  other/bli_gemm_ker_var2.c.old.
- Updated the experimental gemm_ker_var5 according to the above change.
- Fixed a bug in bli_her2k.c whereby incorrect transformations were being
  applied to optimize the macro-kernel's access pattern on C when C is
  row-stored.
- Various updates inside of test/exec_sizes.
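
The loop structure described for gemm_ker_var2 can be illustrated with a minimal sketch. This is not the code from this commit: the names macro_kernel_sketch and ukernel_ref, the block sizes MR and NR, and the packed-panel layout are all assumptions made for illustration. It shows edge cases being handled once inside the n and m loops, by clamping the current block size, instead of in separate code regions after the loops. For simplicity the sketch routes every block through a temporary MR x NR buffer, where a tuned macro-kernel would likely write interior blocks directly to C.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical register-block sizes; real values depend on the configuration. */
enum { MR = 4, NR = 4 };

/* Reference micro-kernel (assumed layout): ct = a_panel * b_panel, where ct is
   MR x NR in column-major order, a holds k columns of MR elements, and b holds
   k rows of NR elements. Panels are assumed to be zero-padded at the edges. */
static void ukernel_ref(size_t k, const double *a, const double *b, double *ct)
{
    for (size_t x = 0; x < (size_t)MR * NR; ++x)
        ct[x] = 0.0;
    for (size_t p = 0; p < k; ++p)
        for (size_t j = 0; j < NR; ++j)
            for (size_t i = 0; i < MR; ++i)
                ct[i + j * MR] += a[p * MR + i] * b[p * NR + j];
}

/* C (m x n, column-major, leading dimension ldc) += A_packed * B_packed. */
void macro_kernel_sketch(size_t m, size_t n, size_t k,
                         const double *a_packed, const double *b_packed,
                         double *c, size_t ldc)
{
    double ct[MR * NR];

    assert(ldc >= m);

    for (size_t j = 0; j < n; j += NR) {
        /* Interior and edge cases share one code path: only n_cur differs. */
        size_t n_cur = (n - j < NR) ? n - j : NR;
        const double *b1 = b_packed + (j / NR) * k * NR;

        for (size_t i = 0; i < m; i += MR) {
            size_t m_cur = (m - i < MR) ? m - i : MR;
            const double *a1 = a_packed + (i / MR) * k * MR;

            /* Compute a full MR x NR block into the temporary buffer. */
            ukernel_ref(k, a1, b1, ct);

            /* Accumulate only the m_cur x n_cur part that lies inside C. */
            for (size_t jj = 0; jj < n_cur; ++jj)
                for (size_t ii = 0; ii < m_cur; ++ii)
                    c[(i + ii) + (j + jj) * ldc] += ct[ii + jj * MR];
        }
    }
}
```

With m = 5 and n = 3, for example, the second iteration of the m loop sees m_cur = 1 and the only iteration of the n loop sees n_cur = 3, so the partial blocks are covered by the same loop body as the full ones.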