mirror of
https://github.com/amd/blis.git
synced 2026-05-11 17:50:00 +00:00
Details:
- Modified bli_packm_blk_var1.c and _var2.c to increase the triangular case's panel increment by 1 if it would otherwise be odd. This is particularly necessary in _var2.c when handling the interleaved 3m or ro/io/rpi pack schemas, since division of an odd number by 2 can happen if both the panel length and the panel packing dimension (register packing blocksize) are odd, thus making their product odd.
- Modified bli_packm_init.c so that panel strides are increased by 1 if they would otherwise be odd, even for non-3m related packing.
- Modified the trmm and trsm macro-kernels so that triangular packed micro-panels are traversed with this new "increment by 1 if odd" policy.
- Added sanity checks in trmm and trsm macro-kernels that would result in an abort() if the conditions that would lead to a "divide odd integer by 2" scenario ever manifest.
- Defined bli_is_odd(), _is_even() macros in bli_scalar_macro_defs.h.
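
The "increment by 1 if odd" policy described above can be illustrated with a minimal sketch. This is not the code from the commit: the sketch_ names, the long integer type, and the macro bodies are all assumptions made for illustration, and the actual definitions of bli_is_odd() and bli_is_even() in bli_scalar_macro_defs.h may differ.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the parity macros named in the commit
   message; the real BLIS definitions may be written differently. */
#define sketch_is_odd(a)  ((a) % 2 != 0)
#define sketch_is_even(a) (!sketch_is_odd(a))

/* "Increment by 1 if odd": return n unchanged when even, n + 1 when odd. */
static long sketch_round_up_to_even(long n)
{
    return sketch_is_odd(n) ? n + 1 : n;
}

/* Assumed model of a panel stride: the product of the panel length and
   the panel packing dimension, padded so that it is never odd. */
static long sketch_panel_stride(long panel_len, long pack_dim)
{
    assert(panel_len >= 0 && pack_dim >= 0);
    return sketch_round_up_to_even(panel_len * pack_dim);
}

/* Guard in the spirit of the macro-kernel sanity checks: abort rather
   than silently truncate when asked to halve an odd value. */
static long sketch_half(long n)
{
    if (sketch_is_odd(n)) abort();
    return n / 2;
}
```

With a panel length of 3 and a packing dimension of 5, the raw product is 15; the padded stride becomes 16, so halving it gives exactly 8 instead of a truncated 7.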