Details:
- Added missing C preprocessor guards in bli_kernel_macro_defs.h that enforce constraints on the register blocksizes relative to the cache blocksizes. Thanks to Tyler for helping me stumble across this issue.
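
For readers unfamiliar with this kind of guard, here is a minimal sketch of the general technique. It assumes the constraint is that each cache blocksize must be a whole multiple of the matching register blocksize; the message above does not spell out the exact constraints, so treat that as an assumption. All `EXAMPLE_*` names and values are hypothetical and are not the macros defined in bli_kernel_macro_defs.h.

```c
/* Hypothetical blocksizes for illustration only; these are not the
   names or values used in bli_kernel_macro_defs.h. */
#define EXAMPLE_MR 8      /* register blocksize, m dimension */
#define EXAMPLE_NR 4      /* register blocksize, n dimension */
#define EXAMPLE_MC 128    /* cache blocksize, m dimension */
#define EXAMPLE_NC 4096   /* cache blocksize, n dimension */

/* Assumed constraint: a cache blocksize must be a whole multiple of the
   corresponding register blocksize. A violation stops compilation. */
#if ( EXAMPLE_MC % EXAMPLE_MR != 0 )
  #error "EXAMPLE_MC must be a whole multiple of EXAMPLE_MR."
#endif

#if ( EXAMPLE_NC % EXAMPLE_NR != 0 )
  #error "EXAMPLE_NC must be a whole multiple of EXAMPLE_NR."
#endif

/* With the guards in place, these divisions are known to be exact. */
enum { EXAMPLE_MC_PANELS = EXAMPLE_MC / EXAMPLE_MR };
enum { EXAMPLE_NC_PANELS = EXAMPLE_NC / EXAMPLE_NR };
```

Changing `EXAMPLE_MC` to a value such as 100 would make the first `#error` fire at compile time, so a bad configuration fails the build rather than misbehaving at run time.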