Details:
- Updated the changes introduced in 618f433 so that the strides of the
temporary microtile ct used in the macrokernels are determined by
the storage preference of the microkernel (via the new functions
below), rather than by the strides of c. At present, this change has
no net effect in almost all cases, because a high-level optimization
in the _front() functions aligns the storage of c with the
microkernel's preference. However, in some development code that has
yet to be committed, I encountered cases where this alignment does
not hold, and therefore I'm generalizing the framework code in
advance.
- Defined two new functions in bli_cntx.c:
bli_cntx_l3_ukr_prefers_rows_dt()
bli_cntx_l3_ukr_prefers_cols_dt()
which each return a bool_t indicating the current microkernel's
storage preference. For induced methods, the preference of the
underlying real-domain microkernel is returned.
- Updated the definition of bli_cntx_l3_ukr_dislikes_storage_of(), and
by proxy bli_cntx_l3_ukr_prefers_storage_of(), to be in terms of
the above functions, rather than querying the preferences of the
native microkernel directly (which did the wrong thing for induced
methods).
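
The logic described above can be sketched roughly as follows. This is a
simplified illustration, not the actual BLIS code: the types ukr_info_t
and strides_t, the helper ct_strides(), and the shortened function names
are hypothetical stand-ins for the cntx_t/obj_t machinery, and the
row/column tests (unit column stride, unit row stride) are assumed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the context: records the native microkernel's
   preference, the real-domain microkernel's preference, and whether an
   induced method is in use. */
typedef struct
{
    bool is_induced;
    bool native_prefers_rows;
    bool real_prefers_rows;
} ukr_info_t;

/* Hypothetical row/column stride pair. */
typedef struct { int rs; int cs; } strides_t;

/* For induced methods, report the preference of the underlying real-domain
   microkernel; otherwise report the native microkernel's preference. */
static bool ukr_prefers_rows( const ukr_info_t* u )
{
    return u->is_induced ? u->real_prefers_rows : u->native_prefers_rows;
}

static bool ukr_prefers_cols( const ukr_info_t* u )
{
    return !ukr_prefers_rows( u );
}

/* Choose strides for the mr x nr temporary microtile ct from the
   microkernel's preference, not from the strides of c. */
static strides_t ct_strides( const ukr_info_t* u, int mr, int nr )
{
    strides_t s;
    assert( mr > 0 && nr > 0 );
    if ( ukr_prefers_rows( u ) ) { s.rs = nr; s.cs = 1;  } /* row storage */
    else                         { s.rs = 1;  s.cs = mr; } /* column storage */
    return s;
}

/* Expressed in terms of the two preference queries above, so that induced
   methods are handled in one place. */
static bool ukr_dislikes_storage_of( const ukr_info_t* u, strides_t c )
{
    const bool c_is_row_stored = ( c.cs == 1 );
    const bool c_is_col_stored = ( c.rs == 1 );

    if      ( c_is_row_stored && ukr_prefers_cols( u ) ) return true;
    else if ( c_is_col_stored && ukr_prefers_rows( u ) ) return true;
    return false;
}

static bool ukr_prefers_storage_of( const ukr_info_t* u, strides_t c )
{
    return !ukr_dislikes_storage_of( u, c );
}
```

In this sketch, an induced method whose real-domain microkernel prefers
columns yields a column-stored ct even when the native microkernel
prefers rows, which is the case the old native-only query got wrong.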