mirror of
https://github.com/amd/blis.git
synced 2026-05-11 01:30:00 +00:00
Can disable trsm_r-specific blocksize constraints.
Details:
- Added cpp guards around the constraints in bli_kernel_macro_defs.h that enforce MC % NR = 0 and NC % MR = 0. These constraints are ONLY needed when handling right-side trsm by allowing the matrix on the right (matrix B) to be triangular, because it involves swapping register, but not cache, blocksizes (packing A by NR and B by MR) and then swapping the operands to gemmtrsm just before that kernel is called. It may be useful to disable these constraints if, for example, the developer wishes to test the configuration with a different set of cache blocksizes where only MC % MR = 0 and NC % NR = 0 are enforced.
- In summary, #defining BLIS_RELAX_MCNR_NCMR_CONSTRAINTS will bypass the enforcement of MC % NR = 0 and NC % MR = 0.
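The guard pattern described above can be sketched in isolation as follows. This is a minimal illustration, not the contents of bli_kernel_macro_defs.h: the EXAMPLE_* blocksize values and the example_blocksizes_ok() helper are hypothetical and exist only to show which checks are always enforced and which are skipped when BLIS_RELAX_MCNR_NCMR_CONSTRAINTS is defined.

```c
#include <stdbool.h>

/* Hypothetical blocksizes, chosen only for illustration; they are not
   taken from any real BLIS configuration. */
#define EXAMPLE_MC 96
#define EXAMPLE_NC 4080
#define EXAMPLE_MR 8
#define EXAMPLE_NR 6

/* Always enforced: cache blocksizes are multiples of the matching
   register blocksizes. */
#if ( EXAMPLE_MC % EXAMPLE_MR != 0 )
#error "MC must be multiple of MR."
#endif
#if ( EXAMPLE_NC % EXAMPLE_NR != 0 )
#error "NC must be multiple of NR."
#endif

/* Enforced only when the relax macro is NOT defined: the cross
   constraints needed by the right-side trsm handling. */
#ifndef BLIS_RELAX_MCNR_NCMR_CONSTRAINTS
#if ( EXAMPLE_MC % EXAMPLE_NR != 0 )
#error "MC must be multiple of NR."
#endif
#if ( EXAMPLE_NC % EXAMPLE_MR != 0 )
#error "NC must be multiple of MR."
#endif
#endif

/* Hypothetical runtime mirror of the same logic. 'relax' plays the role
   of defining BLIS_RELAX_MCNR_NCMR_CONSTRAINTS. */
static bool example_blocksizes_ok( int mc, int nc, int mr, int nr, bool relax )
{
    bool base   = ( mc % mr == 0 ) && ( nc % nr == 0 );
    bool trsm_r = ( mc % nr == 0 ) && ( nc % mr == 0 );

    return relax ? base : ( base && trsm_r );
}
```

With these example values, NC = 4092 would pass the base checks (4092 % 6 = 0) but fail the cross check (4092 % 8 = 4), so it would only be accepted with the relax macro defined. One plausible way to define the macro is on the compiler command line, for example with -DBLIS_RELAX_MCNR_NCMR_CONSTRAINTS under gcc or clang; where the define belongs in a given BLIS configuration is not stated in this commit.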
@@ -1351,11 +1351,19 @@

 // Verify that cache blocksizes are whole multiples of register blocksizes.
 // Specifically, verify that:
-// - MC is a whole multiple of MR *AND* NR.
-// - NC is a whole multiple of NR *AND* MR.
-// - KC is a whole multiple of KR *AND* both MR, NR.
+// - MC is a whole multiple of MR.
+// - NC is a whole multiple of NR.
+// - KC is a whole multiple of KR.
 // These constraints are enforced because it makes it easier to handle diagonals
-// in the macro-kernel implementations.
+// in the macro-kernel implementations. Additionally, we optionally verify that:
+// - MC is a whole multiple of NR.
+// - NC is a whole multiple of MR.
+// These latter constraints, guarded by #ifndef BLIS_RELAX_MCNR_NCMR_CONSTRAINTS
+// below, are only enforced when we wish to be able to handle the trsm right-
+// side case handling that swaps A and B, so that B is the triangular matrix,
+// with NR blocking used to pack A and MR blocking used to pack B, with the
+// arguments to the gemmtrsm microkernel swapped at the last minute, as the
+// kernel is called.

 //
 // MC must be a whole multiple of MR and NR.
@@ -1370,6 +1378,7 @@
 #error "MC must be multiple of MR for all datatypes."
 #endif

+#ifndef BLIS_RELAX_MCNR_NCMR_CONSTRAINTS
 #if ( \
 ( BLIS_DEFAULT_MC_S % BLIS_DEFAULT_NR_S != 0 ) || \
 ( BLIS_DEFAULT_MC_D % BLIS_DEFAULT_NR_D != 0 ) || \
@@ -1378,6 +1387,7 @@
 )
 #error "MC must be multiple of NR for all datatypes."
 #endif
+#endif

 //
 // NC must be a whole multiple of NR and MR.
@@ -1392,6 +1402,7 @@
 #error "NC must be multiple of NR for all datatypes."
 #endif

+#ifndef BLIS_RELAX_MCNR_NCMR_CONSTRAINTS
 #if ( \
 ( BLIS_DEFAULT_NC_S % BLIS_DEFAULT_MR_S != 0 ) || \
 ( BLIS_DEFAULT_NC_D % BLIS_DEFAULT_MR_D != 0 ) || \
@@ -1400,6 +1411,7 @@
 )
 #error "NC must be multiple of MR for all datatypes."
 #endif
+#endif

 //
 // KC must be a whole multiple of KR.
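
The updated comment block in the first hunk mentions packing A with NR blocking and B with MR blocking in the right-side trsm case. The arithmetic behind the two optional constraints can be sketched as below: a cache block splits into whole register-sized panels only when the cache blocksize is a multiple of the register blocksize used for packing. The type and function names are hypothetical and are not part of BLIS.

```c
#include <stddef.h>

/* Hypothetical result type: how a cache block of 'cache_bs' rows (or
   columns) divides into panels of 'reg_bs' rows (or columns). */
typedef struct
{
    size_t full_panels; /* number of complete register-sized panels */
    size_t edge;        /* leftover rows/columns in a partial panel */
} example_panel_split_t;

/* Hypothetical helper: split a cache blocksize by a register blocksize.
   Assumes reg_bs > 0. */
static example_panel_split_t example_split_into_panels( size_t cache_bs,
                                                        size_t reg_bs )
{
    example_panel_split_t s;

    s.full_panels = cache_bs / reg_bs;
    s.edge        = cache_bs % reg_bs;

    return s;
}
```

For example, with MC = 96, MR = 8, and NR = 6, packing by MR gives 12 full panels and packing by NR gives 16 full panels, both with no edge. With MC = 100 and NR = 6, packing by NR would leave an edge of 4, which is the situation the MC % NR = 0 check rules out when it is enabled.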