Can disable trsm_r-specific blocksize constraints.

Details:
- Added cpp guards around the constraints in bli_kernel_macro_defs.h
  that enforce MC % NR = 0 and NC % MR = 0. These constraints are ONLY
  needed when right-side trsm is handled by allowing the matrix on the
  right (matrix B) to be triangular. That approach swaps the register
  blocksizes, but not the cache blocksizes (A is packed by NR and B by
  MR), and then swaps the operands to gemmtrsm just before that kernel
  is called. Disabling these constraints may be useful if, for example,
  the developer wishes to test a configuration with a different set of
  cache blocksizes where only MC % MR = 0 and NC % NR = 0 are enforced.
- In summary, #defining BLIS_RELAX_MCNR_NCMR_CONSTRAINTS will bypass
  the enforcement of MC % NR = 0 and NC % MR = 0.
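
The guard pattern described above can be sketched in isolation. This is only an illustration under stated assumptions, not the code from bli_kernel_macro_defs.h: the DEMO_* names and blocksize values are hypothetical, DEMO_RELAX_MCNR_NCMR_CONSTRAINTS stands in for BLIS_RELAX_MCNR_NCMR_CONSTRAINTS, and demo_blocksizes_ok() is a made-up runtime mirror of the compile-time checks.

```c
#include <stdbool.h>

/* Hypothetical blocksizes for illustration only (not from any real BLIS
   configuration). MC % MR == 0 and NC % NR == 0 hold, but MC % NR != 0
   and NC % MR != 0. */
#define DEMO_MR 6
#define DEMO_NR 8
#define DEMO_MC 90    /* 90 % 6 == 0, 90 % 8 == 2 */
#define DEMO_NC 4096  /* 4096 % 8 == 0, 4096 % 6 == 4 */

/* Stand-in for BLIS_RELAX_MCNR_NCMR_CONSTRAINTS. Removing this line makes
   the guarded #error directives below reject the blocksizes above. */
#define DEMO_RELAX_MCNR_NCMR_CONSTRAINTS

/* Always enforced. */
#if ( DEMO_MC % DEMO_MR != 0 )
#error "MC must be multiple of MR."
#endif
#if ( DEMO_NC % DEMO_NR != 0 )
#error "NC must be multiple of NR."
#endif

/* Enforced only when the relax macro is NOT defined. */
#ifndef DEMO_RELAX_MCNR_NCMR_CONSTRAINTS
#if ( DEMO_MC % DEMO_NR != 0 )
#error "MC must be multiple of NR."
#endif
#if ( DEMO_NC % DEMO_MR != 0 )
#error "NC must be multiple of MR."
#endif
#endif

/* Hypothetical runtime mirror of the same checks. */
bool demo_blocksizes_ok( int mc, int nc, int mr, int nr, bool relax )
{
    bool base  = ( mc % mr == 0 ) && ( nc % nr == 0 ); /* always required */
    bool cross = ( mc % nr == 0 ) && ( nc % mr == 0 ); /* trsm_r-specific */
    return base && ( relax || cross );
}
```

With the relax macro defined, the translation unit compiles even though 90 is not a multiple of 8. Without it, the preprocessor stops at the "MC must be multiple of NR." error.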
Field G. Van Zee
2016-11-01 14:35:15 -05:00
parent 8a11a2174a
commit d25e6f8b63
2 changed files with 20 additions and 4 deletions


@@ -86,6 +86,9 @@ void bli_trsm_front
 }
 #if 0
+// NOTE: Enabling this code requires that BLIS be configured with
+// BLIS_RELAX_MCNR_NCMR_CONSTRAINTS defined.
+#ifdef BLIS_RELAX_MCNR_NCMR_CONSTRAINTS
 // If A is being solved against from the right, transpose all operands
 // so that we can perform the computation as if A were being solved
@@ -98,6 +101,7 @@ void bli_trsm_front
 bli_obj_induce_trans( c_local );
 }
+#endif
 #else
 // If A is being solved against from the right, swap A and B so that


@@ -1351,11 +1351,19 @@
 // Verify that cache blocksizes are whole multiples of register blocksizes.
 // Specifically, verify that:
-// - MC is a whole multiple of MR *AND* NR.
-// - NC is a whole multiple of NR *AND* MR.
-// - KC is a whole multiple of KR *AND* both MR, NR.
+// - MC is a whole multiple of MR.
+// - NC is a whole multiple of NR.
+// - KC is a whole multiple of KR.
 // These constraints are enforced because it makes it easier to handle diagonals
-// in the macro-kernel implementations.
+// in the macro-kernel implementations. Additionally, we optionally verify that:
+// - MC is a whole multiple of NR.
+// - NC is a whole multiple of MR.
+// These latter constraints, guarded by #ifndef BLIS_RELAX_MCNR_NCMR_CONSTRAINTS
+// below, are only enforced when we wish to be able to handle the trsm right-
+// side case handling that swaps A and B, so that B is the triangular matrix,
+// with NR blocking used to pack A and MR blocking used to pack B, with the
+// arguments to the gemmtrsm microkernel swapped at the last minute, as the
+// kernel is called.
 //
 // MC must be a whole multiple of MR and NR.
@@ -1370,6 +1378,7 @@
 #error "MC must be multiple of MR for all datatypes."
 #endif
+#ifndef BLIS_RELAX_MCNR_NCMR_CONSTRAINTS
 #if ( \
 ( BLIS_DEFAULT_MC_S % BLIS_DEFAULT_NR_S != 0 ) || \
 ( BLIS_DEFAULT_MC_D % BLIS_DEFAULT_NR_D != 0 ) || \
@@ -1378,6 +1387,7 @@
 )
 #error "MC must be multiple of NR for all datatypes."
 #endif
+#endif
 //
 // NC must be a whole multiple of NR and MR.
@@ -1392,6 +1402,7 @@
 #error "NC must be multiple of NR for all datatypes."
 #endif
+#ifndef BLIS_RELAX_MCNR_NCMR_CONSTRAINTS
 #if ( \
 ( BLIS_DEFAULT_NC_S % BLIS_DEFAULT_MR_S != 0 ) || \
 ( BLIS_DEFAULT_NC_D % BLIS_DEFAULT_MR_D != 0 ) || \
@@ -1400,6 +1411,7 @@
 )
 #error "NC must be multiple of MR for all datatypes."
 #endif
+#endif
 //
 // KC must be a whole multiple of KR.