Fixed incorrect sizeof(type) in edge case macros. (#662)

Details:
- In bli_edge_case_macro_defs.h, the GEMM_UKR_SETUP_CT_PRE() and
  GEMMTRSM_UKR_SETUP_CT_PRE() macros previously declared their temporary
  ct microtiles as:

    PASTEMAC(ch,ctype)
          _ct[ BLIS_STACK_BUF_MAX_SIZE / sizeof( PASTEMAC(ch,type) ) ] \
               __attribute__((aligned(alignment))); \

  The problem here is that sizeof( PASTEMAC(ch,type) ) evaluates to
  things like sizeof( BLIS_DOUBLE ), not sizeof( double ). Since
  BLIS_DOUBLE is an enumeration constant, which has type int in C, the
  sizeof() expression yields sizeof( int ) regardless of the datatype,
  which is the wrong value. This was likely a benign bug, though, since
  BLIS does not support any computational datatypes that are smaller
  than sizeof( int ), which means the ct array would be *over*-allocated
  rather than under-allocated. Thanks to @moon-chilled for identifying
  and reporting this bug in #624.
- CREDITS file update.
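
The size mismatch described above can be reproduced with a minimal sketch.
Every name here (demo_num_t, DEMO_DOUBLE, DEMO_STACK_BUF_MAX_SIZE, and the
demo_ct_* helpers) is a hypothetical stand-in chosen for illustration, not
a BLIS definition, and the 1024-byte buffer size is an assumed value.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; NOT the BLIS definitions. */
typedef enum { DEMO_FLOAT, DEMO_DOUBLE } demo_num_t;

#define DEMO_STACK_BUF_MAX_SIZE 1024  /* assumed buffer size in bytes */

/* Element count as the old macros computed it: sizeof is applied to an
   enumeration constant, which has type int in C. */
size_t demo_ct_elems_buggy( void )
{
    return DEMO_STACK_BUF_MAX_SIZE / sizeof( DEMO_DOUBLE );
}

/* Element count as the fixed macros compute it: sizeof is applied to the
   actual C element type. */
size_t demo_ct_elems_fixed( void )
{
    return DEMO_STACK_BUF_MAX_SIZE / sizeof( double );
}

/* Bytes occupied by an array of doubles declared with the buggy count. */
size_t demo_ct_bytes_buggy( void )
{
    return demo_ct_elems_buggy() * sizeof( double );
}
```

On a platform where sizeof( int ) is smaller than sizeof( double ), the
buggy count is larger than the fixed one, so the array uses more stack
space than intended but never less.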
Field G. Van Zee
2022-09-13 11:46:24 -05:00
committed by GitHub
parent 6e5431e849
commit cb74202db3
2 changed files with 3 additions and 2 deletions


@@ -68,6 +68,7 @@ but many others have contributed code and feedback, including
Devin Matthews @devinamatthews (The University of Texas at Austin)
Stefanos Mavros @smavros
Mithun Mohan @MithunMohanKadavil (AMD)
+@moon-chilled
Ilknur Mustafazade @Runkli
@nagsingh
Bhaskar Nallani @BhaskarNallani (AMD)


@@ -47,7 +47,7 @@
PASTEMAC(ch,ctype)* restrict _c = c; \
const inc_t _rs_c = rs_c; \
const inc_t _cs_c = cs_c; \
-PASTEMAC(ch,ctype) _ct[ BLIS_STACK_BUF_MAX_SIZE / sizeof( PASTEMAC(ch,type) ) ] \
+PASTEMAC(ch,ctype) _ct[ BLIS_STACK_BUF_MAX_SIZE / sizeof( PASTEMAC(ch,ctype) ) ] \
__attribute__((aligned(alignment))); \
const inc_t _rs_ct = row_major ? nr : 1; \
const inc_t _cs_ct = row_major ? 1 : mr;
@@ -137,7 +137,7 @@
PASTEMAC(ch,ctype)* restrict _c = c11; \
const inc_t _rs_c = rs_c; \
const inc_t _cs_c = cs_c; \
-PASTEMAC(ch,ctype) _ct[ BLIS_STACK_BUF_MAX_SIZE / sizeof( PASTEMAC(ch,type) ) ] \
+PASTEMAC(ch,ctype) _ct[ BLIS_STACK_BUF_MAX_SIZE / sizeof( PASTEMAC(ch,ctype) ) ] \
__attribute__((aligned(alignment))); \
const inc_t _rs_ct = row_major ? nr : 1; \
const inc_t _cs_ct = row_major ? 1 : mr;