Files
blis/frame/base
Field G. Van Zee 11dfc176a3 Reorganized thread auto-factorization logic.
Details:
- Reorganized logic of bli_thread_partition_2x2() so that the primary
  guts were factored out into "fast" and "slow" variants. Then added
  logic to the "fast" variant that allows for more optimal thread
  factorizations in some situations where there is at least one factor
  of 2.
- Changed BLIS_THREAD_RATIO_M from 2 to 1 in bli_kernel_macro_defs.h and
  added comments to that file describing BLIS_THREAD_RATIO_? and
  BLIS_THREAD_MAX_?R.
- In bli_family_zen.h and bli_family_zen2.h, preprocessed out several
  macros not used in vanilla BLIS and removed the unused macro
  BLIS_ENABLE_ZEN_BLOCK_SIZES from the former file.
- Disabled AMD's small matrix handling entry points in bli_syrk_front.c
  and bli_trsm_front.c. (These branches of small matrix handling have
  not been reviewed by vanilla BLIS developers.)
- Added commented-out calls printf() to bli_rntm.c.
- Whitespace changes to bli_thread.c.
2020-12-01 19:51:27 +00:00
..
2019-10-11 11:53:51 -05:00
2019-10-11 11:53:51 -05:00
2019-10-11 11:53:51 -05:00
2019-10-11 11:53:51 -05:00
2019-10-11 11:53:51 -05:00