blis/kernels/zen
Dipal M Zambare 1ec020b33e AMD kernel updates; frame-specific AMD updates. (#597)
Details:
- Allow building BLIS with certain framework files (each with the '_amd'
  suffix) that have been customized by AMD for Zen-based hardware. These
  customized files were derived from portable versions of the same files
  (i.e., those without the '_amd' suffix). Whether the portable or AMD-
  specific files are compiled is now controlled by a new configure
  option, --[en|dis]able-amd-frame-tweaks. This option is disabled by
  default in vanilla BLIS, though AMD may choose to enable it by default
  in their fork. For now, the added AMD-specific files are:
  - bli_gemv_unf_var2_amd.c
  - bla_copy_amd.c
  - bla_gemv_amd.c
  These files reside in 'amd' subdirectories found within the directory
  housing their generic counterparts.
- Register optimized real-domain copyv, setv, and swapv kernels in
  bli_cntx_init_zen.c.
- Various minor updates to level-1v kernels in the 'zen' kernel set.
- Added caxpyf kernel as well as saxpyf and multiple daxpyf kernels to
  the 'zen' kernel set.
- If the problem passed to ?gemm_() in bla_gemm.c has a unit m or n dim,
  call gemv instead and return early.
- Combined variable declarations with their initialization in various
  level-2 and level-3 BLAS compatibility files, and also inserted
  'const' qualifier in those same declaration statements.
- Moved frame/compat/bla_gemmt.c and .h to frame/compat/extra/ .
- Added copyv and swapv test drivers to 'test' directory.
- Whitespace, comment changes.
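
The ?gemm_() early return noted above can be illustrated with a minimal,
self-contained sketch. This is not the code in bla_gemm.c: the names
ref_dgemv and sketch_dgemm are hypothetical, only the double-precision,
column-major, no-transpose case is shown, and the BLAS convention that
beta == 0 overwrites C without reading it is not handled.

```c
#include <stddef.h>

/* Hypothetical reference gemv (not a BLIS function):
   y := alpha * op(A) * x + beta * y, with A stored column-major as
   rows x cols. trans != 0 selects op(A) = A^T. */
static void ref_dgemv(int trans, int rows, int cols, double alpha,
                      const double *a, int lda,
                      const double *x, int incx,
                      double beta, double *y, int incy)
{
    const int leny = trans ? cols : rows;
    const int lenx = trans ? rows : cols;

    for (int i = 0; i < leny; ++i) {
        double acc = 0.0;
        for (int j = 0; j < lenx; ++j) {
            /* Element (i, j) of op(A). */
            const double aij = trans ? a[j + (size_t)i * lda]
                                     : a[i + (size_t)j * lda];
            acc += aij * x[(size_t)j * incx];
        }
        y[(size_t)i * incy] = alpha * acc + beta * y[(size_t)i * incy];
    }
}

/* Hypothetical gemm sketch: C := alpha * A * B + beta * C, where A is
   m x k, B is k x n, and C is m x n, all column-major. */
static void sketch_dgemm(int m, int n, int k, double alpha,
                         const double *a, int lda,
                         const double *b, int ldb,
                         double beta, double *c, int ldc)
{
    if (n == 1) {
        /* B and C are single columns: C = alpha * A * b + beta * C. */
        ref_dgemv(0, m, k, alpha, a, lda, b, 1, beta, c, 1);
        return;
    }
    if (m == 1) {
        /* A and C are single rows: C^T = alpha * B^T * A^T + beta * C^T.
           The row vectors are strided by their leading dimensions. */
        ref_dgemv(1, k, n, alpha, b, ldb, a, lda, beta, c, ldc);
        return;
    }

    /* General case: plain triple loop standing in for the real gemm. */
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            double acc = 0.0;
            for (int p = 0; p < k; ++p)
                acc += a[i + (size_t)p * lda] * b[p + (size_t)j * ldb];
            c[i + (size_t)j * ldc] = alpha * acc + beta * c[i + (size_t)j * ldc];
        }
    }
}
```

When m == 1, the single rows of A and C are not contiguous in
column-major storage, so the sketch passes lda and ldc as the vector
strides instead of copying the rows.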
2022-03-29 16:15:36 -05:00