Developed AVX512-based "sup" kernels for SGEMM

- var2m kernels are added for the RRR case only
- The main kernel is 12x32 (AVX512); the associated fringe kernels are:
	- 8x32, 4x32, 2x32, 1x32 (AVX512)
	- 12x16, 8x16, 4x16, 2x16, 1x16 (AVX512)
	- 12x8, 8x8 (AVX2)
	- 12x4, 8x4 (SSE4)
	- 12x2, 8x2 (SSE4)
	- existing AVX2/SSE4 kernels are reused for all other fringe
	  cases
- These kernels are not yet invoked in the zen4 path
- Once all AVX512 kernels (n and rd) are done, they will all be
  enabled together in the zen4 config
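
The main/fringe tiling listed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the dispatch logic of the actual kernels: it assumes a greedy split of each dimension into the listed tile sizes, and the names `split_dim`, `print_tiling`, `M_TILES` and `N_TILES` are hypothetical. The width-1 column tile is also an assumption, added so that every n is covered; the commit message does not list it.

```c
#include <assert.h>
#include <stdio.h>

/* Tile heights and widths taken from the commit message. The width-1
   column tile is an assumption and is not listed there. */
static const int M_TILES[] = { 12, 8, 4, 2, 1 };
static const int N_TILES[] = { 32, 16, 8, 4, 2, 1 };

/* Greedily split `dim` into tiles: the largest tile repeats while it
   fits, then each smaller tile is tried in turn. Writes the chosen tile
   sizes to `out` and returns how many were written (at most `cap`). */
int split_dim( int dim, const int *tiles, int ntiles, int *out, int cap )
{
    assert( dim >= 0 );
    int count = 0;
    for ( int t = 0; t < ntiles; ++t )
    {
        while ( dim >= tiles[t] && count < cap )
        {
            out[count++] = tiles[t];
            dim -= tiles[t];
        }
    }
    return count;
}

/* Print one MRxNR line per tile of an m x n output matrix. The buffers
   are sized for small demonstration inputs only. */
void print_tiling( int m, int n )
{
    int mt[64], nt[64];
    int nm = split_dim( m, M_TILES, 5, mt, 64 );
    int nn = split_dim( n, N_TILES, 6, nt, 64 );

    for ( int j = 0; j < nn; ++j )      /* column panels: main width, then fringe */
        for ( int i = 0; i < nm; ++i )  /* row blocks inside each panel */
            printf( "%dx%d\n", mt[i], nt[j] );
}
```

Under these assumptions, `print_tiling( 23, 50 )` pairs row blocks of 12, 8, 2 and 1 with column panels of 32, 16 and 2. The real kernels may combine fringe sizes differently.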

AMD-Internal: [CPUPL-2801]

Change-Id: I7a206fee9151e92319d83dcc5f3eed61d3bf1196
Nithya V S
2022-11-16 16:54:58 +05:30
parent 13aa3c8cd0
commit 6e9defe1b5
2 changed files with 4038 additions and 1 deletion

File diff suppressed because it is too large

@@ -39,4 +39,22 @@ AMAXV_KER_PROT( float, s, amaxv_zen_int_avx512 )
AMAXV_KER_PROT( double, d, amaxv_zen_int_avx512 )
GEMMTRSM_UKR_PROT( double, d, gemmtrsm_l_zen_asm_16x14)
GEMMTRSM_UKR_PROT( double, d, gemmtrsm_u_zen_asm_16x14)
// SGEMM sup kernels (rv): 12x32m main kernel and its fringe kernels
GEMMSUP_KER_PROT( float, s, gemmsup_rv_zen_asm_12x32m )
GEMMSUP_KER_PROT( float, s, gemmsup_rv_zen_asm_8x16m )
GEMMSUP_KER_PROT( float, s, gemmsup_rv_zen_asm_4x16m )
GEMMSUP_KER_PROT( float, s, gemmsup_rv_zen_asm_2x16m )
GEMMSUP_KER_PROT( float, s, gemmsup_rv_zen_asm_1x16m )
GEMMSUP_KER_PROT( float, s, gemmsup_rv_zen_asm_8x8m )
GEMMSUP_KER_PROT( float, s, gemmsup_rv_zen_asm_8x4m )
GEMMSUP_KER_PROT( float, s, gemmsup_rv_zen_asm_8x2m )
GEMMSUP_KER_PROT( float, s, gemmsup_rv_zen_asm_12x16m )
GEMMSUP_KER_PROT( float, s, gemmsup_rv_zen_asm_12x8m )
GEMMSUP_KER_PROT( float, s, gemmsup_rv_zen_asm_12x4m )
GEMMSUP_KER_PROT( float, s, gemmsup_rv_zen_asm_12x2m )
GEMMSUP_KER_PROT( float, s, gemmsup_rv_zen_asm_8x32m )
GEMMSUP_KER_PROT( float, s, gemmsup_rv_zen_asm_4x32m )
GEMMSUP_KER_PROT( float, s, gemmsup_rv_zen_asm_2x32m )
GEMMSUP_KER_PROT( float, s, gemmsup_rv_zen_asm_1x32m )