mirror of
https://github.com/amd/blis.git
synced 2026-04-20 15:48:50 +00:00
Added new ZTRSM small code path for ZEN5
- Added new ZTRSM kernels for right and left variants.
- Kernel dimensions are 12x4.
- 12x4 ZGEMM SUP kernels are used internally for solving the GEMM subproblem.
- These kernels do not support conjugate transpose.
- Only column-major inputs are supported.
- Tuned thresholds to pick the efficient code path for ZEN5.

AMD-Internal: [CPUPL-6356]
Change-Id: I33ba3d337b0fcd972ca9cfe4668cb23d2b279b6e
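The commit message describes a TRSM that solves its off-diagonal work as a GEMM subproblem and then does a small triangular solve. The following is only a minimal scalar sketch of that general blocking idea for the left, lower, no-transpose, column-major case. It is not the code from this commit: the `zval` type, the `ztrsm_llnn_sketch` name, and the `blk` parameter are hypothetical, and plain loops stand in for the 12x4 ZGEMM SUP kernels.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical complex type for this sketch (not BLIS's own type). */
typedef struct { double re, im; } zval;

/* Returns c - a*b. */
static zval z_sub_mul(zval c, zval a, zval b)
{
    zval r = { c.re - (a.re * b.re - a.im * b.im),
               c.im - (a.re * b.im + a.im * b.re) };
    return r;
}

/* Returns a / b using the textbook formula (no overflow scaling). */
static zval z_div(zval a, zval b)
{
    double d = b.re * b.re + b.im * b.im;
    zval r = { (a.re * b.re + a.im * b.im) / d,
               (a.im * b.re - a.re * b.im) / d };
    return r;
}

/* Column-major element access: row i, column j, leading dimension ld. */
#define AT(p, ld, i, j) ((p)[(size_t)(j) * (ld) + (i)])

/* Solve L * X = B in place (B is overwritten by X).
   L is m x m lower triangular with a non-unit diagonal, not transposed.
   B is m x n. blk is an assumed row-block size for illustration. */
void ztrsm_llnn_sketch(int m, int n, int blk,
                       const zval *L, int ldl, zval *B, int ldb)
{
    assert(blk > 0 && ldl >= m && ldb >= m);
    for (int i0 = 0; i0 < m; i0 += blk) {
        int ib = (m - i0 < blk) ? m - i0 : blk;

        /* GEMM subproblem: B[i0:i0+ib, :] -= L[i0:i0+ib, 0:i0] * X[0:i0, :] */
        for (int j = 0; j < n; j++)
            for (int k = 0; k < i0; k++)
                for (int i = i0; i < i0 + ib; i++)
                    AT(B, ldb, i, j) = z_sub_mul(AT(B, ldb, i, j),
                                                 AT(L, ldl, i, k),
                                                 AT(B, ldb, k, j));

        /* Small triangular solve on the ib x ib diagonal block. */
        for (int j = 0; j < n; j++)
            for (int i = i0; i < i0 + ib; i++) {
                zval x = AT(B, ldb, i, j);
                for (int k = i0; k < i; k++)
                    x = z_sub_mul(x, AT(L, ldl, i, k), AT(B, ldb, k, j));
                AT(B, ldb, i, j) = z_div(x, AT(L, ldl, i, i));
            }
    }
}
```

In this sketch the rows already solved feed the GEMM update of the next row block, so most of the arithmetic lands in the matrix-multiply step; that is the part an optimized implementation would hand to a tuned GEMM kernel.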
1646
kernels/zen5/3/bli_dtrsm_small_zen5.c
Normal file
File diff suppressed because it is too large
1629
kernels/zen5/3/bli_ztrsm_small_zen5.c
Normal file
File diff suppressed because it is too large
@@ -75,6 +75,11 @@ TRSMSMALL_KER_PROT( d, trsm_small_XAutB_XAlB_ZEN5 )
TRSMSMALL_KER_PROT( d, trsm_small_AltXB_AuXB_ZEN5 )
TRSMSMALL_KER_PROT( d, trsm_small_AutXB_AlXB_ZEN5 )

TRSMSMALL_KER_PROT( z, trsm_small_XAltB_XAuB_ZEN5 )
TRSMSMALL_KER_PROT( z, trsm_small_XAutB_XAlB_ZEN5 )
TRSMSMALL_KER_PROT( z, trsm_small_AltXB_AuXB_ZEN5 )
TRSMSMALL_KER_PROT( z, trsm_small_AutXB_AlXB_ZEN5 )

#ifdef BLIS_ENABLE_OPENMP
err_t bli_trsm_small_mt_ZEN5
(