mirror of
https://github.com/amd/blis.git
synced 2026-05-04 14:31:12 +00:00
Developed intrinsic-based f32 kernels in lpgemm
Description:
1. Developed row-variant intrinsic kernels for float32/sgemm, which are called from the lpgemm API aocl_gemm_f32f32f32of32().
2. The 6x64m, 6x48m, and 6x32m kernels and their respective fringe kernels are developed using AVX-512.
3. The 6x16m main kernel and its respective n-fringe and mn-fringe kernels are developed using AVX2 and AVX.
4. Modularization, K-loop unrolling, performance tuning, post-ops, and dynamic dispatch are planned next.
5. bench_lpgemm needs to be updated to test cases where the leading dimensions are greater than the matrix dimensions; this is also planned next.
Change-Id: I54c78fef639ea109d6ef2c2b05c07ce396c81370
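The tiling and leading-dimension points in the description can be illustrated with a minimal scalar sketch in plain C. It is not the AVX-512/AVX2 code from this commit: the function names, the `MR`/`NR` constants, and the loop order are assumptions chosen only to mirror the 6x16 tile shape and the main/fringe split. It walks a row-major C in 6x16 tiles, lets the last tile in each direction shrink to a fringe, and uses row strides (`lda`, `ldb`, `ldc`) that may be larger than the logical matrix widths.

```c
#include <stddef.h>

/* Hypothetical tile sizes mirroring the 6x16 shape named in the commit. */
enum { MR = 6, NR = 16 };

/* Scalar stand-in for one micro-tile: C[mr x nr] = alpha*A*B + beta*C.
 * Storage is row-major; lda/ldb/ldc are row strides and may exceed the
 * logical widths (k, n, n), which is the "leading dims greater than dims"
 * case mentioned in the description. */
static void tile_f32_sketch(size_t mr, size_t nr, size_t k,
                            float alpha, const float *a, size_t lda,
                            const float *b, size_t ldb,
                            float beta, float *c, size_t ldc)
{
    for (size_t i = 0; i < mr; ++i) {
        for (size_t j = 0; j < nr; ++j) {
            float acc = 0.0f;
            for (size_t p = 0; p < k; ++p)
                acc += a[i * lda + p] * b[p * ldb + j];
            c[i * ldc + j] = alpha * acc + beta * c[i * ldc + j];
        }
    }
}

/* Walk C in MR x NR tiles; tiles at the right and bottom edges shrink,
 * which plays the role of the n-fringe and mn-fringe kernels. */
void sgemm_tiled_sketch(size_t m, size_t n, size_t k,
                        float alpha, const float *a, size_t lda,
                        const float *b, size_t ldb,
                        float beta, float *c, size_t ldc)
{
    for (size_t j = 0; j < n; j += NR) {
        size_t nr = (n - j < NR) ? (n - j) : NR;   /* n fringe */
        for (size_t i = 0; i < m; i += MR) {
            size_t mr = (m - i < MR) ? (m - i) : MR; /* m fringe */
            tile_f32_sketch(mr, nr, k, alpha, a + i * lda, lda,
                            b + j, ldb, beta, c + i * ldc + j, ldc);
        }
    }
}
```

A vectorized kernel would replace the inner scalar loops with register-blocked FMA operations, but the stride arithmetic and the fringe handling would follow the same pattern.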
Committed by: Nallani Bhaskar
Parent: 46965dfc57
Commit: 91a9968a5e
@@ -56,8 +56,8 @@ void lpgemm_rowvar_ ## LP_SFX \
 	C_type* c, \
 	const dim_t rs_c, \
 	const dim_t cs_c, \
-	C_type alpha, \
-	C_type beta, \
+	const C_type alpha, \
+	const C_type beta, \
 	rntm_t* rntm, \
 	lpgemm_thrinfo_t* thread, \
 	lpgemm_post_op* post_op_list, \
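The hunk above adds `const` to the by-value `alpha` and `beta` parameters. As a minimal sketch of what that means in C (`scale_sketch` is a hypothetical function, not part of lpgemm): a top-level `const` on a by-value parameter does not change the function's type as seen by callers, so a declaration without it stays compatible with a definition that has it. It only stops the function body from reassigning the parameter.

```c
/* Declaration without const, as a caller might see it in a header. */
float scale_sketch(float alpha, float x);

/* Definition with const on the by-value parameters. This is compatible
 * with the declaration above because top-level qualifiers on parameters
 * are ignored when function types are compared. */
float scale_sketch(const float alpha, const float x)
{
    /* alpha = 1.0f;   would be rejected by the compiler: alpha is const */
    return alpha * x;
}
```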