Developed intrinsic-based f32 kernels in lpgemm

Description:

1. Developed row-variant intrinsic kernels for float32 (sgemm),
   which are called from the lpgemm API aocl_gemm_f32f32f32of32().

2. The 6x64m, 6x48m and 6x32m kernels and their respective fringe
   kernels are developed using AVX-512.

3. The 6x16m main kernel and its respective n fringe and mn fringe
   kernels are developed using AVX2 and AVX.

4. Modularization, K-loop unrolling, performance tuning, post-ops and
   dynamic dispatch are planned next.

5. When the leading dimensions are greater than the matrix dimensions,
   bench_lpgemm needs to be updated to test that case; this is
   planned next.
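
For illustration only, the main/fringe split and the leading-dimension case from the list above can be sketched in portable C. This is a scalar stand-in, not the intrinsic kernels added by this commit: the names sgemm_rowmajor_tiled and tile_kernel are hypothetical, the 6x16 tile size is borrowed from item 3, and row-major storage with lda, ldb and ldc counted in elements is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Tile sizes borrowed from the 6x16 kernel named in the commit message. */
enum { MR = 6, NR = 16 };

/* Hypothetical scalar stand-in for one micro-kernel call.
 * Updates an mr x nr tile: C = alpha * (A * B) + beta * C.
 * A full tile (mr == MR, nr == NR) corresponds to the main kernel;
 * smaller mr and/or nr correspond to the m, n and mn fringe cases. */
static void tile_kernel(size_t mr, size_t nr, size_t k,
                        const float *a, size_t lda,
                        const float *b, size_t ldb,
                        float *c, size_t ldc,
                        const float alpha, const float beta)
{
    for (size_t i = 0; i < mr; ++i) {
        for (size_t j = 0; j < nr; ++j) {
            float acc = 0.0f;
            for (size_t p = 0; p < k; ++p)
                acc += a[i * lda + p] * b[p * ldb + j];
            c[i * ldc + j] = alpha * acc + beta * c[i * ldc + j];
        }
    }
}

/* Hypothetical driver: walks C in NR-wide column panels and MR-tall
 * row blocks, shrinking the last block in each direction (the fringe).
 * Leading dimensions may exceed the logical row lengths, so every
 * row step uses lda/ldb/ldc rather than k or n. */
void sgemm_rowmajor_tiled(size_t m, size_t n, size_t k,
                          const float alpha,
                          const float *a, size_t lda,
                          const float *b, size_t ldb,
                          const float beta,
                          float *c, size_t ldc)
{
    assert(lda >= k && ldb >= n && ldc >= n);

    for (size_t j = 0; j < n; j += NR) {
        const size_t nr = (n - j < NR) ? (n - j) : NR;   /* n fringe */
        for (size_t i = 0; i < m; i += MR) {
            const size_t mr = (m - i < MR) ? (m - i) : MR; /* m fringe */
            tile_kernel(mr, nr, k,
                        a + i * lda, lda,
                        b + j, ldb,
                        c + i * ldc + j, ldc,
                        alpha, beta);
        }
    }
}
```

Calling it with m = 7, n = 17 and ldc = 19 exercises one full 6x16 tile, all three fringe shapes, and two padding columns per row of C that must stay untouched, which is the situation item 5 says the bench does not yet cover.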

Change-Id: I54c78fef639ea109d6ef2c2b05c07ce396c81370
bhaskarn
2023-02-08 17:52:52 +05:30
committed by Nallani Bhaskar
parent 46965dfc57
commit 91a9968a5e
10 changed files with 5593 additions and 327 deletions


@@ -56,8 +56,8 @@ void lpgemm_rowvar_ ## LP_SFX \
         C_type*           c, \
         const dim_t       rs_c, \
         const dim_t       cs_c, \
-        C_type            alpha, \
-        C_type            beta, \
+        const C_type      alpha, \
+        const C_type      beta, \
         rntm_t*           rntm, \
         lpgemm_thrinfo_t* thread, \
         lpgemm_post_op*   post_op_list, \