blis/bench at d2713d3dc005867043683a6ebad37dfd93581f9d - blis - Public git mirror

amd/blis

mirror of https://github.com/amd/blis.git synced 2026-05-11 17:50:00 +00:00

eashdash 672544bc04 GeLU Activation Function Post-Op for LPGEMM S16, S32 and BF16

1. Added a tanh-approximation-based GeLU post-op for S16, S32 and BF16.
2. Made changes at the frame and micro-kernel levels to
   implement this post-op.
3. Implemented efficient AVX-512 and AVX2 vector versions of the TANHF
   and EXPF functions for the GeLU post-operation.
4. Implemented the TANHF and EXPF math functions as macros to
   enable register-level fusion of GeLU with the GEMM operations
   for improved performance.
5. Changed the LPGEMM bench to accept the GeLU post-op as input and
   to support an accuracy check that verifies functional correctness.

AMD-Internal: [CPUPL-2978]
Change-Id: I472ac35c00a4ea1ab983cc5f6ff6a123c8035f28

2023-02-02 08:25:04 -05:00

bench_aocl_gemm

GeLU Activation Function Post-Op for LPGEMM S16, S32 and BF16

2023-02-02 08:25:04 -05:00

bench_amaxv.c

AOCL-WINDOWS: Added the windows build system to build bench folder on windows.

2022-06-27 22:32:39 -04:00

bench_axpbyv.c

AOCL-WINDOWS: Added the windows build system to build bench folder on windows.

2022-06-27 22:32:39 -04:00

bench_copyv.c

AOCL-WINDOWS: Added the windows build system to build bench folder on windows.

2022-06-27 22:32:39 -04:00

bench_dotv.c

AOCL-WINDOWS: Added the windows build system to build bench folder on windows.

2022-06-27 22:32:39 -04:00

bench_gemm.c

Integrated 32x6 DGEMM kernel for zen4 along with related changes.

2023-01-19 23:11:36 +05:30

bench_gemmt.c

AOCL-WINDOWS: Added the windows build system to build bench folder on windows.

2022-06-27 22:32:39 -04:00

bench_gemv.c

AOCL-WINDOWS: Added the windows build system to build bench folder on windows.

2022-06-27 22:32:39 -04:00

bench_ger.c

AOCL-WINDOWS: Added the windows build system to build bench folder on windows.

2022-06-27 22:32:39 -04:00

bench_nrm2.c

Adding AVX2 support for DNRM2

2022-09-20 06:05:01 -04:00

bench_scalv.c

AOCL-WINDOWS: Added the windows build system to build bench folder on windows.

2022-06-27 22:32:39 -04:00

bench_swapv.c

AOCL-WINDOWS: Added the windows build system to build bench folder on windows.

2022-06-27 22:32:39 -04:00

bench_syrk.c

AOCL-WINDOWS: Added the windows build system to build bench folder on windows.

2022-06-27 22:32:39 -04:00

bench_trsm.c

Fixed Bug in bench_trsm.c

2022-07-25 15:38:30 +00:00

bench_trsv.c

AOCL-WINDOWS: Added the windows build system to build bench folder on windows.

2022-06-27 22:32:39 -04:00

CMakeLists.txt

AOCL-WINDOWS: Added the windows build system to build bench folder on windows.

2022-06-27 22:32:39 -04:00

inputamaxv.txt

Bench addition for amaxv API

2021-06-04 17:45:04 +05:30

inputaxpbyv.txt

Optimized AXPBYV Kernel using AVX2 Intrinsics

2022-01-05 04:19:11 -05:00

inputcopy.txt

Added bench utility for copyv API

2021-06-09 12:29:49 +05:30

inputdotv.txt

Added bench utility for dotv and scalv APIs.

2021-05-21 10:00:32 +05:30

inputgemm.txt

AOCL DTL - Added thread and execution time details in logs

2021-11-12 08:58:54 +05:30

inputgemmt.txt

Added bench app for syrk - input is a log file generated from AOCL_DTL

2021-05-11 14:57:51 +05:30

inputgemv.txt

Fixed crash issue in bench utility for gemv API

2021-05-19 14:21:09 +05:30

inputger.txt

Added bench utility for ger API.

2021-05-19 14:05:01 +05:30

inputnrm2.txt

Adding AVX2 support for DNRM2

2022-09-20 06:05:01 -04:00

inputscalv.txt

Added bench utility for dotv and scalv APIs.

2021-05-21 10:00:32 +05:30

inputswap.txt

Added bench utility for swapv API

2021-06-09 17:05:00 +05:30

inputsyrk.txt

Added bench app for syrk - input is a log file generated from AOCL_DTL

2021-05-11 14:57:51 +05:30

inputtrsm.txt

Trsm bench utility: mismatch between DTL logs and bench

2021-11-12 08:58:52 +05:30

inputtrsv.txt

Bench trsv logging error

2021-06-08 11:54:55 +05:30

Makefile

Adding AVX2 support for DNRM2

2022-09-20 06:05:01 -04:00