blis/bench at b9f6286731ec4c3acc90ad25c4b0f7fe056e31a9 - blis - Public git mirror

amd/blis

mirror of https://github.com/amd/blis.git synced 2026-04-20 07:38:53 +00:00


Mithun Mohan b9f6286731 Tiny GEMM path for BF16 LPGEMM API.

-Currently the BF16 API uses the 5 loop algorithm inside the OMP loop
to compute the results, irrespective of the input sizes. However, it
was observed that for very tiny sizes (n <= 128, m <= 36), this OMP
loop and the NC, MC, KC loops add overhead.
-To address this, a new path without the OMP loop, using just the
NR loop over the micro-kernel, is introduced for tiny inputs. It is
only applied when the number of threads set for GEMM is 1.
-Only row major inputs are allowed to proceed with tiny GEMM.
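
The dispatch rule described above can be illustrated with a minimal sketch. This is not the code from the commit: the names `use_tiny_path`, `tiny_gemm`, `micro_kernel`, `TINY_M_MAX`, `TINY_N_MAX` and the value of `NR` are hypothetical, and plain `float` stands in for BF16 to keep the example self-contained. Only the thresholds (n <= 128, m <= 36), the single-thread condition and the row-major condition come from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Thresholds from the commit message; the identifiers are hypothetical. */
static const size_t TINY_M_MAX = 36;
static const size_t TINY_N_MAX = 128;
/* Assumed register-block width, chosen only for illustration. */
static const size_t NR = 16;

/* True when every condition listed in the commit message holds. */
static bool use_tiny_path(size_t m, size_t n, int num_threads, bool row_major)
{
    return row_major && num_threads == 1 &&
           m <= TINY_M_MAX && n <= TINY_N_MAX;
}

/* Stand-in micro-kernel: C[m x nr] += A[m x k] * B[k x nr], row-major. */
static void micro_kernel(size_t m, size_t nr, size_t k,
                         const float *a, size_t lda,
                         const float *b, size_t ldb,
                         float *c, size_t ldc)
{
    for (size_t i = 0; i < m; ++i)
        for (size_t p = 0; p < k; ++p)
            for (size_t j = 0; j < nr; ++j)
                c[i * ldc + j] += a[i * lda + p] * b[p * ldb + j];
}

/* Tiny path: one loop over N in steps of NR, with no threading and
 * no NC/MC/KC cache blocking around the micro-kernel. */
static void tiny_gemm(size_t m, size_t n, size_t k,
                      const float *a, size_t lda,
                      const float *b, size_t ldb,
                      float *c, size_t ldc)
{
    for (size_t jr = 0; jr < n; jr += NR) {
        size_t nr = (n - jr < NR) ? (n - jr) : NR; /* edge tile */
        micro_kernel(m, nr, k, a, lda, b + jr, ldb, c + jr, ldc);
    }
}
```

A caller would check `use_tiny_path` first and fall back to the regular blocked, threaded path whenever it returns false. For inputs this small, the loop bookkeeping and thread setup of the full algorithm can cost more than the arithmetic itself, which is the overhead the commit message refers to.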

AMD-Internal: [SWLCSG-3380, SWLCSG-3258]

Change-Id: I9dfa6b130f3c597ca7fcf5f1bc1231faf39de031

2025-02-07 04:37:11 -05:00

bench_aocl_gemm

Tiny GEMM path for BF16 LPGEMM API.

2025-02-07 04:37:11 -05:00

bench_amaxv.c

Update to AOCL-BLAS bench application for logging outputs

2025-02-06 22:59:59 +05:30

bench_asumv.c

Update to AOCL-BLAS bench application for logging outputs

2025-02-06 22:59:59 +05:30

bench_axpbyv.c

Update to AOCL-BLAS bench application for logging outputs

2025-02-06 22:59:59 +05:30

bench_axpyv.c

Update to AOCL-BLAS bench application for logging outputs

2025-02-06 22:59:59 +05:30

bench_copyv.c

Update to AOCL-BLAS bench application for logging outputs

2025-02-06 22:59:59 +05:30

bench_dotv.c

Update to AOCL-BLAS bench application for logging outputs

2025-02-06 22:59:59 +05:30

bench_gemm_pack_compute.c

Update to AOCL-BLAS bench application for logging outputs

2025-02-06 22:59:59 +05:30

bench_gemm.c

Update to AOCL-BLAS bench application for logging outputs

2025-02-06 22:59:59 +05:30

bench_gemmt.c

Update to AOCL-BLAS bench application for logging outputs

2025-02-06 22:59:59 +05:30

bench_gemv.c

Update to AOCL-BLAS bench application for logging outputs

2025-02-06 22:59:59 +05:30

bench_ger.c

Update to AOCL-BLAS bench application for logging outputs

2025-02-06 22:59:59 +05:30

bench_nrm2.c

Update to AOCL-BLAS bench application for logging outputs

2025-02-06 22:59:59 +05:30

bench_scalv.c

Update to AOCL-BLAS bench application for logging outputs

2025-02-06 22:59:59 +05:30

bench_swapv.c

Update to AOCL-BLAS bench application for logging outputs

2025-02-06 22:59:59 +05:30

bench_syrk.c

Update to AOCL-BLAS bench application for logging outputs

2025-02-06 22:59:59 +05:30

bench_trsm.c

Bugfix for AOCL-BLAS bench application

2025-01-29 03:25:57 -05:00

bench_trsv.c

Update to AOCL-BLAS bench application for logging outputs

2025-02-06 22:59:59 +05:30

blis_int_type.h

Update to AOCL-BLAS bench application for logging outputs

2025-02-06 22:59:59 +05:30

CMakeLists.txt

Added logic to use right format specifier to read integer value.

2024-09-17 04:48:59 -04:00

inputamaxv.txt

Bench addition for amaxv API

2021-06-04 17:45:04 +05:30

inputasumv.txt

Added support to benchmark ASUMV APIs

2025-01-31 06:04:16 -05:00

inputaxpbyv.txt

Optimized AXPBYV Kernel using AVX2 Intrinsics

2022-01-05 04:19:11 -05:00

inputaxpyv.txt

Added support to benchmark AXPYV APIs

2024-04-08 00:06:54 -04:00

inputcopy.txt

Added bench utility for copyv API

2021-06-09 12:29:49 +05:30

inputdotv.txt

Code cleanup: file formats and permissions

2024-08-05 11:52:33 -04:00

inputgemm.txt

Additional bug-fix for AOCL-BLAS bench

2025-01-30 08:28:14 -05:00

inputgemmpackcompute.txt

Code cleanup: No newline at end of file

2023-11-22 17:11:10 -05:00

inputgemmt.txt

Added bench app for syrk - input is a log file generated from AOCL_DTL

2021-05-11 14:57:51 +05:30

inputgemv.txt

Fixed crash issue in bench utility for gemv API

2021-05-19 14:21:09 +05:30

inputger.txt

Added bench utility for ger API.

2021-05-19 14:05:01 +05:30

inputnrm2.txt

Code cleanup: No newline at end of file

2023-04-21 10:02:48 -04:00

inputscalv.txt

Added support to benchmark AXPYV APIs

2024-04-08 00:06:54 -04:00

inputswap.txt

Added bench utility for swapv API

2021-06-09 17:05:00 +05:30

inputsyrk.txt

Added bench app for syrk - input is a log file generated from AOCL_DTL

2021-05-11 14:57:51 +05:30

inputtrsm.txt

Trsm bench utility missmatch DTL logs and bench

2021-11-12 08:58:52 +05:30

inputtrsv.txt

Bench trsv logging error

2021-06-08 11:54:55 +05:30

Makefile

Added support to benchmark ASUMV APIs

2025-01-31 06:04:16 -05:00