blis/kernels at 5120f98e12465b5795bf70e2299325a9e109f8da - blis - Public git mirror

amd/blis

mirror of https://github.com/amd/blis.git synced 2026-05-24 02:14:33 +00:00

Meghana Vankadari 5120f98e12 Developed all WoQ kernels for bf16s4f32o<f32|bf16>

Description:

1. Wrote the 6x64 main kernel and the fringe kernels for WoQ, in which the
   s4 weights are scaled into bf16 inside the kernel itself to reduce bandwidth.

2. These kernels perform better than the bf16-weight kernels when m
   is small and n is large.

3. Established a threshold that decides whether quantization is handled at
   the packing of B (KCXNC) level or at the WoQ kernel level.

Change-Id: I4f8265b8b58c276ff2590cc948d1f920aa0bb289
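
As a rough illustration of the first point above, the sketch below unpacks signed 4-bit weights and scales them to floating point at the point of use, so that only the packed 4-bit data has to be read from memory. It is not taken from the BLIS sources: the function names, the two-weights-per-byte layout with the low nibble first, the single per-call scale factor, and the use of f32 instead of bf16 are all assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Sign-extend a 4-bit two's-complement value held in the low nibble.
   Hypothetical helper, not a BLIS function. */
static inline int8_t s4_to_s8(uint8_t nibble)
{
    return (int8_t)(((nibble & 0x0F) ^ 0x08) - 8);
}

/* Dequantize n signed 4-bit weights into f32.
   Assumed layout: two weights per byte, low nibble first.
   Assumed scaling: one scale factor for the whole span. */
static void dequant_s4_to_f32(const uint8_t *packed, size_t n,
                              float scale, float *out)
{
    for (size_t i = 0; i < n; ++i) {
        uint8_t byte = packed[i / 2];
        uint8_t nib  = (i % 2 == 0) ? (uint8_t)(byte & 0x0F)
                                    : (uint8_t)(byte >> 4);
        out[i] = scale * (float)s4_to_s8(nib);
    }
}
```

A real kernel would do this with vector instructions on whole registers of weights and feed the result directly into the multiply-accumulate; the scalar loop only shows the arithmetic.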

2024-09-10 12:00:10 +00:00

Merge commit 'cfa3db3f' into amd-main

2024-07-08 06:09:11 -04:00

Add explicit handling for beta == 0 in armsve sd and armv7a d gemm ukrs.

2021-09-29 16:43:38 -05:00

Armv8 Trash New Bulk Kernels

2021-10-08 02:35:58 +09:00

Replaced use of bool_t type with C99 bool.

2020-08-03 11:27:13 +05:30

Code cleanup: Copyright notices

2024-08-05 15:35:08 -04:00

Added a dummy file to kernels/generic.

2017-11-21 12:34:20 -06:00

Code cleanup: Copyright notices

2024-08-05 15:35:08 -04:00

Remove UT-Austin from copyright headers' clause 3.

2018-12-04 14:31:06 -06:00

Code cleanup: Copyright notices

2024-08-05 15:35:08 -04:00

Remove UT-Austin from copyright headers' clause 3.

2018-12-04 14:31:06 -06:00

Optionally disable trsm diagonal pre-inversion.

2020-12-04 16:08:15 -06:00

Code cleanup: Miscellaneous fixes

2024-08-06 06:56:01 -04:00

Remove UT-Austin from copyright headers' clause 3.

2018-12-04 14:31:06 -06:00

Merge commit 'e366665c' into amd-main

2023-10-18 09:09:54 -04:00

Code cleanup: No newline at end of file

2023-04-21 10:02:48 -04:00

BLIS: Missing clobbers (batch 7)

2023-11-22 17:51:46 -05:00

Code cleanup: Copyright notices

2024-08-05 15:35:08 -04:00

CMake: Enabled ADDON(aocl_gemm) feature for Windows.

2024-08-23 07:05:28 -04:00

Code cleanup: AMD copyright notice

2023-11-23 08:54:31 -05:00

Added support for zen3 configuration

2020-07-22 18:24:26 +05:30

Developed all WoQ kernels for bf16s4f32o<f32|bf16>

2024-09-10 12:00:10 +00:00

Code cleanup: spelling corrections

2024-08-05 16:18:51 -04:00

CMakeLists.txt

Code cleanup: Copyright notices

2024-08-05 15:35:08 -04:00