blis/config/zen5 at 45d82a1ebfb4ee2fcd7aa4e823708fc72052de6f - blis

amd/blis

mirror of https://github.com/amd/blis.git synced 2026-05-24 18:34:40 +00:00

Files

Shubham Sharma. f378fc57b5 DGEMM Native AVX512 updates

- In the initial patch, when m and n were not multiples of MR and NR
  respectively, we called bli_dgemm_ker_var2. The macro-kernel now
  handles these fringe cases as well.
- Replaced the RBP register with R11 in the macro-kernel.
- Retuned MC, KC, and NC to account for these changes.
  This results in better single-threaded performance for matrix
  sizes such as m=4000 or greater.


AMD-Internal: [CPUPL-5262]
Change-Id: I66c111ceb7feee776703339680d57e8d6d5c809a

2024-07-31 12:23:34 -04:00
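
The fringe cases mentioned in the commit message arise when a matrix dimension is not an exact multiple of the micro-tile size. The following is a minimal sketch in C of that arithmetic only; it is not code from BLIS. The names split_t and split_dim are hypothetical, and the block sizes used in the usage note are illustrative assumptions, since the commit message does not state the actual MR and NR values.

```c
#include <assert.h>

/* Hypothetical helper type, not part of BLIS: describes how one matrix
   dimension splits into full blocks plus a trailing fringe block. */
typedef struct {
    long full;    /* number of complete blocks of size blk */
    long fringe;  /* size of the trailing partial block, 0 if none */
} split_t;

/* Split a dimension of length dim into blocks of size blk.
   For the m dimension blk plays the role of MR, for n that of NR. */
static split_t split_dim(long dim, long blk)
{
    split_t s;

    assert(dim >= 0 && blk > 0);

    s.full   = dim / blk;  /* micro-tiles handled by the full-size path */
    s.fringe = dim % blk;  /* leftover rows or columns: the fringe case */
    return s;
}
```

With an assumed block size of 24, split_dim(4000, 24) yields 166 full blocks and a fringe of 16, so a dimension of 4000 would exercise the fringe path. With an assumed block size of 8, split_dim(4000, 8) yields 500 full blocks and no fringe.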

File                  Last commit                                         Date
bli_cntx_init_zen5.c  DGEMM Native AVX512 updates                         2024-07-31 12:23:34 -04:00
bli_family_zen5.h     BLIS: Implement zen5 sub-configuration              2024-04-12 07:26:31 -04:00
make_defs.cmake       Removed -fno-tree-loop-vectorize from kernel flags  2024-07-19 00:49:52 -04:00
make_defs.mk          Int4 B matrix reordering support in LPGEMM.         2024-06-24 07:55:34 -04:00