blis/kernels at 21aa63eca1fa47817070365ac19b4480d47400fe - blis - Public git mirror

amd/blis

mirror of https://github.com/amd/blis.git synced 2026-04-19 23:28:52 +00:00


Meghana Vankadari 21aa63eca1 Implemented AVX2 based GEMV for n=1 case.

- Added a new GEMV kernel with MR = 8, which is used
  for cases where n=1.
- Modified the GEMM and GEMV framework to choose the right GEMV kernel
  based on compile-time and run-time architecture parameters. This
  had to be done because GEMV kernels are not stored in, or retrieved
  from, the cntx.
- Added a pack kernel that packs the A matrix from col-major to row-major
  using AVX2 instructions.

AMD-Internal: [SWLCSG-3519]
Change-Id: Ibf7a8121d0bde37660eac58a160c5b9c9ebd2b5c
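
To make the commit description above concrete, here is a minimal scalar sketch of the two ideas it names: packing A from column-major to row-major, and computing the n=1 case (y := alpha*A*x + beta*y) in blocks of MR = 8 rows. It is not the code from this commit. The function names (pack_col_to_row, gemv_n1), the float data type, and the plain-C loops in place of AVX2 intrinsics are all assumptions made for illustration; only MR = 8 and the col-major to row-major packing come from the message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not the BLIS API. */
enum { MR = 8 }; /* rows handled per block, as stated in the commit message */

/* Pack an m x k column-major matrix (leading dimension lda >= m)
 * into a contiguous row-major buffer whose row stride is k. */
static void pack_col_to_row(size_t m, size_t k,
                            const float *a, size_t lda,
                            float *a_packed)
{
    assert(lda >= m);
    for (size_t j = 0; j < k; ++j)
        for (size_t i = 0; i < m; ++i)
            a_packed[i * k + j] = a[i + j * lda];
}

/* One output element; y is not read when beta == 0, so stale
 * values (including NaN/Inf) in y cannot leak into the result. */
static float scale_store(float alpha, float acc, float beta, float y_old)
{
    return (beta == 0.0f) ? alpha * acc : alpha * acc + beta * y_old;
}

/* y := alpha * A * x + beta * y for a row-major m x k matrix A,
 * i.e. GEMM with n = 1. Rows are processed MR at a time. */
static void gemv_n1(size_t m, size_t k, float alpha,
                    const float *a_rm, const float *x,
                    float beta, float *y)
{
    size_t i = 0;

    /* Main loop: MR independent accumulators, one per row. */
    for (; i + MR <= m; i += MR) {
        float acc[MR] = {0.0f};
        for (size_t p = 0; p < k; ++p) {
            const float xp = x[p];
            for (size_t r = 0; r < MR; ++r)
                acc[r] += a_rm[(i + r) * k + p] * xp;
        }
        for (size_t r = 0; r < MR; ++r)
            y[i + r] = scale_store(alpha, acc[r], beta, y[i + r]);
    }

    /* Remainder loop: fewer than MR rows left. */
    for (; i < m; ++i) {
        float acc = 0.0f;
        for (size_t p = 0; p < k; ++p)
            acc += a_rm[i * k + p] * x[p];
        y[i] = scale_store(alpha, acc, beta, y[i]);
    }
}
```

After packing, each row of A is contiguous, so the inner loop over p reads memory sequentially for every one of the MR rows. A vectorized kernel would presumably exploit that layout, though how the actual AVX2 kernel arranges its loads is not described in the message.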

2025-05-05 08:56:22 +00:00


Merge commit 'cfa3db3f' into amd-main

2024-07-08 06:09:11 -04:00

Add explicit handling for beta == 0 in armsve sd and armv7a d gemm ukrs.

2021-09-29 16:43:38 -05:00

Armv8 Trash New Bulk Kernels

2021-10-08 02:35:58 +09:00

Replaced use of bool_t type with C99 bool.

2020-08-03 11:27:13 +05:30

Code cleanup: Copyright notices

2024-08-05 15:35:08 -04:00

Added a dummy file to kernels/generic.

2017-11-21 12:34:20 -06:00

BLIS: Missing clobbers (batch 8)

2025-02-07 10:39:24 -05:00

Remove UT-Austin from copyright headers' clause 3.

2018-12-04 14:31:06 -06:00

Code cleanup: Copyright notices

2024-08-05 15:35:08 -04:00

Remove UT-Austin from copyright headers' clause 3.

2018-12-04 14:31:06 -06:00

Optionally disable trsm diagonal pre-inversion.

2020-12-04 16:08:15 -06:00

Code cleanup: Miscellaneous fixes

2024-08-06 06:56:01 -04:00

Remove UT-Austin from copyright headers' clause 3.

2018-12-04 14:31:06 -06:00

Merge commit 'e366665c' into amd-main

2023-10-18 09:09:54 -04:00

Code cleanup: No newline at end of file

2023-04-21 10:02:48 -04:00

BLIS: Missing clobbers (batch 7)

2023-11-22 17:51:46 -05:00

Code cleanup: Copyright notices

2024-08-05 15:35:08 -04:00

Implemented AVX2 based GEMV for n=1 case.

2025-05-05 08:56:22 +00:00

Code cleanup: AMD copyright notice

2023-11-23 08:54:31 -05:00

Added support for zen3 configuration

2020-07-22 18:24:26 +05:30

Optimisation for DCOPY API

2025-04-28 05:58:21 -04:00

Optimisation for DCOPY API

2025-04-28 05:58:21 -04:00

CMakeLists.txt

CMake: compiler flags updated for lpgemm kernels under zen folder.

2025-04-30 06:09:36 -04:00