blis/kernels/zen at ea79efa915af7c2ac1c6c2f3b4a86ec83b6a446d - blis

amd/blis

mirror of https://github.com/amd/blis.git synced 2026-05-11 17:50:00 +00:00


Nallani Bhaskar ea79efa915 Fixed out of bound memory access in sgemmsup zen rv kernels

Details:

1. In the sgemmsup_zen_rv_?x2 kernels, the "vmovps" instruction
   is used to load the B matrix in the k loop and k last loop;
   it loads 128 bits into xmm rather than the expected 64 bits.

2. Changed the vmovps instructions to vmovsd instructions,
   which load only 64 bits into the xmm register.

3. Avoided accessing C memory directly from the vfma instruction when
   scaling by a non-zero beta in corner cases; the memory operand
   requires a 128-bit access, which leads to an out-of-bound access.
   Instead, vmovq first loads the 64 bits of data, and vfma is then
   performed on the xmm register, in rv_6x8m and rv_6x4m.

   AMD-Internal: [CPUPL-2472]

Change-Id: Iad397f8f5b5cc607b4278b603b1e0ea3f6b082f2
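
The idea behind the three points above can be sketched in C with SSE intrinsics. This is only an illustration, not the kernel code: the real kernels are inline assembly using vmovsd/vmovq and vfma, while the sketch uses a 64-bit movlps-style load via _mm_loadl_pi and a separate multiply and add so that it builds without extra compiler flags. The names load2_ps, dot_row_x2 and update_c_2cols are hypothetical, and B is assumed to be stored by rows with row stride ldb.

```c
#include <stddef.h>
#include <xmmintrin.h>  /* SSE intrinsics; baseline on x86-64 */

/* Hypothetical helper: load exactly two floats (64 bits) from p into the
   low half of an xmm register and zero the upper half. A full 128-bit load
   (_mm_loadu_ps, i.e. vmovups) would also read p[2] and p[3], which lie
   outside the buffer when only two columns remain. */
static inline __m128 load2_ps(const float *p)
{
    return _mm_loadl_pi(_mm_setzero_ps(), (const __m64 *)p);
}

/* Hypothetical helper: one row of a ?x2 tile.
   acc[j] = sum over p of a[p] * b[p*ldb + j], for j = 0, 1.
   Each iteration reads only 64 bits of B. */
static inline __m128 dot_row_x2(const float *a, const float *b,
                                int k, int ldb)
{
    __m128 acc = _mm_setzero_ps();
    for (int p = 0; p < k; ++p) {
        __m128 bv = load2_ps(b + (size_t)p * (size_t)ldb); /* 64-bit load */
        __m128 av = _mm_set1_ps(a[p]);                     /* broadcast a */
        acc = _mm_add_ps(acc, _mm_mul_ps(av, bv));
    }
    return acc;
}

/* Hypothetical helper: c[0..1] = ab[0..1] + beta * c[0..1].
   C is first loaded with a 64-bit load into a register and the arithmetic
   is done register-to-register, instead of giving the multiply-add a
   128-bit memory operand that points at C. */
static inline void update_c_2cols(float *c, __m128 ab, float beta)
{
    __m128 c_old = load2_ps(c);                       /* 64-bit load  */
    __m128 c_new = _mm_add_ps(ab, _mm_mul_ps(c_old, _mm_set1_ps(beta)));
    _mm_storel_pi((__m64 *)c, c_new);                 /* 64-bit store */
}
```

With a = {1, 2, 3}, a 3x2 B = {1, 2, 3, 4, 5, 6} (ldb = 2), C = {10, 20} and beta = 0.5, calling dot_row_x2 followed by update_c_2cols leaves C = {27, 38}. The final row of B is read without touching the two floats past its end.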

2022-08-30 01:13:14 -04:00

Added support for AVX512 for Windows and AMAVX

2022-06-08 11:09:48 +05:30

Removed Arch specific code from BLIS framework.

2022-05-17 20:35:40 +05:30

Enabled ZHER Optimized Path

2022-08-29 08:09:42 -04:00

Fixed out of bound memory access in sgemmsup zen rv kernels

2022-08-30 01:13:14 -04:00

util

DGEMMT : Tuning SUP threshold to improve ST and MT performance.

2022-05-17 18:09:22 +05:30

bli_kernels_zen.h

Code cleanup and warnings fixes

2022-08-29 15:15:40 +05:30

CMakeLists.txt

Removed packm kernels of zen

2021-11-12 08:58:51 +05:30