blis/kernels/zen
Mangala V f6046784ce Redesigned SGEMM SUP kernel to use masked load/store instructions
Added all fringe kernels with masked load/store support.
Fringe kernels cover the m direction from 5 down to 1 and
the n direction from 15 down to 1 for the row storage format.

- New edge kernels that use masked load-store
  instructions for handling corner cases.

- Masked load-store instruction macros are added:
  vmaskmovps and VMASKMOVPS.

- This improves performance by reducing branching overhead
  and by being more cache friendly.

- Masked load-store is added only for the row storage format.
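
For illustration only, the idea behind the masked fringe handling can be
sketched as a scalar C model. This is not the kernel code from this commit,
which uses inline-assembly macros. The function names (build_mask, mask_load,
mask_store, update_fringe_row) are hypothetical, and the 8-lane width assumes
256-bit registers holding single-precision floats. The model follows the
vmaskmovps rule that a lane is active when the sign bit of its 32-bit mask
element is set; inactive lanes load as zero and are never written on store.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 8 /* assumed: 8 floats per 256-bit register */

/* Build a per-lane mask for n_left valid floats (0..LANES).
   A lane is active when its mask element has the sign bit set. */
static void build_mask(int n_left, int32_t mask[LANES])
{
    assert(n_left >= 0 && n_left <= LANES);
    for (int i = 0; i < LANES; ++i)
        mask[i] = (i < n_left) ? -1 : 0;
}

/* Scalar model of a masked load: inactive lanes become 0.0f
   and their memory is not read. */
static void mask_load(float dst[LANES], const float *src,
                      const int32_t mask[LANES])
{
    for (int i = 0; i < LANES; ++i)
        dst[i] = (mask[i] < 0) ? src[i] : 0.0f;
}

/* Scalar model of a masked store: inactive lanes leave memory untouched. */
static void mask_store(float *dst, const float src[LANES],
                       const int32_t mask[LANES])
{
    for (int i = 0; i < LANES; ++i)
        if (mask[i] < 0)
            dst[i] = src[i];
}

/* Update one row of C with n valid columns (1..15 in the fringe case):
   c_row = beta * c_row + alpha * ab_row.
   The same loop body handles every n; only the mask changes. */
static void update_fringe_row(float *c_row, const float *ab_row, int n,
                              float alpha, float beta)
{
    for (int j = 0; j < n; j += LANES) {
        int n_left = (n - j < LANES) ? (n - j) : LANES;
        int32_t mask[LANES];
        float c[LANES], ab[LANES];

        build_mask(n_left, mask);
        mask_load(c, c_row + j, mask);
        mask_load(ab, ab_row + j, mask);
        for (int i = 0; i < LANES; ++i)
            c[i] = beta * c[i] + alpha * ab[i];
        mask_store(c_row + j, c, mask);
    }
}
```

In this model, a row with n = 11 is handled as one full 8-lane step followed
by one step with 3 active lanes, so no per-width branch or scalar cleanup
loop is needed and memory past the row end is neither read nor written.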

AMD-Internal: [CPUPL-4041]

Change-Id: I563c036c79bf8e476a8ebde37f8f6db751fb3456
2023-11-10 01:23:48 -05:00