blis/kernels
satish kumar nuggu 23278627f4 STRSM small kernel implementation
Details:
-- AMD Internal Id: [CPUPL-1702]
-- Used a 16x6 SGEMM kernel with vector FMA on ymm registers
-- Packed matrix A for effective caching and reuse
-- Implemented the kernels using a macro-based modular approach
-- Handled the --disable_pre_inversion configuration
-- Modularized the 16 STRSM combinations of TRSM into 4 kernels

Change-Id: I30a1551967c36f6bae33be3b7ae5b7fcc7c905ea
2021-11-12 08:58:55 +05:30
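
The last item in the commit details mentions folding 16 TRSM combinations into 4 kernels. The commit does not say how the cases are grouped, so the following is only a minimal sketch of one plausible grouping. It assumes the 16 cases come from side x uplo x trans x diag, that a transpose is absorbed by flipping the effective triangle, and that unit vs. non-unit diagonal is handled inside each kernel. All type and function names here are hypothetical and are not BLIS APIs.

```c
#include <stdbool.h>

/* Hypothetical parameter enums for illustration only.
 * BLIS defines its own side/uplo/trans/diag types. */
typedef enum { SIDE_LEFT, SIDE_RIGHT } sketch_side;
typedef enum { UPLO_LOWER, UPLO_UPPER } sketch_uplo;
typedef enum { TRANS_NO, TRANS_YES } sketch_trans;
typedef enum { DIAG_NONUNIT, DIAG_UNIT } sketch_diag;

/* Four hypothetical kernel variants: side x effective triangle of op(A). */
typedef enum {
    KERNEL_LEFT_LOWER,
    KERNEL_LEFT_UPPER,
    KERNEL_RIGHT_LOWER,
    KERNEL_RIGHT_UPPER,
    KERNEL_COUNT
} sketch_kernel;

/* Map one of the 16 parameter combinations to one of 4 kernels.
 * Assumption: transposing a lower-triangular A yields an upper-triangular
 * op(A) (and vice versa), so trans only flips the effective triangle. */
static sketch_kernel select_kernel(sketch_side side, sketch_uplo uplo,
                                   sketch_trans trans, sketch_diag diag)
{
    (void)diag; /* assumed to be a flag handled inside the chosen kernel */

    bool op_is_lower = (uplo == UPLO_LOWER) != (trans == TRANS_YES);

    if (side == SIDE_LEFT)
        return op_is_lower ? KERNEL_LEFT_LOWER : KERNEL_LEFT_UPPER;
    return op_is_lower ? KERNEL_RIGHT_LOWER : KERNEL_RIGHT_UPPER;
}

/* Enumerate all 16 combinations and count how many land on each kernel. */
static void count_cases_per_kernel(int counts[KERNEL_COUNT])
{
    for (int k = 0; k < KERNEL_COUNT; ++k)
        counts[k] = 0;

    for (int s = 0; s < 2; ++s)
        for (int u = 0; u < 2; ++u)
            for (int t = 0; t < 2; ++t)
                for (int d = 0; d < 2; ++d)
                    counts[select_kernel((sketch_side)s, (sketch_uplo)u,
                                         (sketch_trans)t, (sketch_diag)d)]++;
}
```

Under these assumptions each of the four kernels covers four of the 16 parameter combinations, which is consistent with the count given in the commit details. The real kernels may group the cases differently.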