blis/kernels
Shubham 18569b42ee Added DTRSM small RUNN/RLTN variant AVX512 kernels
- 8x8 kernels are used for DTRSM SMALL
- Matrix A (a10) is packed for GEMM operations.
- The packed matrix A is reused across all column blocks
  along the N dimension.
- Diagonal elements of matrix A (a11) are packed for
  TRSM operations.
- Implemented fringe cases with the following block sizes:
   8x8, 8x4, 8x3, 8x2, 8x1
   4x8, 4x4, 4x3, 4x2, 4x1
   3x8, 3x4, 3x3, 3x2, 3x1
   2x8, 2x4, 2x3, 2x2, 2x1
   1x8, 1x4, 1x3, 1x2, 1x1
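
The fringe block sizes above suggest that each dimension is covered by
full 8-wide blocks first, with the remainder handled by 4, 3, 2, and 1
wide blocks. The sketch below illustrates one way such a split could be
computed. It is an illustration only, not code from these kernels: the
names next_block and split_dim are hypothetical, and the greedy
largest-block-first order is an assumption that the commit message does
not confirm.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the largest supported block size
 * (8, 4, 3, 2 or 1) that does not exceed `remaining`, or 0 if
 * nothing is left. */
size_t next_block(size_t remaining)
{
    static const size_t sizes[] = { 8, 4, 3, 2, 1 };
    for (size_t i = 0; i < sizeof sizes / sizeof sizes[0]; ++i) {
        if (remaining >= sizes[i]) {
            return sizes[i];
        }
    }
    return 0;
}

/* Hypothetical helper: splits `dim` into supported block sizes,
 * largest first (assumed order). Writes at most `cap` entries to
 * `out` and returns the number of blocks written. */
size_t split_dim(size_t dim, size_t *out, size_t cap)
{
    assert(out != NULL);
    size_t count = 0;
    while (dim > 0 && count < cap) {
        size_t b = next_block(dim);
        out[count++] = b;
        dim -= b;
    }
    return count;
}
```

Under these assumptions, a dimension of 23 would be split into blocks
of 8, 8, 4 and 3. Applying the same split to both M and N yields the
block combinations listed above.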

AMD-Internal: [CPUPL-2745]

Change-Id: I6a174e7f88a4c2c5778052525879552a1e82f6ad
2023-02-16 06:49:46 -05:00