blis/kernels
Mangala.V 8504ef013d Optimisation of DTRSM and ZTRSM
1. Replaced the extract instruction with a cast when accessing the first 128 bits,
   as the cast needs no cycles whereas the extract takes a few cycles.
2. Added a prefetch of the A buffer during the GEMM computation.
3. Added a prefetch of the C11 buffer before the TRSM operation, with an offset of 7 to cs_c.

With the above changes, performance improvements were observed in the single-threaded case.

Change-Id: Id377c490ddac8b06384acfa9a6d89dbe11bbc7be
2022-08-11 01:39:40 -04:00
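
The three changes in the commit message can be illustrated with a minimal sketch in C using x86 AVX/SSE intrinsics. This is not the code from the commit: the function names (low_half_extract, low_half_cast, dot_prefetch_a, prefetch_c11), the prefetch distance of 8 elements, the _MM_HINT_T0 hint, and the reading of "offset of 7 to cs_c" as element 7 of each C11 column are all assumptions made for illustration. The target attribute assumes GCC or Clang on x86-64.

```c
#include <immintrin.h>
#include <stddef.h>

/* All names below are hypothetical; they do not come from the BLIS sources. */

/* Change 1, before: read the first 128 bits of a 256-bit register with an
   extract intrinsic, which maps to a real instruction (vextractf128). */
__attribute__((target("avx")))
void low_half_extract(const double *src, double *dst)
{
    __m256d v  = _mm256_loadu_pd(src);
    __m128d lo = _mm256_extractf128_pd(v, 0);
    _mm_storeu_pd(dst, lo);
}

/* Change 1, after: the cast only reinterprets the register, so the
   compiler normally emits no instruction for it. */
__attribute__((target("avx")))
void low_half_cast(const double *src, double *dst)
{
    __m256d v  = _mm256_loadu_pd(src);
    __m128d lo = _mm256_castpd256_pd128(v);
    _mm_storeu_pd(dst, lo);
}

/* Change 2 (simplified to a dot product): prefetch upcoming elements of A
   while the current ones are consumed. The distance of 8 is an assumption. */
double dot_prefetch_a(const double *a, const double *b, size_t k)
{
    double acc = 0.0;
    for (size_t p = 0; p < k; ++p) {
        if (p + 8 < k)
            _mm_prefetch((const char *)(a + p + 8), _MM_HINT_T0);
        acc += a[p] * b[p];
    }
    return acc;
}

/* Change 3: touch each column of the C11 block before the TRSM step.
   Assumed interpretation: prefetch element 7 of column j, where columns
   are cs_c elements apart. */
void prefetch_c11(const double *c11, size_t cs_c, size_t n_cols)
{
    for (size_t j = 0; j < n_cols; ++j)
        _mm_prefetch((const char *)(c11 + 7 + j * cs_c), _MM_HINT_T0);
}
```

Both low_half_* functions store the same two doubles; the difference lies only in the instructions generated. An optimizing compiler may already turn an extract of lane 0 into a cast. The prefetch calls never change results; they are only hints to the cache hierarchy, so any gain has to be measured on the target machine.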