Harsh Dave
590c763e22
Implemented ctrsm small kernels
Details:
-- AMD Internal Id: CPUPL-1702
-- Used an 8x3 CGEMM kernel with vector FMA, using ymm registers
efficiently to produce 24 scomplex outputs at a time
-- Packed matrix A so that it stays in cache and can be reused
-- Implemented the kernels using a macro-based modular approach
-- Added ctrsm_small to the ctrsm_ BLAS path: single-threaded
when (m,n)<1000 and multithreaded when (m+n)<320
-- Handled the --disable_pre_inversion configuration
-- Achieved a 13% average performance improvement for sizes less than 1000
-- Modularized all 16 combinations of trsm into 4 kernels
Change-Id: I557c5bcd8cb7c034acd99ce0666bc411e9c4fe64
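
The dispatch rule and the 16-to-4 grouping described above can be sketched roughly as follows. This is an illustration only, not the code from this commit: the function and enum names are hypothetical, "(m,n)<1000" is assumed to mean that both dimensions are below 1000, and the grouping assumes that a transposed triangle is handled as the opposite triangle while the unit/non-unit diagonal is passed to the kernel as a flag.

```c
#include <stdbool.h>

/* Thresholds quoted in the commit message; the names are hypothetical. */
enum { SMALL_ST_LIMIT = 1000, SMALL_MT_LIMIT = 320 };

/* Assumption: "(m,n)<1000" means m < 1000 and n < 1000. */
bool use_ctrsm_small(long m, long n, int num_threads)
{
    if (num_threads <= 1)
        return m < SMALL_ST_LIMIT && n < SMALL_ST_LIMIT;
    return (m + n) < SMALL_MT_LIMIT;
}

/* Four hypothetical kernel ids: side x effective triangle. */
typedef enum {
    KERNEL_LEFT_LOWER,
    KERNEL_LEFT_UPPER,
    KERNEL_RIGHT_LOWER,
    KERNEL_RIGHT_UPPER
} small_kernel_t;

/*
 * side:  'L' (A on the left) or 'R' (A on the right)
 * uplo:  'L' (lower) or 'U' (upper)
 * trans: 'N' (no transpose), 'T' or 'C' (transposed)
 *
 * side(2) x uplo(2) x trans(2) x diag(2) gives 16 cases. Transposing a
 * triangular matrix flips its triangle, so uplo and trans collapse into
 * one "effective triangle"; diag does not affect the kernel choice here.
 */
small_kernel_t select_kernel(char side, char uplo, char trans)
{
    bool lower = (uplo == 'L');
    if (trans != 'N')
        lower = !lower; /* transposition swaps lower and upper */

    if (side == 'L')
        return lower ? KERNEL_LEFT_LOWER : KERNEL_LEFT_UPPER;
    return lower ? KERNEL_RIGHT_LOWER : KERNEL_RIGHT_UPPER;
}
```

Under these assumptions, select_kernel('L', 'U', 'T') and select_kernel('L', 'L', 'N') choose the same kernel, which is how 16 parameter combinations can share 4 code paths.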
2021-11-12 08:58:55 +05:30