zjing14
ca40ef6976
Add xdlops v4r4r4 into online compilation (#48)
* init for v4r4 xdlops olc
* refactor wrap
* init impl of v4r4 nchw xdlops olc
* tuning
* test perf
* fixed v4r4 nhwc
* tuned v4r4 nhwc
* use gridwise_gemm_xdlops_v2r3
* swap a/b
* add pointer support into offline v2r3
* debugging v4r4r4 transform for olc
* change timer of olc
* refactor v4r4 xdlops nchw olc
* remove transform fun in v4r4 xdlops nhwc olc
Co-authored-by: Chao Liu <chao.liu2@amd.com>
[ROCm/composable_kernel commit: fbdf4332c7]
2021-07-16 23:27:08 -05:00