Chao Liu 8cba08d07a Gemm+Reduce Fusion (#128)
* add gridwise gemm v4r1

* rename

* adding gemm+reduce

* adding gemm+reduce

* adding gemm+reduce

* adding gemm+reduce

* use sfc in shuffling

* remove hardcode

* remove hardcode

* refactor

* fix build

* adding gemm+reduce

* adding gemm+reduce

* adding gemm+reduce

* adding gemm+reduce

* adding gemm+reduce

* format

* clean

* adding gemm+reduce

* adding profiler for gemm+reduce

* adding gemm+reduce profiler

* fix build

* clean up

* gemm+reduce

* fix build

* update DeviceGemm_Xdl_CShuffle; update enum to enum class

* clean up

* add test for gemm+reduce

* clean up

* refactor

* fix build

* fix build

[ROCm/composable_kernel commit: f95267f166]
2022-03-23 22:18:42 -05:00
Description
[DEPRECATED] Moved to ROCm/rocm-libraries repo. NOTE: develop branch is maintained as a read-only mirror
License: MIT. Repository size: 234 MiB
Languages
C++ 93.1%
Python 4.5%
CMake 1.5%
Shell 0.5%
Pawn 0.2%