Qianfeng 0875df0f9a Pr82 followup (#115)
* Use thread cluster descriptor and explicit M_K 2D descriptor to simplify Blockwise Reduction

* Replace ReduceDims with NumReduceDims as the Device Reduce interface template parameter

* Rename the folders for the pool2d and reduce examples

* Update reduction test scripts

* Add Readme for pool2d_fwd and reduce_blockwise examples

* Tiny fix in reduce profiler and tiny update in reduce testing scripts

* Tiny fix in testing script profile_reduce_no_index.sh

* Tiny change in script/profile_reduce_with_index.sh

* Rename and refine the Reduction profiler, device layer, and examples

* Rename all NumReduceDims to NumReduceDim

[ROCm/composable_kernel commit: 827301d95a]
2022-03-10 10:14:43 -06:00
Description
[DEPRECATED] Moved to ROCm/rocm-libraries repo. NOTE: develop branch is maintained as a read-only mirror
Readme MIT 234 MiB
Languages
C++ 93.1%
Python 4.5%
CMake 1.5%
Shell 0.5%
Pawn 0.2%