mirror of
https://github.com/ROCm/composable_kernel.git
synced 2026-05-14 10:09:41 +00:00
87522d2ac9403ed29712790afbd4b8ff7f38d97f
* adding batched_gemm_and_reduction
* batched_gemm_reduce works with bactch_count=1
* fix a bug in grid_size; batched_gemm_reduce works for batch_count > 1
* adding profiler for batched_gemm_fp16
* fixed a bug in declaration of d1 and d0; both example and profiler work
* clang-format
* cleanup
* batched_gemm_reduce: add test
* minor change
* fixed some typo in function names
[ROCm/composable_kernel commit: 34c661e71c]
Languages
C++
93.1%
Python
4.5%
CMake
1.5%
Shell
0.5%
Pawn
0.2%