ROCm/composable_kernel
Mirror of https://github.com/ROCm/composable_kernel.git, synced 2026-06-30 11:47:48 +00:00
f88efddea48ca2bb9b25a99697cb7d699f289a6f
composable_kernel/include/ck/tensor_operation/gpu
Ville Pietilä
f88efddea4
Add new instances for merging multiple fwd conv groups into a single GEMM batch. Allow group merging for C > 1 when vector load/store size is 1 for the output tensor.
2026-01-23 06:26:41 -05:00
..
block
Implement grouped gemm tile loop for RDNA4 (#3304)
2026-01-13 07:14:23 +01:00
device
Add new instances for merging multiple fwd conv groups into a single GEMM batch. Allow group merging for C > 1 when vector load/store size is 1 for the output tensor.
2026-01-23 06:26:41 -05:00
element
Implement grouped gemm tile loop for RDNA4 (#3304)
2026-01-13 07:14:23 +01:00
grid
Implement batched gemm add relu gemm add for rdna4 (#3391)
2026-01-20 13:06:59 -08:00
thread
Grouped convolution forward device implementation and base flavors for RDNA3/4 (#2964)
2025-12-18 13:12:15 -07:00
warp
chore(copyright): update copyright header for include directory (#3293)
2025-11-26 11:00:05 -07:00