ROCm/composable_kernel
Mirror of https://github.com/ROCm/composable_kernel.git, synced 2026-05-05 22:22:27 +00:00
7ac379428408337a231a86f8a8b7353b5b45aa2d
composable_kernel/include/ck/tensor_operation/gpu
Ville Pietilä, 7ac3794284: Add new instances for merging multiple fwd conv groups into a single GEMM batch. Allow group merging for C > 1 when vector load/store size is 1 for the output tensor. (#3639)
...
Co-authored-by: Ville Pietilä <>
2026-01-25 13:42:23 +01:00
Name | Last commit | Date
block | Implement grouped gemm tile loop for RDNA4 (#3304) | 2026-01-13 07:14:23 +01:00
device | Add new instances for merging multiple fwd conv groups into a single GEMM batch. Allow group merging for C > 1 when vector load/store size is 1 for the output tensor. (#3639) | 2026-01-25 13:42:23 +01:00
element | Implement grouped gemm tile loop for RDNA4 (#3304) | 2026-01-13 07:14:23 +01:00
grid | Remove code duplications in batched gemm wmma (#3580) | 2026-01-23 12:39:03 -08:00
thread | Grouped convolution forward device implementation and base flavors for RDNA3/4 (#2964) | 2025-12-18 13:12:15 -07:00
warp | chore(copyright): update copyright header for include directory (#3293) | 2025-11-26 11:00:05 -07:00