ROCm/composable_kernel
Mirror of https://github.com/ROCm/composable_kernel.git, synced 2026-07-03 13:48:30 +00:00
521970ce2fc6aaeb9529ab6a10e3d4f474cd5484
composable_kernel/include/ck/tensor_operation/gpu
Latest commit: b8d4b01267 by kiefer (2025-09-04 14:22:26 +00:00): Implement rotating memory and flush cache. Requires ad-hoc buffer size calculations.
block: upgrade from clang-format-12 to clang-format-18 (#2568) (2025-07-28 11:34:07 -07:00)
device: Implement rotating memory and flush cache. Requires ad-hoc buffer size calculations. (2025-09-04 14:22:26 +00:00)
element: Merge remote-tracking branch 'origin/develop' into 90-prepare-an-upstream-pr-for-multipled-based-gemms (2025-08-06 07:47:43 +00:00)
grid: Add CTranspose optimization for NCHW cases just like in xdl cshuffle non-v3 device implementation. (2025-08-24 12:44:01 +00:00)
thread: Add instances for all 8-bit 3D vanilla grouped conv fwd types, including mixed types but with the exception of deprecated f16 comp fp8. Adapt test so we can test 8-bit and mixed types. (2025-08-26 09:20:38 +00:00)
warp: clang-format-18 (2025-08-06 11:53:43 +00:00)