Enrico Degregori
|
faa7f9ae07
|
Wmma support for gemm_multiply_multiply_wp (#3278)
* Initial implementation with splitK support
* Add gfx11 support
* Fix compilation error
* Add instances
* Add irregular instances
* Fix GetBuffer arguments
* Minor changes
* Address review comments
* Fix compilation errors
* Fix copyright header
[ROCm/composable_kernel commit: 161835533b]
|
2025-12-03 07:38:23 -08:00 |
|
Aviral Goel
|
bb41ea37e1
|
chore(copyright) update library wide CMakeLists.txt copyright header template (#3313)
* chore(copyright) update library wide CMakeLists.txt files copyright header template
* Fix build
---------
Co-authored-by: Sami Remes <samremes@amd.com>
[ROCm/composable_kernel commit: 004784ef98]
|
2025-11-28 13:49:54 -08:00 |
|
Aviral Goel
|
d171245c4b
|
chore(copyright): update copyright header for test directory (#3265)
[ROCm/composable_kernel commit: f6c999bddb]
|
2025-11-22 19:38:27 -05:00 |
|
Enrico Degregori
|
9575bcd099
|
Fix splitk preshuffle (#3137)
* Fix splitK multiply_multiply_wp
* Add tests for gemm_multiply_multiply_wp
* Add tests for gemm_universal_preshuffle (KBatch = 1)
* Add tests for gemm_blockscale_wp
* Fix splitk gemm universal preshuffle
* Run new tests on arch supporting fp8
* Restore example
* Fix strides profiler
* Fix tests
* Fix clang format
* Finalize profiler preshuffle with tolerances
* Minor improvements to splitk related changes
* Address review comments: clang format and ckProfiler typo
* Remove b_k_split_offset from SplitKBatchOffset struct
[ROCm/composable_kernel commit: 507d81c3af]
|
2025-11-03 11:59:01 -08:00 |
|