Commit Graph

3885 Commits

Author SHA1 Message Date
assistant-librarian[bot]
2044d0dd35 Merge commit 'de6466481f9472350a5f4afce27c86ecdbb5b42f' into develop 2025-11-26 18:14:59 +00:00
Aviral Goel
ee7a68b10f chore(copyright): update copyright header for include directory (#3293)
[ROCm/composable_kernel commit: de6466481f]
2025-11-26 11:00:05 -07:00
John Shumway
d449d96c98 Fix template parameter macros (#3305)
Some of the device implementation templates have macros like GridwiseGemmMultiABDTemplateParameters that can cause build errors if multiple files are included together. This error comes up with our builder code.

To clean up the macros and make them safer, we follow these rules:
* Use more specific names to avoid duplication.
* Undefine the macro after it is used to avoid leaking out of the file scope.
* Use a prefix CK_ on the macro to avoid conflicting with other libraries.
* Use all caps with underscores for preprocessor macro names.
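
The four rules above can be sketched roughly as follows. This is a minimal illustration and not code from the commit: the macro name `CK_EXAMPLE_DEVICE_GEMM_TEMPLATE_PARAMETERS` and the `Example*` templates are hypothetical names invented for the example.

```cpp
#include <cassert>

// Hypothetical stand-in for a gridwise kernel template (not a real CK type).
template <typename ADataType, typename BDataType, int BlockSize>
struct ExampleGridwiseGemm
{
    static constexpr int block_size = BlockSize;
};

// Rules 1, 3, 4: a specific, CK_-prefixed, ALL_CAPS name for the shared
// template argument list.
#define CK_EXAMPLE_DEVICE_GEMM_TEMPLATE_PARAMETERS ADataType, BDataType, BlockSize

template <typename ADataType, typename BDataType, int BlockSize>
struct ExampleDeviceGemm
{
    // The macro expands to the argument list of the gridwise template.
    using Gridwise = ExampleGridwiseGemm<CK_EXAMPLE_DEVICE_GEMM_TEMPLATE_PARAMETERS>;
};

// Rule 2: undefine right after the last use so the macro cannot leak into
// any file that includes this one.
#undef CK_EXAMPLE_DEVICE_GEMM_TEMPLATE_PARAMETERS

#ifdef CK_EXAMPLE_DEVICE_GEMM_TEMPLATE_PARAMETERS
#error "macro leaked past its intended scope"
#endif

// Small helper so the expansion can be checked at run time.
inline int example_block_size()
{
    return ExampleDeviceGemm<float, float, 256>::Gridwise::block_size;
}
```

Because the macro is undefined before the end of the file, a second header could define its own parameter-list macro without a redefinition conflict when both are included together.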

[ROCm/composable_kernel commit: 10a782d846]
2025-11-26 09:48:17 -08:00
assistant-librarian[bot]
283383c61c Merge commit '35a4b26af0088ca0d634b57055a4143fdb9f2e2d' into develop 2025-11-26 07:13:26 +00:00
Aviral Goel
cb0aaf8e90 fix: add dynamic selection of pipelines for aquant mode (#3282)
- Add conditional selection to use v3 pipeline when PreshuffleQuant is true
- Add static assertion in memory pipeline to prevent PreshuffleQuant usage
- Restore BaseBQuantGemmPipelineAgBgCrCompV3 for BQuant cases
- Update BaseGemmPipeline selection to handle all quant modes properly
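
A compile-time selection of this kind might look like the sketch below. All type names here are hypothetical stand-ins rather than the real CK Tile pipeline classes, and the fallback to a memory pipeline when `PreshuffleQuant` is false is an assumption made for illustration.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical problem descriptor carrying the PreshuffleQuant flag.
template <bool PreshuffleQuant>
struct ExampleProblem
{
    static constexpr bool preshuffle_quant = PreshuffleQuant;
};

// Hypothetical "v3" pipeline: accepts any problem.
template <typename Problem>
struct ExampleCompV3Pipeline
{
    static constexpr int version = 3;
};

// Hypothetical memory pipeline: rejects preshuffled quantization at compile
// time. The assertion only fires if this specialization is instantiated.
template <typename Problem>
struct ExampleMemoryPipeline
{
    static_assert(!Problem::preshuffle_quant,
                  "memory pipeline does not support PreshuffleQuant");
    static constexpr int version = 1;
};

// Conditional selection: only the chosen branch is ever instantiated.
template <typename Problem>
using ExampleSelectedPipeline = std::conditional_t<Problem::preshuffle_quant,
                                                   ExampleCompV3Pipeline<Problem>,
                                                   ExampleMemoryPipeline<Problem>>;

// Helper returning the version of the selected pipeline.
template <bool PreshuffleQuant>
constexpr int example_selected_version()
{
    return ExampleSelectedPipeline<ExampleProblem<PreshuffleQuant>>::version;
}

static_assert(example_selected_version<true>() == 3, "preshuffle selects v3");
static_assert(example_selected_version<false>() == 1, "otherwise memory pipeline");
```

Naming `ExampleMemoryPipeline<Problem>` inside `std::conditional_t` does not instantiate it, so the static assertion stays silent unless a caller actually uses the memory pipeline with a preshuffled problem.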

[ROCm/composable_kernel commit: 35a4b26af0]
2025-11-26 10:58:09 +04:00
assistant-librarian[bot]
a86762f0f9 Merge commit '8fa90025d0da22683dabe721d77a75a536388683' into develop 2025-11-26 03:34:44 +00:00
Yi DING
631655adb1 [CK_TILE] Refine warp_gemm_attribute_mfma (#3272)
[ROCm/composable_kernel commit: 8fa90025d0]
2025-11-26 10:57:15 +08:00
assistant-librarian[bot]
9eb4b35ef6 Merge commit 'c7dce2ac29136939b6fe6aabadd026e53dcf35c9' into develop 2025-11-26 02:44:11 +00:00
Yi DING
e303358608 [CK_TILE] Fix Compilation of Flatmm Examples (#3285)
[ROCm/composable_kernel commit: c7dce2ac29]
2025-11-26 10:11:43 +08:00
Illia Silin
250fd6be12 Enable ck_builder in CI. (#3296)
* build and run ck_builder tests

* add test_ckb_all to targets

* fix syntax

* fix test path

* Update CMake targets for builder testing in CI (#3290)

Our existing CMake only had build targets. Update CMakeLists.txt to have CTEST targets:
* smoke-builder
* regression-builder
* check-builder

Co-authored-by: John Shumway <jshumway@amd.com>

* use check-builder target

* get rid of test_ckb_all target

* call ninja check-builder separately

---------

Co-authored-by: John Shumway <jshumway@amd.com>

[ROCm/composable_kernel commit: a54f7b1138]
2025-11-25 17:45:59 -08:00
assistant-librarian[bot]
6d42c1d821 Merge commit 'cd4729386927c3d20b70fc9465614e9158524598' into develop 2025-11-25 23:12:05 +00:00
Aviral Goel
9f94579e5c chore(copyright): update copyright header for experimental & example directory (#3292)
[ROCm/composable_kernel commit: cd47293869]
2025-11-26 03:09:39 +04:00
Bartłomiej Kocot
91cc903d12 [CK TILE] Grouped Conv Explicit Gemm (#3289)
* [CK TILE] Grouped Conv Explicit Gemm

* fixes

* apply builder fixes

[ROCm/composable_kernel commit: 00dfa2f2ce]
2025-11-25 23:28:35 +01:00
assistant-librarian[bot]
8965f337ca Merge commit '37ea1600888f515e5dfb7153b75b2f06474d880d' into develop 2025-11-25 21:12:15 +00:00
Khushbu Agarwal
a6241c62cc [CK-Tile] fix block scale example for gfx1201 (#3283)
[ROCm/composable_kernel commit: 37ea160088]
2025-11-25 13:10:28 -08:00
assistant-librarian[bot]
a5b724bc4d Merge commit '9ac2666d5b48efc3743ce073aab0a68833accf5c' into develop 2025-11-25 14:13:08 +00:00
Bartłomiej Kocot
083ea723a0 [CK_BUILDER] Add grouped conv bwd ck tile traits (#3281)
* [CK_BUILDER] Add grouped conv bwd ck tile traits

* copilot fixes

[ROCm/composable_kernel commit: 9ac2666d5b]
2025-11-25 14:57:43 +01:00
assistant-librarian[bot]
5375584eac Merge commit 'ab0101c59c6be6ad376ba668a51f0e38dca66aa2' into develop 2025-11-25 02:43:48 +00:00
Aviral Goel
804730c0f3 chore(copyright): update copyright header for library directory (#3274)
* chore(copyright): update copyright header for library directory

* chore(copyright): update copyright header for library directory

---------

Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>

[ROCm/composable_kernel commit: ab0101c59c]
2025-11-24 18:10:26 -08:00
Aviral Goel
91ffc9dd1e chore(copyright): update copyright header for example directory (#3273)
* chore(copyright): update copyright header for codegen directory

* chore(copyright): update copyright header for example directory

[ROCm/composable_kernel commit: d85f065b15]
2025-11-24 18:02:41 -08:00
assistant-librarian[bot]
de08f43ef6 Merge commit '229d43ea0c8b9c94092ce001e411f82c3766b6fb' into develop 2025-11-25 01:51:07 +00:00
rocking
9cb3d700da Fix batch prefill compile fail in aiter (#3279)
* Fix batch prefill aiter compile fail

* Fix compile error

[ROCm/composable_kernel commit: 229d43ea0c]
2025-11-25 09:46:32 +08:00
assistant-librarian[bot]
4aaa8c92bb Merge commit 'de6a9590abe907283e189abba1b487f8e5562d1b' into develop 2025-11-24 21:29:18 +00:00
Thomas Ning
99e6b461db Reorganize of KPack in GEMM (#3247)
* add the reorganize of KPack

* fix the compilation error

* fix the compilation error

[ROCm/composable_kernel commit: de6a9590ab]
2025-11-24 12:38:59 -08:00
assistant-librarian[bot]
5297edb40c Merge commit 'e95337c58c00d12b5c947006836f9fb46964b35c' into develop 2025-11-24 18:22:07 +00:00
Aviral Goel
f65f0820ca chore(copyright): update copyright header for codegen directory (#3266)
[ROCm/composable_kernel commit: e95337c58c]
2025-11-24 10:12:40 -08:00
John Shumway
04f8fa2316 Guard a builder test to avoid gfx11 and gfx12 (#3268)
We're getting a compile error on gfx11 and gfx12 for an I8 test that doesn't have a supported WMMA implementation. We'll need to build architecture support into the builder, but to get things green I'm just adding an ifndef guard around the test.
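
A guard of this kind could be sketched as below. The log does not show the actual macro or test that was guarded, so the use of `__gfx11__` and `__gfx12__` as the target indicators and every `example_` name are assumptions made for illustration.

```cpp
#include <cassert>

// Assumption: __gfx11__ / __gfx12__ are defined when compiling for those
// architectures. The real guard used in the commit is not shown in the log.
#if !defined(__gfx11__) && !defined(__gfx12__)
#define EXAMPLE_HAS_I8_TEST 1

// Hypothetical stand-in for the I8 code path that is unsupported on
// gfx11 and gfx12: a plain int8 dot product with an int accumulator.
inline int example_i8_dot(const signed char* a, const signed char* b, int n)
{
    int acc = 0;
    for(int i = 0; i < n; ++i)
    {
        acc += static_cast<int>(a[i]) * static_cast<int>(b[i]);
    }
    return acc;
}
#else
#define EXAMPLE_HAS_I8_TEST 0
#endif

// Runs the I8 check where it is available and reports success otherwise,
// so the build stays green on the excluded architectures.
inline bool example_run_i8_test()
{
#if EXAMPLE_HAS_I8_TEST
    const signed char a[4] = {1, 2, 3, 4};
    const signed char b[4] = {5, 6, 7, 8};
    return example_i8_dot(a, b, 4) == 70; // 5 + 12 + 21 + 32
#else
    return true; // test compiled out on gfx11 / gfx12
#endif
}
```

The guard removes the unsupported code from compilation entirely, which is what avoids the compile error. It is a stopgap: the skipped test gives no coverage on the excluded targets.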

[ROCm/composable_kernel commit: 1bc7529977]
2025-11-24 10:10:09 -08:00
Christopher Millette
a049cdebba First look at mfma / wmma unification (#2704)
* First look at mfma / wmma unification

* Refactor

* Re-org file structure

* Restructure transform selection and WaveWiseMma class

* Update license files. Add missing gfx1151 support. Change wave size for HOST to 1. Update datatypes naming consistency

* Fixes default MmaSelector implementation

* Adds unit tests for amdgcn_mma and arch

* Consolidate common arch id checks to constexpr functions. Strongly type ids as amdgcn_target_arch_id object.

* Refactor is_any_value_of

* Fixes mma_selector logic

* Fix typo

* Add mma selector test for tile decomposition

* Fix compilation of mma.hpp

* Revert back to c++17 compatibility

* Fix compiler error by returning index_t from get_warp_size()

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Fixes compiler error for missing is_wave32() function

* Fixes compiler error for host wave_size() should be 64

* Fixes compiler errors where __cpp_concepts is not defined

* Fixes compiler errors where __cpp_concepts is not defined

* Fix test failure for host is wave64 by default

---------

Co-authored-by: Chris Millette <you@example.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

[ROCm/composable_kernel commit: b9c6cb1452]
2025-11-24 09:39:59 -08:00
assistant-librarian[bot]
c420d0386d Merge commit '8111572785d3de98457940f2b5ca6fe9cf7603af' into develop 2025-11-24 16:13:04 +00:00
Khushbu Agarwal
84b12586c6 [CK_Tile] Support for preshuffle weight(B) quant tensor for block scale gemm (#3165)
* formatted

* formatted

* formatting

* formatting

* formatting

* [CK TILE GEMM] Refactor block_scale_gemm examples

- Split cpp file to reduce building time
- Support multiple GemmConfig

* [CK TILE GEMM] Refactor block_scale_gemm examples

- Update Readme

* enable prefill shapes

* [CK TILE GEMM] Refactor block_scale_gemm examples

- Add support for rowcol and tensor GEMM operations

* [CK TILE GEMM] Refactor block_scale_gemm examples

- Update README

* adding preshuffle quant as new parameter and its associated new files

* remove debugging statements

* adding test

* enable preshuffle quant with permuteN

* updating readme and corresponding gemmconfigs

* updating cmake file

* fixing CI failures for grouped quant gemm

* addressing review comments

* fixing CI issue

* addressing review comments

* formatting

* formatting

* fixing aquant operator overloading

* formatting

---------

Co-authored-by: Cong Ma <congma13@amd.com>
Co-authored-by: Thomas Ning <Thomas.Ning@amd.com>

[ROCm/composable_kernel commit: 8111572785]
2025-11-24 07:48:42 -08:00
assistant-librarian[bot]
6f3484eaa8 Merge commit 'e857e26bf64ab54dc6dcef0d89203982873a5fa8' into develop 2025-11-24 15:13:49 +00:00
Illia Silin
585a6a2048 disable CI on gfx1010 by default (#3280)
[ROCm/composable_kernel commit: e857e26bf6]
2025-11-24 07:06:41 -08:00
assistant-librarian[bot]
1a4543c060 Merge commit '81042ea5747d3e1e4a71c3f327556f3fb0655d99' into develop 2025-11-24 14:13:16 +00:00
Qianfeng
8ec85b7617 Fix a bug for qr_ks_vs_async_trload pipeline (#3271)
[ROCm/composable_kernel commit: 81042ea574]
2025-11-24 21:31:48 +08:00
assistant-librarian[bot]
f2425d427c Merge commit '5948dbffe4d0bbe4d1802a047bd8599ba662386e' into develop 2025-11-24 09:15:05 +00:00
rocking
ca1a0da0c3 Support fp8 dynamic quantization for fmha (#3206)
* Support qscale for dynamic quant, remove static quant

* Support hdim=256

* Remove bias test case for fp8

---------

Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>
Co-authored-by: asleepzzz <hanwen.chang@amd.com>

[ROCm/composable_kernel commit: 5948dbffe4]
2025-11-24 16:28:25 +08:00
assistant-librarian[bot]
7bd01a9f5f Merge commit '096f0a3b23a49ffaef1e2dbed74bf366e36ad15c' into develop 2025-11-24 07:13:25 +00:00
Johannes Graner
dd7a2d199f [CK Tile] Fix example for conv fwd + bias + clamp (#3235)
* Fix clamp not being applied correctly

* Apply group offsets to D tensors

---------

Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>

[ROCm/composable_kernel commit: 096f0a3b23]
2025-11-24 07:36:26 +01:00
assistant-librarian[bot]
8abfd83364 Merge commit 'f6c999bddb9e0ae468c7b45bc68cc1410472dcf5' into develop 2025-11-23 00:40:28 +00:00
Aviral Goel
1bec1dd091 chore(copyright): update copyright header for test directory (#3265)
[ROCm/composable_kernel commit: f6c999bddb]
2025-11-22 19:38:27 -05:00
assistant-librarian[bot]
d7685c394a Merge commit '02ab76c2cb47143b82743bcf9d86389c540a608b' into develop 2025-11-22 04:13:58 +00:00
Emily Martins
ede105dd91 Fix CK Tile DP + 2 Tile Stream-K Validation Errors (#3269)
When multiple workgroups contribute to a tile using atomics, round-off
error can occur when the accumulator type differs from the C type. To
compute an error tolerance for test validation, the Stream-K Tile
Partitioner has a function called estimate_num_wgs_per_tile that
estimates the number of workgroups per tile. This function only provides
an estimate, however. In some DP+2TSK cases it returns 1 rather than the
more accurate value of 2.

Thus, this change updates the estimate_num_wgs_per_tile function to
explicitly return 2 for DP+2TSK, giving a better error tolerance and
avoiding test failures due to round-off error.
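
The idea can be illustrated with the sketch below. It is not the partitioner's real code: the enum, both function signatures, and the assumption that the tolerance scales linearly with the number of contributing workgroups are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical reduction modes; DP+2TSK is data-parallel plus two-tile Stream-K.
enum class ExampleMode
{
    DataParallel,
    DpPlusTwoTileStreamK
};

// Hypothetical version of the estimate: for DP+2TSK return 2 explicitly
// instead of trusting a rough estimate that may come out as 1.
inline int example_estimate_num_wgs_per_tile(ExampleMode mode, int rough_estimate)
{
    if(mode == ExampleMode::DpPlusTwoTileStreamK)
    {
        return 2;
    }
    return std::max(1, rough_estimate);
}

// Assumption: each workgroup's partial result is rounded to the C type
// before the atomic add, so the tolerance grows with the number of
// contributions.
inline double example_rel_tolerance(double c_type_epsilon, int num_wgs_per_tile)
{
    return c_type_epsilon * static_cast<double>(std::max(1, num_wgs_per_tile));
}
```

With an estimate of 1, the tolerance under this model would be half of what two contributing workgroups need, which is consistent with the validation failures the commit describes.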

[ROCm/composable_kernel commit: 02ab76c2cb]
2025-11-21 20:29:47 -07:00
assistant-librarian[bot]
343e40d0e9 Merge commit '21ae743acd49c79913b3835236c5315983fa83ef' into develop 2025-11-21 16:13:44 +00:00
Illia Silin
6d7d99f91b Enable daily builds on gfx1010 (#3258)
* add build/test on gfx1010

* only build and run on gfx1010 once daily

[ROCm/composable_kernel commit: 21ae743acd]
2025-11-21 07:22:01 -08:00
assistant-librarian[bot]
323c839a2b Merge commit 'ea6e4fcbbc0bd76a562f246f743f5554edc312e4' into develop 2025-11-21 15:12:19 +00:00
John Shumway
34c3e1f562 Fix builder errors. (#3260)
There were four errors to fix:
1. The checks for defaulted direction were not implemented in the predicate concept.
2. Had to delete an obsolete and undefined operation enum.
3. A factory was passing a boolean in place of an integer.
4. Some of the factory tests are not compiling correctly when linking in the full source (with CK_EXPERIMENTAL_BUILDER=ON), so I commented them out.

[ROCm/composable_kernel commit: ea6e4fcbbc]
2025-11-21 15:25:45 +01:00
assistant-librarian[bot]
1829bc6596 Merge commit 'f38c3de9f9047e72429c796fd0445f36eceb142b' into develop 2025-11-21 03:31:42 +00:00
John Shumway
345dbb25f8 Fix copyright messages in experimental/builder. (#3253)
Our copyright messages were mostly correct, but we inconsistently used (C) instead of (c) like the rest of the CK code. This PR fixes that (using lowercase c) and adds a missing copyright header to one file.

[ROCm/composable_kernel commit: f38c3de9f9]
2025-11-20 17:40:55 -08:00
assistant-librarian[bot]
967480c146 Merge commit 'c8563f2101d864ed0cc1f68f02763ee4ec6aa59d' into develop 2025-11-21 01:40:40 +00:00
Aviral Goel
89e3931da8 chore(copyright): update copyright header for test directory (#3252)
* chore(copyright): update copyright header for test directory

* chore(copyright): update copyright header for test directory

* chore(copyright): update copyright header for client_example directory

* chore(copyright): update copyright header for test directory

[ROCm/composable_kernel commit: c8563f2101]
2025-11-20 20:36:57 -05:00