Commit Graph

3879 Commits

Author SHA1 Message Date
Thomas Ning
6f751b7a9b Fix and improve the gemm quant pipeline infrastructure (#3245)
[ROCm/composable_kernel commit: a38aeceb21]
2025-11-26 18:04:27 -08:00
assistant-librarian[bot]
e7c7922385 Merge commit '79aae7c7f71404bdb80d6db52bc6401e0e221d42' into develop 2025-11-27 00:36:02 +00:00
Max Podkorytov
a7a9ccdeca [CK Tile] enable building examples by default (#3259)
* remove EXCLUDE_FROM_ALL from ck-tile examples
-> +15 min build time w/ 64 threads for a single arch

* fix cpp17 compile error in the ck-tile examples

---------

Co-authored-by: khuagarw <khuagarw@amd.com>
Co-authored-by: Ding, Yi <yi.ding@amd.com>

[ROCm/composable_kernel commit: 79aae7c7f7]
2025-11-26 16:24:44 -08:00
andrew clark
d790a9f9de Automated Perfetto UI Notifications (#3255)
* Testing visualization generation

* Update Jenkinsfile

* Adding dummy test data

* Update Jenkinsfile

* Adding notifications

* Testing

* Update Jenkinsfile

* Image compression

* Update Jenkinsfile

* Moving capture logic to main Jenkins file

* Testing generation

* Update Jenkinsfile

* Fixing curl request

* Update Jenkinsfile

* Clean up

* Fix

* Fixing notification

* Testing message creation

* Adjusting message payload

* Testing notification generation

* Updating main jenkinsfile

* Fixing cleanup call

* Removing test pipeline code

* Comment clean up

* Testing pipeline

* Update Jenkinsfile

* Moving archive

Moving trace archive to safe location before source checkout

* Removing test pipeline

* Testing pipeline with unique file names

* Update Jenkinsfile

* Removing test files

Updated main pipeline

[ROCm/composable_kernel commit: 40d7217ac7]
2025-11-26 16:27:27 -07:00
assistant-librarian[bot]
2044d0dd35 Merge commit 'de6466481f9472350a5f4afce27c86ecdbb5b42f' into develop 2025-11-26 18:14:59 +00:00
Aviral Goel
216c23b945 chore(copyright): update copyright header for include directory (#3293)
[ROCm/composable_kernel commit: de6466481f]
2025-11-26 11:00:05 -07:00
John Shumway
90e0eb4dfc Fix template parameter macros (#3305)
Some of the device implementation templates have macros like GridwiseGemmMultiABDTemplateParameters that can cause build errors if multiple files are included together. This error comes up with our builder code.

To clean up the macros and make them safer, we follow these rules:
* Use more specific names to avoid duplication.
* Undefine the macro after it is used to avoid leaking out of the file scope.
* Use a prefix CK_ on the macro to avoid conflicting with other libraries.
* Use all caps with underscores for preprocessor macro names.
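
The four rules above can be illustrated with a minimal sketch. The template and macro names here (ExampleGridwiseGemm, ExampleDeviceGemm, CK_EXAMPLE_DEVICE_GEMM_GRIDWISE_TEMPLATE_PARAMETERS) are hypothetical and are not taken from the repository; only the pattern is the point.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a gridwise template with a long parameter list.
template <typename ADataType, typename BDataType, int MPerBlock, int NPerBlock>
struct ExampleGridwiseGemm
{
    static constexpr int tile_elements   = MPerBlock * NPerBlock;
    static constexpr std::size_t a_bytes = sizeof(ADataType);
};

template <typename ADataType, typename BDataType, int MPerBlock, int NPerBlock>
struct ExampleDeviceGemm
{
// Specific name, CK_ prefix, all caps with underscores (hypothetical macro).
#define CK_EXAMPLE_DEVICE_GEMM_GRIDWISE_TEMPLATE_PARAMETERS \
    ADataType, BDataType, MPerBlock, NPerBlock

    using GridwiseGemm =
        ExampleGridwiseGemm<CK_EXAMPLE_DEVICE_GEMM_GRIDWISE_TEMPLATE_PARAMETERS>;

// Undefine right after use so the macro cannot leak into other headers.
#undef CK_EXAMPLE_DEVICE_GEMM_GRIDWISE_TEMPLATE_PARAMETERS
};

// Compile-time check that the macro did not escape the struct above.
#ifdef CK_EXAMPLE_DEVICE_GEMM_GRIDWISE_TEMPLATE_PARAMETERS
#error "macro leaked out of its intended scope"
#endif

// Helper that forces instantiation through the macro-built alias.
constexpr int example_tile_elements()
{
    return ExampleDeviceGemm<float, float, 128, 64>::GridwiseGemm::tile_elements;
}

static_assert(example_tile_elements() == 128 * 64, "alias must forward all parameters");
```

Because the macro is undefined immediately after the alias, a second header can define its own parameter-list macro without a redefinition error, even when both headers are included in the same translation unit.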

[ROCm/composable_kernel commit: 10a782d846]
2025-11-26 09:48:17 -08:00
assistant-librarian[bot]
283383c61c Merge commit '35a4b26af0088ca0d634b57055a4143fdb9f2e2d' into develop 2025-11-26 07:13:26 +00:00
Aviral Goel
612f91226f fix: add dynamic selection of pipelines for aquant mode (#3282)
- Add conditional selection to use v3 pipeline when PreshuffleQuant is true
- Add static assertion in memory pipeline to prevent PreshuffleQuant usage
- Restore BaseBQuantGemmPipelineAgBgCrCompV3 for BQuant cases
- Update BaseGemmPipeline selection to handle all quant modes properly
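
As a rough sketch of the selection described above, assuming it is done at compile time: the problem and pipeline types below (ExampleProblemPreshuffle, ExampleMemPipeline, ExampleCompV3Pipeline) are hypothetical placeholders and not the actual CK Tile classes.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical problem descriptors; only the PreshuffleQuant flag matters here.
struct ExampleProblemPreshuffle
{
    static constexpr bool PreshuffleQuant = true;
};
struct ExampleProblemPlain
{
    static constexpr bool PreshuffleQuant = false;
};

// Hypothetical memory pipeline: rejects PreshuffleQuant at compile time.
template <typename Problem>
struct ExampleMemPipeline
{
    static_assert(!Problem::PreshuffleQuant,
                  "memory pipeline does not support PreshuffleQuant");
    static constexpr int id = 1;
};

// Hypothetical v3 compute pipeline that accepts PreshuffleQuant.
template <typename Problem>
struct ExampleCompV3Pipeline
{
    static constexpr int id = 3;
};

// Conditional selection: v3 when PreshuffleQuant is set, memory pipeline otherwise.
// Naming the unselected specialization does not instantiate it, so the
// static_assert above only fires if the memory pipeline is actually used.
template <typename Problem>
using SelectedPipeline = std::conditional_t<Problem::PreshuffleQuant,
                                            ExampleCompV3Pipeline<Problem>,
                                            ExampleMemPipeline<Problem>>;

static_assert(SelectedPipeline<ExampleProblemPreshuffle>::id == 3, "expects v3");
static_assert(SelectedPipeline<ExampleProblemPlain>::id == 1, "expects memory pipeline");
```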

[ROCm/composable_kernel commit: 35a4b26af0]
2025-11-26 10:58:09 +04:00
assistant-librarian[bot]
a86762f0f9 Merge commit '8fa90025d0da22683dabe721d77a75a536388683' into develop 2025-11-26 03:34:44 +00:00
Yi DING
16dd90a523 [CK_TILE] Refine warp_gemm_attribute_mfma (#3272)
[ROCm/composable_kernel commit: 8fa90025d0]
2025-11-26 10:57:15 +08:00
assistant-librarian[bot]
9eb4b35ef6 Merge commit 'c7dce2ac29136939b6fe6aabadd026e53dcf35c9' into develop 2025-11-26 02:44:11 +00:00
Yi DING
c0adc147a3 [CK_TILE] Fix Compilation of Flatmm Examples (#3285)
[ROCm/composable_kernel commit: c7dce2ac29]
2025-11-26 10:11:43 +08:00
Illia Silin
b80f571425 Enable ck_builder in CI. (#3296)
* build and run ck_builder tests

* add test_ckb_all to targets

* fix syntax

* fix test path

* Update CMake targets for builder testing in CI (#3290)

Our existing CMake only had build targets. Update CMakeLists.txt to have CTEST targets:
* smoke-builder
* regression-builder
* check-builder

Co-authored-by: John Shumway <jshumway@amd.com>

* use check-builder target

* get rid of test_ckb_all target

* call ninja check-builder separately

---------

Co-authored-by: John Shumway <jshumway@amd.com>

[ROCm/composable_kernel commit: a54f7b1138]
2025-11-25 17:45:59 -08:00
assistant-librarian[bot]
6d42c1d821 Merge commit 'cd4729386927c3d20b70fc9465614e9158524598' into develop 2025-11-25 23:12:05 +00:00
Aviral Goel
f13e2e69cc chore(copyright): update copyright header for experimental & example directory (#3292)
[ROCm/composable_kernel commit: cd47293869]
2025-11-26 03:09:39 +04:00
Bartłomiej Kocot
2c2672ff0e [CK TILE] Grouped Conv Explicit Gemm (#3289)
* [CK TILE] Grouped Conv Explicit Gemm

* fixes

* apply builder fixes

[ROCm/composable_kernel commit: 00dfa2f2ce]
2025-11-25 23:28:35 +01:00
assistant-librarian[bot]
8965f337ca Merge commit '37ea1600888f515e5dfb7153b75b2f06474d880d' into develop 2025-11-25 21:12:15 +00:00
Khushbu Agarwal
192bb72244 [CK-Tile] fix block scale example for gfx1201 (#3283)
[ROCm/composable_kernel commit: 37ea160088]
2025-11-25 13:10:28 -08:00
assistant-librarian[bot]
a5b724bc4d Merge commit '9ac2666d5b48efc3743ce073aab0a68833accf5c' into develop 2025-11-25 14:13:08 +00:00
Bartłomiej Kocot
95ec5ccec0 [CK_BUILDER] Add grouped conv bwd ck tile traits (#3281)
* [CK_BUILDER] Add grouped conv bwd ck tile traits

* copilot fixes

[ROCm/composable_kernel commit: 9ac2666d5b]
2025-11-25 14:57:43 +01:00
assistant-librarian[bot]
5375584eac Merge commit 'ab0101c59c6be6ad376ba668a51f0e38dca66aa2' into develop 2025-11-25 02:43:48 +00:00
Aviral Goel
52168e0481 chore(copyright): update copyright header for library directory (#3274)
* chore(copyright): update copyright header for library directory

---------

Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>

[ROCm/composable_kernel commit: ab0101c59c]
2025-11-24 18:10:26 -08:00
Aviral Goel
a535de0f75 chore(copyright): update copyright header for example directory (#3273)
* chore(copyright): update copyright header for codegen directory

* chore(copyright): update copyright header for example directory

[ROCm/composable_kernel commit: d85f065b15]
2025-11-24 18:02:41 -08:00
assistant-librarian[bot]
de08f43ef6 Merge commit '229d43ea0c8b9c94092ce001e411f82c3766b6fb' into develop 2025-11-25 01:51:07 +00:00
rocking
f20f9dd453 Fix batch prefill compile fail in aiter (#3279)
* Fix batch prefill aiter compile fail

* Fix compile error

[ROCm/composable_kernel commit: 229d43ea0c]
2025-11-25 09:46:32 +08:00
assistant-librarian[bot]
4aaa8c92bb Merge commit 'de6a9590abe907283e189abba1b487f8e5562d1b' into develop 2025-11-24 21:29:18 +00:00
Thomas Ning
a18901385b Reorganize of KPack in GEMM (#3247)
* add the reorganize of KPack

* fix the compilation error

[ROCm/composable_kernel commit: de6a9590ab]
2025-11-24 12:38:59 -08:00
assistant-librarian[bot]
5297edb40c Merge commit 'e95337c58c00d12b5c947006836f9fb46964b35c' into develop 2025-11-24 18:22:07 +00:00
Aviral Goel
ed24d3a8fa chore(copyright): update copyright header for codegen directory (#3266)
[ROCm/composable_kernel commit: e95337c58c]
2025-11-24 10:12:40 -08:00
John Shumway
39d9acab2e Guard a builder test to avoid gfx11 and gfx12 (#3268)
We're getting a compile error on gfx11 and gfx12 for an I8 test that doesn't have a supported WMMA implementation. We'll need to build architecture support into the builder, but to get things green I'm just adding an ifndef guard around the test.
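
A guard of this kind might look like the sketch below. The macro names EXAMPLE_BUILD_GFX11 and EXAMPLE_BUILD_GFX12 and the test names are assumptions made for illustration; the commit does not say which macro the real guard checks.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical macros that a build system would define per target, e.g.
//   -DEXAMPLE_BUILD_GFX11 or -DEXAMPLE_BUILD_GFX12
// Neither is defined here, so the guarded test stays enabled.

// Returns the names of the tests compiled into this build.
std::vector<std::string> enabled_tests()
{
    std::vector<std::string> tests{"fp16_conv_fwd"};
#if !defined(EXAMPLE_BUILD_GFX11) && !defined(EXAMPLE_BUILD_GFX12)
    // Only compiled when neither unsupported target is selected.
    tests.push_back("i8_conv_fwd");
#endif
    return tests;
}
```

The guarded code is removed by the preprocessor, so an unsupported instantiation is never seen by the compiler on those targets. The cost is that the test silently disappears there, which is why the commit message treats this as a stopgap.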

[ROCm/composable_kernel commit: 1bc7529977]
2025-11-24 10:10:09 -08:00
Christopher Millette
10eb15416c First look at mfma / wmma unification (#2704)
* First look at mfma / wmma unification

* Refactor

* Re-org file structure

* Restructure transform selection and WaveWiseMma class

* Update license files. Add missing gfx1151 support. Change wave size for HOST to 1. Update datatypes naming consistency

* Fixes default MmaSelector implementation

* Adds unit tests for amdgcn_mma and arch

* Consolidate common arch id checks to constexpr functions. Strongly type ids as amdgcn_target_arch_id object.

* Refactor is_any_value_of

* Fixes mma_selector logic

* Fix typo

* Add mma selector test for tile decomposition

* Fix compilation of mma.hpp

* Revert back to c++17 compatibility

* Fix compiler error by returning index_t from get_warp_size()

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Fixes compiler error for missing is_wave32() function

* Fixes compiler error for host wave_size() should be 64

* Fixes compiler errors where __cpp_concepts is not defined

* Fix test failure for host is wave64 by default
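
Several of the bullets above (constexpr arch id checks, is_any_value_of, is_wave32, get_warp_size returning index_t, host defaulting to wave64) can be pictured with a small C++17 sketch. The enum ExampleArchId, its members, and the function bodies are assumptions for illustration; the real amdgcn_target_arch_id type and helper signatures may differ.

```cpp
#include <cassert>
#include <cstdint>

using index_t = std::int32_t; // assumption: a 32-bit signed index type

// Hypothetical, strongly typed subset of target architecture ids.
enum class ExampleArchId
{
    Host,
    Gfx90a,
    Gfx942,
    Gfx1100,
    Gfx1201
};

// True if value equals any of the candidates (C++17 fold expression).
template <typename T, typename... Ts>
constexpr bool is_any_value_of(T value, Ts... candidates)
{
    return ((value == candidates) || ...);
}

// Assumption for this sketch: the gfx11/gfx12 ids listed here run in wave32 mode.
constexpr bool is_wave32(ExampleArchId id)
{
    return is_any_value_of(id, ExampleArchId::Gfx1100, ExampleArchId::Gfx1201);
}

// Returns index_t, not bool or unsigned, so callers can mix it with other
// index arithmetic. Host falls through to 64, matching the bullets above.
constexpr index_t get_warp_size(ExampleArchId id)
{
    return is_wave32(id) ? 32 : 64;
}

static_assert(get_warp_size(ExampleArchId::Host) == 64, "host is wave64 by default");
static_assert(get_warp_size(ExampleArchId::Gfx1100) == 32, "wave32 target");
```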

---------

Co-authored-by: Chris Millette <you@example.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

[ROCm/composable_kernel commit: b9c6cb1452]
2025-11-24 09:39:59 -08:00
assistant-librarian[bot]
c420d0386d Merge commit '8111572785d3de98457940f2b5ca6fe9cf7603af' into develop 2025-11-24 16:13:04 +00:00
Khushbu Agarwal
7d6cd1f3c4 [CK_Tile] Support for preshuffle weight(B) quant tensor for block scale gemm (#3165)
* formatted

* formatting

* [CK TILE GEMM] Refactor block_scale_gemm examples

- Split cpp file to reduce building time
- Support multiple GemmConfig

* [CK TILE GEMM] Refactor block_scale_gemm examples

- Update Readme

* enable prefill shapes

* [CK TILE GEMM] Refactor block_scale_gemm examples

- Add support for rowcol and tensor GEMM operations

* [CK TILE GEMM] Refactor block_scale_gemm examples

- Update README

* adding preshuffle quant as new parameter and its associated new files

* remove debugging statements

* adding test

* enable preshuffle quant with permuteN

* updating readme and corresponding gemmconfigs

* updating cmake file

* fixing CI failures for grouped quant gemm

* addressing review comments

* fixing CI issue

* addressing review comments

* formatting

* fixing aquant operator overloading

* formatting

---------

Co-authored-by: Cong Ma <congma13@amd.com>
Co-authored-by: Thomas Ning <Thomas.Ning@amd.com>

[ROCm/composable_kernel commit: 8111572785]
2025-11-24 07:48:42 -08:00
assistant-librarian[bot]
6f3484eaa8 Merge commit 'e857e26bf64ab54dc6dcef0d89203982873a5fa8' into develop 2025-11-24 15:13:49 +00:00
Illia Silin
a1651a1b10 disable CI on gfx1010 by default (#3280)
[ROCm/composable_kernel commit: e857e26bf6]
2025-11-24 07:06:41 -08:00
assistant-librarian[bot]
1a4543c060 Merge commit '81042ea5747d3e1e4a71c3f327556f3fb0655d99' into develop 2025-11-24 14:13:16 +00:00
Qianfeng
3b341e4a16 Fix a bug for qr_ks_vs_async_trload pipeline (#3271)
[ROCm/composable_kernel commit: 81042ea574]
2025-11-24 21:31:48 +08:00
assistant-librarian[bot]
f2425d427c Merge commit '5948dbffe4d0bbe4d1802a047bd8599ba662386e' into develop 2025-11-24 09:15:05 +00:00
rocking
cdd72e57d3 Support fp8 dynamic quantization for fmha (#3206)
* Support qscale for dynamic quant, remove static quant

* Support hdim=256

* Remove bias test case for fp8

---------

Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>
Co-authored-by: asleepzzz <hanwen.chang@amd.com>

[ROCm/composable_kernel commit: 5948dbffe4]
2025-11-24 16:28:25 +08:00
assistant-librarian[bot]
7bd01a9f5f Merge commit '096f0a3b23a49ffaef1e2dbed74bf366e36ad15c' into develop 2025-11-24 07:13:25 +00:00
Johannes Graner
679699f32a [CK Tile] Fix example for conv fwd + bias + clamp (#3235)
* Fix clamp not being applied correctly

* Apply group offsets to D tensors

---------

Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>

[ROCm/composable_kernel commit: 096f0a3b23]
2025-11-24 07:36:26 +01:00
assistant-librarian[bot]
8abfd83364 Merge commit 'f6c999bddb9e0ae468c7b45bc68cc1410472dcf5' into develop 2025-11-23 00:40:28 +00:00
Aviral Goel
d171245c4b chore(copyright): update copyright header for test directory (#3265)
[ROCm/composable_kernel commit: f6c999bddb]
2025-11-22 19:38:27 -05:00
assistant-librarian[bot]
d7685c394a Merge commit '02ab76c2cb47143b82743bcf9d86389c540a608b' into develop 2025-11-22 04:13:58 +00:00
Emily Martins
0d6a0a3c2f Fix CK Tile DP + 2 Tile Stream-K Validation Errors (#3269)
When multiple workgroups contribute to a tile through atomics, round-off
error can occur if the accumulator type differs from the C type. To
compute an error tolerance for test validation, the Stream-K Tile
Partitioner provides estimate_num_wgs_per_tile, which estimates the
number of workgroups per tile. This function only provides an estimate,
however: in some DP+2TSK cases it returns 1 rather than the more
accurate value of 2.

Thus, this change updates estimate_num_wgs_per_tile to explicitly return
2 for DP+2TSK, which yields a better error tolerance and avoids test
failures due to round-off error.
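
To show why the estimate matters, here is a simplified sketch. Everything in it is hypothetical: the ExampleStrategy enum, the helper names, and in particular the linear tolerance model (tolerance proportional to the number of contributing workgroups) are assumptions, not the partitioner's actual code.

```cpp
#include <cassert>

// Hypothetical scheduling strategies; the last one stands for DP+2TSK.
enum class ExampleStrategy
{
    DataParallelOnly,
    StreamKOnly,
    DataParallelPlus2TileStreamK
};

// Sketch of the fix: for DP+2TSK, return 2 explicitly and ignore the raw
// estimate, which may under-report as 1.
constexpr unsigned num_wgs_per_tile_for_tolerance(ExampleStrategy strategy,
                                                  unsigned raw_estimate)
{
    if(strategy == ExampleStrategy::DataParallelPlus2TileStreamK)
    {
        return 2u;
    }
    return raw_estimate == 0u ? 1u : raw_estimate;
}

// Assumed model: each contributing workgroup adds one rounding step when its
// partial sum is converted to the C type, so the tolerance scales linearly.
constexpr double rounding_tolerance(double base_tolerance, unsigned num_wgs)
{
    return base_tolerance * static_cast<double>(num_wgs);
}

static_assert(num_wgs_per_tile_for_tolerance(
                  ExampleStrategy::DataParallelPlus2TileStreamK, 1u) == 2u,
              "DP+2TSK must use two workgroups per tile");
```

Under this model, a raw estimate of 1 would give DP+2TSK the same tolerance as a single-writer tile, which is too tight when two workgroups accumulate into it.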

[ROCm/composable_kernel commit: 02ab76c2cb]
2025-11-21 20:29:47 -07:00
assistant-librarian[bot]
343e40d0e9 Merge commit '21ae743acd49c79913b3835236c5315983fa83ef' into develop 2025-11-21 16:13:44 +00:00
Illia Silin
d43b58b3cc Enable daily builds on gfx1010 (#3258)
* add build/test on gfx1010

* only build and run on gfx1010 once daily

[ROCm/composable_kernel commit: 21ae743acd]
2025-11-21 07:22:01 -08:00
assistant-librarian[bot]
323c839a2b Merge commit 'ea6e4fcbbc0bd76a562f246f743f5554edc312e4' into develop 2025-11-21 15:12:19 +00:00
John Shumway
071fbaaf28 Fix builder errors. (#3260)
There were four errors to fix:
1. The checks for defaulted direction were not implemented in the predicate concept.
2. Had to delete an obsolete and undefined operation enum.
3. A factory was passing a boolean in place of an integer.
4. Some of the factory tests are not compiling correctly when linking in the full source (with CK_EXPERIMENTAL_BUILDER=ON), so I commented them out.

[ROCm/composable_kernel commit: ea6e4fcbbc]
2025-11-21 15:25:45 +01:00