Commit Graph

4046 Commits

Author SHA1 Message Date
assistant-librarian[bot]
572df7d4d1 Merge commit '9ed9539ddfcdd8de4180fb992b718b57e1cadfae' into develop 2025-12-01 07:15:08 +00:00
Yi DING
43da4ac445 [CK_TILE] Disable cast_tile_pk_fp16bf16_fp32 as It Causes Extra spills on Recent Compilers (#3327)
[ROCm/composable_kernel commit: 9ed9539ddf]
2025-12-01 14:48:22 +08:00
assistant-librarian[bot]
0dff04aa27 Merge commit 'ba6af9fe7c6689075b46052cc40b7f94d96f647f' into develop 2025-12-01 06:17:27 +00:00
Gino Lu
4fb6b9c561 [CK_TILE] Add unit test for fp4 warp gemm (#2817)
This update includes a unit test for warp GEMM

[ROCm/composable_kernel commit: ba6af9fe7c]
2025-12-01 13:56:48 +08:00
assistant-librarian[bot]
4f8c179bfd Merge commit '004784ef98beffb24a03d106b143ee9f8e03e826' into develop 2025-11-28 22:12:10 +00:00
Aviral Goel
bb41ea37e1 chore(copyright) update library wide CMakeLists.txt copyright header template (#3313)
* chore(copyright) update library wide CMakeLists.txt files copyright header template

* Fix build

---------

Co-authored-by: Sami Remes <samremes@amd.com>

[ROCm/composable_kernel commit: 004784ef98]
2025-11-28 13:49:54 -08:00
assistant-librarian[bot]
74d3173d15 Merge commit 'f981554c39eafbf993e05c832cb86b3aaf474571' into develop 2025-11-28 13:21:12 +00:00
Sami Remes
77407b3d26 [CK_TILE] Fix Quant GEMM build (#3320)
* Fix build

* Fix ck_tile example 38 & 40

---------

Co-authored-by: Yi DING <yi.ding@amd.com>

[ROCm/composable_kernel commit: f981554c39]
2025-11-28 20:33:53 +08:00
assistant-librarian[bot]
296bf24afd Merge commit 'f875ab0bbc6ea68a689a688a58f9a53ad12fd536' into develop 2025-11-28 09:13:31 +00:00
msaffari-amd
4f5a48c910 Add validity checks for MoE FlatMM scatter and enable bf16 hardware atomic-add (#3236)
* Add validity checks for MoE FlatMM scatter and enable bf16 hardware atomic

* correct clang-format

* removed unused rtol_atol variable from example code

* clang format correction

* remove unused variable max_accumulated_value from example

[ROCm/composable_kernel commit: f875ab0bbc]
2025-11-28 09:43:01 +01:00
assistant-librarian[bot]
6032baee56 Merge commit '30727c48fcdf2178f013cbb843db563abd77d09c' into develop 2025-11-27 23:12:24 +00:00
Cong Ma
fa1c7bc6ba Tile engine for streamk (#3157)
* [CK TILE STREAMK] Introduce initial support for tile engine in streamk GEMM.

- This commit lays the groundwork for integrating the tile engine into streamk GEMM.
  It focuses on creating benchmark executables for streamk GEMM.
- Additional scripts like test_benchmark.sh and gemm_benchmark.py will be added once
  the streamk implementation reaches stability.

* [CK TILE STREAMK] Enable CI to execute tile engine benchmarks for StreamK GEMM

* [CK TILE STREAMK] Refactor: Extract common utility functions.

* [CK TILE STREAMK] Revise tile engine of streamk to align with the updated implementation

* Add pre-commit

* [CK TILE STREAMK] Add 'dp_persistent' and 'reduction_strategy' in output of CK TILE STREAMK

* [CK TILE STREAMK] Fix a bug about value of 'dp_persistent' of CK TILE STREAMK

* [CK TILE STREAMK] Update Jenkinsfile

* [CK TILE Engine] Update StreamK tile engine help message

Remove default value messages as they are automatically printed

* [CK TILE Engine] Update StreamK tile engine

- Remove namespace reboot

* [CK TILE Engine] Update StreamK tile engine

- Fix merge error

[ROCm/composable_kernel commit: 30727c48fc]
2025-11-27 15:49:57 -07:00
assistant-librarian[bot]
d0b319035a Merge commit '24d88d24729cc097d6d0c87a839827f40e35d86a' into develop 2025-11-27 17:12:03 +00:00
arai713
a3d6a1cb26 [CK_TILE] Move DataTypeTraits into a Common File (#3146)
This renames the typeToStr struct in the common utilities to DataTypeTraits and removes all duplication of DataTypeTraits across files in CK Tile.
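
As a rough illustration of the idea above, a single shared traits template can map each data type to a printable name so that individual files do not need their own copies. This is only a sketch: the `Example` prefix, the chosen types, and the name strings are assumptions for illustration, not the library's actual definitions.

```cpp
#include <cstdint>
#include <string_view>

// Hypothetical stand-in for a common DataTypeTraits-style struct.
// The primary template is left undefined so that unsupported types
// fail at compile time instead of silently printing a wrong name.
template <typename T>
struct ExampleDataTypeTraits;

template <>
struct ExampleDataTypeTraits<float>
{
    static constexpr std::string_view name = "fp32"; // assumed spelling
};

template <>
struct ExampleDataTypeTraits<std::int8_t>
{
    static constexpr std::string_view name = "int8"; // assumed spelling
};

// Every caller goes through the one shared definition.
template <typename T>
constexpr std::string_view example_type_name()
{
    return ExampleDataTypeTraits<T>::name;
}
```

Keeping one definition in a common header means a new data type is added in exactly one place.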

Co-authored-by: Christopher Millette <63608002+cgmillette@users.noreply.github.com>

[ROCm/composable_kernel commit: 24d88d2472]
2025-11-27 09:09:54 -08:00
assistant-librarian[bot]
c27dc5875d Merge commit '678298d4c7141d41a552e7d8fea396ee88a4652f' into develop 2025-11-27 08:15:41 +00:00
Matthias Gehre
6c993365ac Add support for gfx1153 (#3306)
[ROCm/composable_kernel commit: 678298d4c7]
2025-11-27 08:48:00 +01:00
assistant-librarian[bot]
a3422f31e3 Merge commit 'a38aeceb2164f9d1807bda1a19d59636bafd4f31' into develop 2025-11-27 02:44:03 +00:00
Thomas Ning
6f751b7a9b Fix and improve the gemm quant pipeline infrastructure (#3245)
[ROCm/composable_kernel commit: a38aeceb21]
2025-11-26 18:04:27 -08:00
assistant-librarian[bot]
e7c7922385 Merge commit '79aae7c7f71404bdb80d6db52bc6401e0e221d42' into develop 2025-11-27 00:36:02 +00:00
Max Podkorytov
a7a9ccdeca [CK Tile] enable building examples by default (#3259)
* remove EXCLUDE_FROM_ALL from ck-tile examples
-> +15 min build time w/ 64 threads for a single arch

* fix cpp17 compile error in the ck-tile examples

---------

Co-authored-by: khuagarw <khuagarw@amd.com>
Co-authored-by: Ding, Yi <yi.ding@amd.com>

[ROCm/composable_kernel commit: 79aae7c7f7]
2025-11-26 16:24:44 -08:00
andrew clark
d790a9f9de Automated Perfetto UI Notifications (#3255)
* Testing visualization generation

* Update Jenkinsfile

* Update Jenkinsfile

* Update Jenkinsfile

* Update Jenkinsfile

* Update Jenkinsfile

* Update Jenkinsfile

* Update Jenkinsfile

* Update Jenkinsfile

* Update Jenkinsfile

* Update Jenkinsfile

* Update Jenkinsfile

* Update Jenkinsfile

* Adding dummy test data

* Update Jenkinsfile

* Update Jenkinsfile

* Adding notifications

* Testing

* Update Jenkinsfile

* Update Jenkinsfile

* Update Jenkinsfile

* Update Jenkinsfile

* Update Jenkinsfile

* Image compression

* Update Jenkinsfile

* Moving capture logic to main Jenkins file

* Testing generation

* Update Jenkinsfile

* Update Jenkinsfile

* Update Jenkinsfile

* Update Jenkinsfile

* Update Jenkinsfile

* Update Jenkinsfile

* Update Jenkinsfile

* Update Jenkinsfile

* Update Jenkinsfile

* Fixing curl request

* Update Jenkinsfile

* Clean up

* Fix

* Fixing notification

* Testing message creation

* Adjusting message payload

* Testing notification generation

* Updating main jenkinsfile

* Fixing cleanup call

* Removing test pipeline code

* Comment clean up

* Testing pipeline

* Update Jenkinsfile

* Update Jenkinsfile

* Update Jenkinsfile

* Moving archive

Moving trace archive to safe location before source checkout

* Removing test pipeline

* Testing pipeline with unique file names

* Update Jenkinsfile

* Removing test files

Updated main pipeline

[ROCm/composable_kernel commit: 40d7217ac7]
2025-11-26 16:27:27 -07:00
assistant-librarian[bot]
2044d0dd35 Merge commit 'de6466481f9472350a5f4afce27c86ecdbb5b42f' into develop 2025-11-26 18:14:59 +00:00
Aviral Goel
216c23b945 chore(copyright): update copyright header for include directory (#3293)
[ROCm/composable_kernel commit: de6466481f]
2025-11-26 11:00:05 -07:00
John Shumway
90e0eb4dfc Fix template parameter macros (#3305)
Some of the device implementation templates have macros like GridwiseGemmMultiABDTemplateParameters that can cause build errors if multiple files are included together. This error comes up with our builder code.

To clean up the macros and make them safer, we follow these rules:
* Use more specific names to avoid duplication.
* Undefine the macro after it is used to avoid leaking out of the file scope.
* Use a prefix CK_ on the macro to avoid conflicting with other libraries.
* Use all caps with underscores for preprocessor macro names.
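
The four rules above can be sketched as follows. This is a minimal illustration only: `CK_EXAMPLE_GEMM_TEMPLATE_PARAMETERS` and `ExampleGemmTile` are hypothetical names, not the macros or templates actually touched by this change.

```cpp
// Specific, CK_-prefixed, all-caps name (rules 1, 3 and 4).
// The macro expands to a template parameter list.
#define CK_EXAMPLE_GEMM_TEMPLATE_PARAMETERS int MPerBlock, int NPerBlock

template <CK_EXAMPLE_GEMM_TEMPLATE_PARAMETERS>
struct ExampleGemmTile
{
    static constexpr int tile_elements = MPerBlock * NPerBlock;
};

// Undefine right after the last use so the macro cannot leak into
// other headers included in the same translation unit (rule 2).
#undef CK_EXAMPLE_GEMM_TEMPLATE_PARAMETERS

// Small helper showing that the template is usable after the #undef.
constexpr int example_tile_elements()
{
    return ExampleGemmTile<4, 8>::tile_elements;
}
```

Because the macro is undefined before the header ends, a second header can define its own similarly shaped macro without a redefinition conflict.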

[ROCm/composable_kernel commit: 10a782d846]
2025-11-26 09:48:17 -08:00
assistant-librarian[bot]
283383c61c Merge commit '35a4b26af0088ca0d634b57055a4143fdb9f2e2d' into develop 2025-11-26 07:13:26 +00:00
Aviral Goel
612f91226f fix: add dynamic selection of pipelines for aquant mode (#3282)
- Add conditional selection to use v3 pipeline when PreshuffleQuant is true
- Add static assertion in memory pipeline to prevent PreshuffleQuant usage
- Restore BaseBQuantGemmPipelineAgBgCrCompV3 for BQuant cases
- Update BaseGemmPipeline selection to handle all quant modes properly
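
One way the first two points could look in code is compile-time selection with `std::conditional_t`, plus a `static_assert` inside the pipeline that must not be used with preshuffling. All type names below are hypothetical placeholders and the `version` members exist only to make the selection observable; this is a sketch, not the code from this change.

```cpp
#include <type_traits>

// Hypothetical memory pipeline: rejects PreshuffleQuant at compile time.
template <bool PreshuffleQuant>
struct ExampleMemoryPipeline
{
    static_assert(!PreshuffleQuant,
                  "memory pipeline does not support PreshuffleQuant");
    static constexpr int version = 1;
};

// Hypothetical v3 pipeline: used when PreshuffleQuant is true.
template <bool PreshuffleQuant>
struct ExampleCompV3Pipeline
{
    static constexpr int version = 3;
};

// Select the pipeline type from the flag. The branch that is not chosen
// is only named, never instantiated, so the static_assert above does not
// fire when PreshuffleQuant is true.
template <bool PreshuffleQuant>
using ExampleAQuantPipeline =
    std::conditional_t<PreshuffleQuant,
                       ExampleCompV3Pipeline<PreshuffleQuant>,
                       ExampleMemoryPipeline<PreshuffleQuant>>;
```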

[ROCm/composable_kernel commit: 35a4b26af0]
2025-11-26 10:58:09 +04:00
assistant-librarian[bot]
a86762f0f9 Merge commit '8fa90025d0da22683dabe721d77a75a536388683' into develop 2025-11-26 03:34:44 +00:00
Yi DING
16dd90a523 [CK_TILE] Refine warp_gemm_attribute_mfma (#3272)
[ROCm/composable_kernel commit: 8fa90025d0]
2025-11-26 10:57:15 +08:00
assistant-librarian[bot]
9eb4b35ef6 Merge commit 'c7dce2ac29136939b6fe6aabadd026e53dcf35c9' into develop 2025-11-26 02:44:11 +00:00
Yi DING
c0adc147a3 [CK_TILE] Fix Compilation of Flatmm Examples (#3285)
[ROCm/composable_kernel commit: c7dce2ac29]
2025-11-26 10:11:43 +08:00
Illia Silin
b80f571425 Enable ck_builder in CI. (#3296)
* build and run ck_builder tests

* add test_ckb_all to targets

* fix syntax

* fix test path

* Update CMake targets for builder testing in CI (#3290)

Our existing CMake only had build targets. Update CMakeLists.txt to have CTEST targets:
* smoke-builder
* regression-builder
* check-builder

Co-authored-by: John Shumway <jshumway@amd.com>

* use check-builder target

* get rid of test_ckb_all target

* call ninja check-builder separately

---------

Co-authored-by: John Shumway <jshumway@amd.com>

[ROCm/composable_kernel commit: a54f7b1138]
2025-11-25 17:45:59 -08:00
assistant-librarian[bot]
6d42c1d821 Merge commit 'cd4729386927c3d20b70fc9465614e9158524598' into develop 2025-11-25 23:12:05 +00:00
Aviral Goel
f13e2e69cc chore(copyright): update copyright header for experimental & example directory (#3292)
[ROCm/composable_kernel commit: cd47293869]
2025-11-26 03:09:39 +04:00
Bartłomiej Kocot
2c2672ff0e [CK TILE] Grouped Conv Explicit Gemm (#3289)
* [CK TILE] Grouped Conv Explicit Gemm

* fixes

* apply builder fixes

[ROCm/composable_kernel commit: 00dfa2f2ce]
2025-11-25 23:28:35 +01:00
assistant-librarian[bot]
8965f337ca Merge commit '37ea1600888f515e5dfb7153b75b2f06474d880d' into develop 2025-11-25 21:12:15 +00:00
Khushbu Agarwal
192bb72244 [CK-Tile] fix block scale example for gfx1201 (#3283)
[ROCm/composable_kernel commit: 37ea160088]
2025-11-25 13:10:28 -08:00
assistant-librarian[bot]
a5b724bc4d Merge commit '9ac2666d5b48efc3743ce073aab0a68833accf5c' into develop 2025-11-25 14:13:08 +00:00
Bartłomiej Kocot
95ec5ccec0 [CK_BUILDER] Add grouped conv bwd ck tile traits (#3281)
* [CK_BUILDER] Add grouped conv bwd ck tile traits

* copilot fixes

[ROCm/composable_kernel commit: 9ac2666d5b]
2025-11-25 14:57:43 +01:00
assistant-librarian[bot]
5375584eac Merge commit 'ab0101c59c6be6ad376ba668a51f0e38dca66aa2' into develop 2025-11-25 02:43:48 +00:00
Aviral Goel
52168e0481 chore(copyright): update copyright header for library directory (#3274)
* chore(copyright): update copyright header for library directory

* chore(copyright): update copyright header for library directory

---------

Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>

[ROCm/composable_kernel commit: ab0101c59c]
2025-11-24 18:10:26 -08:00
Aviral Goel
a535de0f75 chore(copyright): update copyright header for example directory (#3273)
* chore(copyright): update copyright header for codegen directory

* chore(copyright): update copyright header for example directory

[ROCm/composable_kernel commit: d85f065b15]
2025-11-24 18:02:41 -08:00
assistant-librarian[bot]
de08f43ef6 Merge commit '229d43ea0c8b9c94092ce001e411f82c3766b6fb' into develop 2025-11-25 01:51:07 +00:00
rocking
f20f9dd453 Fix batch prefill compile fail in aiter (#3279)
* Fix batch prefill aiter compile fail

* Fix compile error

[ROCm/composable_kernel commit: 229d43ea0c]
2025-11-25 09:46:32 +08:00
assistant-librarian[bot]
4aaa8c92bb Merge commit 'de6a9590abe907283e189abba1b487f8e5562d1b' into develop 2025-11-24 21:29:18 +00:00
Thomas Ning
a18901385b Reorganize of KPack in GEMM (#3247)
* add the reorganize of KPack

* fix the compilation error

* fix the compilation error

[ROCm/composable_kernel commit: de6a9590ab]
2025-11-24 12:38:59 -08:00
assistant-librarian[bot]
5297edb40c Merge commit 'e95337c58c00d12b5c947006836f9fb46964b35c' into develop 2025-11-24 18:22:07 +00:00
Aviral Goel
ed24d3a8fa chore(copyright): update copyright header for codegen directory (#3266)
[ROCm/composable_kernel commit: e95337c58c]
2025-11-24 10:12:40 -08:00
John Shumway
39d9acab2e Guard a builder test to avoid gfx11 and gfx12 (#3268)
We're getting a compile error on gfx11 and gfx12 for an I8 test that doesn't have a supported WMMA implementation. We'll need to build architecture support into the builder, but to get things green I'm just adding an ifndef guard around the test.

[ROCm/composable_kernel commit: 1bc7529977]
2025-11-24 10:10:09 -08:00
Christopher Millette
10eb15416c First look at mfma / wmma unification (#2704)
* First look at mfma / wmma unification

* Refactor

* Re-org file structure

* Restructure transform selection and WaveWiseMma class

* Update license files. Add missing gfx1151 support. Change wave size for HOST to 1. Update datatypes naming consistency

* Fixes default MmaSelector implementation

* Adds unit tests for amdgcn_mma and arch

* Consolidate common arch id checks to constexpr functions. Strongly type ids as amdgcn_target_arch_id object.

* Refactor is_any_value_of

* Fixes mma_selector logic

* Fix typo

* Add mma selector test for tile decomposition

* Fix compilation of mma.hpp

* Revert back to c++17 compatibility

* Fix compiler error by returning index_t from get_warp_size()

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Fixes compiler error for missing is_wave32() function

* Fixes compiler error for host wave_size() should be 64

* Fixes compiler errors where __cpp_concepts is not defined

* Fixes compiler errors where __cpp_concepts is not defined

* Fix test failure for host is wave64 by default

---------

Co-authored-by: Chris Millette <you@example.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

[ROCm/composable_kernel commit: b9c6cb1452]
2025-11-24 09:39:59 -08:00
assistant-librarian[bot]
c420d0386d Merge commit '8111572785d3de98457940f2b5ca6fe9cf7603af' into develop 2025-11-24 16:13:04 +00:00