Commit Graph

54 Commits

Author SHA1 Message Date
DarylHawkinsAMD
1e6bbed1fb [CK_BUILDER] CK Tile header installation for builder, algorithm concept improvements (#3419)
* Added install of CK_Tile headers when using CK_EXPERIMENTAL_BUILDER. MIOpen needs this since the builder uses features from CK Tile and the CK Tile install is excluded when doing a narrow build for MIOpen
* Changed algorithm concept type checks to be concepts instead of constexpr bool functions. This improves compiler error messages when using these concepts in static_asserts

---------

Co-authored-by: Daryl Hawkins <DarylHawkins@amd.com>
2025-12-15 16:24:36 -07:00
John Shumway
2544e394cf Add missing enums to data_type_sizeof (#3430)
Fixes broken build on gfx942. This was some test code that got merged at the same time.
2025-12-15 11:49:36 -08:00
John Shumway
9ac51aa0f4 Add describe() method to device ops for runtime introspection (#3375)
Introduces a polymorphic describe() method to BaseOperator that enables runtime introspection of kernel configurations through a unified interface.

Key changes:

* Add virtual describe() method to BaseOperator returning Description objects
* Implement describe() in 6 device operation classes (conv fwd/bwd variants)
* Create conv_describe.hpp with factory function for ConvDescription
* Extract type definitions to conv_types.hpp to resolve circular dependencies
* Add InstanceStringDescription for kernels without full ConvDescription support

Other Improvements:

* Update tests to use describe() instead of GetInstanceString()
* Remove circular dependency include from conv_traits.hpp
* Add ODD_C to ConvFwdSpecialization enum and fix OddC mapping
* Replace silent fallback in conv_layout() with compile-time error

This provides a foundation for runtime kernel introspection and better tooling support for analyzing and debugging kernel configurations.
2025-12-14 12:49:12 -08:00
Robin Voetter
6219b12730 [CK_BUILDER] convolution testing (#3267)
* Add README.md for testing

* Add tensor_memory_manager.

* ck-builder: tensor memory manager rebase fixes

This fixes some issues caused by the API being changed recently.
Also, this streamlines the ckt namespace to always be ck_tile::builder::test,
as this is already being used by other tests

Really, this commit should be squashed into the previous,
but I'm keeping it separate for brevity.

* ck-builder: test arguments initial prototype

* ck-builder: test system initial prototype

* ck-builder: fix non-standardized copyright comments

* ck-builder: new prototype

* ck-builder: group testing inputs/outputs into a separate structure

This is basically the return of the tensor memory manager after all,
except that the design is more closely tied to the actual operation.
Using a struct allows us to add additional input/output tensors
without breaking code (by defaulting those new parameters). Note
that the tensors are split into a separate inputs/outputs because we
usually want to allocate the output _twice_: once for the real
computation and once for the reference computation.

* ck-builder: simplify prototype naming; start docs

* ck-builder: update testing readme

* ck-builder: testing documentation

* ck-builder: HipStatusMatcher

This matcher can be used to check HIP status codes and provide
nice and readable error messages.

* ck-builder: tensor_buffer.hpp tests

* ck-builder: conv_fwd.hpp tests

* ck-builder: add example end-to-end test in conv fwd 2d fp16

* ck-builder: simplify extent usage

* ck-builder: update testing doc

* ck-builder: skip end to end test on non-gfx9

* fix check_copyright_year interpreter

/bin/bash is not guaranteed to exist on Linux. Signed,
a NixOS user

* ck-builder: fix copyrights

* ck-builder: reduce conv fwd testing size

This test allocated 24GB of memory, too much for 16GB cards.

---------

Co-authored-by: John Shumway <jshumway@amd.com>
2025-12-13 15:33:41 +01:00
Aviral Goel
4dcc3e59c1 chore: update copyright header for misc files (#3402)
* chore: update copyright header for misc files

* fix: typo in kernel resulting in ci failure
2025-12-11 08:25:29 -08:00
Ville Pietilä
d66e5f667c [CK_BUILDER] Improve CK Builder and CK Builder tests (#3382)
* Remove stale documentation.

* Add placeholder for conv algorithm design description. Add link to conv factory description.

* Improve testing transfer parameters.

* Python script to check the block tilings.

* Improve tests and conv types serialization.

* Change representation of boolean values from 1/0 to true/false in instance strings.

* Change representation of boolean values from 1/0 to true/false in conv algorithm types.

* Test code improvements.

* Improve covn descriptions tests.

* Improve conv signature definition in conv fwd builder tests.

* clang-format.

* Remove obsolete script.

* Revert StaticAssertTypeEq changes in conv layout tests.

* Remove obsolete using declaration.

---------

Co-authored-by: Ville Pietilä <>
2025-12-11 09:50:00 +02:00
Bartłomiej Kocot
04612c30ce [CK_BUILDER] Ck Tile Grouped convolution factory (#3352)
* [BUILDER] Ck Tile Grouped convolution factory

* Part 2

* Fixes after rebase

* Remove leftovers
2025-12-08 10:32:56 +01:00
John Shumway
f5b0af2272 Simplify includes for CK builder reflection (#3357)
We only want to import enums and types into the builder reflection code. But, some of the enums are included in much larger files or even big trees of include files. This leads to unintended mixing of code and very confusing interactions and symbol conflicts. We organize the includes and extract two new enum-only headers to help with decoupling in CK. This refactoring is critical if we want to include reflection in a device-operator "describe" method.

* Remove a few unnecessary includes from headers in builder/reflect/.
* Extract enums scheduler and pipeline to their own headers so they can be used without importing other code.
* Order includes alphabetically for better organization.

The immediate goal is to unblock reflection integration, and this type of cleanup helps the flexibility and robustness of the CK header library.
2025-12-05 07:44:10 -08:00
John Shumway
13f6d63565 Clean up conv_traits.hpp (#3354)
When I asked for a description of operators that didn't have ConvTraits, I was getting very long confusing errors about ConvTraits not being defined. Now we get specific errors explaining which concepts are violated, making it easier to know which code to generalize or update.

* Add concepts to conv_traits.hpp to get better error message.
* Put the correct requires clauses in the right places to get descriptive error messages.
* General cleanup of functions in conv_traits.hpp to make functions easier to read.
2025-12-04 19:12:36 -08:00
Max Podkorytov
d184eed823 [CK-Tile] Refactor base pipeline usage (#3251)
* initial poc

* factor out common parts in operator()

* cv4

* rest of the universal gemm pipelines

* fix test

* remove boilerplate from tile engine

* fix example

* fix example

* format

* fix tests build for gemm

* remove base pipeline codegen from gemm instance builder

* unify v3 logic with the rest of universal gemm pipelines

* fix build for multi abd test

* fix test gemm multi d

* fix build for weight preshuffle

* fix grouped gemm test

* fix grouped gemm multi d test

* fix grouped gemm preshuffle

* fix grouped gemm example except for quant

* fix gemm preshuffle

* fix splitk 2 stage example

* fix batched gemm example

* fix multid example

* fix multiabd example

* fix batched gemm test

* fixup

* fix examples build

* fix grouped gemm test build

* fix smoke builder
2025-12-04 11:45:49 -08:00
Ville Pietilä
9cb1f421bc [CK_BUILDER] Refactor convolution signature to provide data type/layout/elementwise op per tensor (#3331)
* Separate layouts into separate entities for input, weight, and output tensors.

* Add test for handling bias tensor layouts.

* Use instance string in builder tests.

* Add handling of output bias data types and layouts.

* Generalize handling of the elementwise ops.

* Test fix.

* Create builder for layouts.

* Layout builder improvements.

* Improve layout builder.

* Simplify bias layout handling.

* Code clean-up.

* Move layout utils into separate file.

* Remove hard-coded layout combinations.

* Small code clean-up.

* Move data type utils into a separate file.

* Add data types, layouts, and elementwise ops per conv tensor.

* Builder bug fixes after refactoring.

* Working baseline.

* Make signature definition look nice in the test code.

* Move TensorConfig into test implementations.

* Fix all fwd conv builder tests.

* Fix conv traits and descriptors tests.

* More factory assets under a separate directory.

* Fix building conv traits.

* Fix clang-format.

* Add Readme doc to describe the design.

* Add link to main Readme. Fix links in the builder design doc.

* Clean-up data type/layout/elementwise op conversions.

* Switch from dimension and tensor type specific layouts to a flat list of tensor layouts.

* Fix clang-formatting.

* Fix clang-format for test code.

* Simplify fwd conv signature definitions in the test code.

* Remove accidental edits.

* Fix comment string.

* Fix instance factory after rebase.

* Fix tests after rebase.

* Unify layout handling.

* Add more conv layout unit tests.

* Clang-format.

* Fix merge conflicts.

* Improve elementwise op handling.

---------

Co-authored-by: Ville Pietilä <>
2025-12-04 12:58:31 +02:00
John Shumway
f29b67cf9b [CK_BUILDER] Add Description::instance_string() method and update tests (#3340)
* Create Description::instance_string() function

To expose more reflection capabilities in MIOpen, we add the instance_string functionality to the ckr::Description class. This PR introduces a base class, adds the instance_string method, and implements the method by injecting the Traits::instance_string method through the ConvDescription constructor.

This will enable us to replace the specialized get_instance_string() method on device operations with a describe() method in a subsequent PR.

* Test describe().instance_string()

Update the instance string tests to also call `ckr::describe<Instance>().instance_string()`. This documents that the xld kernels are supported with describe(), but WMMA and DL kernels are not yet supported. Also update namespace and add a HasConvTraits concept.
2025-12-03 06:36:09 -08:00
John Shumway
280bc42191 [CK_BUILDER] Refactor builder factory code. (#3276)
Refactor the builder factory code into multiple files and subdirectories and a ck_tile::builder::factory namespace.

The factory implements compile-time dispatch from high-level signature and algorithm descriptors to our existing specialized convolution kernel implementations.

Major changes in this PR:

Dispatch logic is explicit in the function make_conv_instance instead of implicit in template specialization selection.
Helper code is moved to a subdirectory builder/factory/helpers.
Helpers now have unit tests.
Factories are moved to their own files.
Code moved to namespaces ck_tile::builder::factory and ck_tile::builder::factory::internal.
This does not yet fix the problem of bad error messages, but the make_conv_instance function makes the poor error messages clear. The choice of algorithm must be much more robust (perhaps with explicit enumeration in the algorithm descriptor), so that the dispatch doesn't fail.

Quality changes:

Making dispatch explicit rather than implicit will improve robustness, readability, maintainability, testability, and extensibility.
Separating code into separate files and subdirectories helps readability and extensibility.
Adding unit tests for helpers documents behavior and will enable more complex logic and functionality.
Separating files (especially unit tests) helps clarify includes and dependencies and makes code easier to refactor.
2025-12-02 07:40:14 -08:00
DarylHawkinsAMD
d3f37ebf6c [CK_BUILDER] Install CK builder headers, added missing include (#3334) 2025-12-02 04:28:46 -08:00
John Shumway
7873f8fa13 [CK_BUILDER] Update the testing documentation (#3312)
* [CKBuilder] Update the testing documentation

Now that we have clear sets for smoke tests and regression test, we rearange the CMakeLists.txt file to be organized and have description and instructional comments.

Move all the test targets that compile quickly into the smoke test suite.

Update the builder README.md to reflect this new test organization and functionality.

* Update experimental/builder/README.md

Clarify integration tests description from review comment.

Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>

* Correct README.md

The regression tests here still run very fast like smoke tests, but can take minutes or even tens of minutes to compile.  We test most of the builder functionality without compiling heavily-templated kernel code, but these regression tests do an expensive full build of the CK kernels.

---------

Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>
2025-12-01 13:05:32 -08:00
John Shumway
d17994f3df [CK_BUILDER] Fix cosmetic problem with conv_description (#3333)
The ConvDescription::detailed command wasn't using TreeFormatter::writeLast correctly, which led to extra lines being drawn in the tree view. It's a simple fix, just a cosmetic improvment out reflection output (ASCII art).
2025-12-01 12:45:04 -08:00
John Shumway
abd6a4b3fc Cleanup convolution description (#3329)
Remove obsolete feature for extracting a description from a builder, since this should apply directly to the instance type. Also add some documentation, including a README.md for reflection.
2025-12-01 10:03:58 +01:00
Aviral Goel
004784ef98 chore(copyright) update library wide CMakeLists.txt copyright header template (#3313)
* chore(copyright) update library wide CMakeLists.txt files copyright header template

* Fix build

---------

Co-authored-by: Sami Remes <samremes@amd.com>
2025-11-28 13:49:54 -08:00
Illia Silin
a54f7b1138 Enable ck_builder in CI. (#3296)
* build and run ck_builder tests

* add test_ckb_all to targets

* fix syntax

* fix test path

* Update CMake targets for builder testing in CI (#3290)

Our existing CMake only had build targets. Update CMakeLists.txt to have CTEST targets:
* smoke-builder
* regression-builder
* check-builder

Co-authored-by: John Shumway <jshumway@amd.com>

* use check-builder target

* get rid of test_ckb_all target

* call ninja check-builder separately

---------

Co-authored-by: John Shumway <jshumway@amd.com>
2025-11-25 17:45:59 -08:00
Aviral Goel
cd47293869 chore(copyright): update copyright header for experimental & example directory (#3292) 2025-11-26 03:09:39 +04:00
Bartłomiej Kocot
00dfa2f2ce [CK TILE] Grouped Conv Explicit Gemm (#3289)
* [CK TILE] Grouped Conv Explicit Gemm

* fixes

* apply builder fixes
2025-11-25 23:28:35 +01:00
Bartłomiej Kocot
9ac2666d5b [CK_BUILDER] Add grouped conv bwd ck tile traits (#3281)
* [CK_BUILDER] Add grouped conv bwd ck tile traits

* copilot fixes
2025-11-25 14:57:43 +01:00
John Shumway
1bc7529977 Guard a builder test to avoid gfx11 and gfx12 (#3268)
We're getting a compile error on gfx11 and gfx12 for an I8 test that doesn't have a supported WMMA implmentation. We'll need to build architecture support into the builder, but to get things green I'm just adding an ifndef guard around the test.
2025-11-24 10:10:09 -08:00
John Shumway
ea6e4fcbbc Fix builder errors. (#3260)
There were four errors to fix:
1. The checks for defaulted direction were not implemented in the predicate concept.
2. Had to delete an obsolete and undefined operation enum.
3. A factory was passing a boolean in place of an integer.
4. Some of the factory tests are not compiling correctly when linking in the full source (with CK_EXPERIMENTAL_BUILDER=ON), so I commented them out.
2025-11-21 15:25:45 +01:00
John Shumway
f38c3de9f9 Fix copyright messages in experimental/builder. (#3253)
Our copyright were were mostly correct, but we inconsistently used (C) instead of (c) like the rest of the CK code. This PR fixes that (using lowercase c) and adds a missing copyright header to one file.
2025-11-20 17:40:55 -08:00
Robin Voetter
bb155ef678 ck-builder: add remaining ck factory tests (#3223)
Now that the remaining reflection has been implemented, we
can add the remaining factory tests too. This is the complete set
of instances for forward grouped conv currently in CK.
2025-11-20 10:42:36 -08:00
Robin Voetter
245c6011cf ck-builder: group transfer operations per tensor (#3217)
Grouping transfer operations per tensor makes it easier to
constrain on and operate with the transfer operations. As an
example, we can now deduplicate the logic for translating
the transfer operations from the ck-builder interface to the old
ck interface for the A and B tensors.
2025-11-20 10:40:48 -08:00
kabrahamAMD
964f8e1f60 [CK_Builder ]fixed accidental drop of get_elementwise_operation during merge and added usage of get_elementwise_operation() to other builder instances (#3238)
Fixed issues encountered during merge of #3192

* fixed accidental drop of get_elementwise_operation during merge and added call to get_elementwise_op to 4 other builders

* run clang-format

---------

Co-authored-by: Kevin Abraham <kevin.abraham@streamhpc.com>
2025-11-19 12:31:05 -08:00
Robin Voetter
7fe7aa76f5 [CK_BUILDER] fixes (#3222)
* ck-builder: some miscellaneous fixes
* ck-builder: fix InstanceSet.FromFactory test

The exact syntax that the instance string functionality
returns has changed. This commit updates the test to expect
the right string.
2025-11-19 09:05:25 -08:00
John Shumway
ad57f6ef0b [CK_BUILDER] Put global CK functions in an the CK namespace (#3232)
* Wrap ck host utitlies in CK namespace.

The CK and CK-Tile source code bases are incompatible because CK is not properly using namespaces everywhere. In particular, we need to put hip_check_error in the ck namespace.

Move all functions in include/ck_/host_utility that were in global namespace into the ck namespace.

There may be additional namespace problems like this, and it's possible we'll have namespace clashes. But it is good design to properly guard our to code bases (CK and CKTile) so that they can both coexist. Moreover, estabilishing this compatiblity is essential if we are going to allow the builder to instantiate  kernels from either template library.

* Add using declarations to test code.

After moving some of the untils into the ck namespace, most examples and a few tests had to be updated to recognize the new namespace declarations. We add using declarations to individual compute units for functions that were previously in the global namespace.

* Add using declarations to client examples.
2025-11-19 11:23:02 +01:00
kabrahamAMD
92498464f6 [CK_Builder] removed direction and elementwise_operation from required parameters … (#3192)
Removed direction and elementwise operation from default values required for convolution signature concept. Added constexpr helpers to set default values. Add compile-time tests.
2025-11-17 15:23:48 -08:00
Ville Pietilä
7d57bc169f [CK_BUILDER] Forward convolution builder improvements (#3179)
Proposed changes
Improve the forward convolution builder implementation and addressed leftover feedback left from PR #3138. Main changes

Refactored tests such that they reflect better the builder pattern. The templates and types for the convolution algorithm concepts are created via factory that facilitates programmatic creation of the device op instances.
Moved tests into anonymous namespace.
The convolution factory had lot of if-else constructs when CK Builder types were converted into CK library types. I had initially trouble in using static_assert in the default branch of switch as the static_assert was evaluated at compile time even for valid types. However, if we change the static_assert to throw "<error message>", it will result in a compile-time error only if the default branch is actually hit. This assumes that the function is consteval. Hence, changed all conversions in the convolution factory to use switch, which is more intuitive.
Removed the explicit device op definition from convolution signature and the corresponding predicate file. The device ops are defined by the corresponding concepts. This allowed to remove lot of boilerplate code from the convolution factory.
Adde inheritance and convolution algorithm specialization to handle device ops that are specialization of a more generic ones. The large tensor support is more naturally expressed by this pattern.
Added support for the FP8 data type.

* WIP: Builder for expected test results.

* Improve ckb fwd conv instance tests.

* clang-format

* Change if-else statements into switch in conv factory.

* Fix clang-formatting.

* Removed unnecessary includes.

* Added missing copyright.

* Remove explicit device op flag from from convolution signature.

* Add missing concept.

* Fix build.

* clang-format

* Add test for building conv fwd FP8 instances.

* Add missing header to instance traits.

* Clean-up recently added instances.

* Introduce inheritance and specialization.

* Use builder to build conv algorithm templates and types.

* clang-format

* Fix conv description tests.

---------

Co-authored-by: John Shumway <john.shumwayjr@gmail.com>
2025-11-13 08:47:25 -08:00
Bartłomiej Kocot
92c1f4981a [CK_BUILDER] Add grouped conv fwd ck tile traits (#3183)
* [CK BUILDER] Add grouped conv fwd ck tile traits

* Update instance_traits_tile_grouped_convolution_forward.hpp

* Update grouped_convolution_forward_kernel.hpp
2025-11-11 13:55:33 -08:00
JH-Leon-KIM-AMD
5f3cae3e28 [CK_BUILDER]ckb add remining fwd conv device ops (#3155)
* Add device operation to conv signature. Use unions to hold conv layouts and device operations.

* Add predicates for all device op instances.

* Use the device op signature for validation.

* Fix ckb CMakeLists.txt file for tests.

* Fix building CK Builder instance traits after the introduction of direct load template parameter in CK.

* Fix clang-formatting.

* add device_grouped_conv_fwd_dl_multiple_d_nhwc_kyxc_nhwk

* Add full DL configurability with Option A implementation

- Added 5 DL descriptor structs (39 configurable parameters)
- Added 10 C++20 concepts for type-safe validation
- Updated factory to read all parameters from descriptors
- Updated test helper to populate all descriptors
- All tests passing (13/13 including 3 new DL tests)

* Add factory and test support for DeviceGroupedConvFwdMultipleD_Xdl_CShuffle_Large_Tensor

- Add factory specialization for Large_Tensor device operation (conv_factory.hpp lines 1145-1265)
- Add macro collision workaround using pragma push/pop (conv_factory.hpp lines 43-51)
- Add test helper function run_test_DeviceGroupedConvFwdMultipleD_Xdl_CShuffle_Large_Tensor
- Add builder test file test_ckb_conv_fwd_2d_large_tensor_fp16.cpp with 2 test cases
- Update CMakeLists.txt to include new test file
- Reuse existing ConvAlgorithm_DeviceGroupedConvFwdMultipleABD_Xdl_CShuffle descriptor
- Map all 42 template parameters identical to regular XDL CShuffle
- All 15 builder tests passing including 2 new Large_Tensor tests

Completes Task 350: All 4 forward convolution device operations now supported in CK Builder.

* Update copyright headers to new format

- Change copyright format to: Copyright (C) Advanced Micro Devices, Inc., or its affiliates.
- Reorder headers: Copyright first, then SPDX-License-Identifier
- Updated files:
  * experimental/builder/test/conv/test_ckb_conv_fwd_2d_dl_fp16.cpp
  * experimental/builder/test/conv/test_ckb_conv_fwd_2d_large_tensor_fp16.cpp
  * experimental/builder/include/ck_tile/builder/device_op_types.hpp

* fix c++ 18 format

* Fix clang-format-18 error in device_op_types.hpp

---------

Co-authored-by: Ville Pietilä <ville.pietila@amd.com>
Co-authored-by: Ville Pietilä <188998872+vpietila-amd@users.noreply.github.com>
2025-11-06 16:29:48 -08:00
Adam Osewski
18e083003f [CK_BUILDER] Convolution description (#3163)
* Add DirectLoad tparam & clean up headers.

* Add convolution traits.

* Update inline documentation.

* Add more convolution specialization and gemm padding types.

* Add additional helper functions & more tests to conv traits.

* Fix tests cmake file.

* Add case insensitive string comparison

* Fix function name overlapping with variable name.

* Unify pipeline version and scheduler enums.

* Fix includes.

* Update test conv traits with unified enums.

* Update concepts etc with update unified enum

* Fix ckb conv fwd test - unified enum usage.

* Dump changes.

* Add ostream overloads for all enum classes.

* Update detailed() function in ConvDescription

* Fix handling union based conv direction.

* Add test & update conv description.

* Refine tree view.

* Update copyrights

* Fix merge artifacts

* Update detailed tree conv description

* Fix clang-format
2025-11-06 15:46:26 +01:00
Adam Osewski
b8527a9236 [CK_BUILDER] Convolution traits. (#3152)
Added:

1. Convolution traits & unit tests
2. Update builder enumerators to have representation of Convolution Kernels properties.
3. Unified builder pipeline version & scheduler enumerators
2025-11-05 08:53:06 -08:00
John Shumway
0be0288f58 [CK_BUILDER] Update copyright messages. (#3150)
* Update copyright messages.

Copyright messages should no longer include a year. This PR updates all 38 source files to the new format.

* Switch to (C) from unicode copyright symbol.

The unicodein comments  was causing compilation errors.
2025-11-04 15:35:16 +01:00
John Shumway
6dbee64886 [CK_BUILDER] Add backward weight instance traits for xdl cshuffle. (#3143)
* Add backward weight instance traits for xdl cshuffle.

To keep instance test file sizes reasonable, we start a new test_bwd_weight_instances_traits.cpp test file.

* Fix copyright notices.

* Remove (c) symbol, replace with (C).

Having UTF-8 in source caused an error with code generation.
2025-11-04 15:34:00 +01:00
msaffari-amd
d405641f06 Ck tile engine gemm unit tests exapand test coverage (#3025)
* initial commit for testing datatypes, layouts and traits

* correct warp tile size for small datatype config to make a validate instance for fp16, bf16, fp8

* add tile size coverage test

* Cover more tests, parallel instance generation, documentation

* update cmakelist to run more tests

* initial codes to support add test params in json file

* add congurable  problem sizes for different tests

* modify README.md

* clean test_gemm_simple code

* correct padding coverage test

* Add comprehensive and quick tile size config files

* remove fp64 from datatypes

* update documents. manage selecting tile_size config (quick or Comprehensive)

* correct padding test problem sizes

* update comprehensive test and correct documents

* Skip GEMM tests with unsupported arguments instead of failing

* change gen_single instead of gen_indivisual because of an issue. add splitk tests to tile_size_quick_config

* clean CMakeList, remod py file

* Refactor test configs: Rename tile_size to coverage, remove separate traits config,  clean cmakefile, readme

* update fp32, fp8 to test all layouts, clean documents and comments

* limit fp32 test layouts to rcr because of compilation error on some gpus

* remove fp32 because of the removing from gemm_instance_builder, make quick test smaller, updating comments

* Fix fp8/bf8 test failures on gfx950 by adding OCP FP8 format support

* Reduce quick_coverage test count from ~250 to ~144 for faster CI
2025-11-03 10:29:16 +01:00
Ville Pietilä
3ae3992c18 [CK_BUILDER] Add conv factories for DeviceGroupedConvFwdMultipleABD_Xdl_CShuffle and DeviceGroupedConvFwdMultipleD_Wmma_CShuffle (#3138)
* Add device operation to conv signature. Use unions to hold conv layouts and device operations.

* Add predicates for all device op instances.

* Use the device op signature for validation.

* Fix ckb CMakeLists.txt file for tests.

* Fix building CK Builder instance traits after the introduction of direct load template parameter in CK.

* Fix clang-formatting.

* Add factory for DeviceGroupedConvFwdMultipleABD_Xdl_CShuffle device op.

* Add conv factory for  DeviceGroupedConvFwdMultipleD_Wmma_CShuffle

* Rename elements per wave per shuffle member in the epilogue concept.

* clang-format

* Add concepts and types for optional device op template parameters.

* Add optional compute, direct load, and loop scheduler arguments to conv factory.

* Add number of groups to merge template parameter.

* clang-format.
2025-11-03 09:03:25 +02:00
John Shumway
5ed2046bee Add the last two forward instance traits. (#3134)
* Add InstanceTraits for DeviceGroupedConvFwdMultipleD_Wmma_CShuffle

* Add InstanceTraits for kernel_grouped_conv_fwd_dl_multiple_d

* A few small changes to fix broken instance traits.
2025-10-31 07:52:42 -07:00
Ville Pietilä
b387249fd9 [CK_BUILDER] Generalize convolution factory to build arbitrary device operations. (#3116)
Generalize the current convolution factory in CK Builder to be able to build instances of any relevant convolution device operation. The main changes are:

* Added new enums FwdGroupConvDeviceOperation, BwdDataGroupConvDeviceOperation, and * BwdWeightGroupConvDeviceOperation that contain the device operations for which the builder should be able to build instances.
* Create a union structure GroupConvDeviceOp that can represent a single value of the fwd, bwd weight, or bwd data device operations. This would be more naturally represented by std::variant object, but we cannot use std::variant in NTTPs because it is not a structural object.
* Introduced a new member device_operation in the ConvSignatureDescriptor concept that assumes GroupConvDeviceOp value.
* Added predicates to be used in creation ConvFactory specialization for the different device operation. When we add support for a new device operation, we'll just create a new ConvFactory specialization with appropriate predicates.
* Changed handling of the convolution layouts (GroupConvLayout1D, GroupConvLayout2D, GroupConvLayout3D) to use the union based handling, i.e., there's now a GroupConvLayout union struct that can hold a single value of the 1D, 2D, or 3D layouts. This simplifies the handling of the different layouts as we get rid of templatized convolution signature.

These code changes allow developers to work more easily in parallel when adding new device operations.

* Fix building CK Builder instance traits after the introduction of direct load template parameter in CK.

* Fix clang-formatting.
2025-10-30 16:13:58 -07:00
Ville Pietilä
90da26ccfd [CK_BUILDER] Rename CK Builder test targets with consistent prefix test_ckb (#3114)
* Rename CK Builder test targets with consistent prefix test_ckb.

* Add test_ckb_all target to build all CK Builder tests.

* Update Readme for CK Builder.
2025-10-30 16:08:32 -07:00
John Shumway
cafaeb6b7b Add instance traits for two more grouped forward convolutions (#3112) 2025-10-29 16:04:13 +01:00
Ville Pietilä
13e13ce359 [CK_BUILDER] Clean-up fwd conv builder implementation (#3110) 2025-10-28 20:37:33 -07:00
Robin Voetter
6f58d6e457 [CK_BUILDER] Factory tests (#3071)
This pull requests adds some initial "factory tests" - these check that the instances which are used in MIOpen are actually present in CK. The main reason for this is documentation and sanity checking. Its likely that these tests get outdated fast, so we'll have to maintain them, but fortunately this is quite straight forward and shouldn't take a lot of time once they are in place.
2025-10-28 10:27:42 -07:00
John Shumway
54746e9329 [CK_BUILDER] Test and fix instance traits utils. (#3096)
* Refactor instance_traits_util and add unit tests tests

* Address reviewer comments.

Just adds some TODOs to indicate deprecated layouts in our reflection. Our strategy is to leave the reflection code broad (covering deprecated features), but keep the builder concepts narrow. Once we've removed deprecated features from all instances, we can remove them from reflection.

Also add a comment to the cmake to explain the unit test target test_conv_builder.

* Addressed more reviewer comments.

* Remove duplicate PassThrough::name

Accidentally added this field to the end of the struct, too. The `name` field should be a the start of the struct for consistency.
2025-10-27 22:14:08 -07:00
Ville Pietilä
6c2ca1211a [CK_BUILDER] First fwd convolution builder implementation (#3070)
* Add experimental builder infrastructure for composable_kernel

- Add experimental/builder directory with README documentation.
- Create initial test infrastructure with CMakeLists.txt and placeholder test.
- Update root CMakeLists.txt to support CK_EXPERIMENTAL_BUILDER option.
- Update .gitignore to not treat `experimental/builder` as a CMake build directory.

This establishes the directory structure  for a high-level builder pattern that will provide a semantically-clear interface for constructing CK operations, with initial focus on convolution kernels for MIOpen integration.

* Fix clang formatting.

* Fix CMake build infrastructure for experimental builder

- Add experimental/builder CMakeLists.txt with proper subdirectory structure
- Add placeholder include/ck_tile/builder CMakeLists.txt for header installation
- Fix gtest.cmake to use include_guard to prevent multiple inclusions
- Update root CMakeLists.txt to include full builder directory instead of just tests

* Scope C++20 settingto the test code

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Remove redundant GTest::gtest linkage

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Introduce basic types, and convolution algorithm concepts and limits.

* Add convolution signature concepts.

* Add convolution factory.

* Finalize conv factory implementation for fwd convolutions.

* Add type definitions for testing.

* Add placeholder test.

* Add convolution builder definition.

* Fully functional fwd conv builder.

* Test improvements.

* Clean-up include headers.

* Enable the limit checks for the convolution algorithm parameters.

* Remove dead code.

* clang formatting.

* Add more tests and missing conv specialization argument.

* clang formatting.

* Add explicit handling of the tensor layouts.

* Add complete 2D/3D layout support to CK Builder

  - Add missing 2D layouts: GNHWC_GKYXC_GNHWK, NGCHW_GKCYX_NGKHW
  - Add missing 3D layout: GNDHWC_GKZYXC_GNDHWK
  - Add 1D layouts (NWGC, NGCW, GNWC, NGCW_GKCX) for future support
  - Add 3 tests for new 2D/3D layouts
  - All tests pass (5/5)

* Add tests for remaining 2D/3D layouts

  - Add test for 2D NGCHW_GKYXC_NGKHW (channels-first) with Filter1x1Stride1Pad0
  - Add test for 3D NDHWGC_GKZYXC_NDHWGK (channels-last)
  - All 7 tests pass (complete coverage for all 2D/3D forward layouts)

* Change enum converters to consteval.

* 7 tests with pipeline and specialization| Test # | Dim | Type | Layout               | Pipeline | Specialization          |
  |--------|-----|------|----------------------|----------|-------------------------|
  | 1      | 2D  | BF16 | NHWGC_GKYXC_NHWGK    | V1       | DEFAULT                 |
  | 2      | 2D  | FP16 | GNHWC_GKYXC_GNHWK    | V3       | FILTER_1X1_PAD0         |
  | 3      | 2D  | FP32 | NGCHW_GKCYX_NGKHW    | V4       | FILTER_1X1_STRIDE1_PAD0 |
  | 4      | 2D  | BF16 | NHWGC_GKYXC_NHWGK    | V5       | FILTER_3x3              |
  | 5      | 3D  | FP32 | NGCDHW_GKCZYX_NGKDHW | V1       | FILTER_1X1_PAD0         |
  | 6      | 3D  | BF16 | GNDHWC_GKZYXC_GNDHWK | V3       | DEFAULT                 |
  | 7      | 3D  | FP16 | NDHWGC_GKZYXC_NDHWGK | V4       | FILTER_1X1_PAD0         |

* Add missing convolution layouts and provide better compile-time error in instance traits.

* Fix clang formatting.

* Changed I8 -> S8.

* Fix signature.

* Rename concepts and corresponding members.

* Rename LDS related parameters.

* Remove ODD_C specialization. Add V2 pipeline.

* Add missing types.

* Add elementwise operation to the conv signature.

* Improve compile-time error message for unsupported elementwise ops.

* Separate different fwd conv builder tests into separate compilation units.

* Fix layout to string and add name to old CK PassThrough elementwise op.

* Enable both CK and CK Tile tensor layouts in instance traits.

* Fix clang-format.

---------

Co-authored-by: John Shumway <jshumway@amd.com>
Co-authored-by: John Shumway <john.shumwayjr@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: JH-Leon-KIM-AMD <jeonghyun.kim@amd.com>
2025-10-27 20:09:24 +02:00
John Shumway
6d709dac41 [CK Builder] Add missing tf32 type to reflection. (#3090)
We need to check all the architectures for build errors. This missing tf32 type came up as a build failure when I compiled for different instinct architectures.
2025-10-25 07:28:12 -07:00
Adam Osewski
f53d857b25 [CK_Builder] Add name member to unary elementwise ops & update builder traits. (#3093)
* Add name member to unary elementwise ops.

* Update elementwise_op_name to check for name attribute.

* Require that the layout is derived from BaseTensorLayout struct.
2025-10-25 07:27:03 -07:00