2541 Commits

Author SHA1 Message Date
Kevin Abraham
190696e63c fixed tests for grouped_convolution_forward_clamp.cpp 2025-10-28 07:28:40 +00:00
Kevin Abraham
225a028cce added test_grouped_convolution_forward_clamp 2025-10-27 20:38:02 +00:00
Kevin Abraham
5e14625be7 resolved merge issues with test_ck_factory_grouped_convolution_forward_convscale 2025-10-27 20:36:26 +00:00
Kevin Abraham
3b45c7fd2d added grouped_conv_bilinear to tests 2025-10-27 16:44:46 +00:00
Kevin Abraham
ef1c170abd Revert "added grouped_conv_bilinear to tests"
This reverts commit fb30b087d6402a34f08cc1ee772a81df632e39d5.
2025-10-27 16:44:45 +00:00
Kevin Abraham
37323495cb Implemented tests for dynamic op 2025-10-27 16:44:45 +00:00
Kevin Abraham
d08b6ab52b added grouped_conv_bilinear to tests 2025-10-27 16:44:45 +00:00
Kevin Abraham
02b250f298 implemented tests for instances from grouped_convolution_forward_convscale.hpp:210 2025-10-27 16:44:44 +00:00
Kevin Abraham
bd40b58266 implemented tests for instances from grouped_convolution_forward_convscale.hpp:100 2025-10-27 16:43:54 +00:00
Robin Voetter
7a49fb2874 ck-builder: add missing type tf32 to type_name 2025-10-27 16:41:30 +00:00
Robin Voetter
bf022a6d15 ck-builder: add InstanceSet and InstanceMatcher
These types can be used to check whether a list of instances
defined statically/obtained from a DeviceOp is congruent. We don't
care about the order, just which items are present or not. Note that
when GetInstanceString() is not implemented for a particular kernel,
it returns "" (via the base class implementation). When this is the
case, tests should temporarily add "" to the expected list of
instances until the implementation is done.
2025-10-27 16:41:28 +00:00
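The order-insensitive comparison described in the commit above can be illustrated with a minimal sketch. The helper name `same_instances` is hypothetical and is not the InstanceSet/InstanceMatcher API from ck-builder; it only shows the idea of comparing two lists by which items are present, with "" standing in for kernels that do not yet implement GetInstanceString().

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical helper (not the ck-builder API): two instance lists are
// considered congruent when they contain the same items, regardless of order.
bool same_instances(const std::vector<std::string>& expected,
                    const std::vector<std::string>& actual)
{
    // std::set discards ordering and duplicates, so only presence matters.
    const std::set<std::string> lhs(expected.begin(), expected.end());
    const std::set<std::string> rhs(actual.begin(), actual.end());
    return lhs == rhs;
}
```

With this sketch, an expected list that includes "" matches an actual list in which one kernel still returns the empty base-class string.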
Robin Voetter
d16618825d ck-builder: ck factory grouped conv fwd bias bnorm clamp 2025-10-27 15:32:11 +01:00
Robin Voetter
2cb1d61ec6 ck-builder: ck factory grouped conv fwd scaleadd scaleadd relu 2025-10-27 15:21:33 +01:00
Robin Voetter
2a309d7534 ck-builder: ck factory grouped conv fwd bias clamp 2025-10-27 15:20:54 +01:00
Robin Voetter
f709bedcd6 fix interpreter path on remove_exec_bit script
/bin/bash does not need to exist on POSIX systems.
2025-10-27 14:46:03 +01:00
Robin Voetter
10869a06b7 ck-builder: ck factory grouped conv fwd convinvscale 2025-10-27 14:41:28 +01:00
John Shumway
b24e1bf32b Refactor instance_traits_util and add unit tests 2025-10-27 13:33:08 +01:00
Robin Voetter
ea7f5faa3e ck-builder: ck factory grouped conv fwd scale 2025-10-27 13:33:07 +01:00
Robin Voetter
5679bcfe49 ck-builder: ck factory grouped conv fwd scaleadd ab 2025-10-27 13:33:07 +01:00
Robin Voetter
d15334ed0d ck-builder: ck factory grouped conv fwd 2025-10-27 13:33:06 +01:00
Robin Voetter
16db75fadf ck-builder: ck factory convscale relu/add 2025-10-27 13:33:05 +01:00
Robin Voetter
c07b436666 ck-builder: add InstanceSet and InstanceMatcher
These types can be used to check whether a list of instances
defined statically/obtained from a DeviceOp is congruent. We don't
care about the order, just which items are present or not. Note that
when GetInstanceString() is not implemented for a particular kernel,
it returns "" (via the base class implementation). When this is the
case, tests should temporarily add "" to the expected list of
instances until the implementation is done.
2025-10-27 13:31:16 +01:00
John Shumway
6d709dac41 [CK Builder] Add missing tf32 type to reflection. (#3090)
We need to check all the architectures for build errors. This missing tf32 type came up as a build failure when I compiled for different Instinct architectures.
2025-10-25 07:28:12 -07:00
Adam Osewski
f53d857b25 [CK_Builder] Add name member to unary elementwise ops & update builder traits. (#3093)
* Add name member to unary elementwise ops.

* Update elementwise_op_name to check for name attribute.

* Require that the layout is derived from BaseTensorLayout struct.
2025-10-25 07:27:03 -07:00
kabrahamAMD
e576992dca [CK_BUILDER] Add inline string diff for tests (#3067)
Adds new testing functionality: an inline diff for string comparison.

Example usage:

EXPECT_THAT("Actual string", ck_tile::test::StringEqWithDiff("Expected string"));

Failure message:

Value of: "Actual string"
Expected: "Expected string"
Actual: "Actual string" (of type char [14]),
Diff: "[Expe|A]ct[ed|ual] string"

The inline-diff function uses the Wagner-Fischer algorithm, which has O(N^2) complexity, to find the minimum edit distance and generate diff markers. It has optional color codes that are enabled with the matcher.
2025-10-25 07:22:41 -07:00
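For readers unfamiliar with the Wagner-Fischer algorithm mentioned in the commit above, here is a minimal sketch of its dynamic-programming table. It computes only the edit distance; the function name `edit_distance` is an assumption for illustration, and the diff-marker generation and color codes of the actual matcher are not reproduced.

```cpp
#include <algorithm>
#include <cstddef>
#include <string_view>
#include <vector>

// Sketch of Wagner-Fischer: dp[i][j] is the minimum number of single-character
// insertions, deletions, or substitutions needed to turn the first i characters
// of `a` into the first j characters of `b`. Time and space are O(N*M).
std::size_t edit_distance(std::string_view a, std::string_view b)
{
    const std::size_t n = a.size();
    const std::size_t m = b.size();
    std::vector<std::vector<std::size_t>> dp(n + 1, std::vector<std::size_t>(m + 1, 0));

    for(std::size_t i = 0; i <= n; ++i)
        dp[i][0] = i; // delete all i characters
    for(std::size_t j = 0; j <= m; ++j)
        dp[0][j] = j; // insert all j characters

    for(std::size_t i = 1; i <= n; ++i)
    {
        for(std::size_t j = 1; j <= m; ++j)
        {
            const std::size_t substitution_cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            dp[i][j] = std::min({dp[i - 1][j] + 1,                       // deletion
                                 dp[i][j - 1] + 1,                       // insertion
                                 dp[i - 1][j - 1] + substitution_cost}); // match or substitution
        }
    }
    return dp[n][m];
}
```

A diff such as the one in the failure message can be recovered by walking this table back from `dp[n][m]` to `dp[0][0]`, although that step is omitted here.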
Max Podkorytov
86d542f663 [CK-Tile][Async gemm] add missing sync and f8 inputs test cases (#3000)
* add missing sync and f8 test cases

* reformat test cases

* comment failing cases

* bump

* reintroduce compv4 shapes
2025-10-24 12:16:01 -07:00
Khushbu Agarwal
0584399571 [CK_TILE] Adding support for TiledPermuteN on preshuffle Block Scale Gemm (#3019)
* Adding support for TiledPermuteN

* Adding test

* resolving remod.py

---------

Co-authored-by: root <root@banff-cyxtera-s73-2.ctr.dcgpu>
2025-10-24 11:06:51 -07:00
Max Podkorytov
f39626fcf7 [CK][host] limit the rotating count to prevent oom (#3089)
* [CK][host] limit the rotating count to prevent oom

* add numeric header for accumulate
2025-10-24 08:55:54 -07:00
Max Podkorytov
fdcc1f75c3 limit the rotating count to prevent oom (#3087) 2025-10-24 08:55:34 -07:00
andrew clark
775b96ea6a Fixing Run CI Check for Changed Files (#3072)
* Fixing check for changed files

* Testing CI skip behavior

* Testing CI Trigger

This should skip CI

---------

Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>
2025-10-24 07:52:43 -07:00
kyle-256
3c12a02827 [CK_TILE] add tensorwise quant in grouped gemm (#3007)
* add tensorwise quant in grouped gemm

* fix example issue

* update test cases

* format codes

* clang format

* use GTEST_FAIL

* fix a bug in test_grouped_gemm_util

* skip test when use wmma on grouped_quant kernel

* change cmake

* change code based on comments

---------

Co-authored-by: ThomasNing <thomas.ning@amd.com>
2025-10-24 07:41:54 -07:00
yinglu
6bbc05e1bd conv:tf32:add missed instances (#3081)
* conv:tf32:add missed instances
2025-10-24 16:28:36 +08:00
Robin Voetter
d0364641ed [CK_BUILDER] old ck build fixes (#3075)
* Disable c++20-compat warnings when building old CK in C++20 mode

Turns out that this creates some warnings for no good reason.

* ck-builder: add missing layouts and element-wise op names

For layouts, we can directly use the ::name attribute, which should
cover all layouts. For element-wise ops, I just added the ones which
are currently missing when compiling CK with -DMIOPEN_REQ_LIBS_ONLY.
2025-10-23 13:01:19 -07:00
Thrupti Raj Lakshmana Gowda
0fd7d1a607 Excluding Tile engine from build (#3085) 2025-10-23 12:57:18 -07:00
Geo Min
2546fc241e adding commit hash (#3084) 2025-10-23 12:32:26 -07:00
Yi DING
fe4eaeb2eb Use filename but not path to filter compilation (#3083)
* prologue

* Use filename but not path to filter test compilation
2025-10-23 12:01:26 -07:00
Gino Lu
bedade2572 [CK_TILE] Add fp4 warp gemm 16x16x128 (#2738)
* first commit

* fix format error

* fix vec size error

* fix clang format

* fix type error

* add interface in warp_gemm_impl

* fix interface

* fix bug

* fix bug

---------

Co-authored-by: asleepzzz <hanwen.chang@amd.com>
Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>
2025-10-23 10:55:51 -07:00
Rostyslav Geyyer
6df69abeef Rearrange pointers to fix the reinterpret_cast issue (#3077) 2025-10-23 10:54:13 -07:00
Qianfeng
fbd101b1ac [CK_TILE] Fix in set_slice_tile (#2232)
Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>
2025-10-23 10:34:02 -07:00
Michal Kulikowski
b9789a0742 [CK][Examples] Fixing stride issues in ck examples by workaround - Bypassing hostTensor validation.
Signed-off-by: Michal Kulikowski <Michal.Kulikowski@amd.com>
2025-10-23 08:46:02 +02:00
Haocong WANG
0d3860dfdb [CKTILE] FMHA fwd trload lse fix (#3046)
* enable storelse for fmha_fwd_trload kernel

* fix lse in trload

* fix the mask related bug
2025-10-23 09:33:33 +08:00
spolifroni-amd
1b95803431 updated the changelog with 7.1 and beyond info 2025-10-22 13:35:45 -06:00
lalala-sh
211d64e18a [CK_TILE] Update flatmm related kernels (#3022)
---------

Co-authored-by: Ding, Yi <yi.ding@amd.com>
Co-authored-by: felix <felix.li@amd.com>
2025-10-22 22:36:11 +08:00
Johannes Graner
cbd1279ae6 [CK_TILE] Conv bwd splitN support (#3047)
* Conv bwd splitN support

* Adjust splitting calculations to lengths format

* Prepare indexing for future splitK support
2025-10-22 13:34:06 +02:00
MHYangAMD
5a27a97391 Introduce tree reduction for BlockReduce2dCrossWarpSync (#2588)
* Introduce tree reduction for BlockReduce2dCrossWarpSync

* Rename original impl to BlockReduce2dLinearCrossWarpSync

* Replace warp_size with get_warp_size()

---------

Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>
2025-10-22 14:41:35 +08:00
John Shumway
37dff024c1 [CK_BUILDER] Add compile-time reflection for a convolution instance (#3065)
* [CK_BUILDER] Add compile-time reflection for a convolution instance

Introduce the InstanceTraits template metaprogramming framework to enable runtime introspection of device kernel template parameters without requiring implementation knowledge. This reflection system extracts configuration details (block sizes, data types, layouts, tuning parameters) directly from kernel specializations through template pattern matching. In particular, the GetInstanceString method returns a string that uniquely identifies the kernel by explicitly serializing all template parameter values.

This provides critical functionality for MIOpen integration, since the existing GetTypeString method is ambiguous and only captures some of the template parameters.

The implementation uses a two-level design: a primary InstanceTraits template declaration in instance_traits.hpp serves as the interface, while kernel-specific specializations (e.g., for DeviceGroupedConvFwdMultipleABD_Xdl_CShuffle_V3) provide the actual extraction logic. This separation allows the reflection system to scale to additional kernel types without modifying the core interface.

Key architectural decisions:

- Forward-declare device kernels in instance_traits.hpp to avoid circular dependencies, since device implementation headers will include the reflection headers

- Use compile-time constants and type aliases to expose kernel parameters, enabling zero-overhead introspection

- Provide a templated instance_string() function that generates human-readable kernel configuration strings by serializing all template parameters in order, useful for debugging and kernel identification

- Guard reflection integration with preprocessor definition CK_EXPERIMENTAL_BUILDER to keep it opt-in until the API stabilizes

- Add GetInstanceString() virtual method to BaseOperator, allowing runtime polymorphic access to compile-time kernel information

This infrastructure also enables upcoming higher-level semantic reflection abstractions (like ConvTraits) to query kernel configurations programmatically.

Includes unit tests validating both the trait extraction accuracy and the string generation format.
2025-10-21 21:10:19 -07:00
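The two-level design in the commit above (a primary template declaration plus kernel-specific specializations that extract parameters by pattern matching) can be sketched roughly as follows. Everything here is a toy: `ToyKernel`, `ToyInstanceTraits`, and the string format are hypothetical names chosen for illustration and do not reflect the real InstanceTraits interface or the output of GetInstanceString().

```cpp
#include <string>
#include <type_traits>

// Hypothetical stand-in for a device kernel. It is only forward-declared,
// mirroring the idea of avoiding a dependency on the implementation header.
template <typename DataType, int BlockSize, int MPerBlock>
struct ToyKernel;

// Primary template: declaration only, serving as the interface.
template <typename T>
struct ToyInstanceTraits;

// Kernel-specific partial specialization: pattern-matches the template
// arguments and exposes them as type aliases and compile-time constants.
template <typename DataType, int BlockSize, int MPerBlock>
struct ToyInstanceTraits<ToyKernel<DataType, BlockSize, MPerBlock>>
{
    using data_type = DataType;
    static constexpr int block_size  = BlockSize;
    static constexpr int m_per_block = MPerBlock;

    // Serializes all template parameters in declaration order.
    static std::string instance_string()
    {
        const std::string type_name = std::is_same_v<DataType, float> ? "fp32" : "other";
        return "ToyKernel<" + type_name + "," + std::to_string(BlockSize) + "," +
               std::to_string(MPerBlock) + ">";
    }
};
```

Supporting another kernel type in this sketch means adding another specialization, while the primary declaration stays unchanged.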
Bartłomiej Kocot
3a28632b20 Gridwise gemm conv v3 force padded layout on gfx950 (#2961)
* Gridwise gemm conv v3 force padded layout on gfx950

* fix bug in other gridwise

* fix

* Update gridwise_gemm_wmma_cshuffle_v3_common.hpp
2025-10-21 15:41:02 +02:00
Yashvardhan Agarwal
35754d2ec8 fix identity value of AbsMax (#3058)
* fix identity value of AbsMax

- Identity value of AbsMax should be 0, not numeric<T>::lowest()

* Update include/ck_tile/core/utility/reduce_operator.hpp

resolved comment

Co-authored-by: Christopher Millette <63608002+cgmillette@users.noreply.github.com>

---------

Co-authored-by: Christopher Millette <63608002+cgmillette@users.noreply.github.com>
2025-10-21 14:42:08 +02:00
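To see why the identity matters in the AbsMax fix above, consider a minimal sketch. It assumes AbsMax combines two values as max(|a|, |b|); the functions `abs_max` and `reduce_abs_max` are hypothetical and are not the ck_tile implementation in reduce_operator.hpp.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>
#include <vector>

// Assumed definition for illustration: AbsMax(a, b) = max(|a|, |b|).
float abs_max(float a, float b) { return std::max(std::fabs(a), std::fabs(b)); }

// Folds the values with abs_max, starting from the given identity value.
float reduce_abs_max(const std::vector<float>& values, float identity)
{
    float acc = identity;
    for(const float v : values)
        acc = abs_max(acc, v);
    return acc;
}
```

Under this assumption, starting from 0 leaves every result unchanged because |x| >= 0, whereas starting from `std::numeric_limits<float>::lowest()` poisons the reduction: its absolute value is the largest finite float, which then dominates every input.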
Johannes Graner
4043401db1 Fix race conditions in ck_tile remod (#3061) 2025-10-21 09:35:04 +02:00
Max Podkorytov
ff6efa2fb1 refine 2025-10-20 23:13:58 -04:00