Commit Graph

82 Commits

Author SHA1 Message Date
Thomas Ning
99b4da18a2 Update tile_engine/ops/pooling/pool_profiler.hpp
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-02-19 11:18:18 -08:00
Thomas Ning
9e97c2dc40 Update tile_engine/ops/pooling/pool_instance_builder.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-02-19 11:17:53 -08:00
Aleksander Dudek
39ad7898fa [CK_TILE] Add pooling tests for ckTileEngine 2025-12-16 11:20:47 +00:00
Aleksander Dudek
ef33d16a23 [CK_TILE] fix formatting of pooling in ckTileEngine with clang-format part3 2025-12-16 10:51:57 +00:00
Aleksander Dudek
4c8dbd3847 Merge branch 'develop' into ckTileEnginePooling 2025-12-16 10:42:16 +00:00
Aleksander Dudek
1bccd37e06 [CK_TILE] fix formatting of pooling in ckTileEngine with clang-format 2025-12-16 10:40:41 +00:00
linqunAMD
6d7299ff78 [ck_tile] remove duplicate functions in ck_tile (#3311)
* [ck_tile] remove duplicated shuffle_b and shuffle_b_permuteN

* [ck_tile] move get_k_warp to gemm_shape

* resolve code rebase error
2025-12-15 07:13:00 -08:00
Aleksander Dudek
d5cef00770 Merge branch 'develop' into ckTileEnginePooling 2025-12-12 10:33:06 +00:00
Max Podkorytov
4011dbfec3 [CK-Tile] fixup codegen for tile engine ops gemm multid and gemm preshuffle (#3383)
* fixup gemm multi-d and preshuffle in tile engine codegen

---------

Co-authored-by: Thrupti Raj Lakshmana Gowda <thruptiraj.lakshmanagowda@amd.com>
2025-12-11 14:23:43 -08:00
Aleksander Dudek
f6d2243288 [CK_TILE] Add pooling to ckTileEngine part4 fix suppported configurations 2025-12-11 12:11:42 +00:00
Aviral Goel
6d25525adc feat(precommit-hooks): add check for correct copyright header (#3302)
* chore(copyright): update copyright header for left files

* feat(copyright): add copyright check to precommit hooks

* chore(copyright): update copyright header for include/ck_tile directory

* chore(copyright): update copyright header for example directory

* chore(copyright): update copyright header for .github directory

* refactor: copyright_check script with better if else handling

* chore(copyright): update compyright header for remaining files

* feat: add script to automate copyright addition
2025-12-10 22:50:43 -08:00
Aleksander Dudek
07c078d5ef [CK_TILE] Add pooling to ckTileEngine part3 2025-12-09 11:59:37 +00:00
Max Podkorytov
d184eed823 [CK-Tile] Refactor base pipeline usage (#3251)
* initial poc

* factor out common parts in operator()

* cv4

* rest of the universal gemm pipelines

* fix test

* remove boilerplate from tile engine

* fix example

* fix example

* format

* fix tests build for gemm

* remove base pipeline codegen from gemm instance builder

* unify v3 logic with the rest of universal gemm pipelines

* fix build for multi abd test

* fix test gemm multi d

* fix build for weight preshuffle

* fix grouped gemm test

* fix grouped gemm multi d test

* fix grouped gemm preshuffle

* fix grouped gemm example except for quant

* fix gemm preshuffle

* fix splitk 2 stage example

* fix batched gemm example

* fix multid example

* fix multiabd example

* fix batched gemm test

* fixup

* fix examples build

* fix grouped gemm test build

* fix smoke builder
2025-12-04 11:45:49 -08:00
arai713
583fafc803 [CK_TILE] Fix for Moving DataTypeTraits into a Common File (#3335)
This PR fixes a mismatch caused when PR #3146 was merged out of sync with develop, which made its intended changes ineffective. This PR reapplies those changes to move DataTypeTraits into a common file to mitigate code duplication.

Co-authored-by: Thomas Ning <Thomas.Ning@amd.com>
2025-12-03 22:46:22 -08:00
Aviral Goel
004784ef98 chore(copyright) update library wide CMakeLists.txt copyright header template (#3313)
* chore(copyright) update library wide CMakeLists.txt files copyright header template

* Fix build

---------

Co-authored-by: Sami Remes <samremes@amd.com>
2025-11-28 13:49:54 -08:00
Aleksander Dudek
990f13229f [CK_TILE] Add pooling to ckTileEngine part2 2025-11-28 08:56:17 +00:00
Cong Ma
30727c48fc Tile engine for streamk (#3157)
* [CK TILE STREAMK] Introduce initial support for tile engine in streamk GEMM.

- This commit lays the groundwork for integrating the tile engine into streamk GEMM.
  It focuses on creating benchmark executables for streamk GEMM.
- Additional scripts like test_benchmark.sh and gemm_benchmark.py will be added once
  the streamk implementation reaches stability.

* [CK TILE STREAMK] Enable CI to execute tile engine benchmarks for StreamK GEMM

* [CK TILE STREAMK] Refactor: Extract common utility functions.

* [CK TILE STREAMK] Revise tile engine of streamk to align with the updated implementation

* Add pre-commit

* [CK TILE STREAMK] Add 'dp_persistent' and 'reduction_strategy' in output of CK TILE STREAMK

* [CK TILE STREAMK] Fix a bug about value of 'dp_persistent' of CK TILE STREAMK

* [CK TILE STREAMK] Update Jenkinsfile

* [CK TILE Engine] Update StreamK tile engine help message

Remove default value messages as they are automatically printed

* [CK TILE Engine] Update StreamK tile engine

- Remove namespace reboot

* [CK TILE Engine] Update StreamK tile engine

- Fix merge error
2025-11-27 15:49:57 -07:00
arai713
24d88d2472 [CK_TILE] Move DataTypeTraits into a Common File (#3146)
This renames the typeToStr struct in the common utilities to DataTypeTraits and removes all duplication of DataTypeTraits across files in CK Tile.

Co-authored-by: Christopher Millette <63608002+cgmillette@users.noreply.github.com>
2025-11-27 09:09:54 -08:00
Aleksander Dudek
99f5c2fcf7 [CK_TILE] Add pooling to ckTileEngine part1 2025-11-27 11:31:53 +00:00
Max Podkorytov
79aae7c7f7 [CK Tile] enable building examples by default (#3259)
* remove EXCLUDE_FROM_ALL from ck-tile examples
-> +15 min build time w/ 64 threads for a single arch

* fix cpp17 compile error in the ck-tile examples

---------

Co-authored-by: khuagarw <khuagarw@amd.com>
Co-authored-by: Ding, Yi <yi.ding@amd.com>
2025-11-26 16:24:44 -08:00
Aviral Goel
22a934a229 chore(copyright): update copyright header for include directory (#3219)
* chore(copyright): update copyright header for tile_engine directory

* chore(copyright): update copyright header for script directory

* chore(copyright): update copyright header for test_data directory

* chore(copyright): update copyright header for python directory

* chore(copyright): update copyright header for profiler directory

* chore(copyright): update copyright header for library directory
2025-11-17 08:57:45 -08:00
Thrupti Raj Lakshmana Gowda
9af30f04b6 Ck tile engine commons (#3166)
* Moving Preshuffle to commons

* Fixing Common Validations

* Addressing Review Comments

* Partial Rebasing

* Partial Rebasing

* Partial Rebasing

* Rebasing Complete
2025-11-13 00:56:18 -06:00
Aviral Goel
88e3212fcc chore(copyright): update copyright header for tile_engine directory (#3180) 2025-11-11 08:17:24 -08:00
Aviral Goel
73f637894d refactor: remove gemm preshuffle pipeline v1 by removing all references from codebase (#3132)
* test: temporarily disable flaky test_ck_tile_moe_sorting_2d_buf

* refactor: deprecate gemm preshuffle pipeline v1 by removing all references from codebase

* Revert "test: temporarily disable flaky test_ck_tile_moe_sorting_2d_buf"

This reverts commit 573c08a085.
2025-11-02 00:06:28 -04:00
Thrupti Raj Lakshmana Gowda
a33d98f8e2 [CK TILE ENGINE] GEMM Multi D Restructure (#3121)
* Renaming old code

* Adding GEMM code with new Architecture

* Partial Progress : Errors

* Partial Progress : Working code

* Changes to element wise function

* Removing Debugging statements

* Working GEMM Multi D code

* Removing Stale Code

* Address Copilot review comments

* Address Copilot review comments

* Changes to validation file

* Changes to common code snippets

* Creating common folder

* Removing duplicate files

* Pointing to right common file

* Pointing to right common file

* Pointing to right common file

* Changing to VERBOSE

* Changing CMAKE messages to verbose

* Updating Cmake with right layout datatype configs

* Working code for GEMM Multi D
2025-10-31 12:02:46 -07:00
John Afaganis
3f996ee738 Add copyright notices to missing files (#3133) 2025-10-31 07:35:11 -07:00
Thrupti Raj Lakshmana Gowda
7fc0a38e90 Ck tile engine gemm (#2982)
* Partial Progress : CK Tile Engine GEMM

* Partial Progress : CK Tile Engine GEMM

* Partial Progress : Working GEMM Code

* Partial Progress : Working GEMM Code

* Changinf jenkins to remove preshuffle

* Partial Progress : CK TILE ENGINE GEMM Debugging

* Partial Progress : Removing changes that are not GEMM

* Partial Progress : Validation of full block size in GEMM

* Changes in Jenkins to run only fp16 and bf16

* Addressing Review Comments

* Partial Progress : Addressing CI issues

* Partial Progress - Runing GEMM for fp16,bf16 and rcr

* Clang

* Adding fp8 and bf8

* Adding fp8 and bf8

* Adding additional architrcture

* Limited datatypes and layouts

* Adding k_block_per_cu in test config

* Changes to faling CI errors

* Changes to faling CI errors

* Validation for GEMM

* Adding Layout support

* Adding Validations

* Adding layout in jenkins

* Update on Jenkins

* Distribution validation for GEMM

* Resolving merge conflicts

* Solving merge conflicts
2025-10-27 21:11:13 -05:00
Thrupti Raj Lakshmana Gowda
8b185e872e Ck tile engine preshuffle (#2919)
* Partial Progress : Preshuffle working code for datatype

* Partial Progress : Preshuffle Cleanup

* Working code for default config with min max step

* Partial Progress : PermuteN implemented in validation

* Partial Progress : PermuteN changes in Preshuffle

* CK Tile Engine Preshuffle Complete

* CK TILE ENGINE : Preshuffle Layout validation

* CK Tile Engine Preshuffle Validation

* Preshuffle Validation check

* CK Tile Engine Preshuffle : Fixing Validation Cases

* Addressing PR review Comments

* Changes in config

* Addressing Review Comments

* Adding additional architecture in Jenkins

* Partial Progress : Selective Datatype and layouts

* Limited datatypes and layouts

* Addressing CI errors

* Datatype updates

* Datatype updates

* Datatype changes to Preshuffle

* Addressing Review Comments

* Addressing Review Comments

* Datatype changes

* Changes to Cmake

* Update on Jenkins

* Formatting with precommit

* Ruff Formatting
2025-10-27 09:15:34 -05:00
Thrupti Raj Lakshmana Gowda
0fd7d1a607 Excluding Tile engine from build (#3085) 2025-10-23 12:57:18 -07:00
Thrupti Raj Lakshmana Gowda
9f77061094 [CK TILE ENGINE] Code changes to finding GPU id from TARGET (#3055)
* Reading gpuname from target for gemm in ck tile engine

* Reading gpuname from target for gemm preshuffle in ck tile engine

* Reading gpuname from target for gemm preshuffle in ck tile engine

* Get GPU changes for GEMM Muti D in TILE ENGINE

* Addressing errors for gpu name in cktileengine
2025-10-20 09:02:18 -07:00
Johannes Graner
d40b50b9d5 Update pre-commit to fixed versions, run remod for ck_tile (#2895)
* Fix ruff linter errors

* Fix remod dos2unix command

* Clang format

* Ignore utility in remod

* Run remod

* Specify clang-format version in pre-commit

* Specify ruff version

* Include PoolKernelArgs in reference_pool

* Add calculate_total_elements to reference batched contraction

* Fix calculate_total_elements declaration

* Refactor remod pre-commit hook

* Fix Aquant tests

---------

Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>
2025-10-16 15:29:17 -07:00
msaffari-amd
589e242eda Fix: Handle JSON boolean values (pad_m, pad_n, pad_k and persistent) in gemm_instance_builder (#3008) 2025-10-14 13:20:25 +02:00
damien-lejeune
46c10c316d Update include path to break the remod's cyclic dep issue (#2978)
* Update include path to break the cyclic dep issue

* Use ck_tile::permute_vectors_i4x4_b in tile engine

---------

Co-authored-by: Damien Lejeune <damien.lejeune@amd.com>
Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>
2025-10-13 13:24:47 +02:00
pmaybank
592d73ad73 [CK_TILE] Add support for gfx12 in tile_engine for GEMM benchmarking (#2802)
* initial work on adding support of gfx12 in tile_engine for GEMM benchmarking
* add stage("Run TILE_ENGINE_GEMM Tests on gfx1201") to Jenkins config
* make tile_[m/n/k] validation arch dependent
2025-09-17 17:59:01 +01:00
Thrupti Raj Lakshmana Gowda
7d7ded62d3 [CK Tile Engine] k_block_per_cu changes in Preshuffle (#2842)
* kperblock changes in CK Tile Engine Preshuffle

* Config file formatting changes
2025-09-15 13:22:11 -07:00
Thrupti Raj Lakshmana Gowda
f6ba94fb5c [CK TILE ENGINE] Adding GEMM Preshuffle to CK Tile Engine (#2712)
* Partial Progress : Completed ListBlob

* Additional changes in Listbob

* Partial Progress : Generate Blobs Completed

* Partial Progress : Added Host side code for Preshuffle

* Working code for Preshuffle before Cleanup

* Partial Progress : Cleanup

* Partial Progress : Datatype Validation

* Partial Progress : Warptiles for preshuffle changed from hardcoding to take from config

* Partial Progress : Cleanup

* Partial Progress : Code Cleanup

* Partial Progress : Passing all valid tiles failing for unsupported tiles

* Partial Progress : Working code, testing pending for edge cases

* Partial Progress for testing

* Completed Code

* kBlockPerCu as tunable parameter from config

* Update tile_engine/ops/gemm_preshuffle/README.md

Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>

* Update tile_engine/ops/gemm_preshuffle/README.md

Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>

* Update tile_engine/ops/gemm_preshuffle/README.md

Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>

* Update tile_engine/ops/gemm_preshuffle/README.md

Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>

* Update tile_engine/ops/gemm_preshuffle/README.md

Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>

* Update tile_engine/ops/gemm_preshuffle/README.md

Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>

* Update tile_engine/ops/gemm_preshuffle/README.md

Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>

* Update tile_engine/ops/gemm_preshuffle/README.md

Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>

* Update tile_engine/ops/gemm_preshuffle/README.md

Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>

* Update tile_engine/ops/gemm_preshuffle/README.md

Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>

* Update tile_engine/ops/gemm_preshuffle/README.md

Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>

* Update tile_engine/ops/gemm_preshuffle/README.md

Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>

* Update tile_engine/ops/gemm_preshuffle/README.md

Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>

* Update tile_engine/ops/gemm_preshuffle/README.md

Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>

* Partial Progress : Working listkernels

* Partial Progress : Cleanup Working listkernels

* Partial Progress : Single instance

* Partial Progress : Working single instance code

* Partial Progress : Working generate individual instance code

* Partial Progress : Working rewamped code for given config file needed validation and edge case testing

* Partial Progress : Working Code, testing pending

* Removing LOGS file

* Working code

* Minor changes to GEMM Preshuffle : Restructured

* Minor Changes in Preshuffle

* Changes to Jenkins File

* Changes to Jenkins file to consider new architecture

* Changes to Jenkins file for fixing CI

---------

Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>
2025-09-12 11:50:19 -07:00
Vidyasagar Ananthan
0e322200e5 Fixing python backward compatibility issue in benchmarking script. 2025-09-02 12:33:30 -04:00
Thomas Ning
705804d9bf Restructure the Tile Engine to have faster build time and clear config report (#2747)
* Making edits to identify individual compilation issues.

* Minor fix for blob txt files not being created.

* Fixing compilation issues.

* Fixing ordering bug.

* Adding python profiling functionality.

* Setting individual build as default.

* Setting gpu target filtering for tile engine to gfx90a, gfx942 and gfx950.

* update the default running parameters and settings

* Fixing bug with benchmarking, shifting file generation to build instead of config.

* Updating fixes.

* Fixing json output and parsing.

* Disable ccache for tile engine gemm ops because we dont need it.

* Removing duplicate type definition.

* Improving json printing.

* Add the flexibility of different layout and more warp tile support

* Fix extra flag in name of individual kernels.

* Fixing bug with booleans.

* Solve the first patch of the post merge conflict

* Compilation fixes, and cosmetic improvements.

* Yet again compilation fixes after latest changes from develop.

* Fixing python benchmarking script.

---------

Co-authored-by: Vidyasagar Ananthan <vidyasagar.ananthan@amd.com>
Co-authored-by: Vidyasagar Ananthan <vanantha@amd.com>
2025-08-30 06:54:18 -07:00
Mateusz Ozga
0758883fa4 [CK-TILE] Default2DEpilogue, example and adding nullptr_t type for D (#2752)
* Init commit

* Quick fix, CI fails

* Remove CDElementWise

* Add CDEELementWise

---------

Co-authored-by: Thomas Ning <Thomas.Ning@amd.com>
2025-08-28 12:45:50 -07:00
John Afaganis
508e7912f9 Revert "[CK-TILE] Default epilogue, adding support for D (#2629)" (#2746)
This reverts commit d43228fbca.
2025-08-26 09:48:49 -07:00
Mateusz Ozga
d43228fbca [CK-TILE] Default epilogue, adding support for D (#2629)
* Extend 2d-epilogue, D support

* Added tests & update

* Remove unused attribute

* Extend tests

---------

Co-authored-by: Thomas Ning <Thomas.Ning@amd.com>
2025-08-25 19:29:35 -07:00
Vidyasagar Ananthan
bf3e719c16 Setting gpu target filtering for tile engine to gfx90a, gfx942 and gfx950. (#2709) 2025-08-19 18:12:06 -07:00
Max Podkorytov
f38751fc2a invoke script directly (#2687) 2025-08-19 00:23:07 -07:00
linqunAMD
9fcc1ee9fd Support Wave32 in CK_TILE - Part 1 (#2594)
* Support wave32/wave64 in CK_TILE - Part 1

* remove blocksize in kernel launch

* fix build error

* fix clang format

* fix clang format 2

* fix clang format 3

* fix fmha build error

* fix fmha build 2

* fix fmha build 3

* fix build error 4

* address review comment

* update change log

* replace KernelBlockSize with kBlockSize

* fix CI fail

* fix clang format

* address review comment and rebase code.

* fix universal test fail

---------

Co-authored-by: Lin, Qun <Quentin.Lin+amdeng@amd.com>
Co-authored-by: Thomas Ning <Thomas.Ning@amd.com>
2025-08-18 10:08:31 -07:00
Thrupti Raj Lakshmana Gowda
3f57ec3d2d GEMM Multi D for CK Tile Engine (#2660)
* Readme for GEMM Multi D

* GEMM Multi D partial Progress

* GEMM Multi D partial Progress!

* CK Tile Engine GEMM Multi D : All Python files generated

* Partial Progress

* Partial Progress

* Partial Progress

* Partial Progress : Incorrect Result

* Partial Progress : Debugging

* Partial Progress : Correct Results

* Partial Progress - Incorrect Results

* Partial Progress - Commenting Passthrough bypass logic

* Changing Passthrough to MultiplyMultiply

* Correct Results!

* Fix and debug the pass through feature

* Sample commit

* Correct Results : MultiplyMultiply

* Code Cleanup

* Removing Failed Instances

* Working code before Unary element support

* Custom Elementwise Function support and working implementation for Mul and Add

* Updating README

* Working for Passthrough

* Review Comments : Minor Fixes

* Review Comments : Minor Fixes

* Readme Updated

* Partial Changes after Rebase

* Working Code : Changes after Rebase

* Updating Jenkins file

* Removing default value changed while testing

* Configuration changes in config files

* Tile Handler changes in GEMM Multi D Tile Engine

* Tile Handler changes in GEMM Multi D Example

* Change log for Gemm Multi D in CK Tile Engine

* Configuration changes in config files

---------

Co-authored-by: ThomasNing <thomasning@amd.com>
2025-08-12 16:05:05 -07:00
Sami Remes
3c9400471d [CK_TILE] Enable persistent kernel and tail handler in tile_engine (#2300)
* Enable persistent kernel in tile_engine and use tail handler

* Fix formatting

* Add persistent to default_config.json

* Remove extra newlines and add persistent also to user config

* Reduce instances from default_config.json

* add persistent to benchmark.json and custom_ci_config.json

* changed the config file to have few instances

---------

Co-authored-by: Thomas Ning <Thomas.Ning@amd.com>
Co-authored-by: ThomasNing <thomasning@amd.com>
2025-08-07 16:03:49 -07:00
Illia Silin
833ae1d051 Revert "Reduce build time tile engine (#2579)" (#2623)
This reverts commit e5b79b26fa.
2025-08-05 09:27:55 -07:00
Thomas Ning
e5b79b26fa Reduce build time tile engine (#2579)
* Modify CMakeLists to allow for splitting.

* Modify CMakeLists for data and layout logic.

* Run tests and get build artifact.

* Test new Cmakelists for speedup.

* Further improvements for speedup.

* turn off the FMHA

* turn off the automatic tile engine gemm

* minor fix

* disable the transpose test first

* Address the comment

* Jenkinsfile

* change the make thread to 64

* change the compile thread to 32

* Try to use with less OS memory space

* Have the Unity build batch size to 2

* reduce the chunk size

---------

Co-authored-by: Vidyasagar Ananthan <vidyasagar.ananthan@amd.com>
2025-08-01 14:42:33 -07:00
Khushbu Agarwal
88d72178d6 [CK_Tile] Updating gpu timer when doing flush cache (#2593)
* Missed updating function names in example

* updating timer

* code cleanup

* addressing review comments

* updating tile_engine code

* addressing review comments
2025-07-31 16:43:33 -07:00
Khushbu Agarwal
61e21f5567 Update to gpu_timer for rotating_buffer (#2524)
* update gpu_timer for rotating buffer as hipblasLt's implementation

* timing fix

* Updating gpu timer for old ck as well

* Revert "Updating gpu timer for old ck as well"

This reverts commit 958cd1bc99.

* code clean up with runtime argument; function rename

* code cleanup

* general timer fixes

* bug fix

* clang formatted

* addressing reveiew comments

* clang formatted

* Addressing review comments

* CI fix

---------

Co-authored-by: Po Yen Chen <PoYen.Chen@amd.com>
2025-07-29 15:21:05 -07:00