2 Commits

Author SHA1 Message Date
damien-lejeune
b904c41e44 Update include path to break the remod's cyclic dep issue (#2978)
* Update include path to break the cyclic dep issue

* Use ck_tile::permute_vectors_i4x4_b in tile engine

---------

Co-authored-by: Damien Lejeune <damien.lejeune@amd.com>
Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>

[ROCm/composable_kernel commit: 46c10c316d]
2025-10-13 13:24:47 +02:00
Thomas Ning
2142e9efec Restructure the Tile Engine to have faster build time and clear config report (#2747)
* Making edits to identify individual compilation issues.

* Minor fix for blob txt files not being created.

* Fixing compilation issues.

* Fixing ordering bug.

* Adding python profiling functionality.

* Setting individual build as default.

* Setting gpu target filtering for tile engine to gfx90a, gfx942 and gfx950.

* Update the default running parameters and settings.

* Fixing bug with benchmarking, shifting file generation to build instead of config.

* Updating fixes.

* Fixing json output and parsing.

* Disable ccache for tile engine gemm ops because we don't need it.

* Removing duplicate type definition.

* Improving json printing.

* Add flexibility for different layouts and support for more warp tiles.

* Fix extra flag in name of individual kernels.

* Fixing bug with booleans.

* Resolve the first patch of the post-merge conflict.

* Compilation fixes and cosmetic improvements.

* Yet again compilation fixes after latest changes from develop.

* Fixing python benchmarking script.

---------

Co-authored-by: Vidyasagar Ananthan <vidyasagar.ananthan@amd.com>
Co-authored-by: Vidyasagar Ananthan <vanantha@amd.com>

[ROCm/composable_kernel commit: 705804d9bf]
2025-08-30 06:54:18 -07:00