Tile Engine support for gfx950 (#4592)

## Motivation

This PR adds support for the gfx950 GPU architecture to the Tile Engine
in the Composable Kernel library, focusing on GEMM operations with FP8
and BF8 data types.

## Technical Details

- Added gfx950-specific MFMA warp GEMM implementations with conditional
  compilation.
- Updated default GEMM configuration parameters for tile sizes and warp
  configurations.
- Added Jenkins CI pipeline stage for testing TILE_ENGINE_GEMM on gfx950
  hardware.

## Test Plan

The Tile Engine is itself a benchmarking utility, so CI exercises it
automatically; a passing CI run serves as the test.

## Test Result

The Tile Engine is itself a benchmarking utility, so it is tested
automatically whenever CI passes.

## Submission Checklist

- [x] Look over the contributing guidelines at
https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.

---------

Co-authored-by: Thrupti Raj Lakshmana Gowda <ThruptiRaj.LakshmanaGowda@amd.com>
Co-authored-by: Thomas Ning <Thomas.Ning@amd.com>
Thrupti Raj Lakshmana Gowda
2026-02-26 10:14:40 -06:00
committed by GitHub
parent 3a76dfd28f
commit f7c2b42170
4 changed files with 86 additions and 26 deletions

@@ -2,7 +2,7 @@
"tile_config": {
"tile_m": {
"values": [
- 64
+ 128
]
},
"tile_n": {
@@ -12,17 +12,17 @@
},
"tile_k": {
"values": [
- 192
+ 64
]
},
"warp_m": {
"values": [
- 2
+ 4
]
},
"warp_n": {
"values": [
- 2
+ 1
]
},
"warp_k": {
@@ -32,17 +32,17 @@
},
"warp_tile_m": {
"values": [
- 16
+ 32
]
},
"warp_tile_n": {
"values": [
- 16
+ 32
]
},
"warp_tile_k": {
"values": [
- 32
+ 64
]
}
},
@@ -59,8 +59,7 @@
},
"epilogue": {
"values": [
- "default",
- "cshuffle"
+ "default"
]
},
"pad_m": {
@@ -80,7 +79,7 @@
},
"persistent": {
"values": [
- true
+ false
]
}
},
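
To see how the new defaults in this config fit together, here is a minimal Python sketch that expands a `tile_config` block into concrete combinations and checks that each block tile dimension is a whole multiple of `warp_* × warp_tile_*`. This divisibility rule is a common sanity check for tiled GEMM configs and is an assumption here, not necessarily the Tile Engine's own validation. The `tile_n` and `warp_k` values are not visible in the diff, so the ones below are placeholders; the helper names `expand` and `divides_evenly` are hypothetical.

```python
import itertools
import json

# tile_m, tile_k, warp_m, warp_n and warp_tile_* are the new values from the diff.
# tile_n and warp_k are NOT shown in the diff; 128 and 1 are placeholder assumptions.
CONFIG_JSON = """
{
  "tile_config": {
    "tile_m": {"values": [128]},
    "tile_n": {"values": [128]},
    "tile_k": {"values": [64]},
    "warp_m": {"values": [4]},
    "warp_n": {"values": [1]},
    "warp_k": {"values": [1]},
    "warp_tile_m": {"values": [32]},
    "warp_tile_n": {"values": [32]},
    "warp_tile_k": {"values": [64]}
  }
}
"""


def expand(tile_config):
    """Return every combination of the per-parameter "values" lists."""
    keys = list(tile_config)
    value_lists = [tile_config[key]["values"] for key in keys]
    return [dict(zip(keys, combo)) for combo in itertools.product(*value_lists)]


def divides_evenly(cfg):
    """Check that each block tile is a whole number of warp-level tiles.

    Assumed rule: tile_d % (warp_d * warp_tile_d) == 0 for d in m, n, k.
    """
    return all(
        cfg[f"tile_{d}"] % (cfg[f"warp_{d}"] * cfg[f"warp_tile_{d}"]) == 0
        for d in "mnk"
    )


combos = expand(json.loads(CONFIG_JSON)["tile_config"])
valid = [cfg for cfg in combos if divides_evenly(cfg)]
print(f"{len(valid)} of {len(combos)} combinations pass the divisibility check")
```

Under this rule the new defaults are self-consistent along M and K: 4 warps × 32 rows covers `tile_m` = 128 exactly, and one 64-wide warp tile covers `tile_k` = 64. Mixing the old `tile_m` = 64 with the new warp layout would fail the check.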