Files
composable_kernel/example/ck_tile/01_fmha/codegen
damien-lejeune 5bebfd460f [rocm-libraries] ROCm/rocm-libraries#8492 (commit 46b6a06)
Add tile size for FMHA batch prefill bf16 for MI308X

## Motivation

Adding a tile size adapted to MI308X, for the FMHA Batch Prefill BF16
input type case

## Technical Details

N/A

## Test Plan

Benchmarking from the Aiter side with:

```
python3 op_tests/test_batch_prefill.py  -s 8000 -p 1 -q 4 -k 1 --head_dim 256 -c true -d bf16 --input_dtype bf16 --quant_method none --kv_layout linear -t sglang -l 0.0 --return_lse false --profile
```

## Test Result

We see an improvement with the new tile size on MI308X (both with PLT
mode OFF and ON)

## Submission Checklist

- [X] Look over the contributing guidelines at
https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.

Co-authored-by: Damien Lejeune <damien.lejeune@amd.com>
2026-06-17 06:22:26 +00:00
..