Add tile shape for FMHA batch prefill on MI308X (on fp8,
hdim=256) (#8350)
## Motivation
Add a tile shape suited to FMHA batch prefill with fp8 and hdim=256 on MI308X.
## Technical Details
Append the tile shape to the existing factory so that Aiter can pick it
up.
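
As a rough illustration of what "appending to the factory" means, here is a minimal Python sketch. It is not the code in this change: `TileShape`, `TILE_FACTORY`, `register_tile`, `candidate_tiles`, and all numeric tile dimensions are hypothetical placeholders chosen for the example. The point is only that the new shape is added alongside the existing entries rather than replacing them, so other targets keep their current candidates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TileShape:
    # Hypothetical fields; the real tile description has more parameters.
    block_m: int
    block_n: int
    block_k: int


# Hypothetical factory: maps (dtype, hdim) to a list of candidate tile shapes.
# The numbers below are placeholders, not the values used in this change.
TILE_FACTORY = {
    ("fp8", 256): [TileShape(128, 128, 32)],
}


def register_tile(dtype, hdim, tile):
    """Append a tile shape for (dtype, hdim), keeping existing entries."""
    tiles = TILE_FACTORY.setdefault((dtype, hdim), [])
    if tile not in tiles:  # avoid duplicate candidates
        tiles.append(tile)


def candidate_tiles(dtype, hdim):
    """Return a copy of the candidate tile shapes a consumer could pick from."""
    return list(TILE_FACTORY.get((dtype, hdim), []))


# Append a second (placeholder) shape for fp8 / hdim=256.
register_tile("fp8", 256, TileShape(64, 128, 32))
```

Because the existing entry stays first in the list, a consumer that only looks at the original candidate sees no change, which is consistent with the new shape being purely additive.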
## Test Plan
Ran the performance test on both MI300X and MI308X.
## Test Result
MI300X performance appears unaffected by this change, while MI308X performance improves.
## Submission Checklist
- [X] Look over the contributing guidelines at
https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
Co-authored-by: Damien Lejeune <damien.lejeune@amd.com>