remove support for gfx940 and gfx941 targets (#1944)

* remove support for gfx940 and gfx941 targets

* update changelog
This commit is contained in:
Illia Silin
2025-03-05 11:07:33 -08:00
committed by GitHub
parent d378233924
commit 9b51c08bf7
28 changed files with 56 additions and 40 deletions


@@ -126,6 +126,6 @@ Note FA use bottom-right by default to express swa case, here we require you exp
 TBD
 ## FP8 experimental support
-As described in [this blog](https://blog.hippoml.com/8bit-hippoattention-up-to-3x-faster-compared-to-flashattentionv2-8f9def90b482), we have an experimental support for fp8 fmha kernels, you can evaluate the performance by setting the arg `-prec=fp8` to the `tile_example_fmha_fwd`, on a gfx940/941/942 machine and ROCm 6.0+.
+As described in [this blog](https://blog.hippoml.com/8bit-hippoattention-up-to-3x-faster-compared-to-flashattentionv2-8f9def90b482), we have an experimental support for fp8 fmha kernels, you can evaluate the performance by setting the arg `-prec=fp8` to the `tile_example_fmha_fwd`, on a gfx942 machine and ROCm 6.0+.
 Currently we only support `-vlayout=c`( `hdim*seqlen` for V matrix) and `-squant=1`(static quantization) with `hdim=128` for fp8 now. Full feature support will come later.
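Putting the flags from the diff above together, a sketch of the fp8 invocation might look like the following. This is an assumption, not part of the commit: the binary path is hypothetical (it depends on where you built the examples), and only the flag names (`-prec`, `-vlayout`, `-squant`) and the gfx942/ROCm 6.0+ requirement come from the documentation text itself. The command is printed rather than executed so the sketch is self-contained.

```shell
# Hypothetical fp8 fmha forward run (requires a gfx942 GPU and ROCm 6.0+
# at runtime). Flags per the README: fp8 precision, hdim*seqlen V layout,
# static quantization, hdim fixed at 128.
cmd="./bin/tile_example_fmha_fwd -prec=fp8 -vlayout=c -squant=1 -hdim=128"
echo "$cmd"
```

Note that after this commit, gfx940 and gfx941 are no longer valid targets for this path; gfx942 is the only supported CDNA3 target.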