Mirror of https://github.com/ROCm/composable_kernel.git (synced 2026-04-20 06:49:15 +00:00)
[CK_TILE][FMHA] Enable gpt-oss sink (#3490)
* Enable gptoss sink
Signed-off-by: Linjun-AMD <Jun.Lin@amd.com>
* Update include/ck_tile/ops/fmha/pipeline/block_fmha_fwd_splitkv_pipeline_qr_ks_vs.hpp
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Update include/ck_tile/ops/fmha/pipeline/block_fmha_fwd_splitkv_pipeline_qr_ks_vs.hpp
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* add gptoss sink test
Signed-off-by: Linjun-AMD <Jun.Lin@amd.com>
* update CHANGELOG.md
Signed-off-by: Linjun-AMD <Jun.Lin@amd.com>
* fix test args error
Signed-off-by: Linjun-AMD <Jun.Lin@amd.com>
* Update test_fmha_fwd.cpp
* update sink test
Signed-off-by: Linjun-AMD <Jun.Lin@amd.com>
* Revert "update sink test"
This reverts commit 970b4f1686.
* update sink test
Signed-off-by: Linjun-AMD <Jun.Lin@amd.com>
* update valid sink_v in splitkv pipeline
Signed-off-by: Linjun-AMD <Jun.Lin@amd.com>
* Update block_fmha_batch_prefill_pipeline_qr_ks_vs_async.hpp
* Update example_fmha_fwd.cpp
* fix lse error
Signed-off-by: Linjun-AMD <Jun.Lin@amd.com>
* fix clangformat error
Signed-off-by: Linjun-AMD <Jun.Lin@amd.com>
* fix aiter scale error
Signed-off-by: Linjun-AMD <Jun.Lin@amd.com>
* Update block_fmha_pipeline_qr_ks_vs.hpp
* div scale_s for sink_value
Signed-off-by: Linjun-AMD <Jun.Lin@amd.com>
* Update fmha_fwd_runner.hpp
* update sink_value with bias
Signed-off-by: Linjun-AMD <Jun.Lin@amd.com>
* Update block_fmha_batch_prefill_pipeline_qr_ks_vs_async.hpp
* Fix typo in dropout parameter in fmha_batch_prefill_kernel
* Update block_fmha_batch_prefill_pipeline_qr_ks_vs_async.hpp
* Update example_fmha_fwd.cpp
* Update include/ck_tile/ops/fmha/pipeline/block_fmha_pipeline_qr_ks_vs_async_trload.hpp
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Update include/ck_tile/ops/fmha/pipeline/block_fmha_fwd_splitkv_pipeline_nwarp_sshuffle_qr_ks_vs.hpp
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* optimized some code
Signed-off-by: Linjun-AMD <Jun.Lin@amd.com>
* fix splitkv error
Signed-off-by: Linjun-AMD <Jun.Lin@amd.com>
* update sink reference
Signed-off-by: Linjun-AMD <Jun.Lin@amd.com>
* Update fmha_fwd_runner.hpp
* Update smoke_test_fwd_sink.sh
---------
Signed-off-by: Linjun-AMD <Jun.Lin@amd.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Po Yen Chen <PoYen.Chen@amd.com>
@@ -84,3 +84,10 @@ $EXE -prec=fp16 -mode=1 -b=1 -h=1 -d=128 -d_v=128 -s=16384 -s_k=16384 -bias=n -l
# 1 1 1 1 1 1 1 1 1 1
# l=2/r=0(br) l=2/r=0/s=2(br)
$EXE -prec=fp16 -mode=0 -b=1 -h=1 -d=128 -d_v=128 -s=512 -s_k=512 -bias=n -lse=0 -iperm=0 -operm=0 -vlayout=r -kname=1 -v=1 -warmup=0 -repeat=1 -init_sink=1 -mask=1
$EXE -prec=fp16 -mode=0 -b=1 -h=1 -d=128 -d_v=128 -s=1024 -s_k=1024 -bias=n -lse=0 -iperm=0 -operm=0 -vlayout=r -kname=1 -v=1 -warmup=0 -repeat=1 -init_sink=1 -mask=0
$EXE -prec=fp16 -mode=0 -b=1 -h=1 -d=128 -d_v=128 -s=4096 -s_k=4096 -bias=n -lse=0 -iperm=0 -operm=0 -vlayout=r -page_block_size=128 -cache_batch_idx=0 -kname=1 -v=1 -warmup=0 -repeat=1 -init_sink=1
$EXE -prec=fp16 -mode=1 -b=1 -h=1 -d=128 -d_v=128 -s=8192 -s_k=8192 -bias=n -lse=0 -iperm=0 -operm=0 -vlayout=r -page_block_size=128 -cache_batch_idx=0 -kname=1 -v=1 -warmup=0 -repeat=1 -init_sink=1 -mask=1