linqunAMD
|
9fcc1ee9fd
|
Support Wave32 in CK_TILE - Part 1 (#2594)
* Support wave32/wave64 in CK_TILE - Part 1
* remove blocksize in kernel launch
* fix build error
* fix clang format
* fix clang format 2
* fix clang format 3
* fix fmha build error
* fix fmha build 2
* fix fmha build 3
* fix build error 4
* address review comment
* update change log
* replace KernelBlockSize with kBlockSize
* fix CI fail
* fix clang format
* address review comment and rebase code.
* fix universal test fail
---------
Co-authored-by: Lin, Qun <Quentin.Lin+amdeng@amd.com>
Co-authored-by: Thomas Ning <Thomas.Ning@amd.com>
|
2025-08-18 10:08:31 -07:00 |
|
Illia Silin
|
504b101da3
|
upgrade from clang-format-12 to clang-format-18 (#2568)
* upgrade to clang-format-18
* update to clang-format-18 in pre-commit-config
|
2025-07-28 11:34:07 -07:00 |
|
carlushuang
|
6df5fe2ad8
|
[CK_TILE] naive attn support FP8 KVCache quant (#1747)
* quant
* fix bug
* simple smoothquant after softmax
* update kv-quant
* update stride
* fix fp8-pertoken-kvcache
* update int8/fp8 quant support
---------
Co-authored-by: so <a.com>
Co-authored-by: Po Yen Chen <PoYen.Chen@amd.com>
|
2025-01-03 18:43:07 +08:00 |
|
carlushuang
|
77a38e0211
|
[CK_TILE] naive attn (#1708)
* add reference attention fwd
* refactor addresser
* update
* paged, and i8 reflect-quant
* lets call it forward-quant
* fix error in decode variation
* update naive-attn
* fix page table
* fix build err
|
2024-12-12 11:54:03 +08:00 |
|