Po Yen Chen
cf2d635ea2
[CK_TILE] Fix incorrect computation of group mode PagedAttention (#1688)
* Allow getting batch size from splitkv tile partitioner
* Fix wrong paged-kvcache impl for group mode
* Fix wrong example code for page-kvcache
* Undo changes in fmha_fwd.cpp
* Always use 2D block table
* Add is_gappy kernel argument for paged-kvcache
  The is_gappy argument is used for differentiating seqstart_k_ptr usage
  in flash-attention & xformers
* Remove out-of-date comments
* Remove no-longer used method
* Fix wrong # page-block calculation
* Fix wrong comment
---------
Co-authored-by: Qianfeng <qianfeng.zhang@amd.com>
2024-11-26 20:37:54 +08:00
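
The "2D block table" and "# page-block calculation" items above can be illustrated with a minimal sketch. This is not the CK_TILE implementation. It assumes a row-major block table of shape [batch][max_blocks_per_seq] and a fixed page_block_size, and the helper names num_page_blocks and physical_block are hypothetical, introduced only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a CK_TILE API): number of page blocks needed to
// hold seqlen_k key/value tokens when each page block stores
// page_block_size tokens. This is a ceiling division.
inline std::int32_t num_page_blocks(std::int32_t seqlen_k, std::int32_t page_block_size)
{
    assert(page_block_size > 0 && seqlen_k >= 0);
    return (seqlen_k + page_block_size - 1) / page_block_size;
}

// Hypothetical helper (not a CK_TILE API): map a logical token index of one
// batch entry to its physical page-block id. The block table is assumed to
// be a flattened, row-major 2D array of shape [batch][max_blocks_per_seq].
inline std::int32_t physical_block(const std::vector<std::int32_t>& block_table,
                                   std::int32_t max_blocks_per_seq,
                                   std::int32_t batch_idx,
                                   std::int32_t token_idx,
                                   std::int32_t page_block_size)
{
    assert(page_block_size > 0 && max_blocks_per_seq > 0);
    assert(batch_idx >= 0 && token_idx >= 0);
    const std::int32_t logical_block = token_idx / page_block_size;
    assert(logical_block < max_blocks_per_seq);
    const std::size_t flat = static_cast<std::size_t>(batch_idx) * max_blocks_per_seq + logical_block;
    assert(flat < block_table.size());
    return block_table[flat];
}
```

With page_block_size = 16, a sequence of 17 tokens needs 2 page blocks, and a plain truncating division would undercount it as 1.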