Po Yen Chen
cf2d635ea2
[CK_TILE] Fix incorrect computation of group mode PagedAttention (#1688)
* Allow getting batch size from splitkv tile partitioner
* Fix wrong paged-kvcache impl for group mode
* Fix wrong example code for paged-kvcache
* Undo changes in fmha_fwd.cpp
* Always use 2D block table
* Add is_gappy kernel argument for paged-kvcache
  The is_gappy argument is used for differentiating seqstart_k_ptr usage
  in flash-attention & xformers
* Remove out-of-date comments
* Remove no-longer-used method
* Fix wrong calculation of the number of page-blocks
* Fix wrong comment
---------
Co-authored-by: Qianfeng <qianfeng.zhang@amd.com>
2024-11-26 20:37:54 +08:00