1423 Commits

Author SHA1 Message Date
PoYen, Chen
47a74f282d Extract Q/Knew vector size to helper methods 2024-07-24 03:23:18 +00:00
PoYen, Chen
eb4ea3ac2a Fix wrong rotary_cos/rotary_sin memory size for Q 2024-07-23 16:22:25 +00:00
PoYen, Chen
85bac93951 Fix wrong index into knew_host/vnew_host 2024-07-23 15:31:15 +00:00
PoYen, Chen
b11f92dc4c Fix wrong shape of knew_host/vnew_host 2024-07-23 14:52:42 +00:00
PoYen, Chen
ca4b208b60 Fix wrong grid size 2024-07-23 14:20:52 +00:00
PoYen, Chen
52b47810bb Rename more tile size constants 2024-07-23 09:30:05 +00:00
PoYen, Chen
99c1d463de Align naming of some tile size constants 2024-07-23 09:24:38 +00:00
PoYen, Chen
ce5e0f1d67 Re-order parameters 2024-07-23 09:02:41 +00:00
PoYen, Chen
fb80c7b2cb Extract rotary embedding logic out 2024-07-23 08:51:59 +00:00
PoYen, Chen
2192bbc68a Rename RotaryEmbeddingEnum 2024-07-23 07:50:50 +00:00
PoYen, Chen
d4606cf3c3 Rename header 2024-07-23 07:45:25 +00:00
PoYen, Chen
b275732128 Remove always true static_assert() 2024-07-23 07:25:50 +00:00
PoYen, Chen
eb649a2f25 Move thread locating logics into policy 2024-07-23 07:21:20 +00:00
PoYen, Chen
0e5cb6f913 Skip code if # of block is more than needed 2024-07-23 06:53:24 +00:00
PoYen, Chen
7124f3eda5 Add make_tile_window() for adding distribution only 2024-07-23 06:52:38 +00:00
PoYen, Chen
0925c0e941 Use better naming for tile indices 2024-07-23 06:40:53 +00:00
PoYen, Chen
bc7c7ee0c5 Fix wrong knew/vnew appending positions 2024-07-23 04:46:53 +00:00
PoYen, Chen
56df4d6397 Remove debug print code in kernel 2024-07-23 04:01:55 +00:00
PoYen, Chen
48c70720b5 Apply RoPE to q_tile 2024-07-23 03:54:11 +00:00
PoYen, Chen
e88253a2f4 Add code blocks for q_tile 2024-07-23 03:28:40 +00:00
PoYen, Chen
1dbed18555 Remove constness from q_ptr 2024-07-23 03:11:31 +00:00
PoYen, Chen
c26c60db4c Unify parameter/variable naming style 2024-07-23 02:59:17 +00:00
PoYen, Chen
c0bc097758 Apply elementwise function to the loaded tiles 2024-07-23 02:50:07 +00:00
PoYen, Chen
df352f955a Add comment 2024-07-23 02:31:45 +00:00
PoYen, Chen
d1ecfdc700 Support 8x rotary_dim under half-rotated RoPE 2024-07-23 02:19:16 +00:00
PoYen, Chen
631f29d527 Handle RoPE half-rotated logics 2024-07-22 08:50:03 +00:00
PoYen, Chen
1136e6b560 Fix error in RoPE host reference 2024-07-22 08:39:09 +00:00
PoYen, Chen
01865d2ae4 Clean-up pipeline 2024-07-22 03:14:10 +00:00
PoYen, Chen
fffd6799e6 Instantiate multiple kernels for RoPE approaches 2024-07-20 02:28:21 +00:00
PoYen, Chen
27b5141706 Fix wrong thread starting offset 2024-07-18 20:02:06 +00:00
PoYen, Chen
23450526c0 Only apply interleaved RoPE on Knew for now 2024-07-18 19:42:14 +00:00
PoYen, Chen
85bfed07fa Add dram distribution for rotary_cos/rotary_sin (interleaved) 2024-07-18 09:11:22 +00:00
PoYen, Chen
39ef09bb23 Remove unused inner namespace 2024-07-18 09:10:51 +00:00
PoYen, Chen
e83c3c7fa0 Add constraint to the rotary_dim option 2024-07-16 06:54:37 +00:00
PoYen, Chen
99f863e4cd Fix rotary cos/sin tensor/tile size 2024-07-16 06:31:17 +00:00
PoYen, Chen
b32fd8d3f4 Rename variables used in distribution encoding 2024-07-16 06:27:28 +00:00
PoYen, Chen
879710a495 Fix wrong seqlen_k for kvcache 2024-07-16 03:42:51 +00:00
PoYen, Chen
65dac9fb90 Fix wrong boundaries 2024-07-15 01:42:53 +00:00
PoYen, Chen
4e01307e04 Fix compilation error in debug mode 2024-07-15 01:26:46 +00:00
PoYen, Chen
1a093f94b2 Add minimum seqlen_k to generate compliance kvcache 2024-07-15 01:11:16 +00:00
PoYen, Chen
57c6a4125c Fix seqlen_knew enabling check logic 2024-07-15 00:40:39 +00:00
PoYen, Chen
ad61d9d4b2 Randomly generate seqlen_knew if needed 2024-07-15 00:39:03 +00:00
PoYen, Chen
f6850aef29 Add compute data type alias for RoPE 2024-07-15 00:05:33 +00:00
PoYen, Chen
b0925bb7f6 Create Rotary Cos/Sin tile windows in kernel 2024-07-14 23:47:40 +00:00
PoYen, Chen
391210ed9e Pass RoPE kernel args 2024-07-14 23:18:32 +00:00
PoYen, Chen
b5ad1411b0 Merge branch 'feature/cond-add-splitkv' into feature/fmha-fwd-appendkv 2024-07-14 22:13:17 +00:00
PoYen, Chen
c6717bb300 Merge branch 'feature/cond-add-splitkv' of github.com:ROCm/composable_kernel into feature/cond-add-splitkv 2024-07-14 22:11:39 +00:00
PoYen, Chen
8c1647d778 Avoid invoking deprecated method 'find_module' 2024-07-14 22:10:30 +00:00
Po Yen Chen
5ce0fecf36 Merge branch 'develop' into feature/cond-add-splitkv 2024-07-15 05:48:51 +08:00
PoYen, Chen
55f55025ee Fix wrong tensor size 2024-07-14 15:40:56 +00:00