Commit Graph

104 Commits

Author SHA1 Message Date
PoYen, Chen
15d0034a64 Add paged-kv codegen logic for appendkv kernels 2024-08-07 04:19:45 +00:00
PoYen, Chen
db31475e07 Unify origin 2024-08-06 08:37:29 +00:00
PoYen, Chen
bd0d2f3975 Add batch_stride_k/batch_stride_v in group mode 2024-08-06 08:02:43 +00:00
PoYen, Chen
faf6b0e8ab Fix wrong origin for bias 2024-08-06 08:02:08 +00:00
PoYen, Chen
f9e2bafd10 Make sure we always start reading complete tile 2024-08-06 03:13:57 +00:00
PoYen, Chen
4fed268723 Move code after decide seqlen_q/seqlen_k 2024-08-06 01:39:49 +00:00
PoYen, Chen
77dac7775c Move V tile through TileWindowNavigator 2024-08-05 22:36:52 +00:00
PoYen, Chen
ab086bdb76 Simplify more make_tile_window() overloads 2024-08-05 22:16:24 +00:00
PoYen, Chen
bb78353264 Remove unnecessary data members 2024-08-05 21:52:59 +00:00
PoYen, Chen
8fea4139df Fix tile window navigation bugs 2024-08-05 21:34:15 +00:00
PoYen, Chen
ecaaa6f136 Simplify TileWindowNavigator interfaces 2024-08-05 16:31:31 +00:00
PoYen, Chen
1c9d77b606 Introduce 'TileWindowNavigator' types 2024-08-05 15:58:41 +00:00
PoYen, Chen
55b77cf962 Add another make_tile_window() 2024-08-05 15:57:03 +00:00
PoYen, Chen
24cb604373 Add copy_const<> type trait 2024-08-05 15:56:15 +00:00
PoYen, Chen
381f7e90e0 Merge branch 'develop' into feature/fmha-fwd-appendkv 2024-08-04 02:12:20 +00:00
PoYen, Chen
baf4a612f0 Fix wrong kernel name 2024-08-02 10:26:47 +00:00
PoYen, Chen
e7969b9fd2 Add template argument 'kIsPagedKV' for splitkv kernels 2024-08-02 10:14:51 +00:00
carlushuang
b3f86e79dd workaround rocm-6.2 compiler issue (#1421) 2024-07-31 16:03:59 +08:00
PoYen, Chen
c1c50ee498 Enlarge KPerThread for rotary_interleaved=false 2024-07-26 07:09:53 +00:00
PoYen, Chen
bd28e96425 Remove no-longer used method in pipeline 2024-07-24 06:59:45 +00:00
PoYen, Chen
5c733dc568 Remove debug statements 2024-07-24 06:47:52 +00:00
PoYen, Chen
d84c915549 Disable host verification if API not exist 2024-07-24 06:02:41 +00:00
PoYen, Chen
59e1d9b84f Shift rotary_cos/rotary_sin by cache_seqlen_k 2024-07-24 05:06:47 +00:00
PoYen, Chen
a4da1e7f22 Remove RoPEComputeDataType type alias 2024-07-24 04:45:28 +00:00
PoYen, Chen
251f8cfea9 Merge branch 'develop' into feature/fmha-fwd-appendkv 2024-07-24 04:16:35 +00:00
PoYen, Chen
3348131699 Fix wrong data type for Q rotary_cos/rotary_sin 2024-07-24 04:10:43 +00:00
PoYen, Chen
5ea60715ea Update host/device specifiers 2024-07-24 03:45:19 +00:00
PoYen, Chen
6f95239229 Use different rotary_cos/rotary_sin distr for Q/Knew 2024-07-24 03:40:29 +00:00
PoYen, Chen
47a74f282d Extract Q/Knew vector size to helper methods 2024-07-24 03:23:18 +00:00
PoYen, Chen
eb4ea3ac2a Fix wrong rotary_cos/rotary_sin memory size for Q 2024-07-23 16:22:25 +00:00
PoYen, Chen
b11f92dc4c Fix wrong shape of knew_host/vnew_host 2024-07-23 14:52:42 +00:00
PoYen, Chen
ca4b208b60 Fix wrong grid size 2024-07-23 14:20:52 +00:00
PoYen, Chen
52b47810bb Rename more tile size constants 2024-07-23 09:30:05 +00:00
PoYen, Chen
99c1d463de Align naming of some tile size constants 2024-07-23 09:24:38 +00:00
PoYen, Chen
ce5e0f1d67 Re-order parameters 2024-07-23 09:02:41 +00:00
PoYen, Chen
fb80c7b2cb Extract rotary embedding logic out 2024-07-23 08:51:59 +00:00
PoYen, Chen
2192bbc68a Rename RotaryEmbeddingEnum 2024-07-23 07:50:50 +00:00
PoYen, Chen
d4606cf3c3 Rename header 2024-07-23 07:45:25 +00:00
PoYen, Chen
b275732128 Remove always true static_assert() 2024-07-23 07:25:50 +00:00
PoYen, Chen
eb649a2f25 Move thread locating logics into policy 2024-07-23 07:21:20 +00:00
PoYen, Chen
0e5cb6f913 Skip code if # of block is more than needed 2024-07-23 06:53:24 +00:00
PoYen, Chen
7124f3eda5 Add make_tile_window() for adding distribution only 2024-07-23 06:52:38 +00:00
PoYen, Chen
0925c0e941 Use better naming for tile indices 2024-07-23 06:40:53 +00:00
PoYen, Chen
bc7c7ee0c5 Fix wrong knew/vnew appending positions 2024-07-23 04:46:53 +00:00
PoYen, Chen
56df4d6397 Remove debug print code in kernel 2024-07-23 04:01:55 +00:00
PoYen, Chen
48c70720b5 Apply RoPE to q_tile 2024-07-23 03:54:11 +00:00
PoYen, Chen
e88253a2f4 Add code blocks for q_tile 2024-07-23 03:28:40 +00:00
PoYen, Chen
1dbed18555 Remove constness from q_ptr 2024-07-23 03:11:31 +00:00
PoYen, Chen
c26c60db4c Unify parameter/variable naming style 2024-07-23 02:59:17 +00:00
PoYen, Chen
c0bc097758 Apply elementwise function to the loaded tiles 2024-07-23 02:50:07 +00:00