PoYen, Chen
|
d41ff70db5
|
Enlarge rotary_dim limit (8 -> 16)
|
2024-07-26 06:51:24 +00:00 |
|
PoYen, Chen
|
4280a07d2a
|
Refine pipeline padding settings
|
2024-07-24 11:37:56 +00:00 |
|
PoYen, Chen
|
f053ae2b5b
|
Add missing init code
|
2024-07-24 07:12:06 +00:00 |
|
PoYen, Chen
|
bd28e96425
|
Remove no-longer used method in pipeline
|
2024-07-24 06:59:45 +00:00 |
|
PoYen, Chen
|
c50c36a07f
|
Re-arrange the 'set +x' command
|
2024-07-24 06:56:53 +00:00 |
|
PoYen, Chen
|
8fb015b83f
|
Remove more debug statements
|
2024-07-24 06:48:40 +00:00 |
|
PoYen, Chen
|
5c733dc568
|
Remove debug statements
|
2024-07-24 06:47:52 +00:00 |
|
PoYen, Chen
|
2126d4d88d
|
Add append-kv smoke tests
|
2024-07-24 06:35:53 +00:00 |
|
PoYen, Chen
|
f7fb3fafaa
|
Allow only apply RoPE on Q (without append KV)
|
2024-07-24 06:26:00 +00:00 |
|
PoYen, Chen
|
08b4e8a125
|
Fix wrong rope key for fp8 pipeline
|
2024-07-24 06:06:07 +00:00 |
|
PoYen, Chen
|
d84c915549
|
Disable host verification if API not exist
|
2024-07-24 06:02:41 +00:00 |
|
PoYen, Chen
|
8a73d334b8
|
Rename utility function
|
2024-07-24 05:19:05 +00:00 |
|
PoYen, Chen
|
d59e098ec4
|
Fix wrong pipeline
|
2024-07-24 05:17:57 +00:00 |
|
PoYen, Chen
|
29c9b650b5
|
Align commit message to the real comment
|
2024-07-24 05:14:00 +00:00 |
|
PoYen, Chen
|
c7b7b44883
|
Add comment for why I just 't' for all padding flags
|
2024-07-24 05:13:16 +00:00 |
|
PoYen, Chen
|
59e1d9b84f
|
Shift rotary_cos/rotary_sin by cache_seqlen_k
|
2024-07-24 05:06:47 +00:00 |
|
PoYen, Chen
|
a4da1e7f22
|
Remove RoPEComputeDataType type alias
|
2024-07-24 04:45:28 +00:00 |
|
PoYen, Chen
|
251f8cfea9
|
Merge branch 'develop' into feature/fmha-fwd-appendkv
|
2024-07-24 04:16:35 +00:00 |
|
PoYen, Chen
|
3348131699
|
Fix wrong data type for Q rotary_cos/rotary_sin
|
2024-07-24 04:10:43 +00:00 |
|
PoYen, Chen
|
5ea60715ea
|
Update host/device specifiers
|
2024-07-24 03:45:19 +00:00 |
|
PoYen, Chen
|
6f95239229
|
Use different rotary_cos/rotary_sin distr for Q/Knew
|
2024-07-24 03:40:29 +00:00 |
|
PoYen, Chen
|
47a74f282d
|
Extract Q/Knew vector size to helper methods
|
2024-07-24 03:23:18 +00:00 |
|
PoYen, Chen
|
eb4ea3ac2a
|
Fix wrong rotary_cos/rotary_sin memory size for Q
|
2024-07-23 16:22:25 +00:00 |
|
Haocong WANG
|
d22713a719
|
disable bad instance (#1410)
|
2024-07-23 09:05:03 -07:00 |
|
PoYen, Chen
|
85bac93951
|
Fix wrong index into knew_host/vnew_host
|
2024-07-23 15:31:15 +00:00 |
|
PoYen, Chen
|
b11f92dc4c
|
Fix wrong shape of knew_host/vnew_host
|
2024-07-23 14:52:42 +00:00 |
|
PoYen, Chen
|
ca4b208b60
|
Fix wrong grid size
|
2024-07-23 14:20:52 +00:00 |
|
PoYen, Chen
|
52b47810bb
|
Rename more tile size constants
|
2024-07-23 09:30:05 +00:00 |
|
PoYen, Chen
|
99c1d463de
|
Align naming of some tile size constants
|
2024-07-23 09:24:38 +00:00 |
|
PoYen, Chen
|
ce5e0f1d67
|
Re-order parameters
|
2024-07-23 09:02:41 +00:00 |
|
PoYen, Chen
|
fb80c7b2cb
|
Extract rotary embedding logic out
|
2024-07-23 08:51:59 +00:00 |
|
PoYen, Chen
|
2192bbc68a
|
Rename RotaryEmbeddingEnum
|
2024-07-23 07:50:50 +00:00 |
|
PoYen, Chen
|
d4606cf3c3
|
Rename header
|
2024-07-23 07:45:25 +00:00 |
|
PoYen, Chen
|
b275732128
|
Remove always true static_assert()
|
2024-07-23 07:25:50 +00:00 |
|
PoYen, Chen
|
eb649a2f25
|
Move thread locating logics into policy
|
2024-07-23 07:21:20 +00:00 |
|
PoYen, Chen
|
0e5cb6f913
|
Skip code if # of block is more than needed
|
2024-07-23 06:53:24 +00:00 |
|
PoYen, Chen
|
7124f3eda5
|
Add make_tile_window() for adding distribution only
|
2024-07-23 06:52:38 +00:00 |
|
PoYen, Chen
|
0925c0e941
|
Use better naming for tile indices
|
2024-07-23 06:40:53 +00:00 |
|
PoYen, Chen
|
bc7c7ee0c5
|
Fix wrong knew/vnew appending positions
|
2024-07-23 04:46:53 +00:00 |
|
PoYen, Chen
|
56df4d6397
|
Remove debug print code in kernel
|
2024-07-23 04:01:55 +00:00 |
|
PoYen, Chen
|
48c70720b5
|
Apply RoPE to q_tile
|
2024-07-23 03:54:11 +00:00 |
|
PoYen, Chen
|
e88253a2f4
|
Add code blocks for q_tile
|
2024-07-23 03:28:40 +00:00 |
|
PoYen, Chen
|
1dbed18555
|
Remove constness from q_ptr
|
2024-07-23 03:11:31 +00:00 |
|
PoYen, Chen
|
c26c60db4c
|
Unify parameter/variable naming style
|
2024-07-23 02:59:17 +00:00 |
|
PoYen, Chen
|
c0bc097758
|
Apply elementwise function to the loaded tiles
|
2024-07-23 02:50:07 +00:00 |
|
PoYen, Chen
|
df352f955a
|
Add comment
|
2024-07-23 02:31:45 +00:00 |
|
PoYen, Chen
|
d1ecfdc700
|
Support 8x rotary_dim under half-rotated RoPE
|
2024-07-23 02:19:16 +00:00 |
|
Bartłomiej Kocot
|
5d8c3d8190
|
Revert Support access per groups and filter2x3 in grouped conv fwd (#1382) (#1406)
|
2024-07-22 14:21:24 +02:00 |
|
PoYen, Chen
|
631f29d527
|
Handle RoPE half-rotated logics
|
2024-07-22 08:50:03 +00:00 |
|
PoYen, Chen
|
1136e6b560
|
Fix error in RoPE host reference
|
2024-07-22 08:39:09 +00:00 |
|