PoYen, Chen | 8779716403 | Fix uneven split checking logic | 2024-08-06 01:17:14 +00:00
PoYen, Chen | 3fc7279519 | Disable calling fmha_fwd() | 2024-08-05 21:36:52 +00:00
PoYen, Chen | 8fea4139df | Fix tile window navigation bugs | 2024-08-05 21:34:15 +00:00
PoYen, Chen | 90d84eaeae | Fix seqlen_k_min for pre-fill case (1 -> 0) | 2024-08-04 02:53:40 +00:00
PoYen, Chen | 381f7e90e0 | Merge branch 'develop' into feature/fmha-fwd-appendkv | 2024-08-04 02:12:20 +00:00
PoYen, Chen | db95d25d36 | Launch splitkv kernel if given page_block_size | 2024-08-02 10:26:09 +00:00
PoYen, Chen | e7969b9fd2 | Add template argument 'kIsPagedKV' for splitkv kernels | 2024-08-02 10:14:51 +00:00
carlushuang | b3f86e79dd | workaround rocm-6.2 compiler issue (#1421) | 2024-07-31 16:03:59 +08:00
PoYen, Chen | 94f430de32 | Update rotary_dim range in smoke_test_fwd.sh | 2024-07-26 07:13:25 +00:00
PoYen, Chen | d41ff70db5 | Enlarge rotary_dim limit (8 -> 16) | 2024-07-26 06:51:24 +00:00
PoYen, Chen | 4280a07d2a | Refine pipeline padding settings | 2024-07-24 11:37:56 +00:00
PoYen, Chen | f053ae2b5b | Add missing init code | 2024-07-24 07:12:06 +00:00
PoYen, Chen | c50c36a07f | Re-arrange the 'set +x' command | 2024-07-24 06:56:53 +00:00
PoYen, Chen | 8fb015b83f | Remove more debug statements | 2024-07-24 06:48:40 +00:00
PoYen, Chen | 2126d4d88d | Add append-kv smoke tests | 2024-07-24 06:35:53 +00:00
PoYen, Chen | f7fb3fafaa | Allow only apply RoPE on Q (without append KV) | 2024-07-24 06:26:00 +00:00
PoYen, Chen | 08b4e8a125 | Fix wrong rope key for fp8 pipeline | 2024-07-24 06:06:07 +00:00
PoYen, Chen | d84c915549 | Disable host verification if API not exist | 2024-07-24 06:02:41 +00:00
PoYen, Chen | 8a73d334b8 | Rename utility function | 2024-07-24 05:19:05 +00:00
PoYen, Chen | d59e098ec4 | Fix wrong pipeline | 2024-07-24 05:17:57 +00:00
PoYen, Chen | 29c9b650b5 | Align commit message to the real comment | 2024-07-24 05:14:00 +00:00
PoYen, Chen | c7b7b44883 | Add comment for why I just 't' for all padding flags | 2024-07-24 05:13:16 +00:00
PoYen, Chen | 59e1d9b84f | Shift rotary_cos/rotary_sin by cache_seqlen_k | 2024-07-24 05:06:47 +00:00
PoYen, Chen | a4da1e7f22 | Remove RoPEComputeDataType type alias | 2024-07-24 04:45:28 +00:00
PoYen, Chen | eb4ea3ac2a | Fix wrong rotary_cos/rotary_sin memory size for Q | 2024-07-23 16:22:25 +00:00
PoYen, Chen | 85bac93951 | Fix wrong index into knew_host/vnew_host | 2024-07-23 15:31:15 +00:00
PoYen, Chen | b11f92dc4c | Fix wrong shape of knew_host/vnew_host | 2024-07-23 14:52:42 +00:00
PoYen, Chen | ca4b208b60 | Fix wrong grid size | 2024-07-23 14:20:52 +00:00
PoYen, Chen | 2192bbc68a | Rename RotaryEmbeddingEnum | 2024-07-23 07:50:50 +00:00
PoYen, Chen | 48c70720b5 | Apply RoPE to q_tile | 2024-07-23 03:54:11 +00:00
PoYen, Chen | 1dbed18555 | Remove constness from q_ptr | 2024-07-23 03:11:31 +00:00
PoYen, Chen | 631f29d527 | Handle RoPE half-rotated logics | 2024-07-22 08:50:03 +00:00
PoYen, Chen | fffd6799e6 | Instantiate multiple kernels for RoPE approaches | 2024-07-20 02:28:21 +00:00
PoYen, Chen | 23450526c0 | Only apply interleaved RoPE on Knew for now | 2024-07-18 19:42:14 +00:00
PoYen, Chen | e83c3c7fa0 | Add constraint to the rotary_dim option | 2024-07-16 06:54:37 +00:00
PoYen, Chen | 879710a495 | Fix wrong seqlen_k for kvcache | 2024-07-16 03:42:51 +00:00
PoYen, Chen | 65dac9fb90 | Fix wrong boundaries | 2024-07-15 01:42:53 +00:00
PoYen, Chen | 4e01307e04 | Fix compilation error in debug mode | 2024-07-15 01:26:46 +00:00
PoYen, Chen | 1a093f94b2 | Add minimum seqlen_k to generate compliance kvcache | 2024-07-15 01:11:16 +00:00
PoYen, Chen | 57c6a4125c | Fix seqlen_knew enabling check logic | 2024-07-15 00:40:39 +00:00
PoYen, Chen | ad61d9d4b2 | Randomly generate seqlen_knew if needed | 2024-07-15 00:39:03 +00:00
PoYen, Chen | f6850aef29 | Add compute data type alias for RoPE | 2024-07-15 00:05:33 +00:00
PoYen, Chen | 391210ed9e | Pass RoPE kernel args | 2024-07-14 23:18:32 +00:00
PoYen, Chen | b5ad1411b0 | Merge branch 'feature/cond-add-splitkv' into feature/fmha-fwd-appendkv | 2024-07-14 22:13:17 +00:00
PoYen, Chen | 8c1647d778 | Avoid invoking deprecated method 'find_module' | 2024-07-14 22:10:30 +00:00
PoYen, Chen | 55f55025ee | Fix wrong tensor size | 2024-07-14 15:40:56 +00:00
PoYen, Chen | 93e5125d7a | Rename RoPE utility function | 2024-07-14 14:48:06 +00:00
PoYen, Chen | 83d6acc111 | Apply RoPE on host side | 2024-07-14 14:45:17 +00:00
PoYen, Chen | 3183b68921 | Simplify v_host_ref definition | 2024-07-12 06:42:41 +00:00
PoYen, Chen | e5885cab83 | Simplify K appending logics | 2024-07-12 06:37:23 +00:00