PoYen, Chen
d59e098ec4
Fix wrong pipeline
2024-07-24 05:17:57 +00:00
PoYen, Chen
29c9b650b5
Align commit message to the real comment
2024-07-24 05:14:00 +00:00
PoYen, Chen
c7b7b44883
Add comment for why I just use 't' for all padding flags
2024-07-24 05:13:16 +00:00
PoYen, Chen
59e1d9b84f
Shift rotary_cos/rotary_sin by cache_seqlen_k
2024-07-24 05:06:47 +00:00
PoYen, Chen
a4da1e7f22
Remove RoPEComputeDataType type alias
2024-07-24 04:45:28 +00:00
PoYen, Chen
eb4ea3ac2a
Fix wrong rotary_cos/rotary_sin memory size for Q
2024-07-23 16:22:25 +00:00
PoYen, Chen
85bac93951
Fix wrong index into knew_host/vnew_host
2024-07-23 15:31:15 +00:00
PoYen, Chen
b11f92dc4c
Fix wrong shape of knew_host/vnew_host
2024-07-23 14:52:42 +00:00
PoYen, Chen
ca4b208b60
Fix wrong grid size
2024-07-23 14:20:52 +00:00
PoYen, Chen
2192bbc68a
Rename RotaryEmbeddingEnum
2024-07-23 07:50:50 +00:00
PoYen, Chen
48c70720b5
Apply RoPE to q_tile
2024-07-23 03:54:11 +00:00
PoYen, Chen
1dbed18555
Remove constness from q_ptr
2024-07-23 03:11:31 +00:00
PoYen, Chen
631f29d527
Handle RoPE half-rotated logics
2024-07-22 08:50:03 +00:00
PoYen, Chen
fffd6799e6
Instantiate multiple kernels for RoPE approaches
2024-07-20 02:28:21 +00:00
PoYen, Chen
23450526c0
Only apply interleaved RoPE on Knew for now
2024-07-18 19:42:14 +00:00
PoYen, Chen
e83c3c7fa0
Add constraint to the rotary_dim option
2024-07-16 06:54:37 +00:00
PoYen, Chen
879710a495
Fix wrong seqlen_k for kvcache
2024-07-16 03:42:51 +00:00
PoYen, Chen
65dac9fb90
Fix wrong boundaries
2024-07-15 01:42:53 +00:00
PoYen, Chen
4e01307e04
Fix compilation error in debug mode
2024-07-15 01:26:46 +00:00
PoYen, Chen
1a093f94b2
Add minimum seqlen_k to generate compliance kvcache
2024-07-15 01:11:16 +00:00
PoYen, Chen
57c6a4125c
Fix seqlen_knew enabling check logic
2024-07-15 00:40:39 +00:00
PoYen, Chen
ad61d9d4b2
Randomly generate seqlen_knew if needed
2024-07-15 00:39:03 +00:00
PoYen, Chen
f6850aef29
Add compute data type alias for RoPE
2024-07-15 00:05:33 +00:00
PoYen, Chen
391210ed9e
Pass RoPE kernel args
2024-07-14 23:18:32 +00:00
PoYen, Chen
b5ad1411b0
Merge branch 'feature/cond-add-splitkv' into feature/fmha-fwd-appendkv
2024-07-14 22:13:17 +00:00
PoYen, Chen
8c1647d778
Avoid invoking deprecated method 'find_module'
2024-07-14 22:10:30 +00:00
PoYen, Chen
55f55025ee
Fix wrong tensor size
2024-07-14 15:40:56 +00:00
PoYen, Chen
93e5125d7a
Rename RoPE utility function
2024-07-14 14:48:06 +00:00
PoYen, Chen
83d6acc111
Apply RoPE on host side
2024-07-14 14:45:17 +00:00
PoYen, Chen
3183b68921
Simplify v_host_ref definition
2024-07-12 06:42:41 +00:00
PoYen, Chen
e5885cab83
Simplify K appending logics
2024-07-12 06:37:23 +00:00
PoYen, Chen
3578c6f836
Append K/V in the host verification code
2024-07-12 06:32:35 +00:00
PoYen, Chen
4107bf03a6
Merge remote-tracking branch 'origin/feature/cond-add-splitkv' into feature/fmha-fwd-appendkv
2024-07-12 04:43:04 +00:00
PoYen, Chen
b34ddf5f71
Merge remote-tracking branch 'origin/feature/cond-add-splitkv' into feature/fmha-fwd-appendkv
2024-07-12 04:42:45 +00:00
Po Yen Chen
b4306af655
Merge branch 'develop' into feature/cond-add-splitkv
2024-07-12 12:34:31 +08:00
Illia Silin
98a01bbc72
Add CK_TILE tests to daily CI builds. (#1381)
* add ck_tile tests to CI
* build and run ck_tile tests on gfx90a and gfx942 in parallel
* fix groovy syntax
* turn ck_tile tests OFF by default
* skip creating the build folder
* build ck_tile examples with 64 threads
* build ck_tile examples with cmake-ck-dev.sh script
* add video group to docker on mi300
* do not retry to rebuild the early CI stages
* help prevent jenkins false failure
* restore cron trigger
2024-07-11 13:22:40 -07:00
PoYen, Chen
8c733fb3be
Fix compilation errors
2024-07-10 10:53:58 +00:00
PoYen, Chen
e939082bdc
Add RoPE example utilities
2024-07-09 05:20:47 +00:00
Po Yen Chen
dc72074ec7
Merge branch 'develop' into feature/cond-add-splitkv
2024-07-09 03:42:25 +08:00
carlushuang
8182976c37
[CK_TILE] wa prec, remove sgpr offset for inline asm (#1356)
* wa prec, remove sgpr offset for inline asm
* macro for set tile
* ignore unused param if no kernel instances in host API
* fix more prec issue
* cache buffer resource
* fix
* support pre-nop
* clear tile by vector type members
* add workaround to reduce scratch memory
* conditionally enable workaround code
* enable workaround start from certain build version
* fallback set_tile() implementation from certain build version
* undo template argument changes
* put dummy asm in load_raw()
* fix comments, refactor s_nop inside buffer_load
---------
Co-authored-by: PoYen, Chen <PoYen.Chen@amd.com>
2024-07-08 11:09:55 -07:00
PoYen, Chen
18a3834fb4
Set num_splits=1 if split-kv is not supported
2024-07-08 10:27:32 +00:00
PoYen, Chen
8ac6bacf26
Unify CMakeLists.txt coding style
2024-07-08 10:19:31 +00:00
PoYen, Chen
5d21b4d736
Merge branch 'feature/cond-add-splitkv' into feature/fmha-fwd-appendkv
2024-07-08 10:18:28 +00:00
PoYen, Chen
6ca3910199
Show message if we are ignoring option
2024-07-08 10:17:55 +00:00
PoYen, Chen
fe4ae5dcd9
Early return if 0 < s_k_new is not supported
2024-07-08 10:09:36 +00:00
PoYen, Chen
be076db91c
Merge branch 'feature/cond-add-splitkv' into feature/fmha-fwd-appendkv
2024-07-08 10:03:58 +00:00
PoYen, Chen
aba46cd655
Register API handlers automatically
2024-07-08 09:39:15 +00:00
PoYen, Chen
3aefb560e0
Remove "EXAMPLE_" prefix of cmake variables
2024-07-08 07:17:24 +00:00
PoYen, Chen
1c070380fa
Merge branch 'feature/cond-add-splitkv' into feature/fmha-fwd-appendkv
2024-07-08 07:13:34 +00:00
PoYen, Chen
82f3b3d0a0
Conditionally add call to fmha_fwd_splitkv()
2024-07-08 06:40:18 +00:00