PoYen, Chen
d96752d0f5
Refine smoke_test_fwd.sh
2024-08-13 08:36:04 +00:00
PoYen, Chen
e8603dc21a
Add missing comment
2024-08-08 20:40:50 +00:00
PoYen, Chen
822d5dcd8e
Fix wrong seqlen for kvcache
2024-08-08 20:39:36 +00:00
PoYen, Chen
6a399ea47e
Use generic lambda to init all the api traits/args
2024-08-08 19:22:53 +00:00
PoYen, Chen
9206808835
Move functors to the beginning of validation code
2024-08-08 18:01:10 +00:00
PoYen, Chen
028d89862a
Wrap code by #if directives
2024-08-08 17:58:49 +00:00
PoYen, Chen
9dddf6e437
Rename 'max_num_blocks' to 'max_num_page_blocks'
2024-08-08 17:38:08 +00:00
PoYen, Chen
e3a4bfba88
Show more detailed warning message
2024-08-08 17:35:36 +00:00
PoYen, Chen
d3624a03de
Merge branch 'develop' into feature/fmha-fwd-appendkv
2024-08-08 17:26:53 +00:00
PoYen, Chen
3e2b69e163
Display more info for specific kernels
2024-08-08 17:26:09 +00:00
PoYen, Chen
c8f63d4848
Separate more non-splitkv & splitkv traits/args
2024-08-08 16:54:00 +00:00
PoYen, Chen
677d9b28dd
Use generic lambda to init traits objects
2024-08-08 16:38:17 +00:00
PoYen, Chen
a0d2163045
Remove dropout code in splitkv kernel
2024-08-08 10:21:34 +00:00
PoYen, Chen
9d9c5a6c24
Fix compilation errors
2024-08-08 08:26:55 +00:00
PoYen, Chen
247e135cfc
Remove fmha_fwd_dispatch()
2024-08-08 08:15:04 +00:00
PoYen, Chen
291e9b4bbb
Separate splitkv/non-splitkv args/traits
2024-08-08 08:07:03 +00:00
PoYen, Chen
655b13b059
Rename option s_k_new to s_knew
2024-08-07 15:31:54 +00:00
PoYen, Chen
b6c2f2f01d
Add missing group mode argument
2024-08-07 15:22:57 +00:00
Illia Silin
12c1f68dd9
Run CK_TILE FMHA benchmarks and collect the performance data. (#1447)
* run ck_tile benchmarks after the smoke tests and store logs
* change the path of fmha benchmark logs
* change the way of stashing ck_tile fmha logs
* prevent the errors in stages where no logs are generated
* fix the ck_tile fmha log names and headers
* generate the fmha performance logs in the root folder
* change jenkins script arguments format
* use exact file names for stashing
* modify scripts to process FMHA performance results
* unstash FMHA logs before parsing them
2024-08-07 08:18:26 -07:00
PoYen, Chen
55ce2948a9
Always add fmha_fwd() api
2024-08-07 13:43:14 +00:00
PoYen, Chen
838f9955fd
Fix wrong strides for appendkv kernel
2024-08-07 08:06:47 +00:00
PoYen, Chen
443a528adc
Add block_table kernel args for appendkv kernel
2024-08-07 04:27:15 +00:00
PoYen, Chen
15d0034a64
Add paged-kv codegen logic for appendkv kernels
2024-08-07 04:19:45 +00:00
PoYen, Chen
b98985262d
Add missing kernel arguments for group mode
2024-08-06 14:54:07 +00:00
PoYen, Chen
12da00c3be
Use 128 as minimum page_block_size
2024-08-06 03:20:29 +00:00
PoYen, Chen
f9e2bafd10
Make sure we always start reading complete tile
2024-08-06 03:13:57 +00:00
PoYen, Chen
8779716403
Fix uneven split checking logic
2024-08-06 01:17:14 +00:00
PoYen, Chen
3fc7279519
Disable calling fmha_fwd()
2024-08-05 21:36:52 +00:00
PoYen, Chen
8fea4139df
Fix tile window navigation bugs
2024-08-05 21:34:15 +00:00
PoYen, Chen
90d84eaeae
Fix seqlen_k_min for pre-fill case (1 -> 0)
2024-08-04 02:53:40 +00:00
PoYen, Chen
381f7e90e0
Merge branch 'develop' into feature/fmha-fwd-appendkv
2024-08-04 02:12:20 +00:00
PoYen, Chen
db95d25d36
Launch splitkv kernel if given page_block_size
2024-08-02 10:26:09 +00:00
PoYen, Chen
e7969b9fd2
Add template argument 'kIsPagedKV' for splitkv kernels
2024-08-02 10:14:51 +00:00
carlushuang
b3f86e79dd
workaround rocm-6.2 compiler issue (#1421)
2024-07-31 16:03:59 +08:00
PoYen, Chen
94f430de32
Update rotary_dim range in smoke_test_fwd.sh
2024-07-26 07:13:25 +00:00
PoYen, Chen
d41ff70db5
Enlarge rotary_dim limit (8 -> 16)
2024-07-26 06:51:24 +00:00
PoYen, Chen
4280a07d2a
Refine pipeline padding settings
2024-07-24 11:37:56 +00:00
PoYen, Chen
f053ae2b5b
Add missing init code
2024-07-24 07:12:06 +00:00
PoYen, Chen
c50c36a07f
Re-arrange the 'set +x' command
2024-07-24 06:56:53 +00:00
PoYen, Chen
8fb015b83f
Remove more debug statements
2024-07-24 06:48:40 +00:00
PoYen, Chen
2126d4d88d
Add append-kv smoke tests
2024-07-24 06:35:53 +00:00
PoYen, Chen
f7fb3fafaa
Allow applying RoPE on Q only (without append KV)
2024-07-24 06:26:00 +00:00
PoYen, Chen
08b4e8a125
Fix wrong rope key for fp8 pipeline
2024-07-24 06:06:07 +00:00
PoYen, Chen
d84c915549
Disable host verification if API does not exist
2024-07-24 06:02:41 +00:00
PoYen, Chen
8a73d334b8
Rename utility function
2024-07-24 05:19:05 +00:00
PoYen, Chen
d59e098ec4
Fix wrong pipeline
2024-07-24 05:17:57 +00:00
PoYen, Chen
29c9b650b5
Align commit message to the real comment
2024-07-24 05:14:00 +00:00
PoYen, Chen
c7b7b44883
Add comment for why I just use 't' for all padding flags
2024-07-24 05:13:16 +00:00
PoYen, Chen
59e1d9b84f
Shift rotary_cos/rotary_sin by cache_seqlen_k
2024-07-24 05:06:47 +00:00
PoYen, Chen
a4da1e7f22
Remove RoPEComputeDataType type alias
2024-07-24 04:45:28 +00:00