Commit Graph

113 Commits

Author SHA1 Message Date
PoYen, Chen
6a399ea47e Use generic lambda to init all the api traits/args 2024-08-08 19:22:53 +00:00
PoYen, Chen
9206808835 Move functors to the beginning of validation code 2024-08-08 18:01:10 +00:00
PoYen, Chen
028d89862a Wrap code by #if directives 2024-08-08 17:58:49 +00:00
PoYen, Chen
9dddf6e437 Rename 'max_num_blocks' to 'max_num_page_blocks' 2024-08-08 17:38:08 +00:00
PoYen, Chen
e3a4bfba88 Show more detailed warning message 2024-08-08 17:35:36 +00:00
PoYen, Chen
d3624a03de Merge branch 'develop' into feature/fmha-fwd-appendkv 2024-08-08 17:26:53 +00:00
PoYen, Chen
3e2b69e163 Display more info for specific kernels 2024-08-08 17:26:09 +00:00
PoYen, Chen
c8f63d4848 Separate more non-splitkv & splitkv traits/args 2024-08-08 16:54:00 +00:00
PoYen, Chen
677d9b28dd Use generic lambda to init traits objects 2024-08-08 16:38:17 +00:00
PoYen, Chen
a0d2163045 Remove dropout code in splitkv kernel 2024-08-08 10:21:34 +00:00
PoYen, Chen
9d9c5a6c24 Fix compilation errors 2024-08-08 08:26:55 +00:00
PoYen, Chen
247e135cfc Remove fmha_fwd_dispatch() 2024-08-08 08:15:04 +00:00
PoYen, Chen
291e9b4bbb Separate splitkv/non-splitkv args/traits 2024-08-08 08:07:03 +00:00
PoYen, Chen
655b13b059 Rename option s_k_new to s_knew 2024-08-07 15:31:54 +00:00
PoYen, Chen
b6c2f2f01d Add missing group mode argument 2024-08-07 15:22:57 +00:00
Illia Silin
12c1f68dd9 Run CK_TILE FMHA benchmarks and collect the performance data. (#1447)
* run ck_tile benchmarks after the smoke tests and store logs

* change the path of fmha benchmark logs

* change the way of stashing ck_tile fmha logs

* prevent the errors in stages where no logs are generated

* fix the ck_tile fmha log names and headers

* generate the fmha performance logs in the root folder

* change jenkins script arguments format

* use exact file names for stashing

* modify scripts to process FMHA performance results

* unstash FMHA logs before parsing them
2024-08-07 08:18:26 -07:00
PoYen, Chen
55ce2948a9 Always add fmha_fwd() api 2024-08-07 13:43:14 +00:00
PoYen, Chen
838f9955fd Fix wrong strides for appendkv kernel 2024-08-07 08:06:47 +00:00
PoYen, Chen
443a528adc Add block_table kernel args for appendkv kernel 2024-08-07 04:27:15 +00:00
PoYen, Chen
15d0034a64 Add paged-kv codegen logic for appendkv kernels 2024-08-07 04:19:45 +00:00
PoYen, Chen
b98985262d Add missing kernel arguments for group mode 2024-08-06 14:54:07 +00:00
PoYen, Chen
12da00c3be Use 128 as minimum page_block_size 2024-08-06 03:20:29 +00:00
PoYen, Chen
f9e2bafd10 Make sure we always start reading complete tile 2024-08-06 03:13:57 +00:00
PoYen, Chen
8779716403 Fix uneven split checking logic 2024-08-06 01:17:14 +00:00
PoYen, Chen
3fc7279519 Disable calling fmha_fwd() 2024-08-05 21:36:52 +00:00
PoYen, Chen
8fea4139df Fix tile window navigation bugs 2024-08-05 21:34:15 +00:00
PoYen, Chen
90d84eaeae Fix seqlen_k_min for pre-fill case (1 -> 0) 2024-08-04 02:53:40 +00:00
PoYen, Chen
381f7e90e0 Merge branch 'develop' into feature/fmha-fwd-appendkv 2024-08-04 02:12:20 +00:00
PoYen, Chen
db95d25d36 Launch splitkv kernel if given page_block_size 2024-08-02 10:26:09 +00:00
PoYen, Chen
e7969b9fd2 Add template argument 'kIsPagedKV' for splitkv kernels 2024-08-02 10:14:51 +00:00
carlushuang
b3f86e79dd workaround rocm-6.2 compiler issue (#1421) 2024-07-31 16:03:59 +08:00
PoYen, Chen
94f430de32 Update rotary_dim range in smoke_test_fwd.sh 2024-07-26 07:13:25 +00:00
PoYen, Chen
d41ff70db5 Enlarge rotary_dim limit (8 -> 16) 2024-07-26 06:51:24 +00:00
PoYen, Chen
4280a07d2a Refine pipeline padding settings 2024-07-24 11:37:56 +00:00
PoYen, Chen
f053ae2b5b Add missing init code 2024-07-24 07:12:06 +00:00
PoYen, Chen
c50c36a07f Re-arrange the 'set +x' command 2024-07-24 06:56:53 +00:00
PoYen, Chen
8fb015b83f Remove more debug statements 2024-07-24 06:48:40 +00:00
PoYen, Chen
2126d4d88d Add append-kv smoke tests 2024-07-24 06:35:53 +00:00
PoYen, Chen
f7fb3fafaa Allow applying RoPE only on Q (without append KV) 2024-07-24 06:26:00 +00:00
PoYen, Chen
08b4e8a125 Fix wrong rope key for fp8 pipeline 2024-07-24 06:06:07 +00:00
PoYen, Chen
d84c915549 Disable host verification if API does not exist 2024-07-24 06:02:41 +00:00
PoYen, Chen
8a73d334b8 Rename utility function 2024-07-24 05:19:05 +00:00
PoYen, Chen
d59e098ec4 Fix wrong pipeline 2024-07-24 05:17:57 +00:00
PoYen, Chen
29c9b650b5 Align commit message to the real comment 2024-07-24 05:14:00 +00:00
PoYen, Chen
c7b7b44883 Add comment for why I just use 't' for all padding flags 2024-07-24 05:13:16 +00:00
PoYen, Chen
59e1d9b84f Shift rotary_cos/rotary_sin by cache_seqlen_k 2024-07-24 05:06:47 +00:00
PoYen, Chen
a4da1e7f22 Remove RoPEComputeDataType type alias 2024-07-24 04:45:28 +00:00
PoYen, Chen
eb4ea3ac2a Fix wrong rotary_cos/rotary_sin memory size for Q 2024-07-23 16:22:25 +00:00
PoYen, Chen
85bac93951 Fix wrong index into knew_host/vnew_host 2024-07-23 15:31:15 +00:00
PoYen, Chen
b11f92dc4c Fix wrong shape of knew_host/vnew_host 2024-07-23 14:52:42 +00:00