composable_kernel/include/ck_tile/ops
Po Yen Chen a1c07e8d91 [CK_TILE] Change output accum tensor layout of fmha fwd split-kv & combine kernels (#1527)
* Use same layout for o_acc and o tensor

* Use better param names in partitioner

* Remove redundant kargs 'max_seqlen_q'

* Use better param names in splitkv kernel

* Add comment for additional kernel arguments

* Sync empty loop early return logic between pipelines

* Pass more arguments to cmake in scripts

* Align backslashes

* Fix wrong o_acc tensor view strides

* Change o_acc layout if o_perm=0

* Handle whole row masked via attn_bias

* Use vector width = 1 for o_acc

* Use more even split sizes
2024-10-01 22:13:52 +08:00