composable_kernel/include/ck_tile/ops
Po Yen Chen 791802b381 [CK_TILE] fMHA batch_prefill block index & logits soft-capping optimizations (#2198)
* Write soft-sign in inline asm

* Change tile idx computation

* Add macro to turn off soft-sign asm opt

* Use simple for loop to avoid register spill

* Only do block id transform for masking cases
2025-05-16 15:14:46 +08:00