Po Yen Chen
7fbc9d6c97
[CK_TILE] FMHA FAv3 scheduling fine-tuning for performance (#2833)
* Re-mapping thread block indices for causal=True kernels
* Use more intuitive remap_opt value
* Fall back to original remapping if seqlen_q >= 64K
* Use GenericAttentionMask to reduce mask computation
* Avoid unnecessary boundary check for IsMasking=false case
* Fix wrong kernel entry specifier
* Add s_nop to avoid delaying wave0-3
* Refine scheduling
* Remove unnecessary sched_group_barrier()
* Move sched_group_barrier() call to scheduler
* Replace inline asm s_setprio with intrinsics
* Rephrase comments
* Expand some o_acc rescaling insts to avoid SIMD idle
* Fix block idx special mapping logic
* Tune block index mapping for causal=False cases
* Tune block index mapping for causal=True cases
* Fix wrong vmcnt()
* Remove parameter name
* Use boolean option to turn causal mask on/off
* Update benchmark_fwd_v3.sh option usage
* Add option if compiler supports it
2025-09-16 11:32:38 +08:00