Qianfeng
3d50f57f43
Update for fmha_fwd qs_ks_vs pipeline (#1810)
* Update for fmha_fwd qs_ks_vs pipeline
* Remove __builtin_amdgcn_sched_barrier(0)
* Move the p_compute-to-p conversion earlier to try to increase VGPR reuse
* Enable GetQKBlockGemm to use WarpGemm-16x16x16 when QLoadOnce == false
* Re-add __builtin_amdgcn_sched_barrier(0)
---------
Co-authored-by: Po Yen Chen <PoYen.Chen@amd.com>
2025-01-13 12:43:05 +08:00