composable_kernel/include/ck_tile/host
Yi DING 8cb8da53c9 [CK_TILE] FMHA BWD Optimization For GFX950 (#2628)
* simplify fmha_bwd_kernel MakeKargs & dq_dram_window

* simply duplicate

* trload pipeline

* Try two-stage

* add prefetch

* optimize & iglp

[ROCm/composable_kernel commit: 4fde1646e5]
2025-08-12 11:11:55 +08:00