mirror of
https://github.com/ROCm/composable_kernel.git
synced 2026-06-30 03:37:38 +00:00
- Add FusedMoeGemmTilePartitioner_NoAtomic: Forces single workgroup per expert
- Add FusedMoeGemmPipelineFlatmmPolicy_NoAtomic: Fixes alignment consistency
- Update API to use no-atomic approach when intermediate_size <= Block_N0

Eliminates atomic operations by ensuring each workgroup handles the complete expert computation without K-dimension splitting.
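The selection rule (no-atomic when intermediate_size <= Block_N0) can be sketched roughly as below. This is a minimal illustration, not the code from this change: MoeGemmPath, select_path, and num_intermediate_tiles are hypothetical names, and the sketch assumes that Block_N0 is the tile width along the intermediate dimension, so that a single tile covering it means no partial results from different workgroups have to be combined.

```cpp
#include <cstdint>

// Hypothetical names for illustration only; not the composable_kernel API.
enum class MoeGemmPath { Atomic, NoAtomic };

// Number of Block_N0-wide tiles needed to cover the intermediate dimension
// (ceiling division). Assumes both arguments are positive.
constexpr std::int32_t num_intermediate_tiles(std::int32_t intermediate_size,
                                              std::int32_t block_n0)
{
    return (intermediate_size + block_n0 - 1) / block_n0;
}

// Pick the no-atomic path when one tile covers the whole intermediate
// dimension, i.e. intermediate_size <= Block_N0. Otherwise several tiles
// contribute to the same output and the atomic path is assumed to be needed.
constexpr MoeGemmPath select_path(std::int32_t intermediate_size,
                                  std::int32_t block_n0)
{
    return intermediate_size <= block_n0 ? MoeGemmPath::NoAtomic
                                         : MoeGemmPath::Atomic;
}

// Compile-time checks of the rule stated in the commit message.
static_assert(select_path(256, 512) == MoeGemmPath::NoAtomic, "fits in one tile");
static_assert(select_path(512, 512) == MoeGemmPath::NoAtomic, "boundary is inclusive");
static_assert(select_path(1024, 512) == MoeGemmPath::Atomic, "needs two tiles");
static_assert(num_intermediate_tiles(1024, 512) == 2, "ceil(1024 / 512)");
```

In this sketch the boundary case intermediate_size == Block_N0 takes the no-atomic path, matching the "<=" in the description above. The tile sizes used in the checks are arbitrary example values.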