Amir Ghamarian
d93efe1b61
Add fused topk_softmax_decode kernel for M=1 MoE decode
...
Adds a new CK tile kernel variant that fuses topk_softmax and moe_sorting
into a single kernel launch for the decode case (M=1, single token). The
pipeline inlines the topk loop and keeps its results in shared memory (no
global scratch); thread 0 then emits moe_sorting-compatible packed output.
Includes a CMake target, tile_example_topk_softmax_decode, with a built-in
benchmark that compares against the separate topk+sorting baseline.
Validated on gfx950 for E=8..1024, k=1..8, with bf16 and fp16.
Made-with: Cursor
2026-03-29 18:06:03 +00:00
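
For readers unfamiliar with the two stages being fused, the following is a minimal host-side reference sketch in Python, not the kernel itself. It models a single token (M=1): a softmax over the E expert logits, an iterative top-k loop, and a final step that orders the selected experts by expert id. The function names, the `slot_shift=24` packing, and the choice not to renormalize the top-k weights are assumptions made for illustration; the actual moe_sorting output layout should be taken from the CK sources.

```python
import math


def topk_softmax_decode(logits, k):
    """Reference model for one token (M=1): softmax over experts, then top-k."""
    # Numerically stable softmax over the E expert logits.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Iterative top-k: k argmax passes, masking each winner after it is picked.
    remaining = list(probs)
    picked = []  # (expert_id, weight) in selection order
    for _ in range(k):
        best = max(range(len(remaining)), key=lambda i: remaining[i])
        picked.append((best, probs[best]))
        remaining[best] = -math.inf
    return picked


def pack_sorted(picked, token_id=0, slot_shift=24):
    """Order the picks by expert id and attach a packed (slot, token) id.

    ASSUMPTION: the packing (slot << slot_shift) | token_id is hypothetical
    and only illustrates the idea of a packed, expert-sorted output.
    """
    entries = [
        (expert, (slot << slot_shift) | token_id, weight)
        for slot, (expert, weight) in enumerate(picked)
    ]
    return sorted(entries, key=lambda entry: entry[0])


if __name__ == "__main__":
    picks = topk_softmax_decode([0.0, 1.0, 3.0, -2.0], k=2)
    for expert, packed_id, weight in pack_sorted(picks):
        print(expert, hex(packed_id), round(weight, 4))
```

With a single token there are only k entries to order, which is why a sketch like this can do the final step serially; in the sketch that serial step stands in for the work the commit message assigns to thread 0.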