[CK_TILE] moe sorting optimize local_token (#2469)

* fix bug in loops that need to use local tokens to compute

* support extra chain local_token

* update

* update

* refine some main

* update

* support dispatch_policy

* fix 15 example
carlushuang
2025-07-15 09:42:18 +08:00
committed by GitHub
parent 141bf2d54d
commit cfe211cc60
9 changed files with 579 additions and 94 deletions

@@ -399,7 +399,7 @@ bool run(const ck_tile::ArgParser& arg_parser)
// if return zero, means no need workspace, can set moe_sorting_args.p_ws to nullptr
ck_tile::index_t workspace_size =
-ck_tile::moe_sorting_get_workspace_size(tokens, experts, topk);
+ck_tile::moe_sorting_get_workspace_size(tokens, experts, topk, 0 /*dispatch_policy*/);
ck_tile::DeviceMem moe_sorting_ws(workspace_size != 0 ? workspace_size : 0);
if(workspace_size != 0)
moe_sorting_ws.SetZero(); // note, clear here!!!!