Mirror of https://github.com/ROCm/composable_kernel.git (synced 2026-05-03 05:01:25 +00:00)
[CK_TILE] optimize moe sorting kernel, boost large context case up to 20x (#2153)
* combine 2-3 as single stage
* support zeroing
* improve long tokens
* update specialization
* b16 ws
* 8bit topk optimize
* update 15 example
@@ -257,5 +257,5 @@
 #endif
 
 #ifndef CK_TILE_WA_ISSUE_2028
-#define CK_TILE_WA_ISSUE_2028 1
+#define CK_TILE_WA_ISSUE_2028 0
 #endif
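
The hunk above changes only the default value inside an `#ifndef` guard, so a build that defines the macro itself (for example with `-DCK_TILE_WA_ISSUE_2028=1`) still overrides it. A minimal sketch of how such a guarded default behaves, assuming the new default is `0` (inferred from the order of the two `#define` lines in the diff); `workaround_enabled` is a hypothetical helper for illustration and is not part of CK:

```cpp
// Sketch only: reproduces the guard pattern from the hunk, not the real CK header.
// If the build already defined the macro (e.g. -DCK_TILE_WA_ISSUE_2028=1),
// the #ifndef branch is skipped and the build's value is used.
#ifndef CK_TILE_WA_ISSUE_2028
#define CK_TILE_WA_ISSUE_2028 0 // assumed default after this commit
#endif

// Hypothetical helper (not a CK API): exposes the macro as a compile-time bool
// so that code can branch on it with `if constexpr` instead of `#if`.
constexpr bool workaround_enabled()
{
    return CK_TILE_WA_ISSUE_2028 != 0;
}

// Hypothetical stand-in for code that takes a different path under the workaround.
constexpr int select_path()
{
    if constexpr(workaround_enabled())
    {
        return 1; // workaround path
    }
    else
    {
        return 0; // regular path
    }
}
```

Compiled without any `-D` flag, `select_path()` takes the regular path; passing `-DCK_TILE_WA_ISSUE_2028=1` on the compiler command line selects the workaround path without editing the header.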