ktransformers/kt-kernel/python
mrhaoxx 6d4632b8c7 fix: add missing gpu_experts_mask=None to KTMoEWrapper call in SFT wrapper
KTMoEWrapper.__new__() requires gpu_experts_mask as a positional argument,
but the SFT wrapper omitted it. MoE layer wrapping therefore failed silently,
and FSDP2 tried to broadcast all expert weights, ending in an OOM or NCCL crash.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 02:18:40 +08:00
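
The failure mode described in the commit message can be sketched with a toy class. Everything here is a hypothetical stand-in: `ToyMoEWrapper`, `layer_idx`, `mode`, and the two `wrap_layer_*` helpers are invented for illustration and do not reflect the real `KTMoEWrapper` signature. The `try/except` that swallows the error is also an assumption about how the failure might have been silent; the commit does not say how it was hidden.

```python
class ToyMoEWrapper:
    """Hypothetical stand-in for a wrapper whose __new__ needs gpu_experts_mask."""

    def __new__(cls, layer_idx, gpu_experts_mask, mode="sft"):
        # gpu_experts_mask has no default, so every caller must supply it.
        instance = super().__new__(cls)
        instance.layer_idx = layer_idx
        instance.gpu_experts_mask = gpu_experts_mask
        instance.mode = mode
        return instance


def wrap_layer_buggy(layer_idx):
    """Omits gpu_experts_mask; the resulting TypeError is swallowed (assumed)."""
    try:
        return ToyMoEWrapper(layer_idx, mode="sft")
    except TypeError:
        # The layer is left unwrapped and nothing is reported to the caller.
        return None


def wrap_layer_fixed(layer_idx):
    """Passes gpu_experts_mask=None explicitly, as the commit title describes."""
    return ToyMoEWrapper(layer_idx, gpu_experts_mask=None, mode="sft")


if __name__ == "__main__":
    print(wrap_layer_buggy(0))                   # wrapping did not happen
    print(wrap_layer_fixed(0).gpu_experts_mask)  # wrapper built with no mask
```

In this sketch the buggy helper returns nothing instead of a wrapper, so any later step that expects wrapped layers would see the original, unwrapped ones. Passing the argument by keyword, as the fix does, also keeps the call site readable if the parameter order changes.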