mirror of
https://github.com/microsoft/mscclpp.git
synced 2026-05-11 17:00:22 +00:00
- Report both per-rank and aggregate BW to align with NCCL-EP's ep_bench (which reports per-rank GB/s).
- Accept MSCCLPP_EP_LL_TOKENS/HIDDEN/TOPK/EXPERTS_PER_RANK env overrides so we can match external benchmark problem sizes (NCCL-EP LL defaults are num_tokens=128, hidden=7168, top_k=8).
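
A minimal Python sketch of the two changes, not the actual benchmark code. It assumes the shorthand expands to the variable names MSCCLPP_EP_LL_TOKENS, MSCCLPP_EP_LL_HIDDEN, MSCCLPP_EP_LL_TOPK and MSCCLPP_EP_LL_EXPERTS_PER_RANK. The helper names, the experts-per-rank default of 1, and the bytes-per-rank estimate (2-byte elements, every token sent to top_k experts) are all hypothetical and chosen only for illustration.

```python
import os


def env_int(name, default, env=None):
    """Return a positive integer from the environment, or the default if unset."""
    env = os.environ if env is None else env
    raw = env.get(name)
    if raw is None or raw == "":
        return default
    value = int(raw)
    if value <= 0:
        raise ValueError(f"{name} must be a positive integer, got {raw!r}")
    return value


def load_problem_size(env=None):
    """Problem size with env overrides; the first three defaults follow NCCL-EP LL."""
    return {
        "num_tokens": env_int("MSCCLPP_EP_LL_TOKENS", 128, env),
        "hidden": env_int("MSCCLPP_EP_LL_HIDDEN", 7168, env),
        "top_k": env_int("MSCCLPP_EP_LL_TOPK", 8, env),
        # Assumption: this default is a placeholder, not taken from the source.
        "experts_per_rank": env_int("MSCCLPP_EP_LL_EXPERTS_PER_RANK", 1, env),
    }


def dispatch_bytes_per_rank(size, bytes_per_elem=2):
    """Assumed estimate: each token is sent to top_k experts as `hidden` elements."""
    return size["num_tokens"] * size["top_k"] * size["hidden"] * bytes_per_elem


def bandwidth_gbps(bytes_per_rank, seconds, world_size):
    """Return (per-rank GB/s, aggregate GB/s) for one timed iteration."""
    per_rank = bytes_per_rank / seconds / 1e9
    return per_rank, per_rank * world_size


if __name__ == "__main__":
    size = load_problem_size()
    nbytes = dispatch_bytes_per_rank(size)
    # The 100 microsecond timing and world size of 8 are made-up inputs.
    per_rank, aggregate = bandwidth_gbps(nbytes, 100e-6, world_size=8)
    print(f"size={size}")
    print(f"per-rank BW: {per_rank:.2f} GB/s, aggregate BW: {aggregate:.2f} GB/s")
```

Under these assumptions, the per-rank figure is the one comparable to ep_bench output; the aggregate is the per-rank value multiplied by the number of ranks.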