mscclpp/examples
Binyang Li 9aeeaf0f12 Simplify torch-integration tuning example for MPI-only multi-node testing
Use mpi4py for bootstrap and local-rank discovery; drop the torchrun /
gloo / manual MSCCLPP_MASTER_ADDR paths and the netifaces dependency.
Add MNNVL/multi-node algorithm selection (rsag, rsag_zero_copy,
nvls_zero_copy) and route barrier / timing-sync allreduces through the
configured symmetric_memory flag so they work across hosts.
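
A minimal sketch of mpi4py-based local-rank discovery, for illustration only. It is not the code from this commit: the helper names local_rank_from_hostnames and discover_ranks are hypothetical, and the example in the repository may derive the local rank differently (for instance with a shared-memory communicator split). The sketch assumes ranks on the same host report the same hostname.

```python
import socket
from typing import Sequence, Tuple


def local_rank_from_hostnames(hostnames: Sequence[str], rank: int) -> int:
    """Return the index of `rank` among the ranks that share its host.

    `hostnames[i]` is the hostname reported by global rank i.
    (Hypothetical helper, not part of mscclpp or mpi4py.)
    """
    host = hostnames[rank]
    # Count lower-numbered ranks that live on the same host.
    return sum(1 for h in hostnames[:rank] if h == host)


def discover_ranks() -> Tuple[int, int, int]:
    """Return (rank, world_size, local_rank) using only MPI.

    mpi4py is imported lazily so the pure helper above can be used
    and tested without an MPI installation.
    """
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    world_size = comm.Get_size()
    # Lowercase allgather pickles arbitrary Python objects (here: str).
    hostnames = comm.allgather(socket.gethostname())
    return rank, world_size, local_rank_from_hostnames(hostnames, rank)
```

Under an MPI launcher, each process would call discover_ranks() once at startup and use the returned local rank to pick its GPU.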

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-06 18:51:29 +00:00