Mirror of https://github.com/microsoft/mscclpp.git (synced 2026-05-22 22:08:28 +00:00)
Use mpi4py for bootstrap and local-rank discovery; drop the torchrun / gloo / manual MSCCLPP_MASTER_ADDR paths and the netifaces dependency.

Add MNNVL/multi-node algorithm selection (rsag, rsag_zero_copy, nvls_zero_copy) and route barrier / timing-sync allreduces through the configured symmetric_memory flag so they work across hosts.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
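
For illustration only, local-rank discovery over mpi4py could be sketched roughly as below. This is not the code from this change: the helper names `local_rank_from_hostnames` and `discover_with_mpi` are hypothetical, and grouping ranks by hostname is an assumption about how co-located ranks are identified. The only mpi4py calls used are `MPI.COMM_WORLD`, `Get_rank`, `Get_size`, and `allgather`.

```python
import socket


def local_rank_from_hostnames(hostnames, rank):
    """Return (local_rank, local_size) for `rank`, given every rank's hostname.

    Assumption: ranks reporting the same hostname share a node.
    """
    me = hostnames[rank]
    # Local rank = number of lower-numbered ranks on the same host.
    local_rank = sum(1 for h in hostnames[:rank] if h == me)
    # Local size = total number of ranks on the same host.
    local_size = sum(1 for h in hostnames if h == me)
    return local_rank, local_size


def discover_with_mpi():
    """Gather hostnames over MPI.COMM_WORLD and derive this process's local rank.

    Hypothetical helper; returns (rank, world_size, local_rank, local_size).
    """
    # Imported lazily so the pure helper above is usable without MPI installed.
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    # allgather returns a list with one hostname per rank, ordered by rank.
    hostnames = comm.allgather(socket.gethostname())
    local_rank, local_size = local_rank_from_hostnames(hostnames, rank)
    return rank, comm.Get_size(), local_rank, local_size
```

Under an MPI launcher (for example `mpirun -n 8 python script.py`), each process would call `discover_with_mpi()` and could use the resulting local rank to pick its device, with no master address or network-interface lookup required.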