Add a new IB stack impl that doesn't use RDMA atomics (#728)

* Added a configurable InfiniBand (IB) signaling mode.
The `EndpointConfig::Ib::Mode` enum selects the mode (`Default`, `Host`,
`HostNoAtomic`). `Default` is equivalent to `Host` unless the
`MSCCLPP_IBV_MODE` environment variable specifies otherwise. `Host`
corresponds to the previous implementation, which uses RDMA atomics for
signaling, while `HostNoAtomic` uses write-with-immediate instead.
* Updated the Python bindings and API accordingly.
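
A minimal sketch of the mode resolution described above, for illustration only. `IbMode` and `resolveIbMode` are hypothetical stand-ins rather than the library's own definitions, and the environment value `host-no-atomic` is an assumed spelling; the commit message does not state which strings `MSCCLPP_IBV_MODE` accepts.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical mirror of EndpointConfig::Ib::Mode (names taken from the
// commit message; this is not the library's actual declaration).
enum class IbMode { Default, Host, HostNoAtomic };

// Resolve a requested mode to a concrete one.
// Assumption: only Default consults MSCCLPP_IBV_MODE, and the value
// "host-no-atomic" (assumed spelling) selects HostNoAtomic.
IbMode resolveIbMode(IbMode requested) {
  if (requested != IbMode::Default) {
    return requested;  // an explicit Host or HostNoAtomic is kept as-is
  }
  const char* raw = std::getenv("MSCCLPP_IBV_MODE");
  if (raw == nullptr) {
    return IbMode::Host;  // Default is equivalent to Host when unset
  }
  const std::string value(raw);
  if (value == "host-no-atomic") {
    return IbMode::HostNoAtomic;  // write-with-immediate signaling
  }
  return IbMode::Host;  // any other value falls back to RDMA atomics here
}
```

Under these assumptions, a caller that leaves the mode at `Default` could switch to the atomics-free path by exporting the environment variable, without changing code.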
Changho Hwang
2026-02-10 10:07:53 +09:00
committed by GitHub
parent c12822a7af
commit 42be3660e0
20 changed files with 648 additions and 222 deletions


@@ -20,6 +20,7 @@ void register_env(nb::module_& m) {
.def_ro("socket_family", &Env::socketFamily)
.def_ro("socket_ifname", &Env::socketIfname)
.def_ro("comm_id", &Env::commId)
.def_ro("ibv_mode", &Env::ibvMode)
.def_ro("cache_dir", &Env::cacheDir)
.def_ro("npkit_dump_dir", &Env::npkitDumpDir)
.def_ro("cuda_ipc_use_default_stream", &Env::cudaIpcUseDefaultStream);