mscclpp/src
Qinghua Zhou 01032fa167 core: TODO notes on CUDA-IPC atomicAdd context/flush caveats
Annotate the two known issues flagged by Copilot review on PR #796:

- atomicadd_kernel.cu: launching the atomicAdd kernel from a separate
  CUDA context while `dst` is a CUDA-IPC mapping registered in the
  primary context is technically undefined behavior (UB). It works in
  practice on current drivers but should be revisited.
- context.cc: `CudaIpcStream::sync()` deliberately skips
  `proxyAtomicStream_` to avoid deadlocking the proxy thread, with
  the side effect that `Connection::flush()` does not order pending
  remote atomicAdd ops on the CUDA-IPC transport.
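
The ordering gap in the second item can be illustrated with a toy,
host-only model. This is a sketch under stated assumptions, not the
mscclpp implementation: `ToyStream`, `ToyIpcStream`, `copyStream`, and
`demoFlushCaveat` are hypothetical names, and only `proxyAtomicStream`
is meant to echo `proxyAtomicStream_`. Queued lambdas stand in for
work submitted to CUDA streams.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <functional>
#include <utility>

// Toy stand-in for a CUDA stream: queued work runs only when drained.
struct ToyStream {
  std::deque<std::function<void()>> pending;

  void enqueue(std::function<void()> op) { pending.push_back(std::move(op)); }

  void drain() {
    while (!pending.empty()) {
      pending.front()();
      pending.pop_front();
    }
  }
};

// Hypothetical model of the behavior described above (not the real class).
struct ToyIpcStream {
  ToyStream copyStream;         // assumed: regular copy traffic
  ToyStream proxyAtomicStream;  // assumed: remote atomicAdd traffic

  // Mirrors the caveat: sync() drains copies but skips the atomic stream.
  void sync() { copyStream.drain(); }
};

struct FlushResult {
  uint64_t dataAfterSync;
  uint64_t counterAfterSync;
  uint64_t counterAfterDrain;
};

// Enqueue one copy and one atomicAdd, then observe what sync() completes.
FlushResult demoFlushCaveat() {
  uint64_t data = 0;
  uint64_t counter = 0;
  ToyIpcStream s;

  s.copyStream.enqueue([&] { data = 42; });
  s.proxyAtomicStream.enqueue([&] { counter += 1; });

  s.sync();  // what a flush would rely on in this model
  FlushResult r{};
  r.dataAfterSync = data;
  r.counterAfterSync = counter;  // the atomicAdd is still pending here

  s.proxyAtomicStream.drain();   // only an explicit drain completes it
  r.counterAfterDrain = counter;
  assert(s.proxyAtomicStream.pending.empty());
  return r;
}
```

In this model the copy is visible after `sync()` while the counter is
not, so a caller that needs the atomicAdd to be ordered would have to
wait on the atomic stream by some other means.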

Both behaviors were cherry-picked from the DeepEP branch
`chhwang/dev-atomic-add-cleanup` and should be revisited before this
lands on `main`.
2026-05-06 03:44:10 +00:00