mirror of
https://github.com/microsoft/mscclpp.git
synced 2026-06-28 18:37:22 +00:00
Add a public Granularity enum (MultiCastMinimum, MultiCastRecommended) and let GpuBuffer choose the NVLS multicast allocation granularity via a constructor argument, defaulting to MultiCastMinimum to minimize memory usage. Expose the same option through the C++ and Python (nanobind) APIs. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>