Commit Graph

15 Commits

Author SHA1 Message Date
Roshan Dathathri
41e0964d93 Allow binding allocated memory to NVLS multicast pointer (#290)
And change NVLS multimem instructions to static functions
2024-04-18 17:11:31 -07:00
Binyang Li
64d837f9ab Add executor to execute schedule-plan file (#283)
Add executor to execute the JSON schedule file generated by msccl-tools

---------

Co-authored-by: Changho Hwang <changhohwang@microsoft.com>
2024-04-18 19:10:41 +00:00
Changho Hwang
5ba6ce00c7 Fix bootstrapping mechanism (#278)
Co-authored-by: Binyang Li <binyli@microsoft.com>
Co-authored-by: Pashupati Kumar <74680231+pash-msft@users.noreply.github.com>
2024-03-27 10:24:24 +08:00
Saeed Maleki
91d592dcc0 NVLS support. (#250)
Co-authored-by: Saeed Maleki <saemal@microsoft.com>
Co-authored-by: Binyang Li <binyli@microsoft.com>
Co-authored-by: Changho Hwang <changhohwang@microsoft.com>
2024-02-04 20:46:10 -08:00
Changho Hwang
544ff0c21d ROCm support (#213)
Co-authored-by: Binyang Li <binyli@microsoft.com>
2023-11-24 16:41:56 +08:00
Changho Hwang
060fda12e6 mscclpp-test in Python (#204)
Co-authored-by: Binyang Li <binyli@microsoft.com>
Co-authored-by: Saeed Maleki <saemal@microsoft.com>
Co-authored-by: Esha Choukse <eschouks@microsoft.com>
2023-11-16 12:45:25 +08:00
Changho Hwang
4cdb100265 Release GIL for Python APIs with wait (#190) 2023-11-14 21:11:01 +08:00
Changho Hwang
3521fb0280 Clear minor warnings (#214)
Clear warnings from the clang compiler.
2023-11-14 09:28:48 +08:00
Saeed Maleki
85e8017535 Atomic for semaphores instead of fences (#188)
Co-authored-by: Pratyush Patel <pratyushpatel.1995@gmail.com>
Co-authored-by: Esha Choukse <eschouks@microsoft.com>
Co-authored-by: Changho Hwang <changhohwang@microsoft.com>
2023-10-13 18:57:08 +08:00
Saeed Maleki
148681b4bc Fix a pytest bug (#196) 2023-10-13 16:39:43 +08:00
Changho Hwang
8c0f9e84d0 v0.3.0 (#171) 2023-10-11 22:35:54 +08:00
Changho Hwang
11ac824cc7 Align interfaces of put/get/putPackets/getPackets (#185) 2023-10-07 22:18:26 +08:00
Binyang2014
097aa8843a Fix pytest unstable issue. (#170)
- remove `#include <cstdint>` from `poll.hpp`. To make it only contains
device-side code
- Fix compilation issue, which will cause pytest fail randomly. Reuse
the compiled result for same kernel with different arguments
2023-09-06 17:09:04 -07:00
Olli Saarikivi
828be48b21 Add Context and Endpoint classes to enable non-Communicator use-cases (#166)
This PR implements and closes #137. The new `Endpoint` and `Context`
classes expose the connection establishing functionality from
`Communicator`, which now is only responsible for tying together the
bootstrapper with a context.

The largest breaking change here is that
`Communicator.connectOnSetup(...)` now returns the `Connection` wrapped
inside a `NonblockingFuture`. This is because with the way `Context` is
implemented a `Connection` is now fully initialized on construction.

Some smaller breaking API changes from this change are that
`RegisteredMemory` no longer has a `rank()` function (as there maybe no
concept of rank), and similarly `Connection` has no `remoteRank()` and
`tag()` functions. The latter are replaced by `remoteRankOf` and `tagOf`
functions in `Communicator`.

A new `EndpointConfig` class is introduced to avoid duplication of the
IB configuration parameters in the APIs of `Context` and `Communicator`.
The usual usage pattern of just passing in a `Transport` still works due
to an implicit conversion into `EndpointConfig`.

Miscellaneous changes:
-Cleans up how the PIMPL pattern is applied by making both the `Impl`
struct and the `pimpl_` pointers private for all relevant classes in the
core API.
-Enables ctest to be run from the build root directory.
2023-09-06 13:10:04 +08:00
Binyang2014
858e381829 Pytest (#162)
Port python tests to mscclpp.
Please run
`mpirun -tag-output -np 8 pytest ./python/test/test_mscclpp.py -x` to start pytest

---------

Co-authored-by: Saeed Maleki <saemal@microsoft.com>
Co-authored-by: Changho Hwang <changhohwang@microsoft.com>
Co-authored-by: Saeed Maleki <30272783+saeedmaleki@users.noreply.github.com>
2023-09-01 21:22:11 +08:00