Commit Graph

214 Commits

Author SHA1 Message Date
Binyang Li
d725e45f13 fix 2023-03-28 14:53:08 +00:00
Binyang Li
9c633a9633 bug fix 2023-03-28 14:40:51 +00:00
Binyang Li
487030887b refactor 2023-03-28 12:22:43 +00:00
Saeed Maleki
17e144c774 a typo in p2p proxy 2023-03-28 08:07:54 +00:00
Saeed Maleki
81b18cd9f9 a bit of clean up 2023-03-28 06:08:12 +00:00
Saeed Maleki
fa26bdd9fc no gdr copy anywhere in the code except for the files that are not compiled 2023-03-28 05:40:40 +00:00
Saeed Maleki
33af4bfb67 no gdr copy anywhere in the code except for the files that are not compiled 2023-03-28 05:36:31 +00:00
Saeed Maleki
d9ba953fb0 gdrcopy is not initialized 2023-03-28 04:56:06 +00:00
Saeed Maleki
e7cccbf897 both head and tail are on OK to be only used by GPU 2023-03-28 04:26:39 +00:00
Saeed Maleki
952d852256 both head and tail are on OK to be only used by GPU 2023-03-28 04:24:45 +00:00
Ziyue Yang
b234cf5012 NPKit: add DMA events and fix bandwidth calculation (#33) 2023-03-28 09:58:32 +08:00
Saeed Maleki
ea3bd90303 Merge pull request #32 from microsoft/chhwang/log-handler
Add mscclppSetLogHandler
2023-03-27 17:59:00 -07:00
Saeed Maleki
32c4498fb8 typo fixes 2023-03-28 00:55:41 +00:00
Saeed Maleki
75036c0f12 typo fixes 2023-03-28 00:50:59 +00:00
Saeed Maleki
5adf3e3755 typo fix 2023-03-27 23:42:43 +00:00
Saeed Maleki
43c52367fb merged with main and simplified the callback requirements 2023-03-27 23:41:27 +00:00
Saeed Maleki
19bf369dc1 link format correction 2023-03-27 20:40:15 +00:00
Changho Hwang
0edb89dba2 Update README.md 2023-03-27 23:29:24 +08:00
Changho Hwang
8fc8f5b4fe Lint 2023-03-27 14:09:26 +00:00
Changho Hwang
72431957fd Use clang-format-12 2023-03-27 14:00:03 +00:00
Changho Hwang
8e4146aba9 Add mscclppSetLogHandler 2023-03-27 13:33:07 +00:00
Binyang2014
c706990c18 Merge pull request #28 from microsoft/binyli/cpplint
Add lint and enable CI
We use [clang-format](https://clang.llvm.org/docs/ClangFormat.html) to format the C++ code. The file `.clang-format` describes the C++ style we should adopt. For a detailed description, we can refer to https://clang.llvm.org/docs/ClangFormat.html

To check whether our source code conforms to the current code style, we can run `make cpplint`. To fix the code automatically, we can run `make cpplint-autofix`.

To fix a specific file, we can run: `make cpplint-file INPUTFILE=src/bootstrap/bootstrap.cc`
v0.1.0
2023-03-27 14:20:38 +08:00
Binyang Li
7ec6ae9d6a add cpplint and CI 2023-03-27 03:32:10 +00:00
Saeed Maleki
f6a7962511 Merge pull request #30 from microsoft/saemal/consistency_bug_fix
an important deadlock bug fix
2023-03-26 14:02:42 -07:00
Saeed Maleki
35ca25781a an important deadlock bug fix 2023-03-26 02:09:05 +00:00
Saeed Maleki
57b885c9ab Merge pull request #29 from microsoft/crutcher-python
This PR adds the hooks for python binding and also adds testing environment for different functions available in mscclpp.h.
2023-03-25 12:04:35 -07:00
Saeed Maleki
9eca65283c added cmake requirement -- it needs 3.18 or higher 2023-03-25 18:44:20 +00:00
Crutcher Dunnavant
98e254c5f6 readme change 2023-03-25 01:06:23 -07:00
Crutcher Dunnavant
95fda5a4ef ignore ide dirs 2023-03-25 00:15:56 -07:00
Crutcher Dunnavant
f929d2eaba add ci hook 2023-03-25 00:41:21 +00:00
Crutcher Dunnavant
3b1abaaad1 basic init test 2023-03-24 23:45:29 +00:00
Saeed Maleki
0898214f0a added mscclppGetErrorString 2023-03-24 22:57:14 +00:00
Crutcher Dunnavant
8b6e35d5e0 rebase and fix 2023-03-24 22:23:20 +00:00
Crutcher Dunnavant
57b3c36975 include left out lib; add enums 2023-03-24 22:18:51 +00:00
Crutcher Dunnavant
e181cca064 switch to static linking of nanobind 2023-03-24 22:18:51 +00:00
Crutcher Dunnavant
69957baf8d update readme, build python package dir 2023-03-24 22:18:51 +00:00
Crutcher Dunnavant
48e4bac1e0 formatting and additional methods 2023-03-24 22:18:51 +00:00
Crutcher Dunnavant
eb9b750830 format and guard 2023-03-24 22:18:51 +00:00
Crutcher Dunnavant
be96f38ba3 Work towards a nanobind wrapper 2023-03-24 22:18:51 +00:00
Saeed Maleki
0f31dafed5 Merge pull request #27 from microsoft/chhwang/accept-timeout
30 sec timeout for socket accept
2023-03-24 12:46:34 -07:00
Saeed Maleki
b07508b8f3 removed clockSec since it is not used 2023-03-24 19:43:41 +00:00
Saeed Maleki
35b8ebaf64 retry for almost 20 seconds 2023-03-24 19:42:00 +00:00
Saeed Maleki
3fb9383621 Merge pull request #24 from microsoft/madanm-apipush
simplified API for CUDA level communication calls.
2023-03-24 10:42:07 -07:00
Saeed Maleki
56b599b5e7 a bit of api change and clean up on docs 2023-03-24 17:41:04 +00:00
Changho Hwang
551eae0ba1 Update docs 2023-03-24 09:28:12 +00:00
Changho Hwang
7a4c27778f 30 sec timeout for socket accept 2023-03-24 08:29:00 +00:00
Changho Hwang
274e921009 Minor fixes 2023-03-24 07:28:30 +00:00
Changho Hwang
b056db1e1f Merge pull request #26 from microsoft/ziyyang/npkit-pr
Port NPKit
2023-03-24 14:52:59 +08:00
Saeed Maleki
f7dcea914d Merge branch 'madanm-apipush' of https://github.com/microsoft/mscclpp into madanm-apipush 2023-03-24 06:49:53 +00:00
Saeed Maleki
c042112b6b perf debug for allgather 2023-03-24 06:49:38 +00:00