Commit Graph

24 Commits

Author SHA1 Message Date
caiomcbr
7493e2f075 Double buffering for NCCL APIs (#324)
Using two scratch buffers in each peer to exchange data.

---------

Co-authored-by: Changho Hwang <changhohwang@microsoft.com>
2024-07-15 22:18:53 +00:00
Binyang Li
422c81f0f8 remove make pylib-copy command (#249)
Fix #216
Remove `make pylib-copy`
2024-01-19 12:29:15 -08:00
Changho Hwang
5fa5bd2706 Check nvidia_peermem during runtime (#234) 2023-12-25 12:02:10 +08:00
Changho Hwang
544ff0c21d ROCm support (#213)
Co-authored-by: Binyang Li <binyli@microsoft.com>
2023-11-24 16:41:56 +08:00
Changho Hwang
dab19e00c1 Templatize Dockerfiles & update workflows (#223)
Now build images by a script with a shared Dockerfile template

---------

Co-authored-by: Binyang Li <binyli@microsoft.com>
Co-authored-by: Saeed Maleki <saemal@microsoft.com>
2023-11-22 13:29:12 -08:00
Changho Hwang
f68820436c Explicit build dependency on nvidia_peermem (#201) 2023-10-23 04:29:30 +00:00
Changho Hwang
8c0f9e84d0 v0.3.0 (#171) 2023-10-11 22:35:54 +08:00
Changho Hwang
11ac824cc7 Align interfaces of put/get/putPackets/getPackets (#185) 2023-10-07 22:18:26 +08:00
Changho Hwang
497a9e0c82 Add backup workflows (#189) 2023-10-07 15:13:49 +08:00
Saeed Maleki
e7d5e652df Python bindings (#125)
Co-authored-by: Olli Saarikivi <olsaarik@microsoft.com>
Co-authored-by: Changho Hwang <changhohwang@microsoft.com>
Co-authored-by: Binyang Li <binyli@microsoft.com>
2023-07-19 15:35:54 +08:00
Binyang2014
56bdbc2f32 Enable test for both cuda11 and cuda12 (#124)
Update pipeline: enable test for both cuda11 and cuda12
2023-07-10 13:19:14 +08:00
Changho Hwang
4114d65c60 Documents & minor updates (#119)
Co-authored-by: Saeed Maleki <saemal@microsoft.com>
Co-authored-by: Binyang Li <binyli@microsoft.com>
2023-07-07 17:35:05 +08:00
Changho Hwang
bb7b85a810 2-node AllReduce improvements (#118)
* Added `get()` interfaces to `SmChannel`
* Improved 2-node (8 gpus/node) AllReduce: algbw 139GB/s for 1GB (kernel
3) and 99GB/s for 48MB (kernel 4)
* Fixed a FIFO perf bug
* Several fixes & validations in mscclpp-test

---------

Co-authored-by: Binyang Li <binyli@microsoft.com>
Co-authored-by: Saeed Maleki <saemal@microsoft.com>
2023-07-07 07:05:46 +00:00
Binyang2014
2640578b22 Add performance check for mscclpp-test (#110)
- Add ndmv4 perf baseline
- change mscclpp-test to output perf number into a json file
- add python script to check the perf result with the baseline
2023-06-21 07:42:53 +00:00
Changho Hwang
5a4885ccbb Misc updates (#95) 2023-06-12 13:53:43 +08:00
Changho Hwang
798631bd52 Update unit tests (#81) 2023-06-08 09:58:05 +00:00
Changho Hwang
7346e70109 Use MSCCL++ Docker image for CodeQL (#94) 2023-06-06 18:42:22 +08:00
Changho Hwang
0581bfb431 Fix CodeQL workflow (#80) 2023-05-22 14:03:30 +08:00
Changho Hwang
8d54bf3301 Update CI (#79) 2023-05-21 11:45:41 -07:00
Binyang Li
5704fb7c6a update 2023-05-11 08:55:51 +00:00
Binyang Li
1487596dc8 update cpplint 2023-05-11 08:34:57 +00:00
Binyang Li
669c67b3de enable github action on all ranches 2023-05-05 08:42:25 +00:00
Changho Hwang
72431957fd Use clang-format-12 2023-03-27 14:00:03 +00:00
Binyang Li
7ec6ae9d6a add cpplint and CI 2023-03-27 03:32:10 +00:00