Olli Saarikivi
|
81e7d1b344
|
Channels work
|
2023-05-03 17:11:25 +00:00 |
|
Saeed Maleki
|
6002a520b6
|
solved merge conflict
|
2023-05-02 23:56:11 +00:00 |
|
Saeed Maleki
|
54d1e1872c
|
testing writes with signal is passing
|
2023-05-02 23:53:31 +00:00 |
|
Olli Saarikivi
|
4ba8516832
|
allgather_test_cpp functional again
|
2023-05-02 23:14:13 +00:00 |
|
Olli Saarikivi
|
c7b7d20d85
|
Export epoch header
|
2023-05-02 20:35:16 +00:00 |
|
Olli Saarikivi
|
04e878489d
|
Work on a channel service
|
2023-04-28 22:50:38 +00:00 |
|
Saeed Maleki
|
82c27625e6
|
ipc uses a base ptr now
|
2023-04-27 21:33:15 +00:00 |
|
Saeed Maleki
|
df80d8854b
|
connect test
|
2023-04-27 05:26:08 +00:00 |
|
Saeed Maleki
|
7a865d96d7
|
merged with saemal/api-extension
|
2023-04-26 23:24:31 +00:00 |
|
Olli Saarikivi
|
d746201287
|
WIP builds, but doesn't link
|
2023-04-26 17:46:47 +00:00 |
|
Changho Hwang
|
31f7897d5d
|
integrate with new interfaces in mscclpp.hpp
|
2023-04-25 11:47:58 +00:00 |
|
Saeed Maleki
|
cacdb46702
|
merged with api-extension
|
2023-04-24 23:26:28 +00:00 |
|
Binyang Li
|
073460c341
|
fix compile issue
|
2023-04-23 14:25:56 +00:00 |
|
Olli Saarikivi
|
83c7ba1afb
|
C++ API working, allgather_test_cpp passing
|
2023-04-19 17:11:21 +00:00 |
|
Madan Musuvathi
|
c042d9af54
|
Merge branch 'cpp-api' into saemal/api-extension
|
2023-04-13 22:32:38 +00:00 |
|
Saeed Maleki
|
2d68e808e3
|
fix for MPI requirement for mscclpp-tests
|
2023-04-12 19:50:45 +00:00 |
|
Changho Hwang
|
d2c2ae72a7
|
Some cleanup
|
2023-04-11 08:45:22 +00:00 |
|
Changho Hwang
|
69b5bdfd13
|
minor fix
|
2023-04-11 05:01:39 +00:00 |
|
Saeed Maleki
|
e336c93dc9
|
Merge branch 'main' into binyli/mscclpp-test
|
2023-04-08 06:30:48 +00:00 |
|
Changho Hwang
|
b6ea0ca266
|
IB unit test (#47)
|
2023-04-07 21:45:14 +08:00 |
|
Changho Hwang
|
b7461facff
|
Fix Makefile
|
2023-04-07 13:09:56 +00:00 |
|
Changho Hwang
|
949a9cd0a3
|
Optional use of gdrcopy (#48)
Co-authored-by: Saeed Maleki <saemal@microsoft.com>
|
2023-04-07 13:36:59 +08:00 |
|
Binyang Li
|
bf472ff864
|
Fix bug & remove pthread related code
|
2023-04-07 03:37:47 +00:00 |
|
Saeed Maleki
|
1b2db68e93
|
filename changes
|
2023-04-06 23:55:11 +00:00 |
|
Saeed Maleki
|
6c1ebed569
|
combining ./python and ./ lint formats into makefile
|
2023-04-06 23:26:56 +00:00 |
|
Saeed Maleki
|
e82a75c132
|
typo fix
|
2023-04-06 21:58:48 +00:00 |
|
Binyang Li
|
674d30a813
|
minor fix
|
2023-04-04 08:37:16 +00:00 |
|
Binyang Li
|
69a49c189f
|
Add correctness check
|
2023-04-04 07:01:21 +00:00 |
|
Binyang Li
|
617f39daf1
|
change makefile
|
2023-04-03 03:45:36 +00:00 |
|
Binyang Li
|
36c418a239
|
Merge branch 'main' into binyli/mscclpp-test
|
2023-04-02 08:13:48 +00:00 |
|
Binyang Li
|
98020f5b52
|
update
|
2023-03-31 07:10:43 +00:00 |
|
Binyang Li
|
22a977e730
|
init
|
2023-03-30 06:29:38 +00:00 |
|
Saeed Maleki
|
be5e422021
|
merged with main
|
2023-03-29 23:03:12 +00:00 |
|
Binyang2014
|
62279b0063
|
Add mscclppSetBootstrapConnTimeout (#34)
|
2023-03-28 14:01:56 +08:00 |
|
Saeed Maleki
|
33af4bfb67
|
no gdr copy anywhere in the code except for the files that are not compiled
|
2023-03-28 05:36:31 +00:00 |
|
Changho Hwang
|
72431957fd
|
Use clang-format-12
|
2023-03-27 14:00:03 +00:00 |
|
Binyang Li
|
7ec6ae9d6a
|
add cpplint and CI
|
2023-03-27 03:32:10 +00:00 |
|
Saeed Maleki
|
3fb9383621
|
Merge pull request #24 from microsoft/madanm-apipush
simplified API for CUDA level communication calls.
|
2023-03-24 10:42:07 -07:00 |
|
Ziyue Yang
|
f92b428cba
|
Port NPKit
|
2023-03-24 06:41:16 +00:00 |
|
Changho Hwang
|
e7459032e0
|
Add patch version
|
2023-03-24 05:19:25 +00:00 |
|
Saeed Maleki
|
777e93ee47
|
merged with main
|
2023-03-24 02:35:15 +00:00 |
|
Madan Musuvathi
|
e569175832
|
added documentation
|
2023-03-23 00:39:45 +00:00 |
|
Changho Hwang
|
9a6ddfd244
|
Update makefile
|
2023-03-22 09:19:47 +00:00 |
|
Saeed Maleki
|
0a707d84ec
|
new api works -- single node is not performant
|
2023-03-22 02:19:49 +00:00 |
|
Olli Saarikivi
|
0cfe2dcffb
|
Add allpairs allreduce test
To support this include separate source and destination offsets in the trigger.
Add functions for getting the rank and world size from a communicator.
|
2023-03-21 19:00:13 +00:00 |
|
Saeed Maleki
|
93afed3e54
|
new allgather algorithm with both DMA and IB on a single node
|
2023-03-19 21:53:36 +00:00 |
|
Saeed Maleki
|
2061ea91f7
|
Add allgather_test (#14)
|
2023-03-17 12:55:20 +08:00 |
|
Saeed Maleki
|
2279a690d1
|
mscclpp_net.h is not required anywhere
|
2023-03-14 05:38:15 +00:00 |
|
Changho Hwang
|
29a430e7a8
|
NUMA binding
|
2023-02-23 08:18:12 +00:00 |
|
Changho Hwang
|
48b81edf6d
|
Move some files
|
2023-02-22 11:07:22 +00:00 |
|