Olli Saarikivi
d83343ef4e
Make getWc not return a void pointer
2023-05-16 22:52:38 +00:00
Olli Saarikivi
dee55997e9
Remove free and most reinterpret_casts in IB code
2023-05-16 22:48:16 +00:00
Saeed Maleki
5de083ad7e
freeing cudaMalloc'ed pointers
2023-05-15 23:53:30 +00:00
Saeed Maleki
966402706c
Merge pull request #72 from microsoft/ziyyang/doxygen
Add Doxygen-based document
2023-05-15 16:50:17 -07:00
Saeed Maleki
e21392e2c3
Merge branch 'main' into ziyyang/doxygen
2023-05-15 23:45:54 +00:00
Saeed Maleki
112d1eeb22
Merge pull request #75 from microsoft/api-extension
Merge api-extension branch to main
2023-05-15 16:35:50 -07:00
Saeed Maleki
c9ac615b20
Merge pull request #74 from microsoft/saemal/offloading
offloading allgather to CPU entirely
2023-05-15 16:27:00 -07:00
Saeed Maleki
6f7ca05305
Merge remote-tracking branch 'origin/api-extension' into saemal/offloading
2023-05-12 22:43:22 +00:00
Saeed Maleki
2a7b745972
fully working with double buffering
2023-05-12 22:42:22 +00:00
Olli Saarikivi
8f2d7922ed
Change install dir
2023-05-12 21:25:29 +00:00
Olli Saarikivi
d58e698d51
Add headers to install and set default install dir
2023-05-12 21:23:01 +00:00
Saeed Maleki
2691784b88
working -- at least for single node
2023-05-12 20:21:58 +00:00
Saeed Maleki
113473a116
more progress
2023-05-12 07:01:21 +00:00
Saeed Maleki
31851ad82c
host epoch removed
2023-05-12 06:11:12 +00:00
Saeed Maleki
ef558a42e8
wip
2023-05-12 05:54:32 +00:00
Saeed Maleki
260c3e35f0
Merge pull request #73 from microsoft/binyli/exception
Refine exception
2023-05-11 14:29:41 -07:00
Saeed Maleki
62f96f316c
Merge branch 'api-extension' into binyli/exception
2023-05-11 21:24:18 +00:00
Binyang2014
643771bf93
Merge pull request #71 from microsoft/binyli/merge-main
Resolve conflict and merge main branch to api-extension
2023-05-11 17:39:06 +08:00
Binyang Li
e63aae7142
Merge api-extension
2023-05-11 09:20:41 +00:00
Binyang Li
5704fb7c6a
update
2023-05-11 08:55:51 +00:00
Binyang Li
1487596dc8
update cpplint
2023-05-11 08:34:57 +00:00
Binyang Li
785a973ace
refine exception
2023-05-11 08:25:25 +00:00
Ziyue Yang
e257f19cb8
add doc section in readme
2023-05-11 00:46:02 +00:00
Olli Saarikivi
96a0c45fb4
Remove makefile
2023-05-11 00:23:21 +00:00
Olli Saarikivi
9f6c48cbf9
Format all files
2023-05-11 00:23:14 +00:00
Olli Saarikivi
ccf45b33a2
Delete old init code and other C-style code
2023-05-10 22:03:42 +00:00
Olli Saarikivi
b2dfd8a8fe
Merge branch 'api-extension' of https://github.com/microsoft/mscclpp into api-extension
2023-05-10 20:50:51 +00:00
Olli Saarikivi
beaf2aea39
Move public headers under include/
2023-05-10 20:46:49 +00:00
Saeed Maleki
c05586f074
Merge branch 'api-extension' of https://github.com/microsoft/mscclpp into api-extension
2023-05-10 20:24:40 +00:00
Saeed Maleki
33eb4093ac
timeout fix
2023-05-10 20:24:33 +00:00
Olli Saarikivi
f4ecae7c96
Rename tests/ to test/
2023-05-10 18:49:02 +00:00
Olli Saarikivi
75a2af8de2
Add GoogleTest with CTest integration + some tests
Also rename addSetup to onSetup to unify naming.
2023-05-10 18:46:55 +00:00
Ziyue Yang
48a278d2a5
init doxyfile
2023-05-10 16:23:02 +00:00
Olli Saarikivi
4045323aa2
Merge branch 'saemal/api-extension' into api-extension
2023-05-10 15:30:10 +00:00
Binyang Li
b948ed6bfd
Merge branch 'main' into binyli/merge-main
2023-05-10 06:02:22 +00:00
Binyang2014
f8c1dc64da
Update sm copy test ( #70 )
result for 1K message:
```
# Launching MSCCL++ proxy threads
#
#                                        in-place                         out-of-place
#       size         count     time    algbw    busbw  #wrong     time    algbw    busbw  #wrong
#        (B)    (elements)     (us)   (GB/s)   (GB/s)             (us)   (GB/s)   (GB/s)
        1024           256     8.34     0.12     0.12       0
Stopping MSCCL++ proxy threads
# Out of bounds values : 0 OK
```
result for 1G message:
```
#                                        in-place                         out-of-place
#       size         count     time    algbw    busbw  #wrong     time    algbw    busbw  #wrong
#        (B)    (elements)     (us)   (GB/s)   (GB/s)             (us)   (GB/s)   (GB/s)
  1073741824     268435456   5716.9   187.82   187.82       0
Stopping MSCCL++ proxy threads
# Out of bounds values : 0 OK
```
For the 1 KB message, the latency (8.34 us) is better than NCCL's 16.68 us. For the 1 GB message, the bandwidth (187.82 GB/s) is slightly below NCCL's 190.74 GB/s.
2023-05-10 13:56:18 +08:00
Saeed Maleki
1769138568
Host Epoch + Error code
2023-05-09 23:10:12 +00:00
Saeed Maleki
8b384600a9
host epoch works
2023-05-09 22:17:43 +00:00
Binyang2014
bbf7ef621e
Enable github action on all branches ( #68 )
2023-05-09 23:07:13 +09:00
Binyang Li
9c40d616d9
Merge main branch
2023-05-09 10:59:04 +00:00
Binyang2014
8650dbaff8
Add exception class for mscclpp ( #67 )
Add exception class for mscclpp
2023-05-06 16:27:25 +08:00
Olli Saarikivi
4f528d29a0
Make clang-format style file explicit
2023-05-05 19:15:38 +00:00
Olli Saarikivi
86be901d98
CMake improvements
2023-05-05 19:11:33 +00:00
Olli Saarikivi
051643b4c2
Fix clang-format glob
2023-05-05 18:13:34 +00:00
Olli Saarikivi
adaa75536d
Add clang-format to CMake
2023-05-05 18:05:55 +00:00
Binyang Li
669c67b3de
enable github action on all branches
2023-05-05 08:42:25 +00:00
Saeed Maleki
9fb29f9dfc
timeout for flush
2023-05-04 17:48:24 +00:00
Binyang Li
bb3239fd6b
Fix IB write issue
2023-05-04 11:03:45 +00:00
Saeed Maleki
9ecf1f9945
Merge pull request #66 from microsoft/olli/api-extension
Olli/api extension
2023-05-03 19:47:03 -07:00
Olli Saarikivi
ddc9e681c8
Add ib_test to CMake
2023-05-04 00:57:34 +00:00