Changho Hwang
d4ede480f4
Ethernet support (#284)
...
Co-authored-by: Binyang Li <binyli@microsoft.com>
Co-authored-by: Caio Rocha <caiorocha@microsoft.com>
2024-04-25 11:06:43 -07:00
Changho Hwang
5ba6ce00c7
Fix bootstrapping mechanism (#278)
...
Co-authored-by: Binyang Li <binyli@microsoft.com>
Co-authored-by: Pashupati Kumar <74680231+pash-msft@users.noreply.github.com>
2024-03-27 10:24:24 +08:00
Changho Hwang
a6b24dcbed
Fix #163 (#182)
...
The bug occurred because frequent calls to `initialize()` temporarily
exhaust all available ephemeral ports. Fixed by retrying `bind()` after
a delay when it fails with `EADDRINUSE`.
2023-09-15 08:35:01 +00:00
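
The retry described in this commit can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: `bindWithRetry`, `loopbackAnyPort`, the retry count, and the delay are all hypothetical names and values chosen for the example. Only `bind()` and `EADDRINUSE` come from the commit message. Binding to port 0 asks the kernel for an ephemeral port, so `EADDRINUSE` in that case suggests the ephemeral range is temporarily exhausted.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

#include <cassert>
#include <cerrno>
#include <chrono>
#include <thread>

// Hypothetical helper: an IPv4 loopback address with port 0, which lets the
// kernel choose an ephemeral port at bind() time.
sockaddr_in loopbackAnyPort() {
  sockaddr_in addr{};
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
  addr.sin_port = 0;
  return addr;
}

// Hypothetical helper: bind `fd` to `addr`, retrying only when bind() fails
// with EADDRINUSE. Up to `maxRetries` extra attempts are made, sleeping
// `delay` between attempts. Returns 0 on success, or -1 with errno left as
// set by the last failed bind().
int bindWithRetry(int fd, const sockaddr_in& addr, int maxRetries,
                  std::chrono::milliseconds delay) {
  assert(maxRetries >= 0);
  for (int attempt = 0;; ++attempt) {
    if (::bind(fd, reinterpret_cast<const sockaddr*>(&addr), sizeof(addr)) == 0) {
      return 0;  // bound successfully
    }
    // Any other error, or running out of retries, is reported to the caller.
    if (errno != EADDRINUSE || attempt == maxRetries) {
      return -1;
    }
    // Ports may free up shortly, so wait before trying again.
    std::this_thread::sleep_for(delay);
  }
}
```

A caller would create a socket with `socket(AF_INET, SOCK_STREAM, 0)` and then call, for example, `bindWithRetry(fd, loopbackAnyPort(), 10, std::chrono::milliseconds(100))`. Retrying only on `EADDRINUSE` keeps other failures (such as a bad descriptor) visible immediately instead of hiding them behind the delay.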
Saeed Maleki
8d1b984bed
Change device handle interfaces & others (#142)
...
* Changed device handle interfaces
* Changed proxy service interfaces
* Moved device code into separate files
* Fixed FIFO polling issues
* Added configuration arguments to several interface functions
---------
Co-authored-by: Changho Hwang <changhohwang@microsoft.com>
Co-authored-by: Binyang Li <binyli@microsoft.com>
Co-authored-by: root <root@a100-saemal0.qxveptpukjsuthqvv514inp03c.gx.internal.cloudapp.net>
2023-08-16 20:00:56 +08:00
Saeed Maleki
e7d5e652df
Python bindings (#125)
...
Co-authored-by: Olli Saarikivi <olsaarik@microsoft.com>
Co-authored-by: Changho Hwang <changhohwang@microsoft.com>
Co-authored-by: Binyang Li <binyli@microsoft.com>
2023-07-19 15:35:54 +08:00
Saeed Maleki
df2f0c14ab
bootstrap now takes interface (#113)
...
This PR fixes an issue with passing the interface to bootstrap as an input.
2023-06-29 00:16:06 +08:00
Changho Hwang
c4a5958dfc
Fix hanging bootstrap issues (#100)
...
* Rewrite socket interfaces and error handling in C++ style
* Fix bootstrap hanging bugs
* Misc code cleanup
---------
Co-authored-by: Binyang Li <binyli@microsoft.com>
Co-authored-by: Saeed Maleki <saemal@microsoft.com>
2023-06-15 11:29:49 +08:00
Changho Hwang
9cee6c4a74
Cleanup old files and functions (#86)
2023-06-01 17:34:57 +08:00
Olli Saarikivi
457c422791
Remove alloc.h and beef up cuda_utils.hpp (#82)
2023-05-24 08:34:18 +00:00
Olli Saarikivi
9f6c48cbf9
Format all files
2023-05-11 00:23:14 +00:00
Binyang Li
7ac861b1e9
Refactor bootstrap
2023-04-21 08:41:33 +00:00
Saeed Maleki
f3f53a4148
lint
2023-04-08 06:32:57 +00:00
Saeed Maleki
ec83a27e83
wip
2023-04-08 01:57:22 +00:00
Saeed Maleki
e2cfd5ac83
a lot of documentation
2023-03-30 00:37:33 +00:00
Saeed Maleki
be5e422021
merged with main
2023-03-29 23:03:12 +00:00
Binyang2014
62279b0063
Add mscclppSetBootstrapConnTimeout (#34)
2023-03-28 14:01:56 +08:00
Saeed Maleki
fa26bdd9fc
no gdr copy anywhere in the code except for the files that are not compiled
2023-03-28 05:40:40 +00:00
Saeed Maleki
19bf369dc1
link format correction
2023-03-27 20:40:15 +00:00
Saeed Maleki
35b8ebaf64
retry for almost 20 seconds
2023-03-24 19:42:00 +00:00
Changho Hwang
7a4c27778f
30 sec timeout for socket accept
2023-03-24 08:29:00 +00:00
Saeed Maleki
537537563e
fixes connection refused
2023-02-17 01:51:02 +00:00
lambda7xx
fe7d8097d6
cleaned up the mess
2023-02-07 04:42:58 +00:00
Changho Hwang
b4bd7489f0
Move bootstrap components to bootstrap/.
2023-02-06 08:02:13 +00:00