Commit Graph

48 Commits

Author SHA1 Message Date
Felipe Petroski Such
38cd87cdcc add memory region functions 2023-04-07 15:38:48 -07:00
Changho Hwang
949a9cd0a3 Optional use of gdrcopy (#48)
Co-authored-by: Saeed Maleki <saemal@microsoft.com>
2023-04-07 13:36:59 +08:00
Saeed Maleki
cd3cd2c157 lint 2023-04-06 03:20:21 +00:00
Saeed Maleki
08275e93d7 added barrier API + pushed one after mscclppsetup 2023-04-06 03:15:54 +00:00
Saeed Maleki
1731911d00 removing extra stream and destroying created ones 2023-04-02 02:07:41 +00:00
Saeed Maleki
4c6616e7b9 lint 2023-04-01 19:20:50 +00:00
Saeed Maleki
8927dd4d72 great allgather numbers with the current binding mechanism 2023-04-01 18:54:42 +00:00
Binyang Li
af5825b474 bind numa node to communicator 2023-03-31 08:05:49 +00:00
Saeed Maleki
be5e422021 merged with main 2023-03-29 23:03:12 +00:00
Binyang2014
62279b0063 Add mscclppSetBootstrapConnTimeout (#34) 2023-03-28 14:01:56 +08:00
Saeed Maleki
33af4bfb67 no gdr copy anywhere in the code except for the files that are not compiled 2023-03-28 05:36:31 +00:00
Saeed Maleki
d9ba953fb0 gdrcopy is not initialized 2023-03-28 04:56:06 +00:00
Saeed Maleki
e7cccbf897 both head and tail are on OK to be only used by GPU 2023-03-28 04:26:39 +00:00
Saeed Maleki
952d852256 both head and tail are on OK to be only used by GPU 2023-03-28 04:24:45 +00:00
Saeed Maleki
43c52367fb merged with main and simplified the callback requirements 2023-03-27 23:41:27 +00:00
Saeed Maleki
19bf369dc1 link format correction 2023-03-27 20:40:15 +00:00
Changho Hwang
8fc8f5b4fe Lint 2023-03-27 14:09:26 +00:00
Changho Hwang
8e4146aba9 Add mscclppSetLogHandler 2023-03-27 13:33:07 +00:00
Saeed Maleki
0898214f0a added mscclppGetErrorString 2023-03-24 22:57:14 +00:00
Saeed Maleki
3fb9383621 Merge pull request #24 from microsoft/madanm-apipush
simplified API for CUDA level communication calls.
2023-03-24 10:42:07 -07:00
Saeed Maleki
56b599b5e7 a bit of api change and clean up on docs 2023-03-24 17:41:04 +00:00
Changho Hwang
551eae0ba1 Update docs 2023-03-24 09:28:12 +00:00
Ziyue Yang
f92b428cba Port NPKit 2023-03-24 06:41:16 +00:00
Changho Hwang
05fde6c6f3 minor changes 2023-03-24 04:51:20 +00:00
Saeed Maleki
777e93ee47 merged with main 2023-03-24 02:35:15 +00:00
Madan Musuvathi
e6ee81e4fa fixed the order of remote rank and tag in mscclppConnect API 2023-03-23 21:09:04 +00:00
Madan Musuvathi
72edabe2a6 added GetDevConn api to retrieve a connection from remoteRank and tag 2023-03-23 21:03:30 +00:00
Madan Musuvathi
896539b236 Comm owns all state including devcons 2023-03-22 22:43:32 +00:00
Saeed Maleki
270839797e Merge branch 'main' into chhwang/dealloc 2023-03-22 21:14:42 +00:00
Madan Musuvathi
4c459aa0df allgather_test code cleanup 2023-03-22 20:38:29 +00:00
Changho Hwang
48a23243a4 Dealloc more resources 2023-03-22 12:06:35 +00:00
Changho Hwang
9f2eef35d3 Init from a given mscclppUniqueId 2023-03-22 11:25:49 +00:00
Saeed Maleki
483b0c8433 flag is now allocated by the system 2023-03-22 05:14:24 +00:00
Saeed Maleki
b75f9e6d8a implementing new API 2023-03-22 00:29:10 +00:00
Olli Saarikivi
0cfe2dcffb Add allpairs allreduce test
To support this include separate source and destination offsets in the trigger.
Add functions for getting the rank and world size from a communicator.
2023-03-21 19:00:13 +00:00
Saeed Maleki
5493e22633 fixed multinode bug 2023-03-19 06:09:07 +00:00
Saeed Maleki
a485a7f238 single node works fine -- multinode is problematic 2023-03-19 01:08:05 +00:00
Saeed Maleki
9cc21f70e6 redesigning fifo 2023-03-17 22:51:11 +00:00
Changho Hwang
67dbbd1692 Thread-safe trigger 2023-03-17 09:46:23 +00:00
Changho Hwang
dc41c58769 Alloc proxy states on demand 2023-03-14 10:05:56 +00:00
Changho Hwang
e357beef00 One fifo per proxy 2023-03-13 14:19:36 +00:00
Saeed Maleki (saemal)
dced4c4c14 done with the design 2023-03-06 12:26:46 -08:00
Saeed Maleki (saemal)
b663469bcd making a fifo for proxy threads 2023-03-06 10:58:12 -08:00
Saeed Maleki
0216ceb34e added todos 2023-03-06 08:04:33 +00:00
Changho Hwang
9e5573f16b Misc changes and comments 2023-03-03 08:32:47 +00:00
Changho Hwang
9674830db2 Change flags into uint64_t 2023-03-03 07:41:34 +00:00
Changho Hwang
3d051a985f Add p2p proxy: doesn't work with cuda graph yet 2023-02-28 18:58:03 +00:00
Changho Hwang
48b81edf6d Move some files 2023-02-22 11:07:22 +00:00