Felipe Petroski Such | 38cd87cdcc | add memory region functions | 2023-04-07 15:38:48 -07:00
Changho Hwang | 949a9cd0a3 | Optional use of gdrcopy (#48); Co-authored-by: Saeed Maleki <saemal@microsoft.com> | 2023-04-07 13:36:59 +08:00
Saeed Maleki | cd3cd2c157 | lint | 2023-04-06 03:20:21 +00:00
Saeed Maleki | 08275e93d7 | added barrier API + pushed one after mscclppsetup | 2023-04-06 03:15:54 +00:00
Saeed Maleki | 1731911d00 | removing extra stream and destroying created ones | 2023-04-02 02:07:41 +00:00
Saeed Maleki | 4c6616e7b9 | lint | 2023-04-01 19:20:50 +00:00
Saeed Maleki | 8927dd4d72 | great allgather numbers with the current binding mechanism | 2023-04-01 18:54:42 +00:00
Binyang Li | af5825b474 | bind numa node to communicator | 2023-03-31 08:05:49 +00:00
Saeed Maleki | be5e422021 | merged with main | 2023-03-29 23:03:12 +00:00
Binyang2014 | 62279b0063 | Add mscclppSetBootstrapConnTimeout (#34) | 2023-03-28 14:01:56 +08:00
Saeed Maleki | 33af4bfb67 | no gdr copy anywhere in the code except for the files that are not compiled | 2023-03-28 05:36:31 +00:00
Saeed Maleki | d9ba953fb0 | gdrcopy is not initialized | 2023-03-28 04:56:06 +00:00
Saeed Maleki | e7cccbf897 | both head and tail are on OK to be only used by GPU | 2023-03-28 04:26:39 +00:00
Saeed Maleki | 952d852256 | both head and tail are on OK to be only used by GPU | 2023-03-28 04:24:45 +00:00
Saeed Maleki | 43c52367fb | merged with main and simplified the callback requirements | 2023-03-27 23:41:27 +00:00
Saeed Maleki | 19bf369dc1 | link format correction | 2023-03-27 20:40:15 +00:00
Changho Hwang | 8fc8f5b4fe | Lint | 2023-03-27 14:09:26 +00:00
Changho Hwang | 8e4146aba9 | Add mscclppSetLogHandler | 2023-03-27 13:33:07 +00:00
Saeed Maleki | 0898214f0a | added mscclppGetErrorString | 2023-03-24 22:57:14 +00:00
Saeed Maleki | 3fb9383621 | Merge pull request #24 from microsoft/madanm-apipush; simplified API for CUDA level communication calls. | 2023-03-24 10:42:07 -07:00
Saeed Maleki | 56b599b5e7 | a bit of api change and clean up on docs | 2023-03-24 17:41:04 +00:00
Changho Hwang | 551eae0ba1 | Update docs | 2023-03-24 09:28:12 +00:00
Ziyue Yang | f92b428cba | Port NPKit | 2023-03-24 06:41:16 +00:00
Changho Hwang | 05fde6c6f3 | minor changes | 2023-03-24 04:51:20 +00:00
Saeed Maleki | 777e93ee47 | merged with main | 2023-03-24 02:35:15 +00:00
Madan Musuvathi | e6ee81e4fa | fixed the order of remote rank and tag in mscclppConnect API | 2023-03-23 21:09:04 +00:00
Madan Musuvathi | 72edabe2a6 | added GetDevConn api to retrieve a connection from remoteRank and tag | 2023-03-23 21:03:30 +00:00
Madan Musuvathi | 896539b236 | Comm owns all state including devcons | 2023-03-22 22:43:32 +00:00
Saeed Maleki | 270839797e | Merge branch 'main' into chhwang/dealloc | 2023-03-22 21:14:42 +00:00
Madan Musuvathi | 4c459aa0df | allgather_test code cleanup | 2023-03-22 20:38:29 +00:00
Changho Hwang | 48a23243a4 | Dealloc more resources | 2023-03-22 12:06:35 +00:00
Changho Hwang | 9f2eef35d3 | Init from a given mscclppUniqueId | 2023-03-22 11:25:49 +00:00
Saeed Maleki | 483b0c8433 | flag is now allocated by the system | 2023-03-22 05:14:24 +00:00
Saeed Maleki | b75f9e6d8a | implementing new API | 2023-03-22 00:29:10 +00:00
Olli Saarikivi | 0cfe2dcffb | Add allpairs allreduce test; To support this include separate source and destination offsets in the trigger. Add functions for getting the rank and world size from a communicator. | 2023-03-21 19:00:13 +00:00
Saeed Maleki | 5493e22633 | fixed multinode bug | 2023-03-19 06:09:07 +00:00
Saeed Maleki | a485a7f238 | single node works fine -- multinode is problematic | 2023-03-19 01:08:05 +00:00
Saeed Maleki | 9cc21f70e6 | redesigning fifo | 2023-03-17 22:51:11 +00:00
Changho Hwang | 67dbbd1692 | Thread-safe trigger | 2023-03-17 09:46:23 +00:00
Changho Hwang | dc41c58769 | Alloc proxy states on demand | 2023-03-14 10:05:56 +00:00
Changho Hwang | e357beef00 | One fifo per proxy | 2023-03-13 14:19:36 +00:00
Saeed Maleki (saemal) | dced4c4c14 | done with the design | 2023-03-06 12:26:46 -08:00
Saeed Maleki (saemal) | b663469bcd | making a fifo for proxy threads | 2023-03-06 10:58:12 -08:00
Saeed Maleki | 0216ceb34e | added todos | 2023-03-06 08:04:33 +00:00
Changho Hwang | 9e5573f16b | Misc changes and comments | 2023-03-03 08:32:47 +00:00
Changho Hwang | 9674830db2 | Change flags into uint64_t | 2023-03-03 07:41:34 +00:00
Changho Hwang | 3d051a985f | Add p2p proxy: doesn't work with cuda graph yet | 2023-02-28 18:58:03 +00:00
Changho Hwang | 48b81edf6d | Move some files | 2023-02-22 11:07:22 +00:00