Madan Musuvathi | 4c459aa0df | allgather_test code cleanup | 2023-03-22 20:38:29 +00:00 |
Madan Musuvathi | 261fd7f838 | allgather_test code cleanup | 2023-03-22 18:50:23 +00:00 |
Madan Musuvathi | 44c6b94747 | api version 1 | 2023-03-22 18:28:30 +00:00 |
Madan Musuvathi | 6ea460bb3a | fusing signal with sync | 2023-03-22 18:16:42 +00:00 |
Saeed Maleki | 483b0c8433 | flag is now allocated by the system | 2023-03-22 05:14:24 +00:00 |
Saeed Maleki | 0a707d84ec | new api works -- single node is not performant | 2023-03-22 02:19:49 +00:00 |
Saeed Maleki | b75f9e6d8a | implementing new API | 2023-03-22 00:29:10 +00:00 |
Saeed Maleki | aa1a37ab4d | first version | 2023-03-21 21:34:19 +00:00 |
Saeed Maleki | e2ee8d80b9 | perf fix for multi-node allgather | 2023-03-21 06:26:12 +00:00 |
Saeed Maleki | 8b30121240 | Merge branch 'main' into chhwang/fix-trigger | 2023-03-20 23:12:58 +00:00 |
Saeed Maleki | 7cb2903799 | some comment check ins | 2023-03-20 21:07:58 +00:00 |
Saeed Maleki | 93afed3e54 | new allgather algorithm with both DMA and IB on a single node | 2023-03-19 21:53:36 +00:00 |
Saeed Maleki | 8a1ec28ff1 | single node allgather works very well | 2023-03-19 19:27:17 +00:00 |
Saeed Maleki | 3e8f6758e5 | both allgather algorithms | 2023-03-19 06:35:40 +00:00 |
Saeed Maleki | 17cbc84a14 | both allgather algorithms | 2023-03-19 06:35:32 +00:00 |
Saeed Maleki | a485a7f238 | single node works fine -- multinode is problematic | 2023-03-19 01:08:05 +00:00 |
Saeed Maleki | 9cc21f70e6 | redesigning fifo | 2023-03-17 22:51:11 +00:00 |
Saeed Maleki | 73df12358f | Merge branch 'main' of https://github.com/microsoft/mscclpp into main | 2023-03-17 17:54:17 +00:00 |
Saeed Maleki | e86df92fa5 | fixed a typo in debugging information | 2023-03-17 17:52:53 +00:00 |
Changho Hwang | 67dbbd1692 | Thread-safe trigger | 2023-03-17 09:46:23 +00:00 |
Saeed Maleki | 2061ea91f7 | Add allgather_test (#14) | 2023-03-17 12:55:20 +08:00 |
Changho Hwang | aacee9727b | trigger wrappers | 2023-03-14 09:14:51 +00:00 |
Saeed Maleki | e000eb9177 | some compilation clean up | 2023-03-14 05:26:54 +00:00 |
Saeed Maleki | ab9298d6e0 | fixed the bits for trigger | 2023-03-13 23:21:27 +00:00 |
Changho Hwang | e357beef00 | One fifo per proxy | 2023-03-13 14:19:36 +00:00 |
Saeed Maleki | ea7134549e | vector instructions for trigger | 2023-03-13 07:02:26 +00:00 |
Changho Hwang | 1be76d128d | 128-bit trigger | 2023-03-10 10:49:36 +00:00 |
Changho Hwang | 85d92961a3 | Remove MPI dependency | 2023-03-10 08:26:38 +00:00 |
Ubuntu | 8f2831330c | a few todos + some clean up in the test | 2023-03-10 04:34:14 +00:00 |
Changho Hwang | 6ac3c4c90f | Relaxed sync | 2023-03-09 07:24:09 +00:00 |
Saeed Maleki | 160060ec77 | fifo works now | 2023-03-08 20:10:09 +00:00 |
Changho Hwang | 1a382a8e1d | Fix fifo triggers | 2023-03-07 03:31:26 +00:00 |
Saeed Maleki | 3e4c45d73a | compiles | 2023-03-06 20:36:54 +00:00 |
Saeed Maleki (saemal) | dced4c4c14 | done with the design | 2023-03-06 12:26:46 -08:00 |
Changho Hwang | 5ac2ea6e9f | IB more fixes | 2023-03-06 07:01:03 +00:00 |
Saeed Maleki | 7e4bacf20c | works | 2023-03-03 23:10:41 +00:00 |
Saeed Maleki (saemal) | 3def04e72d | separating dma data/flag/sync logic | 2023-03-03 15:05:54 -08:00 |
Changho Hwang | 9674830db2 | Change flags into uint64_t | 2023-03-03 07:41:34 +00:00 |
Changho Hwang | a4df6e2d44 | Merge branch 'main' into chhwang/p2p-simple | 2023-03-01 05:37:46 +00:00 |
Saeed Maleki | ac1bf6dc52 | cudagraph now works with p2p proxy | 2023-02-28 23:25:51 +00:00 |
Changho Hwang | 3d051a985f | Add p2p proxy: doesn't work with cuda graph yet | 2023-02-28 18:58:03 +00:00 |
Changho Hwang | 6bbee64482 | Add cuda graph warmup | 2023-02-28 14:19:11 +00:00 |
Changho Hwang | 29a430e7a8 | NUMA binding | 2023-02-23 08:18:12 +00:00 |
Saeed Maleki | 1a528a3aa3 | merged with main | 2023-02-22 23:16:26 +00:00 |
Saeed Maleki | e1243191da | added cuda graphs few clean ups | 2023-02-22 23:07:10 +00:00 |
Saeed Maleki | bca3362c12 | a few clean ups | 2023-02-22 22:43:18 +00:00 |
Changho Hwang | 89ca0451a8 | Support incremental flag & add perf test | 2023-02-22 10:55:35 +00:00 |
Changho Hwang | 7459a08699 | Rename test code | 2023-02-22 06:56:25 +00:00 |
Changho Hwang | 368f8f4d24 | Merge branch 'saemal/cleanup' into chhwang/p2p-simple | 2023-02-22 06:54:51 +00:00 |
Changho Hwang | 91e04a527b | Bidirectional connection | 2023-02-22 06:06:14 +00:00 |