Commit Graph

49 Commits

Author | SHA1 | Message | Date
Saeed Maleki | 7cb2903799 | some comment check ins | 2023-03-20 21:07:58 +00:00
Saeed Maleki | 93afed3e54 | new allgather algorithm with both DMA and IB on a single node | 2023-03-19 21:53:36 +00:00
Saeed Maleki | 8a1ec28ff1 | single node allgather works very well | 2023-03-19 19:27:17 +00:00
Saeed Maleki | 3e8f6758e5 | both allgather algorithms | 2023-03-19 06:35:40 +00:00
Saeed Maleki | 17cbc84a14 | both allgather algorithms | 2023-03-19 06:35:32 +00:00
Saeed Maleki | a485a7f238 | single node works fine -- multinode is problematic | 2023-03-19 01:08:05 +00:00
Saeed Maleki | 9cc21f70e6 | redesigning fifo | 2023-03-17 22:51:11 +00:00
Changho Hwang | 67dbbd1692 | Thread-safe trigger | 2023-03-17 09:46:23 +00:00
Saeed Maleki | 2061ea91f7 | Add allgather_test (#14) | 2023-03-17 12:55:20 +08:00
Changho Hwang | aacee9727b | trigger wrappers | 2023-03-14 09:14:51 +00:00
Saeed Maleki | e000eb9177 | some compilation clean up | 2023-03-14 05:26:54 +00:00
Saeed Maleki | ab9298d6e0 | fixed the bits for trigger | 2023-03-13 23:21:27 +00:00
Changho Hwang | e357beef00 | One fifo per proxy | 2023-03-13 14:19:36 +00:00
Saeed Maleki | ea7134549e | vector instructions for trigger | 2023-03-13 07:02:26 +00:00
Changho Hwang | 1be76d128d | 128-bit trigger | 2023-03-10 10:49:36 +00:00
Changho Hwang | 85d92961a3 | Remove MPI dependency | 2023-03-10 08:26:38 +00:00
Ubuntu | 8f2831330c | a few todos + some clean up in the test | 2023-03-10 04:34:14 +00:00
Changho Hwang | 6ac3c4c90f | Relaxed sync | 2023-03-09 07:24:09 +00:00
Saeed Maleki | 160060ec77 | fifo works now | 2023-03-08 20:10:09 +00:00
Changho Hwang | 1a382a8e1d | Fix fifo triggers | 2023-03-07 03:31:26 +00:00
Saeed Maleki | 3e4c45d73a | compiles | 2023-03-06 20:36:54 +00:00
Saeed Maleki (saemal) | dced4c4c14 | done with the design | 2023-03-06 12:26:46 -08:00
Changho Hwang | 5ac2ea6e9f | IB more fixes | 2023-03-06 07:01:03 +00:00
Saeed Maleki | 7e4bacf20c | works | 2023-03-03 23:10:41 +00:00
Saeed Maleki (saemal) | 3def04e72d | separating dma data/flag/sync logic | 2023-03-03 15:05:54 -08:00
Changho Hwang | 9674830db2 | Change flags into uint64_t | 2023-03-03 07:41:34 +00:00
Changho Hwang | a4df6e2d44 | Merge branch 'main' into chhwang/p2p-simple | 2023-03-01 05:37:46 +00:00
Saeed Maleki | ac1bf6dc52 | cudagraph now works with p2p proxy | 2023-02-28 23:25:51 +00:00
Changho Hwang | 3d051a985f | Add p2p proxy: doesn't work with cuda graph yet | 2023-02-28 18:58:03 +00:00
Changho Hwang | 6bbee64482 | Add cuda graph warmup | 2023-02-28 14:19:11 +00:00
Changho Hwang | 29a430e7a8 | NUMA binding | 2023-02-23 08:18:12 +00:00
Saeed Maleki | 1a528a3aa3 | merged with main | 2023-02-22 23:16:26 +00:00
Saeed Maleki | e1243191da | added cuda graphs few clean ups | 2023-02-22 23:07:10 +00:00
Saeed Maleki | bca3362c12 | a few clean ups | 2023-02-22 22:43:18 +00:00
Changho Hwang | 89ca0451a8 | Support incremental flag & add perf test | 2023-02-22 10:55:35 +00:00
Changho Hwang | 7459a08699 | Rename test code | 2023-02-22 06:56:25 +00:00
Changho Hwang | 368f8f4d24 | Merge branch 'saemal/cleanup' into chhwang/p2p-simple | 2023-02-22 06:54:51 +00:00
Changho Hwang | 91e04a527b | Bidirectional connection | 2023-02-22 06:06:14 +00:00
Saeed Maleki | ca89c17aaa | more clean up | 2023-02-20 00:23:24 +00:00
Saeed Maleki | 09d3e2f72c | comments for bootstrap test | 2023-02-19 19:11:16 +00:00
Changho Hwang | 33e20aceb9 | IB all-to-all works | 2023-02-17 11:39:16 +00:00
Saeed Maleki | 537537563e | fixes connection refused | 2023-02-17 01:51:02 +00:00
v-xiaoxshi | 654dd5f172 | works | 2023-02-16 07:07:13 +00:00
v-xiaoxshi | a364a39d17 | compiles now | 2023-02-16 05:25:17 +00:00
v-xiaoxshi | ad3be20b15 | compiles now | 2023-02-16 04:51:51 +00:00
v-xiaoxshi | 81baa73822 | more progress | 2023-02-16 04:28:36 +00:00
Changho Hwang | 8e57fd9896 | p2p all-to-all works | 2023-02-13 11:25:20 +00:00
Saeed Maleki | dfe1c4500a | test without mpi | 2023-02-07 22:48:37 +00:00
Changho Hwang | 8f7ebe99e3 | Build into a shared library | 2023-02-07 07:36:37 +00:00