Author | Commit | Message | Date
Olli Saarikivi | 9f6c48cbf9 | Format all files | 2023-05-11 00:23:14 +00:00
Olli Saarikivi | ccf45b33a2 | Delete old init code and other C-style code | 2023-05-10 22:03:42 +00:00
Changho Hwang | 08e80f1754 | IB: completely replaced with C++ interfaces | 2023-04-27 04:01:46 +00:00
Changho Hwang | dd0883b84f | Lint | 2023-04-12 09:25:35 +00:00
Changho Hwang | bc729cd481 | Move MRs / MR infos to mscclppHostIBConn & cleanup | 2023-04-12 09:05:42 +00:00
Changho Hwang | fd3f928108 | remove hostFifo & rename devFifo to just fifo | 2023-04-12 08:08:19 +00:00
Madan Musuvathi | 9124856ea4 | first version hostConn | 2023-04-12 01:36:06 +00:00
Changho Hwang | 7a0e64813a | Add fifo for host connections | 2023-04-11 12:28:45 +00:00
Changho Hwang | 35acdf796c | Add mscclppProxyFifo | 2023-04-11 11:28:40 +00:00
Saeed Maleki | ee6c2deb44 | Merge branch 'main' into saemal/api-extension | 2023-04-11 01:43:13 +00:00
Saeed Maleki | b6179224aa | lint | 2023-04-11 01:36:37 +00:00
Saeed Maleki | 48102a0858 | removing unnecessary flags | 2023-04-11 01:22:40 +00:00
Changho Hwang | a1ae982c61 | Merge signalEpochId with proxySignalEpochId | 2023-04-10 14:05:25 +00:00
Saeed Maleki | 426e78997c | name changes + documentation for clarity | 2023-04-09 02:20:54 +00:00
Ziyue Yang | 5f0b58abda | fix lint | 2023-04-08 07:16:32 +00:00
Ziyue Yang | 09de60854e | fix lint | 2023-04-08 07:15:25 +00:00
Ziyue Yang | 748d3d1596 | separate flag and data | 2023-04-08 07:12:46 +00:00
Ziyue Yang | f68eeba2d4 | change clock collection approach | 2023-04-08 05:29:34 +00:00
Changho Hwang | 949a9cd0a3 | Optional use of gdrcopy (#48); Co-authored-by: Saeed Maleki <saemal@microsoft.com> | 2023-04-07 13:36:59 +08:00
Ziyue Yang | 352a10a33d | NPKit: improve event collection for async requests (#45) | 2023-04-06 16:21:34 +08:00
Saeed Maleki | 1731911d00 | removing extra stream and destroying created ones | 2023-04-02 02:07:41 +00:00
Saeed Maleki | 4c6616e7b9 | lint | 2023-04-01 19:20:50 +00:00
Saeed Maleki | 8927dd4d72 | great allgather numbers with the current binding mechanism | 2023-04-01 18:54:42 +00:00
Binyang Li | af5825b474 | bind numa node to communicator | 2023-03-31 08:05:49 +00:00
Changho Hwang | b58eae4037 | Minor changes | 2023-03-30 07:11:41 +00:00
Saeed Maleki | e2cfd5ac83 | a lot of documentation | 2023-03-30 00:37:33 +00:00
Binyang Li | d725e45f13 | fix | 2023-03-28 14:53:08 +00:00
Binyang Li | 9c633a9633 | bug fix | 2023-03-28 14:40:51 +00:00
Binyang Li | 487030887b | refactor | 2023-03-28 12:22:43 +00:00
Saeed Maleki | 17e144c774 | a typo in p2p proxy | 2023-03-28 08:07:54 +00:00
Saeed Maleki | 81b18cd9f9 | a bit of clean up | 2023-03-28 06:08:12 +00:00
Saeed Maleki | fa26bdd9fc | no gdr copy anywhere in the code except for the files that are not compiled | 2023-03-28 05:40:40 +00:00
Saeed Maleki | 33af4bfb67 | no gdr copy anywhere in the code except for the files that are not compiled | 2023-03-28 05:36:31 +00:00
Saeed Maleki | d9ba953fb0 | gdrcopy is not initialized | 2023-03-28 04:56:06 +00:00
Saeed Maleki | 952d852256 | both head and tail are on OK to be only used by GPU | 2023-03-28 04:24:45 +00:00
Ziyue Yang | b234cf5012 | NPKit: add DMA events and fix bandwidth calculation (#33) | 2023-03-28 09:58:32 +08:00
Saeed Maleki | 19bf369dc1 | link format correction | 2023-03-27 20:40:15 +00:00
Saeed Maleki | 3fb9383621 | Merge pull request #24 from microsoft/madanm-apipush: simplified API for CUDA level communication calls. | 2023-03-24 10:42:07 -07:00
Ziyue Yang | f92b428cba | Port NPKit | 2023-03-24 06:41:16 +00:00
Saeed Maleki | 777e93ee47 | merged with main | 2023-03-24 02:35:15 +00:00
Saeed Maleki | b75f9e6d8a | implementing new API | 2023-03-22 00:29:10 +00:00
Olli Saarikivi | 0cfe2dcffb | Add allpairs allreduce test. To support this include separate source and destination offsets in the trigger. Add functions for getting the rank and world size from a communicator. | 2023-03-21 19:00:13 +00:00
Saeed Maleki | 4efc6e98db | incorrect access fixed | 2023-03-19 01:26:30 +00:00
Saeed Maleki | a485a7f238 | single node works fine -- multinode is problematic | 2023-03-19 01:08:05 +00:00
Changho Hwang | dc41c58769 | Alloc proxy states on demand | 2023-03-14 10:05:56 +00:00
Changho Hwang | c2859d258c | Use aligned ld/st | 2023-03-14 09:22:28 +00:00
Changho Hwang | 135520a14a | cleanups | 2023-03-14 09:21:52 +00:00
Changho Hwang | 75ec82d257 | Store fifo tail in proxy state | 2023-03-14 09:00:38 +00:00
Changho Hwang | e89d154503 | Check run state periodically | 2023-03-14 08:38:55 +00:00
Changho Hwang | 9b124cabdb | cleanup | 2023-03-13 14:27:29 +00:00