mscclpp

mirror of https://github.com/microsoft/mscclpp.git synced 2026-05-12 09:17:06 +00:00

Author	SHA1	Message	Date
caiomcbr	7493e2f075	Double buffering for NCCL APIs (#324) Using two scratch buffers in each peer to exchange data. --------- Co-authored-by: Changho Hwang <changhohwang@microsoft.com>	2024-07-15 22:18:53 +00:00
Binyang Li	422c81f0f8	remove make pylib-copy command (#249) Fix #216 Remove `make pylib-copy`	2024-01-19 12:29:15 -08:00
Changho Hwang	5fa5bd2706	Check `nvidia_peermem` during runtime (#234)	2023-12-25 12:02:10 +08:00
Changho Hwang	c15a166cf0	Add a documentation issue template (#230)	2023-12-05 01:01:45 +00:00
Changho Hwang	544ff0c21d	ROCm support (#213) Co-authored-by: Binyang Li <binyli@microsoft.com>	2023-11-24 16:41:56 +08:00
Changho Hwang	dab19e00c1	Templatize Dockerfiles & update workflows (#223) Now build images by a script with a shared Dockerfile template --------- Co-authored-by: Binyang Li <binyli@microsoft.com> Co-authored-by: Saeed Maleki <saemal@microsoft.com>	2023-11-22 13:29:12 -08:00
Changho Hwang	f68820436c	Explicit build dependency on `nvidia_peermem` (#201)	2023-10-23 04:29:30 +00:00
Changho Hwang	8c0f9e84d0	v0.3.0 (#171)	2023-10-11 22:35:54 +08:00
Changho Hwang	11ac824cc7	Align interfaces of put/get/putPackets/getPackets (#185)	2023-10-07 22:18:26 +08:00
Changho Hwang	497a9e0c82	Add backup workflows (#189)	2023-10-07 15:13:49 +08:00
Changho Hwang	bb64f68d74	Update issue templates (#179)	2023-09-15 04:05:09 +00:00
Saeed Maleki	e7d5e652df	Python bindings (#125) Co-authored-by: Olli Saarikivi <olsaarik@microsoft.com> Co-authored-by: Changho Hwang <changhohwang@microsoft.com> Co-authored-by: Binyang Li <binyli@microsoft.com>	2023-07-19 15:35:54 +08:00
Binyang2014	56bdbc2f32	Enable test for both cuda11 and cuda12 (#124) Update pipeline: enable test for both cuda11 and cuda12	2023-07-10 13:19:14 +08:00
Changho Hwang	4114d65c60	Documents & minor updates (#119) Co-authored-by: Saeed Maleki <saemal@microsoft.com> Co-authored-by: Binyang Li <binyli@microsoft.com>	2023-07-07 17:35:05 +08:00
Changho Hwang	bb7b85a810	2-node AllReduce improvements (#118) * Added `get()` interfaces to `SmChannel` * Improved 2-node (8 gpus/node) AllReduce: algbw 139GB/s for 1GB (kernel 3) and 99GB/s for 48MB (kernel 4) * Fixed a FIFO perf bug * Several fixes & validations in mscclpp-test --------- Co-authored-by: Binyang Li <binyli@microsoft.com> Co-authored-by: Saeed Maleki <saemal@microsoft.com>	2023-07-07 07:05:46 +00:00
Binyang2014	2640578b22	Add performance check for mscclpp-test (#110) - Add ndmv4 perf baseline - change mscclpp-test to output perf number into a json file - add python script to check the perf result with the baseline	2023-06-21 07:42:53 +00:00
Changho Hwang	5a4885ccbb	Misc updates (#95)	2023-06-12 13:53:43 +08:00
Changho Hwang	798631bd52	Update unit tests (#81)	2023-06-08 09:58:05 +00:00
Changho Hwang	7346e70109	Use MSCCL++ Docker image for CodeQL (#94)	2023-06-06 18:42:22 +08:00
Changho Hwang	0581bfb431	Fix CodeQL workflow (#80)	2023-05-22 14:03:30 +08:00
Changho Hwang	8d54bf3301	Update CI (#79)	2023-05-21 11:45:41 -07:00
Binyang Li	5704fb7c6a	update	2023-05-11 08:55:51 +00:00
Binyang Li	1487596dc8	update cpplint	2023-05-11 08:34:57 +00:00
Binyang Li	669c67b3de	enable github action on all branches	2023-05-05 08:42:25 +00:00
Changho Hwang	72431957fd	Use clang-format-12	2023-03-27 14:00:03 +00:00
Binyang Li	7ec6ae9d6a	add cpplint and CI	2023-03-27 03:32:10 +00:00

26 Commits