mscclpp

mirror of https://github.com/microsoft/mscclpp.git synced 2026-05-12 17:26:04 +00:00

Author	SHA1	Message	Date
Binyang2014	56bdbc2f32	Enable test for both cuda11 and cuda12 (#124). Update pipeline: enable test for both cuda11 and cuda12	2023-07-10 13:19:14 +08:00
Changho Hwang	bb7b85a810	2-node AllReduce improvements (#118) * Added `get()` interfaces to `SmChannel` * Improved 2-node (8 GPUs/node) AllReduce: algbw 139 GB/s for 1 GB (kernel 3) and 99 GB/s for 48 MB (kernel 4) * Fixed a FIFO perf bug * Several fixes & validations in mscclpp-test --------- Co-authored-by: Binyang Li <binyli@microsoft.com> Co-authored-by: Saeed Maleki <saemal@microsoft.com>	2023-07-07 07:05:46 +00:00
Binyang2014	2640578b22	Add performance check for mscclpp-test (#110) - Add ndmv4 perf baseline - Change mscclpp-test to output perf numbers into a JSON file - Add Python script to check the perf result against the baseline	2023-06-21 07:42:53 +00:00
Binyang2014	8efacae332	update pipeline (#103) Update Azure pipeline: - Using mscclpp:base-cuda12.1 image for building and testing - Add mp-ut tests for multi-nodes	2023-06-14 20:14:57 +08:00