Commit Graph

2 Commits

Author SHA1 Message Date
Changho Hwang
bb7b85a810 2-node AllReduce improvements (#118)
* Added `get()` interfaces to `SmChannel`
* Improved 2-node (8 gpus/node) AllReduce: algbw 139GB/s for 1GB (kernel
3) and 99GB/s for 48MB (kernel 4)
* Fixed a FIFO perf bug
* Several fixes & validations in mscclpp-test

---------

Co-authored-by: Binyang Li <binyli@microsoft.com>
Co-authored-by: Saeed Maleki <saemal@microsoft.com>
2023-07-07 07:05:46 +00:00
Binyang2014
8efacae332 update pipeline (#103)
Update Azure pipeline:
- Using mscclpp:base-cuda12.1 image for building and testing
- Add mp-ut tests for multi-nodes
2023-06-14 20:14:57 +08:00