mirror of
https://github.com/microsoft/mscclpp.git
synced 2026-05-12 17:26:04 +00:00
* code complete
* fix correctness issue
* Fix correctness issue
* fix lint
* pass compile
* Fix build issue
* Fix runtime error
* Fix correctness issue
* Fix crash issue
* minor change
* Fix memory leak
* Fix review comments
* Finish allgather
* address comments
* load element to register first then store to remote address
* Finish allGather
* init
* Build connections
* allreduce_test works
* Bug fix
* Add CUDA flags
* Add packet copy (LL)
* Lint
* Set tmpPtr from constructors
* Lint
* Multiple blocks per peer
* Beautify
* Temporal ring reduce
* Ring reduce works correctly
* Overlapping
* Fix overlapping
* Improve vector sum
* figuring out how to use atomics
* working now
* wip
* Enhance LL AllReduce
* Support multiple blocks per peer
* Fix a ring reduce bug
* Fix an AllReduce kernel 2 bug
* Bug fix
* wip
* Make it compilable
* Lint
* Lint
* Minor changes
* Unit test to reproduce memory consistency bugs
* Unit test bug fixes
* Fixes
* Typo
* wip
* done with core
* wip
* wip
* compiles
* only the atomic is failing
* almost working
* all tests pass now
* clang-12
* More jailbreaks
* bug fix for common.cu
* adding stdint to concurrency.hpp
* Out-of-place for AllReduce kernel 2
* Optimize `sync()`
* Fix mp_unit_tests
* Init TestEngine with TestArgs
* Change common.cu into common.cc
* Cleanup common.hpp
* Lint
* fixes to the mscclpp-tests
* fixed common.cc

---------

Co-authored-by: Binyang Li <binyli@microsoft.com>
Co-authored-by: Saeed Maleki <saemal@microsoft.com>