Files
mscclpp/python
Caio Rocha 0c9b9abfd5 Adding Support 4 Nodes AllReduce Small Message Size (#794)
Results on 4 Nodes H200:

| Size | NCCL  | MSCCL++ 57TB | MSCCL++ 29TB |
|------|-------|--------------|--------------|
| 8K   | 45.75 | 17.74        | 18.18        |
| 16K  | 47.08 | 18.9         | 18.42        |
| 32K  | 47.29 | 19.48        | 19.12        |
| 64K  | 50.34 | 20.51        | 19.29        |
| 128K | 59.65 | 21.37        | 20.25        |
| 256K | 87.46 | 23.87        | 23.51        |
| 512K | 106.55| 29.15        | 29.51        |
| 1M   | 115   | 40.64        | 41.83        |
| 2M   | 135.89| 63.73        | 70.45        |
| 4M   | 177.59| 121.76       | 128.79       |
| 8M   | 251.17| 228.5        | 251.36       |

---------

Co-authored-by: Binyang Li <binyli@microsoft.com>
Co-authored-by: Caio Rocha <caiorocha@microsof.com>
2026-05-12 13:45:55 -07:00
..