Commit Graph

9 Commits

Author SHA1 Message Date
Qinghua Zhou
1071ddb050 Update the benchmark to improve the rank mapping, communicator creation, backend selection 2026-03-10 03:17:12 +00:00
Qinghua Zhou
d00713d3c2 Add more real moe workloads for alltoallv 2026-03-02 12:51:21 +00:00
Qinghua Zhou
ee843d445f Add test of real MoE workloads 2026-02-25 12:39:48 +00:00
Qinghua Zhou
ae59eab6a2 Add unified benchmarking function to test all_to_all_single of mscclpp and torch 2026-02-24 07:17:17 +00:00
Qinghua Zhou
715ecd91cf Add baseline test of torch.distributed.all_to_all_single 2026-02-24 06:51:10 +00:00
Qinghua Zhou
98be0def08 Use variable sizes in the performance test 2026-02-24 06:29:46 +00:00
Qinghua Zhou
6292b6ab33 Report unidirectional bandwidth 2026-02-24 06:02:33 +00:00
Qinghua Zhou
21e3f1ebb3 Get correct remote receive displacements for peers 2026-02-23 14:22:30 +00:00
Qinghua Zhou
7ba83e20dd PyTorch-compatible all_to_all_single API using mscclpp kernels 2026-02-23 09:51:51 +00:00