mscclpp/apps
Changho Hwang 474ef0b696 Optimized allreduce fallback for ~10KB sizes (#506)
* Pass the op type as a template parameter
* Use the all-pairs algorithm for ~10KB
* Don't write channel handles on the shared memory for small sizes
* A reduction bug fix & cleanup
2025-04-23 10:38:15 -07:00