mscclpp/tests
Binyang2014 f8c1dc64da Update sm copy test (#70)
Result for a 1 KB message:
```
# Launching MSCCL++ proxy threads
#
#                                    in-place                       out-of-place          
#       size         count     time   algbw   busbw  #wrong     time   algbw   busbw  #wrong
#        (B)    (elements)     (us)  (GB/s)  (GB/s)            (us)  (GB/s)  (GB/s)       
        1024           256                                      8.34    0.12    0.12      0
Stopping MSCCL++ proxy threads
# Out of bounds values : 0 OK
```

Result for a 1 GB message:
```
#                                    in-place                       out-of-place          
#       size         count     time   algbw   busbw  #wrong     time   algbw   busbw  #wrong
#        (B)    (elements)     (us)  (GB/s)  (GB/s)            (us)  (GB/s)  (GB/s)       
  1073741824     268435456                                    5716.9  187.82  187.82      0
Stopping MSCCL++ proxy threads
# Out of bounds values : 0 OK
```
For the 1 KB message, the latency (8.34 us) is better than NCCL's 16.68 us. For the 1 GB message, the bandwidth (187.82 GB/s) is slightly lower than NCCL's 190.74 GB/s.
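
As a sanity check on the tables above, the `algbw` column can be reproduced from the `size` and `time` columns. This is a minimal sketch, assuming `algbw` is simply bytes transferred divided by elapsed time with 1 GB = 1e9 bytes (the nccl-tests convention); the helper name `algbw_gbps` is hypothetical and not part of the test code.

```python
def algbw_gbps(size_bytes: int, time_us: float) -> float:
    """Algorithm bandwidth in GB/s, assuming algbw = size / time and 1 GB = 1e9 bytes."""
    seconds = time_us * 1e-6
    return size_bytes / seconds / 1e9


# (size in bytes, out-of-place time in us) taken from the two result tables
results = [(1024, 8.34), (1073741824, 5716.9)]

for size, time_us in results:
    print(f"{size:>12} B  {time_us:>8.2f} us  {algbw_gbps(size, time_us):7.2f} GB/s")

# Ratios against the NCCL numbers quoted in the commit message
latency_ratio = 8.34 / 16.68        # MSCCL++ latency relative to NCCL at 1 KB
bandwidth_ratio = 187.82 / 190.74   # MSCCL++ bandwidth relative to NCCL at 1 GB
print(f"latency ratio: {latency_ratio:.2f}, bandwidth ratio: {bandwidth_ratio:.3f}")
```

Under that assumption the computed values round to the 0.12 GB/s and 187.82 GB/s reported in the tables.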
2023-05-10 13:56:18 +08:00