mscclpp/tests
Changho Hwang c3cb81a906 Fix putDirect (#62)
`ring_send_recv_test_perf` result (1 node, 8 GPUs):

```
# minBytes 1024 maxBytes 1024 step: 1048576(bytes) warmup iters: 10 iters: 100 validation: 1 graph: 1, kernel num: 0
#
# Using devices
#  Rank  0 Pid 365596 on costsim-dev-00000A device  0 [0001:00:00.0] NVIDIA A100-SXM4-80GB
#  Rank  1 Pid 365597 on costsim-dev-00000A device  1 [0002:00:00.0] NVIDIA A100-SXM4-80GB
#  Rank  2 Pid 365598 on costsim-dev-00000A device  2 [0003:00:00.0] NVIDIA A100-SXM4-80GB
#  Rank  3 Pid 365599 on costsim-dev-00000A device  3 [0004:00:00.0] NVIDIA A100-SXM4-80GB
#  Rank  4 Pid 365600 on costsim-dev-00000A device  4 [000B:00:00.0] NVIDIA A100-SXM4-80GB
#  Rank  5 Pid 365602 on costsim-dev-00000A device  5 [000C:00:00.0] NVIDIA A100-SXM4-80GB
#  Rank  6 Pid 365603 on costsim-dev-00000A device  6 [000D:00:00.0] NVIDIA A100-SXM4-80GB
#  Rank  7 Pid 365605 on costsim-dev-00000A device  7 [000E:00:00.0] NVIDIA A100-SXM4-80GB
#
# Initializing MSCCL++
# Setting up the connection in MSCCL++
# Launching MSCCL++ proxy threads
#
#                                    in-place                       out-of-place          
#       size         count     time   algbw   busbw  #wrong     time   algbw   busbw  #wrong
#        (B)    (elements)     (us)  (GB/s)  (GB/s)            (us)  (GB/s)  (GB/s)       
        1024           256    31.70    0.26    0.23      0
Stopping MSCCL++ proxy threads
# Out of bounds values : 0 OK
#
```
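
As a sanity check on the row above, here is a minimal sketch of one reading that reproduces the printed `algbw` and `busbw` values. The formulas are assumptions inferred from the numbers, not confirmed against the test source: `algbw` is taken as total bytes across all ranks (size times rank count) divided by time, and `busbw` as `algbw` scaled by `(n - 1) / n`. The `bandwidths` helper is hypothetical and is not part of MSCCL++.

```python
def bandwidths(size_bytes, time_us, n_ranks):
    """Return (algbw, busbw) in GB/s under the assumed formulas.

    Assumption: algbw counts the bytes of all ranks together, and
    busbw scales algbw by (n_ranks - 1) / n_ranks.
    """
    total_bytes = size_bytes * n_ranks
    algbw = total_bytes / (time_us * 1e-6) / 1e9  # bytes/s -> GB/s
    busbw = algbw * (n_ranks - 1) / n_ranks
    return algbw, busbw


# Values taken from the 1024-byte row of the output above (8 ranks).
algbw, busbw = bandwidths(1024, 31.70, 8)
print(f"algbw={algbw:.2f} GB/s busbw={busbw:.2f} GB/s")
```

With the values from the table this prints 0.26 GB/s and 0.23 GB/s, matching the reported columns. Dividing the per-rank 1024 bytes alone by 31.70 us would give about 0.03 GB/s, which is why the total-bytes reading is assumed here.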
2023-04-26 18:19:13 +08:00