Support Fusion for ReadPutPacket Operation at DSL (#742)

This change adds support for fusing ReadPutPacket operations in the DSL,
which reduces the overhead of reading the same packet data in the scratch
buffer multiple times. Fusion occurs when two rppkt operations with the
same src_buffer are executed consecutively:

rppkt(src, dst0) + rppkt(src, dst1) -> rppkt(src, [dst0, dst1])
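
The rule above can be sketched in plain Python. This is only an
illustration of the idea, not the code in this commit: the
ReadPutPacketOp class, its src/dsts fields, and fuse_consecutive are
hypothetical names, and buffers are modeled as plain strings.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ReadPutPacketOp:
    """Hypothetical stand-in for an rppkt operation (not the mscclpp class)."""
    src: str                                        # source buffer identifier
    dsts: List[str] = field(default_factory=list)   # destination buffers


def fuse_consecutive(ops: List[ReadPutPacketOp]) -> List[ReadPutPacketOp]:
    """Merge adjacent ops that share the same src into one op.

    Only neighbours are merged: an op with a different src in between
    breaks the run, so relative ordering of the program is preserved.
    """
    fused: List[ReadPutPacketOp] = []
    for op in ops:
        if fused and fused[-1].src == op.src:
            # Same source as the previous op: append the destinations
            # so the source is read once for all of them.
            fused[-1].dsts.extend(op.dsts)
        else:
            # Copy the op so the caller's list is left unmodified.
            fused.append(ReadPutPacketOp(op.src, list(op.dsts)))
    return fused


program = [
    ReadPutPacketOp("src", ["dst0"]),
    ReadPutPacketOp("src", ["dst1"]),
]
result = fuse_consecutive(program)
```

In this sketch, result holds a single operation whose dsts list carries
both destinations, mirroring rppkt(src, [dst0, dst1]).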

Co-authored-by: Binyang Li <binyli@microsoft.com>
Caio Rocha
2026-02-12 17:27:20 -08:00
committed by GitHub
parent 42be3660e0
commit dff3bc7bbb
4 changed files with 105 additions and 9 deletions


@@ -11,7 +11,7 @@ from mscclpp import (
env,
)
from mscclpp import CommGroup, GpuBuffer
-from mscclpp.utils import KernelBuilder, GpuBuffer, pack
+from mscclpp.utils import KernelBuilder, pack
import os
import struct