Commit Graph

  • d3f8aecdf2 check nvlink support for specified GPU mahdiehghazim 2026-01-29 18:28:38 +00:00
  • 0ea0452cca Merge branch 'main' of https://github.com/microsoft/mscclpp into mahdieh/mnnvls_support mahdiehghazim 2026-01-29 17:22:23 +00:00
  • 7e6f89d7d0 add compile flag to makefile mahdiehghazim 2026-01-29 16:42:21 +00:00
  • 03599e27d7 WIP Qinghua Zhou 2026-01-29 05:08:31 +00:00
  • 97820a3e48 update Binyang Li 2026-01-29 04:13:06 +00:00
  • 6468495b46 update Binyang Li 2026-01-28 03:27:41 +00:00
  • 48b2d66cbe lint Binyang Li 2026-01-27 23:47:56 +00:00
  • cc2b7ef7e2 update Binyang Li 2026-01-27 20:23:54 +00:00
  • 08589bf332 Use native GPU architecture when NVIDIA GPU is detected; otherwise fall back to multi-arch build. (#732) mahdiehghazim 2026-01-26 15:53:36 -05:00
  • eb54ceaa8d update nccl API Binyang Li 2026-01-26 06:31:57 +00:00
  • 7024b17e27 Merge branch 'main' into chhwang/tutorial-mc Changho Hwang 2026-01-25 22:01:12 -08:00
  • cc797abc87 Revert "Support versioning for mscclpp document (#724)" (#734) Qinghua Zhou 2026-01-24 08:42:54 +08:00
  • 69d3b79ecd Support versioning for mscclpp document (#724) Qinghua Zhou 2026-01-24 01:45:41 +08:00
  • 071dc92d38 fp8 nvls support (e5m2 and e4m3) (#730) mahdiehghazim 2026-01-23 10:38:38 -05:00
  • d3d515d366 lint Binyang Li 2026-01-23 00:35:07 +00:00
  • 34c9fd6fd7 merge main Binyang Li 2026-01-23 00:34:03 +00:00
  • be43563084 merge main Binyang Li 2026-01-23 00:28:17 +00:00
  • ecea73d500 autodetect cuda architecture variant [mahdieh/cuda-arch-variant-autodetect] mahdiehghazim 2026-01-22 22:10:39 +00:00
  • 050edb7866 update Binyang Li 2026-01-22 21:47:38 +00:00
  • 2af7be379e update Binyang Li 2026-01-22 20:35:14 +00:00
  • e71ef6e406 add some logic Binyang Li 2026-01-22 17:20:05 +00:00
  • a707273701 Torch integration (#692) Binyang Li 2026-01-21 20:32:24 -08:00
  • 78ce9fac8d Fix ci pipeline failure (#729) Binyang Li 2026-01-21 10:28:14 -08:00
  • 2d93bbc229 WIP Binyang Li 2026-01-20 23:16:38 +00:00
  • 708d245b92 WIP Binyang Li 2026-01-20 22:15:03 +00:00
  • 653ca04a17 udpate for algo Binyang Li 2026-01-20 18:59:21 +00:00
  • d080a46383 update Binyang Li 2026-01-18 18:11:29 +00:00
  • 4d7ec86d22 Merge branch 'binyli/torch-integration' into binyli/gb200-algo Binyang Li 2026-01-18 17:41:57 +00:00
  • d9f391847a lint Binyang Li 2026-01-18 17:32:25 +00:00
  • 431379af44 lint Binyang Li 2026-01-18 17:21:05 +00:00
  • 9690392f17 lint Binyang Li 2026-01-17 07:53:24 +00:00
  • b1be3b9d76 update Binyang Li 2026-01-17 00:44:08 +00:00
  • 9ab5a96ad9 Merge branch 'binyli/torch-integration' into binyli/gb200-algo Binyang Li 2026-01-17 00:39:35 +00:00
  • 5d66c3804b move context structure to internal header Binyang Li 2026-01-17 00:38:40 +00:00
  • af38fb680b fix build issue Binyang Li 2026-01-16 18:29:06 +00:00
  • af891a43ad merge branch Binyang Li 2026-01-16 18:10:27 +00:00
  • 0e03bcc5f6 Merge branch 'main' into binyli/torch-integration Binyang Li 2026-01-15 23:32:59 -08:00
  • abbdb7f630 Fix ci issue (#727) Binyang Li 2026-01-15 22:21:02 -08:00
  • bdabb12975 fix for nccl-test Binyang Li 2026-01-16 03:22:53 +00:00
  • 85299c79a5 fix ut Binyang Li 2026-01-16 00:41:49 +00:00
  • 0dcdc04670 fix Qinghua Zhou 2026-01-15 23:44:27 +00:00
  • 5d11bc8bff fix for tests Binyang Li 2026-01-15 23:27:08 +00:00
  • d21decdaef fix Binyang Li 2026-01-15 22:57:10 +00:00
  • a0fe68e699 fix build for fp8 Binyang Li 2026-01-15 21:58:53 +00:00
  • 298e3b0ccd lint fix Binyang Li 2026-01-15 18:50:29 +00:00
  • c1db742279 fix ci issue Binyang Li 2026-01-15 18:42:58 +00:00
  • 25b4e66664 Address code review feedback: license headers, naming conventions, and style consistency (#725) Copilot 2026-01-15 09:46:52 -08:00
  • f4b9af493b tackle comments Changho Hwang 2026-01-15 12:49:14 +00:00
  • acbd1541ea documentation Changho Hwang 2026-01-15 09:45:51 +00:00
  • b386a61cec fix Binyang Li 2026-01-15 07:38:20 +00:00
  • 296d85ef23 merge main Binyang Li 2026-01-15 07:31:55 +00:00
  • 8df39ccc2c Support multi-node in MemoryChannel tutorial Changho Hwang 2026-01-15 06:52:02 +00:00
  • 361ae4e0b2 link Binyang Li 2026-01-15 06:45:39 +00:00
  • 105239fc6c Use GpuIpcMem for NVLS connections (#719) Changho Hwang 2026-01-14 21:16:04 -08:00
  • f91103b5d1 fix npkit build Binyang Li 2026-01-15 04:37:29 +00:00
  • c2a87302bd Reduce CI build time (#723) Changho Hwang 2026-01-14 18:45:40 -08:00
  • 4489f9541f Merge branch 'main' into qinghuazhou/allreduce_nvls_size_alignment [qinghuazhou/allreduce_nvls_size_alignment] Qinghua Zhou 2026-01-15 02:34:22 +00:00
  • 5a1c058438 make example work Binyang Li 2026-01-14 19:54:02 +00:00
  • 5b547434e3 WIP Binyang Li 2026-01-14 07:02:26 +00:00
  • 1e73a5aa48 Merge branch 'main' into binyli/refactor Binyang Li 2026-01-14 06:42:30 +00:00
  • ff751b370e update Binyang Li 2026-01-14 06:41:11 +00:00
  • 3413209747 WIP Binyang Li 2026-01-14 06:03:23 +00:00
  • 09b428530d WIP Binyang Li 2026-01-14 05:42:24 +00:00
  • d2e3ab11af WIP Binyang Li 2026-01-14 05:25:19 +00:00
  • a02ba3b1bd Add GpuIpcMemHandle (#704) Changho Hwang 2026-01-13 18:49:31 -08:00
  • 9f2382b40a Merge branch 'main' into qinghuazhou/allreduce_nvls_size_alignment Qinghua Zhou 2026-01-14 02:49:15 +00:00
  • f1561c9637 Add ut test to track input size alignment Qinghua Zhou 2026-01-13 21:59:49 +00:00
  • 4dd075602c Bypassing SSCA alerts (#721) Changho Hwang 2026-01-12 07:46:27 -08:00
  • 347d37e6f4 update Binyang Li 2026-01-12 15:30:18 +00:00
  • 4b4ef4393b make nccl-test run Binyang Li 2026-01-12 15:15:46 +00:00
  • a767d418cf WIP Binyang Li 2026-01-12 14:55:12 +00:00
  • ab3965df34 fix Binyang Li 2026-01-12 11:29:53 +00:00
  • c1eb1141e5 update Binyang Li 2026-01-12 10:01:32 +00:00
  • 086fd26df1 reorgnize the code Binyang Li 2026-01-12 08:32:29 +00:00
  • 6b38f0b6f7 merge main Binyang Li 2026-01-12 05:49:43 +00:00
  • 75e13d8f62 fix Binyang Li 2026-01-12 04:33:53 +00:00
  • f4a96da6bc WIP Binyang Li 2026-01-11 09:02:00 +00:00
  • b8a1b0a134 Add CUDA 13.0 Docker images (#720) Changho Hwang 2026-01-09 03:03:33 -08:00
  • ef6bb8ac08 update Binyang Li 2026-01-09 10:37:20 +00:00
  • 3f0295fd9a WIP Binyang Li 2026-01-09 09:43:46 +00:00
  • 78e11401a6 WIP Binyang Li 2026-01-09 08:43:35 +00:00
  • eb1ff05e47 WIP Binyang Li 2026-01-09 07:28:22 +00:00
  • 084f137da8 WIP Binyang Li 2026-01-08 11:31:10 +00:00
  • 2d2a930194 WIP Binyang Li 2026-01-08 11:18:27 +00:00
  • 51efcfa198 fix Binyang Li 2026-01-08 11:14:29 +00:00
  • a94a4febdc new algo Binyang Li 2026-01-08 08:34:31 +00:00
  • eab2afb8b9 Update container images for pipeline (#717) Binyang Li 2026-01-07 14:10:49 +08:00
  • 5352410d27 Merge branch 'main' into qinghuazhou/allreduce_nvls_size_alignment Qinghua Zhou 2026-01-06 01:02:18 +00:00
  • 168a6c7037 Tune the nThreadsPerBlock for FP8 and Half datatype on MI300 (#694) Qinghua Zhou 2026-01-06 08:59:59 +08:00
  • fc221e234d Remove UB std:: declarations (#709) Changho Hwang 2026-01-05 11:11:46 +08:00
  • 2cf14ff723 Minor fixes (#715) Changho Hwang 2026-01-05 11:09:48 +08:00
  • bb555277ad Rename P2P log subsys into GPU (#716) Changho Hwang 2026-01-05 11:08:43 +08:00
  • ca6a4a3274 Replace __HIP_PLATFORM_AMD__ to use internal macro (#712) Binyang Li 2026-01-04 20:47:58 +08:00
  • a023d468ac for gb200 Binyang Li 2025-12-31 09:41:54 +00:00
  • feeacd9aed try with other algo Binyang Li 2025-12-31 07:59:52 +00:00
  • 7cc4cfeb9f WIP Binyang Li 2025-12-31 03:55:24 +00:00
  • 38168793cc WIP Binyang Li 2025-12-31 03:38:39 +00:00
  • a66d9db547 WIP Binyang Li 2025-12-30 13:39:03 +00:00
  • 413ddbec43 WIP Binyang Li 2025-12-30 13:25:13 +00:00
  • 665352bcb7 update Binyang Li 2025-12-30 06:35:50 +00:00