Update citations (#524)

Co-authored-by: Aashaka Shah <aashaka96@gmail.com>
2026-04-20 06:49:29 +00:00 · 2025-05-13 17:52:04 -07:00
parent 2c63059014
commit 908659318b
2 changed files with 60 additions and 28 deletions
--- a/CITATION.cff
+++ b/CITATION.cff
@@ -1,53 +1,56 @@
 cff-version: 1.2.0
-title: "MSCCL++: A GPU-driven communication stack for scalable AI applications"
-version: 0.6.0
+title: >-
+  MSCCL++: Rethinking GPU Communication Abstractions for
+  Cutting-edge AI Applications
 message: >-
-  If you use this project in your research, please cite it as below.
+  If you use this software, please cite it using the
+  metadata from this file.
+type: software
 authors:
-  - given-names: Peng
-    family-names: Cheng
-    affiliation: Microsoft Research
-  - given-names: Roshan
-    family-names: Dathathri
-    affiliation: Microsoft Research
-  - given-names: Changho
-    family-names: Hwang
+  - given-names: Aashaka
+    family-names: Shah
    affiliation: Microsoft Research
  - given-names: Abhinav
    family-names: Jangda
    affiliation: Microsoft Research
-  - given-names: Suriya
-    family-names: Kalivardhan
-    affiliation: Microsoft Azure
  - given-names: Binyang
    family-names: Li
    affiliation: Microsoft Azure
-  - given-names: Shuguang
-    family-names: Liu
-    affiliation: Microsoft Azure
-  - given-names: Saeed
-    family-names: Maleki
-    affiliation: Microsoft Research
-  - given-names: Madan
-    family-names: Musuvathi
-    affiliation: Microsoft Research
  - given-names: Caio
    family-names: Rocha
    affiliation: Microsoft Azure
+  - given-names: Changho
+    family-names: Hwang
+    affiliation: Microsoft Research
+  - given-names: Jithin
+    family-names: Jose
+    affiliation: Microsoft Azure
+  - given-names: Madan
+    family-names: Musuvathi
+    affiliation: Microsoft Research
  - given-names: Olli
    family-names: Saarikivi
    affiliation: Microsoft Research
-  - given-names: Aashaka
-    family-names: Shah
+  - given-names: Peng
+    family-names: Cheng
    affiliation: Microsoft Research
-  - given-names: Wei
-    family-names: Tsui
+  - given-names: Qinghua
+    family-names: Zhou
+    affiliation: Microsoft Azure
+  - given-names: Roshan
+    family-names: Dathathri
+    affiliation: Microsoft Research
+  - given-names: Saeed
+    family-names: Maleki
    affiliation: Microsoft Research
  - given-names: Ziyue
    family-names: Yang
    affiliation: Microsoft Research
-
+identifiers:
+  - type: other
+    value: 'arxiv:2504.09014'
 repository-code: 'https://github.com/microsoft/mscclpp'
+url: 'https://microsoft.github.io/mscclpp/index.html'
 abstract: >-
  MSCCL++ redefines the interface for inter-GPU communication, thereby
  delivering a highly efficient and customizable communication stack
--- a/README.md
+++ b/README.md
@@ -15,6 +15,8 @@ A GPU-driven communication stack for scalable AI applications.

 See [Quick Start](https://microsoft.github.io/mscclpp/getting-started/quickstart.html) to quickly get started.

+See our [paper](https://arxiv.org/abs/2504.09014) to learn more about MSCCL++.
+
 ## Overview

 MSCCL++ redefines inter-GPU communication interfaces, thereby delivering a highly efficient and customizable communication stack for distributed GPU applications. Its design is specifically tailored to accommodate diverse performance optimization scenarios often encountered in state-of-the-art AI applications. The figure below provides a high-level overview of MSCCL++ abstractions in CUDA, C, and Python.
@@ -154,6 +156,18 @@ Customized proxies can be used for conducting a series of pre-defined data trans

 MSCCL++ provides Python bindings and interfaces, which simplify integration with Python applications.

+## Projects using MSCCL++
+MSCCL++ handles communication in a number of projects, including:
+
+- ARK: A GPU-driven system framework for scalable AI applications [[Paper link](https://www.usenix.org/conference/nsdi23/presentation/hwang)], Accepted at NSDI 2023 [![GitHub stars](https://img.shields.io/github/stars/microsoft/ark?style=social)](https://github.com/microsoft/ark)
+- FlashInfer: A Kernel Library for LLM Serving [![GitHub stars](https://img.shields.io/github/stars/flashinfer-ai/flashinfer?style=social)](https://github.com/flashinfer-ai/flashinfer)
+- ForestColl: Throughput-Optimal Collective Communications on Heterogeneous Network Fabrics [[Paper link](https://arxiv.org/abs/2402.06787)]
+- LMDeploy: A toolkit for compressing, deploying, and serving LLMs [![GitHub stars](https://img.shields.io/github/stars/InternLM/lmdeploy?style=social)](https://github.com/InternLM/lmdeploy)
+- Nanoflow: A throughput-oriented high-performance serving framework for LLMs [[Paper link](https://arxiv.org/abs/2408.12757)] [![GitHub stars](https://img.shields.io/github/stars/efeslab/Nanoflow?style=social)](https://github.com/efeslab/Nanoflow)
+- ROCm Communication Collectives Library (RCCL) [![GitHub stars](https://img.shields.io/github/stars/ROCm/rccl?style=social)](https://github.com/ROCm/rccl)
+- Splitwise: Efficient generative LLM inference using phase splitting [[Paper link](https://arxiv.org/abs/2311.18677)], Accepted at ISCA 2024, Best Paper Nominee
+- TVM: Open deep learning compiler stack for CPUs, GPUs, and specialized accelerators [![GitHub stars](https://img.shields.io/github/stars/apache/tvm?style=social)](https://github.com/apache/tvm)
+
 ## Contributing

 This project welcomes contributions and suggestions.  Most contributions require you to agree to a
@@ -175,3 +189,18 @@ trademarks or logos is subject to and must follow
 [Microsoft's Trademark & Brand Guidelines](https://www.microsoft.com/en-us/legal/intellectualproperty/trademarks/usage/general).
 Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship.
 Any use of third-party trademarks or logos is subject to those third parties' policies.
+
+## Citation
+
+If you use this project for your work, please cite our [paper](https://arxiv.org/abs/2504.09014):
+```bibtex
+@misc{ShahJLRHJMSCZDMY2025,
+      title={MSCCL++: Rethinking GPU Communication Abstractions for Cutting-edge AI Applications},
+      author={Aashaka Shah and Abhinav Jangda and Binyang Li and Caio Rocha and Changho Hwang and Jithin Jose and Madan Musuvathi and Olli Saarikivi and Peng Cheng and Qinghua Zhou and Roshan Dathathri and Saeed Maleki and Ziyue Yang},
+      year={2025},
+      eprint={2504.09014},
+      archivePrefix={arXiv},
+      primaryClass={cs.DC},
+      url={https://arxiv.org/abs/2504.09014},
+}
+```