.. MSCCL++ documentation master file, created by
   sphinx-quickstart on Tue Sep 5 13:03:46 2023.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

Welcome to MSCCL++'s documentation!
===================================

MSCCL++ is a GPU-driven communication stack for scalable AI applications. It is designed to be high-performance, scalable, and customizable for distributed GPU applications.

Getting Started
---------------
- Follow the :doc:`quick start <getting-started/quickstart>` for your platform of choice.
- Take a look at the :doc:`tutorials <getting-started/tutorials/index>` to learn how to write your first MSCCL++ program.

.. toctree::
   :maxdepth: 1
   :caption: Getting Started
   :hidden:

   getting-started/quickstart
   getting-started/tutorials/index

Design
-------
- :doc:`Design <design/design>` doc for those who want to understand the internals of MSCCL++.
- :doc:`NCCL over MSCCL++ <design/nccl-over-mscclpp>` doc for those who want to understand how to use NCCL over MSCCL++.
- :doc:`MSCCL++ DSL <design/mscclpp-dsl>` doc for those who want to understand the MSCCL++ DSL.

.. toctree::
   :maxdepth: 1
   :caption: Design
   :hidden:

   design/design
   design/nccl-over-mscclpp
   design/mscclpp-dsl

Performance
---------------
- We evaluate the performance of MSCCL++ on A100 and H100 GPUs. Here are some :doc:`performance results <performance/performance-ndmv4>` for all-reduce operations.

.. toctree::
   :maxdepth: 1
   :caption: Performance
   :hidden:

   performance/performance-ndmv4

C++ API
---------------
- :doc:`mscclpp <api/index>`

.. toctree::
   :maxdepth: 1
   :caption: C++ API
   :hidden:

   api/index

Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`