mirror of https://github.com/NVIDIA/cutlass.git
synced 2026-04-19 22:38:56 +00:00
CUTLASS 3.3.0 (#1167)
* Release 3.3.0
  - Adds support for mixed-precision GEMMs on Hopper and Ampere
  - Adds support for < 16B aligned GEMMs on Hopper
  - Enhancements to EVT
  - Enhancements to the Python interface
  - Enhancements to sub-byte type handling in CuTe
  - Several other bug fixes and performance improvements
* minor doc update
@@ -67,14 +67,13 @@ The CUTLASS Python interface currently supports the following operations:
 * Grouped GEMM (for pre-SM90 kernels)
 
 ### Getting started
 
-We recommend using the CUTLASS Python interface via one of the Docker images located in the [docker](/python/docker) directory.
+We recommend using the CUTLASS Python interface via an [NGC PyTorch Docker container](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch):
 
 ```bash
-docker build -t cutlass-cuda12.1:latest -f docker/Dockerfile-cuda12.1-pytorch .
-docker run --gpus all -it --rm cutlass-cuda12.1:latest
+docker run --gpus all -it --rm nvcr.io/nvidia/pytorch:23.08-py3
 ```
 
-The CUTLASS Python interface has been tested with CUDA 11.8, 12.0, and 12.1 on Python 3.8.10 and 3.9.7.
+The CUTLASS Python interface has been tested with CUDA 11.8, 12.0, and 12.1 on Python 3.8 and 3.9.
 
 #### Optional environment variables
 Prior to installing the CUTLASS Python interface, one may optionally set the following environment variables:
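The tested-version line changed above (CUDA 11.8/12.0/12.1 on Python 3.8 and 3.9) can be turned into a small preflight check before importing the interface. A minimal sketch; the helper name `is_tested_python` is made up here, and versions outside the tested set may still work:

```python
import sys

# Python minor versions the README reports as tested (others may still work).
TESTED_PYTHONS = {(3, 8), (3, 9)}

def is_tested_python(version_info=None):
    """Return True if the given (or running) interpreter version was tested."""
    vi = sys.version_info if version_info is None else version_info
    return (vi[0], vi[1]) in TESTED_PYTHONS
```

One would typically only warn, not fail, when this returns False.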
@@ -82,19 +81,21 @@ Prior to installing the CUTLASS Python interface, one may optionally set the following environment variables:
 * `CUDA_INSTALL_PATH`: the path to the installation of CUDA
 
 If these environment variables are not set, the installation process will infer them to be the following:
-* `CUTLASS_PATH`: one directory level above the current directory (i.e., `$(pwd)/..`)
+* `CUTLASS_PATH`: either one directory level above the current directory (i.e., `$(pwd)/..`) if installed locally, or the `source` directory of the location in which `cutlass_library` was installed
 * `CUDA_INSTALL_PATH`: the directory holding `/bin/nvcc` for the first version of `nvcc` on `$PATH` (i.e., `which nvcc | awk -F'/bin/nvcc' '{print $1}'`)
 
 **NOTE:** The version of `cuda-python` installed must match the CUDA version in `CUDA_INSTALL_PATH`.
 
 #### Installation
-The CUTLASS Python interface can currently be installed via:
+The CUTLASS Python interface can currently be installed by navigating to the root of the CUTLASS directory and performing
 ```bash
-python setup.py develop --user
+pip install .
 ```
-This will allow changes to the Python interface source to be reflected when using the Python interface.
 
-We plan to add support for installing via `python setup.py install` in a future release.
+If you would like to be able to make changes to the CUTLASS Python interface and have them reflected when using the interface, perform:
+```bash
+pip install -e .
+```
 
 ### Examples
 Jupyter notebook examples of using the CUTLASS Python interface are located in [examples/python](/examples/python).
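The environment-variable defaults and the `cuda-python` version note in the hunk above can be sketched in Python. A minimal sketch under the documented rules; the helper names (`strip_bin_nvcc`, `infer_cutlass_path`, `cuda_python_matches_path`) are hypothetical and not part of CUTLASS:

```python
import os
import shutil

def strip_bin_nvcc(nvcc_path):
    # Equivalent of `which nvcc | awk -F'/bin/nvcc' '{print $1}'`:
    # '/usr/local/cuda-12.1/bin/nvcc' -> '/usr/local/cuda-12.1'
    return nvcc_path.split("/bin/nvcc")[0]

def infer_cuda_install_path():
    # Documented fallback: first `nvcc` on $PATH, minus its '/bin/nvcc' suffix.
    nvcc = shutil.which("nvcc")
    return None if nvcc is None else strip_bin_nvcc(nvcc)

def infer_cutlass_path():
    # Documented default for a local checkout: one level above the current directory.
    return os.path.abspath(os.path.join(os.getcwd(), ".."))

def cuda_python_matches_path(cuda_python_version, cuda_install_path):
    # Rough check for the NOTE above: compare cuda-python's major.minor
    # (e.g. '12.1.0') against a versioned path like '/usr/local/cuda-12.1'.
    major_minor = ".".join(cuda_python_version.split(".")[:2])
    return cuda_install_path.rstrip("/").endswith(major_minor)
```

For example, `cuda_python_matches_path("12.1.0", "/usr/local/cuda-12.1")` holds, while an 11.8 `cuda-python` against that same path does not; an unversioned path such as `/usr/local/cuda` would need an extra lookup this sketch does not attempt.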
@@ -135,10 +136,7 @@ python setup_library.py develop --user
 
 Alternatively, `cutlass_library` will automatically be installed if you install the CUTLASS Python interface package.
 
-You can also use the [generator.py](/python/cutlass_library/generator.py) script directly without installing the module via:
-```bash
-python -m cutlass_library.generator
-```
+You can also use the [generator.py](/python/cutlass_library/generator.py) script directly without installing the module.
 
 # Copyright