Adding dispatcher architecture (#3300)

* WIP POC of dispatcher

* Dispatcher python workflow setup.

* Dispatcher cleanup and updates.

Further dispatcher cleanup and updates.

Build fixes

Improvements and python to CK example

Improvements to readme

* Fixes to python paths

* Cleaning up code

* Improving dispatcher support for different arch

Fixing typos

* Fix formatting errors

* Cleaning up examples

* Improving code generation

* Improving and fixing C++ examples

* Adding conv functionality (fwd, bwd, bwdw) and examples.

* Fixes based on feedback.

* Further fixes based on feedback.

* Adding stress test for autogeneration and autocorrection, and fixing preshuffle bug.

* Another round of improvements based on feedback.

* Trimming out unnecessary code.

* Fixing the multi-D implementation.

* Using gpu verification for gemms and fixing convolutions tflops calculation.

* Fix counter usage issue and arch filtering per ops.

* Adding changelog and other fixes.

* Improve examples and resolve critical bugs.

* Reduce build time for python examples.

* Fixing minor bug.

* Fix compilation error.

* Improve installation instructions for dispatcher.

* Add Docker-based installation instructions for dispatcher.

* Fixing arch-based filtering to match tile engine.

* Remove dead code and fix arch filtering.

* Minor bugfix.

* Updates after rebase.

* Trimming code.

* Fix copyright headers.

* Consolidate examples, cut down code.

* Minor fixes.

* Improving python examples.

* Update readmes.

* Remove conv functionality.

* Cleanup following conv removal.
Vidyasagar Ananthan
2026-01-22 09:34:33 -08:00
committed by GitHub
parent 44f481a45c
commit 9e049a32a1
97 changed files with 33472 additions and 0 deletions


@@ -0,0 +1,9 @@
# Copyright (c) Advanced Micro Devices, Inc., or its affiliates.
# SPDX-License-Identifier: MIT
# This directory contains Python utilities for the dispatcher examples.
# The main utility file is ctypes_utils.py which is used by GEMM Python examples.
# Conv Python examples use their own conv_utils.py in the examples directory.
# No build targets needed - these are pure Python utilities.
message(STATUS "Python utilities directory configured (no build targets)")


@@ -0,0 +1,60 @@
# CK Tile Dispatcher Python Utilities
This directory contains Python utilities used by the dispatcher examples.
## Contents
- `ctypes_utils.py` - Core ctypes utilities for GEMM Python examples
  - `KernelConfig` - Kernel configuration dataclass
  - `setup_gemm_dispatcher()` - Set up the dispatcher with auto-correction
  - `cleanup_gemm()` - Clean up dispatcher resources
  - `GemmRunner` - GPU execution helper
  - Auto-correction and validation utilities
- `conv_utils.py` - Core utilities for Conv Python examples
  - `ConvSignature`, `ConvAlgorithm` - Convolution configuration
  - `ConvProblem` - Problem definition
  - `GpuConvRunner` - GPU execution helper
  - `EnhancedConvCodegenRunner` - Kernel codegen utilities
## Usage
### GEMM Examples
The GEMM Python examples in `dispatcher/examples/gemm/python/` import:
```python
import sys
from pathlib import Path
sys.path.insert(0, str(Path(__file__).parent.parent.parent.parent / "python"))
from ctypes_utils import (
    KernelConfig,
    setup_gemm_dispatcher,
    cleanup_gemm,
    GemmRunner,
)
```
### Conv Examples
The Conv Python examples in `dispatcher/examples/conv/python/` import:
```python
import sys
from pathlib import Path
sys.path.insert(0, str(Path(__file__).parent.parent.parent.parent / "python"))
from conv_utils import (
    ConvSignature,
    ConvAlgorithm,
    ConvProblem,
    GpuConvRunner,
)
```
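Both import snippets rely on the same `sys.path` trick: walking four levels up from the example script and appending `python`. The sketch below shows how that expression resolves. The script name `01_basic_gemm.py` is hypothetical and used only for illustration, and `PurePosixPath` stands in for `Path(__file__)` so the result does not depend on the local filesystem.

```python
from pathlib import PurePosixPath


def utilities_dir(example_file: str) -> PurePosixPath:
    """Mirror Path(__file__).parent.parent.parent.parent / "python"."""
    script = PurePosixPath(example_file)
    # parent x1: .../examples/gemm/python
    # parent x2: .../examples/gemm
    # parent x3: .../examples
    # parent x4: the dispatcher root
    return script.parent.parent.parent.parent / "python"


# Hypothetical example script; only the directory depth matters here.
example = "dispatcher/examples/gemm/python/01_basic_gemm.py"
print(utilities_dir(example))  # dispatcher/python
```

Because the expression counts directory levels, an example script moved to a different depth would need the number of `.parent` hops adjusted.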
## Requirements
- Python 3.8+
- NumPy
- HIP runtime (for GPU execution)

File diff suppressed because it is too large


@@ -0,0 +1,43 @@
[pytest]
# Pytest configuration for CK Tile Dispatcher Python tests
# Test discovery
python_files = test_*.py
python_classes = Test*
python_functions = test_*
# Test paths
testpaths = tests
# Options
addopts =
    -v
    --strict-markers
    --tb=short
    --color=yes
    --durations=10
# Markers
markers =
    slow: marks tests as slow (deselect with '-m "not slow"')
    cuda: marks tests requiring CUDA/ROCm
    torch: marks tests requiring PyTorch
    integration: marks integration tests
    unit: marks unit tests
# Coverage
[coverage:run]
source = .
omit =
    */tests/*
    */examples/*
    setup.py
[coverage:report]
precision = 2
show_missing = True
skip_covered = False
[coverage:html]
directory = htmlcov
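
The discovery settings above are glob-style patterns. As a rough illustration of which names they select, the sketch below applies the same patterns with the standard library's `fnmatch`. It approximates pytest's matching and is not pytest's own implementation; the file and function names are hypothetical.

```python
from fnmatch import fnmatchcase

# Patterns copied from the configuration above.
PYTHON_FILES = "test_*.py"
PYTHON_CLASSES = "Test*"
PYTHON_FUNCTIONS = "test_*"


def is_test_file(filename: str) -> bool:
    """True if the file name matches the python_files pattern."""
    return fnmatchcase(filename, PYTHON_FILES)


def is_test_class(name: str) -> bool:
    """True if the class name matches the python_classes pattern."""
    return fnmatchcase(name, PYTHON_CLASSES)


def is_test_function(name: str) -> bool:
    """True if the function name matches the python_functions pattern."""
    return fnmatchcase(name, PYTHON_FUNCTIONS)


# Hypothetical names, for illustration only.
for candidate in ("test_gemm_config.py", "ctypes_utils.py", "conftest.py"):
    print(candidate, is_test_file(candidate))
```

With `--strict-markers` in `addopts`, any marker applied to a collected test must also appear in the `markers` list, otherwise pytest reports an error.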


@@ -0,0 +1,22 @@
# Core dependencies
numpy>=1.19.0
# Optional dependencies (install with pip install -e ".[torch]")
# torch>=2.0.0
# Development dependencies (install with pip install -e ".[dev]")
# pytest>=6.0.0
# pytest-cov>=2.0.0
# black>=21.0
# flake8>=3.9.0
# mypy>=0.910
# isort>=5.0.0
# Visualization dependencies (install with pip install -e ".[viz]")
# matplotlib>=3.3.0
# seaborn>=0.11.0
# Documentation dependencies
# sphinx>=4.0.0
# sphinx-rtd-theme>=1.0.0