mirror of
https://github.com/ROCm/composable_kernel.git
synced 2026-05-27 08:25:46 +00:00
[ROCm/composable_kernel commit: 9e049a32a1]
# CK Tile Dispatcher - Language Bindings

This directory contains language bindings for the CK Tile Dispatcher.

## Structure

```
bindings/
├── ctypes/                       # Python ctypes bindings (C API)
│   ├── gemm_ctypes_lib.cpp       # GEMM dispatcher C API
│   ├── conv_ctypes_lib.cpp       # Convolution dispatcher C API (fwd + bwd_data)
│   ├── conv_bwdw_ctypes_lib.cpp  # Convolution backward weight C API
│   ├── gpu_helper.cpp            # CLI helper for Python
│   └── CMakeLists.txt
└── README.md
```

## ctypes Bindings

The ctypes bindings provide a C API that Python can load via `ctypes.CDLL()`.

### Building

```bash
cd build
cmake .. -DCMAKE_PREFIX_PATH=/opt/rocm
make dispatcher_gemm_lib dispatcher_conv_lib gpu_helper
```

### Usage from Python

```python
import ctypes

# Load the library
lib = ctypes.CDLL("path/to/libdispatcher_gemm_lib.so")

# Initialize
lib.dispatcher_init()

# Problem size (example values)
M, N, K = 1024, 1024, 1024

# Check if problem is supported
is_supported = lib.dispatcher_is_supported(M, N, K)

# Run GEMM
# A_ptr, B_ptr, C_ptr are pointers to buffers allocated by the caller
time_ms = ctypes.c_float()  # receives the measured execution time
result = lib.dispatcher_run_gemm(
    A_ptr, B_ptr, C_ptr,
    M, N, K,
    ctypes.byref(time_ms)
)

# Cleanup
lib.dispatcher_cleanup()
```
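By default ctypes treats every foreign function as returning `int` and guesses how to convert each argument, so it is usually safer to declare the signatures before the first call. The sketch below shows one way to do that. The README does not give the C types, so every type here (64-bit sizes, `void*` buffers, `int` status returns) is an assumption that should be checked against `gemm_ctypes_lib.cpp`; `declare_gemm_signatures` is a hypothetical helper name, not part of the bindings.

```python
import ctypes


def declare_gemm_signatures(lib):
    """Attach assumed argument and return types to the GEMM entry points.

    All C types below are illustrative guesses, not taken from the source.
    """
    lib.dispatcher_init.argtypes = []
    lib.dispatcher_init.restype = ctypes.c_int

    # Assumed: M, N, K are passed as 64-bit integers.
    lib.dispatcher_is_supported.argtypes = [ctypes.c_int64] * 3
    lib.dispatcher_is_supported.restype = ctypes.c_int

    lib.dispatcher_run_gemm.argtypes = [
        ctypes.c_void_p, ctypes.c_void_p, ctypes.c_void_p,  # A, B, C buffers
        ctypes.c_int64, ctypes.c_int64, ctypes.c_int64,     # M, N, K
        ctypes.POINTER(ctypes.c_float),                     # time_ms (output)
    ]
    lib.dispatcher_run_gemm.restype = ctypes.c_int

    lib.dispatcher_cleanup.argtypes = []
    lib.dispatcher_cleanup.restype = None
    return lib
```

With the signatures declared, a wrong argument count or an unconvertible argument raises `ctypes.ArgumentError` in Python instead of silently passing bad data to the library. Typical use would be `lib = declare_gemm_signatures(ctypes.CDLL("path/to/libdispatcher_gemm_lib.so"))`.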

### GEMM API

| Function | Description |
|----------|-------------|
| `dispatcher_init()` | Initialize the dispatcher |
| `dispatcher_is_supported(M, N, K)` | Check if problem size is supported |
| `dispatcher_select_kernel(M, N, K, name_buf, buf_size)` | Get kernel name for problem |
| `dispatcher_run_gemm(A, B, C, M, N, K, time_ms)` | Execute GEMM |
| `dispatcher_get_kernel_count()` | Get number of registered kernels |
| `dispatcher_export_registry_json()` | Export registry as JSON |
| `dispatcher_cleanup()` | Release resources |
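`dispatcher_select_kernel` takes a caller-provided buffer and its size, which maps naturally onto `ctypes.create_string_buffer`. The sketch below wraps that pattern. It assumes the function writes a NUL-terminated name into `name_buf` and returns `0` on success; neither convention is stated in this README, and `selected_kernel_name` is a hypothetical helper. Declare `argtypes` to match the real C signature before calling it with a loaded library.

```python
import ctypes


def selected_kernel_name(lib, m, n, k, buf_size=256):
    """Return the kernel name chosen for an M x N x K problem, or None.

    Assumes a 0 return code means success and that the name written to
    the buffer is NUL-terminated (both are assumptions).
    """
    # Mutable, zero-initialized char buffer that the C side can fill in.
    name_buf = ctypes.create_string_buffer(buf_size)
    status = lib.dispatcher_select_kernel(m, n, k, name_buf, buf_size)
    if status != 0:
        return None
    # .value stops at the first NUL byte.
    return name_buf.value.decode("utf-8")
```

Passing `buf_size` alongside the buffer lets the C side bound its write, so the buffer length chosen in Python only needs to be large enough for the longest kernel name.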

### Convolution API

| Function | Description |
|----------|-------------|
| `conv_dispatcher_init()` | Initialize the dispatcher |
| `conv_dispatcher_is_supported(prob)` | Check if problem is supported |
| `conv_dispatcher_select_kernel(prob, name_buf, buf_size)` | Get kernel name |
| `conv_dispatcher_run(input, weight, output, prob, stream)` | Execute convolution |
| `conv_dispatcher_get_kernel_count()` | Get number of registered kernels |
| `conv_dispatcher_cleanup()` | Release resources |

## GPU Helper

The `gpu_helper` executable provides a command-line interface that Python scripts can call:

```bash
./gpu_helper 1024 1024 1024 --validate
```

Output is JSON for easy parsing:
```json
{
  "problem": {"M": 1024, "N": 1024, "K": 1024},
  "kernel": "gemm_fp16_rcr_...",
  "execution": {
    "time_ms": 0.5,
    "tflops": 4.2
  },
  "validation": {
    "accuracy": 100.0
  },
  "status": "success"
}
```
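As a rough illustration of consuming this output from Python, the sketch below parses a report with the standard `json` module and extracts the fields shown in the sample above. `summarize_gpu_helper_output` is a hypothetical helper, and the sample string simply repeats the example report; the real tool may emit additional or different fields.

```python
import json


def summarize_gpu_helper_output(text):
    """Parse a gpu_helper JSON report and pull out the commonly used fields."""
    report = json.loads(text)
    execution = report.get("execution", {})
    validation = report.get("validation", {})
    return {
        "ok": report.get("status") == "success",
        "kernel": report.get("kernel"),
        "time_ms": execution.get("time_ms"),
        "tflops": execution.get("tflops"),
        "accuracy": validation.get("accuracy"),
    }


# Sample report copied from the example output above.
sample = """
{
  "problem": {"M": 1024, "N": 1024, "K": 1024},
  "kernel": "gemm_fp16_rcr_...",
  "execution": {"time_ms": 0.5, "tflops": 4.2},
  "validation": {"accuracy": 100.0},
  "status": "success"
}
"""
summary = summarize_gpu_helper_output(sample)
```

In a real script the text would typically come from something like `subprocess.run(["./gpu_helper", "1024", "1024", "1024", "--validate"], capture_output=True, text=True, check=True).stdout`, assuming the helper writes the report to standard output.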

## Examples

See the examples that use these bindings:

- **GEMM**: `dispatcher/examples/gemm/python/`
- **Conv**: `dispatcher/examples/conv/python/`