mirror of
https://github.com/ROCm/composable_kernel.git
synced 2026-03-25 01:27:40 +00:00
CK Tile Dispatcher - Language Bindings
This directory contains language bindings for the CK Tile Dispatcher.
Structure
bindings/
├── ctypes/ # Python ctypes bindings (C API)
│ ├── gemm_ctypes_lib.cpp # GEMM dispatcher C API
│ ├── conv_ctypes_lib.cpp # Convolution dispatcher C API (fwd + bwd_data)
│ ├── conv_bwdw_ctypes_lib.cpp # Convolution backward weight C API
│ ├── gpu_helper.cpp # CLI helper for Python
│ └── CMakeLists.txt
└── README.md
ctypes Bindings
The ctypes bindings provide a C API that Python can load via ctypes.CDLL().
Building
cd build
cmake .. -DCMAKE_PREFIX_PATH=/opt/rocm
make dispatcher_gemm_lib dispatcher_conv_lib gpu_helper
Usage from Python
import ctypes
# Load the library
lib = ctypes.CDLL("path/to/libdispatcher_gemm_lib.so")
# Initialize
lib.dispatcher_init()
# Check if problem is supported
is_supported = lib.dispatcher_is_supported(M, N, K)
# Run GEMM
time_ms = ctypes.c_float()
result = lib.dispatcher_run_gemm(
A_ptr, B_ptr, C_ptr,
M, N, K,
ctypes.byref(time_ms)
)
# Cleanup
lib.dispatcher_cleanup()
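The calls above rely on ctypes defaults: without declared signatures, Python integers are passed as C `int` and return values are read as `int`, which can truncate 64-bit pointers. The following is a minimal sketch of declaring signatures before use. Every type in it is an assumption inferred from the usage above, not taken from gemm_ctypes_lib.cpp, and `declare_gemm_signatures` and `DIM_T` are hypothetical names introduced for illustration.

```python
import ctypes

# ASSUMPTION: the dimension type is a 64-bit integer. Check gemm_ctypes_lib.cpp
# and change this alias if the C API uses a different type.
DIM_T = ctypes.c_int64


def declare_gemm_signatures(lib):
    """Attach assumed argtypes/restype to the GEMM entry points of `lib`.

    All signatures here are guesses based on the README usage; verify them
    against the C source before relying on them.
    """
    lib.dispatcher_init.argtypes = []
    lib.dispatcher_init.restype = ctypes.c_int

    lib.dispatcher_is_supported.argtypes = [DIM_T, DIM_T, DIM_T]
    lib.dispatcher_is_supported.restype = ctypes.c_int

    lib.dispatcher_run_gemm.argtypes = [
        ctypes.c_void_p, ctypes.c_void_p, ctypes.c_void_p,  # A, B, C pointers
        DIM_T, DIM_T, DIM_T,                                # M, N, K
        ctypes.POINTER(ctypes.c_float),                     # time_ms (output)
    ]
    lib.dispatcher_run_gemm.restype = ctypes.c_int
    return lib
```

With signatures declared, ctypes checks argument counts and converts pointer-sized values correctly, so a mismatched call raises `ctypes.ArgumentError` instead of silently corrupting arguments.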
GEMM API
| Function | Description |
|---|---|
| dispatcher_init() | Initialize the dispatcher |
| dispatcher_is_supported(M, N, K) | Check if problem size is supported |
| dispatcher_select_kernel(M, N, K, name_buf, buf_size) | Get kernel name for problem |
| dispatcher_run_gemm(A, B, C, M, N, K, time_ms) | Execute GEMM |
| dispatcher_get_kernel_count() | Get number of registered kernels |
| dispatcher_export_registry_json() | Export registry as JSON |
| dispatcher_cleanup() | Release resources |
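Functions that take `name_buf` and `buf_size` follow the usual C pattern of writing into a caller-owned buffer. A minimal sketch of calling `dispatcher_select_kernel` from Python is shown below. It assumes the library fills the buffer with a NUL-terminated string and returns 0 on success; neither detail is confirmed by this README, and `select_kernel_name` is a hypothetical helper name.

```python
import ctypes


def select_kernel_name(lib, m, n, k, buf_size=256):
    """Return the kernel name chosen for an (m, n, k) problem, or None.

    ASSUMPTIONS: name_buf is a caller-owned char buffer that the library
    fills with a NUL-terminated string, and a return value of 0 means success.
    """
    name_buf = ctypes.create_string_buffer(buf_size)  # zero-initialized
    status = lib.dispatcher_select_kernel(m, n, k, name_buf, buf_size)
    if status != 0:
        return None
    # .value stops at the first NUL byte, so trailing padding is dropped.
    return name_buf.value.decode("utf-8")
```

Typical use would be `select_kernel_name(lib, 1024, 1024, 1024)` after `lib.dispatcher_init()` has been called.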
Convolution API
| Function | Description |
|---|---|
| conv_dispatcher_init() | Initialize the dispatcher |
| conv_dispatcher_is_supported(prob) | Check if problem is supported |
| conv_dispatcher_select_kernel(prob, name_buf, buf_size) | Get kernel name |
| conv_dispatcher_run(input, weight, output, prob, stream) | Execute convolution |
| conv_dispatcher_get_kernel_count() | Get number of registered kernels |
| conv_dispatcher_cleanup() | Release resources |
GPU Helper
The gpu_helper executable provides a command-line interface that Python scripts can invoke:
./gpu_helper 1024 1024 1024 --validate
Output is JSON for easy parsing:
{
"problem": {"M": 1024, "N": 1024, "K": 1024},
"kernel": "gemm_fp16_rcr_...",
"execution": {
"time_ms": 0.5,
"tflops": 4.2
},
"validation": {
"accuracy": 100.0
},
"status": "success"
}
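A minimal sketch of consuming this output from Python with the standard `subprocess` and `json` modules follows. The function names are hypothetical, the executable path is a placeholder, and treating the "validation" block as optional (present only with --validate) is an assumption rather than documented behavior.

```python
import json
import subprocess


def parse_gpu_helper_output(text):
    """Parse gpu_helper's JSON report into a flat summary dict."""
    report = json.loads(text)
    if report.get("status") != "success":
        raise RuntimeError(f"gpu_helper reported status {report.get('status')!r}")
    return {
        "kernel": report["kernel"],
        "time_ms": report["execution"]["time_ms"],
        "tflops": report["execution"]["tflops"],
        # ASSUMPTION: "validation" may be absent when --validate is not passed.
        "accuracy": report.get("validation", {}).get("accuracy"),
    }


def run_gpu_helper(m, n, k, exe="./gpu_helper", validate=True):
    """Run gpu_helper for an m x n x k problem and return the parsed summary."""
    cmd = [exe, str(m), str(n), str(k)]
    if validate:
        cmd.append("--validate")
    # check=True raises CalledProcessError on a non-zero exit code.
    done = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_gpu_helper_output(done.stdout)
```

Keeping the parsing separate from the process call makes it possible to test the JSON handling without a GPU or a built executable.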
Examples
See the examples that use these bindings:
- GEMM: dispatcher/examples/gemm/python/
- Conv: dispatcher/examples/conv/python/