CK Tile Dispatcher Python Utilities
This directory contains Python utilities used by the dispatcher examples.
Contents
- ctypes_utils.py - Core ctypes utilities for GEMM Python examples
  - KernelConfig - Kernel configuration dataclass
  - setup_gemm_dispatcher() - Setup dispatcher with auto-correction
  - cleanup_gemm() - Cleanup dispatcher resources
  - GemmRunner - GPU execution helper
  - Auto-correction and validation utilities
- conv_utils.py - Core utilities for Conv Python examples
  - ConvSignature, ConvAlgorithm - Convolution configuration
  - ConvProblem - Problem definition
  - GpuConvRunner - GPU execution helper
  - EnhancedConvCodegenRunner - Kernel codegen utilities
Usage
GEMM Examples
The GEMM Python examples in dispatcher/examples/gemm/python/ import:
import sys
from pathlib import Path
sys.path.insert(0, str(Path(__file__).parent.parent.parent.parent / "python"))
from ctypes_utils import (
    KernelConfig,
    setup_gemm_dispatcher,
    cleanup_gemm,
    GemmRunner,
)
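To see which directory the `sys.path.insert` line above resolves to, the path arithmetic can be traced on its own. This is a minimal sketch: the `/repo` prefix and the script name `example_gemm.py` are hypothetical and serve only to illustrate the four `.parent` hops.

```python
from pathlib import PurePosixPath

# Hypothetical script location, used only to illustrate the path arithmetic.
script = PurePosixPath("/repo/dispatcher/examples/gemm/python/example_gemm.py")

# Four .parent hops: python/ -> gemm/ -> examples/ -> dispatcher/
utils_dir = script.parent.parent.parent.parent / "python"

# parents[3] names the same directory as four chained .parent accesses.
assert utils_dir == script.parents[3] / "python"

print(utils_dir)  # /repo/dispatcher/python
```

Under that layout the examples end up importing from dispatcher/python/, which is this directory.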
Conv Examples
The Conv Python examples in dispatcher/examples/conv/python/ import:
import sys
from pathlib import Path
sys.path.insert(0, str(Path(__file__).parent.parent.parent.parent / "python"))
from conv_utils import (
    ConvSignature,
    ConvAlgorithm,
    ConvProblem,
    GpuConvRunner,
)
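If an example is moved or run from an unexpected location, the plain import fails with a bare ModuleNotFoundError. One possible way to make that failure easier to diagnose is a small wrapper like the one below. The helper name `import_from_dir` is hypothetical and not part of these utilities; it is a sketch that relies only on the standard library.

```python
import importlib
import sys
from pathlib import Path


def import_from_dir(module_name: str, directory: Path):
    """Import module_name after putting directory at the front of sys.path.

    Hypothetical helper for illustration; not part of ctypes_utils or conv_utils.
    """
    directory_str = str(directory)
    if directory_str not in sys.path:  # avoid stacking duplicate entries
        sys.path.insert(0, directory_str)
    try:
        return importlib.import_module(module_name)
    except ModuleNotFoundError as exc:
        # Re-raise untouched if a dependency of the module is what is missing.
        if exc.name != module_name:
            raise
        raise ImportError(
            f"Could not import {module_name!r}; expected it under {directory_str}"
        ) from exc


# Possible use inside an example script (assumes the layout described above):
#   utils_dir = Path(__file__).resolve().parents[3] / "python"
#   conv_utils = import_from_dir("conv_utils", utils_dir)
```

The duplicate check keeps repeated calls from growing `sys.path`, and the `exc.name` comparison keeps a missing dependency such as NumPy from being reported as a missing utilities module.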
Requirements
- Python 3.8+
- NumPy
- HIP runtime (for GPU execution)
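The requirements above can be checked before running an example. The sketch below is a rough pre-flight check, not something shipped with these utilities. The function name `check_requirements` is hypothetical, and the HIP check assumes the runtime is installed as a shared library named `amdhip64` that the system loader can find.

```python
import ctypes.util
import importlib.util
import sys


def check_requirements() -> dict:
    """Report which of the listed requirements appear to be satisfied."""
    return {
        "python>=3.8": sys.version_info >= (3, 8),
        # find_spec locates the package without importing it.
        "numpy": importlib.util.find_spec("numpy") is not None,
        # Assumption: the HIP runtime is a shared library named amdhip64.
        "hip_runtime": ctypes.util.find_library("amdhip64") is not None,
    }


for name, ok in check_requirements().items():
    print(f"{name}: {'found' if ok else 'missing'}")
```

A "missing" result for the HIP runtime can be a false negative when the library lives outside the default loader search path, so treat it as a hint rather than a verdict.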