Files
composable_kernel/client_example/20_splitk_gemm/README.md
2025-10-16 10:13:27 +00:00

67 lines
2.0 KiB
Markdown

# Client Example: Split-K GEMM
## Theory
This client example demonstrates **Split-K GEMM**, a technique for parallelizing matrix multiplication along the K dimension. Split-K is used to improve parallelism and memory bandwidth utilization for large GEMM operations, especially when K is large.
**Mathematical Formulation:**
- Standard GEMM: $C = A \times B$
- Split-K: Partition the K dimension into $K_s$ splits, compute partial results, then reduce:
$$
C = \sum_{s=1}^{K_s} (A_{[:, K_s]} \times B_{[K_s, :]})
$$
**Algorithmic Background:**
- Each split computes a partial GEMM over a chunk of K.
- Partial results are reduced (summed) to produce the final output.
- Useful for large K, limited workspace, or maximizing GPU occupancy.
## How to Run
### Prerequisites
Please follow the instructions in the main [Build Guide](../../README.md#building-ck) section as a prerequisite to building and running this example.
### Build and run
```bash
cd composable_kernel/client_example/20_splitk_gemm
mkdir build && cd build
cmake -DCMAKE_CXX_COMPILER=/opt/rocm/bin/hipcc ..
make -j
# Example run (FP16 compute, FP8 output)
./splitK_gemm_fp16_f8
```
## Source Code Structure
### Directory Layout
```
client_example/20_splitk_gemm/
├── splitK_gemm_fp16_f8.cpp # Main client example: Split-K GEMM (FP16 compute, FP8 output)
├── CMakeLists.txt # Build configuration for the example
```
### Key Functions
- **main()** (in `splitK_gemm_fp16_f8.cpp`):
Sets up input matrices, configures Split-K parameters, launches the Split-K GEMM kernel, and verifies the result.
- **Split-K kernel invocation**:
Uses the Composable Kernel device API to launch the Split-K GEMM operation.
---
## Additional Details
- Supports FP16 compute with FP8 output for memory efficiency.
- Example parameters can be adjusted in the source for different workloads.
---
## Related Examples
- [35_splitK_gemm](../../example/35_splitK_gemm/README.md): Split-K GEMM in the main example directory
---
[Back to Client Examples](../README.md)