mirror of https://github.com/ROCm/composable_kernel.git synced 2026-04-19 14:29:05 +00:00


Client Example: Grouped GEMM with Bias

Theory

This client example demonstrates grouped GEMM fused with bias addition. Grouped GEMM performs multiple independent GEMM operations (with potentially different shapes) in a single kernel launch, and bias addition is a standard pattern in neural network layers.

Mathematical Formulation: For $G$ groups, each with its own $A_g$, $B_g$, and $b_g$:

GEMM: $Y_g = A_g \times B_g$
Bias: $E_g = Y_g + b_g$
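
To make the formulation concrete, the following is a minimal host-side reference sketch, not the Composable Kernel implementation. It assumes row-major storage, a bias vector of length N broadcast across the rows of each group's output, and `float` instead of FP16 for readability. The names `GroupProblem`, `reference_gemm_bias`, and `reference_grouped_gemm_bias` are hypothetical and do not come from the library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side description of one group (not a Composable Kernel type).
struct GroupProblem {
    std::size_t M, N, K;
    std::vector<float> A;     // M x K, row-major
    std::vector<float> B;     // K x N, row-major
    std::vector<float> bias;  // length N, broadcast over rows (assumed layout)
};

// Computes E_g = A_g * B_g + b_g for a single group.
std::vector<float> reference_gemm_bias(const GroupProblem& g) {
    assert(g.A.size() == g.M * g.K);
    assert(g.B.size() == g.K * g.N);
    assert(g.bias.size() == g.N);

    std::vector<float> E(g.M * g.N);
    for (std::size_t m = 0; m < g.M; ++m) {
        for (std::size_t n = 0; n < g.N; ++n) {
            float acc = 0.0f;  // Y_g[m][n]
            for (std::size_t k = 0; k < g.K; ++k) {
                acc += g.A[m * g.K + k] * g.B[k * g.N + n];
            }
            E[m * g.N + n] = acc + g.bias[n];  // bias added after the GEMM
        }
    }
    return E;
}

// Applies the per-group computation to every group independently.
std::vector<std::vector<float>> reference_grouped_gemm_bias(
    const std::vector<GroupProblem>& groups) {
    std::vector<std::vector<float>> out;
    out.reserve(groups.size());
    for (const auto& g : groups) {
        out.push_back(reference_gemm_bias(g));
    }
    return out;
}
```

Because each group carries its own dimensions, the groups in one call may differ in shape. A reference like this is typically only useful for checking small problems on the host.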

Algorithmic Background:

Each group can have different matrix sizes and strides.
The kernel launches a grid covering all groups, with each block assigned to a group.
Bias is added in the epilogue for each group.
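
One common way to assign blocks to groups is to give each group a contiguous range of linear block IDs, sized by the number of output tiles it needs. The sketch below illustrates that idea on the host. It is an assumption about how such a mapping can work, not a description of the actual Composable Kernel kernel, and the names `GroupShape`, `BlockAssignment`, `build_block_offsets`, and `locate_block` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Output shape of one group (hypothetical helper type).
struct GroupShape {
    std::size_t M, N;
};

// Which group and which output tile a linear block ID works on.
struct BlockAssignment {
    std::size_t group;
    std::size_t tile_m;
    std::size_t tile_n;
};

inline std::size_t ceil_div(std::size_t a, std::size_t b) { return (a + b - 1) / b; }

// offsets[g] is the first linear block ID of group g;
// offsets.back() is the total number of blocks in the grid.
std::vector<std::size_t> build_block_offsets(const std::vector<GroupShape>& groups,
                                             std::size_t tile_m_size,
                                             std::size_t tile_n_size) {
    std::vector<std::size_t> offsets{0};
    for (const auto& g : groups) {
        const std::size_t tiles = ceil_div(g.M, tile_m_size) * ceil_div(g.N, tile_n_size);
        offsets.push_back(offsets.back() + tiles);
    }
    return offsets;
}

// Maps a linear block ID to its group and to the tile inside that group.
BlockAssignment locate_block(std::size_t block_id,
                             const std::vector<GroupShape>& groups,
                             const std::vector<std::size_t>& offsets,
                             std::size_t tile_n_size) {
    assert(block_id < offsets.back());
    std::size_t g = 0;
    while (block_id >= offsets[g + 1]) {
        ++g;  // linear search; a binary search also works for many groups
    }
    const std::size_t local   = block_id - offsets[g];
    const std::size_t tiles_n = ceil_div(groups[g].N, tile_n_size);
    return {g, local / tiles_n, local % tiles_n};
}
```

With 128 x 128 tiles, a 256 x 256 group occupies four blocks and a 100 x 128 group occupies one, so the grid has five blocks in total.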

How to Run

Prerequisites

Before building and running this example, complete the steps in the main Build Guide section.

Build and run

cd composable_kernel/client_example/21_grouped_gemm_bias
mkdir build && cd build
cmake -DCMAKE_CXX_COMPILER=/opt/rocm/bin/hipcc ..
make -j

# Example run (grouped GEMM with bias, FP16)
./grouped_gemm_fixed_nk_bias_fp16

Source Code Structure

Directory Layout

client_example/21_grouped_gemm_bias/
├── grouped_gemm_fixed_nk_bias_fp16.cpp         # Main client example: grouped GEMM + bias (FP16)
└── CMakeLists.txt                              # Build configuration for the example

Key Functions

main() (in grouped_gemm_fixed_nk_bias_fp16.cpp):
Sets up input matrices for each group, configures GEMM and bias parameters, launches the grouped kernel, and verifies the result.
Grouped GEMM kernel invocation:
Uses the Composable Kernel device API to launch grouped GEMM with bias addition.
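
The verification step can be pictured as an element-wise comparison against a host reference with a tolerance, since FP16 results rarely match a higher-precision reference bit for bit. The helper below is a generic sketch of such a check. The name `all_close` and the default tolerances are assumptions chosen for illustration and are not taken from the example source.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Returns true when every element of `result` is within
// atol + rtol * |reference| of the matching reference element.
bool all_close(const std::vector<float>& result,
               const std::vector<float>& reference,
               float rtol = 1e-3f,
               float atol = 1e-3f) {
    assert(rtol >= 0.0f && atol >= 0.0f);
    if (result.size() != reference.size()) {
        return false;
    }
    for (std::size_t i = 0; i < result.size(); ++i) {
        const float diff = std::fabs(result[i] - reference[i]);
        // Written as a negated <= so that NaN values fail the check.
        if (!(diff <= atol + rtol * std::fabs(reference[i]))) {
            return false;
        }
    }
    return true;
}
```

Running the check once per group keeps a mismatch easy to trace back to the group that produced it.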

Additional Details

Supports multiple groups with different matrix shapes.
Example parameters can be adjusted in the source for different workloads.

Related Examples

15_grouped_gemm: Grouped GEMM in the main example directory
11_convnd_fwd_bias: Convolution with bias fusion

