mirror of https://github.com/ROCm/composable_kernel.git synced 2026-05-12 09:16:52 +00:00

README.md

[DOCS] Documentation Addition (Readme updates) (#2495 )

2025-10-16 03:10:57 -07:00


Client Example: N-Dimensional Convolution Forward

Theory

This client example demonstrates the N-dimensional convolution forward pass, instantiated for 3D inputs, in three data-type configurations: FP16, FP32, and FP16 with FP8 compute. Convolution is a fundamental operation in deep learning, particularly in convolutional neural networks (CNNs) for images, audio, and volumetric data.

Mathematical Formulation: Given input $X$ and weights $W$:

$$
Y = \text{Conv}(X, W)
$$

- Supports 3D convolution (extensible to other numbers of spatial dimensions).
- Utilizes implicit GEMM for efficient computation.
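Written out per output element, the operator above can be expanded as follows. This is the standard direct-convolution definition for a single group, not a formula taken from the example sources. The stride $s$, dilation $\delta$, and padding $p$ are symbols introduced here as assumptions (taken equal along all three axes for brevity), and the lowercase $z, y, x$ run over the filter extents:

$$
Y[n, k, d_o, h_o, w_o] = \sum_{c} \sum_{z} \sum_{y} \sum_{x} X[n, c,\; s\,d_o + \delta\,z - p,\; s\,h_o + \delta\,y - p,\; s\,w_o + \delta\,x - p] \cdot W[k, c, z, y, x]
$$

Input coordinates that fall outside the tensor are read as zero.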

Algorithmic Background:

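- The forward convolution is computed as an implicit GEMM: coordinate transformations map each GEMM index to a tensor coordinate, so the input is read in place rather than copied into an explicit im2col buffer.
- Used in inference and training pipelines for 3D CNNs, for example in medical imaging and other volumetric workloads.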
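The coordinate transformation can be illustrated with a small host-side sketch. It is not the Composable Kernel implementation: `Conv3dProblem`, `load_input`, and `conv3d_fwd_implicit_gemm` are hypothetical names, the NDHWC/KZYXC/NDHWK layouts and the single group are assumptions, and stride, padding, and dilation are assumed equal along all three axes. The point is only that the GEMM operand for the input is never built; each GEMM index is decoded into a tensor coordinate on the fly.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical problem descriptor; the field names follow the README's parameter
// list, but this struct is not part of the Composable Kernel API.
struct Conv3dProblem
{
    int N, K, C;               // batch, output channels, input channels (single group)
    int Z, Y, X;               // filter extents
    int Di, Hi, Wi;            // input extents
    int Do, Ho, Wo;            // output extents
    int stride, pad, dilation; // assumed identical along all three spatial axes
};

// Reads input element (n, d, h, w, c) from an NDHWC buffer; coordinates that
// fall into the zero padding return 0.
inline float load_input(const Conv3dProblem& p, const std::vector<float>& in,
                        int n, int d, int h, int w, int c)
{
    if(d < 0 || d >= p.Di || h < 0 || h >= p.Hi || w < 0 || w >= p.Wi)
        return 0.0f;
    const std::size_t idx =
        (((static_cast<std::size_t>(n) * p.Di + d) * p.Hi + h) * p.Wi + w) * p.C + c;
    return in[idx];
}

// Forward convolution viewed as a GEMM with (N*Do*Ho*Wo) rows, K columns, and
// reduction length Z*Y*X*C. Weights are KZYXC, the output is NDHWK.
std::vector<float> conv3d_fwd_implicit_gemm(const Conv3dProblem& p,
                                            const std::vector<float>& in,
                                            const std::vector<float>& wei)
{
    const int gemm_m = p.N * p.Do * p.Ho * p.Wo;
    const int gemm_n = p.K;
    const int gemm_k = p.Z * p.Y * p.X * p.C;
    assert(in.size() == static_cast<std::size_t>(p.N) * p.Di * p.Hi * p.Wi * p.C);
    assert(wei.size() == static_cast<std::size_t>(gemm_n) * gemm_k);
    std::vector<float> out(static_cast<std::size_t>(gemm_m) * gemm_n, 0.0f);

    for(int m = 0; m < gemm_m; ++m)
    {
        // Decode the GEMM row index into an output coordinate (n, d_o, h_o, w_o).
        const int w_o = m % p.Wo;
        const int h_o = (m / p.Wo) % p.Ho;
        const int d_o = (m / (p.Wo * p.Ho)) % p.Do;
        const int n   = m / (p.Wo * p.Ho * p.Do);

        for(int k = 0; k < gemm_n; ++k)
        {
            float acc = 0.0f;
            for(int r = 0; r < gemm_k; ++r)
            {
                // Decode the reduction index into a filter tap (z, y, x) and channel c.
                const int c = r % p.C;
                const int x = (r / p.C) % p.X;
                const int y = (r / (p.C * p.X)) % p.Y;
                const int z = r / (p.C * p.X * p.Y);

                // Transform (output coordinate, filter tap) into an input coordinate.
                const int d = d_o * p.stride - p.pad + z * p.dilation;
                const int h = h_o * p.stride - p.pad + y * p.dilation;
                const int w = w_o * p.stride - p.pad + x * p.dilation;

                // With KZYXC weights, filter k is one contiguous run of gemm_k values.
                acc += load_input(p, in, n, d, h, w, c) *
                       wei[static_cast<std::size_t>(k) * gemm_k + r];
            }
            out[static_cast<std::size_t>(m) * gemm_n + k] = acc;
        }
    }
    return out;
}
```

A quick sanity check is an all-ones 3x3x3 input with an all-ones 3x3x3 filter and one element of padding: the centre output then sums all 27 taps, while a corner output sees only the 8 taps that land inside the input.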

How to Run

Prerequisites

Please follow the instructions in the main Build Guide section as a prerequisite to building and running this example.

Build and run

cd composable_kernel/client_example/16_convnd_fwd
mkdir build && cd build
cmake -DCMAKE_CXX_COMPILER=/opt/rocm/bin/hipcc ..
make -j

# Example run (3D forward, FP16)
./conv3d_fwd_fp16

# Example run (3D forward, FP32)
./conv3d_fwd_fp32

# Example run (3D forward, FP16 with FP8 compute)
./conv3d_fwd_fp16_comp_fp8

Source Code Structure

Directory Layout

client_example/16_convnd_fwd/
├── conv3d_fwd_fp16.cpp          # 3D convolution forward (FP16)
├── conv3d_fwd_fp32.cpp          # 3D convolution forward (FP32)
├── conv3d_fwd_fp16_comp_fp8.cpp # 3D convolution forward (FP16 with FP8 compute)
├── common.hpp                   # Common utilities for convolution
└── CMakeLists.txt               # Build configuration for the example

Key Functions

- main() (in each .cpp): Sets up input/output tensors, configures convolution parameters, launches the forward kernel, and verifies the result.
- Forward convolution kernel invocation: Uses the Composable Kernel device API to launch convolution forward for different data types.

Additional Details

Supports FP16, FP32, and FP16 with FP8 compute for 3D convolution.
Parameters can be adjusted in the source files for different workloads. The following parameters are configurable:
- NumDimSpatial: Number of spatial dimensions (default: 3 for 3D convolution)
- G: Number of groups (default: 1)
- N: Batch size (default: 64)
- K: Number of output channels (default: 128)
- C: Number of input channels (default: 64)
- Z, Y, X: Filter/kernel dimensions (default: 3x3x3)
- Di, Hi, Wi: Input dimensions - depth, height, width (default: 28x28x3)
- Do, Ho, Wo: Output dimensions - depth, height, width (default: 28x28x3)
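The default output extents equal the input extents, which is what a 3x3x3 filter yields with unit stride, unit dilation, and one element of padding per side. Those three values are not listed above and are assumed here. The sketch below checks that relationship and estimates the work for the default problem, counting each multiply-accumulate as two floating-point operations and treating C and K as per-group channel counts; `conv_out_extent` and `conv3d_fwd_flops` are hypothetical helper names, not functions from the example.

```cpp
#include <cassert>
#include <cstdint>

// Output extent along one spatial axis (standard convolution arithmetic).
int conv_out_extent(int in, int filter, int stride, int pad, int dilation)
{
    assert(stride > 0 && filter > 0 && dilation > 0);
    return (in + 2 * pad - dilation * (filter - 1) - 1) / stride + 1;
}

// Floating-point operations of one forward pass: two per multiply-accumulate.
// C and K are taken as per-group channel counts (an assumption of this sketch).
std::int64_t conv3d_fwd_flops(std::int64_t G, std::int64_t N, std::int64_t K, std::int64_t C,
                              std::int64_t Z, std::int64_t Y, std::int64_t X,
                              std::int64_t Do, std::int64_t Ho, std::int64_t Wo)
{
    return 2 * G * N * K * C * Z * Y * X * Do * Ho * Wo;
}
```

With the defaults listed above, `conv_out_extent(28, 3, 1, 1, 1)` gives 28 and `conv_out_extent(3, 3, 1, 1, 1)` gives 3, matching Do, Ho, and Wo, and `conv3d_fwd_flops(1, 64, 128, 64, 3, 3, 3, 28, 28, 3)` comes to 66,588,770,304, roughly 66.6 GFLOP per forward pass.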

Related Examples

- 09_convnd_fwd: N-dimensional convolution in the main example directory
- 30_grouped_conv_fwd_multiple_d: Grouped convolution forward with multiple D

