3D Pooling Forward

This example demonstrates a 3D pooling forward operation. Pooling is a fundamental operation in convolutional neural networks that reduces the spatial dimensions of feature maps while retaining important information. 3D pooling extends this concept to three-dimensional data, commonly used in video analysis, medical imaging, and 3D computer vision applications.

Mathematical Formulation

3D pooling operates on 5D tensors with shape [N, C, D, H, W] where:

  • N is the batch size
  • C is the number of channels
  • D, H, W are the depth, height, and width dimensions

The operation applies a pooling function over 3D windows of the input tensor.

For each output position (n, c, d_out, h_out, w_out):

$$\text{Out}_{ncd_{out}h_{out}w_{out}} = \text{Pool}(\{X_{ncd'h'w'} : d' \in W_d, h' \in W_h, w' \in W_w\})$$

Where:

  • W_d, W_h, W_w are the sets of input indices along depth, height, and width that the 3D pooling window covers at that output position
  • Pool is the pooling function (e.g., max or average)

Max Pooling:

$$\text{Pool}(S) = \max(S)$$

Average Pooling:

$$\text{Pool}(S) = \frac{1}{|S|} \sum_{x \in S} x$$
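
As a small worked example (the values are made up for illustration), take a window whose valid elements are S = {1, 3, 2, 6}. The two pooling functions then give:

$$\max(S) = 6, \qquad \frac{1}{|S|} \sum_{x \in S} x = \frac{1 + 3 + 2 + 6}{4} = 3$$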

The window positions are determined by:

  • Window size: (pool_d, pool_h, pool_w)
  • Stride: (stride_d, stride_h, stride_w)
  • Padding: (pad_d, pad_h, pad_w)
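
The way these three parameters determine the output size and the window position along one dimension can be sketched as follows. This is a minimal illustration that assumes symmetric padding, no dilation, and floor rounding. The helper names `pooled_length` and `window_range` are hypothetical and are not part of the Composable Kernel API.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>

// Hypothetical helper: number of output positions along one dimension,
// assuming symmetric padding, no dilation, and floor rounding.
inline int pooled_length(int in_len, int pool, int stride, int pad)
{
    assert(pool > 0 && stride > 0 && pad >= 0);
    assert(in_len + 2 * pad >= pool);
    return (in_len + 2 * pad - pool) / stride + 1;
}

// Hypothetical helper: half-open range [begin, end) of valid input indices
// covered by the window at output index `out_idx`, clipped to the tensor.
inline std::pair<int, int>
window_range(int out_idx, int in_len, int pool, int stride, int pad)
{
    // The unclipped start may be negative when the window begins in the padding.
    const int start = out_idx * stride - pad;
    return {std::max(start, 0), std::min(start + pool, in_len)};
}
```

Under these assumptions, an input depth of 7 with pool_d = 3, stride_d = 2, and pad_d = 1 yields 4 output positions. The first window covers only input indices 0 and 1, because index -1 falls in the padding.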

Algorithmic Strategy: Parallel Window-based Computation

3D pooling is implemented as a parallel algorithm where each thread computes one output element.

  1. Grid Scheduling: The output tensor elements are distributed across GPU threads, with one output element assigned to each thread.

  2. Window Processing: For each output position, a thread performs the following steps:

    • Calculate Input Window: Determine the 3D input window corresponding to the current output position based on stride, padding, and window size.
    • Boundary Handling: Check for boundary conditions and padding so that only valid input positions are processed.
    • Apply Pooling Function:
      • Max Pooling: Iterate through the window and find the maximum value.
      • Average Pooling: Iterate through the window, accumulate the values, and compute the average.
    • Store Result: Write the computed result to the output tensor.
  3. Memory Access Optimization: The kernel is organized for efficient memory access, using techniques such as:

    • Coalesced memory access where possible
    • Shared memory for frequently accessed data
    • Efficient handling of boundary conditions
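
The per-thread steps above can be sketched as a plain C++ CPU reference, where each iteration of the outer loop plays the role of one GPU thread. This is not the Composable Kernel implementation: the names `Pool3dParams`, `PoolOp`, `out_len`, and `pool3d_fwd_ref` are hypothetical, and the sketch assumes a packed NCDHW float tensor, symmetric padding, no dilation, and an average taken over valid (non-padding) elements only, which may differ from what the actual kernel does.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical parameter bundle for this sketch (not a CK type).
struct Pool3dParams
{
    int N, C, D, H, W; // input shape
    int pool[3];       // window size   (d, h, w)
    int stride[3];     // stride        (d, h, w)
    int pad[3];        // symmetric pad (d, h, w)
};

enum class PoolOp { Max, Avg };

// Output length along one dimension (floor rounding, no dilation).
inline int out_len(int in, int pool, int stride, int pad)
{
    return (in + 2 * pad - pool) / stride + 1;
}

inline std::vector<float>
pool3d_fwd_ref(const std::vector<float>& x, const Pool3dParams& p, PoolOp op)
{
    assert(x.size() == static_cast<std::size_t>(p.N) * p.C * p.D * p.H * p.W);
    const int Do = out_len(p.D, p.pool[0], p.stride[0], p.pad[0]);
    const int Ho = out_len(p.H, p.pool[1], p.stride[1], p.pad[1]);
    const int Wo = out_len(p.W, p.pool[2], p.stride[2], p.pad[2]);
    std::vector<float> y(static_cast<std::size_t>(p.N) * p.C * Do * Ho * Wo);

    // One iteration per output element, mirroring one-thread-per-output.
    for(std::size_t tid = 0; tid < y.size(); ++tid)
    {
        // Decompose the linear index into (nc, od, oh, ow).
        std::size_t r = tid;
        const int ow = static_cast<int>(r % Wo); r /= Wo;
        const int oh = static_cast<int>(r % Ho); r /= Ho;
        const int od = static_cast<int>(r % Do); r /= Do;
        const std::size_t nc = r; // fused batch and channel index

        // Input window, clipped to the valid region (boundary handling).
        const int d0 = std::max(od * p.stride[0] - p.pad[0], 0);
        const int d1 = std::min(od * p.stride[0] - p.pad[0] + p.pool[0], p.D);
        const int h0 = std::max(oh * p.stride[1] - p.pad[1], 0);
        const int h1 = std::min(oh * p.stride[1] - p.pad[1] + p.pool[1], p.H);
        const int w0 = std::max(ow * p.stride[2] - p.pad[2], 0);
        const int w1 = std::min(ow * p.stride[2] - p.pad[2] + p.pool[2], p.W);

        // Apply the pooling function over the clipped window.
        float acc = (op == PoolOp::Max) ? std::numeric_limits<float>::lowest() : 0.0f;
        int count = 0;
        for(int d = d0; d < d1; ++d)
            for(int h = h0; h < h1; ++h)
                for(int w = w0; w < w1; ++w)
                {
                    const float v = x[((nc * p.D + d) * p.H + h) * p.W + w];
                    acc = (op == PoolOp::Max) ? std::max(acc, v) : acc + v;
                    ++count;
                }

        // Store the result; the average divides by the valid element count.
        y[tid] = (op == PoolOp::Avg && count > 0) ? acc / count : acc;
    }
    return y;
}
```

A reference of this kind is mainly useful for checking GPU results on small inputs. For instance, a 1x1x2x2x2 input holding the values 1 through 8, pooled with a 2x2x2 window, produces a single output of 8 for max pooling and 4.5 for average pooling.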

Source Code Organization

Build and Run

Prerequisites

Ensure the Composable Kernel library is built and installed.

cd /path/to/composable_kernel/build
make -j install

Build the Example

cd /path/to/composable_kernel/example/48_pool3d_fwd
mkdir build && cd build

cmake \
  -DCMAKE_CXX_COMPILER=/opt/rocm/bin/hipcc \
  -DCMAKE_PREFIX_PATH="/opt/rocm;${CK_INSTALL_PATH}" \
  ..

make -j

Run the Example

# Run the example with default settings
./pool3d_fwd_xdl

# Run with verification (1), data initialization mode (2), and timing (1)
./pool3d_fwd_xdl 1 2 1

Applications

3D pooling is essential in several domains that process volumetric or temporal data.

  • Video Analysis: In video understanding tasks, 3D CNNs use 3D pooling to reduce temporal and spatial dimensions while preserving important motion and appearance features.
  • Medical Imaging: 3D medical images (CT scans, MRI) require 3D pooling for feature extraction while maintaining spatial relationships in all three dimensions.
  • 3D Computer Vision: Object detection and segmentation in 3D point clouds or voxel grids use 3D pooling for hierarchical feature learning.
  • Action Recognition: Video action recognition models use 3D pooling to aggregate features across temporal and spatial dimensions.
  • Volumetric Data Processing: Scientific applications processing 3D volumetric data (weather modeling, fluid dynamics) use 3D pooling for multi-scale analysis.