3D Pooling Forward

This example demonstrates a 3D pooling forward operation. Pooling is a fundamental operation in convolutional neural networks that reduces the spatial dimensions of feature maps while retaining important information. 3D pooling extends this concept to three-dimensional data, commonly used in video analysis, medical imaging, and 3D computer vision applications.

Mathematical Formulation

3D pooling operates on 5D tensors with shape [N, C, D, H, W] where:

  • N is the batch size
  • C is the number of channels
  • D, H, W are the depth, height, and width dimensions

The operation applies a pooling function over 3D windows of the input tensor.
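To make the [N, C, D, H, W] indexing concrete, here is a minimal sketch of how a 5D coordinate could map to a linear memory offset. It assumes a packed layout in which W varies fastest; the layout used by the actual kernel may differ, and the names `Shape5D` and `offset_ncdhw` are hypothetical, introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: extents of a 5D tensor stored contiguously as [N, C, D, H, W].
struct Shape5D {
    std::size_t n, c, d, h, w;
};

// Linear offset of element (n, c, d, h, w), assuming a packed layout
// in which W is the fastest-varying dimension (an assumption of this sketch).
inline std::size_t offset_ncdhw(const Shape5D& s, std::size_t n, std::size_t c,
                                std::size_t d, std::size_t h, std::size_t w)
{
    assert(n < s.n && c < s.c && d < s.d && h < s.h && w < s.w);
    return (((n * s.c + c) * s.d + d) * s.h + h) * s.w + w;
}
```

With this layout, neighbouring `w` indices are adjacent in memory, which is what makes coalesced reads along W possible.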

For each output position (n, c, d_out, h_out, w_out):

$$\text{Out}_{n c d_{out} h_{out} w_{out}} = \text{Pool}(\{X_{n c d' h' w'} : d' \in W_d, h' \in W_h, w' \in W_w\})$$

Where:

  • W_d, W_h, W_w are the sets of input depth, height, and width indices covered by the 3D pooling window at that output position
  • Pool is the pooling function (e.g., max or average)

Max Pooling:

$$\text{Pool}(S) = \max(S)$$

Average Pooling:

$$\text{Pool}(S) = \frac{1}{|S|} \sum_{x \in S} x$$
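The two pooling functions can be sketched directly from their definitions. The snippet below is an illustration only, operating on the values of a single window collected into a vector; the names `pool_max` and `pool_avg` are hypothetical and are not part of the Composable Kernel API.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Max pooling over the values S of one window. S must be non-empty.
inline float pool_max(const std::vector<float>& S)
{
    assert(!S.empty());
    return *std::max_element(S.begin(), S.end());
}

// Average pooling: the sum of the window values divided by |S|.
inline float pool_avg(const std::vector<float>& S)
{
    assert(!S.empty());
    const float sum = std::accumulate(S.begin(), S.end(), 0.0f);
    return sum / static_cast<float>(S.size());
}
```

For the window values {1, 2, 3, 6}, `pool_max` yields 6 and `pool_avg` yields 3.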

The window positions are determined by:

  • Window size: (pool_d, pool_h, pool_w)
  • Stride: (stride_d, stride_h, stride_w)
  • Padding: (pad_d, pad_h, pad_w)
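As a rough sketch of how these three parameters determine the window positions, the helpers below compute the output extent along one axis and the first input index covered by a given output index. They assume symmetric padding, no dilation, and floor rounding, which is the common convention but is not confirmed here for this example; `pooled_extent` and `window_start` are hypothetical names.

```cpp
#include <cassert>

// Number of output positions along one axis, assuming symmetric padding,
// no dilation, and floor rounding (assumptions of this sketch).
inline int pooled_extent(int in, int pool, int stride, int pad)
{
    assert(pool > 0 && stride > 0 && pad >= 0);
    assert(in + 2 * pad >= pool);
    return (in + 2 * pad - pool) / stride + 1;
}

// First input index covered by output index `out` along one axis.
// The result is negative when the window starts inside the padding.
inline int window_start(int out, int stride, int pad)
{
    return out * stride - pad;
}
```

The same formulas apply independently to the depth, height, and width axes with their respective (pool, stride, pad) values. For example, an input extent of 7 with window 3, stride 2, and padding 1 gives 4 output positions, the first of which starts at input index -1.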

Algorithmic Strategy: Parallel Window-based Computation

3D pooling is implemented as a parallel algorithm where each thread computes one output element.

  1. Grid Scheduling: The output tensor elements are distributed across GPU threads, one output element per thread.

  2. Window Processing: For each output position, a thread performs the following steps:

    • Calculate Input Window: Determines the 3D input window corresponding to the current output position based on stride, padding, and window size.
    • Boundary Handling: Checks for boundary conditions and padding, ensuring that only valid input positions are processed.
    • Apply Pooling Function:
      • Max Pooling: Iterates through the window and finds the maximum value.
      • Average Pooling: Iterates through the window, accumulates values, and computes the average.
    • Store Result: Writes the computed result to the output tensor.

  3. Memory Access Optimization: The kernel is optimized for memory access patterns, using techniques such as:

    • Coalesced memory access where possible
    • Shared memory for frequently accessed data
    • Efficient handling of boundary conditions
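The steps above can be sketched as a CPU reference for max pooling, where each iteration of the flat loop plays the role of one GPU thread. This is a minimal illustration and not the kernel shipped with the example: it assumes a packed [N, C, D, H, W] layout, symmetric padding, and no dilation, and the names `Dims3` and `max_pool3d_ref` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

struct Dims3 { int d, h, w; };  // hypothetical (depth, height, width) triple

// CPU sketch of 3D max pooling forward on a packed [N, C, D, H, W] tensor.
// The computed output extents are returned through `out`.
std::vector<float> max_pool3d_ref(const std::vector<float>& x, int N, int C, Dims3 in,
                                  Dims3 pool, Dims3 stride, Dims3 pad, Dims3& out)
{
    assert(x.size() == static_cast<std::size_t>(N) * C * in.d * in.h * in.w);
    out.d = (in.d + 2 * pad.d - pool.d) / stride.d + 1;
    out.h = (in.h + 2 * pad.h - pool.h) / stride.h + 1;
    out.w = (in.w + 2 * pad.w - pool.w) / stride.w + 1;

    const std::size_t total = static_cast<std::size_t>(N) * C * out.d * out.h * out.w;
    std::vector<float> y(total);

    // Grid scheduling: one loop iteration (one "thread") per output element.
    for (std::size_t idx = 0; idx < total; ++idx) {
        // Decompose the flat output index into (n*C + c, od, oh, ow).
        std::size_t t = idx;
        const int ow = static_cast<int>(t % out.w); t /= out.w;
        const int oh = static_cast<int>(t % out.h); t /= out.h;
        const int od = static_cast<int>(t % out.d); t /= out.d;
        const std::size_t nc = t;  // combined batch/channel index

        // Calculate input window: the start may be negative inside the padding.
        const int d0 = od * stride.d - pad.d;
        const int h0 = oh * stride.h - pad.h;
        const int w0 = ow * stride.w - pad.w;

        // Boundary handling: clamp the window so padded positions are skipped.
        // A window lying entirely in padding would leave `best` at lowest().
        float best = std::numeric_limits<float>::lowest();
        for (int d = std::max(d0, 0); d < std::min(d0 + pool.d, in.d); ++d)
            for (int h = std::max(h0, 0); h < std::min(h0 + pool.h, in.h); ++h)
                for (int w = std::max(w0, 0); w < std::min(w0 + pool.w, in.w); ++w) {
                    const std::size_t off = ((nc * in.d + d) * in.h + h) * in.w + w;
                    best = std::max(best, x[off]);  // apply pooling function
                }

        y[idx] = best;  // store result
    }
    return y;
}
```

Average pooling would follow the same structure, accumulating a sum inside the window loops and dividing by the element count before the store. Whether padded positions count toward that divisor is a design choice this sketch does not settle.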

Source Code Organization

Build and Run

Prerequisites

Ensure the Composable Kernel library is built and installed.

cd /path/to/composable_kernel/build
make -j install

Build the Example

cd /path/to/composable_kernel/example/48_pool3d_fwd
mkdir build && cd build

# CK_INSTALL_PATH must point to the Composable Kernel install prefix
cmake \
  -DCMAKE_CXX_COMPILER=/opt/rocm/bin/hipcc \
  -DCMAKE_PREFIX_PATH="/opt/rocm;${CK_INSTALL_PATH}" \
  ..

make -j

Run the Example

# Run the example with default settings
./pool3d_fwd_xdl

# Run with verification enabled (1), data initialization method 2, and kernel timing enabled (1)
./pool3d_fwd_xdl 1 2 1

Applications

3D pooling is essential in several domains that process volumetric or temporal data.

  • Video Analysis: In video understanding tasks, 3D CNNs use 3D pooling to reduce temporal and spatial dimensions while preserving important motion and appearance features.
  • Medical Imaging: 3D medical images (CT scans, MRI) require 3D pooling for feature extraction while maintaining spatial relationships in all three dimensions.
  • 3D Computer Vision: Object detection and segmentation in 3D point clouds or voxel grids use 3D pooling for hierarchical feature learning.
  • Action Recognition: Video action recognition models use 3D pooling to aggregate features across temporal and spatial dimensions.
  • Volumetric Data Processing: Scientific applications processing 3D volumetric data (weather modeling, fluid dynamics) use 3D pooling for multi-scale analysis.