composable_kernel/experimental/builder/include/ck_tile/builder/reflect
kabrahamAMD adba0d2198 [CK_Builder] added bwd data kernels to builder factory (#4582)
This PR adds bwd data wmma and xdl kernels to the ck builder, their
instance and conv traits as well as tests for the above.

---------

Co-authored-by: Kevin Abraham <kevin.abraham@streamhpc.com>
Co-authored-by: John Shumway <jshumway@amd.com>
2026-02-27 03:05:38 +00:00

Convolution Reflection Directory

This directory contains tools for "reflecting" on convolution kernel instances. It allows developers to inspect the compile-time configuration of a kernel and generate detailed, human-readable descriptions.

See the main builder documentation for an overview.

Design Overview

The reflection system works by extracting properties from a convolution kernel type and formatting them into a string. This is useful for debugging, performance tuning, and generating documentation.

  1. Trait Extraction: The ConvTraits template (in conv_traits.hpp) is specialized for each kernel instance. It extracts low-level details such as tile sizes, data layouts, and pipeline versions from the kernel's type definition. The same template serves XDL and WMMA kernels, both forward and backward weight. Parameters that apply to only some kernels are stored as std::optional.

  2. Description Generation: The describe<Instance>() function (in conv_description.hpp) uses ConvTraits to populate a ConvDescription (Description) object.

  3. Formatting: The ConvDescription class (which implements Description) contains methods like brief() and detailed() that format the extracted properties into well-structured strings for display.
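The three steps above can be illustrated with a small, self-contained sketch. It is not the actual ConvTraits or ConvDescription code: every name in it (ToyConvInstance, ToyTraits, ToyDescription, toy_describe, num_groups_to_merge) is hypothetical and chosen only to mirror the extract, describe, format flow, including a std::optional field for a parameter that only some kernels would set.

```cpp
#include <optional>
#include <sstream>
#include <string>

// Hypothetical stand-in for a kernel instance type. Real instances carry
// many more template parameters than these three tile sizes.
template <int MPerBlock, int NPerBlock, int KPerBlock>
struct ToyConvInstance
{
    static constexpr int kMPerBlock = MPerBlock;
    static constexpr int kNPerBlock = NPerBlock;
    static constexpr int kKPerBlock = KPerBlock;
};

// Step 1 (trait extraction): a plain struct filled from the instance type.
struct ToyTraits
{
    int m_per_block;
    int n_per_block;
    int k_per_block;
    // Hypothetical parameter that only some kernels define; empty otherwise.
    std::optional<int> num_groups_to_merge;
};

template <typename Instance>
ToyTraits extract_traits()
{
    return ToyTraits{
        Instance::kMPerBlock, Instance::kNPerBlock, Instance::kKPerBlock, std::nullopt};
}

// Step 3 (formatting): turns the extracted traits into strings.
struct ToyDescription
{
    ToyTraits traits;

    std::string brief() const
    {
        std::ostringstream os;
        os << "Conv tile " << traits.m_per_block << "x" << traits.n_per_block << "x"
           << traits.k_per_block;
        return os.str();
    }

    std::string detailed() const
    {
        std::ostringstream os;
        os << brief() << "\n";
        os << "  MPerBlock: " << traits.m_per_block << "\n";
        os << "  NPerBlock: " << traits.n_per_block << "\n";
        os << "  KPerBlock: " << traits.k_per_block << "\n";
        // Optional parameters are printed only when the kernel provides them.
        if(traits.num_groups_to_merge)
            os << "  NumGroupsToMerge: " << *traits.num_groups_to_merge << "\n";
        return os.str();
    }
};

// Step 2 (description generation): factory that connects steps 1 and 3.
template <typename Instance>
ToyDescription toy_describe()
{
    return ToyDescription{extract_traits<Instance>()};
}

// Example instance used for illustration.
using ExampleInstance = ToyConvInstance<128, 64, 32>;
```

With these definitions, toy_describe<ExampleInstance>().brief() yields the string "Conv tile 128x64x32", and detailed() adds one indented line per extracted parameter.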

Key Files

  • description.hpp: The generic Description base class, which declares the interface but provides no implementation.

  • conv_description.hpp: The main entry point. Contains the ConvDescription struct and the describe() factory function.

  • conv_traits.hpp: Home of the ConvTraits template, which is the core of the property extraction mechanism.

  • tree_formatter.hpp: A simple utility for generating the indented, tree-like format used in the detailed() description.
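To give a feel for what an indented, tree-like format involves, here is a minimal sketch of a tree renderer. It does not reproduce the API of tree_formatter.hpp, which is not shown in this document; ToyNode, render_children, render_tree, and example_tree are hypothetical names, and the ASCII branch markers are an assumption.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical tree node: a label plus any number of child nodes.
struct ToyNode
{
    std::string label;
    std::vector<ToyNode> children;
};

// Appends one line per descendant of `node`. `prefix` carries the vertical
// guides inherited from the ancestors of the current level.
inline void render_children(const ToyNode& node, const std::string& prefix, std::string& out)
{
    for(std::size_t i = 0; i < node.children.size(); ++i)
    {
        const bool last = (i + 1 == node.children.size());
        out += prefix + (last ? "`- " : "|- ") + node.children[i].label + "\n";
        // Below the last child no vertical guide is needed any more.
        render_children(node.children[i], prefix + (last ? "   " : "|  "), out);
    }
}

// Renders the root label followed by its indented descendants.
inline std::string render_tree(const ToyNode& root)
{
    std::string out = root.label + "\n";
    render_children(root, "", out);
    return out;
}

// Small example tree used for illustration.
inline ToyNode example_tree()
{
    ToyNode tile{"Tile", {ToyNode{"M: 128", {}}, ToyNode{"N: 64", {}}}};
    ToyNode layout{"Layout: NHWC", {}};
    return ToyNode{"Conv", {tile, layout}};
}
```

Calling render_tree(example_tree()) produces five lines: the root "Conv", then "Tile" with its two indented children, then "Layout: NHWC" as the last branch.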

Usage

To get a description of a convolution kernel instance, call the describe function and then one of the formatting methods on the object it returns:

#include <iostream>

#include "ck_tile/builder/reflect/conv_description.hpp"

// Assume MyConvFwdInstance is a type alias for a specific kernel instance
using MyConvFwdInstance = /* ... some kernel type ... */;

// Describe the instance
const auto description = ck_tile::reflect::conv::describe<MyConvFwdInstance>();

// Print the detailed description
std::cout << description.detailed() << std::endl;

Appendix: Current Limitations

Supported Instance Types

The reflection system (ckr::describe) currently supports the following convolution instance types:

  • Standard XDL Forward Convolution (DeviceGroupedConvFwdMultipleABD_Xdl_CShuffle)
  • Large Tensor XDL Forward Convolution (DeviceGroupedConvFwdMultipleD_Xdl_CShuffle_Large_Tensor)
  • V3 XDL Forward Convolution (DeviceGroupedConvFwdMultipleABD_Xdl_CShuffle_V3)
  • V3 WMMA Forward Convolution (DeviceGroupedConvBwdWeightMultipleD_Wmma_CShuffleV3)
  • XDL Backward Weight Convolution (DeviceGroupedConvBwdWeight_Xdl_CShuffle)
  • V3 XDL Backward Weight Convolution (DeviceGroupedConvBwdWeight_Xdl_CShuffleV3)
  • XDL Multiple D Backward Weight Convolution (DeviceGroupedConvBwdWeightMultipleD_Xdl_CShuffle)
  • Two Stage XDL Backward Weight Convolution (DeviceGroupedConvBwdWeightTwoStage_Xdl_CShuffle)
  • V3 Two Stage Wmma Backward Weight Convolution (DeviceGroupedConvBwdWeightTwoStage_Wmma_CShuffleV3)
  • Wmma Backward Weight Convolution (DeviceGroupedConvBwdWeight_Wmma_CShuffle)
  • V3 Wmma Backward Weight Convolution (DeviceGroupedConvBwdWeight_Wmma_CShuffleV3)
  • V3 Wmma Multiple D Backward Weight Convolution (DeviceGroupedConvBwdWeightMultipleD_Wmma_CShuffleV3)

These variants all share similar template parameter structures and are compatible with the current ConvTraits implementation.

Unsupported Instance Types

The following instance types are not yet supported by the reflection system:

  • DL (pre-XDL) Variants (DeviceGroupedConvFwdDlMultipleD_NHWC_KYXC_NHWK)
    • Uses different internal structure with parameters like K0PerBlock, K1, M1PerThread, etc.
    • Missing standard members like kKPerBlock, kMPerXDL, kAK1

Future Work

To support these additional instance types, the reflection system would need:

  1. Specialized ConvTraits templates for each variant type
  2. Updated conv_layout, conv_data_type, and other helper functions to handle different parameter structures
  3. Conditional compilation or SFINAE techniques to select the appropriate trait extraction logic based on instance type
  4. Customized ConvDescription methods for more general kernels
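Item 3 can be sketched with the C++17 detection idiom (std::void_t plus if constexpr). This is only an illustration of the technique, not the planned design: ToyXdlInstance, ToyDlInstance, has_xdl_members, has_dl_members, and toy_trait_family are hypothetical, and the sketch assumes the parameters named above (kKPerBlock, kMPerXDL, kAK1, K0PerBlock, K1, M1PerThread) are exposed as static members, which may not match the real instance types.

```cpp
#include <string>
#include <type_traits>

// Hypothetical XDL-style instance exposing the "standard" members.
struct ToyXdlInstance
{
    static constexpr int kKPerBlock = 32;
    static constexpr int kMPerXDL   = 32;
    static constexpr int kAK1       = 8;
};

// Hypothetical DL-style instance with a different parameter set.
struct ToyDlInstance
{
    static constexpr int K0PerBlock  = 16;
    static constexpr int K1          = 2;
    static constexpr int M1PerThread = 4;
};

// Detection idiom: the partial specialization is chosen only when all of
// the named members exist on T; otherwise the primary template applies.
template <typename T, typename = void>
struct has_xdl_members : std::false_type
{
};

template <typename T>
struct has_xdl_members<
    T,
    std::void_t<decltype(T::kKPerBlock), decltype(T::kMPerXDL), decltype(T::kAK1)>>
    : std::true_type
{
};

template <typename T, typename = void>
struct has_dl_members : std::false_type
{
};

template <typename T>
struct has_dl_members<
    T,
    std::void_t<decltype(T::K0PerBlock), decltype(T::K1), decltype(T::M1PerThread)>>
    : std::true_type
{
};

// Selects the extraction path at compile time based on the members found.
template <typename Instance>
std::string toy_trait_family()
{
    if constexpr(has_xdl_members<Instance>::value)
        return "xdl-style traits";
    else if constexpr(has_dl_members<Instance>::value)
        return "dl-style traits";
    else
        return "unsupported";
}

static_assert(has_xdl_members<ToyXdlInstance>::value, "XDL members should be detected");
static_assert(!has_xdl_members<ToyDlInstance>::value, "DL instance lacks XDL members");
```

In a real implementation each branch would dispatch to a dedicated trait extraction routine rather than return a label; C++20 concepts with requires expressions would express the same checks more compactly.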

For now, these unsupported types can still use GetInstanceString() through the base class pointer, but cannot use the ckr::describe reflection API.