mirror of
https://github.com/ROCm/composable_kernel.git
synced 2026-06-23 16:35:49 +00:00
[CK] [MIOPEN] Split convolution library by layout

# Split Composable Kernel convolution operations by data layout

TLDR:

1. This is a reorganization of files, folders, and CMakeLists for the convolution kernels. It makes it possible to split the convolution library by layout.
2. For MIOpen-only builds of CK, the build-time speedup ranges from 15% to 40%, depending on the target architecture. For TheRock nightly builds of CK, which include both MIOpen and hipTensor kernel instances, this amounted to a 10% decrease in compile time on gfx1100.

## Overview

Based on https://github.com/ROCm/composable_kernel/pull/3010/, except that this change keeps a single static library.

## What MIOpen Actually Uses

MIOpen **exclusively** uses:

- **NHWGC** for all 2D convolutions
- **NDHWGC** for all 3D convolutions

This is because MIOpen's tensor descriptors natively use channel-last, group-aware formats.

## Key Changes

### 1. Layout-Based Directory Structure

Reorganized the convolution instance files from a flat per-operation structure into a hierarchical layout-based structure. For example:

**Before:**

    grouped_conv2d_fwd/
    ├── device_grouped_conv2d_fwd_xdl_nhwgc_*.cpp (MIOpen-required)
    ├── device_grouped_conv2d_fwd_xdl_gnhwc_*.cpp (optional)
    └── device_grouped_conv2d_fwd_xdl_ngchw_*.cpp (optional)

**After:**

    grouped_conv2d_fwd/
    ├── nhwgc/ ← MIOpen-required
    │   ├── xdl/device_grouped_conv2d_fwd_xdl_*.cpp
    │   └── wmma/device_grouped_conv2d_fwd_wmma_*.cpp
    ├── gnhwc/ ← Optional (excluded with MIOPEN_REQ_LIBS_ONLY)
    └── ngchw/ ← Optional (excluded with MIOPEN_REQ_LIBS_ONLY)

### 2. Preserved Umbrella Library

As before, all convolution operations are consolidated into a single static `device_conv_operations` library:

- Aggregates the layout-specific instance object files via the `ADD_CONV_LAYOUT_INSTANCES` macro
- **Default build:** Includes all layouts (NHWGC + GNHWC + NGCHW + NDHWGC + GNDHWC + NGCDHW)
- **MIOpen build (`MIOPEN_REQ_LIBS_ONLY=ON`):** Includes only the NHWGC and NDHWGC layouts

### 3. Binary Size Reduction

When building with `MIOPEN_REQ_LIBS_ONLY=ON`:

**Layouts Included (26 targets):**

- 7× NHWGC instances (2D operations + variants)
- 19× NDHWGC instances (3D operations + variants)

**Layouts Excluded (16 targets):**

- 3× GNHWC instances (2D operations)
- 3× NGCHW instances (2D operations)
- 3× GNDHWC instances (3D operations)
- 3× NGCDHW instances (3D operations)
- 2× GNWC instances (1D operations)
- 1× NWGC instance (1D operations)
- 1× additional NHWGC instance (grouped_conv1d_fwd, not needed by MIOpen)

This represents a **~38% reduction in instance targets** (16 excluded out of 42 total layout-specific targets).

### Testing

- ✅ All existing CK tests link against the umbrella library
- ✅ MIOpen links successfully with the reduced umbrella library
- ✅ Profiler builds with all layout-specific targets explicitly listed

Notes from the Author:

Since this refactor moved most of the convolution files further into subdirectories, I concentrated on ensuring that no source files were excluded, including sharded sources. Targets are correctly migrated: no missing targets and no shard count mismatches.
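The flag-gated layout selection described above can be pictured with a small script-mode sketch. This is only an illustration under stated assumptions, not the CK build logic: the list variables and the printed message are hypothetical, and the real `ADD_CONV_LAYOUT_INSTANCES` macro is deliberately not called because its signature is not shown here. Only the flag name `MIOPEN_REQ_LIBS_ONLY` and the 2D layout names come from the description.

```cmake
# Hypothetical sketch: pick which 2D layout directories a build would include.
cmake_minimum_required(VERSION 3.16)

# A real project would likely declare this with option(); a plain variable is
# used here so the file can also be run in script mode with `cmake -P`.
if(NOT DEFINED MIOPEN_REQ_LIBS_ONLY)
    set(MIOPEN_REQ_LIBS_ONLY OFF)
endif()

# Layout names taken from the description; variable names are illustrative.
set(ALL_2D_LAYOUTS nhwgc gnhwc ngchw)
set(MIOPEN_2D_LAYOUTS nhwgc)

if(MIOPEN_REQ_LIBS_ONLY)
    set(SELECTED_LAYOUTS ${MIOPEN_2D_LAYOUTS})
else()
    set(SELECTED_LAYOUTS ${ALL_2D_LAYOUTS})
endif()

# Stand-in for adding each layout's instance sources to the umbrella library.
foreach(layout IN LISTS SELECTED_LAYOUTS)
    message(STATUS "would add grouped_conv2d_fwd/${layout}")
endforeach()
```

Saved under a hypothetical name such as `select_layouts.cmake`, it can be tried with `cmake -DMIOPEN_REQ_LIBS_ONLY=ON -P select_layouts.cmake`. The `-D` definition has to come before `-P`.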
31 lines
1.2 KiB
CMake
# Copyright (c) Advanced Micro Devices, Inc., or its affiliates.
# SPDX-License-Identifier: MIT

# Helper function to conditionally link device_conv libraries only if they exist as targets.
# This is useful when device_conv libraries may be filtered out based on GPU targets,
# DTYPES, or build configuration flags.
#
# Usage:
#   target_link_device_conv_libraries_if_exist(my_test PRIVATE utility device_conv2d_nhwgc_operations ...)
#
# Only device_conv* libraries are checked with if(TARGET).
# All other libraries (utility, gtest_main, etc.) are always linked.
function(target_link_device_conv_libraries_if_exist TARGET_NAME VISIBILITY)
    set(_libs_to_link)
    foreach(lib ${ARGN})
        if(lib MATCHES "^device_conv")
            # Only add device_conv libraries if they exist
            if(TARGET ${lib})
                list(APPEND _libs_to_link ${lib})
            endif()
        else()
            # Always add non-device_conv libraries
            list(APPEND _libs_to_link ${lib})
        endif()
    endforeach()
    # Single target_link_libraries call with all libraries
    if(_libs_to_link)
        target_link_libraries(${TARGET_NAME} ${VISIBILITY} ${_libs_to_link})
    endif()
endfunction()
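
A minimal usage sketch of the helper above, written as a standalone `CMakeLists.txt`. It rests on several assumptions: the include path is hypothetical, the targets are empty `INTERFACE` stand-ins rather than real CK libraries, and `device_conv2d_gnhwc_operations` is an illustrative name for a library that has been filtered out of the build.

```cmake
cmake_minimum_required(VERSION 3.16)
project(link_if_exist_demo LANGUAGES NONE)

# Assumption: the helper function is saved under this hypothetical path.
include(${CMAKE_CURRENT_SOURCE_DIR}/cmake/TargetLinkDeviceConvIfExist.cmake)

# Stand-in targets; INTERFACE libraries need no source files.
add_library(utility INTERFACE)
add_library(device_conv2d_nhwgc_operations INTERFACE)
# device_conv2d_gnhwc_operations is deliberately NOT defined, to mimic a
# layout library that was excluded from this build configuration.

add_library(demo_consumer INTERFACE)
target_link_device_conv_libraries_if_exist(demo_consumer INTERFACE
    utility
    device_conv2d_nhwgc_operations
    device_conv2d_gnhwc_operations)

# Inspect what was actually linked.
get_target_property(linked demo_consumer INTERFACE_LINK_LIBRARIES)
message(STATUS "demo_consumer links: ${linked}")
```

Configuring this project should report only `utility` and `device_conv2d_nhwgc_operations`, since the missing `device_conv*` name is skipped by the `if(TARGET ...)` check. Names that do not start with `device_conv` are passed through unchecked, so a misspelled ordinary library still reaches `target_link_libraries` as-is.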