mirror of
https://github.com/ROCm/composable_kernel.git
synced 2026-05-14 02:02:46 +00:00
This document describes techniques for reducing C++ template instantiation
overhead in the Composable Kernel codebase, including:
- Replacing recursive templates with pack expansion (O(N) → O(1) depth)
- Using named functors instead of lambdas to share instantiations
- Replacing template recursion with constexpr loops
- Using fold expressions for accumulation operations
These techniques can significantly reduce build times for template-heavy code.
[ROCm/composable_kernel commit: b66597ed96]