mirror of
https://github.com/ROCm/composable_kernel.git
synced 2026-06-29 19:28:33 +00:00
[CK] Add rocm_ck schema engine: Signature, resolve(), ArchProperties (#7179) ## Summary A `Signature` is a directed compute graph: tensors are nodes, operators are edges. Shared names between operator outputs and inputs form the graph structure. `resolve()` walks this graph at compile time (`consteval`), inferring dtype, rank, and layout for every tensor — invalid configs become compiler errors, not runtime crashes. **Key design decisions:** - **Operators teach the system about tensors.** `GemmOp` implies rank 2 and Row/Col/Row layout. `AddOp` and `ReluOp` propagate from connected slots. The dtype cascade fills in the rest: per-tensor → signature-wide → error. - **Adding a new op is zero lines in the resolution engine** if it's structurally binary (`lhs/rhs/out`) or unary (`in/out`) — C++20 concepts handle dispatch automatically. Only ops with special semantics need explicit branches. - **TargetSet is a compile-time bitset over GPU targets.** The wave tile validation table is the single source of truth for valid instruction shapes, traced from CK Tile's WarpGemmDispatcher. FP8 tiles are available on gfx942+ via IterateK composition, not gfx950-only. **Reading order:** signature.hpp (data model) → arch_properties.hpp (TargetSet, wave tiles) → resolve.hpp (resolution engine). 3 new headers, 3 unit tests (including diamond DAG coverage), 3 compile-fail tests. Introduces tests/compile_fail/ infrastructure. **Stack**: PR 2 of 3 porting the rocm_ck constexpr schema from experimental to production. 1. Foundation types — DataType, Layout, Args, Ops (#7114) 2. **This PR** — Schema engine (graph resolution) 3. Spec factories — GemmSpec, makeSpec() (#7180 ) Note: We also removed `FmhaBwdOp` for clarity, since that was introduced early and doesn't have tests set up. **Depends on**: #7114 ## Test plan - [x] ctest --test-dir build --output-on-failure — unit tests + compile-fail tests pass - [x] Compile-fail tests correctly reject: mixed CDNA+RDNA TargetSet, conflicting layouts, empty quantization scale names --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
14 lines
379 B
C++
14 lines
379 B
C++
// Copyright (c) Advanced Micro Devices, Inc., or its affiliates.
|
|
// SPDX-License-Identifier: MIT
|
|
//
|
|
// Must fail: Layout::Auto cannot be used for stride computation.
|
|
// Expected error: "leadingDimStride requires Row or Col layout"
|
|
|
|
#include <rocm_ck/layout.hpp>
|
|
|
|
#include <array>
|
|
|
|
using namespace rocm_ck;
|
|
|
|
constexpr auto bad = leadingDimStride(Layout::Auto, std::array{1, 2});
|