[ck] Enforce ASCII-only C/C++ sources for hipRTC compatibility (#7829)
## Summary
CK source files must be compilable via **hipRTC (HIP runtime
compilation)**, whose preprocessor does not accept non-ASCII bytes
anywhere in a translation unit — **including in comments**. Bytes that
are harmless under `hipcc` (em-dashes, smart quotes, multiplication
signs, Greek letters, box-drawing glyphs, etc.) cause hipRTC to fail at
preprocessing time. These regularly leak in via LLM-assisted authoring
or copy/paste from formatted documents and silently break hipRTC paths
that are not exercised by the default `hipcc`-based build matrix.
This PR (a) cleans every existing violation (53 files) and (b) adds a
pre-checkin gate so new violations are rejected before merge.
## File extensions covered
Both the cleanup scan and the new Jenkins enforcement stage use the same
predicate:
```
*.h *.hpp *.cpp *.h.in *.hpp.in *.cpp.in *.inc *.cl
```
(excluding `*/build/*` and `*/include/rapidjson/*`). This is a strict
superset of the existing `Clang Format` stage's predicate — `*.inc` is
added so test-fixture include files are also gated. The local pre-commit
hook's `c++/inc` type filter covers the same set.
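The predicate above can be pictured as a suffix match plus two path exclusions. The following is a minimal sketch only; `is_gated_source` and `ends_with` are hypothetical helpers written for illustration, and the real stage uses `find`, not C++.

```cpp
#include <string>
#include <vector>

// Returns true when `text` ends with `suffix`.
inline bool ends_with(const std::string& text, const std::string& suffix) {
    return text.size() >= suffix.size() &&
           text.compare(text.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Hypothetical helper: approximates the find predicate described above.
inline bool is_gated_source(const std::string& path) {
    // Excluded trees: */build/* and */include/rapidjson/*.
    if (path.find("/build/") != std::string::npos ||
        path.find("/include/rapidjson/") != std::string::npos) {
        return false;
    }
    // Extensions listed in the predicate, including the added *.inc.
    static const std::vector<std::string> suffixes = {
        ".h", ".hpp", ".cpp", ".h.in", ".hpp.in", ".cpp.in", ".inc", ".cl"};
    for (const auto& suffix : suffixes) {
        if (ends_with(path, suffix)) {
            return true;
        }
    }
    return false;
}
```

Under these assumptions, `./test/data/fixture.inc` is gated while `./build/gen/foo.cpp` and `./include/rapidjson/document.h` are skipped.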
## Why no enforcement today
CK is opted out of the rocm-libraries root `.pre-commit-config.yaml`, so
the existing `pre-commit` workflow doesn't touch CK. The local CK
`.pre-commit-config.yaml` only runs for developers who installed hooks.
The **authoritative gate is therefore the new Jenkins stage** in this
PR; the local hook is a convenience.
## Commit layout (bisect-friendly)
1. `79798aa6261` — **`[ck] Convert reflect/ rendering to ASCII for
hipRTC compatibility`**
Behavior change, isolated. `TreeFormatter` swaps `├─ / └─ / │ ` for `|-
/ +- / | ` (3-col width preserved so alignment is unchanged).
`conv_description.hpp` swaps `×` for `x` as the dimension separator.
`test_conv_description.cpp` expected strings updated in lockstep so the
snapshot test stays green. This is the only commit in the series with
observable runtime impact.
2. `738fdb0d81c` — **`[ck] Strip non-ASCII bytes from C++ sources for
hipRTC compatibility`**
Mechanical text cleanup across 53 files. Replacements happen in comments
or in `std::cout` strings that are not asserted on by any test. None of
the 174 `.inc` files in the tree required edits, but they were included
in the scan so that the enforcement stage checks exactly the set of files
that was scanned. Full replacement table in the commit message.
3. `1d7cd8ba235` — **`[ck] Enforce ASCII-only C/C++ sources for hipRTC
compatibility`**
- New `projects/composablekernel/script/check_ascii_only.sh` (modeled on
`check_copyright_year.sh`).
- New entry in `projects/composablekernel/.pre-commit-config.yaml` under
the local-hooks block (`types_or: [c++, inc]`).
- New `ASCII Only Check` parallel stage in
`projects/composablekernel/Jenkinsfile`'s `Static checks` block,
mirroring the existing `Clang Format` stage but with `*.inc` added to
the find predicate. Always-on, no `RUN_CPPCHECK` gate.
The tree is buildable at every commit boundary. Commit 1 leaves 50 known
violations; commit 2 leaves 0; commit 3 wires the gate.
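To illustrate the behavior change in commit 1, here is a minimal sketch of a tree renderer that uses only ASCII connectors. `Node`, `render`, and `render_tree` are hypothetical names and this is not the actual `TreeFormatter`. The exact connector strings (`|- `, `+- `, `|  `) are an assumption based on the 3-column substitution described above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical tree node used only for this illustration.
struct Node {
    std::string label;
    std::vector<Node> children;
};

// Appends one line per descendant of `node`, using 3-column ASCII connectors.
inline void render(const Node& node, const std::string& prefix, std::string& out) {
    for (std::size_t i = 0; i < node.children.size(); ++i) {
        const bool last = (i + 1 == node.children.size());
        // "+- " marks the last child, "|- " any earlier sibling.
        out += prefix + (last ? "+- " : "|- ") + node.children[i].label + "\n";
        // Below a non-last child the vertical rule continues; below the last it does not.
        render(node.children[i], prefix + (last ? "   " : "|  "), out);
    }
}

// Renders the root label followed by its subtree.
inline std::string render_tree(const Node& root) {
    std::string out = root.label + "\n";
    render(root, "", out);
    return out;
}
```

Because every connector is three columns wide, nested labels line up the same way they would with the box-drawing glyphs.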
## Demo
Script output on a synthesized violation:
```
$ printf '// em-dash test \xe2\x80\x94 here\n' > /tmp/bad.cpp
$ projects/composablekernel/script/check_ascii_only.sh /tmp/bad.cpp
ERROR: /tmp/bad.cpp contains non-ASCII bytes:
1:// em-dash test — here
Fix: replace with ASCII (em-dash -> --, smart quotes -> ", arrows -> ->, etc.)
$ echo $?
1
```
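The check itself reduces to a byte scan: any byte above `0x7F` is non-ASCII. The sketch below shows that logic in C++ for illustration; it is not the implementation of `check_ascii_only.sh`, and `non_ascii_lines` is a hypothetical name.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns the 1-based numbers of the lines that contain at least one byte above 0x7F.
inline std::vector<std::size_t> non_ascii_lines(const std::string& text) {
    std::vector<std::size_t> hits;
    std::size_t line = 1;
    bool flagged = false;  // true once the current line has been reported
    for (unsigned char byte : text) {
        if (byte == '\n') {
            ++line;
            flagged = false;
            continue;
        }
        if (byte > 0x7F && !flagged) {
            hits.push_back(line);
            flagged = true;
        }
    }
    return hits;
}
```

A multi-byte UTF-8 character such as the em-dash in the demo (`\xe2\x80\x94`) is reported once per line, because every one of its bytes is above `0x7F`.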
Full repo scan after the cleanup commits (note the `-name '*.inc'`
clause):
```
$ cd projects/composablekernel && find . -type f \( -name '*.h' -o -name '*.hpp' -o -name '*.cpp' \
-o -name '*.h.in' -o -name '*.hpp.in' -o -name '*.cpp.in' -o -name '*.inc' -o -name '*.cl' \) \
-not -path '*/build/*' -not -path '*/include/rapidjson/*' -print0 \
| xargs -0 -P 8 -n 64 script/check_ascii_only.sh
$ echo $?
0
```
## Test plan
- [ ] Jenkins PR build: confirm new `Static checks -> ASCII Only Check`
stage runs green over the full predicate (incl. `*.inc`) and existing
`Clang Format` stage is unaffected.
- [ ] `test_conv_description` passes against the ASCII tree-formatter
output (touched in commit 1).
- [ ] Local: `pre-commit run ascii-only-checker --all-files` runs
cleanly after installing CK pre-commit hooks via
`script/install_precommit.sh`.
- [ ] Manually inject a non-ASCII byte in any `.cpp/.hpp/.inc` file,
push: confirm Jenkins fails the new stage with a clear error.
- [ ] Spot-check a representative subset of touched files under hipRTC
compilation to confirm no remaining hipRTC-blocking content (optional,
since the static byte check is a sufficient condition for hipRTC
preprocessor acceptance on this dimension).
🤖 Generated with [Claude Code](https://claude.com/claude-code)
[CK] Add rocm_ck spec factories: GemmSpec, makeSpec() (#7180)
## What this PR does
This is the third PR in the rocm_ck schema stack:
1. **#7150** — Foundation types (DataType, Layout, Args, Ops)
2. **#7163** — Schema engine (Signature, resolve(), ArchProperties)
3. **#7180 (this)** — Spec factories (GemmSpec, makeSpec())
`makeSpec()` is the bridge between user intent and kernel instantiation.
It takes a **Signature** (WHAT to compute — operator graph, dtypes,
layouts) and a **GemmAlgorithm** (HOW to compute it — tile sizes,
pipeline, partitioning) and produces a validated `GemmSpec` — a
structural type ready to use as a non-type template parameter.
The key property: **every constraint is enforced at compile time.** An
invalid GEMM configuration is a compile error, not a runtime crash or
silent corruption. The 33 compile-fail tests are the executable
specification of what's allowed.
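As a rough illustration of "invalid configuration is a compile error", the sketch below validates tile divisibility during constant evaluation. It is a simplified stand-in, not the real `makeSpec()`: `TileConfig`, `make_tile_config`, and the block/wave tile fields are hypothetical. It uses `constexpr` with a `throw` on the failure path so it compiles before C++20; it does not use the spec as a non-type template parameter, which would require C++20.

```cpp
#include <stdexcept>

// Hypothetical, simplified stand-ins; not the real rocm_ck types.
struct Dim3 {
    int m;
    int n;
    int k;
};

struct TileConfig {
    Dim3 block_tile;  // assumed: work assigned to one workgroup
    Dim3 wave_tile;   // assumed: work covered by one wave-level instruction
};

// True when every extent of `outer` is a positive multiple of `inner`.
constexpr bool divides(const Dim3& outer, const Dim3& inner) {
    return inner.m > 0 && inner.n > 0 && inner.k > 0 &&
           outer.m % inner.m == 0 && outer.n % inner.n == 0 && outer.k % inner.k == 0;
}

constexpr TileConfig make_tile_config(Dim3 block_tile, Dim3 wave_tile) {
    if (!divides(block_tile, wave_tile)) {
        // Reaching a throw during constant evaluation makes the program ill-formed.
        throw std::invalid_argument("block tile must be a multiple of the wave tile");
    }
    return TileConfig{block_tile, wave_tile};
}

// Forced constant evaluation: an indivisible tile here would fail to compile.
constexpr TileConfig kValid = make_tile_config({128, 128, 32}, {32, 32, 16});
static_assert(kValid.block_tile.m == 128, "validated at compile time");
```

Changing the block tile to `{100, 128, 32}` in `kValid` turns the build red, which is the property a compile-fail test pins down.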
## What's interesting
**Physical tensor table.** Not every tensor in a compute graph needs
device memory. The intermediate result of `C = A * B` in a fused
GEMM+Add+ReLU lives only in registers. `makeSpec()` walks the operator
chain and determines which tensors are physical (need Args slots) and
which are intermediate. The output is a fixed-layout table: `[lhs, rhs,
output, D0?, D1?, scale?]`.
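One way to picture the physical/intermediate split is a single pass over the operator chain. This is a sketch under assumptions: `OpNode` and `physical_tensors` are hypothetical, and the result is in first-use order rather than the fixed `[lhs, rhs, output, ...]` layout the real table uses.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical op record: named inputs and a single named output.
struct OpNode {
    std::vector<std::string> inputs;
    std::string output;
};

// Returns the tensors that need device memory: every input that no earlier op
// produced, followed by the output of the final op. Outputs consumed by a
// later op are treated as register-only intermediates.
inline std::vector<std::string> physical_tensors(const std::vector<OpNode>& chain) {
    std::vector<std::string> produced;
    std::vector<std::string> physical;
    auto contains = [](const std::vector<std::string>& names, const std::string& name) {
        return std::find(names.begin(), names.end(), name) != names.end();
    };
    for (const auto& op : chain) {
        for (const auto& input : op.inputs) {
            if (!contains(produced, input) && !contains(physical, input)) {
                physical.push_back(input);
            }
        }
        produced.push_back(op.output);
    }
    if (!chain.empty()) {
        physical.push_back(chain.back().output);
    }
    return physical;
}
```

For a GEMM+Add+ReLU chain written as `{A, B} -> C`, `{C, bias} -> sum`, `{sum} -> out`, this yields `A, B, bias, out`; `C` and `sum` never get a slot.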
**Epilogue composition.** Instead of a combinatorial explosion of named
patterns (GemmAdd, GemmAddRelu, GemmMulSilu, ...), the epilogue is a
composable chain of ops. `{GemmOp, AddOp, ReluOp}` produces
`epilogue_ops = {Add, Relu}` with the bias tensor automatically slotted
as D0. Two consecutive AddOps fold into a single Add with two D tensors
via CK Tile's parameter pack.
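The folding rule can be sketched as follows. The enums and `compose_epilogue` are hypothetical simplifications; the real `EpilogueOp` and the way CK Tile receives the D tensors are not shown here.

```cpp
#include <vector>

// Hypothetical, reduced op sets for illustration.
enum class Op { Gemm, Add, Relu };
enum class EpilogueKind { Add, Relu };

struct Epilogue {
    std::vector<EpilogueKind> ops;  // fused chain applied after the GEMM
    int d_tensor_count = 0;         // extra inputs slotted as D0, D1, ...
};

inline Epilogue compose_epilogue(const std::vector<Op>& chain) {
    Epilogue result;
    for (Op op : chain) {
        switch (op) {
        case Op::Gemm:
            break;  // the GEMM is the main loop, not part of the epilogue
        case Op::Add:
            ++result.d_tensor_count;  // each Add brings one more D tensor
            // Consecutive Adds fold into a single Add entry.
            if (result.ops.empty() || result.ops.back() != EpilogueKind::Add) {
                result.ops.push_back(EpilogueKind::Add);
            }
            break;
        case Op::Relu:
            result.ops.push_back(EpilogueKind::Relu);
            break;
        }
    }
    return result;
}
```

With this sketch, `{Gemm, Add, Relu}` gives `{Add, Relu}` with one D tensor, and `{Gemm, Add, Add}` gives a single `Add` with two.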
**Signature/Algorithm split.** The same Signature can pair with multiple
GemmAlgorithms to produce different tuning variants without changing the
mathematical result. This is the foundation for the dispatcher — one
operation description, many tile configurations.
## New types
| Type | Role |
|------|------|
| `GemmSpec` | Validated NTTP kernel descriptor — physical tensors, tile
geometry, epilogue chain |
| `GemmAlgorithm` | User-facing tuning input — tile sizes, pipeline,
partitioning, padding |
| `EpilogueOp` | NTTP-compatible projection of the Op variant for
epilogue chains |
| `Dim3` | M x N x K triple for tile geometry |
## Test coverage
- **69 unit tests** — happy paths, layouts, dtypes, quantization,
epilogue chains, algorithm variants
- **33 compile-fail tests** — one per constraint (tile divisibility,
INT8 rules, pipeline restrictions, etc.)
- **6 schema compatibility baselines** — frozen specs that break if the
schema changes
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
[CK] Add rocm_ck schema engine: Signature, resolve(), ArchProperties (#7179)
## Summary
A `Signature` is a directed compute graph: tensors are nodes, operators
are edges. Shared names between operator outputs and inputs form the
graph structure. `resolve()` walks this graph at compile time
(`consteval`), inferring dtype, rank, and layout for every tensor —
invalid configs become compiler errors, not runtime crashes.
**Key design decisions:**
- **Operators teach the system about tensors.** `GemmOp` implies rank 2
and Row/Col/Row layout. `AddOp` and `ReluOp` propagate from connected
slots. The dtype cascade fills in the rest: per-tensor → signature-wide
→ error.
- **Adding a new op is zero lines in the resolution engine** if it's
structurally binary (`lhs/rhs/out`) or unary (`in/out`) — C++20 concepts
handle dispatch automatically. Only ops with special semantics need
explicit branches.
- **TargetSet is a compile-time bitset over GPU targets.** The wave tile
validation table is the single source of truth for valid instruction
shapes, traced from CK Tile's WarpGemmDispatcher. FP8 tiles are
available on gfx942+ via IterateK composition, not gfx950-only.
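The dtype cascade mentioned in the first bullet (per-tensor, then signature-wide, then error) might look like the sketch below. `DataType::Unset` and `resolve_dtype` are hypothetical, and the sketch uses `constexpr` with a `throw` instead of `consteval` so that it also compiles before C++20.

```cpp
#include <stdexcept>

// Hypothetical dtype list; Unset stands for "not specified".
enum class DataType { Unset, FP16, BF16, FP32 };

// Assumed fallback order: per-tensor dtype, then the signature-wide default,
// otherwise an error (a compile error when evaluated in a constant expression).
constexpr DataType resolve_dtype(DataType per_tensor, DataType signature_wide) {
    if (per_tensor != DataType::Unset) {
        return per_tensor;
    }
    if (signature_wide != DataType::Unset) {
        return signature_wide;
    }
    throw std::logic_error("tensor has no dtype and the signature sets no default");
}

static_assert(resolve_dtype(DataType::BF16, DataType::FP16) == DataType::BF16,
              "a per-tensor dtype wins");
static_assert(resolve_dtype(DataType::Unset, DataType::FP16) == DataType::FP16,
              "otherwise the signature-wide dtype applies");
```

Writing `static_assert(resolve_dtype(DataType::Unset, DataType::Unset) == ...)` would not compile, which mirrors the "error" end of the cascade.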
**Reading order:** signature.hpp (data model) → arch_properties.hpp
(TargetSet, wave tiles) → resolve.hpp (resolution engine).
3 new headers, 3 unit tests (including diamond DAG coverage), 3
compile-fail tests. Introduces tests/compile_fail/ infrastructure.
**Stack**: PR 2 of 3 porting the rocm_ck constexpr schema from
experimental to production.
1. Foundation types: DataType, Layout, Args, Ops (#7114)
2. **This PR**: Schema engine (graph resolution)
3. Spec factories: GemmSpec, makeSpec() (#7180)
Note: We also removed `FmhaBwdOp` for clarity, since that was introduced
early and doesn't have tests set up.
**Depends on**: #7114
## Test plan
- [x] ctest --test-dir build --output-on-failure — unit tests +
compile-fail tests pass
- [x] Compile-fail tests correctly reject: mixed CDNA+RDNA TargetSet,
conflicting layouts, empty quantization scale names
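The mixed CDNA+RDNA rejection listed above can be pictured with a small compile-time bitset. This is a sketch only: the target list, the bit assignments, and `make_target_set` are assumptions, not the actual `TargetSet` from `arch_properties.hpp`.

```cpp
#include <cstdint>
#include <initializer_list>
#include <stdexcept>

// Hypothetical target list; bit positions are arbitrary.
enum class Target : std::uint32_t {
    gfx942  = 1u << 0,  // CDNA
    gfx950  = 1u << 1,  // CDNA
    gfx1100 = 1u << 2,  // RDNA
};

constexpr std::uint32_t kCdnaMask =
    static_cast<std::uint32_t>(Target::gfx942) | static_cast<std::uint32_t>(Target::gfx950);
constexpr std::uint32_t kRdnaMask = static_cast<std::uint32_t>(Target::gfx1100);

struct TargetSet {
    std::uint32_t bits;
    constexpr bool contains(Target target) const {
        return (bits & static_cast<std::uint32_t>(target)) != 0;
    }
};

// Builds a set and rejects any mix of the two families.
constexpr TargetSet make_target_set(std::initializer_list<Target> targets) {
    std::uint32_t bits = 0;
    for (Target target : targets) {
        bits |= static_cast<std::uint32_t>(target);
    }
    if ((bits & kCdnaMask) != 0 && (bits & kRdnaMask) != 0) {
        // In a constant expression this becomes a compile error.
        throw std::invalid_argument("TargetSet mixes CDNA and RDNA targets");
    }
    return TargetSet{bits};
}

constexpr TargetSet kCdna = make_target_set({Target::gfx942, Target::gfx950});
static_assert(kCdna.contains(Target::gfx950), "membership is a compile-time query");
```

Declaring a `constexpr` set from `{Target::gfx942, Target::gfx1100}` would fail to compile in this sketch, which is the shape of failure a compile-fail test asserts.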
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
[CK] upgrade CI to rocm7.13 as default compiler (#7612)
## Motivation
Upgrade the default Docker image and compiler version in CI to rocm7.13.
To pass all the checks, I also had to clean up many non-ASCII characters
in source code comments and modify a couple of tests that were affected
by new compiler logic.
## Submission Checklist
- [ ] Look over the contributing guidelines at
https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
---------
Co-authored-by: Aviral Goel <aviral.goel@amd.com>
[CK] Add rocm_ck directory structure with feature flag (#7090)
## Summary
Adds the initial rocm_ck directory structure (see #7119).
- Establishes production `rocm_ck/` directory at
`composablekernel/rocm_ck/`, peer to `tile_engine/` and `dispatcher/`
- Adds `CK_ENABLE_ROCM_CK` option (default OFF) as a CK-internal feature
flag — no superbuild or TheRock changes needed
- Creates `rocm_ck` INTERFACE library, `ck_tile_headers` target, GTest
integration with builder-style convenience targets (`smoke-rocm-ck`,
`check-rocm-ck`)
- Adds Jenkins `RUN_ROCM_CK_TESTS` parameter for CI, following the
`RUN_BUILDER_TESTS` pattern
- README explains the constexpr schema model: host-device separation via
constexpr data rather than template parameters, enabling multi-arch
distribution through kpack archives
## Test plan
- [x] `cmake -DCK_ENABLE_ROCM_CK=ON` configures without errors
- [x] `ninja check-rocm-ck` passes (4 host-only index type tests)
- [x] Default build (`CK_ENABLE_ROCM_CK=OFF`) is unaffected — no rocm_ck
targets present
- [x] Jenkins `RUN_ROCM_CK_TESTS=true` enables the flag and runs
`check-rocm-ck`
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Max Podkorytov <4273004+tenpercent@users.noreply.github.com>