composable_kernel/include/ck/utility/sequence_helper.hpp
John Afaganis 96c39b331e [rocm-libraries] ROCm/rocm-libraries#7829 (commit 13af7da)
[ck] Enforce ASCII-only C/C++ sources for hipRTC
 compatibility (#7829)

## Summary

CK source files must be compilable via **hipRTC (HIP runtime
compilation)**, whose preprocessor does not accept non-ASCII bytes
anywhere in a translation unit — **including in comments**. Bytes that
are harmless under `hipcc` (em-dashes, smart quotes, multiplication
signs, Greek letters, box-drawing glyphs, etc.) cause hipRTC to fail at
preprocessing time. These regularly leak in via LLM-assisted authoring
or copy/paste from formatted documents and silently break hipRTC paths
that are not exercised by the default `hipcc`-based build matrix.
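As a rough illustration of what "non-ASCII byte" means here, the sketch below reports the offset of the first byte whose value is above 0x7F. `first_non_ascii` is a hypothetical helper written for this description; it is not part of CK, and it is not a claim about how `check_ascii_only.sh` is implemented.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper (not part of CK): returns the offset of the first byte
// outside the 7-bit ASCII range, or std::string_view::npos if every byte is ASCII.
constexpr std::size_t first_non_ascii(std::string_view text)
{
    for(std::size_t i = 0; i < text.size(); ++i)
    {
        // Cast through unsigned char so bytes >= 0x80 are not seen as negative values.
        if(static_cast<unsigned char>(text[i]) > 0x7F)
            return i;
    }
    return std::string_view::npos;
}

// "\xe2\x80\x94" is the UTF-8 encoding of an em-dash (U+2014); it starts at offset 16.
static_assert(first_non_ascii("// em-dash test \xe2\x80\x94 here\n") == 16);
static_assert(first_non_ascii("// em-dash test -- here\n") == std::string_view::npos);
```

The check is deliberately byte-based rather than character-based: any multi-byte UTF-8 sequence starts with a byte above 0x7F, so a single comparison per byte is enough to flag it.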

This PR (a) cleans every existing violation (53 files) and (b) adds a
pre-checkin gate so new violations are rejected before merge.

## File extensions covered

Both the cleanup scan and the new Jenkins enforcement stage use the same
predicate:

```
*.h  *.hpp  *.cpp  *.h.in  *.hpp.in  *.cpp.in  *.inc  *.cl
```

(excluding `*/build/*` and `*/include/rapidjson/*`). This is a strict
superset of the existing `Clang Format` stage's predicate — `*.inc` is
added so test-fixture include files are also gated. The local pre-commit
hook's `c++/inc` type filter covers the same set.
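The predicate can be approximated in C++ as a suffix match plus two path exclusions. This is only a sketch of the rule stated above: `is_gated` and `ends_with` are hypothetical names, and substring matching on `/build/` and `/include/rapidjson/` is an assumption that mirrors `find`'s `-path` globs for paths rooted at `.`.

```cpp
#include <array>
#include <string_view>

// Hypothetical helper: true if `s` ends with `suffix`.
constexpr bool ends_with(std::string_view s, std::string_view suffix)
{
    return s.size() >= suffix.size() && s.substr(s.size() - suffix.size()) == suffix;
}

// Hypothetical sketch of the scan/enforcement predicate described above.
constexpr bool is_gated(std::string_view path)
{
    constexpr std::array<std::string_view, 8> suffixes = {
        ".h", ".hpp", ".cpp", ".h.in", ".hpp.in", ".cpp.in", ".inc", ".cl"};

    // Excluded trees: build output and the vendored rapidjson headers.
    if(path.find("/build/") != std::string_view::npos ||
       path.find("/include/rapidjson/") != std::string_view::npos)
        return false;

    for(std::string_view suffix : suffixes)
    {
        if(ends_with(path, suffix))
            return true;
    }
    return false;
}

static_assert(is_gated("./include/ck/utility/sequence_helper.hpp"));
static_assert(is_gated("./test/fixture.inc"));
static_assert(!is_gated("./build/generated.hpp"));
static_assert(!is_gated("./include/rapidjson/document.h"));
static_assert(!is_gated("./script/check_ascii_only.sh"));
```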

## Why no enforcement today

CK is opted out of the rocm-libraries root `.pre-commit-config.yaml`, so
the existing `pre-commit` workflow doesn't touch CK. The local CK
`.pre-commit-config.yaml` only runs for developers who installed hooks.
The **authoritative gate is therefore the new Jenkins stage** in this
PR; the local hook is convenience.

## Commit layout (bisect-friendly)

1. `79798aa6261` — **`[ck] Convert reflect/ rendering to ASCII for
hipRTC compatibility`**
Behavior change, isolated. `TreeFormatter` swaps `├─ / └─ / │ ` for `|-
/ +- / | ` (3-col width preserved so alignment is unchanged).
`conv_description.hpp` swaps `×` for `x` as the dimension separator.
`test_conv_description.cpp` expected strings updated in lockstep so the
snapshot test stays green. This is the only commit in the series with
observable runtime impact.

2. `738fdb0d81c` — **`[ck] Strip non-ASCII bytes from C++ sources for
hipRTC compatibility`**
Mechanical text cleanup across 53 files. Replacements happen in comments
or in `std::cout` strings that are not asserted on by any test. None of
the 174 `.inc` files in the tree required edits, but they were included
in the scan's predicate so that the enforcement stage never gates a file
type the cleanup did not scan. Full replacement table in the commit message.

3. `1d7cd8ba235` — **`[ck] Enforce ASCII-only C/C++ sources for hipRTC
compatibility`**
- New `projects/composablekernel/script/check_ascii_only.sh` (modeled on
`check_copyright_year.sh`).
- New entry in `projects/composablekernel/.pre-commit-config.yaml` under
the local-hooks block (`types_or: [c++, inc]`).
- New `ASCII Only Check` parallel stage in
`projects/composablekernel/Jenkinsfile`'s `Static checks` block,
mirroring the existing `Clang Format` stage but with `*.inc` added to
the find predicate. Always-on, no `RUN_CPPCHECK` gate.

The tree is buildable at every commit boundary. Commit 1 leaves 50 known
violations; commit 2 leaves 0; commit 3 wires the gate.
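To make the commit 1 behavior change concrete, here is a minimal sketch of rendering child entries with ASCII connectors. `render_children` is a hypothetical stand-in, not the `TreeFormatter` API, and the exact connector strings (`"|- "` and `"+- "`, three columns each) are my reading of the description above rather than something copied from the source.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a tree renderer (not CK's TreeFormatter).
// Non-last children get "|- ", the last child gets "+- "; both are 3 columns
// wide, matching the width of the box-drawing connectors they replace.
std::string render_children(const std::vector<std::string>& children)
{
    std::string out;
    for(std::size_t i = 0; i < children.size(); ++i)
    {
        const bool is_last = (i + 1 == children.size());
        out += is_last ? "+- " : "|- ";
        out += children[i];
        out += '\n';
    }
    return out;
}
```

Because each ASCII connector occupies the same number of columns as the glyph it replaces, text that follows the connector stays aligned; only the bytes change, which is why the snapshot strings in the test had to be updated in the same commit.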

## Demo

Script output on a synthesized violation:

```
$ printf '// em-dash test \xe2\x80\x94 here\n' > /tmp/bad.cpp
$ projects/composablekernel/script/check_ascii_only.sh /tmp/bad.cpp
ERROR: /tmp/bad.cpp contains non-ASCII bytes:
1:// em-dash test — here
  Fix: replace with ASCII (em-dash -> --, smart quotes -> ", arrows -> ->, etc.)
$ echo $?
1
```

Full repo scan after the cleanup commits (note the `-name '*.inc'`
clause):

```
$ cd projects/composablekernel && find . -type f \( -name '*.h' -o -name '*.hpp' -o -name '*.cpp' \
    -o -name '*.h.in' -o -name '*.hpp.in' -o -name '*.cpp.in' -o -name '*.inc' -o -name '*.cl' \) \
    -not -path '*/build/*' -not -path '*/include/rapidjson/*' -print0 \
  | xargs -0 -P 8 -n 64 script/check_ascii_only.sh
$ echo $?
0
```

## Test plan

- [ ] Jenkins PR build: confirm new `Static checks -> ASCII Only Check`
stage runs green over the full predicate (incl. `*.inc`) and existing
`Clang Format` stage is unaffected.
- [ ] `test_conv_description` passes against the ASCII tree-formatter
output (touched in commit 1).
- [ ] Local: `pre-commit run ascii-only-checker --all-files` runs
cleanly after installing CK pre-commit hooks via
`script/install_precommit.sh`.
- [ ] Manually inject a non-ASCII byte in any `.cpp/.hpp/.inc` file,
push: confirm Jenkins fails the new stage with a clear error.
- [ ] Spot-check a representative subset of touched files under hipRTC
compilation to confirm no remaining hipRTC-blocking content (optional,
since the static byte check is a sufficient condition for hipRTC
preprocessor acceptance on this dimension).

🤖 Generated with [Claude Code](https://claude.com/claude-code)
2026-06-04 15:00:17 +00:00


// Copyright (c) Advanced Micro Devices, Inc., or its affiliates.
// SPDX-License-Identifier: MIT
#pragma once
#include "ck/utility/functional4.hpp"
#include "ck/utility/tuple.hpp"
namespace ck {

template <index_t... Is>
__host__ __device__ constexpr auto make_sequence(Number<Is>...)
{
    return Sequence<Is...>{};
}

// F returns index_t
template <typename F, index_t N>
__host__ __device__ constexpr auto generate_sequence(F, Number<N>)
{
    return typename sequence_gen<N, F>::type{};
}

// F returns Number<>
template <typename F, index_t N>
__host__ __device__ constexpr auto generate_sequence_v2(F&& f, Number<N>)
{
    return unpack([&f](auto&&... xs) { return make_sequence(f(xs)...); },
                  typename arithmetic_sequence_gen<0, N, 1>::type{});
}

template <index_t... Is>
__host__ __device__ constexpr auto to_sequence(Tuple<Number<Is>...>)
{
    return Sequence<Is...>{};
}

// Functor wrapper for merge_sequences to enable reuse across call sites
struct merge_sequences_functor
{
    template <typename... Seqs>
    __host__ __device__ constexpr auto operator()(Seqs... seqs) const
    {
        return merge_sequences(seqs...);
    }
};

// Unpacks tuple of sequences and merges them into a single sequence
template <typename TupleOfSequences>
__host__ __device__ constexpr auto unpack_and_merge_sequences(TupleOfSequences tuple_of_sequences)
{
    return unpack(merge_sequences_functor{}, tuple_of_sequences);
}

// sequence_find_value - O(1) template depth constexpr search
//
// Optimization: Constexpr loop with array lookup instead of recursive template pattern
//
// Why this approach:
// - Recursive template (OLD): template instantiation for each recursion level -> O(N)
// instantiations
// Example: Finding value in Sequence<1,2,3,4,5> requires 5 recursive instantiations
//
// - Constexpr loop (NEW): Single function instantiation with an ordinary loop -> O(1) instantiation
//   Same search requires only 1 function instantiation; the loop executes at compile-time
//
// Implementation details:
// 1. Pack expansion creates constexpr array: {(Is == Target)...}
// 2. Constexpr for loop searches the array
// 3. Entire function evaluates at compile-time when used in a constant expression
//    (no runtime cost in that case)
//
// Impact:
// - Significantly reduces template instantiation depth for sequence search operations
// - Dramatically improves compilation time vs recursive template approach
// - Pattern applies to any compile-time search/lookup operation
//
// Trade-off: Uses constexpr evaluation instead of pure template metaprogramming.
// Requires C++17 (if constexpr, on top of C++14 relaxed constexpr loops) but results in
// dramatically better compile times.
//
template <index_t Target, index_t... Is>
__host__ __device__ constexpr index_t sequence_find_value(Sequence<Is...>)
{
    if constexpr(sizeof...(Is) == 0)
    {
        return -1;
    }
    else
    {
        constexpr bool matches[] = {(Is == Target)...};
        for(index_t i = 0; i < static_cast<index_t>(sizeof...(Is)); ++i)
        {
            if(matches[i])
                return i;
        }
        return -1;
    }
}

// Result type for find_in_tuple_of_sequences
template <index_t ITran, index_t IDimUp, bool Found>
struct FindTransformResult
{
    static constexpr index_t itran   = ITran;
    static constexpr index_t idim_up = IDimUp;
    static constexpr bool found      = Found;
};

// find_in_tuple_of_sequences - finds which sequence contains a target value
//
// Optimization: Pack expansion with constexpr search instead of nested static_for loops
//
// Why this approach:
// - Nested static_for (OLD): Creates lambda closure for each iteration level
// Example: Searching Tuple<Seq<0,1>, Seq<2,3>, Seq<4,5>> creates multiple applier::operator()
// instantiations. Result: Many applier instantiations for typical tensor descriptor operations.
//
// - Pack expansion + constexpr (NEW): Single function with compile-time array search
// Example: Same search creates constexpr array, single search function.
// Result: 1 function instantiation regardless of tuple size.
//
// Implementation:
// 1. Pack expansion: sequence_find_value<Target>(Seqs{})... applies search to each sequence
// 2. Results collected in constexpr array
// 3. Linear search finds first non-negative result (sequence containing target)
//
// Impact:
// - Significantly reduces applier::operator() instantiations in tensor descriptor transforms
// - O(1) template depth instead of O(N*M) for N sequences of length M
//
// Use case: Finding which dimension index contains a specific value (common in tensor reordering)
//
template <index_t Target, typename... Seqs>
struct FindInTupleOfSequencesCompute
{
    private:
    // Result struct for constexpr computation
    struct ResultData
    {
        index_t itran;
        index_t idim_up;
        bool found;
    };

    // Compute result using constexpr function with array lookup
    static constexpr ResultData compute()
    {
        if constexpr(sizeof...(Seqs) == 0)
        {
            return {0, 0, false};
        }
        else
        {
            // Pack expansion creates array - O(1) template depth
            constexpr index_t indices[] = {sequence_find_value<Target>(Seqs{})...};

            // Find first matching sequence
            for(index_t i = 0; i < static_cast<index_t>(sizeof...(Seqs)); ++i)
            {
                if(indices[i] >= 0)
                {
                    return {i, indices[i], true};
                }
            }
            return {0, 0, false};
        }
    }

    static constexpr ResultData result_ = compute();

    public:
    static constexpr index_t itran   = result_.itran;
    static constexpr index_t idim_up = result_.idim_up;
    static constexpr bool found      = result_.found;

    using type = FindTransformResult<itran, idim_up, found>;
};

// Find target value in a tuple of sequences
// Returns FindTransformResult<itran, idim_up, found>
// Uses O(1) template depth via pack expansion (no recursion)
template <index_t Target, typename... Seqs>
__host__ __device__ constexpr auto find_in_tuple_of_sequences(Tuple<Seqs...>)
{
    return typename FindInTupleOfSequencesCompute<Target, Seqs...>::type{};
}

} // namespace ck
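
The two searches above can be tried outside CK with a few stand-in types. The sketch below is a simplified host-only version: `index_t`, `Sequence`, and `Tuple` are minimal placeholders for CK's own types, `FindResult` is a hypothetical plain struct returned by value (the real header returns a `FindTransformResult<...>` type instead), and the `__host__ __device__` qualifiers are dropped. It assumes a C++17 compiler.

```cpp
#include <cstdint>

// Stand-ins for CK's types so the sketch compiles on its own (assumptions).
using index_t = std::int32_t;
template <index_t... Is>
struct Sequence
{
};
template <typename... Ts>
struct Tuple
{
};

// Position of Target within the sequence, or -1 if absent.
template <index_t Target, index_t... Is>
constexpr index_t sequence_find_value(Sequence<Is...>)
{
    if constexpr(sizeof...(Is) == 0)
    {
        return -1; // guard: a zero-length array below would be ill-formed
    }
    else
    {
        constexpr bool matches[] = {(Is == Target)...};
        for(index_t i = 0; i < static_cast<index_t>(sizeof...(Is)); ++i)
        {
            if(matches[i])
                return i;
        }
        return -1;
    }
}

// Hypothetical simplified result type for this sketch only.
struct FindResult
{
    index_t itran;   // index of the sequence that contains Target
    index_t idim_up; // position of Target inside that sequence
    bool found;
};

template <index_t Target, typename... Seqs>
constexpr FindResult find_in_tuple_of_sequences(Tuple<Seqs...>)
{
    if constexpr(sizeof...(Seqs) == 0)
    {
        return {0, 0, false};
    }
    else
    {
        // One pack expansion runs the per-sequence search for every sequence.
        constexpr index_t indices[] = {sequence_find_value<Target>(Seqs{})...};
        for(index_t i = 0; i < static_cast<index_t>(sizeof...(Seqs)); ++i)
        {
            if(indices[i] >= 0)
                return {i, indices[i], true};
        }
        return {0, 0, false};
    }
}

static_assert(sequence_find_value<3>(Sequence<1, 2, 3, 4, 5>{}) == 2);
static_assert(sequence_find_value<9>(Sequence<1, 2, 3>{}) == -1);

constexpr FindResult r =
    find_in_tuple_of_sequences<3>(Tuple<Sequence<0, 1>, Sequence<2, 3>, Sequence<4, 5>>{});
static_assert(r.found && r.itran == 1 && r.idim_up == 1);
```

The `if constexpr` guard is what keeps the empty-pack case well-formed, since the array initializer would otherwise expand to a zero-length array.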