Files
Yaswanth Raparti 652d3456ca [rocm-libraries] ROCm/rocm-libraries#5249 (commit 2a114bb)
[CK] [CK_TILE] Improve build and test time of CI with smart
 dependency parser (#5249)

## Motivation

Existing dependency parser needs full build of tests to determine which
tests are affected by code changes in a PR. This still takes 2-4 hours
for building the tests which slows down the CI as the number of tests
grow. To resolve this issue we implemented a smart dependency parser
which uses CMake Configure to parse dependencies and build only the
affected test cases. We have ensured that two approaches are available
1) CMake pre-build analysis for each PR to ensure fast build and test.
2) Ninja post-build analysis to enable full build for nightly tests.

## Technical Details

```bash
### 1. Configure the project with CMake
cmake -G Ninja -DCMAKE_EXPORT_COMPILE_COMMANDS=ON ..

### 2. Analyze dependencies (no build required!)
python3 ../script/dependency-parser/main.py cmake-parse compile_commands.json build.ninja \
  --workspace-root .. --output cmake_dependency_mapping.json --parallel 8

### 3. Find tests affected by changes
python3 ../script/dependency-parser/main.py select cmake_dependency_mapping.json origin/develop \
  HEAD --test-prefix --output tests_to_run.json

### 4. Build only affected tests
ninja $(jq -r '.executables[]' tests_to_run.json | tr '\n' ' ')

### 5. Run affected tests
ctest -R "$(jq -r '.regex' tests_to_run.json)"
```

### Jenkins Integration
- Added `buildMode` to jenkinsfile to integrate both `selective` and
`full` build methods

### Known Limitations

### 1. Build-Time Generated Headers (HIGH RISK)

**Problem:** Files generated during the build process (e.g., via
`add_custom_command`) cannot be analyzed before building.

**Example:**
```cmake
add_custom_command(
  OUTPUT ${CMAKE_BINARY_DIR}/generated/config.hpp
  COMMAND generate_config.sh
  DEPENDS template.hpp.in
)
```

**Impact:** If a source file includes `generated/config.hpp`, the
dependency won't be detected until after building.

**Mitigation:**
- CK analysis shows **no generated headers** currently used
- If generated headers are added in the future, they must be built first
- Recommendation: Generate headers in CMake configure phase (not build
phase) when possible

## Test Plan
**1. Modified Files:**
```
include/ck_tile/ops/common.hpp
include/ck_tile/ops/gemm.hpp
include/ck_tile/ops/gemm/warp/warp_gemm.hpp
```
**2. Compare tests selected between `build.ninja` and `cmake-parse`
methods**

## Test Result
- 1. The test completed in 5-6 minutes finding about 8000+ executables
that should be built.
- 2. We selected a commit 5ccc1387ea which resulted in same 7 tests with
both legacy and new methods.
-

PR | Legacy tests | Smart tests | Notes
-- | -- | -- | --
5261 | 453 | 455 | Only 2 tests (test_amdgcn_mma and
test_amdgcn_sparse_mma)
5168 | 0 | 0 | Changes in   dispatcher only. No CK tests invoked.
5249 | 0 | 0 | Changes to   dependency parser. No CK tests invoked
5260 | 0 | 0 | Changes in   dispatcher only. No CK tests invoked.
5174 | 1 | 1 | One test from FMHA   affected by this PR in both cases
5383 | 0 | 0 | Changes are only in benchmark files. Did not trigger any
tests
5445 | 1 | 1 | Changes are only to tests/ck_tile/gemm_streamk. Only
triggered one streamk test in both cases.
5454 | 3 | 3 | Both methods identified same test_grouped_conv_bwd tests
5427 | 234 | 234 | Core infrastructure header changes. Detected exactly
same tests
5388 | 85 | 85 | modifies warp-level GEMM operations (warp_gemm.hpp,
warp_gemm_dispatcher.hpp). Correctly identified all the streamK gemm
tests

## Submission Checklist

- [x ] Look over the contributing guidelines at
https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
2026-03-19 05:31:35 +00:00

23 KiB

Dependency Parser for Selective Testing

This directory contains tools for analyzing build dependencies and selecting which tests to run based on code changes. This enables faster CI pipelines by only building and running tests affected by changes.

Overview

Two approaches are available:

  1. CMake Pre-Build Analysis (NEW, RECOMMENDED) - Analyzes dependencies before building
  2. Ninja Post-Build Analysis (LEGACY) - Analyzes dependencies after a full build

Quick Start

# 1. Configure the project with CMake
cd build
cmake -G Ninja -DCMAKE_EXPORT_COMPILE_COMMANDS=ON ..

# 2. Analyze dependencies (no build required!)
python3 ../script/dependency-parser/main.py cmake-parse \
  compile_commands.json \
  build.ninja \
  --workspace-root .. \
  --output cmake_dependency_mapping.json \
  --parallel 8

# 3. Find tests affected by changes
python3 ../script/dependency-parser/main.py select \
  cmake_dependency_mapping.json \
  origin/develop \
  HEAD \
  --ctest-only \
  --output tests_to_run.json

# 4. Build only affected tests
ninja $(jq -r '.executables[]' tests_to_run.json | tr '\n' ' ')

# 5. Run affected tests
ctest -R "$(jq -r '.regex' tests_to_run.json)"

Post-Build Approach (Legacy)

# 1. Build everything first (slow!)
cd build
ninja

# 2. Analyze dependencies from build artifacts
python3 ../script/dependency-parser/main.py parse \
  build.ninja \
  --workspace-root ..

# 3-5. Same as above

Architecture

Pre-Build Dependency Analysis

┌─────────────────────────────────────────────────────────────────┐
│  cmake -G Ninja -DCMAKE_EXPORT_COMPILE_COMMANDS=ON ..           │
│  Generates: compile_commands.json (~1 min)                      │
└─────────────────────────────────────────────────────────────────┘
                            │
                            ▼
┌─────────────────────────────────────────────────────────────────┐
│  cmake_dependency_analyzer.py                                   │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │ 1. Parse compile_commands.json                            │  │
│  │ 2. For each source file:                                  │  │
│  │    - Extract compile command                              │  │
│  │    - Run: amdclang++ -MM <flags> <source>                │  │
│  │    - Parse header dependencies (preprocessing only!)      │  │
│  │ 3. Parse build.ninja for target→source mappings           │  │
│  │ 4. Build: file → test executable mapping                  │  │
│  └───────────────────────────────────────────────────────────┘  │
│  Output: cmake_dependency_mapping.json (~2 min for 8K files)   │
└─────────────────────────────────────────────────────────────────┘
                            │
                            ▼
┌─────────────────────────────────────────────────────────────────┐
│  selective_test_filter.py                                       │
│  - git diff to find changed files                               │
│  - Lookup affected tests in mapping                             │
│  Output: tests_to_run.json (~1 sec)                             │
└─────────────────────────────────────────────────────────────────┘
                            │
                            ▼
┌─────────────────────────────────────────────────────────────────┐
│  ninja <affected-targets>                                       │
│  Build ONLY affected tests (minutes instead of hours!)          │
└─────────────────────────────────────────────────────────────────┘

Key Advantages of Pre-Build Approach

Aspect Post-Build (Old) Pre-Build (New)
Build Required Yes (full build) No (configure only)
Time to Dependencies Hours (build all) ~2 minutes (8K files)
CI Speedup Only test selection Build + test selection
Accuracy Exact (post-build) Exact (same compiler)
Works with AMD clang Yes Yes ✓

Tool Reference

cmake-parse (New)

Analyzes dependencies using compile_commands.json and clang -MM preprocessing.

python3 main.py cmake-parse <compile_commands.json> <build.ninja> [options]

Options:

  • --workspace-root DIR - Workspace root for path normalization (default: .)
  • --output FILE - Output JSON file (default: cmake_dependency_mapping.json)
  • --parallel N - Number of parallel workers (default: 8)
  • --quiet - Suppress progress output

Example:

python3 main.py cmake-parse \
  build/compile_commands.json \
  build/build.ninja \
  --workspace-root /workspace/rocm-libraries/projects/composablekernel \
  --parallel 16 \
  --output deps.json

parse (Legacy)

Analyzes dependencies from built artifacts using ninja -t deps.

python3 main.py parse <build.ninja> [options]

Requires: Full build completed first

select

Selects tests to run based on changed files between git refs.

python3 main.py select <depmap.json> <ref1> <ref2> [options]

Options:

  • --ctest-only - Only include tests registered with CTest (excludes EXCLUDE_FROM_ALL targets like benchmarks)
  • --test-prefix - Only include executables starting with test_ (basic name-based filtering)
  • --all - Include all executables (not just tests)
  • --output FILE - Output JSON file (default: tests_to_run.json)
  • --build-dir DIR - Build directory for CTest lookup (optional, default: inferred from depmap path)

Example:

# Compare current branch to develop (recommended: CTest-registered tests only)
python3 main.py select deps.json origin/develop HEAD --ctest-only

# Compare current branch to develop (legacy: name-based filtering)
python3 main.py select deps.json origin/develop HEAD --test-prefix

# Compare two specific commits (include all executables)
python3 main.py select deps.json abc123 def456 --all

Filtering Options Explained:

Option Behavior Use Case
--ctest-only Uses ctest -N to get CTest-registered tests. Excludes targets marked with EXCLUDE_FROM_ALL (benchmarks, examples). Recommended - Ensures only proper tests are run in CI
--test-prefix Filters executables by name pattern (test_*). Simple string matching. Legacy option - less precise than --ctest-only
--all Includes all executables (tests, benchmarks, examples, profilers). Debugging or when you need to build everything affected

Important: --ctest-only is the recommended option for CI pipelines as it:

  • Excludes benchmarks and examples that shouldn't run in CI
  • Respects CMake's test registration (targets with add_test())
  • More precise than name-based filtering

Output Format:

{
  "executables": ["bin/test_gemm", "bin/test_conv"],
  "regex": "test_gemm|test_conv",
  "regex_chunks": ["test_gemm|test_conv"],
  "changed_files": ["include/ck/ck.hpp", "test/test_gemm.cpp"],
  "statistics": {
    "total_changed_files": 2,
    "total_affected_executables": 2,
    "num_regex_chunks": 1
  }
}

Note on regex_chunks: For large test sets (>50 tests), the single regex field may exceed CTest's regex length limit. Use the regex_chunks array instead, which splits tests into chunks of up to 50 tests per regex pattern. Each chunk can be run separately with ctest.

audit

Lists all files and their dependent executables (for debugging).

python3 main.py audit <depmap.json>

optimize

Lists affected executables for specific changed files.

python3 main.py optimize <depmap.json> <file1> <file2> ...

CI Integration

Jenkins Example

stage('Selective Test') {
    steps {
        dir('build') {
            // Configure with CMake
            sh 'cmake -G Ninja -DCMAKE_EXPORT_COMPILE_COMMANDS=ON ..'

            // Analyze dependencies (no build!)
            sh '''
                python3 ../script/dependency-parser/main.py cmake-parse \
                    compile_commands.json \
                    build.ninja \
                    --workspace-root .. \
                    --parallel 32 \
                    --output deps.json
            '''

            // Select affected tests (CTest-registered only, excludes benchmarks)
            sh '''
                python3 ../script/dependency-parser/main.py select \
                    deps.json \
                    origin/develop \
                    HEAD \
                    --ctest-only \
                    --output tests_to_run.json
            '''

            // Build only affected tests
            sh 'ninja $(jq -r ".executables[]" tests_to_run.json | tr "\\n" " ")'

            // Run affected tests (handles large test sets with regex_chunks)
            sh '''
                NUM_CHUNKS=$(jq -r ".regex_chunks | length" tests_to_run.json)
                if [ "$NUM_CHUNKS" -eq 0 ]; then
                    echo "No tests to run"
                elif [ "$NUM_CHUNKS" -eq 1 ]; then
                    # Single chunk - use simple regex
                    ctest -R "$(jq -r ".regex_chunks[0]" tests_to_run.json)" --output-on-failure
                else
                    # Multiple chunks - run separately to avoid regex length limits
                    for i in $(seq 0 $((NUM_CHUNKS - 1))); do
                        echo "Running test chunk $((i + 1))/$NUM_CHUNKS"
                        ctest -R "$(jq -r ".regex_chunks[$i]" tests_to_run.json)" --output-on-failure
                    done
                fi
            '''
        }
    }
}

GitHub Actions Example

- name: Analyze Dependencies
  run: |
    cd build
    cmake -G Ninja -DCMAKE_EXPORT_COMPILE_COMMANDS=ON ..
    python3 ../script/dependency-parser/main.py cmake-parse \
      compile_commands.json build.ninja \
      --workspace-root .. \
      --parallel $(nproc) \
      --output deps.json

- name: Select Affected Tests
  run: |
    cd build
    python3 ../script/dependency-parser/main.py select \
      deps.json \
      origin/${{ github.base_ref }} \
      HEAD \
      --ctest-only

- name: Build and Test
  run: |
    cd build
    TARGETS=$(jq -r '.executables[]' tests_to_run.json | tr '\n' ' ')
    ninja $TARGETS

    # Run tests using regex_chunks to handle large test sets
    NUM_CHUNKS=$(jq -r '.regex_chunks | length' tests_to_run.json)
    if [ "$NUM_CHUNKS" -eq 0 ]; then
      echo "No tests to run"
    elif [ "$NUM_CHUNKS" -eq 1 ]; then
      ctest -R "$(jq -r '.regex_chunks[0]' tests_to_run.json)" --output-on-failure
    else
      for i in $(seq 0 $((NUM_CHUNKS - 1))); do
        echo "Running test chunk $((i + 1))/$NUM_CHUNKS"
        ctest -R "$(jq -r ".regex_chunks[$i]" tests_to_run.json)" --output-on-failure
      done
    fi

Jenkins Integration with Safety Checks

The smart build system integrates with Jenkins CI using the ci_safety_check.sh script that determines when to use selective vs full builds:

Script: ci_safety_check.sh

Usage in Jenkinsfile:

stage('Safety Check') {
    steps {
        script {
            def buildMode = sh(
                script: 'bash script/dependency-parser/ci_safety_check.sh',
                returnStatus: true
            )
            env.USE_SMART_BUILD = (buildMode == 0) ? 'true' : 'false'
        }
    }
}

stage('Build and Test') {
    steps {
        script {
            if (env.USE_SMART_BUILD == 'true') {
                // Selective build path
                sh '''
                    python3 script/dependency-parser/main.py cmake-parse \
                        compile_commands.json build.ninja --parallel 32
                    python3 script/dependency-parser/main.py select \
                        cmake_dependency_mapping.json origin/${CHANGE_TARGET} HEAD
                    ninja $(jq -r '.executables[]' tests_to_run.json)
                    ctest -R "$(jq -r '.regex' tests_to_run.json)"
                '''
            } else {
                // Full build path
                sh 'ninja && ctest'
            }
        }
    }
}

Automatic Full Build Triggers:

  1. Nightly/Scheduled Builds - Triggered when FORCE_CI=true (set by Jenkins cron)
  2. Build System Changes - When CMakeLists.txt or cmake/*.cmake files are modified
  3. Stale Cache - When dependency cache is older than 7 days
  4. Manual Override - When DISABLE_SMART_BUILD=true is set

Environment Variables:

  • FORCE_CI - Set by Jenkins for nightly builds
  • CHANGE_TARGET - Base branch for PR builds (e.g., "develop")
  • CHANGE_ID - PR identifier (indicates PR build vs branch build)
  • BASE_BRANCH - Override base branch (default: "develop")
  • DISABLE_SMART_BUILD - Manual override to force full build

PR Build Behavior: For pull requests, the entire PR is compared against the base branch (not just incremental commits), ensuring all affected tests are identified.

Exit Codes:

  • 0 = Selective build OK (use smart build)
  • 1 = Full build required

Performance

Benchmarks on Composable Kernel (7,892 source files):

Operation Time Description
CMake configure ~30s Generate compile_commands.json
Dependency analysis ~90s 8 parallel workers, AMD clang -MM
Test selection <1s git diff + JSON lookup
Total (pre-build) ~2 min Ready to build affected tests
Full build (baseline) ~4 hours For comparison

Speedup Example:

  • Changed file: include/ck/tensor_descriptor.hpp
  • Affected tests: 47 out of 2,000 tests
  • Build time: 15 min vs 4 hours (16x faster)

Limitations and Corner Cases

Known Limitations

1. Build-Time Generated Headers (HIGH RISK)

Problem: Files generated during the build process (e.g., via add_custom_command) cannot be analyzed before building.

Example:

add_custom_command(
  OUTPUT ${CMAKE_BINARY_DIR}/generated/config.hpp
  COMMAND generate_config.sh
  DEPENDS template.hpp.in
)

Impact: If a source file includes generated/config.hpp, the dependency won't be detected until after building.

Mitigation:

  • CK analysis shows no generated headers currently used
  • If generated headers are added in the future, they must be built first
  • Recommendation: Generate headers in CMake configure phase (not build phase) when possible

Verification:

# Check if your project uses generated headers
grep -r "add_custom_command.*OUTPUT.*\.(hpp|h)" projects/composablekernel/
# Result for CK: No matches - safe!

2. Macro-Conditional Includes (LOW RISK)

Problem: Headers included based on preprocessor macros may not be detected if macro values differ between preprocessing and compilation.

Example:

#if GPU_ARCH >= 908
#include "mi100_optimizations.hpp"
#endif

Impact: If GPU_ARCH is defined differently during -MM preprocessing vs actual build, dependencies may be incomplete.

Mitigation:

  • Pre-build analysis uses the EXACT same flags from compile_commands.json
  • All -D defines are preserved during -MM preprocessing
  • Only issue would be macros defined DURING build (rare)

Status: Handled correctly by using identical compile flags

3. Environment-Dependent Includes (LOW RISK)

Problem: System paths that change between analysis and build environments.

Example:

#include <rocm/hip/hip_runtime.h>  // Depends on ROCM_PATH

Impact: If ROCm is installed in different locations, dependencies might differ.

Mitigation:

  • Pre-build analysis runs in the SAME environment as the build
  • All -I include paths are preserved from compile_commands.json
  • Dependency paths are normalized relative to workspace root

Status: Handled correctly by using identical environment

Cache Invalidation

The analyzer automatically detects when the dependency cache needs regeneration based on:

  1. Input file changes: compile_commands.json or build.ninja modified
  2. Compiler version changes: Detected via amdclang++ --version
  3. Missing cache: First run or cache deleted

Cache validation:

# Automatic validation (skips if cache valid)
python3 main.py cmake-parse compile_commands.json build.ninja

# Force regeneration
python3 main.py cmake-parse compile_commands.json build.ninja --force

Cache metadata: The output JSON includes an input_hash field:

{
  "file_to_executables": {...},
  "input_hash": "a7f3c891d2e...",  // SHA256 of inputs
  "statistics": {...}
}

When to Force Full Builds

Force a complete re-analysis and full build in these scenarios:

  1. CMake configuration changes: New targets, changed compiler flags
  2. Toolchain upgrades: Major ROCm or compiler version changes
  3. Dependency cache corruption: Manual deletion or corrupted JSON
  4. CI policy: Weekly/monthly full builds for validation

Example CI safety check:

script {
    // Force full build on main branch or schedule
    if (env.BRANCH_NAME == 'main' || env.BUILD_CAUSE == 'SCHEDULE') {
        sh 'python3 main.py cmake-parse ... --force'
        sh 'ninja'  // Full build
    } else {
        // Selective build for PRs
        sh 'python3 main.py cmake-parse ...'
        sh 'ninja $(cat affected_targets.txt)'
    }
}

Troubleshooting

compile_commands.json not generated

Ensure CMake is configured with:

cmake -DCMAKE_EXPORT_COMPILE_COMMANDS=ON ..

"No dependencies extracted"

Check that AMD clang is available:

/opt/rocm/bin/amdclang++ --version

Slow dependency extraction

Increase parallelism:

python3 main.py cmake-parse ... --parallel 32

Unicode errors (rare)

The implementation handles non-UTF8 output from AMD clang automatically. If issues persist, check stderr manually:

/opt/rocm/bin/amdclang++ -MM test.cpp 2>&1 | hexdump -C

Validation Results

Test Scenario: CK Tile Ops Header Changes

Objective: Verify smart build system correctly identifies affected tests when modifying fundamental operation headers.

Modified Files:

include/ck_tile/ops/common.hpp
include/ck_tile/ops/gemm.hpp
include/ck_tile/ops/gemm/warp/warp_gemm.hpp

Results:

$ python3 main.py cmake-parse compile_commands.json build.ninja \
    --workspace-root /workspace/rocm-libraries/projects/composablekernel \
    --parallel 32 --output deps.json

# Analysis completed in ~5-6 minutes
# - 15,853 source files analyzed
# - 398 MB output JSON generated
# - Each header affects 8,000+ executables

$ python3 main.py select deps.json HEAD~1 HEAD --test-prefix

Identified 3 files modified in project 'composablekernel'
Exported 1261 tests to run to tests_to_run.json

Selective Build Commands Generated:

# Build only affected tests (1,261 targets)
ninja -j32 test_atomic test_ck_tile_batched_gemm test_ck_tile_gemm_multi_abd_cshuffle ... (1,258 more)

# Run only affected tests
ctest --output-on-failure -R "^(test_atomic|test_ck_tile_.*|...)$"

Performance Comparison:

Metric Traditional Build Smart Build Savings
Executables Built ~12,000 (all) 1,261 (affected) 90% reduction
Tests Run ~10,000 (all) 1,261 (affected) 87% reduction
Estimated Time 4-6 hours 30-45 minutes 85% faster

Method Validation (Commit 5ccc1387ea):

Validated that new pre-build method produces identical test selection as legacy post-build method:

Commit: 5ccc1387ea - "Proof of concept for removing forward declarations (#5135)"

  • Modified files: 6 files in experimental/builder/ and include/ck/ (grouped conv bwd weight)
  • Legacy method (ninja -t deps): 7 executables selected
  • New method (clang -MM): 7 executables selected
  • Result: 100% match - Both methods selected identical executables:
    bin/ckProfiler
    bin/example_grouped_conv_bwd_weight_xdl_bf16
    bin/example_grouped_conv_bwd_weight_xdl_fp16
    bin/example_grouped_conv_bwd_weight_xdl_fp16_comp_bf8_fp8
    bin/test_grouped_convnd_bwd_weight
    bin/test_grouped_convnd_bwd_weight_dataset_xdl
    bin/test_grouped_convnd_bwd_weight_interface_xdl
    

Key Difference:

  • Legacy method requires building affected tests first (~30 min), then extracting dependencies
  • New method extracts dependencies during CMake configure (~5-6 min), no build needed
  • Total time savings: ~25 minutes per commit analysis

Bugs Fixed During Validation:

  1. Test Prefix Filter Bug: Filter checked exe.startswith("test_") but executables have bin/ prefix (e.g., bin/test_gemm). Fixed by checking "test_" in exe.

  2. Path Matching Bug: Git diff returns projects/composablekernel/include/... but depmap has include/.... Fixed by extracting project name from workspace_root.

  3. Git Path Filter Bug: Using git diff -- projects/composablekernel/ from build directory returned empty results. Fixed by removing git path filtering.

Conclusion: New smart build method validated - produces identical test selection as legacy method with significantly faster dependency analysis!

Development

Running Tests

# Unit tests
cd script/dependency-parser
python3 -m pytest tests/test_cmake_dependency_analyzer.py -v

# Integration tests (requires build/)
python3 -m pytest tests/test_integration.py -v

# All tests
python3 -m pytest tests/ -v

Test Coverage

python3 -m pytest tests/ --cov=src --cov-report=html

File Descriptions

File Description
main.py Unified CLI entry point
src/cmake_dependency_analyzer.py NEW: Pre-build dependency analyzer
src/enhanced_ninja_parser.py LEGACY: Post-build dependency parser
src/selective_test_filter.py Test selection based on git changes
tests/test_cmake_dependency_analyzer.py Unit tests (23 tests)
tests/test_integration.py Integration tests with real build (9 tests)
README_legacy.md Documentation for legacy post-build approach

References

License

MIT - See top-level LICENSE file