[rocm-libraries] ROCm/rocm-libraries#5249 (commit 2a114bb)

[CK] [CK_TILE] Improve build and test time of CI with smart
 dependency parser (#5249)

## Motivation

Existing dependency parser needs full build of tests to determine which
tests are affected by code changes in a PR. This still takes 2-4 hours
for building the tests which slows down the CI as the number of tests
grow. To resolve this issue we implemented a smart dependency parser
which uses CMake Configure to parse dependencies and build only the
affected test cases. We have ensured that two approaches are available
1) CMake pre-build analysis for each PR to ensure fast build and test.
2) Ninja post-build analysis to enable full build for nightly tests.

## Technical Details

```bash
### 1. Configure the project with CMake
cmake -G Ninja -DCMAKE_EXPORT_COMPILE_COMMANDS=ON ..

### 2. Analyze dependencies (no build required!)
python3 ../script/dependency-parser/main.py cmake-parse compile_commands.json build.ninja \
  --workspace-root .. --output cmake_dependency_mapping.json --parallel 8

### 3. Find tests affected by changes
python3 ../script/dependency-parser/main.py select cmake_dependency_mapping.json origin/develop \
  HEAD --test-prefix --output tests_to_run.json

### 4. Build only affected tests
ninja $(jq -r '.executables[]' tests_to_run.json | tr '\n' ' ')

### 5. Run affected tests
ctest -R "$(jq -r '.regex' tests_to_run.json)"
```

### Jenkins Integration
- Added `buildMode` to jenkinsfile to integrate both `selective` and
`full` build methods

### Known Limitations

### 1. Build-Time Generated Headers (HIGH RISK)

**Problem:** Files generated during the build process (e.g., via
`add_custom_command`) cannot be analyzed before building.

**Example:**
```cmake
add_custom_command(
  OUTPUT ${CMAKE_BINARY_DIR}/generated/config.hpp
  COMMAND generate_config.sh
  DEPENDS template.hpp.in
)
```

**Impact:** If a source file includes `generated/config.hpp`, the
dependency won't be detected until after building.

**Mitigation:**
- CK analysis shows **no generated headers** currently used
- If generated headers are added in the future, they must be built first
- Recommendation: Generate headers in CMake configure phase (not build
phase) when possible

## Test Plan
**1. Modified Files:**
```
include/ck_tile/ops/common.hpp
include/ck_tile/ops/gemm.hpp
include/ck_tile/ops/gemm/warp/warp_gemm.hpp
```
**2. Compare tests selected between `build.ninja` and `cmake-parse`
methods**

## Test Result
- 1. The test completed in 5-6 minutes finding about 8000+ executables
that should be built.
- 2. We selected a commit 5ccc1387ea which resulted in same 7 tests with
both legacy and new methods.
-

PR | Legacy tests | Smart tests | Notes
-- | -- | -- | --
5261 | 453 | 455 | Only 2 tests (test_amdgcn_mma and
test_amdgcn_sparse_mma)
5168 | 0 | 0 | Changes in   dispatcher only. No CK tests invoked.
5249 | 0 | 0 | Changes to   dependency parser. No CK tests invoked
5260 | 0 | 0 | Changes in   dispatcher only. No CK tests invoked.
5174 | 1 | 1 | One test from FMHA   affected by this PR in both cases
5383 | 0 | 0 | Changes are only in benchmark files. Did not trigger any
tests
5445 | 1 | 1 | Changes are only to tests/ck_tile/gemm_streamk. Only
triggered one streamk test in both cases.
5454 | 3 | 3 | Both methods identified same test_grouped_conv_bwd tests
5427 | 234 | 234 | Core infrastructure header changes. Detected exactly
same tests
5388 | 85 | 85 | modifies warp-level GEMM operations (warp_gemm.hpp,
warp_gemm_dispatcher.hpp). Correctly identified all the streamK gemm
tests

## Submission Checklist

- [x ] Look over the contributing guidelines at
https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
This commit is contained in:
Yaswanth Raparti
2026-03-19 05:31:35 +00:00
committed by assistant-librarian[bot]
parent 345a56c55e
commit 652d3456ca
13 changed files with 3585 additions and 210 deletions

109
Jenkinsfile vendored
View File

@@ -1,3 +1,29 @@
// Composable Kernel Jenkins Pipeline
//
// SMART BUILD SYSTEM:
// This pipeline uses intelligent dependency analysis to speed up PR builds while
// maintaining full validation on nightly runs.
//
// How it works:
// 1. PR Builds (Selective):
// - Configure: cmake -DCMAKE_EXPORT_COMPILE_COMMANDS=ON (~30s)
// - Analyze: Parse compile_commands.json + clang -MM for dependencies (~2min)
// - Select: git diff to find affected tests (~1s)
// - Build: ninja <affected-tests> only (minutes vs hours)
// - Test: ctest -R <affected-pattern>
//
// 2. Nightly Builds (Full):
// - FORCE_CI=true from cron triggers full build
// - All targets built and tested for validation
//
// 3. Safety Checks:
// - Forces full build if CMake configuration changes
// - Forces full build if dependency cache stale (>7 days)
// - Manual override: set DISABLE_SMART_BUILD=true
//
// Benefits: PR builds 5h → 30min (typical), nightly builds unchanged
// See: script/dependency-parser/README.md for details
//
def rocmnode(name) {
return '(rocmtest || miopen) && (' + name + ')'
}
@@ -678,13 +704,21 @@ def cmake_build(Map conf=[:]){
}
setup_cmd = conf.get(
"setup_cmd",
"""${cmake_envs} cmake -G Ninja ${setup_args} -DCMAKE_CXX_FLAGS=" -O3 " .. """
)
build_cmd = conf.get(
"build_cmd",
"${build_envs} ninja -j${nt} ${config_targets}"
"""${cmake_envs} cmake -G Ninja ${setup_args} -DCMAKE_EXPORT_COMPILE_COMMANDS=ON -DCMAKE_CXX_FLAGS=" -O3 " .. """
)
// Smart-build: Only build if running all tests or forced
// Otherwise, smart-build will determine what to build after cmake configure
if (runAllUnitTests) {
build_cmd = conf.get(
"build_cmd",
"${build_envs} ninja -j${nt} ${config_targets}"
)
} else {
// Smart-build enabled: skip full build, only run cmake configure
build_cmd = ""
}
cmd = conf.get("cmd", """
${setup_cmd}
${build_cmd}
@@ -741,25 +775,44 @@ def cmake_build(Map conf=[:]){
//run tests except when NO_CK_BUILD is set
if(!setup_args.contains("NO_CK_BUILD")){
sh "python3 ../script/ninja_json_converter.py .ninja_log --legacy-format --output ck_build_trace_${arch_name}.json"
archiveArtifacts "ck_build_trace_${arch_name}.json"
sh "python3 ../script/parse_ninja_trace.py ck_build_trace_${arch_name}.json"
if (params.NINJA_BUILD_TRACE || params.BUILD_INSTANCES_ONLY){
if (params.NINJA_FTIME_TRACE) {
echo "running ClangBuildAnalyzer"
sh "/ClangBuildAnalyzer/build/ClangBuildAnalyzer --all . clang_build.log"
sh "/ClangBuildAnalyzer/build/ClangBuildAnalyzer --analyze clang_build.log > clang_build_analysis_${arch_name}.log"
archiveArtifacts "clang_build_analysis_${arch_name}.log"
}
// do not run unit tests when building instances only
if(!params.BUILD_INSTANCES_ONLY){
if (!runAllUnitTests){
sh "../script/launch_tests.sh"
// Smart Build: Run smart_build_and_test.sh
sh """
export WORKSPACE_ROOT=${env.WORKSPACE}
export PARALLEL=32
export NINJA_JOBS=${nt}
export ARCH_NAME=${arch_name}
export PROCESS_NINJA_TRACE=true
export NINJA_FTIME_TRACE=${params.NINJA_FTIME_TRACE ? 'true' : 'false'}
bash ../script/dependency-parser/smart_build_and_test.sh
"""
// Archive artifacts if they were generated
if (fileExists("ck_build_trace_${arch_name}.json")) {
archiveArtifacts "ck_build_trace_${arch_name}.json"
}
if (fileExists("clang_build_analysis_${arch_name}.log")) {
archiveArtifacts "clang_build_analysis_${arch_name}.log"
}
}
else{
sh "ninja check"
echo "Full test suite requested (RUN_ALL_UNIT_TESTS=true or develop branch)"
sh "ninja -j${nt} check"
// Process ninja build trace after full build
sh "python3 ../script/ninja_json_converter.py .ninja_log --legacy-format --output ck_build_trace_${arch_name}.json"
archiveArtifacts "ck_build_trace_${arch_name}.json"
sh "python3 ../script/parse_ninja_trace.py ck_build_trace_${arch_name}.json"
if (params.NINJA_FTIME_TRACE) {
echo "running ClangBuildAnalyzer"
sh "/ClangBuildAnalyzer/build/ClangBuildAnalyzer --all . clang_build.log"
sh "/ClangBuildAnalyzer/build/ClangBuildAnalyzer --analyze clang_build.log > clang_build_analysis_${arch_name}.log"
archiveArtifacts "clang_build_analysis_${arch_name}.log"
}
}
if (params.RUN_BUILDER_TESTS && !setup_args.contains("-DCK_CXX_STANDARD=") && !setup_args.contains("gfx10") && !setup_args.contains("gfx11")) {
sh 'ninja check-builder'
@@ -781,12 +834,24 @@ def cmake_build(Map conf=[:]){
}
else{
// run unit tests unless building library for all targets
// Note: This else block is when NINJA_BUILD_TRACE=false and BUILD_INSTANCES_ONLY=false
// So no ninja trace processing needed here
if (!params.BUILD_INSTANCES_ONLY){
if (!runAllUnitTests){
sh "../script/launch_tests.sh"
// Smart Build: Run smart_build_and_test.sh
sh """
export WORKSPACE_ROOT=${env.WORKSPACE}
export PARALLEL=32
export NINJA_JOBS=${nt}
export ARCH_NAME=${arch_name}
export PROCESS_NINJA_TRACE=false
export NINJA_FTIME_TRACE=false
bash ../script/dependency-parser/smart_build_and_test.sh
"""
}
else{
sh "ninja check"
echo "Full test suite requested (RUN_ALL_UNIT_TESTS=true or develop branch)"
sh "ninja -j${nt} check"
}
if (params.RUN_BUILDER_TESTS && !setup_args.contains("-DCK_CXX_STANDARD=") && !setup_args.contains("gfx10") && !setup_args.contains("gfx11")) {
sh 'ninja check-builder'
@@ -1241,6 +1306,10 @@ pipeline {
name: "USE_SCCACHE",
defaultValue: true,
description: "Use the sccache for building CK (default: ON)")
booleanParam(
name: "DISABLE_SMART_BUILD",
defaultValue: false,
description: "Disable smart build system and force full build/test (default: OFF). Smart build uses pre-build dependency analysis for selective testing on PRs, full builds on nightly runs.")
booleanParam(
name: "RUN_CPPCHECK",
defaultValue: false,

View File

@@ -1,207 +1,668 @@
# Dependency-based Selective Test Filtering using Static Analysis of Ninja Builds for C++ Projects
# Dependency Parser for Selective Testing
This directory contains tools for analyzing build dependencies and selecting which tests to run based on code changes. This enables faster CI pipelines by only building and running tests affected by changes.
## Overview
This tool provides advanced dependency-based selective test filtering and build optimization for large C++ monorepos using static parsing of Ninja build files. By analyzing both source and header dependencies, it enables precise identification of which tests and executables are affected by code changes, allowing for efficient CI/CD workflows and faster incremental builds.
Two approaches are available:
The parser:
- Identifies all executables in the Ninja build.
- Maps object files to their source and header dependencies using `ninja -t deps`.
- Constructs a reverse mapping from each file to all dependent executables.
- Automatically detects monorepo structure (`projects/<name>/`) and scopes analysis accordingly.
- Exports results in CSV and JSON formats for integration with other tools.
1. **CMake Pre-Build Analysis** (NEW, RECOMMENDED) - Analyzes dependencies before building
2. **Ninja Post-Build Analysis** (LEGACY) - Analyzes dependencies after a full build
## Features
## Quick Start
- **Comprehensive Dependency Tracking**: Captures direct source file dependencies and, critically, all included header files via `ninja -t deps`.
- **Executable to Object Mapping**: Parses the `build.ninja` file to understand how executables are linked from object files.
- **Batch Dependency Extraction**: Runs a single `ninja -t deps` call (no arguments) to dump all dependency information at once, then filters in-memory. This avoids the massive overhead of per-object subprocess calls on large build files (e.g., a 246MB `build.ninja` with 29K+ objects completes in ~2 seconds instead of ~54 minutes).
- **Monorepo Awareness**: Automatically detects `projects/<project>/` paths, strips them to project-relative paths, and scopes `git diff` to only the relevant subtree.
- **File to Executable Inversion**: Inverts the dependency graph to map each file to the set of executables that depend on it.
- **Filtering**: Filters out system files (`/usr/`, `/opt/rocm/`, etc.) and focuses on project-specific dependencies.
- **Multiple Output Formats**:
- **CSV**: `enhanced_file_executable_mapping.csv` - Each row lists a file and a semicolon-separated list of dependent executables.
- **JSON**: `enhanced_dependency_mapping.json` - Includes file-to-executable mapping, executable-to-file mapping, repo metadata, and statistics.
- **Robust Error Handling**: Includes error handling for missing files and failed subprocess commands.
## Prerequisites
- **Python 3.7+**
- **Ninja build system**: The `ninja` executable must be in the system's PATH or its path provided as an argument.
- A **Ninja build directory** containing a `build.ninja` file. The project should have been built at least once (even partially) so that `ninja -t deps` has dependency data.
## Quick Start with launch_tests.sh
The easiest way to use this tool is via the `launch_tests.sh` wrapper script:
### Pre-Build Approach (Recommended)
```bash
# From the monorepo root (or anywhere):
script/launch_tests.sh /path/to/build-dir
# 1. Configure the project with CMake
cd build
cmake -G Ninja -DCMAKE_EXPORT_COMPILE_COMMANDS=ON ..
# Uses default build dir (<CK_ROOT>/build) if no argument given:
script/launch_tests.sh
# 2. Analyze dependencies (no build required!)
python3 ../script/dependency-parser/main.py cmake-parse \
compile_commands.json \
build.ninja \
--workspace-root .. \
--output cmake_dependency_mapping.json \
--parallel 8
# 3. Find tests affected by changes
python3 ../script/dependency-parser/main.py select \
cmake_dependency_mapping.json \
origin/develop \
HEAD \
--ctest-only \
--output tests_to_run.json
# 4. Build only affected tests
ninja $(jq -r '.executables[]' tests_to_run.json | tr '\n' ' ')
# 5. Run affected tests
ctest -R "$(jq -r '.regex' tests_to_run.json)"
```
This script:
1. Discovers the git root (monorepo root) automatically.
2. Runs the dependency parser against `build.ninja`.
3. Runs `git diff` between `origin/develop` and the current branch (scoped to CK files only).
4. Maps changed files to affected tests/examples.
5. Runs the affected tests via `ctest` in chunks.
Environment variables:
- `CTEST_CHUNK_SIZE`: Number of tests per ctest invocation (default: 10).
- `CTEST_FAIL_FAST`: Set to `true` to stop on first failure (default: `false`).
## Using CMake with Ninja
To use this tool effectively, your C++ project should be configured with CMake to generate Ninja build files:
1. **Configure CMake to use Ninja:**
```bash
cmake -G Ninja \
-DCMAKE_PREFIX_PATH=/opt/rocm \
-DCMAKE_CXX_COMPILER=/opt/rocm/bin/hipcc \
-DCMAKE_BUILD_TYPE=Release \
-DGPU_TARGETS="gfx942" \
/path/to/composablekernel
```
2. **Build your project (full or partial):**
```bash
# Full build
ninja
# Or build specific targets
ninja example_gemm_xdl_fp16 example_gemm_xdl_fp16_v3
```
The parser only extracts dependencies for objects that were actually built.
3. **Run the dependency parser:**
```bash
python main.py parse /path/to/build/build.ninja --workspace-root /path/to/monorepo-root
```
**Note:** `--workspace-root` should point to the **git root** (monorepo root) for correct monorepo detection. If omitted, it defaults to `..` relative to the build directory.
## Usage
All features are available via the unified `main.py` CLI:
### Post-Build Approach (Legacy)
```bash
# Dependency parsing
python main.py parse /path/to/build.ninja --workspace-root /path/to/monorepo-root
# 1. Build everything first (slow!)
cd build
ninja
# Selective test filtering (between git refs)
python main.py select enhanced_dependency_mapping.json <ref1> <ref2> [--all | --test-prefix] [--output <output_json>]
# 2. Analyze dependencies from build artifacts
python3 ../script/dependency-parser/main.py parse \
build.ninja \
--workspace-root ..
# Code auditing (list all files and their dependent executables)
python main.py audit enhanced_dependency_mapping.json
# Build optimization (list affected executables for specific changed files)
python main.py optimize enhanced_dependency_mapping.json <changed_file1> [<changed_file2> ...]
# 3-5. Same as above
```
### Parse arguments
## Architecture
| Argument | Required | Description |
|----------|----------|-------------|
| `build_ninja` | Yes | Path to the `build.ninja` file |
| `--workspace-root` | No | Root of the workspace/monorepo (default: `..`) |
| `--ninja` | No | Path to the ninja executable (default: `ninja`) |
### Pre-Build Dependency Analysis
### Select arguments
```
┌─────────────────────────────────────────────────────────────────┐
│ cmake -G Ninja -DCMAKE_EXPORT_COMPILE_COMMANDS=ON .. │
│ Generates: compile_commands.json (~1 min) │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ cmake_dependency_analyzer.py │
│ ┌───────────────────────────────────────────────────────────┐ │
│ │ 1. Parse compile_commands.json │ │
│ │ 2. For each source file: │ │
│ │ - Extract compile command │ │
│ │ - Run: amdclang++ -MM <flags> <source> │ │
│ │ - Parse header dependencies (preprocessing only!) │ │
│ │ 3. Parse build.ninja for target→source mappings │ │
│ │ 4. Build: file → test executable mapping │ │
│ └───────────────────────────────────────────────────────────┘ │
│ Output: cmake_dependency_mapping.json (~2 min for 8K files) │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ selective_test_filter.py │
│ - git diff to find changed files │
│ - Lookup affected tests in mapping │
│ Output: tests_to_run.json (~1 sec) │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ ninja <affected-targets> │
│ Build ONLY affected tests (minutes instead of hours!) │
└─────────────────────────────────────────────────────────────────┘
```
| Argument | Required | Description |
|----------|----------|-------------|
| `depmap_json` | Yes | Path to `enhanced_dependency_mapping.json` |
| `ref1` | Yes | Source git ref (branch or commit SHA) |
| `ref2` | Yes | Target git ref (branch or commit SHA) |
| `--all` | No | Include all affected executables (default) |
| `--test-prefix` | No | Only include executables starting with `test_` |
| `--output` | No | Output JSON file (default: `tests_to_run.json`) |
### Key Advantages of Pre-Build Approach
## How It Works
| Aspect | Post-Build (Old) | Pre-Build (New) |
|--------|------------------|-----------------|
| **Build Required** | Yes (full build) | No (configure only) |
| **Time to Dependencies** | Hours (build all) | ~2 minutes (8K files) |
| **CI Speedup** | Only test selection | Build + test selection |
| **Accuracy** | Exact (post-build) | Exact (same compiler) |
| **Works with AMD clang** | Yes | Yes ✓ |
1. **Build File Parsing (`_parse_build_file`)**:
* Reads the `build.ninja` file (~246MB for the full CK monorepo build).
* Uses regular expressions to identify executable link rules and object compilation rules.
* Populates `executable_to_objects` and `object_to_source` mappings.
## Tool Reference
2. **Batch Dependency Extraction (`_extract_object_dependencies`)**:
* Runs a single `ninja -t deps` command (no arguments) which dumps all dependency information for every built object file.
* Parses the output and filters to only the objects found in `object_to_source`.
* Strips the workspace root prefix from absolute paths to produce project-relative paths.
### cmake-parse (New)
3. **Monorepo Path Detection (`_build_file_to_executable_mapping`)**:
* Applies a regex to detect `projects/<project_name>/` in dependency paths.
* Strips the monorepo prefix so paths are relative to the CK project root (e.g., `include/ck/ck.hpp`).
* Records the detected project name for use by the selective test filter.
Analyzes dependencies using `compile_commands.json` and clang `-MM` preprocessing.
4. **File Filtering (`_is_project_file`)**:
* Excludes system files (`/usr/`, `/opt/rocm/`, etc.).
* Includes files in known CK directories (`include/`, `library/`, `test/`, `example/`, etc.).
* Also recognizes monorepo-prefixed paths (`projects/composablekernel/include/`, etc.).
```bash
python3 main.py cmake-parse <compile_commands.json> <build.ninja> [options]
```
5. **Selective Test Filtering (`selective_test_filter.py`)**:
* Loads the dependency mapping JSON.
* Runs `git diff --name-only` between two refs, scoped to `projects/<project>/` when in monorepo mode.
* Strips the monorepo prefix from changed file paths.
* Looks up each changed file in the dependency map to find affected executables.
* Exports the list of tests to run as JSON.
**Options:**
- `--workspace-root DIR` - Workspace root for path normalization (default: `.`)
- `--output FILE` - Output JSON file (default: `cmake_dependency_mapping.json`)
- `--parallel N` - Number of parallel workers (default: 8)
- `--quiet` - Suppress progress output
## Output Files
**Example:**
```bash
python3 main.py cmake-parse \
build/compile_commands.json \
build/build.ninja \
--workspace-root /workspace/rocm-libraries/projects/composablekernel \
--parallel 16 \
--output deps.json
```
Running the parser generates two files in the build directory:
### parse (Legacy)
- **`enhanced_file_executable_mapping.csv`**:
```csv
source_file,executables
"include/ck/ck.hpp","bin/example_gemm_xdl_fp16;bin/example_gemm_xdl_fp16_v3"
"example/01_gemm/gemm_xdl_fp16.cpp","bin/example_gemm_xdl_fp16"
```
Analyzes dependencies from built artifacts using `ninja -t deps`.
- **`enhanced_dependency_mapping.json`**:
```json
{
"repo": {
"type": "monorepo",
"project": "composablekernel"
},
"file_to_executables": {
"include/ck/ck.hpp": ["bin/example_gemm_xdl_fp16", "bin/example_gemm_xdl_fp16_v3"],
"example/01_gemm/gemm_xdl_fp16.cpp": ["bin/example_gemm_xdl_fp16"]
},
"executable_to_files": {
"bin/example_gemm_xdl_fp16": ["include/ck/ck.hpp", "example/01_gemm/gemm_xdl_fp16.cpp"]
},
"statistics": {
"total_files": 180,
"total_executables": 20403,
"total_object_files": 29530,
"files_with_multiple_executables": 140
}
```bash
python3 main.py parse <build.ninja> [options]
```
**Requires:** Full build completed first
### select
Selects tests to run based on changed files between git refs.
```bash
python3 main.py select <depmap.json> <ref1> <ref2> [options]
```
**Options:**
- `--ctest-only` - Only include tests registered with CTest (excludes EXCLUDE_FROM_ALL targets like benchmarks)
- `--test-prefix` - Only include executables starting with `test_` (basic name-based filtering)
- `--all` - Include all executables (not just tests)
- `--output FILE` - Output JSON file (default: `tests_to_run.json`)
- `--build-dir DIR` - Build directory for CTest lookup (optional, default: inferred from depmap path)
**Example:**
```bash
# Compare current branch to develop (recommended: CTest-registered tests only)
python3 main.py select deps.json origin/develop HEAD --ctest-only
# Compare current branch to develop (legacy: name-based filtering)
python3 main.py select deps.json origin/develop HEAD --test-prefix
# Compare two specific commits (include all executables)
python3 main.py select deps.json abc123 def456 --all
```
**Filtering Options Explained:**
| Option | Behavior | Use Case |
|--------|----------|----------|
| `--ctest-only` | Uses `ctest -N` to get CTest-registered tests. Excludes targets marked with `EXCLUDE_FROM_ALL` (benchmarks, examples). | **Recommended** - Ensures only proper tests are run in CI |
| `--test-prefix` | Filters executables by name pattern (`test_*`). Simple string matching. | Legacy option - less precise than `--ctest-only` |
| `--all` | Includes all executables (tests, benchmarks, examples, profilers). | Debugging or when you need to build everything affected |
**Important:** `--ctest-only` is the recommended option for CI pipelines as it:
- Excludes benchmarks and examples that shouldn't run in CI
- Respects CMake's test registration (targets with `add_test()`)
- More precise than name-based filtering
**Output Format:**
```json
{
"executables": ["bin/test_gemm", "bin/test_conv"],
"regex": "test_gemm|test_conv",
"regex_chunks": ["test_gemm|test_conv"],
"changed_files": ["include/ck/ck.hpp", "test/test_gemm.cpp"],
"statistics": {
"total_changed_files": 2,
"total_affected_executables": 2,
"num_regex_chunks": 1
}
}
```
**Note on regex_chunks:**
For large test sets (>50 tests), the single `regex` field may exceed CTest's regex length limit. Use the `regex_chunks` array instead, which splits tests into chunks of up to 50 tests per regex pattern. Each chunk can be run separately with ctest.
### audit
Lists all files and their dependent executables (for debugging).
```bash
python3 main.py audit <depmap.json>
```
### optimize
Lists affected executables for specific changed files.
```bash
python3 main.py optimize <depmap.json> <file1> <file2> ...
```
## CI Integration
### Jenkins Example
```groovy
stage('Selective Test') {
steps {
dir('build') {
// Configure with CMake
sh 'cmake -G Ninja -DCMAKE_EXPORT_COMPILE_COMMANDS=ON ..'
// Analyze dependencies (no build!)
sh '''
python3 ../script/dependency-parser/main.py cmake-parse \
compile_commands.json \
build.ninja \
--workspace-root .. \
--parallel 32 \
--output deps.json
'''
// Select affected tests (CTest-registered only, excludes benchmarks)
sh '''
python3 ../script/dependency-parser/main.py select \
deps.json \
origin/develop \
HEAD \
--ctest-only \
--output tests_to_run.json
'''
// Build only affected tests
sh 'ninja $(jq -r ".executables[]" tests_to_run.json | tr "\\n" " ")'
// Run affected tests (handles large test sets with regex_chunks)
sh '''
NUM_CHUNKS=$(jq -r ".regex_chunks | length" tests_to_run.json)
if [ "$NUM_CHUNKS" -eq 0 ]; then
echo "No tests to run"
elif [ "$NUM_CHUNKS" -eq 1 ]; then
# Single chunk - use simple regex
ctest -R "$(jq -r ".regex_chunks[0]" tests_to_run.json)" --output-on-failure
else
# Multiple chunks - run separately to avoid regex length limits
for i in $(seq 0 $((NUM_CHUNKS - 1))); do
echo "Running test chunk $((i + 1))/$NUM_CHUNKS"
ctest -R "$(jq -r ".regex_chunks[$i]" tests_to_run.json)" --output-on-failure
done
fi
'''
}
}
```
}
```
## Use Cases
### GitHub Actions Example
- **Selective CI/CD Testing**: Run only the tests affected by a PR's changes, cutting CI time dramatically.
- **Impact Analysis**: Determine which executables need to be rebuilt when a header changes.
- **Build Optimization**: Identify which targets are affected by a set of file changes.
- **Code Auditing**: Get a clear overview of how files are used across different executables.
```yaml
- name: Analyze Dependencies
run: |
cd build
cmake -G Ninja -DCMAKE_EXPORT_COMPILE_COMMANDS=ON ..
python3 ../script/dependency-parser/main.py cmake-parse \
compile_commands.json build.ninja \
--workspace-root .. \
--parallel $(nproc) \
--output deps.json
## Limitations
- name: Select Affected Tests
run: |
cd build
python3 ../script/dependency-parser/main.py select \
deps.json \
origin/${{ github.base_ref }} \
HEAD \
--ctest-only
- Relies on the accuracy of Ninja's dependency information (`ninja -t deps`). If the build system doesn't correctly generate `.d` (dependency) files, the header information might be incomplete.
- Only objects that have been **actually built** will have dependency data. A partial build means partial coverage of the dependency map.
- The definition of "project file" vs. "system file" is based on a path-based heuristic and might need adjustment for other project structures.
- name: Build and Test
run: |
cd build
TARGETS=$(jq -r '.executables[]' tests_to_run.json | tr '\n' ' ')
ninja $TARGETS
# Run tests using regex_chunks to handle large test sets
NUM_CHUNKS=$(jq -r '.regex_chunks | length' tests_to_run.json)
if [ "$NUM_CHUNKS" -eq 0 ]; then
echo "No tests to run"
elif [ "$NUM_CHUNKS" -eq 1 ]; then
ctest -R "$(jq -r '.regex_chunks[0]' tests_to_run.json)" --output-on-failure
else
for i in $(seq 0 $((NUM_CHUNKS - 1))); do
echo "Running test chunk $((i + 1))/$NUM_CHUNKS"
ctest -R "$(jq -r ".regex_chunks[$i]" tests_to_run.json)" --output-on-failure
done
fi
```
### Jenkins Integration with Safety Checks
The smart build system integrates with Jenkins CI using the `ci_safety_check.sh` script that determines when to use selective vs full builds:
**Script:** [ci_safety_check.sh](ci_safety_check.sh)
**Usage in Jenkinsfile:**
```groovy
stage('Safety Check') {
steps {
script {
def buildMode = sh(
script: 'bash script/dependency-parser/ci_safety_check.sh',
returnStatus: true
)
env.USE_SMART_BUILD = (buildMode == 0) ? 'true' : 'false'
}
}
}
stage('Build and Test') {
steps {
script {
if (env.USE_SMART_BUILD == 'true') {
// Selective build path
sh '''
python3 script/dependency-parser/main.py cmake-parse \
compile_commands.json build.ninja --parallel 32
python3 script/dependency-parser/main.py select \
cmake_dependency_mapping.json origin/${CHANGE_TARGET} HEAD
ninja $(jq -r '.executables[]' tests_to_run.json)
ctest -R "$(jq -r '.regex' tests_to_run.json)"
'''
} else {
// Full build path
sh 'ninja && ctest'
}
}
}
}
```
**Automatic Full Build Triggers:**
1. **Nightly/Scheduled Builds** - Triggered when `FORCE_CI=true` (set by Jenkins cron)
2. **Build System Changes** - When CMakeLists.txt or cmake/*.cmake files are modified
3. **Stale Cache** - When dependency cache is older than 7 days
4. **Manual Override** - When `DISABLE_SMART_BUILD=true` is set
**Environment Variables:**
- `FORCE_CI` - Set by Jenkins for nightly builds
- `CHANGE_TARGET` - Base branch for PR builds (e.g., "develop")
- `CHANGE_ID` - PR identifier (indicates PR build vs branch build)
- `BASE_BRANCH` - Override base branch (default: "develop")
- `DISABLE_SMART_BUILD` - Manual override to force full build
**PR Build Behavior:** For pull requests, the entire PR is compared against the base branch (not just incremental commits), ensuring all affected tests are identified.
**Exit Codes:**
- `0` = Selective build OK (use smart build)
- `1` = Full build required
## Performance
Benchmarks on Composable Kernel (7,892 source files):
| Operation | Time | Description |
|-----------|------|-------------|
| CMake configure | ~30s | Generate compile_commands.json |
| Dependency analysis | ~90s | 8 parallel workers, AMD clang -MM |
| Test selection | <1s | git diff + JSON lookup |
| **Total (pre-build)** | **~2 min** | Ready to build affected tests |
| Full build (baseline) | ~4 hours | For comparison |
**Speedup Example:**
- Changed file: `include/ck/tensor_descriptor.hpp`
- Affected tests: 47 out of 2,000 tests
- Build time: 15 min vs 4 hours (16x faster)
## Limitations and Corner Cases
### Known Limitations
#### 1. Build-Time Generated Headers (HIGH RISK)
**Problem:** Files generated during the build process (e.g., via `add_custom_command`) cannot be analyzed before building.
**Example:**
```cmake
add_custom_command(
OUTPUT ${CMAKE_BINARY_DIR}/generated/config.hpp
COMMAND generate_config.sh
DEPENDS template.hpp.in
)
```
**Impact:** If a source file includes `generated/config.hpp`, the dependency won't be detected until after building.
**Mitigation:**
- CK analysis shows **no generated headers** currently used
- If generated headers are added in the future, they must be built first
- Recommendation: Generate headers in CMake configure phase (not build phase) when possible
**Verification:**
```bash
# Check if your project uses generated headers
grep -r "add_custom_command.*OUTPUT.*\.(hpp|h)" projects/composablekernel/
# Result for CK: No matches - safe!
```
#### 2. Macro-Conditional Includes (LOW RISK)
**Problem:** Headers included based on preprocessor macros may not be detected if macro values differ between preprocessing and compilation.
**Example:**
```cpp
#if GPU_ARCH >= 908
#include "mi100_optimizations.hpp"
#endif
```
**Impact:** If `GPU_ARCH` is defined differently during `-MM` preprocessing vs actual build, dependencies may be incomplete.
**Mitigation:**
- Pre-build analysis uses the EXACT same flags from `compile_commands.json`
- All `-D` defines are preserved during `-MM` preprocessing
- Only issue would be macros defined DURING build (rare)
**Status:** ✅ Handled correctly by using identical compile flags
#### 3. Environment-Dependent Includes (LOW RISK)
**Problem:** System paths that change between analysis and build environments.
**Example:**
```cpp
#include <rocm/hip/hip_runtime.h> // Depends on ROCM_PATH
```
**Impact:** If ROCm is installed in different locations, dependencies might differ.
**Mitigation:**
- Pre-build analysis runs in the SAME environment as the build
- All `-I` include paths are preserved from `compile_commands.json`
- Dependency paths are normalized relative to workspace root
**Status:** ✅ Handled correctly by using identical environment
### Cache Invalidation
The analyzer automatically detects when the dependency cache needs regeneration based on:
1. **Input file changes**: `compile_commands.json` or `build.ninja` modified
2. **Compiler version changes**: Detected via `amdclang++ --version`
3. **Missing cache**: First run or cache deleted
**Cache validation:**
```bash
# Automatic validation (skips if cache valid)
python3 main.py cmake-parse compile_commands.json build.ninja
# Force regeneration
python3 main.py cmake-parse compile_commands.json build.ninja --force
```
**Cache metadata:**
The output JSON includes an `input_hash` field:
```json
{
"file_to_executables": {...},
"input_hash": "a7f3c891d2e...", // SHA256 of inputs
"statistics": {...}
}
```
### When to Force Full Builds
Force a complete re-analysis and full build in these scenarios:
1. **CMake configuration changes**: New targets, changed compiler flags
2. **Toolchain upgrades**: Major ROCm or compiler version changes
3. **Dependency cache corruption**: Manual deletion or corrupted JSON
4. **CI policy**: Weekly/monthly full builds for validation
**Example CI safety check:**
```groovy
script {
// Force full build on main branch or schedule
if (env.BRANCH_NAME == 'main' || env.BUILD_CAUSE == 'SCHEDULE') {
sh 'python3 main.py cmake-parse ... --force'
sh 'ninja' // Full build
} else {
// Selective build for PRs
sh 'python3 main.py cmake-parse ...'
sh 'ninja $(cat affected_targets.txt)'
}
}
```
## Troubleshooting
- **"ninja: command not found"**: Ensure `ninja` is installed and in your PATH, or provide the full path via `--ninja`.
- **"build.ninja not found"**: Double-check the path to your `build.ninja` file.
- **Empty or Incomplete Output**:
* Make sure the project has been successfully built at least once. `ninja -t deps` relies on information generated during the build.
* Verify that your CMake is configured to generate dependency files for Ninja (`-G Ninja`).
- **JSON shows `"type": "component"` instead of `"monorepo"`**: Ensure `--workspace-root` points to the **git/monorepo root**, not the CK project root. The parser needs to see `projects/<name>/` in the dependency paths to detect monorepo mode.
### compile_commands.json not generated
Ensure CMake is configured with:
```bash
cmake -DCMAKE_EXPORT_COMPILE_COMMANDS=ON ..
```
### "No dependencies extracted"
Check that AMD clang is available:
```bash
/opt/rocm/bin/amdclang++ --version
```
### Slow dependency extraction
Increase parallelism:
```bash
python3 main.py cmake-parse ... --parallel 32
```
### Unicode errors (rare)
The implementation handles non-UTF8 output from AMD clang automatically.
If issues persist, check stderr manually:
```bash
/opt/rocm/bin/amdclang++ -MM test.cpp 2>&1 | hexdump -C
```
## Validation Results
### Test Scenario: CK Tile Ops Header Changes
**Objective:** Verify smart build system correctly identifies affected tests when modifying fundamental operation headers.
**Modified Files:**
```
include/ck_tile/ops/common.hpp
include/ck_tile/ops/gemm.hpp
include/ck_tile/ops/gemm/warp/warp_gemm.hpp
```
**Results:**
```bash
$ python3 main.py cmake-parse compile_commands.json build.ninja \
--workspace-root /workspace/rocm-libraries/projects/composablekernel \
--parallel 32 --output deps.json
# Analysis completed in ~5-6 minutes
# - 15,853 source files analyzed
# - 398 MB output JSON generated
# - Each header affects 8,000+ executables
$ python3 main.py select deps.json HEAD~1 HEAD --test-prefix
Identified 3 files modified in project 'composablekernel'
Exported 1261 tests to run to tests_to_run.json
```
**Selective Build Commands Generated:**
```bash
# Build only affected tests (1,261 targets)
ninja -j32 test_atomic test_ck_tile_batched_gemm test_ck_tile_gemm_multi_abd_cshuffle ... (1,258 more)
# Run only affected tests
ctest --output-on-failure -R "^(test_atomic|test_ck_tile_.*|...)$"
```
**Performance Comparison:**
| Metric | Traditional Build | Smart Build | Savings |
|--------|-------------------|-------------|---------|
| Executables Built | ~12,000 (all) | 1,261 (affected) | 90% reduction |
| Tests Run | ~10,000 (all) | 1,261 (affected) | 87% reduction |
| Estimated Time | 4-6 hours | 30-45 minutes | 85% faster |
**Method Validation (Commit 5ccc1387ea):**
Validated that new pre-build method produces identical test selection as legacy post-build method:
**Commit:** 5ccc1387ea - "Proof of concept for removing forward declarations (#5135)"
- **Modified files:** 6 files in `experimental/builder/` and `include/ck/` (grouped conv bwd weight)
- **Legacy method** (`ninja -t deps`): 7 executables selected
- **New method** (`clang -MM`): 7 executables selected
- **Result:** ✅ **100% match** - Both methods selected identical executables:
```
bin/ckProfiler
bin/example_grouped_conv_bwd_weight_xdl_bf16
bin/example_grouped_conv_bwd_weight_xdl_fp16
bin/example_grouped_conv_bwd_weight_xdl_fp16_comp_bf8_fp8
bin/test_grouped_convnd_bwd_weight
bin/test_grouped_convnd_bwd_weight_dataset_xdl
bin/test_grouped_convnd_bwd_weight_interface_xdl
```
**Key Difference:**
- Legacy method requires building affected tests first (~30 min), then extracting dependencies
- New method extracts dependencies during CMake configure (~5-6 min), no build needed
- **Total time savings:** ~25 minutes per commit analysis
**Bugs Fixed During Validation:**
1. **Test Prefix Filter Bug**: Filter checked `exe.startswith("test_")` but executables have `bin/` prefix (e.g., `bin/test_gemm`). Fixed by checking `"test_" in exe`.
2. **Path Matching Bug**: Git diff returns `projects/composablekernel/include/...` but depmap has `include/...`. Fixed by extracting project name from workspace_root.
3. **Git Path Filter Bug**: Using `git diff -- projects/composablekernel/` from build directory returned empty results. Fixed by removing git path filtering.
**Conclusion:** ✅ New smart build method validated - produces identical test selection as legacy method with significantly faster dependency analysis!
## Development
### Running Tests
```bash
# Unit tests
cd script/dependency-parser
python3 -m pytest tests/test_cmake_dependency_analyzer.py -v
# Integration tests (requires build/)
python3 -m pytest tests/test_integration.py -v
# All tests
python3 -m pytest tests/ -v
```
### Test Coverage
```bash
python3 -m pytest tests/ --cov=src --cov-report=html
```
## File Descriptions
| File | Description |
|------|-------------|
| `main.py` | Unified CLI entry point |
| `src/cmake_dependency_analyzer.py` | NEW: Pre-build dependency analyzer |
| `src/enhanced_ninja_parser.py` | LEGACY: Post-build dependency parser |
| `src/selective_test_filter.py` | Test selection based on git changes |
| `tests/test_cmake_dependency_analyzer.py` | Unit tests (23 tests) |
| `tests/test_integration.py` | Integration tests with real build (9 tests) |
| `README_legacy.md` | Documentation for legacy post-build approach |
## References
- [CMake compile_commands.json](https://cmake.org/cmake/help/latest/variable/CMAKE_EXPORT_COMPILE_COMMANDS.html)
- [Clang dependency generation](https://clang.llvm.org/docs/ClangCommandLineReference.html#dependency-file-generation)
- [Ninja build system](https://ninja-build.org/)
## License
MIT - See top-level LICENSE file

View File

@@ -0,0 +1,107 @@
#!/bin/bash
# Copyright (c) Advanced Micro Devices, Inc., or its affiliates.
# SPDX-License-Identifier: MIT
# CI Safety Check for Smart Build System
#
# This script determines when to force full builds vs selective builds.
# Integrates with existing Jenkins infrastructure (FORCE_CI, BRANCH_NAME, etc.)
#
# Exit codes:
# 0 = Selective build OK (use smart build)
# 1 = Full build required
#
# Environment variables (set by Jenkins):
# FORCE_CI - Set to "true" for nightly/scheduled builds
# BRANCH_NAME - Git branch name
# CHANGE_ID - PR number (set by Jenkins Multibranch Pipeline for PRs)
# CHANGE_TARGET - Base branch for PR builds (set by Jenkins Multibranch Pipeline)
#
# Note: CHANGE_ID may not be set even for PR builds if Jenkins job is not
# configured as Multibranch Pipeline. Script uses three-dot git diff syntax
# to correctly detect PR changes regardless of CHANGE_ID availability.
#
# Manual override (set by developer/admin if needed):
# DISABLE_SMART_BUILD - Set to "true" to force full build
# BASE_BRANCH - Override base branch (default: "develop")
set -e
# Configuration
FORCE_FULL_BUILD=false
REASON=""
BASE_BRANCH="${CHANGE_TARGET:-${BASE_BRANCH:-develop}}"
# 1. Check if this is a nightly/scheduled build
# Existing Jenkins infrastructure sets FORCE_CI=true for cron-triggered builds
if [ "$FORCE_CI" = "true" ]; then
FORCE_FULL_BUILD=true
REASON="nightly/scheduled build (FORCE_CI=true from Jenkins cron)"
fi
# 2. Manual override to disable smart build
# Set DISABLE_SMART_BUILD=true in Jenkins job parameters if you want to force a full build
if [ "$DISABLE_SMART_BUILD" = "true" ]; then
FORCE_FULL_BUILD=true
REASON="manual override (DISABLE_SMART_BUILD=true)"
fi
# 3. Force full build if CMakeLists.txt or cmake/ configuration changed
# Always compare against base branch (not consecutive commits) to avoid false positives from merge commits
# Three-dot syntax (...) only shows changes actually made in the PR, not changes from merged develop branch
if [ -n "$CHANGE_ID" ]; then
# This is a PR build (CHANGE_ID set by Jenkins Multibranch Pipeline)
CHANGED_FILES=$(git diff --name-only origin/${BASE_BRANCH}...HEAD 2>/dev/null || echo "")
else
# Fallback: Works for both branch builds and PRs without CHANGE_ID
# Use three-dot syntax to avoid including merge commit changes from develop
CHANGED_FILES=$(git diff --name-only origin/${BASE_BRANCH}...HEAD 2>/dev/null || echo "")
fi
if echo "$CHANGED_FILES" | grep -qE "(CMakeLists\.txt|cmake/.*\.cmake)"; then
FORCE_FULL_BUILD=true
REASON="build system configuration changed (CMakeLists.txt or cmake/*.cmake)"
fi
# 4. Force full build if dependency cache is older than 7 days
CACHE_FILE="cmake_dependency_mapping.json"
if [ -f "$CACHE_FILE" ]; then
# Different stat command for Linux vs macOS
if [[ "$OSTYPE" == "darwin"* ]]; then
CACHE_MTIME=$(stat -f %m "$CACHE_FILE")
else
CACHE_MTIME=$(stat -c %Y "$CACHE_FILE")
fi
CURRENT_TIME=$(date +%s)
CACHE_AGE_DAYS=$(( ($CURRENT_TIME - $CACHE_MTIME) / 86400 ))
if [ $CACHE_AGE_DAYS -gt 7 ]; then
FORCE_FULL_BUILD=true
REASON="dependency cache older than 7 days"
fi
fi
# Output decision
echo "========================================="
echo "Smart Build Safety Check"
echo "========================================="
echo "FORCE_CI: ${FORCE_CI:-false}"
echo "BRANCH_NAME: ${BRANCH_NAME:-unknown}"
echo "BASE_BRANCH: ${BASE_BRANCH}"
echo "CHANGE_ID: ${CHANGE_ID:-<not a PR>}"
echo "DISABLE_SMART_BUILD: ${DISABLE_SMART_BUILD:-false}"
echo "-----------------------------------------"
if [ "$FORCE_FULL_BUILD" = true ]; then
echo "Decision: 🔴 FULL BUILD REQUIRED"
echo "Reason: $REASON"
echo "========================================="
echo "export SMART_BUILD_MODE=full" > build_mode.env
exit 1 # Exit with error to signal full build needed
else
echo "Decision: 🟢 SELECTIVE BUILD ENABLED"
echo "Using smart build for faster CI"
echo "========================================="
echo "export SMART_BUILD_MODE=selective" > build_mode.env
exit 0 # Exit success to signal selective build OK
fi

View File

@@ -0,0 +1,391 @@
#!/bin/bash
# Local Smart Build Runner for ComposableKernel
# Run smart build workflow locally without Jenkins
set -e
# Colors for output
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
RED='\033[0;31m'
NC='\033[0m' # No Color
# Default values
PARALLEL=$(nproc)
BASE_REF="HEAD~1" # Previous commit (default for local testing)
TARGET_REF="HEAD" # Current state including uncommitted changes
BUILD_DIR="../../build"
CTEST_ONLY="--ctest-only"
WORKSPACE_ROOT="../.."
print_help() {
cat << 'HELP'
Usage: local_smart_build.sh [COMMAND] [OPTIONS]
Commands:
analyze Generate dependency map (step 1)
select Select affected tests (step 2)
build Build selected tests (step 3)
test Run selected tests (step 4)
all Run complete workflow (analyze → select → build → test)
stats Show statistics about test selection
clean Clean generated files
Options:
-b, --base-ref REF Base ref to compare against (default: HEAD~1)
-t, --target-ref REF Target ref to compare (default: HEAD)
-j, --parallel NUM Parallel jobs for analysis (default: nproc)
--build-dir DIR Build directory relative to script (default: ../../build)
--no-ctest-only Include all executables (benchmarks, examples)
-h, --help Show this help
Examples:
# Test uncommitted changes vs last commit (default)
./local_smart_build.sh all
# Test current branch vs develop
./local_smart_build.sh select -b origin/develop -t HEAD
# Test specific commit range
./local_smart_build.sh all -b abc123 -t def456
# Step by step
./local_smart_build.sh analyze
./local_smart_build.sh select
./local_smart_build.sh build
./local_smart_build.sh test
# Include all executables (not just tests)
./local_smart_build.sh all --no-ctest-only
Default behavior (no options):
Compares HEAD~1 (previous commit) vs HEAD (current state + uncommitted changes)
This tests your latest changes including work-in-progress.
File locations (in build directory):
- compile_commands.json (CMake generated)
- build.ninja (CMake generated)
- enhanced_dependency_mapping.json (analyze output)
- tests_to_run.json (select output)
HELP
}
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
check_prerequisites() {
log_info "Checking prerequisites..."
if ! command -v cmake &> /dev/null; then
log_error "cmake not found. Please install CMake."
exit 1
fi
if ! command -v ninja &> /dev/null; then
log_error "ninja not found. Please install Ninja build system."
exit 1
fi
if ! command -v python3 &> /dev/null; then
log_error "python3 not found. Please install Python 3."
exit 1
fi
if ! command -v jq &> /dev/null; then
log_error "jq not found. Please install jq for JSON processing."
exit 1
fi
log_info "All prerequisites found ✓"
}
cmd_analyze() {
log_info "Step 1: Generating dependency map..."
cd "$BUILD_DIR" || exit 1
# Always reconfigure CMake to ensure fresh compile_commands.json
log_info "Running CMake configure to generate fresh compile_commands.json..."
# Use CMAKE flags similar to the dev preset and README recommendations
cmake -G Ninja \
-DCMAKE_PREFIX_PATH=/opt/rocm \
-DCMAKE_CXX_COMPILER=/opt/rocm/bin/hipcc \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_EXPORT_COMPILE_COMMANDS=ON \
-DBUILD_DEV=ON \
"$WORKSPACE_ROOT"
if [ ! -f "build.ninja" ]; then
log_error "build.ninja not found after CMake configure"
exit 1
fi
log_info "Analyzing dependencies with $PARALLEL workers (this takes ~2 minutes)..."
python3 "$WORKSPACE_ROOT/script/dependency-parser/main.py" cmake-parse \
compile_commands.json \
build.ninja \
--workspace-root "$WORKSPACE_ROOT" \
--parallel "$PARALLEL" \
--output enhanced_dependency_mapping.json
log_info "Dependency map generated: enhanced_dependency_mapping.json ✓"
# Show stats
local num_files=$(jq '.file_to_executables | length' enhanced_dependency_mapping.json)
log_info "Mapped $num_files files to executables"
}
cmd_select() {
log_info "Step 2: Selecting affected tests..."
cd "$BUILD_DIR" || exit 1
if [ ! -f "enhanced_dependency_mapping.json" ]; then
log_error "Dependency map not found. Run 'analyze' first."
exit 1
fi
log_info "Comparing $BASE_REF$TARGET_REF..."
python3 "$WORKSPACE_ROOT/script/dependency-parser/main.py" select \
enhanced_dependency_mapping.json \
"$BASE_REF" \
"$TARGET_REF" \
$CTEST_ONLY \
--output tests_to_run.json
# Show statistics
local num_files=$(jq -r '.statistics.total_changed_files' tests_to_run.json)
local num_tests=$(jq -r '.statistics.total_affected_executables' tests_to_run.json)
local num_chunks=$(jq -r '.statistics.num_regex_chunks' tests_to_run.json)
log_info "Test selection complete ✓"
echo " Changed files: $num_files"
echo " Affected tests: $num_tests"
echo " Regex chunks: $num_chunks"
if [ "$num_tests" -eq 0 ]; then
log_warn "No tests affected by your changes"
fi
}
cmd_build() {
log_info "Step 3: Building affected tests..."
cd "$BUILD_DIR" || exit 1
if [ ! -f "tests_to_run.json" ]; then
log_error "Test selection not found. Run 'select' first."
exit 1
fi
local num_tests=$(jq -r '.statistics.total_affected_executables' tests_to_run.json)
if [ "$num_tests" -eq 0 ]; then
log_warn "No tests to build"
return 0
fi
log_info "Building $num_tests test executables..."
local targets=$(jq -r '.executables[]' tests_to_run.json | tr '\n' ' ')
if [ -n "$targets" ]; then
ninja -j"$PARALLEL" $targets
log_info "Build complete ✓"
else
log_warn "No targets to build"
fi
}
cmd_test() {
log_info "Step 4: Running affected tests..."
cd "$BUILD_DIR" || exit 1
if [ ! -f "tests_to_run.json" ]; then
log_error "Test selection not found. Run 'select' first."
exit 1
fi
local num_chunks=$(jq -r '.regex_chunks | length' tests_to_run.json)
if [ "$num_chunks" -eq 0 ]; then
log_warn "No tests to run"
return 0
fi
log_info "Running tests in $num_chunks chunk(s)..."
if [ "$num_chunks" -eq 1 ]; then
# Single chunk - simple case
local regex=$(jq -r '.regex_chunks[0]' tests_to_run.json)
ctest -R "$regex" --output-on-failure
else
# Multiple chunks
for i in $(seq 0 $((num_chunks - 1))); do
log_info "Running test chunk $((i + 1))/$num_chunks"
local regex=$(jq -r ".regex_chunks[$i]" tests_to_run.json)
ctest -R "$regex" --output-on-failure
done
fi
log_info "All tests complete ✓"
}
cmd_all() {
log_info "Running complete smart build workflow..."
log_info "Testing changes: $BASE_REF$TARGET_REF"
echo ""
cmd_analyze
echo ""
cmd_select
echo ""
cmd_build
echo ""
cmd_test
echo ""
log_info "Smart build workflow complete! ✓"
}
cmd_stats() {
log_info "Smart Build Statistics"
echo ""
cd "$BUILD_DIR" || exit 1
if [ -f "enhanced_dependency_mapping.json" ]; then
echo "Dependency Map:"
local num_files=$(jq '.file_to_executables | length' enhanced_dependency_mapping.json)
echo " Total files tracked: $num_files"
# Check core.hpp as example
if jq -e '.file_to_executables["include/ck_tile/core.hpp"]' enhanced_dependency_mapping.json &> /dev/null; then
local core_deps=$(jq '.file_to_executables["include/ck_tile/core.hpp"] | length' enhanced_dependency_mapping.json)
echo " Executables depending on core.hpp: $core_deps"
fi
else
log_warn "Dependency map not found. Run 'analyze' first."
fi
echo ""
if [ -f "tests_to_run.json" ]; then
echo "Test Selection:"
jq '.statistics' tests_to_run.json
echo ""
echo "Changed files:"
jq -r '.changed_files[]' tests_to_run.json
echo ""
echo "Sample affected tests (first 10):"
jq -r '.executables[:10][]' tests_to_run.json
else
log_warn "Test selection not found. Run 'select' first."
fi
}
cmd_clean() {
log_info "Cleaning generated smart build files..."
cd "$BUILD_DIR" || exit 1
rm -f enhanced_dependency_mapping.json tests_to_run.json build_mode.env
log_info "Clean complete ✓"
}
# Parse command line arguments
COMMAND=""
while [[ $# -gt 0 ]]; do
case $1 in
analyze|select|build|test|all|stats|clean)
COMMAND="$1"
shift
;;
-b|--base-ref)
BASE_REF="$2"
shift 2
;;
-t|--target-ref)
TARGET_REF="$2"
shift 2
;;
-j|--parallel)
PARALLEL="$2"
shift 2
;;
--build-dir)
BUILD_DIR="$2"
shift 2
;;
--no-ctest-only)
CTEST_ONLY=""
shift
;;
-h|--help)
print_help
exit 0
;;
*)
log_error "Unknown option: $1"
print_help
exit 1
;;
esac
done
# Validate command
if [ -z "$COMMAND" ]; then
log_error "No command specified"
print_help
exit 1
fi
# Main execution
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
log_info "Script location: $SCRIPT_DIR"
cd "$SCRIPT_DIR" || exit 1
check_prerequisites
case "$COMMAND" in
analyze)
cmd_analyze
;;
select)
cmd_select
;;
build)
cmd_build
;;
test)
cmd_test
;;
all)
cmd_all
;;
stats)
cmd_stats
;;
clean)
cmd_clean
;;
esac

View File

@@ -6,7 +6,8 @@
Unified CLI for Ninja Dependency Analysis and Selective Testing
Features:
- Dependency parsing (from build.ninja)
- CMake pre-build dependency parsing (using compile_commands.json + clang -MM)
- Post-build dependency parsing (from build.ninja - legacy)
- Selective test filtering (between git refs)
- Code auditing (--audit)
- Build optimization (--optimize-build)
@@ -16,6 +17,13 @@ import argparse
import sys
def run_cmake_dependency_analyzer(args):
from src.cmake_dependency_analyzer import main as cmake_main
sys.argv = ["cmake_dependency_analyzer.py"] + args
cmake_main()
def run_dependency_parser(args):
from src.enhanced_ninja_parser import main as ninja_main
@@ -32,13 +40,53 @@ def run_selective_test_filter(args):
def main():
parser = argparse.ArgumentParser(
description="Unified Ninja Dependency & Selective Testing Tool"
description="Unified Dependency Analysis & Selective Testing Tool"
)
subparsers = parser.add_subparsers(dest="command", required=True)
# Dependency parsing
# CMake pre-build dependency parsing (NEW - RECOMMENDED)
parser_cmake = subparsers.add_parser(
"cmake-parse",
help="[NEW] Parse compile_commands.json for pre-build dependency analysis"
)
parser_cmake.add_argument(
"compile_commands",
help="Path to compile_commands.json"
)
parser_cmake.add_argument(
"build_ninja",
help="Path to build.ninja"
)
parser_cmake.add_argument(
"--workspace-root",
default=".",
help="Workspace root directory (default: current directory)"
)
parser_cmake.add_argument(
"--output",
default="cmake_dependency_mapping.json",
help="Output JSON file (default: cmake_dependency_mapping.json)"
)
parser_cmake.add_argument(
"--parallel",
type=int,
default=8,
help="Number of parallel workers (default: 8)"
)
parser_cmake.add_argument(
"--quiet",
action="store_true",
help="Suppress progress output"
)
parser_cmake.add_argument(
"--force",
action="store_true",
help="Force regeneration even if cache is valid"
)
# Ninja post-build dependency parsing (LEGACY)
parser_parse = subparsers.add_parser(
"parse", help="Parse build.ninja and generate dependency mapping"
"parse", help="[LEGACY] Parse build.ninja post-build (requires full build first)"
)
parser_parse.add_argument("build_ninja", help="Path to build.ninja")
parser_parse.add_argument(
@@ -63,6 +111,15 @@ def main():
action="store_true",
help="Only include executables starting with 'test_'",
)
parser_test.add_argument(
"--ctest-only",
action="store_true",
help="Only include tests registered with CTest (excludes EXCLUDE_FROM_ALL targets like benchmarks)",
)
parser_test.add_argument(
"--build-dir",
help="Build directory for ctest lookup (optional, default: inferred from depmap_json path)",
)
parser_test.add_argument(
"--output", help="Output JSON file", default="tests_to_run.json"
)
@@ -82,7 +139,17 @@ def main():
args = parser.parse_args()
if args.command == "parse":
if args.command == "cmake-parse":
cmake_args = [args.compile_commands, args.build_ninja]
cmake_args += ["--workspace-root", args.workspace_root]
cmake_args += ["--output", args.output]
cmake_args += ["--parallel", str(args.parallel)]
if args.quiet:
cmake_args.append("--quiet")
if args.force:
cmake_args.append("--force")
run_cmake_dependency_analyzer(cmake_args)
elif args.command == "parse":
parse_args = [args.build_ninja, args.ninja]
if args.workspace_root:
parse_args.append(args.workspace_root)
@@ -93,6 +160,10 @@ def main():
filter_args.append("--test-prefix")
if args.all:
filter_args.append("--all")
if args.ctest_only:
filter_args.append("--ctest-only")
if args.build_dir:
filter_args += ["--build-dir", args.build_dir]
if args.output:
filter_args += ["--output", args.output]
run_selective_test_filter(filter_args)

View File

@@ -0,0 +1,134 @@
#!/bin/bash
# Copyright (c) Advanced Micro Devices, Inc., or its affiliates.
# SPDX-License-Identifier: MIT
# Smart Build and Test Execution Script
#
# This script handles the complete smart-build workflow:
# 1. Runs smart_build_ci.sh to determine build mode and targets
# 2. Builds only affected targets (selective mode) or everything (full mode)
# 3. Runs affected tests using ctest with regex filtering
# 4. Optionally processes ninja build traces
#
# Exit codes:
# 0 = Success
# 1 = Build or test failure
#
# Environment variables:
# WORKSPACE_ROOT - Path to workspace root
# BUILD_DIR - Build directory (defaults to current directory)
# PARALLEL - Number of parallel jobs for dependency analysis (default: 32)
# NINJA_JOBS - Number of ninja parallel jobs (required)
# ARCH_NAME - Architecture name for trace files (required if PROCESS_NINJA_TRACE=true)
# PROCESS_NINJA_TRACE - Set to "true" to process ninja build traces (default: false)
# NINJA_FTIME_TRACE - Set to "true" to run ClangBuildAnalyzer (default: false)
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
BUILD_DIR="${BUILD_DIR:-$(pwd)}"
WORKSPACE_ROOT="${WORKSPACE_ROOT:-$(cd ${BUILD_DIR}/.. && pwd)}"
PARALLEL="${PARALLEL:-32}"
PROCESS_NINJA_TRACE="${PROCESS_NINJA_TRACE:-false}"
NINJA_FTIME_TRACE="${NINJA_FTIME_TRACE:-false}"
# Validate required parameters
if [ -z "$NINJA_JOBS" ]; then
echo "Error: NINJA_JOBS environment variable is required"
exit 1
fi
if [ "$PROCESS_NINJA_TRACE" = "true" ] && [ -z "$ARCH_NAME" ]; then
echo "Error: ARCH_NAME environment variable is required when PROCESS_NINJA_TRACE=true"
exit 1
fi
echo "========================================="
echo "Smart Build and Test Execution"
echo "========================================="
echo "BUILD_DIR: ${BUILD_DIR}"
echo "WORKSPACE_ROOT: ${WORKSPACE_ROOT}"
echo "NINJA_JOBS: ${NINJA_JOBS}"
echo "PROCESS_NINJA_TRACE: ${PROCESS_NINJA_TRACE}"
echo "NINJA_FTIME_TRACE: ${NINJA_FTIME_TRACE}"
echo "-----------------------------------------"
cd "${BUILD_DIR}"
# Step 1: Run smart-build CI script
echo "🚀 Using Smart Build System"
echo ""
export WORKSPACE_ROOT
export PARALLEL
if ! bash "${SCRIPT_DIR}/smart_build_ci.sh"; then
# Full build required (exit code 1 from smart_build_ci.sh)
echo "⚠ Full build mode - building and testing everything"
ninja -j${NINJA_JOBS} check
# Process ninja build trace if requested
if [ "$PROCESS_NINJA_TRACE" = "true" ]; then
echo ""
echo "Processing ninja build trace..."
python3 ../script/ninja_json_converter.py .ninja_log --legacy-format --output ck_build_trace_${ARCH_NAME}.json
python3 ../script/parse_ninja_trace.py ck_build_trace_${ARCH_NAME}.json
if [ "$NINJA_FTIME_TRACE" = "true" ]; then
echo "Running ClangBuildAnalyzer..."
/ClangBuildAnalyzer/build/ClangBuildAnalyzer --all . clang_build.log
/ClangBuildAnalyzer/build/ClangBuildAnalyzer --analyze clang_build.log > clang_build_analysis_${ARCH_NAME}.log
fi
fi
exit 0
fi
# Step 2: Selective build mode - read targets
BUILD_TARGETS=$(cat build_targets.txt)
if [ "$BUILD_TARGETS" = "none" ]; then
echo "✓ No tests affected by changes - skipping build and test execution"
exit 0
fi
# Step 3: Build only affected targets
echo "✓ Selective build - building only affected targets"
echo "Building targets: ${BUILD_TARGETS}"
ninja -j${NINJA_JOBS} ${BUILD_TARGETS}
# Process ninja build trace if requested
if [ "$PROCESS_NINJA_TRACE" = "true" ]; then
echo ""
echo "Processing ninja build trace..."
python3 ../script/ninja_json_converter.py .ninja_log --legacy-format --output ck_build_trace_${ARCH_NAME}.json
python3 ../script/parse_ninja_trace.py ck_build_trace_${ARCH_NAME}.json
if [ "$NINJA_FTIME_TRACE" = "true" ]; then
echo "Running ClangBuildAnalyzer..."
/ClangBuildAnalyzer/build/ClangBuildAnalyzer --all . clang_build.log
/ClangBuildAnalyzer/build/ClangBuildAnalyzer --analyze clang_build.log > clang_build_analysis_${ARCH_NAME}.log
fi
fi
# Step 4: Run affected tests using regex_chunks
echo ""
echo "Running affected tests..."
NUM_CHUNKS=$(jq -r '.regex_chunks | length' tests_to_run.json)
echo "Running ${NUM_CHUNKS} test chunk(s)"
if [ "$NUM_CHUNKS" -eq 1 ]; then
TEST_REGEX=$(jq -r '.regex_chunks[0]' tests_to_run.json)
CTEST_PARALLEL_LEVEL=4 ctest --output-on-failure -R "${TEST_REGEX}"
else
for ((i=0; i<NUM_CHUNKS; i++)); do
TEST_REGEX=$(jq -r ".regex_chunks[$i]" tests_to_run.json)
echo "Running test chunk $((i+1))/${NUM_CHUNKS}"
CTEST_PARALLEL_LEVEL=4 ctest --output-on-failure -R "${TEST_REGEX}"
done
fi
echo ""
echo "✓ Smart build and test execution complete"
exit 0

View File

@@ -0,0 +1,136 @@
#!/bin/bash
# Copyright (c) Advanced Micro Devices, Inc., or its affiliates.
# SPDX-License-Identifier: MIT
# Smart Build CI Script
#
# This script orchestrates the smart-build process:
# 1. Runs ci_safety_check.sh to determine if selective build is safe
# 2. Generates dependency map using cmake-parse
# 3. Selects affected tests
# 4. Outputs build targets to a file for Jenkins to consume
#
# Exit codes:
# 0 = Success (selective build targets generated)
# 1 = Full build required (run ninja check)
#
# Output files:
# tests_to_run.json - Selected tests and executables
# build_targets.txt - Space-separated list of ninja targets to build
# build_mode.env - Environment variables (SMART_BUILD_MODE=selective|full)
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
BUILD_DIR="${BUILD_DIR:-$(pwd)}"
WORKSPACE_ROOT="${WORKSPACE_ROOT:-$(cd ${BUILD_DIR}/.. && pwd)}"
PARALLEL="${PARALLEL:-32}"
BASE_BRANCH="${BASE_BRANCH:-develop}"
echo "========================================="
echo "Smart Build CI"
echo "========================================="
echo "BUILD_DIR: ${BUILD_DIR}"
echo "WORKSPACE_ROOT: ${WORKSPACE_ROOT}"
echo "BASE_BRANCH: ${BASE_BRANCH}"
echo "PARALLEL: ${PARALLEL}"
echo "-----------------------------------------"
# Step 1: Run CI safety check
echo "Step 1: Running CI safety check..."
cd "${BUILD_DIR}"
if ! bash "${SCRIPT_DIR}/ci_safety_check.sh"; then
echo "CI safety check failed - full build required"
echo "full" > build_targets.txt
exit 1
fi
echo "✓ CI safety check passed - selective build enabled"
# Step 2: Generate dependency map
echo ""
echo "Step 2: Generating dependency map..."
if [ ! -f "compile_commands.json" ]; then
echo "Error: compile_commands.json not found in ${BUILD_DIR}"
echo "Make sure cmake configure has been run with -DCMAKE_EXPORT_COMPILE_COMMANDS=ON"
exit 1
fi
if [ ! -f "build.ninja" ]; then
echo "Error: build.ninja not found in ${BUILD_DIR}"
echo "Make sure cmake configure has been run with -G Ninja"
exit 1
fi
python3 "${SCRIPT_DIR}/main.py" cmake-parse \
compile_commands.json \
build.ninja \
--workspace-root "${WORKSPACE_ROOT}" \
--parallel ${PARALLEL} \
--output enhanced_dependency_mapping.json
if [ ! -f "enhanced_dependency_mapping.json" ]; then
echo "Error: Failed to generate enhanced_dependency_mapping.json"
exit 1
fi
echo "✓ Dependency map generated"
# Step 3: Select affected tests
echo ""
echo "Step 3: Selecting affected tests..."
python3 "${SCRIPT_DIR}/main.py" select \
enhanced_dependency_mapping.json \
origin/${BASE_BRANCH} \
HEAD \
--ctest-only \
--output tests_to_run.json
if [ ! -f "tests_to_run.json" ]; then
echo "Error: Failed to generate tests_to_run.json"
exit 1
fi
# Step 4: Check if any tests were selected
num_tests=$(jq -r '.tests_to_run | length' tests_to_run.json 2>/dev/null || echo "0")
echo "✓ Selected ${num_tests} tests"
if [ "${num_tests}" -eq 0 ]; then
echo ""
echo "========================================="
echo "Result: No tests affected by changes"
echo "========================================="
echo "none" > build_targets.txt
exit 0
fi
# Step 5: Extract build targets (executables)
echo ""
echo "Step 4: Extracting build targets..."
jq -r '.executables[]' tests_to_run.json | tr '\n' ' ' > build_targets.txt
num_targets=$(jq -r '.executables | length' tests_to_run.json)
echo "✓ Generated ${num_targets} build targets"
# Display summary
echo ""
echo "========================================="
echo "Smart Build Summary"
echo "========================================="
echo "Tests to run: ${num_tests}"
echo "Build targets: ${num_targets}"
echo "Output files:"
echo " - tests_to_run.json (test selection)"
echo " - build_targets.txt (ninja targets)"
echo " - build_mode.env (SMART_BUILD_MODE=selective)"
echo "========================================="
# Show first few targets for verification
echo ""
echo "Sample build targets (first 5):"
head -1 build_targets.txt | tr ' ' '\n' | head -5
echo ""
echo "✓ Smart build preparation complete"
exit 0

View File

@@ -0,0 +1,745 @@
#!/usr/bin/env python3
# Copyright (c) Advanced Micro Devices, Inc., or its affiliates.
# SPDX-License-Identifier: MIT
"""
CMake Dependency Analyzer
Pre-build dependency analysis using compile_commands.json and clang -MM.
This approach extracts header dependencies without requiring a full build,
enabling selective test building in CI pipelines.
Key Features:
- Parses compile_commands.json generated by CMake at configure time
- Uses clang/amdclang -MM to extract header dependencies (preprocessing only)
- Parses build.ninja for target -> source mappings
- Outputs dependency_mapping.json compatible with selective_test_filter.py
"""
import hashlib
import json
import os
import re
import shlex
import subprocess
import sys
import tempfile
from collections import defaultdict
from concurrent.futures import ProcessPoolExecutor, as_completed
from pathlib import Path
from typing import Dict, List, Optional, Set, Tuple
class CompileCommandsParser:
"""Parses compile_commands.json generated by CMake."""
def __init__(self, compile_commands_path: str):
"""Initialize parser with path to compile_commands.json.
Args:
compile_commands_path: Path to compile_commands.json file
"""
self.compile_commands_path = compile_commands_path
def parse(self, extensions: Optional[List[str]] = None) -> List[Dict]:
"""Parse compile_commands.json and return list of compile commands.
Args:
extensions: Optional list of file extensions to filter by (e.g., ['.cpp', '.cc'])
Returns:
List of compile command dictionaries with 'file', 'directory', and 'command' keys
Raises:
FileNotFoundError: If compile_commands.json doesn't exist
json.JSONDecodeError: If file contains invalid JSON
"""
if not os.path.exists(self.compile_commands_path):
raise FileNotFoundError(f"compile_commands.json not found: {self.compile_commands_path}")
with open(self.compile_commands_path, "r") as f:
commands = json.load(f)
# Normalize commands to always have 'command' key (not 'arguments')
normalized = []
for cmd in commands:
# Handle 'arguments' format (convert to 'command' string)
if "arguments" in cmd and "command" not in cmd:
cmd["command"] = " ".join(shlex.quote(arg) for arg in cmd["arguments"])
# Filter by extension if specified
if extensions:
file_ext = os.path.splitext(cmd["file"])[1]
if file_ext not in extensions:
continue
normalized.append(cmd)
return normalized
class DependencyExtractor:
"""Extracts header dependencies using clang -MM."""
def __init__(self, parallel_workers: int = 1, timeout: int = 30):
"""Initialize dependency extractor.
Args:
parallel_workers: Number of parallel workers for extraction
timeout: Timeout in seconds for each clang -MM call
"""
self.parallel_workers = parallel_workers
self.timeout = timeout
self._temp_dir = None
def convert_to_dependency_command(self, compile_command: str, deps_output_file: str) -> List[str]:
"""Convert a compile command to a dependency extraction command.
Replaces -c with -MM and removes -o output specification.
Args:
compile_command: Original compile command string
deps_output_file: Path to write dependency output
Returns:
Modified command as a list of arguments for dependency extraction
"""
parts = shlex.split(compile_command)
new_parts = []
skip_next = False
for i, part in enumerate(parts):
if skip_next:
skip_next = False
continue
# Skip -c (compile flag)
if part == "-c":
continue
# Skip -o and its argument (output file)
if part == "-o":
skip_next = True
continue
# Skip standalone .o files that might appear
if part.endswith(".o") and not part.startswith("-"):
continue
new_parts.append(part)
# Insert -MM and -MF flags after the compiler
if new_parts:
compiler = new_parts[0]
rest = new_parts[1:]
new_parts = [compiler, "-MM", "-MF", deps_output_file] + rest
return new_parts
def parse_makefile_deps(self, deps_content: str) -> List[str]:
"""Parse makefile-style dependency output from clang -MM.
Args:
deps_content: Content of .d file generated by clang -MM
Returns:
List of dependency file paths (excluding the target .o file)
"""
if not deps_content.strip():
return []
# Join continuation lines and split on whitespace
content = deps_content.replace("\\\n", " ").replace("\\\r\n", " ")
# Find the colon separating target from dependencies
colon_pos = content.find(":")
if colon_pos == -1:
return []
# Everything after the colon is dependencies
deps_part = content[colon_pos + 1:]
# Split on whitespace and filter empty strings
deps = [d.strip() for d in deps_part.split() if d.strip()]
return deps
def _get_deps_file(self, source_file: str) -> str:
"""Get a temporary file path for dependency output.
Args:
source_file: Source file being analyzed
Returns:
Path to temporary .d file
"""
if self._temp_dir is None:
self._temp_dir = tempfile.mkdtemp(prefix="ck_deps_")
basename = os.path.basename(source_file)
return os.path.join(self._temp_dir, f"{basename}.d")
def extract(self, directory: str, compile_command: str, source_file: str) -> List[str]:
"""Extract dependencies for a single source file.
Args:
directory: Working directory for compilation
compile_command: Original compile command
source_file: Source file to analyze
Returns:
List of dependency file paths, or empty list on error
"""
deps_file = self._get_deps_file(source_file)
try:
dep_command = self.convert_to_dependency_command(compile_command, deps_file)
# Run the dependency extraction command
# Note: Use errors='replace' to handle non-UTF8 output from AMD clang
result = subprocess.run(
dep_command,
cwd=directory,
capture_output=True,
text=True,
errors='replace',
timeout=self.timeout,
)
if result.returncode != 0:
return []
# Parse the generated .d file
if os.path.exists(deps_file):
with open(deps_file, "r", errors='replace') as f:
deps_content = f.read()
return self.parse_makefile_deps(deps_content)
return []
except subprocess.TimeoutExpired:
return []
except Exception:
return []
finally:
# Clean up temp file
if os.path.exists(deps_file):
try:
os.unlink(deps_file)
except OSError:
pass
def extract_batch(
self, commands: List[Dict], progress_callback=None
) -> Dict[str, List[str]]:
"""Extract dependencies for multiple source files.
Args:
commands: List of compile command dictionaries
progress_callback: Optional callback(current, total) for progress reporting
Returns:
Dictionary mapping source files to their dependencies
"""
source_to_deps = {}
total = len(commands)
if self.parallel_workers <= 1:
# Serial execution
for i, cmd in enumerate(commands):
deps = self.extract(cmd["directory"], cmd["command"], cmd["file"])
source_to_deps[cmd["file"]] = deps
if progress_callback:
progress_callback(i + 1, total)
else:
# Parallel execution
with ProcessPoolExecutor(max_workers=self.parallel_workers) as executor:
futures = {
executor.submit(
self.extract, cmd["directory"], cmd["command"], cmd["file"]
): cmd["file"]
for cmd in commands
}
completed = 0
for future in as_completed(futures):
source_file = futures[future]
try:
deps = future.result()
source_to_deps[source_file] = deps
except Exception:
source_to_deps[source_file] = []
completed += 1
if progress_callback:
progress_callback(completed, total)
return source_to_deps
class NinjaTargetParser:
"""Parses ninja build files to get target mappings."""
def __init__(self, ninja_file_path: str):
"""Initialize parser with path to build.ninja.
Args:
ninja_file_path: Path to build.ninja file
"""
self.ninja_file_path = ninja_file_path
def parse_executable_mappings(self) -> Dict[str, List[str]]:
"""Parse executable -> object file mappings from build.ninja.
Returns:
Dictionary mapping executable paths to lists of object files
"""
if not os.path.exists(self.ninja_file_path):
return {}
exe_to_objects = {}
# Pattern to match executable build rules
# Example: build bin/test_gemm: CXX_EXECUTABLE_LINKER__test_gemm test.o lib.o | deps
exe_pattern = re.compile(r"^build\s+(bin/[^:]+):\s+\S+\s+([^|]+)")
with open(self.ninja_file_path, "r") as f:
for line in f:
match = exe_pattern.match(line)
if match:
exe = match.group(1)
deps_part = match.group(2).strip()
# Extract object files (ending in .o, not starting with /)
object_files = []
for dep in deps_part.split():
if dep.endswith(".o") and not dep.startswith("/"):
object_files.append(dep)
if object_files:
exe_to_objects[exe] = object_files
return exe_to_objects
def parse_object_to_source(self) -> Dict[str, str]:
"""Parse object -> source file mappings from build.ninja.
Returns:
Dictionary mapping object file paths to source file paths
"""
if not os.path.exists(self.ninja_file_path):
return {}
obj_to_source = {}
# Pattern to match object compilation rules
# Example: build test/test.cpp.o: CXX_COMPILER__target /src/test.cpp
obj_pattern = re.compile(r"^build\s+([^:]+\.(?:cpp|cc|cu|hip)\.o):\s+\S+\s+(\S+)")
with open(self.ninja_file_path, "r") as f:
for line in f:
match = obj_pattern.match(line)
if match:
obj_file = match.group(1)
source_file = match.group(2)
obj_to_source[obj_file] = source_file
return obj_to_source
class DependencyMapper:
"""Builds file -> executable dependency mappings."""
def __init__(self, workspace_root: Optional[str] = None):
"""Initialize dependency mapper.
Args:
workspace_root: Root directory of the workspace for path normalization
"""
self.workspace_root = workspace_root
if workspace_root:
self.workspace_root = os.path.abspath(workspace_root).rstrip("/") + "/"
def normalize_path(self, path: str) -> str:
"""Normalize a file path relative to workspace root.
Args:
path: File path to normalize
Returns:
Normalized relative path
"""
if self.workspace_root and path.startswith(self.workspace_root):
return path[len(self.workspace_root):]
return path
def is_project_file(self, file_path: str) -> bool:
"""Check if a file is part of the project (not a system file).
Args:
file_path: File path to check
Returns:
True if file is a project file, False if system file
"""
# Exclude system files
system_prefixes = ["/usr/", "/opt/rocm", "/lib/", "/system/", "/local/"]
if any(file_path.startswith(prefix) for prefix in system_prefixes):
return False
# Project directory prefixes
project_dirs = [
"include/",
"library/",
"test/",
"example/",
"src/",
"profiler/",
"build/include/",
"build/_deps/gtest",
"client_example",
"codegen",
"tile_engine",
"dispatcher",
"experimental",
"tutorial",
]
if any(file_path.startswith(prefix) for prefix in project_dirs):
return True
# Also check monorepo-style paths
if any(
file_path.startswith(f"projects/composablekernel/{prefix}")
for prefix in project_dirs
):
return True
# Include files with common source/header extensions
if file_path.endswith(
(".cpp", ".hpp", ".h", ".c", ".cc", ".cxx", ".cu", ".hip", ".inc")
):
return True
return False
def build_mapping(
self,
exe_to_objects: Dict[str, List[str]],
obj_to_source: Dict[str, str],
source_to_deps: Dict[str, List[str]],
) -> Dict[str, Set[str]]:
"""Build file -> executable mapping from component mappings.
Args:
exe_to_objects: Executable -> object files mapping
obj_to_source: Object file -> source file mapping
source_to_deps: Source file -> dependency files mapping
Returns:
Dictionary mapping file paths to sets of executables
"""
file_to_exes: Dict[str, Set[str]] = defaultdict(set)
for exe, object_files in exe_to_objects.items():
for obj_file in object_files:
source_file = obj_to_source.get(obj_file)
if not source_file:
continue
deps = source_to_deps.get(source_file, [])
for dep_file in deps:
# Normalize and filter
normalized = self.normalize_path(dep_file)
if self.is_project_file(normalized):
file_to_exes[normalized].add(exe)
return dict(file_to_exes)
class CMakeDependencyAnalyzer:
"""Main analyzer class combining all components."""
def __init__(
self,
compile_commands_path: Optional[str],
ninja_path: Optional[str],
workspace_root: str,
parallel_workers: int = 8,
):
"""Initialize the analyzer.
Args:
compile_commands_path: Path to compile_commands.json
ninja_path: Path to build.ninja
workspace_root: Root directory of the workspace
parallel_workers: Number of parallel workers for dependency extraction
"""
self.compile_commands_path = compile_commands_path
self.ninja_path = ninja_path
self.workspace_root = workspace_root
self.parallel_workers = parallel_workers
# Results
self.file_to_executables: Dict[str, Set[str]] = {}
self.executable_to_files: Dict[str, Set[str]] = {}
def calculate_input_hash(self) -> str:
"""Calculate hash of input files to detect when cache should be invalidated.
Returns:
SHA256 hash string representing the current state of input files
"""
hasher = hashlib.sha256()
# Hash compile_commands.json modification time and size
if self.compile_commands_path and os.path.exists(self.compile_commands_path):
stat = os.stat(self.compile_commands_path)
hasher.update(f"{stat.st_mtime}:{stat.st_size}".encode())
# Hash build.ninja modification time and size
if self.ninja_path and os.path.exists(self.ninja_path):
stat = os.stat(self.ninja_path)
hasher.update(f"{stat.st_mtime}:{stat.st_size}".encode())
# Hash compiler version (first compiler found in compile_commands.json)
if self.compile_commands_path and os.path.exists(self.compile_commands_path):
try:
with open(self.compile_commands_path, "r") as f:
commands = json.load(f)
if commands:
# Extract first compiler command
cmd = commands[0].get("command", "")
if cmd:
compiler = shlex.split(cmd)[0] if cmd else ""
if os.path.exists(compiler):
# Get compiler version
result = subprocess.run(
[compiler, "--version"],
capture_output=True,
text=True,
timeout=5,
)
hasher.update(result.stdout.encode())
except (json.JSONDecodeError, subprocess.TimeoutExpired, Exception):
pass
return hasher.hexdigest()
def should_regenerate_cache(self, cache_file: str) -> bool:
"""Check if dependency cache needs to be regenerated.
Args:
cache_file: Path to the cached dependency mapping JSON
Returns:
True if cache should be regenerated, False if cache is valid
"""
if not os.path.exists(cache_file):
return True
try:
# Load cached metadata
with open(cache_file, "r") as f:
data = json.load(f)
cached_hash = data.get("input_hash")
if not cached_hash:
return True
# Calculate current hash and compare
current_hash = self.calculate_input_hash()
return current_hash != cached_hash
except (json.JSONDecodeError, KeyError):
# Corrupted cache or old format
return True
def analyze(self, progress_callback=None):
"""Run the full dependency analysis.
Args:
progress_callback: Optional callback(phase, current, total) for progress
Raises:
ValueError: If compile_commands_path or ninja_path is None
"""
# Validate required paths
if self.compile_commands_path is None:
raise ValueError("compile_commands_path is required for analysis but was None")
if self.ninja_path is None:
raise ValueError("ninja_path is required for analysis but was None")
# Phase 1: Parse compile commands
if progress_callback:
progress_callback("parsing_compile_commands", 0, 1)
cc_parser = CompileCommandsParser(self.compile_commands_path)
commands = cc_parser.parse(extensions=[".cpp", ".cc", ".cu", ".hip"])
if progress_callback:
progress_callback("parsing_compile_commands", 1, 1)
# Phase 2: Extract dependencies
extractor = DependencyExtractor(parallel_workers=self.parallel_workers)
def dep_progress(current, total):
if progress_callback:
progress_callback("extracting_dependencies", current, total)
source_to_deps = extractor.extract_batch(commands, progress_callback=dep_progress)
# Phase 3: Parse ninja target mappings
if progress_callback:
progress_callback("parsing_ninja", 0, 1)
ninja_parser = NinjaTargetParser(self.ninja_path)
exe_to_objects = ninja_parser.parse_executable_mappings()
obj_to_source = ninja_parser.parse_object_to_source()
if progress_callback:
progress_callback("parsing_ninja", 1, 1)
# Phase 4: Build dependency mapping
if progress_callback:
progress_callback("building_mapping", 0, 1)
mapper = DependencyMapper(workspace_root=self.workspace_root)
self.file_to_executables = mapper.build_mapping(
exe_to_objects, obj_to_source, source_to_deps
)
# Build reverse mapping
self.executable_to_files = defaultdict(set)
for file_path, exes in self.file_to_executables.items():
for exe in exes:
self.executable_to_files[exe].add(file_path)
self.executable_to_files = dict(self.executable_to_files)
if progress_callback:
progress_callback("building_mapping", 1, 1)
def calculate_statistics(self) -> Dict:
"""Calculate statistics about the dependency mapping.
Returns:
Dictionary with statistics
"""
return {
"total_files": len(self.file_to_executables),
"total_executables": len(self.executable_to_files),
"files_with_multiple_executables": sum(
1 for exes in self.file_to_executables.values() if len(exes) > 1
),
}
def export_to_json(self, output_path: str):
"""Export dependency mapping to JSON file.
The output format is compatible with selective_test_filter.py.
Args:
output_path: Path to write JSON output
"""
# Convert sets to sorted lists for JSON serialization
data = {
"file_to_executables": {
f: sorted(exes) for f, exes in self.file_to_executables.items()
},
"executable_to_files": {
exe: sorted(files) for exe, files in self.executable_to_files.items()
},
"statistics": self.calculate_statistics(),
"repo": {
"type": "cmake_prebuild",
"workspace_root": self.workspace_root,
},
"input_hash": self.calculate_input_hash(),
}
with open(output_path, "w") as f:
json.dump(data, f, indent=2)
def main():
"""CLI entry point."""
import argparse
parser = argparse.ArgumentParser(
description="CMake-based dependency analyzer for pre-build test selection"
)
parser.add_argument(
"compile_commands",
help="Path to compile_commands.json",
)
parser.add_argument(
"build_ninja",
help="Path to build.ninja",
)
parser.add_argument(
"--workspace-root",
default=".",
help="Workspace root directory (default: current directory)",
)
parser.add_argument(
"--output",
default="cmake_dependency_mapping.json",
help="Output JSON file (default: cmake_dependency_mapping.json)",
)
parser.add_argument(
"--parallel",
type=int,
default=8,
help="Number of parallel workers (default: 8)",
)
parser.add_argument(
"--quiet",
action="store_true",
help="Suppress progress output",
)
parser.add_argument(
"--force",
action="store_true",
help="Force regeneration even if cache is valid",
)
args = parser.parse_args()
def progress(phase, current, total):
if not args.quiet:
print(f"[{phase}] {current}/{total}", end="\r")
if current == total:
print()
analyzer = CMakeDependencyAnalyzer(
compile_commands_path=args.compile_commands,
ninja_path=args.build_ninja,
workspace_root=args.workspace_root,
parallel_workers=args.parallel,
)
# Check if cache needs regeneration
if not args.force and not analyzer.should_regenerate_cache(args.output):
print(f"Cache is valid, skipping analysis. Use --force to regenerate.")
print(f"Using cached results from {args.output}")
return
if not args.force and os.path.exists(args.output):
print(f"Cache invalid or outdated, regenerating dependencies...")
print(f"Analyzing dependencies from {args.compile_commands}...")
analyzer.analyze(progress_callback=progress)
print(f"\nExporting to {args.output}...")
analyzer.export_to_json(args.output)
stats = analyzer.calculate_statistics()
print(f"\nResults:")
print(f" Total files: {stats['total_files']}")
print(f" Total executables: {stats['total_executables']}")
print(f" Files with multiple executables: {stats['files_with_multiple_executables']}")
if __name__ == "__main__":
main()

View File

@@ -21,7 +21,7 @@ import json
class EnhancedNinjaDependencyParser:
def __init__(self, build_file_path, ninja_executable="ninja"):
self.build_file_path = build_file_path
self.build_dir = os.path.dirname(build_file_path)
self.build_dir = os.path.dirname(build_file_path) or "."
self.ninja_executable = ninja_executable
# Core data structures

View File

@@ -34,10 +34,10 @@ import os
def get_changed_files(ref1, ref2, project: str = None):
"""Return a set of files changed between two git refs."""
try:
cmd = ["git", "diff", "--name-only", ref1, ref2]
if project:
# Scope git diff to only this project's subtree for efficiency
cmd += ["--", f"projects/{project}/"]
# Don't use git path filter - it can miss files when running from subdirectories
git_root = subprocess.run(["git", "rev-parse", "--show-toplevel"], capture_output=True, text=True, check=True).stdout.strip()
cmd = ["git", "-C", git_root, "diff", "--name-only", f"{ref1}...{ref2}", "--", "projects/composablekernel"]
result = subprocess.run(
cmd,
capture_output=True,
@@ -51,6 +51,7 @@ def get_changed_files(ref1, ref2, project: str = None):
files = raw_files
print(f"Identified {len(files)} modified files")
else:
# Strip projects/{project}/ prefix from changed files
root = f"projects/{project}/"
root_len = len(root)
files = set()
@@ -73,23 +74,79 @@ def load_depmap(depmap_json):
data = json.load(f)
# Support both old and new formats
json_project = None
if "repo" in data and data["repo"]["type"] == "monorepo":
json_project = data["repo"]["project"]
if "repo" in data:
if data["repo"]["type"] == "monorepo":
json_project = data["repo"]["project"]
elif "workspace_root" in data["repo"]:
# Extract project from workspace_root path
workspace_root = data["repo"]["workspace_root"]
# Convert relative path to absolute if needed
if not os.path.isabs(workspace_root):
depmap_dir = os.path.dirname(os.path.abspath(depmap_json))
workspace_root = os.path.abspath(os.path.join(depmap_dir, workspace_root))
# If workspace_root is like /path/to/projects/composablekernel, extract composablekernel
if "/projects/" in workspace_root:
json_project = workspace_root.split("/projects/")[1].rstrip("/").split("/")[0]
if "file_to_executables" in data:
return data["file_to_executables"], json_project
return data, json_project
def select_tests(file_to_executables, changed_files, filter_mode):
def get_ctest_registered_tests(build_dir=None):
"""Get list of tests registered with CTest (excludes EXCLUDE_FROM_ALL targets)."""
try:
cmd = ["ctest", "-N"]
if build_dir:
cmd.extend(["--test-dir", build_dir])
result = subprocess.run(
cmd,
capture_output=True,
text=True,
timeout=30
)
if result.returncode != 0:
return None
tests = set()
for line in result.stdout.splitlines():
if line.strip().startswith("Test #"):
parts = line.split(":", 1)
if len(parts) == 2:
test_name = parts[1].strip()
tests.add(test_name)
return tests
except (subprocess.TimeoutExpired, FileNotFoundError, Exception):
return None
def select_tests(file_to_executables, changed_files, filter_mode, ctest_only=False, build_dir=None):
"""Return a set of test executables affected by changed files."""
affected = set()
ctest_tests = None
if ctest_only:
ctest_tests = get_ctest_registered_tests(build_dir)
if ctest_tests is None:
print("Warning: Could not get CTest test list, including all executables")
else:
print(f"Filtering to {len(ctest_tests)} CTest-registered tests (excluding EXCLUDE_FROM_ALL targets)")
for f in changed_files:
if f in file_to_executables:
for exe in file_to_executables[f]:
if filter_mode == "all":
affected.add(exe)
elif filter_mode == "test_prefix" and exe.startswith("test_"):
affected.add(exe)
if filter_mode == "test_prefix" and not os.path.basename(exe).startswith("test_"):
continue
if ctest_only and ctest_tests is not None:
test_name = exe.replace("bin/", "")
if test_name not in ctest_tests:
continue
affected.add(exe)
return sorted(affected)
@@ -141,16 +198,32 @@ def main():
ref2 = sys.argv[3]
filter_mode = "all"
output_json = "tests_to_run.json"
ctest_only = False
build_dir = None
if "--test-prefix" in sys.argv:
filter_mode = "test_prefix"
if "--all" in sys.argv:
filter_mode = "all"
if "--ctest-only" in sys.argv:
ctest_only = True
if "--build-dir" in sys.argv:
idx = sys.argv.index("--build-dir")
if idx + 1 < len(sys.argv):
build_dir = sys.argv[idx + 1]
if "--output" in sys.argv:
idx = sys.argv.index("--output")
if idx + 1 < len(sys.argv):
output_json = sys.argv[idx + 1]
# If build_dir not specified, try to infer from depmap_json path
if ctest_only and build_dir is None:
depmap_dir = os.path.dirname(os.path.abspath(depmap_json))
if os.path.basename(depmap_dir) in ["build", "."]:
build_dir = depmap_dir
elif os.path.exists(os.path.join(depmap_dir, "build.ninja")):
build_dir = depmap_dir
if not os.path.exists(depmap_json):
print(f"Dependency map JSON not found: {depmap_json}")
sys.exit(1)
@@ -161,15 +234,55 @@ def main():
print("No changed files detected.")
tests = []
else:
tests = select_tests(file_to_executables, changed_files, filter_mode)
tests = select_tests(file_to_executables, changed_files, filter_mode, ctest_only, build_dir)
# Generate ctest regex from test names
# Split into chunks to avoid regex length limits in CTest
regex_chunks = []
chunk_size = 50 # Max tests per regex pattern
if tests:
# Extract basenames for regex (e.g., bin/test_gemm -> test_gemm)
test_names = [os.path.basename(t) for t in tests]
# Split into chunks
for i in range(0, len(test_names), chunk_size):
chunk = test_names[i:i + chunk_size]
regex_chunks.append("|".join(chunk))
# Keep single regex for backward compatibility (but may be too long)
regex = "|".join(test_names)
else:
regex = ""
# Output format matches Jenkinsfile usage and documentation
output = {
"tests_to_run": tests, # For backward compatibility and length check
"executables": tests, # Used by Jenkinsfile for ninja build
"regex": regex, # Used by Jenkinsfile for ctest (deprecated for large test sets)
"regex_chunks": regex_chunks, # Multiple regex patterns for large test sets
"changed_files": sorted(changed_files),
"statistics": {
"total_changed_files": len(changed_files),
"total_affected_executables": len(tests),
"num_regex_chunks": len(regex_chunks),
},
}
with open(output_json, "w") as f:
json.dump(
{"tests_to_run": tests, "changed_files": sorted(changed_files)}, f, indent=2
)
json.dump(output, f, indent=2)
# Print summary
print(f"Exported {len(tests)} tests to run to {output_json}")
# Print changed files for visibility
if changed_files:
print(f"\nChanged files ({len(changed_files)}):")
for f in sorted(changed_files):
print(f" - {f}")
else:
print("\nNo files changed.")
if __name__ == "__main__":
main()

View File

@@ -0,0 +1,512 @@
#!/usr/bin/env python3
# Copyright (c) Advanced Micro Devices, Inc., or its affiliates.
# SPDX-License-Identifier: MIT
"""
Test-Driven Development tests for CMake Dependency Analyzer.
This module tests the new pre-build dependency analysis approach that uses
compile_commands.json and clang -MM instead of requiring a full ninja build.
"""
import json
import os
import tempfile
import unittest
from pathlib import Path
from unittest.mock import Mock, patch, MagicMock
import shutil
import sys
# Add parent directory to path for imports
sys.path.insert(0, str(Path(__file__).parent.parent / "src"))
class TestCompileCommandsParser(unittest.TestCase):
"""Tests for parsing compile_commands.json."""
def setUp(self):
"""Create temporary directory and sample compile_commands.json."""
self.temp_dir = tempfile.mkdtemp()
self.compile_commands_path = os.path.join(self.temp_dir, "compile_commands.json")
def tearDown(self):
"""Clean up temporary directory."""
shutil.rmtree(self.temp_dir)
def test_parse_empty_compile_commands(self):
"""Parser should handle empty compile_commands.json gracefully."""
from cmake_dependency_analyzer import CompileCommandsParser
with open(self.compile_commands_path, "w") as f:
json.dump([], f)
parser = CompileCommandsParser(self.compile_commands_path)
commands = parser.parse()
self.assertEqual(len(commands), 0)
def test_parse_single_command(self):
"""Parser should correctly parse a single compile command."""
from cmake_dependency_analyzer import CompileCommandsParser
sample_commands = [
{
"directory": "/build",
"command": "/opt/rocm/bin/amdclang++ -DFOO=1 -I/include -c test.cpp -o test.o",
"file": "/src/test.cpp",
}
]
with open(self.compile_commands_path, "w") as f:
json.dump(sample_commands, f)
parser = CompileCommandsParser(self.compile_commands_path)
commands = parser.parse()
self.assertEqual(len(commands), 1)
self.assertEqual(commands[0]["file"], "/src/test.cpp")
self.assertEqual(commands[0]["directory"], "/build")
def test_parse_multiple_commands(self):
"""Parser should correctly parse multiple compile commands."""
from cmake_dependency_analyzer import CompileCommandsParser
sample_commands = [
{
"directory": "/build",
"command": "/opt/rocm/bin/amdclang++ -c test1.cpp -o test1.o",
"file": "/src/test1.cpp",
},
{
"directory": "/build",
"command": "/opt/rocm/bin/amdclang++ -c test2.cpp -o test2.o",
"file": "/src/test2.cpp",
},
]
with open(self.compile_commands_path, "w") as f:
json.dump(sample_commands, f)
parser = CompileCommandsParser(self.compile_commands_path)
commands = parser.parse()
self.assertEqual(len(commands), 2)
def test_filter_by_extension(self):
"""Parser should filter commands by file extension."""
from cmake_dependency_analyzer import CompileCommandsParser
sample_commands = [
{"directory": "/build", "command": "clang++ -c test.cpp -o test.o", "file": "/src/test.cpp"},
{"directory": "/build", "command": "clang++ -c test.cc -o test.o", "file": "/src/test.cc"},
{"directory": "/build", "command": "clang -c test.c -o test.o", "file": "/src/test.c"},
]
with open(self.compile_commands_path, "w") as f:
json.dump(sample_commands, f)
parser = CompileCommandsParser(self.compile_commands_path)
commands = parser.parse(extensions=[".cpp"])
self.assertEqual(len(commands), 1)
self.assertEqual(commands[0]["file"], "/src/test.cpp")
def test_handles_arguments_format(self):
"""Parser should handle both 'command' and 'arguments' formats."""
from cmake_dependency_analyzer import CompileCommandsParser
sample_commands = [
{
"directory": "/build",
"arguments": ["/opt/rocm/bin/amdclang++", "-c", "test.cpp", "-o", "test.o"],
"file": "/src/test.cpp",
}
]
with open(self.compile_commands_path, "w") as f:
json.dump(sample_commands, f)
parser = CompileCommandsParser(self.compile_commands_path)
commands = parser.parse()
self.assertEqual(len(commands), 1)
self.assertIn("command", commands[0])
class TestDependencyExtractor(unittest.TestCase):
"""Tests for extracting dependencies using clang -MM."""
def setUp(self):
"""Set up test fixtures."""
self.temp_dir = tempfile.mkdtemp()
def tearDown(self):
"""Clean up."""
shutil.rmtree(self.temp_dir)
def test_convert_compile_to_dependency_command(self):
"""Should convert compile command to dependency extraction command."""
from cmake_dependency_analyzer import DependencyExtractor
compile_cmd = "/opt/rocm/bin/amdclang++ -DFOO=1 -I/include -O3 -c /src/test.cpp -o /build/test.o"
extractor = DependencyExtractor()
dep_cmd = extractor.convert_to_dependency_command(compile_cmd, "/tmp/deps.d")
# Should have -MM flag
self.assertIn("-MM", dep_cmd)
# Should have -MF flag with output file
self.assertIn("-MF", dep_cmd)
self.assertIn("/tmp/deps.d", dep_cmd)
# Should NOT have -c flag
self.assertNotIn(" -c ", dep_cmd)
# Should NOT have -o flag with output
self.assertNotIn(" -o ", dep_cmd)
# Should preserve includes and defines
self.assertIn("-DFOO=1", dep_cmd)
self.assertIn("-I/include", dep_cmd)
# Should preserve source file
self.assertIn("/src/test.cpp", dep_cmd)
def test_parse_makefile_deps_simple(self):
"""Should parse simple makefile-style dependency output."""
from cmake_dependency_analyzer import DependencyExtractor
deps_content = "test.o: test.cpp header1.hpp header2.hpp\n"
extractor = DependencyExtractor()
deps = extractor.parse_makefile_deps(deps_content)
self.assertEqual(len(deps), 3)
self.assertIn("test.cpp", deps)
self.assertIn("header1.hpp", deps)
self.assertIn("header2.hpp", deps)
def test_parse_makefile_deps_multiline(self):
"""Should parse multiline makefile-style dependency output."""
from cmake_dependency_analyzer import DependencyExtractor
deps_content = """test.o: test.cpp \\
/include/header1.hpp \\
/include/header2.hpp \\
/include/header3.hpp
"""
extractor = DependencyExtractor()
deps = extractor.parse_makefile_deps(deps_content)
self.assertEqual(len(deps), 4)
self.assertIn("test.cpp", deps)
self.assertIn("/include/header1.hpp", deps)
self.assertIn("/include/header2.hpp", deps)
self.assertIn("/include/header3.hpp", deps)
def test_parse_makefile_deps_empty(self):
"""Should handle empty dependency output."""
from cmake_dependency_analyzer import DependencyExtractor
extractor = DependencyExtractor()
deps = extractor.parse_makefile_deps("")
self.assertEqual(len(deps), 0)
@patch("subprocess.run")
def test_extract_dependencies_success(self, mock_run):
"""Should successfully extract dependencies using clang -MM."""
from cmake_dependency_analyzer import DependencyExtractor
# Mock successful clang -MM execution
mock_run.return_value = Mock(returncode=0, stdout="", stderr="")
extractor = DependencyExtractor()
with tempfile.NamedTemporaryFile(mode="w", suffix=".d", delete=False) as f:
f.write("test.o: test.cpp header.hpp\n")
deps_file = f.name
with patch.object(extractor, "_get_deps_file", return_value=deps_file):
deps = extractor.extract("/build", "clang++ -c test.cpp -o test.o", "/src/test.cpp")
self.assertIn("test.cpp", deps)
self.assertIn("header.hpp", deps)
# Note: The implementation cleans up the temp file, so we don't need to
@patch("subprocess.run")
def test_extract_dependencies_compiler_error(self, mock_run):
"""Should handle compiler errors gracefully."""
from cmake_dependency_analyzer import DependencyExtractor
# Mock failed clang -MM execution
mock_run.return_value = Mock(returncode=1, stdout="", stderr="error: file not found")
extractor = DependencyExtractor()
deps = extractor.extract("/build", "clang++ -c test.cpp -o test.o", "/src/test.cpp")
# Should return empty list on error, not crash
self.assertEqual(deps, [])
class TestNinjaTargetParser(unittest.TestCase):
"""Tests for parsing ninja build files to get target mappings."""
def setUp(self):
"""Set up test fixtures."""
self.temp_dir = tempfile.mkdtemp()
self.ninja_file = os.path.join(self.temp_dir, "build.ninja")
def tearDown(self):
"""Clean up."""
shutil.rmtree(self.temp_dir)
def test_parse_executable_to_objects(self):
"""Should parse executable -> object file mappings from build.ninja."""
from cmake_dependency_analyzer import NinjaTargetParser
ninja_content = """
rule CXX_EXECUTABLE_LINKER__test_gemm
command = /opt/rocm/bin/amdclang++ $in -o $out
build bin/test_gemm: CXX_EXECUTABLE_LINKER__test_gemm test/test_gemm.cpp.o library/gemm.cpp.o | lib.so
"""
with open(self.ninja_file, "w") as f:
f.write(ninja_content)
parser = NinjaTargetParser(self.ninja_file)
exe_to_objects = parser.parse_executable_mappings()
self.assertIn("bin/test_gemm", exe_to_objects)
self.assertIn("test/test_gemm.cpp.o", exe_to_objects["bin/test_gemm"])
self.assertIn("library/gemm.cpp.o", exe_to_objects["bin/test_gemm"])
def test_parse_object_to_source(self):
"""Should parse object -> source file mappings from build.ninja."""
from cmake_dependency_analyzer import NinjaTargetParser
ninja_content = """
rule CXX_COMPILER__test_gemm
command = /opt/rocm/bin/amdclang++ -c $in -o $out
build test/test_gemm.cpp.o: CXX_COMPILER__test_gemm /src/test/test_gemm.cpp
build library/gemm.cpp.o: CXX_COMPILER__test_gemm /src/library/gemm.cpp
"""
with open(self.ninja_file, "w") as f:
f.write(ninja_content)
parser = NinjaTargetParser(self.ninja_file)
obj_to_source = parser.parse_object_to_source()
self.assertIn("test/test_gemm.cpp.o", obj_to_source)
self.assertEqual(obj_to_source["test/test_gemm.cpp.o"], "/src/test/test_gemm.cpp")
def test_filter_test_executables(self):
"""Should correctly filter test executables by prefix."""
from cmake_dependency_analyzer import NinjaTargetParser
ninja_content = """
build bin/test_gemm: CXX_EXECUTABLE_LINKER__test_gemm test.o
build bin/example_gemm: CXX_EXECUTABLE_LINKER__example_gemm example.o
build bin/ckProfiler: CXX_EXECUTABLE_LINKER__ckProfiler profiler.o
"""
with open(self.ninja_file, "w") as f:
f.write(ninja_content)
parser = NinjaTargetParser(self.ninja_file)
exe_to_objects = parser.parse_executable_mappings()
test_exes = [exe for exe in exe_to_objects if "test_" in exe]
self.assertEqual(len(test_exes), 1)
self.assertIn("bin/test_gemm", test_exes)
class TestDependencyMapper(unittest.TestCase):
"""Tests for building the file -> executable dependency mapping."""
def test_build_file_to_executable_mapping(self):
"""Should build correct file -> executable mapping."""
from cmake_dependency_analyzer import DependencyMapper
# Simulated data
exe_to_objects = {
"bin/test_gemm": ["test/test_gemm.cpp.o", "lib/gemm.cpp.o"],
"bin/test_conv": ["test/test_conv.cpp.o", "lib/conv.cpp.o"],
}
obj_to_source = {
"test/test_gemm.cpp.o": "test/test_gemm.cpp",
"lib/gemm.cpp.o": "lib/gemm.cpp",
"test/test_conv.cpp.o": "test/test_conv.cpp",
"lib/conv.cpp.o": "lib/conv.cpp",
}
source_to_deps = {
"test/test_gemm.cpp": ["test/test_gemm.cpp", "include/gemm.hpp", "include/common.hpp"],
"lib/gemm.cpp": ["lib/gemm.cpp", "include/gemm.hpp"],
"test/test_conv.cpp": ["test/test_conv.cpp", "include/conv.hpp", "include/common.hpp"],
"lib/conv.cpp": ["lib/conv.cpp", "include/conv.hpp"],
}
mapper = DependencyMapper()
file_to_exes = mapper.build_mapping(exe_to_objects, obj_to_source, source_to_deps)
# common.hpp should map to both test executables
self.assertIn("include/common.hpp", file_to_exes)
self.assertIn("bin/test_gemm", file_to_exes["include/common.hpp"])
self.assertIn("bin/test_conv", file_to_exes["include/common.hpp"])
# gemm.hpp should only map to test_gemm
self.assertIn("include/gemm.hpp", file_to_exes)
self.assertIn("bin/test_gemm", file_to_exes["include/gemm.hpp"])
self.assertNotIn("bin/test_conv", file_to_exes["include/gemm.hpp"])
def test_normalize_paths(self):
"""Should normalize paths relative to workspace root."""
from cmake_dependency_analyzer import DependencyMapper
mapper = DependencyMapper(workspace_root="/workspace/rocm-libraries/projects/composablekernel")
# Test monorepo-style path
normalized = mapper.normalize_path(
"/workspace/rocm-libraries/projects/composablekernel/include/ck/ck.hpp"
)
self.assertEqual(normalized, "include/ck/ck.hpp")
# Test already relative path
normalized = mapper.normalize_path("include/ck/ck.hpp")
self.assertEqual(normalized, "include/ck/ck.hpp")
def test_filter_system_files(self):
"""Should filter out system files."""
from cmake_dependency_analyzer import DependencyMapper
mapper = DependencyMapper()
self.assertFalse(mapper.is_project_file("/usr/include/stdio.h"))
self.assertFalse(mapper.is_project_file("/opt/rocm/include/hip/hip_runtime.h"))
self.assertTrue(mapper.is_project_file("include/ck/ck.hpp"))
self.assertTrue(mapper.is_project_file("test/test_gemm.cpp"))
class TestCMakeDependencyAnalyzer(unittest.TestCase):
"""Integration tests for the full CMake dependency analyzer."""
def setUp(self):
"""Set up test fixtures."""
self.temp_dir = tempfile.mkdtemp()
def tearDown(self):
"""Clean up."""
shutil.rmtree(self.temp_dir)
def test_output_format_compatibility(self):
"""Output JSON should be compatible with selective_test_filter.py."""
from cmake_dependency_analyzer import CMakeDependencyAnalyzer
# Create minimal test data
analyzer = CMakeDependencyAnalyzer(
compile_commands_path=None,
ninja_path=None,
workspace_root=self.temp_dir,
)
# Manually set internal state for testing output format
analyzer.file_to_executables = {
"include/ck/ck.hpp": {"bin/test_gemm", "bin/test_conv"},
"test/test_gemm.cpp": {"bin/test_gemm"},
}
analyzer.executable_to_files = {
"bin/test_gemm": {"include/ck/ck.hpp", "test/test_gemm.cpp"},
"bin/test_conv": {"include/ck/ck.hpp"},
}
output_file = os.path.join(self.temp_dir, "output.json")
analyzer.export_to_json(output_file)
with open(output_file) as f:
data = json.load(f)
# Check required fields for selective_test_filter.py compatibility
self.assertIn("file_to_executables", data)
self.assertIn("executable_to_files", data)
self.assertIn("statistics", data)
# Check file_to_executables format (should be lists, not sets)
self.assertIsInstance(data["file_to_executables"]["include/ck/ck.hpp"], list)
def test_statistics_calculation(self):
"""Should calculate correct statistics."""
from cmake_dependency_analyzer import CMakeDependencyAnalyzer
analyzer = CMakeDependencyAnalyzer(
compile_commands_path=None,
ninja_path=None,
workspace_root=self.temp_dir,
)
analyzer.file_to_executables = {
"include/common.hpp": {"bin/test1", "bin/test2", "bin/test3"},
"include/specific.hpp": {"bin/test1"},
"test/test1.cpp": {"bin/test1"},
}
stats = analyzer.calculate_statistics()
self.assertEqual(stats["total_files"], 3)
self.assertEqual(stats["files_with_multiple_executables"], 1)
class TestParallelDependencyExtraction(unittest.TestCase):
"""Tests for parallel dependency extraction."""
def test_batch_extraction_preserves_results(self):
"""Parallel extraction should produce same results as serial."""
from cmake_dependency_analyzer import DependencyExtractor
extractor = DependencyExtractor(parallel_workers=4)
# This is more of an integration test placeholder
# Real parallel testing would require actual compiler invocations
self.assertIsNotNone(extractor)
class TestEdgeCases(unittest.TestCase):
"""Tests for edge cases and error handling."""
def test_handles_missing_compile_commands(self):
"""Should raise appropriate error for missing compile_commands.json."""
from cmake_dependency_analyzer import CompileCommandsParser
with self.assertRaises(FileNotFoundError):
parser = CompileCommandsParser("/nonexistent/compile_commands.json")
parser.parse()
def test_handles_malformed_json(self):
"""Should handle malformed JSON gracefully."""
from cmake_dependency_analyzer import CompileCommandsParser
temp_dir = tempfile.mkdtemp()
try:
path = os.path.join(temp_dir, "compile_commands.json")
with open(path, "w") as f:
f.write("not valid json {{{")
parser = CompileCommandsParser(path)
with self.assertRaises(json.JSONDecodeError):
parser.parse()
finally:
shutil.rmtree(temp_dir)
def test_handles_empty_ninja_file(self):
"""Should handle empty ninja file gracefully."""
from cmake_dependency_analyzer import NinjaTargetParser
temp_dir = tempfile.mkdtemp()
try:
ninja_file = os.path.join(temp_dir, "build.ninja")
with open(ninja_file, "w") as f:
f.write("")
parser = NinjaTargetParser(ninja_file)
exe_to_objects = parser.parse_executable_mappings()
self.assertEqual(len(exe_to_objects), 0)
finally:
shutil.rmtree(temp_dir)
if __name__ == "__main__":
unittest.main()

View File

@@ -0,0 +1,264 @@
#!/usr/bin/env python3
# Copyright (c) Advanced Micro Devices, Inc., or its affiliates.
# SPDX-License-Identifier: MIT
"""
Integration tests for CMake Dependency Analyzer.
These tests use real compile_commands.json and actual AMD clang compiler
to verify the analyzer works correctly in production environment.
"""
import json
import os
import sys
import tempfile
import unittest
from pathlib import Path
# Add parent directory to path for imports
sys.path.insert(0, str(Path(__file__).parent.parent / "src"))
# Skip all tests if compile_commands.json doesn't exist
CK_ROOT = Path(__file__).parent.parent.parent.parent
BUILD_DIR = CK_ROOT / "build"
COMPILE_COMMANDS = BUILD_DIR / "compile_commands.json"
BUILD_NINJA = BUILD_DIR / "build.ninja"
SKIP_INTEGRATION = not COMPILE_COMMANDS.exists()
SKIP_REASON = f"compile_commands.json not found at {COMPILE_COMMANDS}"
@unittest.skipIf(SKIP_INTEGRATION, SKIP_REASON)
class TestRealCompileCommands(unittest.TestCase):
"""Tests using real compile_commands.json."""
def test_parse_real_compile_commands(self):
"""Should parse real CK compile_commands.json."""
from cmake_dependency_analyzer import CompileCommandsParser
parser = CompileCommandsParser(str(COMPILE_COMMANDS))
commands = parser.parse()
# CK has thousands of source files
self.assertGreater(len(commands), 100)
# Verify structure
for cmd in commands[:5]:
self.assertIn("file", cmd)
self.assertIn("directory", cmd)
self.assertIn("command", cmd)
def test_filter_cpp_files_only(self):
"""Should correctly filter to only .cpp files."""
from cmake_dependency_analyzer import CompileCommandsParser
parser = CompileCommandsParser(str(COMPILE_COMMANDS))
commands = parser.parse(extensions=[".cpp"])
for cmd in commands:
self.assertTrue(
cmd["file"].endswith(".cpp"),
f"Expected .cpp file, got {cmd['file']}",
)
@unittest.skipIf(SKIP_INTEGRATION, SKIP_REASON)
class TestRealDependencyExtraction(unittest.TestCase):
"""Tests using real AMD clang for dependency extraction."""
def test_extract_real_dependencies(self):
"""Should extract dependencies using real AMD clang."""
from cmake_dependency_analyzer import CompileCommandsParser, DependencyExtractor
parser = CompileCommandsParser(str(COMPILE_COMMANDS))
commands = parser.parse(extensions=[".cpp"])
# Test with first command
if not commands:
self.skipTest("No compile commands found")
cmd = commands[0]
extractor = DependencyExtractor()
deps = extractor.extract(cmd["directory"], cmd["command"], cmd["file"])
# Should have at least the source file itself
self.assertGreater(len(deps), 0, f"No deps found for {cmd['file']}")
# Should include the source file
source_basename = os.path.basename(cmd["file"])
found_source = any(source_basename in d for d in deps)
self.assertTrue(found_source, f"Source file not in deps: {deps[:5]}")
def test_extract_header_dependencies(self):
"""Should find CK header dependencies."""
from cmake_dependency_analyzer import CompileCommandsParser, DependencyExtractor
parser = CompileCommandsParser(str(COMPILE_COMMANDS))
commands = parser.parse(extensions=[".cpp"])
# Find a test file that includes CK headers
test_cmd = None
for cmd in commands:
if "test_" in cmd["file"] or "example_" in cmd["file"]:
test_cmd = cmd
break
if not test_cmd:
self.skipTest("No test file found")
extractor = DependencyExtractor()
deps = extractor.extract(test_cmd["directory"], test_cmd["command"], test_cmd["file"])
# Should include CK headers
ck_headers = [d for d in deps if "include/ck" in d or "include/ck_tile" in d]
self.assertGreater(
len(ck_headers), 0,
f"No CK headers found in deps for {test_cmd['file']}"
)
@unittest.skipIf(SKIP_INTEGRATION, SKIP_REASON)
@unittest.skipIf(not BUILD_NINJA.exists(), "build.ninja not found")
class TestRealNinjaParsing(unittest.TestCase):
"""Tests using real build.ninja."""
def test_parse_real_executables(self):
"""Should parse real executable mappings from build.ninja."""
from cmake_dependency_analyzer import NinjaTargetParser
parser = NinjaTargetParser(str(BUILD_NINJA))
exe_to_objects = parser.parse_executable_mappings()
# CK has many test executables
test_exes = [e for e in exe_to_objects if "test_" in e]
self.assertGreater(len(test_exes), 10, "Expected many test executables")
# Each executable should have at least one object file
for exe, objs in list(exe_to_objects.items())[:5]:
self.assertGreater(len(objs), 0, f"No objects for {exe}")
def test_parse_real_object_sources(self):
"""Should parse real object -> source mappings."""
from cmake_dependency_analyzer import NinjaTargetParser
parser = NinjaTargetParser(str(BUILD_NINJA))
obj_to_source = parser.parse_object_to_source()
# Should have many object files
self.assertGreater(len(obj_to_source), 100)
# Each mapping should have valid source file
for obj, src in list(obj_to_source.items())[:5]:
self.assertTrue(
src.endswith((".cpp", ".cc", ".cu", ".hip")),
f"Invalid source for {obj}: {src}",
)
@unittest.skipIf(SKIP_INTEGRATION, SKIP_REASON)
@unittest.skipIf(not BUILD_NINJA.exists(), "build.ninja not found")
class TestFullIntegration(unittest.TestCase):
"""Full integration test of the analyzer."""
def test_small_batch_analysis(self):
"""Should analyze a small batch of files correctly."""
from cmake_dependency_analyzer import (
CompileCommandsParser,
DependencyExtractor,
NinjaTargetParser,
DependencyMapper,
)
# Parse compile commands (limit to 10 for speed)
parser = CompileCommandsParser(str(COMPILE_COMMANDS))
all_commands = parser.parse(extensions=[".cpp"])
commands = all_commands[:10]
# Extract dependencies
extractor = DependencyExtractor()
source_to_deps = extractor.extract_batch(commands)
self.assertEqual(len(source_to_deps), len(commands))
# Parse ninja
ninja_parser = NinjaTargetParser(str(BUILD_NINJA))
exe_to_objects = ninja_parser.parse_executable_mappings()
obj_to_source = ninja_parser.parse_object_to_source()
# Build mapping
mapper = DependencyMapper(workspace_root=str(CK_ROOT))
file_to_exes = mapper.build_mapping(exe_to_objects, obj_to_source, source_to_deps)
# Should have some mappings (depends on which files were analyzed)
# This test mainly verifies no crashes occur
self.assertIsInstance(file_to_exes, dict)
def test_output_json_format(self):
"""Should produce JSON compatible with selective_test_filter.py."""
from cmake_dependency_analyzer import CMakeDependencyAnalyzer
# Create analyzer with limited scope
analyzer = CMakeDependencyAnalyzer(
compile_commands_path=str(COMPILE_COMMANDS),
ninja_path=str(BUILD_NINJA),
workspace_root=str(CK_ROOT),
parallel_workers=1,
)
# Manually set minimal data for output test
analyzer.file_to_executables = {
"include/ck/ck.hpp": {"bin/test_gemm"},
}
analyzer.executable_to_files = {
"bin/test_gemm": {"include/ck/ck.hpp"},
}
with tempfile.NamedTemporaryFile(mode="w", suffix=".json", delete=False) as f:
output_path = f.name
try:
analyzer.export_to_json(output_path)
with open(output_path) as f:
data = json.load(f)
# Verify format matches selective_test_filter.py expectations
self.assertIn("file_to_executables", data)
self.assertIn("statistics", data)
# Values should be lists, not sets
for key, value in data["file_to_executables"].items():
self.assertIsInstance(value, list)
finally:
os.unlink(output_path)
@unittest.skipIf(SKIP_INTEGRATION, SKIP_REASON)
class TestPerformance(unittest.TestCase):
"""Performance tests."""
def test_extraction_speed(self):
"""Single file extraction should be fast (<1s)."""
import time
from cmake_dependency_analyzer import CompileCommandsParser, DependencyExtractor
parser = CompileCommandsParser(str(COMPILE_COMMANDS))
commands = parser.parse(extensions=[".cpp"])
if not commands:
self.skipTest("No compile commands")
cmd = commands[0]
extractor = DependencyExtractor()
start = time.time()
deps = extractor.extract(cmd["directory"], cmd["command"], cmd["file"])
elapsed = time.time() - start
self.assertLess(elapsed, 1.0, f"Extraction took {elapsed:.2f}s, expected <1s")
self.assertGreater(len(deps), 0, "No dependencies extracted")
if __name__ == "__main__":
unittest.main()

View File

@@ -0,0 +1,372 @@
#!/bin/bash
# Validate Smart Build vs Legacy Method for a PR
#
# This script compares smart build and legacy dependency analysis
# to ensure both methods produce the same test selection results.
set -e
# Colors for output
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
RED='\033[0;31m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color
# Configuration
PR_NUMBER=""
BASE_BRANCH="origin/develop"
SMART_BUILD_BRANCH="users/yraparti/ck/dependency-parser-smart-build"
BUILD_DIR="../../build"
SKIP_BUILD=false
SKIP_LEGACY=false
print_help() {
cat << 'HELP'
Usage: validate_pr.sh -p PR_NUMBER [OPTIONS]
Validates that smart build and legacy methods select the same tests for a PR.
Required:
-p, --pr PR_NUMBER PR number to validate
Options:
-b, --base BRANCH Base branch (default: origin/develop)
-s, --smart-build BRANCH Smart build branch (default: users/yraparti/ck/dependency-parser-smart-build)
--skip-build Skip full build (use existing build artifacts)
--skip-legacy Skip legacy analysis (only run smart build)
-h, --help Show this help
Examples:
# Validate PR 5324
./validate_pr.sh -p 5324
# Validate PR 5168 with custom base
./validate_pr.sh -p 5168 -b origin/main
# Quick validation (skip build, only smart build)
./validate_pr.sh -p 5324 --skip-build --skip-legacy
Output:
Results saved to build/prXXXX_validation_results.txt
HELP
}
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_section() {
echo -e "\n${BLUE}=== $1 ===${NC}\n"
}
# Parse arguments
while [[ $# -gt 0 ]]; do
case $1 in
-p|--pr)
PR_NUMBER="$2"
shift 2
;;
-b|--base)
BASE_BRANCH="$2"
shift 2
;;
-s|--smart-build)
SMART_BUILD_BRANCH="$2"
shift 2
;;
--skip-build)
SKIP_BUILD=true
shift
;;
--skip-legacy)
SKIP_LEGACY=true
shift
;;
-h|--help)
print_help
exit 0
;;
*)
log_error "Unknown option: $1"
print_help
exit 1
;;
esac
done
# Validate inputs
if [ -z "$PR_NUMBER" ]; then
log_error "PR number is required"
print_help
exit 1
fi
# Setup
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
BUILD_DIR="$PROJECT_ROOT/build"
OUTPUT_FILE="$BUILD_DIR/pr${PR_NUMBER}_validation_results.txt"
log_section "Validation Configuration"
echo "PR Number: $PR_NUMBER"
echo "Base Branch: $BASE_BRANCH"
echo "Smart Build Branch: $SMART_BUILD_BRANCH"
echo "Skip Build: $SKIP_BUILD"
echo "Skip Legacy: $SKIP_LEGACY"
echo "Output File: $OUTPUT_FILE"
# Start validation log
exec > >(tee "$OUTPUT_FILE") 2>&1
log_section "Step 1: Fetch PR $PR_NUMBER"
cd "$PROJECT_ROOT" || exit 1
log_info "Fetching PR #$PR_NUMBER..."
git fetch origin pull/${PR_NUMBER}/head:pr-${PR_NUMBER}
log_info "Checking out PR branch..."
git checkout pr-${PR_NUMBER}
log_info "PR commit:"
git log --oneline -1
log_section "Step 2: Rebase on Smart Build Branch"
log_info "Rebasing pr-${PR_NUMBER} on $SMART_BUILD_BRANCH..."
# Attempt rebase, handling conflicts by accepting PR changes
if ! git rebase $SMART_BUILD_BRANCH; then
log_warn "Rebase conflicts detected, resolving by accepting PR changes..."
# Loop to handle multiple conflicts during rebase
while true; do
# Get list of conflicted files
CONFLICTED_FILES=$(git diff --name-only --diff-filter=U)
if [ -z "$CONFLICTED_FILES" ]; then
log_info "No more conflicts, rebase complete"
break
fi
log_info "Conflicted files:"
echo "$CONFLICTED_FILES"
# For each conflicted file, accept the PR's version (theirs)
while IFS= read -r file; do
if [ -f "$file" ]; then
log_info "Accepting PR changes for: $file"
git checkout --theirs "$file"
git add "$file"
fi
done <<< "$CONFLICTED_FILES"
# Continue the rebase
log_info "Continuing rebase..."
if git -c core.editor=true rebase --continue 2>&1 | grep -q "No changes"; then
log_warn "No changes after conflict resolution, skipping commit"
git rebase --skip
elif git rebase --show-current-patch &>/dev/null; then
# Still in rebase, continue loop
continue
else
# Rebase complete
log_info "Rebase completed"
break
fi
done
fi
log_info "Rebased commits:"
git log --oneline -5
log_section "Step 3: Analyze Changed Files"
log_info "Files changed vs $BASE_BRANCH:"
CHANGED_FILES=$(git diff --name-only ${BASE_BRANCH}...HEAD -- projects/composablekernel)
NUM_FILES=$(echo "$CHANGED_FILES" | wc -l)
echo "$CHANGED_FILES" | head -20
if [ "$NUM_FILES" -gt 20 ]; then
echo "... (showing first 20 of $NUM_FILES files)"
fi
echo ""
echo "Total changed files: $NUM_FILES"
log_section "Step 4: Generate Fresh Dependency Map"
cd "$BUILD_DIR" || exit 1
log_info "Configuring CMake to generate compile_commands.json..."
cmake .. -GNinja -DCMAKE_EXPORT_COMPILE_COMMANDS=ON 2>&1 | grep -v "^-- " || true
if [ ! -f "compile_commands.json" ]; then
log_error "CMake configuration failed - compile_commands.json not generated"
exit 1
fi
if [ ! -f "build.ninja" ]; then
log_error "build.ninja not found - CMake should have generated it"
exit 1
fi
log_info "Generating fresh dependency map for PR validation..."
START_TIME=$(date +%s)
python3 ../script/dependency-parser/main.py cmake-parse \
compile_commands.json \
build.ninja \
--workspace-root .. \
--output enhanced_dependency_mapping.json
if [ ! -f "enhanced_dependency_mapping.json" ]; then
log_error "Dependency map generation failed"
exit 1
fi
END_TIME=$(date +%s)
DEP_TIME=$((END_TIME - START_TIME))
log_info "Dependency map generated in ${DEP_TIME} seconds"
SMART_MAP="enhanced_dependency_mapping.json"
SMART_FILES=$(jq '.file_to_executables | length' $SMART_MAP)
log_info "Dependency map tracks $SMART_FILES files"
log_section "Step 5: Smart Build Test Selection"
log_info "Running smart build test selection..."
python3 ../script/dependency-parser/main.py select \
"$SMART_MAP" \
$BASE_BRANCH \
HEAD \
--ctest-only \
--output pr${PR_NUMBER}_smart_build.json
SMART_TESTS=$(jq -r '.tests_to_run | length' pr${PR_NUMBER}_smart_build.json)
log_info "Smart build selected: $SMART_TESTS tests"
# Show statistics
echo ""
echo "Smart Build Results:"
jq '{changed_files: .changed_files | length, tests_selected: .tests_to_run | length, statistics}' pr${PR_NUMBER}_smart_build.json
if [ "$SKIP_LEGACY" = true ]; then
log_section "Validation Complete (Legacy Skipped)"
echo ""
echo "Smart Build: $SMART_TESTS tests selected"
echo "Legacy: Skipped"
exit 0
fi
log_section "Step 6: Full Build (for Legacy Method)"
if [ "$SKIP_BUILD" = true ]; then
log_warn "Skipping build (--skip-build specified)"
log_info "Using existing build artifacts..."
else
log_info "Running full build (this takes ~60 minutes)..."
START_TIME=$(date +%s)
if ninja 2>&1 | tee pr${PR_NUMBER}_build.log; then
END_TIME=$(date +%s)
BUILD_TIME=$((END_TIME - START_TIME))
log_info "Build completed in $((BUILD_TIME / 60)) minutes"
else
log_error "Build failed. Check pr${PR_NUMBER}_build.log for details."
exit 1
fi
fi
log_section "Step 7: Legacy Dependency Analysis"
log_info "Generating legacy dependency map (ninja -t deps)..."
python3 ../script/dependency-parser/main.py parse build.ninja
if [ ! -f "enhanced_dependency_mapping.json" ]; then
log_error "Legacy dependency map generation failed"
exit 1
fi
LEGACY_FILES=$(jq '.file_to_executables | length' enhanced_dependency_mapping.json)
log_info "Legacy map tracks $LEGACY_FILES files"
log_section "Step 8: Legacy Test Selection"
log_info "Running legacy test selection..."
python3 ../script/dependency-parser/main.py select \
enhanced_dependency_mapping.json \
$BASE_BRANCH \
HEAD \
--ctest-only \
--output pr${PR_NUMBER}_legacy_tests.json
LEGACY_TESTS=$(jq -r '.tests_to_run | length' pr${PR_NUMBER}_legacy_tests.json)
log_info "Legacy method selected: $LEGACY_TESTS tests"
# Show statistics
echo ""
echo "Legacy Method Results:"
jq '{changed_files: .changed_files | length, tests_selected: .tests_to_run | length, statistics}' pr${PR_NUMBER}_legacy_tests.json
log_section "Step 9: Compare Results"
echo ""
echo "╔════════════════════════════════════════════════════════════════╗"
echo "║ VALIDATION RESULTS ║"
echo "╠════════════════════════════════════════════════════════════════╣"
echo "║ PR Number: #${PR_NUMBER} "
echo "║ Changed Files: $NUM_FILES "
echo "║ Smart Build Tests: $SMART_TESTS "
echo "║ Legacy Tests: $LEGACY_TESTS "
echo "╠════════════════════════════════════════════════════════════════╣"
if [ "$SMART_TESTS" -eq "$LEGACY_TESTS" ]; then
echo "║ Result: ✅ MATCH "
echo "╚════════════════════════════════════════════════════════════════╝"
echo ""
log_info "VALIDATION PASSED: Both methods selected $SMART_TESTS tests"
# Detailed comparison
if [ "$SMART_TESTS" -gt 0 ]; then
log_info "Comparing test lists..."
SMART_LIST=$(jq -r '.tests_to_run | sort | .[]' pr${PR_NUMBER}_smart_build.json)
LEGACY_LIST=$(jq -r '.tests_to_run | sort | .[]' pr${PR_NUMBER}_legacy_tests.json)
if [ "$SMART_LIST" = "$LEGACY_LIST" ]; then
log_info "Test lists are identical ✓"
else
log_warn "Test counts match but lists differ!"
diff <(echo "$SMART_LIST") <(echo "$LEGACY_LIST") || true
fi
fi
EXIT_CODE=0
else
echo "║ Result: ❌ MISMATCH "
echo "╚════════════════════════════════════════════════════════════════╝"
echo ""
log_error "VALIDATION FAILED: Smart build selected $SMART_TESTS tests, Legacy selected $LEGACY_TESTS tests"
# Show differences
log_warn "Analyzing differences..."
SMART_ONLY=$(comm -23 <(jq -r '.tests_to_run | sort | .[]' pr${PR_NUMBER}_smart_build.json) \
<(jq -r '.tests_to_run | sort | .[]' pr${PR_NUMBER}_legacy_tests.json) | wc -l)
LEGACY_ONLY=$(comm -13 <(jq -r '.tests_to_run | sort | .[]' pr${PR_NUMBER}_smart_build.json) \
<(jq -r '.tests_to_run | sort | .[]' pr${PR_NUMBER}_legacy_tests.json) | wc -l)
echo "Tests only in smart build: $SMART_ONLY"
echo "Tests only in legacy: $LEGACY_ONLY"
EXIT_CODE=1
fi
log_section "Summary"
echo "Validation complete for PR #$PR_NUMBER"
echo "Results saved to: $OUTPUT_FILE"
echo ""
echo "Smart build JSON: pr${PR_NUMBER}_smart_build.json"
if [ "$SKIP_LEGACY" = false ]; then
echo "Legacy JSON: pr${PR_NUMBER}_legacy_tests.json"
fi
exit $EXIT_CODE