# Build Trace Analysis

Simple, fast Python tools for analyzing Clang `-ftime-trace` build performance data.

## Overview

We're kicking off a systematic effort to dramatically reduce CK and CK-Tile build times (see [#3575](https://github.com/ROCm/composable_kernel/issues/3575)). A key part of this work is improving our C++ metaprogramming to reduce the burden on the compiler.

To prioritize work and measure our progress, we need data on template instantiation. For single files, Clang's `-ftime-trace` build performance data is easy to analyze with the Perfetto UI. The problem we are solving here is how to analyze instantiation data across thousands of compilation units.

The Python code in this directory provides helper functions that quickly load trace JSON files into pandas DataFrames for analysis in Jupyter notebooks.

## Directory Structure

```
script/analyze_build/
├── trace_analysis/                  # Core library
│   ├── __init__.py                  # Main exports
│   ├── parse_file.py                # Fast parsing of JSON trace files
│   ├── template_analysis.py         # Template instantiation analysis
│   ├── template_parser.py           # Template name parsing utilities
│   └── phase_breakdown.py           # Compilation phase breakdown
├── notebooks/                       # Jupyter notebooks for analysis
│   └── file_analysis_example.ipynb  # Template analysis example
├── requirements.txt                 # Python dependencies
└── README.md                        # This file
```

## Python Requirements

See `requirements.txt` for the complete list of dependencies:

* **pandas** - DataFrame manipulation and analysis
* **orjson** - Fast JSON parsing for trace files
* **plotly** - Interactive visualizations (sunburst, treemap)
* **nbformat** - Jupyter notebook format support
* **ipykernel** - Kernel for running notebooks in VSCode/Jupyter
* **kaleido** - Static image export from Plotly charts
* **jupyter** - Full Jupyter environment

## Quick Start

### Setup

1. Create a virtual environment (recommended):

   ```bash
   cd script/analyze_build
   python3 -m venv .venv
   source .venv/bin/activate  # On Windows: .venv\Scripts\activate
   ```

2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

3. Install VSCode extensions if you want to run notebooks in VSCode:
   * Jupyter
   * Data Wrangler (interact with pandas DataFrames)

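After installing, a quick check like the one below can confirm that the dependencies are visible to the active interpreter. This is only a sketch: it assumes each package's import name matches its name in `requirements.txt`, and it leaves out `jupyter` because that package is mainly used through its command-line entry point. The helper name `missing_modules` is made up for this example.

```python
from importlib.util import find_spec

# Assumption: import names match the package names listed in requirements.txt.
REQUIRED_MODULES = ["pandas", "orjson", "plotly", "nbformat", "ipykernel", "kaleido"]


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All checked modules are importable.")
```

Run it from inside the activated virtual environment; a non-empty list usually means `pip install` went to a different interpreter.
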
### Analyzing a Single File

Use the `parse_file` function to load a `-ftime-trace` JSON file into a pandas DataFrame:

```python
from trace_analysis import parse_file

# Parse the trace file
df = parse_file('path/to/trace.json')

# View basic info
print(f"Total events: {len(df)}")
print(df.columns)

# Analyze duration statistics (durations are in microseconds)
print(df['dur'].describe())
```

### Extracting Compilation Metadata

Get high-level metadata about the compilation:

```python
from trace_analysis import get_metadata

# Extract metadata from trace file
metadata = get_metadata('trace.json')

print(f"Source file: {metadata['source_file']}")
print(f"Compilation time: {metadata['total_wall_time_s']:.2f}s")
print(f"Started: {metadata['wall_start_datetime']}")
print(f"Ended: {metadata['wall_end_datetime']}")
```

The metadata includes:

- `source_file`: Main .cpp/.c file being compiled
- `time_granularity`: Time unit used ("microseconds")
- `beginning_of_time`: Epoch timestamp in microseconds
- `wall_start_time`: Wall clock start (microseconds since epoch)
- `wall_end_time`: Wall clock end (microseconds since epoch)
- `wall_start_datetime`: Human-readable start time
- `wall_end_datetime`: Human-readable end time
- `total_wall_time_us`: Total compilation time in microseconds
- `total_wall_time_s`: Total compilation time in seconds

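The timing fields above are related by simple arithmetic. The sketch below shows one plausible derivation, not necessarily how `get_metadata` computes them. It assumes event timestamps are offsets in microseconds from `beginning_of_time`, and the function name `derive_wall_times` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def derive_wall_times(beginning_of_time_us, events):
    """Derive wall-clock fields from (ts_us, dur_us) pairs of complete events.

    Assumption: ts values are offsets from beginning_of_time_us.
    """
    start_offset = min(ts for ts, _ in events)
    end_offset = max(ts + dur for ts, dur in events)
    wall_start = beginning_of_time_us + start_offset
    wall_end = beginning_of_time_us + end_offset
    total_us = wall_end - wall_start
    return {
        "wall_start_time": wall_start,
        "wall_end_time": wall_end,
        # timedelta keeps integer microsecond precision, unlike float division.
        "wall_start_datetime": EPOCH + timedelta(microseconds=wall_start),
        "wall_end_datetime": EPOCH + timedelta(microseconds=wall_end),
        "total_wall_time_us": total_us,
        "total_wall_time_s": total_us / 1_000_000,
    }


# Made-up values: one 5-second top-level event and one short nested event.
meta = derive_wall_times(1_700_000_000_000_000, [(0, 5_000_000), (100, 1_200)])
print(f"{meta['total_wall_time_s']:.2f}s, started {meta['wall_start_datetime']}")
```
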
### Template Instantiation Analysis

The module includes specialized functions for analyzing C++ template instantiation costs:

```python
from trace_analysis import (
    parse_file,
    get_template_instantiation_events,
)

df = parse_file('trace.json')

# Get all template instantiation events with parsed template information
template_events = get_template_instantiation_events(df)

# The returned DataFrame includes parsed columns:
# - namespace: Top-level namespace (e.g., 'std', 'ck')
# - template_name: Template name without parameters
# - full_qualified_name: Full namespace::template_name
# - param_count: Number of template parameters
# - is_ck_type: Boolean indicating CK library types
# - is_nested: Boolean indicating nested templates

# Find the 20 slowest template instantiations
top_templates = template_events.nlargest(20, 'dur')
print(top_templates[['template_name', 'namespace', 'param_count', 'dur']])

# Summarize count, total, and mean duration per namespace
namespace_summary = template_events.groupby('namespace').agg({
    'dur': ['count', 'sum', 'mean']
})
print(namespace_summary)
```

### Compilation Phase Breakdown

Analyze how compilation time is distributed across different phases:

```python
import plotly.express as px

from trace_analysis import parse_file, get_phase_breakdown

df = parse_file('trace.json')

# Get hierarchical phase breakdown (returns a PhaseBreakdown)
breakdown = get_phase_breakdown(df)

# Display in Jupyter (automatic rich HTML display)
display(breakdown)

# Print text representation
print(breakdown)

# Access the underlying DataFrame
print(breakdown.df)

# Convert to plotly format for visualization
data = breakdown.to_plotly()
fig = px.sunburst(**data)
fig.show()
```

The `PhaseBreakdown` class provides:

- Hierarchical breakdown of compilation phases
- Automatic calculation of "Other" residual time at each level
- Validation that children don't exceed parent durations
- Multiple output formats (text, DataFrame, Plotly)

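To illustrate the "Other" residual and the parent/child validation listed above, here is a minimal sketch over a plain nested dictionary. It is not the `PhaseBreakdown` implementation: the node layout (`name`, `dur`, `children`), the function name `add_other_residuals`, and the sample durations are all assumptions made for this example.

```python
def add_other_residuals(node):
    """Return a copy of the tree where every parent gains an 'Other' child.

    'Other' holds the parent's duration not covered by its children.
    Raises ValueError if the children sum to more than the parent.
    """
    children = [add_other_residuals(child) for child in node.get("children", [])]
    if children:
        child_total = sum(child["dur"] for child in children)
        if child_total > node["dur"]:
            raise ValueError(
                f"children of {node['name']!r} total {child_total} us, "
                f"more than the parent's {node['dur']} us"
            )
        residual = node["dur"] - child_total
        if residual > 0:
            children.append({"name": "Other", "dur": residual, "children": []})
    return {"name": node["name"], "dur": node["dur"], "children": children}


# Made-up durations in microseconds.
tree = {
    "name": "ExecuteCompiler", "dur": 5000,
    "children": [
        {"name": "Frontend", "dur": 3000, "children": [
            {"name": "Source", "dur": 1000, "children": []},
            {"name": "InstantiateClass", "dur": 1200, "children": []},
        ]},
        {"name": "Backend", "dur": 1500, "children": []},
    ],
}
result = add_other_residuals(tree)
for child in result["children"]:
    print(child["name"], child["dur"])
```

Leaf nodes get no "Other" entry because their whole duration is already attributed to the leaf itself.
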
## DataFrame Schema

The parsed DataFrame contains the following columns from the `-ftime-trace` format:

- `name`: Event name (function, template instantiation, etc.)
- `ph`: Phase character ('X' for complete, 'B' for begin, 'E' for end, 'i' for instant)
- `ts`: Timestamp in microseconds
- `dur`: Duration in microseconds (for complete events)
- `pid`: Process ID
- `tid`: Thread ID
- `arg_*`: Flattened arguments from the event's `args` field

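The `arg_*` flattening can be pictured with a small standard-library sketch. The trace below is made up but follows the Chrome Trace Event layout that `-ftime-trace` writes (a `traceEvents` list plus `beginningOfTime`). The helper `flatten_event` is hypothetical and only approximates what `parse_file` produces; the real function uses orjson and returns a DataFrame.

```python
import json

# Made-up sample in the layout that -ftime-trace emits.
SAMPLE_TRACE = json.dumps({
    "traceEvents": [
        {"ph": "X", "pid": 1, "tid": 1, "ts": 0, "dur": 5000,
         "name": "ExecuteCompiler"},
        {"ph": "X", "pid": 1, "tid": 1, "ts": 100, "dur": 1200,
         "name": "InstantiateClass",
         "args": {"detail": "ck::Tuple<int, float>"}},
        {"ph": "M", "pid": 1, "tid": 1, "ts": 0,
         "name": "process_name", "args": {"name": "clang"}},
    ],
    "beginningOfTime": 1700000000000000,
})


def flatten_event(event):
    """Keep top-level keys and rename each args entry to an arg_* key."""
    row = {key: value for key, value in event.items() if key != "args"}
    for key, value in event.get("args", {}).items():
        row[f"arg_{key}"] = value
    return row


rows = [flatten_event(event) for event in json.loads(SAMPLE_TRACE)["traceEvents"]]

# Only complete ('X') events carry a duration.
complete = [row for row in rows if row["ph"] == "X"]
for row in complete:
    print(row["name"], row["dur"], row.get("arg_detail", ""))
```
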
### Template Event Columns

When using `get_template_instantiation_events()`, additional parsed columns are included:

- `namespace`: Top-level namespace extracted from the template name
- `template_name`: Template name without namespace or parameters
- `full_qualified_name`: Complete namespace::template_name
- `param_count`: Number of template parameters
- `is_ck_type`: Boolean flag for CK library types (namespace starts with 'ck')
- `is_nested`: Boolean flag indicating nested template instantiations

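As a rough illustration of how these columns could be derived from an instantiation name, consider the sketch below. It is not the code in `template_parser.py`: the function `parse_template_name` is hypothetical, it treats "nested" as "some template argument is itself a template", and it ignores corner cases such as comparison operators inside non-type arguments.

```python
def parse_template_name(detail):
    """Split a name such as 'ck::Tuple<int, ck::Sequence<1, 2>>' into columns."""
    head, has_params, rest = detail.partition("<")
    parts = head.split("::")
    namespace = parts[0] if len(parts) > 1 else ""
    # Text between the outermost angle brackets, or empty if there are none.
    params = rest.rsplit(">", 1)[0] if has_params else ""

    # Count commas at nesting depth zero to get the top-level parameter count.
    depth = 0
    param_count = 1 if params.strip() else 0
    for char in params:
        if char in "<(":
            depth += 1
        elif char in ">)":
            depth -= 1
        elif char == "," and depth == 0:
            param_count += 1

    return {
        "namespace": namespace,
        "template_name": parts[-1],
        "full_qualified_name": head,
        "param_count": param_count,
        "is_ck_type": namespace.startswith("ck"),
        "is_nested": "<" in params,
    }


parsed = parse_template_name("ck::Tuple<int, ck::Sequence<1, 2>>")
print(parsed)
```

Counting commas only at depth zero keeps `ck::Sequence<1, 2>` from being counted as two separate parameters of `ck::Tuple`.
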
## Use in Jupyter Notebooks

The module is designed to work in Jupyter notebooks. See `notebooks/file_analysis_example.ipynb` for a complete example workflow that demonstrates:

- Loading and parsing trace files
- Extracting compilation metadata
- Analyzing phase breakdown with visualizations
- Template instantiation analysis with parsed columns
- Filtering and grouping by namespace
- Identifying CK-specific template costs

To use in a notebook:

```python
import sys
from pathlib import Path

import plotly.express as px

# Make trace_analysis importable (assumes the notebook runs from notebooks/)
sys.path.insert(0, str(Path.cwd().parent))

from trace_analysis import (
    parse_file,
    get_metadata,
    get_template_instantiation_events,
    get_phase_breakdown,
)

# Load and analyze
df = parse_file('path/to/trace.json')
breakdown = get_phase_breakdown(df)
templates = get_template_instantiation_events(df)

# Visualize
fig = px.sunburst(**breakdown.to_plotly())
fig.show()
```

## API Reference

### Core Functions

- `parse_file(filepath)`: Parse a `-ftime-trace` JSON file into a pandas DataFrame
- `get_metadata(filepath_or_df)`: Extract compilation metadata from trace file or DataFrame

### Template Analysis

- `get_template_instantiation_events(df)`: Filter to template instantiation events with parsed template information

### Phase Breakdown

- `get_phase_breakdown(df)`: Generate hierarchical compilation phase breakdown
- `PhaseBreakdown`: Class representing phase breakdown with multiple output formats

## Contributing

This is an experimental project for analyzing and improving C++ metaprogramming build times. Contributions are welcome! When adding new analysis functions:

1. Add the function to the appropriate module in `trace_analysis/`
2. Export it in `__init__.py`
3. Update this README with usage examples
4. Consider adding a notebook example if the feature is substantial

## License

Copyright (c) Advanced Micro Devices, Inc., or its affiliates.
SPDX-License-Identifier: MIT