Update build analyzer for better usability

John Shumway
2025-12-18 19:48:34 -05:00
parent 94daf3aa65
commit 814b609476
2 changed files with 2307 additions and 41 deletions


@@ -6,10 +6,17 @@ Simple, fast tools for analyzing Clang `-ftime-trace` build performance data.
This directory provides straightforward Python tools for analyzing the JSON trace files generated during compilation with `-ftime-trace`. The focus is on simplicity and speed: no caching, no complexity, just fast parallel I/O and pandas DataFrames.
**Key principle: Fresh analysis every time is faster and simpler than managing caches.**
## Quick Start
Configure a build directory `build-trace` and edit `CMakeCache.txt` to add `-ftime-trace` to `CMAKE_CXX_FLAGS`. With `-ftime-trace` enabled, Clang writes a `.json` trace file alongside each compiled object file. These trace files contain detailed timing information about:
- Template instantiations
- Function parsing
- Code generation phases
- Optimization passes
These JSON files are what the analysis tools in `script/analyze_build/` are designed to process.
```bash
# Analyze all trace files in a directory
cd script/analyze_build/examples
@@ -57,6 +64,7 @@ python examples/analyze_build.py ../../build-trace
```
This will:
- Find all `.json` files recursively
- Process them in parallel using all CPU cores
- Display comprehensive build statistics
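The first two steps above can be sketched roughly as follows. This is a minimal illustration, not the code in `script/analyze_build/`: the helper names `find_trace_files`, `load_events`, and `load_all` are hypothetical, and it uses a thread pool from the standard library rather than whatever parallel mechanism the real tool uses. It assumes each trace file is a JSON object with a top-level `traceEvents` list, and treats any other `.json` file in the build tree as empty.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def find_trace_files(root):
    """Return every .json file under root, sorted for a stable order."""
    return sorted(Path(root).rglob("*.json"))


def load_events(path):
    """Return the traceEvents list of one file, or [] if it is not a trace."""
    try:
        with open(path, "r", encoding="utf-8") as fh:
            data = json.load(fh)
    except (OSError, ValueError):
        # Unreadable or malformed files are skipped rather than aborting the run.
        return []
    if not isinstance(data, dict):
        return []
    events = data.get("traceEvents", [])
    return events if isinstance(events, list) else []


def load_all(root, max_workers=None):
    """Parse all .json files under root concurrently; return {path: events}."""
    paths = find_trace_files(root)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(load_events, paths))
    return dict(zip(paths, results))
```

Calling `load_all("../../build-trace")` would then give one event list per file, ready to be turned into a DataFrame or summarized directly.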
@@ -93,6 +101,7 @@ print(f"Template time: {templates_df['dur'].sum() / 1e6:.2f}s")
For interactive analysis, see the comprehensive example notebook:
**[notebooks/comprehensive_example.ipynb](notebooks/comprehensive_example.ipynb)** - Complete guide covering:
- Single file analysis with detailed explanations
- Multi-file parallel processing
- Build-wide statistics and template analysis
@@ -130,11 +139,13 @@ print(event_totals.head(10))
## Performance
**Typical performance on 4,484 trace files (~46 GB):**
- Parsing: ~26 seconds (174 files/sec)
- Memory: ~1-2 GB
- Throughput: I/O limited (uses all CPU cores)
**Why no caching?**
- Re-parsing every file from scratch costs less than the overhead of managing a cache
- Simpler code (60% less code than cached version)
- No cache invalidation issues
@@ -164,6 +175,7 @@ The trace files use the [Chrome Trace Event Format](https://docs.google.com/docu
```
**Key fields:**
- `name`: Event type (e.g., "InstantiateClass", "ParseFunctionDefinition")
- `dur`: Duration in microseconds
- `ts`: Timestamp in microseconds
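To make the units concrete, here is a small dependency-free sketch that totals `dur` per `name` and converts microseconds to seconds. The function name `total_seconds_by_name` and the sample events are hypothetical, made up only to mirror the fields listed above; the tools in this directory do the equivalent with pandas.

```python
from collections import defaultdict


def total_seconds_by_name(events):
    """Sum `dur` (microseconds) per `name`; return (name, seconds), largest first."""
    totals = defaultdict(int)
    for event in events:
        # Some events carry no duration, so skip anything without `dur`.
        if "dur" in event and "name" in event:
            totals[event["name"]] += event["dur"]
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [(name, micros / 1e6) for name, micros in ranked]


# Hypothetical events, shaped like the fields described above.
sample = [
    {"name": "InstantiateClass", "ts": 100, "dur": 1_500_000},
    {"name": "ParseFunctionDefinition", "ts": 200, "dur": 250_000},
    {"name": "InstantiateClass", "ts": 900, "dur": 500_000},
    {"name": "process_name", "ts": 0},  # no `dur`: ignored
]

for name, seconds in total_seconds_by_name(sample):
    print(f"{name}: {seconds:.2f}s")
```

Dividing by `1e6` is the same conversion the README's own snippets apply to `dur` sums.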
@@ -248,7 +260,7 @@ template_time = templates_df['dur'].sum()
print(f"Template time: {(template_time / total_time) * 100:.1f}%")
```
## Tips
## Build time analysis philosophy
- **Use all CPU cores**: The tools automatically use all available cores for parallel processing
- **Memory is cheap**: 1-2 GB for 4,484 files is acceptable on modern systems

File diff suppressed because it is too large