## Summary
Add historical trend visualization (ASCII sparklines + directional
arrows) to the performance PR report, showing how each metric has moved
over recent commits on main.
## Changes
- **What**: New `sparkline()`, `trendDirection()`, `trendArrow()`
functions in `perf-stats.ts`. New collapsible "Trend" section in the
perf report showing per-metric sparklines, direction indicators, and
latest values. CI workflow updated to download historical data from the
`perf-data` orphan branch and switched to `setup-frontend` action with
`pnpm exec tsx`.
## Review Focus
- The trend section only renders when ≥3 historical data points exist
(gracefully absent otherwise)
- `trendDirection()` uses a split-half mean comparison with ±10%
threshold — review whether this sensitivity is appropriate
- The `git archive` step in `pr-perf-report.yaml` is idempotent and
fails silently if no perf-history data exists yet on the perf-data
branch
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-9939-feat-add-trend-visualization-with-sparklines-to-perf-report-3246d73d36508125a6fcc39612f850fe)
by [Unito](https://www.unito.io)
---------
Co-authored-by: GitHub Action <action@github.com>
## Summary
Replace fixed 10%/20% perf delta thresholds with dynamic σ-based
classification using z-scores, eliminating false alarms from naturally
noisy duration metrics (10-17% CV).
## Changes
- **What**:
- Run each perf test 3× (`--repeat-each=3`) and report the mean,
reducing single-run noise
- Download last 5 successful main branch perf artifacts to compute
historical μ/σ per metric
- Replace fixed threshold flags with z-score significance: `⚠️
regression` (z>2), `✅ neutral/improvement`, `🔇 noisy` (CV>50%)
- Add collapsible historical variance table (μ, σ, CV) to PR comment
- Graceful cold start: falls back to simple delta table until ≥2
historical runs exist
- New `scripts/perf-stats.ts` module with `computeStats`, `zScore`,
`classifyChange`
- 18 unit tests for stats functions
- **CI time impact**: ~3 min → ~5-6 min (repeat-each adds ~2 min,
historical download <10s)
## Review Focus
- The `gh api` call in the new "Download historical perf baselines"
step: it queries the last 5 successful push runs on the base branch. The
`gh` CLI is available natively on `ubuntu-latest` runners and
auto-authenticates with `GITHUB_TOKEN`.
- `getHistoricalStats` averages per-run measurements before computing
cross-run σ — this is intentional since historical artifacts may also
contain repeated measurements after this change lands.
- The `noisy` classification (CV>50%) suppresses metrics like `layouts`
that hover near 0 and have meaningless percentage swings.
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-9305-feat-add-statistical-significance-to-perf-report-with-z-score-thresholds-3156d73d3650818d9360eeafd9ae7dc1)
by [Unito](https://www.unito.io)