Commit Graph

4 Commits

Author SHA1 Message Date
Christian Byrne
a7218b2922 fix: fix perf CI pipeline — z-score baselines, force-push staleness, baseline storage (#9886)
## Summary

Fixes three critical issues with the CI performance reporting pipeline
that made perf reports useless on PRs (demonstrated by PR #9248 — deep
watcher removal merged without useful perf signal).

## Changes

### 1. Fix z-score baseline variance collection (`0/5 runs`)
**Root cause:** PR #9305 added z-score statistical analysis code to
`perf-report.ts`, but the historical data download step was placed in
the wrong workflow file. The report is generated in
`pr-perf-report.yaml` (a `workflow_run`-triggered job), but the
historical download was in `ci-perf-report.yaml` (the test runner) —
different runners, different filesystems.

**Fix:** Implement `perf-data` orphan branch storage:
- On push to main: save `perf-metrics.json` to `perf-data` branch with
timestamped filename
- On PR report: fetch last 5 baselines from `perf-data` branch into
`temp/perf-history/`
- Rolling window of 20 baselines, oldest pruned automatically
- Same pattern used by `github-action-benchmark` (33.7k repos)

### 2. Fix force-push comment staleness
**Root cause:** `cancel-in-progress: true` kills the perf test run
before it uploads artifacts. The downstream report workflow only
triggers on `conclusion == 'success'` — cancelled runs are ignored, so
the comment from the first successful run goes stale.

**Fix:**
- Change `cancel-in-progress: false` — with GitHub's queue depth of 1,
rapid pushes (A,B,C,D) run A and D, skipping B and C
- Add SHA validation in `pr-perf-report.yaml` — before posting, check if
the workflow_run's head SHA still matches the PR's current head. Skip
posting stale results.

### 3. Add permissions for baseline operations
- `contents: write` on CI job (needed for pushing to perf-data branch)
- `actions: read` on both workflows (needed for artifact/baseline
access)

## One-time setup required
After merging, create the `perf-data` orphan branch:
```bash
git checkout --orphan perf-data
git rm -rf .
echo '# Performance Baselines' > README.md
mkdir -p baselines
git add README.md baselines
git commit -m 'Initialize perf-data branch'
git push origin perf-data
```

The first 2 pushes to main after setup will build up variance data, and
z-scores will start appearing in PR reports (threshold is
`historical.length >= 2`).

## Testing
- YAML validated with `yaml.safe_load()`
- `perf-report.ts` `loadHistoricalReports()` already reads from
`temp/perf-history/<index>/perf-metrics.json` — no code changes needed
- All new steps use `continue-on-error: true` for graceful degradation

┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-9886-fix-fix-perf-CI-pipeline-z-score-baselines-force-push-staleness-baseline-storage-3226d73d365081538424c7945e71f308)
by [Unito](https://www.unito.io)
2026-03-15 02:46:10 -07:00
Christian Byrne
6689a1b14e fix: split perf report workflow for fork PR support (#9382)
## Summary

Perf report workflow fails on fork PRs because `GITHUB_TOKEN` is
read-only for forks, causing "Resource not accessible by integration" on
the PR comment step.

## Changes

- **What**: Split `ci-perf-report.yaml` into a data-collection workflow
+ a `workflow_run`-triggered reporter (`pr-perf-report.yaml`), matching
the existing `ci-size-data`/`pr-size-report` pattern. Added fork PR
permissions guidance to `.github/AGENTS.md`.
- **ci-perf-report.yaml**: Removed the `report` job and `pull-requests:
write` permission. Added PR metadata (number + base branch) artifact
upload.
- **pr-perf-report.yaml** (new): Triggered by `workflow_run` on the perf
workflow. Downloads metrics + metadata artifacts, generates report,
posts PR comment with write permissions from the default-branch context.

## Review Focus

- The two-workflow split follows the same pattern as `ci-size-data.yaml`
→ `pr-size-report.yaml`, which already works for fork PRs.
- The `workflow_run` trigger runs in the base repo context per [GitHub
Security Lab
guidance](https://securitylab.github.com/resources/github-actions-preventing-pwn-requests/),
so it safely has write permissions even for fork PRs.
- AGENTS.md guidance documents this pattern to prevent recurrence.

Fixes the failure seen in
https://github.com/Comfy-Org/ComfyUI_frontend/actions/runs/22684230751/job/65763595989?pr=9380

┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-9382-fix-split-perf-report-workflow-for-fork-PR-support-3196d73d365081b29b35ed354e7789e2)
by [Unito](https://www.unito.io)
2026-03-04 22:09:31 -08:00
Christian Byrne
7b316eb9a2 feat: add statistical significance to perf report with z-score thresholds (#9305)
## Summary

Replace fixed 10%/20% perf delta thresholds with dynamic σ-based
classification using z-scores, eliminating false alarms from naturally
noisy duration metrics (10-17% CV).

## Changes

- **What**:
- Run each perf test 3× (`--repeat-each=3`) and report the mean,
reducing single-run noise
- Download last 5 successful main branch perf artifacts to compute
historical μ/σ per metric
- Replace fixed threshold flags with z-score significance: `⚠️
regression` (z>2), ` neutral/improvement`, `🔇 noisy` (CV>50%)
  - Add collapsible historical variance table (μ, σ, CV) to PR comment
- Graceful cold start: falls back to simple delta table until ≥2
historical runs exist
- New `scripts/perf-stats.ts` module with `computeStats`, `zScore`,
`classifyChange`
  - 18 unit tests for stats functions

- **CI time impact**: ~3 min → ~5-6 min (repeat-each adds ~2 min,
historical download <10s)

## Review Focus

- The `gh api` call in the new "Download historical perf baselines"
step: it queries the last 5 successful push runs on the base branch. The
`gh` CLI is available natively on `ubuntu-latest` runners and
auto-authenticates with `GITHUB_TOKEN`.
- `getHistoricalStats` averages per-run measurements before computing
cross-run σ — this is intentional since historical artifacts may also
contain repeated measurements after this change lands.
- The `noisy` classification (CV>50%) suppresses metrics like `layouts`
that hover near 0 and have meaningless percentage swings.

┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-9305-feat-add-statistical-significance-to-perf-report-with-z-score-thresholds-3156d73d3650818d9360eeafd9ae7dc1)
by [Unito](https://www.unito.io)
2026-03-04 16:16:53 -08:00
Christian Byrne
8e215b3174 feat: add performance testing infrastructure with CDP metrics (#9170)
## Summary

Add a permanent, non-failing performance regression detection system
using Chrome DevTools Protocol metrics, with automatic PR commenting.

## Changes

- **What**: Performance testing infrastructure — `PerformanceHelper`
fixture class using CDP `Performance.getMetrics` to collect
`RecalcStyleCount`, `LayoutCount`, `LayoutDuration`, `TaskDuration`,
`JSHeapUsedSize`. Adds `@perf` Playwright project (Chromium-only,
single-threaded, 60s timeout), 4 baseline perf tests, CI workflow with
sticky PR comment reporting, and `perf-report.js` script for generating
markdown comparison tables.

## Review Focus

- `PerformanceHelper` uses `page.context().newCDPSession(page)` — CDP is
Chromium-only, so perf metrics are not collected on Firefox. This is
intentional since CDP gives us browser-level style recalc/layout counts
that `performance.mark/measure` cannot capture.
- The CI workflow uses `continue-on-error: true` so perf tests never
block merging.
- Baseline comparison uses `dawidd6/action-download-artifact` to
download metrics from the target branch, following the same pattern as
`pr-size-report.yaml`.

## Stack

This is the foundation PR for the Firefox performance fix stack:
1. **→ This PR: perf testing infrastructure**
2. `perf/fix-cursor-cache` — cursor style caching (depends on this)
3. `perf/fix-subgraph-svg` — SVG pre-rasterization (depends on this)
4. `perf/fix-clippath-raf` — RAF batching for clip-path (depends on
this)

PRs 2-4 are independent of each other.

┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-9170-feat-add-performance-testing-infrastructure-with-CDP-metrics-3116d73d3650817cb43def6f8e9917f8)
by [Unito](https://www.unito.io)

---------

Co-authored-by: GitHub Action <action@github.com>
Co-authored-by: Alexander Brown <drjkl@comfy.org>
2026-02-25 20:09:57 -08:00