feat: badge shows reproduction method (E2E test / video / both)

- Add reproducedBy field to ResearchResult and done() tool
- Agent reports how bug was proven: e2e_test, video, both, or none
- Badge shows '1 via E2E test' instead of generic '1 reproduced'
- Deploy script reads reproducedBy from research-log.json
- Test code (reproduce.spec.ts) now deployed to report page

Amp-Thread-ID: https://ampcode.com/threads/T-019d4786-eb5f-7115-a10e-5b086c921800
Co-authored-by: Amp <amp@ampcode.com>
fix: prevent trivial assertions from false REPRODUCED + deploy test code
2026-04-04 19:49:09 +00:00 · 2026-04-01 13:45:21 +00:00 · 2026-04-01 13:35:38 +00:00 · 2026-04-01 11:12:57 +00:00 · 2026-04-01 10:37:19 +00:00 · 2026-04-01 10:01:02 +00:00
27 changed files with 7861 additions and 251 deletions
--- a/.claude/skills/comfy-qa/REPRODUCE.md
+++ b/.claude/skills/comfy-qa/REPRODUCE.md
@@ -0,0 +1,278 @@
+---
+name: reproduce-issue
+description: 'Reproduce a GitHub issue by researching prerequisites, setting up the environment (custom nodes, workflows, settings), and interactively exploring ComfyUI via playwright-cli until the bug is confirmed. Then records a clean demo video.'
+---
+
+# Issue Reproduction Skill
+
+Reproduce a reported GitHub issue against a running ComfyUI instance. This skill uses an interactive, agent-driven approach — not a static script. You will research, explore, retry, and adapt until the bug is reproduced, then record a clean demo.
+
+## Architecture
+
+Two videos are produced:
+
+1. **Research video** — the full exploration session: installing deps, trying things, failing, retrying, figuring out the bug. Valuable for debugging context.
+2. **Reproduce video** — a clean, minimal recording of just the reproduction steps. This is the demo you'd attach to the issue.
+
+```
+Phase 1: Research       → Read issue, understand prerequisites
+Phase 2: Environment    → Install custom nodes, load workflows, configure settings
+Phase 3: Explore        → [VIDEO 1: research] Interactively try to reproduce (retries OK)
+Phase 4: Record         → [VIDEO 2: reproduce] Clean recording of just the minimal repro steps
+Phase 5: Report         → Generate a structured reproduction report
+```
+
+## Prerequisites
+
+- ComfyUI server running (ask user for URL, default: `http://127.0.0.1:8188`)
+- `playwright-cli` installed: `npm install -g @playwright/cli@latest`
+- `gh` CLI (authenticated, for reading issues)
+- ComfyUI backend with Python environment (for installing custom nodes)
+
+## Phase 1: Research the Issue
+
+1. Fetch the issue details:
+
+   ```bash
+   gh issue view <number> --repo Comfy-Org/ComfyUI_frontend --json title,body,comments
+   ```
+
+2. Extract from the issue body:
+   - **Reproduction steps** (the exact sequence)
+   - **Prerequisites**: specific workflows, custom nodes, settings, models
+   - **Environment**: OS, browser, ComfyUI version
+   - **Media**: screenshots or videos showing the bug
+
+3. Search the codebase for related code:
+   - Find the feature/component mentioned in the issue
+   - Understand how it works currently
+   - Identify what state the UI needs to be in
+
+## Phase 2: Environment Setup
+
+Set up everything the issue requires BEFORE attempting reproduction.
+
+### Custom Nodes
+
+If the issue mentions custom nodes:
+
+```bash
+# Find the custom node repo
+# Clone into ComfyUI's custom_nodes directory
+cd <comfyui_path>/custom_nodes
+git clone <custom_node_repo_url>
+
+# Install dependencies if needed
+cd <custom_node_name>
+pip install -r requirements.txt 2>/dev/null || true
+
+# Restart ComfyUI server to load the new nodes
+```
+
+### Workflows
+
+If the issue references a specific workflow:
+
+```bash
+# Download workflow JSON if a URL is provided
+curl -L "<workflow_url>" -o /tmp/test-workflow.json
+
+# Load it via the API
+curl -X POST http://127.0.0.1:8188/api/workflow \
+  -H "Content-Type: application/json" \
+  -d @/tmp/test-workflow.json
+```
+
+Or load via playwright-cli:
+
+```bash
+playwright-cli goto "http://127.0.0.1:8188"
+# Drag-and-drop or use File > Open to load the workflow
+```
+
+### Settings
+
+If the issue requires specific settings:
+
+```bash
+# Use playwright-cli to open settings and change them
+playwright-cli press "Control+,"
+playwright-cli snapshot
+# Find and modify the relevant setting
+```
+
+## Phase 3: Interactive Exploration — Research Video
+
+Start recording the **research video** (Video 1). This captures the full exploration — mistakes, retries, dead ends — all valuable context.
+
+```bash
+# Open browser and start video recording
+playwright-cli open "http://127.0.0.1:8188"
+playwright-cli video-start
+
+# Take a snapshot to see current state
+playwright-cli snapshot
+
+# Interact based on what you see
+playwright-cli click <ref>
+playwright-cli fill <ref> "text"
+playwright-cli press "Control+s"
+
+# Check results
+playwright-cli snapshot
+playwright-cli screenshot --filename=/tmp/qa/research-step-1.png
+```
+
+### Key Principles
+
+- **Observe before acting**: Always `snapshot` before interacting
+- **Retry and adapt**: If a step fails, try a different approach
+- **Document what works**: Keep notes on which steps trigger the bug
+- **Don't give up**: Try multiple approaches if the first doesn't work
+- **Establish prerequisites**: Many bugs require specific UI state:
+  - Save a workflow first (File > Save)
+  - Make changes to dirty the workflow
+  - Open multiple tabs
+  - Add specific node types
+  - Change settings
+  - Resize the window
+
+### Common ComfyUI Interactions via playwright-cli
+
+| Action              | Command                                                        |
+| ------------------- | -------------------------------------------------------------- |
+| Open hamburger menu | `playwright-cli click` on the C logo button                    |
+| Navigate menu       | `playwright-cli hover <ref>` then `playwright-cli click <ref>` |
+| Add node            | Double-click canvas → type node name → select from results     |
+| Connect nodes       | Drag from output slot to input slot                            |
+| Save workflow       | `playwright-cli press "Control+s"`                             |
+| Save As             | Menu > File > Save As                                          |
+| Select node         | Click on the node                                              |
+| Delete node         | Select → `playwright-cli press "Delete"`                       |
+| Right-click menu    | `playwright-cli click <ref> --button right`                    |
+| Keyboard shortcut   | `playwright-cli press "Control+z"`                             |
+
+## Phase 4: Record Clean Demo — Reproduce Video (max 5 minutes)
+
+Once the bug is confirmed, **stop the research video** and **close the research browser**:
+
+```bash
+playwright-cli video-stop
+playwright-cli close
+```
+
+Now start a **fresh browser session** for the clean reproduce video (Video 2).
+
+**IMPORTANT constraints:**
+
+- **Max 5 minutes** — the reproduce video must be short and focused
+- **No environment setup** — server, user, custom nodes are already set up from Phase 2. Just log in and go.
+- **No exploration** — you already know the exact steps. Execute them quickly and precisely.
+- **Start video recording immediately**, execute steps, stop. Don't leave the recording running while thinking.
+
+1. **Open browser and start recording**:
+
+   ```bash
+   playwright-cli open "http://127.0.0.1:8188"
+   playwright-cli video-start
+   ```
+
+2. **Execute only the minimal reproduction steps** — no exploration, no mistakes. Just the clean sequence that demonstrates the bug. You already know exactly what works from Phase 3.
+
+3. **Take key screenshots** at critical moments:
+
+   ```bash
+   playwright-cli screenshot --filename=/tmp/qa/before-bug.png
+   # ... trigger the bug ...
+   playwright-cli screenshot --filename=/tmp/qa/bug-visible.png
+   ```
+
+4. **Stop recording and close** immediately after the bug is demonstrated:
+   ```bash
+   playwright-cli video-stop
+   playwright-cli close
+   ```
+
+## Phase 5: Generate Report
+
+Create a reproduction report at `tmp/qa/reproduce-report.md`:
+
+```markdown
+# Issue Reproduction Report
+
+- **Issue**: <issue_url>
+- **Title**: <issue_title>
+- **Date**: <today>
+- **Status**: Reproduced / Not Reproduced / Partially Reproduced
+
+## Environment
+
+- ComfyUI Server: <url>
+- OS: <os>
+- Custom Nodes Installed: <list or "none">
+- Settings Changed: <list or "none">
+
+## Prerequisites
+
+List everything that had to be set up before the bug could be triggered:
+
+1. ...
+2. ...
+
+## Reproduction Steps
+
+Minimal steps to reproduce (the clean sequence):
+
+1. ...
+2. ...
+3. ...
+
+## Expected Behavior
+
+<from the issue>
+
+## Actual Behavior
+
+<what actually happened>
+
+## Evidence
+
+- Research video: `research-video/video.webm` (full exploration session)
+- Reproduce video: `reproduce-video/video.webm` (clean minimal repro)
+- Screenshots: `before-bug.png`, `bug-visible.png`
+
+## Root Cause Analysis (if identified)
+
+<code pointers, hypothesis about what's going wrong>
+
+## Notes
+
+<any additional observations, workarounds discovered, related issues>
+```
+
+## Handling Failures
+
+If the bug **cannot be reproduced**:
+
+1. Document what you tried and why it didn't work
+2. Check if the issue was already fixed (search git log for related commits)
+3. Check if it's environment-specific (OS, browser, specific version)
+4. Set report status to "Not Reproduced" with detailed notes
+5. The report is still valuable — it saves others from repeating the same investigation
+
+## CI Integration
+
+In CI, this skill runs as a Claude Code agent with:
+
+- `ANTHROPIC_API_KEY` for Claude
+- `GEMINI_API_KEY` for initial issue analysis (optional)
+- ComfyUI server pre-started in the container
+- `playwright-cli` pre-installed
+
+The CI workflow:
+
+1. Gemini generates a reproduce guide (markdown) from the issue
+2. Claude agent receives the guide and runs this skill
+3. Claude explores interactively, installs dependencies, retries
+4. Claude records a clean demo once reproduced
+5. Video and report are uploaded as artifacts
--- a/.claude/skills/comfy-qa/SKILL.md
+++ b/.claude/skills/comfy-qa/SKILL.md
@@ -0,0 +1,277 @@
+---
+name: comfy-qa
+description: 'Comprehensive QA of ComfyUI frontend. Navigates all routes, tests all interactive features using playwright-cli, generates a report, and submits a draft PR. Works in CI and local environments, cross-platform.'
+---
+
+# ComfyUI Frontend QA Skill
+
+Automated quality assurance for the ComfyUI frontend. The pipeline reproduces reported bugs using Playwright E2E tests, records video evidence, and deploys reports to Cloudflare Pages.
+
+## Architecture Overview
+
+The QA pipeline uses a **three-phase approach**:
+
+1. **RESEARCH** — Claude writes Playwright E2E tests to reproduce bugs (assertion-backed, no hallucination)
+2. **REPRODUCE** — Deterministic replay of the research test with video recording
+3. **REPORT** — Deploy results to Cloudflare Pages with badge, video, and verdict
+
+### Key Design Decision
+
+Earlier iterations used AI vision (Gemini) to drive a browser and judge results from video. This was abandoned after discovering **AI reviewers hallucinate** — Gemini reported "REPRODUCED" when videos showed idle screens. The current approach uses **Playwright assertions** as the source of truth: if the test passes, the bug is proven.
+
+## Prerequisites
+
+- Node.js 22+
+- `pnpm` package manager
+- `gh` CLI (authenticated)
+- Playwright browsers: `npx playwright install chromium`
+- Environment variables:
+  - `GEMINI_API_KEY` — for PR analysis and video review
+  - `ANTHROPIC_API_KEY` — for Claude Agent SDK (research phase)
+  - `CLOUDFLARE_API_TOKEN` + `CLOUDFLARE_ACCOUNT_ID` — for report deployment
+
+## Pipeline Scripts
+
+| Script | Role | Model |
+|---|---|---|
+| `scripts/qa-analyze-pr.ts` | Deep PR/issue analysis → QA guide | gemini-3.1-pro-preview |
+| `scripts/qa-agent.ts` | Research phase: Claude writes E2E tests | claude-sonnet-4-6 (Agent SDK) |
+| `scripts/qa-record.ts` | Before/after video recording with Gemini-driven actions | gemini-3.1-pro-preview |
+| `scripts/qa-reproduce.ts` | Deterministic replay with narration | gemini-3-flash-preview |
+| `scripts/qa-video-review.ts` | Video comparison review | gemini-3-flash-preview |
+| `scripts/qa-generate-test.ts` | Regression test generation from QA report | gemini-3-flash-preview |
+| `scripts/qa-deploy-pages.sh` | Deploy to Cloudflare Pages + badge | — |
+| `scripts/qa-batch.sh` | Batch-trigger QA for multiple issues | — |
+| `scripts/qa-report-template.html` | Report site (light/dark, seekbar, copy badge) | — |
+
+## Triggering QA
+
+### Via GitHub Labels
+
+- **`qa-changes`** — Focused QA on a PR (Linux-only, before/after comparison)
+- **`qa-full`** — Full QA (3-OS matrix, after-only)
+- **`qa-issue`** — Reproduce a bug from an issue
+
+### Via Batch Script
+
+```bash
+# Trigger QA for specific issue numbers
+./scripts/qa-batch.sh 10394 10238 9996
+
+# From a triage file (top 5 Tier 1 issues)
+./scripts/qa-batch.sh --from tmp/issues.md --top 5
+
+# Preview without pushing
+./scripts/qa-batch.sh --dry-run 10394
+
+# Clean up old trigger branches
+./scripts/qa-batch.sh --cleanup
+```
+
+### Via Workflow Dispatch
+
+Go to Actions → "PR: QA" → Run workflow → choose mode (focused/full).
+
+## CI Workflow (`.github/workflows/pr-qa.yaml`)
+
+```
+resolve-matrix → analyze-pr ──┐
+                               ├→ qa-before (main branch, worktree build)
+                               ├→ qa-after  (PR branch)
+                               └→ report (video review, deploy, comment)
+```
+
+Before/after jobs run **in parallel** on separate runners for clean isolation.
+
+### Issue Reproduce Mode
+
+For issues (not PRs), the pipeline:
+1. Fetches the issue body and comments
+2. Runs `qa-analyze-pr.ts --type issue` to generate a QA guide
+3. Runs the research phase (Claude writes E2E test to reproduce)
+4. Records video of the test execution
+5. Posts results as a comment on the issue
+
+## Running Locally
+
+### Step 1: Environment Setup
+
+```bash
+# Ensure ComfyUI server is running
+# Default: http://127.0.0.1:8188
+
+# Install Playwright browsers
+npx playwright install chromium
+```
+
+### Step 2: Analyze the Issue/PR
+
+```bash
+# For a PR
+pnpm exec tsx scripts/qa-analyze-pr.ts \
+  --pr-number 10394 \
+  --repo Comfy-Org/ComfyUI_frontend \
+  --output-dir qa-guides
+
+# For an issue
+pnpm exec tsx scripts/qa-analyze-pr.ts \
+  --pr-number 10394 \
+  --repo Comfy-Org/ComfyUI_frontend \
+  --output-dir qa-guides \
+  --type issue
+```
+
+### Step 3: Record Before/After
+
+```bash
+# Before (main branch)
+pnpm exec tsx scripts/qa-record.ts \
+  --mode before \
+  --diff /tmp/pr-diff.txt \
+  --output-dir /tmp/qa-before \
+  --qa-guide qa-guides/qa-guide-1.json
+
+# After (PR branch)
+pnpm exec tsx scripts/qa-record.ts \
+  --mode after \
+  --diff /tmp/pr-diff.txt \
+  --output-dir /tmp/qa-after \
+  --qa-guide qa-guides/qa-guide-1.json
+```
+
+### Step 4: Review Videos
+
+```bash
+pnpm exec tsx scripts/qa-video-review.ts \
+  --artifacts-dir /tmp/qa-artifacts \
+  --video-file qa-session.mp4 \
+  --before-video qa-before-session.mp4 \
+  --output-dir /tmp/video-reviews \
+  --pr-context /tmp/pr-context.txt
+```
+
+## Research Phase Details (`qa-agent.ts`)
+
+Claude receives:
+- The issue description and comments
+- A QA guide from `qa-analyze-pr.ts`
+- An accessibility tree snapshot of the current UI
+
+Claude's tools:
+- **`inspect(selector?)`** — Read a11y tree to discover element selectors
+- **`writeTest(code)`** — Write a Playwright `.spec.ts` file
+- **`runTest()`** — Execute the test and get pass/fail + errors
+- **`done(verdict, summary, evidence, testCode)`** — Finish with verdict
+
+The test uses the project's Playwright fixtures (`comfyPageFixture`), giving access to `comfyPage.page`, `comfyPage.menu`, `comfyPage.settings`, etc.
+
+### Verdict Logic
+
+- **REPRODUCED** — Test passes (asserting the bug exists) → bug is proven
+- **NOT_REPRODUCIBLE** — Claude exhausted attempts, test cannot pass
+- **INCONCLUSIVE** — Agent timed out or encountered infrastructure issues
+
+Auto-completion: if a test passed but `done()` was never called, the pipeline auto-completes with REPRODUCED.
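The verdict rules above, including the auto-completion fallback, can be sketched as a small pure function. This is an illustration, not the code in `qa-agent.ts`: the `AgentRun` shape and the `resolveVerdict` name are hypothetical, and two behaviours are assumptions made for the example (an explicit `done()` verdict takes precedence, and a session that ends with neither a verdict nor a passing test is treated as INCONCLUSIVE).

```typescript
// Hypothetical types for illustration; the real qa-agent.ts may differ.
type Verdict = 'REPRODUCED' | 'NOT_REPRODUCIBLE' | 'INCONCLUSIVE'

interface AgentRun {
  /** Verdict passed to done(), or undefined if done() was never called. */
  doneVerdict?: Verdict
  /** True if the most recent runTest() call passed. */
  lastTestPassed: boolean
}

function resolveVerdict(run: AgentRun): Verdict {
  // Assumption: an explicit done() call takes precedence over everything else.
  if (run.doneVerdict !== undefined) return run.doneVerdict
  // Auto-completion: a passing test without done() counts as REPRODUCED.
  if (run.lastTestPassed) return 'REPRODUCED'
  // Assumption: no verdict and no passing test means the run ended early.
  return 'INCONCLUSIVE'
}

console.log(resolveVerdict({ lastTestPassed: true }))
console.log(resolveVerdict({ lastTestPassed: false }))
```

Keeping the decision in one pure function makes the fallback easy to unit-test separately from the agent loop.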
+
+## Manual QA (Fallback)
+
+When the automated pipeline isn't suitable (e.g., visual-only bugs, complex multi-step interactions), use **playwright-cli** for manual browser interaction:
+
+```bash
+# Install
+npm install -g @playwright/cli@latest
+
+# Open browser and navigate
+playwright-cli open http://127.0.0.1:8188
+
+# Get element references
+playwright-cli snapshot
+
+# Interact
+playwright-cli click e1
+playwright-cli fill e2 "test text"
+playwright-cli press Escape
+playwright-cli screenshot --filename=f.png
+```
+
+Snapshots return element references (`e1`, `e2`, …). Always run `snapshot` after navigation to refresh refs.
+
+## Manual QA Test Plan
+
+When running QA (manually via playwright-cli or through the automated pipeline), systematically test each area below.
+
+### Application Load & Routes
+
+| Test | Steps |
+|---|---|
+| Root route loads | Navigate to `/` — GraphView should render with canvas |
+| User select route | Navigate to `/user-select` — user selection UI should appear |
+| 404 handling | Navigate to `/nonexistent` — should handle gracefully |
+
+### Canvas & Graph View
+
+| Test | Steps |
+|---|---|
+| Canvas renders | The LiteGraph canvas is visible and interactive |
+| Pan canvas | Click and drag on empty canvas area |
+| Zoom in/out | Use scroll wheel or Alt+=/Alt+- |
+| Add node via double-click | Double-click canvas to open search, type "KSampler", select it |
+| Delete node | Select a node, press Delete key |
+| Connect nodes | Drag from output slot to input slot |
+| Copy/Paste | Select nodes, Ctrl+C then Ctrl+V |
+| Undo/Redo | Make changes, Ctrl+Z to undo, Ctrl+Y to redo |
+| Context menus | Right-click node vs empty canvas — different menus |
+
+### Sidebar Tabs
+
+| Test | Steps |
+|---|---|
+| Workflows tab | Press W — workflows sidebar opens |
+| Node Library tab | Press N — node library opens |
+| Model Library tab | Press M — model library opens |
+| Tab toggle | Press same key again — sidebar closes |
+| Search in sidebar | Type in search box — results filter |
+
+### Settings Dialog
+
+| Test | Steps |
+|---|---|
+| Open settings | Press Ctrl+, or click settings button |
+| Change a setting | Toggle a boolean setting — it persists after closing |
+| Search settings | Type in settings search box — results filter |
+| Close settings | Press Escape or click close button |
+
+### Execution & Queue
+
+| Test | Steps |
+|---|---|
+| Queue prompt | Load default workflow, click Queue — execution starts |
+| Queue progress | Progress indicator shows during execution |
+| Interrupt | Press Ctrl+Alt+Enter during execution — interrupts |
+
+## Report Site
+
+Deployed to Cloudflare Pages at `https://comfy-qa.pages.dev/<branch>/`.
+
+Features:
+- Light/dark theme
+- Seekable video player with preload
+- Copy badge button (markdown)
+- Date-stamped badges (e.g., `QA0327`)
+- Vertical box badge for issues and PRs
+
+## Known Issues & Troubleshooting
+
+See `docs/qa/TROUBLESHOOTING.md` for common failures:
+- `set -euo pipefail` + grep with no match → append `|| true`
+- `__name is not defined` in `page.evaluate` → use `addScriptTag`
+- Cursor not visible in videos → monkey-patch `page.mouse` methods
+- Agent not calling `done()` → auto-complete from passing test
+
+## Backlog
+
+See `docs/qa/backlog.md` for planned improvements:
+- **Type B comparison**: Different commits for regression detection
+- **Type C comparison**: Cross-browser testing
+- **Pre-seed assets**: Upload test images before recording
+- **Lazy a11y tree**: Reduce token usage with `inspect(selector)` vs full dump
--- a/.github/actions/setup-comfyui-server/action.yaml
+++ b/.github/actions/setup-comfyui-server/action.yaml
@@ -44,12 +44,17 @@ runs:
        python -m pip install --upgrade pip
        pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
        pip install -r requirements.txt
-        pip install wait-for-it

    - name: Start ComfyUI server
      if: ${{ inputs.launch_server == 'true' }}
      shell: bash
      working-directory: ComfyUI
+      env:
+        EXTRA_SERVER_PARAMS: ${{ inputs.extra_server_params }}
      run: |
-        python main.py --cpu --multi-user --front-end-root ../dist ${{ inputs.extra_server_params }} &
-        wait-for-it --service 127.0.0.1:8188 -t 600
+        python main.py --cpu --multi-user --front-end-root ../dist $EXTRA_SERVER_PARAMS &
+        for i in $(seq 1 300); do
+          curl -sf http://127.0.0.1:8188/api/system_stats >/dev/null 2>&1 && echo "Server ready" && exit 0
+          sleep 2
+        done
+        echo "::error::ComfyUI server did not start within 600s" && exit 1
--- a/.github/workflows/pr-qa.yaml
+++ b/.github/workflows/pr-qa.yaml
--- a/.gitignore
+++ b/.gitignore
@@ -99,4 +99,7 @@ vitest.config.*.timestamp*
 # Weekly docs check output
 /output.txt

-.amp
+.amp
+.playwright-cli/
+.playwright/
+.claude/scheduled_tasks.lock
--- a/.oxfmtrc.json
+++ b/.oxfmtrc.json
@@ -9,6 +9,7 @@
    "packages/registry-types/src/comfyRegistryTypes.ts",
    "public/materialdesignicons.min.css",
    "src/types/generatedManagerTypes.ts",
-    "**/__fixtures__/**/*.json"
+    "**/__fixtures__/**/*.json",
+    "scripts/qa-report-template.html"
  ]
 }
--- a/docs/qa/TROUBLESHOOTING.md
+++ b/docs/qa/TROUBLESHOOTING.md
@@ -0,0 +1,78 @@
+# QA Pipeline Troubleshooting
+
+## Common Failures
+
+### `set -euo pipefail` + grep with no match
+**Symptom**: Deploy script crashes silently, badge shows FAILED.
+**Cause**: `grep -oP` exits with code 1 when nothing matches. With `pipefail`, that makes the whole pipeline fail, and `set -e` then aborts the script.
+**Fix**: Always append `|| true` to grep pipelines in bash scripts.
+
+### `__name is not defined` in page.evaluate
+**Symptom**: Recording crashes with `ReferenceError: __name is not defined`.
+**Cause**: tsx compiles arrow functions inside `page.evaluate()` with `__name` helpers. The browser context doesn't have these.
+**Fix**: Use `page.addScriptTag({ content: '...' })` with plain JS strings instead of `page.evaluate(() => { ... })` with arrow functions.
+
+### `Set<string>()` in page.evaluate
+**Symptom**: Same `__name` error.
+**Cause**: TypeScript generics like `new Set<string>()` get compiled incorrectly for browser context.
+**Fix**: Use `new Set()` without type parameter.
+
+### `zod/v4` import error
+**Symptom**: `ERR_PACKAGE_PATH_NOT_EXPORTED: Package subpath './v4' is not defined`.
+**Cause**: claude-agent-sdk depends on `zod/v4` internally, but the project's zod doesn't export it.
+**Fix**: Import from `zod` (not `zod/v4`) in project code.
+
+### `ERR_PNPM_LOCKFILE_CONFIG_MISMATCH`
+**Symptom**: pnpm install fails with frozen lockfile mismatch.
+**Cause**: Adding a new dependency changes the workspace catalog, but the lockfile wasn't regenerated.
+**Fix**: Run `pnpm install` to regenerate lockfile, commit `pnpm-workspace.yaml` + `pnpm-lock.yaml`.
+
+### `loadDefaultWorkflow` — "Load Default" not found
+**Symptom**: Menu item "Load Default" not found, canvas stays empty.
+**Cause**: The menu item name varies by version/locale. Menu navigation is fragile.
+**Fix**: Use `app.resetToDefaultWorkflow()` JS API via `page.evaluate` instead of menu navigation.
+
+### Model ID not found (Claude Agent SDK)
+**Symptom**: `There's an issue with the selected model (claude-sonnet-4-6-20250514)`.
+**Cause**: Dated model IDs like `claude-sonnet-4-6-20250514` don't exist.
+**Fix**: Use `claude-sonnet-4-6` (no date suffix).
+
+### Model not found (Gemini)
+**Symptom**: 404 from Gemini API.
+**Cause**: Preview model names like `gemini-2.5-flash-preview-05-20` expire.
+**Fix**: Use `gemini-3-flash-preview` (the current preview ID; dated preview names expire).
+
+## Badge Mismatches
+
+### False REPRODUCED
+**Symptom**: Badge says REPRODUCED but AI review says "could not reproduce".
+**Root cause**: Grep pattern `reproduc|confirm` matches neutral words like "reproduction steps" or "could not be confirmed".
+**Fix**: Use structured JSON verdict from AI (`## Verdict` section with `{"verdict": "..."}`) instead of regex matching the prose.
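The difference between the two approaches can be shown with a short sketch. It assumes the review text contains a `## Verdict` heading followed by a single flat JSON object, as described in the fix above. The `parseVerdict` helper, the `LEGACY_PATTERN` constant and the sample review text are hypothetical and written for this example only.

```typescript
type Verdict = 'REPRODUCED' | 'NOT_REPRODUCIBLE' | 'INCONCLUSIVE'

const VERDICTS: readonly string[] = ['REPRODUCED', 'NOT_REPRODUCIBLE', 'INCONCLUSIVE']

// The prose-matching pattern named in the root cause above.
const LEGACY_PATTERN = /reproduc|confirm/i

/**
 * Read the verdict from the first flat JSON object after a "## Verdict" heading.
 * Returns undefined when the section is missing, malformed, or has an unknown value.
 */
function parseVerdict(review: string): Verdict | undefined {
  const heading = review.search(/^## Verdict\s*$/m)
  if (heading === -1) return undefined
  const match = review.slice(heading).match(/\{[^{}]*\}/)
  if (!match) return undefined
  try {
    const parsed: unknown = JSON.parse(match[0])
    if (typeof parsed !== 'object' || parsed === null) return undefined
    const verdict = (parsed as { verdict?: unknown }).verdict
    if (typeof verdict !== 'string' || VERDICTS.indexOf(verdict) === -1) return undefined
    return verdict as Verdict
  } catch {
    return undefined
  }
}

// Hypothetical review text: neutral prose that the old pattern misreads.
const review = [
  'The reproduction steps were followed, but the bug could not be confirmed.',
  '',
  '## Verdict',
  '{"verdict": "NOT_REPRODUCIBLE"}'
].join('\n')

console.log(LEGACY_PATTERN.test(review)) // true: the regex fires on neutral wording
console.log(parseVerdict(review)) // NOT_REPRODUCIBLE
```

Returning `undefined` for anything unexpected lets the caller fall back to INCONCLUSIVE instead of guessing from prose.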
+
+### INCONCLUSIVE feedback loop
+**Symptom**: Once an issue gets INCONCLUSIVE, all future runs stay INCONCLUSIVE.
+**Cause**: QA bot's own previous comments contain "INCONCLUSIVE", which gets fed back into pr-context.txt.
+**Fix**: Filter out `github-actions[bot]` comments when building pr-context.
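A minimal sketch of that filter follows. The `IssueComment` shape, the `buildContext` name and the sample comments are assumptions for illustration. The bot login is listed both with and without the `[bot]` suffix because the form can differ depending on which API returns the comment.

```typescript
// Hypothetical comment shape, loosely modelled on `gh issue view --json comments`.
interface IssueComment {
  author: { login: string }
  body: string
}

// Assumption: cover both spellings of the bot login.
const BOT_LOGINS = ['github-actions[bot]', 'github-actions']

/** Build the context text from human comments only, skipping the QA bot's own output. */
function buildContext(comments: IssueComment[]): string {
  return comments
    .filter((c) => BOT_LOGINS.indexOf(c.author.login) === -1)
    .map((c) => `@${c.author.login}: ${c.body}`)
    .join('\n\n')
}

// Hypothetical sample data.
const comments: IssueComment[] = [
  { author: { login: 'reporter' }, body: 'Escape does not close the dialog.' },
  { author: { login: 'github-actions[bot]' }, body: 'QA verdict: INCONCLUSIVE' }
]

const context = buildContext(comments)
console.log(context)
```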
+
+### pressKey with hold prevents event propagation
+**Symptom**: BEFORE video doesn't show the bug (e.g., Escape doesn't close dialog).
+**Cause**: `keyboard.down()` + 400ms sleep + `keyboard.up()` changes event timing. Some UI frameworks handle held keys differently than instant presses.
+**Fix**: Use instant `keyboard.press()` for testing. Show key name via subtitle overlay instead.
+
+## Cursor Not Visible
+**Symptom**: No mouse cursor in recorded videos.
+**Cause**: Headless Chrome doesn't render system cursor. The CSS cursor overlay relies on DOM `mousemove` events which Playwright CDP doesn't reliably trigger.
+**Fix**: Monkey-patch `page.mouse.move/click/dblclick/down/up` to call `__moveCursor(x,y)` on the injected cursor div. This makes ALL mouse operations update the overlay.
+
+## Credit Balance Too Low
+**Symptom**: Research phase produces INCONCLUSIVE with 0 tool calls. Log shows "Credit balance is too low".
+**Cause**: The `ANTHROPIC_API_KEY` secret in the repo has exhausted its credits.
+**Fix**: Top up the Anthropic API account linked to the key, or rotate to a new key in repo Settings → Secrets.
+
+## Agent Doesn't Perform Steps
+**Symptom**: Agent opens menus and settings but never interacts with the canvas.
+**Causes**:
+1. `loadDefaultWorkflow` failed (no nodes on canvas)
+2. Agent ran out of turn budget (30 turns / 120s)
+3. Gemini Flash (old agent) ignores prompt hints
+**Fix**: Use the hybrid agent (Claude Sonnet 4.6 + Gemini vision). Claude follows the prompt's step instructions more reliably than Gemini Flash did.
--- a/docs/qa/backlog.md
+++ b/docs/qa/backlog.md
@@ -0,0 +1,59 @@
+# QA Pipeline Backlog
+
+## Comparison Modes
+
+### Type A: Same code, different settings (IMPLEMENTED)
+Agent demonstrates both working (control) and broken (test) states in one session by toggling settings. E.g., Nodes 2.0 OFF → drag works, Nodes 2.0 ON → drag broken.
+
+### Type B: Different commits
+For regressions reported as "worked in vX.Y, broken in vX.Z":
+- `qa-analyze-pr.ts` detects regression markers ("since v1.38", "after PR #1234")
+- Pipeline checks out the old commit, records control video
+- Records test video on current main
+- Side-by-side comparison on report page (reuses PR before/after infra)
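As a rough illustration of what detecting regression markers could look like, the sketch below scans issue text with two regular expressions. This is not how `qa-analyze-pr.ts` works (Type B is not implemented yet). The `findRegressionMarkers` function, the `RegressionMarker` type and the exact phrases matched are all assumptions.

```typescript
// Hypothetical result type for this sketch.
interface RegressionMarker {
  kind: 'version' | 'pr'
  value: string
}

/** Collect version and PR references that suggest a regression window. */
function findRegressionMarkers(text: string): RegressionMarker[] {
  const markers: RegressionMarker[] = []
  // Matches phrases such as "since v1.38" or "worked in v1.37.2".
  const version = /\b(?:since|worked in|broken in)\s+v(\d+(?:\.\d+)+)/gi
  // Matches phrases such as "after PR #1234".
  const pr = /\bafter\s+PR\s+#(\d+)/gi
  let m: RegExpExecArray | null
  while ((m = version.exec(text)) !== null) markers.push({ kind: 'version', value: m[1] })
  while ((m = pr.exec(text)) !== null) markers.push({ kind: 'pr', value: m[1] })
  return markers
}

const found = findRegressionMarkers('Dragging is broken since v1.38, probably after PR #1234.')
console.log(found)
```

A real implementation would likely need more phrasings and a way to map a version to a commit, which this sketch leaves out.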
+
+### Type C: Different browsers
+For browser-specific bugs ("works on Chrome, broken on Firefox"):
+- Run recording with different Playwright browser contexts
+- Compare behavior across browsers in one report
+
+## Agent Improvements
+
+### TTS Narration
+- OpenAI TTS (`tts-1`, nova voice) generates audio from agent reasoning
+- Merged into video via ffmpeg at correct timestamps
+- Currently in qa-record.ts but needs wiring into hybrid agent path
+
+### Image/Screenshot Reading
+- `qa-analyze-pr.ts` already downloads and sends images from issue bodies to Gemini
+- Could also send them to the Claude agent as context ("the reporter showed this screenshot")
+
+### Placeholder Page
+- Deploy a status page immediately when CI starts
+- Auto-refreshes every 30s until final report replaces it
+- Shows spinner, CI link, badge
+
+### Pre-seed Assets
+- Upload test images via ComfyUI API before recording
+- Enables reproduction of bugs requiring assets (#10424 zoom button)
+
+### Environment-Dependent Issues
+- #7942: needs custom TestNode — could install a test custom node pack in CI
+- #9101: needs completed generation — could run with a tiny model checkpoint
+
+## Cost Optimization
+
+### Lazy A11y Tree
+- `inspect(selector)` searches tree for specific element (~20 tokens)
+- `getUIChanges()` diffs against previous snapshot (~100 tokens)
+- vs dumping full tree every turn (~2000 tokens)
+
+### Gemini Video vs Images
+- 30s video clip: ~7,700 tokens (258 tok/s)
+- 15 screenshots: ~19,500 tokens (1,300 tok/frame)
+- Video is 2.5x cheaper and shows temporal changes
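The arithmetic behind these figures is simple enough to check directly. The sketch below uses only the per-second and per-frame rates quoted in the bullets. The constant and function names are made up for the example.

```typescript
// Rates quoted in the bullets above.
const VIDEO_TOKENS_PER_SECOND = 258
const IMAGE_TOKENS_PER_FRAME = 1300

const videoTokens = (seconds: number): number => seconds * VIDEO_TOKENS_PER_SECOND
const imageTokens = (frames: number): number => frames * IMAGE_TOKENS_PER_FRAME

const clip = videoTokens(30) // 7740, quoted above as ~7,700
const shots = imageTokens(15) // 19500
const ratio = shots / clip // about 2.52, quoted above as 2.5x

console.log(clip, shots, ratio.toFixed(2))
```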
+
+### Model Selection
+- Claude Sonnet 4.6: $3/$15 per 1M in/out — best reasoning
+- Gemini 2.5 Flash: $0.10/$0.40 per 1M — best vision-per-dollar
+- Hybrid uses each where it's strongest
--- a/docs/qa/models.md
+++ b/docs/qa/models.md
@@ -0,0 +1,60 @@
+# QA Pipeline Model Selection
+
+## Current Configuration
+
+| Script                | Role                                   | Model                    | Why                                                                                                 |
+| --------------------- | -------------------------------------- | ------------------------ | --------------------------------------------------------------------------------------------------- |
+| `qa-analyze-pr.ts`    | PR/issue analysis, QA guide generation | `gemini-3.1-pro-preview` | Needs deep reasoning over PR diffs, screenshots, and issue threads                                  |
+| `qa-record.ts`        | Playwright step generation             | `gemini-3.1-pro-preview` | Step quality is critical — must understand ComfyUI's canvas UI and produce precise action sequences |
+| `qa-video-review.ts`  | Video comparison review                | `gemini-3-flash-preview` | Video analysis with structured output; flash is sufficient and faster                               |
+| `qa-generate-test.ts` | Regression test generation             | `gemini-3-flash-preview` | Code generation from QA reports; flash handles this well                                            |
+
+## Model Comparison
+
+### Gemini 3.1 Pro vs GPT-5.4
+
+|                   | Gemini 3.1 Pro Preview | GPT-5.4           |
+| ----------------- | ---------------------- | ----------------- |
+| Context window    | 1M tokens              | 1M tokens         |
+| Max output        | 65K tokens             | 128K tokens       |
+| Video input       | Yes                    | No                |
+| Image input       | Yes                    | Yes               |
+| Audio input       | Yes                    | No                |
+| Pricing (input)   | $2/1M tokens           | $2.50/1M tokens   |
+| Pricing (output)  | $12/1M tokens          | $15/1M tokens     |
+| Function calling  | Yes                    | Yes               |
+| Code execution    | Yes                    | Yes (interpreter) |
+| Structured output | Yes                    | Yes               |
+
+**Why Gemini over GPT for QA:**
+
+- Native video understanding (can review recordings directly)
+- Lower cost at comparable quality
+- Native multimodal input (screenshots, videos, audio from issue threads)
+- Better price/performance for high-volume CI usage
+
+### Gemini 3 Flash vs GPT-5.4 Mini
+
+|                  | Gemini 3 Flash Preview | GPT-5.4 Mini    |
+| ---------------- | ---------------------- | --------------- |
+| Context window   | 1M tokens              | 1M tokens       |
+| Pricing (input)  | $0.50/1M tokens        | $0.40/1M tokens |
+| Pricing (output) | $3/1M tokens           | $1.60/1M tokens |
+| Video input      | Yes                    | No              |
+
+**Why Gemini Flash for video review:**
+
+- Video input support is required — GPT models cannot process video files
+- Good enough quality for structured comparison reports
+
+## Upgrade History
+
+| Date       | Change                                                           | Reason                                                                     |
+| ---------- | ---------------------------------------------------------------- | -------------------------------------------------------------------------- |
+| 2026-03-24 | `gemini-2.5-flash` → `gemini-3.1-pro-preview` (record)           | Shallow step generation; pro model needed for complex ComfyUI interactions |
+| 2026-03-24 | `gemini-2.5-pro` → `gemini-3.1-pro-preview` (analyze)            | Keep analysis on latest pro                                                |
+| 2026-03-24 | `gemini-2.5-flash` → `gemini-3-flash-preview` (review, test-gen) | Latest flash for cost-efficient tasks                                      |
+
+## Override
+
+All scripts accept `--model <name>` to override the default. Pass any Gemini model ID.
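
The pricing rows in the comparison tables above can be turned into a rough per-run cost estimate. The sketch below is illustrative only: `estimateCostUsd` and the `PRICING` map are hypothetical names that do not exist in this PR, the GPT model IDs are assumed from the display names in the tables, and the prices are copied from the tables rather than fetched from any API.

```typescript
// Hypothetical helper, not part of this PR.
// Prices are USD per 1M tokens, copied from the comparison tables above.
interface ModelPricing {
  inputPerMTok: number
  outputPerMTok: number
}

const PRICING: Record<string, ModelPricing> = {
  'gemini-3.1-pro-preview': { inputPerMTok: 2, outputPerMTok: 12 },
  'gemini-3-flash-preview': { inputPerMTok: 0.5, outputPerMTok: 3 },
  // Assumed IDs: the tables only give the display names "GPT-5.4" and "GPT-5.4 Mini".
  'gpt-5.4': { inputPerMTok: 2.5, outputPerMTok: 15 },
  'gpt-5.4-mini': { inputPerMTok: 0.4, outputPerMTok: 1.6 }
}

function estimateCostUsd(
  model: string,
  inputTokens: number,
  outputTokens: number
): number {
  const pricing = PRICING[model]
  if (!pricing) throw new Error(`No pricing entry for model: ${model}`)
  // Cost = (tokens / 1M) * price per 1M, summed over input and output.
  return (
    (inputTokens / 1_000_000) * pricing.inputPerMTok +
    (outputTokens / 1_000_000) * pricing.outputPerMTok
  )
}
```

For example, a run that sends 1M input tokens and receives 1M output tokens would come to about $14 on `gemini-3.1-pro-preview` under these assumed prices, against about $17.50 on the GPT-5.4 row.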
--- a/knip.config.ts
+++ b/knip.config.ts
@@ -40,7 +40,7 @@ const config: KnipConfig = {
      ignoreDependencies: ['@comfyorg/design-system', '@vercel/analytics']
    }
  },
-  ignoreBinaries: ['python3'],
+  ignoreBinaries: ['python3', 'wrangler'],
  ignoreDependencies: [
    // Weird importmap things
    '@iconify-json/lucide',
--- a/package.json
+++ b/package.json
@@ -39,6 +39,7 @@
    "oxlint": "oxlint src --type-aware",
    "prepare": "husky || true && git config blame.ignoreRevsFile .git-blame-ignore-revs || true",
    "preview": "nx preview",
+    "qa:video-review": "tsx scripts/qa-video-review.ts",
    "storybook": "nx storybook",
    "storybook:desktop": "nx run @comfyorg/desktop-ui:storybook",
    "stylelint:fix": "stylelint --cache --fix '{apps,packages,src}/**/*.{css,vue}'",
@@ -122,7 +123,9 @@
    "zod-validation-error": "catalog:"
  },
  "devDependencies": {
+    "@anthropic-ai/claude-agent-sdk": "catalog:",
    "@eslint/js": "catalog:",
+    "@google/generative-ai": "catalog:",
    "@intlify/eslint-plugin-vue-i18n": "catalog:",
    "@lobehub/i18n-cli": "catalog:",
    "@nx/eslint": "catalog:",
--- a/pnpm-lock.yaml
+++ b/pnpm-lock.yaml
--- a/pnpm-workspace.yaml
+++ b/pnpm-workspace.yaml
@@ -4,10 +4,12 @@ packages:

 catalog:
  '@alloc/quick-lru': ^5.2.0
+  '@anthropic-ai/claude-agent-sdk': ^0.2.85
  '@astrojs/vue': ^5.0.0
  '@comfyorg/comfyui-electron-types': 0.6.2
  '@eslint/js': ^9.39.1
  '@formkit/auto-animate': ^0.9.0
+  '@google/generative-ai': ^0.24.1
  '@iconify-json/lucide': ^1.1.178
  '@iconify/json': ^2.2.380
  '@iconify/tailwind4': ^1.2.0
--- a/scripts/qa-agent.ts
+++ b/scripts/qa-agent.ts
@@ -0,0 +1,572 @@
+#!/usr/bin/env tsx
+/**
+ * QA Research Phase — Claude writes & debugs E2E tests to reproduce bugs
+ *
+ * Instead of driving a browser interactively, Claude:
+ * 1. Reads the issue + a11y snapshot of the UI
+ * 2. Writes a Playwright E2E test (.spec.ts) that reproduces the bug
+ * 3. Runs the test → reads errors → rewrites → repeats until it works
+ * 4. Outputs the passing test + verdict
+ *
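+ * Tools:
+ *   - inspect(selector?) — read a11y tree to understand UI state
+ *   - readFixture(path) — read a fixture/helper source file from browser_tests/fixtures/
+ *   - readTest(path) — read an existing E2E test from browser_tests/tests/
+ *   - writeTest(code) — write a Playwright test file
+ *   - runTest() — execute the test and get results
+ *   - done(verdict, reproducedBy, summary, evidence, testCode) — finish with the working test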
+ */
+
+import type { Page } from '@playwright/test'
+import { query, tool, createSdkMcpServer } from '@anthropic-ai/claude-agent-sdk'
+import { z } from 'zod'
+import { mkdirSync, readFileSync, writeFileSync } from 'fs'
+import { execSync } from 'child_process'
+
+// ── Types ──
+
+interface ResearchOptions {
+  page: Page
+  issueContext: string
+  qaGuide: string
+  outputDir: string
+  serverUrl: string
+  anthropicApiKey?: string
+  maxTurns?: number
+  timeBudgetMs?: number
+}
+
+export type ReproMethod = 'e2e_test' | 'video' | 'both' | 'none'
+
+export interface ResearchResult {
+  verdict: 'REPRODUCED' | 'NOT_REPRODUCIBLE' | 'INCONCLUSIVE'
+  reproducedBy: ReproMethod
+  summary: string
+  evidence: string
+  testCode: string
+  log: Array<{
+    turn: number
+    timestampMs: number
+    toolName: string
+    toolInput: unknown
+    toolResult: string
+  }>
+}
+
+// ── Main research function ──
+
+export async function runResearchPhase(
+  opts: ResearchOptions
+): Promise<ResearchResult> {
+  const { page, issueContext, qaGuide, outputDir, serverUrl, anthropicApiKey } =
+    opts
+  const maxTurns = opts.maxTurns ?? 50
+
+  let agentDone = false
+  let finalVerdict: ResearchResult['verdict'] = 'INCONCLUSIVE'
+  let finalReproducedBy: ReproMethod = 'none'
+  let finalSummary = 'Agent did not complete'
+  let finalEvidence = ''
+  let finalTestCode = ''
+  let turnCount = 0
+  let lastPassedTurn = -1
+  const startTime = Date.now()
+  const researchLog: ResearchResult['log'] = []
+
+  const testDir = `${outputDir}/research`
+  mkdirSync(testDir, { recursive: true })
+  const testPath = `${testDir}/reproduce.spec.ts`
+
+  // Get initial a11y snapshot for context
+  let initialA11y = ''
+  try {
+    initialA11y = await page.locator('body').ariaSnapshot({ timeout: 5000 })
+    initialA11y = initialA11y.slice(0, 3000)
+  } catch {
+    initialA11y = '(could not capture initial a11y snapshot)'
+  }
+
+  // ── Tool: inspect ──
+  const inspectTool = tool(
+    'inspect',
+    'Read the current accessibility tree to understand UI state. Use this to discover element names, roles, and selectors for your test.',
+    {
+      selector: z
+        .string()
+        .optional()
+        .describe(
+          'Optional filter — only show elements matching this name/role. Omit for full tree.'
+        )
+    },
+    async (args) => {
+      let resultText: string
+      try {
+        const ariaText = await page
+          .locator('body')
+          .ariaSnapshot({ timeout: 5000 })
+        if (args.selector) {
+          const lines = ariaText.split('\n')
+          const matches = lines.filter((l: string) =>
+            l.toLowerCase().includes(args.selector!.toLowerCase())
+          )
+          resultText =
+            matches.length > 0
+              ? `Found "${args.selector}":\n${matches.slice(0, 15).join('\n')}`
+              : `"${args.selector}" not found. Full tree:\n${ariaText.slice(0, 2000)}`
+        } else {
+          resultText = ariaText.slice(0, 3000)
+        }
+      } catch (e) {
+        resultText = `inspect failed: ${e instanceof Error ? e.message : e}`
+      }
+
+      researchLog.push({
+        turn: turnCount,
+        timestampMs: Date.now() - startTime,
+        toolName: 'inspect',
+        toolInput: args,
+        toolResult: resultText.slice(0, 500)
+      })
+
+      return { content: [{ type: 'text' as const, text: resultText }] }
+    }
+  )
+
+  // ── Tool: readFixture ──
+  const readFixtureTool = tool(
+    'readFixture',
+    'Read a fixture or helper file from browser_tests/fixtures/ to understand the API. Use this to discover available methods on comfyPage helpers before writing your test.',
+    {
+      path: z
+        .string()
+        .describe(
+          'Relative path within browser_tests/fixtures/, e.g. "helpers/CanvasHelper.ts" or "components/Topbar.ts" or "ComfyPage.ts"'
+        )
+    },
+    async (args) => {
+      let resultText: string
+      try {
+        const fullPath = `${projectRoot}/browser_tests/fixtures/${args.path}`
+        const content = readFileSync(fullPath, 'utf-8')
+        resultText = content.slice(0, 4000)
+        if (content.length > 4000) {
+          resultText += `\n\n... (truncated, ${content.length} total chars)`
+        }
+      } catch (e) {
+        resultText = `Could not read fixture: ${e instanceof Error ? e.message : e}`
+      }
+
+      researchLog.push({
+        turn: turnCount,
+        timestampMs: Date.now() - startTime,
+        toolName: 'readFixture',
+        toolInput: args,
+        toolResult: resultText.slice(0, 500)
+      })
+
+      return { content: [{ type: 'text' as const, text: resultText }] }
+    }
+  )
+
+  // ── Tool: readTest ──
+  const readTestTool = tool(
+    'readTest',
+    'Read an existing E2E test file from browser_tests/tests/ to learn patterns and conventions used in this project.',
+    {
+      path: z
+        .string()
+        .describe(
+          'Relative path within browser_tests/tests/, e.g. "workflow.spec.ts" or "subgraph.spec.ts"'
+        )
+    },
+    async (args) => {
+      let resultText: string
+      try {
+        const fullPath = `${projectRoot}/browser_tests/tests/${args.path}`
+        const content = readFileSync(fullPath, 'utf-8')
+        resultText = content.slice(0, 4000)
+        if (content.length > 4000) {
+          resultText += `\n\n... (truncated, ${content.length} total chars)`
+        }
+      } catch (e) {
+        // List available test files if the path doesn't exist
+        try {
+          const { readdirSync } = await import('fs')
+          const files = readdirSync(`${projectRoot}/browser_tests/tests/`)
+            .filter((f: string) => f.endsWith('.spec.ts'))
+            .slice(0, 30)
+          resultText = `File not found: ${args.path}\n\nAvailable test files:\n${files.join('\n')}`
+        } catch {
+          resultText = `Could not read test: ${e instanceof Error ? e.message : e}`
+        }
+      }
+
+      researchLog.push({
+        turn: turnCount,
+        timestampMs: Date.now() - startTime,
+        toolName: 'readTest',
+        toolInput: args,
+        toolResult: resultText.slice(0, 500)
+      })
+
+      return { content: [{ type: 'text' as const, text: resultText }] }
+    }
+  )
+
+  // ── Tool: writeTest ──
+  const writeTestTool = tool(
+    'writeTest',
+    'Write a Playwright E2E test file that reproduces the bug. The test should assert the broken behavior exists.',
+    {
+      code: z
+        .string()
+        .describe('Complete Playwright test file content (.spec.ts)')
+    },
+    async (args) => {
+      writeFileSync(testPath, args.code)
+
+      researchLog.push({
+        turn: turnCount,
+        timestampMs: Date.now() - startTime,
+        toolName: 'writeTest',
+        toolInput: { path: testPath, codeLength: args.code.length },
+        toolResult: `Test written to ${testPath} (${args.code.length} chars)`
+      })
+
+      return {
+        content: [
+          {
+            type: 'text' as const,
+            text: `Test written to ${testPath}. Use runTest() to execute it.`
+          }
+        ]
+      }
+    }
+  )
+
+  // ── Tool: runTest ──
+  // Place test in browser_tests/ so Playwright config finds fixtures
+  const projectRoot = process.cwd()
+  const browserTestPath = `${projectRoot}/browser_tests/tests/qa-reproduce.spec.ts`
+
+  const runTestTool = tool(
+    'runTest',
+    'Run the Playwright test and get results. Returns stdout/stderr including assertion errors.',
+    {},
+    async () => {
+      turnCount++
+      // Copy the test to browser_tests/tests/ where Playwright expects it
+      const { copyFileSync } = await import('fs')
+      try {
+        copyFileSync(testPath, browserTestPath)
+      } catch {
+        // directory may not exist
+        mkdirSync(`${projectRoot}/browser_tests/tests`, { recursive: true })
+        copyFileSync(testPath, browserTestPath)
+      }
+
+      let resultText: string
+      try {
+        const output = execSync(
+          `cd "${projectRoot}" && npx playwright test browser_tests/tests/qa-reproduce.spec.ts --reporter=list --timeout=30000 --retries=0 --workers=1 2>&1`,
+          {
+            timeout: 90000,
+            encoding: 'utf-8',
+            env: {
+              ...process.env,
+              COMFYUI_BASE_URL: serverUrl
+            }
+          }
+        )
+        resultText = `TEST PASSED:\n${output.slice(-1500)}`
+      } catch (e) {
+        const err = e as { stdout?: string; stderr?: string; message?: string }
+        const output = (err.stdout || '') + '\n' + (err.stderr || '')
+        resultText = `TEST FAILED:\n${output.slice(-2000)}`
+      }
+
+      researchLog.push({
+        turn: turnCount,
+        timestampMs: Date.now() - startTime,
+        toolName: 'runTest',
+        toolInput: { testPath },
+        toolResult: resultText.slice(0, 1000)
+      })
+
+      // Auto-save passing test code for fallback completion
+      if (resultText.startsWith('TEST PASSED')) {
+        try {
+          finalTestCode = readFileSync(browserTestPath, 'utf-8')
+          lastPassedTurn = turnCount
+        } catch {
+          // ignore
+        }
+        resultText += '\n\n⚠️ Test PASSED — call done() now with verdict REPRODUCED and the test code. Do NOT write more tests.'
+      }
+
+      return { content: [{ type: 'text' as const, text: resultText }] }
+    }
+  )
+
+  // ── Tool: done ──
+  const doneTool = tool(
+    'done',
+    'Finish research with verdict and the final test code.',
+    {
+      verdict: z.enum(['REPRODUCED', 'NOT_REPRODUCIBLE', 'INCONCLUSIVE']),
+      reproducedBy: z
+        .enum(['e2e_test', 'video', 'both', 'none'])
+        .describe(
+          'How the bug was proven: e2e_test = Playwright assertion passed, video = visual evidence only, both = both methods, none = not reproduced'
+        ),
+      summary: z.string().describe('What you found and why'),
+      evidence: z.string().describe('Test output that proves the verdict'),
+      testCode: z
+        .string()
+        .describe(
+          'Final Playwright test code. If REPRODUCED, this test asserts the bug exists and passes.'
+        )
+    },
+    async (args) => {
+      agentDone = true
+      finalVerdict = args.verdict
+      finalReproducedBy = args.reproducedBy
+      finalSummary = args.summary
+      finalEvidence = args.evidence
+      finalTestCode = args.testCode
+      writeFileSync(testPath, args.testCode)
+      return {
+        content: [
+          { type: 'text' as const, text: `Research complete: ${args.verdict}` }
+        ]
+      }
+    }
+  )
+
+  // ── MCP Server ──
+  const server = createSdkMcpServer({
+    name: 'qa-research',
+    version: '1.0.0',
+    tools: [
+      inspectTool,
+      readFixtureTool,
+      readTestTool,
+      writeTestTool,
+      runTestTool,
+      doneTool
+    ]
+  })
+
+  // ── System prompt ──
+  const systemPrompt = `You are a senior QA engineer who writes Playwright E2E tests to reproduce reported bugs.
+
+## Your tools
+- inspect(selector?) — Read the accessibility tree to understand the current UI. Use to discover selectors, element names, and UI state.
+- readFixture(path) — Read fixture source code from browser_tests/fixtures/. Use to discover available methods. E.g. "helpers/CanvasHelper.ts", "components/Topbar.ts", "ComfyPage.ts"
+- readTest(path) — Read an existing test from browser_tests/tests/ to learn patterns. E.g. "workflow.spec.ts". If the path does not exist, the tool returns a list of available test files.
+- writeTest(code) — Write a Playwright test file (.spec.ts)
+- runTest() — Execute the test and get results (pass/fail + errors)
+- done(verdict, reproducedBy, summary, evidence, testCode) — Finish with the final test. reproducedBy is one of: e2e_test, video, both, none.
+
+## Workflow
+1. Read the issue description carefully
+2. Use inspect() to understand the current UI state and discover element selectors
+3. If unsure about the fixture API, use readFixture() to read the relevant helper source code
+4. If unsure about test patterns, use readTest() to read an existing test for reference
+5. Write a Playwright test that:
+   - Performs the exact reproduction steps from the issue
+   - Asserts the BROKEN behavior (the bug) — so the test PASSES when the bug exists
+6. Run the test with runTest()
+7. If it fails: read the error, fix the test, run again (max 5 attempts)
+8. Call done() with the final verdict and test code
+
+## Test writing guidelines
+- Import the project fixture: \`import { comfyPageFixture as test } from '../fixtures/ComfyPage'\`
+- Import expect: \`import { expect } from '@playwright/test'\`
+- The fixture provides \`comfyPage\` which has all the helpers listed below
+- If the bug IS present, the test should PASS. If the bug is fixed, the test would FAIL.
+- Keep tests focused and minimal — test ONLY the reported bug
+- Write ONE test, not multiple. Focus on the single clearest reproduction.
+- The test file will be placed in browser_tests/tests/qa-reproduce.spec.ts
+- Use \`comfyPage.nextFrame()\` after interactions that trigger UI updates
+- NEVER use \`page.waitForTimeout()\` — use Locator actions and retrying assertions instead
+- ALWAYS call done() when finished, even if the test passed — do not keep iterating after a passing test
+- Use \`expect.poll()\` for async assertions: \`await expect.poll(() => comfyPage.nodeOps.getGraphNodesCount()).toBe(8)\`
+- CRITICAL: Your assertions must be SPECIFIC TO THE BUG. A test that asserts \`expect(count).toBeGreaterThan(0)\` proves nothing — it would pass even without the bug. Instead assert the exact broken state, e.g. \`expect(clonedWidgets).toHaveLength(0)\` (missing widgets) or \`expect(zIndex).toBeLessThan(parentZIndex)\` (wrong z-order). If a test passes trivially, it's a false positive.
+- If you cannot write a bug-specific assertion, call done() with verdict NOT_REPRODUCIBLE and explain why.
+
+## ComfyPage Fixture API Reference
+
+### Core properties
+- \`comfyPage.page\` — raw Playwright Page
+- \`comfyPage.canvas\` — Locator for #graph-canvas
+- \`comfyPage.queueButton\` — "Queue Prompt" button
+- \`comfyPage.runButton\` — "Run" button (new UI)
+- \`comfyPage.confirmDialog\` — ConfirmDialog (has .confirm, .delete, .overwrite, .reject locators + .click(name) method)
+- \`comfyPage.nextFrame()\` — wait for next requestAnimationFrame
+- \`comfyPage.setup()\` — navigate + wait for app ready (called automatically by fixture)
+
+### Menu (comfyPage.menu)
+- \`comfyPage.menu.topbar\` — Topbar helper:
+  - \`.triggerTopbarCommand(['File', 'Save As'])\` — navigate menu hierarchy
+  - \`.openTopbarMenu()\` / \`.closeTopbarMenu()\` — open/close hamburger
+  - \`.openSubmenu('File')\` — hover to open submenu, returns submenu Locator
+  - \`.getTabNames()\` — get all open workflow tab names
+  - \`.getActiveTabName()\` — get active tab name
+  - \`.getWorkflowTab(name)\` — get tab Locator
+  - \`.closeWorkflowTab(name)\` — close a tab
+  - \`.saveWorkflow(name)\` / \`.saveWorkflowAs(name)\` / \`.exportWorkflow(name)\`
+  - \`.switchTheme('dark' | 'light')\`
+- \`comfyPage.menu.workflowsTab\` — WorkflowsSidebarTab:
+  - \`.open()\` / \`.close()\` — toggle workflows sidebar
+  - \`.getTopLevelSavedWorkflowNames()\` — list saved workflow names
+- \`comfyPage.menu.nodeLibraryTab\` — NodeLibrarySidebarTab
+- \`comfyPage.menu.assetsTab\` — AssetsSidebarTab
+
+### Canvas (comfyPage.canvasOps)
+- \`.click({x, y})\` — click at position on canvas
+- \`.rightClick(x, y)\` — right-click (opens context menu)
+- \`.doubleClick()\` — double-click canvas (opens node search)
+- \`.clickEmptySpace()\` — click known empty area
+- \`.dragAndDrop(source, target)\` — drag from source to target position
+- \`.pan(offset, safeSpot?)\` — pan canvas by offset
+- \`.zoom(deltaY, steps?)\` — zoom via scroll wheel
+- \`.resetView()\` — reset zoom/pan to default
+- \`.getScale()\` / \`.setScale(n)\` — get/set canvas zoom
+- \`.getNodeCenterByTitle(title)\` — get screen coords of node center
+- \`.disconnectEdge()\` / \`.connectEdge()\` — default graph edge operations
+
+### Node Operations (comfyPage.nodeOps)
+- \`.getGraphNodesCount()\` — count all nodes
+- \`.getSelectedGraphNodesCount()\` — count selected nodes
+- \`.getNodes()\` — get all nodes
+- \`.getFirstNodeRef()\` — get NodeReference for first node
+- \`.getNodeRefById(id)\` — get NodeReference by ID
+- \`.getNodeRefsByType(type)\` — get all nodes of a type
+- \`.waitForGraphNodes(count)\` — wait until node count matches
+
+### Settings (comfyPage.settings)
+- \`.setSetting(id, value)\` — change a ComfyUI setting
+- \`.getSetting(id)\` — read current setting value
+
+### Keyboard (comfyPage.keyboard)
+- \`.undo()\` / \`.redo()\` — Ctrl+Z / Ctrl+Y
+- \`.bypass()\` — Ctrl+B
+- \`.selectAll()\` — Ctrl+A
+- \`.ctrlSend(key)\` — send Ctrl+key
+
+### Workflow (comfyPage.workflow)
+- \`.loadWorkflow(name)\` — load from browser_tests/assets/{name}.json
+- \`.setupWorkflowsDirectory(structure)\` — setup test directory
+- \`.deleteWorkflow(name)\`
+- \`.isCurrentWorkflowModified()\` — check dirty state
+
+### Context Menu (comfyPage.contextMenu)
+- \`.openFor(locator)\` — right-click locator and wait for menu
+- \`.clickMenuItem(name)\` — click a menu item by name
+- \`.isVisible()\` — check if context menu is showing
+- \`.assertHasItems(items)\` — assert menu contains items
+
+### Other helpers
+- \`comfyPage.settingDialog\` — SettingDialog component
+- \`comfyPage.searchBox\` / \`comfyPage.searchBoxV2\` — node search
+- \`comfyPage.toast\` — ToastHelper (\`.visibleToasts\`)
+- \`comfyPage.subgraph\` — SubgraphHelper
+- \`comfyPage.vueNodes\` — VueNodeHelpers
+- \`comfyPage.bottomPanel\` — BottomPanel
+- \`comfyPage.clipboard\` — ClipboardHelper
+- \`comfyPage.dragDrop\` — DragDropHelper
+
+### Available fixture files (use readFixture to explore)
+- ComfyPage.ts — main fixture with all helpers
+- helpers/CanvasHelper.ts, NodeOperationsHelper.ts, WorkflowHelper.ts
+- helpers/KeyboardHelper.ts, SettingsHelper.ts, SubgraphHelper.ts
+- components/Topbar.ts, ContextMenu.ts, SettingDialog.ts, SidebarTab.ts
+
+## Current UI state (accessibility tree)
+${initialA11y}
+
+${qaGuide ? `## QA Analysis Guide\n${qaGuide}\n` : ''}
+## Issue to Reproduce
+${issueContext}`
+
+  // ── Run the agent ──
+  console.warn('Starting research phase (Claude writes E2E tests)...')
+
+  try {
+    for await (const message of query({
+      prompt:
+        'Write a Playwright E2E test that reproduces the reported bug. Use inspect() to discover selectors, readFixture() or readTest() if you need to understand the fixture API or see existing test patterns, writeTest() to write the test, runTest() to execute it. Iterate until it works or you determine the bug cannot be reproduced.',
+      options: {
+        model: 'claude-sonnet-4-6',
+        systemPrompt,
+        ...(anthropicApiKey ? { apiKey: anthropicApiKey } : {}),
+        maxTurns,
+        mcpServers: { 'qa-research': server },
+        allowedTools: [
+          'mcp__qa-research__inspect',
+          'mcp__qa-research__readFixture',
+          'mcp__qa-research__readTest',
+          'mcp__qa-research__writeTest',
+          'mcp__qa-research__runTest',
+          'mcp__qa-research__done'
+        ]
+      }
+    })) {
+      if (message.type === 'assistant' && message.message?.content) {
+        for (const block of message.message.content) {
+          if ('text' in block && block.text) {
+            console.warn(`  Claude: ${block.text.slice(0, 200)}`)
+          }
+          if ('name' in block) {
+            console.warn(
+              `  Tool: ${block.name}(${JSON.stringify(block.input).slice(0, 100)})`
+            )
+          }
+        }
+      }
+      if (agentDone) break
+    }
+  } catch (e) {
+    const errMsg = e instanceof Error ? e.message : String(e)
+    console.warn(`Research error: ${errMsg}`)
+
+    // Detect billing, quota, and rate-limit errors and surface them clearly
+    if (
+      errMsg.includes('Credit balance is too low') ||
+      errMsg.includes('insufficient_quota') ||
+      errMsg.includes('rate_limit')
+    ) {
+      finalSummary = `API error: ${errMsg.slice(0, 200)}`
+      finalEvidence =
+        'Research phase aborted due to an API billing, quota, or rate-limit error'
+      console.warn(
+        '::error::Anthropic API billing/quota/rate-limit error — cannot run research phase'
+      )
+    }
+  }
+
+  // Auto-complete: if a test passed but done() was never called, use the passing test
+  if (!agentDone && lastPassedTurn >= 0 && finalTestCode) {
+    console.warn(
+      `Auto-completing: test passed at turn ${lastPassedTurn} but done() was not called`
+    )
+    finalVerdict = 'REPRODUCED'
+    finalReproducedBy = 'e2e_test'
+    finalSummary = `Test passed at turn ${lastPassedTurn} (auto-completed — agent did not call done())`
+    finalEvidence = `Test passed with exit code 0`
+  }
+
+  const result: ResearchResult = {
+    verdict: finalVerdict,
+    reproducedBy: finalReproducedBy,
+    summary: finalSummary,
+    evidence: finalEvidence,
+    testCode: finalTestCode,
+    log: researchLog
+  }
+
+  writeFileSync(`${testDir}/research-log.json`, JSON.stringify(result, null, 2))
+  console.warn(
+    `Research complete: ${finalVerdict} (${researchLog.length} tool calls)`
+  )
+
+  return result
+}
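
The commit message says the deploy script reads `reproducedBy` from `research-log.json` and that the badge shows "1 via E2E test" instead of a generic "1 reproduced". A minimal sketch of that mapping is below. It is an assumption-laden illustration, not the deploy script's actual code: `parseReproMethod` and `badgeLabel` are hypothetical names, and every label other than "via E2E test" and the generic "reproduced" fallback is guessed wording.

```typescript
// Mirrors the ReproMethod union exported by scripts/qa-agent.ts.
type ReproMethod = 'e2e_test' | 'video' | 'both' | 'none'

const REPRO_METHODS: readonly ReproMethod[] = ['e2e_test', 'video', 'both', 'none']

// Hypothetical: validate a value read from research-log.json.
// Logs without the field (or with an unknown value) fall back to 'none'.
function parseReproMethod(value: unknown): ReproMethod {
  return typeof value === 'string' &&
    (REPRO_METHODS as readonly string[]).includes(value)
    ? (value as ReproMethod)
    : 'none'
}

// Hypothetical: build the badge text for `count` reproduced issues.
function badgeLabel(count: number, reproducedBy: ReproMethod): string {
  switch (reproducedBy) {
    case 'e2e_test':
      return `${count} via E2E test`
    case 'video':
      return `${count} via video` // assumed wording
    case 'both':
      return `${count} via E2E test + video` // assumed wording
    case 'none':
      return `${count} reproduced` // generic fallback
  }
}
```

Falling back to `'none'` keeps older `research-log.json` files, written before the field existed, rendering with the generic label rather than failing the deploy step.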
--- a/scripts/qa-analyze-pr.test.ts
+++ b/scripts/qa-analyze-pr.test.ts
@@ -0,0 +1,84 @@
+import { describe, expect, it } from 'vitest'
+
+import { extractMediaUrls } from './qa-analyze-pr'
+
+describe('extractMediaUrls', () => {
+  it('extracts markdown image URLs', () => {
+    const text = '![screenshot](https://example.com/image.png)'
+    expect(extractMediaUrls(text)).toEqual(['https://example.com/image.png'])
+  })
+
+  it('extracts multiple markdown images', () => {
+    const text = [
+      '![before](https://example.com/before.png)',
+      'Some text',
+      '![after](https://example.com/after.jpg)'
+    ].join('\n')
+    expect(extractMediaUrls(text)).toEqual([
+      'https://example.com/before.png',
+      'https://example.com/after.jpg'
+    ])
+  })
+
+  it('extracts raw URLs with media extensions', () => {
+    const text = 'Check this: https://cdn.example.com/demo.mp4 for details'
+    expect(extractMediaUrls(text)).toEqual(['https://cdn.example.com/demo.mp4'])
+  })
+
+  it('extracts GitHub user-attachments URLs', () => {
+    const text =
+      'https://github.com/user-attachments/assets/abc12345-6789-0def-1234-567890abcdef'
+    expect(extractMediaUrls(text)).toEqual([
+      'https://github.com/user-attachments/assets/abc12345-6789-0def-1234-567890abcdef'
+    ])
+  })
+
+  it('extracts private-user-images URLs', () => {
+    const text =
+      'https://private-user-images.githubusercontent.com/12345/abcdef-1234?jwt=token123'
+    expect(extractMediaUrls(text)).toEqual([
+      'https://private-user-images.githubusercontent.com/12345/abcdef-1234?jwt=token123'
+    ])
+  })
+
+  it('extracts URLs with query parameters', () => {
+    const text = 'https://example.com/image.png?w=800&h=600'
+    expect(extractMediaUrls(text)).toEqual([
+      'https://example.com/image.png?w=800&h=600'
+    ])
+  })
+
+  it('deduplicates URLs', () => {
+    const text = [
+      '![img](https://example.com/same.png)',
+      '![img2](https://example.com/same.png)',
+      'Also https://example.com/same.png'
+    ].join('\n')
+    expect(extractMediaUrls(text)).toEqual(['https://example.com/same.png'])
+  })
+
+  it('returns empty array for empty input', () => {
+    expect(extractMediaUrls('')).toEqual([])
+  })
+
+  it('returns empty array for text with no media URLs', () => {
+    expect(extractMediaUrls('Just some text without any URLs')).toEqual([])
+  })
+
+  it('handles mixed media types', () => {
+    const text = [
+      '![screen](https://example.com/screenshot.png)',
+      'Video: https://example.com/demo.webm',
+      '![gif](https://example.com/animation.gif)'
+    ].join('\n')
+    const urls = extractMediaUrls(text)
+    expect(urls).toContain('https://example.com/screenshot.png')
+    expect(urls).toContain('https://example.com/demo.webm')
+    expect(urls).toContain('https://example.com/animation.gif')
+  })
+
+  it('ignores non-http URLs in markdown', () => {
+    const text = '![local](./local-image.png)'
+    expect(extractMediaUrls(text)).toEqual([])
+  })
+})
--- a/scripts/qa-analyze-pr.ts
+++ b/scripts/qa-analyze-pr.ts
@@ -0,0 +1,799 @@
+#!/usr/bin/env tsx
+/**
+ * QA PR Analysis Script
+ *
+ * Deeply analyzes a PR using Gemini Pro to generate targeted QA guides
+ * for before/after recording sessions. Fetches PR thread, extracts media,
+ * and produces structured test plans.
+ *
+ * Usage:
+ *   pnpm exec tsx scripts/qa-analyze-pr.ts \
+ *     --pr-number 10270 \
+ *     --repo owner/repo \
+ *     --output-dir qa-guides/ \
+ *     [--model gemini-3.1-pro-preview]
+ *
+ * Env: GEMINI_API_KEY (required)
+ */
+
+import { execSync } from 'node:child_process'
+import { mkdirSync, readFileSync, writeFileSync } from 'node:fs'
+import { resolve } from 'node:path'
+import { fileURLToPath } from 'node:url'
+
+import { GoogleGenerativeAI } from '@google/generative-ai'
+
+// ── Types ──
+
+interface QaGuideStep {
+  action: string
+  description: string
+  expected_before?: string
+  expected_after?: string
+}
+
+interface QaGuide {
+  summary: string
+  test_focus: string
+  prerequisites: string[]
+  steps: QaGuideStep[]
+  visual_checks: string[]
+}
+
+interface PrThread {
+  title: string
+  body: string
+  labels: string[]
+  issueComments: string[]
+  reviewComments: string[]
+  reviews: string[]
+  diff: string
+}
+
+type TargetType = 'pr' | 'issue'
+
+interface Options {
+  prNumber: string
+  repo: string
+  outputDir: string
+  model: string
+  apiKey: string
+  mediaBudgetBytes: number
+  maxVideoBytes: number
+  type: TargetType
+}
+
+// ── CLI parsing ──
+
+function parseArgs(): Options {
+  const args = process.argv.slice(2)
+  const opts: Partial<Options> = {
+    model: 'gemini-3.1-pro-preview',
+    apiKey: process.env.GEMINI_API_KEY || '',
+    mediaBudgetBytes: 20 * 1024 * 1024,
+    maxVideoBytes: 10 * 1024 * 1024,
+    type: 'pr'
+  }
+
+  for (let i = 0; i < args.length; i++) {
+    switch (args[i]) {
+      case '--pr-number':
+        opts.prNumber = args[++i]
+        break
+      case '--repo':
+        opts.repo = args[++i]
+        break
+      case '--output-dir':
+        opts.outputDir = args[++i]
+        break
+      case '--model':
+        opts.model = args[++i]
+        break
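+      case '--type': {
+        // Validate instead of casting so a typo fails fast rather than silently
+        const value = args[++i]
+        if (value !== 'pr' && value !== 'issue') {
+          console.error(`Invalid --type "${value}" (expected "pr" or "issue")`)
+          process.exit(1)
+        }
+        opts.type = value
+        break
+      }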
+      case '--help':
+        console.warn(
+          'Usage: qa-analyze-pr.ts --pr-number <num> --repo <owner/repo> --output-dir <path> [--model <model>] [--type pr|issue]'
+        )
+        process.exit(0)
+    }
+  }
+
+  if (!opts.prNumber || !opts.repo || !opts.outputDir) {
+    console.error(
+      'Required: --pr-number <num> --repo <owner/repo> --output-dir <path>'
+    )
+    process.exit(1)
+  }
+
+  if (!opts.apiKey) {
+    console.error('GEMINI_API_KEY environment variable is required')
+    process.exit(1)
+  }
+
+  return opts as Options
+}
+
+// ── PR thread fetching ──
+
+function ghExec(cmd: string): string {
+  try {
+    return execSync(cmd, {
+      encoding: 'utf-8',
+      timeout: 30_000,
+      stdio: ['pipe', 'pipe', 'pipe']
+    }).trim()
+  } catch (err) {
+    console.warn(`gh command failed: ${cmd}`)
+    console.warn((err as Error).message)
+    return ''
+  }
+}
+
+function fetchPrThread(prNumber: string, repo: string): PrThread {
+  console.warn('Fetching PR thread...')
+
+  const prView = ghExec(
+    `gh pr view ${prNumber} --repo ${repo} --json title,body,labels`
+  )
+  const prData = prView
+    ? JSON.parse(prView)
+    : { title: '', body: '', labels: [] }
+
+  const issueCommentsRaw = ghExec(
+    `gh api repos/${repo}/issues/${prNumber}/comments --paginate`
+  )
+  const issueComments: string[] = issueCommentsRaw
+    ? JSON.parse(issueCommentsRaw).map((c: { body: string }) => c.body)
+    : []
+
+  const reviewCommentsRaw = ghExec(
+    `gh api repos/${repo}/pulls/${prNumber}/comments --paginate`
+  )
+  const reviewComments: string[] = reviewCommentsRaw
+    ? JSON.parse(reviewCommentsRaw).map((c: { body: string }) => c.body)
+    : []
+
+  const reviewsRaw = ghExec(
+    `gh api repos/${repo}/pulls/${prNumber}/reviews --paginate`
+  )
+  const reviews: string[] = reviewsRaw
+    ? JSON.parse(reviewsRaw)
+        .filter((r: { body: string }) => r.body)
+        .map((r: { body: string }) => r.body)
+    : []
+
+  const diff = ghExec(`gh pr diff ${prNumber} --repo ${repo}`)
+
+  console.warn(
+    `PR #${prNumber}: "${prData.title}" | ` +
+      `${issueComments.length} issue comments, ` +
+      `${reviewComments.length} review comments, ` +
+      `${reviews.length} reviews, ` +
+      `diff: ${diff.length} chars`
+  )
+
+  return {
+    title: prData.title || '',
+    body: prData.body || '',
+    labels: (prData.labels || []).map((l: { name: string }) => l.name),
+    issueComments,
+    reviewComments,
+    reviews,
+    diff
+  }
+}
+
+interface IssueThread {
+  title: string
+  body: string
+  labels: string[]
+  comments: string[]
+}
+
+function fetchIssueThread(issueNumber: string, repo: string): IssueThread {
+  console.warn('Fetching issue thread...')
+
+  const issueView = ghExec(
+    `gh issue view ${issueNumber} --repo ${repo} --json title,body,labels`
+  )
+  const issueData = issueView
+    ? JSON.parse(issueView)
+    : { title: '', body: '', labels: [] }
+
+  const commentsRaw = ghExec(
+    `gh api repos/${repo}/issues/${issueNumber}/comments --paginate`
+  )
+  const comments: string[] = commentsRaw
+    ? JSON.parse(commentsRaw).map((c: { body: string }) => c.body)
+    : []
+
+  console.warn(
+    `Issue #${issueNumber}: "${issueData.title}" | ` +
+      `${comments.length} comments`
+  )
+
+  return {
+    title: issueData.title || '',
+    body: issueData.body || '',
+    labels: (issueData.labels || []).map((l: { name: string }) => l.name),
+    comments
+  }
+}
+
+// ── Media extraction ──
+
+const MEDIA_EXTENSIONS = /\.(png|jpg|jpeg|gif|webp|mp4|webm|mov)$/i
+
+const MEDIA_URL_PATTERNS = [
+  // Markdown images: ![alt](url) or ![alt](url "title"); only the URL is captured
+  /!\[[^\]]*\]\(\s*([^)\s]+)[^)]*\)/g,
+  // GitHub user-attachments
+  /https:\/\/github\.com\/user-attachments\/assets\/[a-f0-9-]+/g,
+  // Private user images
+  /https:\/\/private-user-images\.githubusercontent\.com\/[^\s)"]+/g,
+  // Raw URLs with media extensions (standalone or in text)
+  /(?<!="|=')https?:\/\/[^\s)<>"]+\.(?:png|jpg|jpeg|gif|webp|mp4|webm|mov)(?:\?[^\s)<>"]*)?/gi
+]
+
+export function extractMediaUrls(text: string): string[] {
+  if (!text) return []
+
+  const urls = new Set<string>()
+
+  for (const pattern of MEDIA_URL_PATTERNS) {
+    // Reset lastIndex for global patterns
+    pattern.lastIndex = 0
+    let match: RegExpExecArray | null
+    while ((match = pattern.exec(text)) !== null) {
+      // For markdown images, the URL is in capture group 1
+      const url = match[1] || match[0]
+      // Clean trailing markdown/html artifacts
+      const cleaned = url.replace(/[)>"'\s]+$/, '')
+      if (cleaned.startsWith('http')) {
+        urls.add(cleaned)
+      }
+    }
+  }
+
+  return [...urls]
+}
+
+// ── Media downloading ──
+
+const ALLOWED_MEDIA_DOMAINS = [
+  'github.com',
+  'raw.githubusercontent.com',
+  'user-images.githubusercontent.com',
+  'private-user-images.githubusercontent.com',
+  'objects.githubusercontent.com',
+  'github.githubassets.com'
+]
+
+function isAllowedMediaDomain(url: string): boolean {
+  try {
+    const hostname = new URL(url).hostname
+    return ALLOWED_MEDIA_DOMAINS.some(
+      (domain) => hostname === domain || hostname.endsWith(`.${domain}`)
+    )
+  } catch {
+    return false
+  }
+}
+
+async function downloadMedia(
+  urls: string[],
+  outputDir: string,
+  budgetBytes: number,
+  maxVideoBytes: number
+): Promise<Array<{ path: string; mimeType: string }>> {
+  const downloaded: Array<{ path: string; mimeType: string }> = []
+  let totalBytes = 0
+
+  const mediaDir = resolve(outputDir, 'media')
+  mkdirSync(mediaDir, { recursive: true })
+
+  for (const url of urls) {
+    if (totalBytes >= budgetBytes) {
+      console.warn(
+        `Media budget exhausted (${totalBytes} bytes), skipping rest`
+      )
+      break
+    }
+
+    if (!isAllowedMediaDomain(url)) {
+      console.warn(`Skipping non-GitHub URL: ${url.slice(0, 80)}`)
+      continue
+    }
+
+    try {
+      const response = await fetch(url, {
+        signal: AbortSignal.timeout(15_000),
+        headers: { Accept: 'image/*,video/*' },
+        redirect: 'follow'
+      })
+
+      if (!response.ok) {
+        console.warn(`Failed to download ${url}: ${response.status}`)
+        continue
+      }
+
+      const contentLength = response.headers.get('content-length')
+      if (contentLength) {
+        const declaredSize = Number.parseInt(contentLength, 10)
+        if (declaredSize > budgetBytes - totalBytes) {
+          console.warn(
+            `Content-Length ${declaredSize} would exceed budget, skipping ${url}`
+          )
+          continue
+        }
+      }
+
+      const contentType = response.headers.get('content-type') || ''
+      const buffer = Buffer.from(await response.arrayBuffer())
+
+      // Skip oversized videos
+      const isVideo =
+        contentType.startsWith('video/') ||
+        /\.(mp4|webm|mov)(?:[?#]|$)/i.test(url)
+      if (isVideo && buffer.length > maxVideoBytes) {
+        console.warn(
+          `Skipping large video ${url} (${(buffer.length / 1024 / 1024).toFixed(1)}MB > ${(maxVideoBytes / 1024 / 1024).toFixed(0)}MB cap)`
+        )
+        continue
+      }
+
+      if (totalBytes + buffer.length > budgetBytes) {
+        console.warn(`Would exceed budget, skipping ${url}`)
+        continue
+      }
+
+      const ext = guessExtension(url, contentType)
+      const filename = `media-${downloaded.length}${ext}`
+      const filepath = resolve(mediaDir, filename)
+      writeFileSync(filepath, buffer)
+      totalBytes += buffer.length
+
+      const mimeType = contentType.split(';')[0].trim() || guessMimeType(ext)
+
+      downloaded.push({ path: filepath, mimeType })
+      console.warn(
+        `Downloaded: ${url.slice(0, 80)}... (${(buffer.length / 1024).toFixed(0)}KB)`
+      )
+    } catch (err) {
+      console.warn(`Failed to download ${url}: ${(err as Error).message}`)
+    }
+  }
+
+  console.warn(
+    `Downloaded ${downloaded.length}/${urls.length} media files ` +
+      `(${(totalBytes / 1024 / 1024).toFixed(1)}MB)`
+  )
+  return downloaded
+}
+
+function guessExtension(url: string, contentType: string): string {
+  const urlMatch = url.split(/[?#]/)[0].match(MEDIA_EXTENSIONS)
+  if (urlMatch) return urlMatch[0].toLowerCase()
+
+  const typeMap: Record<string, string> = {
+    'image/png': '.png',
+    'image/jpeg': '.jpg',
+    'image/gif': '.gif',
+    'image/webp': '.webp',
+    'video/mp4': '.mp4',
+    'video/webm': '.webm',
+    'video/quicktime': '.mov'
+  }
+  return typeMap[contentType.split(';')[0].trim().toLowerCase()] || '.bin'
+}
+
+function guessMimeType(ext: string): string {
+  const map: Record<string, string> = {
+    '.png': 'image/png',
+    '.jpg': 'image/jpeg',
+    '.jpeg': 'image/jpeg',
+    '.gif': 'image/gif',
+    '.webp': 'image/webp',
+    '.mp4': 'video/mp4',
+    '.webm': 'video/webm',
+    '.mov': 'video/quicktime'
+  }
+  return map[ext] || 'application/octet-stream'
+}
+
+// ── Gemini analysis ──
+
+function buildIssueAnalysisPrompt(issue: IssueThread): string {
+  const allText = [
+    `# Issue: ${issue.title}`,
+    '',
+    '## Description',
+    issue.body,
+    '',
+    issue.comments.length > 0
+      ? `## Comments\n${issue.comments.join('\n\n---\n\n')}`
+      : ''
+  ]
+    .filter(Boolean)
+    .join('\n')
+
+  return `You are a senior QA engineer analyzing a bug report for ComfyUI frontend — a node-based visual workflow editor for AI image generation (Vue 3 + TypeScript).
+
+The UI has:
+- A large canvas (1280x720 viewport) showing a node graph centered at ~(640, 400)
+- Nodes are boxes with input/output slots connected by wires
+- A hamburger menu (top-left C logo) with File, Edit, Help submenus
+- Sidebars (Workflows, Node Library, Models)
+- A topbar with workflow tabs and Queue button
+- The default workflow loads with these nodes (approximate center coordinates):
+  - Load Checkpoint (~150, 300), CLIP Text Encode x2 (~450, 250 and ~450, 450)
+  - Empty Latent Image (~450, 600), KSampler (~750, 350), VAE Decode (~1000, 350), Save Image (~1200, 350)
+- Right-clicking ON a node shows node actions (Clone, Bypass, Convert, etc.)
+- Right-clicking on EMPTY canvas shows Add Node menu — different from node context menu
+
+Your task: Generate a DETAILED reproduction guide (8-15 steps) to trigger this bug on main.
+
+${allText}
+
+## Available test actions
+Each step must use one of these actions:
+
+### Menu actions
+- "openMenu" — clicks the Comfy hamburger menu (top-left C logo)
+- "hoverMenuItem" — hovers a top-level menu item to open submenu (label required)
+- "clickMenuItem" — clicks an item in the visible submenu (label required)
+
+### Element actions (by visible text)
+- "click" — clicks an element by visible text (text required)
+- "rightClick" — right-clicks an element to open context menu (text required)
+- "doubleClick" — double-clicks an element or coordinates (text or x,y)
+- "fillDialog" — fills dialog input and presses Enter (text required)
+- "pressKey" — presses a keyboard key (key required: Escape, Tab, Delete, Enter, etc.)
+
+### Canvas actions (by coordinates — viewport is 1280x720)
+- "clickCanvas" — click at coordinates (x, y required)
+- "rightClickCanvas" — right-click at coordinates (x, y required)
+- "doubleClick" — double-click at coordinates to open node search (x, y)
+- "dragCanvas" — drag from one point to another (fromX, fromY, toX, toY)
+- "scrollCanvas" — scroll wheel for zoom (x, y, deltaY: negative=zoom in, positive=zoom out)
+
+### Utility
+- "wait" — waits briefly (ms required, max 3000)
+- "screenshot" — takes a screenshot (name required)
+
+## Common ComfyUI interactions
+- Right-click a node → context menu with Clone, Bypass, Remove, Colors, etc.
+- Double-click empty canvas → opens node search dialog
+- Ctrl+C / Ctrl+V → copy/paste selected nodes
+- Delete key → remove selected node
+- Ctrl+G → group selected nodes
+- Drag from output slot to input slot → create connection
+- Click a node to select it, Shift+click for multi-select
+
+## Output format
+Return a JSON object with exactly one key: "reproduce", containing:
+{
+  "summary": "One sentence: what bug this issue reports",
+  "test_focus": "Specific behavior to reproduce",
+  "prerequisites": ["e.g. Load default workflow"],
+  "steps": [
+    {
+      "action": "clickCanvas",
+      "description": "Click on first node to select it",
+      "expected_before": "What should happen if the bug is present"
+    }
+  ],
+  "visual_checks": ["Specific visual evidence of the bug to look for"]
+}
+
+## Rules
+- Generate 8-15 DETAILED steps that actually trigger the reported bug.
+- Follow the issue's reproduction steps PRECISELY — translate them into available actions.
+- Use canvas coordinates for node interactions (nodes are typically in the center area 300-900 x 200-500).
+- Take screenshots BEFORE and AFTER critical actions to capture the bug state.
+- Do NOT just open a menu and screenshot — actually perform the full reproduction sequence.
+- Do NOT include login steps.
+- Output ONLY valid JSON, no markdown fences or explanation.`
+}
+
+function buildAnalysisPrompt(thread: PrThread): string {
+  const allText = [
+    `# PR: ${thread.title}`,
+    '',
+    '## Description',
+    thread.body,
+    '',
+    thread.issueComments.length > 0
+      ? `## Issue Comments\n${thread.issueComments.join('\n\n---\n\n')}`
+      : '',
+    thread.reviewComments.length > 0
+      ? `## Review Comments\n${thread.reviewComments.join('\n\n---\n\n')}`
+      : '',
+    thread.reviews.length > 0
+      ? `## Reviews\n${thread.reviews.join('\n\n---\n\n')}`
+      : '',
+    '',
+    '## Diff (truncated)',
+    '```',
+    thread.diff.slice(0, 8000),
+    '```'
+  ]
+    .filter(Boolean)
+    .join('\n')
+
+  return `You are a senior QA engineer analyzing a pull request for ComfyUI frontend (a Vue 3 + TypeScript web application for AI image generation workflows).
+
+Your task: Generate TWO targeted QA test guides — one for BEFORE the PR (main branch) and one for AFTER (PR branch).
+
+${allText}
+
+## Available test actions
+Each step must use one of these actions:
+- "openMenu" — clicks the Comfy hamburger menu (top-left C logo)
+- "hoverMenuItem" — hovers a top-level menu item to open submenu (label required)
+- "clickMenuItem" — clicks an item in the visible submenu (label required)
+- "fillDialog" — fills dialog input and presses Enter (text required)
+- "pressKey" — presses a keyboard key (key required)
+- "click" — clicks an element by visible text (text required)
+- "wait" — waits briefly (ms required, max 3000)
+- "screenshot" — takes a screenshot (name required)
+
+## Output format
+Return a JSON object with exactly two keys: "before" and "after", each containing:
+{
+  "summary": "One sentence: what this PR changes",
+  "test_focus": "Specific behaviors to verify in this recording",
+  "prerequisites": ["e.g. Load default workflow"],
+  "steps": [
+    {
+      "action": "openMenu",
+      "description": "Open the main menu to check file options",
+      "expected_before": "Old behavior description (before key only)",
+      "expected_after": "New behavior description (after key only)"
+    }
+  ],
+  "visual_checks": ["Specific visual elements to look for"]
+}
+
+## Rules
+- BEFORE guide: 2-4 steps, under 15 seconds. Show OLD/missing behavior.
+- AFTER guide: 3-6 steps, under 30 seconds. Prove the fix/feature works.
+- Focus on the SPECIFIC behavior changed by this PR, not generic testing.
+- Use information from PR description, screenshots, and comments to understand intended behavior.
+- Include at least one screenshot step in each guide.
+- Do NOT include login steps.
+- Menu pattern: openMenu -> hoverMenuItem -> clickMenuItem or screenshot.
+- Output ONLY valid JSON, no markdown fences or explanation.`
+}
+
+async function analyzeWithGemini(
+  thread: PrThread,
+  media: Array<{ path: string; mimeType: string }>,
+  model: string,
+  apiKey: string
+): Promise<{ before: QaGuide; after: QaGuide }> {
+  const genAI = new GoogleGenerativeAI(apiKey)
+  const geminiModel = genAI.getGenerativeModel({ model })
+
+  const prompt = buildAnalysisPrompt(thread)
+
+  const parts: Array<
+    { text: string } | { inlineData: { mimeType: string; data: string } }
+  > = [{ text: prompt }]
+
+  // Add media as inline data
+  for (const item of media) {
+    try {
+      const buffer = readFileSync(item.path)
+      parts.push({
+        inlineData: {
+          mimeType: item.mimeType,
+          data: buffer.toString('base64')
+        }
+      })
+    } catch (err) {
+      console.warn(
+        `Failed to read media ${item.path}: ${(err as Error).message}`
+      )
+    }
+  }
+
+  console.warn(
+    `Sending to ${model}: ${prompt.length} chars text, ${media.length} media files`
+  )
+
+  const result = await geminiModel.generateContent({
+    contents: [{ role: 'user', parts }],
+    generationConfig: {
+      temperature: 0.2,
+      maxOutputTokens: 8192,
+      responseMimeType: 'application/json'
+    }
+  })
+
+  let text = result.response.text()
+  // Strip markdown fences if present
+  text = text
+    .replace(/^```(?:json)?\n?/gm, '')
+    .replace(/```$/gm, '')
+    .trim()
+
+  console.warn('Gemini response received')
+  console.warn('Raw response (first 500 chars):', text.slice(0, 500))
+  const parsed = JSON.parse(text)
+
+  // Handle different response shapes from Gemini
+  let before: QaGuide
+  let after: QaGuide
+
+  if (Array.isArray(parsed) && parsed.length >= 2) {
+    // Array format: [before, after]
+    before = parsed[0]
+    after = parsed[1]
+  } else if (parsed.before && parsed.after) {
+    // Object format: { before, after }
+    before = parsed.before
+    after = parsed.after
+  } else {
+    // Try nested wrapper keys
+    const inner = parsed.qa_guide ?? parsed.guides ?? parsed
+    if (inner.before && inner.after) {
+      before = inner.before
+      after = inner.after
+    } else {
+      console.warn(
+        'Full response:',
+        JSON.stringify(parsed, null, 2).slice(0, 2000)
+      )
+      throw new Error(
+        `Unexpected response shape. Got keys: ${Object.keys(parsed).join(', ')}`
+      )
+    }
+  }
+
+  return { before, after }
+}
+
+async function analyzeIssueWithGemini(
+  issue: IssueThread,
+  media: Array<{ path: string; mimeType: string }>,
+  model: string,
+  apiKey: string
+): Promise<QaGuide> {
+  const genAI = new GoogleGenerativeAI(apiKey)
+  const geminiModel = genAI.getGenerativeModel({ model })
+
+  const prompt = buildIssueAnalysisPrompt(issue)
+
+  const parts: Array<
+    { text: string } | { inlineData: { mimeType: string; data: string } }
+  > = [{ text: prompt }]
+
+  for (const item of media) {
+    try {
+      const buffer = readFileSync(item.path)
+      parts.push({
+        inlineData: {
+          mimeType: item.mimeType,
+          data: buffer.toString('base64')
+        }
+      })
+    } catch (err) {
+      console.warn(
+        `Failed to read media ${item.path}: ${(err as Error).message}`
+      )
+    }
+  }
+
+  console.warn(
+    `Sending to ${model}: ${prompt.length} chars text, ${media.length} media files`
+  )
+
+  const result = await geminiModel.generateContent({
+    contents: [{ role: 'user', parts }],
+    generationConfig: {
+      temperature: 0.2,
+      maxOutputTokens: 8192,
+      responseMimeType: 'application/json'
+    }
+  })
+
+  let text = result.response.text()
+  text = text
+    .replace(/^```(?:json)?\n?/gm, '')
+    .replace(/```$/gm, '')
+    .trim()
+
+  console.warn('Gemini response received')
+  console.warn('Raw response (first 500 chars):', text.slice(0, 500))
+  const parsed = JSON.parse(text)
+
+  const guide: QaGuide =
+    parsed.reproduce ?? parsed.qa_guide?.reproduce ?? parsed
+  return guide
+}
+
+// ── Main ──
+
+async function main() {
+  const opts = parseArgs()
+  mkdirSync(opts.outputDir, { recursive: true })
+
+  if (opts.type === 'issue') {
+    await analyzeIssue(opts)
+  } else {
+    await analyzePr(opts)
+  }
+}
+
+async function analyzeIssue(opts: Options) {
+  const issue = fetchIssueThread(opts.prNumber, opts.repo)
+
+  const allText = [issue.body, ...issue.comments].join('\n')
+  const mediaUrls = extractMediaUrls(allText)
+  console.warn(`Found ${mediaUrls.length} media URLs`)
+
+  const media = await downloadMedia(
+    mediaUrls,
+    opts.outputDir,
+    opts.mediaBudgetBytes,
+    opts.maxVideoBytes
+  )
+
+  const guide = await analyzeIssueWithGemini(
+    issue,
+    media,
+    opts.model,
+    opts.apiKey
+  )
+
+  const beforePath = resolve(opts.outputDir, 'qa-guide-before.json')
+  writeFileSync(beforePath, JSON.stringify(guide, null, 2))
+
+  console.warn(`Wrote QA guide:`)
+  console.warn(`  Reproduce: ${beforePath}`)
+}
+
+async function analyzePr(opts: Options) {
+  const thread = fetchPrThread(opts.prNumber, opts.repo)
+
+  const allText = [
+    thread.body,
+    ...thread.issueComments,
+    ...thread.reviewComments,
+    ...thread.reviews
+  ].join('\n')
+  const mediaUrls = extractMediaUrls(allText)
+  console.warn(`Found ${mediaUrls.length} media URLs`)
+
+  const media = await downloadMedia(
+    mediaUrls,
+    opts.outputDir,
+    opts.mediaBudgetBytes,
+    opts.maxVideoBytes
+  )
+
+  const guides = await analyzeWithGemini(thread, media, opts.model, opts.apiKey)
+
+  const beforePath = resolve(opts.outputDir, 'qa-guide-before.json')
+  const afterPath = resolve(opts.outputDir, 'qa-guide-after.json')
+  writeFileSync(beforePath, JSON.stringify(guides.before, null, 2))
+  writeFileSync(afterPath, JSON.stringify(guides.after, null, 2))
+
+  console.warn(`Wrote QA guides:`)
+  console.warn(`  Before: ${beforePath}`)
+  console.warn(`  After:  ${afterPath}`)
+}
+
+function isExecutedAsScript(metaUrl: string): boolean {
+  const modulePath = fileURLToPath(metaUrl)
+  const scriptPath = process.argv[1] ? resolve(process.argv[1]) : ''
+  return modulePath === scriptPath
+}
+
+if (isExecutedAsScript(import.meta.url)) {
+  main().catch((err) => {
+    console.error('QA analysis failed:', err)
+    process.exit(1)
+  })
+}
--- a/scripts/qa-batch.sh
+++ b/scripts/qa-batch.sh
@@ -0,0 +1,176 @@
+#!/usr/bin/env bash
+# Batch-trigger QA runs by creating and pushing sno-qa-* branches.
+#
+# Usage:
+#   ./scripts/qa-batch.sh 10394 10238 9996          # Trigger specific numbers
+#   ./scripts/qa-batch.sh --from tmp/issues.md --top 5  # From triage file
+#   ./scripts/qa-batch.sh --dry-run 10394 10238     # Preview only
+#   ./scripts/qa-batch.sh --cleanup                 # Delete old sno-qa-* branches
+
+set -euo pipefail
+
+DELAY=5
+DRY_RUN=false
+CLEANUP=false
+FROM_FILE=""
+TOP_N=0
+NUMBERS=()
+
+die() { echo "error: $*" >&2; exit 1; }
+
+usage() {
+  cat <<'EOF'
+Usage: qa-batch.sh [options] [numbers...]
+
+Options:
+  --from <file>   Extract numbers from a triage markdown file
+  --top <N>       Take first N entries from Tier 1 (requires --from)
+  --dry-run       Print what would happen without pushing
+  --cleanup       Delete all sno-qa-* remote branches
+  --delay <secs>  Seconds between pushes (default: 5)
+  -h, --help      Show this help
+EOF
+  exit 0
+}
+
+# --- Parse args ---
+while [[ $# -gt 0 ]]; do
+  case "$1" in
+    --from)     FROM_FILE="$2"; shift 2 ;;
+    --top)      TOP_N="$2"; shift 2 ;;
+    --dry-run)  DRY_RUN=true; shift ;;
+    --cleanup)  CLEANUP=true; shift ;;
+    --delay)    DELAY="$2"; shift 2 ;;
+    -h|--help)  usage ;;
+    -*)         die "unknown option: $1" ;;
+    *)          NUMBERS+=("$1"); shift ;;
+  esac
+done
+
+# --- Cleanup mode ---
+if $CLEANUP; then
+  echo "Fetching remote sno-qa-* branches..."
+  branches=$(git ls-remote --heads origin 'refs/heads/sno-qa-*' | awk '{print $2}' | sed 's|refs/heads/||')
+
+  if [[ -z "$branches" ]]; then
+    echo "No sno-qa-* branches found on remote."
+    exit 0
+  fi
+
+  echo "Found branches:"
+  while IFS= read -r b; do echo "  $b"; done <<< "$branches"
+  echo
+
+  if $DRY_RUN; then
+    echo "[dry-run] Would delete the above branches."
+    exit 0
+  fi
+
+  read -rp "Delete all of the above? [y/N] " confirm
+  if [[ "$confirm" != "y" && "$confirm" != "Y" ]]; then
+    echo "Aborted."
+    exit 0
+  fi
+
+  for branch in $branches; do
+    echo "Deleting origin/$branch..."
+    git push origin --delete "$branch"
+  done
+  echo "Done. Cleaned up $(echo "$branches" | wc -l | tr -d ' ') branches."
+  exit 0
+fi
+
+# --- Extract numbers from markdown ---
+if [[ -n "$FROM_FILE" ]]; then
+  [[ -f "$FROM_FILE" ]] || die "file not found: $FROM_FILE"
+  [[ "$TOP_N" -gt 0 ]] || die "--top N required with --from"
+
+  # Extract Tier 1 table rows: | N | [#NNNNN](...) | ...
+  # Stop at the next "## Tier" heading; "|| true" lets the empty check below run
+  extracted=$(awk '/^## Tier 1/,/^## Tier [^1]/' "$FROM_FILE" \
+    | grep -oE '\[#[0-9]+' | sed 's/^\[#//' \
+    | head -n "$TOP_N" || true)
+
+  if [[ -z "$extracted" ]]; then
+    die "no numbers found in $FROM_FILE"
+  fi
+
+  while IFS= read -r num; do
+    NUMBERS+=("$num")
+  done <<< "$extracted"
+fi
+
+[[ ${#NUMBERS[@]} -gt 0 ]] || die "no numbers specified. Use positional args or --from/--top."
+
+# --- Validate ---
+for num in "${NUMBERS[@]}"; do
+  [[ "$num" =~ ^[0-9]+$ ]] || die "invalid number: $num"
+done
+
+# Deduplicate
+# shellcheck disable=SC2207 # mapfile not available on macOS default bash
+NUMBERS=($(printf '%s\n' "${NUMBERS[@]}" | sort -un))
+
+# --- Push branches ---
+echo "Triggering QA for: ${NUMBERS[*]}"
+if $DRY_RUN; then
+  echo "[dry-run]"
+fi
+echo
+
+pushed=()
+skipped=()
+
+# Fetch remote refs once
+remote_refs=$(git ls-remote --heads origin 'refs/heads/sno-qa-*' 2>/dev/null | awk '{print $2}' | sed 's|refs/heads/||')
+
+for num in "${NUMBERS[@]}"; do
+  branch="sno-qa-$num"
+
+  # Check if already exists on remote
+  if echo "$remote_refs" | grep -qx "$branch"; then
+    echo "  skip: $branch (already exists on remote)"
+    skipped+=("$num")
+    continue
+  fi
+
+  if $DRY_RUN; then
+    echo "  would push: $branch"
+    pushed+=("$num")
+    continue
+  fi
+
+  # Create branch at current HEAD and push
+  git branch -f "$branch" HEAD
+  git push origin "$branch"
+  pushed+=("$num")
+  echo "  pushed: $branch"
+
+  # Clean up local branch
+  git branch -D "$branch" 2>/dev/null || true
+
+  # Delay between pushes to avoid CI concurrency storm
+  if [[ "$num" != "${NUMBERS[${#NUMBERS[@]}-1]}" ]]; then
+    echo "  waiting ${DELAY}s..."
+    sleep "$DELAY"
+  fi
+done
+
+# --- Summary ---
+echo
+echo "=== Summary ==="
+echo "Triggered: ${#pushed[@]}"
+echo "Skipped:   ${#skipped[@]}"
+
+if [[ ${#pushed[@]} -gt 0 ]]; then
+  echo
+  echo "Triggered numbers: ${pushed[*]}"
+  repo_url=$(git remote get-url origin | sed 's/\.git$//' | sed 's|git@github.com:|https://github.com/|')
+  echo "Actions: ${repo_url}/actions"
+fi
+
+if [[ ${#skipped[@]} -gt 0 ]]; then
+  echo
+  echo "Skipped (already exist): ${skipped[*]}"
+  echo "Use --cleanup first to remove old branches."
+fi
--- a/scripts/qa-deploy-pages.sh
+++ b/scripts/qa-deploy-pages.sh
@@ -0,0 +1,381 @@
+#!/usr/bin/env bash
+# Deploy QA report to Cloudflare Pages.
+# Expected env vars: CLOUDFLARE_API_TOKEN, CLOUDFLARE_ACCOUNT_ID, RAW_BRANCH,
+#   BEFORE_SHA, AFTER_SHA, TARGET_NUM, TARGET_TYPE, REPO, RUN_ID (optional: PIPELINE_SHA, RUN_URL, RUN_START_TIME)
+# Writes outputs to GITHUB_OUTPUT: badge_status, url
+set -euo pipefail
+
+npm install -g wrangler@4.74.0 >/dev/null 2>&1
+
+DEPLOY_DIR=$(mktemp -d)
+mkdir -p "$DEPLOY_DIR"
+
+for os in Linux macOS Windows; do
+  DIR="qa-artifacts/qa-report-${os}-${RUN_ID}"
+  for prefix in qa qa-before; do
+    VID="${DIR}/${prefix}-session.mp4"
+    if [ -f "$VID" ]; then
+      DEST="$DEPLOY_DIR/${prefix}-${os}.mp4"
+      cp "$VID" "$DEST"
+      echo "Found ${prefix} ${os} video ($(du -h "$VID" | cut -f1))"
+    fi
+  done
+  # Copy multi-pass session videos (qa-session-1, qa-session-2, etc.)
+  for numbered in "$DIR"/qa-session-[0-9].mp4; do
+    [ -f "$numbered" ] || continue
+    NUM=$(basename "$numbered" | sed 's/qa-session-\([0-9]\).mp4/\1/')
+    DEST="$DEPLOY_DIR/qa-${os}-pass${NUM}.mp4"
+    cp "$numbered" "$DEST"
+    echo "Found pass ${NUM} ${os} video ($(du -h "$numbered" | cut -f1))"
+  done
+  # Generate GIF thumbnail from after video (or first pass)
+  THUMB_SRC="$DEPLOY_DIR/qa-${os}.mp4"
+  [ ! -f "$THUMB_SRC" ] && THUMB_SRC="$DEPLOY_DIR/qa-${os}-pass1.mp4"
+  if [ -f "$THUMB_SRC" ]; then
+    ffmpeg -y -ss 10 -i "$THUMB_SRC" -t 8 \
+      -vf "fps=8,scale=480:-1:flags=lanczos,split[s0][s1];[s0]palettegen=max_colors=64[p];[s1][p]paletteuse=dither=bayer" \
+      -loop 0 "$DEPLOY_DIR/qa-${os}-thumb.gif" 2>/dev/null \
+    || echo "GIF generation failed for ${os} (non-fatal)"
+  fi
+done
+
+# Build video cards and report sections
+CARDS=""
+# shellcheck disable=SC2034 # accessed via eval
+ICONS_Linux="&#x1F427;" ICONS_macOS="&#x1F34E;" ICONS_Windows="&#x1FA9F;"
+CARD_COUNT=0
+DL_ICON="<svg width=14 height=14 viewBox='0 0 24 24' fill=none stroke=currentColor stroke-width=2><path d='M21 15v4a2 2 0 0 1-2 2H5a2 2 0 0 1-2-2v-4'/><polyline points='7 10 12 15 17 10'/><line x1=12 y1=15 x2=12 y2=3 /></svg>"
+
+for os in Linux macOS Windows; do
+  eval "ICON=\$ICONS_${os}"
+  OS_LOWER=$(echo "$os" | tr '[:upper:]' '[:lower:]')
+  HAS_BEFORE=$([ -f "$DEPLOY_DIR/qa-before-${os}.mp4" ] && echo 1 || echo 0)
+  HAS_AFTER=$( { [ -f "$DEPLOY_DIR/qa-${os}.mp4" ] || [ -f "$DEPLOY_DIR/qa-${os}-pass1.mp4" ]; } && echo 1 || echo 0)
+  [ "$HAS_AFTER" = "0" ] && continue
+
+  # Collect all reports for this platform (single + multi-pass)
+  REPORT_FILES=""
+  REPORT_LINK=""
+  REPORT_HTML=""
+  for rpt in "video-reviews/${OS_LOWER}-qa-video-report.md" "video-reviews/${OS_LOWER}-pass"*-qa-video-report.md; do
+    [ -f "$rpt" ] && REPORT_FILES="${REPORT_FILES} ${rpt}"
+  done
+
+  if [ -n "$REPORT_FILES" ]; then
+    # Concatenate all reports into one combined report file
+    COMBINED_MD=""
+    for rpt in $REPORT_FILES; do
+      cp "$rpt" "$DEPLOY_DIR/$(basename "$rpt")"
+      RPT_MD=$(sed 's/&/\&amp;/g; s/</\&lt;/g; s/>/\&gt;/g' "$rpt")
+      [ -n "$COMBINED_MD" ] && COMBINED_MD="${COMBINED_MD}&#10;&#10;---&#10;&#10;"
+      COMBINED_MD="${COMBINED_MD}${RPT_MD}"
+    done
+    FIRST_REPORT=$(echo "$REPORT_FILES" | awk '{print $1}')
+    FIRST_BASENAME=$(basename "$FIRST_REPORT")
+    REPORT_LINK="<a class=dl href=${FIRST_BASENAME}><svg width=14 height=14 viewBox='0 0 24 24' fill=none stroke=currentColor stroke-width=2><path d='M14 2H6a2 2 0 0 0-2 2v16a2 2 0 0 0 2 2h12a2 2 0 0 0 2-2V8z'/><polyline points='14 2 14 8 20 8'/><line x1=16 y1=13 x2=8 y2=13 /><line x1=16 y1=17 x2=8 y2=17 /></svg>Report</a>"
+    REPORT_HTML="<details class=report open><summary><svg width=14 height=14 viewBox='0 0 24 24' fill=none stroke=currentColor stroke-width=2><circle cx=12 cy=12 r=10 /><line x1=12 y1=16 x2=12 y2=12 /><line x1=12 y1=8 x2=12.01 y2=8 /></svg> AI Comparative Review</summary><div class=report-body data-md>${COMBINED_MD}</div></details>"
+  fi
+
+  if [ "$HAS_BEFORE" = "1" ]; then
+    CARDS="${CARDS}<div class='card reveal' style='--i:${CARD_COUNT}'><div class=card-header><span class=platform><span class=icon>${ICON}</span>${os}</span><span class=links>${REPORT_LINK}</span></div><div class=comparison><div class=comp-panel><div class=comp-label>Before <span class=comp-tag>main</span></div><div class=video-wrap><video controls muted preload=auto><source src=qa-before-${os}.mp4 type=video/mp4></video></div><div class=comp-dl><a class=dl href=qa-before-${os}.mp4 download>${DL_ICON}Before</a></div></div><div class=comp-panel><div class=comp-label>After <span class=comp-tag>PR</span></div><div class=video-wrap><video controls muted preload=auto><source src=qa-${os}.mp4 type=video/mp4></video></div><div class=comp-dl><a class=dl href=qa-${os}.mp4 download>${DL_ICON}After</a></div></div></div>${REPORT_HTML}</div>"
+  elif [ -f "$DEPLOY_DIR/qa-${os}.mp4" ]; then
+    CARDS="${CARDS}<div class='card reveal' style='--i:${CARD_COUNT}'><div class=video-wrap><video controls muted preload=auto><source src=qa-${os}.mp4 type=video/mp4></video></div><div class=card-body><span class=platform><span class=icon>${ICON}</span>${os}</span><span class=links><a class=dl href=qa-${os}.mp4 download>${DL_ICON}Download</a>${REPORT_LINK}</span></div>${REPORT_HTML}</div>"
+  else
+    PASS_VIDEOS=""
+    for pass_vid in "$DEPLOY_DIR/qa-${os}-pass"[0-9].mp4; do
+      [ -f "$pass_vid" ] || continue
+      PASS_NUM=$(basename "$pass_vid" | sed "s/qa-${os}-pass\([0-9]\).mp4/\1/")
+      PASS_VIDEOS="${PASS_VIDEOS}<div class=comp-panel><div class=comp-label>Pass ${PASS_NUM}</div><div class=video-wrap><video controls muted preload=auto><source src=qa-${os}-pass${PASS_NUM}.mp4 type=video/mp4></video></div><div class=comp-dl><a class=dl href=qa-${os}-pass${PASS_NUM}.mp4 download>${DL_ICON}Pass ${PASS_NUM}</a></div></div>"
+    done
+    CARDS="${CARDS}<div class='card reveal' style='--i:${CARD_COUNT}'><div class=card-header><span class=platform><span class=icon>${ICON}</span>${os}</span><span class=links>${REPORT_LINK}</span></div><div class=comparison>${PASS_VIDEOS}</div>${REPORT_HTML}</div>"
+  fi
+  CARD_COUNT=$((CARD_COUNT + 1))
+done
+
+# Build commit info and target link for the report header
+COMMIT_HTML=""
+REPO_URL="https://github.com/${REPO}"
+if [ -n "${TARGET_NUM:-}" ]; then
+  if [ "$TARGET_TYPE" = "issue" ]; then
+    COMMIT_HTML="<a href=${REPO_URL}/issues/${TARGET_NUM} class=sha title='Issue'>Issue #${TARGET_NUM}</a>"
+  else
+    COMMIT_HTML="<a href=${REPO_URL}/pull/${TARGET_NUM} class=sha title='Pull Request'>PR #${TARGET_NUM}</a>"
+  fi
+fi
+if [ -n "${BEFORE_SHA:-}" ]; then
+  SHORT_BEFORE="${BEFORE_SHA:0:7}"
+  COMMIT_HTML="${COMMIT_HTML:+${COMMIT_HTML} &middot; }<a href=${REPO_URL}/commit/${BEFORE_SHA} class=sha title='main branch'>main @ ${SHORT_BEFORE}</a>"
+fi
+if [ -n "${AFTER_SHA:-}" ]; then
+  SHORT_AFTER="${AFTER_SHA:0:7}"
+  AFTER_LABEL="PR"
+  [ -n "${TARGET_NUM:-}" ] && AFTER_LABEL="#${TARGET_NUM}"
+  COMMIT_HTML="${COMMIT_HTML:+${COMMIT_HTML} &middot; }<a href=${REPO_URL}/commit/${AFTER_SHA} class=sha title='PR head commit'>${AFTER_LABEL} @ ${SHORT_AFTER}</a>"
+fi
+if [ -n "${PIPELINE_SHA:-}" ]; then
+  SHORT_PIPE="${PIPELINE_SHA:0:7}"
+  COMMIT_HTML="${COMMIT_HTML:+${COMMIT_HTML} &middot; }<a href=${REPO_URL}/commit/${PIPELINE_SHA} class=sha title='QA pipeline version'>QA @ ${SHORT_PIPE}</a>"
+fi
+[ -n "$COMMIT_HTML" ] && COMMIT_HTML=" &middot; ${COMMIT_HTML}"
+
+RUN_LINK=""
+if [ -n "${RUN_URL:-}" ]; then
+  RUN_LINK=" &middot; <a href=\"${RUN_URL}\" class=sha title=\"GitHub Actions run\">CI Job</a>"
+fi
+
+# Timing info
+DEPLOY_TIME=$(date -u '+%Y-%m-%d %H:%M UTC')
+TIMING_HTML=""
+if [ -n "${RUN_START_TIME:-}" ]; then
+  TIMING_HTML=" &middot; <span class=sha title='Pipeline timing'>${RUN_START_TIME} &rarr; ${DEPLOY_TIME}</span>"
+fi
+
+# Generate index.html from template
+SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
+TEMPLATE="$SCRIPT_DIR/qa-report-template.html"
+
+# Write dynamic content to temp files for safe substitution
+# Cloudflare Pages _headers file — enable range requests for video seeking
+cat > "$DEPLOY_DIR/_headers" <<'HEADERSEOF'
+/*.mp4
+  Accept-Ranges: bytes
+  Cache-Control: public, max-age=86400
+HEADERSEOF
+
+# Build purpose description from pr-context.txt
+PURPOSE_HTML=""
+if [ -f pr-context.txt ]; then
+  # Extract title line and first paragraph of description
+  PR_TITLE=$(grep -m1 '^Title:' pr-context.txt 2>/dev/null | sed 's/^Title: //; s/&/\&amp;/g; s/</\&lt;/g; s/>/\&gt;/g' || true)
+  if [ "$TARGET_TYPE" = "issue" ]; then
+    PURPOSE_LABEL="Issue #${TARGET_NUM}"
+    PURPOSE_VERB="reports"
+  else
+    PURPOSE_LABEL="PR #${TARGET_NUM}"
+    PURPOSE_VERB="aims to"
+  fi
+  # Get up to 5 lines, capped at 400 bytes, of the description body (after the "Description:" line)
+  PR_DESC=$(sed -n '/^Description:/,/^###/p' pr-context.txt 2>/dev/null | grep -v '^Description:\|^###' | head -5 | sed 's/&/\&amp;/g; s/</\&lt;/g; s/>/\&gt;/g' | tr '\n' ' ' | head -c 400 || true)
+  [ -z "$PR_DESC" ] && PR_DESC=$(sed -n '3,8p' pr-context.txt 2>/dev/null | sed 's/&/\&amp;/g; s/</\&lt;/g; s/>/\&gt;/g' | tr '\n' ' ' | head -c 400 || true)
+  # Build requirements from QA guide JSON
+  REQS_HTML=""
+  QA_GUIDE=$(ls qa-guides/qa-guide-*.json 2>/dev/null | head -1 || true)
+  if [ -f "$QA_GUIDE" ]; then
+    PREREQS=$(python3 -c "
+import json, sys, html
+try:
+  g = json.load(open(sys.argv[1]))
+  prereqs = g.get('prerequisites', [])
+  steps = g.get('steps', [])
+  focus = g.get('test_focus', '')
+  parts = []
+  if focus:
+    parts.append('<strong>Test focus:</strong> ' + html.escape(focus))
+  if prereqs:
+    parts.append('<strong>Prerequisites:</strong> ' + ', '.join(html.escape(p) for p in prereqs))
+  if steps:
+    parts.append('<strong>Steps:</strong> ' + ' → '.join(html.escape(str(s.get('description', s) if isinstance(s, dict) else s)) for s in steps[:6]))
+    if len(steps) > 6:
+      parts[-1] += ' → ...'
+  print('<br>'.join(parts))
+except Exception: pass
+" "$QA_GUIDE" 2>/dev/null)
+    [ -n "$PREREQS" ] && REQS_HTML="<div class=purpose-reqs>${PREREQS}</div>"
+  fi
+
+  PURPOSE_HTML="<div class=purpose><div class=purpose-label>${PURPOSE_LABEL} ${PURPOSE_VERB}</div><strong>${PR_TITLE}</strong><br>${PR_DESC}${REQS_HTML}</div>"
+fi
+
+echo -n "$COMMIT_HTML" > "$DEPLOY_DIR/.commit_html"
+echo -n "$CARDS" > "$DEPLOY_DIR/.cards_html"
+echo -n "$RUN_LINK" > "$DEPLOY_DIR/.run_link"
+# Badge HTML with copy button (placeholder URL filled after deploy)
+echo -n '<div class="badge-bar"><img src="badge.svg" alt="QA Badge" class="badge-img"/><button class="copy-badge" title="Copy badge markdown" onclick="copyBadge()"><svg width=14 height=14 viewBox="0 0 24 24" fill=none stroke=currentColor stroke-width=2><rect x=9 y=9 width=13 height=13 rx=2/><path d="M5 15H4a2 2 0 0 1-2-2V4a2 2 0 0 1 2-2h9a2 2 0 0 1 2 2v1"/></svg></button></div>' > "$DEPLOY_DIR/.badge_html"
+echo -n "${TIMING_HTML:-}" > "$DEPLOY_DIR/.timing_html"
+echo -n "$PURPOSE_HTML" > "$DEPLOY_DIR/.purpose_html"
+python3 -c "
+import sys, pathlib
+d = pathlib.Path(sys.argv[1])
+t = pathlib.Path(sys.argv[2]).read_text()
+t = t.replace('{{COMMIT_HTML}}', (d / '.commit_html').read_text())
+t = t.replace('{{CARDS}}', (d / '.cards_html').read_text())
+t = t.replace('{{RUN_LINK}}', (d / '.run_link').read_text())
+t = t.replace('{{BADGE_HTML}}', (d / '.badge_html').read_text())
+t = t.replace('{{TIMING_HTML}}', (d / '.timing_html').read_text())
+t = t.replace('{{PURPOSE_HTML}}', (d / '.purpose_html').read_text())
+sys.stdout.write(t)
+" "$DEPLOY_DIR" "$TEMPLATE" > "$DEPLOY_DIR/index.html"
+rm -f "$DEPLOY_DIR/.commit_html" "$DEPLOY_DIR/.cards_html" "$DEPLOY_DIR/.run_link" "$DEPLOY_DIR/.badge_html" "$DEPLOY_DIR/.timing_html" "$DEPLOY_DIR/.purpose_html"
+
+cat > "$DEPLOY_DIR/404.html" <<'ERROREOF'
+<!DOCTYPE html><html lang=en><head><meta charset=utf-8><title>404</title>
+<link href="https://fonts.googleapis.com/css2?family=Inter:wght@400;600&display=swap" rel=stylesheet>
+<style>:root{--bg:oklch(8% 0.02 265);--fg:oklch(45% 0.01 265);--err:oklch(62% 0.22 25)}*{margin:0;padding:0;box-sizing:border-box}body{background:var(--bg);color:var(--fg);font-family:'Inter',system-ui,sans-serif;display:flex;align-items:center;justify-content:center;min-height:100vh}div{text-align:center}h1{color:var(--err);font-size:clamp(3rem,8vw,5rem);font-weight:700;letter-spacing:-.04em;margin-bottom:.5rem}p{font-size:1rem;max-width:32ch;line-height:1.5}</style>
+</head><body><div><h1>404</h1><p>File not found. The QA recording may have failed or been cancelled.</p></div></body></html>
+ERROREOF
+
+# Copy research log to deploy dir if it exists
+for rlog in qa-artifacts/*/research/research-log.json qa-artifacts/*/*/research/research-log.json qa-artifacts/before/*/research/research-log.json; do
+  if [ -f "$rlog" ]; then
+    cp "$rlog" "$DEPLOY_DIR/research-log.json"
+    echo "Found research log: $rlog"
+    break
+  fi
+done
+
+# Copy generated test code to deploy dir
+for tfile in qa-artifacts/*/research/reproduce.spec.ts qa-artifacts/*/*/research/reproduce.spec.ts qa-artifacts/before/*/research/reproduce.spec.ts; do
+  if [ -f "$tfile" ]; then
+    cp "$tfile" "$DEPLOY_DIR/reproduce.spec.ts"
+    echo "Found test code: $tfile"
+    break
+  fi
+done
+
+# Generate badge SVGs into deploy dir
+# Priority: research-log.json verdict (a11y-verified) > video review verdict (AI interpretation)
+REPRO_COUNT=0 INCONC_COUNT=0 NOT_REPRO_COUNT=0 TOTAL_REPORTS=0
+
+# Try research log first (ground truth from a11y assertions)
+RESEARCH_VERDICT=""
+REPRO_METHOD=""
+if [ -f "$DEPLOY_DIR/research-log.json" ]; then
+  RESEARCH_VERDICT=$(python3 -c "import json,sys; d=json.load(open(sys.argv[1])); print(d.get('verdict',''))" "$DEPLOY_DIR/research-log.json" 2>/dev/null || true)
+  REPRO_METHOD=$(python3 -c "import json,sys; d=json.load(open(sys.argv[1])); print(d.get('reproducedBy','none'))" "$DEPLOY_DIR/research-log.json" 2>/dev/null || true)
+  echo "Research verdict (a11y-verified): ${RESEARCH_VERDICT:-none} (by: ${REPRO_METHOD:-none})"
+  if [ -n "$RESEARCH_VERDICT" ]; then
+    TOTAL_REPORTS=1
+    case "$RESEARCH_VERDICT" in
+      REPRODUCED) REPRO_COUNT=1 ;;
+      NOT_REPRODUCIBLE) NOT_REPRO_COUNT=1 ;;
+      INCONCLUSIVE) INCONC_COUNT=1 ;;
+    esac
+  fi
+fi
+
+# Fall back to video review verdicts if no research log
+if [ -z "$RESEARCH_VERDICT" ] && [ -d video-reviews ]; then
+  for rpt in video-reviews/*-qa-video-report.md; do
+    [ -f "$rpt" ] || continue
+    TOTAL_REPORTS=$((TOTAL_REPORTS + 1))
+    # Try structured JSON verdict first (from ## Verdict section)
+    VERDICT_JSON=$(grep -oP '"verdict":\s*"[A-Z_]+' "$rpt" 2>/dev/null | tail -1 | grep -oP '[A-Z_]+$' || true)
+    RISK_JSON=$(grep -oP '"risk":\s*"[a-z]+' "$rpt" 2>/dev/null | tail -1 | grep -oP '[a-z]+$' || true)
+
+    if [ -n "$VERDICT_JSON" ]; then
+      case "$VERDICT_JSON" in
+        REPRODUCED) REPRO_COUNT=$((REPRO_COUNT + 1)) ;;
+        NOT_REPRODUCIBLE) NOT_REPRO_COUNT=$((NOT_REPRO_COUNT + 1)) ;;
+        INCONCLUSIVE) INCONC_COUNT=$((INCONC_COUNT + 1)) ;;
+      esac
+    else
+      # Fallback: grep Summary section (for older reports without ## Verdict)
+      SUMM=$(sed -n '/^## Summary/,/^## /p' "$rpt" 2>/dev/null | head -15)
+      if echo "$SUMM" | grep -iq 'INCONCLUSIVE'; then
+        INCONC_COUNT=$((INCONC_COUNT + 1))
+      elif echo "$SUMM" | grep -iq 'not reproduced\|could not reproduce\|could not be confirmed\|unable to reproduce\|fails\? to reproduce\|fails\? to perform\|was NOT\|NOT visible\|not observed\|fail.* to demonstrate\|does not demonstrate\|steps were not performed\|never.*tested\|never.*accessed\|not.* confirmed'; then
+        NOT_REPRO_COUNT=$((NOT_REPRO_COUNT + 1))
+      elif echo "$SUMM" | grep -iq 'reproduc\|confirm'; then
+        REPRO_COUNT=$((REPRO_COUNT + 1))
+      fi
+    fi
+  done
+fi
+FAIL_COUNT=$((TOTAL_REPORTS - REPRO_COUNT - NOT_REPRO_COUNT))
+[ "$FAIL_COUNT" -lt 0 ] && FAIL_COUNT=0
+echo "DEBUG verdict: repro=${REPRO_COUNT} not_repro=${NOT_REPRO_COUNT} inconc=${INCONC_COUNT} fail=${FAIL_COUNT} total=${TOTAL_REPORTS}"
+echo "Verdict: ${REPRO_COUNT}✓ ${NOT_REPRO_COUNT}✗ ${FAIL_COUNT}⚠ / ${TOTAL_REPORTS}"
+
+# Badge text:
+#   Single pass: "REPRODUCED" / "NOT REPRODUCIBLE" / "INCONCLUSIVE"
+#   Multi pass:  "2✓ 0✗ 1⚠ / 3" with color based on dominant result
+REPRO_RESULT="" REPRO_COLOR="#9f9f9f"
+if [ "$TOTAL_REPORTS" -le 1 ]; then
+  # Single report — simple label
+  if [ "$REPRO_COUNT" -gt 0 ]; then
+    REPRO_RESULT="REPRODUCED" REPRO_COLOR="#2196f3"
+  elif [ "$NOT_REPRO_COUNT" -gt 0 ]; then
+    REPRO_RESULT="NOT REPRODUCIBLE" REPRO_COLOR="#9f9f9f"
+  elif [ "$FAIL_COUNT" -gt 0 ]; then
+    REPRO_RESULT="INCONCLUSIVE" REPRO_COLOR="#9f9f9f"
+  fi
+else
+  # Multi pass — show breakdown: X✓ Y✗ Z⚠ / N
+  PARTS=""
+  [ "$REPRO_COUNT" -gt 0 ] && PARTS="${REPRO_COUNT}✓"
+  [ "$NOT_REPRO_COUNT" -gt 0 ] && PARTS="${PARTS:+${PARTS} }${NOT_REPRO_COUNT}✗"
+  [ "$FAIL_COUNT" -gt 0 ] && PARTS="${PARTS:+${PARTS} }${FAIL_COUNT}⚠"
+  REPRO_RESULT="${PARTS} / ${TOTAL_REPORTS}"
+  # Color based on best outcome
+  if [ "$REPRO_COUNT" -gt 0 ]; then
+    REPRO_COLOR="#2196f3"
+  elif [ "$NOT_REPRO_COUNT" -gt 0 ]; then
+    REPRO_COLOR="#9f9f9f"
+  fi
+fi
+
+# Badge label: #NUM QA0327 (with today's date)
+QA_DATE=$(date -u '+%m%d')
+BADGE_LABEL="QA${QA_DATE}"
+[ -n "${TARGET_NUM:-}" ] && BADGE_LABEL="#${TARGET_NUM} QA${QA_DATE}"
+
+# For PRs, also extract fix quality from Overall Risk section
+FIX_RESULT="" FIX_COLOR="#4c1"
+if [ "$TARGET_TYPE" != "issue" ]; then
+  # Try structured JSON risk first
+  ALL_RISKS=$(grep -ohP '"risk":\s*"[a-z]+' video-reviews/*.md 2>/dev/null | grep -oP '[a-z]+$' || true)
+  if [ -n "$ALL_RISKS" ]; then
+    # Use worst risk across all reports
+    if echo "$ALL_RISKS" | grep -q 'high'; then
+      FIX_RESULT="MAJOR ISSUES" FIX_COLOR="#e05d44"
+    elif echo "$ALL_RISKS" | grep -q 'medium'; then
+      FIX_RESULT="MINOR ISSUES" FIX_COLOR="#dfb317"
+    elif echo "$ALL_RISKS" | grep -q 'low'; then
+      FIX_RESULT="APPROVED" FIX_COLOR="#4c1"
+    fi
+  else
+    # Fallback: grep Overall Risk section
+    RISK_TEXT=""
+    if [ -d video-reviews ]; then
+      RISK_TEXT=$(sed -n '/^## Overall Risk/,/^## /p' video-reviews/*.md 2>/dev/null | sed 's/\*//g' | head -20 || true)
+    fi
+    RISK_FIRST=$(echo "$RISK_TEXT" | grep -oiP '^\s*(high|medium|moderate|low|minimal|critical)' | head -1 | tr '[:upper:]' '[:lower:]' || true)
+    if [ -n "$RISK_FIRST" ]; then
+      case "$RISK_FIRST" in
+        *low*|*minimal*) FIX_RESULT="APPROVED" FIX_COLOR="#4c1" ;;
+        *medium*|*moderate*) FIX_RESULT="MINOR ISSUES" FIX_COLOR="#dfb317" ;;
+        *high*|*critical*) FIX_RESULT="MAJOR ISSUES" FIX_COLOR="#e05d44" ;;
+      esac
+    elif echo "$RISK_TEXT" | grep -iq 'no.*risk\|approved\|looks good'; then
+      FIX_RESULT="APPROVED" FIX_COLOR="#4c1"
+    fi
+  fi
+fi
+
+# Always use vertical box badge
+/tmp/gen-badge-box.sh "$DEPLOY_DIR/badge.svg" "$BADGE_LABEL" \
+  "$REPRO_COUNT" "$NOT_REPRO_COUNT" "$FAIL_COUNT" "$TOTAL_REPORTS" \
+  "$FIX_RESULT" "$FIX_COLOR" "$REPRO_METHOD"
+BADGE_STATUS="${REPRO_RESULT:-UNKNOWN}${FIX_RESULT:+ | Fix: ${FIX_RESULT}}"
+echo "badge_status=${BADGE_STATUS:-FINISHED}" >> "$GITHUB_OUTPUT"
+
+# Remove files exceeding Cloudflare Pages 25MB limit to prevent silent deploy failures
+MAX_SIZE=$((25 * 1024 * 1024))
+find "$DEPLOY_DIR" -type f -size +${MAX_SIZE}c | while read -r big_file; do
+  SIZE_MB=$(( $(stat -c%s "$big_file") / 1024 / 1024 ))
+  echo "Removing oversized file: $(basename "$big_file") (${SIZE_MB}MB > 25MB limit)"
+  rm "$big_file"
+done
+
+BRANCH=$(echo "$RAW_BRANCH" | sed 's/[^a-zA-Z0-9-]/-/g' | sed 's/--*/-/g' | sed 's/^-//;s/-$//' | cut -c1-28)
+
+DEPLOY_OUTPUT=$(wrangler pages deploy "$DEPLOY_DIR" \
+  --project-name="comfy-qa" \
+  --branch="$BRANCH" 2>&1) || true
+echo "$DEPLOY_OUTPUT" | tail -5
+
+URL=$(echo "$DEPLOY_OUTPUT" | grep -oE 'https://[a-zA-Z0-9.-]+\.pages\.dev\S*' | head -1 || true)
+FALLBACK_URL="https://${BRANCH}.comfy-qa.pages.dev"
+
+echo "url=${URL:-$FALLBACK_URL}" >> "$GITHUB_OUTPUT"
+echo "Deployed to: ${URL:-$FALLBACK_URL}"
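Review note on the badge logic in the deploy script above: the tally-to-label rules (FAIL_COUNT clamping, the single-report label, and the multi-pass breakdown) can be sketched in TypeScript for easier unit testing. This is a minimal sketch, not code from the PR; `Tally`, `Badge`, and `badgeFor` are hypothetical names chosen for illustration, and only the labels and colors are taken from the script.

```typescript
// Hypothetical TypeScript mirror of the shell badge logic; these names
// (Tally, Badge, badgeFor) are illustrative and do not exist in the repository.
interface Tally {
  reproduced: number
  notReproducible: number
  total: number
}

interface Badge {
  text: string
  color: string
}

const BLUE = '#2196f3'
const GRAY = '#9f9f9f'

function badgeFor({ reproduced, notReproducible, total }: Tally): Badge {
  // Anything that is neither reproduced nor not-reproducible counts as a
  // failure, clamped at zero like FAIL_COUNT in the script.
  const failed = Math.max(0, total - reproduced - notReproducible)
  const color = reproduced > 0 ? BLUE : GRAY

  if (total <= 1) {
    // Single report: plain label, empty when nothing was counted.
    if (reproduced > 0) return { text: 'REPRODUCED', color }
    if (notReproducible > 0) return { text: 'NOT REPRODUCIBLE', color }
    if (failed > 0) return { text: 'INCONCLUSIVE', color }
    return { text: '', color }
  }

  // Multiple reports: list only the non-zero buckets, then the total.
  const parts: string[] = []
  if (reproduced > 0) parts.push(`${reproduced}✓`)
  if (notReproducible > 0) parts.push(`${notReproducible}✗`)
  if (failed > 0) parts.push(`${failed}⚠`)
  return { text: `${parts.join(' ')} / ${total}`, color }
}

// prints "2✓ 1⚠ / 3"
console.log(badgeFor({ reproduced: 2, notReproducible: 0, total: 3 }).text)
```

Note that inconclusive reports are not tracked separately here: as in the script, they fall into the failure bucket because it is computed as the remainder of the total.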
--- a/scripts/qa-generate-test.ts
+++ b/scripts/qa-generate-test.ts
@@ -0,0 +1,208 @@
+#!/usr/bin/env tsx
+/**
+ * Generates a Playwright regression test (.spec.ts) from a QA report + PR diff.
+ * Uses Gemini to produce a test that asserts UI/UX behavior verified during QA.
+ *
+ * Usage:
+ *   pnpm exec tsx scripts/qa-generate-test.ts \
+ *     --qa-report <path>       QA video review report (markdown)
+ *     --pr-diff <path>         PR diff file
+ *     --output <path>          Output .spec.ts file path
+ *     --model <name>           Gemini model (default: gemini-3-flash-preview)
+ */
+import { readFile, writeFile } from 'node:fs/promises'
+import { basename, resolve } from 'node:path'
+
+import { GoogleGenerativeAI } from '@google/generative-ai'
+
+interface CliOptions {
+  qaReport: string
+  prDiff: string
+  output: string
+  model: string
+}
+
+const DEFAULTS: CliOptions = {
+  qaReport: '',
+  prDiff: '',
+  output: '',
+  model: 'gemini-3-flash-preview'
+}
+
+// ── Fixture API reference for the prompt ────────────────────────────
+const FIXTURE_API = `
+## ComfyUI Playwright Test Fixture API
+
+Import pattern:
+\`\`\`typescript
+import { expect } from '@playwright/test'
+import { comfyPageFixture as test } from '../fixtures/ComfyPage'
+\`\`\`
+
+### Available helpers on \`comfyPage\`:
+- \`comfyPage.page\` — raw Playwright Page
+- \`comfyPage.menu.topbar\` — Topbar helper:
+  - \`.getTabNames(): Promise<string[]>\` — get all open tab names
+  - \`.getActiveTabName(): Promise<string>\` — get active tab name
+  - \`.saveWorkflow(name)\` — Save via File > Save dialog
+  - \`.saveWorkflowAs(name)\` — Save via File > Save As dialog
+  - \`.exportWorkflow(name)\` — Export via File > Export dialog
+  - \`.triggerTopbarCommand(path: string[])\` — e.g. ['File', 'Save As']
+  - \`.getWorkflowTab(name)\` — get a tab locator by name
+  - \`.closeWorkflowTab(name)\` — close a tab
+  - \`.openTopbarMenu()\` — open the hamburger menu
+  - \`.openSubmenu(label)\` — hover to open a submenu
+- \`comfyPage.menu.workflowsTab\` — Workflows sidebar:
+  - \`.open()\` / \`.close()\` — toggle sidebar
+  - \`.getTopLevelSavedWorkflowNames()\` — list saved workflows
+  - \`.getPersistedItem(name)\` — get a workflow item locator
+- \`comfyPage.workflow\` — WorkflowHelper:
+  - \`.loadWorkflow(name)\` — load from browser_tests/assets/{name}.json
+  - \`.setupWorkflowsDirectory(structure)\` — setup test directory
+  - \`.deleteWorkflow(name)\` — delete a workflow
+  - \`.isCurrentWorkflowModified(): Promise<boolean>\` — check dirty state
+  - \`.getUndoQueueSize()\` / \`.getRedoQueueSize()\`
+- \`comfyPage.settings.setSetting(key, value)\` — change settings
+- \`comfyPage.keyboard\` — KeyboardHelper:
+  - \`.undo()\` / \`.redo()\` / \`.bypass()\`
+- \`comfyPage.nodeOps\` — NodeOperationsHelper
+- \`comfyPage.canvas\` — CanvasHelper
+- \`comfyPage.contextMenu\` — ContextMenu
+- \`comfyPage.toast\` — ToastHelper
+- \`comfyPage.confirmDialog\` — confirmation dialog
+- \`comfyPage.nextFrame()\` — wait for Vue re-render
+
+### Test patterns:
+- Use \`test.describe('Name', { tag: '@ui' }, () => { ... })\` for UI tests
+- Use \`test.beforeEach\` to set up common state (settings, workflow dir)
+- Use \`expect(locator).toHaveScreenshot('name.png')\` for visual assertions
+- Use \`expect(locator).toBeVisible()\` / \`.toHaveText()\` for behavioral assertions
+- Use \`comfyPage.workflow.setupWorkflowsDirectory({})\` to ensure clean state
+`
+
+// ── Prompt builder ──────────────────────────────────────────────────
+function buildPrompt(qaReport: string, prDiff: string): string {
+  return `You are a Playwright test generator for the ComfyUI frontend.
+
+Your task: Generate a single .spec.ts regression test file that asserts the UI/UX behavior
+described in the QA report below. The test must:
+
+1. Use the ComfyUI Playwright fixture API (documented below)
+2. Test UI/UX behavior ONLY: element visibility, tab names, dialog states, workflow states
+3. NOT test code implementation details
+4. Be concise — only test the behavior that the PR changed
+5. Follow existing test conventions (see API reference)
+
+${FIXTURE_API}
+
+## QA Video Review Report
+${qaReport}
+
+## PR Diff (for context on what changed)
+${prDiff.slice(0, 8000)}
+
+## Output Requirements
+- Output ONLY the .spec.ts file content — no markdown fences, no explanations
+- Start with imports, end with closing brace
+- Use descriptive test names that explain the expected behavior
+- Add screenshot assertions where visual verification matters
+- Keep it focused: 2-5 test cases covering the core behavioral change
+- Use \`test.beforeEach\` for common setup (settings, workflow directory)
+- Tag the describe block with \`{ tag: '@ui' }\` or \`{ tag: '@workflow' }\` as appropriate
+`
+}
+
+// ── Gemini call ─────────────────────────────────────────────────────
+async function generateTest(
+  qaReport: string,
+  prDiff: string,
+  model: string
+): Promise<string> {
+  const apiKey = process.env.GEMINI_API_KEY
+  if (!apiKey) throw new Error('GEMINI_API_KEY env var required')
+
+  const genAI = new GoogleGenerativeAI(apiKey)
+  const genModel = genAI.getGenerativeModel({ model })
+
+  const prompt = buildPrompt(qaReport, prDiff)
+  console.warn(`Sending prompt to ${model} (${prompt.length} chars)...`)
+
+  const result = await genModel.generateContent({
+    contents: [{ role: 'user', parts: [{ text: prompt }] }],
+    generationConfig: {
+      temperature: 0.2,
+      maxOutputTokens: 8192
+    }
+  })
+
+  const text = result.response.text()
+
+  // Strip markdown fences if the model wraps its output; trim first so ^ and $ can match
+  return text
+    .trim()
+    .replace(/^```(?:typescript|ts)?\n?|\n?```$/g, '')
+    .trim()
+}
+
+// ── CLI ─────────────────────────────────────────────────────────────
+function parseArgs(): CliOptions {
+  const args = process.argv.slice(2)
+  const opts = { ...DEFAULTS }
+
+  for (let i = 0; i < args.length; i++) {
+    switch (args[i]) {
+      case '--qa-report':
+        opts.qaReport = args[++i]
+        break
+      case '--pr-diff':
+        opts.prDiff = args[++i]
+        break
+      case '--output':
+        opts.output = args[++i]
+        break
+      case '--model':
+        opts.model = args[++i]
+        break
+      case '--help':
+        console.warn(`Usage:
+  pnpm exec tsx scripts/qa-generate-test.ts [options]
+
+Options:
+  --qa-report <path>   QA video review report (markdown) [required]
+  --pr-diff <path>     PR diff file [required]
+  --output <path>      Output .spec.ts path [required]
+  --model <name>       Gemini model (default: gemini-3-flash-preview)`)
+        process.exit(0)
+    }
+  }
+
+  if (!opts.qaReport || !opts.prDiff || !opts.output) {
+    console.error('Missing required args. Run with --help for usage.')
+    process.exit(1)
+  }
+
+  return opts
+}
+
+async function main() {
+  const opts = parseArgs()
+
+  const qaReport = await readFile(resolve(opts.qaReport), 'utf-8')
+  const prDiff = await readFile(resolve(opts.prDiff), 'utf-8')
+
+  console.warn(
+    `QA report: ${basename(opts.qaReport)} (${qaReport.length} chars)`
+  )
+  console.warn(`PR diff: ${basename(opts.prDiff)} (${prDiff.length} chars)`)
+
+  const testCode = await generateTest(qaReport, prDiff, opts.model)
+
+  const outputPath = resolve(opts.output)
+  await writeFile(outputPath, testCode + '\n')
+  console.warn(`Generated test: ${outputPath} (${testCode.length} chars)`)
+}
+
+main().catch((err) => {
+  console.error(err)
+  process.exit(1)
+})
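Review note on the fence stripping in `generateTest`: model responses often carry leading or trailing whitespace around a fenced block, so the stripping step is worth exercising in isolation. The following is a minimal sketch under that assumption; `stripCodeFence` is a hypothetical helper name, not a function in this PR, and it accepts any alphabetic language tag rather than only `typescript` or `ts`.

```typescript
// Hypothetical helper, not part of the PR: removes one wrapping Markdown fence.
function stripCodeFence(raw: string): string {
  // Trim first so the ^ and $ anchors see the fence markers themselves.
  const trimmed = raw.trim()
  // Opening fence with an optional language tag, the body, then the closing fence.
  const match = /^```[a-zA-Z]*\r?\n([\s\S]*?)\r?\n?```$/.exec(trimmed)
  // Fall back to the trimmed input when no wrapping fence is present.
  return (match?.[1] ?? trimmed).trim()
}

const wrapped = '\n```ts\nconst a = 1\n```\n'
console.log(stripCodeFence(wrapped)) // prints "const a = 1"
```

Unfenced input passes through unchanged apart from trimming, so the helper is safe to apply unconditionally.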
--- a/scripts/qa-record.ts
+++ b/scripts/qa-record.ts
--- a/scripts/qa-report-template.html
+++ b/scripts/qa-report-template.html
@@ -0,0 +1,135 @@
+<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=viewport content="width=device-width,initial-scale=1"><title>QA Session Recordings</title>
+<link rel=preconnect href=https://fonts.googleapis.com><link rel=preconnect href=https://fonts.gstatic.com crossorigin><link href="https://fonts.googleapis.com/css2?family=Inter:wght@400;500;600;700&family=JetBrains+Mono:wght@400;500&display=swap" rel=stylesheet>
+<script src="https://cdn.jsdelivr.net/npm/marked/marked.min.js"></script>
+<style>
+:root{--bg:oklch(97% 0.01 265);--surface:oklch(100% 0 0);--surface-up:oklch(94% 0.01 265);--fg:oklch(15% 0.02 265);--fg-muted:oklch(40% 0.01 265);--fg-dim:oklch(55% 0.01 265);--primary:oklch(50% 0.21 265);--primary-up:oklch(45% 0.21 265);--primary-glow:oklch(55% 0.15 265);--ok:oklch(45% 0.18 155);--err:oklch(50% 0.22 25);--border:oklch(85% 0.01 265);--border-faint:oklch(90% 0.01 265);--r:0.75rem;--r-lg:1rem;--ease-out:cubic-bezier(0.22,1,0.36,1);--dur-base:250ms;--dur-slow:500ms;--font:'Inter',system-ui,sans-serif;--font-mono:'JetBrains Mono',monospace}
+@media(prefers-color-scheme:dark){:root{--bg:oklch(8% 0.02 265);--surface:oklch(12% 0.02 265);--surface-up:oklch(16% 0.02 265);--fg:oklch(96% 0.01 95);--fg-muted:oklch(65% 0.01 265);--fg-dim:oklch(45% 0.01 265);--primary:oklch(62% 0.21 265);--primary-up:oklch(68% 0.21 265);--primary-glow:oklch(62% 0.15 265);--ok:oklch(62% 0.18 155);--err:oklch(62% 0.22 25);--border:oklch(22% 0.02 265);--border-faint:oklch(15% 0.01 265)}}
+*{margin:0;padding:0;box-sizing:border-box}
+body{background:var(--bg);color:var(--fg);font-family:var(--font);min-height:100vh;padding:clamp(1.5rem,4vw,3rem) clamp(1rem,3vw,2rem);position:relative}
+@media(prefers-color-scheme:dark){body::after{content:'';position:fixed;inset:0;pointer-events:none;opacity:.03;background:url("data:image/svg+xml,%3Csvg viewBox='0 0 256 256' xmlns='http://www.w3.org/2000/svg'%3E%3Cfilter id='n'%3E%3CfeTurbulence type='fractalNoise' baseFrequency='.85' numOctaves='4' stitchTiles='stitch'/%3E%3C/filter%3E%3Crect width='100%25' height='100%25' filter='url(%23n)'/%3E%3C/svg%3E")}}
+.container{max-width:1200px;margin:0 auto}
+header{display:flex;align-items:center;gap:1rem;margin-bottom:clamp(1.5rem,4vw,3rem);padding-bottom:1.25rem;border-bottom:1px solid var(--border)}
+.header-icon{width:36px;height:36px;display:grid;place-items:center;background:linear-gradient(135deg,oklch(100% 0 0/.06),oklch(100% 0 0/.02));backdrop-filter:blur(12px);border:1px solid oklch(100% 0 0/.1);border-radius:var(--r);flex-shrink:0}
+.header-icon svg{color:var(--primary)}
+h1{font-size:clamp(1.25rem,2.5vw,1.625rem);font-weight:700;letter-spacing:-.03em;background:linear-gradient(135deg,var(--fg),var(--fg-muted));-webkit-background-clip:text;-webkit-text-fill-color:transparent;background-clip:text}
+.meta{color:var(--fg-dim);font-size:.8125rem;margin-top:.15rem;letter-spacing:.01em}
+.grid{display:grid;grid-template-columns:repeat(auto-fill,minmax(min(480px,100%),1fr));gap:1.5rem}
+.card{background:var(--surface);border:1px solid var(--border);border-radius:var(--r-lg);overflow:hidden;transition:border-color var(--dur-base) var(--ease-out),box-shadow var(--dur-base) var(--ease-out),transform var(--dur-base) var(--ease-out)}
+.card:hover{border-color:var(--primary);box-shadow:0 4px 16px oklch(0% 0 0/.1);transform:translateY(-2px)}
+.video-wrap{position:relative;background:var(--surface);border-bottom:1px solid var(--border-faint)}
+.video-wrap video{width:100%;display:block;aspect-ratio:16/9;object-fit:contain}
+.card-body{padding:.75rem 1rem;display:flex;align-items:center;justify-content:space-between}
+.platform{display:flex;align-items:center;gap:.5rem;font-weight:600;font-size:.9375rem;letter-spacing:-.01em}
+.icon{font-size:1.125rem}
+.links{display:flex;gap:.5rem}
+.dl{color:var(--fg-muted);text-decoration:none;font-size:.75rem;font-weight:500;display:inline-flex;align-items:center;gap:.3rem;padding:.25rem .6rem;border-radius:9999px;border:1px solid var(--border);background:oklch(100% 0 0/.03);transition:all var(--dur-base) var(--ease-out)}
+.dl:hover{color:var(--primary-up);border-color:var(--primary);background:oklch(62% 0.21 265/.08)}
+.badge{font-size:.6875rem;font-weight:600;padding:.2rem .625rem;border-radius:9999px;text-transform:uppercase;letter-spacing:.05em}
+.card-header{padding:.75rem 1rem;display:flex;align-items:center;justify-content:space-between;border-bottom:1px solid var(--border-faint)}
+.comparison{display:grid;grid-template-columns:1fr 1fr;gap:0}
+.comp-panel{border-right:1px solid var(--border-faint)}
+.comp-panel:last-child{border-right:none}
+.comp-label{padding:.4rem .75rem;font-size:.7rem;font-weight:600;text-transform:uppercase;letter-spacing:.05em;color:var(--fg-muted);background:var(--surface);display:flex;align-items:center;gap:.4rem}
+.comp-tag{font-size:.6rem;padding:.1rem .4rem;border-radius:9999px;font-weight:600}
+.comp-panel:first-child .comp-tag{background:oklch(65% 0.01 265/.15);color:var(--fg-muted);border:1px solid var(--border)}
+.comp-panel:last-child .comp-tag{background:oklch(62% 0.18 155/.15);color:var(--ok);border:1px solid oklch(62% 0.18 155/.25)}
+.comp-dl{padding:.4rem .75rem;display:flex;justify-content:center}
+.report{border-top:1px solid var(--border-faint);padding:.75rem 1rem;font-size:.8125rem}
+.report summary{cursor:pointer;color:var(--fg-muted);font-weight:500;display:flex;align-items:center;gap:.4rem;user-select:none;transition:color var(--dur-base) var(--ease-out)}
+.report summary:hover{color:var(--fg)}
+.report summary svg{flex-shrink:0;opacity:.5}
+.report[open] summary{margin-bottom:.75rem;padding-bottom:.5rem;border-bottom:1px solid var(--border-faint)}
+.report-body{line-height:1.7;color:var(--fg-muted);overflow-x:auto}
+.report-body h1,.report-body h2{margin:1.25rem 0 .5rem;color:var(--fg);font-size:1rem;font-weight:600;letter-spacing:-.02em;border-bottom:1px solid var(--border-faint);padding-bottom:.4rem}
+.report-body h3{margin:.75rem 0 .4rem;color:var(--fg);font-size:.875rem;font-weight:600}
+.report-body p{margin:.4rem 0}
+.report-body ul,.report-body ol{margin:.4rem 0 .4rem 1.5rem}
+.report-body li{margin:.25rem 0}
+.report-body code{background:var(--surface-up);padding:.125rem .375rem;border-radius:.25rem;font-size:.7rem;font-family:var(--font-mono);border:1px solid var(--border-faint)}
+.report-body h3+p>code:first-child{background:oklch(62% 0.22 25/.15);color:var(--err);border-color:oklch(62% 0.22 25/.25)}
+.report-body h3+p>code:nth-child(2){background:oklch(62% 0.21 265/.15);color:var(--primary-up);border-color:oklch(62% 0.21 265/.25)}
+.report-body h3+p>code:nth-child(3){background:oklch(65% 0.01 265/.15);color:var(--fg-muted);border-color:var(--border)}
+.report-body table{width:100%;border-collapse:collapse;margin:.75rem 0;font-size:.75rem;border:1px solid var(--border);border-radius:var(--r);overflow:hidden}
+.report-body th,.report-body td{border:1px solid var(--border-faint);padding:.5rem .75rem;text-align:left;vertical-align:top;word-wrap:break-word}
+.report-body th{background:var(--surface-up);color:var(--fg);font-weight:600;font-size:.6875rem;text-transform:uppercase;letter-spacing:.05em;position:sticky;top:0;white-space:nowrap}
+.report-body tr:nth-child(even){background:color-mix(in oklch,var(--surface) 50%,transparent)}
+.report-body tr:hover{background:color-mix(in oklch,var(--surface-up) 50%,transparent)}
+.report-body strong{color:var(--fg)}
+.report-body hr{border:none;border-top:1px solid var(--border-faint);margin:1rem 0}
+@keyframes fade-up{from{opacity:0;transform:translateY(16px)}to{opacity:1;transform:translateY(0)}}
+.reveal{animation:fade-up var(--dur-slow) var(--ease-out) both;animation-delay:calc(var(--i,0) * 120ms)}
+@media(prefers-reduced-motion:reduce){.reveal{animation:none}}
+@media(max-width:480px){.grid{grid-template-columns:1fr}.card-body{flex-wrap:wrap;gap:.5rem}}
+.sha{color:var(--primary);text-decoration:none;font-family:var(--font-mono);font-size:.75rem;font-weight:500;padding:.1rem .4rem;border-radius:.25rem;background:oklch(62% 0.21 265/.08);border:1px solid oklch(62% 0.21 265/.15);transition:all var(--dur-base) var(--ease-out)}
+.sha:hover{background:oklch(62% 0.21 265/.15);border-color:var(--primary)}
+.badge-bar{display:flex;align-items:center;gap:.5rem;margin-bottom:1rem}
+.badge-img{height:20px;display:block}
+.copy-badge{background:oklch(100% 0 0/.06);border:1px solid var(--border);color:var(--fg-muted);padding:.3rem .4rem;border-radius:var(--r);cursor:pointer;display:inline-flex;align-items:center;transition:all var(--dur-base) var(--ease-out)}
+.copy-badge:hover{color:var(--primary-up);border-color:var(--primary);background:oklch(62% 0.21 265/.1)}
+.copy-badge.copied{color:var(--ok);border-color:var(--ok)}
+.vseek{width:100%;padding:0 .75rem;background:var(--surface);border-top:1px solid var(--border-faint);position:relative;height:24px;display:flex;align-items:center}
+.vseek input[type=range]{-webkit-appearance:none;appearance:none;width:100%;height:4px;background:var(--border);border-radius:2px;outline:none;cursor:pointer;position:relative;z-index:2}
+.vseek input[type=range]::-webkit-slider-thumb{-webkit-appearance:none;width:12px;height:12px;border-radius:50%;background:var(--primary);cursor:pointer;border:2px solid var(--bg);box-shadow:0 0 4px oklch(0% 0 0/.3)}
+.vseek input[type=range]::-moz-range-thumb{width:12px;height:12px;border-radius:50%;background:var(--primary);cursor:pointer;border:2px solid var(--bg)}
+.vseek .vbuf{position:absolute;left:.75rem;right:.75rem;height:4px;border-radius:2px;pointer-events:none;top:50%;transform:translateY(-50%)}
+.vseek .vbuf-bar{height:100%;background:oklch(62% 0.21 265/.25);border-radius:2px;transition:width 200ms linear}
+.vctrl{display:flex;align-items:center;gap:.375rem;padding:.5rem .75rem;background:var(--surface);border-top:1px solid var(--border-faint);flex-wrap:wrap}
+.vctrl button{background:oklch(100% 0 0/.06);border:1px solid var(--border);color:var(--fg-muted);font-size:.6875rem;font-weight:600;font-family:var(--font-mono);padding:.25rem .5rem;border-radius:.25rem;cursor:pointer;transition:all var(--dur-base) var(--ease-out);white-space:nowrap}
+.vctrl button:hover{color:var(--primary-up);border-color:var(--primary);background:oklch(62% 0.21 265/.1)}
+.vctrl button.active{color:var(--primary);border-color:var(--primary);background:oklch(62% 0.21 265/.15)}
+.vctrl .vtime{font-family:var(--font-mono);font-size:.6875rem;color:var(--fg-dim);min-width:10ch;text-align:center}
+.vctrl .vsep{width:1px;height:1rem;background:var(--border);flex-shrink:0}
+.vctrl .vhint{font-size:.6rem;color:var(--fg-dim);margin-left:auto}
+.purpose{background:linear-gradient(135deg,oklch(100% 0 0/.04),oklch(100% 0 0/.02));border:1px solid oklch(100% 0 0/.08);border-radius:var(--r-lg);padding:1rem 1.25rem;margin-bottom:1.5rem;font-size:.85rem;line-height:1.7;color:var(--fg-muted)}
+.purpose strong{color:var(--fg);font-weight:600}
+.purpose .purpose-label{font-size:.7rem;font-weight:600;text-transform:uppercase;letter-spacing:.05em;color:var(--fg-muted);margin-bottom:.4rem}
+.purpose .purpose-reqs{margin-top:.75rem;padding-top:.75rem;border-top:1px solid oklch(100% 0 0/.06);font-size:.8rem;color:var(--fg-muted);line-height:1.8}
+</style></head><body><div class=container>
+<header><div class=header-icon><svg width=20 height=20 viewBox="0 0 24 24" fill=none stroke=currentColor stroke-width=2 stroke-linecap=round stroke-linejoin=round><polygon points="23 7 16 12 23 17 23 7"/><rect x=1 y=5 width=15 height=14 rx=2 ry=2/></svg></div><div><h1>QA Session Recordings</h1><div class=meta>ComfyUI Frontend &middot; Automated QA{{COMMIT_HTML}}{{RUN_LINK}}{{TIMING_HTML}}</div>{{BADGE_HTML}}</div></header>
+{{PURPOSE_HTML}}<div class=grid>{{CARDS}}</div>
+</div><script>
+function copyBadge(){const u=location.href.replace(/\/[^/]*$/,'/');const b=u+'badge.svg';const md='[![QA Badge]('+b+')]('+u+')';navigator.clipboard.writeText(md).then(()=>{const btn=document.querySelector('.copy-badge');btn.classList.add('copied');btn.innerHTML='<svg width=14 height=14 viewBox="0 0 24 24" fill=none stroke=currentColor stroke-width=2><polyline points="20 6 9 17 4 12"/></svg>';setTimeout(()=>{btn.classList.remove('copied');btn.innerHTML='<svg width=14 height=14 viewBox="0 0 24 24" fill=none stroke=currentColor stroke-width=2><rect x=9 y=9 width=13 height=13 rx=2/><path d="M5 15H4a2 2 0 0 1-2-2V4a2 2 0 0 1 2-2h9a2 2 0 0 1 2 2v1"/></svg>'},2000)})}
+document.querySelectorAll('[data-md]').forEach(el=>{const t=el.textContent;el.removeAttribute('data-md');el.innerHTML=marked.parse(t)});
+const FPS=30,FT=1/FPS,SPEEDS=[0.1,0.25,0.5,1,1.5,2];
+document.querySelectorAll('.video-wrap video').forEach(v=>{
+  v.playbackRate=0.5;v.removeAttribute('autoplay');v.pause();
+  const c=document.createElement('div');c.className='vctrl';
+  const btn=(label,fn)=>{const b=document.createElement('button');b.textContent=label;b.onclick=fn;c.appendChild(b);return b};
+  const sep=()=>{const s=document.createElement('div');s.className='vsep';c.appendChild(s)};
+  const time=document.createElement('span');time.className='vtime';time.textContent='0:00.000';
+  btn('\u23EE',()=>{v.currentTime=0});
+  btn('\u25C0\u25C0',()=>{v.currentTime=Math.max(0,v.currentTime-FT*10)});
+  btn('\u25C0',()=>{v.pause();v.currentTime=Math.max(0,v.currentTime-FT)});
+  const playBtn=btn('\u25B6',()=>{v.paused?v.play():v.pause()});
+  btn('\u25B6\u25B6',()=>{v.pause();v.currentTime+=FT});
+  btn('\u25B6\u25B6\u25B6',()=>{v.currentTime+=FT*10});
+  sep();
+  const spdBtns=SPEEDS.map(s=>{const b=btn(s+'x',()=>{v.playbackRate=s;spdBtns.forEach(x=>x.classList.remove('active'));b.classList.add('active')});if(s===0.5)b.classList.add('active');return b});
+  sep();c.appendChild(time);
+  const hint=document.createElement('span');hint.className='vhint';hint.textContent='\u2190\u2192 frame \u2022 space play';c.appendChild(hint);
+  // Custom seekbar — works even without server range request support
+  const seekWrap=document.createElement('div');seekWrap.className='vseek';
+  const seekBar=document.createElement('input');seekBar.type='range';seekBar.min=0;seekBar.max=1000;seekBar.value=0;seekBar.step=1;
+  const bufWrap=document.createElement('div');bufWrap.className='vbuf';
+  const bufBar=document.createElement('div');bufBar.className='vbuf-bar';bufBar.style.width='0%';
+  bufWrap.appendChild(bufBar);seekWrap.appendChild(bufWrap);seekWrap.appendChild(seekBar);
+  let seeking=false;
+  seekBar.oninput=()=>{seeking=true;if(v.duration){v.currentTime=v.duration*(seekBar.value/1000)}};
+  seekBar.onchange=()=>{seeking=false};
+  v.closest('.video-wrap').after(seekWrap);
+  seekWrap.after(c);
+  v.ontimeupdate=()=>{
+    const m=Math.floor(v.currentTime/60),s=Math.floor(v.currentTime%60),ms=Math.floor((v.currentTime%1)*1000);
+    time.textContent=m+':'+(s<10?'0':'')+s+'.'+String(ms).padStart(3,'0');
+    if(!seeking&&v.duration){seekBar.value=Math.round((v.currentTime/v.duration)*1000)}
+  };
+  v.onprogress=v.onloadeddata=()=>{if(v.buffered.length&&v.duration){bufBar.style.width=(v.buffered.end(v.buffered.length-1)/v.duration*100)+'%'}};
+  v.onplay=()=>{playBtn.textContent='\u23F8'};v.onpause=()=>{playBtn.textContent='\u25B6'};
+  v.parentElement.addEventListener('keydown',e=>{
+    if(e.key==='ArrowLeft'){e.preventDefault();v.pause();v.currentTime=Math.max(0,v.currentTime-FT)}
+    if(e.key==='ArrowRight'){e.preventDefault();v.pause();v.currentTime+=FT}
+    if(e.key===' '){e.preventDefault();v.paused?v.play():v.pause()}
+  });
+  v.parentElement.setAttribute('tabindex','0');
+});
+</script></body></html>
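Reviewer note: the frame-step and timestamp logic in the inline player script above can be read as two pure functions. This is a minimal sketch, not code from the PR; `formatTime` and `stepFrames` are hypothetical names, and the fixed 30 fps value is copied from the `FPS` constant in the template. Only the lower bound is clamped here, as in the template, which leaves the upper bound to the browser.

```typescript
// Hypothetical helpers mirroring the inline player script; not part of the PR.
const FPS = 30
const FRAME = 1 / FPS // duration of one frame in seconds

// Format seconds as m:ss.mmm, matching the ontimeupdate handler.
function formatTime(t: number): string {
  const m = Math.floor(t / 60)
  const s = Math.floor(t % 60)
  const ms = Math.floor((t % 1) * 1000)
  return `${m}:${s < 10 ? '0' : ''}${s}.${String(ms).padStart(3, '0')}`
}

// Move by a signed number of frames, never going below zero.
function stepFrames(current: number, frames: number): number {
  return Math.max(0, current + frames * FRAME)
}
```

Pulling the arithmetic out like this would make it unit-testable without a `<video>` element.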
--- a/scripts/qa-reproduce.ts
+++ b/scripts/qa-reproduce.ts
@@ -0,0 +1,253 @@
+#!/usr/bin/env tsx
+/**
+ * QA Reproduce Phase — Deterministic replay of research plan with narration
+ *
+ * Takes a reproduction plan from the research phase and replays it:
+ * 1. Execute each action deterministically (no AI decisions)
+ * 2. Capture a11y snapshot before/after each action
+ * 3. Gemini describes what visually changed (narration for humans)
+ * 4. Output: narration-log.json with full evidence chain
+ */
+
+import type { Page } from '@playwright/test'
+import { GoogleGenerativeAI } from '@google/generative-ai'
+import { mkdirSync, writeFileSync } from 'fs'
+
+import type { ActionResult } from './qa-record.js'
+
+// ── Types ──
+
+interface ReproductionStep {
+  action: Record<string, unknown> & { action: string }
+  expectedAssertion: string
+}
+
+interface NarrationEntry {
+  step: number
+  action: string
+  params: Record<string, unknown>
+  result: ActionResult
+  a11yBefore: unknown
+  a11yAfter: unknown
+  assertionExpected: string
+  assertionPassed: boolean
+  assertionActual: string
+  geminiNarration: string
+  timestampMs: number
+}
+
+export interface NarrationLog {
+  entries: NarrationEntry[]
+  allAssertionsPassed: boolean
+}
+
+interface ReproduceOptions {
+  page: Page
+  plan: ReproductionStep[]
+  geminiApiKey: string
+  outputDir: string
+}
+
+// ── A11y helpers ──
+
+interface A11yNode {
+  role: string
+  name: string
+  value?: string
+  checked?: boolean
+  disabled?: boolean
+  expanded?: boolean
+  children?: A11yNode[]
+}
+
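+function searchA11y(
+  node: A11yNode | string | null,
+  selector: string
+): A11yNode | null {
+  if (!node) return null
+  const sel = selector.toLowerCase()
+  // locator.ariaSnapshot() returns YAML text, not a tree: match it line by line
+  if (typeof node === 'string') {
+    const line = node.split('\n').find((l) => l.toLowerCase().includes(sel))
+    return line ? { role: 'snapshot-line', name: line.trim() } : null
+  }
+  if (
+    node.name?.toLowerCase().includes(sel) ||
+    node.role?.toLowerCase().includes(sel)
+  ) {
+    return node
+  }
+  for (const child of node.children ?? []) {
+    const found = searchA11y(child, selector)
+    if (found) return found
+  }
+  return null
+}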
+
+function summarizeA11y(node: A11yNode | null): string {
+  if (!node) return 'null'
+  const parts = [`role=${node.role}`, `name="${node.name}"`]
+  if (node.value !== undefined) parts.push(`value="${node.value}"`)
+  if (node.checked !== undefined) parts.push(`checked=${node.checked}`)
+  if (node.disabled) parts.push('disabled')
+  if (node.expanded !== undefined) parts.push(`expanded=${node.expanded}`)
+  return `{${parts.join(', ')}}`
+}
+
+// ── Subtitle overlay ──
+
+async function showSubtitle(page: Page, text: string, step: number) {
+  const encoded = encodeURIComponent(
+    text.slice(0, 120).replace(/\n/g, ' ')
+  ).replace(/'/g, '%27') // encodeURIComponent leaves ' unescaped
+  await page.addScriptTag({
+    content: `(function(){
+      var id='qa-subtitle';
+      var el=document.getElementById(id);
+      if(!el){
+        el=document.createElement('div');
+        el.id=id;
+        Object.assign(el.style,{position:'fixed',bottom:'32px',left:'50%',transform:'translateX(-50%)',zIndex:'2147483646',maxWidth:'90%',padding:'6px 14px',borderRadius:'6px',background:'rgba(0,0,0,0.8)',color:'rgba(255,255,255,0.95)',fontSize:'12px',fontFamily:'system-ui,sans-serif',fontWeight:'400',lineHeight:'1.4',pointerEvents:'none',textAlign:'center',whiteSpace:'normal'});
+        document.body.appendChild(el);
+      }
+      el.textContent='['+${step}+'] '+decodeURIComponent('${encoded}');
+    })()`
+  })
+}
+
+// ── Gemini visual narration ──
+
+async function geminiDescribe(
+  page: Page,
+  geminiApiKey: string,
+  focus: string
+): Promise<string> {
+  try {
+    const screenshot = await page.screenshot({ type: 'jpeg', quality: 70 })
+    const genAI = new GoogleGenerativeAI(geminiApiKey)
+    const model = genAI.getGenerativeModel({ model: 'gemini-3-flash-preview' })
+
+    const result = await model.generateContent([
+      {
+        text: `Describe in 1-2 sentences what you see on this ComfyUI screen. Focus on: ${focus}. Be factual — only describe what is visible.`
+      },
+      {
+        inlineData: {
+          mimeType: 'image/jpeg',
+          data: screenshot.toString('base64')
+        }
+      }
+    ])
+    return result.response.text().trim()
+  } catch (e) {
+    return `(Gemini narration failed: ${e instanceof Error ? e.message.slice(0, 50) : e})`
+  }
+}
+
+// ── Main reproduce function ──
+
+export async function runReproducePhase(
+  opts: ReproduceOptions
+): Promise<NarrationLog> {
+  const { page, plan, geminiApiKey, outputDir } = opts
+  const { executeAction } = await import('./qa-record.js')
+
+  const narrationDir = `${outputDir}/narration`
+  mkdirSync(narrationDir, { recursive: true })
+
+  const entries: NarrationEntry[] = []
+  const startMs = Date.now()
+
+  console.warn(`Reproduce phase: replaying ${plan.length} steps...`)
+
+  for (let i = 0; i < plan.length; i++) {
+    const step = plan[i]
+    const actionObj = step.action
+    const elapsed = Date.now() - startMs
+
+    // Show subtitle
+    await showSubtitle(page, `Step ${i + 1}: ${actionObj.action}`, i + 1)
+    console.warn(`  [${i + 1}/${plan.length}] ${actionObj.action}`)
+
+    // Capture a11y BEFORE
+    const a11yBefore = await page
+      .locator('body')
+      .ariaSnapshot({ timeout: 3000 })
+      .catch(() => null)
+
+    // Execute action
+    const result = await executeAction(
+      page,
+      actionObj as Parameters<typeof executeAction>[1],
+      outputDir
+    )
+    await new Promise((r) => setTimeout(r, 500))
+
+    // Capture a11y AFTER
+    const a11yAfter = await page
+      .locator('body')
+      .ariaSnapshot({ timeout: 3000 })
+      .catch(() => null)
+
+    // Check assertion
+    let assertionPassed = false
+    let assertionActual = ''
+    if (step.expectedAssertion) {
+      // Parse the expected assertion — e.g. "Settings dialog: visible" or "tab count: 2"
+      const parts = step.expectedAssertion.split(':').map((s) => s.trim())
+      const selectorName = parts[0]
+      const expectedState = parts.slice(1).join(':').trim()
+
+      const found = searchA11y(a11yAfter as A11yNode | null, selectorName)
+      assertionActual = found ? summarizeA11y(found) : 'NOT FOUND'
+
+      if (expectedState === 'visible' || expectedState === 'exists') {
+        assertionPassed = found !== null
+      } else if (expectedState === 'hidden' || expectedState === 'gone') {
+        assertionPassed = found === null
+      } else {
+        // Generic: check if the actual state contains the expected text
+        assertionPassed = assertionActual
+          .toLowerCase()
+          .includes(expectedState.toLowerCase())
+      }
+
+      console.warn(
+        `    Assertion: "${step.expectedAssertion}" → ${assertionPassed ? '✓ PASS' : '✗ FAIL'} (actual: ${assertionActual})`
+      )
+    }
+
+    // Gemini narration (visual description for humans)
+    const geminiNarration = await geminiDescribe(
+      page,
+      geminiApiKey,
+      `What changed after ${actionObj.action}?`
+    )
+
+    entries.push({
+      step: i + 1,
+      action: actionObj.action,
+      params: actionObj,
+      result,
+      a11yBefore,
+      a11yAfter,
+      assertionExpected: step.expectedAssertion,
+      assertionPassed,
+      assertionActual,
+      geminiNarration,
+      timestampMs: elapsed
+    })
+  }
+
+  // Final screenshot
+  await page.screenshot({ path: `${outputDir}/reproduce-final.png` })
+
+  const log: NarrationLog = {
+    entries,
+    // An empty plan must not count as a pass (every() is vacuously true)
+    allAssertionsPassed:
+      entries.length > 0 && entries.every((e) => e.assertionPassed)
+  }
+
+  writeFileSync(
+    `${narrationDir}/narration-log.json`,
+    JSON.stringify(log, null, 2)
+  )
+  console.warn(
+    `Reproduce phase complete: ${entries.filter((e) => e.assertionPassed).length}/${entries.length} assertions passed`
+  )
+
+  return log
+}
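Reviewer note: the assertion handling inside `runReproducePhase` follows a small `"<selector>: <state>"` convention that is easier to see in isolation. The sketch below is an illustration under stated assumptions, not code from the PR: `parseAssertion` and `evaluateAssertion` are hypothetical names, and `actual` stands in for the summarized node (or `null` when nothing matched).

```typescript
// Hypothetical helpers; they restate the inline logic and are not in the PR.
type ParsedAssertion = { selector: string; expectedState: string }

// Split on ':'; the first part is the selector, the rest is the state.
function parseAssertion(raw: string): ParsedAssertion {
  const parts = raw.split(':').map((s) => s.trim())
  return {
    selector: parts[0],
    expectedState: parts.slice(1).join(':').trim()
  }
}

// Presence keywords check existence; anything else is a substring match.
function evaluateAssertion(
  expectedState: string,
  actual: string | null
): boolean {
  if (expectedState === 'visible' || expectedState === 'exists') {
    return actual !== null
  }
  if (expectedState === 'hidden' || expectedState === 'gone') {
    return actual === null
  }
  return (actual ?? 'NOT FOUND')
    .toLowerCase()
    .includes(expectedState.toLowerCase())
}
```

Keeping the parser pure would let the accepted formats be covered by unit tests alongside `qa-video-review.test.ts`.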
--- a/scripts/qa-video-review.test.ts
+++ b/scripts/qa-video-review.test.ts
@@ -0,0 +1,150 @@
+import { describe, expect, it } from 'vitest'
+
+import {
+  extractPlatformFromArtifactDirName,
+  pickLatestVideosByPlatform,
+  selectVideoCandidateByFile
+} from './qa-video-review'
+
+describe('extractPlatformFromArtifactDirName', () => {
+  it('extracts and normalizes known qa artifact directory names', () => {
+    expect(
+      extractPlatformFromArtifactDirName('qa-report-Windows-22818315023')
+    ).toBe('windows')
+    expect(
+      extractPlatformFromArtifactDirName('qa-report-macOS-22818315023')
+    ).toBe('macos')
+    expect(
+      extractPlatformFromArtifactDirName('qa-report-Linux-22818315023')
+    ).toBe('linux')
+  })
+
+  it('falls back to slugifying unknown directory names', () => {
+    expect(extractPlatformFromArtifactDirName('custom platform run')).toBe(
+      'custom-platform-run'
+    )
+  })
+})
+
+describe('pickLatestVideosByPlatform', () => {
+  it('keeps only the latest candidate per platform', () => {
+    const selected = pickLatestVideosByPlatform([
+      {
+        platformName: 'windows',
+        videoPath: '/tmp/windows-old.mp4',
+        mtimeMs: 100
+      },
+      {
+        platformName: 'windows',
+        videoPath: '/tmp/windows-new.mp4',
+        mtimeMs: 200
+      },
+      {
+        platformName: 'linux',
+        videoPath: '/tmp/linux.mp4',
+        mtimeMs: 150
+      }
+    ])
+
+    expect(selected).toEqual([
+      {
+        platformName: 'linux',
+        videoPath: '/tmp/linux.mp4',
+        mtimeMs: 150
+      },
+      {
+        platformName: 'windows',
+        videoPath: '/tmp/windows-new.mp4',
+        mtimeMs: 200
+      }
+    ])
+  })
+})
+
+describe('selectVideoCandidateByFile', () => {
+  it('selects a single candidate by artifacts-relative path', () => {
+    const selected = selectVideoCandidateByFile(
+      [
+        {
+          platformName: 'windows',
+          videoPath: '/tmp/qa-artifacts/qa-report-Windows-1/qa-session.mp4',
+          mtimeMs: 100
+        },
+        {
+          platformName: 'linux',
+          videoPath: '/tmp/qa-artifacts/qa-report-Linux-1/qa-session.mp4',
+          mtimeMs: 200
+        }
+      ],
+      {
+        artifactsDir: '/tmp/qa-artifacts',
+        videoFile: 'qa-report-Linux-1/qa-session.mp4'
+      }
+    )
+
+    expect(selected).toEqual({
+      platformName: 'linux',
+      videoPath: '/tmp/qa-artifacts/qa-report-Linux-1/qa-session.mp4',
+      mtimeMs: 200
+    })
+  })
+
+  it('throws when basename matches multiple videos', () => {
+    expect(() =>
+      selectVideoCandidateByFile(
+        [
+          {
+            platformName: 'windows',
+            videoPath: '/tmp/qa-artifacts/qa-report-Windows-1/qa-session.mp4',
+            mtimeMs: 100
+          },
+          {
+            platformName: 'linux',
+            videoPath: '/tmp/qa-artifacts/qa-report-Linux-1/qa-session.mp4',
+            mtimeMs: 200
+          }
+        ],
+        {
+          artifactsDir: '/tmp/qa-artifacts',
+          videoFile: 'qa-session.mp4'
+        }
+      )
+    ).toThrow('matched 2 videos')
+  })
+
+  it('throws when there is no matching video', () => {
+    expect(() =>
+      selectVideoCandidateByFile(
+        [
+          {
+            platformName: 'windows',
+            videoPath: '/tmp/qa-artifacts/qa-report-Windows-1/qa-session.mp4',
+            mtimeMs: 100
+          }
+        ],
+        {
+          artifactsDir: '/tmp/qa-artifacts',
+          videoFile: 'qa-report-macOS-1/qa-session.mp4'
+        }
+      )
+    ).toThrow('No video matched')
+  })
+
+  it('throws when --video-file is blank', () => {
+    expect(() =>
+      selectVideoCandidateByFile(
+        [
+          {
+            platformName: 'windows',
+            videoPath: '/tmp/qa-artifacts/qa-report-Windows-1/qa-session.mp4',
+            mtimeMs: 100
+          }
+        ],
+        {
+          artifactsDir: '/tmp/qa-artifacts',
+          videoFile: '   '
+        }
+      )
+    ).toThrow('--video-file is required')
+  })
+})
--- a/scripts/qa-video-review.ts
+++ b/scripts/qa-video-review.ts
@@ -0,0 +1,765 @@
+#!/usr/bin/env tsx
+import { mkdir, readFile, stat, writeFile } from 'node:fs/promises'
+import { basename, dirname, extname, relative, resolve } from 'node:path'
+import { fileURLToPath } from 'node:url'
+
+import { GoogleGenerativeAI } from '@google/generative-ai'
+import { globSync } from 'glob'
+
+interface CliOptions {
+  artifactsDir: string
+  videoFile: string
+  beforeVideo: string
+  outputDir: string
+  model: string
+  requestTimeoutMs: number
+  dryRun: boolean
+  prContext: string
+  targetUrl: string
+  passLabel: string
+}
+
+interface VideoCandidate {
+  platformName: string
+  videoPath: string
+  mtimeMs: number
+}
+
+const DEFAULT_OPTIONS: CliOptions = {
+  artifactsDir: './tmp/qa-artifacts',
+  videoFile: '',
+  beforeVideo: '',
+  outputDir: './tmp',
+  model: 'gemini-3-flash-preview',
+  requestTimeoutMs: 300_000,
+  dryRun: false,
+  prContext: '',
+  targetUrl: '',
+  passLabel: ''
+}
+
+const USAGE = `Usage:
+  pnpm exec tsx scripts/qa-video-review.ts [options]
+
+Options:
+  --artifacts-dir <path>        Artifacts root directory
+                                 (default: ./tmp/qa-artifacts)
+  --video-file <name-or-path>   Video file to analyze (required)
+                                 (supports basename or relative/absolute path)
+  --before-video <path>         Before video (main branch) for comparison
+                                 When provided, sends both videos to Gemini
+                                 for comparative before/after analysis
+  --output-dir <path>           Output directory for markdown reports
+                                 (default: ./tmp)
+  --model <name>                Gemini model
+                                 (default: gemini-3-flash-preview)
+  --request-timeout-ms <n>      Request timeout in milliseconds
+                                 (default: 300000)
+  --pr-context <file>           File with PR context (title, body, diff)
+                                 for PR-aware review
+  --target-url <url>            Issue or PR URL to include in the report
+  --pass-label <label>          Label for multi-pass reports (e.g. pass1)
+                                 Output becomes {platform}-{label}-qa-video-report.md
+  --dry-run                     Discover videos and output targets only
+  --help                        Show this help text
+
+Environment:
+  GEMINI_API_KEY                Required unless --dry-run
+`
+
+function parsePositiveInteger(rawValue: string, flagName: string): number {
+  const parsedValue = Number(rawValue)
+  if (!Number.isInteger(parsedValue) || parsedValue <= 0) {
+    throw new Error(`Invalid value for ${flagName}: "${rawValue}"`)
+  }
+  return parsedValue
+}
+
+function parseCliOptions(args: string[]): CliOptions {
+  const options: CliOptions = { ...DEFAULT_OPTIONS }
+
+  for (let index = 0; index < args.length; index += 1) {
+    const argument = args[index]
+    const nextValue = args[index + 1]
+    const requireValue = (flagName: string): string => {
+      if (!nextValue || nextValue.startsWith('--')) {
+        throw new Error(`Missing value for ${flagName}`)
+      }
+      index += 1
+      return nextValue
+    }
+
+    if (argument === '--help') {
+      process.stdout.write(USAGE)
+      process.exit(0)
+    }
+
+    if (argument === '--artifacts-dir') {
+      options.artifactsDir = requireValue(argument)
+      continue
+    }
+
+    if (argument === '--video-file') {
+      options.videoFile = requireValue(argument)
+      continue
+    }
+
+    if (argument === '--output-dir') {
+      options.outputDir = requireValue(argument)
+      continue
+    }
+
+    if (argument === '--model') {
+      options.model = requireValue(argument)
+      continue
+    }
+
+    if (argument === '--request-timeout-ms') {
+      options.requestTimeoutMs = parsePositiveInteger(
+        requireValue(argument),
+        argument
+      )
+      continue
+    }
+
+    if (argument === '--before-video') {
+      options.beforeVideo = requireValue(argument)
+      continue
+    }
+
+    if (argument === '--pr-context') {
+      options.prContext = requireValue(argument)
+      continue
+    }
+
+    if (argument === '--target-url') {
+      options.targetUrl = requireValue(argument)
+      continue
+    }
+
+    if (argument === '--pass-label') {
+      options.passLabel = requireValue(argument)
+      continue
+    }
+
+    if (argument === '--dry-run') {
+      options.dryRun = true
+      continue
+    }
+
+    throw new Error(`Unknown argument: ${argument}`)
+  }
+
+  return options
+}
+
+function normalizePlatformName(value: string): string {
+  const slug = value
+    .trim()
+    .toLowerCase()
+    .replace(/[^a-z0-9]+/g, '-')
+    .replace(/^-+|-+$/g, '')
+
+  return slug.length > 0 ? slug : 'unknown-platform'
+}
+
+export function extractPlatformFromArtifactDirName(dirName: string): string {
+  const matchedValue = dirName.match(/^qa-report-(.+?)(?:-\d+)?$/i)?.[1]
+  return normalizePlatformName(matchedValue ?? dirName)
+}
+
+function extractPlatformFromVideoPath(videoPath: string): string {
+  const artifactDirName = basename(dirname(videoPath))
+  return extractPlatformFromArtifactDirName(artifactDirName)
+}
+
+export function pickLatestVideosByPlatform(
+  candidates: VideoCandidate[]
+): VideoCandidate[] {
+  const latestByPlatform = new Map<string, VideoCandidate>()
+
+  for (const candidate of candidates) {
+    const current = latestByPlatform.get(candidate.platformName)
+    if (!current || candidate.mtimeMs > current.mtimeMs) {
+      latestByPlatform.set(candidate.platformName, candidate)
+    }
+  }
+
+  return [...latestByPlatform.values()].sort((a, b) =>
+    a.platformName.localeCompare(b.platformName)
+  )
+}
+
+function toProjectRelativePath(targetPath: string): string {
+  const relativePath = relative(process.cwd(), targetPath)
+  if (relativePath.startsWith('.')) {
+    return relativePath
+  }
+  return `./${relativePath}`
+}
+
+function errorToString(error: unknown): string {
+  return error instanceof Error ? error.message : String(error)
+}
+
+function normalizePathForMatch(value: string): string {
+  return value.replaceAll('\\', '/').replace(/^\.\/+/, '')
+}
+
+export function selectVideoCandidateByFile(
+  candidates: VideoCandidate[],
+  options: { artifactsDir: string; videoFile: string }
+): VideoCandidate {
+  const requestedValue = options.videoFile.trim()
+  if (requestedValue.length === 0) {
+    throw new Error('--video-file is required')
+  }
+
+  const artifactsRoot = resolve(options.artifactsDir)
+  const requestedAbsolutePath = resolve(requestedValue)
+  const requestedPathKey = normalizePathForMatch(requestedValue)
+
+  const matches = candidates.filter((candidate) => {
+    const candidateAbsolutePath = resolve(candidate.videoPath)
+    if (candidateAbsolutePath === requestedAbsolutePath) {
+      return true
+    }
+
+    const candidateBaseName = basename(candidate.videoPath)
+    if (candidateBaseName === requestedValue) {
+      return true
+    }
+
+    const relativeToCwd = normalizePathForMatch(
+      relative(process.cwd(), candidateAbsolutePath)
+    )
+    if (relativeToCwd === requestedPathKey) {
+      return true
+    }
+
+    const relativeToArtifacts = normalizePathForMatch(
+      relative(artifactsRoot, candidateAbsolutePath)
+    )
+    return relativeToArtifacts === requestedPathKey
+  })
+
+  if (matches.length === 1) {
+    return matches[0]
+  }
+
+  if (matches.length === 0) {
+    const availableVideos = candidates.map((candidate) =>
+      toProjectRelativePath(candidate.videoPath)
+    )
+    throw new Error(
+      [
+        `No video matched --video-file "${options.videoFile}".`,
+        'Available videos:',
+        ...availableVideos.map((videoPath) => `- ${videoPath}`)
+      ].join('\n')
+    )
+  }
+
+  throw new Error(
+    [
+      `--video-file "${options.videoFile}" matched ${matches.length} videos.`,
+      'Please pass a more specific path.',
+      ...matches.map((match) => `- ${toProjectRelativePath(match.videoPath)}`)
+    ].join('\n')
+  )
+}
+
+async function collectVideoCandidates(
+  artifactsDir: string
+): Promise<VideoCandidate[]> {
+  const absoluteArtifactsDir = resolve(artifactsDir)
+  const videoPaths = globSync('**/qa-session{,-[0-9]}.mp4', {
+    cwd: absoluteArtifactsDir,
+    absolute: true,
+    nodir: true
+  }).sort()
+
+  const candidates = await Promise.all(
+    videoPaths.map(async (videoPath) => {
+      const videoStat = await stat(videoPath)
+      return {
+        platformName: extractPlatformFromVideoPath(videoPath),
+        videoPath,
+        mtimeMs: videoStat.mtimeMs
+      }
+    })
+  )
+
+  return candidates
+}
+
+function getMimeType(filePath: string): string {
+  const ext = extname(filePath).toLowerCase()
+  const mimeMap: Record<string, string> = {
+    '.mp4': 'video/mp4',
+    '.webm': 'video/webm',
+    '.mov': 'video/quicktime',
+    '.avi': 'video/x-msvideo',
+    '.mkv': 'video/x-matroska',
+    '.m4v': 'video/mp4'
+  }
+  return mimeMap[ext] || 'video/mp4'
+}
+
+function buildReviewPrompt(options: {
+  platformName: string
+  videoPath: string
+  prContext: string
+  isComparative: boolean
+}): string {
+  const { platformName, videoPath, prContext, isComparative } = options
+
+  if (isComparative) {
+    return buildComparativePrompt(platformName, videoPath, prContext)
+  }
+
+  return buildSingleVideoPrompt(platformName, videoPath, prContext)
+}
+
+function buildComparativePrompt(
+  platformName: string,
+  videoPath: string,
+  prContext: string
+): string {
+  const lines = [
+    'You are a senior QA engineer performing a BEFORE/AFTER comparison review.',
+    '',
+    'You are given TWO videos:',
+    '- **Video 1 (BEFORE)**: The main branch BEFORE the PR. This shows the OLD behavior.',
+    '- **Video 2 (AFTER)**: The PR branch AFTER the changes. This shows the NEW behavior.',
+    '',
+    'Both videos show the same test steps executed on different code versions.',
+    ''
+  ]
+
+  if (prContext) {
+    lines.push('## PR Context', prContext, '')
+  }
+
+  lines.push(
+    '## Your Task',
+    `Platform: "${platformName}". After video: ${toProjectRelativePath(videoPath)}.`,
+    '',
+    '1. **BEFORE video**: Does it demonstrate the old behavior or bug that the PR aims to fix?',
+    '   Describe what you observe — this establishes the baseline.',
+    '2. **AFTER video**: Does it prove the PR fix works? Is the intended new behavior visible?',
+    '3. **Comparison**: What specifically changed between before and after?',
+    '4. **Regressions**: Did the PR introduce any new problems visible in the AFTER video',
+    '   that were NOT present in the BEFORE video?',
+    '',
+    'Note: Brief black frames during page transitions are NORMAL.',
+    'Note: Small cyan/purple dashed labels prefixed with "QA:" are annotations placed by the automated test script — they are NOT part of the application UI. Do not treat them as bugs or evidence.',
+    'Report only concrete, visible differences. Avoid speculation.',
+    '',
+    'Return markdown with these sections exactly:',
+    '## Summary',
+    '(What the PR changes, whether BEFORE confirms the old behavior, whether AFTER proves the fix)',
+    '',
+    '## Behavior Changes',
+    'Summarize ALL behavioral differences as a markdown TABLE:',
+    '| Behavior | Before (main) | After (PR) | Verdict |',
+    '',
+    '- **Behavior**: short name for the behavior (e.g. "Save shortcut label", "Menu hover style")',
+    '- **Before (main)**: how it works/looks in the BEFORE video',
+    '- **After (PR)**: how it works/looks in the AFTER video',
+    '- **Verdict**: `Fixed`, `Improved`, `Changed`, `Regression`, or `No Change`',
+    '',
+    'One row per distinct behavior. Include both changed AND unchanged key behaviors',
+    'that were tested, so reviewers can confirm nothing was missed.',
+    '',
+    '## Timeline Comparison',
+    'Present a chronological frame-by-frame comparison as a markdown TABLE:',
+    '| Time | Type | Severity | Before (main) | After (PR) |',
+    '',
+    '- **Time**: timestamp or range from the videos (e.g. `0:05-0:08`)',
+    '- **Type**: category such as `Visual`, `Behavior`, `Layout`, `Text`, `Animation`, `Menu`, `State`',
+    '- **Severity**: `None` (neutral change), `Fixed` (bug resolved), `Regression`, `Minor`, `Major`',
+    '- **Before (main)**: what is observed in the BEFORE video at that time',
+    '- **After (PR)**: what is observed in the AFTER video at that time',
+    '',
+    'Include one row per distinct observable difference. If behavior is identical at a timestamp,',
+    'omit that row. Focus on meaningful differences, not narrating every frame.',
+    '',
+    '## Confirmed Issues',
+    'For each issue, use this exact format:',
+    '',
+    '### [Short issue title]',
+    '`SEVERITY` `TIMESTAMP` `Confidence: LEVEL`',
+    '',
+    '[Description — specify whether it appears in BEFORE, AFTER, or both]',
+    '',
+    '**Evidence:** [What you observed at the given timestamp in which video]',
+    '',
+    '**Suggested Fix:** [Actionable recommendation]',
+    '',
+    '---',
+    '',
+    '## Possible Issues (Needs Human Verification)',
+    '## Overall Risk',
+    '(Assess whether the PR achieves its goal based on the before/after comparison)',
+    '',
+    '## Verdict',
+    'End your report with this EXACT JSON block (no markdown fence):',
+    '{"verdict": "REPRODUCED" | "NOT_REPRODUCIBLE" | "INCONCLUSIVE", "risk": "low" | "medium" | "high", "confidence": "high" | "medium" | "low"}',
+    '- REPRODUCED: the before video confirms the old behavior and the after video shows the fix working',
+    '- NOT_REPRODUCIBLE: the before video does not show the reported bug',
+    '- INCONCLUSIVE: the videos do not adequately demonstrate the behavior change'
+  )
+
+  return lines.filter(Boolean).join('\n')
+}
+
+function buildSingleVideoPrompt(
+  platformName: string,
+  videoPath: string,
+  prContext: string
+): string {
+  const lines = [
+    'You are a senior QA engineer reviewing a UI test session recording.',
+    '',
+    '## ANTI-HALLUCINATION RULES (READ FIRST)',
+    '- Describe ONLY what you can directly observe in the video frames',
+    '- NEVER infer or assume what "must have happened" between frames',
+    '- If a step is not visible in the video, say "NOT SHOWN" — do not guess',
+    '- Your job is to be a CAMERA — report facts, not interpretations',
+    ''
+  ]
+
+  const isIssueContext =
+    prContext &&
+    /^### Issue #|^Title:.*\bbug\b|^This video attempts to reproduce/im.test(
+      prContext
+    )
+
+  if (prContext) {
+    lines.push(
+      '## Phase 1: Blind Observation (describe what you SEE)',
+      'First, describe every UI interaction chronologically WITHOUT knowing the expected outcome:',
+      '- What elements does the user click/hover/type?',
+      '- What dialogs/menus open and close?',
+      '- What keyboard indicators appear? (look for subtitle overlays)',
+      '- What is the BEFORE state and AFTER state of each action?',
+      '',
+      '## Phase 2: Compare against expected behavior',
+      'Now compare your observations against the context below.',
+      'Only claim a match if your Phase 1 observations EXPLICITLY support it.',
+      ''
+    )
+
+    if (isIssueContext) {
+      lines.push(
+        '## Issue Context',
+        prContext,
+        '',
+        '## Comparison Questions',
+        '1. Did the video perform the reproduction steps described in the issue?',
+        '2. Did your Phase 1 observations show the reported bug behavior?',
+        '3. If the steps were not performed or the bug was not visible, say INCONCLUSIVE.',
+        ''
+      )
+    } else {
+      lines.push(
+        '## PR Context',
+        prContext,
+        '',
+        '## Comparison Questions',
+        '1. Did the video test the specific behavior the PR changes?',
+        '2. Did your Phase 1 observations show the expected before/after difference?',
+        '3. If the test was incomplete or inconclusive, say so honestly.',
+        ''
+      )
+    }
+  }
+
+  lines.push(
+    `Review this QA session video for platform "${platformName}".`,
+    `Source video: ${toProjectRelativePath(videoPath)}.`,
+    'The video shows the full test session — analyze it chronologically.',
+    'Focus on UI regressions, broken states, visual glitches, unreadable text, missing labels/i18n, and clear workflow failures.',
+    'Note: Brief black frames during page transitions are NORMAL and should NOT be reported as issues.',
+    'Note: Small cyan/purple dashed labels prefixed with "QA:" are annotations placed by the automated test script — they are NOT part of the application UI. Do not treat them as bugs or evidence.',
+    'Report only concrete, visible problems and avoid speculation.',
+    'If confidence is low, mark it explicitly.',
+    '',
+    'Return markdown with these sections exactly:',
+    '## Summary',
+    isIssueContext
+      ? '(Explain what bug was reported and whether the video confirms it is reproducible)'
+      : prContext
+        ? '(Explain what the PR intended and whether the video confirms it works)'
+        : '',
+    '## Confirmed Issues',
+    'For each confirmed issue, use this exact format (one block per issue):',
+    '',
+    '### [Short issue title]',
+    '`HIGH` `01:03` `Confidence: High`',
+    '',
+    '[Description of the issue — what went wrong and what was expected]',
+    '',
+    '**Evidence:** [What you observed in the video at the given timestamp]',
+    '',
+    '**Suggested Fix:** [Actionable recommendation]',
+    '',
+    '---',
+    '',
+    'The first line after the heading MUST be exactly three backtick-wrapped labels:',
+    '`SEVERITY` `TIMESTAMP` `Confidence: LEVEL`',
+    'Do NOT use a table for issues — use the block format above.',
+    '## Possible Issues (Needs Human Verification)',
+    '## Overall Risk',
+    '',
+    '## Verdict',
+    'End your report with a single-line JSON object of exactly this shape, picking one value per field (no markdown fence):',
+    '{"verdict": "REPRODUCED" | "NOT_REPRODUCIBLE" | "INCONCLUSIVE", "risk": "low" | "medium" | "high" | null, "confidence": "high" | "medium" | "low"}',
+    '- REPRODUCED: the bug/behavior is clearly visible in the video',
+    '- NOT_REPRODUCIBLE: the steps were performed correctly but the bug was not observed',
+    '- INCONCLUSIVE: the reproduction steps were not performed or the video is insufficient'
+  )
+
+  // Keep the '' entries: they are the blank lines that separate markdown blocks in the prompt.
+  return lines.join('\n')
+}
+
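+// Upper bound (100 MiB) on a video that is read fully into memory and sent inline.
+// Base64 encoding inflates the payload by roughly one third on top of this.
+const MAX_VIDEO_BYTES = 100 * 1024 * 1024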
+
+async function readVideoFile(videoPath: string): Promise<Buffer> {
+  const fileStat = await stat(videoPath)
+  if (fileStat.size > MAX_VIDEO_BYTES) {
+    throw new Error(
+      `Video ${basename(videoPath)} is ${formatBytes(fileStat.size)}, exceeds ${formatBytes(MAX_VIDEO_BYTES)} limit`
+    )
+  }
+  return readFile(videoPath)
+}
+
+async function requestGeminiReview(options: {
+  apiKey: string
+  model: string
+  platformName: string
+  videoPath: string
+  beforeVideoPath: string
+  timeoutMs: number
+  prContext: string
+}): Promise<string> {
+  const genAI = new GoogleGenerativeAI(options.apiKey)
+  const model = genAI.getGenerativeModel({ model: options.model })
+
+  const isComparative = options.beforeVideoPath.length > 0
+  const prompt = buildReviewPrompt({
+    platformName: options.platformName,
+    videoPath: options.videoPath,
+    prContext: options.prContext,
+    isComparative
+  })
+
+  const parts: Array<
+    { text: string } | { inlineData: { mimeType: string; data: string } }
+  > = [{ text: prompt }]
+
+  if (isComparative) {
+    const beforeBuffer = await readVideoFile(options.beforeVideoPath)
+    parts.push(
+      { text: 'Video 1 — BEFORE (main branch):' },
+      {
+        inlineData: {
+          mimeType: getMimeType(options.beforeVideoPath),
+          data: beforeBuffer.toString('base64')
+        }
+      }
+    )
+  }
+
+  const afterBuffer = await readVideoFile(options.videoPath)
+  if (isComparative) {
+    parts.push({ text: 'Video 2 — AFTER (PR branch):' })
+  }
+  parts.push({
+    inlineData: {
+      mimeType: getMimeType(options.videoPath),
+      data: afterBuffer.toString('base64')
+    }
+  })
+
+  const result = await model.generateContent(parts, {
+    timeout: options.timeoutMs
+  })
+  const response = result.response
+  const text = response.text()
+
+  if (!text || text.trim().length === 0) {
+    throw new Error('Gemini API returned no output text')
+  }
+
+  return text.trim()
+}
+
+function formatBytes(bytes: number): string {
+  if (bytes < 1024) return `${bytes} B`
+  if (bytes < 1024 * 1024) return `${(bytes / 1024).toFixed(1)} KB`
+  return `${(bytes / (1024 * 1024)).toFixed(1)} MB`
+}
+
+function buildReportMarkdown(input: {
+  platformName: string
+  model: string
+  videoPath: string
+  videoSizeBytes: number
+  beforeVideoPath?: string
+  beforeVideoSizeBytes?: number
+  reviewText: string
+  targetUrl?: string
+}): string {
+  const headerLines = [
+    `# ${input.platformName} QA Video Report`,
+    '',
+    `- Generated at: ${new Date().toISOString()}`,
+    `- Model: \`${input.model}\``
+  ]
+
+  if (input.targetUrl) {
+    headerLines.push(`- Target: ${input.targetUrl}`)
+  }
+
+  if (input.beforeVideoPath) {
+    headerLines.push(
+      `- Before video: \`${toProjectRelativePath(input.beforeVideoPath)}\` (${formatBytes(input.beforeVideoSizeBytes ?? 0)})`,
+      `- After video: \`${toProjectRelativePath(input.videoPath)}\` (${formatBytes(input.videoSizeBytes)})`,
+      '- Mode: **Comparative (before/after)**'
+    )
+  } else {
+    headerLines.push(
+      `- Source video: \`${toProjectRelativePath(input.videoPath)}\``,
+      `- Video size: ${formatBytes(input.videoSizeBytes)}`
+    )
+  }
+
+  headerLines.push('', '## AI Review', '')
+  // join() adds no trailing newline, so add one to leave a blank line after the heading.
+  return `${headerLines.join('\n')}\n${input.reviewText.trim()}\n`
+}
+
+async function reviewVideo(
+  video: VideoCandidate,
+  options: CliOptions,
+  apiKey: string
+): Promise<void> {
+  let prContext = ''
+  if (options.prContext) {
+    try {
+      prContext = await readFile(options.prContext, 'utf-8')
+      process.stdout.write(
+        `[${video.platformName}] Loaded PR context from ${options.prContext}\n`
+      )
+    } catch {
+      process.stdout.write(
+        `[${video.platformName}] Warning: Could not read PR context file ${options.prContext}\n`
+      )
+    }
+  }
+
+  const beforeVideoPath = options.beforeVideo
+    ? resolve(options.beforeVideo)
+    : ''
+
+  if (beforeVideoPath) {
+    const beforeStat = await stat(beforeVideoPath)
+    process.stdout.write(
+      `[${video.platformName}] Before video: ${toProjectRelativePath(beforeVideoPath)} (${formatBytes(beforeStat.size)})\n`
+    )
+  }
+
+  process.stdout.write(
+    `[${video.platformName}] Sending ${beforeVideoPath ? '2 videos (comparative)' : 'video'} to ${options.model}\n`
+  )
+
+  const reviewText = await requestGeminiReview({
+    apiKey,
+    model: options.model,
+    platformName: video.platformName,
+    videoPath: video.videoPath,
+    beforeVideoPath,
+    timeoutMs: options.requestTimeoutMs,
+    prContext
+  })
+
+  const videoStat = await stat(video.videoPath)
+  const passSegment = options.passLabel ? `-${options.passLabel}` : ''
+  const outputPath = resolve(
+    options.outputDir,
+    `${video.platformName}${passSegment}-qa-video-report.md`
+  )
+
+  const reportInput: Parameters<typeof buildReportMarkdown>[0] = {
+    platformName: video.platformName,
+    model: options.model,
+    videoPath: video.videoPath,
+    videoSizeBytes: videoStat.size,
+    reviewText,
+    targetUrl: options.targetUrl || undefined
+  }
+
+  if (beforeVideoPath) {
+    const beforeStat = await stat(beforeVideoPath)
+    reportInput.beforeVideoPath = beforeVideoPath
+    reportInput.beforeVideoSizeBytes = beforeStat.size
+  }
+
+  const reportMarkdown = buildReportMarkdown(reportInput)
+
+  await mkdir(dirname(outputPath), { recursive: true })
+  await writeFile(outputPath, reportMarkdown, 'utf-8')
+
+  process.stdout.write(
+    `[${video.platformName}] Wrote ${toProjectRelativePath(outputPath)}\n`
+  )
+}
+
+function isExecutedAsScript(metaUrl: string): boolean {
+  const modulePath = fileURLToPath(metaUrl)
+  const scriptPath = process.argv[1] ? resolve(process.argv[1]) : ''
+  return modulePath === scriptPath
+}
+
+async function main(): Promise<void> {
+  const options = parseCliOptions(process.argv.slice(2))
+  const candidates = await collectVideoCandidates(options.artifactsDir)
+
+  if (candidates.length === 0) {
+    process.stdout.write(
+      `No qa-session.mp4 files found under ${toProjectRelativePath(resolve(options.artifactsDir))}\n`
+    )
+    return
+  }
+
+  const selectedVideo = selectVideoCandidateByFile(candidates, {
+    artifactsDir: options.artifactsDir,
+    videoFile: options.videoFile
+  })
+
+  process.stdout.write(
+    `Selected ${selectedVideo.platformName}: ${toProjectRelativePath(selectedVideo.videoPath)}\n`
+  )
+
+  if (options.dryRun) {
+    process.stdout.write('\nDry run mode enabled, no API calls were made.\n')
+    return
+  }
+
+  const apiKey = process.env.GEMINI_API_KEY
+  if (!apiKey) {
+    throw new Error('GEMINI_API_KEY is required unless --dry-run is set')
+  }
+
+  await reviewVideo(selectedVideo, options, apiKey)
+}
+
+if (isExecutedAsScript(import.meta.url)) {
+  void main().catch((error: unknown) => {
+    const message = errorToString(error)
+    process.stderr.write(`qa-video-review failed: ${message}\n`)
+    process.exit(1)
+  })
+}
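Reviewer note: the prompt above asks the model to end its report with a JSON verdict object, but this script only writes the raw markdown. Below is a minimal sketch of how a downstream consumer might pull that verdict back out of the report. The helper name `extractVerdict` and the `ReviewVerdict` type are hypothetical and not part of this PR; the field names and allowed values are taken from the prompt text. It assumes the verdict is a flat object with no nested braces.

```typescript
type Verdict = 'REPRODUCED' | 'NOT_REPRODUCIBLE' | 'INCONCLUSIVE'
type Level = 'low' | 'medium' | 'high'

// Hypothetical shape mirroring the JSON object requested in the prompt.
interface ReviewVerdict {
  verdict: Verdict
  risk: Level | null
  confidence: Level
}

const VERDICTS: readonly string[] = [
  'REPRODUCED',
  'NOT_REPRODUCIBLE',
  'INCONCLUSIVE'
]
const LEVELS: readonly string[] = ['low', 'medium', 'high']

// Hypothetical helper: returns the last well-formed verdict object in the
// report, or null when none is present or any field is out of range.
function extractVerdict(reviewText: string): ReviewVerdict | null {
  // Match flat {...} objects that mention "verdict"; nested braces are not supported.
  const candidates = reviewText.match(/\{[^{}]*"verdict"[^{}]*\}/g)
  if (!candidates) return null

  // The prompt asks for the verdict at the end, so prefer the last match.
  const last = candidates.pop()
  if (last === undefined) return null

  let parsed: unknown
  try {
    parsed = JSON.parse(last)
  } catch {
    return null
  }
  if (typeof parsed !== 'object' || parsed === null) return null

  const { verdict, risk, confidence } = parsed as Record<string, unknown>
  if (typeof verdict !== 'string' || !VERDICTS.includes(verdict)) return null
  if (typeof confidence !== 'string' || !LEVELS.includes(confidence)) {
    return null
  }
  // risk may be null, otherwise it must be one of the known levels.
  if (risk !== null && (typeof risk !== 'string' || !LEVELS.includes(risk))) {
    return null
  }

  return {
    verdict: verdict as Verdict,
    risk: risk as Level | null,
    confidence: confidence as Level
  }
}
```

Returning `null` for anything malformed lets the caller fall back to INCONCLUSIVE instead of trusting a partially parsed verdict. That fallback is a design suggestion, not something this PR implements.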
--- a/src/components/common/LazyImage.vue
+++ b/src/components/common/LazyImage.vue
@@ -42,7 +42,6 @@ import type { StyleValue } from 'vue'

 import { useIntersectionObserver } from '@/composables/useIntersectionObserver'
 import { useMediaCache } from '@/services/mediaCacheService'
-import type { ClassValue } from '@/utils/tailwindUtil'

 const {
  src,
@@ -54,8 +53,8 @@ const {
 } = defineProps<{
  src: string
  alt?: string
-  containerClass?: ClassValue
-  imageClass?: ClassValue
+  containerClass?: string
+  imageClass?: string
  imageStyle?: StyleValue
  rootMargin?: string
 }>()
--- a/src/components/templates/thumbnails/DefaultThumbnail.vue
+++ b/src/components/templates/thumbnails/DefaultThumbnail.vue
@@ -3,12 +3,14 @@
    <LazyImage
      :src="src"
      :alt="alt"
-      :image-class="[
-        'transform-gpu transition-transform duration-300 ease-out',
-        isVideoType
-          ? 'w-full h-full object-cover'
-          : 'max-w-full max-h-64 object-contain'
-      ]"
+      :image-class="
+        cn(
+          'transform-gpu transition-transform duration-300 ease-out',
+          isVideoType
+            ? 'size-full object-cover'
+            : 'max-h-64 max-w-full object-contain'
+        )
+      "
      :image-style="
        isHovered ? { transform: `scale(${1 + hoverZoom / 100})` } : undefined
      "
@@ -19,6 +21,7 @@
 <script setup lang="ts">
 import LazyImage from '@/components/common/LazyImage.vue'
 import BaseThumbnail from '@/components/templates/thumbnails/BaseThumbnail.vue'
+import { cn } from '@/utils/tailwindUtil'

 const { src, isVideo } = defineProps<{
  src: string
--- a/src/stores/dialogStore.ts
+++ b/src/stores/dialogStore.ts
@@ -6,7 +6,6 @@ import type { DialogPassThroughOptions } from 'primevue/dialog'
 import { markRaw, ref } from 'vue'
 import type { Component } from 'vue'

-import type GlobalDialog from '@/components/dialog/GlobalDialog.vue'
 import type { ComponentAttrs } from 'vue-component-type-helpers'

 type DialogPosition =
@@ -34,23 +33,19 @@ interface CustomDialogComponentProps {
  headless?: boolean
 }

-export type DialogComponentProps = ComponentAttrs<typeof GlobalDialog> &
-  CustomDialogComponentProps
+export type DialogComponentProps = CustomDialogComponentProps &
+  Record<string, unknown>

-export interface DialogInstance<
-  H extends Component = Component,
-  B extends Component = Component,
-  F extends Component = Component
-> {
+export interface DialogInstance {
  key: string
  visible: boolean
  title?: string
-  headerComponent?: H
-  headerProps?: ComponentAttrs<H>
-  component: B
-  contentProps: ComponentAttrs<B>
-  footerComponent?: F
-  footerProps?: ComponentAttrs<F>
+  headerComponent?: Component
+  headerProps?: Record<string, unknown>
+  component: Component
+  contentProps: Record<string, unknown>
+  footerComponent?: Component
+  footerProps?: Record<string, unknown>
  dialogComponentProps: DialogComponentProps
  priority: number
 }