fix(qa): use first frame (title card) for thumbnail.jpg

feat(qa): generate thumbnail.jpg for comfy-qa.pages.dev dashboard cards
fix(e2e): retry + fallback when setupUser hits duplicate username
2026-04-19 22:09:37 +00:00 · 2026-04-17 07:43:02 +00:00 · 2026-04-17 07:28:06 +00:00 · 2026-04-17 14:10:53 +09:00 · 2026-04-15 19:13:38 +00:00 · 2026-04-15 18:58:44 +00:00
42 changed files with 7013 additions and 284 deletions
--- a/.claude/skills/comfy-qa/SKILL.md
+++ b/.claude/skills/comfy-qa/SKILL.md
@@ -55,6 +55,9 @@ All scripts live in `.claude/skills/comfy-qa/scripts/`:
 | `qa-report-template.html` | Report site template                                  |
 | `qa-video-review.ts`      | Gemini video review                                   |
 | `qa-analyze-pr.ts`        | Deep PR/issue analysis → QA guide                     |
+| `qa-generate-test.ts`     | Regression test generation from QA report             |
+| `qa-reproduce.ts`         | Deterministic replay with narration                   |
+| `scripts/qa-batch.sh`     | Batch-trigger QA for multiple issues                  |

 ## Triggering QA

@@ -80,6 +83,15 @@ pnpm qa --uncommitted          # local changes
 git push origin sno-skills:sno-qa-10253 --force
 ```

+### Via Batch Script
+
+```bash
+./scripts/qa-batch.sh 10394 10238 9996
+./scripts/qa-batch.sh --from tmp/issues.md --top 5
+./scripts/qa-batch.sh --dry-run 10394
+./scripts/qa-batch.sh --cleanup
+```
+
 ## Research Phase (`qa-agent.ts`)

 Claude receives the issue/PR context + a11y tree snapshot + ComfyPage fixture API docs.
@@ -89,12 +101,24 @@ Tools:
 - **`inspect(selector?)`** — Read a11y tree
 - **`readFixture(path)`** — Read fixture source code
 - **`readTest(path)`** — Read existing tests for patterns
+- **`downloadAttachment(url)`** — Fetch a URL (GitHub user-attachments, gist raw, pastebin raw) and return its text, truncated to 8000 characters; use it to inspect attachments that are not workflows
+- **`loadWorkflow(jsonOrUrl)`** — Load a workflow (URL or JSON string) into the canvas via `window.app.loadGraphData()` so `inspect()` reflects the user's graph state
 - **`writeTest(code)`** — Write a Playwright .spec.ts
 - **`runTest()`** — Execute and get pass/fail + errors
 - **`done(verdict, summary, evidence, testCode, videoScript?)`** — Finish

+When the issue attaches a workflow.json, the agent is instructed to call `loadWorkflow` with the attachment URL before inspecting, because many bugs only manifest in the user's specific graph state. The emitted test must itself re-fetch the workflow and call `loadGraphData()` because `runTest` spawns a fresh browser.
+
 When `verdict=REPRODUCED`, Claude also provides a `videoScript` — a separate test file using demowright's `createVideoScript()` for professional narrated demo video with title cards, TTS segments, and outro.

+### Verdict Logic
+
+- **REPRODUCED** — Test passes (asserting the bug exists) → bug is proven
+- **NOT_REPRODUCIBLE** — Claude exhausted attempts, test cannot pass
+- **INCONCLUSIVE** — Agent timed out or encountered infrastructure issues
+
+Auto-completion: if a test passed with a bug-specific assertion but `done()` was never called, the pipeline auto-completes with REPRODUCED. If test runs failed, no passing test was saved, and `done()` was never called, it auto-completes with NOT_REPRODUCIBLE. Trivial assertions (`toBeDefined()`, `toBeGreaterThan(0)`) and discovery-style tests are excluded from auto-save to avoid false positives.
+
 ## Video Recording (demowright)

 Phase 2 uses the video script to record with:
@@ -115,19 +139,126 @@ Features:
 - E2E test code + video script code
 - Verdict banner for NOT_REPRODUCIBLE/INCONCLUSIVE with failure reason
 - Copy badge button (markdown)
-
-## Prerequisites
-
-- `GEMINI_API_KEY` — video review, TTS
-- `ANTHROPIC_API_KEY` — Claude Agent SDK (research phase)
-- `CLOUDFLARE_API_TOKEN` + `CLOUDFLARE_ACCOUNT_ID` — report deployment (CI only)
-- ComfyUI server running (auto-detected, or auto-started)
+- Date-stamped badges, vertical box badge for issues and PRs

 ## CI Workflow (`.github/workflows/pr-qa.yaml`)

 ```
 resolve-matrix → analyze-pr ──┐
-                               ├→ qa-before (main branch)
+                               ├→ qa-before (main branch, worktree build)
                               ├→ qa-after  (PR branch)
                               └→ report (video review, deploy, comment)
 ```
+
+Before/after jobs run **in parallel** on separate runners for clean isolation.
+
+### Issue Reproduce Mode
+
+For issues (not PRs), the pipeline:
+
+1. Fetches the issue body and comments
+2. Runs `qa-analyze-pr.ts --type issue` to generate a QA guide
+3. Runs the research phase (Claude writes E2E test to reproduce)
+4. Records video of the test execution
+5. Posts results as a comment on the issue
+
+## Prerequisites
+
+- Node.js 22+
+- `pnpm` package manager
+- `gh` CLI (authenticated)
+- Playwright browsers: `npx playwright install chromium`
+- Environment variables:
+  - `GEMINI_API_KEY` — PR analysis, video review, TTS
+  - `ANTHROPIC_API_KEY` — Claude Agent SDK (research phase)
+  - `CLOUDFLARE_API_TOKEN` + `CLOUDFLARE_ACCOUNT_ID` — report deployment (CI only)
+- ComfyUI server running (auto-detected, or auto-started)
+
+## Manual QA (Fallback)
+
+When the automated pipeline isn't suitable (e.g., visual-only bugs, complex multi-step interactions), use **playwright-cli** for manual browser interaction:
+
+```bash
+npm install -g @playwright/cli@latest
+
+playwright-cli open http://127.0.0.1:8188
+playwright-cli snapshot
+playwright-cli click e1
+playwright-cli fill e2 "test text"
+playwright-cli press Escape
+playwright-cli screenshot --filename=f.png
+```
+
+Snapshots return element references (`e1`, `e2`, …). Always run `snapshot` after navigation to refresh refs.
+
+## Manual QA Test Plan
+
+When performing manual QA systematically, cover each area below.
+
+### Application Load & Routes
+
+| Test              | Steps                                                        |
+| ----------------- | ------------------------------------------------------------ |
+| Root route loads  | Navigate to `/` — GraphView should render with canvas        |
+| User select route | Navigate to `/user-select` — user selection UI should appear |
+| 404 handling      | Navigate to `/nonexistent` — should handle gracefully        |
+
+### Canvas & Graph View
+
+| Test                      | Steps                                                          |
+| ------------------------- | -------------------------------------------------------------- |
+| Canvas renders            | The LiteGraph canvas is visible and interactive                |
+| Pan canvas                | Click and drag on empty canvas area                            |
+| Zoom in/out               | Use scroll wheel or Alt+=/Alt+-                                |
+| Add node via double-click | Double-click canvas to open search, type "KSampler", select it |
+| Delete node               | Select a node, press Delete key                                |
+| Connect nodes             | Drag from output slot to input slot                            |
+| Copy/Paste                | Select nodes, Ctrl+C then Ctrl+V                               |
+| Undo/Redo                 | Make changes, Ctrl+Z to undo, Ctrl+Y to redo                   |
+| Context menus             | Right-click node vs empty canvas — different menus             |
+
+### Sidebar Tabs
+
+| Test              | Steps                                 |
+| ----------------- | ------------------------------------- |
+| Workflows tab     | Press W — workflows sidebar opens     |
+| Node Library tab  | Press N — node library opens          |
+| Model Library tab | Press M — model library opens         |
+| Tab toggle        | Press same key again — sidebar closes |
+| Search in sidebar | Type in search box — results filter   |
+
+### Settings Dialog
+
+| Test             | Steps                                                |
+| ---------------- | ---------------------------------------------------- |
+| Open settings    | Press Ctrl+, or click settings button                |
+| Change a setting | Toggle a boolean setting — it persists after closing |
+| Search settings  | Type in settings search box — results filter         |
+| Close settings   | Press Escape or click close button                   |
+
+### Execution & Queue
+
+| Test           | Steps                                                 |
+| -------------- | ----------------------------------------------------- |
+| Queue prompt   | Load default workflow, click Queue — execution starts |
+| Queue progress | Progress indicator shows during execution             |
+| Interrupt      | Press Ctrl+Alt+Enter during execution — interrupts    |
+
+## Known Issues & Troubleshooting
+
+See `docs/qa/TROUBLESHOOTING.md` for common failures:
+
+- `set -euo pipefail` + grep with no match → append `|| true`
+- `__name is not defined` in `page.evaluate` → use `addScriptTag`
+- Cursor not visible in videos → monkey-patch `page.mouse` methods
+- Agent not calling `done()` → auto-complete from passing test with bug-specific assertion
+
+## Backlog
+
+See `docs/qa/backlog.md` for planned improvements:
+
+- **Type B comparison**: Different commits for regression detection
+- **Type C comparison**: Cross-browser testing
+- **Pre-seed assets**: Upload test images before recording (for #10424-style bugs)
+- **Custom node install in CI**: Requires backend-in-CI (`comfy node install`); see `QA_REPRODUCE_IMPROVEMENT.md`
+- **Lazy a11y tree**: Reduce token usage with `inspect(selector)` vs full dump
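
Review note (not part of the diff): the auto-save filter described under "Verdict Logic" could be sketched roughly as below. `hasBugSpecificAssertion` and `TRIVIAL_MATCHERS` are hypothetical names; the actual `hasBugAssertion` expression in `qa-agent.ts` is not shown in this diff and may use different rules. The sketch is a line-based heuristic, so an `expect(...)` chain whose matcher sits on a following line is judged by its first line only and counts as bug-specific.

```typescript
// Hypothetical helper: the real check in qa-agent.ts may differ.
// Only the two trivial matchers named in SKILL.md are listed here.
const TRIVIAL_MATCHERS: RegExp[] = [
  /\.toBeDefined\(\s*\)/,
  /\.toBeGreaterThan\(\s*0\s*\)/
]

function hasBugSpecificAssertion(testCode: string): boolean {
  // Keep only the lines that start an expect(...) or expect.poll(...) chain.
  const expectLines = testCode
    .split('\n')
    .filter((line) => /\bexpect(\.poll)?\(/.test(line))

  // A test with no assertions at all is a discovery test, not a repro.
  if (expectLines.length === 0) return false

  // Qualifies only if at least one assertion line is not a trivial matcher.
  return expectLines.some(
    (line) => !TRIVIAL_MATCHERS.some((re) => re.test(line))
  )
}
```

Under this sketch, a test whose only assertion is `expect(count).toBeGreaterThan(0)` would not be auto-saved, while one containing `expect(clonedWidgets).toHaveLength(0)` would.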
--- a/.claude/skills/comfy-qa/scripts/qa-agent.ts
+++ b/.claude/skills/comfy-qa/scripts/qa-agent.ts
@@ -16,14 +16,41 @@
 */

 import type { Page } from '@playwright/test'
-/* eslint-disable import-x/no-unresolved */
-// @ts-expect-error — claude-agent-sdk has no type declarations for vue-tsc
+ 
 import { query, tool, createSdkMcpServer } from '@anthropic-ai/claude-agent-sdk'
-/* eslint-enable import-x/no-unresolved */
+ 
 import { z } from 'zod'
 import { mkdirSync, readFileSync, writeFileSync } from 'fs'
 import { execSync } from 'child_process'

+// ── Helpers ──
+
+/**
+ * Filter npm/pnpm/node warnings that dominate Playwright output tails, then
+ * return the most relevant slice. Prioritizes lines that mention errors,
+ * expected/received values, stack frames, or Playwright FAIL/PASS markers.
+ */
+function summarizePlaywrightOutput(raw: string, maxChars: number): string {
+  const lines = raw.split('\n').filter((l) => {
+    if (/^\s*npm (warn|notice)\b/i.test(l)) return false
+    if (/^\s*pnpm (warn|notice)\b/i.test(l)) return false
+    if (/^\s*\(node:\d+\)\s*(Experimental|Deprecation)Warning/.test(l))
+      return false
+    if (/^\s*Warning:.*Use .* --help/.test(l)) return false
+    return true
+  })
+
+  const errorRegex =
+    /(Error:|Expected|Received|FAIL|PASS|✓|✘|×|Playwright|assert|at [a-zA-Z/._-]+:\d+:\d+|Timeout|waitFor|locator|toHave|toBe|toEqual|strictEqual|deepEqual)/i
+  const keyLines = lines.filter((l) => errorRegex.test(l))
+
+  // Prefer the tail of key lines, fall back to tail of all filtered lines
+  const chosen = keyLines.length > 0 ? keyLines : lines
+  const joined = chosen.join('\n')
+  if (joined.length <= maxChars) return joined
+  return '…(truncated)…\n' + joined.slice(-maxChars)
+}
+
 // ── Types ──

 interface ResearchOptions {
@@ -35,6 +62,7 @@ interface ResearchOptions {
  anthropicApiKey?: string
  maxTurns?: number
  timeBudgetMs?: number
+  model?: string
 }

 export type ReproMethod = 'e2e_test' | 'video' | 'both' | 'none'
@@ -73,6 +101,8 @@ export async function runResearchPhase(
  let finalVideoScript = ''
  let turnCount = 0
  let lastPassedTurn = -1
+  let consecutiveTestFailures = 0
+  const MAX_FAILED_RUNS = 5
  const startTime = Date.now()
  const researchLog: ResearchResult['log'] = []

@@ -216,6 +246,124 @@ export async function runResearchPhase(
    }
  )

+  // ── Tool: downloadAttachment ──
+  const downloadAttachmentTool = tool(
+    'downloadAttachment',
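+    'Download a URL (e.g. GitHub user-attachments, gist raw, pastebin raw) and return its text content, truncated to 8000 characters. Use this to inspect attachments that are not workflows; for a workflow.json, pass the URL directly to loadWorkflow(), because truncated JSON will not parse.',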
+    {
+      url: z
+        .string()
+        .describe(
+          'Absolute URL to fetch. Typically a GitHub user-attachments link or a gist/pastebin raw URL extracted from the issue body.'
+        )
+    },
+    async (args: { url: string }) => {
+      let resultText: string
+      try {
+        const res = await fetch(args.url, { redirect: 'follow' })
+        if (!res.ok) {
+          resultText = `HTTP ${res.status} ${res.statusText} for ${args.url}`
+        } else {
+          const body = await res.text()
+          const truncated =
+            body.length > 8000
+              ? body.slice(0, 8000) +
+                `\n... (truncated, ${body.length} total chars)`
+              : body
+          resultText = truncated
+        }
+      } catch (e) {
+        resultText = `fetch failed: ${e instanceof Error ? e.message : e}`
+      }
+
+      researchLog.push({
+        turn: turnCount,
+        timestampMs: Date.now() - startTime,
+        toolName: 'downloadAttachment',
+        toolInput: args,
+        toolResult: resultText.slice(0, 300)
+      })
+
+      return { content: [{ type: 'text' as const, text: resultText }] }
+    }
+  )
+
+  // ── Tool: loadWorkflow ──
+  const loadWorkflowTool = tool(
+    'loadWorkflow',
+    'Load a ComfyUI workflow into the canvas via window.app.loadGraphData(). Accepts either a URL (http/https — will be fetched) OR a complete workflow JSON string. Returns the resulting node count. For a single step from an issue attachment URL, pass the URL directly — no separate downloadAttachment needed.',
+    {
+      jsonOrUrl: z
+        .string()
+        .describe(
+          'Either a URL starting with http(s):// (will be fetched), or a complete workflow JSON string. Must ultimately parse to an object with a "nodes" array.'
+        )
+    },
+    async (args: { jsonOrUrl: string }) => {
+      let resultText: string
+      try {
+        let jsonText = args.jsonOrUrl
+        if (/^https?:\/\//i.test(jsonText.trim())) {
+          const res = await fetch(jsonText.trim(), { redirect: 'follow' })
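+          if (!res.ok) {
+            // Throw rather than return early: an early return would skip
+            // the researchLog.push() below, so the failed fetch would be
+            // missing from the research log. The catch block formats the
+            // message with a "loadWorkflow failed:" prefix and execution
+            // continues to the shared logging path.
+            throw new Error(
+              `Fetch failed: HTTP ${res.status} ${res.statusText}`
+            )
+          }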
+          jsonText = await res.text()
+        }
+        const parsed = JSON.parse(jsonText) as { nodes?: unknown[] }
+        if (
+          !parsed ||
+          typeof parsed !== 'object' ||
+          !Array.isArray(parsed.nodes)
+        ) {
+          resultText =
+            'Parsed JSON has no "nodes" array — not a ComfyUI workflow.'
+        } else {
+          await page.evaluate((data) => {
+            const w = window as unknown as {
+              app?: { loadGraphData?: (d: unknown) => void }
+            }
+            if (!w.app?.loadGraphData)
+              throw new Error('window.app.loadGraphData unavailable')
+            w.app.loadGraphData(data)
+          }, parsed)
+          // Let canvas settle
+          await page
+            .waitForFunction(
+              () => {
+                const w = window as unknown as {
+                  app?: { graph?: { _nodes?: unknown[] } }
+                }
+                return (w.app?.graph?._nodes?.length ?? 0) > 0
+              },
+              { timeout: 5000 }
+            )
+            .catch(() => {})
+          resultText = `Workflow loaded. nodes=${parsed.nodes.length}`
+        }
+      } catch (e) {
+        resultText = `loadWorkflow failed: ${e instanceof Error ? e.message : e}`
+      }
+
+      researchLog.push({
+        turn: turnCount,
+        timestampMs: Date.now() - startTime,
+        toolName: 'loadWorkflow',
+        toolInput: { inputLength: args.jsonOrUrl.length },
+        toolResult: resultText.slice(0, 300)
+      })
+
+      return { content: [{ type: 'text' as const, text: resultText }] }
+    }
+  )
+
  // ── Tool: writeTest ──
  const writeTestTool = tool(
    'writeTest',
@@ -281,11 +429,11 @@ export async function runResearchPhase(
            }
          }
        )
-        resultText = `TEST PASSED:\n${output.slice(-1500)}`
+        resultText = `TEST PASSED:\n${summarizePlaywrightOutput(output, 1500)}`
      } catch (e) {
        const err = e as { stdout?: string; stderr?: string; message?: string }
        const output = (err.stdout || '') + '\n' + (err.stderr || '')
-        resultText = `TEST FAILED:\n${output.slice(-2000)}`
+        resultText = `TEST FAILED:\n${summarizePlaywrightOutput(output, 2500)}`
      }

      researchLog.push({
@@ -299,6 +447,7 @@ export async function runResearchPhase(
      // Auto-save passing test code for fallback completion — but only if
      // the test contains a bug-specific assertion (not just a discovery/debug test)
      if (resultText.startsWith('TEST PASSED')) {
+        consecutiveTestFailures = 0
        try {
          const code = readFileSync(browserTestPath, 'utf-8')
          const hasBugAssertion =
@@ -317,6 +466,13 @@ export async function runResearchPhase(
        }
        resultText +=
          '\n\n⚠️ Test PASSED — call done() now with verdict REPRODUCED and the test code. Do NOT write more tests.'
+      } else {
+        consecutiveTestFailures++
+        if (consecutiveTestFailures >= MAX_FAILED_RUNS) {
+          resultText += `\n\n⛔ ${consecutiveTestFailures} consecutive test failures — you MUST call done() NOW with verdict NOT_REPRODUCIBLE and summary explaining why. Do not run more tests.`
+        } else if (consecutiveTestFailures >= 3) {
+          resultText += `\n\n⚠️ ${consecutiveTestFailures} failed runs. ${MAX_FAILED_RUNS - consecutiveTestFailures} remaining before forced done(NOT_REPRODUCIBLE).`
+        }
      }

      return { content: [{ type: 'text' as const, text: resultText }] }
@@ -383,6 +539,8 @@ export async function runResearchPhase(
      inspectTool,
      readFixtureTool,
      readTestTool,
+      downloadAttachmentTool,
+      loadWorkflowTool,
      writeTestTool,
      runTestTool,
      doneTool
@@ -396,21 +554,42 @@ export async function runResearchPhase(
 - inspect(selector?) — Read the accessibility tree to understand the current UI. Use to discover selectors, element names, and UI state.
 - readFixture(path) — Read fixture source code from browser_tests/fixtures/. Use to discover available methods. E.g. "helpers/CanvasHelper.ts", "components/Topbar.ts", "ComfyPage.ts"
 - readTest(path) — Read an existing test from browser_tests/tests/ to learn patterns. E.g. "workflow.spec.ts". Pass any name to list available files.
+- downloadAttachment(url) — Fetch a URL (GitHub user-attachments, gist raw, pastebin raw) and return its text, truncated to 8000 characters. Use when you need the body for inspection; for workflows, pass the URL to loadWorkflow() instead.
+- loadWorkflow(jsonOrUrl) — Load a workflow into the canvas via window.app.loadGraphData(). Accepts either a URL (will be fetched) OR a JSON string. For a workflow.json attached to an issue, prefer passing the URL directly in ONE call. Many bugs only manifest in the user's specific graph state.
 - writeTest(code) — Write a Playwright test file (.spec.ts)
 - runTest() — Execute the test and get results (pass/fail + errors)
 - done(verdict, summary, evidence, testCode) — Finish with the final test

 ## Workflow
-1. Read the issue description carefully
-2. Use inspect() to understand the current UI state and discover element selectors
-3. If unsure about the fixture API, use readFixture() to read the relevant helper source code
-4. If unsure about test patterns, use readTest() to read an existing test for reference
-5. Write a Playwright test that:
-   - Performs the exact reproduction steps from the issue
-   - Asserts the BROKEN behavior (the bug) — so the test PASSES when the bug exists
-6. Run the test with runTest()
-7. If it fails: read the error, fix the test, run again (max 5 attempts)
-8. Call done() with the final verdict and test code
+1. Read the issue description carefully. Think about:
+   - What PRECONDITIONS are needed? (many nodes on canvas? specific layout? saved workflow? subgraph?)
+   - What HIDDEN ASSUMPTIONS exist? (e.g. "z-index bug" means nodes must overlap → need a crowded canvas)
+   - What specific UI STATE triggers the bug? (dirty workflow? collapsed node? specific menu open?)
+2. **Environment setup (critical)** — many bugs only reproduce with the user's specific workflow. If the issue body contains a workflow JSON (code block or a link like \`[workflow.json](https://github.com/user-attachments/...)\`), you MUST load it:
+   - **URL attachment**: call loadWorkflow(url) directly with the attachment URL — ONE call, no download step needed.
+   - **Inline JSON code block**: call loadWorkflow(jsonString) with the raw JSON text from the issue body.
+   - downloadAttachment(url) is only needed when you specifically want to inspect a non-workflow file.
+   Skip this step only if the issue clearly does not involve a pre-existing workflow (e.g. pure menu/settings bugs).
+3. FIRST: Use readTest() to read 1-2 existing tests similar to the bug you're reproducing:
+   - For menu/workflow bugs: readTest("workflow.spec.ts") or readTest("topbarMenu.spec.ts")
+   - For node/canvas bugs: readTest("nodeInteraction.spec.ts") or readTest("copyPaste.spec.ts")
+   - For settings bugs: readTest("settingDialogSearch.spec.ts")
+   - For subgraph bugs: readTest("subgraph.spec.ts")
+4. Use inspect() to understand the current UI state and discover element selectors
+5. If unsure about the fixture API, use readFixture("ComfyPage.ts") or relevant helper
+6. Write a Playwright test that:
+   - FIRST sets up the preconditions (add multiple nodes, create specific layout, save workflow, etc.)
+   - THEN performs the exact reproduction steps from the issue
+   - FINALLY asserts the BROKEN behavior (the bug) — so the test PASSES when the bug exists
+   - Think like a tester: the bug may only appear under specific conditions that the reporter assumed were obvious
+   - If a workflow was loaded in step 2, the test MUST load the same workflow itself (fetch + loadGraphData inside the test) — the agent's setup does not persist into the separate test browser session
+7. Run the test with runTest()
+8. If it fails, ANALYZE the error before retrying (max 5 attempts):
+   - Is it a selector issue? Use inspect() to find the right element
+   - Is it a timing issue? The UI may need time to update — use nextFrame() or expect.poll()
+   - Is the precondition wrong? Maybe the bug only appears with MORE nodes, AFTER a save, etc.
+   - Try a DIFFERENT approach, not the same code with minor tweaks
+9. Call done() with the final verdict and test code

 ## Test writing guidelines
 - Import the project fixture: \`import { comfyPageFixture as test } from '../fixtures/ComfyPage'\`
@@ -423,10 +602,23 @@ export async function runResearchPhase(
 - Use \`comfyPage.nextFrame()\` after interactions that trigger UI updates
 - NEVER use \`page.waitForTimeout()\` — use Locator actions and retrying assertions instead
 - ALWAYS call done() when finished, even if the test passed — do not keep iterating after a passing test
+- CRITICAL: If your test FAILS 3 times in a row with the same or similar error, stop retrying that approach. Either switch to a completely different strategy or call done(NOT_REPRODUCIBLE). After 5 consecutive failures you MUST call done(NOT_REPRODUCIBLE). Spending 20+ tool calls on failing tests is wasteful.
+- Budget your turns: spend at most 3 turns on inspect/readFixture, 2 turns writing the first test, then max 3 fix attempts. If still failing after ~10 tool calls, call done().
 - Use \`expect.poll()\` for async assertions: \`await expect.poll(() => comfyPage.nodeOps.getGraphNodesCount()).toBe(8)\`
 - CRITICAL: Your assertions must be SPECIFIC TO THE BUG. A test that asserts \`expect(count).toBeGreaterThan(0)\` proves nothing — it would pass even without the bug. Instead assert the exact broken state, e.g. \`expect(clonedWidgets).toHaveLength(0)\` (missing widgets) or \`expect(zIndex).toBeLessThan(parentZIndex)\` (wrong z-order). If a test passes trivially, it's a false positive.
 - NEVER write "debug", "discovery", or "inspect node types" tests. These waste turns and produce false REPRODUCED verdicts. If you need to discover node type names, use inspect() or readFixture() — not a passing test.
 - If you cannot write a bug-specific assertion, call done() with verdict NOT_REPRODUCIBLE and explain why.
+- Loading the user's workflow inside the test (when the issue attached one):
+  \`\`\`ts
+  const WORKFLOW_URL = 'https://github.com/user-attachments/files/.../workflow.json'
+  await comfyPage.page.evaluate(async (url) => {
+    const res = await fetch(url)
+    const json = await res.json()
+    ;(window as any).app.loadGraphData(json)
+  }, WORKFLOW_URL)
+  await comfyPage.nextFrame()
+  \`\`\`
+  Prefer fetching the attachment URL over embedding a large JSON literal in the test.

 ## ComfyPage Fixture API Reference

@@ -481,6 +673,10 @@ export async function runResearchPhase(
 ### Settings (comfyPage.settings)
 - \`.setSetting(id, value)\` — change a ComfyUI setting
 - \`.getSetting(id)\` — read current setting value
+- Common setting IDs:
+  - \`'Comfy.UseNewMenu'\` — 'Top' | 'Bottom' | 'Disabled'
+  - \`'Comfy.Locale'\` — 'en' | 'zh' | 'ja' | 'ko' | 'ru' | 'fr' | 'es' etc. (change UI language)
+  - \`'Comfy.NodeBadge.NodeSourceBadgeMode'\` — node badge display

 ### Keyboard (comfyPage.keyboard)
 - \`.undo()\` / \`.redo()\` — Ctrl+Z / Ctrl+Y
@@ -493,6 +689,7 @@ export async function runResearchPhase(
 - \`.setupWorkflowsDirectory(structure)\` — setup test directory
 - \`.deleteWorkflow(name)\`
 - \`.isCurrentWorkflowModified()\` — check dirty state
+- Available subgraph assets: loadWorkflow('subgraphs/basic-subgraph'), 'subgraphs/nested-subgraph', 'subgraphs/subgraph-with-promoted-text-widget', etc.

 ### Context Menu (comfyPage.contextMenu)
 - \`.openFor(locator)\` — right-click locator and wait for menu
@@ -500,12 +697,26 @@ export async function runResearchPhase(
 - \`.isVisible()\` — check if context menu is showing
 - \`.assertHasItems(items)\` — assert menu contains items

+### Queue & Assets (comfyPage.assets)
+- \`comfyPage.runButton.click()\` — execute current workflow (backend runs with --cpu in CI)
+- \`comfyPage.assets.mockOutputHistory(jobs)\` — mock queue history with fake job items
+- \`comfyPage.assets.mockEmptyState()\` — clear all mocked state
+- Queue overlay: \`page.getByTestId('queue-overlay-toggle')\` to open queue panel
+
+### Subgraph (comfyPage.subgraph)
+- \`.isInSubgraph()\` — check if currently viewing a subgraph
+- \`.getNodeCount()\` — nodes in current graph view
+- \`.getSlotCount('input'|'output')\` — I/O slot count
+- \`.connectToInput(sourceNode, slotIdx, inputName)\` — connect to subgraph input
+- \`.exitViaBreadcrumb()\` — navigate out of subgraph
+- \`.convertDefaultKSamplerToSubgraph()\` — helper: convert default workflow node to subgraph
+- NodeReference: \`.convertToSubgraph()\`, \`.navigateIntoSubgraph()\`
+
 ### Other helpers
 - \`comfyPage.settingDialog\` — SettingDialog component
 - \`comfyPage.searchBox\` / \`comfyPage.searchBoxV2\` — node search
 - \`comfyPage.toast\` — ToastHelper (\`.visibleToasts\`)
-- \`comfyPage.subgraph\` — SubgraphHelper
-- \`comfyPage.vueNodes\` — VueNodeHelpers
+- \`comfyPage.vueNodes\` — VueNodeHelpers (\`.enterSubgraph(nodeId)\`, \`.selectNode(nodeId)\`)
 - \`comfyPage.bottomPanel\` — BottomPanel
 - \`comfyPage.clipboard\` — ClipboardHelper
 - \`comfyPage.dragDrop\` — DragDropHelper
@@ -596,18 +807,40 @@ ${issueContext}`
      prompt:
        'Write a Playwright E2E test that reproduces the reported bug. Use inspect() to discover selectors, readFixture() or readTest() if you need to understand the fixture API or see existing test patterns, writeTest() to write the test, runTest() to execute it. Iterate until it works or you determine the bug cannot be reproduced.',
      options: {
-        model: 'claude-sonnet-4-6',
+        model: opts.model ?? 'claude-sonnet-4-6',
        systemPrompt,
        ...(anthropicApiKey ? { apiKey: anthropicApiKey } : {}),
        maxTurns,
        mcpServers: { 'qa-research': server },
+        // Sonnet handles the loop; Opus is consulted via server-side advisor
+        // only when the agent explicitly asks for help — delivers Opus-grade
+        // judgment on hard decisions without paying Opus rates for every turn.
+        settings: {
+          advisorModel: 'claude-opus-4-6',
+          permissions: {
+            // Deny destructive built-ins that could touch the user's machine.
+            // Read/Grep/Glob stay allowed so the agent can explore the repo.
+            deny: [
+              'Bash(*)',
+              'Agent(*)',
+              'Write(*)',
+              'Edit(*)',
+              'NotebookEdit(*)'
+            ]
+          }
+        },
        allowedTools: [
          'mcp__qa-research__inspect',
          'mcp__qa-research__readFixture',
          'mcp__qa-research__readTest',
+          'mcp__qa-research__downloadAttachment',
+          'mcp__qa-research__loadWorkflow',
          'mcp__qa-research__writeTest',
          'mcp__qa-research__runTest',
-          'mcp__qa-research__done'
+          'mcp__qa-research__done',
+          'Read',
+          'Grep',
+          'Glob'
        ]
      }
    })) {
@@ -652,6 +885,20 @@ ${issueContext}`
    finalReproducedBy = 'e2e_test'
    finalSummary = `Test passed at turn ${lastPassedTurn} (auto-completed — agent did not call done())`
    finalEvidence = `Test passed with exit code 0`
+  } else if (!agentDone && consecutiveTestFailures > 0) {
+    // Agent ran tests but none passed and it never called done() — treat as not reproducible
+    console.warn(
+      `Auto-completing: ${consecutiveTestFailures} test failures, no passes, done() not called`
+    )
+    finalVerdict = 'NOT_REPRODUCIBLE'
+    finalReproducedBy = 'none'
+    finalSummary = `Agent ran ${consecutiveTestFailures} test attempts, none passed (auto-completed — agent did not call done())`
+    finalEvidence = `${consecutiveTestFailures} consecutive TEST FAILED runs without calling done()`
+    try {
+      finalTestCode = readFileSync(browserTestPath, 'utf-8')
+    } catch {
+      // leave finalTestCode empty
+    }
  }

  const result: ResearchResult = {
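
Review note (not part of the diff): the fallback order at the end of `runResearchPhase` can be restated as a small pure function, which may make the precedence easier to check. `resolveVerdict`, `RunState`, and `savedPassingTest` are hypothetical names, and the `INCONCLUSIVE` default is an assumption based on the SKILL.md description; the diff does not show how `finalVerdict` is initialized.

```typescript
type Verdict = 'REPRODUCED' | 'NOT_REPRODUCIBLE' | 'INCONCLUSIVE'

// Hypothetical summary of the state tracked during the research phase.
interface RunState {
  agentVerdict?: Verdict // set only if the agent called done()
  savedPassingTest: boolean // a passing test with a bug-specific assertion was auto-saved
  consecutiveTestFailures: number
}

function resolveVerdict(state: RunState): Verdict {
  // 1. An explicit done() call always wins.
  if (state.agentVerdict) return state.agentVerdict
  // 2. A saved passing test proves the bug even without done().
  if (state.savedPassingTest) return 'REPRODUCED'
  // 3. Tests ran, none qualified, and the agent never finished.
  if (state.consecutiveTestFailures > 0) return 'NOT_REPRODUCIBLE'
  // 4. Assumed default: nothing conclusive happened (timeout, infra issue).
  return 'INCONCLUSIVE'
}
```

Because `consecutiveTestFailures` is reset to 0 on every pass, step 3 only fires when the most recent runs failed; a saved passing test from an earlier turn still takes precedence in step 2.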
--- a/.claude/skills/comfy-qa/scripts/qa-deploy-pages.sh
+++ b/.claude/skills/comfy-qa/scripts/qa-deploy-pages.sh
@@ -36,6 +36,11 @@ for os in Linux macOS Windows; do
      -vf "fps=8,scale=480:-1:flags=lanczos,split[s0][s1];[s0]palettegen=max_colors=64[p];[s1][p]paletteuse=dither=bayer" \
      -loop 0 "$DEPLOY_DIR/qa-${os}-thumb.gif" 2>/dev/null \
    || echo "GIF generation failed for ${os} (non-fatal)"
+    # Also generate thumbnail.jpg for comfy-qa.pages.dev dashboard cards: one frame taken 1 s into the video (the title card). The path has no ${os} suffix, so the last OS processed wins.
+    ffmpeg -y -ss 1 -i "$THUMB_SRC" -vframes 1 \
+      -vf "scale=640:-1" -q:v 3 \
+      "$DEPLOY_DIR/thumbnail.jpg" 2>/dev/null \
+    || echo "thumbnail.jpg generation failed for ${os} (non-fatal)"
  fi
 done

--- a/.claude/skills/comfy-qa/scripts/qa-record.ts
+++ b/.claude/skills/comfy-qa/scripts/qa-record.ts
@@ -1952,7 +1952,7 @@ async function main() {
            // QA guide not available
          }
        }
-        const research = await runResearchPhase({
+        let research = await runResearchPhase({
          page,
          issueContext: issueCtx,
          qaGuide: qaGuideText,
@@ -1963,6 +1963,44 @@ async function main() {
        console.warn(
          `Research complete: ${research.verdict} — ${research.summary.slice(0, 100)}`
        )
+
+        // Opus escalation: if the Sonnet run was INCONCLUSIVE, retry with Opus
+        if (
+          research.verdict === 'INCONCLUSIVE' &&
+          anthropicKey &&
+          process.env.QA_OPUS_ESCALATION !== '0'
+        ) {
+          console.warn('Escalating to claude-opus-4-6 for complex issue...')
+          try {
+            const opusResult = await runResearchPhase({
+              page,
+              issueContext: issueCtx,
+              qaGuide: qaGuideText,
+              outputDir: opts.outputDir,
+              serverUrl: opts.serverUrl,
+              anthropicApiKey: anthropicKey,
+              model: 'claude-opus-4-6',
+              maxTurns: 30
+            })
+            console.warn(
+              `Opus result: ${opusResult.verdict} — ${opusResult.summary.slice(0, 100)}`
+            )
+            // Use the Opus result unless it is INCONCLUSIVE because of an API error
+            if (
+              opusResult.verdict !== 'INCONCLUSIVE' ||
+              !opusResult.summary.includes('API error')
+            ) {
+              research = opusResult
+            } else {
+              console.warn('Opus failed (API error) — keeping Sonnet result')
+            }
+          } catch (opusErr) {
+            console.warn(
+              `Opus escalation failed: ${opusErr instanceof Error ? opusErr.message : opusErr}`
+            )
+            // Keep Sonnet's result
+          }
+        }
        console.warn(`Evidence: ${research.evidence.slice(0, 200)}`)

        // ═══ Phase 2: Record demo video with demowright ═══
--- a/.claude/skills/comfy-qa/scripts/qa.ts
+++ b/.claude/skills/comfy-qa/scripts/qa.ts
@@ -193,7 +193,17 @@ function fetchIssue(number: string, repo: string, outputDir: string): string {
  const body = shell(
    `gh issue view ${number} --repo ${repo} --json title,body,labels --jq '"Title: " + .title + "\\n\\nLabels: " + ([.labels[].name] | join(", ")) + "\\n\\n" + .body'`
  )
-  return writeTmpFile(outputDir, `issue-${number}.txt`, body)
+  // Append relevant comments for reproduction context
+  let comments = ''
+  try {
+    comments = shell(
+      `gh issue view ${number} --repo ${repo} --comments --json comments --jq '[.comments[] | select(.body | test("repro|step|how to|workaround"; "i")) | .body] | limit(5; .[]) // empty'`
+    )
+  } catch {
+    // comments fetch failed, not critical
+  }
+  const content = comments ? `${body}\n\n--- Comments ---\n\n${comments}` : body
+  return writeTmpFile(outputDir, `issue-${number}.txt`, content)
 }

 function fetchPR(number: string, repo: string, outputDir: string): string {
--- a/.github/actions/setup-comfyui-server/action.yaml
+++ b/.github/actions/setup-comfyui-server/action.yaml
@@ -44,12 +44,17 @@ runs:
        python -m pip install --upgrade pip
        pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
        pip install -r requirements.txt
-        pip install wait-for-it

    - name: Start ComfyUI server
      if: ${{ inputs.launch_server == 'true' }}
      shell: bash
      working-directory: ComfyUI
+      env:
+        EXTRA_SERVER_PARAMS: ${{ inputs.extra_server_params }}
      run: |
-        python main.py --cpu --multi-user --front-end-root ../dist ${{ inputs.extra_server_params }} &
-        wait-for-it --service 127.0.0.1:8188 -t 600
+        python main.py --cpu --multi-user --front-end-root ../dist $EXTRA_SERVER_PARAMS &
+        for i in $(seq 1 300); do
+          curl -sf http://127.0.0.1:8188/api/system_stats >/dev/null 2>&1 && echo "Server ready" && exit 0
+          sleep 2
+        done
+        echo "::error::ComfyUI server did not start within 600s" && exit 1
--- a/.github/workflows/pr-qa.yaml
+++ b/.github/workflows/pr-qa.yaml
@@ -10,8 +10,9 @@
 name: 'PR: QA'

 on:
+  # TODO: remove push trigger before merge
  push:
-    branches: [sno-qa-*]
+    branches: [sno-skills, sno-qa-*]
  pull_request:
    types: [labeled]
    branches: [main]
@@ -25,9 +26,10 @@ on:
        options: [focused, full]
        default: focused

-concurrency:
-  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.event.issue.number || github.ref }}
-  cancel-in-progress: true
+# TODO: restore concurrency group before merge (disabled for parallel sno-qa-* testing)
+# concurrency:
+#   group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.event.issue.number || github.ref }}
+#   cancel-in-progress: true

 jobs:
  resolve-matrix:
@@ -53,7 +55,7 @@ jobs:

          # Only run on label events if it's one of our labels
          if [ "$EVENT_ACTION" = "labeled" ] && \
-             [ "$LABEL" != "qa-changes" ] && [ "$LABEL" != "qa-full" ] && [ "$LABEL" != "qa-issue" ]; then
+             [ "$LABEL" != "qa-changes" ] && [ "$LABEL" != "qa-full" ] && [ "$LABEL" != "qa-issue" ] && [ "$LABEL" != "Potential Bug" ] && [ "$LABEL" != "verified bug" ]; then
             echo "skip=true" >> "$GITHUB_OUTPUT"
          fi

@@ -272,6 +274,12 @@ jobs:
            --repo ${{ github.repository }} \
            --json title,body,labels --jq '"Labels: \([.labels[].name] | join(", "))\nTitle: \(.title)\n\n\(.body)"' \
            > "${{ runner.temp }}/issue-body.txt"
+          # Append top comments for reproduction context
+          gh issue view ${{ needs.resolve-matrix.outputs.number }} \
+            --repo ${{ github.repository }} \
+            --comments --json comments \
+            --jq '[.comments[] | select(.authorAssociation != "NONE" or (.body | test("repro|step|how to|workaround"; "i"))) | .body] | limit(5; .[]) // empty' \
+            >> "${{ runner.temp }}/issue-body.txt" 2>/dev/null || true
          echo "Issue body saved ($(wc -c < "${{ runner.temp }}/issue-body.txt") bytes)"

      - name: Download QA guide
@@ -389,6 +397,30 @@ jobs:
        with:
          include_build_step: true

+      # When triggered via sno-qa-* push, the checkout above gets sno-skills
+      # (the scripts branch), not the actual PR. Rebuild with PR code.
+      - name: Rebuild with PR frontend for sno-qa-* triggers
+        if: >-
+          !github.head_ref &&
+          needs.resolve-matrix.outputs.target_type == 'pr' &&
+          needs.resolve-matrix.outputs.number
+        shell: bash
+        env:
+          PR_NUM: ${{ needs.resolve-matrix.outputs.number }}
+        run: |
+          SNO_REF=$(git rev-parse HEAD)
+          git fetch origin "refs/pull/${PR_NUM}/head"
+          git checkout FETCH_HEAD
+          echo "Building PR #${PR_NUM} frontend at $(git rev-parse --short HEAD)"
+
+          pnpm install --frozen-lockfile || pnpm install
+          pnpm build
+
+          # Switch back to sno-skills so QA scripts are available
+          git checkout "$SNO_REF"
+          pnpm install --frozen-lockfile || pnpm install
+          echo "Restored sno-skills scripts at $(git rev-parse --short HEAD)"
+
      - name: Install QA dependencies
        run: |
          pnpm add -D @google/generative-ai@^0.24.1
--- a/.gitignore
+++ b/.gitignore
@@ -29,6 +29,9 @@ dist-ssr
 .claude/worktrees
 CLAUDE.local.md

+# QA pipeline ephemeral test file (written per run by qa-agent.ts)
+browser_tests/tests/qa-reproduce.spec.ts
+
 # Editor directories and files
 .vscode/*
 *.code-workspace
@@ -102,4 +105,7 @@ vitest.config.*.timestamp*
 # Weekly docs check output
 /output.txt

-.amp
+.amp
+.playwright-cli/
+.playwright/
+.claude/scheduled_tasks.lock
--- a/.oxfmtrc.json
+++ b/.oxfmtrc.json
@@ -9,6 +9,7 @@
    "packages/registry-types/src/comfyRegistryTypes.ts",
    "public/materialdesignicons.min.css",
    "src/types/generatedManagerTypes.ts",
-    "**/__fixtures__/**/*.json"
+    "**/__fixtures__/**/*.json",
+    "scripts/qa-report-template.html"
  ]
 }
--- a/apps/website/src/components/AcademySection.vue
+++ b/apps/website/src/components/AcademySection.vue
@@ -17,7 +17,7 @@ const features = computed(() => [
    <div class="mx-auto max-w-3xl px-6 text-center">
      <!-- Badge -->
      <span
-        class="inline-block rounded-full bg-brand-yellow/10 px-4 py-1.5 text-xs uppercase tracking-widest text-brand-yellow"
+        class="inline-block rounded-full bg-brand-yellow/10 px-4 py-1.5 text-xs tracking-widest text-brand-yellow uppercase"
      >
        {{ t('academy.badge', locale) }}
      </span>
--- a/apps/website/src/components/GetStartedSection.vue
+++ b/apps/website/src/components/GetStartedSection.vue
@@ -40,7 +40,7 @@ const steps = computed(() => [
          <!-- Connecting line between steps (desktop only) -->
          <div
            v-if="index < steps.length - 1"
-            class="absolute right-0 top-8 hidden w-full translate-x-1/2 border-t border-brand-yellow/20 md:block"
+            class="absolute top-8 right-0 hidden w-full translate-x-1/2 border-t border-brand-yellow/20 md:block"
          />

          <div class="relative">
--- a/apps/website/src/components/HeroSection.vue
+++ b/apps/website/src/components/HeroSection.vue
@@ -31,11 +31,11 @@ const ctaButtons = computed(() => [
      <div class="flex w-full items-center justify-center md:w-[55%]">
        <div class="relative -ml-12 -rotate-15 md:-ml-24" aria-hidden="true">
          <div
-            class="h-64 w-64 rounded-full border-[40px] border-brand-yellow md:h-[28rem] md:w-[28rem] md:border-[64px] lg:h-[36rem] lg:w-[36rem] lg:border-[80px]"
+            class="size-64 rounded-full border-40 border-brand-yellow md:h-112 md:w-md md:border-64 lg:h-144 lg:w-xl lg:border-80"
          >
            <!-- Gap on the right side to form "C" shape -->
            <div
-              class="absolute right-0 top-1/2 h-32 w-24 -translate-y-1/2 translate-x-1/2 bg-black md:h-48 md:w-36 lg:h-64 lg:w-48"
+              class="absolute top-1/2 right-0 h-32 w-24 translate-x-1/2 -translate-y-1/2 bg-black md:h-48 md:w-36 lg:h-64 lg:w-48"
            />
          </div>
        </div>
@@ -44,7 +44,7 @@ const ctaButtons = computed(() => [
      <!-- Right: Text content -->
      <div class="flex w-full flex-col items-start md:w-[45%]">
        <h1
-          class="text-5xl font-bold leading-tight tracking-tight text-white md:text-6xl lg:text-7xl"
+          class="text-5xl/tight font-bold tracking-tight text-white md:text-6xl lg:text-7xl"
        >
          {{ t('hero.headline', locale) }}
        </h1>
--- a/apps/website/src/components/ManifestoSection.vue
+++ b/apps/website/src/components/ManifestoSection.vue
@@ -17,7 +17,7 @@ const { locale = 'en' } = defineProps<{ locale?: Locale }>()
        {{ t('manifesto.heading', locale) }}
      </h2>

-      <p class="mx-auto mt-6 max-w-2xl text-lg leading-relaxed text-smoke-700">
+      <p class="mx-auto mt-6 max-w-2xl text-lg/relaxed text-smoke-700">
        {{ t('manifesto.body', locale) }}
      </p>

--- a/apps/website/src/components/ProductShowcase.vue
+++ b/apps/website/src/components/ProductShowcase.vue
@@ -33,11 +33,11 @@ const features = computed(() => [
        <div class="flex flex-col items-center gap-4">
          <!-- Play button triangle -->
          <div
-            class="flex h-16 w-16 items-center justify-center rounded-full border-2 border-white/20"
+            class="flex size-16 items-center justify-center rounded-full border-2 border-white/20"
            aria-hidden="true"
          >
            <div
-              class="ml-1 h-0 w-0 border-y-8 border-l-[14px] border-y-transparent border-l-white"
+              class="ml-1 size-0 border-y-8 border-l-14 border-y-transparent border-l-white"
            />
          </div>
          <p class="text-sm text-smoke-700">
@@ -54,7 +54,7 @@ const features = computed(() => [
          class="flex items-center gap-2"
        >
          <span
-            class="h-2 w-2 rounded-full bg-brand-yellow"
+            class="size-2 rounded-full bg-brand-yellow"
            aria-hidden="true"
          />
          <span class="text-sm text-smoke-700">{{ feature }}</span>
--- a/apps/website/src/components/SocialProofBar.vue
+++ b/apps/website/src/components/SocialProofBar.vue
@@ -32,7 +32,7 @@ const metrics = computed(() => [
    <div class="mx-auto max-w-7xl px-6">
      <!-- Heading -->
      <p
-        class="text-center text-xs font-medium uppercase tracking-widest text-smoke-700"
+        class="text-center text-xs font-medium tracking-widest text-smoke-700 uppercase"
      >
        {{ t('social.heading', locale) }}
      </p>
--- a/apps/website/src/components/TestimonialsSection.vue
+++ b/apps/website/src/components/TestimonialsSection.vue
@@ -90,7 +90,7 @@ const filteredTestimonials = computed(() => {
          :key="testimonial.name"
          class="rounded-xl border border-white/10 bg-charcoal-600 p-6"
        >
-          <blockquote class="text-base italic text-white">
+          <blockquote class="text-base text-white italic">
            &ldquo;{{ testimonial.quote }}&rdquo;
          </blockquote>

--- a/apps/website/src/components/UseCaseSection.vue
+++ b/apps/website/src/components/UseCaseSection.vue
@@ -24,12 +24,12 @@ const activeCategory = ref(0)
        <!-- Left placeholder image (desktop only) -->
        <div class="hidden flex-1 lg:block">
          <div
-            class="aspect-[2/3] rounded-full border border-white/10 bg-charcoal-600"
+            class="aspect-2/3 rounded-full border border-white/10 bg-charcoal-600"
          />
        </div>

        <!-- Center content -->
-        <div class="flex flex-col items-center text-center lg:flex-[2]">
+        <div class="flex flex-col items-center text-center lg:flex-2">
          <h2 class="text-3xl font-bold text-white">
            {{ t('useCase.heading', locale) }}
          </h2>
@@ -70,7 +70,7 @@ const activeCategory = ref(0)
        <!-- Right placeholder image (desktop only) -->
        <div class="hidden flex-1 lg:block">
          <div
-            class="aspect-[2/3] rounded-3xl border border-white/10 bg-charcoal-600"
+            class="aspect-2/3 rounded-3xl border border-white/10 bg-charcoal-600"
          />
        </div>
      </div>
--- a/apps/website/src/components/ValuePillars.vue
+++ b/apps/website/src/components/ValuePillars.vue
@@ -53,7 +53,7 @@ const pillars = computed(() => [
          class="rounded-xl border border-white/10 bg-charcoal-600 p-6 transition-colors hover:border-brand-yellow"
        >
          <div
-            class="flex h-12 w-12 items-center justify-center rounded-full bg-brand-yellow text-xl"
+            class="flex size-12 items-center justify-center rounded-full bg-brand-yellow text-xl"
          >
            {{ pillar.icon }}
          </div>
--- a/browser_tests/fixtures/ComfyPage.ts
+++ b/browser_tests/fixtures/ComfyPage.ts
@@ -10,6 +10,11 @@ import { TestIds } from '@e2e/fixtures/selectors'
 import { comfyExpect } from '@e2e/fixtures/utils/customMatchers'
 import { assetPath } from '@e2e/fixtures/utils/paths'
 import { sleep } from '@e2e/fixtures/utils/timing'
+import {
+  buildFallbackUsername,
+  findUserIdByUsername,
+  isDuplicateUserErrorMessage
+} from '@e2e/fixtures/utils/userSetup'
 import { VueNodeHelpers } from '@e2e/fixtures/VueNodeHelpers'
 import { BottomPanel } from '@e2e/fixtures/components/BottomPanel'
 import { ComfyNodeSearchBox } from '@e2e/fixtures/components/ComfyNodeSearchBox'
@@ -242,17 +247,40 @@ export class ComfyPage {
  }

  async setupUser(username: string) {
+    const existingUserId = await this.findUserId(username)
+    if (existingUserId) {
+      return existingUserId
+    }
+
+    try {
+      return await this.createUser(username)
+    } catch (error) {
+      if (
+        !(error instanceof Error) ||
+        !isDuplicateUserErrorMessage(error.message)
+      ) {
+        throw error
+      }
+
+      const recoveredUserId = await this.findUserId(username)
+      if (recoveredUserId) {
+        return recoveredUserId
+      }
+
+      const fallbackUsername = buildFallbackUsername(username)
+      console.warn(
+        `[e2e] Username "${username}" already exists but is not returned by /api/users. Falling back to "${fallbackUsername}".`
+      )
+      return await this.createUser(fallbackUsername)
+    }
+  }
+
+  private async findUserId(username: string) {
    const res = await this.request.get(`${this.url}/api/users`)
    if (res.status() !== 200)
      throw new Error(`Failed to retrieve users: ${await res.text()}`)

-    const apiRes = await res.json()
-    const user = Object.entries(apiRes?.users ?? {}).find(
-      ([, name]) => name === username
-    )
-    const id = user?.[0]
-
-    return id ? id : await this.createUser(username)
+    return findUserIdByUsername(await res.json(), username)
  }

  async createUser(username: string) {
--- a/browser_tests/fixtures/utils/userSetup.ts
+++ b/browser_tests/fixtures/utils/userSetup.ts
@@ -0,0 +1,77 @@
+function isRecord(value: unknown): value is Record<string, unknown> {
+  return typeof value === 'object' && value !== null && !Array.isArray(value)
+}
+
+function normalizeUserEntry(
+  entry: unknown
+): { userId: string; username: string } | null {
+  if (Array.isArray(entry) && entry.length >= 2) {
+    const [userId, username] = entry
+    if (typeof userId === 'string' && typeof username === 'string') {
+      return { userId, username }
+    }
+    return null
+  }
+
+  if (!isRecord(entry)) {
+    return null
+  }
+
+  const userId =
+    typeof entry.userId === 'string'
+      ? entry.userId
+      : typeof entry.id === 'string'
+        ? entry.id
+        : null
+  const username =
+    typeof entry.username === 'string'
+      ? entry.username
+      : typeof entry.name === 'string'
+        ? entry.name
+        : null
+
+  return userId && username ? { userId, username } : null
+}
+
+export function findUserIdByUsername(
+  payload: unknown,
+  username: string
+): string | null {
+  if (!isRecord(payload)) {
+    return null
+  }
+
+  const { users } = payload
+  if (Array.isArray(users)) {
+    for (const entry of users) {
+      const normalized = normalizeUserEntry(entry)
+      if (normalized?.username === username) {
+        return normalized.userId
+      }
+    }
+    return null
+  }
+
+  if (!isRecord(users)) {
+    return null
+  }
+
+  for (const [userId, currentUsername] of Object.entries(users)) {
+    if (currentUsername === username) {
+      return userId
+    }
+  }
+
+  return null
+}
+
+export function isDuplicateUserErrorMessage(message: string): boolean {
+  return /duplicate username|already exists/i.test(message)
+}
+
+export function buildFallbackUsername(
+  username: string,
+  suffix: string | number = Date.now()
+): string {
+  return `${username}-${suffix}`
+}
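
The retry-and-fallback flow that `setupUser` builds on these helpers can be sketched in isolation. This is a minimal, synchronous illustration and not the fixture's actual code: `FakeBackend`, `findUserId`, `createUser`, and `setupUserSketch` are hypothetical stand-ins for the real `/api/users` calls, and the fallback suffix is passed in explicitly so the outcome is deterministic.

```typescript
// Hypothetical in-memory stand-in for the ComfyUI user API (an assumption for illustration).
interface FakeBackend {
  listed: Record<string, string> // userId -> username, as a user listing would report
  taken: Set<string> // usernames the backend refuses to create again
}

function isDuplicateUserErrorMessage(message: string): boolean {
  return /duplicate username|already exists/i.test(message)
}

// Look up a user id by username in the listing; null when the user is not reported.
function findUserId(backend: FakeBackend, username: string): string | null {
  for (const [userId, name] of Object.entries(backend.listed)) {
    if (name === username) return userId
  }
  return null
}

// Create a user, throwing the same kind of message the troubleshooting entry describes.
function createUser(backend: FakeBackend, username: string): string {
  if (backend.taken.has(username)) {
    throw new Error('Failed to create user: {"error": "Duplicate username."}')
  }
  const userId = `id-${backend.taken.size + 1}`
  backend.taken.add(username)
  backend.listed[userId] = username
  return userId
}

// Lookup, then create, then re-lookup on a duplicate error, then fall back to a suffixed name.
function setupUserSketch(
  backend: FakeBackend,
  username: string,
  suffix: string | number
): string {
  const existing = findUserId(backend, username)
  if (existing) return existing

  try {
    return createUser(backend, username)
  } catch (error) {
    if (
      !(error instanceof Error) ||
      !isDuplicateUserErrorMessage(error.message)
    ) {
      throw error
    }
    const recovered = findUserId(backend, username)
    if (recovered) return recovered
    return createUser(backend, `${username}-${suffix}`)
  }
}

// A stale user: the name is taken, but the listing does not report it.
const backend: FakeBackend = {
  listed: {},
  taken: new Set(['playwright-test-0'])
}
const fallbackId = setupUserSketch(backend, 'playwright-test-0', 'retry')
const fallbackName = backend.listed[fallbackId]
```

In this sketch the stale name triggers the duplicate error, the second lookup still finds nothing, and the user is created under the suffixed name. A later call with that suffixed name returns the same id through the initial lookup.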
--- a/browser_tests/tests/appModePruning.spec.ts
+++ b/browser_tests/tests/appModePruning.spec.ts
@@ -1,10 +1,10 @@
-import type { ComfyPage } from '../fixtures/ComfyPage'
+import type { ComfyPage } from '@e2e/fixtures/ComfyPage'
 import {
  comfyPageFixture as test,
  comfyExpect as expect
-} from '../fixtures/ComfyPage'
-import { setupBuilder } from '../helpers/builderTestUtils'
-import { fitToViewInstant } from '../helpers/fitToView'
+} from '@e2e/fixtures/ComfyPage'
+import { setupBuilder } from '@e2e/helpers/builderTestUtils'
+import { fitToViewInstant } from '@e2e/helpers/fitToView'

 const RESIZE_NODE_TITLE = 'Resize Image/Mask'
 const RESIZE_NODE_ID = '1'
--- a/docs/qa/TROUBLESHOOTING.md
+++ b/docs/qa/TROUBLESHOOTING.md
@@ -82,6 +82,12 @@
 **Cause**: The `ANTHROPIC_API_KEY` secret in the repo has exhausted its credits.
 **Fix**: Top up the Anthropic API account linked to the key, or rotate to a new key in repo Settings → Secrets.

+## Duplicate Username During QA Reproduce
+
+**Symptom**: Reproduce tests fail before any assertions with `Failed to create user: {"error": "Duplicate username."}`.
+**Cause**: The backend can retain an old `playwright-test-*` user while `/api/users` does not report it in the format the fixture expects, so setup falls through to `POST /api/users` and collides.
+**Fix**: The Playwright fixture now retries lookup after a duplicate response and falls back to a unique username if the stale user still cannot be resolved. If this still appears, isolate the backend with `TEST_COMFYUI_DIR` or clear stale test users from the QA backend state.
+
 ## Agent Doesn't Perform Steps

 **Symptom**: Agent opens menus and settings but never interacts with the canvas.
@@ -90,7 +96,8 @@
 1. `loadDefaultWorkflow` failed (no nodes on canvas)
 2. Agent ran out of turn budget (30 turns / 120s)
 3. Gemini Flash (old agent) ignores prompt hints
-   **Fix**: Use hybrid agent (Claude Sonnet 4.6 + Gemini vision). Claude's superior reasoning follows instructions precisely.
+
+**Fix**: Use the hybrid agent (Claude Sonnet 4.6 + Gemini vision), which follows prompt instructions more reliably than the old Gemini Flash agent.

 ## False REPRODUCED from Discovery Tests

--- a/package.json
+++ b/package.json
@@ -39,6 +39,7 @@
    "oxlint": "oxlint src --type-aware",
    "prepare": "husky || true && git config blame.ignoreRevsFile .git-blame-ignore-revs || true",
    "preview": "nx preview",
+    "qa:video-review": "tsx scripts/qa-video-review.ts",
    "storybook": "nx storybook",
    "storybook:desktop": "nx run @comfyorg/desktop-ui:storybook",
    "stylelint:fix": "stylelint --cache --fix '{apps,packages,src}/**/*.{css,vue}'",
@@ -123,6 +124,7 @@
    "zod-validation-error": "catalog:"
  },
  "devDependencies": {
+    "@anthropic-ai/claude-agent-sdk": "catalog:",
    "@comfyorg/ingest-types": "workspace:*",
    "@eslint/js": "catalog:",
    "@google/generative-ai": "catalog:",
--- a/pnpm-lock.yaml
+++ b/pnpm-lock.yaml
--- a/pnpm-workspace.yaml
+++ b/pnpm-workspace.yaml
@@ -4,6 +4,7 @@ packages:

 catalog:
  '@alloc/quick-lru': ^5.2.0
+  '@anthropic-ai/claude-agent-sdk': latest
  '@astrojs/sitemap': ^3.7.1
  '@astrojs/vue': ^5.0.0
  '@comfyorg/comfyui-electron-types': 0.6.2
--- a/scripts/qa-agent.ts
+++ b/scripts/qa-agent.ts
@@ -0,0 +1,572 @@
+#!/usr/bin/env tsx
+/**
+ * QA Research Phase — Claude writes & debugs E2E tests to reproduce bugs
+ *
+ * Instead of driving a browser interactively, Claude:
+ * 1. Reads the issue + a11y snapshot of the UI
+ * 2. Writes a Playwright E2E test (.spec.ts) that reproduces the bug
+ * 3. Runs the test → reads errors → rewrites → repeats until it works
+ * 4. Outputs the passing test + verdict
+ *
+ * Tools:
+ *   - inspect(selector) — read a11y tree to understand UI state
+ *   - writeTest(code) — write a Playwright test file
+ *   - runTest() — execute the test and get results
+ *   - done(verdict, reproducedBy, summary, evidence, testCode) — finish with the working test
+ */
+
+import type { Page } from '@playwright/test'
+import { query, tool, createSdkMcpServer } from '@anthropic-ai/claude-agent-sdk'
+import { z } from 'zod'
+import { mkdirSync, readFileSync, writeFileSync } from 'fs'
+import { execSync } from 'child_process'
+
+// ── Types ──
+
+interface ResearchOptions {
+  page: Page
+  issueContext: string
+  qaGuide: string
+  outputDir: string
+  serverUrl: string
+  anthropicApiKey?: string
+  maxTurns?: number
+  timeBudgetMs?: number
+}
+
+export type ReproMethod = 'e2e_test' | 'video' | 'both' | 'none'
+
+export interface ResearchResult {
+  verdict: 'REPRODUCED' | 'NOT_REPRODUCIBLE' | 'INCONCLUSIVE'
+  reproducedBy: ReproMethod
+  summary: string
+  evidence: string
+  testCode: string
+  log: Array<{
+    turn: number
+    timestampMs: number
+    toolName: string
+    toolInput: unknown
+    toolResult: string
+  }>
+}
+
+// ── Main research function ──
+
+export async function runResearchPhase(
+  opts: ResearchOptions
+): Promise<ResearchResult> {
+  const { page, issueContext, qaGuide, outputDir, serverUrl, anthropicApiKey } =
+    opts
+  const maxTurns = opts.maxTurns ?? 50
+
+  let agentDone = false
+  let finalVerdict: ResearchResult['verdict'] = 'INCONCLUSIVE'
+  let finalReproducedBy: ReproMethod = 'none'
+  let finalSummary = 'Agent did not complete'
+  let finalEvidence = ''
+  let finalTestCode = ''
+  let turnCount = 0
+  let lastPassedTurn = -1
+  const startTime = Date.now()
+  const researchLog: ResearchResult['log'] = []
+
+  const testDir = `${outputDir}/research`
+  mkdirSync(testDir, { recursive: true })
+  const testPath = `${testDir}/reproduce.spec.ts`
+
+  // Get initial a11y snapshot for context
+  let initialA11y = ''
+  try {
+    initialA11y = await page.locator('body').ariaSnapshot({ timeout: 5000 })
+    initialA11y = initialA11y.slice(0, 3000)
+  } catch {
+    initialA11y = '(could not capture initial a11y snapshot)'
+  }
+
+  // ── Tool: inspect ──
+  const inspectTool = tool(
+    'inspect',
+    'Read the current accessibility tree to understand UI state. Use this to discover element names, roles, and selectors for your test.',
+    {
+      selector: z
+        .string()
+        .optional()
+        .describe(
+          'Optional filter — only show elements matching this name/role. Omit for full tree.'
+        )
+    },
+    async (args) => {
+      let resultText: string
+      try {
+        const ariaText = await page
+          .locator('body')
+          .ariaSnapshot({ timeout: 5000 })
+        if (args.selector) {
+          const lines = ariaText.split('\n')
+          const matches = lines.filter((l: string) =>
+            l.toLowerCase().includes(args.selector!.toLowerCase())
+          )
+          resultText =
+            matches.length > 0
+              ? `Found "${args.selector}":\n${matches.slice(0, 15).join('\n')}`
+              : `"${args.selector}" not found. Full tree:\n${ariaText.slice(0, 2000)}`
+        } else {
+          resultText = ariaText.slice(0, 3000)
+        }
+      } catch (e) {
+        resultText = `inspect failed: ${e instanceof Error ? e.message : e}`
+      }
+
+      researchLog.push({
+        turn: turnCount,
+        timestampMs: Date.now() - startTime,
+        toolName: 'inspect',
+        toolInput: args,
+        toolResult: resultText.slice(0, 500)
+      })
+
+      return { content: [{ type: 'text' as const, text: resultText }] }
+    }
+  )
+
+  // ── Tool: readFixture ──
+  const readFixtureTool = tool(
+    'readFixture',
+    'Read a fixture or helper file from browser_tests/fixtures/ to understand the API. Use this to discover available methods on comfyPage helpers before writing your test.',
+    {
+      path: z
+        .string()
+        .describe(
+          'Relative path within browser_tests/fixtures/, e.g. "helpers/CanvasHelper.ts" or "components/Topbar.ts" or "ComfyPage.ts"'
+        )
+    },
+    async (args) => {
+      let resultText: string
+      try {
+        const fullPath = `${projectRoot}/browser_tests/fixtures/${args.path}`
+        const content = readFileSync(fullPath, 'utf-8')
+        resultText = content.slice(0, 4000)
+        if (content.length > 4000) {
+          resultText += `\n\n... (truncated, ${content.length} total chars)`
+        }
+      } catch (e) {
+        resultText = `Could not read fixture: ${e instanceof Error ? e.message : e}`
+      }
+
+      researchLog.push({
+        turn: turnCount,
+        timestampMs: Date.now() - startTime,
+        toolName: 'readFixture',
+        toolInput: args,
+        toolResult: resultText.slice(0, 500)
+      })
+
+      return { content: [{ type: 'text' as const, text: resultText }] }
+    }
+  )
+
+  // ── Tool: readTest ──
+  const readTestTool = tool(
+    'readTest',
+    'Read an existing E2E test file from browser_tests/tests/ to learn patterns and conventions used in this project.',
+    {
+      path: z
+        .string()
+        .describe(
+          'Relative path within browser_tests/tests/, e.g. "workflow.spec.ts" or "subgraph.spec.ts"'
+        )
+    },
+    async (args) => {
+      let resultText: string
+      try {
+        const fullPath = `${projectRoot}/browser_tests/tests/${args.path}`
+        const content = readFileSync(fullPath, 'utf-8')
+        resultText = content.slice(0, 4000)
+        if (content.length > 4000) {
+          resultText += `\n\n... (truncated, ${content.length} total chars)`
+        }
+      } catch (e) {
+        // List available test files if the path doesn't exist
+        try {
+          const { readdirSync } = await import('fs')
+          const files = readdirSync(`${projectRoot}/browser_tests/tests/`)
+            .filter((f: string) => f.endsWith('.spec.ts'))
+            .slice(0, 30)
+          resultText = `File not found: ${args.path}\n\nAvailable test files:\n${files.join('\n')}`
+        } catch {
+          resultText = `Could not read test: ${e instanceof Error ? e.message : e}`
+        }
+      }
+
+      researchLog.push({
+        turn: turnCount,
+        timestampMs: Date.now() - startTime,
+        toolName: 'readTest',
+        toolInput: args,
+        toolResult: resultText.slice(0, 500)
+      })
+
+      return { content: [{ type: 'text' as const, text: resultText }] }
+    }
+  )
+
+  // ── Tool: writeTest ──
+  const writeTestTool = tool(
+    'writeTest',
+    'Write a Playwright E2E test file that reproduces the bug. The test should assert the broken behavior exists.',
+    {
+      code: z
+        .string()
+        .describe('Complete Playwright test file content (.spec.ts)')
+    },
+    async (args) => {
+      writeFileSync(testPath, args.code)
+
+      researchLog.push({
+        turn: turnCount,
+        timestampMs: Date.now() - startTime,
+        toolName: 'writeTest',
+        toolInput: { path: testPath, codeLength: args.code.length },
+        toolResult: `Test written to ${testPath} (${args.code.length} chars)`
+      })
+
+      return {
+        content: [
+          {
+            type: 'text' as const,
+            text: `Test written to ${testPath}. Use runTest() to execute it.`
+          }
+        ]
+      }
+    }
+  )
+
+  // ── Tool: runTest ──
+  // Place test in browser_tests/ so Playwright config finds fixtures
+  const projectRoot = process.cwd()
+  const browserTestPath = `${projectRoot}/browser_tests/tests/qa-reproduce.spec.ts`
+
+  const runTestTool = tool(
+    'runTest',
+    'Run the Playwright test and get results. Returns stdout/stderr including assertion errors.',
+    {},
+    async () => {
+      turnCount++
+      // Copy the test to browser_tests/tests/ where Playwright expects it
+      const { copyFileSync } = await import('fs')
+      try {
+        copyFileSync(testPath, browserTestPath)
+      } catch {
+        // directory may not exist
+        mkdirSync(`${projectRoot}/browser_tests/tests`, { recursive: true })
+        copyFileSync(testPath, browserTestPath)
+      }
+
+      let resultText: string
+      try {
+        const output = execSync(
+          `cd "${projectRoot}" && npx playwright test browser_tests/tests/qa-reproduce.spec.ts --reporter=list --timeout=30000 --retries=0 --workers=1 2>&1`,
+          {
+            timeout: 90000,
+            encoding: 'utf-8',
+            env: {
+              ...process.env,
+              COMFYUI_BASE_URL: serverUrl
+            }
+          }
+        )
+        resultText = `TEST PASSED:\n${output.slice(-1500)}`
+      } catch (e) {
+        const err = e as { stdout?: string; stderr?: string; message?: string }
+        const output = (err.stdout || '') + '\n' + (err.stderr || '')
+        resultText = `TEST FAILED:\n${output.slice(-2000)}`
+      }
+
+      researchLog.push({
+        turn: turnCount,
+        timestampMs: Date.now() - startTime,
+        toolName: 'runTest',
+        toolInput: { testPath },
+        toolResult: resultText.slice(0, 1000)
+      })
+
+      // Auto-save passing test code for fallback completion
+      if (resultText.startsWith('TEST PASSED')) {
+        try {
+          finalTestCode = readFileSync(browserTestPath, 'utf-8')
+          lastPassedTurn = turnCount
+        } catch {
+          // ignore
+        }
+        resultText += '\n\n⚠️ Test PASSED — call done() now with verdict REPRODUCED and the test code. Do NOT write more tests.'
+      }
+
+      return { content: [{ type: 'text' as const, text: resultText }] }
+    }
+  )
+
+  // ── Tool: done ──
+  const doneTool = tool(
+    'done',
+    'Finish research with verdict and the final test code.',
+    {
+      verdict: z.enum(['REPRODUCED', 'NOT_REPRODUCIBLE', 'INCONCLUSIVE']),
+      reproducedBy: z
+        .enum(['e2e_test', 'video', 'both', 'none'])
+        .describe(
+          'How the bug was proven: e2e_test = Playwright assertion passed, video = visual evidence only, both = both methods, none = not reproduced'
+        ),
+      summary: z.string().describe('What you found and why'),
+      evidence: z.string().describe('Test output that proves the verdict'),
+      testCode: z
+        .string()
+        .describe(
+          'Final Playwright test code. If REPRODUCED, this test asserts the bug exists and passes.'
+        )
+    },
+    async (args) => {
+      agentDone = true
+      finalVerdict = args.verdict
+      finalReproducedBy = args.reproducedBy
+      finalSummary = args.summary
+      finalEvidence = args.evidence
+      finalTestCode = args.testCode
+      writeFileSync(testPath, args.testCode)
+      return {
+        content: [
+          { type: 'text' as const, text: `Research complete: ${args.verdict}` }
+        ]
+      }
+    }
+  )
+
+  // ── MCP Server ──
+  const server = createSdkMcpServer({
+    name: 'qa-research',
+    version: '1.0.0',
+    tools: [
+      inspectTool,
+      readFixtureTool,
+      readTestTool,
+      writeTestTool,
+      runTestTool,
+      doneTool
+    ]
+  })
+
+  // ── System prompt ──
+  const systemPrompt = `You are a senior QA engineer who writes Playwright E2E tests to reproduce reported bugs.
+
+## Your tools
+- inspect(selector?) — Read the accessibility tree to understand the current UI. Use to discover selectors, element names, and UI state.
+- readFixture(path) — Read fixture source code from browser_tests/fixtures/. Use to discover available methods. E.g. "helpers/CanvasHelper.ts", "components/Topbar.ts", "ComfyPage.ts"
+- readTest(path) — Read an existing test from browser_tests/tests/ to learn patterns. E.g. "workflow.spec.ts". Pass any name to list available files.
+- writeTest(code) — Write a Playwright test file (.spec.ts)
+- runTest() — Execute the test and get results (pass/fail + errors)
+- done(verdict, reproducedBy, summary, evidence, testCode) — Finish with the final verdict, how the bug was proven, and the final test
+
+## Workflow
+1. Read the issue description carefully
+2. Use inspect() to understand the current UI state and discover element selectors
+3. If unsure about the fixture API, use readFixture() to read the relevant helper source code
+4. If unsure about test patterns, use readTest() to read an existing test for reference
+5. Write a Playwright test that:
+   - Performs the exact reproduction steps from the issue
+   - Asserts the BROKEN behavior (the bug) — so the test PASSES when the bug exists
+6. Run the test with runTest()
+7. If it fails: read the error, fix the test, run again (max 5 attempts)
+8. Call done() with the final verdict and test code
+
+## Test writing guidelines
+- Import the project fixture: \`import { comfyPageFixture as test } from '../fixtures/ComfyPage'\`
+- Import expect: \`import { expect } from '@playwright/test'\`
+- The fixture provides \`comfyPage\` which has all the helpers listed below
+- If the bug IS present, the test should PASS. If the bug is fixed, the test would FAIL.
+- Keep tests focused and minimal — test ONLY the reported bug
+- Write ONE test, not multiple. Focus on the single clearest reproduction.
+- The test file will be placed in browser_tests/tests/qa-reproduce.spec.ts
+- Use \`comfyPage.nextFrame()\` after interactions that trigger UI updates
+- NEVER use \`page.waitForTimeout()\` — use Locator actions and retrying assertions instead
+- ALWAYS call done() when finished, even if the test passed — do not keep iterating after a passing test
+- Use \`expect.poll()\` for async assertions: \`await expect.poll(() => comfyPage.nodeOps.getGraphNodesCount()).toBe(8)\`
+- CRITICAL: Your assertions must be SPECIFIC TO THE BUG. A test that asserts \`expect(count).toBeGreaterThan(0)\` proves nothing — it would pass even without the bug. Instead assert the exact broken state, e.g. \`expect(clonedWidgets).toHaveLength(0)\` (missing widgets) or \`expect(zIndex).toBeLessThan(parentZIndex)\` (wrong z-order). If a test passes trivially, it's a false positive.
+- If you cannot write a bug-specific assertion, call done() with verdict NOT_REPRODUCIBLE and explain why.
+
+## ComfyPage Fixture API Reference
+
+### Core properties
+- \`comfyPage.page\` — raw Playwright Page
+- \`comfyPage.canvas\` — Locator for #graph-canvas
+- \`comfyPage.queueButton\` — "Queue Prompt" button
+- \`comfyPage.runButton\` — "Run" button (new UI)
+- \`comfyPage.confirmDialog\` — ConfirmDialog (has .confirm, .delete, .overwrite, .reject locators + .click(name) method)
+- \`comfyPage.nextFrame()\` — wait for next requestAnimationFrame
+- \`comfyPage.setup()\` — navigate + wait for app ready (called automatically by fixture)
+
+### Menu (comfyPage.menu)
+- \`comfyPage.menu.topbar\` — Topbar helper:
+  - \`.triggerTopbarCommand(['File', 'Save As'])\` — navigate menu hierarchy
+  - \`.openTopbarMenu()\` / \`.closeTopbarMenu()\` — open/close hamburger
+  - \`.openSubmenu('File')\` — hover to open submenu, returns submenu Locator
+  - \`.getTabNames()\` — get all open workflow tab names
+  - \`.getActiveTabName()\` — get active tab name
+  - \`.getWorkflowTab(name)\` — get tab Locator
+  - \`.closeWorkflowTab(name)\` — close a tab
+  - \`.saveWorkflow(name)\` / \`.saveWorkflowAs(name)\` / \`.exportWorkflow(name)\`
+  - \`.switchTheme('dark' | 'light')\`
+- \`comfyPage.menu.workflowsTab\` — WorkflowsSidebarTab:
+  - \`.open()\` / \`.close()\` — toggle workflows sidebar
+  - \`.getTopLevelSavedWorkflowNames()\` — list saved workflow names
+- \`comfyPage.menu.nodeLibraryTab\` — NodeLibrarySidebarTab
+- \`comfyPage.menu.assetsTab\` — AssetsSidebarTab
+
+### Canvas (comfyPage.canvasOps)
+- \`.click({x, y})\` — click at position on canvas
+- \`.rightClick(x, y)\` — right-click (opens context menu)
+- \`.doubleClick()\` — double-click canvas (opens node search)
+- \`.clickEmptySpace()\` — click known empty area
+- \`.dragAndDrop(source, target)\` — drag from source to target position
+- \`.pan(offset, safeSpot?)\` — pan canvas by offset
+- \`.zoom(deltaY, steps?)\` — zoom via scroll wheel
+- \`.resetView()\` — reset zoom/pan to default
+- \`.getScale()\` / \`.setScale(n)\` — get/set canvas zoom
+- \`.getNodeCenterByTitle(title)\` — get screen coords of node center
+- \`.disconnectEdge()\` / \`.connectEdge()\` — default graph edge operations
+
+### Node Operations (comfyPage.nodeOps)
+- \`.getGraphNodesCount()\` — count all nodes
+- \`.getSelectedGraphNodesCount()\` — count selected nodes
+- \`.getNodes()\` — get all nodes
+- \`.getFirstNodeRef()\` — get NodeReference for first node
+- \`.getNodeRefById(id)\` — get NodeReference by ID
+- \`.getNodeRefsByType(type)\` — get all nodes of a type
+- \`.waitForGraphNodes(count)\` — wait until node count matches
+
+### Settings (comfyPage.settings)
+- \`.setSetting(id, value)\` — change a ComfyUI setting
+- \`.getSetting(id)\` — read current setting value
+
+### Keyboard (comfyPage.keyboard)
+- \`.undo()\` / \`.redo()\` — Ctrl+Z / Ctrl+Y
+- \`.bypass()\` — Ctrl+B
+- \`.selectAll()\` — Ctrl+A
+- \`.ctrlSend(key)\` — send Ctrl+key
+
+### Workflow (comfyPage.workflow)
+- \`.loadWorkflow(name)\` — load from browser_tests/assets/{name}.json
+- \`.setupWorkflowsDirectory(structure)\` — setup test directory
+- \`.deleteWorkflow(name)\`
+- \`.isCurrentWorkflowModified()\` — check dirty state
+
+### Context Menu (comfyPage.contextMenu)
+- \`.openFor(locator)\` — right-click locator and wait for menu
+- \`.clickMenuItem(name)\` — click a menu item by name
+- \`.isVisible()\` — check if context menu is showing
+- \`.assertHasItems(items)\` — assert menu contains items
+
+### Other helpers
+- \`comfyPage.settingDialog\` — SettingDialog component
+- \`comfyPage.searchBox\` / \`comfyPage.searchBoxV2\` — node search
+- \`comfyPage.toast\` — ToastHelper (\`.visibleToasts\`)
+- \`comfyPage.subgraph\` — SubgraphHelper
+- \`comfyPage.vueNodes\` — VueNodeHelpers
+- \`comfyPage.bottomPanel\` — BottomPanel
+- \`comfyPage.clipboard\` — ClipboardHelper
+- \`comfyPage.dragDrop\` — DragDropHelper
+
+### Available fixture files (use readFixture to explore)
+- ComfyPage.ts — main fixture with all helpers
+- helpers/CanvasHelper.ts, NodeOperationsHelper.ts, WorkflowHelper.ts
+- helpers/KeyboardHelper.ts, SettingsHelper.ts, SubgraphHelper.ts
+- components/Topbar.ts, ContextMenu.ts, SettingDialog.ts, SidebarTab.ts
+
+## Current UI state (accessibility tree)
+${initialA11y}
+
+${qaGuide ? `## QA Analysis Guide\n${qaGuide}\n` : ''}
+## Issue to Reproduce
+${issueContext}`
+
+  // ── Run the agent ──
+  console.warn('Starting research phase (Claude writes E2E tests)...')
+
+  try {
+    for await (const message of query({
+      prompt:
+        'Write a Playwright E2E test that reproduces the reported bug. Use inspect() to discover selectors, readFixture() or readTest() if you need to understand the fixture API or see existing test patterns, writeTest() to write the test, runTest() to execute it. Iterate until it works or you determine the bug cannot be reproduced.',
+      options: {
+        model: 'claude-sonnet-4-6',
+        systemPrompt,
+        ...(anthropicApiKey ? { apiKey: anthropicApiKey } : {}),
+        maxTurns,
+        mcpServers: { 'qa-research': server },
+        allowedTools: [
+          'mcp__qa-research__inspect',
+          'mcp__qa-research__readFixture',
+          'mcp__qa-research__readTest',
+          'mcp__qa-research__writeTest',
+          'mcp__qa-research__runTest',
+          'mcp__qa-research__done'
+        ]
+      }
+    })) {
+      if (message.type === 'assistant' && message.message?.content) {
+        for (const block of message.message.content) {
+          if ('text' in block && block.text) {
+            console.warn(`  Claude: ${block.text.slice(0, 200)}`)
+          }
+          if ('name' in block) {
+            console.warn(
+              `  Tool: ${block.name}(${JSON.stringify(block.input).slice(0, 100)})`
+            )
+          }
+        }
+      }
+      if (agentDone) break
+    }
+  } catch (e) {
+    const errMsg = e instanceof Error ? e.message : String(e)
+    console.warn(`Research error: ${errMsg}`)
+
+    // Detect billing, quota, and rate-limit errors and surface them clearly
+    if (
+      errMsg.includes('Credit balance is too low') ||
+      errMsg.includes('insufficient_quota') ||
+      errMsg.includes('rate_limit')
+    ) {
+      finalSummary = `API error: ${errMsg.slice(0, 200)}`
+      finalEvidence = 'Agent stopped: API billing, quota, or rate-limit error'
+      console.warn(
+        '::error::Anthropic API billing or rate-limit error: research aborted'
+      )
+    }
+  }
+
+  // Auto-complete: if a test passed but done() was never called, use the passing test
+  if (!agentDone && lastPassedTurn >= 0 && finalTestCode) {
+    console.warn(
+      `Auto-completing: test passed at turn ${lastPassedTurn} but done() was not called`
+    )
+    finalVerdict = 'REPRODUCED'
+    finalReproducedBy = 'e2e_test'
+    finalSummary = `Test passed at turn ${lastPassedTurn} (auto-completed — agent did not call done())`
+    finalEvidence = `Test passed with exit code 0`
+  }
+
+  const result: ResearchResult = {
+    verdict: finalVerdict,
+    reproducedBy: finalReproducedBy,
+    summary: finalSummary,
+    evidence: finalEvidence,
+    testCode: finalTestCode,
+    log: researchLog
+  }
+
+  writeFileSync(`${testDir}/research-log.json`, JSON.stringify(result, null, 2))
+  console.warn(
+    `Research complete: ${finalVerdict} (${researchLog.length} tool calls)`
+  )
+
+  return result
+}
--- a/scripts/qa-analyze-pr.test.ts
+++ b/scripts/qa-analyze-pr.test.ts
@@ -0,0 +1,84 @@
+import { describe, expect, it } from 'vitest'
+
+import { extractMediaUrls } from './qa-analyze-pr'
+
+describe('extractMediaUrls', () => {
+  it('extracts markdown image URLs', () => {
+    const text = '![screenshot](https://example.com/image.png)'
+    expect(extractMediaUrls(text)).toEqual(['https://example.com/image.png'])
+  })
+
+  it('extracts multiple markdown images', () => {
+    const text = [
+      '![before](https://example.com/before.png)',
+      'Some text',
+      '![after](https://example.com/after.jpg)'
+    ].join('\n')
+    expect(extractMediaUrls(text)).toEqual([
+      'https://example.com/before.png',
+      'https://example.com/after.jpg'
+    ])
+  })
+
+  it('extracts raw URLs with media extensions', () => {
+    const text = 'Check this: https://cdn.example.com/demo.mp4 for details'
+    expect(extractMediaUrls(text)).toEqual(['https://cdn.example.com/demo.mp4'])
+  })
+
+  it('extracts GitHub user-attachments URLs', () => {
+    const text =
+      'https://github.com/user-attachments/assets/abc12345-6789-0def-1234-567890abcdef'
+    expect(extractMediaUrls(text)).toEqual([
+      'https://github.com/user-attachments/assets/abc12345-6789-0def-1234-567890abcdef'
+    ])
+  })
+
+  it('extracts private-user-images URLs', () => {
+    const text =
+      'https://private-user-images.githubusercontent.com/12345/abcdef-1234?jwt=token123'
+    expect(extractMediaUrls(text)).toEqual([
+      'https://private-user-images.githubusercontent.com/12345/abcdef-1234?jwt=token123'
+    ])
+  })
+
+  it('extracts URLs with query parameters', () => {
+    const text = 'https://example.com/image.png?w=800&h=600'
+    expect(extractMediaUrls(text)).toEqual([
+      'https://example.com/image.png?w=800&h=600'
+    ])
+  })
+
+  it('deduplicates URLs', () => {
+    const text = [
+      '![img](https://example.com/same.png)',
+      '![img2](https://example.com/same.png)',
+      'Also https://example.com/same.png'
+    ].join('\n')
+    expect(extractMediaUrls(text)).toEqual(['https://example.com/same.png'])
+  })
+
+  it('returns empty array for empty input', () => {
+    expect(extractMediaUrls('')).toEqual([])
+  })
+
+  it('returns empty array for text with no media URLs', () => {
+    expect(extractMediaUrls('Just some text without any URLs')).toEqual([])
+  })
+
+  it('handles mixed media types', () => {
+    const text = [
+      '![screen](https://example.com/screenshot.png)',
+      'Video: https://example.com/demo.webm',
+      '![gif](https://example.com/animation.gif)'
+    ].join('\n')
+    const urls = extractMediaUrls(text)
+    expect(urls).toContain('https://example.com/screenshot.png')
+    expect(urls).toContain('https://example.com/demo.webm')
+    expect(urls).toContain('https://example.com/animation.gif')
+  })
+
+  it('ignores non-http URLs in markdown', () => {
+    const text = '![local](./local-image.png)'
+    expect(extractMediaUrls(text)).toEqual([])
+  })
+})
--- a/scripts/qa-analyze-pr.ts
+++ b/scripts/qa-analyze-pr.ts
@@ -0,0 +1,799 @@
+#!/usr/bin/env tsx
+/**
+ * QA PR Analysis Script
+ *
+ * Deeply analyzes a PR or issue using Gemini Pro to generate targeted QA
+ * guides: before/after recording guides for PRs, reproduction guides for
+ * issues. Fetches the thread, extracts media, and produces structured test plans.
+ *
+ * Usage:
+ *   pnpm exec tsx scripts/qa-analyze-pr.ts \
+ *     --pr-number 10270 \
+ *     --repo owner/repo \
+ *     --output-dir qa-guides/ \
+ *     [--model gemini-3.1-pro-preview] [--type pr|issue]
+ *
+ * Env: GEMINI_API_KEY (required)
+ */
+
+import { execSync } from 'node:child_process'
+import { mkdirSync, readFileSync, writeFileSync } from 'node:fs'
+import { resolve } from 'node:path'
+import { fileURLToPath } from 'node:url'
+
+import { GoogleGenerativeAI } from '@google/generative-ai'
+
+// ── Types ──
+
+interface QaGuideStep {
+  action: string
+  description: string
+  expected_before?: string
+  expected_after?: string
+}
+
+interface QaGuide {
+  summary: string
+  test_focus: string
+  prerequisites: string[]
+  steps: QaGuideStep[]
+  visual_checks: string[]
+}
+
+interface PrThread {
+  title: string
+  body: string
+  labels: string[]
+  issueComments: string[]
+  reviewComments: string[]
+  reviews: string[]
+  diff: string
+}
+
+type TargetType = 'pr' | 'issue'
+
+interface Options {
+  prNumber: string
+  repo: string
+  outputDir: string
+  model: string
+  apiKey: string
+  mediaBudgetBytes: number
+  maxVideoBytes: number
+  type: TargetType
+}
+
+// ── CLI parsing ──
+
+function parseArgs(): Options {
+  const args = process.argv.slice(2)
+  const opts: Partial<Options> = {
+    model: 'gemini-3.1-pro-preview',
+    apiKey: process.env.GEMINI_API_KEY || '',
+    mediaBudgetBytes: 20 * 1024 * 1024,
+    maxVideoBytes: 10 * 1024 * 1024,
+    type: 'pr'
+  }
+
+  for (let i = 0; i < args.length; i++) {
+    switch (args[i]) {
+      case '--pr-number':
+        opts.prNumber = args[++i]
+        break
+      case '--repo':
+        opts.repo = args[++i]
+        break
+      case '--output-dir':
+        opts.outputDir = args[++i]
+        break
+      case '--model':
+        opts.model = args[++i]
+        break
+      case '--type':
+        opts.type = args[++i] as TargetType
+        break
+      case '--help':
+        console.warn(
+          'Usage: qa-analyze-pr.ts --pr-number <num> --repo <owner/repo> --output-dir <path> [--model <model>] [--type pr|issue]'
+        )
+        process.exit(0)
+    }
+  }
+
+  if (!opts.prNumber || !opts.repo || !opts.outputDir) {
+    console.error(
+      'Required: --pr-number <num> --repo <owner/repo> --output-dir <path>'
+    )
+    process.exit(1)
+  }
+
+  if (!opts.apiKey) {
+    console.error('GEMINI_API_KEY environment variable is required')
+    process.exit(1)
+  }
+
+  return opts as Options
+}
+
+// ── PR thread fetching ──
+
+function ghExec(cmd: string): string {
+  try {
+    return execSync(cmd, {
+      encoding: 'utf-8',
+      timeout: 30_000,
+      stdio: ['pipe', 'pipe', 'pipe']
+    }).trim()
+  } catch (err) {
+    console.warn(`gh command failed: ${cmd}`)
+    console.warn((err as Error).message)
+    return ''
+  }
+}
+
+function fetchPrThread(prNumber: string, repo: string): PrThread {
+  console.warn('Fetching PR thread...')
+
+  const prView = ghExec(
+    `gh pr view ${prNumber} --repo ${repo} --json title,body,labels`
+  )
+  const prData = prView
+    ? JSON.parse(prView)
+    : { title: '', body: '', labels: [] }
+
+  const issueCommentsRaw = ghExec(
+    `gh api repos/${repo}/issues/${prNumber}/comments --paginate`
+  )
+  const issueComments: string[] = issueCommentsRaw
+    ? JSON.parse(issueCommentsRaw).map((c: { body: string }) => c.body)
+    : []
+
+  const reviewCommentsRaw = ghExec(
+    `gh api repos/${repo}/pulls/${prNumber}/comments --paginate`
+  )
+  const reviewComments: string[] = reviewCommentsRaw
+    ? JSON.parse(reviewCommentsRaw).map((c: { body: string }) => c.body)
+    : []
+
+  const reviewsRaw = ghExec(
+    `gh api repos/${repo}/pulls/${prNumber}/reviews --paginate`
+  )
+  const reviews: string[] = reviewsRaw
+    ? JSON.parse(reviewsRaw)
+        .filter((r: { body: string }) => r.body)
+        .map((r: { body: string }) => r.body)
+    : []
+
+  const diff = ghExec(`gh pr diff ${prNumber} --repo ${repo}`)
+
+  console.warn(
+    `PR #${prNumber}: "${prData.title}" | ` +
+      `${issueComments.length} issue comments, ` +
+      `${reviewComments.length} review comments, ` +
+      `${reviews.length} reviews, ` +
+      `diff: ${diff.length} chars`
+  )
+
+  return {
+    title: prData.title || '',
+    body: prData.body || '',
+    labels: (prData.labels || []).map((l: { name: string }) => l.name),
+    issueComments,
+    reviewComments,
+    reviews,
+    diff
+  }
+}
+
+interface IssueThread {
+  title: string
+  body: string
+  labels: string[]
+  comments: string[]
+}
+
+function fetchIssueThread(issueNumber: string, repo: string): IssueThread {
+  console.warn('Fetching issue thread...')
+
+  const issueView = ghExec(
+    `gh issue view ${issueNumber} --repo ${repo} --json title,body,labels`
+  )
+  const issueData = issueView
+    ? JSON.parse(issueView)
+    : { title: '', body: '', labels: [] }
+
+  const commentsRaw = ghExec(
+    `gh api repos/${repo}/issues/${issueNumber}/comments --paginate`
+  )
+  const comments: string[] = commentsRaw
+    ? JSON.parse(commentsRaw).map((c: { body: string }) => c.body)
+    : []
+
+  console.warn(
+    `Issue #${issueNumber}: "${issueData.title}" | ` +
+      `${comments.length} comments`
+  )
+
+  return {
+    title: issueData.title || '',
+    body: issueData.body || '',
+    labels: (issueData.labels || []).map((l: { name: string }) => l.name),
+    comments
+  }
+}
+
+// ── Media extraction ──
+
+const MEDIA_EXTENSIONS = /\.(png|jpg|jpeg|gif|webp|mp4|webm|mov)$/i
+
+const MEDIA_URL_PATTERNS = [
+  // Markdown images: ![alt](url)
+  /!\[[^\]]*\]\(([^)]+)\)/g,
+  // GitHub user-attachments
+  /https:\/\/github\.com\/user-attachments\/assets\/[a-f0-9-]+/g,
+  // Private user images
+  /https:\/\/private-user-images\.githubusercontent\.com\/[^\s)"]+/g,
+  // Raw URLs with media extensions (standalone or in text)
+  /(?<!="|=')https?:\/\/[^\s)<>"]+\.(?:png|jpg|jpeg|gif|webp|mp4|webm|mov)(?:\?[^\s)<>"]*)?/gi
+]
+
+export function extractMediaUrls(text: string): string[] {
+  if (!text) return []
+
+  const urls = new Set<string>()
+
+  for (const pattern of MEDIA_URL_PATTERNS) {
+    // Reset lastIndex for global patterns
+    pattern.lastIndex = 0
+    let match: RegExpExecArray | null
+    while ((match = pattern.exec(text)) !== null) {
+      // For markdown images, the URL is in capture group 1
+      const url = match[1] || match[0]
+      // Clean trailing markdown/html artifacts
+      const cleaned = url.replace(/[)>"'\s]+$/, '')
+      if (cleaned.startsWith('http')) {
+        urls.add(cleaned)
+      }
+    }
+  }
+
+  return [...urls]
+}
+
+// ── Media downloading ──
+
+const ALLOWED_MEDIA_DOMAINS = [
+  'github.com',
+  'raw.githubusercontent.com',
+  'user-images.githubusercontent.com',
+  'private-user-images.githubusercontent.com',
+  'objects.githubusercontent.com',
+  'github.githubassets.com'
+]
+
+function isAllowedMediaDomain(url: string): boolean {
+  try {
+    const hostname = new URL(url).hostname
+    return ALLOWED_MEDIA_DOMAINS.some(
+      (domain) => hostname === domain || hostname.endsWith(`.${domain}`)
+    )
+  } catch {
+    return false
+  }
+}
+
+async function downloadMedia(
+  urls: string[],
+  outputDir: string,
+  budgetBytes: number,
+  maxVideoBytes: number
+): Promise<Array<{ path: string; mimeType: string }>> {
+  const downloaded: Array<{ path: string; mimeType: string }> = []
+  let totalBytes = 0
+
+  const mediaDir = resolve(outputDir, 'media')
+  mkdirSync(mediaDir, { recursive: true })
+
+  for (const url of urls) {
+    if (totalBytes >= budgetBytes) {
+      console.warn(
+        `Media budget exhausted (${totalBytes} bytes), skipping rest`
+      )
+      break
+    }
+
+    if (!isAllowedMediaDomain(url)) {
+      console.warn(`Skipping non-GitHub URL: ${url.slice(0, 80)}`)
+      continue
+    }
+
+    try {
+      const response = await fetch(url, {
+        signal: AbortSignal.timeout(15_000),
+        headers: { Accept: 'image/*,video/*' },
+        redirect: 'follow'
+      })
+
+      if (!response.ok) {
+        console.warn(`Failed to download ${url}: ${response.status}`)
+        continue
+      }
+
+      const contentLength = response.headers.get('content-length')
+      if (contentLength) {
+        const declaredSize = Number.parseInt(contentLength, 10)
+        if (declaredSize > budgetBytes - totalBytes) {
+          console.warn(
+            `Content-Length ${declaredSize} would exceed budget, skipping ${url}`
+          )
+          continue
+        }
+      }
+
+      const contentType = response.headers.get('content-type') || ''
+      const buffer = Buffer.from(await response.arrayBuffer())
+
+      // Skip oversized videos
+      const isVideo =
+        contentType.startsWith('video/') || /\.(mp4|webm|mov)$/i.test(url)
+      if (isVideo && buffer.length > maxVideoBytes) {
+        console.warn(
+          `Skipping large video ${url} (${(buffer.length / 1024 / 1024).toFixed(1)}MB > ${(maxVideoBytes / 1024 / 1024).toFixed(0)}MB cap)`
+        )
+        continue
+      }
+
+      if (totalBytes + buffer.length > budgetBytes) {
+        console.warn(`Would exceed budget, skipping ${url}`)
+        continue
+      }
+
+      const ext = guessExtension(url, contentType)
+      const filename = `media-${downloaded.length}${ext}`
+      const filepath = resolve(mediaDir, filename)
+      writeFileSync(filepath, buffer)
+      totalBytes += buffer.length
+
+      const mimeType = contentType.split(';')[0].trim() || guessMimeType(ext)
+
+      downloaded.push({ path: filepath, mimeType })
+      console.warn(
+        `Downloaded: ${url.slice(0, 80)}... (${(buffer.length / 1024).toFixed(0)}KB)`
+      )
+    } catch (err) {
+      console.warn(`Failed to download ${url}: ${(err as Error).message}`)
+    }
+  }
+
+  console.warn(
+    `Downloaded ${downloaded.length}/${urls.length} media files ` +
+      `(${(totalBytes / 1024 / 1024).toFixed(1)}MB)`
+  )
+  return downloaded
+}
+
+function guessExtension(url: string, contentType: string): string {
+  const urlMatch = url.match(MEDIA_EXTENSIONS)
+  if (urlMatch) return urlMatch[0].toLowerCase()
+
+  const typeMap: Record<string, string> = {
+    'image/png': '.png',
+    'image/jpeg': '.jpg',
+    'image/gif': '.gif',
+    'image/webp': '.webp',
+    'video/mp4': '.mp4',
+    'video/webm': '.webm'
+  }
+  return typeMap[contentType.split(';')[0].trim()] || '.bin'
+}
+
+function guessMimeType(ext: string): string {
+  const map: Record<string, string> = {
+    '.png': 'image/png',
+    '.jpg': 'image/jpeg',
+    '.jpeg': 'image/jpeg',
+    '.gif': 'image/gif',
+    '.webp': 'image/webp',
+    '.mp4': 'video/mp4',
+    '.webm': 'video/webm',
+    '.mov': 'video/quicktime'
+  }
+  return map[ext] || 'application/octet-stream'
+}
+
+// ── Gemini analysis ──
+
+function buildIssueAnalysisPrompt(issue: IssueThread): string {
+  const allText = [
+    `# Issue: ${issue.title}`,
+    '',
+    '## Description',
+    issue.body,
+    '',
+    issue.comments.length > 0
+      ? `## Comments\n${issue.comments.join('\n\n---\n\n')}`
+      : ''
+  ]
+    .filter(Boolean)
+    .join('\n')
+
+  return `You are a senior QA engineer analyzing a bug report for ComfyUI frontend — a node-based visual workflow editor for AI image generation (Vue 3 + TypeScript).
+
+The UI has:
+- A large canvas (1280x720 viewport) showing a node graph centered at ~(640, 400)
+- Nodes are boxes with input/output slots connected by wires
+- A hamburger menu (top-left C logo) with File, Edit, Help submenus
+- Sidebars (Workflows, Node Library, Models)
+- A topbar with workflow tabs and Queue button
+- The default workflow loads with these nodes (approximate center coordinates):
+  - Load Checkpoint (~150, 300), CLIP Text Encode x2 (~450, 250 and ~450, 450)
+  - Empty Latent Image (~450, 600), KSampler (~750, 350), VAE Decode (~1000, 350), Save Image (~1200, 350)
+- Right-clicking ON a node shows node actions (Clone, Bypass, Convert, etc.)
+- Right-clicking on EMPTY canvas shows Add Node menu — different from node context menu
+
+Your task: Generate a DETAILED reproduction guide (8-15 steps) to trigger this bug on main.
+
+${allText}
+
+## Available test actions
+Each step must use one of these actions:
+
+### Menu actions
+- "openMenu" — clicks the Comfy hamburger menu (top-left C logo)
+- "hoverMenuItem" — hovers a top-level menu item to open submenu (label required)
+- "clickMenuItem" — clicks an item in the visible submenu (label required)
+
+### Element actions (by visible text)
+- "click" — clicks an element by visible text (text required)
+- "rightClick" — right-clicks an element to open context menu (text required)
+- "doubleClick" — double-clicks an element or coordinates (text or x,y)
+- "fillDialog" — fills dialog input and presses Enter (text required)
+- "pressKey" — presses a keyboard key (key required: Escape, Tab, Delete, Enter, etc.)
+
+### Canvas actions (by coordinates — viewport is 1280x720)
+- "clickCanvas" — click at coordinates (x, y required)
+- "rightClickCanvas" — right-click at coordinates (x, y required)
+- "doubleClick" — double-click at coordinates to open node search (x, y)
+- "dragCanvas" — drag from one point to another (fromX, fromY, toX, toY)
+- "scrollCanvas" — scroll wheel for zoom (x, y, deltaY: negative=zoom in, positive=zoom out)
+
+### Utility
+- "wait" — waits briefly (ms required, max 3000)
+- "screenshot" — takes a screenshot (name required)
+
+## Common ComfyUI interactions
+- Right-click a node → context menu with Clone, Bypass, Remove, Colors, etc.
+- Double-click empty canvas → opens node search dialog
+- Ctrl+C / Ctrl+V → copy/paste selected nodes
+- Delete key → remove selected node
+- Ctrl+G → group selected nodes
+- Drag from output slot to input slot → create connection
+- Click a node to select it, Shift+click for multi-select
+
+## Output format
+Return a JSON object with exactly one key: "reproduce", containing:
+{
+  "summary": "One sentence: what bug this issue reports",
+  "test_focus": "Specific behavior to reproduce",
+  "prerequisites": ["e.g. Load default workflow"],
+  "steps": [
+    {
+      "action": "clickCanvas",
+      "description": "Click on first node to select it",
+      "expected_before": "What should happen if the bug is present"
+    }
+  ],
+  "visual_checks": ["Specific visual evidence of the bug to look for"]
+}
+
+## Rules
+- Generate 8-15 DETAILED steps that actually trigger the reported bug.
+- Follow the issue's reproduction steps PRECISELY — translate them into available actions.
+- Use canvas coordinates for node interactions (nodes are typically in the center area 300-900 x 200-500).
+- Take screenshots BEFORE and AFTER critical actions to capture the bug state.
+- Do NOT just open a menu and screenshot — actually perform the full reproduction sequence.
+- Do NOT include login steps.
+- Output ONLY valid JSON, no markdown fences or explanation.`
+}
+
+function buildAnalysisPrompt(thread: PrThread): string {
+  const allText = [
+    `# PR: ${thread.title}`,
+    '',
+    '## Description',
+    thread.body,
+    '',
+    thread.issueComments.length > 0
+      ? `## Issue Comments\n${thread.issueComments.join('\n\n---\n\n')}`
+      : '',
+    thread.reviewComments.length > 0
+      ? `## Review Comments\n${thread.reviewComments.join('\n\n---\n\n')}`
+      : '',
+    thread.reviews.length > 0
+      ? `## Reviews\n${thread.reviews.join('\n\n---\n\n')}`
+      : '',
+    '',
+    '## Diff (truncated)',
+    '```',
+    thread.diff.slice(0, 8000),
+    '```'
+  ]
+    .filter(Boolean)
+    .join('\n')
+
+  return `You are a senior QA engineer analyzing a pull request for ComfyUI frontend (a Vue 3 + TypeScript web application for AI image generation workflows).
+
+Your task: Generate TWO targeted QA test guides — one for BEFORE the PR (main branch) and one for AFTER (PR branch).
+
+${allText}
+
+## Available test actions
+Each step must use one of these actions:
+- "openMenu" — clicks the Comfy hamburger menu (top-left C logo)
+- "hoverMenuItem" — hovers a top-level menu item to open submenu (label required)
+- "clickMenuItem" — clicks an item in the visible submenu (label required)
+- "fillDialog" — fills dialog input and presses Enter (text required)
+- "pressKey" — presses a keyboard key (key required)
+- "click" — clicks an element by visible text (text required)
+- "wait" — waits briefly (ms required, max 3000)
+- "screenshot" — takes a screenshot (name required)
+
+## Output format
+Return a JSON object with exactly two keys: "before" and "after", each containing:
+{
+  "summary": "One sentence: what this PR changes",
+  "test_focus": "Specific behaviors to verify in this recording",
+  "prerequisites": ["e.g. Load default workflow"],
+  "steps": [
+    {
+      "action": "openMenu",
+      "description": "Open the main menu to check file options",
+      "expected_before": "Old behavior description (before key only)",
+      "expected_after": "New behavior description (after key only)"
+    }
+  ],
+  "visual_checks": ["Specific visual elements to look for"]
+}
+
+## Rules
+- BEFORE guide: 2-4 steps, under 15 seconds. Show OLD/missing behavior.
+- AFTER guide: 3-6 steps, under 30 seconds. Prove the fix/feature works.
+- Focus on the SPECIFIC behavior changed by this PR, not generic testing.
+- Use information from PR description, screenshots, and comments to understand intended behavior.
+- Include at least one screenshot step in each guide.
+- Do NOT include login steps.
+- Menu pattern: openMenu -> hoverMenuItem -> clickMenuItem or screenshot.
+- Output ONLY valid JSON, no markdown fences or explanation.`
+}
+
+async function analyzeWithGemini(
+  thread: PrThread,
+  media: Array<{ path: string; mimeType: string }>,
+  model: string,
+  apiKey: string
+): Promise<{ before: QaGuide; after: QaGuide }> {
+  const genAI = new GoogleGenerativeAI(apiKey)
+  const geminiModel = genAI.getGenerativeModel({ model })
+
+  const prompt = buildAnalysisPrompt(thread)
+
+  const parts: Array<
+    { text: string } | { inlineData: { mimeType: string; data: string } }
+  > = [{ text: prompt }]
+
+  // Add media as inline data
+  for (const item of media) {
+    try {
+      const buffer = readFileSync(item.path)
+      parts.push({
+        inlineData: {
+          mimeType: item.mimeType,
+          data: buffer.toString('base64')
+        }
+      })
+    } catch (err) {
+      console.warn(
+        `Failed to read media ${item.path}: ${(err as Error).message}`
+      )
+    }
+  }
+
+  console.warn(
+    `Sending to ${model}: ${prompt.length} chars text, ${parts.length - 1} media files`
+  )
+
+  const result = await geminiModel.generateContent({
+    contents: [{ role: 'user', parts }],
+    generationConfig: {
+      temperature: 0.2,
+      maxOutputTokens: 8192,
+      responseMimeType: 'application/json'
+    }
+  })
+
+  let text = result.response.text()
+  // Strip markdown fences if present
+  text = text
+    .replace(/^```(?:json)?\n?/gm, '')
+    .replace(/```$/gm, '')
+    .trim()
+
+  console.warn('Gemini response received')
+  console.warn('Raw response (first 500 chars):', text.slice(0, 500))
+  const parsed = JSON.parse(text)
+
+  // Handle different response shapes from Gemini
+  let before: QaGuide
+  let after: QaGuide
+
+  if (Array.isArray(parsed) && parsed.length >= 2) {
+    // Array format: [before, after]
+    before = parsed[0]
+    after = parsed[1]
+  } else if (parsed.before && parsed.after) {
+    // Object format: { before, after }
+    before = parsed.before
+    after = parsed.after
+  } else {
+    // Try nested wrapper keys
+    const inner = parsed.qa_guide ?? parsed.guides ?? parsed
+    if (inner.before && inner.after) {
+      before = inner.before
+      after = inner.after
+    } else {
+      console.warn(
+        'Full response:',
+        JSON.stringify(parsed, null, 2).slice(0, 2000)
+      )
+      throw new Error(
+        `Unexpected response shape. Got keys: ${Object.keys(parsed).join(', ')}`
+      )
+    }
+  }
+
+  return { before, after }
+}
+
+async function analyzeIssueWithGemini(
+  issue: IssueThread,
+  media: Array<{ path: string; mimeType: string }>,
+  model: string,
+  apiKey: string
+): Promise<QaGuide> {
+  const genAI = new GoogleGenerativeAI(apiKey)
+  const geminiModel = genAI.getGenerativeModel({ model })
+
+  const prompt = buildIssueAnalysisPrompt(issue)
+
+  const parts: Array<
+    { text: string } | { inlineData: { mimeType: string; data: string } }
+  > = [{ text: prompt }]
+
+  for (const item of media) {
+    try {
+      const buffer = readFileSync(item.path)
+      parts.push({
+        inlineData: {
+          mimeType: item.mimeType,
+          data: buffer.toString('base64')
+        }
+      })
+    } catch (err) {
+      console.warn(
+        `Failed to read media ${item.path}: ${(err as Error).message}`
+      )
+    }
+  }
+
+  console.warn(
+    `Sending to ${model}: ${prompt.length} chars text, ${parts.length - 1} media files`
+  )
+
+  const result = await geminiModel.generateContent({
+    contents: [{ role: 'user', parts }],
+    generationConfig: {
+      temperature: 0.2,
+      maxOutputTokens: 8192,
+      responseMimeType: 'application/json'
+    }
+  })
+
+  let text = result.response.text()
+  text = text
+    .replace(/^```(?:json)?\n?/gm, '')
+    .replace(/```$/gm, '')
+    .trim()
+
+  console.warn('Gemini response received')
+  console.warn('Raw response (first 500 chars):', text.slice(0, 500))
+  const parsed = JSON.parse(text)
+
+  const guide: QaGuide =
+    parsed.reproduce ?? parsed.qa_guide?.reproduce ?? parsed
+  return guide
+}
+
+// ── Main ──
+
+async function main() {
+  const opts = parseArgs()
+  mkdirSync(opts.outputDir, { recursive: true })
+
+  if (opts.type === 'issue') {
+    await analyzeIssue(opts)
+  } else {
+    await analyzePr(opts)
+  }
+}
+
+async function analyzeIssue(opts: Options) {
+  const issue = fetchIssueThread(opts.prNumber, opts.repo)
+
+  const allText = [issue.body, ...issue.comments].join('\n')
+  const mediaUrls = extractMediaUrls(allText)
+  console.warn(`Found ${mediaUrls.length} media URLs`)
+
+  const media = await downloadMedia(
+    mediaUrls,
+    opts.outputDir,
+    opts.mediaBudgetBytes,
+    opts.maxVideoBytes
+  )
+
+  const guide = await analyzeIssueWithGemini(
+    issue,
+    media,
+    opts.model,
+    opts.apiKey
+  )
+
+  const beforePath = resolve(opts.outputDir, 'qa-guide-before.json')
+  writeFileSync(beforePath, JSON.stringify(guide, null, 2))
+
+  console.warn(`Wrote QA guide:`)
+  console.warn(`  Reproduce: ${beforePath}`)
+}
+
+async function analyzePr(opts: Options) {
+  const thread = fetchPrThread(opts.prNumber, opts.repo)
+
+  const allText = [
+    thread.body,
+    ...thread.issueComments,
+    ...thread.reviewComments,
+    ...thread.reviews
+  ].join('\n')
+  const mediaUrls = extractMediaUrls(allText)
+  console.warn(`Found ${mediaUrls.length} media URLs`)
+
+  const media = await downloadMedia(
+    mediaUrls,
+    opts.outputDir,
+    opts.mediaBudgetBytes,
+    opts.maxVideoBytes
+  )
+
+  const guides = await analyzeWithGemini(thread, media, opts.model, opts.apiKey)
+
+  const beforePath = resolve(opts.outputDir, 'qa-guide-before.json')
+  const afterPath = resolve(opts.outputDir, 'qa-guide-after.json')
+  writeFileSync(beforePath, JSON.stringify(guides.before, null, 2))
+  writeFileSync(afterPath, JSON.stringify(guides.after, null, 2))
+
+  console.warn(`Wrote QA guides:`)
+  console.warn(`  Before: ${beforePath}`)
+  console.warn(`  After:  ${afterPath}`)
+}
+
+function isExecutedAsScript(metaUrl: string): boolean {
+  const modulePath = fileURLToPath(metaUrl)
+  const scriptPath = process.argv[1] ? resolve(process.argv[1]) : ''
+  return modulePath === scriptPath
+}
+
+if (isExecutedAsScript(import.meta.url)) {
+  main().catch((err) => {
+    console.error('QA analysis failed:', err)
+    process.exit(1)
+  })
+}
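Review note: `analyzeWithGemini` above accepts three response shapes (a two-element array, a `{ before, after }` object, or the same object nested under `qa_guide` / `guides`). A minimal sketch of that normalization as a standalone, type-guarded function follows. The names `normalizeGuides`, `pickPair`, `isRecord` and the `Guide` alias are hypothetical and not part of this PR, and unlike the original the sketch tries every wrapper key rather than only the first non-null one.

```typescript
// Hypothetical helper, not part of the PR: the accepted shapes mirror
// the branches in analyzeWithGemini.
type Guide = Record<string, unknown>

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value)
}

// Returns the before/after pair if `value` is an object holding both.
function pickPair(value: unknown): { before: Guide; after: Guide } | null {
  if (!isRecord(value)) return null
  const before = value.before
  const after = value.after
  if (isRecord(before) && isRecord(after)) return { before, after }
  return null
}

function normalizeGuides(parsed: unknown): { before: Guide; after: Guide } {
  // Array format: [before, after]
  if (Array.isArray(parsed) && parsed.length >= 2) {
    const before: unknown = parsed[0]
    const after: unknown = parsed[1]
    if (isRecord(before) && isRecord(after)) return { before, after }
  }

  // Object format, either top-level or nested under a wrapper key
  const candidates: unknown[] = isRecord(parsed)
    ? [parsed, parsed.qa_guide, parsed.guides]
    : []
  for (const candidate of candidates) {
    const pair = pickPair(candidate)
    if (pair) return pair
  }

  const keys = isRecord(parsed) ? Object.keys(parsed).join(', ') : typeof parsed
  throw new Error(`Unexpected response shape. Got keys: ${keys}`)
}
```

Checking that `before` and `after` are objects before returning them would also surface a `null` or string response as the descriptive "Unexpected response shape" error rather than a `TypeError` on property access.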
--- a/scripts/qa-batch.sh
+++ b/scripts/qa-batch.sh
--- a/scripts/qa-deploy-pages.sh
+++ b/scripts/qa-deploy-pages.sh
@@ -0,0 +1,381 @@
+#!/usr/bin/env bash
+# Deploy QA report to Cloudflare Pages.
+# Expected env vars: CLOUDFLARE_API_TOKEN, CLOUDFLARE_ACCOUNT_ID, RAW_BRANCH,
+#   BEFORE_SHA, AFTER_SHA, TARGET_NUM, TARGET_TYPE, REPO, RUN_ID
+# Optional: PIPELINE_SHA, RUN_URL, RUN_START_TIME. Writes outputs to GITHUB_OUTPUT: badge_status, url
+set -euo pipefail
+
+npm install -g wrangler@4.74.0 >/dev/null 2>&1
+
+DEPLOY_DIR=$(mktemp -d)
+mkdir -p "$DEPLOY_DIR"
+
+for os in Linux macOS Windows; do
+  DIR="qa-artifacts/qa-report-${os}-${RUN_ID}"
+  for prefix in qa qa-before; do
+    VID="${DIR}/${prefix}-session.mp4"
+    if [ -f "$VID" ]; then
+      DEST="$DEPLOY_DIR/${prefix}-${os}.mp4"
+      cp "$VID" "$DEST"
+      echo "Found ${prefix} ${os} video ($(du -h "$VID" | cut -f1))"
+    fi
+  done
+  # Copy multi-pass session videos (qa-session-1, qa-session-2, etc.)
+  for numbered in "$DIR"/qa-session-[0-9].mp4; do
+    [ -f "$numbered" ] || continue
+    NUM=$(basename "$numbered" | sed 's/qa-session-\([0-9]\).mp4/\1/')
+    DEST="$DEPLOY_DIR/qa-${os}-pass${NUM}.mp4"
+    cp "$numbered" "$DEST"
+    echo "Found pass ${NUM} ${os} video ($(du -h "$numbered" | cut -f1))"
+  done
+  # Generate GIF thumbnail from after video (or first pass)
+  THUMB_SRC="$DEPLOY_DIR/qa-${os}.mp4"
+  [ ! -f "$THUMB_SRC" ] && THUMB_SRC="$DEPLOY_DIR/qa-${os}-pass1.mp4"
+  if [ -f "$THUMB_SRC" ]; then
+    ffmpeg -y -ss 10 -i "$THUMB_SRC" -t 8 \
+      -vf "fps=8,scale=480:-1:flags=lanczos,split[s0][s1];[s0]palettegen=max_colors=64[p];[s1][p]paletteuse=dither=bayer" \
+      -loop 0 "$DEPLOY_DIR/qa-${os}-thumb.gif" 2>/dev/null \
+    || echo "GIF generation failed for ${os} (non-fatal)"
+  fi
+done
+
+# Build video cards and report sections
+CARDS=""
+# shellcheck disable=SC2034 # accessed via eval
+ICONS_Linux="&#x1F427;" ICONS_macOS="&#x1F34E;" ICONS_Windows="&#x1FA9F;"
+CARD_COUNT=0
+DL_ICON="<svg width=14 height=14 viewBox='0 0 24 24' fill=none stroke=currentColor stroke-width=2><path d='M21 15v4a2 2 0 0 1-2 2H5a2 2 0 0 1-2-2v-4'/><polyline points='7 10 12 15 17 10'/><line x1=12 y1=15 x2=12 y2='3'/></svg>"
+
+for os in Linux macOS Windows; do
+  eval "ICON=\$ICONS_${os}"
+  OS_LOWER=$(echo "$os" | tr '[:upper:]' '[:lower:]')
+  HAS_BEFORE=$([ -f "$DEPLOY_DIR/qa-before-${os}.mp4" ] && echo 1 || echo 0)
+  HAS_AFTER=$( { [ -f "$DEPLOY_DIR/qa-${os}.mp4" ] || [ -f "$DEPLOY_DIR/qa-${os}-pass1.mp4" ]; } && echo 1 || echo 0)
+  [ "$HAS_AFTER" = "0" ] && continue
+
+  # Collect all reports for this platform (single + multi-pass)
+  REPORT_FILES=""
+  REPORT_LINK=""
+  REPORT_HTML=""
+  for rpt in "video-reviews/${OS_LOWER}-qa-video-report.md" "video-reviews/${OS_LOWER}-pass"*-qa-video-report.md; do
+    [ -f "$rpt" ] && REPORT_FILES="${REPORT_FILES} ${rpt}"
+  done
+
+  if [ -n "$REPORT_FILES" ]; then
+    # Concatenate all reports into one combined report file
+    COMBINED_MD=""
+    for rpt in $REPORT_FILES; do
+      cp "$rpt" "$DEPLOY_DIR/$(basename "$rpt")"
+      RPT_MD=$(sed 's/&/\&amp;/g; s/</\&lt;/g; s/>/\&gt;/g' "$rpt")
+      [ -n "$COMBINED_MD" ] && COMBINED_MD="${COMBINED_MD}&#10;&#10;---&#10;&#10;"
+      COMBINED_MD="${COMBINED_MD}${RPT_MD}"
+    done
+    FIRST_REPORT=$(echo "$REPORT_FILES" | awk '{print $1}')
+    FIRST_BASENAME=$(basename "$FIRST_REPORT")
+    REPORT_LINK="<a class=dl href=${FIRST_BASENAME}><svg width=14 height=14 viewBox='0 0 24 24' fill=none stroke=currentColor stroke-width=2><path d='M14 2H6a2 2 0 0 0-2 2v16a2 2 0 0 0 2 2h12a2 2 0 0 0 2-2V8z'/><polyline points='14 2 14 8 20 8'/><line x1=16 y1=13 x2=8 y2='13'/><line x1=16 y1=17 x2=8 y2='17'/></svg>Report</a>"
+    REPORT_HTML="<details class=report open><summary><svg width=14 height=14 viewBox='0 0 24 24' fill=none stroke=currentColor stroke-width=2><circle cx=12 cy=12 r='10'/><line x1=12 y1=16 x2=12 y2='12'/><line x1=12 y1=8 x2=12.01 y2='8'/></svg> AI Comparative Review</summary><div class=report-body data-md>${COMBINED_MD}</div></details>"
+  fi
+
+  if [ "$HAS_BEFORE" = "1" ]; then
+    CARDS="${CARDS}<div class='card reveal' style='--i:${CARD_COUNT}'><div class=card-header><span class=platform><span class=icon>${ICON}</span>${os}</span><span class=links>${REPORT_LINK}</span></div><div class=comparison><div class=comp-panel><div class=comp-label>Before <span class=comp-tag>main</span></div><div class=video-wrap><video controls muted preload=auto><source src=qa-before-${os}.mp4 type=video/mp4></video></div><div class=comp-dl><a class=dl href=qa-before-${os}.mp4 download>${DL_ICON}Before</a></div></div><div class=comp-panel><div class=comp-label>After <span class=comp-tag>PR</span></div><div class=video-wrap><video controls muted preload=auto><source src=qa-${os}.mp4 type=video/mp4></video></div><div class=comp-dl><a class=dl href=qa-${os}.mp4 download>${DL_ICON}After</a></div></div></div>${REPORT_HTML}</div>"
+  elif [ -f "$DEPLOY_DIR/qa-${os}.mp4" ]; then
+    CARDS="${CARDS}<div class='card reveal' style='--i:${CARD_COUNT}'><div class=video-wrap><video controls muted preload=auto><source src=qa-${os}.mp4 type=video/mp4></video></div><div class=card-body><span class=platform><span class=icon>${ICON}</span>${os}</span><span class=links><a class=dl href=qa-${os}.mp4 download>${DL_ICON}Download</a>${REPORT_LINK}</span></div>${REPORT_HTML}</div>"
+  else
+    PASS_VIDEOS=""
+    for pass_vid in "$DEPLOY_DIR/qa-${os}-pass"[0-9].mp4; do
+      [ -f "$pass_vid" ] || continue
+      PASS_NUM=$(basename "$pass_vid" | sed "s/qa-${os}-pass\([0-9]\).mp4/\1/")
+      PASS_VIDEOS="${PASS_VIDEOS}<div class=comp-panel><div class=comp-label>Pass ${PASS_NUM}</div><div class=video-wrap><video controls muted preload=auto><source src=qa-${os}-pass${PASS_NUM}.mp4 type=video/mp4></video></div><div class=comp-dl><a class=dl href=qa-${os}-pass${PASS_NUM}.mp4 download>${DL_ICON}Pass ${PASS_NUM}</a></div></div>"
+    done
+    CARDS="${CARDS}<div class='card reveal' style='--i:${CARD_COUNT}'><div class=card-header><span class=platform><span class=icon>${ICON}</span>${os}</span><span class=links>${REPORT_LINK}</span></div><div class=comparison>${PASS_VIDEOS}</div>${REPORT_HTML}</div>"
+  fi
+  CARD_COUNT=$((CARD_COUNT + 1))
+done
+
+# Build commit info and target link for the report header
+COMMIT_HTML=""
+REPO_URL="https://github.com/${REPO}"
+if [ -n "${TARGET_NUM:-}" ]; then
+  if [ "$TARGET_TYPE" = "issue" ]; then
+    COMMIT_HTML="<a href=${REPO_URL}/issues/${TARGET_NUM} class=sha title='Issue'>Issue #${TARGET_NUM}</a>"
+  else
+    COMMIT_HTML="<a href=${REPO_URL}/pull/${TARGET_NUM} class=sha title='Pull Request'>PR #${TARGET_NUM}</a>"
+  fi
+fi
+if [ -n "${BEFORE_SHA:-}" ]; then
+  SHORT_BEFORE="${BEFORE_SHA:0:7}"
+  COMMIT_HTML="${COMMIT_HTML:+${COMMIT_HTML} &middot; }<a href=${REPO_URL}/commit/${BEFORE_SHA} class=sha title='main branch'>main @ ${SHORT_BEFORE}</a>"
+fi
+if [ -n "${AFTER_SHA:-}" ]; then
+  SHORT_AFTER="${AFTER_SHA:0:7}"
+  AFTER_LABEL="PR"
+  [ -n "${TARGET_NUM:-}" ] && AFTER_LABEL="#${TARGET_NUM}"
+  COMMIT_HTML="${COMMIT_HTML:+${COMMIT_HTML} &middot; }<a href=${REPO_URL}/commit/${AFTER_SHA} class=sha title='PR head commit'>${AFTER_LABEL} @ ${SHORT_AFTER}</a>"
+fi
+if [ -n "${PIPELINE_SHA:-}" ]; then
+  SHORT_PIPE="${PIPELINE_SHA:0:7}"
+  COMMIT_HTML="${COMMIT_HTML:+${COMMIT_HTML} &middot; }<a href=${REPO_URL}/commit/${PIPELINE_SHA} class=sha title='QA pipeline version'>QA @ ${SHORT_PIPE}</a>"
+fi
+[ -n "$COMMIT_HTML" ] && COMMIT_HTML=" &middot; ${COMMIT_HTML}"
+
+RUN_LINK=""
+if [ -n "${RUN_URL:-}" ]; then
+  RUN_LINK=" &middot; <a href=\"${RUN_URL}\" class=sha title=\"GitHub Actions run\">CI Job</a>"
+fi
+
+# Timing info
+DEPLOY_TIME=$(date -u '+%Y-%m-%d %H:%M UTC')
+TIMING_HTML=""
+if [ -n "${RUN_START_TIME:-}" ]; then
+  TIMING_HTML=" &middot; <span class=sha title='Pipeline timing'>${RUN_START_TIME} &rarr; ${DEPLOY_TIME}</span>"
+fi
+
+# Generate index.html from template
+SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
+TEMPLATE="$SCRIPT_DIR/qa-report-template.html"
+
+# Write dynamic content to temp files for safe substitution
+# Cloudflare Pages _headers file — enable range requests for video seeking
+cat > "$DEPLOY_DIR/_headers" <<'HEADERSEOF'
+/*.mp4
+  Accept-Ranges: bytes
+  Cache-Control: public, max-age=86400
+HEADERSEOF
+
+# Build purpose description from pr-context.txt
+PURPOSE_HTML=""
+if [ -f pr-context.txt ]; then
+  # Extract title line and first paragraph of description
+  PR_TITLE=$(grep -m1 '^Title:' pr-context.txt 2>/dev/null | sed 's/^Title: //; s/&/\&amp;/g; s/</\&lt;/g; s/>/\&gt;/g' || true)
+  if [ "$TARGET_TYPE" = "issue" ]; then
+    PURPOSE_LABEL="Issue #${TARGET_NUM}"
+    PURPOSE_VERB="reports"
+  else
+    PURPOSE_LABEL="PR #${TARGET_NUM}"
+    PURPOSE_VERB="aims to"
+  fi
+  # Get up to 400 bytes from the first 5 lines of the description body (after the "Description:" line)
+  PR_DESC=$(sed -n '/^Description:/,/^###/p' pr-context.txt 2>/dev/null | grep -v '^Description:\|^###' | head -5 | sed 's/&/\&amp;/g; s/</\&lt;/g; s/>/\&gt;/g' | tr '\n' ' ' | head -c 400 || true)
+  [ -z "$PR_DESC" ] && PR_DESC=$(sed -n '3,8p' pr-context.txt 2>/dev/null | sed 's/&/\&amp;/g; s/</\&lt;/g; s/>/\&gt;/g' | tr '\n' ' ' | head -c 400 || true)
+  # Build requirements from QA guide JSON
+  REQS_HTML=""
+  QA_GUIDE=$(ls qa-guides/qa-guide-*.json 2>/dev/null | head -1 || true)
+  if [ -f "$QA_GUIDE" ]; then
+    PREREQS=$(python3 -c "
+import json, sys, html
+try:
+  g = json.load(open(sys.argv[1]))
+  prereqs = g.get('prerequisites', [])
+  steps = g.get('steps', [])
+  focus = g.get('test_focus', '')
+  parts = []
+  if focus:
+    parts.append('<strong>Test focus:</strong> ' + html.escape(focus))
+  if prereqs:
+    parts.append('<strong>Prerequisites:</strong> ' + ', '.join(html.escape(p) for p in prereqs))
+  if steps:
+    parts.append('<strong>Steps:</strong> ' + ' → '.join(html.escape(s.get('description', str(s))) for s in steps[:6]))
+    if len(steps) > 6:
+      parts[-1] += ' → ...'
+  print('<br>'.join(parts))
+except Exception: pass
+" "$QA_GUIDE" 2>/dev/null)
+    [ -n "$PREREQS" ] && REQS_HTML="<div class=purpose-reqs>${PREREQS}</div>"
+  fi
+
+  PURPOSE_HTML="<div class=purpose><div class=purpose-label>${PURPOSE_LABEL} ${PURPOSE_VERB}</div><strong>${PR_TITLE}</strong><br>${PR_DESC}${REQS_HTML}</div>"
+fi
+
+echo -n "$COMMIT_HTML" > "$DEPLOY_DIR/.commit_html"
+echo -n "$CARDS" > "$DEPLOY_DIR/.cards_html"
+echo -n "$RUN_LINK" > "$DEPLOY_DIR/.run_link"
+# Badge HTML with copy button (placeholder URL filled after deploy)
+echo -n '<div class="badge-bar"><img src="badge.svg" alt="QA Badge" class="badge-img"/><button class="copy-badge" title="Copy badge markdown" onclick="copyBadge()"><svg width=14 height=14 viewBox="0 0 24 24" fill=none stroke=currentColor stroke-width=2><rect x=9 y=9 width=13 height=13 rx=2/><path d="M5 15H4a2 2 0 0 1-2-2V4a2 2 0 0 1 2-2h9a2 2 0 0 1 2 2v1"/></svg></button></div>' > "$DEPLOY_DIR/.badge_html"
+echo -n "${TIMING_HTML:-}" > "$DEPLOY_DIR/.timing_html"
+echo -n "$PURPOSE_HTML" > "$DEPLOY_DIR/.purpose_html"
+python3 -c "
+import sys, pathlib
+d = pathlib.Path(sys.argv[1])
+t = pathlib.Path(sys.argv[2]).read_text()
+t = t.replace('{{COMMIT_HTML}}', (d / '.commit_html').read_text())
+t = t.replace('{{CARDS}}', (d / '.cards_html').read_text())
+t = t.replace('{{RUN_LINK}}', (d / '.run_link').read_text())
+t = t.replace('{{BADGE_HTML}}', (d / '.badge_html').read_text())
+t = t.replace('{{TIMING_HTML}}', (d / '.timing_html').read_text())
+t = t.replace('{{PURPOSE_HTML}}', (d / '.purpose_html').read_text())
+sys.stdout.write(t)
+" "$DEPLOY_DIR" "$TEMPLATE" > "$DEPLOY_DIR/index.html"
+rm -f "$DEPLOY_DIR/.commit_html" "$DEPLOY_DIR/.cards_html" "$DEPLOY_DIR/.run_link" "$DEPLOY_DIR/.badge_html" "$DEPLOY_DIR/.timing_html" "$DEPLOY_DIR/.purpose_html"
+
+cat > "$DEPLOY_DIR/404.html" <<'ERROREOF'
+<!DOCTYPE html><html lang=en><head><meta charset=utf-8><title>404</title>
+<link href="https://fonts.googleapis.com/css2?family=Inter:wght@400;600&display=swap" rel=stylesheet>
+<style>:root{--bg:oklch(8% 0.02 265);--fg:oklch(45% 0.01 265);--err:oklch(62% 0.22 25)}*{margin:0;padding:0;box-sizing:border-box}body{background:var(--bg);color:var(--fg);font-family:'Inter',system-ui,sans-serif;display:flex;align-items:center;justify-content:center;min-height:100vh}div{text-align:center}h1{color:var(--err);font-size:clamp(3rem,8vw,5rem);font-weight:700;letter-spacing:-.04em;margin-bottom:.5rem}p{font-size:1rem;max-width:32ch;line-height:1.5}</style>
+</head><body><div><h1>404</h1><p>File not found. The QA recording may have failed or been cancelled.</p></div></body></html>
+ERROREOF
+
+# Copy research log to deploy dir if it exists
+for rlog in qa-artifacts/*/research/research-log.json qa-artifacts/*/*/research/research-log.json qa-artifacts/before/*/research/research-log.json; do
+  if [ -f "$rlog" ]; then
+    cp "$rlog" "$DEPLOY_DIR/research-log.json"
+    echo "Found research log: $rlog"
+    break
+  fi
+done
+
+# Copy generated test code to deploy dir
+for tfile in qa-artifacts/*/research/reproduce.spec.ts qa-artifacts/*/*/research/reproduce.spec.ts qa-artifacts/before/*/research/reproduce.spec.ts; do
+  if [ -f "$tfile" ]; then
+    cp "$tfile" "$DEPLOY_DIR/reproduce.spec.ts"
+    echo "Found test code: $tfile"
+    break
+  fi
+done
+
+# Generate badge SVGs into deploy dir
+# Priority: research-log.json verdict (a11y-verified) > video review verdict (AI interpretation)
+REPRO_COUNT=0 INCONC_COUNT=0 NOT_REPRO_COUNT=0 TOTAL_REPORTS=0
+
+# Try research log first (ground truth from a11y assertions)
+RESEARCH_VERDICT=""
+REPRO_METHOD=""
+if [ -f "$DEPLOY_DIR/research-log.json" ]; then
+  RESEARCH_VERDICT=$(python3 -c "import json,sys; d=json.load(open(sys.argv[1])); print(d.get('verdict',''))" "$DEPLOY_DIR/research-log.json" 2>/dev/null || true)
+  REPRO_METHOD=$(python3 -c "import json,sys; d=json.load(open(sys.argv[1])); print(d.get('reproducedBy','none'))" "$DEPLOY_DIR/research-log.json" 2>/dev/null || true)
+  echo "Research verdict (a11y-verified): ${RESEARCH_VERDICT:-none} (by: ${REPRO_METHOD:-none})"
+  if [ -n "$RESEARCH_VERDICT" ]; then
+    TOTAL_REPORTS=1
+    case "$RESEARCH_VERDICT" in
+      REPRODUCED) REPRO_COUNT=1 ;;
+      NOT_REPRODUCIBLE) NOT_REPRO_COUNT=1 ;;
+      INCONCLUSIVE) INCONC_COUNT=1 ;;
+    esac
+  fi
+fi
+
+# Fall back to video review verdicts if no research log
+if [ -z "$RESEARCH_VERDICT" ] && [ -d video-reviews ]; then
+  for rpt in video-reviews/*-qa-video-report.md; do
+    [ -f "$rpt" ] || continue
+    TOTAL_REPORTS=$((TOTAL_REPORTS + 1))
+    # Try structured JSON verdict first (from ## Verdict section)
+    VERDICT_JSON=$(grep -oP '"verdict":\s*"[A-Z_]+' "$rpt" 2>/dev/null | tail -1 | grep -oP '[A-Z_]+$' || true)
+    RISK_JSON=$(grep -oP '"risk":\s*"[a-z]+' "$rpt" 2>/dev/null | tail -1 | grep -oP '[a-z]+$' || true)
+
+    if [ -n "$VERDICT_JSON" ]; then
+      case "$VERDICT_JSON" in
+        REPRODUCED) REPRO_COUNT=$((REPRO_COUNT + 1)) ;;
+        NOT_REPRODUCIBLE) NOT_REPRO_COUNT=$((NOT_REPRO_COUNT + 1)) ;;
+        INCONCLUSIVE) INCONC_COUNT=$((INCONC_COUNT + 1)) ;;
+      esac
+    else
+      # Fallback: grep Summary section (for older reports without ## Verdict)
+      SUMM=$(sed -n '/^## Summary/,/^## /p' "$rpt" 2>/dev/null | head -15)
+      if echo "$SUMM" | grep -iq 'INCONCLUSIVE'; then
+        INCONC_COUNT=$((INCONC_COUNT + 1))
+      elif echo "$SUMM" | grep -iq 'not reproduced\|could not reproduce\|could not be confirmed\|unable to reproduce\|fails\? to reproduce\|fails\? to perform\|was NOT\|NOT visible\|not observed\|fail.* to demonstrate\|does not demonstrate\|steps were not performed\|never.*tested\|never.*accessed\|not.* confirmed'; then
+        NOT_REPRO_COUNT=$((NOT_REPRO_COUNT + 1))
+      elif echo "$SUMM" | grep -iq 'reproduc\|confirm'; then
+        REPRO_COUNT=$((REPRO_COUNT + 1))
+      fi
+    fi
+  done
+fi
+FAIL_COUNT=$((TOTAL_REPORTS - REPRO_COUNT - NOT_REPRO_COUNT))
+[ "$FAIL_COUNT" -lt 0 ] && FAIL_COUNT=0
+echo "DEBUG verdict: repro=${REPRO_COUNT} not_repro=${NOT_REPRO_COUNT} inconc=${INCONC_COUNT} fail=${FAIL_COUNT} total=${TOTAL_REPORTS}"
+echo "Verdict: ${REPRO_COUNT}✓ ${NOT_REPRO_COUNT}✗ ${FAIL_COUNT}⚠ / ${TOTAL_REPORTS}"
+
+# Badge text:
+#   Single pass: "REPRODUCED" / "NOT REPRODUCIBLE" / "INCONCLUSIVE"
+#   Multi pass:  "2✓ 1⚠ / 3" (zero counts omitted), colored by the best outcome
+REPRO_RESULT="" REPRO_COLOR="#9f9f9f"
+if [ "$TOTAL_REPORTS" -le 1 ]; then
+  # Single report — simple label
+  if [ "$REPRO_COUNT" -gt 0 ]; then
+    REPRO_RESULT="REPRODUCED" REPRO_COLOR="#2196f3"
+  elif [ "$NOT_REPRO_COUNT" -gt 0 ]; then
+    REPRO_RESULT="NOT REPRODUCIBLE" REPRO_COLOR="#9f9f9f"
+  elif [ "$FAIL_COUNT" -gt 0 ]; then
+    REPRO_RESULT="INCONCLUSIVE" REPRO_COLOR="#9f9f9f"
+  fi
+else
+  # Multi pass — show breakdown: X✓ Y✗ Z⚠ / N
+  PARTS=""
+  [ "$REPRO_COUNT" -gt 0 ] && PARTS="${REPRO_COUNT}✓"
+  [ "$NOT_REPRO_COUNT" -gt 0 ] && PARTS="${PARTS:+${PARTS} }${NOT_REPRO_COUNT}✗"
+  [ "$FAIL_COUNT" -gt 0 ] && PARTS="${PARTS:+${PARTS} }${FAIL_COUNT}⚠"
+  REPRO_RESULT="${PARTS} / ${TOTAL_REPORTS}"
+  # Color based on best outcome
+  if [ "$REPRO_COUNT" -gt 0 ]; then
+    REPRO_COLOR="#2196f3"
+  elif [ "$NOT_REPRO_COUNT" -gt 0 ]; then
+    REPRO_COLOR="#9f9f9f"
+  fi
+fi
+
+# Badge label: #NUM QA0327 (with today's date)
+QA_DATE=$(date -u '+%m%d')
+BADGE_LABEL="QA${QA_DATE}"
+[ -n "${TARGET_NUM:-}" ] && BADGE_LABEL="#${TARGET_NUM} QA${QA_DATE}"
+
+# For PRs, also extract fix quality from Overall Risk section
+FIX_RESULT="" FIX_COLOR="#4c1"
+if [ "$TARGET_TYPE" != "issue" ]; then
+  # Try structured JSON risk first
+  ALL_RISKS=$(grep -ohP '"risk":\s*"[a-z]+' video-reviews/*.md 2>/dev/null | grep -oP '[a-z]+$' || true)
+  if [ -n "$ALL_RISKS" ]; then
+    # Use worst risk across all reports
+    if echo "$ALL_RISKS" | grep -q 'high'; then
+      FIX_RESULT="MAJOR ISSUES" FIX_COLOR="#e05d44"
+    elif echo "$ALL_RISKS" | grep -q 'medium'; then
+      FIX_RESULT="MINOR ISSUES" FIX_COLOR="#dfb317"
+    elif echo "$ALL_RISKS" | grep -q 'low'; then
+      FIX_RESULT="APPROVED" FIX_COLOR="#4c1"
+    fi
+  else
+    # Fallback: grep Overall Risk section
+    RISK_TEXT=""
+    if [ -d video-reviews ]; then
+      RISK_TEXT=$(sed -n '/^## Overall Risk/,/^## /p' video-reviews/*.md 2>/dev/null | sed 's/\*//g' | head -20 || true)
+    fi
+    RISK_FIRST=$(echo "$RISK_TEXT" | grep -oiP '^\s*(high|medium|moderate|low|minimal|critical)' | head -1 | tr '[:upper:]' '[:lower:]' || true)
+    if [ -n "$RISK_FIRST" ]; then
+      case "$RISK_FIRST" in
+        *low*|*minimal*) FIX_RESULT="APPROVED" FIX_COLOR="#4c1" ;;
+        *medium*|*moderate*) FIX_RESULT="MINOR ISSUES" FIX_COLOR="#dfb317" ;;
+        *high*|*critical*) FIX_RESULT="MAJOR ISSUES" FIX_COLOR="#e05d44" ;;
+      esac
+    elif echo "$RISK_TEXT" | grep -iq 'no.*risk\|approved\|looks good'; then
+      FIX_RESULT="APPROVED" FIX_COLOR="#4c1"
+    fi
+  fi
+fi
+
+# Always use vertical box badge
+/tmp/gen-badge-box.sh "$DEPLOY_DIR/badge.svg" "$BADGE_LABEL" \
+  "$REPRO_COUNT" "$NOT_REPRO_COUNT" "$FAIL_COUNT" "$TOTAL_REPORTS" \
+  "$FIX_RESULT" "$FIX_COLOR" "$REPRO_METHOD"
+BADGE_STATUS="${REPRO_RESULT:-UNKNOWN}${FIX_RESULT:+ | Fix: ${FIX_RESULT}}"
+echo "badge_status=${BADGE_STATUS:-FINISHED}" >> "$GITHUB_OUTPUT"
+
+# Remove files exceeding Cloudflare Pages 25MB limit to prevent silent deploy failures
+MAX_SIZE=$((25 * 1024 * 1024))
+find "$DEPLOY_DIR" -type f -size +${MAX_SIZE}c | while read -r big_file; do
+  SIZE_MB=$(( $(stat -c%s "$big_file") / 1024 / 1024 ))
+  echo "Removing oversized file: $(basename "$big_file") (${SIZE_MB}MB > 25MB limit)"
+  rm "$big_file"
+done
+
+BRANCH=$(echo "$RAW_BRANCH" | sed 's/[^a-zA-Z0-9-]/-/g' | sed 's/--*/-/g' | cut -c1-28 | sed 's/^-//;s/-$//')
+
+DEPLOY_OUTPUT=$(wrangler pages deploy "$DEPLOY_DIR" \
+  --project-name="comfy-qa" \
+  --branch="$BRANCH" 2>&1) || true
+echo "$DEPLOY_OUTPUT" | tail -5
+
+URL=$(echo "$DEPLOY_OUTPUT" | grep -oE 'https://[a-zA-Z0-9.-]+\.pages\.dev\S*' | head -1 || true)
+FALLBACK_URL="https://${BRANCH}.comfy-qa.pages.dev"
+
+echo "url=${URL:-$FALLBACK_URL}" >> "$GITHUB_OUTPUT"
+echo "Deployed to: ${URL:-$FALLBACK_URL}"
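Review note: the branch alias at the end of `qa-deploy-pages.sh` is derived by replacing disallowed characters with hyphens, collapsing repeats, trimming edge hyphens, and capping the length at 28. A minimal TypeScript sketch of the same idea is below, for illustration only. `sanitizeBranch` and `fallbackUrl` are hypothetical names (the script itself uses `sed` and `cut`), and the sketch trims hyphens after truncating so that a cut landing on a hyphen cannot leave one at the end.

```typescript
// Hypothetical helpers for illustration; not part of the PR.
function sanitizeBranch(raw: string, maxLength = 28): string {
  return raw
    .replace(/[^a-zA-Z0-9-]/g, '-') // anything outside [A-Za-z0-9-] becomes a hyphen
    .replace(/-+/g, '-') // collapse runs of hyphens
    .slice(0, maxLength) // apply the length cap
    .replace(/^-|-$/g, '') // trim one leading and one trailing hyphen
}

// Mirrors the script's fallback URL pattern: https://<branch>.comfy-qa.pages.dev
function fallbackUrl(rawBranch: string, project = 'comfy-qa'): string {
  return `https://${sanitizeBranch(rawBranch)}.${project}.pages.dev`
}

console.log(sanitizeBranch('sno-qa/10253_fix'))
console.log(fallbackUrl('sno-qa-10253'))
```

Because runs of hyphens are collapsed before trimming, removing a single hyphen at each edge is enough.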
--- a/scripts/qa-generate-test.ts
+++ b/scripts/qa-generate-test.ts
@@ -0,0 +1,208 @@
+#!/usr/bin/env tsx
+/**
+ * Generates a Playwright regression test (.spec.ts) from a QA report + PR diff.
+ * Uses Gemini to produce a test that asserts UI/UX behavior verified during QA.
+ *
+ * Usage:
+ *   pnpm exec tsx scripts/qa-generate-test.ts \
+ *     --qa-report <path>       QA video review report (markdown)
+ *     --pr-diff <path>         PR diff file
+ *     --output <path>          Output .spec.ts file path
+ *     --model <name>           Gemini model (default: gemini-3-flash-preview)
+ */
+import { readFile, writeFile } from 'node:fs/promises'
+import { basename, resolve } from 'node:path'
+
+import { GoogleGenerativeAI } from '@google/generative-ai'
+
+interface CliOptions {
+  qaReport: string
+  prDiff: string
+  output: string
+  model: string
+}
+
+const DEFAULTS: CliOptions = {
+  qaReport: '',
+  prDiff: '',
+  output: '',
+  model: 'gemini-3-flash-preview'
+}
+
+// ── Fixture API reference for the prompt ────────────────────────────
+const FIXTURE_API = `
+## ComfyUI Playwright Test Fixture API
+
+Import pattern:
+\`\`\`typescript
+import { expect } from '@playwright/test'
+import { comfyPageFixture as test } from '../fixtures/ComfyPage'
+\`\`\`
+
+### Available helpers on \`comfyPage\`:
+- \`comfyPage.page\` — raw Playwright Page
+- \`comfyPage.menu.topbar\` — Topbar helper:
+  - \`.getTabNames(): Promise<string[]>\` — get all open tab names
+  - \`.getActiveTabName(): Promise<string>\` — get active tab name
+  - \`.saveWorkflow(name)\` — Save via File > Save dialog
+  - \`.saveWorkflowAs(name)\` — Save via File > Save As dialog
+  - \`.exportWorkflow(name)\` — Export via File > Export dialog
+  - \`.triggerTopbarCommand(path: string[])\` — e.g. ['File', 'Save As']
+  - \`.getWorkflowTab(name)\` — get a tab locator by name
+  - \`.closeWorkflowTab(name)\` — close a tab
+  - \`.openTopbarMenu()\` — open the hamburger menu
+  - \`.openSubmenu(label)\` — hover to open a submenu
+- \`comfyPage.menu.workflowsTab\` — Workflows sidebar:
+  - \`.open()\` / \`.close()\` — toggle sidebar
+  - \`.getTopLevelSavedWorkflowNames()\` — list saved workflows
+  - \`.getPersistedItem(name)\` — get a workflow item locator
+- \`comfyPage.workflow\` — WorkflowHelper:
+  - \`.loadWorkflow(name)\` — load from browser_tests/assets/{name}.json
+  - \`.setupWorkflowsDirectory(structure)\` — setup test directory
+  - \`.deleteWorkflow(name)\` — delete a workflow
+  - \`.isCurrentWorkflowModified(): Promise<boolean>\` — check dirty state
+  - \`.getUndoQueueSize()\` / \`.getRedoQueueSize()\`
+- \`comfyPage.settings.setSetting(key, value)\` — change settings
+- \`comfyPage.keyboard\` — KeyboardHelper:
+  - \`.undo()\` / \`.redo()\` / \`.bypass()\`
+- \`comfyPage.nodeOps\` — NodeOperationsHelper
+- \`comfyPage.canvas\` — CanvasHelper
+- \`comfyPage.contextMenu\` — ContextMenu
+- \`comfyPage.toast\` — ToastHelper
+- \`comfyPage.confirmDialog\` — confirmation dialog
+- \`comfyPage.nextFrame()\` — wait for Vue re-render
+
+### Test patterns:
+- Use \`test.describe('Name', { tag: '@ui' }, () => { ... })\` for UI tests
+- Use \`test.beforeEach\` to set up common state (settings, workflow dir)
+- Use \`expect(locator).toHaveScreenshot('name.png')\` for visual assertions
+- Use \`expect(locator).toBeVisible()\` / \`.toHaveText()\` for behavioral assertions
+- Use \`comfyPage.workflow.setupWorkflowsDirectory({})\` to ensure clean state
+`
+
+// ── Prompt builder ──────────────────────────────────────────────────
+function buildPrompt(qaReport: string, prDiff: string): string {
+  return `You are a Playwright test generator for the ComfyUI frontend.
+
+Your task: Generate a single .spec.ts regression test file that asserts the UI/UX behavior
+described in the QA report below. The test must:
+
+1. Use the ComfyUI Playwright fixture API (documented below)
+2. Test UI/UX behavior ONLY: element visibility, tab names, dialog states, workflow states
+3. NOT test code implementation details
+4. Be concise — only test the behavior that the PR changed
+5. Follow existing test conventions (see API reference)
+
+${FIXTURE_API}
+
+## QA Video Review Report
+${qaReport}
+
+## PR Diff (for context on what changed)
+${prDiff.slice(0, 8000)}
+
+## Output Requirements
+- Output ONLY the .spec.ts file content — no markdown fences, no explanations
+- Start with imports, end with closing brace
+- Use descriptive test names that explain the expected behavior
+- Add screenshot assertions where visual verification matters
+- Keep it focused: 2-5 test cases covering the core behavioral change
+- Use \`test.beforeEach\` for common setup (settings, workflow directory)
+- Tag the describe block with \`{ tag: '@ui' }\` or \`{ tag: '@workflow' }\` as appropriate
+`
+}
+
+// ── Gemini call ─────────────────────────────────────────────────────
+async function generateTest(
+  qaReport: string,
+  prDiff: string,
+  model: string
+): Promise<string> {
+  const apiKey = process.env.GEMINI_API_KEY
+  if (!apiKey) throw new Error('GEMINI_API_KEY env var required')
+
+  const genAI = new GoogleGenerativeAI(apiKey)
+  const genModel = genAI.getGenerativeModel({ model })
+
+  const prompt = buildPrompt(qaReport, prDiff)
+  console.warn(`Sending prompt to ${model} (${prompt.length} chars)...`)
+
+  const result = await genModel.generateContent({
+    contents: [{ role: 'user', parts: [{ text: prompt }] }],
+    generationConfig: {
+      temperature: 0.2,
+      maxOutputTokens: 8192
+    }
+  })
+
+  const text = result.response.text()
+
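+  // Strip markdown fences if the model wraps its output; trim first so the
+  // anchored patterns still match when the reply has surrounding whitespace
+  return text
+    .trim()
+    .replace(/^```(?:typescript|ts)?\n?/, '')
+    .replace(/\n?```$/, '')
+    .trim()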
+}
+
+// ── CLI ─────────────────────────────────────────────────────────────
+function parseArgs(): CliOptions {
+  const args = process.argv.slice(2)
+  const opts = { ...DEFAULTS }
+
+  for (let i = 0; i < args.length; i++) {
+    switch (args[i]) {
+      case '--qa-report':
+        opts.qaReport = args[++i]
+        break
+      case '--pr-diff':
+        opts.prDiff = args[++i]
+        break
+      case '--output':
+        opts.output = args[++i]
+        break
+      case '--model':
+        opts.model = args[++i]
+        break
+      case '--help':
+        console.warn(`Usage:
+  pnpm exec tsx scripts/qa-generate-test.ts [options]
+
+Options:
+  --qa-report <path>   QA video review report (markdown) [required]
+  --pr-diff <path>     PR diff file [required]
+  --output <path>      Output .spec.ts path [required]
+  --model <name>       Gemini model (default: gemini-3-flash-preview)`)
+        process.exit(0)
+    }
+  }
+
+  if (!opts.qaReport || !opts.prDiff || !opts.output) {
+    console.error('Missing required args. Run with --help for usage.')
+    process.exit(1)
+  }
+
+  return opts
+}
+
+async function main() {
+  const opts = parseArgs()
+
+  const qaReport = await readFile(resolve(opts.qaReport), 'utf-8')
+  const prDiff = await readFile(resolve(opts.prDiff), 'utf-8')
+
+  console.warn(
+    `QA report: ${basename(opts.qaReport)} (${qaReport.length} chars)`
+  )
+  console.warn(`PR diff: ${basename(opts.prDiff)} (${prDiff.length} chars)`)
+
+  const testCode = await generateTest(qaReport, prDiff, opts.model)
+
+  const outputPath = resolve(opts.output)
+  await writeFile(outputPath, testCode + '\n')
+  console.warn(`Generated test: ${outputPath} (${testCode.length} chars)`)
+}
+
+main().catch((err) => {
+  console.error(err)
+  process.exit(1)
+})
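Review note: the fence-stripping step at the end of `generateTest` can be exercised on its own. The following is a minimal sketch, not the script's actual code: `stripCodeFences`, `FENCE`, `OPEN_FENCE`, and `CLOSE_FENCE` are hypothetical names, and the fence is built with `repeat` only so the example does not have to contain a literal fence. It trims before matching, on the assumption that model replies often end with a newline after the closing fence.

```typescript
// Hypothetical helper mirroring the fence-stripping step in generateTest().
// The three-backtick fence is assembled at runtime to keep this example readable.
const FENCE = '`'.repeat(3)
const OPEN_FENCE = new RegExp('^' + FENCE + '(?:typescript|ts)?\\n?')
const CLOSE_FENCE = new RegExp('\\n?' + FENCE + '$')

function stripCodeFences(raw: string): string {
  return raw
    .trim() // remove surrounding whitespace so ^ and $ line up with the fences
    .replace(OPEN_FENCE, '') // drop an opening fence with optional ts/typescript tag
    .replace(CLOSE_FENCE, '') // drop the closing fence
    .trim()
}

// A reply wrapped in fences, with a trailing newline after the closing fence
const wrappedReply = FENCE + "ts\nimport { test } from '@playwright/test'\n" + FENCE + '\n'
console.log(stripCodeFences(wrappedReply))
```

Unfenced replies pass through unchanged apart from trimming, so the helper is safe to apply unconditionally.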
--- a/scripts/qa-record.ts
+++ b/scripts/qa-record.ts
--- a/scripts/qa-report-template.html
+++ b/scripts/qa-report-template.html
@@ -0,0 +1,135 @@
+<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=viewport content="width=device-width,initial-scale=1"><title>QA Session Recordings</title>
+<link rel=preconnect href=https://fonts.googleapis.com><link rel=preconnect href=https://fonts.gstatic.com crossorigin><link href="https://fonts.googleapis.com/css2?family=Inter:wght@400;500;600;700&family=JetBrains+Mono:wght@400;500&display=swap" rel=stylesheet>
+<script src="https://cdn.jsdelivr.net/npm/marked/marked.min.js"></script>
+<style>
+:root{--bg:oklch(97% 0.01 265);--surface:oklch(100% 0 0);--surface-up:oklch(94% 0.01 265);--fg:oklch(15% 0.02 265);--fg-muted:oklch(40% 0.01 265);--fg-dim:oklch(55% 0.01 265);--primary:oklch(50% 0.21 265);--primary-up:oklch(45% 0.21 265);--primary-glow:oklch(55% 0.15 265);--ok:oklch(45% 0.18 155);--err:oklch(50% 0.22 25);--border:oklch(85% 0.01 265);--border-faint:oklch(90% 0.01 265);--r:0.75rem;--r-lg:1rem;--ease-out:cubic-bezier(0.22,1,0.36,1);--dur-base:250ms;--dur-slow:500ms;--font:'Inter',system-ui,sans-serif;--font-mono:'JetBrains Mono',monospace}
+@media(prefers-color-scheme:dark){:root{--bg:oklch(8% 0.02 265);--surface:oklch(12% 0.02 265);--surface-up:oklch(16% 0.02 265);--fg:oklch(96% 0.01 95);--fg-muted:oklch(65% 0.01 265);--fg-dim:oklch(45% 0.01 265);--primary:oklch(62% 0.21 265);--primary-up:oklch(68% 0.21 265);--primary-glow:oklch(62% 0.15 265);--ok:oklch(62% 0.18 155);--err:oklch(62% 0.22 25);--border:oklch(22% 0.02 265);--border-faint:oklch(15% 0.01 265)}}
+*{margin:0;padding:0;box-sizing:border-box}
+body{background:var(--bg);color:var(--fg);font-family:var(--font);min-height:100vh;padding:clamp(1.5rem,4vw,3rem) clamp(1rem,3vw,2rem);position:relative}
+@media(prefers-color-scheme:dark){body::after{content:'';position:fixed;inset:0;pointer-events:none;opacity:.03;background:url("data:image/svg+xml,%3Csvg viewBox='0 0 256 256' xmlns='http://www.w3.org/2000/svg'%3E%3Cfilter id='n'%3E%3CfeTurbulence type='fractalNoise' baseFrequency='.85' numOctaves='4' stitchTiles='stitch'/%3E%3C/filter%3E%3Crect width='100%25' height='100%25' filter='url(%23n)'/%3E%3C/svg%3E")}}
+.container{max-width:1200px;margin:0 auto}
+header{display:flex;align-items:center;gap:1rem;margin-bottom:clamp(1.5rem,4vw,3rem);padding-bottom:1.25rem;border-bottom:1px solid var(--border)}
+.header-icon{width:36px;height:36px;display:grid;place-items:center;background:linear-gradient(135deg,oklch(100% 0 0/.06),oklch(100% 0 0/.02));backdrop-filter:blur(12px);border:1px solid oklch(100% 0 0/.1);border-radius:var(--r);flex-shrink:0}
+.header-icon svg{color:var(--primary)}
+h1{font-size:clamp(1.25rem,2.5vw,1.625rem);font-weight:700;letter-spacing:-.03em;background:linear-gradient(135deg,var(--fg),var(--fg-muted));-webkit-background-clip:text;-webkit-text-fill-color:transparent;background-clip:text}
+.meta{color:var(--fg-dim);font-size:.8125rem;margin-top:.15rem;letter-spacing:.01em}
+.grid{display:grid;grid-template-columns:repeat(auto-fill,minmax(min(480px,100%),1fr));gap:1.5rem}
+.card{background:var(--surface);border:1px solid var(--border);border-radius:var(--r-lg);overflow:hidden;transition:border-color var(--dur-base) var(--ease-out),box-shadow var(--dur-base) var(--ease-out),transform var(--dur-base) var(--ease-out)}
+.card:hover{border-color:var(--primary);box-shadow:0 4px 16px oklch(0% 0 0/.1);transform:translateY(-2px)}
+.video-wrap{position:relative;background:var(--surface);border-bottom:1px solid var(--border-faint)}
+.video-wrap video{width:100%;display:block;aspect-ratio:16/9;object-fit:contain}
+.card-body{padding:.75rem 1rem;display:flex;align-items:center;justify-content:space-between}
+.platform{display:flex;align-items:center;gap:.5rem;font-weight:600;font-size:.9375rem;letter-spacing:-.01em}
+.icon{font-size:1.125rem}
+.links{display:flex;gap:.5rem}
+.dl{color:var(--fg-muted);text-decoration:none;font-size:.75rem;font-weight:500;display:inline-flex;align-items:center;gap:.3rem;padding:.25rem .6rem;border-radius:9999px;border:1px solid var(--border);background:oklch(100% 0 0/.03);transition:all var(--dur-base) var(--ease-out)}
+.dl:hover{color:var(--primary-up);border-color:var(--primary);background:oklch(62% 0.21 265/.08)}
+.badge{font-size:.6875rem;font-weight:600;padding:.2rem .625rem;border-radius:9999px;text-transform:uppercase;letter-spacing:.05em}
+.card-header{padding:.75rem 1rem;display:flex;align-items:center;justify-content:space-between;border-bottom:1px solid var(--border-faint)}
+.comparison{display:grid;grid-template-columns:1fr 1fr;gap:0}
+.comp-panel{border-right:1px solid var(--border-faint)}
+.comp-panel:last-child{border-right:none}
+.comp-label{padding:.4rem .75rem;font-size:.7rem;font-weight:600;text-transform:uppercase;letter-spacing:.05em;color:var(--fg-muted);background:var(--surface);display:flex;align-items:center;gap:.4rem}
+.comp-tag{font-size:.6rem;padding:.1rem .4rem;border-radius:9999px;font-weight:600}
+.comp-panel:first-child .comp-tag{background:oklch(65% 0.01 265/.15);color:var(--fg-muted);border:1px solid var(--border)}
+.comp-panel:last-child .comp-tag{background:oklch(62% 0.18 155/.15);color:var(--ok);border:1px solid oklch(62% 0.18 155/.25)}
+.comp-dl{padding:.4rem .75rem;display:flex;justify-content:center}
+.report{border-top:1px solid var(--border-faint);padding:.75rem 1rem;font-size:.8125rem}
+.report summary{cursor:pointer;color:var(--fg-muted);font-weight:500;display:flex;align-items:center;gap:.4rem;user-select:none;transition:color var(--dur-base) var(--ease-out)}
+.report summary:hover{color:var(--fg)}
+.report summary svg{flex-shrink:0;opacity:.5}
+.report[open] summary{margin-bottom:.75rem;padding-bottom:.5rem;border-bottom:1px solid var(--border-faint)}
+.report-body{line-height:1.7;color:oklch(80% 0.01 265);overflow-x:auto}
+.report-body h1,.report-body h2{margin:1.25rem 0 .5rem;color:var(--fg);font-size:1rem;font-weight:600;letter-spacing:-.02em;border-bottom:1px solid var(--border-faint);padding-bottom:.4rem}
+.report-body h3{margin:.75rem 0 .4rem;color:var(--fg);font-size:.875rem;font-weight:600}
+.report-body p{margin:.4rem 0}
+.report-body ul,.report-body ol{margin:.4rem 0 .4rem 1.5rem}
+.report-body li{margin:.25rem 0}
+.report-body code{background:var(--surface-up);padding:.125rem .375rem;border-radius:.25rem;font-size:.7rem;font-family:var(--font-mono);border:1px solid var(--border-faint)}
+.report-body h3+p>code:first-child{background:oklch(62% 0.22 25/.15);color:var(--err);border-color:oklch(62% 0.22 25/.25)}
+.report-body h3+p>code:nth-child(2){background:oklch(62% 0.21 265/.15);color:var(--primary-up);border-color:oklch(62% 0.21 265/.25)}
+.report-body h3+p>code:nth-child(3){background:oklch(65% 0.01 265/.15);color:var(--fg-muted);border-color:var(--border)}
+.report-body table{width:100%;border-collapse:collapse;margin:.75rem 0;font-size:.75rem;border:1px solid var(--border);border-radius:var(--r);overflow:hidden}
+.report-body th,.report-body td{border:1px solid var(--border-faint);padding:.5rem .75rem;text-align:left;vertical-align:top;word-wrap:break-word}
+.report-body th{background:var(--surface-up);color:var(--fg);font-weight:600;font-size:.6875rem;text-transform:uppercase;letter-spacing:.05em;position:sticky;top:0;white-space:nowrap}
+.report-body tr:nth-child(even){background:color-mix(in oklch,var(--surface) 50%,transparent)}
+.report-body tr:hover{background:color-mix(in oklch,var(--surface-up) 50%,transparent)}
+.report-body strong{color:var(--fg)}
+.report-body hr{border:none;border-top:1px solid var(--border-faint);margin:1rem 0}
+@keyframes fade-up{from{opacity:0;transform:translateY(16px)}to{opacity:1;transform:translateY(0)}}
+.reveal{animation:fade-up var(--dur-slow) var(--ease-out) both;animation-delay:calc(var(--i,0) * 120ms)}
+@media(prefers-reduced-motion:reduce){.reveal{animation:none}}
+@media(max-width:480px){.grid{grid-template-columns:1fr}.card-body{flex-wrap:wrap;gap:.5rem}}
+.sha{color:var(--primary);text-decoration:none;font-family:var(--font-mono);font-size:.75rem;font-weight:500;padding:.1rem .4rem;border-radius:.25rem;background:oklch(62% 0.21 265/.08);border:1px solid oklch(62% 0.21 265/.15);transition:all var(--dur-base) var(--ease-out)}
+.sha:hover{background:oklch(62% 0.21 265/.15);border-color:var(--primary)}
+.badge-bar{display:flex;align-items:center;gap:.5rem;margin-bottom:1rem}
+.badge-img{height:20px;display:block}
+.copy-badge{background:oklch(100% 0 0/.06);border:1px solid var(--border);color:var(--fg-muted);padding:.3rem .4rem;border-radius:var(--r);cursor:pointer;display:inline-flex;align-items:center;transition:all var(--dur-base) var(--ease-out)}
+.copy-badge:hover{color:var(--primary-up);border-color:var(--primary);background:oklch(62% 0.21 265/.1)}
+.copy-badge.copied{color:var(--ok);border-color:var(--ok)}
+.vseek{width:100%;padding:0 .75rem;background:var(--surface);border-top:1px solid var(--border-faint);position:relative;height:24px;display:flex;align-items:center}
+.vseek input[type=range]{-webkit-appearance:none;appearance:none;width:100%;height:4px;background:var(--border);border-radius:2px;outline:none;cursor:pointer;position:relative;z-index:2}
+.vseek input[type=range]::-webkit-slider-thumb{-webkit-appearance:none;width:12px;height:12px;border-radius:50%;background:var(--primary);cursor:pointer;border:2px solid var(--bg);box-shadow:0 0 4px oklch(0% 0 0/.3)}
+.vseek input[type=range]::-moz-range-thumb{width:12px;height:12px;border-radius:50%;background:var(--primary);cursor:pointer;border:2px solid var(--bg)}
+.vseek .vbuf{position:absolute;left:.75rem;right:.75rem;height:4px;border-radius:2px;pointer-events:none;top:50%;transform:translateY(-50%)}
+.vseek .vbuf-bar{height:100%;background:oklch(62% 0.21 265/.25);border-radius:2px;transition:width 200ms linear}
+.vctrl{display:flex;align-items:center;gap:.375rem;padding:.5rem .75rem;background:var(--surface);border-top:1px solid var(--border-faint);flex-wrap:wrap}
+.vctrl button{background:oklch(100% 0 0/.06);border:1px solid var(--border);color:var(--fg-muted);font-size:.6875rem;font-weight:600;font-family:var(--font-mono);padding:.25rem .5rem;border-radius:.25rem;cursor:pointer;transition:all var(--dur-base) var(--ease-out);white-space:nowrap}
+.vctrl button:hover{color:var(--primary-up);border-color:var(--primary);background:oklch(62% 0.21 265/.1)}
+.vctrl button.active{color:var(--primary);border-color:var(--primary);background:oklch(62% 0.21 265/.15)}
+.vctrl .vtime{font-family:var(--font-mono);font-size:.6875rem;color:var(--fg-dim);min-width:10ch;text-align:center}
+.vctrl .vsep{width:1px;height:1rem;background:var(--border);flex-shrink:0}
+.vctrl .vhint{font-size:.6rem;color:var(--fg-dim);margin-left:auto}
+.purpose{background:linear-gradient(135deg,oklch(100% 0 0/.04),oklch(100% 0 0/.02));border:1px solid oklch(100% 0 0/.08);border-radius:var(--r-lg);padding:1rem 1.25rem;margin-bottom:1.5rem;font-size:.85rem;line-height:1.7;color:oklch(80% 0.01 265)}
+.purpose strong{color:var(--fg);font-weight:600}
+.purpose .purpose-label{font-size:.7rem;font-weight:600;text-transform:uppercase;letter-spacing:.05em;color:var(--fg-muted);margin-bottom:.4rem}
+.purpose .purpose-reqs{margin-top:.75rem;padding-top:.75rem;border-top:1px solid oklch(100% 0 0/.06);font-size:.8rem;color:oklch(70% 0.01 265);line-height:1.8}
+</style></head><body><div class=container>
+<header><div class=header-icon><svg width=20 height=20 viewBox="0 0 24 24" fill=none stroke=currentColor stroke-width=2 stroke-linecap=round stroke-linejoin=round><polygon points="23 7 16 12 23 17 23 7"/><rect x=1 y=5 width=15 height=14 rx=2 ry=2/></svg></div><div><h1>QA Session Recordings</h1><div class=meta>ComfyUI Frontend &middot; Automated QA{{COMMIT_HTML}}{{RUN_LINK}}{{TIMING_HTML}}</div>{{BADGE_HTML}}</div></header>
+{{PURPOSE_HTML}}<div class=grid>{{CARDS}}</div>
+</div><script>
+function copyBadge(){const u=location.href.replace(/\/[^/]*$/,'/');const b=u+'badge.svg';const md='[![QA Badge]('+b+')]('+u+')';navigator.clipboard.writeText(md).then(()=>{const btn=document.querySelector('.copy-badge');btn.classList.add('copied');btn.innerHTML='<svg width=14 height=14 viewBox="0 0 24 24" fill=none stroke=currentColor stroke-width=2><polyline points="20 6 9 17 4 12"/></svg>';setTimeout(()=>{btn.classList.remove('copied');btn.innerHTML='<svg width=14 height=14 viewBox="0 0 24 24" fill=none stroke=currentColor stroke-width=2><rect x=9 y=9 width=13 height=13 rx=2/><path d="M5 15H4a2 2 0 0 1-2-2V4a2 2 0 0 1 2-2h9a2 2 0 0 1 2 2v1"/></svg>'},2000)})}
+document.querySelectorAll('[data-md]').forEach(el=>{const t=el.textContent;el.removeAttribute('data-md');el.innerHTML=marked.parse(t)});
+const FPS=30,FT=1/FPS,SPEEDS=[0.1,0.25,0.5,1,1.5,2];
+document.querySelectorAll('.video-wrap video').forEach(v=>{
+  v.playbackRate=0.5;v.removeAttribute('autoplay');v.pause();
+  const c=document.createElement('div');c.className='vctrl';
+  const btn=(label,fn)=>{const b=document.createElement('button');b.textContent=label;b.onclick=fn;c.appendChild(b);return b};
+  const sep=()=>{const s=document.createElement('div');s.className='vsep';c.appendChild(s)};
+  const time=document.createElement('span');time.className='vtime';time.textContent='0:00.000';
+  btn('\u23EE',()=>{v.currentTime=0});
+  btn('\u25C0\u25C0',()=>{v.currentTime=Math.max(0,v.currentTime-FT*10)});
+  btn('\u25C0',()=>{v.pause();v.currentTime=Math.max(0,v.currentTime-FT)});
+  const playBtn=btn('\u25B6',()=>{v.paused?v.play():v.pause()});
+  btn('\u25B6\u25B6',()=>{v.pause();v.currentTime+=FT});
+  btn('\u25B6\u25B6\u25B6',()=>{v.currentTime+=FT*10});
+  sep();
+  const spdBtns=SPEEDS.map(s=>{const b=btn(s+'x',()=>{v.playbackRate=s;spdBtns.forEach(x=>x.classList.remove('active'));b.classList.add('active')});if(s===0.5)b.classList.add('active');return b});
+  sep();c.appendChild(time);
+  const hint=document.createElement('span');hint.className='vhint';hint.textContent='\u2190\u2192 frame \u2022 space play';c.appendChild(hint);
+  // Custom seekbar — works even without server range request support
+  const seekWrap=document.createElement('div');seekWrap.className='vseek';
+  const seekBar=document.createElement('input');seekBar.type='range';seekBar.min=0;seekBar.max=1000;seekBar.value=0;seekBar.step=1;
+  const bufWrap=document.createElement('div');bufWrap.className='vbuf';
+  const bufBar=document.createElement('div');bufBar.className='vbuf-bar';bufBar.style.width='0%';
+  bufWrap.appendChild(bufBar);seekWrap.appendChild(bufWrap);seekWrap.appendChild(seekBar);
+  let seeking=false;
+  seekBar.oninput=()=>{seeking=true;if(v.duration){v.currentTime=v.duration*(seekBar.value/1000)}};
+  seekBar.onchange=()=>{seeking=false};
+  v.closest('.video-wrap').after(seekWrap);
+  seekWrap.after(c);
+  v.ontimeupdate=()=>{
+    const m=Math.floor(v.currentTime/60),s=Math.floor(v.currentTime%60),ms=Math.floor((v.currentTime%1)*1000);
+    time.textContent=m+':'+(s<10?'0':'')+s+'.'+String(ms).padStart(3,'0');
+    if(!seeking&&v.duration){seekBar.value=Math.round((v.currentTime/v.duration)*1000)}
+  };
+  v.onprogress=v.onloadeddata=()=>{if(v.buffered.length&&v.duration){bufBar.style.width=(v.buffered.end(v.buffered.length-1)/v.duration*100)+'%'}};
+  v.onplay=()=>{playBtn.textContent='\u23F8'};v.onpause=()=>{playBtn.textContent='\u25B6'};
+  v.parentElement.addEventListener('keydown',e=>{
+    if(e.key==='ArrowLeft'){e.preventDefault();v.pause();v.currentTime=Math.max(0,v.currentTime-FT)}
+    if(e.key==='ArrowRight'){e.preventDefault();v.pause();v.currentTime+=FT}
+    if(e.key===' '){e.preventDefault();v.paused?v.play():v.pause()}
+  });
+  v.parentElement.setAttribute('tabindex','0');
+});
+</script></body></html>
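Review note: the player script above formats `currentTime` as `m:ss.mmm`, steps in frames of 1/30 s, and maps the 0 to 1000 range of the seek bar onto the video duration. A minimal sketch of those three calculations as pure functions follows; the function names are hypothetical and do not appear in the template, and the 30 fps default simply repeats the `FPS` constant from the script.

```typescript
// Hypothetical pure-function versions of the arithmetic used by the player controls.

// Format seconds as m:ss.mmm, as the time readout does.
function formatTimestamp(seconds: number): string {
  const m = Math.floor(seconds / 60)
  const s = Math.floor(seconds % 60)
  const ms = Math.floor((seconds % 1) * 1000)
  return `${m}:${String(s).padStart(2, '0')}.${String(ms).padStart(3, '0')}`
}

// Move by a number of frames (negative steps backwards), never below zero.
function stepFrames(currentTime: number, frames: number, fps = 30): number {
  return Math.max(0, currentTime + frames / fps)
}

// Map a seek bar value in [0, 1000] to a position in the video.
function seekValueToTime(value: number, duration: number): number {
  return duration * (value / 1000)
}

console.log(formatTimestamp(65.25))
console.log(stepFrames(1, 30))
console.log(seekValueToTime(500, 10))
```

Keeping the arithmetic separate from the DOM wiring would make it straightforward to unit test, which the inline script currently is not.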
--- a/scripts/qa-reproduce.ts
+++ b/scripts/qa-reproduce.ts
@@ -0,0 +1,253 @@
+#!/usr/bin/env tsx
+/**
+ * QA Reproduce Phase — Deterministic replay of research plan with narration
+ *
+ * Takes a reproduction plan from the research phase and replays it:
+ * 1. Execute each action deterministically (no AI decisions)
+ * 2. Capture a11y snapshot before/after each action
+ * 3. Gemini describes what visually changed (narration for humans)
+ * 4. Output: narration-log.json with full evidence chain
+ */
+
+import type { Page } from '@playwright/test'
+import { GoogleGenerativeAI } from '@google/generative-ai'
+import { mkdirSync, writeFileSync } from 'fs'
+
+import type { ActionResult } from './qa-record.js'
+
+// ── Types ──
+
+interface ReproductionStep {
+  action: Record<string, unknown> & { action: string }
+  expectedAssertion: string
+}
+
+interface NarrationEntry {
+  step: number
+  action: string
+  params: Record<string, unknown>
+  result: ActionResult
+  a11yBefore: unknown
+  a11yAfter: unknown
+  assertionExpected: string
+  assertionPassed: boolean
+  assertionActual: string
+  geminiNarration: string
+  timestampMs: number
+}
+
+export interface NarrationLog {
+  entries: NarrationEntry[]
+  allAssertionsPassed: boolean
+}
+
+interface ReproduceOptions {
+  page: Page
+  plan: ReproductionStep[]
+  geminiApiKey: string
+  outputDir: string
+}
+
+// ── A11y helpers ──
+
+interface A11yNode {
+  role: string
+  name: string
+  value?: string
+  checked?: boolean
+  disabled?: boolean
+  expanded?: boolean
+  children?: A11yNode[]
+}
+
+function searchA11y(node: A11yNode | null, selector: string): A11yNode | null {
+  if (!node) return null
+  const sel = selector.toLowerCase()
+  if (
+    node.name?.toLowerCase().includes(sel) ||
+    node.role?.toLowerCase().includes(sel)
+  ) {
+    return node
+  }
+  if (node.children) {
+    for (const child of node.children) {
+      const found = searchA11y(child, selector)
+      if (found) return found
+    }
+  }
+  return null
+}
+
+function summarizeA11y(node: A11yNode | null): string {
+  if (!node) return 'null'
+  const parts = [`role=${node.role}`, `name="${node.name}"`]
+  if (node.value !== undefined) parts.push(`value="${node.value}"`)
+  if (node.checked !== undefined) parts.push(`checked=${node.checked}`)
+  if (node.disabled) parts.push('disabled')
+  if (node.expanded !== undefined) parts.push(`expanded=${node.expanded}`)
+  return `{${parts.join(', ')}}`
+}
+
+// ── Subtitle overlay ──
+
+async function showSubtitle(page: Page, text: string, step: number) {
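+  // Encode first, then escape apostrophes: encodeURIComponent leaves ' as-is,
+  // and a raw ' would terminate the string literal in the injected script
+  const encoded = encodeURIComponent(
+    text.slice(0, 120).replace(/\n/g, ' ')
+  ).replace(/'/g, '%27')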
+  await page.addScriptTag({
+    content: `(function(){
+      var id='qa-subtitle';
+      var el=document.getElementById(id);
+      if(!el){
+        el=document.createElement('div');
+        el.id=id;
+        Object.assign(el.style,{position:'fixed',bottom:'32px',left:'50%',transform:'translateX(-50%)',zIndex:'2147483646',maxWidth:'90%',padding:'6px 14px',borderRadius:'6px',background:'rgba(0,0,0,0.8)',color:'rgba(255,255,255,0.95)',fontSize:'12px',fontFamily:'system-ui,sans-serif',fontWeight:'400',lineHeight:'1.4',pointerEvents:'none',textAlign:'center',whiteSpace:'normal'});
+        document.body.appendChild(el);
+      }
+      el.textContent='['+${step}+'] '+decodeURIComponent('${encoded}');
+    })()`
+  })
+}
+
+// ── Gemini visual narration ──
+
+async function geminiDescribe(
+  page: Page,
+  geminiApiKey: string,
+  focus: string
+): Promise<string> {
+  try {
+    const screenshot = await page.screenshot({ type: 'jpeg', quality: 70 })
+    const genAI = new GoogleGenerativeAI(geminiApiKey)
+    const model = genAI.getGenerativeModel({ model: 'gemini-3-flash-preview' })
+
+    const result = await model.generateContent([
+      {
+        text: `Describe in 1-2 sentences what you see on this ComfyUI screen. Focus on: ${focus}. Be factual — only describe what is visible.`
+      },
+      {
+        inlineData: {
+          mimeType: 'image/jpeg',
+          data: screenshot.toString('base64')
+        }
+      }
+    ])
+    return result.response.text().trim()
+  } catch (e) {
+    return `(Gemini narration failed: ${e instanceof Error ? e.message.slice(0, 50) : e})`
+  }
+}
+
+// ── Main reproduce function ──
+
+export async function runReproducePhase(
+  opts: ReproduceOptions
+): Promise<NarrationLog> {
+  const { page, plan, geminiApiKey, outputDir } = opts
+  const { executeAction } = await import('./qa-record.js')
+
+  const narrationDir = `${outputDir}/narration`
+  mkdirSync(narrationDir, { recursive: true })
+
+  const entries: NarrationEntry[] = []
+  const startMs = Date.now()
+
+  console.warn(`Reproduce phase: replaying ${plan.length} steps...`)
+
+  for (let i = 0; i < plan.length; i++) {
+    const step = plan[i]
+    const actionObj = step.action
+    const elapsed = Date.now() - startMs
+
+    // Show subtitle
+    await showSubtitle(page, `Step ${i + 1}: ${actionObj.action}`, i + 1)
+    console.warn(`  [${i + 1}/${plan.length}] ${actionObj.action}`)
+
+    // Capture a11y BEFORE
+    const a11yBefore = await page
+      .locator('body')
+      .ariaSnapshot({ timeout: 3000 })
+      .catch(() => null)
+
+    // Execute action
+    const result = await executeAction(
+      page,
+      actionObj as Parameters<typeof executeAction>[1],
+      outputDir
+    )
+    await new Promise((r) => setTimeout(r, 500))
+
+    // Capture a11y AFTER
+    const a11yAfter = await page
+      .locator('body')
+      .ariaSnapshot({ timeout: 3000 })
+      .catch(() => null)
+
+    // Check assertion
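+    // A step without an expected assertion passes vacuously, so it cannot
+    // force allAssertionsPassed to false
+    let assertionPassed = !step.expectedAssertion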
+    let assertionActual = ''
+    if (step.expectedAssertion) {
+      // Parse the expected assertion — e.g. "Settings dialog: visible" or "tab count: 2"
+      const parts = step.expectedAssertion.split(':').map((s) => s.trim())
+      const selectorName = parts[0]
+      const expectedState = parts.slice(1).join(':').trim()
+
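+      // locator.ariaSnapshot() resolves to a YAML string rather than a node
+      // tree, so match the selector against snapshot lines in that case
+      const found =
+        typeof a11yAfter === 'string'
+          ? (a11yAfter
+              .split('\n')
+              .find((line) =>
+                line.toLowerCase().includes(selectorName.toLowerCase())
+              ) ?? null)
+          : searchA11y(a11yAfter as A11yNode | null, selectorName)
+      assertionActual =
+        found === null
+          ? 'NOT FOUND'
+          : typeof found === 'string'
+            ? found.trim()
+            : summarizeA11y(found)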
+
+      if (expectedState === 'visible' || expectedState === 'exists') {
+        assertionPassed = found !== null
+      } else if (expectedState === 'hidden' || expectedState === 'gone') {
+        assertionPassed = found === null
+      } else {
+        // Generic: check if the actual state contains the expected text
+        assertionPassed = assertionActual
+          .toLowerCase()
+          .includes(expectedState.toLowerCase())
+      }
+
+      console.warn(
+        `    Assertion: "${step.expectedAssertion}" → ${assertionPassed ? '✓ PASS' : '✗ FAIL'} (actual: ${assertionActual})`
+      )
+    }
+
+    // Gemini narration (visual description for humans)
+    const geminiNarration = await geminiDescribe(
+      page,
+      geminiApiKey,
+      `What changed after ${actionObj.action}?`
+    )
+
+    entries.push({
+      step: i + 1,
+      action: actionObj.action,
+      params: actionObj,
+      result,
+      a11yBefore,
+      a11yAfter,
+      assertionExpected: step.expectedAssertion,
+      assertionPassed,
+      assertionActual,
+      geminiNarration,
+      timestampMs: elapsed
+    })
+  }
+
+  // Final screenshot
+  await page.screenshot({ path: `${outputDir}/reproduce-final.png` })
+
+  const log: NarrationLog = {
+    entries,
+    allAssertionsPassed: entries.every((e) => e.assertionPassed)
+  }
+
+  writeFileSync(
+    `${narrationDir}/narration-log.json`,
+    JSON.stringify(log, null, 2)
+  )
+  console.warn(
+    `Reproduce phase complete: ${entries.filter((e) => e.assertionPassed).length}/${entries.length} assertions passed`
+  )
+
+  return log
+}
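Review note: the `"selector: state"` assertion format parsed in `runReproducePhase` can be sketched in isolation. Playwright's `locator.ariaSnapshot()` resolves to a YAML string, so this sketch matches against snapshot lines. `parseAssertion`, `checkAgainstSnapshot`, and the sample snapshot are hypothetical and only illustrate the format; unlike the script, the sketch treats a missing element as a failure for the generic case.

```typescript
// Hypothetical sketch of the "selector: state" assertion check.
interface ParsedAssertion {
  selector: string
  expected: string
}

// Split on ':' as the script does: the first part names the element, the rest is the state.
function parseAssertion(assertion: string): ParsedAssertion {
  const parts = assertion.split(':').map((s) => s.trim())
  return { selector: parts[0] ?? '', expected: parts.slice(1).join(':').trim() }
}

// Evaluate an assertion against the text of an ARIA snapshot.
function checkAgainstSnapshot(assertion: string, snapshot: string): boolean {
  const { selector, expected } = parseAssertion(assertion)
  const line = snapshot
    .split('\n')
    .find((l) => l.toLowerCase().includes(selector.toLowerCase()))
  if (expected === 'visible' || expected === 'exists') return line !== undefined
  if (expected === 'hidden' || expected === 'gone') return line === undefined
  // Generic case: the matching line must mention the expected text
  return line !== undefined && line.toLowerCase().includes(expected.toLowerCase())
}

// Made-up snapshot text in the YAML style that ariaSnapshot() produces
const sampleSnapshot = [
  '- dialog "Settings":',
  '  - tab "General" [selected]',
  '  - tab "Keybinding"'
].join('\n')

console.log(checkAgainstSnapshot('Settings: visible', sampleSnapshot))
console.log(checkAgainstSnapshot('Keybinding: selected', sampleSnapshot))
```

Substring matching is deliberately loose; a selector such as `"Settings dialog"` would not match a line that reads `dialog "Settings"`, so plan authors would need to phrase selectors by accessible name or by role, not both.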
--- a/scripts/qa-user-setup.test.ts
+++ b/scripts/qa-user-setup.test.ts
@@ -0,0 +1,82 @@
+import { describe, expect, it } from 'vitest'
+
+import {
+  buildFallbackUsername,
+  findUserIdByUsername,
+  isDuplicateUserErrorMessage
+} from '../browser_tests/fixtures/utils/userSetup'
+
+describe('findUserIdByUsername', () => {
+  it('finds a user in the standard id-to-name map', () => {
+    expect(
+      findUserIdByUsername({ users: { user_1: 'alice', user_2: 'bob' } }, 'bob')
+    ).toBe('user_2')
+  })
+
+  it('finds a user in tuple-style entries', () => {
+    expect(
+      findUserIdByUsername(
+        {
+          users: [
+            ['user_1', 'alice'],
+            ['user_2', 'bob']
+          ]
+        },
+        'alice'
+      )
+    ).toBe('user_1')
+  })
+
+  it('finds a user in object-style entries', () => {
+    expect(
+      findUserIdByUsername(
+        {
+          users: [
+            { userId: 'user_1', username: 'alice' },
+            { id: 'user_2', name: 'bob' }
+          ]
+        },
+        'bob'
+      )
+    ).toBe('user_2')
+  })
+
+  it('returns null for malformed payloads and unknown users', () => {
+    expect(findUserIdByUsername(null, 'alice')).toBeNull()
+    expect(
+      findUserIdByUsername({ users: 'not-a-collection' }, 'alice')
+    ).toBeNull()
+    expect(
+      findUserIdByUsername({ users: { user_1: 'alice' } }, 'bob')
+    ).toBeNull()
+  })
+})
+
+describe('isDuplicateUserErrorMessage', () => {
+  it('matches duplicate-user API errors', () => {
+    expect(
+      isDuplicateUserErrorMessage(
+        'Failed to create user: {"error":"Duplicate username."}'
+      )
+    ).toBe(true)
+    expect(
+      isDuplicateUserErrorMessage('User already exists in the server state')
+    ).toBe(true)
+  })
+
+  it('does not match unrelated failures', () => {
+    expect(
+      isDuplicateUserErrorMessage(
+        'Failed to create user: {"error":"Unauthorized"}'
+      )
+    ).toBe(false)
+  })
+})
+
+describe('buildFallbackUsername', () => {
+  it('adds a deterministic suffix', () => {
+    expect(buildFallbackUsername('playwright-test-0', 1234)).toBe(
+      'playwright-test-0-1234'
+    )
+  })
+})
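Review note: the implementation in `browser_tests/fixtures/utils/userSetup` is not part of this excerpt. The following is a hypothetical sketch of helpers that would satisfy the test cases above, written only from the inputs and outputs the tests state; the real functions may differ in structure, matching rules, and parameter names.

```typescript
// Hypothetical implementations inferred from the vitest cases; not the project's code.

function findUserIdByUsername(payload: unknown, username: string): string | null {
  if (typeof payload !== 'object' || payload === null) return null
  const users = (payload as { users?: unknown }).users

  if (Array.isArray(users)) {
    for (const entry of users) {
      if (Array.isArray(entry)) {
        // Tuple style: [id, name]
        if (entry[1] === username && typeof entry[0] === 'string') return entry[0]
      } else if (typeof entry === 'object' && entry !== null) {
        // Object style: { userId, username } or { id, name }
        const record = entry as Record<string, unknown>
        const name = record.username ?? record.name
        const id = record.userId ?? record.id
        if (name === username && typeof id === 'string') return id
      }
    }
    return null
  }

  if (typeof users === 'object' && users !== null) {
    // Map style: { id: name }
    for (const [id, name] of Object.entries(users)) {
      if (name === username) return id
    }
  }
  return null
}

// Assumed matching rule: either phrase seen in the test messages counts as a duplicate.
function isDuplicateUserErrorMessage(message: string): boolean {
  return /duplicate username|already exists/i.test(message)
}

function buildFallbackUsername(base: string, suffix: number): string {
  return `${base}-${suffix}`
}

console.log(findUserIdByUsername({ users: { user_1: 'alice', user_2: 'bob' } }, 'bob'))
```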
--- a/scripts/qa-video-review.test.ts
+++ b/scripts/qa-video-review.test.ts
@@ -0,0 +1,150 @@
+import { describe, expect, it } from 'vitest'
+
+import {
+  extractPlatformFromArtifactDirName,
+  pickLatestVideosByPlatform,
+  selectVideoCandidateByFile
+} from './qa-video-review'
+
+describe('extractPlatformFromArtifactDirName', () => {
+  it('extracts and normalizes known qa artifact directory names', () => {
+    expect(
+      extractPlatformFromArtifactDirName('qa-report-Windows-22818315023')
+    ).toBe('windows')
+    expect(
+      extractPlatformFromArtifactDirName('qa-report-macOS-22818315023')
+    ).toBe('macos')
+    expect(
+      extractPlatformFromArtifactDirName('qa-report-Linux-22818315023')
+    ).toBe('linux')
+  })
+
+  it('falls back to slugifying unknown directory names', () => {
+    expect(extractPlatformFromArtifactDirName('custom platform run')).toBe(
+      'custom-platform-run'
+    )
+  })
+})
+
+describe('pickLatestVideosByPlatform', () => {
+  it('keeps only the latest candidate per platform', () => {
+    const selected = pickLatestVideosByPlatform([
+      {
+        platformName: 'windows',
+        videoPath: '/tmp/windows-old.mp4',
+        mtimeMs: 100
+      },
+      {
+        platformName: 'windows',
+        videoPath: '/tmp/windows-new.mp4',
+        mtimeMs: 200
+      },
+      {
+        platformName: 'linux',
+        videoPath: '/tmp/linux.mp4',
+        mtimeMs: 150
+      }
+    ])
+
+    expect(selected).toEqual([
+      {
+        platformName: 'linux',
+        videoPath: '/tmp/linux.mp4',
+        mtimeMs: 150
+      },
+      {
+        platformName: 'windows',
+        videoPath: '/tmp/windows-new.mp4',
+        mtimeMs: 200
+      }
+    ])
+  })
+})
+
+describe('selectVideoCandidateByFile', () => {
+  it('selects a single candidate by artifacts-relative path', () => {
+    const selected = selectVideoCandidateByFile(
+      [
+        {
+          platformName: 'windows',
+          videoPath: '/tmp/qa-artifacts/qa-report-Windows-1/qa-session.mp4',
+          mtimeMs: 100
+        },
+        {
+          platformName: 'linux',
+          videoPath: '/tmp/qa-artifacts/qa-report-Linux-1/qa-session.mp4',
+          mtimeMs: 200
+        }
+      ],
+      {
+        artifactsDir: '/tmp/qa-artifacts',
+        videoFile: 'qa-report-Linux-1/qa-session.mp4'
+      }
+    )
+
+    expect(selected).toEqual({
+      platformName: 'linux',
+      videoPath: '/tmp/qa-artifacts/qa-report-Linux-1/qa-session.mp4',
+      mtimeMs: 200
+    })
+  })
+
+  it('throws when basename matches multiple videos', () => {
+    expect(() =>
+      selectVideoCandidateByFile(
+        [
+          {
+            platformName: 'windows',
+            videoPath: '/tmp/qa-artifacts/qa-report-Windows-1/qa-session.mp4',
+            mtimeMs: 100
+          },
+          {
+            platformName: 'linux',
+            videoPath: '/tmp/qa-artifacts/qa-report-Linux-1/qa-session.mp4',
+            mtimeMs: 200
+          }
+        ],
+        {
+          artifactsDir: '/tmp/qa-artifacts',
+          videoFile: 'qa-session.mp4'
+        }
+      )
+    ).toThrow('matched 2 videos')
+  })
+
+  it('throws when there is no matching video', () => {
+    expect(() =>
+      selectVideoCandidateByFile(
+        [
+          {
+            platformName: 'windows',
+            videoPath: '/tmp/qa-artifacts/qa-report-Windows-1/qa-session.mp4',
+            mtimeMs: 100
+          }
+        ],
+        {
+          artifactsDir: '/tmp/qa-artifacts',
+          videoFile: 'qa-report-macOS-1/qa-session.mp4'
+        }
+      )
+    ).toThrow('No video matched')
+  })
+
+  it('throws when video file is missing', () => {
+    expect(() =>
+      selectVideoCandidateByFile(
+        [
+          {
+            platformName: 'windows',
+            videoPath: '/tmp/qa-artifacts/qa-report-Windows-1/qa-session.mp4',
+            mtimeMs: 100
+          }
+        ],
+        {
+          artifactsDir: '/tmp/qa-artifacts',
+          videoFile: '   '
+        }
+      )
+    ).toThrow('--video-file is required')
+  })
+})
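
Reviewer note: the ambiguity case in the tests above (a bare basename matching several artifacts) can be reproduced with a stripped-down matcher. This is only a sketch, not the code under test: `matchVideos` is a hypothetical name, it models only the basename and artifacts-relative rules, and it assumes POSIX-style path strings.

```typescript
// Simplified stand-in for the --video-file matching rules exercised above.
// Assumption: paths are POSIX-style strings; cwd-relative and absolute-path
// matching are intentionally left out of this sketch.
function matchVideos(
  videoPaths: string[],
  artifactsDir: string,
  requested: string
): string[] {
  const requestedValue = requested.trim()
  // Normalise separators and drop a leading "./" before comparing.
  const requestedKey = requestedValue.replace(/\\/g, '/').replace(/^\.\/+/, '')
  const root = artifactsDir.replace(/\/+$/, '') + '/'

  return videoPaths.filter((videoPath) => {
    const baseName = videoPath.slice(videoPath.lastIndexOf('/') + 1)
    const relativeToRoot = videoPath.startsWith(root)
      ? videoPath.slice(root.length)
      : videoPath
    return baseName === requestedValue || relativeToRoot === requestedKey
  })
}

const videos = [
  '/tmp/qa-artifacts/qa-report-Windows-1/qa-session.mp4',
  '/tmp/qa-artifacts/qa-report-Linux-1/qa-session.mp4'
]

// A bare basename is ambiguous; an artifacts-relative path is not.
console.log(matchVideos(videos, '/tmp/qa-artifacts', 'qa-session.mp4').length)
console.log(
  matchVideos(videos, '/tmp/qa-artifacts', 'qa-report-Linux-1/qa-session.mp4')
)
```

Because every platform directory contains a file with the same basename, callers generally need to pass at least the `qa-report-<platform>-<n>/` segment to get a unique match.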
--- a/scripts/qa-video-review.ts
+++ b/scripts/qa-video-review.ts
@@ -0,0 +1,765 @@
+#!/usr/bin/env tsx
+import { mkdir, readFile, stat, writeFile } from 'node:fs/promises'
+import { basename, dirname, extname, relative, resolve } from 'node:path'
+import { fileURLToPath } from 'node:url'
+
+import { GoogleGenerativeAI } from '@google/generative-ai'
+import { globSync } from 'glob'
+
+interface CliOptions {
+  artifactsDir: string
+  videoFile: string
+  beforeVideo: string
+  outputDir: string
+  model: string
+  requestTimeoutMs: number
+  dryRun: boolean
+  prContext: string
+  targetUrl: string
+  passLabel: string
+}
+
+interface VideoCandidate {
+  platformName: string
+  videoPath: string
+  mtimeMs: number
+}
+
+const DEFAULT_OPTIONS: CliOptions = {
+  artifactsDir: './tmp/qa-artifacts',
+  videoFile: '',
+  beforeVideo: '',
+  outputDir: './tmp',
+  model: 'gemini-3-flash-preview',
+  requestTimeoutMs: 300_000,
+  dryRun: false,
+  prContext: '',
+  targetUrl: '',
+  passLabel: ''
+}
+
+const USAGE = `Usage:
+  pnpm exec tsx scripts/qa-video-review.ts [options]
+
+Options:
+  --artifacts-dir <path>        Artifacts root directory
+                                 (default: ./tmp/qa-artifacts)
+  --video-file <name-or-path>   Video file to analyze (required)
+                                 (supports basename or relative/absolute path)
+  --before-video <path>         Before video (main branch) for comparison
+                                 When provided, sends both videos to Gemini
+                                 for comparative before/after analysis
+  --output-dir <path>           Output directory for markdown reports
+                                 (default: ./tmp)
+  --model <name>                Gemini model
+                                 (default: gemini-3-flash-preview)
+  --request-timeout-ms <n>      Request timeout in milliseconds
+                                 (default: 300000)
+  --pr-context <file>           File with PR context (title, body, diff)
+                                 for PR-aware review
+  --target-url <url>            Issue or PR URL to include in the report
+  --pass-label <label>          Label for multi-pass reports (e.g. pass1)
+                                 Output becomes {platform}-{label}-qa-video-report.md
+  --dry-run                     Discover videos and output targets only
+  --help                        Show this help text
+
+Environment:
+  GEMINI_API_KEY                Required unless --dry-run
+`
+
+function parsePositiveInteger(rawValue: string, flagName: string): number {
+  const parsedValue = Number(rawValue) // unlike parseInt, rejects "12abc"
+  if (!Number.isInteger(parsedValue) || parsedValue <= 0) {
+    throw new Error(`Invalid value for ${flagName}: "${rawValue}"`)
+  }
+  return parsedValue
+}
+
+function parseCliOptions(args: string[]): CliOptions {
+  const options: CliOptions = { ...DEFAULT_OPTIONS }
+
+  for (let index = 0; index < args.length; index += 1) {
+    const argument = args[index]
+    const nextValue = args[index + 1]
+    const requireValue = (flagName: string): string => {
+      if (!nextValue || nextValue.startsWith('--')) {
+        throw new Error(`Missing value for ${flagName}`)
+      }
+      index += 1
+      return nextValue
+    }
+
+    if (argument === '--help') {
+      process.stdout.write(USAGE)
+      process.exit(0)
+    }
+
+    if (argument === '--artifacts-dir') {
+      options.artifactsDir = requireValue(argument)
+      continue
+    }
+
+    if (argument === '--video-file') {
+      options.videoFile = requireValue(argument)
+      continue
+    }
+
+    if (argument === '--output-dir') {
+      options.outputDir = requireValue(argument)
+      continue
+    }
+
+    if (argument === '--model') {
+      options.model = requireValue(argument)
+      continue
+    }
+
+    if (argument === '--request-timeout-ms') {
+      options.requestTimeoutMs = parsePositiveInteger(
+        requireValue(argument),
+        argument
+      )
+      continue
+    }
+
+    if (argument === '--before-video') {
+      options.beforeVideo = requireValue(argument)
+      continue
+    }
+
+    if (argument === '--pr-context') {
+      options.prContext = requireValue(argument)
+      continue
+    }
+
+    if (argument === '--target-url') {
+      options.targetUrl = requireValue(argument)
+      continue
+    }
+
+    if (argument === '--pass-label') {
+      options.passLabel = requireValue(argument)
+      continue
+    }
+
+    if (argument === '--dry-run') {
+      options.dryRun = true
+      continue
+    }
+
+    throw new Error(`Unknown argument: ${argument}`)
+  }
+
+  return options
+}
+
+function normalizePlatformName(value: string): string {
+  const slug = value
+    .trim()
+    .toLowerCase()
+    .replace(/[^a-z0-9]+/g, '-')
+    .replace(/^-+|-+$/g, '')
+
+  return slug.length > 0 ? slug : 'unknown-platform'
+}
+
+export function extractPlatformFromArtifactDirName(dirName: string): string {
+  const matchedValue = dirName.match(/^qa-report-(.+?)(?:-\d+)?$/i)?.[1]
+  return normalizePlatformName(matchedValue ?? dirName)
+}
+
+function extractPlatformFromVideoPath(videoPath: string): string {
+  const artifactDirName = basename(dirname(videoPath))
+  return extractPlatformFromArtifactDirName(artifactDirName)
+}
+
+export function pickLatestVideosByPlatform(
+  candidates: VideoCandidate[]
+): VideoCandidate[] {
+  const latestByPlatform = new Map<string, VideoCandidate>()
+
+  for (const candidate of candidates) {
+    const current = latestByPlatform.get(candidate.platformName)
+    if (!current || candidate.mtimeMs > current.mtimeMs) {
+      latestByPlatform.set(candidate.platformName, candidate)
+    }
+  }
+
+  return [...latestByPlatform.values()].sort((a, b) =>
+    a.platformName.localeCompare(b.platformName)
+  )
+}
+
+function toProjectRelativePath(targetPath: string): string {
+  const relativePath = relative(process.cwd(), targetPath)
+  if (relativePath.startsWith('.')) {
+    return relativePath
+  }
+  return `./${relativePath}`
+}
+
+function errorToString(error: unknown): string {
+  return error instanceof Error ? error.message : String(error)
+}
+
+function normalizePathForMatch(value: string): string {
+  return value.replaceAll('\\', '/').replace(/^\.\/+/, '')
+}
+
+export function selectVideoCandidateByFile(
+  candidates: VideoCandidate[],
+  options: { artifactsDir: string; videoFile: string }
+): VideoCandidate {
+  const requestedValue = options.videoFile.trim()
+  if (requestedValue.length === 0) {
+    throw new Error('--video-file is required')
+  }
+
+  const artifactsRoot = resolve(options.artifactsDir)
+  const requestedAbsolutePath = resolve(requestedValue)
+  const requestedPathKey = normalizePathForMatch(requestedValue)
+
+  const matches = candidates.filter((candidate) => {
+    const candidateAbsolutePath = resolve(candidate.videoPath)
+    if (candidateAbsolutePath === requestedAbsolutePath) {
+      return true
+    }
+
+    const candidateBaseName = basename(candidate.videoPath)
+    if (candidateBaseName === requestedValue) {
+      return true
+    }
+
+    const relativeToCwd = normalizePathForMatch(
+      relative(process.cwd(), candidateAbsolutePath)
+    )
+    if (relativeToCwd === requestedPathKey) {
+      return true
+    }
+
+    const relativeToArtifacts = normalizePathForMatch(
+      relative(artifactsRoot, candidateAbsolutePath)
+    )
+    return relativeToArtifacts === requestedPathKey
+  })
+
+  if (matches.length === 1) {
+    return matches[0]
+  }
+
+  if (matches.length === 0) {
+    const availableVideos = candidates.map((candidate) =>
+      toProjectRelativePath(candidate.videoPath)
+    )
+    throw new Error(
+      [
+        `No video matched --video-file "${options.videoFile}".`,
+        'Available videos:',
+        ...availableVideos.map((videoPath) => `- ${videoPath}`)
+      ].join('\n')
+    )
+  }
+
+  throw new Error(
+    [
+      `--video-file "${options.videoFile}" matched ${matches.length} videos.`,
+      'Please pass a more specific path.',
+      ...matches.map((match) => `- ${toProjectRelativePath(match.videoPath)}`)
+    ].join('\n')
+  )
+}
+
+async function collectVideoCandidates(
+  artifactsDir: string
+): Promise<VideoCandidate[]> {
+  const absoluteArtifactsDir = resolve(artifactsDir)
+  const videoPaths = globSync('**/qa-session{,-[0-9]}.mp4', {
+    cwd: absoluteArtifactsDir,
+    absolute: true,
+    nodir: true
+  }).sort()
+
+  const candidates = await Promise.all(
+    videoPaths.map(async (videoPath) => {
+      const videoStat = await stat(videoPath)
+      return {
+        platformName: extractPlatformFromVideoPath(videoPath),
+        videoPath,
+        mtimeMs: videoStat.mtimeMs
+      }
+    })
+  )
+
+  return candidates
+}
+
+function getMimeType(filePath: string): string {
+  const ext = extname(filePath).toLowerCase()
+  const mimeMap: Record<string, string> = {
+    '.mp4': 'video/mp4',
+    '.webm': 'video/webm',
+    '.mov': 'video/quicktime',
+    '.avi': 'video/x-msvideo',
+    '.mkv': 'video/x-matroska',
+    '.m4v': 'video/mp4'
+  }
+  return mimeMap[ext] || 'video/mp4'
+}
+
+function buildReviewPrompt(options: {
+  platformName: string
+  videoPath: string
+  prContext: string
+  isComparative: boolean
+}): string {
+  const { platformName, videoPath, prContext, isComparative } = options
+
+  if (isComparative) {
+    return buildComparativePrompt(platformName, videoPath, prContext)
+  }
+
+  return buildSingleVideoPrompt(platformName, videoPath, prContext)
+}
+
+function buildComparativePrompt(
+  platformName: string,
+  videoPath: string,
+  prContext: string
+): string {
+  const lines = [
+    'You are a senior QA engineer performing a BEFORE/AFTER comparison review.',
+    '',
+    'You are given TWO videos:',
+    '- **Video 1 (BEFORE)**: The main branch BEFORE the PR. This shows the OLD behavior.',
+    '- **Video 2 (AFTER)**: The PR branch AFTER the changes. This shows the NEW behavior.',
+    '',
+    'Both videos show the same test steps executed on different code versions.',
+    ''
+  ]
+
+  if (prContext) {
+    lines.push('## PR Context', prContext, '')
+  }
+
+  lines.push(
+    '## Your Task',
+    `Platform: "${platformName}". After video: ${toProjectRelativePath(videoPath)}.`,
+    '',
+    '1. **BEFORE video**: Does it demonstrate the old behavior or bug that the PR aims to fix?',
+    '   Describe what you observe — this establishes the baseline.',
+    '2. **AFTER video**: Does it prove the PR fix works? Is the intended new behavior visible?',
+    '3. **Comparison**: What specifically changed between before and after?',
+    '4. **Regressions**: Did the PR introduce any new problems visible in the AFTER video',
+    '   that were NOT present in the BEFORE video?',
+    '',
+    'Note: Brief black frames during page transitions are NORMAL.',
+    'Note: Small cyan/purple dashed labels prefixed with "QA:" are annotations placed by the automated test script — they are NOT part of the application UI. Do not treat them as bugs or evidence.',
+    'Report only concrete, visible differences. Avoid speculation.',
+    '',
+    'Return markdown with these sections exactly:',
+    '## Summary',
+    '(What the PR changes, whether BEFORE confirms the old behavior, whether AFTER proves the fix)',
+    '',
+    '## Behavior Changes',
+    'Summarize ALL behavioral differences as a markdown TABLE:',
+    '| Behavior | Before (main) | After (PR) | Verdict |',
+    '',
+    '- **Behavior**: short name for the behavior (e.g. "Save shortcut label", "Menu hover style")',
+    '- **Before (main)**: how it works/looks in the BEFORE video',
+    '- **After (PR)**: how it works/looks in the AFTER video',
+    '- **Verdict**: `Fixed`, `Improved`, `Changed`, `Regression`, or `No Change`',
+    '',
+    'One row per distinct behavior. Include both changed AND unchanged key behaviors',
+    'that were tested, so reviewers can confirm nothing was missed.',
+    '',
+    '## Timeline Comparison',
+    'Present a chronological comparison as a markdown TABLE:',
+    '| Time | Type | Severity | Before (main) | After (PR) |',
+    '',
+    '- **Time**: timestamp or range from the videos (e.g. `0:05-0:08`)',
+    '- **Type**: category such as `Visual`, `Behavior`, `Layout`, `Text`, `Animation`, `Menu`, `State`',
+    '- **Severity**: `None` (neutral change), `Fixed` (bug resolved), `Regression`, `Minor`, `Major`',
+    '- **Before (main)**: what is observed in the BEFORE video at that time',
+    '- **After (PR)**: what is observed in the AFTER video at that time',
+    '',
+    'Include one row per distinct observable difference. If behavior is identical at a timestamp,',
+    'omit that row. Focus on meaningful differences, not narrating every frame.',
+    '',
+    '## Confirmed Issues',
+    'For each issue, use this exact format:',
+    '',
+    '### [Short issue title]',
+    '`SEVERITY` `TIMESTAMP` `Confidence: LEVEL`',
+    '',
+    '[Description — specify whether it appears in BEFORE, AFTER, or both]',
+    '',
+    '**Evidence:** [What you observed at the given timestamp in which video]',
+    '',
+    '**Suggested Fix:** [Actionable recommendation]',
+    '',
+    '---',
+    '',
+    '## Possible Issues (Needs Human Verification)',
+    '## Overall Risk',
+    '(Assess whether the PR achieves its goal based on the before/after comparison)',
+    '',
+    '## Verdict',
+    'End your report with this EXACT JSON block (no markdown fence):',
+    '{"verdict": "REPRODUCED" | "NOT_REPRODUCIBLE" | "INCONCLUSIVE", "risk": "low" | "medium" | "high", "confidence": "high" | "medium" | "low"}',
+    '- REPRODUCED: the before video confirms the old behavior and the after video shows the fix working',
+    '- NOT_REPRODUCIBLE: the before video does not show the reported bug',
+    '- INCONCLUSIVE: the videos do not adequately demonstrate the behavior change'
+  )
+
+  return lines.join('\n')
+}
+
+function buildSingleVideoPrompt(
+  platformName: string,
+  videoPath: string,
+  prContext: string
+): string {
+  const lines = [
+    'You are a senior QA engineer reviewing a UI test session recording.',
+    '',
+    '## ANTI-HALLUCINATION RULES (READ FIRST)',
+    '- Describe ONLY what you can directly observe in the video frames',
+    '- NEVER infer or assume what "must have happened" between frames',
+    '- If a step is not visible in the video, say "NOT SHOWN" — do not guess',
+    '- Your job is to be a CAMERA — report facts, not interpretations',
+    ''
+  ]
+
+  const isIssueContext =
+    prContext &&
+    /^### Issue #|^Title:.*\bbug\b|^This video attempts to reproduce/im.test(
+      prContext
+    )
+
+  if (prContext) {
+    lines.push(
+      '## Phase 1: Blind Observation (describe what you SEE)',
+      'First, describe every UI interaction chronologically WITHOUT knowing the expected outcome:',
+      '- What elements does the user click/hover/type?',
+      '- What dialogs/menus open and close?',
+      '- What keyboard indicators appear? (look for subtitle overlays)',
+      '- What is the BEFORE state and AFTER state of each action?',
+      '',
+      '## Phase 2: Compare against expected behavior',
+      'Now compare your observations against the context below.',
+      'Only claim a match if your Phase 1 observations EXPLICITLY support it.',
+      ''
+    )
+
+    if (isIssueContext) {
+      lines.push(
+        '## Issue Context',
+        prContext,
+        '',
+        '## Comparison Questions',
+        '1. Did the video perform the reproduction steps described in the issue?',
+        '2. Did your Phase 1 observations show the reported bug behavior?',
+        '3. If the steps were not performed or the bug was not visible, say INCONCLUSIVE.',
+        ''
+      )
+    } else {
+      lines.push(
+        '## PR Context',
+        prContext,
+        '',
+        '## Comparison Questions',
+        '1. Did the video test the specific behavior the PR changes?',
+        '2. Did your Phase 1 observations show the expected before/after difference?',
+        '3. If the test was incomplete or inconclusive, say so honestly.',
+        ''
+      )
+    }
+  }
+
+  lines.push(
+    `Review this QA session video for platform "${platformName}".`,
+    `Source video: ${toProjectRelativePath(videoPath)}.`,
+    'The video shows the full test session — analyze it chronologically.',
+    'Focus on UI regressions, broken states, visual glitches, unreadable text, missing labels/i18n, and clear workflow failures.',
+    'Note: Brief black frames during page transitions are NORMAL and should NOT be reported as issues.',
+    'Note: Small cyan/purple dashed labels prefixed with "QA:" are annotations placed by the automated test script — they are NOT part of the application UI. Do not treat them as bugs or evidence.',
+    'Report only concrete, visible problems and avoid speculation.',
+    'If confidence is low, mark it explicitly.',
+    '',
+    'Return markdown with these sections exactly:',
+    '## Summary',
+    isIssueContext
+      ? '(Explain what bug was reported and whether the video confirms it is reproducible)'
+      : prContext
+        ? '(Explain what the PR intended and whether the video confirms it works)'
+        : '',
+    '## Confirmed Issues',
+    'For each confirmed issue, use this exact format (one block per issue):',
+    '',
+    '### [Short issue title]',
+    '`HIGH` `01:03` `Confidence: High`',
+    '',
+    '[Description of the issue — what went wrong and what was expected]',
+    '',
+    '**Evidence:** [What you observed in the video at the given timestamp]',
+    '',
+    '**Suggested Fix:** [Actionable recommendation]',
+    '',
+    '---',
+    '',
+    'The first line after the heading MUST be exactly three backtick-wrapped labels:',
+    '`SEVERITY` `TIMESTAMP` `Confidence: LEVEL`',
+    'Do NOT use a table for issues — use the block format above.',
+    '## Possible Issues (Needs Human Verification)',
+    '## Overall Risk',
+    '',
+    '## Verdict',
+    'End your report with this EXACT JSON block (no markdown fence):',
+    '{"verdict": "REPRODUCED" | "NOT_REPRODUCIBLE" | "INCONCLUSIVE", "risk": "low" | "medium" | "high" | null, "confidence": "high" | "medium" | "low"}',
+    '- REPRODUCED: the bug/behavior is clearly visible in the video',
+    '- NOT_REPRODUCIBLE: the steps were performed correctly but the bug was not observed',
+    '- INCONCLUSIVE: the reproduction steps were not performed or the video is insufficient'
+  )
+
+  return lines.join('\n')
+}
+
+const MAX_VIDEO_BYTES = 100 * 1024 * 1024
+
+async function readVideoFile(videoPath: string): Promise<Buffer> {
+  const fileStat = await stat(videoPath)
+  if (fileStat.size > MAX_VIDEO_BYTES) {
+    throw new Error(
+      `Video ${basename(videoPath)} is ${formatBytes(fileStat.size)}, exceeds ${formatBytes(MAX_VIDEO_BYTES)} limit`
+    )
+  }
+  return readFile(videoPath)
+}
+
+async function requestGeminiReview(options: {
+  apiKey: string
+  model: string
+  platformName: string
+  videoPath: string
+  beforeVideoPath: string
+  timeoutMs: number
+  prContext: string
+}): Promise<string> {
+  const genAI = new GoogleGenerativeAI(options.apiKey)
+  const model = genAI.getGenerativeModel({ model: options.model })
+
+  const isComparative = options.beforeVideoPath.length > 0
+  const prompt = buildReviewPrompt({
+    platformName: options.platformName,
+    videoPath: options.videoPath,
+    prContext: options.prContext,
+    isComparative
+  })
+
+  const parts: Array<
+    { text: string } | { inlineData: { mimeType: string; data: string } }
+  > = [{ text: prompt }]
+
+  if (isComparative) {
+    const beforeBuffer = await readVideoFile(options.beforeVideoPath)
+    parts.push(
+      { text: 'Video 1 — BEFORE (main branch):' },
+      {
+        inlineData: {
+          mimeType: getMimeType(options.beforeVideoPath),
+          data: beforeBuffer.toString('base64')
+        }
+      }
+    )
+  }
+
+  const afterBuffer = await readVideoFile(options.videoPath)
+  if (isComparative) {
+    parts.push({ text: 'Video 2 — AFTER (PR branch):' })
+  }
+  parts.push({
+    inlineData: {
+      mimeType: getMimeType(options.videoPath),
+      data: afterBuffer.toString('base64')
+    }
+  })
+
+  const result = await model.generateContent(parts, {
+    timeout: options.timeoutMs
+  })
+  const response = result.response
+  const text = response.text()
+
+  if (!text || text.trim().length === 0) {
+    throw new Error('Gemini API returned no output text')
+  }
+
+  return text.trim()
+}
+
+function formatBytes(bytes: number): string {
+  if (bytes < 1024) return `${bytes} B`
+  if (bytes < 1024 * 1024) return `${(bytes / 1024).toFixed(1)} KB`
+  return `${(bytes / (1024 * 1024)).toFixed(1)} MB`
+}
+
+function buildReportMarkdown(input: {
+  platformName: string
+  model: string
+  videoPath: string
+  videoSizeBytes: number
+  beforeVideoPath?: string
+  beforeVideoSizeBytes?: number
+  reviewText: string
+  targetUrl?: string
+}): string {
+  const headerLines = [
+    `# ${input.platformName} QA Video Report`,
+    '',
+    `- Generated at: ${new Date().toISOString()}`,
+    `- Model: \`${input.model}\``
+  ]
+
+  if (input.targetUrl) {
+    headerLines.push(`- Target: ${input.targetUrl}`)
+  }
+
+  if (input.beforeVideoPath) {
+    headerLines.push(
+      `- Before video: \`${toProjectRelativePath(input.beforeVideoPath)}\` (${formatBytes(input.beforeVideoSizeBytes ?? 0)})`,
+      `- After video: \`${toProjectRelativePath(input.videoPath)}\` (${formatBytes(input.videoSizeBytes)})`,
+      '- Mode: **Comparative (before/after)**'
+    )
+  } else {
+    headerLines.push(
+      `- Source video: \`${toProjectRelativePath(input.videoPath)}\``,
+      `- Video size: ${formatBytes(input.videoSizeBytes)}`
+    )
+  }
+
+  headerLines.push('', '## AI Review', '')
+  return `${headerLines.join('\n')}\n${input.reviewText.trim()}\n`
+}
+
+async function reviewVideo(
+  video: VideoCandidate,
+  options: CliOptions,
+  apiKey: string
+): Promise<void> {
+  let prContext = ''
+  if (options.prContext) {
+    try {
+      prContext = await readFile(options.prContext, 'utf-8')
+      process.stdout.write(
+        `[${video.platformName}] Loaded PR context from ${options.prContext}\n`
+      )
+    } catch {
+      process.stdout.write(
+        `[${video.platformName}] Warning: Could not read PR context file ${options.prContext}\n`
+      )
+    }
+  }
+
+  const beforeVideoPath = options.beforeVideo
+    ? resolve(options.beforeVideo)
+    : ''
+
+  if (beforeVideoPath) {
+    const beforeStat = await stat(beforeVideoPath)
+    process.stdout.write(
+      `[${video.platformName}] Before video: ${toProjectRelativePath(beforeVideoPath)} (${formatBytes(beforeStat.size)})\n`
+    )
+  }
+
+  process.stdout.write(
+    `[${video.platformName}] Sending ${beforeVideoPath ? '2 videos (comparative)' : 'video'} to ${options.model}\n`
+  )
+
+  const reviewText = await requestGeminiReview({
+    apiKey,
+    model: options.model,
+    platformName: video.platformName,
+    videoPath: video.videoPath,
+    beforeVideoPath,
+    timeoutMs: options.requestTimeoutMs,
+    prContext
+  })
+
+  const videoStat = await stat(video.videoPath)
+  const passSegment = options.passLabel ? `-${options.passLabel}` : ''
+  const outputPath = resolve(
+    options.outputDir,
+    `${video.platformName}${passSegment}-qa-video-report.md`
+  )
+
+  const reportInput: Parameters<typeof buildReportMarkdown>[0] = {
+    platformName: video.platformName,
+    model: options.model,
+    videoPath: video.videoPath,
+    videoSizeBytes: videoStat.size,
+    reviewText,
+    targetUrl: options.targetUrl || undefined
+  }
+
+  if (beforeVideoPath) {
+    const beforeStat = await stat(beforeVideoPath)
+    reportInput.beforeVideoPath = beforeVideoPath
+    reportInput.beforeVideoSizeBytes = beforeStat.size
+  }
+
+  const reportMarkdown = buildReportMarkdown(reportInput)
+
+  await mkdir(dirname(outputPath), { recursive: true })
+  await writeFile(outputPath, reportMarkdown, 'utf-8')
+
+  process.stdout.write(
+    `[${video.platformName}] Wrote ${toProjectRelativePath(outputPath)}\n`
+  )
+}
+
+function isExecutedAsScript(metaUrl: string): boolean {
+  const modulePath = fileURLToPath(metaUrl)
+  const scriptPath = process.argv[1] ? resolve(process.argv[1]) : ''
+  return modulePath === scriptPath
+}
+
+async function main(): Promise<void> {
+  const options = parseCliOptions(process.argv.slice(2))
+  const candidates = await collectVideoCandidates(options.artifactsDir)
+
+  if (candidates.length === 0) {
+    process.stdout.write(
+      `No qa-session*.mp4 files found under ${toProjectRelativePath(resolve(options.artifactsDir))}\n`
+    )
+    return
+  }
+
+  const selectedVideo = selectVideoCandidateByFile(candidates, {
+    artifactsDir: options.artifactsDir,
+    videoFile: options.videoFile
+  })
+
+  process.stdout.write(
+    `Selected ${selectedVideo.platformName}: ${toProjectRelativePath(selectedVideo.videoPath)}\n`
+  )
+
+  if (options.dryRun) {
+    process.stdout.write('\nDry run mode enabled, no API calls were made.\n')
+    return
+  }
+
+  const apiKey = process.env.GEMINI_API_KEY
+  if (!apiKey) {
+    throw new Error('GEMINI_API_KEY is required unless --dry-run is set')
+  }
+
+  await reviewVideo(selectedVideo, options, apiKey)
+}
+
+if (isExecutedAsScript(import.meta.url)) {
+  void main().catch((error: unknown) => {
+    const message = errorToString(error)
+    process.stderr.write(`qa-video-review failed: ${message}\n`)
+    process.exit(1)
+  })
+}
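
Reviewer note: the platform-name handling in `qa-video-review.ts` is easier to check in isolation. The sketch below condenses `extractPlatformFromArtifactDirName` and `pickLatestVideosByPlatform` into two standalone functions. The names `platformFromDirName`, `latestByPlatform`, and `Candidate`, and the sample directory names, are illustrative assumptions rather than part of the patch.

```typescript
// Condensed, illustrative copies of two helpers from qa-video-review.ts.
interface Candidate {
  platformName: string
  videoPath: string
  mtimeMs: number
}

function platformFromDirName(dirName: string): string {
  // Strip the "qa-report-" prefix and an optional trailing "-<run number>".
  const matched = dirName.match(/^qa-report-(.+?)(?:-\d+)?$/i)?.[1]
  const slug = (matched ?? dirName)
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '')
  return slug.length > 0 ? slug : 'unknown-platform'
}

function latestByPlatform(candidates: Candidate[]): Candidate[] {
  // Keep only the most recently modified video for each platform.
  const latest = new Map<string, Candidate>()
  for (const candidate of candidates) {
    const current = latest.get(candidate.platformName)
    if (!current || candidate.mtimeMs > current.mtimeMs) {
      latest.set(candidate.platformName, candidate)
    }
  }
  return [...latest.values()].sort((a, b) =>
    a.platformName.localeCompare(b.platformName)
  )
}

// Hypothetical artifact directories: two runs for Linux, one for Windows.
const sample: Candidate[] = [
  'qa-report-Linux-1',
  'qa-report-Linux-2',
  'qa-report-Windows-1'
].map((dir, index) => ({
  platformName: platformFromDirName(dir),
  videoPath: `/tmp/qa-artifacts/${dir}/qa-session.mp4`,
  mtimeMs: (index + 1) * 100
}))

const picked = latestByPlatform(sample)
console.log(picked.map((candidate) => candidate.videoPath))
```

Directory names that do not follow the `qa-report-<platform>-<n>` pattern fall back to a slug of the whole name, so an unexpected artifact layout produces an odd platform label rather than an error.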
--- a/src/components/common/LazyImage.vue
+++ b/src/components/common/LazyImage.vue
@@ -42,7 +42,6 @@ import type { StyleValue } from 'vue'

 import { useIntersectionObserver } from '@/composables/useIntersectionObserver'
 import { useMediaCache } from '@/services/mediaCacheService'
-import type { ClassValue } from '@/utils/tailwindUtil'

 const {
  src,
@@ -54,8 +53,8 @@ const {
 } = defineProps<{
  src: string
  alt?: string
-  containerClass?: ClassValue
-  imageClass?: ClassValue
+  containerClass?: string
+  imageClass?: string
  imageStyle?: StyleValue
  rootMargin?: string
 }>()
--- a/src/components/templates/thumbnails/DefaultThumbnail.test.ts
+++ b/src/components/templates/thumbnails/DefaultThumbnail.test.ts
@@ -63,8 +63,7 @@ describe('DefaultThumbnail', () => {
      isVideo: true
    })
    expect(screen.getByRole('img')).toHaveClass(
-      'w-full',
-      'h-full',
+      'size-full',
      'object-cover'
    )
  })
--- a/src/components/templates/thumbnails/DefaultThumbnail.vue
+++ b/src/components/templates/thumbnails/DefaultThumbnail.vue
@@ -3,12 +3,14 @@
    <LazyImage
      :src="src"
      :alt="alt"
-      :image-class="[
-        'transform-gpu transition-transform duration-300 ease-out',
-        isVideoType
-          ? 'w-full h-full object-cover'
-          : 'max-w-full max-h-64 object-contain'
-      ]"
+      :image-class="
+        cn(
+          'transform-gpu transition-transform duration-300 ease-out',
+          isVideoType
+            ? 'size-full object-cover'
+            : 'max-h-64 max-w-full object-contain'
+        )
+      "
      :image-style="
        isHovered ? { transform: `scale(${1 + hoverZoom / 100})` } : undefined
      "
@@ -19,6 +21,7 @@
 <script setup lang="ts">
 import LazyImage from '@/components/common/LazyImage.vue'
 import BaseThumbnail from '@/components/templates/thumbnails/BaseThumbnail.vue'
+import { cn } from '@/utils/tailwindUtil'

 const { src, isVideo } = defineProps<{
  src: string
--- a/src/renderer/glsl/useGLSLUniforms.ts
+++ b/src/renderer/glsl/useGLSLUniforms.ts
@@ -17,12 +17,14 @@ interface AutogrowGroup {
  prefix?: string
 }

-interface UniformSource {
+/** @knipIgnoreUsedByStackedPR */
+export interface UniformSource {
  nodeId: NodeId
  widgetName: string
 }

-interface UniformSources {
+/** @knipIgnoreUsedByStackedPR */
+export interface UniformSources {
  floats: UniformSource[]
  ints: UniformSource[]
  bools: UniformSource[]
--- a/src/stores/dialogStore.ts
+++ b/src/stores/dialogStore.ts
@@ -6,7 +6,6 @@ import type { DialogPassThroughOptions } from 'primevue/dialog'
 import { markRaw, ref } from 'vue'
 import type { Component } from 'vue'

-import type GlobalDialog from '@/components/dialog/GlobalDialog.vue'
 import type { ComponentAttrs } from 'vue-component-type-helpers'

 type DialogPosition =
@@ -34,23 +33,19 @@ interface CustomDialogComponentProps {
  headless?: boolean
 }

-export type DialogComponentProps = ComponentAttrs<typeof GlobalDialog> &
-  CustomDialogComponentProps
+export type DialogComponentProps = CustomDialogComponentProps &
+  Record<string, unknown>

-export interface DialogInstance<
-  H extends Component = Component,
-  B extends Component = Component,
-  F extends Component = Component
-> {
+export interface DialogInstance {
  key: string
  visible: boolean
  title?: string
-  headerComponent?: H
-  headerProps?: ComponentAttrs<H>
-  component: B
-  contentProps: ComponentAttrs<B>
-  footerComponent?: F
-  footerProps?: ComponentAttrs<F>
+  headerComponent?: Component
+  headerProps?: Record<string, unknown>
+  component: Component
+  contentProps: Record<string, unknown>
+  footerComponent?: Component
+  footerProps?: Record<string, unknown>
  dialogComponentProps: DialogComponentProps
  priority: number
 }