- Add reproducedBy field to ResearchResult and done() tool
- Agent reports how bug was proven: e2e_test, video, both, or none
- Badge shows '1 via E2E test' instead of generic '1 reproduced'
- Deploy script reads reproducedBy from research-log.json
- Test code (reproduce.spec.ts) now deployed to report page
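
A minimal sketch of how the badge text could be derived from reproducedBy. Only the four values and the '1 via E2E test' wording come from this change; the function name and the other label strings are assumptions for illustration.

```typescript
// The four values the agent can report via done().
type ReproducedBy = "e2e_test" | "video" | "both" | "none";

// Hypothetical helper: turns a reproduction count and method into badge text.
function badgeLabel(count: number, reproducedBy: ReproducedBy): string {
  switch (reproducedBy) {
    case "e2e_test":
      return `${count} via E2E test`; // wording from the commit message
    case "video":
      return `${count} via video`; // assumed wording
    case "both":
      return `${count} via E2E test + video`; // assumed wording
    case "none":
      return `${count} reproduced`; // generic label used before this change
  }
}
```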
Amp-Thread-ID: https://ampcode.com/threads/T-019d4786-eb5f-7115-a10e-5b086c921800
Co-authored-by: Amp <amp@ampcode.com>
- Add rule: assertions must be specific to the bug (not just > 0)
- If no bug-specific assertion possible, verdict must be NOT_REPRODUCIBLE
- Copy reproduce.spec.ts to deployed report for transparency
- Addresses #10307 false REPRODUCED (test only asserted node count > 0)
Amp-Thread-ID: https://ampcode.com/threads/T-019d4786-eb5f-7115-a10e-5b086c921800
Co-authored-by: Amp <amp@ampcode.com>
Root cause: Cloudflare Pages serves stale deployments when multiple
deploys race to the same branch. The ANALYZING placeholder deployed
seconds before the final report would sometimes 'win' the race.
Fix: Only deploy once — the final report with the real badge. No more
intermediate PREPARING or ANALYZING placeholders.
Amp-Thread-ID: https://ampcode.com/threads/T-019d4786-eb5f-7115-a10e-5b086c921800
Co-authored-by: Amp <amp@ampcode.com>
- Add explicit instruction to call done() immediately after test passes
- Inject warning message in runTest response when test passes
- Ban page.waitForTimeout() in system prompt (use retrying assertions)
- Instruct agent to write ONE focused test, not multiple
Amp-Thread-ID: https://ampcode.com/threads/T-019d4786-eb5f-7115-a10e-5b086c921800
Co-authored-by: Amp <amp@ampcode.com>
- Add readFixture and readTest tools to qa-agent for fixture API discovery
- Enrich system prompt with comprehensive ComfyPage fixture API reference
- Switch CI to ANTHROPIC_API_KEY_QA secret
- Fix all TS errors in qa-agent.ts, qa-record.ts, qa-reproduce.ts
- Better error handling for API credit exhaustion
- Rewrite SKILL.md to reflect three-phase pipeline
Amp-Thread-ID: https://ampcode.com/threads/T-019d4786-eb5f-7115-a10e-5b086c921800
Co-authored-by: Amp <amp@ampcode.com>
Claude sometimes keeps iterating after a test passes, exhausting
the time budget without calling done(). Now: when runTest() returns
TEST PASSED, the test code is saved. If the agent loop ends without
done(), auto-sets verdict=REPRODUCED with the passing test.
Fixes #8532 (17 calls, test passed twice, but INCONCLUSIVE verdict).
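
A rough sketch of the fallback, assuming the loop tracks the last passing test code. The type and field names are hypothetical, and the INCONCLUSIVE default for "no done(), no passing test" is an assumption.

```typescript
type Verdict = "REPRODUCED" | "NOT_REPRODUCIBLE" | "INCONCLUSIVE";

// Hypothetical shape of what the agent loop knows when it ends.
interface AgentOutcome {
  doneVerdict: Verdict | null; // non-null only if the agent called done()
  lastPassingTest: string | null; // saved when runTest() reports TEST PASSED
}

interface FinalResult {
  verdict: Verdict;
  testCode: string | null;
  autoCompleted: boolean;
}

function finalizeVerdict(outcome: AgentOutcome): FinalResult {
  // The agent finished properly: trust its verdict.
  if (outcome.doneVerdict !== null) {
    return { verdict: outcome.doneVerdict, testCode: outcome.lastPassingTest, autoCompleted: false };
  }
  // The loop ended without done(), but a test passed: auto-set REPRODUCED.
  if (outcome.lastPassingTest !== null) {
    return { verdict: "REPRODUCED", testCode: outcome.lastPassingTest, autoCompleted: true };
  }
  // Nothing passed and nothing was reported (assumed default).
  return { verdict: "INCONCLUSIVE", testCode: null, autoCompleted: true };
}
```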
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Added || true to all grep/sed pipe chains that could exit non-zero:
- grep on pr-context.txt (line 149)
- sed/grep on pr-context description (lines 158-159)
- grep -oiP on RISK_FIRST (line 331)
- wrangler deploy | grep URL (line 355)
All tested under set -euo pipefail with empty inputs.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
sed on video-reviews/*.md returns exit code 2 when no files match,
killing the script under set -euo pipefail. Added || true to all
potentially empty glob pipelines.
Affects 13/20 QA runs that had successful research but failed deploy.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The test uses comfyPageFixture, not bare page. Also match
firstNode await calls for node interaction pauses.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Regex inserts await page.waitForTimeout(800) before every
comfyPage/topbar/page/canvas/expect await call in the Phase 2
test code. Adds ~5-8s to a 10-step test (negligible vs 10min research).
Default playback changed to 0.5x (was 0.25x) since pauses provide
natural breathing room. A 15s video at 0.5x = 30s viewing.
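
A sketch of the pause injection, assuming each awaited call starts its own line. The function name and the exact pattern are illustrative, not the script's actual regex.

```typescript
const PAUSE_MS = 800;

// Hypothetical helper: inserts a pause line before every awaited call on
// comfyPage, topbar, page, canvas, or expect, preserving indentation.
function injectPauses(testCode: string): string {
  const pattern = /^([ \t]*)(await (?:comfyPage|topbar|page|canvas|expect)\b)/gm;
  return testCode.replace(pattern, `$1await page.waitForTimeout(${PAUSE_MS});\n$1$2`);
}

// Viewing time at a given playback rate, e.g. 15s at 0.5x.
function viewingSeconds(durationSeconds: number, playbackRate: number): number {
  return durationSeconds / playbackRate;
}
```

Lines that do not begin with one of the listed awaited calls are left untouched.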
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Report player defaults to 0.25x speed (was 0.5x) — 5s test videos
play in 20s, much more watchable
- Phase 2 injects cursor overlay via addInitScript into the test code
before running — white SVG arrow follows mousemove events
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Playwright config only records video when PLAYWRIGHT_LOCAL is set.
In CI, this env var was missing so Phase 2 produced no video.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
After context.close(), renameLatestWebm would overwrite the Phase 2
test execution video with the idle research browser recording.
Now skips the rename if qa-session.webm already exists from Phase 2.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The old video showed an idle screen (research browser doing nothing).
Now Phase 2 runs the test with --video=on from browser_tests/tests/,
finds the recorded .webm, and copies it to qa-session.webm where
the deploy script expects it.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Copy test to browser_tests/tests/ where Playwright config expects it
- System prompt teaches Claude the project's test fixtures:
comfyPageFixture, comfyPage.menu.topbar, comfyPage.workflow, etc.
- Increased time budget to 10 min for write→run→fix iterations
- Increased max turns to 50
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Also search qa-artifacts/before/*/research/ for the research log
since artifacts are downloaded with that nested structure.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Phase 1: Claude reads issue + a11y tree, writes a Playwright .spec.ts
test that asserts the bug exists. Runs the test, reads errors, iterates
until the test passes (proving the bug) or determines NOT_REPRODUCIBLE.
Phase 2: Run the passing test with --video=on for clean recording.
This replaces interactive browser driving with deterministic test code.
Claude Sonnet 4.6 excels at writing Playwright tests — much more
reliable than real-time browser interaction.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
page.accessibility.snapshot() was removed in Playwright 1.49+.
Use page.locator('body').ariaSnapshot() which returns a text
representation of the accessibility tree.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Copy research-log.json to deploy dir (accessible at /research-log.json)
- Read verdict from research log first (a11y-verified ground truth)
- Fall back to video review verdict only if no research log exists
- Research log is uploaded as part of QA artifacts
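
The lookup order can be sketched as below. The types are hypothetical, and the INCONCLUSIVE default when neither source exists is an assumption; the deploy script itself is shell.

```typescript
type Verdict = "REPRODUCED" | "NOT_REPRODUCIBLE" | "INCONCLUSIVE";

// Hypothetical minimal view of research-log.json.
interface ResearchLog {
  verdict: Verdict;
}

function resolveVerdict(researchLog: ResearchLog | null, videoReviewVerdict: Verdict | null): Verdict {
  // The research log is a11y-verified ground truth, so it wins when present.
  if (researchLog !== null) return researchLog.verdict;
  // Only without a research log do we fall back to the video review.
  if (videoReviewVerdict !== null) return videoReviewVerdict;
  return "INCONCLUSIVE"; // assumed default
}
```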
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Phase 1 (qa-agent.ts): Claude investigates via a11y API only.
- No video, no Gemini vision — only page.accessibility.snapshot()
- Every action logged with a11y before/after state
- done() requires evidence citing inspect() results
- Outputs reproduction plan for Phase 2
Phase 2 (qa-reproduce.ts): Deterministic replay of research plan.
- Executes each step with a11y assertions
- Gemini describes visual changes (narration for humans)
- Clean focused video with subtitles
Phase 3: Report job reads research-log.json for verdict (ground truth),
narration-log.json for descriptions, video for visuals.
Gemini formats logs into report — never determines verdict.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Agent: MUST use inspect() after every action, verdict based on DOM
state not opinions. "NEVER claim REPRODUCED unless inspect() confirms."
Reviewer: Two-phase prompt — Phase 1 describes what it SEES (blind,
no context). Phase 2 compares observations against issue/PR context.
Anti-hallucination rules: "describe ONLY what you observe, NEVER infer."
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ANTHROPIC_API_KEY is optional: Agent SDK uses Claude Code OAuth
session when running locally (detects CLAUDE_CODE_SSE_PORT).
In CI, ANTHROPIC_API_KEY from secrets is used.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The Gemini-only agentic loop had a ~47% success rate — too low to be
useful as a fallback. Now ANTHROPIC_API_KEY is required for issue
reproduction. Fails clearly if missing.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
addInitScript runs before page load — Vue's app mount destroys the
cursor div when it takes over the DOM. Using addScriptTag after login
ensures the cursor persists in the stable DOM.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Locator.click/hover bypasses our page.mouse monkey-patch. Now
clickByText, hoverMenuItem, clickSubmenuItem get the element
bounding box and update cursor overlay manually.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The grep \{"verdict":\s*"[^"]+ captures up to but not including the
closing quote. The second grep for "[A-Z_]+"$ then fails because
there's no closing quote. Fixed: match "verdict":\s*"[A-Z_]+ then
extract [A-Z_]+$ (no quotes needed).
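
The same two-step extraction, expressed as a TypeScript sketch for illustration (the deploy script uses grep; the function name is hypothetical):

```typescript
function extractVerdict(text: string): string | null {
  // Step 1: match the key and the value up to, but not including, the closing quote.
  const first = /"verdict":\s*"[A-Z_]+/.exec(text)?.[0];
  if (first === undefined) return null;
  // Step 2: take the trailing run of capitals and underscores; no quote is required.
  // The old second pattern, /"[A-Z_]+"$/, needed a closing quote that step 1 never captures.
  const second = /[A-Z_]+$/.exec(first)?.[0];
  return second ?? null;
}
```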
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Video review prompt now requests a ## Verdict JSON block:
{"verdict": "REPRODUCED|NOT_REPRODUCIBLE|INCONCLUSIVE", "risk": "low|medium|high"}
- Deploy script reads JSON verdict first, falls back to grep
- Eliminates all regex-matching false positives permanently
- Theme: light mode is default, dark via prefers-color-scheme:dark
- Cards use solid backgrounds, grain overlay only in dark mode
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add prefers-color-scheme:light media query with light palette.
Replace hardcoded dark oklch values with CSS variables.
Light mode: white surfaces, dark text, subtle borders, no grain overlay.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- loadDefaultWorkflow now calls app.resetToDefaultWorkflow() via JS API
instead of navigating File → Load Default menu (menu item name varies)
- pressKey reverted to instant press() — the 400ms hold via down/up
prevented Escape from propagating to parent dialog (#10397 BEFORE video
showed wrong behavior because hold intercepted the event)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Instead of manually calling moveCursorOverlay in each action,
patch page.mouse.move/click/dblclick/down/up globally. Now EVERY
mouse operation shows the cursor — text clicks, menu hovers, etc.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
"could not be confirmed" contains "confirmed" which matched the
positive reproduc|confirm check. Now caught by the negative check first.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The Agent SDK returned "model not found" for claude-sonnet-4-6-20250514.
Correct ID is claude-sonnet-4-6.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Playwright's CDP mouse input in headless Chrome doesn't trigger DOM mousemove events
reliably. Now executeAction calls __moveCursor(x,y) directly after
every mouse.move/click/drag. Cursor is an SVG arrow (white + outline).
Click state shown via scale animation.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
pressKey now uses keyboard.down/up with 400ms hold instead of
instant press(). Shows subtitle "⌨ Escape" and the keyboard HUD
catches the held state for video frame capture.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Agent system prompt now instructs Claude to demonstrate BOTH working
(control) and broken (test) states when bug is triggered by a setting
- Added docs/qa/backlog.md with future improvements: Type B/C comparisons,
TTS, pre-seeding, cost optimization, environment-dependent issues
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Shows "QA @ abc1234" linking to the pipeline code commit
- Shows start time → deploy time in header
- Helps trace which version of QA scripts generated each report
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Architecture:
- Claude Sonnet 4.6 plans and reasons (via Claude Agent SDK)
- Gemini 2.5 Flash watches video buffer and describes what it sees
- 4 tools: observe(), inspect(), perform(), done()
observe(seconds, focus): builds video clip from screenshot buffer,
sends to Gemini with Claude's focused question.
inspect(selector): searches a11y tree for specific element state.
perform(action, params): executes Playwright action.
done(verdict, summary): signals completion.
Falls back to Gemini-only loop if ANTHROPIC_API_KEY not set.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Agent reasoning shown as subtitle bar at bottom of video during recording
- After recording, generates TTS audio via OpenAI API (tts-1, nova voice)
- Merges audio clips at correct timestamps into the video with ffmpeg
- Requires OPENAI_API_KEY env var; gracefully skips if not set
- No-sandbox + disable-dev-shm-usage for headless Chrome compatibility
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Download QA guide artifact in report job
- Extract prerequisites, test focus, and steps from guide JSON
- Display below the purpose description: focus → prerequisites → steps
- Separated by a subtle divider with smaller font
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Report site shows "PR #N aims to..." or "Issue #N reports..." block
above the video cards, extracted from pr-context.txt
- Multi-pass video links fall back to pass1 when qa-{os}.mp4 is 404
- More negative verdict patterns: "does not demonstrate", "never tested"
- Risk uses first word of Overall Risk (avoids "high confidence" match)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add "does not demonstrate", "steps were not performed", "never tested"
to NOT_REPRO patterns (fixes #9101 false positive)
- Risk detection uses first word of Overall Risk section instead of
grepping entire text (fixes "high confidence" matching HIGH)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
tsx compiles arrow functions with __name helpers that don't exist in
browser context. Using addScriptTag with plain JS string avoids this.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Set<string>() in page.evaluate causes __name ReferenceError in browser.
Use untyped Set() since browser JS doesn't support TS generics.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
"fails to reproduce" contains "reproduce" — must check negatives first
within each report. Across reports, REPRODUCED still wins (multi-pass).
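
A sketch of the ordering, written in TypeScript for illustration (the deploy script is shell). The negative phrase list is limited to phrases quoted in these commit messages, and ranking NOT_REPRODUCIBLE above INCONCLUSIVE across reports is an assumption.

```typescript
type Verdict = "REPRODUCED" | "NOT_REPRODUCIBLE" | "INCONCLUSIVE";

const NEGATIVE = /fails to reproduce|does not demonstrate|steps were not performed|never tested/i;
const POSITIVE = /reproduc|confirm/i;

// Within one report: negatives are checked first, because phrases such as
// "fails to reproduce" also contain the positive stem.
function classifyReport(text: string): Verdict {
  if (NEGATIVE.test(text)) return "NOT_REPRODUCIBLE";
  if (POSITIVE.test(text)) return "REPRODUCED";
  return "INCONCLUSIVE";
}

// Across reports: a single REPRODUCED pass wins.
function combineVerdicts(verdicts: Verdict[]): Verdict {
  if (verdicts.some((v) => v === "REPRODUCED")) return "REPRODUCED";
  if (verdicts.some((v) => v === "NOT_REPRODUCIBLE")) return "NOT_REPRODUCIBLE";
  return "INCONCLUSIVE";
}
```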
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Injects a persistent overlay in bottom-right corner that displays
currently held keys (e.g. "⌨ Space", "⌨ CTRL+C"). Makes keyboard
interactions visible in the recording for both human and AI reviewers.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When multiple report files exist, badge shows "2/3 REPRODUCED" instead
of just "REPRODUCED". Single-pass issues still show plain verdict.
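
A small sketch of the badge text. The function name is hypothetical, and showing the first pass's verdict when no pass reproduced is an assumption.

```typescript
function passBadge(verdicts: string[]): string {
  const reproduced = verdicts.filter((v) => v === "REPRODUCED").length;
  // Multiple passes with at least one reproduction: show the ratio.
  if (verdicts.length > 1 && reproduced > 0) {
    return `${reproduced}/${verdicts.length} REPRODUCED`;
  }
  // Single pass: plain verdict. Multi-pass without a reproduction: assumed fallback.
  return verdicts[0] ?? "INCONCLUSIVE";
}
```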
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When multiple passes exist and one confirms while another is
inconclusive, the badge should show REPRODUCED. Previously
INCONCLUSIVE was checked first, hiding successful reproductions.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The "Clone" context menu item doesn't exist in Nodes 2.0 mode.
Using Ctrl+C/Ctrl+V works in both legacy and Nodes 2.0.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- #10307: preflight clones KSampler node, hint says drag to overlap
- #7414: preflight clicks numeric widget, hint says drag to change value
- #7806: preflight takes baseline screenshot, hint gives exact coords
for holdKeyAndDrag with spacebar
- Hints now reference "Preflight already did X, NOW do Y" pattern
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Agent was wasting turns re-doing loadDefaultWorkflow and setSetting
that preflight already executed. Now the system prompt includes
"Already Done" section listing preflight actions.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Auto-execute prerequisite actions (enable Nodes 2.0, load default
workflow) BEFORE the agentic loop starts. Agent model ignores prompt
hints but preflight guarantees nodes are on canvas.
- Add "fails to reproduce" to NOT REPRODUCIBLE badge patterns
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add issues:[labeled] trigger and qa-issue label support
- Resolve github.event.issue.number for issue-triggered runs
- Include issue labels in context (feeds keyword matcher for hints)
- Remove qa-issue label after run completes (same as qa-changes/qa-full)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Scan issue context for keywords (clone, copy-paste, spacebar, resize,
sidebar, scroll, middle-click, node shape, Nodes 2.0, etc.) and inject
specific MUST-follow action steps into the agentic system prompt.
Addresses 9 INCONCLUSIVE issues where agent had actions available
but didn't know when to use them.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Annotations now use cyan dashed border + monospace "QA:" prefix
instead of red solid labels that look like UI error messages
- Video review prompts explicitly tell reviewer to ignore QA annotations
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Change preload=metadata to preload=auto for full video download
- Add _headers file with Accept-Ranges for Cloudflare Pages
- Add custom seekbar (range input + buffer indicator) that works
even without server HTTP range request support
- Seekbar shows buffered progress and allows dragging to any point
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Previous prompt said "test the specific behavior" which was too vague,
leading to generic UI walkthroughs instead of targeted tests.
New prompt: explicitly instructs to read the diff, trigger the exact
scenario the PR fixes, and avoid generic menu screenshots.
Also added reload action to before/after prompt for state persistence tests.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fix quality badge now reads "## Overall Risk" section only
- Prevents false MAJOR ISSUES from severity labels or negated phrases
- "Low" risk → APPROVED, "High" → MAJOR ISSUES, "Medium" → MINOR ISSUES
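
A sketch of the section-scoped lookup in TypeScript (the real logic lives in the shell deploy script; names here are hypothetical). It reads only the text under "## Overall Risk" and maps its first word to a badge.

```typescript
type FixBadge = "APPROVED" | "MINOR ISSUES" | "MAJOR ISSUES";

function fixBadgeFromReport(markdown: string): FixBadge | null {
  const lines = markdown.split("\n");
  const start = lines.findIndex((l) => /^##\s+Overall Risk\b/i.test(l));
  if (start === -1) return null;
  // Collect the section body up to the next heading.
  const body: string[] = [];
  for (const line of lines.slice(start + 1)) {
    if (/^#/.test(line)) break;
    body.push(line);
  }
  // The first alphabetic word decides; later words such as "high confidence" are ignored.
  const firstWord = /[A-Za-z]+/.exec(body.join(" "))?.[0]?.toLowerCase();
  switch (firstWord) {
    case "low":
      return "APPROVED";
    case "medium":
      return "MINOR ISSUES";
    case "high":
      return "MAJOR ISSUES";
    default:
      return null;
  }
}
```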
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
setup-frontend must run first to install node/pnpm, then rebuild
with PR code. Also re-install sno-skills deps after switching back
so QA scripts' dependencies are available.
Also gitignore .claude/scheduled_tasks.lock.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When triggered via sno-qa-* push, the workflow checks out the PR code
to build its frontend, but this replaces qa-record.ts which only
exists on sno-skills. Fix: build PR frontend, then checkout back to
sno-skills so QA scripts remain available.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- SC2231: quote glob expansions in for loop
- SC2002: use sed directly instead of cat | sed
- SC2086: quote variable in echo
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Update DefaultThumbnail test to match size-full class change
- Fix shellcheck warnings in qa-batch.sh (SC2001, SC2207)
- Fix shellcheck warnings in qa-deploy-pages.sh (SC2034, SC2235, SC2231, SC2002)
- Add qa-report-template.html to oxfmt ignore (minified, not formattable)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- "confirmed" didn't match "confirms"/"reproducible" — use "reproduc|confirm" stem
- "partial" matched unrelated text — require "partially reproduced" specifically
- collectVideoCandidates now finds qa-session-*.mp4 for multi-pass reviews
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Check INCONCLUSIVE before reproduced/confirmed in badge detection
- Exclude markdown headings from reproduced grep match
- Add --pass-label to qa-video-review.ts for unique multi-pass filenames
- Pass pass label from workflow YAML when reviewing numbered sessions
- Collect all pass-specific reports in deploy script HTML
- Add addNode/cloneNode convenience actions to qa-record agent
- Improve strategy hints for visual/rendering bug reproduction
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Remove push trigger (was for dev testing only)
- Restore concurrency group (was commented out for dev)
- Move misplaced import in qa-analyze-pr.ts to top of file
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Strengthen prompt: MUST use openMenu → hoverMenuItem → clickMenuItem
in that order. Previous runs skipped openMenu causing silent failures.
- Add CI Job link to the QA report site header for quick navigation
to the GitHub Actions run that generated the report.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
GitHub Actions has a 21000 char limit per expression. The combined
badge setup step exceeded this after adding the dual badge generator.
Split into its own step.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Badge now reads: QA Bot | REPRODUCED | Fix: APPROVED
Not all issues are bugs — could be feature requests too.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Reproducing a bug is a successful outcome for the QA bot.
Blue (#2196f3) = bot succeeded. Red = bot found problems with the fix.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
PRs now show one badge with three segments:
QA Bot | Bug: REPRODUCED | Fix: APPROVED
Instead of two separate badges. Uses gen-badge-dual.sh which
renders label + bug status + fix status in one SVG.
Issues still use single two-segment badge:
QA Bot | FINISHED: REPRODUCED
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Inject fake cursor (red dot with click animation) via addInitScript
since headless Chrome doesn't render the system cursor in video
- Add hover-before-click delay to clickByText and canvas clicks
so viewers can see where the cursor moves before clicking
- Add 'annotate' action: shows a floating label at (x,y) for N ms
so AI can draw viewer attention to important UI state changes
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Root cause: runAgenticLoop never read the QA guide — agent saw
"No issue context provided" for issues. Now reads qaGuideFile,
parses structured fields, and injects into system prompt.
Also: fetch issue body via gh issue view in workflow, increase
budget to 120s/30 turns, add focus reminders, smarter stuck
detection (50px grid normalization + action-type frequency),
reject invalid click targets, add loadDefaultWorkflow and
openSettings convenience actions, strategy hints in prompt.
Fix pre-existing typecheck error in eslint.config.ts.
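
One way the stuck detection could look, as a sketch only. The 50px grid comes from the message above; the window size, thresholds, and all names are assumptions.

```typescript
interface AgentAction {
  type: string;
  x?: number;
  y?: number;
}

const GRID_PX = 50;

// Snap coordinates to a 50px grid so near-identical clicks share one key.
function actionKey(a: AgentAction): string {
  if (a.x === undefined || a.y === undefined) return a.type;
  return `${a.type}@${Math.floor(a.x / GRID_PX)},${Math.floor(a.y / GRID_PX)}`;
}

function isStuck(history: AgentAction[], windowSize = 6, maxSameCell = 3, maxSameType = 5): boolean {
  const recent = history.slice(-windowSize);
  const cellCounts = new Map<string, number>();
  const typeCounts = new Map<string, number>();
  for (const a of recent) {
    const key = actionKey(a);
    cellCounts.set(key, (cellCounts.get(key) ?? 0) + 1);
    typeCounts.set(a.type, (typeCounts.get(a.type) ?? 0) + 1);
  }
  // Stuck if one grid cell is hit repeatedly, or one action type dominates the window.
  const sameCell = Array.from(cellCounts.values()).some((n) => n >= maxSameCell);
  const sameType = Array.from(typeCounts.values()).some((n) => n >= maxSameType);
  return sameCell || sameType;
}
```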
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
PRs now get two separate badges:
- Bug: REPRODUCED / NOT REPRODUCIBLE / PARTIAL (before branch)
- Fix: APPROVED / MAJOR ISSUES / MINOR ISSUES (after branch)
Issues keep a single badge: FINISHED: REPRODUCED / etc.
Both badge-bug.svg and badge-fix.svg served from the deploy site.
PR comment shows all three: ![badge] ![bug] ![fix]
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
FINISHED is not standalone — always shows result:
- FINISHED: REPRODUCED / NOT REPRODUCIBLE / PARTIAL (issues)
- FINISHED: APPROVED / MAJOR ISSUES / MINOR ISSUES (PRs)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Badge shows QA pipeline status, deployed at each stage:
- PREPARING (blue) — setting up artifacts
- ANALYZING (orange) — running video review
- Final status with color:
- Issues: REPRODUCED (red) / NOT REPRODUCIBLE (gray) / PARTIAL (yellow)
- PRs: APPROVED (green) / MAJOR ISSUES (red) / MINOR ISSUES (yellow)
Badge served as /badge.svg from the same Cloudflare Pages site.
Included in the PR comment as an inline image.
Also restore @ts-expect-error for import-x plugin type incompatibility.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace single-shot step generation in reproduce mode with an agentic
loop where Gemini sees the screen after each action and decides what
to do next. For multi-bug issues, decompose into sub-issues and run
separate recording passes.
- Extract executeAction() from executeSteps() for reuse
- Add reload and done action types
- Add captureScreenshotForGemini() (JPEG q50, ~50KB)
- Add runAgenticLoop() with sliding window history (3 screenshots)
- Add decomposeIssue() for multi-pass recording (1-3 sub-issues)
- Update workflow to handle numbered session videos (qa-session-1, etc.)
- Add custom video controls below each video with frame stepping
- Frame back/forward buttons (1 frame at 30fps, 10 frames skip)
- Speed selector: 0.1x, 0.25x, 0.5x (default), 1x, 1.5x, 2x
- Keyboard shortcuts: arrow keys for frame step, space for play/pause
- SMPTE-style timecode display (m:ss.ms)
- Default 0.5x speed since AI operates UI faster than humans
- Videos no longer autoplay (pause on load for inspection)
- Zero external dependencies (pure HTML5 video API)
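
A sketch of the frame-step and timecode math behind the controls, assuming a fixed 30fps and three millisecond digits. The function names are hypothetical; in the page, the result would be assigned to the video element's currentTime.

```typescript
const FPS = 30;

// New playback position after stepping by a number of frames (negative = back),
// clamped to the video's bounds.
function stepFrames(currentTime: number, frames: number, duration: number): number {
  const next = currentTime + frames / FPS;
  return Math.min(Math.max(next, 0), duration);
}

// Formats seconds as m:ss.ms, e.g. 75.5 becomes "1:15.500".
function formatTimecode(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const m = Math.floor(totalMs / 60000);
  const s = Math.floor((totalMs % 60000) / 1000);
  const ms = totalMs % 1000;
  return `${m}:${String(s).padStart(2, "0")}.${String(ms).padStart(3, "0")}`;
}
```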
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Reproduce video must be max 5 minutes (short, focused demo)
- Phase 4 reuses the environment from Phase 3 (no re-setup)
- Use video-start/video-stop commands (not --save-video flag)
- Start recording right before steps, stop immediately after
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New Claude agent-driven issue reproduction skill that:
- Phase 1-2: Research issue and set up environment (custom nodes, workflows, settings)
- Phase 3: Record research video while exploring interactively via playwright-cli
- Phase 4: Record clean reproduce video with only the minimal repro steps
- Phase 5: Generate structured reproduction report
Key difference from the old approach: Claude agent explores and adapts
instead of blindly executing a Gemini-generated static plan.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Gemini was right-clicking empty canvas instead of nodes because it
didn't know where the default workflow nodes are positioned. Now the
prompt includes approximate coordinates for all 7 default nodes and
clarifies the difference between node context menu vs canvas menu.
Also fixes TS2352 in page.evaluate by using double-cast through unknown.
- Add --target-url CLI option to qa-video-review.ts
- Include target URL in generated markdown reports
- Add clickable issue/PR link in deployed HTML report header
- Workflow passes the target URL automatically
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove unused @ts-expect-error directives in eslint.config.ts
- Simplify LazyImage prop types from ClassValue to string
- Fix DialogInstance to avoid infinitely deep type instantiation
- Use cn() in DefaultThumbnail for class merging
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- fillDialog now tries: PrimeVue dialog → node search box → focused input → keyboard fallback
- clickSubmenuItem now tries: PrimeVue tiered menu → litegraph context menu → role menuitem
- Fixes double-click-to-add-node flow and right-click context menu clicks
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Wrap each step in try/catch so failed steps don't abort the recording
- Add 5s timeout to clickByText to prevent 30s hangs on disabled elements
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
gh pr view can't distinguish PRs from issues — it succeeds for both.
Use the REST API endpoint repos/{owner}/{repo}/pulls/{number} which
returns 404 for issues.
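
A sketch of the check, split so the HTTP call itself stays outside the example. The helper names are hypothetical, and treating any status other than 200 or 404 as an error is an assumption.

```typescript
// Builds the REST endpoint that exists only for pull requests.
function pullsEndpoint(owner: string, repo: string, number: number): string {
  return `https://api.github.com/repos/${owner}/${repo}/pulls/${number}`;
}

// Interprets the HTTP status returned by that endpoint.
function isPullRequestStatus(status: number): boolean {
  if (status === 200) return true; // the number is a PR
  if (status === 404) return false; // the number is an issue (or does not exist)
  throw new Error(`Unexpected status ${status}`); // rate limit, auth failure, etc.
}
```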
- Enforce requestTimeoutMs via Gemini SDK requestOptions
- Add 100MB video size check before base64 encoding
- Sanitize screenshot filenames to prevent path traversal
- Sort video files by mtime for reliable rename
- Validate --mode arg against allowed values
- Add Content-Length pre-check in downloadMedia
- Add GitHub domain allowlist for media downloads (SSRF mitigation)
- Add contents:write permission and git config for report job
- Update Node.js requirement in SKILL.md from 18+ to 22+
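
A sketch of two of the hardening checks above. The host list, fallback filename, and function names are assumptions, not the values used in the scripts.

```typescript
// Assumed allowlist; the real list may differ.
const ALLOWED_MEDIA_HOSTS = ["github.com", "githubusercontent.com"];

// Accept only https URLs whose host is an allowed domain or a subdomain of one.
function isAllowedMediaUrl(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== "https:") return false;
  const host = url.hostname.toLowerCase();
  return ALLOWED_MEDIA_HOSTS.some((d) => host === d || host.endsWith(`.${d}`));
}

// Keep only the final path segment and replace anything outside a safe character set.
function sanitizeFilename(name: string): string {
  const base = name.replace(/\\/g, "/").split("/").pop() ?? "";
  const cleaned = base.replace(/[^A-Za-z0-9._-]/g, "_").replace(/^\.+/, "");
  return cleaned.length > 0 ? cleaned : "screenshot";
}
```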
Instead of crashing the entire recording session when Gemini generates
an invalid key name (e.g. "mouseWheelDown"), catch the error and
continue with remaining steps.
Add a "Behavior Changes" table (Behavior, Before, After, Verdict)
alongside the existing timeline comparison. This gives reviewers a
quick high-level view of all behavioral differences before diving
into the frame-by-frame timeline.
Instruct Gemini to output the Before vs After section as a markdown
table with Time, Type, Severity, Before, After columns for easier
comparison. Update HTML template table styles with fixed layout and
column widths optimized for the 5-column comparison format.
- Log raw Gemini response for debugging when parsing fails
- Handle possible wrapper keys in response
- Make qa-before/qa-after run even if analyze-pr fails (only gate
on resolve-matrix success)
When running on push events for sno-qa-* branches without an open PR,
extract the PR number from the branch name so analyze-pr can fetch
the full PR thread for analysis.
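
A sketch of the extraction, assuming the branch is named sno-qa-<number>. The message above only says sno-qa-*, so the exact pattern and the function name are assumptions.

```typescript
function prNumberFromBranch(ref: string): number | null {
  // Accept either a bare branch name or a full ref such as refs/heads/sno-qa-123.
  const branch = ref.replace(/^refs\/heads\//, "");
  const digits = /^sno-qa-(\d+)/.exec(branch)?.[1];
  return digits === undefined ? null : Number.parseInt(digits, 10);
}
```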
Add Gemini Pro-powered PR analysis that generates targeted QA guides
from the full PR thread (description, comments, screenshots, diff).
The analyze-pr job runs on lightweight ubuntu before recordings start,
producing qa-guide-before.json and qa-guide-after.json that are
downloaded by recording jobs to produce more focused test steps.
Graceful fallback: if analysis fails, recordings proceed without guides.
download-artifact@v7 merges all files flat regardless of
merge-multiple setting. Use separate path dirs (before/after)
and copy all files into the report directory.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
download-artifact@v7 defaults merge-multiple to true, which puts all
files flat in qa-artifacts/ instead of per-artifact subdirectories.
The merge step expects qa-artifacts/qa-before-{os}-{run}/ subdirs,
so the report directory never gets created and video review finds
no files.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Instead of running before/after sequentially in a single job with
fragile git stash/checkout gymnastics, split into two independent
parallel jobs on separate runners:
  resolve-matrix → qa-before (main) ─┐
                 → qa-after (PR)    ─┴→ report
- qa-before: uses git worktree for clean main branch build
- qa-after: normal PR build via setup-frontend
- report: downloads both artifact sets, merges, runs Gemini review
Benefits:
- Clean workspace isolation (no git checkout origin/main -- .)
- ~2x faster (parallel execution)
- Each job gets its own ComfyUI server (no shared state)
- Eliminates entire class of workspace contamination bugs
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The Pre-seed step creates qa-ci via API, so the "New user" form
shows an "already exists" error. Fix by selecting the existing user
from the dropdown first, falling back to a unique username.
The localStorage userId bypass doesn't work because the server
validates user IDs and rejects the simple 'qa-ci' string. Instead,
detect the login page by its input fields and create a user via the
"New user" text input, which is how real users would log in.
Firefox headless doesn't support WebGL, causing "getCanvas: canvas is
null" errors. Switch to Chromium which has full headless WebGL support.
Also fix login flow to wait for async router guard to settle and
create user via text input as fallback.
Add coordinate fallback when .comfy-menu-button-wrapper selector isn't
found, and capture a debug screenshot after login to diagnose what the
page looks like when the editor UI fails to render.
The openComfyMenu was clicking at hardcoded coordinates (20, 67) which
missed the menu button. Now uses .comfy-menu-button-wrapper selector
matching the browser tests. Also fixes menu item hover/click selectors
to use .p-menubar-item-label and .p-tieredmenu-item classes, and adds
a wait for the editor UI to fully load before executing test steps.
The QA recordings were stuck on the user selection screen because CI
has no existing users. Fix by pre-seeding localStorage with userId,
userName, and TutorialCompleted before navigation, plus creating a
qa-ci user via API as a fallback.
The main branch build step was running pnpm install with main's lockfile,
which removed @google/generative-ai from node_modules. Move the reinstall
to after restoring PR files so the QA recording script can find its deps.
git checkout - uses @{-1} which requires a previous branch switch.
Since we use 'git checkout origin/main -- .' (file checkout, not branch
switch), there is no @{-1} ref. Use HEAD to restore from current branch.
Also restore proper concurrency group.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Pass the comprehensive test plan from .claude/skills/comfy-qa/SKILL.md
to Gemini when generating test steps. This gives Gemini knowledge of all
12 QA categories (canvas, menus, sidebar, settings, etc.) so it picks
the most relevant tests for each PR.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
nx build runs typecheck as a prerequisite (via @nx/vite/plugin config).
Use vite build directly for the main branch comparison build.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Main branch may have transient TS errors when built with the PR
branch's lockfile. Since we only need the dist for visual comparison,
run nx build directly instead of pnpm build (which includes typecheck).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace the unreliable codex exec approach with a Playwright script
(qa-record.ts) that uses Gemini to generate targeted test steps from
the PR diff, then executes them deterministically via Playwright's API.
Key changes:
- New scripts/qa-record.ts: Gemini generates JSON test actions, Playwright
executes them with reliable helper functions (menu nav, dialog fill, etc.)
- Remove codex CLI and playwright-cli dependencies
- Remove 150+ lines of prompt templates from pr-qa.yaml
- Firefox headless with video recording (same approach proven locally)
- Fallback steps if Gemini fails
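
A minimal sketch of the "generate JSON, validate, fall back" flow described above. The action vocabulary, `FALLBACK_STEPS`, and `parseTestActions` are hypothetical names for illustration; the actual schema used by `scripts/qa-record.ts` is not shown in this log.

```typescript
// Hypothetical action vocabulary; the real qa-record.ts schema may differ.
type TestAction =
  | { action: 'openMenu' }
  | { action: 'clickMenuItem'; label: string }
  | { action: 'fillDialog'; text: string }

// Used when the model response cannot be parsed or contains no valid step.
const FALLBACK_STEPS: TestAction[] = [{ action: 'openMenu' }]

// Runtime check that an untrusted value is one of the known actions.
function isTestAction(value: unknown): value is TestAction {
  if (typeof value !== 'object' || value === null) return false
  const v = value as Record<string, unknown>
  switch (v.action) {
    case 'openMenu':
      return true
    case 'clickMenuItem':
      return typeof v.label === 'string'
    case 'fillDialog':
      return typeof v.text === 'string'
    default:
      return false
  }
}

// Parse the raw model output, keep only well-formed steps,
// and fall back if nothing usable remains.
function parseTestActions(raw: string): TestAction[] {
  let parsed: unknown
  try {
    parsed = JSON.parse(raw)
  } catch {
    return FALLBACK_STEPS
  }
  if (!Array.isArray(parsed)) return FALLBACK_STEPS
  const steps = parsed.filter(isTestAction)
  return steps.length > 0 ? steps : FALLBACK_STEPS
}
```

Validating before execution keeps the Playwright side deterministic: malformed or unknown steps are dropped rather than interpreted.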
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Tighten BEFORE prompt to 15s snapshot (show old state only)
- Add qa-generate-test.ts: Gemini-powered Playwright test generator
- New workflow step: generate .spec.ts and push to {branch}-add-qa-test
- Tests assert UI/UX behavior (tab names, dirty state, visibility)
- Build both main (dist-before/) and PR (dist/) frontends in focused mode
- Run QA twice: BEFORE on main branch frontend, AFTER on PR branch
- Send both videos to Gemini in one request for comparative analysis
- Side-by-side dashboard layout with Before (main) / After (PR) panels
- Comparative prompt evaluates whether before confirms old behavior
and after proves the fix works
- Falls back to single-video mode when no before video available
moov atom was at end of file (8.6MB offset) — browser had to download
the entire video before seeking. Keyframes were only every 10 seconds.
Add -movflags +faststart (moov before mdat) and -g 60 (keyframe every
2.4s at 25fps) to ffmpeg conversion.
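
As a rough illustration, the conversion arguments and the resulting keyframe spacing can be sketched as below. Only `-movflags +faststart` and `-g 60` come from the message above; the helper names and file paths are assumptions.

```typescript
// Build an ffmpeg argument list for a seek-friendly MP4 (helper name is hypothetical).
function buildFfmpegArgs(input: string, output: string, gopSize: number): string[] {
  return [
    '-i', input,
    '-movflags', '+faststart', // place the moov atom before mdat
    '-g', String(gopSize), // maximum distance between keyframes, in frames
    output
  ]
}

// Worst-case distance between keyframes, in seconds, for a given frame rate.
function keyframeIntervalSeconds(gopSize: number, fps: number): number {
  return gopSize / fps
}

const args = buildFfmpegArgs('qa-session.webm', 'qa-session.mp4', 60)
const interval = keyframeIntervalSeconds(60, 25) // 60 frames / 25 fps = 2.4 s
```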
- Remove autoplay/loop so video timeline is seekable
- Skip card generation for platforms without recordings
- Add --pr-context flag to qa-video-review.ts so Gemini evaluates
against PR purpose instead of just describing what happened
- Workflow now builds pr-context.txt from PR title/body/diff
The Codex agent was spending 35s browsing the "Getting Started" template
gallery instead of testing the PR's changes. Pre-seeding this setting
via the ComfyUI API ensures the agent lands directly in the graph editor.
The Codex agent was spending time on login flow, template browsing,
and general smoke testing instead of testing the PR's actual changes.
Changes:
- Add 30-second time budget for video recording
- Move video-start AFTER login and editor verification
- Explicitly prohibit template browsing and sidebar exploration
- Reduce test steps to 3-6 targeted actions
- Restructure prompt with clear Instructions/Rules sections
Replace crude sed-based markdown conversion with client-side
rendering via marked.js CDN. Adds proper table, list, and
code styling for the report section.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace the OpenAI GPT-based frame extraction approach (ffmpeg + screenshots)
with Gemini 2.5 Flash's native video understanding. This eliminates false
positives from frame-based analysis (e.g. "black screen = critical bug" during
page transitions) and produces dramatically better QA reviews.
Changes:
- Remove ffmpeg frame extraction, ffprobe duration detection, and all related
logic (~365 lines removed)
- Add @google/generative-ai SDK for native video/mp4 upload to Gemini
- Update CLI: remove --max-frames, --min-interval-seconds, --keep-frames flags
- Update env: OPENAI_API_KEY → GEMINI_API_KEY
- Update workflow: swap API key secret and model in pr-qa.yaml
- Update report: replace "Frames analyzed" with "Video size"
- Add note in prompt that brief black frames during transitions are normal
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
playwright-cli doesn't support 'evaluate' command. Instead, instruct
Codex to quickly fill the username input and click Next on user-select
page BEFORE starting video recording, so the video only shows actual
QA testing.
storageState config doesn't work with playwright-cli. Instead, use
evaluate to set Comfy.userId/userName after opening the page, then
navigate back. This skips user-select before video-start so the
recording only shows actual QA testing.
Write a Playwright storageState JSON with Comfy.userId/userName pre-set
so the app loads directly to the graph editor. Saves ~40s per QA run
that was wasted on navigating the user-select page.
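
A sketch of what such a file could contain, following Playwright's documented storage state layout (`cookies` plus per-origin `localStorage` entries). The origin URL and user values passed in are placeholders, and `buildStorageState` is a hypothetical helper.

```typescript
interface StorageState {
  cookies: unknown[]
  origins: {
    origin: string
    localStorage: { name: string; value: string }[]
  }[]
}

// Pre-set the two keys named in the message above for a single origin.
function buildStorageState(origin: string, userId: string, userName: string): StorageState {
  return {
    cookies: [],
    origins: [
      {
        origin,
        localStorage: [
          { name: 'Comfy.userId', value: userId },
          { name: 'Comfy.userName', value: userName }
        ]
      }
    ]
  }
}

// Placeholder values; the real origin and IDs depend on the CI setup.
const storageStateJson = JSON.stringify(
  buildStorageState('http://127.0.0.1:8188', 'example-user-id', 'qa-ci'),
  null,
  2
)
```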
The convert step was using find which picked up a 0-byte file from
playwright's videos/ directory instead of the valid qa-session.webm.
Now prefers qa-session.webm explicitly and skips empty files.
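
The selection rule can be sketched as a small pure function. The `VideoFile` shape and `pickRecording` name are assumptions for illustration; the actual convert step is a shell script.

```typescript
interface VideoFile {
  path: string
  sizeBytes: number
}

const PREFERRED_NAME = 'qa-session.webm'

// Skip empty files, prefer the explicitly named recording,
// otherwise fall back to the first non-empty candidate.
function pickRecording(files: VideoFile[]): VideoFile | undefined {
  const nonEmpty = files.filter((f) => f.sizeBytes > 0)
  const preferred = nonEmpty.find(
    (f) => f.path === PREFERRED_NAME || f.path.endsWith(`/${PREFERRED_NAME}`)
  )
  return preferred ?? nonEmpty[0]
}
```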
Codex was using pnpm dlx instead of the global playwright-cli.
Pre-install chromium in setup step and make prompt explicit about
using the global command directly without pnpm/npx.
Replace claude --print with codex exec for cheaper QA runs.
Uses codex-mini-latest model ($1.50/$6 vs Sonnet $3/$15).
Uses existing OPENAI_API_KEY secret (no new secrets needed).
- Replace saveVideo config (didn't produce video) with explicit
playwright-cli video-start/video-stop commands in QA prompt
- Remove apt-get install ffmpeg step (pre-installed on GH runners)
- Switch video review model from gpt-4o to gpt-4.1-mini
- Enable saveVideo in playwright-cli config for real video recording
- Replace screenshot stitching with webm→mp4 conversion
- Move video review step before deploy so reports are included
- Add GPT video review reports inline on the Cloudflare Pages site
- Each video card now has expandable "GPT Video Review" section
- Set .playwright/cli.config.json with outputDir pointing to screenshots/
- This way bare 'playwright-cli screenshot' auto-saves to the right place
- Create screenshot directory before Claude runs (don't rely on Claude)
- Collect step now searches working directory for stray PNGs
- Simplified prompt: no --filename needed, just 'playwright-cli screenshot'
Screenshots were saved to artifact root but stitch looked in frames/.
Now: prompt tells Claude to save to screenshots/ dir with numbered names,
collect step consolidates PNGs there, stitch step globs from screenshots/.
Removed video-start/video-stop (Claude doesn't use them).
- Add playwright-cli config with outputDir and saveVideo
- Use video-start/video-stop instead of relying on screenshot frames
- Add fallback artifact collection from .playwright-cli/ default dir
- Simplify prompts to focus on video recording workflow
The escaped \$QA_ARTIFACTS in the heredoc produced literal text
'$QA_ARTIFACTS' in the prompt. Claude's Bash tool didn't reliably
expand this env var, so no screenshots or reports were saved.
Remove the escapes so the heredoc expands the variable to the actual
path (e.g. /home/runner/work/_temp/qa-artifacts).
Backtick-wrapped playwright-cli examples in the unquoted heredoc were
being interpreted as bash command substitution, producing empty prompts.
Replace backtick syntax with plain "Run:" prefixed commands.
- Remove all Xvfb/ffmpeg screen recording infrastructure from qa job
(captured blank display since playwright-cli runs headless)
- Add screenshot instructions to QA prompts: Claude saves sequential
frames to $QA_ARTIFACTS/frames/ after every interaction
- Stitch screenshots into video via ffmpeg in report job (2fps)
- Merge video-review job into report job (4 jobs → 3 jobs)
- Unified PR comment with video links + video review in <details> collapse
- Clean up stale QA_VIDEO_REVIEW_COMMENT markers from prior runs
Move extra_server_params input to env var to prevent shell injection
from untrusted input. Replace wait-for-it pip dependency with a
cross-platform curl polling loop.
Add Claude Code skills and a label-triggered QA workflow:
- .claude/skills/comfy-qa/SKILL.md: 12-category QA test plan using
playwright-cli for browser automation
- .github/workflows/pr-qa.yaml: CI workflow triggered by qa-changes
(focused, Linux) or qa-full (3-OS matrix) labels. Records screen via
ffmpeg, runs Claude CLI with playwright-cli, deploys video gallery to
Cloudflare Pages, posts PR comment with GIF thumbnails, and runs
OpenAI vision-based video review
- scripts/qa-video-review.ts: frame extraction + GPT-4o analysis
- scripts/qa-video-review.test.ts: unit tests for video review
- knip.config.ts: resolve knip errors for ingest-types package
*PR Created by the Glary-Bot Agent*
---
## Summary
- Replace all `as unknown as Type` assertions in 59 unit test files with
type-safe `@total-typescript/shoehorn` functions
- Use `fromPartial<Type>()` for partial mock objects where deep-partial
type-checks (21 files)
- Use `fromAny<Type>()` for fundamentally incompatible types: null,
undefined, primitives, variables, class expressions, and mocks with
test-specific extra properties that `PartialDeepObject` rejects
(remaining files)
- All explicit type parameters preserved so TypeScript return types are
correct
- Browser test `.spec.ts` files excluded (shoehorn unavailable in
`page.evaluate` browser context)
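
A rough before/after sketch of the migration. To keep the snippet runnable without the dependency, `fromPartial` and `fromAny` below are simplified local stand-ins; the real helpers are imported from `@total-typescript/shoehorn` and have richer signatures. `NodeDef` and `titleOf` are hypothetical.

```typescript
// Simplified stand-ins for the shoehorn helpers (assumption, not the library source).
type DeepPartial<T> = T extends object ? { [K in keyof T]?: DeepPartial<T[K]> } : T
function fromPartial<T>(mock: DeepPartial<T>): T {
  return mock as unknown as T
}
function fromAny<T>(mock: unknown): T {
  return mock as T
}

// Hypothetical type and function under test.
interface NodeDef {
  id: number
  title: string
  widgets: { name: string; value: number }[]
}

function titleOf(node: NodeDef): string {
  if (node == null) return 'untitled'
  return `${node.id}:${node.title}`
}

// Before: titleOf({ id: 1, title: 'KSampler' } as unknown as NodeDef)
// After: the partial object is still type-checked against NodeDef's fields.
const partialNode = fromPartial<NodeDef>({ id: 1, title: 'KSampler' })

// null is fundamentally incompatible with NodeDef, so fromAny is used instead.
const nullNode = fromAny<NodeDef>(null)
```

With `fromPartial`, a typo such as `titel: 'KSampler'` fails to compile, which the `as unknown as` form would have let through.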
## Verification
- `pnpm typecheck` ✅
- `pnpm lint` ✅
- `pnpm format` ✅
- Pre-commit hooks passed (format + oxlint + eslint + typecheck)
- Migrated test files verified passing (ran representative subset)
- No test behavior changes — only type assertion syntax changed
- No UI changes — screenshots not applicable
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10761-test-migrate-as-unknown-as-to-total-typescript-shoehorn-3336d73d365081f6b8adc44db5dcc380)
by [Unito](https://www.unito.io)
---------
Co-authored-by: Glary-Bot <glary-bot@users.noreply.github.com>
Co-authored-by: Amp <amp@ampcode.com>
*PR Created by the Glary-Bot Agent*
---
## Summary
Fixes the `Bulk context menu shows when multiple assets selected` test
that is failing on main.
**Root cause — two issues:**
1. `click({ modifiers: ['ControlOrMeta'] })` does not fire `keydown`
events that VueUse's `useKeyModifier('Control')` tracks (used in
`useAssetSelection.ts`). Multi-select silently fails because the
composable never sees the Control key pressed. Fix: use
`keyboard.down('Control')` / `keyboard.up('Control')` around the click.
2. `click({ button: 'right' })` can be intercepted by canvas overlays
(documented gotcha in `browser_tests/AGENTS.md`). Fix: use
`dispatchEvent('contextmenu', { bubbles: true, cancelable: true })`
which bypasses overlay interception.
Also removed the `toPass()` retry wrapper since the root causes are now
addressed directly.
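
The two patterns can be sketched as small helpers. `KeyboardLike` and `LocatorLike` are minimal structural types so the snippet compiles without `@playwright/test`; Playwright's `page.keyboard` and `Locator` expose methods with these names. The helper names are hypothetical.

```typescript
interface KeyboardLike {
  down(key: string): Promise<void>
  up(key: string): Promise<void>
}

interface LocatorLike {
  click(): Promise<void>
  dispatchEvent(type: string, eventInit?: Record<string, unknown>): Promise<void>
}

// Hold Control with real keydown/keyup events around the click so that
// key-state listeners observe the modifier.
async function ctrlClick(keyboard: KeyboardLike, target: LocatorLike): Promise<void> {
  await keyboard.down('Control')
  try {
    await target.click()
  } finally {
    // Always release the key, even if the click throws.
    await keyboard.up('Control')
  }
}

// Dispatch the event on the element itself so an overlay above it
// cannot intercept a pointer-based right-click.
async function openContextMenu(target: LocatorLike): Promise<void> {
  await target.dispatchEvent('contextmenu', { bubbles: true, cancelable: true })
}
```

In a spec these would be called as `ctrlClick(page.keyboard, locator)` and `openContextMenu(locator)`.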
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10762-fix-test-fix-bulk-context-menu-test-using-correct-Playwright-patterns-3346d73d3650811c843ee4a39d3ab305)
by [Unito](https://www.unito.io)
---------
Co-authored-by: Glary-Bot <glary-bot@users.noreply.github.com>
## What changed
Added a runtime-safe `#e2e/*` alias for `browser_tests`, updated the
browser test docs, and migrated a representative fixture/spec import
path to the new convention.
## Why
`@/*` only covers `src/`, so browser test imports were falling back to
deep relative paths. `#e2e/*` resolves in both Node/Playwright runtime
and TypeScript.
## Validation
- `pnpm format`
- `pnpm typecheck:browser`
- `pnpm exec playwright test browser_tests/tests/actionbar.spec.ts --list`
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10735-test-add-runtime-safe-browser_tests-alias-3336d73d36508122b253cb36a4ead1c1)
by [Unito](https://www.unito.io)
---------
Co-authored-by: Alexander Brown <drjkl@comfy.org>
## Summary
- Closing an inactive workflow tab and clicking "Save" overwrites that
workflow with the **active** tab's content, causing permanent data loss
- `saveWorkflow()` and `saveWorkflowAs()` call `checkState()` which
serializes `app.rootGraph` (the active canvas) into the inactive
workflow's `changeTracker.activeState`
- Guard `checkState()` to only run when the workflow being saved is the
active one — in both `saveWorkflow` and `saveWorkflowAs`
## Linked Issues
- Fixes https://github.com/Comfy-Org/ComfyUI/issues/13230
## Root Cause
PR #9137 (commit `9fb93a5b0`, v1.41.7) added
`workflow.changeTracker?.checkState()` inside `saveWorkflow()` and
`saveWorkflowAs()`. `checkState()` always serializes `app.rootGraph` —
the graph on the canvas. When called on an inactive tab's change
tracker, it captures the active tab's data instead.
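
A simplified model of the bug and the guard, assuming a hypothetical `App` shape in which `canvas` plays the role of `app.rootGraph`. This is a sketch of the idea, not the store's actual code.

```typescript
interface GraphState {
  nodes: string[]
}

interface Workflow {
  name: string
  activeState: GraphState
}

// Hypothetical stand-in for the app: `canvas` corresponds to app.rootGraph.
interface App {
  active: Workflow
  canvas: GraphState
}

// Mirrors the described behaviour: always serializes whatever is on the canvas.
function checkState(app: App, workflow: Workflow): void {
  workflow.activeState = { nodes: [...app.canvas.nodes] }
}

// Guarded save: refresh the snapshot only when the workflow owns the canvas.
function saveWorkflow(app: App, workflow: Workflow, persist: (w: Workflow) => void): void {
  if (workflow === app.active) checkState(app, workflow)
  persist(workflow)
}
```

Without the `workflow === app.active` check, saving an inactive workflow would persist the active tab's nodes under the inactive workflow's name.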
## Test plan
- [x] E2E: "Closing an inactive tab with save preserves its own content"
— persisted workflow B with added node, close while A is active, re-open
and verify
- [x] E2E: "Closing an inactive unsaved tab with save preserves its own
content" — temporary workflow B with added node, close while A is
active, save-as with filename, re-open and verify
- [x] Manual: open A and B, edit B, switch to A, close B tab, click
Save, re-open B — content should be B's not A's
## What changed
Removed stale `tests-ui` configuration and documentation references from
the repo.
## Why
`tests-ui/` no longer exists, but the repo still carried:
- a dead `@tests-ui/*` tsconfig path
- stale `tests-ui/**/*` include
- a Vite watch ignore for a missing directory
- documentation examples that still referenced the old path
## Validation
- `pnpm format:check`
- `pnpm typecheck`
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10736-chore-remove-stale-tests-ui-config-3336d73d3650814a98bedfc113b6eb9b)
by [Unito](https://www.unito.io)
## Summary
Replacement for https://github.com/Comfy-Org/ComfyUI_frontend/pull/9201.
The first commit squashes
https://github.com/Comfy-Org/ComfyUI_frontend/pull/9201 and resolves the
conflicts.
The second commit contains the changes needed to:
- Enable GLSL live preview on SubgraphNodes by detecting the inner
GLSLShader and rendering its preview directly on the parent SubgraphNode
- Previously, SubgraphNodes containing a GLSLShader showed no live
preview at all. To achieve this:
- Read shader source, uniform values, and renderer config from the inner
GLSLShader's widgets
- Trace IMAGE inputs through the subgraph boundary so the inner shader
can use images connected to the SubgraphNode's outer inputs
- Set preview output using the inner node's locator ID so the promoted
preview system picks it up on the SubgraphNode
- Extract setNodePreviewsByLocatorId from nodeOutputStore to support
setting previews by locator ID directly
- Fix graphId to use rootGraph.id for widget store lookups (was using
graph.id, which broke lookups for nodes inside subgraphs)
- Read uniform values from connected upstream nodes, not just local
widgets
- Fix blob URL lifecycle: use the store's
createSharedObjectUrl/releaseSharedObjectUrl reference-counting system
instead of manual revoke, preventing leaks on composable re-creation
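
The reference-counting idea behind the blob URL fix can be sketched as follows. This is an assumption about the general technique, not the store's implementation; `UrlBackend` and `createSharedUrlRegistry` are hypothetical, and in a browser the backend would wrap `URL.createObjectURL` and `URL.revokeObjectURL`.

```typescript
interface UrlBackend {
  create(blob: object): string
  revoke(url: string): void
}

function createSharedUrlRegistry(backend: UrlBackend) {
  const urlByBlob = new Map<object, string>()
  const entryByUrl = new Map<string, { blob: object; refs: number }>()

  return {
    // Return the existing URL for a blob, or create one, and count the holder.
    acquire(blob: object): string {
      let url = urlByBlob.get(blob)
      if (url === undefined) {
        url = backend.create(blob)
        urlByBlob.set(blob, url)
        entryByUrl.set(url, { blob, refs: 0 })
      }
      const entry = entryByUrl.get(url)
      if (entry) entry.refs += 1
      return url
    },
    // Drop one holder; revoke only when the last holder releases.
    release(url: string): void {
      const entry = entryByUrl.get(url)
      if (!entry) return
      entry.refs -= 1
      if (entry.refs > 0) return
      backend.revoke(url)
      entryByUrl.delete(url)
      urlByBlob.delete(entry.blob)
    }
  }
}
```

Because revocation happens only at a count of zero, re-creating a composable that still shares a URL with another consumer neither leaks the URL nor invalidates it early.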
## Screenshot
https://github.com/user-attachments/assets/9623fa32-de39-4a3a-b8b3-28688851390b
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10349-Feat-glsl-live-preview-3296d73d3650814b83aef52ab1962a77)
by [Unito](https://www.unito.io)
## Summary
Extract auth-routing logic (`getAuthHeaderOrThrow`,
`getFirebaseAuthHeaderOrThrow`) from `workspaceApi.ts` into
`authStore.ts`, eliminating a layering violation where the workspace API
re-implemented auth header resolution.
## Changes
- **What**: Moved `getAuthHeaderOrThrow` and
`getFirebaseAuthHeaderOrThrow` from `workspaceApi.ts` to `authStore.ts`.
`workspaceApi.ts` now calls through `useAuthStore()` instead of
re-implementing token resolution. Added tests for the new methods in
`authStore.test.ts`. Updated `authStoreMock.ts` with the new methods.
- **Files**: 4 files changed
## Review Focus
- The `getAuthHeaderOrThrow` / `getFirebaseAuthHeaderOrThrow` methods
throw `AuthStoreError` (auth domain error) — callers in workspace can
catch and re-wrap if needed
- `workspaceApi.ts` is simplified by ~19 lines
## Stack
PR 2/5: #10483 → **This PR** → #10485 → #10486 → #10487
## What
- Add `include: ['src/**/*.{ts,vue}']` to vitest coverage config so ALL
source files appear in reports (previously only imported files showed
up)
- Add `lcov` reporter for CI integration and VS Code coverage gutter
- Add `exclude` patterns for test files, locales, litegraph, assets,
declarations, stories
- Add `test:coverage` npm script
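
For orientation, the coverage options described above might look roughly like this object, which would sit under `test.coverage` in the Vitest config. Only the `include` glob and the `lcov` reporter are taken from this description; the `exclude` globs and the extra `text` reporter are illustrative guesses.

```typescript
// Subset of Vitest coverage options discussed above.
interface CoverageOptions {
  include: string[]
  exclude: string[]
  reporter: string[]
}

const coverage: CoverageOptions = {
  // List every source file, including ones no test imports.
  include: ['src/**/*.{ts,vue}'],
  // Illustrative patterns only; the actual exclude list is not shown here.
  exclude: [
    'src/**/*.test.ts',
    'src/**/*.d.ts',
    'src/**/*.stories.ts',
    'src/locales/**',
    'src/assets/**'
  ],
  // lcov output is what IDE gutters and CI coverage services consume.
  reporter: ['text', 'lcov']
}
```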
## Why
Coverage reports currently only show files that are imported during test
runs. Adding the `include` pattern reveals the true gap — files with
zero coverage that were previously invisible. The lcov reporter enables
IDE integration and future CI coverage comments (Codecov/Coveralls).
## Testing
`npx tsc --noEmit` passes. No behavioral changes — this only affects
coverage reporting configuration.
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10575-config-add-vitest-coverage-include-pattern-lcov-reporter-32f6d73d365081c8b59ad2316dd2b198)
by [Unito](https://www.unito.io)
## Summary
Extract `makeMatcher` and `comfyExpect` from `ComfyPage.ts` into the
standalone `browser_tests/fixtures/utils/customMatchers.ts` module,
reducing the page-object file by ~50 lines.
## Changes
- **What**: Removed duplicate `makeMatcher`/`comfyExpect` definitions
from `ComfyPage.ts`; the canonical implementation now lives in
`customMatchers.ts`. A backward-compatible re-export keeps all existing
imports working.
## Review Focus
- The re-export ensures `import { comfyExpect } from
'../fixtures/ComfyPage'` continues to resolve for all ~25 spec files.
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10652-refactor-extract-comfyExpect-and-makeMatcher-from-ComfyPage-3316d73d365081bf8e7cd7fa324bf9a6)
by [Unito](https://www.unito.io)
---------
Co-authored-by: Alexander Brown <drjkl@comfy.org>
Co-authored-by: GitHub Action <action@github.com>
## Summary
Adds Playwright E2E tests for the QueueClearHistoryDialog component.
## Tests added
- Dialog opens from queue panel history actions menu
- Dialog shows confirmation message with title, description, and assets
note
- Cancel button closes dialog without clearing history
- Close (X) button closes dialog without clearing history
- Confirm clear action triggers queue history clear API call
- Dialog state resets properly after close/reopen
## Task
Part of Test Coverage Q2 Overhaul (DLG-02).
## Conventions
- Uses Vue nodes with new menu enabled (`Comfy.UseNewMenu: 'Top'`)
- Tests read as user stories
- No full-page screenshots
- Proper waits, no sleeps
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10586-test-add-QueueClearHistoryDialog-E2E-tests-DLG-02-3306d73d36508174a07bd9782340a0f7)
by [Unito](https://www.unito.io)
---------
Co-authored-by: GitHub Action <action@github.com>
## Summary
Comprehensive Playwright E2E tests for the properties panel (right
sidebar).
Part of the **Test Coverage Q2 Overhaul** initiative (Phase 2: PNL-01).
## What's included
- **PropertiesPanelHelper** page object in `browser_tests/helpers/` —
locators + action methods for all panel elements
- **35 test cases** covering:
- Open/close via actionbar toggle
- Workflow Overview (no selection): tabs, title, nodes list, global
settings
- Single node selection: title, parameters, info tab, widgets display
- Multi-node selection: item count, node listing, hidden Info tab
- Title editing: pencil icon, edit mode, rename, visibility rules
- Search filtering: query, clear, empty state
- Settings tab: Normal/Bypass/Mute state, color swatches, pinned toggle
- Selection transitions: no-selection ↔ single ↔ multi
- Nodes tab: list all, search filter
- Tab label changes based on selection count
- **Errors tab scaffold** (for @jaeone94 ADD-03)
## Testing
- All tests use Vue nodes with new menu enabled
- Zero flaky tests (proper waits, no sleeps)
- Screenshots scoped to panel elements
## Unblocks
- **ADD-03** (error systems by @jaeone94) — errors tab scaffold ready to
extend
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10548-test-comprehensive-properties-panel-E2E-tests-PNL-01-32f6d73d36508199a216fd8d953d8e18)
by [Unito](https://www.unito.io)
---------
Co-authored-by: GitHub Action <action@github.com>
## What
12 regression tests covering 10 workflow persistence bug gaps, including
the **critical data corruption fix in PR #9531** (pythongosssss) which
previously had ZERO test coverage.
## Why
Deep scan of 37 workflow persistence bugs found 12 E2E-testable gaps
with no regression tests. Workflow persistence is a core reliability
concern — data corruption bugs are the highest risk category.
## Tests
### 🔴 Critical
| Bug | PR | Tests | Description |
|-----|----|-------|-------------|
| Data corruption | #9531 | 2 | checkState during graph loading corrupts workflow data |
| State desync | #9533 | 2 | Rapid tab switching desyncs workflow/graph state |
### 🟡 Medium
| Bug | PR/Commit | Tests | Description |
|-----|-----------|-------|-------------|
| Lost previews | #9380 | 1 | Node output previews lost on tab switch |
| Stale canvas | 44bb6f13 | 1 | Canvas not cleared before loading new workflow |
| Widget loss | #7648 | 1 | Widget values lost on graph change |
| API format | #9694 | 1 | API format workflows fail with missing nodes |
| Paste duplication | #8259 | 1 | Middle-click paste duplicates workflow |
| Blob URLs | #8715 | 1 | Transient blob: URLs in serialization |
### 🟢 Low
| Bug | PR/Commit | Tests | Description |
|-----|-----------|-------|-------------|
| Locale break | #8963 | 1 | Locale change breaks workflows |
| Panel drift | — | 1 | Splitter panel size drift |
## Conventions
- All tests use Vue nodes + new menu enabled
- Each test documents which PR/commit it guards against regressing
- Proper waits (no sleeps)
- Screenshots scoped to relevant elements
- Tests read like user stories
## 🎉 Shoutout
PR #9531 by @pythongosssss was a critical data corruption fix that now
has regression test coverage for the first time.
Part of: Test Coverage Q2 Overhaul (REG-01)
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10547-test-12-workflow-persistence-regression-tests-incl-critical-PR-9531-32f6d73d3650818796c6c5950c77f6d1)
by [Unito](https://www.unito.io)
---------
Co-authored-by: GitHub Action <action@github.com>
## Summary
Add deterministic mock data fixtures for browser tests so they can use
`page.route()` to intercept API calls without depending on a live
backend.
## Changes
- **`browser_tests/fixtures/data/nodeDefinitions.ts`** — Mock
`ComfyNodeDef` objects for KSampler, CheckpointLoaderSimple, and
CLIPTextEncode
- **`browser_tests/fixtures/data/systemStats.ts`** — Mock `SystemStats`
with realistic RTX 4090 GPU info
- **`browser_tests/fixtures/data/README.md`** — Usage guide for
`page.route()` interception
All fixtures are typed against the Zod schemas in `src/schemas/` and
pass `pnpm typecheck:browser`.
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10662-test-add-mock-data-fixtures-for-backend-API-responses-3316d73d3650813ea5c8c1faa215db63)
by [Unito](https://www.unito.io)
---------
Co-authored-by: dante01yoon <bunggl@naver.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: GitHub Action <action@github.com>
## Summary
Adds Playwright E2E tests for the SignIn dialog component and its
sub-forms.
## Tests added
- Dialog opens from login button in topbar
- Sign In form is the default view with email/password fields
- Toggle between Sign In and Sign Up forms
- API Key form navigation (forward and back)
- Terms of Service and Privacy Policy links present
- Form field presence verification
- Dialog close behavior (close button and Escape key)
- Forgot password link presence
- 'Or continue with' divider and API key button
## Notes
- Tests focus on UI navigation and element presence (no real Firebase
auth in test env)
- Dialog opened via `extensionManager.dialog.showSignInDialog()` API
- All selectors use stable IDs from the component source
(`#comfy-org-sign-in-email`, etc.)
## Task
Part of Test Coverage Q2 Overhaul (DLG-04).
## Conventions
- Uses Vue nodes with new menu enabled (`Comfy.UseNewMenu: 'Top'`)
- Tests read as user stories
- No full-page screenshots
- Proper waits, no sleeps
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10587-test-add-SignIn-dialog-E2E-tests-DLG-04-3306d73d3650815db171f8c5228e2cf3)
by [Unito](https://www.unito.io)
---------
Co-authored-by: GitHub Action <action@github.com>
## Summary
Expose `renderMarkdownToHtml()` on the `ExtensionManager` interface so
custom node extensions can render markdown to sanitized HTML without
bundling their own copies of `marked`/`DOMPurify`.
## Motivation
Multiple custom node packs (KJNodes, comfy_mtb, rgthree-comfy) bundle
their own markdown rendering libraries to implement help popups on
nodes. This causes:
- **Cloud breakage**: KJNodes uses a `kjweb_async` pattern (custom
aiohttp static route) to lazily load `marked.min.js` and
`purify.min.js`. This 404s on Cloud because the custom route is not
registered.
- **Redundant bundling**: Both `marked` (^15.0.11) and `dompurify`
(^3.2.5) are already direct dependencies of the frontend, used
internally by `markdownRendererUtil.ts`, `NodePreview.vue`,
`WhatsNewPopup.vue`, etc.
- **XSS risk**: Custom nodes using raw `marked` without `DOMPurify`
could introduce XSS vulnerabilities.
By exposing the existing `renderMarkdownToHtml()` through the official
`ExtensionManager` API, custom nodes can:
```js
const html = app.extensionManager.renderMarkdownToHtml(nodeData.description)
```
...instead of bundling and loading their own copies.
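
Extensions that also need to run on frontends without this method could feature-detect it. The sketch below assumes only the `renderMarkdownToHtml(markdown, baseUrl?)` signature listed under Changes; `MarkdownCapableManager`, `escapeHtml`, and `renderNodeHelp` are hypothetical names.

```typescript
// Structural slice of the manager; only this one method is assumed.
interface MarkdownCapableManager {
  renderMarkdownToHtml?: (markdown: string, baseUrl?: string) => string
}

// Escape text for the fallback path so raw markdown is never injected as HTML.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
}

// Use the built-in sanitizing renderer when present, plain escaped text otherwise.
function renderNodeHelp(manager: MarkdownCapableManager, markdown: string): string {
  if (typeof manager.renderMarkdownToHtml === 'function') {
    return manager.renderMarkdownToHtml(markdown)
  }
  return `<pre>${escapeHtml(markdown)}</pre>`
}
```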
## Changes
- **`src/types/extensionTypes.ts`**: Add `renderMarkdownToHtml(markdown:
string, baseUrl?: string): string` to the `ExtensionManager` interface
with JSDoc.
- **`src/stores/workspaceStore.ts`**: Import and re-export
`renderMarkdownToHtml` from `@/utils/markdownRendererUtil`.
## Impact
- **Zero bundle size increase** — the function and its dependencies are
already bundled in the `vendor-markdown` chunk.
- **No breaking changes** — purely additive to the `ExtensionManager`
interface.
- **Follows existing pattern** — same approach as `toast`, `dialog`,
`command`, `setting` on `ExtensionManager`.
Related: #TBD (long-term plan for custom node extension library
dependencies)
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10700-feat-expose-renderMarkdownToHtml-on-ExtensionManager-3326d73d36508149bc1dc6bb45e7c077)
by [Unito](https://www.unito.io)
## Summary
Extract `assetPath` from a `ComfyPage` method to a standalone pure
function, removing unnecessary coupling to the page object.
## Changes
- **What**: Moved `assetPath` to
`browser_tests/fixtures/utils/paths.ts`. `DragDropHelper` and
`WorkflowHelper` import it directly instead of receiving it via
`ComfyPage`. `ComfyPage.assetPath` kept as thin delegate for backward
compat.
## Review Focus
Structural-only refactor — no behavioral changes. The function was
already pure (no `this`/`page` usage).
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10651-refactor-extract-assetPath-as-standalone-pure-function-3316d73d365081c0b0e0ce6dde57ef8e)
by [Unito](https://www.unito.io)
## Summary
Add guidance to `docs/guidance/playwright.md` that new node-specific
assertions should be methods on page objects/helpers rather than new
`comfyExpect` custom matchers.
## Changes
- **What**: New "Custom Assertions" section in Playwright guidance
documenting that existing `comfyExpect` matchers are fine to use, but
new assertions should go on the page object for IntelliSense
discoverability.
## Review Focus
Documentation-only change. No code refactoring — this is a convention
for new code only.
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10660-docs-add-convention-for-new-assertions-prefer-page-objects-over-custom-matchers-3316d73d3650816d97a8fbbdc33f6b75)
by [Unito](https://www.unito.io)
---------
Co-authored-by: GitHub Action <action@github.com>
## Summary
Remove the exclusion filter that prevented backend-mirrored endpoint
types from being generated in `@comfyorg/ingest-types`.
## Changes
- **What**: The `openapi-ts.config.ts` excluded all endpoints shared
with the ComfyUI Python backend (system_stats, object_info, prompt,
queue, history, settings, userdata, etc.). Since the cloud ingest API
mirrors the backend, these types should be generated from the OpenAPI
spec as the canonical source. This adds ~250 new types and Zod schemas
covering previously excluded endpoints.
- **Breaking**: None. This only adds new exported types — no existing
types or imports are changed.
## Review Focus
- The cloud ingest API is designed to mirror the ComfyUI Python backend.
The original exclusion filter was added to avoid duplication with
`src/schemas/apiSchema.ts`, but the generated types should be the
canonical source since they are auto-generated from the OpenAPI spec.
- A follow-up PR will migrate imports in `src/` from `apiSchema.ts` to
`@comfyorg/ingest-types` where applicable.
- Webhooks and internal analytics endpoints remain excluded
(server-to-server, not frontend-relevant).
Related: #10662
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10697-refactor-include-backend-mirrored-endpoints-in-ingest-types-codegen-3326d73d365081569614f743ab6f074d)
by [Unito](https://www.unito.io)
---------
Co-authored-by: GitHub Action <action@github.com>
## Motivation
Browser tests mock API responses with `route.fulfill()` using untyped
inline JSON. When the OpenAPI spec changes, these mocks silently drift —
mismatches aren't caught at compile time and only surface as test
failures at runtime.
We already have auto-generated types from OpenAPI and manual Zod
schemas. This PR makes those types the source of truth for test mock
data.
From the Mar 27 PR review session action item: "instruct agents to use
schemas and types when writing browser tests."
## Type packages and their API coverage
The frontend has two OpenAPI-generated type packages, each targeting a
different backend API with a different code generation tool:
| Package | Target API | Generator | TS types | Zod schemas |
|---------|-----------|-----------|----------|-------------|
| `@comfyorg/registry-types` | Registry API (node packages, releases, subscriptions, customers) | `openapi-typescript` | Yes | **No** |
| `@comfyorg/ingest-types` | Ingest API (hub workflows, asset uploads, workspaces) | `@hey-api/openapi-ts` | Yes | Yes |
Additionally, Python backend endpoints (`/api/queue`, `/api/features`,
`/api/settings`, etc.) are typed via manual Zod schemas in
`src/schemas/apiSchema.ts`.
This PR applies **compile-time type checking** using these existing
types. Runtime validation via Zod `.parse()` is not yet possible for all
endpoints because `registry-types` does not generate Zod schemas — this
requires a separate migration of `registry-types` to
`@hey-api/openapi-ts` (#10674).
## Summary
- Add "Typed API Mocks" guideline to `docs/guidance/playwright.md` with
a sources-of-truth table mapping endpoint categories to their type
packages
- Add rule to `AGENTS.md` Playwright section requiring typed mock data
- Refactor `releaseNotifications.spec.ts` to use `ReleaseNote` type
(from `registry-types`) via `createMockRelease()` factory
- Annotate template mock in `templates.spec.ts` with
`WorkflowTemplates[]` type
Refs #10656
## Example workflow: writing a new typed E2E test mock
When adding a new `route.fulfill()` mock, follow these steps:
### 1. Identify the type source
Check which API the endpoint belongs to:
| Endpoint category | Type source | Zod available |
|---|---|---|
| Ingest API (hub, billing, workflows) | `@comfyorg/ingest-types` | Yes, use `.parse()` |
| Registry API (releases, nodes, publishers) | `@comfyorg/registry-types` | Not yet (#10674), TS type only |
| Python backend (queue, history, settings) | `src/schemas/apiSchema.ts` | Yes, use `z.infer` |
| Templates | `src/platform/workflow/templates/types/template.ts` | No, TS type only |
### 2. Create a typed factory (with Zod when available)
**Ingest API endpoints** — Zod schemas exist, use `.parse()` for runtime
validation:
```typescript
import { zBillingStatusResponse } from '@comfyorg/ingest-types/zod'
import type { BillingStatusResponse } from '@comfyorg/ingest-types'

function createMockBillingStatus(
  overrides?: Partial<BillingStatusResponse>
): BillingStatusResponse {
  return zBillingStatusResponse.parse({
    plan: 'free',
    credits_remaining: 100,
    renewal_date: '2026-04-28T00:00:00Z',
    ...overrides
  })
}
```
**Registry API endpoints** — TS type only (Zod not yet generated):
```typescript
import type { ReleaseNote } from '../../src/platform/updates/common/releaseService'

function createMockRelease(
  overrides?: Partial<ReleaseNote>
): ReleaseNote {
  return {
    id: 1,
    project: 'comfyui',
    version: 'v0.3.44',
    attention: 'medium',
    content: '## New Features',
    published_at: new Date().toISOString(),
    ...overrides
  }
}
```
### 3. Use in test
```typescript
test('should show upgrade banner for free plan', async ({ comfyPage }) => {
  await comfyPage.page.route('**/billing/status', async (route) => {
    await route.fulfill({
      status: 200,
      contentType: 'application/json',
      body: JSON.stringify(createMockBillingStatus({ plan: 'free' }))
    })
  })

  await comfyPage.setup()
  await expect(comfyPage.page.getByText('Upgrade')).toBeVisible()
})
```
The factory pattern keeps test bodies focused on **what varies** (the
override) rather than the full response shape.
## Scope decisions
| File | Decision | Reason |
|------|----------|--------|
| `releaseNotifications.spec.ts` | Typed | `ReleaseNote` type available from `registry-types` |
| `templates.spec.ts` | Typed | `WorkflowTemplates` type available in `src/platform/workflow/templates/types/` |
| `QueueHelper.ts` | Skipped | Dead code: instantiated but never called in any test |
| `FeatureFlagHelper.ts` | Skipped | Response type is inherently `Record<string, unknown>`, no stronger type exists |
| Fixture factories | Deferred | Coordinate with Ben's fixture restructuring work to avoid duplication |
## Follow-up work
Sub-issues of #10656:
- #10670 — Clean up dead `QueueHelper` or rewrite against `/api/jobs`
endpoint
- #10671 — Expand typed factory pattern to more endpoints
- #10672 — Evaluate OpenAPI generation for excluded Python backend
endpoints
- #10674 — Migrate `registry-types` from `openapi-typescript` to
`@hey-api/openapi-ts` to enable Zod schema generation
## Test plan
- [x] `pnpm typecheck:browser` passes
- [x] `pnpm lint` passes
- [ ] Existing `releaseNotifications` and `templates` tests pass in CI
## Summary
Document the agreed-upon architectural separation for browser test
fixtures:
- `fixtures/data/` — Static test data (mock API responses, workflow
JSONs, node definitions)
- `fixtures/components/` — Page object components (locators, user
interactions)
- `fixtures/helpers/` — Focused helper classes (domain-specific actions)
- `fixtures/utils/` — Pure utility functions (no page dependency)
## Changes
- **`browser_tests/AGENTS.md`** — Added architectural separation section
with clear rules for each directory
- **`browser_tests/fixtures/data/README.md`** (new) — Explains the data
directory purpose and what belongs here vs `assets/`
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10645-docs-document-fixture-page-object-separation-in-browser-tests-3316d73d365081febf52d165282c68f6)
by [Unito](https://www.unito.io)
---------
Co-authored-by: GitHub Action <action@github.com>
## What
Adds a `cloud` Playwright project so E2E tests can run against
`DISTRIBUTION=cloud` builds, with `@cloud` / `@oss` test tagging.
## Why
100+ usages of `isCloud` / `DISTRIBUTION` across 9 categories (API
routing, UI visibility, settings, auth). Zero cloud test infrastructure
existed — cloud-specific UI components (LoginButton, SubscribeButton,
etc.) had no E2E coverage path.
## Investigation: Runtime Toggle
Investigated whether `isCloud` could be made runtime-toggleable in
dev/test mode (via `window.__FORCE_CLOUD__`). **Not feasible** —
`__DISTRIBUTION__` is a Vite `define` compile-time constant used for
dead-code elimination. Runtime override would break tree-shaking in
production.
Full investigation:
`research/architecture/cloud-runtime-toggle-investigation.md`
## What's included
### Playwright Config
- New `cloud` project alongside existing `chromium`
- Cloud project: `grep: /@cloud/` — only runs `@cloud` tagged tests
- Chromium project: `grepInvert: /@cloud/` — excludes cloud tests
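
As a rough illustration of how the two projects divide the suite, the sketch below models the `grep` / `grepInvert` filters as plain regular-expression checks against test titles. The `ProjectFilter` type and `shouldRun` helper are hypothetical stand-ins written for this note, not Playwright APIs; Playwright applies its own matching internally.

```typescript
// Hypothetical model of a Playwright project's title filters.
interface ProjectFilter {
  name: string
  grep?: RegExp
  grepInvert?: RegExp
}

const cloudProject: ProjectFilter = { name: 'cloud', grep: /@cloud/ }
const chromiumProject: ProjectFilter = {
  name: 'chromium',
  grepInvert: /@cloud/
}

// A test is selected when it matches `grep` (if set)
// and does not match `grepInvert` (if set).
function shouldRun(title: string, project: ProjectFilter): boolean {
  if (project.grep && !project.grep.test(title)) return false
  if (project.grepInvert && project.grepInvert.test(title)) return false
  return true
}
```

Under this model a `@cloud` title is selected only by `cloud`, while `@oss` and untagged titles are selected only by `chromium`.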
### Build Script
- `npm run build:cloud` → `DISTRIBUTION=cloud vite build`
### Test Tagging Convention
```typescript
test('works in both', async () => { ... });
test('subscription button visible @cloud', async () => { ... });
test('install manager prompt @oss', async () => { ... });
```
### Example Tests
- 2 cloud-only tests validating cloud UI visibility
## NOT included (future work)
- CI workflow job for cloud tests (separate PR)
- Cloud project is opt-in — not run by default locally
## Unblocks
- Cloud-specific E2E tests for entire team
- TB-03 LoginButton, TB-04 SubscribeButton (@Kaili Yang)
- DLG-04 SignIn, DLG-06 CancelSubscription
Part of: Test Coverage Q2 Overhaul
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10546-test-infra-cloud-Playwright-project-with-cloud-oss-tagging-32f6d73d3650810ebb59dea8ce4891e9)
by [Unito](https://www.unito.io)
---------
Co-authored-by: GitHub Action <action@github.com>
Co-authored-by: Alexander Brown <drjkl@comfy.org>
## Summary
Document the recommended pattern for adding new domain-specific test
helpers as Playwright fixtures via `base.extend()` instead of attaching
them to `ComfyPage`.
## Changes
- **What**: Added "Creating New Test Helpers" section to
`docs/guidance/playwright.md` with fixture extension example and rules
## Review Focus
Documentation-only change. Verify the example code matches the existing
pattern in `browser_tests/fixtures/ComfyPage.ts`.
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10653-docs-document-Playwright-fixture-injection-pattern-for-new-helpers-3316d73d36508145b402cf02a5c2c696)
by [Unito](https://www.unito.io)
---------
Co-authored-by: Alexander Brown <drjkl@comfy.org>
## Summary
Document the arrange/act/assert pattern for Playwright browser tests to
keep mock setup out of test bodies.
## Changes
- **What**: Added "Test Structure: Arrange/Act/Assert" section to
`docs/guidance/playwright.md` documenting that mock setup belongs in
`beforeEach`/fixtures, test bodies should only act and assert, and
`clearAllMocks` should never be called mid-test. Includes good/bad
examples.
## Review Focus
Docs-only change — no code impact.
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10657-docs-add-arrange-act-assert-pattern-guidance-for-browser-tests-3316d73d365081aa92c0fb6442084484)
by [Unito](https://www.unito.io)
---------
Co-authored-by: GitHub Action <action@github.com>
Co-authored-by: Alexander Brown <drjkl@comfy.org>
## Summary
From the primordial entropy of 17 scattered spec files — a formless
sprawl of mixed concerns and inconsistent naming — emerges a clean,
domain-organized hierarchy. Order triumphs over chaos.
## Changes
- **What**: Reorganize all subgraph E2E tests from 17 flat files in
`browser_tests/tests/` into 10 domain-grouped files under
`browser_tests/tests/subgraph/`.
| File | Tests | Domain |
|------|-------|--------|
| `subgraphSlots` | 16 | I/O slot CRUD, rename, alignment, promoted slot position |
| `subgraphPromotion` | 22 | Auto-promote, visibility, reactivity, context menu, cleanup |
| `subgraphSerialization` | 16 | Hydration, round-trip, legacy formats, ID remapping |
| `subgraphNavigation` | 10 | Breadcrumb, viewport, hotkeys, progress state |
| `subgraphNested` | 9 | Configure order, duplicate names, pack values, stale proxies |
| `subgraphLifecycle` | 7 | Source removal cleanup, pseudo-preview lifecycle |
| `subgraphPromotionDom` | 6 | DOM widget persistence, cleanup, positioning |
| `subgraphCrud` | 5 | Create, delete, copy, unpack |
| `subgraphSearch` | 3 | Search aliases, description, persistence |
| `subgraphOperations` | 2 | Copy/paste inside, undo/redo inside |
Where once the monolith `subgraph.spec.ts` (856 lines) mixed slot CRUD
with hotkeys, DOM widgets with navigation, and copy/paste with undo/redo
— now each behavioral domain has its sovereign territory.
Where once `subgraph-rename-dialog.spec.ts`,
`subgraphInputSlotRename.spec.ts`, and
`subgraph-promoted-slot-position.spec.ts` scattered rename concerns
across three kingdoms — now they answer to one crown:
`subgraphSlots.spec.ts`.
Where once `kebab-case` and `camelCase` warred for dominion — now a
single convention reigns.
All 96 test cases preserved. Zero test logic changes. Purely structural.
## Review Focus
- Verify no tests were lost in the consolidation
- Confirm import paths all resolve correctly at the new depth
(`../../fixtures/`)
- The `import.meta.dirname` asset path in `subgraphSlots.spec.ts` (slot
alignment test) updated for new directory depth
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10695-test-reorganize-subgraph-E2E-tests-into-domain-organized-directory-3326d73d36508197939be8825b69ea88)
by [Unito](https://www.unito.io)
Co-authored-by: Amp <amp@ampcode.com>
## Summary
Add a "Fixture Data & Schemas" section to `docs/guidance/playwright.md`
so agents reference existing Zod schemas and TypeScript types when
creating test fixture data.
## Changes
- **What**: New section listing key schema/type locations (`apiSchema`,
`nodeDefSchema`, `jobTypes`, `workflowSchema`, etc.) to keep test
fixtures in sync with production types.
## Review Focus
Documentation-only change; no runtime impact.
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10642-docs-add-Fixture-Data-Schemas-section-to-Playwright-test-guidance-3316d73d365081f5a234e4672b3dc4b9)
by [Unito](https://www.unito.io)
## Summary
Add composite assertion and scoped opening methods to the `ContextMenu`
Playwright page object.
## Changes
- **What**: Added `assertHasItems(items: string[])` using
`expect.soft()` per item, and `openFor(locator: Locator)` which
right-clicks and waits for menu visibility. Fully backward-compatible.
## Review Focus
Both methods reuse existing locators (`primeVueMenu`, `litegraphMenu`,
`getByRole("menuitem")`). `openFor` uses `.or()` to handle both menu
types.
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10659-feat-add-assertHasItems-and-openFor-to-ContextMenu-page-object-3316d73d36508193af45da7d3af4f50c)
by [Unito](https://www.unito.io)
## Summary
Add helpers for safely interacting with nodes that share the same title
without hitting Playwright strict mode.
## Changes
- **What**: Added `getNodesByTitle(title)` and `getNodeByTitleNth(title,
index)` to `VueNodeHelpers`. Updated `docs/guidance/playwright.md` with
a gotcha note about duplicate node names.
## Review Focus
These are purely additive helpers — no existing behavior changes.
`getNodesByTitle` returns all matching nodes (callers use `.nth()` to
pick), and `getNodeByTitleNth` is a convenience wrapper. The existing
`selectNodes(nodeIds)` by-ID method is unchanged.
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10666-feat-add-getNodesByTitle-and-getNodeByTitleNth-helpers-to-VueNodeHelpers-3316d73d3650812eabe6e56a768a34d2)
by [Unito](https://www.unito.io)
## Summary
Add ESLint `no-restricted-imports` rule to prevent usage of
`useVirtualList` from `@vueuse/core`.
## Changes
- **What**: New ESLint config block banning `useVirtualList` in
`**/*.{ts,vue}` files. The team standardized on TanStack Virtual (via
Reka UI virtualizer or `@tanstack/vue-virtual`) for all virtualization.
`useVirtualList` requires uniform item heights and is no longer desired.
This is a preventive ban — no existing usage exists.
## Review Focus
Straightforward lint rule addition following the existing
`no-restricted-imports` pattern in `eslint.config.ts`.
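
For reference, a flat-config block of this kind might look roughly like the sketch below. The variable name and the message text are illustrative assumptions; how the block is merged with the other `no-restricted-imports` entries in `eslint.config.ts` is not shown here.

```typescript
// Sketch of a flat-config block banning one named import from a package.
// `banUseVirtualList` and the message wording are hypothetical.
const banUseVirtualList = {
  files: ['**/*.{ts,vue}'],
  rules: {
    'no-restricted-imports': [
      'error',
      {
        paths: [
          {
            name: '@vueuse/core',
            importNames: ['useVirtualList'],
            message:
              'Use TanStack Virtual (Reka UI virtualizer or @tanstack/vue-virtual) instead.'
          }
        ]
      }
    ]
  }
}
```

Using `importNames` restricts only `useVirtualList`, so other `@vueuse/core` imports stay allowed.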
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10643-feat-ban-useVirtualList-from-vueuse-core-via-ESLint-3316d73d365081d5adf0ec926aab6e28)
by [Unito](https://www.unito.io)
---------
Co-authored-by: Benjamin Lu <benjaminlu1107@gmail.com>
## Summary
Extract repeated patterns from 12 subgraph Playwright spec files into
shared test utilities, reducing duplication by ~142 lines.
## Changes
- **What**: New shared helpers for common subgraph test operations:
- `SubgraphHelper`: `getSlotCount()`, `getSlotLabel()`, `removeSlot()`,
`findSubgraphNodeId()`
- `NodeReference`: `delete()`
- `subgraphTestUtils`: `serializeAndReload()`,
`convertDefaultKSamplerToSubgraph()`, `expectWidgetBelowHeader()`,
`collectConsoleWarnings()`, `packAllInteriorNodes()`
- Replaced ~72 inline `page.evaluate` blocks and multi-line sequences
with single helper calls across 12 spec files
## Review Focus
- Behavioral equivalence: every replacement is a mechanical extraction
with no test logic changes
- API surface of new helpers: naming, parameter types, placement in
existing utility classes
- Whether any remaining inline patterns in the spec files would benefit
from further extraction
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10629-test-extract-shared-subgraph-E2E-test-utilities-3306d73d365081b0b6b5db52ed0a4552)
by [Unito](https://www.unito.io)
---------
Co-authored-by: Amp <amp@ampcode.com>
## Summary
Adds a `@perf` test to establish a baseline for viewport panning GC
churn on large graphs.
## Changes
- **What**: New `large graph viewport pan sweep` perf test that pans
aggressively back and forth across a 245-node graph, forcing many nodes
to cross the viewport boundary. Measures style recalcs, forced layouts,
task duration, heap delta, and DOM node count.
## Review Focus
This is **PR 1 of 2** (perf-fix-with-proof pattern). The fix (viewport
culling) will follow in a separate PR once this baseline is established
on main. CI will then show the delta proving the improvement.
The test uses 120 steps out + 120 steps back at 8px/step, i.e. ~960px of
displacement in each direction, enough to sweep across a significant
portion of the large graph layout.
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10479-test-add-perf-test-for-viewport-pan-sweep-GC-churn-32d6d73d365081cc9f15fe3d5890675d)
by [Unito](https://www.unito.io)
## Summary
Adds the layout shell for the marketing site: SEO head, analytics, nav,
and footer.
## Changes (incremental from #10140)
- BaseLayout.astro: SEO meta (OG/Twitter), GTM (GTM-NP9JM6K7), Vercel
Analytics, ClientRouter, i18n
- SiteNav.vue: Fixed nav with logo, Enterprise/Gallery/About/Careers
links, COMFY CLOUD + COMFY HUB CTAs, mobile hamburger with ARIA
- SiteFooter.vue: Product/Resources/Company/Legal columns, social icons
## Stack (via Graphite)
- #10140 [1/3] Scaffold ← merge first
- **[2/3] Layout Shell** ← this PR
- #10142 [3/3] Homepage Sections
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10141-feat-add-layout-shell-BaseLayout-SiteNav-SiteFooter-2-3-3266d73d365081aeb2d7e598943a8e17)
by [Unito](https://www.unito.io)
## Summary
Add a deprecation warning when custom nodes access `widget.inputEl` on
STRING multiline widgets, directing them to use `widget.element`
instead.
## Changes
- **What**: Add a reusable `defineDeprecatedProperty` helper in
`feedback.ts` that uses `Object.defineProperty` to create a getter/setter
proxy from a deprecated property to its replacement, logging via the
existing `warnDeprecated` utility (deduplicates: warns once per unique
message per session). Use it to deprecate `widget.inputEl` → `widget.element`.
## Review Focus
- `defineDeprecatedProperty` is generic and can be reused for future
property deprecations across the codebase.
- `warnDeprecated` already handles deduplication via a `Set`, so heavy
access patterns (e.g. custom nodes reading `widget.inputEl` in tight
loops) won't spam.
- `enumerable: false` keeps the deprecated alias out of `Object.keys()`
/ `for...in` / `JSON.stringify`.
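
A minimal sketch of the idea, assuming a simplified signature: the real helper in `feedback.ts` may take different parameters, and the `warnDeprecated` shown here is only a stand-in for the existing utility.

```typescript
const warned = new Set<string>()

// Stand-in for the existing `warnDeprecated`: logs each unique message once.
function warnDeprecated(message: string): void {
  if (warned.has(message)) return
  warned.add(message)
  console.warn(message)
}

// Hypothetical signature: alias `deprecatedKey` to `currentKey` on `target`.
function defineDeprecatedProperty(
  target: Record<string, unknown>,
  deprecatedKey: string,
  currentKey: string,
  message: string
): void {
  Object.defineProperty(target, deprecatedKey, {
    get() {
      warnDeprecated(message)
      return target[currentKey]
    },
    set(value: unknown) {
      warnDeprecated(message)
      target[currentKey] = value
    },
    // Keep the alias out of Object.keys(), for...in and JSON.stringify.
    enumerable: false,
    configurable: true
  })
}
```

Reads and writes through the deprecated name are forwarded to the replacement, and repeated access emits a single warning.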
Fixes Comfy-Org/ComfyUI#12893
<!-- Pipeline-Ticket: 6b291ba2-694c-42d6-ac0c-fcbdcba9373a -->
---------
Co-authored-by: Dante <bunggl@naver.com>
## What
Replace `v-if` with `v-show` on SelectionRectangle and NodeTooltip
components.
## Why
Firefox profiler shows 687 Vue `insert` markers from mount/unmount
cycling during canvas interaction. These components toggle frequently
during drag and mouse move events.
## How
- **SelectionRectangle**: `v-if` → `v-show` (single element, safe to
keep in DOM)
- **NodeTooltip**: `v-if` → `v-show` + no-op guard on `hideTooltip()` to
skip redundant reactivity triggers
## Perf Impact
Expected reduction: ~687 Vue insert/remove operations per profiling
session
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-9401-fix-use-v-show-for-frequently-toggled-canvas-overlay-components-31a6d73d365081aba2d7fce079bde7e9)
by [Unito](https://www.unito.io)
## Summary
With a previously saved workflow, selecting "Save as" in app mode would
not correctly change the file extension to the chosen mode, and would
require an additional save after to persist the actual mode change.
Recreation:
- Build app
- Save as workflow X, app mode
- Select Save as from builder footer [Save | v] chevron button
- Select node graph
- Save
- Check workflow on disk - it's still called X.app.json and doesn't have
linearMode: false <-- bug
## Changes
- **What**:
- pass isApp to save workflow
- ensure active graph & initialMode are correctly set when calling
saveAs BEFORE the actual saveWorkflow call
- add linearMode to workflowSchema to prevent casts
- tests
## Review Focus
e2e tests coming in a follow up PR along with some refactoring of the
browser tests (left this PR focused to the actual fix)
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10679-fix-App-mode-Save-as-not-using-correct-extension-or-persisting-mode-on-change-3316d73d365081ef985cf57c91c34299)
by [Unito](https://www.unito.io)
## Summary
- Rename `text-xxxs`/`text-xxs` to `text-3xs`/`text-2xs` in design
system CSS — fixes `tailwind-merge` incorrectly classifying custom
font-size utilities as color classes, which clobbered text color
- Add `Badge` component with updated severity colors matching Figma
design (white text on colored backgrounds)
- Add Badge stories under `Components/Badges/Badge`
- Add unit tests including twMerge regression coverage
Split from #10438 per review feedback — this PR contains the
foundational Badge component; migration of consumers follows in a
separate PR.
## Test plan
- [x] Unit tests pass (`Badge.test.ts` — 12 tests)
- [x] Typecheck passes
- [x] Lint passes
- [ ] Verify Badge stories render correctly in Storybook
- [ ] Verify existing components using `text-2xs`/`text-3xs` render
unchanged
Fixes #10438 (partial)
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10580-refactor-add-Badge-component-and-fix-twMerge-font-size-detection-32f6d73d3650810dae7cd0d4af67fd1c)
by [Unito](https://www.unito.io)
## Summary
Adds SHA-256 hashed user email to GTM dataLayer `sign_up` and `login`
events to improve Meta/LinkedIn Conversions API (CAPI) match rate via
Stape server-side tracking.
## Privacy
- Email is SHA-256 hashed client-side before being pushed to dataLayer —
the raw email never enters the analytics pipeline.
- Email is normalized (trimmed + lowercased) before hashing per
Google/Meta requirements.
- If email is absent (e.g., GitHub OAuth without public email), no
`user_data` entry is pushed.
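
The normalize-then-hash step can be sketched with the Web Crypto API as follows. The function names and the `sha256_email_address` field are assumptions made for illustration; the actual dataLayer payload shape is not shown in this description.

```typescript
// Trim and lowercase before hashing, as described above.
function normalizeEmail(email: string): string {
  return email.trim().toLowerCase()
}

// Hex-encoded SHA-256 digest via Web Crypto (browsers, recent Node versions).
async function hashEmail(email: string): Promise<string> {
  const bytes = new TextEncoder().encode(normalizeEmail(email))
  const digest = await crypto.subtle.digest('SHA-256', bytes)
  return Array.from(new Uint8Array(digest))
    .map((byte) => byte.toString(16).padStart(2, '0'))
    .join('')
}

// Hypothetical payload builder: yields nothing to push when email is absent.
async function buildUserData(
  email?: string | null
): Promise<{ user_data: { sha256_email_address: string } } | undefined> {
  if (!email) return undefined
  return { user_data: { sha256_email_address: await hashEmail(email) } }
}
```

Because normalization runs inside `hashEmail`, differently cased or padded spellings of the same address produce the same digest.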
## Testing
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10591-feat-add-SHA-256-hashed-email-to-GTM-dataLayer-for-sign_up-login-events-3306d73d36508148a321d62810698013)
by [Unito](https://www.unito.io)
## Summary
Add property-based tests (using `fast-check`) for asset-related pure
utility functions, complementing existing example-based unit tests with
algebraic invariant checks across thousands of randomized inputs.
Fixes #10617
## Changes
- **What**: 4 new `*.property.test.ts` files covering
`assetFilterUtils`, `assetSortUtils`, `useAssetSelection`, and
`useOutputStacks` — 32 property-based tests total
## Why property-based testing (fast-check)?
### Gap in existing tests
The existing example-based unit tests (53 tests across 3 files) verify
behavior for **hand-picked inputs** — specific category names, known
sort orderings, fixed asset lists. This leaves three blind spots:
1. **Edge-case discovery**: Example tests only cover cases the author
anticipates. Property tests generate hundreds of randomized inputs per
run, probing boundaries the author didn't consider (e.g., empty strings,
single-char names, deeply nested tag paths, assets with `undefined`
metadata fields).
2. **Algebraic invariants**: Certain guarantees should hold for **all**
inputs, not just the handful tested. For example:
- "Filtering always produces a subset" — impossible to violate with 5
examples, easy to violate in production with unexpected metadata shapes
- "Sorting is idempotent" — an unstable sort bug would only surface with
specific duplicate patterns
- "Reconciled selection IDs are always within visible assets" — a
set-intersection bug might only appear with specific overlap patterns
between selection and visible sets
3. **No test coverage for `useOutputStacks`**: The composable had zero
tests before this PR.
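
To make the invariant idea concrete, here is a dependency-free sketch that checks two of the properties above with a hand-rolled random loop. It stands in for `fast-check` only so the example runs on its own; the `Asset` shape, `filterByTag`, `sortByName`, and `checkProperty` are hypothetical and not the functions under test in this PR.

```typescript
interface Asset {
  id: number
  name: string
  tags: string[]
}

// Generate a small random asset list (stand-in for an fc.Arbitrary).
function randomAssets(): Asset[] {
  const length = Math.floor(Math.random() * 20)
  return Array.from({ length }, (_, i) => ({
    id: i,
    name: Math.random().toString(36).slice(2, 2 + Math.floor(Math.random() * 6)),
    tags: Math.random() < 0.5 ? ['models'] : []
  }))
}

const filterByTag = (assets: Asset[], tag: string): Asset[] =>
  assets.filter((asset) => asset.tags.includes(tag))

const sortByName = (assets: Asset[]): Asset[] =>
  [...assets].sort((a, b) => a.name.localeCompare(b.name))

// Run the property against `runs` random inputs; false on the first failure.
function checkProperty(
  runs: number,
  property: (input: Asset[]) => boolean
): boolean {
  for (let i = 0; i < runs; i++) {
    if (!property(randomAssets())) return false
  }
  return true
}

// Invariant: filtering always produces a subset of the input.
const filterIsSubset = checkProperty(100, (assets) =>
  filterByTag(assets, 'models').every((asset) => assets.includes(asset))
)

// Invariant: sorting an already sorted list changes nothing.
const sortIsIdempotent = checkProperty(100, (assets) => {
  const once = sortByName(assets)
  const twice = sortByName(once)
  return once.every((asset, i) => asset === twice[i])
})
```

A real property-testing library adds shrinking and reproducible seeds, which this loop does not attempt.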
### What these tests verify (invariant catalog)
| Module | # Properties | Key invariants |
|--------|-------------|----------------|
| `assetFilterUtils` | 10 | Filter result ⊆ input; `"all"` is identity; ownership partitions into disjoint my/public; empty constraint is identity |
| `assetSortUtils` | 8 | Never mutates input; output is permutation of input; idempotent (sort∘sort = sort); adjacent pairs satisfy comparator; `"default"` preserves order |
| `useAssetSelection` | 7 | After reconcile: selected ⊆ visible; reconcile never adds new IDs; superset preserves all; empty visible clears; `getOutputCount` ≥ 1; `getTotalOutputCount` ≥ len(assets) |
| `useOutputStacks` | 7 | Collapsed count = input count; items reference input assets; unique keys; selectableAssets length = assetItems length; no collapsed child flags; reactive ref updates |
### Quantitative impact
Each property runs 100 iterations by default → **3,200 randomized inputs
per test run** vs 53 hand-picked examples in existing tests.
**Coverage delta** (v8, measured against target modules):
| Module | Metric | Before (53 tests) | After (+32 property) | Delta |
|--------|--------|-------------------|---------------------|-------|
| `useAssetSelection.ts` | Branch | 76.92% | 94.87% | **+17.95pp** |
| `useAssetSelection.ts` | Stmts | 82.50% | 90.00% | **+7.50pp** |
| `useAssetSelection.ts` | Lines | 81.69% | 88.73% | **+7.04pp** |
| `useOutputStacks.ts` | Stmts | 0% | 37.50% | **+37.50pp** (new) |
| `useOutputStacks.ts` | Funcs | 0% | 75.00% | **+75.00pp** (new) |
| `assetFilterUtils.ts` | All | 97.5%+ | 97.5%+ | maintained |
| `assetSortUtils.ts` | All | 100% | 100% | maintained |
### Prior art
Follows the established pattern from
`src/platform/workflow/persistence/base/draftCacheV2.property.test.ts`.
## Review Focus
- Are the chosen invariants correct and meaningful (not just
change-detector tests)?
- Are the `fc.Arbitrary` generators representative of real-world asset
data shapes?
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10619-test-assets-add-property-based-tests-for-asset-utility-functions-3306d73d3650816985ebcd611bbe0837)
by [Unito](https://www.unito.io)
<img width="1305" height="730" alt="Screenshot 2026-03-28 10.17.30 AM" src="https://github.com/user-attachments/assets/316fcb72-e749-40da-b29f-05af91f30610" />
## Summary
- Replace hardcoded `COMFY_HUB_TAG_OPTIONS` with dynamic fetch from `GET
/hub/labels?type=tag`
- Falls back to the existing static tag list when the API call fails
- Adds `zHubLabelListResponse` Zod schema and `fetchTagLabels` service
method
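
The fetch-with-fallback behaviour can be sketched as below. The response shape, the `FALLBACK_TAGS` values, and the injected `LabelFetcher` are assumptions for illustration; the real `fetchTagLabels` service method and `zHubLabelListResponse` schema are not reproduced here.

```typescript
// Hypothetical static list standing in for `COMFY_HUB_TAG_OPTIONS`.
const FALLBACK_TAGS: string[] = ['image', 'video', 'audio']

// Assumed response shape; the real schema may differ.
type LabelFetcher = () => Promise<{ labels: { name: string }[] }>

// Resolve tag suggestions from the API, falling back to the static list
// whenever the request (or reading its result) throws.
async function loadTagSuggestions(
  fetchTagLabels: LabelFetcher
): Promise<string[]> {
  try {
    const response = await fetchTagLabels()
    return response.labels.map((label) => label.name)
  } catch {
    return FALLBACK_TAGS
  }
}
```

Injecting the fetcher keeps the fallback path easy to exercise in a unit test without touching the network.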
## Test plan
- [ ] Open publish wizard → verify tag suggestions load from API
- [ ] Disconnect network / use env without hub API → verify hardcoded
fallback tags appear
- [ ] Select and deselect tags → verify behavior unchanged
- [ ] Unit tests pass (`pnpm vitest run` on affected files)
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10497-feat-fetch-publish-tag-suggestions-from-hub-labels-API-32e6d73d3650815fb113cf591030d4e8)
by [Unito](https://www.unito.io)
## Summary
Fix timezone-dependent test failure in SubscriptionPanel and add a local
CI script.
## Changes
- **What**: The `renders refill date with literal slashes` test
hardcoded `12/31/24` but the component renders using local timezone
`Date` methods. In UTC-negative timezones, `2024-12-31T00:00:00Z`
renders as Dec 30. Now computes the expected string the same way the
component does.
- **What**: Added `pnpm test:ci:local` script
(`scripts/test-ci-local.sh`) that builds the frontend, starts a ComfyUI
backend with `--multi-user --front-end-root dist`, runs vitest +
Playwright, then cleans up. One command for full local CI.
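
The timezone fix can be illustrated with a small sketch. `formatRefillDate` is a hypothetical formatter standing in for the component's rendering logic, and `expectedRefillDate` shows how a test can derive its expectation from the same local-timezone `Date` methods instead of hardcoding `12/31/24`.

```typescript
// Hypothetical formatter mirroring a component that uses local Date methods.
function formatRefillDate(iso: string): string {
  const date = new Date(iso)
  const mm = String(date.getMonth() + 1).padStart(2, '0')
  const dd = String(date.getDate()).padStart(2, '0')
  const yy = String(date.getFullYear()).slice(-2)
  return `${mm}/${dd}/${yy}`
}

// Test-side expectation computed with the same local methods, so it agrees
// with the rendered value in any timezone.
function expectedRefillDate(iso: string): string {
  const date = new Date(iso)
  return [date.getMonth() + 1, date.getDate(), date.getFullYear() % 100]
    .map((part) => String(part).padStart(2, '0'))
    .join('/')
}
```

For `2024-12-31T00:00:00Z` both functions yield the Dec 30 form in UTC-negative zones and the Dec 31 form elsewhere, so the assertion no longer depends on where the test runs.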
## Review Focus
This is a test-only change — no production code modified. The
SubscriptionPanel component itself is unchanged; only the test assertion
is made timezone-agnostic.
## E2E Regression Test
Not applicable — this PR fixes a unit test assertion, not a production
bug. No user-facing behavior changed.
## Summary
Phase 3 of the VTL migration: migrate 8 hard-case component tests from
@vue/test-utils to @testing-library/vue (68 tests).
Stacked on #10490.
## Changes
- **What**: Migrate SignInForm, CurrentUserButton, NodeSearchBoxPopover,
BaseThumbnail, JobAssetsList, SelectionToolbox, QueueOverlayExpanded,
PackVersionSelectorPopover from VTU to VTL
- **`wrapper.vm` elimination**: 13 instances across 4 files (5 in
SignInForm, 3 in CurrentUserButton, 3 in PackVersionSelectorPopover, 2
in BaseThumbnail) replaced with user interactions or removed
- **`vm.$emit()` on stubs**: Interactive stubs with `setup(_, { emit })`
expose buttons or closure-based emit functions (QueueOverlayExpanded,
NodeSearchBoxPopover, JobAssetsList)
- **Removed**: 6 change-detector/redundant tests, 3 `@ts-expect-error`
annotations, `PackVersionSelectorVM` interface, `getVM` helper
- **BaseThumbnail**: Removed `useEventListener` mock — real event
handler attaches, `fireEvent.error(img)` triggers error state
## Review Focus
- Interactive stub patterns: `JobAssetsListStub` and `NodeSearchBoxStub`
use closure-based emit functions to trigger parent event handlers
without `vm.$emit`
- SignInForm form submission test fills PrimeVue Form fields via
`userEvent.type` and submits via button click (replaces `vm.onSubmit()`
direct call)
- CurrentUserButton Popover stub tracks open/close state reactively
- JobAssetsList: file-level `eslint-disable` for
`no-container`/`no-node-access`/`prefer-user-event` since stubs lack
ARIA roles and hover tests need `fireEvent`
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10493-test-migrate-8-hard-case-component-tests-from-VTU-to-VTL-Phase-3-32e6d73d365081f88097df634606d7e3)
by [Unito](https://www.unito.io)
---------
Co-authored-by: Amp <amp@ampcode.com>
## Summary
Fix subgraph viewport (zoom + position) drifting when navigating in/out
of subgraphs and switching workflow tabs.
## Problem
Three root causes:
1. **First visit**: `restoreViewport()` silently returned on cache miss,
leaving canvas at stale position
2. **Cross-workflow leakage**: Cache keyed by bare `graphId` — two
workflows with the same subgraph or unsaved workflows shared cache
entries
3. **Stale save on tab switch**: `loadGraphData` and
`changeTracker.restore()` overwrite `canvas.ds` before the async watcher
could save the old viewport
## Solution
1. **Workflow-scoped cache keys**: `${path}#${instanceId}:${graphId}` —
WeakMap assigns unique IDs per workflow object, handling unsaved
workflows with identical paths
2. **`flush: 'sync'` on activeSubgraph watcher**: Fires immediately
during `setGraph()`, BEFORE `loadGraphData`/`changeTracker` can corrupt
`canvas.ds`
3. **Cache miss → rAF fitToBounds**: On first visit, computes bounds
from `graph._nodes` and calls `ds.fitToBounds()` after the browser has
rendered
4. **Workflow switch watcher** (`flush: 'sync'`): Pre-saves viewport
under old workflow identity, suppresses `onNavigated` saves during load
cycle
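
The workflow-scoped key from item 1 might be built along these lines. This is a sketch under assumptions: `WorkflowLike`, `getInstanceId`, and `viewportCacheKey` are illustrative names, and only the `${path}#${instanceId}:${graphId}` format is taken from the description above.

```typescript
// Minimal stand-in for a workflow object; only `path` matters here.
interface WorkflowLike {
  path: string
}

const instanceIds = new WeakMap<WorkflowLike, number>()
let nextInstanceId = 0

// Assign a stable ID per workflow object. Two unsaved workflows with the
// same path are different objects, so they receive different IDs.
function getInstanceId(workflow: WorkflowLike): number {
  let id = instanceIds.get(workflow)
  if (id === undefined) {
    id = nextInstanceId++
    instanceIds.set(workflow, id)
  }
  return id
}

function viewportCacheKey(workflow: WorkflowLike, graphId: string): string {
  return `${workflow.path}#${getInstanceId(workflow)}:${graphId}`
}
```

Keying the ID map with a `WeakMap` means entries do not keep closed workflow objects alive.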
Key architectural insight: `setGraph()` never touches `canvas.ds`, but
`loadGraphData` and `changeTracker.restore()` both write to it. By using
`flush: 'sync'`, the save happens during `setGraph` (before the
overwrites).
## Review Focus
- `subgraphNavigationStore.ts` — the three fixes and their interaction
- `flush: 'sync'` watchers — critical for correct save timing
- `suppressNavigatedSave` flag — prevents stale saves during async
workflow load
## Breaking Changes
None. Viewport cache is session-only (in-memory LRU). Existing workflows
unaffected.
## Demo Video of Fix
https://github.com/user-attachments/assets/71dd4107-a030-4e68-aa11-47fe00101b25
## Test plan
- [x] Unit: save/restore with workflow-scoped keys
- [x] Unit: cache miss doesn't mutate canvas synchronously
- [x] Unit: navigation integration (enter/exit preserves viewport)
- [x] E2E: first subgraph visit has visible nodes
- [x] Manual: enter subgraph → zoom/pan → exit → re-enter → viewport
restored
- [x] Manual: tab with subgraph → different tab → back → viewport
restored
- [x] Manual: two unsaved workflows → switch between → viewports
isolated
- Fixes #10246
- Related: #8173
<!-- QA_REPORT_SECTION -->
---
## 🔍 Automated QA Report
| | |
|---|---|
| **Status** | ✅ Complete |
| **Report** | [sno-qa-10247.comfy-qa.pages.dev](https://sno-qa-10247.comfy-qa.pages.dev/) |
| **CI Run** | [View workflow](https://github.com/Comfy-Org/ComfyUI_frontend/actions/runs/23373279990) |
Before/after video recordings with **Behavior Changes** and **Timeline
Comparison** tables.
## Summary
Rename `useFirebaseAuthStore` → `useAuthStore` and
`FirebaseAuthStoreError` → `AuthStoreError`. Introduce shared mock
factory (`authStoreMock.ts`) to replace 16 independent bespoke mocks.
## Changes
- **What**: Mechanical rename of store, composable, class, and store ID
(`firebaseAuth` → `auth`). Created
`src/stores/__tests__/authStoreMock.ts` — a shared mock factory with
reactive controls, used by all consuming test files. Migrated all 16
test files from ad-hoc mocks to the shared factory.
- **Files**: 62 files changed (rename propagation + new test infra)
## Review Focus
- Mock factory API design in `authStoreMock.ts` — covers all store
properties with reactive `controls` for per-test customization
- Self-test in `authStoreMock.test.ts` validates computed reactivity
Fixes #8219
## Stack
This is PR 1/5 in a stacked refactoring series:
1. **→ This PR**: Rename + shared test fixtures
2. #10484: Extract auth-routing from workspaceApi
3. #10485: Auth token priority tests
4. #10486: Decompose MembersPanelContent
5. #10487: Consolidate SubscriptionTier type
---------
Co-authored-by: Alexander Brown <drjkl@comfy.org>
This fixes two issues: the settings store check raced with loading because
it did not await `load()`, and the shown flag was only persisted when the
dialog was cleared, not when it was shown.
## Summary
Wait for settings to load before deciding whether to show the one-time
macOS desktop cloud promo so the persisted dismissal state is respected
on launch.
## Changes
- **What**: Await `settingStore.load()` before checking
`Comfy.Desktop.CloudNotificationShown`, keep the promo gated to macOS
desktop, and persist the shown flag before awaiting dialog close.
- **Dependencies**: None
## Review Focus
- Launch-time settings race for `Comfy.Desktop.CloudNotificationShown`
- One-time modal behavior if the app closes before the dialog is
dismissed
- Regression coverage in `src/App.test.ts`
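The ordering described above (load settings, check the flag, persist it, then show the dialog) can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `SettingStoreLike`, `maybeShowCloudPromo`, and the `showDialog` callback are hypothetical stand-ins, and only the setting key comes from the description.

```typescript
// Hypothetical minimal store shape; the real settings store API is not shown here.
interface SettingStoreLike {
  load(): Promise<void>
  get(key: string): boolean
  set(key: string, value: boolean): Promise<void>
}

const SHOWN_KEY = 'Comfy.Desktop.CloudNotificationShown'

// Returns true if the promo was shown on this launch.
async function maybeShowCloudPromo(
  store: SettingStoreLike,
  isMacDesktop: boolean,
  showDialog: () => Promise<void>
): Promise<boolean> {
  if (!isMacDesktop) return false
  // Wait for persisted settings so an earlier "shown" flag is visible.
  await store.load()
  if (store.get(SHOWN_KEY)) return false
  // Persist before awaiting the dialog, so quitting mid-dialog still counts as shown.
  await store.set(SHOWN_KEY, true)
  await showDialog()
  return true
}
```

Persisting before `showDialog()` resolves is what keeps the promo one-time even if the app closes while the dialog is still open.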
## Screenshots (if applicable)
- N/A
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10526-fix-wait-for-settings-before-cloud-desktop-promo-32e6d73d365081939fc3ca5b4346b873)
by [Unito](https://www.unito.io)
---------
Co-authored-by: Alexander Brown <drjkl@comfy.org>
## Summary
Disable Sentry `browserApiErrorsIntegration` event target wrapping for
cloud builds to eliminate 231.7ms of `sentryWrapped` overhead during
canvas interaction.
## Changes
- **What**: Configure `browserApiErrorsIntegration({ eventTarget: false
})` in the cloud Sentry init path. This prevents Sentry from wrapping
every `addEventListener` callback in try/catch, which was the #1 hot
function during multi-cluster panning (100 profiling samples). Error
capturing still works via `window.onerror` and `unhandledrejection`.
## Review Focus
- Confirm that disabling event target wrapping is acceptable for cloud
error monitoring — Sentry still captures unhandled errors, just not
errors thrown inside individual event handler callbacks.
- Non-cloud builds already had `integrations: []` /
`defaultIntegrations: false`, so no change there.
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10472-perf-disable-Sentry-event-target-wrapping-to-reduce-DOM-event-overhead-32d6d73d365081cdb455e47aee34dcc6)
by [Unito](https://www.unito.io)
## Summary
Hide image resolution subtitle on cloud asset cards because thumbnails
are downscaled to max 512px, causing `naturalWidth`/`naturalHeight` to
report incorrect dimensions.
## Changes
- **What**: Gate the dimension display in `MediaAssetCard.vue` behind
`!isCloud` so resolution is only shown on local (where full-res images
are loaded). Added TODO referencing #10590 for re-enabling once
`/assets` API returns original dimensions in metadata.
## Review Focus
One-line conditional change — the `isCloud` import from
`@/platform/distribution/types` follows the established pattern used
across the repo.
Fixes #10590
## Screenshots (if applicable)
N/A — this removes a subtitle that was displaying wrong values (e.g.,
showing 512x512 for a 1024x1024 image on cloud).
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10602-fix-hide-inaccurate-resolution-subtitle-on-cloud-asset-cards-3306d73d36508186bd3ad704bd83bf14)
by [Unito](https://www.unito.io)
## Summary
Add the first user-centric Playwright coverage for the assets sidebar
empty state and introduce a small assets-specific test helper/page
object surface.
## Changes
- **What**: add `AssetsSidebarTab`, add `AssetsHelper`, and cover
generated/imported empty states in a dedicated browser spec
## Review Focus
This is intentionally a small first slice for assets-sidebar coverage.
The new helper still mocks the HTTP boundary in Playwright for now
because current OSS job history and input files are global backend
state, which makes true backend-seeded parallel coverage a separate
backend change.
Long-term recommendation: add backend-owned, user-scoped test seeding
for jobs/history and input assets so browser tests can hit the real
routes on a shared backend. Follow-up: COM-307.
Fixes COM-306
## Screenshots (if applicable)
Not applicable.
## Validation
- `pnpm typecheck:browser`
- `pnpm exec playwright test browser_tests/tests/sidebar/assets.spec.ts
--project=chromium` against an isolated preview env
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10595-test-add-assets-sidebar-empty-state-coverage-3306d73d365081d1b34fdd146ae6c5c6)
by [Unito](https://www.unito.io)
## Summary
- Add E2E Playwright tests for zoom controls: default zoom level, zoom
to fit, zoom out with clamping at 10% minimum, manual percentage input,
and toggle visibility
- Add `data-testid` attributes to `ZoomControlsModal.vue` for stable
test selectors
- Add new TestId entries to `selectors.ts`
## Test plan
- [x] All 6 new tests pass locally
- [x] Existing minimap and graphCanvasMenu tests still pass
- [ ] CI passes
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10589-test-add-browser-tests-for-zoom-controls-3306d73d36508177ae19e16b3f62b8e7)
by [Unito](https://www.unito.io)
## Summary
Migrate 13 component test files from @vue/test-utils to
@testing-library/vue as Phase 1 of incremental VTL adoption.
## Changes
- **What**: Rewrite 13 test files (88 tests) to use `render`/`screen`
queries, `userEvent` interactions, and `jest-dom` assertions. Add
`data-testid` attributes to 6 components for lint-clean icon/element
queries. Delete unused `src/utils/test-utils.ts`.
- **Dependencies**: `@testing-library/vue`,
`@testing-library/user-event`, `@testing-library/jest-dom` (installed in
Phase 0)
## Review Focus
- `data-testid` additions to component templates are minimal and
non-behavioral
- PrimeVue passthrough (`pt`) usage in UserAvatar.vue for icon testid
- 2 targeted `eslint-disable` in FormRadioGroup.test.ts where PrimeVue
places `aria-describedby` on wrapper div, not input
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10471-test-migrate-13-component-tests-from-VTU-to-VTL-Phase-1-32d6d73d36508159a33ffa285afb4c38)
by [Unito](https://www.unito.io)
---------
Co-authored-by: Amp <amp@ampcode.com>
## Summary
- Add a Playwright-based diagnostic tool (`@audit` tagged) that
automatically detects DOM elements where CSS `contain: layout style`
would improve rendering performance
- Extend `ComfyPage` fixture and `playwright.config.ts` to support
`@audit` tag (excluded from CI, perf infra enabled)
- Add `/contain-audit` skill definition documenting the workflow
## How it works
1. Loads the 245-node workflow in a real browser
2. Walks the DOM tree and scores every element by subtree size and
sizing constraints
3. For each high-scoring candidate, applies `contain: layout style` via
JS
4. Measures rendering performance (style recalcs, layouts, task
duration) before and after
5. Takes before/after screenshots to detect visual breakage
6. Outputs a ranked report to console
## Test plan
- [ ] `pnpm typecheck` passes
- [ ] `pnpm typecheck:browser` passes
- [ ] `pnpm lint` passes
- [ ] Existing Playwright tests unaffected (`@audit` excluded from CI
via `grepInvert`)
- [ ] Run `pnpm exec playwright test
browser_tests/tests/containAudit.spec.ts --project=chromium` locally
with dev server
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10026-tool-add-CSS-containment-audit-skill-and-Playwright-diagnostic-3256d73d365081b29470df164f798f7d)
by [Unito](https://www.unito.io)
---------
Co-authored-by: GitHub Action <action@github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
## Summary
- Move error red border from TopMenuSection/ComfyActionbar to
ErrorOverlay
- Add error indicator (outline + StatusBadge dot) on right side panel
toggle button when errors are present, the panel/overlay are closed, and
the errors tab setting is enabled
- Replace technical group titles (e.g. "Missing Node Packs") with
user-friendly i18n messages in ErrorOverlay
- Dynamically change action button label based on single error type
(e.g. "Show missing nodes" instead of "See Errors")
- Remove unused `hasAnyError` prop from ComfyActionbar
- Fix `type="secondary"` → `variant="secondary"` on panel toggle button
- Pre-wire `missing_media` error type support for #10309
- Migrate ErrorOverlay E2E selectors from `getByText`/`getByRole` to
`data-testid`
- Update E2E screenshot snapshots affected by TopMenuSection error state
design changes
## Test plan
- [x] Trigger execution error → verify red border on ErrorOverlay, no
red border on TopMenuSection/ComfyActionbar
- [x] With errors and right side panel closed → verify red outline + dot
on panel toggle button
- [x] Open right side panel or error overlay → verify indicator
disappears
- [x] Disable `Comfy.RightSidePanel.ShowErrorsTab` → verify no indicator
even with errors
- [x] Load workflow with only missing nodes → verify "Show missing
nodes" button label and friendly message
- [x] Load workflow with only missing models → verify "Show missing
models" button label and count message
- [x] Load workflow with mixed errors → verify "See Errors" fallback
label
- [x] E2E: `pnpm test:browser:local -- --grep "Error overlay"`
## Screenshots
<img width="498" height="381" alt="Screenshot 2026-03-26 230252"
src="https://github.com/user-attachments/assets/034f0f3f-e6a1-4617-b8f6-cd4c145e3a47"
/>
<img width="550" height="303" alt="Screenshot 2026-03-26 230525"
src="https://github.com/user-attachments/assets/2958914b-0ff0-461b-a6ea-7f2811bf33c2"
/>
<img width="551" height="87" alt="Screenshot 2026-03-26 230318"
src="https://github.com/user-attachments/assets/396e9cb1-667e-44c4-83fe-ab113b313d16"
/>
---------
Co-authored-by: github-actions <github-actions@github.com>
Co-authored-by: Dante <bunggl@naver.com>
## Summary
Fix right-side sidebar panels and left-side panels sharing the same
PrimeVue Splitter state key, causing them to incorrectly apply each
other's saved widths.
## Changes
- **What**: Make `sidebarStateKey` position-aware by including
`sidebarLocation` and offside panel visibility in the localStorage key
## Problem
When sidebar location is set to **right**, all panels (both the
right-side sidebar like Job History and left-side panels like Workflow
overview) share a single PrimeVue Splitter `state-key`
(`unified-sidebar`). PrimeVue persists panel widths to localStorage
using this key, so any resize on one side gets applied to the other.
### AS-IS (before fix)
The `sidebarStateKey` is computed without any awareness of panel
position:
```typescript
// Always returns 'unified-sidebar' (when unified width enabled)
// or the active tab id — regardless of sidebar location or offside panel state
const sidebarStateKey = computed(() => {
return unifiedWidth.value
? 'unified-sidebar'
: (activeSidebarTabId.value ?? 'default-sidebar')
})
```
This produces a **single localStorage key** for all layout
configurations. The result:
1. Set sidebar to **right**, open **Job History** → resize it smaller →
saved to `unified-sidebar`
2. Open **Workflow overview** (appears on the left as an offside panel)
→ loads the same `unified-sidebar` key → gets the Job History width
applied to a completely different panel position
3. Both panels open simultaneously share the same persisted width, even
though they are on opposite sides of the screen
This is exactly the behavior shown in the [issue
screenshots](https://github.com/Comfy-Org/ComfyUI_frontend/issues/9440):
pulling the Workflow overview smaller also changes Job History to that
same size, and vice versa.
### TO-BE (after fix)
The `sidebarStateKey` now includes `sidebarLocation` (`left`/`right`)
and whether the offside panel is visible:
```typescript
const sidebarTabKey = computed(() => {
return unifiedWidth.value
? 'unified-sidebar'
: (activeSidebarTabId.value ?? 'default-sidebar')
})
const sidebarStateKey = computed(() => {
const base = sidebarTabKey.value
const suffix = showOffsideSplitter.value ? '-with-offside' : ''
return `${base}-${sidebarLocation.value}${suffix}`
})
```
This produces **distinct localStorage keys** per layout configuration:
| Layout | Key |
|--------|-----|
| Sidebar left, no offside | `unified-sidebar-left` |
| Sidebar left, right panel open | `unified-sidebar-left-with-offside` |
| Sidebar right, no offside | `unified-sidebar-right` |
| Sidebar right, left panel open | `unified-sidebar-right-with-offside` |
Each configuration now persists and restores its own panel sizes
independently, so resizing Job History on the right no longer affects
Workflow overview on the left.
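As a quick check of the key table above, the same derivation can be written as a pure function. This is a sketch under the assumption that the computed values are passed in as plain arguments; `buildSidebarStateKey` is a hypothetical name, not something in the codebase.

```typescript
type SidebarLocation = 'left' | 'right'

// Pure version of the TO-BE computed: base key, then location, then optional offside suffix.
function buildSidebarStateKey(
  unifiedWidth: boolean,
  activeTabId: string | null,
  location: SidebarLocation,
  offsideVisible: boolean
): string {
  const base = unifiedWidth
    ? 'unified-sidebar'
    : (activeTabId ?? 'default-sidebar')
  const suffix = offsideVisible ? '-with-offside' : ''
  return `${base}-${location}${suffix}`
}
```

For instance, `buildSidebarStateKey(true, null, 'right', true)` yields `unified-sidebar-right-with-offside`, matching the last row of the table.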
## Review Focus
- The offside suffix (`-with-offside`) is necessary because the Splitter
transitions from a 2-panel layout (sidebar + center) to a 3-panel layout
(sidebar + center + offside) — these are fundamentally different panel
configurations and should not share persisted sizes.
Fixes #9440
## Screenshots (if applicable)
See issue for reproduction screenshots:
https://github.com/Comfy-Org/ComfyUI_frontend/issues/9440

🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
## Summary
Normalize legacy prefixed proxyWidget entries during subgraph configure
so nested subgraph widgets resolve correctly.
## Changes
- **What**: Extract `normalizeLegacyProxyWidgetEntry` to strip legacy
`nodeId: innerNodeId: widgetName` prefixes from serialized proxyWidgets
and resolve the correct `disambiguatingSourceNodeId`. Write-back
comparison now checks serialized content (not just array length) so
stale formats are cleaned up even when the entry count is unchanged.
## Review Focus
- The iterative prefix-stripping loop in `resolveLegacyPrefixedEntry` —
it peels one `N: ` prefix per iteration and tries all disambiguator
candidates at each level.
- The write-back condition change from length comparison to
`JSON.stringify` equality.
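The two ideas under review can be illustrated in isolation. The sketch below assumes a legacy prefix is a run of digits followed by `: ` at the start of the widget name; `stripLegacyPrefixes` and `needsWriteBack` are hypothetical helpers, and the real `resolveLegacyPrefixedEntry` additionally tries disambiguator candidates at each level, which is omitted here.

```typescript
// Assumption: a legacy prefix looks like "<digits>: " at the start of the name.
const LEGACY_PREFIX = /^(\d+): /

// Peels one "N: " prefix per iteration until none remain.
function stripLegacyPrefixes(name: string): {
  prefixes: string[]
  widgetName: string
} {
  const prefixes: string[] = []
  let rest = name
  let match = LEGACY_PREFIX.exec(rest)
  while (match) {
    prefixes.push(match[1])
    rest = rest.slice(match[0].length)
    match = LEGACY_PREFIX.exec(rest)
  }
  return { prefixes, widgetName: rest }
}

// Compares serialized content, so a same-length but rewritten list still triggers a write-back.
function needsWriteBack(original: unknown, normalized: unknown): boolean {
  return JSON.stringify(original) !== JSON.stringify(normalized)
}
```

A length-only comparison would miss the case where an entry is normalized in place, which is why the serialized form is compared.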
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10573-fix-normalize-legacy-prefixed-proxyWidget-entries-on-configure-32f6d73d365081e886e1c9b3939e3b9f)
by [Unito](https://www.unito.io)
---------
Co-authored-by: Amp <amp@ampcode.com>
## Summary
- When `Comfy.Workflow.Persist` is OFF and storage is empty,
`initializeWorkflow()` returned without creating any workflow tab —
leaving users with no tab and no way to save
- Now falls through to `loadDefaultWorkflow()` so a default temporary
workflow is always created
## Root Cause
In `useWorkflowPersistenceV2.ts`, `initializeWorkflow()` had an early
return when persistence was disabled:
```ts
if (!workflowPersistenceEnabled.value) return
```
This skipped `loadDefaultWorkflow()`, which is responsible for creating
the initial temporary workflow tab via `comfyApp.loadGraphData()` →
`afterLoadNewGraph()` → `workflowStore.createNewTemporary()`.
## Fix
One-line change: `return` → `return loadDefaultWorkflow()`.
## Test plan
- [x] E2E test: verifies `openWorkflows.length >= 1` after reload with
persistence OFF
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10565-fix-create-initial-workflow-tab-when-persistence-is-disabled-32f6d73d365081d5a681c3e019d373c3)
by [Unito](https://www.unito.io)
---------
Co-authored-by: Alexander Brown <drjkl@comfy.org>
## Summary
Packing nodes inside a subgraph into a nested subgraph no longer blanks
the parent subgraph node's promoted widget values.
## Changes
- **What**: After `convertToSubgraph` moves interior nodes into a nested
subgraph, `_repointAncestorPromotions` rewrites the promotion store
entries on all host SubgraphNodes so they chain through the new nested
node. `rebuildInputWidgetBindings()` then clears the stale
`input._widget` PromotedWidgetView cache and re-resolves bindings from
current connections.
- The root cause was two separate sets of PromotedWidgetView references:
`node.widgets` (rebuilt from the store — correct) vs `input._widget`
(cached at promotion time — stale). `SubgraphNode.serialize()` reads
`input._widget.value`, which resolved against removed node IDs →
`missing-node` → blank values on the next `checkState` cycle.
## Review Focus
- `_repointAncestorPromotions` iterates all graphs to find host nodes of
the current subgraph type — verify this covers all cases (multiple
instances of the same subgraph type).
- `rebuildInputWidgetBindings()` clears `_promotedViewManager` and
re-resolves — confirm no side effects on event listeners or pending
promotions.
- The nested node gets duplicate promotion entries (from both
`_repointAncestorPromotions` and `promoteRecommendedWidgets` via the
`subgraph-converted` event). `store.promote()` deduplicates via
`isPromoted`, but worth verifying.
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10532-fix-repoint-ancestor-promoted-widget-bindings-when-packing-nested-subgraphs-32e6d73d365081109d5aea0660434082)
by [Unito](https://www.unito.io)
---------
Co-authored-by: Amp <amp@ampcode.com>
Co-authored-by: Benjamin Lu <benjaminlu1107@gmail.com>
Co-authored-by: Christian Byrne <cbyrne@comfy.org>
Co-authored-by: GitHub Action <action@github.com>
Co-authored-by: Comfy Org PR Bot <snomiao+comfy-pr@gmail.com>
Co-authored-by: christian-byrne <72887196+christian-byrne@users.noreply.github.com>
Co-authored-by: github-actions <github-actions@github.com>
Co-authored-by: pythongosssss <125205205+pythongosssss@users.noreply.github.com>
Co-authored-by: Yourz <crazilou@vip.qq.com>
## Summary
Adds a `@perf` test that establishes a baseline for ResizeObserver
layout cost during zoom on a large graph (245 nodes).
## Changes
- **What**: New `large graph zoom interaction` perf test that zooms
in/out 30 steps on `large-graph-workflow`, measuring `layouts`,
`layoutDurationMs`, `frameDurationMs`, and `TBT`. Each zoom step
triggers ResizeObserver for all node elements due to CSS scale changes.
## Review Focus
This is **PR 1 of 2** for throttling the ResizeObserver during zoom/pan.
Once this merges and establishes a baseline on main, the fix PR (#10473)
will show a CI-proven delta demonstrating the improvement.
The test follows the same patterns as `large graph pan interaction` and
`canvas zoom sweep`.
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10478-test-add-large-graph-zoom-perf-test-for-ResizeObserver-baseline-32d6d73d365081169537e557c14d8c51)
by [Unito](https://www.unito.io)
## Summary
Fix unhandled promise rejection ("Document is not focused") in Copy
Image and improve clipboard fallback reliability.
## Changes
- **What**: Two clipboard fixes:
1. `litegraphService.ts`: The "Copy Image" context menu passed async
`writeImage` as a callback to `canvas.toBlob()` without awaiting —
errors became unhandled promise rejections reported in [Sentry
CLOUD-FRONTEND-STAGING-AQ](https://comfy-org.sentry.io/issues/6948073569/).
Extracted `convertToPngBlob` helper that wraps `toBlob` in a proper
Promise so errors propagate to the existing outer try/catch and surface
as a user-facing toast instead of a silent Sentry error.
2. `useCopyToClipboard.ts`: Replaced `useClipboard({ legacy: true })`
with explicit modern→legacy fallback that checks
`document.execCommand('copy')` return value. VueUse's `legacyCopy` sets
`copied.value = true` regardless of whether `execCommand` succeeded,
causing false success toasts.
## Review Focus
- The `convertToPngBlob` helper does the same canvas→PNG work as the old
inline code but properly awaited
- The happy path (PNG clipboard write succeeds first try) is unchanged
- No public API surface changes — verified zero custom node dependencies
via ecosystem code search
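The `toBlob`-to-Promise wrapping mentioned above generally looks like the sketch below. It is not the PR's `convertToPngBlob`: `CanvasLike` and `canvasToPngBlob` are hypothetical names, and the interface is kept generic so the example does not depend on DOM typings (an `HTMLCanvasElement` has a `toBlob` method of this callback shape).

```typescript
// Minimal structural type for anything with a callback-style toBlob.
interface CanvasLike<B> {
  toBlob(callback: (blob: B | null) => void, type?: string): void
}

// Wraps the callback API so failures reject and reach the caller's try/catch.
function canvasToPngBlob<B>(canvas: CanvasLike<B>): Promise<B> {
  return new Promise<B>((resolve, reject) => {
    canvas.toBlob((blob) => {
      if (blob === null) {
        reject(new Error('Canvas produced no PNG blob'))
      } else {
        resolve(blob)
      }
    }, 'image/png')
  })
}
```

Because the caller awaits the returned promise, an error raised after the blob arrives surfaces in the surrounding `try/catch` rather than as an unhandled rejection.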
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-9299-fix-handle-clipboard-errors-in-Copy-Image-and-useCopyToClipboard-3156d73d3650817c8608cba861ee64a9)
by [Unito](https://www.unito.io)
## Summary
Fix URI drops (e.g. dragging `<img>` thumbnails) onto Vue-rendered nodes
by letting unhandled drops bubble to the document-level `text/uri-list`
fallback in `app.ts`.
## Changes
- **What**: Removed unconditional `.stop` modifier from `@drop` in
`LGraphNode.vue`. `stopPropagation()` is now called conditionally — only
when `onDragDrop` returns `true` (file drop handled). Made `handleDrop`
synchronous since `onDragDrop` returns a plain boolean.
## Review Focus
The key insight is that `onDragDrop` (from `useNodeDragAndDrop`) returns
`false` synchronously for URI drags (no files in `DataTransfer`), so the
event must bubble to reach the document handler that fetches the URI.
The original `async` + `await` pattern would have deferred
`stopPropagation` past the synchronous propagation phase, so
`handleDrop` is now synchronous.
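A stripped-down sketch of the conditional behavior, assuming only that `onDragDrop` returns a boolean synchronously as described. `DropEventLike` and `handleDropSketch` are hypothetical; the actual handler in `LGraphNode.vue` is not reproduced here.

```typescript
// Only the part of the event this sketch needs.
interface DropEventLike {
  stopPropagation(): void
}

// Synchronous on purpose: stopPropagation only has an effect while the event is still dispatching.
function handleDropSketch(
  event: DropEventLike,
  onDragDrop: (event: DropEventLike) => boolean
): void {
  const handled = onDragDrop(event)
  // Unhandled drops (e.g. URI drags with no files) keep bubbling to the document handler.
  if (handled) event.stopPropagation()
}
```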
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-9463-fix-allow-URI-drops-to-bubble-from-Vue-nodes-to-document-handler-31b6d73d36508196a1b3f17e7e4837a9)
by [Unito](https://www.unito.io)
## Summary
Add regression tests for subgraph slot label propagation. The
OutputSlot.vue fix (adding `slotData.label` to the display template) was
already merged via another PR — this adds tests to prevent future
regressions.
## Changes
- **What**: Two new test files covering the label/localized_name
fallback chain in OutputSlot.vue and SubgraphNode label propagation
through configure() and rename event paths.
## Review Focus
Tests only — no production code changes. Verifies that renamed subgraph
inputs/outputs display correctly in Nodes 2.0 mode.
Fixes #9998
<!-- Pipeline-Ticket: 7d887122-eea5-45f1-b6eb-aed94f708555 -->
## Summary
Remove the unused `_config` parameter from the Playwright global
setup/teardown hooks and drop the now-unused `FullConfig` imports.
## Changes
- **What**: Simplified `browser_tests/globalSetup.ts` and
`browser_tests/globalTeardown.ts` to match actual usage.
## Review Focus
Verify that removing the unused hook argument does not change Playwright
behavior.
## Screenshots (if applicable)
N/A
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10513-fix-remove-unused-Playwright-hook-config-args-32e6d73d365081d59b63dbbca0596025)
by [Unito](https://www.unito.io)
Co-authored-by: Alexander Brown <drjkl@comfy.org>
Adds lessons learned from a bulk backport session where 69 PRs were
admin-merged without CI checks, shipping 3 test failures to core/1.41.
**Changes:**
- **SKILL.md**: CI Safety Rules section, wave verification with `pnpm
test:unit`, continuous backporting recommendation, Never Admin-Merge
Without CI lesson
- **execution.md**: Wait-for-CI step after automation, `gh pr checks
--watch` for manual cherry-picks, CI Failure Triage section with common
failure categories
- **logging.md**: Wave verification log template, CI failure report
table in session report
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10164-chore-add-CI-safety-rules-to-backport-management-skill-3266d73d365081aa856de1fb85a31887)
by [Unito](https://www.unito.io)
---------
Co-authored-by: GitHub Action <action@github.com>
## Summary
Prune stale proxyWidgets entries that reference grandchild nodes no
longer present in the outer subgraph after nested packing.
## Changes
- **What**: Filter out proxyWidgets entries during hydration when the
source node doesn't exist in the subgraph. Also skip missing-node
entries in `_pruneStaleAliasFallbackEntries` as defense-in-depth. Write
back cleaned entries so stale data doesn't persist.
## Review Focus
The fix touches two codepaths in `SubgraphNode.ts`:
1. **Hydration** (`_internalConfigureAfterSlots`): Added `getNodeById`
guard before accepting a proxyWidget entry, and broadened the write-back
condition from legacy-only to any filtered entries.
2. **Runtime pruning** (`_pruneStaleAliasFallbackEntries`): Added
early-exit for entries whose source node no longer exists — previously
these survived because failed resolution returned `undefined` which
bypassed the concrete-key comparison.
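The hydration guard can be pictured as a filter plus a "did anything change" flag that drives the write-back. This sketch assumes entries are `[sourceNodeId, widgetName]` pairs and that `getNodeById` returns `null` for a missing node; `SubgraphLike` and `pruneStaleEntries` are hypothetical names.

```typescript
// Assumed entry shape: [sourceNodeId, widgetName].
type ProxyWidgetEntry = [string, string]

interface SubgraphLike {
  getNodeById(id: string): object | null
}

// Keeps only entries whose source node still exists in the subgraph.
function pruneStaleEntries(
  entries: ProxyWidgetEntry[],
  subgraph: SubgraphLike
): { kept: ProxyWidgetEntry[]; changed: boolean } {
  const kept = entries.filter(
    ([nodeId]) => subgraph.getNodeById(nodeId) !== null
  )
  // "changed" signals that the cleaned list should be written back.
  return { kept, changed: kept.length !== entries.length }
}
```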
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10390-fix-prune-stale-proxyWidgets-referencing-nodes-removed-by-nested-subgraph-packing-32b6d73d365081e69eedcb2b67d7043d)
by [Unito](https://www.unito.io)
---------
Co-authored-by: Amp <amp@ampcode.com>
## Summary
Move the `getNodeDefs` unit test out of deprecated `tests-ui` and into
`src/scripts` so Vitest discovers and runs it.
## Changes
- **What**: Renamed `tests-ui/tests/scripts/app.getNodeDefs.test.ts` to
`src/scripts/app.getNodeDefs.test.ts`
## Review Focus
Confirm the spec now follows the colocated test convention and is
included by the existing Vitest `include` globs.
## Screenshots (if applicable)
N/A
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10503-test-move-getNodeDefs-spec-into-src-scripts-32e6d73d3650816f9211dc4c20daba4b)
by [Unito](https://www.unito.io)
## Summary
Use named `dotenv` config imports where we were calling
`dotenv.config()` so ESLint and IDEs stop flagging
`import-x/no-named-as-default-member`.
## Changes
- **What**: Replace default `dotenv` imports plus `.config()` member
access with `import { config as dotenvConfig } from 'dotenv'` in browser
test setup/fixture files and the desktop Vite config.
- **What**: Keep behavior unchanged while aligning those files with the
cleaner import form already used elsewhere in the repo.
## Review Focus
This is a no-behavior-change cleanup. The issue was that `dotenv`
exposes `config` both as a named export and as a property on the
default-exported module object, so `import dotenv from 'dotenv';
dotenv.config()` triggers `import-x/no-named-as-default-member` even
though it works at runtime.
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10514-fix-use-named-dotenv-config-imports-32e6d73d36508195b346dbcab764a6b8)
by [Unito](https://www.unito.io)
## Summary
Add layout duration, style recalc duration, and heap usage metrics to CI
perf reports, while improving statistical reliability to reduce false
positive regressions.
## Changes
- **What**:
- Collect `layoutDurationMs`, `styleRecalcDurationMs`, `heapUsedBytes`
(absolute snapshot) alongside existing metrics
- Add effect size gate (`minAbsDelta`) for integer-quantized count
metrics (style recalcs, layouts, DOM nodes, event listeners) — prevents
z=7.2 false positives from e.g. 11→12 style recalcs
- Switch from mean to **median** for PR metric aggregation — robust to
outlier CI runs that dominate n=3 mean
- Increase historical baseline window from **5 to 15 runs** for more
stable σ estimates
- Reorder reported metrics: layout/style duration first (actionable),
counts and heap after (informational)
## Review Focus
The effect size gate in `classifyChange()` — it now requires both z > 2
AND absolute delta ≥ `minAbsDelta` (when configured) to flag a
regression. This addresses the core false positive issue where integer
metrics with near-zero historical variance produce extreme z-scores for
trivial changes.
Median vs mean tradeoff: median is more robust to outliers but less
sensitive to real shifts — acceptable given n=3 and CI noise levels.
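To make the gate concrete, here is a small sketch of median aggregation plus the two-condition check. It is illustrative only: the signature of the real `classifyChange()` is not shown in this description, so `classifyChangeSketch`, its parameters, and the `minAbsDelta` value used in the note below are assumptions.

```typescript
// Median of a non-empty list of samples; does not mutate the input.
function median(values: number[]): number {
  const sorted = [...values].sort((a, b) => a - b)
  const mid = Math.floor(sorted.length / 2)
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2
}

type Verdict = 'regression' | 'neutral'

// Flags a regression only if z > 2 AND (when configured) |delta| >= minAbsDelta.
function classifyChangeSketch(
  prValue: number,
  baselineMean: number,
  baselineStd: number,
  minAbsDelta?: number
): Verdict {
  const delta = prValue - baselineMean
  // A zero std gives Infinity (or NaN for zero delta); the comparison below handles both.
  const z = delta / baselineStd
  const significant = z > 2
  const largeEnough =
    minAbsDelta === undefined || Math.abs(delta) >= minAbsDelta
  return significant && largeEnough ? 'regression' : 'neutral'
}
```

With a baseline mean of 11 style recalcs and a very small σ, a PR median of 12 produces a large z-score; with a hypothetical `minAbsDelta` of 2 it is still classified as `neutral`, which is the false positive the gate is meant to suppress.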
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10477-perf-add-layout-GC-metrics-reduce-false-positives-in-regression-detection-32d6d73d365081daa72cec96d8a07b90)
by [Unito](https://www.unito.io)
## Summary
App mode templates (names ending in `.app`, e.g.
`templates-qwen_multiangle.app`) were never counted as template
executions in Mixpanel because `getExecutionContext` used
`activeWorkflow.filename` for the `knownTemplateNames` lookup — but
`getFilenameDetails` treats `.app.json` as a compound extension and
strips it entirely, leaving `"templates-qwen_multiangle"` instead of
`"templates-qwen_multiangle.app"`. The set lookup always returned
`false`, so every execution was sent with `is_template: false`.
## Changes
- **Fix**: derive the template lookup key from
`fullFilename.replace(/\.json$/i, '')` instead of `filename`, which
preserves the `.app` suffix and correctly matches `knownTemplateNames`
- **Also fixes**: `workflow_name`, `getTemplateByName`, and
`getEnglishMetadata` calls in the same branch now use the corrected name
- **Tests**: three new cases in `MixpanelTelemetryProvider.test.ts` —
regular template, `.app` template (regression), and non-template
## Before / After
| Template name in index | `activeWorkflow.filename` | `fullFilename` → stripped | `is_template` |
|---|---|---|---|
| `flux-dev` | `flux-dev` | `flux-dev` | ✅ true |
| `templates-qwen_multiangle.app` | `templates-qwen_multiangle` ❌ | `templates-qwen_multiangle.app` ✅ | fixed: true |
## Review Focus
The change is confined to `getExecutionContext.ts`. `fullFilename` is
always set (it is assigned in `UserFile` constructor from
`getPathDetails`), so no null-safety issue.
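The lookup-key derivation is small enough to show directly. In this sketch, `templateLookupKey` and the example set contents are hypothetical; only the `replace(/\.json$/i, '')` expression comes from the change description.

```typescript
// Strips only a trailing ".json", so a ".app" suffix survives.
function templateLookupKey(fullFilename: string): string {
  return fullFilename.replace(/\.json$/i, '')
}

// Example index of known template names (illustrative values).
const knownTemplateNames = new Set([
  'flux-dev',
  'templates-qwen_multiangle.app'
])

function isTemplate(fullFilename: string): boolean {
  return knownTemplateNames.has(templateLookupKey(fullFilename))
}
```

`isTemplate('templates-qwen_multiangle.app.json')` is `true` here, whereas a key with the whole `.app.json` extension stripped would miss the set.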
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10252-fix-restore-is_template-tracking-for-app-mode-templates-3276d73d365081d4b998edc62ad010dc)
by [Unito](https://www.unito.io)
`formatJsonValue()` uses a loose regex `^\d{4}-\d{2}-\d{2}` to detect
date-like strings, which matches non-date strings like
`"2024-01-01-beta"`.
Changes:
- Require ISO 8601 `T` separator: `/^\d{4}-\d{2}-\d{2}T/`
- Validate parse result with `!Number.isNaN(date.getTime())`
- Use `d()` i18n formatter for consistency with `formatDate()` in the
same file
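A minimal sketch of the stricter detection, using the regex and the `Number.isNaN` check listed above. `parseIsoDateTime` is a hypothetical helper name, and the i18n formatting step with `d()` is left out.

```typescript
// Requires the ISO 8601 "T" separator after the date part.
const ISO_DATE_TIME_PREFIX = /^\d{4}-\d{2}-\d{2}T/

// Returns a Date for date-time strings that parse, otherwise null.
function parseIsoDateTime(value: string): Date | null {
  if (!ISO_DATE_TIME_PREFIX.test(value)) return null
  const date = new Date(value)
  return Number.isNaN(date.getTime()) ? null : date
}
```

`parseIsoDateTime('2024-01-01-beta')` returns `null` because the string fails the prefix check before any parsing is attempted.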
## Summary
Scaffolds the new apps/website/ Astro 5 + Vue 3 marketing site inside
the monorepo.
## Changes
- apps/website/ with package.json, astro.config.mjs, tsconfig, Nx
targets
- @comfyorg/design-system/css/base.css — brand tokens + fonts (no
PrimeVue)
- pnpm-workspace.yaml catalog entries for Astro deps
- .gitignore and env.d.ts for Astro
## Stack (via Graphite)
- **[1/3] Scaffold** ← this PR
- #10141 [2/3] Layout Shell
- #10142 [3/3] Homepage Sections
Part of the comfy.org website refresh (replacing Framer).
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10140-feat-scaffold-Astro-5-website-app-design-system-base-css-1-3-3266d73d365081688dcee0220a03eca4)
by [Unito](https://www.unito.io)
## Summary
Restores `getAdditionalUserInfo` from Firebase Auth so sign-up telemetry
fires when *either* Firebase or the UI context identifies a new user,
fixing a regression from #10388.
## Changes
- **What**: In `loginWithGoogle` and `loginWithGithub`, call
`getAdditionalUserInfo(result)` and OR it with the UI-provided
`options?.isNewUser` flag: `is_new_user: options?.isNewUser ||
additionalUserInfo?.isNewUser || false`. Added 8 parameterized unit
tests covering the OR truth table (Firebase true, UI true, both false,
null result).
## Review Focus
The OR semantics: if either source says new user, we send `sign_up`
telemetry. Previously only the UI flag was checked, which missed cases
where the user lands directly on the OAuth provider without going
through the sign-up view.
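The OR semantics reduce to a one-line helper. This is a sketch: `resolveIsNewUser` is a hypothetical name, and the two arguments stand in for `options?.isNewUser` and `additionalUserInfo?.isNewUser`.

```typescript
// True if either the UI flag or the Firebase flag marks the user as new.
function resolveIsNewUser(
  uiFlag: boolean | undefined,
  firebaseFlag: boolean | null | undefined
): boolean {
  return uiFlag || firebaseFlag || false
}
```

The trailing `|| false` collapses `null` and `undefined` (for example, a null `getAdditionalUserInfo` result) to a plain boolean.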
## Testing
Unit tests cover all branches of the OR logic. An e2e test is not
feasible here because it would require completing a real OAuth flow with
Google/GitHub (interactive popup, valid credentials, CAPTCHA) and
intercepting the resulting `getAdditionalUserInfo` response from
Firebase — none of which can be reliably automated in a headless
Playwright environment without a live Firebase project seeded with
disposable accounts.
Fixes #10447
## Summary
Fix subgraph node slot connector links appearing misaligned after
workflow load, caused by a transform desync between LiteGraph's internal
canvas transform and the Vue TransformPane's CSS transform.
## Changes
- **What**: Changed `syncNodeSlotLayoutsFromDOM` to use DOM-relative
measurement (slot position relative to its parent `[data-node-id]`
element) instead of absolute canvas-space conversion via
`clientPosToCanvasPos`. This makes the slot offset calculation
independent of the global canvas transform, eliminating the frame-lag
desync that occurred when `fitView()` updated `lgCanvas.ds` before the
Vue CSS transform caught up.
- **Cleanup**: Removed the unreachable fallback path that still used
`clientPosToCanvasPos` when the parent node element wasn't found (every
slot element is necessarily a child of a `[data-node-id]` element — if
`closest()` fails the element is detached and measuring is meaningless).
This also removed the `conv` parameter from `syncNodeSlotLayoutsFromDOM`
and `flushScheduledSlotLayoutSync`, and the
`useSharedCanvasPositionConversion` import.
- **Test**: Added a Playwright browser test that loads a subgraph
workflow with `workflowRendererVersion: "LG"` (triggering the 1.2x scale
in `ensureCorrectLayoutScale`) as a template (triggering `fitView`), and
verifies all slot connector positions are within bounds of their parent
node element.
## Review Focus
- The core change is in `useSlotElementTracking.ts` — the new
measurement approach uses `getBoundingClientRect()` on both the slot and
its parent node element, then divides by `currentScale` to get
canvas-space offsets. This is simpler and more robust than the previous
approach.
- SubgraphNodes were disproportionately affected because they are
relatively static and don't often trigger `ResizeObserver`-based
re-syncs that would eventually correct stale offsets.
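The DOM-relative measurement described above can be sketched as a pure function. This is a minimal illustration, not the code in `useSlotElementTracking.ts`: the helper name `slotOffsetFromNode`, the `RectLike` shape, and the choice to measure the slot's center are assumptions made for the example.

```typescript
// Minimal rectangle shape; a DOMRect from getBoundingClientRect() satisfies it.
interface RectLike {
  left: number
  top: number
  width: number
  height: number
}

// Hypothetical helper, not the PR's code: offset of the slot center from the
// node's top-left corner, converted from screen pixels to canvas units.
function slotOffsetFromNode(
  slotRect: RectLike,
  nodeRect: RectLike,
  currentScale: number
): { x: number; y: number } {
  const centerX = slotRect.left + slotRect.width / 2
  const centerY = slotRect.top + slotRect.height / 2
  return {
    x: (centerX - nodeRect.left) / currentScale,
    y: (centerY - nodeRect.top) / currentScale
  }
}
```

Because both rectangles are read in the same frame, any global pan applied to the canvas cancels out in the subtraction, so only the scale is still needed.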
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-9121-fix-resolve-subgraph-node-slot-link-misalignment-during-workflow-load-3106d73d365081eca413c84f2e0571d6)
by [Unito](https://www.unito.io)
---------
Co-authored-by: GitHub Action <action@github.com>
Co-authored-by: Alexander Brown <448862+DrJKL@users.noreply.github.com>
Co-authored-by: Amp <amp@ampcode.com>
Co-authored-by: Alexander Brown <drjkl@comfy.org>
The custom context menu provided by the frontend exposes widget-specific
options. To support renaming, promotion, and favoriting, there
needs to be a way to access this context menu when targeting a textarea.
However, always displaying this custom context menu would cost the user
access to browser-specific functionality like spell checking,
translation, and the ability to copy and paste text.
This PR updates the behaviour so that the native browser context menu
will display when the text area already has focus. Our custom frontend
context menu will continue to display when it does not.
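A rough sketch of that decision, assuming the check is made inside a `contextmenu` handler. The function name and the `ElementLike` shape are hypothetical; in a browser the arguments would be `event.target` and `document.activeElement`.

```typescript
// Structural stand-in for a DOM element so the sketch runs outside a browser.
interface ElementLike {
  tagName: string
}

// Hypothetical predicate: let the browser show its own menu only when the
// right-clicked textarea is the element that already holds focus.
function shouldShowNativeContextMenu(
  target: ElementLike,
  activeElement: ElementLike | null
): boolean {
  return target.tagName === 'TEXTAREA' && target === activeElement
}
```

A handler would return early without calling `preventDefault()` when the predicate is true, and open the custom menu otherwise.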
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10454-Use-native-context-menu-for-focused-textareas-32d6d73d365081909673d81d6a6ba054)
by [Unito](https://www.unito.io)
Rebased and adopted from #5774 by @felixturner.
## Changes
- Remove unused font-size properties (`NODE_TEXT_SIZE`,
`NODE_SUBTEXT_SIZE`, `DEFAULT_GROUP_FONT`) from theme palettes and color
palette schema
- Replace `DEFAULT_GROUP_FONT`/`DEFAULT_GROUP_FONT_SIZE` with a single
`GROUP_TEXT_SIZE = 20` constant (reduced from 24px)
- Use `NODE_TITLE_HEIGHT` for group header height instead of `font_size
* 1.4`
- Vertically center group title text using `textBaseline = 'middle'`
- Use `GROUP_TEXT_SIZE` directly in TitleEditor instead of per-group
`font_size`
- Remove `font_size` from group serialization (no longer per-group
configurable)
## Original PR
Closes #5774
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-9839-feat-Improve-group-title-layout-3216d73d36508112a0edc2a370af20ba)
by [Unito](https://www.unito.io)
---------
Co-authored-by: Felix Turner <felixturner@gmail.com>
Co-authored-by: github-actions <github-actions@github.com>
Co-authored-by: Alexander Brown <drjkl@comfy.org>
## Problem
When upgrading from V1 to V2 draft persistence, users' open workflow
tabs were lost. V1 stored tab state (open paths + active index) in
localStorage via `setStorageValue` fallback, but the V1→V2 migration
only migrated draft payloads — not these tab state pointers.
This meant that after upgrading, all previously open tabs disappeared
and users had to manually reopen their workflows.
## Solution
Add `migrateV1TabState()` to the V1→V2 migration path. After draft
payloads are migrated, the function reads the V1 localStorage keys
(`Comfy.OpenWorkflowsPaths` and `Comfy.ActiveWorkflowIndex`) and writes
them to V2's sessionStorage format via `writeOpenPaths()`.
The `clientId` is threaded from `useWorkflowPersistenceV2` (which has
access to `api.clientId`) through to `migrateV1toV2()`.
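The migration step might look roughly like the sketch below. Only the two V1 key names come from the description above; the V2 key format, the pointer shape, the assumption that V1 stored JSON, and the function name `migrateV1TabStateSketch` are invented for illustration and do not reflect the real `writeOpenPaths()` signature.

```typescript
interface StorageLike {
  getItem(key: string): string | null
  setItem(key: string, value: string): void
}

const V1_OPEN_PATHS_KEY = 'Comfy.OpenWorkflowsPaths'
const V1_ACTIVE_INDEX_KEY = 'Comfy.ActiveWorkflowIndex'

interface OpenPathsPointer {
  paths: string[]
  activeIndex: number
}

// Hypothetical sketch: read V1 tab state from localStorage and write it to a
// per-client sessionStorage entry. Returns null when there is nothing to migrate.
function migrateV1TabStateSketch(
  local: StorageLike,
  session: StorageLike,
  clientId: string
): OpenPathsPointer | null {
  const rawPaths = local.getItem(V1_OPEN_PATHS_KEY)
  if (rawPaths === null) return null

  let parsed: unknown
  try {
    parsed = JSON.parse(rawPaths)
  } catch {
    return null // Malformed V1 data is skipped rather than migrated.
  }
  if (!Array.isArray(parsed) || parsed.length === 0) return null
  if (!parsed.every((path) => typeof path === 'string')) return null

  // Fall back to the first tab when the stored index is missing or out of range.
  const index = Number(local.getItem(V1_ACTIVE_INDEX_KEY) ?? 0)
  const inRange = Number.isInteger(index) && index >= 0 && index < parsed.length
  const pointer: OpenPathsPointer = {
    paths: parsed as string[],
    activeIndex: inRange ? index : 0
  }
  // Assumed V2 key layout, scoped by clientId.
  session.setItem(`sketch:openPaths:${clientId}`, JSON.stringify(pointer))
  return pointer
}
```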
## Changes
- **`migrateV1toV2.ts`**: Added `migrateV1TabState()` + V1 key constants
for tab state
- **`useWorkflowPersistenceV2.ts`**: Pass `api.clientId` to migration
call
- **`migrateV1toV2.test.ts`**: Two new tests proving tab state migration
works
## Testing
TDD approach — RED commit shows the test failing, GREEN commit shows it
passing.
All 123 persistence tests pass.
- Fixes #9974
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10007-fix-migrate-V1-tab-state-pointers-during-V1-V2-draft-migration-3256d73d36508103b619e521c1b603f5)
by [Unito](https://www.unito.io)
## Summary
The preload error toast fires whenever any custom node extension fails
to load via dynamic `import()`. In practice, this is almost always
caused by third-party plugin bugs rather than ComfyUI core issues.
Common triggers include:
- Bare module specifiers (e.g., `import { ref } from "vue"`) that the browser
cannot resolve without an import map
- Incorrect relative paths to `scripts/app.js` due to nested web
directory structures
- Missing dependencies on other extensions (e.g., `clipspace.js`)
Since many users have multiple custom nodes installed, the toast
frequently appears on startup — sometimes multiple times — with a
generic message that offers no actionable guidance. This creates
unnecessary alarm and support burden.
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10445-fix-disable-preload-error-toast-triggered-by-third-party-plugin-failures-32d6d73d365081f281efcd6fe90642a5)
by [Unito](https://www.unito.io)
## Summary
Adds custom status messages that are shown under the previews to
give the user additional progress feedback.
Nodes matching the words:
- Save, Preview -> Saving
- Load, Loader -> Loading
- Encode -> Encoding
- Decode -> Decoding
- Compile, Conditioning, Merge -> Processing
- Upscale, Resize -> Resizing
- ToVideo -> Generating video

Specific nodes:
- KSampler, KSamplerAdvanced, SamplerCustom, SamplerCustomAdvanced -> Generating
- Video Slice, GetVideoComponents, CreateVideo -> Processing video
- TrainLoraNode -> Training
## Changes
- **What**:
- add specific node lookups for non-easily matchable patterns
- add regex based matching for common patterns
- show on both latent preview & skeleton preview
- allow app mode workflow authors to override status with custom
property `Execution Message` (no UI for doing this)
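The lookup order could be sketched as follows. The table contents mirror the mappings listed in the summary; the function name, the case-sensitive matching, and the rule that the override wins over everything else are assumptions, since the PR text does not spell them out.

```typescript
// Exact node names checked before any pattern (taken from the PR summary).
const NODE_STATUS = new Map<string, string>([
  ['KSampler', 'Generating'],
  ['KSamplerAdvanced', 'Generating'],
  ['SamplerCustom', 'Generating'],
  ['SamplerCustomAdvanced', 'Generating'],
  ['Video Slice', 'Processing video'],
  ['GetVideoComponents', 'Processing video'],
  ['CreateVideo', 'Processing video'],
  ['TrainLoraNode', 'Training']
])

// Patterns are tried in order; the first match wins.
const PATTERN_STATUS: ReadonlyArray<[RegExp, string]> = [
  [/Save|Preview/, 'Saving'],
  [/Load/, 'Loading'], // also covers "Loader"
  [/Encode/, 'Encoding'],
  [/Decode/, 'Decoding'],
  [/Compile|Conditioning|Merge/, 'Processing'],
  [/Upscale|Resize/, 'Resizing'],
  [/ToVideo/, 'Generating video']
]

// Hypothetical resolver: override first, then exact lookup, then patterns.
function resolveExecutionStatus(
  nodeType: string,
  override?: string
): string | undefined {
  if (override) return override
  const exact = NODE_STATUS.get(nodeType)
  if (exact) return exact
  for (const [pattern, status] of PATTERN_STATUS) {
    if (pattern.test(nodeType)) return status
  }
  return undefined
}
```

Returning `undefined` for unmatched nodes leaves the caller free to show no message at all, which seems the safest default for unknown custom nodes.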
## Review Focus
This is purely pattern/lookup based. In the future, we could update the
backend node schema to allow nodes to define their own status key.
## Screenshots (if applicable)
<img width="757" height="461" alt="image"
src="https://github.com/user-attachments/assets/2b32cc54-c4e7-4aeb-912d-b39ac8428be7"
/>
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10369-feat-App-mode-add-execution-status-messages-32a6d73d3650814e8ca2da5eb33f3b65)
by [Unito](https://www.unito.io)
---------
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
## Problem
Since PR #8520 (`feat(persistence): fix QuotaExceededError and
cross-workspace draft leakage`), all workflow tabs are lost when the
browser is closed and reopened.
PR #8520 moved tab pointers (`ActivePath`, `OpenPaths`) from
`localStorage` to `sessionStorage` for per-tab isolation. However,
`sessionStorage` is cleared when the browser closes, so the open tab
list is lost on restart. The draft data itself survives in
`localStorage` — only the pointers to which tabs were open are lost.
Reported in
[Comfy-Org/ComfyUI#12984](https://github.com/Comfy-Org/ComfyUI/issues/12984).
Confirmed via binary search: v1.40.9 (last good) → v1.40.10 (first bad).
## Changes
Dual-write tab pointers to both storage layers:
- **sessionStorage** (scoped by `clientId`) — used for in-session
refresh, preserves per-tab isolation
- **localStorage** (scoped by `workspaceId`) — fallback for browser
restart when sessionStorage is empty
Also adds:
- `storageAvailable` guard on write functions for consistency with
`writeIndex`/`writePayload`
- `isValidPointer` validation on localStorage reads to reject stale or
malformed data
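The dual-write and fallback read might be sketched like this. The key layout, the `TabPointer` shape, and the `...Sketch` function names are hypothetical; only the idea of writing to both layers, preferring sessionStorage, and validating localStorage reads comes from the description above.

```typescript
interface StorageLike {
  getItem(key: string): string | null
  setItem(key: string, value: string): void
}

interface TabPointer {
  workspaceId: string
  openPaths: string[]
  activePath: string | null
}

// Hypothetical key layout; the real key names are not given in the PR text.
const sessionKey = (clientId: string) => `sketch:tabs:client:${clientId}`
const localKey = (workspaceId: string) => `sketch:tabs:workspace:${workspaceId}`

function isValidPointerSketch(value: unknown, workspaceId: string): value is TabPointer {
  if (typeof value !== 'object' || value === null) return false
  const candidate = value as Partial<TabPointer>
  return (
    candidate.workspaceId === workspaceId &&
    Array.isArray(candidate.openPaths) &&
    candidate.openPaths.every((path) => typeof path === 'string') &&
    (candidate.activePath === null || typeof candidate.activePath === 'string')
  )
}

// Write the same pointer to both layers.
function writeTabPointerSketch(
  session: StorageLike,
  local: StorageLike,
  clientId: string,
  pointer: TabPointer
): void {
  const raw = JSON.stringify(pointer)
  session.setItem(sessionKey(clientId), raw)
  local.setItem(localKey(pointer.workspaceId), raw)
}

// Prefer sessionStorage; fall back to localStorage after a browser restart.
function readTabPointerSketch(
  session: StorageLike,
  local: StorageLike,
  clientId: string,
  workspaceId: string
): TabPointer | null {
  const candidates = [
    session.getItem(sessionKey(clientId)),
    local.getItem(localKey(workspaceId))
  ]
  for (const raw of candidates) {
    if (raw === null) continue
    try {
      const parsed: unknown = JSON.parse(raw)
      if (isValidPointerSketch(parsed, workspaceId)) return parsed
    } catch {
      // Malformed JSON: try the next storage layer.
    }
  }
  return null
}
```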
## Benefits
- Workflow tabs survive browser restart (restores V1 behavior)
- Per-tab isolation is preserved for in-session use (sessionStorage is
still preferred when available)
## Trade-offs
- On browser restart, the restored tabs come from whichever browser tab
wrote last to localStorage. If Tab A had workflows 1,2,3 and Tab B had
4,5 — the user gets whichever tab wrote most recently. This is the same
limitation V1 had with `Comfy.OpenWorkflowsPaths` in localStorage.
- Previously (post-#8520), opening a new browser tab would only restore
the single most recent draft. With this fix, a new tab restores the full
set of open tabs from the last session. This may be surprising for
multi-tab users who expect a clean slate in new tabs.
## Test plan
- [x] `pnpm typecheck` passes
- [x] `pnpm lint` passes
- [x] All 121 persistence tests pass
- [x] Manual: open multiple workflow tabs → close browser → reopen →
tabs restored
- [x] Manual: open two browser tabs with different workflows → refresh
each → correct tabs in each
Fixes Comfy-Org/ComfyUI#12984
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10336-fix-restore-workflow-tabs-on-browser-restart-3296d73d365081b7a7d3e91427d08d17)
by [Unito](https://www.unito.io)
<!-- QA_REPORT_SECTION -->
---
## 🔍 Automated QA Report
| | |
|---|---|
| **Status** | ✅ Complete |
| **Report** | [sno-qa-10336.comfy-qa.pages.dev](https://sno-qa-10336.comfy-qa.pages.dev/) |
| **CI Run** | [View workflow](https://github.com/Comfy-Org/ComfyUI_frontend/actions/runs/23373697656) |
Before/after video recordings with **Behavior Changes** and **Timeline
Comparison** tables.
## Summary
Refactor essentials tab node organization to eliminate duplicated logic
and restrict essentials to core nodes only.
## Changes
- **What**:
- Extract `resolveEssentialsCategory` to centralize category resolution
(was duplicated between filter and pathExtractor).
- Add `isCoreNode` guard so third-party nodes never appear in
essentials.
- Replace `indexOf`-based sorting with precomputed rank maps
(`ESSENTIALS_CATEGORY_RANK`, `ESSENTIALS_NODE_RANK`).
<img width="589" height="769" alt="image"
src="https://github.com/user-attachments/assets/66f41f35-aef5-4e12-97d5-0f33baf0ac45"
/>
## Review Focus
- The `isCoreNode` guard in `resolveEssentialsCategory` — ensures only
core nodes can appear in essentials even if a custom node sets
`essentials_category`.
- Rank map precomputation vs previous `indexOf` — functionally
equivalent but O(1) lookup.
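For reference, a rank map of this kind can be built once from the ordered list and then used inside a comparator. The category names below are placeholders, not the real contents of `ESSENTIALS_CATEGORY_RANK`, and sorting unknown names last is an assumption of this sketch.

```typescript
// Placeholder ordering; the real essentials categories are not listed in this PR text.
const CATEGORY_ORDER = ['basics', 'image', 'video'] as const

// Precompute name -> position once so each comparison is a Map lookup
// instead of an indexOf scan over the ordered array.
function buildRankMap(order: readonly string[]): Map<string, number> {
  return new Map(order.map((name, index): [string, number] => [name, index]))
}

// Names missing from the map sort after all ranked names (assumed behavior).
function compareByRank(rank: Map<string, number>) {
  const last = Number.MAX_SAFE_INTEGER
  return (a: string, b: string): number =>
    (rank.get(a) ?? last) - (rank.get(b) ?? last)
}

const CATEGORY_RANK = buildRankMap(CATEGORY_ORDER)
```

Usage would be `names.sort(compareByRank(CATEGORY_RANK))`.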
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10433-refactor-clean-up-essentials-node-organization-logic-32d6d73d36508193a4d1f7f9c18fcef7)
by [Unito](https://www.unito.io)
Co-authored-by: Amp <amp@ampcode.com>
## Summary
Refactors the error system to improve separation of concerns, fix DDD
layer violations, and address code quality issues.
- Extract `missingNodesErrorStore` from `executionErrorStore`, removing
the delegation pattern that coupled missing-node logic into the
execution error store
- Extract `useNodeErrorFlagSync` composable for node error flag
reconciliation (previously inlined)
- Extract `useErrorClearingHooks` composable with explicit callback
cleanup on node removal
- Extract `useErrorActions` composable to deduplicate telemetry+command
patterns across error card components
- Move `getCnrIdFromNode`/`getCnrIdFromProperties` to
`platform/nodeReplacement` layer (DDD fix)
- Move `missingNodesErrorStore` to `platform/nodeReplacement` (DDD
alignment)
- Add unmount cancellation guard to `useErrorReport` async `onMounted`
- Return watch stop handle from `useNodeErrorFlagSync`
- Add `asyncResolvedIds` eviction on `missingNodesError` reset
- Add `console.warn` to silent catch blocks and empty array guard
- Hoist `useCommandStore` to setup scope, fix floating promises
- Add `data-testid` to error groups, image/video error spans, copy
button
- Update E2E tests to use scoped locators and testids
- Add unit tests for `onNodeRemoved` restoration and double-install
guard
Fixes #9875, Fixes #10027, Fixes #10033, Fixes #10085
## Test plan
- [x] Existing unit tests pass with updated imports and mocks
- [x] New unit tests for `useErrorClearingHooks` (callback restoration,
double-install guard)
- [x] E2E tests updated to use scoped locators and `data-testid`
- [ ] Manual: verify error tab shows runtime errors and missing nodes
correctly
- [ ] Manual: verify "Find on GitHub", "Copy", and "Get Help" buttons
work in error cards
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10302-refactor-error-system-cleanup-store-separation-DDD-fix-test-improvements-3286d73d365081838279d045b8dd957a)
by [Unito](https://www.unito.io)
---------
Co-authored-by: GitHub Action <action@github.com>
## Summary
- The "Show Advanced Inputs" footer button was missing `headerColor`
style binding, causing it to not sync with the node header color (unlike
the "Enter Subgraph" button which already had it)
- Extracted the repeated `{ backgroundColor: headerColor }` inline style
(4 occurrences) into a `headerColorStyle` computed
## Screenshots
before
<img width="211" height="286" alt="Screenshot 2026-03-24 154312"
src="https://github.com/user-attachments/assets/edfd9480-04fa-4cd4-813d-a95adffbe2d3"
/>
after
<img width="261" height="333" alt="Screenshot 2026-03-24 154622"
src="https://github.com/user-attachments/assets/eab28717-889e-4a6b-8775-bfc08fa727ff"
/>
## Test plan
- [x] Set a custom color on a node with advanced inputs and verify the
footer button matches the header color
- [x] Verify subgraph enter button still syncs correctly
- [x] Verify dual-tab layouts (error + advanced, error + subgraph) both
show correct colors
### Why no E2E test
Node header color is applied as an inline style via `headerColor` prop,
which is already passed and tested through the existing subgraph enter
button path. This change simply extends the same binding to the advanced
inputs buttons — no new data flow or interaction is introduced, so a
screenshot-based E2E test would add maintenance cost without meaningful
regression coverage.
## Summary
Extract duplicated click-vs-drag detection logic into a shared
`useClickDragGuard` composable and `exceedsClickThreshold` pure utility
function.
## Changes
- **What**: New `useClickDragGuard(threshold)` composable in
`src/composables/useClickDragGuard.ts` that stores pointer start
position and checks squared distance against a threshold. Also exports
`exceedsClickThreshold` for non-Vue contexts.
- Migrated `DropZone.vue`, `useNodePointerInteractions.ts`, and
`Load3d.ts` to use the shared utility
- `CanvasPointer.ts` left as-is (LiteGraph internal)
- All consumers now use squared-distance comparison (no `Math.sqrt` or
per-axis `Math.abs`)
## Review Focus
- The composable uses plain `let` state instead of `ref` since
reactivity is not needed for the start position
- `Load3d.ts` uses the pure `exceedsClickThreshold` function directly
since it is a class, not a Vue component
- Threshold values preserved per-consumer: DropZone=5,
useNodePointerInteractions=3, Load3d=5
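The squared-distance check can be sketched as below. The names `exceedsClickThreshold` and `useClickDragGuard` come from the PR, but the signatures, the returned method names, and the strict `>` comparison are assumptions made for this example rather than the actual implementation.

```typescript
interface PointLike {
  x: number
  y: number
}

// Compare squared distance with squared threshold so no Math.sqrt is needed.
function exceedsClickThreshold(
  start: PointLike,
  end: PointLike,
  threshold: number
): boolean {
  const dx = end.x - start.x
  const dy = end.y - start.y
  return dx * dx + dy * dy > threshold * threshold
}

// Assumed composable shape: plain `let` state, no Vue reactivity involved.
function useClickDragGuardSketch(threshold = 5) {
  let start: PointLike | null = null
  return {
    recordStart(point: PointLike): void {
      start = { x: point.x, y: point.y }
    },
    wasDragged(point: PointLike): boolean {
      return start !== null && exceedsClickThreshold(start, point, threshold)
    },
    reset(): void {
      start = null
    }
  }
}
```

A pointer-up handler would call `wasDragged(...)` and skip the click action when it returns true.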
Fixes #10356
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10357-refactor-extract-shared-click-vs-drag-guard-utility-32a6d73d3650816e83f5cb89872fb184)
by [Unito](https://www.unito.io)
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Alexander Brown <drjkl@comfy.org>
## Summary
Reduce settings dialog size and autofocus search input for better
usability.
## Changes
- **What**: Reduce dialog size from `md` to `sm` (max-width 1400px →
960px); autofocus search input on open
## Review Focus
User feedback indicated the settings dialog was too wide and search
required an extra click to focus.
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10396-fix-improve-settings-dialog-UX-32c6d73d365081e29eceed55afde1967)
by [Unito](https://www.unito.io)
Co-authored-by: Amp <amp@ampcode.com>
## Summary
Add grid view mode for multi-image batches in ImagePreview (Nodes 2.0),
replicating the Nodes 1.0 grid UX where all output images are visible as
clickable thumbnails.
## Changes
- **What**: Multi-image batches now default to a grid view showing all
thumbnails. Clicking a thumbnail switches to gallery mode for that
image. A persistent back-to-grid button sits next to navigation dots,
and hover action bars provide gallery toggle, download, and remove.
Replaced PrimeVue `Skeleton` with shadcn `Skeleton`. Added `viewGrid`,
`viewGallery`, `imageCount`, `galleryThumbnail` i18n keys.
## Review Focus
- Grid column count strategy: fixed breakpoints (2 cols ≤4, 3 cols ≤9, 4
cols 10+) vs CSS auto-fill
- Default view mode: grid for multi-image, gallery for single — matches
Nodes 1.0 behavior
- `object-contain` on thumbnails to avoid cropping (with `aspect-square`
containers for uniform cells)
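The fixed-breakpoint strategy amounts to a small function. The name `gridColumnCount` is hypothetical; the breakpoints are the ones listed above.

```typescript
// Fixed breakpoints from the review notes: 2 cols up to 4 images,
// 3 cols up to 9 images, 4 cols for 10 or more.
function gridColumnCount(imageCount: number): number {
  if (imageCount <= 4) return 2
  if (imageCount <= 9) return 3
  return 4
}
```

Compared with CSS auto-fill, this keeps the column count predictable regardless of node width, at the cost of not adapting to very wide or very narrow nodes.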
Fixes #9162
<!-- Pipeline-Ticket: f8f8effa-adff-4ede-b1d3-3c4f04b9c4a0 -->
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-9241-feat-add-grid-view-mode-for-multi-image-batches-in-ImagePreview-3136d73d36508166895ed6c635150434)
by [Unito](https://www.unito.io)
---------
Co-authored-by: Alexander Brown <drjkl@comfy.org>
Co-authored-by: Alexander Brown <448862+DrJKL@users.noreply.github.com>
Co-authored-by: Amp <amp@ampcode.com>
## Summary
- When the cloud backend returns a 403 (user not whitelisted), the
frontend showed a generic "Prompt Execution Error" dialog with a cryptic
message
- Now catches 403 responses specifically and shows an "Access
Restricted" dialog with the backend's actual error message
- Adds `status` field to `PromptExecutionError` so error handlers can
distinguish HTTP status codes
## Changes
- `api.ts`: Added optional `status` to `PromptExecutionError`, pass
`res.status` from `queuePrompt`
- `app.ts`: New `else if` branch in the prompt error handler for `status
=== 403` — shows "Access Restricted" with the backend message
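In outline, the new branch behaves like the sketch below. It models the error as a plain object with an optional `status` rather than the real `PromptExecutionError` class, and the function name and return shape are hypothetical.

```typescript
// Simplified stand-in for PromptExecutionError with the new optional status.
interface PromptErrorLike {
  message: string
  status?: number
}

interface ErrorDialogContent {
  title: string
  message: string
}

// Hypothetical mapping: 403 gets its own title, everything else is unchanged.
function describePromptError(error: PromptErrorLike): ErrorDialogContent {
  if (error.status === 403) {
    return { title: 'Access Restricted', message: error.message }
  }
  return { title: 'Prompt Execution Error', message: error.message }
}
```

Errors created before this change carry no `status`, so they fall through to the existing title.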
## Backwards compatible
- **Old backend** (`"not authorized"`): Shows "Access Restricted: not
authorized"
- **New backend**
([cloud#2941](https://github.com/Comfy-Org/cloud/pull/2941), `"your
account is not whitelisted for this feature"`): Shows "Access
Restricted: your account is not whitelisted for this feature"
- No behavior change for non-403 errors
## Related
- Backend fix: Comfy-Org/cloud#2941
- Notion: COM-16179
## Test plan
- [ ] Submit a prompt as a non-whitelisted user → should see "Access
Restricted" dialog with clear message
- [ ] Submit a prompt as a whitelisted user → no change in behavior
- [ ] Submit a prompt that fails for other reasons (missing nodes, etc.)
→ existing error handling unchanged
🤖 Generated with [Claude Code](https://claude.com/claude-code)
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10402-fix-show-clear-error-dialog-for-403-whitelist-failures-32c6d73d365081eb9528d7feac4e8681)
by [Unito](https://www.unito.io)
---------
Co-authored-by: Matt Miller <mattmiller@Matts-MacBook-Pro.local>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
## Summary
- When users open/import a workflow with missing nodes, we already track
this via Mixpanel
- This adds a parallel fire-and-forget POST to
`/api/internal/cloud_analytics` so the data also lands in **ClickHouse**
as `frontend:missing_nodes_detected` events
- Payload includes `missing_class_types[]`, `missing_count`, and
`source` (file_button/file_drop/template/unknown)
## Motivation
The frontend is where the **high-value** missing node signal lives —
most users see "missing nodes" and never submit. The backend only
catches the rare case where someone submits anyway. This change captures
both sides.
Companion cloud PR: https://github.com/Comfy-Org/cloud/pull/2886
## Changes
- `MixpanelTelemetryProvider.ts`: Added
`reportMissingNodesToClickHouse()` private method, called from
`trackWorkflowImported()` and `trackWorkflowOpened()`
- Only fires when `missing_node_count > 0`
- Fire-and-forget (`.catch(() => {})`) — no impact on user experience
- Uses existing `api.fetchApi()` which handles auth automatically
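A sketch of the fire-and-forget call, with the network function injected so the example stays self-contained. The payload field names and the route come from the summary above; the `event` envelope field, the `PostFn` type, and both function names are assumptions, and `api.fetchApi()` itself is not modeled.

```typescript
type MissingNodesSource = 'file_button' | 'file_drop' | 'template' | 'unknown'

interface MissingNodesPayload {
  missing_class_types: string[]
  missing_count: number
  source: MissingNodesSource
}

// Injected transport; in the app this role is played by api.fetchApi().
type PostFn = (route: string, body: string) => Promise<unknown>

// Returns null when there is nothing to report.
function buildMissingNodesPayload(
  missingClassTypes: string[],
  source: MissingNodesSource
): MissingNodesPayload | null {
  if (missingClassTypes.length === 0) return null
  return {
    missing_class_types: [...missingClassTypes],
    missing_count: missingClassTypes.length,
    source
  }
}

// Fire-and-forget: the promise is not awaited and failures are swallowed.
function reportMissingNodes(
  post: PostFn,
  missingClassTypes: string[],
  source: MissingNodesSource
): void {
  const payload = buildMissingNodesPayload(missingClassTypes, source)
  if (payload === null) return
  const body = JSON.stringify({
    event: 'frontend:missing_nodes_detected', // assumed envelope field
    ...payload
  })
  void post('/api/internal/cloud_analytics', body).catch(() => {})
}
```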
## Test plan
- [ ] Open a workflow with missing nodes → verify
`frontend:missing_nodes_detected` event appears in ClickHouse
- [ ] Open a workflow with no missing nodes → verify no event is sent
(check network tab)
- [ ] Verify Mixpanel tracking still works as before
🤖 Generated with [Claude Code](https://claude.com/claude-code)
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10132-feat-send-missing-node-data-to-ClickHouse-3266d73d365081559db5ed3efde33e95)
by [Unito](https://www.unito.io)
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: GitHub Action <action@github.com>
## Summary
Promoted primitive subgraph inputs (String, Int) render their link
anchor at the header position instead of the widget row. Renaming
subgraph input labels breaks the match entirely, causing connections to
detach from their widgets visually.
## Changes
- **What**: Fix widget-input slot positioning for promoted subgraph
inputs in both LiteGraph and Vue (Nodes 2.0) rendering modes
- `_arrangeWidgetInputSlots`: Removed Vue mode branch that skipped
setting `input.pos`. Promoted widget inputs aren't rendered as
`<InputSlot>` Vue components (NodeSlots filters them out), so
`input.pos` is the only position fallback
- `drawConnections`: Added pre-pass to arrange nodes with unpositioned
widget-input slots before link rendering. The background canvas renders
before the foreground canvas calls `arrange()`, so positions weren't set
on the first frame
- `SubgraphNode`: Sync `input.widget.name` with the display name on
label rename and initial setup. The `IWidgetLocator` name diverged from
`PromotedWidgetView.name` after rename, breaking all name-based
slot↔widget matching (`_arrangeWidgetInputSlots`, `getWidgetFromSlot`,
`getSlotFromWidget`)
## Review Focus
- The `_arrangeWidgetInputSlots` rewrite iterates `_concreteInputs`
directly instead of building a spread-copy map — simpler and avoids the
stale index issue
- `input.widget.name` is now kept in sync with the display name
(`input.label ?? subgraphInput.name`). This is a semantic shift from
using the raw internal name, but it's required for all name-based
matching to work after renames. The value is overwritten on deserialize
by `_setWidget` anyway
- The `_widget` fallback in `_arrangeWidgetInputSlots` is a safety net
for edge cases where the name still doesn't match (e.g., stale cache)
Fixes #9998
## Screenshots
<img width="847" height="476" alt="Screenshot 2026-03-17 at 3 05 32 PM"
src="https://github.com/user-attachments/assets/38f10563-f0bc-44dd-a1a5-f4a7832575d0"
/>
<img width="804" height="471" alt="Screenshot 2026-03-17 at 3 05 23 PM"
src="https://github.com/user-attachments/assets/3237a7ee-f3e5-4084-b330-371def3415bd"
/>
<img width="974" height="571" alt="Screenshot 2026-03-17 at 3 05 16 PM"
src="https://github.com/user-attachments/assets/cafdca46-8d9b-40e1-8561-02cbb25ee8f2"
/>
<img width="967" height="558" alt="Screenshot 2026-03-17 at 3 05 06 PM"
src="https://github.com/user-attachments/assets/fc03ce43-906c-474d-b3bc-ddf08eb37c75"
/>
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10195-fix-subgraph-promoted-widget-input-slot-positions-after-label-rename-3266d73d365081dfa623dd94dd87c718)
by [Unito](https://www.unito.io)
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: jaeone94 <jaeone.prt@gmail.com>
## Summary
- Extract all 75 `quickRegister()` mapping entries from
`modelToNodeStore.ts` into a new `modelNodeMappings.ts` constants file
- The store now iterates over the `MODEL_NODE_MAPPINGS` array instead of
having inline calls
- **Zero behavioral change** — same mappings, same order, same runtime
behavior
## Motivation
Adding new model-to-node mappings is currently a code change to the
store. By separating the data into its own file:
- New mappings are a **pure data change** (append a tuple to an array)
- The data file can have its own CODEOWNERS entry, so mapping PRs can be
merged without requiring frontend team review
- Easier to audit — all mappings visible in one place without
interleaved store logic
### Before
```ts
// 250+ lines of quickRegister() calls mixed into store logic
quickRegister('checkpoints', 'CheckpointLoaderSimple', 'ckpt_name')
quickRegister('checkpoints', 'ImageOnlyCheckpointLoader', 'ckpt_name')
// ... 73 more
```
### After
```ts
// modelNodeMappings.ts — pure data
export const MODEL_NODE_MAPPINGS = [
  ['checkpoints', 'CheckpointLoaderSimple', 'ckpt_name'],
  ['checkpoints', 'ImageOnlyCheckpointLoader', 'ckpt_name'],
  // ...
]
// modelToNodeStore.ts — just iterates
for (const [modelType, nodeClass, key] of MODEL_NODE_MAPPINGS) {
  quickRegister(modelType, nodeClass, key)
}
```
## Test plan
- [ ] "Use" button in model browser still works for all model types
- [ ] No regressions in model-to-node resolution (same mappings, same
order)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10237-refactor-extract-model-to-node-mappings-into-separate-data-file-3276d73d365081988656e2ddae772bbc)
by [Unito](https://www.unito.io)
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Alexander Brown <drjkl@comfy.org>
Co-authored-by: GitHub Action <action@github.com>
## Problem
GA4 shows **zero** `sign_up` events from cloud.comfy.org. All new users
(Google/GitHub) are tracked as `login` instead of `sign_up`.
**Root cause:** `getAdditionalUserInfo(result)?.isNewUser` from Firebase
is unreliable for popup auth flows — it compares `creationDate` vs
`lastSignInDate` timestamps, which can differ even for genuinely new
users. When it returns `null` or `false`, the code defaults to pushing a
`login` event instead of `sign_up`.
**Evidence:** GA4 Exploration filtered to `sign_up` + `cloud.comfy.org`
shows 0 events, while `login` shows 8,804 Google users, 519 email, 193
GitHub — all new users are being misclassified as logins.
(Additionally, the ~300 `sign_up` events visible in GA4 are actually
from `blog.comfy.org` — Substack newsletter subscriptions — not from the
app at all.)
## Fix
Use the UI flow context to determine `is_new_user` instead of Firebase's
unreliable API:
- `CloudSignupView.vue` → passes `{ isNewUser: true }` (user is on the
sign-up page)
- `CloudLoginView.vue` → no flag needed, defaults to `false` (user is on
the login page)
- `SignInContent.vue` → passes `{ isNewUser: !isSignIn.value }` (dialog
toggles between sign-in/sign-up)
- Removes the unused `getAdditionalUserInfo` import
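The resulting event selection reduces to a default-false flag. This is a simplified sketch: the `LoginOptions` shape follows the `{ isNewUser }` option described above, while the function name `authEventName` is hypothetical.

```typescript
type AuthEventName = 'sign_up' | 'login'

interface LoginOptions {
  isNewUser?: boolean
}

// The caller's UI context decides the event; a missing flag means login.
function authEventName(options: LoginOptions = {}): AuthEventName {
  return options.isNewUser === true ? 'sign_up' : 'login'
}
```

The sign-up view would pass `{ isNewUser: true }`, the login view would pass nothing, and the dialog would pass `{ isNewUser: !isSignIn.value }`.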
## Changes
- `firebaseAuthStore.ts`: `loginWithGoogle`/`loginWithGithub` accept
optional `{ isNewUser }` parameter instead of calling
`getAdditionalUserInfo`
- `useFirebaseAuthActions.ts`: passes the option through
- `CloudSignupView.vue`: passes `{ isNewUser: true }`
- `SignInContent.vue`: passes `{ isNewUser: !isSignIn.value }`
## Testing
- All 32 `firebaseAuthStore.test.ts` tests pass
- All 19 `GtmTelemetryProvider.test.ts` tests pass
- Typecheck passes
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10388-fix-use-UI-flow-context-for-sign_up-vs-login-telemetry-32b6d73d3650811e96cec281108abbf3)
by [Unito](https://www.unito.io)
---------
Co-authored-by: GitHub Action <action@github.com>
## Summary
Differentiates the subscription pricing dialog between personal and team
workspaces with distinct visual treatments and a two-stage team
workspace upgrade flow.
### Changes
- **Personal pricing dialog**: Shows "P" avatar badge, "Plans for
Personal Workspace" header, and "Solo use only – Need team workspace?"
banner on each tier card
- **Team pricing dialog**: Shows workspace avatar, "Plans for Team
Workspace" header (emerald), green "Invite up to X members" badge, and
emerald border on Creator card
- **Two-stage upgrade flow**: "Need team workspace?" → closes pricing →
opens CreateWorkspaceDialog → sessionStorage flag → page reload →
WorkspaceAuthGate auto-opens team pricing dialog
- **Spacing**: Reduced vertical gaps/padding/font sizes so the table
fits without scrolling
### Key decisions
- sessionStorage key `comfy:resume-team-pricing` bridges the page reload
during workspace creation
- `onChooseTeam` prop is conditionally passed only to the personal
variant
- `resumePendingPricingFlow()` is called from WorkspaceAuthGate after
workspace initialization
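The reload bridge can be sketched as a set-then-consume flag. The storage key is the one named above; the stored value, the function names, and the callback parameter are assumptions, and the real `resumePendingPricingFlow()` may differ.

```typescript
interface SessionLike {
  getItem(key: string): string | null
  setItem(key: string, value: string): void
  removeItem(key: string): void
}

const RESUME_TEAM_PRICING_KEY = 'comfy:resume-team-pricing'

// Called before the page reload triggered by workspace creation.
function markTeamPricingPending(session: SessionLike): void {
  session.setItem(RESUME_TEAM_PRICING_KEY, '1')
}

// Called after workspace initialization. The flag is cleared before the
// dialog opens so a later reload does not reopen it.
function resumePendingPricingFlowSketch(
  session: SessionLike,
  openTeamPricing: () => void
): boolean {
  if (session.getItem(RESUME_TEAM_PRICING_KEY) === null) return false
  session.removeItem(RESUME_TEAM_PRICING_KEY)
  openTeamPricing()
  return true
}
```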
┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-9901-feat-differentiate-personal-team-pricing-table-with-two-stage-team-workspace-flow-3226d73d365081e7af60dcca86e83673)
by [Unito](https://www.unito.io)
These are the primary architectural guardrails. Every entity/litegraph change must be checked against them.
### Command Pattern (ADR 0003)
All entity state mutations MUST be expressible as **serializable, idempotent, deterministic commands**. This is required for CRDT sync, undo/redo, cross-environment portability, and gateway backends.
Flag:
- **Direct spatial mutation** — `node.pos = ...`, `node.size = ...`, `group.pos = ...` outside of a store or command. All spatial data flows through `layoutStore` commands.
- **Imperative fire-and-forget mutation** — Any new API that mutates entity state as a side effect rather than producing a serializable command object. Systems should produce command batches, not execute mutations directly.
- **Void-returning mutation APIs** — New entity mutation functions that return `void` instead of a result type (`{ status: 'applied' | 'rejected' | 'no-op' }`). Commands need error/rejection semantics.
- **Auto-incrementing IDs in new entity code** — New entity creation using auto-increment counters without acknowledging the CRDT collision problem. Concurrent environments need globally unique, stable identifiers.
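As a point of reference for reviewers, a mutation that satisfies these rules might look like the sketch below. It is illustrative only: the command shape, the `LayoutState` stand-in, and `applyMoveNode` are hypothetical and do not represent the actual `layoutStore` API or ADR 0003's exact types.

```typescript
interface Point {
  x: number
  y: number
}

// Plain JSON data: safe to log, send over the wire, or replay.
interface MoveNodeCommand {
  type: 'moveNode'
  nodeId: string
  position: Point
}

interface CommandResult {
  status: 'applied' | 'rejected' | 'no-op'
  reason?: string
}

// Stand-in for a layout store keyed by node id.
type LayoutState = Map<string, Point>

// Deterministic and idempotent: replaying the same command yields 'no-op'.
function applyMoveNode(
  state: LayoutState,
  command: MoveNodeCommand
): CommandResult {
  const current = state.get(command.nodeId)
  if (!current) {
    return { status: 'rejected', reason: `unknown node ${command.nodeId}` }
  }
  if (current.x === command.position.x && current.y === command.position.y) {
    return { status: 'no-op' }
  }
  state.set(command.nodeId, { ...command.position })
  return { status: 'applied' }
}
```

By contrast, `node.pos = [x, y]` produces no command object and no result, so it cannot be serialized, rejected, or replayed.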
### ECS Architecture (ADR 0008)
The graph domain model is migrating to ECS. New code must not make the migration harder.
Flag:
- **God-object growth** — New methods/properties added to `LGraphNode` (~4k lines), `LGraphCanvas` (~9k lines), `LGraph` (~3k lines), or `Subgraph`. Extract to systems, stores, or composables instead.
- **Mixed data and behavior** — New component-like data structures that contain methods or back-references to parent entities. ECS components are plain data objects.
- **New circular entity dependencies** — New circular imports between `LGraph` ↔ `Subgraph`, `LGraphNode` ↔ `LGraphCanvas`, or similar entity classes.
- **Direct `graph._version++`** — Mutating the private version counter directly instead of through a public API. Extensions already depend on this side-channel; it must become a proper API.
### Centralized Registries and ECS-Style Access
All entity data access should move toward centralized query patterns, not instance property access.
Flag:
- **New instance method/property patterns** — Adding `node.someProperty` or `node.someMethod()` for data that should be a component in the World, queried via `world.getComponent(entityId, ComponentType)`.
- **OOP inheritance for entity modeling** — Extending entity classes with new subclasses instead of composing behavior through components and systems.
- **Scattered state** — New entity state stored in multiple locations (class properties, stores, local variables) instead of being consolidated in the World or in a single store.
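To make the target access pattern concrete, here is a deliberately small sketch of a World with `getComponent`. It is not the project's ECS implementation: `WorldSketch`, the `ComponentType` shape, and the `Position` component are hypothetical and only show data living in a central registry instead of on entity instances.

```typescript
type EntityId = string

// A component type is a key plus a factory for its plain-data default.
interface ComponentType<T> {
  readonly key: string
  readonly create: () => T
}

class WorldSketch {
  // One store per component type, keyed by entity id.
  private readonly stores = new Map<string, Map<EntityId, unknown>>()

  setComponent<T>(entityId: EntityId, type: ComponentType<T>, value: T): void {
    let store = this.stores.get(type.key)
    if (!store) {
      store = new Map<EntityId, unknown>()
      this.stores.set(type.key, store)
    }
    store.set(entityId, value)
  }

  getComponent<T>(entityId: EntityId, type: ComponentType<T>): T | undefined {
    return this.stores.get(type.key)?.get(entityId) as T | undefined
  }
}

// Components are plain data: no methods, no back-reference to a node object.
const Position: ComponentType<{ x: number; y: number }> = {
  key: 'position',
  create: () => ({ x: 0, y: 0 })
}
```

A reviewer can use this as a mental model: `world.getComponent(id, Position)` replaces reading `node.pos` from the instance.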
### Extension Ecosystem Impact
Entity API changes affect 40+ custom node repos. Changes to these patterns require an extension migration path.
For all other ADRs, iterate through each file in `docs/adr/` and extract the core lesson. Ensure changed code does not contradict accepted ADRs. Flag contradictions with proposed ADRs as directional guidance.
### How to Apply
1. Read `docs/adr/README.md` to get the full ADR index
2. For each ADR, read the Decision and Consequences sections
3. Check the diff against each ADR's constraints
4. Only flag ACTUAL violations in changed code, not pre-existing patterns
### Skip List
These ADRs can be skipped for most reviews (they cover completed or narrow-scope decisions):
- **ADR 0004** (Rejected — Fork PrimeVue) — only relevant if someone proposes forking PrimeVue again
## How to Check
1. Identify changed files in the entity/litegraph layer: `src/lib/litegraph/`, `src/ecs/`, `src/platform/`, entity-related stores
2. For Priority 1 patterns, use targeted searches:
description: Reviews Playwright E2E test code for ComfyUI-specific patterns, flakiness risks, and fixture misuse
severity-default: medium
tools: [Read, Grep]
---
You are reviewing Playwright E2E test code in `browser_tests/`. Focus on issues a **reviewer** would catch that an author might miss — flakiness risks, fixture misuse, test isolation problems, and convention violations.
1. **`waitForTimeout` usage** — Always wrong. Must use retrying assertions (`toBeVisible`, `toHaveText`), `expect.poll()`, or `expect().toPass()`. See retry patterns in `.claude/skills/writing-playwright-tests/SKILL.md`.
2. **Missing `nextFrame()` after canvas ops** — Any `drag`, `click` on canvas, `resizeNode`, `pan`, `zoom`, or programmatic graph mutation via `page.evaluate` that changes visual state needs `await comfyPage.nextFrame()` before assertions. `loadWorkflow()` does NOT need it. Prefer encapsulating `nextFrame()` calls inside Page Object methods so tests don't manage frame timing directly.
3. **Keyboard actions without prior focus** — `page.keyboard.press()` without a preceding `comfyPage.canvas.click()` or element `.focus()` will silently send keys to nothing.
4. **Coordinate-based interactions where node refs exist** — Raw `{ x, y }` clicks on canvas are fragile. If the test targets a node, use `comfyPage.nodeOps.getNodeRefById()` / `getNodeRefsByTitle()` / `getNodeRefsByType()` instead.
5. **Shared mutable state between tests** — Variables declared outside `test()` blocks, `let` state mutated across tests, or tests depending on execution order. Each test must be independently runnable.
6. **Missing cleanup of server-persisted state** — Settings changed via `comfyPage.settings.setSetting()` persist across tests. Must be reset in `afterEach` or at test start. Same for uploaded files or saved workflows. Prefer moving cleanup into [fixture options](https://playwright.dev/docs/test-fixtures#fixtures-options) so individual tests don't manage reset logic.
7. **Double-click without `{ delay }` option** — `dblclick()` without `{ delay: 5 }` or similar can be too fast for the canvas event handler.
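As a rough illustration of item 1, the check can be sketched as a plain text scan over a spec file. The function name `findWaitForTimeout` is hypothetical and the regex is an assumption: it does not parse TypeScript, so it will also match occurrences inside comments or strings.

```typescript
// Hypothetical helper: report 1-based line numbers that call waitForTimeout.
function findWaitForTimeout(source: string): number[] {
  const hits: number[] = []
  source.split('\n').forEach((line, index) => {
    // Matches `.waitForTimeout(` with optional whitespace before the paren.
    if (/\.waitForTimeout\s*\(/.test(line)) hits.push(index + 1)
  })
  return hits
}

// Sample spec body used only to exercise the scan.
const sample = [
  'await comfyPage.canvas.click()',
  'await comfyPage.page.waitForTimeout(500)',
  'await expect(dialog).toBeVisible()'
].join('\n')

const flagged = findWaitForTimeout(sample)
```

Each flagged line is a candidate for replacement with a retrying assertion such as `toBeVisible` or `expect.poll()`.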
### Fixture & API Misuse (Medium)
8. **Reimplementing existing fixture helpers** — Before flagging, grep `browser_tests/fixtures/` for the functionality. Common missed helpers:
   - `comfyPage.command.executeCommand()` for menu/command actions
   - `comfyPage.workflow.loadWorkflow()` for loading test workflows
   - `comfyPage.canvasOps.resetView()` for view reset
   - `comfyPage.settings.setSetting()` for settings
   - Component page objects in `browser_tests/fixtures/components/`
9. **Building workflows programmatically when a JSON asset would work** — Complex `page.evaluate` chains to construct a graph should use a premade JSON workflow in `browser_tests/assets/` loaded via `comfyPage.workflow.loadWorkflow()`.
10. **Selectors not using `TestIds`** — Hard-coded `data-testid` strings should reference `browser_tests/fixtures/selectors.ts` when a matching entry exists. Check `selectors.ts` before flagging.
### Convention Violations (Minor)
11. **Missing test tags** — Every `test.describe` should have `tag` with at least one of: `@smoke`, `@slow`, `@screenshot`, `@canvas`, `@node`, `@widget`, `@mobile`, `@2x`. See `.claude/skills/writing-playwright-tests/SKILL.md` for when to use each.
12. **`as any` type assertions** — Forbidden in E2E tests. Use specific type assertions or test-local type helpers. See `docs/guidance/playwright.md` for acceptable patterns.
13. **Screenshot tests without masking dynamic content** — Timestamps, version numbers, or other non-deterministic content in screenshots will cause flakes. Use `mask` option.
14. **`test.describe` without `afterEach` cleanup when canvas state changes** — Tests that manipulate canvas view (drag, zoom, pan) should include `afterEach` with `comfyPage.canvasOps.resetView()`. Prefer moving canvas reset into the fixture so individual tests don't manage cleanup.
15. **Debug helpers left in committed code** — `debugAddMarker`, `debugAttachScreenshot`, `debugShowCanvasOverlay`, `debugGetCanvasDataURL` are for local debugging only.
### Test Design (Nitpick)
16. **Screenshot-only assertions where functional assertions are possible** — Prefer `expect(await node.isPinned()).toBe(true)` over screenshot comparison when testing non-visual behavior.
17. **Overly large test workflows** — Test should load the minimal workflow needed. If a test only needs one node, don't load the full default graph.
18. **Vue Nodes / LiteGraph mismatch** — If testing Vue-rendered node UI (DOM widgets, CSS states), should use `comfyPage.vueNodes.*`. If testing canvas interactions/connections, should use `comfyPage.nodeOps.*`. Mixing both in one test is a smell.
## Rules
- Only review `.spec.ts` files and supporting code in `browser_tests/`
- Do NOT flag patterns in fixture/helper code (`browser_tests/fixtures/`) — those are shared infrastructure with different rules
- "Major" for flakiness risks (items 1-7), "medium" for fixture misuse (8-10), "minor" for convention violations (11-15), "nitpick" for test design (16-18)
- When flagging missing fixture usage (item 8), confirm the helper exists by checking the fixture code — don't assume
- Existing tests that predate conventions are acceptable to modify but not required to fix
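The severity rule above can be written down as a small lookup. This is a sketch only: the function name `severityForItem` is hypothetical, and the item ranges are copied from the rule as stated.

```typescript
type Severity = 'major' | 'medium' | 'minor' | 'nitpick'

// Maps a checklist item number (1-18) to the severity named in the rules.
function severityForItem(item: number): Severity {
  if (!Number.isInteger(item) || item < 1 || item > 18) {
    throw new RangeError(`Unknown checklist item: ${item}`)
  }
  if (item <= 7) return 'major' // flakiness risks
  if (item <= 10) return 'medium' // fixture and API misuse
  if (item <= 15) return 'minor' // convention violations
  return 'nitpick' // test design
}
```

For example, a missing `nextFrame()` (item 2) maps to `'major'`, while an oversized test workflow (item 17) maps to `'nitpick'`.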
2. **Imperative fire-and-forget APIs** — Functions that mutate entity state as side effects rather than producing serializable command objects. Systems should produce command batches, not execute mutations directly.
4. **Auto-increment IDs** — New entity creation via counters without addressing CRDT collision. Concurrent environments need globally unique identifiers.
5. **Missing transaction semantics** — Multi-entity operations without atomic grouping (e.g., node removal = 10+ deletes with no rollback on failure)
### Check B: ECS Architecture (ADR 0008)
Flag:
1. **God-object growth** — New methods/properties on `LGraphNode`, `LGraphCanvas`, `LGraph`, `Subgraph`
2. **Mixed data/behavior** — Component-like structures with methods or back-references
3. **OOP instance patterns** — New `node.someProperty` or `node.someMethod()` for data that should be a World component
4. **OOP inheritance** — New entity subclasses instead of component composition
- Priority 1 findings: N (command-pattern: N, ECS: N, ecosystem: N)
- Priority 2 findings: N
### Priority 1: Command Pattern & ECS
(List each with ADR reference, file, line, description)
### Priority 1: Extension Ecosystem Impact
(List each changed callback/API with affected custom node repos)
### Priority 2: General ADR Compliance
(List each with ADR reference, file, line, description)
### Compliant Patterns
(Note changes that positively align with ADR direction)
```
## Severity
- **Must fix**: Contradicts accepted ADR, or introduces imperative mutation API without command-pattern wrapper, or breaks extension callback without migration path
- **Should discuss**: Contradicts proposed ADR direction — either align or propose ADR amendment
- **Note**: Surfaces open architectural question not yet addressed by ADRs
**NEVER merge a backport PR without all CI checks passing.** This applies to both automation-created and manual cherry-pick PRs.
- **Automation PRs:** The `pr-backport.yaml` workflow now enables `gh pr merge --auto --squash`, so clean PRs auto-merge once CI passes. Monitor with polling (`gh pr list --base TARGET_BRANCH --state open`). Do not intervene unless CI fails.
- **Manual cherry-pick PRs:** After `gh pr create`, wait for CI before merging. Poll with `gh pr checks $PR --watch` or use a sleep+check loop. Only merge after all checks pass.
- **CI failures:** DO NOT use `--admin` to bypass failing CI. Analyze the failure, present it to the user with possible causes (test backported without implementation, missing dependency, flaky test), and let the user decide the next step.
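The merge rule above reduces to a three-way decision. The sketch below assumes a simplified `CheckStatus` type; it is not the output format of `gh pr checks`, and `decideMerge` is a hypothetical name.

```typescript
// Simplified status model (an assumption, not the gh CLI's schema).
type CheckStatus = 'pass' | 'fail' | 'pending'
type MergeDecision = 'merge' | 'wait' | 'triage'

function decideMerge(checks: CheckStatus[]): MergeDecision {
  // No reported checks is treated as "not yet passing", never as green.
  if (checks.length === 0) return 'wait'
  // Any failure goes to the user for triage; never bypass with --admin.
  if (checks.includes('fail')) return 'triage'
  // Keep polling until every check has finished.
  if (checks.includes('pending')) return 'wait'
  return 'merge'
}
```

The `'triage'` branch corresponds to presenting the failure and its possible causes to the user instead of forcing the merge.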
If typecheck fails, stop and investigate before continuing. A broken branch after wave N means all subsequent waves will compound the problem.
If typecheck or tests fail, stop and investigate before continuing. A broken branch after wave N means all subsequent waves will compound the problem.
### Never Admin-Merge Without CI
In a previous bulk session, all 69 backport PRs were merged with `gh pr merge --squash --admin`, bypassing required CI checks. This shipped 3 test failures to a release branch. **Lesson: `--admin` skips all branch protection, including required status checks.** Only use `--admin` after confirming CI has passed (e.g., `gh pr checks $PR` shows all green), or rely on auto-merge (`--auto --squash`) which waits for CI by design.
# Check which got auto-PRs (auto-merge is enabled, so clean ones will self-merge after CI)
gh pr list --base TARGET_BRANCH --state open --limit 50 --json number,title
```
## Step 2: Review & Merge Clean Auto-PRs
> **Note:** The `pr-backport.yaml` workflow now enables `gh pr merge --auto --squash` on automation-created PRs. Clean PRs will auto-merge once CI passes — no manual merge needed for those.
## Step 2: Wait for CI & Merge Clean Auto-PRs
Most automation PRs will auto-merge once CI passes (via `--auto --squash` in the workflow). Monitor and handle failures:
7. **appModeStore.ts, painter files, GLSLShader files** don't exist on core/1.40 — `git rm` these
8. **Always validate JSON** after resolving locale file conflicts
9. **Dep refresh PRs** — skip on stable branches. Risk of transitive dep regressions outweighs audit cleanup. Cherry-pick individual CVE fixes instead.
10. **Verify after each wave** — run `pnpm typecheck` on the target branch after merging a batch. Catching breakage early prevents compounding errors.
10. **Verify after each wave** — run `pnpm typecheck && pnpm test:unit` on the target branch after merging a batch. Catching breakage early prevents compounding errors.
11. **Cloud-only PRs don't belong on core/\* branches** — app mode, cloud auth, and cloud-specific UI changes are irrelevant to local users. Always check PR scope against branch scope before backporting.
12. **Never admin-merge without CI** — `--admin` bypasses all branch protections including required status checks. A bulk session of 69 admin-merges shipped 3 test failures. Always wait for CI to pass first, or use `--auto --squash` which waits by design.
## CI Failure Triage
When CI fails on a backport PR, present failures to the user using this template:
```markdown
### PR #XXXX — CI Failed
- **Failing check:** test / lint / typecheck
- **Error:** (summary of the failure message)
- **Likely cause:** test backported without implementation / missing dependency / flaky test / snapshot mismatch
- **Recommendation:** backport PR #YYYY first / skip this PR / rerun CI after fixing prerequisites
description: 'Reproduce a GitHub issue by researching prerequisites, setting up the environment (custom nodes, workflows, settings), and interactively exploring ComfyUI via playwright-cli until the bug is confirmed. Then records a clean demo video.'
---
# Issue Reproduction Skill
Reproduce a reported GitHub issue against a running ComfyUI instance. This skill uses an interactive, agent-driven approach — not a static script. You will research, explore, retry, and adapt until the bug is reproduced, then record a clean demo.
## Architecture
Two videos are produced:
1. **Research video** — the full exploration session: installing deps, trying things, failing, retrying, figuring out the bug. Valuable for debugging context.
2. **Reproduce video** — a clean, minimal recording of just the reproduction steps. This is the demo you'd attach to the issue.
```
Phase 1: Research → Read issue, understand prerequisites
## Phase 4: Record Clean Demo — Reproduce Video (max 5 minutes)
Once the bug is confirmed, **stop the research video** and **close the research browser**:
```bash
playwright-cli video-stop
playwright-cli close
```
Now start a **fresh browser session** for the clean reproduce video (Video 2).
**IMPORTANT constraints:**
- **Max 5 minutes** — the reproduce video must be short and focused
- **No environment setup** — server, user, custom nodes are already set up from Phase 3. Just log in and go.
- **No exploration** — you already know the exact steps. Execute them quickly and precisely.
- **Start video recording immediately**, execute steps, stop. Don't leave the recording running while thinking.
1. **Open browser and start recording**:
```bash
playwright-cli open "http://127.0.0.1:8188"
playwright-cli video-start
```
2. **Execute only the minimal reproduction steps** — no exploration, no mistakes. Just the clean sequence that demonstrates the bug. You already know exactly what works from Phase 3.
description: 'Comprehensive QA of ComfyUI frontend. Navigates all routes, tests all interactive features using playwright-cli, generates a report, and submits a draft PR. Works in CI and local environments, cross-platform.'
---
# ComfyUI Frontend QA Skill
Automated quality assurance for the ComfyUI frontend. The pipeline reproduces reported bugs using Playwright E2E tests, records video evidence, and deploys reports to Cloudflare Pages.
## Architecture Overview
The QA pipeline uses a **three-phase approach**:
1. **RESEARCH** — Claude writes Playwright E2E tests to reproduce bugs (assertion-backed, no hallucination)
2. **REPRODUCE** — Deterministic replay of the research test with video recording
3. **REPORT** — Deploy results to Cloudflare Pages with badge, video, and verdict
### Key Design Decision
Earlier iterations used AI vision (Gemini) to drive a browser and judge results from video. This was abandoned after discovering **AI reviewers hallucinate** — Gemini reported "REPRODUCED" when videos showed idle screens. The current approach uses **Playwright assertions** as the source of truth: if the test passes, the bug is proven.
Go to Actions → "PR: QA" → Run workflow → choose mode (focused/full).
## CI Workflow (`.github/workflows/pr-qa.yaml`)
```
resolve-matrix → analyze-pr ──┐
├→ qa-before (main branch, worktree build)
├→ qa-after (PR branch)
└→ report (video review, deploy, comment)
```
Before/after jobs run **in parallel** on separate runners for clean isolation.
### Issue Reproduce Mode
For issues (not PRs), the pipeline:
1. Fetches the issue body and comments
2. Runs `qa-analyze-pr.ts --type issue` to generate a QA guide
3. Runs the research phase (Claude writes E2E test to reproduce)
4. Records video of the test execution
5. Posts results as a comment on the issue
## Running Locally
### Step 1: Environment Setup
```bash
# Ensure ComfyUI server is running
# Default: http://127.0.0.1:8188
# Install Playwright browsers
npx playwright install chromium
```
### Step 2: Analyze the Issue/PR
```bash
# For a PR
pnpm exec tsx scripts/qa-analyze-pr.ts \
--pr-number 10394 \
--repo Comfy-Org/ComfyUI_frontend \
--output-dir qa-guides
# For an issue
pnpm exec tsx scripts/qa-analyze-pr.ts \
--pr-number 10394 \
--repo Comfy-Org/ComfyUI_frontend \
--output-dir qa-guides \
--type issue
```
### Step 3: Record Before/After
```bash
# Before (main branch)
pnpm exec tsx scripts/qa-record.ts \
--mode before \
--diff /tmp/pr-diff.txt \
--output-dir /tmp/qa-before \
--qa-guide qa-guides/qa-guide-1.json
# After (PR branch)
pnpm exec tsx scripts/qa-record.ts \
--mode after \
--diff /tmp/pr-diff.txt \
--output-dir /tmp/qa-after \
--qa-guide qa-guides/qa-guide-1.json
```
### Step 4: Review Videos
```bash
pnpm exec tsx scripts/qa-video-review.ts \
--artifacts-dir /tmp/qa-artifacts \
--video-file qa-session.mp4 \
--before-video qa-before-session.mp4 \
--output-dir /tmp/video-reviews \
--pr-context /tmp/pr-context.txt
```
## Research Phase Details (`qa-agent.ts`)
Claude receives:
- The issue description and comments
- A QA guide from `qa-analyze-pr.ts`
- An accessibility tree snapshot of the current UI
Claude's tools:
- **`inspect(selector?)`** — Read a11y tree to discover element selectors
- **`writeTest(code)`** — Write a Playwright `.spec.ts` file
- **`runTest()`** — Execute the test and get pass/fail + errors
- **`done(verdict, summary, evidence, testCode)`** — Finish with verdict
The test uses the project's Playwright fixtures (`comfyPageFixture`), giving access to `comfyPage.page`, `comfyPage.menu`, `comfyPage.settings`, etc.
### Verdict Logic
- **REPRODUCED** — Test passes (asserting the bug exists) → bug is proven
- **NOT_REPRODUCIBLE** — Claude exhausted attempts, test cannot pass
- **INCONCLUSIVE** — Agent timed out or encountered infrastructure issues
Auto-completion: if a test passed but `done()` was never called, the pipeline auto-completes with REPRODUCED.
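The verdict rules can be sketched as a single resolution function. The `AgentRun` shape and `resolveVerdict` name are hypothetical, and the precedence between a passing test and a timeout is an assumption based on the auto-completion rule above.

```typescript
type Verdict = 'REPRODUCED' | 'NOT_REPRODUCIBLE' | 'INCONCLUSIVE'

// Hypothetical summary of one research run.
interface AgentRun {
  declaredVerdict?: Verdict // set only if the agent called done()
  lastTestPassed: boolean
  timedOut: boolean
  infrastructureError: boolean
}

function resolveVerdict(run: AgentRun): Verdict {
  // An explicit done() call always wins.
  if (run.declaredVerdict) return run.declaredVerdict
  // Auto-completion: a passing test without done() counts as REPRODUCED.
  if (run.lastTestPassed) return 'REPRODUCED'
  // Timeouts and infrastructure problems say nothing about the bug itself.
  if (run.timedOut || run.infrastructureError) return 'INCONCLUSIVE'
  // Attempts exhausted and the test never passed.
  return 'NOT_REPRODUCIBLE'
}
```

Because a passing test is treated as proof, the value of this logic depends on the test asserting something specific to the bug rather than a generic condition.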
## Manual QA (Fallback)
When the automated pipeline isn't suitable (e.g., visual-only bugs, complex multi-step interactions), use **playwright-cli** for manual browser interaction:
```bash
# Install
npm install -g @playwright/cli@latest
# Open browser and navigate
playwright-cli open http://127.0.0.1:8188
# Get element references
playwright-cli snapshot
# Interact
playwright-cli click e1
playwright-cli fill e2 "test text"
playwright-cli press Escape
playwright-cli screenshot --filename=f.png
```
Snapshots return element references (`e1`, `e2`, …). Always run `snapshot` after navigation to refresh refs.
## Manual QA Test Plan
When performing manual QA (either via playwright-cli or the automated pipeline), systematically test each area below.
### Application Load & Routes
| Test | Steps |
|---|---|
| Root route loads | Navigate to `/` — GraphView should render with canvas |
| User select route | Navigate to `/user-select` — user selection UI should appear |
| 404 handling | Navigate to `/nonexistent` — should handle gracefully |
### Canvas & Graph View
| Test | Steps |
|---|---|
| Canvas renders | The LiteGraph canvas is visible and interactive |
| Pan canvas | Click and drag on empty canvas area |
| Zoom in/out | Use scroll wheel or Alt+=/Alt+- |
| Add node via double-click | Double-click canvas to open search, type "KSampler", select it |
| Delete node | Select a node, press Delete key |
| Connect nodes | Drag from output slot to input slot |
| Copy/Paste | Select nodes, Ctrl+C then Ctrl+V |
| Undo/Redo | Make changes, Ctrl+Z to undo, Ctrl+Y to redo |
| Context menus | Right-click node vs empty canvas — different menus |
description: 'Detect DOM elements where CSS contain:layout+style would improve rendering performance. Runs a Playwright-based audit on a large workflow, scores candidates by subtree size and sizing constraints, measures performance impact, and generates a ranked report.'
---
# CSS Containment Audit
Automatically finds DOM elements where adding `contain: layout style` would reduce browser recalculation overhead.
## What It Does
1. Loads a large workflow (245 nodes) in a real browser
2. Walks the DOM tree and scores every element as a containment candidate
3. For each high-scoring candidate, applies `contain: layout style` via JavaScript
4. Measures rendering performance (style recalcs, layouts, task duration) before and after
5. Takes before/after screenshots to detect visual breakage
6. Generates a ranked report with actionable recommendations
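Step 4 compares metrics taken before and after applying containment. A minimal sketch of that comparison follows; the `PerfSample` field names and the sample numbers are illustrative assumptions, not the audit tool's actual data model or results.

```typescript
// Hypothetical metric shape, named after the metrics listed in step 4.
interface PerfSample {
  styleRecalcs: number
  layouts: number
  taskDurationMs: number
}

type PerfDelta = Record<keyof PerfSample, number>

// Percent change relative to the baseline; negative means improvement.
function percentChange(before: number, after: number): number {
  if (before === 0) return 0 // avoid dividing by zero on an empty baseline
  return ((after - before) * 100) / before
}

function compareSamples(before: PerfSample, after: PerfSample): PerfDelta {
  return {
    styleRecalcs: percentChange(before.styleRecalcs, after.styleRecalcs),
    layouts: percentChange(before.layouts, after.layouts),
    taskDurationMs: percentChange(before.taskDurationMs, after.taskDurationMs)
  }
}

// Made-up numbers purely to exercise the function.
const delta = compareSamples(
  { styleRecalcs: 100, layouts: 50, taskDurationMs: 200 },
  { styleRecalcs: 80, layouts: 50, taskDurationMs: 150 }
)
```

Candidates could then be ranked by these deltas, with the screenshots from step 5 used to reject any that change the rendering.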
## When to Use
- After adding new Vue components to the node rendering pipeline
- When investigating rendering performance on large workflows
- Before and after refactoring node DOM structure
- As part of periodic performance audits
## How to Run
```bash
# Start the dev server first
pnpm dev &
# Run the audit (uses the @audit tag, not included in normal CI runs)
pnpm exec playwright test browser_tests/tests/containAudit.spec.ts --project=audit
This PR added `contain-layout contain-style` to the node inner wrapper div in `LGraphNode.vue`. The audit tool would have flagged this element as a high-scoring candidate because:
- **Externally constrained size** (`w-(--node-width)`, `flex-1` — dimensions set by CSS variables and flex parent)
- **Natural isolation boundary** between frequently-changing content (widgets) and infrequently-changing overlays (selection outlines, borders)
The actual change was a single line: adding `'contain-layout contain-style'` to the inner wrapper's class list at `src/renderer/extensions/vueNodes/components/LGraphNode.vue:79`.
3. The PR description includes a concrete, non-placeholder explanation of why an end-to-end regression test was not added.
Fail otherwise. When failing, mention which bug-fix signal you found and ask the author to either add or update a Playwright regression test under `browser_tests/` or add a concrete explanation in the PR description of why an end-to-end regression test is not practical.
- name: ADR compliance for entity/litegraph changes
mode: warning
instructions: |
Use only PR metadata already available in the review context: the changed-file list relative to the PR base, the PR description, and the diff content. Do not rely on shell commands.
This check applies ONLY when the PR modifies files under `src/lib/litegraph/`, `src/ecs/`, or files related to graph entities (nodes, links, widgets, slots, reroutes, groups, subgraphs).
If none of those paths appear in the changed files, pass immediately.
When applicable, check for:
1. **Command pattern (ADR 0003)**: Entity state mutations must be serializable, idempotent, deterministic commands — not imperative fire-and-forget side effects. Flag direct spatial mutation (`node.pos =`, `node.size =`, `group.pos =`) outside of a store or command, and any new void-returning mutation API that should produce a command object.
2. **God-object growth (ADR 0008)**: New methods/properties added to `LGraphNode`, `LGraphCanvas`, `LGraph`, or `Subgraph` that add responsibilities rather than extracting/migrating existing ones.
3. **ECS data/behavior separation (ADR 0008)**: Component-like data structures that contain methods or back-references to parent entities. ECS components must be plain data. New OOP instance patterns (`node.someProperty`, `node.someMethod()`) for data that should be a World component.
4. **Extension ecosystem (ADR 0008)**: Changes to extension-facing callbacks (`onConnectionsChange`, `onRemoved`, `onAdded`, `onConfigure`, `onConnectInput/Output`, `onWidgetChanged`), `node.widgets` access, `node.serialize` overrides, or `graph._version++` without migration guidance. These affect 40+ custom node repos.
Pass if none of these patterns are found in the diff.
When warning, reference the specific ADR by number and link to `docs/adr/` for context. Frame findings as directional guidance since ADR 0003 and 0008 are in Proposed status.
--onlyAllow 'MIT;MIT*;Apache-2.0;BSD-2-Clause;BSD-3-Clause;ISC;0BSD;BlueOak-1.0.0;Python-2.0;CC0-1.0;Unlicense;(MIT OR Apache-2.0);(MIT OR GPL-3.0);(Apache-2.0 OR MIT);(MPL-2.0 OR Apache-2.0);CC-BY-4.0;CC-BY-3.0;GPL-3.0-only'; then
@@ -208,7 +208,7 @@ See @docs/testing/\*.md for detailed patterns.
3. Keep your module mocks contained
Do not use global mutable state within the test file
Use `vi.hoisted()` if necessary to allow for per-test Arrange phase manipulation of deeper mock state
4. For Component testing, use [Vue Test Utils](https://test-utils.vuejs.org/) and especially follow the advice [about making components easy to test](https://test-utils.vuejs.org/guide/essentials/easy-to-test.html)
4. For Component testing, prefer [@testing-library/vue](https://testing-library.com/docs/vue-testing-library/intro/) with `@testing-library/user-event` for user-centric, behavioral tests. [Vue Test Utils](https://test-utils.vuejs.org/) is also accepted, especially for tests that need direct access to the component wrapper (e.g., `findComponent`, `emitted()`). Follow the advice [about making components easy to test](https://test-utils.vuejs.org/guide/essentials/easy-to-test.html)
5. Aim for behavioral coverage of critical and new features
### Playwright / Browser / E2E Tests
@@ -216,6 +216,7 @@ See @docs/testing/\*.md for detailed patterns.
1. Follow the Best Practices described [in the Playwright documentation](https://playwright.dev/docs/best-practices)
2. Do not use waitForTimeout, use Locator actions and [retrying assertions](https://playwright.dev/docs/test-assertions#auto-retrying-assertions)
3. Tags like `@mobile`, `@2x` are respected by config and should be used for relevant tests
4. Type all API mock responses in `route.fulfill()` using generated types or schemas from `packages/ingest-types`, `packages/registry-types`, `src/workbench/extensions/manager/types/generatedManagerTypes.ts`, or `src/schemas/` — see `docs/guidance/playwright.md` for the full source-of-truth table
## External Resources
@@ -231,6 +232,18 @@ See @docs/testing/\*.md for detailed patterns.
- Nx: <https://nx.dev/docs/reference/nx-commands>
- [Practical Test Pyramid](https://martinfowler.com/articles/practical-test-pyramid.html)
## Architecture Decision Records
All architectural decisions are documented in `docs/adr/`. Code changes must be consistent with accepted ADRs. Proposed ADRs indicate design direction and should be treated as guidance. See `.agents/checks/adr-compliance.md` for automated validation rules.
1. **Command pattern for all mutations**: Every entity state change must be a serializable, idempotent, deterministic command — replayable, undoable, and transmittable over CRDT. No imperative fire-and-forget mutation APIs. Systems produce command batches, not direct side effects.
2. **Centralized registries and ECS-style access**: Entity data lives in the World (centralized registry), queried via `world.getComponent(entityId, ComponentType)`. Do not add new instance properties/methods to entity classes. Do not use OOP inheritance for entity modeling.
3. **No god-object growth**: Do not add methods to `LGraphNode`, `LGraphCanvas`, `LGraph`, or `Subgraph`. Extract to systems, stores, or composables.
4. **Plain data components**: ECS components are plain data objects — no methods, no back-references to parent entities. Behavior belongs in systems (pure functions).
5. **Extension ecosystem impact**: Changes to entity callbacks (`onConnectionsChange`, `onRemoved`, `onAdded`, `onConnectInput/Output`, `onConfigure`, `onWidgetChanged`), `node.widgets` access, `node.serialize`, or `graph._version++` affect 40+ custom node repos and require migration guidance.
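To make the first principle concrete, here is a minimal sketch of a mutation expressed as a serializable, idempotent command. The `MoveNodeCommand` type, `applyCommand` function, and `Positions` state shape are hypothetical and do not reflect the project's actual command API.

```typescript
type NodeId = number

// A command is plain JSON-compatible data describing the change.
interface MoveNodeCommand {
  type: 'MoveNode'
  nodeId: NodeId
  x: number
  y: number
}

type Positions = Readonly<Record<NodeId, { x: number; y: number }>>

// Pure function: returns new state and never mutates its input.
function applyCommand(state: Positions, cmd: MoveNodeCommand): Positions {
  return { ...state, [cmd.nodeId]: { x: cmd.x, y: cmd.y } }
}

const command: MoveNodeCommand = { type: 'MoveNode', nodeId: 1, x: 100, y: 50 }

// Serializable: the command survives a JSON round trip unchanged.
const replayed = JSON.parse(JSON.stringify(command)) as MoveNodeCommand

// Idempotent: the command carries an absolute target, not a delta,
// so applying it twice yields the same state as applying it once.
const once = applyCommand({}, replayed)
const twice = applyCommand(once, replayed)
```

By contrast, a direct `node.pos = [100, 50]` assignment leaves nothing to replay, undo, or transmit, which is the pattern the list above asks reviewers to flag.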
- **`fixtures/utils/`** — Pure utility functions. No `Page` dependency; stateless helpers that can be used anywhere.
## Polling Assertions
Prefer `expect.poll()` over `expect(async () => { ... }).toPass()` when the block contains a single async call with a single assertion. `expect.poll()` is more readable and gives better error messages (shows actual vs expected on failure).
```typescript
// ✅ Correct — single async call + single assertion