Files
ComfyUI_frontend/.agents/checks/playwright-e2e.md
Christian Byrne 64917e5b6c feat: add Playwright E2E agent check for reviewing browser tests (#10684)
Adds `.agents/checks/playwright-e2e.md` — a reviewer-focused agent check
for Playwright E2E tests.

**19 checks across 4 severity tiers:**

- **Major (1-7)** — Flakiness risks: `waitForTimeout`, missing
`nextFrame()`, unfocused keyboard, coordinate fragility, shared state,
server cleanup, double-click timing
- **Medium (8-11)** — Fixture/API misuse: reimplementing helpers, wrong
imports, programmatic graph building, missing `TestIds`
- **Minor (12-16)** — Convention violations: missing tags, `as any`,
unmasked screenshots, missing cleanup, debug helpers
- **Nitpick (17-19)** — Test design: screenshot-over-functional, large
workflows, Vue/LiteGraph mismatch

Hyperlinks to existing docs (`browser_tests/README.md`, `AGENTS.md`,
`docs/guidance/playwright.md`, writing skill) rather than duplicating
content. Scoped to reviewer concerns (not writer guidance).

┆Issue is synchronized with this [Notion
page](https://www.notion.so/PR-10684-feat-add-Playwright-E2E-agent-check-for-reviewing-browser-tests-3316d73d365081d88f9dea4467ee04cf)
by [Unito](https://www.unito.io)
2026-03-28 21:14:16 -07:00

5.7 KiB

name, description, severity-default, tools
name description severity-default tools
playwright-e2e Reviews Playwright E2E test code for ComfyUI-specific patterns, flakiness risks, and fixture misuse medium
Read
Grep

You are reviewing Playwright E2E test code in browser_tests/. Focus on issues a reviewer would catch that an author might miss — flakiness risks, fixture misuse, test isolation problems, and convention violations.

Reference docs (read if you need full context):

  • browser_tests/README.md — setup, patterns, screenshot workflow
  • browser_tests/AGENTS.md — directory structure, fixture overview
  • docs/guidance/playwright.md — type assertion rules, test tags, forbidden patterns
  • .claude/skills/writing-playwright-tests/SKILL.md — anti-patterns, retry patterns, Vue Nodes vs LiteGraph decision guide

Checks

Flakiness Risks (Major)

  1. waitForTimeout usage — Always wrong. Must use retrying assertions (toBeVisible, toHaveText), expect.poll(), or expect().toPass(). See retry patterns in .claude/skills/writing-playwright-tests/SKILL.md.

  2. Missing nextFrame() after canvas ops — Any drag, click on canvas, resizeNode, pan, zoom, or programmatic graph mutation via page.evaluate that changes visual state needs await comfyPage.nextFrame() before assertions. loadWorkflow() does NOT need it. Prefer encapsulating nextFrame() calls inside Page Object methods so tests don't manage frame timing directly.

  3. Keyboard actions without prior focuspage.keyboard.press() without a preceding comfyPage.canvas.click() or element .focus() will silently send keys to nothing.

  4. Coordinate-based interactions where node refs exist — Raw { x, y } clicks on canvas are fragile. If the test targets a node, use comfyPage.nodeOps.getNodeRefById() / getNodeRefsByTitle() / getNodeRefsByType() instead.

  5. Shared mutable state between tests — Variables declared outside test() blocks, let state mutated across tests, or tests depending on execution order. Each test must be independently runnable.

  6. Missing cleanup of server-persisted state — Settings changed via comfyPage.settings.setSetting() persist across tests. Must be reset in afterEach or at test start. Same for uploaded files or saved workflows. Prefer moving cleanup into fixture options so individual tests don't manage reset logic.

  7. Double-click without { delay } optiondblclick() without { delay: 5 } or similar can be too fast for the canvas event handler.

Fixture & API Misuse (Medium)

  1. Reimplementing existing fixture helpers — Before flagging, grep browser_tests/fixtures/ for the functionality. Common missed helpers:

    • comfyPage.command.executeCommand() for menu/command actions
    • comfyPage.workflow.loadWorkflow() for loading test workflows
    • comfyPage.canvasOps.resetView() for view reset
    • comfyPage.settings.setSetting() for settings
    • Component page objects in browser_tests/fixtures/components/
  2. Building workflows programmatically when a JSON asset would work — Complex page.evaluate chains to construct a graph should use a premade JSON workflow in browser_tests/assets/ loaded via comfyPage.workflow.loadWorkflow().

  3. Selectors not using TestIds — Hard-coded data-testid strings should reference browser_tests/fixtures/selectors.ts when a matching entry exists. Check selectors.ts before flagging.

Convention Violations (Minor)

  1. Missing test tags — Every test.describe should have tag with at least one of: @smoke, @slow, @screenshot, @canvas, @node, @widget, @mobile, @2x. See .claude/skills/writing-playwright-tests/SKILL.md for when to use each.

  2. as any type assertions — Forbidden in E2E tests. Use specific type assertions or test-local type helpers. See docs/guidance/playwright.md for acceptable patterns.

  3. Screenshot tests without masking dynamic content — Timestamps, version numbers, or other non-deterministic content in screenshots will cause flakes. Use mask option.

  4. test.describe without afterEach cleanup when canvas state changes — Tests that manipulate canvas view (drag, zoom, pan) should include afterEach with comfyPage.canvasOps.resetView(). Prefer moving canvas reset into the fixture so individual tests don't manage cleanup.

  5. Debug helpers left in committed codedebugAddMarker, debugAttachScreenshot, debugShowCanvasOverlay, debugGetCanvasDataURL are for local debugging only.

Test Design (Nitpick)

  1. Screenshot-only assertions where functional assertions are possible — Prefer expect(await node.isPinned()).toBe(true) over screenshot comparison when testing non-visual behavior.

  2. Overly large test workflows — Test should load the minimal workflow needed. If a test only needs one node, don't load the full default graph.

  3. Vue Nodes / LiteGraph mismatch — If testing Vue-rendered node UI (DOM widgets, CSS states), should use comfyPage.vueNodes.*. If testing canvas interactions/connections, should use comfyPage.nodeOps.*. Mixing both in one test is a smell.

Rules

  • Only review .spec.ts files and supporting code in browser_tests/
  • Do NOT flag patterns in fixture/helper code (browser_tests/fixtures/) — those are shared infrastructure with different rules
  • "Major" for flakiness risks (items 1-7), "medium" for fixture misuse (8-10), "minor" for convention violations (11-15), "nitpick" for test design (16-18)
  • When flagging missing fixture usage (item 8), confirm the helper exists by checking the fixture code — don't assume
  • Existing tests that predate conventions are acceptable to modify but not required to fix