Adds `.agents/checks/playwright-e2e.md` — a reviewer-focused agent check for Playwright E2E tests. **19 checks across 4 severity tiers:** - **Major (1-7)** — Flakiness risks: `waitForTimeout`, missing `nextFrame()`, unfocused keyboard, coordinate fragility, shared state, server cleanup, double-click timing - **Medium (8-11)** — Fixture/API misuse: reimplementing helpers, wrong imports, programmatic graph building, missing `TestIds` - **Minor (12-16)** — Convention violations: missing tags, `as any`, unmasked screenshots, missing cleanup, debug helpers - **Nitpick (17-19)** — Test design: screenshot-over-functional, large workflows, Vue/LiteGraph mismatch Hyperlinks to existing docs (`browser_tests/README.md`, `AGENTS.md`, `docs/guidance/playwright.md`, writing skill) rather than duplicating content. Scoped to reviewer concerns (not writer guidance). ┆Issue is synchronized with this [Notion page](https://www.notion.so/PR-10684-feat-add-Playwright-E2E-agent-check-for-reviewing-browser-tests-3316d73d365081d88f9dea4467ee04cf) by [Unito](https://www.unito.io)
5.7 KiB
name, description, severity-default, tools
| name | description | severity-default | tools | ||
|---|---|---|---|---|---|
| playwright-e2e | Reviews Playwright E2E test code for ComfyUI-specific patterns, flakiness risks, and fixture misuse | medium |
|
You are reviewing Playwright E2E test code in browser_tests/. Focus on issues a reviewer would catch that an author might miss — flakiness risks, fixture misuse, test isolation problems, and convention violations.
Reference docs (read if you need full context):
browser_tests/README.md— setup, patterns, screenshot workflowbrowser_tests/AGENTS.md— directory structure, fixture overviewdocs/guidance/playwright.md— type assertion rules, test tags, forbidden patterns.claude/skills/writing-playwright-tests/SKILL.md— anti-patterns, retry patterns, Vue Nodes vs LiteGraph decision guide
Checks
Flakiness Risks (Major)
-
waitForTimeoutusage — Always wrong. Must use retrying assertions (toBeVisible,toHaveText),expect.poll(), orexpect().toPass(). See retry patterns in.claude/skills/writing-playwright-tests/SKILL.md. -
Missing
nextFrame()after canvas ops — Anydrag,clickon canvas,resizeNode,pan,zoom, or programmatic graph mutation viapage.evaluatethat changes visual state needsawait comfyPage.nextFrame()before assertions.loadWorkflow()does NOT need it. Prefer encapsulatingnextFrame()calls inside Page Object methods so tests don't manage frame timing directly. -
Keyboard actions without prior focus —
page.keyboard.press()without a precedingcomfyPage.canvas.click()or element.focus()will silently send keys to nothing. -
Coordinate-based interactions where node refs exist — Raw
{ x, y }clicks on canvas are fragile. If the test targets a node, usecomfyPage.nodeOps.getNodeRefById()/getNodeRefsByTitle()/getNodeRefsByType()instead. -
Shared mutable state between tests — Variables declared outside
test()blocks,letstate mutated across tests, or tests depending on execution order. Each test must be independently runnable. -
Missing cleanup of server-persisted state — Settings changed via
comfyPage.settings.setSetting()persist across tests. Must be reset inafterEachor at test start. Same for uploaded files or saved workflows. Prefer moving cleanup into fixture options so individual tests don't manage reset logic. -
Double-click without
{ delay }option —dblclick()without{ delay: 5 }or similar can be too fast for the canvas event handler.
Fixture & API Misuse (Medium)
-
Reimplementing existing fixture helpers — Before flagging, grep
browser_tests/fixtures/for the functionality. Common missed helpers:comfyPage.command.executeCommand()for menu/command actionscomfyPage.workflow.loadWorkflow()for loading test workflowscomfyPage.canvasOps.resetView()for view resetcomfyPage.settings.setSetting()for settings- Component page objects in
browser_tests/fixtures/components/
-
Building workflows programmatically when a JSON asset would work — Complex
page.evaluatechains to construct a graph should use a premade JSON workflow inbrowser_tests/assets/loaded viacomfyPage.workflow.loadWorkflow(). -
Selectors not using
TestIds— Hard-codeddata-testidstrings should referencebrowser_tests/fixtures/selectors.tswhen a matching entry exists. Checkselectors.tsbefore flagging.
Convention Violations (Minor)
-
Missing test tags — Every
test.describeshould havetagwith at least one of:@smoke,@slow,@screenshot,@canvas,@node,@widget,@mobile,@2x. See.claude/skills/writing-playwright-tests/SKILL.mdfor when to use each. -
as anytype assertions — Forbidden in E2E tests. Use specific type assertions or test-local type helpers. Seedocs/guidance/playwright.mdfor acceptable patterns. -
Screenshot tests without masking dynamic content — Timestamps, version numbers, or other non-deterministic content in screenshots will cause flakes. Use
maskoption. -
test.describewithoutafterEachcleanup when canvas state changes — Tests that manipulate canvas view (drag, zoom, pan) should includeafterEachwithcomfyPage.canvasOps.resetView(). Prefer moving canvas reset into the fixture so individual tests don't manage cleanup. -
Debug helpers left in committed code —
debugAddMarker,debugAttachScreenshot,debugShowCanvasOverlay,debugGetCanvasDataURLare for local debugging only.
Test Design (Nitpick)
-
Screenshot-only assertions where functional assertions are possible — Prefer
expect(await node.isPinned()).toBe(true)over screenshot comparison when testing non-visual behavior. -
Overly large test workflows — Test should load the minimal workflow needed. If a test only needs one node, don't load the full default graph.
-
Vue Nodes / LiteGraph mismatch — If testing Vue-rendered node UI (DOM widgets, CSS states), should use
comfyPage.vueNodes.*. If testing canvas interactions/connections, should usecomfyPage.nodeOps.*. Mixing both in one test is a smell.
Rules
- Only review
.spec.tsfiles and supporting code inbrowser_tests/ - Do NOT flag patterns in fixture/helper code (
browser_tests/fixtures/) — those are shared infrastructure with different rules - "Major" for flakiness risks (items 1-7), "medium" for fixture misuse (8-10), "minor" for convention violations (11-15), "nitpick" for test design (16-18)
- When flagging missing fixture usage (item 8), confirm the helper exists by checking the fixture code — don't assume
- Existing tests that predate conventions are acceptable to modify but not required to fix