fix: prevent AI lies — assertion-based verdicts + blind reviewer

Agent: MUST use inspect() after every action, verdict based on DOM
state not opinions. "NEVER claim REPRODUCED unless inspect() confirms."

Reviewer: Two-phase prompt — Phase 1 describes what it SEES (blind,
no context). Phase 2 compares observations against issue/PR context.
Anti-hallucination rules: "describe ONLY what you observe, NEVER infer."

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
snomiao
2026-03-28 17:26:48 +00:00
parent c4a243060b
commit 6044452b8f
2 changed files with 39 additions and 22 deletions

View File

@@ -477,12 +477,21 @@ export async function runHybridAgent(opts: AgentOptions): Promise<{
- done(verdict, summary) — Finish with your conclusion.
## Strategy
1. Start by understanding the issue, then plan your reproduction steps.
2. Use perform() to take actions. After each action, use inspect() to verify state or observe() for visual confirmation.
1. Start by understanding the issue, then plan your FULL reproduction sequence before acting.
2. Use perform() to take actions. After EVERY action, use inspect() to verify the DOM state changed as expected.
3. If a setting change doesn't seem to take effect, try reload() then verify again.
4. Focus on the specific bug — don't explore randomly.
5. Take screenshots at key moments for the video evidence.
6. When you've confirmed or ruled out the bug, call done().
7. You MUST complete ALL reproduction steps. Do NOT stop after setup — the actual bug trigger is the most important part.
## Verification Rules (CRITICAL — prevents false results)
- NEVER claim REPRODUCED unless inspect() confirms the expected broken state exists
- After EVERY action, call inspect() to verify the DOM state. This is your source of truth.
- Your done() verdict MUST be supported by inspect() results, not by what you think happened
- If you perform an action but inspect() shows no state change, the action FAILED — try again or adapt
- Example: if you press Escape and inspect("Settings dialog") still returns visible → the dialog did NOT close → report that honestly
- ALWAYS include inspect() evidence in your done() summary: "inspect('X') returned {state} confirming Y"
## Control/Test Comparison (IMPORTANT)
When a bug is triggered by a specific setting, mode, or configuration:

View File

@@ -421,6 +421,12 @@ function buildSingleVideoPrompt(
): string {
const lines = [
'You are a senior QA engineer reviewing a UI test session recording.',
'',
'## ANTI-HALLUCINATION RULES (READ FIRST)',
'- Describe ONLY what you can directly observe in the video frames',
'- NEVER infer or assume what "must have happened" between frames',
'- If a step is not visible in the video, say "NOT SHOWN" — do not guess',
'- Your job is to be a CAMERA — report facts, not interpretations',
''
]
@@ -431,38 +437,40 @@ function buildSingleVideoPrompt(
)
if (prContext) {
lines.push(
'## Phase 1: Blind Observation (describe what you SEE)',
'First, describe every UI interaction chronologically WITHOUT knowing the expected outcome:',
'- What elements does the user click/hover/type?',
'- What dialogs/menus open and close?',
'- What keyboard indicators appear? (look for subtitle overlays)',
'- What is the BEFORE state and AFTER state of each action?',
'',
'## Phase 2: Compare against expected behavior',
'Now compare your observations against the context below.',
'Only claim a match if your Phase 1 observations EXPLICITLY support it.',
''
)
if (isIssueContext) {
lines.push(
'## Issue Context',
'This video attempts to reproduce a reported bug on the main branch.',
'Your review MUST evaluate whether the reported bug is visible and reproducible.',
'',
prContext,
'',
'## Review Instructions',
'1. Does the video demonstrate the reported bug occurring?',
'2. Is the bug clearly visible and reproducible from the steps shown?',
'3. Are there any other issues visible during the reproduction attempt?',
'',
'## CRITICAL: Honesty Requirements',
'- If the video only shows login, idle canvas, or trivial menu interactions WITHOUT actually performing the reproduction steps, say "INCONCLUSIVE — reproduction steps were not performed".',
'- Do NOT claim a bug is "confirmed" unless you can clearly see the bug behavior described in the issue.',
'- Do NOT hallucinate findings. If the video does not show meaningful interaction, say so clearly.',
'- Rate confidence as "Low" if the video does not actually demonstrate the bug scenario.',
'## Comparison Questions',
'1. Did the video perform the reproduction steps described in the issue?',
'2. Did your Phase 1 observations show the reported bug behavior?',
'3. If the steps were not performed or the bug was not visible, say INCONCLUSIVE.',
''
)
} else {
lines.push(
'## PR Context',
'The video is a QA session testing a specific pull request.',
'Your review MUST evaluate whether the PR achieves its stated purpose.',
'',
prContext,
'',
'## Review Instructions',
"1. Does the video demonstrate the PR's intended behavior working correctly?",
'2. Are there regressions or side effects caused by the PR changes?',
'3. Does the observed behavior match what the PR claims to implement/fix?',
'## Comparison Questions',
'1. Did the video test the specific behavior the PR changes?',
'2. Did your Phase 1 observations show the expected before/after difference?',
'3. If the test was incomplete or inconclusive, say so honestly.',
''
)
}