devAlice

7. Verification loops — locking down the definition of 'done' in the system

The moment the operator feels 'that's done' is the most dangerous one. This post covers a 7-step verification loop: its auto-fire triggers, the lint · type · build · test · security · diff · report checks, the SKIP conditions, and the trace a bypass leaves.

This is part 7 of the Alice Way series. In part 6, Multi-agent delegation, automated verification appeared as the gate for delegated work. This post takes that verification one level up: whether the work was done directly or delegated, the system enforces a loop the moment "done" is about to be said.

0. A verification loop is the system re-checking what the mind thinks is done

The moment work feels done is the most dangerous. The mind tilts into the relief of "okay, that's it," and missing procedures become hard to notice. Lint not run, build broken, tests skipped, and "done" still gets typed.

The verification loop is the device that interrupts that moment for one beat. Right before "done" gets written, the lint, type, build, test, security, diff, and report checks auto-run, and the report cannot go out unless they pass. The system re-confirms what the mind thinks is done.

This post records which seven steps make up that loop, what triggers the auto-fire, and how bypass is handled.

The auto-fire mechanism is the Anthropic Claude Code hooks system, already cited in part 5, Hooks and automation. What follows is the set of verification steps the operator built on top of that mechanism.

1. Why auto-fire is required

If the operator has to invoke verification consciously, it sometimes gets skipped. Skipped verification accumulates, and eventually prod breaks. If skipping comes from a cognitive limit, the enforcement has to live outside cognition.

Verification mode                       | Skip likelihood                     | Fit
Operator runs it consciously every time | High (when tired or in flow)        | Poor
Only runs in CI                         | 0, but post-commit/push is too late | ⚠️ supporting
Hook auto-fires before report           | 0, the report itself is blocked     | Primary

The third option is the answer. Forcing verification before the report goes out blocks the omission at the system level. CI is a secondary safety net, not the front line.

2. The 7-step verification loop

This is the loop I converged on. Each step halts immediately on failure and reports a clear message.

2.1 Lint

Style, unused vars, obvious errors. This is the fastest check and the one that catches the most. Run it first: a failure here saves the time of the later steps.

2.2 Type check

TypeScript / mypy / other static type checkers. Catches deeper errors lint cannot.

2.3 Build

Does the build actually succeed? This stops changes that the compiler rejects from getting merged.

2.4 Test

Unit tests plus integration tests. Integration tests hit real resources, not mocks. The rule comes from one incident: the mocked tests passed and a prod migration still broke.

2.5 Security check

Grep for risks: secret hardcoding · SQL injection patterns · operator real-name leakage. An 8-item checklist runs on every change. Halt immediately on any hit.
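A grep-style pass like this can be sketched in a few lines. The two regexes and the `forbidden_terms` parameter below are hypothetical stand-ins; the post does not list the actual 8-item checklist, so treat this as an illustration of the shape, not the real rules.

```python
import re
from typing import Dict, Iterable, List, Tuple

# Hypothetical patterns for illustration only; the real checklist has 8 items.
RISK_PATTERNS: Dict[str, "re.Pattern[str]"] = {
    "hardcoded secret": re.compile(
        r"(?i)\b(api[_-]?key|secret|password|token)\b\s*[:=]\s*['\"][^'\"]{8,}['\"]"
    ),
    "sql built by string concat": re.compile(
        r"(?i)\b(select|insert|update|delete)\b[^\n]*['\"]\s*\+"
    ),
}


def scan(text: str, forbidden_terms: Iterable[str] = ()) -> List[Tuple[int, str]]:
    """Return (line number, label) for every risky line found in text."""
    terms = list(forbidden_terms)
    hits: List[Tuple[int, str]] = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in RISK_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, label))
        lowered = line.lower()
        for term in terms:
            # Literal, case-insensitive match, e.g. for an operator's real name.
            if term.lower() in lowered:
                hits.append((lineno, f"forbidden term: {term}"))
    return hits
```

Any non-empty result would halt the loop. Regexes this simple will produce false positives, so each pattern needs tuning against the real codebase before it is trusted as a gate.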

2.6 Diff review

Do the changed lines match the operator's intent? Has anything unintended slipped in? This is the last human-eye gate.

2.7 Report-output check

Does the report text itself carry leak risk (secrets · internal identifiers · real names)? One more grep runs just before the report is sent.

The "done" report cannot go out unless all seven pass. On any failure the report is blocked and the operator gets a one-line "step X failed."
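The halt-on-first-failure behaviour of the seven steps can be sketched as a small runner. This is a minimal sketch under assumptions: `StepResult`, `command_step`, and `run_loop` are hypothetical names, and the "PASSED" line is an invented format (the post only shows the failure format).

```python
import subprocess
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class StepResult:
    ok: bool
    detail: str = ""


Check = Callable[[], StepResult]

# Step names follow sections 2.1 to 2.7 of this post.
STEP_NAMES = [
    "Lint", "Type check", "Build", "Test",
    "Security check", "Diff review", "Report-output check",
]


def command_step(argv: List[str]) -> Check:
    """Wrap a command line as a check; exit code 0 counts as a pass."""
    def check() -> StepResult:
        proc = subprocess.run(argv, capture_output=True, text=True)
        return StepResult(ok=proc.returncode == 0, detail=proc.stdout + proc.stderr)
    return check


def run_loop(steps: List[Tuple[str, Check]]) -> Tuple[bool, str]:
    """Run the steps in order and stop at the first failure."""
    total = len(steps)
    for index, (name, check) in enumerate(steps, start=1):
        if not check().ok:
            # Later steps are never started once one step fails.
            return False, f"[verify] FAILED at step: {name} ({index}/{total})"
    # Hypothetical success line; the post does not show one.
    return True, f"[verify] PASSED {total}/{total}"
```

In a real setup each entry would pair a name from `STEP_NAMES` with `command_step([...])` for whatever lint, type-check, build, and test commands the project uses. The diff review is a human step, so it would need a check that waits for the operator's confirmation instead of a command.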

3. Auto-fire triggers — when does it run

Conditions under which the verification loop fires automatically.

3.1 Explicit trigger — just before "done" phrasing

Right when the operator tries to output "finished" / "done" / "ready" / equivalent, the hook cuts in. It fires before the output, not after; after would be too late.

3.2 Explicit trigger — "PR ready" / "mergeable" reports

Messages like "the PR is ready" also fire the same hook. Verification does not run after the PR goes out; it runs before.

3.3 Explicit trigger — "commit-ready" reports

"Ready to commit" is handled the same way. The hook does not block the commit itself, but it enforces verification once more before that signal reaches the operator.

3.4 Explicit invocation — /verify

When the operator consciously wants to verify, the slash command invokes the same loop. The hook and the command are two entry points to the same procedure.
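The "two entry points, one procedure" idea from this section might look roughly like the sketch below. The phrase list and the function names `entry_point` and `handle` are assumptions made for illustration; how the actual hook detects completion phrasing is not described in the post.

```python
import re
from typing import Callable, Optional

# Hypothetical phrase list; a real one would be tuned to the operator's wording.
COMPLETION_PATTERN = re.compile(r"\b(done|finished|ready|mergeable)\b", re.IGNORECASE)


def entry_point(message: str) -> Optional[str]:
    """Classify a message as 'slash', 'hook', or None (no verification needed)."""
    stripped = message.strip()
    if stripped == "/verify" or stripped.startswith("/verify "):
        return "slash"
    if COMPLETION_PATTERN.search(stripped):
        return "hook"
    return None


def handle(message: str, verify: Callable[[], bool]) -> Optional[bool]:
    """Route both entry points to the same verify callable."""
    if entry_point(message) is None:
        return None
    return verify()
```

A plain keyword match like this also fires on "not done yet", which is exactly the kind of false alarm section 7.2 warns about, so a production trigger would need a stricter rule.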

4. When auto-fire is SKIPped

Not every piece of work needs verification. The following conditions are declared as SKIP for auto-fire.

  • Read-only work — scans · reviews · investigations. No code change, nothing to verify.
  • Docs-only changes — touched only .md files. Lint/build is meaningless here.
  • Intentional scratch — experiments, prototypes. Operator declares.
  • Operator explicitly said "skip verification" — explicit bypass.

The SKIP conditions live in the persona, so the hook judges automatically. The operator does not have to say "skip this" every time.
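The four SKIP conditions above can be expressed as one small decision function. This is a sketch under assumptions: in the post the conditions live in the persona text, not in code, and `skip_reason` with its parameters is a hypothetical name.

```python
from typing import Iterable, Optional


def skip_reason(
    changed_files: Iterable[str],
    scratch: bool = False,
    operator_skip: bool = False,
) -> Optional[str]:
    """Return why verification is skipped, or None if the loop should run."""
    files = list(changed_files)
    if operator_skip:
        return "operator requested skip"
    if scratch:
        return "intentional scratch"
    if not files:
        # No changed files: scans, reviews, investigations.
        return "read-only work"
    if all(path.lower().endswith(".md") for path in files):
        return "docs-only change"
    return None
```

Returning the reason as a string, not just a boolean, keeps the decision visible if it is ever written to a log.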

5. The trace left by bypass

Sometimes the operator has to bypass intentionally: mid-debugging, ad-hoc work, a hotfix in progress.

Bypass is possible. But the fact of the bypass always lands in the log.

[verify] BYPASSED at 2026-05-17 14:23 — reason: emergency hotfix
[verify] Skipped: lint / build / test / security / diff / report
[verify] Note: re-run /verify manually after hotfix lands

This way bypasses are traceable, and a post-hoc "why was this not verified?" investigation is easy. The bypass is not blocked, but the trace stays.

If bypass were not possible, the operator would turn verification off entirely. And once it is off, it tends to stay off.
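Producing a trace in the shape of the sample log above takes little code. The sketch below assumes hypothetical helper names (`bypass_trace`, `append_trace`) and an append-only log file; where the real log is stored is not stated in the post.

```python
from datetime import datetime
from pathlib import Path
from typing import List

DASH = "\u2014"  # the separator character used in the post's sample log


def bypass_trace(reason: str, skipped: List[str], when: datetime, note: str = "") -> List[str]:
    """Build the log lines that record a bypass."""
    stamp = when.strftime("%Y-%m-%d %H:%M")
    lines = [
        f"[verify] BYPASSED at {stamp} {DASH} reason: {reason}",
        f"[verify] Skipped: {' / '.join(skipped)}",
    ]
    if note:
        lines.append(f"[verify] Note: {note}")
    return lines


def append_trace(log_path: Path, lines: List[str]) -> None:
    """Append to the log so earlier bypass records are never overwritten."""
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write("\n".join(lines) + "\n")
```

Opening the file in append mode is the design choice that matters here: a bypass record that can be overwritten is not much of a trace.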

6. Handling failure

What the operator sees when verification fails.

[verify] FAILED at step: Test (4/7)
[verify] 2 tests failed:
  - integration/auth.test.ts > "github oauth callback redirects to next"
  - integration/db.test.ts > "rls policy blocks other user's row"
[verify] Full log: /tmp/verify-2026-05-17-1423.log
[verify] Report blocked.

The core is that the failure info compresses into one screen: where it failed, what the issue is, and where the detailed log lives. The operator decides the next move (fix / bypass / SKIP) from that single screen.

Spilling the full log into the console forces the operator to summarize it again themselves, which is fresh load in its own right.
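One way to keep the failure output to a single screen is to cap the number of listed items and point to the full log for the rest. The function below is a sketch that mirrors the sample output above; `failure_summary`, `max_items`, and the "... and N more" line are assumptions, not part of the post's tooling.

```python
from typing import List


def failure_summary(
    step: str,
    index: int,
    total: int,
    failures: List[str],
    log_path: str,
    max_items: int = 5,
) -> str:
    """Compress a failed step into a short, fixed-shape message."""
    lines = [
        f"[verify] FAILED at step: {step} ({index}/{total})",
        f"[verify] {len(failures)} tests failed:",
    ]
    for name in failures[:max_items]:
        lines.append(f"  - {name}")
    hidden = len(failures) - max_items
    if hidden > 0:
        # Hypothetical overflow line; the details stay in the full log.
        lines.append(f"  ... and {hidden} more")
    lines.append(f"[verify] Full log: {log_path}")
    lines.append("[verify] Report blocked.")
    return "\n".join(lines)
```

With the default cap the message never exceeds ten lines, however many tests fail.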

7. Traps — patterns where verification fails

7.1 Too slow

If verification takes more than 30 seconds, the operator starts thinking "this is too slow" and reaches for bypass often. Once bypass is the default, verification loses its point. → Run the cheapest steps first (lint and type check catch most failures) and the expensive steps later.
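To notice when the loop drifts past the 30-second mark, the per-step durations can be recorded as the steps run. This is a rough sketch; `timed_run`, `over_budget`, and `slowest` are hypothetical helpers, and the clock is injectable only so the behaviour can be checked without waiting.

```python
import time
from typing import Callable, Dict, List, Tuple

BUDGET_SECONDS = 30.0  # the threshold named in this section


def timed_run(
    steps: List[Tuple[str, Callable[[], bool]]],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[bool, Dict[str, float]]:
    """Run steps in order, stop at the first failure, record seconds per step."""
    durations: Dict[str, float] = {}
    for name, check in steps:
        started = clock()
        ok = check()
        durations[name] = clock() - started
        if not ok:
            return False, durations
    return True, durations


def over_budget(durations: Dict[str, float], budget: float = BUDGET_SECONDS) -> bool:
    """True when the steps that ran took longer than the budget in total."""
    return sum(durations.values()) > budget


def slowest(durations: Dict[str, float]) -> str:
    """Name of the step that took longest (requires at least one entry)."""
    return max(durations, key=lambda name: durations[name])
```

Logging the slowest step each time the budget is exceeded gives a concrete target for optimisation before bypass turns into a habit.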

7.2 Too many false positives

If verification often raises false alarms, the operator starts distrusting it, and distrust leads to bypass. → Verification itself must be trustworthy. The moment a false alarm is caught, fix the verification logic immediately.

7.3 Pass/fail only, no specifics

Reporting only "failed" with no where or why forces the operator to dig through logs themselves, and the load shifts back to the operator. → Keep the output short but specific (which step, which test, which line).

7.4 SKIP conditions too narrow

If verification fires on almost every action, the operator gets fatigued. When read-only work is verified, verification itself becomes noise. → Spell out enough SKIP conditions in the persona.

8. Compressed into one principle

The core of verification-loop design collapses into one sentence.

"The moment the operator is about to say 'done,' the system cuts in for one beat. Only when it passes can that word go out. Bypass is possible but always leaves a trace."

When this holds, the verification loop becomes a safety net that halts the operator's relief for one beat. When it breaks, verification gets turned off, distrusted as a source of false alarms, or bypassed by default.

The next post covers the foundational resource that everything above (verification, memory, skills, hooks) depends on: the token economy, i.e. what to admit into the context window and what to keep out.
