The moment the operator feels 'that's done' is the most dangerous. A 7-step verification loop with auto-fire triggers and the trace left when bypassed.

This is part 7 of the Alice Way series. In part 6, Multi-agent delegation, automated verification appeared as the gate for delegated work. This post takes that verification one level up: whether the work was done directly or delegated, the system enforces a loop the moment "done" is about to come out of someone's mouth.

0. A verification loop is the system re-checking what the mind thinks is done

The moment work feels done is the most dangerous one. The mind is tilted into the relief of "okay, that's it," and missing procedures are hard to notice. Lint not run, build broken, tests skipped, and "done" still gets typed.

The verification loop is the device that breaks that moment for one beat. Right before "done" gets written, lint · build · test · security · diff checks auto-run, and the report cannot go out unless they pass. The system re-confirms what the mind thinks is done.

I think the most important property of a verification loop is not what it checks but that it is mandatory. The individual checks are not hard to run manually; what matters is that "mandatory" transforms verification from a good habit into a system guarantee. Initially the loop was something I tried to remember to run. Once it became auto-fire, the nature of "done" changed: it stopped meaning "I think this works" and started meaning "the system agrees."

This post is a record of which seven steps make up that loop, what triggers the auto-fire, and how bypass is handled.

The auto-fire mechanism is Anthropic's Claude Code hooks system, already cited in part 5, Hooks and automation. What follows covers the verification steps the operator built on top of that mechanism.

1. Why auto-fire is required

If the operator has to consciously invoke verification, it sometimes gets skipped. Skipped verification accumulates, and eventually prod breaks. If skipping is a cognitive limit, the enforcement has to live outside cognition.

Verification mode	Skip likelihood	Fit
Operator runs it consciously every time	High (when tired or in flow)	❌
Only runs in CI	0 — but post-commit/push, too late	⚠️ supporting
Hook auto-fires before report	0 — the report itself is blocked	✅

The third row is the answer. Forcing verification before the report goes out blocks the omission at the system level. CI is a secondary safety net, not the front line.

2. The 7-step verification loop

This is the sequence I converged on. Each step halts the loop immediately on failure and reports a clear message.

Seven gates that step in just before the word 'done' leaves your mouth. The report only goes out once they all pass.

2.1 Lint

Style, unused vars, obvious errors. The fastest step and the one that catches problems most often. It runs first because a failure here saves the time of the later steps.

2.2 Type check

TypeScript / mypy / other static type checkers. Catches deeper errors lint cannot.

2.3 Build

Does the actual build break? This stops changes the compiler cannot pass from getting merged.

2.4 Test

Unit tests + integration tests. Integration tests hit real resources, not mocked ones. The rule comes from an incident: mocked tests once passed while a prod migration broke.

2.5 Security check

Grep for risks: secret hardcoding · SQL injection patterns · operator real-name leakage. An 8-item checklist runs on every change. Halt immediately on any hit.
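As a rough illustration of the grep-style check, the sketch below scans only the lines a diff adds. The two regexes and the names `RISK_PATTERNS` and `scan_added_lines` are assumptions made for this example; they are not the 8-item checklist, and the real-name check is left out because it depends on operator-specific data.

```python
import re
from typing import List, Tuple

# Illustrative patterns only (assumed for this sketch, not the post's checklist).
RISK_PATTERNS = {
    "hardcoded secret": re.compile(
        r"(?i)(api[_-]?key|secret|password|token)\w*\s*[:=]\s*['\"][^'\"]{8,}['\"]"
    ),
    "sql built by string concat": re.compile(
        r"(?i)\b(select|insert|update|delete)\b[^\n]*['\"]\s*\+"
    ),
}


def scan_added_lines(diff_text: str) -> List[Tuple[int, str, str]]:
    """Return (diff line number, risk label, added text) for every hit."""
    hits = []
    for lineno, line in enumerate(diff_text.splitlines(), start=1):
        # Inspect only added lines; skip the "+++ file" header of a unified diff.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        body = line[1:]
        for label, pattern in RISK_PATTERNS.items():
            if pattern.search(body):
                hits.append((lineno, label, body.strip()))
    return hits
```

Any non-empty result would halt the loop at this step. Patterns this simple produce false alarms, so in practice they need the same tuning the false-positive trap in section 7.2 describes.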

2.6 Diff review

Do the changed lines match the operator's intent? Has anything unintended slipped in? This is the last human-eye gate.

2.7 Report-output check

Does the report text itself carry leak risk (secrets · internal identifiers · real names)? One more grep runs just before the report is sent.

The "done" report cannot go out unless all seven pass. On any failure, the report is blocked and the operator gets a one-line "step X failed."
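The halt-on-first-failure behavior can be sketched as a small runner. This is a minimal sketch, not the author's harness: `Check`, `Result`, `run_loop`, and `may_report_done` are hypothetical names, and each check is assumed to be a callable that returns a pass flag plus a short detail string.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# A check returns (passed, short_detail). Hypothetical interface for this sketch.
Check = Callable[[], Tuple[bool, str]]


@dataclass
class Result:
    passed: bool
    failed_step: Optional[str] = None
    position: Optional[int] = None  # 1-based index of the failed step
    detail: str = ""


def run_loop(steps: List[Tuple[str, Check]]) -> Result:
    """Run steps in order and halt at the first failure."""
    for position, (name, check) in enumerate(steps, start=1):
        ok, detail = check()
        if not ok:
            # Later steps are never started once one step fails.
            return Result(False, name, position, detail)
    return Result(True)


def may_report_done(result: Result) -> bool:
    """The 'done' report goes out only when every step passed."""
    return result.passed
```

Passing the steps as an ordered list keeps the sequence (lint, type check, build, test, security, diff review, report-output check) in one visible place, so reordering for speed is a one-line change.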

3. Auto-fire triggers — when does it run

Conditions under which the verification loop fires automatically.

3.1 Explicit trigger — just before "done" phrasing

Right when the operator tries to output "finished" / "done" / "ready" / equivalent, the hook cuts in. It fires before the output, not after; after would be too late.

3.2 Explicit trigger — "PR ready" / "mergeable" reports

Messages like "the PR is ready" also fire the same hook. Verification does not run after the PR goes out; it runs before.

3.3 Explicit trigger — "commit-ready" reports

"Ready to commit" — same. Does not block the commit itself, but enforces verification once more before that signal reaches the operator.

3.4 Explicit invocation — `/verify`

When the operator consciously wants to verify, the slash command invokes the same loop. The hook and the slash command are two entry points to the same procedure.

3.5 The two entries are not interchangeable

The hook fires before the report; the slash command fires when the operator pulls. Same procedure, opposite direction: push vs pull. I noticed they catch different things. The hook catches the moment of relief, when the mind is already tilted into "okay, done." The slash command catches the moment of doubt, when the mind asks "did I actually verify this?" The two moments live in different parts of the workflow, and a system that only runs verification in one of them misses the failures that surface in the other.

In practice the hook fires more often than the slash command — the relief moment shows up at the end of every task, while the doubt moment is more occasional. But the slash command is where the operator controls verification, where it can be invoked outside the auto-fire conditions. The two together cover the full surface: forced by the system, available by intent.
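The push and pull entries can be pictured as two thin wrappers around one shared procedure. The sketch below is independent of how Claude Code hooks are actually wired: `DONE_PATTERN`, `claims_done`, `on_draft_report`, and `slash_verify` are hypothetical names, and `verify` stands in for the whole seven-step loop.

```python
import re
from typing import Callable, Tuple

# Hypothetical phrase list built from the examples in this section;
# real trigger wording will vary.
DONE_PATTERN = re.compile(r"\b(done|finished|ready|mergeable)\b", re.IGNORECASE)

# verify() stands in for the loop: (passed, name of the failed step or "").
Verify = Callable[[], Tuple[bool, str]]


def claims_done(draft: str) -> bool:
    return DONE_PATTERN.search(draft) is not None


def on_draft_report(draft: str, verify: Verify) -> str:
    """Push entry: runs before the report is sent, only when it claims 'done'."""
    if not claims_done(draft):
        return draft
    passed, failed_step = verify()
    if passed:
        return draft
    return f"[verify] Report blocked: step {failed_step} failed."


def slash_verify(verify: Verify) -> str:
    """Pull entry: the operator asks for the same loop on demand."""
    passed, failed_step = verify()
    if passed:
        return "[verify] All steps passed."
    return f"[verify] FAILED at step: {failed_step}"
```

A plain keyword match also fires on phrases like "not done yet". That errs toward running verification too often, which is the cheaper mistake as long as the loop stays fast.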

4. When auto-fire is SKIPped

Not every piece of work needs verification. The following conditions are declared as SKIP for auto-fire.

Read-only work — scans · reviews · investigations. No code change, nothing to verify.
Docs-only changes — touched only .md files. Lint/build is meaningless here.
Intentional scratch — experiments, prototypes. Operator declares.
Operator explicitly said "skip verification" — explicit bypass.

The SKIP conditions live in the persona, so the hook judges automatically. The operator does not have to say "skip this" every time.
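The four conditions above reduce to a small decision function. This is a sketch under assumptions: the function name `skip_reason` and the `scratch` and `operator_skip` flags are hypothetical, and an empty list of changed files is treated as the signal for read-only work.

```python
from typing import Iterable, Optional


def skip_reason(
    changed_files: Iterable[str],
    scratch: bool = False,
    operator_skip: bool = False,
) -> Optional[str]:
    """Return why auto-fire is skipped, or None when verification must run."""
    files = list(changed_files)
    if operator_skip:
        return "operator said skip verification"
    if scratch:
        return "intentional scratch"
    if not files:
        # No file changed: scans, reviews, investigations.
        return "read-only work"
    if all(name.lower().endswith(".md") for name in files):
        return "docs-only change"
    return None
```

Returning the reason rather than a bare boolean means the skip can be logged in the same way a bypass is.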

5. The trace left by bypass

Sometimes the operator intentionally has to bypass — mid-debugging, ad-hoc work, hotfix in progress.

Bypass is possible. But the fact of the bypass always lands in the log.

[verify] BYPASSED at 2026-05-17 14:23 — reason: emergency hotfix
[verify] Skipped: lint / type / build / test / security / diff / report
[verify] Note: re-run /verify manually after hotfix lands

This keeps bypasses traceable and makes post-hoc "why was this not verified?" investigations easy. The bypass is not blocked, but the trace stays.

If bypass were not possible, the operator would turn verification off entirely, and once it is off it tends to stay off.

The trace has a second purpose beyond the moment of bypass. A week after a hotfix lands, when memory of which corners were cut has already faded, the bypass log is the only artifact that survives. I've gone back to bypass entries from three weeks earlier and reconstructed what was skipped, why, and whether the post-hotfix verification ever caught up. Without the trace, that reconstruction is impossible — the operator just has to trust that nothing was missed, which is the same as not verifying at all.
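A minimal sketch of how such a trace could be written, assuming an append-only log file. The function names, the step labels in `ALL_STEPS`, and the rule that a bypass must carry a non-empty reason are assumptions made for this example.

```python
from datetime import datetime
from pathlib import Path
from typing import Sequence

# Shorthand labels for the seven steps (assumed naming).
ALL_STEPS = ("lint", "type", "build", "test", "security", "diff", "report")


def bypass_entry(reason: str, when: datetime, skipped: Sequence[str] = ALL_STEPS) -> str:
    """Format a bypass record in the shape of the sample log lines."""
    stamp = when.strftime("%Y-%m-%d %H:%M")
    return "\n".join([
        # \u2014 is the same dash used in the sample log line.
        f"[verify] BYPASSED at {stamp} \u2014 reason: {reason}",
        f"[verify] Skipped: {' / '.join(skipped)}",
        "[verify] Note: re-run /verify manually once the change lands",
    ])


def record_bypass(log_path: Path, reason: str, when: datetime) -> str:
    """Append the entry; a bypass without a stated reason is refused."""
    if not reason.strip():
        raise ValueError("a bypass needs a stated reason")
    entry = bypass_entry(reason, when)
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(entry + "\n")
    return entry
```

Opening the file in append mode is the point of the design: earlier entries are never rewritten, so the log can still answer questions weeks later.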

6. Handling failure

What the operator sees when verification fails.

[verify] FAILED at step: Test (4/7)
[verify] 2 tests failed:
  - integration/auth.test.ts > "github oauth callback redirects to next"
  - integration/db.test.ts > "rls policy blocks other user's row"
[verify] Full log: /tmp/verify-2026-05-17-1423.log
[verify] Report blocked.

The core is that failure info compresses into one screen: where it failed, what the issue is, and where the detailed log lives. The operator decides the next move from that single screen (fix / bypass / SKIP).

Spilling the full log into the console makes the operator re-summarize it themselves, which is fresh load in itself.
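One way to keep the failure output to a single screen is to cap how many items are listed and point to the log for the rest. The sketch below assumes a hypothetical `failure_summary` helper and an arbitrary cap of five items; the line shapes follow the sample output in this section.

```python
from typing import List


def failure_summary(
    step: str,
    position: int,
    total: int,
    failures: List[str],
    log_path: str,
    max_items: int = 5,
) -> str:
    """Compress a failed step into a short block: where, what, and the log path."""
    lines = [
        f"[verify] FAILED at step: {step} ({position}/{total})",
        f"[verify] {len(failures)} failure(s):",
    ]
    shown = failures[:max_items]
    lines += [f"  - {item}" for item in shown]
    hidden = len(failures) - len(shown)
    if hidden > 0:
        # Everything past the cap stays in the log instead of the console.
        lines.append(f"  ... and {hidden} more in the full log")
    lines.append(f"[verify] Full log: {log_path}")
    lines.append("[verify] Report blocked.")
    return "\n".join(lines)
```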

7. Traps — patterns where verification fails

7.1 Too slow

If verification takes more than 30 seconds, the operator starts thinking "this is too slow" and reaches for bypass often. Once bypass is the default, verification loses its point. → Cheapest steps first (lint/type catch most failures), expensive steps later.
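To know which steps are cheap, the loop has to be timed. The sketch below measures each step, checks the total against the 30-second threshold mentioned above, and suggests an order from cheapest to most expensive. All function names are hypothetical, and the injectable `clock` parameter exists only to make the sketch testable.

```python
import time
from typing import Callable, List, Tuple

BUDGET_SECONDS = 30.0  # the threshold named in the text above

Timings = List[Tuple[str, float]]


def time_steps(
    steps: List[Tuple[str, Callable[[], None]]],
    clock: Callable[[], float] = time.perf_counter,
) -> Timings:
    """Run each step once and record how long it took."""
    timings = []
    for name, run in steps:
        start = clock()
        run()
        timings.append((name, clock() - start))
    return timings


def over_budget(timings: Timings, budget: float = BUDGET_SECONDS) -> bool:
    return sum(seconds for _, seconds in timings) > budget


def cheapest_first(timings: Timings) -> List[str]:
    """Suggested order: fastest step first, slowest last."""
    return [name for name, _ in sorted(timings, key=lambda item: item[1])]
```

Duration is only one input. A slow step that catches most failures may still deserve an early slot, so the suggested order is a starting point rather than a rule.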

7.2 Too many false positives

If verification often raises false alarms, the operator starts distrusting it, and distrust leads to bypass. → Verification itself must be trustworthy. The moment a false alarm is caught, fix the verification logic immediately.

7.3 Pass/fail only, no specifics

If the output says only "failed" with no where or why, the operator has to dig through logs themselves, and the load shifts back to the operator. → Output stays short but specific (which step, which test, which line).

7.4 SKIP conditions too narrow

If verification fires on almost every action, the operator gets fatigued. When read-only work gets verified, verification itself becomes noise. → Spell out enough SKIP conditions in the persona.

7.5 Verification verifying itself

The verification harness is itself code, and that code can regress. If a lint rule is silently broken and no longer flags anything, or the test runner is mid-upgrade and skips half the suite, the loop returns "passed" while the actual checks did not run. The operator sees green and ships.

There is no clean answer here — verifying the verifier is an infinite regress. What does help is treating any change to the verification scripts themselves as the highest-risk category of change. The 8-item security checklist applies, integration smoke tests run on the harness itself, and any "I refactored verify, all looks fine" report is the exact place to slow down. The trap is that verification ergonomics push toward "fix the verifier silently and move on." The system has to push the other way — verifier changes get more scrutiny, not less.
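One possible form of a smoke test on the harness is a canary: feed each checker an input it must reject and flag any checker that accepts it. This is a technique swapped in for illustration, not something the post describes; the `Canary` shape and `broken_checkers` are hypothetical, and the two toy checkers exist only to show the idea.

```python
from typing import Callable, List, Tuple

# (step name, checker returning True on pass, input that must be rejected)
Canary = Tuple[str, Callable[[str], bool], str]


def broken_checkers(canaries: List[Canary]) -> List[str]:
    """Return the steps whose checker accepted input it should have rejected."""
    return [name for name, passes, bad_input in canaries if passes(bad_input)]


def toy_lint(source: str) -> bool:
    """Toy rule: 'import os' must be followed by some use of 'os.'."""
    return "import os\n" not in source or "os." in source


def silently_broken_security(source: str) -> bool:
    """A checker that regressed into always passing."""
    return True
```

Running the canaries before the real loop turns "the check did not run" from an invisible pass into a visible failure, though it only covers the cases someone thought to plant.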

8. Compressed into one principle

The core of verification-loop design collapses into one sentence.

"The moment the operator is about to say 'done,' the system cuts in for one beat. Only when it passes can that word go out. Bypass is possible but always leaves a trace."

When this holds, the verification loop becomes a safety net that halts the operator's relief for one beat. When it breaks, verification gets turned off, gets distrusted as a source of false alarms, or bypass becomes the default.

The longer this loop runs, the harder it becomes to imagine working without it. After six months of auto-fire, the moment of relief no longer feels safe to act on: the mind has internalized the pattern that "done" is the system's word, not mine. That internalization is the actual product of the loop, beyond any single bug it caught. The procedural discipline shifts from being something the operator enforces on themselves to something the environment quietly guarantees.

The next post covers the foundational resource that everything above (verification, memory, skills, hooks) depends on: the token economy, that is, what to admit into the context window and what to keep out.

7. Verification loops — locking down the definition of 'done' in the system