Where using Claude as a tool stopped working, and how persona, memory, skills, hooks, and verification gates accreted into a daily partner I now call Alice.

This is not a "how to use Claude Code" post. Those live in the /ai-agents category on the same site. This one is a step removed.

For the past several months, I have spent my time on something different from using Claude. I have been building a collaborator on top of it. I gave that collaborator a name: Alice. This category is a record of what Alice is, why I built her, and how I actually work with her every day.

0. The premise to start with — using AI begins with mind

One premise has to be made clear up front.

Drop the assumption that AI will automatically produce what people want. People try to make something because they have a purpose. That purpose is what I call mind: there is something they want.

To get the result you want out of AI, you have to be able to state your purpose clearly. A one-line prompt like "make me a fun game" cannot control the output. It is not because AI is lacking. It is because almost none of the operator's mind is in that prompt.

So the first thing to think about when using AI is not the tool's capability. It is finding a way to effectively convey the direction and purpose in my head to the AI.

Everything organized in this category — persona, memory, skills, hooks, multi-agent, verification gates — is, in the end, the accumulated answer to a single question. "How do I accumulate my mind so AI does not lose it?"

1. Where using it as a tool stopped working

Like everyone, I started by using Claude as a tool. I asked questions, got answers, started a new context every session. Sometimes the answers were great. Sometimes the same mistakes came back. The fact that those two alternated mattered.

When the same mistake comes back, one of two things is true — either the tool is missing something, or the context around the tool disappears every time. For me it was the second. The reason a decision was made, the reason an approach failed, the reason a pattern worked — all of it evaporated between sessions, and the next session had to start over.

That is where I made a decision. I did not want a tool that started fresh every time. I wanted a collaborator that accreted memory.

2. Persona came first

The first thing I did was define a persona. Not a one-line system prompt — a real job description for a senior engineer.

The persona spelled out:

Role — senior engineer, working independently under direction from a team lead.
Areas of expertise — which languages and platforms.
Chain of command — who gives direction, what reporting looks like.
Decision authority — what to decide autonomously, what needs approval.

Once the persona was clear, the texture of the answers shifted. Instead of "here are some options A, B, C," I started getting "in this situation I would recommend A because X." That is the difference between a search engine and a senior engineer's judgment.

A persona is not a tone setting. It is the distribution of decision authority.
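One way to make "distribution of decision authority" concrete is to write the persona as data rather than prose. The sketch below is only an illustration: the `Persona` class, its field names, and the example actions are assumptions made for this example, not the format Alice actually uses.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Persona:
    """Hypothetical persona record mirroring the four items listed above."""
    role: str
    expertise: tuple[str, ...]
    reports_to: str
    # Actions the collaborator may take without asking first.
    autonomous: frozenset[str] = field(default_factory=frozenset)

    def needs_approval(self, action: str) -> bool:
        # Anything not explicitly granted is escalated to the lead.
        return action not in self.autonomous


alice = Persona(
    role="senior engineer",
    expertise=("Python", "TypeScript"),  # placeholder values
    reports_to="team lead",
    autonomous=frozenset({"refactor", "write-tests"}),
)

print(alice.needs_approval("refactor"))          # False
print(alice.needs_approval("change-db-schema"))  # True
```

The design choice worth noting is the default: an action that is not listed requires approval, so forgetting to list something errs on the side of asking.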

3. Memory — files turned out to be the answer

Next was memory. I started thinking about elaborate RAG systems and quickly gave up. The amount of context I actually accrete in a day is not that large. What I needed was a non-volatile place for it to live.

The answer was the file system. A handful of markdown files. A single MEMORY.md as the index, with topic-specific files underneath.

| Memory type | What it stores |
| --- | --- |
| `user` | Who the operator is, what role they play, what they know |
| `feedback` | Explicit guidance from the operator: do this, don't do that |
| `project` | Current work context: who, why, what |
| `reference` | Pointers to external systems: where information lives |

Two things matter about this structure. First, a human can read it directly. Second, the tool auto-loads the first portion of the index at session start.

Because memory auto-loads at session start, the pattern of "let me re-explain what we talked about last time" disappeared. More importantly — a piece of feedback I gave once is still alive in the next session.
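The index-plus-topic-files layout can be sketched in a few lines. This is a minimal sketch under stated assumptions: the 200-line cutoff, the `<type>_<topic>.md` naming scheme, and the function names `load_index` and `topic_files` are all hypothetical, chosen only to show the shape of the idea.

```python
from pathlib import Path

MEMORY_TYPES = ("user", "feedback", "project", "reference")


def load_index(memory_dir: Path, max_lines: int = 200) -> str:
    """Return the first `max_lines` lines of MEMORY.md, or "" if it is missing."""
    index = memory_dir / "MEMORY.md"
    if not index.exists():
        return ""
    lines = index.read_text(encoding="utf-8").splitlines()
    return "\n".join(lines[:max_lines])


def topic_files(memory_dir: Path) -> dict[str, list[Path]]:
    """Group topic files by memory type, assuming `<type>_<topic>.md` names."""
    grouped: dict[str, list[Path]] = {t: [] for t in MEMORY_TYPES}
    for path in sorted(memory_dir.glob("*.md")):
        prefix = path.stem.split("_", 1)[0]
        if prefix in grouped:  # MEMORY.md itself falls through here
            grouped[prefix].append(path)
    return grouped
```

Only the index is read eagerly. Topic files are listed but not loaded, which keeps the session-start cost bounded no matter how many topics accumulate.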

4. Skills and slash commands — collapsing the repetition

With persona and memory in place, the next thing I noticed was repetition. The same flow happens every day. Enter a project, read the HANDOFF, work, end the session by writing a log, commit, push.

I turned the repetition into slash commands. For example:

/start <project> — enter a project, auto-load required documents, report current state
/session-end — close out a session, update logs, HANDOFF, memory, push
/verify — run the lint/build/test loop after code changes
/council — when a decision is genuinely ambiguous, run a four-perspective analysis (architect / skeptic / pragmatist / critic) in parallel. This pattern is adapted from Andrej Karpathy's multi-perspective critique approach to LLM reasoning.

Each command on its own is not an invention. The accumulated effect is what matters. Being able to invoke a daily procedure with one line removes that much cognitive load from the operator.
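As an illustration of what a command like /verify collapses into one line, here is a minimal sketch of a lint/build/test loop that stops at the first failing step. The function name `run_verify` and the demo commands are placeholders, not the actual contents of the skill.

```python
import subprocess
import sys
from typing import NamedTuple, Sequence


class StepResult(NamedTuple):
    name: str
    passed: bool


def run_verify(steps: Sequence[tuple[str, list[str]]]) -> list[StepResult]:
    """Run each (name, command) in order and stop at the first failure."""
    results: list[StepResult] = []
    for name, command in steps:
        completed = subprocess.run(command, capture_output=True, text=True)
        passed = completed.returncode == 0
        results.append(StepResult(name, passed))
        if not passed:
            break  # later steps are not meaningful once one has failed
    return results


if __name__ == "__main__":
    # Placeholder commands; substitute the project's real lint/build/test tools.
    demo_steps = [
        ("lint", [sys.executable, "-c", "print('lint ok')"]),
        ("build", [sys.executable, "-c", "print('build ok')"]),
        ("test", [sys.executable, "-c", "raise SystemExit(1)"]),
    ]
    for result in run_verify(demo_steps):
        print(f"{result.name}: {'pass' if result.passed else 'FAIL'}")
```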

5. Hooks — never forgetting

Skills are something the operator consciously invokes. Hooks are different. Hooks fire on events automatically.

Simplest example: at session start a hook runs and prints a one-line memory-sync status. The operator reads that single line and knows immediately — okay, memory is current.

Another example: just before the assistant tries to report "done," the /verify skill auto-fires. Right when the assistant is about to write "finished," lint, build, and tests run automatically. If they fail, the report is blocked. Only when they pass does it go through.

The key to this pattern is that procedures the operator could skip "because they forgot" get enforced at the system level instead.
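The "blocked unless verification passes" behavior can be sketched as a small gate function. This is not the hook configuration format of any particular tool. `report_done` and `ReportBlocked` are hypothetical names, and the `verify` callable stands in for whatever check the hook actually runs.

```python
from typing import Callable


class ReportBlocked(Exception):
    """Raised when a completion report is attempted while verification fails."""


def report_done(message: str, verify: Callable[[], bool]) -> str:
    """Pre-report hook: release the message only if verify() returns True."""
    if not verify():
        raise ReportBlocked("verification failed; 'done' report withheld")
    return message


# Usage with stand-in checks; a real hook would call the lint/build/test loop.
print(report_done("feature X finished", verify=lambda: True))
try:
    report_done("feature Y finished", verify=lambda: False)
except ReportBlocked as exc:
    print(f"blocked: {exc}")
```

Because the check sits on the only path that produces the report, there is no way to say "done" without having run it.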

6. Multi-agent delegation with gates

The last piece was a structure where one agent does not do everything. If Alice does all the work directly, the context window fills fast and the output drifts.

Instead, large tasks get split and delegated to subagents. Each subagent has its own context and focuses on one thing. Only a summary returns to the main context.
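The "own context, summary only" idea can be sketched as follows. Everything here is an assumption made for illustration: `Subagent`, `delegate`, and `fake_work` are hypothetical names, and `fake_work` stands in for a real agent call.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Subagent:
    """Hypothetical subagent: owns a private context and returns only a summary."""
    task: str
    work: Callable[[str], list[str]]  # produces detailed working notes
    context: list[str] = field(default_factory=list)

    def run(self, max_summary_chars: int = 80) -> str:
        self.context.extend(self.work(self.task))  # detail stays in here
        last = self.context[-1] if self.context else "no output"
        return f"{self.task}: {last}"[:max_summary_chars]


def delegate(tasks: list[str], work: Callable[[str], list[str]]) -> list[str]:
    """Main context receives one summary line per task, never the full notes."""
    return [Subagent(task, work).run() for task in tasks]


def fake_work(task: str) -> list[str]:
    # Stand-in for a real agent call; the notes are invented for the demo.
    return [f"read files for {task}", f"tried approach A for {task}", "done, 3 files changed"]


main_context = delegate(["auth module", "billing module"], fake_work)
print(main_context)
```

The main context grows by one short line per task regardless of how much intermediate work each subagent did.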

Plus one more thing — delegation gates. When I delegate, I do not just say "do this work." I bundle three things into the same PR:

A delegation brief — what, why, and how
An automated verification script — how to check the result automatically
A merge gate — CI that blocks the merge unless verification passes

The reason for this pattern is simple. Once, when I wrote only the brief and promised verification verbally without building a gate, I later discovered that drift had accumulated in the deliverable. Since that incident, when I start a delegation I bundle all three from the beginning.

If a promise is not enforced in code, drift accumulates.
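One way to enforce the three-part bundle itself is a check that a delegation PR actually contains all three pieces. The sketch below assumes hypothetical path conventions (a `delegations/` directory and GitHub Actions workflow files). The author's real layout is not stated, so treat the patterns as placeholders.

```python
from fnmatch import fnmatch

# Hypothetical path conventions for the three parts of a delegation PR.
REQUIRED_PARTS = {
    "brief": "delegations/*/BRIEF.md",
    "verification script": "delegations/*/verify.*",
    "merge gate": ".github/workflows/*.yml",
}


def missing_parts(changed_files: list[str]) -> list[str]:
    """Return the names of required parts that no changed file matches."""
    return [
        name
        for name, pattern in REQUIRED_PARTS.items()
        if not any(fnmatch(path, pattern) for path in changed_files)
    ]


pr_files = ["delegations/export-api/BRIEF.md", "src/export.py"]
print(missing_parts(pr_files))  # ['verification script', 'merge gate']
```

Run in CI, a non-empty result would fail the build, which turns "I will add verification later" into something the merge gate refuses to accept.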

7. The result — how daily work changed

What did all this accretion change about how I actually work?

The biggest change is that context-switching cost has almost disappeared. I no longer have to reconstruct where I left off yesterday, why I made the decisions I made, and what the next step is, every time I start a session. When I start, Alice joins already knowing all of that.

The second change is that the burden of repeat decisions dropped. Answers to "how do we handle this kind of thing" accumulate as memory and feedback, so I rarely have to make the same decision twice.

The third change is that repeat mistakes get blocked at the system level. A mistake gets recorded as a lessons-learned entry, and that lesson automatically influences the next decision. The structure makes it hard to make the same mistake again.

8. What this category will cover

This category is the implementation record of everything above. Coming posts will cover:

Persona design — what to put in the system prompt and what to leave out
Memory systems — file structure, index strategy, auto-load vs on-demand
Skills and slash commands — what repetition is worth promoting to a command
Hooks and automation — what to enforce as auto-fire
Multi-agent delegation — what to hand off and what to do yourself
Verification loops — how to enforce a real definition of "done" for code work
Token economy — what to put in the context window and what to keep out
Session protocols — what should happen at start, middle, and end

Each post is implementation, not theory. Where I started, what did not work, what did. If you are building your own collaborator on top of Claude, Codex, or any other agent, these posts should save you months.

One last thing. Alice is a name I chose, not a new product from anyone. Alice is the accumulation of context I have invested time in, and this category is the record of that accumulation.
