Loop Engineering — Building the System That Prompts Claude For You

June 23, 2026 · Selma Kocabıyık

For two years we’ve been stuck inside the same loop with AI agents: you write a prompt, Claude runs and stops; you read, you prompt again. Write → wait → check → tell it again. The biggest bottleneck in this loop is you.

Every turn hangs on your input. If you spend 6 hours in front of Claude Code, exchanging messages every 15 minutes, you’re losing two things at once: your own time, and your tokens, because you can’t see where the agent got stuck until you check in.

The real leverage is no longer in writing better prompts. The leverage is in building the system that prompts the agent for you. This is called loop engineering. I built this system on top of Claude Code; below I explain step by step why and how it works.

And I didn’t just explain it — with the loop-engineering skill I share at the end of this article, you can set up the same architecture in your own Claude Code with a single command.

1 → Are you the loop? — Manual vs. Automatic

There are two kinds of loops in front of us.

Manual (YOU are the loop): Prompt → Claude runs → stops → you review → prompt again. The system depends on your tempo; if you’re away, there’s no progress.

Automatic (Claude closes the loop): You give the goal → Claude runs → a controller checks “Is it done?” → if it is, it ends; if not, Claude takes another turn. You step out of the loop; the system keeps turning until the goal is met.

Loop engineering enables exactly this transition — handing control over to Claude. But “Claude turns on its own” is not enough by itself; how that loop turns determines everything.

2 → Open vs. Closed Loop — Is there a “Gate” at the door?

Let’s say we’ve put the system on automatic. But is there a boundary? Here the work splits in two.

Open Loop: You say “go handle this” and set no boundary, so the agent has wide freedom. It’s a powerful approach, but it’s expensive, the output is always open to surprises, and if the agent misses the goal you won’t even notice. It only suits those with an unlimited budget.

Closed Loop: You set a clear goal, a control gate, and a definite stopping point. It’s cheap, safe, repeatable, and suitable for routine work.

Don’t mix the two axes: the first section was the question “is it you, or is it automation”; this section is the question “if automation exists, is there a controller at the door”.

Loop engineering = Automatic and closed loop.
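The combination of “automatic” and “closed” can be sketched in a few lines of Python. This is only an illustration of the control flow, not how Claude Code implements it: run_closed_loop, maker and gate are hypothetical names, where maker stands in for one agent turn and gate for whatever check decides that the work is done.

```python
from typing import Callable, NamedTuple


class LoopResult(NamedTuple):
    done: bool   # True if the gate passed before the ceiling was hit
    turns: int   # number of maker turns actually taken


def run_closed_loop(
    maker: Callable[[int], None],   # hypothetical: performs one agent turn
    gate: Callable[[], bool],       # hypothetical: returns True when the goal is met
    max_turns: int,
) -> LoopResult:
    """Take turns until the gate passes or the turn ceiling is reached."""
    for turn in range(1, max_turns + 1):
        maker(turn)
        if gate():
            return LoopResult(done=True, turns=turn)
    return LoopResult(done=False, turns=max_turns)


# Toy demo: the "work" counts as finished once three fixes have been applied.
state = {"fixes": 0}


def toy_maker(turn: int) -> None:
    state["fixes"] += 1


def toy_gate() -> bool:
    return state["fixes"] >= 3


result = run_closed_loop(toy_maker, toy_gate, max_turns=10)
print(result)
```

The two exits are the point: the loop ends either because the gate says yes or because the ceiling was reached, never because the maker declared itself finished.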

3 → Don’t confuse /goal, /loop and routines

When using Claude Code, these three concepts get constantly mixed up. The difference between them is clear.

/goal “goal”: You give the system the main objective you want to reach (e.g. “keep turning until the tests pass”). Claude works on its own across turns in the background until it completes this objective. After each turn, a small and fast model (Haiku by default) checks whether the goal has been completed and stops the loop once the objective is reached. This is the ready-made engine for a conditional closed loop.

/loop: This isn’t scheduling, it’s instant repetition. It puts a prompt into a loop while the same session is open — either at a fixed interval (e.g. /loop 5m ...) or, if you give no interval, Claude turns at its own pace. It stops when you close the session.

Routines (/schedule): This is where you actually say “run this every night at this hour”. They are scheduled tasks in Anthropic’s cloud, tied to cron, that run even when you shut down your computer. This is routine automation and it’s an entirely separate topic.

In short: /goal manages the condition, /loop manages instant repetition in the same session, and routines manage the calendar. The closed-loop architecture in this article takes shape on the /goal side (and, even more robust, the Stop hook in the next section).

4 → Who is the controller? — The most critical part of the work

So who decides that the work is finished? In the /goal command, a small model makes this call by default. This is extremely practical and flexible, and it works well in most scenarios. However, in areas like coding, where the question of whether the output works has a precise, mechanical answer, leaving control solely to an LLM has two big side effects: extra cost and a slower improvement loop.

At this point there’s a much-discussed technique: the Ralph Wiggum loop, the approach popularized by Geoffrey Huntley, which even ships as a plugin for Claude Code. The logic is this: you keep feeding the agent the same goal until it finishes the work. It’s a kind of brute force built on persistence.

It works, but its biggest weakness is this: if there’s no mechanical control gate behind it, the agent can leave the work half-done, or fill the code with placeholders, and still say “done”. We call this false completeness: when you wake up in the morning, a broken pile of code that won’t compile is waiting for you. The reason is very clear: when the maker and the checker are the same AI, nothing independent ever challenges that “done”.

The system’s second weakness is that it can’t figure out on its own when to stop. Without a real gate and a ceiling, the loop will either give up too early and say “done”, or keep turning without ever stopping. The latter both burns tokens and inflates the context window as the conversation history piles up. That’s exactly why a second safety condition is needed alongside the mechanical gate: a maximum attempt limit (max_turns or a budget limit).

If your goal is something testable, the solution is crystal clear: don’t leave control to an LLM. Put a Stop hook in the system to do the auditing. The hook fires whenever Claude wants to stop; if the tests aren’t passing, it refuses the stop and sends Claude back to keep working on the problem. Technically the hook does this either by returning exit code 2 or by printing {"decision": "block", "reason": "..."}; in either case Claude cannot stop and keeps working.

This gate:

▸Cannot be fooled: The script you wrote can’t say “good enough”; the output is either green or red.
▸Spends zero tokens: Because verification is done with a plain script rather than AI, the control stage is completely free.
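Here is a minimal sketch of such a gate in Python, using the JSON route described above. The test command, the ceiling of 10 and the counter file path are assumptions for illustration, and decide, run_tests and load_count are hypothetical helper names rather than part of any Claude Code API.

```python
import json
import subprocess
import sys
from pathlib import Path

TEST_COMMAND = ["npm", "test"]               # assumption: your project's test command
MAX_BLOCKS = 10                              # assumption: ceiling on consecutive blocks
COUNTER = Path(".claude/gate_blocks.txt")    # hypothetical file that remembers the count


def run_tests(command):
    """Run the test command and return (passed, tail_of_output)."""
    try:
        proc = subprocess.run(command, capture_output=True, text=True, timeout=600)
    except (OSError, subprocess.TimeoutExpired) as exc:
        return False, f"could not run tests: {exc}"
    return proc.returncode == 0, (proc.stdout + proc.stderr)[-2000:]


def decide(passed, blocks_so_far, max_blocks, output):
    """Return (decision, new_count); a decision of None means "allow the stop"."""
    if passed:
        return None, 0                       # green: let Claude stop, reset the counter
    if blocks_so_far >= max_blocks:
        return None, 0                       # ceiling hit: give up instead of looping forever
    reason = "Tests are still failing. Fix them before stopping.\n" + output
    return {"decision": "block", "reason": reason}, blocks_so_far + 1


def load_count(path):
    """Read the number of consecutive blocks so far; a missing or bad file counts as 0."""
    try:
        return int(path.read_text())
    except (OSError, ValueError):
        return 0


def main():
    passed, output = run_tests(TEST_COMMAND)
    decision, count = decide(passed, load_count(COUNTER), MAX_BLOCKS, output)
    COUNTER.parent.mkdir(parents=True, exist_ok=True)
    COUNTER.write_text(str(count))
    if decision is not None:
        print(json.dumps(decision))          # JSON on stdout with exit code 0 refuses the stop
    elif not passed:
        print("gate: ceiling reached, tests still red", file=sys.stderr)
    return 0
```

To use it as a hook script, call sys.exit(main()) from the usual __main__ guard at the bottom of the file. The ceiling is a deliberate design choice: once the limit is reached the gate lets the stop through with a warning, so a red test suite can never pin the loop open indefinitely.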

And here’s the real distinction: most loop setups stop on an “approval” word, or on the decision of a small model as in /goal. I put the real test at the gate. The whole secret lies right here: AI is not checking AI.

(If the goal isn’t testable — for example “is this presentation aesthetically good?” — then you necessarily need an LLM-judge and a clear rubric; but in technical areas like code, a real test is always the most robust path.)

5 → Token honesty — Why isn’t the closed loop the enemy of your wallet?

The first reaction from people who see this system is always the same: “The loop will keep turning on its own, who knows how many tokens it’ll burn?” Let me be honest: set up correctly, the closed-loop architecture is actually far cheaper. Four things make that true:

▸Context Caching: We don’t process the same context from scratch every turn; the system reads the memory at roughly 10% of the cost, that is 10 times cheaper. Instead of reading the book from the beginning every time, we put a bookmark where we left off.
▸Model Routing: Cheap models take on the legwork and the verifications, while the main brain stays on the powerful model (roughly a 5x cost advantage).
▸Gate = Zero Tokens: Because we do verification with the Stop hook (that is, code/script), the control step comes completely free. This is the strongest savings line item.
▸Hard Limits: You put a max_turns or budget limit on the system. The loop never turns forever.

The real token monster is the open-loop architecture, because none of these protective shields exist there. In a closed loop with a correct architecture, most of the token worry goes away.
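To make the caching line item concrete, here is a back-of-the-envelope calculation in Python. Every number in it is hypothetical (turn count, token counts, price), and the model is deliberately simplified: it assumes a fixed context re-sent each turn, and it ignores output tokens and any cache-write surcharge. Only the 10% cached-read ratio comes from the list above.

```python
def loop_cost(turns, context_tokens, fresh_tokens, price_per_mtok, cached_ratio=1.0):
    """Input cost of a loop that re-sends a fixed context every turn.

    cached_ratio is the price of a cached read relative to a normal read
    (1.0 means no caching). The first turn always pays full price.
    """
    if turns <= 0:
        return 0.0
    first = context_tokens + fresh_tokens
    later = (turns - 1) * (context_tokens * cached_ratio + fresh_tokens)
    return (first + later) * price_per_mtok / 1_000_000


# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
TURNS, CONTEXT, FRESH, PRICE = 20, 50_000, 2_000, 3.0

uncached = loop_cost(TURNS, CONTEXT, FRESH, PRICE)
cached = loop_cost(TURNS, CONTEXT, FRESH, PRICE, cached_ratio=0.10)
print(f"uncached: {uncached:.3f}  cached: {cached:.3f}  ratio: {uncached / cached:.1f}x")
```

In this toy setup the cached run comes out several times cheaper on input rather than a full 10x, because the fresh tokens of every turn are still billed at the normal rate. The 10% figure applies to the re-read context, not to the whole bill.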

6 → Do you really need this loop? — A short honesty test

I’m not selling anything here: loop engineering isn’t worth it for every scenario. If you don’t meet the four conditions below, don’t build this system — you’ll just burn money.

▸Does the work repeat itself? If it’s a one-off job, a good prompt is enough; building a loop is overkill.
▸Can verification be automated? If you don’t have a test, build, or lint mechanism in hand, the agent is forced to approve its own homework and the gate becomes useless.
▸Can your budget handle this? The closed loop is economical but it isn’t free either.
▸Does the agent have the tools it needs? It must have an isolated environment where it can run the tests and read the logs.

Bonus: the same applies to long automation systems, because the work doesn’t have to be small. If you have a website idea, you can build an end-to-end closed loop that spans planning, carrying out the tasks, and auditing its own output. This inner loop runs even while you sleep, and the upper layer that learns and grows (the outer loop) is in Section 8.

7 → Practice: Build your own closed loop (Skeleton)

We’ve covered the concepts; now let’s make them concrete. The smallest working closed loop consists of these four parts, all set up inside Claude Code with no extra framework needed.

▸Goal: Write a clear and testable task (“reject the password in the signup form if it’s shorter than 8 characters”). If you don’t know exactly which file the bug is in, don’t impose the file on the agent; saying “examine this behavior, find the cause and fix it” is enough.
▸Maker: The agent that writes the code. Assign it the powerful model (Sonnet or Opus).
▸Gate, the heart of the work: A Stop hook. It is a command that you bind to the Stop event inside .claude/settings.json and that runs the real test (for example a small bekci.py script, named after the Turkish word for “watchman”, that runs npm test behind the scenes). If the test is red, the hook doesn’t allow stopping. This step spends 0 tokens, because the thing running is not AI but a plain script.
▸Limit: Use /goal “keep turning until the tests pass”, or put a turn/budget limit in place so the loop doesn’t turn forever.
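For the Gate part, the binding in .claude/settings.json is a small nested structure. The sketch below builds and prints it from Python so the shape is easy to see. It follows the hooks layout as I understand it from the Claude Code documentation, so check the current docs before relying on it; stop_hook_settings is a hypothetical helper and the script path is an assumption.

```python
import json


def stop_hook_settings(command):
    """Build the settings fragment that runs `command` on every Stop event."""
    return {
        "hooks": {
            "Stop": [
                {"hooks": [{"type": "command", "command": command}]}
            ]
        }
    }


# Hypothetical path: put bekci.py wherever suits your repository.
settings = stop_hook_settings("python3 .claude/hooks/bekci.py")
print(json.dumps(settings, indent=2))
```

Merge the printed fragment into your existing settings file rather than overwriting it, since that file may already hold other configuration.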

Keep the model split cheap: have Haiku do work like auditing, log reading and general evaluation, and let the main brain (the part writing the code) stay on the powerful model.

Which gate suits which work?

▸Testable work (code): always use a script hook; it can’t be fooled and it’s completely free.
▸Non-testable work (presentation, text, “does this look good?”): the gate is set up not by a script but by another agent with a clear rubric. This scenario spends tokens, which we have to accept.
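For the non-testable case, a rubric gate can keep the same shape as the script gate: every criterion is scored, and anything below the threshold blocks the stop with a concrete reason. The sketch below is one possible arrangement, not a prescribed design. rubric_gate, the 1 to 5 scale, the threshold and the example criteria are all assumptions, and stub_judge is a stand-in for a real model call so that the example runs offline.

```python
from typing import Callable, Dict, List, Optional


def rubric_gate(
    artifact: str,
    rubric: List[str],
    judge: Callable[[str, str], int],   # hypothetical: returns a 1-5 score per criterion
    min_score: int = 4,
) -> Optional[Dict[str, str]]:
    """Return None if every criterion meets min_score, else a block decision."""
    failing = []
    for criterion in rubric:
        score = judge(artifact, criterion)
        if score < min_score:
            failing.append(f"{criterion} (scored {score}, need {min_score})")
    if not failing:
        return None
    return {"decision": "block", "reason": "Rubric not met: " + "; ".join(failing)}


# Hypothetical rubric for a slide deck.
RUBRIC = [
    "Every slide has a single main message",
    "No slide exceeds 40 words",
]


def stub_judge(artifact: str, criterion: str) -> int:
    """Stand-in for a model call so the sketch runs offline."""
    return 2 if "40 words" in criterion else 5


print(rubric_gate("draft deck text", RUBRIC, stub_judge))
```

Scoring each criterion separately means the reason handed back to the maker names exactly what failed, which is more useful than a single overall verdict.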

See the loop work with your own eyes: After Claude fixes the code, deliberately add the bug back (for example with node sifirla.js; sıfırla is Turkish for “reset”) and say “fix until the tests pass”. You’ll see the watcher catch the bug and the loop repair the code on its own. Even if the screen says “auto mode on”, the real control in the background is done by the watcher script you wrote.

If you’d rather not bother setting up these four parts one by one — the loop-engineering skill I share starts from the honesty test, asks you the right questions, and builds the loop for you.

8 → From single loops to networks: There’s an “Outer Loop” on the horizon

The real big conversation in the AI world is heading here: building massive, self-growing networks by connecting individual loops to one another. These loops feed off a shared memory layer (shared records, signals, logs); while one works on support, one on SEO, and another on growth, they all benefit from what the others learn.

This is called the outer loop: the upper layer that decides what to do, stores the state, and gets smarter over time. (The single system I describe in this article is the inner loop.)

It’s a very powerful vision; but a warning: a network where multiple agents run connected to one another can only be as solid as the quality of the individual loops it’s built on. If your individual loops don’t have a real, mechanical control gate, each agent falsely approves its own output. With a chain-reaction error, all the agents start passing wrong data and code to each other; the system collapses like dominoes.

My next step is exactly this: turning this single loop into a fleet. A manager agent on top, and below it expert agents working with the same gate discipline.

Summary

The real leverage isn’t in writing prompts, it’s in building the system. Give the goal clearly (with a plain prompt or with /goal), put a real control gate behind it (in testable scenarios, Stop hook = can’t be fooled + zero tokens), and let the system turn on its own until it reaches the goal. You step out of the loop and take it easy. The closed loop protects both your time and your wallet; but if you don’t build that gate solidly, the system falls into the Ralph Wiggum trap and silently produces the wrong thing.

Most people just talk about this; I built one that runs like clockwork on top of Claude Code: I left a deliberate bug in the code → the tests turned red → I gave the goal → I took my hands off the keyboard → the loop found and fixed the bug → and the screen turned green.

You can build the same thing: github.com/selmakcby/loop-engineering. For more practical Turkish AI content: @selmaaii.