Workflow strategy
06/12/2026

Agent Loops and Loop Engineering: Old wine in new bottles?

In June 2026, "Agent Loops" is trending across the AI community. Peter Steinberger, Boris Cherny, and Addy Osmani frame it as the next step. We investigated what is genuinely new and what is just a relabel of concepts that have been standard in agent documentation since 2023: ReAct, Reflexion, iterative tool-use loops, stop hooks. A critical comparison for SMEs and decision-makers.

Old Wine in New Bottles? A Critical Take on "Agent Loops"

In June 2026, a two-sentence X post from Peter Steinberger spread across the AI coding community, with reports of more than 6.5 million views. The message: developers should stop merely prompting coding agents and start designing the loops that prompt those agents instead. Boris Cherny, who leads Claude Code at Anthropic, said on the Acquired podcast, in effect: "I do not prompt Claude anymore. I write loops that prompt Claude and decide what to do next." Addy Osmani framed this as "Loop Engineering" and published a framework of five building blocks plus memory.

We asked: is this genuinely a new approach, or is an established concept being given a fresh marketing label? This article places the current discussion in critical context.

What exactly is meant by "Agent Loops" and "Loop Engineering"?

Loop Engineering, as Osmani describes it, means: instead of typing every next prompt yourself, you design a small system that finds work, delegates it to agents, checks the result, stores progress, and repeats. Six building blocks are named: trigger, context, action, verification, memory, and escalation to a human.
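The six building blocks can be sketched as a small Python loop. This is a minimal illustration under stated assumptions, not Osmani's framework or any product's API: `run_loop`, `find_work`, `run_agent`, `verify`, and the JSON progress file are hypothetical names, and the agent call is a plain function standing in for a real model call.

```python
import json
from pathlib import Path
from typing import Callable

def run_loop(
    find_work: Callable[[], list[str]],      # trigger: what needs doing right now?
    run_agent: Callable[[str, dict], str],   # action: delegate one task (stub for an agent call)
    verify: Callable[[str, str], bool],      # verification: check the result of one task
    memory_path: Path,                       # memory: progress that survives between runs
    max_attempts: int = 3,
) -> list[str]:
    """Run one pass over the pending work; return the tasks escalated to a human."""
    if memory_path.exists():
        memory = json.loads(memory_path.read_text())
    else:
        memory = {"done": []}
    escalated = []
    for task in find_work():
        if task in memory["done"]:
            continue  # finished in an earlier run, skip
        for attempt in range(1, max_attempts + 1):
            # context: what the agent is told besides the task itself
            context = {"attempt": attempt, "done": list(memory["done"])}
            result = run_agent(task, context)
            if verify(task, result):
                memory["done"].append(task)
                break
        else:
            # escalation: all attempts failed verification
            escalated.append(task)
        memory_path.write_text(json.dumps(memory))  # persist after every task
    return escalated
```

The design choice worth noting is that the loop never decides on its own that a failed task is fine: anything that does not pass `verify` within `max_attempts` ends up in the returned list for a human.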

At first glance this sounds like a clean break from classical prompting. Reality is more nuanced. Claude Code and Codex have, in recent months, shipped product features that operationalize the allegedly new idea: /loop for timed reruns, /goal with an evaluator model that checks after each turn whether a stopping condition has been met, Cloud Routines, sub-agents, worktrees. That is new as a product feature, not as a concept.

What is genuinely new about Loop Engineering

Three aspects deserve the label "new":

1. Product maturity, not invention. What used to be hand-rolled bash scripts with cron and a few wrappers is now available as native functionality in Claude Code and Codex. /goal, for example, uses a small fast model (Haiku by default) that reads only the transcript and decides after each turn whether the condition is met. The split between worker and evaluator is not revolutionary as a pattern, but it is new as an immediately usable product feature.

2. Stopping conditions as a first-class concern. The conversation moves from "What should the agent do?" to "When must it stop?". That question is not new, but the current wave consistently puts it at the center. Osmani names tests, type checks, real errors, review queues, and budget limits as the tools. The attitude: a loop without a clear stopping condition is not a production system, it is an agent agreeing with itself.

3. Verification as a chain of signals. Instead of "tests green, done", the current wave promotes a hierarchy: deterministic oracles (CI, lint, type check) as the strongest signals, visual or perceptual checks (Playwright, screenshots) in the middle, and critic sub-agents and self-critique as the weakest. This hierarchy is a useful orientation, even if the individual signal types have been known for a long time.
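The signal hierarchy from point 3 can be sketched as an ordered list of checks. The names `Strength`, `Signal`, and `evaluate` are hypothetical, and the rule that a pass requires at least one deterministic signal is our own assumption for illustration, not something the cited sources prescribe.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Callable

class Strength(IntEnum):
    CRITIC = 1         # critic sub-agent or self-critique (weakest)
    PERCEPTUAL = 2     # screenshots, browser checks
    DETERMINISTIC = 3  # CI, lint, type check (strongest)

@dataclass
class Signal:
    name: str
    strength: Strength
    check: Callable[[], bool]  # returns True when the signal passes

def evaluate(signals: list[Signal]) -> tuple[bool, str]:
    """Run signals strongest first and stop at the first failure."""
    ordered = sorted(signals, key=lambda s: s.strength, reverse=True)
    for signal in ordered:
        if not signal.check():
            return False, f"failed: {signal.name}"
    # Assumption: weak signals alone never count as "done".
    if not any(s.strength == Strength.DETERMINISTIC for s in signals):
        return False, "no deterministic signal"
    return True, "all signals passed"
```

Running the strongest checks first means a failing lint or test run ends the evaluation before any tokens are spent on a critic model.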

What is not new, but is sold as a revolution

Here is the part that gives us pause.

Iterative tool-use loops have shipped in LangChain agents since 2023 at the latest. Every agent that cycles through Thought, Action, and Observation is running a loop. The ReAct paper by Yao et al. from 2022 describes exactly that pattern. Anyone who set up AutoGPT in 2023 ran a loop system. This is neither new nor surprising.
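To show how little machinery the Thought, Action, Observation pattern needs, here is a minimal sketch. It is not code from the ReAct paper or from LangChain: `react_loop` and `scripted_policy` are hypothetical names, and the policy is a hard-coded function where a real agent would call a model.

```python
from typing import Callable, Optional

# A policy maps the transcript so far to (thought, action, argument).
Policy = Callable[[list[str]], tuple[str, str, str]]

def react_loop(
    policy: Policy,
    tools: dict[str, Callable[[str], str]],
    max_steps: int = 5,
) -> tuple[Optional[str], list[str]]:
    """Alternate thought, action, and observation until the policy finishes."""
    transcript: list[str] = []
    for _ in range(max_steps):
        thought, action, arg = policy(transcript)
        transcript.append(f"Thought: {thought}")
        if action == "finish":
            return arg, transcript  # arg carries the final answer
        observation = tools[action](arg)
        transcript.append(f"Action: {action}[{arg}]")
        transcript.append(f"Observation: {observation}")
    return None, transcript  # step budget exhausted without an answer

def scripted_policy(transcript: list[str]) -> tuple[str, str, str]:
    """Stand-in for a model: call the tool once, then finish with its result."""
    observations = [line for line in transcript if line.startswith("Observation: ")]
    if not observations:
        return "I need the sum first.", "add", "2 3"
    return "I have the result.", "finish", observations[-1].removeprefix("Observation: ")

tools = {"add": lambda s: str(sum(int(x) for x in s.split()))}
answer, transcript = react_loop(scripted_policy, tools)
print(answer)  # prints 5
```

Everything the 2026 tooling adds (timers, evaluators, isolation) wraps around a core of this shape.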

Reflexion and self-critique have been documented as a pattern since 2023. The Reflexion paper by Shinn et al. (2023) and the LangChain implementations show how an agent critically evaluates its own outputs and improves iteratively. The "critic sub-agents" in the current Loop Engineering framework are a repackaging of that pattern, with the addition that critic and worker can be different models. That is a useful practical recommendation, not a new pattern.
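The Reflexion pattern (attempt, evaluate, reflect, retry with the reflection in memory) fits in a few lines. The sketch below is our own simplification with hypothetical names, not the authors' implementation; `actor`, `evaluator`, and `reflect` are plain callables, and each could be backed by a different model, which is the one practical addition the current framework makes.

```python
from typing import Callable, Optional

def reflexion_loop(
    actor: Callable[[str, list[str]], str],  # produces an attempt from task + past reflections
    evaluator: Callable[[str], bool],        # judges the attempt, ideally via an external check
    reflect: Callable[[str, str], str],      # turns a failed attempt into a verbal lesson
    task: str,
    max_trials: int = 3,
) -> tuple[Optional[str], list[str]]:
    """Retry a task, feeding lessons from failed attempts back to the actor."""
    reflections: list[str] = []
    for _ in range(max_trials):
        attempt = actor(task, reflections)
        if evaluator(attempt):
            return attempt, reflections
        reflections.append(reflect(task, attempt))
    return None, reflections  # no attempt passed within the trial budget
```

A usage example with stubs: an actor that returns "draft" on the first trial and "draft with tests" once a reflection exists, combined with an evaluator that requires the word "tests", succeeds on the second trial with one stored reflection.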

Persistent memory in the form of files is older than any current AI agent. Every developer who has ever written conventions into a README, a CONTRIBUTING.md, or a wiki has been doing external state. What Loop Engineering sells as a sixth building block has always been the foundation of every long-lived software system: documentation.

Human-in-the-loop (HITL) as a concept comes from control engineering of the 1950s and has been established in human-computer interaction for a long time. In software development, we know it as code reviews, pair programming, and the four-eyes principle. Escalation to a human is and remains an organizational pattern, not a technical novelty.

Trigger patterns, whether cron-based, event-driven, or webhook-based, are decades old, and cron itself dates back to 1970s Unix. None of at, cron, systemd timers, GitHub Actions, or cloud run jobs was invented in 2026.

Why the current wave still matters

This critique is not meant to downplay the value of the discussion. Three effects are real and useful:

1. Clear terminology helps when selling to decision-makers. If you have to explain to a managing director why a coding agent is "now more autonomous", "Loop Engineering" gets more attention than "We put cron on a wrapper." That is marketing, but productive marketing, because it legitimizes investment in verification and stopping conditions.

2. The shift in the design question. The discourse moves from "How do I write the perfect prompt?" to "What verification signals and stopping conditions do I need?". That is the right question, and in many teams it was asked too rarely before 2026.

3. Product features enforce discipline. If you use /goal with an evaluator model, you must formulate a measurable stopping condition. If you use sub-agents with isolation: worktree, you must separate responsibility. These structural constraints are valuable, even if the underlying ideas are older.
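What "a measurable stopping condition" means in practice can be sketched with a worker, an evaluator that sees only the transcript, and a turn budget. This is not how /goal is implemented: `goal_loop`, `worker`, and `goal_met` are hypothetical names, and the evaluator here is an ordinary function where the product uses a model.

```python
from typing import Callable

def goal_loop(
    worker: Callable[[list[str]], str],     # does one turn of work and returns its output
    goal_met: Callable[[list[str]], bool],  # evaluator: reads only the transcript
    max_turns: int,                         # budget limit, so the loop always terminates
) -> tuple[str, list[str]]:
    """Run the worker until the evaluator reports success or the budget runs out."""
    transcript: list[str] = []
    for _ in range(max_turns):
        transcript.append(worker(transcript))
        if goal_met(transcript):
            return "goal_met", transcript
    return "budget_exhausted", transcript  # hand over to a human at this point

# Usage with stubs: the condition is an exact, checkable statement about the output.
outputs = iter(["tests: 2 failing", "tests: 0 failing"])
status, log = goal_loop(
    worker=lambda transcript: next(outputs),
    goal_met=lambda transcript: transcript[-1] == "tests: 0 failing",
    max_turns=5,
)
print(status, len(log))  # prints goal_met 2
```

The constraint is visible in the signature: without a `goal_met` that returns a plain yes or no, the loop cannot be written at all.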

Where Loop Engineering hits limits

The same sources that fuel Loop Engineering also supply the cautionary tales. One Japanese field report documents 129 successful automated branch deletions with a clear verification rule, but also 43 commits from a PR babysitter loop in a single day, almost all of which were rejected because the scope was unclear and the loop drifted.
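The field report does not say how scope should have been enforced. One possible guard, sketched here as an assumption of ours with the hypothetical name `out_of_scope`, is to compare the paths a loop wants to commit against an explicit allow-list of glob patterns and to refuse the commit when anything falls outside it.

```python
from fnmatch import fnmatchcase

def out_of_scope(changed_paths: list[str], allowed_patterns: list[str]) -> list[str]:
    """Return the changed paths that match none of the allowed glob patterns."""
    return [
        path
        for path in changed_paths
        if not any(fnmatchcase(path, pattern) for pattern in allowed_patterns)
    ]

violations = out_of_scope(["docs/readme.md", "src/main.py"], ["docs/*"])
print(violations)  # prints ['src/main.py']
```

Note that in `fnmatch` patterns `*` also matches path separators, so `docs/*` covers nested files as well. A non-empty result is a deterministic signal that the loop has drifted and should stop.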

Three problems remain, even with the most elegant framework:

Verification stays a human responsibility. A second agent that checks the first often shares much of its training data and tends to make similar mistakes. The idea that a critic sub-agent can reliably control a worker agent overestimates the diversity of models. In practice, a well-formulated stopping condition catches more errors than an additional LLM check.

Comprehension drift. The loop can produce code faster than the team can understand it. That is not a new problem, but Loop Engineering intensifies it because the frequency goes up. A team that no longer reads what its agent writes accumulates technical debt at high speed.

Cognitive surrender. Once loops are running, it becomes tempting to accept whatever they return. The loop reinforces confirmation bias when the critic agent was trained on the same data as the worker. Osmani himself names this risk without offering a ready-made solution.

What companies should take away

If you are in an SME, a public administration, or a mid-sized IT department considering Loop Engineering, four points are worth sorting out:

1. Which task has a clearly measurable stopping condition? Branch cleanup, dependency updates, CI failure triage, structured migrations. These tasks are worth it.

2. Which task does not? Open-ended research, creative design decisions, customer communication. Here the loop produces token cost and drift, rarely value.

3. Who reads the outputs? A loop that nobody checks is a paper shredder with an API connection. Audit trails, approval flows, and regular reviews are not optional; they are prerequisites.

4. What permissions does the loop get? Never human credentials. Service accounts with minimal rights, separated from day-to-day operations. This discipline is not new, but it is often forgotten in the hype.

Conclusion

Agent Loops and Loop Engineering bundle existing practices into an easy-to-grasp framework and, through new product features, enforce more disciplined stopping conditions. The underlying concepts are not new. Anyone selling "Loop Engineering" as a revolution has either never used Reflexion, ReAct, or cron, or is trying to put a new label on old ideas. Both are visible in the 2026 market.

The discussion is still valuable, because it puts verification, stopping conditions, and external state at the center. These topics deserve attention, whether you call the approach Loop Engineering, Recursive Goal System, or simply "better-automated agents".

centerbit
