16 May 2026

The Agent Is Not the Point

I recently finished building a coding agent¹. Not a wrapper around someone else's. A small one, from scratch, in Rust. Its core is four tools: read a file, write a file, edit part of a file, and run a shell command. One agent loop. A session that appends linearly to a log. That is roughly it.

The experience taught me something that I think the current conversation about AI in software engineering is mostly missing. So I want to say it directly:

The agent is not the point.

People are the point. Engineering rigor is the point. Being able to gain clarity on the actual problem you are trying to solve is the point. The agent is a tool that, if harnessed well, can help with those things. But the tool itself is not what matters. What matters is what it enables the human to do.

Two camps, both wrong

I see two extreme positions dominating the current AI conversation, and I think both are mostly emotional reactions to something that deserves clearer thinking.

The first camp says that coding agents will do everything. No more coders. No more software engineers. Massive layoffs are coming. Big names in the industry are declaring the end of programming as we know it. Within this camp, there is a further impulse: stop reading code, just let the agent do it. Accept the output. Move on. Everyone is suddenly talking about "agentic this" and "agentic that" without, in many cases, actually understanding what an agent is or how it works under the hood. The word has become a branding exercise more than a technical description.

The second camp says that AI is fundamentally bad. It steals work. It produces garbage. It makes people dumb. It offloads thinking. It is a threat to the profession and to the craft of engineering. This camp includes engineers I have learned a great deal from over the years, people whose judgment I ordinarily trust. The animosity is real and honest, but I think the conclusion is wrong.

Both camps are having an emotional reaction to a genuine change in the landscape. I understand why. The change is real and the pace is fast. But the framing on both sides is shortsighted. The question was never whether AI will replace you or whether AI makes people stupid. That is the wrong question. It was always the wrong question. The question is: what is this thing actually good for, what does it need to be useful, and how do we harness it in a way that produces verifiable, reproducible, deterministic outcomes?

Having built one

I did not set out to build a coding agent to prove a point. I built one because I wanted to understand what was actually happening when I used one. I had been using Pi² and Claude Code in my daily work, and I was impressed but also unsatisfied with how much of the process stayed opaque. So I built OneLoop¹.

Here is what OneLoop does: it reads files, it writes files, it edits files, and it runs shell commands. There is an agent loop that assembles a prompt with the conversation history, sends it to the LLM, parses the response, and executes whatever tool the model asks for. The session is a JSONL file that grows linearly. That is the whole thing.
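To make that loop concrete, here is a minimal sketch in Rust. It is not OneLoop's actual code. The names Model, Reply, ToolCall, run_agent, record, and json_str are hypothetical, the model is a trait you would implement against a real LLM client, and the hand-rolled JSON escaping is a stand-in for a proper JSON library.

```rust
use std::fs;
use std::io::{self, Write};
use std::process::Command;

/// The four tools (hypothetical shape, not OneLoop's real types).
#[derive(Debug)]
enum ToolCall {
    Read { path: String },
    Write { path: String, content: String },
    Edit { path: String, old: String, new: String },
    Bash { command: String },
}

/// What the model returns on each turn: a tool request or a final answer.
enum Reply {
    Tool(ToolCall),
    Done(String),
}

/// Stand-in for the LLM client; it sees the whole history on every turn.
trait Model {
    fn respond(&mut self, history: &[String]) -> Reply;
}

/// Minimal JSON string escaping (assumption: a real agent would use a JSON crate).
fn json_str(s: &str) -> String {
    let mut out = String::from("\"");
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Append one JSONL line to the log and mirror it in the in-memory history.
fn record<W: Write>(history: &mut Vec<String>, log: &mut W, role: &str, content: &str) -> io::Result<()> {
    let line = format!("{{\"role\":{},\"content\":{}}}", json_str(role), json_str(content));
    writeln!(log, "{line}")?;
    history.push(line);
    Ok(())
}

/// Execute one tool call and return its output as text.
fn run_tool(call: &ToolCall) -> io::Result<String> {
    match call {
        ToolCall::Read { path } => fs::read_to_string(path),
        ToolCall::Write { path, content } => {
            fs::write(path, content)?;
            Ok(format!("wrote {} bytes", content.len()))
        }
        ToolCall::Edit { path, old, new } => {
            let text = fs::read_to_string(path)?;
            if !text.contains(old.as_str()) {
                return Err(io::Error::new(io::ErrorKind::NotFound, "old text not found"));
            }
            // Replace only the first occurrence of the old text.
            fs::write(path, text.replacen(old.as_str(), new, 1))?;
            Ok("edited".to_string())
        }
        ToolCall::Bash { command } => {
            // Assumes a Unix-like `sh`; only stdout is captured in this sketch.
            let out = Command::new("sh").arg("-c").arg(command).output()?;
            Ok(String::from_utf8_lossy(&out.stdout).into_owned())
        }
    }
}

/// The loop: ask the model, run the requested tool, log the result, repeat.
/// Returns Ok(None) if the turn limit is reached before a final answer.
fn run_agent<M: Model, W: Write>(model: &mut M, log: &mut W, task: &str, max_turns: usize) -> io::Result<Option<String>> {
    let mut history = Vec::new();
    record(&mut history, log, "user", task)?;
    for _ in 0..max_turns {
        match model.respond(&history) {
            Reply::Done(answer) => {
                record(&mut history, log, "assistant", &answer)?;
                return Ok(Some(answer));
            }
            Reply::Tool(call) => {
                record(&mut history, log, "assistant", &format!("{call:?}"))?;
                // Tool failures go back to the model as text instead of aborting.
                let result = run_tool(&call).unwrap_or_else(|e| format!("error: {e}"));
                record(&mut history, log, "tool", &result)?;
            }
        }
    }
    Ok(None)
}
```

Because the log is any io::Write, a real session could pass a file opened in append mode, while a test can pass a Vec<u8>. Feeding tool errors back to the model as text is one reasonable design choice here, not the only one.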

What surprised me was not how complex the agent needed to be. It was how capable the LLM is with just those few primitives. Given the ability to read files and run commands, the model does genuine detective work. It gathers evidence. It recognizes patterns across a codebase. It follows leads from one file to another, from one log line to a stack trace to a root cause. It can narrow a bug from a vague symptom to a specific line of code by systematically reading, searching, and cross-referencing.

That is not magic. It is a small number of durable primitives plus a very powerful pattern recognition engine. But seeing it up close, from the inside, made something click for me. The value is not in the agent as a thing. The value is in what those primitives enable the LLM to do: gather evidence so the human can see more clearly.

What the LLM is actually good for

The LLM is good at pattern recognition across large surfaces. It is good at gathering and synthesizing information. It is good at following a trail of evidence when you give it the tools to look around. It is good at generating plausible code, yes, but that is almost a side effect of a deeper capability: it is good at helping the human gain clarity on the problem.

That is what I keep coming back to. The real value of an LLM in a coding workflow is not that it writes code for you. It is that it helps you see the problem more clearly. It gathers context you might not have the patience to gather. It spots patterns you might miss because you are too close. It follows leads you might not have thought to chase. The code generation is real and useful, but it flows from the clarity, not the other way around.

This is also why the "just let the agent do everything" camp is wrong. If you accept the output without inspecting it, without understanding what happened, without building your own mental model of the problem, you have gained nothing durable. You have a patch that works right now and no understanding of why. That is not engineering. An LLM will happily write and rewrite code to fix one thing while breaking another, over and over, and you should not just accept that cycle without rigor. If you do, there is a term for it: faith-based engineering.

And it is why the "AI is fundamentally bad" camp is wrong too. The pattern recognition is real. The evidence gathering is real. The clarity it can produce is real. Throwing that away because the current implementations are imperfect, or because some of the surrounding hype is ridiculous, is like refusing to use a compiler because you once saw it generate a wrong optimization. The tool does not have to be perfect to be genuinely useful.

The opportunity

I do not want this to read as a cautious "be careful with AI" post. I am genuinely excited about what this enables.

There is so much accumulated waste in our industry. Clunky systems that evolved through years of locally reasonable decisions into globally unreasonable messes. Wrong abstractions that nobody has time to fix. Half-baked integrations, scattered validation logic, unclear ownership, duplicated effort. We all know these systems. We have all worked in them. The problem was never that we lacked the intelligence to fix them. The problem was that we lacked the time and the patience to gather the evidence needed to see clearly what was actually wrong.

That is where the LLM changes the economics. It can gather that evidence. It can read every file, run every test, trace every dependency, and lay it out in front of you. Not so you can blindly accept its conclusions, but so you can think more clearly about what to do. The human judgment is still the scarce resource. The LLM just makes that judgment cheaper to exercise well.

I see a tremendous opportunity to disrupt waste that has been sitting around for years, not because we did not know it was there, but because the cost of understanding it was too high relative to everything else on the backlog. That cost is dropping fast. The question is whether we use that drop to generate more noise or to finally clean things up.

A concrete example. You inherit a codebase and you know there is cruft. It is hazy though. You cannot quite see where the patterns of waste begin and end. There is duplicated logic scattered across handlers, half of it slightly different from the other half, and you are not sure which version is the source of truth. It would take you a full day just to trace all the variations, map the dependencies, and build a mental model of what the code is actually doing versus what the domain needs.

Instead, you ask the agent to read through the relevant files and identify the pattern. It does. It lays out every variation side by side, shows you where they diverge, and highlights which ones match the domain intent and which ones are just drift. Now you can see it. What was hazy is concrete. You did not outsource the thinking — you outsourced the gathering. The judgment about what to keep, what to collapse, and how to restructure is still yours. But the cost of getting to that judgment just dropped from a day to ten minutes.

That is the economics I am talking about. You still have to think. You still have to decide. But you get to do it from a position of clarity instead of a position of exhaustion. And once you can see the pattern clearly, you can guide the agent to clean it up in a way that stays close to the domain — with tests to verify, diffs to inspect, and a clear before and after.

The same applies to building new things. If you can describe what you want with enough clarity — invariants, contracts, tests, acceptance criteria — the LLM can help you get there faster. Not by replacing your thinking, but by amplifying it. The hard part was never the typing. The hard part was always the thinking. Nothing about that has changed.

Where this is heading

I think the future of coding agents, and agents in general, looks less like a product and more like a library. Small composable building blocks. APIs, maybe even ABIs³, that expose just enough surface for people to build on top of. Every major platform will roll out its own agent APIs. The winners will be the ones that treat the agent as something you compose and extend, not something you adopt wholesale.

That is the spirit behind OneLoop. It meets my needs because it is tailored to the way I work — my tools, my environment, my workflow. It is not designed for everybody, and I am not pretending it is. When I believe I have hardened the core pieces enough, with clean interfaces so that someone else can pick, choose, and compose their own workflow on top of it, I will open source it. Until then, I might just release it as-is — warts and all — so people can take inspiration, steal what is useful, and build something that fits their own brain.

Because that, I think, is the right shape for an agent. Not a monolithic product that tells you how to work. A small set of composable primitives that you shape around how you already think.

The agent is not the point

I keep coming back to the same realization. The agent is not the point. It never was.

The point is people thinking more clearly about the problems they are trying to solve. The point is engineering rigor producing verifiable, reproducible outcomes. The point is being able to gain clarity on the actual problem, make sound judgments, and create real value. The point is disrupting the waste and the clunk that has built up over years of shortcutting.

The coding agent is a tool that, built on a handful of small primitives and harnessed with discipline, can help with all of that. But it is still just a tool. It is a very powerful one, and I am excited about what it makes possible, but it is not the thing that matters. What matters is what the human does with the clarity the tool helps produce.

The question was never whether AI will replace you or whether AI will make you dumb. The question is: will you use this tool to think more clearly, build more intentionally, and create more value? That question has always been the right one. The tool just changed.

Footnotes:

¹ OneLoop is a tiny coding agent I wrote in Rust. Its core is four tools (read, write, edit, bash), one agent loop, and a session model that appends linearly to a JSONL file. It is a private repo for now. I will open-source it at some point when I believe it is ready. I highly recommend that every engineer who cares about their craft build one from scratch. I guarantee you will have some aha moments.

² Pi is a minimal, extensible coding agent by Mario Zechner: https://pi.dev/. I wrote about why I love it here: https://www.birkey.co/2026-04-19-why-i-love-pi.html

³ ABI stands for Application Binary Interface, a lower-level contract than an API that defines how software components interact at the machine level. I use it here to make the point that agent interfaces might eventually need to be as stable and well-specified as the contracts that operating systems and compilers have provided for decades.