Why I Love Agent Pi
Pi1 is the first coding agent that really empowered me to morph it to fit the way my brain works, and I am genuinely excited about how malleable it is on top of a handful of primitives.
Most coding agents I have tried in the past have grown in the same direction. More built-in tools. More opinionated prompts. More hidden, opaque decisions with no real visibility or control. More assumptions that I will bend my workflow to fit theirs - mandatory hooks here, implicit conventions there - instead of letting me configure a few simple things and get on with my work.
The small stuff gives it away. A recent example: Claude Code animates its "thinking" state with a glyph that lives in a font not every machine has. If you happen to use a font that does not have extensive glyph support, it makes eterm2 trigger a redraw on every tick, resulting in the surrounding text shaking visually. Yes, I can switch to a different font, but I did not have the choice to turn it off. It is a small thing. It is also exactly the kind of small thing you cannot fix from the outside, because the decision is buried somewhere you are not invited to touch. You start to succumb to their way of doing things. Small things aside, I cannot tell what the agent is really doing, I cannot tell why it chose a particular path, and I cannot tell how to change its behavior without jumping through the hoops they put in place.
Pi does almost the opposite. Its core is four primitives: read, write, edit, and bash. That is roughly it. Everything else is something you compose on top, or an extension you opt into. That small surface is exactly the thing I keep coming back for.
If you want a fuller sense of the thinking behind Pi, I would point you to a twenty-minute talk by its creator3. I do not often recommend agent-related talks - most of them age badly within a week or so - but the way he frames agents, agentic coding, and the current landscape is the clearest version of that picture I have heard recently. It is worth your time even if you never use Pi.
Small primitives, visible behavior
A coding agent with four primitives is not a limitation in the way it first appears. Most of what a coding agent actually needs to do boils down to reading a file, writing a file, changing part of a file, or running a command. If you can see each of those happen, you can reason about what the agent is doing. If you can script around them, you can make the agent behave the way your project already behaves.
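To make that concrete, here is a minimal sketch of what a four-primitive tool surface can look like. This is illustrative only: the function names and signatures are my own, not Pi's actual API.

```python
import subprocess

# Hypothetical sketch of a four-primitive tool surface: read, write, edit, bash.
# Names and signatures are illustrative, not Pi's real interface.

def read_file(path: str) -> str:
    with open(path) as f:
        return f.read()

def write_file(path: str, content: str) -> None:
    with open(path, "w") as f:
        f.write(content)

def edit_file(path: str, old: str, new: str) -> None:
    # Replace an exact snippet once; fail loudly if it is not found,
    # so the agent sees the mismatch instead of silently corrupting the file.
    text = read_file(path)
    if old not in text:
        raise ValueError(f"snippet not found in {path}")
    write_file(path, text.replace(old, new, 1))

def bash(command: str) -> str:
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.stdout + result.stderr

# Every action the agent takes is one of these four calls, which is exactly
# what makes its behavior easy to log, inspect, and reason about.
TOOLS = {"read": read_file, "write": write_file, "edit": edit_file, "bash": bash}
```

Because each call is visible and scriptable, wrapping any of them with logging, sandboxing, or a Nix shell is a one-line change rather than a vendor feature request.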
This matters to me because I spend a lot of my time with agents trying to get clarity - building a clearer mental model of a problem before I commit to a direction. That work depends on visibility. What did the agent change? What did it run? What did it see? An agent whose behavior reduces to a handful of visible primitives is much easier to think alongside than one that hides its moves behind a thicker abstraction.
I do not think this is a new insight. Unix made the same bet a long time ago. So did Emacs, in its own way, with text, buffers, processes, and commands. The bet is that a small number of durable primitives plus a programmable substrate will outlast whatever feature-of-the-month the industry is currently excited about. Pi feels like that bet applied to coding agents.
Where Pi fits in the way I already work
I did not come to Pi looking for a new philosophy. I came to it because it fits two things I already depend on.
The first is Nix. I have written before about why I lean on it heavily4. In short, I do not want coding agents mutating my machine. I want them to declare what they need, pull it into an isolated shell, use it, and leave no residue. Pi cooperates with this model very naturally. Because its only way to reach the outside world is the shell primitive, I can point it at nix shell or nix develop and it will happily do real work inside a clean environment. A Rust toolchain, a specific Python, an odd CLI tool for one task - none of it has to touch my base system. When I am done, nothing is left behind except, if I want, a flake.nix that captures what actually worked.
The second is Emacs. I have also written about treating Emacs as a programmable workbench5. A coding agent that respects shell and files, and does not insist on its own fancy UI, drops into that workbench without friction. Pi runs in a buffer. As a simple example, I can make it interact with Emacs buffers to persist the commands it ran, so I can come back to them later if I need to.
Those two fits are not accidents of Pi's implementation. They fall out of the decision to keep the primitives small and the surface inspectable. An agent that tried to be a full IDE would not compose with Nix or Emacs this cleanly, because it would be busy being its own environment.
Extensibility I can actually use
The other thing I like about Pi is that its extension points are honest. When I want the agent to behave differently on a particular project, I do not have to reverse-engineer a hidden system prompt or wait for a vendor to ship a feature. I can add an extension, adjust a skill, or wire in a custom tool, and the change is visible in the same place the rest of the behavior lives.
Here is a concrete example. I wanted a small workflow for myself: whenever I ask one model to brainstorm an idea with me, a second comparable model should immediately review the response and list pros and cons, and optionally a third arbitrator model should weigh in when the two disagree. Useful pattern, but not something any agent I know ships out of the box. I asked Pi to build it for me as a peer-review extension. It did. Now, my brainstorming sessions work exactly as I wanted.
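The orchestration behind that workflow is small enough to sketch. This is not Pi's extension API, just a hedged outline of the peer-review pattern with the model calls stubbed out; `call_model` and the role names are placeholders.

```python
# Sketch of the peer-review flow: one model brainstorms, a second reviews,
# and an arbitrator weighs in only on disagreement. `call_model` is a
# placeholder for whatever model client you use; role names are illustrative.

def call_model(role: str, prompt: str) -> str:
    raise NotImplementedError("wire this to your model client")

def peer_review(idea: str, call=call_model) -> dict:
    draft = call("brainstormer", f"Brainstorm this idea with me: {idea}")
    review = call("reviewer", f"List pros and cons of this response:\n{draft}")
    result = {"draft": draft, "review": review}
    # Crude disagreement check, purely for illustration: escalate to the
    # arbitrator whenever the review mentions cons.
    if "con" in review.lower():
        result["verdict"] = call(
            "arbitrator",
            f"Two models disagree.\nDraft:\n{draft}\nReview:\n{review}\n"
            "Which position is closer to right?",
        )
    return result
```

A real extension would use a better disagreement signal than a substring match, but the shape of the pattern is this small.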
What made that work was not that Pi is clever. It was that Pi ships with documentation and code examples for its own extension points, and those examples are legible enough that the agent could read them and write a correct extension on the first try. I did not reverse-engineer anything. I did not hunt down a third-party tutorial. I just asked Pi to make the extension and it built the thing.
That is the part I keep thinking about. Pi is self-documenting in the same sense that Emacs is self-documenting6 - the system knows how to explain itself, to me and to any agent I put in front of it. That property sounds almost unremarkable until you notice how rare it is in modern tooling, and how much friction its absence quietly adds to every customization you try to make.
The honest trade-offs
Pi is not the right answer for everyone, and I do not want to oversell it.
Out of the box, it is minimal. If you want an agent that arrives pre-wired with a dozen integrations and a confident opinion about how your project should be structured, Pi will feel underdressed. You are expected to bring your own way of thinking and working, and your own taste. That takes some getting used to. It is the same kind of work Nix and Emacs ask for, and it pays back in the same way.
It is also a young project. Rough edges exist. Pi is not the only agent in this minimalist neighborhood either - Aider and OpenCode share a lot of the same instincts, and I think that is healthy. The argument I am making is not that Pi has won. It is that the design pressure behind Pi is the one I want more agents to feel.
Putting my money where my mouth is
At some point I realized I had been talking about Pi's design philosophy enough that I should probably just internalize it the hard way: by building something that takes the same bet from scratch. So I did. I wrote a small coding agent in Rust called OneLoop7. Its core is exactly what this post has been describing: four tools (read, write, edit, bash), one agent loop, and a session model that appends linearly to a JSONL file. Nothing else.
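The "one agent loop plus a linear JSONL session" shape is compact enough to show in full. This is a hedged sketch of that design, not OneLoop's actual code; `ask_model` and the tool table are placeholders.

```python
import json

# Minimal sketch of "four tools, one loop, one append-only JSONL session".
# `ask_model` and `tools` are placeholders, not OneLoop's implementation.

def run_agent(task, ask_model, tools, session_path):
    messages = [{"role": "user", "content": task}]
    with open(session_path, "a") as session:
        while True:
            reply = ask_model(messages)              # model picks the next step
            session.write(json.dumps(reply) + "\n")  # append-only session log
            messages.append(reply)
            if reply.get("tool") is None:            # no tool call: we are done
                return reply["content"]
            output = tools[reply["tool"]](*reply["args"])
            observation = {"role": "tool", "content": output}
            session.write(json.dumps(observation) + "\n")
            messages.append(observation)
```

Because the session only ever appends, resuming a conversation is just replaying the file, and inspecting what the agent did is just reading it.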
I am not bringing this up to announce a project. I am bringing it up because it proved something to me. OneLoop is rough, young, and does almost nothing compared to any "serious" coding agent on the market. And yet I am using it right now - in fact, the flake.nix in the OneLoop repo itself was written by OneLoop. The agent is building its own infrastructure. It works for real work because the model does the heavy lifting, and the agent's only job is to stay out of the way: hand it files, hand it a bash shell, and let it run. When your agent is just primitives and a loop, you do not need much more than that to be productive.
Building it also taught me something I did not fully appreciate from just using Pi. The hard part of a minimalist coding agent is not the agent loop - that is genuinely small. The hard part is everything around it: truncation heuristics so the model context does not explode, sensible output formatting so you can see what happened, session persistence so you can pick up where you left off. Those are real engineering problems. But they are infrastructure problems, not agent design problems, and it is useful to know the difference. Pi absorbs those infrastructure decisions so you do not have to think about them. OneLoop forced me to confront them directly. Both experiences are valuable.
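Truncation is a good example of how small but real those infrastructure problems are. Here is one simple head-and-tail heuristic, illustrative rather than what Pi or OneLoop actually does:

```python
# Keeping tool output from exploding the model context: keep the start
# (commands often echo their invocation) and the end (errors usually appear
# last), and say explicitly how much was cut. Illustrative heuristic only.

def truncate_output(text: str, limit: int = 4000) -> str:
    if len(text) <= limit:
        return text
    head = tail = limit // 2
    dropped = len(text) - head - tail
    return text[:head] + f"\n... [{dropped} chars truncated] ...\n" + text[-tail:]
```

Even this toy version forces a real decision: which part of the output the model never gets to see.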
Why this matters beyond Pi
The reason I wanted to write this down is not really about one tool. It is about a pattern I keep noticing in the parts of my setup that have aged well.
The systems I still rely on after many years tend to share a shape. A small number of durable primitives. A programmable substrate. Honest extension points. A willingness to be boring where boring is the right answer. NixOS has that shape at the system level. Emacs has it at the workbench level. Pi showed it to me at the agent level. OneLoop is my way of making sure I actually understand it.
I do not know which specific tools I will be using in five years. I do know that the ones that survive, for me, will look more like this and less like the feature-heavy alternatives that dominate the current moment. That is the real reason I love Pi. Not because it is the final answer, but because it is built in a shape that can keep being useful while the rest of the landscape keeps piling on features meant to work for everyone - which, in practice, may mean not quite for you or me.
Footnotes:
Pi is a minimal, extensible coding agent by Mario Zechner: https://pi.dev/.
Mario Zechner's talk on agents and the current agentic coding landscape: https://www.youtube.com/watch?v=RjfbvDXpFls.
My earlier post on why I rely on Nix, especially in the LLM coding era: https://www.birkey.co/2026-03-22-why-i-love-nixos.html.
My earlier post on Emacs as a programmable workbench: https://www.birkey.co/2026-03-28-emacs-as-a-programmable-workbench.html.
The GNU Emacs manual describes Emacs as a "self-documenting" editor and explains that this means you can use help commands at any time to find out what your options are and what commands do: https://www.gnu.org/software/emacs/manual/html_node/emacs/Intro.html.
OneLoop is a tiny coding agent written in Rust. It is a private repo for now. I will open-source it at some point when I believe it is ready.
Oneness is All You Need
Tony Hoare put the problem well when he wrote, "I conclude that there are two ways of constructing a software design."1 One path is simplicity. The other is complexity, whose deficiencies are harder to see.
That line has stayed with me for years because it names a real danger in our industry. We often mistake the absence of visible flaws for actual clarity. We add layers, libraries, frameworks, helper services, configuration systems, and alternative paths until the whole thing looks sophisticated enough that nobody can easily challenge it. Then we call that maturity.
In the current era of bloated, fast-generated code, that danger feels even more immediate. We are producing more software than ever, often faster than we can understand, verify, or justify. That makes simplicity less of a preference and more of a survival strategy.
Most projects do not fail because engineers lacked yet another abstraction. They fail because complexity compounds faster than the team can reason about it. The system becomes harder to inspect, harder to change, harder to verify, and eventually harder to trust.
That is why I keep returning to one design pressure that has become more important to me over time: oneness.
By oneness, I do not mean anything mystical. I mean something very operational:
- one source of truth where possible
- one obvious way to do a thing
- one clear owner per layer
- one logical responsibility per layer
- one clear mental model per layer
- one place to inspect data integrity
- one coherent workflow through the system
The point is not ideological purity. The point is reducing avoidable complexity so the system stays legible, easily testable, and verifiable by the people building it.
Why this feels harder than it should
Even before the current LLM era, it was difficult to stay simple. There were always reasons not to.
An engineer wants to move fast, so a new library gets introduced before its trade-offs are understood. A team wants flexibility, so it creates multiple ways to achieve the same outcome. A system outgrows its original design, so validation rules get copied into controllers, jobs, database constraints, frontend checks, and downstream consumers. Another team arrives and adds a second workflow rather than cleaning up the first. Then a third team adds a wrapper around both.
Nothing in that sequence sounds absurd in isolation. That is exactly why complexity is dangerous. It rarely arrives as one obviously wrong decision. It arrives as a long series of locally reasonable decisions that collectively destroy clarity.
The result is familiar:
- New engineers cannot tell where to start.
- Existing engineers cannot tell which layer owns what.
- Bugs become archaeology.
- Data integrity becomes probabilistic.
- Every change carries too much fear.
This is why I have long been drawn to ideas like Easy To Change2, declarative systems3, self-documenting tools4, and a programmable workbench5 that keeps the whole loop visible. They all pull in the same direction. They reduce multiplicity. They reduce drift. They give you one place to think from.
Why the LLM era changes the economics
The LLM era does not remove this problem. It sharpens it.
LLMs lower the cost of producing code. They do not lower the cost of ambiguity. If they are not harnessed well, they can compound it so quickly that the resulting system becomes almost impossible to reason about. That is the danger. The opportunity is that the same tools can also help us reduce complexity, but only if we use them with discipline.
In fact, ambiguous systems are exactly where generated code becomes most dangerous. If a repository has three ways to configure a service, two half-trusted test setups, duplicated validation logic, unclear module ownership, and no obvious path through the codebase, an agent will happily generate more material inside that ambiguity. It can amplify existing confusion faster than a human ever could.
But I also think the LLM era gives us an opportunity that did not exist in quite the same way before. We can now spend less human energy on producing boilerplate and more human energy on collapsing unnecessary complexity. An agent can help standardize interfaces, remove duplicated code paths, migrate scattered logic into one owned layer, and push a codebase toward a more coherent shape.
The principle did not change. The economics did.
That is why I do not see the current moment as a reason to compromise on simplicity. It is one of the first times in my career when I feel I can insist on it more aggressively.
Code is abundant, understanding is not
SICP says that programs must be written for people to read, and later adds that readers should know what not to read.6 I still agree with both points, but I think their practical implication changes in the agentic era.
If generated code becomes abundant, it becomes impossible for a human to read all of it with equal depth. That is not a moral failure. It is just arithmetic. The surface area grows too quickly.
So we need to move one level higher.
Instead of assuming the human must read every line with equal care, we should design systems so the core behavior can be reviewed through a much smaller surface: tests, invariants, contracts, and executable examples.
I am not claiming tests replace reading code. They do not. Bad tests can hide bad systems, just as bad abstractions can. But the essence of a system is often much smaller than its total implementation volume.
A team may generate or write a thousand lines of code, but what the system fundamentally does may fit in:
- ten invariants
- twenty meaningful examples
- a handful of properties
- a short list of input/output contracts
That smaller surface is something a human can actually hold in their head. It is something another engineer can review, an agent can execute repeatedly, and CI can verify without asking everybody to reread the entire implementation every time.
In other words, the human review surface should get smaller as code generation gets cheaper.
That is not a retreat from engineering rigor. It is an attempt to put rigor where it gives us the most leverage.
In a healthy system, a human reviewer should not need to reread the entire generated implementation to regain confidence. They should be able to look at a smaller behavioral surface and ask: Are the invariants still true? Do the core examples still hold? Did this change widen the contract or violate it? That is a far more realistic way to supervise generated code than pretending abundance did not change the review problem.
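What a "smaller behavioral surface" can look like in code: the function below stands in for a much larger generated implementation, and the checks after it are the contract a reviewer actually reads. Both the function and its contract are hypothetical examples of mine, not from any particular codebase.

```python
import re

# `slugify` stands in for a large body of generated code; the checks below
# are the small reviewable surface: a core example, an edge contract, and an
# invariant that every output must satisfy. Illustrative example only.

def slugify(title: str) -> str:
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug or "untitled"

# Core example: the behavior most callers rely on.
assert slugify("Hello, World!") == "hello-world"
# Edge contract: empty input still yields a usable slug.
assert slugify("") == "untitled"
# Invariant: every slug is lowercase alphanumeric groups joined by single dashes.
for title in ["A B", "  --x--  ", "Release v2.0!"]:
    assert re.fullmatch(r"[a-z0-9]+(-[a-z0-9]+)*", slugify(title))
```

A reviewer who trusts these three layers does not need to reread the implementation after every regeneration; they need to check that the surface still holds.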
Oneness inside layers
When I say oneness, I am not arguing against layered systems. I am arguing that each layer should have one clear responsibility and one obvious place where certain truths become real.
For example:
- There should be one place where a piece of data becomes valid.
- One place where a business invariant is enforced.
- One default command that gets a human or an agent into the project.
- One declared artifact that owns environment setup where practical.
- One obvious module that owns a transformation.
If the honest answer to where each of those lives is often "it depends," the system is usually paying a complexity tax already.
This is also why I care so much about one source of truth. A scattered system forces every engineer to rebuild the same mental model from fragments. A coherent system lets them ask a smaller set of questions, because the data model and ownership model are clearer. That matters for humans and for agents too.
Sometimes the benefit is almost embarrassingly concrete. One declared artifact for environment setup is better than shell scripts, wiki instructions, and CI fragments all telling slightly different stories. One default project command is better than three nearly equivalent ways to run tests. One layer owning a business invariant is better than duplicating partial validation in the UI, the handler, the job runner, and the database and hoping they never drift apart.
Take something as ordinary as order creation. In a messy system, the shape of an order gets partially validated in the frontend, partially checked again in the HTTP handler, partially normalized in a background job, and partially constrained in the database. The tests mirror that fragmentation, so no single test surface tells you what an order is supposed to mean. A more coherent design gives one layer ownership of turning input into a valid order, one place where the invariants become real, and one test suite that expresses those rules directly. The total lines of code may not shrink much, but the number of places you need to look in order to trust the system absolutely does.
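The single-owner version of that order example can be sketched in a few lines. The field names and rules here are illustrative, not from any real system:

```python
# One owner for "what a valid order is": every layer (UI, handler, job
# runner) calls this constructor instead of re-validating on its own.
# Field names and rules are illustrative.

class InvalidOrder(ValueError):
    pass

def create_order(customer_id, items):
    # The single place where the invariants of a valid order become real.
    if not customer_id:
        raise InvalidOrder("order needs a customer")
    if not items:
        raise InvalidOrder("order needs at least one item")
    for sku, qty in items:
        if qty <= 0:
            raise InvalidOrder(f"non-positive quantity for {sku}")
    return {"customer_id": customer_id, "items": list(items)}
```

With one constructor owning validity, the test suite for this function is also the one place that documents what an order is supposed to mean.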
This is part of why I like tools and conventions that collapse scattered state into one place. A flake.nix7 can become the declared truth for environment setup. A self-documenting Makefile8 can become the obvious entry point into a project. A well-owned test suite can become the smallest trustworthy surface for reviewing behavioral changes. None of these ideas are glamorous. That is precisely why they age well.
An LLM works much better when there is one obvious command to run, one obvious directory to modify, one obvious test suite to extend, one obvious place where integrity checks belong, and one obvious owner for the relevant behavior. Ambiguity is expensive for humans, and it is even more expensive when delegated.
What oneness is not
Oneness is not:
- one giant service
- one giant function
- one person making all decisions
- refusal to use libraries
- denial of layering
- minimalism for its own sake
It is a bias. It is a design pressure. It says that unnecessary multiplicity should have to justify itself.
Sometimes reality will justify it. There are cases where multiple paths, multiple representations, or multiple deployment forms are the right answer. But the burden of proof should be on complexity, not on simplicity.
That is the part our industry often gets backwards.
What I mean in practice
When I look at a system now, especially in the presence of coding agents, I increasingly want to ask very plain questions:
- What is the one thing this layer owns?
- What is the one mental model for the data in this layer?
- Where is the one place I check whether data is valid?
- What is the one command I should run first?
- What is the one test file or suite that best expresses intended behavior?
- What is the smallest reviewable surface that captures the essence of this change?
Those questions do not solve every design problem. But they keep me oriented toward legibility and verification.
And that, to me, is the real opportunity of the LLM era. We can use these tools to generate more code, yes. But a much better use is to generate less confusion. We can use them to push systems toward one obvious path, one owned responsibility, one executable behavioral surface, and one source of truth where possible.
I do not think software becomes better by becoming more clever. I think it becomes better when it becomes more legible, more deterministic, and easier to verify.
In an era of abundant code, that is what I mean by oneness.
Footnotes:
C. A. R. Hoare, The Emperor's Old Clothes, the 1980 ACM Turing Award Lecture, published in Communications of the ACM 24(2), 1981: https://www.labouseur.com/projects/codeReckon/papers/The-Emperors-Old-Clothes.pdf. The "two ways" passage appears on PDF p. 13.
My earlier post on Easy To Change: https://www.birkey.co/2020-12-26-ETC-principle-to-ground-all.html
My post on why declarative systems matter to me: https://www.birkey.co/2026-03-22-why-i-love-nixos.html
The GNU Emacs manual describes Emacs as a "self-documenting" editor and explains that this means you can use help commands at any time to find out what your options are and what commands do: https://www.gnu.org/software/emacs/manual/html_node/emacs/Intro.html
My earlier post on Emacs as a programmable workbench: https://www.birkey.co/2026-03-28-emacs-as-a-programmable-workbench.html
Harold Abelson and Gerald Jay Sussman, Structure and Interpretation of Computer Programs, Preface to the first edition: https://sicp.sourceacademy.org/sicpjs.pdf. The readability passage appears on PDF p. 24, including the line about programs being written for people to read and the follow-on point that readers should know what not to read.
Official Nix documentation on flakes: https://nix.dev/concepts/flakes.html
My post on self-documenting project entry points: https://www.birkey.co/2020-03-05-self-documenting-makefile.html
Emacs as a programmable workbench
What I care about in Emacs has less to do with editing text and more to do with one idea: a stable workbench becomes more valuable as the surrounding tool landscape becomes more volatile.
Software engineering has never been only about typing code into files, but that is especially obvious now. More of the job is spent coordinating services, processes, tests, logs, prompts, REPLs1, diffs, generated artifacts, and feedback loops. LLMs amplify that change, but they do not alter the core of the work. If anything, they expose it more clearly. Generated code is abundant. Engineering judgment is not. The hard part is still turning tentative output into something inspectable, reproducible, composable, and reliable.
That is why the environment I work in matters. I want one place where that whole loop stays visible, and Emacs is best understood as exactly that: not a text editor in the narrow sense but a system you can shape around how you inspect, compose, verify, and iterate. The key abstraction is the buffer.
A file can live in a buffer. A terminal can live in a buffer. A REPL can live in a buffer. A compilation can live in a buffer. The output of a command can live in a buffer. A conversation can live in a buffer. Notes, prompts, logs, diffs, and half-formed ideas can live there too. Not every kind of computing reduces cleanly to a buffer, of course, but enough of software work does that the abstraction becomes powerful.
Buffers matter because they are working surfaces. They are inspectable, editable, searchable, programmable places where rough work can stay rough long enough to become better work.
That point gets clearer in a real workflow. Suppose I am using a coding agent to add a feature to a service. The agent runs in one terminal buffer. It proposes a patch. I inspect the diff in another buffer, jump to the changed source, and notice that the edge case handling is wrong. I run the failing test and keep the output open in a compilation buffer. I poke at the behavior in a REPL. I write a short note to myself about the invariant that actually matters. I refine the prompt, rerun the agent, and compare the new diff against the old one. When a command sequence turns out to be useful, I keep it. When the note proves durable, I turn it into documentation or code. What began as a loose collection of prompts, commands, output, guesses, and generated code becomes a more reliable artifact.
That is the real value of the workbench. The whole path from generation to verification to reuse stays visible and inspectable.
One principle has stayed constant across twenty years of using Emacs: I want the tools I depend on to feel native inside my workbench, not bolted on from the outside.
Many code-oriented tools, including coding agents, still do their best work through commands, files, patches, tests, and process output. If the terminal lives inside Emacs in an eterm buffer2, the agent is not working off to the side—it is operating in the same workbench where I am reading code, reviewing diffs, checking logs, and deciding what to do next.
Emacs does not magically make the work easy. What it does is keep the work inspectable, repeatable, and easier to automate well.
At this point the obvious objection is: any editor with plugin support or a configuration language does most of this. And it does—up to a point. But what I value in Emacs is not mere coexistence of tools. It is unified manipulation. The same editing model, navigation model, search model, history model, and programmability apply across many kinds of work. The successful one-off can be promoted into a habit, a function, a command, or a workflow without crossing several conceptual boundaries first. What Emacs buys me is that the integration is not a plugin I configure—it is part of the same environment in which I inspect code, review output, capture notes, and shape reusable workflows. And because it is Emacs Lisp throughout, nothing is opaque: every behavior I depend on can be read, modified, and extended in the same language.
That same pattern matters even when no agent is involved. A REPL is not some foreign object. Compilation output is not a separate world. Version control is not exiled to another app. If a tool exposes text, process interaction, or a surface that can be inspected and shaped, Emacs can often bring it into the same workspace and let you build stable habits around it.
This is also why Emacs has aged unusually well. New tools keep appearing: agent shells, chat interfaces, MCP servers and clients3, debugging helpers, deployment wrappers, one-off scripts, and whatever comes next. The tools change, but the need does not. You still need a place to inspect what happened, adjust it, connect it to the rest of your workflow, and promote successful patterns into reusable ones.
Emacs is good at that because its core abstractions are durable. Text, buffers, processes, commands, functions, windows, and programmable transformation through Emacs Lisp are not fashion-driven ideas. They have held up because they map well to real work. When a new tool can speak through those abstractions, I do not have to start over mentally just because the industry has a new wrapper or a new brand name for the same basic activity.
For me, Emacs is still the best answer I know to that problem. It gives me one programmable place to think, stage, inspect, verify, compose, and gradually solidify work that often begins in a messy state.
That, to me, is the enduring value of Emacs.
Footnotes:
REPL stands for Read-Eval-Print Loop: an interactive environment where you enter an expression, the language evaluates it, and the result is printed immediately. The idea originated in Lisp and the dynamic Lisp family of languages, but most modern languages now offer one in some form.
By MCP I mean Model Context Protocol, a standard way for tools and applications to expose capabilities to LLM-based clients. I wrote a more practical post about it here: MCP explained with code.
Why I love NixOS
What I love about NixOS has less to do with Linux and more to do with the Nix package manager1.
To me, NixOS is the operating system artifact of a much more important idea: a deterministic and reproducible functional package manager. That is the core of why I love NixOS. It is not distro branding that I care about. It is the fact that I can construct a whole operating system as a deterministic result of feeding Nix DSL to Nix and then rebuild it, change it bit by bit, and roll it back if I do not like the result.
I love NixOS because most operating systems slowly turn into a pile of state. You install packages, tweak settings, try random tools, remove some of them, upgrade over time and after a while you have a machine that works but not in a way that you can confidently explain from first principles. NixOS felt very different to me. I do not have to trust a pile of state. I can define a system and build it.
I love NixOS because I can specify the whole OS including the packages I need and the configuration in one declarative setup. That one place aspect matters to me more than it might sound at first. I do not have to chase package choices in one place, desktop settings in another place and keyboard behavior somewhere else. Below are a couple of small Nix DSL examples.
- GNOME extensions:
environment.systemPackages = with pkgs; [
  gnomeExtensions.dash-to-dock
  gnomeExtensions.unite
  gnomeExtensions.appindicator
  libappindicator
];
services.desktopManager.gnome.extraGSettingsOverrides = ''
  [org.gnome.shell]
  enabled-extensions=['dash-to-dock@gnome-shell-extensions.gcampax.github.com', 'unite@hardpixel.eu', 'appindicatorsupport@rgcjonas.gmail.com']

  [org.gnome.shell.extensions.dash-to-dock]
  dock-position='BOTTOM'
  autohide=true
  dock-fixed=false
  extend-height=false
  transparency-mode='FIX'
'';
- Key mapping per keyboard:
services.keyd = {
enable = true;
keyboards = {
usb_keyboard = {
ids = [ "usb:kb" ];
settings.main = {
leftcontrol = "leftmeta";
leftmeta = "leftcontrol";
rightalt = "rightmeta";
rightmeta = "rightalt";
};
};
laptop_keyboard = {
ids = [ "laptop:kb" ];
settings.main = swapLeftAltLeftControl;
};
};
};
Those are ordinary details of a working machine, but that is exactly the point. I can describe them declaratively, rebuild the system and keep moving. If I buy a new computer, I do not have to remember a long chain of manual setup steps or half-baked scripts scattered all over. I can rebuild the system from a single source of truth.
I love NixOS because it has been around for a long time. In my experience, it has been very stable. It has a predictable release cadence every six months. I can set it up to update automatically and upgrade it without the usual fear that tends to come with operating system upgrades. I do not have to think much about upgrade prompts, desktop notifications or random system drift in the background. It mostly stays out of my way. And if I want to be more adventurous, it also has an unstable channel2 that I can enable to experiment and get newer software.
I love NixOS because it lets my laptop be boring in the best possible sense. I recently bought an HP laptop3 and NixOS worked beautifully on it out of the box. I did not have to fight the hardware to get to a reasonable baseline. That gave me exactly what I want from a personal computer: a stable system that I can configure declaratively and then mostly ignore while I focus on actual work.
I love NixOS because it makes experimentation cheap and safe. I can try packages without mutating the base system. I can construct a completely isolated package shell4 for anything from a one-off script to a full-blown project. If I want to harden it further, I can use the Nix DSL to specify the dependencies, build steps and resulting artifacts declaratively. That is a much better way to work than slowly polluting my daily driver and hoping I can reconstruct what I did later.
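As a rough sketch of such a throwaway shell (the package selection here is mine and purely illustrative, not from any particular project of the article's):

```nix
# shell.nix: an isolated, throwaway environment; nothing here
# is installed into the base system
{ pkgs ? import <nixpkgs> { } }:
pkgs.mkShell {
  packages = with pkgs; [
    python311  # a specific interpreter, independent of the host's
    ffmpeg
    jq
  ];
}
```

Running `nix-shell` in that directory makes those tools available; exiting the shell leaves no trace behind.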
I love NixOS because I can use the same package manager across macOS and Linux. There is also community-maintained support for FreeBSD, though I have not used it personally. That is a huge practical benefit because my development tooling and dependency management can stay mostly uniform across those systems. It means the value of Nix is not tied only to NixOS. NixOS happens to be the most complete expression of it, but the underlying model is useful to me across platforms.
I love NixOS because it fits especially well with the way I work in the current LLM coding era.
Tools are changing very quickly. Coding agents often need very specific versions of utilities, compilers and runtimes. They need to install something, use it, throw it away, try another version and keep going without turning my PC into a garbage dump of conflicting state. Nix fits that model naturally. If I tell a coding agent that I use Nix, it is usually clever enough to reach for `nix shell` or `nix develop` to bring the needed tool into an isolated environment and execute it there. That is especially handy because Nix treats tooling as a declared input instead of an accidental side effect on the system.
A concrete example: I recently built a voice-to-text agent in Rust5. I did not have the Rust toolchain installed on my system. I simply told the coding agent that I use Nix, and it figured out how to pull in the entire Rust toolchain through Nix, compile the project inside an isolated shell and produce a working binary. My base system was never touched. No `~/.cargo`, no `~/.rustup`, no mutated `PATH` entries left behind. Without Nix, the agent would have reached for `curl | sh` to install rustup, quietly mutated my environment and left my system slightly different forever. With Nix, none of that happened.
This pattern generalizes. Every time an agent needs Python 3.11 versus 3.12, a specific version of ffmpeg, an obscure CLI tool or a particular compiler, Nix gives it a clean and reversible way to get exactly what it needs. The agent does not have to guess whether a tool is already installed, or whether the installed version is the right one. It just declares what it needs and Nix takes care of the rest in a sandboxed way.
The other thing I appreciate is that Nix turns an agent's experiment into something you can actually commit and reproduce. Once the agent has a working setup, you can capture the exact dependencies in a `flake.nix` and run `nix flake check` to verify it builds cleanly from scratch. That transforms an ad hoc agent session into a reproducible, verifiable artifact. That is a much stronger foundation for delivering something that works reliably in production than hoping the environment happens to be in the right shape on the next machine.
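A minimal sketch of what capturing a session can look like (assuming an x86_64 Linux machine; the Rust toolchain stands in for whatever the agent actually pulled in):

```nix
{
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";

  outputs = { self, nixpkgs }:
    let
      system = "x86_64-linux";
      pkgs = nixpkgs.legacyPackages.${system};
    in {
      # the exact tools the agent session ended up needing
      devShells.${system}.default = pkgs.mkShell {
        packages = with pkgs; [ rustc cargo ];
      };
    };
}
```

Committing this alongside the generated `flake.lock` pins the exact nixpkgs revision, and `nix flake check` verifies that the declared outputs still evaluate and build.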
I love NixOS because I like what Nix gives me in deployment too. I have never been a big fan of Docker as the final answer to the "works on my machine" problem. It solved important problems for the industry, no doubt about that, but I always found the overall model less satisfying than a truly deterministic one. Nix gives me a much better story. I can use `dockerTools.buildLayeredImage` to build smaller Docker images in a deterministic, layered way. If I can build it on one computer with the proper configuration, I can build the same artifact on another, as long as Nix supports the architecture, which in my experience has been very reliable.
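A sketch of what that looks like (the packaged program, GNU hello, is a stand-in I chose for illustration):

```nix
# image.nix: a deterministic, layered Docker image built without a Dockerfile;
# the closure of everything referenced in config is included automatically
{ pkgs ? import <nixpkgs> { } }:
pkgs.dockerTools.buildLayeredImage {
  name = "hello";
  tag = "latest";
  config.Cmd = [ "${pkgs.hello}/bin/hello" ];
}
```

`nix-build image.nix` produces a tarball that `docker load` accepts, and rebuilding from the same inputs on another machine yields the same artifact.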
That coherence is one of the things I value most about NixOS. The same underlying model helps me with my laptop, my shell, my project dependencies, my CI pipeline and my deployment artifact. It is one way of thinking about software instead of a loose collection of unrelated tools and habits.
So when I say I love NixOS, what I really mean is that I love what it represents. I love a system that is declarative, reproducible, reversible and stable. I love being able to experiment without fear and upgrade without drama. I love that it helps me focus on building and experimenting with fast-moving tools, including LLM coding agents, without worrying about messing up my system in the process.
I love NixOS because it is the most complete everyday expression of what I think software systems should be.
Footnotes:
1. If you are new to Nix, I wrote a more practical getting-started guide here: Nix: Better way for fun and profit.
2. By unstable channel I mean the official `nixos-unstable` or `nixpkgs-unstable` channels. See Channel branches and channels.nixos.org.
3. HP EliteBook X G1a 14 inch Notebook with 64 GiB RAM and AMD Ryzen AI 9 HX PRO 375.
4. For example, `nix develop` drops you into an interactive shell environment that is very close to what Nix would use to build the current package or project.
5. A voice-to-text agent I built in Rust that replaced Whisper and Willow Voice in my personal workflow. I wrote it first for macOS and then ported it to Linux. I have been using it as a daily driver for a couple of months now. I am considering open sourcing it or releasing it as a standalone app.
One CSV parser to rule them all
One would think that parsing CSV files is pretty straightforward, until you get bitten by the many kinds of CSV files that exist in the wild. Many years ago, I wrote a small CSV reader with the following requirements in mind:
- Should not depend on anything other than Clojure itself
- Should allow me to control how I tokenize and transform lines
- Should give me complete control over the delimiting character or characters, the file encoding, the number of lines to read and error handling
The result is csvx. I updated it to work across Clojure and ClojureScript, in both Node.js and browser environments. The entire codebase is less than 200 lines, including comments and blank lines. If you find yourself in need of a CSV reader with the above requirements, you are welcome to steal the code. Enjoy!