Shipping Axon agents
We've been working on Axon agents for almost a year. Last month, we shipped them.
I want to write about what that actually involved — not the polished version, but the real one.
What agents actually are
An Axon agent is a loop. It reads your codebase, makes a plan, writes code, runs tests, evaluates the results, and repeats until the task is done or it gets stuck.
That sounds straightforward. It's not.
The hard part isn't any individual step. It's the coordination. It's making sure the agent doesn't go off in the wrong direction for ten minutes before you notice. It's handling the cases where the tests pass but the code is wrong. It's building the right level of human oversight into a system that's supposed to be autonomous.
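To make the shape of that loop concrete, here is a minimal sketch in Python. It is an illustration only, not the Axon implementation: the names `run_agent`, `StepResult`, and `AgentRun`, the callback signatures, and the iteration cap are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class StepResult:
    tests_passed: bool
    notes: str = ""


@dataclass
class AgentRun:
    status: str  # "done" or "stuck"
    iterations: int
    history: List[StepResult] = field(default_factory=list)


def run_agent(
    make_plan: Callable[[List[StepResult]], str],  # hypothetical: plan from past results
    write_code: Callable[[str], None],             # hypothetical: apply the plan
    run_tests: Callable[[], StepResult],           # hypothetical: run the test suite
    max_iterations: int = 5,
) -> AgentRun:
    """Plan, write, test, evaluate; repeat until done or stuck."""
    history: List[StepResult] = []
    for iteration in range(1, max_iterations + 1):
        plan = make_plan(history)
        write_code(plan)
        result = run_tests()
        history.append(result)
        # Passing tests is only a proxy for "done": the code can still be wrong.
        if result.tests_passed:
            return AgentRun("done", iteration, history)
    # The iteration cap stands in for "it gets stuck".
    return AgentRun("stuck", max_iterations, history)


# Usage with stand-in callbacks: the fake tests pass on the third attempt.
attempts = {"n": 0}


def fake_plan(history: List[StepResult]) -> str:
    return f"attempt {len(history) + 1}"


def fake_write(plan: str) -> None:
    attempts["n"] += 1


def fake_tests() -> StepResult:
    return StepResult(tests_passed=attempts["n"] >= 3)


run = run_agent(fake_plan, fake_write, fake_tests)
print(run.status, run.iterations)  # done 3
```

The loop itself is a dozen lines. Everything hard lives in what this sketch leaves out: noticing a wrong direction early, and deciding when passing tests should not count as done.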
What we got wrong
Our first implementation was too autonomous. It would run for a long time, make lots of changes, and present you with a diff that was hard to reason about. Developers didn't trust it. They'd review the output carefully, often reject it, and feel like they'd wasted time waiting.
The fix was counterintuitive: make agents slower and more transparent. Show the plan before executing. Checkpoint at meaningful points. Make it easy to stop and redirect.
Agents that feel slower but show their work turned out to be trusted more, and used more, than agents that were fast but opaque.
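One way to picture "show the plan, checkpoint, stop or redirect" is the sketch below. It is a hypothetical shape, not how Axon does it: `execute_with_checkpoints`, the callbacks, and the convention that a checkpoint returns the steps still to run are all assumptions made for the example.

```python
from typing import Callable, List


def execute_with_checkpoints(
    plan: List[str],
    approve_plan: Callable[[List[str]], bool],              # hypothetical: human sees the plan first
    do_step: Callable[[str], None],                         # hypothetical: agent performs one step
    checkpoint: Callable[[List[str], List[str]], List[str]],  # hypothetical: human reviews progress
) -> List[str]:
    """Run a plan step by step, pausing for review after each step.

    The checkpoint callback receives (completed, remaining) and returns the
    steps that should still run: `remaining` unchanged to continue, an empty
    list to stop, or a new list to redirect the agent.
    """
    completed: List[str] = []
    # Nothing executes until the plan has been shown and approved.
    if not approve_plan(plan):
        return completed
    remaining = list(plan)
    while remaining:
        step = remaining.pop(0)
        do_step(step)
        completed.append(step)
        remaining = list(checkpoint(completed, remaining))
    return completed


# Usage: the reviewer redirects the agent after the second step.
plan = ["rename module", "update imports", "delete old tests"]


def reviewer(completed: List[str], remaining: List[str]) -> List[str]:
    if completed[-1] == "update imports":
        return ["migrate old tests"]  # redirect instead of deleting
    return remaining


done = execute_with_checkpoints(plan, lambda p: True, lambda s: None, reviewer)
print(done)  # ['rename module', 'update imports', 'migrate old tests']
```

Every checkpoint costs wall-clock time, which is exactly the tradeoff described above: the run is slower, but the developer never has to face one large, unexplained diff.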
What's next
We're working on multi-agent workflows — cases where multiple agents work on different parts of a task simultaneously. It's technically interesting and practically useful for large refactors.
We're also working on better failure modes. Right now, when an agent gets stuck, it just stops. We want it to be able to ask for help in a structured way that's easy for the developer to respond to.
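To give a sense of what "structured" could mean here, the sketch below models a help request as a small record with a fixed set of answers. This is speculative and not a shipped format: `HelpRequest`, its fields, and `resolve` are hypothetical names invented for the example.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class HelpRequest:
    """Hypothetical record an agent could emit instead of silently stopping."""
    task: str
    blocked_on: str
    tried: List[str] = field(default_factory=list)
    options: List[str] = field(default_factory=list)


def resolve(request: HelpRequest, choice: int) -> str:
    """Turn the developer's pick into the instruction the agent resumes with."""
    if not 0 <= choice < len(request.options):
        raise ValueError(f"choice must be between 0 and {len(request.options) - 1}")
    return request.options[choice]


request = HelpRequest(
    task="Migrate the billing module to the new client",
    blocked_on="Two tests disagree about the expected rounding behavior",
    tried=["round half up", "round half even"],
    options=["Keep the old rounding", "Adopt the new rounding", "Skip both tests for now"],
)

# Serializable, so it could be rendered in an editor, a CLI, or a chat thread.
print(json.dumps(asdict(request), indent=2))
print(resolve(request, 1))  # Adopt the new rounding
```

The point of the fixed `options` list is that the developer can answer with one choice rather than writing a fresh prompt from scratch.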
There's a lot left to do. But agents are real now, and they're useful. That feels good.
