Business as code, not AI as business

Carl Heaton · AI Commentary

A genre of post is appearing. Founder of a small company writes up how they've restructured around AI agents. The agents have names (Sofia, Athena, Hestia), each owns a "surface" (direction, code, releases, customer support), and they message each other via async mail and shared task lists. Humans handle judgment, relationships, and the bits the agents can't yet do. The whole thing is documented in a public template repository so others can copy the shape.

The most recent example, from aweb.ai, draws a sharp distinction between "AI-assisted" companies (humans using ChatGPT to go faster) and "AI-native" ones (work is done by agents with named responsibilities, persistent context, and durable handoffs). It's worth reading. It's also worth not taking the org chart at face value.

What's interesting and what's marketing

The marketing layer is the agent names. Sofia, Athena, and Hestia are a branding decision, not an architectural one. Calling an LLM a "permanent agent" with a "surface" makes it sound like a hire. It isn't. It's a prompt, a configuration, and a set of tools, and any of those can change in an afternoon. The headline ("we have seven agents and two humans") suggests the agents are doing the equivalent of seven jobs. The piece itself admits the structure is recent (settled at the end of April 2026) and that the company has "a handful of external signups that haven't activated yet." A team whose external signups haven't yet activated is not yet evidence for an org model. It's evidence for an idea about an org model.

What is genuinely interesting, and what's worth lifting, is the operational discipline that the team had to invent to make the agents function at all. That part is real, and it applies to any small team, with or without agents.

Business as code, not AI as business

Two ideas get tangled together in the AI-native genre, and separating them is the point of this piece.

AI as business is the version aweb describes. The company runs on agents that own surfaces, message each other, and produce most of the work. The humans direct. The business model assumes ongoing access to those agents at something like today's cost.

Business as code is the prerequisite. Decisions are written down. Work lives in artifacts an outsider can read. Codebases explain themselves. Processes exist as documents, not as one person's habits. The business is legible, the way well-written code is legible, regardless of who or what is doing the work. This is what makes a company resilient to a key person leaving, a new hire joining, an auditor turning up, or, yes, an agent being plugged in to help.

The AI-native writeups skip past business-as-code because their authors already have it. They're a small technical team who naturally write things down. They are showing you the second floor and not the foundation underneath. For most SMEs the foundation is the actual work, and most don't have it. "What's going on with the X project" is answerable only by Sarah, and only when Sarah is in. That's the gap to close.

The good news is that AI is genuinely useful for closing it. Drafting the missing process documents, turning a Slack history into a decision log, generating first-pass documentation for an undocumented codebase, summarising the meetings nobody minuted: these are exactly the tasks current models do well, with a human reviewing the output. Most SMEs would get more out of six months of using AI to make their business legible than out of any restructuring around agents.

The agents-doing-the-work step only makes sense after that. And it's also where the case for caution sharpens.

The cost trajectory nobody can price

Every AI-native writeup is published at a moment in time, against the pricing of the models the team happens to be using, with subsidies in the market that reflect a land-grab phase rather than a steady state. Nobody knows what an "agent that owns customer support" will cost to run in 2028, 2030, or later. The honest answer is that it could be much cheaper, or it could be several times what it costs today. Compute prices have generally fallen, but specific frontier-model inference costs have moved in both directions, and the pricing pressure from training cost recovery hasn't shown up in earnest yet.

The asymmetry matters. If you've redesigned your company around the assumption that agents do a substantial share of the work, and the cost of those agents triples, you don't have a cost-management problem. You have an existential one, because the design assumed the input was cheap. A team that uses AI to be three times more productive without restructuring around it can absorb a price rise by using it less. A team that has agents owning surfaces cannot.
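A rough sketch of that asymmetry in Python may make it concrete. Every figure below is hypothetical and chosen only for illustration; `Team`, `discretionary_share`, and `spend_after_price_change` are names invented for this example, not anything from the aweb piece. The assumption is that an AI-assisted team could drop most of its model usage and keep working, while a team with agents owning surfaces could drop very little.

```python
from dataclasses import dataclass


@dataclass
class Team:
    """A hypothetical team. All numbers are illustrative, not real data."""
    name: str
    monthly_ai_spend: float     # current monthly spend on model usage
    discretionary_share: float  # fraction of usage the team could drop and still operate


def spend_after_price_change(team: Team, multiplier: float) -> tuple[float, float]:
    """Return (spend if usage is unchanged, spend after cutting all discretionary usage)."""
    unchanged = team.monthly_ai_spend * multiplier
    floor = unchanged * (1.0 - team.discretionary_share)
    return unchanged, floor


# Assumed inputs: same spend today, very different ability to cut back.
assisted = Team("AI-assisted", monthly_ai_spend=2_000.0, discretionary_share=0.8)
native = Team("AI-native", monthly_ai_spend=2_000.0, discretionary_share=0.1)

for team in (assisted, native):
    unchanged, floor = spend_after_price_change(team, multiplier=3.0)
    print(f"{team.name}: {unchanged:.0f} if usage is unchanged, {floor:.0f} at minimum")
```

With these made-up inputs, both teams face the same bill if they change nothing, but the assisted team can retreat to well below its current spend while the agent-run team cannot get far under the tripled figure. The real shares are unknowable in advance, which is the point.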

This is the unspoken risk in the AI-native genre. The case for it depends on a future where the cost of running the agents stays low enough to make sense. That future is plausible, but it isn't the only one, and nobody outside the model vendors knows which way it lands.

The discipline worth borrowing

Five things, all of which are good ideas regardless of how much AI you use.

Work needs artifacts. A conversation in Slack is not work. A task with an ID, an owner, a status, and a decision attached to it is. The aweb piece is explicit that this only became necessary because agents can't remember a conversation from yesterday and need the context written down. The same is true of a new hire, a junior staffer joining mid-project, or anyone returning from holiday. If "what's going on with the X project" can only be answered by the one person who happens to be in the room, your team is fragile in ways that have nothing to do with AI.

Substantial work needs two voices. The aweb model pairs a builder agent with a reviewer agent on anything significant. The principle is older than agents: the person who wrote the thing is the worst person to spot what's wrong with it. Most SMEs already know this for code (pull request review) and finance (someone other than the bookkeeper reconciles). Extending it to other consequential decisions, particularly anything AI-generated that goes out to clients, is the cheapest quality control there is.

Owned surfaces, collaborative boundaries. Every meaningful area of work has one named person who decides. Not a committee. Not "we'll discuss it next standup." A name. When the surfaces meet, those people collaborate, but inside their surface they own the call. This sounds obvious and is rare. Most small companies operate on diffused responsibility, which feels democratic and produces decisions that arrive a week late.

Shared, queryable state. Anyone walking into the business cold should be able to look at the same set of documents and understand what's happening. Status of active work, recent decisions and why, who is responsible for what. Most SMEs have this scattered across one person's email, two WhatsApp groups, and a shared drive nobody can find. The aweb piece formalises it because agents need it. You need it because people leave.

Signal attribution. A specific point that's easy to skim past. The aweb piece insists on distinguishing weak signal from closing-quality feedback, and avoiding unsupported causality claims. Translated: when something works or doesn't, write down what you actually observed and what you're inferring. Most small businesses run on stories that get retold until they sound like facts. "Customers don't want X" usually means one customer said it once.
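The five practices above fit in a very small record. The sketch below is one possible shape, not the schema aweb uses: the `Task` and `Signal` classes, their field names, and the rule that an inference needs at least two independent observations are all assumptions made for illustration. A spreadsheet with the same columns would do the same job.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Status(Enum):
    OPEN = "open"
    IN_REVIEW = "in_review"
    DONE = "done"


@dataclass
class Signal:
    observed: str     # what was actually seen or said
    inferred: str     # what the team concludes from it
    sources: int = 1  # number of independent observations behind the inference


@dataclass
class Task:
    task_id: str
    owner: str                      # one named person who decides
    status: Status = Status.OPEN
    decision: Optional[str] = None  # the decision taken, and why
    reviewer: Optional[str] = None  # the second voice
    signals: list[Signal] = field(default_factory=list)

    def problems(self) -> list[str]:
        """List the ways this record falls short of the five practices."""
        issues = []
        if not self.owner.strip():
            issues.append("no named owner")
        if self.status is Status.DONE and not self.decision:
            issues.append("closed without a recorded decision")
        if self.status is Status.DONE and self.reviewer in (None, self.owner):
            issues.append("no independent reviewer")
        for s in self.signals:
            if s.sources < 2:  # assumed threshold, purely illustrative
                issues.append(f"weak signal: '{s.inferred}' rests on {s.sources} observation(s)")
        return issues


task = Task(
    task_id="T-1",
    owner="Sarah",
    status=Status.DONE,
    reviewer="Sarah",
    signals=[Signal(observed="one customer said it once", inferred="customers don't want X")],
)
for issue in task.problems():
    print(issue)
```

The example task is closed with no recorded decision, reviewed by its own author, and resting on a single remark, so all three checks fire. Nothing here requires an agent; it only requires that the record exists where anyone can query it.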

What not to copy

The seven-named-agents shape, specifically. Two reasons, on top of the cost question above.

First, the failure modes. An agent that "owns" customer support is a single point of failure with no manager and limited accountability. When it gets something wrong, the chain of "who decided this" leads back to the prompt template. That's a reasonable risk for a five-person startup whose customers expect rough edges. It's not a reasonable risk for a manufacturer whose customer is a procurement officer at a Tier 1 supplier.

Second, the lock-in. Setting up your operations around the assumption that agents do the work makes you dependent on the specific model, vendor, and tooling that worked when you set it up. The aweb template ships with their own CLI and references Anthropic's Model Context Protocol. Both are good choices today. Neither is a stable foundation to build a five-year business on.

What this means for an SME

The interesting thing about AI-native company writeups is not the future they describe. It's the present they accidentally reveal: small teams operate with much less written-down structure than they think they do, and the moment you try to hand work to anything other than the original human, that gap shows up.

Do business as code first. Use AI to help you get there. Use the AI-native writeups as a forcing function, not a blueprint. If your team couldn't onboard a new starter from your written task list, status documents, and decision records, that's the problem worth fixing. The agents come later, if at all, and the cost of running them at scale isn't something anyone can promise you today.

How Steelwise can help

Working out which AI tools actually fit the way your business runs, and what discipline needs to be in place before you bolt anything onto it, is the kind of review we run for clients. Get in touch.
