
The Software Factory: Why Your Team Will Never Work the Same Again



We Already Have Everything We Need#

Pixel art illustration of a lobster operating a software assembly line with AI agent lobsters
The software factory: lobsters building software with AI agents

The current models and tooling are enough to build software factories. Today.

In a software factory, developers stop writing code by hand. AI coding agents implement features and fix bugs while developers design and improve the factory.

Or as Lee Edwards from Root Ventures put it:

“It’s like giving them a nuclear-powered six-axis mill. It’s a single-person software factory.”

Anthropic ships the building blocks today:

You delegate a ticket, review the output, and tighten the architecture. These pieces fit together:

Understanding Claude Code's Full Stack: MCP, Skills, Subagents, and Hooks Explained. A practical guide to Claude Code's features, explained in the order they were introduced: MCP (2024), Claude Code core (Feb 2025), Plugins (2025), and Agent Skills (Oct 2025). What each does, how they fit together, and when to use what.

But first, how we used to work.

How We Worked Before: Beer Commerce#

Pixel art illustration of tired lobsters in suits at a conference table with a scrum board
The old way: endless meetings, handoffs, and two-week sprints

Beer Commerce is a fictional company that sells craft beer online. Mid-size, multiple teams, microservices backend, microfrontend architecture.

One of their teams, the Checkout team, has 12 people:

They run two-week sprints. Here’s what happens when a new feature comes in. Say, “Add a discount code field to the checkout page”:

Business request → BA writes user story → PO refines and prioritizes → sprint planning (2-4 hrs) → UX designs mockups (2-3 days) → dev picks up ticket (day 3-4) → dev writes code (2-3 days) → code review (1-2 days) → QA testing (1-2 days) → deployed (day 10-14).

Business request to production: 10 to 14 days. That’s the happy path, with no blocked dependencies and no sick days. Many features eat an entire sprint or spill into the next one.

Handoffs are the bottleneck. The BA hands off to the PO. The PO hands off to the dev. The dev hands off to QA. Each handoff adds waiting time and context loss. The coding itself takes maybe 2 to 3 days out of a 14-day cycle.
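To make that arithmetic concrete, here is a small sketch that adds up the hands-on stages from the flow above. The day ranges come from that flow; the `Stage` type and the helper names are illustrative, and sprint planning is left out because it is measured in hours.

```typescript
interface Stage {
  name: string;
  minDays: number;
  maxDays: number;
}

// Hands-on stages and their durations in working days, taken from the flow above.
const handsOnStages: Stage[] = [
  { name: "UX mockups", minDays: 2, maxDays: 3 },
  { name: "Dev writes code", minDays: 2, maxDays: 3 },
  { name: "Code review", minDays: 1, maxDays: 2 },
  { name: "QA testing", minDays: 1, maxDays: 2 },
];

// Sums one field (min or max) across all stages.
function sumDays(stages: Stage[], pick: (s: Stage) => number): number {
  return stages.reduce((total, s) => total + pick(s), 0);
}

// Returns part/whole as a whole-number percentage.
function percentOf(part: number, whole: number): number {
  return Math.round((part / whole) * 100);
}

const cycleDays = 14; // upper end of the 10-14 day range
const handsOnMin = sumDays(handsOnStages, (s) => s.minDays); // 6
const handsOnMax = sumDays(handsOnStages, (s) => s.maxDays); // 10
const codingShareLow = percentOf(2, cycleDays); // 14
const codingShareHigh = percentOf(3, cycleDays); // 21
```

Even if every hands-on stage takes its maximum, that accounts for 10 of the 14 days. The rest goes to planning, refinement, and waiting between handoffs, and coding alone is roughly 14% to 21% of the cycle.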

Most of the industry still works this way.

Enter the Software Factory#

Pixel art illustration of a relaxed lobster monitoring screens while agent lobsters work on an assembly line
The new way: delegate to agents, review the output, ship

Claude Code runs as a scheduled agent in the cloud. You give it a task and it works on its own: reading the codebase, writing code, running tests, opening PRs. This is a production workflow.

In a software factory, each team member is a product builder. The UX designer who gets a customer complaint can write a spec, hand it to an agent, and babysit the implementation. The business analyst who spots a conversion drop can describe the fix, kick off an agent, and watch it ship. If you understand the problem, you can drive the solution.

The workflow:

Business request → builder writes spec (30 min) → agent picks up ticket → agent writes frontend + backend + tests in parallel → agent runs full test suite and lint → builder reviews PR (1-2 hrs) → deployed (same day).

A designer, PM, or domain expert can write that spec. A good spec goes in, working software comes out.

Business request to production: hours, not weeks.

The builder writes a good spec, delegates to the agent, and reviews the output.

Pixel art illustration of a factory town with a Mayor lobster coordinating many agent lobsters working on parallel git branches merging through a refinery
Gas Town: a Mayor orchestrating 20-30 Polecats through a Refinery merge queue

Steve Yegge built this for real with Gas Town. He calls it “Kubernetes for AI coding agents”, and the comparison is architecturally accurate. You talk to the “Mayor” (the factory foreman), and it coordinates 20 to 30 parallel coding agents called “Polecats” that each work on feature branches. A “Refinery” manages the merge queue so parallel work doesn’t collide. Git persists everything. If the system crashes, it reads the history and resumes.
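To show why a merge queue keeps parallel work from colliding, here is a toy sketch. It is not Gas Town's implementation: `Branch`, `processMergeQueue`, and the conflict rule (a branch must be rebased if a file it touches changed on main after it branched) are all assumptions made for illustration.

```typescript
// Toy merge queue: branches merge strictly one at a time, and each one is
// checked against the state of main left behind by the previous merge.
// All names are hypothetical; this is not how Gas Town's Refinery is built.
interface Branch {
  name: string;
  touchedFiles: string[];
  baseRevision: number; // revision of main the branch was cut from
}

interface MergeResult {
  merged: string[];
  needsRebase: string[];
}

function processMergeQueue(queue: Branch[], startRevision: number): MergeResult {
  // Revision at which each file was last changed on main.
  const lastChanged: { [file: string]: number | undefined } = {};
  let revision = startRevision;
  const result: MergeResult = { merged: [], needsRebase: [] };

  for (const branch of queue) {
    // Conflict: a touched file changed on main after this branch was created.
    const conflicts = branch.touchedFiles.some((file) => {
      const changedAt = lastChanged[file];
      return changedAt !== undefined && changedAt > branch.baseRevision;
    });
    if (conflicts) {
      result.needsRebase.push(branch.name);
      continue;
    }
    revision += 1;
    for (const file of branch.touchedFiles) {
      lastChanged[file] = revision;
    }
    result.merged.push(branch.name);
  }
  return result;
}

const queueResult = processMergeQueue(
  [
    { name: "discount-field", touchedFiles: ["checkout.ts"], baseRevision: 0 },
    { name: "cart-badge", touchedFiles: ["header.ts"], baseRevision: 0 },
    { name: "coupon-api", touchedFiles: ["checkout.ts"], baseRevision: 0 },
  ],
  0
);
// queueResult.merged is ["discount-field", "cart-badge"]
// queueResult.needsRebase is ["coupon-api"]
```

The third branch is sent back because `checkout.ts` changed after it branched. Serializing merges this way means conflicts surface one branch at a time instead of all at once.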

Boris Cherny, the creator of Claude Code, said:

“We’re going to start to see the title of software engineer go away. It’s just going to be ‘builder’ or ‘product manager.’”

Anthropic describes the shift as: “Engineers are shifting from writing code to coordinating agents that write code, focusing their own expertise on architecture, system design, and strategic decisions.”

The job:

You spend less time on boilerplate and more on architecture and deciding what to build. I explored this shift in detail here:

Spec-Driven Development with Claude Code in Action. A practical workflow for tackling large refactors with Claude Code using parallel research subagents, written specs, and the new task system for context-efficient implementation.

What Are Skills?#

Pixel art illustration of a lobster reaching for glowing skill badges on a workshop wall
Skills turn a general-purpose AI into a specialized factory worker

A factory is only as good as its tooling. For AI coding agents, that tooling is skills.

As Simon Willison explains, a skill is “a Markdown file telling the model how to do something, optionally accompanied by extra documents and pre-written scripts.” He predicts “a Cambrian explosion in Skills which will make this year’s MCP rush look pedestrian.”
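As a rough illustration of that definition, the sketch below models a skill as data and renders it to a Markdown file. The frontmatter layout with `name` and `description`, the `Skill` type, and the example skill are assumptions for illustration, not a specification of any tool's format.

```typescript
// A skill modeled as data: Markdown instructions plus optional helper scripts.
// Field names and the example content are hypothetical.
interface Skill {
  name: string;
  description: string; // tells the agent when this skill applies
  steps: string[]; // plain-language instructions
  scripts: string[]; // optional pre-written scripts shipped next to the file
}

function renderSkillMarkdown(skill: Skill): string {
  const lines: string[] = [
    "---",
    `name: ${skill.name}`,
    `description: ${skill.description}`,
    "---",
    "",
    "## Steps",
  ];
  skill.steps.forEach((step, index) => {
    lines.push(`${index + 1}. ${step}`);
  });
  if (skill.scripts.length > 0) {
    lines.push("", "## Scripts");
    skill.scripts.forEach((script) => {
      lines.push(`- ${script}`);
    });
  }
  return lines.join("\n");
}

const exampleSkill: Skill = {
  name: "debug-checkout-errors",
  description: "Use when a checkout request fails in production.",
  steps: [
    "Find the failing request in the error logs.",
    "Reproduce it with a failing test.",
    "Fix the bug and run the full test suite.",
  ],
  scripts: ["scripts/fetch-logs.sh"],
};

const rendered = renderSkillMarkdown(exampleSkill);
```

The skill is nothing more than text a person on the team could write and review, which is why a domain expert can own one.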

Some examples:

You add skills, and your agents handle more types of work without you. They debug production issues, handle incidents, set up infrastructure.

Building and maintaining these skills is the highest-leverage engineering task. You’re programming the factory. I’ve written about this hands-on:

How to Speed Up Your Claude Code Experience with Slash Commands. Learn how to transform Claude Code from a chatbot into a deterministic engine using Slash Commands. This guide covers the technical setup and a complete 'Full Circle' workflow that automates your entire feature lifecycle.

The New Team: Everyone Is a Builder#

Pixel art split comparison of a crowded team versus a small team of builders with agent helpers
From 12 specialists to 5 builders. Everyone ships.

If AI agents handle the implementation, you don’t need 12 people on the Checkout team. You need 5, maybe fewer.

Elad Gil called it “the dirty secret of 2024: the actual engineering team size needed for most software products has collapsed by 5-10x.” Gergely Orosz from The Pragmatic Engineer writes that “we are already seeing the end of two-pizza teams (6-10 people) thanks to AI.” Andres Max goes further: “A 5-person team in 2026 can ship what a 50-person team shipped in 2016.”

Fixed roles dissolve.

Good early-stage startups have no “frontend developers” or “QA engineers.” They have builders. The designer writes code. The engineer talks to customers. The PM debugs production.

The software factory brings that to companies of any size. Agents handle the implementation, and people become builders with agents as their execution layer.

Your expertise still matters. Someone who spent ten years doing UX research asks better questions about user flows. Someone with deep backend experience makes better architectural decisions. Expertise becomes a superpower, not a job boundary. The UX expert can act on their insights by instructing an agent to build a prototype, instead of writing a ticket and waiting two weeks.

AKF Partners describe a similar model: “A team of 2 or 3 humans, a lead developer, a product manager, and a designer, could leverage AI agents to cover coding, testing, deployment, and analytics.”

The data backs this up. According to the DX Q4 Impact Report, roughly 60% of non-engineers, such as managers, designers, and PMs, now use AI to contribute code daily.

The gap between “the person who knows what to build” and “the person who can build it” is closing. Your ideas and judgment matter. Your job title doesn’t.

The Self-Improving Factory#

Pixel art illustration of lobster agents in a circular feedback loop reading data and creating tickets
The self-improving loop: observe, decide, build, ship, repeat

The factory can also figure out what to build next.

Your product generates signals: user feedback in support tickets, behavior data from analytics, A/B test results, error logs. Right now, a human reads all of that, synthesizes it, and turns it into tickets. That takes time and drops context.

In a software factory, agents do this on their own:

The product owner no longer curates a static list once a week. Ideas stream in ranked by impact, updated as data arrives.
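A minimal sketch of "ranked by impact" could look like the following. The `Signal` shape, the 1-to-5 severity scale, and the scoring rule (affected users times severity) are assumptions; a real factory would tune its own formula.

```typescript
interface Signal {
  source: "support" | "analytics" | "ab-test" | "error-log";
  summary: string;
  affectedUsers: number;
  severity: number; // 1 (cosmetic) to 5 (blocks a purchase); the scale is an assumption
}

interface RankedIdea {
  summary: string;
  impact: number;
}

// Scores each signal and returns ideas sorted from highest to lowest impact.
function rankByImpact(signals: Signal[]): RankedIdea[] {
  return signals
    .map((s) => ({ summary: s.summary, impact: s.affectedUsers * s.severity }))
    .sort((a, b) => b.impact - a.impact);
}

// Made-up example signals for the Beer Commerce checkout.
const ranked = rankByImpact([
  { source: "support", summary: "Discount code rejected on mobile", affectedUsers: 40, severity: 5 },
  { source: "analytics", summary: "Drop-off on shipping step", affectedUsers: 300, severity: 2 },
  { source: "error-log", summary: "Timeout in payment service", affectedUsers: 25, severity: 4 },
]);
// ranked[0] is the shipping drop-off (impact 600), then the discount bug (200),
// then the payment timeout (100).
```

Re-running the ranking whenever new signals arrive is what turns a weekly backlog into a stream.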

Tools like Linear are building toward this. They call their model a triad: humans decide and remain accountable, agents execute within defined scopes, the platform manages interactions and visibility. Issues can only be assigned to humans, but delegated to agents. As they put it: “An agent cannot be held accountable.” The platform becomes the orchestration layer where humans and agents interact.

Combined with their Cursor integration, the picture becomes clear: “Starting an issue used to mean manually creating a feature branch. Now it means assembling the right context so your coding agent can take a first pass.” Linear provides the product and customer context, the coding agent provides the codebase expertise, and the human provides judgment. That’s the factory’s coordination layer.

Because the factory can also implement those ideas, you get a closed loop: observe, decide, build, ship, observe again. You use the factory to improve the product, and you read the product’s data to improve the factory.

Andrej Karpathy pushes this to its extreme with autoresearch. AI agents run ML experiments overnight on a single GPU. Humans write a program.md (a high-level spec) instead of training code, and agents iterate through experiments. Each run takes 5 minutes, producing about 12 experiments per hour, all optimizing toward a single metric. As Karpathy puts it: “Research is now entirely the domain of autonomous swarms of AI agents running across compute cluster megastructures.”

The same pattern applies to product development. Replace “training runs” with “feature experiments” and “val_bpb” with “conversion rate.” Your factory runs experiments, measures outcomes, and feeds the results back into the next cycle.
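Here is a minimal sketch of that loop applied to product experiments. It assumes a keep-the-best rule and made-up conversion rates; the `Experiment` type and `runCycle` are illustrative and are not taken from autoresearch. Note that conversion rate is maximized, whereas a metric like val_bpb would be minimized.

```typescript
interface Experiment {
  name: string;
  conversionRate: number; // measured outcome; higher is better
}

// Keep-the-best loop: a candidate replaces the current best only if it
// improves the single target metric.
function runCycle(baseline: Experiment, candidates: Experiment[]): Experiment {
  let best = baseline;
  for (const candidate of candidates) {
    if (candidate.conversionRate > best.conversionRate) {
      best = candidate;
    }
  }
  return best;
}

// Throughput arithmetic from the text: 5-minute runs give 12 runs per hour.
const minutesPerRun = 5;
const runsPerHour = 60 / minutesPerRun; // 12

// Made-up numbers for illustration only.
const winner = runCycle({ name: "current checkout", conversionRate: 0.031 }, [
  { name: "one-page checkout", conversionRate: 0.034 },
  { name: "discount field above the fold", conversionRate: 0.029 },
  { name: "express pay button", conversionRate: 0.036 },
]);
// winner.name is "express pay button"
```

The winner of one cycle becomes the baseline of the next, which is the feedback step described above.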

Anthropic supports scheduled tasks that let agents run on a timer. Set up a daily agent that reads your support inbox, one that checks your analytics dashboard, one that reviews your error logs. Your team reviews what they find, prioritizes, and lets the factory execute.

For a deeper look at how agent teams work in practice:

7 Agent Team Patterns for Claude Code. Agent teams let multiple Claude Code instances work in parallel. But when should you use them, and how should you structure the team? Here are 7 patterns I keep reaching for.

The Factory Is the Product#

In a software factory, the software you ship is one product. The factory is another.

Skills you build, pipelines you improve, architectural decisions that make your codebase more agent-friendly: these compound. A team that spends a week improving their testing infrastructure ships future features faster. A team that builds a cloud debugging skill lets an agent resolve future incidents.

Pixel art illustration of a lobster at a quality control station inspecting code blocks on a conveyor belt with green checkmarks and red X screens
Backpressure: automated feedback that keeps agents on track

Moss calls this backpressure: the automated feedback that keeps agents on track. Type checkers, linters, test suites, build systems. If an agent misses an import, the build fails and the agent fixes it. You don’t spend your time pointing out syntax errors. Without backpressure, you’re stuck reviewing trivial mistakes. With it, agents self-correct and you focus on architecture and product decisions. Investing in backpressure is investing in your factory’s throughput.
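The backpressure idea can be sketched as a loop: run the checks, hand any failures back to the agent, and repeat until the checks pass or a retry budget runs out. Everything below is a simplified assumption. `Check`, `Agent`, and `withBackpressure` are hypothetical names, and the "agent" is a stand-in function rather than a real model call.

```typescript
// A check returns error messages; an empty array means it passed.
type Check = (code: string) => string[];
// An agent receives the code plus the errors and returns a revised version.
type Agent = (code: string, errors: string[]) => string;

interface Outcome {
  code: string;
  attempts: number; // number of times the checks were run
  passed: boolean;
}

function withBackpressure(initial: string, agent: Agent, checks: Check[], maxAttempts: number): Outcome {
  let code = initial;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const errors = checks.reduce<string[]>((all, check) => all.concat(check(code)), []);
    if (errors.length === 0) {
      return { code, attempts: attempt, passed: true };
    }
    if (attempt < maxAttempts) {
      code = agent(code, errors); // failures flow back to the agent, not to a human
    }
  }
  return { code, attempts: maxAttempts, passed: false };
}

// Toy check: using applyDiscount without importing it fails the "build".
const importLine = 'import { applyDiscount } from "./pricing";';
const missingImportCheck: Check = (code) =>
  code.indexOf("applyDiscount(") !== -1 && code.indexOf(importLine) === -1
    ? ["missing import: applyDiscount"]
    : [];

// Toy agent: fixes a missing import by adding the import line.
const toyAgent: Agent = (code, errors) =>
  errors.some((e) => e.indexOf("missing import") === 0) ? importLine + "\n" + code : code;

const outcome = withBackpressure("const total = applyDiscount(cart);", toyAgent, [missingImportCheck], 3);
// outcome.passed is true and outcome.attempts is 2: the first run fails,
// the agent adds the import, and the second run passes.
```

A human only gets involved when `passed` comes back false, which is the point: the cheaper and stricter the checks, the fewer trivial mistakes reach review.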

The teams that build the best factories will ship the best products.

This Is Happening Now#

Pixel art illustration of a lobster standing at the entrance of a glowing software factory
Are you going to build a factory?

Claude Code runs in the cloud. You can schedule agents and build skills. Early adopters are doing this now.

As Andres Max put it: “Engineers who only wrote boilerplate are at risk. Engineers who make good decisions, architect systems, and understand users are more valuable than ever.”

Are you going to build a factory, or keep doing two-week sprints while your competitors ship in hours?

In Five Years, Developers Won't Write Code By Hand. Software development as translation work is dying. Software engineering, the strategic, architectural discipline, is more valuable than ever. The shift is already here.

Conclusion#
