Claude Code shipped workflows recently, and the docs describe a lot of machinery: deterministic orchestration, parallel and pipeline, journaling and resume, adversarial verify patterns. I wanted to understand it rather than skim the feature list, and the way I learn a tool is to build the smallest real thing with it.
So I picked a task with an obvious fan-out shape: “what happened in the Vue and Nuxt ecosystem this week.” Many independent sources to check, then a merge, then a write-up. I wrote a ~130-line workflow that spawns nine agents in parallel, each scouring a different source, collects their findings into one list, ranks them by impact, and writes a digest. It’s a throwaway, but building it taught me how the whole feature fits together. This post is what I learned.
A workflow is the newest piece of Claude Code’s orchestration story. In my post on agent teams, From Tasks to Swarms: Agent Teams in Claude Code, I traced the progression from subagents to teams. Workflows are the next rung, and they solve a different problem from either: when you want the control flow itself to be deterministic, not decided turn by turn by a model.
If you want a sense of the ceiling before the toy example, Jarred Sumner credited dynamic workflows and adversarial code review for porting Bun from Zig to Rust in six days:
Dynamic workflows and adversarial code review was part of what made it possible to rewrite Bun in Rust in 6 days.
New in Claude Code (research preview): dynamic workflows. Claude writes an orchestration script on the fly, then spins up a large fleet of coordinated subagents in parallel to take on your most complex tasks. Use the word "workflow" in a prompt to get started.
✨ TLDR
- → A workflow is a plain JavaScript script that orchestrates subagents deterministically: you own the loops and fan-out, agents do the thinking
- → The shape that generalizes: fan out → reduce → synthesize
- → agent() runs one subagent (use a schema for validated JSON), parallel() is a barrier, pipeline() streams items through stages with no barrier
- → Default to pipeline(); reach for a parallel() barrier only when a stage needs all prior results at once
- → Compose verify/judge/loop-until-dry patterns for confidence, not more agents
- → It's opt-in and token-hungry, so reach for it when a job needs breadth, verification, or scale a single context can't hold
Where Workflows Fit
Most of the time a single Claude Code session works turn-by-turn: read a file, decide, call a tool, look at the result, decide again. That loop is the right tool for most work. Some jobs don’t fit one head and one context window though:
- Comprehensive jobs: “review every file in this diff”, “audit all 40 dependencies”.
- Confidence-critical jobs: “find the bug, then have three independent skeptics try to refute it”.
- Scale jobs: migrations, sweeps, anything bigger than one context can hold.
Subagents and agent teams can attack these, but there’s a subtle difference in who holds the plan.
| | Subagents | Agent Teams | Workflows |
|---|---|---|---|
| What it is | A worker Claude spawns | Independent Claude sessions | A script the runtime executes |
| Who decides what’s next | Claude, turn by turn | Claude and the teammates | The script |
| Where results live | Claude’s context | Each session’s context | Script variables |
| What’s repeatable | The worker definition | The team setup | The orchestration itself |
| Scale | A few per turn | A handful of sessions | Dozens to hundreds of agents |
With subagents and skills, Claude is the orchestrator. It decides turn by turn what to spawn, and every result lands back in its context window. A workflow moves the plan into code. The script holds the loop, the branching, and the intermediate results, so Claude’s context only ever sees the final answer. That is what lets a workflow scale to hundreds of agents without drowning the conversation.
The Core Idea
A normal agent decides the control flow as it goes. A workflow inverts that. You write the control flow as plain code, and each individual step is delegated to a fresh subagent. The orchestration is deterministic; only the work inside each agent() call is model-powered.
That distinction is the whole point. When you write this:
const results = await parallel(files.map((f) => () => agent(`Review ${f}`)));
You know exactly one agent runs per file, they all run concurrently, and you get an array back. There are no emergent “the model decided to skip three files” surprises. You get determinism in the orchestration and model judgment inside each step.
The shape that keeps showing up is fan out → reduce → synthesize:
graph LR
A[fan out] --> B[agent 1]
A --> C[agent 2]
A --> D[agent ...]
A --> E[agent N]
B --> F[reduce: dedupe + rank]
C --> F
D --> F
E --> F
F --> G[synthesize: write the result]
Swap the sources and prompts and the same skeleton becomes a market scan, a dependency audit, a code review, or a research report.
The Example I Built to Learn It
Here is the workflow I wrote. I picked the newsletter task because it forces you to use every part of the feature: a wide fan-out, a reduce step, and a synthesis step. Every script starts with a meta block that must be a pure literal, then a body using the orchestration primitives.
export const meta = {
name: "vue-newsletter",
description: "Research Vue/Nuxt sources in parallel and synthesize a newsletter",
phases: [
{ title: "Research", detail: "one agent per source" },
{ title: "Curate", detail: "dedupe + rank by impact" },
{ title: "Write", detail: "synthesize the newsletter" },
],
};
1. Fan out with parallel()
Nine sources, nine agents, all at once. Each returns structured JSON validated against a schema, so the model retries on mismatch and I never parse free text:
phase("Research");
const raw = await parallel(
SOURCES.map((s) => () =>
agent(s.prompt, {
label: `research:${s.key}`,
phase: "Research",
schema: ITEM_SCHEMA, // forces validated structured output
agentType: "general-purpose",
}),
),
);
The SOURCES array is just data: one entry per source with a prompt. GitHub core releases, the Nuxt ecosystem, the official blogs, Hacker News, Reddit, dev.to, key people like Evan You and Anthony Fu, and the newsletter/podcast circuit.
2. Reduce with plain JavaScript
Flattening, deduping, and filtering is just code. No agent needed:
const collected = raw.filter(Boolean); // skipped/failed agents become null
const flatItems = collected.flatMap((c) => c.items);
log(`Collected ${flatItems.length} items`);
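The excerpt above only flattens; in my script the actual deduping is left to the Curate agent. If you would rather dedupe in code before spending tokens, here is a minimal sketch. It assumes items carry the url and impact fields from the item schema. dedupeByUrl, normalizeUrl, and the trailing-slash rule are hypothetical helpers, not part of the original script.

```javascript
// Hypothetical helpers, not part of the original workflow script.
const IMPACT_RANK = { high: 0, medium: 1, low: 2 };

function normalizeUrl(url) {
  // Assumption: a trailing slash does not distinguish two items.
  return url.trim().replace(/\/+$/, "");
}

function dedupeByUrl(items) {
  const byUrl = new Map();
  for (const item of items) {
    const k = normalizeUrl(item.url);
    const existing = byUrl.get(k);
    // Keep the first copy unless a later copy has a higher impact.
    if (!existing || IMPACT_RANK[item.impact] < IMPACT_RANK[existing.impact]) {
      byUrl.set(k, item);
    }
  }
  // Stable sort: high impact first, original order preserved within a rank.
  return [...byUrl.values()].sort(
    (a, b) => IMPACT_RANK[a.impact] - IMPACT_RANK[b.impact],
  );
}
```

This only catches exact URL matches. The same release surfaced under two different links still needs the model's judgment, which is why the curate step stays an agent.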
3. Synthesize with sequential agent() calls
phase("Curate");
const curated = await agent(curatePrompt, { phase: "Curate", schema: CURATED_SCHEMA });
phase("Write");
const newsletter = await agent(writePrompt, { phase: "Write" });
return { newsletter, itemCount: flatItems.length, curated };
The run I did while testing pulled together a Nuxt UI release, a Vue Router v5 minor, a Vue core patch, and a Madrid conference recap: seventeen items across nine sources in about three minutes. Good enough to convince me the orchestration worked, which was the whole point of building it.
The Primitives
A handful of functions do all the work.
agent(prompt, opts?) spawns one subagent. Without options it returns the agent’s final text. The options worth knowing:
- schema: a JSON Schema. The subagent is forced to return validated structured data.
- label: the display name in the progress UI.
- phase: assigns the agent to a progress group. Use it inside parallel() and pipeline() to avoid racing on the global phase() state.
- model: override the model for this one call. Default is to omit it so the agent inherits your session model.
- agentType: use a custom subagent type instead of the default workflow agent.
- isolation: "worktree": run the agent in its own git worktree. Only when agents write files in parallel and would otherwise conflict.
parallel(thunks) runs tasks concurrently. It is a barrier: it waits for every thunk before returning. A thunk that throws resolves to null rather than rejecting the whole call, so always .filter(Boolean) the results. You can pass a hundred thunks and they’ll all complete, but only a handful run at once: concurrency is capped at roughly your core count, and the excess queue.
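To make those semantics concrete, here is a local stand-in that mimics the behavior described above: a barrier, null for a thunk that throws, and a cap on how many thunks run at once. It is not the runtime's implementation. parallelSketch and its limit parameter are names I made up for illustration.

```javascript
// Local stand-in for the described parallel() behavior; NOT the runtime's code.
// `limit` is an assumed parameter standing in for the runtime's concurrency cap.
async function parallelSketch(thunks, limit = 4) {
  const results = new Array(thunks.length).fill(null);
  let next = 0;

  async function worker() {
    while (next < thunks.length) {
      const i = next++; // claim the next index (safe: JS runs one callback at a time)
      try {
        results[i] = await thunks[i]();
      } catch {
        results[i] = null; // a failed thunk becomes null instead of rejecting the call
      }
    }
  }

  // Start at most `limit` workers; each pulls thunks until the queue is empty.
  const workers = Array.from({ length: Math.min(limit, thunks.length) }, worker);
  await Promise.all(workers); // the barrier: resolve only when every thunk has settled
  return results; // same order as the input thunks
}
```

Results come back in input order, with null in the slots that failed, which is why the .filter(Boolean) habit matters.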
pipeline(items, ...stages) runs each item through all stages independently, with no barrier between stages. Item A can be in stage 3 while item B is still in stage 1. Each stage callback receives (prevResult, originalItem, index).
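A minimal sketch of what "no barrier" means, written as a local stand-in rather than the runtime's code. It assumes the first stage receives the item itself as prevResult, and it leaves out the concurrency cap and any error handling. pipelineSketch is a hypothetical name.

```javascript
// Local stand-in for the described pipeline() behavior; NOT the runtime's code.
async function pipelineSketch(items, ...stages) {
  return Promise.all(
    items.map(async (item, index) => {
      // Assumption: the first stage sees the item itself as prevResult.
      let prev = item;
      for (const stage of stages) {
        // Each item awaits only its own previous stage, never the other items.
        prev = await stage(prev, item, index);
      }
      return prev;
    }),
  );
}
```

Because each item walks the stages on its own, a fast item can finish the last stage while a slow one is still in the first.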
workflow(nameOrRef, args?) runs another workflow inline as a sub-step and returns whatever it returns. Pass a name to invoke a saved workflow, or { scriptPath } to run a script file. This is composition: a research workflow can call /deep-research as one of its stages instead of reimplementing the fan-out. The child shares the parent’s concurrency cap, agent counter, and token budget, and shows up as its own group in /workflows. Nesting is one level deep: a workflow() call inside a child throws.
// inside a script: hand a sub-question off to the bundled deep-research workflow
const report = await workflow("deep-research", { question: topic });
The rest are small helpers: phase(title) starts a progress group, log(msg) emits a narrator line, args carries the JSON you passed in when launching, and budget exposes the token target so you can scale depth dynamically (it’s null when you launch without a target, so guard any loop-until-budget on budget.total or it runs to the agent cap).
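One way to write that guard, as a sketch. Only budget.total comes from the description above. tokensPerRound, fallbackRounds, and maxRounds are hypothetical knobs with made-up defaults, and plannedRounds is a helper name I invented.

```javascript
// Only `budget.total` is taken from the text; every other name here is hypothetical.
function plannedRounds(
  budget,
  { tokensPerRound = 50000, fallbackRounds = 2, maxRounds = 10 } = {},
) {
  // Covers both "budget is null" and "budget.total is missing".
  const total = budget && typeof budget.total === "number" ? budget.total : null;
  if (total === null) return fallbackRounds; // no target: use a fixed, bounded depth
  const affordable = Math.floor(total / tokensPerRound);
  return Math.max(1, Math.min(maxRounds, affordable));
}

console.log(plannedRounds(null)); // 2
console.log(plannedRounds({ total: 400000 })); // 8
```

Deciding the round count up front keeps the loop bounded whether or not a target was passed.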
Warning
Date.now(), Math.random(), and an argless new Date() all throw inside a workflow. Workflows journal every agent() call so a run can resume, and non-determinism would invalidate that cache. If you need a timestamp, pass it through args. If you need variety across agents, vary the prompt or label by index.
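A small sketch of the replay-safe alternative: build every prompt from explicit inputs, so a resumed run produces identical prompts. buildPrompts is a hypothetical helper and runDate is an assumed arg name, not a documented one.

```javascript
// Hypothetical helper: derives prompts from args and the index only.
// No clock and no randomness, so the same inputs always give the same output.
function buildPrompts(args, angles) {
  // `runDate` is an assumed arg name; pass it in when launching the run.
  const runDate = args && args.runDate ? args.runDate : "an unspecified date";
  return angles.map((angle, i) => ({
    label: `find:${i}:${angle}`, // variety comes from the index, not Math.random()
    prompt: `As of ${runDate}, look for issues from the ${angle} angle (finder ${i + 1} of ${angles.length}).`,
  }));
}
```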
pipeline vs parallel: The Decision That Matters
This trips people up, so here is the rule I follow.
Default to pipeline(). Reach for a parallel() barrier between stages only when a stage needs all prior results at once.
Legitimate reasons for a barrier:
- ✅ Dedupe or merge across the full result set before expensive downstream work.
- ✅ Early-exit on the total (“0 findings, skip verification entirely”).
- ✅ A prompt that references “the other findings” for comparison.
Not legitimate:
- ❌ “I need to flatten or filter first.” Do it inside a pipeline stage.
- ❌ “The stages feel conceptually separate.” Separate is not the same as synchronized.
- ❌ “It’s cleaner code.” Barrier latency is real wall-clock waste.
The smell test: if you wrote parallel → transform → parallel, and that middle transform has no cross-item dependency, you should have used a pipeline. The newsletter example does use a barrier, and correctly: curation has to see every source before it can dedupe and rank across them.
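To put a number on "barrier latency", here is a back-of-the-envelope model. The durations are made up, and it assumes unlimited concurrency, which the real runtime does not give you. barrierWallClock and pipelineWallClock are illustrative helpers, not part of the workflow API.

```javascript
// durations[i][s] = seconds item i spends in stage s (made-up numbers).
// Assumes unlimited concurrency; the real cap would raise both totals.
function barrierWallClock(durations) {
  const stageCount = durations[0].length;
  let total = 0;
  for (let s = 0; s < stageCount; s++) {
    // With a barrier, every item waits for the slowest item in this stage.
    total += Math.max(...durations.map((d) => d[s]));
  }
  return total;
}

function pipelineWallClock(durations) {
  // Without a barrier, each item runs its stages back to back;
  // the run ends when the slowest item finishes.
  return Math.max(...durations.map((d) => d.reduce((a, b) => a + b, 0)));
}

const durations = [
  [10, 60], // fast to find, slow to verify
  [60, 10], // slow to find, fast to verify
  [20, 20],
];
console.log(barrierWallClock(durations)); // 120
console.log(pipelineWallClock(durations)); // 70
```

The gap grows with the variance between items: the more uneven the stage times, the more a barrier makes fast items wait on slow ones.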
Quality Patterns
The primitives compose into reusable harnesses. This is the real value over spawning more agents: the structure is what produces confidence. A few I lean on:
- Adversarial verify: for each finding, spawn N independent skeptics prompted to refute it. Drop the finding unless a majority of them fail to refute it. Stops plausible-but-wrong findings from shipping.
- Perspective-diverse verify: give each verifier a distinct lens (correctness, security, performance, does-it-reproduce) instead of N identical ones. Diversity catches failure modes redundancy can’t.
- Judge panel: generate N attempts from different angles, score with parallel judges, synthesize from the winner while grafting the best of the runners-up.
- Loop-until-dry: for unknown-size discovery, keep spawning finders until K consecutive rounds surface nothing new.
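The first pattern, adversarial verify, can be sketched like this. agent and parallel are passed in as parameters so the block runs outside the workflow runtime. The VERDICT schema, its refuted field, and the prompt wording are assumptions for illustration.

```javascript
// Assumed verdict shape; the real schema is whatever you define.
const VERDICT = {
  type: "object",
  properties: { refuted: { type: "boolean" } },
  required: ["refuted"],
};

// Sketch of adversarial verify. `agent` and `parallel` are injected so this
// can run outside the runtime; inside a workflow you would use the globals.
async function adversarialVerify({ agent, parallel }, findings, skeptics = 3) {
  const checked = await parallel(
    findings.map((finding) => async () => {
      const votes = await parallel(
        Array.from({ length: skeptics }, (_, i) => () =>
          agent(`Skeptic ${i + 1}: try to refute this finding: ${finding}`, {
            phase: "Verify",
            schema: VERDICT,
          }),
        ),
      );
      // Failed skeptics come back as null and do not count as support.
      const upheld = votes.filter(Boolean).filter((v) => !v.refuted).length;
      // Keep only when a strict majority of all skeptics failed to refute it.
      return { finding, keep: upheld > skeptics / 2 };
    }),
  );
  return checked.filter(Boolean).filter((c) => c.keep).map((c) => c.finding);
}
```

Counting a crashed skeptic as "no support" is the conservative choice: a finding has to earn its majority from votes that actually came back.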
Here is loop-until-dry with a diverse-lens verify, condensed:
const seen = new Set();
const confirmed = [];
let dry = 0;
while (dry < 2) {
const found = (await parallel(FINDERS.map((f) => () =>
agent(f.prompt, { phase: "Find", schema: BUGS })))).filter(Boolean).flatMap((r) => r.bugs);
const fresh = found.filter((b) => !seen.has(key(b)));
if (!fresh.length) {
dry++;
continue;
}
dry = 0;
fresh.forEach((b) => seen.add(key(b)));
const judged = await parallel(fresh.map((b) => () =>
parallel(["correctness", "security", "repro"].map((lens) => () =>
agent(`Judge "${b.desc}" via the ${lens} lens — real?`, { phase: "Verify", schema: VERDICT })))
.then((vs) => ({ b, real: vs.filter(Boolean).filter((v) => v.real).length >= 2 }))));
confirmed.push(...judged.filter((v) => v.real).map((v) => v.b));
}
One detail makes or breaks this: dedupe against everything seen, not just confirmed results. Otherwise rejected findings reappear every round and the loop never converges.
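The condensed loop calls a key() helper it never shows. Here is one possible version, as a sketch. It assumes each bug has file and desc fields; only desc appears in the excerpt, so file is illustrative. takeFresh is a hypothetical helper that also collapses duplicates within a single round, which the filter in the loop above does not do.

```javascript
// Hypothetical key(): normalizes the description so trivial rewordings collide.
// `file` is an assumed field; only `desc` appears in the loop above.
function key(bug) {
  const desc = bug.desc
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, " ")
    .trim();
  return `${bug.file || ""}::${desc}`;
}

// Dedupe against everything seen so far, mirroring the `seen` Set in the loop.
function takeFresh(found, seen) {
  const fresh = [];
  for (const bug of found) {
    const k = key(bug);
    if (seen.has(k)) continue;
    seen.add(k); // recorded before verification, so rejected bugs never resurface
    fresh.push(bug);
  }
  return fresh;
}
```

String normalization only catches near-identical wording. Two finders describing the same bug differently will still both get through, so treat the key as a cheap first filter.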
A Shipped Example: How /deep-research Works
My newsletter generator is a toy. If you want to see these patterns in a real, bundled workflow, run /deep-research. It takes a question and returns a cited report, and under the hood it’s the same fan out → reduce → synthesize skeleton with an adversarial verify pass bolted on. It’s the quality pattern from the section above, running in production.
When you launch it, the workflow announces its plan and runs in the background while you keep working.
It moves through five phases:
- Scope: one agent decomposes your question into five distinct search angles, so the searches don’t all chase the same wording.
- Search: five web searches run in parallel, one per angle. This is the fan-out.
- Fetch: dedupe the URLs across angles, pull the top ~15 sources, and extract individual claims from them.
- Verify: the interesting part. Each claim gets an adversarial three-vote check, with skeptics trying to refute it. Claims that don’t survive never reach the report.
- Synthesize: one final agent writes the cited report from the claims that held up.
Map that onto the primitives and you can almost see the script: a single agent() for scope, a parallel() fan-out for the five searches, plain JavaScript to dedupe in fetch, a per-claim verify pass (the same parallel() of skeptics from the loop-until-dry example), and a closing agent() to synthesize. The phases show up in /workflows as named groups (Scope 1/1, Search 0/5, Fetch, Verify, Synthesize), each with its own agent count, token total, and elapsed time, so you can drill into any single search or verification and read its prompt and result.
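Written out, that mapping might look like the sketch below. This is my reconstruction from the five phases, not the bundled /deep-research script. The primitives are injected so it runs outside the runtime, and every prompt, schema, and result shape is an assumption, as is using one agent per URL to extract claims and taking the first 15 URLs as "top".

```javascript
// Illustrative reconstruction of the five phases; NOT the bundled script.
// Builds a tiny schema of the form { [name]: string[] } (assumed shapes).
const listOf = (name) => ({
  type: "object",
  properties: { [name]: { type: "array", items: { type: "string" } } },
  required: [name],
});
const VERDICT = {
  type: "object",
  properties: { refuted: { type: "boolean" } },
  required: ["refuted"],
};

async function deepResearchSketch({ agent, parallel }, question) {
  // Scope: one agent proposes distinct search angles.
  const { angles } = await agent(
    `Split this question into 5 distinct search angles: ${question}`,
    { phase: "Scope", schema: listOf("angles") },
  );

  // Search: one agent per angle (the fan-out).
  const searches = (await parallel(angles.map((a) => () =>
    agent(`Search the web for: ${a}`, { phase: "Search", schema: listOf("urls") }),
  ))).filter(Boolean);

  // Fetch: plain JavaScript dedupes URLs across angles and caps the count.
  const urls = [...new Set(searches.flatMap((s) => s.urls))].slice(0, 15);
  const claims = (await parallel(urls.map((u) => () =>
    agent(`Extract individual claims from ${u}`, { phase: "Fetch", schema: listOf("claims") }),
  ))).filter(Boolean).flatMap((e) => e.claims);

  // Verify: three skeptics per claim; keep claims at least two could not refute.
  const verdicts = await parallel(claims.map((claim) => async () => {
    const votes = (await parallel([1, 2, 3].map((n) => () =>
      agent(`Skeptic ${n}: try to refute: ${claim}`, { phase: "Verify", schema: VERDICT }),
    ))).filter(Boolean);
    return { claim, keep: votes.filter((v) => !v.refuted).length >= 2 };
  }));
  const surviving = verdicts.filter(Boolean).filter((v) => v.keep).map((v) => v.claim);

  // Synthesize: one closing agent writes the report from surviving claims.
  return agent(
    `Write a cited report answering "${question}" from these claims:\n${surviving.join("\n")}`,
    { phase: "Synthesize" },
  );
}
```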
This is the difference between “ask Claude to research something” and a workflow. A single agent doing web research holds every half-read source in one context and never checks its own claims. /deep-research decomposes the search so coverage is wide, keeps the intermediate sources out of your conversation, and runs a verification pass a single turn-by-turn agent would never run against itself.
Triggering and Watching a Run
Worth saying plainly: from Claude Code’s side, a workflow is a tool. There’s a Workflow tool the same way there’s a Read or Bash tool, and “running a workflow” means Claude calls that tool with a script. The runtime executes the script in the background while your session stays responsive, which is why you can keep chatting while dozens of agents churn away.
There are a few ways a workflow gets written and launched:
- Say “workflow” in your prompt. Include the word and Claude writes a workflow script for the task instead of working through it turn by turn.
- Run a saved or bundled command. A workflow you saved to the project, or the built-in /deep-research covered above.
- Turn on ultracode. Claude plans a workflow for every substantial task in the session.
Run a workflow to audit every API endpoint under src/routes/ for missing auth checks. Spawn one agent per route file, then have a second pass verify each finding before reporting.
When a run does what you wanted, you can save it: Claude Code writes the script into .claude/workflows/ in your repo as a <name>.js file (the appendix below is exactly that file for my newsletter). Because it lives in the repo, it’s version-controlled and anyone who clones it can launch it by name and pass arguments:
Run the vue-newsletter workflow with args {"weekStart":"2026-06-04","weekEnd":"2026-06-11"}
Runs happen in the background, and /workflows is how you watch them: it lists every run, including which ones are currently running, and opens a progress view showing each phase with its agent count, token total, and elapsed time. You can drill into a phase, then into a single agent, to read its prompt and result, pause or stop a run, or press s to save a good one’s script as a reusable /<name> command under .claude/workflows/.
When to Reach for One
graph TD
A[Does the job need breadth,<br/>verification, or scale?] -->|No| B[Single session<br/>or a subagent]
A -->|Yes| C[Do you want the control flow<br/>to be deterministic and repeatable?]
C -->|No| D[Agent team]
C -->|Yes| E[Write a workflow]
Good fit ✅
- Decomposing a job so every part is covered in parallel (audits, reviews, sweeps).
- Anything you want to re-run with the same structure (a weekly competitor scan, a release checklist).
- Confidence-critical work where a repeatable verify or judge pass beats one model’s first answer.
Bad fit ❌
- An ordinary task one agent can do turn by turn. Let one agent do it.
- Work that needs you to weigh in between every stage. A workflow can’t take mid-run input; only agent permission prompts pause it.
- Anything where the token cost of dozens of agents isn’t justified by breadth or scale.
Warning
A workflow spawns many agents, so one run can use meaningfully more tokens than doing the same task in conversation, and it counts toward your plan’s usage. Every agent uses your session’s model unless the script routes a stage elsewhere, so check /model before a large run and consider routing cheap stages to a smaller model.
Conclusion
- A workflow is a JavaScript script that orchestrates subagents deterministically. You own the control flow, agents do the thinking, and the plan lives in code so the conversation only sees the final answer.
- The shape that generalizes is fan out → reduce → synthesize. The newsletter generator I built is a deliberately small instance of it.
- agent() runs one subagent (use a schema for validated structured output), parallel() is a barrier, pipeline() streams items through stages with no barrier. Default to pipeline.
- The leverage is the repeatable quality patterns: adversarial verify, diverse lenses, judge panels, loop-until-dry.
- It is opt-in and token-hungry. Reach for it when a job needs breadth, independent verification, or scale a single context can’t hold. Otherwise let one agent do the work.
If you’ve already worked through subagents and skills (see Claude Code Customization: CLAUDE.md, Slash Commands, Skills, and Subagents), workflows are the natural next tool. The fastest way to understand them is the same way I did: pick a small task with a clear fan-out shape and build the throwaway version. Mine was a newsletter generator I won’t run again. The point was never the newsletter; it was seeing how the pieces fit, so that when a job needs breadth or verification, reaching for a workflow is obvious.
Appendix: The Full Script
Everything above is excerpts. Here is the complete .claude/workflows/vue-newsletter.js in one piece, so you can see how the meta block, the schemas, the source list, and the three phases fit together. It’s plain JavaScript: no imports, no filesystem access, inputs via args, results via return.
vue-newsletter.js: the complete workflow
export const meta = {
name: 'vue-newsletter',
description: 'Research Vue/Nuxt ecosystem sources in parallel for a given week and synthesize a newsletter',
whenToUse: 'Generate a weekly Vue/Nuxt newsletter. Pass args {weekStart, weekEnd, label} as ISO dates (e.g. {"weekStart":"2026-05-21","weekEnd":"2026-05-28"}). With no args, agents cover the past 7 days from today.',
phases: [
{ title: 'Research', detail: 'one agent per source — releases, blogs, social, people' },
{ title: 'Curate', detail: 'dedupe + rank items by impact' },
{ title: 'Write', detail: 'synthesize the final newsletter' },
],
}
// Args are optional. Pass {weekStart, weekEnd, label} as ISO dates to scope a specific week.
// With no args, agents are told to cover "the past 7 days from today" (they resolve the date via web search).
const hasRange = args && args.weekStart && args.weekEnd
const weekStart = hasRange ? args.weekStart : null
const weekEnd = hasRange ? args.weekEnd : null
const label = (args && args.label) || (hasRange ? `Week of ${weekStart}–${weekEnd}` : 'this week')
const window = hasRange ? `between ${weekStart} and ${weekEnd}` : 'within the past 7 days from today'
const ITEM_SCHEMA = {
type: 'object',
additionalProperties: false,
properties: {
source: { type: 'string' },
items: {
type: 'array',
items: {
type: 'object',
additionalProperties: false,
properties: {
title: { type: 'string' },
url: { type: 'string' },
summary: { type: 'string', description: '1-3 sentence plain summary of what changed / why it matters' },
category: { type: 'string', enum: ['release', 'article', 'tooling', 'discussion', 'tutorial', 'people', 'other'] },
date: { type: 'string', description: 'ISO date if known, else empty' },
impact: { type: 'string', enum: ['high', 'medium', 'low'] },
},
required: ['title', 'url', 'summary', 'category', 'impact'],
},
},
},
required: ['source', 'items'],
}
// Each source is researched by its own agent in parallel.
const SOURCES = [
{
key: 'core-releases',
prompt: `Find releases/changelogs published ${window} for these GitHub repos: vuejs/core, vuejs/router (vue-router), vuejs/pinia, vueuse/vueuse, vitejs/vite, vitejs/vitest. For each new release in that window, give the version, the highlights, and the release URL. Skip anything outside the date window.`,
},
{
key: 'nuxt-releases',
prompt: `Find releases/changelogs published ${window} for the Nuxt ecosystem on GitHub: nuxt/nuxt, nuxt/ui, nuxt/image, nuxt/content, unjs/nitro, unjs/h3. Give version, highlights, and URL for each release in that window only.`,
},
{
key: 'vue-blog',
prompt: `Check the official Vue.js blog (blog.vuejs.org) and Vue.js news for posts published ${window}. Summarize each post with its URL.`,
},
{
key: 'nuxt-blog',
prompt: `Check the official Nuxt blog (nuxt.com/blog) for posts published ${window}. Summarize each with URL.`,
},
{
key: 'hackernews',
prompt: `Search Hacker News (news.ycombinator.com) for stories about Vue, Nuxt, Vite, or Pinia that were active/posted ${window}. Include the HN discussion URL and the linked article. Note points/comments if visible.`,
},
{
key: 'reddit',
prompt: `Search Reddit r/vuejs and r/Nuxt for notable threads posted ${window} — announcements, releases, popular discussions, showcased projects. Give the reddit thread URL for each.`,
},
{
key: 'devto',
prompt: `Search dev.to for the most useful Vue and Nuxt tagged articles published ${window} (tutorials, deep-dives, tips). Give URLs.`,
},
{
key: 'people',
prompt: `Look for notable updates, posts, or talks ${window} from key Vue/Nuxt people: Evan You (@youyuxi / VoidZero), Daniel Roe (Nuxt lead), Anthony Fu (VueUse/Vitesse/Slidev), Eduardo San Martin Morote (posva — router/pinia), Sébastien Chopin (Nuxt/NuxtLabs). Include VoidZero and NuxtLabs company news too. Give URLs.`,
},
{
key: 'newsletters-podcasts',
prompt: `Find Vue/Nuxt newsletter issues and podcast episodes published ${window}: Vue.js Newsletter (news.vuejs.org), This Week in Vue, Michael Thiessen's newsletter, DejaVue podcast, Deox/Vue Mastery content. Summarize and give URLs.`,
},
]
phase('Research')
const raw = await parallel(
SOURCES.map((s) => () =>
agent(
`You are researching the Vue.js / Nuxt ecosystem for a weekly newsletter covering ${label} (${window}).\n\n${s.prompt}\n\nUse web search and fetch real URLs. Only include items genuinely within the date window. Return real, verifiable URLs — never invent links. If you find nothing in the window, return an empty items array. Set impact based on how much the average Vue developer should care.`,
{ label: `research:${s.key}`, phase: 'Research', schema: ITEM_SCHEMA, agentType: 'general-purpose' },
),
),
)
const collected = raw.filter(Boolean)
const flatItems = collected.flatMap((c) => (c.items || []).map((it) => ({ ...it, source: c.source })))
log(`Collected ${flatItems.length} items across ${collected.length} sources`)
phase('Curate')
const CURATED_SCHEMA = {
type: 'object',
additionalProperties: false,
properties: {
highlights: { type: 'array', items: { type: 'string' }, description: '2-4 sentence TLDR bullets of the biggest stories this week' },
items: {
type: 'array',
items: {
type: 'object',
additionalProperties: false,
properties: {
title: { type: 'string' },
url: { type: 'string' },
summary: { type: 'string' },
category: { type: 'string' },
impact: { type: 'string' },
},
required: ['title', 'url', 'summary', 'category', 'impact'],
},
},
},
required: ['highlights', 'items'],
}
const curated = await agent(
`Here are raw newsletter candidate items gathered from multiple sources for the Vue/Nuxt week of ${label}:\n\n${JSON.stringify(flatItems, null, 2)}\n\nCurate them:\n1. Remove duplicates (same release/article surfaced by multiple sources — keep the best canonical URL).\n2. Drop low-quality, off-topic, or spammy entries.\n3. Rank by impact (high first).\n4. Write 3-5 punchy "highlights" bullets capturing the week's biggest stories.\nKeep every URL exactly as provided — do not fabricate or alter links.`,
{ phase: 'Curate', schema: CURATED_SCHEMA },
)
phase('Write')
const newsletter = await agent(
`Write a polished weekly Vue.js / Nuxt newsletter in Markdown for ${label}.\n\nUse this curated data:\n${JSON.stringify(curated, null, 2)}\n\nStructure:\n- A title with the week range and a one-paragraph intro setting the tone.\n- "📌 This Week's Highlights" — the highlights bullets.\n- "🚀 Releases" — version bumps with what changed (group Vue core + Nuxt + tooling).\n- "📝 Articles & Tutorials".\n- "🛠️ Tooling & Ecosystem".\n- "💬 Community & Discussion".\n- "👤 From the Core Team & Community" — people/company news.\n- A short friendly sign-off.\n\nEvery item must be a markdown link to its real URL. Keep summaries tight and developer-focused. Omit any empty section. Output ONLY the markdown newsletter.`,
{ phase: 'Write' },
)
return { newsletter, itemCount: flatItems.length, curated }