Most of my frontend code is written by an agent now. I review it, nudge it, tighten the architecture. I don’t type most of it.
That flips what a quality pipeline is for. Tests and types stop being something you write so the next developer who touches the code stays sane. They’re the contract the agent uses to check its own work. The more independent ways it can verify a change – types, lint, unit, component, real browser, a11y, bundle budget – the more of a ticket it can finish without me in the loop. Each green check is a signal it can act on; each red one is a hint about what to try next.
Frontend has gotten a lot more complicated in the last year, too. SSR, streaming, partial prerendering, server components, edge runtimes. More places for a change to be silently wrong than there were a year ago. “TypeScript plus a couple of unit tests” doesn’t cover it anymore.
A quality pipeline is the set of checks that run on every change – locally, on commit, and in CI – to give layered confidence that the change is correct, accessible, performant, and safe to ship. A testing strategy is the part of that pipeline that asserts behaviour: what the app should do, at which level, at what cost.
Most teams discuss the two separately. They’re one system. Skip the pipeline and nobody runs the tests. Skip the strategy and you write the wrong ones. Design them together so that:
- Each check has a clear job and runs at the cheapest stage where it can catch the problem.
- The feedback loop is short enough that nobody – developer or agent – bypasses it.
- The same checks run on a contributor’s laptop, in an agent’s sandbox, and on the CI runner.
The pipeline looks roughly the same whether you build with Next, Nuxt, Astro, SvelteKit, Remix, or a plain Vite app. The framework just decides which adapter you import.
Background
Frontend tooling consolidated between 2023 and 2026. Vite became the default dev/build engine across the major frameworks. Vitest replaced Jest. Playwright became the default for E2E. ESLint adopted flat config; Biome and Oxlint emerged as much faster alternatives in Rust. TypeScript strict mode became table stakes. Renovate replaced Dependabot. In March 2026, VoidZero shipped Vite+ as the open-source culmination of that trend – one CLI that wraps Vite, Rolldown, Vitest, Oxlint, Oxfmt, and Tsdown.
The pieces of a modern quality pipeline now look roughly the same regardless of framework, so the concept is worth describing without tying it to one specific stack.
My default stack
For new frontend projects in 2026 I reach for Vite+ instead of wiring
the toolchain by hand. Vite+ (viteplus.dev) is
the unified toolchain from VoidZero, Evan You’s company. It bundles Vite,
Rolldown, Vitest, Oxlint, Oxfmt, and Tsdown behind a single CLI (vp dev,
vp check, vp test, vp build) and one config file. The alpha shipped
open source under MIT.
If you adopt the pieces one at a time, the swaps I would make are:
| Old default | What I use | Why |
|---|---|---|
| ESLint | Oxlint | ~50× faster, fast enough to run on every keystroke |
| Prettier | Oxfmt | ~30× faster, Prettier-compatible defaults |
| Jest | Vitest | ESM-native, browser mode, same matchers |
| webpack | Vite + Rolldown | ~40× faster production builds |
| four separate configs | vp check / vp test / vp build | one CLI, one config |
Each piece holds up on its own. I have shipped Vitest and Oxlint in production for some time; swapping Prettier for Oxfmt and webpack for Rolldown took a day in the projects I tried. Vite+ removes the integration cost that kept teams on the older stack.
The layers
Think of the pipeline as concentric layers, each cheaper and faster than the one outside it. Run cheap checks first. Save the expensive ones for the things only they can catch.
1. Type safety
Type safety is your first line of defence. Run your framework’s type checker in CI on every PR. Treat any new type error as a build failure.
If you use TypeScript, that means tsc --noEmit (or your framework’s
wrapper around it – most frameworks ship one to handle their
template syntax and project references). If you don’t use TypeScript yet,
this is the single highest-leverage change you can make.
Validate untyped boundaries with a schema library (Zod, Valibot, ArkType).
Anywhere data crosses a boundary – route params, API responses, env
vars, form input – parse it. TypeScript trusts the types you write;
schemas check that the data matches them at runtime. Runtime parsing also
removes the temptation to reach for as – see The Problem with as in TypeScript for why it’s a shortcut we should avoid.
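As a minimal sketch of the parse-don’t-assert pattern, here is a hand-rolled guard standing in for what a Zod or Valibot schema gives you with far less code (the User shape, parseUser, and the /api/user endpoint are invented for illustration):

```typescript
type User = { id: number; email: string };

// Parse, don't assert: validate unknown data at the boundary instead of
// casting it with `as`. A schema library replaces this boilerplate.
function parseUser(data: unknown): User {
  if (
    typeof data === "object" && data !== null &&
    "id" in data && typeof data.id === "number" &&
    "email" in data && typeof data.email === "string"
  ) {
    return { id: data.id, email: data.email };
  }
  throw new Error("payload does not match the User shape");
}

// At the boundary: fetch → parse → typed data everywhere downstream.
async function loadUser(id: number): Promise<User> {
  const res = await fetch(`/api/user/${id}`);
  return parseUser(await res.json());
}
```

With a schema library the guard collapses to `UserSchema.parse(await res.json())`, and the static type is inferred from the schema instead of maintained by hand.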
2. Lint and format
Catches style and a wide class of bugs (unused vars, unsafe any, missing
deps in effects) without running the code.
The conventional choice is ESLint flat config plus typescript-eslint and
your framework’s plugin. The 2026 alternative is Oxlint (Rust-based,
~50× faster) paired with Oxfmt for formatting, or Biome for a
single-binary lint+format combo. The trade-off: Oxlint and Biome have
smaller rule sets than ESLint’s mature ecosystem, but they cover most of
the high-value cases and are fast enough to run on every keystroke. For a
working setup that uses Oxlint as a fast first pass and keeps ESLint for
the rules Oxlint doesn’t yet cover, see my opinionated ESLint setup for Vue projects.
Add these two rule families regardless of which linter you pick. They catch bugs the type system misses:
- eslint-plugin-regexp – ~60 correctness rules for regular expressions. Cheap to add, catches real bugs.
- @e18e/eslint-plugin – small performance lints (e.g., prefer Set.has over Array.includes) that compound across a codebase.
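A sketch of wiring both rule families into ESLint flat config – the regexp plugin’s flat/recommended preset is documented, but the @e18e plugin’s export shape and rule name here are assumptions, so check its README:

```javascript
// eslint.config.js – flat config with both rule families.
import regexp from "eslint-plugin-regexp";
import e18e from "@e18e/eslint-plugin";

export default [
  // ~60 regex correctness rules in one preset.
  regexp.configs["flat/recommended"],
  {
    plugins: { "@e18e": e18e },
    rules: {
      // Example performance lint (rule name is an assumption):
      // prefer Set.has over Array.includes in hot paths.
      "@e18e/prefer-set-has": "warn",
    },
  },
];
```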
3. Unit tests
For pure functions, hooks, stores, and utilities. Cheap, fast, and where most logic should live.
- Tool: Vitest. Run on every save in watch mode; run all of them in CI.
- Aim for high coverage of pure modules; don’t chase coverage on UI glue.
For Vue specifically, see my guide to testing Vue composables with Vitest.
4. Component tests
For components in isolation, with a real DOM and real user interactions. The single biggest win in 2026 is Vitest browser mode – your component tests run in a real Chromium via Playwright instead of jsdom. Hover states, focus, layout, intersection observers, and scroll behaviour all work as they do in production.
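Enabling browser mode is a small config change. A minimal vitest.config.ts sketch, assuming Vitest 3’s browser options:

```typescript
import { defineConfig } from "vitest/config";

// Component tests run in real Chromium via Playwright instead of jsdom.
export default defineConfig({
  test: {
    browser: {
      enabled: true,
      provider: "playwright",
      // One instance per browser you want to cover.
      instances: [{ browser: "chromium" }],
    },
  },
});
```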
Pair this with @testing-library/* for whichever framework you use, and
add axe-core to assert accessibility violations on every mounted
component. Add a meta-test that fails if any component lacks an a11y
assertion, so the practice doesn’t slide. For a deeper walkthrough of how
this fits into a full testing pyramid, see my Vue 3 testing pyramid guide;
for the Vue-specific accessibility checklist that informs those a11y
assertions, see my Vue accessibility blueprint.
5. API mocking
Hard-coded fixtures go stale. Tests that hit a real backend are flaky. Mock at the network layer once and reuse the same handlers everywhere.
- Tool: MSW (Mock Service Worker). It intercepts fetch, XHR, and GraphQL with a service worker in the browser and a request interceptor in Node, so the same handler definitions work in Vitest, Vitest browser mode, Playwright, and the dev server.
- Define handlers once in src/mocks/handlers.ts; load them in your test setup and (optionally) in the dev server for offline-first development.
- Combined with Zod (or Valibot/ArkType) schemas at the same boundary, you get mocks that are typed, schema-validated, and shared across every layer that hits the network – one source of truth instead of three drifting fixture folders.
6. End-to-end tests
For critical user journeys across real pages: signup, checkout, the one or two flows that must never break. Keep the suite small – E2E is expensive.
- Tool: Playwright.
- Run against a built preview, not the dev server.
- Two assertions worth wiring into a custom fixture, regardless of framework, because they catch silent regressions:
  - Hydration mismatches. Listen for hydration warnings on console and fail the test if any appear. SSR/CSR drift is one of the most common silent regressions in modern frameworks. I wrote a dedicated post on catching hydration errors in Playwright tests with a reusable fixture.
  - CSP violations. Listen for securitypolicyviolation events. If your CSP is real, this turns every E2E run into a CSP regression test.
7. Visual regression
Catches unintended UI drift that unit and E2E tests miss.
- Chromatic (hosted, Storybook-native) or Playwright screenshots + a diff tool like Lost Pixel for self-hosted. For a Vitest-native approach, see how to do visual regression testing in Vue with Vitest.
- onlyChanged: true keeps it cheap: only re-snapshot stories whose dependencies changed.
- Gate on PR; review diffs as part of code review.
8. Performance and bundle size
Performance regressions are silent unless you measure them.
- Lighthouse CI on a preview deployment. Run it against both a light and dark color scheme – contrast regressions show up only in one.
- size-limit or your framework’s bundle analyzer on PR for bundle deltas. Set explicit budgets and fail the build when they’re exceeded.
- Real-user monitoring (web-vitals shipped to your analytics or Sentry) for what the lab can’t see.
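A sketch of explicit budgets in package.json – the paths and numbers are arbitrary examples, set your own:

```json
{
  "size-limit": [
    { "path": "dist/assets/*.js", "limit": "150 kB" },
    { "path": "dist/assets/*.css", "limit": "20 kB" }
  ],
  "scripts": {
    "size": "size-limit"
  }
}
```

Run `pnpm size` in CI; size-limit exits non-zero when a budget is exceeded, which is what fails the build.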
9. Dead code and dependency hygiene
Unused code is a tax on every other check.
- Knip to find unused files, exports, and dependencies. Configure per-workspace if you have a monorepo.
- Renovate for automated dependency updates with grouping and a sane schedule.
- OSV-Scanner for vulnerabilities and Gitleaks for secrets, gated to high-severity only to avoid alert fatigue.
- Generate an SBOM (Software Bill of Materials) on every build with Syft, Trivy, or cdxgen, in CycloneDX or SPDX format. This is shifting from “nice to have” to “regulated requirement” in 2026 (EU CRA, US executive orders), and it’s the same artefact your security team uses to answer customer vulnerability questionnaires.
10. Internationalisation drift
If you ship in more than one language, untranslated strings slip through unnoticed. A drift checker in CI catches them.
- Lunaria compares each locale against a source locale and reports missing or stale keys. It works with any project that has translation files; you can publish a public status dashboard from the same data.
11. Preview deployments
The cheapest way to enable manual review and to give E2E, Lighthouse, and visual-regression checks something realistic to run against.
- Vercel, Netlify, or Cloudflare Pages will give you a unique URL per PR for free.
- Wire your downstream checks to that URL.
12. Automated code review
In 2026, AI code review is a standard pipeline stage. It runs before any human reviewer touches the PR and catches issues the layers above miss: logic mistakes, missing edge cases, security smells, and the small inconsistencies that lint rules can’t express.
- CodeRabbit, Greptile, and Vercel Agent are the main options. Recent benchmarks put Greptile’s bug-catch rate around 82% versus CodeRabbit’s ~44%, but Greptile produces more false positives and runs slower; CodeRabbit covers more git platforms.
- Have it run alongside specialist scanners (secrets, vulnerabilities, workflow lint, shell/yaml lint) so a single bot comment summarises every machine-checkable concern on the PR.
- Pause the bot on Renovate / Dependabot PRs to avoid noise on mechanical updates.
Treat the AI reviewer as a high-recall first pass. Humans still review, but less. If you want to add an AI agent that goes one step further and exercises the app in a real browser, see how I run automated QA with Claude Code, Agent Browser, and GitHub Actions.
Where each layer runs
Same layers, different stages. Pick the cheapest stage where each check can catch the problem.
| Stage | What runs |
|---|---|
| Editor | Type checker LSP, linter, Vitest watch |
| Pre-commit | Format and lint on staged files only |
| CI on PR | Typecheck, full lint, unit, component, build, knip, size-limit, AI review |
| CI on preview URL | E2E, accessibility (axe + Lighthouse), visual regression |
| Post-merge / nightly | Full E2E matrix, dependency updates, security scans, SBOM publish |
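As a sketch, the “CI on PR” row of that table might map to a single GitHub Actions job like this (assuming Vite+ scripts and pnpm – adapt the commands to your stack):

```yaml
# .github/workflows/ci.yml
name: ci
on: pull_request
jobs:
  pr-checks:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: pnpm/action-setup@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 22
          cache: pnpm
      - run: pnpm install --frozen-lockfile
      - run: pnpm exec vp check   # lint + typecheck
      - run: pnpm exec vp test    # unit + component
      - run: pnpm exec vp build
      - run: pnpm exec knip
      - run: pnpm exec size-limit
```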
For wiring local hooks themselves, Lefthook has become the
default modern alternative to Husky: a single Go binary, declarative
YAML config, and parallel execution of lint/format/test commands on
staged files. Commit a lefthook.yml to the repo, run
lefthook install once, and contributors get the same hook setup
automatically. Pair it with lint-staged (or Lefthook’s built-in
{staged_files} substitution) so pre-commit only runs against the
files actually changing – the fast-check pattern that keeps the
hook under a couple of seconds.
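A minimal lefthook.yml sketch of that fast-check pattern (globs are examples):

```yaml
# lefthook.yml – parallel lint + format on staged files only.
pre-commit:
  parallel: true
  commands:
    lint:
      glob: "*.{ts,tsx,vue}"
      run: pnpm exec oxlint {staged_files}
    format:
      glob: "*.{ts,tsx,vue,css,md}"
      run: pnpm exec oxfmt --check {staged_files}
```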
Git 2.54 added config-based hooks, which means a small project no
longer needs an external hook manager at all. You define hooks in
.gitconfig instead of as scripts under .git/hooks:
```ini
[hook "linter"]
    event = pre-commit
    command = pnpm exec oxlint --staged

[hook "format"]
    event = pre-commit
    command = pnpm exec oxfmt --check
```
Multiple hooks per event run in order. Disable a single hook with
hook.<name>.enabled = false (useful for opting one repo out of a
system-wide config), and list the active ones with git hook list pre-commit. The traditional .git/hooks/* scripts still run last,
so existing setups keep working. For a small project this covers most
of what Lefthook does without an extra binary. Lefthook still has the
edge for parallel execution and staged-file substitution.
One valid alternative skips commit hooks entirely and runs everything server-side in CI as required status checks. You trade a slower red-CI feedback loop for never blocking a contributor with a flaky local hook.
What this gets you
- Regressions caught before merge. A typed schema at the boundary plus a Playwright check on the critical path catches more shipped bugs than any single layer alone.
- Refactors get safer. Strict types and a healthy unit and component test suite let you change internals without breaking surface behaviour.
- Onboarding gets shorter. A new contributor can run one command, see green, and trust that CI will tell them if they break something.
- No one becomes the bottleneck. The pipeline enforces the standard for accessibility, performance, and i18n, so quality stops riding on a single contributor.
Picking your testing shape
The layers above tell you what to run. They don’t tell you how much weight to give each one. That depends on your context, and the answer isn’t the same for a solo dev shipping a single Vite app and a fifty-engineer team coordinating across a dozen services.
The industry argues about this in shapes. The classical Pyramid (Mike Cohn) puts most of the weight on unit tests and very little on E2E. Kent C. Dodds’ Trophy moves the weight to integration tests because frontend bugs live at the component-interaction layer. Spotify’s Honeycomb pushes weight onto integrated and contract tests because in a microservices world, isolation tests prove little and full E2E is brittle. The Ice-cream cone is the anti-pattern you end up with by accident: lots of slow E2E on top, almost nothing underneath.
The right answer is “it depends,” but you can be precise about what it depends on. As web.dev puts it in Pyramid or Crab, “the testing strategy that’s right for your team is unique to your project’s context”. Pactflow, from the contract-testing camp, goes further: at scale, full E2E is a tax with diminishing returns once teams and services multiply.
The four inputs that move the needle most:
- Team size. Six developers can keep a real E2E suite green together; sixty cannot. Coordination cost is the silent E2E killer.
- Backend control. If you don’t own the backend, contract testing goes from “nice” to “the only way you can change anything safely”.
- Number of services in the flow. Each new service multiplies the surface that has to be set up, seeded, and reset for an E2E run.
- Deployment cadence. A daily-merge team can’t afford a flaky twenty-minute suite; a quarterly-release team can.
Pick yours, and the shape that fits will follow.
A pyramid for one team is the right answer; for another it’s the reason you can’t ship on Friday. The shape isn’t a fashion choice — it’s a function of the constraints you’re working under.
Picking your battles
You don’t need every layer on day one. A reasonable order to add them:
- TypeScript strict + a fast linter + a formatter, wired into Lefthook so format and lint run on staged files at commit time.
- Vitest for utilities, run on PR.
- MSW handlers for any module that hits the network, shared between tests and the dev server.
- Playwright for the single most important user journey, with hydration and CSP listeners wired into a shared fixture.
- Preview deployments and Lighthouse CI (light and dark).
- Storybook + visual regression once the design system stabilises.
- Accessibility audits in component tests and E2E.
- Knip and bundle-size budgets once the codebase has weight.
- i18n drift checking once you ship a second locale.
- AI code review and SBOM generation once the project has external stakeholders – reviewers, customers, or compliance.
Each layer should pay for itself in caught regressions or saved review time. Remove the ones that don’t.
Supply chain defaults
The pipeline above catches your bugs. The supply chain catches you. The 2025–2026 wave of npm attacks (Shai-Hulud, the Rspack postinstall cryptominer, the axios 1.14.1 hijack) made the package manager’s defaults a real part of your security posture. I use pnpm on every new project. Three of its settings do most of the work.
1. Lifecycle scripts blocked by default. Since pnpm 10, preinstall
and postinstall scripts in dependencies do not run on pnpm install.
You opt specific packages in via pnpm.onlyBuiltDependencies in
package.json. Most historical supply chain payloads shipped through
postinstall, so this default removes one of the biggest attack vectors.
2. minimumReleaseAge. A pnpm 10.16+ setting that refuses to
resolve a published version until it is at least N minutes old. Set
it to 1440 (one day) or 10080 (one week) in pnpm-workspace.yaml.
Most compromised packages get detected and unpublished within hours,
so a 24-hour delay covers the common published-and-pulled incidents.
pnpm 11 makes one day the default. Use minimumReleaseAgeExclude for
the few internal or first-party packages you genuinely need to install
the moment they ship.
3. blockExoticSubdeps. Refuses transitive dependencies pinned to
git repositories or tarball URLs. Closes a common path for
typo-squatting and dependency confusion.
A minimal pnpm-workspace.yaml for a new project:
```yaml
minimumReleaseAge: 1440
blockExoticSubdeps: true
onlyBuiltDependencies:
  - esbuild
  - sharp
```
Pair this with OSV-Scanner and Gitleaks in CI (layer 9 of the pipeline) and you cover both the install-time and the audit-time sides of supply chain security.
Related resources
If you want to read more or start implementing this:
Tools
- Vite+ – unified toolchain from VoidZero (Vite, Rolldown, Vitest, Oxlint, Oxfmt, Tsdown)
- pnpm supply chain security – the full list of defaults and settings discussed above
- Vitest – including browser mode
- Playwright
- MSW – network-level API mocking for browser and Node
- Lefthook – fast Git hooks manager (modern Husky alternative)
- Storybook
- Knip
- Oxlint
- Biome
- Lighthouse CI
- Lunaria – i18n drift detection
- size-limit
- axe-core
- Syft – SBOM generation
- CodeRabbit, Greptile – AI code review
Reference
- The Practical Test Pyramid – Ham Vocke’s canonical write-up of Mike Cohn’s pyramid.
- The Testing Trophy – Kent C. Dodds’ model for where to put the weight of your tests.
- Testing of Microservices (Honeycomb) – Spotify’s case for integrated tests over isolated unit tests in a microservices world.
- Pyramid or Crab? Find a testing strategy that fits – web.dev on choosing a shape for your context.
- Proving E2E tests are a scam – Pactflow’s contract-first counter-position.
- Frontend testing guide: 10 essential rules for naming tests