09. Design systems that survive AI generation

16 April 2026#use-case#design-system#frontend#code-quality

Claude builds frontends. Some of them are good. Some of them drift — a pure grey sneaks in, a flat black shadow appears, Inter replaces a distinctive serif. The drift is subtle. The cumulative effect isn't. After a few builds, your product looks like every other AI-generated SaaS page: purple gradients, rounded corners, system fonts, no personality.

The problem isn't that Claude can't do better. It can. The problem is that without a codified system to check against, "better" is whatever Claude decides that day.

The problem

You've seen the pattern. You ask Claude to build a dashboard. It looks fine. You ask for a landing page. Also fine — but in a different way. Different fonts. Different color temperature. Different spatial rhythm. Neither is wrong on its own. Together, they belong to two different products.

This is what happens when design decisions are made per-build instead of per-system. Every generation starts from scratch. There's no memory of what came before, no constraint set that says "this is us."

The failure modes are predictable. Pure neutrals (#FFFFFF, #000000, #808080) disconnect the brand from the surface. Flat black shadows look pasted on. Generic font stacks (Inter, Roboto, Arial) say nothing. Timid, evenly-distributed color palettes avoid commitment. The result is technically functional and aesthetically vacant.

The fix: a codified design system with teeth

The /frontend-design-review skill solves this by turning your design system into a scoring rubric. It reads a reference specification — your tokens, your palettes, your typography rules, your spatial constants — and audits any artifact against it. Not vibes. Scores.

Five dimensions, weighted by how often they're the failure mode:

Color System (30%) — Checks for pure neutral violations, token completeness, temperature consistency, shadow temperature, dark mode quality, and interactive state treatment. Runs the greyscale test: does hierarchy hold without color? This gets the highest weight because it's where AI drift shows up first.

Typography (25%) — Checks against a banned font list, evaluates pairing quality (display + body + mono triad), hierarchy clarity, monospace-as-UI-chrome usage, and typographic details like letter-spacing on uppercase text.

Spatial Composition (20%) — Vertical rhythm, grid gap uniformity, component padding, border radius consistency, responsive behavior.

Motion & Interaction (15%) — Page load orchestration, transition timing, hover state coverage, scroll behavior.

Component Patterns (10%) — Consistency across cards, badges, tables, headers.

Each dimension gets a 1–5 score. The weighted total produces a verdict: Approved, Approved with notes, Revise, or Redesign. There's no "pretty good" — either the system is enforced or it isn't.

What the output looks like

A full review produces four documents:

review.md — The graded audit. Scores, narrative assessment, every finding with severity. Written so a designer understands the reasoning, not just the verdict.

checklist.md — Binary pass/fail against every item in the design system. Printable. Hang it on the wall.

redlines.md — Every finding with a before/after code fix. A developer works through it top to bottom and the design system compliance rises mechanically.

handoff-brief.md — One page for a graphic designer. No code. The three most important things to look at.

The design system behind it

The skill ships with a reference specification built from a real color systems article. It codifies:

The no-pure-neutrals rule. Every neutral carries a hue — even if it's only 2%. Pure grey disconnects the brand from the surface. A tinted neutral makes the brand an intensification of the hue already running through every surface. This is the single decision that makes a UI look expensive.

Temperature-matched shadows. Blue surface, blue shadow. Warm surface, neutral shadow at low opacity. Dark mode uses glow edges instead of drop shadows. Pure black shadows read as Photoshop effects.

Three reference palettes — warm (editorial, rust brand on light grey), cool (developer tools, blue throughout), dark (tinted navy, not zinc-900). Each with all 11 token positions defined.

A font stack pattern. Display font for personality, body font for readability, monospace accent for precision. The reference uses Fraunces, Instrument Sans, and Chivo Mono. The point isn't these specific fonts — it's that fonts are chosen, not defaulted.

Spatial constants. 1080px container. 88px vertical rhythm. 14px grid gaps. Zero border radius throughout (an intentional aesthetic choice). Every component has a defined padding value.

You can replace every value in the reference spec with your own brand's values. The skill doesn't care what your system is. It cares that you have one and that the artifact matches it.

When to run it

After every Claude-generated frontend. That's the honest answer. But especially:

After the /landing-page pipeline builds a page (Review Point B — Phase 6)
Before the /landing-page pipeline builds a page (Review Point A — Phase 4, checking the design brief)
After any standalone component or dashboard build
When comparing two versions of a UI to track design drift

Skip it if the artifact is a prototype that's going to be rebuilt anyway, or if the project hasn't established a design system yet (in which case, establish one first — the reference spec is a good starting point).

Three modes

Full review — "design review" or "check this design." All five dimensions. Four output documents.

Quick check — "just check the colors" or "typography check." One dimension, findings inline, no separate documents.

Comparison — Two versions in. Delta table out. Dimension scores side by side, what improved, what regressed.

Resources

Skill reference: /frontend-design-review — trigger phrases, output structure, scoring rubric details.

Integrates with:

08. Building a landing page from a codebase — plugs in at Review Points A and B
/ux-audit pipeline — UX and positioning analysis

Design system reference: The DESIGN-SPECIFICATION.md bundled with the skill — the full color system, typography rules, spatial constants, and motion patterns extracted from a live color systems article.

Download: Full toolkit (252KB) — all 16 commands, all 11 skills, installs in 30 seconds.

Part of the Claude Skills Library.

𝕏 Post