01. How to audit your codebase before it audits you
#use-case #code-quality #security #orchestrator
David Olsson
You've inherited a codebase. Or you've been building one for eighteen months and someone asks if it's production-ready. You have no idea. You know there are problems — every codebase has problems — but you don't know where they are, how bad they are, or which ones to fix first.
The manual approach is to read through the code. The realistic approach is to read through some of the code, find a few things that bother you, fix those, and hope the rest is fine. It never is.
The problem
Code quality isn't one thing. It's at least five things happening simultaneously, and they're easy to miss because they're different shapes:
Consistency problems look like style preferences. One file uses camelCase, another uses snake_case. One module handles errors with try/catch, another returns error objects. Individually, none of these matter. Collectively, they mean every new developer spends their first two weeks learning which convention applies where — because the answer is "it depends on who wrote that file."
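A minimal sketch of what a consistency check does mechanically: classify each identifier by convention and flag files that mix styles. The regexes and function names here are illustrative, not the pipeline's actual implementation.

```python
import re

# Illustrative patterns for two common naming conventions.
CAMEL = re.compile(r"^[a-z]+(?:[A-Z][a-z0-9]*)+$")
SNAKE = re.compile(r"^[a-z]+(?:_[a-z0-9]+)+$")

def convention_mix(identifiers):
    """Return the set of naming conventions present in a list of identifiers.

    A result containing more than one convention means the file mixes styles.
    """
    found = set()
    for name in identifiers:
        if CAMEL.match(name):
            found.add("camelCase")
        elif SNAKE.match(name):
            found.add("snake_case")
    return found

# A file defining both of these would be flagged as mixed:
convention_mix(["getUser", "fetch_orders"])
```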
Repetition problems look like pragmatism. Someone copy-pasted a function because it was faster than extracting it. Now the function exists in four places with slightly different implementations. A bug fix in one copy doesn't propagate to the others. This is how inconsistencies breed.
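The simplest version of duplicate detection is hashing whitespace-normalized code blocks, sketched below. Real detectors compare at the token or AST level to catch the "slightly different implementations" case; this toy version only catches exact copies modulo formatting.

```python
import hashlib
import re
from collections import defaultdict

def normalize(code: str) -> str:
    """Collapse all whitespace so formatting differences don't matter."""
    return re.sub(r"\s+", " ", code).strip()

def find_duplicates(blocks: dict) -> list:
    """blocks: {location: source text}. Returns groups of identical copies."""
    groups = defaultdict(list)
    for name, code in blocks.items():
        digest = hashlib.sha256(normalize(code).encode()).hexdigest()
        groups[digest].append(name)
    return [names for names in groups.values() if len(names) > 1]
```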
Security problems look like nothing at all until they're everything. A hardcoded API key. An unvalidated input. A dependency with a known CVE that nobody checked because the tests pass. These sit quietly in the codebase until someone finds them — and you want that someone to be you.
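Hardcoded-secret scanning is, at its core, pattern matching over source lines. The sketch below shows two illustrative patterns (an AWS-access-key-shaped token and a generic `key = "..."` assignment); production scanners use far larger pattern sets plus entropy analysis, and this is not the pipeline's actual ruleset.

```python
import re

# Illustrative patterns only — real scanners ship hundreds of these.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),  # shape of an AWS access key ID
    re.compile(r"(?i)(api[_-]?key|secret|token)\s*[:=]\s*['\"][^'\"]{8,}['\"]"),
]

def scan_line(line: str) -> bool:
    """Return True if a line matches any known secret pattern."""
    return any(p.search(line) for p in SECRET_PATTERNS)
```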
Pattern problems look like architecture decisions. The codebase uses async/await in some places and callbacks in others. State management is split across three different approaches. There's an N+1 query that nobody noticed because the dataset was small during development. These are the problems that only surface at scale.
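The N+1 shape is worth seeing concretely. The sketch below uses a toy in-memory "database" with a query counter standing in for real round-trip latency; the class and method names are made up for illustration.

```python
class ToyDB:
    """Toy stand-in for a database; counts queries instead of timing them."""

    def __init__(self, orders):
        self.orders = orders  # {user_id: [order, ...]}
        self.queries = 0

    def orders_for(self, user_id):
        self.queries += 1
        return self.orders.get(user_id, [])

    def orders_for_many(self, user_ids):
        self.queries += 1
        return {uid: self.orders.get(uid, []) for uid in user_ids}

def n_plus_one(db, user_ids):
    # One query per user: invisible with 10 users, painful with 10,000.
    return {uid: db.orders_for(uid) for uid in user_ids}

def batched(db, user_ids):
    # The fix: one query for everything, grouped in memory.
    return db.orders_for_many(user_ids)
```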
Auditability problems look like documentation failures. A module does something important, but its name doesn't say what. Functions have side effects that aren't mentioned anywhere. The dependency graph has circular references that make it impossible to reason about what depends on what.
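Circular dependencies are mechanically detectable: model modules as a graph and look for a back edge with a depth-first search. A minimal sketch, assuming the import graph has already been extracted:

```python
def has_cycle(graph):
    """graph: {module: [modules it imports]}. True if any import cycle exists."""
    visiting, done = set(), set()

    def visit(module):
        visiting.add(module)
        for dep in graph.get(module, []):
            if dep in visiting:
                return True  # back edge: a circular dependency
            if dep not in done and visit(dep):
                return True
        visiting.discard(module)
        done.add(module)
        return False

    return any(m not in done and visit(m) for m in graph)
```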
Five different kinds of problems. Five different ways to miss them. If you're scanning the code yourself, you're probably good at catching one or two of these. The others slip through.
What you get instead
The /code-audit pipeline runs five specialist auditors in parallel — one for each dimension — and then grades every finding by severity. The output is nine reports.
```mermaid
graph TD
A["/code-audit"] --> B["project-scanner"]
B --> C1["consistency-auditor"]
B --> C2["repetition-detector"]
B --> C3["security-auditor"]
B --> C4["pattern-optimizer"]
B --> C5["auditability-assessor"]
C1 --> D["audit-grader"]
C2 --> D
C3 --> D
C4 --> D
C5 --> D
D --> E["00 Executive Summary"]
D --> F["06 Graded TODO"]
C1 --> G["01 Consistency"]
C2 --> H["02 Repetition"]
C3 --> I["03 Security"]
C4 --> J["04 Patterns"]
C5 --> K["05 Auditability"]
```
Reports 01–05 are the specialist audits. Each one examines the entire codebase through a single lens. The consistency auditor checks naming conventions, file organization, error handling patterns, import styles, and formatting. The repetition detector finds duplicated logic, copy-pasted code, and DRY violations. The security auditor checks for hardcoded secrets, injection vectors, unvalidated inputs, and dependency CVEs. The pattern optimizer looks for anti-patterns, async handling issues, state management problems, and performance traps. The auditability assessor evaluates module clarity, comment quality, circular dependencies, and test coverage on critical paths.
Report 06 is the deliverable. The audit grader reads all five specialist reports, classifies every finding by severity (Critical, Major, Minor, Informational), scores each pillar on a 0–100 scale, and produces a prioritized remediation TODO. This is the list you hand to a developer and say "work through this top to bottom."
Report 00 is the executive summary — the scorecard, the verdict, and the top findings. There are four verdict levels: Good Standing (ship it), Acceptable (ship with awareness), Needs Work (fix before shipping), and Critical (stop and fix now).
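The grading logic can be sketched in a few lines. The severity ordering comes straight from the report description above; the verdict thresholds, however, are invented for illustration — the real grader's cut-offs aren't documented here.

```python
# Severity ordering from the pipeline: Critical outranks everything.
SEVERITY_RANK = {"Critical": 0, "Major": 1, "Minor": 2, "Informational": 3}

def prioritized_todo(findings):
    """findings: [(severity, description)] -> worst first, ready to hand off."""
    return sorted(findings, key=lambda f: SEVERITY_RANK[f[0]])

def verdict(score):
    """Map a 0-100 pillar score to a verdict. Thresholds are illustrative."""
    if score >= 85:
        return "Good Standing"
    if score >= 70:
        return "Acceptable"
    if score >= 50:
        return "Needs Work"
    return "Critical"
```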
Report 07 is the activity log — everything the auditors observed while working.
Report 08 is the delta report. This only appears on re-audits.
Why five auditors instead of one
A single pass through the code catches surface problems. Five focused passes catch structural problems.
The consistency auditor doesn't care about security. The security auditor doesn't care about naming conventions. Each one goes deep on its dimension instead of going wide on everything. The pattern optimizer finds N+1 queries that a security auditor would skip. The auditability assessor finds circular dependencies that a consistency auditor wouldn't flag.
The grader is the synthesis layer. It reads all five reports and asks: given everything these auditors found, what should the developer fix first? A Critical security finding outranks a Major consistency finding. A pattern that causes the same problem in three different audits gets escalated. The prioritization is cross-cutting — no single auditor could produce it.
The delta cycle
The most valuable feature isn't the first audit. It's the second one.
After the first audit, you get a TODO list. You fix things. You run /code-audit again. Now the delta reporter kicks in and produces a comparison: scores went up in consistency and security, down in auditability (a refactor introduced new complexity), and the repetition count dropped by 40%. At this rate, Good Standing is an estimated two cycles away.
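The core of a delta report is just per-pillar arithmetic between two audit runs. A minimal sketch, with pillar names and scores invented to mirror the example above:

```python
def delta(before, after):
    """Per-pillar score movement between two audits (positive = improved)."""
    return {pillar: after[pillar] - before[pillar] for pillar in before}

# Illustrative scores only:
before = {"consistency": 55, "security": 42, "auditability": 70}
after = {"consistency": 68, "security": 71, "auditability": 64}
delta(before, after)  # security up 29, auditability down 6
```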
This turns code quality from a subjective opinion into a measured trajectory. You can show the delta report to a lead or a client and say: "We were at 42/100 on security. We're at 71 now. Two more cycles and we're at Good Standing."
When to run it
Before a release. Especially if the codebase hasn't been audited before. The graded TODO tells you what's safe to ship and what isn't.
After a major refactor. Refactors fix some problems and introduce others. The audit catches the others.
On a cadence. Monthly or quarterly. The delta reports show trajectory. Code quality either improves or decays — there's no steady state.
When onboarding a new team member. Hand them the audit report. It's the most honest description of the codebase's current state.
Skip it on throwaway prototypes or projects under 500 lines. The overhead isn't worth it for code that won't live long.
Resources
Pipeline reference: /code-audit — 9 reports, 7 sub-skills, full report table and output tree.
Key skills in this pipeline:
- audit-grader — scores findings, produces the prioritized TODO
- security-auditor — CVEs, injection, hardcoded secrets
- delta-reporter — tracks improvement between audit cycles
Related reading:
- 02. What a real security audit looks like — the dedicated, deep security review
- 03. Documenting projects that don't document themselves — pair with code audit for a full health picture
Download: Full toolkit (252KB) — all 16 commands, all 11 skills, installs in 30 seconds.
Part of the Claude Skills Library.