Specialist Over Generalist: Why We Build 24 Agents Instead of One
#worksona #agents #architecture #specialization #ai #design
David Olsson
The dominant mental model for deploying AI in organizations is the assistant model: one capable, general AI that answers questions, writes documents, and performs tasks. The appeal is obvious: simplicity, a single integration point, and the intuition that a smarter model does more.
We built the opposite. The Worksona platform coordinates 24 specialized agents. The Dante pharmaceutical pipeline decomposes document processing into four distinct stages. The organizational simulation uses role-specific agents derived from individual behavioral profiles. The survey toolchain has separate tools for pool design, administration, analysis, and simulation.
This is not a quirk of the implementation. It is a deliberate architectural position.
What specialists do better
Domain depth. A generalist model prompted to analyze a pharmaceutical patent produces a competent but shallow response. It knows what patents are; it knows chemistry in a general sense. A specialist agent with a focused system prompt, access to a curated knowledge graph of the relevant compound space, and calibrated temperature settings for structured extraction produces something categorically different: a structured record with bounding boxes, confidence scores, and links to prior art. The same model, radically different outputs.
Honest failure modes. Generalist models degrade gracefully toward confident wrongness when pushed outside their competence. Specialist agents fail loudly and specifically. When a structure-extraction agent cannot identify a chemical structure, it returns a structured error with a failure reason. When a delegation agent cannot route a task, it escalates rather than fabricates. Specialists know their boundaries and enforce them.
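To make "fail loudly and specifically" concrete, here is a minimal Python sketch of a structured error for a structure-extraction step. The names `ExtractionResult`, `FailureReason`, and `extract_structure`, the candidate format, and the 0.8 threshold are all illustrative assumptions, not the actual Worksona or Dante interfaces.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class FailureReason(Enum):
    """Hypothetical failure reasons a structure-extraction agent might report."""
    NO_STRUCTURE_FOUND = "no_structure_found"
    LOW_CONFIDENCE = "low_confidence"


@dataclass(frozen=True)
class ExtractionResult:
    smiles: Optional[str] = None
    confidence: float = 0.0
    failure: Optional[FailureReason] = None

    @property
    def ok(self) -> bool:
        return self.failure is None


def extract_structure(
    candidates: list[tuple[str, float]], min_confidence: float = 0.8
) -> ExtractionResult:
    """Pick the best (smiles, confidence) candidate or return an explicit failure.

    `candidates` stands in for whatever a recognizer would emit (an assumption).
    """
    if not candidates:
        return ExtractionResult(failure=FailureReason.NO_STRUCTURE_FOUND)
    smiles, confidence = max(candidates, key=lambda c: c[1])
    if confidence < min_confidence:
        # Refuse to guess: report the reason instead of returning a weak answer.
        return ExtractionResult(confidence=confidence, failure=FailureReason.LOW_CONFIDENCE)
    return ExtractionResult(smiles=smiles, confidence=confidence)
```

The point of the sketch is that a caller never receives a low-confidence SMILES string dressed up as an answer; it receives either a result or a named reason it can route on.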
Independent evolution. When we improve the SMILES validation logic in the Dante pipeline, we change one service. The structure-extraction step and the property-calculation step are unaffected. When we improve a generalist agent, we potentially change behavior across every task it handles.
Composability. Specialists combine freely. The bounding-box editor in dante-bronte passes crops to dante-decimer, which passes SMILES to dante-smiles. Each stage is independently testable, independently deployable, and independently replaceable. A generalist cannot be composed with itself.
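The hand-off described above can be sketched as plain function composition. The three functions below are hypothetical stand-ins for dante-bronte, dante-decimer, and dante-smiles (the real services and their interfaces are not shown in this article); only the shape of the chain is the point.

```python
from typing import Any, Callable

Stage = Callable[[Any], Any]


def compose(*stages: Stage) -> Stage:
    """Chain stages left to right; each stage's output feeds the next."""
    def pipeline(value: Any) -> Any:
        for stage in stages:
            value = stage(value)
        return value
    return pipeline


def crop_regions(page: dict) -> list[str]:
    """Stand-in for dante-bronte: emit one crop id per bounding box."""
    return [f"{page['id']}/crop-{i}" for i in range(len(page["boxes"]))]


def recognize(crops: list[str]) -> list[str]:
    """Stand-in for dante-decimer: map crops to SMILES via a fixed fake lookup."""
    fake_model = {"p1/crop-0": "CCO", "p1/crop-1": "c1ccccc1"}
    return [fake_model.get(crop, "") for crop in crops]


def validate(smiles: list[str]) -> list[str]:
    """Stand-in for dante-smiles: drop empty strings (a real validator would parse)."""
    return [s for s in smiles if s]


pipeline = compose(crop_regions, recognize, validate)
```

Because each stage is an ordinary callable with a narrow input and output, any one of them can be tested alone or swapped for a different implementation without touching the other two.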
The coordination overhead
The counterargument is real: specialists require coordination. When five agents handle what one agent could handle, you need a layer that routes tasks, manages results, and handles partial failures.
This is why the delegation stack exists. The coordination overhead is a solved problem in the architecture: it is Layer 4's job. The delegation orchestrator selects which specialists to engage and in what order, handles intermediate results, and aggregates outputs. The cost of coordination is paid once, in the orchestration layer, and amortized across all workflows.
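As a rough illustration of what such a layer does, the Python sketch below runs specialists in a given order, passes earlier results forward, and records partial failures instead of aborting. `orchestrate`, `Outcome`, and the specialist signature are assumptions made for this example; they are not the Worksona delegation orchestrator's actual API.

```python
from dataclasses import dataclass, field
from typing import Callable

# A specialist receives the task plus the results gathered so far (assumed shape).
Specialist = Callable[[str, dict[str, str]], str]


@dataclass
class Outcome:
    results: dict[str, str] = field(default_factory=dict)
    failures: dict[str, str] = field(default_factory=dict)


def orchestrate(task: str, plan: list[str], specialists: dict[str, Specialist]) -> Outcome:
    """Run the named specialists in plan order and aggregate what they return."""
    outcome = Outcome()
    for name in plan:
        agent = specialists.get(name)
        if agent is None:
            outcome.failures[name] = "no specialist registered"
            continue
        try:
            # Pass a copy so a specialist cannot mutate the shared results.
            outcome.results[name] = agent(task, dict(outcome.results))
        except Exception as exc:
            # A failing specialist is recorded; the rest of the plan still runs.
            outcome.failures[name] = str(exc)
    return outcome
```

In this sketch the routing decision is just the `plan` list; a real orchestrator would presumably derive that plan from the task itself.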
The spectrum in practice
Not everything warrants a specialist. The Worksona platform maintains a spectrum.
At the specialist end: the Dante pipeline stages, the organizational simulation agents derived from behavioral profiles, the AIMQC inspection pipeline stages. Each of these has a narrow, well-defined responsibility, a specific failure mode, and a clear interface.
At the generalist end: the chat interface agents that handle conversational follow-up, the summary agents that produce human-readable synthesis from structured data. These are deliberately general because their job is to translate structured outputs into human language, not to perform specialized extraction.
The architectural decision is about where to draw the line, and the principle is simple: if a task has specialized data, specialized failure modes, or specialized quality requirements, it warrants a specialist. If it is fundamentally translational or compositional, a well-prompted generalist is sufficient.
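Written out as a predicate, the rule is a simple disjunction. The `TaskProfile` fields below are an illustrative encoding of the three criteria named above, not a structure taken from the platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskProfile:
    """Hypothetical description of a task along the three criteria."""
    specialized_data: bool = False
    specialized_failure_modes: bool = False
    specialized_quality_requirements: bool = False


def warrants_specialist(profile: TaskProfile) -> bool:
    """Any one specialized dimension is enough to justify a specialist."""
    return (
        profile.specialized_data
        or profile.specialized_failure_modes
        or profile.specialized_quality_requirements
    )
```

Under this encoding, a summary agent that only rephrases structured output has no flags set and stays with a generalist, while structure extraction, with its curated data and specific failure reasons, does not.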
Implications for scaling
The specialist architecture scales in a way the assistant architecture does not. Adding a new domain does not require modifying an existing agent. It requires defining a new specialist with its own prompt, its own tools, its own knowledge graph connections, and registering it with the agent registry.
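A minimal sketch of what "defining and registering a new specialist" could look like in Python follows. `AgentSpec`, `AgentRegistry`, and the field names are hypothetical; the article does not show the real registry interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSpec:
    """Hypothetical description of one specialist."""
    name: str
    system_prompt: str
    tools: tuple[str, ...] = ()
    knowledge_graphs: tuple[str, ...] = ()


class AgentRegistry:
    """In-memory registry; adding an agent never modifies existing entries."""

    def __init__(self) -> None:
        self._agents: dict[str, AgentSpec] = {}

    def register(self, spec: AgentSpec) -> None:
        if spec.name in self._agents:
            raise ValueError(f"agent already registered: {spec.name}")
        self._agents[spec.name] = spec

    def get(self, name: str) -> AgentSpec:
        return self._agents[name]

    def names(self) -> list[str]:
        return sorted(self._agents)
```

Rejecting duplicate names is one way to enforce the claim in the text: a new domain arrives as a new entry, and existing agents are left untouched.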
The first 24 agents took the most effort. Each new agent takes less, because the scaffolding (the registry, the orchestration layer, the skill loading system) already exists. The portfolio of agents becomes a portfolio in the financial sense: diversified, independently evolvable, with each new addition reinforcing rather than disrupting the whole.