Touring Emily's 192 Cores
Open emily/core/ and you find 192 Python files. That's not sprawl; it's the anatomy. Emily's cognition is decomposed into small modules because the frameworks that compose her identity need crisp seams. Here's a tour of what lives where and why.
The five families
The cores cluster into five functional families. Most modules belong to exactly one family; a few bridge two.
1. Memory & embedding (the substrate)
- memory.py – L1/L3/L4 tier operations, promotion, decay
- embeddings.py – OpenAI text-embedding-3-large wrapper, 1536-dim vectors
- l3_consolidator.py – near-duplicate collapse at 0.92 cosine
- batched_updates.py – transactional multi-memory writes
- training_memory.py – the seed layer used during Factory Floor genesis
This is the floor Emily stands on. Touch these and you're changing physics.
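To make the consolidation step concrete, here is a minimal sketch of near-duplicate collapse at the 0.92 cosine threshold the article names. The function names and the greedy keep-or-drop strategy are assumptions for illustration, not the actual l3_consolidator.py logic.

```python
import numpy as np

SIMILARITY_THRESHOLD = 0.92  # the consolidation cutoff named in the article

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def consolidate(memories: list[dict]) -> list[dict]:
    """Greedy near-duplicate collapse (illustrative): keep a memory only
    if it is not within the threshold of any already-kept memory."""
    kept: list[dict] = []
    for mem in memories:
        if all(cosine(mem["vec"], k["vec"]) < SIMILARITY_THRESHOLD for k in kept):
            kept.append(mem)
    return kept
```

A greedy pass like this is order-sensitive; a production consolidator would presumably pick a canonical survivor per cluster rather than just keeping the first arrival.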
2. Cognitive frameworks (the scorers)
- math.py – EMEB: epsilon calculation, source trust, gibberish detection
- earl_tracker.py – EARL outcome propagation, 5-turn feedback window
- ecgl_recomputer.py – multi-dimensional scoring (epsilon/outcome/novelty/stability)
- batch_cognitive_tagger.py – async scoring across large memory batches
- metacognition.py – Emily reasoning about her own reasoning
These are the rules of thought. A memory without its framework scores is just text in a table.
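The four-dimensional score that ecgl_recomputer.py attaches to memories can be pictured as a small record type. The field names come from the article; the combining weights below are placeholders I invented, since the real tuning is not published.

```python
from dataclasses import dataclass

@dataclass
class CognitiveScore:
    """Per-memory framework scores, in the shape the article describes
    for ECGL. Weights in composite() are illustrative, not real."""
    epsilon: float    # EMEB signal: surprise, source trust
    outcome: float    # EARL outcome propagation
    novelty: float
    stability: float

    def composite(self, w=(0.4, 0.3, 0.2, 0.1)) -> float:
        """Weighted blend of the four dimensions; real weights are tuned."""
        return (w[0] * self.epsilon + w[1] * self.outcome
                + w[2] * self.novelty + w[3] * self.stability)
```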
3. Orchestration (the router)
- chat_processor.py – the main turn handler
- llm_cognitive_processor.py – LLM routing, context assembly
- claude_mcp_client.py – Claude-specific MCP tool execution
- attention.py – retrieval weighting at read time
- learning_cycle.py – end-of-turn reflection and promotion
- apc_metrics.py – Adaptive Prompt Control telemetry
This family decides what Emily does with a turn. The LLM hands back tokens; these modules decide what to do next.
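"Retrieval weighting at read time" can be sketched as similarity scaled by a per-memory framework score. This is a guess at the shape of attention.py, not its actual code; the "stability" field is borrowed from the ECGL dimensions above and its use here is an assumption.

```python
import numpy as np

def attention_retrieve(query: np.ndarray, memories: list[dict], k: int = 3) -> list[dict]:
    """Illustrative read-time weighting: rank memories by cosine
    similarity to the query, scaled by a framework score."""
    def weight(m: dict) -> float:
        sim = float(np.dot(query, m["vec"]) /
                    (np.linalg.norm(query) * np.linalg.norm(m["vec"])))
        return sim * m.get("stability", 1.0)
    return sorted(memories, key=weight, reverse=True)[:k]
```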
4. Autonomy (Project Helios)
- autonomous.py – the task registry
- autonomous_worker.py – 10-second polling worker
- clone_provisioning_task.py – task template for creating new Emily clones
- clone_safety.py – safety gates on autonomous actions
- reaper.py (in Helios) – crash recovery via lease expiration
When Emily does something on her own, she does it here.
5. Governance & health (the immune system)
- comprehensive_health_check.py – the meta-monitor across all tiers
- behavior_validator.py – checks that Emily's responses match her identity
- coherence_validator.py – checks consistency across memory graphs
- command_validator.py – sandboxes what autonomous execution can run
- authz.py / auth.py (legacy, renamed) – authorization
- attribution.py – provenance tracking for every memory
The immune system is load-bearing. Without it, autonomy is reckless.
The ones that surprise people
A few modules don't fit the families neatly and reward attention:
- academy.py – the new-user onboarding Emily. Runs during Factory Floor genesis to seed L3 with articulation turns, not just data.
- cognitive_tracer.py – a structured log of every cognitive decision. When Emily does something weird, this is where you look.
- canonical_hash.py – deterministic memory fingerprinting, used to detect near-duplicates before they even reach the 0.92 consolidation threshold.
- artifact_service.py – stores generated artifacts (code, documents) Emily produces, with bidirectional links to the memories that motivated them.
- style_earl_integration.py – EARL applied specifically to voice. Emily learns which phrasings land and which don't, per user.
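Deterministic fingerprinting of the canonical_hash.py kind typically means normalizing the payload before hashing, so the same memory always produces the same digest regardless of field order or stray whitespace. This sketch is in that spirit; the normalization rules are assumptions.

```python
import hashlib
import json

def canonical_fingerprint(memory: dict) -> str:
    """Illustrative deterministic fingerprint: strip string whitespace,
    serialize with sorted keys and fixed separators, then SHA-256."""
    normalized = {k: v.strip() if isinstance(v, str) else v
                  for k, v in memory.items()}
    blob = json.dumps(normalized, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()
```

An exact-match fingerprint like this catches literal duplicates cheaply; the 0.92 cosine pass then handles the semantic near-duplicates that hashing cannot see.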
Why not monolithic
A reasonable question: if Emily is ultimately one cognition, why decompose her into 192 modules instead of a few big ones?
Because the frameworks need replaceable parts. EMEB v2 is already on the board; EARL went from v1 to v2 in February 2026; ECGL is tuned repeatedly. If any of these lived inside a 10,000-line cognition.py, upgrading them would mean touching everything that reads their outputs. Small modules with narrow contracts mean you can swap the scoring logic without touching the orchestration logic.
It also means Emily is legible. You can read any one module and understand its job. That's not a luxury; it's the only way to reason about a system that claims to be self-correcting. If you can't read a module and understand what it decides, you can't trust the system that decides.
What's next
Most of the current work is in three areas:
- Sharper ECCR routing – the retrieval layer is still the one most likely to surface "close but wrong" memories.
- Cross-clone knowledge – how much (if anything) one user's Emily can learn from another user's Emily without violating per-user isolation.
- Framework versioning – EMEB v3 is under design, focused on better handling of adversarial inputs.
We'll write about each of those as they land. For now, the map is the map.