
Touring Emily's 192 Cores

#emily-os#architecture#cores#internals

Open emily/core/ and you find 192 Python files. That's not sprawl; it's the anatomy. Emily's cognition is decomposed into small modules because the frameworks that compose her identity need crisp seams. Here's a tour of what lives where and why.

The five families

The cores cluster into five functional families. Most modules belong to exactly one family; a few bridge two.

1. Memory & embedding (the substrate)

  • memory.py — L1/L3/L4 tier operations, promotion, decay
  • embeddings.py — OpenAI text-embedding-3-large wrapper, 1536-dim vectors
  • l3_consolidator.py — near-duplicate collapse at 0.92 cosine
  • batched_updates.py — transactional multi-memory writes
  • training_memory.py — the seed layer used during Factory Floor genesis

This is the floor Emily stands on. Touch these and you're changing physics.
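The 0.92-cosine collapse that l3_consolidator.py performs can be sketched as a single pass that drops any memory too similar to one already kept. The `cosine` and `consolidate` names and the dict shape are illustrative, not Emily's actual API:

```python
import numpy as np

DUPLICATE_THRESHOLD = 0.92  # cosine similarity above which memories collapse (from the text)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def consolidate(memories: list[dict]) -> list[dict]:
    """Keep the first-seen memory of each near-duplicate cluster, drop the rest."""
    kept: list[dict] = []
    for mem in memories:
        if any(cosine(mem["embedding"], k["embedding"]) >= DUPLICATE_THRESHOLD
               for k in kept):
            continue  # near-duplicate of an already-kept memory
        kept.append(mem)
    return kept
```

The quadratic scan is fine for a sketch; at scale you would pre-bucket candidates (which is presumably where canonical_hash.py comes in).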

2. Cognitive frameworks (the scorers)

  • math.py — EMEB: epsilon calculation, source trust, gibberish detection
  • earl_tracker.py — EARL outcome propagation, 5-turn feedback window
  • ecgl_recomputer.py — multi-dimensional scoring (epsilon/outcome/novelty/stability)
  • batch_cognitive_tagger.py — async scoring across large memory batches
  • metacognition.py — Emily reasoning about her own reasoning

These are the rules of thought. A memory without its framework scores is just text in a table.
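As a toy illustration of ECGL-style multi-dimensional scoring, here is one way the four dimensions might fold into a single composite. The dataclass, the weights, and the `composite` method are all invented for illustration; the real tuning is not documented here:

```python
from dataclasses import dataclass

@dataclass
class CognitiveScore:
    epsilon: float    # surprise / prediction error, assumed in [0, 1]
    outcome: float    # downstream usefulness (EARL feedback), assumed in [0, 1]
    novelty: float    # how unlike existing memories this is
    stability: float  # how consistently the memory has held up

    def composite(self, weights=(0.4, 0.3, 0.2, 0.1)) -> float:
        """Weighted blend of the four dimensions (weights are made up)."""
        dims = (self.epsilon, self.outcome, self.novelty, self.stability)
        return sum(w * d for w, d in zip(weights, dims))
```

A linear blend is the simplest choice; "tuned repeatedly" (as the article says of ECGL) is exactly the kind of change that stays inside this one module.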

3. Orchestration (the router)

  • chat_processor.py — the main turn handler
  • llm_cognitive_processor.py — LLM routing, context assembly
  • claude_mcp_client.py — Claude-specific MCP tool execution
  • attention.py — retrieval weighting at read time
  • learning_cycle.py — end-of-turn reflection and promotion
  • apc_metrics.py — Adaptive Prompt Control telemetry

This family decides what Emily does with each turn. The LLM only hands back tokens; these modules turn them into action.
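The turn flow these modules implement can be sketched in a few lines. Everything here (TinyMemory, the keyword search, the string prompt) is a deliberately crude stand-in; the real modules use embeddings, attention weighting, and structured context assembly:

```python
class TinyMemory:
    """Minimal in-memory store standing in for memory.py (illustrative only)."""
    def __init__(self):
        self.items: list[str] = []

    def search(self, query: str) -> list[str]:
        # naive keyword match in place of embedding retrieval
        words = query.lower().split()
        return [m for m in self.items if any(w in m for w in words)]

def handle_turn(user_msg: str, memory: TinyMemory, llm) -> str:
    recalled = memory.search(user_msg)                    # attention.py: what surfaces at read time
    prompt = "\n".join(recalled) + "\nUser: " + user_msg  # context assembly
    reply = llm(prompt)                                   # the LLM only returns tokens
    memory.items.append(f"user said: {user_msg}")         # learning_cycle.py: end-of-turn write
    return reply
```

The point of the sketch is the shape, not the parts: retrieval and reflection bracket the LLM call, so the model never owns the loop.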

4. Autonomy (Project Helios)

  • autonomous.py — the task registry
  • autonomous_worker.py — 10-second polling worker
  • clone_provisioning_task.py — task template for creating new Emily clones
  • clone_safety.py — safety gates on autonomous actions
  • reaper.py (in Helios) — crash recovery via lease expiration

When Emily does something on her own, she does it here.

5. Governance & health (the immune system)

  • comprehensive_health_check.py — the meta-monitor across all tiers
  • behavior_validator.py — checks that Emily's responses match her identity
  • coherence_validator.py — checks consistency across memory graphs
  • command_validator.py — sandboxes what autonomous execution can run
  • authz.py / auth.py — authorization (auth.py is the legacy name)
  • attribution.py — provenance tracking for every memory

The immune system is load-bearing. Without it, autonomy is reckless.
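A minimal sketch of the kind of gate command_validator.py might apply: reject shell metacharacters outright, then check the binary against an allowlist. The specific allowlist and token set here are invented for illustration:

```python
import shlex

ALLOWED_COMMANDS = {"ls", "cat", "git", "python"}          # hypothetical allowlist
FORBIDDEN_TOKENS = {";", "&&", "||", "|", ">", "<", "`", "$("}  # shell injection vectors

def validate_command(cmd: str) -> bool:
    """Return True only for a single allowlisted command with no shell tricks."""
    if any(tok in cmd for tok in FORBIDDEN_TOKENS):
        return False
    try:
        parts = shlex.split(cmd)
    except ValueError:        # unbalanced quotes etc.
        return False
    return bool(parts) and parts[0] in ALLOWED_COMMANDS
```

Allowlisting the binary rather than blocklisting bad ones is the usual design choice here: the failure mode of an incomplete allowlist is a refused command, not an executed one.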

The ones that surprise people

A few modules don't fit the families neatly and reward attention:

  • academy.py — the new-user onboarding Emily. Runs during Factory Floor genesis to seed L3 with articulation turns, not just data.
  • cognitive_tracer.py — a structured log of every cognitive decision. When Emily does something weird, this is where you look.
  • canonical_hash.py — deterministic memory fingerprinting, used to detect near-duplicates before they even reach the 0.92 consolidation threshold.
  • artifact_service.py — stores generated artifacts (code, documents) Emily produces, with bidirectional links to the memories that motivated them.
  • style_earl_integration.py — EARL applied specifically to voice. Emily learns which phrasings land and which don't, per user.
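Deterministic fingerprinting in the spirit of canonical_hash.py can be sketched as normalize-then-hash, so that trivially different renderings of the same memory collide before any embedding work happens. The exact normalization Emily uses is not documented, so this recipe is an assumption:

```python
import hashlib
import unicodedata

def canonical_hash(text: str) -> str:
    """Deterministic fingerprint: Unicode-normalize, casefold, collapse whitespace, hash."""
    norm = unicodedata.normalize("NFKC", text).casefold()
    norm = " ".join(norm.split())  # collapse all runs of whitespace to single spaces
    return hashlib.sha256(norm.encode("utf-8")).hexdigest()
```

Exact-duplicate detection this way is O(1) per memory via a hash index, which is why it can run "before they even reach the 0.92 consolidation threshold": the expensive cosine pass only sees what the cheap hash pass let through.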

Why not monolithic

A reasonable question: if Emily is ultimately one cognition, why decompose her into 192 modules instead of a few big ones?

Because the frameworks need replaceable parts. EMEB v2 is already on the board; EARL went from v1 to v2 in February 2026; ECGL is tuned repeatedly. If any of these lived inside a 10,000-line cognition.py, upgrading them would mean touching everything that reads their outputs. Small modules with narrow contracts mean you can swap the scoring logic without touching the orchestration logic.
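The "narrow contracts" argument can be made concrete with a small sketch: orchestration depends on a scoring protocol, and framework versions swap behind it. The `Scorer` protocol and both toy scorers are invented for illustration:

```python
from typing import Protocol

class Scorer(Protocol):
    """The narrow contract orchestration depends on; versions swap behind it."""
    def score(self, memory_text: str) -> float: ...

class EmebV1:
    def score(self, memory_text: str) -> float:
        return min(len(memory_text) / 100, 1.0)   # toy heuristic

class EmebV2:
    def score(self, memory_text: str) -> float:
        unique = len(set(memory_text.split()))
        return min(unique / 50, 1.0)              # different internals, same contract

def promote(memory_text: str, scorer: Scorer, threshold: float = 0.5) -> bool:
    # Orchestration code like this never changes when the scorer version does.
    return scorer.score(memory_text) >= threshold
```

Upgrading EMEB then means shipping a new class that satisfies `Scorer`, not editing a 10,000-line cognition.py and every caller that reads its outputs.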

It also means Emily is legible. You can read any one module and understand its job. That's not a luxury — it's the only way to reason about a system that claims to be self-correcting. If you can't read a module and understand what it decides, you can't trust the decisions it makes.

What's next

Most of the current work is in three areas:

  • Sharper ECCR routing — the retrieval layer is still the one most likely to surface "close but wrong" memories.
  • Cross-clone knowledge — how much (if anything) one user's Emily can learn from another user's Emily without violating per-user isolation.
  • Framework versioning — EMEB v3 is under design, focused on better handling of adversarial inputs.

We'll write about each of those as they land. For now, the map is the map.