16 Apr 2026 · David Olsson

Atlas: Turning Documents Into Living Simulations

#atlas#devlog#feature#building-in-public


We are building Atlas, a multi-agent opinion simulation engine. The premise is simple: upload real-world documents, describe a scenario, and get back a structured analytical report and a living digital world you can interrogate.

This is the first post in a series documenting the build. Here is what Atlas is, who it is for, and how it works.

The problem we are solving

Before a product launch, a policy rollout, an M&A announcement, or a crisis communications response goes live, teams run surveys, convene focus groups, and model stakeholder reactions. These methods are useful but slow, expensive, and hard to repeat. They also collapse the diversity of a real stakeholder population into averaged responses.

We want a complementary layer: rapid, repeatable simulation that generates structured hypotheses about opinion dynamics before anything is committed. We call this pre-commitment intelligence — observing how a diverse population might react before the scenario plays out in the real world.

Atlas does not replace research. It gives teams a faster, cheaper way to stress-test assumptions and surface unexpected dynamics before resources are spent.

Who it is for

The primary use cases are organizations navigating situations where public or stakeholder opinion carries material consequences:

  • Product launches and market entry
  • Policy rollouts and regulatory announcements
  • M&A and financial events
  • Brand campaigns and PR crisis pre-testing
  • Internal communications testing before organizational changes

Researchers studying computational social science — opinion formation, information propagation, platform dynamics — are a secondary audience. Atlas logs every agent action to JSONL, giving you a full dataset to analyze.
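Because every agent action lands in a JSONL file, the simulation output is trivially machine-readable. A minimal sketch of working with such a log, assuming a hypothetical record shape (Atlas's actual field names may differ):

```python
import json

# Hypothetical record shape; the real Atlas JSONL schema may differ.
sample_log = [
    '{"agent_id": 7, "platform": "twitter", "action": "repost", "sim_time": 3600}',
    '{"agent_id": 2, "platform": "reddit", "action": "upvote", "sim_time": 3700}',
    '{"agent_id": 7, "platform": "twitter", "action": "post", "sim_time": 4100}',
]

def actions_by_agent(lines):
    """Group parsed JSONL records by agent for downstream analysis."""
    grouped = {}
    for line in lines:
        record = json.loads(line)
        grouped.setdefault(record["agent_id"], []).append(record["action"])
    return grouped

print(actions_by_agent(sample_log))  # {7: ['repost', 'post'], 2: ['upvote']}
```

The same one-record-per-line format loads directly into pandas or any streaming tool, which is what makes it attractive for the research audience.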

How it works

The workflow is five sequential steps.

Step 1: Graph Build. You upload source documents — PDF, Markdown, plain text, DOCX, CSV, or Excel files — and write one sentence describing the scenario you want to explore. Atlas uses an LLM to design a bespoke ontology tailored to your documents, then extracts entities and relationships into a local SQLite knowledge graph. A 3D force-directed graph visualization appears in real time as entities are discovered.
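The "knowledge graph in SQLite" idea is simpler than it sounds: entities and relationships are just two relational tables, and a join recovers the edges. A minimal sketch — the table and column names here are assumptions, and Atlas's real schema is richer:

```python
import sqlite3

# Minimal sketch of a SQLite-backed knowledge graph (schema is assumed).
conn = sqlite3.connect(":memory:")  # Atlas uses one .db file per project
conn.executescript("""
CREATE TABLE entities (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    entity_type TEXT NOT NULL   -- type labels come from the LLM-designed ontology
);
CREATE TABLE relationships (
    source_id INTEGER REFERENCES entities(id),
    target_id INTEGER REFERENCES entities(id),
    relation TEXT NOT NULL
);
""")
conn.execute("INSERT INTO entities VALUES (1, 'Acme Corp', 'Organization')")
conn.execute("INSERT INTO entities VALUES (2, 'Jane Doe', 'Person')")
conn.execute("INSERT INTO relationships VALUES (2, 1, 'works_for')")

# Recover an edge as (source, relation, target):
row = conn.execute("""
SELECT s.name, r.relation, t.name
FROM relationships r
JOIN entities s ON s.id = r.source_id
JOIN entities t ON t.id = r.target_id
""").fetchone()
print(row)  # ('Jane Doe', 'works_for', 'Acme Corp')
```

Keeping the graph in SQLite means the whole project state is one inspectable file, with no graph-database service to run.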

Step 2: Environment Setup. The extracted entities become simulation agents. Each agent gets a full persona: bio, MBTI personality type, prior stance on the topic, personal stakes, reaction triggers, communication style, and platform-appropriate follower or karma counts drawn from power-law distributions. The system also generates seed posts to bootstrap the conversation and two scheduled turning-point events — narrative twists that get injected at roughly the 24-hour and 48-hour marks of the simulated timeline.
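Power-law follower distributions matter because real social platforms are dominated by a small number of highly influential accounts. One standard way to draw such counts is inverse-transform sampling from a Pareto distribution — a sketch under assumed parameters, not Atlas's actual sampling code:

```python
import random

def powerlaw_follower_count(alpha=2.0, x_min=10, rng=random):
    """Draw a follower count from a Pareto-style power law:
    most agents get few followers, a handful become highly influential.
    Parameters are illustrative assumptions."""
    # Inverse-transform sampling of a continuous Pareto, floored to an int.
    u = rng.random()
    return int(x_min * (1.0 - u) ** (-1.0 / (alpha - 1.0)))

rng = random.Random(42)
counts = [powerlaw_follower_count(rng=rng) for _ in range(1000)]
print(max(counts), sorted(counts)[len(counts) // 2])  # heavy tail vs. modest median
```

The heavy tail is what lets a single well-connected agent dominate amplification on the Twitter-like platform, which averaged follower counts would never reproduce.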

Step 3: Simulation. A dual-platform parallel simulation runs: the same agent population participates on both a Twitter-like and a Reddit-like platform simultaneously, but the mechanics differ. Twitter emphasizes reposting and follower-driven amplification; Reddit emphasizes upvote/downvote dynamics and threaded discussion. Every action is logged. The knowledge graph updates in real time as agents act — it evolves from its initial document-derived state to include simulation-derived opinions and relationships.
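The reason the same population diverges across the two platforms is that visibility is computed differently. A deliberately toy contrast (the real OASIS mechanics are more involved; these function names are assumptions):

```python
# Toy contrast of the two visibility models (names are illustrative).

def twitter_reach(author_followers, reposter_followers):
    """Follower-driven amplification: each repost adds the reposter's audience."""
    return author_followers + sum(reposter_followers)

def reddit_rank(upvotes, downvotes):
    """Score-driven visibility: threads rise or sink with net votes."""
    return upvotes - downvotes

print(twitter_reach(500, [2000, 150]))  # 2650 — one influential repost dominates
print(reddit_rank(48, 12))              # 36 — follower counts play no role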

Step 4: Report. A ReACT-based Report Agent generates a structured multi-section analytical report through iterative Thought-Tool-Observation-Answer cycles. It has five specialized tools: deep entity analysis, broad graph traversal, fast keyword search, live agent interviews, and simulation analytics (activity curves, turning-point detection, platform divergence scores). Sections stream incrementally to the frontend. The full reasoning trace is visible and auditable.
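The Thought-Tool-Observation-Answer cycle is a standard ReACT control loop: the model either emits a tool call, whose result is appended to the transcript as an observation, or emits a final answer. A minimal self-contained sketch — the stub "LLM", the tool name, and the parsing are all illustrative assumptions, not Atlas's real interfaces:

```python
# Minimal ReACT loop sketch. The stand-in LLM and tool are assumptions.

def stub_llm(history):
    """Stand-in policy: call one tool, then answer."""
    if not any(step.startswith("Observation:") for step in history):
        return "Tool: keyword_search(query='turning point')"
    return "Answer: engagement spiked after the first scheduled event."

TOOLS = {"keyword_search": lambda query: f"3 posts matched '{query}'"}

def react_loop(question, max_steps=5):
    history = [f"Thought: how do I answer '{question}'?"]
    for _ in range(max_steps):
        move = stub_llm(history)
        if move.startswith("Answer:"):
            return move, history
        # Parse "Tool: name(arg='...')" — deliberately crude for the sketch.
        name = move.split(":", 1)[1].strip().split("(", 1)[0]
        arg = move.split("'")[1]
        history.append(move)
        history.append(f"Observation: {TOOLS[name](arg)}")
    return "Answer: step budget exhausted.", history

answer, trace = react_loop("What drove the spike?")
```

The `history` list is exactly the auditable reasoning trace mentioned above: every tool call and observation is recorded in order, so a reader can verify how the agent reached its answer.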

Step 5: Interaction. Chat with the Report Agent to ask follow-up questions, or switch to direct conversation with any individual simulation agent. Each responds in character with their full persona and in-simulation memory.

The tech

Atlas is built on Flask 3 and Vue 3. The simulation engine is OASIS by the CAMEL-AI team, which handles multi-agent social interaction at scale. The knowledge graph is a local SQLite-backed store — no external graph database required, one .db file per project. Visualization is D3.js. LLM calls go through any OpenAI-compatible endpoint (OpenAI, Ollama, Qwen, Azure, and others all work). Docker-ready.
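"Any OpenAI-compatible endpoint" works because the chat-completions request shape is standardized: only the base URL and model name change. A sketch of what that buys you (the base URLs below are examples of common deployments, not Atlas configuration):

```python
# Any OpenAI-compatible server exposes POST {base_url}/chat/completions
# with the same request body; only base_url and model differ.

def chat_request(base_url, model, prompt):
    """Build a standard OpenAI-style chat-completions request."""
    return {
        "url": f"{base_url.rstrip('/')}/chat/completions",
        "json": {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        },
    }

# The same code targets a local Ollama server or OpenAI's hosted API:
local = chat_request("http://localhost:11434/v1", "qwen2.5", "Hello")
cloud = chat_request("https://api.openai.com/v1", "gpt-4o", "Hello")
```

This is why swapping between local models for development and hosted frontier models for production is a configuration change, not a code change.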

A few things we are particularly pleased with: per-category model routing (the high-volume simulation step can use a cost-efficient model while the report step uses a frontier model), real-time LLM cost monitoring streamed via SSE, and the file-system persistence model — all state is stored as JSON, JSONL, Markdown, and SQLite files, inspectable with standard tools.
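Per-category routing can be as simple as a lookup table keyed by call category. A sketch with assumed category and model names (Atlas's actual configuration format may differ):

```python
# Per-category model routing sketch; category and model names are assumptions.
ROUTES = {
    "simulation": {"model": "gpt-4o-mini", "temperature": 0.9},  # high volume, cheap
    "report": {"model": "gpt-4o", "temperature": 0.2},           # low volume, frontier
}
DEFAULT = {"model": "gpt-4o-mini", "temperature": 0.7}

def route(category):
    """Resolve model settings for an LLM call category."""
    return ROUTES.get(category, DEFAULT)

print(route("report")["model"])      # frontier model for report generation
print(route("simulation")["model"])  # cost-efficient model for bulk agent turns
```

Since simulation steps can number in the thousands of calls while the report is a handful, routing by category is where most of the cost savings come from.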

Atlas was forked from MiroFish (Shanda Group) and substantially rewritten: English prompts throughout, richer agent personas, power-law social graphs, simulation analytics, scheduled narrative events, per-category model routing, real-time cost monitoring, and a local SQLite knowledge graph replacing the original Zep Cloud dependency.

Where we are

Atlas is a functional prototype. The five-step pipeline works end-to-end. LLM integration, per-category model routing, Docker deployment, and the configuration system are production-ready. Code quality, error handling, and documentation are partial. There is no automated test suite and no auth layer yet. The honest summary: strong prototype, not production-hardened.

We are building in public. Subsequent posts will go deeper on specific components — the knowledge graph design, the ReACT report agent, the IPC architecture for the OASIS subprocess, and where we are taking this next.

If any of this is relevant to work you are doing, we would like to hear from you.
