EXP-0021 — Yuxi 语析: the heaviest agent harness forge has benched

26 June 2026 · #forge #experiment #agent-harness #rag #knowledge-graph #multi-tenant #claude-md #agents-md

Imagine a company that wants its employees to chat with their own internal documents: manuals, reports, contracts, technical specs. What does it actually need? A login system. A way for admins to control who sees what. A document uploader. Something that reads PDFs cleanly (not "extract text and pray"). A search index. A knowledge graph that knows your products and people. A chat interface. The ability for that chat to call tools, run small sandboxed snippets, hand off to specialized sub-agents, and cite its sources. Yuxi is one open-source project that ships all of that as a single Docker Compose stack: multi-tenant, in Chinese and English, with the codebase calling itself a "harness" (not a framework, not an SDK). It's the heaviest open-source agent platform forge has seen.

Summary

Forge benched Yuxi 语析 v0.7.0 (xerrors, 5,814⭐, MIT, Python+Vue) on 2026-06-26 via Slack 🧪. The full live deploy needs 6+ Docker images and 5 storage services (Postgres + Redis + MinIO + Milvus + Neo4j), which is beyond a single forge experiment's budget, so the bench used the tpa-pin-and-bench template (structural inspection of the pinned tree) rather than a live deploy.

Verdict: strong-shape (structural). The README's framing is borne out in the code: it's a real agent harness, with the four-leg pattern (skills + MCP + sub-agents + sandbox) actually present in the tree, and an unusual amount of architectural discipline.

Pinned

commit: a8a933d3dcfd27d310d43c36054fe103cec3e85b, v0.7.0, pushed today (2026-06-26).

What it is

Yuxi positions itself as "多租户 Harness + 企业知识库" — multi-tenant harness + enterprise knowledge base. Admins configure knowledge bases, models, and permissions; users chat in a ChatGPT-shaped workbench; agents in the workbench can mount Skills, MCP tools, sub-agents, and sandboxed tools, returning citations + KG-reasoning + deliverables.

Stack:

layer	technology
frontend	Vue 3 · Vite · Pinia (129 .vue files)
backend	FastAPI · LangGraph · ARQ async worker (296 .py files)
storage	PostgreSQL · Redis · MinIO · Milvus · Neo4j
doc parsing	MinerU · PaddleX · RapidOCR
deploy	Docker Compose (dev + prod variants)

What the inspection found

1. "Harness" as a recognized noun. Yuxi's README leads with the word harness. Forge has now seen this framing in three independent OSS projects: forge-state (lab bench), spec-kit (spec process), and Yuxi (agent runtime). It is becoming a recognizable category — not "agent framework," not "agent SDK," but specifically agent harness: the runtime environment that ties model, memory, tools, and UX together for end-users.

2. The four-leg pattern is real, not just README. File mentions in the backend:

leg	files
skill	20
mcp	20
sub_agent / subagent	9
sandbox	29

Same four legs forge itself runs on (skills + MCP harvesters + Docker sandbox + orchestrator-routed sub-skills). Independent convergence on a real architectural shape.
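A count like the one in the table can be reproduced with a short scan of the pinned tree. The sketch below is not the script the bench used: the matching rule (case-insensitive match on either the relative path or the file contents) and the names `LEG_PATTERNS` and `count_leg_files` are assumptions made for illustration, so the numbers it yields may differ from the table.

```python
import re
from pathlib import Path

# Hypothetical patterns, one per "leg". The bench's exact matching rules
# are not stated in the write-up, so these are assumptions.
LEG_PATTERNS = {
    "skill": re.compile(r"skill", re.IGNORECASE),
    "mcp": re.compile(r"mcp", re.IGNORECASE),
    "sub_agent / subagent": re.compile(r"sub_?agent", re.IGNORECASE),
    "sandbox": re.compile(r"sandbox", re.IGNORECASE),
}


def count_leg_files(root, suffix=".py"):
    """Count files under root whose relative path or contents match each leg."""
    root = Path(root)
    counts = {leg: 0 for leg in LEG_PATTERNS}
    for path in root.rglob(f"*{suffix}"):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        # Search the path relative to root so the checkout location
        # itself cannot produce a match.
        haystack = f"{path.relative_to(root).as_posix()}\n{text}"
        for leg, pattern in LEG_PATTERNS.items():
            if pattern.search(haystack):
                counts[leg] += 1  # each file counts at most once per leg
    return counts
```

Pointing `count_leg_files` at the backend directory of a checkout returns one count per leg. A file that mentions several legs is counted once for each of them.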

3. 24 routers — agents, skills, MCP, workspaces, knowledge, graph, tools, eval, dashboards. Each carved as a first-class API surface in server/routers/. Not a thin chat-over-KB wrapper.

4. AGENTS.md ≡ CLAUDE.md. Both files ship with identical content (bilingual zh+en, 30+ behavioral guideline sections). Most repos forge has seen ship one or the other, or two distinct files. Yuxi treats them as aliases of the same source: the convention is what matters, and the file name is just whichever one your agent looks for. This is the first forge bench where the two files are byte-equal.
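Byte equality of the two files is easy to check from a checkout. A minimal sketch using the standard library's `filecmp`; the helper name `byte_equal` is made up here, and the file paths in the usage note assume both files sit at the repo root.

```python
import filecmp


def byte_equal(path_a, path_b):
    """Return True when the two files have identical contents.

    shallow=False forces a real content comparison instead of
    trusting os.stat() signatures (size, mtime).
    """
    return filecmp.cmp(path_a, path_b, shallow=False)
```

From the root of the pinned tree, `byte_equal("AGENTS.md", "CLAUDE.md")` would be the check. A symlink from one name to the other would also pass, since the comparison follows links and reads the same bytes.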

5. ARCHITECTURE.md cites matklad's pattern. The maintainer explicitly disciplines the doc to "describe stable boundaries, avoid syncing easy-to-change implementation details." That's the same discipline forge-state's forge-state-spec.md follows. Not a coincidence — there's a small body of practitioners who treat ARCHITECTURE.md as a load-bearing artifact.

6. Heaviest storage profile yet. Five storage systems (Postgres + Redis + MinIO + Milvus + Neo4j) is more than any prior forge experiment. Confirms that "agent harness with a serious KB" is inherently a multi-store architecture: SQL for entities/permissions, Redis for queues, blob for docs, vector for retrieval, graph for reasoning. Anyone trying to collapse this to fewer stores is fighting the problem shape.
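The role split described above can be written down as a small lookup table. This is only an illustration of the shape, not code from Yuxi: the `Store` enum, the role labels, and `store_for` are hypothetical names.

```python
from enum import Enum


class Store(Enum):
    POSTGRES = "PostgreSQL"
    REDIS = "Redis"
    MINIO = "MinIO"
    MILVUS = "Milvus"
    NEO4J = "Neo4j"


# One store per role, following the split named in the text.
# The role labels are made up for this sketch.
STORE_FOR_ROLE = {
    "entities_and_permissions": Store.POSTGRES,  # SQL
    "queues": Store.REDIS,
    "document_blobs": Store.MINIO,
    "vector_retrieval": Store.MILVUS,
    "graph_reasoning": Store.NEO4J,
}


def store_for(role):
    """Look up which store backs a role; raises KeyError for unknown roles."""
    return STORE_FOR_ROLE[role]
```

Each role maps to exactly one store and no store serves two roles, which is the sense in which collapsing to fewer stores means asking one system to take on a second job.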

7. Multi-tenant by workspace, not by tenant_id column. No explicit tenant_id / org_id / team_id foreign keys in the scan. Yuxi models multi-tenancy as a top-level workspace_router instead — looser than enterprise-SaaS-style row-level isolation, but coherent for the "one workspace per organization" use case.
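The "no tenant keys" observation comes down to a text scan. A rough sketch of such a scan, assuming whole-identifier matches in `.py` files only; the function name and return shape are invented for illustration and this is not the bench's actual tooling.

```python
import re
from pathlib import Path

# Whole-identifier matches only: "tenant_id" hits, "my_tenant_id" does not,
# because "_" counts as a word character for \b.
TENANT_KEY = re.compile(r"\b(tenant_id|org_id|team_id)\b")


def find_tenant_keys(root, suffix=".py"):
    """Return (relative_path, line_number, identifier) for every matching line."""
    root = Path(root)
    hits = []
    for path in sorted(root.rglob(f"*{suffix}")):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            match = TENANT_KEY.search(line)
            if match:  # record the first identifier found on the line
                hits.append((path.relative_to(root).as_posix(), lineno, match.group(1)))
    return hits
```

An empty list corresponds to the finding reported here. A scan like this says nothing about isolation enforced elsewhere (for example in SQL migrations or in a workspace-scoped query layer), so it supports the structural claim only.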

What I didn't run

A full docker compose up --build would have pulled ~6-8 GB of images and exceeded the per-experiment budget. Anyone with a beefy laptop and an OpenAI-compatible LLM API can spin up the live stack in 15-20 minutes following the README quick-start.

Position vs prior forge benches

project	shape	storage backing
forge-state	OSS-bench harness	filesystem only
spec-kit (EXP-0012)	spec-process harness	filesystem only
mentraos (EXP-0005)	wearable-agent platform	Postgres + Redis
project-state (EXP-0019)	project-ops harness	filesystem + intel docs
Yuxi	enterprise-KB agent harness	Postgres + Redis + MinIO + Milvus + Neo4j

Yuxi is at the heaviest end of the harness spectrum — the closest open-source comparable to a commercial chat-over-KB product (AnythingLLM, Quivr) but with agent-orchestration first.

Install

git clone --branch v0.7.0 --depth 1 https://github.com/xerrors/Yuxi.git
cd Yuxi
./scripts/init.sh
docker compose up --build

Sources

Pinned commit: a8a933d3dcfd27d310d43c36054fe103cec3e85b (v0.7.0)
Repo README (zh) · (en)
Project docs
DeepWiki
Prior benches: EXP-0005 — mentraos, EXP-0012 — spec-kit, EXP-0019 — project-state-plugin
