17 Apr 2026 · David Olsson
Worksona

Worksona API: The Agent Runtime That the Rest of the Platform Builds On

#worksona#portfolio#agent-infrastructure#rest-api#memory#local-first


A production AI agent system needs more than an LLM call. It needs memory that does not grow without bound. It needs agents that adapt their communication style to the person they are talking to. It needs prompt engineering to be a managed artifact, not scattered strings in application code. Worksona API addresses all three in a single server that runs locally with only three runtime dependencies.

What It Is

Worksona API (v2.0.0) is a Node.js/Express REST server that functions as a complete AI agent platform. It fulfills three roles simultaneously: an LLM proxy that normalizes requests and responses across Anthropic, OpenAI, and Google into a single consistent shape; an agent platform hosting 24 pre-built specialized agents across development, design, research, marketing, strategy, and data domains; and a skills API providing CRUD management for structured prompt instruction files loaded at runtime from the filesystem.

The server runs on port 3000 with a minimal footprint — Express, cors, and dotenv are the only runtime dependencies. The same codebase deploys as a local development server or wraps into a Netlify Functions serverless deployment without code changes.

No external database is required. Browser-side persistence uses IndexedDB. Server-side state is in-process RAM. The platform is fully operational on a developer laptop, on a corporate network with data residency requirements, or in a serverless cloud environment.

Why It Matters

Three innovations in Worksona API are worth understanding individually, because each represents a departure from how most LLM application frameworks handle the same problem.

Adaptive personality as a runtime system. Most agent frameworks treat behavior as static: you write a system prompt once and it governs all responses. The personality engine in Worksona API reads signals from incoming user messages — vocabulary complexity, question depth, emotional register — and continuously adjusts four parameters: formality, technicality, verbosity, and empathy. The same agent definition produces appropriate responses for a domain expert and a non-technical user without any explicit configuration or prompt modification. This is a runtime system, not a prompt.
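A minimal sketch of what signal-driven adjustment could look like. The signal heuristics, function names, and weights here are illustrative assumptions, not Worksona's actual implementation; only the four parameter names come from the source.

```typescript
type Personality = {
  formality: number;
  technicality: number;
  verbosity: number;
  empathy: number;
};

// Hypothetical adjustment step: read crude signals from one incoming
// message and nudge the four parameters, clamped to [0, 1].
function adjust(p: Personality, message: string): Personality {
  const words = message.split(/\s+/).filter(Boolean);
  const avgLen =
    words.reduce((sum, w) => sum + w.length, 0) / Math.max(words.length, 1);
  // Stand-in for vocabulary-complexity analysis.
  const technical = avgLen > 6 ? 0.1 : -0.1;
  // Stand-in for emotional-register detection.
  const empathic = /\b(frustrated|confused|stuck|help)\b/i.test(message) ? 0.2 : 0;
  const clamp = (x: number) => Math.min(1, Math.max(0, x));
  return {
    formality: clamp(p.formality + technical * 0.5),
    technicality: clamp(p.technicality + technical),
    verbosity: clamp(p.verbosity - empathic * 0.5),
    empathy: clamp(p.empathy + empathic),
  };
}
```

The point is the shape, not the heuristics: the agent definition stays fixed while a per-conversation parameter vector drifts with the user's messages.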

Episode-based memory with importance scoring. Flat conversation history grows without bound and degrades retrieval quality over time. The episodic memory system structures memory as discrete, bounded episodes, each with an explicit importance score. When capacity approaches its limit, low-importance episodes are pruned automatically. High-signal context is preserved. A knowledge graph layer adds a second representation — facts, entities, and relationships extracted from episodes stored as a queryable graph — so agents can retrieve specific prior context without scanning full history.
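The pruning behavior described above can be sketched in a few lines. The class and field names are assumptions for illustration; the real memory system also maintains the knowledge graph layer, which is omitted here.

```typescript
interface Episode {
  id: string;
  importance: number; // explicit score assigned when the episode is stored
  summary: string;
}

// Bounded episode store: when capacity is exceeded, the lowest-importance
// episodes are dropped so high-signal context survives.
class EpisodicMemory {
  private episodes: Episode[] = [];
  constructor(private capacity: number) {}

  add(ep: Episode): void {
    this.episodes.push(ep);
    if (this.episodes.length > this.capacity) {
      this.episodes.sort((a, b) => b.importance - a.importance);
      this.episodes.length = this.capacity; // prune the tail
    }
  }

  recall(minImportance: number): Episode[] {
    return this.episodes.filter((e) => e.importance >= minImportance);
  }
}
```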

JSON Schema-validated agent definitions. Agent definitions are JSON documents validated against a published schema. This means an agent is a portable artifact: the same definition loads into the framework, passes CI schema tests, transfers between projects, or publishes to a registry. The AgentLoader handles validation, dependency resolution, and automatic registry registration. Agents are not code — they are data that the framework interprets.
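To make the "agents are data" idea concrete, here is a hypothetical agent definition and a minimal structural check standing in for full JSON Schema validation. The field names are invented for illustration; the actual schema is the one published with Worksona API.

```typescript
// Illustrative agent definition — a portable JSON document, not code.
const agent = {
  id: "research-analyst",
  domain: "research",
  personality: { formality: 0.7, technicality: 0.8, verbosity: 0.5, empathy: 0.4 },
  skills: ["research/summarize", "research/cite-sources"],
};

// Toy stand-in for schema validation: a real loader would validate
// against the published JSON Schema (e.g. with ajv) before registering.
function isValidAgent(a: unknown): boolean {
  if (typeof a !== "object" || a === null) return false;
  const o = a as Record<string, unknown>;
  return (
    typeof o.id === "string" &&
    typeof o.domain === "string" &&
    Array.isArray(o.skills)
  );
}
```

Because the definition is plain data, the same object can be committed to a repo, diffed in review, and schema-tested in CI before it ever reaches a runtime.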

How It Works

The /api/llm endpoint is the core integration point. It accepts a normalized request — provider, model, messages, system prompt, temperature, max tokens — and returns a normalized response with content, usage, model, and stop_reason. Callers never deal with Anthropic's content block array versus OpenAI's choices array. The normalization is the abstraction.
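A sketch of what that normalization might look like on the response side, assuming simplified provider payloads. The raw shapes below are abbreviated versions of the real Anthropic and OpenAI response formats; the normalized shape follows the fields named above.

```typescript
type NormalizedResponse = {
  content: string;
  model: string;
  stop_reason: string;
};

// Collapse provider-specific response shapes into one consistent form,
// so callers never branch on Anthropic content blocks vs. OpenAI choices.
function normalize(provider: "anthropic" | "openai", raw: any): NormalizedResponse {
  if (provider === "anthropic") {
    return {
      content: raw.content.map((block: any) => block.text).join(""),
      model: raw.model,
      stop_reason: raw.stop_reason,
    };
  }
  const choice = raw.choices[0];
  return {
    content: choice.message.content,
    model: raw.model,
    stop_reason: choice.finish_reason,
  };
}
```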

Skills are .skill.md files organized in skills/<category>/ subdirectories. The server loads them recursively at startup, parsing YAML front matter for title, description, category, and tags. Skills are mutable at runtime via full CRUD endpoints — add, update, or delete without restarting the server. This separates prompt engineering from application code: iteration on a skill requires editing a markdown file, not a deployment.
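A minimal front-matter parser for the flat `key: value` case, to show the file format at work. This is a simplified sketch; the actual loader presumably uses a full YAML parser and handles nested values and tag lists.

```typescript
// Split a .skill.md file into YAML front matter (flat key: value pairs
// only, for this sketch) and the markdown instruction body.
function parseFrontMatter(md: string): {
  meta: Record<string, string>;
  body: string;
} {
  const match = md.match(/^---\n([\s\S]*?)\n---\n?([\s\S]*)$/);
  if (!match) return { meta: {}, body: md };
  const meta: Record<string, string> = {};
  for (const line of match[1].split("\n")) {
    const i = line.indexOf(":");
    if (i > 0) meta[line.slice(0, i).trim()] = line.slice(i + 1).trim();
  }
  return { meta, body: match[2] };
}
```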

The developer control panel provides a live browser-side view of agent state, memory contents, tool calls, and personality parameters during development. An SSE event stream exposes agent execution events in real time.
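On the client side, consuming such a stream reduces to parsing SSE frames. A sketch of the frame-parsing step, following the standard `data:` line format; the event payload fields are illustrative, not Worksona's actual event schema.

```typescript
// Extract the data payload from one SSE frame ("data: ..." lines,
// frames separated by a blank line), per the server-sent events format.
function parseSseFrame(frame: string): string {
  return frame
    .split("\n")
    .filter((line) => line.startsWith("data:"))
    .map((line) => line.slice(5).trim())
    .join("\n");
}
```

In a browser, the same stream would more typically be consumed with `EventSource`, which performs this parsing automatically.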

Where It Fits in Worksona

Worksona API is the foundational runtime layer. It defines the agent programming model that the rest of the portfolio builds toward or builds on.

Worksona Studio takes the same multi-agent orchestration concepts and expresses them as a visual canvas for non-developer users. worksona-mcp-server exposes agent capabilities via the Model Context Protocol for Claude Desktop integration. worksona-kg extends the knowledge graph component from the memory system into a full RAG platform. The delegator variants evolve the delegation and coordination patterns. Each of those projects assumes the existence of a coherent agent model — Worksona API is where that model is defined.

Version 2.0.0 marks the stabilization of the core patterns: episodic memory, adaptive personality, the skills format, and MCP integration. Those patterns are not incidental features. They represent considered answers to real limitations in how LLM applications typically handle memory, behavior adaptation, and prompt management. The API is the reference implementation. Everything else in the portfolio either consumes it or extends it.


Live: api.worksona.io
