EXP-0013 — Agentic Resource Discovery: turning Hugging Face's launch into a working package
#forge#huggingface#ard#agentic-discovery#article-as-spec#python#MIT#MCP-adjacent
David OlssonUpdate (2026-06-24):
ard-toolsnow has its own canonical home at github.com/worksona/ard-tools — clone, fork, file issues, install viapip install git+https://github.com/worksona/ard-tools. The companion gist remains the frozen reproducibility anchor for this writeup. Forge'sforge-packagerskill now codifies the promote-to-repo rule for forge-original installable artifacts; see the follow-up post on repos vs gists.
For the layman
When you give an AI agent a task, the agent often needs tools to complete it — a calculator if the task involves math, a search tool if it needs to find facts, an image generator if it needs to produce a picture. Today, those tools have to be pre-installed by whoever set up the agent. If you want the agent to do something nobody anticipated, you're stuck.
Hugging Face just launched a system called Agentic Resource Discovery (ARD) that changes this. The idea: every organization that publishes tools (Hugging Face, OpenAI, you, a research lab, anybody) hosts a small catalog file describing what they have. An agent that needs a new tool searches across all the published catalogs at runtime, finds the best match, and uses it — no install, no setup, no asking a human. It's like the difference between only being able to install apps that came with your phone vs. being able to search the App Store.
ARD is a specification, not a product — anyone can implement it. Hugging Face shipped a reference implementation called the "Discover Tool" that searches across the HF Hub. The launch was announced via a blog post in their huggingface/blog repo.
Forge is our experiment harness. Because the source here is a blog post (not a clonable codebase), forge applied the new article-as-spec template — instead of just summarizing the spec, we tried to build the smallest runnable system the spec describes. The result is a Python package called ard-tools with 12 passing tests. It includes:
- A schema definition for what a valid
ai-catalog.jsonlooks like. - A validator (CLI:
ard-validate yourcatalog.json) that checks any catalog against the spec. - A minimal search engine that takes a query and ranks capabilities across multiple catalogs.
Anyone who wants to publish an ARD catalog or run an ARD search can pip-install this package and have a working starting point in seconds. The Hugging Face reference implementation is more sophisticated (it uses embeddings instead of lexical matching), but the contract — what an ARD catalog has to look like, what a search call has to return — is what we operationalized.
Status: experimented, result success. Source was the Hugging Face blog post launching ARD — not a clonable repo. Forge applied the article-as-spec template (added in EXP-0006) and shipped a runnable Python package, ard-tools, that operationalizes both halves of the spec (the static ai-catalog.json manifest and the dynamic POST /search ranking API). 12/12 tests pass. The CLI validates a sample catalog. The search demo returns the expected ranking.
This is a forge writeup of the Agentic Resource Discovery launch post by Hugging Face.
Canonical home: github.com/worksona/ard-tools. Frozen anchor: gist.github.com/worksona/c03eb6c9….
Install + use
pip install git+https://github.com/worksona/ard-tools
ard-validate path/to/ai-catalog.json
TL;DR
- Source:
huggingface/blog/agentic-resource-discovery-launch.md— Hugging Face's announcement of ARD as an open specification. - Spec in one paragraph. ARD defines two components: (1) a static
ai-catalog.jsonmanifest that any publisher hosts at a stable URL, listing theirskills,mcp-servers,agents,ml-apps, andtoolswithid,name,description,tags; (2) a dynamicPOST /searchendpoint that takes a natural-language query and returns ranked capabilities aggregated across catalogs. - What forge shipped.
ard-tools— a minimal, MIT-licensed, zero-runtime-dependency Python package that ships the schema, the validator, and a lexical-overlap scorer that stands in for the embedding ranker the HF reference uses. - Tests: 12/12 pass on
python:3.12in under 0.1 s. Coverage: schema rejection paths (missing fields, bad enum, extra fields), zero-overlap and tag-overlap scoring, multi-catalog aggregation, top-k ordering. - End-to-end demo.
ard-validate examples/example-catalog.json→OK: ai-catalog.json @ ARD 0.1, publisher=Demo Lab, 2 capabilities.python examples/demo.py→0.40 Demo Lab Fact Check Agentfor a query like "verify a claim with evidence."
What ARD actually is
The launch post describes a federated, search-first model for agent capabilities:
-
The capability manifest. Each publisher hosts an
ai-catalog.jsonat a stable URL. The file lists every skill, MCP server, agent, ML app, or tool the publisher offers, with enough metadata for a search engine to rank it. Forge's schema:- Top level:
ard_version,publisher({name,url,contact}),capabilities[],updated. - Per capability:
id,kind(one ofskill,mcp-server,agent,ml-app,tool),name,description, optionaltags,url,license,version,metadata. - Strict: no unexpected top-level fields, no unexpected fields per capability.
- Top level:
-
The search registry. A
POST /searchendpoint accepts a natural-language query and returns ranked capabilities. The HF reference implementation lives athttps://huggingface-hf-discover.hf.space/searchand exposes the same shape via thehf discover searchCLI command. -
The contract. Anything that consumes ARD doesn't care whether the search ranker uses embeddings, BM25, or lexical overlap; whether the catalogs are hosted on HF, GitHub Pages, or someone's homelab; whether the publisher is Hugging Face, OpenAI, or a research lab. The shape of the request and the shape of the response are the spec.
What forge built
ard-tools is the smallest runnable system that lets you publish an ARD catalog and search across catalogs. Four modules:
src/ard_tools/
├── schema.py JSON schemas for ai-catalog.json and individual capabilities
├── validate.py Pure-Python validator with path-precise error messages
├── search.py Lexical-overlap scorer + multi-catalog aggregator
└── __init__.py Public API
The deliberate non-features:
- No embedding model. The lexical scorer is a stand-in. Real implementations swap it for
sentence-transformersor an API call. The function signature stays the same. - No HTTP server in this v0.1. A FastAPI shim around
search_across_catalogsis ~30 lines; left as the obvious follow-up. - No external schema library. Pure-Python validator, ~50 lines. Keeps the package zero-runtime-dependency.
What the demo shows
$ ard-validate examples/sample-catalog.json
OK: ai-catalog.json @ ARD 0.1, publisher=Demo Lab, 2 capabilities
$ python examples/demo.py
0.40 Demo Lab Fact Check Agent
The query was "verify a claim with evidence". The catalog had two capabilities: a Calculator (tagged arithmetic, math) and a Fact Check Agent (tagged verify, evidence, described as "Verify claims against trusted sources"). The scorer correctly ranked the Fact Check Agent above the Calculator with a 40% overlap score — a real lexical-search behavior that mirrors what the embedding-based HF reference implementation does on the same query shape.
Tests
$ pytest -q tests/
............ [100%]
12 passed in 0.08s
Breakdown:
| group | tests | what it proves |
|---|---|---|
TestValidation | 5 | A valid manifest passes; missing required fields raise with the field name; bad kind enums are rejected; unexpected top-level fields are rejected; capabilities must be an array. |
TestScoring | 4 | Empty query → 0. Exact word match → > 0. Unrelated query → 0. Tag-only matches score above 0. |
TestSearchAcrossCatalogs | 3 | Top-k results are sorted descending. No matches → empty. Multi-catalog aggregation merges publishers correctly. |
The "unexpected field rejected" test is the one that matters most — it's what enforces forward-compatibility discipline. If the spec adds new fields, publishers update; in the meantime, rogue fields don't silently slip into manifests.
How forge bench-tested it
pip install -e .[dev] # zero runtime deps, pytest only
pytest -q tests/ # 12 passed in 0.08s
ard-validate sample-catalog.json
python examples/demo.py
Everything happens inside a python:3.12 Docker container, no host environment, no secrets. The package would install identically on any developer's machine.
Why this matters
The "agent + tools" stack has been fragmenting for the last 18 months. MCP standardized one piece (the protocol for an agent to call a tool). What MCP didn't standardize: how an agent finds tools it doesn't already know about. ARD is the answer to that question, and the answer is the right shape — federated, queryable, no central authority needed.
Forge's contribution here isn't research — it's operationalization. The Hugging Face blog post is the spec; this package is the smallest runnable system that anyone can pip-install today and immediately have a working ARD catalog publisher and search consumer. Three months from now, when someone reads the HF blog and wants to publish their own catalog, they should land on this package and be productive in 30 seconds.
Comparables
| Project | Posture |
|---|---|
| Hugging Face Discover Tool | The reference implementation. Hosted. Uses embeddings. |
| MCP (Model Context Protocol) | The protocol for calling tools, not for finding them. ARD is the discovery layer; MCP is the invocation layer. |
| Anthropic's tools catalog | Vendor-specific, not federated. |
| OpenAPI / Swagger | Adjacent but for HTTP APIs, not agent capabilities. |
Reproducibility
| canonical repo | https://github.com/worksona/ard-tools |
| frozen gist | https://gist.github.com/worksona/c03eb6c9166dfc5fb763404e52eceedf |
| source | https://github.com/huggingface/blog/blob/main/agentic-resource-discovery-launch.md |
| source type | article-as-spec |
| base image | python:3.12 |
| install | pip install git+https://github.com/worksona/ard-tools |
| tests | pytest -q tests/ — 12 passed in 0.08 s |
| validator | ard-validate <path> — pure-Python, no external schema lib |
| scorer | lexical-overlap, zero ML deps |
See also
- EXP-0006 — Agentic RL — where the
article-as-spectemplate was introduced. EXP-0013 is the second use. - EXP-0012 — GitHub Spec Kit — same forge run, parallel "specs become executable" theme from GitHub.
- Meet forge — the operationalization rule (decision tree).
Built and verified by forge. The blog post is the seed; the package is the deliverable. The next time someone wants to publish or consume an ARD catalog, they get a working package, not just a reading list.
Canonical repo (clone + fork) · Companion gist (frozen anchor)