The service

Foundry Oracle

Point the Oracle at your repo. It mines your merged PRs into fixtures, scores your corpus against them, and opens pull requests with measurable improvements. Your keys stay with you. Your corpus stays in your repo.

How it works

1
PR merges

Webhook fires. Oracle reads the PR, ticket, and review comments.

2
Fixture generated

Task + golden diff + review Q&A become a regression fixture in .foundry/fixtures/.

3
Artificer runs

Your corpus + the fixture task. Isolated git worktree. Produces an attempt.

4
Oracle scores

Diff match + LLM quality + review-comment satisfaction. Composite score, per-rubric attribution.

5
Gap detected

A pattern of low scores in one category? Oracle proposes a corpus change.

6
PR back

Oracle opens a PR against your repo with the corpus change and the predicted score delta.
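
As a rough illustration of steps 4 and 5, the sketch below combines three rubric scores into a composite with per-rubric attribution, then flags categories whose average composite is low. The weights, the 0-to-1 score scale, the threshold, the category labels, and every name here (`WEIGHTS`, `FixtureResult`, `composite`, `find_gaps`) are assumptions made for the example, not the Oracle's actual scoring.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean

# Hypothetical rubric weights; the real weighting is not stated on this page.
WEIGHTS = {"diff_match": 0.4, "llm_quality": 0.3, "review_satisfaction": 0.3}


@dataclass
class FixtureResult:
    fixture_id: str
    category: str          # assumed label, e.g. "migrations"
    rubric_scores: dict    # rubric name -> score in [0, 1] (assumed scale)


def composite(result):
    """Return (composite score, per-rubric weighted contributions)."""
    attribution = {
        name: weight * result.rubric_scores[name]
        for name, weight in WEIGHTS.items()
    }
    return sum(attribution.values()), attribution


def find_gaps(results, threshold=0.6, min_fixtures=3):
    """Return {category: mean composite} for categories scoring below threshold.

    Categories with fewer than min_fixtures results are ignored so that a
    single bad run does not count as a pattern.
    """
    by_category = defaultdict(list)
    for result in results:
        score, _ = composite(result)
        by_category[result.category].append(score)

    gaps = {}
    for category, scores in by_category.items():
        average = mean(scores)
        if len(scores) >= min_fixtures and average < threshold:
            gaps[category] = average
    return gaps
```

Keeping the attribution next to the composite means a low score can be traced to the rubric that caused it, which is the kind of signal a proposed corpus change would need.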

The eval team

Physical isolation via git branches and a dedicated guardian. No credential tricks, no instruction-scoped visibility. The Artificer under test can't peek at the golden diff.

The Artificer under test is your own harness — whichever agent system you run locally. The Oracle team doesn't build. It measures.

Bring your own infrastructure

Your repo is the source of truth

The Oracle writes to .foundry/ in your repo. Corpus, fixtures, analytics — all yours. If you stop paying, you keep everything.
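
To make the "all yours" point concrete, here is a minimal sketch of a fixture stored as a plain file under .foundry/fixtures/. The JSON format, the field names, and the helper `write_fixture` are assumptions for illustration; the actual on-disk layout is not described here.

```python
import json
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class Fixture:
    fixture_id: str      # hypothetical identifier, e.g. derived from the PR
    task: str            # task description drawn from the PR and ticket
    golden_diff: str     # the merged diff, used as the reference answer
    review_qa: list = field(default_factory=list)  # [{"question": ..., "answer": ...}]


def write_fixture(repo_root, fixture):
    """Serialize a fixture to <repo_root>/.foundry/fixtures/<fixture_id>.json."""
    out_dir = Path(repo_root) / ".foundry" / "fixtures"
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{fixture.fixture_id}.json"
    path.write_text(json.dumps(asdict(fixture), indent=2), encoding="utf-8")
    return path
```

Because the result is an ordinary file in the repository, it can be versioned, diffed, and reviewed like any other change, and it remains readable without the hosted service.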

Your keys, your LLM

The Oracle uses your API keys for evaluation runs. Keys pass through per-run. We never store them.
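
One way to picture per-run pass-through is a key that exists only in the environment of a single evaluation process and is never written to disk. The sketch below shows that pattern with the Python standard library; `run_eval` and the variable name `LLM_API_KEY` are hypothetical, and this is not a description of how the hosted orchestrator is implemented.

```python
import os
import subprocess
import sys


def run_eval(cmd, api_key, key_var="LLM_API_KEY"):
    """Run one evaluation command with the key visible only to that child process."""
    # Copy the parent environment and add the key; os.environ itself is not modified.
    child_env = {**os.environ, key_var: api_key}
    return subprocess.run(cmd, env=child_env, capture_output=True, text=True, check=True)


if __name__ == "__main__":
    # Stand-in for a real evaluation command: reports whether the key is visible.
    probe = [sys.executable, "-c", "import os; print('LLM_API_KEY' in os.environ)"]
    print(run_eval(probe, api_key="example-key").stdout.strip())
```

When the child process exits, the key goes with it; nothing in this sketch persists it.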

Minimal hosted surface

We run the job orchestrator and an eval-history database for trend analysis. Everything else lives in your repo.

Join the closed beta.

We're onboarding a small group of design partners first. Leave your email and we'll reach out when a slot opens.