Foundry Oracle
Point the Oracle at your repo. It mines your merged PRs into fixtures, scores your corpus against them, and opens pull requests with measurable improvements. Your keys stay with you. Your corpus stays in your repo.
How it works
Webhook fires. Oracle reads the PR, ticket, and review comments.
Task + golden diff + review Q&A become a regression fixture in .foundry/fixtures/.
Your harness runs the fixture task against your corpus in an isolated git worktree and produces an attempt.
The attempt is scored on diff match, LLM-judged quality, and review-comment satisfaction. You get a composite score with per-rubric attribution.
Pattern of low scores on a category? Oracle proposes a corpus change.
Oracle opens a PR against your repo with the corpus change and the predicted score delta.
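To make the scoring step concrete, here is a minimal sketch of a composite score with per-rubric attribution. The rubric names, the weights, and the `composite_score` function are assumptions for illustration only; the Oracle's actual weighting is not described here.

```python
from dataclasses import dataclass

# Hypothetical rubric weights, chosen only for this example.
WEIGHTS = {"diff_match": 0.5, "llm_quality": 0.3, "review_satisfaction": 0.2}


@dataclass
class Score:
    composite: float
    attribution: dict[str, float]  # each rubric's weighted share of the composite


def composite_score(rubric_scores: dict[str, float],
                    weights: dict[str, float] = WEIGHTS) -> Score:
    """Combine per-rubric scores in [0, 1] into a weighted average."""
    total_weight = sum(weights.values())
    # Attribution: how much each rubric contributes to the final number.
    attribution = {
        name: weights[name] * rubric_scores[name] / total_weight
        for name in weights
    }
    return Score(composite=sum(attribution.values()), attribution=attribution)


score = composite_score(
    {"diff_match": 0.8, "llm_quality": 0.5, "review_satisfaction": 1.0}
)
```

Because the attribution values sum to the composite, a low composite can be traced back to the rubric that dragged it down.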
The eval team
Physical isolation via git branches and a dedicated guardian. No credential tricks, no instruction-scoped visibility. The Artificer under test can't peek at the golden diff.
ORACLE
Scores attempts against the golden. Diagnoses root causes. Proposes corpus changes.
The Artificer under test is your own harness — whichever agent system you run locally. The Oracle team doesn't build. It measures.
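As a rough illustration of running an attempt in its own checkout, the sketch below creates a detached git worktree at a chosen commit using the standard `git worktree add` command. The function names, paths, and the idea of starting from a base commit are assumptions, not the Oracle's documented behavior, and a worktree alone does not provide the guardian-enforced isolation described above.

```python
import subprocess
from pathlib import Path


def worktree_add_command(repo: Path, dest: Path, base_commit: str) -> list[str]:
    """Build the git command that checks out base_commit into a separate directory."""
    return [
        "git", "-C", str(repo),
        "worktree", "add", "--detach", str(dest), base_commit,
    ]


def create_attempt_worktree(repo: Path, dest: Path, base_commit: str) -> Path:
    """Create the worktree; raises CalledProcessError if git fails."""
    subprocess.run(worktree_add_command(repo, dest, base_commit), check=True)
    return dest
```

Calling `create_attempt_worktree(Path("."), Path("/tmp/attempt-1"), "<base commit>")` would give the harness a working directory separate from your main checkout.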
Bring your own infrastructure
The Oracle writes to .foundry/ in your repo. Corpus, fixtures, analytics: all yours. If you stop paying, you keep everything.
The Oracle uses your API keys for evaluation runs. Keys pass through per-run. We never store them.
We run the job orchestrator and an eval-history database for trend analysis. Everything else lives in your repo.
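One way to picture per-run key pass-through is a key that exists only in the environment of a single child process and is never written to disk. This is a sketch under that assumption; `run_with_key` and the variable name `FOUNDRY_EVAL_API_KEY` are hypothetical and not part of any documented interface.

```python
import os
import subprocess
import sys


def run_with_key(command: list[str], api_key: str,
                 var_name: str = "FOUNDRY_EVAL_API_KEY") -> str:
    """Run one command with the key injected only into the child's environment."""
    # Copy the parent environment; the parent process itself is not modified.
    child_env = {**os.environ, var_name: api_key}
    result = subprocess.run(
        command, env=child_env, capture_output=True, text=True, check=True
    )
    return result.stdout


# Stand-in for an evaluation run: a child process that reports whether it sees the key.
probe = [
    sys.executable, "-c",
    "import os; print('FOUNDRY_EVAL_API_KEY' in os.environ)",
]
output = run_with_key(probe, api_key="example-key")
```

When the child process exits, the key goes with it; nothing in this sketch persists it.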
Join the closed beta.
We're onboarding a small group of design partners first. Leave your email and we'll reach out when a slot opens.