Catches the places where your CLAUDE.md, READMEs, and *.md docs claim something the code no longer backs up. Checks docs against the actual codebase and reports what's stale, wrong, or missing.
$ brew install Arthur920/tap/staleguard
The architecture
Each layer is cheaper and higher-signal than the next, so most drift is caught before any model runs. Layer 1 is instant and needs nothing; Layers 2–3 run local ONNX models — code never leaves the machine.
Layer 1: Do paths exist? Are commands real? Are config keys present? Architecture rules are parsed from prose and checked against the real import graph. No ML; tuned for zero false positives.
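To make the "paths exist?" check concrete, here is a minimal sketch of one deterministic rule. It is not Staleguard's implementation: the function names, the backtick convention, and the "looks like a path" heuristic are all assumptions made for illustration.

```python
import re
import tempfile
from pathlib import Path

# Assumption: path claims appear as backticked tokens without whitespace.
BACKTICKED = re.compile(r"`([^`\s]+)`")


def looks_like_path(token: str) -> bool:
    # Hypothetical heuristic: contains a slash or ends in a short extension.
    return "/" in token or re.search(r"\.[A-Za-z0-9]{1,5}$", token) is not None


def missing_paths(doc_text: str, repo_root: Path) -> list:
    """Return path-like tokens in doc_text that do not exist under repo_root."""
    missing = []
    for token in BACKTICKED.findall(doc_text):
        if looks_like_path(token) and not (repo_root / token).exists():
            missing.append(token)
    return missing


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "src").mkdir()
        (root / "src" / "main.rs").write_text("fn main() {}\n")
        doc = "Entry point is `src/main.rs`; settings live in `config/app.toml`."
        # Only the second path is absent from the temporary repo.
        print(missing_paths(doc, root))
```

A rule like this either finds the file or it does not, which is why a check of this kind can aim for zero false positives without any model.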
Layer 2: For each surviving claim, local embeddings fetch the most relevant code chunks, symbol-aligned via tree-sitter, with an optional reranker.
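The retrieval step can be pictured as a nearest-neighbour lookup over chunk embeddings. The sketch below assumes cosine similarity and uses hand-written toy vectors and hypothetical symbol names. The real embedding model, the tree-sitter chunking, and the reranker are not shown.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(claim_vec, chunks, k=2):
    """Return the names of the k chunks most similar to the claim.

    chunks is a list of (symbol_name, vector) pairs; both are illustrative.
    """
    scored = [(cosine(claim_vec, vec), name) for name, vec in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:k]]


if __name__ == "__main__":
    # Toy 3-dimensional "embeddings" for three hypothetical symbols.
    chunks = [
        ("parse_config", [0.9, 0.1, 0.0]),
        ("render_html", [0.0, 0.2, 0.9]),
        ("load_env", [0.7, 0.6, 0.1]),
    ]
    claim = [1.0, 0.2, 0.0]
    print(top_k(claim, chunks))
```

Only the retrieved chunks are passed on as evidence, which keeps the more expensive judging step small.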
Layer 3: A code-aware NLI cross-encoder judges each (evidence, claim) pair as supported, contradicted, or unverifiable, each with a confidence.
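One plausible way to turn NLI scores into the three verdicts is a threshold-plus-margin rule, in the spirit of the STALEGUARD_NLI_THRESHOLD and STALEGUARD_NLI_MARGIN settings. The sketch below is an assumption, not the tool's actual rule. The default values are placeholders, and the symmetric treatment of "supported" is a guess.

```python
def verdict(scores, threshold=0.5, margin=0.1):
    """Map NLI class probabilities to (label, confidence).

    scores: dict with 'entailment', 'contradiction', and 'neutral' keys.
    threshold and margin are placeholder values, not Staleguard's defaults.
    """
    entail = scores["entailment"]
    contra = scores["contradiction"]
    # Alert only when contradiction is strong AND clearly beats entailment.
    if contra >= threshold and contra - entail >= margin:
        return ("contradicted", contra)
    # Assumed mirror-image rule for support.
    if entail >= threshold and entail - contra >= margin:
        return ("supported", entail)
    # Anything ambiguous is left unverified rather than raised as an alert.
    return ("unverifiable", scores["neutral"])


if __name__ == "__main__":
    print(verdict({"entailment": 0.10, "contradiction": 0.80, "neutral": 0.10}))
    print(verdict({"entailment": 0.45, "contradiction": 0.50, "neutral": 0.05}))
```

Requiring a margin as well as a threshold means a near-tie between contradiction and entailment never raises an alert, which trades some recall for fewer false alarms.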
Underneath sits a drift ledger that makes runs incremental, scores alignment, and gates CI on regressions.
What it catches
Commands (npm run, make) with no matching script or target
--diff <ref> re-checks only what changed
The Layer 3 judge
The default judge is Arthur920/staleguard — a microsoft/unixcoder-base fine-tune. Code-aware, so real code stays in-distribution as the premise. The alert class — contradictions — is what we optimize for.
Get started
The default build gives you Layer 1 — the deterministic, zero-false-positive core that needs no models. Add the ml feature for Layers 2–3.
Homebrew, an install script, or from source — all give you the deterministic core. Then run check on the full repo.
    # Homebrew (macOS / Linux)
    $ brew install Arthur920/tap/staleguard
    # or from source
    $ cargo install --git https://github.com/Arthur920/Staleguard
    $ staleguard check   # full repo, Layer 1
Layers 2–3 run local ONNX models behind the ml feature (prebuilt binaries omit it — the deps are large). Both routes compile from source, then fetch models at runtime.
    # Homebrew: compiles with the ml feature
    $ brew install Arthur920/tap/staleguard-ml
    # or with cargo
    $ cargo install --git https://github.com/Arthur920/Staleguard --features ml
    $ staleguard setup             # fetch + load every model, offline thereafter
    $ staleguard check --layer 3   # all three layers
staleguard check exits non-zero on any finding or a score regression — a drop-in for any pipeline. Commit a baseline on main, then gate PRs against it.
    # once, on the base branch: records the alignment baseline
    $ staleguard check --write-ledger
    # in CI on each PR: fail only if alignment regressed
    $ staleguard check --fail-on-regression --format json
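The regression gate can be thought of as comparing a stored baseline score with the score of the current run. The sketch below is illustrative only. The score definition (the share of claims not contradicted) and the function names are assumptions, and the real ledger format is not shown here.

```python
def alignment_score(verdicts):
    """Share of claims that are not contradicted (assumed definition)."""
    if not verdicts:
        return 1.0  # nothing claimed, nothing drifted
    not_contradicted = sum(1 for v in verdicts if v != "contradicted")
    return not_contradicted / len(verdicts)


def gate_exit_code(baseline_score, current_verdicts):
    """Return 1 (fail the build) only when alignment dropped below baseline."""
    current = alignment_score(current_verdicts)
    return 1 if current < baseline_score else 0


if __name__ == "__main__":
    baseline = 0.75  # hypothetical value recorded on the base branch
    same = ["supported", "supported", "contradicted", "unverifiable"]
    worse = ["supported", "contradicted", "contradicted", "unverifiable"]
    print(gate_exit_code(baseline, same))   # score unchanged
    print(gate_exit_code(baseline, worse))  # score dropped
```

Gating on the change rather than on the absolute score lets a repository with existing drift adopt the check without first fixing every old finding.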
Staleguard speaks --format json, so any coding agent can run it and read findings back — directly via shell, or exposed as an MCP check_doc_drift tool.
    # let the agent call it directly
    $ staleguard check --format json --diff main
    # a good standing instruction in CLAUDE.md:
    # "After editing code or docs, run staleguard check
    #  --format json and fix any reported drift."
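For an agent or script that shells out to the tool, a thin wrapper can capture the exit code and parse whatever JSON comes back. The sketch below assumes the JSON is written to stdout and makes no assumption about its schema. The `run_json` helper is hypothetical, and the demo uses a stand-in command so it runs without Staleguard installed.

```python
import json
import subprocess
import sys

# The invocation shown above, as an argument list.
STALEGUARD_CMD = ["staleguard", "check", "--format", "json", "--diff", "main"]


def run_json(cmd):
    """Run cmd and return (exit_code, parsed_stdout_or_None).

    The payload is returned as-is; no field names are assumed.
    """
    proc = subprocess.run(cmd, capture_output=True, text=True)
    try:
        payload = json.loads(proc.stdout) if proc.stdout.strip() else None
    except json.JSONDecodeError:
        payload = None  # output was not valid JSON
    return proc.returncode, payload


if __name__ == "__main__":
    # Stand-in for STALEGUARD_CMD: prints JSON and exits non-zero.
    demo = [sys.executable, "-c",
            "import sys; print('{\"ok\": false}'); sys.exit(1)"]
    code, payload = run_json(demo)
    print(code, payload)
```

Because the check exits non-zero on findings, the caller should read the exit code as a signal and not as a crash, then inspect the payload to decide what to fix.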
| Variable | Effect |
|---|---|
| STALEGUARD_NLI_REPO | NLI judge model repo (default Arthur920/staleguard) |
| STALEGUARD_NLI_THRESHOLD, STALEGUARD_NLI_MARGIN | decision thresholds: how far contradiction must out-score entailment |
| STALEGUARD_NLI_MAX_CLAIMS | per-run claim budget (default 300; 0 = no cap) |
| STALEGUARD_EMBED_REPO | Layer 2 embedding model |
| STALEGUARD_RERANK_REPO | optional reranker |
| STALEGUARD_ORT_THREADS | ONNX intra-op threads (default: all cores) |