CAPAS compiles supplied evidence into a claim admissibility decision. It does not determine truth — it decides, deterministically, whether the evidence licenses the claim: ACCEPT / REWRITE / REJECT / HOLD, with a re-derivable audit trail and no language model in the verdict. Your model proposes; CAPAS disposes. Built for regulatory reviewers, data & training-data engineers, and journal editors.
CAPAS checks whether your supplied evidence is internally consistent and re-derivable. It does not certify scientific truth, compliance, authorship, or the authenticity of the data — a consistent fabrication still passes (the GIGO ceiling).
CAPAS was designed from the ground up to make scientific claim evaluation auditable, repeatable, and integration-ready.
The same claim + evidence bundle always produces the same verdict. No probabilistic outputs, no model drift, no hallucinations. Schema v3 rules are the single source of truth.
No randomness
Every payload is validated against CAPAS Schema v3 before gating. Field types, required keys, and claim-type-specific constraints are enforced at the gate boundary.
CAPAS Schema v3
Every REJECT and HOLD verdict includes machine-readable provenance blockers (licensing flags, reproducibility gaps, attestation failures), exportable as structured JSON.
Audit-ready
CAPAS does not use language models for gate decisions. Verdicts are computed by deterministic rule functions (26 cross-domain invariant gates plus per-claim-type evidence contracts) against a structured contract, fully offline-capable. Every decision carries a re-derivable audit_hash.
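One way a re-derivable audit hash can work is to hash a canonical serialization of the input bundle together with the verdict. The sketch below is illustrative only: the function name `audit_hash`, the payload fields, and the choice of SHA-256 over sorted-key JSON are assumptions, not the CAPAS implementation.

```python
import hashlib
import json


def audit_hash(payload: dict, verdict: str) -> str:
    """Hash a canonical JSON form of the bundle plus verdict (hypothetical scheme)."""
    canonical = json.dumps(
        {"payload": payload, "verdict": verdict},
        sort_keys=True,             # key order must not change the hash
        separators=(",", ":"),      # no whitespace variation
        ensure_ascii=True,          # stable byte representation
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two bundles with the same content but different key order (hypothetical fields).
bundle_a = {"claim": "X lowers Y", "evidence": {"n": 120, "p_value": 0.03}}
bundle_b = {"evidence": {"p_value": 0.03, "n": 120}, "claim": "X lowers Y"}

hash_a = audit_hash(bundle_a, "ACCEPT")
hash_b = audit_hash(bundle_b, "ACCEPT")
print(hash_a == hash_b)
```

Because the serialization is canonical, anyone holding the same bundle and verdict can recompute the digest and compare it with the one on the receipt.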
From claim draft to admissibility decision in one deterministic pass.
Every gate run returns exactly one of these. Each carries a machine-readable reason trail.
All schema constraints pass. Evidence fully supports the claim as stated. No licensing or reproducibility blockers found.
Evidence is present but the claim overreaches. Direction, scope, or causal language must be corrected before resubmission.
Critical evidence is missing, contradicted, or irreproducible. The claim cannot be licensed in its current form.
Gate cannot complete without external attestation — RO-Crate, CLI verification, or third-party confirmation still required.
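The four outcomes above can be sketched as a small deterministic decision function. This is a minimal illustration under stated assumptions: the `Findings` fields, the `decide` function, and the precedence order (REJECT before HOLD before REWRITE) are hypothetical and are not taken from the CAPAS rule set.

```python
from dataclasses import dataclass
from enum import IntEnum


class Verdict(IntEnum):
    # Ordered by strictness (an assumption made for this sketch).
    ACCEPT = 0
    REWRITE = 1
    HOLD = 2
    REJECT = 3


@dataclass(frozen=True)
class Findings:
    schema_valid: bool
    missing_evidence: tuple = ()      # evidence fields that are absent or contradicted
    pending_attestations: tuple = ()  # external confirmations still required
    overreach_notes: tuple = ()       # direction, scope, or causal-language issues


def decide(f: Findings) -> tuple:
    """Return (verdict, reason trail). No randomness, no model call."""
    if not f.schema_valid or f.missing_evidence:
        return Verdict.REJECT, list(f.missing_evidence) or ["schema validation failed"]
    if f.pending_attestations:
        return Verdict.HOLD, list(f.pending_attestations)
    if f.overreach_notes:
        return Verdict.REWRITE, list(f.overreach_notes)
    return Verdict.ACCEPT, []


verdict, reasons = decide(
    Findings(schema_valid=True, overreach_notes=("causal language unsupported",))
)
print(verdict.name, reasons)
```

Every branch returns a reason list alongside the verdict, which is one way to keep the reason trail machine-readable.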
CAPAS fills a gap that existing tools leave open: structured, deterministic admissibility gating for scientific claims.
Not a fact-checker — CAPAS doesn't verify truth, it validates evidence licensing and schema compliance
Not a plagiarism detector — citation matching is not the concern; admissibility is
Not an LLM wrapper — zero language model calls in the gate decision path
Not a peer review replacement — CAPAS is a pre-gate boundary tool, not editorial judgment
A deterministic claim admissibility gate — same input always produces the same output
A schema enforcement layer — CAPAS Schema v3 defines exactly what a valid evidence package looks like
A provenance audit tool — every blocker is machine-readable and exportable
An integration-ready API — designed to sit inside research pipelines, publishing workflows, and audit systems
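To make the schema enforcement idea concrete, here is a minimal sketch of a fail-closed payload check covering required keys, field types, and claim-type-specific evidence fields. The field names, the claim types, and the `validate` function are all hypothetical; they do not reproduce CAPAS Schema v3.

```python
# Hypothetical top-level contract: key -> expected Python type.
REQUIRED_FIELDS = {"claim_type": str, "claim_text": str, "evidence": dict}

# Hypothetical per-claim-type evidence requirements.
CLAIM_TYPE_EVIDENCE_KEYS = {
    "statistical": {"p_value", "alpha", "n"},
    "diagnostic": {"sensitivity", "specificity", "prevalence"},
}


def validate(payload: dict) -> list:
    """Return a list of error strings; an empty list means the payload may be gated."""
    errors = []
    for key, expected_type in REQUIRED_FIELDS.items():
        if key not in payload:
            errors.append(f"missing required key: {key}")
        elif not isinstance(payload[key], expected_type):
            errors.append(f"wrong type for {key}: expected {expected_type.__name__}")
    if errors:
        return errors  # stop early: later checks assume a well-formed envelope

    required = CLAIM_TYPE_EVIDENCE_KEYS.get(payload["claim_type"])
    if required is None:
        # Fail closed on unknown claim types instead of skipping checks.
        errors.append(f"unknown claim_type: {payload['claim_type']}")
    else:
        for missing in sorted(required - set(payload["evidence"])):
            errors.append(f"missing evidence field: {missing}")
    return errors


sample = {
    "claim_type": "statistical",
    "claim_text": "Treatment A lowers blood pressure",
    "evidence": {"p_value": 0.03, "n": 120},
}
print(validate(sample))
```

Returning structured error strings instead of raising keeps the result exportable alongside the verdict.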
Frontier models and agents generate claims. None of them gate whether a claim is licensed by its evidence before reuse. That side of the stack is empty — and it is the side that governs. CAPAS is evidence-governance infrastructure that starts at that gate.
| System | Generates claims | Gates claims |
|---|---|---|
| GPT · Claude · Gemini | ✓ | ✗ |
| Deep-research agents | ✓ | ✗ |
| LLM-as-judge | ✗ | stochastic · no guarantee, not replayable |
| Fact-checkers | ✗ | truth, not boundary |
| CAPAS | ✗ | ✓ deterministic · replayable · fail-closed (test-proven) |
Without a claim-level gate, drift survives source review, metadata review, and provenance review — because none of them evaluate the claim boundary itself.
A fabricated claim still has to satisfy the conservation laws of its field, and that is re-derivable with no oracle. The same engine catches a balance sheet that doesn’t close, a survey mean that is arithmetically impossible (GRIM), a 99%-sensitive test claimed to imply a 99% PPV for a rare disease (the base-rate fallacy), an unbalanced reaction, a dimensionally inconsistent equation, and a qubit with T2 > 2·T1. Any declared number that breaks a domain law forces REJECT. The rule is downgrade-only, so it can only make a verdict stricter. Fail-closed is a proven invariant (18/18 structurally deficient claims rejected, locked by a test), not a number.
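Three of the invariants named above are simple enough to sketch directly: the GRIM test, positive predictive value via Bayes' rule, and the T2 ≤ 2·T1 bound. The function names, the example numbers, the assumed 99% specificity and 0.1% prevalence, and the 0.01 tolerance on PPV are illustrative assumptions, not CAPAS internals.

```python
import math


def grim_consistent(reported_mean: float, n: int, decimals: int = 2) -> bool:
    """GRIM: can any integer total over n integer responses round to this mean?"""
    half_unit = 0.5 * 10 ** (-decimals)
    approx_total = int(math.floor(reported_mean * n))
    return any(
        abs(total / n - reported_mean) <= half_unit + 1e-12
        for total in range(approx_total - 1, approx_total + 3)
    )


def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value from Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def coherence_times_consistent(t1: float, t2: float) -> bool:
    """Physical bound for a qubit: T2 cannot exceed 2 * T1."""
    return t2 <= 2.0 * t1


def apply_invariants(prior_verdict: str, violations: list) -> str:
    """Downgrade-only: a violation forces REJECT, otherwise the verdict is unchanged."""
    return "REJECT" if violations else prior_verdict


violations = []
if not grim_consistent(5.19, 28):
    violations.append("GRIM: mean 5.19 is impossible for n=28 integer responses")

derived_ppv = ppv(sensitivity=0.99, specificity=0.99, prevalence=0.001)
if abs(derived_ppv - 0.99) > 0.01:  # claimed PPV was 0.99
    violations.append(f"PPV: claimed 0.99, re-derived {derived_ppv:.3f}")

if not coherence_times_consistent(t1=100e-6, t2=250e-6):
    violations.append("T2 > 2*T1")

print(apply_invariants("ACCEPT", violations))  # prints "REJECT"
```

With these inputs the re-derived PPV is about 9%, far from the claimed 99%, which shows how a base-rate error surfaces from declared numbers alone.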
Above the schema gate, CAPAS re-computes the claimed result from its raw inputs and only GATEs — marks re-derived — what it can reproduce. What it cannot re-derive it ATTESTs: signed and bound, never marketed as verified. The boundary is explicit on every receipt. No language model is ever in the decision path.
GATE = re-derived and reproducible. ATTEST = signed, not verified. CAPAS certifies computational consistency of the re-derivable slice — not scientific truth, and not the authenticity of the raw data (the irreducible GIGO residual). It re-derives more than it trusts, and says exactly which is which.
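The GATE versus ATTEST boundary can be illustrated with a declared summary statistic. In this sketch a declared mean is marked GATE only when it can be recomputed from supplied raw values. The function name `classify`, the tolerance, and the extra `MISMATCH` label for a failed recomputation are assumptions made for illustration.

```python
import math
from statistics import fmean
from typing import Optional, Sequence


def classify(declared_mean: float,
             raw_values: Optional[Sequence[float]],
             abs_tol: float = 1e-9) -> str:
    """Label a declared mean by whether it can be re-derived from raw inputs."""
    if not raw_values:
        # Nothing to recompute from: the value can only be attested, not verified.
        return "ATTEST"
    recomputed = fmean(raw_values)
    if math.isclose(recomputed, declared_mean, rel_tol=0.0, abs_tol=abs_tol):
        return "GATE"
    # Raw inputs exist but contradict the declared number.
    return "MISMATCH"


receipt = {
    "mean_with_raw_data": classify(2.5, [1, 2, 3, 4]),
    "mean_without_raw_data": classify(2.5, None),
    "mean_contradicted": classify(3.0, [1, 2, 3, 4]),
}
print(receipt)
```

Keeping the label per declared value, not per claim, is what lets a receipt say exactly which slice was re-derived and which was only attested.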
Frontier models produce text; CAPAS produces an auditable evidence trail — and that is what a regulated buyer actually purchases. Every verdict is operational, not just a label: ACCEPT licenses reuse · REWRITE returns the corrected claim with an Original→Licensed diff · REJECT names the missing evidence fields · HOLD lists the obligations to resolve before reuse.
28 famous claims: every one of them passed peer review and was published in a leading journal such as Nature, Science, or The Lancet. 14 were later retracted (Wakefield, Surgisphere, Schön, STAP…); 14 were independently replicated (LIGO, Higgs, RECOVERY dexamethasone…). Plausibility could not tell them apart: all 28 looked publishable. The gate separated them by structure.
Each fraud was gated for its actual structural deficiency — no controls, no independent reproduction, unauditable data — the same gaps it was retracted for. Honest scope: an illustrative retrospective whose corpus is coded from public retraction records (Retraction Watch, journal notices); it validates the gate’s structural logic, not fraud-detection from raw paper text. Partner pilots pending.
From academic publishing to enterprise AI governance — any workflow that touches scientific claims benefits from a deterministic admissibility gate.
Gate submitted manuscripts at the desk-review stage. Catch inadmissible claims before peer review consumes reviewer time.
Validate claim-evidence pairs before they enter training datasets. Prevent inadmissible or unlicensed scientific claims from corrupting model training.
Generate structured audit trails for claims in regulated industries. Every verdict is machine-readable, timestamped, and exportable.
Run batch evaluations across a corpus of claims. Identify systemic evidence gaps before a full production deployment.
Triage incoming claims automatically. Surface only those with sufficient evidence structure for human fact-checker review.
Embed CAPAS gate calls directly into your existing research pipeline, CMS, or editorial system. JSON in, structured verdict out.
An installable verification layer: your model proposes, CAPAS disposes. It never lets an unsupported claim through as ACCEPT — it re-derives what is re-derivable, grades the rest, and emits a verifiable reward your model can’t game by sounding right. No language model in the verdict.
The boost is reliability, not capability — the model doesn’t get smarter, its output becomes admissible-or-deferred. Honest scope: it grounds record↔text, not text↔reality — a source that lies about its methods and withholds its data passes (the GIGO ceiling), so CAPAS says exactly which slice it re-derived and which it could only attest.
The same deterministic engine, exposed three ways. CAPAS is never the language model — it is the fail-closed layer the model proposes into.
Wrap your LLM in code. gate · reward · certificate · invariants · gate_quantum.
A tool any agent calls — Claude Code, Desktop, Cursor. Zero dependencies. The agent proposes, CAPAS disposes.
Hosted, auth-gated issuance of a signed, persisted, tamper-evident admissibility certificate — the audit artifact a regulated buyer purchases.
The same pattern (invariant checks, threshold gates, a fail-closed verdict, and a disclosed boundary) is what IBM’s production calibration system implements. The architecture isn’t speculative: a hardware vendor runs it at scale.
IBM’s headline gate-error figure is an optimistic lower bound: for real circuits it understates the error by 3–10× (Proctor, Nat. Phys. 2022). From the same published calibration fields, CAPAS re-derives the complete error budget, fully auditable, with no hardware required.
Honest scope: the gap is ~2× on a healthy qubit and reaches 11× on a dephasing-limited one (the numbers shown are that worst case — a real qubit whose anomaly the headline hides). Either way it is re-derived from the vendor’s own published fields. Validated live: CAPAS re-found the chip’s one anomalous qubit and its bad-coupler cluster from calibration alone, admitted a real Bell measurement bounded by two independent oracles, and its cross-validation held on a second device. The error-correction prescription (DD, rep-delay, readout mitigation) is re-derived from the same row. Full method →
IBM’s quantum stack will not run your circuit until it clears a calibration gate: every job is checked against frozen, re-derived device invariants, fail-closed, with no model-of-the-day in the decision. That is exactly the CAPAS architecture: re-derive from declared evidence, refuse on violation, keep the verdict deterministic. It already runs in production, at the frontier of physics. CAPAS generalizes the same mechanism across ten domains, among them finance, statistics, epidemiology, chemistry, physics, and quantum. Two independent systems converging on the same admissibility mechanism is consilience: evidence that the design is structural, not a pitch.
Honest scope: the identities CAPAS checks are textbook and the convergence is architectural, not a partnership claim. The novelty is the cross-domain composition — and on frozen calibration, CAPAS is in fact the stricter of the two.
Any one gate is copyable. What compounds is trust you can audit: a deterministic, re-derivable, fail-closed verdict a regulated buyer can reproduce and a third party can try to refute. In the domains where “the model said so” is not admissible — finance, clinical, safety, quantum — the reference that keeps surviving refutation becomes the standard. CAPAS is built to be that reference: every headline claim CLOSED / BACKED / SCOPED, every verdict re-derivable to a hash, a survive-refutation ledger anyone can challenge. The standard, not the tool, is the moat.
CAPAS ships open-core (Apache-2.0): the schema, calculus, reference gate, CLI, tests, and benchmark corpus are yours to run and fork. The defensible asset is not the code — it’s the certification mark, and a mark is only worth trusting if it can’t be pulled. So the mark is reserved and pre-committed to neutral governance before adoption, not after — the one move that let Open Policy Agent survive its sponsor’s acquisition while MongoDB, Elastic, HashiCorp, and Redis each triggered a fork by relicensing their core to capture value. We renounce that move in writing. Governance charter →
Conformance is self-runnable and deterministic — python3 benchmarks/conformance.py runs the exact suite the certifier runs and returns the same verdict and the same hash. No private process to trust. The mark attests an artifact passed that.
Pinnacle 21 already checks whether a trial dataset is structurally well-formed (CDISC conformance). It does not check whether the reported statistic is licensed by its evidence. CAPAS does: significance versus alpha, multiplicity, confidence-interval-excludes-null, effect direction, endpoint pre-specification — re-derivably, beside the submission, not as a replacement. Validated on a 3,024-case admissibility corpus, 0 deficient claims accepted (fail-closed). Market validation →
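The statistical licensing checks listed above can be sketched as a function that returns named blockers. This is an illustration under stated assumptions: the `ReportedResult` fields, the use of a Bonferroni correction for multiplicity, and the blocker wording are hypothetical and do not describe the CAPAS evidence contract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportedResult:
    p_value: float
    alpha: float
    n_comparisons: int          # number of tests sharing the error budget
    ci_low: float
    ci_high: float
    null_value: float           # e.g. 0 for a difference, 1 for a ratio
    effect: float
    claimed_direction: str      # "increase" or "decrease"
    endpoint_prespecified: bool


def licensing_blockers(r: ReportedResult) -> list:
    """Return the reasons the evidence does not license the claim (empty = none found)."""
    blockers = []
    adjusted_alpha = r.alpha / max(r.n_comparisons, 1)  # Bonferroni (assumed method)
    if r.p_value >= adjusted_alpha:
        blockers.append("not significant at multiplicity-adjusted alpha")
    if r.ci_low <= r.null_value <= r.ci_high:
        blockers.append("confidence interval includes the null value")
    if r.effect > r.null_value:
        observed = "increase"
    elif r.effect < r.null_value:
        observed = "decrease"
    else:
        observed = "none"
    if observed != r.claimed_direction:
        blockers.append("effect direction does not match the claim")
    if not r.endpoint_prespecified:
        blockers.append("endpoint was not pre-specified")
    return blockers


weak = ReportedResult(0.03, 0.05, 4, -0.1, 0.9, 0.0, 0.4, "increase", False)
strong = ReportedResult(0.001, 0.05, 4, 0.2, 0.6, 0.0, 0.4, "increase", True)
print(licensing_blockers(weak))
print(licensing_blockers(strong))
```

The first result looks significant at the nominal 0.05 level but fails once the four comparisons are accounted for, which is the kind of gap a structural conformance check alone would not flag.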
Load a sample payload or build your own evidence contract in under two minutes. No account required for the pilot.