Deterministic Scientific Claim Gate

Does this claim’s evidence hold up?

CAPAS compiles supplied evidence into a claim admissibility decision. It does not determine truth — it decides, deterministically, whether the evidence licenses the claim: ACCEPT / REWRITE / REJECT / HOLD, with a re-derivable audit trail and no language model in the verdict. Your model proposes; CAPAS disposes. Built for regulatory reviewers, data & training-data engineers, and journal editors.

Run sample claim · Talk to pilot owner · See pilot package

CAPAS checks whether your supplied evidence is internally consistent and re-derivable. It does not certify scientific truth, compliance, authorship, or the authenticity of the data — a consistent fabrication still passes (the GIGO ceiling).

Recent Gate Decisions · SCHEMA V3

ACCEPT · statistical_confidence: p=0.03 <= alpha=0.05 · schema v3

REWRITE · direction not independently licensed · training 7/14

REJECT · artifact unavailable for reproducibility · audit trail

HOLD · RO-Crate attestation pending CLI verif... · no LLM

26

Deterministic gates

10

Domains, one engine

0

LLM in the verdict · fail-closed (test-proven)

100%

Re-derivable, replayable

Every number is CLOSED (proven by a test), BACKED (regenerates with a hash), or SCOPED (empirical-pending, with its corpus & upgrade path) — see the proof ledger.

The Core

Built on four principles

CAPAS was designed from the ground up to make scientific claim evaluation auditable, repeatable, and integration-ready.

Deterministic verdicts

The same claim + evidence bundle always produces the same verdict. No probabilistic outputs, no model drift, no hallucinations. Schema v3 rules are the single source of truth.

No randomness

Schema-validated structure

Every payload is validated against CAPAS Schema v3 before gating. Field types, required keys, and claim-type-specific constraints are enforced at the gate boundary.

CAPAS Schema v3
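As a minimal sketch of what validation at the gate boundary could look like, the snippet below checks required keys, field types, and one claim-type-specific constraint before any gating runs. The field names (`claim_id`, `evidence`, `p_value`, `alpha`) and the `validate_payload` helper are hypothetical illustrations, not the actual CAPAS Schema v3.

```python
from typing import Any

# Hypothetical required fields and types; the real Schema v3 is not reproduced here.
REQUIRED_FIELDS = {
    "claim_id": str,
    "claim_type": str,
    "evidence": dict,
}


def validate_payload(payload: dict[str, Any]) -> list[str]:
    """Return a sorted list of schema errors; an empty list means the payload may be gated."""
    errors = []
    for key, expected in REQUIRED_FIELDS.items():
        if key not in payload:
            errors.append(f"missing required key: {key}")
        elif not isinstance(payload[key], expected):
            errors.append(f"wrong type for {key}: expected {expected.__name__}")

    # Example of a claim-type-specific constraint (assumed, for illustration only).
    if payload.get("claim_type") == "statistical_confidence":
        evidence = payload.get("evidence")
        if isinstance(evidence, dict):
            for name in ("p_value", "alpha"):
                value = evidence.get(name)
                is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
                if not is_number or not 0 <= value <= 1:
                    errors.append(f"evidence.{name} must be a number in [0, 1]")
    return sorted(errors)
```

Returning the errors sorted keeps the output independent of check order, which matters if the error list itself feeds into a reproducible audit record.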

Full provenance trail

Every REJECT and HOLD verdict includes machine-readable provenance blockers — licensing flags, reproducibility gaps, attestation failures — exportable as structured JSON.

Audit-ready

Zero LLM dependency

CAPAS does not use language models for gate decisions. Verdicts are computed by deterministic rule functions — 26 cross-domain invariant gates plus per-claim-type evidence contracts — against a structured contract, fully offline-capable. Every decision carries a re-derivable audit_hash.

Rule-based only
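One way to make a verdict re-derivable is to hash a canonical serialization of the input together with the decision, so that anyone holding the same bundle can recompute the same digest. The sketch below assumes canonical JSON (sorted keys, compact separators) and SHA-256. The actual CAPAS canonicalization and the fields covered by its audit_hash are not specified here, and the function below is illustrative only.

```python
import hashlib
import json


def audit_hash(bundle: dict, decision: str) -> str:
    """Hash a canonical JSON form of the bundle plus decision (assumed scheme)."""
    canonical = json.dumps(
        {"bundle": bundle, "decision": decision},
        sort_keys=True,          # key order must not affect the digest
        separators=(",", ":"),   # no whitespace variation
        ensure_ascii=True,
    )
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Same content, different key order: the digests should match.
a = audit_hash({"p_value": 0.03, "alpha": 0.05}, "ACCEPT")
b = audit_hash({"alpha": 0.05, "p_value": 0.03}, "ACCEPT")
```

Canonicalizing before hashing is the design choice that matters: without it, two byte-different serializations of the same evidence would produce different hashes and the trail would not replay.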

Where it Lives

Four-step workflow

From claim draft to admissibility decision in one deterministic pass.

Claim drift

A paper says association. A dataset row turns it into causation. CAPAS catches the boundary.

Select mode

Guided builder, raw JSON, batch evaluation, or claim-candidate extraction (deterministic preview, human confirmation). Pick your workflow.

Run gate

Schema v3 and claim-type rules return ACCEPT, REWRITE, REJECT, or HOLD with full provenance.

Inspect decision

Review verdict, schema errors, provenance blockers, and fine-tune evidence readiness.

Verdict Reference

Four possible outcomes

Every gate run returns exactly one of these. Each carries a machine-readable reason trail.

ACCEPT

Claim is admissible

All schema constraints pass. Evidence fully supports the claim as stated. No licensing or reproducibility blockers found.

p=0.03 <= alpha=0.05
artifact: available
license: CC-BY-4.0

REWRITE

Claim needs adjustment

Evidence is present but the claim overreaches. Direction, scope, or causal language must be corrected before resubmission.

blocker: direction_not_licensed
training: 7 of 14 pass
suggest: hedge_causality

REJECT

Claim is inadmissible

Critical evidence is missing, contradicted, or irreproducible. The claim cannot be licensed in its current form.

blocker: artifact_unavailable
reproducibility: FAIL
schema_errors: 3

HOLD

Pending verification

Gate cannot complete without external attestation — RO-Crate, CLI verification, or third-party confirmation still required.

pending: ro_crate_attestation
cli_verify: required
no_llm: true
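The four outcomes can be read as an ordered, fail-closed decision. The sketch below shows one possible ordering under assumed inputs: schema errors or a missing artifact force REJECT, an outstanding attestation yields HOLD, an unlicensed direction yields REWRITE, and only a fully supported claim reaches ACCEPT. The `Evidence` fields and the precedence between verdicts are hypothetical, not the CAPAS rule set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    # All fields are illustrative stand-ins for a structured evidence contract.
    p_value: float
    alpha: float
    artifact_available: bool
    attestation_pending: bool
    direction_licensed: bool
    schema_errors: tuple = ()


def gate(ev: Evidence) -> tuple[str, list[str]]:
    """Return (verdict, reasons). Anything not explicitly cleared never reaches ACCEPT."""
    if ev.schema_errors:
        return "REJECT", [f"schema_errors: {len(ev.schema_errors)}"]
    if not ev.artifact_available:
        return "REJECT", ["blocker: artifact_unavailable"]
    if ev.attestation_pending:
        return "HOLD", ["pending: attestation"]
    if not ev.direction_licensed:
        return "REWRITE", ["blocker: direction_not_licensed"]
    if ev.p_value <= ev.alpha:
        return "ACCEPT", [f"p={ev.p_value} <= alpha={ev.alpha}"]
    return "REJECT", [f"p={ev.p_value} > alpha={ev.alpha}"]
```

Because the function is pure and has no hidden state, calling it twice on the same `Evidence` value returns the same verdict and the same reason trail.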

Why CAPAS

Not an LLM checker. Not a plagiarism tool.

CAPAS fills a gap that existing tools leave open: structured, deterministic admissibility gating for scientific claims.

What CAPAS is NOT

Not a fact-checker — CAPAS doesn't verify truth, it validates evidence licensing and schema compliance

Not a plagiarism detector — citation matching is not the concern; admissibility is

Not an LLM wrapper — zero language model calls in the gate decision path

Not a peer review replacement — CAPAS is a pre-gate boundary tool, not editorial judgment

What CAPAS IS

A deterministic claim admissibility gate — same input always produces the same output

A schema enforcement layer — CAPAS Schema v3 defines exactly what a valid evidence package looks like

A provenance audit tool — every blocker is machine-readable and exportable

An integration-ready API — designed to sit inside research pipelines, publishing workflows, and audit systems

The other half of the stack

Everyone builds “what can I say?” — CAPAS builds “should this have been said?”

Frontier models and agents generate claims. None of them gate whether a claim is licensed by its evidence before reuse. That side of the stack is empty — and it is the side that governs. CAPAS is evidence-governance infrastructure that starts at that gate.

System	Generates claims	Gates claims
GPT · Claude · Gemini	✓	✗
Deep-research agents	✓	✗
LLM-as-judge	✗	stochastic · no guarantee, not replayable
Fact-checkers	✗	truth, not boundary
CAPAS	✗	✓ deterministic · replayable · fail-closed (test-proven)

How drift happens

The contamination cascade

Without a claim-level gate, drift survives source review, metadata review, and provenance review — because none of them evaluate the claim boundary itself.

01 · Paper

Benefit in one subgroup

One randomized trial, one patient subgroup, bounded result.

→

02 · Drift

All-patient benefit claimed

The reusable sentence widens a subgroup result to every patient.

→

CAPAS · Gate

REWRITE

Stops drift before it becomes governed evidence.

→

Blocked

Dataset / Model

Population-wide claim reproduced downstream. Prevented.
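The cascade above can be caught with a simple boundary comparison: the population a claim asserts must not be wider than the population its evidence covers. The sketch below models populations as sets of subgroup labels. The labels and the `check_scope` helper are hypothetical and only illustrate the idea.

```python
def check_scope(claimed: frozenset, evidenced: frozenset) -> str:
    """REWRITE when the claim covers groups the evidence does not; otherwise ACCEPT."""
    widened = claimed - evidenced
    if widened:
        return "REWRITE"  # the claim overreaches; narrow it to the evidenced groups
    return "ACCEPT"


trial_subgroup = frozenset({"adults_over_65"})
all_patients = frozenset({"adults_over_65", "adults_under_65", "children"})

drifted = check_scope(claimed=all_patients, evidenced=trial_subgroup)
bounded = check_scope(claimed=trial_subgroup, evidenced=trial_subgroup)
```

Here `drifted` corresponds to the all-patient sentence in step 02, and `bounded` to the original subgroup result in step 01.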

Cross-domain engine

One mechanism. 10 domains. 26 gates.

A fabricated claim still has to satisfy the conservation laws of its field, and that check is re-derivable with no oracle. The same engine catches a balance sheet that doesn’t close, a survey mean that’s arithmetically impossible (GRIM), a 99%-sensitive test claimed to imply a 99% PPV for a rare disease (the base-rate fallacy), an unbalanced reaction, a dimensionally inconsistent equation, and a qubit with T2 > 2·T1. Any declared number that breaks a domain law forces REJECT. The check is downgrade-only, so it can only make a verdict stricter. Fail-closed is a proven invariant (18/18 structurally deficient claims rejected, locked by a test), not a number.

Finance: A=L+E

Statistics: GRIM

Epidemiology: Bayes PPV, RR/OR, vaccine efficacy

Chemistry: stoichiometry, charge, oxidation states

Physics: dimensions, η≤1, v≤c

Quantum: T2≤2T1, Γφ, gate floor

Engineering: Ohm’s law

Biology: Hardy-Weinberg, mark-recapture

Mathematics: root & linear-system checks

Universal: probability & conservation

+ a new domain = one deterministic law
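A few of these domain laws are small enough to sketch directly. The functions below are simplified illustrations of four of them (Bayes PPV, GRIM, the accounting identity, and the T2 bound) plus a downgrade-only combiner. The function names, tolerances, and the simplified GRIM rounding are assumptions, not the CAPAS implementation.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value from Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def grim_consistent(mean: float, n: int, decimals: int = 2) -> bool:
    """Simplified GRIM: can an integer total over n integer responses round to this mean?"""
    target = round(mean, decimals)
    approx = round(mean * n)
    # Only the integer totals adjacent to mean * n can possibly match.
    return any(round(total / n, decimals) == target
               for total in (approx - 1, approx, approx + 1))


def balance_sheet_closes(assets: float, liabilities: float, equity: float,
                         tol: float = 0.005) -> bool:
    """Accounting identity A = L + E, within an assumed rounding tolerance."""
    return abs(assets - (liabilities + equity)) <= tol


def coherence_times_valid(t1: float, t2: float) -> bool:
    """Physical bound T2 <= 2 * T1."""
    return t2 <= 2.0 * t1


def apply_invariants(verdict: str, checks: list[bool]) -> str:
    """Downgrade-only: a failed invariant forces REJECT, a pass never upgrades."""
    return verdict if all(checks) else "REJECT"
```

For the base-rate example, assuming 99% specificity and a prevalence of 0.1% (neither figure is given above), `ppv(0.99, 0.99, 0.001)` comes out at roughly 0.09, so a claimed 99% PPV would fail the check.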

Proof-carrying

It re-derives the evidence. It doesn’t trust it.

Above the schema gate, CAPAS re-computes the claimed result from its raw inputs and only GATEs — marks re-derived — what it can reproduce. What it cannot re-derive it ATTESTs: signed and bound, never marketed as verified. The boundary is explicit on every receipt. No language model is ever in the decision path.

Statistical: re-run the test from raw data

Calibration & chromatography peak area

Clinical datasets (SDTM→ADaM)

Financial ratios from XBRL: US-GAAP & IFRS

Accounting identities (debits=credits, A=L+E)

Dimensional consistency (SI)

Stoichiometry / mass balance

Physical laws (Antoine, c, absolute zero, Holevo)

Quantum re-simulation (below the frontier)

Zero-knowledge proof over hidden data

GATE = re-derived and reproducible. ATTEST = signed, not verified. CAPAS certifies computational consistency of the re-derivable slice — not scientific truth, and not the authenticity of the raw data (the irreducible GIGO residual). It re-derives more than it trusts, and says exactly which is which.
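The GATE / ATTEST boundary can be illustrated with the simplest re-derivable statistic, a mean. In the sketch below a claimed mean is marked GATE only when it reproduces from the supplied raw data, and ATTEST when no raw data is available to recompute it. The `receipt` helper, its return shape, and the MISMATCH status are hypothetical names chosen for this example.

```python
import math
import statistics
from typing import Optional, Sequence


def receipt(claimed_mean: float, raw: Optional[Sequence[float]],
            rel_tol: float = 1e-9) -> dict:
    """Mark a claimed mean GATE if it reproduces from raw data, ATTEST if it cannot be recomputed."""
    if raw is None or len(raw) == 0:
        # Nothing to re-derive from: the value can only be attested, not verified.
        return {"status": "ATTEST", "rederived": None}
    recomputed = statistics.fmean(raw)
    if math.isclose(recomputed, claimed_mean, rel_tol=rel_tol):
        return {"status": "GATE", "rederived": recomputed}
    # The raw data is present but does not reproduce the claimed value.
    return {"status": "MISMATCH", "rederived": recomputed}
```

Keeping the status on every receipt is what makes the boundary explicit: a reader can see which values were recomputed and which were only carried through.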

The enterprise asset

Every decision is an audit artifact

Frontier models produce text; CAPAS produces an auditable evidence trail, and that is what a regulated buyer actually purchases. Every verdict is operational, not just a label: ACCEPT licenses reuse · REWRITE returns the corrected claim with an Original→Licensed diff · REJECT names the missing evidence fields · HOLD lists the obligations to resolve before reuse.

{
  "claim_id": "claim_drift_001",
  "claim_type": "causal_mechanism_claim",
  "decision": "REWRITE",
  "evidence_contract": "intervention + temporal_order + confounders",
  "blocker": "No intervention or natural-experiment evidence.",
  "reviewer_action": "edit_and_resubmit",
  "audit_hash": "sha256:f6c361712b894a2d2e4005e345341b4bbcaa2d7b7819121942d550972800b17a",
  "non_claim": true
}

Retrospective validation

Tested against real retracted science

28 famous claims — every one of them passed peer review and was published in Nature, Science, or The Lancet. 14 were later retracted (Wakefield, Surgisphere, Schön, STAP…); 14 were independently replicated (LIGO, Higgs, RECOVERY dexamethasone…). Plausibility could not tell them apart — all 28 looked publishable. The gate separated them by structure.

28/28

Separated by structure on an illustrative, agent-coded retrospective — not an adjudicated benchmark

0/28

Plausibility / peer-review — all 28 passed, retracted and valid alike

Each fraud was gated for its actual structural deficiency — no controls, no independent reproduction, unauditable data — the same gaps it was retracted for. Honest scope: an illustrative retrospective whose corpus is coded from public retraction records (Retraction Watch, journal notices); it validates the gate’s structural logic, not fraud-detection from raw paper text. Partner pilots pending.

Use Cases

Where teams deploy CAPAS

From academic publishing to enterprise AI governance — any workflow that touches scientific claims benefits from a deterministic admissibility gate.

Academic publishing

Gate submitted manuscripts at the desk-review stage. Catch inadmissible claims before peer review consumes reviewer time.

AI training pipelines

Validate claim-evidence pairs before they enter training datasets. Prevent inadmissible or unlicensed scientific claims from corrupting model training.

Regulatory compliance

Generate structured audit trails for claims in regulated industries. Every verdict is machine-readable, timestamped, and exportable.

Research pilot programs

Run batch evaluations across a corpus of claims. Identify systemic evidence gaps before a full production deployment.

Claim fact-checking desks

Triage incoming claims automatically. Surface only those with sufficient evidence structure for human fact-checker review.

API integration

Embed CAPAS gate calls directly into your existing research pipeline, CMS, or editorial system. JSON in, structured verdict out.

For developers

Wrap your LLM. Stop trusting plausible.

An installable verification layer: your model proposes, CAPAS disposes. It never lets an unsupported claim through as ACCEPT — it re-derives what is re-derivable, grades the rest, and emits a verifiable reward your model can’t game by sounding right. No language model in the verdict.

$ pip install capas-claim-gate
# live on PyPI · or from source: pip install git+https://github.com/fomv9354lve/capas-inteligentes

from capas_sdk import verified
# your model proposes the evidence; CAPAS gates it
verdict = verified(my_llm, "reproducibility_check")["verdict"]
# -> ACCEPT / REWRITE / REJECT / HOLD  (never the LLM’s call)

0/3 false-accepts with CAPAS

3/3 false-accepts LLM-alone

illustrative 3-case example, not a benchmark

The boost is reliability, not capability — the model doesn’t get smarter, its output becomes admissible-or-deferred. Honest scope: it grounds record↔text, not text↔reality — a source that lies about its methods and withholds its data passes (the GIGO ceiling), so CAPAS says exactly which slice it re-derived and which it could only attest.

One core, three surfaces

However you build, CAPAS is the gate

The same deterministic engine, exposed three ways. CAPAS is never the language model — it is the fail-closed layer the model proposes into.

Library · pip

Wrap your LLM in code. gate · reward · certificate · invariants · gate_quantum.

pip install capas-claim-gate
live on PyPI

Skill · MCP server

A tool any agent calls — Claude Code, Desktop, Cursor. Zero dependencies. The agent proposes, CAPAS disposes.

python3 capas_mcp.py  # see docs/mcp.md

API · signed certificate

Hosted, auth-gated issuance of a signed, persisted, tamper-evident admissibility certificate — the audit artifact a regulated buyer purchases.

POST /api/certificate → id + signature

IBM’s production calibration system follows the same pattern: invariant checks, threshold gates, a fail-closed verdict, and a disclosed boundary. The architecture isn’t speculative: a hardware vendor runs it at scale.

Proof on real hardware

We beat a vendor benchmark with the vendor’s own numbers.

IBM’s headline gate-error figure is an optimistic lower bound — for real circuits it under-states by 3–10× (Proctor, Nat. Phys. 2022). From the same published calibration fields, CAPAS re-derives the complete error budget — fully auditable, no hardware required.

IBM headline (RB)

1.6×10^-3

→

Re-derivable complete floor

1.9×10^-2

Error the headline hid

2–11×

Honest scope: the gap is ~2× on a healthy qubit and reaches 11× on a dephasing-limited one (the numbers shown are that worst case — a real qubit whose anomaly the headline hides). Either way it is re-derived from the vendor’s own published fields. Validated live: CAPAS re-found the chip’s one anomalous qubit and its bad-coupler cluster from calibration alone, admitted a real Bell measurement bounded by two independent oracles, and its cross-validation held on a second device. The error-correction prescription (DD, rep-delay, readout mitigation) is re-derived from the same row. Full method →
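Whether a qubit is dephasing-limited can be read off its published T1 and T2. The sketch below uses the standard relation 1/T2 = 1/(2·T1) + Γφ to recover the pure dephasing rate Γφ. It is not the CAPAS error-budget method, which is described in the linked full method, and the helper names are hypothetical.

```python
def pure_dephasing_rate(t1_us: float, t2_us: float) -> float:
    """Return Gamma_phi in 1/us from 1/T2 = 1/(2*T1) + Gamma_phi."""
    if t1_us <= 0 or t2_us <= 0:
        raise ValueError("T1 and T2 must be positive")
    if t2_us > 2.0 * t1_us:
        raise ValueError("T2 > 2*T1 violates the coherence bound")
    return 1.0 / t2_us - 1.0 / (2.0 * t1_us)


def dephasing_fraction(t1_us: float, t2_us: float) -> float:
    """Share of the total decoherence rate 1/T2 that comes from pure dephasing."""
    return pure_dephasing_rate(t1_us, t2_us) * t2_us
```

A qubit at the T2 = 2·T1 limit has a dephasing fraction of zero, while a fraction close to one indicates a dephasing-limited qubit of the kind the worst-case figure refers to.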

Independent validation

We didn’t invent the architecture. The most demanding hardware stack in the world already runs it.

IBM’s quantum stack will not run your circuit until it clears a calibration gate: every job is checked against frozen, re-derived device invariants, fail-closed, with no model-of-the-day in the decision. That is exactly the CAPAS architecture: re-derive from declared evidence, refuse on violation, keep the verdict deterministic. It already runs in production, at the frontier of physics. CAPAS generalizes the same mechanism across ten domains, including finance, statistics, epidemiology, chemistry, physics, and quantum. Two independent systems converging on the same admissibility mechanism is consilience: evidence that the design is structural, not a pitch.

Honest scope: the identities CAPAS checks are textbook and the convergence is architectural, not a partnership claim. The novelty is the cross-domain composition — and on frozen calibration, CAPAS is in fact the stricter of the two.

Why it compounds

The moat isn’t any single gate. It’s becoming the standard claims are checked against.

Any one gate is copyable. What compounds is trust you can audit: a deterministic, re-derivable, fail-closed verdict a regulated buyer can reproduce and a third party can try to refute. In the domains where “the model said so” is not admissible — finance, clinical, safety, quantum — the reference that keeps surviving refutation becomes the standard. CAPAS is built to be that reference: every headline claim CLOSED / BACKED / SCOPED, every verdict re-derivable to a hash, a survive-refutation ledger anyone can challenge. The standard, not the tool, is the moat.

Open standard · governed

Open engine. Reserved mark. Pre-committed to neutral governance.

CAPAS ships open-core (Apache-2.0): the schema, calculus, reference gate, CLI, tests, and benchmark corpus are yours to run and fork. The defensible asset is not the code — it’s the certification mark, and a mark is only worth trusting if it can’t be pulled. So the mark is reserved and pre-committed to neutral governance before adoption, not after — the one move that let Open Policy Agent survive its sponsor’s acquisition while MongoDB, Elastic, HashiCorp, and Redis each triggered a fork by relicensing their core to capture value. We renounce that move in writing. Governance charter →

Conformance is self-runnable and deterministic — python3 benchmarks/conformance.py runs the exact suite the certifier runs and returns the same verdict and the same hash. No private process to trust. The mark attests an artifact passed that.

Beachhead

Statistical-claim admissibility for regulated submissions.

Pinnacle 21 already checks whether a trial dataset is structurally well-formed (CDISC conformance). It does not check whether the reported statistic is licensed by its evidence. CAPAS does: significance versus alpha, multiplicity, confidence-interval-excludes-null, effect direction, endpoint pre-specification — re-derivably, beside the submission, not as a replacement. Validated on a 3,024-case admissibility corpus, 0 deficient claims accepted (fail-closed). Market validation →
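The checks named above can be sketched as a single function that returns the blockers it finds. This is an illustrative reading only: Bonferroni is used as the multiplicity correction purely for simplicity, and the parameter names and blocker strings are assumptions rather than CAPAS contract fields.

```python
def statistical_blockers(
    p_value: float,
    alpha: float,
    n_comparisons: int,
    ci_low: float,
    ci_high: float,
    null_value: float,
    estimate: float,
    claimed_direction: str,  # "increase" or "decrease"
    prespecified: bool,
) -> list[str]:
    """Return the blockers for a statistical claim; an empty list means none were found."""
    blockers = []

    # Significance versus alpha, with a Bonferroni adjustment for multiplicity.
    adjusted_alpha = alpha / max(n_comparisons, 1)
    if p_value > adjusted_alpha:
        blockers.append("not_significant_after_multiplicity")

    # The confidence interval must exclude the null value.
    if ci_low <= null_value <= ci_high:
        blockers.append("ci_includes_null")

    # The observed effect direction must match the claimed one.
    if estimate > null_value:
        observed = "increase"
    elif estimate < null_value:
        observed = "decrease"
    else:
        observed = "none"
    if observed != claimed_direction:
        blockers.append("direction_not_licensed")

    # The endpoint must have been pre-specified.
    if not prespecified:
        blockers.append("endpoint_not_prespecified")
    return blockers
```

Returning every blocker, instead of stopping at the first, gives a reviewer the full list of what must change before resubmission.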

Get started

Ready to gate your first claim?

Load a sample payload or build your own evidence contract in under two minutes. No account required for the pilot.

Open Gate App · Download pilot packet · Read methodology

CAPAS gates structured evidence supplied by users. It does not certify scientific truth or replace external review.
The engine ships 26 deterministic gates across 10 domains; unsupported domains HOLD until a team defines a new evidence contract, admissibility policy, and audit artifact.
CAPAS does not use language models at decision time — every gate decision is deterministic and fully traceable.
