Deterministic Scientific Claim Gate

Does this claim’s evidence hold up?

CAPAS compiles supplied evidence into a claim admissibility decision. It does not determine truth — it decides, deterministically, whether the evidence licenses the claim: ACCEPT / REWRITE / REJECT / HOLD, with a re-derivable audit trail and no language model in the verdict. Your model proposes; CAPAS disposes. Built for regulatory reviewers, data & training-data engineers, and journal editors.

CAPAS checks whether your supplied evidence is internally consistent and re-derivable. It does not certify scientific truth, compliance, authorship, or the authenticity of the data — a consistent fabrication still passes (the GIGO ceiling).

Recent Gate Decisions SCHEMA V3
ACCEPT statistical_confidence: p=0.03 <= alpha=0.05 schema v3
REWRITE direction not independently licensed training 7/14
REJECT artifact unavailable for reproducibility audit trail
HOLD RO-Crate attestation pending CLI verif... no LLM
26 deterministic gates
10 domains, one engine
0 LLM in the verdict · fail-closed (test-proven)
100% re-derivable, replayable
Every number is CLOSED (proven by a test), BACKED (regenerates with a hash), or SCOPED (empirical-pending, with its corpus & upgrade path) — see the proof ledger.

Built on four principles

CAPAS was designed from the ground up to make scientific claim evaluation auditable, repeatable, and integration-ready.

Deterministic verdicts

The same claim + evidence bundle always produces the same verdict. No probabilistic outputs, no model drift, no hallucinations. Schema v3 rules are the single source of truth.

No randomness

Schema-validated structure

Every payload is validated against CAPAS Schema v3 before gating. Field types, required keys, and claim-type-specific constraints are enforced at the gate boundary.

CAPAS Schema v3

Full provenance trail

Every REJECT and HOLD verdict includes machine-readable provenance blockers — licensing flags, reproducibility gaps, attestation failures — exportable as structured JSON.

Audit-ready

Zero LLM dependency

CAPAS does not use language models for gate decisions. Verdicts are computed by deterministic rule functions — 26 cross-domain invariant gates plus per-claim-type evidence contracts — against a structured contract, fully offline-capable. Every decision carries a re-derivable audit_hash.

Rule-based only
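
One way to picture a re-derivable audit_hash is a SHA-256 digest over a canonical JSON encoding of the claim, evidence, and verdict. The sketch below is an illustration under that assumption only: the function signature, the record layout, and the canonicalization choices are hypothetical, not the CAPAS implementation.

```python
import hashlib
import json


def audit_hash(claim: dict, evidence: dict, verdict: str) -> str:
    """Return a replayable digest for one gate decision.

    Assumption: the hash covers a canonical JSON encoding (sorted keys,
    fixed separators), so the same inputs always yield the same digest
    regardless of key order in the submitted payload.
    """
    record = {"claim": claim, "evidence": evidence, "verdict": verdict}
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=True
    )
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Key order in the payload does not change the digest.
first = audit_hash({"id": "c1", "type": "stat"}, {"p": 0.03, "alpha": 0.05}, "ACCEPT")
second = audit_hash({"type": "stat", "id": "c1"}, {"alpha": 0.05, "p": 0.03}, "ACCEPT")
print(first == second)
```

Anyone holding the same claim and evidence can recompute the digest and compare it to the one on the receipt, which is what makes a verdict replayable rather than merely logged.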

Four-step workflow

From claim draft to admissibility decision in one deterministic pass.

01
Claim drift
A paper says association. A dataset row turns it into causation. CAPAS catches the boundary.
02
Select mode
Guided builder, raw JSON, batch evaluation, or claim-candidate extraction (deterministic preview, human confirmation). Pick your workflow.
03
Run gate
Schema v3 and claim-type rules return ACCEPT, REWRITE, REJECT, or HOLD with full provenance.
04
Inspect decision
Review verdict, schema errors, provenance blockers, and fine-tune evidence readiness.

Four possible outcomes

Every gate run returns exactly one of these. Each carries a machine-readable reason trail.

ACCEPT

Claim is admissible

All schema constraints pass. Evidence fully supports the claim as stated. No licensing or reproducibility blockers found.

p=0.03 <= alpha=0.05
artifact: available
license: CC-BY-4.0
REWRITE

Claim needs adjustment

Evidence is present but the claim overreaches. Direction, scope, or causal language must be corrected before resubmission.

blocker: direction_not_licensed
training: 7 of 14 pass
suggest: hedge_causality
REJECT

Claim is inadmissible

Critical evidence is missing, contradicted, or irreproducible. The claim cannot be licensed in its current form.

blocker: artifact_unavailable
reproducibility: FAIL
schema_errors: 3
HOLD

Pending verification

Gate cannot complete without external attestation — RO-Crate, CLI verification, or third-party confirmation still required.

pending: ro_crate_attestation
cli_verify: required
no_llm: true
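
The four outcomes can be read as a strict precedence of rule checks. The following is a minimal sketch for a single statistical claim; the Evidence fields, the blocker names, and the precedence order (REJECT, then HOLD, then REWRITE, then ACCEPT) are assumptions made for illustration, not the Schema v3 contract.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Evidence:
    # Hypothetical fields; the real schema is not reproduced here.
    p_value: Optional[float] = None
    alpha: float = 0.05
    artifact_available: bool = False
    direction_licensed: bool = True
    pending_attestations: List[str] = field(default_factory=list)


def gate(ev: Evidence) -> Tuple[str, List[str]]:
    """Return (verdict, reason trail). Missing evidence never yields ACCEPT."""
    blockers = []
    if ev.p_value is None:
        blockers.append("p_value_missing")
    elif ev.p_value > ev.alpha:
        blockers.append("p_exceeds_alpha")
    if not ev.artifact_available:
        blockers.append("artifact_unavailable")
    if blockers:
        return "REJECT", blockers
    if ev.pending_attestations:
        return "HOLD", [f"pending:{name}" for name in ev.pending_attestations]
    if not ev.direction_licensed:
        return "REWRITE", ["direction_not_licensed"]
    return "ACCEPT", []


print(gate(Evidence(p_value=0.03, artifact_available=True)))
print(gate(Evidence(p_value=0.03)))
```

Because every default in the sketch is the unfavorable one, an empty or partial payload falls through to REJECT, which is the fail-closed behavior the page describes.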

Not an LLM checker. Not a plagiarism tool.

CAPAS fills a gap that existing tools leave open: structured, deterministic admissibility gating for scientific claims.

What CAPAS is NOT

Not a fact-checker — CAPAS doesn't verify truth, it validates evidence licensing and schema compliance

Not a plagiarism detector — citation matching is not the concern; admissibility is

Not an LLM wrapper — zero language model calls in the gate decision path

Not a peer review replacement — CAPAS is a pre-gate boundary tool, not editorial judgment

What CAPAS IS

A deterministic claim admissibility gate — same input always produces the same output

A schema enforcement layer — CAPAS Schema v3 defines exactly what a valid evidence package looks like

A provenance audit tool — every blocker is machine-readable and exportable

An integration-ready API — designed to sit inside research pipelines, publishing workflows, and audit systems

Everyone builds “what can I say?” — CAPAS builds “should this have been said?”

Frontier models and agents generate claims. None of them gate whether a claim is licensed by its evidence before reuse. That side of the stack is empty — and it is the side that governs. CAPAS is evidence-governance infrastructure that starts at that gate.

System | Generates claims | Gates claims
GPT · Claude · Gemini
Deep-research agents
LLM-as-judge: stochastic · no guarantee, not replayable
Fact-checkers: truth, not boundary
CAPAS: ✓ deterministic · replayable · fail-closed (test-proven)

The contamination cascade

Without a claim-level gate, drift survives source review, metadata review, and provenance review — because none of them evaluate the claim boundary itself.

01 · Paper
Benefit in one subgroup
One randomized trial, one patient subgroup, bounded result.
02 · Drift
All-patient benefit claimed
The reusable sentence widens a subgroup result to every patient.
CAPAS · Gate
REWRITE
Stops drift before it becomes governed evidence.
Blocked
Dataset / Model
Population-wide claim reproduced downstream. Prevented.

One mechanism. 10 domains. 26 gates.

A fabricated claim still has to satisfy the conservation laws of its field, and that check is re-derivable with no oracle. The same engine catches a balance sheet that doesn’t close, a survey mean that’s arithmetically impossible (GRIM), a 99%-sensitive test claimed to imply a 99% PPV for a rare disease (the base-rate fallacy), an unbalanced reaction, a dimensionally inconsistent equation, and a qubit with T2 > 2·T1. Any declared number that breaks a domain law forces REJECT. The rule is downgrade-only, so it can only make a verdict stricter. Fail-closed is a proven invariant (18/18 structurally-deficient claims rejected, locked by a test), not a number.

Finance: A=L+E
Statistics: GRIM
Epidemiology: Bayes PPV, RR/OR, vaccine efficacy
Chemistry: stoichiometry, charge, oxidation states
Physics: dimensions, η≤1, v≤c
Quantum: T2≤2T1, Γφ, gate floor
Engineering: Ohm’s law
Biology: Hardy-Weinberg, mark-recapture
Mathematics: root & linear-system checks
Universal: probability & conservation
+ a new domain = one deterministic law
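
Several of the laws named above are short enough to write out. The sketch below shows four of them (GRIM, Bayes PPV, T2 ≤ 2·T1, and A = L + E) together with a downgrade-only combinator. All function names and tolerances are hypothetical, chosen for illustration rather than taken from the CAPAS gates.

```python
import math


def grim_consistent(mean: float, n: int, decimals: int = 2) -> bool:
    """GRIM: can any integer total over n integer responses round to the reported mean?"""
    half_step = 0.5 * 10 ** -decimals
    low = math.floor((mean - half_step) * n)
    high = math.ceil((mean + half_step) * n)
    target = round(mean, decimals)
    return any(
        abs(round(total / n, decimals) - target) < 1e-9
        for total in range(low, high + 1)
    )


def bayes_ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value from Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def coherence_ok(t1: float, t2: float) -> bool:
    """Physical bound on qubit coherence times: T2 <= 2 * T1."""
    return t2 <= 2.0 * t1


def balance_closes(assets: float, liabilities: float, equity: float, tol: float = 0.01) -> bool:
    """Accounting identity A = L + E, within a rounding tolerance (assumed)."""
    return abs(assets - (liabilities + equity)) <= tol


def downgrade(verdict: str, violations: list) -> str:
    """Downgrade-only: a violation forces REJECT; no violation leaves the verdict as-is."""
    return "REJECT" if violations else verdict


print(grim_consistent(5.19, 28))      # a mean of 5.19 from 28 integer responses is impossible
print(bayes_ppv(0.99, 0.99, 0.001))   # roughly 0.09, far from the claimed 0.99
```

The PPV line is the base-rate case from the text: at a prevalence of 1 in 1,000, a test with 99% sensitivity and 99% specificity yields a PPV near 9%, so a declared 99% PPV would break the law and be downgraded.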

It re-derives the evidence. It doesn’t trust it.

Above the schema gate, CAPAS re-computes the claimed result from its raw inputs and only GATEs — marks re-derived — what it can reproduce. What it cannot re-derive it ATTESTs: signed and bound, never marketed as verified. The boundary is explicit on every receipt. No language model is ever in the decision path.

Statistical: re-run the test from raw data
Calibration & chromatography peak area
Clinical datasets (SDTM→ADaM)
Financial ratios from XBRL: US-GAAP & IFRS
Accounting identities (debits=credits, A=L+E)
Dimensional consistency (SI)
Stoichiometry / mass balance
Physical laws (Antoine, c, absolute zero, Holevo)
Quantum re-simulation (below the frontier)
Zero-knowledge proof over hidden data

GATE = re-derived and reproducible. ATTEST = signed, not verified. CAPAS certifies computational consistency of the re-derivable slice — not scientific truth, and not the authenticity of the raw data (the irreducible GIGO residual). It re-derives more than it trusts, and says exactly which is which.
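
The GATE / ATTEST boundary can be illustrated with the simplest re-derivable quantity, a reported mean. In this sketch the function name, the status dictionary, and the choice to REJECT on a mismatch are assumptions; only the GATE and ATTEST labels come from the text above.

```python
import statistics
from typing import Optional, Sequence


def rederive_mean(
    claimed_mean: float,
    raw_values: Optional[Sequence[float]],
    decimals: int = 2,
) -> dict:
    """Recompute a claimed mean from raw inputs and label the result.

    GATE   : raw data supplied and the recomputed mean matches the claim.
    ATTEST : no raw data, so the number can only be signed, not verified.
    REJECT : raw data supplied but the claim does not reproduce (assumed policy).
    """
    if not raw_values:
        return {"status": "ATTEST", "reason": "no raw inputs supplied"}
    recomputed = round(statistics.fmean(raw_values), decimals)
    if recomputed == round(claimed_mean, decimals):
        return {"status": "GATE", "recomputed": recomputed}
    return {
        "status": "REJECT",
        "recomputed": recomputed,
        "reason": "claimed value does not reproduce from raw inputs",
    }


print(rederive_mean(2.5, [1, 2, 3, 4]))
print(rederive_mean(2.5, None))
```

Note what the sketch cannot do: if the raw values themselves are fabricated but consistent, the result is still GATE. That is the GIGO residual the page discloses.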

Every decision is an audit artifact

Frontier models produce text; CAPAS produces an auditable evidence trail — and that is what a regulated buyer actually purchases. Every verdict is operational, not just a label: ACCEPT licenses reuse · REWRITE returns the corrected claim with an Original→Licensed diff · REJECT names the missing evidence fields · HOLD lists the obligations to resolve before reuse.

{
  "claim_id": "claim_drift_001",
  "claim_type": "causal_mechanism_claim",
  "decision": "REWRITE",
  "evidence_contract": "intervention + temporal_order + confounders",
  "blocker": "No intervention or natural-experiment evidence.",
  "reviewer_action": "edit_and_resubmit",
  "audit_hash": "sha256:f6c361712b894a2d2e4005e345341b4bbcaa2d7b7819121942d550972800b17a",
  "non_claim": true
}

Tested against real retracted science

28 famous claims — every one of them passed peer review and was published in Nature, Science, or The Lancet. 14 were later retracted (Wakefield, Surgisphere, Schön, STAP…); 14 were independently replicated (LIGO, Higgs, RECOVERY dexamethasone…). Plausibility could not tell them apart — all 28 looked publishable. The gate separated them by structure.

28/28
Separated by structure on an illustrative, agent-coded retrospective — not an adjudicated benchmark
0/28
Plausibility / peer-review — all 28 passed, retracted and valid alike

Each retracted claim was gated for its actual structural deficiency (no controls, no independent reproduction, unauditable data), the same gaps it was retracted for. Honest scope: this is an illustrative retrospective whose corpus is coded from public retraction records (Retraction Watch, journal notices); it validates the gate’s structural logic, not fraud-detection from raw paper text. Partner pilots pending.

Where teams deploy CAPAS

From academic publishing to enterprise AI governance — any workflow that touches scientific claims benefits from a deterministic admissibility gate.

Academic publishing

Gate submitted manuscripts at the desk-review stage. Catch inadmissible claims before peer review consumes reviewer time.

AI training pipelines

Validate claim-evidence pairs before they enter training datasets. Prevent inadmissible or unlicensed scientific claims from corrupting model training.

Regulatory compliance

Generate structured audit trails for claims in regulated industries. Every verdict is machine-readable, timestamped, and exportable.

Research pilot programs

Run batch evaluations across a corpus of claims. Identify systemic evidence gaps before a full production deployment.

Claim fact-checking desks

Triage incoming claims automatically. Surface only those with sufficient evidence structure for human fact-checker review.

API integration

Embed CAPAS gate calls directly into your existing research pipeline, CMS, or editorial system. JSON in, structured verdict out.

Wrap your LLM. Stop trusting plausible.

An installable verification layer: your model proposes, CAPAS disposes. It never lets an unsupported claim through as ACCEPT — it re-derives what is re-derivable, grades the rest, and emits a verifiable reward your model can’t game by sounding right. No language model in the verdict.

$ pip install capas-claim-gate
# live on PyPI · or from source: pip install git+https://github.com/fomv9354lve/capas-inteligentes
from capas_sdk import verified
# your model proposes the evidence; CAPAS gates it
verdict = verified(my_llm, "reproducibility_check")["verdict"]
# -> ACCEPT / REWRITE / REJECT / HOLD (never the LLM’s call)
0/3 false-accepts with CAPAS
3/3 false-accepts LLM-alone
illustrative 3-case example, not a benchmark

The boost is reliability, not capability — the model doesn’t get smarter, its output becomes admissible-or-deferred. Honest scope: it grounds record↔text, not text↔reality — a source that lies about its methods and withholds its data passes (the GIGO ceiling), so CAPAS says exactly which slice it re-derived and which it could only attest.

However you build, CAPAS is the gate

The same deterministic engine, exposed three ways. CAPAS is never the language model — it is the fail-closed layer the model proposes into.

Library · pip

Wrap your LLM in code. gate · reward · certificate · invariants · gate_quantum.

pip install capas-claim-gate
live on PyPI

Skill · MCP server

A tool any agent calls — Claude Code, Desktop, Cursor. Zero dependencies. The agent proposes, CAPAS disposes.

python3 capas_mcp.py  # see docs/mcp.md

API · signed certificate

Hosted, auth-gated issuance of a signed, persisted, tamper-evident admissibility certificate — the audit artifact a regulated buyer purchases.

POST /api/certificate → id + signature

IBM’s production calibration system follows the same pattern: invariant checks, threshold gates, a fail-closed verdict, and a disclosed boundary. The architecture isn’t speculative: a hardware vendor runs it at scale.

We beat a vendor benchmark with the vendor’s own numbers.

IBM’s headline gate-error figure is an optimistic lower bound — for real circuits it under-states by 3–10× (Proctor, Nat. Phys. 2022). From the same published calibration fields, CAPAS re-derives the complete error budget — fully auditable, no hardware required.

IBM headline (RB): 1.6×10⁻³
Re-derivable complete floor: 1.9×10⁻²
Error the headline hid: 2–11×

Honest scope: the gap is ~2× on a healthy qubit and reaches 11× on a dephasing-limited one (the numbers shown are that worst case — a real qubit whose anomaly the headline hides). Either way it is re-derived from the vendor’s own published fields. Validated live: CAPAS re-found the chip’s one anomalous qubit and its bad-coupler cluster from calibration alone, admitted a real Bell measurement bounded by two independent oracles, and its cross-validation held on a second device. The error-correction prescription (DD, rep-delay, readout mitigation) is re-derived from the same row. Full method →

We didn’t invent the architecture. The most demanding hardware stack in the world already runs it.

IBM’s quantum stack will not run your circuit until it clears a calibration gate: every job is checked against frozen, re-derived device invariants, fail-closed, with no model-of-the-day in the decision. That is exactly the CAPAS architecture: re-derive from declared evidence, refuse on violation, keep the verdict deterministic. It already runs in production, at the frontier of physics. CAPAS generalizes the same mechanism across ten domains, including finance, statistics, epidemiology, chemistry, physics, and quantum. Two independent systems converging on the same admissibility mechanism is consilience: evidence the design is structural, not a pitch.

Honest scope: the identities CAPAS checks are textbook and the convergence is architectural, not a partnership claim. The novelty is the cross-domain composition — and on frozen calibration, CAPAS is in fact the stricter of the two.

The moat isn’t any single gate. It’s becoming the standard claims are checked against.

Any one gate is copyable. What compounds is trust you can audit: a deterministic, re-derivable, fail-closed verdict a regulated buyer can reproduce and a third party can try to refute. In the domains where “the model said so” is not admissible — finance, clinical, safety, quantum — the reference that keeps surviving refutation becomes the standard. CAPAS is built to be that reference: every headline claim CLOSED / BACKED / SCOPED, every verdict re-derivable to a hash, a survive-refutation ledger anyone can challenge. The standard, not the tool, is the moat.

Open engine. Reserved mark. Pre-committed to neutral governance.

CAPAS ships open-core (Apache-2.0): the schema, calculus, reference gate, CLI, tests, and benchmark corpus are yours to run and fork. The defensible asset is not the code — it’s the certification mark, and a mark is only worth trusting if it can’t be pulled. So the mark is reserved and pre-committed to neutral governance before adoption, not after — the one move that let Open Policy Agent survive its sponsor’s acquisition while MongoDB, Elastic, HashiCorp, and Redis each triggered a fork by relicensing their core to capture value. We renounce that move in writing. Governance charter →

Conformance is self-runnable and deterministic

python3 benchmarks/conformance.py runs the exact suite the certifier runs and returns the same verdict and the same hash. No private process to trust. The mark attests an artifact passed that.

Statistical-claim admissibility for regulated submissions.

Pinnacle 21 already checks whether a trial dataset is structurally well-formed (CDISC conformance). It does not check whether the reported statistic is licensed by its evidence. CAPAS does: significance versus alpha, multiplicity, confidence-interval-excludes-null, effect direction, endpoint pre-specification — re-derivably, beside the submission, not as a replacement. Validated on a 3,024-case admissibility corpus, 0 deficient claims accepted (fail-closed). Market validation →
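
The checks listed above (significance versus alpha, multiplicity, confidence interval excluding the null, effect direction, endpoint pre-specification) can each be expressed as a one-line rule. The sketch below is illustrative only: the parameter names and blocker strings are hypothetical, and the Bonferroni correction is an assumed stand-in for whatever multiplicity rule CAPAS actually applies.

```python
from typing import List


def statistical_blockers(
    p_value: float,
    alpha: float,
    n_comparisons: int,
    ci_low: float,
    ci_high: float,
    null_value: float,
    estimate: float,
    claimed_direction: str,  # "positive" or "negative"
    prespecified: bool,
) -> List[str]:
    """Return the reasons a reported statistic is not licensed by its evidence."""
    blockers = []

    # Multiplicity: Bonferroni-adjusted threshold (assumed correction method).
    adjusted_alpha = alpha / max(n_comparisons, 1)
    if p_value > adjusted_alpha:
        blockers.append("p_exceeds_adjusted_alpha")

    # The confidence interval must exclude the null value.
    if ci_low <= null_value <= ci_high:
        blockers.append("ci_includes_null")

    # The observed effect direction must match the claimed one.
    if estimate > null_value:
        observed = "positive"
    elif estimate < null_value:
        observed = "negative"
    else:
        observed = "none"
    if observed != claimed_direction:
        blockers.append("direction_mismatch")

    # The endpoint must have been declared before analysis.
    if not prespecified:
        blockers.append("endpoint_not_prespecified")

    return blockers


# One pre-specified endpoint passes; the same p-value across five comparisons does not.
print(statistical_blockers(0.03, 0.05, 1, 0.1, 0.9, 0.0, 0.5, "positive", True))
print(statistical_blockers(0.03, 0.05, 5, 0.1, 0.9, 0.0, 0.5, "positive", True))
```

An empty list means no statistical blocker was found; any entry is a machine-readable reason the claim cannot be accepted as stated.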

Ready to gate your first claim?

Load a sample payload or build your own evidence contract in under two minutes. No account required for the pilot.