Methodology

Deterministic claim admissibility
for scientific AI teams.

CAPAS gives review teams a deterministic gate before scientific claims enter reports, governed datasets, or fine-tuning preparation. It checks whether supplied evidence licenses a claim for controlled reuse, and returns a replayable packet — the verdict, the deterministic reason, the required evidence contract, the no-LLM marker, and fine-tune readiness — re-derivable from the same input.


Reproducible engine benchmark · pilots pending

1,238 synthetic-grid engine decisions across 12 claim families

78% gated on the adversarial synthetic grid (not a production drift rate)

14 fine-tune readiness criteria

Synthetic benchmark: full verdict-space coverage on an adversarial grid, not a production drift rate. No production pilot has run yet.

Recent gate decisions · SCHEMA V3

ACCEPT · statistical_confidence: p=0.03 ≤ alpha=0.05

REWRITE · direction not independently licensed

REJECT · artifact unavailable for reproducibility

HOLD · RO-Crate attestation pending CLI verification

01 · ENEMY

Claim drift

A cautious source sentence becomes an over-scoped reusable claim. CAPAS catches the boundary.

02 · INPUT

Select mode

Guided builder, raw JSON, batch evaluation, or paper/text ingestion.

03 · GATE

Run gate

Schema v3 and claim-type rules return ACCEPT, REWRITE, REJECT, or HOLD.

04 · AUDIT

Inspect decision

Reason, evidence spans, provenance blockers, and fine-tune readiness.

The artifact

A real decision packet — not a description of one.

This is the literal output of the engine on a claim whose reported number re-derives correctly but whose supplied accounting evidence is internally inconsistent. Reproducible: capas_sdk.gate("financial_metric_claim", evidence).

{
  "schema_version": "capas-claim-payload-v3",
  "verdict": "REJECT",
  "reason": "reported_value matches reference within tolerance and period matches; OVERRIDDEN by a domain invariant violation: balance identity VIOLATED — assets 1000 != liabilities 600 + equity 300 (residual 100). The books do not close.",
  "required_fields": ["reported_value", "reference_value", "tolerance", "metric_period_match"],
  "invariant_audit": "FLAG",
  "fine_tune_ready": false,
  "non_claim": "This decision is rule-based over supplied evidence fields, not an LLM judgment."
}
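To make the invariant override in the packet above concrete, here is a minimal Python sketch of a balance-identity check. It is an illustration only: the function names, the `InvariantResult` type, the input field names (`assets`, `liabilities`, `equity`), and the assumption that the base verdict would otherwise have been ACCEPT are all hypothetical and are not taken from the engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InvariantResult:
    ok: bool
    residual: float
    message: str


def check_balance_identity(assets: float, liabilities: float, equity: float,
                           tolerance: float = 0.0) -> InvariantResult:
    """Hypothetical check that assets == liabilities + equity within a tolerance."""
    residual = assets - (liabilities + equity)
    ok = abs(residual) <= tolerance
    state = "holds" if ok else "VIOLATED"
    message = (f"balance identity {state}: assets {assets:g} vs "
               f"liabilities {liabilities:g} + equity {equity:g} "
               f"(residual {residual:g})")
    return InvariantResult(ok, residual, message)


def apply_invariant_override(base_verdict: str, invariant: InvariantResult) -> str:
    """Assumed behavior: a failed domain invariant overrides a passing verdict."""
    return base_verdict if invariant.ok else "REJECT"


# Figures from the packet above: 1000 vs 600 + 300 leaves a residual of 100.
result = check_balance_identity(assets=1000, liabilities=600, equity=300)
verdict = apply_invariant_override("ACCEPT", result)
print(verdict, "-", result.message)
```

The point of the sketch is ordering: the value-match rule can pass on its own, and the invariant is evaluated afterwards as a veto.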

Decision path

An LLM may be used upstream — to extract the payload from a paper, or draft a rewrite suggestion. It is never used to determine admissibility. The verdict (ACCEPT / REWRITE / REJECT / HOLD) is produced only by versioned deterministic rules over the supplied evidence fields. The same payload always yields the same verdict, so any decision can be independently re-run and audited. The non_claim field is the machine-readable marker of this.

Text-ingested claims additionally carry source evidence spans; the hosted API wraps any packet in a signed, content-addressed certificate (capas_certstore) for tamper-evidence.
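A signed, content-addressed wrapper of this kind can be sketched with the Python standard library. The source does not say which hash or signature scheme capas_certstore uses, so SHA-256 content addressing and an HMAC-SHA256 signature are assumptions here, as are the function names and the certificate field names.

```python
import hashlib
import hmac
import json


def canonical_bytes(packet: dict) -> bytes:
    """Serialize with sorted keys and fixed separators so equal packets hash equally."""
    return json.dumps(packet, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")


def make_certificate(packet: dict, key: bytes) -> dict:
    """Hypothetical certificate: content address plus a keyed signature over it."""
    digest = hashlib.sha256(canonical_bytes(packet)).hexdigest()
    signature = hmac.new(key, digest.encode("ascii"), hashlib.sha256).hexdigest()
    return {"content_address": f"sha256:{digest}",
            "signature": signature,
            "packet": packet}


def verify_certificate(cert: dict, key: bytes) -> bool:
    """Recompute address and signature from the packet and compare in constant time."""
    expected = make_certificate(cert["packet"], key)
    return (hmac.compare_digest(expected["content_address"], cert["content_address"])
            and hmac.compare_digest(expected["signature"], cert["signature"]))


demo_key = b"demo-key-not-for-production"
packet = {"schema_version": "capas-claim-payload-v3",
          "verdict": "REJECT",
          "fine_tune_ready": False}
cert = make_certificate(packet, demo_key)

# Changing the verdict after signing invalidates the certificate.
tampered = {**cert, "packet": {**packet, "verdict": "ACCEPT"}}
print(verify_certificate(cert, demo_key))      # True
print(verify_certificate(tampered, demo_key))  # False
```

Canonical serialization matters as much as the signature: without a fixed key order and separators, two byte-different encodings of the same packet would get different addresses.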

The checklist

The 14 fine-tune readiness criteria.

A claim can be ACCEPTed for a report yet still not be ready for training data. These 14 deterministic checks (verbatim from the engine) gate fine_tune_ready after an ACCEPT — they never change the verdict, only whether the claim may enter fine-tuning preparation.

verdict_accept · the claim verdict is ACCEPT

schema_clean · no schema or required-field blockers

source_backed_evidence · source-backed evidence is attached

external_review · external review is attached

semantic_alignment · claim text alignment is externally certified

witness_independence · witness independence is externally certified

provenance_sources · provenance sources / source URLs present

review_hash_verified · review hash matches the review packet

source_urls_recoverable_hashable · source URLs recoverable with matching hashes

witness_registry_resolved · witness ID resolves in the registry

ro_crate_validated · RO-Crate packet valid and hash-matched

reviewer_attestation_verified · reviewer identity / attestation verifiable

review_id_present · provenance review_id is present

witness_id_present · provenance witness_id is present

Verbatim from capas.evaluate_fine_tune_readiness. Any unmet criterion appears as a named blocker on the packet.
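As a rough illustration of how a checklist like this can produce named blockers, the sketch below evaluates the 14 criterion names listed above against a dictionary of boolean check results. The function `readiness_sketch`, its input shape, and the choice to treat a missing check as unmet are assumptions; this is not the body of capas.evaluate_fine_tune_readiness.

```python
# The 14 criterion names, in the order listed above.
CRITERIA = (
    "verdict_accept",
    "schema_clean",
    "source_backed_evidence",
    "external_review",
    "semantic_alignment",
    "witness_independence",
    "provenance_sources",
    "review_hash_verified",
    "source_urls_recoverable_hashable",
    "witness_registry_resolved",
    "ro_crate_validated",
    "reviewer_attestation_verified",
    "review_id_present",
    "witness_id_present",
)


def readiness_sketch(checks: dict) -> dict:
    """Return fine_tune_ready plus the names of unmet criteria.

    Assumption: a criterion that is absent from `checks` counts as unmet.
    """
    blockers = [name for name in CRITERIA if not checks.get(name, False)]
    return {"fine_tune_ready": not blockers, "blockers": blockers}


# Hypothetical claim that was ACCEPTed but lacks external review and a valid RO-Crate.
checks = {name: True for name in CRITERIA}
checks["external_review"] = False
checks["ro_crate_validated"] = False

report = readiness_sketch(checks)
print(report["fine_tune_ready"], report["blockers"])
```

Note that the function never touches the verdict itself; it only reads `verdict_accept` as one input among fourteen, which mirrors the separation described above.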

Executive so what

The three things that matter to a buyer.

Enemy

Claim drift

CAPAS targets the point where a cautious source sentence becomes an over-scoped reusable claim.

Control point

Gate before reuse

The gate runs before records enter fine-tuning, publication workflows, governed datasets, or downstream reports.

Audit packet

Structured output

Each output carries decision reason, evidence spans, blockers, and a non-LLM marker.

How the gate works

Evidence contracts decide licensed scope.

The gate takes a claim and an evidence package. It checks them against a claim-type evidence contract. The contract defines what evidence is required and what scope it licenses. The gate returns a verdict — deterministically.

Input

Claim text + evidence fields (statistical, artifact, source URL, license, reviewer hash…)

CAPAS Gate

90+ deterministic rule functions check the claim against its evidence contract. No LLM in the decision path.

Output

ACCEPT, REWRITE, REJECT, or HOLD, plus blockers, reviewer action, audit hash, and evidence spans.
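The input, gate, output flow can be sketched as a lookup from claim type to an evidence contract, followed by a pure rule function. Everything named here is hypothetical: `gate_sketch`, the `CONTRACTS` table, the claim-type key, the `p_value` and `alpha` field names, and the mapping of a failed threshold to REJECT are assumptions for illustration, not the engine's rules.

```python
from typing import Callable, NamedTuple


class Decision(NamedTuple):
    verdict: str
    reason: str


def statistical_confidence_rule(evidence: dict) -> Decision:
    """Hypothetical rule: the claim is licensed only if p <= alpha."""
    p, alpha = evidence["p_value"], evidence["alpha"]
    if p <= alpha:
        return Decision("ACCEPT", f"statistical_confidence: p={p} <= alpha={alpha}")
    return Decision("REJECT", f"statistical_confidence: p={p} > alpha={alpha}")


# Contract = required evidence fields + the rule that decides licensed scope.
CONTRACTS: dict[str, tuple[tuple[str, ...], Callable[[dict], Decision]]] = {
    "statistical_confidence_claim": (("p_value", "alpha"), statistical_confidence_rule),
}


def gate_sketch(claim_type: str, evidence: dict) -> Decision:
    """Pure function of its inputs: same payload in, same decision out."""
    required, rule = CONTRACTS[claim_type]
    missing = [field for field in required if field not in evidence]
    if missing:
        # Assumed mapping: incomplete evidence is held, not rejected.
        return Decision("HOLD", "missing required fields: " + ", ".join(missing))
    return rule(evidence)


accepted = gate_sketch("statistical_confidence_claim", {"p_value": 0.03, "alpha": 0.05})
held = gate_sketch("statistical_confidence_claim", {"alpha": 0.05})
print(accepted.verdict, held.verdict)
```

Because the sketch has no randomness, clock, or network access, re-running it on a stored payload reproduces the decision, which is the property the replayable packet depends on.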

Why this matters now

An auditable evidence trail for training-data governance.

Emerging AI-governance frameworks ask teams to document the quality and provenance of the data behind a claim. CAPAS produces exactly that artifact, per claim, deterministically — with no model in the decision path. It is an input to compliance and review work, not a certification of it.

EU AI Act · Art. 10

Data governance

Article 10 requires high-risk AI systems to use training, validation, and testing data that meet quality and governance criteria. CAPAS records, per claim, whether supplied evidence licenses reuse — a traceable check at the moment data enters a dataset.

NIST AI RMF

Traceability

The NIST AI Risk Management Framework emphasizes documentation and traceability across the data lifecycle. Every CAPAS decision emits a reason, evidence spans, provenance blockers, and an audit hash that can be reviewed after the fact.

No LLM judge

Deterministic by design

Unlike LLM-as-judge evaluators, CAPAS has no language model in the decision path. The same payload always yields the same verdict, so a decision can be independently re-run and audited — which a stochastic model judgment cannot guarantee.

Regulatory references are provided for context only. CAPAS does not certify compliance with the EU AI Act, the NIST AI RMF, or any other framework; it produces deterministic, auditable decision artifacts that support governance and review processes.

Two-week pilot

Measure claim drift on your own corpus.

A controlled operating test: can the organization identify which claims are licensed, which must be rewritten, which must be rejected, and which require more evidence before reuse?

Steps

Select one vertical corpus: AI governance, pharma evidence review, model risk, journal reproducibility, or materials R&D.

Convert 500 structured records into CAPAS payloads through the guided constructor, CLI, or upstream extraction adapter.

Run deterministic batch gating and sample 100 decisions for expert adjudication.

Report decision mix, reviewer agreement, false reject rate, provenance blockers, and review capacity redirected.
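The reporting step can be sketched as a small aggregation over adjudicated samples. The definitions below are assumptions, since the page does not define them: reviewer agreement is taken as the share of sampled decisions where the expert verdict equals the gate verdict, and false reject rate as the share of gate REJECTs that the expert did not reject. The four sample pairs are made-up placeholders, not pilot results.

```python
from collections import Counter


def pilot_report(pairs: list[tuple[str, str]]) -> dict:
    """Summarize (gate_verdict, expert_verdict) pairs from an adjudicated sample."""
    if not pairs:
        raise ValueError("need at least one adjudicated decision")
    mix = Counter(gate for gate, _ in pairs)
    agreement = sum(gate == expert for gate, expert in pairs) / len(pairs)
    expert_on_rejects = [expert for gate, expert in pairs if gate == "REJECT"]
    false_reject_rate = (
        sum(expert != "REJECT" for expert in expert_on_rejects) / len(expert_on_rejects)
        if expert_on_rejects else 0.0
    )
    return {
        "decision_mix": dict(mix),
        "reviewer_agreement": agreement,
        "false_reject_rate": false_reject_rate,
    }


# Placeholder sample for illustration only.
sample = [
    ("ACCEPT", "ACCEPT"),
    ("REJECT", "REJECT"),
    ("REJECT", "REWRITE"),
    ("HOLD", "HOLD"),
]
summary = pilot_report(sample)
print(summary)
```

In a real pilot the pairs would come from the 100 sampled decisions, and the metric definitions should be agreed with the adjudicating experts before the run.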

ACCEPT

Licensed for controlled reuse. The fine-tune readiness criteria still have to clear before training use.

REWRITE

Evidence supports a narrower claim. Returns the licensed boundary.

REJECT

Returns which evidence is missing or failing — not a silent no.

HOLD

Returns the steps: supply the missing field, verify provenance, re-gate.

Disclaimers

Required context for buyers.

CAPAS gates supplied evidence fields; it does not infer hidden evidence, provide legal advice, certify broad scientific truth, or replace external review.

The 1,238 decisions and 78% gated share are reproducible from the engine’s own benchmark (benchmarks/family_decision_mix.py) over a synthetic decision-space grid — they demonstrate full verdict-space coverage, NOT a real-world drift rate. No production pilot has been run yet; real rates require an independently adjudicated corpus.

Review-capacity estimates are planning assumptions and must be calibrated against the customer baseline.

Do not share payload URLs or exports containing confidential source text, reviewer IDs, witness IDs, licensed materials, or proprietary provenance paths without authorization.

Decision examples shown here are illustrative. Schema, evidence contracts, and blockers are emitted by the deterministic gate, not by hidden model judgment.
Pilot and capacity figures are planning models, not booked production deployments, and must be calibrated against the organization's own review baseline.
