Two-week deployment packet

Measure claim drift before it contaminates scientific datasets.

CAPAS packages claim admissibility, deterministic gating, human confirmation, batch reporting, and provenance blockers into a pilot that a technical team can measure and a steering committee can understand.

This is not a truth-certification engagement. It is a controlled operating test: can the organization identify which claims are licensed, which must be rewritten, which must be rejected, and which require more evidence before reuse?
Proof before the pilot

It already separated retracted science from replicated science.

The retrospective covers 28 famous claims, all published in Nature, Science, or The Lancet. Fourteen were later retracted (Wakefield, Surgisphere, Schön, STAP, Stapel…); fourteen were independently replicated (LIGO, Higgs, RECOVERY dexamethasone, the Pfizer vaccine trial…). Peer review and plausibility passed all 28 alike; CAPAS separated them by evidence structure.

28/28 separated correctly: 14/14 retracted gated, 14/14 replicated accepted, 0 false-accepts
0/28 separated by plausibility or peer review: every one of the 28 looked publishable
Honest scope. Each retracted claim was gated for its actual structural deficiency (no controls, no independent reproduction, or unauditable data), the same gap it was retracted for. The corpus is coded from public retraction records (Retraction Watch, journal notices), so this validates the gate’s structural logic, not fraud detection from raw paper text. It is evidence that the method works before you commit two weeks to the pilot below.
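
The tally above can be reproduced from coded records with a few lines of Python. This is a minimal sketch, not CAPAS code: the `CodedClaim` type and `separation_summary` function are hypothetical names, and the records are placeholders that only reproduce the reported counts, not the actual coded corpus.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class CodedClaim:
    label: str      # "retracted" or "replicated", taken from the public record
    accepted: bool  # True if the gate accepted the claim, False if it gated it


def separation_summary(claims: List[CodedClaim]) -> Dict[str, int]:
    """Count how the gate's decisions line up with the later outcome."""
    retracted = [c for c in claims if c.label == "retracted"]
    replicated = [c for c in claims if c.label == "replicated"]
    return {
        "retracted_gated": sum(not c.accepted for c in retracted),
        "replicated_accepted": sum(c.accepted for c in replicated),
        "false_accepts": sum(c.accepted for c in retracted),
        "total": len(claims),
    }


# Placeholder records that only mirror the counts reported above.
claims = [CodedClaim("retracted", False)] * 14 + [CodedClaim("replicated", True)] * 14
summary = separation_summary(claims)
print(summary)
```

A false-accept here means a later-retracted claim that the gate let through, which is the failure mode the pilot is designed to measure.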
Delivery path

Five steps, two weeks.

1. Scope
Select corpus, owner, reviewer, and baseline review process.

2. Ingest
Bring paper text, metadata exports, theorem notes, or structured records.

3. Confirm
Human reviewer approves candidate spans before CAPAS decides.

4. Gate
Run single and batch CAPAS decisions with explicit evidence contracts.

5. Read out
Deliver decision mix, exception queue, provenance blockers, and ROI model.
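
To make "deterministic gating with an explicit evidence contract" concrete, the sketch below shows one way such a gate could look. It is an illustration under stated assumptions, not the CAPAS engine: the contract fields are borrowed from the structural deficiencies named earlier (controls, independent reproduction, auditable data), and the `gate` function, the `REQUIRED_EVIDENCE` tuple, and the mapping from gaps to verdicts are all hypothetical.

```python
from enum import Enum
from typing import Mapping, Optional, Tuple


class Verdict(str, Enum):
    ACCEPT = "ACCEPT"
    REWRITE = "REWRITE"
    REJECT = "REJECT"
    HOLD = "HOLD"


# Hypothetical evidence contract: every field must be supplied and True.
REQUIRED_EVIDENCE = ("has_controls", "independent_reproduction", "data_auditable")


def gate(
    evidence: Mapping[str, Optional[bool]],
    wording_overreaches: bool = False,
) -> Tuple[Verdict, str]:
    """Return a verdict and a reason using only the supplied fields."""
    # Absent or unknown fields are never inferred: the record waits for evidence.
    missing = [f for f in REQUIRED_EVIDENCE if evidence.get(f) is None]
    if missing:
        return Verdict.HOLD, "missing evidence: " + ", ".join(missing)
    # A field explicitly recorded as False is an unmet contract term.
    failed = [f for f in REQUIRED_EVIDENCE if evidence[f] is False]
    if failed:
        return Verdict.REJECT, "unmet contract: " + ", ".join(failed)
    if wording_overreaches:
        return Verdict.REWRITE, "evidence supports a weaker claim than stated"
    return Verdict.ACCEPT, "all contract fields satisfied"


verdict, reason = gate({"has_controls": True, "independent_reproduction": None})
print(verdict.value, "-", reason)
```

Because the function has no randomness and reads nothing beyond its arguments, the same record always produces the same verdict and reason, which is what makes batch runs auditable.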

Two-week operating plan

Week by week.

Week 1 — Calibrate the gate
Select a controlled corpus of candidate scientific claims.
Map source material to claim categories and evidence fields.
Establish baseline reviewer workflow and decision taxonomy.
Confirm sensitive-data handling, licensing constraints, and export boundaries.
Week 2 — Run, adjudicate, read out
Run single and batch gates on the selected records.
Review HOLD, REWRITE, REJECT, and provenance-blocked exceptions.
Measure decision mix, reviewer agreement, and evidence gaps.
Deliver executive summary, audit packet, exception queue, and deployment backlog.
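
Reviewer agreement in Week 2 can be measured in several ways. The sketch below uses Cohen's kappa for two reviewers as one common choice; the packet does not specify the statistic, so treat the metric and the function name `cohens_kappa` as assumptions.

```python
from collections import Counter
from typing import Sequence


def cohens_kappa(a: Sequence[str], b: Sequence[str]) -> float:
    """Chance-corrected agreement between two reviewers' verdicts."""
    if len(a) != len(b) or not a:
        raise ValueError("both reviewers must label the same non-empty set")
    n = len(a)
    # Share of records where the two reviewers gave the same verdict.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance, from each reviewer's verdict frequencies.
    count_a, count_b = Counter(a), Counter(b)
    labels = set(count_a) | set(count_b)
    expected = sum(count_a[k] * count_b[k] for k in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both reviewers used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


reviewer_1 = ["ACCEPT", "ACCEPT", "HOLD", "REJECT"]
reviewer_2 = ["ACCEPT", "HOLD", "HOLD", "REJECT"]
print(round(cohens_kappa(reviewer_1, reviewer_2), 3))  # 0.636
```

In this toy example the reviewers agree on three of four records (0.75 raw agreement), but kappa is lower because some of that agreement would be expected by chance.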
Buyer-ready artifacts

What you receive at the end.

📊

Audit export

CSV/JSON with verdict, reason, schema version, record ID, evidence fields, and provenance blockers.
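
A rough picture of what one audit row might contain, written with the Python standard library. The field names follow the list above, but the exact schema, the `AuditRecord` type, and the choice to serialize evidence as a JSON string inside the CSV are assumptions made for illustration.

```python
import csv
import io
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class AuditRecord:
    record_id: str
    verdict: str
    reason: str
    schema_version: str
    evidence: Dict[str, str] = field(default_factory=dict)
    provenance_blockers: List[str] = field(default_factory=list)


COLUMNS = ["record_id", "verdict", "reason", "schema_version",
           "evidence", "provenance_blockers"]


def to_json(records: List[AuditRecord]) -> str:
    return json.dumps([asdict(r) for r in records], indent=2, sort_keys=True)


def to_csv(records: List[AuditRecord]) -> str:
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for r in records:
        row = asdict(r)
        # Nested values are flattened so each record stays on one CSV row.
        row["evidence"] = json.dumps(row["evidence"], sort_keys=True)
        row["provenance_blockers"] = ";".join(row["provenance_blockers"])
        writer.writerow(row)
    return buf.getvalue()


records = [AuditRecord(
    record_id="rec-001",  # placeholder values throughout
    verdict="HOLD",
    reason="missing evidence: independent_reproduction",
    schema_version="0.1",
    evidence={"has_controls": "yes"},
    provenance_blockers=["missing reviewer attestation"],
)]
print(to_csv(records))
```

Keeping the schema version on every row lets a governance reviewer tell which contract a decision was made under, even after the schema changes.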

🔀

Exception queue

HOLD/REWRITE/REJECT records routed for expert adjudication and wording repair.

🔍

Provenance report

Open review hashes, source URL hashes, witness IDs, reviewer attestations, and RO-Crate blockers.
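
As an illustration of how such a report entry could be assembled, the sketch below hashes a source URL and an open review text and lists what is still missing. SHA-256, the `provenance_entry` function, and the blocker wording are assumptions; the packet does not state which hash or record layout CAPAS uses.

```python
import hashlib
from typing import Dict, List, Optional, Union


def sha256_hex(text: str) -> str:
    """Hex digest of the UTF-8 bytes of a string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def provenance_entry(
    source_url: str,
    review_text: Optional[str] = None,
    witness_id: Optional[str] = None,
    attested_by: Optional[str] = None,
) -> Dict[str, Union[str, None, List[str]]]:
    # Each missing item becomes an explicit blocker instead of a silent gap.
    blockers: List[str] = []
    if review_text is None:
        blockers.append("missing open review")
    if witness_id is None:
        blockers.append("missing witness ID")
    if attested_by is None:
        blockers.append("missing reviewer attestation")
    return {
        "source_url_sha256": sha256_hex(source_url),
        "open_review_sha256": sha256_hex(review_text) if review_text is not None else None,
        "witness_id": witness_id,
        "attested_by": attested_by,
        "blockers": blockers,
    }


entry = provenance_entry("https://example.org/paper", witness_id="w-17")
print(entry["blockers"])
```

Storing hashes instead of the raw review text or URL keeps the report comparable across runs without copying sensitive payloads into it.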

📈

Decision mix

ACCEPT, REWRITE, REJECT, HOLD, fine-tune-ready, and provenance-blocked counts by corpus slice.
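
A decision mix of this kind is a grouped count. The sketch below assumes each record is a dictionary with `slice`, `verdict`, and `provenance_blockers` keys, and it treats "fine-tune-ready" as an accepted record with no provenance blockers; both the record shape and that definition are assumptions, not the CAPAS schema.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Mapping


def decision_mix(records: Iterable[Mapping]) -> Dict[str, Dict[str, int]]:
    """Count verdicts and readiness flags for each corpus slice."""
    mix = defaultdict(Counter)
    for record in records:
        counts = mix[record["slice"]]
        counts[record["verdict"]] += 1
        if record.get("provenance_blockers"):
            counts["provenance_blocked"] += 1
        elif record["verdict"] == "ACCEPT":
            # Assumed definition: accepted and free of provenance blockers.
            counts["fine_tune_ready"] += 1
    return {name: dict(counts) for name, counts in mix.items()}


records = [
    {"slice": "biology", "verdict": "ACCEPT", "provenance_blockers": []},
    {"slice": "biology", "verdict": "ACCEPT", "provenance_blockers": ["missing witness ID"]},
    {"slice": "biology", "verdict": "HOLD", "provenance_blockers": []},
    {"slice": "physics", "verdict": "REWRITE", "provenance_blockers": []},
]
mix = decision_mix(records)
print(mix)
```

Counting by slice makes it visible when one part of the corpus, such as a single journal or export source, accounts for most of the exceptions.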

⚠️

Claim contamination register

Examples where source evidence supported a weaker claim than the candidate record asserted.

📋

Executive readout

What can be reused, what needs rewrite, what cannot be reused, and what operational control is required next.

Roles and decision rights

Who owns what.

AI Governance Lead
Owns risk framing, success criteria, and decision policy for controlled reuse.
Research Ops / Data Lead
Owns corpus selection, source access, metadata hygiene, and export workflow.
Subject-Matter Reviewer
Confirms evidence spans, adjudicates exceptions, and approves claim rewrites.
Technical Owner
Runs browser/CLI/API surfaces, preserves audit exports, and maps integration backlog.
Legal / Provenance Reviewer
Reviews licensing, confidentiality, source reuse, attestation, and sensitive-payload handling.
Executive Sponsor
Receives decision mix, contamination examples, and go/no-go recommendation for rollout.
Success criteria

How we know it worked.

Operational success
Every pilot record has a traceable decision and source context.
HOLD/REWRITE/REJECT records produce an actionable exception queue.
Reviewers can explain why accepted claims are licensed for controlled reuse.
Exported audit artifacts are complete enough for governance review.
Commercial success
The buyer sees a measurable claim contamination rate on its own corpus.
The team identifies repeatable provenance blockers and schema gaps.
The pilot produces a next-step integration backlog instead of a one-off demo.
The sponsor can decide whether to expand to CLI/API or workflow integration.
Governance model

How CAPAS stays in its lane.

CAPAS gates supplied evidence only

The engine does not infer hidden evidence, perform broad paper understanding, or certify scientific truth.

Humans confirm spans

Candidate extraction is an aid. A reviewer must confirm claim wording and evidence spans before the gate is treated as operational output.
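
One way to enforce this rule in an integration is to make confirmation a precondition in code, so an unconfirmed span cannot reach the gate by accident. The sketch below is illustrative only; `CandidateSpan`, `confirm`, and `operational_input` are hypothetical names and not part of any CAPAS interface.

```python
from dataclasses import dataclass, replace
from typing import Optional


class UnconfirmedSpanError(Exception):
    """Raised when a span reaches the gate without human confirmation."""


@dataclass(frozen=True)
class CandidateSpan:
    claim_text: str
    evidence_text: str
    confirmed_by: Optional[str] = None  # reviewer ID; None until confirmed


def confirm(span: CandidateSpan, reviewer_id: str) -> CandidateSpan:
    """Return a copy of the span marked as confirmed by a named reviewer."""
    if not reviewer_id.strip():
        raise ValueError("a reviewer ID is required to confirm a span")
    return replace(span, confirmed_by=reviewer_id)


def operational_input(span: CandidateSpan) -> CandidateSpan:
    """Pass the span through only if a reviewer has confirmed it."""
    if span.confirmed_by is None:
        raise UnconfirmedSpanError("span has not been confirmed by a reviewer")
    return span


candidate = CandidateSpan("Drug X reduces mortality.", "Table 2, primary endpoint")
ready = operational_input(confirm(candidate, "reviewer-1"))
print(ready.confirmed_by)  # reviewer-1
```

Making the span immutable means a confirmation always produces a new object, so the audit trail can show both the candidate and the confirmed version.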

External provenance verified outside browser

Active checks for hashes, attestations, witness registries, and provenance packets require CLI/API verification.