Windows · Early Access · Active development

Most AI tools help you find information.
This one helps you decide what's true.

Your analyst rebuilt last quarter's competitive landscape from scratch — again. A claim made it into the final brief because nobody tracked the conflicting source. The citation was a trade blog. The board approved it anyway, because there was no way to know.

This is not a research quality problem. It is a structural problem with how evidence-based work is done.

87–92% LLM self-reported confidence
vs
11–70% Formula-computed per-claim score
(based on actual source tier quality)

Epistamate never uses LLM self-reported confidence in its scores. It computes confidence from the evidence underneath — source credibility tier, cross-source agreement, adversarial challenge outcome, evidence recency.

Not a search engine · Not a chatbot · Not a RAG wrapper · Not a report generator

A claim-based research system — for building knowledge you can defend.

📄 Published · Zenodo
DCBR Engine Architecture
⚖️ EU AI Act
Article 12 aligned
🔒 Privacy-first
Everything stays on your machine
🛡️ Defensive disclosure
IP.com · IPCOM/000277741

Every LLM tells you it's confident.
That confidence is not the same as evidence quality.

🎭

The model doesn't know what it doesn't know

A frontier LLM producing a research brief will report high confidence even when the underlying sources are weak, out of date, or non-existent. It has no mechanism to distinguish between a claim corroborated by three Tier 1 sources and a claim it generated from pretraining memory.

🔍

Epistamate separates the claim from the confidence

Every finding is a structured assertion — with a source credibility tier, provider consensus count, adversarial challenge outcome, and evidence age. The formula-computed score is deterministic. The 29% claim is shown alongside the 90% one. Nothing is averaged away.
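As an illustration of what such a structured assertion might look like, here is a minimal Python sketch. The `Claim` class, its field names, the `ChallengeOutcome` labels, and the example values are all assumptions made for illustration; they are not Epistamate's actual schema.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class ChallengeOutcome(Enum):
    # Hypothetical labels for the result of the adversarial challenge phase.
    SURVIVED = "survived"
    WEAKENED = "weakened"
    REFUTED = "refuted"


@dataclass(frozen=True)
class Claim:
    # All field names are illustrative assumptions.
    text: str
    source_tier: int            # 1 = most credible tier
    provider_consensus: int     # number of independent providers that agree
    challenge: ChallengeOutcome
    evidence_date: date         # date of the most recent supporting evidence

    def evidence_age_days(self, today: date) -> int:
        """Age of the evidence in days, relative to the given date."""
        return (today - self.evidence_date).days


# Placeholder claims; the text and values are made up for the example.
claims = [
    Claim("Example claim A", 1, 3, ChallengeOutcome.SURVIVED, date(2025, 1, 10)),
    Claim("Example claim B", 3, 1, ChallengeOutcome.WEAKENED, date(2023, 6, 1)),
]

# Each claim stays individually addressable; nothing is merged or averaged.
for c in claims:
    print(c.source_tier, c.provider_consensus, c.challenge.value, c.text)
```

Keeping the four evidence attributes as separate fields is what lets a weak claim sit next to a strong one without being blended into a single aggregate number.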

📊

The gap is larger than you think

In live sessions, LLM-reported aggregate confidence typically runs 87–92%. Formula-computed per-claim scores, reflecting actual source tier quality, range 11–70%. That gap is not a rounding error. It is the difference between research you can defend and research that sounds authoritative.

Not a summary. An argument.

Epistamate structures research as a claim vault — every finding individually addressable, scored, and traceable. Contradictions are named. Gaps are tracked. The decision record is immutable.

Try the interactive demo →

Enter a research question. See how claims are extracted, scored, and challenged — on a real topic, with real scores.

01

Quick Scan

Topic map, initial claim stubs, prior verified claims retrieved from the knowledge graph.

02

Deep Research

Multi-provider parallel retrieval. Each claim scored against the confidence formula. Gaps extracted as typed objects.

03

Adversarial Challenge

Mandatory phase. Claims that don't survive lose their socratic bonus. Challenges are persisted, not discarded.

04

Structured Brief + Decision Log

WEAK and UNVERIFIED claims excluded from Key Findings. Decision log entry immutably preserves the evidence state at time of decision.
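The exclusion rule in step 04 can be sketched roughly as follows. Only the WEAK and UNVERIFIED labels come from the text above; the STRONG and MODERATE labels, the thresholds, and the dictionary layout are assumptions for illustration.

```python
def status_for(score: float, has_sources: bool) -> str:
    # Thresholds and the STRONG/MODERATE labels are hypothetical;
    # only WEAK and UNVERIFIED are named in the source text.
    if not has_sources:
        return "UNVERIFIED"
    if score >= 0.7:
        return "STRONG"
    if score >= 0.4:
        return "MODERATE"
    return "WEAK"


def key_findings(claims):
    """Keep only claims whose status is neither WEAK nor UNVERIFIED."""
    excluded = {"WEAK", "UNVERIFIED"}
    return [c for c in claims if status_for(c["score"], c["has_sources"]) not in excluded]


# Placeholder claims with made-up scores.
claims = [
    {"id": "c1", "score": 0.90, "has_sources": True},
    {"id": "c2", "score": 0.29, "has_sources": True},
    {"id": "c3", "score": 0.55, "has_sources": False},
]
print([c["id"] for c in key_findings(claims)])  # prints ['c1']
```

In this sketch the filter only decides what reaches Key Findings; the excluded claims stay in the input list, so they remain visible alongside the stronger ones.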

Every AI tool gives you answers.
None tell you which to trust.

01

Research is not answers. Research is claims.

The atomic unit is a claim, not a summary. Every finding is a structured, scored assertion — with a source tier, provider consensus count, adversarial challenge outcome, and evidence age. The 29% claim is shown alongside the 90% one. Epistamate never averages away what it doesn't know, and it never hides uncertainty behind confident prose.

Your documents count. An authoritative report you ingest contributes directly to confidence scores — outweighing what the model guesses from memory.

02

Knowledge resets every session. Research does not.

The same analyst rebuilds the same knowledge every quarter. Epistamate maintains a durable claim graph across sessions. Session five builds on what sessions one through four established. Contradictions are preserved, not discarded. The knowledge graph compounds with use.

Works for general professional research — not locked to academic literature like Elicit, Consensus, or Scite.

03

You already do this work. You just do it in documents that don't compound.

The best strategy research already works the way the engine works: individual claims are sourced and graded, contradictions are noted, gaps are named, and the final recommendation is honest about its confidence level. What it doesn't do is carry that structure forward to the next engagement, the next client, the next analyst who joins the team.

The structured brief a senior consultant produces for a board is a claim vault — it just doesn't look like one, and it evaporates when the project ends.

Built for work where being wrong has consequences.

The contribution is not any individual component.
It is their co-presence.

Removing any one of the six properties degrades the system to something existing tools already do.

Capability · FActScore / VeriScore · GraphRAG · MemGPT · Commercial tools · Std. LLM · Epistamate
Typed claim extraction · ✓ atomic · partial
Multi-factor evidence confidence · retrieval-based · ✓ formula
Tier-enforced source hierarchy
Adversarial challenge (pre-synthesis)
Gap tracking (typed, persistent)
Cross-session compounding · partial · partial
Bidirectional operation
Domain configurability (runtime) · partial
Decision log (Article 12 compatible)
Immutable audit snapshot

Desktop-native. Privacy by design.

Individual analyst

Local SQLite storage with vector similarity. Apart from LLM API calls, no research data is transmitted to external servers. Your claims, sources, and knowledge graph stay on your machine.

Team deployment

Shared knowledge graphs, parallel research runs, consolidated decision logs. Appropriate for investment due diligence, legal research, regulatory compliance.

Enterprise

PostgreSQL-backed store for concurrent deployment. No architectural change to the reasoning pipeline — same formula, same audit trail, at scale.

What people ask before getting in touch.

How is Epistamate different from a RAG system?

A RAG system retrieves documents and passes them to an LLM, which produces a summary. Epistamate extracts structured claims from that retrieval and scores each one against a deterministic confidence formula based on source credibility tier, cross-source agreement, adversarial challenge outcome, and evidence recency. It also tracks what is unknown alongside what is known. The output is a scored claim vault with a full provenance chain, not a synthesised summary. The knowledge graph persists across sessions; a RAG context window resets.

How does Epistamate handle AI hallucinations?

Epistamate never uses an LLM's self-reported confidence in its scoring. It computes confidence from the evidence underneath each claim: the credibility tier of cited sources, how many independent providers corroborated the claim, whether the claim survived adversarial challenge, and how recent the evidence is. A claim the model is highly confident about but that only cites a single Tier 3 source will score low. A claim from three Tier 1 documents with cross-provider consensus will score high. The formula separates rhetorical confidence from evidential quality.
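To make the contrast concrete, here is a minimal sketch of a deterministic score built from the four evidence factors named above. The actual formula is not given on this page: the weights, the tier values, the half-life decay, and the size of the bonus for surviving challenge are all assumptions chosen for illustration.

```python
# Hypothetical weights and tier values; not Epistamate's real parameters.
TIER_CREDIBILITY = {1: 1.0, 2: 0.6, 3: 0.3}
WEIGHTS = {"tier": 0.4, "consensus": 0.3, "recency": 0.2}
SOCRATIC_BONUS = 0.1       # granted only if the claim survives challenge
HALF_LIFE_DAYS = 365.0     # assumed: recency value halves every year


def confidence(source_tiers, providers_agreeing, survived_challenge, evidence_age_days):
    """Deterministic score in [0, 1] computed from evidence only.

    No model self-reported confidence enters the calculation.
    """
    if not source_tiers:
        return 0.0
    tier = max(TIER_CREDIBILITY[t] for t in source_tiers)   # best cited source
    consensus = min(providers_agreeing, 3) / 3               # saturates at 3 providers
    recency = 0.5 ** (evidence_age_days / HALF_LIFE_DAYS)    # exponential decay with age
    score = (WEIGHTS["tier"] * tier
             + WEIGHTS["consensus"] * consensus
             + WEIGHTS["recency"] * recency)
    if survived_challenge:
        score += SOCRATIC_BONUS
    return round(score, 3)


# A single Tier 3 source, two years old, that did not survive challenge.
weak = confidence([3], 1, False, 730)
# Three Tier 1 sources, three providers in agreement, recent, survived challenge.
strong = confidence([1, 1, 1], 3, True, 30)
print(weak, strong)
```

Under these assumed parameters the single Tier 3 claim lands far below the three Tier 1 claim, and the same inputs always give the same score, whatever confidence the model itself expresses.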

What does "EU AI Act Article 12 aligned" mean?

Article 12 of the EU AI Act (record-keeping) requires that high-risk AI systems technically allow the automatic logging of events over the system's lifetime, so that what the system did, on what basis, and at what point in time can be audited after the fact. Epistamate's Decision Log is designed to produce this kind of record: an immutable, timestamped snapshot of the full evidence state at the moment a decision was logged, including which claims were verified, which were contested, and which gaps were acknowledged. The log is a by-product of how the system works by default, not a compliance layer added afterward. It is not a legal certification; qualified legal counsel must assess applicability.
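One way to picture such a record is a frozen entry that stores a snapshot of the evidence state together with a hash of that snapshot. This is a sketch under stated assumptions: the `DecisionLogEntry` class, its fields, and the use of SHA-256 are hypothetical and are not taken from Epistamate's implementation.

```python
import hashlib
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)           # frozen: fields cannot be reassigned after creation
class DecisionLogEntry:
    decision: str
    logged_at: str                # ISO 8601 UTC timestamp
    evidence_state: str           # canonical JSON snapshot of claims and gaps
    digest: str                   # SHA-256 of the snapshot, for tamper detection


def log_decision(decision, verified, contested, gaps):
    """Capture the evidence state at the moment the decision is logged."""
    snapshot = json.dumps(
        {"verified": verified, "contested": contested, "gaps": gaps},
        sort_keys=True,           # stable key order gives a stable digest
    )
    return DecisionLogEntry(
        decision=decision,
        logged_at=datetime.now(timezone.utc).isoformat(),
        evidence_state=snapshot,
        digest=hashlib.sha256(snapshot.encode("utf-8")).hexdigest(),
    )


# Placeholder identifiers for claims and gaps.
entry = log_decision("Proceed with option A", ["c1"], ["c2"], ["g1"])
print(entry.logged_at, entry.digest[:12])
```

Storing the snapshot as serialized text rather than as references to live claims means later changes to the knowledge graph cannot alter what the entry says the evidence was at decision time.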

How does cross-session compounding work?

Verified claims from session N are stored in a persistent knowledge graph. In session N+1, when a semantically similar question is asked, prior verified claims are retrieved and contribute directly to the new session — reducing re-derivation burden and increasing confidence scores for claims with existing corroboration. Contradictions between sessions are preserved as typed objects, not silently resolved. The graph accumulates with use; it does not reset.
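The retrieval step could look something like the sketch below, which stores claims in SQLite and pulls back prior verified claims by cosine similarity. The table layout, the status labels, the toy three-dimensional embeddings, and the 0.8 threshold are assumptions for illustration; a real deployment would use embeddings from a model and a file-backed database.

```python
import json
import math
import sqlite3


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


db = sqlite3.connect(":memory:")   # a file path here would persist across sessions
db.execute(
    "CREATE TABLE claims (id TEXT PRIMARY KEY, text TEXT, status TEXT, embedding TEXT)"
)
db.executemany(
    "INSERT INTO claims VALUES (?, ?, ?, ?)",
    [
        # Placeholder claims with toy embeddings stored as JSON.
        ("c1", "prior claim one", "VERIFIED", json.dumps([1.0, 0.0, 0.0])),
        ("c2", "prior claim two", "VERIFIED", json.dumps([0.0, 1.0, 0.0])),
        ("c3", "prior claim three", "CONTESTED", json.dumps([0.9, 0.1, 0.0])),
    ],
)


def prior_verified(query_embedding, threshold=0.8):
    """Return verified claims similar to the new question, best match first."""
    rows = db.execute(
        "SELECT id, text, embedding FROM claims WHERE status = 'VERIFIED'"
    ).fetchall()
    scored = [(cosine(query_embedding, json.loads(e)), cid, text) for cid, text, e in rows]
    return sorted((s for s in scored if s[0] >= threshold), reverse=True)


hits = prior_verified([0.95, 0.05, 0.0])
print([cid for _, cid, _ in hits])  # prints ['c1']
```

The contested claim is left in the table rather than deleted, so the disagreement stays on record; it is simply not returned as prior verified support.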

Can I use my own documents as sources?

Yes. Documents you ingest are assigned a credibility tier within the source hierarchy. An authoritative report from a Tier 1 institution outweighs model-generated inferences. Your documents participate in the confidence formula directly — they are not just context passed to the LLM.

What research domains does it work for?

Epistamate's validated pipeline covers general professional research. The engine is domain-configurable at runtime — source trust hierarchies, claim type vocabularies, scoring weights, and output formats are parameters, not hardcoded logic. Gov/Policy and RegWatch vertical slices are in active development. The same binary can run policy research, investment due diligence, and regulatory compliance with no architectural change.
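Treating the domain as parameters rather than code might look like this sketch. The `DomainConfig` class, the source categories, the claim types, and the weights are all hypothetical values invented for the example; they only show how one pipeline could be pointed at different domains at runtime.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainConfig:
    # Every field name and value here is an illustrative assumption.
    name: str
    source_tiers: dict          # source category -> credibility tier (1 = highest)
    claim_types: tuple          # vocabulary of claim types for the domain
    scoring_weights: dict       # factor name -> weight in the confidence formula
    output_format: str = "structured_brief"

    def __post_init__(self):
        total = sum(self.scoring_weights.values())
        if abs(total - 1.0) > 1e-9:
            raise ValueError(f"scoring weights must sum to 1.0, got {total}")


GENERAL = DomainConfig(
    name="general",
    source_tiers={"peer_reviewed": 1, "industry_report": 2, "trade_blog": 3},
    claim_types=("factual", "forecast", "causal"),
    scoring_weights={"tier": 0.4, "consensus": 0.3, "recency": 0.2, "challenge": 0.1},
)

REGWATCH = DomainConfig(
    name="regwatch",
    source_tiers={"official_gazette": 1, "regulator_guidance": 1, "law_firm_note": 2},
    claim_types=("obligation", "deadline", "interpretation"),
    scoring_weights={"tier": 0.5, "consensus": 0.2, "recency": 0.2, "challenge": 0.1},
)

REGISTRY = {cfg.name: cfg for cfg in (GENERAL, REGWATCH)}


def load_domain(name: str) -> DomainConfig:
    """Select a domain configuration at runtime by name."""
    return REGISTRY[name]


print(load_domain("regwatch").claim_types)
```

Because the pipeline would read only from the selected configuration, switching domains means loading a different `DomainConfig`, not changing the scoring code.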

We're in active development and talking to organisations where defensible, compounding evidence matters.

If you recognise your domain in this site, we'd like to hear about the specific problem before we describe the solution.

What happens when you reach out

  • A short conversation about your specific research workflow
  • Early access to the Windows desktop application
  • Direct input into the domain configuration for your use case