What is Fingerprint?

Fingerprint is Factagora’s content provenance API. It solves a fundamental problem: how do you prove that a piece of content originated from you, even after it’s been rewritten, paraphrased, or translated? Traditional text-matching tools fail when content is rephrased. Fingerprint works differently — it analyzes the causal structure of your content (who did what, when, and why) and embeds an invisible watermark based on that structure. Even if every word is changed, the underlying structure remains detectable.

How it works

1. Embed

You send your content to /fingerprint/embed. The API extracts a Temporal Knowledge Graph (TKG) — entities, timelines, causal relations, and argument chains — then embeds an invisible zero-width Unicode watermark seeded by that structure. You get back the watermarked content to distribute.
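The idea of a zero-width watermark seeded by structure can be sketched as follows. This is a minimal illustration, not Factagora's actual scheme: the choice of SHA-256, the two carrier characters, the placement after spaces, and the helper names `structure_bits`, `embed_bits`, and `extract_bits` are all assumptions made for the example.

```python
import hashlib

# Assumption: two zero-width code points carry the bits. The real API may use others.
ZERO = "\u200b"  # zero-width space encodes bit 0
ONE = "\u200c"   # zero-width non-joiner encodes bit 1


def structure_bits(triples, n_bits=32):
    """Derive a deterministic bit string from (subject, relation, object) triples."""
    # Sort so the result does not depend on extraction order.
    canonical = "\n".join(sorted("\t".join(t) for t in triples))
    digest = hashlib.sha256(canonical.encode("utf-8")).digest()
    bits = "".join(f"{byte:08b}" for byte in digest)
    return bits[:n_bits]


def embed_bits(text, bits):
    """Insert one invisible character after each space until the bits run out."""
    out = []
    i = 0
    for ch in text:
        out.append(ch)
        if ch == " " and i < len(bits):
            out.append(ONE if bits[i] == "1" else ZERO)
            i += 1
    return "".join(out)


def extract_bits(text):
    """Read the hidden bits back in document order."""
    return "".join("1" if ch == ONE else "0" for ch in text if ch in (ZERO, ONE))


def visible_text(text):
    """Remove the carrier characters, leaving what a reader sees."""
    return text.replace(ZERO, "").replace(ONE, "")
```

A text with fewer spaces than bits carries only a prefix of the bit string, so a short text holds a shorter watermark in this sketch.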
2. Detect

When you encounter suspicious content, send it to /fingerprint/detect. The API runs two independent checks:
  1. Watermark check — extracts hidden bits and correlates against stored fingerprints (near-certain match)
  2. TKG matching — compares the causal structure using fuzzy word-level matching (catches paraphrases)
3. Audit

Each match comes with a full breakdown: which entities, time anchors, and causal patterns were shared. This is your evidence trail — auditable and explainable.

Key capabilities

Paraphrase-resistant

Word-level fuzzy matching means “EU enacts AI Act” matches “European Union passed AI Act regulation” — no exact string match required.
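To make the contrast with exact string matching concrete, here is a small sketch of word-level set comparison on the two phrases above. The tokenizer and both scoring functions are assumptions for illustration. The text does not specify how Fingerprint tokenizes or scores, and a production matcher would presumably also handle aliases such as "EU" and "European Union".

```python
import re


def tokens(text):
    """Lowercase word set. A deliberately simple, hypothetical tokenizer."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Shared words divided by all distinct words across both texts."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def overlap_coefficient(a, b):
    """Shared words divided by the size of the smaller word set."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / min(len(ta), len(tb))


original = "EU enacts AI Act"
rewrite = "European Union passed AI Act regulation"

print(original == rewrite)                     # False: exact matching finds nothing
print(sorted(tokens(original) & tokens(rewrite)))  # ['act', 'ai']
print(jaccard(original, rewrite))              # 0.25
print(overlap_coefficient(original, rewrite))  # 0.5
```

Even this naive version finds shared words where string equality finds none. How much weight a partial overlap like this deserves is a tuning decision the text leaves open.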

Dual-layer detection

Watermark correlation (cryptographic) and TKG Jaccard similarity (semantic) run independently. When both fire on the same content, coincidental overlap is statistically implausible.
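One way a client might combine the two signals is sketched below. The field names and both thresholds are hypothetical. The text does not state what the API returns or which cut-offs it applies, so treat the numbers as placeholders to tune.

```python
from dataclasses import dataclass


@dataclass
class DetectionSignals:
    # Hypothetical field names; both values are assumed to lie in [0.0, 1.0].
    watermark_correlation: float
    tkg_jaccard: float


def classify(signals, watermark_threshold=0.9, tkg_threshold=0.5):
    """Label a detection by which of the two independent checks fired."""
    watermark_hit = signals.watermark_correlation >= watermark_threshold
    structure_hit = signals.tkg_jaccard >= tkg_threshold
    if watermark_hit and structure_hit:
        return "both"
    if watermark_hit:
        return "watermark_only"
    if structure_hit:
        return "structure_only"
    return "none"
```

Keeping the two checks separate until the last step preserves the property described above: a "both" result means two unrelated mechanisms agreed.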

Content-type aware

Scoring weights are automatically tuned: legal documents emphasize causal structure (0.6), news emphasizes entities (0.5), reports balance all three signals.
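A weighted combination of this kind might look like the sketch below. Only two numbers come from the text: 0.6 for causal structure in legal documents and 0.5 for entities in news. Every other weight, the signal names, and the `combined_score` helper are assumptions chosen so that each profile sums to 1.

```python
# Per-content-type weights over three signals.
# From the text: legal causal = 0.6, news entities = 0.5, reports balanced.
# All remaining values are assumed for illustration.
WEIGHTS = {
    "legal": {"entities": 0.2, "timeline": 0.2, "causal": 0.6},
    "news": {"entities": 0.5, "timeline": 0.25, "causal": 0.25},
    "report": {"entities": 1 / 3, "timeline": 1 / 3, "causal": 1 / 3},
}


def combined_score(signals, content_type):
    """Weighted sum of per-signal similarities, each assumed to be in [0.0, 1.0]."""
    weights = WEIGHTS[content_type]
    return sum(weights[name] * signals[name] for name in weights)


# Same signals, different emphasis per content type.
signals = {"entities": 0.9, "timeline": 0.5, "causal": 0.2}
for content_type in WEIGHTS:
    print(content_type, round(combined_score(signals, content_type), 3))
```

With these assumed weights, content that shares many entities but little causal structure scores higher as news than as a legal document, which is the behavior the paragraph describes.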

Auditable evidence

Every match includes the exact overlap lists — shared entities, timelines, and causal triples — so you can explain why two articles matched.
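The overlap lists can be pictured as plain set intersections. The sketch below assumes each document has already been reduced to three collections. The key names and the `overlap_evidence` helper are hypothetical, not the API's response schema.

```python
def overlap_evidence(source, suspect):
    """Return the items two documents share, per layer, in a stable order."""
    keys = ("entities", "timelines", "causal_triples")
    return {key: sorted(set(source[key]) & set(suspect[key])) for key in keys}


source = {
    "entities": ["Bank of Korea", "Interest rate"],
    "timelines": ["2024-03-15"],
    "causal_triples": [("Bank of Korea", "raises", "Interest rate")],
}
suspect = {
    "entities": ["Interest rate", "Bank of Korea"],
    "timelines": [],
    "causal_triples": [("Bank of Korea", "raises", "Interest rate")],
}

evidence = overlap_evidence(source, suspect)
print(evidence["entities"])        # ['Bank of Korea', 'Interest rate']
print(evidence["timelines"])       # []
print(evidence["causal_triples"])  # [('Bank of Korea', 'raises', 'Interest rate')]
```

An empty list is itself informative: here the suspect text shares the actors and the causal claim but not the date.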

What’s in a TKG snapshot?

The Temporal Knowledge Graph extracted from your content contains four layers:

| Layer | Example | Purpose |
| --- | --- | --- |
| Entities | Bank of Korea, Interest rate | Who and what are involved |
| Timeseries | 2024-03-15 | When events occurred |
| Relations | Bank of Korea → raises → Interest rate | What happened (cause and effect) |
| Argument map | Premise → Evidence → Conclusion chains | Why it happened (macro-level reasoning) |

The argument map is what makes Fingerprint uniquely robust. Two articles about the same event will share entities and dates, but the argument structure (why the rate was raised, what evidence supports the claim) is the hardest to change in a paraphrase.
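As a mental model, the four layers could be held in a structure like the one below. The class and field names are assumptions that mirror the layer names. The actual snapshot format returned by the API is not described in the text, and the argument chain strings are placeholders.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class ArgumentChain:
    """One premise -> evidence -> conclusion chain."""
    premise: str
    evidence: str
    conclusion: str


@dataclass
class TKGSnapshot:
    """Hypothetical container for the four TKG layers."""
    entities: list[str] = field(default_factory=list)
    timeseries: list[str] = field(default_factory=list)  # ISO dates as strings
    relations: list[tuple[str, str, str]] = field(default_factory=list)
    argument_map: list[ArgumentChain] = field(default_factory=list)


snapshot = TKGSnapshot(
    entities=["Bank of Korea", "Interest rate"],
    timeseries=["2024-03-15"],
    relations=[("Bank of Korea", "raises", "Interest rate")],
    argument_map=[
        ArgumentChain(
            premise="<premise text>",
            evidence="<evidence text>",
            conclusion="<conclusion text>",
        )
    ],
)

# asdict gives a plain nested structure that is easy to log or serialize.
print(asdict(snapshot)["relations"])
```

Keeping the argument map as its own layer, separate from the relation triples, reflects the distinction drawn above between what happened and why it happened.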

Use cases

  • News agencies — Detect when your articles are republished without attribution
  • Legal teams — Prove content provenance in licensing disputes
  • Research organizations — Track how your findings are cited and reused
  • Content platforms — Automatically flag potential content reuse at scale

Next steps

Embed & Detect walkthrough

Step-by-step guide with code examples.

Best practices

Production tips for scoring, filtering, and auditing.