Embed at publish time, not retroactively

Watermark content as early as possible, ideally at the moment of publication or distribution. Copies that circulate before embedding carry no watermark, which shortens the period over which provenance can be established.
{
  "content": "...",
  "content_type": "news",
  "metadata": {
    "published_at": "2024-03-15T09:00:00Z",
    "source_id": "article_98765"
  }
}
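
As a minimal sketch of embedding at publish time, the request body can be assembled in the publishing step itself so the timestamp reflects the actual moment of publication. The helper name `build_embed_request` is hypothetical; only the field names are taken from the example above, and sending the request is left out because the endpoint URL and authentication are not specified here.

```python
from datetime import datetime, timezone
from typing import Optional


def build_embed_request(content: str, source_id: str,
                        content_type: str = "news",
                        published_at: Optional[datetime] = None) -> dict:
    """Build a request body shaped like the example above (hypothetical helper)."""
    if published_at is None:
        # Default to "now" so embedding happens at publish time.
        published_at = datetime.now(timezone.utc)
    # Normalize to UTC and format like the example: 2024-03-15T09:00:00Z
    stamp = published_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "content": content,
        "content_type": content_type,
        "metadata": {"published_at": stamp, "source_id": source_id},
    }
```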

Always store the fingerprint_id

The fingerprint_id returned by /embed is your key for retrieval and detection. Store it alongside your internal document record.
{
  "internal_id": "article_98765",
  "fingerprint_id": "fp_l1p8OPCwGhvu"
}
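
One way to keep this mapping, sketched here with Python's built-in `sqlite3` module. The table name, column names, and helper functions are illustrative assumptions; any store that links your internal ID to the fingerprint_id would do.

```python
import sqlite3
from typing import Optional


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a small table mapping internal IDs to fingerprint IDs."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fingerprints ("
        "internal_id TEXT PRIMARY KEY, fingerprint_id TEXT NOT NULL)"
    )
    return conn


def save_fingerprint(conn: sqlite3.Connection, internal_id: str,
                     fingerprint_id: str) -> None:
    """Store the fingerprint_id next to the internal document ID."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO fingerprints (internal_id, fingerprint_id) "
            "VALUES (?, ?)",
            (internal_id, fingerprint_id),
        )


def lookup_fingerprint(conn: sqlite3.Connection,
                       internal_id: str) -> Optional[str]:
    """Return the stored fingerprint_id, or None if the document is unknown."""
    row = conn.execute(
        "SELECT fingerprint_id FROM fingerprints WHERE internal_id = ?",
        (internal_id,),
    ).fetchone()
    return row[0] if row else None
```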

Distribute watermarked_content, not the original

The watermarked_content in the response contains invisible zero-width Unicode characters. Distribute this version instead of the original — it is visually identical but carries the cryptographic watermark that enables detection.
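
Because the watermark lives in invisible characters, a CMS or sanitizer that strips them would silently remove it. The sketch below is a pre-distribution sanity check. The set of code points is an assumption (common zero-width characters); the exact characters the watermark uses are not documented here.

```python
# Assumption: these are common zero-width code points, not a confirmed list
# of the characters the watermark actually uses.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}


def count_zero_width(text: str) -> int:
    """Count zero-width characters in text."""
    return sum(1 for ch in text if ch in ZERO_WIDTH)


def strip_zero_width(text: str) -> str:
    """Return text with zero-width characters removed (visible text only)."""
    return "".join(ch for ch in text if ch not in ZERO_WIDTH)
```

If `count_zero_width` returns 0 for the text your pipeline is about to publish, the watermark has probably been stripped somewhere upstream.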

Choose the right content_type

Scoring weights are automatically tuned per content type. Choosing the correct type improves detection accuracy:

| Content type | Entity weight | Time weight | Causal weight | Best for |
| --- | --- | --- | --- | --- |
| news (default) | 0.5 | 0.2 | 0.3 | News articles, press releases |
| legal | 0.3 | 0.1 | 0.6 | Legal documents, contracts, court filings |
| report | 0.4 | 0.3 | 0.3 | Research reports, analysis, whitepapers |
| internal | 0.5 | 0.2 | 0.3 | Internal memos, communications |

Use custom weights for specialized use cases

Override the default weights when your detection scenario is unusual. For example, if you only care about whether the same entities appear (regardless of causal structure):
{
  "content": "...",
  "weights": { "entity": 0.8, "time": 0.1, "causal": 0.1 }
}
Weights must sum to 1.0. The weights used are echoed back in meta.weights for auditability.
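
A small client-side check can catch a bad weight set before the request is sent. This is a sketch: the function name is hypothetical, and requiring all three keys and non-negative values are assumptions, not documented rules.

```python
import math

WEIGHT_KEYS = ("entity", "time", "causal")


def validate_weights(weights: dict) -> dict:
    """Raise ValueError unless weights cover the three keys and sum to 1.0."""
    if set(weights) != set(WEIGHT_KEYS):
        # Assumption: all three weights must be supplied together.
        raise ValueError(f"expected keys {WEIGHT_KEYS}, got {sorted(weights)}")
    if any(value < 0 for value in weights.values()):
        # Assumption: negative weights are not meaningful.
        raise ValueError("weights must be non-negative")
    total = sum(weights.values())
    # Compare with a tolerance; float sums such as 0.7 + 0.2 + 0.1 are inexact.
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError(f"weights must sum to 1.0, got {total}")
    return weights
```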

Use filters to narrow detection scope

Pass filters to reduce the candidate set and speed up detection:
{
  "content": "...",
  "filters": {
    "author_id": "journalist_042",
    "date_from": "2024-01-01",
    "date_to": "2024-12-31",
    "content_type": "news"
  }
}
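
The filter object can be built from whichever criteria are known. In this sketch, `build_filters` is a hypothetical helper, and treating every filter as individually optional is an assumption; the field names and date format follow the example above.

```python
from datetime import date
from typing import Optional


def build_filters(author_id: Optional[str] = None,
                  date_from: Optional[date] = None,
                  date_to: Optional[date] = None,
                  content_type: Optional[str] = None) -> dict:
    """Build a filters object, omitting any criterion that was not supplied."""
    if date_from and date_to and date_from > date_to:
        raise ValueError("date_from must not be later than date_to")
    candidates = {
        "author_id": author_id,
        # Dates are formatted as YYYY-MM-DD, matching the example.
        "date_from": date_from.isoformat() if date_from else None,
        "date_to": date_to.isoformat() if date_to else None,
        "content_type": content_type,
    }
    return {key: value for key, value in candidates.items() if value is not None}
```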

Understand the two detection layers

Detection runs two independent checks. Use both signals together:

| Layer | How it works | What it proves |
| --- | --- | --- |
| Watermark | Extracts invisible zero-width bits and correlates them against stored seeds | Near-certain provenance: the exact watermarked content was used |
| TKG Jaccard | Compares entities, timelines, and argument chains using word-level fuzzy matching | Semantic similarity: the same story, even if completely rewritten |

Check watermark_match and watermark_correlation on each match to see if the watermark layer fired. The meta.watermark_detected field tells you whether any watermark was found in the input at all.

Interpret confidence scores carefully

| Score range | Interpretation | Recommended action |
| --- | --- | --- |
| 0.8 – 1.0 | Strong match | High confidence; safe to automate |
| 0.5 – 0.8 | Partial match | Review the overlap lists before acting |
| 0.3 – 0.5 | Weak signal | Likely coincidental overlap |
| Below 0.3 | Filtered out | Not returned (below default min_score) |

When watermark_match: true, the match is near-certain regardless of the TKG score.
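
The two signals can be combined into a simple triage step, sketched below. The `score` field name is an assumption (the name of the score field is not shown here), the returned labels are illustrative, and assigning boundary values such as 0.8 to the higher band is a choice made for this example.

```python
def triage_match(match: dict) -> str:
    """Map one match to a triage label using the score bands described above."""
    # A watermark hit outranks the similarity score.
    if match.get("watermark_match") is True:
        return "watermark-confirmed"
    score = match["score"]  # hypothetical field name
    if score >= 0.8:
        return "strong"      # high confidence, safe to automate
    if score >= 0.5:
        return "partial"     # review the overlap lists before acting
    if score >= 0.3:
        return "weak"        # likely coincidental overlap
    return "filtered"        # normally not returned at the default min_score
```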

Audit matches with overlap lists

Every match includes overlap.entities, overlap.timeseries, and overlap.relations — the specific items shared between the query and the candidate. Use these to explain why two articles matched, not just that they matched.
{
  "overlap": {
    "entities": ["bank of korea", "interest rate"],
    "timeseries": ["2024-03-15"],
    "relations": ["bank of korea|raises|interest rate"]
  }
}
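
For audit logs or reviewer-facing output, the overlap lists can be turned into readable lines. This sketch assumes relations are encoded as `subject|predicate|object`, which is inferred from the example above rather than from a documented format; `describe_overlap` is a hypothetical helper.

```python
from typing import List


def describe_overlap(overlap: dict) -> List[str]:
    """Render the overlap lists as human-readable explanation lines."""
    lines = []
    for entity in overlap.get("entities", []):
        lines.append(f"shared entity: {entity}")
    for point in overlap.get("timeseries", []):
        lines.append(f"shared time reference: {point}")
    for relation in overlap.get("relations", []):
        parts = relation.split("|")
        if len(parts) == 3:
            # Assumed encoding: subject|predicate|object
            subject, predicate, obj = parts
            lines.append(f"shared relation: {subject} {predicate} {obj}")
        else:
            # Unknown shape: show the raw value rather than guess.
            lines.append(f"shared relation: {relation}")
    return lines
```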

Use fingerprint_id for re-scoring

If you’ve already embedded content and want to re-score it against the registry (e.g., periodically checking for new matches), pass the fingerprint_id directly instead of the content:
{
  "fingerprint_id": "fp_l1p8OPCwGhvu"
}
This skips content extraction and uses the stored TKG snapshot, making it faster and idempotent.
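
A request builder can enforce that a detection call carries either the content or the fingerprint_id, but not both. `build_detect_request` is a hypothetical helper, and rejecting requests that include both fields is an assumption about how you may want to guard your own client, not a documented API rule.

```python
from typing import Optional


def build_detect_request(content: Optional[str] = None,
                         fingerprint_id: Optional[str] = None) -> dict:
    """Build a detection body from exactly one of content or fingerprint_id."""
    if (content is None) == (fingerprint_id is None):
        raise ValueError("pass exactly one of content or fingerprint_id")
    if fingerprint_id is not None:
        # Re-scoring path: reuse the stored snapshot instead of raw content.
        return {"fingerprint_id": fingerprint_id}
    return {"content": content}
```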

Combine with other Factagora APIs

| Workflow | APIs | Purpose |
| --- | --- | --- |
| Was my content reused? | Fingerprint Detect | Provenance tracking |
| Was it reused accurately? | Fingerprint Detect + Fact Checker | Detect misrepresentation |
| What changed in the story? | Fingerprint Detect + Causality Graph | Track narrative evolution |