Embed at publish time, not retroactively
Watermark content as early as possible, ideally at the moment of publication or distribution. Copies that circulate before embedding carry no watermark, so embedding late shrinks the set of copies whose provenance the watermark can prove.
{
  "content": "...",
  "content_type": "news",
  "metadata": {
    "published_at": "2024-03-15T09:00:00Z",
    "source_id": "article_98765"
  }
}
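As a minimal sketch of embedding at publish time, the helper below assembles the request body shown above at the moment an article goes live. The function name `build_embed_payload` is hypothetical, and the sketch only builds the JSON body; how the request is sent (base URL, authentication) is not covered here.

```python
from datetime import datetime, timezone
from typing import Optional


def build_embed_payload(content: str, source_id: str,
                        content_type: str = "news",
                        published_at: Optional[datetime] = None) -> dict:
    """Assemble an /embed request body (hypothetical helper, not part of any SDK)."""
    # Default to "now" so the payload is created as part of the publish step.
    when = published_at or datetime.now(timezone.utc)
    if when.tzinfo is None:
        raise ValueError("published_at must be timezone-aware")
    # Match the timestamp format used in the example above: UTC with a trailing Z.
    stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "content": content,
        "content_type": content_type,
        "metadata": {"published_at": stamp, "source_id": source_id},
    }


payload = build_embed_payload(
    "...", "article_98765",
    published_at=datetime(2024, 3, 15, 9, 0, tzinfo=timezone.utc),
)
```

Calling this helper from the publish hook itself, rather than from a later batch job, keeps the embed step tied to the first distribution of the content.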
Always store the fingerprint_id
The fingerprint_id returned by /embed is your key for retrieval and detection. Store it alongside your internal document record.
{
  "internal_id": "article_98765",
  "fingerprint_id": "fp_l1p8OPCwGhvu"
}
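One way to keep this mapping is a small lookup table next to your document store. The sketch below uses Python's built-in `sqlite3` with an in-memory database; the table name `fingerprints` and both helper functions are illustrative assumptions, not part of the API.

```python
import sqlite3


def save_fingerprint(conn: sqlite3.Connection, internal_id: str,
                     fingerprint_id: str) -> None:
    """Persist the internal_id -> fingerprint_id mapping (illustrative schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fingerprints ("
        "internal_id TEXT PRIMARY KEY, fingerprint_id TEXT NOT NULL)"
    )
    # INSERT OR REPLACE keeps one row per document if it is embedded again.
    conn.execute(
        "INSERT OR REPLACE INTO fingerprints VALUES (?, ?)",
        (internal_id, fingerprint_id),
    )
    conn.commit()


def lookup_fingerprint(conn: sqlite3.Connection, internal_id: str):
    """Return the stored fingerprint_id, or None if the document is unknown."""
    row = conn.execute(
        "SELECT fingerprint_id FROM fingerprints WHERE internal_id = ?",
        (internal_id,),
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
save_fingerprint(conn, "article_98765", "fp_l1p8OPCwGhvu")
```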
Distribute watermarked_content, not the original
The watermarked_content in the response contains invisible zero-width Unicode characters. Distribute this version instead of the original — it is visually identical but carries the cryptographic watermark that enables detection.
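A quick pre-distribution check can confirm that the text about to be sent still contains zero-width characters. The sketch below is an assumption-laden illustration: the exact code points used by the watermark are not documented here, so `ZERO_WIDTH` simply lists common zero-width Unicode characters.

```python
# Common zero-width code points. Which of these the watermark actually uses
# is an assumption made for this sketch.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}


def count_zero_width(text: str) -> int:
    """Count zero-width characters in the text."""
    return sum(1 for ch in text if ch in ZERO_WIDTH)


def strip_zero_width(text: str) -> str:
    """Return the visible text only (what a reader would see)."""
    return "".join(ch for ch in text if ch not in ZERO_WIDTH)


sample = "Bank\u200b of\u200c Korea"   # hand-made stand-in for watermarked_content
visible = strip_zero_width(sample)
carries_marks = count_zero_width(sample) > 0
```

If a step in your publishing pipeline normalizes text in a way similar to `strip_zero_width`, it would likely remove the watermark as well, so it is worth running a check like this on the final output rather than on the API response alone.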
Choose the right content_type
Scoring weights are automatically tuned per content type. Choosing the correct type improves detection accuracy:
| Content type | Entity weight | Time weight | Causal weight | Best for |
|---|---|---|---|---|
| news (default) | 0.5 | 0.2 | 0.3 | News articles, press releases |
| legal | 0.3 | 0.1 | 0.6 | Legal documents, contracts, court filings |
| report | 0.4 | 0.3 | 0.3 | Research reports, analysis, whitepapers |
| internal | 0.5 | 0.2 | 0.3 | Internal memos, communications |
Use custom weights for specialized use cases
Override the default weights when your detection scenario is unusual. For example, if you only care about whether the same entities appear (regardless of causal structure):
{
  "content": "...",
  "weights": { "entity": 0.8, "time": 0.1, "causal": 0.1 }
}
Weights must sum to 1.0. The weights used are echoed back in meta.weights for auditability.
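Because the weights must sum to 1.0, it can help to validate them client-side before sending a request. The sketch below is a hypothetical helper: the requirements that exactly the three keys are present and that no weight is negative are assumptions, and the comparison uses a tolerance because floating-point sums are rarely exact.

```python
import math

# Per-type defaults copied from the table above.
DEFAULT_WEIGHTS = {
    "news": {"entity": 0.5, "time": 0.2, "causal": 0.3},
    "legal": {"entity": 0.3, "time": 0.1, "causal": 0.6},
    "report": {"entity": 0.4, "time": 0.3, "causal": 0.3},
    "internal": {"entity": 0.5, "time": 0.2, "causal": 0.3},
}


def validate_weights(weights: dict) -> dict:
    """Raise ValueError unless the weights look usable (assumed rules)."""
    if set(weights) != {"entity", "time", "causal"}:
        raise ValueError("weights need exactly: entity, time, causal")
    if any(value < 0 for value in weights.values()):
        raise ValueError("weights must be non-negative")
    total = sum(weights.values())
    # Tolerant comparison: 0.1 + 0.2 style rounding should not fail the check.
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError(f"weights must sum to 1.0, got {total}")
    return weights


custom = validate_weights({"entity": 0.8, "time": 0.1, "causal": 0.1})
```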
Use filters to narrow detection scope
Pass filters to reduce the candidate set and speed up detection:
{
  "content": "...",
  "filters": {
    "author_id": "journalist_042",
    "date_from": "2024-01-01",
    "date_to": "2024-12-31",
    "content_type": "news"
  }
}
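The filters object above can be assembled with a small helper that drops unused fields and catches an inverted date range early. This is a sketch under assumptions: `build_filters` is a hypothetical name, and it assumes that omitting a filter key is equivalent to not filtering on it.

```python
from datetime import date
from typing import Optional


def build_filters(author_id: Optional[str] = None,
                  date_from: Optional[date] = None,
                  date_to: Optional[date] = None,
                  content_type: Optional[str] = None) -> dict:
    """Build a filters object, leaving out anything not supplied."""
    if date_from and date_to and date_from > date_to:
        raise ValueError("date_from must not be later than date_to")
    raw = {
        "author_id": author_id,
        # isoformat() yields YYYY-MM-DD, the format used in the example above.
        "date_from": date_from.isoformat() if date_from else None,
        "date_to": date_to.isoformat() if date_to else None,
        "content_type": content_type,
    }
    return {key: value for key, value in raw.items() if value is not None}


filters = build_filters(
    author_id="journalist_042",
    date_from=date(2024, 1, 1),
    date_to=date(2024, 12, 31),
    content_type="news",
)
```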
Understand the two detection layers
Detection runs two independent checks — use both signals together:
| Layer | How it works | What it proves |
|---|---|---|
| Watermark | Extracts invisible zero-width bits and correlates them against stored seeds | Near-certain provenance — the exact watermarked content was used |
| TKG Jaccard | Compares entities, timelines, and argument chains using word-level fuzzy matching | Semantic similarity — the same story, even if completely rewritten |
Check watermark_match and watermark_correlation on each match to see if the watermark layer fired. The meta.watermark_detected field tells you whether any watermark was found in the input at all.
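The following sketch shows one way to read the watermark fields from a detection response. The response shape is partly assumed: the top-level `matches` key and the per-match `fingerprint_id` field are guesses for illustration, and the sample values are invented, not real API output.

```python
def summarize_watermark(response: dict) -> dict:
    """Separate 'a watermark was found in the input' from 'this match fired'."""
    detected = bool(response.get("meta", {}).get("watermark_detected", False))
    # "matches" as the list key is an assumption made for this sketch.
    hits = [m for m in response.get("matches", []) if m.get("watermark_match")]
    return {"watermark_detected": detected, "watermark_hits": hits}


# Invented sample data, for illustration only.
sample_response = {
    "meta": {"watermark_detected": True},
    "matches": [
        {"fingerprint_id": "fp_l1p8OPCwGhvu", "watermark_match": True,
         "watermark_correlation": 0.97},
        {"fingerprint_id": "fp_example", "watermark_match": False,
         "watermark_correlation": 0.02},
    ],
}

summary = summarize_watermark(sample_response)
```

Keeping the two signals separate makes it possible to notice the case where a watermark is present in the input but none of the returned candidates matched it.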
Interpret confidence scores carefully
| Score range | Interpretation | Recommended action |
|---|---|---|
| 0.8 – 1.0 | Strong match | High confidence — safe to automate |
| 0.5 – 0.8 | Partial match | Review the overlap lists before acting |
| 0.3 – 0.5 | Weak signal | Likely coincidental overlap |
| Below 0.3 | Filtered out | Not returned (below default min_score) |
When watermark_match is true, the match is near-certain regardless of the TKG score.
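The score bands and the watermark override can be turned into a small triage function. This is a sketch, not an official policy: the action labels are hypothetical, and because the bands in the table share their endpoints, assigning 0.8 and 0.5 to the higher band is an assumption.

```python
def recommend_action(score: float, watermark_match: bool = False) -> str:
    """Map a match to a handling decision based on the table above."""
    if watermark_match:
        # A watermark hit is treated as near-certain regardless of the score.
        return "automate"
    if score >= 0.8:      # boundary placement is an assumption
        return "automate"
    if score >= 0.5:
        return "review"
    if score >= 0.3:
        return "likely_coincidental"
    return "filtered_out"  # below the default min_score; normally not returned


decisions = [
    recommend_action(0.91),
    recommend_action(0.62),
    recommend_action(0.35),
    recommend_action(0.35, watermark_match=True),
]
```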
Audit matches with overlap lists
Every match includes overlap.entities, overlap.timeseries, and overlap.relations — the specific items shared between the query and the candidate. Use these to explain why two articles matched, not just that they matched.
{
  "overlap": {
    "entities": ["bank of korea", "interest rate"],
    "timeseries": ["2024-03-15"],
    "relations": ["bank of korea|raises|interest rate"]
  }
}
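To turn an overlap object into a readable audit note, something like the sketch below may be enough. It assumes, based only on the example above, that each relation is a pipe-delimited subject, predicate, and object; anything that does not split into three parts is passed through unchanged.

```python
def explain_overlap(overlap: dict) -> list:
    """Render overlap lists as short human-readable lines."""
    lines = []
    entities = overlap.get("entities", [])
    if entities:
        lines.append("Shared entities: " + ", ".join(entities))
    dates = overlap.get("timeseries", [])
    if dates:
        lines.append("Shared dates: " + ", ".join(dates))
    for relation in overlap.get("relations", []):
        parts = relation.split("|")
        if len(parts) == 3:
            # Assumed layout: subject|predicate|object
            lines.append("Shared relation: {} {} {}".format(*parts))
        else:
            lines.append("Shared relation: " + relation)
    return lines


explanation = explain_overlap({
    "entities": ["bank of korea", "interest rate"],
    "timeseries": ["2024-03-15"],
    "relations": ["bank of korea|raises|interest rate"],
})
```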
Use fingerprint_id for re-scoring
If you’ve already embedded content and want to re-score it against the registry (e.g., periodically checking for new matches), pass the fingerprint_id directly instead of the content:
{
  "fingerprint_id": "fp_l1p8OPCwGhvu"
}
This skips content extraction and uses the stored TKG snapshot, making it faster and idempotent.
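A request builder can make the choice between raw content and a stored fingerprint explicit. In this sketch, `build_detect_request` is a hypothetical helper, and treating the two fields as mutually exclusive is an assumption drawn from the wording "instead of the content".

```python
from typing import Optional


def build_detect_request(content: Optional[str] = None,
                         fingerprint_id: Optional[str] = None,
                         **options) -> dict:
    """Build a detection request body from content or a fingerprint_id."""
    # Assumed rule: exactly one of the two inputs is supplied.
    if (content is None) == (fingerprint_id is None):
        raise ValueError("pass exactly one of content or fingerprint_id")
    if fingerprint_id is not None:
        body = {"fingerprint_id": fingerprint_id}
    else:
        body = {"content": content}
    body.update(options)  # e.g. filters or weights, passed through unchanged
    return body


rescoring_request = build_detect_request(fingerprint_id="fp_l1p8OPCwGhvu")
```

A scheduled job could call this with each stored fingerprint_id to look for new matches without resubmitting the original text.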
Combine with other Factagora APIs
| Workflow | APIs | Purpose |
|---|---|---|
| Was my content reused? | Fingerprint Detect | Provenance tracking |
| Was it reused accurately? | Fingerprint Detect + Fact Checker | Detect misrepresentation |
| What changed in the story? | Fingerprint Detect + Causality Graph | Track narrative evolution |