Content Credibility Engine
Meta Museum treats content generation and syndication as a credibility system, not a volume system.
This document defines four operational layers:
- Trust layer: provenance, authorship, traceability, rights.
- Originality layer: semantic novelty and source-grounded synthesis.
- Distribution layer: channel orchestration, SEO metadata, syndication.
- Consistency layer: cadence, voice control, and review discipline.
Scope and compatibility
This policy extends existing repository rules in:
- `CLAUDE.md`
- `docs/roadmap.md`
- `docs/linked-art/LinkedArtModel1.0-Reference.md`
If this policy conflicts with any of them, Linked Art fidelity, rights safety, and "cite or refuse" remain non-negotiable.
1) Trust layer requirements
Trust metadata is persisted in:
- `provenance/ledger.json`
- `provenance/source-map.yaml`
Required metadata per publishable artifact:
- `artifactId`: stable identifier
- `contentHashSha256`: SHA-256 hash computed over the canonical source bundle
- `author` and `reviewers`
- `createdAt` and `publishedAt`
- source list with URL, provider, retrieval timestamp
- citation coverage evidence
- rights/reuse disposition
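As an illustration of the `contentHashSha256` field, the following is a minimal sketch of hashing a source bundle. The canonical form used here (UTF-8 JSON with sorted keys and compact separators) and the function name `content_hash_sha256` are assumptions for the example; this document does not define the canonicalization rules.

```python
import hashlib
import json


def content_hash_sha256(source_bundle: dict) -> str:
    """Return a hex SHA-256 digest of a source bundle.

    Assumption: "canonical" means JSON with sorted keys and no
    insignificant whitespace, encoded as UTF-8, so that key order
    in the input does not change the resulting hash.
    """
    canonical = json.dumps(
        source_bundle,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Sorting keys before hashing means two bundles with the same content but different key order produce the same digest, which keeps the ledger entry stable across re-serialization.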
Minimum enforcement:
- No publish without at least one source reference.
- No publish with unresolved rights status.
- No publish if citation coverage falls below the policy threshold.
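The three enforcement rules above could be expressed as a single gate function. This is a sketch only: the `TrustRecord` fields, the `"cleared"` rights value, and the `0.9` coverage threshold are hypothetical placeholders, since the ledger schema and the actual policy threshold are not specified here.

```python
from dataclasses import dataclass, field


@dataclass
class TrustRecord:
    # Hypothetical shape; the real ledger schema may differ.
    artifact_id: str
    sources: list = field(default_factory=list)
    rights_status: str = "unresolved"  # assumed values: "cleared", "unresolved"
    citation_coverage: float = 0.0     # assumed: fraction of claims with a citation


def publish_blockers(record: TrustRecord, min_coverage: float = 0.9) -> list:
    """Return the reasons that block publication; an empty list means the gate passes."""
    blockers = []
    if not record.sources:
        blockers.append("no source reference")
    if record.rights_status != "cleared":
        blockers.append("unresolved rights status")
    if record.citation_coverage < min_coverage:
        blockers.append("citation coverage below threshold")
    return blockers
```

Returning every blocker, not just the first, lets a reviewer see all missing metadata in one pass.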
2) Originality layer requirements
Originality records are persisted in:
- `semantic-core/originality-index.json`
Baseline novelty policy:
- compute embedding-space novelty against recent published corpus
- classify as "meaningfully original" when cosine distance is above `0.18`
- require one "unique insight" note per artifact (human-written or curator-approved)
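The novelty check above can be sketched as follows. The `0.18` threshold comes from this policy; comparing against the nearest neighbor in the corpus, treating an empty corpus as original, and the function names are assumptions made for illustration.

```python
import math

NOVELTY_THRESHOLD = 0.18  # from the baseline novelty policy


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def is_meaningfully_original(candidate, corpus, threshold=NOVELTY_THRESHOLD):
    """Assumption: novelty is the distance to the closest corpus embedding."""
    if not corpus:
        return True  # assumption: nothing to compare against
    nearest = min(cosine_distance(candidate, vec) for vec in corpus)
    return nearest > threshold
```

Using the minimum distance is the stricter reading of the policy: a single near-duplicate in the recent corpus is enough to fail the check.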
Guardrails:
- template-only outputs are rejected
- uncited paraphrase outputs are rejected
- all generated claims must map to source evidence or be removed
3) Distribution layer requirements
Distribution control artifacts:
- `distribution/schedule.yaml`
- runtime queue database at `distribution/queue.db` (gitignored)
Channel set (initial):
- Web
- Medium
- API feed/syndication
Required pipeline behavior:
- per-channel publish caps and minimum cadence checks
- snippet extraction and summary variants
- SEO metadata generation (title, description, canonical URL, tags)
- rights-safe media checks before enqueue
- queue-worker orchestration across `web`, `linkedin`, `medium`, `email`, and `api` channels (the initial channel set above covers only `web`, `medium`, and `api`)
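As one example of the pipeline behavior listed above, SEO metadata generation might look like the sketch below. The output key names, the 160-character description limit, and the tag normalization rules are assumptions for illustration, not part of this policy.

```python
def build_seo_metadata(title, body, canonical_url, tags, max_description=160):
    """Build title, description, canonical URL, and tags for one artifact.

    Assumptions: the description is the start of the body, cut at a word
    boundary to fit max_description; tags are lowercased and deduplicated.
    """
    text = " ".join(body.split())  # collapse whitespace and newlines
    if len(text) > max_description:
        # Reserve three characters for the trailing ellipsis.
        cut = text[: max_description - 3].rsplit(" ", 1)[0]
        text = cut.rstrip(".,;:") + "..."
    return {
        "title": title.strip(),
        "description": text,
        "canonicalUrl": canonical_url,
        "tags": sorted({t.strip().lower() for t in tags if t.strip()}),
    }
```

A call such as `build_seo_metadata("Title", long_body, url, ["Art", "art "])` would yield a single `art` tag and a description no longer than the configured limit.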
4) Consistency layer requirements
Voice and cadence controls:
- `generation/style-profile.md`
- `distribution/schedule.yaml`
Baseline policy:
- minimum 2 outputs per week across channels
- maximum 1 output per day per channel by default
- style profile versioned with last-updated reviewer
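The cadence limits above could be enforced in code along these lines. The history format (a list of `(channel, date)` pairs), the function names, and the reading of "per week" as a seven-day window starting at `week_start` are assumptions for this sketch.

```python
from datetime import date, timedelta

MAX_PER_DAY_PER_CHANNEL = 1  # default cap from the baseline policy
MIN_PER_WEEK = 2             # minimum across all channels


def can_publish(history, channel, day, cap=MAX_PER_DAY_PER_CHANNEL):
    """True if publishing on `channel` on `day` stays within the daily cap.

    `history` is a list of (channel, date) pairs for outputs already published.
    """
    published_today = sum(1 for c, d in history if c == channel and d == day)
    return published_today < cap


def meets_weekly_minimum(history, week_start, minimum=MIN_PER_WEEK):
    """True if at least `minimum` outputs fall in the 7 days from `week_start`."""
    week_end = week_start + timedelta(days=7)
    count = sum(1 for _, d in history if week_start <= d < week_end)
    return count >= minimum
```

For example, with one `web` output already published on a given day, `can_publish(history, "web", that_day)` returns `False` while the same call for `medium` returns `True`.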
5) Monitoring and audits
Monitoring baseline persisted in:
- `monitoring/metrics.json`
Tracked dimensions:
- trust score
- originality score
- distribution reach
- engagement velocity
Weekly audit checks:
- citation/link integrity
- novelty drift
- rights/compliance regressions
- broken outbound links
- engagement velocity minimum threshold
- trust/originality score regression thresholds vs configured baseline
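The score regression check in the list above might be sketched as a comparison against the configured baseline. The metric key names and the per-metric allowed drop are hypothetical; the actual schema of `monitoring/metrics.json` is not described here.

```python
def audit_regressions(current, baseline, max_drop):
    """Return the names of metrics that dropped more than allowed vs baseline.

    All three arguments are dicts keyed by metric name (hypothetical keys).
    `max_drop` holds the largest tolerated decrease for each metric.
    """
    failures = []
    for name, allowed in max_drop.items():
        drop = baseline[name] - current[name]
        if drop > allowed:
            failures.append(name)
    return failures
```

A gate script could exit non-zero when the returned list is non-empty, which matches the "non-zero on failure" behavior expected of a check run.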
Automation:
- Local run: `pnpm credibility:audit`
- Gate run (non-zero on failure): `pnpm credibility:audit:check`
- Weekly GitHub Action: `.github/workflows/credibility-audit.yml`
  - schedule: every Monday
  - checks: validation drift, eval relevance threshold, markdown broken-link scan, engagement/trust/originality eval-threshold alerts
  - artifact: `artifacts/credibility-audit/latest.json`
6) Implementation notes for this repository
- OpenTelemetry is the default tracing substrate for this workflow.
- Postgres becomes the system of record once queue/provenance services graduate from file-backed prototypes.
- Redis remains the short-lived cache layer for scoring and queue coordination.
- FastAPI service extensions are allowed where they fit current C2/C4 service boundaries.
- Additional AI framework dependencies (for example LangChain/LlamaIndex) are optional and must pass the standard "ask before acting" dependency gate in `CLAUDE.md`.
7) Definition of done for credibility-ready content features
- A test exists that fails when citations or rights metadata are missing.
- A test exists that fails when originality metadata is absent for generated artifacts.
- Queue/schedule policy is enforced by code, not just documentation.
- Metrics are emitted and trace-linked for generation, review, and publish operations.
- Human approval is explicit before publication actions.