Meta Museum — Era Delivery History
Archived detailed record of the completed delivery eras (Lift / Hardening / SOTA), moved out of the active [roadmap](../roadmap.md) to keep it current. This is the slice-by-slice and B-/C-series implementation log; the roadmap holds the live status and the forward plan ([roadmap-to-10.md](../roadmap-to-10.md)).
---
Three eras
Scope honestly. The SOTA spec is a 20-week, multi-service, multi-store target. We get there in three eras with hard exit gates between them.
| Era | Theme | Outcome | Tier |
|---|---|---|---|
| A. Lift | Move what exists into Next.js 16 + custom CSS | Public site + research workspace at parity with legacy prototype, on a modern stack | Slices 1-10 |
| B. Hardening | Make the data layer trustworthy | Zod-mirrored contracts, SHACL validation, ValidationReport, Postgres swap, real auth | Quarters 2-3 post-lift |
| C. SOTA | The full Yale-LUX-pattern platform | Multi-modal store, HAL hypermedia, Visual ETL Mapper, NL→SPARQL, Graph-RAG, IIIF deep-zoom, Meta Wiki Art bridge | Quarter 4+ |
Cardinal rule: do not start a later era until the prior era's exit gate is green. Don't ship a SHACL validator on top of a JSON-file store, and don't ship Graph-RAG on top of unvalidated records.
---
Era A — The Lift (10 slices, PR-sized each)
Goal: end the era with the legacy prototype's full feature set running natively on Next.js 16, custom CSS, App Router + RSC, with the dependency rules from `_legacy/AGENTS.md` preserved (adapters don't cross-import; contracts are leaves; `_source.raw` is immutable).
Slice 0 — Staging (DONE)
See §Status above.
Slice 1 — Foundations (TDD infra first) (DONE)
Port the dep-free leaves so everything else can be built on them. Test infra lands before any port.
Order within the slice:
- [x] ✅ Test infra: `tsx` dev dependency + `"test": "node --import tsx --test tests/**/*.test.ts"` + smoke test (`tests/smoke.test.ts`) landed; `pnpm test` green.
- [x] ✅ Port test-first. Legacy tests were ported and implementations landed for the specified dep-free leaves.
- [x] ✅ `src/constants.ts` (port of `_legacy/src/constants.js`)
- [x] ✅ `src/contracts/*.ts` — all 8 (`artwork`, `source-record`, `rights-report`, `import-job`, `agent-task`, `citation`, `shared-structures`, `wiki-draft`)
- [x] ✅ `src/utils/{text,rights,http,storage,linked-art}.ts` (leaf utils)
- [x] ✅ No routes, no UI changes in this foundation slice scope.
Acceptance: `pnpm test` green with full coverage of legacy `tests/contracts/*` and `tests/utils/{linked-art,validation-report}.test.ts`. `pnpm build` green. No new client-side bundle weight. Every implementation file has at least one corresponding test file.
Slice 2 — Met vertical (canary) (DONE)
Prove the route-handler pattern end-to-end with the simpler of the two adapters.
- [x] ✅ `src/adapters/{adapter-utils,provider-interface,met}.ts` + tests
- [x] ✅ Route handlers (all `app/api/.../route.ts`): `/health`, `/met/profile`, `/met/departments`, `/met/search` (POST), `/met/object` (POST), `/met/import` (POST)
- [x] ✅ `app/explore/page.tsx` — minimal custom-CSS UI calling `/api/met/search`, showing image cards
- [x] ✅ `app/layout.tsx` — `Create Next App` metadata replaced; shell landed and evolved in later slices
Acceptance: User can search Met for "flowers", click a result, see object detail JSON. Loading/empty/error states present. No Linked Art shortcuts taken — Met objects normalize through the `Artwork` contract before display.
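The "normalize through the `Artwork` contract before display" rule can be illustrated with a minimal sketch. The `ArtworkSketch` shape and the `normalizeMetObject` helper are hypothetical stand-ins (the project's real contract is richer); the input field names follow the public Met Collection API object payload.

```typescript
// Trimmed view of a Met Collection API object (only the fields used here).
interface MetObject {
  objectID: number;
  title?: string;
  artistDisplayName?: string;
  objectDate?: string;
  primaryImageSmall?: string;
  isPublicDomain?: boolean;
}

// Hypothetical, simplified stand-in for the project's `Artwork` contract.
interface ArtworkSketch {
  id: string;
  title: string;
  maker: string;
  date: string;
  imageUrl: string | null;
  rights: "public-domain" | "unknown";
  _source: { provider: "met"; raw: Readonly<MetObject> };
}

export function normalizeMetObject(raw: MetObject): ArtworkSketch {
  return {
    id: `met:${raw.objectID}`,
    title: raw.title?.trim() || "Untitled",
    maker: raw.artistDisplayName?.trim() || "Unknown maker",
    date: raw.objectDate?.trim() || "Undated",
    imageUrl: raw.primaryImageSmall || null,
    // Stay conservative: anything not flagged public domain is "unknown".
    rights: raw.isPublicDomain ? "public-domain" : "unknown",
    // Keep an untouched, frozen copy so `_source.raw` stays immutable.
    _source: { provider: "met", raw: Object.freeze({ ...raw }) },
  };
}
```

The UI then renders only the normalized shape, while the frozen copy preserves the provider payload for later inspection.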
Slice 3 — Getty vertical (DONE)
Same shape as Slice 2 for Getty (more endpoints).
- [x] ✅ `src/adapters/getty.ts` + tests
- [x] ✅ Routes: `/getty/profile`, `/getty/entity` (POST), `/getty/import` (POST), `/getty/activity` (GET), `/getty/sparql` (POST)
- [x] ✅ `app/explore/page.tsx` extended with provider toggle (Getty / Met / both; later expanded further)
- [x] ✅ `app/getty/page.tsx` — SPARQL playground + ActivityStream peek
Acceptance: Both providers reachable from `/explore`. Getty SPARQL playground returns rows. Rights and source attribution rendered on every card.
Slice 4 — Records + Artworks + Entities (DONE)
The persistence-touching slice. Storage stays JSON files in `storage/`.
- [x] ✅ `src/utils/{artwork-builder,artwork-facets,entities,relationships}.ts` + tests
- [x] ✅ Routes: `/records` (GET/POST), `/records/[id]` (GET), `/artworks/[id]` (GET), `/entities` (GET), `/entities/[id]` (GET), `/explorer/artworks` (GET), `/explorer/import` (POST)
- [x] ✅ Pages: `app/records/page.tsx`, `app/artwork/[id]/page.tsx`, `app/entity/[id]/page.tsx`
- [x] ✅ Async-`params` patterns applied per Next 16 requirements.
Acceptance: Import a Met or Getty record from `/explore`; it appears on `/records`; clicking it opens `/artwork/[id]` with maker, date, materials, rights, citation, and a list of pivotable entities. Each entity link opens `/entity/[id]`.
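As a rough illustration of the async-`params` pattern mentioned in this slice, a dynamic route handler awaits `params` before reading the segment. The in-memory `artworks` map and the `json` helper below are hypothetical placeholders for the JSON-file store and the project's response utilities.

```typescript
// In recent Next.js versions the dynamic segment values arrive as a Promise.
type RouteContext = { params: Promise<{ id: string }> };

// Hypothetical in-memory lookup standing in for the JSON-file store.
const artworks = new Map<string, { id: string; title: string }>([
  ["42", { id: "42", title: "Example artwork" }],
]);

// Small helper that builds a JSON response with an explicit status code.
function json(body: unknown, status = 200): Response {
  return new Response(JSON.stringify(body), {
    status,
    headers: { "content-type": "application/json" },
  });
}

export async function GET(_request: Request, context: RouteContext): Promise<Response> {
  const { id } = await context.params; // await before touching the segment value
  const artwork = artworks.get(id);
  if (!artwork) {
    return json({ error: "not_found", id }, 404);
  }
  return json(artwork);
}
```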
Slice 5 — Linked Art Inspector + Roadmap + Best-Practices (DONE)
Port the JSON-LD inspect/import workflow plus the two reflective endpoints.
- [x] ✅ `src/utils/best-practices-audit.ts` + tests
- [x] ✅ Routes: `/linked-art/profile`, `/linked-art/inspect` (POST), `/linked-art/import` (POST), `/roadmap`, `/best-practices`
- [x] ✅ Pages: `app/linked-art/page.tsx`, `app/roadmap/page.tsx`
- [x] ✅ `/api/roadmap` returns this document as structured JSON; `/api/best-practices` preserves legacy audit semantics.
Started in this pass:
- [x] ✅ `src/utils/best-practices-audit.ts` + tests.
- [x] ✅ `/api/roadmap` route returning structured JSON.
- [x] ✅ `/api/best-practices` route running the audit against stored records.
- [x] ✅ `app/roadmap/page.tsx` initial UI shell.
- [x] ✅ `/linked-art/profile`, `/linked-art/inspect`, `/linked-art/import` routes.
- [x] ✅ `app/linked-art/page.tsx` full inspect/import workflow UI.
Acceptance: Paste a Linked Art JSON-LD blob → see the inspection report. The roadmap page renders this file's phases and exit gates.
Slice 6 — Patterns + Graph (DONE)
Port pattern discovery and the graph view.
- [x] ✅ `src/utils/patterns.ts` + tests
- [x] ✅ Routes: `/patterns` (POST), `/graph` (GET)
- [x] ✅ Pages: `app/patterns/page.tsx`, `app/graph/page.tsx` (Cytoscape force-directed — first external runtime dep; `cytoscape` + `cytoscape-cose-bilkent` per SOTA §3.2)
Acceptance: Pattern scan produces buckets for unknown makers, missing dates, shared concepts. Graph page renders nodes (artworks + entities) with click-to-pivot.
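The three buckets named in the acceptance line could be computed roughly as follows. This is a sketch only: `RecordSketch`, `PatternBuckets`, and `scanPatterns` are illustrative names, and the real `src/utils/patterns.ts` may bucket differently.

```typescript
// Hypothetical minimal record shape for the scan.
interface RecordSketch {
  id: string;
  maker?: string;
  date?: string;
  concepts: string[];
}

interface PatternBuckets {
  unknownMakers: string[];
  missingDates: string[];
  sharedConcepts: Record<string, string[]>; // concept -> ids of records that share it
}

export function scanPatterns(records: RecordSketch[]): PatternBuckets {
  const buckets: PatternBuckets = { unknownMakers: [], missingDates: [], sharedConcepts: {} };
  const byConcept = new Map<string, string[]>();

  for (const record of records) {
    const maker = record.maker?.trim() ?? "";
    if (maker === "" || /^unknown/i.test(maker)) buckets.unknownMakers.push(record.id);
    if (!record.date?.trim()) buckets.missingDates.push(record.id);

    // De-duplicate concepts per record so one record cannot "share" with itself.
    Array.from(new Set(record.concepts)).forEach((concept) => {
      const ids = byConcept.get(concept) ?? [];
      ids.push(record.id);
      byConcept.set(concept, ids);
    });
  }

  // Only concepts attached to two or more records count as shared.
  byConcept.forEach((ids, concept) => {
    if (ids.length > 1) buckets.sharedConcepts[concept] = ids;
  });
  return buckets;
}
```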
Started in this pass:
- [x] ✅ `src/utils/patterns.ts` + tests.
- [x] ✅ `/api/patterns` route returning unknown + shared buckets.
- [x] ✅ `/api/graph` route returning artwork/entity nodes + edges with pivot hrefs.
- [x] ✅ `app/patterns/page.tsx` pattern scan UI.
- [x] ✅ `app/graph/page.tsx` + `src/components/graph-viewer.tsx` Cytoscape force-directed graph UI.
- [x] ✅ Runtime deps: `cytoscape`, `cytoscape-cose-bilkent`.
Slice 7 — Issues + SSE (DONE)
The streaming case.
- [x] ✅ Routes: `/issues` (GET), `/issues/webhook` (POST), `/issues/stream` (GET — `ReadableStream`, `export const dynamic = 'force-dynamic'`)
- [x] ✅ Page: `app/issues/page.tsx` with the live-updating issue inventory (re-uses `_legacy/storage/linked-art-issues.json` as a fallback cache, refreshes from GitHub on a `revalidate` cadence)
Acceptance: Issues view loads from cache instantly, hot-updates from SSE without a refresh, and respects the same `GITHUB_OWNER`/`GITHUB_REPO`/`ISSUE_POLL_MS` env vars as the legacy server.
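A minimal sketch of the `ReadableStream`-based SSE approach is shown here, assuming a caller-supplied `loadIssues` function. The names `formatSseEvent` and `createIssueStream` are hypothetical; the actual route and service code may be organized differently.

```typescript
const encoder = new TextEncoder();

// One SSE frame: an event name, a single JSON data line, and a blank line.
export function formatSseEvent(event: string, data: unknown): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// `loadIssues` is a hypothetical loader (cache or GitHub); `pollMs` mirrors ISSUE_POLL_MS.
export function createIssueStream(pollMs: number, loadIssues: () => Promise<unknown[]>): Response {
  let timer: ReturnType<typeof setInterval> | undefined;
  let closed = false;

  const stream = new ReadableStream<Uint8Array>({
    async start(controller) {
      const push = async (): Promise<void> => {
        let frame: string;
        try {
          frame = formatSseEvent("issues", await loadIssues());
        } catch (error) {
          frame = formatSseEvent("error", { message: String(error) });
        }
        // Skip the write if the client disconnected while we were loading.
        if (!closed) controller.enqueue(encoder.encode(frame));
      };
      await push(); // send the current inventory immediately
      if (!closed) timer = setInterval(push, pollMs);
    },
    cancel() {
      closed = true;
      if (timer !== undefined) clearInterval(timer);
    },
  });

  return new Response(stream, {
    headers: {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache, no-transform",
      Connection: "keep-alive",
    },
  });
}
```

On the client, an `EventSource` pointed at the stream route can listen for the `issues` event and replace the rendered inventory without a page refresh.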
Started in this pass:
- [x] ✅ `src/services/issues.ts` service with legacy-compatible env vars, live GitHub fetch, local runtime cache, and `_legacy/storage/linked-art-issues.json` fallback.
- [x] ✅ `/api/issues` route with optional `?refresh=1` force refresh.
- [x] ✅ `/api/issues/webhook` route with optional signature verification and issue-event refresh hooks.
- [x] ✅ `/api/issues/stream` route using `ReadableStream` SSE (`dynamic = "force-dynamic"`).
- [x] ✅ `/issues/webhook` + `/issues/stream` direct route aliases for legacy-compatible pathing.
- [x] ✅ `app/issues/page.tsx` + `src/components/issues-workbench.tsx` live issue inventory UI with SSE updates.
- [x] ✅ New tests for service + routes: `tests/services/issues.test.ts`, `tests/api/issues.test.ts`, `tests/api/issues-webhook.test.ts`, `tests/api/issues-stream.test.ts`.
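The optional webhook signature verification could look roughly like the sketch below. GitHub signs the raw request body with HMAC-SHA256 and sends the result in the `X-Hub-Signature-256` header as `sha256=<hex>`. The function name, and the choice to skip verification when no secret is configured, are assumptions made for illustration.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

export function verifyGithubSignature(
  rawBody: string,
  signatureHeader: string | null,
  secret: string | undefined,
): boolean {
  // Assumption: verification is optional, so a missing secret means "skip".
  if (!secret) return true;
  if (!signatureHeader || !signatureHeader.startsWith("sha256=")) return false;

  const expected = "sha256=" + createHmac("sha256", secret).update(rawBody, "utf8").digest("hex");
  const a = Buffer.from(expected, "utf8");
  const b = Buffer.from(signatureHeader, "utf8");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return a.length === b.length && timingSafeEqual(a, b);
}
```

The check has to run against the raw body text, before any JSON parsing, because re-serialized JSON may not match the bytes GitHub signed.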
Slice 8 — Agents + Jobs + Content Generation + Automation (DONE)
Port the agent/job stubs as-is. No new agent logic this slice — just parity.
- [x] ✅ Routes: `/agents/run` (POST), `/content/generate` (POST), `/jobs` (GET), `/jobs/run` (POST)
- [x] ✅ Pages: `app/agents/page.tsx`, `app/automation/page.tsx`
Acceptance: Manual pattern scan + collection brief jobs runnable from the UI; output appears in `app/automation/page.tsx`.
Started in this pass:
- [x] ✅ `src/services/agents.ts` with legacy-parity agent/content stubs (`runAgent`, `generateContent`) and local fallback drafting.
- [x] ✅ `src/services/jobs.ts` JSON-backed manual jobs service with seeded defaults and `lastRun` updates.
- [x] ✅ `/api/agents/run`, `/api/content/generate`, `/api/jobs`, `/api/jobs/run` routes.
- [x] ✅ Legacy-compatible direct route aliases: `/agents/run`, `/content/generate`, `/jobs`, `/jobs/run`.
- [x] ✅ `app/agents/page.tsx` + `src/components/agents-workbench.tsx` for manual agent and content runs.
- [x] ✅ `app/automation/page.tsx` + `src/components/automation-workbench.tsx` for job execution and output review.
- [x] ✅ Route tests: `tests/api/agents-run.test.ts`, `tests/api/content-generate.test.ts`, `tests/api/jobs.test.ts`, `tests/api/jobs-run.test.ts`.
Slice 9 — Workspace chrome + design-system pass (Custom CSS) (DONE)
Now everything works route-by-route. Unify the visual layer.
- [x] ✅ `app/(workspace)/layout.tsx` route group — sidebar nav + topbar, all custom CSS
- [x] ✅ Expand `app/globals.css` design tokens + component classes to cover every recurring pattern; keep BEM-lite naming consistent
- [x] ✅ Implement the design-system atomics from SOTA §12.1 in `src/components/` with a `data-la-entity-id` attribute on every entity-derived element: `<LinkedDate>`, `<LinkedDimensions>`, `<LinkedLabel>`, `<UriBadge>`, `<RightsBadge>` (already in Slice 2), `<SourceBadge>`, `<CitationBlock>`
- [x] ✅ Entity cards (SOTA §12.2): `<ObjectCard>`, `<ActorCard>`, `<PlaceCard>`, `<ConceptCard>`
- [x] ✅ a11y pass: axe-core in CI; keyboard nav on all interactive surfaces; WCAG 2.1 AA on public pages
Acceptance: Lighthouse a11y ≥ 95 on `/`, `/explore`, `/artwork/[id]`. Storybook scaffold (vite-based, no Next coupling) for the atomic components.
Started in this pass:
- [x] ✅ `app/(workspace)/layout.tsx` route-group shell with keyboard-first skip link, sidebar navigation, and topbar.
- [x] ✅ Workspace pages moved under `app/(workspace)/...` so URLs stay unchanged while sharing common chrome.
- [x] ✅ Expanded `app/globals.css` with workspace shell classes and reusable design-system component classes.
- [x] ✅ Added atomics in `src/components/`: `LinkedDate`, `LinkedDimensions`, `LinkedLabel`, `UriBadge`, `SourceBadge`, `CitationBlock`; extended `RightsBadge` with `data-la-entity-id`.
- [x] ✅ Added entity cards in `src/components/`: `ObjectCard`, `ActorCard`, `PlaceCard`, `ConceptCard`.
- [x] ✅ Wired new components into `/explore`, `/artwork/[id]`, `/entity/[id]`, and `/records`.
- [x] ✅ Added component tests: `tests/components/linked-atomics.test.ts`, `tests/components/entity-cards.test.ts`.
- [x] ✅ Added CI workflow at `.github/workflows/ci.yml` running lint, tests, build, Playwright install, axe accessibility checks, and Lighthouse CI assertions.
- [x] ✅ Added Vite-based Storybook scaffold (`.storybook/*`) and atomic stories (`src/components/atomics.stories.tsx`), validated with `pnpm storybook:build`.
Slice 10 — Lift cleanup (DONE)
- [x] ✅ Delete `_legacy/` (empty by now or only contains files we deliberately chose not to port)
- [x] ✅ Rewrite `README.md` from the Next.js side; preserve the legacy product narrative
- [x] ✅ Confirm `metamuseum-legacy/` can be archived/deleted (verified absent at `C:\Projects\metamuseum-legacy`).
- [x] ✅ Security credential rotation moved to Era B operational preflight tracking (see `Pre-Era-C Operational Sign-Off` under Era B).
- [x] ✅ Document the env-var surface (`PORT`, `GITHUB_OWNER`, `GITHUB_REPO`, `ISSUE_POLL_MS`, future `DATABASE_URL`) in `docs/env.md`
Started in this pass:
- [x] ✅ Added automated parity gate test: `tests/quality/era-a-exit-gate.test.ts` (legacy route/view equivalents + route-test coverage checks).
- [x] ✅ Lighthouse a11y gate now runs reliably in local Windows env via `scripts/lighthouse-a11y.mjs` (no `chrome-launcher` temp-dir cleanup failure path).
- [x] ✅ `_legacy/` removed from workspace.
- [x] ✅ Verified `C:\Projects\metamuseum-legacy` is absent as of May 30, 2026 (already archived/deleted outside this repo).
Era A exit gate (must all be green):
- [x] ✅ Legacy API routes have Next 16 equivalents, tested (`tests/quality/era-a-exit-gate.test.ts`; current legacy snapshot evaluates to 33 route handlers).
- [x] ✅ Legacy SPA views have Next 16 page equivalents (`tests/quality/era-a-exit-gate.test.ts`; current legacy snapshot evaluates to 13 named views).
- [x] ✅ `pnpm build && pnpm test && pnpm lint` clean (validated in `metamuseum` conda env on May 30, 2026).
- [x] ✅ Lighthouse a11y ≥ 95 on the public pages (`/`, `/explore`, `/artwork/[id]`) via `pnpm lighthouse:ci`.
- [x] ✅ `_legacy/` deleted.
- [x] ✅ `metamuseum-legacy/` archived/deleted (path not present at `C:\Projects\metamuseum-legacy` on May 30, 2026).
---
Era B — Hardening (quarters, not weeks)
Goal: make the data layer trustworthy enough that AI agents and external consumers can rely on it. The Era A app keeps shipping during this era; we add validation and durable storage under it.
Slices in suggested order, but each is independently shippable:
B1 — Zod contracts + schema versioning
- [x] ✅ Mirror every `src/contracts/*.ts` as a Zod schema in `src/contracts/zod/*.ts`
- [x] ✅ Add `schemaVersion` field + a `src/utils/migrations/` registry
- [x] ✅ Server Actions and Route Handlers validate at the boundary with the Zod schemas
Status:
- [x] ✅ Completed.
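To make the `schemaVersion` + migrations-registry idea concrete, here is a dependency-free sketch. The Zod parse step is deliberately left out, and the specific migrations (a `rights` default, then `maker` to `makers`) are invented purely for illustration.

```typescript
type Versioned = { schemaVersion: number; [key: string]: unknown };
type Migration = (doc: Versioned) => Versioned;

export const CURRENT_SCHEMA_VERSION = 3;

// Key N upgrades a document from version N to N + 1 (hypothetical steps).
const migrations: Record<number, Migration> = {
  1: (doc) => ({ ...doc, rights: doc.rights ?? "unknown", schemaVersion: 2 }),
  2: (doc) => {
    const { maker, ...rest } = doc;
    return { ...rest, makers: maker ? [maker] : [], schemaVersion: 3 };
  },
};

export function migrateToCurrent(input: Versioned): Versioned {
  let doc = input;
  while (doc.schemaVersion < CURRENT_SCHEMA_VERSION) {
    const step = migrations[doc.schemaVersion];
    if (!step) {
      throw new Error(`No migration registered from schemaVersion ${doc.schemaVersion}`);
    }
    doc = step(doc); // each step must bump schemaVersion, or the loop would not end
  }
  return doc;
}
```

In a boundary handler, a stored document would typically pass through `migrateToCurrent` first and then through the current Zod schema, so only one schema version needs validating.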
B2 — Formal validation
- [x] ✅ Build a Python validation microservice (FastAPI + PySHACL + PyLD) — first non-Node service
- [x] ✅ SHACL shapes in `shapes/linked-art/*.shacl.ttl`, fixtures in `fixtures/linked-art/{pass,fail}/`
- [x] ✅ New route `app/api/validate/route.ts` proxies to it
- [x] ✅ New contract `ValidationReport` (SOTA §5.1) wired into inspect/import flows
- [x] ✅ Validation fixtures and route assertions are checked against [linked-art/LinkedArtModel1.0-Reference.md](linked-art/LinkedArtModel1.0-Reference.md) before merge.
Status:
- [x] ✅ Completed.
B3 — Postgres migration
- [x] ✅ Postgres 16 via Docker Compose for dev (`ops/docker-compose.yml`)
- [x] ✅ Migrate `storage/{records,jobs}.json` → Postgres JSONB tables via a one-time exporter that double-writes for one release, then cuts over
- [x] ✅ Replace `src/utils/storage.ts` JSON-file impl with a Postgres impl behind the same interface; no call sites change
- [x] ✅ Add Zod env validation for `DATABASE_URL` (runtime parsing in `src/utils/env.ts`, typed env keys in `next-env.d.ts`)
Status:
- [x] ✅ Completed.
- [x] ✅ Added `ops/docker-compose.yml` + `ops/postgres/init/01-storage.sql` (Postgres 16 dev runtime + table bootstrap).
- [x] ✅ Added `scripts/export-storage-to-postgres.ts` one-time exporter and `pnpm storage:export:postgres`.
- [x] ✅ `src/utils/storage.ts` now supports `file`, `double-write`, and `postgres` modes for the centralized managed-document contract behind unchanged `readJson/writeJson`.
- [x] ✅ Added Zod-backed runtime env parsing for `DATABASE_URL` + storage mode in `src/utils/env.ts` and typed env keys in `next-env.d.ts`.
- [x] ✅ Updated `records` and `jobs` services to avoid file-`stat` assumptions so Postgres cutover does not require call-site changes.
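The three storage modes behind an unchanged `readJson`/`writeJson` surface might be wired along these lines. The `Backend` interface, `createStorage`, and `memoryBackend` are hypothetical names, and reading from the file backend during `double-write` is an assumption about which side stays authoritative until cutover.

```typescript
type StorageMode = "file" | "double-write" | "postgres";

interface Backend {
  read(key: string): Promise<unknown>;
  write(key: string, value: unknown): Promise<void>;
}

export function createStorage(mode: StorageMode, file: Backend, postgres: Backend) {
  return {
    // Assumption: files stay the source of truth until the cutover to "postgres".
    async readJson(key: string): Promise<unknown> {
      return mode === "postgres" ? postgres.read(key) : file.read(key);
    },
    async writeJson(key: string, value: unknown): Promise<void> {
      if (mode !== "postgres") await file.write(key, value);
      if (mode !== "file") await postgres.write(key, value);
    },
  };
}

// In-memory backend used here in place of real file and Postgres implementations.
export function memoryBackend(): Backend & { data: Map<string, unknown> } {
  const data = new Map<string, unknown>();
  return {
    data,
    async read(key: string): Promise<unknown> {
      return data.has(key) ? data.get(key) : null;
    },
    async write(key: string, value: unknown): Promise<void> {
      data.set(key, JSON.parse(JSON.stringify(value))); // store a JSON-safe copy
    },
  };
}
```

Because callers only see `readJson` and `writeJson`, switching the mode is a configuration change and no call site needs to know which backend is active.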
B4 — Auth + roles
- [x] ✅ Verify and document production credential rotation for `AUTH_SECRET` + `AUTH_GITHUB_SECRET` (operational sign-off completed; see [`docs/ops/auth-credential-rotation.md`](docs/ops/auth-credential-rotation.md) and `Pre-Era-C Operational Sign-Off`).
- [x] ✅ Add Auth.js v5 with GitHub provider for write paths (`/api/records` POST, `/api/explorer/import`, `/api/getty/import`, `/api/met/import`, `/api/linked-art/import`, `/api/jobs/run`)
- [x] ✅ Roles: `public` (read only), `researcher` (read + import), `editor` (import + agent jobs), `admin` (all)
- [x] ✅ Middleware gates write routes; UI conditionally renders import/run buttons
Status:
- [x] ✅ Completed.
- [x] ✅ Added Auth.js v5 (`next-auth@5.0.0-beta.31`) with GitHub provider in root `auth.ts`.
- [x] ✅ Added `/api/auth/[...nextauth]` handlers.
- [x] ✅ Added centralized role mapping in `src/auth/roles.ts` (public/researcher/editor/admin with allowlist env vars).
- [x] ✅ Added Next 16 `proxy.ts` route gates for write paths (`/api/records` POST, `/api/explorer/import`, `/api/getty/import`, `/api/met/import`, `/api/linked-art/import`, `/api/jobs/run`) plus editor-only agent endpoints.
- [x] ✅ UI now conditionally enables import/run controls in linked-art, agents, and automation workbenches based on resolved role.
- [x] ✅ Rotate production GitHub OAuth credentials if any legacy values are still active (operational follow-up reminder remains until verified).
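The role ladder above can be sketched as a small grant table plus an allowlist-based resolver. The action names, the `Allowlists` shape, and the rule that any signed-in user outside the allowlists resolves to `researcher` are assumptions for illustration; the real mapping lives in `src/auth/roles.ts`.

```typescript
type Role = "public" | "researcher" | "editor" | "admin";
type Action = "read" | "import" | "run-agent-job" | "administer";

// Hypothetical grant table mirroring the role descriptions in the B4 list.
const grants: Record<Role, readonly Action[]> = {
  public: ["read"],
  researcher: ["read", "import"],
  editor: ["read", "import", "run-agent-job"],
  admin: ["read", "import", "run-agent-job", "administer"],
};

export function can(role: Role, action: Action): boolean {
  return grants[role].includes(action);
}

// Allowlists would come from env vars in the real app; names here are illustrative.
interface Allowlists {
  admins: string[];
  editors: string[];
}

export function resolveRole(githubLogin: string | null, lists: Allowlists): Role {
  if (!githubLogin) return "public"; // anonymous visitors are read-only
  const login = githubLogin.toLowerCase();
  if (lists.admins.some((name) => name.toLowerCase() === login)) return "admin";
  if (lists.editors.some((name) => name.toLowerCase() === login)) return "editor";
  return "researcher"; // assumption: any other signed-in user may import
}
```

A route gate can then reduce to `can(resolveRole(login, lists), "import")`, and the same check can decide whether the UI renders an import or run button.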
B5 — Provider expansion
- [x] ✅ New adapters for Harvard, Smithsonian Open Access, Rijks, RKD Knowledge Graph, National Gallery of Art Open Data, Louvre Collections JSON, V&A Collections API, Princeton University Art Museum API, Europeana, AIC, CMA (SOTA §27.1) — each shipped with route + adapter + tests.
- [x] ✅ Each adapter implements the `provider-interface` contract; cross-adapter imports remain forbidden.
- [x] ✅ `/explore` import flow now accepts all landed provider source IDs (`met`, `getty`, `rijks`, `nga`, `louvre`, `harvard`, `smithsonian`, `vanda`, `princeton`, `europeana`, `aic`, `cma`).
- [x] ✅ Provider slices include executable conformance tests mapped to [linked-art/LinkedArtModel1.0-Reference.md](linked-art/LinkedArtModel1.0-Reference.md) fixture anchors.
Acceptance for each upcoming provider slice (object-specific AIDD + TDD gates):
- [x] ✅ Failing-first contract tests verify culturally valued physical objects normalize as `HumanMadeObject` unless explicit evidence requires another canonical class.
- [x] ✅ Failing-first tests verify production/destruction remain structured activity/event nodes when present (not flattened to display-only strings).
- [x] ✅ Failing-first tests verify physical characteristics (dimensions/materials/parts) are preserved as structured data when available.
- [x] ✅ Failing-first tests verify ownership/location assertions remain distinct from non-ownership rights/reuse assertions.
- [x] ✅ Failing-first tests verify physical object identity remains distinct from digital surrogates/representations while preserving linkage.
- [x] ✅ Failing-first regression tests verify immovable/place-centric records are not coerced into moveable-object assumptions.
- [x] ✅ Route-level tests verify inspect/import outputs preserve the above structures without mutating canonical source fields.
Acceptance for each upcoming provider slice (digital-content AIDD + TDD gates):
- [x] ✅ Failing-first contract tests verify `DigitalObject` records preserve `access_point`, `format`, and `conforms_to` when provided.
- [x] ✅ Failing-first tests verify digital object creation events use `Creation` semantics where present, without coercion to physical-object production semantics.
- [x] ✅ Failing-first tests verify content/carrier separation is preserved (`DigitalObject` versus `VisualItem`/`LinguisticObject`) without collapsing layers.
- [x] ✅ Failing-first tests verify surrogate linkage can preserve shared visual content (`shows` and `digitally_shows`) when provider data supports it.
- [x] ✅ Failing-first tests verify web-page and document references preserve the `subject_of` → `LinguisticObject` → `digitally_carried_by` pattern when present.
- [x] ✅ Failing-first tests verify IIIF structures are preserved, including Presentation manifest `conforms_to`/`format` and Image API `DigitalService` via `digitally_available_via`.
- [x] ✅ Route-level tests verify inspect/import outputs retain digital metadata structures (including IIIF fields) and do not mutate canonical `_source.raw`.
- [x] ✅ Provider-slice test PRs must include or update fixture-backed tests mapped to `Round 3 Addendum — Digital Content` → `Fixture Anchors — Digital Content Examples` in [linked-art/LinkedArtModel1.0-Reference.md](linked-art/LinkedArtModel1.0-Reference.md) (web publication, surrogate parity, embedded representation image, subject_of web page, IIIF Presentation manifest, IIIF Image service).
- [x] ✅ Provider-slice test PRs must also include a short "Standards Mapping" note listing the specific round addenda used (for example endpoint schema rounds, shared-structure rounds, and relevant search-relation rounds) and the fixture anchors exercised.
Status:
- [x] ✅ Complete (Rijks, NGA, RKD, Louvre, Harvard, Smithsonian, V&A, Princeton, Europeana, AIC, CMA slices landed).
- [x] ✅ Object-specific acceptance gates are executable and passing in `tests/quality/provider-object-specific-gates.test.ts`.
- [x] ✅ Digital-content acceptance gates are executable and passing in `tests/quality/provider-digital-content-gates.test.ts`.
- [x] ✅ Route-level digital inspect/import preservation is executable and passing in `tests/quality/provider-digital-content-gates.test.ts`.
- [x] ✅ Provider manifest enforcement now requires active import providers to include `Round 3 Addendum - Digital Content` in `tests/fixtures/validation/provider-fixture-manifest.json` (`tests/quality/validation-architecture-depth.test.ts`).
- [x] ✅ PR template now requires explicit provider digital-content fixture-anchor mapping + short Standards Mapping notes (`.github/pull_request_template.md`).
- [x] ✅ Added `src/adapters/rijks.ts` with Search + Resolver + LDES + Change Discovery helpers and IIIF URL normalization.
- [x] ✅ Added Rijks API routes: `/api/rijks/profile`, `/api/rijks/search`, `/api/rijks/resolve`, `/api/rijks/import`, `/api/rijks/ldes`, `/api/rijks/cd`.
- [x] ✅ `/explore` source toggle now supports `both` / `met` / `getty` / `rijks`; `/api/explorer/import` supports Rijks URLs.
- [x] ✅ Added test coverage for adapter + routes + provider inference (`tests/adapters/rijks.test.ts`, `tests/api/rijks/*`, updated explorer/provider tests).
- [x] ✅ Added `src/adapters/nga.ts` plus routes `/api/nga/profile`, `/api/nga/search`, `/api/nga/import` with failing-first tests and explore/import wiring.
- [x] ✅ Remaining provider slices are now landed: RKD Knowledge Graph, Louvre Collections JSON, Harvard, Smithsonian Open Access, V&A Collections API, Princeton University Art Museum API, Europeana, AIC, CMA.
- [x] ✅ Rijks incremental-ingest hooks are executable: `extractRijksLdesHookData()` and `extractRijksChangeDiscoveryHookData()` with route coverage in `tests/api/rijks/ldes.test.ts` and `tests/api/rijks/cd.test.ts`.
- [x] ✅ Rijks profile now exposes a bibliographic SRU extension-point base (`bibliographicSruBase`) and SRU URL builder coverage (`buildRijksSruSearchUrl()`), while UI flows remain unchanged.
Rijksmuseum integration scope now includes:
- [x] ✅ Object metadata search and dereference pipeline (Search API + PID Resolver with content negotiation to Linked Art).
- [x] ✅ Linked Data Event Streams ingest hooks for incremental refreshes.
- [x] ✅ IIIF Change Discovery ingest hooks for change tracking.
- [x] ✅ IIIF image/presentation compatibility via Micrio endpoints.
- [x] ✅ Future bibliographic extension point via SRU (planned, not yet wired into UI flows).
B5.1 — RKD Knowledge Graph provider slice (done)
Goal: integrate RKD Linked Data (CIDOC-CRM + Linked Art oriented) as a standards-first provider without bypassing current adapter boundaries.
Planned deliverables:
- [x] ✅ Adapter: `src/adapters/rkd.ts` (provider-interface compliant, no cross-adapter imports).
- [x] ✅ Routes:
  - [x] ✅ `/api/rkd/profile` (GET)
  - [x] ✅ `/api/rkd/search` (POST, paged candidate retrieval)
  - [x] ✅ `/api/rkd/entity` (POST, URI-based fetch/enrichment)
  - [x] ✅ `/api/rkd/import` (POST, normalize + persist)
  - [x] ✅ optional `/api/rkd/sparql` (POST, read-only, allowlisted query templates only)
- [x] ✅ UI:
  - [x] ✅ added `rkd` source toggle support in `/explore`
  - [x] ✅ provider attribution + ODC-By 1.0 license/reuse guidance rendered in RKD card/detail data.
Data-source constraints:
- [x] ✅ Dataset scale is 600M+ statements and is consumed via bounded queries/pagination (`limit`/`offset` clamps and bounded search defaults).
- [x] ✅ SPARQL endpoint details are deploy-time config (env vars) via `getRkdProfile()`/`buildRkdSparqlEndpoint()`.
- [x] ✅ Graph scoping is supported via optional `graph` input on search/entity/template flows.
- [x] ✅ Raw SPARQL usage is constrained to controlled, allowlisted query templates (`entitySummary`/`labelSearch`) for route-level access.
- [x] ✅ Triply protocol-compatible SPARQL request behavior is implemented with explicit `Accept` negotiation and read-only request handling.
License and attribution:
- [x] ✅ RKD dataset license is Open Data Commons Attribution License 1.0; provider output carries source URL + provider attribution + license metadata.
- [x] ✅ Rights/reuse output remains conservative when image-level rights are not explicit in source payload.
Acceptance:
- [x] ✅ Failing-first tests for adapter + routes are landed (`tests/adapters/rkd.test.ts`, `tests/api/rkd/*.test.ts`).
- [x] ✅ Standards mapping coverage includes object/digital/shared-structure/data-discovery anchors in `tests/fixtures/validation/provider-fixture-manifest.json` (`rkd` entry).
- [x] ✅ B8 protocol conformance coverage includes RKD routes in `tests/quality/provider-protocol-conformance.test.ts`.
- [x] ✅ Token security checks included (`Authorization: Bearer` from env-only `RKD_TRIPLY_TOKEN`, no token persistence/logging in adapter/route flows).
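As a sketch of how allowlisted templates with clamped paging keep raw SPARQL out of callers' hands, consider the following. The template names `entitySummary` and `labelSearch` come from the list above; the query text, the `MAX_LIMIT` and default values, and the helper names are assumptions and not the adapter's actual implementation.

```typescript
type TemplateName = "entitySummary" | "labelSearch";

interface TemplateInput {
  value: string; // an entity IRI for entitySummary, a search term for labelSearch
  graph?: string;
  limit?: number;
  offset?: number;
}

const MAX_LIMIT = 100; // hypothetical clamp
const DEFAULT_LIMIT = 25; // hypothetical default

function clamp(n: number | undefined, min: number, max: number, fallback: number): number {
  if (n === undefined || !Number.isFinite(n)) return fallback;
  return Math.min(max, Math.max(min, Math.trunc(n)));
}

// Reject anything that could break out of an <IRI> token.
function assertIri(value: string): string {
  if (!/^https?:\/\/[^\s<>"{}|\\^`]+$/.test(value)) throw new Error("Invalid IRI");
  return value;
}

// Escape characters that could terminate a double-quoted SPARQL string literal.
function escapeLiteral(value: string): string {
  return value.replace(/\\/g, "\\\\").replace(/"/g, '\\"').replace(/\n/g, "\\n").replace(/\r/g, "\\r");
}

export function buildRkdQuery(name: TemplateName, input: TemplateInput): string {
  const limit = clamp(input.limit, 1, MAX_LIMIT, DEFAULT_LIMIT);
  const offset = clamp(input.offset, 0, Number.MAX_SAFE_INTEGER, 0);
  const open = input.graph ? `GRAPH <${assertIri(input.graph)}> { ` : "";
  const close = input.graph ? " }" : "";
  const page = `LIMIT ${limit} OFFSET ${offset}`;

  switch (name) {
    case "entitySummary":
      return `SELECT ?p ?o WHERE { ${open}<${assertIri(input.value)}> ?p ?o .${close} } ${page}`;
    case "labelSearch":
      return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> " +
        `SELECT ?s ?label WHERE { ${open}?s rdfs:label ?label . ` +
        `FILTER(CONTAINS(LCASE(STR(?label)), LCASE("${escapeLiteral(input.value)}")))${close} } ${page}`
      );
  }
}
```

Callers pick a template name and supply values; they never send query text, which is what keeps the endpoint read-only and bounded.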
B5.2 — Smithsonian Open Access provider slice (done)
Goal: integrate Smithsonian Open Access search/content APIs with secure API-key handling and standards-first normalization.
Planned deliverables:
- [x] ✅ Adapter: `src/adapters/smithsonian.ts` (provider-interface compliant, no cross-adapter imports).
- [x] ✅ Routes:
  - [x] ✅ `/api/smithsonian/profile` (GET)
  - [x] ✅ `/api/smithsonian/search` (POST)
  - [x] ✅ `/api/smithsonian/content` (POST)
  - [x] ✅ `/api/smithsonian/import` (POST)
- [x] ✅ UI:
  - [x] ✅ added `smithsonian` source toggle in `/explore`
  - [x] ✅ source attribution and conservative rights/reuse indicators are preserved in Smithsonian discovery/import card flows.
Official API constraints:
- [x] ✅ API key required (`api_key`) via data.gov registration (`SMITHSONIAN_API_KEY` enforced for Smithsonian search/content and URL-import retrieval).
- [x] ✅ Search pagination uses `start` + `rows`; route schema enforces integer bounds (`start >= 0`, `rows` in `1..1000`) with compatibility aliases for legacy callers.
- [x] ✅ Core category filters and row-group inputs are validated as explicit enums at the route boundary (`smithsonianSearchInputSchema` for category + `rowGroup: objects|archives`).
Acceptance:
- [x] ✅ Failing-first adapter + route tests with mocked Smithsonian responses.
- [x] ✅ API-key security checks included (env-only secrets, no key in logs/errors/client, API key stripped from exposed source URLs).
- [x] ✅ Standards Mapping note coverage includes fixture anchors for Smithsonian in `tests/fixtures/validation/provider-fixture-manifest.json`.
- [x] ✅ B8 protocol checks include Smithsonian routes (`profile`, `search`, `content`, `import`) in `tests/quality/provider-protocol-conformance.test.ts`.
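The paging bounds and key-stripping behavior described above can be sketched without any schema library. `parsePaging` and `stripApiKey` are hypothetical helper names, the default of 10 rows is an assumption, and the real route uses `smithsonianSearchInputSchema` instead.

```typescript
export interface Paging {
  start: number;
  rows: number;
}

// Enforce `start >= 0` and `rows` in 1..1000, both integers.
export function parsePaging(input: { start?: unknown; rows?: unknown }): Paging {
  const start = input.start === undefined ? 0 : input.start;
  const rows = input.rows === undefined ? 10 : input.rows; // hypothetical default
  if (typeof start !== "number" || !Number.isInteger(start) || start < 0) {
    throw new RangeError("start must be an integer >= 0");
  }
  if (typeof rows !== "number" || !Number.isInteger(rows) || rows < 1 || rows > 1000) {
    throw new RangeError("rows must be an integer in 1..1000");
  }
  return { start, rows };
}

// Remove the secret before a source URL is stored, logged, or sent to the client.
export function stripApiKey(sourceUrl: string): string {
  const url = new URL(sourceUrl);
  url.searchParams.delete("api_key");
  return url.toString();
}
```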
B5.3 — Harvard Art Museums provider slice
Goal: integrate Harvard Art Museums API with strong conformance to official usage constraints and Linked Art normalization boundaries.
Status:
- [x] ✅ Planned deliverables in this slice scope (adapter + routes) are complete.
Planned deliverables:
- [x] ✅ Adapter: `src/adapters/harvard.ts` (provider-interface compliant, no cross-adapter imports).
- [x] ✅ Routes:
  - [x] ✅ `/api/harvard/profile` (GET)
  - [x] ✅ `/api/harvard/search` (POST)
  - [x] ✅ `/api/harvard/object` (POST)
  - [x] ✅ `/api/harvard/import` (POST)
- [x] ✅ UI:
  - [x] ✅ add `harvard` source toggle in `/explore`
  - [x] ✅ preserve attribution/link-back + rights/reuse indicators on all surfaces.
Official API constraints:
- [x] ✅ API key required on all calls (`apikey` parameter).
- [x] ✅ Paging uses `size` (max 100) + `page`; adapter honors `info.next/info.prev` flows.
- [x] ✅ Respect call budget guidance (2500/day) and non-commercial + attribution terms.
- [x] ✅ Cache/storage policy enforces the two-week maximum retention guidance (`<=14 days`) unless explicit permission has been granted.
- [x] ✅ Use provider image URLs directly (no local copies).
Acceptance:
- [x] ✅ Failing-first adapter + route tests with mocked Harvard responses.
- [x] ✅ API key handling checks included (env-only key, no key in logs/errors/client surfaces).
- [x] ✅ Rate-budget and cache-TTL policy checks included (`<=14 days` cache window).
- [x] ✅ Standards Mapping note references applicable Linked Art rounds and fixture anchors.
- [x] ✅ B8 protocol checks included for all Harvard routes.
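The cache-window and call-budget constraints could be enforced with two small guards like these. This is a sketch under stated assumptions: the names are hypothetical, the budget resets per UTC calendar day, and the counter is in-process only, so a multi-instance deployment would need shared state.

```typescript
const MAX_CACHE_MS = 14 * 24 * 60 * 60 * 1000; // two-week retention guidance
const DAILY_CALL_BUDGET = 2500; // call budget guidance from the list above

export function isCacheEntryFresh(fetchedAt: Date, now: Date): boolean {
  const ageMs = now.getTime() - fetchedAt.getTime();
  return ageMs >= 0 && ageMs <= MAX_CACHE_MS;
}

export class DailyBudget {
  private readonly limit: number;
  private day = "";
  private used = 0;

  constructor(limit: number = DAILY_CALL_BUDGET) {
    this.limit = limit;
  }

  // Returns true and counts the call if budget remains for the current UTC day.
  tryConsume(now: Date): boolean {
    const today = now.toISOString().slice(0, 10);
    if (today !== this.day) {
      this.day = today; // a new day resets the counter
      this.used = 0;
    }
    if (this.used >= this.limit) return false;
    this.used += 1;
    return true;
  }
}
```

An adapter would check `tryConsume` before each outbound request and treat a stale cache entry as a miss that must be refetched or dropped.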
B5.4 — V&A Collections API provider slice
Goal: integrate V&A Collections API v2 with strong support for identifier/keyword filters and IIIF image/presentation link preservation.
Status:
- [x] ✅ Complete.
Planned deliverables:
- [x] ✅ Adapter: `src/adapters/vanda.ts` (provider-interface compliant, no cross-adapter imports).
- [x] ✅ Routes:
  - [x] ✅ `/api/vanda/profile` (GET)
  - [x] ✅ `/api/vanda/search` (POST)
  - [x] ✅ `/api/vanda/object` (POST)
  - [x] ✅ `/api/vanda/import` (POST)
- [x] ✅ UI:
  - [x] ✅ add `vanda` source toggle in `/explore`
  - [x] ✅ expose IIIF manifest/image links in artwork/detail surfaces where available.
Official API constraints:
- [x] ✅ API base is `https://api.vam.ac.uk/v2`.
- [x] ✅ Search result pages should honor official paging constraints (`size` cap 100).
- [x] ✅ API is suitable for dynamic subsets; bulk export flows should avoid naive high-volume API crawling.
- [x] ✅ Terms/licensing constraints and citation requirements must be preserved in downstream usage.
Acceptance:
- [x] ✅ Failing-first adapter + route tests with mocked V&A responses.
- [x] ✅ Identifier/keyword filter behavior tests included.
- [x] ✅ IIIF image/presentation field extraction tests included.
- [x] ✅ Standards Mapping note references applicable Linked Art rounds and fixture anchors.
- [x] ✅ Standards Mapping note: Linked Art Model 1.0 references include Digital Content + Shared Structures + Data Discovery/API endpoint-shape rounds; fixture anchors include `tests/fixtures/validation/providers/vanda/pass.json` and `tests/fixtures/validation/providers/vanda/fail.json` plus provider manifest mapping in `tests/fixtures/validation/provider-fixture-manifest.json`.
- [x] ✅ B8 protocol checks included for all V&A routes.
B5.5 — Princeton University Art Museum provider slice
Goal: integrate Princeton API object/search resources with strong preservation of nested research context and IIIF media references.
Planned deliverables:
- [x] ✅ Adapter: `src/adapters/princeton.ts` (provider-interface compliant, no cross-adapter imports).
- [x] ✅ Routes:
  - [x] ✅ `/api/princeton/profile` (GET)
  - [x] ✅ `/api/princeton/search` (POST)
  - [x] ✅ `/api/princeton/object` (POST)
  - [x] ✅ `/api/princeton/import` (POST)
- [x] ✅ UI:
  - [x] ✅ add `princeton` source toggle in `/explore`
  - [x] ✅ preserve source attribution and media link visibility.
Official API constraints:
- [x] ✅ Base endpoint `https://data.artmuseum.princeton.edu`.
- [x] ✅ No auth is currently required; the adapter/profile surfaces an explicit `authMode: none` and stays compatible with future auth without breaking the interface.
- [x] ✅ Static weekly full datasets are reflected in import guidance with anti-crawl guardrails on large interactive API imports.
Acceptance:
- [x] ✅ Failing-first adapter + route tests with mocked Princeton responses.
- [x] ✅ Nested-field preservation tests included (`texts`, `media`, `exhibitions`, `geography`, `terms`, `classifications`).
- [x] ✅ IIIF URI extraction tests included.
- [x] ✅ Standards Mapping note references applicable Linked Art rounds and fixture anchors.
- [x] ✅ Standards Mapping note: Linked Art Model 1.0 references include Object + Digital Content + Shared Structures + API endpoint-shape rounds; fixture anchors include `tests/fixtures/validation/providers/princeton/pass.json` and `tests/fixtures/validation/providers/princeton/fail.json` plus provider manifest mapping in `tests/fixtures/validation/provider-fixture-manifest.json`.
- [x] ✅ B8 protocol checks included for all Princeton routes.
B5.6 — National Gallery of Art Open Data provider slice
Goal: integrate NGA public open data as a CSV-first provider while preserving Linked Art boundary contracts and provenance-safe source lineage.
Planned deliverables:
- [x] ✅ Adapter: `src/adapters/nga.ts` (provider-interface compliant, no cross-adapter imports).
- [x] ✅ Routes:
- [x] ✅ `/api/nga/profile` (GET)
- [x] ✅ `/api/nga/search` (POST)
- [x] ✅ `/api/nga/import` (POST)
- [x] ✅ optional `/api/nga/refresh` (POST) deferred by design; manual refresh is supported through `/api/nga/import` with `url` + bounded `limit`.
- [x] ✅ UI:
- [x] ✅ add `nga` source toggle in `/explore`
- [x] ✅ preserve source attribution, citation guidance, and conservative rights/reuse indicators.
- [x] ✅ `/explore` includes an NGA-specific citation/reuse advisory callout (CC0 metadata + verify image/media rights per object).
Official data constraints:
- [x] ✅ Primary distribution is CSV (UTF-8), refreshed frequently (typically daily), and should be ingested in bounded batches.
- [x] ✅ Images/media files are not distributed in the dataset package; only links/references are included where available.
- [x] ✅ Dataset is CC0; attribution/citation is still recommended for research usage.
- [x] ✅ Wikidata IDs are present when known but non-exhaustive; treat as reconciliation hints, not complete authority truth.
- [x] ✅ Enforcement evidence: adapter/profile/import tests in `tests/adapters/nga.test.ts` and `tests/api/nga/import.test.ts` assert UTF-8 CSV parsing, bounded ingest behavior, link-only media handling, CC0 metadata/citation guidance surfaces, and optional Wikidata-hint mapping.
Acceptance:
- [x] ✅ Failing-first adapter + route tests with fixture-backed CSV parsing and UTF-8 safety.
- [x] ✅ Idempotent upsert tests for repeated daily ingest runs.
- [x] ✅ Tests verifying preservation of source media-link references without assuming media binary availability.
- [x] ✅ Standards Mapping note references applicable Linked Art rounds and fixture anchors.
- [x] ✅ Standards Mapping note: Linked Art Model 1.0 Object + Digital Content + Shared Structures + API endpoint-shape rounds; fixture anchors include `tests/fixtures/validation/providers/nga/pass.json` and `tests/fixtures/validation/providers/nga/fail.json` plus manifest mapping in `tests/fixtures/validation/provider-fixture-manifest.json`.
- [x] ✅ B8 protocol checks included for all NGA routes.
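The bounded-batch and idempotent-upsert constraints above can be sketched as follows. This is a minimal illustration, not the code in `src/adapters/nga.ts`: the `NgaRow` shape, the `ingestBounded` name, and the in-memory `Map` store are all assumptions made for the example.

```typescript
// Hypothetical row shape; the real NGA CSV columns and adapter types may differ.
interface NgaRow {
  objectId: string;
  title: string;
  imageUrl?: string; // link-only media reference, never a binary payload
}

interface IngestResult {
  inserted: number;
  updated: number;
  unchanged: number;
  skipped: number; // rows beyond the batch limit
}

// Upsert at most `limit` rows into a store keyed by objectId.
// Re-running with the same rows changes nothing, which is the idempotence property.
function ingestBounded(
  store: Map<string, NgaRow>,
  rows: NgaRow[],
  limit: number,
): IngestResult {
  const result: IngestResult = { inserted: 0, updated: 0, unchanged: 0, skipped: 0 };
  const batch = rows.slice(0, Math.max(0, limit));
  result.skipped = rows.length - batch.length;

  for (const row of batch) {
    const existing = store.get(row.objectId);
    if (existing === undefined) {
      store.set(row.objectId, { ...row });
      result.inserted += 1;
    } else if (existing.title === row.title && existing.imageUrl === row.imageUrl) {
      result.unchanged += 1; // no-op upsert
    } else {
      store.set(row.objectId, { ...row });
      result.updated += 1;
    }
  }
  return result;
}
```

Calling `ingestBounded` twice with the same daily export should report only `unchanged` rows on the second run, which is the behavior the idempotent-upsert acceptance item describes.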
B5.7 — Louvre Collections JSON provider slice
Goal: integrate Louvre ARK-linked JSON records as a standards-first provider while preserving attribution/provenance nuance and image-rights constraints.
Planned deliverables:
- [x] ✅ Adapter: `src/adapters/louvre.ts` (provider-interface compliant, no cross-adapter imports).
- [x] ✅ Routes:
- [x] ✅ `/api/louvre/profile` (GET)
- [x] ✅ `/api/louvre/object` (POST)
- [x] ✅ `/api/louvre/import` (POST)
- [x] ✅ optional `/api/louvre/search` (POST) has landed as a bounded, protocol-safe endpoint.
- [x] ✅ UI:
- [x] ✅ add `louvre` source toggle in `/explore`
- [x] ✅ preserve source attribution and image-rights disclosures on all surfaces.
Official data constraints:
- [x] ✅ Access is object-entry URL plus `.json` suffix (ARK-based records).
- [x] ✅ Record content is French-first; normalization must not destructively strip source language signals.
- [x] ✅ Image usage and text reuse must follow Louvre Terms of Use.
- [x] ✅ Image payloads include per-image rights/copyright text and must be preserved.
- [x] ✅ Enforcement evidence: `tests/adapters/provider-expansion.test.ts` and `tests/api/louvre/import.test.ts` assert `.json` URL normalization, French-field preservation in `_source.raw`, and rights/copyright retention from source image payloads.
Acceptance:
- [x] ✅ Failing-first adapter + route tests with mocked Louvre JSON responses.
- [x] ✅ URL normalization + ARK extraction safety tests included.
- [x] ✅ Creator attribution nuance tests included (`attributionLevel`, `doubt`, `creatorRole`, attribution metadata where present).
- [x] ✅ Rights/reuse mapping tests included with conservative defaults when rights are unclear.
- [x] ✅ Standards Mapping note references applicable Linked Art rounds and fixture anchors.
- [x] ✅ B8 protocol checks included for all Louvre routes.
- Evidence:
- `tests/adapters/provider-expansion.test.ts`
- `tests/api/louvre/object.test.ts`
- `tests/api/louvre/import.test.ts`
- `tests/quality/provider-protocol-conformance.test.ts`
- `docs/providers/louvre-collections-json.md`
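The `.json` URL normalization and ARK extraction behavior asserted above might look roughly like this. It is a sketch under stated assumptions: `extractArk` and `toJsonUrl` are hypothetical names, and the `ark:/NAAN/name` pattern is a simplified shape rather than the adapter's actual parser.

```typescript
// Simplified ARK shape: ark:/<numeric NAAN>/<alphanumeric name>. Assumed for this sketch.
const ARK_PATTERN = /ark:\/(\d+)\/([A-Za-z0-9]+)/;

// Return the bare ARK found in a URL or string, or null when none is present.
function extractArk(input: string): string | null {
  const match = ARK_PATTERN.exec(input);
  return match ? `ark:/${match[1]}/${match[2]}` : null;
}

// Turn an object-entry URL into its JSON record URL.
// Refuses inputs without an ARK instead of guessing at a record location.
function toJsonUrl(entryUrl: string): string | null {
  if (extractArk(entryUrl) === null) return null;
  // Drop query string and fragment, then any trailing slashes.
  const base = entryUrl.replace(/[?#].*$/, "").replace(/\/+$/, "");
  return base.endsWith(".json") ? base : `${base}.json`;
}
```

For an illustrative entry URL such as `https://collections.louvre.fr/ark:/53355/cl010062370`, `toJsonUrl` appends `.json` once and leaves an already-normalized URL unchanged.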
B6 — Authority caching
- [x] ✅ Replace inline authority lookups on the request path with local cache-only access.
- [x] ✅ Schedule a daily/weekly job that downloads Getty AAT/ULAN/TGN N-Triples + Wikidata `linked-art`-related QIDs + GeoNames + LoC NAF into Postgres (SOTA §6.2).
- [x] ✅ Surface authority provenance on the entity profile pages: "From AAT / ULAN / Wikidata".
Status:
- [x] ✅ `src/services/authority-cache.ts` landed as the local authority cache service.
- [x] ✅ `tests/quality/era-b-exit-gate.test.ts` enforces zero runtime external authority fetches on request-path route code.
- [x] ✅ Scheduled authority refresh pipeline landed:
- [x] ✅ `scripts/authority-cache-refresh.ts`
- [x] ✅ `pnpm authority:refresh`
- [x] ✅ `.github/workflows/authority-cache-refresh.yml` (weekly + manual dispatch)
- [x] ✅ Entity authority-source UX landed:
- [x] ✅ `src/utils/entities.ts` emits `authoritySources`
- [x] ✅ `app/(workspace)/entity/[id]/page.tsx` renders `From AAT / ULAN / Wikidata` style provenance labels when present
- [x] ✅ coverage in `tests/utils/entities.test.ts` and `tests/api/entities/by-id.test.ts`
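A minimal sketch of the cache-only access pattern described in B6, assuming an in-memory map in place of the Postgres-backed cache. `AuthorityCache`, `AuthorityEntry`, and `formatAuthoritySources` are illustrative names and do not mirror `src/services/authority-cache.ts`.

```typescript
type AuthoritySource = "AAT" | "ULAN" | "TGN" | "Wikidata" | "GeoNames" | "LoC NAF";

interface AuthorityEntry {
  uri: string;
  label: string;
  source: AuthoritySource;
}

class AuthorityCache {
  private entries = new Map<string, AuthorityEntry>();

  // Called by the scheduled refresh job only, never from a request handler.
  replaceAll(snapshot: AuthorityEntry[]): void {
    this.entries = new Map(snapshot.map((entry) => [entry.uri, entry]));
  }

  // Request-path lookup: local only. A miss returns null instead of fetching upstream.
  lookup(uri: string): AuthorityEntry | null {
    return this.entries.get(uri) ?? null;
  }
}

// Build the provenance label shown on entity pages, e.g. "From AAT / Wikidata".
function formatAuthoritySources(entries: AuthorityEntry[]): string | null {
  const sources = [...new Set(entries.map((entry) => entry.source))];
  return sources.length > 0 ? `From ${sources.join(" / ")}` : null;
}
```

The point of the design is that a cache miss degrades to "no authority context" rather than an outbound call, which is what the Era B exit gate's zero-runtime-fetch check enforces.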
B6.1 — Exhibition + literature reconciliation hardening
Goal: prevent duplicate or fragmented cross-provider historical narratives by reconciling shared exhibitions and literature records without collapsing source provenance.
- [x] ✅ Add explicit reconciliation scope beyond people/concepts:
- [x] ✅ Exhibition concepts/plans (`PropositionalObject`) and exhibition activities (`Activity` classified as exhibition) are candidate-matched across providers.
- [x] ✅ Literature records (`LinguisticObject`) including catalogs/publications about exhibitions or objects are candidate-matched across providers.
- [x] ✅ Add deterministic candidate blocking + scoring pipeline:
- [x] ✅ title/label normalization + language-aware comparison
- [x] ✅ timespan overlap logic
- [x] ✅ place/venue equivalence checks using local authority identifiers/labels
- [x] ✅ identifier evidence (ISBN/ISSN/OCLC/DOI/local accession refs when present)
- [x] ✅ participant/organizer/publisher overlap evidence
- [x] ✅ Preserve Linked Art identity/provenance invariants:
- [x] ✅ never rewrite source URIs in `_source.raw`
- [x] ✅ never infer semantics from URI path shape
- [x] ✅ link via explicit reconciliation decisions rather than destructive record collapse
- [x] ✅ keep event-centric modeling (no direct object-person shortcut introduced by reconciliation)
- [x] ✅ Add human-review queue gates for ambiguous matches:
- [x] ✅ thresholds for auto-link vs review-required vs no-link (`>=0.90`, `0.65-0.89`, `<0.65`)
- [x] ✅ audit metadata per reconciliation decision (`actor`, `recordedAt`)
- [x] ✅ reversible decision model shape (`auto-link` / `needs-review` / `no-link`)
- [x] ✅ Add failing-first fixture suite:
- [x] ✅ pass cases for true exhibition/literature same-as candidates from different providers
- [x] ✅ fail cases for near-title collisions, edition conflicts, and time/place mismatches
- [x] ✅ regression coverage for threshold behavior and invariants
Definition of done:
- [x] ✅ `tests/quality/reconciliation-exhibitions-literature.test.ts` passes with fixture-backed pass/fail coverage.
- [x] ✅ Reconciliation outputs are provenance-safe and standards-mapped (round + fixture anchor references in PR).
- [x] ✅ Entity pages expose linked "same exhibition"/"same publication" context without mutating source records.
Implementation note:
- [x] ✅ Use [reconciliation/exhibition-literature-reconciliation.md](reconciliation/exhibition-literature-reconciliation.md) as the required execution checklist for this slice.
Status:
- [x] ✅ `src/services/reconciliation.ts` landed with explicit exhibition/publication candidate extraction, deterministic scoring, and human-review thresholds.
- [x] ✅ `tests/fixtures/reconciliation/exhibitions-literature-pass.json` + `tests/fixtures/reconciliation/exhibitions-literature-fail.json` landed as fixture anchors.
- [x] ✅ `app/(workspace)/entity/[id]/page.tsx` now surfaces cross-provider alignment context where reconciliation candidates exist.
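The B6.1 review gates (`>=0.90`, `0.65-0.89`, `<0.65`) and the per-decision audit metadata can be sketched as below. This is illustrative only: `decideLink` and the `ReconciliationDecision` shape are assumed names, and scores that fall between 0.89 and 0.90 are assumed to route to review.

```typescript
type LinkDecision = "auto-link" | "needs-review" | "no-link";

interface ReconciliationDecision {
  decision: LinkDecision;
  score: number;
  actor: string; // who or what recorded the decision
  recordedAt: string; // ISO 8601 timestamp
}

// Map a candidate score in [0, 1] to a decision band and attach audit metadata.
function decideLink(score: number, actor: string, now: Date = new Date()): ReconciliationDecision {
  if (!Number.isFinite(score) || score < 0 || score > 1) {
    throw new RangeError(`score must be within [0, 1], got ${score}`);
  }
  let decision: LinkDecision;
  if (score >= 0.9) {
    decision = "auto-link";
  } else if (score >= 0.65) {
    decision = "needs-review"; // assumption: 0.89 < score < 0.90 also goes to review
  } else {
    decision = "no-link";
  }
  return { decision, score, actor, recordedAt: now.toISOString() };
}
```

Because the function returns a decision record instead of merging records, a later reviewer can overturn it without touching `_source.raw`, in line with the reversible decision model above.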
B8 — API protocol + profile conformance hardening
- [x] ✅ Enforce JSON-LD 1.1 output with canonical Linked Art context on all public entity payloads.
- [x] ✅ Add explicit content negotiation behavior:
- [x] ✅ `Accept: application/ld+json;profile="https://linked.art/ns/v1/linked-art.json"`
- [x] ✅ graceful fallback when generic JSON or JSON-LD `Accept` headers are used
- [x] ✅ Add `GET` + `OPTIONS` support and baseline CORS behavior on public API endpoints.
- [x] ✅ Add protocol tests asserting URI opacity: no handler or client helper may derive semantics by parsing URI path shapes.
- [x] ✅ Add serialization tests ensuring multi-valued Linked Art fields remain arrays even when cardinality is one.
Status:
- [x] ✅ Complete for current Era B route inventory.
- [x] ✅ Representative executable conformance tests now run for `/api/linked-art/profile`, `/api/artworks/[id]`, and `/api/entities/[id]`:
- [x] ✅ `OPTIONS` + baseline CORS headers
- [x] ✅ Linked Art media type negotiation for `Accept: application/ld+json;profile=...`
- [x] ✅ URI opacity, array cardinality safety, and HAL separation assertions
- [x] ✅ representative entity-role coverage across object/work/agent/place/set
- [x] ✅ Expanded executable protocol checks to currently landed provider/search endpoints (`/api/met/`, `/api/getty/`, `/api/rijks/`, `/api/nga/`, `/api/rkd/`, plus `/api/providers/`) via `tests/quality/provider-protocol-conformance.test.ts`.
- [x] ✅ Generic JSON-LD fallback behavior is covered (`Accept: application/ld+json` negotiates to canonical Linked Art profile media type).
- [x] ✅ Ongoing policy: B8 executable checks are applied to all currently landed provider slices; continue applying the same checks for additional future sources.
Acceptance:
- [x] ✅ Protocol conformance tests pass for representative routes across object/work/agent/place/set.
- [x] ✅ No regressions in existing inspect/import flows.
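A hedged sketch of the B8 negotiation and array-cardinality rules. The fallback set (`application/json`, `application/*`, `*/*`) is an assumption about what "generic" covers, q-values are ignored for brevity, and `negotiateMediaType` and `asArray` are hypothetical helper names.

```typescript
const LINKED_ART_PROFILE = "https://linked.art/ns/v1/linked-art.json";
const CANONICAL_MEDIA_TYPE = `application/ld+json;profile="${LINKED_ART_PROFILE}"`;

// Media ranges that fall back to the canonical Linked Art type (assumed set).
const ACCEPTED_TYPES = new Set(["application/ld+json", "application/json", "application/*", "*/*"]);

// Return the response media type for an Accept header, or null when nothing matches.
// Simplification: parameters such as q-values and alternative profiles are not inspected.
function negotiateMediaType(accept: string | null): string | null {
  if (accept === null || accept.trim() === "") return CANONICAL_MEDIA_TYPE;
  for (const range of accept.split(",")) {
    const type = (range.split(";")[0] ?? "").trim().toLowerCase();
    if (ACCEPTED_TYPES.has(type)) return CANONICAL_MEDIA_TYPE;
  }
  return null; // caller would answer 406 Not Acceptable
}

// Keep multi-valued fields as arrays even when only one value is present.
function asArray<T>(value: T | T[] | undefined): T[] {
  if (value === undefined) return [];
  return Array.isArray(value) ? value : [value];
}
```

A route handler would set `Content-Type` to the returned value; serializers would pass fields like `identified_by` through `asArray` before emitting JSON-LD.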
B9 — Linked Art modeling guardrails (provenance + lifecycle)
- [x] ✅ Add conformance tests for provenance partitioning patterns:
- [x] ✅ wrapper provenance `Activity`
- [x] ✅ `Acquisition` + `Payment` as parts when both are asserted
- [x] ✅ Add explicit ownership vs custody invariants:
- [x] ✅ `TransferOfCustody` must not be rewritten into `Acquisition` unless title transfer evidence is present
- [x] ✅ Add explicit unknown-transfer handling:
- [x] ✅ use `Transfer` for ambiguous exchange events rather than fabricating legal outcomes.
- [x] ✅ Extend inspect/import audits to flag carrier/content conflation and direct object-person shortcuts that bypass event nodes.
Status:
- [x] ✅ Complete for current Era B guardrail scope.
- [x] ✅ Conformance guardrails added in `src/utils/linked-art.ts` for:
- [x] ✅ wrapper provenance Activity partitioning checks
- [x] ✅ `Acquisition` + `Payment` split-into-parts checks when both are asserted
- [x] ✅ custody-vs-title invariants (`TransferOfCustody` vs `Acquisition`)
- [x] ✅ unknown-transfer guardrails (`Transfer` for ambiguous exchanges)
- [x] ✅ carrier/content conflation and direct object-person shortcut detection
- [x] ✅ Failing-first pass/fail fixtures added:
- [x] ✅ `tests/fixtures/b9/provenance-guardrails-pass.json`
- [x] ✅ `tests/fixtures/b9/provenance-guardrails-fail.json`
- [x] ✅ Executable guardrail tests added in `tests/quality/linked-art-b9-guardrails.test.ts`.
- [x] ✅ Best-practices audit includes actionable B9 category output: `Provenance & Lifecycle Guardrails (B9)`.
Acceptance:
- [x] ✅ Failing-first fixtures prove the above patterns and invariants across pass/fail cases.
- [x] ✅ Best-practices audit reports actionable violations for these categories.
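One way to picture the custody-vs-title guardrail is a small audit over a provenance entry. This is a sketch, not the logic in `src/utils/linked-art.ts`: `auditProvenance` is a hypothetical name, and treating a non-empty `transferred_title_of` as the "title transfer evidence" is an assumption made for the example.

```typescript
interface ProvenanceNode {
  type: string;
  part?: ProvenanceNode[];
  transferred_title_of?: { id: string }[];
  transferred_custody_of?: { id: string }[];
}

// Return human-readable violations for one provenance entry; empty means it passes.
function auditProvenance(entry: ProvenanceNode): string[] {
  const issues: string[] = [];

  // The entry itself should be a wrapper Activity whose parts carry the specifics.
  if (entry.type !== "Activity") {
    issues.push(`wrapper is ${entry.type}, expected a provenance Activity`);
  }

  for (const part of entry.part ?? []) {
    const hasTitleEvidence = (part.transferred_title_of ?? []).length > 0;
    // Assumption: an Acquisition with no title-transfer target lacks evidence of ownership change.
    if (part.type === "Acquisition" && !hasTitleEvidence) {
      issues.push("Acquisition without title-transfer evidence; use TransferOfCustody or Transfer");
    }
  }
  return issues;
}
```

A pass fixture would wrap an `Acquisition` (with `transferred_title_of`) and a `Payment` as parts of one `Activity`; a fail fixture would assert an `Acquisition` where only custody is known.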
B10 — ARK conformance slice
- [x] ✅ ULID-based ARK minting for normalized records.
- [x] ✅ Add `/api/ark/resolve` resolver endpoint with suffix pass-through behavior.
- [x] ✅ Add `?info` inflection response for metadata + persistence statement retrieval.
- [x] ✅ Define and return a persistence statement structure for ARK `?info` responses.
- [x] ✅ Add failing-first tests for ARK utility behavior and resolver route behavior.
- [x] ✅ Update roadmap/README/CLAUDE standards guidance for ARK conformance expectations.
Status:
- [x] ✅ Complete for current Era B scope.
- [x] ✅ Implemented `src/utils/ark.ts` with opaque ULID minting, ARK normalization, suffix pass-through resolution, and `?info` payload construction.
- [x] ✅ Implemented `app/api/ark/resolve/route.ts` with:
- [x] ✅ `GET` resolution (`303` redirect style)
- [x] ✅ suffix pass-through for variant/service paths
- [x] ✅ `?info` inflection JSON-LD payload
- [x] ✅ `OPTIONS` + baseline CORS behavior
- [x] ✅ `normalizeIncomingRecord` now mints ARKs via the ULID-based helper (`mintArkIdentifier`) instead of short random tokens.
- [x] ✅ Added executable tests:
- [x] ✅ `tests/utils/ark.test.ts`
- [x] ✅ `tests/api/ark/resolve.test.ts`
Acceptance:
- [x] ✅ Resolver pass-through and inflection tests are green.
- [x] ✅ ARK minting tests assert opaque ULID-form ARK output.
- [x] ✅ ARK conformance behavior is now documented in project guidance.
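The minting and resolution behavior above can be sketched in a few lines. Everything named here is an assumption for illustration: `encodeUlid`, `mintArk`, and `resolveArk` do not mirror `src/utils/ark.ts`, NAAN `99999` is the value reserved for test ARKs, the `persistence` field is a placeholder for the real persistence statement structure, and `Math.random` stands in for a cryptographically secure source.

```typescript
const CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ";

// Encode a ULID: 10 chars of millisecond timestamp followed by 16 chars of randomness.
function encodeUlid(timeMs: number, random: () => number = Math.random): string {
  let time = "";
  let remaining = Math.floor(timeMs);
  for (let i = 0; i < 10; i++) {
    time = CROCKFORD.charAt(remaining % 32) + time;
    remaining = Math.floor(remaining / 32);
  }
  let rand = "";
  for (let i = 0; i < 16; i++) {
    rand += CROCKFORD.charAt(Math.floor(random() * 32)); // not crypto-secure; sketch only
  }
  return time + rand;
}

function mintArk(naan: string, timeMs: number = Date.now(), random: () => number = Math.random): string {
  return `ark:/${naan}/${encodeUlid(timeMs, random)}`;
}

type ArkResolution =
  | { kind: "redirect"; status: 303; location: string }
  | { kind: "info"; ark: string; target: string; persistence: string }
  | null;

// Resolve "ark:/NAAN/NAME[/suffix][?info]" against a table of known targets.
function resolveArk(request: string, targets: Map<string, string>): ArkResolution {
  const match = /^ark:\/(\d+)\/([0-9A-Z]+)(\/[^?]*)?(\?info)?$/.exec(request);
  if (!match) return null;
  const ark = `ark:/${match[1]}/${match[2]}`;
  const target = targets.get(ark);
  if (target === undefined) return null;
  if (match[4] !== undefined) {
    // Hypothetical persistence statement; the real structure is defined by the project.
    return { kind: "info", ark, target, persistence: "placeholder persistence statement" };
  }
  // Suffix pass-through: anything after the name is appended to the target unchanged.
  return { kind: "redirect", status: 303, location: target + (match[3] ?? "") };
}
```

The identifier carries no meaning beyond its timestamp prefix, so clients still have to treat it as opaque; only the resolver knows where it points.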
B7 — API gateway readiness for multi-source scale
- [x] ✅ Keep direct provider adapters as default while source count and traffic stay moderate (current implementation remains direct adapters).
- [x] ✅ Gateway activation policy is implemented and threshold-gated. Activate only when one or more conditions are true:
- [x] ✅ 6+ external providers in production.
- [x] ✅ 2+ upstream credential/security models to centralize.
- [x] ✅ cross-provider rate limiting/circuit breaking becomes operationally necessary.
- [x] ✅ Candidate gateway responsibilities are defined and readiness-backed:
- [x] ✅ Centralized auth/secrets policy, rate limiting, retries/circuit breakers, request/response telemetry, and provider health dashboards.
- [x] ✅ Stable internal route facade (`/api/providers/:provider/...`) so UI and jobs remain unchanged as provider backends evolve.
- [x] ✅ Response envelope standardization plus provider capability registry for dynamic UI feature flags.
- [x] ✅ B7 non-goals are explicitly enforced:
- [x] ✅ No business logic migration out of adapters.
- [x] ✅ No forced microservice split of the Next app.
- [x] ✅ No gateway requirement for local development.
Status:
- [x] ✅ Complete for Era B readiness scope.
- [x] ✅ Added gateway-readiness diagnostics endpoint: `/api/providers/readiness` (threshold evaluation without forcing architecture changes).
- [x] ✅ Added provider capability registry endpoint: `/api/providers/capabilities`.
- [x] ✅ Added stable internal facade routes: `/api/providers/:provider/profile|search|import` with standardized response envelopes.
- [x] ✅ Added conformance coverage for facade + capability routes in `tests/quality/provider-protocol-conformance.test.ts`.
- [x] ✅ Re-evaluated activation threshold with current production providers (Met, Getty, Rijks, NGA, RKD, Louvre, Harvard, Smithsonian, V&A, Princeton, Europeana, AIC, CMA): threshold is now hit (6+ providers), so `/api/providers/readiness` reports `gatewayRecommended: true` while direct adapters remain the active mode.
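The activation policy reduces to an "any condition true" check, sketched below. `GatewaySignals` and `evaluateGatewayReadiness` are hypothetical names; only the three conditions and the `gatewayRecommended` flag come from the policy above, and the actual `/api/providers/readiness` payload may carry more fields.

```typescript
interface GatewaySignals {
  productionProviders: number;
  credentialModels: number; // distinct upstream credential/security models
  crossProviderRateLimitingNeeded: boolean;
}

interface GatewayReadiness {
  gatewayRecommended: boolean;
  reasons: string[];
}

// Recommend a gateway when one or more activation conditions hold.
// A recommendation is advisory: it does not switch off direct-adapter mode.
function evaluateGatewayReadiness(signals: GatewaySignals): GatewayReadiness {
  const reasons: string[] = [];
  if (signals.productionProviders >= 6) {
    reasons.push(`${signals.productionProviders} providers in production (threshold: 6)`);
  }
  if (signals.credentialModels >= 2) {
    reasons.push(`${signals.credentialModels} credential models to centralize (threshold: 2)`);
  }
  if (signals.crossProviderRateLimitingNeeded) {
    reasons.push("cross-provider rate limiting or circuit breaking is operationally necessary");
  }
  return { gatewayRecommended: reasons.length > 0, reasons };
}
```

With the thirteen providers listed above, the first condition alone is enough to report `gatewayRecommended: true`.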
Era B exit gate:
- [x] ✅ 100% of public-facing sample records pass validation checks (SHACL when validator service is configured; local standards fallback otherwise).
- [x] ✅ All writes auth-gated; audit log row per primary write route.
- [x] ✅ Postgres is the storage of record when `DATABASE_URL` is set; `storage/*.json` removed from version control.
- [x] ✅ All authority lookups served from local cache policy (zero runtime authority calls on the request path).
- [x] ✅ Protocol/profile conformance suite green (context, media type profile, CORS/OPTIONS, URI opacity, array cardinality safety).
Era B completion verdict:
- [x] ✅ Engineering-complete. Core B1-B10 deliverables and Era B exit-gate checks are green.
- [x] ✅ Operational sign-off complete for current pre-Era-C checklist items.
Pre-Era-C Operational Sign-Off
- [x] ✅ Verify and record rotation of production `AUTH_SECRET` and `AUTH_GITHUB_SECRET` (with date and owner in release notes/runbook).
- [x] ✅ Evidence anchor: `docs/ops/auth-credential-rotation.md`; the rotation is signed off as complete in this roadmap section and in `docs/progress/2026-05-31/era-c-readiness-snapshot.md`.
- [x] ✅ Record explicit gateway activation decision now that `gatewayRecommended: true` is reported (`activate now` vs `keep direct-adapter mode`) with owner + review date.
- [x] ✅ Decision: keep direct-adapter mode active for now; do not force gateway activation yet.
- [x] ✅ Owner: `@rsung`
- [x] ✅ Review date: August 31, 2026
- [x] ✅ Decision record reference: `docs/progress/2026-05-31/era-c-readiness-snapshot.md`
- [x] ✅ Add a one-page Era C readiness snapshot to `docs/progress/` linking latest green evidence: `era-b-exit-gate`, `protocol-conformance`, `provider-protocol-conformance`, `linked-art-b9-guardrails`, `validation-drift:trend`.
---
Era C — SOTA platform (quarters 4+)
Goal: implement the Yale-LUX-pattern hybrid platform described in [LinkedArtSOTAWebApp.md](linked-art/LinkedArtSOTAWebApp.md) §§2–22. By this point, the Next.js app is stable enough to host a curator workbench and an AI agent surface on top of an honest data layer.
The numbering roughly follows the SOTA spec's phases, but starts here, after Eras A and B have shipped:
C1 — Multi-modal storage + HAL hypermedia (SOTA Phase 1)
- [x] ✅ Solr 9 + GraphDB provisioned via Helm (dev: Compose, GraphDB CE image)
- [x] ✅ GraphDB SPARQL 1.1 endpoint + Lucene plugin for hybrid text+graph queries; named graphs per source institution for provenance partitioning (SOTA §8.2)
- [x] ✅ RDFS + SHACL reasoning only at runtime — no full OWL DL (SOTA §8.2)
- [x] ✅ `src/utils/record-materializer.ts` builds Yale-LUX-style denormalized `Record` documents + shortcut triples (SOTA §20.1)
- [x] ✅ HAL `_links` on every entity response (SOTA §9.2)
- [x] ✅ Entity HAL discoverability now includes stable `la:activityFeed` link (`/api/activity`) via shared link builder + conformance tests.
- [x] ✅ Canonical role endpoint scaffolds landed for `/api/objects/[id]`, `/api/works/[id]`, `/api/agents/[id]`, `/api/places/[id]`, `/api/sets/[id]` with executable B8 protocol conformance coverage.
- [x] ✅ Add `/api/concepts/[id]` and `/api/events/[id]` canonical endpoints to complete the canonical C1 role endpoint surface.
- [x] ✅ `/api/search` landed with OrderedCollectionPage pagination contract and executable HAL/search conformance coverage.
- [x] ✅ `/api/activity` syndication endpoint
Status:
- [x] ✅ Dev Compose provisioning added in `ops/docker-compose.yml` (`sota` profile: `solr:9.6`, `ontotext/graphdb:10.8.14`).
- [x] ✅ Helm provisioning added in `ops/helm/metamuseum-search-graph/` (StatefulSets + Services + PVC defaults for Solr and GraphDB).
- [x] ✅ GraphDB bootstrap automation added:
- `scripts/graphdb-bootstrap.ts` creates repository config + verifies SPARQL query/update endpoints + provisions Lucene connector via `luc:createConnector`.
- Runtime reasoning policy enforced in bootstrap: `GRAPHDB_RULESET` accepts only `rdfsplus` / `rdfsplus-optimized` (OWL-family rulesets rejected).
- `scripts/graphdb-load-named-graph.ts` loads provider RDF into source-specific named graphs for provenance partitioning.
- `src/utils/provenance-graphs.ts` defines stable institution graph URIs.
- [x] ✅ Record materialization + index flattening foundations landed:
- `src/services/records.ts` now materializes on write for all import/persist paths.
- `src/utils/record-materializer.ts` emits denormalized shortcut fields/triples.
- `src/utils/search-index.ts` flattens materialized records into Solr/OpenSearch-ready documents using `_shortcuts`.
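The HAL `_links` items in C1 can be illustrated with a small shared link builder. This is a sketch under assumptions: `buildEntityLinks` and `withHalLinks` are hypothetical names, the `self` link shape is inferred from the canonical role endpoints listed above, and only `la:activityFeed` pointing at `/api/activity` is taken directly from the roadmap.

```typescript
type EntityRole = "objects" | "works" | "agents" | "places" | "sets" | "concepts" | "events";

interface HalLink {
  href: string;
}

// Build the HAL links shared by every entity response.
function buildEntityLinks(role: EntityRole, id: string): Record<string, HalLink> {
  return {
    self: { href: `/api/${role}/${encodeURIComponent(id)}` },
    "la:activityFeed": { href: "/api/activity" },
  };
}

// Attach `_links` next to the Linked Art payload without altering its fields.
function withHalLinks<T extends object>(
  entity: T,
  role: EntityRole,
  id: string,
): T & { _links: Record<string, HalLink> } {
  return { ...entity, _links: buildEntityLinks(role, id) };
}
```

Keeping the builder in one place means a new link relation is added once and picked up by every role endpoint, and clients follow `_links` rather than constructing URLs themselves.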
C2 — ETL pipeline + reconciliation (SOTA Phase 2)
- [x] ✅ `pipeline/` Dagster project — ELT, idempotent at every stage, SHA-256 dedupe keys
- [x] ✅ FastAPI reconciliation service hitting Getty SPARQL / VIAF / Wikidata / GeoNames behind Redis URI cache
- [x] ✅ Promote B6.1 exhibition/literature reconciliation heuristics into the C2 service as first-class pipelines (not optional post-processing)
- [x] ✅ Confidence thresholds per SOTA §7.3 with a human-review queue in `app/curator/reconciliation/page.tsx`
- [x] ✅ Visual ETL Mapper (ReactFlow) + `MappingTemplate` contract
Status:
- [x] ✅ `pipeline/` scaffold landed with Dagster project files (`pipeline/pyproject.toml`, `pipeline/requirements.txt`) and runnable entry points (`pipeline/run_materialize.py`, `metamuseum_pipeline.definitions`).
- [x] ✅ ELT asset chain implemented in `pipeline/metamuseum_pipeline/assets.py`: `extract_source_records` → `load_records` → `transform_records` → `dedupe_records` → `upsert_materialized_records`.
- [x] ✅ SHA-256 dedupe key policy implemented in `pipeline/metamuseum_pipeline/dedupe.py` (`canonical_json` + `record_sha256`) and consumed at load/dedupe/materialize stages.
- [x] ✅ Idempotence coverage added in `pipeline/tests/test_c2_pipeline.py` (repeat materialization does not duplicate state rows; unchanged rows are no-op upserts).
- [x] ✅ Reconciliation service scaffold landed in `services/reconciliation-service/`:
- FastAPI app with `POST /reconcile/lookup` and `GET /health`.
- Provider adapters for Getty SPARQL, VIAF AutoSuggest, Wikidata, and GeoNames.
- Redis URI cache layer with deterministic SHA-256 cache keys and TTL controls.
- Unit coverage in `services/reconciliation-service/tests/test_reconciliation_service.py`.
- [x] ✅ B6.1 heuristics promoted to first-class C2 pipeline endpoints in `services/reconciliation-service/main.py`:
- `GET /reconcile/pipelines`
- `POST /reconcile/pipelines/exhibitions-literature`
- `GET /reconcile/pipelines/exhibitions-literature/bands`
- Heuristic parity implementation in `services/reconciliation-service/pipelines.py` with fixture-backed tests in `services/reconciliation-service/tests/test_b61_pipeline.py`.
- [x] ✅ SOTA §7.3 threshold model and queue UI landed:
- Threshold bands implemented in `src/services/reconciliation.ts` (`>=0.95` auto-approve, `0.85-0.95` weekly digest flag, `0.70-0.85` human-review queue, `<0.70` drop candidate).
- Curator queue page added at `app/curator/reconciliation/page.tsx` with explicit human-review and weekly-digest sections.
- Regression assertions updated in `tests/quality/reconciliation-exhibitions-literature.test.ts`.
- [x] ✅ Visual ETL Mapper + MappingTemplate contract landed:
- `src/contracts/mapping-template.ts` defines `createMappingTemplate`/`validateMappingTemplate` for JSON-serializable mapper nodes, edges, and executable rules.
- `src/contracts/zod/mapping-template.ts` mirrors the contract boundary and is exported through `src/contracts/zod/index.ts`.
- ReactFlow mapper UI shipped at `app/(workspace)/etl/mapper/page.tsx` via `src/components/etl-mapper-workbench.tsx` with dry-run projection preview.
- Contract and schema coverage added in `tests/contracts/mapping-template.test.ts` and `tests/contracts/zod-mirror.test.ts`.
- [x] ✅ C2 complete. ETL pipeline, reconciliation service, B6.1 pipeline promotion, SOTA §7.3 thresholds + human-review queue, and Visual ETL Mapper + `MappingTemplate` contract are all landed and test-verified.
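The four C2 confidence bands and the two curator-queue sections can be sketched as a classifier plus a partition step. The names `bandFor` and `partitionForCurator` are illustrative, and the shared boundaries (0.85 and 0.95) are assumed to belong to the higher band, since the roadmap does not say which side is inclusive.

```typescript
type ConfidenceBand = "auto-approve" | "weekly-digest" | "human-review" | "drop";

interface Candidate {
  id: string;
  score: number;
}

// Classify a score into the C2 bands. Assumption: lower bounds are inclusive.
function bandFor(score: number): ConfidenceBand {
  if (score >= 0.95) return "auto-approve";
  if (score >= 0.85) return "weekly-digest";
  if (score >= 0.7) return "human-review";
  return "drop";
}

// Split candidates into the two sections a curator queue page would render.
function partitionForCurator(candidates: Candidate[]): {
  humanReview: Candidate[];
  weeklyDigest: Candidate[];
} {
  const humanReview: Candidate[] = [];
  const weeklyDigest: Candidate[] = [];
  for (const candidate of candidates) {
    const band = bandFor(candidate.score);
    if (band === "human-review") humanReview.push(candidate);
    else if (band === "weekly-digest") weeklyDigest.push(candidate);
    // auto-approve and drop never reach the curator queue
  }
  return { humanReview, weeklyDigest };
}
```

Note that these bands are stricter than the B6.1 gates: a 0.90 score that auto-linked under B6.1 lands in the weekly digest here.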
C3 — IIIF + visualizations (SOTA Phase 3)
- [x] ✅ `<IIIFCanvasViewer>` (OpenSeadragon wrapper) with deep-zoom, side-by-side compare, annotations
- [x] ✅ `<ProvenanceTimeline>`, `<GeoMapViewer>` (Leaflet), `<NetworkGraph>` (Cytoscape — partially landed in Slice 6)
- [x] ✅ `<ConcertinaList>` + `<FacetHistogram>` for dense entity browses (SOTA §12.3)
- [x] ✅ `<EntityKnowledgePanel>` merging internal + DBpedia/Wikidata/ULAN/AAT context
- [x] ✅ Overlapping exhibition timelines with zoom + direct navigation to exhibition/activity records
Status:
- [x] ✅ IIIF workspace route shipped at `/iiif` via `app/(workspace)/iiif/page.tsx` with imported-record-backed source selection.
- [x] ✅ OpenSeadragon wrapper shipped in `src/components/iiif-canvas-viewer.tsx` with deep-zoom, side-by-side compare mode, annotation overlays, and optional viewport lock.
- [x] ✅ Canvas source normalization + fallback behavior covered in `tests/utils/iiif.test.ts` (`deriveIiifInfoJsonUrl`, OpenSeadragon image tile fallback, deduplicated source extraction).
- [x] ✅ Insights workspace now composes first-class C3 visualization components:
- `src/components/provenance-timeline.tsx`
- `src/components/geo-map-viewer.tsx` (Leaflet)
- `src/components/network-graph.tsx` (Cytoscape)
- [x] ✅ `app/(workspace)/insights/page.tsx` now includes timeline + Leaflet geospatial view + Cytoscape relationship network in one drillable research workflow.
- [x] ✅ Deterministic timeline-window + histogram binning logic is test-covered in `tests/utils/provenance-visualization.test.ts`.
- [x] ✅ Dense entity browse route shipped at `/entities` via `app/(workspace)/entities/page.tsx`, with URL-driven `q`/`type`/`authority` filters.
- [x] ✅ `src/components/concertina-list.tsx` and `src/components/facet-histogram.tsx` are now first-class C3 components for high-density entity review.
- [x] ✅ Facet/filter/group model is test-covered in `tests/utils/entity-browse.test.ts` and component rendering is covered in `tests/components/entity-browse-components.test.ts`.
- [x] ✅ `app/(workspace)/entity/[id]/page.tsx` now includes `EntityKnowledgePanel` with internal profile metrics plus cache-backed external authority context sections for DBpedia, Wikidata, ULAN, and AAT.
- [x] ✅ Knowledge-model merge behavior is covered in `tests/utils/entity-knowledge.test.ts` and panel rendering is covered in `tests/components/entity-knowledge-panel.test.ts`.
- [x] ✅ Exhibition overlap analysis is now first-class in Insights via `src/components/exhibition-timeline.tsx` + `src/utils/exhibition-timeline.ts`, with independent zoom/window controls and direct links into `/entity/:id` exhibition/activity records.
C4 — AI layer (SOTA Phase 4)
- [x] ✅ pgvector + voyage-3 embeddings for entity summaries and statement texts; SigLIP for IIIF visual similarity
- [x] ✅ `/api/ai/query` — NL → SPARQL/HAL with mandatory SHACL pre-execution validation
- [x] ✅ `/api/ai/chat` — Graph-RAG with mandatory citations (`[entityId, propertyPath]` per sentence); "cite or refuse" rule (RSI-4 complete, proven 2026-06-09)
- [x] ✅ LLM-assisted reconciliation tiebreaker (SOTA §10.4)
- [x] ✅ LLM-assisted mapping for the Visual ETL Mapper
Status:
- [x] ✅ AI embedding service landed in `src/services/ai-layer.ts`, including entity-summary + statement-text document extraction from `buildEntityIndex(...)`, Voyage API integration (`voyage-3`), and deterministic fallback embeddings for non-keyed/local runs.
- [x] ✅ pgvector persistence landed with bootstrap SQL + upsert path:
- `ops/postgres/init/02-ai-layer.sql`
- `persistEmbeddingsPgvector(...)` in `src/services/ai-layer.ts`
- [x] ✅ AI API routes landed:
- `app/api/ai/embeddings/route.ts` (document build + embeddings + optional pgvector persist)
- `app/api/ai/visual-similarity/route.ts` (SigLIP service call path + heuristic fallback)
- [x] ✅ NL query API landed: `app/api/ai/query/route.ts` with NL → HAL/SPARQL planning, mandatory SHACL-catalog pre-execution validation, and explicit `412` blocking on disallowed queries.
- [x] ✅ Query execution logs now capture NL prompt + generated query for retraining/audit (`storage/ai-query-log.json` via `src/services/ai-query.ts`).
- [x] ✅ SigLIP visual-similarity fallback + IIIF candidate extraction landed (`extractVisualSimilarityCandidates`, `rankVisualSimilarity`) with representation/access-point awareness.
- [x] ✅ Dev infra for pgvector readiness updated: `ops/docker-compose.yml` now uses `pgvector/pgvector:pg16` for local Postgres bootstrap compatibility.
- [x] ✅ Coverage landed for C4 behavior:
- `tests/services/ai-layer.test.ts`
- `tests/api/ai-embeddings.test.ts`
- `tests/api/ai-visual-similarity.test.ts`
- `tests/services/ai-query.test.ts`
- `tests/api/ai-query.test.ts`
- [x] ✅ RSI-4 complete: `/api/ai/chat` implemented with graph-driven claim extraction, per-sentence citation enforcement, and refusal path when coverage is incomplete.
- Proof: `tests/api/ai-chat.test.ts`, `pnpm test`, `pnpm lint`, and `pnpm build`.
- Route contracts and behavior are reflected in `app/api/ai/chat/route.ts` and `src/services/ai-chat.ts`.
- [x] ✅ C4 Visual ETL Mapper AI assist is complete:
- `src/services/mapping-assist.ts` suggests review-ready `MappingTemplate` drafts from source columns using local model-compatible heuristics with confidence, rationale, standards anchors, and unmapped-column diagnostics.
- `app/api/ai/mapping-assist/route.ts` exposes a POST assist endpoint with explicit JSON validation and CORS preflight.
- `src/components/etl-mapper-workbench.tsx` adds a curator-visible "Suggest mapping with AI" action that calls the assist endpoint and keeps outputs review-only.
- Proof packet: `tests/services/mapping-assist.test.ts`, `tests/api/ai-mapping-assist.test.ts`, `tests/components/etl-mapper-config.test.ts`, and `tests/api/openapi.test.ts`.
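The "cite or refuse" rule for `/api/ai/chat` can be pictured as a gate applied before any answer is returned. This sketch is not the logic in `src/services/ai-chat.ts`: `enforceCitations`, the `DraftSentence` shape, and the check against a set of known entity IDs are assumptions made for illustration.

```typescript
interface Citation {
  entityId: string;
  propertyPath: string;
}

interface DraftSentence {
  text: string;
  citations: Citation[];
}

type ChatAnswer =
  | { status: "answered"; sentences: DraftSentence[] }
  | { status: "refused"; reason: string };

// Every sentence needs at least one citation that points at a known entity
// and names a property path; otherwise the whole answer is refused.
function enforceCitations(sentences: DraftSentence[], knownEntityIds: Set<string>): ChatAnswer {
  if (sentences.length === 0) {
    return { status: "refused", reason: "no supported claims were found" };
  }
  for (const sentence of sentences) {
    const valid = sentence.citations.filter(
      (citation) => knownEntityIds.has(citation.entityId) && citation.propertyPath.trim() !== "",
    );
    if (valid.length === 0) {
      return { status: "refused", reason: `uncited sentence: "${sentence.text}"` };
    }
  }
  return { status: "answered", sentences };
}
```

Refusing the whole answer, rather than silently dropping the uncited sentence, keeps partial coverage visible to the caller.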
C5 — Syndication + Meta Wiki Art + hardening (SOTA Phase 5)
- [x] ✅ ActivityStreams subscriptions endpoint open to external aggregators
- [x] ✅ `/api/activity` external-consumer readiness metric capture landed (`/api/activity/readiness` + per-request consumer telemetry with explicit `x-linked-art-consumer-id` support).
- [x] ✅ `/api/activity/subscriptions` landed for external aggregator registration/discovery:
- `GET /api/activity/subscriptions` (ActivityStreams `OrderedCollectionPage` shape + metrics)
- `POST /api/activity/subscriptions` (consumer-aware callback registration)
- `DELETE /api/activity/subscriptions?id=...` (unsubscribe)
- Public CORS preflight via `OPTIONS`
- [x] ✅ Meta Wiki Art publish flow: `WikiDraft` → review → publish to MediaWiki + custom Wikibase with citation + rights templates
- [x] ✅ C5 publish-flow implementation evidence:
- `app/api/wiki-drafts/route.ts`
- `app/api/wiki-drafts/[id]/review/route.ts`
- `app/api/wiki-drafts/[id]/publish/route.ts`
- `src/services/wiki-publish.ts`
- Verification (2026-05-31): `tests/api/wiki-drafts/flow.test.ts` and `tests/services/wiki-publish.test.ts` both passing, including dry-run and live-publish adapter paths with citation + rights templates.
- [x] ✅ Wikibase statement-level references required for every publishable claim (provenance-safe writes)
- `evaluateWikiPublishPreflight(...)` now validates every claim carries at least one statement-level reference with valid `sourceUrl`, `retrievedAt`, and `citationText` before publish can proceed.
- Evidence: `src/services/wiki-publish.ts`, `tests/services/wiki-publish.test.ts`, `tests/api/wiki-drafts/flow.test.ts`.
- [x] ✅ Bidirectional mapping maintained between internal entity IDs and wiki item/property IDs for traceable sync
- Live publish now upserts durable sync mappings in `wiki-sync-map.json` (Postgres-managed in non-file modes) for:
- internal entity ID ↔ wikibase item ID
- internal property ID ↔ wikibase property ID
- API lookup route landed for traceable sync diagnostics:
- `GET /api/wiki-sync-map?internalEntityId=...`
- `GET /api/wiki-sync-map?wikiItemId=...`
- `GET /api/wiki-sync-map?internalPropertyId=...`
- `GET /api/wiki-sync-map?wikiPropertyId=...`
- Evidence: `src/services/wiki-sync-map.ts`, `app/api/wiki-drafts/[id]/publish/route.ts`, `app/api/wiki-sync-map/route.ts`, `tests/services/wiki-sync-map.test.ts`, `tests/api/wiki-sync-map.test.ts`, `tests/api/wiki-drafts/flow.test.ts`.
- [x] ✅ WCAG 2.1 AA full audit on every public route
- Evidence (2026-05-31): `pnpm a11y:check` passes with zero serious/moderate/critical axe violations across all public UI routes:
- `/`, `/explore`, `/records`, `/linked-art`, `/patterns`, `/insights`, `/graph`, `/entities`, `/iiif`, `/issues`, `/agents`, `/automation`, `/etl/mapper`, `/roadmap`, `/getty`, `/curator/reconciliation`, `/artwork/[id]`, `/entity/[id]`.
- Remediations landed for detected violations:
- `src/components/geo-map-viewer.tsx` (resolved the nested-interactive violation by replacing the map container's `role="img"` semantics with a text description of the interactive container)
- `src/components/etl-mapper-workbench.tsx` (added keyboard focus to scrollable dry-run `<pre>` region).
- [x] ✅ k6 load test against SOTA §20.4 SLOs (API p95 < 200ms cached, < 500ms cold; search p95 < 300ms)
- Harness landed:
- `scripts/k6-slo.js` (scenario definitions + per-scenario thresholds)
- `scripts/k6-slo-runner.mjs` (local binary / PATH / Docker fallback runner)
- `pnpm k6:slo` and `pnpm k6:slo:ci`
- runbook: `docs/ops/k6-slo.md`
- Update (2026-06-10): evidence contract now covers the full SOTA §20.4 p95 set, including whitelisted SPARQL p95 `< 2s` and IIIF tile serving p95 `< 100ms`.
- `src/services/era-c-exit-gate.ts`
- `config/era-c-exit-gate-policy.json`
- `tests/services/era-c-exit-gate.test.ts`
- Verification (2026-05-31, `pnpm k6:slo`, summary export `artifacts/performance/k6-slo-summary.json`):
- `cached_record_hit` p95: 73.5495ms (target `< 200ms`) ✅
- `cold_record_read` p95: 56.134ms (target `< 500ms`) ✅
- `keyword_facet_search` p95: 55.0616ms (target `< 300ms`) ✅
- `http_req_failed` rate by scenario: 0.00 ✅
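As an illustration of how the verification figures above map onto the p95 targets, the sketch below compares observed values with their limits. The `SloTarget` and `SloResult` shapes and the `evaluateSlos` helper are assumptions made for this example; they are not the format of the k6 summary export or of `config/era-c-exit-gate-policy.json`.

```typescript
// Hypothetical shapes; the real summary export and policy file may differ.
type SloTarget = { metric: string; maxP95Ms: number };
type SloResult = {
  metric: string;
  p95Ms: number | undefined;
  status: "pass" | "breach" | "missing";
};

// p95 targets as quoted in this log for the three verified scenarios.
const sloTargets: SloTarget[] = [
  { metric: "cached_record_hit", maxP95Ms: 200 },
  { metric: "cold_record_read", maxP95Ms: 500 },
  { metric: "keyword_facet_search", maxP95Ms: 300 },
];

// A metric absent from the summary is reported as "missing", not as a breach.
function evaluateSlos(observed: Record<string, number | undefined>, targets: SloTarget[]): SloResult[] {
  return targets.map(({ metric, maxP95Ms }): SloResult => {
    const p95Ms = observed[metric];
    if (p95Ms === undefined) return { metric, p95Ms, status: "missing" };
    return { metric, p95Ms, status: p95Ms < maxP95Ms ? "pass" : "breach" };
  });
}

// Values from the 2026-05-31 verification run recorded above.
const sloResults = evaluateSlos(
  { cached_record_hit: 73.5495, cold_record_read: 56.134, keyword_facet_search: 55.0616 },
  sloTargets,
);
console.log(sloResults.map((r) => `${r.metric}: ${r.status}`));
```

Keeping "missing" distinct from "breach" mirrors the later exit-gate notes, which separate incomplete k6 summaries from real threshold breaches.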
- [x] ✅ OpenAPI 3.1 at `/api/docs`
- Implementation landed:
- `GET /api/openapi` returns generated OpenAPI 3.1 JSON from live route-handler discovery (`src/services/openapi.ts`).
- `GET /api/docs` serves interactive Swagger UI wired to `/api/openapi`.
- Evidence:
- `app/api/openapi/route.ts`
- `app/api/docs/route.ts`
- `src/services/openapi.ts`
- `tests/api/openapi.test.ts`
- `tests/api/docs.test.ts`
- Verification (2026-05-31): `pnpm test -- tests/api/openapi.test.ts tests/api/docs.test.ts` passing.
- [x] ✅ Pen test + DR drill
- Executable hardening gate landed:
- `pnpm pentest:baseline` (dependency advisory regression check against committed baseline)
- `pnpm dr:drill` (non-destructive restore rehearsal with SHA-256 parity checks)
- `pnpm hardening:pen-dr` (combined gate)
- Baselines + runbooks:
- `config/security-audit-baseline.json`
- `docs/ops/security-dr-drill.md`
- Artifacts:
- `artifacts/security/pnpm-audit-summary.json`
- `artifacts/dr-drill/latest.json`
Era C Principal Hardening Addenda (Staff/Principal Review)
These items are now integrated as an explicit execution backlog covering distributed systems, lifecycle integrity, AI safety, UX globalization, and privacy controls.
1) Infrastructure + distributed systems
- [x] ✅ Transactional outbox for Postgres → Solr/GraphDB consistency on write paths (`outbox_events` table + reliable projector worker + replay-safe idempotency keys).
- Landed transactional write-path integration in `src/services/records.ts`:
- Postgres mode now atomically upserts `storage_documents.records` and enqueues `outbox_events` in one transaction.
- Idempotency key: `sha256("record.upsert|recordId|sourceHash")`.
- Outbox persistence + retry lifecycle:
- `src/services/outbox.ts` (`claim`/`process`/`retry`/`dead_letter` flow, SKIP LOCKED claims, backoff).
- `ops/postgres/init/02-outbox.sql` bootstraps `outbox_events` + indexes.
- Reliable projector worker:
- `src/services/outbox-projector.ts` (Solr + GraphDB projection with per-event ack/fail handling).
- `scripts/outbox-projector.ts`, `pnpm outbox:projector`, `pnpm outbox:projector:once`.
- Ops docs:
- `docs/ops/outbox-projector.md`
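The idempotency key above can be sketched as follows. This is only an illustration of the `sha256("record.upsert|recordId|sourceHash")` formula: the function names, the hex encoding, and the in-memory `Set` (standing in for whatever durable deduplication the real worker uses) are assumptions.

```typescript
import { createHash } from "node:crypto";

// Derives the key from the three parts named in the log entry.
// Hex encoding is an assumption; the real service may encode differently.
function outboxIdempotencyKey(recordId: string, sourceHash: string): string {
  return createHash("sha256")
    .update(`record.upsert|${recordId}|${sourceHash}`)
    .digest("hex");
}

// Replay safety: a projector can skip any event whose key it has already seen.
const processedKeys = new Set<string>();

function shouldProject(recordId: string, sourceHash: string): boolean {
  const key = outboxIdempotencyKey(recordId, sourceHash);
  if (processedKeys.has(key)) return false; // duplicate delivery, safe to skip
  processedKeys.add(key);
  return true;
}

const first = shouldProject("https://example.org/object/1", "abc123");
const replay = shouldProject("https://example.org/object/1", "abc123");
const changed = shouldProject("https://example.org/object/1", "def456");
console.log({ first, replay, changed });
```

Because the source hash is part of the key, a replayed event is skipped while a genuinely changed record still produces a new projection.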
- [x] ✅ Outbox failure handling policy (retry budget, dead-letter queue, operator replay tooling, and alerting).
- Policy + queue health primitives:
- `src/services/outbox.ts`
- env-driven policy (`OUTBOX_MAX_ATTEMPTS`, queue/age thresholds)
- queue health summary counters/aging
- dead-letter listing, replay helpers, stale-processing requeue
- Operator tooling:
- `scripts/outbox-ops.ts`
- `pnpm outbox:status`
- `pnpm outbox:dlq:list`
- `pnpm outbox:replay:dlq`
- `pnpm outbox:requeue:stale`
- Alerting:
- `src/services/outbox-alerts.ts`
- `scripts/outbox-alert-check.ts`
- `pnpm outbox:alert:check` with optional webhook dispatch via `OUTBOX_ALERT_WEBHOOK_URL`
- Ops docs:
- `docs/ops/outbox-projector.md`
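A minimal sketch of a retry budget with dead-lettering, assuming exponential backoff. The log names `OUTBOX_MAX_ATTEMPTS` and a `retry`/`dead_letter` flow but does not state the backoff formula, so the function name, the doubling schedule, and the delay bounds below are all hypothetical.

```typescript
// Hypothetical outcome shape for a failed projection attempt.
type FailureOutcome =
  | { state: "retry"; attempts: number; delayMs: number }
  | { state: "dead_letter"; attempts: number };

// Decides what happens after one more failed attempt.
// maxAttempts would come from a policy value such as OUTBOX_MAX_ATTEMPTS.
function onProjectionFailure(
  previousAttempts: number,
  maxAttempts: number,
  baseDelayMs = 1000,
  maxDelayMs = 60000,
): FailureOutcome {
  const attempts = previousAttempts + 1;
  if (attempts >= maxAttempts) {
    return { state: "dead_letter", attempts }; // retry budget exhausted
  }
  // Assumed schedule: 1s, 2s, 4s, ... capped at maxDelayMs.
  const delayMs = Math.min(maxDelayMs, baseDelayMs * 2 ** (attempts - 1));
  return { state: "retry", attempts, delayMs };
}

console.log(onProjectionFailure(0, 5));
console.log(onProjectionFailure(4, 5));
```

Events that reach `dead_letter` would then be visible to operator tooling such as `pnpm outbox:dlq:list` and `pnpm outbox:replay:dlq`.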
- [x] ✅ OpenTelemetry end-to-end trace propagation across Next.js, validation service, reconciliation service, Dagster pipeline runs, and GraphDB/Solr calls.
- [x] ✅ Correlated request/run identifiers enforced in logs + traces (`x-request-id` / traceparent continuity).
- Evidence: `instrumentation.ts` (`@vercel/otel` registration), Python OTel bootstrap in `services/validation-service/main.py`, `services/reconciliation-service/main.py`, and pipeline run tracing in `pipeline/run_materialize.py`.
- Evidence: request/response trace header continuity via `src/utils/observability.ts`, `src/utils/protocol.ts`, and `proxy.ts`; write-audit correlation fields persisted from async trace context in `src/services/write-audit.ts`.
- Evidence: local OTLP wiring templates + runbook (`.env.otlp.tempo.example`, `.env.otlp.jaeger.example`, `docs/ops/otel-local.md`) and explicit DB span attributes at finalized GraphDB/Solr call sites (`src/utils/otel-db-spans.ts`, `src/services/ai-query.ts`, `src/services/solr-client.ts`).
2) Data lifecycle + upstream sync
- [x] ✅ Provider tombstone handling in C2 pipeline (upstream HTTP `404`/`410` signals mark a local tombstone, deindex the record in Solr, and emit a deletion activity).
- C2 pipeline lifecycle handling landed in `pipeline/metamuseum_pipeline/assets.py`:
- upstream tombstone detection from `_source.upstreamStatus` / `_source.httpStatus` / `_source.statusCode`
- local tombstone registry persisted in `pipeline/state/materialized-records.json` (`tombstones` block)
- Solr deindex call on tombstone transition (`/solr/<core>/update` delete-by-id)
- deletion activity emission to `pipeline/state/deletion-activities.json` with `Delete` + `Tombstone` object semantics
- Test coverage:
- `pipeline/tests/test_c2_pipeline.py` (`test_tombstone_404_marks_local_tombstone_and_emits_delete_activity`)
- Live drill (2026-05-31):
- projected record `https://example.org/object/outbox-deindex-final-1780254290729` into Solr via outbox projector,
- ran C2 tombstone materialization with `_source.upstreamStatus=410`,
- Solr exact-id count transitioned `before=1` -> `after=0`,
- summary included `deindexed=1` + `deletion_activities_emitted=1`.
- [x] ✅ ActivityStreams deletion semantics for tombstoned records (`Delete`/`Tombstone` event policy documented and implemented).
- Feed policy + implementation landed in `app/api/activity/route.ts`:
- `/api/activity` now merges C2 tombstone lifecycle activities from `pipeline/state/deletion-activities.json`.
- Tombstoned records are emitted as ActivityStreams `Delete` events with `object.type = "Tombstone"` and `object.formerType = "HumanMadeObject"`.
- Response includes explicit `policy.deletionSemantics` metadata for aggregator consumers.
- Coverage:
- `tests/api/activity.test.ts` (`includes Delete/Tombstone activities from tombstone lifecycle state`)
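The deletion semantics above can be illustrated with a small builder. `Delete`, `Tombstone`, `formerType`, `deleted`, and `published` are ActivityStreams 2.0 vocabulary terms; the `toDeleteActivity` helper and the activity `id` scheme are assumptions for this sketch, not the shape emitted by `app/api/activity/route.ts`.

```typescript
// Hypothetical typing of a single feed entry.
type DeleteActivity = {
  type: "Delete";
  id: string;
  published: string;
  object: {
    id: string;
    type: "Tombstone";
    formerType: "HumanMadeObject";
    deleted: string;
  };
};

// Builds a Delete activity for a record that was tombstoned upstream.
function toDeleteActivity(recordId: string, deletedAt: string): DeleteActivity {
  return {
    type: "Delete",
    id: `${recordId}#delete-${deletedAt}`, // assumed id scheme
    published: deletedAt,
    object: {
      id: recordId,
      type: "Tombstone",
      formerType: "HumanMadeObject",
      deleted: deletedAt,
    },
  };
}

const deleteActivity = toDeleteActivity("https://example.org/object/1", "2026-05-31T12:00:00Z");
console.log(JSON.stringify(deleteActivity, null, 2));
```

An aggregator reading the feed can use `object.formerType` to learn what kind of entity was removed without dereferencing the deleted record.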
- [x] ✅ Meta Wiki Art source-of-truth contract finalized (publication-target-only vs community-editable model).
- Default mode: `publication-target-only` (safe default).
- Optional mode: `community-editable` (explicit opt-in).
- Executable policy gate landed:
- `src/services/wiki-source-of-truth.ts`
- `app/api/wiki-drafts/[id]/publish/route.ts`
- Publish preflight now enforces source anchoring:
- draft `sourceRecordId` must resolve to an internal record before publish proceeds.
- Coverage:
- `tests/services/wiki-source-of-truth.test.ts`
- `tests/api/wiki-drafts/flow.test.ts` (`blocks publish when source record is missing under publication-target-only contract`)
- [x] ✅ If community-editable: reverse-ETL conflict resolution pipeline from Wikibase back to Postgres with deterministic merge policy.
- Reverse-ETL apply endpoint landed:
- `POST /api/wiki-sync/reverse-etl` (`app/api/wiki-sync/reverse-etl/route.ts`)
- Deterministic merge + idempotency engine landed:
- `src/services/wiki-reverse-etl.ts`
- precedence policy: newer `modifiedAt` wins; equal timestamp tie-break by lexicographic `changeId`; replayed `changeId` is skipped.
- Back-sync state persistence landed:
- `storage/wiki-reverse-etl-state.json` (`wiki_reverse_etl_state` managed in Postgres modes)
- Coverage:
- `tests/services/wiki-reverse-etl.test.ts`
- `tests/api/wiki-reverse-etl.test.ts`
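The precedence policy for reverse-ETL (newer `modifiedAt` wins, equal timestamps break ties by lexicographic `changeId`, replayed `changeId` is skipped) can be sketched as below. The types and function names are hypothetical, and the log does not say which side of the lexicographic comparison wins; this sketch assumes the greater `changeId` does.

```typescript
// Hypothetical shapes; not the contracts in src/services/wiki-reverse-etl.ts.
type WikiChange = { changeId: string; field: string; value: string; modifiedAt: string };
type FieldState = { value: string; modifiedAt: string; changeId: string };
type MergeState = { fields: Record<string, FieldState>; appliedChangeIds: string[] };
type MergeOutcome = "applied" | "superseded" | "skipped_replay";

// Newer timestamp wins; on a tie the greater changeId wins (assumed direction).
function incomingWins(incoming: WikiChange, current: FieldState | undefined): boolean {
  if (current === undefined) return true;
  const incomingMs = Date.parse(incoming.modifiedAt);
  const currentMs = Date.parse(current.modifiedAt);
  if (incomingMs !== currentMs) return incomingMs > currentMs;
  return incoming.changeId > current.changeId;
}

// Pure merge step: returns a new state and reports what happened.
function applyChange(state: MergeState, change: WikiChange): { state: MergeState; outcome: MergeOutcome } {
  if (state.appliedChangeIds.includes(change.changeId)) {
    return { state, outcome: "skipped_replay" };
  }
  const appliedChangeIds = [...state.appliedChangeIds, change.changeId];
  if (!incomingWins(change, state.fields[change.field])) {
    return { state: { fields: state.fields, appliedChangeIds }, outcome: "superseded" };
  }
  const next: FieldState = { value: change.value, modifiedAt: change.modifiedAt, changeId: change.changeId };
  return { state: { fields: { ...state.fields, [change.field]: next }, appliedChangeIds }, outcome: "applied" };
}

const emptyState: MergeState = { fields: {}, appliedChangeIds: [] };
const older: WikiChange = { changeId: "c-1", field: "title", value: "Old title", modifiedAt: "2026-05-30T00:00:00Z" };
const newer: WikiChange = { changeId: "c-2", field: "title", value: "New title", modifiedAt: "2026-05-31T00:00:00Z" };

// Applying the changes in either order converges on the same value.
const forward = applyChange(applyChange(emptyState, older).state, newer).state;
const reversed = applyChange(applyChange(emptyState, newer).state, older).state;
console.log(forward.fields.title.value, reversed.fields.title.value);
```

The merge is deterministic because the outcome depends only on the `(modifiedAt, changeId)` pair of each change, not on arrival order.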
3) AI/LLM reliability (EvalOps)
- [x] ✅ Golden evaluation dataset for complex museum questions (minimum 100 prompts with expected grounding/citation behavior).
- Dataset landed:
- `evals/golden-museum-questions.v1.json` (`120` prompts, rubric + per-prompt grounding/citation/refusal expectations)
- EvalOps documentation landed:
- `docs/evals/golden-museum-questions.md`
- Executable conformance gate landed:
- `tests/quality/ai-eval-golden-dataset.test.ts`
- [x] ✅ CI eval gate for AI-layer PRs (faithfulness, relevance, citation accuracy, citation freshness) using an evaluation harness (Ragas/DeepEval-equivalent workflow).
- Eval harness landed:
- `src/services/ai-eval-harness.ts`
- `scripts/ai-eval-gate.ts`
- Commands landed:
- `pnpm ai:eval:report`
- `pnpm ai:eval:gate`
- CI workflow landed (AI-layer path-gated):
- `.github/workflows/ai-eval-gate.yml`
- Coverage:
- `tests/services/ai-eval-harness.test.ts`
- [x] ✅ Regression thresholds for model/prompt/version changes with fail-fast policy on citation and citation-freshness drift.
- Versioned regression policy landed:
- `config/ai-eval-regression-policy.json`
- baseline identity keys: `datasetId + datasetVersion + modelVersion + promptVersion`
- Drift evaluator landed:
- `src/services/ai-eval-regression.ts`
- citation drift is configured as fail-fast (`failFastOnCitationDrift`)
- citation freshness drift is configured as fail-fast (`failFastOnCitationFreshnessDrift`)
- Gate runner wiring landed:
- `scripts/ai-eval-gate.ts` (`--check`, `--record-baseline`)
- `pnpm ai:eval:gate`
- `pnpm ai:eval:baseline:record`
- CI enforcement landed:
- `.github/workflows/ai-eval-gate.yml`
- Coverage:
- `tests/services/ai-eval-regression.test.ts`
- [x] ✅ Structured evaluation artifact retention for trend analysis (per-run metrics + prompt/model version metadata).
- Eval artifact retention service landed:
- `src/services/ai-eval-artifacts.ts`
- Gate runner now writes:
- `artifacts/evals/ai-eval-gate-latest.json`
- `artifacts/evals/runs/ai-eval-gate-<timestamp>.json`
- `artifacts/evals/trend-index.json`
- `artifacts/evals/summary.md` with CI badges, artifact links, and freshness-aging alerts
- CI visibility:
- `.github/workflows/ai-eval-gate.yml` appends `artifacts/evals/summary.md` to `$GITHUB_STEP_SUMMARY`
- `.github/workflows/ai-eval-gate.yml` uploads `artifacts/evals/` for post-run inspection
- Retention control:
- `METAMUSEUM_EVAL_RETENTION_MAX_RUNS` (default `200`)
- Coverage:
- `tests/services/ai-eval-artifacts.test.ts`
4) Frontend UX + research quality
- [x] ✅ Next.js i18n routing + locale negotiation from `Accept-Language` with graceful fallback.
- Locale-aware proxy routing landed:
- `proxy.ts`
- i18n routing policy + negotiation utilities landed:
- `src/utils/i18n-routing.ts`
- `src/utils/locale-preferences.ts`
- Behavior:
- Non-localized `GET/HEAD` page requests redirect to `/{locale}/...` using negotiated locale.
- Locale-prefixed requests rewrite to canonical internal routes while preserving locale context via request header/cookie.
- Unsupported locale negotiation gracefully falls back to default locale (`en`).
- Write-route role checks remain enforced against locale-normalized paths.
- Coverage:
- `tests/utils/i18n-routing.test.ts`
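A minimal sketch of the locale negotiation and redirect behavior listed above. Only the default locale `en` and the `/{locale}/...` redirect shape come from this log; the supported-locale list and all function names are hypothetical and are not the exports of `src/utils/i18n-routing.ts`.

```typescript
// Hypothetical supported set; only the default `en` is confirmed by the log.
const SUPPORTED_LOCALES = ["en", "fr", "de"];
const DEFAULT_LOCALE = "en";

// Parses an Accept-Language header into tags ordered by descending q-value.
function parseAcceptLanguage(header: string): string[] {
  return header
    .split(",")
    .map((part) => {
      const [tag, ...params] = part.trim().split(";");
      const qParam = params.map((p) => p.trim()).find((p) => p.startsWith("q="));
      const q = qParam ? Number(qParam.slice(2)) : 1;
      return { tag: tag.trim().toLowerCase(), q: Number.isNaN(q) ? 0 : q };
    })
    .filter((entry) => entry.tag.length > 0 && entry.tag !== "*" && entry.q > 0)
    .sort((a, b) => b.q - a.q) // stable sort keeps header order for equal q
    .map((entry) => entry.tag);
}

// Picks the first supported tag (exact, then base language), else the default.
function negotiateLocale(header: string | null, supported: string[] = SUPPORTED_LOCALES): string {
  for (const tag of parseAcceptLanguage(header ?? "")) {
    if (supported.includes(tag)) return tag;
    const base = tag.split("-")[0];
    if (supported.includes(base)) return base;
  }
  return DEFAULT_LOCALE; // graceful fallback for unsupported or missing headers
}

// Builds the redirect target for a non-localized page request.
function localizedPath(pathname: string, locale: string): string {
  return `/${locale}${pathname === "/" ? "" : pathname}`;
}

console.log(localizedPath("/records", negotiateLocale("fr-CA,fr;q=0.9,en;q=0.8")));
```

A request with no usable `Accept-Language` header falls through to `en`, matching the fallback behavior described above.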
- [x] ✅ Linked Art language-tag selection policy in UI rendering (prefer user locale, then fallback chain, preserving source labels).
- Locale and language-tag policy utilities landed:
- `src/utils/locale-preferences.ts`
- `src/utils/linked-art-language.ts`
- UI renderers now pass request locale preferences from `Accept-Language`:
- `app/(workspace)/artwork/[id]/page.tsx`
- `app/(workspace)/entity/[id]/page.tsx`
- `app/(workspace)/records/page.tsx`
- `app/(workspace)/entities/page.tsx`
- `app/(workspace)/iiif/page.tsx`
- Linked Art projection layers now apply locale-aware label selection while preserving source-label fallbacks:
- `src/utils/artwork-builder.ts`
- `src/utils/entities.ts`
- Coverage:
- `tests/utils/linked-art-language.test.ts`
- `tests/utils/artwork-builder.test.ts`
- `tests/utils/entities.test.ts`
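The label-selection policy (prefer user locale, then a fallback chain, while preserving source labels) might look like the sketch below. The `Label` shape is a deliberate simplification: it carries a plain language tag rather than Linked Art's `language` objects, and `selectLabel` is a hypothetical name, not the API of `src/utils/linked-art-language.ts`.

```typescript
// Simplified label: `lang` is a BCP 47-style tag, absent when the source gave none.
type Label = { content: string; lang?: string };

// Tries each wanted tag in order (exact match, then base language),
// and finally returns the first source label unchanged.
function selectLabel(labels: Label[], preferred: string[], fallbackChain: string[] = ["en"]): Label | undefined {
  for (const want of [...preferred, ...fallbackChain]) {
    const wanted = want.toLowerCase();
    const exact = labels.find((label) => label.lang?.toLowerCase() === wanted);
    if (exact) return exact;
    const base = wanted.split("-")[0];
    const partial = labels.find((label) => label.lang?.toLowerCase().split("-")[0] === base);
    if (partial) return partial;
  }
  return labels[0]; // no language match: keep the source label as-is
}

const labels: Label[] = [
  { content: "Sternennacht", lang: "de" },
  { content: "The Starry Night", lang: "en" },
  { content: "De sterrennacht", lang: "nl" },
];

console.log(selectLabel(labels, ["fr", "de-AT"])?.content);
console.log(selectLabel(labels, ["ja"])?.content);
```

The function only chooses among existing labels and never rewrites them, which is one way to keep source labels intact.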
- [x] ✅ Researcher feedback/annotation loop using W3C Web Annotation model (claim-targeted annotations without mutating canonical `_source.raw`).
- W3C annotation contracts + validation landed:
- `src/contracts/zod/web-annotation.ts`
- `src/contracts/web-annotation.ts`
- `src/contracts/zod/requests.ts`
- Annotation persistence is isolated from canonical records and stored as a separate managed document (`annotations.json`):
- `src/services/annotations.ts`
- `src/utils/storage.ts`
- API endpoints landed for create/list/get with public CORS and audit logging:
- `app/api/annotations/route.ts`
- `app/api/annotations/[id]/route.ts`
- Research UI loop landed on artwork detail pages:
- `src/components/research-annotation-loop.tsx`
- `app/(workspace)/artwork/[id]/page.tsx`
- Coverage:
- `tests/api/annotations.test.ts`
- `tests/auth/roles.test.ts`
- [x] ✅ Curator triage queue for annotation-driven correction proposals with provenance-safe review flow.
- Curator triage queue API landed with state-aware queue metrics and claim-target proposal payloads:
- `app/api/annotations/triage/route.ts`
- Provenance-safe review action API landed:
- `app/api/annotations/[id]/review/route.ts`
- `src/services/annotations.ts` (`review()` workflow transitions)
- Role policy enforces editor/admin review access while preserving researcher submit access:
- `src/auth/roles.ts`
- Curator workspace queue UI landed:
- `app/curator/annotations/page.tsx`
- `src/components/annotation-triage-workbench.tsx`
- `app/layout.tsx` (workspace navigation link)
- Coverage:
- `tests/api/annotations.test.ts`
- `tests/auth/roles.test.ts`
5) Security + privacy posture
- [x] ✅ PII/sensitivity scan stage in C2 ETL before public indexing/syndication.
- `src/utils/sensitivity.ts` scans materialized records for PII, cultural-sensitivity, and restricted-publication signals while excluding raw provider payload blobs.
- `src/utils/record-materializer.ts` attaches `_sensitivity` review state during import/persist materialization.
- `src/services/outbox-projector.ts` skips Solr + GraphDB public projections for records held by sensitivity review.
- Coverage:
- `tests/utils/sensitivity.test.ts`
- `tests/utils/record-materializer.test.ts`
- `tests/utils/search-index.test.ts`
- `tests/services/outbox-projector.test.ts`
- [x] ✅ Human-review hold policy for flagged records (restricted publication until disposition).
- Flagged records carry `_sensitivity.status = "review_required"` and `_sensitivity.holdPublication = true`.
- Held records are excluded from Solr/GraphDB syndication and flattened with `publication_status = "held_for_review"` plus no public `text_all`.
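A minimal sketch of the hold policy, assuming simplified record and search-document shapes. The `_sensitivity.status`, `_sensitivity.holdPublication`, `publication_status = "held_for_review"`, and `text_all` names come from this log; the `"clear"` and `"public"` values and the helper names are hypothetical.

```typescript
// Hypothetical shapes; real contracts live in the materializer and search index.
type Sensitivity = { status: "clear" | "review_required"; holdPublication: boolean };
type MaterializedRecord = { id: string; title: string; description?: string; _sensitivity?: Sensitivity };
type SearchDoc = { id: string; publication_status: "public" | "held_for_review"; text_all?: string };

// A record is held when the scan flagged it and the hold is still in force.
function isHeldForReview(record: MaterializedRecord): boolean {
  const sensitivity = record._sensitivity;
  return sensitivity !== undefined && sensitivity.status === "review_required" && sensitivity.holdPublication;
}

// Held records are flattened without any public full-text field.
function flattenForSearch(record: MaterializedRecord): SearchDoc {
  if (isHeldForReview(record)) {
    return { id: record.id, publication_status: "held_for_review" };
  }
  const textAll = [record.title, record.description].filter(Boolean).join(" ");
  return { id: record.id, publication_status: "public", text_all: textAll };
}

// Public projections (Solr/GraphDB in the log) skip held records entirely.
function shouldProjectPublicly(record: MaterializedRecord): boolean {
  return !isHeldForReview(record);
}

const heldRecord: MaterializedRecord = {
  id: "https://example.org/object/2",
  title: "Restricted item",
  _sensitivity: { status: "review_required", holdPublication: true },
};
console.log(flattenForSearch(heldRecord), shouldProjectPublicly(heldRecord));
```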
- [x] ✅ Culturally sensitive knowledge handling rules integrated with rights/reuse UI and syndication controls.
- Cultural-sensitivity scan rules and syndication holds now flow into explorer DTOs as `held_for_review` records with explicit rights/reuse review labels.
- `RightsBadge` renders sensitivity labels as review-required publication holds, keeping cultural-sensitivity warnings visible anywhere rights/reuse chips appear.
- Coverage:
- `tests/components/linked-atomics.test.ts`
- `tests/utils/artwork-builder.test.ts`
- [x] ✅ Security telemetry for sensitivity decisions (who approved, why, and when).
- Scanner telemetry records version, scan time, signal count, highest severity, and hold policy.
- Human disposition telemetry now requires reviewer identity, rationale, and timestamp before approval can clear a publication hold.
- Materialization preserves existing disposition telemetry during record rewrites, and reviewed records only return to public indexing after an approved disposition.
- Coverage:
- `tests/utils/sensitivity.test.ts`
- `tests/utils/record-materializer.test.ts`
- `tests/utils/search-index.test.ts`
6) Content credibility engine (trust/originality/distribution/consistency)
- [x] ✅ Baseline credibility-engine policy document landed:
- `docs/content-credibility-engine.md`
- [x] ✅ Trust-layer storage templates landed:
- `provenance/ledger.json`
- `provenance/source-map.yaml`
- [x] ✅ Originality-layer storage template landed:
- `semantic-core/originality-index.json`
- baseline novelty threshold documented (`cosine_distance > 0.18`)
- [x] ✅ Distribution/consistency scaffolding landed:
- `distribution/schedule.yaml`
- runtime queue path reserved at `distribution/queue.db` (gitignored)
- `generation/style-profile.md`
- [x] ✅ Monitoring scaffold landed:
- `monitoring/metrics.json`
- [x] ✅ Enforce citation-coverage gates in code for generated/publishable artifacts.
- `POST /api/content/generate` now returns `422` when computed citation coverage falls below threshold (`METAMUSEUM_CITATION_COVERAGE_THRESHOLD`, default `0.95`), including coverage diagnostics in response.
- `POST /api/wiki-drafts/[id]/publish` now runs explicit publish preflight and returns `422` with preflight diagnostics when citation coverage or other publishability checks fail.
- Evidence: `src/utils/citation-coverage.ts`, `app/api/content/generate/route.ts`, `src/services/wiki-publish.ts`, `app/api/wiki-drafts/[id]/publish/route.ts`, `tests/api/content-generate.test.ts`, `tests/quality/cite-or-refuse-conformance.test.ts`, `tests/services/wiki-publish.test.ts`.
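To illustrate the citation-coverage gate, the sketch below computes coverage and maps it to a status code. The `0.95` default and the `422` response come from this log; the definition of coverage as the share of sentences carrying at least one citation, the treatment of empty input, and all names are assumptions rather than the logic in `src/utils/citation-coverage.ts`.

```typescript
// Hypothetical unit of generated content.
type Sentence = { text: string; citations: string[] };
type GateResult = { status: 200 | 422; coverage: number; threshold: number };

// Assumed definition: fraction of sentences with at least one citation.
function citationCoverage(sentences: Sentence[]): number {
  if (sentences.length === 0) return 0; // assumption: nothing to cite means no coverage
  const cited = sentences.filter((sentence) => sentence.citations.length > 0).length;
  return cited / sentences.length;
}

// Mirrors the documented behavior: below threshold yields 422 plus diagnostics.
function citationGate(sentences: Sentence[], threshold = 0.95): GateResult {
  const coverage = citationCoverage(sentences);
  return { status: coverage >= threshold ? 200 : 422, coverage, threshold };
}

// Helper for the example: `total` sentences, of which the first `cited` have a citation.
function makeSentences(total: number, cited: number): Sentence[] {
  return Array.from({ length: total }, (_, i) => ({
    text: `Sentence ${i + 1}`,
    citations: i < cited ? ["https://example.org/source"] : [],
  }));
}

console.log(citationGate(makeSentences(20, 19)));
console.log(citationGate(makeSentences(20, 18)));
```

With 20 sentences, 19 cited sentences meet the default threshold and 18 do not.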
- [x] ✅ Enforce originality-score gates in code for generated/publishable artifacts.
- Originality scoring utility landed with policy-driven threshold + minimum unique-insight checks:
- `src/utils/originality-score.ts`
- `POST /api/content/generate` now returns `422` when originality score gates fail, including originality diagnostics in output payload.
- `src/services/agents.ts`
- `app/api/content/generate/route.ts`
- Wiki publish preflight now enforces originality gates before publishable status:
- `src/services/wiki-publish.ts`
- `app/api/wiki-drafts/[id]/publish/route.ts`
- Coverage:
- `tests/utils/originality-score.test.ts`
- `tests/api/content-generate.test.ts`
- `tests/quality/cite-or-refuse-conformance.test.ts`
- `tests/services/wiki-publish.test.ts`
- `tests/api/wiki-drafts/flow.test.ts`
- [x] ✅ Add weekly credibility audit automation (drift + relevance + broken-link checks).
- Weekly audit orchestration script landed:
- `scripts/credibility-audit.ts`
- `src/services/credibility-audit.ts`
- Package commands:
- `pnpm credibility:audit`
- `pnpm credibility:audit:check`
- Weekly GitHub Action landed:
- `.github/workflows/credibility-audit.yml`
- uploads `artifacts/credibility-audit/latest.json`
- Coverage:
- `tests/services/credibility-audit.test.ts`
- [x] ✅ Add queue worker implementation for multi-channel publish orchestration (web/linkedin/medium/email/api).
- Publish queue worker service landed with:
- schedule parsing from `distribution/schedule.yaml`
- durable queue state in `distribution/queue.db`
- per-channel delivery states, retries/backoff, dead-letter handling
- per-day channel cap deferral logic from schedule policy
- channel adapters for `web`, `linkedin`, `medium`, `email`, `api`
- Files:
- `src/services/publish-queue-worker.ts`
- `scripts/publish-queue-worker.ts`
- `distribution/README.md`
- Commands:
- `pnpm publish:queue:worker`
- `pnpm publish:queue:worker:once`
- `pnpm publish:queue:worker:drain`
- Coverage:
- `tests/services/publish-queue-worker.test.ts`
- [x] ✅ Add OpenTelemetry metric/span conventions for trust/originality/distribution events.
- Shared conventions module landed with stable span/metric names plus common layer/event/kind/outcome attributes:
- `src/utils/otel-credibility.ts`
- Trust/originality gates now emit standardized spans/metrics:
- `src/utils/citation-coverage.ts`
- `src/utils/originality-score.ts`
- Wiki publish preflight/execute and distribution queue orchestration emit the same conventions:
- `src/services/wiki-publish.ts`
- `src/services/publish-queue-worker.ts`
- Coverage:
- `tests/utils/otel-credibility.test.ts`
- [x] ✅ Add eval thresholds for engagement velocity and trust/originality regression alerts.
- Threshold evaluator landed for engagement velocity minimum plus trust/originality minimum and baseline-drop budgets:
- `src/services/credibility-eval-thresholds.ts`
- `config/credibility-eval-thresholds.json`
- Weekly credibility audit now evaluates and reports these alerts:
- `src/services/credibility-audit.ts`
- `scripts/credibility-audit.ts`
- Coverage:
- `tests/services/credibility-eval-thresholds.test.ts`
- `tests/services/credibility-audit.test.ts`
Highest ROI priority
- [x] ✅ Implement OpenTelemetry before broader C4/C5 expansion to prevent distributed-debugging bottlenecks.
Meta Wiki Art bridge implementation notes:
- [x] ✅ See [meta-wiki-art-bridge.md](meta-wiki-art-bridge.md) for sequencing constraints, boundaries, and the staged publish flow.
- Sequencing constraints are documented under `## Sequencing constraints`.
- Bridge boundaries are documented under `## Boundaries (Out Of Scope For Era A/B)`.
- Staged publish flow is documented under `## Planned C5 flow`.
Era C exit gate:
- [x] ✅ Automated evidence pack landed for all four checks (artifact schema + nightly job + dated run history).
- Schema: `docs/schemas/era-c-exit-gate-evidence.schema.json`
- Policy: `config/era-c-exit-gate-policy.json`
- Script + artifacts: `scripts/era-c-exit-gate.ts`, `artifacts/exit-gate/`
- Trend index now carries compact failed-check reasons so agents can prioritize the next blocker without opening every historical artifact.
- Telemetry snapshot automation: `scripts/monitoring-telemetry-sync.ts` via `pnpm monitoring:telemetry:sync` (wired into `pnpm era-c:exit-gate:evidence` / `pnpm era-c:exit-gate:check`)
- Nightly workflow: `.github/workflows/era-c-exit-gate-evidence.yml`
- Nightly workflow now supports deployed-target evidence via `METAMUSEUM_EVIDENCE_BASE_URL`, `METAMUSEUM_EVIDENCE_IIIF_TILE_URL`, optional SPARQL/query vars, and matrix-first `METAMUSEUM_ACTIVITY_CONSUMER_IDS` (`METAMUSEUM_ACTIVITY_CONSUMER_ID` fallback); when target vars are missing it keeps the local `pnpm k6:slo:ci` fallback so automation still produces artifacts.
- [x] ✅ Deployment-foundation preflight landed for controlled beta / production launch review.
- Commands: `pnpm launch:preflight`, `pnpm launch:preflight:production`
- Script + service: `scripts/deployment-preflight.ts`, `src/services/deployment-preflight.ts`
- Runbook: `docs/ops/deployment-preflight.md`
- Scope: verifies env/secrets, Postgres mode, uptime source, SLO target URL, fresh DR restore rehearsal, and staging-vs-production smoke-token posture before collecting exit-gate evidence.
- [x] ✅ Launch review packet landed for controlled beta / production launch decision evidence.
- Commands: `pnpm launch:review`, `pnpm launch:review:check`, `pnpm launch:review:production`
- Script + service: `scripts/launch-review.ts`, `src/services/launch-review.ts`
- Runbook: `docs/ops/launch-review.md`
- Scope: aggregates latest preflight, Era C exit-gate, security audit baseline, DR drill, public-trust smoke, a11y evidence, and explore smoke evidence; production fails on missing/stale/red evidence while staging can warn for beta-only evidence collection.
- Evidence producers: `pnpm a11y:check` writes `artifacts/launch/a11y-latest.json`, and `pnpm smoke:explore:matrix` writes `artifacts/launch/explore-smoke-latest.json`.
- [ ] ⚠️ The latest exit-gate run failed (`2026-06-10T11:36:30.690Z`), but the artifact is now agent-actionable:
- SLO failures distinguish incomplete k6 summaries from actual p95 threshold breaches via `missingMetricsInWindow` and per-sample `metricDetails`.
- Uptime failures include evidence `source` and `notes`.
- Activity adoption credits only declared external consumers with `class: "declared"` and `declaredId`.
- KPI failures include source metadata, snapshot notes, and per-failed-metric source/reason details.
- [ ] All SLOs in SOTA §20.4 met at p95 over a 30-day window.
- Evidence contract now requires all five p95 SLO metrics: cached Record, cold Record, keyword+facet search, whitelisted SPARQL, and IIIF tile serving.
- SLO evidence artifacts now include missing-metric summaries and per-sample metric details so incomplete k6 runs are actionable separately from threshold breaches.
- Nightly workflow hardening can now collect complete deployed-target samples once GitHub vars provide the app base URL, IIIF tile URL, and whitelisted SPARQL inputs.
- Current blocker: retained k6 history has only `3/30` samples and those samples are legacy three-metric summaries missing whitelisted SPARQL and IIIF tile p95 values.
- [ ] 99.9% uptime on public read.
- Uptime gate now rejects stale or undated availability snapshots; `uptime.maxSnapshotAgeHours` defaults to `48` so the 30-day proof must be continuously refreshed.
- Uptime evidence artifacts now surface source (`prometheus`, `probe`, `unavailable`) plus notes, making missing public-read proof actionable without opening telemetry snapshots.
- Current blocker: uptime source is `probe`, availability is `1`, and `sampleCount30d` is `3`; continue scheduled probes until the 30-sample window is met.
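The uptime checks described above can be sketched as a single evaluation step. The `source` values, `sampleCount30d`, the 48-hour `maxSnapshotAgeHours` default, the 99.9% target, and the 30-sample window come from this log; the `capturedAt` field, the policy field names other than `maxSnapshotAgeHours`, and the `evaluateUptime` helper are hypothetical.

```typescript
// Hypothetical snapshot and policy shapes for illustration.
type UptimeSnapshot = {
  source: "prometheus" | "probe" | "unavailable";
  availability: number; // fraction in [0, 1]
  sampleCount30d: number;
  capturedAt?: string; // assumed timestamp field
};
type UptimePolicy = { minAvailability: number; minSamples30d: number; maxSnapshotAgeHours: number };

const defaultUptimePolicy: UptimePolicy = { minAvailability: 0.999, minSamples30d: 30, maxSnapshotAgeHours: 48 };

// Collects every failing reason so a red gate is actionable on its own.
function evaluateUptime(
  snapshot: UptimeSnapshot,
  now: Date,
  policy: UptimePolicy = defaultUptimePolicy,
): { passed: boolean; reasons: string[] } {
  const reasons: string[] = [];
  const capturedMs = snapshot.capturedAt ? Date.parse(snapshot.capturedAt) : Number.NaN;
  if (Number.isNaN(capturedMs)) {
    reasons.push("undated snapshot");
  } else if (now.getTime() - capturedMs > policy.maxSnapshotAgeHours * 3_600_000) {
    reasons.push("stale snapshot");
  }
  if (snapshot.source === "unavailable") reasons.push("no uptime source");
  if (snapshot.availability < policy.minAvailability) reasons.push("availability below target");
  if (snapshot.sampleCount30d < policy.minSamples30d) {
    reasons.push(`only ${snapshot.sampleCount30d}/${policy.minSamples30d} samples`);
  }
  return { passed: reasons.length === 0, reasons };
}

// Mirrors the current blocker: probe source, availability 1, three samples.
const uptimeVerdict = evaluateUptime(
  { source: "probe", availability: 1, sampleCount30d: 3, capturedAt: "2026-06-10T10:00:00Z" },
  new Date("2026-06-10T11:36:30.690Z"),
);
console.log(uptimeVerdict);
```

In this sketch the snapshot is fresh and fully available, so the only failing reason is the sample count.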
- [ ] ≥ 3 external Linked Art systems consume the `/api/activity` feed.
- Exit-gate adoption evidence now credits only declared external consumers via `x-linked-art-consumer-id`; derived fingerprints remain diagnostic and cannot satisfy the gate.
- Evidence ingestion preserves `class` + `declaredId` from `storage/activity-consumers.json`, so declared external consumers can satisfy the gate when real adoption arrives.
- Activity adoption proof tooling now rejects placeholder/local IDs by default, probes `/api/activity` + `/api/activity/readiness`, and writes dated single-consumer + matrix artifacts under `artifacts/activity-adoption/`.
- Current blocker in the latest published artifact: declared external consumers are `0/3`; run `pnpm activity:adoption:matrix` with three partner-owned consumer IDs against the deployed target after those consumers are onboarded.
- [ ] KPIs in SOTA §26 hit.
- KPI gate now rejects stale or undated KPI snapshots and requires real AI query cost telemetry when `kpis26.requireAiQueryCostTelemetry` is enabled; fallback default cost values remain diagnostic only.
- KPI evidence artifacts now include source metadata, snapshot notes, and per-failed-metric source/reason details so SOTA §26 blockers are actionable without opening telemetry inputs.
- KPI telemetry sync now accepts `monitoring/kpi-evidence.json` (or `METAMUSEUM_KPI_EVIDENCE_PATH`) for aggregate production record-enrichment and reconciliation-review counts; invalid sections are ignored instead of creating false-green metrics.
- AI query telemetry now logs `costUsd`, `costCurrency`, `costSource`, and usage counts per query; the deterministic local planner records `costUsd: 0` with `costSource: "deterministic-local-planner"` instead of relying on fallback KPI defaults.
- Nightly workflow now seeds one deployed `/api/ai/query` request before telemetry sync when `METAMUSEUM_EVIDENCE_BASE_URL` is configured.
- Current blockers in the latest published artifact: `dataQualityEnrichedShare`, `reconciliationAutoApproveRate`, and `reconciliationPrecisionReviewed`; AI query cost telemetry is now sourced and within policy, so the remaining KPI work is production enrichment/reconciliation evidence.
---