Meta Museum Roadmap
This is the current, authoritative roadmap. It supersedes [development-roadmap.md](development-roadmap.md) (kept as legacy reference for the pre-Next.js prototype).
The north star is [linked-art/LinkedArtSOTAWebApp.md](linked-art/LinkedArtSOTAWebApp.md). Anywhere this document is silent, the SOTA spec wins for architecture; this document wins for sequencing: what gets built when, and what is deferred.
For provider-expansion and validation slices, [linked-art/LinkedArtModel1.0-Reference.md](linked-art/LinkedArtModel1.0-Reference.md) is a required standards input for implementation and AIDD + TDD.
That reference is now the round-based standards ledger for model/API/schema/search/protocol conformance, and roadmap execution assumes tests are mapped to its fixture anchors.
---
Status (as of June 24, 2026)
- [x] ✅ Core stack and runtime: Next.js 16.2.6 + React 19.2.4 + TypeScript strict + custom CSS; latest pilot outreach reply/evidence guard passed focused evidence/outreach/activation/support operator/service/offer/page/storage/docs/exporter checks (`70` tests / `13` suites), final `pnpm test` (`1103` tests / `310` suites), `pnpm lint`, and `pnpm build`; `/pilot` now renders a typed zero-complete activation evidence ledger with the six canonical seven-day pilot milestones, while `src/services/pilot-outreach-events.ts`, `src/services/pilot-activation-events.ts`, `src/services/pilot-support-issues.ts`, and `src/services/pilot-evidence-packet.ts` record sequence-guarded exact-account outreach, ordered exact-tenant activation events, chronology/status/identity/severity/resolution-evidence-guarded support load, explicit blocked/ready pilot packet artifacts, blocker-first Markdown packet summaries, open-blocking-support readiness blockers, and overdue open support-response blockers, `pnpm pilot:outreach`, `pnpm pilot:activation`, `pnpm pilot:support`, and `pnpm pilot:evidence --markdown` let operators append, package, and share real dated evidence without hand-editing JSON, and the runbook forbids marking outreach, activation, support, or packet readiness complete from demo, fixture, smoke evidence, reply claims without prior sent evidence, replies dated before `sentAt`, sent follow-up dates that predate `sentAt`, tenant-mismatched activation evidence, later activation milestones without prior tenant evidence, support response deadlines before `openedAt`, support issue `requester`, `summary`, `openedAt`, or `severity` rewrites, resolved support issue `resolvedAt` or `resolutionSummary` rewrites, support resolutions before `openedAt`, or open support issues carrying resolution evidence. 
Tenant RBAC evidence still proves signed-in members cannot use sibling active-org cookies to read sibling record lists/details, write records into sibling/default storage, read/review/publish/write sibling wiki draft state, read/review/triage/write sibling annotation queues, read/write sibling AgentTask history, or steer scoped AI/editorial record-read outputs away from their authenticated org; proving signed-in non-members cannot use a cookie-selected org to read or mutate records, annotation, wiki draft, or scoped AI/editorial record-read outputs; and proving signed-in non-admin roles cannot use org administration or support-adjacent routes to list orgs, add memberships, create/revoke invites, or export org audit packets. Gated pilot API routes return stable `429`/`Retry-After` denials before parsing or work execution without changing monthly quota denial semantics. The membership-validated `metamuseum.activeOrgId` cookie drives shared preview, request-storage, and pilot route-gate scope without weakening test override or explicit `?tenantId=...` compatibility precedence. The workspace shell renders sanitized active-org status, org id, accessible-org count, storage scope, membership role, selector, stale-selection warning, and an `/orgs` switch/manage affordance without exposing invite token material. Workspace records, explore, entities, entity detail, artwork detail, graph, and patterns pages derive preview storage from authenticated org session scope when no compatibility `?tenantId=...` is supplied, while explicit tenant preview links still win for controlled pilot URLs. Manual pilot entitlements can bind to authenticated org ids while route usage gates prefer the selected active org for entitlement lookup and post-success counter writes before compatibility tenant headers. 
`/orgs` can render the operator console while admin-only org routes create/list orgs, memberships, sanitized invites, invite revocation, invite acceptance, form posts, org-scoped audit rows, and org audit export packets while Auth.js org scope continues to resolve from managed org memberships/invites before compatibility env mappings or pilot tenant headers. Wiki draft create/list/detail/review/publish access, live-publish sync-map artifacts, reverse-ETL state, and tenant preview pages remain scoped while unscoped public requests cannot see scoped tenant records or artifacts. Provider facade/direct provider/explorer/Linked Art import writes, public browse reads, records, jobs, AgentTask, annotation, audit, activity route storage, wiki draft storage/access routes, wiki sync-map routes, and tenant-scoped export/DR managed documents remain isolated. The temporary `metagenauto/` AG2 reference repo has been distilled into `docs/agents/ag2-extraction-notes.md` before removal from the app tree.
- [x] ✅ Product surface: `29` app pages and `125` API route handlers are live, including 13 provider integrations, ARK resolver, provider facade/readiness/capabilities, standards APIs, ActivityStreams, OpenAPI/docs, AI query/chat/evals, org admin/invite APIs, org audit export, `/api/orgs/active` active-org selection, the `/orgs` operator console for org provisioning, memberships, invites, and revocation, workspace-shell active-org status with selector, stale-selection warning, and switch/manage affordance, persisted AgentTask review history, wiki publish/sync, annotations, public trust, IIIF routes, a larger rotating midnight/navy rain-blue home hero with bright lemon-yellow Meta Museum highlights, top-padded full-color unfiltered contained artwork, a verified Met/CMA/AIC clean-deploy fallback when local image-backed imports are absent, a centered non-overlapping navy artwork info panel, a white hero label, three source-backed real-artwork pathway cards with provider/maker/date/rights/link metadata instead of AI placeholder images, a more readable artwork detail facts panel with metadata-forward desktop proportions and full-width long source fields, a full-width “Live source network” homepage band that frames the current project as a source-backed workbench with numbered source metrics, provenance, rights/attribution context, citation checks, and editor-gated review signals, and the `/pilot` Managed Linked Art Launch Pilot offer page with shared primary/footer navigation access, named outreach status ledger, zero-complete activation evidence ledger, and validated operator flows for outreach and activation evidence.
- [x] ✅ Era A + Era B: legacy lift, parity, provider hardening, authority caching, B6.1 reconciliation, B7 gateway readiness/facade, B8 protocol conformance, B9 modeling guardrails, B10 ARK behavior, and pre-Era-C operational sign-off are complete.
- [x] ✅ Era C implementation surface: C1-C5 core features are implemented, including multi-modal storage scaffolding, HAL/search/activity endpoints, C2 ETL/reconciliation/mapper, C3 IIIF + visualization surfaces, C4 AI query/chat/evals/mapping assist, specialized review agents, and C5 syndication/wiki/security/privacy hardening.
- [x] ✅ AI agent review layer: five single-word field-aligned agents are live behind human approval: Clio (research signals), Mercator (Linked Art mapping review), Janus (reconciliation review), Themis (rights/provenance review), and Calliope (citation-backed curatorial drafts). Localhost smoke completed all five, avatar assets are verified, each returned AgentTask persists to editor-gated review history at `/api/agents/tasks`, and Mercator/Janus now have a disabled-by-default AG2 bridge boundary plus local review-only FastAPI worker with trace propagation, contract validation, timeout/refusal fallback, local fallback, safe enablement docs, and live-worker eval artifacts.
- [x] ✅ SEO/content publishing policy: generated article, book, object-label, and collection-brief outputs now include source-derived SEO metadata (`seoTitle`, meta description, primary/secondary keywords, H1, subheadings, slug, rationale) on both accepted and refused content paths; WikiDrafts validate the same SEO envelope while preserving canonical source titles and citation/originality gates.
- [x] ✅ Security/privacy posture: PII/sensitivity scan, review holds, audited human disposition, rights/reuse UI warnings, and public projection controls are active before Solr/GraphDB syndication.
- [x] ✅ Governance + docs contract: markdown under `docs/` remains the canonical source of truth, surfaced by `/docs`, `/api/docs/manifest`, and `/api/docs/content`; Linked Art reference mapping, AIDD/TDD, and closeout evidence remain mandatory. The normal `pnpm session:closeout` path now refuses to append unless `README.md` and this roadmap were updated since the previous closeout.
- [x] ✅ Deployment (LIVE): the Next.js app is deployed to production on Vercel against Neon Postgres (`storageMode=postgres`), verified serving real records and public pages (2026-06-23). `vercel.json` pins `next build` so the close-out guard cannot break builds; the guard also self-skips when `VERCEL` is set. The `render.yaml` blueprint for the three FastAPI services + Redis cache is committed but not yet deployed; background workers (outbox/publish) remain deferred until projection/publishing is enabled. See [deployment.md](deployment.md).
- [x] ✅ Portfolio README + docs (2026-06-24): README trimmed from ~1,300 lines to a lean hero + highlights + run-it + honest "what's real vs. in progress" (detailed status stays here in the roadmap). New deep-dives added: [responsible-ai.md](responsible-ai.md) (key handling, denial-of-wallet auth-gate, citation/refusal gates, eval harness, cost control) and [linked-art/conformance-matrix.md](linked-art/conformance-matrix.md) (protocol MUSTs verified live + per-provider matrix + honest gaps).
- [x] ✅ Linked Art rights as `Right` entities, all providers (2026-06-24): started the [roadmap to 10/10](roadmap-to-10.md) with milestone B1. `src/utils/linked-art-rights.ts` synthesizes a conformant `subject_to` `Right` (classified by CC0 / rightsstatements.org URIs) for every object record that lacks one, wired into both `normalizeIncomingRecord` and the read-path `migrateToCurrentSchema`; Getty's existing `Right` entities are preserved. Closes the "rights as labels outside Getty" gap in the conformance matrix.
- [x] ✅ Reliable, badged CI — roadmap-to-10 A1 (2026-06-24): the session close-out guard no longer fails CI — `scripts/session-closeout.ts` skips the `--check` guard when `CI` is set (it stays enforced locally), so a stale local close-out log can't turn a green build red (the PR #16 failure mode). README header now carries CI / License / Linked Art / tests badges.
- [x] ✅ Supply-chain hygiene — roadmap-to-10 A2 (2026-06-24): resolved all 3 moderate `pnpm audit --prod` advisories via `pnpm.overrides` (postcss XSS → `>=8.5.10`; OpenTelemetry memory-DoS → core/resources/sdk-trace-base `^2.8.0`, which also fixes `@vercel/otel`'s mis-resolved 1.30.1 peers). Added `.github/dependabot.yml` (npm + github-actions + 4 pip services) and a `pnpm audit --prod --audit-level high` CI gate. Audit clean; tests 1,114; build green.
- [x] ✅ Per-provider conformance matrix generated — roadmap-to-10 B3 (2026-06-24): the 13×2 per-provider pass/fail fixtures (asserted in CI by `validation-architecture-depth.test.ts`) now drive a generated conformance matrix via `scripts/generate-conformance-matrix.ts` (`pnpm conformance:matrix`); `conformance-matrix-generated.test.ts` gates drift. The published `conformance-matrix.md` table is no longer hand-maintained. 13/13 providers pass both directions; suite 1,116.
- [x] ✅ SHACL conformance gate in CI — roadmap-to-10 B2 (2026-06-24): `services/validation-service/shacl_gate.py` + `.github/workflows/shacl-conformance.yml` validate every provider's pass fixture against the Linked Art SHACL shapes with pyshacl (JSON-LD → CIDOC-CRM RDF). Path-filtered job; `pnpm shacl:gate` locally. All 14 pass fixtures conform; a CRM-expansion regression now blocks the build.
- [x] ✅ Measured + gated test coverage — roadmap-to-10 A3 (2026-06-24): `pnpm test:coverage` runs the suite under c8 with a `--check-coverage` gate (lines 85 / funcs 85 / branches 70); CI's test step now enforces it. Current 89.4% lines / 92.1% funcs (core `src/services` 91.9%). README coverage badge added; also fixed CRLF-fragility in the B3 drift test so coverage runs clean cross-platform.
- [x] ✅ Published quality scores, roadmap-to-10 A4 (2026-06-24): [`docs/quality.md`](quality.md) publishes CI-measured numbers: Lighthouse a11y 100/100 on key pages, axe 0 severe WCAG 2A/2AA violations across 18 routes, and the k6 p95 performance budget met (cached 73.5 ms, cold 56.1 ms, facet 55.1 ms, 0% errors). README a11y badge added.
- [x] ✅ Faceted / relevance search — roadmap-to-10 B4 (2026-06-24): `src/services/search.ts` ranks `/api/search` results by hit quality (exact label > prefix > substring > name) and returns `type`/`provider` facet counts + `q`/`type`/`provider`/`limit`/`offset` params in the `ld+json` `OrderedCollectionPage`. Tested at the service + API level; conformance-matrix "basic, not faceted" gap closed (Solr 9 documented as the env-gated scale backend).
- [x] ✅ Fixed flaky annotations test / CI reliability (2026-06-24): annotation ids were `annotation-${Date.now()}`, so two creates in the same millisecond shared an id — letting annotations in different org scopes collide and intermittently breaking the cross-org isolation assertion in CI. Centralized id minting in `mintAnnotationId()` (timestamp + `randomUUID`); added a deterministic 1,000-mint uniqueness test. Completes the A1 "reliable CI" goal.
- [x] ✅ HEAD + HTTP/2 conformance — roadmap-to-10 B6 (2026-06-24): the canonical Linked Art entity/collection routes now export `HEAD` (via a `bodilessResponse(await GET(...))` helper in `src/utils/protocol.ts`) — same headers as GET, no body, mirrors 200/404; `OPTIONS` advertises `GET,HEAD,OPTIONS`. HTTP/2 verified live (`HTTP/2.0 200` via Vercel). `tests/api/head-methods.test.ts`; suite 1,128.
- [x] ✅ Roadmap trimmed, roadmap-to-10 A5 (2026-06-24): this roadmap went from ~1,510 lines to ~420 by archiving the slice-by-slice Era A/B/C history to [`progress/era-history.md`](progress/era-history.md) (see "Era delivery history" below). `getStructuredRoadmap` aggregates both files so `/api/roadmap` still exposes full phases/milestones.
- [x] ✅ Activity Streams change feed — roadmap-to-10 B5 (2026-06-24): aligned `/api/activity` to Activity Streams 2.0 — AS2 `@context`, `next`/`prev` page links (kept `nextPage`/`prevPage` aliases), a fuller `partOf` `OrderedCollection` with `first`/`last`/`totalItems`, and `application/activity+json`. `tests/api/activity-as2.test.ts`; suite 1,129.
- [x] 🟡 Demo script, roadmap-to-10 A6 (2026-06-24): shot-by-shot 60–90s demo script + screenshot-gallery plan at [`demo-script.md`](demo-script.md). The script is written, but the video itself still needs a human screen recording (it can't be automated), so this is the last open 10/10 item; all others are done and CI-confirmed.
- [x] ✅ Dependency batch verified (2026-06-24): applied all 14 open Dependabot updates in one verified PR (#48) — fastapi 0.138, pyld 3.1, redis 8, pydantic 2.13, dagster 1.13.10, `actions/checkout` 6→7 — each installed in a clean venv and run against the relevant service's tests (validation `validate_record` + SHACL gate, reconciliation 9 tests, ag2-worker 6 tests, pipeline 3 tests). SHACL CI pins synced; queue cleared. Keeps A2 supply-chain hygiene current.
- [ ] ⚠️ Era C exit gate is not green yet: latest evidence run (`artifacts/exit-gate/era-c-exit-gate-latest.json`, generated `2026-06-10T11:36:30.690Z`) is `failed` across all four exit checks. The project is strong for controlled beta/demo use, but not yet ready to claim full public-production completion.
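The annotation id fix in the status list above (timestamp plus `randomUUID`) can be sketched as follows. This is a minimal illustration rather than the project's actual `mintAnnotationId()`: the id format and the `countUniqueMints` helper are assumptions made for the example.

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical id format; the roadmap only states "timestamp + randomUUID".
// The timestamp keeps ids roughly sortable, and the UUID removes the
// same-millisecond collision that `annotation-${Date.now()}` allowed.
export function mintAnnotationId(now: number = Date.now()): string {
  return `annotation-${now}-${randomUUID()}`;
}

// Illustrative uniqueness check: pin the clock so every mint shares one
// timestamp, which is exactly the case that used to collide.
export function countUniqueMints(count: number, fixedNow: number): number {
  const ids = new Set<string>();
  for (let i = 0; i < count; i += 1) {
    ids.add(mintAnnotationId(fixedNow));
  }
  return ids.size;
}
```

Pinning the timestamp makes the check independent of wall-clock timing, so a collision would show up on every run instead of intermittently.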
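The HEAD conformance item above names a `bodilessResponse(await GET(...))` helper. A minimal sketch of that pattern, assuming the standard Fetch `Response` API that Next.js route handlers and Node 18+ provide; the stand-in `GET` handler and its `found` parameter are hypothetical and exist only to make the example self-contained.

```typescript
// Copy status and headers from a GET response, but drop the body.
export function bodilessResponse(getResponse: Response): Response {
  return new Response(null, {
    status: getResponse.status,
    statusText: getResponse.statusText,
    headers: new Headers(getResponse.headers),
  });
}

// Hypothetical stand-in for a canonical entity route's GET handler.
export async function GET(found: boolean): Promise<Response> {
  if (!found) {
    return new Response(JSON.stringify({ error: "not found" }), {
      status: 404,
      headers: { "content-type": "application/json" },
    });
  }
  return new Response(JSON.stringify({ type: "HumanMadeObject" }), {
    status: 200,
    headers: { "content-type": "application/ld+json" },
  });
}

// HEAD reuses GET so headers and status codes cannot drift apart.
export async function HEAD(found: boolean): Promise<Response> {
  return bodilessResponse(await GET(found));
}
```

Deriving HEAD from GET means any header added to the GET path is mirrored automatically, at the cost of running the full GET work for each HEAD request.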
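The faceted search item above ranks hits as exact label > prefix > substring > name and returns `type`/`provider` facet counts. The sketch below illustrates that ordering under stated assumptions: the numeric scores, the `SearchRecord` shape, the alphabetical tie-break, and the choice to count facets over all matches (not just the current page) are illustrative and are not taken from `src/services/search.ts`.

```typescript
type SearchRecord = {
  id: string;
  label: string;
  names: string[];
  type: string;
  provider: string;
};

type SearchPage = {
  totalItems: number;
  orderedItems: string[];
  facets: { type: Record<string, number>; provider: Record<string, number> };
};

// Only the ordering comes from the roadmap; the score values are placeholders.
function scoreHit(record: SearchRecord, query: string): number {
  const q = query.trim().toLowerCase();
  if (q === "") return 0;
  const label = record.label.toLowerCase();
  if (label === q) return 4; // exact label
  if (label.startsWith(q)) return 3; // prefix
  if (label.includes(q)) return 2; // substring
  if (record.names.some((name) => name.toLowerCase().includes(q))) return 1; // alternate name
  return 0;
}

function countBy(records: SearchRecord[], key: (r: SearchRecord) => string): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const record of records) {
    const k = key(record);
    counts[k] = (counts[k] ?? 0) + 1;
  }
  return counts;
}

export function rankedSearch(records: SearchRecord[], query: string, limit = 20, offset = 0): SearchPage {
  const matched = records
    .map((record) => ({ record, score: scoreHit(record, query) }))
    .filter((hit) => hit.score > 0)
    // Higher score first; ties fall back to label order for stable paging.
    .sort((a, b) => b.score - a.score || a.record.label.localeCompare(b.record.label))
    .map((hit) => hit.record);
  return {
    totalItems: matched.length,
    orderedItems: matched.slice(offset, offset + limit).map((r) => r.id),
    facets: {
      type: countBy(matched, (r) => r.type),
      provider: countBy(matched, (r) => r.provider),
    },
  };
}
```

Counting facets before slicing keeps the counts stable as a client pages through results with `limit` and `offset`.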
Current Launch Readiness
| Launch lane | Score | Current decision | Required next evidence |
|---|---:|---|---|
| Internal/dev demo | 9/10 | Safe to keep using and iterating locally. | Keep `pnpm test`, `pnpm lint`, `pnpm build`, and closeout guard green. |
| Controlled public beta | 8/10 | App is now live on Vercel + Neon (2026-06-23), clearing the deployed-base-URL blocker. `pnpm launch:beta:readiness` is now `blocked` only on `AUTH_GITHUB_ID`, an uptime source, a k6 target URL, and disabling Vercel Deployment Protection; Postgres storage, DR proof, public-trust smoke, a11y, and explore smoke evidence are present. | Set `AUTH_GITHUB_ID`, point `BASE_URL` / `METAMUSEUM_PUBLIC_READ_BASE_URL` at the production URL, configure an uptime source and `IIIF_TILE_URL`, disable Vercel Deployment Protection, then rerun `pnpm launch:preflight`, `pnpm launch:review`, and `pnpm launch:beta:readiness` until status is `live-beta-ready` or only accepted warnings remain. |
| General public production | 6.5/10 | Not yet; product is feature-rich but evidence gates are red. | Passing 30-day SLO/uptime evidence, real KPI telemetry, and external activity-feed adoption proof. |
| Institution-grade / 10/10 | 5.5-6/10 | Blocked by time-based evidence and real-world adoption. | At least 30 days of green SLO + uptime samples, 3 declared external feed consumers, and SOTA §26 KPI targets. |
Current SaaS Readiness
| SaaS lane | Score | Current decision | Required next evidence |
|---|---:|---|---|
| Technical SaaS foundation | 7/10 | Strong enough to begin SaaS packaging: auth, roles, Postgres storage, provider ingestion, validation, docs, launch review, and trust/syndication tooling are real. | Complete staging secrets, production-like deployment, usage limits, tenant-aware data boundaries, and supportable onboarding. |
| Paid pilot readiness | 8/10 | Close for 1-3 concierge pilots where Sun & Rain Works can manually onboard collections and invoice outside the app; `/pilot` is shared-nav reachable and publishes the offer, named initial outreach queue, derived status ledger showing 10 researched accounts and 0 sent messages, and a zero-complete activation evidence ledger backed by managed outreach and activation events with the next required evidence. `pnpm pilot:outreach`, `pnpm pilot:activation`, `pnpm pilot:support`, and `pnpm pilot:evidence --markdown` now give operators validated no-JSON-edit commands for recording real first-outreach/milestone/support evidence and packaging explicit blocked/ready JSON plus Markdown evidence artifacts, with outreach replies requiring prior sent evidence, sent follow-up dates barred from predating `sentAt`, activation milestones requiring prior same-tenant evidence, tenant-mismatched activation evidence rejected, support response and resolution timestamps barred from predating `openedAt`, existing support issue updates barred from rewriting the original `requester`, `summary`, `openedAt`, or `severity`, already-resolved support issue updates barred from rewriting `resolvedAt` or `resolutionSummary`, open support issues barred from carrying resolution evidence, and open blocking support issues plus overdue open support responses preventing `ready` status. 
`docs/ops/managed-linked-art-pilot-runbook.md` defines the concierge setup, tenant/source namespace, activation events, support intake, monthly evidence packet, and counter-backed route usage gate; `src/services/pilot-entitlements.ts` makes the interim manual invoice-backed `pilot` entitlement durable in `storage/pilot-entitlements.json`, exact-tenant readable, optionally org-bound, validated, exportable to Postgres, and service-gated for import/AI/storage/export/API usage; `src/services/pilot-usage-counters.ts` persists tenant or org usage keys in managed `storage/pilot-usage-counters.json`; `src/services/pilot-support-issues.ts` persists support-load evidence in managed `storage/pilot-support-issues.json`; `src/services/pilot-route-gates.ts` now returns stable `429`/`Retry-After` denials for API-per-minute overages before request parsing or route work; `src/services/org-tenants.ts` stores first-class orgs, memberships, and hashed-token invites in managed `storage/org-tenants.json`; records, wiki draft, annotation, AgentTask, and scoped AI/editorial route tests prove signed-in members cannot use sibling active-org cookies to read sibling records or read/review/publish/write sibling editorial and agent-review state, while records, annotation, wiki draft, and scoped AI/editorial route tests now prove signed-in non-members cannot use cookie-selected org scope for those reads or writes, and org admin route tests prove signed-in non-admin roles cannot list orgs, add memberships, create/revoke invites, or export org audit packets; admin-only `/api/orgs*` routes now create/list orgs, memberships, sanitized invites, invite revocation, public invite acceptance, form posts, org-scoped audit rows, and org-specific audit export packets; `/api/orgs/active` persists a membership-validated active org in an HttpOnly cookie for signed-in members without exposing org admin writes; `/orgs` exposes the current operator console for org provisioning, membership adds, invite creation, 
sanitized invite review, and pending-invite revocation; `src/services/org-session-status.ts` and the workspace shell expose sanitized active-org status plus a selector/switch/manage affordance and stale-selection warning for signed-in org members; Auth.js resolves signed-in identities from active org memberships before falling back to `AUTH_ORG_MEMBERSHIPS` or pilot tenant headers; `src/services/active-org-selection.ts` now provides the shared membership-validated selected-org resolver used by `src/services/tenant-preview.ts`, `src/services/request-storage-scope.ts`, and `src/services/pilot-route-gates.ts`, so selected active orgs drive preview reads, API storage scope, entitlement lookup, and successful route counter writes while keeping test override headers and explicit tenant preview links as compatibility precedence; `tests/services/pilot-route-gates.test.ts` proves two pilot tenants stay isolated at the counter/gate layer and that org-scoped requests do not write against the compatibility tenant id; `proxy.ts` refuses tenant-tagged direct legacy provider routes before they can bypass facade counters; `src/auth/roles.ts` derives provider import role gates from the provider capability registry and keeps `/api/orgs/active` researcher-level before the admin-only org wildcard; `src/services/org-storage-scope.ts` and `tests/services/org-storage-isolation.test.ts` add service-level org-scoped isolation for records, jobs, AgentTask artifacts, and researcher annotations; tenant/org-scoped records, jobs, AgentTask, annotations, audit rows, ActivityStreams feed reads, activity readiness metrics, provider facade imports, explorer imports, Linked Art imports, wiki draft access routes, wiki sync-map/reverse-ETL routes, scoped public browse/derived plus AI/editorial read routes, and tenant preview pages now propagate exact scope into backing stores or read models; tenant-scoped managed JSON documents are included in Postgres export and DR restore rehearsal; and 
`docs/ops/procurement-readiness-packet.md` packages the first buyer-facing security overview, data flow, hosting/subprocessor note, backup/restore evidence path, incident summary, and checklist. | Send/track first outreach, onboard one real pilot dataset using the runbook, attach deployment-specific security/legal evidence to the buyer packet, add route-level support-access implementation tests only if a support-as-customer feature ships, and complete the activation ledger with real tenant evidence. |
| Self-serve SaaS readiness | 3/10 | Not ready; the app lacks pricing pages, tenant signup, in-app billing, usage dashboards, and support workflows, and plan gates and org invites exist only as interim operator-run flows (`src/services/pilot-entitlements.ts`, `/orgs`). | Add account/org onboarding, billing/manual-plan entitlements, quotas, usage analytics, and customer success runbooks. |
| Profitable SaaS business | 4/10 | The product has a credible technical wedge, but revenue operations and repeatable sales motion are not proven yet. | Convert paid pilots into recurring subscriptions with gross-margin, retention, support-load, and acquisition-channel evidence. |
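The outreach guards in the Paid pilot readiness row (replies need prior sent evidence, replies cannot predate `sentAt`, follow-up dates cannot predate `sentAt`) amount to a small ordering validator. The sketch below is an assumption-laden illustration, not the code in `src/services/pilot-outreach-events.ts`: apart from `sentAt`, every field name, the event shape, and the error strings are hypothetical.

```typescript
type OutreachEvent =
  | { kind: "sent"; account: string; sentAt: string; followUpAt?: string }
  | { kind: "reply"; account: string; repliedAt: string };

type SentEvent = Extract<OutreachEvent, { kind: "sent" }>;

// Returns the list of rule violations; an empty list means the event may be appended.
export function validateOutreachEvent(history: OutreachEvent[], next: OutreachEvent): string[] {
  const errors: string[] = [];

  if (next.kind === "sent") {
    // A scheduled follow-up must not be dated before the message it follows.
    if (next.followUpAt !== undefined && Date.parse(next.followUpAt) < Date.parse(next.sentAt)) {
      errors.push("follow-up date predates sentAt");
    }
    return errors;
  }

  // Replies only count against sent evidence for the exact same account.
  const sent = history.filter(
    (event): event is SentEvent => event.kind === "sent" && event.account === next.account,
  );
  if (sent.length === 0) {
    errors.push("reply has no prior sent evidence for this account");
    return errors;
  }

  const earliestSent = Math.min(...sent.map((event) => Date.parse(event.sentAt)));
  if (Date.parse(next.repliedAt) < earliestSent) {
    errors.push("reply is dated before sentAt");
  }
  return errors;
}
```

Returning violations as data rather than throwing would let an operator command print every problem at once before refusing to append.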
SaaS Commercialization Strategy
Primary wedge: managed Linked Art API + data-quality cockpit for small/mid-size museums, archives, galleries, digital humanities labs, and artist estates that want standards-compliant publication without hiring a semantic-web team. This wedge matches the current product surface best: provider ingestion, Linked Art normalization, API/docs, validation, public trust, AI query, reconciliation, IIIF, ActivityStreams, and Neon-backed storage.
Secondary wedge, defer until the B2B pilot loop works: creator-side provenance and authorship tools. This may scale further, but it needs simpler onboarding, consumer-grade billing, evidence storage, and marketplace/export integrations that are not yet core to the current app.
Initial paid offer: "Managed Linked Art Launch Pilot" — fixed-scope onboarding of one collection export into a hosted workspace, including data-quality report, Linked Art API, public browse pages, provenance/rights review flags, and a monthly evidence packet. Manual invoicing is acceptable for the first pilots; in-app billing can follow validated demand.
SaaS Roadmap Track
| Phase | Goal | Build / decide | Exit criteria |
|---|---|---|---|
| SaaS-0: Positioning + offer | Turn the technical platform into a sellable pilot. | ✅ `/pilot` now publishes the ICP, pilot promise, pricing hypothesis, deliverables, data prerequisites, support boundaries, success metrics, ten qualified prospect profiles, and a ten-account outreach queue with structured stages, status evidence, and derived counts. | Offer page is published, success metrics are explicit, named accounts are listed, and status tracking is visible; full exit still needs at least one actually sent first outreach. |
| SaaS-1: Concierge paid pilot | Earn first non-demo revenue without overbuilding self-serve. | ✅ `docs/ops/managed-linked-art-pilot-runbook.md` now defines the pilot workspace setup runbook, tenant namespace convention, manual plan entitlement config, activation events, support intake process, and customer evidence packet template. Next: execute it with first outreach, one real pilot dataset, and completed real-tenant activation milestones. | 1-3 paid pilots onboarded; each reaches first value within 7 days; pilot users can view/import/query/export without engineer intervention for routine tasks. |
| SaaS-2: Multi-tenant product core | Make the app safe for multiple paying organizations. | ✅ Service-layer org-scoped storage isolation is in place for records, jobs, persisted AgentTask artifacts, researcher annotations, audit logs and org audit exports, ActivityStreams feed reads, activity readiness metrics, Postgres storage export, DR restore rehearsal, scoped public browse/derived reads, scoped AI/editorial read dependencies, wiki draft storage/access routes, wiki sync-map/reverse-ETL artifacts, authenticated org-session preview fallback for workspace pages, first-class managed `org-tenants.json` org/membership/invite backing, admin/team invite APIs, the `/orgs` operator UI, membership-validated `/api/orgs/active` cookie selection, workspace active-org status/selector/switch affordance plus stale-selection feedback, selected-org propagation through shared preview/storage/gate resolvers, records, wiki draft, annotation, AgentTask, and scoped AI/editorial route tenant RBAC read/write evidence, representative records/annotation/wiki draft/scoped AI-editorial non-member route RBAC evidence, non-admin org administration denial evidence, org-bound pilot entitlement/counter gates, and stable pilot API rate-limit denials, with Auth.js resolving active org memberships before compatibility env mappings. Support impersonation policy is now defined and test-locked in the procurement packet; next support-access evidence is route-level implementation proof only if the feature ships. | Tests prove tenant isolation at service and route boundaries; launch review includes tenant security checks; one hosted deployment supports multiple orgs without data bleed. |
| SaaS-3: Billing + growth loop | Move from manual pilots to repeatable subscriptions. | ✅ Interim plan gates and durable manual pilot entitlement validation are executable in `src/services/pilot-entitlements.ts` and stored in managed `storage/pilot-entitlements.json`. Next: add pricing page, checkout or invoice-backed subscriptions, billing webhooks, usage enforcement, metered usage, trial/activation emails, onboarding checklist UI, churn/cancel reasons, and product analytics dashboard. | New org can sign up or be provisioned in under 15 minutes; MRR, activation, retention, support tickets, and usage are visible weekly. |
| SaaS-4: Reliability + compliance for institutions | Make paid deployments procurement-friendly. | ✅ First procurement readiness packet is landed with security overview, data-flow diagram, hosting/subprocessor assumptions, backup/restore proof path, incident response summary, and checklist. Next: add customer-facing status page, SLA/SLO reporting, DPA/legal packet, access-review reports, deployment-specific backup evidence exports, incident drill evidence, and data-retention controls. | Controlled beta evidence is green enough for pilots; production launch review is green before broad public SaaS claims. |
| SaaS-5: Profitability gate | Prove the business model, not just the software. | Track gross margin, cloud cost per tenant, support minutes per account, onboarding cost, conversion rate, retention, expansion, and CAC/payback by channel. | Positive gross margin per tenant, repeatable acquisition channel, retention evidence, and at least one pricing tier that remains profitable after support + infra cost. |
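The SaaS-2 row describes selected-org propagation in which the membership-validated active-org cookie drives scope while test overrides and explicit `?tenantId=...` links keep precedence, and compatibility tenant headers come last. One way to picture that resolver is sketched below. The input shape, the source labels, and the relative order of the test override versus the explicit tenant id are assumptions; the roadmap does not spell them out, and this is not the code in `src/services/active-org-selection.ts`.

```typescript
type ScopeInput = {
  testOverrideTenantId?: string; // test-only override header (assumed shape)
  queryTenantId?: string; // explicit ?tenantId=... preview link
  activeOrgCookie?: string; // value of the active-org cookie
  memberOrgIds: string[]; // orgs the signed-in user actually belongs to
  tenantHeader?: string; // compatibility pilot tenant header
};

type ScopeSource = "test-override" | "query-tenant" | "active-org" | "tenant-header" | "unscoped";

type ResolvedScope = { scopeId: string | null; source: ScopeSource; staleSelection: boolean };

export function resolveScope(input: ScopeInput): ResolvedScope {
  // A cookie naming an org the user is not a member of is stale and never selects scope.
  const staleSelection =
    input.activeOrgCookie !== undefined && !input.memberOrgIds.includes(input.activeOrgCookie);

  if (input.testOverrideTenantId) {
    return { scopeId: input.testOverrideTenantId, source: "test-override", staleSelection };
  }
  if (input.queryTenantId) {
    return { scopeId: input.queryTenantId, source: "query-tenant", staleSelection };
  }
  if (input.activeOrgCookie && !staleSelection) {
    return { scopeId: input.activeOrgCookie, source: "active-org", staleSelection };
  }
  if (input.tenantHeader) {
    return { scopeId: input.tenantHeader, source: "tenant-header", staleSelection };
  }
  return { scopeId: null, source: "unscoped", staleSelection };
}
```

Routing preview, storage, and route-gate lookups through one resolver like this is what would keep a sibling-org cookie from selecting scope in one layer but not another.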
SaaS Product Backlog
| Capability | Current state | SaaS-grade next step |
|---|---|---|
| Tenant/account model | Auth roles and Postgres storage exist; pilot entitlement/counter tests prove exact-tenant and org-bound gate isolation; `src/services/org-tenants.ts` stores first-class orgs, active/inactive memberships, and hashed-token invites in managed storage; `src/services/active-org-selection.ts` validates and persists selected active orgs for signed-in members and now shares that membership-validated resolver with request storage, preview, and route-gate scope; `src/services/org-session-status.ts` resolves sanitized active-org display state and generic stale-selection warnings; admin-only org APIs create/list orgs, memberships, sanitized invites, revocations, public invite acceptance, form posts, and org-scoped audit rows; `/orgs` provides the current operator UI for org creation, membership adds, invite creation, sanitized invite review, and pending-invite revocation; the workspace shell shows active org status, storage scope, membership role, accessible-org count, selector, stale-selection warning, and a switch/manage affordance without token material; Auth.js resolves active memberships into session org ids before compatibility env mappings; records, jobs, and persisted AgentTask artifacts support org-scoped service-layer storage under a shared root; tenant-tagged records, jobs, AgentTask, annotation, activity, audit, import, public browse/derived read routes, AI/editorial read routes, wiki draft access routes, wiki sync-map/reverse-ETL routes, and preview pages now propagate exact tenant or authenticated org scope into backing stores or read models. | Add route-level support-access implementation safeguards only if support-as-customer ships, plus broader tenant RBAC evidence before self-serve hosting. |
| Plans and entitlements | `src/services/pilot-entitlements.ts` defines `free`, `pilot`, `institution`, and `enterprise` plan gates for imports, AI calls, storage, users, exports, API rate limits, and feature access; it also validates and persists interim manual pilot entitlement records by exact `tenantId` with optional `orgId` binding in managed `storage/pilot-entitlements.json`, evaluates requested usage against the active plan, and `src/services/pilot-route-gates.ts` reads tenant or selected authenticated-org usage from managed `storage/pilot-usage-counters.json` before provider facade import/search/profile, AI, content generation, records API, and records export work runs, returns stable `429`/`Retry-After` responses for API-per-minute overages, then records successful 2xx work back into the same counter ledger under the selected scope. `src/services/org-tenants.ts` now gives plan gates a first-class org/membership/invite backing store, `src/services/active-org-selection.ts` persists membership-validated org choice and feeds shared selected-org resolution into route gates, `src/services/org-session-status.ts` makes active-org scope visible in the workspace shell, while `proxy.ts` blocks tenant-tagged direct legacy provider routes, `src/auth/roles.ts` derives provider import write gates from the capability registry, scoped service storage covers records/jobs/AgentTask artifacts, records/jobs/AgentTask routes pass tenant identity into scoped stores, browse plus AI/editorial read APIs select scoped record stores when request context is scoped, wiki draft access routes select scoped draft stores, wiki sync-map/reverse-ETL routes select scoped stores, and preview pages select authenticated org records when query tenant scope is absent. | Add richer plan surfaces, production limiter metadata, and usage dashboards before supporting self-serve multi-org hosting. |
| Billing | No in-app billing; first pilots now have a durable manual invoice-backed entitlement contract with required invoice reference, namespace, owner, publication boundary, monthly evidence cadence, and Postgres export coverage. | Use manual invoice entitlement records for pilots; graduate to Stripe or equivalent checkout/webhooks only after pilot pricing is validated. |
| Onboarding | Developer-led setup works; the managed pilot runbook now defines concierge workspace setup, source-data requirements, namespace rules, and a seven-day activation checklist, but self-serve setup does not exist. | Execute the runbook on one real dataset, then add guided org setup, sample dataset path, first-value dashboard, and onboarding email flow. |
| Usage analytics | Launch/exit evidence exists; `src/services/pilot-outreach-events.ts`, `src/services/pilot-activation-events.ts`, `src/services/pilot-support-issues.ts`, and `src/services/pilot-evidence-packet.ts` now record, summarize, and package required outreach/activation/support evidence, `pnpm pilot:outreach`, `pnpm pilot:activation`, `pnpm pilot:support`, and `pnpm pilot:evidence --markdown` give operators validated write/export paths for real JSON and customer-readable Markdown evidence, `/pilot` renders honest zero-sent and zero-complete ledgers until real records exist, and route-level entitlement gates read exact-tenant month/minute counters from managed storage and record successful gated route work, but broader customer activation analytics are still thin. | Implement tenant-aware tracking for activation milestones, validation improvements, customer views, weekly active users, usage cost, and monthly evidence exports. |
| Support operations | Technical docs are strong; the managed pilot runbook now defines support intake fields, severity levels, response rules, and evidence cadence for concierge pilots. | Formalize intake tooling, known-issues page, escalation policy, and pilot feedback cadence once the first pilot is active. |
| Sales/marketing surface | `/pilot` now publishes the buyer-facing Managed Linked Art Launch Pilot page with problem, buyer, offer, pricing hypothesis, success metrics, support boundaries, source-network CTA, contact CTA, shared primary/footer nav access, named outreach queue, an outreach status ledger with non-clipped status cells, and a zero-complete activation evidence ledger backed by managed outreach/activation events plus operator commands needed to record and package real evidence. | Send/record the first real outreach, then add proof screenshots, completed activation milestones, and a customer evidence packet once the first pilot is active. |
| Procurement readiness | `docs/ops/procurement-readiness-packet.md` now packages a buyer-reviewable security overview, Mermaid data-flow diagram, hosting/subprocessor assumptions, backup/restore evidence path, incident response summary, checklist, and explicit non-SOC-2 / incomplete-tenant-isolation limits. | Attach actual deployment-specific evidence, legal/DPA artifacts, access-review exports, incident drill evidence, and customer-specific subprocessors once the first pilot is active. |
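
The plans-and-entitlements row above describes gates that read usage counters before work runs, answer per-minute overages with `429`/`Retry-After`, and record only successful 2xx work. A minimal sketch of that flow follows. Every name in it (`checkMinuteGate`, `recordSuccessfulWork`, the scope-key format, the in-memory `Map`) is a hypothetical stand-in, not the actual `src/services/pilot-route-gates.ts` API, which persists counters in managed storage.

```typescript
// Hypothetical sketch of a counter-backed route gate; names and shapes are
// illustrative assumptions, not the real pilot-route-gates.ts API.
type ScopeKey = string; // assumed format, e.g. "tenant:acme" or "org:org_123"

interface MinuteCounter {
  windowStartMs: number; // start of the current one-minute window
  count: number; // successful calls recorded in this window
}

type GateDecision =
  | { allowed: true }
  | { allowed: false; httpStatus: 429; retryAfterSeconds: number };

const WINDOW_MS = 60_000;
const minuteCounters = new Map<ScopeKey, MinuteCounter>();

// Check the per-minute limit BEFORE the gated work runs.
function checkMinuteGate(scope: ScopeKey, limitPerMinute: number, nowMs: number): GateDecision {
  const counter = minuteCounters.get(scope);
  if (!counter || nowMs - counter.windowStartMs >= WINDOW_MS) {
    return { allowed: true }; // no active window, so nothing can be exceeded
  }
  if (counter.count < limitPerMinute) {
    return { allowed: true };
  }
  const remainingMs = counter.windowStartMs + WINDOW_MS - nowMs;
  return { allowed: false, httpStatus: 429, retryAfterSeconds: Math.ceil(remainingMs / 1000) };
}

// Record usage only AFTER the work finished with a 2xx status.
function recordSuccessfulWork(scope: ScopeKey, responseStatus: number, nowMs: number): void {
  if (responseStatus < 200 || responseStatus >= 300) return; // failures are not billed
  const counter = minuteCounters.get(scope);
  if (!counter || nowMs - counter.windowStartMs >= WINDOW_MS) {
    minuteCounters.set(scope, { windowStartMs: nowMs, count: 1 });
  } else {
    counter.count += 1;
  }
}
```

Keying the counter by scope is what keeps one tenant's traffic from consuming another tenant's allowance. A route handler would call `checkMinuteGate` first and map a denied decision onto the response status and `Retry-After` header.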
Current Evidence Blockers
| Era C exit check | Current evidence | Required to clear |
|---|---|---|
| SOTA §20.4 p95 SLOs | `3/30` retained samples; retained k6 samples are legacy/partial and missing `sparqlWhitelisted` + `iiifTileServing` p95s. | Configure deployed-target nightly evidence vars and run complete five-scenario `pnpm k6:slo` samples for 30 days; keep all p95s under policy thresholds. |
| Public-read uptime | `source: "probe"`, `availability30d: 1`, `sampleCount30d: 3`; source is now active but below the 30-sample evidence window. | Keep scheduled public probes running via `METAMUSEUM_PUBLIC_READ_BASE_URL` / uptime envs; retain at least 30 samples with >= 99.9% availability. |
| Activity feed adoption | `0/3` declared external consumers in the latest artifact; single-consumer and three-consumer matrix proof tooling are now available. | Onboard three real external consumers that send `x-linked-art-consumer-id` to `/api/activity` within the 30-day window and capture `pnpm activity:adoption:matrix` artifacts. |
| SOTA §26 KPIs | Failing `dataQualityEnrichedShare`, `reconciliationAutoApproveRate`, and `reconciliationPrecisionReviewed`; AI query cost telemetry is now sourced and within policy in the latest artifact. | Export production record-enrichment + reconciliation review counts to `monitoring/kpi-evidence.json`, then rerun telemetry sync. |
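
The public-read uptime row combines two conditions: at least 30 retained samples, and availability of at least 99.9% across them. The sketch below shows why `availability30d: 1` with `sampleCount30d: 3` is still a blocker. The `ProbeSampleSketch` shape and the function name are assumptions for illustration, and the caller is assumed to have already filtered samples to the 30-day window.

```typescript
// Hypothetical probe sample shape; the real uptime artifact may differ.
interface ProbeSampleSketch {
  checkedAt: string; // ISO 8601 timestamp of the probe
  ok: boolean; // true when the public read succeeded
}

type WindowStatus = "cleared" | "insufficient-samples" | "below-target";

interface UptimeWindowVerdict {
  sampleCount: number;
  availability: number; // fraction in [0, 1]
  status: WindowStatus;
}

// Evaluate samples that the caller has already limited to the evidence window.
function evaluateUptimeWindow(
  samples: ProbeSampleSketch[],
  minSamples = 30,
  targetAvailability = 0.999,
): UptimeWindowVerdict {
  const sampleCount = samples.length;
  const okCount = samples.filter((sample) => sample.ok).length;
  const availability = sampleCount === 0 ? 0 : okCount / sampleCount;

  let status: WindowStatus;
  if (sampleCount < minSamples) {
    status = "insufficient-samples"; // too little evidence, even if every probe passed
  } else if (availability < targetAvailability) {
    status = "below-target";
  } else {
    status = "cleared";
  }
  return { sampleCount, availability, status };
}
```

Checking the sample count before the availability ratio keeps a small, perfect-looking sample from reading as a pass.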
Next Operating Plan
- Deployment foundation — ✅ preflight automation and runbook are landed (`pnpm launch:preflight`, `pnpm launch:preflight:production`, `docs/ops/deployment-preflight.md`); Neon-backed `DATABASE_URL` is seeded with all present managed storage documents, `DATABASE_URL` now verifies with `sslmode=verify-full`, `pnpm dr:drill` verifies Postgres restore rehearsal for 6 documents, and `pnpm launch:smoke-token` now generates/rotates the staging researcher smoke token in `.env` without printing it. Latest local preflight against `http://localhost:3000` (`2026-06-10T15:55:43.932Z`) is down to `8 pass / 0 warn / 1 fail`; the exact hard failure left was `AUTH_GITHUB_ID` missing while `AUTH_GITHUB_SECRET` is already present. Update 2026-06-23: the Next.js app is now deployed to production on Vercel against Neon (`vercel.json`, `render.yaml`, `docs/deployment.md` landed; the close-out guard self-skips on Vercel), so the remaining hard inputs are `AUTH_GITHUB_ID`, pointing `BASE_URL`/`METAMUSEUM_PUBLIC_READ_BASE_URL` at the live URL, disabling Vercel Deployment Protection, and wiring the production URL into the nightly uptime/k6/KPI evidence vars.
- Evidence pipeline — ✅ nightly workflow now prefers deployed-target `pnpm k6:slo`, seeds `/api/ai/query` telemetry, probes declared activity adoption, preserves local `pnpm k6:slo:ci` fallback, and uploads performance/activity/monitoring artifacts; public-read uptime probe evidence is active but still needs 30 retained samples.
- Telemetry completeness — ✅ AI query runs emit per-query usage/cost logs, and `monitoring/kpi-evidence.json` can now supply aggregate record-enrichment + reconciliation review counts; still generate the real production export before the next exit-gate run.
- External adoption proof — ✅ partner/bot proof commands and runbook are landed (`pnpm activity:adoption:probe`, `pnpm activity:adoption:matrix`, `docs/ops/activity-adoption-proof.md`); still register three real partner/bot consumers for `/api/activity` and validate `class: "declared"`, `declaredId`, `isExternal`, and recent `lastSeenAt` in `storage/activity-consumers.json`.
- Launch review — ✅ `pnpm launch:review` / `pnpm launch:review:production` aggregate latest preflight, exit-gate, security, DR, public-trust, a11y, and explore-smoke evidence into a packet, and `pnpm launch:beta:readiness` now summarizes controlled-beta go/no-go status from launch-review plus deployment-preflight artifacts. Latest local beta readiness is `blocked` with `5` passed checks, `1` accepted beta warning, and exact preflight blockers for `AUTH_GITHUB_ID`, public base URL, uptime source, and k6 target URL; once those deploy inputs are set, rerun preflight/review/readiness for the live beta decision. Update 2026-06-23: production is live on Vercel + Neon, clearing the deployed-base-URL blocker; remaining inputs are `AUTH_GITHUB_ID`, the live uptime/k6 evidence vars, and disabling Deployment Protection.
- SaaS packaging — ⚠️ `/pilot` is now shared-nav reachable and publishes the Managed Linked Art Launch Pilot offer for concierge paid pilots, including pricing hypothesis, scope, prerequisites, support boundaries, success metrics, ten prospect profiles, a named outreach queue, and a status ledger that honestly shows 10 researched accounts and 0 sent messages; `src/services/site-navigation.ts` prevents header/footer/test navigation drift, `docs/ops/managed-linked-art-pilot-runbook.md` defines concierge setup, namespace, activation events, support intake, evidence packets, and counter-backed route usage gates, `pnpm pilot:outreach` records real dated first-outreach evidence through managed storage without hand-editing JSON while rejecting reply claims without prior sent evidence, replies before `sentAt`, and sent follow-ups before `sentAt`, `pnpm pilot:activation` records real dated activation events through managed storage without hand-editing JSON while rejecting tenant-mismatched evidence and later milestones without prior same-tenant evidence, `pnpm pilot:support` records real support-load evidence through managed storage without hand-editing JSON while rejecting response deadlines or resolution times before `openedAt` and rejecting open issues with resolution evidence, `pnpm pilot:evidence --markdown` writes blocked/ready monthly JSON and Markdown evidence artifacts from entitlement, usage, outreach, activation, and support ledgers, `src/services/pilot-entitlements.ts` defines executable interim commercial plan gates plus durable exact-tenant or org-bound manual pilot entitlement storage and service-level usage enforcement, `src/services/pilot-usage-counters.ts` adds managed tenant/org month/minute counters, `src/services/org-tenants.ts` adds managed org/membership/invite backing with hashed invite tokens, `src/services/active-org-selection.ts` persists selected active orgs after membership validation and now feeds the same selected-org resolver into preview, 
request-storage, and route-gate scope, `src/services/org-session-status.ts` exposes sanitized active-org status and stale-selection warnings for the workspace shell, admin-only `/api/orgs*` routes expose org, membership, invite, revoke, accept, form-post, audit workflows, and audit export packets, `/orgs` exposes the current shared-nav operator console for org provisioning and invite management without rendering token hashes, `src/services/tenant-preview.ts` lets preview pages fall back to selected authenticated org scope before compatibility tenant query links are needed, `src/services/request-storage-scope.ts` now has records and wiki draft route proof that sibling active-org cookies cannot steer signed-in member reads/writes into sibling org storage, `src/services/pilot-route-gates.ts` enforces provider facade, AI, content generation, records API, and records export usage from those counters before work runs, returns stable `429`/`Retry-After` responses for API-per-minute overages, and records successful 2xx work afterward while preferring selected authenticated org scope over compatibility tenant headers, focused tests prove exact-tenant and org-bound counter/gate isolation, `proxy.ts` refuses tenant-tagged direct legacy provider routes before they can bypass facade counters, `src/auth/roles.ts` derives provider import write gates from the provider capability registry, Auth.js carries first-class signed-in identity-to-org storage scope into sessions before env fallback, `src/services/org-storage-scope.ts` adds scoped service storage for records/jobs/AgentTask artifacts, tenant/org-scoped records/jobs/AgentTask/annotation/audit/activity/import/wiki-draft/wiki-sync routes now pass exact scope into backing stores and audit rows, scoped public browse/derived plus AI/editorial read APIs now select scoped record stores when request context is scoped, page previews can render scoped records/artwork/entity/explore/graph/pattern views from either explicit tenant query 
or authenticated org session scope, tenant-scoped managed documents participate in export/DR evidence, and `docs/ops/procurement-readiness-packet.md` provides the first buyer-facing security/data-flow/subprocessor/backup/incident-response packet, including the explicit no-production-support-impersonation policy and future control requirements. Next evidence is one sent first outreach, one real pilot dataset executed through the runbook, deployment-specific buyer evidence, route-level support-access evidence only if that feature ships, and the first completed real-tenant activation ledger; do not claim profitable SaaS readiness until recurring revenue, support load, retention, and gross-margin evidence are real.
Latest pilot evidence packet note: `/pilot` still honestly shows 10 researched accounts, 0 sent messages, and 0 of 6 activation milestones complete, but `pnpm pilot:outreach`, `pnpm pilot:activation`, `pnpm pilot:support`, and `pnpm pilot:evidence --markdown` now convert real manual outreach, activation, support, entitlement, and usage ledgers into explicit `blocked` or `ready` JSON plus Markdown packets, reject reply claims without prior sent evidence, reject replies dated before `sentAt`, reject sent follow-up dates before `sentAt`, reject later activation milestones without prior same-tenant evidence, reject tenant-mismatched activation evidence, reject support response and resolution times before `openedAt`, reject existing support issue updates that rewrite the original `requester`, `summary`, `openedAt`, or `severity`, reject already-resolved support issue updates that rewrite `resolvedAt` or `resolutionSummary`, reject open support issues that carry resolution evidence, and prevent `ready` status while any blocking support issue remains open or any open issue is past its next response time, so the next SaaS evidence target remains a real sent outreach plus a real tenant dataset with dated activation evidence.
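
The outreach rejections listed in the note above reduce to a few chronology checks. The sketch below covers three of them: a reply with no sent evidence, a reply dated before `sentAt`, and a follow-up dated before `sentAt`. The entry shape and the field names `repliedAt` and `followUpSentAt` are assumptions; only `sentAt` appears in this roadmap, and the real `src/services/pilot-outreach-events.ts` schema may differ.

```typescript
// Hypothetical ledger entry; only `sentAt` is a field name taken from the roadmap.
interface OutreachEntrySketch {
  account: string;
  sentAt?: string; // ISO 8601 timestamp of the first sent message
  repliedAt?: string; // assumed field: timestamp of the prospect's reply
  followUpSentAt?: string; // assumed field: timestamp of a sent follow-up
}

// Returns rejection reasons; an empty array means the entry may be appended.
function validateOutreachChronology(entry: OutreachEntrySketch): string[] {
  const toMs = (value?: string): number | undefined =>
    value === undefined ? undefined : Date.parse(value);

  const sent = toMs(entry.sentAt);
  const replied = toMs(entry.repliedAt);
  const followUp = toMs(entry.followUpSentAt);

  // Reject unparseable dates first so later comparisons never see NaN.
  const parsed: Array<[string, number | undefined]> = [
    ["sentAt", sent],
    ["repliedAt", replied],
    ["followUpSentAt", followUp],
  ];
  const invalid = parsed
    .filter(([, ms]) => ms !== undefined && Number.isNaN(ms))
    .map(([field]) => `${field} is not a valid date`);
  if (invalid.length > 0) return invalid;

  const reasons: string[] = [];
  if (replied !== undefined && sent === undefined) {
    reasons.push("reply claimed without prior sent evidence");
  }
  if (replied !== undefined && sent !== undefined && replied < sent) {
    reasons.push("reply dated before sentAt");
  }
  if (followUp !== undefined && sent !== undefined && followUp < sent) {
    reasons.push("follow-up dated before sentAt");
  }
  return reasons;
}
```

Returning reasons instead of throwing lets an operator command print every problem with an entry in one pass.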
- Documentation currency — ✅ every iteration must update `README.md` and `docs/roadmap.md` before `pnpm session:closeout`; `scripts/session-closeout.ts` enforces this on the normal closeout path by comparing both files to the previous closeout timestamp. Latest CI hardening keeps workflow JavaScript Actions on Node 24-native major versions while preserving the project’s Node 20 app execution path; latest UI polish keeps the home hero carousel in this current surface because clear full-color artwork imagery, attribution, reuse context, and a centered non-overlapping info panel are part of the public Linked Art trust contract.
- Agent productionization — ✅ persistent AgentTask review history is landed (`agent-tasks.json` managed storage + `/api/agents/tasks`), the internal AG2 bridge boundary is wired for Mercator/Janus behind `METAMUSEUM_AG2_BRIDGE_ENABLED`, and the local Python AG2 worker endpoint is available at `services/ag2-worker` with review-only contract tests, route-to-worker trace propagation, timeout/refusal fallback coverage, safe enablement docs, and live-worker eval artifacts via `pnpm ag2:worker:eval`. ⚠️ next value is collecting operator sign-off for any production bridge enablement; A2A/AG-UI remain deferred.
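
The external-adoption bullet above asks for three real consumers validated by `class: "declared"`, `declaredId`, `isExternal`, and a recent `lastSeenAt`. A sketch of that check follows. The record shape is inferred from those four field names only; the actual layout of `storage/activity-consumers.json` is not described here, so treat the interface and function names as assumptions.

```typescript
// Assumed record shape, built from the four field names the roadmap lists.
interface ConsumerRecordSketch {
  class: string;
  declaredId?: string;
  isExternal?: boolean;
  lastSeenAt?: string; // ISO 8601 timestamp
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Distinct declared, external consumer ids seen inside the evidence window.
function qualifyingConsumerIds(
  records: ConsumerRecordSketch[],
  nowMs: number,
  windowDays = 30,
): string[] {
  const ids = new Set<string>();
  for (const record of records) {
    if (record.class !== "declared") continue;
    if (!record.declaredId) continue;
    if (record.isExternal !== true) continue;
    const seenMs = record.lastSeenAt ? Date.parse(record.lastSeenAt) : Number.NaN;
    if (Number.isNaN(seenMs)) continue; // missing or unparseable timestamp
    if (seenMs > nowMs || nowMs - seenMs > windowDays * DAY_MS) continue; // future or stale
    ids.add(record.declaredId);
  }
  return Array.from(ids).sort();
}

// The exit check asks for three real external consumers.
function adoptionGateCleared(records: ConsumerRecordSketch[], nowMs: number, required = 3): boolean {
  return qualifyingConsumerIds(records, nowMs).length >= required;
}
```

Collecting ids in a `Set` means repeated requests from one consumer count once toward the three-consumer target.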
---
Linked Art adherence uplift (current -> high)
This section turns the current medium/medium-high areas into explicit completion criteria.
A. Validation architecture depth (B2 follow-through)
Current: validation architecture is in place and standards-linked.
Target: high adherence through continuous standards-backed enforcement.
Adherence upgrade target:
- [x] ✅ Validation depth moved from "in place" to continuous fixture-backed drift enforcement.
Status:
- [x] ✅ Complete.
- [x] ✅ Evidence: `/explore` includes `vanda` source toggle and `/artwork/[id]` now exposes digital/IIIF manifest-image links when present in imported records.
- [x] ✅ Provider/import transform policy is now executable via fixture manifest + tests:
- `tests/fixtures/validation/provider-fixture-manifest.json`
- `tests/quality/validation-architecture-depth.test.ts`
- [x] ✅ Scheduled revalidation pass landed:
- `.github/workflows/validation-drift.yml` (weekly + manual dispatch)
- `scripts/validation-drift.ts`
- [x] ✅ CI drift visibility + regression blocking landed:
- `.github/workflows/ci.yml` runs `pnpm validation:drift:check`
- net-new critical violations fail the job
Definition of done:
- [x] ✅ Validation coverage includes object, digital, provenance, shared structures, and endpoint-shape fixtures from the reference rounds used in active slices.
- [x] ✅ CI shows stable/no-regression validation trend for two consecutive release cycles.
- `config/validation-drift-cycles.json` tracks release-cycle snapshots.
- `pnpm validation:drift:trend` performs executable two-cycle no-regression gating.
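
The two-cycle gate above can be read as: each of the two most recent release cycles shows no increase in critical violations over the cycle before it. The sketch below encodes that reading. The snapshot shape is a guess at what `config/validation-drift-cycles.json` might hold, and the three-snapshot minimum is an assumption of this example, not a documented rule of `pnpm validation:drift:trend`.

```typescript
// Assumed snapshot shape; the real validation-drift-cycles.json may differ.
interface DriftCycleSnapshot {
  cycle: string; // release-cycle label, oldest first in the input array
  criticalViolations: number;
}

// True when the two most recent cycles each held or improved on their predecessor.
function hasTwoCycleNoRegression(snapshots: DriftCycleSnapshot[]): boolean {
  // Two transitions need three snapshots: a baseline plus two cycles.
  if (snapshots.length < 3) return false;
  const [baseline, previous, latest] = snapshots.slice(-3);
  return (
    previous.criticalViolations <= baseline.criticalViolations &&
    latest.criticalViolations <= previous.criticalViolations
  );
}
```

Comparing each cycle with its direct predecessor, not with a fixed baseline, means an improvement in one cycle raises the bar for the next.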
B. Provider rollout completeness (B5)
Current: all planned expansion providers are landed.
Target: high adherence with repeatable, standards-mapped provider slices.
Adherence upgrade target:
- [x] ✅ Provider rollout discipline is locked to keep all landed B5 providers green with standards-mapped tests (fixtures + protocol/profile checks + parity checklist).
- [x] ✅ Execute remaining providers as independent slices (Louvre, Harvard, Smithsonian, V&A, Princeton, Europeana, AIC, CMA), each with:
- [x] ✅ adapter isolation conformance
- [x] ✅ fixture-anchored standards mapping
- [x] ✅ protocol/profile checks from B8
- [x] ✅ Track per-provider readiness in this roadmap with explicit `not started / in progress / done` status and standards round coverage.
Provider readiness matrix:
| Provider | Status | Standards round coverage | B8 protocol/profile checks | Notes |
|---|---|---|---|---|
| Rijks | done | object + digital + provenance + shared structures + endpoint-shape fixture anchors mapped to `LinkedArtModel1.0-Reference.md` | complete | Routes + adapter + tests are landed and included in provider protocol conformance suite. |
| NGA | done | object + digital + provenance + shared structures + endpoint-shape fixture anchors mapped to `LinkedArtModel1.0-Reference.md` | complete | CSV ingest/provider slice landed with adapter + profile/search/import routes + tests. |
| Louvre | done | object + shared structures + references fixture anchors mapped to `LinkedArtModel1.0-Reference.md` | complete | Routes + adapter + tests landed with provider facade wiring. |
| Harvard | done | object + shared structures + endpoint-shape fixture anchors mapped to `LinkedArtModel1.0-Reference.md` | complete | Routes + adapter + tests landed with provider facade wiring. |
| Smithsonian | done | object + shared structures + data-discovery fixture anchors mapped to `LinkedArtModel1.0-Reference.md` | complete | Routes + adapter + tests landed with provider facade wiring. |
| V&A | done | object + digital + data-discovery fixture anchors mapped to `LinkedArtModel1.0-Reference.md` | complete | Routes + adapter + tests landed with provider facade wiring. |
| Princeton | done | object + digital + shared structures fixture anchors mapped to `LinkedArtModel1.0-Reference.md` | complete | Routes + adapter + tests landed with provider facade wiring. |
| Europeana | done | object + shared structures + data-discovery fixture anchors mapped to `LinkedArtModel1.0-Reference.md` | complete | Routes + adapter + tests landed with provider facade wiring. |
| AIC | done | object + digital + shared structures fixture anchors mapped to `LinkedArtModel1.0-Reference.md` | complete | Routes + adapter + tests landed with provider facade wiring. |
| CMA | done | object + digital + shared structures fixture anchors mapped to `LinkedArtModel1.0-Reference.md` | complete | Routes + adapter + tests landed with provider facade wiring. |
Provider parity checklist (all must be complete per provider before status can be set to `done`):
- [x] ✅ identity mapping (stable URI + `equivalent` handling)
- [x] ✅ activity/event modeling preserved (no object-person shortcut regressions)
- [x] ✅ rights/reuse + attribution semantics preserved and surfaced
- [x] ✅ IIIF/link-layer handling preserved when source provides it
- [x] ✅ source provenance metadata preserved end-to-end (`_source.provider`, source URL, ingest time)
Definition of done:
- [x] ✅ All planned B5 providers shipped with route + adapter + tests + standards mapping notes.
- [x] ✅ Provider parity checklist complete for identity, activity modeling, rights, IIIF, and source provenance.
C. HAL + Search relations conformance (rounds 71-79)
Current: documented in the standards reference, partially deferred in platform slices.
Target: high adherence with enforceable API behavior.
- [x] ✅ Add conformance tests for OrderedCollection/OrderedCollectionPage search response shapes.
- [x] ✅ Enforce stable relation naming and discoverability contracts via search relation fields (`nextPage` / `prevPage`) and HAL-aware protocol assertions.
- [x] ✅ Prevent inverse-relationship duplication drift by asserting search-driven inverse discovery patterns.
Definition of done:
- [x] ✅ Representative search endpoints pass relation + pagination + shape conformance tests.
- [x] ✅ HAL link contract tests pass for versioning, related search links, and format/profile discoverability.
Status:
- [x] ✅ `tests/quality/hal-search-relations-conformance.test.ts` enforces response-shape + relation-contract behavior on representative direct and provider-facade search routes.
- [x] ✅ `tests/quality/protocol-conformance.test.ts` continues to enforce HAL separation + media-type/profile behavior on public API payloads.
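
A response-shape assertion of the kind these suites enforce might look like the sketch below. It checks an `OrderedCollectionPage` payload and the `nextPage` / `prevPage` relation fields named above. The exact rules are assumptions made for illustration: the real assertions live in `tests/quality/hal-search-relations-conformance.test.ts` and may check more or different properties.

```typescript
// Returns shape problems for a search page; an empty array means it conforms
// to the simplified rules assumed in this sketch.
function checkSearchPageShape(page: unknown): string[] {
  if (typeof page !== "object" || page === null || Array.isArray(page)) {
    return ["page is not a JSON object"];
  }
  const doc = page as Record<string, unknown>;
  const problems: string[] = [];

  if (doc.type !== "OrderedCollectionPage") {
    problems.push('type must be "OrderedCollectionPage"');
  }

  if (!Array.isArray(doc.orderedItems)) {
    problems.push("orderedItems must be an array");
  } else {
    doc.orderedItems.forEach((item: unknown, index: number) => {
      const isEntry =
        typeof item === "object" &&
        item !== null &&
        typeof (item as Record<string, unknown>).id === "string" &&
        typeof (item as Record<string, unknown>).type === "string";
      if (!isEntry) problems.push(`orderedItems[${index}] needs a string id and type`);
    });
  }

  // Relation fields are optional, but must be usable URIs when present.
  for (const relation of ["nextPage", "prevPage"]) {
    const value = doc[relation];
    if (value !== undefined && (typeof value !== "string" || value.length === 0)) {
      problems.push(`${relation} must be a non-empty string when present`);
    }
  }
  return problems;
}
```

The first page of a result set would normally omit `prevPage` and the last page would omit `nextPage`, which is why both fields are treated as optional here.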
D. Era B exit-gate closure readiness
Current: Era B gate closed for the current scope.
Target: keep it closed as new providers land.
- [x] ✅ B6 authority-cache request-path policy enforced.
- [x] ✅ B8/B9 conformance suites in CI.
- [x] ✅ Postgres mode is now the default storage-of-record when `DATABASE_URL` is present.
- [x] ✅ `storage/*.json` removed from version control.
- [x] ✅ Write audit-log verification in CI for all primary write routes.
Definition of done:
- [x] ✅ Era B exit gate lines are green with evidence links to tests and commands (`tests/quality/era-b-exit-gate.test.ts`, `tests/quality/protocol-conformance.test.ts`, `tests/quality/provider-protocol-conformance.test.ts`, `tests/quality/linked-art-b9-guardrails.test.ts`).
Adherence upgrade target:
- [x] ✅ Era B sustainment is continuously enforced: provider-slice conformance, authority-cache policy, and write-audit checks remain green as new sources land.
Execution policy:
- [x] ✅ No provider/validation PR merges without round + fixture-anchor standards mapping.
- [x] ✅ No protocol-affecting merges without conformance test coverage for headers/shape/negotiation touched.
- [x] ✅ Enforcement evidence:
- `.github/pull_request_template.md`
- `tests/quality/execution-policy-gates.test.ts`
---
Stack decisions — locked in
| Layer | Decision | Locked because |
|---|---|---|
| Framework | Next.js 16 App Router + RSC | already scaffolded; matches SOTA §11 |
| UI lang | TypeScript 5, `strict: true` | scaffolded |
| Styling | Custom CSS — design tokens + BEM-lite component classes in `app/globals.css`; no utility framework | user decision (reversed earlier Tailwind choice); resolves SOTA §30 Q3 |
| Forms | React Hook Form + Zod | SOTA §3.2 |
| Data fetching | RSC `fetch` first; TanStack Query for interactive client state | SOTA §11.3 + Next 16 cache-components model |
| Server state mutations | Server Actions over fetch-based POSTs where possible | Next 16 idiom |
| Tests | `node:test` + `node:assert` via `tsx`, Playwright for e2e, axe-core for a11y | SOTA §23 + legacy convention |
| Pkg manager | pnpm | scaffolded |
| Persistence (this era) | Postgres JSONB is the storage-of-record behind `src/utils/storage.ts` (`postgres` default with compatibility `file`/`double-write` modes) | preserves stable call sites during/after B3 migration |
| Triple store (SOTA era) | GraphDB (Ontotext) — Community Edition for OSS path; SE/EE if SPARQL p95 demands | user decision; resolves SOTA §30 Q1 |
| Search index (SOTA era) | Solr 9 (LUX-aligned) | architecture decision (May 30, 2026); resolves the earlier Solr-vs-OpenSearch fork |
| Persistence (SOTA era) | Postgres 16 + JSONB · Solr 9 · GraphDB · pgvector | SOTA §3.4 / §8 with finalized search choice |
| Curator backend (SOTA era) | In-house curator console (custom, Linked Art-native) | architecture decision (May 30, 2026); draws lessons from Arches/Ogee without adopting platform lock-in |
| Developer/ops backend | In-house ops console (pipeline/debug/automation focused) | architecture decision (May 30, 2026); complements curator console |
| Canonical ID scheme | `https://lod.metamuseum.org/{type}/{ulid}` | architecture decision (May 30, 2026); opaque, sortable, federation-ready |
| Publication bridge (SOTA era) | MediaWiki + custom Wikibase for Meta Wiki Art publishing | aligns Linked Art/SPARQL/citation goals; see `docs/meta-wiki-art-bridge.md` |
Previously deferred architecture decisions (now finalized, May 30, 2026):
- [x] ✅ Search engine: Solr 9 (LUX-aligned).
- [x] ✅ Curator backend: in-house custom curator console.
- [x] ✅ Developer backend: in-house ops console.
- [x] ✅ Canonical ID scheme: `https://lod.metamuseum.org/{type}/{ulid}`.
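
To illustrate why the `{ulid}` segment is "opaque, sortable", here is a small sketch that mints and validates identifiers in the canonical form. A ULID is 26 Crockford base32 characters: 10 encode a 48-bit millisecond timestamp and 16 encode 80 random bits. The function names and the lowercase pattern for the `{type}` segment are assumptions of this example, not the project's actual minting code.

```typescript
import { randomBytes } from "node:crypto";

// Crockford base32 alphabet (no I, L, O, U); ASCII order matches numeric order.
const CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ";
const ID_BASE = "https://lod.metamuseum.org";
const TYPE_PATTERN = /^[a-z][a-z0-9-]*$/; // assumed rule for the {type} segment
const CANONICAL_ID = /^https:\/\/lod\.metamuseum\.org\/[a-z][a-z0-9-]*\/[0-7][0-9A-HJKMNP-TV-Z]{25}$/;

// Encode a 48-bit millisecond timestamp as 10 base32 characters.
function encodeTime(ms: number): string {
  if (!Number.isInteger(ms) || ms < 0 || ms >= 2 ** 48) {
    throw new RangeError("timestamp must be an integer in [0, 2^48)");
  }
  let out = "";
  let remaining = ms;
  for (let i = 0; i < 10; i++) {
    out = CROCKFORD.charAt(remaining % 32) + out;
    remaining = Math.floor(remaining / 32);
  }
  return out;
}

// Encode 80 random bits as 16 base32 characters (5 bits taken from each byte).
function encodeRandom(): string {
  let out = "";
  for (const byte of randomBytes(16)) out += CROCKFORD.charAt(byte & 31);
  return out;
}

function mintCanonicalId(type: string, nowMs: number = Date.now()): string {
  if (!TYPE_PATTERN.test(type)) throw new Error(`invalid type segment: ${type}`);
  return `${ID_BASE}/${type}/${encodeTime(nowMs)}${encodeRandom()}`;
}

// Format check only: callers should not derive meaning from the path.
function isCanonicalId(uri: string): boolean {
  return CANONICAL_ID.test(uri);
}
```

Because the timestamp occupies the leading characters, identifiers minted later sort after earlier ones under plain string comparison. The sketch deliberately offers validation but no parser, in keeping with the URI opacity rule elsewhere in this roadmap.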
---
Era delivery history
All three delivery eras are complete (see Status above):
- Era A — The Lift (10 PR-sized slices): TDD foundations, Met + Getty verticals, records/artworks/entities, Linked Art inspector, patterns/graph, issues/SSE, agents/jobs/content, workspace chrome.
- Era B — Hardening (B1–B10): Zod contracts + schema versioning, formal validation, Postgres, auth + roles, 13-provider expansion, authority caching, exhibition/literature reconciliation, protocol + modeling guardrails, ARK conformance, gateway readiness.
- Era C — SOTA (C1–C5): multi-modal storage + HAL, ETL + reconciliation + mapper, IIIF + visualizations, the AI layer, syndication + Meta Wiki Art + security/privacy hardening.
The full slice-by-slice and B-/C-series implementation detail is archived in [progress/era-history.md](progress/era-history.md). The active forward plan is [roadmap-to-10.md](roadmap-to-10.md).
---
Cross-cutting standards (apply from Slice 1 onward)
These are not phases — they are continuous quality gates. Borrowed from `_legacy/AGENTS.md` and SOTA §23.
- [x] ✅ AIDD + TDD is the default. Define behavior in natural language and map standards rounds/fixture anchors first, then write the failing test (red), pass with minimum code (green), and refactor with the suite green. Tests are the spec; reviewers read tests before reading implementation. A failing test stops the line. See [CLAUDE.md](../CLAUDE.md) §"We lead with AIDD + TDD".
- [x] ✅ Adapters do not import each other. Bridge via `src/utils/artwork-builder.ts`.
- [x] ✅ Contracts are leaf modules. No upstream deps.
- [x] ✅ `_source.raw` is immutable. Transform at read time.
- [x] ✅ Rights-aware by default. Every UI surface showing an image carries reuse status + attribution.
- [x] ✅ Linked Art JSON-LD is the canonical data layer. UI DTOs (`Artwork`) are separate; map at the boundary.
- [x] ✅ Loading / empty / error / success states on every interactive UI.
- [x] ✅ Keyboard navigation + visible focus on every interactive surface.
- [x] ✅ At least one test for any risky transform.
- [x] ✅ Cite or refuse. Generated content always carries citations + rights + review state.
- [x] ✅ Next 16 specifics: `cookies()`, `headers()`, dynamic `params` are async — always `await` them.
- [x] ✅ No `unknown` swallowed silently. A record with unknown rights gets an explicit "Rights unknown — do not reuse" badge.
- [x] ✅ Reference-driven conformance. Any provider/API/schema/search/protocol PR must cite the relevant [linked-art/LinkedArtModel1.0-Reference.md](linked-art/LinkedArtModel1.0-Reference.md) rounds and include failing-first tests mapped to the referenced fixture anchors.
- [x] ✅ Reference maintenance loop. If a PR depends on newly published Linked Art guidance not yet captured in `LinkedArtModel1.0-Reference.md`, that round/addendum must be appended before (or in the same change as) the implementation PR.
- [x] ✅ Standards Mapping is required in provider/validation PRs. Include: referenced round numbers, fixture anchors exercised, and failing-first test files proving red→green conformance.
- [x] ✅ PR template completion is required. Every PR must complete [.github/pull_request_template.md](../.github/pull_request_template.md), including AIDD checklist gates, standards mapping, and protocol assertions touched.
- [x] ✅ AI-generated tests/refactors require human semantic verification. AI can accelerate drafting, but authors/reviewers remain accountable for Linked Art correctness, provider semantics, and protocol behavior.
- [x] ✅ Provider/pipeline boundary drift is actively bounded. `docs/risk-register.md` tracks risk posture and `tests/contracts/provider-boundary-contracts.test.ts` enforces adapter import boundaries.
- [x] ✅ Protocol conformance is mandatory. Public API behavior must preserve JSON-LD context/profile correctness, support `GET` + `OPTIONS`, and provide baseline CORS + media-type negotiation.
- [x] ✅ AI-RSI compounding loop is mandatory. Each merge requires: 72h review check, evidence capture in session log, and roadmap/README/CLAUDE updates before the next RSI expansion scope.
- [x] ✅ HAL/data separation is mandatory. API navigation metadata lives in `_links` (non-semantic) and must not pollute semantic graph payloads.
- [x] ✅ URI opacity is mandatory. Never infer semantics from URI path structure in router logic or client helpers.
- [x] ✅ Inverse discovery via Search API. Prefer standardized search relations and OrderedCollection/OrderedCollectionPage responses rather than duplicating inverse relationship fields.
- [x] ✅ Carrier/content separation is non-negotiable. `HumanMadeObject`/`DigitalObject` must remain distinct from `VisualItem`/`LinguisticObject`.
- [x] ✅ Authority-backed classification UX. Curatorial/classification input paths must use controlled authority sources (AAT/ULAN/Wikidata equivalents), not free-text categories by default.
- [x] ✅ Data discovery signposting. Public record HTML pages expose a single canonical `describedby` link to the Linked Art JSON-LD record.
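
The HAL/data separation gate in the list above can be pictured as a boundary function plus a leak detector: navigation metadata under `_links` is peeled off the top-level document, and the remaining semantic payload is scanned for any nested `_links` key. This is a minimal sketch with hypothetical names (`splitHalEnvelope`, `findLeakedLinks`); the project's own enforcement sits in its protocol conformance tests.

```typescript
type JsonObjectSketch = { [key: string]: unknown };

// Separate non-semantic HAL navigation from the semantic graph payload.
function splitHalEnvelope(doc: JsonObjectSketch): {
  semantic: JsonObjectSketch;
  links: JsonObjectSketch;
} {
  const { _links, ...semantic } = doc;
  const links =
    typeof _links === "object" && _links !== null && !Array.isArray(_links)
      ? (_links as JsonObjectSketch)
      : {};
  return { semantic, links };
}

// Report JSON paths of any `_links` key found inside a payload.
function findLeakedLinks(value: unknown, path = "$"): string[] {
  if (Array.isArray(value)) {
    return value.flatMap((item, index) => findLeakedLinks(item, `${path}[${index}]`));
  }
  if (typeof value !== "object" || value === null) return [];
  const leaks: string[] = [];
  for (const [key, child] of Object.entries(value)) {
    const childPath = `${path}.${key}`;
    if (key === "_links") {
      leaks.push(childPath); // navigation metadata inside the semantic graph
    } else {
      leaks.push(...findLeakedLinks(child, childPath));
    }
  }
  return leaks;
}
```

A conformance test could then assert that `findLeakedLinks(splitHalEnvelope(body).semantic)` is empty for every public API payload.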
Verification note (May 31, 2026):
- The first nine cross-cutting gates above are marked complete based on current enforcement in CI/tests and live implementation patterns (provider boundary checks, contract leaf structure, `_source.raw` invariants, rights surfaces, Linked Art boundary mapping, interactive state handling, and keyboard/focus coverage).
- Additional governance/protocol gates are marked complete where enforced by executable tests (`protocol-conformance`, `provider-protocol-conformance`, `hal-search-relations-conformance`, `provider-digital-content-gates`) and PR governance checks (`execution-policy-gates`, PR template standards mapping requirements).
- Any gates still unchecked in this section are unchecked only because their scope is deferred to a future era; the cross-cutting gate set above is now fully enforced in current Era A/B surfaces.
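
To make the HAL/data separation gate concrete, here is a minimal sketch of keeping navigation metadata in `_links` while leaving the semantic payload untouched. The type and function names (`HalLink`, `withHalLinks`, `stripHalLinks`) are hypothetical illustrations, not the project's actual helpers.

```typescript
// Hypothetical types for illustration; not taken from the Meta Museum codebase.
type HalLink = { href: string; title?: string };
type HalLinks = Record<string, HalLink>;

// Attach navigation metadata under `_links` without touching semantic keys.
function withHalLinks<T extends object>(record: T, links: HalLinks): T & { _links: HalLinks } {
  return { ...record, _links: links };
}

// Recover the pure semantic graph payload by dropping `_links`.
function stripHalLinks<T extends { _links?: HalLinks }>(response: T): Omit<T, "_links"> {
  const { _links, ...semantic } = response;
  return semantic;
}

// Example record; the id and link targets are made-up values.
const exampleObject = {
  "@context": "https://linked.art/ns/v1/linked-art.json",
  id: "https://example.org/object/1",
  type: "HumanMadeObject",
  _label: "Example object",
};

const exampleResponse = withHalLinks(exampleObject, {
  self: { href: "https://example.org/object/1" },
});
const exampleSemantic = stripHalLinks(exampleResponse);
```

Because the links are added and removed as one top-level key, a conformance test can compare `stripHalLinks(response)` against the stored record to show that navigation metadata never leaks into the graph payload.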
---
What this roadmap deliberately does NOT do (yet)
To stay honest about scope:
- [x] ✅ No microservice split during Era A. All routes lived in the single Next 16 app; Python services begin in Era B (validation) and Era C (reconciliation, AI).
- [x] ✅ No triple store or vector store in Era A or B. Postgres + JSONB remains sufficient until Era C search/graph patterns are activated.
- [x] ✅ No module-federation for the Era A app. The app remains a single deployable Next.js build.
- [x] ✅ No Arches / Ogee / Zelge adoption planned. Lessons are reused, but curator and ops surfaces remain in-house.
- [x] ✅ No fancy IIIF in Era A. Current Era A/B UI uses provider image URLs; OpenSeadragon remains planned for C3.
- [x] ✅ No NL→SPARQL until the SHACL gate exists (B2 → C4).
- [x] ✅ No Meta Wiki Art write path until contracts, validation, auth, audit log, and Postgres cutover are stable (C5).
Verification note (May 31, 2026):
- Constraints above are verified against current code/routes/dependencies and remain in force for pre-Era-C scope control.
---
What I'd build next, concretely
Era C1 prep while sustaining Era B quality gates:
- [x] ✅ RSI-5: AI evidence drift + citation freshness (Medium-severity remediation) is complete (proven 2026-06-09):
- Owner: Platform + AI Reliability
- Action 1 complete (2026-06-09): `/api/ai/query` now returns citation metadata, coverage, and explicit refusal state, locked by `tests/api/ai-query.test.ts` and `tests/quality/cite-or-refuse-conformance.test.ts`.
- Action 2 complete (2026-06-09): `/api/ai/query` and `/api/ai/chat` now emit `retrievedAt` + `citationFreshness` diagnostics and refuse stale evidence via policy-backed route tests.
- Scope:
- Enforce the same cite-or-refuse behavior beyond `/api/ai/chat` so `/api/ai/query` emits stable evidence metadata and refuses under coverage/freshness thresholds.
- Keep `/api/ai/chat` grounded response semantics while adding the shared freshness guard.
- Acceptance:
- `/api/ai/query` returns `{ answer, citations, coverage, citationFreshness, refusalReason? }` with `entityId`, `propertyPath`, `sourceUrl`, and `retrievedAt` in cited outputs.
- Under-cited or stale-evidence answers return explicit refusal and reason.
- Regression test coverage proves chat/query parity and stale-evidence failure modes.
- Proof packet:
- `tests/api/ai-query.test.ts`, `tests/api/ai-chat.test.ts`, and `tests/quality/cite-or-refuse-conformance.test.ts` cover cited success, coverage refusal, and freshness refusal behavior.
- Close-out packet synchronizes `docs/risk-register.md`, `CLAUDE.md`, `README.md`, and this roadmap with evidence proofs.
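
The RSI-5 acceptance shape can be sketched as a small guard. Field names follow the acceptance criteria above; the policy fields, the refusal reason strings, the way `citationFreshness` is computed, and nulling the answer on refusal are all assumptions made for illustration, not the shipped route logic.

```typescript
// Field names follow the acceptance criteria; everything else is a hypothetical sketch.
interface Citation {
  entityId: string;
  propertyPath: string;
  sourceUrl: string;
  retrievedAt: string; // ISO 8601 timestamp
}

interface QueryResponse {
  answer: string | null;
  citations: Citation[];
  coverage: number; // assumed: share of answer sentences backed by a citation (0..1)
  citationFreshness: number; // assumed: share of citations no older than maxAgeDays (0..1)
  refusalReason?: "insufficient-coverage" | "stale-evidence";
}

// Hypothetical policy values; real thresholds would come from project config.
interface CitePolicy {
  minCoverage: number;
  minFreshness: number;
  maxAgeDays: number;
}

const DAY_MS = 24 * 60 * 60 * 1000;

function citeOrRefuse(
  answer: string,
  citations: Citation[],
  coverage: number,
  policy: CitePolicy,
  now: Date,
): QueryResponse {
  // A citation is fresh when its retrieval time is within the allowed age window.
  const fresh = citations.filter(
    (c) => now.getTime() - Date.parse(c.retrievedAt) <= policy.maxAgeDays * DAY_MS,
  );
  const citationFreshness = citations.length === 0 ? 0 : fresh.length / citations.length;
  const base = { citations, coverage, citationFreshness };

  // Refuse explicitly instead of returning an under-cited or stale answer.
  if (coverage < policy.minCoverage) {
    return { ...base, answer: null, refusalReason: "insufficient-coverage" };
  }
  if (citationFreshness < policy.minFreshness) {
    return { ...base, answer: null, refusalReason: "stale-evidence" };
  }
  return { ...base, answer };
}
```

Sharing one guard like this between the chat and query routes is one way to get the parity the regression tests check for, since both routes would then refuse under identical conditions.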
- [x] ✅ RSI-6: AI eval drift baselines include citation freshness (Medium-severity remediation) is complete (proven 2026-06-09):
- `src/services/ai-eval-harness.ts` now scores `citationFreshness` from actual citation/source timestamps instead of treating retrieval time as always fresh.
- `src/services/ai-eval-regression.ts`, `scripts/ai-eval-gate.ts`, and `config/ai-eval-regression-policy.json` now baseline, persist, print, and fail-fast on `citationFreshnessDrop`.
- `evals/golden-museum-questions.v1.json` and `docs/evals/golden-museum-questions.md` now declare `citationFreshnessThreshold = 0.95`.
- Proof packet: `tests/services/ai-eval-harness.test.ts`, `tests/services/ai-eval-regression.test.ts`, `tests/services/ai-eval-artifacts.test.ts`, `tests/quality/ai-eval-golden-dataset.test.ts`, plus direct AI eval gate output with `citationFreshness=1` and `citationFreshnessDrop=0`.
- [x] ✅ RSI-7: AI eval summary badges + aging-pressure alerting (Medium-severity remediation) is complete (proven 2026-06-09):
- `src/services/ai-eval-artifacts.ts` now renders `artifacts/evals/summary.md` with status, faithfulness, relevance, citation accuracy, `citationFreshness`, pass-rate badges, artifact links, and alerts.
- `src/services/ai-eval-harness.ts` now persists citation freshness aging (`oldestAgeDays`, `oldestAgeRatio`, `maxAgeDays`) for trend-aware pressure detection.
- `.github/workflows/ai-eval-gate.yml` now appends the summary to `$GITHUB_STEP_SUMMARY` and uploads `artifacts/evals/` for CI inspection.
- Proof packet: `tests/services/ai-eval-artifacts.test.ts`, plus direct AI eval gate output with `summary=artifacts/evals/summary.md`, `citationFreshness=1`, and `oldestAgeRatio=0`.
- [x] ✅ RSI-8: AI eval artifact dashboard visibility (Medium-severity remediation) is complete (proven 2026-06-09):
- `src/services/ai-eval-dashboard.ts` now loads ignored local eval artifacts, latest metrics, trend runs, artifact timestamps, and freshness-aging warnings with explicit empty/partial states.
- `app/(workspace)/ai-evals/page.tsx` now provides a read-only app dashboard for latest eval summary, trend index, freshness-aging state, and active warnings.
- Navigation now exposes the dashboard from the workspace sidebar, primary Workspace menu, and footer.
- Proof packet: `tests/services/ai-eval-dashboard.test.ts` and `tests/pages/ai-evals-page.test.ts`, plus focused system Node test output passing 3 tests.
- [x] ✅ RSI-9: latest-vs-previous AI eval artifact diff (Medium-severity remediation) is complete (proven 2026-06-09):
- `src/services/ai-eval-dashboard.ts` now computes latest-vs-previous metric deltas, freshness-aging pressure deltas, status changes, prompt-count changes, identity changes, and compact “what changed” notes.
- `app/(workspace)/ai-evals/page.tsx` now renders a “Latest vs previous run” review section with metric arrows, freshness pressure movement, and fast-review notes.
- Proof packet: `tests/services/ai-eval-dashboard.test.ts` and `tests/pages/ai-evals-page.test.ts`, plus focused system Node test output passing 3 tests.
- [x] ✅ RSI-10: severity-labeled AI eval diff triage (Medium-severity remediation) is complete (proven 2026-06-09):
- `src/services/ai-eval-dashboard.ts` now classifies metric deltas, freshness-aging pressure movement, and overall latest-vs-previous diff priority as `regression`, `watch`, `stable`, or `improved`.
- `app/(workspace)/ai-evals/page.tsx` now renders priority labels and visible metric/aging threshold pills so review starts with severity instead of raw deltas only.
- Proof packet: `tests/services/ai-eval-dashboard.test.ts` and `tests/pages/ai-evals-page.test.ts`, plus focused system Node output passing 3 tests, `pnpm test` passing 787/249, `pnpm lint` passing with one existing warning, and production Next build passing via system Node.
- [x] ✅ RSI-11: policy-driven AI eval priority visibility (Medium-severity remediation) is complete (proven 2026-06-09):
- `config/ai-eval-regression-policy.json` now owns diff severity thresholds consumed by dashboard and CI summary generation.
- `app/(workspace)/ai-evals/page.tsx` now shows last-N severity distribution across `regression`, `watch`, `stable`, and `improved` comparisons.
- `artifacts/evals/summary.md` now includes CI-visible review priority when at least two retained eval runs exist.
- Proof packet: `tests/services/ai-eval-dashboard.test.ts`, `tests/services/ai-eval-artifacts.test.ts`, and `tests/pages/ai-evals-page.test.ts`, plus focused system Node output passing 8 tests, `pnpm test` passing 788/249, `pnpm lint` passing with one existing warning, AI eval gate printing `reviewPriority: stable`, and production Next build passing via system Node.
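
As a rough illustration of policy-driven severity labels, the sketch below classifies a metric delta as `regression`, `watch`, `stable`, or `improved` and picks an overall priority. The threshold field names and the ordering rule are assumptions; the actual `diffSeverityPolicy` shape in `config/ai-eval-regression-policy.json` is not shown in this roadmap.

```typescript
type DiffSeverity = "regression" | "watch" | "stable" | "improved";

// Hypothetical threshold shape, for illustration only.
interface SeverityThresholds {
  regressionDrop: number; // a drop at or beyond this is a regression
  watchDrop: number; // a smaller drop that still deserves attention
  improvedGain: number; // a gain at or beyond this counts as improved
}

// Reject malformed policy early so misordered thresholds cannot mislabel runs.
function validateThresholds(t: SeverityThresholds): void {
  if (!(t.regressionDrop > t.watchDrop && t.watchDrop > 0 && t.improvedGain > 0)) {
    throw new Error("thresholds must satisfy regressionDrop > watchDrop > 0 and improvedGain > 0");
  }
}

function classifyDelta(previous: number, latest: number, t: SeverityThresholds): DiffSeverity {
  const delta = latest - previous;
  if (delta <= -t.regressionDrop) return "regression";
  if (delta <= -t.watchDrop) return "watch";
  if (delta >= t.improvedGain) return "improved";
  return "stable";
}

// Overall priority is the most urgent label present (assumed ordering).
const SEVERITY_ORDER: DiffSeverity[] = ["regression", "watch", "stable", "improved"];

function overallPriority(severities: DiffSeverity[]): DiffSeverity {
  for (const level of SEVERITY_ORDER) {
    if (severities.indexOf(level) !== -1) return level;
  }
  return "stable"; // no comparisons available
}
```

Keeping the thresholds in one policy object means the dashboard and the CI summary can call the same classifier and cannot disagree about what counts as a regression.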
- [x] ✅ RSI-12: AI eval agent-summary reliability (Medium-severity remediation) is complete (proven 2026-06-09):
- `src/services/ai-eval-severity.ts` now validates `diffSeverityPolicy` shape/order and fails malformed policy with explicit errors.
- `/api/ai-evals/summary` now exposes agent-ready JSON with latest run, review priority, severity distribution, severity history, alerts, and artifact links.
- `app/(workspace)/ai-evals/page.tsx` now renders compact severity sparkline and latest-vs-previous comparison history for fast trend review.
- Proof packet: `tests/services/ai-eval-severity.test.ts`, `tests/api/ai-evals-summary.test.ts`, `tests/services/ai-eval-dashboard.test.ts`, `tests/services/ai-eval-artifacts.test.ts`, and `tests/pages/ai-evals-page.test.ts`, plus focused system Node output passing 13 tests, `pnpm test` passing 793/251, `pnpm lint` passing with one existing warning, AI eval gate printing `reviewPriority: stable`, and production Next build confirming `/api/ai-evals/summary`.
- [x] ✅ RSI-13: AI eval contract + CI annotation reliability (Medium-severity remediation) is complete (proven 2026-06-09):
- `src/contracts/zod/ai-eval-summary.ts` now defines the reusable agent JSON contract and OpenAPI schema for `/api/ai-evals/summary`.
- `config/ai-eval-regression-policy.json` now owns `severityHistoryPolicy.maxComparisons`, which drives dashboard distribution/history and agent JSON window metadata.
- `scripts/ai-eval-gate.ts` now emits a GitHub PR warning annotation when review priority is `watch` or `regression`, with workflow path filters covering `/api/ai-evals`, policy, and contract changes.
- Proof packet: `tests/api/ai-evals-summary.test.ts`, `tests/api/openapi.test.ts`, `tests/services/ai-eval-severity.test.ts`, `tests/services/ai-eval-dashboard.test.ts`, `tests/services/ai-eval-artifacts.test.ts`, and `tests/pages/ai-evals-page.test.ts`, plus focused system Node output passing 18 tests, `pnpm test` passing 796/251, `pnpm lint` passing with one existing warning, AI eval gate printing `reviewPriority: stable`, and production Next build confirming `/api/ai-evals/summary`.
- [x] ✅ RSI-14: AI eval version/pruning hygiene (Medium-severity remediation) is complete (proven 2026-06-09):
- `/api/ai-evals/summary` now returns `schemaVersion: 1`, and `aiEvalSummaryResponseSchema` rejects unsupported versions before agents consume the payload.
- `docs/evals/golden-museum-questions.md` publishes exact GitHub CI annotation examples for `regression` and `watch`, with snapshot assertions keeping docs and formatter output aligned.
- `src/services/ai-eval-artifacts.ts` now prunes orphaned run JSON outside the retained trend window while preserving retained run files and non-JSON notes.
- Proof packet: `tests/api/ai-evals-summary.test.ts`, `tests/api/openapi.test.ts`, and `tests/services/ai-eval-artifacts.test.ts`, plus focused system Node output passing 22 tests, `pnpm test` passing 800/251, `pnpm lint` passing with one existing warning, AI eval gate printing `reviewPriority: stable`, and production Next build confirming `/api/ai-evals/summary`.
- [x] ✅ RSI-15: AI eval migration/reporting visibility (Medium-severity remediation) is complete (proven 2026-06-09):
- `src/contracts/zod/ai-eval-summary.ts` now includes future `schemaVersion: 2` migration notes, with v1 accepted and planned v2 rejected through fixture compatibility tests.
- `src/services/ai-eval-artifacts.ts` now returns a retention pruning report with `delete`/`dry-run` mode and retained/orphaned/deleted/preserved file counts.
- `artifacts/evals/summary.md` now surfaces latest CI annotation status and retention pruning status for PR/build reviewers.
- Proof packet: `tests/api/ai-evals-summary.test.ts`, `tests/services/ai-eval-artifacts.test.ts`, and `tests/fixtures/ai-eval-summary/*`, plus focused system Node output passing 24 tests, `pnpm test` passing 802/251, `pnpm lint` passing with one existing warning, AI eval gate printing `reviewPriority: stable`, and production Next build confirming `/api/ai-evals/summary`.
- [x] ✅ RSI-16: AI eval summary artifact/schema visibility (Medium-severity remediation) is complete (proven 2026-06-09):
- `tests/fixtures/ai-eval-summary/summary-snapshot.md` now locks the generated CI summary markdown, proving `summary.md` stays reviewable and stable.
- `/api/ai-evals/summary` now exposes the latest `retentionPruneReport` for agents, including delete/dry-run mode and retained/orphaned/deleted/preserved counts.
- `/api/openapi` now includes the AI eval summary schema migration compatibility table so future `schemaVersion` upgrades are discoverable from the contract surface.
- Proof packet: `tests/api/ai-evals-summary.test.ts`, `tests/api/openapi.test.ts`, `tests/services/ai-eval-artifacts.test.ts`, `tests/services/ai-eval-dashboard.test.ts`, and `tests/fixtures/ai-eval-summary/summary-snapshot.md`, plus focused system Node output passing 25 tests, `pnpm test` passing 803/251, `pnpm lint` passing with one existing warning, AI eval gate printing `reviewPriority: stable`, and production Next build confirming `/api/ai-evals/summary`.
- [x] ✅ RSI-17: Visual ETL Mapper AI-assist safety (Medium-severity remediation) is complete (proven 2026-06-09):
- `/api/ai/mapping-assist` now returns review-ready, contract-valid `MappingTemplate` drafts from source columns with confidence, rationale, standards anchors, and unmapped-column diagnostics.
- `/etl/mapper` now exposes a "Suggest mapping with AI" action while keeping generated mappings review-only before any ingestion activation.
- Unknown columns are surfaced as diagnostics instead of invented mappings.
- Proof packet: `tests/services/mapping-assist.test.ts`, `tests/api/ai-mapping-assist.test.ts`, `tests/components/etl-mapper-config.test.ts`, and `tests/api/openapi.test.ts`, plus focused system Node output passing 7 mapper-assist tests and 2 OpenAPI tests, `pnpm test` passing 809/253, `pnpm lint` passing with one existing warning, AI eval gate printing `reviewPriority: stable`, and production Next build confirming `/api/ai/mapping-assist`.
- [x] ✅ RSI-18: mapper-assist fixture/schema/importability hardening (Medium-severity remediation) is complete (proven 2026-06-09):
- `tests/fixtures/mapping-assist/tricky-columns.json` now locks tricky rights/credit/sensitive columns so mapper assist cannot invent unsafe Linked Art target paths.
- `src/utils/etl-mapper-assist.ts` and `/etl/mapper` now support importing returned suggestions as reviewable ReactFlow draft nodes/edges.
- `/api/openapi` now exposes `MappingAssistResponse` and references it from `/api/ai/mapping-assist`.
- Proof packet: `tests/services/mapping-assist.test.ts`, `tests/utils/etl-mapper-assist.test.ts`, `tests/components/etl-mapper-config.test.ts`, `tests/api/openapi.test.ts`, and `tests/api/ai-mapping-assist.test.ts`, plus focused system Node output passing 11 tests, `pnpm test` passing 811/254, `pnpm lint` passing with one existing warning, AI eval gate printing `reviewPriority: stable`, and production Next build confirming `/api/ai/mapping-assist`.
- [x] ✅ RSI-19: provider-family/browser/request-schema mapper hardening (Medium-severity remediation) is complete (proven 2026-06-09):
- `tests/fixtures/mapping-assist/provider-families.json` now covers Met, Getty, and Rijks-style mapper columns with allowed-path and must-stay-unmapped assertions.
- `scripts/smoke-etl-mapper-assist.ts` and `pnpm smoke:etl:mapper-assist` now provide a Playwright browser smoke for `/etl/mapper` assist generation plus draft import.
- `/api/openapi` now exposes `MappingAssistRequest` and attaches it to the `/api/ai/mapping-assist` POST request body.
- Proof packet: `tests/services/mapping-assist.test.ts`, `tests/scripts/etl-mapper-smoke-script.test.ts`, and `tests/api/openapi.test.ts`, plus focused system Node output passing 7 tests, `pnpm smoke:etl:mapper-assist` passing against `http://localhost:3001/en`, `pnpm test` passing 813/255, `pnpm lint` passing with one existing warning, AI eval gate printing `reviewPriority: stable`, and production Next build confirming `/api/ai/mapping-assist`.
- [x] ✅ RSI-20: negative mapper fixtures, visual screenshot, and API-doc examples (Medium-severity remediation) is complete (proven 2026-06-09):
- `tests/fixtures/mapping-assist/negative-provider-families.json` now locks near-miss Met/Getty/Rijks columns so rights, credit, restriction, sensitivity, donor, and flag wording cannot trigger unsafe suggestions.
- `scripts/smoke-etl-mapper-assist.ts` now waits on the mapping-assist POST, imports the draft, and writes `artifacts/smoke/etl-mapper-assist-imported.png` with animations disabled.
- `/api/docs` now includes concrete mapping-assist request and response examples next to the Swagger UI entry point.
- Proof packet: `tests/services/mapping-assist.test.ts`, `tests/scripts/etl-mapper-smoke-script.test.ts`, and `tests/api/docs.test.ts`, plus focused system Node output passing 11 tests, `pnpm smoke:etl:mapper-assist` passing against `http://localhost:3001/en`, `pnpm test` passing 814/255, `pnpm lint` passing with one existing warning, AI eval gate printing `reviewPriority: stable`, and production Next build passing.
- [x] ✅ RSI-21: mapper layout, OpenAPI-sourced docs examples, and confidence policy (Medium-severity remediation) is complete (proven 2026-06-09):
- `app/globals.css` now gives mapper actions a full-width wrapped row with no button overlap in the refreshed screenshot.
- `app/api/docs/route.ts` now renders mapping-assist request/response examples from `/api/openapi` instead of duplicating static JSON.
- `src/services/mapping-assist.ts` now applies a minimum confidence policy so lower-confidence accession/place/description patterns stay diagnostics-only.
- Proof packet: `tests/components/etl-mapper-config.test.ts`, `tests/api/docs.test.ts`, and `tests/services/mapping-assist.test.ts`, plus focused system Node output passing 15 tests, `pnpm smoke:etl:mapper-assist` refreshing `artifacts/smoke/etl-mapper-assist-imported.png`, `pnpm test` passing 817/255, `pnpm lint` passing with one existing warning, AI eval gate printing `reviewPriority: stable`, and production Next build passing.
- [x] ✅ RSI-22: public source narrative and trust uplift (Medium-severity remediation) is complete (proven 2026-06-09):
- `app/datasets/page.tsx` now presents a public source-network narrative backed by `getProviderCapabilities`, not copied prototype constants.
- `app/about/page.tsx` and `app/projects/page.tsx` now migrate the sibling `meta-museum-art` mission/project surfaces into native Next pages while replacing coming-soon/static project cards with live source-network, Linked Art workbench, and Meta Wiki Art workflow links.
- `src/services/site-metadata.ts` centralizes OpenGraph/Twitter metadata with a reviewed local image, and footer navigation exposes Contact/Privacy/Terms plus asset provenance.
- `docs/asset-provenance.md` tracks all preserved sibling visual assets with SHA-256 hashes while marking placeholder thumbnails as excluded from public use and keeping copied legal text out.
- Proof packet: `tests/services/public-source-narrative.test.ts`, `tests/pages/public-source-pages.test.ts`, focused RSI-22 test output passing 6/2 plus follow-up `pnpm test -- tests/pages/public-source-pages.test.ts` passing 851/267 on 2026-06-09, `pnpm lint` passing with one existing warning, `pnpm build` passing, and screenshot proof at `artifacts/smoke/datasets-page.png`.
- [x] ✅ RSI-23: public-source agent API and trust smoke hardening (Medium-severity remediation) is complete (proven 2026-06-09):
- `/api/public-sources/summary` now exposes schema-versioned agent JSON for source stats, provider capability flags, and imported asset provenance.
- `docs/asset-provenance.md` and `src/contracts/zod/public-sources-summary.ts` now enforce explicit license-review statuses so unknown asset rows fail.
- `pnpm smoke:public-trust` captures browser screenshots for `/datasets`, `/contact`, `/privacy`, and `/terms`; the nested `meta-museum-art` prototype copy was removed after all images were preserved and inventoried.
- Proof packet: `tests/api/public-sources-summary.test.ts`, `tests/services/public-source-narrative.test.ts`, `tests/scripts/public-trust-smoke-script.test.ts`, focused RSI-23 output passing 6/3, `pnpm smoke:public-trust` passing against `http://localhost:3001`, `pnpm test` passing 827/259, `pnpm lint` passing with one existing warning, and `pnpm build` passing.
- [x] ✅ RSI-24: OpenAPI, checksum drift, and screenshot retention hardening (Medium-severity remediation) is complete (proven 2026-06-09):
- `/api/openapi` now includes `PublicSourcesSummaryResponse` and references it from `/api/public-sources/summary`.
- Imported public-source assets now fail tests when their SHA-256 hashes drift from `docs/asset-provenance.md`.
- `pnpm smoke:public-trust` now writes timestamped screenshot runs, latest copies, previous-run links, a summary JSON, and prunes old runs outside the retention window.
- Proof packet: `tests/api/openapi.test.ts`, `tests/services/public-source-narrative.test.ts`, `tests/services/public-trust-smoke-artifacts.test.ts`, `tests/scripts/public-trust-smoke-script.test.ts`, focused RSI-24 output passing 6/5, `pnpm smoke:public-trust` passing against `http://localhost:3001`, `pnpm test` passing 828/260, `pnpm lint` passing with one existing warning, and `pnpm build` passing.
- [x] ✅ RSI-25: public trust docs, diff metadata, and CI artifact visibility (Medium-severity remediation) is complete (proven 2026-06-09):
- `/api/docs` now renders the `/api/public-sources/summary` response example from `/api/openapi`, keeping examples single-source for humans and agents.
- Public trust smoke artifacts now include latest-vs-previous checksum/byte diff metadata per screenshot plus CI-ready summary markdown.
- `.github/workflows/public-trust-smoke.yml` now builds, runs `pnpm smoke:public-trust`, appends public trust summary links to `$GITHUB_STEP_SUMMARY`, and uploads `public-trust-smoke-artifacts`.
- Proof packet: `tests/api/docs.test.ts`, `tests/api/openapi.test.ts`, `tests/services/public-trust-smoke-artifacts.test.ts`, `tests/services/public-trust-smoke-ci-summary.test.ts`, `tests/scripts/public-trust-smoke-script.test.ts`, `tests/scripts/public-trust-ci-workflow.test.ts`, focused RSI-25 output passing 12/10, `pnpm smoke:public-trust` passing against temporary `http://localhost:3001`, `pnpm test` passing 830/262, `pnpm lint` passing with one existing warning, and `pnpm build` passing.
- [x] ✅ RSI-26: pixel-diff thresholds, public trust summary API, and main CI artifact links (Medium-severity remediation) is complete (proven 2026-06-09):
- `pnpm smoke:public-trust` now decodes PNG screenshots, computes changed-pixel ratios, and fails when `PUBLIC_TRUST_SCREENSHOT_PIXEL_DIFF_THRESHOLD` is exceeded.
- `/api/public-trust/summary` now exposes schema-versioned latest public trust smoke artifacts and pixel-diff status for agents.
- `.github/workflows/ci.yml` now runs public trust smoke, appends summary links to `$GITHUB_STEP_SUMMARY`, and uploads `public-trust-smoke-artifacts`.
- Proof packet: `tests/services/public-trust-smoke-artifacts.test.ts`, `tests/api/public-trust-summary.test.ts`, `tests/api/openapi.test.ts`, `tests/scripts/public-trust-smoke-script.test.ts`, `tests/scripts/public-trust-ci-workflow.test.ts`, focused RSI-26 output passing 10/7, `pnpm smoke:public-trust` passing with `unchanged=4` and `pixel failures=0`, `pnpm test` passing 834/263, `pnpm lint` passing with one existing warning, and `pnpm build` passing.
- [x] ✅ RSI-27: public trust per-page thresholds, OpenAPI docs example, and retention badge (Medium-severity remediation) is complete (proven 2026-06-09):
- Public trust smoke now applies stricter `/datasets` pixel drift policy (`0.005`) than Contact/Privacy/Terms legal pages (`0.02`).
- `/api/docs` now renders the `/api/public-trust/summary` response example from `/api/openapi`.
- Public trust CI summary now includes a retention badge snapshot-locked by `tests/fixtures/public-trust-summary/summary-snapshot.md`.
- Proof packet: `tests/services/public-trust-smoke-artifacts.test.ts`, `tests/scripts/public-trust-smoke-script.test.ts`, `tests/services/public-trust-smoke-ci-summary.test.ts`, `tests/fixtures/public-trust-summary/summary-snapshot.md`, `tests/api/docs.test.ts`, `tests/api/openapi.test.ts`, focused RSI-27 output passing 13/9, `pnpm smoke:public-trust` passing with per-page thresholds, `unchanged=4`, and `pixel failures=0`, `pnpm test` passing 835/263, `pnpm lint` passing with one existing warning, and `pnpm build` passing.
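
The per-page drift policy can be sketched as a changed-pixel ratio compared against a route-specific threshold. The threshold values mirror the ones stated above; the lookup shape, status names, and function names are hypothetical, and the sketch assumes the screenshots have already been decoded into same-sized RGBA buffers.

```typescript
// Threshold values come from the RSI-27 notes; the lookup shape is illustrative.
const PIXEL_DIFF_THRESHOLDS: Record<string, number> = {
  "/datasets": 0.005,
  "/contact": 0.02,
  "/privacy": 0.02,
  "/terms": 0.02,
};

// Ratio of pixels whose RGBA bytes differ between two decoded screenshots.
function changedPixelRatio(previous: Uint8Array, latest: Uint8Array): number {
  if (previous.length !== latest.length || previous.length % 4 !== 0) {
    throw new Error("screenshots must be same-sized RGBA buffers");
  }
  const pixels = previous.length / 4;
  if (pixels === 0) return 0;
  let changed = 0;
  for (let i = 0; i < previous.length; i += 4) {
    if (
      previous[i] !== latest[i] ||
      previous[i + 1] !== latest[i + 1] ||
      previous[i + 2] !== latest[i + 2] ||
      previous[i + 3] !== latest[i + 3]
    ) {
      changed++;
    }
  }
  return changed / pixels;
}

type DriftStatus = "unchanged" | "changed-under-threshold" | "failed";

// A route fails only when its ratio exceeds the configured threshold.
function evaluateDrift(route: string, ratio: number): DriftStatus {
  const threshold = PIXEL_DIFF_THRESHOLDS[route];
  if (threshold === undefined) throw new Error(`no threshold configured for ${route}`);
  if (ratio === 0) return "unchanged";
  return ratio > threshold ? "failed" : "changed-under-threshold";
}
```

With these values, the same 1% drift fails on `/datasets` but passes on the legal pages, which is the intent of giving the data-bearing page a stricter policy.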
- [x] ✅ RSI-28: JSON public trust threshold policy, agent-visible policy, and CI drift annotations (Medium-severity remediation) is complete (proven 2026-06-09):
- Public trust smoke policy is now persisted in `config/public-trust-smoke-policy.json` and consumed by `scripts/smoke-public-trust-pages.ts`.
- `/api/public-trust/summary` now exposes the applied threshold policy for agents alongside latest smoke artifacts.
- CI summary tooling now emits warning annotations when screenshots change but remain under their configured threshold.
- Proof packet: `config/public-trust-smoke-policy.json`, `src/services/public-trust-smoke-policy.ts`, `scripts/smoke-public-trust-pages.ts`, `src/services/public-trust-summary.ts`, `src/contracts/zod/public-trust-summary.ts`, `src/services/public-trust-smoke-ci-summary.ts`, `tests/services/public-trust-smoke-policy.test.ts`, `tests/scripts/public-trust-smoke-script.test.ts`, `tests/api/public-trust-summary.test.ts`, `tests/services/public-trust-smoke-ci-summary.test.ts`, `tests/scripts/public-trust-ci-workflow.test.ts`, focused RSI-28 output passing 20/12, `pnpm smoke:public-trust` passing with JSON policy thresholds, `unchanged=4`, and `pixel failures=0`, `pnpm test` passing 838/264, `pnpm lint` passing with one existing warning, and `pnpm build` passing.
- [x] ✅ RSI-29: public trust reviewer rationale, schema rejection, and severity-grouped PR summaries (Medium-severity remediation) is complete (proven 2026-06-09):
- Public trust policy pages now carry `reviewSeverity`, `reviewerNote`, and `reasonCodes`.
- `/api/public-trust/summary` now exposes reviewer rationale metadata in the applied threshold policy for agents.
- Public trust CI summaries and warning annotations now group under-threshold visual drift by high/medium/low route severity.
- Proof packet: `config/public-trust-smoke-policy.json`, `src/services/public-trust-smoke-policy.ts`, `src/services/public-trust-summary.ts`, `src/contracts/zod/public-trust-summary.ts`, `scripts/smoke-public-trust-pages.ts`, `src/services/public-trust-smoke-ci-summary.ts`, `tests/services/public-trust-smoke-policy.test.ts`, `tests/api/public-trust-summary.test.ts`, `tests/services/public-trust-smoke-ci-summary.test.ts`, `tests/fixtures/public-trust-summary/summary-snapshot.md`, focused RSI-29 output passing 18/11 plus CI-summary focused retest 2/1, `pnpm exec start-server-and-test "pnpm dev" http://localhost:3000 "pnpm smoke:public-trust"` passing with JSON policy metadata and `pixel failures=0`, `pnpm test` passing 839/264, `pnpm lint` passing with one existing warning, and `pnpm build` passing.
- [x] ✅ RSI-30: owner/reviewer initials, schema v2 fixture, and grouped annotation snapshot (Medium-severity remediation) is complete (proven 2026-06-09):
- Public trust policy pages now carry `ownerInitials` and `reviewerInitials` alongside severity, notes, and reason codes.
- `/api/public-trust/summary` and CI drift summary rows now expose owner/reviewer initials for fast review routing.
- Planned `schemaVersion: 2` policy fixture and grouped warning annotation snapshot are now committed as executable contract evidence.
- Proof packet: `config/public-trust-smoke-policy.json`, `src/services/public-trust-smoke-policy.ts`, `src/services/public-trust-summary.ts`, `src/contracts/zod/public-trust-summary.ts`, `src/services/public-trust-smoke-ci-summary.ts`, `tests/fixtures/public-trust-policy/schema-v2-planned.json`, `tests/fixtures/public-trust-summary/grouped-annotations-snapshot.txt`, `tests/services/public-trust-smoke-policy.test.ts`, `tests/api/public-trust-summary.test.ts`, `tests/services/public-trust-smoke-ci-summary.test.ts`, focused RSI-30 tests passing 15/10, `pnpm exec start-server-and-test "pnpm dev" http://localhost:3000 "pnpm smoke:public-trust"` passing with `unchanged=4` and `pixel failures=0`, isolated transient `tests/api/artworks/by-id.test.ts` retest passing, final `pnpm test` passing 839/264, `pnpm lint` passing with one existing warning, and `pnpm build` passing.
- [x] ✅ RSI-1: Provider/pipeline boundary drift hardening (High-severity remediation) is closed and proven:
- boundary contract test passes with allowed shared imports only,
- full `pnpm test` + `pnpm lint` + `pnpm build` cycle passed in-cycle,
- close-out evidence synchronized in `CLAUDE.md`, `README.md`, and this roadmap.
- [x] ✅ RSI-2: UI journey automation scope (Medium-severity remediation) is closed and proven:
- matrix automation now spans role/provider combinations (`pnpm smoke:explore:matrix`), and `/api/objects`, `/api/works`, `/api/agents`, `/api/places`, `/api/sets` route assertions verify positive + negative paths for imported records,
- smoke probe evidence and status updates are synchronized in `CLAUDE.md`, this roadmap, and `README.md`.
- [x] ✅ RSI-3: Single-file and process-complexity reduction (Low-severity remediation) is complete (proven 2026-06-09):
- Owner map and top-complexity target inventory are now finalized in `docs/risk-register.md` (Action 1 complete).
- Action 2 is complete: `src/services/publish-queue-worker.ts` split into helper modules with existing tests preserved.
- Action 3 is complete: `src/services/issues.ts` split into focused modules under `src/services/issues/` with behavior preserved.
- Action 4 is complete: `src/services/outbox.ts` split into focused modules under `src/services/outbox/` with behavior preserved.
- Action 5 is complete: `src/services/reconciliation.ts` split into focused modules under `src/services/reconciliation/` with behavior preserved.
- Action 6 is complete: `src/services/wiki-publish.ts` split into focused modules under `src/services/wiki-publish/` with behavior preserved.
- Action 7 is complete: `src/services/monitoring-telemetry.ts` split into focused modules under `src/services/monitoring-telemetry/` with behavior preserved.
- Proof packet: `pnpm test`, `pnpm lint`, and `pnpm build` for the closeout cycle, plus synchronized updates in `CLAUDE.md`, `docs/roadmap.md`, and `README.md` with this evidence path.
- Action 8 is complete: `scripts/authority-cache-refresh.ts` split into focused modules under `scripts/authority-cache-refresh/` with behavior preserved.
- Proof packet: `pnpm test`, `pnpm lint`, and `pnpm build` with synchronized updates in `CLAUDE.md`, `docs/roadmap.md`, and `README.md`.
- Action 9 is complete: `src/services/ai-layer.ts` split into focused modules under `src/services/ai-layer/` with API preserved in the facade.
- Proof packet: `pnpm test`, `pnpm lint`, and `pnpm build` all passed, with close-out updates synchronized in `CLAUDE.md`, `README.md`, and `docs/risk-register.md`.
- [x] ✅ RSI-4: Chat grounding and citation enforcement (Medium-severity remediation) is complete (proven 2026-06-09):
- `/api/ai/chat` now returns sentence-level grounded citations (`[entityId, propertyPath]`) and refuses output when citation coverage is incomplete.
- Evidence is implemented in `app/api/ai/chat/route.ts` and `src/services/ai-chat.ts`.
- Proof packet: `tests/api/ai-chat.test.ts`, `pnpm test` (full suite), `pnpm lint`, and `pnpm build`; close-out updates synchronized in `CLAUDE.md`, `README.md`, and `docs/risk-register.md`.
- [x] ✅ Expand HAL `_links` discoverability coverage from representative routes to all public entity-role routes as they land (enforced through protocol + provider conformance suites and `hal-entity-discoverability-conformance` quality checks).
- [x] ✅ Add search-relation conformance breadth tests across additional provider search endpoints (beyond representative NGA/RKD/facade checks) with pagination drift assertions.
- Evidence: `tests/quality/hal-search-relations-conformance.test.ts` now validates relation and pagination behavior for Louvre, Harvard, Smithsonian, V&A, Princeton, Europeana, AIC, CMA, and Rijks routes in addition to existing representative checks.
- [x] ✅ Keep execution-policy gates strict: no provider/validation merges without standards mapping + fixture anchors, and no protocol-affecting merges without conformance coverage.
- Evidence: `tests/quality/execution-policy-gates.test.ts`, `.github/pull_request_template.md`.
Suggested branch name (used): `codex/era-c1-hal-search-conformance-breadth`.