Risk Register
Current risk posture
| Severity | Gap | Impact | Status |
|---|---|---|---|
| High | Complex provider and pipeline boundaries | Potential contract drift if adapter boundaries weaken | Closed — proven by RSI-1 on 2026-06-09 |
| Medium | UI journey automation scope | Hidden regressions outside current smoke paths | Closed — proven by RSI-2 on 2026-06-09 |
| Low | Single file and process complexity | Long-tail maintainability cost | Closed — proven by RSI-3 action 9 on 2026-06-09 |
| Medium | Chat grounding and citation enforcement | Ungrounded `/api/ai/chat` responses could leak hallucinated claims | Closed — proven by RSI-4 on 2026-06-09 |
| Medium | AI evidence freshness and citation drift | AI outputs can become stale or under-cited as retrieval/model behavior shifts | Closed — proven by RSI-5 Action 2 on 2026-06-09 |
| Medium | AI eval drift baselines omit citation freshness | Model/prompt regression gates could miss stale-evidence degradation | Closed — proven by RSI-6 on 2026-06-09 |
| Medium | AI eval artifact visibility and freshness-aging pressure | CI/pass status can hide stale-evidence aging pressure or make eval artifacts hard to inspect | Closed — proven by RSI-7 on 2026-06-09 |
| Medium | AI eval artifact dashboard visibility | Operators still need an in-app view of latest/trend eval artifacts and aging alerts | Closed — proven by RSI-8 on 2026-06-09 |
| Medium | AI eval artifact diff review speed | Operators can miss meaningful latest-vs-previous eval changes when raw artifacts are inspected manually | Closed — proven by RSI-9 on 2026-06-09 |
| Medium | AI eval diff prioritization | Raw deltas can obscure review priority when metric drift and evidence-aging pressure compete | Closed — proven by RSI-10 on 2026-06-09 |
| Medium | AI eval priority discoverability | Review severity can drift from policy or remain hidden unless operators open raw artifacts | Closed — proven by RSI-11 on 2026-06-09 |
| Medium | AI eval agent-summary reliability | Agents need validated policy and compact JSON/history instead of scraping UI or raw artifacts | Closed — proven by RSI-12 on 2026-06-09 |
| Medium | AI eval contract and CI annotation reliability | Agents and PR reviewers need schema-backed summary payloads, configurable history windows, and CI-visible priority warnings | Closed — proven by RSI-13 on 2026-06-09 |
| Medium | AI eval version and retention hygiene | Agents need explicit summary-version negotiation and artifact retention must not accumulate orphaned run JSON | Closed — proven by RSI-14 on 2026-06-09 |
| Medium | AI eval migration and reporting visibility | Future summary migrations, retention dry-runs, and CI annotation/pruning status need explicit reviewer evidence | Closed — proven by RSI-15 on 2026-06-09 |
| Medium | AI eval summary artifact/schema visibility | CI summary markdown, agent pruning status, and future migration compatibility must stay contract-visible | Closed — proven by RSI-16 on 2026-06-09 |
| Medium | Visual ETL Mapper AI-assist safety | LLM-assisted mappings could invent unsafe Linked Art paths unless suggestions are contract-validated and review-only | Closed — proven by RSI-17 on 2026-06-09 |
| Medium | Mapper-assist fixture/schema/importability drift | Tricky columns, UI draft import, or OpenAPI response docs could drift after initial mapper-assist launch | Closed — proven by RSI-18 on 2026-06-09 |
| Medium | Mapper-assist provider/browser/request-schema drift | Provider-specific columns, browser import flows, or request-body docs could drift from the mapper-assist contract | Closed — proven by RSI-19 on 2026-06-09 |
| Medium | Mapper-assist near-miss/docs/visual drift | Almost-mappable columns, imported-draft visuals, or human docs examples could drift from safe mapper behavior | Closed — proven by RSI-20 on 2026-06-09 |
| Medium | Mapper-assist layout/example/confidence drift | UI overlap, duplicated docs examples, or low-confidence suggestions could reduce curator trust | Closed — proven by RSI-21 on 2026-06-09 |
| Medium | Public source narrative and trust-page drift | Stale provider stats, untracked prototype assets, or placeholder legal copy could mislead humans and agents | Closed — proven by RSI-22 on 2026-06-09 |
| Medium | Public source summary and trust-smoke drift | Agents or reviewers could miss source/trust regressions if public pages lack contract JSON and screenshot proof | Closed — proven by RSI-23 on 2026-06-09 |
| Medium | Public source contract/artifact drift | Agents could miss public-source schema changes, copied assets could drift, or screenshots could accumulate without latest/previous review context | Closed — proven by RSI-24 on 2026-06-09 |
| Medium | Public trust docs and CI artifact visibility drift | Humans or agents could miss public-source examples, screenshot diffs, or CI artifact links during review | Closed — proven by RSI-25 on 2026-06-09 |
| Medium | Public trust pixel-diff and artifact API drift | Meaningful visual drift or missing CI/API artifact context could escape public trust review | Closed — proven by RSI-26 on 2026-06-09 |
| Medium | Public trust threshold/docs badge drift | Overbroad visual thresholds, undocumented trust examples, or unstable CI summaries could weaken public trust review | Closed — proven by RSI-27 on 2026-06-09 |
| Medium | Public trust policy/API/annotation drift | Hardcoded visual policy, hidden applied thresholds, or silent under-threshold changes could weaken trust review | Closed — proven by RSI-28 on 2026-06-09 |
| Medium | Public trust rationale/schema/summary drift | Reviewer rationale could be hidden, future policy schemas could be consumed unsafely, or CI summaries could obscure severity | Closed — proven by RSI-29 on 2026-06-09 |
| Medium | Public trust ownership/migration/annotation drift | Review ownership could be hidden, future schema migration could lack a fixture, or warning text could drift silently | Closed — proven by RSI-30 on 2026-06-09 |
Use [`docs/closeout-notes.md`](docs/closeout-notes.md) for the ready-to-fill one-click RSI closeout block each cycle.
Remediation slice queue
```text
## [ ] RSI-5 closeout (ready-to-fill)
```
- [x] RSI-1: Provider/pipeline boundary drift hardening (High-severity remediation) — closed (proven)
- Owner: Platform
- Scope: keep adapter boundaries enforceable by contract tests + deterministic governance checks.
- Evidence before close-out (must all pass before slice status can move to done):
- `tests/contracts/provider-boundary-contracts.test.ts` passes and has no unexpected allowed-import exceptions.
- `src/adapters/*.ts` contains no new cross-adapter imports beyond `provider-interface`, `adapter-utils`, or `expansion-provider`, with any change to this exception set documented in this register.
- At least one fresh `pnpm test` run includes the new boundary contract test.
- This row is reflected in the same-cycle `CLAUDE.md` close-out row and synchronized to `README.md` and `docs/roadmap.md` before the next RSI expansion scope.
- Closed proof (2026-06-09):
- `pnpm test`, `pnpm lint`, and `pnpm build` all passed in the same cycle.
- Full close-out sync completed in `CLAUDE.md`, `README.md`, and `docs/roadmap.md`.
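
The cross-adapter import rule in this slice can be sketched as a pure check over import specifiers. This is a minimal illustration, not the logic in `tests/contracts/provider-boundary-contracts.test.ts`: the function name, input shape, and the choice to treat only `./name` specifiers as cross-adapter imports are assumptions. Only the three allowlisted module names come from this register.

```typescript
// Hypothetical sketch: flag cross-adapter imports outside the allowed set.
// The allowlist names come from the register; everything else is illustrative.
const ALLOWED_ADAPTER_IMPORTS = new Set([
  "provider-interface",
  "adapter-utils",
  "expansion-provider",
]);

interface AdapterImports {
  file: string; // e.g. "src/adapters/met.ts"
  imports: string[]; // raw import specifiers found in that file
}

interface BoundaryViolation {
  file: string;
  specifier: string;
}

// Assumption: only sibling imports ("./name") count as cross-adapter imports.
// Package imports and paths that leave the adapters directory are ignored.
function findBoundaryViolations(files: AdapterImports[]): BoundaryViolation[] {
  const violations: BoundaryViolation[] = [];
  for (const { file, imports } of files) {
    for (const specifier of imports) {
      if (!specifier.startsWith("./")) continue;
      const target = specifier.slice(2).replace(/\.(ts|js)$/, "");
      if (!ALLOWED_ADAPTER_IMPORTS.has(target)) {
        violations.push({ file, specifier });
      }
    }
  }
  return violations;
}
```

A contract test built this way would assert that the returned list is empty, so any new exception has to be added to the allowlist and documented here.
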
- [x] RSI-2: UI journey automation breadth (Medium-severity remediation) — closed (proven)
- Owner: Product + Platform
- Scope: remove hidden journey blind spots by broadening journey smoke automation to role+provider matrix plus entity-role read-path assertions.
- Evidence before close-out:
- Smoke probe includes all matrix scenarios (`public` and `researcher` × `met` and `getty`) in `scripts/smoke-explore-import-matrix.ts`.
- Probe includes `/api/objects/[id]`, `/api/works/[id]`, `/api/agents/[id]`, `/api/places/[id]`, and `/api/sets/[id]` happy/404 route checks for imported records.
- Slice references `pnpm smoke:explore:matrix` and shares one-liner proof path in `package.json`.
- Close-out row is added and this risk row is synchronized to `README.md` and `docs/roadmap.md`.
- Closed proof (2026-06-09):
- Proof references now include `scripts/smoke-explore-import-matrix.ts` and `package.json` (`pnpm smoke:explore:matrix`) covering role/provider matrix plus route-assertion checks.
- `docs/roadmap.md` and `README.md` updated to mark the journey automation gap as closed with RSI-2 proof.
- `CLAUDE.md` close-out log row added.
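
The role and provider matrix plus the entity route checks can be enumerated as below. This is a sketch of the scenario shape only, not the contents of `scripts/smoke-explore-import-matrix.ts`. The helper names are hypothetical, and the happy-path status of `200` is an assumption; the roles, providers, and route segments come from this register.

```typescript
// Values taken from the register; helper names and shapes are illustrative.
const ROLES = ["public", "researcher"] as const;
const PROVIDERS = ["met", "getty"] as const;
const ENTITY_SEGMENTS = ["objects", "works", "agents", "places", "sets"] as const;

interface MatrixScenario {
  role: (typeof ROLES)[number];
  provider: (typeof PROVIDERS)[number];
}

interface RouteCheck {
  path: string;
  expectedStatus: number; // 200 for the happy path is an assumption
}

// Cross product of roles and providers: four scenarios in total.
function buildMatrix(): MatrixScenario[] {
  const scenarios: MatrixScenario[] = [];
  for (const role of ROLES) {
    for (const provider of PROVIDERS) {
      scenarios.push({ role, provider });
    }
  }
  return scenarios;
}

// One happy-path and one 404 check per entity route.
function buildRouteChecks(importedId: string, missingId: string): RouteCheck[] {
  const checks: RouteCheck[] = [];
  for (const segment of ENTITY_SEGMENTS) {
    checks.push({ path: `/api/${segment}/${importedId}`, expectedStatus: 200 });
    checks.push({ path: `/api/${segment}/${missingId}`, expectedStatus: 404 });
  }
  return checks;
}
```
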
- [x] RSI-3: Single-file and process-complexity reduction (Low-severity remediation) — closed (proven)
- Owner: Product + Platform
- Scope: reduce long-tail maintenance cost by splitting oversized files, documenting process ownership, and making future complexity boundaries explicit.
- Status update (2026-06-09): Action 1 complete — candidate-file complexity inventory captured with owners and decomposition targets.
- Status update (2026-06-09): Action 2 complete — publish-queue worker refactor slice landed and proven in `src/services/publish-queue-*.ts` with full gate verification.
- Status update (2026-06-09): Action 3 complete — `src/services/issues.ts` split into `src/services/issues/{cache.ts,github.ts,analysis.ts,types.ts}` and proven with full gate evidence.
- Status update (2026-06-09): Action 4 complete — `src/services/outbox.ts` split into focused modules under `src/services/outbox/`.
- Status update (2026-06-09): Action 5 complete — `src/services/reconciliation.ts` split into focused modules with preserved behavior and passing gate suite.
- Status update (2026-06-09): Action 6 complete — `src/services/wiki-publish.ts` split into focused modules with behavior preserved and full gate evidence (`pnpm test`, `pnpm lint`, and `pnpm build`).
- Evidence required before close-out:
- Identify and decompose top 8 file classes with sustained complexity symptoms (e.g., route handlers > ~400 LOC, scripts with multi-step orchestration, services with mixed responsibilities).
- Add a short decomposition plan and owner map in the risk row before implementation.
- New or refactored files must keep `pnpm test` and `pnpm lint` green with existing RSI checks.
- Add a close-out row in `CLAUDE.md` plus updates in `docs/roadmap.md` and `README.md`.
- Candidate inventory (Action 1; baseline from code-size scan):
- `src/services/publish-queue-worker.ts` (679 LOC) — Owner: Product + Platform; target: split worker orchestration, provider adapters, and persistence transaction helpers.
- `src/services/issues.ts` (670 LOC) — Owner: Product + Platform; target: split cache hydration, GitHub event fetch, and SSE/event-bus publishing modules.
- `src/services/outbox.ts` (669 LOC) — Owner: Platform; target: split status/query read-models, projector/retry orchestration, and DLQ policy helpers.
- `src/services/reconciliation.ts` (603 LOC) — Owner: Product + Platform; target: split candidate scoring, queue orchestration, and merge policy enforcement.
- `scripts/authority-cache-refresh.ts` (483 LOC) — Owner: Platform; target: split CLI orchestration, external authority fetch, and cache persistence/error retry stages.
- `src/services/ai-layer.ts` (468 LOC) — Owner: Product + Platform; target: split prompt/command validation, route-level orchestration, and response shaping/metrics.
- Proof plan for next close-out:
- Action-1 inventory + owner map is now captured in this register.
- Action-2 completion proof (2026-06-09):
- `pnpm test`, `pnpm lint`, and `pnpm build` pass after the publish-queue refactor slice landed.
- `tests/services/publish-queue-worker.test.ts` proves queue orchestration behavior and daily-cap deferral behavior remains unchanged.
- `CLAUDE.md`, `README.md`, and `docs/roadmap.md` are updated in the same slice closeout package.
- Action-3 completion proof (2026-06-09):
- `tests/services/issues.test.ts`, `tests/api/issues.test.ts`, `tests/api/issues-stream.test.ts`, and `tests/api/issues-webhook.test.ts` pass.
- `pnpm test`, `pnpm lint`, and `pnpm build` pass in the same cycle.
- `CLAUDE.md`, `README.md`, and `docs/roadmap.md` are updated in the same slice closeout package.
- Action-4 completion proof (2026-06-09):
- `src/services/outbox.ts` split into `src/services/outbox/{commands.ts,payload.ts,pool.ts,policy.ts,types.ts}` with all `outbox` behavior preserved.
- `tests/services/outbox.test.ts`, `tests/services/outbox-projector.test.ts`, and `tests/services/outbox-alerts.test.ts` pass.
- `tests/services/outbox.test.ts` covers payload and failure-policy behavior; `tests/services/outbox-projector.test.ts` and `tests/services/outbox-alerts.test.ts` remain stable with unchanged exported contracts.
- `pnpm test`, `pnpm lint`, and `pnpm build` pass in the same cycle.
- `CLAUDE.md`, `README.md`, and `docs/roadmap.md` are updated in the same slice closeout package.
- Action-5 completion proof (2026-06-09):
- `src/services/reconciliation.ts` split into `src/services/reconciliation/{candidates.ts,scoring.ts,service.ts,thresholds.ts,tiebreaker.ts,types.ts,utils.ts}` while keeping public API stable (`ReconciliationCandidate` remains exported).
- `tests/quality/reconciliation-exhibitions-literature.test.ts` passes with unchanged fixture coverage and output shape validation.
- `pnpm test`, `pnpm lint`, and `pnpm build` pass in the same cycle.
- `CLAUDE.md`, `README.md`, and `docs/roadmap.md` are updated in the same slice closeout package.
- Action-6 completion proof (2026-06-09):
- `src/services/wiki-publish.ts` split into `src/services/wiki-publish/{client.ts,draft.ts,plan.ts,preflight.ts,publish.ts,types.ts,utils.ts}` with preserved public exports and test behavior.
- `tests/services/wiki-publish.test.ts` passes (and no wiki publish integration behavior changed).
- `pnpm test`, `pnpm lint`, and `pnpm build` pass in the same cycle.
- `CLAUDE.md`, `README.md`, and `docs/roadmap.md` are updated in the same slice closeout package.
- Action-7 completion proof (2026-06-09):
- `src/services/monitoring-telemetry.ts` decomposed into `src/services/monitoring-telemetry/{service.ts,uptime.ts,kpis.ts,io.ts,utils.ts,types.ts}` with public API stability and all existing callers preserved (`tests/services/monitoring-telemetry.test.ts`, `scripts/monitoring-telemetry-sync.ts`).
- `pnpm test`, `pnpm lint`, and `pnpm build` pass in the same cycle (monitoring slice regression check validated via `tests/services/monitoring-telemetry.test.ts`).
- `CLAUDE.md`, `README.md`, and `docs/roadmap.md` are updated in the same slice closeout package.
- Action-8 completion proof (2026-06-08):
- `scripts/authority-cache-refresh.ts` split into `scripts/authority-cache-refresh/{cli.ts,config.ts,io.ts,parser.ts,refresh.ts,report.ts,runner.ts,types.ts}` with behavior preserved from single-file logic.
- `pnpm test` (full suite), `pnpm lint`, and `pnpm build` pass in the same cycle.
- `CLAUDE.md`, `README.md`, and `docs/roadmap.md` are updated in the same slice closeout package.
- Action-9 completion proof (2026-06-09):
- `src/services/ai-layer.ts` decomposed into `src/services/ai-layer/{build.ts,embed.ts,persist.ts,pool.ts,types.ts,utils.ts,constants.ts,visual.ts}` with public API preserved via facade export.
- `pnpm test`, `pnpm lint`, and `pnpm build` pass in the same cycle after `src/services/ai-layer/utils.ts` type hardening.
- `CLAUDE.md`, `README.md`, and `docs/roadmap.md` are updated in the same slice closeout package.
- [x] RSI-4: Chat grounding and citation enforcement (Medium-severity remediation) — closed (proven)
- Owner: Platform + AI Reliability
- Scope: enforce Graph-RAG chat grounding (`/api/ai/chat`) with mandatory sentence-level citations and automatic refusal on insufficient coverage.
- Evidence before close-out:
- `src/services/ai-chat.ts` and `app/api/ai/chat/route.ts` implement citation scoring, per-sentence citation grouping, and refusal responses with evidence metadata.
- `tests/api/ai-chat.test.ts` covers invalid input, refusal path, and successful cited answer behavior.
- Closed proof (2026-06-09):
- `tests/api/ai-chat.test.ts` passes in-cycle.
- `pnpm test`, `pnpm lint`, and `pnpm build` all pass in the same closeout cycle.
- `CLAUDE.md`, `README.md`, and `docs/roadmap.md` are updated in the same slice closeout package.
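
Sentence-level cite-or-refuse behavior can be sketched as a coverage check over already-grouped sentences. This is an illustration under stated assumptions: the type names, the refusal reason string, and the default coverage floor of `1` (every sentence cited) are hypothetical and are not taken from `src/services/ai-chat.ts`.

```typescript
interface CitedSentence {
  text: string;
  citationIds: string[]; // citations grouped per sentence
}

type ChatResult =
  | { status: "answered"; sentences: CitedSentence[]; coverage: number }
  | { status: "refused"; reason: "insufficient_citation_coverage"; coverage: number };

// Coverage is the share of sentences that carry at least one citation.
// Assumption: an empty answer has zero coverage and is therefore refused.
function enforceGrounding(sentences: CitedSentence[], minCoverage = 1): ChatResult {
  const cited = sentences.filter((s) => s.citationIds.length > 0).length;
  const coverage = sentences.length === 0 ? 0 : cited / sentences.length;
  if (coverage < minCoverage) {
    return { status: "refused", reason: "insufficient_citation_coverage", coverage };
  }
  return { status: "answered", sentences, coverage };
}
```
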
- [x] RSI-5: AI evidence drift + citation freshness (Medium-severity remediation) — closed (proven)
- Owner: Platform + AI Reliability
- Status update (2026-06-09): Action 1 complete — query-route citation metadata and refusal fields are implemented in `src/services/ai-query.ts`, with matching assertions in `tests/api/ai-query.test.ts` and `tests/quality/cite-or-refuse-conformance.test.ts`.
- Status update (2026-06-09): Action 2 complete — shared citation freshness metadata now adds `retrievedAt` and `citationFreshness` diagnostics to `/api/ai/query` and `/api/ai/chat`, refusing stale evidence through policy-backed tests.
- Scope:
- Prevent AI output quality regressions by enforcing citation coverage in broader AI workflows beyond `/api/ai/chat`, especially `/api/ai/query`.
- Ensure every AI answer path emits stable evidence metadata (`entityId`, `propertyPath`, `sourceUrl`, `retrievedAt`) and explicit refusal when coverage or freshness is below policy.
- Track this as a recurring AI-RSI checkpoint in the same evidence loop as prior RSI slices.
- Acceptance:
- `/api/ai/query` returns structured citation metadata on success and refusal paths.
- Any under-cited output returns refusal state with explicit citation coverage failure reason.
- Any stale cited output returns refusal state with explicit citation freshness failure reason and `citationFreshness` diagnostics.
- Query tests prove: invalid input handling, refusal path behavior, and cited success path.
- `/api/ai/chat` keeps grounded response semantics while gaining the same freshness refusal guard.
- Evidence plan (before close-out):
- Add/extend `tests/api/ai-query.test.ts` with cite-or-refuse assertions. Action 1 complete.
- Add/extend `tests/quality/cite-or-refuse-conformance.test.ts` to compare generated-content and `/api/ai/query` response behavior. Action 1 complete.
- Add/extend `tests/api/ai-chat.test.ts` with stale-citation refusal assertions. Action 2 complete.
- Run `pnpm test`, `pnpm lint`, and `pnpm build` in the same closeout cycle. Closeout gate.
- Mirror closeout evidence in `README.md`, `docs/roadmap.md`, and `CLAUDE.md`. Closeout gate.
- One-click closeout block template (ready-to-fill):
- Slice: Platform + AI
- Severity: Medium
- Action: RSI-5 / AI evidence drift + citation freshness
- Date: 2026-06-09
- Status: DONE
- Evidence path:
- [ ] `tests/api/ai-query.test.ts`
- [ ] `tests/api/ai-chat.test.ts`
- [ ] `tests/quality/cite-or-refuse-conformance.test.ts`
- [ ] `pnpm test` (full suite)
- [ ] `pnpm lint`
- [ ] `pnpm build`
- [ ] Closeout row sync: `docs/risk-register.md`
- [ ] Closeout row sync: `docs/roadmap.md`
- [ ] Closeout row sync: `README.md`
- [ ] Closeout log sync: `CLAUDE.md`
- Compounding proof:
- [ ] Risk status updated to closed/proven with “proven” date
- [ ] Explicit what changed / evidence / next action present
- [ ] No active RSI-5 TODOs from this slice remain
- [ ] Existing AI-RSI evidence loop (`docs/rsi-wiki.md`) references the slice outcome
- Submit now:
- [ ] Copy/paste this block into `docs/roadmap.md`, `CLAUDE.md`, and `README.md` as final closeout package
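
The freshness refusal guard in this slice depends on comparing each citation's `retrievedAt` against a policy window. The sketch below shows one way to compute such diagnostics. The evidence field names (`entityId`, `propertyPath`, `sourceUrl`, `retrievedAt`) come from this register; the diagnostics shape, the function name, and the treatment of unparseable timestamps as stale are assumptions.

```typescript
interface Citation {
  entityId: string;
  propertyPath: string;
  sourceUrl: string;
  retrievedAt: string; // ISO 8601 timestamp
}

// Hypothetical diagnostics shape; the real `citationFreshness` payload may differ.
interface FreshnessDiagnostics {
  maxAgeDays: number;
  oldestAgeDays: number;
  staleCount: number;
  fresh: boolean;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

function assessFreshness(citations: Citation[], now: Date, maxAgeDays: number): FreshnessDiagnostics {
  let oldestAgeDays = 0;
  let staleCount = 0;
  for (const citation of citations) {
    const ageDays = (now.getTime() - Date.parse(citation.retrievedAt)) / MS_PER_DAY;
    if (Number.isNaN(ageDays)) {
      staleCount += 1; // assumption: unparseable timestamps count as stale
      continue;
    }
    if (ageDays > maxAgeDays) staleCount += 1;
    oldestAgeDays = Math.max(oldestAgeDays, ageDays);
  }
  // Assumption: an answer with no citations is never considered fresh.
  const fresh = citations.length > 0 && staleCount === 0;
  return { maxAgeDays, oldestAgeDays, staleCount, fresh };
}
```

A route using this sketch would refuse with an explicit freshness failure reason whenever `fresh` is `false`, and attach the diagnostics to the response either way.
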
- [x] RSI-6: AI eval drift baselines include citation freshness (Medium-severity remediation) — closed (proven)
- Owner: Platform + AI Reliability
- Scope: make AI eval regression baselines compound on RSI-5 by scoring and fail-fast gating citation freshness drift, not only citation structure/accuracy.
- Acceptance:
- `EvalScoreThresholds` includes `citationFreshness` for current metrics, baselines, artifact summaries, and drift comparisons.
- `config/ai-eval-regression-policy.json` defines `citationFreshness`, `maxCitationFreshnessDrop`, and `failFastOnCitationFreshnessDrift`.
- Golden eval dataset rubric includes `citationFreshnessThreshold = 0.95`.
- Stale evidence is proven to lower eval freshness score and fail the sample gate.
- Live eval gate reports zero drift for the current baseline identity.
- Closed proof (2026-06-09):
- `src/services/ai-eval-harness.ts` scores freshness from actual citation `retrievedAt` / source timestamps.
- `src/services/ai-eval-regression.ts` compares `citationFreshnessDrop` and fail-fast policy.
- `scripts/ai-eval-gate.ts` persists and prints `citationFreshness` metrics/drift.
- Tests: `tests/services/ai-eval-harness.test.ts`, `tests/services/ai-eval-regression.test.ts`, `tests/services/ai-eval-artifacts.test.ts`, and `tests/quality/ai-eval-golden-dataset.test.ts`.
- Gate: direct AI eval check passed with `citationFreshness=1` and `citationFreshnessDrop=0`.
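
The drift comparison behind `citationFreshnessDrop` can be sketched as follows. The three policy keys match the names this register gives for `config/ai-eval-regression-policy.json`; how they combine is an assumption, and the function and result shape are hypothetical rather than the code in `src/services/ai-eval-regression.ts`.

```typescript
interface FreshnessPolicy {
  citationFreshness: number; // minimum acceptable score
  maxCitationFreshnessDrop: number; // largest tolerated drop from baseline
  failFastOnCitationFreshnessDrift: boolean;
}

interface FreshnessGateResult {
  citationFreshnessDrop: number;
  belowThreshold: boolean;
  driftExceeded: boolean;
  failFast: boolean;
}

// Assumption: improvements are clamped to a drop of zero, and fail-fast only
// triggers when the policy flag is set and the drop exceeds the allowed maximum.
function evaluateFreshnessDrift(
  baseline: number,
  current: number,
  policy: FreshnessPolicy,
): FreshnessGateResult {
  const citationFreshnessDrop = Math.max(0, baseline - current);
  const belowThreshold = current < policy.citationFreshness;
  const driftExceeded = citationFreshnessDrop > policy.maxCitationFreshnessDrop;
  const failFast = policy.failFastOnCitationFreshnessDrift && driftExceeded;
  return { citationFreshnessDrop, belowThreshold, driftExceeded, failFast };
}
```

With a baseline and current score of `1`, this yields a drop of `0`, which is consistent with the gate output recorded in the closed proof.
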
- [x] RSI-7: AI eval summary badges + aging-pressure alerting (Medium-severity remediation) — closed (proven)
- Owner: Platform + AI Reliability
- Scope: make eval freshness status visible in CI artifacts and warn when cited evidence ages toward the policy window even while `citationFreshness` remains flat/green.
- Acceptance:
- `artifacts/evals/summary.md` renders badges for status, faithfulness, relevance, citation accuracy, `citationFreshness`, and pass rate.
- Summary includes latest/run/trend artifact links plus citation freshness aging (`oldestAgeDays`, `oldestAgeRatio`, `maxAgeDays`).
- CI appends the summary to `$GITHUB_STEP_SUMMARY` and uploads `artifacts/evals/`.
- Trend alert emits `citation_freshness_aging_pressure` when freshness is flat/improved but `oldestAgeRatio` crosses the threshold.
- Closed proof (2026-06-09):
- `.github/workflows/ai-eval-gate.yml` publishes and uploads AI eval artifacts.
- `src/services/ai-eval-artifacts.ts` builds the summary and evaluates freshness-aging pressure.
- `src/services/ai-eval-harness.ts` records freshness-aging trend fields.
- `scripts/ai-eval-gate.ts` persists summary path and prints `citationFreshnessAging`.
- Tests: `tests/services/ai-eval-artifacts.test.ts`.
- Gate: direct AI eval check passed with `summary=artifacts/evals/summary.md`, `citationFreshness=1`, and `oldestAgeRatio=0`.
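
Aging pressure is the case where the freshness score stays green while the oldest cited evidence approaches the policy window. A minimal sketch, assuming `oldestAgeRatio` is `oldestAgeDays / maxAgeDays` and using a hypothetical default threshold of `0.8`; the alert name comes from this register, the rest is illustrative.

```typescript
interface AgingSnapshot {
  citationFreshness: number;
  oldestAgeDays: number;
  maxAgeDays: number;
}

// Assumption: the ratio is the oldest citation age as a share of the window.
function oldestAgeRatio(snapshot: AgingSnapshot): number {
  return snapshot.maxAgeDays > 0 ? snapshot.oldestAgeDays / snapshot.maxAgeDays : 0;
}

// Emits the alert only when freshness is flat or improved, so a plain
// freshness regression is left to the drift gate instead.
function detectAgingPressure(
  previous: AgingSnapshot,
  latest: AgingSnapshot,
  pressureThreshold = 0.8, // hypothetical default
): "citation_freshness_aging_pressure" | null {
  const flatOrImproved = latest.citationFreshness >= previous.citationFreshness;
  if (flatOrImproved && oldestAgeRatio(latest) >= pressureThreshold) {
    return "citation_freshness_aging_pressure";
  }
  return null;
}
```
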
- [x] RSI-8: AI eval artifact dashboard visibility (Medium-severity remediation) — closed (proven)
- Owner: Platform + AI Reliability
- Scope: expose latest eval summary, trend index, freshness-aging state, and warnings in a read-only app dashboard.
- Acceptance:
- Dashboard reads ignored `artifacts/evals/ai-eval-gate-latest.json`, `artifacts/evals/trend-index.json`, and `artifacts/evals/summary.md` when present.
- Empty/missing local artifact state is explicit and non-failing.
- Display model includes latest metrics, `citationFreshness`, freshness-aging warnings, run identity, and artifact timestamps.
- Parser/render-model tests prove dashboard output matches artifact contents.
- Closed proof (2026-06-09):
- `src/services/ai-eval-dashboard.ts` loads latest/trend/summary artifacts with explicit `ready`, `partial`, and `empty` states.
- `app/(workspace)/ai-evals/page.tsx` renders latest metrics, `citationFreshness`, freshness-aging state, active warnings, run identity, and artifact timestamps.
- Navigation exposes `/ai-evals` in the workspace sidebar, Workspace menu, and footer.
- Tests: `tests/services/ai-eval-dashboard.test.ts` and `tests/pages/ai-evals-page.test.ts`.
- Focused gate: system Node `--test tests/services/ai-eval-dashboard.test.ts tests/pages/ai-evals-page.test.ts` passed 3 tests.
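
The explicit `ready`, `partial`, and `empty` states can be derived from which of the three artifacts are present. The mapping below is an assumption about what those states mean, and the names are hypothetical rather than taken from `src/services/ai-eval-dashboard.ts`.

```typescript
type DashboardState = "ready" | "partial" | "empty";

interface ArtifactPresence {
  latest: boolean; // artifacts/evals/ai-eval-gate-latest.json
  trendIndex: boolean; // artifacts/evals/trend-index.json
  summary: boolean; // artifacts/evals/summary.md
}

// Assumption: all three artifacts present means "ready", none means "empty",
// and anything in between is "partial". Missing files never throw.
function resolveDashboardState(presence: ArtifactPresence): DashboardState {
  const present = [presence.latest, presence.trendIndex, presence.summary].filter(Boolean).length;
  if (present === 0) return "empty";
  return present === 3 ? "ready" : "partial";
}
```
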
- [x] RSI-9: latest-vs-previous AI eval artifact diff (Medium-severity remediation) — closed (proven)
- Owner: Platform + AI Reliability
- Scope: accelerate eval artifact review by showing latest-vs-previous deltas and “what changed” notes directly in `/ai-evals`.
- Acceptance:
- Dashboard model computes metric deltas for faithfulness, relevance, citation accuracy, `citationFreshness`, and pass rate.
- Dashboard model computes freshness-aging pressure movement between retained runs when both runs have aging metadata.
- Dashboard notes status changes, prompt-count changes, identity changes, and no-movement cases.
- Page renders an explicit “need at least two retained eval runs” state when a diff is unavailable.
- Closed proof (2026-06-09):
- `src/services/ai-eval-dashboard.ts` computes `AiEvalRunDiff` with metric deltas, aging deltas, and review notes.
- `app/(workspace)/ai-evals/page.tsx` renders the “Latest vs previous run” and “What changed notes” sections.
- Tests: `tests/services/ai-eval-dashboard.test.ts` and `tests/pages/ai-evals-page.test.ts`.
- Focused gate: system Node `--test tests/services/ai-eval-dashboard.test.ts tests/pages/ai-evals-page.test.ts` passed 3 tests.
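
A latest-vs-previous diff over the five named metrics can be sketched like this. The type below is a simplified stand-in for `AiEvalRunDiff`, and the metric key spellings and note wording are assumptions; only the metric list and the "need at least two retained eval runs" state come from this register.

```typescript
const METRICS = ["faithfulness", "relevance", "citationAccuracy", "citationFreshness", "passRate"] as const;
type Metric = (typeof METRICS)[number];
type RunMetrics = Record<Metric, number>;

// Simplified, hypothetical stand-in for the real diff model.
interface RunDiffSketch {
  available: boolean;
  deltas: Partial<Record<Metric, number>>;
  notes: string[];
}

function diffRuns(latest?: RunMetrics, previous?: RunMetrics): RunDiffSketch {
  if (!latest || !previous) {
    return { available: false, deltas: {}, notes: ["need at least two retained eval runs"] };
  }
  const deltas: Partial<Record<Metric, number>> = {};
  const notes: string[] = [];
  for (const metric of METRICS) {
    const delta = latest[metric] - previous[metric];
    deltas[metric] = delta;
    if (delta !== 0) notes.push(`${metric} moved by ${delta.toFixed(3)}`);
  }
  // An explicit no-movement note keeps the "what changed" section non-empty.
  if (notes.length === 0) notes.push("no metric movement");
  return { available: true, deltas, notes };
}
```
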
- [x] RSI-10: severity-labeled AI eval diff triage (Medium-severity remediation) — closed (proven)
- Owner: Platform + AI Reliability
- Scope: prioritize `/ai-evals` latest-vs-previous review with severity labels for metric drift, freshness-aging pressure, and overall diff status.
- Acceptance:
- Dashboard model labels each metric delta as `regression`, `watch`, `stable`, or `improved` using explicit metric thresholds.
- Dashboard model labels freshness-aging pressure movement using explicit aging thresholds.
- Overall diff severity escalates failed-status regressions, identity-change watch states, metric regressions, and aging regressions before stable/improved labels.
- Page renders the overall priority label and the applied metric/aging thresholds for fast operator triage.
- Closed proof (2026-06-09):
- `src/services/ai-eval-dashboard.ts` defines `AI_EVAL_DIFF_SEVERITY_THRESHOLDS` and classifies `AiEvalRunDiff`, metric deltas, and aging deltas.
- `app/(workspace)/ai-evals/page.tsx` renders priority labels and threshold pills in “Latest vs previous run.”
- Tests: `tests/services/ai-eval-dashboard.test.ts` and `tests/pages/ai-evals-page.test.ts`.
- Gates: focused system Node test passed 3 tests; `pnpm test` passed 787/249; `pnpm lint` passed with one existing warning; production Next build passed via system Node.
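
Severity labeling can be sketched as a threshold classifier plus an escalation rule. The four labels come from this register; the threshold field names, the example values used in the usage note, and the exact escalation order are assumptions and do not reproduce `AI_EVAL_DIFF_SEVERITY_THRESHOLDS`.

```typescript
type Severity = "regression" | "watch" | "stable" | "improved";

// Hypothetical threshold shape; values are supplied by the caller.
interface MetricThresholds {
  regressionDrop: number;
  watchDrop: number;
  improvedGain: number;
}

// A negative delta is a drop; larger drops map to more severe labels.
function classifyMetricDelta(delta: number, t: MetricThresholds): Severity {
  if (delta <= -t.regressionDrop) return "regression";
  if (delta <= -t.watchDrop) return "watch";
  if (delta >= t.improvedGain) return "improved";
  return "stable";
}

// Assumed escalation order: failed status and regressions first, then
// identity changes and watch labels, then improved, then stable.
function overallSeverity(labels: Severity[], statusFailed: boolean, identityChanged: boolean): Severity {
  if (statusFailed || labels.some((l) => l === "regression")) return "regression";
  if (identityChanged || labels.some((l) => l === "watch")) return "watch";
  return labels.some((l) => l === "improved") ? "improved" : "stable";
}
```

For example, with illustrative thresholds of `0.05`, `0.02`, and `0.02`, a delta of `-0.03` is labeled `watch`, and an identity change keeps the overall label at `watch` even when every metric is `stable` or `improved`.
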
- [x] RSI-11: policy-driven AI eval priority visibility (Medium-severity remediation) — closed (proven)
- Owner: AI Reliability + Platform + DevEx
- Scope: keep eval severity thresholds config-driven and surface priority in dashboard and CI summary paths.
- Acceptance:
- `config/ai-eval-regression-policy.json` owns `diffSeverityPolicy` thresholds for metric movement and freshness-aging pressure.
- Dashboard model consumes policy thresholds and computes last-N severity distribution.
- `/ai-evals` renders `regression`, `watch`, `stable`, and `improved` distribution counts even when raw artifacts are absent.
- CI summary renders review priority and threshold context when two retained runs are available.
- Closed proof (2026-06-09):
- `src/services/ai-eval-severity.ts` centralizes threshold normalization, run diff severity, and distribution counting.
- `src/services/ai-eval-dashboard.ts` and `src/services/ai-eval-artifacts.ts` consume the shared classifier.
- `scripts/ai-eval-gate.ts` passes policy thresholds and prints `reviewPriority`.
- Tests: `tests/services/ai-eval-dashboard.test.ts`, `tests/services/ai-eval-artifacts.test.ts`, and `tests/pages/ai-evals-page.test.ts`.
- Gates: focused system Node test passed 8 tests; `pnpm test` passed 788/249; `pnpm lint` passed with one existing warning; AI eval gate printed `reviewPriority: stable`; production Next build passed via system Node.
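
The last-N severity distribution can be counted with every label initialized to zero, which is what lets the page render counts when no artifacts exist. The function name and the handling of a non-positive window are assumptions; this is not the code in `src/services/ai-eval-severity.ts`.

```typescript
type Severity = "regression" | "watch" | "stable" | "improved";

// Counts labels over the most recent `lastN` comparisons.
// All four keys are always present, so an empty history yields all zeros.
function countSeverityDistribution(history: Severity[], lastN: number): Record<Severity, number> {
  const counts: Record<Severity, number> = { regression: 0, watch: 0, stable: 0, improved: 0 };
  // Guard: slice(-0) would return the whole array, so treat lastN <= 0 as empty.
  const recent = lastN > 0 ? history.slice(-lastN) : [];
  for (const label of recent) {
    counts[label] += 1;
  }
  return counts;
}
```
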
- [x] RSI-12: AI eval agent-summary reliability (Medium-severity remediation) — closed (proven)
- Owner: AI Reliability + Platform + Agent Experience
- Scope: make eval priority safe for agents by validating severity policy, exposing compact JSON, and showing comparison history in the dashboard.
- Acceptance:
- Malformed `diffSeverityPolicy` values fail schema/order validation with explicit error messages.
- `/api/ai-evals/summary` returns agent-ready JSON with review priority, severity distribution, severity history, alerts, and artifact links.
- `/ai-evals` renders a compact severity sparkline plus comparison history for retained runs.
- Route supports `GET` + `OPTIONS` with baseline public JSON/CORS behavior.
- Closed proof (2026-06-09):
- `src/services/ai-eval-severity.ts` centralizes validation and history symbols.
- `src/services/ai-eval-dashboard.ts` exposes `loadAiEvalAgentSummary`.
- `app/api/ai-evals/summary/route.ts` serves the agent summary endpoint.
- Tests: `tests/services/ai-eval-severity.test.ts`, `tests/api/ai-evals-summary.test.ts`, `tests/services/ai-eval-dashboard.test.ts`, `tests/services/ai-eval-artifacts.test.ts`, and `tests/pages/ai-evals-page.test.ts`.
- Gates: focused system Node test passed 13 tests; `pnpm test` passed 793/251; `pnpm lint` passed with one existing warning; AI eval gate printed `reviewPriority: stable`; production Next build confirmed `/api/ai-evals/summary`.
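
Schema and order validation for severity thresholds can be sketched as a function that returns explicit error messages instead of throwing. The field names `watchDrop` and `regressionDrop` are hypothetical; the real `diffSeverityPolicy` shape is not shown in this register.

```typescript
// Returns a list of human-readable errors; an empty list means the policy is valid.
function validateDiffSeverityThresholds(value: unknown): string[] {
  if (typeof value !== "object" || value === null) {
    return ["diffSeverityPolicy must be an object"];
  }
  const record = value as Record<string, unknown>;
  const errors: string[] = [];
  for (const key of ["watchDrop", "regressionDrop"]) {
    const v = record[key];
    if (typeof v !== "number" || !Number.isFinite(v) || v < 0) {
      errors.push(`${key} must be a finite non-negative number`);
    }
  }
  // Order check runs only after both fields passed the type check.
  if (errors.length === 0 && (record.watchDrop as number) >= (record.regressionDrop as number)) {
    errors.push("watchDrop must be smaller than regressionDrop");
  }
  return errors;
}
```
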
- [x] RSI-13: AI eval contract and CI annotation reliability (Medium-severity remediation) — closed (proven)
- Owner: AI Reliability + DevEx + Agent Experience
- Scope: prevent agent/API contract drift and invisible eval priority by adding schema-backed JSON/OpenAPI tests, a config-driven severity-history window, and GitHub PR annotations for `watch`/`regression`.
- Acceptance:
- `/api/ai-evals/summary` payload parses against the reusable Zod contract.
- `/api/openapi` references `AiEvalSummaryResponse` for the route.
- `severityHistoryPolicy.maxComparisons` drives dashboard and agent summary window metadata.
- Malformed history-window policy values fail explicitly.
- CI logs emit a GitHub warning annotation when eval priority is `watch` or `regression`.
- Closed proof (2026-06-09):
- `src/contracts/zod/ai-eval-summary.ts`, `src/services/openapi.ts`, `src/services/ai-eval-severity.ts`, `src/services/ai-eval-dashboard.ts`, `src/services/ai-eval-artifacts.ts`, `scripts/ai-eval-gate.ts`, `.github/workflows/ai-eval-gate.yml`, and `config/ai-eval-regression-policy.json`.
- Tests: `tests/api/ai-evals-summary.test.ts`, `tests/api/openapi.test.ts`, `tests/services/ai-eval-severity.test.ts`, `tests/services/ai-eval-dashboard.test.ts`, `tests/services/ai-eval-artifacts.test.ts`, and `tests/pages/ai-evals-page.test.ts`.
- Gates: focused system Node test passed 18 tests; `pnpm test` passed 796/251; `pnpm lint` passed with one existing warning; AI eval gate printed `reviewPriority: stable`; production Next build confirmed `/api/ai-evals/summary`.
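
A GitHub warning annotation is a line written to the job log in the `::warning title=...::message` workflow-command format. The sketch below formats one for `watch` and `regression` priorities. The title wording and function names are hypothetical and are not the snapshot-tested strings from this repository; the escaping rules follow GitHub's workflow-command conventions.

```typescript
type ReviewPriority = "regression" | "watch" | "stable" | "improved";

// Message data must escape "%", carriage returns, and newlines.
function escapeData(value: string): string {
  return value.replace(/%/g, "%25").replace(/\r/g, "%0D").replace(/\n/g, "%0A");
}

// Property values additionally escape ":" and ",".
function escapeProperty(value: string): string {
  return escapeData(value).replace(/:/g, "%3A").replace(/,/g, "%2C");
}

// Returns null for priorities that should not produce an annotation.
function formatEvalAnnotation(priority: ReviewPriority, detail: string): string | null {
  if (priority !== "watch" && priority !== "regression") return null;
  const title = escapeProperty(`AI eval priority: ${priority}`); // hypothetical title
  return `::warning title=${title}::${escapeData(detail)}`;
}
```

A gate script would print the returned line to standard output so the runner picks it up as an annotation.
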
- [x] RSI-14: AI eval version and retention hygiene (Medium-severity remediation) — closed (proven)
- Owner: AI Reliability + DevEx + Platform
- Scope: make the agent summary contract explicitly versioned, keep CI annotation examples copy/pasteable and snapshot-proven, and prevent stale eval run artifacts from accumulating outside the retained trend window.
- Acceptance:
- `/api/ai-evals/summary` returns `schemaVersion: 1`.
- Unsupported summary schema versions fail contract parsing in tests.
- EvalOps docs publish exact GitHub annotation examples for `regression` and `watch`.
- Annotation examples are asserted against formatter snapshots.
- Retention pruning removes orphaned run JSON while preserving retained run JSON and non-JSON notes.
- Closed proof (2026-06-09):
- `src/contracts/zod/ai-eval-summary.ts`, `app/api/ai-evals/summary/route.ts`, `src/services/ai-eval-artifacts.ts`, and `docs/evals/golden-museum-questions.md`.
- Tests: `tests/api/ai-evals-summary.test.ts`, `tests/api/openapi.test.ts`, and `tests/services/ai-eval-artifacts.test.ts`.
- Gates: focused system Node test passed 22 tests; `pnpm test` passed 800/251; `pnpm lint` passed with one existing warning; AI eval gate printed `reviewPriority: stable`; production Next build confirmed `/api/ai-evals/summary`.
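
Explicit version negotiation means a consumer rejects any summary whose `schemaVersion` it does not support. The register names a Zod contract for this; the sketch below uses plain TypeScript instead so that it stays dependency-free, and the function name and error text are hypothetical.

```typescript
type VersionCheck =
  | { ok: true; schemaVersion: 1 }
  | { ok: false; error: string };

// Only schemaVersion 1 is accepted; anything else fails with an explicit message.
function checkSummarySchemaVersion(payload: unknown): VersionCheck {
  if (typeof payload !== "object" || payload === null) {
    return { ok: false, error: "summary payload must be an object" };
  }
  const version = (payload as { schemaVersion?: unknown }).schemaVersion;
  if (version === 1) {
    return { ok: true, schemaVersion: 1 };
  }
  return { ok: false, error: `unsupported schemaVersion: ${String(version)}` };
}
```
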
- [x] RSI-15: AI eval migration and reporting visibility (Medium-severity remediation) — closed (proven)
- Owner: AI Reliability + Platform + DevEx
- Scope: make future summary migrations explicit, make pruning observable in dry-run/delete modes, and put annotation/pruning status directly into CI summary markdown.
- Acceptance:
- `schemaVersion: 2` migration notes exist while v2 remains unsupported.
- Fixture compatibility tests accept `schema-v1-ready.json` and reject `schema-v2-planned.json`.
- Retention pruning reports `delete` and `dry-run` modes with retained/orphaned/deleted/preserved counts.
- CI summary markdown includes latest annotation status and retention pruning status.
- Closed proof (2026-06-09):
- `src/contracts/zod/ai-eval-summary.ts`, `src/services/ai-eval-artifacts.ts`, `docs/evals/golden-museum-questions.md`, and `tests/fixtures/ai-eval-summary/*`.
- Tests: `tests/api/ai-evals-summary.test.ts` and `tests/services/ai-eval-artifacts.test.ts`.
- Gates: focused system Node test passed 24 tests; `pnpm test` passed 802/251; `pnpm lint` passed with one existing warning; AI eval gate printed `reviewPriority: stable`; production Next build confirmed `/api/ai-evals/summary`.
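As a rough illustration of the pruning report described in RSI-14 and RSI-15, the sketch below classifies files and counts them per mode without touching the filesystem. The names `PruneReport` and `planRetentionPrune`, and the convention that a run file is named `<runId>.json`, are assumptions for this example and not the actual API of `src/services/ai-eval-artifacts.ts`.

```typescript
type PruneMode = "dry-run" | "delete";

interface PruneReport {
  mode: PruneMode;
  retained: number; // run JSON inside the retained trend window
  orphaned: number; // run JSON outside the retained trend window
  deleted: number; // orphaned files a caller would remove (0 in dry-run)
  preserved: number; // non-JSON notes, never touched
  orphanedFiles: string[];
}

// Pure planning step: decides what would be pruned and reports counts.
// Actual deletion is left to the caller so dry-run stays side-effect free.
function planRetentionPrune(
  fileNames: string[],
  retainedRunIds: string[],
  mode: PruneMode
): PruneReport {
  const orphanedFiles: string[] = [];
  let retained = 0;
  let preserved = 0;
  for (const fileName of fileNames) {
    if (!/\.json$/.test(fileName)) {
      preserved += 1; // notes and other non-JSON files are kept
      continue;
    }
    const runId = fileName.replace(/\.json$/, "");
    if (retainedRunIds.indexOf(runId) !== -1) {
      retained += 1;
    } else {
      orphanedFiles.push(fileName);
    }
  }
  return {
    mode,
    retained,
    orphaned: orphanedFiles.length,
    deleted: mode === "delete" ? orphanedFiles.length : 0,
    preserved,
    orphanedFiles,
  };
}
```

Keeping the plan separate from the deletion makes the two modes differ only in the `deleted` count, so the same report can feed both the CI summary markdown and a test fixture.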
- [x] RSI-16: AI eval summary artifact/schema visibility (Medium-severity remediation) — closed (proven)
- Owner: DevEx + Platform + AI Reliability
- Scope: make CI summary markdown snapshot-verifiable, make latest pruning status agent-readable, and make schema migration compatibility visible through OpenAPI.
- Acceptance:
- Generated `artifacts/evals/summary.md` matches `tests/fixtures/ai-eval-summary/summary-snapshot.md`.
- `/api/ai-evals/summary` includes latest `summary.retentionPruneReport`.
- `/api/openapi` includes the AI eval summary schema migration compatibility table.
- Closed proof (2026-06-09):
- `src/services/ai-eval-artifacts.ts`, `src/services/ai-eval-dashboard.ts`, `src/contracts/zod/ai-eval-summary.ts`, `src/services/openapi.ts`, `docs/evals/golden-museum-questions.md`, and `tests/fixtures/ai-eval-summary/summary-snapshot.md`.
- Tests: `tests/api/ai-evals-summary.test.ts`, `tests/api/openapi.test.ts`, `tests/services/ai-eval-artifacts.test.ts`, and `tests/services/ai-eval-dashboard.test.ts`.
- Gates: focused system Node test passed 25 tests; `pnpm test` passed 803/251; `pnpm lint` passed with one existing warning; AI eval gate printed `reviewPriority: stable`; production Next build confirmed `/api/ai-evals/summary`.
- [x] RSI-17: Visual ETL Mapper AI-assist safety (Medium-severity remediation) — closed (proven)
- Owner: AI Reliability + Curator Experience + Platform
- Scope: complete C4 LLM-assisted mapping without allowing automated ingestion activation from unreviewed suggestions.
- Acceptance:
- `/api/ai/mapping-assist` returns a contract-valid draft `MappingTemplate`.
- Unknown columns are emitted as diagnostics, not hallucinated mappings.
- `/etl/mapper` exposes the assist action as curator-review-only.
- Closed proof (2026-06-09):
- `src/services/mapping-assist.ts`, `src/contracts/zod/mapping-assist.ts`, `app/api/ai/mapping-assist/route.ts`, and `src/components/etl-mapper-workbench.tsx`.
- Tests: `tests/services/mapping-assist.test.ts`, `tests/api/ai-mapping-assist.test.ts`, `tests/components/etl-mapper-config.test.ts`, and `tests/api/openapi.test.ts`.
- Gates: focused system Node test passed 7 mapper-assist tests plus 2 OpenAPI tests; `pnpm test` passed 809/253; `pnpm lint` passed with one existing warning; AI eval gate printed `reviewPriority: stable`; production Next build confirmed `/api/ai/mapping-assist`.
- [x] RSI-18: mapper-assist fixture/schema/importability hardening (Medium-severity remediation) — closed (proven)
- Owner: AI Reliability + Curator UX + Platform
- Scope: harden mapper assist after launch with golden fixtures, importable draft rendering, and OpenAPI contract visibility.
- Acceptance:
- Tricky rights/credit/sensitive columns stay unmapped and cannot create hallucinated target paths.
- Returned suggestions can be converted into ReactFlow draft nodes/edges for curator review.
- `/api/openapi` exposes and references `MappingAssistResponse`.
- Closed proof (2026-06-09):
- `tests/fixtures/mapping-assist/tricky-columns.json`, `src/services/mapping-assist.ts`, `src/utils/etl-mapper-assist.ts`, `src/components/etl-mapper-workbench.tsx`, `src/contracts/zod/mapping-assist.ts`, and `src/services/openapi.ts`.
- Tests: `tests/services/mapping-assist.test.ts`, `tests/utils/etl-mapper-assist.test.ts`, `tests/components/etl-mapper-config.test.ts`, `tests/api/openapi.test.ts`, and `tests/api/ai-mapping-assist.test.ts`.
- Gates: focused system Node test passed 11 tests; `pnpm test` passed 811/254; `pnpm lint` passed with one existing warning; AI eval gate printed `reviewPriority: stable`; production Next build confirmed `/api/ai/mapping-assist`.
- [x] RSI-19: provider-family/browser/request-schema mapper hardening (Medium-severity remediation) — closed (proven)
- Owner: AI Reliability + Curator UX + Platform
- Scope: make mapper assist safer across provider families, browser interaction, and request contract discovery.
- Acceptance:
- Met/Getty/Rijks provider-family fixtures map only expected columns to approved Linked Art target paths.
- `/etl/mapper` has a browser-level assist/import smoke command discoverable as `pnpm smoke:etl:mapper-assist`.
- `/api/openapi` exposes `MappingAssistRequest` and references it as the mapping-assist POST request body.
- Closed proof (2026-06-09):
- `tests/fixtures/mapping-assist/provider-families.json`, `scripts/smoke-etl-mapper-assist.ts`, `package.json`, `src/contracts/zod/mapping-assist.ts`, and `src/services/openapi.ts`.
- Tests: `tests/services/mapping-assist.test.ts`, `tests/scripts/etl-mapper-smoke-script.test.ts`, and `tests/api/openapi.test.ts`.
- Gates: focused system Node test passed 7 tests; `pnpm smoke:etl:mapper-assist` passed against `http://localhost:3001/en`; `pnpm test` passed 813/255; `pnpm lint` passed with one existing warning; AI eval gate printed `reviewPriority: stable`; production Next build confirmed `/api/ai/mapping-assist`.
- [x] RSI-20: negative mapper fixtures, visual screenshot, and API-doc examples (Medium-severity remediation) — closed (proven)
- Owner: AI Reliability + Curator UX + Platform
- Scope: harden near-miss column refusal, imported-draft visual regression evidence, and human/agent examples in `/api/docs`.
- Acceptance:
- Almost-mappable provider columns containing rights, credit, restriction, sensitivity, donor, or flag wording stay unmapped.
- Browser smoke imports the assist draft and writes a stable screenshot artifact at `artifacts/smoke/etl-mapper-assist-imported.png`.
- `/api/docs` includes concrete request and response examples for `/api/ai/mapping-assist`.
- Closed proof (2026-06-09):
- `tests/fixtures/mapping-assist/negative-provider-families.json`, `src/services/mapping-assist.ts`, `scripts/smoke-etl-mapper-assist.ts`, and `app/api/docs/route.ts`.
- Tests: `tests/services/mapping-assist.test.ts`, `tests/scripts/etl-mapper-smoke-script.test.ts`, and `tests/api/docs.test.ts`.
- Gates: focused system Node test passed 11 tests; `pnpm smoke:etl:mapper-assist` passed and wrote `artifacts/smoke/etl-mapper-assist-imported.png`; `pnpm test` passed 814/255; `pnpm lint` passed with one existing warning; AI eval gate printed `reviewPriority: stable`; production Next build passed.
- [x] RSI-21: mapper layout, OpenAPI-sourced docs examples, and confidence policy (Medium-severity remediation) — closed (proven)
- Owner: Curator UX + Platform + AI Reliability
- Scope: improve curator trust by fixing mapper action layout, making docs examples single-source, and gating low-confidence suggestions.
- Acceptance:
- Refreshed browser screenshot shows mapper actions in a non-overlapping wrapped row.
- `/api/docs` request/response examples are generated from `/api/openapi` examples.
- Low-confidence accession/place/description patterns stay diagnostics-only.
- Closed proof (2026-06-09):
- `app/globals.css`, `app/api/docs/route.ts`, `src/contracts/zod/mapping-assist.ts`, `src/services/openapi.ts`, and `src/services/mapping-assist.ts`.
- Tests: `tests/components/etl-mapper-config.test.ts`, `tests/api/docs.test.ts`, and `tests/services/mapping-assist.test.ts`.
- Gates: focused system Node test passed 15 tests; `pnpm smoke:etl:mapper-assist` refreshed `artifacts/smoke/etl-mapper-assist-imported.png`; `pnpm test` passed 817/255; `pnpm lint` passed with one existing warning; AI eval gate printed `reviewPriority: stable`; production Next build passed.
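The diagnostics-only policy from RSI-17 and RSI-21 can be sketched as a simple gate over suggestions. Everything named here (`Suggestion`, `gateSuggestions`, the `0.8` cutoff, and the sample target paths in the usage note) is hypothetical; the register does not state the actual confidence threshold or the shape used by `src/services/mapping-assist.ts`.

```typescript
interface Suggestion {
  column: string;
  targetPath: string;
  confidence: number; // 0..1, as scored by the assist step
}

interface Diagnostic {
  column: string;
  reason: string;
}

interface DraftResult {
  mappings: Suggestion[];
  diagnostics: Diagnostic[];
}

// Hypothetical cutoff; the real policy value is not given in this register.
const MIN_MAPPING_CONFIDENCE = 0.8;

// Suggestions below the cutoff never become mappings; they are reported
// as diagnostics so a curator can review them instead.
function gateSuggestions(
  suggestions: Suggestion[],
  minConfidence: number = MIN_MAPPING_CONFIDENCE
): DraftResult {
  const draft: DraftResult = { mappings: [], diagnostics: [] };
  for (const suggestion of suggestions) {
    if (suggestion.confidence >= minConfidence) {
      draft.mappings.push(suggestion);
    } else {
      draft.diagnostics.push({
        column: suggestion.column,
        reason: `confidence ${suggestion.confidence} is below ${minConfidence}`,
      });
    }
  }
  return draft;
}
```

For example, a `title` column scored at `0.95` would stay in `mappings`, while an `acc_no` column scored at `0.4` would appear only in `diagnostics`. In both cases the output is still a draft for curator review, not an activated ingestion mapping.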
- [x] RSI-22: public source narrative and trust uplift (Medium-severity remediation) — closed (proven)
- Owner: Platform + Curator UX + Trust
- Scope: import only useful sibling-repo source-network ideas into current production surfaces: registry-backed `/datasets`, reviewed local metadata image, asset provenance tracking, footer trust links, and rewritten Contact/Privacy/Terms pages.
- Acceptance:
- `/datasets` and homepage stats derive provider counts from `getProviderCapabilities`, not copied constants.
- OpenGraph/Twitter metadata points at a reviewed local image.
- Footer exposes Contact, Privacy, Terms, and asset provenance links.
- Imported sibling assets have SHA-256 provenance; placeholder thumbnails and legal copy are excluded.
- Closed proof (2026-06-09):
- `src/services/public-source-narrative.ts`, `src/services/site-metadata.ts`, `app/datasets/page.tsx`, `app/contact/page.tsx`, `app/privacy/page.tsx`, `app/terms/page.tsx`, `docs/asset-provenance.md`, and `public/images/meta-museum-art/*`.
- Tests: `tests/services/public-source-narrative.test.ts` and `tests/pages/public-source-pages.test.ts`.
- Gates: focused RSI-22 tests passed 6/2; `pnpm test` passed 823/257; `pnpm lint` passed with one existing warning; `pnpm build` passed; screenshot captured at `artifacts/smoke/datasets-page.png`.
- [x] RSI-23: public-source agent API and trust smoke hardening (Medium-severity remediation) — closed (proven)
- Owner: Platform + Trust + Curator UX
- Scope: make the public source/trust uplift machine-readable and visually checked by adding a summary API, license-review status enum, and browser smoke screenshots for all public trust pages.
- Acceptance:
- `/api/public-sources/summary` returns schema-versioned JSON that parses against `publicSourcesSummaryResponseSchema`.
- Asset provenance rows use the explicit `reviewed-local-use` / `needs-license-review` / `rejected-placeholder` enum, and unknown values fail tests.
- `pnpm smoke:public-trust` captures `/datasets`, `/contact`, `/privacy`, and `/terms` screenshots.
- The nested `C:\Projects\metamuseum\meta-museum-art` prototype copy is removed after all images are preserved under `public/images/meta-museum-art/` and inventoried in `docs/asset-provenance.md`.
- Closed proof (2026-06-09):
- `app/api/public-sources/summary/route.ts`, `src/contracts/zod/public-sources-summary.ts`, `src/services/public-source-narrative.ts`, `scripts/smoke-public-trust-pages.ts`, `docs/asset-provenance.md`, and `package.json`.
- Tests: `tests/api/public-sources-summary.test.ts`, `tests/services/public-source-narrative.test.ts`, and `tests/scripts/public-trust-smoke-script.test.ts`.
- Gates: focused RSI-23 tests passed 6/3; `pnpm smoke:public-trust` passed against `http://localhost:3001`; `pnpm test` passed 827/259; `pnpm lint` passed with one existing warning; `pnpm build` passed.
- [x] RSI-24: OpenAPI, checksum drift, and screenshot retention hardening (Medium-severity remediation) — closed (proven)
- Owner: Platform + Trust + Curator UX
- Scope: make the public source summary discoverable through OpenAPI, prove imported visual assets have not drifted from provenance checksums, and convert public trust screenshots into retained latest/previous review artifacts.
- Acceptance:
- `/api/openapi` exposes `PublicSourcesSummaryResponse` and references it from `/api/public-sources/summary`.
- `tests/services/public-source-narrative.test.ts` fails when a copied asset hash differs from `docs/asset-provenance.md`.
- `pnpm smoke:public-trust` writes timestamped screenshot runs plus `latest` copies and prunes old runs beyond `PUBLIC_TRUST_SCREENSHOT_RETENTION_MAX_RUNS`.
- Closed proof (2026-06-09):
- `src/services/openapi.ts`, `src/contracts/zod/public-sources-summary.ts`, `src/services/public-trust-smoke-artifacts.ts`, `scripts/smoke-public-trust-pages.ts`, and `docs/asset-provenance.md`.
- Tests: `tests/api/openapi.test.ts`, `tests/services/public-source-narrative.test.ts`, `tests/services/public-trust-smoke-artifacts.test.ts`, and `tests/scripts/public-trust-smoke-script.test.ts`.
- Gates: focused RSI-24 tests passed 6/5; `pnpm smoke:public-trust` passed against `http://localhost:3001`; `pnpm test` passed 828/260; `pnpm lint` passed with one existing warning; `pnpm build` passed.
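One way to picture the screenshot retention rule from RSI-24 is the sketch below. It assumes run directories are named with timestamps that sort lexicographically and that the `latest` copy lives beside them under that literal name; `selectRunsToPrune` is a hypothetical helper, and `maxRuns` stands in for the value read from `PUBLIC_TRUST_SCREENSHOT_RETENTION_MAX_RUNS`.

```typescript
// Returns the timestamped run directories that fall outside the retention
// window, oldest first. The `latest` directory is never a prune candidate.
function selectRunsToPrune(runDirs: string[], maxRuns: number): string[] {
  if (Math.floor(maxRuns) !== maxRuns || maxRuns < 1) {
    throw new Error("maxRuns must be a positive integer");
  }
  // filter() returns a new array, so sort() does not mutate the input.
  const timestamped = runDirs.filter((dir) => dir !== "latest").sort();
  const excess = timestamped.length - maxRuns;
  return excess > 0 ? timestamped.slice(0, excess) : [];
}
```

With three timestamped runs and `maxRuns` set to 2, only the oldest run is returned for pruning, so the latest and previous runs stay available for review.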
- [x] RSI-25: Public trust docs, diff metadata, and CI artifact visibility (Medium-severity remediation) — closed (proven)
- Owner: Platform + Trust + DevEx
- Scope: keep public-source examples and public trust smoke artifacts review-visible by adding OpenAPI-sourced `/api/docs` payloads, latest-vs-previous screenshot diff metadata, and CI summary/upload links.
- Acceptance:
- `/api/docs` renders `/api/public-sources/summary` example payloads from `/api/openapi`.
- Public trust smoke `summary.json` includes per-screenshot latest-vs-previous checksum/byte diff metadata.
- CI appends public trust artifact links to `$GITHUB_STEP_SUMMARY` and uploads `public-trust-smoke-artifacts`.
- Closed proof (2026-06-09):
- `app/api/docs/route.ts`, `src/services/openapi.ts`, `src/contracts/zod/public-sources-summary.ts`, `src/services/public-trust-smoke-artifacts.ts`, `src/services/public-trust-smoke-ci-summary.ts`, `scripts/public-trust-smoke-ci-summary.ts`, and `.github/workflows/public-trust-smoke.yml`.
- Tests: `tests/api/docs.test.ts`, `tests/api/openapi.test.ts`, `tests/services/public-trust-smoke-artifacts.test.ts`, `tests/services/public-trust-smoke-ci-summary.test.ts`, `tests/scripts/public-trust-smoke-script.test.ts`, and `tests/scripts/public-trust-ci-workflow.test.ts`.
- Gates: focused RSI-25 tests passed 12/10; `pnpm smoke:public-trust` passed against a temporary server at `http://localhost:3001`; `pnpm test` passed 830/262; `pnpm lint` passed with one existing warning; `pnpm build` passed.
- [x] RSI-26: Pixel-diff thresholds, public trust summary API, and main CI artifact links (Medium-severity remediation) — closed (proven)
- Owner: Trust + DevEx + Platform
- Scope: harden public trust screenshot review by failing meaningful pixel drift, exposing the latest smoke report to agents, and publishing artifact links from main CI.
- Acceptance:
- `pnpm smoke:public-trust` fails when screenshot changed-pixel ratio exceeds `PUBLIC_TRUST_SCREENSHOT_PIXEL_DIFF_THRESHOLD`.
- `/api/public-trust/summary` returns schema-versioned latest public trust artifact JSON.
- Main `.github/workflows/ci.yml` appends public trust summary links and uploads `public-trust-smoke-artifacts`.
- Closed proof (2026-06-09):
- `src/services/public-trust-smoke-artifacts.ts`, `scripts/smoke-public-trust-pages.ts`, `src/contracts/zod/public-trust-summary.ts`, `src/services/public-trust-summary.ts`, `app/api/public-trust/summary/route.ts`, `src/services/openapi.ts`, and `.github/workflows/ci.yml`.
- Tests: `tests/services/public-trust-smoke-artifacts.test.ts`, `tests/api/public-trust-summary.test.ts`, `tests/api/openapi.test.ts`, `tests/scripts/public-trust-smoke-script.test.ts`, and `tests/scripts/public-trust-ci-workflow.test.ts`.
- Gates: focused RSI-26 tests passed 10/7; `pnpm smoke:public-trust` passed with `unchanged=4` and `pixel failures=0`; `pnpm test` passed 834/263; `pnpm lint` passed with one existing warning; `pnpm build` passed.
- [x] RSI-27: Public trust per-page thresholds, OpenAPI docs example, and retention badge (Medium-severity remediation) — closed (proven)
- Owner: Trust + Platform + DevEx
- Scope: make visual drift review more precise with per-page thresholds, make `/api/public-trust/summary` discoverable in `/api/docs`, and snapshot-lock the public trust CI retention badge.
- Acceptance:
- `/datasets` uses a stricter pixel threshold (`0.005`) than Contact/Privacy/Terms (`0.02`).
- `/api/docs` renders `/api/public-trust/summary` examples from `/api/openapi`.
- `renderPublicTrustSmokeCiSummary` output matches `tests/fixtures/public-trust-summary/summary-snapshot.md`.
- Closed proof (2026-06-09):
- `src/services/public-trust-smoke-artifacts.ts`, `scripts/smoke-public-trust-pages.ts`, `src/contracts/zod/public-trust-summary.ts`, `src/services/openapi.ts`, `app/api/docs/route.ts`, and `src/services/public-trust-smoke-ci-summary.ts`.
- Tests: `tests/services/public-trust-smoke-artifacts.test.ts`, `tests/scripts/public-trust-smoke-script.test.ts`, `tests/services/public-trust-smoke-ci-summary.test.ts`, `tests/fixtures/public-trust-summary/summary-snapshot.md`, `tests/api/docs.test.ts`, and `tests/api/openapi.test.ts`.
- Gates: focused RSI-27 tests passed 13/9; `pnpm smoke:public-trust` passed with per-page thresholds, `unchanged=4`, and `pixel failures=0`; `pnpm test` passed 835/263; `pnpm lint` passed with one existing warning; `pnpm build` passed.
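The per-page threshold check from RSI-26 and RSI-27 might look roughly like the following. The thresholds `0.005` for `/datasets` and `0.02` for the other pages come from the acceptance criteria above; the function names and the assumption that both screenshots are already decoded into same-sized RGBA byte buffers are illustrative only.

```typescript
const DEFAULT_PIXEL_THRESHOLD = 0.02;
const PAGE_PIXEL_THRESHOLDS: Record<string, number> = {
  "/datasets": 0.005,
};

// Fraction of pixels whose RGBA bytes differ between two decoded screenshots.
// Assumes both buffers have identical dimensions (4 bytes per pixel).
function changedPixelRatio(previous: Uint8Array, latest: Uint8Array): number {
  if (previous.length !== latest.length || previous.length % 4 !== 0) {
    throw new Error("buffers must be same-sized RGBA data");
  }
  const pixelCount = previous.length / 4;
  if (pixelCount === 0) return 0;
  let changed = 0;
  for (let i = 0; i < previous.length; i += 4) {
    if (
      previous[i] !== latest[i] ||
      previous[i + 1] !== latest[i + 1] ||
      previous[i + 2] !== latest[i + 2] ||
      previous[i + 3] !== latest[i + 3]
    ) {
      changed += 1;
    }
  }
  return changed / pixelCount;
}

// A route fails only when its ratio strictly exceeds its threshold.
function exceedsPixelThreshold(route: string, ratio: number): boolean {
  const threshold = PAGE_PIXEL_THRESHOLDS[route] ?? DEFAULT_PIXEL_THRESHOLD;
  return ratio > threshold;
}
```

Under these thresholds, a changed-pixel ratio of `0.01` would fail `/datasets` but pass `/contact`, which is the asymmetry the stricter data-page threshold is meant to provide.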
- [x] RSI-28: JSON public trust threshold policy, agent-visible policy, and CI drift annotations (Medium-severity remediation) — closed (proven)
- Owner: Trust + Platform + DevEx
- Scope: move per-page public trust thresholds into JSON policy, expose the applied policy through the agent summary API, and warn reviewers when screenshots change under threshold.
- Acceptance:
- `scripts/smoke-public-trust-pages.ts` consumes `config/public-trust-smoke-policy.json` through `loadPublicTrustSmokePolicy`.
- `/api/public-trust/summary` includes `thresholdPolicy` with default and per-page thresholds.
- CI summary tooling emits GitHub warning annotations for changed screenshots whose pixel ratio remains under the configured threshold.
- Closed proof (2026-06-09):
- `config/public-trust-smoke-policy.json`, `src/services/public-trust-smoke-policy.ts`, `scripts/smoke-public-trust-pages.ts`, `src/services/public-trust-summary.ts`, `src/contracts/zod/public-trust-summary.ts`, and `src/services/public-trust-smoke-ci-summary.ts`.
- Tests: `tests/services/public-trust-smoke-policy.test.ts`, `tests/scripts/public-trust-smoke-script.test.ts`, `tests/api/public-trust-summary.test.ts`, `tests/services/public-trust-smoke-ci-summary.test.ts`, and `tests/scripts/public-trust-ci-workflow.test.ts`.
- Gates: focused RSI-28 tests passed 20/12; `pnpm smoke:public-trust` passed with JSON policy thresholds, `unchanged=4`, and `pixel failures=0`; `pnpm test` passed 838/264; `pnpm lint` passed with one existing warning; `pnpm build` passed.
- [x] RSI-29: Public trust reviewer rationale, schema rejection, and severity-grouped PR summaries (Medium-severity remediation) — closed (proven)
- Owner: Trust + Platform + DevEx
- Scope: make public trust review rationale explicit in policy/API payloads, reject unsupported future policy schema versions, and group under-threshold CI drift by route severity.
- Acceptance:
- Policy pages include `reviewSeverity`, `reviewerNote`, and `reasonCodes`.
- `/api/public-trust/summary` exposes reviewer rationale metadata in `thresholdPolicy.pages`.
- `loadPublicTrustSmokePolicy` rejects unsupported `schemaVersion: 2` fixtures.
- CI PR summary and warnings group changed-under-threshold screenshots by severity.
- Closed proof (2026-06-09):
- `config/public-trust-smoke-policy.json`, `src/services/public-trust-smoke-policy.ts`, `src/services/public-trust-summary.ts`, `src/contracts/zod/public-trust-summary.ts`, `scripts/smoke-public-trust-pages.ts`, and `src/services/public-trust-smoke-ci-summary.ts`.
- Tests: `tests/services/public-trust-smoke-policy.test.ts`, `tests/api/public-trust-summary.test.ts`, `tests/services/public-trust-smoke-ci-summary.test.ts`, and `tests/fixtures/public-trust-summary/summary-snapshot.md`.
- Gates: focused RSI-29 tests passed 18/11 plus CI-summary focused retest 2/1; `pnpm exec start-server-and-test "pnpm dev" http://localhost:3000 "pnpm smoke:public-trust"` passed with JSON policy metadata and `pixel failures=0`; `pnpm test` passed 839/264; `pnpm lint` passed with one existing warning; `pnpm build` passed.
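The severity-grouped warnings from RSI-28 and RSI-29 can be sketched with GitHub's `::warning` workflow command. The `ScreenshotDiff` shape, the function name, the annotation title, and the severity labels used in the usage note are assumptions for illustration; the exact annotation text is defined by `src/services/public-trust-smoke-ci-summary.ts` and its snapshot, not by this sketch.

```typescript
interface ScreenshotDiff {
  route: string;
  changed: boolean;
  pixelRatio: number;
  threshold: number;
  reviewSeverity: string; // taken from the policy page entry
}

// Emits one GitHub warning annotation per severity for screenshots that
// changed but stayed at or under their threshold. Unchanged screenshots and
// over-threshold failures are skipped here (failures are reported elsewhere).
function renderGroupedWarnings(diffs: ScreenshotDiff[]): string[] {
  const groups: Record<string, string[]> = {};
  for (const diff of diffs) {
    if (!diff.changed || diff.pixelRatio > diff.threshold) continue;
    const bucket = groups[diff.reviewSeverity] ?? [];
    bucket.push(`${diff.route} (${diff.pixelRatio} <= ${diff.threshold})`);
    groups[diff.reviewSeverity] = bucket;
  }
  return Object.keys(groups)
    .sort()
    .map(
      (severity) =>
        `::warning title=Public trust drift (${severity})::` +
        (groups[severity] ?? []).join(", ")
    );
}
```

Printing each returned line to standard output inside a workflow step would surface it as a warning annotation. With a `/datasets` diff marked `high` and two legal-page diffs marked `low`, the result is two annotations, one per severity.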
- [x] RSI-30: Owner/reviewer initials, schema v2 fixture, and grouped annotation snapshot (Medium-severity remediation) — closed (proven)
- Owner: Trust + Platform + DevEx
- Scope: make review ownership explicit per policy row, preserve a future v2 migration fixture, and snapshot exact grouped warning annotations.
- Acceptance:
- Policy pages include `ownerInitials` and `reviewerInitials`.
- `/api/public-trust/summary` exposes owner/reviewer initials in `thresholdPolicy.pages`.
- CI summary rows and grouped warning annotations include owner/reviewer initials.
- `tests/fixtures/public-trust-policy/schema-v2-planned.json` remains rejected until migration support lands.
- Closed proof (2026-06-09):
- `config/public-trust-smoke-policy.json`, `src/services/public-trust-smoke-policy.ts`, `src/services/public-trust-summary.ts`, `src/contracts/zod/public-trust-summary.ts`, `src/services/public-trust-smoke-ci-summary.ts`, and `scripts/smoke-public-trust-pages.ts`.
- Tests: `tests/fixtures/public-trust-policy/schema-v2-planned.json`, `tests/fixtures/public-trust-summary/grouped-annotations-snapshot.txt`, `tests/services/public-trust-smoke-policy.test.ts`, `tests/api/public-trust-summary.test.ts`, and `tests/services/public-trust-smoke-ci-summary.test.ts`.
- Gates: focused RSI-30 tests passed 15/10; `pnpm exec start-server-and-test "pnpm dev" http://localhost:3000 "pnpm smoke:public-trust"` passed with `unchanged=4` and `pixel failures=0`; the first full `pnpm test` exposed a transient `tests/api/artworks/by-id.test.ts` failure that passed on an isolated retest; the final `pnpm test` passed 839/264; `pnpm lint` passed with one existing warning; `pnpm build` passed.
Controls and mitigation plan
| Risk | Control | Owner | Evidence / check |
|---|---|---|---|
| Adapter and pipeline boundary drift | Static boundary contract test ensures adapter modules do not import each other (except allowed shared interfaces/utils) | Platform | `tests/contracts/provider-boundary-contracts.test.ts` |
| Adapter and pipeline boundary drift | AI-RSI evidence loop captures boundary checks each cycle | Engineering + AI-assisted operator | `docs/rsi-wiki.md`, `README.md`, `docs/roadmap.md`, `CLAUDE.md` |
| Adapter and pipeline boundary drift | Risk review in each close-out row for new provider/pipeline touchpoints | Platform owner | `CLAUDE.md` close-out log + section updates |
| UI journey automation | Role/provider matrix smoke plus entity-role route assertions for imported records | QA + Product | `scripts/smoke-explore-import-matrix.ts`, `package.json` (`smoke:explore:matrix`) |
| Low-severity complexity | Maintainability decomposition plan before future scale drift | Product + Platform | Planned owner list and decomposition evidence in this register + RSI close-out rows |
| AI eval freshness drift | Regression baseline includes `citationFreshness` and fail-fast freshness-drop policy | Platform + AI Reliability | `config/ai-eval-regression-policy.json`, `scripts/ai-eval-gate.ts`, `tests/services/ai-eval-harness.test.ts` |
| AI eval artifact visibility | CI summary badges, artifact upload, and freshness-aging pressure alerts | Platform + AI Reliability | `.github/workflows/ai-eval-gate.yml`, `src/services/ai-eval-artifacts.ts`, `tests/services/ai-eval-artifacts.test.ts` |
| AI eval dashboard visibility | Read-only dashboard for latest/trend eval artifacts and warnings | Platform + AI Reliability | `src/services/ai-eval-dashboard.ts`, `app/(workspace)/ai-evals/page.tsx`, `tests/services/ai-eval-dashboard.test.ts`, `tests/pages/ai-evals-page.test.ts` |
| AI eval artifact diff review | Latest-vs-previous dashboard diff with metric/aging deltas and review notes | Platform + AI Reliability | `src/services/ai-eval-dashboard.ts`, `app/(workspace)/ai-evals/page.tsx`, `tests/services/ai-eval-dashboard.test.ts`, `tests/pages/ai-evals-page.test.ts` |
| AI eval diff prioritization | Severity-labeled diff thresholds for `regression`, `watch`, `stable`, and `improved` triage | Platform + AI Reliability | `src/services/ai-eval-dashboard.ts`, `app/(workspace)/ai-evals/page.tsx`, `tests/services/ai-eval-dashboard.test.ts`, `tests/pages/ai-evals-page.test.ts` |
| AI eval priority discoverability | Config-driven severity policy, dashboard distribution, and CI review-priority summary | AI Reliability + Platform + DevEx | `config/ai-eval-regression-policy.json`, `src/services/ai-eval-severity.ts`, `src/services/ai-eval-artifacts.ts`, `scripts/ai-eval-gate.ts` |
| AI eval agent-summary reliability | Policy validation, agent JSON endpoint, and severity sparkline/history | AI Reliability + Platform + Agent Experience | `src/services/ai-eval-severity.ts`, `app/api/ai-evals/summary/route.ts`, `tests/api/ai-evals-summary.test.ts` |
| AI eval contract and CI annotation reliability | Schema-backed summary payload, configurable history window, and PR-visible priority annotations | AI Reliability + DevEx + Agent Experience | `src/contracts/zod/ai-eval-summary.ts`, `src/services/openapi.ts`, `scripts/ai-eval-gate.ts`, `tests/api/openapi.test.ts` |
| AI eval version and retention hygiene | Versioned summary contract, snapshot-documented annotations, and orphaned run pruning | AI Reliability + DevEx + Platform | `src/contracts/zod/ai-eval-summary.ts`, `src/services/ai-eval-artifacts.ts`, `tests/services/ai-eval-artifacts.test.ts` |
| AI eval migration and reporting visibility | Future schema migration notes, fixture compatibility, dry-run pruning reports, and CI summary status | AI Reliability + Platform + DevEx | `src/contracts/zod/ai-eval-summary.ts`, `src/services/ai-eval-artifacts.ts`, `tests/fixtures/ai-eval-summary/schema-v1-ready.json` |
| AI eval summary artifact/schema visibility | Snapshot-locked CI summary markdown, agent-visible pruning report, and OpenAPI migration compatibility table | DevEx + Platform + AI Reliability | `tests/fixtures/ai-eval-summary/summary-snapshot.md`, `src/services/ai-eval-dashboard.ts`, `src/services/openapi.ts`, `tests/api/openapi.test.ts` |
| Visual ETL Mapper AI-assist safety | Review-only mapping suggestions with contract validation, rationale, standards anchors, and unmapped-column diagnostics | AI Reliability + Curator Experience + Platform | `src/services/mapping-assist.ts`, `app/api/ai/mapping-assist/route.ts`, `tests/services/mapping-assist.test.ts`, `tests/api/ai-mapping-assist.test.ts` |
| Mapper-assist fixture/schema/importability drift | Golden tricky-column fixture, importable ReactFlow draft helper, and OpenAPI response component | AI Reliability + Curator UX + Platform | `tests/fixtures/mapping-assist/tricky-columns.json`, `src/utils/etl-mapper-assist.ts`, `src/contracts/zod/mapping-assist.ts`, `tests/api/openapi.test.ts` |
| Mapper-assist provider/browser/request-schema drift | Provider-family fixtures, browser smoke command, and OpenAPI request-body component | AI Reliability + Curator UX + Platform | `tests/fixtures/mapping-assist/provider-families.json`, `scripts/smoke-etl-mapper-assist.ts`, `src/contracts/zod/mapping-assist.ts`, `tests/api/openapi.test.ts` |
| Mapper-assist near-miss/docs/visual drift | Negative provider fixtures, imported-draft screenshot artifact, and `/api/docs` examples | AI Reliability + Curator UX + Platform | `tests/fixtures/mapping-assist/negative-provider-families.json`, `scripts/smoke-etl-mapper-assist.ts`, `app/api/docs/route.ts`, `tests/api/docs.test.ts` |
| Mapper-assist layout/example/confidence drift | Wrapped mapper actions, OpenAPI-sourced docs examples, and low-confidence diagnostics-only policy | Curator UX + Platform + AI Reliability | `app/globals.css`, `app/api/docs/route.ts`, `src/services/mapping-assist.ts`, `tests/components/etl-mapper-config.test.ts` |
| Public source/trust drift | Registry-backed public narrative, provenance-tracked imported assets, and rewritten trust pages | Platform + Curator UX + Trust | `src/services/public-source-narrative.ts`, `docs/asset-provenance.md`, `tests/pages/public-source-pages.test.ts` |
| Public source/trust drift | Agent summary API, asset license-review enum, and four-page screenshot smoke | Platform + Trust + Curator UX | `app/api/public-sources/summary/route.ts`, `src/contracts/zod/public-sources-summary.ts`, `scripts/smoke-public-trust-pages.ts` |
| Public source/trust drift | OpenAPI schema reference, checksum drift test, and screenshot latest/previous retention pruning | Platform + Trust + Curator UX | `src/services/openapi.ts`, `tests/api/openapi.test.ts`, `src/services/public-trust-smoke-artifacts.ts` |
| Public source/trust drift | OpenAPI-sourced docs examples, screenshot diff metadata, and CI summary/upload links | Platform + Trust + DevEx | `app/api/docs/route.ts`, `src/services/public-trust-smoke-ci-summary.ts`, `.github/workflows/public-trust-smoke.yml` |
| Public source/trust drift | Pixel-diff threshold failure, public trust summary API, and main CI artifact links | Trust + Platform + DevEx | `src/services/public-trust-smoke-artifacts.ts`, `app/api/public-trust/summary/route.ts`, `.github/workflows/ci.yml` |
| Public source/trust drift | Per-page pixel thresholds, OpenAPI-sourced trust summary example, and retention badge snapshot | Trust + Platform + DevEx | `scripts/smoke-public-trust-pages.ts`, `app/api/docs/route.ts`, `tests/fixtures/public-trust-summary/summary-snapshot.md` |
| Public source/trust drift | JSON threshold policy, agent-visible applied thresholds, and CI under-threshold drift warnings | Trust + Platform + DevEx | `config/public-trust-smoke-policy.json`, `app/api/public-trust/summary/route.ts`, `src/services/public-trust-smoke-ci-summary.ts` |
| Public source/trust drift | Reviewer rationale metadata, schema-version rejection, and severity-grouped PR summaries | Trust + Platform + DevEx | `config/public-trust-smoke-policy.json`, `src/services/public-trust-smoke-policy.ts`, `src/services/public-trust-smoke-ci-summary.ts` |
| Public source/trust drift | Owner/reviewer initials, future v2 migration fixture, and snapshot-locked warning annotations | Trust + Platform + DevEx | `config/public-trust-smoke-policy.json`, `tests/fixtures/public-trust-policy/schema-v2-planned.json`, `tests/fixtures/public-trust-summary/grouped-annotations-snapshot.txt` |
Next review
- Weekly RSI review pass
- Expand this register when new provider/pipeline slices are introduced