{"id":"ops/outbox-projector","relativePath":"ops/outbox-projector.md","title":"Transactional Outbox Projector (Postgres -> Solr/GraphDB)","markdown":"# Transactional Outbox Projector (Postgres -> Solr/GraphDB)\n\nThis runbook covers the transactional outbox gate for record write paths.\n\n## What is implemented\n\n- Record writes in Postgres mode (`/api/records` and all import routes using `createRecordsService.add`) now:\n  - upsert `storage_documents.records` and\n  - enqueue `outbox_events`\n  - in the same database transaction.\n- Outbox events use replay-safe idempotency keys:\n  - `sha256(\"record.upsert|recordId|sourceHash\")`\n- Projector worker claims pending events with `FOR UPDATE SKIP LOCKED`, retries with backoff, and dead-letters after max attempts.\n\n## Commands\n\n```bash\npnpm outbox:projector:once\npnpm outbox:projector\npnpm outbox:status\npnpm outbox:dlq:list\npnpm outbox:replay:dlq\npnpm outbox:requeue:stale\npnpm outbox:alert:check\n```\n\n`pnpm outbox:projector:once` runs a single batch.  \n`pnpm outbox:projector` drains multiple batches until empty (or batch cap reached).\n\nOperator commands:\n\n- `pnpm outbox:status` queue health counters + aging\n- `pnpm outbox:dlq:list` inspect dead-letter queue\n- `pnpm outbox:replay:dlq` replay dead-letter events back to pending\n- `pnpm outbox:requeue:stale` requeue stuck `processing` events\n- `pnpm outbox:alert:check` evaluate alert policy and optionally dispatch webhook\n\n## Required env\n\n- `DATABASE_URL` (Postgres)\n- `METAMUSEUM_STORAGE_MODE=postgres` for fully transactional write+enqueue\n- Optional projector targets:\n  - `OUTBOX_PROJECT_TO_SOLR=1` (default on)\n  - `OUTBOX_PROJECT_TO_GRAPHDB=1` (default on)\n  - `SOLR_BASE_URL`, `SOLR_CORE`, `SOLR_USERNAME`, `SOLR_PASSWORD`\n  - `GRAPHDB_BASE_URL`, `GRAPHDB_REPOSITORY_ID`, `GRAPHDB_USERNAME`, `GRAPHDB_PASSWORD`\n\nFailure policy + alert env:\n\n- `OUTBOX_MAX_ATTEMPTS` (default `8`)\n- `OUTBOX_REPLAY_BATCH_LIMIT` (default `100`)\n- `OUTBOX_DLQ_ALERT_THRESHOLD` (default `1`)\n- `OUTBOX_RETRY_ALERT_THRESHOLD` (default `50`)\n- `OUTBOX_PENDING_ALERT_THRESHOLD` (default `200`)\n- `OUTBOX_PENDING_MAX_AGE_MINUTES` (default `15`)\n- `OUTBOX_STALLED_PROCESSING_MAX_AGE_MINUTES` (default `10`)\n- `OUTBOX_ALERT_WEBHOOK_URL` (optional webhook target)\n- `OUTBOX_ALERT_FAIL_ON_ALERT=1` to fail CI/cron checks on threshold breach\n\n## Table schema\n\n`ops/postgres/init/02-outbox.sql` creates:\n\n- `outbox_events`\n- status/availability index\n- aggregate index\n\nStatuses:\n\n- `pending`\n- `processing`\n- `retry`\n- `processed`\n- `dead_letter`\n\n## Failure handling policy\n\n- Retry budget:\n  - attempts increment on claim (`pending/retry -> processing`)\n  - failures backoff exponentially (`2^attempt` seconds, capped at 300s)\n  - dead-letter transition at `attempt_count >= OUTBOX_MAX_ATTEMPTS`\n- Dead-letter queue:\n  - inspect with `pnpm outbox:dlq:list`\n  - replay with `pnpm outbox:replay:dlq`\n- Stalled worker recovery:\n  - `pnpm outbox:requeue:stale` unlocks long-running `processing` rows back to `retry`\n- Alerting:\n  - `pnpm outbox:alert:check` emits threshold alerts and can POST a webhook payload\n","sections":[{"level":2,"heading":"What is implemented","anchor":"what-is-implemented"},{"level":2,"heading":"Commands","anchor":"commands"},{"level":2,"heading":"Required env","anchor":"required-env"},{"level":2,"heading":"Table schema","anchor":"table-schema"},{"level":2,"heading":"Failure handling policy","anchor":"failure-handling-policy"}],"html":"<h1 id=\"transactional-outbox-projector-postgres-solr-graphdb\">Transactional Outbox Projector (Postgres -&gt; Solr/GraphDB)</h1>\n<p>This runbook covers the transactional outbox gate for record write paths.</p>\n<h2 id=\"what-is-implemented\">What is implemented</h2>\n<ul><li>Record writes in Postgres mode (`/api/records` and all import routes using `createRecordsService.add`) now:</li><li>upsert `storage_documents.records` and</li><li>enqueue `outbox_events`</li><li>in the same database transaction.</li><li>Outbox events use replay-safe idempotency keys:</li><li>`sha256(&quot;record.upsert|recordId|sourceHash&quot;)`</li><li>Projector worker claims pending events with `FOR UPDATE SKIP LOCKED`, retries with backoff, and dead-letters after max attempts.</li></ul>\n<h2 id=\"commands\">Commands</h2>\n<pre><code>\npnpm outbox:projector:once\npnpm outbox:projector\npnpm outbox:status\npnpm outbox:dlq:list\npnpm outbox:replay:dlq\npnpm outbox:requeue:stale\npnpm outbox:alert:check\n</code></pre>\n<p>`pnpm outbox:projector:once` runs a single batch.  </p>\n<p>`pnpm outbox:projector` drains multiple batches until empty (or batch cap reached).</p>\n<p>Operator commands:</p>\n<ul><li>`pnpm outbox:status` queue health counters + aging</li><li>`pnpm outbox:dlq:list` inspect dead-letter queue</li><li>`pnpm outbox:replay:dlq` replay dead-letter events back to pending</li><li>`pnpm outbox:requeue:stale` requeue stuck `processing` events</li><li>`pnpm outbox:alert:check` evaluate alert policy and optionally dispatch webhook</li></ul>\n<h2 id=\"required-env\">Required env</h2>\n<ul><li>`DATABASE_URL` (Postgres)</li><li>`METAMUSEUM_STORAGE_MODE=postgres` for fully transactional write+enqueue</li><li>Optional projector targets:</li><li>`OUTBOX_PROJECT_TO_SOLR=1` (default on)</li><li>`OUTBOX_PROJECT_TO_GRAPHDB=1` (default on)</li><li>`SOLR_BASE_URL`, `SOLR_CORE`, `SOLR_USERNAME`, `SOLR_PASSWORD`</li><li>`GRAPHDB_BASE_URL`, `GRAPHDB_REPOSITORY_ID`, `GRAPHDB_USERNAME`, `GRAPHDB_PASSWORD`</li></ul>\n<p>Failure policy + alert env:</p>\n<ul><li>`OUTBOX_MAX_ATTEMPTS` (default `8`)</li><li>`OUTBOX_REPLAY_BATCH_LIMIT` (default `100`)</li><li>`OUTBOX_DLQ_ALERT_THRESHOLD` (default `1`)</li><li>`OUTBOX_RETRY_ALERT_THRESHOLD` (default `50`)</li><li>`OUTBOX_PENDING_ALERT_THRESHOLD` (default `200`)</li><li>`OUTBOX_PENDING_MAX_AGE_MINUTES` (default `15`)</li><li>`OUTBOX_STALLED_PROCESSING_MAX_AGE_MINUTES` (default `10`)</li><li>`OUTBOX_ALERT_WEBHOOK_URL` (optional webhook target)</li><li>`OUTBOX_ALERT_FAIL_ON_ALERT=1` to fail CI/cron checks on threshold breach</li></ul>\n<h2 id=\"table-schema\">Table schema</h2>\n<p>`ops/postgres/init/02-outbox.sql` creates:</p>\n<ul><li>`outbox_events`</li><li>status/availability index</li><li>aggregate index</li></ul>\n<p>Statuses:</p>\n<ul><li>`pending`</li><li>`processing`</li><li>`retry`</li><li>`processed`</li><li>`dead_letter`</li></ul>\n<h2 id=\"failure-handling-policy\">Failure handling policy</h2>\n<ul><li>Retry budget:</li><li>attempts increment on claim (`pending/retry -&gt; processing`)</li><li>failures backoff exponentially (`2^attempt` seconds, capped at 300s)</li><li>dead-letter transition at `attempt_count &gt;= OUTBOX_MAX_ATTEMPTS`</li><li>Dead-letter queue:</li><li>inspect with `pnpm outbox:dlq:list`</li><li>replay with `pnpm outbox:replay:dlq`</li><li>Stalled worker recovery:</li><li>`pnpm outbox:requeue:stale` unlocks long-running `processing` rows back to `retry`</li><li>Alerting:</li><li>`pnpm outbox:alert:check` emits threshold alerts and can POST a webhook payload</li></ul>","updatedAt":"2018-10-20T01:46:40.000Z","checksum":"dc70ad76647111ba6e8a60c39c45c61006fc8a54a8261298908117ae888fc3fb","checksumPrefix":"dc70ad766471","anchorCount":5,"lineCount":89,"rawUrl":"/api/docs/content?path=ops%2Foutbox-projector.md","htmlUrl":"/docs?doc=ops%2Foutbox-projector.md","apiUrl":"/api/docs/content?path=ops%2Foutbox-projector.md"}