{"id":"providers/rkd-knowledge-graph","relativePath":"providers/rkd-knowledge-graph.md","title":"RKD Knowledge Graph Integration Plan","markdown":"# RKD Knowledge Graph Integration Plan\n\nStatus: Planned (next B5 provider slice)  \nUpdated: May 30, 2026\n\n## Source\n\n- Dataset page: <https://rkd.triply.cc/rkd/RKD-Knowledge-Graph>\n- Services page: <https://rkd.triply.cc/rkd/RKD-Knowledge-Graph/services>\n- Triply SPARQL API reference: <https://docs.triply.cc/triply-api/>\n\nAs of May 30, 2026, the public dataset reports 600M+ statements and indicates CIDOC-CRM + Linked Art oriented modeling.\n\n## Why this provider\n\n- Strong Linked Data fit for Meta Museum’s event-centric model.\n- High-value coverage in Netherlandish art and archival/provenance-rich domains.\n- Natural complement to Met, Getty, and Rijks.\n\n## Integration goals\n\n- Add RKD as a first-class provider through `provider-interface`.\n- Keep Linked Art JSON-LD canonical in storage and import flows.\n- Preserve attribution + license metadata at ingest and display boundaries.\n- Keep runtime safe at scale (bounded/paged queries only).\n\n## Proposed implementation slice\n\n### Adapter\n\n- `src/adapters/rkd.ts`\n  - provider descriptor/profile\n  - URI parsing/normalization helpers\n  - candidate extraction from SPARQL/search payloads\n  - normalization into `SourceRecord` + `Artwork` boundary DTOs\n\n### Routes\n\n- `GET /api/rkd/profile`\n- `POST /api/rkd/search`\n- `POST /api/rkd/entity`\n- `POST /api/rkd/import`\n- Optional `POST /api/rkd/sparql` (read-only, allowlisted templates only)\n\n### UI\n\n- Add `rkd` source toggle to `/explore`.\n- Show provider/license attribution consistently in cards/details/import output.\n\n## Configuration model\n\nUse environment-driven endpoint config to avoid hardcoding service assumptions:\n\n- `RKD_TRIPLY_ACCOUNT` (default: `rkd`)\n- `RKD_TRIPLY_DATASET` (default: `RKD-Knowledge-Graph`)\n- `RKD_SPARQL_SERVICE` (for example `speedy` or configured service name)\n- `RKD_SPARQL_ENDPOINT` (explicit override, if provided)\n- `RKD_TRIPLY_INSTANCE` (default: `triplydb.com`)\n- `RKD_TRIPLY_TOKEN` (optional; required for private/internal datasets or write-capable service management)\n- `RKD_USE_SAVED_QUERY_API` (`true` by default in production)\n- `RKD_SAVED_QUERY_BASE` (optional explicit saved-query API base override)\n\nResolution order:\n1. `RKD_SPARQL_ENDPOINT` explicit override\n2. computed Triply endpoint from account/dataset/service\n\nComputed endpoint patterns (from official Triply API docs):\n\n- Dataset API:\n  - `https://api.INSTANCE/datasets/ACCOUNT/DATASET/`\n- Service API:\n  - `https://api.INSTANCE/datasets/ACCOUNT/DATASET/services/SERVICE/`\n- SPARQL endpoint:\n  - `https://api.INSTANCE/datasets/ACCOUNT/DATASET/services/SERVICE/sparql`\n- Linked-data export:\n  - `https://api.INSTANCE/datasets/ACCOUNT/DATASET/download`\n- Graph list:\n  - `https://api.INSTANCE/datasets/ACCOUNT/DATASET/graphs`\n- IRI describe:\n  - `https://api.INSTANCE/datasets/ACCOUNT/DATASET/describe.nt?resource=RESOURCE`\n\n## Authentication and security\n\nFrom official Triply API guidance:\n\n- Public datasets: most reads can be done without auth.\n- Internal/private datasets (and write operations): require bearer token.\n- Header format:\n  - `Authorization: Bearer TOKEN`\n- Token handling requirements:\n  - never commit tokens to git\n  - never share tokens outside authorized operators\n  - rotate tokens regularly or after suspected compromise\n\nProject policy:\n\n- `RKD_TRIPLY_TOKEN` must come from environment/secret manager only.\n- No token logging in route handler errors.\n- Tests must stub auth headers and never use real secrets.\n\n## Scale and safety guardrails\n\n- Enforce small page windows (`limit`, `offset` / cursor) and max caps.\n- Timebox upstream calls and retry with backoff.\n- Reject mutation SPARQL (`INSERT`, `DELETE`, `LOAD`, `CLEAR`, etc.).\n- Prefer allowlisted query templates with controlled variables.\n- Add cache headers/internal memoization for repeated lookups.\n- Default to saved query APIs for production retrieval/pagination; use raw SPARQL only for controlled admin/research paths.\n\n## SPARQL protocol + formats (official)\n\nTriply supports the SPARQL 1.1 query protocol with:\n\n- GET query (`query` parameter) for small requests\n- POST urlencoded (`application/x-www-form-urlencoded`)\n- POST direct (`application/sparql-query`) for larger/custom requests\n\nImplementation preference:\n\n1. Saved queries for production flows\n2. Direct POST for ad hoc long queries\n3. GET only for tiny diagnostics\n\nResult formats (via `Accept` header or suffix) include:\n\n- `application/json`, `application/sparql-results+json`, `text/csv`, `text/tab-separated-values`\n- `application/ld+json`, `application/n-triples`, `application/n-quads`, `application/trig`, `text/turtle`\n\nFor Linked Art integration routes, prefer:\n\n- `application/ld+json` / `application/n-triples` for graph materialization paths\n- `application/sparql-results+json` for tabular SELECT/ASK result handling\n\n## Linked-data serialization handling\n\nTriply linked-data export supports:\n\n- TriG (`application/trig`)\n- N-Triples (`application/n-triples`)\n- N-Quads (`application/n-quads`)\n- Turtle (`text/turtle`)\n- JSON-LD (`application/ld+json`)\n\nMeta Museum adapter behavior:\n\n- ingest using JSON-LD or N-Triples/N-Quads depending on endpoint behavior\n- normalize to canonical Linked Art JSON-LD at storage boundary\n- preserve source serialization metadata in `_source` diagnostics where useful\n\n## Standards and conformance mapping\n\nAll RKD provider PRs must include Standards Mapping with round + fixture anchors from:\n\n- object + provenance rounds\n- shared structures rounds\n- schema rounds (as applicable)\n- search/protocol rounds (71+)\n\nRequired conformance checks:\n\n- event-centric modeling preserved (no object-person shortcut flattening)\n- ownership vs custody distinction preserved\n- ambiguous transfers mapped conservatively (`Transfer`)\n- carrier/content separation preserved (`HumanMadeObject`/`DigitalObject` vs `VisualItem`/`LinguisticObject`)\n- protocol/profile checks (B8): context/profile negotiation, CORS/OPTIONS, URI opacity, array cardinality safety\n- Triply transport checks: auth behavior for public/private access modes, controlled media-type negotiation, safe fallback when requested serialization is unavailable\n\n## License and attribution handling\n\n- Dataset license: Open Data Commons Attribution License 1.0 (ODC-By 1.0).\n- Persist and render source attribution and license metadata with imported records.\n- Keep reuse status conservative when image rights are not explicit.\n\n## Test plan (failing-first)\n\n- `tests/adapters/rkd.test.ts`\n- `tests/api/rkd/profile.test.ts`\n- `tests/api/rkd/search.test.ts`\n- `tests/api/rkd/entity.test.ts`\n- `tests/api/rkd/import.test.ts`\n- optional `tests/api/rkd/sparql.test.ts`\n- extend `tests/quality/protocol-conformance.test.ts` with RKD route coverage once routes land\n- add `tests/api/rkd/auth.test.ts` covering:\n  - public-read unauthenticated path\n  - bearer token forwarding path\n  - missing/invalid token behavior for protected operations\n- add `tests/api/rkd/serialization.test.ts` covering Accept-header negotiation and fallback behavior\n\n## Exit criteria for RKD slice\n\n- Routes and adapter shipped with green tests.\n- `/explore` can discover + import RKD records.\n- Standards Mapping note included in PR.\n- B8 protocol checks implemented for RKD endpoints.\n","sections":[{"level":2,"heading":"Source","anchor":"source"},{"level":2,"heading":"Why this provider","anchor":"why-this-provider"},{"level":2,"heading":"Integration goals","anchor":"integration-goals"},{"level":2,"heading":"Proposed implementation slice","anchor":"proposed-implementation-slice"},{"level":3,"heading":"Adapter","anchor":"adapter"},{"level":3,"heading":"Routes","anchor":"routes"},{"level":3,"heading":"UI","anchor":"ui"},{"level":2,"heading":"Configuration model","anchor":"configuration-model"},{"level":2,"heading":"Authentication and security","anchor":"authentication-and-security"},{"level":2,"heading":"Scale and safety guardrails","anchor":"scale-and-safety-guardrails"},{"level":2,"heading":"SPARQL protocol + formats (official)","anchor":"sparql-protocol-formats-official"},{"level":2,"heading":"Linked-data serialization handling","anchor":"linked-data-serialization-handling"},{"level":2,"heading":"Standards and conformance mapping","anchor":"standards-and-conformance-mapping"},{"level":2,"heading":"License and attribution handling","anchor":"license-and-attribution-handling"},{"level":2,"heading":"Test plan (failing-first)","anchor":"test-plan-failing-first"},{"level":2,"heading":"Exit criteria for RKD slice","anchor":"exit-criteria-for-rkd-slice"}],"html":"<h1 id=\"rkd-knowledge-graph-integration-plan\">RKD Knowledge Graph Integration Plan</h1>\n<p>Status: Planned (next B5 provider slice)  </p>\n<p>Updated: May 30, 2026</p>\n<h2 id=\"source\">Source</h2>\n<ul><li>Dataset page: &lt;https://rkd.triply.cc/rkd/RKD-Knowledge-Graph&gt;</li><li>Services page: &lt;https://rkd.triply.cc/rkd/RKD-Knowledge-Graph/services&gt;</li><li>Triply SPARQL API reference: &lt;https://docs.triply.cc/triply-api/&gt;</li></ul>\n<p>As of May 30, 2026, the public dataset reports 600M+ statements and indicates CIDOC-CRM + Linked Art oriented modeling.</p>\n<h2 id=\"why-this-provider\">Why this provider</h2>\n<ul><li>Strong Linked Data fit for Meta Museum’s event-centric model.</li><li>High-value coverage in Netherlandish art and archival/provenance-rich domains.</li><li>Natural complement to Met, Getty, and Rijks.</li></ul>\n<h2 id=\"integration-goals\">Integration goals</h2>\n<ul><li>Add RKD as a first-class provider through `provider-interface`.</li><li>Keep Linked Art JSON-LD canonical in storage and import flows.</li><li>Preserve attribution + license metadata at ingest and display boundaries.</li><li>Keep runtime safe at scale (bounded/paged queries only).</li></ul>\n<h2 id=\"proposed-implementation-slice\">Proposed implementation slice</h2>\n<h3 id=\"adapter\">Adapter</h3>\n<ul><li>`src/adapters/rkd.ts`</li><li>provider descriptor/profile</li><li>URI parsing/normalization helpers</li><li>candidate extraction from SPARQL/search payloads</li><li>normalization into `SourceRecord` + `Artwork` boundary DTOs</li></ul>\n<h3 id=\"routes\">Routes</h3>\n<ul><li>`GET /api/rkd/profile`</li><li>`POST /api/rkd/search`</li><li>`POST /api/rkd/entity`</li><li>`POST /api/rkd/import`</li><li>Optional `POST /api/rkd/sparql` (read-only, allowlisted templates only)</li></ul>\n<h3 id=\"ui\">UI</h3>\n<ul><li>Add `rkd` source toggle to `/explore`.</li><li>Show provider/license attribution consistently in cards/details/import output.</li></ul>\n<h2 id=\"configuration-model\">Configuration model</h2>\n<p>Use environment-driven endpoint config to avoid hardcoding service assumptions:</p>\n<ul><li>`RKD_TRIPLY_ACCOUNT` (default: `rkd`)</li><li>`RKD_TRIPLY_DATASET` (default: `RKD-Knowledge-Graph`)</li><li>`RKD_SPARQL_SERVICE` (for example `speedy` or configured service name)</li><li>`RKD_SPARQL_ENDPOINT` (explicit override, if provided)</li><li>`RKD_TRIPLY_INSTANCE` (default: `triplydb.com`)</li><li>`RKD_TRIPLY_TOKEN` (optional; required for private/internal datasets or write-capable service management)</li><li>`RKD_USE_SAVED_QUERY_API` (`true` by default in production)</li><li>`RKD_SAVED_QUERY_BASE` (optional explicit saved-query API base override)</li></ul>\n<p>Resolution order:</p>\n<ol><li>`RKD_SPARQL_ENDPOINT` explicit override</li></ol>\n<ol><li>computed Triply endpoint from account/dataset/service</li></ol>\n<p>Computed endpoint patterns (from official Triply API docs):</p>\n<ul><li>Dataset API:</li><li>`https://api.INSTANCE/datasets/ACCOUNT/DATASET/`</li><li>Service API:</li><li>`https://api.INSTANCE/datasets/ACCOUNT/DATASET/services/SERVICE/`</li><li>SPARQL endpoint:</li><li>`https://api.INSTANCE/datasets/ACCOUNT/DATASET/services/SERVICE/sparql`</li><li>Linked-data export:</li><li>`https://api.INSTANCE/datasets/ACCOUNT/DATASET/download`</li><li>Graph list:</li><li>`https://api.INSTANCE/datasets/ACCOUNT/DATASET/graphs`</li><li>IRI describe:</li><li>`https://api.INSTANCE/datasets/ACCOUNT/DATASET/describe.nt?resource=RESOURCE`</li></ul>\n<h2 id=\"authentication-and-security\">Authentication and security</h2>\n<p>From official Triply API guidance:</p>\n<ul><li>Public datasets: most reads can be done without auth.</li><li>Internal/private datasets (and write operations): require bearer token.</li><li>Header format:</li><li>`Authorization: Bearer TOKEN`</li><li>Token handling requirements:</li><li>never commit tokens to git</li><li>never share tokens outside authorized operators</li><li>rotate tokens regularly or after suspected compromise</li></ul>\n<p>Project policy:</p>\n<ul><li>`RKD_TRIPLY_TOKEN` must come from environment/secret manager only.</li><li>No token logging in route handler errors.</li><li>Tests must stub auth headers and never use real secrets.</li></ul>\n<h2 id=\"scale-and-safety-guardrails\">Scale and safety guardrails</h2>\n<ul><li>Enforce small page windows (`limit`, `offset` / cursor) and max caps.</li><li>Timebox upstream calls and retry with backoff.</li><li>Reject mutation SPARQL (`INSERT`, `DELETE`, `LOAD`, `CLEAR`, etc.).</li><li>Prefer allowlisted query templates with controlled variables.</li><li>Add cache headers/internal memoization for repeated lookups.</li><li>Default to saved query APIs for production retrieval/pagination; use raw SPARQL only for controlled admin/research paths.</li></ul>\n<h2 id=\"sparql-protocol-formats-official\">SPARQL protocol + formats (official)</h2>\n<p>Triply supports the SPARQL 1.1 query protocol with:</p>\n<ul><li>GET query (`query` parameter) for small requests</li><li>POST urlencoded (`application/x-www-form-urlencoded`)</li><li>POST direct (`application/sparql-query`) for larger/custom requests</li></ul>\n<p>Implementation preference:</p>\n<ol><li>Saved queries for production flows</li></ol>\n<ol><li>Direct POST for ad hoc long queries</li></ol>\n<ol><li>GET only for tiny diagnostics</li></ol>\n<p>Result formats (via `Accept` header or suffix) include:</p>\n<ul><li>`application/json`, `application/sparql-results+json`, `text/csv`, `text/tab-separated-values`</li><li>`application/ld+json`, `application/n-triples`, `application/n-quads`, `application/trig`, `text/turtle`</li></ul>\n<p>For Linked Art integration routes, prefer:</p>\n<ul><li>`application/ld+json` / `application/n-triples` for graph materialization paths</li><li>`application/sparql-results+json` for tabular SELECT/ASK result handling</li></ul>\n<h2 id=\"linked-data-serialization-handling\">Linked-data serialization handling</h2>\n<p>Triply linked-data export supports:</p>\n<ul><li>TriG (`application/trig`)</li><li>N-Triples (`application/n-triples`)</li><li>N-Quads (`application/n-quads`)</li><li>Turtle (`text/turtle`)</li><li>JSON-LD (`application/ld+json`)</li></ul>\n<p>Meta Museum adapter behavior:</p>\n<ul><li>ingest using JSON-LD or N-Triples/N-Quads depending on endpoint behavior</li><li>normalize to canonical Linked Art JSON-LD at storage boundary</li><li>preserve source serialization metadata in `_source` diagnostics where useful</li></ul>\n<h2 id=\"standards-and-conformance-mapping\">Standards and conformance mapping</h2>\n<p>All RKD provider PRs must include Standards Mapping with round + fixture anchors from:</p>\n<ul><li>object + provenance rounds</li><li>shared structures rounds</li><li>schema rounds (as applicable)</li><li>search/protocol rounds (71+)</li></ul>\n<p>Required conformance checks:</p>\n<ul><li>event-centric modeling preserved (no object-person shortcut flattening)</li><li>ownership vs custody distinction preserved</li><li>ambiguous transfers mapped conservatively (`Transfer`)</li><li>carrier/content separation preserved (`HumanMadeObject`/`DigitalObject` vs `VisualItem`/`LinguisticObject`)</li><li>protocol/profile checks (B8): context/profile negotiation, CORS/OPTIONS, URI opacity, array cardinality safety</li><li>Triply transport checks: auth behavior for public/private access modes, controlled media-type negotiation, safe fallback when requested serialization is unavailable</li></ul>\n<h2 id=\"license-and-attribution-handling\">License and attribution handling</h2>\n<ul><li>Dataset license: Open Data Commons Attribution License 1.0 (ODC-By 1.0).</li><li>Persist and render source attribution and license metadata with imported records.</li><li>Keep reuse status conservative when image rights are not explicit.</li></ul>\n<h2 id=\"test-plan-failing-first\">Test plan (failing-first)</h2>\n<ul><li>`tests/adapters/rkd.test.ts`</li><li>`tests/api/rkd/profile.test.ts`</li><li>`tests/api/rkd/search.test.ts`</li><li>`tests/api/rkd/entity.test.ts`</li><li>`tests/api/rkd/import.test.ts`</li><li>optional `tests/api/rkd/sparql.test.ts`</li><li>extend `tests/quality/protocol-conformance.test.ts` with RKD route coverage once routes land</li><li>add `tests/api/rkd/auth.test.ts` covering:</li><li>public-read unauthenticated path</li><li>bearer token forwarding path</li><li>missing/invalid token behavior for protected operations</li><li>add `tests/api/rkd/serialization.test.ts` covering Accept-header negotiation and fallback behavior</li></ul>\n<h2 id=\"exit-criteria-for-rkd-slice\">Exit criteria for RKD slice</h2>\n<ul><li>Routes and adapter shipped with green tests.</li><li>`/explore` can discover + import RKD records.</li><li>Standards Mapping note included in PR.</li><li>B8 protocol checks implemented for RKD endpoints.</li></ul>","updatedAt":"2018-10-20T01:46:40.000Z","checksum":"2b4b42f2ad4217ec5b67c38e43f7aff6322130f896e4e1ac0d1a5723f06106cf","checksumPrefix":"2b4b42f2ad42","anchorCount":16,"lineCount":195,"rawUrl":"/api/docs/content?path=providers%2Frkd-knowledge-graph.md","htmlUrl":"/docs?doc=providers%2Frkd-knowledge-graph.md","apiUrl":"/api/docs/content?path=providers%2Frkd-knowledge-graph.md"}