Linked Open Art Data Web App (AI) — Must-have Data Sources
Purpose: a curated shortlist of high-leverage open / machine-readable sources to power a linked open art data app (entity resolution, enrichment, search, recommendations).
Recommended pattern (why this stack works)
- Use a backbone entity graph for identity + disambiguation.
- Use authority vocabularies for normalization (names, places, materials, techniques, concepts).
- Add institutional collection APIs for canonical object records, images, provenance, exhibitions.
- Normalize all ingested data into a common internal model (recommended: [[Linked Art]]-style modeling).
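As a rough illustration of the last point, the sketch below maps a flat internal record into a Linked Art-style JSON-LD object. The input field names (`uri`, `title`, `creator_uri`, `creator_name`, `year_earliest`, `year_latest`) and the `example.org` URIs are hypothetical; the output keys follow the Linked Art conventions as commonly documented, but should be checked against the current Linked Art model before use.

```python
import json

# Context URL as published by the Linked Art project (verify before relying on it).
LINKED_ART_CONTEXT = "https://linked.art/ns/v1/linked-art.json"


def to_linked_art_object(record: dict) -> dict:
    """Map a flat, hypothetical internal record to a Linked Art-style object."""
    obj = {
        "@context": LINKED_ART_CONTEXT,
        "id": record["uri"],
        "type": "HumanMadeObject",
        "_label": record["title"],
        "identified_by": [{"type": "Name", "content": record["title"]}],
    }

    production = {"type": "Production"}
    if record.get("creator_uri"):
        production["carried_out_by"] = [{
            "id": record["creator_uri"],
            "type": "Person",
            "_label": record.get("creator_name", ""),
        }]
    earliest = record.get("year_earliest")
    latest = record.get("year_latest")
    if earliest is not None and latest is not None:
        # Uncertain dates are expressed as outer bounds of a time span.
        production["timespan"] = {
            "type": "TimeSpan",
            "begin_of_the_begin": f"{earliest:04d}-01-01T00:00:00Z",
            "end_of_the_end": f"{latest:04d}-12-31T23:59:59Z",
        }
    # Only attach the production block when it carries some information.
    if len(production) > 1:
        obj["produced_by"] = production
    return obj


example = to_linked_art_object({
    "uri": "https://example.org/object/1",        # hypothetical URI
    "title": "Example Painting",
    "creator_uri": "https://example.org/person/1",  # hypothetical URI
    "creator_name": "Example Artist",
    "year_earliest": 1650,
    "year_latest": 1655,
})
print(json.dumps(example, indent=2))
```

Keeping the mapping in one function per source makes it easier to compare how each institution's fields land in the shared model.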
Top 10 prioritized sources
Core art sources
- [[Wikidata]]
  - Best for: global entity backbone (artists, artworks, movements, places, institutions)
  - Strength: broad linked-data connectivity; identifiers to many external catalogs
  - Use in product: entity graph + join keys + disambiguation + “sameAs” expansion
- [[Europeana]]
  - Best for: aggregation/discovery across European cultural heritage
  - Strength: Search/Record/Entity APIs for discovery + entity matching
  - Use in product: broad coverage, cross-institution linking, discovery funnels
- [[Rijksmuseum]] (Rijksmuseum Data)
  - Best for: high-quality museum metadata with Europeana-aligned structures
  - Strength: open museum dataset + APIs/downloads; strong object/creator metadata
  - Use in product: collection depth, high-trust records for training/eval
- [[The Met Collection API]] (Metropolitan Museum of Art)
  - Best for: broad Open Access artworks + images
  - Strength: popular and well-documented; good for rapid prototyping
  - Use in product: canonical object pages, image-based features, recommendations
- [[Art Institute of Chicago API]]
  - Best for: unified public API spanning collection + related content
  - Strength: consistent API surface; strong for integrating “content around art”
  - Use in product: object + interpretive content; richer on-site experiences
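To make the Wikidata "join keys" idea from the list above more concrete, here is a minimal sketch that builds a SPARQL query asking for a few external identifiers of one entity. It only constructs the query and request URL; it does not call the network. The choice of properties (P245 for Getty ULAN ID, P214 for VIAF ID) is an assumption about which join keys matter, and `Q5598` is used purely as an example QID.

```python
from urllib.parse import urlencode

WIKIDATA_SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"

# Wikidata properties treated as join keys here (an illustrative selection).
JOIN_KEY_PROPERTIES = {
    "ulan": "P245",  # Getty ULAN ID
    "viaf": "P214",  # VIAF ID
}


def build_join_key_query(qid: str) -> str:
    """Return a SPARQL query selecting external identifiers for one entity."""
    if not (qid.startswith("Q") and qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata QID: {qid!r}")
    # OPTIONAL keeps the row even when an identifier is missing.
    optionals = "\n".join(
        f"  OPTIONAL {{ wd:{qid} wdt:{pid} ?{name} . }}"
        for name, pid in JOIN_KEY_PROPERTIES.items()
    )
    select_vars = " ".join(f"?{name}" for name in JOIN_KEY_PROPERTIES)
    return f"SELECT {select_vars} WHERE {{\n{optionals}\n}}"


def build_request_url(qid: str) -> str:
    """Return a GET URL for the query, asking for JSON results."""
    params = {"query": build_join_key_query(qid), "format": "json"}
    return WIKIDATA_SPARQL_ENDPOINT + "?" + urlencode(params)


query = build_join_key_query("Q5598")  # example QID only
print(query)
print(build_request_url("Q5598"))
```

In practice the returned identifiers would be stored alongside the entity so that records from museum APIs carrying the same ULAN or VIAF ID can be joined without string matching.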
High-value linked-data & standards
- [[Getty Vocabularies]]
  - Best for: authority control (names/places/materials/techniques/concepts)
  - Strength: normalization + disambiguation; improves entity linking quality
  - Use in product: canonical labels, multilingual expansion, semantic search facets
- [[Linked Art]] (data model ecosystem)
  - Best for: interoperability standard for art data publishing/integration
  - Strength: consistent graph model; reduces bespoke mapping per museum
  - Use in product: internal canonical schema + export format
- [[Smithsonian American Art Museum LOD]] (SAAM)
  - Best for: stable URIs for artists/objects; strong American art coverage
  - Strength: linked open data with resolvable identifiers
  - Use in product: graph linking, authority-style references, provenance context
- [[Tate collection data]]
  - Best for: open collection metadata around artists and artworks
  - Strength: good coverage; useful for cross-museum linking
  - Use in product: UK/modern art depth; complementary to Europeana + Wikidata
- [[American Art Collaborative]] (and similar museum consortia LOD)
  - Best for: cross-museum linking beyond a single institution
  - Strength: multi-institution graph context; increases match rates + coverage
  - Use in product: broader graph + better recommendations across collections
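The normalization role of the Getty Vocabularies noted above could look roughly like the following: free-text medium strings are cleaned up, split, and looked up in a small local table of AAT concept IDs. The table is a hand-made illustration, not an export from Getty; the two AAT IDs shown (oil paint, canvas) and the treatment of "oil" as shorthand for oil paint are assumptions that should be verified against the Getty service before use.

```python
import unicodedata
from typing import Optional

AAT_BASE = "http://vocab.getty.edu/aat/"

# Illustrative lookup table (assumption): verify every ID against the AAT.
MATERIAL_TO_AAT = {
    "oil paint": "300015050",
    "oil": "300015050",      # shorthand as it often appears in medium strings
    "canvas": "300014078",
}


def normalize_term(raw: str) -> str:
    """Case-fold, apply Unicode NFKC, and collapse whitespace."""
    text = unicodedata.normalize("NFKC", raw).casefold()
    return " ".join(text.split())


def split_medium(medium: str) -> list[str]:
    """Split a simple 'X on Y' medium string into its parts."""
    normalized = normalize_term(medium)
    return [part.strip() for part in normalized.split(" on ") if part.strip()]


def material_uri(raw: str) -> Optional[str]:
    """Return the AAT URI for a known term, or None when unmapped."""
    aat_id = MATERIAL_TO_AAT.get(normalize_term(raw))
    return AAT_BASE + aat_id if aat_id else None


parts = split_medium("  Oil on  Canvas ")
uris = [material_uri(part) for part in parts]
print(parts, uris)
```

Returning `None` for unmapped terms, rather than guessing, keeps a visible queue of strings that still need an authority match.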
Practical build order (implementation sequence)
1) [[Wikidata]] for the backbone entity graph + disambiguation.
2) [[Getty Vocabularies]] for controlled terms / normalization.
3) [[Europeana]] for broad aggregation + entity matching.
4) Add “depth” via: [[Rijksmuseum]], [[The Met Collection API]], [[Art Institute of Chicago API]], [[Smithsonian American Art Museum LOD]], [[Tate collection data]].
5) Adopt [[Linked Art]] as the canonical internal model and map sources into it.
Notes for AI-powered features (what to extract/standardize)
- Identity: stable IDs, sameAs links, external identifiers
- Strings to normalize: person names, place names, titles, materials, techniques
- Time: production dates (with uncertainty), life dates, period styles
- Relationships: creator ↔ work, work ↔ movement, work ↔ place, work ↔ institution
- Assets: image URLs, IIIF manifests (when available), rights statements
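For the "production dates (with uncertainty)" item above, one possible approach is to store every date as an earliest/latest year pair plus an "approximate" flag. The sketch below parses a few common display-date patterns; the supported patterns and the five-year margin applied to "circa" dates are assumptions chosen for illustration, and real collection data will need many more cases.

```python
import re
from dataclasses import dataclass
from typing import Optional

CIRCA_MARGIN = 5  # assumption: years added on each side of a "circa" date


@dataclass(frozen=True)
class YearRange:
    earliest: int
    latest: int
    approximate: bool = False


_CIRCA = re.compile(r"^(?:c\.|ca\.|circa)\s*", re.IGNORECASE)
_RANGE = re.compile(r"^(\d{3,4})\s*[-\u2013]\s*(\d{3,4})$")  # hyphen or en dash
_SINGLE = re.compile(r"^(\d{3,4})$")


def parse_year_range(text: str) -> Optional[YearRange]:
    """Parse '1650', '1650-1655', or 'c. 1650'; return None if unrecognized."""
    cleaned = text.strip()
    approximate = bool(_CIRCA.match(cleaned))
    cleaned = _CIRCA.sub("", cleaned)
    margin = CIRCA_MARGIN if approximate else 0

    match = _RANGE.match(cleaned)
    if match:
        start, end = int(match.group(1)), int(match.group(2))
        if start > end:
            return None  # reject reversed ranges instead of guessing
        return YearRange(start - margin, end + margin, approximate)

    match = _SINGLE.match(cleaned)
    if match:
        year = int(match.group(1))
        return YearRange(year - margin, year + margin, approximate)

    return None


for sample in ["c. 1650", "1650-1655", "undated"]:
    print(sample, "->", parse_year_range(sample))
```

Unparsed strings return `None` so the original display date can be kept verbatim and reviewed later.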
Follow-ups / next steps
- Create individual notes per source with: endpoints, rate limits, licensing, identifier strategy, mapping notes to [[Linked Art]].
- Define internal entity types (Artist, Artwork, Place, Movement, Institution, Material, Technique) and linking rules.
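A starting point for the entity types and linking rules might look like the sketch below. The entity types are the ones listed above; the `Entity` fields, the linking rule (same type plus at least one shared external identifier), and the identifier values are hypothetical placeholders. Clustering uses a simple union-find with pairwise comparison, which is fine for small batches but would need blocking or indexing at scale.

```python
from dataclasses import dataclass, field
from enum import Enum


class EntityType(Enum):
    ARTIST = "Artist"
    ARTWORK = "Artwork"
    PLACE = "Place"
    MOVEMENT = "Movement"
    INSTITUTION = "Institution"
    MATERIAL = "Material"
    TECHNIQUE = "Technique"


@dataclass
class Entity:
    local_id: str
    entity_type: EntityType
    label: str
    external_ids: dict = field(default_factory=dict)  # scheme -> identifier


def should_link(a: Entity, b: Entity) -> bool:
    """Example rule: same type and at least one shared external identifier."""
    if a.entity_type is not b.entity_type:
        return False
    return any(b.external_ids.get(s) == v for s, v in a.external_ids.items())


def cluster(entities: list) -> list:
    """Group entities connected (directly or transitively) by should_link."""
    parent = {e.local_id: e.local_id for e in entities}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, a in enumerate(entities):
        for b in entities[i + 1:]:
            if should_link(a, b):
                parent[find(a.local_id)] = find(b.local_id)

    groups: dict = {}
    for e in entities:
        groups.setdefault(find(e.local_id), set()).add(e.local_id)
    return list(groups.values())


# Placeholder identifiers, not real ULAN or VIAF values.
entities = [
    Entity("met:a", EntityType.ARTIST, "Example Artist", {"ulan": "EX-1"}),
    Entity("tate:b", EntityType.ARTIST, "Example Artist", {"ulan": "EX-1", "viaf": "EX-9"}),
    Entity("aic:c", EntityType.ARTIST, "E. Artist", {"viaf": "EX-9"}),
    Entity("aic:d", EntityType.PLACE, "Example Place", {"ulan": "EX-1"}),
]
clusters = cluster(entities)
print(clusters)
```

Here the first three records end up in one cluster through transitive links, while the place record stays separate because the rule never links across entity types.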