[OpenGLAM] getty thesaurus, linked data, and sparql

Eric Lease Morgan eric_morgan at infomotions.com
Fri Feb 21 16:50:34 UTC 2014


> Today the Getty released the Art and Architecture Thesaurus as Linked Open Data [1].

Releasing the Getty Thesaurus as linked data is very interesting, and after visiting the blog posting I discovered a SPARQL endpoint to the data. [2] Yet, I seem to always have problems exploring SPARQL endpoints without having an in-depth and thorough knowledge of the underlying ontologies. Is this just me, or am I missing something?

For example, without knowing anything, I think I can submit a SPARQL query such as the following to just about any SPARQL endpoint to get an overview of the triple store’s ontologies:

  SELECT DISTINCT ?class
  WHERE { ?subject a ?class }
  ORDER BY ?class

This query uses the SPARQL shorthand “a”, which stands for the RDF predicate rdf:type, and I assume that predicate will appear in just about every triple store. Correct? Applying this query to the Getty SPARQL endpoint returns a list of (hopefully) actionable URIs identifying the classes, and thus the ontologies, used in the triple store.
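For what it is worth, here is a minimal sketch of how I might send the query above from a script rather than a web form. It uses only the Python standard library and follows the SPARQL Protocol (an HTTP GET with a "query" parameter) and the SPARQL JSON results format. The endpoint address is an assumption on my part, as are the function names, which are purely illustrative.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the endpoint lives at this address; check the data home page [2].
ENDPOINT = "http://vocab.getty.edu/sparql"

CLASSES_QUERY = """
SELECT DISTINCT ?class
WHERE { ?subject a ?class }
ORDER BY ?class
"""


def build_request(endpoint, query):
    """Build a SPARQL Protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def column(results, name):
    """Pull one variable's values out of a SPARQL JSON results document."""
    bindings = results["results"]["bindings"]
    return [b[name]["value"] for b in bindings if name in b]


def list_classes(endpoint=ENDPOINT, timeout=30):
    """Run the class query against a live endpoint (needs network access)."""
    request = build_request(endpoint, CLASSES_QUERY)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return column(json.load(response), "class")
```

Calling list_classes() should return the class URIs as a plain list of strings, assuming the endpoint honors the JSON Accept header; not every endpoint does.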

I can submit the following SPARQL query to just about any triple store to get a list of all the predicates used in it, but the query often never returns; it probably has to touch every triple and so creates a heck of a lot of work on the endpoint’s backend. Each one of these predicates ought to be described in greater detail at the actionable URIs from Query #1. Correct?

  SELECT DISTINCT ?property
  WHERE { ?subject ?property ?object }
  ORDER BY ?property
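One thing I might try, as a guess rather than a known fix, is to drop the ORDER BY and add a LIMIT, so the endpoint is at least allowed to stop early. Whether that actually reduces the work depends on the engine, since DISTINCT may still force a large scan. The helper below is a hypothetical Python sketch that only builds the query text; the name and the default limit are my own.

```python
def predicates_query(limit=100):
    """Return a bounded version of the predicate-listing query.

    ORDER BY is omitted on purpose: sorting requires the full result
    set, which would defeat the point of the LIMIT.
    """
    if isinstance(limit, bool) or not isinstance(limit, int) or limit < 1:
        raise ValueError("limit must be a positive integer")
    return (
        "SELECT DISTINCT ?property\n"
        "WHERE { ?subject ?property ?object }\n"
        f"LIMIT {limit}"
    )


if __name__ == "__main__":
    # Print the query so it can be pasted into an endpoint's web form.
    print(predicates_query(50))
```

A bounded result is only a sample of the predicates, not the complete list, so it is a starting point for exploration rather than a replacement for the original query.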

Given these ontologies (classes) and properties (relationships), I ought to be able to navigate around the triple store discovering cool information, but I find the process to be very difficult. Here are a few queries:

  # list of concepts
  SELECT * 
  WHERE { ?s a <http://vocab.getty.edu/ontology#Concept> }

  # all about the English phrase founding tools
  SELECT * 
  WHERE { ?s ?p "founding tools"@en }

  # uri for founding tools
  PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
  SELECT ?uri 
  WHERE { ?uri rdfs:label "founding tools"@en }
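Once a query like the last one hands back a URI, the obvious next step seems to be asking for everything the store says about that one resource, which is cheap because the subject is fixed. Below is a small Python sketch that builds such a query. The function name, the default limit, and the example.org URI are hypothetical; I have not substituted a real Getty identifier.

```python
import re

# Characters that may not appear inside a SPARQL <IRI> reference.
_UNSAFE = re.compile(r'[<>"{}|^`\\\s]')


def describe_query(uri, limit=200):
    """Build a query listing every predicate and object for one subject."""
    if not uri or _UNSAFE.search(uri):
        raise ValueError("not a usable IRI: %r" % (uri,))
    return (
        "SELECT ?p ?o\n"
        f"WHERE {{ <{uri}> ?p ?o }}\n"
        f"LIMIT {int(limit)}"
    )


if __name__ == "__main__":
    # Hypothetical URI; replace it with one returned by the label query.
    print(describe_query("http://example.org/id/123"))
```

Each ?p value in the results is itself a URI that can be fed back into the same function, which gives a crude way to drill down without reading the ontologies first.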

I find this process to be painful. To what degree am I still too much of a novice at SPARQL, and to what degree do I need to have an intimate knowledge of the ontologies before I can create meaningful queries? To what degree do more user-friendly front-ends need to be created? In order for URIs to replace literals in RDF, there will need to be much easier-to-use interfaces to triple stores. Correct? Just as searching a relational database via SQL calls for a data dictionary and an entity-relationship diagram, to what degree do I really need to know and understand the supporting ontologies before I can make meaningful sense of a triple store?

Put another way, is there some set of basic/rudimentary queries I can send to SPARQL endpoints, get results, and begin to drill down without really knowing the ontologies? I’m stymied in this regard.


[1] announcement - http://blogs.getty.edu/iris/art-architecture-thesaurus-now-available-as-linked-open-data/
[2] data home - http://vocab.getty.edu

--
Eric Lease Morgan
University of Notre Dame




