[open-bibliography] British Library data announcement

William Waites ww at eris.okfn.org
Mon Nov 22 12:17:52 GMT 2010


* [2010-11-22 10:29:32 -0000] Deliot, Corine <Corine.Deliot at bl.uk> wrote:

] Re: isbns.
] They appear in dcterms:identifier [http://purl.org/dc/terms/identifier] and
] the range of that property is rdfs:Literal [1], which is why we have
] recorded them as a string literal.

Corine, quite right. What we have done is use bibo:isbn, but we have
also converted the object to a resource, which is wrong for the same
reason you point out (bibo:isbn subPropertyOf dct:identifier).

] The latest version of the data available from us (version 0.3.1)
] also includes isbns and issns in bibo:isbn and bibo:issn. These are
] sub-properties of bibo:identifier, which has the range of
] rdfs:Literal [2], which is why we again have recorded these as
] string literals.

Which is as it should be. In this case the values should not be
prefixed with urn:isbn, correct?

Is there a list of changes from the version that we have to 0.3.1?
The reason I ask is that it might be easier (quicker) for us to
basically patch the data in the store than to reload it from scratch
and re-index it again.
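
To make the patching idea concrete, here is a rough sketch of the kind of
rewrite involved. It assumes the store can be dumped to N-Triples and that
the ISBN objects currently look like <urn:isbn:...>; the function name and
the exact shape of the existing triples are assumptions for illustration,
not a description of what is actually in the store.

```python
import re

BIBO_ISBN = "http://purl.org/ontology/bibo/isbn"

# Assumed shape of the existing data: an N-Triples line whose predicate is
# bibo:isbn and whose object is a urn:isbn resource rather than a literal.
_ISBN_RESOURCE = re.compile(
    r"^(?P<subject>\S+)\s+<" + re.escape(BIBO_ISBN) + r">\s+"
    r"<urn:isbn:(?P<isbn>[0-9Xx-]+)>\s*\.\s*$"
)


def isbn_resource_to_literal(line):
    """Rewrite one N-Triples line so the ISBN object is a plain literal.

    Lines that do not match the assumed pattern are returned unchanged
    (minus any trailing newline).
    """
    line = line.rstrip("\n")
    match = _ISBN_RESOURCE.match(line)
    if match is None:
        return line
    return '%s <%s> "%s" .' % (match.group("subject"), BIBO_ISBN, match.group("isbn"))
```

Run over a dump line by line, this leaves every other triple alone, so only
the affected statements would need to be deleted and re-inserted.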

] My understanding is that if we wanted to treat isbns and issns as
] resources, then we would have to use dc:identifier
] http://purl.org/dc/elements/1.1/identifier, which has no range
] defined, or another property from another RDF vocabulary with the
] appropriate range.

Right. But using bibo:isbn is better since it properly disambiguates.

] We are aware that blank nodes are an issue but we are not in a
] position to mint our own URIs yet. We are investigating how best
] to do this.

So far our experience has been good: simply loading the data into a
Virtuoso store (after minting URIs ourselves) and running some very
thin front-end software to make individual resources dereferenceable
seems to work well at scale. The loading process (or rather, indexing
the appropriate predicates for full-text search) was very slow and
took several days. But once it was done, performance has been good even
on modest hardware (Amazon 64-bit instance, 8 GB of RAM). We would be
more than happy to help you duplicate the setup if that would be helpful.

Minting URIs is crucial though, because it is the only way to refer to
any of your data. Much as I think it would be cool to have
http://bnb.bibliographica.org/entry/xyz as the standard way to refer
to the BL catalogue, it really would be better if you were to do it
(and avoid a small forest of owl:sameAs in the future as the data
begins to get used).
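
As an illustration only, minting can be as simple as appending a stable
record identifier to a base URI. The sketch below uses the example base
from this message; the choice of identifier and the function name are
hypothetical, and the real scheme would be for the BL to decide.

```python
from urllib.parse import quote

# Example base taken from this message; a BL-owned base would replace it.
BASE = "http://bnb.bibliographica.org/entry/"


def mint_uri(record_id, base=BASE):
    """Build a URI for a record from a stable identifier (hypothetical scheme)."""
    record_id = record_id.strip()
    if not record_id:
        raise ValueError("cannot mint a URI from an empty identifier")
    # Percent-encode everything unsafe, including "/", so that the
    # identifier always stays a single path segment.
    return base + quote(record_id, safe="")
```

The important property is that the same record always yields the same URI,
so the identifier has to be something that does not change between releases
of the data.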

Cheers,
-w
-- 
William Waites
http://eris.okfn.org/ww/foaf#i
9C7E F636 52F6 1004 E40A  E565 98E3 BBF3 8320 7664


