[open-bibliography] Sample subject specific journal article metadata available for comment
Tom Morris
tfmorris at gmail.com
Thu Dec 15 16:52:39 UTC 2011
On Thu, Dec 15, 2011 at 6:17 AM, Deliot, Corine <Corine.Deliot at bl.uk> wrote:
> The British Library is currently examining options for making subject
> specific journal article metadata (e.g. Biotechnology) available for
> projects in a basic RDF/XML representation (i.e. not linked data). The
> metadata would be distributed under a Creative Commons CC0 1.0 Universal
> Public Domain Dedication licence. A sample file covering the topic of
> Palaeontology is now available for feedback at:
> http://www.bl.uk/bibliographic/datasamples.html
Awesome. Could you expand on what is meant by the distinction between
basic RDF and linked data?
Does it have something to do with why a snippet like this:
<rda:placeOfPublication>
  <rdf:Description>
    <rdfs:label>xuu</rdfs:label>
    <skos:inScheme rdf:resource="http://id.loc.gov/vocabulary/countries"/>
    <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
  </rdf:Description>
</rda:placeOfPublication>
doesn't simply link to http://id.loc.gov/vocabulary/countries/xuu ? I
think a direct link would be more useful, and it could be
generated mechanically. Ditto for
http://id.loc.gov/vocabulary/iso639-2/eng
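To illustrate what "generated mechanically" could look like, here is a minimal Python sketch. It assumes the concept URI is simply the skos:inScheme URI plus "/" plus the rdfs:label code, which matches the two examples above but is not something I have checked against every scheme in the sample. The function name concept_uri is made up for the illustration.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for the prefixes used in the snippet.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
SKOS = "http://www.w3.org/2004/02/skos/core#"

# The inner rdf:Description from the snippet, with namespace
# declarations added so it parses on its own.
SAMPLE = f"""
<rdf:Description xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:skos="{SKOS}">
  <rdfs:label>xuu</rdfs:label>
  <skos:inScheme rdf:resource="http://id.loc.gov/vocabulary/countries"/>
  <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
</rdf:Description>
""".strip()


def concept_uri(description: ET.Element) -> str:
    """Build a direct concept URI from a blank-node description.

    Assumption: the URI is <scheme URI>/<label code>.
    """
    label = description.findtext(f"{{{RDFS}}}label")
    scheme = description.find(f"{{{SKOS}}}inScheme")
    if label is None or scheme is None:
        raise ValueError("description lacks rdfs:label or skos:inScheme")
    scheme_uri = scheme.get(f"{{{RDF}}}resource")
    if not scheme_uri:
        raise ValueError("skos:inScheme has no rdf:resource attribute")
    return scheme_uri.rstrip("/") + "/" + label.strip()


print(concept_uri(ET.fromstring(SAMPLE)))
# prints http://id.loc.gov/vocabulary/countries/xuu
```

The same function would cover the language case, provided the language descriptions follow the same label-plus-scheme pattern.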
Some other things I noticed:
- Having the ISSN both as a URN and in textual form seems redundant
- There are author names which look malformed (perhaps as the result
  of a bad parse?), such as:
    - , S. x.
    - s
    - , J.
- XML encoding of special characters appears broken, e.g. de Ricqlè is
encoded as:
<rdfs:label>de Ricqlè</rdfs:label>
There seem to be two flavors of this problem: double escaping, and
double escaping with the trailing semicolon missing.
- There are some improbable looking titles for articles which are
  supposedly in French:
    CHARACTERIZATION OF CRUSTACEAN CENTRAL GLIA
    CORTICAL CONNECTIONS OF AREA V4 IN THE MACAQUE
    Paleogeographic-palinspastique maps of the Swiss Molasse Basin
    (Early Oligocene-Middle Miocene)
- Palaeoworld and "PALAEOWORLD -NANJING THEN KIDLINGTON-" both have
the same ISSN and appear to be the same journal despite being
referenced separately.
- Some journal names end with a full stop/dot/period ('.') while most
do not. A consistent convention would make the data easier to use.
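On the double-escaping point, here is a rough Python sketch of how a consumer might work around it in the meantime. The XML parser removes one layer of escaping, and html.unescape from the standard library removes the second, including the case where a well-known entity has lost its trailing semicolon. The entity spelling (&amp;egrave;) in the test strings is my assumption about what the file contains, and the function name label_text is made up for the illustration.

```python
import html
import xml.etree.ElementTree as ET

RDFS = "http://www.w3.org/2000/01/rdf-schema#"


def label_text(xml_fragment: str) -> str:
    """Return the element text with a second layer of escaping undone.

    Caveat: html.unescape decodes anything that looks like an entity,
    so text that legitimately contains a literal "&lt;" or similar
    would be altered as well.
    """
    element = ET.fromstring(xml_fragment)  # parser undoes the first layer
    return html.unescape(element.text or "")  # undo the second layer


# Assumed examples of the two flavors: with and without the semicolon.
with_semicolon = f'<rdfs:label xmlns:rdfs="{RDFS}">de Ricql&amp;egrave;</rdfs:label>'
without_semicolon = f'<rdfs:label xmlns:rdfs="{RDFS}">de Ricql&amp;egrave</rdfs:label>'

print(label_text(with_semicolon))     # prints de Ricqlè
print(label_text(without_semicolon))  # prints de Ricqlè
```

Fixing the escaping at the source would still be preferable, since a workaround like this cannot tell a double-escaped entity from text that really is meant to contain an ampersand sequence.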
Thanks for making the data available!
Tom