[open-bibliography] Sample subject specific journal article metadata available for comment

Tom Morris tfmorris at gmail.com
Thu Dec 15 16:52:39 UTC 2011


On Thu, Dec 15, 2011 at 6:17 AM, Deliot, Corine <Corine.Deliot at bl.uk> wrote:

> The British Library is currently examining options for making subject
> specific journal article metadata (e.g. Biotechnology) available for
> projects in a basic RDF/XML representation (i.e. not linked data). The
> metadata would be distributed under a Creative Commons CC0 1.0 Universal
> Public Domain Dedication licence. A sample file covering the topic of
> Palaeontology is now available for feedback at:
> http://www.bl.uk/bibliographic/datasamples.html

Awesome.  Could you expand on what is meant by the distinction between
basic RDF and linked data?

Does it have something to do with why a snippet like this:

<rda:placeOfPublication>
  <rdf:Description>
    <rdfs:label>xuu</rdfs:label>
    <skos:inScheme rdf:resource="http://id.loc.gov/vocabulary/countries"/>
    <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
  </rdf:Description>
</rda:placeOfPublication>

doesn't simply link to http://id.loc.gov/vocabulary/countries/xuu ?  I
think a direct link would be more useful, and it could be generated
mechanically.  Ditto for
http://id.loc.gov/vocabulary/iso639-2/eng
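
To illustrate what I mean by "mechanically", here is a rough Python
sketch that derives the direct URI from a blank node like the one above.
It assumes the target URI is always the skos:inScheme URI plus "/" plus
the rdfs:label code, which holds for the two examples here but which I
have not checked against the whole file.  The direct_uri helper and the
SAMPLE wrapper (with its namespace declarations) are my own, not part of
the BL data.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}
RDF_RESOURCE = "{%s}resource" % NS["rdf"]


def direct_uri(description):
    """Build '<scheme URI>/<label>' from an rdf:Description element.

    Hypothetical helper: returns None when the label or scheme is missing,
    so that incomplete nodes are left alone rather than guessed at.
    """
    label = description.find("rdfs:label", NS)
    scheme = description.find("skos:inScheme", NS)
    if label is None or scheme is None:
        return None
    code = (label.text or "").strip()
    base = scheme.get(RDF_RESOURCE)
    if not code or not base:
        return None
    return base.rstrip("/") + "/" + code


# The snippet from the sample file, with namespace declarations added so
# that it parses on its own.
SAMPLE = """<rdf:Description
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <rdfs:label>xuu</rdfs:label>
  <skos:inScheme rdf:resource="http://id.loc.gov/vocabulary/countries"/>
  <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
</rdf:Description>"""

print(direct_uri(ET.fromstring(SAMPLE)))
# prints http://id.loc.gov/vocabulary/countries/xuu
```

The result could then be emitted as an rdf:resource attribute on
rda:placeOfPublication in place of the nested description.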

Some other things I noticed:

- Having the ISSN both as a URN and in textual form seems redundant

- There are author names which look malformed (perhaps as the result
of a bad parse?), such as:
  - , S. x.
  - s
  - , J.

- XML encoding of special characters appears broken, e.g. de Ricqlè is
encoded as:

    <rdfs:label>de Ricql&#xe8</rdfs:label>

  There seem to be two flavors of this problem: double escaping, and
double escaping with the trailing semicolon missing.

- There are some improbable looking titles for articles which are
supposedly in French:

  CHARACTERIZATION OF CRUSTACEAN CENTRAL GLIA
  CORTICAL CONNECTIONS OF AREA V4 IN THE MACAQUE
  Paleogeographic-palinspastique maps of the Swiss Molasse Basin
(Early Oligocene-Middle Miocene)

- Palaeoworld and "PALAEOWORLD -NANJING THEN KIDLINGTON-" both have
the same ISSN and appear to be the same journal despite being
referenced separately.

- Some journal names end with a full stop/dot/period ('.') while most
do not.  A consistent convention would make the data easier to use.
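
A couple of these checks are easy to script.  The Python sketch below is
only illustrative: it assumes the text values have already been pulled
out of the XML (so a double-escaped reference shows up literally as
"&#xe8;" or "&#xe8"), and that journals are available as (ISSN, title)
pairs.  The function names and the placeholder ISSNs are mine, not from
the sample file.  Note that when the semicolon is missing, a hex
reference followed directly by a letter a-f cannot be decoded
unambiguously, so the output would still need a manual look.

```python
import re
from collections import defaultdict

# A numeric character reference left as literal text, hex or decimal,
# with the trailing semicolon optional: "&#xe8;", "&#xe8", "&#232;".
_CHAR_REF = re.compile(r"&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));?")


def repair_char_refs(text):
    """Replace leftover numeric character references with the character."""
    def decode(match):
        hex_digits, dec_digits = match.group(1), match.group(2)
        code = int(hex_digits, 16) if hex_digits else int(dec_digits)
        try:
            return chr(code)
        except (ValueError, OverflowError):
            # Not a valid code point: leave the original text untouched.
            return match.group(0)
    return _CHAR_REF.sub(decode, text)


def titles_sharing_issn(records):
    """Group journal titles by ISSN and keep ISSNs with several titles.

    Trailing full stops are stripped first, so "Foo." and "Foo" count
    as the same title.
    """
    groups = defaultdict(set)
    for issn, title in records:
        groups[issn].add(title.strip().rstrip("."))
    return {issn: sorted(titles)
            for issn, titles in groups.items() if len(titles) > 1}


print(repair_char_refs("de Ricql&#xe8"))  # prints de Ricqlè

# Placeholder ISSNs, for illustration only.
journals = [
    ("0000-0000", "Palaeoworld"),
    ("0000-0000", "PALAEOWORLD -NANJING THEN KIDLINGTON-"),
    ("1111-1111", "Some Journal."),
    ("1111-1111", "Some Journal"),
]
print(titles_sharing_issn(journals))
```

With the placeholder data, only the first ISSN is reported, since the
two "Some Journal" entries differ only by the trailing full stop.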

Thanks for making the data available!

Tom



