[open-bibliography] Sample subject specific journal article metadata available for comment

Rosie, Heather Heather.Rosie at bl.uk
Fri Dec 16 10:15:13 UTC 2011


Hi Tom

Many thanks for your comments.

The purpose of releasing this initial data sample was to gauge how much interest there might be amongst researchers in having access to datasets in particular subject areas in a commonly used format.  The metadata is derived from the British Library's "ETOC" database, which provides subscription-based access to 20,000 of the BL's most popular research journals.  The metadata is held in a proprietary SGML format.  The sample dataset, covering articles from journals classed as "Palaeontology", is our first attempt to make some of this data available under the same terms and conditions as our other "basic RDF/XML" metadata offerings (http://www.bl.uk/bibliographic/datafree.html#basicrdfxml).  If there is sufficient interest, we may look to develop a linked data format similar to the BNB books dataset (http://www.bl.uk/bibliographic/datafree.html#lod).  That, however, would require more modelling work on our part, which we are not in a position to do at this time.

With regard to the errors in data content, no attempt has been made to correct them. The content is exactly as offered via the ETOC subscription service; only the format, the coverage (limited to certain subject sets), and the terms and conditions under which it can be accessed have changed.

Best wishes

Heather 

Heather Rosie
Online Metadata Analyst
The British Library
 
heather.rosie at bl.uk
 

-----Original Message-----
From: open-bibliography-bounces at lists.okfn.org [mailto:open-bibliography-bounces at lists.okfn.org] On Behalf Of Tom Morris
Sent: 15 December 2011 16:53
To: List for Working Group on Open Bibliographic Data
Subject: Re: [open-bibliography] Sample subject specific journal article metadata available for comment

On Thu, Dec 15, 2011 at 6:17 AM, Deliot, Corine <Corine.Deliot at bl.uk> wrote:

> The British Library is currently examining options for making subject
> specific journal article metadata (e.g. Biotechnology) available for
> projects in a basic RDF/XML representation (i.e. not linked data). The
> metadata would be distributed under a Creative Commons CC0 1.0 Universal
> Public Domain Dedication licence. A sample file covering the topic of
> Palaeontology is now available for feedback at:
> http://www.bl.uk/bibliographic/datasamples.html

Awesome.  Could you expand on what is meant by the distinction between
basic RDF and linked data?

Does it have something to do with why a snippet like this:

<rda:placeOfPublication>
  <rdf:Description>
    <rdfs:label>xuu</rdfs:label>
    <skos:inScheme rdf:resource="http://id.loc.gov/vocabulary/countries"/>
    <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
  </rdf:Description>
</rda:placeOfPublication>

doesn't simply link to http://id.loc.gov/vocabulary/countries/xuu ?  I
think a direct link would be more useful, and it could be
generated mechanically.  Ditto for
http://id.loc.gov/vocabulary/iso639-2/eng
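
A minimal sketch of what "generated mechanically" could look like, in
Python with only the standard library.  It assumes the code in rdfs:label
can be appended directly to the skos:inScheme URI, as in the country
example above.  The rda namespace URI, the wrapping rdf:RDF element and
the function name direct_uri are assumptions made for illustration and
are not taken from the sample file.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "rda": "http://rdvocab.info/Elements/",  # assumed namespace URI
}

# The snippet quoted above, wrapped in a root element that declares the
# prefixes so that it parses on its own.
SAMPLE = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#"
         xmlns:rda="http://rdvocab.info/Elements/">
<rda:placeOfPublication>
  <rdf:Description>
    <rdfs:label>xuu</rdfs:label>
    <skos:inScheme rdf:resource="http://id.loc.gov/vocabulary/countries"/>
    <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
  </rdf:Description>
</rda:placeOfPublication>
</rdf:RDF>
"""

RDF_RESOURCE = "{%s}resource" % NS["rdf"]
RDF_DESCRIPTION = "{%s}Description" % NS["rdf"]


def direct_uri(description):
    """Build scheme URI + "/" + code for one rdf:Description, or None."""
    label = description.find("rdfs:label", NS)
    scheme = description.find("skos:inScheme", NS)
    if label is None or scheme is None:
        return None
    code = (label.text or "").strip()
    base = scheme.get(RDF_RESOURCE)
    if not code or not base:
        return None
    return base.rstrip("/") + "/" + code


root = ET.fromstring(SAMPLE)
uris = [direct_uri(d) for d in root.iter(RDF_DESCRIPTION)]
print(uris)  # ['http://id.loc.gov/vocabulary/countries/xuu']
```

The same function should cover the language case if the description
carries the iso639-2 scheme URI and a label of "eng", although I have not
checked that against the sample.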

Some other things I noticed:

- Having the ISSN as both a URN and in textual forms seems redundant

- There are author names which look malformed (perhaps as the result
of a bad parse?), such as:
  - , S. x.
  - s
  - , J.

- XML encoding of special characters appears broken, e.g. de Ricqlè is
encoded as:

    <rdfs:label>de Ricql&amp;#xe8</rdfs:label>

  There seem to be two flavors of this problem: double escaping, and
double escaping with the trailing semicolon missing.

- There are some improbable looking titles for articles which are
supposedly in French:

  CHARACTERIZATION OF CRUSTACEAN CENTRAL GLIA
  CORTICAL CONNECTIONS OF AREA V4 IN THE MACAQUE
  Paleogeographic-palinspastique maps of the Swiss Molasse Basin (Early Oligocene-Middle Miocene)

- Palaeoworld and "PALAEOWORLD -NANJING THEN KIDLINGTON-" both have
the same ISSN and appear to be the same journal despite being
referenced separately.

- Some journal names end with a full stop/dot/period ('.') while most
do not.  A consistent convention would make the data easier to use.
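
Two of the points above lend themselves to a mechanical clean-up pass on
the consumer side.  The following is a rough Python sketch under stated
assumptions, not a description of how the BL data is or should be
processed.  It assumes the label text has already been through an XML
parser, so a double-escaped reference arrives as a literal character
reference such as "&#xe8", with or without its semicolon.  The function
names, the placeholder ISSNs and "Example Journal" are hypothetical.

```python
import html
import re
from collections import defaultdict

# A numeric character reference left as literal text after XML parsing,
# with or without the trailing semicolon.
CHAR_REF = re.compile(r"&#([xX][0-9a-fA-F]+|[0-9]+);?")


def repair_label(text):
    """Decode leftover numeric character references in parsed text."""
    return CHAR_REF.sub(
        lambda m: html.unescape("&#" + m.group(1) + ";"), text
    )


def normalise_title(title):
    """Strip surrounding whitespace and any trailing full stop."""
    return title.strip().rstrip(".").rstrip()


def conflicting_titles(records):
    """Map each ISSN to its distinct normalised titles, where more than one."""
    groups = defaultdict(set)
    for issn, title in records:
        groups[issn].add(normalise_title(title))
    return {i: sorted(t) for i, t in groups.items() if len(t) > 1}


# Placeholder ISSNs; only the two Palaeoworld titles come from the sample.
records = [
    ("0000-0001", "Palaeoworld"),
    ("0000-0001", "PALAEOWORLD -NANJING THEN KIDLINGTON-"),
    ("0000-0002", "Example Journal."),
    ("0000-0002", "Example Journal"),
]

print(repair_label("de Ricql&#xe8"))
print(repair_label("de Ricql&#xe8;"))
print(conflicting_titles(records))
```

Both helpers are heuristics.  When the semicolon is missing, a hex
reference followed directly by a letter from a to f would be misread, and
stripping the trailing full stop would also alter a title that
legitimately ends in an abbreviation.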

Thanks for making the data available!

Tom
