[openbiblio-dev] Bibliographic metadata from scientific publishers/publications

Peter Murray-Rust pm286 at cam.ac.uk
Sun Jun 27 10:29:37 UTC 2010

Part of #jiscobib concerns scientific publications and the extraction of
bibliographic data from them using either existing resources (e.g. TOCs) or
automatic extraction. [To avoid any rights issues all my examples come from
CC-BY publications].

So far I have discovered the following schemas/specs:

* Dublin Core (http://en.wikipedia.org/wiki/Dublin_Core - IMO a better intro
than http://dublincore.org/ )
* NLMDTD (http://dtd.nlm.nih.gov/ "Journal Archiving and Interchange Tag

Are there others we need to know about?

Here are two examples from our collaborators in #jiscobib: the first is from
the IUCr (Acta Crystallographica Section E), the second from PLoS. I continue
at the end of the second:


<meta content="urn:issn:1600-5368" name="DC.source" />
  <meta content="http://creativecommons.org/licenses/by/2.0/uk"
name="DC.rights" />
  <meta content="Xue, L.-W." name="DC.creator" />
  <meta content="Li, X.-W." name="DC.creator" />
  <meta content="Zhao, G.-Q." name="DC.creator" />
  <meta content="Peng, Q.-L." name="DC.creator" />
  <meta content="2009-09-01" name="DC.date" />
  <meta content="doi:10.1107/S1600536809037520" name="DC.identifier" />
  <meta content="International Union of Crystallography" name="DC.publisher" />
  <meta content="http://scripts.iucr.org/cgi-bin/paper?is2450"
name="DC.link" />
  <meta content="en" name="DC.language" />
  <meta content="text" name="DC.type" />
name="DC.title" />
  <meta content="In the title compound, [Cu(C13H9NO3)(C5H5N)], the CuII atom
is coordinated in a distorted square-pyramidal geometry, with two N and two
O atoms in the basal positions and one O atom in the apical position. The
apical Cu-O bond [2.3520 (16) A] is much longer than the basal Cu-O and Cu-N
bonds [1.9139 (14)-2.0136 (17) A]. The carboxylate group bridges CuII atoms,
forming a zigzag chain along the a axis." name="DCTERMS.abstract" />
  <meta content="10" name="prism.number" />
  <meta content="65" name="prism.volume" />
  <meta content="2009-09-01" name="prism.publicationDate" />
  <meta content="Acta Crystallographica Section E: Structure Reports Online"
name="prism.publicationName" />
  <meta content="1600-5368" name="prism.issn" />
  <meta content="metal-organic compounds" name="prism.section" />
  <meta content="1237" name="prism.startingPage" />
  <meta content="med at iucr.org"
name="prism.rightsAgent" />
  <meta content="1237" name="prism.endingPage" />
  <meta content="1600-5368" name="prism.eissn" />
  <meta content="" name="keywords" />
  <meta content="NOARCHIVE,NOINDEX" name="ROBOTS" />

PLoS (PLoS Biology)

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article
  PUBLIC "-//NLM//DTD Journal Publishing DTD v2.0 20040830//EN" "
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="
http://www.w3.org/1998/Math/MathML" article-type="research-article"
dtd-version="2.0" xml:lang="EN">
<journal-id journal-id-type="publisher-id">plos</journal-id>
<journal-id journal-id-type="nlm-ta">PLoS Biol</journal-id>
<journal-id journal-id-type="pmc">plosbiol</journal-id>
<journal-title>PLoS Biology</journal-title>
<issn pub-type="ppub">1544-9173</issn>
<issn pub-type="epub">1545-7885</issn>
<publisher-name>Public Library of Science</publisher-name>
<publisher-loc>San Francisco, USA</publisher-loc></publisher>
<article-id pub-id-type="publisher-id">09-PLBI-RA-4846R2</article-id>
<article-id pub-id-type="doi">10.1371/journal.pbio.1000399</article-id>
<article-categories><subj-group subj-group-type="heading"><subject>Research
<article-title>Cortical Overexpression of Neuronal Calcium Sensor-1 Induces
Functional Plasticity in Spinal Cord Following Unilateral Pyramidal Tract
Injury in Rat</article-title>
<alt-title alt-title-type="running-head">Neuronal Calcium Sensor-1 Induces
<contrib contrib-type="author" xlink:type="simple"><name
K.</given-names></name><xref ref-type="aff"
rid="aff1"><sup>1</sup></xref><xref ref-type="corresp"
<contrib contrib-type="author" xlink:type="simple"><name
ref-type="aff" rid="aff2"><sup>2</sup></xref></contrib>
<contrib contrib-type="author" xlink:type="simple"><name
A.</given-names></name><xref ref-type="aff"
<contrib contrib-type="author" xlink:type="simple"><name
J.</given-names></name><xref ref-type="aff"
<contrib contrib-type="author" xlink:type="simple"><name
B.</given-names></name><xref ref-type="aff"
<aff id="aff1"><label>1</label><addr-line>Neurorestoration Group, Wolfson
CARD, King's College London, Guy's Campus, London, United
Kingdom</addr-line>       </aff>
<aff id="aff2"><label>2</label><addr-line>Henry Wellcome LINE, Dorothy
Hodgkin Building, Bristol University, Bristol, United
Kingdom</addr-line>       </aff>
<aff id="aff3"><label>3</label><addr-line>School of Biological Sciences,
Royal Holloway-University of London, Egham, Surrey, United
Kingdom</addr-line>       </aff>
<contrib contrib-type="editor" xlink:type="simple"><name
<role>Academic Editor</role>
<xref ref-type="aff" rid="edit1"/></contrib>
<aff id="edit1">University of California San Francisco, United States of
<corresp id="cor1">* E-mail: <email xlink:type="simple">ping.yip at kcl.ac.uk
<fn fn-type="con"><p>The author(s) have made the following declarations
about their contributions: Conceived and designed the experiments: PKY.
Performed the experiments: PKY TAS. Analyzed the data: PKY TAS. Contributed
reagents/materials/analysis tools: PKY LFW RJYM. Wrote the paper: PKY TAS
<pub-date pub-type="collection"><month>6</month><year>2010</year></pub-date>
<copyright-statement>Yip et al. This is an open-access article distributed
under the terms of the Creative Commons Attribution License, which permits
unrestricted use, distribution, and reproduction in any medium, provided the
original author and source are credited.</copyright-statement>
<related-article id="RA1" related-article-type="companion"
ext-link-type="uri" vol="8" page="e1000400" xlink:type="simple"
xlink:href="info:doi/10.1371/journal.pbio.1000400"> <article-title>Healing
Spinal Cord Injuries</article-title></related-article>
<abstract abstract-type="toc">
<p>Overexpression of neuronal calcium sensor 1 in cortical neurons can help
restore axonal plasticity and regeneration following axonal injury in adult
rats, and can also improve behavioral function.</p>
<p>Following trauma  ... within the injured adult central nervous
<abstract abstract-type="summary"><title>Author Summary</title>
<p>Following trauma to the central nervous system  ...  and can help improve
behavioural function.</p>
<counts><page-count count="22"/></counts></article-meta>
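
To give a feel for how much of the PLoS record is directly machine-readable,
here is a minimal sketch in Python using only the standard library. The SAMPLE
string is NOT the full record above: it is a trimmed, well-formed fragment I
have assembled from values shown in the excerpt, and the wrapper elements
(front, journal-meta, article-meta, title-group, publisher) are my assumption
from the NLM DTD, since they are not all visible in what I pasted. The function
name extract() and the keys of the returned dictionary are made up for
illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed, repaired fragment built from the values in the PLoS excerpt.
# Wrapper elements are assumed from the NLM DTD, not copied from the post.
SAMPLE = """<article article-type="research-article" dtd-version="2.0">
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">PLoS Biol</journal-id>
<journal-title>PLoS Biology</journal-title>
<issn pub-type="ppub">1544-9173</issn>
<issn pub-type="epub">1545-7885</issn>
<publisher><publisher-name>Public Library of Science</publisher-name>
<publisher-loc>San Francisco, USA</publisher-loc></publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.1371/journal.pbio.1000399</article-id>
<title-group><article-title>Cortical Overexpression of Neuronal Calcium Sensor-1 Induces
Functional Plasticity in Spinal Cord Following Unilateral Pyramidal Tract
Injury in Rat</article-title></title-group>
<aff id="aff1"><label>1</label><addr-line>Neurorestoration Group, Wolfson
CARD, King's College London, Guy's Campus, London, United
Kingdom</addr-line></aff>
<pub-date pub-type="collection"><month>6</month><year>2010</year></pub-date>
</article-meta>
</front>
</article>"""


def squash(text):
    """Collapse line breaks and runs of spaces; pass None through."""
    return " ".join(text.split()) if text else None


def extract(xml_text):
    """Pull a few bibliographic fields out of an NLM-style article."""
    root = ET.fromstring(xml_text)
    record = {
        "journal": squash(root.findtext(".//journal-title")),
        "publisher": squash(root.findtext(".//publisher-name")),
        "title": squash(root.findtext(".//article-title")),
        "doi": None,
        "issn": {},
        "affiliations": {},
    }
    # Identifiers are distinguished by attribute, not by element name.
    for el in root.iter("article-id"):
        if el.get("pub-id-type") == "doi":
            record["doi"] = squash(el.text)
    for el in root.iter("issn"):
        record["issn"][el.get("pub-type")] = squash(el.text)
    # Affiliations are free text keyed by their id attribute.
    for el in root.iter("aff"):
        record["affiliations"][el.get("id")] = squash(el.findtext("addr-line"))
    return record


if __name__ == "__main__":
    for key, value in extract(SAMPLE).items():
        print(key, "=", value)
```

Searching with ".//name" rather than a full path is deliberate: it keeps the
sketch independent of exactly where the wrappers sit, at the cost of picking
up the first match only (a real record also has an article-title inside
related-article, further down).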

So some questions:
* is the term "bibliographic entry" appropriate for these two specimens?
* can the OKF-RDF approach hold all the information in them?
* can a conversion to OKF-RDF be done automatically? [I appreciate we may
need to create a mapping].
* do we need to microparse any of the information? [I would like to do this
for the affiliations or to normalize this to a unique identifier].
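
On the automatic-conversion question, here is a rough sketch of what a mapping
might look like for the IUCr-style meta tags, again in plain Python. I am NOT
claiming this is the OKF-RDF model: the target is simply N-Triples using the
Dublin Core and PRISM namespaces as stand-ins. The PREFIXES table, the choice
of a dx.doi.org URI as the subject, and the names MetaCollector, predicate(),
literal() and to_ntriples() are all my own assumptions for illustration.

```python
from html.parser import HTMLParser

# Assumed namespace for each name prefix used in the meta tags.
PREFIXES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
    "prism": "http://prismstandard.org/namespaces/basic/2.0/",
}

# Small sample copied from the IUCr example, plus two tags we expect to skip.
SAMPLE = """
<meta content="Xue, L.-W." name="DC.creator" />
<meta content="Li, X.-W." name="DC.creator" />
<meta content="doi:10.1107/S1600536809037520" name="DC.identifier" />
<meta content="65" name="prism.volume" />
<meta content="" name="keywords" />
<meta content="NOARCHIVE,NOINDEX" name="ROBOTS" />
"""


class MetaCollector(HTMLParser):
    """Collect (name, content) pairs from <meta> tags, keeping repeats."""

    def __init__(self):
        super().__init__()
        self.pairs = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        a = dict(attrs)
        if a.get("name") and a.get("content"):
            # Normalise whitespace from line-wrapped attribute values.
            self.pairs.append((a["name"], " ".join(a["content"].split())))


def predicate(name):
    """Map e.g. 'DC.creator' to a full predicate URI, or None if unmapped."""
    prefix, _, local = name.partition(".")
    base = PREFIXES.get(prefix.lower())
    return base + local if base and local else None


def literal(value):
    """Quote a string as an N-Triples plain literal."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def to_ntriples(html):
    """Turn meta tags into N-Triples lines about the article's DOI."""
    collector = MetaCollector()
    collector.feed(html)
    doi = next(
        (v for n, v in collector.pairs
         if n == "DC.identifier" and v.startswith("doi:")),
        None,
    )
    if doi is None:
        raise ValueError("no doi: identifier found")
    subject = "<http://dx.doi.org/" + doi[len("doi:"):] + ">"
    lines = []
    for name, value in collector.pairs:
        pred = predicate(name)
        if pred is None:
            continue  # unmapped names (keywords, ROBOTS) are dropped
        lines.append(f"{subject} <{pred}> {literal(value)} .")
    return lines


if __name__ == "__main__":
    print("\n".join(to_ntriples(SAMPLE)))
```

Everything here is emitted as a string literal, including dates and the DOI
itself; deciding which values should become typed literals or URIs is exactly
the kind of thing the mapping would have to settle.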

If the answers are YYY* then we should be able to convert a lot of Open
Access information immediately.
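
For the fourth question, this is the sort of microparsing I have in mind for
affiliations, as a very rough Python sketch. It assumes the address line is
comma-separated with the country last, which holds for the three PLoS
affiliations above but certainly not in general. The keyword heuristic, the
function name microparse_affiliation() and the identifiers in
KNOWN_INSTITUTIONS are all hypothetical; a real system would need an agreed
authority list.

```python
import re

# Crude hint for which comma-separated part names the institution.
INSTITUTION_HINT = re.compile(r"\b(University|College|Institute)\b")

# Hypothetical local identifiers, for illustration only.
KNOWN_INSTITUTIONS = {
    "king's college london": "inst:kcl",
    "bristol university": "inst:bristol",
}


def microparse_affiliation(addr_line):
    """Split a free-text affiliation into parts and guess the institution."""
    # Collapse wrapped lines, then split on commas and drop empty parts.
    parts = [" ".join(p.split()) for p in addr_line.split(",")]
    parts = [p for p in parts if p]
    institution = next((p for p in parts if INSTITUTION_HINT.search(p)), None)
    return {
        "parts": parts,
        "country": parts[-1] if parts else None,
        "institution": institution,
        "institution_id": (
            KNOWN_INSTITUTIONS.get(institution.lower()) if institution else None
        ),
    }


if __name__ == "__main__":
    aff1 = ("Neurorestoration Group, Wolfson CARD, King's College London, "
            "Guy's Campus, London, United Kingdom")
    print(microparse_affiliation(aff1))
```

An affiliation whose institution is not in the table simply gets
institution_id None, which is probably the honest answer until we have a
shared list to normalize against.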


Peter Murray-Rust
Reader in Molecular Informatics
Unilever Centre, Dep. Of Chemistry
University of Cambridge