[open-bibliography] journal article mining

Peter Murray-Rust pm286 at cam.ac.uk
Fri Feb 4 15:46:57 UTC 2011


Thank you very much for a full reply. This is very helpful.

I am copying this to the Open Bibliography list. For their background: I
have been exploring with Eefke and the STM Publishers association whether
text-mining is allowable and whether bibliographic data is copyrightable.
Eefke gives a clear answer to the second question, so I am posting it on
this list. I think it now makes possible a lot of very valuable things with
Open Bibliographic Data.

On Fri, Feb 4, 2011 at 2:34 PM, Eefke Smit <eefkesmi at xs4all.nl> wrote:

> As promised, I would sort out your question about the openness of
> bibliographies. You made quite clear in our conversation that you are not
> particularly fond of ‘it depends’ answers. So I fear you may find the
> following answer slightly disappointing, because also for bibliographies the
> answer to the question of how open they are depends on what you regard to be
> elements of a bibliography.
>

We have addressed this in "Principles of Open Bibliographic Data"
http://openbiblio.net/principles/

>
>
> To start with the simplest elements that are indeed open and considered
> ‘facts’ hence copyright free: article title; authors of article; journal
> title; volume-issue information; and dates of receipt/publication. These are
> all considered to be facts and cannot be copyrighted.
>

We have essentially covered these in
*Core data: names and identifiers of author(s) and editor(s), titles,
publisher information, publication date and place, identification of parent
work (e.g. a journal), page information, URIs.*

I think this is entirely in line with you and your STM colleagues, and
this agreement is an extremely important step forward.


>
> But nowadays people sometimes include much more into bibliographies, for
> example images, tables, abstracts, even chemical structures. Bibliographic
> data can include a number of different kinds of fields and information,
> including thesauri, classifications like chemistry structures, etc., so
> there can be some information that is copyrightable or systems that are tied
> into copyright or trademark protected content.
>
>
>
Precisely what that is does indeed depend. Our list of secondary
bibliographic data overlaps greatly with yours. I have highlighted the
components that I believe are uncopyrightable.

*Secondary data*: *format of work*, *non-web identifiers (ISBN, LCCN, OCLC
number etc.)*, *an indication of rights associated with a work, information
on sponsorship (e.g. funding), information about carrier type, extent and
size information, administrative data (last modified etc.), relevant links
(to wikipedia, google books, amazon etc.), table of contents, links to
digitized parts of a work (tables of content, registers, bibliographies
etc.), addresses and other contact details about the author(s),* cover
images, abstracts, reviews, summaries, subject headings, assigned keywords,
classification notation, user-generated tags, exemplar data (number of
holdings, call number), …

This does not mean that the others are copyrightable by default, but we
know of places where people have asserted rights over some of them.

I think you and I differ about whether tables and graphs are copyrightable
in this context. I would concede that images which contain creative work
are copyrightable, but that images representing factual information (e.g.
chemical structures) are not. For example, it would be foolish
to be unable to communicate a chemical structure to someone because you
might break copyright. There are millions of such images on suppliers'
bottles, and withholding this information means that people could and would
die.

I also asked about whom I should contact within a publisher to get a
definitive answer from that organization (as most of the time I get no
reply).


> On your question of whom to contact for permissions as a reader, I would
> advise you to address the ‘rights and permissions departments’ or ‘licensing
> departments’ at the relevant publishing houses or else enquire via your local
> license holder (Cambridge library) who their contacts are. Very often these
> are regionally assigned, so a general list would be difficult to compose.
>

This seems to confirm that it can therefore be quite difficult to get the
right person within a large publishing house and get an answer.

The STM members can be found on www.stm-assoc.org
>
>
>
> Hope this information is of help to you,
>

Yes it is very useful.


> Kindest regards, Eefke Smit.
>
> *From:* peter.murray.rust at googlemail.com [mailto:
> peter.murray.rust at googlemail.com] *On behalf of* Peter Murray-Rust
> *Sent:* Wednesday 2 February 2011 15:00
>
> *To:* Eefke Smit
> *Subject:* Re: journal article mining
>
>
>
> Here is a link to the concern over bibliographic data:
>
>
> http://blogs.ch.cam.ac.uk/pmr/2011/02/02/dois-are-not-copyright-what-about-bibliographic-data/
>
> I'd be grateful for the following:
> * a brief account of what issues you are taking forward from our
> conversation
> * a list of the appropriate addresses in STM publishers that I can write to
> about permissions as a reader.
>
> Thanks
>
> P.
>
> --
> Peter Murray-Rust
> Reader in Molecular Informatics
> Unilever Centre, Dep. Of Chemistry
> University of Cambridge
> CB2 1EW, UK
> +44-1223-763069
>



-- 
Peter Murray-Rust
Reader in Molecular Informatics
Unilever Centre, Dep. Of Chemistry
University of Cambridge
CB2 1EW, UK
+44-1223-763069