[open-bibliography] (Final?) discussion of the openbiblio principles

Peter Murray-Rust pm286 at cam.ac.uk
Sat Jan 8 18:18:26 UTC 2011


On Sat, Jan 8, 2011 at 5:38 PM, Jim Pitman <pitman at stat.berkeley.edu> wrote:

> Adrian, I have one more thought. I suggest putting into "Secondary data"
> "indication that one work is derived from, related to, or cited by another
> work"
> This is a nice rhetorical progression, because few would deny that the fact
> of one work being derived from or related to another is essentially public
> domain, and citation by citation one could make the case for "cited by" too.
> Publishers often engage in the noxious practice of making citation lists
> available only to subscribers, and citation indexers hoard the citation
> data. This is something we should be pushing back against.
>

Absolutely. As you know, David Shotton is funded to do this by JISC.


> One of the greatest potential benefits of open biblio in the journal
> article space is if we can open up the citation graph which was until
> recently controlled by Thomson-Reuters, but which is gradually becoming
> more open due to efforts of CiteSeer, Google Scholar, Microsoft Academic
> Search and others including David Shotton and myself. We need to make the
> point to publishers that it is in their own interest to release their
> citation data and allow it to be fully processed by the community without
> licensing restrictions. And too bad for the current citation indexers if
> they get disintermediated.
>
> Also, we should definitely not make a statement of open biblio principles
> which might later be used against us by saying even we did not indicate
> citation data might be covered!
>
>
I have consistently made a distinction between Bibliography and Citations -
and indeed defined (in my own view) Open Scholarship as Open Access,
Bibliography, Citations and Data (OABCD). The reasons are both taxonomic
and pragmatic. See below.


> This comes back to the copyrightable/non-copyrightable issue. Our point
> should be that whether or not various elements of this biblio data are
> copyrightable or restrictable by licenses, it is in the greater interest of
> the scientific community for publishers and data aggregators to make all
> this information freely available. Then let the various agents compete with
> each other on a new playing field for the quality of services they can
> provide over this data, rather than competing for control of the data.
>
We agree on the motivation, but many will not. So my position - and I think
it's covered by the principles - is:

Bibliography has an objectivity which defines a set of objects and their
addresses/identifiers.  In this sense they are a unique set - an exhaustive
collection of the world's bibliographic data should be platonic - it should
be the same whoever creates it. Of course there are edge cases, but for
articles this is essentially true. Each article has a single core
bibliographic data (I treat data as singular when I feel like it!) and if
this data is not available the article "does not exist". It's similar to car
licence plates, phone books, addresses of institutions.

There is no motivation ("sentiment") attached to bibliographic data. It
addresses and identifies. The "core-data" doesn't comment. It may attach
factual information such as language, document format, rights, etc. Every
piece of this core-data can in principle be precisely verified by examining
the artifact. (I don't want to get into FRBR here...)

By contrast the citations are subjective and potentially ambiguous or
"wrong". In an ideal world the bibliographic data are nodes in a graph and
the citations are (annotated) edges. In practice many citations point to
non-existent or ambiguous nodes - and this is in some cases irresolvable.
An article can be created (and many are) without citations. An article must
have a single set of bibliographic data.
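
To make the graph picture concrete, here is a minimal Python sketch. It is
only an illustration of the nodes-and-edges idea, not part of the principles:
the class names, fields and identifiers (BibRecord, Citation, CitationGraph,
"id:a" etc.) are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class BibRecord:
    """Core bibliographic data for one article (a node). Hypothetical fields."""
    identifier: str
    title: str
    language: str = "en"


@dataclass(frozen=True)
class Citation:
    """An annotated edge from a citing article to a cited one."""
    source: str
    target: str
    note: str = ""


@dataclass
class CitationGraph:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_record(self, record):
        # Each article has exactly one set of core bibliographic data.
        if record.identifier in self.nodes:
            raise ValueError(f"duplicate record: {record.identifier}")
        self.nodes[record.identifier] = record

    def add_citation(self, citation):
        # Citations are accepted even if the target is unknown or ambiguous.
        self.edges.append(citation)

    def dangling(self):
        """Return citations whose target is not a known node."""
        return [e for e in self.edges if e.target not in self.nodes]


graph = CitationGraph()
graph.add_record(BibRecord("id:a", "Article A"))
graph.add_record(BibRecord("id:b", "Article B"))  # an article with no citations
graph.add_citation(Citation("id:a", "id:b", note="builds on"))
graph.add_citation(Citation("id:a", "id:missing", note="cites"))

for edge in graph.dangling():
    print(f"{edge.source} cites unresolved target {edge.target}")
```

The node set can be complete and verifiable on its own, while the edge list
may contain entries that cannot be resolved; that asymmetry is the reason for
keeping the two apart.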

There is also a pragmatic aspect. Whether we like it or not the citations
are often hidden behind paywalls and are often regarded as the property of
the publisher. We may not like it and we may attempt to change it but it's a
current fact. It will take time (years?) to get Citations made Open. We have
strategies - which include author support, beneficent journals, funder
requirements etc. and these will win, but at present this is a much more
messy area than bibliography.

So we should not let Citations and Bibliography get muddled - they are
different things. I'm not quite sure what you are proposing - that
bibliographic data should include lists of citing authorities? If so this is
technically hard, and is dynamic (whereas much Bibliography once created is
effectively static for ever after).

So I would argue against adding this - we are already tackling citations and
will continue to do so, but it's a longer effort. If we can get Open
Bibliography adopted - especially across a wide section of the community -
authors, academia, publishers, funders, etc. we can move to the next
challenge.

Open Bibliographic Data is concise and compelling. I expect we can get many
people/orgs to sign up to it. That enhances our credibility.

P.




> --Jim
> ----------------------------------------------
> Jim Pitman
> Director, Bibliographic Knowledge Network Project
> http://www.bibkn.org/
>
> Professor of Statistics and Mathematics
> University of California
> 367 Evans Hall # 3860
> Berkeley, CA 94720-3860
>
> ph: 510-642-9970  fax: 510-642-7892
> e-mail: pitman at stat.berkeley.edu
> URL: http://www.stat.berkeley.edu/users/pitman
>
>
> > Hello Jim,
> >
> > thanks for your valuable input. I incorporated both your proposals
> > into the document. In principle 3 it now reads: "These licenses make
> > it impossible to effectively integrate and re-purpose datasets. They
> > furthermore prevent commercial services which add value to
> > bibliographic data or commercial activities which could be used to
> > support data preservation."
> >
> > I hope everybody is OK with this. Anyway, it's still time for more
> > changes. e.g. to change the "strongly recommend" part...
> >
> > Attached is a PDF version of the current principles draft, which
> > Peter might use for the symposium. Fortunately it fits on two pages!
> > (Karen, as I made some more changes I created a new pdf.)
> >
> > Adrian
> >
> > 2011/1/7 Jim Pitman <pitman at stat.berkeley.edu>:
> > > Adrian Pohl <adrian.pohl at okfn.org> wrote:
> > >
> > >> I've already made some more changes on the google doc
> > >> and added comments to the document to initiate further discussion, see
> > >> http://bit.ly/gIfB11
> > >
> > > Overall, it looks very good to me now. Especially, staying away from
> > > the copyrightable/non-copyrightable issue seems very effective.
> > > I added a few suggestions like this  [ suggestion JP].
> > > Adrian, please incorporate as you see fit. Or others could add their
> > > approval or disapproval inside the [...]. I'm not sure what the
> > > protocol is for making changes.
> > > --Jim
> > >
> > > _______________________________________________
> > > open-bibliography mailing list
> > > open-bibliography at lists.okfn.org
> > > http://lists.okfn.org/mailman/listinfo/open-bibliography
> > >
>



-- 
Peter Murray-Rust
Reader in Molecular Informatics
Unilever Centre, Dep. Of Chemistry
University of Cambridge
CB2 1EW, UK
+44-1223-763069