[open-bibliography] WorldCat API and Licensing

Peter Murray-Rust pm286 at cam.ac.uk
Tue Jan 11 08:42:03 UTC 2011


The NC versus non-NC argument is an important one and will continue for a
long time. I can see that there are areas where NC is a useful pragmatic way
forward. We've been having these discussions with a publisher who - after
several years - has allowed <0.1% of its corpus to be made available for us
as training data to support our text-mining software (which is
OSI-licensed). That means that if we use the data for training our software
we cannot licence the data as compatible with OSI (F/OSS). We cannot
distribute our software unless we say - "you can't train this software if
you are a commercial entity" - you can use it but not develop your own
models using this corpus. We've argued with the publisher but we are stuck
on this. So we cannot distribute the training data.

The same would be true of bibliographic data. Suppose I wanted to develop
machine-learning on bibliographic data  - something I may well wish to do.
The NC makes it not compatible with F/OSS.

There is a huge friction associated with negotiations. Many rights holders
don't reply, many set additional restrictions. We've spent 15 months with a
publisher who has said that yes, we can text-mine their data, as long as all
the results belong to them and we don't publish any of our work. That wasted
huge amounts of our time. By contrast I can use OKD - F/OSS material without
spending more than a minute or so looking at the licence.

On Tue, Jan 11, 2011 at 1:41 AM, Jim Pitman <pitman at stat.berkeley.edu> wrote:

> Peter Murray-Rust <pm286 at cam.ac.uk> wrote:
>
> > I'd agree with what Tom says. There may or may not be uses for
> > NC-bibliography but it cannot be included in Open Bibliography as we are
> > using it. The increasing availability of NC may help change the culture
> > but at the bottom we have to have complete Openness.
>
> I don't see why an NC clause significantly inhibits free exchange of
> bibliographic data. That is enough for my purposes. Generally, I see the
> arguments against NC as greatly overstated. There are an increasing number
> of agents who seem to be willing to tolerate NC, so I think it is
> important to take them at their word and demonstrate significant NC
> applications of their data.
>
I have no objection to anyone using NC if it works in their field of
endeavour and they are able to work with agents who tolerate NC. But it
doesn't extend to OKF activities any more than it extends to GNU/FSF. There
are NC software licences but they don't work with any OSI-software.


> Also, just because some component of bib data I use comes with an NC clause
> does not seem to prevent me from saying my contribution is CC0. I claim no
> additional copyright layer, I declare where my data come from and who has
> checked them, and move on to other things. Others can do whatever they
> want with the data.  What is wrong with this picture?
>
We believe - and we are going ahead with the pragmatics - that individual
data are not copyrightable. Thus if you or I with our own efforts create a
bibliographic datum that's fine. If, however, we get 100,000 entries from a
supplier who puts non-OKF restrictions on them we cannot extract the items
from that set. The supplier can - and as you and others hint - probably will
challenge that.

> > We assert that individual bibliographic components are Open so that we can
> > reasonably show that we have not taken them from someone else's
> > collection.
>
The issue is about taking them from a collection. I don't believe that
there is any practice of licensing individual data.  If there is then we
simply get the datum from another source.


> I don't see how "we" can make individual bibliographic components Open by
> saying they are. Either they are by law, or they aren't, and some copyright
> owner, not us, can make them so.
>
It is our belief that individual items are not copyrightable any more than
- say - a street sign or a postcode is copyrightable. OpenStreetMap goes
round roads and records the names of streets. They are scrupulously careful
not to take them off copyrighted maps.


> > Since most of these are normalizable data then once we have got them,
> > they act as a permanent record.
>
> OK
>
> > If we take them from NC collections then they contaminate the rest of our
> > Open collection.
>
> So what? Why is that so bad? I just don't get the virulence of the argument
> against NC. What I do see is there are a large number of major biblio data
> providers in the article space where I think I could fairly easily persuade
> them to provide data NC, by arguing e.g. the OCLC has already done so, it's
> the emerging standard, .... but it may be very tough to get a full CC0
> declaration from these sources. I'm willing to try, but I'm not going to
> spend a lot of hours on this. I'd rather spend the hours doing creative
> processing of data I can get hold of with NC.
>

That's fine and we aren't stopping you. We're simply making it clear that it
can't be part of OKF's Open Bibliography.

>
> > I am optimistic that as we get momentum then there are enough
> > OKD-compliant data that we can build the mass of Open Bibliography quite
> > quickly.
>
> I agree with this in the Book biblio space, but not in the Article space.
> I think the vested interests which hold most of the article data,
> which are not the libraries but the publishers and A&I services, may be
> too entrenched to yield this data easily. Certainly, I see great
> resistance in the field of mathematics.
>
This in a sense is the point. There are major interests at stake. Any
negotiation is (a) costly in time (b) usually unclear (c) often fails.

I'm going to be pragmatic - I have to concentrate my energies on Launching
Open Bibliographic Principles on Monday. I hope the authors will be present
virtually.

P.


-- 
Peter Murray-Rust
Reader in Molecular Informatics
Unilever Centre, Dep. Of Chemistry
University of Cambridge
CB2 1EW, UK
+44-1223-763069