[openbiblio-dev] Bibsoup legal issues

Peter Murray-Rust pm286 at cam.ac.uk
Thu Sep 29 13:09:32 UTC 2011

On Wed, Sep 28, 2011 at 9:48 PM, Karen Coyle <kcoyle at kcoyle.net> wrote:

> Quoting Jim Pitman <pitman at stat.Berkeley.EDU>:
>> The recent upload
>> http://bibsoup.net/collection/CM:Neyman__Jerzy_pub
>> raises some systems and metadata and legal issues which we need to deal
>> with.
>> This collection is now displayed with no source and no copyright
>> indication.
> Didn't we write a set of principles stating that bibliographic data is
> not/should not be considered to be under copyright? Or are you considering
> the collection copyright-able?

Karen, you are right about BibSoup as an Open collection of open
bibliography (and we should not deviate from this). We are also discussing
the use of BibJSON and Bibserver more generally. We talked about some of
this with Rufus and Mark yesterday. Both BibJSON and Bibserver can be used
with non-OKD-compliant material and so it's important to be able to
represent this.

The particular issue yesterday (and I doubt it will be the last) was who had
authority to upload bibliographic records and to amend them. This isn't a
licence problem - it's a community regulation of practice, rather like
Wikipedia. One knotty point was whether they could be anonymous
contributions - this is technically possible but is it desirable? How much
software should be developed to support this, etc. There is probably a need
to identify roles - "owner" of a collection or of an entry, "manager" (e.g.
of a collection), etc. We felt that records would benefit from having
individual "owner"s - these might be dc:creators or something else.

> I really don't like this. I think every collection displayed
> on the bibsoup should have a source file on the web.

I think this addresses the "Open API" aspect - the need to be able to
iterate over a complete collection.

> I'm not sure what the nature of this source file would be. Couldn't it be a
> temporary output from something like Zotero or Mendeley, and would not be
> retained after uploading?
> Are users expected to re-upload their collections when they add something
> new?

These are exactly the sort of problems we need to address - how do we
increment a collection? At present Jim/Mark have come up with some use cases
to inform the software development and we need to start getting some clear

> If they can add to the collections without re-uploading, then the "source"
> needs to be at the citation level, not the collection level. But I still
> think that there shouldn't necessarily be a permanent source file for the
> collection.

There needs to be a complete statement of what is in the collection, and I
think it needs to be a static view, not a programmatic one. For myself I
don't see a source file as a major burden. Let's say we have 100,000 items
in a collection - that won't kill today's systems.

> It seems to me that the best thing to capture is some identity (email) of
> the person who uploaded it. That implies that the registration facility use
> an email to the registrant to validate the email address.

Exactly our thoughts yesterday :-)

> kc
> I'll also comment on ?Jim

>> What we are doing with this upload from file option is allowing users to
>> publish bibs to the web. But no metadata is being collected to
>> show
>> * what is the source of the data.
>> * who the user is
>> * what if any rights they may claim over the content,
On the third point: we shall probably have entries in the
Bibserver/BibJSON environment which carry potential IP. We may strive (and
succeed) in getting this removed, but we have to be able to record it.
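
As a rough illustration of what "being able to record it" might look like, here is a minimal Python sketch that attaches the three pieces of metadata Jim lists (source, user, rights) to a record. The function name `with_provenance` and the keys `source`, `uploader` and `rights` are hypothetical and are not defined BibJSON fields.

```python
import json


def with_provenance(record, source, uploader, rights=None):
    """Return a copy of `record` with hypothetical provenance fields attached.

    The keys "source", "uploader" and "rights" are illustrative assumptions,
    not defined BibJSON fields.
    """
    annotated = dict(record)          # shallow copy; the caller's record is untouched
    annotated["source"] = source      # where the data came from
    annotated["uploader"] = uploader  # who uploaded it
    annotated["rights"] = rights      # any rights claimed; None if no claim was recorded
    return annotated


# Placeholder entry and addresses, invented for this sketch.
entry = {"title": "An example title", "author": [{"name": "A. N. Author"}]}
annotated = with_provenance(
    entry,
    source="http://example.org/refs.bib",
    uploader="someone@example.org",
)
print(json.dumps(annotated, indent=2))
```

Keeping `rights` as an explicit field, even when empty, lets us distinguish "no claim recorded" from "nobody asked".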

>> I find this quite troubling.  I think we need to try to separate three
>> different functions of the bibsoup/bibserver
>> 1) capability to display a bib file from just about any more/less open
>> biblio source on the web: we should expect such sources to have stable
>> addresses
This is effectively a static function: static Bibserver.render(bibEntry). It
can be pointed at anything, whether online or on a PC.
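
A minimal Python sketch of such a static render function follows. The class here is only a hypothetical stand-in for Bibserver, and the entry layout (`author` as a list of `{"name": ...}` dicts, plus `title` and `year`) is an assumption made for illustration.

```python
class Bibserver:
    """Hypothetical stand-in for the rendering side of Bibserver."""

    @staticmethod
    def render(bib_entry):
        """Format one entry (a plain dict) as a single display line.

        Uses no stored state, so it works on an entry from any source.
        """
        authors = ", ".join(a.get("name", "?") for a in bib_entry.get("author", []))
        authors = authors or "[no author]"
        title = bib_entry.get("title", "[no title]")
        year = bib_entry.get("year", "n.d.")
        return f"{authors} ({year}). {title}."


# Placeholder entry, invented for this sketch.
entry = {"author": [{"name": "A. N. Author"}], "title": "An example title", "year": "2011"}
print(Bibserver.render(entry))  # A. N. Author (2011). An example title.
```

Because the function depends only on its argument, it does not matter whether the entry was fetched from a URL or read from a local file.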

>> 2) capability to upload and provide a url for any bib a user offers from
>> their desktop.  But for this, we must have some clickthrough legal page,
>> which licenses the
>>  content and makes it clear what the source is.  I think this is like a
>> pre-stage to 1), and from there the display is identical to 1). It is not
>> properly separated from 1) at present.
Yes - we may have to have a number of holding tanks to receive fairly raw
material and sort out the problems before commitment, such as legalities,
normalisation, deduplication, etc.

>> 3) capability to acquire, cache and merge public domain components of
>> biblio data uploaded to the bibsoup from whatever sources
>> There are also legal issues around 1).
>> As long as we just offer display over files that are already accessible on
>> the web, we are probably fairly safe.
>> But we should think about our legal/licensing strategy. This is not
>> formulated at present.
We probably also need a set of Foo2BibJSON and BibJSON2Foo converters,
including online services. This should be a star-geometry with BibJSON at
the centre. In this way BibJSON becomes the lingua franca for converting
between commonly used bibliographic / reference management systems. This
service could also have a BibJSON2Bar display where Bar was any of the
common publication formats (e.g. Harvard).
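
The star geometry can be sketched in Python as two registries of converters with a single `convert` function that always goes through the hub. Everything here is hypothetical: the registry names, the toy "tsv" and "citation" formats, and the dict layout standing in for BibJSON are assumptions for illustration, not existing Bibserver code.

```python
# Registries of hypothetical converters; each maps a format name to a function.
TO_BIBJSON = {}    # Foo2BibJSON: format name -> parser returning a dict
FROM_BIBJSON = {}  # BibJSON2Foo: format name -> serializer taking a dict


def convert(data, source_format, target_format):
    """Convert between any two registered formats by going through the hub."""
    record = TO_BIBJSON[source_format](data)
    return FROM_BIBJSON[target_format](record)


# Toy formats, invented for this sketch only.
def tsv_to_bibjson(line):
    author, year, title = line.split("\t")
    return {"author": [{"name": author}], "year": year, "title": title}


def bibjson_to_tsv(record):
    names = "; ".join(a["name"] for a in record["author"])
    return "\t".join([names, record["year"], record["title"]])


def bibjson_to_citation(record):
    names = ", ".join(a["name"] for a in record["author"])
    return f"{names} ({record['year']}) {record['title']}."


TO_BIBJSON["tsv"] = tsv_to_bibjson
FROM_BIBJSON["tsv"] = bibjson_to_tsv
FROM_BIBJSON["citation"] = bibjson_to_citation

print(convert("A. N. Author\t2011\tAn example title", "tsv", "citation"))
```

With N formats this needs at most 2N converters rather than one for every ordered pair, which is the main attraction of putting BibJSON at the centre.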

>> We may also consider providing users with private displays  of their data.
>> But I advise against this, because we then have to provide security on
>> collections,
>> privacy policy, ...  which is costly. Lets leave that to others. I think
>> we should say that anyone can upload their data to bibsoup, but that in
>> doing so they make their
>> content more public than it otherwise would be, unless they provide
>> explicit licenses on their records or collections.
I agree. Access control is hard. We have to build hooks into Bibserver, but
whether we have to implement them is unclear.

> This is something we will have to work through also with the bigger
>> suppliers like Mendeley and Microsoft.

Peter Murray-Rust
Reader in Molecular Informatics
Unilever Centre, Dep. Of Chemistry
University of Cambridge