[openbiblio-dev] Bibsoup legal issues

Jim Pitman pitman at stat.Berkeley.EDU
Thu Sep 29 14:53:30 UTC 2011


else want to participate?

--Jim

Peter Murray-Rust <pm286 at cam.ac.uk> wrote:

> On Wed, Sep 28, 2011 at 9:48 PM, Karen Coyle <kcoyle at kcoyle.net> wrote:
>
> > Quoting Jim Pitman <pitman at stat.Berkeley.EDU>:
> >
> >  The recent upload
> >>
> >> http://bibsoup.net/collection/CM:Neyman__Jerzy_pub
> >>
> >> raises some systems and metadata and legal issues which we need to deal
> >> with.
> >>
> >> This collection is now displayed with no source and no copyright
> >> indication.
> >>
> >
> > Didn't we write a set of principles stating that bibliographic data is
> > not/should not be considered to be under copyright? Or are you considering
> > the collection copyright-able?
>
>
> Karen, you are right about BibSoup as an Open collection of open
> bibliography (and we should not deviate from this). We are also discussing
> the use of BibJSON and Bibserver more generally. We talked about some of
> this with Rufus and Mark yesterday. Both BibJSON and Bibserver can be used
> with non-OKD-compliant material and so it's important to be able to
> represent this
>
> The particular issue yesterday (and I doubt it will be the last) was who had
> authority to upload bibliographic records and to amend them. This isn't a
> licence problem - it's a community regulation of practice, rather like
> Wikipedia. One knotty point was whether they could be anonymous
> contributions - this is technically possible but is it desirable? How much
> software should be developed to support this, etc. There is probably a need
> to identify roles - "owner" of a collection or of an entry, "manager" (e.g.
> of a collection), etc. We felt that records would benefit from having
> individual "owner"s - these might be dc:creators or something else.
>
>   I really don't like this.  I think every collection displayed
> > on the bibsoup should have a source file on the web.
> >
>
> I think this addresses the "Open API" aspect - the need to be able to
> iterate over a complete collection.
>
> I'm not sure what the nature of this source file would be. Couldn't it be a
> > temporary output from something like Zotero or Mendeley, and would not be
> > retained after uploading?
> >
> > Are users expected to re-upload their collections when they add something
> > new?
> >
>
> These are exactly the sort of problems we need to address - how do we
> increment a collection? At present Jim/Mark have come up with some use cases
> to inform the software development and we need to start getting some clear
> directions.
>
> >
> > If they can add to the collections without re-uploading, then the "source"
> > needs to be at the citation level, not the collection level. But I still
> > think that there shouldn't necessarily be a permanent source file for the
> > collection.
> >
>
> There needs to be a complete statement of what is in the collection, and I
> think it needs to be a static view, not a programmatic one. For myself I
> don't see a source file as a major burden. Let's say we have 100,000 items
> in a collection - that won't kill today's systems.
>
> >
> > It seems to me that the best thing to capture is some identity (email) of
> > the person who uploaded it. That implies that the registration facility use
> > an email to the registrant to validate the email address.
> >
>
> Exactly our thoughts yesterday :-)
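
The email-validation step suggested above could be sketched roughly as follows. This is only an illustration, not anything the thread specifies: `SECRET_KEY`, `make_token` and `verify_token` are hypothetical names, and the assumption is that the registration facility mails the token to the registrant and checks it when they send it back.

```python
import hashlib
import hmac
import secrets

# Hypothetical server-side secret; a real deployment would load it from configuration.
SECRET_KEY = secrets.token_bytes(32)


def make_token(email: str) -> str:
    """Return an HMAC-SHA256 token bound to the normalised email address."""
    normalised = email.strip().lower().encode("utf-8")
    return hmac.new(SECRET_KEY, normalised, hashlib.sha256).hexdigest()


def verify_token(email: str, token: str) -> bool:
    """Check a token returned by the registrant, using a constant-time comparison."""
    expected = make_token(email).encode("utf-8")
    return hmac.compare_digest(expected, token.encode("utf-8"))
```

The token would be embedded in a confirmation link sent to the address; the uploader's identity is recorded only once `verify_token` succeeds.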
>
>
> >
> > kc
> >
> >
> > I'll also comment on ?Jim
>
>  What we are doing with this upload from file option is allowing users to
> >> publish bibs to the web. But no metadata is being collected to
> >> show
> >> * what is the source of the data.
> >> * who the user is
> >> * what if any rights they may claim over the content,
> >>
> >
> *3 - the point is that we shall probably have entries in the
> Bibserver/BibJSON environment which have potential IP. We may strive (and
> succeed) in getting this removed. But we have to be able to record it.
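
One way to record the three missing items (source, user, rights) is a small provenance record kept with each upload. The sketch below is an assumption for illustration only; `UploadProvenance` and its field names are hypothetical and are not part of BibJSON or Bibserver.

```python
from dataclasses import dataclass, asdict
from typing import List, Optional


@dataclass
class UploadProvenance:
    """Hypothetical provenance record stored alongside an uploaded collection."""
    source: Optional[str]    # URL or description of where the data came from
    uploader: Optional[str]  # validated identity (e.g. email) of the uploader
    rights: Optional[str]    # licence or rights statement, if any is claimed

    def missing_fields(self) -> List[str]:
        """Names of the fields that have not been supplied, in declaration order."""
        return [name for name, value in asdict(self).items() if not value]
```

An upload form could refuse to publish a collection, or flag it for review, while `missing_fields()` is non-empty.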
>
> >
> >> I find this quite troubling.  I think we need to try to separate three
> >> different functions of the bibsoup/bibserver
> >>
> >> 1) capability to display a bib file from just about any more/less open
> >> biblio source on the web: we should expect such sources to have stable
> >> addresses
> >>
> >
> This is effectively a static function: static Bibserver.render(bibEntry). It
> can be pointed at anything whether online or onPC.
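
The "static function" idea can be illustrated with a minimal sketch. The class below is a hypothetical stand-in, not the actual Bibserver code, and it assumes a BibJSON-like entry is a plain dict with `title`, `year` and a list of `author` objects carrying a `name`.

```python
class Bibserver:
    """Hypothetical stand-in used only to illustrate a stateless render call."""

    @staticmethod
    def render(bib_entry: dict) -> str:
        """Format one BibJSON-like entry as a single display line."""
        names = [a.get("name", "") for a in bib_entry.get("author", [])]
        authors = ", ".join(n for n in names if n) or "[no author]"
        title = bib_entry.get("title", "[no title]")
        year = bib_entry.get("year", "n.d.")
        return f"{authors} ({year}). {title}."
```

Because `render` keeps no state and needs no stored collection, it can be applied equally to an entry fetched from a URL or read from a local file.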
>
> >
> >> 2) capability to upload and provide a url for any bib a user offers from
> >> their desktop.  But for this, we must have some clickthrough legal page,
> >> which licenses the
> >>  content and makes it clear what the source is.  I think this is like a
> >> pre-stage to 1), and from there the display is identical to 1). It is not
> >> properly separated from 1) at present.
> >>
> >
> Yes - we may have to have a number of holding tanks to receive fairly raw
> material and sort out the problems before commitment, such as legalities,
> normalisation, deduplication, etc.
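
A holding tank of this kind might apply normalisation and deduplication before records are committed. The following is a rough sketch under stated assumptions: records are plain dicts, and two records count as duplicates when their normalised title and year match. That matching rule is a simplification chosen for illustration, not one proposed in the thread.

```python
import re
from typing import Dict, List


def normalise_title(title: str) -> str:
    """Lower-case a title and collapse punctuation and whitespace to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def deduplicate(records: List[Dict]) -> List[Dict]:
    """Keep the first record for each (normalised title, year) pair."""
    seen = set()
    unique = []
    for record in records:
        key = (normalise_title(record.get("title", "")), str(record.get("year", "")))
        if key in seen:
            continue
        seen.add(key)
        unique.append(record)
    return unique
```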
>
> >
> >> 3) capability to acquire, cache and merge public domain components of
> >> biblio data uploaded to the bibsoup from whatever sources
> >>
> >> There are also legal issues around 1).
> >> As long as we just offer display over files that are already accessible on
> >> the web, we are probably fairly safe.
> >> But we should think about our legal/licensing strategy. This is not
> >> formulated at present.
> >>
> >
> We probably also need a set of Foo2BibJSON and BibJSON2Foo converters,
> including online services. This should be a star-geometry with BibJSON at
> the centre. In this way BibJSON becomes the lingua franca for converting
> between commonly used bibliographic / reference management systems. This
> service could also have a BibJSON2Bar display where Bar was any of the
> common publication formats (e.g. Harvard)
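
The star geometry can be sketched as two registries of converters with BibJSON in the middle. Everything here is illustrative: the registry names, the `convert` helper and the tiny RIS subset (only the `AU`, `TI` and `PY` tags) are assumptions, and the display format is a simple author-year line rather than any official citation style.

```python
from typing import Callable, Dict

# Hypothetical registries: one importer and one exporter per format.
TO_BIBJSON: Dict[str, Callable[[str], dict]] = {}
FROM_BIBJSON: Dict[str, Callable[[dict], str]] = {}


def ris_to_bibjson(text: str) -> dict:
    """Parse a minimal RIS subset (AU, TI, PY tags) into a BibJSON-like dict."""
    record: dict = {"author": []}
    for line in text.splitlines():
        tag, sep, value = line.partition("  - ")
        if not sep:
            continue  # skip lines that are not "TAG  - value"
        tag, value = tag.strip(), value.strip()
        if tag == "AU":
            record["author"].append({"name": value})
        elif tag == "TI":
            record["title"] = value
        elif tag == "PY":
            record["year"] = value
    return record


def bibjson_to_ris(record: dict) -> str:
    """Write the same minimal RIS subset back out."""
    lines = [f"AU  - {a['name']}" for a in record.get("author", [])]
    if "title" in record:
        lines.append(f"TI  - {record['title']}")
    if "year" in record:
        lines.append(f"PY  - {record['year']}")
    return "\n".join(lines)


def bibjson_to_display(record: dict) -> str:
    """A BibJSON2Bar-style display line in a simple author-year layout."""
    authors = ", ".join(a["name"] for a in record.get("author", []))
    return f"{authors} ({record.get('year', 'n.d.')}) {record.get('title', '')}"


TO_BIBJSON["ris"] = ris_to_bibjson
FROM_BIBJSON["ris"] = bibjson_to_ris
FROM_BIBJSON["display"] = bibjson_to_display


def convert(text: str, source: str, target: str) -> str:
    """Convert between any two registered formats by going through BibJSON."""
    return FROM_BIBJSON[target](TO_BIBJSON[source](text))
```

With BibJSON as the hub, adding a new format means writing at most two converters, instead of one pair for every other format already supported.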
>
> >
> >> We may also consider providing users with private displays  of their data.
> >> But I advise against this, because we then have to provide security on
> >> collections,
> >> privacy policy, ...  which is costly. Let's leave that to others. I think
> >> we should say that anyone can upload their data to bibsoup, but that in
> >> doing so they make their
> >> content more public than it otherwise would be, unless they provide
> >> explicit licenses on their records or collections.
> >>
> >
> I agree. Access control is hard. We have to build hooks into Bibserver, but
> whether we have to implement them is unclear.
>
>
> > This is something we will have to work through also with the bigger
> >> suppliers like Mendeley and Microsoft.
> >>
> >>
>
> -- 
> Peter Murray-Rust
> Reader in Molecular Informatics
> Unilever Centre, Dep. Of Chemistry
> University of Cambridge
> CB2 1EW, UK
> +44-1223-763069



