[open-bibliography] Using IsItOpenData to get information about licensing policies

Thu Mar 17 22:13:21 UTC 2011

On Thu, Mar 17, 2011 at 8:13 PM, Jim Pitman <pitman at stat.berkeley.edu> wrote:

> Following up on a point from Peter's previous:
>
> >> We have been crawling the commercial publishers and should have
> >> potentially 10 million records.
>
> First a few questions for Peter, then some for the group:
>

>
> For Peter:
>
> Where/when will the results of this crawling be made available?
>

When we have time to concentrate on how that is done.

> With what attributes/schema (especially, any subject classification?

That may be copyright, so we don't save it.

> , citation data?)

That requires full text, which we don't have access to.

> and what license?
>

CC0 - the STM publishers agree that bib metadata are not copyright.

> What explicit permissions do you have from which publishers?
>

We don't need them.

> What if any basis for updating?
>

No idea, but the software can be rerun at any time.

> How will your crawl compare with http://www.journaltocs.hw.ac.uk/ ?
>

It will get the article metadata, not the TOCs.

> For the group:
>
> Who else in the group besides Peter, myself(Jim) and Thomas is involved in
> medium to large scale
> efforts to aggregate article metadata?
>

> How to avoid unnecessary duplication of such efforts?
>

Coordinate it here.

> How to ensure consistency and viability of a large open store of article
> metadata from multiple sources?
>

Let's see what we get. We can't really tell until we analyze it.

> What organization can be trusted to openly maintain such a large database?
> Are we thinking OKFN. If not OKFN then what?
>

Article metadata is pretty static. The changes (such as withdrawn articles)
are generally not consistently managed by the publishers anyway.

> By what business model could the database be maintained?
>

I don't think we can answer this now. When academia shows some interest in
actually doing anything it will be easy. The best we can do is build a
bottom-up demonstrator.

> How should authority to write to the database be managed?
>

We'll face that when we have some interest from outside the current group.
It's completely dependent on what the business model is.

> Excuse so many questions, but it seems timely to initiate some general
> discussion of these
> issues.
>

What is happening is that various prototypes are being created by early
adopters. It's clear that some types of metadata are easy to collect at very
low cost. When we have got those we'll get some idea of quality - e.g. are
records from different sources compatible.  Then - I hope - we'll get some
interest from potential funders.

If academia realises that it can take control of its own we may have some
traction. The quality of commercial bib data is not necessarily good by
definition - and I have some examples of awful stuff. But it's much easier
to spend millions of pounds of academic money than to ...

We shall get something useful but ...

P.

-- 
Peter Murray-Rust
Reader in Molecular Informatics
Unilever Centre, Dep. Of Chemistry
University of Cambridge
CB2 1EW, UK
+44-1223-763069