[open-science] feedback wanted on text-mining initiatives

Nick Barnes nb at climatecode.org
Tue Apr 24 10:01:57 UTC 2012


On Tue, Apr 24, 2012 at 08:36, Peter Murray-Rust <pm286 at cam.ac.uk> wrote:
> This is clearly a matter of great and immediate importance. A few of us met
> yesterday at Oxford and discussed Heather's excellent blog posts. The
> immediate outcome is that JennyM, DianeC and I will try to pull material
> together.
>
> My memory is that about 18 months ago we started to pull together something
> under "Panton papers" and there may be some early drafts on OKF wikis.
>
> It will not be easy to get a rapid and authoritative paper about text-mining
> (I prefer to call it information-mining) - there are lots of technical
> details and legal concerns. This is why it took 2 years to hack out the
> details of the Panton Principles. Therefore I think we should aim for a
> background (non-normative) summary of the current position and a set of
> general principles which deliberately (at this stage) leave details to be
> filled in.
>
> Ideally we would like to come up with principles to which all parties
> (authors, publishers, libraries) could put their names. We shall probably be
> asking for more than some publishers currently allow in their contractual
> restrictions - we would ask them to realise the many benefits they will get
> from allowing mining.
>
> Among the things which we probably should not address are:
> * what can and cannot be mined and reproduced
> * to what spread of activities (e.g. science) this belongs
> * mechanisms for making it happen (including better technical provisions)
>
> It is particularly important that we do not give away rights that we already
> believe we have or which we can reasonably aspire to. This is not a
> negotiation, it's a statement of principles.
>
> We'll be doing this on this list and on the OKF wiki pages.

I spent an hour or so trying to come up with pithy or snappy
expressions, but only got this far:

We can and will process any documents we can read with any software we want.

(or: the right to read is the right to mine).

We can and will publish facts which we discover by reading and
processing documents.

(or: facts don't belong to anyone).
-- 
Nick Barnes, Climate Code Foundation, http://climatecode.org/



