[open-science] feedback wanted on text-mining initiatives

Peter Murray-Rust pm286 at cam.ac.uk
Tue Apr 24 07:36:40 UTC 2012


This is clearly a matter of great and immediate importance. A few of us met
yesterday at Oxford and discussed Heather's excellent blog posts. The
immediate outcome is that JennyM, DianeC and I will try to pull material
together.

As I recall, about 18 months ago we started to pull something together
under "Panton papers", and there may be some early drafts on OKF wikis.

It will not be easy to produce a rapid and authoritative paper about
text-mining (I prefer to call it information-mining) - there are lots of
technical details and legal concerns. Similar difficulties are why it took
2 years to hack out the details of the Panton Principles. Therefore I think
we should aim
for a background (non-normative) summary of the current position and a set
of general principles which deliberately (at this stage) leave details to
be filled in.

Ideally we would like to come up with principles to which all parties
(authors, publishers, libraries) could put their names. We shall probably
be asking for more than some publishers currently allow in their
contractual restrictions - we would ask them to realise the many benefits
they will get from allowing mining.

Among the things which we probably should not address are:
* what can and cannot be mined and reproduced
* which fields of activity (e.g. science) this applies to
* mechanisms for making it happen (including better technical provisions)

It is particularly important that we do not give away rights that we
already believe we have or which we can reasonably aspire to. This is not a
negotiation, it's a statement of principles.

We'll be doing this on this list and on the OKF wiki pages.

P.



-- 
Peter Murray-Rust
Reader in Molecular Informatics
Unilever Centre, Dept. of Chemistry
University of Cambridge
CB2 1EW, UK
+44-1223-763069