[open-science] text-mining restrictions redux

Vision, Todd J tjv at bio.unc.edu
Wed Mar 7 21:07:21 UTC 2012


Hello all,

When text mining was a hot topic on this list last year, Jenny Molloy started an open Google Doc to keep track of what publishers did and did not allow their subscribers to do: http://bit.ly/zyyR98

The spreadsheet lists, for each publisher: 
- a link to their standard license agreement, where possible
- whether that agreement explicitly prohibits text/data mining or not (or the language is ambiguous, as in the case of Wiley)
- the relevant quote upon which that interpretation is based

In response to the recent discussions around Elsevier's text-mining policy, it has become apparent that publishers whose standard license agreements disallow text-mining may also add language *allowing* text-mining for those customers who have specifically negotiated (or demanded?) it.

So a column has now been added to include the existence or ideally the language of such additional agreements, where known.  Unfortunately, Elsevier has expressly asked that this language not be shared.  We currently lack information on whether equivalent "over-rides" are available for subscribers to other publishers.

Regardless of how one feels about the legal and moral standing of publishers to control the use of their content in this way, it would be valuable to keep this resource current as documentation of the obstacles that are put in the way of effective use of the literature.

So this is an open invitation to review the document and to fix any errors or add any information that you may have.  It will be instructive to compare the responses from PMR's enquiry with this information, which was available a year or so ago.

The NPG language that appears to disallow text-mining is particularly interesting in light of the editorial that appeared in the latest issue of Nature: "Publishers and scientists should do more to foster the mining of research literature by computer" [1]

For last year's thread, which had some interesting and still relevant points, see the subject "text-mining restrictions - a plea for more information" from April 2011.

cheers,
Todd

[1] http://dx.doi.org/10.1038/483124a


