[open-science] text-mining restrictions - a plea for more information

Clifford Lynch cliff at cni.org
Sun Apr 17 20:06:03 UTC 2011

Two quick points on this.

First, basically any of the contracts from state institutions in the 
US are public record and can be obtained under state Freedom of 
Information Act laws. In addition, there is a move underway within 
the ARL libraries (both public and private) to stop writing contracts 
with non-disclosure clauses; there's a feeling that greater 
transparency, both in financial terms and in terms of usage 
conditions and restrictions, is desirable. I believe that there was a 
piece in the Chronicle of Higher Education a couple of weeks ago 
discussing a Cornell position statement on this.

But having said this, usually the terms that the publishers are 
trying to keep secret are financial; as you say, if there's a 
prohibition on mass downloading of articles, it's pretty useless if 
people in the institutional community are not aware of it. I would 
suspect that if you contact your university library and ask them 
about contractual restrictions on bulk downloading or crawling, 
they'll be quite forthcoming.

I believe that such clauses are pretty commonplace.  They often deal 
with both crawling and also with downloading "significant" portions 
of the journal article databases onto local faculty or student 

Clifford Lynch
Director, CNI

At 19:07 +0100 04/17/11, Peter Murray-Rust wrote:
>On Sun, Apr 17, 2011 at 3:38 PM, Vision, Todd J 
><tjv at bio.unc.edu> wrote:
>Peter's draft whitepaper on text-mining is badly needed and nicely 
>put.  I was particularly interested in this passage:
>"The provision of journal articles is controlled not only by 
>copyright but also (for most scientists) the contracts signed by the 
>institution. These contracts are usually not public. We believe 
>(from anecdotal evidence) that there are clauses forbidding the use 
>of systematic machine crawling of articles, even for legitimate 
>scientific purposes."
>Thank you very much for giving me further encouragement.
>We have also heard tell of the existence of such clauses, but also 
>have not been able to secure first-hand evidence for them.  It would 
>be very nice to promote this from "anecdotal" to "documented", and I 
>would like here to put out a wider plea for anyone who might be able 
>to provide the language of these contractual restrictions. 
> Alternatively, I would welcome suggestions for how we are to know 
>what exactly we are prohibited from doing in light of the 
>confidential nature of the contracts.
>I will take the decidedly unscientific step of assuming that this is 
>independent confirmation and that we should take this further.
>If copyright holders really wish to enforce such restrictions, it 
>seems odd that their very existence is little more than a rumor. Can 
>secret restrictions be legally enforced?
>IANAL but I think this depends on the legal jurisdiction. 
>We continually hear of contracts in many areas of activity where 
>part of the contract is that details may not be disclosed, so I 
>expect it is legal. However I don't know whether such gagging 
>clauses are actually in force or whether not many people are 
>sufficiently interested to tell us.
>So there is one legal way to find out and I think it's appropriate. 
>Before doing it, it would be very useful to have more 
>confirmation, as if this is well known I don't want to waste 
>people's time.
>So, please, can we have rapid responses to this question before I 
>(and possibly others) start stirring things yet again...
>Peter Murray-Rust
>Reader in Molecular Informatics
>Unilever Centre, Dep. Of Chemistry
>University of Cambridge
>CB2 1EW, UK
>open-science mailing list
>open-science at lists.okfn.org