[open-science] feedback wanted on text-mining initiatives

Mon Apr 23 18:51:04 UTC 2012

Hi Heather and Peter,

the comment function on the blog is still broken, so here is a comment
on your post:

Having a standard clause for this is definitely a step forward. Just
today the American Association of Immunologists (AAI) was asking for a
standard clause to add to their license, and I have trouble coming up
with something general enough... What should we ask the library to add
to the AAI license to allow text mining for everyone? I will work
something out, but any comments are welcome...

It is not correct that only Elsevier allows text mining in their (new)
license. Some other publishers have been allowing text mining for a
while, when the institution that negotiates the license has made that
part of the agreement. Sage and Mary Ann Liebert are examples of
this: they allow text mining for the University of California, and
maybe for many others. Some publishers allow text mining in their
general license, like the AACR or the Royal Society of Medicine. The
Journal of Heredity has not added anything to its license yet, but it
now lists a contact person on its website and says that it actively
supports it.

I've listed the licenses of these publishers on my webpage
http://text.soe.ucsc.edu/progress.html, in the third column of the
table.

cheers
Max

--
Maximilian Haeussler, max at soe.ucsc.edu
mob +1 831 295 0653 office: +1 831 459 5232

On Sun, Apr 22, 2012 at 1:18 AM, Peter Murray-Rust <pm286 at cam.ac.uk> wrote:
>
>
> On Sat, Apr 21, 2012 at 4:39 PM, Heather Piwowar <hpiwowar at gmail.com> wrote:
>>
>> Great suggestions!  (sorry comments on the blog weren't working, not sure
>> why, will keep investigating.)
>>
>> I met with UBC librarians yesterday.  They are excited about codifying
>> these expectations into a text-mining addendum that they can use as a model
>> agreement with all publishers.  They suggested I spearhead contacting other
>> publishers.  I'm going to be busy with that for the next little while.
>>
>> Does anyone else want to pick up the statement and run with pulling
>> together these comments, driving to a community "short, tight, no-nonsense
>> manifesto"?
>>
> Yes - I would absolutely like to be involved.
>
> I have already contacted 6 other publishers and will be publishing the
> results very soon.
>
> I think it's absolutely critical this does not get fragmented and there is a
> single statement. The great danger is making it too weak. I'll try to find
> time today to put some points on this list.
>
> We should remember that this must be international as legal jurisdictions
> vary.
>
> P.
>
>
>> Heather
>>
>>
>> On Fri, Apr 20, 2012 at 9:52 AM, Peter Murray-Rust <pm286 at cam.ac.uk>
>> wrote:
>>>
>>> Heather this is great,
>>> I tried to respond to your blog but couldn't, so please publish this as if I
>>> had.
>>>
>>> "This is a great idea, Heather. It is important to state what we want and
>>> what we believe we have a right to, not just what we can 'negotiate' or do
>>> without being sued. There are fundamental rights and we should aim for them.
>>>
>>> A finished manifesto will require some communal work. Firstly it must
>>> cover multiple jurisdictions (e.g. "fair use" is irrelevant in UK law, and
>>> in any case Larry Lessig simply describes it as a right to go to court, or
>>> similar).
>>>
>>> It's extremely important that we don't get so excited that we give away
>>> stuff that actually is ours. For example I will argue that *all* factual
>>> data is de facto mineable and is only prevented by publisher contracts. We
>>> should also address the problem (if any) of server overload - it is easily
>>> manageable by caches. I am particularly concerned about other-than-text -
>>> much of my current work is on diagrams.
>>>
>>> So a short, tight, no-nonsense manifesto is exactly what is required.
>>> Like the Panton Principles and the Principles of Open Bibliography.
>>>
>>> And we have a number of people in OKF who are very interested.
>>> "
>>>
>>> So as you say this is the time - we should aim to get something out as
>>> quickly as we can without losing the fundamental principles. That's not
>>> easy, but it's possible.
>>>
>>> On Fri, Apr 20, 2012 at 5:33 PM, Nick Barnes <nb at climatecode.org> wrote:
>>>>
>>>> On Fri, Apr 20, 2012 at 16:15, Heather Piwowar <hpiwowar at gmail.com>
>>>> wrote:
>>>> > Hi Open Science,
>>>> >
>>>> > There is growing interest in text-mining rights.  I'm in the middle
>>>> > of a bit of it, and would love some feedback and community.
>>>> >
>>>> > Briefly, due to a Twitter conversation, Elsevier and I began to talk
>>>> > about updating the subscription contract of the University of British
>>>> > Columbia to explicitly include text-mining rights.  The rights
>>>> > Elsevier has agreed to are broader than those they've agreed to with
>>>> > other institutions, as far as I know (tell me if I'm wrong!), and
>>>> > broader than those of most publishers.  More information.
>>>> >
>>>> > In the meantime, PMR and others are asserting text-mining rights and
>>>> > going ahead.  This is another approach and I'm glad they are doing it.
>>>> >
>>>> > I've drafted a short "text-mining manifesto" if you will...  how
>>>> > researchers expect to be able to access and process the literature to
>>>> > which we have access.  How can we improve this statement, and what
>>>> > should we do with it next?
>>>>
>>>> Tried to respond on your blog but for some reason WordPress doesn't
>>>> like my login any more.  Anyway, I was commenting to encourage you to
>>>> broaden it.  For instance are "aggregate statistical" results the only
>>>> kind of fact that text-miners might want to publish?  Also, to
>>>> strengthen the wording.  It took me several drafts of the Science Code
>>>> Manifesto to get to the bald statements of "must".
>>>> --
>>>> Nick Barnes, Climate Code Foundation, http://climatecode.org/
>>>>
>>>> _______________________________________________
>>>> open-science mailing list
>>>> open-science at lists.okfn.org
>>>> http://lists.okfn.org/mailman/listinfo/open-science
>>>
>>>
>>>
>>>
>>> --
>>> Peter Murray-Rust
>>> Reader in Molecular Informatics
>>> Unilever Centre, Dep. Of Chemistry
>>> University of Cambridge
>>> CB2 1EW, UK
>>> +44-1223-763069