[open-science] feedback wanted on text-mining initiatives

Peter Suber peters at earlham.edu
Fri Apr 27 16:51:16 UTC 2012


I strongly support the kind of open text-mining declaration Peter MR
describes.

Peter is exactly right in his understanding of the purpose of the BBB
statements in the OA domain. They defined OA and set out goals. Sometimes
they described in general terms some strategies for achieving those goals.
(For example, the Budapest statement described what we now call green and
gold OA as methods for delivering OA.)  But they left implementation
details to those inspired to reach the goals, didn't dictate just one path
to reach the goals, and tried not to say anything that might be made
obsolete by advances in technology.

I'm far from a specialist in text mining, but I'd be happy to help the
cause in any way that I could.

One question:  should the text-mining declaration cover data mining as
well? Or are the issues so different that they deserve separate treatment?

     Peter S.

Peter Suber
gplus.to/petersuber


On Fri, Apr 27, 2012 at 10:25 AM, Peter Murray-Rust <pm286 at cam.ac.uk> wrote:

>
>
> On Fri, Apr 27, 2012 at 2:28 PM, Richard Kidd <KiddR at rsc.org> wrote:
>
>> On Fri, Apr 27, 2012 at 1:40 PM, Richard Kidd <KiddR at rsc.org> wrote:
>>
>> > > Among the things which we probably should not address are:
>> > > * what can and cannot be mined and reproduced
>>
>>
>>
>> Apols for the misunderstanding, my fault for using ‘open’ in a diff
>> context - my request is that “what can and cannot be mined and reproduced”
>> *should* be discussed and addressed – as it’s the key issue. Am trying
>> to get comments up on yr blog post but it’s not behaving…
>>
>>
>>
>
> Thanks and understood. It is critical that it *is* discussed, and very
> possibly on the OKF site.
>
> The point here is to create a declaration about text-mining similar to
> Budapest/Berlin/Bethesda [for Open Access]. They deliberately do not go
> into details, but state a goal that can later be reified in law and
> practice. They state what "Open Access" means in general terms, using
> phrases like "for whatever purpose", "everybody", "without further permission".
> They do NOT state that there should be a licence - licences are simply one
> way of implementing them.
>
> Let us call the declaration on text-mining the "Open Text Mining
> Declaration". (It's slightly but not very contaminated by NPG's "Open
> Text-mining Initiative", which most people have forgotten.) It should be
> brief - perhaps 2-3 lines at most. It would define Open Text-mining...
>
> "By Open textmining we mean ... everyone ... without further permission
> ... available to all ...".
>
> That does not mean that everyone must agree to do it. It is a goal. The
> BBB declarations are not yet implemented universally. But they are the
> yardstick that most of us use. They are particularly useful because so many
> people and organisations create their own usage for "Open Access" without
> defining it - thus causing confusion. We wish to avoid this for this new
> field.
>
> The details come second, and change as the world and technology change.
> It is generally agreed that CC-BY licences permit text- and other mining
> without further permission. Contrast that with almost everything else where
> nothing is clear.
>
> If 20 scientists per university wish to text-mine, that means 1000
> universities * 20 scientists * 100 publishers == potentially 2 million
> requests. The system can't cope. So the only ways forward are:
> * refuse everything. There seem to be publishers who take this view. It
> has the virtue of clarity.
> * permit everything. There are certainly publishers (BMC/PLoS) who take
> this view.
> * leave everything unclear: "consult your librarian", "we'll discuss this
> with our marketing people". That's the position for most publishers.
>
> Fuzziness is destroying scientific progress and creating tensions. The
> OTMD is an attempt to bring some clarity. Whether any given publisher
> accepts, rejects or ignores it is irrelevant to the wording of the
> declaration.
>
> P.
>
>
> --
> Peter Murray-Rust
> Reader in Molecular Informatics
> Unilever Centre, Dep. Of Chemistry
> University of Cambridge
> CB2 1EW, UK
> +44-1223-763069
>
> _______________________________________________
> open-science mailing list
> open-science at lists.okfn.org
> http://lists.okfn.org/mailman/listinfo/open-science
>
>