[open-linguistics] META-NET Data Liberation Campaign

Wed Dec 5 17:17:15 UTC 2012

Thanks for the feedback. I fully understand your position and that you want (need!) to take small steps. 

My point (obviously not clear--sorry!) was that we should be careful to make it clear that "research only" may be acceptable now as a step, but that it is not the ultimate ideal. Having seen that GPL etc. promotion ended up with people feeling it is the de facto "good citizen" license, I simply want to avoid the same situation by making it clear that research-only is not the ideal, but rather a step.

Thanks,
Nancy 

On Dec 5, 2012, at 11:58 AM, John Judge <jjudge at computing.dcu.ie> wrote:

> Thanks for your feedback Nancy. You raise a very valid point about the licensing restrictions and how a move towards fully open data is a goal we would like to pursue. In this instance with our data liberation campaign we are trying to take some small but concrete steps in this direction.
> 
> If we were to lobby the owners of this data to move to a fully open distribution model right away, I think we wouldn't have much success in getting them to loosen their restrictions at all. Instead, through the approach we're taking of inviting them to share the data in some situations, we're hoping to "get our foot in the door" and show them that there's nothing to fear from sharing their data and much to gain.
> 
> Through this approach we've already had some successes, which I hope to report on more fully at a later stage when the details have firmed up. But suffice it to say that, given the tightly restrictive circumstances under which much of this data is currently held, for now at least this softly-softly approach, where we offer the owners everything from an open distribution channel to free legal help with regard to licensing, is helping free up the data and opening the minds of those holding it to the greater possibilities of loosening their restrictions. So in my view this is just a first step, and an important one to show that there are good citizens out there and that the community wants and needs the data to be freely available through less restrictive licences.
> 
> All the best,
> John
> 
> On 05/12/2012 16:15, Nancy Ide wrote:
>> I would like to raise a concern here that calling for "open for research" licensing is potentially damaging to our interests, in the sense that it promotes a practice that is counter to what I assume (hope) is the overall goal: fully open data, restricted to no one for any purpose and thereby supportive of collaborative development across both nonprofit and commercial organizations. Given the EU's promotion of collaboration between research and industry in their funding model, it would seem that this would be in the interest of the official language bodies in Europe as well. 
>> 
>> A look at the impact of the promotion of GNU (copyleft) and "share-alike" licenses makes my point: promotion of these licenses as the "good citizen's license" has had a subtle but pervasive impact on software and data licensing, in that these licenses are at this point the de facto licenses of choice. Unfortunately, such licenses are often not suitable for commercial use because of the requirement to distribute results under the same terms. So what we have is a grass roots effort to be open that in fact has had the result of obstructing full openness. I fear that the promotion of the even stronger research-only restriction will have a similar, and even more damaging, effect.
>> 
>> I would recommend promotion of something like the Apache 2.0 license (http://www.apache.org/licenses/LICENSE-2.0), even if (as pointed out in the note below) it is not likely that such a license would be acceptable in this instance. That would send the message that a fully open license is what the community feels is the good citizen's choice, in that it supports collaborative development among both nonprofit and commercial organizations. If there cannot be agreement to adopt this type of licensing, so be it, but the message from the community should be clear about what we see as the ideal.
>> 
>> Nancy Ide
>> 
>> =======================================================
>> Nancy Ide
>> Professor of Computer Science
>> 
>> Department of Computer Science
>> Vassar College
>> Poughkeepsie, New York 12604-0520 
>> USA
>> 
>> tel: (+1 845) 437 5988
>> fax: (+1 845) 437 7498
>> email: ide at cs.vassar.edu
>> http://www.cs.vassar.edu/~ide
>> =======================================================
>> 
>> On Dec 3, 2012, at 7:25 PM, Christian Chiarcos <christian.chiarcos at web.de> wrote:
>> 
>>> Dear all,
>>> 
>>> as most of us probably know, a number of "reference corpora" for major (and minor) languages of Europe have been produced in the last decades, but many of them are not fully available to the public (not even under a restrictive license), or are available only in a snippet view on the web (and hence unusable for NLP or advanced statistical analyses), not to mention under open licenses.
>>> 
>>> To address this issue, META-NET have prepared an open letter to all the official language bodies in Europe and to those holding onto the various corpora calling on them to consider trying to make this important language data available for research purposes.  If you feel that there is a huge benefit to liberating these corpora and making them available for research then please contact your local language body and let them know that you are in favour of the META-NET proposal.
>>> 
>>> More on this can be found on our blog, in a recent post by John Judge, META-NET Ireland (http://linguistics.okfn.org/2012/11/19/meta-net-data-liberation-campaign/, from where the last paragraph was quoted). I'd like to thank John for replicating his original post there and hope this initiative receives some support from the OWLG.
>>> 
>>> Certainly, making these resources available under a research license would not be sufficient in the eyes of many on the list, but it would definitely be an important (and more easily achievable) step towards the further liberation of linguistic data.
>>> 
>>> Thank you,
>>> Christian
>>> -- 
>>> Christian Chiarcos
>>> Information Sciences Institute
>>> University of Southern California
>>> 4676 Admiralty Way #1001
>>> Marina del Rey, CA 90292
>>> tel: +1-310-448-9391
>>> fax: +1-310-448-8599
>>> http://purl.org/chiarcos/home
>>> chiarcos at isi.edu
>>> 
>> 
> 
> 
> -- 
> John Judge
> 
> Centre for Next Generation Localisation
> META-NET CIO
> 
> Email: jjudge at computing.dcu.ie
> Phone: +353 1 700 6729
> Mob: +353 87 218 9093
> Skype: jjudge2
> http://www.cngl.ie
> http://www.meta-net.eu 
> 
