[open-linguistics] META-NET Data Liberation Campaign

Christian Chiarcos christian.chiarcos at web.de
Tue Dec 4 00:25:05 UTC 2012


Dear all,

as most of us probably know, a number of "reference corpora" for major
(and minor) languages of Europe have been produced in the last decades,
but many of them are not fully available to the public (not even under a
restrictive license), or are available only in a snippet view on the web
(and hence unusable for NLP or advanced statistical analyses), not to
mention under open licenses.

To address this issue, META-NET have prepared an open letter to all the  
official language bodies in Europe and to those holding onto the various  
corpora calling on them to consider trying to make this important language  
data available for research purposes.  If you feel that there is a huge  
benefit to liberating these corpora and making them available for research  
then please contact your local language body and let them know that you  
are in favour of the META-NET proposal.

More on this can be found on our blog, in a recent post by John Judge,  
META-NET Ireland  
(http://linguistics.okfn.org/2012/11/19/meta-net-data-liberation-campaign/,  
from which the preceding paragraph was quoted). I'd like to thank John for
replicating his original post there and hope this initiative receives some  
support from the OWLG.

Certainly, making these resources available under a research license would  
not be sufficient in the eyes of many on the list, but it would definitely  
be an important (and more easily achievable) step towards the further  
liberation of linguistic data.

Thank you,
Christian
-- 
Christian Chiarcos
Information Sciences Institute
University of Southern California
4676 Admiralty Way #1001
Marina del Rey, CA 90292
tel: +1-310-448-9391
fax: +1-310-448-8599
http://purl.org/chiarcos/home
chiarcos at isi.edu



