[open-linguistics] Collection of resources

Nancy Ide ide at cs.vassar.edu
Sat Jan 15 19:01:53 UTC 2011


Many thanks for getting this out!


On Jan 15, 2011, at 8:23 AM, Jonathan Gray wrote:

> Arg!
> 
> Nancy: I'm so sorry for neglecting your post. It has been on my to-do
> list for a very long time, but it got buried under the pre-Christmas
> rush, and then the New Year catch-up.
> 
> Here it is:
> 
> http://blog.okfn.org/2011/01/15/opening-up-linguistic-data-at-the-american-national-corpus/
> 
> Apologies again for the unacceptably long wait! It is a great post. ;-)
> 
> Regarding 'sharealike' provisions for linguistic data, it might be
> that, like the science working group's 'Panton Principles' [1], we
> wish to draft a set of principles / guidelines for linguistic data
> that stipulate that SA is not such a good idea, and why this is the
> case. Before we do that, I'd like to invite more people to the group
> and have a proper debate about why this might be the case for
> linguistics, and to come up with a clear list of problems/issues. E.g.
> plenty of companies are using OpenStreetMap's data (which is under
> an SA license). How is linguistics different from this, and what are
> the main use cases / scenarios where SA is problematic? (We should
> also, of course, break this down for NC restrictions.)
> 
> I guess a bigger issue to tackle in this kind of discussion is why
> ordinary linguists should care about making an interoperable 'commons'
> of linguistic data (that includes commercial entities as well as
> researchers). What is in it for them? I personally feel that a vast,
> rich and collaboratively maintained open commons of linguistic data
> would probably be a Really Good Thing for linguistics researchers, but
> obviously this is something that should probably be discussed further
> to tease out the pros and cons.
> 
> All the best,
> 
> Jonathan
> 
> [1] http://pantonprinciples.org/
> 
> On Sat, Jan 15, 2011 at 12:26 AM, Nancy Ide <ide at cs.vassar.edu> wrote:
>> Hmmm... this is problematic for linguistic data. Most of the things in your list are restricted from commercial use, but of course the "share-alike" restriction is itself basically a restriction to non-commercial use, since commercial users typically can't redistribute products based on or incorporating the data under the same conditions. Anything distributed through the Linguistic Data Consortium has licensing of one kind or another, which may in fact differ from the definition of open data on the web page.
>> 
>> I was asked to write a blog post for the OKFN site but after I wrote it, I never heard back. I am about to submit it to opensource.com, who also asked me to write a post on the topic. In it I talk about the problems of "share alike" for linguistic data. In my various roles as president of ACL-SIGANN, developer of the Open American National Corpus, etc., I have been promoting the idea of true openness for linguistic data, involving at most attribution. I would like to suggest that in this list of resources, we differentiate the restrictions on linguistic data in terms of completely open vs. share-alike vs. anything else.
>> 
>> Just my two cents...
>> 
>> On Jan 14, 2011, at 5:40 PM, William Waites wrote:
>> 
>>> * [2011-01-14 15:58:04 -0500] Nancy Ide <ide at cs.vassar.edu> wrote:
>>> 
>>> ] Can I ask a question concerning what you mean by "open" here?
>>> ] Among the resources listed, there is some variety in the
>>> ] conditions under which they can be obtained. Is this the
>>> ] function of the license column? I do think there should be
>>> ] a clear statement of what "open" is defined to be, if
>>> ] possible--and maybe grouping the datasets by availability type.
>>> 
>>> There might be some specialisation of the definition for
>>> linguistic data (analogous to what the bibliographic
>>> data group is developing) but in general it means
>>> open in the sense of http://www.opendefinition.org/
>>> 
>>> Happy hacking,
>>> -w
>>> --
>>> William Waites                <mailto:ww at styx.org>
>>> http://eris.okfn.org/ww/         <sip:ww at styx.org>
>>> 9C7E F636 52F6 1004 E40A  E565 98E3 BBF3 8320 7664
>>> 
>>> _______________________________________________
>>> open-linguistics mailing list
>>> open-linguistics at lists.okfn.org
>>> http://lists.okfn.org/mailman/listinfo/open-linguistics
>>> 
> 
> 
> 
> -- 
> Jonathan Gray
> 
> Community Coordinator
> The Open Knowledge Foundation
> http://blog.okfn.org
> 
> http://twitter.com/jwyg
> http://identi.ca/jwyg
> 




