[open-linguistics] How to represent LLOD diagram categories at datahub ?

Hugh Paterson III hugh at thejourneyler.org
Sat Oct 5 17:32:53 UTC 2013


This division makes sense given the stated problem.

I think that the original OLAC division was mostly aimed at the Himmelmann (1998) division between descriptive linguistics and documentary linguistics, and that OLAC was focused on resources produced and housed in archives which were part of the documentary enterprise. At least that is how I have understood it. (I am still not sure whether what is being proposed here is just for the LLOD diagram or is a larger framework for connecting linked data. If it is just for the LLOD diagram, then what I have to say may not be maximally helpful.)

If we introduce terms like "lexicon" and "corpus", then we should also introduce definitions for them. Looking quickly for definitions of these terms, I turned to Project GOLD to see whether they were defined there. "Lexicon" does have a definition [http://linguistics-ontology.org/gold/2010/Lexicon], but "corpus" is not listed. Project GOLD does, though, introduce us to the variation in the use of the term "lexicon"; I suppose in the LLOD context we are saying that it is a resource like a dictionary, wordlist, etc. But what is a "corpus"? Is it a cohesive work, or is it just a collection of all known works on a language? Perhaps we need an addition to Project GOLD, if it is going to continue to be useful.



- Hugh



On Oct 5, 2013, at 3:26 AM, Christian Chiarcos wrote:

> Dear all,
> 
> earlier, we discussed categories for coloring the LLOD diagram. The diagram we prepared for LDL-2013 was based on something like the minimal consensus:
> 
> - lexicon (= LREMap lexicon, olac:lexicon)
> - corpus (= LREMap corpus, ~ olac:primary data)
> - language_description (basically everything else, ~ olac:language_description)
> 
> I guess the first two are unproblematic, but the third is very heterogeneous, it includes
> - terminology repositories
> - typological databases
> - bibliographical databases
> In a way, all of these "describe language" (information about languages, information about concepts relevant to the description of language, information about collections of language data), but honestly, I would prefer the label "other", because this is very different from what I think an olac:language_description is meant to be.
> 
> Two questions:
> - Is this general classification acceptable?
> - How shall we encode the categories? Using tags "lexicon", "corpus", etc.? Or using a custom field "LLOD category"? Unless anyone protests, I would suggest using tags for "lexicon" and "corpus" and classifying everything without such a tag as "language_description".
> 
> Best,
> Christian
> -- 
> Christian Chiarcos
> Applied Computational Linguistics
> Johann Wolfgang Goethe Universität Frankfurt a. M.
> 60054 Frankfurt am Main, Germany
> 
> office: Robert-Mayer-Str. 10, #401b
> mail: chiarcos at informatik.uni-frankfurt.de
> web: http://acoli.cs.uni-frankfurt.de
> tel: +49-(0)69-798-22463
> fax: +49-(0)69-798-28931
> 
> _______________________________________________
> open-linguistics mailing list
> open-linguistics at lists.okfn.org
> http://lists.okfn.org/mailman/listinfo/open-linguistics
> Unsubscribe: http://lists.okfn.org/mailman/options/open-linguistics

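For illustration only, the tag-based encoding suggested in the quoted message (tags for "lexicon" and "corpus", everything else falling back to "language_description") could be sketched roughly as below. The function name, the tag normalization, and the choice to let "lexicon" win when a dataset carries both tags are assumptions made for this sketch; the thread does not settle any of them.

```python
# Hypothetical sketch of the proposed rule: a dataset's LLOD category is
# taken from its tags, with "language_description" as the fallback.
# Names and the precedence order are assumptions, not part of the proposal.

LLOD_TAGS = ("lexicon", "corpus")   # checked in this order (assumed precedence)
DEFAULT_CATEGORY = "language_description"


def llod_category(tags):
    """Return the LLOD diagram category for a dataset's list of tags."""
    # Normalize so that "Lexicon" and " corpus " are still recognized.
    normalized = {tag.strip().lower() for tag in tags}
    for category in LLOD_TAGS:
        if category in normalized:
            return category
    # No recognized tag: fall back to the catch-all category.
    return DEFAULT_CATEGORY


if __name__ == "__main__":
    print(llod_category(["Lexicon", "rdf"]))
    print(llod_category(["typology", "bibliography"]))
```

A custom "LLOD category" field, the other option raised in the message, would avoid the precedence question, at the cost of a second place to maintain the same information.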