[open-linguistics] Renaming "lexicon" category: Vote for Lexical-Semantic Resource or Lexical Conceptual Resource ? [was: LLOD diagram draft]

Christian Chiarcos christian.chiarcos at web.de
Wed Apr 9 15:52:26 UTC 2014


Dear Marta,

taking your input into consideration (thanks a lot), the discussion in the  
telco continued and you can find an emerging consensus on the etherpad  
(http://pad.okfn.org/p/OWLG).

Clearly, LexicalConceptualResource corresponds to the broad
"lexicon/lexical semantic resource" group we have in the diagram, except
that we would also include general semantic knowledge bases (such as
DBpedia) along with the resource types you mentioned. We did not reach a
consensus on how to label this category: some preferred "lexical
semantic resource" (what we used before), others see the advantages of
"lexical conceptual resource" (ties to MetaShare). The pragmatic solution
was to set up a Doodle poll. So, everyone who has a preference for
one or the other alternative, please vote at
http://doodle.com/x4eqvftyn2s2peqh (until April 23rd)!

As for sub-categories of this category, we would need larger groups for
the diagram (so that colors can still be distinguished). This was quite
controversial, too, but in the end, we would group lexicons, wordnets (and
maybe word lists) together as "lexicon" ("lexical resource"), and
terminological resources and ontologies under "domain-specific
terminologies", with "general semantic knowledge bases" as a sibling
concept. That's the state of discussion, at least.
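
To make the proposed grouping concrete, here is a minimal sketch in Python. The type names on the left follow the MetaShare class names quoted below; "GeneralSemanticKnowledgeBase", the dictionary name, the function name and the "unassigned" fallback are purely illustrative assumptions, and the group labels are of course still under discussion.

```python
# Illustrative only: maps resource types to the diagram groups proposed above.
# "GeneralSemanticKnowledgeBase" is a hypothetical type name, not a MetaShare class.
DIAGRAM_GROUPS = {
    "Lexicon": "lexicon",
    "Wordnet": "lexicon",
    "WordList": "lexicon",  # only "maybe" in the discussion; assumed here
    "TerminologicalResource": "domain-specific terminologies",
    "Ontology": "domain-specific terminologies",
    "GeneralSemanticKnowledgeBase": "general semantic knowledge bases",
}


def diagram_group(resource_type: str) -> str:
    """Return the diagram group for a resource type, or 'unassigned' if unknown."""
    return DIAGRAM_GROUPS.get(resource_type, "unassigned")
```

For example, diagram_group("Wordnet") would fall into the "lexicon" group, while a type that has not been discussed yet (e.g. "Thesaurus") stays "unassigned".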

Best,
Christian

On Wed, 09 Apr 2014 15:15:35 +0200, Philipp Cimiano  
<cimiano at cit-ec.uni-bielefeld.de> wrote:

> Hi Marta,
>
>    thanks, very interesting. Given your input I am even more convinced
> that reusing the upper category "LexicalConceptualResource" subsuming
> all those things that you mention would be appropriate.
>
> Philipp.
>
> Am 09.04.14 15:10, schrieb Marta Villegas:
>> Dear all,
>>
>> I'm afraid this is my first time at LLOD (sorry for not participating
>> much). I'm sending you a few comments regarding Philipp's mail
>> about top categories. Currently, at UPF we follow the MetaShare proposal
>> and distinguish between:
>>
>> LexicalConceptualResource
>>     ComputationalLexicon
>>     Framenet
>>     Lexicon
>>     MachineReadableDictionary
>>     Ontology
>>     TerminologicalResource
>>     Thesaurus
>>     WordList
>>     Wordnet
>> Corpus
>>     CorpusAudio
>>     CorpusCollection
>>     CorpusImage
>>     CorpusText
>>     CorpustextNgram
>>     CorpusVideo
>>
>> As you can see, Corpus sub-classes are defined according to media type.
>> In MetaShare, things like 'parallel corpus' vs 'monolingual corpus'
>> are encoded by means of the multilinguality property,
>> which serves to distinguish between parallel, comparable and
>> MultilingualSingleText.
>> Similarly, 'bilingual' vs 'monolingual' is encoded by means of the
>> linguality property (for monolingual, bilingual and multilingual).
>>
>> You can have a look at the browser
>> (http://lod.iula.upf.edu/types/Service). The ontology files are at
>> http://purl.org/ms-lod/MetaShare.ttl
>> http://purl.org/ms-lod/BioServices.ttl
>> http://purl.org/ms-lod/UPF-MetadataRecords.ttl
>>
>> Please note that this is an ongoing project!!!!
>>
>> All the best!
>>
>>
>> 2014-04-09 14:24 GMT+02:00 Philipp Cimiano
>> <cimiano at cit-ec.uni-bielefeld.de>:
>>
>>     Dear all,
>>
>>      apologies, but my connection here is very bad, so I cannot
>>     follow the Skype telco; I am providing my input here in reply to
>>     Christian's email.
>>
>>     I like the top categories (corpus, lexicon, metadata) in principle.
>>     But I would recommend reusing categories proposed by others. For
>>     example, the MetaShare node of UPF uses the following categories
>>     (thanks to Jorge for providing them):
>>
>>       * Lexical Conceptual Resource (94)
>>           o Lexicon (77)
>>           o Wordnet (6)
>>           o Terminological Resource (4)
>>           o Word List (4)
>>           o Ontology (3)
>>       * Corpus (30)
>>       * Tool Service (10)
>>
>>
>>     I think reusing these categories (except for Tool Service) would
>>     be fine. The numbers in brackets indicate the number of resources
>>     of the corresponding type available. Adding Metadata would be good.
>>
>>     ParallelCorpus as a subcategory of Corpus seems appropriate and
>>     useful, as just suggested in the telco (I picked that up ;))
>>
>>     Other than that, the subcategories of Corpus would be defined by
>>     the annotation layers the corpus contains, but getting too
>>     fine-grained at the level of the cloud is difficult.
>>
>>     In any case in the future I hope that we can dynamically generate
>>     different diagrams filtering by conditions, e.g. license,
>>     annotation layers available, language etc.
>>
>>
>>
>>
>>     Am 04.04.14 21:44, schrieb Christian Chiarcos:
>>>     Dear all,
>>>
>>>     please find the first draft for the new LLOD cloud diagram  
>>> attached.
>>>
>>>     An important difference as compared to the last draft is that
>>>     *only datasets with links to other LLOD datasets are included*.
>>>     Data sets for which none of the URLs given in Datahub returned
>>>     readable information were excluded.
>>>
>>>     If you don't find your dataset displayed properly (or missing),
>>>     please check your Datahub entry!
>>>
>>>     Differences as compared to last edition:
>>>     - Categories revised, now at two levels of granularity (feedback
>>>     please!)
>>>     - Novel data sets, including the datasets of LDL-2014
>>>     contributions and the associated data challenge
>>>     - Included linguistically relevant Datahub entries *not* marked
>>>     as resources of the linguistics group (e.g., the Greek WordNet).
>>>     We extracted all Datahub entries with tags "llod",
>>>     "linguistics%20lod", "lexicon", "corpus", "thesaurus",
>>>     "linguistic", "linguistics", or "typology".
>>>     - Diagram pruning: Eliminate data sets not linked with other LLOD
>>>     data sets
>>>
>>>     Known issues:
>>>     - Edge breadth and bubble size reflect the link/triple counts as
>>>     given in Datahub. Where this information is not found, edges are
>>>     missing or bubbles are equally sized.
>>>     - Datasets from the LREC Share Your Resources Initiative have not
>>>     been included yet. We can discuss at the telco next week whether
>>>     we want to prepare a May-2014 edition that covers this (and
>>>     other) data.
>>>
>>>     All the best,
>>>     Christian
>>>
>>>
>>>     _______________________________________________
>>>     open-linguistics mailing list
>>>     open-linguistics at lists.okfn.org   
>>> <mailto:open-linguistics at lists.okfn.org>
>>>     https://lists.okfn.org/mailman/listinfo/open-linguistics
>>>     Unsubscribe:https://lists.okfn.org/mailman/options/open-linguistics
>>
>>
>>     --
>>
>>     Prof. Dr. Philipp Cimiano
>>
>>     Phone: +49 521 106 12249
>>     Fax: +49 521 106 12412
>>     Mail: cimiano at cit-ec.uni-bielefeld.de
>>
>>     Forschungsbau Intelligente Systeme (FBIIS)
>>     Raum 2.307
>>     Universität Bielefeld
>>     Inspiration 1
>>     33619 Bielefeld
>>
>>
>>
>>
>>
>>
>> --
>> Marta Villegas
>> marta.villegas at gmail.com
>>
>>
>
>


-- 
Christian Chiarcos
Applied Computational Linguistics
Johann Wolfgang Goethe Universität Frankfurt a. M.
60054 Frankfurt am Main, Germany

office: Robert-Mayer-Str. 10, #401b
mail: chiarcos at informatik.uni-frankfurt.de
web: http://acoli.cs.uni-frankfurt.de
tel: +49-(0)69-798-22463
fax: +49-(0)69-798-28931


