[open-linguistics] LLOD diagram draft

Marta Villegas marta.villegas at gmail.com
Wed Apr 9 13:10:34 UTC 2014


Dear all,

I'm afraid this is my first time at LLOD (sorry for not participating
much). I'm sending you a few comments regarding Philipp's mail about top
categories. Currently, at UPF we follow the MetaShare proposal and distinguish
between:

LexicalConceptualResource
    ComputationalLexicon
    Framenet
    Lexicon
    MachineReadableDictionary
    Ontology
    TerminologicalResource
    Thesaurus
    WordList
    Wordnet
Corpus
    CorpusAudio
    CorpusCollection
    CorpusImage
    CorpusText
    CorpustextNgram
    CorpusVideo

As you can see, Corpus sub-classes are defined according to Media Type.
In MetaShare, things like 'parallel corpus' vs 'monolingual corpus' are
encoded by means of the multilinguality property,
which serves to distinguish between parallel, comparable and
MultilingualSingleText.
Similarly, 'bilingual' vs 'monolingual' is encoded by means of the linguality
property (for monolingual, bilingual and multilingual).
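
To make the modelling concrete, here is a minimal Python sketch of the idea: the media type lives in the class, while 'parallel' or 'bilingual' are property values on the resource. The class names and property values are taken from the lists above; everything else (the `Resource` record, the helper functions, the grouping of the classes under the two top categories, and the example resource) is hypothetical and only for illustration. It is not the actual MetaShare ontology.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

# Class names copied from the list above. Grouping them under the two
# top categories is an assumption made for this sketch.
SUBCLASSES = {
    "LexicalConceptualResource": [
        "ComputationalLexicon", "Framenet", "Lexicon",
        "MachineReadableDictionary", "Ontology", "TerminologicalResource",
        "Thesaurus", "WordList", "Wordnet",
    ],
    "Corpus": [
        "CorpusAudio", "CorpusCollection", "CorpusImage", "CorpusText",
        "CorpustextNgram", "CorpusVideo",
    ],
}


class Linguality(Enum):
    MONOLINGUAL = "monolingual"
    BILINGUAL = "bilingual"
    MULTILINGUAL = "multilingual"


class Multilinguality(Enum):
    PARALLEL = "parallel"
    COMPARABLE = "comparable"
    MULTILINGUAL_SINGLE_TEXT = "MultilingualSingleText"


@dataclass
class Resource:
    """Hypothetical metadata record; not the real MetaShare schema."""
    name: str
    rdf_class: str
    linguality: Linguality
    multilinguality: Optional[Multilinguality] = None


def top_category(rdf_class: str) -> str:
    """Return the top category that a class name belongs to."""
    for parent, children in SUBCLASSES.items():
        if rdf_class == parent or rdf_class in children:
            return parent
    raise KeyError(f"unknown class: {rdf_class}")


def is_parallel_corpus(resource: Resource) -> bool:
    """'Parallel corpus' is a property value, not a separate class."""
    return (
        top_category(resource.rdf_class) == "Corpus"
        and resource.multilinguality is Multilinguality.PARALLEL
    )


# Invented example resource, for illustration only.
example = Resource(
    name="example-parallel-text",
    rdf_class="CorpusText",
    linguality=Linguality.BILINGUAL,
    multilinguality=Multilinguality.PARALLEL,
)
```

With this layout, a diagram category such as 'ParallelCorpus' could be derived by filtering on the property rather than by adding a new class.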

You can have a look at the browser (http://lod.iula.upf.edu/types/Service).
The ontology files are at
http://purl.org/ms-lod/MetaShare.ttl
http://purl.org/ms-lod/BioServices.ttl
http://purl.org/ms-lod/UPF-MetadataRecords.ttl

Please note that this is an ongoing project!!!!

All the best!


2014-04-09 14:24 GMT+02:00 Philipp Cimiano <cimiano at cit-ec.uni-bielefeld.de>:

>  Dear all,
>
>  apologies, but my connection here is very bad, so I cannot follow the
> Skype telco. I am therefore providing my input here, in reply to Christian's email.
>
> I like the top categories (corpus, lexicon, metadata) in principle. But I
> would recommend to reuse categories proposed by others. For example, the
> Metashare node of UPF uses the following categories (thanks to Jorge for
> providing them):
>
>
>    - Lexical Conceptual Resource (94)
>       - Lexicon (77)
>       - Wordnet (6)
>       - Terminological Resource (4)
>       - Word List (4)
>       - Ontology (3)
>    - Corpus (30)
>    - Tool Service (10)
>
>
> I think reusing these categories (except for Tool Service) would be fine.
> The numbers in brackets indicate the number of resources of the
> corresponding type available. Adding Metadata would be good.
>
> ParallelCorpus as a subcategory of Corpus seems appropriate and useful, as
> just suggested in the telco (I picked that ;))
>
> Other than that, the subcategories of Corpus would be defined by the
> annotation layers the corpus contains, but getting too fine-grained at the
> level of the cloud is difficult.
>
> In any case in the future I hope that we can dynamically generate
> different diagrams filtering by conditions, e.g. license, annotation layers
> available, language etc.
>
>
>
>
> Am 04.04.14 21:44, schrieb Christian Chiarcos:
>
> Dear all,
>
> please find the first draft for the new LLOD cloud diagram attached.
>
> An important difference as compared to the last draft is that *only
> datasets with links to other LLOD datasets are included*. Data sets for
> which we could not read information from any of the URLs given in Datahub
> were excluded.
>
> If you don't find your dataset displayed properly (or missing), please
> check your Datahub entry!
>
> Differences as compared to last edition:
> - Categories revised, now at two levels of granularity (feedback please!)
> - Novel data sets, including the datasets of LDL-2014 contributions and
> the associated data challenge
> - Included linguistically relevant Datahub entries *not* marked as
> resources of the linguistics group (e.g., the Greek WordNet). We extracted
> all Datahub entries with tags "llod", "linguistics%20lod", "lexicon",
> "corpus", "thesaurus", "linguistic", "linguistics", or "typology".
> - Diagram pruning: Eliminate data sets not linked with other LLOD data
> sets
>
> Known issues:
> - Edge breadth and bubble size reflect the link/triple counts as given in
> Datahub. Where this information is not found, edges are missing or bubbles
> are equally sized.
> - Datasets from the LREC Share Your Resources Initiative have not been
> included yet. We can discuss at the telco next week whether we want to
> prepare a May-2014 edition that covers this (and other) data.
>
> All the best,
> Christian
>
>
> _______________________________________________
> open-linguistics mailing list
> open-linguistics at lists.okfn.org
> https://lists.okfn.org/mailman/listinfo/open-linguistics
> Unsubscribe: https://lists.okfn.org/mailman/options/open-linguistics
>
>
>
> --
>
> Prof. Dr. Philipp Cimiano
>
> Phone: +49 521 106 12249
> Fax: +49 521 106 12412
> Mail: cimiano at cit-ec.uni-bielefeld.de
>
> Forschungsbau Intelligente Systeme (FBIIS)
> Raum 2.307
> Universität Bielefeld
> Inspiration 1
> 33619 Bielefeld
>
>


-- 
Marta Villegas
marta.villegas at gmail.com