[open-linguistics] LLOD diagram draft

Thu Apr 24 09:08:56 UTC 2014

Dear Christian, all,

  I would like to further seed our discussion on the top-level ontology 
for the cloud diagram by providing additional input.

During our last telco I proposed that we closely follow the Metashare 
top-level ontology to foster overall interoperability. The language 
resource (LR) community is fragmented enough already, so this is our 
chance to contribute to overall homogenization.

In Metashare the following top-level distinctions are made:

1) corpus (including written/text, oral/spoken, multimodal/multimedia 
corpora),
2) lexical/conceptual resource (including terminological resources, word 
lists, semantic lexica, ontologies, etc.),
3) language description (including grammars, typological databases, 
courseware, etc.),
4) tool/service (including processing tools, applications, web 
services, etc. required for processing data resources).

See Section 4 of this paper: 
http://www.lrec-conf.org/proceedings/lrec2012/pdf/998_Paper.pdf

I propose we adopt the categories 1)-3) directly as they cover exactly 
the categories we had in mind. Reusing these names would contribute to 
overall interoperability.
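
To make the proposal concrete, here is a minimal sketch of how categories 
1)-3) could be used to label datasets for the diagram. The names 
TopLevelCategory, SUBTYPE_TO_CATEGORY and categorize are hypothetical and 
not part of Metashare or Datahub. The subtype strings are simply the 
examples from the list above, not an official vocabulary.

```python
from enum import Enum


class TopLevelCategory(Enum):
    """Top-level categories 1)-3) from the list above (names assumed)."""
    CORPUS = "corpus"
    LEXICAL_CONCEPTUAL_RESOURCE = "lexical/conceptual resource"
    LANGUAGE_DESCRIPTION = "language description"


# Hypothetical lookup table: example subtypes mentioned in the list above,
# mapped to the top-level category they fall under.
SUBTYPE_TO_CATEGORY = {
    "written corpus": TopLevelCategory.CORPUS,
    "spoken corpus": TopLevelCategory.CORPUS,
    "multimodal corpus": TopLevelCategory.CORPUS,
    "terminological resource": TopLevelCategory.LEXICAL_CONCEPTUAL_RESOURCE,
    "word list": TopLevelCategory.LEXICAL_CONCEPTUAL_RESOURCE,
    "semantic lexicon": TopLevelCategory.LEXICAL_CONCEPTUAL_RESOURCE,
    "ontology": TopLevelCategory.LEXICAL_CONCEPTUAL_RESOURCE,
    "grammar": TopLevelCategory.LANGUAGE_DESCRIPTION,
    "typological database": TopLevelCategory.LANGUAGE_DESCRIPTION,
    "courseware": TopLevelCategory.LANGUAGE_DESCRIPTION,
}


def categorize(subtype: str) -> TopLevelCategory:
    """Return the top-level category for a known subtype (case-insensitive).

    Raises KeyError for subtypes that are not in the table.
    """
    return SUBTYPE_TO_CATEGORY[subtype.strip().lower()]
```

With such a table, a grammar would be labelled via categorize("grammar") 
as a language description, which is one way grammars could enter the 
diagram.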

By the way, I think grammars should also be included in the LLOD diagram.

Just my two cents,

Philipp.

On 04.04.14 21:44, Christian Chiarcos wrote:
> Dear all,
>
> please find the first draft for the new LLOD cloud diagram attached.
>
> An important difference as compared to the last draft is that *only 
> datasets with links to other LLOD datasets are included*. Data sets 
> for which we could not read information from any of the URLs given in 
> Datahub were excluded.
>
> If your dataset is missing or not displayed properly, please check 
> your Datahub entry!
>
> Differences as compared to last edition:
> - Categories revised, now at two levels of granularity (feedback please!)
> - Novel data sets, including the datasets of LDL-2014 contributions 
> and the associated data challenge
> - Included linguistically relevant Datahub entries *not* marked as 
> resources of the linguistics group (e.g., the Greek WordNet). We 
> extracted all Datahub entries with tags "llod", "linguistics%20lod", 
> "lexicon", "corpus", "thesaurus", "linguistic", "linguistics", or 
> "typology".
> - Diagram pruning: Eliminate data sets not linked with other LLOD data 
> sets
>
> Known issues:
> - Edge breadth and bubble size reflect the link/triple counts as given 
> in Datahub. Where this information is not found, edges are missing or 
> bubbles are equally sized.
> - Datasets from the LREC Share Your Resources Initiative have not been 
> included yet. We can discuss at the telco next week whether we want to 
> prepare a May-2014 edition that covers this (and other) data.
>
> All the best,
> Christian

-- 

Prof. Dr. Philipp Cimiano

Phone: +49 521 106 12249
Fax: +49 521 106 12412
Mail: cimiano at cit-ec.uni-bielefeld.de

Forschungsbau Intelligente Systeme (FBIIS)
Raum 2.307
Universität Bielefeld
Inspiration 1
33619 Bielefeld
