Christian Chiarcos christian.chiarcos at web.de
Mon Aug 11 14:42:08 UTC 2014

Dear all,

please find a message below regarding the status of LLOD resources in the

If your data set is not classified as "linked data" in the attached file,
it may be due to one of the following reasons:

(a) For some resources, we need to discuss their status as being ontologies
or not.

(b) Check availability. Some datasets, e.g., eurosentiment and
lemonwordnet, appear to be offline.

(c) Check robots.txt and make sure that your data can be crawled.

(d) Check your datahub metadata: Note that we could only update resources
directly associated with the owlg organization at datahub. This is not the
case for all resources, including a few important ones such as PanLex or
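
For point (c), a quick local check of a robots.txt file could look like the
sketch below. It is only an illustration using Python's standard library: the
host example.org, the paths, and the "*" user agent are assumptions, since the
user agent of the crawler is not stated in this thread.

```python
from urllib.robotparser import RobotFileParser


def is_crawlable(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt text allows user_agent to fetch url."""
    parser = RobotFileParser()
    # Parse the rules from a string so the check works without network access.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt that blocks every crawler from the /data/ directory.
EXAMPLE_ROBOTS = "User-agent: *\nDisallow: /data/\n"

if __name__ == "__main__":
    for url in ("http://example.org/data/lexicon.rdf", "http://example.org/about"):
        print(url, is_crawlable(EXAMPLE_ROBOTS, url))
```

To test a live site, the parser's set_url() and read() methods can fetch the
real robots.txt instead of parsing a string.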

If you only have an issue with metadata and you want your datasets to be
included, you can either

(a) add the tags "llod", "lod", "linguistics" and set the attributes
"triples" and "links:..." according to

(b) write one of the admins of http://datahub.io/de/organization/about/owlg
(e.g., Bettina or me) and ask to be invited to the organization. Then, we
should be able to update your metadata.
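
To see what is still missing for option (a), a metadata record can be checked
along the lines of the sketch below. It assumes the record is a CKAN-style
dictionary with a "tags" list of {"name": ...} entries and an "extras" list of
{"key": ..., "value": ...} entries; the function name and the sample record
are hypothetical, and any key starting with "links:" is accepted because the
exact link targets depend on the dataset.

```python
REQUIRED_TAGS = ("llod", "lod", "linguistics")


def missing_llod_metadata(package):
    """List the tags and attributes from option (a) that the record lacks."""
    tags = {tag["name"] for tag in package.get("tags", [])}
    extras = {extra["key"]: extra["value"] for extra in package.get("extras", [])}

    missing = ["tag:" + name for name in REQUIRED_TAGS if name not in tags]
    if "triples" not in extras:
        missing.append("extra:triples")
    # At least one "links:<target>" attribute is expected for linked data.
    if not any(key.startswith("links:") for key in extras):
        missing.append("extra:links:<target>")
    return missing


# Hypothetical record: two of the three tags, a size, and one link attribute.
SAMPLE = {
    "tags": [{"name": "llod"}, {"name": "lod"}],
    "extras": [
        {"key": "triples", "value": "10000"},
        {"key": "links:dbpedia", "value": "500"},
    ],
}

if __name__ == "__main__":
    print(missing_llod_metadata(SAMPLE))
```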

If these issues are fixed, please send an email to Max (in CC) to include
your dataset.


---------- Forwarded message ----------
From: Max Schmachtenberg <max at informatik.uni-mannheim.de>
Date: 2014-08-05 15:29 GMT+02:00
Subject: Re: AW: Updated LOD Cloud Diagram - Please enter your linked
datasets into the datahub.io catalog for inclusion.
To: Christian Chiarcos <christian.chiarcos at web.de>

 Hello Christian,

I analysed the datasets in your group. In the attached tsv, you can find my

Regarding datasets with linked data (unlike those that are offline or
where I could not find linked data, hence named "no linked data"), they
are either:

- in our crawl,
- an ontology (which we do not include),
- disallowed from being crawled ("was disallowed"), or
- "linked data", which should be included in the next cloud.

If you disagree with my analysis in any of the datasets, please let me know.


On 08/04/2014 06:28 PM, Christian Chiarcos wrote:

Dear Chris, dear Max,

 by now, most of our datasets should have the tags "lod" and "llod", and
can be retrieved from datahub via the organization (owlg). Not all owlg
datasets are actually linked data, but those that specify size and links
are. If you consider these datasets (including some already in the diagram,
but currently under cross-domain) as being substantial enough, a name like
"language resources" would be ideal, I think.

 Please let me know if there are any further questions (though I'll only be
sporadically online until the end of August).

 Thanks a lot, and best regards,

On 24.07.2014, at 14:49, "Christian Bizer" <chris at bizer.de> wrote:

  Dear Christian and Max

thank you very much for pointing us at this group.  We will include the
datasets that set links into the diagram.

If we have a decent number of linguistic datasets that we can include into
the cloud, I would be happy with adding an additional category.

Max: Could you please note the datasets from the group that set links for
inclusion into the diagram.



*From:* christian.chiarcos at googlemail.com [
mailto:christian.chiarcos at googlemail.com]
*On behalf of* Christian Chiarcos
*Sent:* Thursday, 24 July 2014 16:34
*To:* Christian Bizer
*Subject:* Re: Updated LOD Cloud Diagram - Please enter your linked
datasets into the datahub.io catalog for inclusion.

Dear Chris,

thank you for your efforts, great to have an updated diagram.

I would like to point out to you the existence of the datasets compiled by
the Open Linguistics Working Group (OWLG) of the OKFN. We have been
promoting a Linguistic Linked Open Data (LLOD) (sub-)cloud in the last few
years with great success in the NLP and linguistics communities, and it
would be great to see these datasets also included in the LOD cloud. While all
our metadata is stored on datahub.io, its specifications are not (yet)
conformant to LOD requirements, but we're working on that. In any case,
there would be about 30-50 new data sets (hard to quantify because of the
overlap between current LOD and LLOD) and I was wondering whether this
would not justify a new category for the diagram, say "Linguistic Linked
Open Data". Most LLOD data sets that already are in the LOD diagram (e.g.,
those maintained at AKSW) are tagged as "cross-domain" which I personally
find somewhat dissatisfying.

Most of our datasets can be found under http://datahub.io/organization/owlg,
(L)LOD data sets are those with the triples and links:... tags. We try to
add the "lod" tag, but it will take a while because there is no way of
doing global modifications on datahub.io.

We have a hard time synchronizing the metadata, so different tags have been
used to identify resources that are in the LLOD but maintained by other
organizations, most notably the tags "llod", "linguistic" or
"linguistics". Sorting these out will take time and I'm not sure whether
we'll make it by August 8th. Any idea when the next diagram after the
August version will be compiled?

Best regards,


Prof. Dr. Christian Chiarcos
Applied Computational Linguistics
Johann Wolfgang Goethe Universität Frankfurt a. M.
60054 Frankfurt am Main, Germany

office: Robert-Mayer-Str. 10, #401b
mail: chiarcos at informatik.uni-frankfurt.de
web: http://acoli.cs.uni-frankfurt.de
tel: +49-(0)69-798-22463
fax: +49-(0)69-798-28931

Max Schmachtenberg
Chair of Information Systems V
Web-based Systems Group
Universität Mannheim
B6, 26, Room C1.07
D-68159 Mannheim
Phone: +49 621 181 3705
Mail: max at informatik.uni-mannheim.de
Web: dws.informatik.uni-mannheim.de
