[open-linguistics] Linguistic LOD cloud - help needed, now is the time to submit your data set

John McCrae jmccrae at cit-ec.uni-bielefeld.de
Fri Aug 3 10:29:17 UTC 2012


Hi all,

I did the analysis independently to try to figure out why Sebastian H. had
labelled so many resources as "fail". I found that most are actually not in
a terrible state but had a few issues (especially with their CKAN entries).

One point though: I would say that resources such as GOLD should be
included. GOLD consists of nearly 600 identifiers and is far from
a trivial resource. Moreover, one of the key benefits of LLD is the ability
to agree on data categories by linking to web resources. I view this as a
key "selling point" of LLD and would strongly campaign for keeping it ;)

It does seem from Sebastian N's comments that there is a documentation
issue here: we need clearer documentation of the procedure (I have
attended a lot of the telcos, and if I can't figure it out, no-one outside
will be able to). I would be happy to help with this... (but obviously I am
not totally clear on the procedures)

I didn't use the Google Doc as it appears to be out-of-date (many of the
resources in the diagram are not in the document)... should I integrate
into the Google Doc, or perhaps we could move all documentation to the Wiki
so it is easier to find?

Regards,
John

On Fri, Aug 3, 2012 at 11:03 AM, Sebastian Nordhoff <
sebastian_nordhoff at eva.mpg.de> wrote:

> Dear all,
> there seems to be some confusion with regard to documentation practice.
> Some members of this list are closer to the inner workings of the LOD-cloud
> than others and are aware of many implicit assumptions/shared knowledge
> that other people are unaware of.
> It would probably be good to list the relevant documents and processes
> again. RTFM is OK, but you have to know where the M is.
> Finally, I would like to commend John for being BOLD in the Wikipedia
> sense. Not knowing the precise rules should not ban anyone from
> contributing, and I would like to ask John to continue contributing with
> whatever knowledge of the rules and procedures he has or lacks.
> Best
> Sebastian N
>
>
>
>
>
> On Fri, 03 Aug 2012 10:24:49 +0200, Sebastian Hellmann <
> hellmann at informatik.uni-leipzig.de>
> wrote:
>
>> Hi John,
>>
>> On 02.08.2012 15:19, John McCrae wrote:
>>
>>> Hi all,
>>>
>>> I decided to do an independent evaluation of what was in the LLOD, to
>>> identify what needs to be done, and found that the situation isn't
>>> perhaps
>>> as bad as the previous email suggests.
>>>
>> Sorry, John. The only thing you did is soften the criteria for
>> inclusion. That doesn't make the data better. You even went so far as to
>> disregard the criteria imposed by the current practice:
>> http://richard.cyganiak.de/2007/10/lod/#how-to-join
>> A CKAN entry is required; without one, the resource is a "fail".
>>
>>> My notes are here:
>>>
>>> http://wiki.okfn.org/Working_Groups/linguistics/Resources_in_the_cloud
>>>
>> Well, that is a nice table, but rather pointless. Please concentrate on
>> maintaining the group resources at:
>> http://thedatahub.org/en/group/linguistics
>> or
>> https://docs.google.com/spreadsheet/ccc?key=0AlMk5ouIspH1dGx1R1Rnd1ZXX0xmLXppSWFrcm0wNFE&authkey=CJi9u78D&authkey=CJi9u78D#gid=0
>>
>>
>>> The following resources appeared to be acceptable (i.e., they exist, have
>>> RDF, contain some useful data and had links to some other resource or to
>>> data categories)
>>>
>> softening criteria
>>
>>>
>>>     - Cornetto
>>>     - WOLD
>>>     - W3C WordNet
>>>     - DBPediaWiktionary
>>>     - LemonWiktionary*
>>>     - LemonWordNet*
>>>     - Open Data Thesaurus**
>>>     - DBPedia**
>>>     - YAGO
>>>     - Localized DBPedias**
>>>     - OpenCyc
>>>     - GOLD***
>>>     - ISOcat***
>>>     - Lexvo
>>>     - Lingvoj
>>>     - Glottolog/LingDoc*
>>>
>>> * Sebastian has indicated that these resources may be buggy. There are no
>>> issues here <http://code.google.com/p/mlode/issues/list> that make them
>>> unusable however so I count them as good.
>>>
>> LemonWiktionary and Glottolog have 18 issues total, which is good.
>> Sebastian Nordhoff already fixed 4 bugs for Glottolog, making it much
>> better and removing the "fail".
>> Let's work on the data, not on lowering expectations.
>>
>>> ** DBpedia and Open Data Thesaurus are not primarily linguistics
>>> resources,
>>> should they be included in the LLOD cloud?
>>>
>> My definition would include "anything that is useful for NLP" as well.
>> Besides you have redirects.
>>
>>> *** IMHO categories and schematic information resources are vital part of
>>> the LLOD cloud, I can't understand why Sebastian suggests they should not
>>> be included!?
>>>
>> copying behaviour from http://lod-cloud.net/
>> We can do schemas extra, if you want to.
>>
>>> The following resources need to be entered into CKAN: (6/27)
>>> <snip>
>>>
>>> The following resources should be removed (at least for the time being)
>>> from the cloud diagram: (5/27)
>>> <snip>
>>>
>>> The following resources need attention: (4/27)
>>> <snip>
>>>
>> That is a total of 15, I counted 18.
>>
>>> So in summary, out of the 27 bubbles in the LLOD cloud, 17 are usable and 4
>>> can likely be quickly fixed. I have attached a version of the LLOD cloud
>>> with these results attached. Please edit the Wiki page if you feel I have
>>> got something wrong.
>>>
>> Please concentrate on editing CKAN or the Google spreadsheet and submit
>> your data set to Google Code.
>> We are working on creating updates of the cloud based on CKAN.
>> @John, please read:
>> http://richard.cyganiak.de/2007/10/lod/#how-to-join
>> http://wiki.okfn.org/Wg/linguistics/llod#How_to_contribute
>> LemonWordnet for example needs 50 links to an existing resource. Jimmy
>> O'Regan was so kind as to create that for you:
>> http://code.google.com/p/mlode/issues/detail?id=34
>>
>> Kind regards,
>> Sebastian
>>
>>