[open-linguistics] Linguistic LOD cloud - help needed, now is the time to submit your data set

Pablo N. Mendes pablomendes at gmail.com
Fri Aug 3 09:40:48 UTC 2012


Hi Sebastian H.,
Thanks for kickstarting the discussion.

Hi John,
Thanks for providing a very useful alternative analysis.

Hi all,
I worked on releasing the big LOD cloud last year, and as a result we
produced a report that documents our experience [1]. The TL;DR is: give
people a script [2] to check their entry and they will make their entry
compliant.
A summary of the stats we obtained after collecting entries:
http://www4.wiwiss.fu-berlin.de/lodcloud/state/
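
To make the idea concrete, here is a minimal sketch of what such an entry
check could look like. It is not the validator linked in [2]: the Entry
class and its field names are hypothetical, and the rules are only the ones
mentioned in this thread (a CKAN entry is required, RDF must be available,
and 50 links to an existing resource are needed).

```python
from dataclasses import dataclass
from typing import List

# Threshold quoted in this thread for LemonWordnet; treated here as an assumption.
MIN_LINKS = 50


@dataclass
class Entry:
    """Hypothetical description of one data set; field names are illustrative."""
    name: str
    has_ckan_entry: bool
    has_rdf: bool
    links_to_other_resources: int


def check_entry(entry: Entry) -> List[str]:
    """Return human-readable problems; an empty list means the entry passes."""
    problems = []
    if not entry.has_ckan_entry:
        problems.append("no CKAN entry")
    if not entry.has_rdf:
        problems.append("no RDF available")
    if entry.links_to_other_resources < MIN_LINKS:
        problems.append(
            f"only {entry.links_to_other_resources} links to other resources, "
            f"need {MIN_LINKS}"
        )
    return problems


if __name__ == "__main__":
    example = Entry("example-dataset", has_ckan_entry=True, has_rdf=True,
                    links_to_other_resources=12)
    for problem in check_entry(example):
        print(f"{example.name}: {problem}")
```

Returning a list of problems rather than a bare pass/fail tells the
maintainer exactly what to fix, which is the point of handing out a script.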

What will probably work best for us will be to dictate an initial set of
guidelines (as Sebastian H. did) and then relax them or make them stricter
later as we look at what's realistic to accomplish in our timeframe.

We might need a small team of 2-3 people that is responsible for deciding
on things by taking into consideration everybody's opinions. You know,
"quote-unquote democratic" rather than "really democratic". I
unfortunately cannot volunteer to participate.

Cheers,
Pablo

[1] Full document: http://planet-data-wiki.sti2.at/uploads/c/c0/D4.1.pdf
[2] Here is the script:
http://www4.wiwiss.fu-berlin.de/lodcloud/ckan/validator/
If you want to extend it for the LLOD, here is the source:
https://github.com/anjeve/ckan-lod-validator

On Fri, Aug 3, 2012 at 11:03 AM, Sebastian Nordhoff <
sebastian_nordhoff at eva.mpg.de> wrote:

> Dear all,
> there seems to be some confusion with regard to documentation practice.
> Some members of this list are closer to the inner workings of the LOD-cloud
> than others and are aware of many implicit assumptions/shared knowledge
> other people are unaware of.
> It would probably be good to list the relevant documents and processes
> again. RTFM is OK, but you have to know where the M is.
> Finally, I would like to commend John for being BOLD in the Wikipedia
> sense. Not knowing the precise rules should not ban anyone from
> contributing, and I would like to ask John to continue contributing with
> whatever knowledge of the rules and procedures he has or lacks.
> Best
> Sebastian N
>
>
>
>
>
> On Fri, 03 Aug 2012 10:24:49 +0200, Sebastian Hellmann <
> hellmann at informatik.uni-leipzig.de> wrote:
>
>> Hi John,
>>
>> On 02.08.2012 15:19, John McCrae wrote:
>>
>>> Hi all,
>>>
>>> I decided to do an independent evaluation of what was in the LLOD, to
>>> identify what needs to be done, and found that the situation isn't perhaps
>>> as bad as the previous email suggests.
>>>
>> Sorry, John. The only thing you did is soften the criteria for
>> inclusion. That doesn't make the data better. You even went so far as to
>> disregard the criteria imposed by the current practice:
>> http://richard.cyganiak.de/2007/10/lod/#how-to-join
>> CKAN entry is required, if not then "fail".
>>
>>> My notes are here:
>>>
>>> http://wiki.okfn.org/Working_Groups/linguistics/Resources_in_the_cloud
>>>
>> Well, that is a nice table, but rather pointless. Please concentrate on
>> maintaining the group resources at:
>> http://thedatahub.org/en/group/linguistics
>> or
>> https://docs.google.com/spreadsheet/ccc?key=0AlMk5ouIspH1dGx1R1Rnd1ZXX0xmLXppSWFrcm0wNFE&authkey=CJi9u78D&authkey=CJi9u78D#gid=0
>>
>>
>>> The following resources appeared to be acceptable (i.e., they exist, have
>>> RDF, contain some useful data and had links to some other resource or to
>>> data categories)
>>>
>> softening criteria
>>
>>>
>>>     - Cornetto
>>>     - WOLD
>>>     - W3C WordNet
>>>     - DBPediaWiktionary
>>>     - LemonWiktionary*
>>>     - LemonWordNet*
>>>     - Open Data Thesaurus**
>>>     - DBPedia**
>>>     - YAGO
>>>     - Localized DBPedias**
>>>     - OpenCyc
>>>     - GOLD***
>>>     - ISOcat***
>>>     - Lexvo
>>>     - Lingvoj
>>>     - Glottolog/LingDoc*
>>>
>>> * Sebastian has indicated that these resources may be buggy. There are no
>>> issues here <http://code.google.com/p/mlode/issues/list> that make them
>>> unusable, however, so I count them as good.
>>>
>> LemonWiktionary and Glottolog have 18 issues total, which is good.
>> Sebastian Nordhoff already fixed 4 bugs for Glottolog, making it much
>> better and removing the "fail".
>> Let's work on the data, not lowering expectations.
>>
>>> ** DBpedia and Open Data Thesaurus are not primarily linguistics resources;
>>> should they be included in the LLOD cloud?
>>>
>> My definition would include "anything that is useful for NLP" as well.
>> Besides you have redirects.
>>
>>> *** IMHO categories and schematic information resources are a vital part of
>>> the LLOD cloud, I can't understand why Sebastian suggests they should not
>>> be included!?
>>>
>> copying behaviour from http://lod-cloud.net/
>> We can do schemas extra, if you want to.
>>
>>> The following resources need to be entered into CKAN: (6/27)
>>> <snip>
>>>
>>> The following resources should be removed (at least for the time being)
>>> from the cloud diagram: (5/27)
>>> <snip>
>>>
>>> The following resources need attention: (4/27)
>>> <snip>
>>>
>> That is a total of 15, I counted 18.
>>
>>> So in summary, out of the 27 bubbles in the LLOD cloud 17 are usable and 4
>>> can likely be quickly fixed. I have attached a version of the LLOD cloud
>>> with these results attached. Please edit the Wiki page if you feel I have
>>> got something wrong.
>>>
>> Please concentrate on editing CKAN or the Google spreadsheet and submit
>> your data set to Google Code.
>> We are working on creating updates of the cloud based on CKAN.
>> @John, please read:
>> http://richard.cyganiak.de/2007/10/lod/#how-to-join
>> http://wiki.okfn.org/Wg/linguistics/llod#How_to_contribute
>> LemonWordnet, for example, needs 50 links to an existing resource. Jimmy
>> O'Regan was kind enough to create an issue for that:
>> http://code.google.com/p/mlode/issues/detail?id=34
>>
>> Kind regards,
>> Sebastian
>>
>>
> _______________________________________________
> open-linguistics mailing list
> open-linguistics at lists.okfn.org
> http://lists.okfn.org/mailman/listinfo/open-linguistics
>



-- 
---
Pablo N. Mendes
http://pablomendes.com
Events: http://wole2012.eurecom.fr (*Extended Deadline: Aug 6th 2012*)