[open-linguistics] Linguistic LOD cloud - help needed, now is the time to submit your data set

Steven Moran bambooforest at gmail.com
Fri Aug 3 10:43:17 UTC 2012


Hi,

Regarding GOLD, I agree, and I've been in discussions with the group at
Linguist List to make the license of GOLD explicit. They told me it will
be the "Creative Commons Attribution 3.0 Unported License".

Yesterday I added GOLD to the MLODE issue tracker. At one time I was a
contributor for GOLD, so I volunteer to work with the students to get it
past FAIL, if it was labeled in that category. :)

One issue is that resources like this should be easily citable in academic
publications when they are used in research involving the LLOD. I know GOLD
uses DCMI categories for its bibliographic citation. Perhaps a minimum set of
data categories for citing the data set should also be a requirement for
data sets in the LLOD (or at least an option for data sets that should be
attributable, e.g. GOLD was Scott Farrar's PhD dissertation work).

Regards,

-Steve


On Fri, Aug 3, 2012 at 12:29 PM, John McCrae <
jmccrae at cit-ec.uni-bielefeld.de> wrote:

> Hi all,
>
> I did the analysis independently to try to figure out why Sebastian H. had
> labelled so many resources as "fail". I found that most are actually not in
> a terrible state but had a few issues (especially CKAN entries).
>
> One point though: I would say that resources such as GOLD should be
> included however... GOLD consists of nearly 600 identifiers and is far from
> a trivial resource. Moreover, one of the key benefits of LLD is the ability
> to agree on data categories by linking to web resources. I view this as a
> key "selling point" of LLD and would strongly campaign for keeping it ;)
>
> From Sebastian N's comments, it does seem that there is a documentation
> issue here: we need clearer documentation of the procedure (I have
> attended a lot of the telcos, and if I can't figure it out, no-one outside
> will be able to). I would be happy to help with this... (but obviously I am
> not totally clear on the procedures)
>
> I didn't use the Google Doc as it appears to be out-of-date (many of the
> resources in the diagram are not in the document)... should I integrate
> into the Google Doc, or perhaps we could move all documentation to the Wiki
> so it is easier to find?
>
> Regards,
> John
>
> On Fri, Aug 3, 2012 at 11:03 AM, Sebastian Nordhoff <
> sebastian_nordhoff at eva.mpg.de> wrote:
>
>> Dear all,
>> there seems to be some confusion with regard to documentation practice.
>> Some members of this list are closer to the inner workings of the LOD-cloud
>> than others and are aware of many implicit assumptions/shared knowledge
>> that other people do not know about.
>> It would probably be good to list the relevant documents and processes
>> again. RTFM is OK, but you have to know where the M is.
>> Finally, I would like to commend John for being BOLD in the Wikipedia
>> sense. Not knowing the precise rules should not ban anyone from
>> contributing, and I would like to ask John to continue contributing with
>> whatever knowledge of the rules and procedures he has or lacks.
>> Best
>> Sebastian N
>>
>>
>>
>>
>>
>> On Fri, 03 Aug 2012 10:24:49 +0200, Sebastian Hellmann <
>> hellmann at informatik.uni-leipzig.de> wrote:
>>
>>> Hi John,
>>>
>>> On 02.08.2012 15:19, John McCrae wrote:
>>>
>>>> Hi all,
>>>>
>>>> I decided to do an independent evaluation of what was in the LLOD, to
>>>> identify what needs to be done, and found that the situation isn't perhaps
>>>> as bad as the previous email suggests.
>>>>
>>> Sorry, John. The only thing you did is soften the criteria for
>>> inclusion. That doesn't make the data better. You even went so far as to
>>> disregard the criteria imposed by current practice:
>>> http://richard.cyganiak.de/2007/10/lod/#how-to-join
>>> A CKAN entry is required; without one, the resource is a "fail".
>>>
>>>> My notes are here:
>>>>
>>>> http://wiki.okfn.org/Working_Groups/linguistics/Resources_in_the_cloud
>>>>
>>> Well, that is a nice table, but rather pointless. Please concentrate on
>>> maintaining the group resources at:
>>> http://thedatahub.org/en/group/linguistics
>>> or
>>> https://docs.google.com/spreadsheet/ccc?key=0AlMk5ouIspH1dGx1R1Rnd1ZXX0xmLXppSWFrcm0wNFE&authkey=CJi9u78D&authkey=CJi9u78D#gid=0
>>>
>>>
>>>> The following resources appeared to be acceptable (i.e., they exist, have
>>>> RDF, contain some useful data and have links to some other resource or to
>>>> data categories)
>>>>
>>> softening criteria
>>>
>>>>
>>>>     - Cornetto
>>>>     - WOLD
>>>>     - W3C WordNet
>>>>     - DBPediaWiktionary
>>>>     - LemonWiktionary*
>>>>     - LemonWordNet*
>>>>     - Open Data Thesaurus**
>>>>     - DBPedia**
>>>>     - YAGO
>>>>     - Localized DBPedias**
>>>>     - OpenCyc
>>>>     - GOLD***
>>>>     - ISOcat***
>>>>     - Lexvo
>>>>     - Lingvoj
>>>>     - Glottolog/LingDoc*
>>>>
>>>> * Sebastian has indicated that these resources may be buggy. There are no
>>>> issues here <http://code.google.com/p/mlode/issues/list> that make them
>>>> unusable however so I count them as good.
>>>>
>>> LemonWiktionary and Glottolog have 18 issues total, which is good.
>>> Sebastian Nordhoff already fixed 4 bugs for Glottolog, making it much
>>> better and removing the "fail".
>>> Let's work on the data, not lowering expectations.
>>>
>>>> ** DBpedia and Open Data Thesaurus are not primarily linguistics resources,
>>>> should they be included in the LLOD cloud?
>>>>
>>> My definition would include "anything that is useful for NLP" as well.
>>> Besides you have redirects.
>>>
>>>> *** IMHO categories and schematic information resources are a vital part of
>>>> the LLOD cloud, I can't understand why Sebastian suggests they should not
>>>> be included!?
>>>>
>>> copying behaviour from http://lod-cloud.net/
>>> We can do schemas extra, if you want to.
>>>
>>>> The following resources need to be entered into CKAN: (6/27)
>>>> <snip>
>>>>
>>>> The following resources should be removed (at least for the time being)
>>>> from the cloud diagram: (5/27)
>>>> <snip>
>>>>
>>>> The following resources need attention: (4/27)
>>>> <snip>
>>>>
>>> That is a total of 15, I counted 18.
>>>
>>>> So in summary, out of the 27 bubbles in the LLOD cloud, 17 are usable and 4
>>>> can likely be quickly fixed. I have attached a version of the LLOD cloud
>>>> annotated with these results. Please edit the Wiki page if you feel I have
>>>> got something wrong.
>>>>
>>> Please concentrate on editing CKAN or the Google spreadsheet and submit
>>> your data set to Google Code.
>>> We are working on creating updates of the cloud based on CKAN.
>>> @John, please read:
>>> http://richard.cyganiak.de/2007/10/lod/#how-to-join
>>> http://wiki.okfn.org/Wg/linguistics/llod#How_to_contribute
>>> LemonWordnet for example needs 50 links to an existing resource. Jimmy
>>> O'Regan was so kind as to create that for you:
>>> http://code.google.com/p/mlode/issues/detail?id=34
>>>
>>> Kind regards,
>>> Sebastian
>>>
>>>
>> _______________________________________________
>> open-linguistics mailing list
>> open-linguistics at lists.okfn.org
>> http://lists.okfn.org/mailman/listinfo/open-linguistics
>>