[open-linguistics] Defining "Openness" for Linguistic Linked Open Data

Wed Jan 17 12:29:54 UTC 2018

G'day,

I think it is very misleading to have a diagram called the Linked Open Data
cloud if it is not in fact composed of linked open data. We have a
simple solution: create two diagrams, (Linguistic) Linked Data, which allows
all linked data, and (Linguistic) Linked Open Data, for data that is open
according to the Open Definition. That way everyone can have what they
want :-). The two resources can link to each other somewhere in their
descriptions to make them both easily accessible.

As Victor pointed out, you need to be open to be LOD. I quote from [1]:
"You can have 5-star Linked Data without it being open. However, if it
claims to be Linked Open Data then it does have to be open, to get any star
at all."

[1] https://www.w3.org/DesignIssues/LinkedData.html

Yours,

On Wed, Jan 17, 2018 at 6:09 PM, Víctor Rodríguez Doncel <
vrodriguez at fi.upm.es> wrote:

> Dear Christian,
>
> Even if the very first star of the high-quality "5 star linked data" [1]
> requires that the license be open, there are some people who would like to
> soften this requirement, making the O in LOD simply mean "open format".
> Furthermore, the European Union is striving towards establishing data
> markets, where obviously licenses cannot be open and where linked data may
> play a role (see how they are actively funding such research projects
> right now [2]).
>
> In my opinion, limiting the LLOD to strictly open datasets is a mistake,
> as it would depict reality only partially. The webpage at
> http://linguistic-lod.org/llod-cloud already displays the cloud by
> license; I cannot possibly imagine how to improve that...
>
> Regards,
> Víctor
>
>
> [1] https://www.w3.org/DesignIssues/LinkedData.html
>
> [2] ICT-13-2018-2019: Supporting the emergence of data markets and the
> data economy
> https://ec.europa.eu/research/participants/portal/desktop/en/opportunities/h2020/topics/ict-13-2018-2019.html
>
> On 16/01/2018 at 13:32, Christian Chiarcos wrote:
>
>> Dear all,
>>
>> when we first began developing the Linguistic Linked Open Data cloud
>> diagram, we followed a highly permissive approach on criteria for
>> inclusion, with the idea to move it from an abstract vision to a set of
>> actually usable resources -- in fact the first versions of the diagram
>> (before the MLODE workshop in September 2012) are explicitly referred to as
>> "drafts" because we included resources whose conversion to LOD had only
>> been *promised* at the time.
>>
>> However, the quality criteria have been continuously enforced since then.
>> This includes availability, size, number of links, and an explicit
>> definition of linguistic relevance as an entry criterion, so that these are
>> now roughly equivalent to the LOD criteria.
>>
>> Along with that, we did *not* enforce an Open Definition-conformant
>> license (http://opendefinition.org/licenses/). In particular, arguments
>> have been brought forward to include non-commercial resources. One of the
>> reasons is that many classical resources developed during the 1990s and
>> early 2000s are released under "academic" licenses and that even today,
>> entire sub-communities in linguistics tend to be very protective about
>> their data. Encouraging noncommercial licenses is a viable compromise to
>> reach out to these communities without compromising the idea of embracing
>> openness altogether. We did have discussions about this from the very
>> beginning, and there are good arguments for either view, but we did *not*
>> manage to establish a consensus to exclude, in particular, NC-licensed data.
>>
>> For the moment, openness is (implicitly) defined as being in line with
>> the LOD diagram, i.e., we inherit its view that "we take a liberal view of
>> what we consider “open”. If the data is openly accessible from a network
>> point of view – that is, it's not behind an authorization check or paywall"
>> (http://lod-cloud.net/). This approach can be criticized for good
>> reasons, but it is an established and transparent practice that goes back
>> to the original LOD diagram by Cyganiak and Jentzsch, and that has also
>> been documented since then.
>>
>> Part of this documentation is that under
>> http://linguistic-lod.org/llod-cloud, users can get an alternative
>> visualization of the diagram with respect to licenses, and as can be
>> easily seen, about half of the LLOD bubbles are non-commercial, three have
>> no explicit license (which means a restrictive license, in Germany, at
>> least), and three more are labeled as "closed" (which may in fact mean
>> that different sub-resources have different licenses, e.g., Multext-East
>> [http://nl.ijs.si/ME/V4/], which includes CC-BY-SA and CC-BY-NC lexica as
>> well as corpus data under a restricted/non-commercial license).
>>
>> However, this can be a problem for data providers who find their NC data
>> in the (L)LOD diagram although it is not "Open" according to the Open
>> Definition, as users of this data may get a wrong impression about their
>> usage rights -- despite warnings such as "Before using any data, you should
>> always check the publisher's website for the terms and conditions"
>> (http://lod-cloud.net/).
>>
>> The question now is what to do about this situation. Personally, I would
>> prefer to roughly stay with the current practice for the LOD and LLOD
>> diagrams for the moment, but to provide an explicit statement that *our*
>> definition of openness extends beyond the Open Definition by including
>> non-commercial/"academic" resources, because this is an explicit need in
>> (parts of) our community. At the same time, given such a statement,
>> resources with unclear (= restrictive) licenses should be removed from the
>> diagram. As these are quantitatively marginal anyway, this should not
>> affect the usability of LLOD resources and the diagram in comparison to its
>> current state.
>>
>> In any case, this is for the immediate future only. At some point in the
>> future, after intense lobbying among our peers and (hopefully) growing
>> importance of OpenDefinition-compliant licenses, we should certainly adopt a
>> stricter definition, but for the moment, the growth in resources,
>> demonstrating their use and developing applications of (L)LOD should --
>> IMHO -- take priority over ideological purity until it is established as a
>> conventional approach for (certain kinds of) linguistic data.
>>
>> This may be controversial, though, so, what do others think?
>>
>> Best,
>> Christian
>>
>
>
> --
> Víctor Rodríguez-Doncel
> D3205 - Ontology Engineering Group (OEG)
> Departamento de Inteligencia Artificial
> ETS de Ingenieros Informáticos
> Universidad Politécnica de Madrid
>
> Campus de Montegancedo s/n
> Boadilla del Monte-28660 Madrid, Spain
> Tel. (+34) 91336 3672
> Skype: vroddon3
>
>
>

-- 
Francis Bond <http://www3.ntu.edu.sg/home/fcbond/>
Division of Linguistics and Multilingual Studies
Nanyang Technological University