[open-linguistics] Defining "Openness" for Linguistic Linked Open Data
bond at ieee.org
Wed Jan 17 12:29:54 UTC 2018
I think it is very misleading to have a diagram called Linked Open Data
cloud if it is not in fact comprised of linked open data. We have a
simple solution: create two diagrams: (Linguistic) Linked Data which allows
all linked data and (Linguistic) Linked Open Data, for data that is open
according to the open definition. That way everyone can have what they
want :-). The two resources can link to each other somewhere in their
descriptions to make them both easily accessible.
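To make the split concrete, here is a minimal sketch in Python, assuming each dataset record carries a license identifier. The names `OPEN_LICENSES` and `split_diagrams` and the sample records are hypothetical, and the license set is only a small illustrative subset of the licenses that conform to the Open Definition.

```python
# Hypothetical sketch: split datasets into the two proposed diagrams by license.
# The dataset records below are made up and are not real diagram entries.

# A few licenses that conform to the Open Definition (not an exhaustive list).
OPEN_LICENSES = {"CC0-1.0", "CC-BY-4.0", "CC-BY-SA-4.0", "ODbL-1.0", "PDDL-1.0"}


def split_diagrams(datasets):
    """Return (linked_data, linked_open_data) lists of dataset names.

    Every dataset goes into the Linked Data diagram; only those with an
    Open Definition-conformant license also go into the Linked Open Data one.
    A missing license is treated as not open.
    """
    linked_data = [d["name"] for d in datasets]
    linked_open_data = [
        d["name"] for d in datasets if d.get("license") in OPEN_LICENSES
    ]
    return linked_data, linked_open_data


# Illustrative records only: one open, one non-commercial, one unlicensed.
example = [
    {"name": "lexicon-a", "license": "CC-BY-SA-4.0"},
    {"name": "corpus-b", "license": "CC-BY-NC-4.0"},
    {"name": "treebank-c", "license": None},
]
ld, lod = split_diagrams(example)
print(ld)
print(lod)
```

Under these assumptions the non-commercial and unlicensed records appear only in the Linked Data list, while the CC-BY-SA record appears in both.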
As Victor pointed out, you need to be open to be LOD. I quote from 
"You can have 5-star Linked Data without it being open. However, if it
claims to be Linked Open Data then it does have to be open, to get any star
On Wed, Jan 17, 2018 at 6:09 PM, Víctor Rodríguez Doncel <
vrodriguez at fi.upm.es> wrote:
> Dear Christian,
> Even if the very first star of the high-quality "5 star linked data"
> requires that the license be open, there are some people who would like to
> soften this requirement, making the O in LOD simply mean "open format".
> Furthermore, the European Union is striving towards establishing data
> markets, where obviously licenses cannot be open and where linked data may
> play a role (see how they are actively funding, right now, such research
> In my opinion, limiting the LLOD to strictly open datasets is a mistake,
> as it would depict reality only partially. The webpage at
> http://linguistic-lod.org/llod-cloud already displays the cloud by
> license; I cannot possibly imagine how to improve that...
>  https://www.w3.org/DesignIssues/LinkedData.html
>  ICT-13-2018-2019: Supporting the emergence of data markets and the
> data economy
> On 16/01/2018 at 13:32, Christian Chiarcos wrote:
>> Dear all,
>> when we first began developing the Linguistic Linked Open Data cloud
>> diagram, we followed a highly permissive approach on criteria for
>> inclusion, with the idea to move it from an abstract vision to a set of
>> actually usable resources -- in fact the first versions of the diagram
>> (before the MLODE workshop in September 2012) are explicitly referred to as
>> "drafts" because we included resources whose conversion to LOD had only
>> been *promised* at the time.
>> However, the quality criteria have been continuously enforced since then.
>> This includes availability, size, number of links, and an explicit
>> definition of linguistic relevance as an entry criterion, so that these are
>> now roughly equivalent to the LOD criteria.
>> Along with that, we did *not* enforce an Open Definition-conformant
>> license (http://opendefinition.org/licenses/). In particular, arguments
>> have been brought forward to include non-commercial resources. One of the
>> reasons is that many classical resources developed during the 1990s and
>> early 2000s are released under "academic" licenses and that even today,
>> entire sub-communities in linguistics tend to be very protective about
>> their data. Encouraging noncommercial licenses is a viable compromise to
>> reach out to these communities without compromising the idea of embracing
>> openness altogether. We did have discussions about this from the very
>> beginning, and there are good arguments for either view, but we did *not*
>> manage to establish a consensus to exclude, in particular, NC-licensed data.
>> For the moment, openness is (implicitly) defined as being in line with
>> the LOD diagram, i.e., we inherit its view that "we take a liberal view of
>> what we consider “open”. If the data is openly accessible from a network
>> point of view – that is, it's not behind an authorization check or paywall"
>> (http://lod-cloud.net/). This approach can be criticized for good
>> reasons, but it is an established and transparent practice that goes back
>> to the original LOD diagram by Cyganiak and Jentzsch, and that has also
>> been documented since then.
>> Part of this documentation is that under
>> http://linguistic-lod.org/llod-cloud, users can get an alternative
>> visualization of the diagram with
>> respect to licenses, and as can be easily seen, about half of the LLOD
>> bubbles are non-commercial, three have no explicit license (which means a
>> restrictive license, in Germany, at least), and three more are labeled as
>> "closed" (which may in fact mean that different sub-resources have
>> different licenses, e.g., Multext-East[http://nl.ijs.si/ME/V4/], which
>> includes CC-BY-SA and CC-BY-NC lexica as well as corpus data under a
>> restricted/non-commercial license).
>> However, this can be a problem for data providers who find their NC data
>> in the (L)LOD diagram even though it is not "Open" according to the Open
>> Definition, as users of this data may get a wrong impression about their
>> usage rights -- despite warnings such as "Before using any data, you should
>> always check the publisher's website for the terms and conditions" (
>> The question now is what to do about this situation. Personally, I would
>> prefer to roughly stay with the current practice for the LOD and LLOD
>> diagrams for the moment, but to provide an explicit statement that *our*
>> definition of openness extends beyond the Open Definition by including
>> non-commercial/"academic" resources, because this is an explicit need in
>> (parts of) our community. At the same time, given such a statement,
>> resources with unclear (= restrictive) licenses should be removed from the
>> diagram. As these are quantitatively marginal anyway, this should not
>> affect the usability of LLOD resources and the diagram in comparison to its
>> current state.
>> In any case, this is for the immediate future only. At some point in the
>> future, after intense lobbying among our peers and (hopefully) growing
>> importance of OpenDefinition-compliant licenses, we should certainly adopt a
>> stricter definition, but for the moment, the growth in resources,
>> demonstrating their use and developing applications of (L)LOD should --
>> IMHO -- take priority over ideological purity until it is established as a
>> conventional approach for (certain kinds of) linguistic data.
>> This may be controversial, though, so, what do others think?
> Víctor Rodríguez-Doncel
> D3205 - Ontology Engineering Group (OEG)
> Departamento de Inteligencia Artificial
> ETS de Ingenieros Informáticos
> Universidad Politécnica de Madrid
> Campus de Montegancedo s/n
> Boadilla del Monte-28660 Madrid, Spain
> Tel. (+34) 91336 3672
> Skype: vroddon3
> open-linguistics mailing list
> open-linguistics at lists.okfn.org
> Unsubscribe: https://lists.okfn.org/mailman/options/open-linguistics
Francis Bond <http://www3.ntu.edu.sg/home/fcbond/>
Division of Linguistics and Multilingual Studies
Nanyang Technological University