[open-linguistics] Linked Open Data and Endangered Languages

Ritesh riteshkrjnu at gmail.com
Mon Oct 12 14:25:21 UTC 2015


Dear members,

I have been trying to read about and learn how to convert language data,
that is either in XML format or in plain text format, into Linked Open
Data. What I have understood is that it needs to be represented in RDF
format, made accessible over the web and, if possible, linked to other
similar resources. However, there is too much information and there are
too many frameworks to deal with, and I must admit I am pretty lost.

What some of my colleagues and I are trying to do is very simple: we have a
large amount of audio (and also video) recordings, with interlinear
glossing, of quite a few critically endangered languages of India. These
data were collected, and are still being collected today, as part of a
documentation project to preserve as much as possible of these dying
languages (and maybe to use the language data to revitalise the
languages). Most of the data is in XML format, created by two of the most
widely used software tools in field linguistics and language
documentation: SIL FieldWorks and the ELAN Video Annotation Tool. So the
data is structured but it is not Linked Open Data. We are trying to export
and publish this data as Linked Open Data.

Now the problem is that none of us really understands RDF or Linked Open
Data that well. Theoretically we understand that RDF maintains the
semantics of any document, thereby making interoperability possible, but
that is pretty much all we know. We are still not able to figure out how
exactly this could be done. Any pointers towards what exactly Linked Open
Data is and how we could convert data into Linked Open Data would be very
helpful. Of course, there are a large number of resources available on the
web but they are a bit too much; most of the time we end up more confused
than ever. So we would appreciate something which gives an overview of
this and maybe also some indications / guidelines as to how we could
approach this.

In addition to this, we were also wondering if somebody in this group would
be interested in delivering a 'revelation' talk or maybe giving some kind
of tutorial / workshop / training on how exactly this could be done. We are
organising a conference on Language Technologies for Endangered Languages
from 25 - 27 February, 2016 in Agra, India
(Conference Website <http://elkl4.kmiagra.in/>)
and we would like it to be done there so that the maximum number of people
could benefit. Please let me know and we could talk about this over
personal email (without spamming the inboxes of the subscribers to this
list).

Thanks & Best regards,

-- 
Ritesh Kumar, Ph.D.
Assistant Professor
Department of Linguistics
Dr. B.R. Ambedkar University
Agra, India