[open-linguistics] Full CfP Linked Data in Linguistic Typology, Deadline January 15

Sebastian Nordhoff sebastian_nordhoff at eva.mpg.de
Mon Jan 7 19:30:02 UTC 2013

Linked Data in Linguistic Typology

== Convenor ==
Sebastian Nordhoff

== Date ==
August 15-18, 2013, precise day tbd

== Venue ==
MPI-EVA Leipzig (ALT Theme session)

== Submission deadline ==
JANUARY 15, 2013 (next week)

== Audience ==
Typologists and computational linguists

== Call for papers ==
Typology lives on data. Typologists produce, curate, extract, aggregate,
and analyze data on a daily basis. One major issue is the interoperability
of digital data thus gathered. This workshop will deal with the
production, publication, and interlinking of typological data according to
Semantic Web principles (Linked Open Data).

Several attempts at standardizing typological data have been made, e.g.
LDS (Comrie & Smith 1977) and GOLD (Farrar & Langendoen 2003). These
top-down approaches have had some success, but large-scale adoption is
still wanting. A bottom-up approach, as employed for instance by TDS
(http://tds2.dans.knaw.nl/) and ISO-CAT (http://www.isocat.org/), could be
more promising, as it takes into account the often strong feelings
linguists have about data categories.

Numerous projects around the world gather heterogeneous typological data,
but data representation is by and large project-specific and not guided by
general principles. This often results in serious problems over time,
including issues with regard to persistence, provenance, interoperability,
and accessibility.

These problems are well-known in other data-heavy subdisciplines, e.g.
lexicography and corpus linguistics. The lemon project (McCrae et al.
2012) tackles these issues for lexicography, and OLiA does the same for
corpus linguistics (Chiarcos 2012). In this workshop, we want to explore
how far the solutions developed in these other subdisciplines can be
applied to typology, building upon more general concepts of interlinking
heterogeneous data sets in the context of Linked Open Data (Berners-Lee
2006, Heath & Bizer 2011).

The working group on Open Data in Linguistics of the Open Knowledge
Foundation has recently started working on interlinking data from various
subdisciplines (Chiarcos et al. 2012a). The insights and experiences
gained there can fruitfully be applied to typology, as the integration of
WALS, WOLD, ASJP, Glottolog, and IDS into the Linguistic Linked Open Data
Cloud shows (Nordhoff 2012, Hellmann et al. forthcoming). Chiarcos et al.
(2012b) show how such data can then be cross-queried across knowledge
bases to gain new insights and test hypotheses.

The major advantages of the Linked Open Data approach advocated in
Chiarcos et al. (2012a) are the potential for cross-querying data and the
possibility of a federated approach to data production (crowdsourcing).

The aim of this workshop is to bring together typologists who create or
curate large data sets and practitioners of Linked Open Data, to leverage
the potential of creating a linked data cloud for linguistic typology. We
welcome presentations about novel techniques of publishing data on the
web, about interlinking and cross-querying databases, and about federating
data production.

== References ==
Berners-Lee, Tim. 2006. Design Issues: Linked Data. July 2006.

Chiarcos, Christian. 2012. Ontologies of Linguistic Annotation: Survey and
Perspectives. LREC 2012, Istanbul.

Chiarcos, Christian, Nordhoff, Sebastian & Hellmann, Sebastian (eds.).
2012a. Linked Data in Linguistics: Representing and Connecting Language
Data and Language Metadata. Heidelberg: Springer.

Chiarcos, Christian, Hellmann, Sebastian & Nordhoff, Sebastian. 2012b.
Linking Linguistic Resources: Examples from the Open Linguistics Working
Group. In Chiarcos et al. (eds.) 2012a.

Comrie, Bernard & Smith, Norval. 1977. The Lingua Descriptive Studies
Questionnaire. Lingua 41. 1-74.

Farrar, Scott & Langendoen, Terry. 2003. A linguistic ontology for the
semantic web. GLOT International 7. 200-203.

Heath, Tom & Bizer, Chris. 2011. Linked Data - Evolving the Web into a
Global Data Space. San Rafael: Morgan & Claypool.

Hellmann, Sebastian, Moran, Steven, Brümmer, Martin & McCrae, John (eds.).
Forthcoming. Multilingual Linked Open Data. Special Issue of the Semantic
Web Journal.

McCrae, John, Montiel-Ponsoda, Elena & Cimiano, Philipp. 2012. Integrating
WordNet and Wiktionary with lemon. In Chiarcos et al. (eds.) 2012a.

Nordhoff, Sebastian. 2012. Linked Data for Linguistic Diversity Research:
Glottolog/Langdoc and ASJP Online. In Chiarcos et al. (eds.) 2012a.

== Submission ==
Send your abstract as an e-mail attachment to: ALT10[>>> Please replace
the brackets with an AT sign! <<<]eva.mpg.de

Subject header: (your name) ALT 10 abstract

Include these things in the body of the email:
authors' names
abstract title
contact information: e-mail, phone, fax

Note: One individual may be involved in a maximum of two abstracts
(maximum of one as sole author), regardless of category (oral, poster,
theme-session talk).

Maximum length: 500 words or 1 single-spaced page.

Please put this information at the top of your abstract:
abstract title
abstract category (oral, poster, oral/poster)
theme session (if applicable)

Format: If at all possible, please send your abstract as a pdf.

Name: Give your pdf a filename similar to the subject header.

Anonymity: Abstracts must be anonymous: do not put your name or other
identifying information on the abstract. Also, please anonymize your pdf
by removing identifying information.

== Further information ==
