[open-linguistics] CFP: Linked Data in Linguistics 2012 (LDL 2012) - extended abstracts due August 7, 2011

Mon Aug 1 11:50:38 UTC 2011

Apologies for cross-postings. Please forward this call to interested colleagues. Thanks!
A PDF version can be found here:
http://ldl2012.lod2.eu/DGFS2012-LinkedDataCfP.pdf
Although the workshop will be held in 2012, the submission deadline for
extended abstracts is August 7, 2011.

2nd CALL FOR PAPERS (EXTENDED ABSTRACTS)

*********************************************************
Linked Data in Linguistics
Representing and connecting language data and language metadata
http://ldl2012.lod2.eu

Workshop organized as part of the Annual Conference of the German
Linguistic Society (DGfS),
to be held in Frankfurt, Germany, March 7-9, 2012
*********************************************************
Date: March 7-9, 2012
Submission Deadline: August 7, 2011 (Extended Abstracts)
Venue: Frankfurt am Main, Germany
*********************************************************

** Overview **
The explosion of information technology has led to a substantial
growth in quantity, diversity and complexity of linguistic data
accessible over the Internet. These resources become even more useful
when linked with each other. This workshop will present principles,
use cases, and best practices for using the linked data paradigm to
represent, exploit, store, and connect different types of linguistic
data collections.
The intended audience includes empirically-working linguists and
philologists interested in the representation, exchange and
interlinking of linguistic data and metadata, computer scientists and
computational linguists interested in the application of Semantic Web
formalisms and technologies to language data, developers of
infrastructures for linguistic data, and other researchers with an
interest in both aspects.

** Linguistic data and metadata **
Recent years have seen the rapid development of linguistic data
collections available over the Internet. The workshop intends to
address questions and use cases for the creation, publication and
application of data collections including (but not limited to):

1. Language archives for (endangered) languages contain a wealth of
textual material as well as audio and video (DOBES, PARADISEC,
ELAR). How can this material be mobilized?
2. Typological databases such as the World Atlas of Language
Structures (WALS) or the Typological Database System (TDS) provide
rich repositories of information about languages and their respective
features. It would be valuable to combine the information from these
resources, for example to ask: “Is it true that OV languages [WALS
feature 83A] are characterized by pitch accent [TDS, StressTyp
database]?” How can such queries be accomplished?
3. Computational lexicography uses formalisms such as RDF, SKOS and
OWL to encode dictionaries and to employ them in different
applications. What are the practical benefits of this representation?
4. Lexical-semantic resources such as WordNet, FrameNet and general
knowledge bases like DBpedia and Yago represent the very foundation of
computational semantics, and are also available in OWL and RDF. How
does this representation improve the accessibility and the application
of these resources?
5. Linguistic corpora involve an increasing diversity of annotations
such as syntax, semantics and coreference (e.g.,
PennTreeBank/PropBank/PennDiscourseTreebank, OntoNotes, SALSA/TIGER).
How can such multi-layered corpora be represented, evaluated and
connected to electronic lexicons, lexical-semantic resources, or
metadata repositories?
6. Metadata repositories provide common vocabularies for the
description of other types of linguistic data, making it possible to
compare and integrate them. This includes information about languages
(e.g. in LL-MAP or MultiTree), but also information about linguistic
data categories and phenomena (e.g. in GOLD and ISOcat). How do such
common repositories improve the re-usability of linguistic resources
in research and in Semantic Web applications?
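
To make the cross-resource question in item 2 more concrete, here is a
minimal sketch in Python of how such a query could work once both
databases expose statements about languages under shared identifiers.
Everything in it is an assumption made for illustration: the
example.org identifiers, the predicate names "wals:83A" and
"tds:pitchAccent", and all values are invented and do not reflect real
WALS or TDS records or their actual data models.

```python
# Toy (subject, predicate, object) statements. All identifiers and values
# are invented for illustration; they are NOT real WALS or TDS data.
WALS_TRIPLES = [
    ("http://example.org/lang/a", "wals:83A", "OV"),
    ("http://example.org/lang/b", "wals:83A", "OV"),
    ("http://example.org/lang/c", "wals:83A", "VO"),
    ("http://example.org/lang/d", "wals:83A", "OV"),
]
TDS_TRIPLES = [
    ("http://example.org/lang/a", "tds:pitchAccent", True),
    ("http://example.org/lang/b", "tds:pitchAccent", False),
    ("http://example.org/lang/c", "tds:pitchAccent", True),
]


def values_by_subject(triples, predicate):
    """Map each subject to its object for the given predicate."""
    return {s: o for s, p, o in triples if p == predicate}


def ov_pitch_accent_summary(wals, tds):
    """Join the two sources on the shared language identifier."""
    order = values_by_subject(wals, "wals:83A")
    pitch = values_by_subject(tds, "tds:pitchAccent")
    ov = {s for s, value in order.items() if value == "OV"}
    # Only languages described in BOTH sources can answer the question.
    covered = sorted(ov & set(pitch))
    with_pitch = [s for s in covered if pitch[s]]
    return {
        "ov_total": len(ov),
        "ov_covered": len(covered),
        "ov_with_pitch_accent": len(with_pitch),
    }


summary = ov_pitch_accent_summary(WALS_TRIPLES, TDS_TRIPLES)
print(summary)
```

The join only works because both toy sources name a language with the
same identifier, which is the kind of interlinking the workshop is
concerned with. The summary also reports how many OV languages are
covered by both sources, since gaps in coverage limit what such a
query can show.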

It is the challenge of our time to store, interlink and exploit this
wealth of data. Our workshop leverages the Digital Humanities paradigm
within linguistics, focusing on the use of information technology to
improve data-driven linguistic research.
This workshop invites researchers from the fields of language
documentation, typology, computational linguistics, corpus
linguistics, as well as researchers from other empirically-oriented
disciplines of linguistics who share an interest in data and metadata
modelling with Semantic Web technologies such as RDF or OWL.

** Topics of interest **
We invite contributions related to (but not limited to) the
following topics:
1. Use cases, problem descriptions and project proposals for the
creation, maintenance and publication of linguistic data collections
that are linked with other resources
2. Modelling linguistic data and metadata with OWL and/or RDF
3. Ontologies for linguistic data and metadata collections
4. Applications of such data, other ontologies or linked data from any
subdiscipline of linguistics (may include work in progress or project
descriptions)
5. Legal and social aspects of Linked Linguistic Data

** Goals **
Besides the discussion of projects, experiences and open questions, the
workshop hopes to support the on-going development of a community of
researchers interested in linked linguistic data. This involves the
following aspects:

1. The primary goal is to establish interdisciplinary contact across
the boundaries between different subdisciplines of applied
linguistics, computational linguistics and neighbouring fields. Our
impression is that people from very different backgrounds encounter
similar issues in their work, and that there is potential for
synergies here.
2. The second goal is to increase the amount of Linked Open Data on
the web so that researchers can make use of the data already out
there. In other words: we want to find the data giants on whose
shoulders future generations would be able to stand, and convince them
to make their data available as Linked Data.
3. The third goal is to discuss strategies for, reasons for, and
problems with publishing linguistic data under open licenses, with a
view to increasing the prestige of data as a form of scientific
production that need not shy away from comparison with more
established genres such as articles or monographs.

** Submission **
We invite extended abstracts of up to 2500 words plus references, to
be submitted by August 7, 2011. On A4 paper in 10pt Times, this
corresponds to four pages plus references. For submission details,
please consult the workshop webpage: http://ldl2012.lod2.eu/submission

** Important Dates **
August 7, 2011: Deadline for extended abstracts (four pages plus references)
September 9, 2011: Notification of acceptance
October 23, 2011: One-page abstract for DGfS conference proceedings
December 1, 2011: Camera-ready papers for workshop proceedings (eight
pages plus references)
March 7-9, 2012: Workshop

** Invited speakers **
Martin Haspelmath (Max Planck Institute for Evolutionary Anthropology)
Nancy Ide (American National Corpus, Vassar College)

** Workshop organizers **
Sebastian Nordhoff (Max Planck Institute for Evolutionary
Anthropology, Leipzig, Germany)
Christian Chiarcos (University of Potsdam, Germany)
Sebastian Hellmann (University of Leipzig, Germany)

** Programme committee **
Emily Bender (University of Washington)
Philipp Cimiano (CITEC, Universität Bielefeld)
Alexis Dimitriadis (Universiteit Utrecht)
Caroline Féry (Universität Frankfurt)
Jeff Good (University at Buffalo)
Harald Hammarström (MPI-EVA Leipzig)
Ernesto William de Luca (DAI-Lab, Technische Universität Berlin)
Harald Lüngen (IDS Mannheim)
Lutz Maicher (Fraunhofer MOEZ)
John McCrae (CITEC, Universität Bielefeld)
Gerard de Melo (MPI for Informatics, Saarbrücken)
Pablo Mendes (FU Berlin)
Steven Moran (University of Washington)
Axel-C. Ngonga Ngomo (Universität Leipzig)
Antonio Pareja-Lora (Universidad Complutense de Madrid)
Cornelius Puschmann (Heinrich-Heine-Universität Düsseldorf)
Felix Sasaki (DFKI Berlin, FH Potsdam)
Stavros Skopeteas (Universität Bielefeld)
Dennis Spohr (CITEC, Universität Bielefeld)
Johanna Völker (Universität Mannheim)
Menzo Windhouwer (MPI Nijmegen / Universiteit Amsterdam)
Alena Witzlack-Makarevich (University of Zurich)

The workshop is endorsed and sponsored by the Max Planck Institute for
Evolutionary Anthropology (http://www.eva.mpg.de) and the LOD2
project: Creating Knowledge out of Interlinked Data (http://lod2.eu)
--
Christian Chiarcos
Applied Computational Linguistics
Universität Potsdam, Germany
snail: Karl-Liebknecht-Str. 24-25, 14476 Golm
web: http://www.sfb632.uni-potsdam.de/~chiarcos
email: chiarcos AT uni LINE potsdam DOT de
phone: +49-331-977-2664
fax: +49-331-977-2087