[open-linguistics] 2nd CFP - GermEval 2015: LexSub (German Lexical Substitution Shared Task)

Tristan Miller miller at ukp.informatik.tu-darmstadt.de
Tue Mar 17 09:52:33 UTC 2015

==== Second Call for Participation ====

GermEval 2015: LexSub
(German Lexical Substitution Shared Task)

29 September 2015 at GSCL 2015, Essen, Germany


---- Introduction ----

Word sense disambiguation (WSD) has been a core research problem in
computational linguistics since the very inception of the field. In
recent years, there has been considerable interest in using lexical
substitution as an extrinsic evaluation of WSD systems. This has led to
a number of mono- and crosslingual evaluation competitions at SemEval
and EVALITA. We now invite all researchers and industry professionals to
participate in GermEval 2015: LexSub, the first lexical substitution
task for the German language.  The task has been officially accepted as
a workshop at the GSCL 2015 conference at the University of
Duisburg-Essen, and is scheduled for the afternoon of Tuesday, 29
September 2015.

---- Task description ----

Lexical substitution is the task of identifying an appropriate
substitute for a target word in a given context. For example, in the
sentence "She's a bright kid who excels academically," an appropriate
substitute for "bright" might be "smart", whereas an inappropriate one
would be "glowing". Automatically identifying substitution candidates,
and selecting those which best match the context, requires intelligent
application of lexical-semantic knowledge and word sense disambiguation
techniques. However, unlike traditional WSD tasks, lexical substitution
does not mandate the use of any particular sense inventory.

The data for the GermEval 2015: LexSub task is described by Cholakov et
al. in "Lexical substitution dataset for German" (Proc. LREC, 2014).
Altogether, it consists of 2040 sentences from the German Wikipedia, each
containing a target word and a list of substitutions proposed by human
annotators. There are 153 unique target words, equally distributed
across parts of speech (nouns, verbs, and adjectives) and three
frequency groups. About half of this data (26 nouns, 26 verbs, and 26
adjectives in 1040 sentence contexts) forms the training set, which will
be made available to participants in advance. The remainder forms the
test set, which will be used for the evaluation and published in full
only after the shared task is completed.

Participants need not rely on any particular language resources, but if
they wish they can employ the sense-linked lexical-semantic resource UBY
and JoBimText distributional semantics models. UBY also provides an
interface to GermaNet. Industrial users will be eligible for a special
GermaNet licence to be obtained from Eberhard-Karls Universität
Tübingen. Please refer to our web pages for details on how to obtain the
data sets and resources.

Systems' performance will be measured by comparing their substitutes
against those selected by the human annotators; for this we will use the
"best", "out of ten", and "generalized average precision" metrics. The
organizers will provide a scoring system and the output of some baseline
systems.
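
As a rough illustration of the three metrics named above, the following
Python sketch scores a single test item. It assumes the definitions
commonly used since the SemEval-2007 lexical substitution task ("best"
and "out of ten") and the usual formulation of generalized average
precision over weighted gold substitutes. The function names, the toy
gold annotation, and the English example word are hypothetical and are
not taken from the official scorer, which may differ in detail (for
instance, in how duplicates or multiword substitutes are handled).

```python
# Sketch only: per-item versions of the "best", "out of ten" (oot) and
# generalized average precision (GAP) metrics. All names are hypothetical.

def best_score(guesses, gold):
    """Summed annotator counts of the guesses, divided by the number
    of guesses and by the total number of gold responses."""
    if not guesses or not gold:
        return 0.0
    total = sum(gold.values())
    credit = sum(gold.get(g, 0) for g in guesses)
    return credit / (len(guesses) * total)


def oot_score(guesses, gold):
    """Like "best", but up to ten guesses are considered and the credit
    is not divided by the number of guesses."""
    if not gold:
        return 0.0
    total = sum(gold.values())
    return sum(gold.get(g, 0) for g in guesses[:10]) / total


def gap_score(ranking, gold):
    """GAP of a ranked candidate list against weighted gold substitutes."""
    numerator, cumulative = 0.0, 0
    for rank, candidate in enumerate(ranking, start=1):
        weight = gold.get(candidate, 0)
        cumulative += weight
        if weight > 0:  # only gold substitutes contribute
            numerator += cumulative / rank
    # The denominator is the same sum for the ideal ranking, i.e. the
    # gold substitutes sorted by descending weight.
    denominator, cumulative = 0.0, 0
    for rank, weight in enumerate(sorted(gold.values(), reverse=True), start=1):
        cumulative += weight
        denominator += cumulative / rank
    return numerator / denominator if denominator else 0.0


# Hypothetical gold annotation for "bright" in the example sentence:
# substitute -> number of annotators who proposed it.
gold = {"smart": 3, "intelligent": 2, "clever": 1}

print(best_score(["smart", "glowing"], gold))
print(oot_score(["smart", "glowing"], gold))
print(gap_score(["smart", "glowing", "intelligent"], gold))
```

In this sketch, scores for a whole test set would be obtained by
averaging the per-item values; whether the average runs over all items
or only over attempted items distinguishes recall from precision in the
SemEval-2007 setup.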

---- Practical information ----

* 23 January 2015: Availability of training data
* 1 July 2015: Availability of test data
* 15 July 2015: Deadline for initial submission of papers and results
* 1 August 2015: Notification of acceptance and shared task results
* 15 August 2015: Deadline for camera-ready papers
* 30 September–2 October 2015: GSCL 2015

Submissions will consist of a file providing the substitutions for each
instance of the target data and a paper of up to four pages (including
references) describing the approach and analyzing the performance.
Papers should follow the GSCL 2015 style guide, and will be reviewed and
published in an online volume of workshop proceedings. (We may ask
participants to peer-review other submissions.) Participants are
expected to present summaries of their systems at the GermEval 2015:
LexSub workshop at GSCL 2015.

---- Organizing committee ----

* Sallam Abualhaija, Technische Universität Hamburg-Harburg
* Darina Benikova, LT Group, Technische Universität Darmstadt
* Chris Biemann, LT Group, Technische Universität Darmstadt
* Judith Eckle-Kohler, UKP Lab, Technische Universität Darmstadt
* Iryna Gurevych, UKP Lab, Technische Universität Darmstadt
* Tristan Miller, UKP Lab, Technische Universität Darmstadt

To contact the organizing committee, please post to the GermEval 2015:
LexSub mailing list at
https://groups.google.com/forum/#!forum/germeval-2015-lexsub, or, for
private communication, e-mail Tristan Miller.

---- Acknowledgements ----

This shared task is supported by the DFG-funded project “Integrating
Collaborative and Linguistic Resources for Word Sense Disambiguation and
Semantic Role Labeling” (InCoRe, GU 798/9-1), the BMBF-funded CLARIN
F-AG7, and the LOEWE research cluster “Digital Humanities”.

Tristan Miller, Research Scientist
Ubiquitous Knowledge Processing Lab (UKP-TUDA)
Department of Computer Science, Technische Universität Darmstadt
Tel: +49 6151 16 6166 | Web: http://www.ukp.tu-darmstadt.de/
