
Lexical Semantics Workshop (ESSLLI 2008)

Bridging the gap between semantic theory and computational simulations
Workshop at ESSLLI 2008, Hamburg, August 4-8 2008

Workshop programme and proceedings available here

Background and motivation

Motivational Poster

Corpus-based distributional models (such as LSA or HAL) have been claimed to capture interesting aspects of word meaning and provide an explanation for the rapid acquisition of semantic knowledge by human language learners. However, although these models have been proposed as plausible simulations of human semantic space organization, careful and extensive empirical tests of such claims are still lacking.

Systematic evaluations typically focus on large-scale quantitative tasks, often more oriented towards engineering applications (see, e.g., the recent SEMEVAL evaluation campaign) than towards the challenges posed by linguistic theory, philosophy and cognitive science. This has resulted in a great divide between corpus-driven computational approaches to semantics on the one hand and theory-driven symbolic approaches on the other – a situation that is characteristic of the linguistic tradition and of most of the cognitive tradition. Moreover, whereas human lexical semantic competence is obviously multi-faceted – ranging from free association to taxonomic judgments to relational effects – tests of distributional models tend to focus on a single aspect (most typically the detection of semantic similarity), and few if any models have been tuned to tackle different facets of semantics in an integrated manner.

Our workshop aims to fill these gaps by inviting research teams and individual scholars to test their computational models on a variety of small but carefully designed tasks that bring out linguistically and cognitively interesting aspects of semantics (see below for details). To this end, annotated datasets are available to the participants, who are encouraged to explore them, highlight interesting aspects of their models' performance, conduct quantitative and qualitative error analysis, etc.

The focus is NOT on competition, but on understanding how different models highlight different semantic aspects, how far we are from an integrated model, and which aspects of semantics are beyond the reach of purely distributional approaches. In fact, we believe that, given the current state of the art in computational and distributional semantics, our goal should not be to develop the best-performing model for a specific application, but rather to deepen our understanding of the limits and potential of different approaches when confronted with cognitively realistic tasks.

In addition to these practical tasks, theoretical and experimental papers discussing the relation between distributional and symbolic approaches to meaning are also invited. We are particularly interested in papers that analyze our task data sets from a theoretical perspective or that discuss simulation results and their implications for semantic and cognitive theory.

Through collaborative preparatory work on the Word Space wiki (wordspace.collocations.de) and thanks to the ESSLLI multiple-day workshop format, we hope that this initiative will foster collaboration among the nascent community of researchers interested in computational semantics from a theoretical rather than engineering-oriented point of view.

For further information, please write to lexsem08@gmail.com, and/or subscribe to the workshop wiki RSS feed.

Tasks and data sets

In order to reach a better understanding of the possibilities and limitations of distributional models of word meaning, we envisaged a number of tasks that focus on linguistic and cognitive challenges rather than application engineering.

Small annotated data sets are available on the workshop page and participants are invited to apply their computational models and conduct a thorough analysis of the results. The goal is not to achieve better precision than competitors, but to understand the strengths and weaknesses of individual models, analyze and explain errors, etc. Theoretical discussions of the data sets from a linguistic or cognitive perspective are also invited and will complement the empirical findings.

Ongoing work on data set preparation can be monitored at http://wordspace.collocations.de/doku.php/data:start.

We offer the following tasks (click on the links for detailed task descriptions and downloads):

Workshop information

Dates

  • April 4, 2008: Paper submission deadline
  • April 24, 2008: Notification
  • August 4-8, 2008: Workshop in Hamburg (during the first week of ESSLLI)

Programme Committee

Marco Baroni (University of Trento) (co-organizer)
Reinhard Blutner (University of Amsterdam)
Gemma Boleda (UPF, Barcelona)
Peter Bosch (University of Osnabrück)
Paul Buitelaar (DFKI, Saarbrücken)
John Bullinaria (University of Birmingham)
Katrin Erk (UT, Austin)
Stefan Evert (University of Osnabrück) (co-organizer)
Patrick Hanks (Masaryk University, Brno)
Anna Korhonen (Cambridge University)
Michiel van Lambalgen (University of Amsterdam)
Alessandro Lenci (University of Pisa) (co-organizer)
Claudia Maienborn (University of Tübingen)
Simonetta Montemagni (ILC-CNR, Pisa)
Rainer Osswald (University of Hagen)
Manfred Pinkal (University of Saarland)
Massimo Poesio (University of Trento)
Reinhard Rapp (University of Mainz)
Magnus Sahlgren (SICS, Kista)
Sabine Schulte im Walde (University of Stuttgart)
Manfred Stede (University of Potsdam)
Suzanne Stevenson (University of Toronto)
Peter Turney (NRC Canada, Ottawa)
Tim Van de Cruys (University of Groningen)
Gabriella Vigliocco (University College, London)
Chris Westbury (University of Alberta)

Paper submission

We welcome papers that report the results of experiments with word space models on one or more workshop tasks, as well as papers that compare different models on the same task(s). Authors are asked to carry out their own evaluation, using the tools provided on the website where possible.

We also welcome papers focussing on:

  • methodological and theoretical issues concerning word space models;
  • open challenges for distributional methods for semantic analysis;
  • interactions with formal approaches to meaning;
  • interactions with cognitive research on human semantic memory.

Papers should be no longer than 8 pages and should be submitted anonymously in PDF format, following the ACL 2008 stylesheet.

Submissions must be sent to lexsem08@gmail.com no later than April 4, 2008, with PAPER SUBMISSION in the subject line and the authors' names and affiliations in the message body.

Workshop homepage

ESSLLI 2008 homepage