Draft of the Expression of Interest Call:
Corpus-based distributional models (such as LSA or HAL) have been claimed to capture interesting aspects of word meaning. Although they have been proposed as plausible simulations of some properties of human semantic representation, a careful and extensive evaluation of this claim is still largely lacking.
Evaluation of these models has tended to focus on large-scale quantitative tasks, often oriented more towards engineering applications (see, e.g., the recent SEMEVAL evaluation campaign) than towards the challenges posed by formal semantics, linguistic theory, philosophy and cognitive science. The result is a great divide between corpus-driven computational approaches to semantics and the more formally oriented approaches typical of the linguistic tradition and of most of the psychological tradition.
Moreover, whereas human lexical semantic competence is obviously multi-faceted, ranging from free association to taxonomic judgments to relational effects, tests of distributional models tend to focus on a single aspect (typically limited to semantic similarity detection), and few if any models have been tuned to tackle different facets of semantics in an integrated manner.
Our workshop aims to fill these gaps by inviting individual scholars and research teams to test their computational models on a variety of small but carefully designed tasks that bring out linguistically and cognitively interesting aspects of semantics. Specifically, we envisage the following tasks:
- categorization;
- concrete noun categorization;
- concrete vs. abstract noun discrimination;
- verb categorization;
- MISSING TASK
- property generation;
- MISSING TASK
The focus is NOT on competition, but on understanding how different models highlight different semantic aspects, and how far we are from integrated models of all such aspects. We think that the current state of the art calls not for discovering the best model, but rather for deepening our understanding of the weak points of different approaches. To this end, annotated datasets will be distributed to the participants, who will be encouraged to explore them, highlight interesting aspects of their models' performance, perform quantitative and qualitative error analysis, etc.
Theoretical and experimental papers related to the task datasets and simulation results are also invited.
Through collaborative preparatory work on the Word Space wiki (http://wordspace.collocations.de) and thanks to the ESSLLI multiple-day workshop format, we hope that this initiative will foster collaboration among the nascent community of researchers interested in computational semantics from a theoretical rather than an applied point of view.
We ask for expressions of interest from researchers and teams who might take part in the initiative and might want to have a say in test set design. A subsequent mail will provide deadlines for data-set creation and workshop submission.
XXX CONTACT INFO XXX