

Distributional Semantic Models (NAACL-HLT 2010)

Tutorial at the NAACL-HLT 2010 Conference, Los Angeles, 1 June 2010

Course description

Distributional semantic models (DSMs) – also known as "word space" or "distributional similarity" models – are based on the assumption that the meaning of a word can (at least to a certain extent) be inferred from its usage, i.e. its distribution in text. These models therefore represent each word as a high-dimensional vector, built through a statistical analysis of the contexts in which the word occurs.
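To make the idea concrete, here is a minimal sketch of one possible DSM, not the specific method of any paper cited on this page. It assumes a toy three-sentence corpus, a symmetric context window of two tokens, raw co-occurrence counts as vector entries, and cosine similarity as the comparison measure. The function names `cooccurrence_vectors` and `cosine` are illustrative choices, not part of any library.

```python
from collections import Counter, defaultdict
import math


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within +/- `window` tokens."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, target in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:  # a word is not its own context
                    vectors[target][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * v[word] for word, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy corpus (an assumption for illustration only).
corpus = [
    "the cat drinks milk".split(),
    "the dog drinks water".split(),
    "the car needs fuel".split(),
]
vectors = cooccurrence_vectors(corpus, window=2)

print(cosine(vectors["cat"], vectors["dog"]))
print(cosine(vectors["cat"], vectors["car"]))
```

In this toy corpus "cat" and "dog" share the contexts "the" and "drinks", whereas "cat" and "car" share only "the", so "cat" comes out more similar to "dog" than to "car". Real models typically add steps this sketch leaves out, such as weighting the raw counts and reducing the dimensionality of the vectors.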

Since the seminal papers of Landauer & Dumais (1997) and Schütze (1998), DSMs have been an active area of research in computational linguistics. Amongst many other tasks, they have been applied to solving the TOEFL synonym test (Landauer & Dumais 1997, Rapp 2004), automatic thesaurus construction (Lin 1998), identification of translation equivalents (Rapp 1999), word sense induction and discrimination (Schütze 1998), POS induction (Schütze 1995), identification of analogical relations (Turney 2006), PP attachment disambiguation (Pantel & Lin 2000), semantic classification (Versley 2008), as well as the prediction of fMRI (Mitchell et al. 2008) and EEG (Murphy et al. 2009) data. Recent years have seen renewed and rapidly growing interest in distributional approaches, as shown by the series of workshops on DSMs held at Context 2007, ESSLLI 2008, EACL 2009, CogSci 2009, NAACL-HLT 2010, ACL 2010 and ESSLLI 2010.