Distributional Semantic Models (NAACL-HLT 2010)
Schedule & handouts
Part 1
Presentation slides (PDF, 1.9 MiB) – handout (PDF, 1.0 MiB)
- Introduction
  - motivation and brief history of distributional semantics
  - common DSM architectures
  - prototypical applications
- Taxonomy of DSM parameters including
  - size and type of context window
  - feature scaling (tf.idf, statistical association measures, …)
  - normalisation and standardisation of rows and/or columns
  - distance/similarity measures: Euclidean, Minkowski p-norms, cosine, entropy-based, …
  - dimensionality reduction: feature selection, SVD, random indexing (RI)
- Usage and evaluation of DSM
  - what to do with DSM distances
  - attributional vs. relational similarity
  - evaluation tasks & results for attributional similarity
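
To make the parameter taxonomy above more concrete, here is a minimal sketch of three of the listed steps on a toy term-context matrix: feature scaling, row normalisation, and a similarity measure. It is not taken from the tutorial materials. The co-occurrence counts are invented, and the choice of positive pointwise mutual information (PPMI) as the association measure and of cosine as the similarity measure are assumptions made for illustration only.

```python
import math

# Toy term-context co-occurrence counts (invented for illustration only).
counts = {
    "cat": {"pet": 8, "purr": 6, "drive": 0, "road": 1},
    "dog": {"pet": 9, "purr": 0, "drive": 1, "road": 2},
    "car": {"pet": 0, "purr": 1, "drive": 9, "road": 7},
}

def ppmi(counts):
    """Feature scaling: positive pointwise mutual information (one possible association measure)."""
    total = sum(sum(row.values()) for row in counts.values())
    row_sum = {t: sum(row.values()) for t, row in counts.items()}
    col_sum = {}
    for row in counts.values():
        for c, f in row.items():
            col_sum[c] = col_sum.get(c, 0) + f
    scaled = {}
    for t, row in counts.items():
        scaled[t] = {}
        for c, f in row.items():
            if f == 0:
                scaled[t][c] = 0.0
            else:
                # log2 of observed frequency over expected frequency, clipped at zero
                pmi = math.log2(f * total / (row_sum[t] * col_sum[c]))
                scaled[t][c] = max(0.0, pmi)
    return scaled

def l2_normalise(row):
    """Row normalisation: rescale a row vector to Euclidean length 1."""
    norm = math.sqrt(sum(v * v for v in row.values()))
    return {c: (v / norm if norm > 0 else 0.0) for c, v in row.items()}

def cosine(u, v):
    """Cosine similarity; for unit-length rows this is just the dot product."""
    return sum(u[c] * v[c] for c in u)

vectors = {t: l2_normalise(row) for t, row in ppmi(counts).items()}
sim_cat_dog = cosine(vectors["cat"], vectors["dog"])
sim_cat_car = cosine(vectors["cat"], vectors["car"])
```

With these toy counts, "cat" comes out closer to "dog" than to "car". Swapping in a different association measure or distance changes only the corresponding function, which is the point of treating them as separate parameters.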
Part 2
Part 2 was not covered in the tutorial session at NAACL-HLT 2010. The extended version of the presentation slides and handout has since been superseded by a five-part tutorial presented at ESSLLI 2016 and 2018.
- Elements of matrix algebra for DSM
  - basic matrix and vector operations
  - norms and distances, angles, orthogonality
  - projection and dimensionality reduction
- Making sense of DSMs: mathematical analysis and visualisation techniques
  - nearest neighbours and clustering
  - semantic maps: PCA, MDS, SOM
  - visualisation of high-dimensional spaces
  - supervised classification based on DSM vectors
- understanding dimensionality reduction with SVD and RI
  - term-term vs. term-context matrix, connection to first-order association
  - SVD as a latent class model
- Current research topics and future directions
  - overview of current research on DSMs
  - evaluation tasks and data sets
  - available "off-the-shelf" DSM software
  - limitations and key problems of DSMs
  - trends for future work
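
As a rough illustration of the random indexing (RI) topic listed above, the following sketch projects a toy term-context matrix onto a small number of dimensions. It is not taken from the tutorial materials. The function names, the toy counts, and the parameter values (`dim`, `nonzero`, `seed`) are hypothetical, and the use of sparse ternary index vectors is one common variant of RI assumed here.

```python
import random

# Toy term-context co-occurrence counts (invented for illustration only).
toy_counts = {
    "cat": {"pet": 8, "purr": 6, "drive": 0, "road": 1},
    "dog": {"pet": 9, "purr": 0, "drive": 1, "road": 2},
    "car": {"pet": 0, "purr": 1, "drive": 9, "road": 7},
}

def index_vector(dim, nonzero, rng):
    """Sparse ternary index vector: `nonzero` random positions set to +1 or -1."""
    vec = [0] * dim
    for pos in rng.sample(range(dim), nonzero):
        vec[pos] = rng.choice((-1, 1))
    return vec

def random_indexing(counts, dim=16, nonzero=4, seed=0):
    """Reduce each term row to `dim` dimensions by summing weighted index vectors."""
    rng = random.Random(seed)  # fixed seed so the projection is reproducible
    contexts = sorted({c for row in counts.values() for c in row})
    index = {c: index_vector(dim, nonzero, rng) for c in contexts}
    reduced = {}
    for term, row in counts.items():
        vec = [0] * dim
        for c, f in row.items():
            # add the context's index vector, weighted by its co-occurrence count
            for i, x in enumerate(index[c]):
                vec[i] += f * x
        reduced[term] = vec
    return reduced

reduced = random_indexing(toy_counts)
```

Unlike SVD, this projection never needs the full matrix in memory: each co-occurrence can be added to the term vector as it is observed. On a matrix this small the reduction is of course pointless; the sketch only shows the mechanics.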