Distributional Semantic Models (NAACL-HLT 2010)

Schedule & handouts

Part 1

Presentation slides (PDF, 1.9 MiB) – handout (PDF, 1.0 MiB)

Introduction
- motivation and brief history of distributional semantics
- common DSM architectures
- prototypical applications

Taxonomy of DSM parameters including
- size and type of context window
- feature scaling (tf.idf, statistical association measures, …)
- normalisation and standardisation of rows and/or columns
- distance/similarity measures: Euclidean, Minkowski p-norms, cosine, entropy-based, …
- dimensionality reduction: feature selection, SVD, random indexing (RI)
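As a rough illustration of how these parameters combine into a pipeline, the following minimal sketch applies feature scaling, row normalisation, a cosine similarity measure and SVD-based dimensionality reduction to a toy matrix. It is not taken from the tutorial materials: the co-occurrence counts, the choice of positive pointwise mutual information (PPMI) as the association measure, and the helper names `ppmi`, `normalise_rows` and `svd_reduce` are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical toy term-context counts (rows: target words, columns: context features).
targets = ["cat", "dog", "car"]
counts = np.array([
    [10.0, 8.0, 0.0, 1.0],
    [9.0, 7.0, 1.0, 0.0],
    [0.0, 1.0, 12.0, 9.0],
])

def ppmi(m):
    """Feature scaling with positive pointwise mutual information (one possible association measure)."""
    total = m.sum()
    row = m.sum(axis=1, keepdims=True)
    col = m.sum(axis=0, keepdims=True)
    expected = row * col / total          # expected counts under independence
    with np.errstate(divide="ignore"):    # zero counts give log2(0) = -inf
        pmi = np.log2(m / expected)
    return np.maximum(pmi, 0.0)           # clip negative and -inf values to 0

def normalise_rows(m):
    """Scale every row vector to unit Euclidean length (all-zero rows are left unchanged)."""
    norms = np.linalg.norm(m, axis=1, keepdims=True)
    return m / np.where(norms == 0.0, 1.0, norms)

def svd_reduce(m, k):
    """Project the rows onto the first k latent dimensions of a truncated SVD."""
    u, s, _ = np.linalg.svd(m, full_matrices=False)
    return u[:, :k] * s[:k]

scaled = normalise_rows(ppmi(counts))
cosine = scaled @ scaled.T           # cosine similarity, since rows have unit length
reduced = svd_reduce(scaled, k=2)    # 3 targets x 2 latent dimensions
```

With these toy counts, "cat" and "dog" share their strongly associated contexts and therefore come out more similar to each other than either is to "car". Swapping in a different association measure, norm or distance changes only the corresponding helper.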

Usage and evaluation of DSMs
- what to do with DSM distances
- attributional vs. relational similarity
- evaluation tasks & results for attributional similarity

Part 2

Part 2 was not covered in the tutorial session at NAACL-HLT 2010. The extended version of the presentation slides and handout has been superseded by a five-part tutorial presented at ESSLLI 2016 and 2018.

Elements of matrix algebra for DSM
- basic matrix and vector operations
- norms and distances, angles, orthogonality
- projection and dimensionality reduction
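The operations listed above can be sketched in a few lines of NumPy. This is a minimal illustration under assumed toy vectors, not an excerpt from the handout; the function names `minkowski_norm`, `angle_degrees` and `project_onto` are hypothetical.

```python
import numpy as np

# Hypothetical two-dimensional example vectors.
u = np.array([3.0, 4.0])
v = np.array([4.0, -3.0])

def minkowski_norm(x, p=2):
    """Minkowski p-norm; p=2 gives the Euclidean norm, p=1 the Manhattan norm."""
    return float(np.sum(np.abs(x) ** p) ** (1.0 / p))

def angle_degrees(x, y):
    """Angle between two non-zero vectors, derived from their cosine."""
    cos = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

def project_onto(x, direction):
    """Orthogonal projection of x onto the line spanned by a non-zero direction vector."""
    d = direction / np.linalg.norm(direction)
    return np.dot(x, d) * d

length = minkowski_norm(u)                             # Euclidean length of u
distance = minkowski_norm(u - v)                       # Euclidean distance between u and v
angle = angle_degrees(u, v)                            # u and v are orthogonal here
shadow = project_onto(u, np.array([1.0, 0.0]))         # keeps only the first coordinate of u
```

Projecting onto a single axis, as in the last line, is the simplest case of dimensionality reduction by projection: the vector is replaced by its component in a lower-dimensional subspace.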

Making sense of DSMs: mathematical analysis and visualisation techniques
- nearest neighbours and clustering
- semantic maps: PCA, MDS, SOM
- visualisation of high-dimensional spaces
- supervised classification based on DSM vectors
- understanding dimensionality reduction with SVD and RI
- term-term vs. term-context matrix, connection to first-order association
- SVD as a latent class model

Current research topics and future directions
- overview of current research on DSMs
- evaluation tasks and data sets
- available "off-the-shelf" DSM software
- limitations and key problems of DSMs
- trends for future work
