Distributional Semantic Models (NAACL-HLT 2010)

Schedule & handouts

Part 1

Presentation slides (PDF, 1.9 MiB) – handout (PDF, 1.0 MiB)

  • Introduction
    • motivation and brief history of distributional semantics
    • common DSM architectures
    • prototypical applications
  • Taxonomy of DSM parameters, including
    • size and type of context window
    • feature scaling (tf.idf, statistical association measures, …)
    • normalisation and standardisation of rows and/or columns
    • distance/similarity measures: Euclidean, Minkowski p-norms, cosine, entropy-based, …
    • dimensionality reduction: feature selection, SVD, random indexing (RI)
  • Usage and evaluation of DSMs
    • what to do with DSM distances
    • attributional vs. relational similarity
    • evaluation tasks & results for attributional similarity
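
As a rough illustration of how several of the parameters listed above fit together, the following minimal sketch counts co-occurrences within a symmetric context window, rescales the counts with positive pointwise mutual information (one possible statistical association measure), and compares rows by cosine similarity. The toy corpus, the window size of 2, and the function names are assumptions made for this example; they are not taken from the tutorial materials.

```python
import numpy as np

# Hypothetical toy corpus, already tokenised.
corpus = [
    "the cat chased the mouse".split(),
    "the dog chased the cat".split(),
    "the mouse ate the cheese".split(),
]

def cooccurrence_matrix(sentences, window=2):
    """Term-term counts within a symmetric window of `window` tokens."""
    vocab = sorted({w for s in sentences for w in s})
    index = {w: i for i, w in enumerate(vocab)}
    M = np.zeros((len(vocab), len(vocab)))
    for s in sentences:
        for i, target in enumerate(s):
            lo, hi = max(0, i - window), min(len(s), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    M[index[target], index[s[j]]] += 1
    return vocab, M

def ppmi(M):
    """Feature scaling: positive pointwise mutual information."""
    total = M.sum()
    row = M.sum(axis=1, keepdims=True)
    col = M.sum(axis=0, keepdims=True)
    expected = row * col / total
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log2(M / expected)
    pmi[~np.isfinite(pmi)] = 0.0  # zero counts contribute nothing
    return np.maximum(pmi, 0.0)   # keep positive associations only

def cosine(u, v):
    """Cosine similarity; defined as 0.0 if either vector is all zeros."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0

vocab, M = cooccurrence_matrix(corpus, window=2)
W = ppmi(M)
cat, dog = vocab.index("cat"), vocab.index("dog")
print("cosine(cat, dog) =", cosine(W[cat], W[dog]))
```

Changing the window size, replacing `ppmi` with another scaling, or replacing `cosine` with a distance measure corresponds to moving along different axes of the parameter taxonomy.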

Part 2

Part 2 was not covered in the tutorial session at NAACL-HLT 2010. The extended version of the presentation slides and handout has since been superseded by a five-part tutorial presented at ESSLLI 2016 and 2018.

  • Elements of matrix algebra for DSM
    • basic matrix and vector operations
    • norms and distances, angles, orthogonality
    • projection and dimensionality reduction
  • Making sense of DSMs: mathematical analysis and visualisation techniques
    • nearest neighbours and clustering
    • semantic maps: PCA, MDS, SOM
    • visualisation of high-dimensional spaces
    • supervised classification based on DSM vectors
    • understanding dimensionality reduction with SVD and RI
    • term-term vs. term-context matrix, connection to first-order association
    • SVD as a latent class model
  • Current research topics and future directions
    • overview of current research on DSMs
    • evaluation tasks and data sets
    • available “off-the-shelf” DSM software
    • limitations and key problems of DSMs
    • trends for future work
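
The matrix-algebra topics in this outline (norms, angles, orthogonality, projection, and nearest neighbours) can be sketched in a few lines of NumPy. This is only an illustrative sketch: the small term-context matrix, the term labels, and the helper names are hypothetical, and the projection uses a plain truncated SVD rather than any specific method from the slides.

```python
import numpy as np

# Hypothetical term-context matrix: 4 terms x 3 context dimensions.
terms = ["cat", "dog", "car", "truck"]
M = np.array([
    [4.0, 3.0, 0.0],
    [3.0, 4.0, 0.0],
    [0.0, 1.0, 5.0],
    [0.0, 0.0, 6.0],
])

def angle_degrees(u, v):
    """Angle between two non-zero vectors, in degrees."""
    cos = u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

def svd_project(M, k):
    """Coordinates of the rows of M in the first k latent SVD dimensions."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U[:, :k] * s[:k]

def nearest_neighbour(X, i):
    """Index of the row most similar to row i by cosine (excluding i itself)."""
    norms = np.linalg.norm(X, axis=1)
    sims = (X @ X[i]) / (norms * norms[i])
    sims[i] = -np.inf
    return int(np.argmax(sims))

X2 = svd_project(M, k=2)  # dimensionality reduction from 3 to 2
for i, t in enumerate(terms):
    print(t, "->", terms[nearest_neighbour(X2, i)])
```

Keeping all latent dimensions (here `k=3`) leaves every inner product between rows unchanged, so only the truncation step discards information.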