Distributional Semantics – A Practical Introduction (ESSLLI 2016 & 2018)

  • extended version of part 2 practice: input formats (26.05.2019)

Schedule & handouts

Day 1: Introduction

Presentation slides (PDF, 1.1 MB) – handout (PDF, 0.9 MB) – R code

  • motivation and brief history of distributional semantics
  • common DSM architectures & prototypical applications
  • first practical exercises with the wordspace package

Day 2: The parameters of a DSM

Presentation slides (PDF, 1.3 MB) – handout (PDF, 1.0 MB) – R code – practice: input formats – exercise (DSM parameters)

  • taxonomy of DSM parameters: context representation, feature scaling, normalization and standardization, distance/similarity measures, dimensionality reduction
  • overview of common parameter settings & best-practice recommendations
  • practical exercises: building DSMs and exploring their parameters
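
Two of the parameters listed above, feature scaling and the similarity measure, can be illustrated with a minimal sketch. It is written in plain Python rather than with the wordspace package used in the course, and it is not taken from the course materials. The co-occurrence counts, the choice of positive pointwise mutual information (PPMI) as the scaling function, and the helper names `ppmi` and `cosine` are all assumptions made for illustration.

```python
import math

# Hypothetical toy co-occurrence counts (target word -> context feature -> count).
counts = {
    "cat": {"pet": 8, "eat": 4, "drive": 0},
    "dog": {"pet": 9, "eat": 5, "drive": 1},
    "car": {"pet": 0, "eat": 1, "drive": 9},
}

def ppmi(counts):
    """Feature scaling: replace raw counts with positive pointwise mutual information."""
    total = sum(sum(row.values()) for row in counts.values())
    row_sums = {target: sum(row.values()) for target, row in counts.items()}
    col_sums = {}
    for row in counts.values():
        for feature, n in row.items():
            col_sums[feature] = col_sums.get(feature, 0) + n
    weighted = {}
    for target, row in counts.items():
        weighted[target] = {}
        for feature, n in row.items():
            if n == 0:
                weighted[target][feature] = 0.0
            else:
                # Expected count if target and feature were independent.
                expected = row_sums[target] * col_sums[feature] / total
                weighted[target][feature] = max(0.0, math.log2(n / expected))
    return weighted

def cosine(u, v):
    """Similarity measure: cosine of the angle between two feature vectors."""
    dot = sum(u[feature] * v[feature] for feature in u)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

model = ppmi(counts)
sim_cat_dog = cosine(model["cat"], model["dog"])
sim_cat_car = cosine(model["cat"], model["car"])
```

With these toy counts, "cat" comes out closer to "dog" than to "car". Swapping `ppmi` for a different scaling function, or `cosine` for a distance measure, is one way to explore how such parameter choices change the neighbourhood structure.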

Day 3: Applications and evaluation

Presentation slides (PDF, 2.0 MB) – handout (PDF, 1.8 MB) – R code – exercise (evaluation)

  • attributional and relational similarity, clustering and semantic categorization, multiple-choice tasks
  • insights from recent parameter evaluation studies
  • practical exercises: implementation and evaluation of selected tasks
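
As a rough sketch of how a multiple-choice task can be evaluated, the plain-Python example below picks, for each target word, the answer candidate with the highest cosine similarity and reports accuracy against a gold answer. It is not the course's R code. The word vectors, the test items, and the helper names `cosine` and `answer` are hypothetical; a real evaluation would take its vectors from a trained DSM and its items from a published test set.

```python
import math

# Hypothetical 3-dimensional word vectors; a real DSM would supply these.
vectors = {
    "big":   [0.9, 0.1, 0.0],
    "large": [0.8, 0.2, 0.1],
    "quick": [0.1, 0.9, 0.0],
    "fast":  [0.2, 0.8, 0.1],
    "green": [0.0, 0.1, 0.9],
}

def cosine(u, v):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def answer(target, choices, vectors):
    """Select the candidate whose vector is most similar to the target's vector."""
    return max(choices, key=lambda c: cosine(vectors[target], vectors[c]))

# Hypothetical test items: (target, answer candidates, gold answer).
items = [
    ("big", ["large", "fast", "green"], "large"),
    ("quick", ["green", "fast", "large"], "fast"),
]

correct = sum(answer(target, choices, vectors) == gold
              for target, choices, gold in items)
accuracy = correct / len(items)
```

Accuracy is the usual score for this kind of task; with only a handful of items it is of course not a meaningful estimate of model quality.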

Day 4: Elements of matrix algebra

Presentation slides (PDF, 0.7 MB) – handout (PDF, 0.6 MB) – R code – bonus practice: Schütze-style WSD – exercise (roll your own DSM)

  • basic matrix and vector operations, orthogonal projection & dimensionality reduction
  • singular value decomposition (SVD)
  • practical exercises: roll your own DSM with matrix operations
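
The connection between orthogonal projection, dimensionality reduction and SVD can be sketched on a toy matrix. The plain-Python example below is not from the course materials and does not compute a full SVD. It uses power iteration on A^T A to approximate only the largest singular value and its singular vectors, then projects each row onto that single direction. The matrix `M` and all function names are hypothetical.

```python
import math

# Hypothetical toy matrix: rows are target words, columns are features.
M = [
    [3.0, 1.0, 0.0],
    [2.0, 2.0, 0.0],
    [0.0, 1.0, 4.0],
]

def matvec(A, x):
    """Multiply matrix A (a list of rows) by vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def transpose(A):
    """Return the transpose of A as a list of rows."""
    return [list(col) for col in zip(*A)]

def norm(x):
    """Euclidean length of vector x."""
    return math.sqrt(sum(value * value for value in x))

def top_singular_triplet(A, iterations=200):
    """Approximate the largest singular value of A and its singular vectors."""
    At = transpose(A)
    # Start vector; it must not be orthogonal to the leading right singular vector.
    v = [1.0] * len(A[0])
    for _ in range(iterations):
        w = matvec(At, matvec(A, v))  # one power-iteration step on A^T A
        length = norm(w)
        v = [value / length for value in w]
    Av = matvec(A, v)
    sigma = norm(Av)
    u = [value / sigma for value in Av]
    return u, sigma, v

u, sigma, v = top_singular_triplet(M)

# Orthogonal projection of every row onto direction v: one coordinate per row.
coords = matvec(M, v)

# Rank-1 reconstruction sigma * u * v^T and its squared Frobenius error.
approx = [[sigma * ui * vj for vj in v] for ui in u]
error = sum((m - a) ** 2
            for row_m, row_a in zip(M, approx)
            for m, a in zip(row_m, row_a))
```

The squared reconstruction error equals the squared Frobenius norm of `M` minus `sigma ** 2`, which is why keeping the directions with the largest singular values loses the least information. In practice a library SVD routine would replace the hand-written iteration.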

Day 5: Making sense of DSMs

  • mathematical properties of and relations between different types of DSM
  • singular value decomposition (SVD) as a latent class model
  • comparison with neural vector embeddings