Courses and Tutorials on DSM
ESSLLI '09 – NAACL-HLT 2010 – Downloads & Links – Bibliography
Online access (Web interfaces)
- Web interface for several pre-trained Infomap models (CIMeC, U Trento)
- Explore a German LSA space (CogSci, U Osnabrück)
Off-the-shelf packages for DSM
- HiDEx, the High-Dimensional Explorer
- S-Space Package (work in progress)
- Wordspaces (interactive exploration)
- Divisi (semantic networks, tensors & SVD in Python)
Downloads
Data sets
- Verb + object noun co-occurrences (tokens) extracted from the British National Corpus: bnc_vobj_filtered.txt.gz (15 MB)
- A 5-million-word corpus of Harry Potter fan fiction in lemma_pos format (pre-cleaned): potter_tokens.txt.gz (8.9 MB)
- NEW: DSM for 34,150 English nouns from the 2-billion-word ukWaC corpus: ukwac_vobj_S_svd.rda (158 MB)
- based on verb-object co-occurrences; features are 3,371 frequent verbs, scored with log-scaled t-score and reduced to 300 SVD dimensions
- nearest-neighbour demo with visualisation: neighbour_demo.R
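
For readers who want to see the idea behind a verb-object DSM before downloading the data, here is a minimal sketch in plain Python. It is not a port of neighbour_demo.R and does not read any of the files above: the toy co-occurrence pairs are invented, the function names (`t_score_vectors`, `cosine`, `nearest_neighbours`) are hypothetical, and the exact log-scaling used for the ukWaC model is not documented here, so `log(1 + t)` with negative scores clamped to zero is only an assumption. The SVD step is left out.

```python
import math
from collections import Counter

# Toy (verb, object noun) co-occurrence tokens, invented for illustration only.
pairs = [
    ("drink", "tea"), ("drink", "tea"), ("drink", "coffee"), ("drink", "coffee"),
    ("brew", "tea"), ("brew", "coffee"),
    ("read", "book"), ("read", "book"), ("read", "novel"), ("read", "novel"),
    ("write", "book"), ("write", "novel"),
    ("buy", "tea"), ("buy", "coffee"), ("buy", "book"), ("buy", "novel"),
]

def t_score_vectors(pairs):
    """Build one vector per noun; each dimension is a verb feature."""
    f_pair = Counter(pairs)
    f_verb = Counter(verb for verb, _ in pairs)
    f_noun = Counter(noun for _, noun in pairs)
    total = len(pairs)
    verbs = sorted(f_verb)
    vectors = {}
    for noun in f_noun:
        row = []
        for verb in verbs:
            observed = f_pair[(verb, noun)]
            expected = f_verb[verb] * f_noun[noun] / total
            # t-score association measure: (O - E) / sqrt(O)
            t = (observed - expected) / math.sqrt(observed) if observed else 0.0
            # Assumed scaling: clamp negative scores to zero, then log(1 + t).
            row.append(math.log(1.0 + max(t, 0.0)))
        vectors[noun] = row
    return verbs, vectors

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def nearest_neighbours(target, vectors, k=3):
    """Return the k nouns most similar to target, best first."""
    scored = [
        (noun, cosine(vectors[target], vec))
        for noun, vec in vectors.items()
        if noun != target
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]

verbs, vectors = t_score_vectors(pairs)
for noun, similarity in nearest_neighbours("tea", vectors, k=2):
    print(noun, round(similarity, 3))
```

In this toy data "tea" and "coffee" occur with the same verbs, so they come out as nearest neighbours, while "tea" and "book" share only the uninformative verb "buy", whose score is zero. A real model such as the ukWaC one would additionally reduce the weighted matrix with SVD before computing similarities.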