====== Courses and Tutorials on DSM ======

[[course:esslli2009:start|ESSLLI '09]] – [[course:acl2010:start|NAACL-HLT 2010]] – **Downloads & Links** – [[course:bibliography|Bibliography]]

===== Online access (Web interfaces) =====

  * Web interface for several pre-trained [[http://clic.cimec.unitn.it/infomap-query/|Infomap models]] (CIMeC, U Trento)
  * Explore a [[http://www.cogsci.uni-osnabrueck.de/~korpora/ws/cgi-bin/HIT/LSA_NN.perl|German LSA space]] (CogSci, U Osnabrück)

===== Off-the-shelf packages for DSM =====

  * [[http://infomap-nlp.sourceforge.net/|Infomap NLP]]
  * [[http://www.psych.ualberta.ca/~westburylab/downloads/HiDEx.download.html|HiDEx]], the High-Dimensional Explorer
  * [[http://code.google.com/p/semanticvectors|Semantic Vectors]]
  * [[http://senseclusters.sourceforge.net/|SenseClusters]]
  * [[http://code.google.com/p/airhead-research/|S-Space Package]] (work in progress)
  * [[http://code.google.com/p/wordspaces/|Wordspaces]] (interactive exploration)
  * [[http://divisi.media.mit.edu/|Divisi]] (semantic networks, tensors & SVD in Python)

===== Downloads =====

==== Data sets ====

  * Verb + object noun co-occurrences (tokens) extracted from the British National Corpus: [[http://www.collocations.de/data/bnc_vobj_filtered.txt.gz|bnc_vobj_filtered.txt.gz]] (15 MB)
  * A 5-million-word corpus of Harry Potter fan fiction in //lemma//''_''//pos// format (pre-cleaned): [[http://www.collocations.de/data/potter_tokens.txt.gz|potter_tokens.txt.gz]] (8.9 MB)
  * **NEW:** DSM for 34,150 English nouns from the 2-billion-word ukWaC corpus: [[http://www.collocations.de/data/ukwac_vobj_S_svd.rda|ukwac_vobj_S_svd.rda]] (158 MB)
    * verb-object co-occurrences; features are 3,371 frequent verbs; log-scaled t-score weighting; 300 SVD dimensions
    * nearest-neighbour demo with visualisation: [[http://wordspace.collocations.de/lib/exe/fetch.php/course:neighbour_demo.r|neighbour_demo.R]]