DSM Software and Data Sets
Useful corpora
- The Westbury Lab at the University of Alberta offers a preprocessed (cleaned) Wikipedia corpus built from an April 2010 dump. The WaCky initiative offers WaCkypedia, a dependency-parsed Wikipedia corpus built from a 2009 dump. Both corpora cover only the English Wikipedia.
Off-the-shelf packages for DSM
- GenSim: incremental SVD and LSA in Python, easily deployable to clusters.
- HiDEx, the High-Dimensional Explorer
- S-Space Package (work in progress)
- Wordspaces (interactive exploration)
- Divisi (semantic networks, tensors & SVD in Python)