Differences

This shows you the differences between two versions of the page.

software:start [2010/11/01 14:30]
schtepf [DSM Software and Data Sets]
software:start [2011/02/02 09:25]
schtepf [Off-the-shelf packages for DSM]
Line 7: Line 7:
 \\
 \\
 +
 +===== Useful corpora =====
 +
 +  * The Westbury Lab at Alberta has a [[http://www.psych.ualberta.ca/~westburylab/downloads/westburylab.wikicorp.download.html|preprocessed (cleaned) Wikipedia Corpus]] from an April 2010 dump.  The WaCky initiative offers [[http://wacky.sslmit.unibo.it/doku.php?id=corpora|WaCkypedia, a dependency-parsed Wikipedia Corpus]] from a 2009 dump.  Both corpora only cover the //English Wikipedia//.
  
 ===== Off-the-shelf packages for DSM =====
  
 +  * [[GenSim]]: incremental SVD & LSA in python, easily deployable to clusters.
   * [[http://infomap-nlp.sourceforge.net/|Infomap NLP]]
     * [[rewInfoMap|Review]]
Line 17: Line 22:
     * [[rewSemVector|Review]]
   * [[http://senseclusters.sourceforge.net/|SenseClusters]]
 +    * [[rewSenseClusters|Review]]
   * [[http://code.google.com/p/airhead-research/|S-Space Package]] (work in progress)
 +    *[[rewSSpacePackage|Review]]
   * [[http://code.google.com/p/wordspaces/|Wordspaces]] (interactive exploration)
-  * [[http://divisi.media.mit.edu/|Divisi]] (semantic networks, tensors & SVD in Python)
 +    * [[rewWordSpaces|Review]]
 +  * [[http://csc.media.mit.edu/docs/divisi2|Divisi]] (semantic networks, tensors & SVD in Python)
 +    * [[rewDivisi2|Review]] 
 +  * [[miscellaneous|Miscellaneous]] 
 +  * [[http://scgroup20.ceid.upatras.gr:8000/tmg/|Text to Matrix Generator (TMG)]] (Matlab toolbox for text mining)