General
- Developed at UCLA.
- A set of Java libraries.
- It is not finished, but it is not dead code either.
- There is rich documentation covering both the algorithms and the implementation.
- Since it is a collection of algorithms, we have to decide which ones we actually need.
- "The focus of this framework is to ease the development of new algorithms and the comparison against existing models." (Jurgens, Stevens).
- "Each word space algorithm is designed to run as a stand alone program and also to be used as a library class." (Jurgens, Stevens).
- The library supports word-document vectors.
- The authors state that it can collect more than one context vector for a single word, depending on its sense (e.g. "bank" as an institution and "bank" as a place to sit, a bench).
- "Libraries provide support for converting between multiple matrix formats, enabling interaction with external matrix-based programs".
- Dimensionality reduction via SVD and randomized projections.
- Judging from the figures, most of the algorithms appear to scale roughly linearly.
- The package consists of four types of tools:
- A library (implementation) of commonly used semantic space algorithms.
- Tools for building semantic models.
- Evaluation tools (e.g. TOEFL test for synonyms).
- Interaction tools (e.g. queries).
Installation
- Required Software
- svn (Subversion). It can be installed with apt-get:
sudo apt-get install subversion
- To install the package, go to a target directory. The authors recommend using the following command:
svn checkout http://airhead-research.googlecode.com/svn/trunk/sspace sspace-read-only
- A new directory should have been created. Change into it and run:
ant
Ant is an Apache project used to build Java libraries. It automatically picks up the file build.xml in that directory and builds from it. Installing Ant is described elsewhere in these notes.
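The installation steps above can be collected into one small script. This is only a sketch: the repository URL and directory name are the ones given in these notes, while the helper names (have, missing_tools, install_sspace) and the RUN_INSTALL switch are made up for illustration. It assumes a POSIX shell and does nothing unless RUN_INSTALL=1 is set.

```shell
#!/bin/sh
# Sketch of the installation steps from these notes.
# Helper names and the RUN_INSTALL switch are hypothetical.

REPO_URL="http://airhead-research.googlecode.com/svn/trunk/sspace"
TARGET_DIR="sspace-read-only"

# Succeeds if the given command is available on PATH.
have() {
    command -v "$1" >/dev/null 2>&1
}

# Prints, one per line, the names of the given tools that are not installed.
missing_tools() {
    for tool in "$@"; do
        have "$tool" || printf '%s\n' "$tool"
    done
}

# Checks the prerequisites, checks out the sources and builds them with ant.
install_sspace() {
    missing=$(missing_tools svn ant)
    if [ -n "$missing" ]; then
        echo "Missing required tools:" $missing >&2
        return 1
    fi
    svn checkout "$REPO_URL" "$TARGET_DIR" &&
        cd "$TARGET_DIR" &&
        ant
}

# Only run the installation when explicitly requested.
if [ "${RUN_INSTALL:-0}" = "1" ]; then
    install_sspace
fi
```

Saved under a name of your choice (here install-sspace.sh, also just an example), it would be started with RUN_INSTALL=1 sh install-sspace.sh from the target directory. If svn or ant is missing, it reports which one and stops before checking anything out.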