General

  • Developed at UCLA.
  • A set of Java libraries.
  • It is not finished, but it is not dead code either.
  • There is rich documentation on both the algorithms and the implementation.
  • Since it is a collection of algorithms, we have to decide which ones we actually need.
  • "The focus of this framework is to ease the development of new algorithms and the comparison against existing models." (Jurgens, Stevens).
  • "Each word space algorithm is designed to run as a stand alone program and also to be used as a library class." (Jurgens, Stevens).
  • The library supports word-document vectors.
  • The authors state that it can collect more than one context vector for a single word, depending on its meaning (e.g. the German "Bank" as a financial institution and as a bench, "Sitzgelegenheit" :-)).
  • "Libraries provide support for converting between multiple matrix formats, enabling interaction with external matrix-based program".
  • Dimensionality reduction via SVD and randomized projections.
  • Judging from the plots, the running time of most of the algorithms seems to grow linearly with the corpus size.
  • The package consists of four types of tools:
    • A library (implementation) of commonly used algorithms in semantic spaces.
    • Tools for building semantic models
    • Evaluation tools (e.g. TOEFL test for synonyms).
    • Interaction tools (e.g. queries, etc.).
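
To make the "word-document vectors" note above concrete, here is a minimal sketch in plain Java. It does not use the S-Space API: the class WordDocumentSketch and its methods are hypothetical names made up for this page. It assumes whitespace tokenization and raw counts, and compares two words with cosine similarity.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration class; NOT part of the S-Space package. */
public class WordDocumentSketch {

    /** Builds one count vector per word; entry d is the word's frequency in document d. */
    public static Map<String, double[]> buildVectors(List<String> documents) {
        Map<String, double[]> vectors = new LinkedHashMap<>();
        int numDocs = documents.size();
        for (int d = 0; d < numDocs; d++) {
            // Assumption: lower-casing and whitespace splitting are enough as tokenization.
            for (String token : documents.get(d).toLowerCase().split("\\s+")) {
                if (token.isEmpty()) {
                    continue;
                }
                vectors.computeIfAbsent(token, k -> new double[numDocs])[d] += 1.0;
            }
        }
        return vectors;
    }

    /** Cosine similarity of two equal-length vectors; returns 0 if either is all zeros. */
    public static double cosine(double[] a, double[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
                "the bank lends money",
                "the bank holds money",
                "we sat on the bench");
        Map<String, double[]> v = buildVectors(docs);
        // "bank" and "money" occur in the same documents, "bank" and "bench" never do.
        System.out.println("bank~money: " + cosine(v.get("bank"), v.get("money")));
        System.out.println("bank~bench: " + cosine(v.get("bank"), v.get("bench")));
    }
}
```

Words that occur in the same documents get a similarity close to 1, and words that never share a document get 0. A real semantic space would add weighting (e.g. tf-idf) and dimensionality reduction on top of these raw counts.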

Installation

  • Required Software
    • svn (Subversion). It can be installed with an apt-get command:
      sudo apt-get install subversion
    • To install the package, go to a target directory. The authors recommend using the following command:
      svn checkout http://airhead-research.googlecode.com/svn/trunk/sspace sspace-read-only
    • A new directory should have been created. Go to that directory and run the command
      ant
      Ant is part of the Apache project and is used to build Java libraries. It automatically detects the file build.xml and builds from it. I explained elsewhere how to install ant.