This page is under construction!
General
- Infomap NLP Software: no longer in development. The authors recommend using SemanticVectors instead.
- Uses Latent Semantic Analysis
- The implementation is in C.
- Documentation: http://infomap-nlp.sourceforge.net/doc/
- Infomap is intended to build "language models" and to perform information retrieval tasks on such models.
- Simple input format
- You might need the gdbm libraries. I had trouble installing these libraries on my laptop. At present it is not working.
- The documentation includes installation instructions, an algorithm description and an implementation guide.
— Eduardo Aponte 2010/10/31 12:28
Installation
- Before installing Infomap you have to install the gdbm libraries on your computer. This can be quite challenging. Below I document the installation process I followed.
- As a first step, download the latest version of gdbm.
- Untar the .gz file and go into the created directory.
- Try:
./configure
This command tries to configure the program for your system. This step is highly likely to fail. The most likely reason is that the installed version of a system tool called libtool is not compatible. To check your version of this program (in Ubuntu):
apt-cache policy libtool
I assume you have libtool installed on your computer. You probably have a newer version of libtool than the one the gdbm package expects. The solution I found was to run:
autoconf -f -oconfigure
- This command overwrote all the libtool-related files in the directory. After running ./configure again, you can run
make
safely. If you get the following error (which is actually highly unlikely)
checking build system type... Invalid configuration `x86_64-unknown-linux-gnu': machine `x86_64-unknown' not recognized
you will need to deceive the program. Put the following before each command:
linux32
- You might also have problems with the ANSI C headers. To solve this problem, run:
sudo apt-get install libc6-dev
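The configure step and its fallback described above can be collected into one small shell function. This is only a sketch of the sequence I followed, not an official gdbm build recipe. The function name configure_with_fallback is made up for illustration, and it assumes that autoconf is installed and that regenerating the configure script is enough on your system.

```shell
# Hypothetical helper: try ./configure in the given source directory and,
# if it fails, regenerate the configure script with autoconf and retry.
configure_with_fallback() {
    src_dir=$1
    (
        cd "$src_dir" || exit 1
        # First attempt with the configure script shipped in the tarball.
        ./configure && exit 0
        # Fallback taken from the notes above: regenerate configure, then retry.
        autoconf -f -oconfigure || exit 1
        ./configure
    )
}
```

Call it with the path of the untarred gdbm directory, and run make in that directory only if the function returns successfully.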
Testing
The first step in building a model is to choose a directory where the models will be created. This is done by setting an environment variable:
INFOMAP_WORKING_DIR=/home/jrandom/infomap_models
export INFOMAP_WORKING_DIR
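A slightly more defensive variant of this step might look as follows. It is a sketch under two assumptions of mine: that the working directory should already exist before a model is built, and that $HOME/infomap_models is an acceptable default (that path is only an example).

```shell
# Keep an already-set value; otherwise fall back to an example location.
INFOMAP_WORKING_DIR=${INFOMAP_WORKING_DIR:-$HOME/infomap_models}
export INFOMAP_WORKING_DIR

# Assumption: the directory must exist, so create it if it is missing.
mkdir -p "$INFOMAP_WORKING_DIR"

# Warn early if models could not be written there.
if [ ! -w "$INFOMAP_WORKING_DIR" ]; then
    echo "Cannot write to $INFOMAP_WORKING_DIR" >&2
fi
```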
Afterwards, build the model. Infomap accepts two input formats: a single file in which documents are separated by XML markers, or a set of files, where every file contains exactly one document. I decided to use the second option. In that case the input is a file listing the names of the files that each contain a document.
infomap-build -m /usr/local/share/corpora/manyNames.txt many_01
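The list file passed with -m can be generated rather than written by hand. The sketch below assumes that the list holds one file path per line and that the documents are plain .txt files. Both are my assumptions, so check them against the Infomap documentation. The temporary directory and the two sample documents are stand-ins for a real corpus.

```shell
# Stand-in corpus: a temporary directory with two tiny documents.
corpus_dir=$(mktemp -d)
printf 'first sample document\n' > "$corpus_dir/doc1.txt"
printf 'second sample document\n' > "$corpus_dir/doc2.txt"

# Write the list outside the corpus directory so it does not list itself.
list_file=$(mktemp)

# Assumption: one absolute path per line, sorted for a stable order.
find "$corpus_dir" -type f -name '*.txt' | sort > "$list_file"

# The list would then be passed to the build command, for example:
# infomap-build -m "$list_file" many_01
```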