This page is under construction!

### General

• Infomap NLP Software: no longer in development. The authors recommend using SemanticVectors instead.
• Uses Latent Semantic Analysis
• The implementation is in C.
• Infomap is intended to build language models and to perform information retrieval tasks on those models.
• Simple input format
• You might need the gdbm libraries. I had trouble installing these libraries on my laptop. At present it is not working.
• The documentation includes installation instructions, an algorithm description and an implementation guide.

Eduardo Aponte 2010/10/31 12:28

### Installation

• Before installing Infomap you have to install the gdbm libraries on your computer. This can be quite challenging. Below I document the installation process I followed.
1. As a first step, download the latest version of gdbm.
2. Untar the .gz file and change into the newly created directory.
3. Try:
./configure

This command tries to configure the program for your system. It is highly likely that this step fails. The most likely reason is that the installed version of a system tool called libtool is incompatible. To check your version of this program (on Ubuntu):

apt-cache policy libtool

I assume you have libtool installed on your computer. You probably have a newer version of libtool than the one the gdbm package expects. The solution I found was to run:

autoconf -f -oconfigure
4. The previous command overwrote all the libtool-related files in the directory. Now you can safely run
make

If you get the following error (which is actually highly unlikely)

checking build system type... Invalid configuration `x86_64-unknown-linux-gnu': machine `x86_64-unknown' not recognized

you will need to make the build scripts believe they are running on a 32-bit machine. To do so, prefix each command with:

linux32
5. You might also have problems with the ANSI C headers. To solve this problem, run:
sudo apt-get install libc6-dev

### Testing

The first step in building a model is to choose a directory where the models will be created. This is done by setting an environment variable:

INFOMAP_WORKING_DIR=/home/jrandom/infomap_models
export INFOMAP_WORKING_DIR

Afterwards, build the model. Infomap accepts two input formats: a single file in which documents are separated by XML markers, or a set of files in which every file contains exactly one document. I decided to use the second option. In that case the input is a file listing the names of the files, each of which contains one document.

infomap-build -m /usr/local/share/corpora/manyNames.txt many_01

Remember to add infomap to your PATH variable.
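
Both prerequisites (the working directory variable and the PATH entry) can be checked from a script before a build is launched. The following Python sketch is only an illustration: the helper name `prepare_build` is hypothetical, creating the working directory in advance is an assumption, and the command it assembles is just the one shown above, not a description of Infomap's full interface.

```python
import os
import shutil


def prepare_build(file_list, model_name, working_dir):
    """Return (command, environment) for an infomap-build run.

    Hypothetical helper: it mirrors the shell steps above and does not
    start the build itself.
    """
    # Assumption: the working directory should exist before the build.
    os.makedirs(working_dir, exist_ok=True)

    # Copy the current environment and add the variable Infomap reads.
    env = dict(os.environ)
    env["INFOMAP_WORKING_DIR"] = working_dir

    # shutil.which returns None when the program is not found on PATH.
    if shutil.which("infomap-build", path=env.get("PATH")) is None:
        print("warning: infomap-build was not found on PATH")

    # Same form as the command above: -m, then the file list, then the model name.
    command = ["infomap-build", "-m", file_list, model_name]
    return command, env
```

The returned pair could then be handed to `subprocess.run(command, env=env, check=True)` to start the build.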

In the corpora directory you will find a simple Python script for building a corpus from a file in which every line is a document. Afterwards I used the following command:

infomap-build -m /net/data/CL/projects/wordspace/software_tests/corpora/infoCorpus/directory.txt firstModel

directory.txt is a file containing the name of every file that contains a document.
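
The script from the corpora directory is not reproduced here. As a rough sketch of the same idea, assuming a plain UTF-8 text file with one document per line, something along these lines would produce the per-document files and the list file. The function name `split_corpus`, the `doc_00001.txt` naming scheme and the choice to write absolute paths are all assumptions made for illustration.

```python
import os


def split_corpus(source_path, out_dir, list_name="directory.txt"):
    """Write each non-empty line of source_path to its own file.

    Returns the path of the list file, which names one document file per line.
    """
    os.makedirs(out_dir, exist_ok=True)
    list_path = os.path.join(out_dir, list_name)

    with open(source_path, encoding="utf-8") as source, \
            open(list_path, "w", encoding="utf-8") as listing:
        count = 0
        for line in source:
            text = line.strip()
            if not text:
                continue  # skip blank lines, they are not documents
            count += 1
            doc_path = os.path.abspath(
                os.path.join(out_dir, "doc_%05d.txt" % count))
            # One document per file.
            with open(doc_path, "w", encoding="utf-8") as doc:
                doc.write(text + "\n")
            # Record the file name in the list file.
            listing.write(doc_path + "\n")

    return list_path
```

The returned path is the file that would be passed to `infomap-build -m`.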