course:material [2021/08/07 18:00] schtepf [Software for the course] |
course:material [2022/08/07 18:46] (current) schtepf [Software for the course] |
* ''sparsesvd'' (v0.2) | * ''sparsesvd'' (v0.2) |
* ''wordspace'' (v0.2-6) | * ''wordspace'' (v0.2-6) |
* recommended: ''e1071'', ''text2vec'', ''Rtsne'', ''uwot'' | * recommended: ''e1071'', ''rsparse'', ''Rtsne'', ''uwot'' |
* optional: ''tm'', ''quanteda'', ''data.table'', ''wordcloud'', ''shiny'', ''spacyr'', ''udpipe'', ''coreNLP'' (don't worry if some of these fail to install) | * optional: ''tm'', ''quanteda'', ''data.table'', ''wordcloud'', ''shiny'', ''spacyr'', ''udpipe'', ''coreNLP'' (don't worry if some of these fail to install) |
| * optional: ''NMF'' (also install ''BiocManager'', then run command ''BiocManager::install("Biobase")'') |
- During the course, you will be asked to install a further package with additional evaluation tasks (''wordspaceEval'') from a password-protected Web page: | - During the course, you will be asked to install a further package with additional evaluation tasks (''wordspaceEval'') from a password-protected Web page: |
* ''wordspaceEval'' v0.2: [[http://www.collocations.de/data/protected/wordspaceEval_0.2.tar.gz|Source/Linux]] – [[http://www.collocations.de/data/protected/wordspaceEval_0.2.tgz|MacOS]] – [[http://www.collocations.de/data/protected/wordspaceEval_0.2.zip|Windows]] (login required) | * ''wordspaceEval'' v0.2: [[http://www.collocations.de/data/protected/wordspaceEval_0.2.tar.gz|Source/Linux]] – [[http://www.collocations.de/data/protected/wordspaceEval_0.2.tgz|MacOS]] – [[http://www.collocations.de/data/protected/wordspaceEval_0.2.zip|Windows]] (login required) |
- Download one or more of the pre-compiled DSMs listed below | - Download one or more of the pre-compiled DSMs listed below |
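The package installation step above can be scripted. The following is a minimal sketch, assuming the required and recommended packages visible in the current revision of the list are all available from CRAN. The helper ''missing_packages()'' is a hypothetical name introduced here for illustration, not part of any package.

```r
# Required and recommended packages named in the list above (current revision)
pkgs <- c("sparsesvd", "wordspace", "e1071", "rsparse", "Rtsne", "uwot")

# Hypothetical helper: names from `pkgs` that are not yet in the local library
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

todo <- missing_packages(pkgs)
if (length(todo) > 0) {
  message("Not yet installed: ", paste(todo, collapse = ", "))
  # Only install when run interactively, so that sourcing the script has no side effects
  if (interactive()) install.packages(todo)
} else {
  message("All listed packages are installed.")
}
```

The optional packages can be handled the same way by extending ''pkgs''; since some of them may fail to install, it is probably easier to try them one at a time.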
| |
/* -- doesn't apply at the moment -- | ===== Scaling R to large data sets ===== |
| |
| Most of our hands-on examples work reasonably well in a standard R installation, even on a moderately powerful laptop computer. |
| However, if you intend to work on real-life tasks and process large DSMs, it is important to enable multi-threaded computation |
| in R. Since DSMs build on matrix operations, a multi-threaded linear algebra library (“BLAS”) is key. |
| |
| - In Linux, it should be sufficient to install the OpenBLAS package, e.g. in Ubuntu: ''sudo apt install libopenblas-dev'' |
| - In MacOS, follow [[https://groups.google.com/g/r-sig-mac/c/YN6uNYCIZK0|these instructions]] to enable the vecLib BLAS built into MacOS. You may also want to [[https://mac.r-project.org/openmp/|enable OpenMP]] for an additional speed boost on expensive distance metrics (but this is less important). |
| - In Windows, you can try installing [[https://mran.microsoft.com/open|Microsoft R Open]] or do a Web search for alternative solutions. |
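To check whether one of the steps above had any effect, it can help to look at which BLAS library R is actually linked against and to time a BLAS-heavy operation. The following is a rough sketch using only base R. The matrix size is an arbitrary assumption, and the timing is only indicative, not a proper benchmark.

```r
# Report which BLAS / LAPACK libraries this R session is linked against
# (the BLAS entry may be empty on platforms where R cannot determine it)
cat("BLAS:  ", extSoftVersion()["BLAS"], "\n")
cat("LAPACK:", La_library(), "\n")

# Rough timing of a dense cross-product, which is dominated by BLAS calls.
# The matrix dimensions are arbitrary and only chosen to run for a noticeable time.
set.seed(42)
M <- matrix(rnorm(2000 * 500), nrow = 2000)
timing <- system.time(G <- crossprod(M))
print(timing)
```

With a multi-threaded BLAS, the elapsed time is typically noticeably smaller than with the reference BLAS, and the reported user time may exceed the elapsed time because several cores work in parallel.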
| |
| |
| <!-- doesn't apply at the moment -- |
==== Getting the latest & greatest ==== | ==== Getting the latest & greatest ==== |
| |
You can also check the [[http://wordspace.r-forge.r-project.org/|wordspace homepage]] for new releases and installation instructions. | You can also check the [[http://wordspace.r-forge.r-project.org/|wordspace homepage]] for new releases and installation instructions. |
| |
*/ | --> |
| |
===== Example data sets ===== | ===== Example data sets ===== |
* ''[[http://www.collocations.de/data/potter_l2r2.txt.gz|potter_l2r2.txt.gz]]'' (51.3 MB) | * ''[[http://www.collocations.de/data/potter_l2r2.txt.gz|potter_l2r2.txt.gz]]'' (51.3 MB) |
* ''[[http://www.collocations.de/data/potter_lemmas.txt.gz|potter_lemmas.txt.gz]]'' (1.1 MB) | * ''[[http://www.collocations.de/data/potter_lemmas.txt.gz|potter_lemmas.txt.gz]]'' (1.1 MB) |
| * ''[[http://www.collocations.de/data/VSS.txt|VSS.txt]]'' (37 kB) |
| |
===== Pre-compiled DSMs ===== | ===== Pre-compiled DSMs ===== |