===== Ordered by task categories =====
  
==== Task 1: Free Association ====
  
It is tempting to make a connection between the **statistical association** patterns of words -- both first-order associations (//collocations//) and higher-order associations (//word space//) -- and **human free associations**, i.e. the first words that come to mind when native speakers are presented with a stimulus word. In this task, we will explore to what extent such free associations can be explained and predicted by statistically salient patterns in the linguistic experience of speakers, possibly offering a simple and straightforward cognitive interpretation of distributional similarity (i.e. higher-order association). However, this is not merely a "baseline" task: it also touches on intriguing research problems such as the interaction of first-order and higher-order information in human associative memory.
  
  * [[Correlation with Free Association Norms]]
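
The exact data format and scoring procedure are specified on the page linked above. Purely as an illustration, the sketch below shows one way such an evaluation //could// be set up: correlate the cosine similarity of stimulus-response pairs in a word space with the human association strengths via Spearman's rank correlation. The norms file name, the triple-per-line format and the dict-of-dicts vector representation are assumptions made for this example only.

<code python>
# Illustrative sketch only: assumes a plain-text norms file with one
# "stimulus response strength" triple per line and word vectors given
# as sparse dicts {feature: weight}.  None of this is prescribed by
# the task definition.
import math

def cosine(u, v):
    """Cosine similarity between two sparse vectors (dicts)."""
    dot = sum(w * v.get(f, 0.0) for f, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def spearman(xs, ys):
    """Spearman rank correlation (no tie correction, for illustration)."""
    def ranks(vals):
        order = sorted(range(len(vals)), key=lambda i: vals[i])
        r = [0.0] * len(vals)
        for rank, i in enumerate(order, start=1):
            r[i] = rank
        return r
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))

def evaluate(norms_path, vectors):
    """Correlate model similarity with human association strength."""
    gold, pred = [], []
    with open(norms_path, encoding="utf-8") as f:
        for line in f:
            stimulus, response, strength = line.split()
            if stimulus in vectors and response in vectors:
                gold.append(float(strength))
                pred.append(cosine(vectors[stimulus], vectors[response]))
    return spearman(pred, gold)
</code>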
  
  
==== Task 2: Categorization ====
  
Categorization tasks play a prominent role in cognitive research on concepts. In this type of task, subjects

of the lexicon and/or semantic dimensions:
  
  * [[Concrete Noun Categorization]]
  * [[Abstract/Concrete Nouns Discrimination|Abstract/Concrete Noun Discrimination]]
  * [[Verb Categorization]]
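
The gold standards and official scoring are defined on the task pages above. As a non-prescriptive illustration, one common way to attack such categorization test sets with a word space is to cluster the word vectors and score the clusters against the gold categories, e.g. by purity. The dense-vector representation, the choice of k-means and the purity measure in the sketch below are assumptions made for the example, not part of the task definition.

<code python>
# Illustrative sketch only: toy k-means over dense vectors {word: [floats]},
# scored by cluster purity against gold categories {word: category}.
import random
from collections import Counter

def kmeans(vectors, k, iters=50, seed=0):
    """Very small k-means clustering of the given word vectors."""
    rng = random.Random(seed)
    words = list(vectors)
    centroids = [vectors[w][:] for w in rng.sample(words, k)]
    assign = {}
    for _ in range(iters):
        for w in words:
            v = vectors[w]
            assign[w] = min(range(k),
                            key=lambda c: sum((a - b) ** 2
                                              for a, b in zip(v, centroids[c])))
        for c in range(k):
            members = [vectors[w] for w in words if assign[w] == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign

def purity(assign, gold):
    """Fraction of words covered by their cluster's majority gold class."""
    clusters = {}
    for w, c in assign.items():
        clusters.setdefault(c, []).append(gold[w])
    correct = sum(Counter(labels).most_common(1)[0][1]
                  for labels in clusters.values())
    return correct / len(assign)
</code>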
  
  
  
  
==== Task 3: Property Generation ====
The ability to describe a concept in terms of its salient properties is an important feature of human conceptual cognition. In this task, we compare human-generated //norms// collected by psychologists to the properties generated by computational models.
  
  * [[Comparison with Speaker-Generated Features]]
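
How a model generates properties is left to participants. One simple (and certainly not the only) strategy would be to read off the top-weighted context features of a concept's vector and compare them with the speaker-generated norms, as in the hedged sketch below; the toy vectors, norm sets and precision-at-n measure are illustrative assumptions, not the official gold standard or evaluation script.

<code python>
# Illustrative sketch only: treat the strongest context features of a
# concept vector as its "generated properties" and measure overlap with
# the norms.  All data below are made up for the example.

def top_properties(vector, n=10):
    """Return the n context features with the highest association weight."""
    return [feat for feat, _ in
            sorted(vector.items(), key=lambda kv: kv[1], reverse=True)[:n]]

def precision_at_n(vector, gold_properties, n=10):
    """Fraction of the model's top-n properties also listed by speakers."""
    generated = top_properties(vector, n)
    return sum(1 for p in generated if p in gold_properties) / n

# Toy usage with made-up numbers; in practice corpus features would first
# have to be mapped onto the property labels used in the norms.
dog = {"bark": 12.3, "tail": 9.1, "pet": 8.7, "run": 2.0, "bone": 7.5}
norms_dog = {"bark", "tail", "pet", "leg"}
print(precision_at_n(dog, norms_dog, n=3))
</code>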
  
  
===== Source corpus =====
  
You can train your word space on your favorite corpus. However, we also invite you, where suitable, to experiment with the [[http://wacky.sslmit.unibo.it|ukWaC]] corpus, so that we will be able to compare different word spaces trained on the same corpus (for information on how to obtain the corpus, write to [[wacky@sslmit.unibo.it|this address]]). ukWaC is a very large (about 2 billion tokens) Web-derived corpus. It is split into sub-sections containing randomly chosen documents. Thus, if your algorithm has problems scaling up to 2 billion tokens, you can train it on one or more sub-sections, which will constitute a document-based random sub-sample of ukWaC.
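
As a rough illustration of what training on one or more sub-sections might look like in practice, the sketch below builds a simple window-based co-occurrence space from a handful of corpus files. The file names, the plain-text one-sentence-per-line format and the window size are assumptions made for this example; consult the corpus documentation for the actual distribution format.

<code python>
# Illustrative sketch only: count window-based co-occurrences for a few
# target words over a subset of corpus files.  File names and tokenisation
# are placeholders, not the real ukWaC format.
import glob
from collections import defaultdict

def build_space(paths, targets, window=2):
    """Count co-occurrences of target words with nearby context words."""
    space = {t: defaultdict(float) for t in targets}
    for path in paths:
        with open(path, encoding="utf-8", errors="replace") as f:
            for line in f:
                tokens = line.lower().split()
                for i, tok in enumerate(tokens):
                    if tok in space:
                        lo, hi = max(0, i - window), i + window + 1
                        for ctx in tokens[lo:i] + tokens[i + 1:hi]:
                            space[tok][ctx] += 1.0
    return space

# Train on only a couple of sub-sections if 2 billion tokens is too much:
files = sorted(glob.glob("ukwac_section_*.txt"))[:2]   # hypothetical file names
space = build_space(files, targets={"dog", "cat", "car"})
</code>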