Differences

data:verb_categorization [2008/01/20 15:36]
alexlenci
— (current)
Line 1: Line 1:
-====== Task 1.c - Verb Categorization ====== 
- 
-==== Introduction ==== 
- 
-The goal of the sub-task is to group verbs into semantic categories. 
- 
-The {{verb.categorization.dataset.tar.gz |data set}} consists of 45 verbs belonging to 9 semantic classes. The classification scheme is inspired by the one described in P. Vinson & G. Vigliocco (2007), “Semantic Feature Production Norms for a Large Set of Objects and Events”, //Behavior Research Methods//, which in turn closely follows the classification proposed in Levin (1993). 
- 
-==== Task Operationalization ==== 
- 
-We operationalize verb categorization as a clustering task. Since the data set is organized hierarchically, 
-we will run two clustering experiments, varying the number of classes and consequently their level of generality: 
- 
-  * **9-way clustering** - models will be tested on their ability to categorize the verbs into the most fine-grained classes of the dataset: //communication// ("talk"), //mentalState// ("know"), //motionManner// ("run"), //motionDirection// ("arrive"), //changeLocation// ("carry"), //bodySense// ("smell"), //bodyAction// ("eat"), //exchange// ("buy"), //changeState// ("destroy"); 
-  * **5-way clustering** - models will be tested on their ability to categorize the verbs into 5 classes: //cognition// (superordinate of //communication// and //mentalState//), //motion// (superordinate of //motionManner//, //motionDirection//, //changeLocation//), //body// (superordinate of //bodySense// and //bodyAction//), //exchange//, and //changeState//.
- 
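The relation between the 9 fine-grained classes and their 5 superordinate classes can be written down as a simple lookup table. The following is a minimal Python sketch, not part of the task materials: the names `FINE_TO_COARSE` and `coarsen` are hypothetical, and the class labels are spelled as in the list above, which may differ from the labels used in the distributed data set.

```python
# Fine-grained (9-way) class -> superordinate (5-way) class, as listed above.
# The names FINE_TO_COARSE and coarsen are illustrative, not from the task files.
FINE_TO_COARSE = {
    "communication": "cognition",
    "mentalState": "cognition",
    "motionManner": "motion",
    "motionDirection": "motion",
    "changeLocation": "motion",
    "bodySense": "body",
    "bodyAction": "body",
    "exchange": "exchange",        # has no finer subdivision
    "changeState": "changeState",  # has no finer subdivision
}


def coarsen(fine_labels):
    """Map a sequence of 9-way gold labels to the corresponding 5-way labels."""
    return [FINE_TO_COARSE[label] for label in fine_labels]
```

With such a table, the gold standard for the 5-way experiment can be derived from the 9-way gold standard, for instance `coarsen(["motionManner", "exchange"])`.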
-To abstract away from differences stemming from any specific clustering method, you are asked to run your experiments with the //k-means// algorithm available in [[http://glaros.dtc.umn.edu/gkhome/cluto/cluto/overview|CLUTO]]. If you cannot run [[http://glaros.dtc.umn.edu/gkhome/cluto/cluto/overview|CLUTO]] on your system, the workshop organizers will carry out the clustering for you. The format in which to provide the data to be clustered will be specified later on. Participants are also invited to experiment with other clustering methods and to compare the results with those obtained with [[http://glaros.dtc.umn.edu/gkhome/cluto/cluto/overview|CLUTO]]. 
- 
- 
-==== Task Evaluation ==== 
- 
-Evaluation will be carried out in two stages: 
- 
-1. **quantitative evaluation** - results will be evaluated with respect to the two measures for cluster quality available in CLUTO: //purity// and //entropy// (cf. Zhao, Y. and G. Karypis (2002), "Evaluation of Hierarchical Clustering Algorithms for Document Datasets", in //CIKM 2002//).  
- 
-2. **qualitative evaluation** - participants will be asked to perform a fine-grained error analysis, focussing on critical verbs, hard classes, etc. Details about this type of evaluation will be provided later on. 
- 
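For readers who want to check the quantitative measures independently of CLUTO, the sketch below computes purity and entropy from a gold labelling and a cluster assignment. It assumes the usual definitions from Zhao and Karypis: purity is the fraction of items that belong to the majority class of their cluster, and entropy is the size-weighted average of the per-cluster class entropy, normalized by the logarithm of the number of classes. The function name `purity_and_entropy` and its argument names are hypothetical, and CLUTO's own output may differ in rounding.

```python
import math
from collections import Counter, defaultdict


def purity_and_entropy(gold, assigned):
    """Return (purity, entropy) of a clustering against gold class labels.

    gold     -- gold class label of each item
    assigned -- cluster id given to each item (same order as gold)
    """
    if len(gold) != len(assigned):
        raise ValueError("gold and assigned must have the same length")
    n = len(gold)
    q = len(set(gold))  # number of gold classes

    # Count, for every cluster, how many items of each gold class it holds.
    clusters = defaultdict(Counter)
    for label, cluster in zip(gold, assigned):
        clusters[cluster][label] += 1

    purity = 0.0
    entropy = 0.0
    for counts in clusters.values():
        n_r = sum(counts.values())  # cluster size
        # Majority-class share of this cluster, weighted by cluster size.
        purity += max(counts.values()) / n
        if q > 1:
            # Class entropy of the cluster, normalized by log(q).
            h = -sum((k / n_r) * math.log(k / n_r) for k in counts.values())
            entropy += (n_r / n) * h / math.log(q)
    return purity, entropy
```

Higher purity and lower entropy indicate a better match with the gold classes: a clustering that reproduces the gold classes exactly has purity 1 and entropy 0.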
-Back to [[Start]] 
-