Differences

This shows you the differences between two versions of the page.

--- data:concrete_nouns_categorization [2008/01/19 18:11]
alexlenci
+++ — (current)
@@ Line 1: / Line 1: @@
-====== Task 1a: Concrete Nouns Categorization ======
-==== Introduction ====
-The goal of the sub-task is to group concrete nouns into semantic categories.
-The {{concnouns.categorization.dataset.tar.gz |data set}} consists of 44 concrete nouns belonging to 7 semantic categories (four natural kinds and three artifacts). The nouns are included in the feature norms described in McRae et al. (2005) (cf. [[comparison_with_speaker-generated_features|Task3]]).
-==== Task Operationalization ====
-We operationalize concrete nouns categorization as a clustering task. Since the data set is organized hierarchically,
-we will run three clustering experiments, varying the number of classes and consequently their level of generality:
-  * **7-way clustering** - models will be tested on their ability to categorize the nouns into the most fine-grained classes of the dataset: //bird// ("peacock"), //groundAnimal// ("lion"), //fruitTree// ("cherry"), //green// ("potato"), //kitchenware// ("spoon"), //instrument// ("hammer"), //vehicle// ("car");
-  * **4-way clustering** - models will be tested on their ability to categorize the nouns into 4 superordinate classes: //animal// (superordinate of //bird// and //groundAnimal//), //vegetable// (superordinate of //fruitTree// and //green//), //tool// (superordinate of //kitchenware// and //instrument//), //vehicle//;
-  * **2-way clustering** - models will be tested on their ability to categorize the nouns into the two top classes: //natural// (superordinate of //animal// and //vegetable//) and //artifact// (superordinate of //tool// and //vehicle//).
-To abstract away from differences stemming from the particular clustering method, you are asked to run your experiments with the //k-means// algorithm available in [[http://glaros.dtc.umn.edu/gkhome/cluto/cluto/overview|CLUTO]]. If you cannot run CLUTO on your system, you can send your data to the workshop organizers, who will run the clustering experiments for you. Participants are of course free to experiment with other clustering methods as well. Comparisons with the results obtained with CLUTO are also welcome.
-==== Task Evaluation ====
-Evaluation will be carried out in two stages:
-  - **quantitative evaluation** - results will be evaluated with respect to the two measures of cluster quality available in CLUTO: //purity// and //entropy// (cf. Zhao, Y. and G. Karypis (2002), "Evaluation of Hierarchical Clustering Algorithms for Document Datasets", in //CIKM 2002//).
-  - **qualitative evaluation** - participants will be asked to carry out a fine-grained error analysis, to identify the hardest nouns to cluster, etc.
-Back to [[Start]]
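
The class hierarchy and the two quantitative measures described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not CLUTO's own code: the function names (`gold_labels`, `purity_and_entropy`) are hypothetical, the formulas follow the usual size-weighted definitions of purity and entropy (entropy normalized by the log of the number of gold classes, which is how these measures are commonly presented for CLUTO), and the toy data uses only the seven example nouns named in the task description with made-up cluster assignments.

```python
import math
from collections import Counter, defaultdict

# Hierarchy taken from the task description: 7 fine classes -> 4 classes -> 2 classes.
FINE_TO_MID = {
    "bird": "animal", "groundAnimal": "animal",
    "fruitTree": "vegetable", "green": "vegetable",
    "kitchenware": "tool", "instrument": "tool",
    "vehicle": "vehicle",
}
MID_TO_TOP = {
    "animal": "natural", "vegetable": "natural",
    "tool": "artifact", "vehicle": "artifact",
}


def gold_labels(fine_labels, level):
    """Project fine-grained gold labels onto the 7-, 4- or 2-way level."""
    if level == 7:
        return list(fine_labels)
    mid = [FINE_TO_MID[c] for c in fine_labels]
    if level == 4:
        return mid
    if level == 2:
        return [MID_TO_TOP[c] for c in mid]
    raise ValueError("level must be 7, 4 or 2")


def purity_and_entropy(clusters, gold):
    """Size-weighted purity and normalized entropy of a clustering.

    clusters: one cluster id per item; gold: one gold class per item.
    Assumed definitions (not copied from CLUTO): purity is the fraction of
    items that belong to the majority class of their cluster; entropy is
    the size-weighted class entropy of each cluster divided by log(q),
    where q is the number of gold classes.
    """
    n = len(gold)
    q = len(set(gold))
    by_cluster = defaultdict(Counter)
    for cluster_id, gold_class in zip(clusters, gold):
        by_cluster[cluster_id][gold_class] += 1

    majority_total = 0
    entropy = 0.0
    for counts in by_cluster.values():
        n_r = sum(counts.values())
        majority_total += max(counts.values())
        if q > 1:
            h = -sum((k / n_r) * math.log(k / n_r) for k in counts.values())
            entropy += (n_r / n) * h / math.log(q)
    return majority_total / n, entropy


# Toy data: the seven example nouns from the task description.
example_nouns = ["peacock", "lion", "cherry", "potato", "spoon", "hammer", "car"]
example_fine = ["bird", "groundAnimal", "fruitTree", "green",
                "kitchenware", "instrument", "vehicle"]

# Hypothetical 2-way clustering output that misplaces "potato".
example_clusters = [0, 0, 0, 1, 1, 1, 1]
example_purity, example_entropy = purity_and_entropy(
    example_clusters, gold_labels(example_fine, 2)
)
print(f"purity={example_purity:.3f} entropy={example_entropy:.3f}")
```

Higher purity and lower entropy indicate a better match with the gold classes. Running the same function with `gold_labels(..., 7)`, `gold_labels(..., 4)` and `gold_labels(..., 2)` gives the scores for the three clustering experiments.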
