University of Bonn, Computer Science VI, Autonomous Intelligent Systems

The LabelMe-12-50k dataset

Description:

The LabelMe-12-50k dataset consists of 50,000 JPEG images (40,000 for training and 10,000 for testing), which were extracted from LabelMe [2]. Each image is 256x256 pixels in size. Half of the images in both the training and the testing set show a centered object belonging to one of the 12 object classes listed in Table 1. The other half show a randomly selected region of a randomly selected image ("clutter").

The dataset is a difficult challenge for object recognition systems because the instances of each object class vary greatly in appearance, lighting conditions, and viewing angle. Furthermore, centered objects may be partly occluded, and other objects (or parts of them) may be present in the image. See [1] for a more detailed description of the dataset.

Table 1: Object classes and number of instances in the LabelMe-12-50k dataset

#	Object class	Instances in training set	Instances in testing set
1	person	4,885	1,180
2	car	3,829	974
3	building	2,085	531
4	window	4,097	1,028
5	tree	1,846	494
6	sign	954	249
7	door	830	178
8	bookshelf	391	100
9	chair	385	88
10	table	192	54
11	keyboard	324	75
12	head	212	49
	clutter	20,000	5,000
	total number of images	40,000	10,000

Annotation format:

The dataset archive contains annotation files in two formats:

Human-readable text files (annotation-train.txt and annotation-test.txt), which contain in each line an image file name (without the .jpg extension) and 12 class labels corresponding to the 12 object classes.
Binary files (annotation-train.bin and annotation-test.bin), which contain 12 successive 32-bit float values for each image, each value representing the class label of the corresponding class. The file does not contain any meta information (e.g., there is no header).
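
The binary layout described above (12 successive 32-bit floats per image, no header) can be read with a few lines of Python. This is only a minimal sketch: the byte order is not stated in this description, so little-endian is an assumption here, and the function name read_binary_labels is hypothetical.

```python
import struct

# One record = 12 successive 32-bit floats (48 bytes), no header.
# ASSUMPTION: little-endian byte order ("<"); the description does not state it.
RECORD = struct.Struct("<12f")

def read_binary_labels(path):
    """Return a list of 12-tuples of floats, one tuple per image, in file order."""
    with open(path, "rb") as f:
        data = f.read()
    if len(data) % RECORD.size != 0:
        raise ValueError("file size is not a multiple of 48 bytes")
    return list(RECORD.iter_unpack(data))
```

For example, read_binary_labels("annotation-train.bin") should yield 40,000 tuples if the file matches the description. If the values look implausible (far outside -1.0 to 1.0), the byte-order assumption is the first thing to check.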

The annotation label values of the two file formats differ slightly because the values in the text files are rounded to the second decimal place. If you want to report recognition rates, you should use the binary annotation files for training and testing because of the more precise label values.
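
To illustrate the relationship between the two formats, the following sketch parses one line of a text annotation file and checks that its values agree with the binary values up to rounding to two decimal places. It assumes that the fields in a line are separated by whitespace, which is not specified above; the function names and the sample line in the usage note are hypothetical.

```python
def parse_annotation_line(line):
    """Split one text annotation line into (image name, list of 12 float labels).

    ASSUMPTION: fields are whitespace-separated and the image name has no spaces.
    """
    fields = line.split()
    if len(fields) != 13:
        raise ValueError(f"expected 13 fields, got {len(fields)}")
    return fields[0], [float(v) for v in fields[1:]]

def matches_rounded(text_labels, binary_labels, tol=0.005 + 1e-6):
    """True if every text label is within rounding distance of its binary label."""
    return all(abs(t - b) <= tol for t, b in zip(text_labels, binary_labels))
```

A made-up line such as "000001 -1.00 -1.00 0.37 ..." (13 fields in total) would be parsed into the name "000001" and twelve floats. The small extra tolerance of 1e-6 allows for 32-bit float representation error in the binary values.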

All label values lie between -1.0 and 1.0. For the 50% of images that are not clutter, the label of the depicted object is set to 1.0. Because instances of other object classes may also be present (in object images as well as in clutter images), each of the other labels is either -1.0 or a value between 0.0 and 1.0. A label is -1.0 if no instance of the object class is present in the image or if the degree of overlap (calculated from the size and position of the object's bounding box) is below a certain threshold. If the overlap exceeds this threshold, the label is greater than 0.0. A value of 1.0 means that the corresponding object is exactly centered in the image and 160 pixels in size (in its larger dimension), just like the extracted objects.
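
One possible way to interpret a label vector under these rules is sketched below. It assumes that the 12 labels are stored in the order of Table 1, which is not stated explicitly, and it treats an image as clutter whenever no label equals 1.0; the function name describe_labels and the returned dictionary layout are hypothetical.

```python
# ASSUMPTION: labels are stored in the class order of Table 1.
CLASS_NAMES = ["person", "car", "building", "window", "tree", "sign",
               "door", "bookshelf", "chair", "table", "keyboard", "head"]

def describe_labels(labels):
    """Summarize one 12-element label vector.

    Returns a dict with:
      "target":  name of the class labeled exactly 1.0, or "clutter" if none is
      "partial": classes with a label strictly between 0.0 and 1.0
    """
    if len(labels) != len(CLASS_NAMES):
        raise ValueError("expected 12 labels")
    centered = [n for n, v in zip(CLASS_NAMES, labels) if v == 1.0]
    partial = {n: v for n, v in zip(CLASS_NAMES, labels) if 0.0 < v < 1.0}
    return {"target": centered[0] if centered else "clutter", "partial": partial}
```

The exact comparison with 1.0 is intended for the binary annotation values; in the text files, a value slightly below 1.0 could be rounded to 1.00 and would then be misread as a centered object.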

Download:

You can download the dataset [here] (tar.gz archive, 461.4 MB).

Recognition rates:

Currently, Table 2 shows only the results from our paper [1]. If you would like to report recognition rates, please send them to uetz _at_ ais.uni-bonn.de, including a link to your publication or a description of the method you used.

Table 2: Training and testing error rates on the LabelMe-12-50k dataset

Method used	Training error rate	Testing error rate	Reported by
Locally-connected Neural Pyramid	3.77%	16.27%	Uetz and Behnke 2009 [1]

Citation:

If you refer to our dataset, please cite:

[1]	Rafael Uetz and Sven Behnke, "Large-scale Object Recognition with CUDA-accelerated Hierarchical Neural Networks," Proceedings of the IEEE International Conference on Intelligent Computing and Intelligent Systems 2009 (ICIS 2009).

References:

[2]	B.C. Russell, A. Torralba, K.P. Murphy, W.T. Freeman, "LabelMe: A database and web-based tool for image annotation," International Journal of Computer Vision, vol. 77, no. 1-3, pp. 157-173, 2008

Last updated: November 17, 2009 by Rafael Uetz (uetz _at_ ais.uni-bonn.de)
