Generic Visual Categorization Using Weak Geometry

Gabriela Csurka, Christopher R. Dance, Florent Perronnin & Jutta Willamowski

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4170)

Abstract

In the first part of this chapter we give a general presentation of the bag-of-keypatches approach to generic visual categorization (GVC). Our approach is inspired by the bag-of-words approach to text categorization. The method identifies the object content of natural images while generalizing across variations inherent to the object class. To obtain a visual vocabulary insensitive to viewpoint and illumination, rotation- or affine-invariant orientation histogram descriptors of image patches are vector quantized. Each image is then represented by a single histogram of visual word occurrences. To classify an image we apply one-against-all SVM classifiers and choose the best-ranked category. The main advantages of the method are that it is simple, computationally efficient and intrinsically invariant. We obtained excellent results for both multi-class categorization and object detection.
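The pipeline described above (vector quantization, visual word histogram, one-against-all scoring) can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the 2-D descriptors, the three-word vocabulary, and the linear weights standing in for trained SVMs are all hypothetical toy values, and the function names are invented for this example.

```python
from math import dist  # Euclidean distance (Python 3.8+)

def quantize(descriptor, vocabulary):
    """Return the index of the visual word (centroid) nearest to a descriptor."""
    return min(range(len(vocabulary)), key=lambda k: dist(descriptor, vocabulary[k]))

def word_histogram(descriptors, vocabulary):
    """Count how often each visual word occurs among an image's patch descriptors."""
    hist = [0] * len(vocabulary)
    for d in descriptors:
        hist[quantize(d, vocabulary)] += 1
    return hist

def classify(hist, classifiers):
    """Score the histogram with every one-against-all linear classifier
    and return the category with the highest score."""
    def score(item):
        _, (weights, bias) = item
        return sum(w * h for w, h in zip(weights, hist)) + bias
    return max(classifiers.items(), key=score)[0]

# Hypothetical toy data: a real system would use high-dimensional
# orientation histogram descriptors and a vocabulary learned by clustering.
vocabulary = [(0.0, 0.0), (1.0, 1.0), (4.0, 4.0)]
descriptors = [(0.1, -0.1), (0.9, 1.2), (1.1, 0.8), (3.9, 4.2)]
hist = word_histogram(descriptors, vocabulary)

# Hypothetical (weights, bias) pairs standing in for trained linear SVMs.
classifiers = {
    "car": ([1.0, -1.0, 0.5], 0.0),
    "face": ([-0.5, 1.0, 0.0], 0.1),
}
label = classify(hist, classifiers)
print(hist, label)
```

Because the histogram discards patch positions, the representation is independent of where the patches occur in the image, which is the sense in which the sketch ignores geometry.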

In the second part we improve the categorizer by incorporating geometric information. Based on the scale, orientation or closeness of the keypatches, we can consider a large number of simple geometrical relationships, each of which can be treated as a simplistic classifier. We select from this multitude of classifiers (several million in our case) and combine them effectively with the original classifier. Results are shown on a new, challenging 10-class dataset.
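One way to picture a geometrical relationship acting as a simplistic classifier is the sketch below. The abstract does not specify the relationships, the selection criterion, or the combination rule, so everything here is an assumption made for illustration: a rule is a hypothetical triple (word_a, word_b, radius) that fires when two keypatches with those visual words lie close together, rules are ranked by plain training error, and the combination is a weighted sum with the original classifier's score.

```python
from itertools import combinations
from math import dist

def close_pair_count(patches, word_a, word_b, radius):
    """Count keypatch pairs carrying the two visual words whose centres lie within radius."""
    count = 0
    for (w1, p1), (w2, p2) in combinations(patches, 2):
        if {w1, w2} == {word_a, word_b} and dist(p1, p2) <= radius:
            count += 1
    return count

def weak_vote(patches, rule):
    """Vote +1 if the relationship occurs at least once in the image, otherwise -1."""
    word_a, word_b, radius = rule
    return 1 if close_pair_count(patches, word_a, word_b, radius) > 0 else -1

def training_error(rule, images, labels):
    """Fraction of training images on which the rule's vote disagrees with the label."""
    wrong = sum(weak_vote(p, rule) != y for p, y in zip(images, labels))
    return wrong / len(images)

def select_rules(rules, images, labels, keep):
    """Keep the rules with the lowest training error (assumed selection criterion)."""
    return sorted(rules, key=lambda r: training_error(r, images, labels))[:keep]

def combined_score(base_score, patches, rules, weight):
    """Add the weighted votes of the selected rules to the original classifier's score."""
    return base_score + weight * sum(weak_vote(patches, r) for r in rules)

# Hypothetical toy images: each keypatch is (visual word index, (x, y) centre).
images = [
    [(0, (0, 0)), (1, (1, 0))],
    [(0, (0, 0)), (1, (9, 9))],
    [(0, (5, 5)), (1, (5, 6)), (2, (0, 0))],
    [(2, (0, 0)), (2, (1, 1))],
]
labels = [1, -1, 1, -1]
rules = [(0, 1, 2.0), (2, 2, 2.0), (0, 2, 2.0)]

selected = select_rules(rules, images, labels, keep=1)
score = combined_score(-0.2, images[0], selected, weight=0.5)
print(selected, score)
```

In this toy set only the rule "words 0 and 1 within distance 2" separates the two classes, so it is the one kept, and its vote turns a slightly negative base score into a positive combined score for the first image.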

Author information

Authors and Affiliations

Xerox Research Centre Europe, 6 Rue de Maupertuis, 38240, Meylan, France
Gabriela Csurka, Christopher R. Dance, Florent Perronnin & Jutta Willamowski

Editor information

Editors and Affiliations

Département d’Informatique, Ecole Normale Supérieure, P.O. Box, Paris, France
Jean Ponce
Carnegie Mellon University, Pittsburgh, USA
Martial Hebert
GRAVIR-INRIA, 655 avenue de l’Europe, P.O. Box, 38330, Montbonnot, France
Cordelia Schmid
Department of Engineering Science, University of Oxford, Parks Road, OX1 3PJ, Oxford, UK
Andrew Zisserman

About this chapter

Cite this chapter

Csurka, G., Dance, C.R., Perronnin, F., Willamowski, J. (2006). Generic Visual Categorization Using Weak Geometry. In: Ponce, J., Hebert, M., Schmid, C., Zisserman, A. (eds) Toward Category-Level Object Recognition. Lecture Notes in Computer Science, vol 4170. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11957959_11

DOI: https://doi.org/10.1007/11957959_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68794-8
Online ISBN: 978-3-540-68795-5
eBook Packages: Computer Science (R0)
