Generic Visual Categorization Using Weak Geometry

  • Chapter
Toward Category-Level Object Recognition

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4170)

Abstract

In the first part of this chapter we give a general presentation of the bag-of-keypatches approach to generic visual categorization (GVC). Our approach is inspired by the bag-of-words approach to text categorization. This method is able to identify the object content of natural images while generalizing across variations inherent to the object class. To obtain a visual vocabulary insensitive to viewpoint and illumination, rotation- or affine-invariant orientation histogram descriptors of image patches are vector quantized. Each image is then represented by one visual word occurrence histogram. To classify the images we use one-against-all SVM classifiers and choose the best-ranked category. The main advantages of the method are that it is simple, computationally efficient and intrinsically invariant. We obtained excellent results for both multi-class categorization and object detection.
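The pipeline described above (vector quantization, one histogram per image, one-against-all scoring) can be illustrated with a minimal sketch. It is not the chapter's implementation: the two-dimensional descriptors, the three-word vocabulary, the function names, and the per-category linear weights are all hypothetical, and the weights stand in for SVMs that would in practice be trained on labelled histograms.

```python
import math

def nearest_word(descriptor, vocabulary):
    """Return the index of the closest visual word (Euclidean distance)."""
    return min(range(len(vocabulary)),
               key=lambda k: math.dist(descriptor, vocabulary[k]))

def word_histogram(descriptors, vocabulary):
    """Vector-quantize every patch descriptor and count word occurrences."""
    hist = [0] * len(vocabulary)
    for d in descriptors:
        hist[nearest_word(d, vocabulary)] += 1
    return hist

def classify(hist, classifiers):
    """One-against-all: score the histogram with each category's linear
    function and keep the best-ranked category."""
    scores = {name: sum(w * h for w, h in zip(weights, hist)) + bias
              for name, (weights, bias) in classifiers.items()}
    return max(scores, key=scores.get)

# Toy data (assumed for illustration only).
vocabulary = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, -0.1), (0.9, 1.2), (1.1, 0.8), (4.8, 5.1)]
classifiers = {                      # hypothetical pre-trained (weights, bias)
    "car":  ([0.0, 1.0, -1.0], 0.0),
    "face": ([1.0, -1.0, 1.0], 0.0),
}

hist = word_histogram(descriptors, vocabulary)
label = classify(hist, classifiers)
```

The histogram discards where each patch was found, which is one way to read the invariance claimed above, and it is also the information that the second part of the chapter reintroduces.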

In the second part we improve the categorizer by incorporating geometric information. Based on the scale, orientation or closeness of the keypatches, we can consider a large number of simple geometrical relationships, each of which can be treated as a simplistic classifier. We select from this multitude of classifiers (several million in our case) and combine them effectively with the original classifier. Results are shown on a new, challenging 10-class dataset.
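One way to picture a geometrical relationship acting as a simplistic classifier is a test on whether two given visual words occur close together in an image. The sketch below is an assumption-laden illustration, not the authors' method: the abstract does not state how classifiers are selected or combined, so selection here uses plain training accuracy and combination is a fixed additive weight. The keypatch tuples `(word, x, y)`, the function names, and all numeric values are hypothetical.

```python
import math
from itertools import combinations

def has_close_pair(keypatches, word_a, word_b, radius):
    """Weak test: do words a and b co-occur within `radius` pixels?"""
    for (w1, x1, y1), (w2, x2, y2) in combinations(keypatches, 2):
        if {w1, w2} == {word_a, word_b} and math.hypot(x1 - x2, y1 - y2) <= radius:
            return True
    return False

def select_best(images, labels, candidates, radius):
    """Pick the word pair whose test agrees with the labels most often
    (a stand-in for whatever selection scheme is actually used)."""
    def accuracy(pair):
        return sum(has_close_pair(img, pair[0], pair[1], radius) == lab
                   for img, lab in zip(images, labels))
    return max(candidates, key=accuracy)

def combined_score(base_score, keypatches, pair, radius, weight):
    """Add the weak geometric vote to the original classifier's score."""
    vote = weight if has_close_pair(keypatches, pair[0], pair[1], radius) else -weight
    return base_score + vote

# Toy training set: each keypatch is (visual word index, x, y).
images = [
    [(0, 10, 10), (1, 12, 11), (2, 80, 80)],   # words 0 and 1 are close
    [(0, 10, 10), (1, 90, 90), (2, 12, 12)],   # words 0 and 2 are close
    [(0, 50, 50), (1, 52, 49)],                # words 0 and 1 are close
]
labels = [True, False, True]
candidates = [(0, 1), (0, 2), (1, 2)]
radius = 5.0

best = select_best(images, labels, candidates, radius)
score = combined_score(-0.25, images[0], best, radius, 0.5)
```

With only three candidate pairs an exhaustive search is trivial; at the scale of millions of candidates mentioned above, the selection step is where the real difficulty lies.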

© 2006 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Csurka, G., Dance, C.R., Perronnin, F., Willamowski, J. (2006). Generic Visual Categorization Using Weak Geometry. In: Ponce, J., Hebert, M., Schmid, C., Zisserman, A. (eds) Toward Category-Level Object Recognition. Lecture Notes in Computer Science, vol 4170. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11957959_11

  • DOI: https://doi.org/10.1007/11957959_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-68794-8

  • Online ISBN: 978-3-540-68795-5

  • eBook Packages: Computer Science, Computer Science (R0)