Context Aware Document Binarization and Its Application to Information Extraction from Structured Documents

  • Conference paper
Document Analysis and Recognition - ICDAR 2023 (ICDAR 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14187)

Abstract

Document binarization plays a key role in pipelines that extract information from document images. In this paper, we propose a robust binarization algorithm that aims to produce highly accurate binary maps at a reduced computational cost. The proposed technique embeds classic Sauvola thresholding in a deep neural network (DNN). Our model learns to combine multi-scale Sauvola thresholds using a feature-wise attention module that exploits the visual context of each pixel. The resulting binarization map is further enhanced by a spatial error concealment procedure that recovers missing or severely degraded visual information. Moreover, we propose an automatic color removal module that suppresses any information in the image that is irrelevant to binarization. This is especially important for structured documents, such as payment forms, where colored structures are used to improve user experience and readability. The resulting model is compact, explainable, and end-to-end trainable. The proposed technique outperforms state-of-the-art algorithms in terms of both binarization accuracy and the rate of successfully extracted information.
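
To make the building block named in the abstract concrete, the following is a minimal sketch of classic Sauvola thresholding, T = m * (1 + k * (s / R - 1)), where m and s are the local mean and standard deviation inside a square window, followed by a toy multi-scale combination. It is not the authors' model: the fixed per-scale weights only stand in for the learned feature-wise attention, and the function names, window sizes, k = 0.2 and R = 128 are illustrative assumptions (the last two are commonly used defaults for 8-bit grayscale images).

```python
import numpy as np


def local_mean_std(gray, window):
    """Local mean and standard deviation over an odd-sized square window."""
    pad = window // 2
    padded = np.pad(gray.astype(np.float64), pad, mode="edge")
    # Integral images of values and squared values, with a leading zero row/column.
    s1 = np.pad(padded, ((1, 0), (1, 0))).cumsum(0).cumsum(1)
    s2 = np.pad(padded ** 2, ((1, 0), (1, 0))).cumsum(0).cumsum(1)
    h, w = gray.shape

    def box_sum(s):
        # Sum over the window centred on each original pixel.
        return (s[window:window + h, window:window + w]
                - s[:h, window:window + w]
                - s[window:window + h, :w]
                + s[:h, :w])

    n = window * window
    mean = box_sum(s1) / n
    var = np.maximum(box_sum(s2) / n - mean ** 2, 0.0)  # clamp rounding noise
    return mean, np.sqrt(var)


def sauvola_threshold(gray, window=15, k=0.2, r=128.0):
    """Per-pixel Sauvola threshold T = m * (1 + k * (s / R - 1))."""
    mean, std = local_mean_std(gray, window)
    return mean * (1.0 + k * (std / r - 1.0))


def binarize_multiscale(gray, windows=(7, 15, 31), weights=None, k=0.2, r=128.0):
    """Toy multi-scale binarization; returns a boolean mask that is True for ink.

    ASSUMPTION: fixed per-scale weights replace the learned, per-pixel
    attention described in the abstract.
    """
    thresholds = np.stack([sauvola_threshold(gray, w, k, r) for w in windows])
    if weights is None:
        weights = np.full(len(windows), 1.0 / len(windows))
    weights = np.asarray(weights, dtype=np.float64).reshape(-1, 1, 1)
    combined = (weights * thresholds).sum(axis=0)
    return gray <= combined


if __name__ == "__main__":
    # Synthetic page: bright background with one dark horizontal stroke.
    page = np.full((32, 32), 200.0)
    page[10:12, 5:25] = 30.0
    ink = binarize_multiscale(page)
    print(int(ink.sum()), "pixels classified as ink")
```

Small windows follow thin strokes closely, while large windows are more stable on wide or faded regions; weighting the per-scale thresholds is the simplest way to trade these off, which is the role the abstract assigns to the attention module.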

Corresponding author

Correspondence to Ján Koloda.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Koloda, J., Wang, J. (2023). Context Aware Document Binarization and Its Application to Information Extraction from Structured Documents. In: Fink, G.A., Jain, R., Kise, K., Zanibbi, R. (eds) Document Analysis and Recognition - ICDAR 2023. ICDAR 2023. Lecture Notes in Computer Science, vol 14187. Springer, Cham. https://doi.org/10.1007/978-3-031-41676-7_4

  • DOI: https://doi.org/10.1007/978-3-031-41676-7_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-41675-0

  • Online ISBN: 978-3-031-41676-7

  • eBook Packages: Computer Science, Computer Science (R0)
