Context Aware Document Binarization and Its Application to Information Extraction from Structured Documents

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 14187))

Included in the following conference series:

International Conference on Document Analysis and Recognition

Abstract

Document binarization plays a key role in pipelines that extract information from document images. In this paper, we propose a robust binarization algorithm that aims to produce highly accurate binary maps at a reduced computational cost. The proposed technique exploits the effectiveness of classic Sauvola thresholding, set within a deep neural network (DNN) framework. Our model learns to combine multi-scale Sauvola thresholds using a feature-wise attention module that exploits the visual context of each pixel. The resulting binarization map is further enhanced by a spatial error concealment procedure that recovers missing or severely degraded visual information. Moreover, we propose an automatic color removal module that suppresses any information in the image that is irrelevant to binarization. This is especially important for structured documents, such as payment forms, where colored structures are used to improve user experience and readability. The resulting model is compact, explainable, and end-to-end trainable. The proposed technique outperforms state-of-the-art algorithms in terms of both binarization accuracy and the rate of successfully extracted information.
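
To make the idea of combining multi-scale Sauvola thresholds more concrete, the following is a minimal NumPy sketch, not the authors' model. It computes the classic Sauvola threshold T = m · (1 + k · (s / R − 1)) from the local mean m and local standard deviation s at several window sizes, then blends the thresholds with weights. The function names (`box_mean`, `sauvola_threshold`, `multiscale_binarize`), the window sizes, and the values k = 0.2 and R = 128 are illustrative assumptions. The fixed weights are only a stand-in for the per-pixel weights that the paper's attention module would learn.

```python
import numpy as np


def box_mean(img, w):
    """Mean over a w x w window (w odd) around each pixel, with edge padding."""
    r = w // 2
    padded = np.pad(img, r, mode="edge")
    # Integral image with a leading row and column of zeros.
    ii = np.pad(padded, ((1, 0), (1, 0)), mode="constant").cumsum(0).cumsum(1)
    h, wd = img.shape
    total = (ii[w:w + h, w:w + wd] - ii[:h, w:w + wd]
             - ii[w:w + h, :wd] + ii[:h, :wd])
    return total / (w * w)


def sauvola_threshold(img, w, k=0.2, R=128.0):
    """Classic Sauvola threshold map for one window size (k, R are assumed defaults)."""
    img = img.astype(np.float64)
    mean = box_mean(img, w)
    # Clip tiny negative variances caused by floating-point rounding.
    var = np.maximum(box_mean(img * img, w) - mean * mean, 0.0)
    return mean * (1.0 + k * (np.sqrt(var) / R - 1.0))


def multiscale_binarize(img, windows=(7, 15, 31), weights=None):
    """Blend Sauvola thresholds from several scales and binarize.

    weights: shape (n_scales,) or (n_scales, H, W). Uniform weights are a
    hypothetical placeholder for learned, context-dependent attention weights.
    """
    thresholds = np.stack([sauvola_threshold(img, w) for w in windows])
    if weights is None:
        weights = np.full(len(windows), 1.0 / len(windows))
    weights = np.asarray(weights, dtype=np.float64)
    if weights.ndim == 1:
        weights = weights[:, None, None]
    combined = (weights * thresholds).sum(axis=0)
    return (img > combined).astype(np.uint8)  # 1 = background, 0 = ink


if __name__ == "__main__":
    page = np.full((40, 40), 200, dtype=np.uint8)  # bright synthetic page
    page[20, 20] = 0                               # a single dark "ink" pixel
    binary = multiscale_binarize(page)
    print(binary[20, 20], binary[0, 0])
```

Passing an array of shape `(n_scales, H, W)` as `weights` gives each pixel its own mixture of scales, which is closer in spirit to the context-aware combination described in the abstract.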

Author information

Authors and Affiliations

Gini GmbH, Department Computer Vision & Information Extraction, Munich, Germany
Ján Koloda & Jue Wang

Authors

Ján Koloda
Jue Wang

Corresponding author

Correspondence to Ján Koloda.

Editor information

Editors and Affiliations

TU Dortmund University, Dortmund, Germany
Gernot A. Fink
Adobe, College Park, MD, USA
Rajiv Jain
Osaka Metropolitan University, Osaka, Japan
Koichi Kise
Rochester Institute of Technology, Rochester, NY, USA
Richard Zanibbi

About this paper

Cite this paper

Koloda, J., Wang, J. (2023). Context Aware Document Binarization and Its Application to Information Extraction from Structured Documents. In: Fink, G.A., Jain, R., Kise, K., Zanibbi, R. (eds) Document Analysis and Recognition - ICDAR 2023. ICDAR 2023. Lecture Notes in Computer Science, vol 14187. Springer, Cham. https://doi.org/10.1007/978-3-031-41676-7_4

DOI: https://doi.org/10.1007/978-3-031-41676-7_4
Published: 19 August 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-41675-0
Online ISBN: 978-3-031-41676-7
eBook Packages: Computer Science (R0)
