Abstract
Document images are generally archived and transmitted in compressed form to save storage space and bandwidth. In CCITT Group 3 and Group 4 compression, the compressed representation is a stream of runs, that is, counts of consecutive white and black pixels, which indicate the background and foreground regions of the document image, respectively. In this paper, we propose a novel entropy-driven incremental learning technique that works directly on the compressed stream of runs and facilitates text-line segmentation in handwritten document images using entropy and connected component analysis. A Spatial Entropy Quantifier (SEQ) is extracted from the stream of runs over a suitable window. Incremental entropy and connected component analysis are then carried out to separate text and non-text regions, leading to automatic text-line segmentation. The proposed method is validated on a compressed dataset of handwritten document images, and its performance is reported.
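To make the idea of working directly on runs more concrete, the following is a minimal sketch in Python. It is not the SEQ or the incremental learning procedure of the paper, whose definitions are not given in the abstract. It assumes a binary image with 0 for white and 1 for black, encodes each row as alternating run lengths starting with a white run, and uses a plain per-row Shannon entropy of the black/white proportion as a stand-in measure. The names `row_to_runs`, `row_entropy`, and `low_entropy_rows`, and the threshold value, are hypothetical.

```python
from math import log2


def row_to_runs(row):
    """Encode a binary row (0 = white, 1 = black) as alternating run lengths.

    The first run is always white, so a row that starts with a black
    pixel begins with a white run of length 0.
    """
    runs = []
    current = 0  # colour of the run being counted (start with white)
    count = 0
    for pixel in row:
        if pixel == current:
            count += 1
        else:
            runs.append(count)
            current = pixel
            count = 1
    runs.append(count)
    return runs


def row_entropy(runs):
    """Shannon entropy (bits) of the black/white proportion, from runs only.

    Illustrative stand-in, NOT the paper's Spatial Entropy Quantifier.
    """
    white = sum(runs[0::2])  # even positions hold white runs
    black = sum(runs[1::2])  # odd positions hold black runs
    total = white + black
    if total == 0 or black == 0 or white == 0:
        return 0.0
    p = black / total
    return -(p * log2(p) + (1 - p) * log2(1 - p))


def low_entropy_rows(all_runs, threshold=0.1):
    """Indices of rows whose entropy is at or below a hypothetical threshold."""
    return [i for i, runs in enumerate(all_runs) if row_entropy(runs) <= threshold]


if __name__ == "__main__":
    image = [
        [0, 0, 0, 0, 0, 0],
        [0, 1, 1, 0, 1, 0],
        [0, 0, 0, 0, 0, 0],
        [1, 1, 0, 0, 1, 1],
    ]
    compressed = [row_to_runs(r) for r in image]
    print(compressed)
    print(low_entropy_rows(compressed))
```

In this toy setting, rows with near-zero entropy are candidate gaps between text lines. A fully black row would also score zero, so a real pipeline would need an additional check, and the paper additionally relies on connected component analysis.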
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Amarnath, R., Nagabhushan, P., Javed, M. (2020). Enabling Text-Line Segmentation in Run-Length Encoded Handwritten Document Image Using Entropy-Driven Incremental Learning. In: Chaudhuri, B., Nakagawa, M., Khanna, P., Kumar, S. (eds) Proceedings of 3rd International Conference on Computer Vision and Image Processing. Advances in Intelligent Systems and Computing, vol 1022. Springer, Singapore. https://doi.org/10.1007/978-981-32-9088-4_20
DOI: https://doi.org/10.1007/978-981-32-9088-4_20
Publisher Name: Springer, Singapore
Print ISBN: 978-981-32-9087-7
Online ISBN: 978-981-32-9088-4
eBook Packages: Engineering, Engineering (R0)