Abstract
Script identification is an emerging document analysis problem in which the script type is identified from multilingual documents. India has 22 official languages, which are written using 11 scripts. Traditional approaches to script identification consider all the scripts together and perform a single-level classification in a brute-force manner. In this paper, we propose a novel multi-level approach that separates 11 different scripts (Bangla, Devanagari, Gujarati, Gurumukhi, Kannada, Malayalam, Oriya, Roman, Tamil, Telugu, and Urdu) in multi-script documents. The Indic scripts are grouped in a three-level hierarchy based on their structural similarities. The proposed approach not only performs well in terms of classification accuracy but also offers a more realistic way to separate a large number of Indic scripts. We obtain an average script identification accuracy of 94.43% at the individual script level, which is an encouraging result given the inherent complexity of the problem.
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Ghosh, S. et al. (2019). Handwritten Indic Script Identification – A Multi-level Approach. In: Mandal, J., Mukhopadhyay, S., Dutta, P., Dasgupta, K. (eds) Computational Intelligence, Communications, and Business Analytics. CICBA 2018. Communications in Computer and Information Science, vol 1031. Springer, Singapore. https://doi.org/10.1007/978-981-13-8581-0_9
DOI: https://doi.org/10.1007/978-981-13-8581-0_9
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-8580-3
Online ISBN: 978-981-13-8581-0
eBook Packages: Computer Science, Computer Science (R0)