Abstract
In this paper, we propose the Glyph-Semanteme fusion Embedding (GSE) for Chinese characters and apply it to Offline Handwritten Chinese Text Recognition (offline-HCTR). It is well known that Chinese has a very large number of characters and that their glyphs are complex, but few researchers note that the underlying reason is that Chinese is an ideographic script, which implies that the glyph and the semanteme of a character are correlated. To exploit this property and create better representations for Chinese characters, we first extract a glyph embedding and a semanteme embedding for each character; we then propose a parameterized gated fusion strategy that automatically computes the GSE of each character by fusing its glyph embedding and semanteme embedding. We apply the proposed GSE to an attention-based encoder-decoder network for the offline-HCTR task. Furthermore, two kinds of GSE, Character-level GSE (CGSE) and Text-level GSE (TGSE), are applied in the decoding phase to yield the predictions. On the standard benchmark ICDAR-2013 HCTR competition dataset, the proposed method achieves 96.65% character-level recognition accuracy, which demonstrates the effectiveness of the proposed glyph-semanteme fusion embedding.
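The gated fusion idea mentioned above can be illustrated with a minimal sketch. The abstract does not give the gate's exact form, so everything here is an assumption made for illustration: the function names `gated_fusion` and `init_params`, the element-wise sigmoid gate computed from the concatenated embeddings, and the convex combination of the two inputs are one common way to build such a gate, not necessarily the formulation used in the paper.

```python
import math
import random


def sigmoid(x):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(glyph, semanteme, weights, bias):
    """Fuse two equal-length embeddings with an element-wise gate.

    Assumed form (hypothetical, not taken from the paper):
        gate  = sigmoid(W @ [glyph; semanteme] + b)
        fused = gate * glyph + (1 - gate) * semanteme
    `weights` has shape (dim, 2 * dim) and `bias` has length dim.
    """
    if len(glyph) != len(semanteme):
        raise ValueError("glyph and semanteme embeddings must have equal length")
    concat = list(glyph) + list(semanteme)
    fused = []
    for i in range(len(glyph)):
        # Pre-activation of the i-th gate unit over the concatenated input.
        pre = bias[i] + sum(w * x for w, x in zip(weights[i], concat))
        gate = sigmoid(pre)
        # Convex combination: gate near 1 favours the glyph embedding,
        # gate near 0 favours the semanteme embedding.
        fused.append(gate * glyph[i] + (1.0 - gate) * semanteme[i])
    return fused


def init_params(dim, seed=0):
    """Small random gate parameters; in practice these would be learned."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(2 * dim)] for _ in range(dim)]
    bias = [0.0] * dim
    return weights, bias
```

With all-zero parameters the gate is 0.5 everywhere, so the fused vector is the element-wise mean of the two embeddings. With trained parameters the gate could weight each dimension differently per character, which is the point of making the fusion parameterized.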








Notes
http://www.nlpr.ia.ac.cn/databases/handwriting/Home.html.
http://tcci.ccf.org.cn/conference/2017/taskdata.php.
https://dumps.wikimedia.org/zhwiki/latest/zhwiki-latest-pages-articles.xml.bz2.
https://github.com/KaimingHe/deep-residual-networks.
https://www.tensorflow.org/.
Acknowledgements
This work is supported by the Natural Science Foundation of Shanghai (No. 19ZR1415900).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Cite this article
Zhan, H., Lyu, S. & Lu, Y. Improving offline handwritten Chinese text recognition with glyph-semanteme fusion embedding. Int. J. Mach. Learn. & Cyber. 13, 485–496 (2022). https://doi.org/10.1007/s13042-021-01420-7
DOI: https://doi.org/10.1007/s13042-021-01420-7