RMSLRS: Real-Time Multi-terminal Sign Language Recognition System

Yilin Zhao,
Biao Zhang &
Kun Ma (ORCID: orcid.org/0000-0002-0135-5423)

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 649)

Included in the following conference series:

International Conference on Innovations in Bio-Inspired Computing and Applications

Abstract

Sign language is commonly used by deaf or speech-impaired people to convey meaning. Sign language recognition (SLR) aims to help users learn and use sign language by recognizing the signs in given videos. Although sign language has gained social acceptance, few SLR systems have been developed for educational purposes. Current SLR systems offer only a single form of expression, lack real-time interaction and feedback, and are expensive. Moreover, videos captured in real scenes are complex, which degrades model performance. To address these problems, this paper proposes a novel real-time multi-terminal sign language recognition system (RMSLRS). Specifically, a lightweight sign language recognition model based on MediaPipe Holistic is proposed; it performs sign language inference by sensing multi-dimensional information such as pose, facial expression, and hand tracking, and achieves near real-time performance on mobile and desktop devices. Then, a novel pre-processing module is proposed that uses the YOLO model and OpenCV to reduce the adverse effects of background noise in videos. Furthermore, a novel technical architecture with front-end and back-end separation and multi-terminal deployment is designed, covering a WeChat applet, a desktop application, and a website. Finally, the system has been deployed successfully in practice.
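
As an illustration of the landmark-based input described in the abstract, the following minimal Python sketch flattens one frame of MediaPipe Holistic output (pose, face, and both hands) into a fixed-length feature vector that a lightweight classifier could consume. The abstract does not specify the feature layout the authors use, so the ordering, the zero-filling of undetected parts, and the names holistic_to_features, _flatten, and _fake_group are assumptions made here for illustration only. The landmark counts are those of the legacy MediaPipe Holistic Python solution with default settings (33 pose, 468 face, 21 per hand). The demo frame is a stand-in built with SimpleNamespace, so the sketch runs without MediaPipe installed.

```python
from types import SimpleNamespace

# Landmark counts of MediaPipe Holistic (legacy solution, default settings).
POSE_N, FACE_N, HAND_N = 33, 468, 21
# Pose landmarks carry (x, y, z, visibility); face and hands carry (x, y, z).
FEATURE_LEN = POSE_N * 4 + FACE_N * 3 + 2 * HAND_N * 3


def _flatten(landmark_list, count, with_visibility=False):
    """Flatten one landmark group; return zeros if the group was not detected."""
    width = 4 if with_visibility else 3
    if landmark_list is None:
        return [0.0] * (count * width)
    values = []
    for lm in landmark_list.landmark:
        values.extend([lm.x, lm.y, lm.z])
        if with_visibility:
            values.append(lm.visibility)
    return values


def holistic_to_features(results):
    """Concatenate pose, face, left-hand, and right-hand landmarks of one frame.

    `results` is expected to expose the same attributes as the object returned
    by MediaPipe Holistic's process() call. The ordering is an assumption.
    """
    return (
        _flatten(results.pose_landmarks, POSE_N, with_visibility=True)
        + _flatten(results.face_landmarks, FACE_N)
        + _flatten(results.left_hand_landmarks, HAND_N)
        + _flatten(results.right_hand_landmarks, HAND_N)
    )


def _fake_group(count):
    """Hypothetical stand-in for a detected landmark group (demo only)."""
    lm = SimpleNamespace(x=0.5, y=0.5, z=0.0, visibility=1.0)
    return SimpleNamespace(landmark=[lm] * count)


# Demo frame: pose and right hand detected, face and left hand missing.
frame = SimpleNamespace(
    pose_landmarks=_fake_group(POSE_N),
    face_landmarks=None,
    left_hand_landmarks=None,
    right_hand_landmarks=_fake_group(HAND_N),
)
features = holistic_to_features(frame)
print(len(features), FEATURE_LEN)
```

Zero-filling undetected parts keeps every frame the same length, so a sequence of frames can be stacked directly as model input. With real video, the stand-in frame would be replaced by the result object obtained from MediaPipe Holistic for each frame.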


Acknowledgment

This work was supported by the Shandong Provincial Teaching Research Project of Graduate Education (SDYJG21034), the National Natural Science Foundation of China (61772231), and the Shandong Provincial Key R&D Program of China (2021CXGC010103).

Author information

Authors and Affiliations

Business School of Jinan University, University of Jinan, Jinan, 250022, China
Yilin Zhao
School of Computer and Information, Hefei University of Technology, Hefei, 230009, China
Biao Zhang
Shandong Provincial Key Laboratory of Network Based Intelligent Computing, University of Jinan, Jinan, 250022, China
Kun Ma

Corresponding author

Correspondence to Kun Ma.

Editor information

Editors and Affiliations

Faculty of Computing and Data Science, Flame University, Pune, Maharashtra, India
Ajith Abraham
Computer Science and Engineering Department, Thapar Institute of Engineering and Technology, Patiala, Punjab, India
Anu Bajaj
Scientific Network for Innovation and Research Excellence, Machine Intelligence Research Labs, Auburn, WA, USA
Niketa Gandhi
Interdisciplinary Studies Research Center (ISRC), Institute of Engineering, Polytechnique of Porto (ISEP/P.PORTO), INOV (Institute for Systems and Computer Engineering, Technology and Science), Porto, Portugal
Ana Maria Madureira
Department of Industrial Engineering, Istanbul Technical University, Istanbul, Türkiye
Cengiz Kahraman

Cite this paper

Zhao, Y., Zhang, B., Ma, K. (2023). RMSLRS: Real-Time Multi-terminal Sign Language Recognition System. In: Abraham, A., Bajaj, A., Gandhi, N., Madureira, A.M., Kahraman, C. (eds) Innovations in Bio-Inspired Computing and Applications. IBICA 2022. Lecture Notes in Networks and Systems, vol 649. Springer, Cham. https://doi.org/10.1007/978-3-031-27499-2_54

DOI: https://doi.org/10.1007/978-3-031-27499-2_54
Published: 28 March 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-27498-5
Online ISBN: 978-3-031-27499-2
eBook Packages: Intelligent Technologies and Robotics (R0)