Abstract
Sign language is commonly used by deaf or speech-impaired people to convey meaning. Sign language recognition (SLR) aims to help users learn and use sign language by recognizing signs in videos. Although sign language has gained social acceptance, few SLR systems have been developed for educational purposes. Current SLR systems offer only a single form of expression, lack real-time interaction and feedback, and are expensive. In addition, videos captured in real scenes are complex, which degrades model performance. To this end, this paper proposes a novel real-time multi-terminal sign language recognition system (RMSLRS). Specifically, a lightweight sign language recognition model based on MediaPipe Holistic is proposed to perform sign language inference by sensing multi-dimensional information such as pose, facial expression, and hand tracking, achieving near real-time performance on mobile and desktop devices. Then, a novel pre-processing module is proposed to reduce the adverse effects of background noise in videos using the YOLO model and OpenCV. Furthermore, a novel technical architecture with front-end and back-end separation and multi-terminal deployment is designed, covering a WeChat applet, a desktop application, and a website. Finally, the system has been deployed successfully in practice.
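To make the "multi-dimensional information" concrete, the following is a minimal sketch of how the pose, face, and hand landmarks reported by MediaPipe Holistic for one frame might be flattened into a single fixed-length feature vector. The abstract does not describe how features are assembled, so the layout, the zero-filling of undetected parts, and the names `frame_features` and `FEATURE_SIZE` are assumptions made for illustration. The landmark counts (33 pose, 468 face, 21 per hand) are those of the default Holistic configuration without refined face landmarks.

```python
from typing import List, Optional, Sequence

# Landmark counts for the default MediaPipe Holistic configuration.
POSE_LANDMARKS = 33   # each landmark: x, y, z, visibility
FACE_LANDMARKS = 468  # each landmark: x, y, z
HAND_LANDMARKS = 21   # each landmark: x, y, z (per hand)

# Length of one per-frame feature vector (hypothetical layout, see lead-in).
FEATURE_SIZE = POSE_LANDMARKS * 4 + FACE_LANDMARKS * 3 + 2 * HAND_LANDMARKS * 3

Points = Optional[Sequence[Sequence[float]]]


def _flatten(points: Points, count: int, dims: int) -> List[float]:
    """Flatten one landmark group, or zero-fill it when the part was not detected."""
    if points is None:
        return [0.0] * (count * dims)
    if len(points) != count or any(len(p) != dims for p in points):
        raise ValueError(f"expected {count} landmarks with {dims} values each")
    return [float(value) for point in points for value in point]


def frame_features(pose: Points = None, face: Points = None,
                   left_hand: Points = None, right_hand: Points = None) -> List[float]:
    """Concatenate pose, face, left-hand and right-hand landmarks for one frame."""
    return (
        _flatten(pose, POSE_LANDMARKS, 4)
        + _flatten(face, FACE_LANDMARKS, 3)
        + _flatten(left_hand, HAND_LANDMARKS, 3)
        + _flatten(right_hand, HAND_LANDMARKS, 3)
    )
```

Under these assumptions, each frame yields a vector of the same length whether or not a hand or the face is visible, so a sequence of frames can be stacked and passed to a sequence classifier without special handling of missing detections.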
Acknowledgment
This work was supported by the Shandong Provincial Teaching Research Project of Graduate Education (SDYJG21034), the National Natural Science Foundation of China (61772231), and the Shandong Provincial Key R&D Program of China (2021CXGC010103).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zhao, Y., Zhang, B., Ma, K. (2023). RMSLRS: Real-Time Multi-terminal Sign Language Recognition System. In: Abraham, A., Bajaj, A., Gandhi, N., Madureira, A.M., Kahraman, C. (eds) Innovations in Bio-Inspired Computing and Applications. IBICA 2022. Lecture Notes in Networks and Systems, vol 649. Springer, Cham. https://doi.org/10.1007/978-3-031-27499-2_54
DOI: https://doi.org/10.1007/978-3-031-27499-2_54
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-27498-5
Online ISBN: 978-3-031-27499-2
eBook Packages: Intelligent Technologies and Robotics (R0)