RMSLRS: Real-Time Multi-terminal Sign Language Recognition System

  • Conference paper
Innovations in Bio-Inspired Computing and Applications (IBICA 2022)

Abstract

Sign language is commonly used by deaf or speech-impaired people to convey meaning. Sign language recognition (SLR) aims to help users learn and use sign language by recognizing signs in videos. Although sign language has gained social acceptance, few SLR systems have been developed for educational purposes. Current SLR systems offer a single form of expression, lack real-time interaction and feedback, and are expensive. In addition, videos captured in real scenes are complex, which degrades model performance. To this end, this paper proposes a novel real-time multi-terminal sign language recognition system (RMSLRS). Specifically, a lightweight sign language recognition model based on MediaPipe Holistic is proposed to perform sign language inference by sensing multi-dimensional information such as pose, facial expression, and hand tracking, achieving near real-time performance on mobile and desktop devices. Then, a novel pre-processing module is proposed to reduce the adverse effects of background noise in videos using the YOLO model and OpenCV. Furthermore, a novel technical architecture with front-end and back-end separation and multi-terminal deployment is designed, covering a WeChat applet, a desktop application, and a website. Finally, the system has been deployed successfully in practice.
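
The abstract does not say how the pose, face, and hand information is combined before inference. As a rough illustration only, the sketch below flattens one frame of MediaPipe Holistic-style landmarks into a fixed-length feature vector and zero-fills any group that was not detected. The landmark counts (33 pose, 468 face, 21 per hand) are MediaPipe Holistic's defaults; the function names, the per-landmark value layout, and the zero-filling strategy are assumptions made for this example, not the authors' method. Plain lists stand in for MediaPipe's landmark objects so the sketch runs without the library.

```python
from typing import List, Optional, Sequence

# Default landmark counts of MediaPipe Holistic. The per-landmark layout
# (x, y, z, plus visibility for pose) is an assumption for illustration.
POSE_POINTS, FACE_POINTS, HAND_POINTS = 33, 468, 21
POSE_DIMS, FACE_DIMS, HAND_DIMS = 4, 3, 3

Landmarks = Optional[Sequence[Sequence[float]]]


def flatten_group(landmarks: Landmarks, points: int, dims: int) -> List[float]:
    """Flatten one landmark group, or zero-fill it when nothing was detected."""
    if landmarks is None:
        return [0.0] * (points * dims)
    if len(landmarks) != points:
        raise ValueError(f"expected {points} landmarks, got {len(landmarks)}")
    flat: List[float] = []
    for landmark in landmarks:
        if len(landmark) != dims:
            raise ValueError(f"expected {dims} values per landmark, got {len(landmark)}")
        flat.extend(float(value) for value in landmark)
    return flat


def frame_features(
    pose: Landmarks = None,
    face: Landmarks = None,
    left_hand: Landmarks = None,
    right_hand: Landmarks = None,
) -> List[float]:
    """Concatenate pose, face, and both hands into one per-frame vector."""
    return (
        flatten_group(pose, POSE_POINTS, POSE_DIMS)
        + flatten_group(face, FACE_POINTS, FACE_DIMS)
        + flatten_group(left_hand, HAND_POINTS, HAND_DIMS)
        + flatten_group(right_hand, HAND_POINTS, HAND_DIMS)
    )


# Length of every per-frame vector under the layout assumed above.
FEATURE_SIZE = (
    POSE_POINTS * POSE_DIMS + FACE_POINTS * FACE_DIMS + 2 * HAND_POINTS * HAND_DIMS
)
```

Keeping the vector length constant, even when a hand leaves the frame, is one simple way to feed a sequence of frames to a lightweight sequence classifier; other choices such as interpolation or masking would work as well.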

Acknowledgment

This work was supported by the Shandong Provincial Teaching Research Project of Graduate Education (SDYJG21034), the National Natural Science Foundation of China (61772231), and the Shandong Provincial Key R&D Program of China (2021CXGC010103).

Author information

Corresponding author

Correspondence to Kun Ma.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Zhao, Y., Zhang, B., Ma, K. (2023). RMSLRS: Real-Time Multi-terminal Sign Language Recognition System. In: Abraham, A., Bajaj, A., Gandhi, N., Madureira, A.M., Kahraman, C. (eds) Innovations in Bio-Inspired Computing and Applications. IBICA 2022. Lecture Notes in Networks and Systems, vol 649. Springer, Cham. https://doi.org/10.1007/978-3-031-27499-2_54
