Abstract
Image-based hand gesture recognition is challenging because the hand is a small object with complex articulations compared with the whole human body. It occupies only a small portion of the image and is more easily affected by segmentation errors, so it requires a fine-grained description. This paper proposes a new weighted multi-scale feature descriptor (WMD), computed along the hand contour, for robust hand gesture recognition from depth images. First, a weight factor is estimated for each contour point using a 2D Gaussian smoothing function and the Prewitt operator, which relates the point to its neighbors and highlights its importance. The WMD is then constructed via 1D left-side and right-side Gaussian smoothing, on the grounds that contour points are more sensitive than interior points of the hand and depend on one another when used to recognize gestures. The granularity of the descriptor is controlled by multiple scales, each corresponding to a different standard deviation of the Gaussian function. Its invariance to translation, rotation, and scaling is proved theoretically and validated experimentally. Finally, extensive experiments on our self-established ten-gesture dataset and two public datasets compare the proposed algorithm with three distance-based and two CNN-based hand gesture recognition methods. The results show that our method outperforms the others and achieves a good combination of accuracy (more than 95%) and computational efficiency (0.054 s per frame on average).
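To make the idea of one-sided, multi-scale Gaussian smoothing along a closed contour more concrete, the following Python sketch may help. It is not the authors' WMD: the abstract does not specify the per-point signal or the exact weighting, so the sketch assumes a normalized centroid-distance signal and takes the per-point weights as a plain input (uniform by default) instead of deriving them from 2D Gaussian smoothing and the Prewitt operator. The function names `centroid_distances`, `one_sided_gaussian`, and `multi_scale_descriptor`, as well as the default scales, are hypothetical.

```python
import math


def centroid_distances(contour):
    """Distance from each contour point to the centroid, divided by the maximum.

    ASSUMPTION: this signal is a stand-in chosen because it is unchanged by
    translation, rotation, and uniform scaling; the paper may use another one.
    The contour must contain at least two distinct points.
    """
    n = len(contour)
    cx = sum(x for x, _ in contour) / n
    cy = sum(y for _, y in contour) / n
    dist = [math.hypot(x - cx, y - cy) for x, y in contour]
    peak = max(dist)
    return [d / peak for d in dist]


def one_sided_gaussian(signal, weights, i, sigma, direction):
    """Weighted half-Gaussian average of `signal` on one side of index i.

    direction=-1 uses the preceding contour points (left side) and
    direction=+1 the following ones (right side). Indices wrap around
    because the contour is closed. Weights are assumed to be positive.
    """
    n = len(signal)
    radius = min(max(1, math.ceil(3 * sigma)), n - 1)
    num = 0.0
    den = 0.0
    for k in range(radius + 1):
        j = (i + direction * k) % n
        g = math.exp(-(k * k) / (2.0 * sigma * sigma)) * weights[j]
        num += g * signal[j]
        den += g
    return num / den


def multi_scale_descriptor(contour, weights=None, sigmas=(1.0, 2.0, 4.0)):
    """Return one row per contour point: (left, right) responses per scale."""
    signal = centroid_distances(contour)
    if weights is None:
        # Placeholder for the per-point weight factor described in the abstract.
        weights = [1.0] * len(contour)
    descriptor = []
    for i in range(len(signal)):
        row = []
        for sigma in sigmas:
            row.append(one_sided_gaussian(signal, weights, i, sigma, -1))
            row.append(one_sided_gaussian(signal, weights, i, sigma, +1))
        descriptor.append(row)
    return descriptor
```

In this sketch a small standard deviation keeps fine local detail around each contour point, while a larger one summarizes a longer stretch of the contour, which is one way to read the multi-scale granularity described above. Because the assumed signal is normalized by its maximum and measured from the centroid, shifting or uniformly scaling the contour leaves the output unchanged up to floating-point error.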















Data availability
Data will be made available upon request.
Acknowledgements
The authors would like to thank the anonymous reviewers for their valuable comments and suggestions.
Ethics declarations
Conflict of interest
There is no conflict of interest.
About this article
Cite this article
Zhang, B., Ding, W. & Ye, J. A new weighted multi-scale descriptor for hand gesture recognition. Multimed Tools Appl 83, 43325–43347 (2024). https://doi.org/10.1007/s11042-023-17319-0
DOI: https://doi.org/10.1007/s11042-023-17319-0