Abstract
With the aging of society, research on elderly-assisted companion robots has grown rapidly. However, many existing approaches either insufficiently consider the physiological characteristics of the elderly or rely on a single mode of interaction, leading to inaccurate understanding of elderly users' intents. In this paper, we design a multimodal intent understanding and interaction system for elderly-assisted companionship. The system presents two main innovations: (1) a semantic-based multimodal fusion algorithm (MSFA) that integrates the semantic layers of gesture and speech, addressing the heterogeneity and asynchrony between the two modalities; and (2) a human–computer cooperative interaction control algorithm (HCC) that assists elderly individuals in completing daily tasks. Experimental results demonstrate that the proposed fusion algorithm achieves effective intent recognition and combines natural human–machine interaction with intent understanding. The system not only accurately captures users' interaction intents and assists in completing interactive tasks but also reduces users' mental and cognitive load, achieving a more desirable interaction effect. Subjective evaluations by users further verify the effectiveness of the system.
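The abstract describes fusing gesture and speech at the semantic layer while handling asynchrony between the two modalities. The sketch below is a minimal illustration of that general idea, not the paper's MSFA itself: it assumes each modality has already been recognized into a hypothetical `ModalityEvent` (semantic label, timestamp, confidence), and pairs speech and gesture events that fall within a time window before merging their labels into a single intent.

```python
from dataclasses import dataclass

@dataclass
class ModalityEvent:
    modality: str      # "speech" or "gesture" (hypothetical schema)
    semantics: str     # recognized semantic label, e.g. "bring", "point_at_cup"
    timestamp: float   # seconds since interaction start
    confidence: float  # recognizer confidence in [0, 1]

def fuse_semantics(events, window=2.0):
    """Pair each speech event with the temporally closest gesture event
    within `window` seconds, merging their semantic labels into one intent.
    Unpaired events fall back to single-modality intents."""
    speech = [e for e in events if e.modality == "speech"]
    gesture = [e for e in events if e.modality == "gesture"]
    intents, used = [], set()
    for s in speech:
        best = None
        for i, g in enumerate(gesture):
            if i in used:
                continue
            dt = abs(s.timestamp - g.timestamp)
            if dt <= window and (
                best is None or dt < abs(s.timestamp - gesture[best].timestamp)
            ):
                best = i
        if best is not None:
            g = gesture[best]
            used.add(best)
            intents.append({
                "intent": f"{s.semantics}+{g.semantics}",
                "confidence": (s.confidence + g.confidence) / 2,
            })
        else:
            intents.append({"intent": s.semantics, "confidence": s.confidence})
    for i, g in enumerate(gesture):
        if i not in used:
            intents.append({"intent": g.semantics, "confidence": g.confidence})
    return intents
```

For example, a spoken "bring" at t=1.0 s and a pointing gesture at t=1.5 s would fuse into a single `bring+point_at_cup` intent, whereas a gesture nine seconds later would stand alone. The actual MSFA in the paper operates on learned semantic representations rather than string labels; this window-based pairing only illustrates the asynchrony-handling aspect.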
Ethics declarations
Conflict of interest
We declare that we have no financial or personal relationships with other people or organizations that could inappropriately influence our work, and no professional or other personal interest of any nature or kind in any product, service, and/or company that could be construed as influencing the position presented in, or the review of, this manuscript.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wang, Y., Feng, Z. & Wang, H. Multimodal intent understanding and interaction system for elderly-assisted companionship. CCF Trans. Pervasive Comp. Interact. 6, 52–67 (2024). https://doi.org/10.1007/s42486-023-00137-6