
CN111665930A - Multi-mode emotion recognition method and system integrating cloud and edge computing - Google Patents

Multi-mode emotion recognition method and system integrating cloud and edge computing

Info

Publication number
CN111665930A
CN111665930A (application CN201910089030.7A)
Authority
CN
China
Prior art keywords
module
online
emotion recognition
emotion
local
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910089030.7A
Other languages
Chinese (zh)
Inventor
王春雷
尉迟学彪
毛鹏轩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Rostec Technology Co ltd
Original Assignee
Beijing Rostec Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Rostec Technology Co ltd filed Critical Beijing Rostec Technology Co ltd
Priority to CN201910089030.7A
Publication of CN111665930A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174Facial expression recognition
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/63Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for estimating an emotional state
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/50Network services
    • H04L67/55Push-based network services
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2203/00Indexing scheme relating to G06F3/00 - G06F3/048
    • G06F2203/01Indexing scheme relating to G06F3/01
    • G06F2203/011Emotion or mood input determined on the basis of sensed human body parameters such as pulse, heart rate or beat, temperature of skin, facial expressions, iris, voice pitch, brain activity patterns

Landscapes

  • Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Theoretical Computer Science (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • General Engineering & Computer Science (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Child & Adolescent Psychology (AREA)
  • Hospice & Palliative Care (AREA)
  • Psychiatry (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Acoustics & Sound (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the field of emotion recognition, in particular to a multi-modal emotion recognition method and system fusing cloud and edge computing. The data acquisition module acquires face image and voice data; the edge calculation module computes a local emotion recognition result using facial expression recognition; the network transmission module transmits the voice data to the cloud computing module; the cloud computing module computes an online emotion recognition result using speech emotion recognition and pushes the result to the emotion state judgment module through the network transmission module; the emotion state judgment module then makes a comprehensive judgment of the emotional state from the local and online recognition results. Because only the voice data is transmitted over the network, the method has the characteristics of high response speed and high recognition accuracy.

Description

Multi-mode emotion recognition method and system integrating cloud and edge computing
Technical Field
The invention relates to the field of emotion recognition, in particular to a cloud and edge computing fused multi-modal emotion recognition method and system.
Background
At present, emotion recognition is widely applied in electronic products as a common human-computer interaction technology and is key to realizing affective human-machine interaction. Two modes of emotion recognition are currently in use. In the online mode, the product device collects the user's emotion data, transmits it over the network to a cloud server for recognition, and receives the result back over the network for interactive control. This mode achieves high recognition accuracy, but it requires network support, consumes considerable bandwidth, lengthens service response time, and makes the privacy of video data hard to guarantee. In the local mode, a local recognition module performs the recognition. This mode needs no network and responds quickly, but it places higher demands on local computing resources, and because it relies on a single local recognition module its accuracy is usually low, degrading the user experience.
To solve the above problems, a multi-modal emotion recognition method and system fusing cloud and edge computing is needed, one that achieves high recognition accuracy while maintaining high response speed.
Disclosure of Invention
The invention aims to overcome the defects in the prior art and provide a multi-modal emotion recognition method and system integrating cloud and edge computing.
In order to achieve the purpose, the invention provides the following technical scheme:
a cloud and edge computing fused multimodal emotion recognition system, comprising: the system comprises a data acquisition module, an edge calculation module, a network transmission module, a cloud calculation module and an emotion state judgment module.
As a preferred scheme of the invention, the data acquisition module acquires face image and voice data and pushes the face image data to the edge calculation module; at the same time it checks whether a network connection exists and, if so, pushes the voice data to the cloud computing module through the network transmission module. The edge calculation module computes a local emotion recognition result E_local using facial expression recognition. The cloud computing module computes an online emotion recognition result E_online (initial value 0) using speech emotion recognition and pushes the result to the emotion state judgment module through the network transmission module. The emotion state judgment module computes a comprehensive emotion recognition result E from E_local and E_online by a decision-layer weighted-fusion method, finally realizing a multi-modal comprehensive judgment of the emotional state, where E = w·E_local + (1 − w)·E_online, w ∈ (0, 1] is a weight preset by the system, and w = 1 if and only if E_online = 0.
The invention also provides the following technical scheme:
a multi-modal emotion recognition method fusing cloud and edge computing comprises the following steps:
Step 1, the data acquisition module acquires face image data and voice data, pushes the face image data to the edge calculation module, and checks whether a network connection exists; if so, it pushes the voice data to the cloud computing module through the network transmission module;
Step 2, the edge calculation module performs facial expression recognition on the face image data pushed by the data acquisition module to compute the local emotion recognition result E_local, and pushes E_local to the emotion state judgment module;
Step 3, the cloud computing module performs speech emotion recognition on the voice data pushed by the data acquisition module to compute the online emotion recognition result E_online (initial value 0), and pushes E_online to the emotion state judgment module through the network transmission module;
Step 4, the emotion state judgment module computes the comprehensive emotion recognition result E = w·E_local + (1 − w)·E_online from E_local and E_online by the decision-layer weighted-fusion method, finally realizing a multi-modal comprehensive judgment of the emotional state, where w ∈ (0, 1] is a weight preset by the system and w = 1 if and only if E_online = 0.
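The decision-layer weighted fusion of step 4 can be sketched in a few lines of Python. This is an illustrative sketch, not code from the patent; the function name and the default weight are assumptions for the example.

```python
def fuse_emotion(e_local: float, e_online: float, w: float = 0.6) -> float:
    """Decision-layer weighted fusion: E = w * E_local + (1 - w) * E_online.

    w is a system-preset weight in (0, 1]. When no online result is
    available (E_online == 0, its initial value), w is forced to 1 so
    the decision falls back to the local recognition result alone.
    """
    if not 0 < w <= 1:
        raise ValueError("w must lie in (0, 1]")
    if e_online == 0:  # no cloud result yet -> local-only decision
        w = 1.0
    return w * e_local + (1 - w) * e_online
```

For instance, with no network connection the online result stays at its initial value 0 and the fused score equals the edge-computed local score, which is how the method keeps working offline.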
Compared with the prior art, the invention has the beneficial effects that: edge calculation with facial expression recognition realizes local identification of the emotional state, cloud computing with speech emotion recognition realizes online identification, and the fusion of the two realizes a multi-modal comprehensive judgment of the emotional state. Because the multi-modal emotion recognition process involves network transmission of voice data only, network bandwidth consumption is low, and the method has the advantages of high response speed and high recognition accuracy.
Drawings
FIG. 1 is a functional block diagram of the method.
Fig. 2 is a functional flow diagram of the method.
Detailed Description
The present invention will be described in further detail with reference to examples and embodiments.
Example 1
As shown in fig. 1, a multi-modal emotion recognition system fusing cloud and edge computing includes a data acquisition module, an edge calculation module, a network transmission module, a cloud computing module, and an emotion state judgment module. The data acquisition module acquires and pushes face image and voice data; the edge calculation module computes a local emotion recognition result E_local using facial expression recognition; the cloud computing module computes an online emotion recognition result E_online (initial value 0) using speech emotion recognition and pushes E_online to the emotion state judgment module; the emotion state judgment module computes the comprehensive emotion recognition result E = w·E_local + (1 − w)·E_online from E_local and E_online by the decision-layer weighted-fusion method, finally realizing a multi-modal comprehensive judgment of the emotional state, where w ∈ (0, 1] is a weight preset by the system and w = 1 if and only if E_online = 0. The network transmission module is a general-purpose module compatible with the networks of different carriers (China Mobile, China Telecom, China Unicom, and the like). The data acquisition module uses a camera to acquire face image data and a microphone to acquire voice data.
Example 2
As shown in fig. 2, a multi-modal emotion recognition method fusing cloud and edge computing includes the following four steps:
Step 1, the data acquisition module acquires face image data and voice data and pushes the face image data to the edge calculation module; it also checks whether a network connection exists and, if so, pushes the voice data to the cloud computing module through the network transmission module;
Step 2, the edge calculation module performs facial expression recognition on the face image data pushed by the data acquisition module to compute the local emotion recognition result E_local, and pushes E_local to the emotion state judgment module;
Step 3, the cloud computing module performs speech emotion recognition on the voice data pushed by the data acquisition module to compute the online emotion recognition result E_online (initial value 0), and pushes E_online to the emotion state judgment module through the network transmission module;
Step 4, the emotion state judgment module computes the comprehensive emotion recognition result E = w·E_local + (1 − w)·E_online from E_local and E_online by the decision-layer weighted-fusion method, finally realizing a multi-modal comprehensive judgment of the emotional state, where w ∈ (0, 1] is a weight preset by the system and w = 1 if and only if E_online = 0.
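The four-step flow of Example 2 can be sketched end-to-end as follows. This is a hypothetical Python sketch: the two recognizer functions are placeholders returning fixed scores (a real system would run a facial-expression model at the edge and a speech-emotion model in the cloud), and the default weight is an assumption.

```python
def recognize_expression(frame) -> float:
    """Placeholder for edge-side facial expression recognition -> E_local."""
    return 0.8  # stand-in score

def recognize_speech_emotion(audio) -> float:
    """Placeholder for cloud-side speech emotion recognition -> E_online."""
    return 0.4  # stand-in score

def multimodal_emotion(frame, audio, network_up: bool, w: float = 0.6) -> float:
    e_local = recognize_expression(frame)            # step 2: edge calculation
    # steps 1 and 3: voice data goes to the cloud only if a network exists;
    # otherwise E_online keeps its initial value 0
    e_online = recognize_speech_emotion(audio) if network_up else 0.0
    if e_online == 0:                                # offline: rely on edge only
        w = 1.0
    return w * e_local + (1 - w) * e_online          # step 4: decision-layer fusion
```

Note the design choice this illustrates: only the voice data ever crosses the network, so the image-heavy modality stays on the device, saving bandwidth and preserving the privacy of video data.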

Claims (6)

1. A multi-modal emotion recognition method and system fusing cloud and edge computing, comprising five functional modules: data acquisition, edge calculation, network transmission, cloud computing, and emotion state judgment.
2. The method of claim 1, wherein: the data acquisition module acquires emotion-related face image and voice data and pushes the face image data to the edge calculation module; it also checks whether a network connection exists and, if so, pushes the voice data to the cloud computing module through the network transmission module.
3. The method of claim 1, wherein: the edge calculation module computes the local emotion recognition result E_local, the computation being realized by facial expression recognition; the cloud computing module computes the online emotion recognition result E_online (initial value 0), the computation being realized by speech emotion recognition.
4. The method of claim 1, wherein: the network transmission module transmits the voice data and the online emotion recognition result E_online; the emotion state judgment module computes the comprehensive emotion recognition result E from E_local and E_online.
5. The method of claim 4, wherein: the comprehensive emotion recognition result E is computed by the decision-layer weighted-fusion method, i.e. E = w·E_local + (1 − w)·E_online, where w ∈ (0, 1] is a weight preset by the system and w = 1 if and only if E_online = 0.
6. A multi-modal emotion recognition method fusing cloud and edge computing comprises the following steps:
Step 1, the data acquisition module acquires face image data and voice data and pushes the face image data to the edge calculation module; it also checks whether a network connection exists and, if so, pushes the voice data to the cloud computing module through the network transmission module;
Step 2, the edge calculation module performs facial expression recognition on the face image data pushed by the data acquisition module to compute the local emotion recognition result E_local, and pushes E_local to the emotion state judgment module;
Step 3, the cloud computing module performs speech emotion recognition on the voice data pushed by the data acquisition module to compute the online emotion recognition result E_online, and pushes E_online to the emotion state judgment module through the network transmission module;
Step 4, the emotion state judgment module computes the comprehensive emotion recognition result E from E_local and E_online, i.e. E = w·E_local + (1 − w)·E_online, where w ∈ (0, 1] is a weight preset by the system and w = 1 if and only if E_online = 0.
CN201910089030.7A 2019-03-05 2019-03-05 Multi-mode emotion recognition method and system integrating cloud and edge computing Pending CN111665930A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910089030.7A CN111665930A (en) 2019-03-05 2019-03-05 Multi-mode emotion recognition method and system integrating cloud and edge computing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910089030.7A CN111665930A (en) 2019-03-05 2019-03-05 Multi-mode emotion recognition method and system integrating cloud and edge computing

Publications (1)

Publication Number Publication Date
CN111665930A true CN111665930A (en) 2020-09-15

Family

ID=72381228

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910089030.7A Pending CN111665930A (en) 2019-03-05 2019-03-05 Multi-mode emotion recognition method and system integrating cloud and edge computing

Country Status (1)

Country Link
CN (1) CN111665930A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114155882A (en) * 2021-11-30 2022-03-08 浙江大学 Method and device for judging road rage emotion based on voice recognition
CN114155882B (en) * 2021-11-30 2023-08-22 浙江大学 Method and device for judging emotion of road anger based on voice recognition
CN115473932A (en) * 2022-08-15 2022-12-13 四川天府新区麓湖小学(成都哈密尔顿麓湖小学) Emotional state analysis method and device for intelligent education
CN116405635A (en) * 2023-06-02 2023-07-07 山东正中信息技术股份有限公司 Multi-mode conference recording method and system based on edge calculation

Similar Documents

Publication Publication Date Title
CN106339094B (en) Interactive remote expert cooperation examination and repair system and method based on augmented reality
CN102831176B (en) The method of commending friends and server
CN111665930A (en) Multi-mode emotion recognition method and system integrating cloud and edge computing
CN102708865A (en) Method, device and system for voice recognition
CN107731231B (en) Method for supporting multi-cloud-end voice service and storage device
US20230196586A1 (en) Video personnel re-identification method based on trajectory fusion in complex underground space
CN106485476A (en) A kind of staff attendance system based on video
US11503110B2 (en) Method for presenting schedule reminder information, terminal device, and cloud server
EP3776171A1 (en) Non-disruptive nui command
WO2024032159A1 (en) Speaking object detection in multi-human-machine interaction scenario
CN113014960A (en) Method, device and storage medium for online video production
CN113706673B (en) Cloud rendering frame platform applied to virtual augmented reality technology
CN102521683A (en) Student management method of distance education
CN111835547A (en) Quality of service (QoS) management method and related equipment
CN113656125A (en) Virtual assistant generation method and device and electronic equipment
CN107783650A (en) A kind of man-machine interaction method and device based on virtual robot
CN110197230B (en) Method and apparatus for training a model
CN111885351A (en) A screen display method, device, terminal device and storage medium
CN115017352B (en) Mobile phone car machine interaction system based on image recognition
CN107770474B (en) Sound processing method and device, terminal equipment and storage medium
CN112637643B (en) Networking method and device of mobile terminal, terminal equipment and storage medium
CN113033475B (en) Target object tracking method, related device and computer program product
CN111327659A (en) User queuing optimization method based on edge calculation
CN111031354B (en) Multimedia playing method, device and storage medium
CN115022722A (en) Video monitoring method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200915
