CN111665930A - Multi-mode emotion recognition method and system integrating cloud and edge computing - Google Patents
Multi-mode emotion recognition method and system integrating cloud and edge computing
- Publication number
- CN111665930A (application number CN201910089030.7A)
- Authority
- CN
- China
- Prior art keywords
- module
- online
- emotion recognition
- emotion
- local
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/174—Facial expression recognition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/63—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for estimating an emotional state
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/55—Push-based network services
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2203/00—Indexing scheme relating to G06F3/00 - G06F3/048
- G06F2203/01—Indexing scheme relating to G06F3/01
- G06F2203/011—Emotion or mood input determined on the basis of sensed human body parameters such as pulse, heart rate or beat, temperature of skin, facial expressions, iris, voice pitch, brain activity patterns
Abstract
The invention relates to the field of emotion recognition, and in particular to a multi-modal emotion recognition method and system that fuses cloud and edge computing. The data acquisition module acquires face image and voice data; the edge computing module computes a local emotion recognition result based on facial expression recognition technology; the network transmission module transmits the voice data to the cloud computing module; the cloud computing module computes an online emotion recognition result based on speech emotion recognition technology and pushes the result to the emotion state judgment module through the network transmission module; and the emotion state judgment module comprehensively judges the emotional state based on the local and online emotion recognition results. Because only the voice data is transmitted over the network, the method offers a fast response and high recognition accuracy.
Description
Technical Field
The invention relates to the field of emotion recognition, in particular to a cloud and edge computing fused multi-modal emotion recognition method and system.
Background
At present, emotion recognition is widely applied in various electronic products as a common human-computer interaction technology and is key to realizing human-computer emotional interaction. Current emotion recognition approaches fall into two categories. The first is the online mode: the product device collects the user's emotion data, transmits it over the network to a cloud server for emotion recognition, and the result is sent back over the network to the device for the corresponding interactive control. This mode achieves high recognition accuracy, but it requires network support, consumes a large amount of network bandwidth, prolongs the service response delay, and makes the privacy of video data difficult to guarantee. The second is local recognition by a local recognition module. This mode needs no network and responds quickly, but it places higher demands on local computing resources, and because recognition relies solely on the local module, the emotion recognition accuracy is usually low, which degrades the user experience.
To solve the above problems, it is necessary to provide a multi-modal emotion recognition method and system that integrates cloud and edge computing, achieving high emotion recognition accuracy while ensuring a fast response.
Disclosure of Invention
The invention aims to overcome the defects in the prior art and provide a multi-modal emotion recognition method and system integrating cloud and edge computing.
In order to achieve the purpose, the invention provides the following technical scheme:
A cloud and edge computing fused multi-modal emotion recognition system comprises a data acquisition module, an edge computing module, a network transmission module, a cloud computing module, and an emotion state judgment module.
As a preferred scheme of the invention, the data acquisition module is responsible for acquiring face image and voice data and for pushing the face image data to the edge computing module; meanwhile, it judges whether a network connection exists and, if so, pushes the voice data to the cloud computing module through the network transmission module. The edge computing module computes a local emotion recognition result E_local based on facial expression recognition technology. The cloud computing module is responsible for computing an online emotion recognition result E_online (initial value 0) based on speech emotion recognition technology and pushes the recognition result to the emotion state judgment module through the network transmission module. The emotion state judgment module computes a comprehensive emotion recognition result E from E_local and E_online by a decision-level weighted fusion method, thereby finally realizing the multi-modal comprehensive judgment of the emotional state, where E = w * E_local + (1 - w) * E_online, w ∈ (0, 1] is a weight preset by the system, and w = 1 if and only if E_online = 0.
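The decision-level fusion rule above can be illustrated with a minimal sketch (Python is used here only for illustration; the patent does not specify an implementation language, and the function and variable names are assumptions):

```python
def fuse_emotion_scores(e_local: float, e_online: float, w: float = 0.5) -> float:
    """Decision-level weighted fusion: E = w * E_local + (1 - w) * E_online.

    w lies in (0, 1] and is preset by the system; when no online result is
    available (E_online keeps its initial value 0), w is forced to 1 so the
    decision falls back to the local (edge) result alone.
    """
    if not 0.0 < w <= 1.0:
        raise ValueError("w must lie in (0, 1]")
    if e_online == 0:
        w = 1.0
    return w * e_local + (1.0 - w) * e_online
```

For example, with w = 0.7, E_local = 0.8 and E_online = 0.6, the fused result is 0.7 * 0.8 + 0.3 * 0.6 = 0.74; if the cloud has not yet returned a result (E_online = 0), the output is simply E_local.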
The invention also provides the following technical scheme:
a multi-modal emotion recognition method fusing cloud and edge computing comprises the following steps:
Step 1: the data acquisition module acquires face image data and voice data, pushes the face image data to the edge computing module, and meanwhile judges whether a network connection exists; if so, it pushes the voice data to the cloud computing module through the network transmission module.
Step 2: the edge computing module performs facial expression recognition on the face image data pushed by the data acquisition module to compute a local emotion recognition result E_local, and pushes E_local to the emotion state judgment module.
Step 3: the cloud computing module performs speech emotion recognition on the voice data pushed by the data acquisition module to compute an online emotion recognition result E_online (initial value 0), and pushes E_online to the emotion state judgment module through the network transmission module.
Step 4: the emotion state judgment module computes the comprehensive emotion recognition result E = w * E_local + (1 - w) * E_online from the local result E_local and the online result E_online by a decision-level weighted fusion method, finally realizing the multi-modal comprehensive judgment of the emotional state, where w ∈ (0, 1] is a weight preset by the system and w = 1 if and only if E_online = 0.
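A minimal end-to-end sketch of steps 1-4 follows; the callables `recognize_face_locally`, `recognize_speech_in_cloud`, and `network_available` are hypothetical placeholders standing in for the edge, cloud, and network transmission modules, which the patent defines only functionally:

```python
from typing import Callable

def run_emotion_pipeline(
    face_image: bytes,
    speech_audio: bytes,
    recognize_face_locally: Callable[[bytes], float],    # edge computing module
    recognize_speech_in_cloud: Callable[[bytes], float],  # cloud computing module
    network_available: Callable[[], bool],                # network transmission module
    w: float = 0.5,
) -> float:
    # Step 1: data acquisition is assumed done; the image stays on the device,
    # and the audio is only sent to the cloud when a network connection exists.
    e_online = 0.0  # initial value of the online result, per the description above
    # Step 2: the edge computing module computes E_local from the facial expression.
    e_local = recognize_face_locally(face_image)
    # Step 3: the cloud computing module computes E_online from the speech, if online.
    if network_available():
        e_online = recognize_speech_in_cloud(speech_audio)
    # Step 4: decision-level weighted fusion (w forced to 1 without an online result).
    if e_online == 0:
        w = 1.0
    return w * e_local + (1.0 - w) * e_online
```

This keeps the face image on the device and sends only the voice data over the network, which is what gives the method its low bandwidth consumption.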
Compared with the prior art, the invention has the following beneficial effects: edge computing is used to realize local recognition of the emotional state through facial expression recognition technology, cloud computing is used to realize online recognition of the emotional state through speech emotion recognition technology, and the fusion of the two finally realizes the multi-modal comprehensive judgment of the emotional state. Because the multi-modal emotion recognition process only involves network transmission of the voice data, network bandwidth consumption is low, and the method has the advantages of fast response and high recognition accuracy.
Drawings
FIG. 1 is a functional block diagram of the method.
FIG. 2 is a functional flow diagram of the method.
Detailed Description
The present invention will be described in further detail with reference to examples and embodiments.
Example 1
As shown in FIG. 1, a multi-modal emotion recognition system integrating cloud and edge computing includes a data acquisition module, an edge computing module, a network transmission module, a cloud computing module, and an emotion state judgment module. The data acquisition module is responsible for acquiring and pushing the face image and voice data. The edge computing module computes a local emotion recognition result E_local based on facial expression recognition technology. The cloud computing module is responsible for computing an online emotion recognition result E_online (initial value 0) based on speech emotion recognition technology and pushes E_online to the emotion state judgment module through the network transmission module. The emotion state judgment module computes a comprehensive emotion recognition result E = w * E_local + (1 - w) * E_online from E_local and E_online by a decision-level weighted fusion method, finally realizing the multi-modal comprehensive judgment of the emotional state, where w ∈ (0, 1] is a weight preset by the system and w = 1 if and only if E_online = 0. The network transmission module is a general-purpose module applicable to common mobile and telecommunication carrier networks. The data acquisition module uses a camera to acquire the face image data and a microphone to acquire the voice data.
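The module decomposition of Example 1 can be sketched as a set of interfaces; the class and method names below are illustrative assumptions rather than APIs named in the patent:

```python
from dataclasses import dataclass

@dataclass
class AcquiredData:
    face_image: bytes    # captured by the camera
    speech_audio: bytes  # captured by the microphone

class DataAcquisitionModule:
    def acquire(self) -> AcquiredData: ...

class EdgeComputingModule:
    def facial_expression_score(self, face_image: bytes) -> float: ...  # E_local

class CloudComputingModule:
    def speech_emotion_score(self, speech_audio: bytes) -> float: ...   # E_online

class NetworkTransmissionModule:
    def is_connected(self) -> bool: ...
    def push_audio(self, speech_audio: bytes) -> None: ...

class EmotionStateJudgmentModule:
    def judge(self, e_local: float, e_online: float, w: float) -> float:
        if e_online == 0:   # no online result: rely on the edge result alone
            w = 1.0
        return w * e_local + (1.0 - w) * e_online
```

Wiring a camera- and microphone-backed DataAcquisitionModule to these interfaces reproduces the flow shown in FIG. 1.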
Example 2
As shown in FIG. 2, a multi-modal emotion recognition method fusing cloud and edge computing includes the following four steps:
Step 1: the data acquisition module acquires face image data and voice data and pushes the face image data to the edge computing module; meanwhile, it judges whether a network connection exists and, if so, pushes the voice data to the cloud computing module through the network transmission module.
Step 2: the edge computing module performs facial expression recognition on the face image data pushed by the data acquisition module to compute a local emotion recognition result E_local, and pushes E_local to the emotion state judgment module.
Step 3: the cloud computing module performs speech emotion recognition on the voice data pushed by the data acquisition module to compute an online emotion recognition result E_online (initial value 0), and pushes E_online to the emotion state judgment module through the network transmission module.
Step 4: the emotion state judgment module computes the comprehensive emotion recognition result E = w * E_local + (1 - w) * E_online from E_local and E_online by a decision-level weighted fusion method, finally realizing the multi-modal comprehensive judgment of the emotional state, where w ∈ (0, 1] is a weight preset by the system and w = 1 if and only if E_online = 0.
Claims (6)
1. A multi-mode emotion recognition method and system fusing cloud and edge computing comprises five functional modules in total, namely data acquisition, edge computing, network transmission, cloud computing and emotion state judgment.
2. The method of claim 1, wherein: the data acquisition module is responsible for acquiring emotion related data of a face image and voice and pushing the face image data to the edge calculation module; meanwhile, whether network connection exists or not is judged, and if the network connection exists, voice data are pushed to the cloud computing module through the network transmission module.
3. The method of claim 1, wherein: the edge calculation module is responsible for calculating a local emotion recognition result E_local, and the calculation process is realized based on facial expression recognition technology; the cloud computing module is responsible for calculating an online emotion recognition result E_online (initial value is set to 0), and the calculation process is realized based on speech emotion recognition technology.
4. The method of claim 1, wherein: the network transmission module is used for the transmission of the voice data and the online emotion recognition result E_online; the emotion state judgment module calculates the comprehensive emotion recognition result E based on the E_local and the E_online.
5. The method of claim 4, wherein: the calculation of the comprehensive emotion recognition result E is realized based on a weighted value method of decision-level fusion, namely E = w * E_local + (1 - w) * E_online, where w ∈ (0, 1] is a weight preset by the system, and w = 1 if and only if E_online = 0.
6. A multi-modal emotion recognition method fusing cloud and edge computing comprises the following steps:
step 1, a data acquisition module acquires face image data and voice data and pushes the face image data to an edge calculation module; meanwhile, whether network connection exists or not is judged, and if yes, the voice data are pushed to the cloud computing module through the network transmission module;
step 2, the edge calculation module carries out facial expression recognition based on the facial image data pushed by the data acquisition module so as to calculate a local emotion recognition result E_local, and pushes E_local to the emotion state judgment module;
step 3, the cloud computing module carries out speech emotion recognition based on the voice data pushed by the data acquisition module so as to calculate an online emotion recognition result E_online, and pushes E_online to the emotion state judgment module through the network transmission module;
the emotional state determination module in the step 4 is based on ElocalAnd EonlineCalculating a value of the integrated emotion recognition result E, i.e. E ═ w × Elocal+(1-w)*EonlineWherein w ∈ (0, 1)]Weight preset for system, if and only if EonlineWhen 0, w is 1.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910089030.7A CN111665930A (en) | 2019-03-05 | 2019-03-05 | Multi-mode emotion recognition method and system integrating cloud and edge computing |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910089030.7A CN111665930A (en) | 2019-03-05 | 2019-03-05 | Multi-mode emotion recognition method and system integrating cloud and edge computing |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111665930A (en) | 2020-09-15 |
Family
ID=72381228
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910089030.7A Pending CN111665930A (en) | 2019-03-05 | 2019-03-05 | Multi-mode emotion recognition method and system integrating cloud and edge computing |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111665930A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114155882A (en) * | 2021-11-30 | 2022-03-08 | 浙江大学 | Method and device for judging road rage emotion based on voice recognition |
CN114155882B (en) * | 2021-11-30 | 2023-08-22 | 浙江大学 | Method and device for judging emotion of road anger based on voice recognition |
CN115473932A (en) * | 2022-08-15 | 2022-12-13 | 四川天府新区麓湖小学(成都哈密尔顿麓湖小学) | Emotional state analysis method and device for intelligent education |
CN116405635A (en) * | 2023-06-02 | 2023-07-07 | 山东正中信息技术股份有限公司 | Multi-mode conference recording method and system based on edge calculation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20200915 |