
CN111665930A - Multi-mode emotion recognition method and system integrating cloud and edge computing - Google Patents

Multi-mode emotion recognition method and system integrating cloud and edge computing

Info

Publication number
CN111665930A
CN111665930A (application CN201910089030.7A)
Authority
CN
China
Prior art keywords
module
online
emotion recognition
emotion
local
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910089030.7A
Other languages
Chinese (zh)
Inventor
王春雷
尉迟学彪
毛鹏轩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Rostec Technology Co ltd
Original Assignee
Beijing Rostec Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Rostec Technology Co ltd filed Critical Beijing Rostec Technology Co ltd
Priority to CN201910089030.7A
Publication of CN111665930A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174Facial expression recognition
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/63Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for estimating an emotional state
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/50Network services
    • H04L67/55Push-based network services
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2203/00Indexing scheme relating to G06F3/00 - G06F3/048
    • G06F2203/01Indexing scheme relating to G06F3/01
    • G06F2203/011Emotion or mood input determined on the basis of sensed human body parameters such as pulse, heart rate or beat, temperature of skin, facial expressions, iris, voice pitch, brain activity patterns

Landscapes

  • Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Theoretical Computer Science (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • General Engineering & Computer Science (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Child & Adolescent Psychology (AREA)
  • Hospice & Palliative Care (AREA)
  • Psychiatry (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Acoustics & Sound (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the field of emotion recognition, in particular to a multi-modal emotion recognition method and system fusing cloud and edge computing. The data acquisition module acquires face image and voice data; the edge calculation module computes a local emotion recognition result using facial expression recognition; the network transmission module transmits the voice data to the cloud computing module; the cloud computing module computes an online emotion recognition result using speech emotion recognition and pushes the result to the emotion state judgment module through the network transmission module; the emotion state judgment module then makes a comprehensive judgment of the emotional state from the local and online recognition results. Because only the voice data is transmitted over the network, the method has the characteristics of high response speed and high recognition accuracy.

Description

Multi-mode emotion recognition method and system integrating cloud and edge computing
Technical Field
The invention relates to the field of emotion recognition, in particular to a cloud and edge computing fused multi-modal emotion recognition method and system.
Background
At present, emotion recognition is widely applied in electronic products as a common human-computer interaction technology and is key to realizing affective human-machine interaction. Two modes of emotion recognition are currently in use. In the online mode, the product device collects the user's emotion data, transmits it over the network to a cloud server for recognition, and receives the result back over the network for interactive control. This mode achieves high recognition accuracy, but it requires network support, consumes considerable bandwidth, lengthens service response time, and makes the privacy of video data hard to guarantee. In the local mode, a local recognition module performs the recognition. This mode needs no network and responds quickly, but it places higher demands on local computing resources, and because it relies on a single local recognition module its accuracy is usually low, degrading the user experience.
To solve the above problems, a multi-modal emotion recognition method and system fusing cloud and edge computing is needed, one that achieves high recognition accuracy while maintaining high response speed.
Disclosure of Invention
The invention aims to overcome the defects in the prior art and provide a multi-modal emotion recognition method and system integrating cloud and edge computing.
In order to achieve the purpose, the invention provides the following technical scheme:
a cloud and edge computing fused multimodal emotion recognition system, comprising: the system comprises a data acquisition module, an edge calculation module, a network transmission module, a cloud calculation module and an emotion state judgment module.
As a preferred scheme of the invention, the data acquisition module acquires face image and voice data and pushes the face image data to the edge calculation module; at the same time it checks whether a network connection exists and, if so, pushes the voice data to the cloud computing module through the network transmission module. The edge calculation module computes a local emotion recognition result E_local using facial expression recognition. The cloud computing module computes an online emotion recognition result E_online (initial value 0) using speech emotion recognition and pushes the result to the emotion state judgment module through the network transmission module. The emotion state judgment module computes a comprehensive emotion recognition result E from E_local and E_online by a decision-layer weighted-fusion method, finally realizing a multi-modal comprehensive judgment of the emotional state, where E = w·E_local + (1 − w)·E_online, w ∈ (0, 1] is a weight preset by the system, and w = 1 if and only if E_online = 0.
The invention also provides the following technical scheme:
a multi-modal emotion recognition method fusing cloud and edge computing comprises the following steps:
Step 1, the data acquisition module acquires face image data and voice data, pushes the face image data to the edge calculation module, and checks whether a network connection exists; if so, it pushes the voice data to the cloud computing module through the network transmission module;
Step 2, the edge calculation module performs facial expression recognition on the face image data pushed by the data acquisition module to compute the local emotion recognition result E_local, and pushes E_local to the emotion state judgment module;
Step 3, the cloud computing module performs speech emotion recognition on the voice data pushed by the data acquisition module to compute the online emotion recognition result E_online (initial value 0), and pushes E_online to the emotion state judgment module through the network transmission module;
Step 4, the emotion state judgment module computes the comprehensive emotion recognition result E = w·E_local + (1 − w)·E_online from E_local and E_online by the decision-layer weighted-fusion method, finally realizing a multi-modal comprehensive judgment of the emotional state, where w ∈ (0, 1] is a weight preset by the system and w = 1 if and only if E_online = 0.
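The decision-layer weighted fusion of step 4 can be sketched in a few lines of Python. This is an illustrative sketch, not code from the patent; the function name and the default weight are assumptions for the example.

```python
def fuse_emotion(e_local: float, e_online: float, w: float = 0.6) -> float:
    """Decision-layer weighted fusion: E = w * E_local + (1 - w) * E_online.

    w is a system-preset weight in (0, 1]. When no online result is
    available (E_online == 0, its initial value), w is forced to 1 so
    the decision falls back to the local recognition result alone.
    """
    if not 0 < w <= 1:
        raise ValueError("w must lie in (0, 1]")
    if e_online == 0:  # no cloud result yet -> local-only decision
        w = 1.0
    return w * e_local + (1 - w) * e_online
```

For instance, with no network connection the online result stays at its initial value 0 and the fused score equals the edge-computed local score, which is how the method keeps working offline.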
Compared with the prior art, the invention has the beneficial effects that: edge calculation with facial expression recognition realizes local identification of the emotional state, cloud computing with speech emotion recognition realizes online identification, and the fusion of the two realizes a multi-modal comprehensive judgment of the emotional state. Because the multi-modal emotion recognition process involves network transmission of voice data only, network bandwidth consumption is low, and the method has the advantages of high response speed and high recognition accuracy.
Drawings
FIG. 1 is a functional block diagram of the method.
Fig. 2 is a functional flow diagram of the method.
Detailed Description
The present invention will be described in further detail with reference to examples and embodiments.
Example 1
As shown in fig. 1, a multi-modal emotion recognition system fusing cloud and edge computing includes a data acquisition module, an edge calculation module, a network transmission module, a cloud computing module, and an emotion state judgment module. The data acquisition module acquires and pushes face image and voice data; the edge calculation module computes a local emotion recognition result E_local using facial expression recognition; the cloud computing module computes an online emotion recognition result E_online (initial value 0) using speech emotion recognition and pushes E_online to the emotion state judgment module; the emotion state judgment module computes the comprehensive emotion recognition result E = w·E_local + (1 − w)·E_online from E_local and E_online by the decision-layer weighted-fusion method, finally realizing a multi-modal comprehensive judgment of the emotional state, where w ∈ (0, 1] is a weight preset by the system and w = 1 if and only if E_online = 0. The network transmission module is a general-purpose module compatible with the networks of different carriers (China Mobile, China Telecom, China Unicom, and the like). The data acquisition module uses a camera to acquire face image data and a microphone to acquire voice data.
Example 2
As shown in fig. 2, a multi-modal emotion recognition method fusing cloud and edge computing includes the following four steps:
Step 1, the data acquisition module acquires face image data and voice data and pushes the face image data to the edge calculation module; it also checks whether a network connection exists and, if so, pushes the voice data to the cloud computing module through the network transmission module;
Step 2, the edge calculation module performs facial expression recognition on the face image data pushed by the data acquisition module to compute the local emotion recognition result E_local, and pushes E_local to the emotion state judgment module;
Step 3, the cloud computing module performs speech emotion recognition on the voice data pushed by the data acquisition module to compute the online emotion recognition result E_online (initial value 0), and pushes E_online to the emotion state judgment module through the network transmission module;
Step 4, the emotion state judgment module computes the comprehensive emotion recognition result E = w·E_local + (1 − w)·E_online from E_local and E_online by the decision-layer weighted-fusion method, finally realizing a multi-modal comprehensive judgment of the emotional state, where w ∈ (0, 1] is a weight preset by the system and w = 1 if and only if E_online = 0.
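The four-step flow of Example 2 can be sketched end-to-end as follows. This is a hypothetical Python sketch: the two recognizer functions are placeholders returning fixed scores (a real system would run a facial-expression model at the edge and a speech-emotion model in the cloud), and the default weight is an assumption.

```python
def recognize_expression(frame) -> float:
    """Placeholder for edge-side facial expression recognition -> E_local."""
    return 0.8  # stand-in score

def recognize_speech_emotion(audio) -> float:
    """Placeholder for cloud-side speech emotion recognition -> E_online."""
    return 0.4  # stand-in score

def multimodal_emotion(frame, audio, network_up: bool, w: float = 0.6) -> float:
    e_local = recognize_expression(frame)            # step 2: edge calculation
    # steps 1 and 3: voice data goes to the cloud only if a network exists;
    # otherwise E_online keeps its initial value 0
    e_online = recognize_speech_emotion(audio) if network_up else 0.0
    if e_online == 0:                                # offline: rely on edge only
        w = 1.0
    return w * e_local + (1 - w) * e_online          # step 4: decision-layer fusion
```

Note the design choice this illustrates: only the voice data ever crosses the network, so the image-heavy modality stays on the device, saving bandwidth and preserving the privacy of video data.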

Claims (6)

1. A multi-modal emotion recognition method and system fusing cloud and edge computing, comprising five functional modules: data acquisition, edge calculation, network transmission, cloud computing, and emotion state judgment.
2. The method of claim 1, wherein: the data acquisition module acquires emotion-related face image and voice data and pushes the face image data to the edge calculation module; it also checks whether a network connection exists and, if so, pushes the voice data to the cloud computing module through the network transmission module.
3. The method of claim 1, wherein: the edge calculation module computes the local emotion recognition result E_local, the computation being realized by facial expression recognition; the cloud computing module computes the online emotion recognition result E_online (initial value 0), the computation being realized by speech emotion recognition.
4. The method of claim 1, wherein: the network transmission module transmits the voice data and the online emotion recognition result E_online; the emotion state judgment module computes the comprehensive emotion recognition result E from E_local and E_online.
5. The method of claim 4, wherein: the comprehensive emotion recognition result E is computed by the decision-layer weighted-fusion method, i.e. E = w·E_local + (1 − w)·E_online, where w ∈ (0, 1] is a weight preset by the system and w = 1 if and only if E_online = 0.
6. A multi-modal emotion recognition method fusing cloud and edge computing comprises the following steps:
Step 1, the data acquisition module acquires face image data and voice data and pushes the face image data to the edge calculation module; it also checks whether a network connection exists and, if so, pushes the voice data to the cloud computing module through the network transmission module;
Step 2, the edge calculation module performs facial expression recognition on the face image data pushed by the data acquisition module to compute the local emotion recognition result E_local, and pushes E_local to the emotion state judgment module;
Step 3, the cloud computing module performs speech emotion recognition on the voice data pushed by the data acquisition module to compute the online emotion recognition result E_online, and pushes E_online to the emotion state judgment module through the network transmission module;
Step 4, the emotion state judgment module computes the comprehensive emotion recognition result E from E_local and E_online, i.e. E = w·E_local + (1 − w)·E_online, where w ∈ (0, 1] is a weight preset by the system and w = 1 if and only if E_online = 0.
CN201910089030.7A 2019-03-05 2019-03-05 Multi-mode emotion recognition method and system integrating cloud and edge computing Pending CN111665930A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910089030.7A CN111665930A (en) 2019-03-05 2019-03-05 Multi-mode emotion recognition method and system integrating cloud and edge computing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910089030.7A CN111665930A (en) 2019-03-05 2019-03-05 Multi-mode emotion recognition method and system integrating cloud and edge computing

Publications (1)

Publication Number Publication Date
CN111665930A true CN111665930A (en) 2020-09-15

Family

ID=72381228

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910089030.7A Pending CN111665930A (en) 2019-03-05 2019-03-05 Multi-mode emotion recognition method and system integrating cloud and edge computing

Country Status (1)

Country Link
CN (1) CN111665930A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114155882A (en) * 2021-11-30 2022-03-08 浙江大学 Method and device for judging road rage emotion based on voice recognition
CN114155882B (en) * 2021-11-30 2023-08-22 浙江大学 Method and device for judging emotion of road anger based on voice recognition
CN115473932A (en) * 2022-08-15 2022-12-13 四川天府新区麓湖小学(成都哈密尔顿麓湖小学) Emotional state analysis method and device for intelligent education
CN116405635A (en) * 2023-06-02 2023-07-07 山东正中信息技术股份有限公司 Multi-mode conference recording method and system based on edge calculation

Similar Documents

Publication Publication Date Title
CN106339094B (en) Interactive remote expert cooperation examination and repair system and method based on augmented reality
CN102831176B (en) The method of commending friends and server
CN111665930A (en) Multi-mode emotion recognition method and system integrating cloud and edge computing
CN102708865A (en) Method, device and system for voice recognition
CN107731231B (en) Method for supporting multi-cloud-end voice service and storage device
US20230196586A1 (en) Video personnel re-identification method based on trajectory fusion in complex underground space
CN106485476A (en) A kind of staff attendance system based on video
US11503110B2 (en) Method for presenting schedule reminder information, terminal device, and cloud server
EP3776171A1 (en) Non-disruptive nui command
WO2024032159A1 (en) Speaking object detection in multi-human-machine interaction scenario
CN113014960A (en) Method, device and storage medium for online video production
CN113706673B (en) Cloud rendering frame platform applied to virtual augmented reality technology
CN102521683A (en) Student management method of distance education
CN111835547A (en) Quality of service (QoS) management method and related equipment
CN113656125A (en) Virtual assistant generation method and device and electronic equipment
CN107783650A (en) A kind of man-machine interaction method and device based on virtual robot
CN110197230B (en) Method and apparatus for training a model
CN111885351A (en) A screen display method, device, terminal device and storage medium
CN115017352B (en) Mobile phone car machine interaction system based on image recognition
CN107770474B (en) Sound processing method and device, terminal equipment and storage medium
CN112637643B (en) Networking method and device of mobile terminal, terminal equipment and storage medium
CN113033475B (en) Target object tracking method, related device and computer program product
CN111327659A (en) User queuing optimization method based on edge calculation
CN111031354B (en) Multimedia playing method, device and storage medium
CN115022722A (en) Video monitoring method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200915
