CN112232276A - An emotion detection method and device based on speech recognition and image recognition
- Publication number
- CN112232276A (application CN202011213188.XA)
- Authority
- CN
- China
- Prior art keywords
- expression
- image
- recognition
- emotion
- scene
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/174—Facial expression recognition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/63—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for estimating an emotional state
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- Multimedia (AREA)
- Human Computer Interaction (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Acoustics & Sound (AREA)
- General Physics & Mathematics (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Theoretical Computer Science (AREA)
- Child & Adolescent Psychology (AREA)
- Hospice & Palliative Care (AREA)
- Psychiatry (AREA)
- Signal Processing (AREA)
- Image Processing (AREA)
- Image Analysis (AREA)
Abstract
The invention relates to an emotion detection method and device based on speech recognition and image recognition. The method acquires a selfie video of the user to be tested, together with the actual scene in which the video was recorded, and processes the video to obtain an image signal and a speech signal. The image signal is processed to obtain an expression change trend, and the speech signal is processed to obtain a preliminary emotion result for the speech in that scene. Finally, the expression change trend and the preliminary emotion result are fused to obtain the user's final emotion result. Because the method is automatic, it is not affected by the subjective factors of manual detection, which improves detection accuracy; it needs no dedicated detection personnel, which reduces labor cost; and it processes quickly. With suitably configured processing equipment, multiple selfie videos can also be processed at the same time, which further improves efficiency.
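The final fusion step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state how the expression change trend is computed or how it is combined with the preliminary emotion result, so everything below is an assumption made for illustration: the three-label emotion set, the per-frame expression scores in the range -1 to 1, the half-versus-half trend test, the `tol` threshold, and the fusion rule itself. The function names `expression_trend`, `fuse`, and `detect_emotion` are hypothetical and do not come from the patent.

```python
from typing import Sequence

# Hypothetical label set; the abstract does not enumerate emotion categories.
LABELS = {"positive", "neutral", "negative"}


def expression_trend(frame_scores: Sequence[float], tol: float = 0.05) -> str:
    """Summarise per-frame expression scores as 'rising', 'falling' or 'flat'.

    Assumption: each score is in [-1, 1], where higher means a more
    positive facial expression. The trend is the difference between the
    mean of the second half and the mean of the first half of the clip.
    """
    if len(frame_scores) < 2:
        return "flat"
    half = len(frame_scores) // 2
    first = sum(frame_scores[:half]) / half
    second = sum(frame_scores[half:]) / (len(frame_scores) - half)
    delta = second - first
    if delta > tol:
        return "rising"
    if delta < -tol:
        return "falling"
    return "flat"


def fuse(trend: str, preliminary: str) -> str:
    """Combine the expression trend with the speech-based preliminary result.

    Assumed rule: a trend that agrees with (or does not contradict) the
    preliminary result decides the label; a direct contradiction falls
    back to 'neutral'; a flat trend keeps the preliminary result.
    """
    if preliminary not in LABELS:
        raise ValueError(f"unknown preliminary label: {preliminary!r}")
    if trend == "rising":
        return "neutral" if preliminary == "negative" else "positive"
    if trend == "falling":
        return "neutral" if preliminary == "positive" else "negative"
    return preliminary


def detect_emotion(frame_scores: Sequence[float], preliminary: str) -> str:
    """Run the two-branch pipeline end to end on already-extracted inputs."""
    return fuse(expression_trend(frame_scores), preliminary)


if __name__ == "__main__":
    # Expression improves over the clip; speech alone was judged neutral.
    print(detect_emotion([0.1, 0.2, 0.6, 0.8], "neutral"))  # prints "positive"
```

The sketch starts from inputs that are already extracted: splitting the selfie video into image and speech signals, scoring facial expressions per frame, and deriving the scene-dependent preliminary result from the speech are outside its scope.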
Description
Claims (10)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202011213188.XA CN112232276B (en) | 2020-11-04 | 2020-11-04 | An emotion detection method and device based on speech recognition and image recognition |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202011213188.XA CN112232276B (en) | 2020-11-04 | 2020-11-04 | An emotion detection method and device based on speech recognition and image recognition |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN112232276A | 2021-01-15 |
| CN112232276B | 2023-10-13 |
Family
ID=74121979
Family Applications (1)
| Application Number | Status | Publication | Priority Date | Filing Date | Title |
|---|---|---|---|---|---|
| CN202011213188.XA | Active | CN112232276B (en) | 2020-11-04 | 2020-11-04 | An emotion detection method and device based on speech recognition and image recognition |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN112232276B (en) |
Cited By (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112990301A (en) * | 2021-03-10 | 2021-06-18 | 深圳市声扬科技有限公司 | Emotion data annotation method and device, computer equipment and storage medium |
| CN112992148A (en) * | 2021-03-03 | 2021-06-18 | 中国工商银行股份有限公司 | Method and device for recognizing voice in video |
| CN114065742A (en) * | 2021-11-19 | 2022-02-18 | 马上消费金融股份有限公司 | A text detection method and device |
| CN115795095A (en) * | 2022-11-23 | 2023-03-14 | 中国石油大学(华东) | Automatic music matching method based on video content |
| CN118428343A (en) * | 2024-07-03 | 2024-08-02 | 广州讯鸿网络技术有限公司 | Full-media interactive intelligent customer service interaction method and system |
| CN118608742A (en) * | 2024-08-08 | 2024-09-06 | 深圳市欧冠微电子科技有限公司 | Viewing angle switching control method, device, medium and electronic device for display panel |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2020125386A1 (en) * | 2018-12-18 | 2020-06-25 | 深圳壹账通智能科技有限公司 | Expression recognition method and apparatus, computer device, and storage medium |
| WO2020135194A1 (en) * | 2018-12-26 | 2020-07-02 | 深圳Tcl新技术有限公司 | Emotion engine technology-based voice interaction method, smart terminal, and storage medium |
| CN111681681A (en) * | 2020-05-22 | 2020-09-18 | 深圳壹账通智能科技有限公司 | Voice emotion recognition method and device, electronic equipment and storage medium |
| CN111694959A (en) * | 2020-06-08 | 2020-09-22 | 谢沛然 | Network public opinion multi-mode emotion recognition method and system based on facial expressions and text information |
Patent Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2020125386A1 (en) * | 2018-12-18 | 2020-06-25 | 深圳壹账通智能科技有限公司 | Expression recognition method and apparatus, computer device, and storage medium |
| WO2020135194A1 (en) * | 2018-12-26 | 2020-07-02 | 深圳Tcl新技术有限公司 | Emotion engine technology-based voice interaction method, smart terminal, and storage medium |
| CN111368609A (en) * | 2018-12-26 | 2020-07-03 | 深圳Tcl新技术有限公司 | Voice interaction method, intelligent terminal and storage medium based on emotion engine technology |
| CN111681681A (en) * | 2020-05-22 | 2020-09-18 | 深圳壹账通智能科技有限公司 | Voice emotion recognition method and device, electronic equipment and storage medium |
| CN111694959A (en) * | 2020-06-08 | 2020-09-22 | 谢沛然 | Network public opinion multi-mode emotion recognition method and system based on facial expressions and text information |
Non-Patent Citations (3)
| Title |
|---|
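| Wenbin Zhou et al.: "Deep Learning-Based Emotion Recognition from Real-Time Videos", HCII 2020: Human-Computer Interaction. Multimodal and Natural Interaction * |
| Chen Shizhe; Wang Shuai; Jin Qin: "Multimodal emotion recognition in multi-cultural scenarios", Journal of Software (软件学报), no. 04 * |
| Rao Yuan; Wu Lianwei; Wang Yiming; Feng Cong: "Research progress of affective computing technology based on semantic analysis", Journal of Software (软件学报), no. 08 * |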
Cited By (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112992148A (en) * | 2021-03-03 | 2021-06-18 | 中国工商银行股份有限公司 | Method and device for recognizing voice in video |
| CN112990301A (en) * | 2021-03-10 | 2021-06-18 | 深圳市声扬科技有限公司 | Emotion data annotation method and device, computer equipment and storage medium |
| CN114065742A (en) * | 2021-11-19 | 2022-02-18 | 马上消费金融股份有限公司 | A text detection method and device |
| CN114065742B (en) * | 2021-11-19 | 2023-08-25 | 马上消费金融股份有限公司 | Text detection method and device |
| CN115795095A (en) * | 2022-11-23 | 2023-03-14 | 中国石油大学(华东) | Automatic music matching method based on video content |
| CN118428343A (en) * | 2024-07-03 | 2024-08-02 | 广州讯鸿网络技术有限公司 | Full-media interactive intelligent customer service interaction method and system |
| CN118608742A (en) * | 2024-08-08 | 2024-09-06 | 深圳市欧冠微电子科技有限公司 | Viewing angle switching control method, device, medium and electronic device for display panel |
Also Published As
| Publication number | Publication date |
|---|---|
| CN112232276B (en) | 2023-10-13 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| CN115050077B (en) | Emotion recognition method, device, equipment and storage medium | |
| Tao et al. | End-to-end audiovisual speech recognition system with multitask learning | |
| CN112686048B (en) | Emotion recognition method and device based on fusion of voice, semantics and facial expressions | |
| CN112232276A (en) | A kind of emotion detection method and device based on speech recognition and image recognition | |
| CN110728997B (en) | A Multimodal Depression Detection System Based on Context Awareness | |
| CN109658923B (en) | Speech quality inspection method, equipment, storage medium and device based on artificial intelligence | |
| CN117765981A (en) | An emotion recognition method and system based on cross-modal fusion of speech and text | |
| US20170345424A1 (en) | Voice dialog device and voice dialog method | |
| CN108305618B (en) | Voice acquisition and search method, smart pen, search terminal and storage medium | |
| CN111986675A (en) | Voice conversation method, device and computer readable storage medium | |
| CN115438725A (en) | State detection method, device, equipment and storage medium | |
| CN118658467A (en) | A cheating detection method, device, equipment, storage medium and product | |
| CN118380144A (en) | A feature extraction evaluation system and method based on multimodal deep learning | |
| CN112597889A (en) | Emotion processing method and device based on artificial intelligence | |
| CN118038897A (en) | Voice communication quality evaluation method, device, server and storage medium | |
| CN112951274A (en) | Voice similarity determination method and device, and program product | |
| CN119475252B (en) | A multimodal emotion recognition method | |
| CN114267324B (en) | Speech generation method, device, equipment and storage medium | |
| CN120105346A (en) | A multi-modal data acquisition and feature fusion system, method and server | |
| CN117831575B (en) | Intelligent business analysis method, system and electronic device based on big data | |
| CN112434953A (en) | Customer service personnel assessment method and device based on computer data processing | |
| CN115914742B (en) | Character recognition method, device and equipment for video captions and storage medium | |
| CN115831153B (en) | Pronunciation quality testing methods | |
| CN115171673B (en) | A communication assistance method, device and storage medium based on role portrait | |
| CN118016273A (en) | Disease auxiliary diagnosis method, device, equipment and readable storage medium |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | TA01 | Transfer of patent application right | Effective date of registration: 2023-05-26; Address after: No. 16-44, No. 10A-10C, 12A, 12B, 13A, 13B, 15-18, Phase II of Wuyue Plaza Project, east of Zhengyang Street and south of Haoyue Road, Lvyuan District, Changchun City, Jilin Province, 130000; Applicant after: Jilin Huayuan Network Technology Co.,Ltd.; Address before: 450000 Wenhua Road, Jinshui District, Zhengzhou City, Henan Province; Applicant before: Zhao Zhen |
| | TA01 | Transfer of patent application right | Effective date of registration: 2023-09-13; Address after: Room 1001, 1st floor, building B, 555 Dongchuan Road, Minhang District, Shanghai; Applicant after: Shanghai Enterprise Information Technology Co.,Ltd.; Address before: No. 16-44, No. 10A-10C, 12A, 12B, 13A, 13B, 15-18, Phase II of Wuyue Plaza Project, east of Zhengyang Street and south of Haoyue Road, Lvyuan District, Changchun City, Jilin Province, 130000; Applicant before: Jilin Huayuan Network Technology Co.,Ltd. |
| | GR01 | Patent grant | |
| | PE01 | Entry into force of the registration of the contract for pledge of patent right | Denomination of invention: An emotion detection method and device based on speech recognition and image recognition; Granted publication date: 2023-10-13; Pledgee: Agricultural Bank of China Limited Shanghai Huangpu Sub branch; Pledgor: Shanghai Enterprise Information Technology Co.,Ltd.; Registration number: Y2024310000041 |
| | PC01 | Cancellation of the registration of the contract for pledge of patent right | Granted publication date: 2023-10-13; Pledgee: Agricultural Bank of China Limited Shanghai Huangpu Sub branch; Pledgor: Shanghai Enterprise Information Technology Co.,Ltd.; Registration number: Y2024310000041 |
| | EE01 | Entry into force of recordation of patent licensing contract | Application publication date: 2021-01-15; Assignee: Shanghai Quche Intelligent Technology Co.,Ltd.; Assignor: Shanghai Enterprise Information Technology Co.,Ltd.; Contract record no.: X2025980014762; Denomination of invention: An emotion detection method and device based on speech recognition and image recognition; Granted publication date: 2023-10-13; License type: Common License; Record date: 2025-07-23 |
