
CN212342269U - Emotion monitoring system based on sound frequency analysis - Google Patents


Info

Publication number
CN212342269U
Authority
CN
China
Prior art keywords
speech
emotion
voice
mfcc
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202021353381.9U
Other languages
Chinese (zh)
Inventor
Ding Chen
Liu Yuhua
Chen Lei
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Qunzhi Intelligent Technology Co., Ltd.
Original Assignee
Suzhou Qunzhi Intelligent Technology Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Qunzhi Intelligent Technology Co., Ltd.
Priority to CN202021353381.9U
Application granted
Publication of CN212342269U
Current legal status: Active

Landscapes

  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)

Abstract

The invention provides an emotion monitoring system based on sound frequency analysis, comprising: an array microphone that collects the speech of a test subject and is connected to a PC, on which run: a speech signal preprocessing algorithm, which processes the speech features using MFCC speech emotion feature parameters; a speech emotion feature extraction algorithm, which uses an LSTM to obtain a complete speech sequence, performs similarity calculations on the LSTM output sequence, and determines the emotion weight that each frame of the speech carries for the test subject; and a speech emotion recognition algorithm, which recognizes the emotional state of the test subject from small changes in the high-frequency (RHFR) and lower-frequency (RLFR) components of the speech.

Description

Emotion monitoring system based on sound frequency analysis
Technical Field
The invention relates to the fields of biometric recognition and artificial intelligence, and can be applied in special industries, such as public security interrogations, disciplinary inspection interviews, and other business scenarios that require detecting changes in a subject's emotional state.
Background
Human speech production is a very complex process that requires a large number of muscles and body organs working together, synchronized with precise timing. First, the brain interprets a given situation and evaluates the effect the speech should have. Then, if the person decides to speak, air is pushed from the lungs up to the vocal cords, causing them to vibrate at particular frequencies and produce sound. The vibrating air continues to flow past the brain-controlled tongue, teeth, and lips, creating a sound stream that becomes the words and phrases we understand. The brain closely monitors this process to ensure that the emitted sound expresses the intended meaning, is intelligible, and can be heard by listeners. Because of this uninterrupted monitoring, every "event" of brain activity is reflected in the speech stream. The core of the system is a long short-term memory (LSTM) network that precisely monitors small changes in the high-frequency (RHFR) and lower-frequency (RLFR) components of the subject's voice, and from these recognizes and monitors changes in the subject's emotion.
Disclosure of Invention
The main purpose of the invention is to provide an emotion monitoring system based on sound frequency analysis. The speech of a subject is collected by an array microphone, and the speech signal is preprocessed using MFCC speech emotion feature parameters. The features extracted by MFCC are then input into an LSTM model, which produces a complete speech sequence; similarity calculations are performed on the LSTM output sequence to learn the emotion weight of each speech frame relative to the subject. Finally, the resulting information is classified into emotions by a fully connected layer, so that the subject's speech emotion is recognized and monitored.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the invention and, together with the description, serve to explain the invention without limiting it;
FIG. 1: an implementation logic diagram of the present invention.
Detailed Description
In the embodiment shown in FIG. 1, the emotion monitoring system based on sound frequency analysis comprises an array microphone, a speech signal preprocessing algorithm, a speech emotion feature extraction algorithm, and a speech emotion recognition algorithm.
The array microphone collects the test subject's voice data and transmits it to the speech signal preprocessing algorithm.
The speech signal preprocessing algorithm uses MFCC speech emotion feature parameters to process the speech features (acoustic, prosodic, and voice-quality features). MFCC is a set of cepstral coefficients extracted on the Mel frequency scale; it is widely used in automatic speech and speaker recognition because it simulates the characteristics of the human ear, building its feature parameters on human auditory properties.
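As a concrete illustration of this preprocessing step, the sketch below extracts a frame-by-frame MFCC matrix from one recorded utterance. The patent names only the MFCC features themselves; the use of the librosa library, the 16 kHz sample rate, and the 13-coefficient, 32 ms window settings are illustrative assumptions, not part of the claimed system.

```python
# Minimal MFCC extraction sketch. librosa and every parameter value here
# are assumptions for illustration; the patent specifies only that MFCC
# speech emotion feature parameters are used.
import librosa
import numpy as np

def extract_mfcc(wav_path: str, sr: int = 16000, n_mfcc: int = 13) -> np.ndarray:
    """Return a (frames, n_mfcc) matrix of MFCCs for one utterance."""
    y, sr = librosa.load(wav_path, sr=sr)      # resample to a fixed rate
    mfcc = librosa.feature.mfcc(
        y=y, sr=sr, n_mfcc=n_mfcc,
        n_fft=512,                             # ~32 ms analysis window at 16 kHz
        hop_length=256,                        # 50% overlap between frames
    )
    return mfcc.T.astype(np.float32)           # one row per speech frame
```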
The speech emotion feature extraction algorithm uses an LSTM (long short-term memory) model. The LSTM is a recurrent neural network (RNN) that is more effective for large-scale acoustic modeling: each layer of the network models the long-term dependencies in the speech sequence, giving high overall recognition performance.
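The following sketch shows what such an LSTM stage might look like. PyTorch, the layer count, and the hidden size are assumptions; the patent names neither a framework nor dimensions.

```python
# Hypothetical LSTM feature-extraction stage (PyTorch assumed; sizes
# illustrative). It consumes the (frames, n_mfcc) matrices produced by
# the preprocessing step and emits one hidden state per speech frame.
import torch
import torch.nn as nn

class SpeechEncoder(nn.Module):
    def __init__(self, n_mfcc: int = 13, hidden: int = 128, layers: int = 2):
        super().__init__()
        self.lstm = nn.LSTM(n_mfcc, hidden, num_layers=layers, batch_first=True)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, n_frames, n_mfcc) -> outputs: (batch, n_frames, hidden)
        outputs, _ = self.lstm(frames)
        return outputs
```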
The speech emotion recognition algorithm uses the sound frequencies to compute a correlation weight between each time segment of the speech signal and the emotional features, then compares the correlation weights of the different segments and selects the higher-weighted segments for recognition, thereby recognizing the speech emotion.
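One common way to realize this per-frame weighting is an attention-style pooling over the LSTM outputs, sketched below. The specific scoring function and the four-class output are assumptions, since the patent describes the weighting and the fully connected classifier only in general terms.

```python
# Attention-style frame weighting and classification sketch (an assumed
# realization of the per-frame emotion weights described above).
import torch
import torch.nn as nn

class EmotionClassifier(nn.Module):
    def __init__(self, hidden: int = 128, n_emotions: int = 4):
        super().__init__()
        self.score = nn.Linear(hidden, 1)              # relevance score per frame
        self.classify = nn.Linear(hidden, n_emotions)  # fully connected output layer

    def forward(self, frame_states: torch.Tensor):
        # frame_states: (batch, n_frames, hidden) from the LSTM stage
        weights = torch.softmax(self.score(frame_states), dim=1)  # emotion weight per frame
        pooled = (weights * frame_states).sum(dim=1)   # emphasizes high-weight frames
        return self.classify(pooled), weights.squeeze(-1)

# Example on stand-in data (200 frames, hidden size 128):
frame_states = torch.randn(1, 200, 128)
logits, frame_weights = EmotionClassifier()(frame_states)  # (1, 4) and (1, 200)
```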
Although the invention has been described with reference to specific examples, the description of those examples does not limit its scope. Referring to this description, those skilled in the art can easily make various modifications to, or combinations of, the embodiments without departing from the spirit and scope of the invention; such modifications and combinations also fall within the scope of the invention.

Claims (4)

1. An emotion monitoring system based on sound frequency analysis, comprising:
an array microphone, which collects the speech of a test subject within 3 meters and transmits the audio file to a PC, on which run:
a speech signal preprocessing algorithm, which processes the speech features using MFCC speech emotion feature parameters, the MFCC being a set of cepstral coefficients extracted on the Mel frequency scale and a feature widely used in automatic speech and speaker recognition;
a speech emotion feature extraction algorithm, which inputs the sound features extracted by the MFCC into an LSTM model and obtains a complete speech sequence through the LSTM model;
a speech emotion recognition algorithm, which performs similarity calculations on the speech sequence obtained from the LSTM model, learns the emotion weight of each frame of the speech relative to the test subject, and classifies the resulting information into emotions by its high-frequency and low-frequency content, thereby recognizing and monitoring the test subject's speech emotion; and
a speech emotion display interface, which shows the test subject's speech waveform on the user interface and simultaneously displays the changes in speech emotion in several chart forms, including line charts, bar charts, scatter plots, and dashboards.
2. The emotion monitoring system based on sound frequency analysis of claim 1, wherein, to address the non-stationary, random, and time-varying nature of speech signals, the speech signal preprocessing algorithm selects MFCC as the speech emotion feature so as to increase the utility of the feature parameters and reduce the complexity of feature extraction; the MFCC is a set of cepstral coefficients extracted on the Mel frequency scale, widely used in automatic speech and speaker recognition, whose feature parameters are constructed by simulating the characteristics of the human ear through human auditory properties.
3. The emotion monitoring system based on sound frequency analysis of claim 1, wherein the speech emotion feature extraction algorithm selects the LSTM, an improvement on the recurrent neural network (RNN); because the LSTM passes state cyclically within its own network, its time-series structure accepts a wider range of inputs and can describe dynamic temporal behavior.
4. The system of claim 1, wherein the speech emotion feature extraction algorithm accurately monitors subtle changes in the voice in both the high frequencies (RHFR), which reflect highly excited or intense emotional states, and the lower frequencies (RLFR), which reflect stress states, level of thinking, and other cognitive processes.
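As an illustration of the band monitoring described in claim 4 above, the sketch below tracks the share of spectral energy in a low band and a high band over time. The patent names RHFR and RLFR but does not define the band edges; the 0-1 kHz and 2-8 kHz splits, like every other value here, are assumptions for illustration only.

```python
# Hypothetical per-frame low-band/high-band energy shares (stand-ins for
# RLFR/RHFR). Band edges are assumed; the patent does not define them.
import numpy as np

def band_energy_ratios(y: np.ndarray, sr: int = 16000,
                       frame: int = 512, hop: int = 256) -> np.ndarray:
    window = np.hanning(frame)
    freqs = np.fft.rfftfreq(frame, d=1.0 / sr)
    low = freqs < 1000                         # assumed low band: 0-1 kHz
    high = (freqs >= 2000) & (freqs <= 8000)   # assumed high band: 2-8 kHz
    ratios = []
    for start in range(0, len(y) - frame, hop):
        spec = np.abs(np.fft.rfft(y[start:start + frame] * window))
        total = spec.sum() + 1e-10             # avoid division by zero on silence
        ratios.append((spec[low].sum() / total, spec[high].sum() / total))
    return np.asarray(ratios)                  # (n_frames, 2): low share, high share
```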
CN202021353381.9U · Priority date 2020-07-11 · Filing date 2020-07-11 · Emotion monitoring system based on sound frequency analysis · Active · Publication CN212342269U

Priority Applications (1)

Application Number: CN202021353381.9U · Priority Date: 2020-07-11 · Filing Date: 2020-07-11 · Title: Emotion monitoring system based on sound frequency analysis

Applications Claiming Priority (1)

Application Number: CN202021353381.9U · Priority Date: 2020-07-11 · Filing Date: 2020-07-11 · Title: Emotion monitoring system based on sound frequency analysis

Publications (1)

Publication Number: CN212342269U · Publication Date: 2021-01-12

Family

Family ID: 74081541

Family Applications (1)

Application Number: CN202021353381.9U · Title: Emotion monitoring system based on sound frequency analysis · Priority Date: 2020-07-11 · Filing Date: 2020-07-11 · Status: Active

Country Status (1)

Country: CN · Publication: CN212342269U

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number · Priority date · Publication date · Assignee · Title
CN113205009A * · 2021-04-16 · 2021-08-03 · Guangzhou Langguo Electronic Technology Co., Ltd. · Animal emotion recognition method and device and storage medium
CN113892952A * · 2021-06-09 · 2022-01-07 · Shanghai Liangxiang Intelligent Engineering Co., Ltd. · An intelligent research and judgment system


Legal Events

Date Code Title Description
GR01 Patent grant