
CN113729707A - An emotion recognition method based on FECNN-LSTM multimodal fusion of eye movement and PPG - Google Patents


Info

Publication number
CN113729707A
CN113729707A
Authority
CN
China
Prior art keywords
data
features
ppg
eye movement
video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111037434.5A
Other languages
Chinese (zh)
Inventor
陶小梅
陈心怡
周颖慧
鲍金笛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guilin University of Technology
Original Assignee
Guilin University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guilin University of Technology filed Critical Guilin University of Technology
Priority to CN202111037434.5A
Publication of CN113729707A
Legal status: Pending

Classifications

    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B 5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B 5/16 Devices for psychotechnics; Testing reaction times; Devices for evaluating the psychological state
    • A61B 5/165 Evaluating the state of mind, e.g. depression, anxiety
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B 3/00 Apparatus for testing the eyes; Instruments for examining the eyes
    • A61B 3/10 Objective types, i.e. instruments for examining the eyes independent of the patients' perceptions or reactions
    • A61B 3/11 Objective types, i.e. instruments for examining the eyes independent of the patients' perceptions or reactions for measuring interpupillary distance or diameter of pupils
    • A61B 3/112 Objective types, i.e. instruments for examining the eyes independent of the patients' perceptions or reactions for measuring interpupillary distance or diameter of pupils for measuring diameter of pupils
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B 3/00 Apparatus for testing the eyes; Instruments for examining the eyes
    • A61B 3/10 Objective types, i.e. instruments for examining the eyes independent of the patients' perceptions or reactions
    • A61B 3/113 Objective types, i.e. instruments for examining the eyes independent of the patients' perceptions or reactions for determining or recording eye movement
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B 5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B 5/02 Detecting, measuring or recording for evaluating the cardiovascular system, e.g. pulse, heart rate, blood pressure or blood flow
    • A61B 5/024 Measuring pulse rate or heart rate
    • A61B 5/02416 Measuring pulse rate or heart rate using photoplethysmograph signals, e.g. generated by infrared radiation
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B 5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B 5/103 Measuring devices for testing the shape, pattern, colour, size or movement of the body or parts thereof, for diagnostic purposes
    • A61B 5/11 Measuring movement of the entire body or parts thereof, e.g. head or hand tremor or mobility of a limb
    • A61B 5/1103 Detecting muscular movement of the eye, e.g. eyelid movement
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B 5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B 5/72 Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B 5/7203 Signal processing specially adapted for physiological signals or for diagnostic purposes for noise prevention, reduction or removal
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B 5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B 5/72 Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B 5/7235 Details of waveform analysis
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B 5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B 5/72 Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B 5/7235 Details of waveform analysis
    • A61B 5/725 Details of waveform analysis using specific filters therefor, e.g. Kalman or adaptive filters
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B 5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B 5/72 Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B 5/7235 Details of waveform analysis
    • A61B 5/7264 Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems
    • A61B 5/7267 Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems involving training the classification device

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Animal Behavior & Ethology (AREA)
  • Veterinary Medicine (AREA)
  • Public Health (AREA)
  • General Health & Medical Sciences (AREA)
  • Surgery (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Medical Informatics (AREA)
  • Pathology (AREA)
  • Artificial Intelligence (AREA)
  • Physiology (AREA)
  • Psychiatry (AREA)
  • Signal Processing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Ophthalmology & Optometry (AREA)
  • Cardiology (AREA)
  • Child & Adolescent Psychology (AREA)
  • Mathematical Physics (AREA)
  • Fuzzy Systems (AREA)
  • Developmental Disabilities (AREA)
  • Educational Technology (AREA)
  • Hospice & Palliative Care (AREA)
  • Psychology (AREA)
  • Social Psychology (AREA)
  • Evolutionary Computation (AREA)
  • Human Computer Interaction (AREA)
  • Dentistry (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)

Abstract


The invention discloses an emotion recognition method based on FECNN-LSTM multimodal fusion of eye movement and PPG. While learners watch instructional video stimuli, eye tracking and photoplethysmography (PPG) are used to obtain eye movement information (pupil diameter, blinks, fixations, and saccades) together with heart rate, heart rate variability, and peak-interval signals. The relationship between learners' emotional states and these eye movement and physiological signals during online learning is studied. Principal component analysis is used to select the eye movement, heart rate, heart rate variability, and peak-interval features most relevant to the learner's affective state. Feature-level fusion then produces shallow features; after normalization, an FECNN network extracts deep features, which are fused with the shallow features at the feature level. A long short-term memory (LSTM) network, together with random forest (RF), K-nearest neighbors (KNN), multilayer perceptron (MLP), and support vector machine (SVM) classifiers, then performs classification of four emotions: interest, confusion, boredom, and happiness.

Figure 202111037434

Description

Emotion recognition method based on FECNN-LSTM multimodal fusion of eye movement and PPG
This research was supported by a National Natural Science Foundation of China project (No. 61906051), a Guangxi Natural Science Foundation project (No. 2018GXNSFBA050029), and a doctoral research start-up fund of Guilin University of Technology (GUTQDJJ2005015).
Technical Field
The invention relates to the field of emotion recognition, and in particular to an emotion recognition method for video learning based on the fusion of eye movement and PPG multimodal signals and a long short-term memory network.
Background
With the rapid development of artificial intelligence, emotional intelligence is also receiving growing attention from researchers. Affective computing endows a computer with the ability to recognize, understand, express, and adapt to human emotion, so that it can perceive the user's emotional state and respond correctly in time. Emotion recognition is one of the key problems in affective computing research and is important in many scenarios such as human-computer interaction. Current emotion recognition studies mainly use speech, text, facial expression, physiological signals (such as electroencephalography (EEG), electrodermal activity (EDA), electromyography (EMG), and photoplethysmography (PPG)), and multimodal signal fusion. Facial expression, speech, and other external expressions are easy to hide or disguise. In contrast, changes in physiological signals are generated spontaneously by the human physiological system, are not controlled by a person's subjective will, and can provide an accurate and reliable basis for emotion recognition. Moreover, with the development of technology, equipment for acquiring physiological signals has matured and is portable, non-invasive, and stable, so emotion recognition research based on physiological signals has great practical value. Selecting appropriate emotional features and classification methods, however, requires further research.
PPG (photoplethysmography) is a volumetric measurement method that uses optical techniques to measure blood flow and changes in blood volume. Physiological indices related to emotional change, such as heart rate (HR), inter-beat (peak) intervals, and heart rate variability (HRV), can be calculated from the PPG signal. HRV, the variation of the time intervals between consecutive heartbeats, is an important index of individual emotional and psychological state and can represent changes of emotional state well. In addition, compared with physiological signals such as EEG and respiration, pulse signals are more convenient to acquire and contain richer emotional characteristics. The time-frequency-domain features, deep-level features, and heart-rate-related features of PPG are commonly used for emotion classification in research.
Vision is a direct channel through which people obtain information, and eye movement information can objectively reflect the information-processing mechanisms of the human brain. Human cognitive processing depends largely on the visual system, with about 80% to 90% of external information obtained through the eyes. Learners' eye movement information during video learning is rich, and eye movement data can be obtained through non-invasive eye tracking without interfering with the learning process. With the popularization of eye tracking technology, eye movement data visualization is developing rapidly in both theory and application; there are 4 main visualization methods: scan path, heat map, area of interest, and three-dimensional space. The spatio-temporal characteristics of eye movement are physiological and behavioral expressions of the visual information extraction process, are directly or indirectly related to human psychological activity, and can truly reflect learners' psychological state, engagement, and cognitive load.
However, emotion recognition from single-modality physiological signals has certain limitations, while physiological signals of different modalities are complementary and correlated. Emotion classification is therefore performed by fusing the time-frequency-domain features of the bimodal eye movement and PPG signals with deep features.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: aiming at the existing shortcomings, the invention provides an emotion recognition method based on FECNN-LSTM multimodal fusion of eye movement and PPG, comprising the following steps.
The technical scheme of the invention is as follows:
step 1: and self-establishing an eye movement and PPG multi-mode database, and acquiring eye movement and PPG data by taking the learning video as a stimulation material.
Step 2: marking the collected eye movement data and physiological signal data, adopting a discrete emotion marking model for marking, and dividing emotion marking words into four emotion states of interest, happiness, confusion and chatlessness; pre-processing the acquired eye movement data, the pre-processing comprising: high-quality data screening, data cleaning and data denoising; because the PPG original data can be interfered by electromagnetic interference, illumination influence, motion artifact and the like to generate noise in the acquisition process, and the effective band pass of the PPG signal is between 0.8 Hz and 10Hz, the threshold value of the high-pass filter is set to be 1Hz to filter drift generated by the signal at the low frequency, and the threshold value of the low-pass filter is set to be 10 to filter noise interference higher than 10 Hz.
Step 3: Divide the denoised data into a training set and a validation set at a ratio of 8:2.
Step 4: Convert the eye movement and PPG data of the preprocessed dataset into UTF-8 text, and construct datasets with different time-window lengths: 5 s, 10 s, and 15 s.
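The time-window construction in step 4 can be sketched as non-overlapping segmentation; the helper name and the 100 Hz sampling rate are assumptions:

```python
import numpy as np

def segment_windows(signal, fs, window_s):
    """Split a 1-D signal into non-overlapping windows of `window_s` seconds,
    dropping the incomplete tail (illustrative helper)."""
    n = int(fs * window_s)
    k = len(signal) // n
    return np.reshape(signal[:k * n], (k, n))

fs = 100               # assumed sampling rate
sig = np.arange(6100)  # 61 s of samples
win5 = segment_windows(sig, fs, 5)    # 5 s windows
win10 = segment_windows(sig, fs, 10)  # 10 s windows
```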
Step 5: Calculate the eye movement time-domain features and the PPG time-frequency-domain features within the 5 s time window.
Step 6: a total of 72 eye movements and PPG features most relevant to the emotional state were selected using principal component analysis.
And 7: carrying out feature layer fusion on the 72 features selected in the step 6 to generate shallow features, designing after normalization and extracting deep features by using a convolutional neural network FECNN; and selecting 57-dimensional features most relevant to emotional states from the deep features by using a principal component analysis method. And performing feature layer fusion on the deep features and the shallow features extracted by the FECNN to obtain a 129-dimensional feature vector as the input of the emotion classifier.
And 8: and designing an LSTM network model to carry out emotion classification on the shallow feature and the deep feature. After a plurality of tests, the data in the training set are subjected to a plurality of rounds of training in batches to adjust network parameters until the maximum iteration times is reached or the advanced cutoff condition is met, and the optimal LSTM network structure and parameters are selected. And (3) taking the 129-dimensional feature vector obtained in the step (7) as an input training of the LSTM model, evaluating the performance of the model by using test set data, outputting one of four emotional states of interest, happiness, confusion and boredom, and finally evaluating the performance of the model by using the accuracy and the loss value.
Step 9: Run the LSTM network model trained in step 8 on the test set to obtain the final classification accuracy.
Step 10: the effectiveness of the machine learning model was measured using accuracy (Precision), Recall (Recall) and F1 score (F1-score). Several basic concepts need to be defined, NTP: the classifier judges the positive samples as the number of the positive samples, NFP: the classifier judges the negative samples as the number of positive samples, NTN: the classifier judges the negative samples as the number of the negative samples, NFN: the classifier judges the positive samples as the number of the negative samples. The accuracy is defined as the proportion of the number of correctly classified samples in the positive samples to the number of all the classified positive samples, and the formula is as follows:
Precision = N_TP / (N_TP + N_FP)    (1)
Recall is defined as the proportion of correctly classified positive samples among all actual positive samples; it measures the classifier's ability to identify positive samples:
Recall = N_TP / (N_TP + N_FN)    (2)
The F1 score is the harmonic mean of precision and recall and jointly reflects the classifier's precision and recall ability:
F1 = 2 × Precision × Recall / (Precision + Recall)    (3)
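The precision, recall, and F1 definitions above translate directly into code; the counts used here are illustrative:

```python
def precision(n_tp, n_fp):
    """Fraction of predicted positives that are truly positive."""
    return n_tp / (n_tp + n_fp)

def recall(n_tp, n_fn):
    """Fraction of actual positives that were found."""
    return n_tp / (n_tp + n_fn)

def f1_score(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)

p = precision(8, 2)   # 8 true positives, 2 false positives
r = recall(8, 8)      # 8 true positives, 8 false negatives
f = f1_score(p, r)
```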
drawings
FIG. 1 is an experimental flow chart of a multi-modal data acquisition experiment in the present invention.
Fig. 2 shows a network structure of the FECNN in the present invention.
FIG. 3 is a flow chart of an emotion recognition method based on FECNN-LSTM multi-modal fusion of eye movement and PPG in the invention.
FIG. 4 is a diagram of the neuron structure of the long short-term memory network.
Detailed Description
The invention will be further described with reference to examples and figures, but the embodiments of the invention are not limited thereto.
As shown in fig. 3, this embodiment provides an emotion recognition method based on FECNN-LSTM multimodal fusion of eye movement and PPG, comprising the following steps:
1. and cleaning and dividing data sets of different time windows, and denoising the data to obtain a processed eye movement signal and a processed PPG signal.
2. Calculating eye movement related features, comprising: features of gaze, saccade, blink, and pupil categories; calculating a PPG-relevant feature, comprising: HR, HRV, RPeaks.
3. And performing principal component analysis, selecting eye movement and PPG (photoplethysmography) features with high correlation with emotional states, performing feature layer fusion to form shallow features, and normalizing the shallow features.
4. And (3) carrying out feature learning on the shallow features by using FECNN, extracting deep features, selecting the deep features with high correlation with emotional states by using a principal component analysis method, and normalizing.
5. And then, carrying out feature layer fusion on the shallow features and the deep features to serve as the input of the emotion classifier. Four algorithms of a support vector machine, a random forest, K neighbor and a multilayer perceptron in machine learning are adopted to carry out emotional state classification on the eye movement, the PPG single-mode shallow layer characteristic and the fused eye movement and PPG characteristic to serve as a comparison test. The obtained model was evaluated using different evaluation indexes.
6. And designing a long-time memory network to classify the deep-layer and shallow-layer characteristics according to feelings, and evaluating the model by using the accuracy and the loss value.
More specifically, to produce the multimodal emotion recognition dataset, learning videos on five different subjects are used as stimulus material. The whole multimodal data acquisition experiment is shown in fig. 1 and proceeds as follows:
and S1, before the experiment is carried out, the physiological signal acquisition equipment is worn on the tested object, and then the eye calibration is carried out on the tested object so as to check whether the tested object is qualified.
S2. Before formally entering the experiment, the subject watches a fixation point (a crosshair appearing at the center of the screen) for 60 s, from which the baseline values of the eye movement and PPG data are obtained.
S3. During the experiment, four 2-minute video segments are played first, then one 10-minute video segment; the four 2-minute segments are played in random order. Before each segment is played, the subject completes a knowledge questionnaire that measures prior knowledge related to the content of the experimental material. The subject then watches the segment on the computer screen; after playback ends, the subject marks, via key presses and a post-test, the emotions experienced while watching. After the post-test, the knowledge questionnaire, viewing, and post-test of the next segment follow.
S4. The last video played is a 10-minute video that evokes mind-wandering; a reminder pops up during viewing, and if the subject was mind-wandering in the period before the reminder appeared, this can be marked with a key press.
S5. After the whole experiment, the experimenter explains the labeling model to the subject and ensures it is fully understood. The subject then reviews the recordings, including the video segment, the synchronized video of the subject watching it, and the subject's eye movement trajectory, divides them into events according to the emotional states recalled during review, and labels the 5 videos by selecting an emotional state and intensity from the classified emotion words and arousal levels of the emotion model. Emotional states in the data acquisition experiment are obtained by cued retrospective review together with the subject's subjective report: after watching a segment, the subject replays it, and the synchronously recorded facial expression video and eye movement trajectory stimulate recall of the emotional state at the time; the synchronized video is divided into event segments, and the emotional state and intensity are selected from the emotion words and arousal levels of the emotion classification model. Emotional states include happiness, interest, boredom, confusion, mind-wandering, and others. The A (arousal) dimension of the PAD dimensional model is selected to label, in review, the arousal intensity of the subject in a given emotional state, with 1 to 5 representing the intensity of the emotion, where 1 is the lowest and 5 the highest, increasing from 1 to 5.
For eye movement signal preprocessing, outliers in the subjects' eye movement data during the experiment are removed and noise generated during acquisition is eliminated. For PPG signal preprocessing, noise caused by electromagnetic interference, illumination, motion artifacts, and other disturbances during acquisition is removed.
It is important to select appropriate and effective eye movement and PPG indices according to the research purpose; otherwise valuable data information is lost during the research. In emotion recognition research, single-modality indices have certain limitations, while multimodal signals are correlated and complementary, so indices of two modalities, eye movement and PPG, are selected for analysis. The eye movement indices selected in the experiment are mainly of four types: fixation, saccade, blink, and pupil diameter; the selected PPG indices are mainly of three types: HR, HRV, and R-peaks.
In step 2, after the eye movement and PPG data are preprocessed, the statistical features of the eye movement and the time-frequency-domain features of the PPG are calculated. The PPG frequency-domain features are computed by formulas (4)-(9).
Sample the pulse sequence signal of each time window at equal intervals, take N points to form a discrete sequence x(n), and apply the discrete Fourier transform to obtain the frequency-domain sequence X(k), where k is the discrete frequency variable, W_N is the forward transform kernel, and j is the imaginary unit:

X(k) = Σ_{n=0}^{N−1} x(n) · W_N^{nk},  k = 0, 1, …, N−1    (4)
From Euler's formula:

W_N = e^{−j2π/N} = cos(2π/N) − j·sin(2π/N)    (5)
Then:

W_N^{nk} = e^{−j2πnk/N} = cos(2πnk/N) − j·sin(2πnk/N)    (6)
Here X(k) is a complex number:

X(k) = R(k) + j·I(k)    (7)
where R(k) is the real part and I(k) the imaginary part. The phase of each point of the frequency-domain sequence is then:

φ(k) = arctan(I(k) / R(k))    (8)
and the magnitude spectrum is:

|X(k)| = √(R(k)² + I(k)²)    (9)
Because the discrete Fourier transform is computationally expensive, the data are processed with the fast Fourier transform, the magnitude and phase are expressed as functions of frequency, and the corresponding frequency components HF, LF, VLF, LF/HF, and total power are extracted from the power spectral density as the frequency-domain features of HRV.
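Under the assumption that the RR series has been resampled to a uniform rate, the band powers named above can be sketched with a Welch power spectral density; the helper name, the 4 Hz resampling rate, and the synthetic series are assumptions:

```python
import numpy as np
from scipy.signal import welch

def hrv_freq_features(rr, fs=4.0):
    """Band powers of an evenly resampled RR series (illustrative helper)."""
    f, pxx = welch(rr - np.mean(rr), fs=fs, nperseg=min(256, len(rr)))
    df = f[1] - f[0]
    def band_power(lo, hi):
        m = (f >= lo) & (f < hi)
        return float(np.sum(pxx[m]) * df)   # rectangle-rule integration of the PSD
    vlf = band_power(0.003, 0.04)
    lf = band_power(0.04, 0.15)
    hf = band_power(0.15, 0.40)
    return {"VLF": vlf, "LF": lf, "HF": hf,
            "LF/HF": lf / hf, "total": vlf + lf + hf}

fs = 4.0
t = np.arange(0, 120, 1 / fs)
# Synthetic RR series (ms) with LF (0.10 Hz) and HF (0.25 Hz) oscillations
rr = 800 + 40 * np.sin(2 * np.pi * 0.10 * t) + 20 * np.sin(2 * np.pi * 0.25 * t)
feats = hrv_freq_features(rr, fs=fs)
```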
Let the HRV sequence be R = [R_1, R_2, …, R_N], where R_i denotes the HRV value at time i and N is the sequence length. The HRV time-domain features are computed by formulas (10)-(13). The root mean square of successive RR-interval differences, RMSSD, is calculated by formula (10), where RR_i = R_{i+1} − R_i:

RMSSD = √( (1/(N−1)) · Σ_{i=1}^{N−1} RR_i² )    (10)
The standard deviation SDNN is:

SDNN = √( (1/N) · Σ_{i=1}^{N} (R_i − R̄)² )    (11)

where

R̄ = (1/N) · Σ_{i=1}^{N} R_i    (12)
The percentage PNN50 of successive peak-interval differences greater than 50 ms is:

PNN50 = ( count(|RR_i| > 50 ms) / (N − 1) ) × 100%    (13)
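The RMSSD, SDNN, and PNN50 definitions above can be sketched in a few lines; the helper names and the interval values (in ms) are illustrative:

```python
import numpy as np

def rmssd(r):
    """Root mean square of successive differences RR_i = R_{i+1} - R_i."""
    d = np.diff(np.asarray(r, dtype=float))
    return np.sqrt(np.mean(d ** 2))

def sdnn(r):
    """Standard deviation of the interval series."""
    return np.std(np.asarray(r, dtype=float))

def pnn50(r):
    """Percentage of successive differences larger than 50 ms."""
    d = np.abs(np.diff(np.asarray(r, dtype=float)))
    return 100.0 * np.mean(d > 50.0)

r = [800.0, 850.0, 800.0, 860.0]  # toy peak-interval series in ms
```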
and screening out eye movement and PPG characteristics which are obviously related to the emotional state by adopting a Principal Component Analysis (PCA) method according to the characteristics obtained by each mode. The experimental analysis finally picked 32 eye movement features and 40 PPG features.
In step S3, feature-level fusion of the eye movement statistical features and the PPG time-frequency-domain features within the synchronized time window forms the shallow features, yielding a 72-dimensional combined feature vector. Because of individual differences, different people have different physiological baseline values, so the individual baseline must be removed: each emotional feature of the eye movement and PPG is normalized by the corresponding feature value in the calm state, and min-max normalization maps each feature value into the [0, 1] interval, as given by formula (14):
X* = (X − X_min) / (X_max − X_min)    (14)

where X* is the normalized value, X is a sample value, X_min is the minimum of the sample, and X_max is the maximum of the sample.
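The min-max mapping described above, as a minimal sketch:

```python
import numpy as np

def min_max(x):
    """Map a feature vector into the [0, 1] interval."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

scaled = min_max([2.0, 4.0, 6.0])
```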
In step S4, the designed FECNN network structure is described as follows. The feature extraction part of the FECNN consists of convolutional layers and pooling layers: the convolutional layers extract deep information from the input data, and the pooling layers down-sample the obtained feature maps to reduce overfitting. The input of the FECNN is a 72 × 1 vector, and there are 6 convolutional layers: Conv1, Conv2, Conv3, Conv4, Conv5, and Conv6. Each convolutional layer contains a one-dimensional 3 × 1 convolution kernel, a max-pooling layer with a 2 × 1 filter, and a regularizing Dropout layer. The Dropout layer deactivates a portion of the neurons with probability 0.5 to prevent overfitting of the model; a deactivated neuron does not back-propagate error, but its weights are preserved, so the network adopts a different structure each time a sample is input. The stride of each convolutional layer is set to 1, and ReLU is used as the activation function. Conv6 is followed by a Flatten layer, and a Dense layer then compresses the Flatten output into 64 × 1-dimensional deep features. The Pearson correlation coefficient is used to select the deep features related to emotional state, 57 dimensions in total.
In step S5, feature-level fusion of the deep features extracted in step S4 with the shallow features outputs a 129-dimensional feature vector as the input of the emotion classifier. Four machine learning algorithms are used to classify emotions from the eye movement and PPG single-modality data; grid search is used for parameter optimization, and the parameters finally obtained are shown in the following table:
(Table: optimal classifier parameters obtained by grid search)
The four machine learning algorithms were evaluated using precision, recall, and F1 score, with the following results:
(Table: precision, recall, and F1 scores of the four machine learning algorithms)
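As a sketch of the four-classifier comparison (synthetic data and default scikit-learn hyperparameters; these are not the patent's grid-search settings or results):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

# Synthetic stand-in for the fused 72-dimensional shallow features, 4 emotions
X, y = make_classification(n_samples=400, n_features=72, n_informative=20,
                           n_classes=4, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.2, random_state=0)

classifiers = {
    "SVM": SVC(),
    "RF": RandomForestClassifier(random_state=0),
    "KNN": KNeighborsClassifier(),
    "MLP": MLPClassifier(max_iter=1000, random_state=0),
}
results = {name: accuracy_score(yte, clf.fit(Xtr, ytr).predict(Xte))
           for name, clf in classifiers.items()}
```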
In step S6, the designed long short-term memory network (LSTM) structure is described as follows.
The LSTM is used to analyze time-series data. It consists of an input gate, a forget gate, an output gate, and an internal memory cell, and decides, by efficient use of memory, when the network forgets the previous hidden state and when it updates the hidden state; this alleviates the vanishing- and exploding-gradient problems that RNNs encounter during back-propagation when processing sequences of finite length. The LSTM network structure unit is shown in fig. 4, where i_t denotes the output of the input gate, f_t the output of the forget gate, o_t the output of the output gate, c′_t the internal (candidate) memory cell, and h_t the output of the hidden unit; σ denotes the sigmoid activation function. Let x_t be the input of the LSTM cell at time t, let W and U denote weights, and let h_{t−1} be the output of the previous hidden unit. The details are given by formulas (15)-(20).
i_t = σ(λ_Wi(W_i x_t) + λ_Ui(U_i h_{t-1}))  (15)

f_t = σ(λ_Wf(W_f x_t) + λ_Uf(U_f h_{t-1}))  (16)

o_t = σ(λ_Wo(W_o x_t) + λ_Uo(U_o h_{t-1}))  (17)

c'_t = tanh(λ_Wc(W_c x_t) + λ_Uc(U_c h_{t-1}))  (18)

c_t = f_t c_{t-1} + i_t c'_t  (19)

h_t = o_t tanh(c_t)  (20)
From Eqs. (15) to (20), the final output h_t of the hidden unit at time t is determined jointly by the hidden-unit output h_{t-1} at the previous time step and the current input x_t, which realizes the memory function. Through the design of the three gating units, the LSTM memory cell can selectively store and update information over long distances, which helps it learn the sequential feature information of the PPG and eye-movement signals.
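Equations (15) to (20) can be traced step by step in NumPy; the λ factors from the text are treated here as plain scalar multipliers, which is an assumption about their role:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, lam_W=1.0, lam_U=1.0):
    """One LSTM step following Eqs. (15)-(20). W and U are dicts of weight
    matrices for the i, f, o, c transforms; lam_W/lam_U stand in for the
    λ scaling factors in the text (taken here as scalars)."""
    i = sigmoid(lam_W * (W["i"] @ x_t) + lam_U * (U["i"] @ h_prev))        # (15)
    f = sigmoid(lam_W * (W["f"] @ x_t) + lam_U * (U["f"] @ h_prev))        # (16)
    o = sigmoid(lam_W * (W["o"] @ x_t) + lam_U * (U["o"] @ h_prev))        # (17)
    c_tilde = np.tanh(lam_W * (W["c"] @ x_t) + lam_U * (U["c"] @ h_prev))  # (18)
    c = f * c_prev + i * c_tilde                                           # (19)
    h = o * np.tanh(c)                                                     # (20)
    return h, c
```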
The LSTM network designed here has 3 hidden layers, with 32, 64 and 72 hidden units respectively. The shallow features and the deep features extracted by the FECNN are taken as the input of the LSTM, and the network weights are updated by gradient back-propagation during training. To damp excessive oscillation of the loss function and accelerate convergence, an adaptive learning-rate dynamic adjustment algorithm is selected as the optimizer. A multi-class cross-entropy loss function evaluates the gap between the probability distribution produced by the current training step and the true distribution. Eq. (22) below gives the cross-entropy calculation, where ŷ is the desired output and y is the actual output of the neuron; the loss value of the model is calculated as follows:
y = σ(Σ_j w_j x_j + b)  (21)

loss = −Σ_j ŷ_j log y_j  (22)
When the desired output equals the actual output, the loss value is 0. Dropout is applied after each LSTM layer to prevent overfitting during training by reducing feature co-adaptation. The output layer classifies with the softmax activation function and outputs an array of 4 probabilities, representing the probability that the sample data belongs to each emotion. The LSTM ultimately outputs one of the four emotional states: interest, happiness, confusion and boredom. After 600 iterations, the accuracy and loss of the LSTM model gradually stabilized, reaching an accuracy of 84.68% and a loss value of 0.43 on the test set.
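Putting the pieces together, the 3-layer LSTM classifier described above might look as follows in PyTorch; the Adam optimizer and ReduceLROnPlateau scheduler stand in for the unspecified "adaptive learning-rate dynamic adjustment algorithm", and softmax is folded into the cross-entropy loss:

```python
import torch
import torch.nn as nn

class EmotionLSTM(nn.Module):
    """Sketch of the 3-layer LSTM classifier (32 / 64 / 72 hidden units)
    with dropout after each layer; optimizer and scheduler are assumptions."""
    def __init__(self, n_features=129, n_classes=4):
        super().__init__()
        self.lstm1 = nn.LSTM(n_features, 32, batch_first=True)
        self.lstm2 = nn.LSTM(32, 64, batch_first=True)
        self.lstm3 = nn.LSTM(64, 72, batch_first=True)
        self.drop = nn.Dropout(0.5)
        self.out = nn.Linear(72, n_classes)       # softmax applied inside the loss

    def forward(self, x):                         # x: (batch, time, 129 fused features)
        for lstm in (self.lstm1, self.lstm2, self.lstm3):
            x, _ = lstm(x)
            x = self.drop(x)                      # dropout after each LSTM layer
        return self.out(x[:, -1])                 # logits for the 4 emotions

model = EmotionLSTM()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)               # adaptive learning rate
sched = torch.optim.lr_scheduler.ReduceLROnPlateau(opt, factor=0.5)
loss_fn = nn.CrossEntropyLoss()                   # multi-class cross entropy, Eq. (22)
```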

Claims (2)

1. An emotion recognition method based on FECNN-LSTM multimodal fusion of eye movement and PPG, characterized by comprising the following steps:
step 1: self-building an eye movement and PPG multi-mode database, and acquiring eye movement and PPG data by taking a learning video as a stimulation material;
step 2: pre-processing the acquired eye movement and PPG data, the pre-processing comprising data cleaning and data labeling; denoising the PPG data by low-pass and high-pass filtering; labeling with a discrete emotion labeling model, the emotion label words being divided into the four emotional states of interest, happiness, confusion and boredom; pre-processing the acquired eye movement data by high-quality data screening, data cleaning and data denoising; because raw PPG data is affected by electromagnetic interference, illumination and motion artifacts during acquisition, and the effective band-pass of the PPG signal lies between 0.8 and 10 Hz, setting the high-pass filter threshold to 1 Hz to filter out low-frequency drift, and the low-pass filter threshold to 10 Hz to filter out noise above 10 Hz;
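The 1 Hz high-pass / 10 Hz low-pass denoising in step 2 could be sketched as below; Butterworth filters, zero-phase filtering and a 100 Hz sampling rate are assumptions, since the claim specifies only the cut-off frequencies:

```python
# PPG denoising sketch: remove drift below 1 Hz and noise above 10 Hz.
import numpy as np
from scipy.signal import butter, filtfilt

def denoise_ppg(sig, fs=100.0, order=4):
    b_hp, a_hp = butter(order, 1.0, btype="highpass", fs=fs)   # drift < 1 Hz
    b_lp, a_lp = butter(order, 10.0, btype="lowpass", fs=fs)   # noise > 10 Hz
    return filtfilt(b_lp, a_lp, filtfilt(b_hp, a_hp, sig))     # zero-phase filtering

# Synthetic test signal: pulse wave + baseline drift + high-frequency noise.
t = np.arange(0, 10, 1 / 100)
raw = (np.sin(2 * np.pi * 1.2 * t)            # ~72 bpm pulse component
       + 0.5 * np.sin(2 * np.pi * 0.1 * t)    # slow drift
       + 0.3 * np.sin(2 * np.pi * 25 * t))    # high-frequency interference
clean = denoise_ppg(raw)
```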
step 3: dividing the preprocessed data set into a training set and a verification set, in proportions of 80% and 20%;
step 4: converting the eye movement and PPG data of the preprocessed data set into UTF-8 text, and constructing data sets with different time-window lengths, including 5-second, 10-second and 15-second windows;
step 5: calculating the eye-movement time-domain and PPG time-frequency-domain features within a 5 s time window, specifically as follows:
s1, calculating relevant features of the eye movement data, including: statistical features of fixation count, fixation duration and fixation speed; statistical features of saccade count, saccade duration and saccade speed; statistical features of the left and right pupil diameters, their rates of change and the pupil mean; and blink count, blink frequency and blink duration, 50 features in total;
s2, extracting the heart-rate value HR, heart-rate variability HRV and peak RPeaks data from the PPG data, and calculating related features: HR mean, HR maximum, HR first-order difference and HR second-order difference time-domain features; HRV first-order difference, second-order difference, SDNN, RMSSD, PNN50 and PNN20 time-domain features; the five HRV frequency-domain features PSD, LF, HF, VLF and LF/HF; and peak and peak first-order difference time-domain features, 32 features in total;
step 6: performing feature dimensionality reduction with PCA (principal component analysis) and selecting the 72 features significantly related to emotional state;
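Step 6's PCA reduction to 72 components can be sketched as follows (the raw feature count and window count below are placeholders):

```python
# PCA dimensionality reduction to the 72 components used for fusion.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
features = rng.normal(size=(400, 90))      # stand-in: 400 windows x 90 raw features
pca = PCA(n_components=72).fit(features)
reduced = pca.transform(features)          # (400, 72) shallow-feature inputs
```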
step 7: performing feature-layer fusion on the 72 selected features to construct shallow features; after normalization, designing the FECNN to extract deep features from the shallow features; selecting from the deep features those most relevant to emotional state by principal component analysis, reaching 57 dimensions; performing feature-layer fusion on the shallow and deep features, and classifying emotion with four machine learning algorithms: SVM, RF, KNN and MLP;
step 8: designing an LSTM network model to classify emotion from the shallow and deep features; through repeated tests, training the data of the training set in batches over multiple rounds to adjust the network parameters until the maximum number of iterations is reached or an early-stopping condition is met, and selecting the optimal LSTM network structure and parameters;
step 9: running the LSTM network model trained in step 8 on the test set to obtain the final classification accuracy;
step 10: comparing and analyzing the classification results of the LSTM with those of the SVM, KNN, MLP and RF algorithms.
2. The method for emotion recognition in video learning according to claim 1,
step 1 is described in detail as follows:
s1, before the experiment, the subject wears the physiological-signal acquisition equipment, and eye calibration is then performed on the subject to check whether the calibration is qualified;
s2, before formally entering the experiment, the subject watches a fixation point, namely a crosshair appearing at the center of the screen, for 60 s; the baseline values of the eye movement and PPG data are obtained from this fixation period;
s3, during the experiment, four 2-min video clips are played first and then one 10-min video clip, the four 2-min clips being played in random order; before each clip is played, the subject takes a knowledge questionnaire measuring prior knowledge related to the content of the experimental material; the subject then watches the clip on a computer screen; after playback ends, the subject labels the emotion felt while watching via key presses and completes a post-test, after which the knowledge questionnaire, viewing and post-test for the next clip follow;
s4, the last video played is the 10-min boredom-inducing video; a reminder pops up during viewing, and if the subject felt bored in the period before the reminder appeared, the boredom can be labeled by pressing a key;
s5, after the whole experiment ends, the experimenter explains the labeling model to the subject and ensures it is fully understood; the subject then reviews the videos, including the video clip, the synchronously recorded video of the subject watching, and the subject's eye-movement track; events are divided according to the emotional states recalled during review, and the subject selects emotional state and intensity from the emotion words and arousal levels of the categorical emotion model, labeling all 5 videos; the emotional states in the data-acquisition experiment are thus obtained by the "implicit review" method combined with subjective self-report: after the subject finishes watching a clip, the video is played back together with the synchronously recorded facial-expression video and eye-movement track to stimulate recall of the subject's emotional state, the synchronized video is divided into event segments, and the subject selects state and intensity from the emotion words and arousal levels of the categorical emotion model; the emotional states include happiness, interest, boredom, confusion, nervousness and the like; the arousal dimension of the PAD dimensional model is selected to label retrospectively the intensity of the subject's arousal in a given emotional state, with 1 to 5 denoting increasing intensity, 1 the lowest and 5 the highest;
step 2 is described in detail as follows:
s1, data preprocessing: subjects' eye-movement data lost through gaze-tracking failure during the experiment are removed, and the acquired data are segmented into different lengths; preprocessing of the eye-movement signals mainly removes outliers in the subjects' eye-movement data and eliminates noise generated during acquisition; preprocessing of the PPG signals mainly removes noise caused by electromagnetic interference, illumination and motion artifacts during acquisition; with the effective band-pass of the PPG signal set between 0.8 and 10 Hz, the high-pass filter threshold is set to 1 Hz to filter out low-frequency drift, and the low-pass filter threshold is set to 10 Hz to filter out noise above 10 Hz;
s2, data annotation: an emotion attribute label "label" is added to all subjects' data, recording interest as 0, confusion as 1, boredom as 2 and happiness as 3;
step 5 is described in detail as follows:
selecting suitable and effective eye-movement and PPG indices for the research purpose is important; otherwise valuable data information is lost during the research; in emotion-recognition research a single-modality index has certain limitations, while multimodal signals are correlated and complementary, so indices of the two modalities, eye movement and PPG, are selected for analysis according to the research needs; the eye-movement indices chosen in the experiment fall into four main types: fixation, saccade, blink and pupil diameter; the selected PPG indices fall into three main types: HR, HRV and RPeaks;
the PPG time-frequency-domain features are calculated as follows; the PPG frequency-domain features are computed according to Eqs. (3) to (9): N points are sampled from the pulse-interval signal of each time window to form a discrete sequence x(n), and the frequency-domain sequence X(k) is obtained by the discrete Fourier transform, where k is the discrete frequency variable, W_N is the forward transform kernel and j is the imaginary unit; the calculation formulas are:

X(k) = Σ_{n=0}^{N−1} x(n) W_N^{nk}, k = 0, 1, …, N−1  (3)

W_N = e^{−j2π/N}  (4)
from Euler's formula:

e^{±jθ} = cos θ ± j sin θ  (5)

it follows that:

W_N^{nk} = exp(−j2πnk/N) = cos(2πnk/N) − j sin(2πnk/N)  (6)
where X(k) is a complex number:

X(k) = R(k) + jI(k)  (7)

with R(k) the real part and I(k) the imaginary part; the phase value of each point in the frequency-domain sequence is then:

φ(k) = arctan( I(k) / R(k) )  (8)

and the amplitude spectrum is:

|X(k)| = √( R(k)² + I(k)² )  (9)
because the direct discrete Fourier transform is computationally expensive, the acquired data are processed with the fast Fourier transform; amplitude and phase are then expressed as functions of frequency, and the frequency components HF, LF, VLF, LF/HF and total power are extracted from the power spectral density as the HRV frequency-domain features;
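The FFT-based HRV frequency-domain features could be computed as sketched below; the Welch PSD estimate, the 4 Hz resampling of the RR series and the standard VLF/LF/HF band limits are conventional assumptions not stated in the claim:

```python
import numpy as np
from scipy.signal import welch

def hrv_freq_features(rr_s, fs=4.0):
    """HRV frequency-domain features from RR intervals given in seconds."""
    rr_s = np.asarray(rr_s, dtype=float)
    t = np.cumsum(rr_s)                          # beat occurrence times
    grid = np.arange(t[0], t[-1], 1.0 / fs)      # uniform 4 Hz time grid
    rr_u = np.interp(grid, t, rr_s)              # evenly resampled RR series
    f, psd = welch(rr_u - rr_u.mean(), fs=fs, nperseg=min(256, len(rr_u)))
    df = f[1] - f[0]

    def band(lo, hi):                            # integrate PSD over a band
        m = (f >= lo) & (f < hi)
        return float(np.sum(psd[m]) * df)

    vlf = band(0.003, 0.04)                      # conventional band limits
    lf, hf = band(0.04, 0.15), band(0.15, 0.4)
    return {"VLF": vlf, "LF": lf, "HF": hf,
            "LF/HF": lf / hf, "total": vlf + lf + hf}
```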
let the HRV sequence be R = [R_1, R_2, …, R_N], where R_i is the HRV value at time i and N is the sequence length; the HRV time-domain features are computed by Eqs. (10) to (13); the root mean square of successive RR-interval differences, RMSSD, is given by Eq. (10), where ΔRR_i = R_{i+1} − R_i:
RMSSD = √( (1/(N−1)) Σ_{i=1}^{N−1} (ΔRR_i)² )  (10)
The standard deviation SDNN is given by:

SDNN = √( (1/(N−1)) Σ_{i=1}^{N} (R_i − R̄)² )  (11)

where

R̄ = (1/N) Σ_{i=1}^{N} R_i  (12)
The percentage PNN50 of adjacent intervals whose difference exceeds 50 ms is:

PNN50 = (NN50 / (N−1)) × 100%  (13)

where NN50 is the number of adjacent RR-interval differences greater than 50 ms;
from the features obtained for each modality, the eye-movement and PPG (photoplethysmography) features significantly related to emotional state are screened out by principal component analysis (PCA); experimental analysis finally selects 32 eye-movement features and 40 PPG features;
step 7, the following is described in detail:
s1, feature-layer fusion is performed on the eye-movement statistical features and the PPG time-frequency-domain features within the synchronized time window to form the shallow features, yielding a 72-dimensional combined feature vector; because of individual differences and the differing physiological-signal baselines of different people, the individual baseline is removed by standardizing each eye-movement and PPG emotion feature against the corresponding feature value in the calm state, and the feature values are then mapped into the [0, 1] interval by min-max normalization, Eq. (14):
X* = (X − X_min) / (X_max − X_min)  (14)
where X* is the normalized value, X is a sample value, and X_min and X_max are the minimum and maximum values in the sample;
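The baseline removal and min-max normalization of Eq. (14) can be sketched as:

```python
import numpy as np

def normalize_features(x, baseline):
    """Remove the per-subject calm-state baseline, then apply Eq. (14)."""
    x = np.asarray(x, dtype=float) - baseline    # remove individual baseline
    lo, hi = x.min(), x.max()
    return (x - lo) / (hi - lo)                  # min-max map into [0, 1]
```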
s2, the FECNN network structure is designed as follows: the feature-extraction part of the FECNN consists of convolutional layers and pooling layers, the convolutional layers extracting deep information from the input data and the pooling layers down-sampling the resulting feature maps to reduce network overfitting; the input of the FECNN is a 72 × 1 vector; there are 6 convolutional layers in total, namely Conv1, Conv2, Conv3, Conv4, Conv5 and Conv6, each comprising a one-dimensional 3 × 1 convolution kernel, a max-pooling layer with a 2 × 1 filter and a regularizing Dropout layer; the Dropout layer deactivates part of the neurons with probability 0.5 to prevent overfitting of the model; deactivated neurons do not take part in error back-propagation, but their weights are preserved, so the network adopts a different structure each time a sample is input; the stride of each convolutional layer is set to 1, and ReLU is used as the activation function; a Flatten layer is appended after Conv6, a Dense layer then compresses the Flatten output into 64 × 1-dimensional deep features, and the deep features related to emotional state, 57 dimensions in total, are selected with the Pearson correlation coefficient;
in step 8, the designed long short-term memory network (LSTM) structure is as follows:
the LSTM is used to analyze time-series data; the LSTM used here consists of an input gate, a forget gate, an output gate and an internal memory cell, and decides when the network forgets the previous hidden state and when it updates that state, mitigating the vanishing- and exploding-gradient problems of an RNN during back-propagation through long sequences, as given in Eqs. (15) to (20) below;
i_t = σ(λ_Wi(W_i x_t) + λ_Ui(U_i h_{t-1}))  (15)

f_t = σ(λ_Wf(W_f x_t) + λ_Uf(U_f h_{t-1}))  (16)

o_t = σ(λ_Wo(W_o x_t) + λ_Uo(U_o h_{t-1}))  (17)

c'_t = tanh(λ_Wc(W_c x_t) + λ_Uc(U_c h_{t-1}))  (18)

c_t = f_t c_{t-1} + i_t c'_t  (19)

h_t = o_t tanh(c_t)  (20)
from Eqs. (15) to (20), the final output h_t of the hidden unit at time t is jointly determined by the previous hidden-unit output h_{t-1} and the current input x_t, realizing the memory function; through the design of the three gating units, the LSTM memory cell can selectively store and update long-range information, which helps it learn the sequential feature information of the PPG and eye-movement signals;
the LSTM network has 3 hidden layers, with 32, 64 and 72 hidden units respectively; the shallow features and the deep features extracted by the FECNN are taken as the input of the LSTM, and the network weights are updated by gradient back-propagation during training; to damp excessive oscillation of the loss function and accelerate convergence, an adaptive learning-rate dynamic adjustment algorithm is selected as the optimizer; a multi-class cross-entropy loss function evaluates the gap between the probability distribution produced by the current training step and the true distribution; Eq. (22) below gives the cross-entropy calculation, where ŷ is the desired output and y is the actual output of the neuron; the loss value of the model is calculated as follows:
y = σ(Σ_j w_j x_j + b)  (21)

loss = −Σ_j ŷ_j log y_j  (22)
when the desired output equals the actual output, the loss value is 0; dropout is applied after each LSTM layer to prevent overfitting during training by reducing feature co-adaptation; the output layer classifies with the softmax activation function and outputs an array of 4 probabilities representing the probability that the sample data belongs to each emotion; finally, the LSTM outputs one of the four emotional states of interest, happiness, confusion and boredom.
CN202111037434.5A 2021-09-06 2021-09-06 An emotion recognition method based on FECNN-LSTM multimodal fusion of eye movement and PPG Pending CN113729707A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111037434.5A CN113729707A (en) 2021-09-06 2021-09-06 An emotion recognition method based on FECNN-LSTM multimodal fusion of eye movement and PPG

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111037434.5A CN113729707A (en) 2021-09-06 2021-09-06 An emotion recognition method based on FECNN-LSTM multimodal fusion of eye movement and PPG

Publications (1)

Publication Number Publication Date
CN113729707A true CN113729707A (en) 2021-12-03

Family

ID=78735860

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111037434.5A Pending CN113729707A (en) 2021-09-06 2021-09-06 An emotion recognition method based on FECNN-LSTM multimodal fusion of eye movement and PPG

Country Status (1)

Country Link
CN (1) CN113729707A (en)

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114343640A (en) * 2022-01-07 2022-04-15 北京师范大学 Attention assessment method and electronic equipment
CN114627329A (en) * 2022-02-24 2022-06-14 海信集团控股股份有限公司 Visual sensitive information detection model training method, device and equipment
CN114648354A (en) * 2022-02-23 2022-06-21 上海外国语大学 Advertisement evaluation method and system based on eye movement tracking and emotional state
CN114970412A (en) * 2022-05-18 2022-08-30 合肥工业大学 IGBT service life prediction method based on ICGWO-LSTM
CN115067935A (en) * 2022-06-28 2022-09-20 华南师范大学 Wink detection method and system based on photoplethysmography and storage medium
CN115381467A (en) * 2022-10-31 2022-11-25 浙江浙大西投脑机智能科技有限公司 Attention mechanism-based time-frequency information dynamic fusion decoding method and device
CN115439921A (en) * 2022-09-22 2022-12-06 徐州华讯科技有限公司 Image preference prediction method based on eye diagram reasoning
CN115620706A (en) * 2022-11-07 2023-01-17 之江实验室 A model training method, device, equipment and storage medium
CN115919313A (en) * 2022-11-25 2023-04-07 合肥工业大学 Facial myoelectricity emotion recognition method based on space-time characteristics
CN116299684A (en) * 2023-05-17 2023-06-23 成都理工大学 A Novel Microseismic Classification Method Based on Bimodal Neurons in Artificial Neural Networks
CN116595423A (en) * 2023-07-11 2023-08-15 四川大学 Air traffic controller cognitive load assessment method based on multi-feature fusion
CN116701917A (en) * 2023-07-28 2023-09-05 电子科技大学 Open set emotion recognition method based on physiological signals
CN116740015A (en) * 2023-06-12 2023-09-12 北京长木谷医疗科技股份有限公司 Medical image intelligent detection method and device based on deep learning and electronic equipment
CN117717340A (en) * 2024-02-07 2024-03-19 中汽研汽车检验中心(天津)有限公司 Driver sleepiness detection method, device, equipment and medium
CN118743537A (en) * 2024-06-21 2024-10-08 深圳市嗨西西科技有限公司 Detection method and system of multimodal physiological indicators for pet health assessment


Similar Documents

Publication Publication Date Title
CN113729707A (en) An emotion recognition method based on FECNN-LSTM multimodal fusion of eye movement and PPG
Zhang Expression-EEG based collaborative multimodal emotion recognition using deep autoencoder
CN113627518B (en) Method for realizing neural network brain electricity emotion recognition model by utilizing transfer learning
Houssein et al. Human emotion recognition from EEG-based brain–computer interface using machine learning: a comprehensive review
Bobade et al. Stress detection with machine learning and deep learning using multimodal physiological data
US11696714B2 (en) System and method for brain modelling
Liu et al. Subject-independent emotion recognition of EEG signals based on dynamic empirical convolutional neural network
CN109157231B (en) Portable multichannel depression tendency evaluation system based on emotional stimulation task
Özerdem et al. Emotion recognition based on EEG features in movie clips with channel selection
Zhao et al. EmotionSense: Emotion recognition based on wearable wristband
CN106886792B (en) An EEG Emotion Recognition Method Based on Hierarchical Mechanism to Build a Multi-Classifier Fusion Model
Yan et al. Emotion classification with multichannel physiological signals using hybrid feature and adaptive decision fusion
Soni et al. Graphical representation learning-based approach for automatic classification of electroencephalogram signals in depression
Wang et al. Maximum weight multi-modal information fusion algorithm of electroencephalographs and face images for emotion recognition
CN111000556A (en) An emotion recognition method based on deep fuzzy forest
CN117370828A (en) Multi-mode feature fusion emotion recognition method based on gating cross-attention mechanism
Dar et al. YAAD: young adult’s affective data using wearable ECG and GSR sensors
CN115422973A (en) An Attention-Based Spatial-Temporal Network EEG Emotion Recognition Method
Pan et al. Recognition of human inner emotion based on two-stage FCA-ReliefF feature optimization
Khan et al. AT2GRU: A human emotion recognition model with mitigated device heterogeneity
Li et al. Eye-tracking signals based affective classification employing deep gradient convolutional neural networks
Saha et al. Automatic emotion recognition from multi-band EEG data based on a deep learning scheme with effective channel attention
Lupión et al. Data augmentation for human activity recognition with generative adversarial networks
Cao et al. Emotion Recognition of Single-electrode EEG based on Multi-feature Combination in Time-frequency Domain
CN118557195B (en) Emotion state detection method, device, equipment and medium based on electroencephalogram signals

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20211203