CN120108028B - An emotion recognition method based on eye gaze analysis - Google Patents
An emotion recognition method based on eye gaze analysis
- Publication number
- CN120108028B (application CN202510592066.2A)
- Authority
- CN
- China
- Prior art keywords
- vector
- texture
- width
- centrifugal
- eye
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/18—Eye characteristics, e.g. of the iris
- G06V40/193—Preprocessing; Feature extraction
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G06N3/0442—Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
- G06V10/443—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
- G06V10/449—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
- G06V10/451—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
- G06V10/454—Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/54—Extraction of image or video features relating to texture
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/761—Proximity, similarity or dissimilarity measures
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- Biomedical Technology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Medical Informatics (AREA)
- Databases & Information Systems (AREA)
- Molecular Biology (AREA)
- Mathematical Physics (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Biodiversity & Conservation Biology (AREA)
- Ophthalmology & Optometry (AREA)
- Human Computer Interaction (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
The invention discloses an emotion recognition method based on eye gaze analysis, belonging to the technical field of image processing. The method first extracts eye images at a plurality of consecutive moments and segments each image to obtain iris, sclera and periocular regions, marking the combined region where the iris and sclera regions are in contact as the eye region. Texture values are then extracted from the sub-regions of the periocular region to obtain texture centrifugal values and construct a texture centrifugal vector; width centrifugal values are obtained from the width of the eye region at each moment to construct a width centrifugal vector; position centrifugal values are calculated from the distance between the eye region and the iris region to construct a position centrifugal vector; and a characteristic change factor is obtained for each vector. Finally, the texture centrifugal vector, the width centrifugal vector, the position centrifugal vector and the characteristic change factors corresponding to the vectors are input into an emotion recognition neural network model to obtain the emotion type. The invention solves the problem of low emotion recognition accuracy in the prior art.
Description
Technical Field
The invention relates to the technical field of image processing, and in particular to an emotion recognition method based on eye gaze analysis.
Background
The eyes, as windows to the heart, carry a great deal of emotional information. Subtle changes in gaze direction, fixation duration and the periocular muscles are closely connected with emotional state. Under normal conditions, during daily communication and observation of the environment, the eyeballs rotate naturally and flexibly to acquire surrounding visual information; the glance range is wide and the rotation speed is relatively uniform. However, individuals in a low or depressed mood often exhibit marked slowness and restriction in eye rotation: their rotation amplitudes in the horizontal and vertical directions are reduced, and the eyeballs cannot move to the target position as quickly and smoothly as those of a normal person. For example, when reading text or viewing images, such individuals may require more time and effort to accomplish the same visual search task.
Existing eye-based emotion recognition methods identify a person's emotion from the distances among the upper-left, lower-left, upper-right and lower-right eyelid feature points. However, the degree of eye opening differs from person to person, so the distances between feature points cannot truly reflect eye movement, and the recognition accuracy is therefore low.
Disclosure of Invention
Aiming at the above defects in the prior art, the present invention provides an emotion recognition method based on eye gaze analysis, which solves the problem of low emotion recognition accuracy in the prior art.
To achieve the aim of the invention, the technical scheme adopted by the invention is an emotion recognition method based on eye gaze analysis, comprising the following steps:
Extracting eye images at a plurality of continuous moments, and dividing each eye image to obtain an iris area, a sclera area and a periocular area;
Marking the combined region formed by the iris region and the sclera region, which are in positional contact, as the eye region;
Extracting texture values from each subarea in the periocular region, obtaining texture centrifugal values, and constructing a texture centrifugal vector;
Acquiring a width centrifugal value according to the width of the eye area at each moment, and constructing a width centrifugal vector;
Calculating a position centrifugal value according to the distance between the eye area and the iris area, and constructing a position centrifugal vector;
acquiring a characteristic change factor for each vector;
And inputting the texture centrifugal vector, the width centrifugal vector, the position centrifugal vector and the characteristic change factors corresponding to the vectors into the emotion recognition neural network model to obtain emotion types.
Further, the process of segmentation includes:
Carrying out gray scale processing on each eye image, and carrying out clustering processing on pixel points according to gray values to obtain a plurality of clusters;
Calculating the similarity between the average gray level of each cluster and the stored iris gray level value, and finding out the cluster corresponding to the maximum similarity as an iris area;
calculating the similarity between the average gray level of each cluster and the stored sclera gray level value, and finding out the cluster corresponding to the maximum similarity as a sclera area;
Taking the other clusters adjacent to the sclera region and the iris region as the periocular region.
Further, the process of constructing the texture centrifugation vector comprises:
Extracting a contour from the periocular region to obtain a contour map;
Counting the number of contour points in the contour map, and carrying out logarithmic normalization processing to obtain contour density of each periocular region;
calculating the standard deviation of the gray values in the periocular region, and normalizing the standard deviation to obtain the gray fluctuation value of each periocular region;
adding the contour density belonging to the same periocular region and the gray scale fluctuation value to obtain a texture value;
subtracting the average texture value from the texture value of each periocular region to obtain a texture centrifugal value;
and constructing the texture centrifugal value of the periocular region at each moment into a texture centrifugal vector.
Further, the process of constructing the width centrifugal vector includes:
constructing a circumscribed rectangle for the eye region;
Extracting the width of the circumscribed rectangle;
subtracting the average width from the width of each eye region to obtain a width centrifugal value;
The width centrifugal value of the eye region at each time is constructed as a width centrifugal vector.
Further, the process of constructing the positional centrifugal vector is as follows:
Acquiring the eye center position of an eye area at each moment;
acquiring an iris center position of an iris region at each moment;
calculating the distance between the central position of the iris and the central position of the eye to obtain the position centrifugal value of the iris;
The position centrifugal value of the iris at each moment is constructed as a position centrifugal vector.
Further, the formula of the position centrifugal value of the iris is $\gamma = f(x_r - x_e)\sqrt{(x_r - x_e)^2 + (y_r - y_e)^2}$, where $\gamma$ is the position centrifugal value of the iris, $x_r$ is the abscissa of the iris center, $y_r$ is the ordinate of the iris center, $x_e$ is the abscissa of the eye center, $y_e$ is the ordinate of the eye center, and $f$ is a sign function: $f(x_r - x_e)$ is assigned 1 when $x_r - x_e$ is greater than 0, and -1 when $x_r - x_e$ is less than 0.
Further, the process of obtaining the feature change factor includes:
summing absolute values of adjacent element differences in each vector to obtain a total characteristic change value;
According to the group number of the adjacent elements, taking an average value of the total characteristic change value to obtain a characteristic change average value;
and carrying out normalization processing on the characteristic change average value to obtain a characteristic change factor.
Further, the emotion recognition neural network model comprises a texture vector feature extraction module, a width vector feature extraction module, a position vector feature extraction module, a texture feature fusion module, a width feature fusion module, a position feature fusion module and a full-connection layer;
the input end of the texture vector feature extraction module is used for inputting a texture centrifugal vector, the input end of the width vector feature extraction module is used for inputting a width centrifugal vector, and the input end of the position vector feature extraction module is used for inputting a position centrifugal vector;
the first input end of the texture feature fusion module is connected with the output end of the texture vector feature extraction module, and the second input end of the texture feature fusion module is used for inputting texture feature change factors;
the first input end of the width characteristic fusion module is connected with the output end of the width vector characteristic extraction module, and the second input end of the width characteristic fusion module is used for inputting a width characteristic change factor;
the first input end of the position feature fusion module is connected with the output end of the position vector feature extraction module, and the second input end of the position feature fusion module is used for inputting a position feature change factor;
the input end of the full-connection layer is respectively connected with the output end of the texture feature fusion module, the output end of the width feature fusion module and the output end of the position feature fusion module, and the output end of the full-connection layer is used as the output end of the emotion recognition neural network model.
Further, the texture vector feature extraction module is used for extracting texture features from the texture centrifugal vector, the width vector feature extraction module is used for extracting width features from the width centrifugal vector, and the position vector feature extraction module is used for extracting position features from the position centrifugal vector;
the texture feature fusion module is used for fusing the texture features and the texture feature change factor to obtain texture fusion features, the width feature fusion module is used for fusing the width features and the width feature change factor to obtain width fusion features, and the position feature fusion module is used for fusing the position features and the position feature change factor to obtain position fusion features;
the full-connection layer is used for classifying according to texture fusion features, width fusion features and position fusion features to obtain emotion types.
Further, the texture vector feature extraction module, the width vector feature extraction module and the position vector feature extraction module all comprise an LSTM network, a two-dimensional feature construction layer and a CNN network which are sequentially connected;
The LSTM network is used for extracting shallow features from the vectors, the two-dimensional feature construction layer is used for constructing the shallow features into two-dimensional features, and the CNN network is used for extracting deep features from the two-dimensional features.
The beneficial effects of the invention are as follows:
1. According to the invention, the iris region, the sclera region and the periocular region are obtained by extracting and dividing eye images at a plurality of continuous moments, different characteristics of the regions (such as texture values of the periocular region, widths of the eye region and distances between the eye region and the iris region) are analyzed, and the texture centrifugal value, the width centrifugal value and the position centrifugal value are extracted, so that the states of the texture deviating from the mean value, the width deviating from the mean value and the iris position deviating from the center at each moment are reflected, the actual movement condition of eyes can be reflected more comprehensively and accurately, and the limitation caused by the distance of eyelid feature points is avoided, thereby improving the emotion recognition precision.
2. The invention acquires the characteristic change factor for each vector and reflects the change speed of the elements in the vector. The change in emotion is dynamic, not only in the static values of the eye features, but also in the rate of change of these features over time.
3. According to the invention, the texture centrifugal vector, the width centrifugal vector and the position centrifugal vector are processed by adopting the emotion recognition neural network model, and the feature change factors corresponding to the vectors are adopted, so that the accuracy of model emotion classification is further improved.
Drawings
FIG. 1 is a flow chart of a method of emotion recognition based on eye analysis;
fig. 2 is a schematic diagram of a structure of an emotion recognition neural network model.
Detailed Description
The following description of the embodiments is provided to help those skilled in the art understand the present invention, but it should be understood that the invention is not limited to the scope of these embodiments; for those skilled in the art, any invention that makes use of the inventive concept falls within the spirit and scope of the present invention as defined by the appended claims.
As shown in fig. 1, an emotion recognition method based on eye gaze analysis includes the following steps:
Extracting eye images at a plurality of continuous moments, and dividing each eye image to obtain an iris area, a sclera area and a periocular area;
Marking the combined region formed by the iris region and the sclera region, which are in positional contact, as the eye region;
Extracting texture values from each subarea in the periocular region, obtaining texture centrifugal values, and constructing a texture centrifugal vector;
Acquiring a width centrifugal value according to the width of the eye area at each moment, and constructing a width centrifugal vector;
Calculating a position centrifugal value according to the distance between the eye area and the iris area, and constructing a position centrifugal vector;
acquiring a characteristic change factor for each vector;
And inputting the texture centrifugal vector, the width centrifugal vector, the position centrifugal vector and the characteristic change factors corresponding to the vectors into the emotion recognition neural network model to obtain emotion types.
The iris region is located in the central portion of the eye and is a colored circular portion of the eye that is used to control the amount of light entering the eye. The scleral region, generally described as the "white of the eye", is a white region surrounding the iris, covering most of the surface of the eyeball, and serves to protect the internal tissues of the eyeball.
In this embodiment, 10-20 eye images can be acquired every 30 seconds to facilitate observation of eye movement over a period of time.
In this embodiment, the process of segmentation includes:
Carrying out gray scale processing on each eye image, and carrying out clustering processing on pixel points according to gray values to obtain a plurality of clusters;
Calculating the similarity between the average gray level of each cluster and the stored iris gray level value, and finding out the cluster corresponding to the maximum similarity as an iris area;
calculating the similarity between the average gray level of each cluster and the stored sclera gray level value, and finding out the cluster corresponding to the maximum similarity as a sclera area;
Taking the other clusters adjacent to the sclera region and the iris region as the periocular region.
In this embodiment, the similarity $S$ between the average gray level $G_{avg}$ of each cluster and the stored iris gray value or stored sclera gray value $G_s$ is calculated from the absolute difference $\lvert G_{avg} - G_s \rvert$: the smaller the difference, the higher the similarity.
After the gray level map is clustered, the iris area and the sclera area are found according to the similarity of gray level values of the clusters, and then other clusters adjacent to the sclera area and the iris area are used as periocular areas.
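For illustration, a minimal sketch of this segmentation step is given below, assuming OpenCV k-means clustering on gray values; the stored reference gray values (iris_gray, sclera_gray) and the reciprocal-of-difference similarity measure are assumptions introduced here, not values taken from the patent.

```python
import numpy as np
import cv2

def segment_eye(image_bgr, iris_gray=60.0, sclera_gray=200.0, n_clusters=4):
    """Cluster pixels by gray value, then label iris, sclera and periocular regions."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    samples = gray.reshape(-1, 1).astype(np.float32)

    # k-means on gray values: one cluster label per pixel
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 20, 1.0)
    _, labels, centers = cv2.kmeans(samples, n_clusters, None, criteria,
                                    5, cv2.KMEANS_PP_CENTERS)
    labels = labels.reshape(gray.shape)

    # assumed similarity: larger when the cluster's mean gray is close to the stored value
    def similarity(avg, ref):
        return 1.0 / (1.0 + abs(float(avg) - float(ref)))

    iris_label = int(np.argmax([similarity(c, iris_gray) for c in centers.ravel()]))
    sclera_label = int(np.argmax([similarity(c, sclera_gray) for c in centers.ravel()]))
    iris_mask = labels == iris_label
    sclera_mask = labels == sclera_label

    # periocular region: remaining clusters adjacent to the iris/sclera (eye) region
    eye_mask = (iris_mask | sclera_mask).astype(np.uint8)
    dilated = cv2.dilate(eye_mask, np.ones((3, 3), np.uint8)) > 0
    peri_mask = np.zeros(gray.shape, dtype=bool)
    for lab in range(n_clusters):
        if lab in (iris_label, sclera_label):
            continue
        cluster_mask = labels == lab
        if np.any(cluster_mask & dilated):   # touches the eye region
            peri_mask |= cluster_mask
    return iris_mask, sclera_mask, peri_mask
```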
In this embodiment, the process of constructing the texture centrifugation vector includes:
Extracting a contour from the periocular region to obtain a contour map;
Counting the number of contour points in the contour map, and carrying out logarithmic normalization processing to obtain contour density of each periocular region;
calculating the standard deviation of the gray values in the periocular region, and normalizing the standard deviation to obtain the gray fluctuation value of each periocular region;
adding the contour density belonging to the same periocular region and the gray scale fluctuation value to obtain a texture value;
subtracting the average texture value from the texture value of each periocular region to obtain a texture centrifugal value;
and constructing the texture centrifugal value of the periocular region at each moment into a texture centrifugal vector.
The invention obtains the contour density by extracting the periocular contour and counting the contour points, which reflects the shape and structural characteristics around the eye; the standard deviation of the gray values yields the gray fluctuation value, and the combination of contour density and gray fluctuation jointly characterizes the periocular texture. A person's emotional changes drive the periocular muscles: for example, when the eye opens wide, more wrinkles form in the periocular region and both the contour density and the gray fluctuation value increase. However, because the contours and wrinkling of each person's periocular region differ, the texture centrifugal value is obtained by subtracting the average texture value from the texture value of each periocular region, reflecting how far the texture deviates from the mean. In this embodiment, the average texture value may be the average of the texture values of the periocular region at a plurality of moments, or a daily average of the texture values of a plurality of periocular regions.
In the present embodiment, the specific process of extracting the contour of the periocular region is as follows: each pixel point is taken as the center in turn; when the gray value of the center pixel is the same as all gray values in its neighborhood, the center pixel is marked as a non-contour point and discarded; the remaining pixel points are taken as contour points, giving the contour map.
In the present embodiment, the contour density of each periocular region is obtained by logarithmically normalizing the number of contour points against the size of the region, where $\mu_o$ is the contour density of the periocular region, $M_o$ is the number of contour points in the contour map, and $M_E$ is the number of pixel points in the periocular region.
In the present embodiment, the expression for obtaining the gray fluctuation value of each periocular region is $\theta = \sigma / (G_{max} - G_{min})$, where $\theta$ is the gray fluctuation value of the periocular region, $\sigma$ is the standard deviation of the gray values in the periocular region, $G_{max}$ is the maximum gray value in the periocular region at each time, and $G_{min}$ is the minimum gray value in the periocular region at each time.
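A minimal sketch of the texture value and texture centrifugal vector computation follows; the concrete log normalization $\ln(1+M_o)/\ln(1+M_E)$ is an assumed form of the logarithmic normalization described above, not taken verbatim from the patent.

```python
import numpy as np

def texture_value(gray_region, region_mask):
    """Texture value of one periocular region: contour density + gray fluctuation value."""
    vals = gray_region[region_mask].astype(float)

    # contour points: pixels whose gray value differs from at least one 8-neighbor
    contour = np.zeros_like(region_mask)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            shifted = np.roll(np.roll(gray_region, dy, axis=0), dx, axis=1)
            contour |= (gray_region != shifted)
    contour &= region_mask

    m_o = int(contour.sum())                              # number of contour points
    m_e = int(region_mask.sum())                          # number of pixels in the region
    density = np.log1p(m_o) / max(np.log1p(m_e), 1e-6)    # assumed log normalization

    g_max, g_min = vals.max(), vals.min()
    fluctuation = vals.std() / max(g_max - g_min, 1e-6)   # theta = sigma / (Gmax - Gmin)
    return density + fluctuation

def texture_centrifugal_vector(texture_values):
    """Texture centrifugal values: each texture value minus the average texture value."""
    t = np.asarray(texture_values, dtype=float)
    return t - t.mean()
```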
In this embodiment, the process of constructing the width centrifugal vector includes:
constructing a circumscribed rectangle for the eye region;
Extracting the width of the circumscribed rectangle;
subtracting the average width from the width of each eye region to obtain a width centrifugal value;
The width centrifugal value of the eye region at each time is constructed as a width centrifugal vector.
According to the invention, a circumscribed rectangle is constructed for the eye region, and the opening and closing state of the eye is reflected by the width of this rectangle: the larger the width of the circumscribed rectangle, the larger the iris and sclera regions, indicating a greater degree of eye opening. For example, when a person is surprised the eyes open wider and the width of the circumscribed rectangle increases, whereas in a relaxed or tired state the eyes narrow slightly and the width decreases.
Because eye size differs from person to person, the raw width alone cannot accurately reflect the eye state. The average width is therefore subtracted from the width of each eye region to obtain the width centrifugal value, which reflects how far the width deviates from the average and thus characterizes the eye state more accurately.
In this embodiment, the average width may be an average of the widths of the eye regions at a plurality of times, or an average of the widths of a plurality of daily eye regions.
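A minimal sketch of the width centrifugal vector, assuming the eye region is supplied as a binary mask and the circumscribed-rectangle width is taken as its horizontal pixel extent:

```python
import numpy as np

def eye_region_width(eye_mask):
    """Width of the circumscribed rectangle (horizontal extent) of the eye region mask."""
    ys, xs = np.nonzero(eye_mask)
    if xs.size == 0:
        return 0.0
    return float(xs.max() - xs.min() + 1)

def width_centrifugal_vector(widths, reference_width=None):
    """Width centrifugal values: each width minus the average width
    (mean over the sampled moments, or an externally supplied daily average)."""
    w = np.asarray(widths, dtype=float)
    ref = w.mean() if reference_width is None else float(reference_width)
    return w - ref
```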
In this embodiment, the process of constructing the positional centrifugal vector is:
Acquiring the eye center position of an eye area at each moment;
acquiring an iris center position of an iris region at each moment;
calculating the distance between the central position of the iris and the central position of the eye to obtain the position centrifugal value of the iris;
The position centrifugal value of the iris at each moment is constructed as a position centrifugal vector.
In this embodiment, the formula of the position centrifugal value of the iris is $\gamma = f(x_r - x_e)\sqrt{(x_r - x_e)^2 + (y_r - y_e)^2}$, where $\gamma$ is the position centrifugal value of the iris, $x_r$ is the abscissa of the iris center, $y_r$ is the ordinate of the iris center, $x_e$ is the abscissa of the eye center, $y_e$ is the ordinate of the eye center, and $f$ is a sign function: $f(x_r - x_e)$ is assigned 1 when $x_r - x_e$ is greater than 0, and -1 when $x_r - x_e$ is less than 0.
The invention calculates the distance between the eye center and the iris center, which accurately quantifies the position of the iris within the eye region. The sign function f is added so that the value reflects not only the distance but also the direction of the iris relative to the eye center.
In one implementation, the abscissa of the eye center is the average of the abscissas of all pixel points in the eye region, and the ordinate of the eye center is the average of the ordinates of all pixel points in the eye region. In another implementation, the abscissa of the eye center is the average of the abscissas of all pixel points on the outer edge of the eye region, and the ordinate of the eye center is the average of the ordinates of those outer-edge pixel points; that is, the eye center position is the geometric center of the eye region. Likewise, the abscissa of the iris center is the average of the abscissas of all pixel points in the iris region and the ordinate of the iris center is the average of their ordinates, or the abscissa and ordinate of the iris center are the averages of the abscissas and ordinates of the pixel points on the outer edge of the iris region; that is, the iris center position is the geometric center of the iris region. The manner of acquiring the eye center position and the iris center position is not limited in this embodiment. The outer edge refers to the pixel points on the outermost boundary of the eye region or the iris region.
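A minimal sketch of the position centrifugal value, assuming the first implementation above (centers taken as the mean pixel coordinates of each region):

```python
import numpy as np

def position_centrifugal_value(iris_mask, eye_mask):
    """Signed distance between the iris center and the eye center (gamma)."""
    yr, xr = np.nonzero(iris_mask)
    ye, xe = np.nonzero(eye_mask)
    x_r, y_r = xr.mean(), yr.mean()          # iris center
    x_e, y_e = xe.mean(), ye.mean()          # eye center
    dist = np.hypot(x_r - x_e, y_r - y_e)
    sign = 1.0 if x_r - x_e > 0 else -1.0    # sign function f(x_r - x_e)
    return sign * dist
```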
In this embodiment, the process of obtaining the feature change factor includes:
summing absolute values of adjacent element differences in each vector to obtain a total characteristic change value;
According to the group number of the adjacent elements, taking an average value of the total characteristic change value to obtain a characteristic change average value;
and carrying out normalization processing on the characteristic change average value to obtain a characteristic change factor.
The expression for obtaining the characteristic change average value is $e = \frac{1}{T-1}\sum_{t=1}^{T-1}\lvert E_{t+1} - E_t \rvert$, where $e$ is the feature change average value, $E_{t+1}$ is the element at the (t+1)-th time in the vector, $E_t$ is the element at the t-th time in the vector, $\lvert \cdot \rvert$ is the absolute value operation, $T-1$ is the number of groups of adjacent elements, and $T$ is the number of times.
Summing the absolute values of the differences of adjacent elements gives the overall characteristic change; averaging this total over the number of groups of adjacent elements yields the characteristic change average value, which reflects the average change speed; the normalization processing then places the values on a uniform scale for evaluation.
In this embodiment, the normalization of the characteristic change average value is as follows: for the texture centrifugal vector, the characteristic change average value is divided by the average texture value; for the width centrifugal vector, it is divided by the average width; and for the position centrifugal vector, it is divided by the absolute value of the difference between the maximum and minimum values in the position centrifugal vector.
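A minimal sketch of the feature change factor; the `norm` argument stands for the modality-specific normalizer described above (average texture value, average width, or the max-minus-min span of the position centrifugal vector):

```python
import numpy as np

def feature_change_factor(vector, norm):
    """Mean absolute difference of adjacent elements, scaled by the normalizer."""
    v = np.asarray(vector, dtype=float)
    if v.size < 2 or norm == 0:
        return 0.0
    total = np.abs(np.diff(v)).sum()        # total characteristic change value
    mean_change = total / (v.size - 1)      # average over the T-1 adjacent pairs
    return float(mean_change / norm)        # characteristic change factor
```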
As shown in fig. 2, the emotion recognition neural network model comprises a texture vector feature extraction module, a width vector feature extraction module, a position vector feature extraction module, a texture feature fusion module, a width feature fusion module, a position feature fusion module and a full connection layer;
the input end of the texture vector feature extraction module is used for inputting a texture centrifugal vector, the input end of the width vector feature extraction module is used for inputting a width centrifugal vector, and the input end of the position vector feature extraction module is used for inputting a position centrifugal vector;
the first input end of the texture feature fusion module is connected with the output end of the texture vector feature extraction module, and the second input end of the texture feature fusion module is used for inputting texture feature change factors;
the first input end of the width characteristic fusion module is connected with the output end of the width vector characteristic extraction module, and the second input end of the width characteristic fusion module is used for inputting a width characteristic change factor;
the first input end of the position feature fusion module is connected with the output end of the position vector feature extraction module, and the second input end of the position feature fusion module is used for inputting a position feature change factor;
the input end of the full-connection layer is respectively connected with the output end of the texture feature fusion module, the output end of the width feature fusion module and the output end of the position feature fusion module, and the output end of the full-connection layer is used as the output end of the emotion recognition neural network model.
The texture characteristic change factor is a characteristic change factor corresponding to the texture centrifugal vector, the width characteristic change factor is a characteristic change factor corresponding to the width centrifugal vector, and the position characteristic change factor is a characteristic change factor corresponding to the position centrifugal vector.
According to the invention, three time sequence vectors are respectively processed through the three feature extraction modules, the vector features are extracted, and the three feature fusion modules are used for fusing the feature change factors and the vector features, so that the emotion recognition accuracy is improved.
In the embodiment, the texture vector feature extraction module is used for extracting texture features from texture centrifugal vectors, the width vector feature extraction module is used for extracting width features from width centrifugal vectors, and the position vector feature extraction module is used for extracting position features from position centrifugal vectors;
the texture feature fusion module is used for fusing the texture features and the texture feature change factor to obtain texture fusion features, the width feature fusion module is used for fusing the width features and the width feature change factor to obtain width fusion features, and the position feature fusion module is used for fusing the position features and the position feature change factor to obtain position fusion features;
the full-connection layer is used for classifying according to texture fusion features, width fusion features and position fusion features to obtain emotion types.
In this embodiment, the expression of the texture feature fusion module, the width feature fusion module and the position feature fusion module is $y = w_1 g_1 + w_2 g_2$, where $y$ is the output of the feature fusion module, $g_1$ is the input at the first input end of the feature fusion module, $g_2$ is the input at the second input end of the feature fusion module, $w_1$ is the weight of $g_1$, and $w_2$ is the weight of $g_2$.
In the embodiment, the texture vector feature extraction module, the width vector feature extraction module and the position vector feature extraction module all comprise an LSTM network and a full connection layer which are sequentially connected, wherein the LSTM network is used for extracting shallow features from vectors, and the full connection layer is used for carrying out feature mapping on the shallow features to obtain deep features. More preferably, the texture vector feature extraction module, the width vector feature extraction module and the position vector feature extraction module all comprise an LSTM network, a two-dimensional feature construction layer and a CNN network which are sequentially connected;
The LSTM network is used for extracting shallow features from the vectors, the two-dimensional feature construction layer is used for constructing the shallow features into two-dimensional features, and the CNN network is used for extracting deep features from the two-dimensional features. For the texture vector feature extraction module the deep features are texture features, for the width vector feature extraction module the deep features are width features, and for the position vector feature extraction module the deep features are position features.
The operation of the two-dimensional feature construction layer is $H = h^{T}h$, where $H$ is the two-dimensional feature, $h$ is the vector formed by the feature values $h_t$ output by the LSTM network, and $T$ denotes the transposition operation.
Through the LSTM network, the invention effectively extracts the time-sequence information in each vector and captures the dynamic change of the eye features over time. The LSTM output features are then assembled into two-dimensional data, which increases the data volume and makes it convenient to exploit the strong spatial feature extraction capability of the CNN network to mine the spatial relations among the features. The model can thus jointly use temporal and spatial information, represent the eye features more comprehensively, and improve emotion recognition accuracy.
In this embodiment, the emotion types include happiness, sadness, anger, low mood, and the like.
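A minimal PyTorch sketch of the emotion recognition neural network model described above is given below; the hidden sizes, CNN configuration, learnable fusion weights and four-class output are illustrative assumptions rather than values specified in the patent.

```python
import torch
import torch.nn as nn

class VectorBranch(nn.Module):
    """LSTM -> two-dimensional feature H = h^T h -> small CNN -> deep feature vector."""
    def __init__(self, hidden=16, out_dim=32):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(8 * 4 * 4, out_dim), nn.ReLU())

    def forward(self, x):                    # x: (batch, T) centrifugal vector
        _, (h_n, _) = self.lstm(x.unsqueeze(-1))
        h = h_n[-1]                          # shallow feature vector (batch, hidden)
        H = h.unsqueeze(2) @ h.unsqueeze(1)  # two-dimensional feature H = h^T h
        return self.cnn(H.unsqueeze(1))      # deep feature (batch, out_dim)

class EmotionNet(nn.Module):
    """Three branches, weighted fusion with the change factors, shared FC classifier."""
    def __init__(self, feat_dim=32, n_classes=4):
        super().__init__()
        self.branches = nn.ModuleList([VectorBranch(out_dim=feat_dim) for _ in range(3)])
        self.w1 = nn.Parameter(torch.ones(3))   # weights of the vector features g1
        self.w2 = nn.Parameter(torch.ones(3))   # weights of the change factors g2
        self.fc = nn.Linear(3 * feat_dim, n_classes)

    def forward(self, vectors, change_factors):
        # vectors: list of 3 tensors (batch, T); change_factors: (batch, 3)
        fused = []
        for i, (branch, v) in enumerate(zip(self.branches, vectors)):
            g1 = branch(v)                                   # texture/width/position feature
            g2 = change_factors[:, i:i + 1]                  # corresponding change factor
            fused.append(self.w1[i] * g1 + self.w2[i] * g2)  # y = w1*g1 + w2*g2
        return self.fc(torch.cat(fused, dim=1))              # emotion class logits
```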
According to the invention, the iris region, the sclera region and the periocular region are obtained by extracting and dividing eye images at a plurality of continuous moments, different characteristics of the regions (such as texture values of the periocular region, widths of the eye region and distances between the eye region and the iris region) are analyzed, and the texture centrifugal value, the width centrifugal value and the position centrifugal value are extracted, so that the states of the texture deviating from the mean value, the width deviating from the mean value and the iris position deviating from the center at each moment are reflected, the actual movement condition of eyes can be reflected more comprehensively and accurately, and the limitation caused by the distance of eyelid feature points is avoided, thereby improving the emotion recognition precision.
The invention acquires the characteristic change factor for each vector and reflects the change speed of the elements in the vector. The change in emotion is dynamic, not only in the static values of the eye features, but also in the rate of change of these features over time.
According to the invention, the texture centrifugal vector, the width centrifugal vector and the position centrifugal vector are processed by adopting the emotion recognition neural network model, and the feature change factors corresponding to the vectors are adopted, so that the accuracy of model emotion classification is further improved.
The above is only a preferred embodiment of the present invention, and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (10)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202510592066.2A CN120108028B (en) | 2025-05-09 | 2025-05-09 | An emotion recognition method based on eye gaze analysis |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202510592066.2A CN120108028B (en) | 2025-05-09 | 2025-05-09 | An emotion recognition method based on eye gaze analysis |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN120108028A CN120108028A (en) | 2025-06-06 |
| CN120108028B true CN120108028B (en) | 2025-07-18 |
Family
ID=95890316
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202510592066.2A Active CN120108028B (en) | 2025-05-09 | 2025-05-09 | An emotion recognition method based on eye gaze analysis |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN120108028B (en) |
Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN113343943A (en) * | 2021-07-21 | 2021-09-03 | 西安电子科技大学 | Eye image segmentation method based on sclera region supervision |
Family Cites Families (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US10909363B2 (en) * | 2019-05-13 | 2021-02-02 | Fotonation Limited | Image acquisition system for off-axis eye images |
| CN112163456B (en) * | 2020-08-28 | 2024-04-09 | 北京中科虹霸科技有限公司 | Identity recognition model training method, testing method, recognition method and device |
- 2025-05-09: CN application CN202510592066.2A, patent CN120108028B/en, status Active
Patent Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN113343943A (en) * | 2021-07-21 | 2021-09-03 | 西安电子科技大学 | Eye image segmentation method based on sclera region supervision |
Also Published As
| Publication number | Publication date |
|---|---|
| CN120108028A (en) | 2025-06-06 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| Amrutha et al. | ML based sign language recognition system | |
| CN106960202B (en) | Smiling face identification method based on visible light and infrared image fusion | |
| Tome et al. | Facial soft biometric features for forensic face recognition | |
| US6611613B1 (en) | Apparatus and method for detecting speaking person's eyes and face | |
| US20160371539A1 (en) | Method and system for extracting characteristic of three-dimensional face image | |
| CN102013011B (en) | Front-face-compensation-operator-based multi-pose human face recognition method | |
| CN109271930B (en) | Micro-expression recognition method, device and storage medium | |
| CN112818899B (en) | Face image processing method, device, computer equipment and storage medium | |
| CN108062543A (en) | A kind of face recognition method and device | |
| WO2021196721A1 (en) | Cabin interior environment adjustment method and apparatus | |
| CN108629336A (en) | Face value calculating method based on human face characteristic point identification | |
| CN111460950B (en) | Cognitive distraction method based on head-eye evidence fusion in natural driving conversation behavior | |
| CN110543848B (en) | Driver action recognition method and device based on three-dimensional convolutional neural network | |
| CN113920575A (en) | Facial expression recognition method and device and storage medium | |
| CN110008920A (en) | Research on facial expression recognition method | |
| Jacintha et al. | A review on facial emotion recognition techniques | |
| CN116645717B (en) | A micro-expression recognition method and system based on PCANet+ and LSTM | |
| RU2768797C1 (en) | Method and system for determining synthetically modified face images on video | |
| CN120108028B (en) | An emotion recognition method based on eye gaze analysis | |
| CN107977622B (en) | Eye state detection method based on pupil characteristics | |
| Kim et al. | Facial landmark extraction scheme based on semantic segmentation | |
| Pathak et al. | Multimodal eye biometric system based on contour based E-CNN and multi algorithmic feature extraction using SVBF matching | |
| Lin et al. | A gender classification scheme based on multi-region feature extraction and information fusion for unconstrained images | |
| Barra et al. | F-FID: fast fuzzy-based iris de-noising for mobile security applications | |
| CN110688872A (en) | Lip-based person identification method, device, program, medium, and electronic apparatus |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| SE01 | Entry into force of request for substantive examination | ||
| GR01 | Patent grant | ||
| GR01 | Patent grant |