CN113688701B - Facial paralysis detection method and system based on computer vision - Google Patents
Facial paralysis detection method and system based on computer vision
- Publication number
- CN113688701B (application CN202110915679.7A)
- Authority
- CN
- China
- Prior art keywords
- image
- region
- facial paralysis
- key frame
- index
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)
- Image Analysis (AREA)
Abstract
The invention relates to a facial paralysis detection method and system based on computer vision, and belongs to the field of facial paralysis detection. The method comprises the following steps: acquiring a face region image of a person to be tested and the corresponding face region depth map; symmetrically dividing the face region image according to the key points of the face region to obtain an image sequence for each symmetric region under a designated action; obtaining, from each image sequence, the speed index of the symmetric region corresponding to each image in that sequence; obtaining, from the face region depth map, the depth index corresponding to each image in each image sequence; obtaining the key frame images of each symmetric region according to the speed index of the symmetric region corresponding to each image and the depth index corresponding to each image; and obtaining the facial paralysis detection result of each key frame image according to the texture information matrices of the face region images in that key frame image. The invention ensures that each obtained key frame image is a front view with obvious motion characteristics, and improves the accuracy of facial paralysis detection.
Description
Technical Field
The invention relates to the technical field of facial paralysis detection, in particular to a facial paralysis detection method and system based on computer vision.
Background
Facial paralysis is a disease whose main characteristic is dysfunction of the facial expression muscles; its typical symptoms are facial distortion and uncoordinated facial expressions. Because they cannot fully control their facial muscles, facial paralysis patients often lack flexibility in facial movements and may present other symptoms including drooling, speech problems, and nasal congestion.
In prior-art facial paralysis detection methods, key frames of face images are selected based on the bilateral symmetry of the face under different actions, and whether and how severely a person is facially paralyzed is then judged from those key frames alone. However, this way of selecting key frames considers only bilateral symmetry and ignores the relative angle between the camera and the face at the moment each image was captured, even though that angle also affects the apparent bilateral symmetry. As a result, the selected key frames may not accurately reflect facial abnormalities, and the accuracy of facial paralysis detection is low.
Disclosure of Invention
The invention provides a facial paralysis detection method and system based on computer vision to solve the problem that existing methods cannot accurately detect facial paralysis, adopting the following technical scheme:

In a first aspect, an embodiment of the present invention provides a facial paralysis detection method based on computer vision, comprising the following steps:
acquiring a face region image of a person to be detected and a corresponding face region depth map;
obtaining key points of a face area according to the face area image;
symmetrically dividing the face region image according to the key points of the face region to obtain image sequences of each symmetrical region under the designated action;
obtaining a speed index of a symmetrical area corresponding to each image in each image sequence according to each image sequence; obtaining a depth index corresponding to each image in each image sequence according to the face region depth map;
obtaining each key frame image corresponding to each symmetrical area according to the speed index of the symmetrical area corresponding to each image and the depth index corresponding to each image;
and obtaining the facial paralysis detection result of each key frame image according to the texture information matrix of the face region image in each key frame image.
The invention also provides a facial paralysis detection system based on computer vision, comprising a memory and a processor, wherein the processor executes a computer program stored in the memory to implement the facial paralysis detection method based on computer vision, including acquiring the face region image of the person to be tested and the corresponding face region depth map.
The beneficial effects of the facial paralysis detection method and system are as follows: the key frame images of each symmetric region are obtained according to the speed index of the symmetric region corresponding to each image and the depth index corresponding to each image, and the facial paralysis detection result of each key frame image is then obtained from the texture information matrices of the face region images in that key frame image. Using both the speed index and the depth index as the basis for key frame selection ensures that each obtained key frame image is a front view with obvious motion characteristics, avoids errors in the facial paralysis detection result caused by deviations in the relative angle between the camera and the face while the person to be tested performs a designated action, and thus improves the accuracy of facial paralysis detection.
Preferably, the method for obtaining the speed index comprises the following steps:
obtaining the speed index of the symmetric region corresponding to each image in each image sequence from the average instantaneous speed of the symmetric region corresponding to each image in each image sequence, as follows:

the speed index is calculated according to the following formula:

γ_{m,n} = (v̄^{left}_{m,n} + v̄^{right}_{m,n}) − |v̄^{left}_{m,n} − v̄^{right}_{m,n}|

wherein γ_{m,n} is the speed index corresponding to the n-th image in the m-th image sequence, v̄^{left}_{m,n} is the average instantaneous speed of the left symmetric region corresponding to the n-th image in the m-th image sequence, and v̄^{right}_{m,n} is the average instantaneous speed of the right symmetric region corresponding to the n-th image in the m-th image sequence.
Preferably, the method of calculating the average instantaneous speed comprises:
obtaining the instantaneous speed of each key point in the symmetrical area corresponding to each image in each image sequence according to each image sequence;
and obtaining the average instantaneous speed of the symmetrical area corresponding to each image in each image sequence according to the instantaneous speed of the key point.
Preferably, the average instantaneous speed of the symmetric region corresponding to each image in each image sequence is calculated according to the following formula:

v̄^{left}_{m,n} = (1 / j_left) · Σ_{i=1}^{j_left} exp(v^{left}_{m,n,i})

v̄^{right}_{m,n} = (1 / j_right) · Σ_{i=1}^{j_right} exp(v^{right}_{m,n,i})

wherein v̄^{left}_{m,n} is the average instantaneous speed of the left symmetric region corresponding to the n-th image in the m-th image sequence, j_left is the total number of key points in that left symmetric region, and v^{left}_{m,n,i} is the instantaneous speed of the i-th key point in that region; v̄^{right}_{m,n}, j_right and v^{right}_{m,n,i} are defined analogously for the right symmetric region.
Preferably, the depth index is obtained by averaging depth values of key points in the nose region in the face region depth map corresponding to the person to be measured in each image.
Preferably, each key frame image should satisfy the following requirements:
the depth index satisfies the formula |h_{m,n} − h_0| ≤ H, wherein h_{m,n} is the depth index of any image in any image sequence, h_0 is the standard depth value of the nose region when the person to be tested looks straight at the camera, and H is a preset threshold.
Preferably, the method for facial paralysis detection comprises:
calculating the facial paralysis judgment index of each key frame image according to the following formula:
L_q = ||(F_right − F_left)_{a×b}||_2

wherein L_q is the facial paralysis judgment index of the q-th key frame image, F_right is the texture information matrix of the right face region image in the q-th key frame image, F_left is the texture information matrix of the left face region image in the q-th key frame image, a is the number of rows of the matrices, and b is the number of columns.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings described below are only some embodiments of the present invention, and that other drawings can be obtained from them by those skilled in the art without creative effort.
Fig. 1 is a flowchart of a facial paralysis detection method based on computer vision according to an embodiment of the present invention.
Fig. 2 is a schematic view of key points of a human face in a facial paralysis detection method based on computer vision according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings. It is obvious that the described embodiments are only a part of the embodiments of the present invention rather than all of them; all other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
The embodiment provides a facial paralysis detection method based on computer vision, which is described in detail as follows:
as shown in fig. 1, the facial paralysis detecting method based on computer vision includes the following steps:
and S001, acquiring a face region image of the person to be detected and a corresponding face region depth map.
An RGBD depth camera adds depth measurement to the functions of an ordinary RGB camera, so an RGB image and the corresponding depth image can be acquired with it. In this embodiment, an RGBD depth camera is deployed to collect an RGB image and a depth image of the person to be tested. The collected RGB image is sent to a semantic segmentation network to obtain a mask image of the face region; multiplying the mask image element-wise with the collected RGB image gives the face region image, denoted I_1, and multiplying the mask image with the collected depth image gives the face region depth map, denoted I_2.
It should be noted that the RGBD depth camera should be deployed so that the person to be tested faces the camera directly and the camera's field of view covers the entire face.
In this embodiment, the training process of the semantic segmentation network is as follows: sample images collected with the RGBD depth camera are used as the training set; the training set is labeled manually, with the pixels of the face region labeled 1 and all other pixels labeled 0, giving the label data; the network is then trained on the training set and label data, using a cross entropy loss function and updating the parameters iteratively.
In this embodiment, semantic segmentation networks such as U-Net or DeepLabv3+ can be used. At inference time, the RGB image collected from the person to be tested is input directly into the network; the output mask image is a binary image in which the pixels of the face region have value 1 and all other pixels have value 0.
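As an illustration of this masking step, a minimal Python/NumPy sketch follows; `segment_face` is a hypothetical wrapper for whichever trained segmentation network (U-Net, DeepLabv3+, etc.) is actually deployed:

```python
import numpy as np

def extract_face_regions(rgb, depth, segment_face):
    """Apply a binary face mask to the RGB image and the depth map.

    rgb:   HxWx3 uint8 image from the RGBD camera
    depth: HxW depth map aligned with the RGB image
    segment_face: callable returning an HxW {0,1} mask (hypothetical
                  wrapper around a trained U-Net / DeepLabv3+ model)
    """
    mask = segment_face(rgb)             # HxW, 1 on face pixels, 0 elsewhere
    face_img = rgb * mask[..., None]     # I_1: face region image
    face_depth = depth * mask            # I_2: face region depth map
    return face_img, face_depth
```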
Step S002: obtaining the key points of the face region according to the face region image.
In this embodiment, face key point detection is performed on the face region image to obtain the 2D landmarks, as shown in Fig. 2.
It should be noted that there are many methods for obtaining the 2D landmarks of a face; they are well-known techniques and are not described in detail in this embodiment. Existing tools such as OpenFace or DAN can be chosen according to the actual situation.
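As a hedged sketch of this step, the following uses dlib's 68-point landmark predictor as one possible stand-in for OpenFace or DAN; the model file path is an assumption:

```python
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()
# path to the standard 68-landmark model file (assumed to be available locally)
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def landmarks_2d(gray_img):
    """Return a (68, 2) array of face key points, or None if no face found."""
    faces = detector(gray_img)
    if not faces:
        return None
    shape = predictor(gray_img, faces[0])
    return np.array([(p.x, p.y) for p in shape.parts()])
```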
Step S003: symmetrically dividing the face region image according to the key points of the face region to obtain the image sequences of each symmetric region under the designated actions.
In this embodiment, the face region image is divided according to the face key points into left and right eyebrow regions, left and right eye regions, left and right cheek regions, left and right mouth regions, and a nose region. The mouth region is split into left and right halves at the abscissa of the nose center point: the part to the left of the nose center point is recorded as the left mouth region, and the part to the right as the right mouth region.
For each divided region, the set of its key points is collected, and the maximum and minimum abscissa and the maximum and minimum ordinate within that set are found; the bounding rectangle of each region is then determined from these four extreme values.
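A minimal sketch of this bounding-rectangle step; the mapping from regions to landmark indices is an assumption (the 68-point layout is one common choice):

```python
import numpy as np

def region_rect(landmarks, idx):
    """Bounding rectangle (x_min, y_min, x_max, y_max) of one region.

    landmarks: (N, 2) array of face key points
    idx: indices of the key points belonging to this region, e.g.
         range(36, 42) for one eye in the 68-point layout (an assumption)
    """
    pts = landmarks[list(idx)]
    x_min, y_min = pts.min(axis=0)
    x_max, y_max = pts.max(axis=0)
    return int(x_min), int(y_min), int(x_max), int(y_max)
```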
Different designated actions correspond to movements of different regions of the face. The person to be tested is instructed to perform, in sequence, a frowning action for the left and right eyebrow regions, an eye-closing action for the left and right eye regions, a cheek-puffing action for the left and right cheek regions, and a whistling action for the left and right mouth regions, yielding four image sequences of designated actions, recorded as S_1, S_2, S_3, S_4.
As another embodiment, the designated motion may be set for different regions according to different requirements, for example, the chewing motion may be set for the tooth region.
Step S004, obtaining the speed index of the symmetrical area corresponding to each image in each image sequence according to each image sequence; and obtaining a depth index corresponding to each image in each image sequence according to the face region depth map.
In this embodiment, the instantaneous speed of each key point in the symmetric region corresponding to each image in each image sequence is obtained with an optical flow method. Optical flow obtains object motion information from the correspondence of pixels between adjacent frames; here the pixels of interest are the key points. The optical flow information of each key point yields its motion information at different moments, and hence its instantaneous speed.
In this embodiment, the process of obtaining the instantaneous velocity of each key point in the symmetric region corresponding to each image in each image sequence according to the optical flow method is as follows:
For any key point at coordinates (x, y) at time t, record its gray value as I(x, y, t). After an interval Δt, i.e. at time t+1, the key point has moved to (x+Δx, y+Δy), with gray value I(x+Δx, y+Δy, t+Δt). Since the two observations are of the same point at two different times, I(x, y, t) = I(x+Δx, y+Δy, t+Δt); expanding and taking the limit gives the optical flow constraint I_x·V_x + I_y·V_y + I_t = 0, from which the optical flow information V = (V_x, V_y) of the key point is obtained, where V_x is the velocity component along the x-axis and V_y is the velocity component along the y-axis. The instantaneous speed of the key point at time t+1 is then v = sqrt(V_x² + V_y²).
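As one concrete realization of this step, the sketch below uses OpenCV's pyramidal Lucas–Kanade tracker; the embodiment does not name a specific optical flow implementation, so this choice is an assumption:

```python
import cv2
import numpy as np

def keypoint_speeds(prev_gray, next_gray, pts):
    """Instantaneous speed of each key point between consecutive frames.

    prev_gray, next_gray: consecutive grayscale frames
    pts: (N, 2) array of key point coordinates in prev_gray
    Returns an (N,) array of speeds in pixels/frame (NaN if untracked).
    """
    pts = pts.reshape(-1, 1, 2).astype(np.float32)
    next_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, next_gray, pts, None)
    flow = (next_pts - pts).reshape(-1, 2)     # (Vx, Vy) per key point
    speeds = np.linalg.norm(flow, axis=1)      # sqrt(Vx^2 + Vy^2)
    speeds[status.ravel() == 0] = np.nan       # mark lost points
    return speeds
```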
In this way, the instantaneous speed of each key point in the symmetric region corresponding to each image in each image sequence can be obtained in turn. From these key point speeds, the average instantaneous speed of the symmetric region corresponding to each image is obtained; the average instantaneous speed reflects the overall motion characteristics of that symmetric region.
In this embodiment, the average instantaneous speed is positively correlated with the instantaneous speeds of the key points, so the relationship between them can be fitted by mathematical modeling. The average instantaneous speed of the symmetric region corresponding to each image in each image sequence is:

v̄^{left}_{m,n} = (1 / j_left) · Σ_{i=1}^{j_left} exp(v^{left}_{m,n,i})

v̄^{right}_{m,n} = (1 / j_right) · Σ_{i=1}^{j_right} exp(v^{right}_{m,n,i})

wherein v̄^{left}_{m,n} is the average instantaneous speed of the left symmetric region corresponding to the n-th image in the m-th image sequence, j_left is the total number of key points in that left symmetric region, and v^{left}_{m,n,i} is the instantaneous speed of its i-th key point; v̄^{right}_{m,n}, j_right and v^{right}_{m,n,i} are defined analogously for the right symmetric region.
The above way of calculating the average instantaneous speed is only one preferred implementation of this embodiment. As another embodiment, a different calculation can be set according to different requirements, as long as the instantaneous speeds of the key points in a symmetric region remain positively correlated with the average instantaneous speed of that region.
The average instantaneous speeds obtained through the above process form an average instantaneous speed sequence for each image sequence, recorded as

{(v̄^{left}_{m,1}, v̄^{right}_{m,1}), (v̄^{left}_{m,2}, v̄^{right}_{m,2}), …, (v̄^{left}_{m,N}, v̄^{right}_{m,N})}

wherein v̄^{left}_{m,1} is the average instantaneous speed of the left symmetric region corresponding to the 1st image in the m-th image sequence, v̄^{right}_{m,1} is the average instantaneous speed of the corresponding right symmetric region, and so on up to the last (N-th) image in the sequence.
It should be noted that the above process exponentiates the instantaneous speed of each key point because facial movements are relatively slow; exponentiation makes the result more sensitive to changes in instantaneous speed and thus enhances the features.
In this embodiment, the speed index of the symmetric region corresponding to each image in each image sequence is obtained from the average instantaneous speeds computed above, according to the following formula:

γ_{m,n} = (v̄^{left}_{m,n} + v̄^{right}_{m,n}) − |v̄^{left}_{m,n} − v̄^{right}_{m,n}|

wherein γ_{m,n} is the speed index corresponding to the n-th image in the m-th image sequence, and v̄^{left}_{m,n} and v̄^{right}_{m,n} are the average instantaneous speeds of the left and right symmetric regions corresponding to that image.
This formula for the speed index considers not only the motion characteristics of the person to be tested, but also the errors caused by non-standard execution when the person to be tested performs the designated action.
The above way of calculating the speed index is only one preferred implementation of this embodiment. As another embodiment, a different calculation can be set according to different requirements, as long as it simultaneously accounts for the motion characteristics of the person to be tested and the errors caused by non-standard execution of the designated action; the relationship between the two is fitted by mathematical modeling.
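Under the formulas reconstructed above, a minimal NumPy sketch of the average instantaneous speed and the speed index (the exact functional form of the original formula image is an assumption consistent with the stated requirements):

```python
import numpy as np

def avg_instant_speed(key_speeds):
    """Average instantaneous speed of one symmetric region: the mean of the
    exponentiated key point speeds (exponentiation amplifies the small
    changes typical of facial motion, as described above)."""
    return np.mean(np.exp(key_speeds))

def speed_index(left_speeds, right_speeds):
    """Speed index for one image: large when both regions move a lot,
    smaller when the two sides move asymmetrically."""
    v_left = avg_instant_speed(left_speeds)
    v_right = avg_instant_speed(right_speeds)
    return (v_left + v_right) - abs(v_left - v_right)
```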
In this embodiment, the depth index is used to determine whether each image is a front view. The nose region is selected because its depth information is insensitive to facial movements, whereas it changes greatly when the person to be tested twists, lowers, or raises the head.
In this embodiment, the depth index corresponding to each image in each image sequence is obtained by averaging the exponentiated depth values of the key points in the nose region of the face region depth map corresponding to the person to be tested. The depth values of the nose-region key points are positively correlated with the depth index: the larger the depth values, the larger the depth index. A functional relationship between the depth values and the depth index is fitted by mathematical modeling, giving:

h_{m,n} = (1 / n_nose) · Σ_{i=1}^{n_nose} exp(d_i)

wherein h_{m,n} is the depth index corresponding to the n-th image in the m-th image sequence, d_i is the depth value of the i-th key point in the nose region of that image, and n_nose is the number of key points in the nose region.
The above way of calculating the depth index is only one preferred implementation. As another embodiment, a different calculation can be set according to different requirements, as long as the depth values of the nose-region key points remain positively correlated with the depth index corresponding to each image in each image sequence.
The depth values of the key points in the nose region are exponentiated in this embodiment to make the depth index more sensitive to changes in the depth values.
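A sketch of the depth index under the same assumptions; the depth values are assumed to be in normalized units (e.g. metres) so that exponentiation stays numerically reasonable:

```python
import numpy as np

def depth_index(nose_depths):
    """Depth index of one image: mean of exponentiated depth values of the
    nose-region key points (exponentiation makes the index more sensitive
    to depth changes from head turning, raising, or lowering).

    nose_depths: depth values of the nose key points, assumed normalized.
    """
    return float(np.mean(np.exp(np.asarray(nose_depths, dtype=float))))
```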
Step S005: obtaining the key frame images of each symmetric region according to the speed index of the symmetric region corresponding to each image and the depth index corresponding to each image.
In this embodiment, the key frame images of each symmetric region are obtained according to the speed index of the symmetric region corresponding to each image and the depth index corresponding to each image. For a key frame image, the sum of the average instantaneous speeds of the left and right symmetric regions in its speed index should be as large as possible, and the absolute difference between those two average instantaneous speeds should be as small as possible; in addition, its depth index must satisfy the judgment formula |h_{m,n} − h_0| ≤ H, wherein h_{m,n} is the depth index of the image, h_0 is the standard depth value, namely the average exponentiated depth value of the nose-region key points when the person to be tested looks straight at the camera, and H is a preset threshold, set to 0.2 in this embodiment; the threshold is an empirical value set according to the actual situation.
When an image meets these requirements, it is taken as a key frame image of its image sequence.
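Putting the two indexes together, a hedged sketch of key frame selection for one image sequence; the arg-max selection rule is an assumption that matches the "largest sum, smallest difference, front view" requirements above:

```python
import numpy as np

def select_key_frame(speed_idx, depth_idx, h0, H=0.2):
    """Pick the key frame of one image sequence.

    speed_idx: (N,) speed index per frame (larger = better)
    depth_idx: (N,) depth index per frame
    h0: standard depth value (front-view nose depth index)
    H: preset depth threshold (0.2 in this embodiment)
    Returns the index of the chosen frame, or None if no frame passes.
    """
    frontal = np.abs(depth_idx - h0) <= H       # front-view candidates only
    if not frontal.any():
        return None
    scores = np.where(frontal, speed_idx, -np.inf)
    return int(np.argmax(scores))               # most distinct motion wins
```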
Thus, in this embodiment, a key frame image is obtained for each image sequence of a designated action according to the above method, i.e. key frame images for the four designated actions of frowning, eye closing, cheek puffing, and whistling; together with an image of the person to be tested at rest, five key frame images are obtained in total.
It should be noted that requiring a large sum and a small absolute difference of the left and right average instantaneous speeds selects key frames that reflect obvious motion characteristics while reducing errors caused by non-standard execution of the designated action; non-standard execution means, for example, closing only one eye when asked to close both.
Step S006: obtaining the facial paralysis detection result of each key frame image according to the texture information matrices of the face region images in each key frame image.
In this embodiment, features are extracted from the single-side regions of the face region image in each key frame image, where single-side regions are the left-side or right-side regions of the face region image. The extracted texture features include gradient features and gray scale features. Taking the left eye region of one key frame image as an example, the feature extraction process is as follows:
carrying out graying processing on the left eye area to obtain a Gray image Gray of the left eye area, and carrying out Gaussian filtering processing on the Gray image; the gaussian filtering is a well-known technique and is not described in detail in this embodiment.
The gray values in the gray image Gray are divided into 16 pixel levels, the specific division formula being:

Gray'(f, g) = ⌈Gray(f, g) / 16⌉

where Gray(f, g) is the gray value at location (f, g) and Gray'(f, g) is its quantized level.
As another embodiment, the Gray scale levels on the Gray scale map Gray may be divided into other pixel levels according to different requirements, for example, 32 or 64 Gray scale levels.
It should be noted that the gray value range of the quantized image is [0, 16]; the gray level co-occurrence matrix G of the left eye region is then obtained, with shape 16×16. The gray level co-occurrence matrix is a statistical matrix reflecting the spatial correlation of gray levels and can represent the texture features of an image; its generation method is a well-known technique, so it is not described in detail in this embodiment.
Then the energy (ASE), inverse difference moment (IDM), and entropy (ENT) of the gray level co-occurrence matrix G are used as the final feature values of the left eye region, giving the feature vector f_left-eye = {ASE, IDM, ENT}^T.
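A hedged sketch of these three features with scikit-image's GLCM utilities; the distance and angle parameters are assumptions (the embodiment does not specify them), and 'homogeneity' is used as the inverse difference moment:

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features(quantized, levels=17):
    """Feature vector {ASE, IDM, ENT} of one region.

    quantized: 2-D integer image with values in [0, levels-1]
    (distance 1 and angle 0 are illustrative choices, not from the patent)
    """
    glcm = graycomatrix(quantized, distances=[1], angles=[0],
                        levels=levels, symmetric=True, normed=True)
    p = glcm[:, :, 0, 0]
    ase = graycoprops(glcm, "ASM")[0, 0]           # energy
    idm = graycoprops(glcm, "homogeneity")[0, 0]   # inverse difference moment
    ent = -np.sum(p[p > 0] * np.log2(p[p > 0]))    # entropy
    return np.array([ase, idm, ent])
```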
Using this method, the feature vectors of the four left-side regions (left eyebrow, left eye, left cheek, and left mouth) in each key frame image are obtained and assembled into the texture information matrix F_left of the left side of that key frame image; the matrix shape is 3×4, where the 4 columns correspond to the 4 regions and the 3 rows correspond to the three feature values of each region's feature vector. The texture information matrix F_right of the right side of each key frame image is obtained in the same way.
It should be noted that the feature vector here refers to the texture feature descriptor of each region, and the feature values are the indices within each descriptor, i.e. the energy (ASE), inverse difference moment (IDM), and entropy (ENT) of the gray level co-occurrence matrix G.
Facial paralysis detection is performed on each key frame image according to the texture information matrices of its left and right regions, and the facial paralysis judgment index of each key frame image is calculated according to the following formula:
L_q = ||(F_right − F_left)_{a×b}||_2

wherein L_q is the facial paralysis judgment index of the q-th key frame image, F_right is the texture information matrix of the right face region image in the q-th key frame image, F_left is the texture information matrix of the left face region image in the q-th key frame image, a is the number of rows of the matrices and b the number of columns; in this embodiment a = 3 and b = 4.
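A one-function NumPy sketch of this judgment index, reading the matrix 2-norm as the Frobenius norm of the 3×4 difference matrix:

```python
import numpy as np

def paralysis_index(F_right, F_left):
    """Facial paralysis judgment index L_q of one key frame image.

    F_right, F_left: 3x4 texture information matrices (rows = ASE/IDM/ENT,
    columns = eyebrow/eye/cheek/mouth regions of one side).
    """
    return float(np.linalg.norm(F_right - F_left))  # Frobenius norm
```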
It should be noted that a normal face is symmetric, and this symmetry is reflected not only in texture information but also in depth information. A region with asymmetric depth information has a higher probability of facial abnormality, so its texture features deserve more attention. In this embodiment, weights are assigned according to the difference between the left and right depth values of each pair of regions of the same type, realizing an attention mechanism; and because different regions influence the key frame detection result to different degrees, assigning different weights to different regions optimizes the facial paralysis detection of the key frames, focuses the detection result on the key regions, and can improve the accuracy of facial paralysis detection.
In this embodiment, a symmetry axis is determined for each of the five key frames. The center point of the nose tip lies in the middle of the face and does not move with facial actions; it is recorded as P_1. The midpoint of the line connecting the left and right inner canthi has the same property and is recorded as P_2. The symmetry axis of the face region is then the straight line P_1P_2.
Each symmetric region in each key frame is adjusted according to the symmetry axis. The specific adjustment is as follows: taking the face region on the right of the symmetry axis as the reference, the left regions are resized to the sizes of their right counterparts, so that each pair of symmetric regions has the same size (the left eye region equals the right eye region in size, and so on).
In the same way as the standard depth value of the nose region was obtained when the person to be tested looks straight at the camera at rest, a first depth value is obtained for each symmetric region in each adjusted key frame.
The difference index between the first depth values of each pair of symmetric regions in each key frame image is calculated according to the following formula:

Δh_{q,x} = |h^{left}_{q,x} − h^{right}_{q,x}|

wherein Δh_{q,x} is the difference index between the first depth values of the x-th symmetric region in the q-th key frame image, h^{left}_{q,x} is the first depth value of the left region of the x-th symmetric region in the q-th key frame image, and h^{right}_{q,x} is the first depth value of the corresponding right region.
Weights are assigned to the symmetric regions in each key frame image according to the difference indexes between their first depth values: the larger the difference index, the larger the weight. In this embodiment all weights lie in the range [1, 2), and the weight of each symmetric region in each key frame image is calculated according to the following formula:

d_{q,x} = 1 + Δh_{q,x} / (1 + Δh_{q,x})

wherein d_{q,x} is the weight of the x-th symmetric region in the q-th key frame image and Δh_{q,x} is the difference index between the first depth values of the x-th symmetric region in the q-th key frame image.
According to the above process, the weights of the symmetric regions in each key frame image are combined into a weight matrix; in this embodiment the weight matrix has 4 rows and 1 column.
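A sketch of this weighting step under the reconstructed formulas above (absolute left-right depth difference mapped into [1, 2)):

```python
import numpy as np

def region_weights(depth_left, depth_right):
    """Attention weights for the 4 symmetric region pairs of one key frame.

    depth_left, depth_right: (4,) first depth values of the left/right
    regions (eyebrow, eye, cheek, mouth). Returns a (4,) weight vector
    in [1, 2): a larger left-right depth difference gives a larger weight.
    """
    diff = np.abs(np.asarray(depth_left) - np.asarray(depth_right))
    return 1.0 + diff / (1.0 + diff)
```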
The above way of obtaining the weights is only one preferred implementation of this embodiment. As another embodiment, a different calculation can be set according to different requirements, as long as the difference index between the first depth values of each symmetric region pair remains positively correlated with the weight of that region in the corresponding key frame image, with the relationship between them fitted by mathematical modeling.
The facial paralysis detection method is then optimized with the obtained weights of the symmetric regions in each key frame image, namely:

L_{q1} = ||(F_right − F_left)_{a×b} · diag(d_{q,1}, …, d_{q,b})||_2

wherein L_{q1} is the optimized facial paralysis evaluation index of the q-th key frame image and d_{q,x} is the weight of the x-th symmetric region in that key frame image; each column of the texture difference matrix is scaled by the weight of its region before taking the norm.
In this embodiment, when L_{q1} exceeds a preset threshold (2.5 in this embodiment, i.e. L_{q1} > 2.5), the key frame image is judged as showing facial paralysis; the threshold is an empirical value that the implementer can change according to the actual situation. When L_{q1} satisfies the above condition, a larger L_{q1} indicates more severe facial paralysis in that key frame image.
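A sketch of the optimized index and the per-key-frame decision, reusing the weight vector from the previous sketch; the column-wise weighting is an assumption consistent with the 4×1 weight matrix described above:

```python
import numpy as np

def weighted_paralysis_index(F_right, F_left, weights):
    """Optimized index L_q1: each region column of the texture difference
    matrix is scaled by its attention weight before the Frobenius norm."""
    diff = (F_right - F_left) * weights[np.newaxis, :]   # (3,4) * (1,4)
    return float(np.linalg.norm(diff))

def keyframe_is_paralyzed(L_q1, threshold=2.5):
    """Per-key-frame judgment with the empirical threshold of 2.5."""
    return L_q1 > threshold
```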
It should be noted that, to avoid errors caused by any single key frame image, the invention combines the detection results of the five key frame images (including the key frame image at rest) and determines that the person to be tested is a facial paralysis patient when three or more key frame images are judged as facial paralysis.
As another embodiment, the detection results of a different number of key frame images may be combined according to different requirements; for example, the person may be determined to have facial paralysis only when all five key frame images are judged as facial paralysis.
The beneficial effects of the facial paralysis detection method and system of this embodiment are as follows: the key frame images of each symmetric region are obtained according to the speed index of the symmetric region corresponding to each image and the depth index corresponding to each image, and the facial paralysis detection result of each key frame image is obtained from the texture information matrices of the face region images in that key frame image. Using both indexes as the basis for key frame selection ensures that each obtained key frame image is a front view with obvious motion characteristics, avoids the influence of deviations in the relative angle between the camera and the face during the designated actions on the facial paralysis detection result, and improves the accuracy of facial paralysis detection.
The facial paralysis detection system based on computer vision in this embodiment comprises a memory and a processor; the processor executes a computer program stored in the memory to implement the facial paralysis detection method based on computer vision described above, including acquiring the face region image of the person to be tested and the corresponding face region depth map.
It should be noted that the order of the above-mentioned embodiments of the present invention is merely for description and does not represent the merits of the embodiments, and in some cases, actions or steps recited in the claims may be executed in an order different from the order of the embodiments and still achieve desirable results.
Claims (8)
1. A facial paralysis detection method based on computer vision is characterized by comprising the following steps:
acquiring a face region image of a person to be detected and a corresponding face region depth map;
obtaining key points of a face area according to the face area image;
symmetrically dividing the face region image according to the key points of the face region to obtain image sequences of each symmetrical region under the designated action;
obtaining a speed index of a symmetrical area corresponding to each image in each image sequence according to each image sequence; obtaining a depth index corresponding to each image in each image sequence according to the face region depth map;
obtaining each key frame image corresponding to each symmetrical area according to the speed index of the symmetrical area corresponding to each image and the depth index corresponding to each image;
and obtaining the facial paralysis detection result of each key frame image according to the texture information matrix of the face region image in each key frame image.
2. The facial paralysis detection method based on computer vision as claimed in claim 1, wherein the method of obtaining the speed index comprises:

obtaining the speed index of the symmetric region corresponding to each image in each image sequence from the average instantaneous speed of the symmetric region corresponding to each image in each image sequence, as follows:

the speed index is calculated according to the following formula:

γ_{m,n} = (v̄^{left}_{m,n} + v̄^{right}_{m,n}) − |v̄^{left}_{m,n} − v̄^{right}_{m,n}|

wherein γ_{m,n} is the speed index corresponding to the n-th image in the m-th image sequence, v̄^{left}_{m,n} is the average instantaneous speed of the left symmetric region corresponding to the n-th image in the m-th image sequence, and v̄^{right}_{m,n} is the average instantaneous speed of the right symmetric region corresponding to the n-th image in the m-th image sequence.
3. The facial paralysis detection method based on computer vision as claimed in claim 2, wherein the method of calculating the average instantaneous speed comprises:
obtaining the instantaneous speed of each key point in the symmetrical area corresponding to each image in each image sequence according to each image sequence;
and obtaining the average instantaneous speed of the symmetrical area corresponding to each image in each image sequence according to the instantaneous speed of the key point.
4. The facial paralysis detection method based on computer vision as claimed in claim 3, wherein the average instantaneous speed of the symmetric region corresponding to each image in each image sequence is calculated according to the following formula:

v̄^{left}_{m,n} = (1 / j_left) · Σ_{i=1}^{j_left} exp(v^{left}_{m,n,i})

v̄^{right}_{m,n} = (1 / j_right) · Σ_{i=1}^{j_right} exp(v^{right}_{m,n,i})

wherein v̄^{left}_{m,n} is the average instantaneous speed of the left symmetric region corresponding to the n-th image in the m-th image sequence, j_left is the total number of key points in that left symmetric region, and v^{left}_{m,n,i} is the instantaneous speed of the i-th key point in that region; v̄^{right}_{m,n}, j_right and v^{right}_{m,n,i} are defined analogously for the right symmetric region.
5. The facial paralysis detection method based on computer vision as claimed in claim 1, wherein the depth index is obtained by averaging the depth values of the key points in the nose region of the face region depth map corresponding to the person to be tested in each image.
6. The facial paralysis detection method based on computer vision as claimed in claim 1, wherein each key frame image satisfies the following requirement: its depth index satisfies the formula |h_{m,n} − h_0| ≤ H, wherein h_{m,n} is the depth index of any image in any image sequence, h_0 is the standard depth value of the nose region when the person to be tested looks straight at the camera, and H is a preset threshold.
7. The facial paralysis detection method based on computer vision as claimed in claim 1, wherein the method of facial paralysis detection comprises:
calculating the facial paralysis judgment index of each key frame image according to the following formula:
L_q = ||(F_right − F_left)_{a×b}||_2

wherein L_q is the facial paralysis judgment index of the q-th key frame image, F_right is the texture information matrix of the right face region image in the q-th key frame image, F_left is the texture information matrix of the left face region image in the q-th key frame image, a is the number of rows of the matrices, and b is the number of columns.
8. A facial paralysis detection system based on computer vision, comprising a memory and a processor, wherein the processor executes a computer program stored in the memory to implement the facial paralysis detection method based on computer vision according to any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110915679.7A CN113688701B (en) | 2021-08-10 | 2021-08-10 | Facial paralysis detection method and system based on computer vision |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110915679.7A CN113688701B (en) | 2021-08-10 | 2021-08-10 | Facial paralysis detection method and system based on computer vision |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113688701A CN113688701A (en) | 2021-11-23 |
CN113688701B true CN113688701B (en) | 2022-04-22 |
Family
ID=78579562
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110915679.7A Active CN113688701B (en) | 2021-08-10 | 2021-08-10 | Facial paralysis detection method and system based on computer vision |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113688701B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118710644B (en) * | 2024-08-28 | 2024-11-15 | 宝鸡市妇幼保健院 | Acupuncture efficacy detection method and system for facial paralysis patients based on image processing |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108451725A (en) * | 2018-03-26 | 2018-08-28 | 江苏仁和医疗器械有限公司 | A kind of stretcher with fixed support |
CN109509211A (en) * | 2018-09-28 | 2019-03-22 | 北京大学 | Positioning simultaneously and the feature point extraction and matching process and system built in diagram technology |
CN110084259A (en) * | 2019-01-10 | 2019-08-02 | 谢飞 | A kind of facial paralysis hierarchical synthesis assessment system of combination face texture and Optical-flow Feature |
CN112308037A (en) * | 2020-11-25 | 2021-02-02 | 郑州苏一电子科技有限公司 | Facial paralysis detection method based on visual perception and audio information |
CN112580434A (en) * | 2020-11-25 | 2021-03-30 | 奥比中光科技集团股份有限公司 | Face false detection optimization method and system based on depth camera and face detection equipment |
CN112686853A (en) * | 2020-12-25 | 2021-04-20 | 刘铮 | Facial paralysis detection system based on artificial intelligence and muscle model |
CN113095268A (en) * | 2021-04-22 | 2021-07-09 | 中德(珠海)人工智能研究院有限公司 | Robot gait learning method, system and storage medium based on video stream |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7460730B2 (en) * | 2005-08-04 | 2008-12-02 | Microsoft Corporation | Video registration and image sequence stitching |
US20180005015A1 (en) * | 2016-07-01 | 2018-01-04 | Vangogh Imaging, Inc. | Sparse simultaneous localization and matching with unified tracking |
- 2021-08-10: CN application CN202110915679.7A filed (patent CN113688701B, status: active)
Non-Patent Citations (2)
Title |
---|
"Deep Keyframe Detection in Human Action Videos";Xiang Yan.et al;《arXiv:1804.10021v1》;20180426;全文 * |
"结合面部纹理和光流特征的面瘫分级综合评估方法";谢飞等;《西北大学学报(自然科学版)》;20190430;全文 * |
Also Published As
Publication number | Publication date |
---|---|
CN113688701A (en) | 2021-11-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110414439B (en) | Anti-occlusion pedestrian tracking method based on multi-peak detection | |
US7599549B2 (en) | Image processing method, image processing apparatus, and computer readable medium, in which an image processing program is recorded | |
CN106022378B (en) | Sitting posture judgment method and cervical spondylosis recognition system based on camera and pressure sensor | |
CN111178276B (en) | Image processing method, image processing apparatus, and computer-readable storage medium | |
CN109117755A (en) | A kind of human face in-vivo detection method, system and equipment | |
CN111291701B (en) | Sight tracking method based on image gradient and ellipse fitting algorithm | |
WO2018078857A1 (en) | Line-of-sight estimation device, line-of-sight estimation method, and program recording medium | |
CN116999017B (en) | Auxiliary eye care intelligent control system based on data analysis | |
CN117623031B (en) | Elevator sensorless control system and method | |
CN115690902A (en) | Abnormal posture early warning method for body building action | |
CN113688701B (en) | Facial paralysis detection method and system based on computer vision | |
CN115830319A (en) | Strabismus iris segmentation method based on attention mechanism and verification method | |
CN114677730A (en) | Living body detection method, living body detection device, electronic apparatus, and storage medium | |
CN107784292A (en) | Driver fatigue state identification method based on array lens | |
CN114220138B (en) | A face alignment method, training method, device and storage medium | |
CN112465773A (en) | Facial nerve paralysis disease detection method based on human face muscle movement characteristics | |
JP2022023643A5 (en) | ||
CN118982492B (en) | A jitter distortion correction image processing system for ophthalmic OCT | |
CN112686853A (en) | Facial paralysis detection system based on artificial intelligence and muscle model | |
CN111553250B (en) | Accurate facial paralysis degree evaluation method and device based on face characteristic points | |
CN112597842B (en) | Motion detection facial paralysis degree evaluation system based on artificial intelligence | |
CN117292421B (en) | GRU-based continuous vision estimation deep learning method | |
CN114155590A (en) | A face recognition method | |
CN112541897A (en) | Facial paralysis degree evaluation system based on artificial intelligence | |
CN110503061B (en) | Multi-feature-fused multi-factor video occlusion area detection method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||