CN111091028A - Method and device for recognizing shaking motion and storage medium - Google Patents
Method and device for recognizing shaking motion and storage medium
- Publication number
- CN111091028A (publication) · CN201811238995.XA (application)
- Authority
- CN
- China
- Prior art keywords
- frame image
- distance
- difference degree
- preset
- face
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Multimedia (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Human Computer Interaction (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Social Psychology (AREA)
- Psychiatry (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Image Analysis (AREA)
Abstract
The invention provides a method and a device for recognizing a head shaking motion, and a storage medium. The method comprises: acquiring facial feature points of a frame image, wherein the facial feature points comprise at least one contour point of a first half face, at least one contour point of a second half face, and a preset feature point corresponding to a preset position of the face; determining, according to the facial feature points of the frame image, a first distance between each contour point of the at least one contour point of the first half face and the preset feature point, and a second distance between each contour point of the at least one contour point of the second half face and the preset feature point; and judging whether a head shaking motion exists according to the first distance and the second distance of the frame image. The invention solves the problems of high hardware cost and complex algorithms in the prior art.
Description
Technical Field
The invention relates to the field of motion recognition, in particular to a method and a device for recognizing a shaking motion and a storage medium.
Background
Currently, there is an increasing need to detect head shaking motions in many fields, such as human-computer interaction and human face liveness detection.
In the prior art, the following method is usually adopted to detect the head shaking motion: first, three-dimensional information of the head during shaking is acquired with Microsoft's Kinect somatosensory peripheral, and this three-dimensional information may be called training information; then, a Hidden Markov Model is trained with the training information through classification-based machine learning; finally, the trained Hidden Markov Model detects the head shaking motion from the head three-dimensional information acquired by the Kinect.
However, the prior art has the problems of high hardware cost and complex algorithm.
Disclosure of Invention
Embodiments of the invention provide a method and a device for recognizing a head shaking motion, and a storage medium, to solve the problems of high hardware cost and complex algorithms in the prior art.
In a first aspect, an embodiment of the present invention provides a head shaking motion recognition method, including:
acquiring facial feature points of N frame images; the facial feature points comprise at least one contour point of a first half face, at least one contour point of a second half face and a preset feature point corresponding to a preset position of the face, and N is an integer greater than 1;
determining a first distance between each contour point in at least one contour point of the first half face of the frame image and the preset feature point and a second distance between each contour point in at least one contour point of the second half face and the preset feature point according to the facial feature point of the frame image;
and judging whether the shaking motion exists or not according to the first distance and the second distance of each frame of image.
Optionally, the determining whether there is a head shaking motion according to the first distance and the second distance of the frame image includes:
judging whether a preset identification condition is met or not according to the first distance and the second distance of the frame image, wherein the preset identification condition comprises an action condition;
if the preset identification condition is met, determining that the shaking motion exists;
and if the preset identification condition is not met, determining that no shaking motion exists.
Optionally, the number of frames of the frame image is 3.
The 3 frame images are a first frame image, a second frame image and a third frame image, respectively, wherein the second frame image follows the third frame image, and the third frame image follows the first frame image.
Optionally, the action condition includes:
the difference degree of the first distance and the second distance of the first frame image is smaller than or equal to the first difference degree;
the difference degree of the first distance and the second distance of the second frame image is smaller than or equal to the first difference degree;
the difference degree of the first distance and the second distance of the third frame image is greater than or equal to the second difference degree;
the second difference degree is greater than the first difference degree, and the difference between the second difference degree and the first difference degree is greater than or equal to a preset threshold value.
Optionally, the action condition includes:
the difference degree of the first distance and the second distance of the first frame image is greater than or equal to a third difference degree;
the difference degree of the first distance and the second distance of the second frame image is greater than or equal to the third difference degree;
the difference degree of the first distance and the second distance of the third frame image is less than or equal to a fourth difference degree;
wherein the third difference degree is greater than the fourth difference degree, and a difference between the third difference degree and the fourth difference degree is greater than a preset threshold.
Optionally, the preset identification condition further includes a time condition and/or an angle condition.
Optionally, the time condition includes:
the number of the frame images from the first frame image to the second frame image is greater than or equal to a first preset number; and/or the number of the frame images from the first frame image to the second frame image is less than or equal to a second preset number.
Optionally, the angle condition includes:
and the included angle between a straight line fitted by the preset characteristic points of all the frame images from the first frame image to the second frame image and a horizontal axis is smaller than a preset angle.
Optionally, the degree of difference asymmetry between the first distance and the second distance satisfies the following formula (1):
asymmetry=max(asymmetryL,asymmetryR) (1)
wherein, asymmetryL satisfies the following formula (2), and asymmetryR satisfies the following formula (3);
wherein n represents the number of contour points of the first half face and of the second half face in a frame image; Li represents the distance between the i-th contour point of the first half face and the preset feature point; Ri represents the distance between the i-th contour point of the second half face and the preset feature point; and Li and Ri are symmetrically distributed relative to the preset feature point.
Optionally, the preset position of the face is a nose tip position, and the preset feature point is a nose tip point.
In a second aspect, an embodiment of the present invention provides a head shaking motion recognition apparatus, including:
the acquisition module is used for acquiring the facial feature points of the frame image; the face feature points comprise at least one contour point of a first half face, at least one contour point of a second half face and preset feature points corresponding to preset positions of the face;
a determining module, configured to determine, according to feature points of a face of the frame image, a first distance between each contour point in at least one contour point of the first half face of the frame image and the preset feature point, and a second distance between each contour point in the at least one contour point of the second half face and the preset feature point;
and the judging module is used for judging whether the shaking motion exists according to the first distance and the second distance of the frame image.
Optionally, the determining module is specifically configured to:
judging whether a preset identification condition is met or not according to the first distance and the second distance of the frame image, wherein the preset identification condition comprises an action condition;
if the preset identification condition is met, determining that the shaking motion exists;
and if the preset identification condition is not met, determining that no shaking motion exists.
Optionally, the number of frames of the frame image is 3.
The 3 frame images are a first frame image, a second frame image and a third frame image, respectively, wherein the second frame image follows the third frame image, and the third frame image follows the first frame image.
Optionally, the action condition includes:
the difference degree of the first distance and the second distance of the first frame image is smaller than or equal to the first difference degree;
the difference degree of the first distance and the second distance of the second frame image is smaller than or equal to the first difference degree;
the difference degree of the first distance and the second distance of the third frame image is greater than or equal to the second difference degree;
the second difference degree is greater than the first difference degree, and the difference between the second difference degree and the first difference degree is greater than or equal to a preset threshold value.
Optionally, the action condition includes:
the difference degree of the first distance and the second distance of the first frame image is greater than or equal to a third difference degree;
the difference degree of the first distance and the second distance of the second frame image is greater than or equal to the third difference degree;
the difference degree of the first distance and the second distance of the third frame image is less than or equal to a fourth difference degree;
wherein the third difference degree is greater than the fourth difference degree, and a difference between the third difference degree and the fourth difference degree is greater than a preset threshold.
Optionally, the preset identification condition further includes a time condition and/or an angle condition.
Optionally, the time condition includes:
the number of the frame images from the first frame image to the second frame image is greater than or equal to a first preset number; and/or the number of the frame images from the first frame image to the second frame image is less than or equal to a second preset number.
Optionally, the angle condition includes:
and the included angle between a straight line fitted by the preset characteristic points of all the frame images from the first frame image to the second frame image and a horizontal axis is smaller than a preset angle.
Optionally, the degree of difference asymmetry between the first distance and the second distance satisfies the following formula (1):
asymmetry=max(asymmetryL,asymmetryR) (1)
wherein, asymmetryL satisfies the following formula (2), and asymmetryR satisfies the following formula (3);
wherein n represents the number of contour points of the first half face and of the second half face in a frame image; Li represents the distance between the i-th contour point of the first half face and the preset feature point; Ri represents the distance between the i-th contour point of the second half face and the preset feature point; and Li and Ri are symmetrically distributed relative to the preset feature point.
Optionally, the preset position of the face is a nose tip position, and the preset feature point is a nose tip point.
In a third aspect, an embodiment of the present invention provides a head shaking motion recognition apparatus, including:
a processor;
a memory; and
a computer program;
wherein the computer program is stored in the memory and configured to be executed by the processor, the computer program comprising instructions for performing the method of any of the first aspects as described above.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium, where a computer program is stored, and when executed by a processor, the computer program implements the method according to any one of the above first aspects.
According to the method, the device and the storage medium for recognizing the head shaking motion provided by the embodiments of the invention, facial feature points of a frame image are acquired, wherein the facial feature points comprise at least one contour point of a first half face, at least one contour point of a second half face and a preset feature point corresponding to a preset position of the face; a first distance between each contour point of the at least one contour point of the first half face and the preset feature point, and a second distance between each contour point of the at least one contour point of the second half face and the preset feature point, are determined according to the facial feature points of the frame image; and whether a head shaking motion exists is judged according to the first distance and the second distance of the frame image. Recognition of the head shaking motion from the distances between the contour points of the two half faces and the preset feature point is thereby realized. Compared with detecting the head shaking motion from head three-dimensional information acquired by a Kinect using a trained Hidden Markov Model, the use of the Kinect hardware is avoided, which reduces the hardware cost, and the use of a Hidden Markov Model is avoided, which simplifies the algorithm and saves computation time.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a flowchart of a first embodiment of a head shaking motion recognition method provided by the present invention;
FIG. 2A is a first schematic diagram of facial feature points according to the present invention;
FIG. 2B is a second schematic diagram of facial feature points according to the present invention;
FIG. 3 is a flowchart of a second embodiment of a head shaking motion recognition method provided by the present invention;
fig. 4 is a schematic structural diagram of a first embodiment of the head shaking motion recognition apparatus provided by the present invention;
fig. 5 is a schematic structural diagram of a second embodiment of the head shaking motion recognition apparatus according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The method, the device and the storage medium for recognizing a head shaking motion provided by the invention can be applied to any scenario in which a head shaking motion needs to be recognized. For example, in the field of human-computer interaction, a machine needs to recognize a human head shaking motion. As another example, in the field of robots, a robot needs to recognize a human head shaking motion, or the head shaking motion of another robot.
The head shaking motion that can be recognized by the present invention includes, but is not limited to, human head shaking motion, robot head shaking motion, and the like.
Fig. 1 is a flowchart of a head shaking motion recognition method according to a first embodiment of the present invention. As shown in fig. 1, the method of this embodiment may include:
101, acquiring facial feature points of a frame image.
In this step, the frame image may specifically be a frame image containing facial feature points. The first half face may be either half of the face, such as the left half face or the right half face. The second half face is opposite the first half face: when the first half face is the left half face, the second half face may be the right half face, and when the first half face is the right half face, the second half face may be the left half face. The at least one contour point of the first half face may specifically be all or some of the contour points of the first half face, and the at least one contour point of the second half face may specifically be all or some of the contour points of the second half face. The partial contour points may be contour points of a specific area, for example, contour points of an area other than the forehead area.
The preset face position may be a position that can be continuously recognized during a head shaking motion, for example, the nasion position, the nose tip position, one nose wing position, the other nose wing position, one mouth corner position, the other mouth corner position, one inner eye corner position, or the other inner eye corner position. Optionally, the preset face position may be the position with high recognition accuracy among these continuously recognizable positions, such as the nose tip position; in that case, the preset feature point corresponding to the preset face position may be the nose tip point. The distances between the preset feature point and the contour points of the first half face and of the second half face change with the head shaking motion. For example, as the face turns from the front toward one side, the distances between the contour points of the first half face and the preset feature point may become larger and larger, while the distances between the contour points of the second half face and the preset feature point may become smaller and smaller.
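For illustration only, the sketch below shows one possible way to obtain such facial feature points with a generic 68-point landmark detector (here dlib); the patent does not prescribe any particular landmark extraction method, and the index ranges used to split the two half faces and to pick the nose tip are assumptions, not part of the disclosure.

```python
import dlib

# Hypothetical model path; the 68-point predictor file must be obtained separately.
DETECTOR = dlib.get_frontal_face_detector()
PREDICTOR = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def facial_feature_points(gray_image):
    """Return contour points of the two half faces and the nose-tip point, or None."""
    faces = DETECTOR(gray_image, 1)
    if len(faces) == 0:
        return None
    shape = PREDICTOR(gray_image, faces[0])
    pts = [(shape.part(i).x, shape.part(i).y) for i in range(68)]
    first_half = pts[0:8]    # jaw contour points on one side (assumed split)
    second_half = pts[9:17]  # jaw contour points on the other side (assumed split)
    nose_tip = pts[30]       # nose-tip index in the common 68-point convention
    return first_half, second_half, nose_tip
```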
102, determining, according to the facial feature points of the frame image, a first distance between each contour point of the at least one contour point of the first half face and the preset feature point, and a second distance between each contour point of the at least one contour point of the second half face and the preset feature point.
In this step, taking the first half face as the left half face A, the second half face as the right half face B, and the preset feature point as the nose tip point as an example, and referring to FIG. 2A and FIG. 2B, the first distances between the at least one contour point of the first half face of the frame image and the preset feature point are L1-L7, respectively, and the second distances between the at least one contour point of the second half face and the preset feature point are R1-R7, respectively. Note that the facial feature points included in the frame image are facial feature points of a frontal face in FIG. 2A, and facial feature points of a side face in FIG. 2B.
And 103, judging whether a shaking motion exists or not according to the first distance and the second distance of the frame image.
In this step, as can be seen from FIG. 2A and FIG. 2B: when the face is a frontal face, the difference between the first distance from each contour point of the first half face to the preset feature point and the second distance from each contour point of the second half face to the preset feature point is small; when the face is turned to the side, this difference is large. Therefore, whether a head shaking motion exists can be judged according to the first distance and the second distance of the frame image. For example, when the difference degree between the first distance and the second distance of the frame image is greater than or equal to a preset difference degree, it may be determined that there is a head shaking motion; when the difference degree is less than the preset difference degree, it may be determined that there is no head shaking motion.
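A minimal sketch of this per-frame judgment is given below, assuming the difference degree is taken as the relative gap between the mean first distance and the mean second distance (one of the options elaborated in the second embodiment); the preset difference degree of 0.3 is an assumed placeholder, not a value from the patent.

```python
def difference_degree(first_distances, second_distances):
    """Difference degree of one frame, here the relative gap between the mean of the
    first distances (first half face) and the mean of the second distances
    (second half face)."""
    mean_l = sum(first_distances) / len(first_distances)
    mean_r = sum(second_distances) / len(second_distances)
    return abs(mean_l - mean_r) / max(mean_l, mean_r)

def has_side_rotation(first_distances, second_distances, preset_difference=0.3):
    """Single-frame judgment: a difference degree at or above the preset difference
    degree suggests the face is turned to the side."""
    return difference_degree(first_distances, second_distances) >= preset_difference
```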
In this embodiment, facial feature points of a frame image are acquired, where the facial feature points include at least one contour point of a first half face, at least one contour point of a second half face and a preset feature point corresponding to a preset position of the face; a first distance between each contour point of the first half face and the preset feature point, and a second distance between each contour point of the second half face and the preset feature point, are determined according to the facial feature points of the frame image; and whether a head shaking motion exists is judged according to the first distance and the second distance of the frame image. Recognition of the head shaking motion based on the distances between the contour points of the two half faces and the preset feature point is thereby realized. Compared with detecting the head shaking motion from head three-dimensional information acquired by a Kinect using a trained Hidden Markov Model, the use of the Kinect hardware is avoided, which reduces the hardware cost, and the use of a Hidden Markov Model is avoided, which simplifies the algorithm and saves computation time.
Fig. 3 is a flowchart of a second embodiment of the head shaking motion recognition method provided by the present invention. On the basis of the embodiment shown in fig. 1, this embodiment mainly describes an optional implementation of judging whether there is a head shaking motion according to the first distance and the second distance of the frame image. As shown in fig. 3, the method of this embodiment may include:
301, judging whether a preset identification condition is met according to the first distance and the second distance of the frame image.
In this step, the preset identification condition may include an action condition. The action condition may specifically be an action condition that needs to be satisfied when it is determined that there is a head shaking motion. For example, the action condition may be that the difference degree between the first distance and the second distance of the frame image is greater than or equal to a preset difference degree.
Optionally, in order to improve the accuracy of head shaking motion recognition, the number of frame images may be multiple, that is, the number of frame images may be N, where N may be an integer greater than 1. The first distance and the second distance of the frame images may specifically be: for each of the N frame images, a first distance between each contour point of the at least one contour point of the first half face and the preset feature point, and a second distance between each contour point of the at least one contour point of the second half face and the preset feature point. Correspondingly, the action condition may specifically be a condition that the first distance and the second distance of each of the N frame images need to satisfy when it is determined that there is a head shaking motion.
It should be noted that, when shaking the head, the initial rotation angle of the face and the characteristics of the shaking may fall into the following two cases. In case 1, the initial rotation angle of the face is small, and the head shaking motion is performed by first rotating the face in the direction in which the rotation angle increases and then in the direction in which the rotation angle decreases. In case 2, the initial rotation angle of the face is large, and the head shaking motion is performed by first rotating the face in the direction in which the rotation angle decreases and then in the direction in which the rotation angle increases. Here, the rotation angle may be considered smallest, for example 0°, when the face is a frontal face; the rotation angle increases when the face turns from the front toward the side, and decreases when the face turns from the side toward the front. For example, when the face turns from the front to the side and then from the side back to the front, it can be considered case 1. As another example, when the face turns from a first side orientation to a second side orientation and then back to the first side orientation, and the rotation angle of the first side orientation is smaller than that of the second side orientation, it can also be considered case 1. Conversely, when the face turns from the side to the front and then from the front back to the side, it can be considered case 2. As another example, when the face turns from a first side orientation to a second side orientation and then back to the first side orientation, and the rotation angle of the first side orientation is larger than that of the second side orientation, it can also be considered case 2.
Further optionally, N is an integer greater than 2, and the N frame images may include a first frame image, a second frame image, and a third frame image, where the second frame image is subsequent to the third frame image, and the third frame image is subsequent to the first frame image.
For case 1, the action condition may include: the difference degree of the first distance and the second distance of the first frame image is smaller than or equal to the first difference degree; the difference degree of the first distance and the second distance of the second frame image is smaller than or equal to the first difference degree; the difference degree of the first distance and the second distance of the third frame image is greater than or equal to the second difference degree. The second difference degree is greater than the first difference degree, and the difference between the second difference degree and the first difference degree is greater than or equal to a preset threshold value.
For case 2, the action condition may include: the difference degree of the first distance and the second distance of the first frame image is greater than or equal to a third difference degree; the difference degree of the first distance and the second distance of the second frame image is greater than or equal to the third difference degree; the difference degree of the first distance and the second distance of the third frame image is less than or equal to a fourth difference degree. Wherein the third difference degree is greater than the fourth difference degree, and a difference between the third difference degree and the fourth difference degree is greater than a preset threshold.
It should be noted that the preset threshold may be used to specify the amplitude of the face rotation during the head shaking motion, and its specific value may be flexibly designed as required.
Optionally, N is equal to 3, and the 3 frame images are the first frame image, the second frame image, and the third frame image, respectively. Here, the 3 frame images may be non-consecutive frame images.
In order to ensure real-time performance, when N is equal to 3, for case 1, according to the frame image sequence, the frame image in which the difference degree between the first distance and the second distance is detected for the first time to be less than or equal to the first difference degree is taken as the first frame image, and the frame image in which the difference degree is detected again to be less than or equal to the first difference degree is taken as the second frame image; it is then further determined whether a frame image whose difference degree between the first distance and the second distance is greater than or equal to the second difference degree (i.e., the third frame image) exists between the first frame image and the second frame image. If such a frame image exists, the action condition may be considered to be satisfied; if it does not exist, the action condition may be considered not to be satisfied.
When N is equal to 3, for case 2, the frame image in which the difference degree between the first distance and the second distance is detected for the first time to be greater than or equal to the third difference degree may be taken as the first frame image, and the frame image in which the difference degree is detected again to be greater than or equal to the third difference degree may be taken as the second frame image; it is then further determined whether a frame image whose difference degree between the first distance and the second distance is less than or equal to the fourth difference degree (i.e., the third frame image) exists between the first frame image and the second frame image. If such a frame image exists, the action condition may be considered to be satisfied; if it does not exist, the action condition may be considered not to be satisfied.
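The following sketch shows one way to scan a sequence of per-frame difference degrees for the two action conditions above; the function names and the scanning strategy are an illustrative interpretation of the text under the stated thresholds, not a verbatim reproduction of the claimed method.

```python
def action_condition_case1(diffs, d1, d2):
    """Case 1 (front -> side -> front): two frames with difference degree <= d1
    (first and second frame images) with at least one frame of difference
    degree >= d2 (the third frame image) strictly between them.
    diffs: per-frame difference degrees in time order; d2 > d1."""
    first = None
    for i, d in enumerate(diffs):
        if d <= d1:
            if first is None:
                first = i  # first frame image
            elif any(diffs[j] >= d2 for j in range(first + 1, i)):
                return True  # a large-difference third frame lies in between
    return False

def action_condition_case2(diffs, d3, d4):
    """Case 2 (side -> front -> side): two frames with difference degree >= d3
    with at least one frame of difference degree <= d4 strictly between them;
    d3 > d4."""
    first = None
    for i, d in enumerate(diffs):
        if d >= d3:
            if first is None:
                first = i
            elif any(diffs[j] <= d4 for j in range(first + 1, i)):
                return True
    return False
```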
In order to improve the accuracy of head shaking motion recognition, a motion in which the face rotates in one direction and then in the opposite direction may not be regarded as a head shaking motion when it takes too little time, and/or may not be regarded as a head shaking motion when it takes too much time. Therefore, further optionally, the preset identification condition may further include a time condition. The time condition may specifically be a time condition that needs to be satisfied when it is determined that there is a head shaking motion.
Alternatively, the time condition may be limited by the number of frame images from the first frame image to the second frame image. Further optionally, the time condition may include: the number of the frame images from the first frame image to the second frame image is greater than or equal to a first preset number; and/or the number of the frame images from the first frame image to the second frame image is less than or equal to a second preset number. For example, when the number of frame images of the first frame image to the second frame image is greater than or equal to 10, it can be considered that the time condition is satisfied.
In order to improve the accuracy of head shaking motion recognition, a motion in which the face rotates in one direction and then in the opposite direction along a non-horizontal direction may not be regarded as a head shaking motion; for example, the motion may not be regarded as a head shaking motion when the head of a person or robot rotates obliquely upward or obliquely downward. Therefore, further optionally, the preset identification condition may further include an angle condition. The angle condition may specifically be an angle condition that needs to be satisfied when it is determined that there is a head shaking motion.
Optionally, the angle condition may be defined by the included angle between the horizontal axis and a straight line fitted to the preset feature points (corresponding to the preset face position) of all the frame images from the first frame image to the second frame image. Further optionally, the angle condition includes: the included angle between a straight line fitted to the preset feature points of all the frame images from the first frame image to the second frame image and the horizontal axis is smaller than a preset angle.
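A hedged sketch of the time and angle conditions follows; the preset numbers (10 and 60 frames) and the preset angle (20°) are assumed values, and the straight-line fit uses an ordinary least-squares fit as one possible realization of "a straight line fitted to the preset feature points".

```python
import math
import numpy as np

def time_condition_met(first_idx, second_idx, first_preset_number=10, second_preset_number=60):
    """Number of frames from the first frame image to the second frame image must lie
    between the two preset numbers (both values are illustrative)."""
    count = second_idx - first_idx + 1
    return first_preset_number <= count <= second_preset_number

def angle_condition_met(preset_points, preset_angle_deg=20.0):
    """Fit a straight line through the preset feature points (e.g. nose-tip points) of
    all frames from the first to the second frame image and require its inclination
    relative to the horizontal axis to be below the preset angle."""
    xs = np.asarray([p[0] for p in preset_points], dtype=float)
    ys = np.asarray([p[1] for p in preset_points], dtype=float)
    slope, _intercept = np.polyfit(xs, ys, 1)  # least-squares fit y = slope * x + b
    angle = abs(math.degrees(math.atan(slope)))
    return angle < preset_angle_deg
```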
In this step, if the preset identification condition is satisfied, step 302 is executed. If the preset identification condition is not satisfied, step 303 is executed.
The present invention does not limit the specific manner of determining the difference degree between the first distance and the second distance of the frame image. Optionally, when the first half face and the second half face of a frame image each have one contour point, the difference degree between the first distance and the second distance of the frame image may specifically be the difference or ratio of the first distance to the second distance, or the like. When the first half face of a frame image has multiple contour points and the second half face has one contour point, the difference degree may specifically be the difference or ratio of the mean of the first distances to the second distance, or the like. When the first half face and the second half face of a frame image each have multiple contour points, the difference degree may specifically be the difference or ratio of the mean of the first distances to the mean of the second distances, or the like.
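The variants listed above can be written compactly as follows; which variant is used (difference or ratio, single distance or mean) is a design choice left open by the text, so the defaults below are only examples.

```python
def diff_single(first_distance, second_distance, use_ratio=False):
    """One contour point per half face: difference (or ratio) of the two distances."""
    if use_ratio:
        return first_distance / second_distance
    return abs(first_distance - second_distance)

def diff_of_means(first_distances, second_distances, use_ratio=False):
    """Multiple contour points per half face: difference (or ratio) of the mean values."""
    mean_l = sum(first_distances) / len(first_distances)
    mean_r = sum(second_distances) / len(second_distances)
    if use_ratio:
        return mean_l / mean_r
    return abs(mean_l - mean_r)
```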
When the number of the contour points of the first half face and the number of the contour points of the second half face in one frame image are both n, where n is a positive integer, further optionally, the difference degree asymmetry between the first distance and the second distance of the frame image may satisfy the following formula (1):
asymmetry=max(asymmetryL,asymmetryR) (1)
wherein, asymmetryL satisfies the following formula (2), and asymmetryR satisfies the following formula (3);
wherein n represents the number of contour points of the first half face and of the second half face in a frame image; Li represents the distance between the i-th contour point of the first half face and the preset feature point; Ri represents the distance between the i-th contour point of the second half face and the preset feature point; and Li and Ri may be symmetrically distributed relative to the preset feature point.
For example, with n equal to 7, Li and Ri are symmetrically distributed relative to the preset feature point, and L1-L7 and R1-R7 are as shown in FIG. 2A or FIG. 2B.
Li may satisfy the following formula (4):
Li = [(Lxi − x0)² + (Lyi − y0)²]^(1/2)    (4)
wherein Lxi represents the coordinate of the i-th contour point of the first half face on the X axis, Lyi represents the coordinate of the i-th contour point of the first half face on the Y axis, x0 represents the coordinate of the preset feature point on the X axis, and y0 represents the coordinate of the preset feature point on the Y axis.
Ri may satisfy the following formula (5):
Ri = [(Rxi − x0)² + (Ryi − y0)²]^(1/2)    (5)
wherein Rxi represents the coordinate of the i-th contour point of the second half face on the X axis, Ryi represents the coordinate of the i-th contour point of the second half face on the Y axis, x0 represents the coordinate of the preset feature point on the X axis, and y0 represents the coordinate of the preset feature point on the Y axis.
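A small sketch of formulas (4), (5) and (1) is given below. Since formulas (2) and (3) appear as figures that are not reproduced in this text, asymmetryL and asymmetryR are left as caller-supplied functions rather than being invented here.

```python
import math

def contour_distances(contour_points, preset_point):
    """Li (or Ri) for i = 1..n per formula (4)/(5): the Euclidean distance between each
    contour point (x, y) and the preset feature point (x0, y0)."""
    x0, y0 = preset_point
    return [math.sqrt((x - x0) ** 2 + (y - y0) ** 2) for (x, y) in contour_points]

def asymmetry(L, R, asymmetry_l, asymmetry_r):
    """Formula (1): the larger of asymmetryL and asymmetryR. asymmetry_l and asymmetry_r
    are callables implementing formulas (2) and (3), which are not reproduced in this
    text and must be supplied by the caller."""
    return max(asymmetry_l(L, R), asymmetry_r(L, R))
```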
In this embodiment, whether a preset identification condition is met is determined according to the first distance and the second distance of the frame image, where the preset identification condition includes an action condition; if the preset identification condition is met, it is determined that there is a head shaking motion, and if it is not met, it is determined that there is no head shaking motion. Judging whether there is a head shaking motion according to the first distance and the second distance of the frame image is thereby realized.
Fig. 4 is a schematic structural diagram of a head shaking motion recognition apparatus according to a first embodiment of the present invention. The apparatus can be implemented by software, hardware or a combination of software and hardware. As shown in fig. 4, the apparatus includes:
an obtaining module 401, configured to obtain facial feature points of a frame image; the face feature points comprise at least one contour point of a first half face, at least one contour point of a second half face and preset feature points corresponding to preset positions of the face;
a determining module 402, configured to determine, according to feature points of a face of the frame image, a first distance between each contour point in at least one contour point of the first half face of the frame image and the preset feature point, and a second distance between each contour point in the at least one contour point of the second half face and the preset feature point;
a determining module 403, configured to determine whether there is a shaking motion according to the first distance and the second distance of the frame image.
Optionally, the determining module 403 is specifically configured to:
judging whether a preset identification condition is met or not according to the first distance and the second distance of the frame image, wherein the preset identification condition comprises an action condition;
if the preset identification condition is met, determining that the shaking motion exists;
and if the preset identification condition is not met, determining that no shaking motion exists.
Optionally, the number of frames of the frame image is 3.
The 3 frame images are a first frame image, a second frame image and a third frame image, respectively, wherein the second frame image follows the third frame image, and the third frame image follows the first frame image.
Optionally, the action condition includes:
the difference degree of the first distance and the second distance of the first frame image is smaller than or equal to the first difference degree;
the difference degree of the first distance and the second distance of the second frame image is smaller than or equal to the first difference degree;
the difference degree of the first distance and the second distance of the third frame image is greater than or equal to the second difference degree;
the second difference degree is greater than the first difference degree, and the difference between the second difference degree and the first difference degree is greater than or equal to a preset threshold value.
Optionally, the action condition includes:
the difference degree of the first distance and the second distance of the first frame image is greater than or equal to a third difference degree;
the difference degree of the first distance and the second distance of the second frame image is greater than or equal to the third difference degree;
the difference degree of the first distance and the second distance of the third frame image is less than or equal to a fourth difference degree;
wherein the third difference degree is greater than the fourth difference degree, and a difference between the third difference degree and the fourth difference degree is greater than a preset threshold.
Optionally, the preset identification condition further includes a time condition and/or an angle condition.
Optionally, the time condition includes:
the number of the frame images from the first frame image to the second frame image is greater than or equal to a first preset number; and/or the number of the frame images from the first frame image to the second frame image is less than or equal to a second preset number.
Optionally, the angle condition includes:
and the included angle between a straight line fitted by the preset characteristic points of all the frame images from the first frame image to the second frame image and a horizontal axis is smaller than a preset angle.
Optionally, the degree of difference asymmetry between the first distance and the second distance satisfies the following formula (1):
asymmetry=max(asymmetryL,asymmetryR) (1)
wherein, asymmetryL satisfies the following formula (2), and asymmetryR satisfies the following formula (3);
wherein n represents the number of contour points of the first half face and of the second half face in a frame image; Li represents the distance between the i-th contour point of the first half face and the preset feature point; Ri represents the distance between the i-th contour point of the second half face and the preset feature point; and Li and Ri are symmetrically distributed relative to the preset feature point.
Optionally, the preset position of the face is a nose tip position, and the preset feature point is a nose tip point.
The head shaking motion recognition apparatus provided by this embodiment of the invention can execute the method embodiments shown in fig. 1 or fig. 3; the implementation principles and technical effects are similar and are not described herein again.
Fig. 5 is a schematic structural diagram of a second embodiment of the head shaking motion recognition apparatus according to the present invention. The head shaking motion recognition apparatus includes: a processor 501; a memory 502; and a computer program, wherein the computer program is stored in the memory and configured to be executed by the processor, and the computer program includes instructions for performing the method according to any one of the above embodiments.
The embodiment of the present invention further provides a computer-readable storage medium, where the computer-readable storage medium stores a computer program, and the computer program, when executed by a processor, implements the method for recognizing a shaking motion provided in any of the foregoing embodiments.
Those of ordinary skill in the art will understand that: all or a portion of the steps of implementing the above-described method embodiments may be performed by hardware associated with program instructions. The program may be stored in a computer-readable storage medium. When executed, the program performs steps comprising the method embodiments described above; and the aforementioned storage medium includes: various media that can store program codes, such as ROM, RAM, magnetic or optical disks.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.
Claims (22)
1. A head shaking motion recognition method is characterized by comprising the following steps:
acquiring facial feature points of a frame image, wherein the facial feature points comprise at least one contour point of a first half face, at least one contour point of a second half face and preset feature points corresponding to preset positions of the face;
determining a first distance between each contour point in at least one contour point of the first half face of the frame image and the preset feature point and a second distance between each contour point in at least one contour point of the second half face and the preset feature point according to the facial feature point of the frame image;
and judging whether a shaking motion exists or not according to the first distance and the second distance of the frame image.
2. The method according to claim 1, wherein said determining whether there is a panning motion according to the first distance and the second distance of the frame image comprises:
judging whether a preset identification condition is met or not according to the first distance and the second distance of the frame image, wherein the preset identification condition comprises an action condition;
if the preset identification condition is met, determining that the shaking motion exists;
and if the preset identification condition is not met, determining that no shaking motion exists.
3. The method according to claim 2, wherein the number of frames of the frame image is 3.
4. The method according to claim 3, wherein 3 frame images are a first frame image, a second frame image and a third frame image, respectively, the second frame image following the third frame image and the third frame image following the first frame image;
the action conditions include:
the difference degree of the first distance and the second distance of the first frame image is smaller than or equal to the first difference degree;
the difference degree of the first distance and the second distance of the second frame image is smaller than or equal to the first difference degree;
the difference degree of the first distance and the second distance of the third frame image is greater than or equal to the second difference degree;
the second difference degree is greater than the first difference degree, and the difference between the second difference degree and the first difference degree is greater than or equal to a preset threshold value.
5. The method according to claim 3, wherein 3 frame images are a first frame image, a second frame image and a third frame image, respectively, the second frame image following the third frame image and the third frame image following the first frame image;
the action conditions include:
the difference degree of the first distance and the second distance of the first frame image is greater than or equal to a third difference degree;
the difference degree of the first distance and the second distance of the second frame image is greater than or equal to the third difference degree;
the difference degree of the first distance and the second distance of the third frame image is less than or equal to a fourth difference degree;
wherein the third difference degree is greater than the fourth difference degree, and a difference between the third difference degree and the fourth difference degree is greater than a preset threshold.
6. The method according to claim 4 or 5, wherein the preset identification condition further comprises a time condition and/or an angle condition.
7. The method of claim 6, wherein the time condition comprises:
the number of the frame images from the first frame image to the second frame image is greater than or equal to a first preset number; and/or the number of the frame images from the first frame image to the second frame image is less than or equal to a second preset number.
8. The method of claim 6, wherein the angular condition comprises:
and the included angle between a straight line fitted by the preset characteristic points of all the frame images from the first frame image to the second frame image and a horizontal axis is smaller than a preset angle.
9. The method according to claim 4 or 5, wherein the degree of difference asymmetry between the first distance and the second distance satisfies the following formula (1):
asymmetry=max(asymmetryL,asymmetryR) (1)
wherein, asymmetryL satisfies the following formula (2), and asymmetryR satisfies the following formula (3);
wherein n represents the number of contour points of the first half face and of the second half face in a frame image; Li represents the distance between the i-th contour point of the first half face and the preset feature point; Ri represents the distance between the i-th contour point of the second half face and the preset feature point; and Li and Ri are symmetrically distributed relative to the preset feature point.
10. The method according to any one of claims 1 to 9, wherein the preset face position is a nose tip position, and the preset feature point is a nose tip point.
11. A head shaking motion recognition device, comprising:
the acquisition module is used for acquiring facial feature points of the frame image; the face feature points comprise at least one contour point of a first half face, at least one contour point of a second half face and preset feature points corresponding to preset positions of the face;
a determining module, configured to determine, according to feature points of a face of the frame image, a first distance between each contour point of the at least one contour point of the first half face of the frame image and the preset feature point, and a second distance between each contour point of the at least one contour point of the second half face and the preset feature point;
and the judging module is used for judging whether the shaking motion exists according to the first distance and the second distance of the frame image.
12. The apparatus according to claim 11, wherein the determining module is specifically configured to:
judging whether a preset identification condition is met or not according to the first distance and the second distance of the frame image, wherein the preset identification condition comprises an action condition;
if the preset identification condition is met, determining that the shaking motion exists;
and if the preset identification condition is not met, determining that no shaking motion exists.
13. The apparatus according to claim 12, wherein the number of frames of the frame image is 3.
14. The apparatus according to claim 13, wherein the 3 frame images are a first frame image, a second frame image and a third frame image, respectively, the second frame image following the third frame image and the third frame image following the first frame image;
the action conditions include:
the difference degree of the first distance and the second distance of the first frame image is smaller than or equal to the first difference degree;
the difference degree of the first distance and the second distance of the second frame image is smaller than or equal to the first difference degree;
the difference degree of the first distance and the second distance of the third frame image is greater than or equal to the second difference degree;
the second difference degree is greater than the first difference degree, and the difference between the second difference degree and the first difference degree is greater than or equal to a preset threshold value.
15. The apparatus according to claim 13, wherein the 3 frame images are a first frame image, a second frame image and a third frame image, respectively, the second frame image following the third frame image and the third frame image following the first frame image;
the action conditions include:
the difference degree of the first distance and the second distance of the first frame image is greater than or equal to a third difference degree;
the difference degree of the first distance and the second distance of the second frame image is greater than or equal to the third difference degree;
the difference degree of the first distance and the second distance of the third frame image is less than or equal to a fourth difference degree;
wherein the third difference degree is greater than the fourth difference degree, and a difference between the third difference degree and the fourth difference degree is greater than a preset threshold.
16. The apparatus according to claim 14 or 15, wherein the preset identification condition further comprises a time condition and/or an angle condition.
17. The apparatus of claim 16, wherein the time condition comprises:
the number of the frame images from the first frame image to the second frame image is greater than or equal to a first preset number; and/or the number of the frame images from the first frame image to the second frame image is less than or equal to a second preset number.
18. The apparatus of claim 16, wherein the angular condition comprises:
and the included angle between a straight line fitted by the preset characteristic points of all the frame images from the first frame image to the second frame image and a horizontal axis is smaller than a preset angle.
19. The apparatus of claim 14 or 15, wherein the degree of difference asymmetry between the first distance and the second distance satisfies the following equation (1):
asymmetry=max(asymmetryL,asymmetryR) (1)
wherein, asymmetryL satisfies the following formula (2), and asymmetryR satisfies the following formula (3);
wherein n represents the number of contour points of the first half face and of the second half face in a frame image; Li represents the distance between the i-th contour point of the first half face and the preset feature point; Ri represents the distance between the i-th contour point of the second half face and the preset feature point; and Li and Ri are symmetrically distributed relative to the preset feature point.
20. The apparatus according to any one of claims 11-19, wherein the predetermined facial position is a nose tip position, and the predetermined feature point is a nose tip point.
21. A head shaking motion recognition device, comprising:
a processor;
a memory; and
a computer program;
wherein the computer program is stored in the memory and configured to be executed by the processor, the computer program comprising instructions for performing the method of any of claims 1-10.
22. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by a processor, implements the method of any one of claims 1-10.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811238995.XA CN111091028A (en) | 2018-10-23 | 2018-10-23 | Method and device for recognizing shaking motion and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811238995.XA CN111091028A (en) | 2018-10-23 | 2018-10-23 | Method and device for recognizing shaking motion and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111091028A true CN111091028A (en) | 2020-05-01 |
Family ID: 70391654
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811238995.XA Pending CN111091028A (en) | 2018-10-23 | 2018-10-23 | Method and device for recognizing shaking motion and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111091028A (en) |
2018
- 2018-10-23: CN application CN201811238995.XA filed; published as CN111091028A (status: Pending)
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102375970A (en) * | 2010-08-13 | 2012-03-14 | 北京中星微电子有限公司 | Identity authentication method based on face and authentication apparatus thereof |
JP2013065119A (en) * | 2011-09-15 | 2013-04-11 | Toshiba Corp | Face authentication device and face authentication method |
CN105975935A (en) * | 2016-05-04 | 2016-09-28 | 腾讯科技(深圳)有限公司 | Face image processing method and apparatus |
CN106295549A (en) * | 2016-08-05 | 2017-01-04 | 深圳市鹰眼在线电子科技有限公司 | Multi-orientation Face collecting method and device |
CN107273875A (en) * | 2017-07-18 | 2017-10-20 | 广东欧珀移动通信有限公司 | Face living body detection method and related product |
Non-Patent Citations (1)
Title |
---|
LIU Mingyan: "Higher Vocational Education Innovation and Social Service: A Collection of Higher Vocational Education Achievements, 2006-2016", Beijing University of Posts and Telecommunications Press, page 176
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20200501 |