CN109446892B - Human eye attention positioning method and system based on deep neural network - Google Patents

Info

Publication number
CN109446892B
Authority
CN
China
Prior art keywords
face
distance
detected
attention
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201811073698.4A
Other languages
Chinese (zh)
Other versions
CN109446892A (en)
Inventor
郑东 (Zheng Dong)
赵五岳 (Zhao Wuyue)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Yufan Intelligent Technology Co ltd
Original Assignee
Universal Ubiquitous Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Universal Ubiquitous Technology Co ltd filed Critical Universal Ubiquitous Technology Co ltd
Priority to CN201811073698.4A
Publication of CN109446892A
Application granted
Publication of CN109446892B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G06V40/171 Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships

Abstract

The invention provides a human eye attention positioning method based on a deep neural network, comprising the steps of key point positioning, face pose angle detection, center point coordinate calculation, attention vector calculation and attention point positioning. The method obtains key points through key point positioning and normalizes them into key point coordinates; derives the face pose angle value from the key point coordinates; calculates the coordinates of the center point between the two pupils of the face to be detected; obtains a spatial offset distance from a preset deep neural network; forms the attention vector from the center point coordinates and the spatial offset distance; and finally judges whether the intersection of the attention vector with the attention plane falls within the attention valid region.

Description

Human eye attention positioning method and system based on deep neural network
Technical Field
The invention relates to the field of human eye attention positioning, in particular to a human eye attention positioning method and system based on a deep neural network.
Background
Traditional human eye attention positioning methods rely on binocular (dual-camera) setups, 3D structured light, TOF and similar techniques, but they have the following defects: 1. accuracy is low and robustness is weak, so they cannot be applied to all scenes and devices; 2. an attention method tuned for one device in one scene cannot be reused universally across different devices and scenes; 3. the camera and the device screen must be rigidly bound together. In summary, traditional human eye attention positioning methods suffer from low accuracy and significant limitations.
Disclosure of Invention
To overcome the defects of the prior art, the first object of the present invention is to provide a human eye attention positioning method based on a deep neural network that addresses the low accuracy and the limitations of traditional human eye attention positioning methods.
The second object of the present invention is to provide a human eye attention positioning system based on a deep neural network that addresses the same problems.
The first object of the invention is achieved by the following technical solution:
A human eye attention positioning method based on a deep neural network, applied to face images acquired by a camera, comprising the following steps:
positioning key points, namely positioning the key points of the face image to be detected through a preset key point neural network to obtain 68 key points;
detecting a face pose angle, namely normalizing the key points to obtain corresponding key point coordinates, inputting the key point coordinates into a preset face pose angle detection neural network, and outputting a face pose angle value by the preset face pose angle detection neural network;
calculating a central point coordinate, namely establishing a mapping relation between the eye pupil distance and the face distance, calculating the distance of the face to be detected from the pupil distance of the eyes to be detected via the mapping relation, and then calculating the central point coordinate between the two pupils in the face image to be detected from the face distance, the key point coordinates and the known image parameters of the face image to be detected;
calculating a spatial offset distance, namely inputting a left eye region image, a right eye region image, a face region image, a preset face proportion image and the face posture angle value in the face image to be detected into a preset depth neural network, and outputting the spatial offset distance by the preset depth neural network;
calculating an attention vector, namely calculating the attention vector according to the space offset distance and the central point coordinate;
marking an attention plane and an attention effective area, marking the attention plane of the attention device corresponding to the axis of the camera, and marking the attention effective area on the attention plane according to the size of the attention device;
and positioning an attention point, calculating an intersection point of the attention vector and the attention plane, and judging whether the intersection point is on the attention effective area, wherein if yes, the attention of the human eyes is on the attention device, and if not, the attention of the human eyes is not on the attention device.
Further, the key point localization comprises:
acquiring an image, namely acquiring an image to be detected containing a face to be detected;
detecting a face, namely detecting a face image to be detected containing a face characteristic region in the image to be detected;
and (4) positioning key points, namely positioning the key points of the face image to be detected through a preset key point neural network to obtain 68 key points.
Further, the center point coordinate calculation includes:
establishing a mapping relation, namely acquiring frontal face images of an original human face through a camera when it is located at a preset first axial distance and at a preset second axial distance, obtaining a first average pixel value and a second average pixel value corresponding to the eye pupil distance in the frontal face images, and calculating the original mapping relation between the eye pupil distance and the face distance according to the preset first axial distance, the preset second axial distance, the first average pixel value and the second average pixel value, wherein the face distance is the distance from the face to the camera;
generating the inter-pupil distance of the human eyes to be detected, and carrying out image processing on the human face image to be detected to obtain the inter-pupil distance of the human eyes to be detected;
calculating the face distance, and calculating the face distance to be measured according to the original mapping relation and the interpupillary distance of the eyes to be measured;
and calculating coordinates, namely calculating to obtain the coordinate of a central point between two pupils in the face image to be detected according to the distance of the face to be detected, the coordinate of the key point and the known image parameters of the face image to be detected, wherein the coordinate of the central point is the coordinate of the central point between the two pupils in the face image to be detected.
Further, in the frontal face image, the horizontal rotation angle and the pitch angle of the original face relative to the camera axis are in the range of 0-5 degrees.
Further, the preset first axial distance is different from the preset second axial distance.
Further, the preset human face posture angle detection neural network comprises an input layer, a first full connection layer, a second full connection layer and an output layer.
Further, the detection of the face pose angle specifically comprises: normalizing the key points to obtain corresponding key point coordinates; the key point coordinates enter through the input layer, are processed sequentially by the first fully connected layer and the second fully connected layer, and the output layer finally outputs the face pose angle value.
The second object of the invention is achieved by the following technical solution:
A human eye attention positioning system based on a deep neural network, characterized by comprising:
the key point positioning module is used for carrying out key point positioning on the face image to be detected through a preset key point neural network to obtain 68 key points;
the human face posture angle detection module is used for carrying out normalization processing on the key points to obtain corresponding key point coordinates, inputting the key point coordinates into a preset human face posture angle detection neural network, and outputting a human face posture angle value by the preset human face posture angle detection neural network;
the central point coordinate calculation module is used for establishing a mapping relation between the eye pupil distance and the face distance, calculating the distance of the face to be detected according to the measured pupil distance and the mapping relation, and calculating the central point coordinate between the two pupils in the face image to be detected according to the face distance, the key point coordinates and the known image parameters of the face image to be detected;
the spatial offset distance calculation module is used for inputting a left eye region image, a right eye region image, a face region image, a preset face proportion image and the face posture angle value in the face image to be detected into a preset deep neural network, and the preset deep neural network outputs a spatial offset distance;
the attention vector calculation module is used for calculating an attention vector according to the space offset distance and the center point coordinate;
the marking module is used for marking an attention plane of the attention device corresponding to the axis of the camera and marking an attention effective area on the attention plane according to the size of the attention device;
an attention point locating module, configured to calculate an intersection point of the attention vector and the attention plane, and determine whether the intersection point is on the attention valid region.
Furthermore, the key point positioning module comprises a camera, a face detection unit and a key point positioning unit, wherein the camera is used for acquiring an image to be detected containing a face to be detected; the face detection unit is used for detecting a face image to be detected containing a face characteristic region in the image to be detected; the key point positioning unit is used for carrying out key point positioning on the face image to be detected through a preset key point neural network to obtain 68 key points.
Furthermore, the central point coordinate calculation module comprises a mapping relation establishing unit, a to-be-detected human eye pupil distance generating unit, a human face distance calculating unit and a coordinate calculation unit;
the mapping relationship establishing unit is used for acquiring a frontal face image of an original human face when the frontal face image is located at a preset first axial distance and a preset second axial distance through a camera, obtaining a first average pixel value and a second average pixel value corresponding to a human eye pupil distance in the frontal face image, and calculating an original mapping relationship between the human eye pupil distance and the human face distance according to the preset first axial distance, the preset second axial distance, the first average pixel value and the second average pixel value, wherein the human face distance is the distance from the human face to the camera;
the unit for generating the interpupillary distance of the human eyes to be detected is used for carrying out image processing on the face image to be detected to obtain the interpupillary distance of the human eyes to be detected;
the face distance calculating unit is used for calculating the distance of the face to be measured according to the original mapping relation and the interpupillary distance of the eyes to be measured;
the coordinate calculation unit is used for calculating the coordinate of the central point between two pupils in the face image to be detected according to the distance of the face to be detected, the coordinate of the key point and the known image parameters of the face image to be detected, wherein the coordinate of the central point is the coordinate of the central point between the two pupils in the face image to be detected.
Compared with the prior art, the invention has the following beneficial effects. The human eye attention positioning method based on the deep neural network obtains key points through key point positioning and normalizes them into key point coordinates; derives the face pose angle value from the key point coordinates; calculates the coordinates of the center point between the two pupils of the face to be detected; obtains the spatial offset distance from a preset deep neural network; forms the attention vector from the center point coordinates and the spatial offset distance; and finally judges whether the intersection of the attention vector with the attention plane falls within the attention valid region. If it does, the human eye attention is on the attention device; if not, it is not. The whole process yields highly accurate positioning results and can be used universally across different devices and scenes.
The foregoing is only an overview of the technical solution of the present invention. To make the technical solution clearer and practicable in accordance with the contents of the specification, preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a schematic flow chart of a human eye attention localization method based on a deep neural network according to the present invention;
fig. 2 is a block architecture diagram of the human eye attention localization system based on the deep neural network of the present invention.
Detailed Description
The present invention will be further described with reference to the accompanying drawings and the detailed description, and it should be noted that any combination of the embodiments or technical features described below can be used to form a new embodiment without conflict.
As shown in FIG. 1, the human eye attention localization method based on the deep neural network of the present invention comprises the following steps:
Positioning key points: key point positioning is performed on the face image to be detected through a preset key point neural network to obtain 68 key points. The step specifically comprises the following. Acquiring an image: a camera acquires an image to be detected that contains the face to be detected together with other background.
Detecting a face: the face characteristic region in the image to be detected is detected, yielding a face image to be detected that contains only the face to be detected.
Positioning key points: the key point neural network is trained in advance on a training set to obtain a usable preset key point neural network, which then processes the face image to be detected to produce 68 key points in total.
Detecting a face pose angle, namely normalizing the key points to obtain corresponding key point coordinates, inputting the key point coordinates into a preset face pose angle detection neural network, and outputting a face pose angle value by the network. In this embodiment, the face pose angle detection neural network is trained on the original training set, and the usable preset network is obtained through repeated training. Specifically: normalizing the key points means transforming the key points in the face image to be detected so as to obtain uniform two-dimensional coordinates. The key point coordinates enter through the input layer, are processed sequentially by the first fully connected layer and the second fully connected layer, and the output layer finally outputs the face pose angle value. The face pose angle value comprises a horizontal rotation angle value (the degree of left-right rotation of the face), a tilt angle value (the degree of inclination of the face) and a pitch angle value (the degree of looking up or down). In this embodiment, the dimension of the input layer is 1×136, the first fully connected layer is 136×68, the second fully connected layer is 68×3, and the output layer is 1×3; the preset face pose angle detection neural network is only 38 KB in size and needs only about 1 ms to detect the face pose angle.
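The layer sizes above fully determine the network's shape, so a minimal PyTorch sketch is possible; the activation function is an assumption, since the embodiment does not state one. With these dimensions the model has roughly 9.5k float32 parameters, consistent with the 38 KB size cited above.
```python
# Sketch of the pose-angle network: 136 normalized keypoint coordinates
# in, three angles (horizontal rotation, tilt, pitch) out.
import torch
import torch.nn as nn

class PoseAngleNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(136, 68)   # first fully connected layer
        self.fc2 = nn.Linear(68, 3)     # second fully connected layer

    def forward(self, x):               # x: (batch, 136) keypoint coords
        x = torch.relu(self.fc1(x))     # assumed nonlinearity
        return self.fc2(x)              # (batch, 3) pose angle values

net = PoseAngleNet()
angles = net(torch.rand(1, 136))        # one face's normalized keypoints
```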
Calculating the center point coordinates: a mapping relation between the eye pupil distance and the face distance is established, the distance of the face to be detected is calculated from the measured pupil distance and this mapping, and the coordinates of the center point between the two pupils in the face image to be detected are then calculated from the face distance, the key point coordinates and the known image parameters of the face image to be detected. The step specifically comprises the following.
Establishing the mapping relation: frontal face images of an original human face are captured by the camera at a preset first axial distance and at a preset second axial distance; a first average pixel value and a second average pixel value corresponding to the eye pupil distance in those images are obtained; and the original mapping relation between the eye pupil distance and the face distance is calculated from the preset first axial distance, the preset second axial distance, the first average pixel value and the second average pixel value, where the face distance is the distance from the face to the camera. In this embodiment, the specifics are as follows:
A spatial coordinate system is established with the camera axis as the origin, comprising an X axis, a Y axis and a Z axis. The preset first axial distance d1 is a distance (in the Z-axis direction) from the original face to the camera axis; the preset second axial distance d2 is likewise a distance from the original face to the camera axis, and d1 ≠ d2. Frontal face images of the original face are collected at d1 and d2, giving the width and height of each image and the first and second average pixel values L1 and L2 corresponding to the eye pupil distance, where L1 corresponds to d1 and L2 to d2. The original mapping relation between the eye pupil distance and the face distance is then calculated from d1, d2, L1 and L2, as shown in formula (1),
d=k*(L-L1)+d1 (1)
where d is the face distance and
k = (d2 - d1)/(L2 - L1)
d1 is the preset first axial distance, d2 is the preset second axial distance, and L is the eye pupil distance with 0 < L <= min(W, H), where W and H are the width and height of the frontal face image of the original face acquired by the camera, L1 is the first average pixel value and L2 is the second average pixel value. In formula (1) only d and L are variables, so the relation between d and L can be obtained; this relation is the original mapping between the eye pupil distance and the face distance. The frontal face image in this embodiment requires the horizontal rotation angle and pitch angle of the original face relative to the camera axis to be within 0-5 degrees; owing to practical operational error, a certain tolerance is accepted during actual detection.
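A minimal sketch of this two-point calibration, with purely illustrative numbers:
```python
# Formula (1): d = k*(L - L1) + d1 with k = (d2 - d1)/(L2 - L1), built
# from two frontal captures at known distances d1, d2 whose average
# pupil-distance pixel values are L1, L2.
def build_distance_mapping(d1, L1, d2, L2):
    """Return the mapping d(L) from pupil distance (pixels) to face distance."""
    k = (d2 - d1) / (L2 - L1)
    return lambda L: k * (L - L1) + d1

# Hypothetical calibration values, for illustration only:
face_distance = build_distance_mapping(d1=0.5, L1=120.0, d2=1.0, L2=60.0)
print(face_distance(80.0))  # estimated face-to-camera distance in metres
```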
Generating the pupil distance of the eyes to be detected: the face image to be detected is collected by the camera and processed to obtain the pupil distance of the eyes to be detected. Specifically, face detection, key point positioning and face pose angle calculation are applied to the face image to be detected, yielding an unprocessed eye pupil distance and a horizontal rotation angle to be detected, where the latter is the horizontal rotation angle between the face to be detected and the camera axis; the pupil distance of the eyes to be detected is then calculated from these two values. For example:
Let the unprocessed eye pupil distance be L_temp and the horizontal rotation angle to be detected be Y. The face image to be detected at this moment is rotated by the angle Y in the horizontal direction relative to the camera axis, so the unprocessed pupil distance is converted into the pupil distance in the face-on state. Substituting L_temp and Y into formula (2) gives the pupil distance of the eyes to be detected, where formula (2) is:
L1 = L_temp / cos(Y) (2)
where L1 is the pupil distance of the eyes to be detected, L_temp is the unprocessed eye pupil distance, and Y is the horizontal rotation angle to be detected; in formula (2), Y must be greater than -90 degrees and less than 90 degrees.
Calculating the face distance: the distance of the face to be detected, i.e., its distance to the camera axis along Z, is calculated by substituting the pupil distance obtained from formula (2) into the mapping relation of formula (1).
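A sketch chaining formulas (2) and (1) follows; it reuses face_distance from the calibration sketch above. The cosine form of formula (2) is reconstructed from the face-on conversion described in the text and the |Y| < 90 degree constraint, and the input numbers are illustrative only.
```python
import math

def corrected_pupil_distance(L_temp, Y_deg):
    # Formula (2): undo the horizontal-rotation foreshortening.
    assert -90.0 < Y_deg < 90.0, "formula (2) requires |Y| < 90 degrees"
    return L_temp / math.cos(math.radians(Y_deg))

L = corrected_pupil_distance(L_temp=70.0, Y_deg=20.0)  # pixels
d = face_distance(L)  # formula (1): face distance along the Z axis
```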
Calculating coordinates: the central point coordinate between the two pupils in the face image to be detected is calculated from the face distance, the key point coordinates and the known image parameters of the face image to be detected. Specifically, in this embodiment, let P1 be the center point between the left and right pupils in the face image to be detected, with coordinates (x1, y1, z1) relative to the camera axis; the face distance d obtained above is the distance of P1 along the Z axis, i.e., d = z1. The position (w1, h1) of P1 on the face image to be detected is calculated from the key point coordinates. Let P0 be the intersection of the Z axis with the plane, formed by the X and Y axes, in which the face image to be detected lies; then P0 has spatial coordinates (0, 0, z1) and image coordinates (W1/2, H1/2), where W1 and H1 are the width and height of the face image to be detected. x1 and y1 are calculated according to formulas (3) and (4),
x1 = k*(w1 - W1/2 - L1) + d1 (3)
y1 = k*(h1 - H1/2 - L1) + d1 (4)
where x1 is the X-axis coordinate and y1 the Y-axis coordinate of the center point in the spatial coordinate system with the camera axis as origin, and
k = (d2 - d1)/(L2 - L1)
d1 is the preset first axial distance, d2 is the preset second axial distance, and L is the eye pupil distance with 0 < L <= min(W1, H1), where W1 is the width and H1 the height of the face image to be detected, L1 is the first average pixel value and L2 is the second average pixel value. With x1 and y1 from the formulas above, the specific value of the center point coordinate (x1, y1, z1) is obtained.
Calculating a spatial offset distance: the left-eye region image, the right-eye region image, the face region image, a preset face proportion image and the face pose angle value of the face image to be detected are input into a preset deep neural network, which outputs the spatial offset distance, expressed here as a vector (Δx, Δy, Δz).
Calculating an attention vector: the attention vector V1 is calculated from the spatial offset distance and the center point coordinates (x1, y1, z1) obtained above as V1 = (Δx - x1, Δy - y1, Δz - z1), where Δx - x1, Δy - y1 and Δz - z1 are the components of the attention vector on the X, Y and Z axes respectively.
Marking the attention plane and the attention valid region: the attention plane of the attention device is marked relative to the camera axis, and the attention valid region is marked on the attention plane according to the size of the attention device. Specifically, the attention device in this embodiment is a screen or another device or object in the scene (such as a painting or an exhibit), and its spatial plane relative to the camera axis (the origin of the spatial three-dimensional coordinate system), i.e., the attention plane, is marked as follows: if the attention device is a regular plane, such as a screen or a planar device, three non-collinear points p1, p2, p3 are taken on the plane and the spatial coordinates of each point relative to the camera axis are calculated; if the attention device is an irregular plane, three non-collinear points p1, p2, p3 are approximated on the plane and their spatial coordinates are calculated likewise. The plane formed by the three points is the attention plane, and the attention valid region is marked within it according to the length and width (size) of the attention device.
Positioning the attention point: the intersection of the attention vector and the attention plane is calculated, and whether the intersection lies in the attention valid region is judged. The intersection of V1 with the attention plane is computed; if an intersection exists and lies in the attention valid region, the human eye attention point is on the attention device, and if there is no intersection or the intersection lies outside the valid region, the human eye attention is not on the attention device.
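A sketch of this test is given below. It assumes, beyond what the embodiment states, that the three marked points are corners of a rectangular valid region (p2 and p3 adjacent to p1); the ray-plane intersection itself is standard geometry.
```python
import numpy as np

def attention_on_device(C, V1, p1, p2, p3):
    """C: inter-pupil center (x1, y1, z1); V1: attention vector;
    p1, p2, p3: assumed corners of the rectangular attention valid region."""
    C, V1, p1, p2, p3 = map(np.asarray, (C, V1, p1, p2, p3))
    u, v = p2 - p1, p3 - p1           # edges spanning the valid region
    n = np.cross(u, v)                # normal of the attention plane
    denom = float(np.dot(n, V1))
    if abs(denom) < 1e-9:
        return False                  # gaze ray parallel to the plane
    t = float(np.dot(n, p1 - C)) / denom
    if t <= 0:
        return False                  # intersection behind the viewer
    hit = C + t * V1                  # intersection with attention plane
    a = float(np.dot(hit - p1, u)) / float(np.dot(u, u))
    b = float(np.dot(hit - p1, v)) / float(np.dot(v, v))
    return 0.0 <= a <= 1.0 and 0.0 <= b <= 1.0
```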
As shown in fig. 2, the present invention provides a deep neural network-based human eye attention localization system, comprising: the key point positioning module is used for carrying out key point positioning on the face image to be detected through a preset key point neural network to obtain 68 key points;
the face pose angle detection module is used for carrying out normalization processing on the key points to obtain corresponding key point coordinates, inputting the key point coordinates into a preset face pose angle detection neural network, and outputting a face pose angle value by the preset face pose angle detection neural network;
the central point coordinate calculation module is used for establishing a mapping relation between the eye pupil distance and the face distance, calculating the distance of the face to be detected according to the measured pupil distance and the mapping relation, and calculating the central point coordinate between the two pupils in the face image to be detected according to the face distance, the key point coordinates and the known image parameters of the face image to be detected;
the spatial offset distance calculation module is used for inputting a left eye region image, a right eye region image, a face region image, a preset face proportion image and a face posture angle value in the face image to be detected into a preset depth neural network, and the preset depth neural network outputs a spatial offset distance;
the attention vector calculation module is used for calculating an attention vector according to the space offset distance and the center point coordinate;
the marking module is used for marking an attention plane of the attention device corresponding to the axis of the camera and marking an attention effective area on the attention plane according to the size of the attention device;
and the attention point positioning module is used for calculating the intersection point of the attention vector and the attention plane and judging whether the intersection point is on the attention effective area.
In this embodiment, the key point positioning module includes a camera, a face detection unit, and a key point positioning unit, where the camera is used to obtain a to-be-detected image containing a to-be-detected face; the face detection unit is used for detecting a face image to be detected containing a face characteristic area in the image to be detected; the key point positioning unit is used for carrying out key point positioning on the face image to be detected through a preset key point neural network to obtain 68 key points. The central point coordinate calculation module comprises a mapping relation establishing unit, a unit for generating the interpupillary distance of the human eye to be detected, a unit for calculating the distance of the human face and a coordinate calculation unit; the mapping relation establishing unit is used for acquiring a frontal face image of an original human face when the first axis distance and the second axis distance are preset through the camera, obtaining a first average pixel value and a second average pixel value corresponding to the human eye pupil distance in the frontal face image, and calculating the original mapping relation between the human eye pupil distance and the human face distance according to the preset first axis distance, the preset second axis distance, the first average pixel value and the second average pixel value, wherein the human face distance is the distance from the human face to the camera; the generating unit for the interpupillary distance of the human eyes to be detected is used for carrying out image processing on the human face image to be detected to obtain the interpupillary distance of the human eyes to be detected; the human face distance calculating unit is used for calculating the distance of the human face to be detected according to the original mapping relation and the interpupillary distance of the human eyes to be detected; the coordinate calculation unit is used for calculating and obtaining a central point coordinate between two pupils in the face image to be detected according to the face distance to be detected, the key point coordinate and the known image parameter of the face image to be detected, wherein the central point coordinate is the coordinate of the central point between the two pupils in the face image to be detected.
The human eye attention positioning method based on the deep neural network of the invention obtains key points through key point positioning and normalizes them into key point coordinates; derives the face pose angle value from the key point coordinates; calculates the coordinates of the center point between the two pupils of the face to be detected; obtains the spatial offset distance from a preset deep neural network; forms the attention vector from the center point coordinates and the spatial offset distance; and finally judges whether the intersection of the attention vector with the attention plane falls within the attention valid region. If it does, the human eye attention is on the attention device; if not, it is not. The whole process yields highly accurate positioning results and can be used universally across different devices and scenes.
The foregoing is merely a preferred embodiment of the invention and is not intended to limit it in any manner. Those skilled in the art can readily practice the invention as shown and described in the drawings and detailed description herein, and can use the disclosed conception and specific embodiments as a basis for designing or modifying other structures for the same purposes without departing from the scope of the invention as defined by the appended claims; any equivalent changes, modifications and evolutions of the above embodiments made according to the essential technology of the present invention likewise remain within the protection scope of the technical solution of the present invention.

Claims (10)

1. A human eye attention positioning method based on a deep neural network, applied to face images acquired by a camera, characterized by comprising the following steps:
positioning key points, namely positioning the key points of the face image to be detected through a preset key point neural network to obtain 68 key points;
detecting a face pose angle, namely normalizing the key points to obtain corresponding key point coordinates, inputting the key point coordinates into a preset face pose angle detection neural network, and outputting a face pose angle value by the preset face pose angle detection neural network;
calculating a central point coordinate, namely establishing a mapping relation between the eye pupil distance and the face distance, calculating the distance of the face to be detected from the pupil distance of the eyes to be detected via the mapping relation, and then calculating the central point coordinate between the two pupils in the face image to be detected from the face distance, the key point coordinates and the known image parameters of the face image to be detected;
calculating a spatial offset distance, namely inputting a left eye region image, a right eye region image, a face region image, a preset face proportion image and the face posture angle value in the face image to be detected into a preset depth neural network, and outputting the spatial offset distance by the preset depth neural network;
calculating an attention vector, namely calculating the attention vector according to the space offset distance and the central point coordinate;
marking an attention plane and an attention effective area, namely marking the attention plane of the attention device corresponding to the axis of the camera, and marking the attention effective area on the attention plane according to the size of the attention device;
and positioning an attention point, calculating an intersection point of the attention vector and the attention plane, and judging whether the intersection point is on the attention effective area, wherein if yes, the attention of the human eyes is on the attention device, and if not, the attention of the human eyes is not on the attention device.
2. The deep neural network-based human eye attention localization method of claim 1, wherein: the key point positioning comprises:
acquiring an image, namely acquiring an image to be detected containing a face to be detected;
detecting a face, namely detecting a face image to be detected containing a face characteristic region in the image to be detected;
and (4) positioning key points, namely positioning the key points of the face image to be detected through a preset key point neural network to obtain 68 key points.
3. The deep neural network-based human eye attention localization method of claim 1, wherein: the center point coordinate calculation includes:
establishing a mapping relation, acquiring a front face image of an original human face when a first axis distance and a second axis distance are preset through a camera, obtaining a first average pixel value and a second average pixel value corresponding to a human eye pupil distance in the front face image, and calculating the original mapping relation between the human eye pupil distance and the human face distance according to the preset first axis distance, the preset second axis distance, the first average pixel value and the second average pixel value, wherein the human face distance is the distance from the human face to the camera;
generating the interpupillary distance of the human eye to be detected, and carrying out image processing on the human face image to be detected to obtain the interpupillary distance of the human eye to be detected;
calculating the face distance, and calculating the face distance to be measured according to the original mapping relation and the interpupillary distance of the eyes to be measured;
and calculating coordinates, namely calculating to obtain the coordinates of a central point between two pupils in the face image to be detected according to the distance of the face to be detected, the coordinates of the key point and the known image parameters of the face image to be detected, wherein the coordinates of the central point are the coordinates of the central point between the two pupils in the face image to be detected.
4. The deep neural network-based human eye attention localization method of claim 3, wherein: in the frontal face image, the horizontal rotation angle and the pitch angle of the original face relative to the camera axis are in the range of 0-5 degrees.
5. The deep neural network-based human eye attention localization method of claim 3, wherein: the preset first axial distance is different from the preset second axial distance.
6. The deep neural network-based human eye attention localization method of claim 1, wherein: the preset human face posture angle detection neural network comprises an input layer, a first full connection layer, a second full connection layer and an output layer.
7. The deep neural network-based human eye attention localization method of claim 6, wherein: the detection of the face pose angle specifically comprises: normalizing the key points to obtain corresponding key point coordinates; the key point coordinates enter through the input layer, are processed sequentially by the first fully connected layer and the second fully connected layer, and the output layer finally outputs the face pose angle value.
8. A human eye attention positioning system based on a deep neural network, characterized by comprising:
the key point positioning module is used for carrying out key point positioning on the face image to be detected through a preset key point neural network to obtain 68 key points;
the human face posture angle detection module is used for carrying out normalization processing on the key points to obtain corresponding key point coordinates, inputting the key point coordinates into a preset human face posture angle detection neural network, and outputting a human face posture angle value by the preset human face posture angle detection neural network;
the central point coordinate calculation module is used for establishing a mapping relation between the eye pupil distance and the face distance, calculating the distance of the face to be detected according to the measured pupil distance and the mapping relation, and calculating the central point coordinate between the two pupils in the face image to be detected according to the face distance, the key point coordinates and the known image parameters of the face image to be detected;
the spatial offset distance calculation module is used for inputting a left eye region image, a right eye region image, a face region image, a preset face proportion image and the face posture angle value in the face image to be detected into a preset deep neural network, and the preset deep neural network outputs a spatial offset distance;
the attention vector calculation module is used for calculating an attention vector according to the space offset distance and the center point coordinate;
the marking module is used for marking an attention plane of the attention device corresponding to the axis of the camera and marking an attention effective area on the attention plane according to the size of the attention device;
an attention point locating module, configured to calculate an intersection point of the attention vector and the attention plane, and determine whether the intersection point is on the attention valid region.
9. The deep neural network-based eye attention localization system of claim 8, wherein: the key point positioning module comprises a camera, a face detection unit and a key point positioning unit, wherein the camera is used for acquiring an image to be detected containing a face to be detected; the face detection unit is used for detecting a face image to be detected containing a face characteristic region in the image to be detected; the key point positioning unit is used for carrying out key point positioning on the face image to be detected through a preset key point neural network to obtain 68 key points.
10. The deep neural network-based eye attention localization system of claim 8, wherein: the central point coordinate calculation module comprises a mapping relation establishing unit, a to-be-detected human eye pupil distance generating unit, a human face distance calculating unit and a coordinate calculation unit;
the mapping relationship establishing unit is used for acquiring a frontal face image of an original human face when the frontal face image is located at a preset first axial distance and a preset second axial distance through a camera, obtaining a first average pixel value and a second average pixel value corresponding to a human eye pupil distance in the frontal face image, and calculating an original mapping relationship between the human eye pupil distance and the human face distance according to the preset first axial distance, the preset second axial distance, the first average pixel value and the second average pixel value, wherein the human face distance is the distance from the human face to the camera;
the unit for generating the interpupillary distance of the human eyes to be detected is used for carrying out image processing on the face image to be detected to obtain the interpupillary distance of the human eyes to be detected;
the human face distance calculating unit is used for calculating the distance of the human face to be detected according to the original mapping relation and the interpupillary distance of the human eyes to be detected;
the coordinate calculation unit is used for calculating the coordinate of the central point between two pupils in the face image to be detected according to the distance of the face to be detected, the coordinate of the key point and the known image parameters of the face image to be detected, wherein the coordinate of the central point is the coordinate of the central point between the two pupils in the face image to be detected.
CN201811073698.4A 2018-09-14 2018-09-14 Human eye attention positioning method and system based on deep neural network Expired - Fee Related CN109446892B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811073698.4A CN109446892B (en) 2018-09-14 2018-09-14 Human eye attention positioning method and system based on deep neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811073698.4A CN109446892B (en) 2018-09-14 2018-09-14 Human eye attention positioning method and system based on deep neural network

Publications (2)

Publication Number Publication Date
CN109446892A CN109446892A (en) 2019-03-08
CN109446892B 2023-03-24

Family

ID=65532820

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811073698.4A Expired - Fee Related CN109446892B (en) 2018-09-14 2018-09-14 Human eye attention positioning method and system based on deep neural network

Country Status (1)

Country Link
CN (1) CN109446892B (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111723596B (en) * 2019-03-18 2024-03-22 北京市商汤科技开发有限公司 Gaze area detection and neural network training methods, devices and equipment
CN110400351A (en) * 2019-07-30 2019-11-01 晓智科技(成都)有限公司 A kind of X-ray front end of emission automatic adjusting method and system
CN110191234B (en) * 2019-06-21 2021-03-26 中山大学 Intelligent terminal unlocking method based on fixation point analysis
CN110543813B (en) * 2019-07-22 2022-03-15 深思考人工智能机器人科技(北京)有限公司 Face image and gaze counting method and system based on scene
CN110633664A (en) * 2019-09-05 2019-12-31 北京大蛋科技有限公司 Method and device for tracking attention of user based on face recognition technology
CN110781754A (en) * 2019-09-27 2020-02-11 精英数智科技股份有限公司 Method, device and system for intelligent monitoring of manual inspection and storage medium
CN112417949A (en) * 2020-09-28 2021-02-26 深圳市艾为智能有限公司 Network teaching attention monitoring system and method based on vision
CN112597823A (en) * 2020-12-07 2021-04-02 深延科技(北京)有限公司 Attention recognition method and device, electronic equipment and storage medium
CN112766215B (en) * 2021-01-29 2024-08-09 北京字跳网络技术有限公司 Face image processing method and device, electronic equipment and storage medium
CN113011286B (en) * 2021-03-02 2022-09-09 重庆邮电大学 Strabismus discrimination method and system based on video-based deep neural network regression model
CN113052064B (en) * 2021-03-23 2024-04-02 北京思图场景数据科技服务有限公司 Attention detection method based on face orientation, facial expression and pupil tracking
CN114897024A (en) * 2022-05-23 2022-08-12 重庆大学 Attention detection method based on deep learning

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8066373B2 (en) * 2009-02-03 2011-11-29 Pixeloptics, Inc. Multifocal measurement device
US20150339589A1 (en) * 2014-05-21 2015-11-26 Brain Corporation Apparatus and methods for training robots utilizing gaze-based saliency maps

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007058507A (en) * 2005-08-24 2007-03-08 Konica Minolta Holdings Inc Line of sight detecting device
CN101584573A (en) * 2008-05-20 2009-11-25 丛繁滋 Interpupillary distance, optical range measuring device
CN102830793A (en) * 2011-06-16 2012-12-19 北京三星通信技术研究有限公司 Sight tracking method and sight tracking device
CN103793719A (en) * 2014-01-26 2014-05-14 深圳大学 Monocular distance-measuring method and system based on human eye positioning
JP2016173313A (en) * 2015-03-17 2016-09-29 国立大学法人鳥取大学 Visual line direction estimation system, visual line direction estimation method and visual line direction estimation program
CN104766059A (en) * 2015-04-01 2015-07-08 上海交通大学 Rapid and accurate human eye positioning method and sight estimation method based on human eye positioning
CN106355147A (en) * 2016-08-26 2017-01-25 张艳 Acquiring method and detecting method of live face head pose detection regression apparatus
CN106598221A (en) * 2016-11-17 2017-04-26 电子科技大学 Eye key point detection-based 3D sight line direction estimation method
CN107392963A (en) * 2017-06-28 2017-11-24 北京航空航天大学 A kind of imitative hawkeye moving target localization method for soft autonomous air refuelling
CN107345814A (en) * 2017-07-11 2017-11-14 海安中科智能制造与信息感知应用研发中心 A kind of mobile robot visual alignment system and localization method
CN107818305A (en) * 2017-10-31 2018-03-20 广东欧珀移动通信有限公司 Image processing method, device, electronic device, and computer-readable storage medium
CN107818310A (en) * 2017-11-03 2018-03-20 电子科技大学 A kind of driver attention's detection method based on sight
CN108288280A (en) * 2017-12-28 2018-07-17 杭州宇泛智能科技有限公司 Dynamic human face recognition methods based on video flowing and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
N. Alsufyani et al., "Biometric presentation attack detection using gaze alignment," 2018 IEEE 4th International Conference on Identity, Security, and Behavior Analysis (ISBA), 2018-03-12, pp. 1-8 *

Also Published As

Publication number Publication date
CN109446892A (en) 2019-03-08

Similar Documents

Publication Publication Date Title
CN109446892B (en) Human eye attention positioning method and system based on deep neural network
CN111783820B (en) Image labeling method and device
CN110599540B (en) Real-time three-dimensional human body shape and posture reconstruction method and device under multi-viewpoint camera
Moghadam et al. Line-based extrinsic calibration of range and image sensors
US12217444B2 (en) Method for measuring the topography of an environment
CN100417231C (en) Stereo vision hardware-in-the-loop simulation system and method
CN105550670A (en) Target object dynamic tracking and measurement positioning method
CN109086727B (en) Method and device for determining motion angle of human head and electronic equipment
CN111127540B (en) Automatic distance measurement method and system for three-dimensional virtual space
CN105894511B (en) Demarcate target setting method, device and parking assistance system
CN102831601A (en) Three-dimensional matching method based on union similarity measure and self-adaptive support weighting
CN105551020A (en) Method and device for detecting dimensions of target object
CN110675436A (en) Laser radar and stereoscopic vision registration method based on 3D feature points
CN116681776B (en) External parameter calibration method and system for binocular camera
Mahdy et al. Projector calibration using passive stereo and triangulation
CN106170086A (en) The method of drawing three-dimensional image and device, system
CN113848931B (en) Agricultural machinery automatic driving obstacle recognition method, system, equipment and storage medium
CN107590444A (en) Detection method, device and the storage medium of static-obstacle thing
CN116193108B (en) Online self-calibration method, device, equipment and medium for camera
Real-Moreno et al. Camera calibration method through multivariate quadratic regression for depth estimation on a stereo vision system
CN110068308B (en) Distance measurement method and distance measurement system based on multi-view camera
CN111047636B (en) Obstacle avoidance system and obstacle avoidance method based on active infrared binocular vision
CN114463832B (en) Point cloud-based traffic scene line of sight tracking method and system
CN103260008B (en) A kind of image position is to the projection conversion method of physical location
CN114821497A (en) Method, device, device and storage medium for determining the position of a target

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP03 Change of name, title or address

Address after: Room 658, building 1, No.1, luting Road, Cangqian street, Yuhang District, Hangzhou City, Zhejiang Province 310000

Patentee after: Hangzhou Yufan Intelligent Technology Co.,Ltd.

Country or region after: China

Address before: Room 658, building 1, No.1, luting Road, Cangqian street, Yuhang District, Hangzhou City, Zhejiang Province 310000

Patentee before: UNIVERSAL UBIQUITOUS TECHNOLOGY Co.,Ltd.

Country or region before: China

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20230324