CN101930543B - Method for adjusting eye image in self-photographed video - Google Patents
- Publication number: CN101930543B
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Image Analysis (AREA)
Abstract
The invention discloses a method for correcting the eye image in a self-shot (selfie) video, comprising the following steps. Step 1, target eye detection and localization: detect and locate the position of the eyes in the video image. Step 2, recognition and localization of the sclera, iris, and pupil images within the eye image: distinguish the sclera image from the iris image by gray level; distinguish the iris image from the pupil image by texture; locate the relative positions of sclera image and iris image, and of iris image and pupil image. Step 3, re-projection of the iris and pupil images: translate the iris and pupil images to the center of the sclera image, thereby correcting the eye image. Without adding any extra hardware, the invention performs the correction purely in software, so that when the user's face is turned toward the display but the eyes are not looking at the camera lens, the display shows live video in which the eyes appear to gaze at the display, greatly reducing the cost of improving the hardware system.
Description
Technical Field
The invention relates to the field of video data processing and imaging, and in particular to a method for correcting the eye image in a self-shot video.
Background Art
In applications of digital video processing, and especially with the spread of 3G communication networks, video self-shooting and networked video are used more and more widely. A persistent annoyance is that the video capture device is usually mounted near the outer edge of the display device, as shown in Fig. 2. In this arrangement, when the person being filmed looks at the screen of the display device, the image formed on the display has the eyes' gaze point deviating from the screen; in short, the face image seen by the screen-side observer has averted, rather than direct, eyes. Whether human vision perceives an eye image as "direct" or "averted" depends mainly on the relative positions of the sclera, iris, and pupil: the gaze is generally judged "direct" when the iris and pupil sit at the center of the sclera, and "averted" otherwise. Conversely, when the person being filmed looks straight at the video capture device, the displayed image does have the gaze directed at the screen, but the person being filmed cannot see that result while doing so, as shown in Figs. 1a and 1b.
Summary of the Invention
Object of the invention: the technical problem addressed by the invention, in view of the deficiencies of the prior art, is to provide a method for correcting the eye image in a self-shot video, so that while the subject is looking at the screen, the video captured and finally displayed shows the eyes looking straight ahead.
To solve the above technical problem, the invention discloses a method for correcting the eye image in a self-shot video, comprising the following steps:
Step 1, target eye detection and localization: detect and locate the position of the eyes in the video image;
Step 2, recognition and localization of the sclera, iris, and pupil images within the eye image: distinguish the sclera image from the iris image by gray level; distinguish the iris image from the pupil image by texture; locate the relative positions of sclera image and iris image, and of iris image and pupil image;
Step 3, re-projection of the iris and pupil images: translate the iris and pupil images to the center of the sclera image, thereby correcting the eye image.
In the invention, preferably, step 1 comprises the following steps:
Step (11): preprocess the frames of the self-shot video, including using erosion and dilation to strengthen the connectivity of scattered points in the image, and using median filtering to smooth the image. Any image-processing method common in the field may be used here; moreover, this step is not essential to the invention but merely one of its optimizations, and the object of the invention can still be achieved without it.
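The erosion, dilation, and median filtering of step (11) can be sketched in plain NumPy (a minimal illustration of the standard operations, not the patent's implementation; a real system would typically call an image library such as OpenCV):

```python
import numpy as np

def erode(img, k=3):
    """Binary erosion: a pixel stays 1 only if its whole k x k
    neighborhood is 1 (zero padding at the borders)."""
    pad = k // 2
    p = np.pad(img, pad, constant_values=0)
    out = np.ones_like(img)
    for dy in range(k):
        for dx in range(k):
            out &= p[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out

def dilate(img, k=3):
    """Binary dilation: a pixel becomes 1 if any k x k neighbor is 1."""
    pad = k // 2
    p = np.pad(img, pad, constant_values=0)
    out = np.zeros_like(img)
    for dy in range(k):
        for dx in range(k):
            out |= p[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out

def median_filter(img, k=3):
    """Median filter: replace each pixel by the median of its k x k
    neighborhood (edge-replicated padding), removing isolated noise."""
    pad = k // 2
    p = np.pad(img, pad, mode='edge')
    stack = [p[dy:dy + img.shape[0], dx:dx + img.shape[1]]
             for dy in range(k) for dx in range(k)]
    return np.median(np.stack(stack), axis=0).astype(img.dtype)
```

Dilation followed by erosion (closing) then seals gaps, and erosion followed by dilation (opening) removes isolated specks.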
Step (12): convert the image to a chrominance color space. Since skin color varies little across ethnicities on a color-difference or hue-saturation plane, and differences in skin appearance lie more in luminance than in chrominance, the face image can be identified in the self-shot frame from skin color. For example, under good illumination and suitable contrast, i.e. an average luminance between 100 and 200 and a contrast between 50% and 80%, the skin region occupies the range 102 < Cb < 128, 125 < Cr < 160 in YCbCr space.
Step (13): identify the left- and right-eye images from the face image by the gray-level method. Because the eyeball region and the facial skin differ sharply in gray level, binarizing the region to black and white allows the boundary between the two to be drawn quickly from the gray-level difference.
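The black-and-white binarization of step (13) needs a gray-level threshold. The patent does not name a specific rule; one common automatic choice for a bimodal region such as dark eye against light skin is Otsu's between-class-variance criterion, sketched here:

```python
import numpy as np

def otsu_threshold(gray):
    """Pick the gray level that maximizes the between-class variance
    of the two classes it separates (Otsu's method)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()
    omega = np.cumsum(p)                 # probability of class 0
    mu = np.cumsum(p * np.arange(256))   # first moment of class 0
    mu_t = mu[-1]                        # global mean
    with np.errstate(divide='ignore', invalid='ignore'):
        sigma_b = (mu_t * omega - mu) ** 2 / (omega * (1 - omega))
    sigma_b = np.nan_to_num(sigma_b)     # omega = 0 or 1 gives no split
    return int(np.argmax(sigma_b))
```

Pixels at or below the returned threshold form the dark (eye) class, the rest the light (skin) class.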
In the invention, preferably, step 2 comprises the following steps:
Step (21): binarize the identified eye image to black and white and separate the sclera image from the iris image by the gray-level method. Because the sclera and iris differ sharply in gray level, binarizing the region makes it possible to draw the boundary between the two quickly from the gray-level difference.
Step (22): identify the iris image and the pupil image by texture analysis, and compute their relative position. The iris region carries rich, complex texture, whereas the pupil region is essentially uniform, and the iris region is always circular. The region can therefore be analyzed with a block Fourier transform or a block discrete cosine transform: blocks with strong high-frequency components in the transform domain have complex texture and belong to the iris, while the remaining blocks belong to the pupil, which yields the boundary between the two in the spatial domain.
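The texture test of step (22) can be illustrated with a block transform. The sketch below uses an FFT in place of the block DCT the text also mentions (an assumption made for self-containedness) and scores a block by the share of its spectral energy outside the lowest spatial frequencies; textured (iris) blocks score high, uniform (pupil) blocks low:

```python
import numpy as np

def high_freq_energy(block):
    """Fraction of a block's spectral energy outside the lowest
    frequency bins (corners of the 2-D FFT, DC removed)."""
    spec = np.abs(np.fft.fft2(block - block.mean())) ** 2
    total = spec.sum()
    if total == 0:
        return 0.0                      # perfectly uniform block
    low = (spec[:2, :2].sum() + spec[:2, -1:].sum()
           + spec[-1:, :2].sum() + spec[-1:, -1:].sum())
    return float(1 - low / total)
```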
Step (23): compute the azimuth α and the distance d of the pupil-image center from the iris center.
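Step (23) is a small geometric computation; in image coordinates (x to the right, y down) it amounts to:

```python
import math

def pupil_offset(iris_center, pupil_center):
    """Azimuth alpha and distance d of the pupil center as seen
    from the iris center, as step (23) requires."""
    dx = pupil_center[0] - iris_center[0]
    dy = pupil_center[1] - iris_center[1]
    return math.atan2(dy, dx), math.hypot(dx, dy)
```

Step (33) then consumes α and d to build the corrective rotation.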
In the invention, preferably, step 3 comprises the following steps:
Step (31): translate the iris image to the center of the sclera image;
Step (32): fill the part of the sclera image left vacant by the translation of the iris image with the sclera image that surrounded the iris before the translation;
Step (33): using the azimuth α and distance d of the pupil center from the iris center, apply a directed rotation to the circular region occupied by the translated iris image about its center; the direction of rotation is π + α and the rotation angle is tan⁻¹(d/r), where r is the radius of the pupil.
Step (34): fill the part left vacant by the directed rotation of the iris image with the surrounding iris image.
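Steps (31)-(32) amount to cutting out the iris disk, patching the hole, and pasting the disk at the sclera center. A minimal NumPy sketch on a single-channel image (illustrative only: the patch here uses one flat sclera shade rather than the true surrounding texture):

```python
import numpy as np

def shift_disk(img, center, radius, new_center, fill):
    """Translate the disk (iris + pupil) at `center` to `new_center`,
    filling the vacated pixels with `fill` (a sclera shade).
    Coordinates are (x, y); img is indexed img[y, x]."""
    h, w = img.shape[:2]
    yy, xx = np.mgrid[0:h, 0:w]
    disk = (xx - center[0]) ** 2 + (yy - center[1]) ** 2 <= radius ** 2
    out = img.copy()
    out[disk] = fill                      # step (32): patch the hole
    dx = new_center[0] - center[0]
    dy = new_center[1] - center[1]
    src_y, src_x = np.nonzero(disk)
    dst_y, dst_x = src_y + dy, src_x + dx
    ok = (dst_y >= 0) & (dst_y < h) & (dst_x >= 0) & (dst_x < w)
    out[dst_y[ok], dst_x[ok]] = img[src_y[ok], src_x[ok]]  # step (31)
    return out
```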
The principle of the invention is that, while the subject's gaze is directed at the display screen, the human eye is detected in the captured video frames, the angle between the line of sight and the line from the pupil center to the camera's optical center is determined from the position of the pupil on the eyeball, the region around the eyes in the captured frame is re-projected according to that angle, and the screen finally shows a video image whose gaze is directed at the screen.
Beneficial effects: without adding any extra hardware, the invention performs the correction in software, so that when the face is turned toward the display but the eyes are not looking at the camera lens, the display shows live video in which the eyes gaze at the display, greatly reducing the cost of improving the hardware system. The method has important application prospects wherever video is used for two-way or multi-way communication, such as video telephony and video conferencing.
Brief Description of the Drawings
The above and/or other advantages of the invention will become clearer from the following detailed description taken in conjunction with the accompanying drawings and specific embodiments.
Fig. 1 shows gaze alignment and gaze misalignment in practice.
Fig. 2 shows a typical prior-art video self-shooting arrangement.
Fig. 3 illustrates the gaze-correction computation of the invention.
Fig. 4 is a process diagram of the gaze-correction computation of the invention.
Fig. 5 is a flow chart of the connected-region detection of the invention.
Fig. 6 shows examples of the Haar-like rectangular features of the invention.
Fig. 7 is a simplified flow chart of the method of the invention.
Detailed Description of the Embodiments
The hardware of the invention consists of a single video capture device, a processing unit, and a display device. The core idea is to use target recognition, target registration, and target re-projection in the video image, so that the display shows a video image in which the observer's gaze is directed straight ahead.
As shown in Fig. 7, the invention discloses a method for correcting the eye image in a self-shot video, comprising the following steps:
Step 1, target eye detection and localization: detect and locate the position of the eyes in the video image;
Step 1 comprises the following steps: step 11, preprocess the frames of the self-shot video; step 12, identify the face image in the frame; step 13, identify the left- and right-eye images from the face image by the gray-level method.
Step 11, preprocess the frames of the self-shot video.
Because video frames are often captured in changeable, unpredictable environments (above all, unpredictable lighting), preprocessing the image so that it meets the requirements of the algorithm is particularly necessary. The preprocessing used in the invention comprises histogram equalization, morphological operations, and median filtering.
Histogram equalization is one of the most basic operations in digital image processing; it makes the contrast of the image distinct. Morphological operations divide into morphological erosion and morphological dilation and act on binary images. Dilation followed by erosion is called closing and seals gaps in broken shapes; the reverse order, erosion followed by dilation, is called opening and removes small protrusions and isolated points. Morphological operations thus remove isolated noise points from the image and repair connected regions broken for various reasons.
Median filtering is a nonlinear signal-processing technique that suppresses noise effectively. Its basic principle is to replace the value at a point of a digital image or sequence with the median of the values in a neighborhood of that point, thereby eliminating isolated noise points; after median filtering the image becomes smooth. Step 12, identify the face image in the frames of the self-shot video, comprising:
Face detection based on skin-color segmentation:
Most face-analysis methods work on gray-scale images, whereas skin-color segmentation uses the chrominance of human skin as the feature for face detection; it is a face-detection method based on feature invariants.
Human skin differs markedly from natural backgrounds: owing to the blood vessels of the face, its red component is comparatively full, and skin color stays within a relatively stable range under different lighting and for different ethnicities. At the same time, the method needs only a few passes over the whole image, is fast and easy to implement, and is therefore a basic algorithm widely used in face-detection systems.
The algorithm divides into three steps:
Step a, skin-region segmentation: perform skin segmentation in the YCbCr color space, where a threshold on the skin Cr component is easy to choose and little affected by illumination. The conversion between YCbCr and the RGB color space is:
Y = 0.256789·R + 0.504129·G + 0.097906·B + 16
Cb = −0.148223·R − 0.290992·G + 0.439215·B + 128
Cr = 0.439215·R − 0.367789·G − 0.071426·B + 128
R = 1.164383·(Y − 16) + 1.596027·(Cr − 128)
G = 1.164383·(Y − 16) − 0.391762·(Cb − 128) − 0.812969·(Cr − 128)
B = 1.164383·(Y − 16) + 2.017230·(Cb − 128)
Threshold segmentation converts the YCbCr color image into a black-and-white image, in which black marks the background and white marks regions close to skin color. In the typical case of good illumination and suitable contrast, the skin region occupies 102 < Cb < 128, 125 < Cr < 160 in YCbCr space, so the segmentation thresholds can be chosen as Cb = 116 and Cr = 144.
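The conversion and the skin box above translate directly into code (a sketch; the thresholds are the ones quoted in the text):

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """BT.601 'video range' RGB -> YCbCr conversion used in the text
    (rgb: float array, channels last, values 0-255)."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y  =  0.256789 * r + 0.504129 * g + 0.097906 * b + 16
    cb = -0.148223 * r - 0.290992 * g + 0.439215 * b + 128
    cr =  0.439215 * r - 0.367789 * g - 0.071426 * b + 128
    return np.stack([y, cb, cr], axis=-1)

def skin_mask(rgb):
    """White (True) where the pixel falls inside the skin box
    102 < Cb < 128, 125 < Cr < 160 given in the text."""
    ycc = rgb_to_ycbcr(np.asarray(rgb, dtype=float))
    cb, cr = ycc[..., 1], ycc[..., 2]
    return (cb > 102) & (cb < 128) & (cr > 125) & (cr < 160)
```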
Step b: all connected white regions are then located. Before the region values are measured, the image is preprocessed: (1) a morphological closing (dilation followed by erosion) strengthens the connectivity of scattered points; (2) median filtering smooths the image.
Step c: among all the white regions found, the one most likely to be a face is selected by area, aspect ratio, position, and similar information. In this embodiment the face generally occupies a large share of the image (say over 60%), lies in its central region, and has an aspect ratio close to 1:1, so it is easy to single out.
Detection of connected regions: an important step of the skin-color face-segmentation algorithm is to detect the connected regions of the image after the morphological operations and to determine the coordinates, size, aspect ratio, and so on of the minimal rectangular boundary enclosing each of them. Detection first requires rules defining a "connected region": (1) two points are connected only if they are adjacent along a row or a column (diagonal adjacency does not count); (2) if one connected region contains another, the inner one is ignored; (3) if the minimal bounding rectangle of one connected region partly overlaps that of another, the two are still treated as independent connected regions. On these rules, a boundary-traversal algorithm is designed that searches row by row and column by column for connected skin-like pixels, with skin-like pixels defined as 1 and non-skin pixels as 0. The algorithm flow is shown in Fig. 5: Figs. 5a and 5b give the overall flow for multiple regions and the detailed flow for a single region, respectively.
As shown in Fig. 5a, traversal starts from the first pixel of the first row at the top-left corner of the image and proceeds row by row, testing whether the current pixel is a starting point. If it is, the method of Fig. 5b detects the boundary that begins at that pixel, marks the boundary once detection is complete, and moves to the pixel to the right of the previous starting point; if it is not a starting point, traversal continues with the next pixel in the row. The process repeats until every pixel has been visited.
Boundary detection, shown in Fig. 5b, starts from the starting point as the current point. The algorithm first looks clockwise (left, up, right, down) for pixels of the same type around the current point, then decides whether the pixel found is a boundary point. For convenience, take the case of a same-type pixel to the left of the current point. Let the current pixel be A and test whether a same-type pixel B exists to its left. If B does not exist, the boundary is updated and the algorithm tests whether it has returned to the starting point, finishing if so and otherwise searching again to the left of A. If B exists, B becomes the current point, and the first counterclockwise direction from AB, i.e. "down", is tested for a same-type pixel. If there is none, the boundary is updated and the search to the left of A is repeated; if a same-type pixel C lies below, A is not a boundary point, C becomes the current point, and the walk continues downward counterclockwise from BC, checking at each move whether the starting point has been reached. If the walk returns to the starting point it ends; otherwise the right side is tested. If there is no pixel to the right, B is an edge point: the boundary is updated, the clockwise search continues with direction AB as the initial direction, and the pixel below is tested again. If there is a pixel to the right, the walk moves right and checks for the starting point; if it has not returned, the upper side is tested; if there is nothing above, the boundary is updated and the right side is tested again; if there is a pixel above, the walk moves up and checks for the starting point, ending if it has been reached and otherwise testing the left side again, until the starting point is found. From the boundary information, the area and aspect ratio of the region are obtained, and regions whose area or aspect ratio is too large or too small are excluded.
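The net effect of the boundary-tracing walk — one bounding box per 4-connected region, with diagonal neighbors not counted as connected (rule (1)) — can be reproduced more simply with a flood fill. The sketch below is that simpler equivalent, not the patent's traversal itself:

```python
import numpy as np
from collections import deque

def connected_boxes(mask):
    """Find 4-connected components of a boolean mask by flood fill and
    return one bounding box (top, left, bottom, right) per component."""
    h, w = mask.shape
    seen = np.zeros_like(mask, dtype=bool)
    boxes = []
    for y in range(h):
        for x in range(w):
            if not mask[y, x] or seen[y, x]:
                continue
            q = deque([(y, x)])
            seen[y, x] = True
            t, l, b, r = y, x, y, x
            while q:
                cy, cx = q.popleft()
                t, l = min(t, cy), min(l, cx)
                b, r = max(b, cy), max(r, cx)
                # row/column neighbors only: diagonal is not connected
                for ny, nx in ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1)):
                    if 0 <= ny < h and 0 <= nx < w \
                            and mask[ny, nx] and not seen[ny, nx]:
                        seen[ny, nx] = True
                        q.append((ny, nx))
            boxes.append((t, l, b, r))
    return boxes
```

Each box then yields the area and aspect ratio used to exclude non-face regions.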
This embodiment uses AdaBoost-based face detection.
AdaBoost answers two questions. First, how should the training samples be handled? In AdaBoost every sample carries a weight; when a sample is misclassified its weight is raised, otherwise lowered, so the method concentrates on the "hard" samples. Second, how are weak classifiers merged into a strong classifier? The strong classifier is a linear weighted sum of several weak classifiers, with the more accurate weak learners receiving the larger weights. AdaBoost thus embodies two key ideas: the fusion of multiple features, which is the core of boosting, and weighted classification, in which the features receive different weights learned during training rather than set in advance as in traditional weighted classification. Concretely, detecting a face against a background must, as explained above, rely on certain features such as texture and edges; in this embodiment, the Haar-like method is used for feature extraction. K features are chosen, i.e. there are K weak classifiers, and T training samples; looping over the training set yields the weight combination of the K feature vectors with the highest classification accuracy, the weights of the T samples being updated in every pass so that hard-to-classify samples gain weight and easy ones lose it. This method adopts AdaBoost for face detection; the details of the algorithm, all of which operate on gray-scale images, are set out below.
Haar-like feature extraction: a Haar-like feature is a rectangle-pair feature. Given limited data, detection based on Haar-like features can encode the state of a particular region. Rectangular features are sensitive to simple structures such as edges and line segments, but they can only describe structures of specific orientations (horizontal, vertical, diagonal) and are therefore coarse. Some facial features are described simply by rectangle features: the eyes are usually darker than the cheeks, which is an edge feature; the sides of the bridge of the nose are darker than the bridge itself, which is a line feature; the mouth is darker than its surroundings, which, as shown in Fig. 6b, is a feature of a specific orientation. The commonly used feature rectangles divide into edge features, line features, and specific-orientation features, as shown in Fig. 6a; edge templates extract edge information at different angles, line templates extract linear image patches at different angles, and specific-orientation templates extract patches of a specified type.
The base templates are the smallest of their kind, so same-shaped templates of every size can be formed by scaling; edge template 1, for example, is a two-pixel template. The feature value of a template as it traverses the image is the sum over the area covered by its white rectangles minus the sum over the area covered by its black rectangles, so every feature of a uniformly colored image has the value zero. Each template may be placed at "any" size and "any" position within the sub-window, and each such configuration is called a feature. Enumerating all features of the sub-window is the basis of weak-classifier training.
An image to be examined, say of size m×n, clearly contains a great many features, and their total over several templates is a considerable number, so the total number of features in an image is discussed next. Intuitively, the image is a large box and the template a small box moving freely inside it; the small box can be placed at many different positions within the large box, and the sum of all possible placements of small boxes at all scales is the total number of features. For a base template of size s×t, the number of features contained in an m×n window is:

total = Σ_{a=1}^{⌊m/s⌋} Σ_{b=1}^{⌊n/t⌋} (m − s·a + 1)(n − t·b + 1)
Summing the feature counts of the different templates gives the total number of features in the image. For the usual set of 2 edge templates, 2 line templates, and 1 specific-direction template (5 templates in all), a 16×16 image contains 32,384 features; if the image size is 36×36, the count reaches 816,264.
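These counts can be checked against the closed-form enumeration above. The sketch below is illustrative (the function name and the exact base-template list are assumptions, not taken from the patent); it sums, for each base template, the number of placements at every scale that fits the window.

```python
def haar_feature_count(m, n, templates):
    """Total Haar-like features in an m x n window.

    For a base template of size s x t, copies scaled by every integer factor
    fit the window; summing all placements over all scales gives the closed
    form X * Y * (m + 1 - s(X + 1)/2) * (n + 1 - t(Y + 1)/2) with
    X = m // s and Y = n // t.
    """
    total = 0
    for s, t in templates:
        X, Y = m // s, n // t
        total += int(X * Y * (m + 1 - s * (X + 1) / 2) * (n + 1 - t * (Y + 1) / 2))
    return total

# 2 edge (1x2, 2x1), 2 line (1x3, 3x1), 1 specific-direction (2x2) base templates
TEMPLATES = [(1, 2), (2, 1), (1, 3), (3, 1), (2, 2)]
print(haar_feature_count(16, 16, TEMPLATES))  # 32384, as stated above
print(haar_feature_count(36, 36, TEMPLATES))  # 816264
```

With the same template set, a 24×24 window (the size used in the classic Viola-Jones detector) yields 162,336 features, which is consistent with the literature.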
Integral image computation. The figures above show that the number of features in an image is very large and grows sharply with image size, so finding a suitable feature computation method is essential. The integral image method adopted in this embodiment is an effective and fast way to compute features.
For an image A, the integral image value at point (x, y) is defined as:

ii(x, y) = Σ_{x′ ≤ x, y′ ≤ y} A(x′, y′)
that is, the sum of all points in the rectangle whose opposite corners are the origin and that point. With the integral image, the Haar-like rectangle features of the image can be computed quickly and conveniently: the value of a rectangle feature depends only on the integral image values at the feature's corner points, not on the image coordinates. Therefore, regardless of the scale of the rectangle feature, computing its value takes constant time and involves only simple additions and subtractions. For this reason, the introduction of the integral image greatly increases detection speed.
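The constant-time rectangle sum can be sketched as follows (a minimal illustration; the zero-padded border and function names are implementation choices, not from the patent). Any rectangle sum reduces to four integral-image lookups: ii(D) − ii(B) − ii(C) + ii(A) for the four corners.

```python
import numpy as np

def integral_image(img):
    # ii[x, y] = sum of all pixels in the rectangle from the origin to (x, y);
    # a zero row/column is prepended so corner lookups need no boundary checks
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = np.cumsum(np.cumsum(img, axis=0), axis=1)
    return ii

def rect_sum(ii, top, left, height, width):
    # Four lookups, two subtractions, one addition -- independent of rect size
    return (ii[top + height, left + width] - ii[top, left + width]
            - ii[top + height, left] + ii[top, left])

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(8, 8))
ii = integral_image(img)
assert rect_sum(ii, 2, 3, 4, 5) == img[2:6, 3:8].sum()
```

A Haar-like feature value is then just `rect_sum` over the white rectangles minus `rect_sum` over the black ones.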
AdaBoost design flow. The AdaBoost algorithm ultimately produces a suitable strong classifier. Designing the classifier is mainly a training process that uses a large number of samples, both faces and non-faces, and proceeds as follows:
1) Given a series of training samples (x_1, y_1), (x_2, y_2), …, (x_n, y_n), where y_i = 0 marks a negative sample (non-face) and y_i = 1 marks a positive sample (face), and n is the total number of training samples;
2) Initialize the weights w_{1,i} = D(i), letting w_{1,i} = 1/(2m) for y_i = 1 and w_{1,i} = 1/(2l) for y_i = 0, where m is the number of positive samples and l is the number of negative samples, with m + l = n;
3) For t = 1, …, T (T is the number of iterations), first normalize the weights:

q_{t,i} = w_{t,i} / Σ_j w_{t,j}
Then, for each feature f, train a weak classifier h(x, f, p, θ) and compute the q_t-weighted error rate ε_f of the weak classifier for every feature, where f is the feature, θ is the threshold, and p indicates the direction of the inequality:
ε_f = Σ_i q_i |h(x_i, f, p, θ) − y_i|
Then select the best weak classifier h_t(x), the one with the minimum error rate ε_t:
ε_t = min_{f,p,θ} Σ_i q_i |h(x_i, f, p, θ) − y_i| = Σ_i q_i |h(x_i, f_t, p_t, θ_t) − y_i| = Σ_i q_i |h_t(x_i) − y_i|
The training and selection of weak classifiers is described in detail below. Using this best weak classifier, adjust the weights:

w_{t+1,i} = w_{t,i} β_t^{1−e_i}, where β_t = ε_t / (1 − ε_t)
where e_i = 0 means x_i was classified correctly and e_i = 1 means x_i was classified incorrectly;
The final strong classifier is:

H(x) = 1 if Σ_{t=1}^{T} α_t h_t(x) ≥ (1/2) Σ_{t=1}^{T} α_t, and 0 otherwise,

where α_t = log(1/β_t).
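Steps 1) through 3) and the final weighted vote can be sketched end to end on toy data. This is an illustrative reduction, not the patent's detector: the weak classifiers here are single-feature threshold stumps of the form h(x, f, p, θ) = [p·x_f < p·θ], and the 1-D interval data set is chosen so that no single stump fits but three boosting rounds do.

```python
import numpy as np

def adaboost_train(X, y, T):
    """X: (n, d) array of feature values, y in {0, 1}; returns stumps (f, p, theta, alpha)."""
    n = len(y)
    m, l = int(y.sum()), int(n - y.sum())
    w = np.where(y == 1, 1.0 / (2 * m), 1.0 / (2 * l))      # step 2: initial weights
    classifiers = []
    for _ in range(T):
        q = w / w.sum()                                     # step 3: normalise
        best = None
        for f in range(X.shape[1]):                         # train a stump per feature
            for theta in np.unique(X[:, f]):
                for p in (1, -1):
                    h = (p * X[:, f] < p * theta).astype(int)
                    eps = float(np.sum(q * np.abs(h - y)))  # weighted error
                    if best is None or eps < best[0]:
                        best = (eps, f, p, theta, h)
        eps, f, p, theta, h = best                          # best weak classifier
        beta = eps / (1.0 - eps)
        e = np.abs(h - y)                                   # 0 if correct, 1 if wrong
        w = q * beta ** (1 - e)                             # shrink weights of correct samples
        classifiers.append((f, p, theta, np.log(1.0 / max(beta, 1e-12))))
    return classifiers

def adaboost_predict(classifiers, X):
    # weighted vote of the stumps compared against half of the total vote mass
    votes = sum(a * (p * X[:, f] < p * theta) for f, p, theta, a in classifiers)
    return (votes >= 0.5 * sum(c[3] for c in classifiers)).astype(int)

X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([0, 0, 1, 1, 0, 0])            # an interval: no single stump can fit it
stumps = adaboost_train(X, y, T=3)
assert list(adaboost_predict(stumps, X)) == [0, 0, 1, 1, 0, 0]
```

The `1e-12` guard only matters when a stump is perfect (ε_t = 0, β_t = 0), which would otherwise make α_t infinite.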
A weak classifier h(x, f, p, θ) consists of a feature f, a threshold θ, and a polarity p indicating the direction of the inequality:

h(x, f, p, θ) = 1 if p·f(x) < p·θ, and 0 otherwise.
For the rectangle features in this embodiment, the feature value f(x) of a weak classifier is simply the value of the rectangle feature. Because the training samples are chosen to have the same size as the detection sub-window, and the sub-window size determines the number of rectangle features, every sample in the training set has the same features in the same number, and a given feature has one fixed value on a given sample. For an idealized image with randomly distributed pixel values, the average value of one rectangle feature over different images should tend to a constant K. The same should hold for non-face samples, but since non-face samples are not necessarily pixel-random images, this assumption carries a sizable deviation. For each feature, compute the average of its values over all samples of one class (faces or non-faces); this yields the distribution of per-feature averages for each class. The distribution curves for face and non-face samples do not differ much, but once the feature value exceeds (or falls below) a certain value, the curves diverge consistently. This shows that the vast majority of features have only a tiny ability to distinguish faces from non-faces, yet some features, with appropriate thresholds, can separate face samples from non-face samples effectively.
The only requirement on a weak learner (a single feature) is that it distinguish face from non-face images with an error rate slightly below 50%, so the ability to discriminate only within some probability range, as noted above, is entirely sufficient. Under this requirement, all rectangle features with error rates below 50% can be found (with suitably chosen thresholds, almost all rectangle features satisfy it for a fixed training set). Each training round selects the best weak classifier of that round (in the algorithm, iterating T times selects T best weak classifiers), and finally the per-round best weak classifiers are boosted into a strong classifier by the method above.
Training a weak classifier (feature f) means determining the optimal threshold for f under the current weight distribution, so that this weak classifier has the lowest classification error over all training samples. Selecting the best weak classifier means choosing, among all weak classifiers, the one (the feature) with the lowest classification error over all training samples. For each feature f, compute the feature values of all training samples and sort them. A single scan over the sorted feature values determines an optimal threshold for the feature, yielding a trained weak classifier. Specifically, for each element of the sorted list, compute the following four values:
1) the sum T+ of the weights of all face samples;
2) the sum T− of the weights of all non-face samples;
3) the sum S+ of the weights of the face samples before this element;
4) the sum S− of the weights of the non-face samples before this element.
Thus, when a number between the current element's feature value and the one before it is taken as the threshold, the resulting weak classifier separates the samples at the current element: the weak classifier corresponding to this threshold classifies every element before the current one as a face (or non-face) and every element from the current one onward as a non-face (or face).
The classification error incurred by this threshold can be taken to be:
e = min(S+ + (T− − S−), S− + (T+ − S+))
Scanning the sorted list once from beginning to end therefore selects, for the weak classifier, the threshold that minimizes the classification error (the optimal threshold); that is, it selects a best weak classifier.
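The single-pass scan above can be sketched directly (function name and return convention are illustrative assumptions). At each sorted element it evaluates e = min(S+ + (T− − S−), S− + (T+ − S+)) for the two polarities and keeps the minimiser, so training a stump costs one sort plus one linear pass.

```python
def train_stump_by_scan(values, labels, weights):
    """One pass over sorted feature values; returns (error, threshold, polarity)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    t_pos = sum(w for w, y in zip(weights, labels) if y == 1)   # T+
    t_neg = sum(w for w, y in zip(weights, labels) if y == 0)   # T-
    s_pos = s_neg = 0.0                                         # S+, S- so far
    best = (float("inf"), None, None)
    for i in order:
        # polarity +1: "faces are at/after the threshold" -- the mistakes are
        # the earlier faces (S+) plus the later non-faces (T- minus S-);
        # polarity -1 is the mirror image
        err_pos = s_pos + (t_neg - s_neg)
        err_neg = s_neg + (t_pos - s_pos)
        err = min(err_pos, err_neg)
        if err < best[0]:
            best = (err, values[i], 1 if err_pos <= err_neg else -1)
        if labels[i] == 1:
            s_pos += weights[i]
        else:
            s_neg += weights[i]
    return best

err, theta, p = train_stump_by_scan([3.0, 1.0, 4.0, 2.0], [1, 0, 1, 0], [0.25] * 4)
assert err == 0.0 and theta == 3.0   # faces (3, 4) separate cleanly from (1, 2)
```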
An AdaBoost strong classifier is a cascade of weak classifiers. Applied to an image under detection, it effectively lets every weak classifier vote, weights the votes according to each weak classifier's error rate, sums them, and compares the weighted sum with the average voting result to reach the final decision. The average voting result assumes that every weak classifier is equally likely to vote "for" or "against", giving the average:

(1/2) Σ_{t=1}^{T} α_t
Step 13: identify the left-eye and right-eye images from the face image using the grayscale method.
Step 2: recognition and localization of the sclera, iris, and pupil images within the eye image: recognize the sclera image and the iris image by grayscale; recognize the iris image and the pupil image by texture; and locate the relative positions of the sclera and iris images and of the iris and pupil images.
Step 2 comprises the following steps. Step 21: binarize the identified eye image to black and white and distinguish the sclera image from the iris image by their difference in grayscale. Binarization uses the Otsu method, i.e. it selects the threshold that maximizes the between-class grayscale variance of the binarized image. Because the sclera and the iris differ greatly in grayscale, the sclera region is white and the iris region black after binarization, and the circular boundary of the iris region makes the two easy to tell apart. Step 22: identify the iris image and the pupil image by texture analysis, and compute their relative position. The iris region has rich, complex texture, whereas the pupil region is essentially of uniform texture, and the iris region is always circular; the region can therefore be analyzed with a block Fourier transform or a block discrete cosine transform. Examining the high-frequency components in the transform domain, abundant high-frequency content indicates complex texture and hence the iris region; otherwise the region is the pupil, which yields the boundary between the two in the spatial domain. The present invention decides this by comparing the ratio of high-frequency to low-frequency components between the two image spectra; in practice, when the spectrum is bimodal, a region whose high-frequency components carry more than 20% of the total spectral energy can generally be classed as iris. Step 23: compute the azimuth and distance from the center point of the pupil image to the center point of the iris. Since both the pupil and the iris are circular, the present invention obtains the pupil center and the iris center by extracting the arc boundaries of the pupil and the iris respectively and locating the centers from the geometric relation between a circle's center and its arc.
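The Otsu binarization of step 21 can be sketched on a synthetic bimodal "eye" histogram (dark iris pixels against bright sclera). This is a generic sketch of Otsu's method, not the patent's exact implementation; maximizing the between-class variance is equivalent to minimizing the within-class variance sum.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the threshold t maximising the between-class variance
    w0 * w1 * (mu0 - mu1)^2 over all splits {pixels < t} / {pixels >= t}."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        w0, w1 = prob[:t].sum(), prob[t:].sum()     # class probabilities
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (np.arange(t) * prob[:t]).sum() / w0          # dark-class mean
        mu1 = (np.arange(t, 256) * prob[t:]).sum() / w1     # bright-class mean
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t

# Synthetic bimodal data: iris-like pixels near 40, sclera-like pixels near 210
rng = np.random.default_rng(1)
img = np.concatenate([rng.integers(30, 50, 500),
                      rng.integers(200, 220, 500)]).astype(np.uint8)
t = otsu_threshold(img)
assert 50 <= t <= 200   # the threshold falls between the two modes
```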
Step 3: secondary projection of the iris image and the pupil image — move the iris image and the pupil image to the center of the sclera image by directed rotation, thereby straightening the eye image.
As shown in Figure 4, the diagonally striped areas represent the parts left vacant after translation and directed rotation, the dotted area represents the iris, and the black area represents the pupil. Step 3 comprises the following steps. Step 31: translate the iris image to the center of the sclera image, as shown in Figure 4(a). Step 32: fill the part of the sclera image left missing after the iris translation with the sclera image that surrounded the iris before translation, as shown in Figure 4(c). Step 33: according to the azimuth α and distance d from the center point of the pupil image to the center point of the iris, rotate the circular region containing the translated iris image about its center; the rotation direction is π + α and the rotation angle is tan⁻¹(d/r), where r is the radius of the pupil, as shown in Figure 3 and Figure 4(c). Step 34: fill the part left vacant after the directed rotation of the iris image with the surrounding iris image.
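The geometry of steps 31 and 33 can be sketched as a small helper (the function name, the planar (x, y) coordinate convention, and the example centers are illustrative assumptions; the rotation itself would be applied by the renderer).

```python
import math

def recentering_motion(sclera_c, iris_c, pupil_c, pupil_r):
    """Return the step-31 translation and the step-33 directed rotation.

    Translation moves the iris center onto the sclera center; the rotation
    has direction pi + alpha and angle arctan(d / r), where alpha and d are
    the azimuth and distance of the pupil center from the iris center and
    r is the pupil radius.
    """
    tx = sclera_c[0] - iris_c[0]            # step 31: translation vector
    ty = sclera_c[1] - iris_c[1]
    dx = pupil_c[0] - iris_c[0]             # pupil offset inside the iris
    dy = pupil_c[1] - iris_c[1]
    d = math.hypot(dx, dy)
    alpha = math.atan2(dy, dx)              # azimuth from step 23
    return (tx, ty), alpha + math.pi, math.atan(d / pupil_r)

# Iris center 10 px right of and 5 px below the sclera center; pupil 2 px off-axis
(tx, ty), direction, angle = recentering_motion((100, 100), (110, 105), (112, 105), 8.0)
assert (tx, ty) == (-10, -5)
assert abs(direction - math.pi) < 1e-12 and abs(angle - math.atan(0.25)) < 1e-12
```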
The preprocessing of step 11 includes using the erosion-dilation method to strengthen the connectivity of scattered points in the image. The preprocessing of step 11 includes using histogram equalization to improve the contrast of the image. The preprocessing of step 11 includes processing the image with a median filter.
The present invention provides the concept and method of a technique for straightening eye images in self-shot video. There are many ways to implement this technical solution, and the above is only a preferred embodiment of the present invention. It should be noted that those of ordinary skill in the art can make several improvements and refinements without departing from the principle of the present invention, and such improvements and refinements should also be regarded as falling within the protection scope of the present invention. All components not specified in this embodiment can be implemented with existing technology.
Claims (5)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2010102640432A CN101930543B (en) | 2010-08-27 | 2010-08-27 | Method for adjusting eye image in self-photographed video |
Publications (2)
Publication Number | Publication Date |
---|---|
CN101930543A CN101930543A (en) | 2010-12-29 |
CN101930543B true CN101930543B (en) | 2012-06-27 |
Family
ID=43369708
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1794262A (en) * | 2005-12-29 | 2006-06-28 | 上海交通大学 | Method of fast and effective cutting apart vivi iris |
CN1889016A (en) * | 2006-07-25 | 2007-01-03 | 周辰 | Eye-to-computer cursor automatic positioning controlling method and system |
CN101201893A (en) * | 2006-09-30 | 2008-06-18 | 电子科技大学中山学院 | Iris recognition preprocessing method based on gray information |
CN101339603A (en) * | 2008-08-07 | 2009-01-07 | 电子科技大学中山学院 | Method for selecting iris image with qualified quality from video stream |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7783077B2 (en) * | 2006-12-01 | 2010-08-24 | The Boeing Company | Eye gaze tracker system and method |
Non-Patent Citations (2)
Title |
---|
支兴超, 等. 一种基于坐标投影的虹膜定位算法. 《计算机应用》, 2008, 第28卷(第8期): 2055-2058. |
邢磊, 等. 虹膜图像的质量评价方法. 《中国体视学与图像分析》, 2003, 第8卷(第2期): 108-113. |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20120627; Termination date: 20150827 |
EXPY | Termination of patent right or utility model | |