Summary of the invention
To overcome the deficiencies of the prior art described above, the object of the present invention is to provide a facial image sample collection optimization method. The method uses a heterogeneous active vision network and an empirical evaluation function of facial image sample quality. Through multi-target assignment of the cameras and adjustment of the camera states, the sample collection stage is optimized so that facial image samples with higher resolution and richer attitude angles are obtained for face registration and recognition.
To achieve this object, the technical solution adopted by the present invention is as follows:
A facial image sample collection optimization method based on a heterogeneous active vision network comprises the following:
Optimization process: every Δt, all cameras perform target assignment and state adjustment to complete one round of target image sample collection.
Optimization variables, in order: the assignment relation $I(c,t)$ between cameras and targets; the position and orientation angles $L_c$ of each camera in the world coordinate system; and the focal length $f_c$ of each camera. Here $I(c,t)$ is an indicator function, $c \in \{1, 2, \ldots, N_c\}$ is the camera index, and $t \in \{1, 2, \ldots, N_t\}$ is the target index. $I(c,t)=1$ means that the $t$-th target is assigned to the $c$-th camera, and $I(c,t)=0$ means that it is not assigned. $L_c$ denotes the position, horizontal rotation angle, and pitch rotation angle of the $c$-th camera in the world coordinate system.
Optimization objective: maximize the overall empirical evaluation function $\mathrm{Sum}_t(f_t)$ of all targets, where the evaluation function $f_t$ of the $t$-th target is defined as follows. For the first sample,

$$f_t(p_1, r_1) = f_p(p_1) \cdot f_r(r_1),$$

and for $n_t = 2, 3, \ldots$ the value is computed by recursion. Here $f_t$ is the joint evaluation function of the $n_t$ face image samples of the $t$-th target; $f_p$ is the evaluation function of the attitude angle, computed from how uniformly and densely all attitude angles are distributed over the interval (-90°, 90°); $p_{n_t}$ is the attitude angle of the $n_t$-th face image sample; $f_r$ is the evaluation function of the resolution, computed from how high the resolution is; and $r_{n_t}$ is the resolution of the $n_t$-th face image sample.
Optimization method: traverse $I(c,t)$, that is, the target assignment schemes. Under each assignment scheme, use a genetic algorithm to solve for $L_c$ and $f_c$ of each camera so that $f_t$ is maximized. The optimal assignment scheme is the one that yields the maximum $\mathrm{Sum}_t(f_t)$.
When a single target is collected, $I(c,t)$ is constant. Within the time Δt, the cameras carry out one round of sample collection in sequence. After one camera has collected a sample, the set of image samples of the target has changed, which affects the evaluation function computed for the next camera and therefore that camera's collection behavior. The solving order therefore has to be traversed to find the optimal order and obtain the optimal solution, although a random order is also acceptable.
When multiple targets are collected and each camera is assigned only one target, $I(c,t)$ (the target assignment scheme) is traversed first and the acquisition order of the cameras second. In each order, each camera maximizes the evaluation function $f_t$ of its assigned target, which determines the optimal assignment scheme, the optimal acquisition order, and the best camera states.
When multiple targets are collected and a camera may be assigned several targets, $I(c,t)$ is likewise traversed first and the acquisition order of the cameras second. Each camera then maximizes the sum of the objective functions of all targets assigned to it, namely $\mathrm{Sum}_{t \in T_c}(f_t)$, where $T_c$ is the set of targets assigned to camera $c$. This determines the optimal assignment scheme, the optimal acquisition order, and the best camera states.
Compared with the prior art, the present invention draws on existing research on face recognition techniques and systems. Among the indices by which face image samples affect face registration and recognition, the two most important are the resolution and the attitude angle of the face in the image, and the present invention improves both with active vision technology. A heterogeneous active vision network is a novel and advanced active vision technology characterized by active acquisition, task cooperation, and complementary functions. Compared with conventional equipment and techniques such as single-camera acquisition, static cameras, and scanning cameras, it obtains facial image samples of non-cooperative targets more effectively.
Building on existing research on face recognition techniques and systems, the present invention carries out further image analysis and mathematical analysis and establishes an empirical evaluation function of facial image sample quality (mainly related to resolution and attitude angle). With maximization of the empirical evaluation function value as the goal, a heterogeneous active vision network is used to optimize the sample collection stage and obtain facial image samples with higher resolution and richer attitude angles for face registration and recognition. The technique is mainly intended for surveillance or military scenarios in which facial image samples of non-cooperative targets are collected, and it can be further generalized to non-human targets.
Embodiment
The present invention is described in further detail below with reference to the accompanying drawings and the embodiment.
The embodiment is described in three parts.
(1) Evaluation criteria for facial image samples
The indices by which a facial image sample affects face registration and recognition include the image compression ratio, the target distance, the target's expression, and so on. With reference to research in the face recognition field and to the indices that active vision technology can change, the present technique selects two indices: the resolution of the face region and the attitude angle of the face in the image. The face region is generally rectangular, for example 40 × 60 (width in pixels × length in pixels), and can be described by its width in pixels, r = 40. The attitude angle p of the face is described by an angle from -90° to +90°, from left to right.
The empirical evaluation function of facial image samples evaluates resolution and attitude angle separately and then combines the two indices into a joint evaluation function. With existing methods, registering a target usually requires facial image samples at many different attitude angles, which improves the system's ability to recognize the target at different attitude angles after registration. During recognition, the more facial image samples of the target are collected at different attitude angles, the more information is available for recognition and the higher the recognition accuracy. Evaluation experiments on face recognition systems show that the higher the resolution of the face region, the better the recognition result, and that the larger the attitude angle deviation between the facial image samples used for registration and for recognition, the lower the recognition accuracy. The present invention therefore concludes that, in facial image samples, the higher the resolution of the face region and the richer and more uniformly distributed the attitude angles (the orientation of the face in the image), the better the face registration and recognition results, so the evaluation function value should be larger. In addition, face registration and recognition performance gradually saturates as the resolution and the number of attitude angles increase and stops growing significantly, so the evaluation function must have the same property.
In practice, the collected facial image samples usually have various resolutions and must be scaled to a common reference resolution before face registration or recognition. If the original resolution is lower than the reference resolution, the image quality deteriorates after enlargement to the reference resolution. What matters is therefore the original resolution at acquisition time, which determines the image quality with respect to the resolution index.
Experiments were carried out on the FERET face database (ba~bj series sample sets). The sample set contains facial images of 194 targets (persons) at different angles. In the experiment, the facial image samples of the 194 targets were reduced to original resolutions of 4 × 6, 8 × 12, 20 × 30, and 40 × 60, uniformly enlarged again to the 40 × 60 reference resolution, and then subjected to image processing and mathematical analysis, as shown in Fig. 1. A two-dimensional Fourier transform was then applied. Of the four components available after the transform (real part, imaginary part, amplitude, and phase angle), the amplitude was chosen because the samples are most separable in the amplitude component. Fig. 2 is obtained as follows: the samples of each original resolution are Fourier transformed (from the W × L dimensional image space to the W × L dimensional frequency-domain space), the separability of all samples is computed at each point of the frequency domain, and the value at each point is converted to a gray value by taking the base-10 logarithm and multiplying by 50, which keeps the gray values between 0 and 255 and makes the results for different resolutions contrast clearly. The brighter a point, the better the sample separability at that original resolution.
Each image sample yields a 40 × 60 dimensional (that is, 2400-dimensional) sample of the amplitude component in the frequency domain. All facial image samples of one target are regarded as one class, and different targets are different classes. For each original resolution, the separability (a notion similar to variance) of the frequency-domain amplitude samples is computed and summed over all sample dimensions, with the base-10 logarithm taken and multiplied by 50 for consistency with the processing described above. The result is the evaluation function value of that resolution, as shown in Table 1. Separability determines how effective the samples are in face registration and recognition: the larger the separability, the better the result.
Table 1. Sample evaluation results
Original resolution | 4×0 | 4×6 | 8×12 | 20×30 | 40×60
Evaluation function value | 0 | 726 | 1090 | 1519 | 1790
From Table 1, the evaluation function curve with respect to resolution is obtained, shown as the piecewise curve in Fig. 3. The horizontal axis is the sample resolution $r$ (expressed as the width of the face region in pixels) and the vertical axis is the evaluation function value $f_r$. The smoother curve in Fig. 3 is a fit to the piecewise curve, using an empirically chosen regression model. Nonlinear regression gives the parameter $k_r = 32$ with a regression squared error of 0.0319, and the ranges are $f_r \in [0, 1.1180]$ for $r \in [0, 40]$. As Fig. 3 shows, the evaluation function value increases with resolution while its growth rate decays steadily, so that it saturates. This agrees with the empirical behavior of face registration and recognition performance as a function of image sample resolution.
Current face registration and recognition usually use image samples at multiple attitude angles. When a face rotates horizontally in front of the lens, the resulting change in attitude angle is a one-dimensional change in the original geometric space. It changes the gray values of all pixels of the image (of length L and width W), which is a change in the W × L dimensional image space. This change in image space is a one-dimensional change along an approximate manifold; see Joshua B. Tenenbaum et al., "A Global Geometric Framework for Nonlinear Dimensionality Reduction", Science, vol. 290, 22 December 2000.
A common method of face registration and recognition works in image space: the test sample to be classified is compared with the training samples in the database and then matched and recognized. The training samples come from the face registration process and the test samples from the face recognition process. The more training samples of a target are collected at different attitude angles, and the more uniformly the attitude angles are distributed, the more training samples of the same target lie near a test sample of the target at a given attitude angle in image space, and the less likely the test sample is to be misclassified.
Following the two preceding paragraphs, we use the Euclidean distance between the pixel vectors of facial image samples at adjacent attitude angles to characterize the approximately one-dimensional variation of the samples with attitude angle in image space. Sample quality is then evaluated, and the evaluation function designed, according to how densely and uniformly the samples are distributed along this approximately one-dimensional variation.
Facial image samples of 194 persons at a resolution of 40 × 60 and at 9 attitude angles from left to right (0°, ±15°, ±25°, ±40°, and ±60°) were selected for image processing and mathematical analysis. Table 2 shows, for each pair of adjacent angles from -60° to +60°, the mean squared distance between the image-space pixel gray-level vectors of the samples (divided by 100 to reduce the magnitude). Because the FERET sample set lacks suitable samples, the analysis does not cover the full range from -90° to +90°.
Table 2. Euclidean distances between sample vectors at adjacent angles
From the results in Table 2, the curve p-y is designed, where p is the sample attitude angle and the y axis represents the approximately one-dimensional distribution of samples of different attitude angles in image space. Differences along the y axis are proportional to the mean squared distance between the image-space pixel gray-level vectors of the experimental samples (normalized by the sum of the distances). After the sample distribution is normalized on the y axis, the curve p-y shown in Fig. 4 is obtained. The straight line in the figure is a reference line. The difference between the curve p-y and the reference line shows that the change of a sample in image space is not linear in its attitude angle: where the slope of the tangent to the curve is large (near an attitude angle of 0°), the change in image space with attitude angle is more drastic, so collecting samples of different attitude angles is more worthwhile in this region.
The dashed line in Fig. 5 is the polyline of Fig. 4. Fitting the dashed line in Fig. 5 with a piecewise parabola gives:
$k_p \approx 286$, $\mathrm{err} = 0.0063$ (4-4)
The angle evaluation function $f_p$ of the $n_t$ face image samples of target $t$ has the following mathematical meaning: the $n_t$ samples divide the one-dimensional distribution on which they lie (the solid line in Fig. 5) into $n_t + 1$ curve segments. The more samples of different attitude angles there are, and the more uniform the attitude angles, the smaller the sum of squared height differences between the end points of all curve segments, and the larger the evaluation function $f_p$. $f_{p0}$ is a constant that guarantees that $f_p$ is non-negative. The physical meaning of this evaluation function is that the deviation between sample attitude angles is reflected in the height difference between the corresponding points on the curve: the smaller the height difference, the smaller the attitude angle deviation, and the smaller the sum of all squared height differences, the more numerous and the more uniformly distributed the attitude angles.
The joint evaluation function of the $n_t$ samples of a target, formula (4-7), is computed by recursion. For the first sample,

$$f_t(p_1, r_1) = f_p(p_1) \cdot f_r(r_1),$$

and for $n_t = 2, 3, \ldots$ the value is computed by recursion. Here $f_t$ is the joint evaluation function of the $n_t$ face image samples of the $t$-th target; $f_p$ is the evaluation function of the attitude angle, computed from how uniformly and densely all attitude angles are distributed over the interval (-90°, 90°); $p_{n_t}$ is the attitude angle of the $n_t$-th face image sample; $f_r$ is the evaluation function of the resolution, computed from how high the resolution is; and $r_{n_t}$ is the resolution of the $n_t$-th face image sample.
This function has the following properties:
● The denser and more uniform the attitude angles, the larger $f_t$.
● The higher the resolution, the larger $f_t$.
(2) Perception and control of the facial image sample state
A computer program can automatically detect the resolution and attitude angle of the current facial image sample from the image information, and can change the resolution and attitude angle of the facial image sample by adjusting the camera state.
The attitude angle is perceived by computing the orientation of the target's face and the orientation of the camera, from which the orientation of the face relative to the camera, that is, the attitude angle of the face in the image, is obtained. The face orientation is assumed here to coincide with the target's direction of motion: the motion state of the target in the world coordinate system can be computed from the image information of the target, which gives its direction of motion. In Fig. 6, panel (A) depicts four cameras collecting image samples of two targets; panel (B) illustrates that the relative orientation of the face and the camera determines the orientation of the face in the image, independently of their relative position; and panel (C) illustrates how the orientation of the face relative to the camera is computed from the face orientation and the camera orientation. The attitude angle is controlled by changing the camera orientation, which changes the attitude angle of the face in the image. The resolution is perceived by a face detection program, which gives the pixel size of the face region in the image. The resolution is controlled by changing the focal length of the camera, which changes the size of the face in the image, that is, the resolution.
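A minimal Python sketch of this perception and control step follows. The sign convention (0° when the face looks straight into the camera), the pinhole projection model, and the nominal face width of 0.16 m are assumptions made for illustration; none of them is specified in the text.

```python
import math


def wrap_deg(angle):
    """Wrap an angle in degrees to the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def heading_deg(vx, vy):
    """Heading of a velocity vector in world coordinates, in degrees."""
    return math.degrees(math.atan2(vy, vx))


def attitude_angle(face_heading, camera_heading):
    """Face orientation relative to the camera.

    Assumed convention: 0 when the face points exactly against the
    camera's viewing direction. Only the two headings matter, not the
    relative position of face and camera.
    """
    return wrap_deg(face_heading - camera_heading - 180.0)


def face_width_px(focal_px, distance_m, face_width_m=0.16):
    """Pinhole model: width of the face region in pixels."""
    return focal_px * face_width_m / distance_m


def focal_for_width(target_px, distance_m, face_width_m=0.16):
    """Focal length (in pixels) needed to reach a target face width."""
    return target_px * distance_m / face_width_m


# A target walking in the -x direction, seen by a camera looking along +x.
print(attitude_angle(heading_deg(-1.0, 0.0), 0.0))
print(focal_for_width(40, distance_m=3.2))
```

The second helper pair shows the control direction: given the target distance, the focal length is chosen so that the face region reaches the desired width, for example the 40-pixel reference width used in part (1).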
(3) Optimization of the empirical evaluation function through multi-target assignment and state adjustment of the cameras
Based on parts (1) and (2), the empirical evaluation function is optimized through multi-target assignment of the cameras and adjustment of the camera states. The empirical evaluation function is the joint evaluation function of the resolution and angle of the facial image samples, $F_{p,r}$ (see part (1)), and is used to evaluate sample quality. The optimization variables are the position, orientation angles, and focal length of each camera, which affect the resolution and attitude angle of the facial image samples and therefore determine the evaluation function value after sample collection.
Optimization process: every Δt, all cameras perform target assignment and state adjustment to complete one round of target image sample collection.
Optimization variables, in order: the assignment relation $I(c,t)$ between cameras and targets; the position and orientation angles $L_c$ of each camera in the world coordinate system; and the focal length $f_c$ of each camera. Here $I(c,t)$ is an indicator function, $c \in \{1, 2, \ldots, N_c\}$ is the camera index, and $t \in \{1, 2, \ldots, N_t\}$ is the target index. $I(c,t)=1$ means that the $t$-th target is assigned to the $c$-th camera, and $I(c,t)=0$ means that it is not assigned. $L_c$ denotes the position, horizontal rotation angle, and pitch rotation angle of the $c$-th camera in the world coordinate system.
Optimization objective: maximize the overall empirical evaluation function $\mathrm{Sum}_t(f_t)$ of all targets, where the evaluation function $f_t$ of the $t$-th target is given by formula (4-7).
Optimization method: traverse $I(c,t)$, that is, the target assignment schemes. Under each assignment scheme, use a genetic algorithm to solve for $L_c$ and $f_c$ of each camera so that $f_t$ is maximized. The optimal assignment scheme is the one that yields the maximum $\mathrm{Sum}_t(f_t)$.
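To make the genetic-algorithm step concrete, here is a minimal Python sketch for a single mobile camera. Everything specific in it is hypothetical: the scene, the bounds, the `fitness` function (a stand-in for f_t built from a square-root resolution term and a Gaussian attitude-angle term), and the operators (tournament selection, blend crossover, Gaussian mutation, elitism). The text only states that a genetic algorithm is used, not how it is configured.

```python
import math
import random

# Hypothetical scene: one mobile camera, target at the centre of a 10 m x 10 m area.
BOUNDS = [(0.0, 10.0), (0.0, 10.0), (200.0, 2000.0)]  # x (m), y (m), focal length (px)
TARGET = (5.0, 5.0)
FACE_HEADING = 0.0  # degrees; face assumed to point along the motion direction
WANTED_POSE = 30.0  # degrees; attitude angle assumed missing from the sample set


def fitness(genome):
    """Stand-in for f_t: resolution term times attitude-angle term."""
    x, y, focal = genome
    dx, dy = x - TARGET[0], y - TARGET[1]
    dist = math.hypot(dx, dy)
    if dist < 0.5:
        return 0.0  # camera too close to the target
    bearing = math.degrees(math.atan2(dy, dx))  # direction target -> camera
    pose = (FACE_HEADING - bearing + 180.0) % 360.0 - 180.0
    if abs(pose) >= 90.0:
        return 0.0  # face not visible from this side
    width_px = min(focal * 0.16 / dist, 40.0)  # pinhole model, capped at 40 px
    return math.sqrt(width_px / 32.0) * math.exp(-((pose - WANTED_POSE) / 15.0) ** 2)


def tournament(pop, fits, rng, k=3):
    """Pick the fittest of k randomly chosen individuals."""
    picks = rng.sample(range(len(pop)), k)
    return pop[max(picks, key=lambda i: fits[i])]


def crossover(a, b, rng):
    """Blend crossover: a random convex combination of both parents."""
    w = rng.random()
    return [w * u + (1.0 - w) * v for u, v in zip(a, b)]


def mutate(genome, rng, rate=0.2):
    """Gaussian mutation, clamped to the variable bounds."""
    out = []
    for value, (lo, hi) in zip(genome, BOUNDS):
        if rng.random() < rate:
            value += rng.gauss(0.0, 0.1 * (hi - lo))
        out.append(max(lo, min(hi, value)))
    return out


def run_ga(pop_size=40, generations=60, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in BOUNDS] for _ in range(pop_size)]
    history = []  # best fitness per generation
    for _ in range(generations):
        fits = [fitness(g) for g in pop]
        best_i = max(range(pop_size), key=lambda i: fits[i])
        history.append(fits[best_i])
        children = [pop[best_i]]  # elitism: the best individual survives unchanged
        while len(children) < pop_size:
            parent_a = tournament(pop, fits, rng)
            parent_b = tournament(pop, fits, rng)
            children.append(mutate(crossover(parent_a, parent_b, rng), rng))
        pop = children
    fits = [fitness(g) for g in pop]
    best_i = max(range(pop_size), key=lambda i: fits[i])
    history.append(fits[best_i])
    return pop[best_i], fits[best_i], history


best, best_fit, history = run_ga()
print(best, best_fit)
```

Because of elitism, the best fitness never decreases from one generation to the next. In a real system the stand-in `fitness` would be replaced by the evaluation function of formula (4-7) computed for the sample that the camera state would produce.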
When a single target is collected, $I(c,t)$ is constant. Within the time Δt, the cameras carry out one round of sample collection. The acquisitions of the cameras are coupled, however, and the acquisition order affects the result: after one camera has finished collecting, the resulting sample set has changed, which affects the evaluation function computed for the next camera. If the solving order is traversed and the optimal order found, the optimal solution is obtained. If a random order is used, the solution is suboptimal. If several cameras solve simultaneously, redundancy and conflicts arise, and the solution is also suboptimal. Once the solving order is fixed, each camera solves for its best $L_c$ and $f_c$ so as to maximize the objective function $f_t$ of its assigned target.
In the multi-target case, if each camera is assigned only one target, $I(c,t)$ (the target assignment) must be traversed first and the acquisition order of the cameras second in order to obtain the optimal solution; in each traversed case, each camera maximizes the objective function $f_t$ of its assigned target. If a camera may be assigned several targets, $I(c,t)$ is likewise traversed first and the acquisition order second, and each camera maximizes the sum of the objective functions of all targets assigned to it, $\mathrm{Sum}_{t \in T_c}(f_t)$, where $T_c$ is the set of targets assigned to camera $c$.
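The double traversal (assignments first, acquisition orders second) can be sketched in Python as follows, for the case in which each camera is assigned exactly one target. The per-camera solver is replaced by a pick from a small list of reachable attitude angles, which stands in for the genetic algorithm, and `score` is a hypothetical stand-in for f_t; both are assumptions made for illustration.

```python
import math
from itertools import permutations, product


def score(attitude_angles):
    """Hypothetical stand-in for f_t (attitude-angle term only)."""
    heights = sorted(math.sin(math.radians(p)) for p in attitude_angles)
    knots = [-1.0] + heights + [1.0]
    return 4.0 - sum((b - a) ** 2 for a, b in zip(knots, knots[1:]))


def best_plan(candidates, n_targets):
    """Traverse all assignments and acquisition orders.

    candidates[c][t] lists the attitude angles camera c could capture of
    target t (a stand-in for the continuous search over L_c and f_c).
    Returns (total score, assignment, order).
    """
    n_cameras = len(candidates)
    best = (float("-inf"), None, None)
    for assign in product(range(n_targets), repeat=n_cameras):
        for order in permutations(range(n_cameras)):
            samples = [[] for _ in range(n_targets)]
            for c in order:
                t = assign[c]
                # Camera c maximizes f_t given what earlier cameras collected.
                pose = max(candidates[c][t], key=lambda p: score(samples[t] + [p]))
                samples[t].append(pose)
            total = sum(score(s) for s in samples)
            if total > best[0]:
                best = (total, assign, order)
    return best


# Two cameras, two targets; the angle lists are made up.
candidates = [
    [[-30.0, 0.0], [0.0]],   # camera 0: options for target 0, target 1
    [[30.0], [0.0, 30.0]],   # camera 1: options for target 0, target 1
]
total, assign, order = best_plan(candidates, n_targets=2)
print(total, assign, order)
```

The exhaustive traversal visits N_t^N_c assignments times N_c! orders, so it is practical only for small networks, which is consistent with the remark above that a random order gives a cheaper but suboptimal solution.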
Simulation experiment
To verify the effectiveness of the patented technique, and to provide preparation and a reference for deploying a real system and for physical experiments, a simulation of single-target sample collection was carried out, as shown in Fig. 7, to verify the feasibility of heterogeneous active acquisition driven by the evaluation function. With the task assignment mechanism described above, the approach also applies to the multi-target case.
The experiment simulates a two-dimensional space of 10 m × 10 m. A zoom camera on a PTZ (pan-tilt-zoom) mount is deployed at each of the four corners of the space, and a mobile camera that can move freely to any position is also present in the space. The target moves some distance through the space, during which the cameras can collect facial image samples of the target.
The following assumptions are made:
1) The face orientation of the target is consistent with its direction of motion.
2) Detection, localization, and tracking of the target are performed cooperatively by additional cameras (such as static cameras or panoramic cameras) or by the cameras that collect samples; these are collectively called the tracker. This is outside the focus and implementation scope of the simulation.
With Δt = 1 s as the time interval, the tracker estimates the position and orientation (that is, the velocity direction) of the target at the end of the interval. The cameras then adjust their states and collect samples so as to maximize the evaluation function. To simplify the search over camera states, the evaluation function described above is simplified as follows. A sample whose resolution is greater than the threshold $r_0$ is an acceptable sample. The density and uniformity of the distribution of the acceptable samples over the attitude angle interval, that is, the diversity of the attitude angles, determine the evaluation function value. The interval is drawn as a circle, as in Fig. 8, where each small hollow circle on the large circle represents a sample at a certain attitude angle. The interval is treated in this way to simplify the algebra of the evaluation function.
The simplified joint evaluation function $f$ is defined over the following quantities. The $N$ samples of a target have resolutions and attitude angles $(r_i, p_i)$, $i \in \{1, 2, 3, \ldots, N\}$. $p'_1, p'_2, \ldots, p'_M$ are the attitude angles, sorted in ascending order, of the $M$ samples among the $N$ whose resolution is greater than or equal to the threshold $r_0$.
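The filtering and sorting steps named above, together with the circular layout of Fig. 8, can be sketched in Python as follows. The scoring rule (one minus the sum of squared normalized gaps) and the circle length of 180° are assumptions made for illustration; the simplified function itself is not legible in the text.

```python
def accepted_angles(samples, r0):
    """Attitude angles of samples with resolution >= r0, in ascending order.

    samples is a list of (resolution, attitude_angle) pairs.
    """
    return sorted(p for r, p in samples if r >= r0)


def circular_gaps(sorted_angles, period=180.0):
    """Gaps between neighbouring angles when the interval is closed into a circle.

    period is the assumed length of the attitude angle interval in degrees.
    """
    if not sorted_angles:
        return [period]
    gaps = [b - a for a, b in zip(sorted_angles, sorted_angles[1:])]
    gaps.append(period - (sorted_angles[-1] - sorted_angles[0]))  # wrap-around gap
    return gaps


def simplified_score(samples, r0, period=180.0):
    """Hypothetical score: larger when accepted angles are many and evenly spread."""
    gaps = circular_gaps(accepted_angles(samples, r0), period)
    return 1.0 - sum((g / period) ** 2 for g in gaps)


spread = [(40, -60.0), (40, 0.0), (40, 60.0), (8, 30.0)]
clustered = [(40, -60.0), (40, -50.0), (40, -40.0)]
print(simplified_score(spread, r0=20), simplified_score(clustered, r0=20))
```

With this rule, M evenly spaced accepted samples score 1 - 1/M, a single sample scores 0, and low-resolution samples are ignored, which mirrors the described dependence on density and uniformity.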
Five acquisition methods were implemented in the simulation for comparison and analysis: static acquisition, in which the camera has a fixed position and a constant orientation; scanning acquisition, in which the camera has a fixed position and swings randomly at a constant speed of 10°/s; active acquisition, which aims to track the target and collect frontal face samples of the target; improved active acquisition, the acquisition method proposed by the patented technique, which aims to obtain samples with sufficient resolution and diverse attitude angles; and heterogeneous active acquisition, which adds a mobile camera to improved active acquisition.
In the experiment, the entry position of the target varies randomly over (0,1)~(0,9), the entry time is chosen randomly from t = 0~89 s, and the time interval Δt is 1 s. A total of 90 independent repeated experiments were carried out. Fig. 9 shows the averaged curves of the sample evaluation function as sample collection proceeds.
Because the statically acquiring cameras are fixed at a 45° angle to the region boundary, static acquisition has a better chance of forming a favorable attitude angle with the target than the scanning cameras, which sweep from 0° to 180°. In the simulation results, the sample evaluation function value of static acquisition is therefore better than that of scanning acquisition. Active acquisition and improved active acquisition perform similarly (explained below), and heterogeneous acquisition performs best.
Active acquisition aims to track the target and collect frontal samples of the target wherever possible. When the target's range of motion is large, the range of attitude angles that the lens can capture is also large, and acquisition often ends near the 0° boundary of the attitude angle range. The attitude angles of all samples finally obtained are therefore varied and evenly distributed, so the result happens to be close to that of the improved active acquisition method, which explicitly pursues varied and evenly distributed attitude angles. When the target's range of motion is small, as shown in Fig. 10, improved active acquisition is clearly better than active acquisition.
Fixed cameras and panoramic cameras detect and track the target and its face and provide target position information to the other cameras. Zoom cameras on PTZ mounts (with adjustable lens orientation and focal length) optimize the facial image sample collection process according to the optimization model proposed here. The mobile camera is deployed on a mobile robot; it can carry out the target tracking task and can also move into blind areas that the PTZ cameras cannot cover to collect facial image samples.
All cameras operate in a network environment and are controlled by computer programs. The software uses a multi-agent framework: each camera runs a subprogram that performs tasks such as target detection and tracking and facial image sample collection, and a separate coordinator handles the communication among cameras needed for target assignment.