CN102693417A

CN102693417A - Method for collecting and optimizing face image sample based on heterogeneous active visual network

Info

Publication number: CN102693417A
Application number: CN2012101522917A
Authority: CN
Inventors: 张涛; 李潇涵; 陈宋; 成宇; 陈学东; 孙昊; 李何羿
Original assignee: Tsinghua University
Current assignee: Tsinghua University
Priority date: 2012-05-16
Filing date: 2012-05-16
Publication date: 2012-09-26
Also published as: CN103310190A; CN103310190B

Abstract

A method for optimizing face image sample collection based on a heterogeneous active vision network. The optimization proceeds as follows: every Δt, all cameras perform target allocation and state adjustment to complete one round of target image sample collection. The optimization variables are the allocation relationship between cameras and targets, the position and orientation angles of each camera in the world coordinate system, and the focal length of each camera. The optimization goal is to maximize the overall empirical evaluation function over all targets. The target allocation schemes are traversed; under each scheme, a genetic algorithm solves for the optimization variables that maximize the target evaluation function, and the allocation scheme that yields the largest overall empirical evaluation function is selected. The invention uses active vision technology to improve the resolution and pose angle of the face in the image, and uses the heterogeneous active vision network to optimize the sample collection stage, obtaining face image samples with better resolution and richer pose angles for face registration and recognition.

Description

Method for optimizing face image sample collection based on a heterogeneous active vision network

Technical field

The invention belongs to the field of face recognition technology, and in particular relates to a method for optimizing face image sample collection based on a heterogeneous active vision network.

Background technology

Face recognition technology has good practical value and application prospects in both civil and military fields. It mainly comprises several technical stages: face image sample collection, sample image preprocessing, classifier training (also called face registration), and sample identification (also called face recognition). At present, the sample collection stage has received comparatively little research.

In human-robot interaction scenarios, there has been some research on the sample collection stage. This research typically uses the rotation and zoom of a camera mounted on a robot, together with the movement of the robot itself, to collect face image samples of suitable resolution and pose angle for registering and recognizing faces. Marc Hanheide et al., in "Who am I talking with? A Face Memory for Social Robots" (2008 IEEE International Conference on Robotics and Automation, Pasadena, CA, USA, May 19-23, 2008), proposed a human-robot interaction framework in which a mobile robot equipped with a pan-tilt camera can remember the people it has seen and converse with them. Do Joon Jung et al., in "Detection and Tracking of Face by a Walking Robot" (J.S. Marques et al. (Eds.): IbPRIA 2005, LNCS 3522, pp. 500-507, 2005), designed a robot that can detect and track faces in a dynamically changing environment and keeps the face at the center of the camera view through simple motion control of the robot. Chi-Yi Tsai et al., in "ROBUST FACE TRACKING CONTROL OF A MOBILE ROBOT USING SELF-TUNING KALMAN FILTER AND ECHO STATE NETWORK" (Asian Journal of Control, Vol. 12, No. 4, pp. 488-509, July 2010), used a dual Jacobian model to describe the spatial relationship between the world coordinate system and the image plane and the kinematic relationship between the robot and the target, and used a Kalman filtering algorithm to estimate and track the target position. T. Wilhelm et al., in "A multi-modal system for tracking and analyzing faces on a mobile robot" (Robotics and Autonomous Systems 48 (2004) 31-40), used a multi-modal system consisting of an omnidirectional camera, a laser sensor, and a mobile robot to track and analyze targets and their faces.

Unlike human-robot interaction, in surveillance or military applications the target often does not appear within a suitable distance and angular range of the camera in the desired ideal pose. A single camera, limited by its field of view, is not sufficient to complete face image sample collection for the target. Moreover, filtering algorithms such as the one described in "ROBUST FACE TRACKING CONTROL OF A MOBILE ROBOT USING SELF-TUNING KALMAN FILTER AND ECHO STATE NETWORK" (Asian Journal of Control, Vol. 12, No. 4, pp. 488-509, July 2010) have difficulty handling situations such as abrupt jumps in target position, which can cause the target to be lost. A camera network can therefore be considered: it widens the field of view and exploits the visual redundancy of multiple cameras to improve robustness. Because the selected cameras should be capable of both camera motion and lens parameter adjustment, the network is called an active vision network. The cameras may also be of different types, in which case the network is called heterogeneous; the different functional characteristics of the cameras complement one another. In addition, simple control of the cameras is not enough to guarantee that the collected face image samples are good enough to be actually usable for face registration or recognition. An evaluation function must be designed according to the requirements that face registration or recognition places on face image samples, so that the cameras can be controlled quantitatively and precisely. James N.K. Liu et al., in "iBotGuard: An Internet-Based Intelligent Robot Security System Using Invariant Face Recognition Against Intruder" (IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, Vol. 35, No. 1, February 2005), designed an Internet-based intelligent robot security system that uses face recognition to monitor intruders. However, because the designed system framework is relatively simple, gives little consideration to the linking and integration of its technical modules, and studies the face recognition technology under the framework only one-sidedly, it is not adequate for use in real environments. Systematic and careful architecture design, together with research on the key technologies and their integration, is therefore very important.

In surveillance or military applications, targets are usually non-cooperative and will not appear within a suitable distance and viewing angle of the camera in the desired pose. The face image samples collected in such cases often have insufficient resolution; the pose angle may not be frontal enough, or the range of distinct pose angles may not be rich enough. Such face image samples perform poorly in face registration and recognition.

Summary of the invention

To overcome the above shortcomings of the prior art, the object of the invention is to provide a method for optimizing face image sample collection based on a heterogeneous active vision network. Using a heterogeneous active vision network and an empirical evaluation function for face image sample quality, the method optimizes the sample collection stage through multi-target allocation of cameras and adjustment of camera states, obtaining face image samples with better resolution and richer pose angles for face registration and recognition.

To achieve this object, the technical solution adopted by the invention is:

A method for optimizing face image sample collection based on a heterogeneous active vision network, comprising the following:

The optimization process is: every Δt, all cameras perform target allocation and state adjustment to complete one round of target image sample collection;

The optimization variables are, in order: the allocation relationship I(c, t) between cameras and targets; the position and orientation angles L_c of each camera in the world coordinate system; and the focal length f_c of each camera. Here I(c, t) is an indicator function; c is the camera index, c ∈ {1, 2, ..., N_c}; t is the target index, t ∈ {1, 2, ..., N_t}; I(c, t) = 1 means that target t is allocated to camera c, and I(c, t) = 0 means that it is not. L_c denotes the position, horizontal rotation angle, and pitch rotation angle of camera c in the world coordinate system;

Optimization objective: maximize the overall empirical evaluation function Sum_t(f_t) over all targets, where the evaluation function f_t of target t is as follows:

f_{t}(p_{n_{t}}, r_{n_{t}}, \ldots, p_{1}, r_{1}) = f_{t}(p_{n_{t}-1}, r_{n_{t}-1}, \ldots, p_{1}, r_{1}) + \left( f_{p}(p_{1}, \ldots, p_{n_{t}}) - f_{p}(p_{1}, \ldots, p_{n_{t}-1}) \right) \cdot f_{r}(r_{n_{t}}), \quad n_{t} = 2, 3, \ldots

f_{t}(p_{1}, r_{1}) = f_{p}(p_{1}) \cdot f_{r}(r_{1})

Here f_t is the joint evaluation function of the n_t face image samples of target t, computed recursively. f_p is the evaluation function for pose angle, computed from how uniformly and densely all pose angles are distributed over the interval (-90°, 90°); p_{n_t} denotes the pose angle of the n_t-th face image sample. f_r is the evaluation function for resolution, computed from how high the resolution is; r_{n_t} denotes the resolution of the n_t-th face image sample.

The optimization method is: traverse I(c, t), i.e. the target allocation schemes; under each allocation scheme, use a genetic algorithm to solve for each camera's L_c and f_c so that f_t is maximized; then select the best allocation scheme, which yields the maximum Sum_t(f_t).
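The inner step of this method, solving for one camera's orientation and focal length with a genetic algorithm, might look roughly like the sketch below. It is only an illustration under several assumptions that are not in the source: a planar pinhole camera with a 640-pixel-wide image, a 0.15 m face width, a focal length range of 200 to 2000 pixels, pose (attitude) angle 0 when the face looks straight back along the optical axis, the interval ends at ±90° used as segment boundaries of the pose score, and hypothetical names (`fitness`, `genetic_search`). The score is the single-sample product of the pose and resolution scores, using the curve as printed in formula (4-3) of the embodiment.

```python
import math
import random

def wrap_deg(a):
    """Wrap an angle in degrees into [-180, 180)."""
    return (a + 180.0) % 360.0 - 180.0

def fitness(state, cam_xy, tgt_xy, face_heading):
    """Toy single-sample score f_p * f_r; 0 if no usable face image results."""
    heading, focal_px = state
    dx, dy = tgt_xy[0] - cam_xy[0], tgt_xy[1] - cam_xy[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        return 0.0
    bearing = math.degrees(math.atan2(dy, dx))
    half_fov = math.degrees(math.atan(320.0 / focal_px))  # assumed 640 px image
    if abs(wrap_deg(bearing - heading)) > half_fov:
        return 0.0                                        # target outside the view
    pose = wrap_deg(face_heading - (heading + 180.0))     # assumed convention
    if abs(pose) >= 90.0:
        return 0.0                                        # face turned away
    r = min(focal_px * 0.15 / dist, 40.0)                 # assumed pinhole model
    g = lambda p: math.copysign(math.sqrt(abs(p)) / 286.0, p)
    h = [g(-90.0), g(pose), g(90.0)]
    f_p = 2.0 - sum((b - a) ** 2 for a, b in zip(h, h[1:]))
    return f_p * math.sqrt(r / 32.0)

def genetic_search(fit, seed_state, generations=40, size=30, rng=None):
    """Small GA over (heading_deg, focal_px) with elitism and tournaments."""
    rng = rng or random.Random(0)
    pop = [seed_state] + [(rng.uniform(-180.0, 180.0), rng.uniform(200.0, 2000.0))
                          for _ in range(size - 1)]
    for _ in range(generations):
        pop.sort(key=fit, reverse=True)
        nxt = pop[:2]                                     # keep the two best
        while len(nxt) < size:
            a = max(rng.sample(pop, 3), key=fit)          # tournament selection
            b = max(rng.sample(pop, 3), key=fit)
            w = rng.random()                              # blend crossover
            heading = a[0] + w * wrap_deg(b[0] - a[0]) + rng.gauss(0.0, 5.0)
            focal = a[1] + w * (b[1] - a[1]) + rng.gauss(0.0, 50.0)
            nxt.append((wrap_deg(heading), min(max(focal, 200.0), 2000.0)))
        pop = nxt
    return max(pop, key=fit)

cam, tgt, face = (0.0, 0.0), (5.0, 0.0), 150.0
fit = lambda s: fitness(s, cam, tgt, face)
seed = (0.0, 200.0)   # aim straight at the target with the widest view
best = genetic_search(fit, seed)
print(best, fit(best))
```

Seeding the population with a state that already sees the target, and keeping the best individuals each generation, means the result is never worse than that starting state. A real deployment would also search over camera position and pitch, which this sketch leaves out.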

When a single target is being collected, I(c, t) is constant. Within each Δt, the cameras carry out one round of sample collection one after another. After one camera has collected a sample, the target's set of image samples has changed, which affects the evaluation function computed when the next camera collects and therefore affects its collection behavior. The solution order therefore needs to be traversed to find the best order and obtain the optimal solution, although a random order is also acceptable.

When multiple targets are being collected and each camera is allocated only one target, traverse I(c, t), i.e. the target allocation schemes, and then traverse the acquisition orders of the cameras. In each order, each camera maximizes the evaluation function f_t of the target allocated to it. This determines the best allocation scheme, the best acquisition order, and the best camera states.

When multiple targets are being collected and a camera may be allocated several targets, again traverse I(c, t) and then the acquisition orders of the cameras. Each camera maximizes the sum of the objective functions of all targets allocated to it, i.e. Sum_{t ∈ T_c}(f_t), where T_c is the set of targets allocated to camera c. This determines the best allocation scheme, the best acquisition order, and the best camera states.

Compared with the prior art, the invention draws on existing research on face recognition technology and systems and recognizes that, among the indices of face image samples that affect face registration and recognition, the two most important are the resolution and the pose angle of the face in the image. The invention improves these two indices with active vision technology. A heterogeneous active vision network is a novel, advanced active vision technology with features such as active collection, task cooperation, and functional complementarity. It can obtain face image samples of non-cooperative targets more effectively than traditional equipment and techniques such as single-camera collection, static cameras, and scanning cameras.

Building on existing research on face recognition technology and systems, the invention carries out further image analysis and mathematical analysis and establishes an empirical evaluation function for face image sample quality (mainly related to resolution and pose angle). With maximizing the empirical evaluation function value as the goal, it uses a heterogeneous active vision network to optimize the sample collection stage and obtains face image samples with better resolution and richer pose angles for face registration and recognition. The technique is mainly intended for surveillance or military scenarios, where face image samples of non-cooperative targets are collected; the targets can also be generalized further to non-human targets.

Description of drawings

Fig. 1 is a schematic of samples with different original resolutions scaled to the same resolution; from left to right, original resolutions 4 × 6, 8 × 12, 20 × 30, and 40 × 60, each scaled to the reference resolution 40 × 60.

Fig. 2 shows the separability images, on the amplitude component of the frequency domain, of images with different original resolutions; from left to right, original resolutions 4 × 6, 8 × 12, 20 × 30, and 40 × 60, each scaled to the reference resolution 40 × 60. The brighter a point, the stronger the sample separability at that original resolution.

Fig. 3 is a schematic of the relationship between the resolution r and the evaluation function value f_r. The horizontal axis is r and the vertical axis is f_r. The polyline shows the resolution evaluation function values of face sample images at several given resolutions; the smooth curve is the result of fitting the polyline with a quadratic curve.

Fig. 4 depicts the approximate one-dimensional distribution in image space of samples at different pose angles. The polyline shows the approximate one-dimensional distribution in image space at several specified pose angles; the straight line is a reference line (image space uniformly distributed against pose angle). The comparison shows that the approximate one-dimensional distribution of samples at different pose angles in image space is not uniform.

Fig. 5 shows the approximate one-dimensional distribution in image space of samples at different pose angles, together with the analytic curve obtained by fitting. The dashed line is the polyline of Fig. 4, and the solid line is the result of fitting the dashed line with a quadratic curve.

Fig. 6 is a schematic of pose angle perception. Panel (A) shows four cameras collecting image samples of two targets. Panel (B) shows that the relative orientation of the face and the camera determines the orientation of the face in the image, independent of their relative positions. Panel (C) shows how the relative orientation of the face to the camera is computed from the face orientation and the camera orientation.

Fig. 7 is a schematic of the simulation experiment of the invention.

Fig. 8 shows the pose-angle interval; each small open circle represents a sample at a certain pose angle.

Fig. 9 is a comparison of the different collection methods: triangles represent heterogeneous collection, plus signs represent improved active collection, circles represent active collection, asterisks represent scanning collection, and diamonds represent static collection.

Fig. 10 is a schematic of small-range target motion.

Fig. 11 shows that, when the target moves within a small range, improved active collection outperforms active collection.

Fig. 12 shows the pose angle distribution of the samples obtained by active collection when the target moves within a small range.

Fig. 13 shows the pose angle distribution of the samples obtained by improved active collection when the target moves within a small range.

Embodiment

The invention is described in further detail below with reference to the accompanying drawings and embodiments.

The embodiment is described in three parts.

(1) Evaluation criteria for face image samples

Indices of face image samples that affect face registration and recognition include image compression ratio, target distance, target expression, and so on. With reference to research in the face recognition field and to the indices that active vision technology can change, this method selects two indices, resolution and pose angle: the resolution of the face region and the pose angle of the face in the image. The face region is generally rectangular, for example 40 × 60 (width in pixels × length in pixels), and can be described by the width pixel value r = 40. The pose angle p of the face is described, from left to right, by an angle from -90° to +90°.

The empirical evaluation function of a face image sample evaluates resolution and pose angle separately and then builds a joint evaluation function over the two indices. In existing methods, registering a target usually requires many face image samples at different pose angles, to improve the system's ability to recognize the target at different pose angles after registration. When recognizing a target, the more face image samples of the target are collected at different pose angles, the more information is available for recognition and the higher the recognition accuracy. Results of face recognition system evaluation experiments show that the higher the resolution of the face region, the better the recognition performance, and that the larger the pose-angle deviation between the face image samples used in registration and in recognition, the lower the recognition accuracy. The invention therefore concludes: in face image samples, the higher the resolution of the face region and the richer and more uniformly distributed the orientation angles of the face in the image (the pose angles), the better the performance of face registration and recognition, so the larger the evaluation function value should be. In addition, the performance of face registration and recognition gradually saturates as resolution and the number of pose angles increase and no longer grows significantly, so the evaluation function must have the corresponding property.

In practice, collected face image samples usually have various resolutions and must be scaled to the same reference resolution before face registration or recognition. If the original resolution is lower than the reference resolution, image quality degrades after enlargement to the reference resolution. What matters is therefore the original resolution at collection time, which determines the quality of the image on the resolution index.

The experiment uses the FERET face database (sample sets ba to bj). The sample set contains face images of 194 targets (people) at different angles. In the experiment, the face image samples of the 194 targets are reduced to original resolutions of 4 × 6, 8 × 12, 20 × 30, and 40 × 60 respectively, then uniformly enlarged to the reference resolution 40 × 60 before image processing and mathematical analysis, as in Fig. 1. A two-dimensional Fourier transform is then applied. Of the four components after the transform (real part, imaginary part, amplitude, and phase angle), the amplitude is chosen because the samples are most separable on the amplitude component. Fig. 2 is obtained as follows: after the samples of each original resolution are Fourier transformed (from the W × L-dimensional image space to the W × L-dimensional frequency-domain space), the separability value of all samples is computed at each point of the frequency domain, and the value at each point is converted to a gray value (log10 is taken and the result multiplied by 50, so that gray values fall between 0 and 255 and the results for different resolutions contrast clearly). The brighter a point, the better the sample separability at that original resolution.

Each image sample yields a 40 × 60, i.e. 2400-dimensional, sample of the frequency-domain amplitude component. All face image samples of one target are regarded as one class, and different targets are different classes. For each original resolution, the separability (a notion similar to variance) of the frequency-domain amplitude samples is summed over all sample dimensions; the log10 of the sum is taken and multiplied by 50 (for consistency with the processing described above), and the result is used as the evaluation function value for that resolution, as shown in Table 1. Separability determines how well the samples perform in face registration and recognition: the larger the separability, the better the performance.

Table 1: Sample evaluation results

Original resolution	4×0	4×6	8×12	20×30	40×60
Evaluation function value	0	726	1090	1519	1790

From Table 1, the evaluation function curve for resolution is obtained, shown as the polyline in Fig. 3. The horizontal axis is the sample resolution r (expressed as the pixel width of the face region) and the vertical axis is the evaluation function value f_r. The smoother curve in Fig. 3 is the result of fitting the polyline, using an empirically chosen regression model:

r = k_{r} \times f_{r}^{2} \qquad (4\text{-}1)

f_{r} = \sqrt{r / k_{r}} \qquad (4\text{-}2)

Nonlinear regression gives the parameter k_r = 32 with a regression squared error of 0.0319, and the ranges f_r ∈ [0, 1.1180], r ∈ [0, 40]. As Fig. 3 shows, the evaluation function value increases with resolution while its growth rate steadily decays, exhibiting saturation. This matches the empirical behavior of face registration and recognition performance as a function of image sample resolution.

Current face registration and recognition usually use image samples at multiple pose angles. When a face rotates horizontally in front of the lens, the resulting change in pose angle is a one-dimensional change in the original geometric space. It changes the gray values of all pixels of the image (of length L and width W), which is a change in the W × L-dimensional image space. This change in image space is a one-dimensional change on an approximate manifold; see Joshua B. Tenenbaum et al., "A Global Geometric Framework for Nonlinear Dimensionality Reduction", Science, Vol. 290, 22 December 2000.

A common approach to face registration and recognition compares, in image space, the test sample to be classified with the training samples in the database and then identifies it by matching. The training samples come from the face registration process, and the test samples come from the face recognition process. The more training samples of a target are collected at different pose angles, and the more uniformly the pose angles are distributed, the more training samples of the same target lie near a test sample of that target at a given pose angle in image space, and the less likely the test sample is to be misclassified.

Following the two preceding paragraphs, we use the Euclidean distance between the pixel vectors of face image samples at adjacent pose angles to characterize the approximate one-dimensional variation of samples with pose angle in image space. Sample quality is evaluated, and the evaluation function designed, according to how dense and uniform the sample distribution is along this approximate one-dimensional variation.

Face image samples of 194 people at a resolution of 40 × 60 and at 9 pose angles from left to right (0°, ±15°, ±25°, ±40°, ±60°) are selected for image processing and mathematical analysis. From -60° to +60°, the mean squared distance in image space between the pixel gray-level vectors of samples at each pair of adjacent angles (divided by 100 to reduce the magnitude) is shown in Table 2. Because the FERET sample set lacks suitable samples, the analysis does not cover the full range from -90° to +90°.

Table 2: Euclidean distances between sample vectors at adjacent angles

From the results in Table 2, the curve p-y is constructed, where p is the sample pose angle and the y axis represents the approximate one-dimensional distribution in image space of samples at different pose angles. Differences along the y axis are proportional to the mean squared distance between the image-space pixel gray-level vectors of the experimental samples (after normalizing by the sum of the distances). Placing the normalized sample distribution on the y axis gives the curve p-y shown in Fig. 4. The straight line in the figure is a reference line. The difference between the curve p-y and the reference line shows that the change of a sample in image space is not linear in the change of its pose angle: where the tangent slope of the curve is larger (near the 0° pose angle), the change in image space with pose angle is more rapid, so collecting samples at different pose angles is more worthwhile in this region.

The dashed line in Fig. 5 is the polyline of Fig. 4. Fitting the dashed line in Fig. 5 with a piecewise parabola gives:

y = g(p) = \begin{cases} \sqrt{p} / k_{p}, & p \geq 0 \\ -\sqrt{-p} / k_{p}, & p < 0 \end{cases} \qquad (4\text{-}3)

k_{p} \approx 286, \quad \mathrm{err} = 0.0063 \qquad (4\text{-}4)

The angle evaluation function for the n_t face image samples of target t is:

f_{p}(p_{1}, \ldots, p_{n_{t}}) = f_{p0} - \sum_{i=1}^{n_{t}+1} dz_{i}^{2}, \quad f_{p0} = 2 \qquad (4\text{-}5)

dz_{i} = \begin{cases} y_{i} - y_{i-1}, & 2 \leq i \leq n_{t} \\ y_{1} = 0, & i = 1 \end{cases} \qquad (4\text{-}6)

Mathematically, the n_t samples divide the one-dimensional distribution on which they lie (the solid line in Fig. 5) into n_t + 1 curve segments. The more samples there are at different pose angles and the more uniform the pose angles, the smaller the sum of squared height differences between the endpoints of all curve segments, and the larger the evaluation function f_p. f_p0 is a constant that guarantees f_p is non-negative. The physical meaning of this evaluation function is that the deviation between sample pose angles is reflected in the height difference between the corresponding points on the curve: the smaller the height difference, the smaller the pose angle deviation, and the smaller the sum of all squared height differences, the more pose angles are represented and the more uniformly they are distributed.
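The pose (attitude) angle score described above can be sketched in a few lines. The names `pose_curve` and `pose_score` are hypothetical. The curve follows formula (4-3) as printed, made odd-symmetric about 0°, and the sketch assumes that the interval ends at -90° and +90° close the first and last of the n_t + 1 segments, since the source does not spell out the boundary terms.

```python
import math

K_P = 286.0   # fitted parameter k_p from (4-4)
F_P0 = 2.0    # constant f_p0 from (4-5)

def pose_curve(p, k_p=K_P):
    """Fitted curve y = g(p) of (4-3), odd-symmetric about p = 0 (degrees)."""
    return math.copysign(math.sqrt(abs(p)) / k_p, p)

def pose_score(angles, k_p=K_P, f_p0=F_P0, p_min=-90.0, p_max=90.0):
    """Pose evaluation f_p = f_p0 - sum(dz_i^2) over the curve segments.

    Assumption: the interval ends p_min and p_max close the first and
    last segment, giving n + 1 segments for n samples.
    """
    ys = [pose_curve(p, k_p) for p in sorted(angles)]
    heights = [pose_curve(p_min, k_p)] + ys + [pose_curve(p_max, k_p)]
    return f_p0 - sum((b - a) ** 2 for a, b in zip(heights, heights[1:]))

spread = pose_score([-45.0, 0.0, 45.0])
clustered = pose_score([40.0, 45.0, 50.0])
print(spread > clustered)
```

With these assumptions, three samples spread over the interval score higher than three samples clustered near 45°, and adding a sample at a new angle never lowers the score, because splitting a segment can only reduce the sum of squared height differences.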

The joint evaluation function for the n_t samples of a target is computed recursively, as in formula (4-7):

f_{t}(p_{n_{t}}, r_{n_{t}}, \ldots, p_{1}, r_{1}) = f_{t}(p_{n_{t}-1}, r_{n_{t}-1}, \ldots, p_{1}, r_{1}) + \left( f_{p}(p_{1}, \ldots, p_{n_{t}}) - f_{p}(p_{1}, \ldots, p_{n_{t}-1}) \right) \cdot f_{r}(r_{n_{t}}), \quad n_{t} = 2, 3, \ldots \qquad (4\text{-}7)

f_{t}(p_{1}, r_{1}) = f_{p}(p_{1}) \cdot f_{r}(r_{1})

Here f_t is the joint evaluation function of the n_t face image samples of target t, computed recursively. f_p is the evaluation function for pose angle, computed from how uniformly and densely all pose angles are distributed over the interval (-90°, 90°); p_{n_t} denotes the pose angle of the n_t-th face image sample. f_r is the evaluation function for resolution, computed from how high the resolution is; r_{n_t} denotes the resolution of the n_t-th face image sample.

This function has the following properties:

● The denser and more uniform the pose angles, the larger f_t

● The higher the resolution, the larger f_t
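The recursion of formula (4-7) can be sketched as follows. The helper names are hypothetical, the pose score uses the curve of (4-3) as printed with the interval ends at ±90° assumed as segment boundaries, and returning 0 for an empty sample list is a convention chosen here, not something the source states.

```python
import math

def f_r(r, k_r=32.0):
    """Resolution score of (4-2)."""
    return math.sqrt(max(r, 0.0) / k_r)

def f_p(angles, k_p=286.0, f_p0=2.0):
    """Pose score of (4-5); interval ends at +/-90 degrees assumed as boundaries."""
    g = lambda p: math.copysign(math.sqrt(abs(p)) / k_p, p)
    h = [g(-90.0)] + [g(p) for p in sorted(angles)] + [g(90.0)]
    return f_p0 - sum((b - a) ** 2 for a, b in zip(h, h[1:]))

def joint_score(samples):
    """Joint score f_t for samples given as (pose_deg, resolution_px) in order."""
    if not samples:
        return 0.0                       # convention of this sketch
    p1, r1 = samples[0]
    total = f_p([p1]) * f_r(r1)          # base case f_t(p_1, r_1)
    angles = [p1]
    for p, r in samples[1:]:
        gain = f_p(angles + [p]) - f_p(angles)
        total += gain * f_r(r)           # pose gain weighted by resolution
        angles.append(p)
    return total

print(joint_score([(0.0, 40.0), (30.0, 40.0)]) > joint_score([(0.0, 40.0), (30.0, 10.0)]))
```

Each new sample contributes its improvement in pose coverage weighted by its own resolution score, so a sample at a new angle helps more when it is sharp, and a sample at an angle that is already covered adds nothing.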

(2) Perception and control of the face image sample state

From the image information, a computer program can automatically detect the resolution and pose angle of the current face image sample and, by adjusting the camera state, change the resolution and pose angle of the face image sample.

Pose angle perception works by computing the orientation of the target's face and the orientation of the camera, and from them the relative orientation of the face to the camera, i.e. the pose angle of the face in the image. Here the target's face orientation is assumed to coincide with the target's direction of motion; from the image information of the target, the motion state of the target in the world coordinate system can be computed, which gives its direction of motion. As shown in Fig. 6, panel (A) shows four cameras collecting image samples of two targets; panel (B) shows that the relative orientation of the face and the camera determines the orientation of the face in the image, independent of their relative positions; panel (C) shows how the relative orientation of the face to the camera is computed from the face orientation and the camera orientation. Pose angle control works by changing the orientation of the camera, which changes the pose angle of the face in the image. Resolution perception obtains the pixel size of the face region in the image from a face detection program. Resolution control works by changing the focal length of the camera, which changes the size of the face in the image, i.e. the resolution.
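The pose (attitude) angle perception step can be illustrated with a short sketch. The function names are hypothetical, and the sign and zero conventions are assumptions: angles are planar headings in degrees, and a pose angle of 0 means the face looks straight back along the camera's optical axis.

```python
import math

def wrap_deg(a):
    """Wrap an angle in degrees into the interval [-180, 180)."""
    return (a + 180.0) % 360.0 - 180.0

def face_heading_deg(vx, vy):
    """Face orientation, taken as the target's motion direction in the world frame."""
    return math.degrees(math.atan2(vy, vx))

def pose_angle_deg(face_heading, camera_heading):
    """Relative orientation of the face to the camera.

    Convention assumed here: 0 means the face looks straight back along
    the camera's optical axis, i.e. face_heading == camera_heading + 180.
    Only the two headings are used, not the relative position.
    """
    return wrap_deg(face_heading - (camera_heading + 180.0))

# A target walking in the -x direction, seen by a camera looking along +x.
print(pose_angle_deg(face_heading_deg(-1.0, 0.0), 0.0))
```

Because only the two headings enter the computation, turning the camera changes the pose angle while moving the camera without turning it does not, which is the property panel (B) of Fig. 6 describes.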

(3) Optimizing the empirical evaluation function through multi-target camera allocation and camera state adjustment

Based on parts (1) and (2), the empirical evaluation function is optimized through multi-target allocation of cameras and adjustment of camera states. The empirical evaluation function is the joint evaluation function of the resolution and angle of the face image samples, i.e. F_{p, r} (see part (1)), and is used to evaluate sample quality. The optimization variables are the position and orientation angles of each camera and its focal length, which affect the resolution and pose angle of the face image samples and hence determine the evaluation function value after collection.

L_c denotes the position, horizontal rotation angle, and pitch rotation angle of camera c in the world coordinate system;

Optimization objective: maximize the overall empirical evaluation function Sum_t(f_t) over all targets, where the evaluation function f_t of target t is given by formula (4-7).

When a single target is being collected, I(c, t) is constant. Within each Δt, the cameras carry out one round of sample collection. The collections of different cameras are coupled, however, and the acquisition order of the cameras affects the result: after one camera completes its collection, the resulting sample set has changed, which affects the evaluation function computed when the next camera collects. If the solution order is traversed to find the best order, the optimal solution is obtained. If a random order is used, the result is a suboptimal solution. If several cameras are solved simultaneously, redundancy and conflicts arise, which also gives a suboptimal solution. Once the solution order is fixed, each camera solves for its best L_c and f_c so as to maximize the objective function value f_t of the target allocated to it.
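The order traversal for a single target can be sketched as below. This is an illustration under assumptions, not the patented solver: each camera's state search is replaced by a short list of candidate samples (pose angle, resolution) that the camera could obtain, the names `run_order` and `best_order` are hypothetical, and the scores follow (4-2), (4-5) with ±90° boundaries assumed, and the recursion (4-7).

```python
from itertools import permutations
import math

def f_r(r, k_r=32.0):
    """Resolution score of (4-2)."""
    return math.sqrt(max(r, 0.0) / k_r)

def f_p(angles, k_p=286.0, f_p0=2.0):
    """Pose score of (4-5); interval ends at +/-90 degrees assumed as boundaries."""
    g = lambda p: math.copysign(math.sqrt(abs(p)) / k_p, p)
    h = [g(-90.0)] + [g(p) for p in sorted(angles)] + [g(90.0)]
    return f_p0 - sum((b - a) ** 2 for a, b in zip(h, h[1:]))

def increment(angles, p, r):
    """Change in f_t from adding sample (p, r), following recursion (4-7)."""
    if not angles:
        return f_p([p]) * f_r(r)
    return (f_p(angles + [p]) - f_p(angles)) * f_r(r)

def run_order(order, options):
    """Cameras collect one after another; each picks its best candidate sample."""
    angles, total = [], 0.0
    for cam in order:
        p, r = max(options[cam], key=lambda s: increment(angles, s[0], s[1]))
        total += increment(angles, p, r)
        angles.append(p)
    return total

def best_order(options):
    """Traverse every acquisition order and keep the one with the largest f_t."""
    return max(permutations(options), key=lambda order: run_order(order, options))

# Hypothetical candidate samples (pose_deg, resolution_px) per camera.
options = {
    "corner_cam": [(0.0, 20.0), (30.0, 12.0)],
    "mobile_cam": [(0.0, 40.0), (-45.0, 25.0)],
}
order = best_order(options)
print(order, round(run_order(order, options), 4))
```

In this toy setup the best order lets the mobile camera collect first, because the first sample enters f_t through the product f_p · f_r and the mobile camera can supply the sharpest frontal view. Full traversal grows factorially with the number of cameras, which is one reason a random order may be an acceptable compromise.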

In the multi-target case, if each camera is allocated only one target, I(c, t), i.e. the target allocations, must be traversed, and then the acquisition orders of the cameras, to obtain the optimal solution. In each case of the traversal, each camera maximizes the objective function f_t of the target allocated to it. If a camera may be allocated several targets, again traverse I(c, t) and then the acquisition orders of the cameras; each camera then maximizes the sum of the objective functions of all targets allocated to it, i.e. Sum_{t ∈ T_c}(f_t), where T_c is the set of targets allocated to camera c.
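For the case where each camera is allocated exactly one target, the double traversal over allocations and acquisition orders can be sketched as follows. As before, this is a hedged illustration: the candidate samples per (camera, target) pair stand in for the camera state search, and the names `evaluate` and `best_plan` are hypothetical.

```python
from itertools import permutations, product
import math

def f_r(r, k_r=32.0):
    """Resolution score of (4-2)."""
    return math.sqrt(max(r, 0.0) / k_r)

def f_p(angles, k_p=286.0, f_p0=2.0):
    """Pose score of (4-5); interval ends at +/-90 degrees assumed as boundaries."""
    g = lambda p: math.copysign(math.sqrt(abs(p)) / k_p, p)
    h = [g(-90.0)] + [g(p) for p in sorted(angles)] + [g(90.0)]
    return f_p0 - sum((b - a) ** 2 for a, b in zip(h, h[1:]))

def increment(angles, p, r):
    """Change in f_t from adding sample (p, r), following recursion (4-7)."""
    if not angles:
        return f_p([p]) * f_r(r)
    return (f_p(angles + [p]) - f_p(angles)) * f_r(r)

def evaluate(assignment, order, options):
    """Sum_t(f_t) for one allocation I(c, t) and one acquisition order."""
    angles = {}                      # target -> pose angles collected so far
    total = 0.0
    for cam in order:
        tgt = assignment[cam]
        seen = angles.setdefault(tgt, [])
        p, r = max(options[cam, tgt], key=lambda s: increment(seen, *s))
        total += increment(seen, p, r)
        seen.append(p)
    return total

def best_plan(cameras, targets, options):
    """Traverse all allocations, then all orders; return (score, allocation, order)."""
    best = None
    for choice in product(targets, repeat=len(cameras)):
        assignment = dict(zip(cameras, choice))
        for order in permutations(cameras):
            score = evaluate(assignment, order, options)
            if best is None or score > best[0]:
                best = (score, assignment, order)
    return best

# Hypothetical candidate samples (pose_deg, resolution_px) per camera and target.
cameras, targets = ["A", "B"], ["t1", "t2"]
options = {
    ("A", "t1"): [(0.0, 40.0)], ("A", "t2"): [(10.0, 10.0)],
    ("B", "t1"): [(20.0, 15.0)], ("B", "t2"): [(-30.0, 30.0)],
}
score, assignment, order = best_plan(cameras, targets, options)
print(assignment, order, round(score, 4))
```

The search space has N_t^N_c allocations times N_c! orders, so exhaustive traversal as written is only practical for small networks such as the five-camera simulation described below.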

Simulation experiment

To verify the effectiveness of the patented technique, and to provide preparation and reference for physical experiments and for deploying a real system, a simulation of single-target sample collection was carried out, as shown in Fig. 7, to verify the feasibility of heterogeneous active collection driven by the evaluation function. With the task assignment mechanism described above, the approach also applies to the multi-target case.

The experiment simulates a two-dimensional space of 10 m × 10 m. A zoom-capable PTZ (pan-tilt-zoom) camera is deployed at each of the four corners of the space, and a mobile camera in the space can move freely to any position. The target moves some distance through the space, during which the cameras collect face image samples of the target.

The following assumptions are made:

1) The orientation of the target's face is consistent with its direction of motion.

2) Detection, localization, and tracking of the target are performed by additional cameras (such as static cameras or panoramic cameras), or cooperatively by the sample-collecting cameras; these are collectively referred to as the tracker. This part is outside the scope considered and implemented in this simulation.

With Δt = 1 s as the time interval, the tracker estimates the target's position and orientation (i.e. its velocity direction) at the end of each interval. The cameras then adjust their states and collect samples so as to maximize the evaluation function. To make the camera state easier to optimize, the evaluation function described above is simplified as follows: samples whose resolution is at least the threshold r₀ are acceptable samples, and the distribution density and uniformity of the acceptable samples over the interval (-π/2, π/2), that is, the diversity of their pose angles, determine the value of the evaluation function. In Fig. 8 the interval (-π/2, π/2) is drawn as a circle, and each small hollow circle on the large circle represents a sample at a certain pose angle. An open interval is used here to simplify the algebra of the evaluation function.

The simplified joint evaluation function f

The resolutions and pose angles of the N samples of a given target are (r_i, p_i), i ∈ {1, 2, 3, ..., N}.

p'_1, p'_2, ..., p'_M are the pose angles, sorted in ascending order, of the M samples among the N whose resolution is greater than or equal to the threshold r₀.

f = \sum_{i=1}^{M-1} \left(p'_{i+1} - p'_{i}\right)^{Q} + \left(p'_{1} - \left(-\frac{\pi}{2}\right)\right)^{Q} + \left(\frac{\pi}{2} - p'_{M}\right)^{Q}, \quad Q = 2
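The simplified function can be computed directly from the sample list, as in the following sketch. The name `simplified_evaluation` is hypothetical, pose angles are assumed to be in radians within (-π/2, π/2), and the sum is read as running over the M − 1 gaps between consecutive sorted angles, since p'_{M+1} is not defined. Raising an error when no sample is acceptable is also an assumption, because the formula is not defined for M = 0.

```python
import math


def simplified_evaluation(samples, r0, q=2.0):
    """Evaluate the simplified f for one target.

    samples: iterable of (resolution, pose_angle_in_radians) pairs.
    r0:      resolution threshold; samples with resolution >= r0 are accepted.
    q:       exponent Q of the formula (2 in the source).
    """
    # p'_1 <= ... <= p'_M: pose angles of the accepted samples, ascending.
    accepted = sorted(p for r, p in samples if r >= r0)
    if not accepted:
        raise ValueError("no sample reaches the resolution threshold")
    lower, upper = -math.pi / 2, math.pi / 2
    # Gap from the lower bound to the first angle, the gaps between
    # neighbouring angles, and the gap from the last angle to the upper bound.
    gaps = [accepted[0] - lower]
    gaps += [b - a for a, b in zip(accepted, accepted[1:])]
    gaps.append(upper - accepted[-1])
    return sum(g ** q for g in gaps)


samples = [(50, 0.0), (10, 1.0), (60, -math.pi / 4), (45, math.pi / 4)]
print(simplified_evaluation(samples, r0=40))
```

In the usage example the sample with resolution 10 is rejected, leaving three accepted angles that split the interval into four gaps of π/4 each.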

Five collection methods were implemented in the simulation for comparison and analysis: static collection, in which each camera has a fixed position and a constant orientation angle; scanning collection, in which each camera has a fixed position and sweeps randomly at a constant speed of 10°/s; active collection, which aims to track the target and collect frontal face samples of the target; improved active collection, the collection method proposed in this patent, which aims to collect samples with sufficient resolution and diverse pose angles; and heterogeneous active collection, which adds the mobile camera to improved active collection.

In the experiment, the target's entry position varies from (0,1) to (0,9), and its entry time is selected at random from t = 0 to 89 s; the time interval Δt equals 1 s. A total of 90 independent repeated experiments were carried out. The averaged curve of the sample evaluation function against the number of samples collected is shown in Fig. 9.

Because the statically collecting cameras are fixed at 45° to the region boundary, static collection has a better chance of facing the target at a favourable pose angle than scanning collection, whose cameras sweep through 0° to 180°. In the simulation results, the sample evaluation function value of static collection is therefore better than that of scanning collection. Active collection and improved active collection give similar results (explained below), and collection with the heterogeneous camera added gives the best result.

Active collection aims to track the target and collect frontal samples of the target as far as possible. When the target's range of motion is large, the range of pose angles the camera can capture is also large, and the boundary of the finally collected pose-angle range is usually close to 0°. Because the pose angles of all the samples finally obtained vary widely and are evenly distributed, the result happens to be close to that of improved active collection, which explicitly pursues varied and evenly distributed pose angles. When the target's range of motion is small, as shown in Fig. 10, improved active collection is clearly superior to active collection.

Fixed cameras and panoramic cameras detect and track the target and its face, providing target position information to the other cameras. The PTZ zoom cameras (with adjustable lens orientation angle and focal length) optimize the face image sample collection process according to the optimization model proposed herein. The mobile camera is deployed on a mobile robot; it can perform the target tracking task and can also travel into blind areas that the PTZ cameras cannot cover to collect face image samples.

All cameras operate in a network environment and are controlled by computer programs. The software uses a multi-agent framework: a subprogram runs for each camera to perform tasks such as target detection and tracking and face image sample collection, and a separate coordinator program handles the communication that arises when targets are assigned to the cameras.

Claims

1. A face image sample collection optimization method based on a heterogeneous active vision network, characterized by comprising the following:

L_c denotes the position, horizontal rotation angle, and pitch angle of the c-th camera in the world coordinate system;

f_t(p_{n_t}, r_{n_t}, \ldots, p_1, r_1) = f_t(p_{n_t - 1}, r_{n_t - 1}, \ldots, p_1, r_1) + \left(f_p(p_1, \ldots, p_{n_t}) - f_p(p_1, \ldots, p_{n_t - 1})\right) \cdot f_r(r_{n_t}), \quad n_t = 2, 3, \ldots

f_t(p_1, r_1) = f_p(p_1) \cdot f_r(r_1)

f_t is the joint evaluation function of the n_t face image samples of the t-th target, computed iteratively; f_p is the evaluation function for pose angle, computed from how uniformly and densely all pose angles are distributed over the interval (-90°, 90°), and p_{n_t} denotes the pose angle of the n_t-th face image sample; f_r is the evaluation function for resolution, computed from how high the resolution is, and r_{n_t} denotes the resolution of the n_t-th face image sample;

2. The collection optimization method according to claim 1, characterized in that, when a single target is collected, I(c, t) is constant, and within each Δt the cameras perform one round of sample collection in random order.

3. The collection optimization method according to claim 1, characterized in that, when a single target is collected, I(c, t) is constant, and within each Δt the cameras perform one round of sample collection in sequence, the solve orders being traversed to find the best order and obtain the optimal solution.

4. The collection optimization method according to claim 1, characterized in that, when multiple targets are collected, if each camera is assigned only one target, I(c, t), i.e. the target assignment scheme, is traversed, and then the collection orders of the cameras are traversed; in each order, every camera maximizes the evaluation function f_t of its assigned target, thereby determining the optimal assignment scheme, the optimal collection order, and the optimal camera states.

5. The collection optimization method according to claim 1, characterized in that, when multiple targets are collected, if a camera is assigned several targets, I(c, t) is traversed, and then the collection orders of the cameras are traversed; every camera maximizes the sum of the objective functions of all its assigned targets, i.e. Sum_{t ∈ T_c}(f_t), where T_c is the set of targets assigned to camera c, thereby determining the optimal assignment scheme, the optimal collection order, and the optimal camera states.