CN100339863C - Stereo door sensor - Google Patents
- Publication number: CN100339863C
- Application number: CNB038237938A
- Authority: CN (China)
- Prior art keywords: stereo, feature, image, picture, processor
- Legal status: Expired - Fee Related (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Landscapes
- Length Measuring Devices By Optical Means (AREA)
- Image Processing (AREA)
- Image Analysis (AREA)
Abstract
A stereo-imaging-based vision system monitors the area on either side of a door and controls door motion according to the motion of 3D objects in the viewing area. The system is calibrated to provide the height above the ground plane of any point in the field of view. When an object enters the field of view, it generates interest points called "features", whose heights are measured relative to the ground plane. These points are then clustered in 3D space into "objects", which are tracked over multiple frames to produce "trajectories". The system can then control the door motion (open, close, stall) based on the various pieces of information generated about each object.
Description
Cross-reference to related application
This application claims priority to U.S. Provisional Patent Application Serial No. 60/408,266, entitled "Stereo Door Sensor", filed on September 5, 2002.
Field of the invention
The present invention relates to machine vision systems and, more specifically, to a method and apparatus for controlling a door using an automated three-dimensional vision system.
Background of the invention
A variety of sensors are known for automatic object detection and control systems. For example, photoelectric sensors detect objects that interrupt a beam of visible or UV light. Mechanical switches and load cells detect objects by direct or indirect contact or by sensing an object's weight. Thermal sensors detect objects that radiate heat, and electromagnetic sensors detect objects, such as metal objects, that alter an electromagnetic field. These sensors typically send a signal to a logic circuit, which controls a mechanical actuator, records the presence of the object, and/or alerts an operator based on the presence or absence of an object.
Because these sensors are easily fooled, they are not well suited to some applications. They detect only certain types of objects moving through a tightly restricted space, and they cannot directly determine an object's direction or speed. Such sensors also often fail to maintain uniform sensitivity throughout the monitored space or over time, and they can be expensive.
Some applications require more than one sensor. For example, the typical automatic door controller used in most grocery stores uses a microwave or ultrasonic sensor to detect a person approaching the door, together with an infrared motion detector to determine whether a person is lingering in the doorway before the door is allowed to close.
Various camera-based systems are also known for object detection and control. Camera-based systems have the added advantage of providing an image of the monitored area that can be stored for later analysis. Such systems typically use an electronic still or video camera that captures an image on a charge-coupled device (CCD) array and converts it into an electronic data file for automatic analysis or storage. Automatic face recognition, for example, has long been an experimental problem and is now used in some high-security applications. For most common applications, however, these systems can be too slow, too expensive, or unreliable.
Electronic video cameras and frame grabbers have been used to build motion detection systems that detect and track features in each frame of a captured video sequence. For example, a known automatic door control system tracks corners of an object from frame to frame and computes a velocity vector for the object. The velocity vector is used to decide whether to open or close the automatic door.
Feature tracking systems known to date extract data from monocular image sequences. Such monocular systems provide only two-dimensional (2-D) data from which to compute velocity vectors, and they have difficulty distinguishing shadows and lighting effects from actual three-dimensional objects. The problem is worse in some security systems where, for example, an alarm condition triggers a warning strobe that alters the captured image of the monitored area.
Monocular video surveillance systems operating on 2-D image data must also tolerate blind spots or blind zones where regular obstructions appear in the camera's field of view. For example, a door or door frame controlled by a monocular video system enters the field of view of the monitoring camera each time it opens. Some systems are programmed to ignore frames, or portions of frames, whenever the door opens. Other, more precise systems use additional sensors to track the physical position of the door over time and ignore only the portions of a frame where the door or door frame is expected to appear; see, for example, U.S. Patent Application No. US 2001/0030689 assigned to Spinelli.
When a monocular vision motion detection system is first installed, it must be "trained" with reference images to establish a frame of reference suited to the particular environment. This training often involves burdensome and expensive procedures. Furthermore, because true 3-D coordinates are not available in a monocular system, image coordinates are computed, stored, or output only in 2-D image space.
Summary of the invention
The present invention provides an automatic door controlled by signals from a stereo vision system. Stereo images of an entryway area are processed to produce a disparity image, which a controller uses to decide whether to open or close the door.
Embodiments of the invention use a factory-calibrated stereo rig that provides 3D coordinates for points in the field of view. The ground plane is calibrated relative to the cameras at installation time. Only points having some height above the ground plane are of interest; any shadows and highlights are therefore filtered out because they lack height above the ground plane. The points of interest are then clustered, either directly in 3D space or after projection onto the ground plane in 2D space. Each resulting group is treated as an object and is tracked from frame to frame. For each selected frame, the available information therefore includes: the number of objects; their positions (centroids) in 3D space; and their instantaneous motion vectors (magnitude and direction). From this raw data, events can be generated to open or close the door.
In an illustrative embodiment of the invention, a stereo door sensor (SDS) comprises a vision system based on stereo imaging for monitoring one or both sides of a door, such as a sliding or swinging door. The area on the side from which the door is currently approached is referred to as the "incoming area", and the area on the side through which traffic currently exits is referred to as the "outgoing area". The floor of the incoming or outgoing area is referred to as the "ground plane".
The system can trigger, refrain from triggering, or assume a safe state based on various conditions. For example, it can trigger to open or close the door when an object appears in the incoming area. Alternatively, it can trigger based on the trajectory of one or more objects in the incoming area. The system can also be in a state where it does not trigger, or assumes a safe state (depending on the door type), because of the presence of an object in the outgoing area.
Because the invention includes a camera-based system, images can be recorded (useful in the case of an intrusion) and embodiments of the invention can be used to collect traffic statistics. The frame-to-frame motion algorithms of various embodiments of the invention are also broadly applicable to many other applications without departing from the spirit and scope of the invention.
Features and advantages of embodiments of the present invention over camera-based motion detection and control systems known to date include good shadow discrimination and background invariance. Because of the 3D nature of the stereo rig, it is easier to distinguish between a shadow and an actual object: relative to an actual object, a shadow lies on the ground plane (zero height). A stereo door sensor ("SDS") according to the invention works with any background, structured or unstructured. This is particularly important because a variety of surfaces may appear at a doorway, i.e., carpet, concrete, mats, and so on, and the appearance of these surfaces changes over time. Because object motion detection in the present invention is based on physical coordinates rather than background appearance, the problems caused by shadows and highlights in existing systems are eliminated in embodiments of the invention.
A feature of the invention is ease of installation and setup without an initial training procedure. The SDS requires only a one-time installation setup and no further training of any kind. This is unique relative to monocular motion-based systems, which typically need reference images for comparison with captured images. Another advantage of the present system is that, unlike in motion detection systems, stationary or slowly moving objects do not become invisible.
A further feature of the invention is trajectory-based triggering, whereby the SDS segments objects in 3D space and tracks them using custom algorithms, such as Patquick, available from Cognex Corp. of Natick, MA, which is far better than using standard block-matching algorithms to track their projections in 2D image space.
A further feature of the invention is a calibrated 3D system, whereby the SDS is calibrated in real-world units. The invention can therefore accept trigger settings based on real-world heights and distances.
A further feature of the invention is optional storage of stereo images over a predetermined time interval. This option can provide video evidence of an incident, or be used to reconstruct the complete 3D scene over an extended period. Such extended data can provide a more objective basis for analysis.
A further feature of the invention is a flexible masking capability, which allows the user to graphically specify regions to be masked, in 2D or in 3D, during the setup procedure. This feature can be used, for example, to account for a non-standard entrance or static background scenery in the field of view.
A further feature of the invention is the elimination of excessive blind spots. By first detecting the door frame and then simply ignoring points lying on that plane, non-static background, such as the motion of the door itself as it swings open toward the outside area, can be masked effectively. The system therefore operates at all times without any blind intervals, making the present invention easier to use and more robust than motion detection and control systems known to date.
Description of drawings
The foregoing and other features and advantages of the present invention will be more fully understood from the following detailed description of illustrative embodiments, taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a schematic diagram of a stereo door sensor arrangement according to an illustrative embodiment of the invention;
Figs. 2 and 3 are schematic block diagrams of alternative system component configurations of illustrative embodiments of the invention;
Figs. 4 and 5 are process flow charts showing the steps of alternative illustrative embodiments of the invention;
Fig. 6 is a schematic block diagram of a stereo door sensor apparatus according to an illustrative embodiment of the invention; and
Figs. 7 and 8 are process flow charts of stereo matching steps according to alternative illustrative embodiments of the invention.
Detailed description
The arrangement of an illustrative embodiment of the invention is described with reference to Fig. 1. This illustrative embodiment comprises a stereo camera head 10 mounted above the door frame 12, looking down and out over the incoming area 14. Optionally, another camera set (not shown) can be mounted on the opposite side of the door frame, looking over the outgoing area. The invention is calibrated so that a height above the ground plane is available for any point in the field of view. When an object enters the field of view, it generates interest points called "features", whose heights relative to the ground plane are measured. These points are then clustered in 3D space into "objects". The objects are tracked over multiple frames to produce "trajectories". The system can then actuate the door (open, close, stall) based on the various pieces of information generated about the object.
The illustrative embodiment uses the following camera arrangement geometry. Two (or three) stereo cameras 10 view the incoming area 14 and, optionally, another two (or three) stereo cameras (not shown) view the outgoing area. The two camera sets are mounted above the door frame 12, one on either side, looking down and out from the door frame. Fig. 1 shows the geometry for the incoming area only; in this illustrative embodiment, the geometry for the outgoing area is mirrored about the door frame and symmetric (although it need not be).
In an example system, the baseline distance between the optical centers of the cameras is 12 mm, and the lenses have a focal length of 4 mm (70-degree horizontal field of view (HFOV)). The cameras are mounted at a height of about 2.2 meters above the floor and view an area of about 2.5 x 2.5 meters. The surface normal to the camera plane points down and out, as shown in Fig. 1, with the camera angle adjusted so that the cameras just see the bottom of the door frame. At the bottom of the door frame, the camera angles in the example system provide some overlap between the fields of view of the incoming and outgoing camera sets.
The invention can be realized with at least two feasible system configurations. In a first illustrative system configuration, shown in Fig. 2, the systems monitoring the incoming and outgoing areas are tightly integrated. A frame grabber 20 receives input from the incoming-area cameras 22 and the outgoing-area cameras 24, and the input is processed on a processing system 26, which outputs the appropriate control signals 27, 28, 29.
In a second illustrative system configuration, shown in Fig. 3, independent systems separately monitor the incoming and outgoing areas. Separate frame grabbers 30, 35 receive input from the incoming camera set 32 or the outgoing camera set 34, and separate processors 31, 36 process the respective outputs of the frame grabbers 30, 35. In this configuration, monitoring of the outgoing area is optional. If both the incoming and outgoing areas are monitored, one subsystem is designated the master and the other the slave. The output of the slave subsystem, shown here as the outgoing camera system, is fed to the master subsystem, which makes the final decision about whether to open, close, or stall the door.
In illustrative embodiments of the invention, various parameters are set at the factory. This factory setup involves calibrating the intrinsic parameters of the cameras and computing the relative orientation between the cameras. Calibration involves solving several subproblems, each of which, as discussed below, has solutions known to those of ordinary skill in the art. In addition, the rectification coefficients described below must be computed so that image rectification can be performed at run time.
Spatial measurements can be made in coordinate systems other than that of either camera. For example, scene or world coordinates correspond to points observed in the scene; camera coordinates (left and right) are observer-centered representations of scene points; undistorted image coordinates correspond to scene points projected onto the image plane; distorted image coordinates correspond to points that have undergone lens distortion; and pixel coordinates correspond to the image sampling grid of the image array.
In the illustrative embodiment, one camera is designated the "reference camera", and the spatial coordinate system is tied to it. An interior orientation process is performed to determine the internal geometry of the camera. These parameters, also called intrinsic parameters, include: the effective focal length, also called the camera constant; the principal point location, also called the image center; the radial distortion coefficients; and the horizontal scale factor, also called the aspect ratio. The cameras used in this illustrative embodiment have fixed-focus lenses that cannot be adjusted, so these parameters can be computed and preset at the factory.
A relative orientation process is also performed to determine the relative position and orientation between the two cameras from the projections of calibration points in the scene. Again, the cameras are mechanically fixed so that they remain aligned, and these parameters can therefore also be preset at the factory.
A rectification process, closely related to relative orientation, is also performed. Rectification is the process of resampling the stereo images so that epipolar lines correspond to image rows. "An epipolar line on one stereo image corresponding to a given point in another stereo image is the perspective projection on the first stereo image of the three-dimensional ray that is the inverse perspective projection of the given point from the other stereo image." Robert M. Haralick & Linda G. Shapiro, Computer and Robot Vision Vol. II 598 (1993). If the left and right images are coplanar and the horizontal axes are collinear (no rotation about the optical axes), then the image rows are epipolar lines and stereo correspondences can be found along corresponding rows. Such images, called a rectified image pair, offer a computational advantage, because only one rectification of the image pair is needed.
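By way of illustration only, the sketch below shows how rectification maps of the kind described above could be precomputed once (e.g., at the factory) and applied at run time. It uses standard OpenCV routines as a stand-in for the embodiment's own rectification, and all calibration values shown are hypothetical, not taken from this patent.

```python
import cv2
import numpy as np

# Hypothetical factory calibration results (interior + relative orientation).
K_ref = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
K_other = np.array([[805.0, 0.0, 318.0], [0.0, 805.0, 242.0], [0.0, 0.0, 1.0]])
dist_ref = np.zeros(5)                 # radial/tangential distortion coefficients
dist_other = np.zeros(5)
R = np.eye(3)                          # relative rotation between the two cameras
T = np.array([[-12.0], [0.0], [0.0]])  # 12 mm baseline along x (example value)
image_size = (640, 480)

# Compute rectification transforms so that epipolar lines become image rows.
R1, R2, P1, P2, Q, roi1, roi2 = cv2.stereoRectify(
    K_ref, dist_ref, K_other, dist_other, image_size, R, T)

# Precompute remap tables once; apply them to every frame at run time.
map1_ref, map2_ref = cv2.initUndistortRectifyMap(
    K_ref, dist_ref, R1, P1, image_size, cv2.CV_32FC1)
map1_other, map2_other = cv2.initUndistortRectifyMap(
    K_other, dist_other, R2, P2, image_size, cv2.CV_32FC1)

def rectify(ref_img, other_img):
    """Resample a raw stereo pair into a rectified (row-aligned) pair."""
    return (cv2.remap(ref_img, map1_ref, map2_ref, cv2.INTER_LINEAR),
            cv2.remap(other_img, map1_other, map2_other, cv2.INTER_LINEAR))
```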
The method used for rectification is independent of the representation used for the relative attitude of the two cameras. It relies on the principle that any perspective projection of a perspective projection is itself a perspective projection. The image planes of the two cameras are replaced by a new image plane having the desired geometry (a rectified image pair), while the geometry of the points and rays passing through the centers of projection is preserved. This results in a planar projective transformation, whose coefficients can also be computed at the factory.
Given the parameters computed in interior orientation, relative orientation, and rectification, the camera images can be corrected for distortion and misalignment in software or hardware. The resulting corrected images have the geometry of a rectified image pair, i.e., square pixels, aligned optical planes, aligned axes (rows), and a pinhole camera model.
An exterior orientation process is also performed during the factory setup of this illustrative embodiment. Exterior orientation is needed because the 3D points in the observed scene are initially known only relative to the camera coordinate system. Exterior orientation determines the position and orientation of the cameras in an absolute coordinate system. The absolute 3D coordinate system is established so that its XY plane corresponds to the ground plane, with the origin chosen arbitrarily within that plane.
A ground plane calibration is performed at the installation site. A calibration target is placed on the floor in order to compute the relationship between the 3D coordinate system attached to the reference camera and the world (scene) coordinate system attached to the ground plane.
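One simple way to recover such a ground-plane frame from measured 3D target points is a least-squares plane fit followed by construction of a camera-to-ground rigid transform. The sketch below is an assumption about how this could be done, not the calibration procedure specified in the patent.

```python
import numpy as np

def fit_ground_plane(points_cam):
    """Fit a plane to 3D calibration points (N x 3, camera coordinates).

    Returns (R, t) mapping camera coordinates into a ground-plane frame whose
    z axis is the plane normal and whose origin lies on the plane.
    """
    centroid = points_cam.mean(axis=0)
    # The singular vector with the smallest singular value is the plane normal.
    _, _, vt = np.linalg.svd(points_cam - centroid)
    z_axis = vt[-1]
    # With the camera looking down at the floor (+z out of the camera), flip the
    # normal so it points from the floor back toward the camera.
    if z_axis[2] > 0:
        z_axis = -z_axis
    # Build an arbitrary orthonormal x/y basis lying in the plane.
    x_axis = np.cross([0.0, 1.0, 0.0], z_axis)
    x_axis /= np.linalg.norm(x_axis)
    y_axis = np.cross(z_axis, x_axis)
    R = np.vstack([x_axis, y_axis, z_axis])   # rows are the ground-frame axes
    t = -R @ centroid                         # origin at the target centroid
    return R, t

def to_ground(points_cam, R, t):
    """Transform camera-frame points into the ground-plane frame (z = height)."""
    return points_cam @ R.T + t
```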
One or more regions of interest are also set up manually at the installation site. This involves capturing an image from the reference camera (the camera to which the spatial coordinate system is tied), rectifying the image, displaying it, and then specifying the regions to be monitored using a graphical overlay tool. Multiple regions can be pre-selected, allowing a different algorithm to run in each region at run time. These regions typically enclose specific 3D spaces of interest, and filtering is performed to eliminate features outside the monitored regions. In an alternative embodiment of the invention, calibration can be performed automatically by placing fiducials or tape on the floor.

Although several methods exist for performing stereo vision according to the invention, one such method is outlined below with reference to Fig. 7. A stereo block 70 takes a set of input images 72A, 72B, 72C (right, left, top) and produces the 3D positions of edge or boundary points in the reference image. Input from three cameras is shown, although two cameras suffice in most applications, particularly when the features appear mainly in one orientation. For example, if the features are vertical, a horizontally arranged right and left camera pair provides good 3D information, as is the case in the door sensor application.
In edge processing steps 75A, 75B, 75C, the stereo algorithm uses a feature detection scheme comprising parabolic smoothing, non-integral subsampling (at a specific granularity), Sobel edge detection, followed by true peak detection and finally chaining. This feature detection scheme is well known in the art and is available in the Patmax product from Cognex Corp. of Natick, MA. The edge detection steps 75A, 75B, 75C produce lists of connected edge elements ("edgelets", or chains). Only features belonging to sufficiently long chains are passed to the next stage; for example, only chains longer than a predetermined length are accepted as features. In Fig. 7, features having x, y positions, gradient magnitude (m), and angle (a) for the three cameras r, l, t are passed to the matcher.
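A minimal sketch of this kind of gradient-based edge feature extraction, assuming numpy/scipy conventions rather than the Patmax implementation named above; the chaining and minimum-chain-length test are omitted for brevity.

```python
import numpy as np
from scipy import ndimage

def edge_features(image, mag_thresh=30.0):
    """Return rows of (x, y, magnitude, angle) for pixels with strong gradients."""
    img = image.astype(np.float64)
    img = ndimage.gaussian_filter(img, sigma=1.0)   # smoothing stand-in
    gx = ndimage.sobel(img, axis=1)                 # horizontal gradient
    gy = ndimage.sobel(img, axis=0)                 # vertical gradient
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx)
    ys, xs = np.nonzero(mag > mag_thresh)
    return np.column_stack([xs, ys, mag[ys, xs], ang[ys, xs]])
```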
A matching process (also called a correspondence process) 73A, 73B is used to match features from the right image 72A to the left image 72B (horizontal disparities) and from the right image 72A to the top image 72C (vertical disparities). The epipolar constraint is used to form an initial set of possible matches for each feature. An initial strength of match (SOM) is then assigned to each candidate pair by comparing the strength and orientation of the edgelets.
The next step enforces a smoothness constraint by limiting the allowable disparity gradient, which provides a suitable balance between the ability to resolve ambiguity and the ability to handle a wide range of surfaces. This step involves updating the SOM of each correspondence by examining the correspondences of neighboring features. The next step is an iterative "winner-take-all" procedure that enforces uniqueness. It works as follows: at each iteration, the matches whose match strength is a maximum for both features forming the match are declared correct; then, by the uniqueness constraint, all other matches associated with those two features are excluded from further consideration. This in turn allows other matches that now have the highest strength for both of their constituent features to be declared correct. The matchers 73A, 73B output the x and y positions (xr, yr) of the feature points in the reference image and the horizontal and vertical disparities (dri, drt). The feature angle (ar) is also output to assist the merging step.
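The sketch below illustrates the epipolar-constrained candidate generation and the iterative winner-take-all uniqueness step in simplified form; the similarity measure and tolerances are assumptions, not the exact strength-of-match definition used in the embodiment, and the disparity-gradient smoothness update is omitted.

```python
import numpy as np

def match_features(ref_feats, other_feats, max_disp=64, row_tol=1.0, ang_tol=0.3):
    """ref_feats/other_feats: arrays of (x, y, mag, angle) from edge_features().

    Returns a list of (ref_index, other_index, disparity) matches.
    """
    candidates = []                       # (strength, i_ref, i_other, disparity)
    for i, (xr, yr, mr, ar) in enumerate(ref_feats):
        for j, (xo, yo, mo, ao) in enumerate(other_feats):
            d = xo - xr
            if abs(yo - yr) > row_tol or not (0.0 <= d <= max_disp):
                continue                  # epipolar (same-row) and disparity range
            if abs(ar - ao) > ang_tol:
                continue                  # orientations must roughly agree
            som = 1.0 / (1.0 + abs(mr - mo) + abs(ar - ao))   # assumed SOM
            candidates.append((som, i, j, d))

    # Winner-take-all with uniqueness: repeatedly accept the strongest remaining
    # candidate and discard all other candidates that share either feature.
    candidates.sort(reverse=True)
    used_ref, used_other, matches = set(), set(), []
    for som, i, j, d in candidates:
        if i in used_ref or j in used_other:
            continue
        matches.append((i, j, d))
        used_ref.add(i)
        used_other.add(j)
    return matches
```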
The horizontal and vertical disparities are then merged 74 to produce a combined output. In the illustrative embodiment, a very simple multiplexing scheme is used: if the feature orientation is between 45 and 135 degrees or between 225 and 315 degrees, the horizontal disparity is used; otherwise the vertical disparity is used. Note that if only two cameras are used, the merging step 74 is not needed. The output of the merge 74 is a set of feature points with disparities (xr, yr, d) 76.
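A trivial sketch of this multiplexing rule, stated exactly as quoted above; the angle is assumed to be the feature orientation in degrees.

```python
def merge_disparities(angle_deg, d_horizontal, d_vertical):
    """Select the horizontal or vertical disparity based on feature orientation."""
    a = angle_deg % 360
    if 45 <= a <= 135 or 225 <= a <= 315:
        return d_horizontal
    return d_vertical
```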
Once the positions and disparities 76 of the feature points have been computed, and since the camera geometry 78 is known (from calibration), the X, Y, and Z positions 79 in the stereo camera or scene coordinate system are computed 77.
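For a rectified pair this computation reduces to the usual disparity-to-depth relation Z = f·b/d. The sketch below applies it with the 12 mm baseline quoted earlier; the pixel-unit focal length and image size are assumed values chosen to be roughly consistent with the stated 70-degree HFOV.

```python
import numpy as np

def triangulate(xr, yr, d, fx=457.0, fy=457.0, cx=320.0, cy=240.0, baseline=0.012):
    """Convert a reference-image point (xr, yr) with disparity d (pixels) to 3D.

    fx, fy, cx, cy are rectified intrinsics in pixels (457 px is roughly
    320 / tan(35 deg), i.e. a 70-degree HFOV over a 640-pixel-wide image,
    an assumed sensor geometry); baseline is in meters, so X, Y, Z are meters.
    """
    Z = fx * baseline / d            # depth from disparity
    X = (xr - cx) * Z / fx
    Y = (yr - cy) * Z / fy
    return np.array([X, Y, Z])

# Example: with these assumed intrinsics, a disparity of about 2.5 pixels
# corresponds to a range of roughly 2.2 m (the example mounting height).
print(triangulate(400.0, 300.0, 2.49))
```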
An optional segmentation step 71 (also called clustering) can be performed. The segmentation step 71 returns the distinct 3D objects in the scene, where each object comprises a mutually exclusive subset of the 3D boundary points output by the stereo algorithm. Matching methods can be classified as feature-based (as described above with reference to Fig. 7) or area-based. Feature-based techniques tolerate larger changes in viewpoint but produce sparse results. Area correlation (matching) techniques produce dense results but tolerate smaller changes in viewpoint; they have a fairly regular algorithmic structure and are therefore easier to optimize. An example of a correlation measure commonly used by known third-party systems is the SAD (sum of absolute differences) of LOG (Laplacian of Gaussian) transformed images.
Standard image processing techniques, such as histogramming to determine whether significant height above the ground plane is present, or blob connectivity, can be applied to the dense disparity image. However, these provide only rough estimates. What is needed, therefore, is to convert the dense disparity map into a sparse point cloud, which can be achieved by considering only the "valid" disparity pixels in the dense map. Fig. 8 outlines a method for producing sparse disparities using known correlation techniques.
Like the method described above with reference to Fig. 7, the alternative method described with reference to Fig. 8 concentrates on boundary points or edges (because of occlusion and reflectance), since the information is most reliable only at these points. The right and left images 80B, 80A are rectified 81B, 81A and passed to a matcher 84 that produces a dense disparity map (image) 83. The reference image is further evaluated by an edge processor 82, as described above with reference to Fig. 7. The output of the edge processor 82 is the xr, yr positions of the features, which are then looked up in the disparity image 83 to obtain the disparity at those points. This is referred to as sparsification 85. The output of the sparsification process 85 is a set of feature points with disparities (xr, yr, d), which can easily be converted to 3D X, Y, Z coordinates 87 using the camera geometry known from prior calibration 88.
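A minimal sketch of the sparsification idea: compute a dense disparity map with a standard area-correlation block matcher, then keep only the disparities at the edge locations found in the reference image. OpenCV's block matcher is used here purely as a stand-in for the correlation matcher of Fig. 8; note that it references the disparity map to its first (left) argument, so the edge points are taken from that image.

```python
import cv2
import numpy as np

def sparse_disparities(ref_rect, other_rect, edge_points, max_disp=64):
    """ref_rect/other_rect: rectified 8-bit grayscale images (reference first).
    edge_points: (N, 2) array of (x, y) edge locations in the reference image.

    Returns (x, y, d) rows for edge points with a valid disparity.
    """
    # Standard area-correlation matcher producing a dense disparity image.
    # numDisparities must be a multiple of 16 for StereoBM.
    matcher = cv2.StereoBM_create(numDisparities=max_disp, blockSize=15)
    disp = matcher.compute(ref_rect, other_rect).astype(np.float32) / 16.0

    rows = []
    for x, y in edge_points:
        d = disp[int(y), int(x)]
        if d > 0:                      # non-positive values mean "no valid match"
            rows.append((x, y, d))
    return np.array(rows)
```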
The stereo matching steps of Figs. 7 and 8 produce feature points (edge or boundary points) together with 3D information at those points. Further processing of these 3D points is described with reference to Fig. 4.
In the stereo/clustering step (e.g., the steps described above with reference to Figs. 7 and 8), the 3D points are transformed from the camera-centered coordinate system to the world coordinate system attached to the ground plane. Optionally, once the 3D points at the feature locations in the image have been extracted, they are clustered, i.e., split into mutually exclusive subsets, each subset corresponding to a different object in the scene.
Standard clustering techniques can be used to form groups of 3D points. An efficient technique is agglomerative hierarchical clustering. Initial groups are obtained from the chain structure of the edgelets: the feature chains are split into contiguous segments based on abrupt changes in z between successive points. (The rationale is that points which are adjacent in image coordinates and have similar z values correspond to the same object, and therefore to the same group.) Each of these segments now corresponds to a potential separate group. The next step merges the two closest groups based on a "minimum distance" criterion, similar to a greedy minimum spanning tree algorithm. The algorithm iterates until the desired number of groups is obtained or the minimum distance exceeds a certain threshold.
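A compact sketch of the agglomerative merge step under a minimum-distance criterion; the initial segment splitting on abrupt z changes is assumed to have been done already, and the distance threshold is an illustrative value.

```python
import numpy as np

def agglomerate(segments, dist_thresh=0.3):
    """segments: list of (N_i, 3) arrays of 3D points (initial groups).

    Repeatedly merges the two closest groups (smallest point-to-point distance)
    until no pair is closer than dist_thresh. Returns the final list of groups.
    """
    groups = [np.asarray(s, dtype=float) for s in segments]

    def min_dist(a, b):
        # Smallest pairwise Euclidean distance between the two point sets.
        diff = a[:, None, :] - b[None, :, :]
        return np.sqrt((diff ** 2).sum(axis=2)).min()

    while len(groups) > 1:
        best = None
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                d = min_dist(groups[i], groups[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        if best[0] > dist_thresh:
            break
        _, i, j = best
        groups[i] = np.vstack([groups[i], groups[j]])
        del groups[j]
    return groups
```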
The technique above clusters in 3D; the technique outlined below, however, reduces the problem to a 2D one by applying a constraint: the objects are assumed to rest on a plane in 3D space, which is not a severe restriction in typical applications. The criterion for splitting points into distinct objects is that the minimum distance between objects along that plane (the 2D distance) exceeds a preset spacing threshold. It follows that the projections of the objects onto the plane cannot overlap. Again, this is not very restrictive, because the objects rest on the plane and object surfaces are usually perpendicular to it.
The next step is a filtering step 41, in which all points on or near the ground are filtered out, and any points masked out by the regions of interest defined during installation are ignored. Because the 3D coordinate system is attached to the ground plane, the surface normal of that plane is taken as the z axis, which leaves the origin and the x and y axes free to be chosen arbitrarily. Since the objects are constrained to lie in the known (x, y) plane, they are segmented according to how far apart they are in that plane (considering 2D distances along the xy plane).
In the illustrative embodiment, all 3D points are first transformed into the ground plane coordinate system. Next, points that are too far away or too close (range), too far left or right (lateral distance), too high above the x-y plane (object height), or too close to the x-y plane are eliminated. Eliminating points close to the ground plane helps remove shadows and surface features of the floor. The remaining unfiltered points are then projected onto the ground plane. This projection can be converted into a 2D image, and standard 2D labeling/blob connectivity can be used to obtain distinct regions, i.e., sets of pixels, where each pixel represents a number of feature points.
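A sketch of this filter-project-label sequence, assuming ground-frame points in meters and using scipy's connected-component labeling in place of whatever blob tool the embodiment uses; the bounds and cell size are made-up values.

```python
import numpy as np
from scipy import ndimage

def ground_regions(points_g, min_h=0.10, max_h=2.0,
                   x_lim=(-1.25, 1.25), y_lim=(0.0, 2.5), cell=0.05):
    """points_g: (N, 3) points in the ground-plane frame (z = height in meters).

    Filters out near-ground points (shadows, floor texture) and out-of-range
    points, projects the rest onto the ground plane, and labels connected
    occupancy cells as candidate objects. Returns (labels, number_of_objects).
    """
    z = points_g[:, 2]
    keep = ((z > min_h) & (z < max_h) &
            (points_g[:, 0] > x_lim[0]) & (points_g[:, 0] < x_lim[1]) &
            (points_g[:, 1] > y_lim[0]) & (points_g[:, 1] < y_lim[1]))
    pts = points_g[keep]

    # Project to the ground plane and rasterize into an occupancy grid.
    nx = int((x_lim[1] - x_lim[0]) / cell)
    ny = int((y_lim[1] - y_lim[0]) / cell)
    grid = np.zeros((ny, nx), dtype=np.uint8)
    ix = ((pts[:, 0] - x_lim[0]) / cell).astype(int).clip(0, nx - 1)
    iy = ((pts[:, 1] - y_lim[0]) / cell).astype(int).clip(0, ny - 1)
    grid[iy, ix] = 1

    # Standard 2D blob connectivity: each label is a candidate object.
    labels, n_objects = ndimage.label(grid)
    return labels, n_objects
```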
A scoring step 42 is then performed, in which the resulting points are evaluated with a score function. The score is accumulated and compared with a predetermined threshold to decide whether an object is present 43 or not 44. In the illustrative embodiment, the score is accumulated separately for each group, and the threshold is applied to each group accordingly. This can be more robust, particularly when the scene produces many isolated false matches, but at the cost of greater computing power.
An alternative algorithm is described with reference to Fig. 5. The first part of this algorithm is similar to that of Fig. 4, except that clustering is no longer optional but mandatory. Once an object (group) is detected, a trajectory computation step is performed, in which the motion vector of the group is computed to further verify whether the object is moving toward the door (result 52) or not (result 53).
Motion estimation is performed by estimating the 2D motion field or optical flow (apparent motion) over a group of frames in the image sequence. A large number of motion estimation techniques are known in the art. Motion estimation shares some similarities with disparity estimation, such as the feature-based and correlation-based matching used in stereo vision systems. Other differential techniques, such as optical flow, can be used in applications where the interval between frames is short. In the illustrative door sensor embodiment, however, no assumption is made that the interval between relevant frames is short, so optical flow techniques are not used. One known motion estimation method involves tracking, whereby elements are followed over time using explicit frame-by-frame motion estimates or implicitly derived estimates. The motion between frames can be estimated using block matching schemes (widely used for motion compensation and video compression) or area correlation schemes (as used in stereo matching).
The illustrative embodiment uses a unique correlation algorithm that combines feature-based and area-based correlation. The points belonging to an object in a given frame have already been segmented. According to this unique correlation algorithm, those points are located in the subsequent frame by correlating their features over a neighborhood around the expected object position in that frame. At time t-1, each object feature point comprises a weight, x and y positions, and a direction. At time t, the rectified reference image is treated as the "run-time image". This image is run through an edge processor similar to the one described above to produce gradient magnitude and angle images. In a coarse correlation step, the trained detector is correlated with the angle image, using the sum of absolute differences as the correlation measure. In a fine correlation step, the magnitude image is used to produce a sharper correlation peak.
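A rough sketch of the coarse step: slide the set of feature points from frame t-1 over a search window in the angle image of frame t and score each offset by a sum of absolute angle differences. The search radius, normalization, and angle wrapping are assumptions, not the embodiment's exact procedure.

```python
import numpy as np

def coarse_correlate(feat_pts, angle_img, search=20):
    """feat_pts: (N, 3) rows of (x, y, angle) for one tracked object at time t-1.
    angle_img: gradient-angle image (radians) of the run-time frame at time t.

    Returns the (dx, dy) offset minimizing the sum of absolute angle differences.
    """
    xs = feat_pts[:, 0].astype(int)
    ys = feat_pts[:, 1].astype(int)
    angs = feat_pts[:, 2]
    h, w = angle_img.shape
    best_offset, best_score = None, np.inf
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            x2, y2 = xs + dx, ys + dy
            ok = (x2 >= 0) & (x2 < w) & (y2 >= 0) & (y2 < h)
            if not ok.any():
                continue
            diff = np.abs(angle_img[y2[ok], x2[ok]] - angs[ok])
            diff = np.minimum(diff, 2 * np.pi - diff)   # wrap angle differences
            score = diff.sum() / ok.sum()               # normalized SAD
            if score < best_score:
                best_offset, best_score = (dx, dy), score
    return best_offset
```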
This technique has advantages over standard area correlation. For example, in the standard technique, a correlated block or region may contain parts with different motion vectors, which degrades the correlation and, in some cases, produces erroneous motion vectors. The algorithm of this illustrative embodiment benefits from the fact that the tracked object has already been segmented: it concentrates only on the object's feature points and attempts to find them in the subsequent frame(s).
Once the motion vector for a given object has been computed, the correspondences between the object points from frame t to frame t-1 are known. Because the 3D positions of these points are known, the 3D motion can optionally be computed as well. By assuming straight-line object motion, the algorithm can readily be extended over multiple frames to obtain a smoothed trajectory. Another extension of the algorithm is to use filtering techniques, in which the current input, past inputs, and past outputs are used to filter the result and produce the current output. A further extension is to use a Kalman filter; see R. E. Kalman, "A New Approach to Linear Filtering and Prediction Problems," Transactions of the ASME (March 1960), incorporated herein by reference. The Kalman filter is an advantageous technique for incremental, real-time estimation in dynamic systems. It allows information to be integrated over time and is robust with respect to system and sensor noise.
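A minimal constant-velocity Kalman filter for smoothing an object's ground-plane trajectory, as one possible instance of the filtering extension mentioned above; the time step and noise covariances are placeholder values.

```python
import numpy as np

class TrajectoryKalman:
    """Constant-velocity Kalman filter over the state [x, y, vx, vy]."""

    def __init__(self, dt=1.0 / 30.0, q=1e-2, r=1e-2):
        self.F = np.array([[1, 0, dt, 0],
                           [0, 1, 0, dt],
                           [0, 0, 1, 0],
                           [0, 0, 0, 1]], dtype=float)   # state transition
        self.H = np.array([[1, 0, 0, 0],
                           [0, 1, 0, 0]], dtype=float)   # only (x, y) is observed
        self.Q = q * np.eye(4)                           # process noise (assumed)
        self.R = r * np.eye(2)                           # measurement noise (assumed)
        self.x = np.zeros(4)
        self.P = np.eye(4)

    def update(self, meas_xy):
        # Predict.
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        # Correct with the measured object centroid.
        y = np.asarray(meas_xy, dtype=float) - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x                                    # smoothed [x, y, vx, vy]
```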
The event formation logic depends on a number of factors: the door type, the presence of the optional outgoing-area system, and the algorithms in use. Note that a single system may itself run multiple algorithms. This logic takes the outputs from the various regions of the various subsystems and integrates them to provide events that can be used directly to control the door motion (a sketch of such logic follows the apparatus description below). An illustrative stereo door sensor apparatus is described with reference to Fig. 6. A stereo image acquisition device 60, for example a pair of machine vision cameras acquiring stereo images of the monitored scene, is fixed and aimed at the viewing area, which in the illustrative embodiment is the area near the approach to the door.
The image acquisition device 60 communicates (typically over a hard-wired connection) with a 3D processor 62. The 3D processor 62 computes the positions of 3D objects in the observed scene according to any of the methods described above, and filters out 2D background effects such as shadows, floor patterns, or warning-lamp lighting effects. The 3D processor 62 can be any processing device or block capable of performing at least the minimal processing steps described above for computing 3D object groups and filtering out 2D background information. A personal computer, a dedicated processor, or multiple processing devices can serve as the 3D processor according to the invention. Those of ordinary skill in the art will appreciate that the 3D processor can also be a stand-alone software block or a software block running within a larger software program.
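The sketch below illustrates the kind of event formation logic referred to above, combining object presence and trajectory in the incoming area with presence in the optional outgoing area; the specific rules, zone names, and thresholds are assumptions for a sliding door, not the patent's specification.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class TrackedObject:
    zone: str            # "incoming" or "outgoing"
    position: tuple      # (x, y) on the ground plane, meters
    velocity: tuple      # (vx, vy), meters/second

def door_event(objects: List[TrackedObject], approach_speed=0.2) -> str:
    """Return "open", "hold", or "close" for a sliding door.

    Assumed rules: open when something in the incoming area moves toward the
    door; hold (stall) while anything occupies the outgoing area; otherwise close.
    """
    outgoing_occupied = any(o.zone == "outgoing" for o in objects)
    approaching = any(o.zone == "incoming" and o.velocity[1] < -approach_speed
                      for o in objects)   # assume -y points toward the door
    if approaching:
        return "open"
    if outgoing_occupied:
        return "hold"
    return "close"
```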
Although various calibration steps have been described herein for illustrative embodiments of the invention, those of ordinary skill in the art will appreciate that many calibration methods can be used without departing from the spirit and scope of the invention; see, for example, references 1-4. Although the illustrative embodiments described herein are set up at the factory using a factory setup procedure, those of ordinary skill in the art will appreciate that any of the described setup steps can also be performed elsewhere without departing from the scope of the invention.
Although interior orientation is described as determining the internal geometry of the cameras in terms of the camera constant, image center, radial distortion coefficients, and aspect ratio, those of ordinary skill in the art will appreciate that additional intrinsic parameters can be added, or some of these parameters omitted, in alternative embodiments without departing from the scope of the invention.
Although the ground plane calibration is performed at the installation site in the illustrative embodiments described herein, those of ordinary skill in the art will appreciate that the ground plane calibration can alternatively be performed at the factory or in place, without departing from the spirit and scope of the invention.
Although edge processing in the illustrative embodiments described herein is performed by parabolic smoothing, non-integral subsampling (at a specific granularity), Sobel edge detection, followed by true peak detection and chaining, those skilled in the art will appreciate that many edge processing methods known in the art can be used in the edge processing step without departing from the spirit and scope of the invention.
Although the invention is described herein in terms of a two-camera stereo vision system, those skilled in the art will appreciate that a single camera can be used to acquire two or more images from different positions to provide the stereo images, without departing from the scope of the invention. For example, the camera can acquire separate images at multiple positions. Alternatively, multiple optical components can be arranged to present multiple views of the scene to a stationary camera for use as stereo images according to the invention. Such optical components include reflective components, e.g., mirrors, and refractive components, e.g., lenses.
Although the matching step of the illustrative embodiment is described herein as computing a match strength from feature characteristics before enforcing the smoothness constraint, those of ordinary skill in the art will appreciate that various alternative matching processes can be substituted, such as the SAD (sum of absolute differences) of LOG (Laplacian of Gaussian) transformed images, and the like, without departing from the spirit and scope of the invention.
Although the illustrative embodiments of the invention described herein include a merging step that uses a simple multiplexing scheme with particular orientation limits to distinguish horizontal from vertical disparities, those of ordinary skill in the art will appreciate that these limits are somewhat arbitrary and can be widened or narrowed without departing from the spirit and scope of the invention.
Although illustrative embodiments of the invention are generally described in terms of a stereo door sensor for selectively opening, stalling, or closing a door, those skilled in the art should envision various alternative embodiments of the invention in guard, security, motion control, and numerous other applications. For example, a stereo vision system according to the invention can be used to trigger an alarm when a person or object enters a particular area, or moves in a particular direction within that area or passage. For instance, an alternative illustrative embodiment of the invention can trigger an alarm signal or close a gate if an automobile is detected traveling in the wrong direction on a highway or exit ramp.
Although illustrative embodiments of the invention are described in terms of filtering objects having a predetermined height above the ground plane, those of ordinary skill in the art will appreciate that a stereo vision system according to the invention can also filter objects at a predetermined distance from an arbitrary plane (such as a wall), without departing from the spirit and scope of the invention.
Although the invention is shown and described with reference to exemplary embodiments, those of ordinary skill in the art will appreciate that various other changes, omissions, and additions in form and detail can be made therein without departing from the spirit and scope of the invention.
Claims (16)
1. A method of controlling a door, comprising the steps of:
acquiring stereo images of an incoming area;
computing a group of 3D features from said stereo images by:
performing edge processing on said stereo images to generate a plurality of connected edge elements;
identifying as features those connected edge elements longer than a predetermined threshold length;
matching said features between the different images of said stereo images to generate disparities; and
computing 3D positions of the feature points from said disparities and the camera geometry;
filtering said group of 3D features to generate a filtered group of 3D features;
computing a trajectory of said filtered group of 3D features; and
generating a door control signal in response to said trajectory.
2. The method of claim 1, wherein said filtering step eliminates ground plane features from said group of 3D features.
3. The method of claim 1, wherein said filtering step eliminates shadows from said group of 3D features.
4. The method of claim 1, wherein said filtering step eliminates background patterns from said group of 3D features.
5. The method of claim 1, wherein said filtering step eliminates ambient lighting effects from said group of 3D features.
6. The method of claim 1, wherein said filtering step eliminates features outside a preselected 3D region.
7. The method of claim 1, wherein said stereo images are acquired by combining images acquired by stereo vision cameras.
8. The method of claim 1, wherein said stereo images are acquired by combining a plurality of images acquired by a monocular camera.
9. The method of claim 8, wherein at least one of said plurality of images is acquired by said monocular camera through a reflective optical component.
10. The method of claim 8, wherein at least one of said plurality of images is acquired by said monocular camera through a refractive optical component.
11. A stereo vision apparatus for controlling a door, said apparatus comprising:
a stereo image acquisition device;
a 3D processor that receives stereo images from said stereo image acquisition device;
a trajectory processor that receives frames comprising 3D objects from said 3D processor; and
a door actuator that receives a door control signal from said trajectory processor in response to an object trajectory,
wherein said 3D processor generates the frames comprising said 3D objects by determining the heights above the ground plane of points in the field of view of said stereo image acquisition device, and clustering said points in 3D space to generate objects, and
said filtering comprises: generating initial groups from the chain structure of the edge elements; splitting the feature chains into contiguous segments based on abrupt changes in the z coordinate between successive points; and merging the two closest groups based on a minimum distance criterion.
12. The apparatus of claim 11, wherein said stereo image acquisition device comprises a plurality of electronic still cameras.
14. The apparatus of claim 11, wherein objects close to said ground plane are filtered out as ground plane noise with respect to a predetermined threshold.
15. The apparatus of claim 11, wherein said trajectory processor determines the object trajectory by tracking said objects over a plurality of frames.
16. The apparatus of claim 11, wherein said stereo image acquisition device comprises a monocular camera arranged to acquire a plurality of images.
17. The apparatus of claim 16, wherein at least one of said plurality of images is acquired through a mirror.
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US40826602P | 2002-09-05 | 2002-09-05 |
US60/408,266 | 2002-09-05 | 2002-09-05 |
US10/388,925 | | |
Publications (2)
Publication Number | Publication Date |
---|---|
CN1689024A (en) | 2005-10-26
CN100339863C (en) | 2007-09-26
Family
ID=35306382
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CNB038237938A (granted as CN100339863C; Expired - Fee Related) | Stereo door sensor | | 2003-08-25
Country Status (1)
Country | Link |
---|---|
CN (1) | CN100339863C (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8855366B2 (en) * | 2011-11-29 | 2014-10-07 | Qualcomm Incorporated | Tracking three-dimensional objects |
CN102747919B (en) * | 2012-06-18 | 2014-11-12 | 浙江工业大学 | Omnidirectional computer vision-based safe and energy-saving control device for pedestrian automatic door |
CN107292810B (en) * | 2016-03-30 | 2020-01-24 | 上海弼智仿生高科技有限公司 | Image processing method and system of vision system |
CN110038666B (en) * | 2019-05-13 | 2024-03-29 | 闽江师范高等专科学校 | Intelligent crushing machine |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6297844B1 (en) * | 1999-11-24 | 2001-10-02 | Cognex Corporation | Video safety curtain |
- 2003-08-25: CN application CNB038237938A granted as patent CN100339863C (not active; Expired - Fee Related)
Also Published As
Publication number | Publication date |
---|---|
CN1689024A (en) | 2005-10-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7397929B2 (en) | Method and apparatus for monitoring a passageway using 3D images | |
US11670086B2 (en) | System and process for detecting, tracking and counting human objects of interest | |
US7400744B2 (en) | Stereo door sensor | |
US7920718B2 (en) | Multi-zone passageway monitoring system and method | |
US7623674B2 (en) | Method and system for enhanced portal security through stereoscopy | |
KR101496390B1 (en) | System for Vehicle Number Detection | |
WO2012023766A2 (en) | Security camera tracking and monitoring system and method using thermal image coordinates | |
WO2011139734A2 (en) | Method for moving object detection using an image sensor and structured light | |
WO2007027324A2 (en) | Detecting non-people objects in revolving doors | |
CN104966062A (en) | Video monitoring method and device | |
US20180173982A1 (en) | System and method for 1d root association providing sparsity guarantee in image data | |
KR100691348B1 (en) | Moving target tracking method and system using pan / tilt control based stereo camera | |
CN100339863C (en) | Stereo door sensor | |
Garibotto et al. | 3D scene analysis by real-time stereovision | |
Snidaro et al. | A multi-camera approach to sensor evaluation in video surveillance | |
Park et al. | An area-based decision rule for people-counting systems | |
Abidi et al. | Automatic target acquisition and tracking with cooperative fixed and PTZ video cameras | |
Kang et al. | Video surveillance of high security facilities | |
Lee et al. | Method for detecting cars cutting in to change lanes by using image frames played in reverse | |
CN114972543A (en) | Distributed monitoring camera positioning method and system based on visual SLAM | |
Lai et al. | Real-Time People Counting System Based on Stereo Vision | |
Park et al. | Development of a block-based real-time people counting system |
Legal Events
Date | Code | Title | Description
---|---|---|---
| C06 | Publication |
| PB01 | Publication |
| C10 | Entry into substantive examination |
| SE01 | Entry into force of request for substantive examination |
| C14 | Grant of patent or utility model |
| GR01 | Patent grant |
| CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20070926; Termination date: 20160825