
CN100525395C - Pedestrian tracting method based on principal axis marriage under multiple vedio cameras - Google Patents

Pedestrian tracting method based on principal axis marriage under multiple vedio cameras

Info

Publication number
CN100525395C
CN100525395C CNB200510108137XA CN200510108137A
Authority
CN
China
Prior art keywords
matching
main axis
under
camera
people
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CNB200510108137XA
Other languages
Chinese (zh)
Other versions
CN1941850A (en)
Inventor
胡卫明 (Hu Weiming)
周雪 (Zhou Xue)
胡敏 (Hu Min)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Automation of Chinese Academy of Science
Original Assignee
Institute of Automation of Chinese Academy of Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Automation of Chinese Academy of Science filed Critical Institute of Automation of Chinese Academy of Science
Priority to CNB200510108137XA priority Critical patent/CN100525395C/en
Publication of CN1941850A publication Critical patent/CN1941850A/en
Application granted granted Critical
Publication of CN100525395C publication Critical patent/CN100525395C/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Landscapes

  • Closed-Circuit Television Systems (AREA)
  • Image Analysis (AREA)

Abstract


The present invention relates to the field of computer vision, and specifically to a pedestrian tracking method based on principal-axis matching under multiple cameras, comprising the steps of: performing motion detection on image sequences; extracting each person's principal-axis feature; tracking under a single camera; establishing principal-axis matching pairs according to a principal-axis matching function; and fusing multi-view information to optimize and update the tracking results. The invention proposes a novel multi-camera matching technique that overcomes the drawbacks of traditional multi-camera matching methods (the need for calibration, features easily affected by noise and viewing angle, and dependence on accurate segmentation algorithms), and has good application prospects.


Description

Pedestrian tracking method based on principal-axis matching under multiple cameras
Technical field
The present invention relates to the technical field of computer vision, and in particular to multi-camera matching methods.
Background technology
With the development of society, security has become increasingly important, and visual surveillance has become a requirement in many large-scale venues, including important national security and military facilities. At the same time, visual surveillance has become essential in many public places, a typical application being the traffic domain. Traditional visual surveillance systems provide only after-the-fact monitoring: accidents and abnormal events are discovered by security personnel either as they happen or afterwards, so such systems have no preventive effect against security breaches and cannot meet the demands of social development. Intelligent visual surveillance has therefore become both a requirement of the times and a goal for researchers. Its ultimate purpose is to recognize the behavior of moving targets in real time, so that abnormal behavior can be judged and alarmed promptly. Behavior recognition and other higher-level analysis of moving targets depend on low-level processing, namely the tracking of moving targets. In surveillance scenes, human motion is usually the focus of attention. Moreover, human motion is non-rigid, which makes it more complicated to track than other moving targets such as vehicles. Human tracking is therefore a vital part of intelligent surveillance.
Nowadays, demands on the intelligence and security of visual surveillance systems are ever higher, and tracking moving people with a single camera falls far short of society's needs. In recent years, tracking moving people with multiple cameras has become a research hotspot.
Tracking people across multiple cameras is essentially a multi-camera matching problem: establishing, at each instant, the correspondence between moving objects seen from different viewpoints (cameras). Multi-camera matching is a relatively new problem in computer vision, on which several universities and research institutions abroad (Maryland, Oxford, MIT) have carried out research. According to the type of feature used, related work can be roughly divided into two classes: region-based methods and feature-point-based methods.
Region-based matching methods treat a person as a moving region and use features of that region to establish correspondences between viewpoints. In this class, color is the most commonly used region feature. Orwell et al. and Krumm et al. estimate the color distribution of a person's region with color histograms and establish multi-camera correspondences by comparing the histograms; Mittal et al. build Gaussian color models for each person's region and match using these models; another approach estimates the region's color distribution by kernel density estimation and matches on that basis. Although color features are intuitive, they are not very robust for matching. Color-based matching depends on the color of people's clothing, so when two people wear clothes of the same color a wrong match may result; furthermore, the observed clothing color of the same person may change under the influence of illumination and viewpoint.
Feature-point-based methods treat a person as a set of feature points, so matching people across viewpoints reduces to matching feature points under some geometric constraint. Depending on the constraint chosen, these methods split into two subclasses: three-dimensional and two-dimensional. A. Utsumi takes the centroid of each moving object as a feature point and compares the centroids' 3D projections; Q. Cai selects feature points on the center line of the upper body and searches for matches using the epipolar constraint. Three-dimensional methods all require camera calibration, which becomes an enormous task when many cameras are used in a surveillance scene. To overcome this shortcoming, some researchers have proposed establishing multi-camera correspondences from two-dimensional information. Khan et al. use the homography constraint of the ground plane to match feature points on people's feet, but such points are easily disturbed by noise: under occlusion or poor detection, when only part of a person is observed, the extracted feature points are not robust and matching performance degrades.
It should be emphasized that, although multi-camera people tracking has attracted wide attention and study worldwide, many difficulties remain in multi-camera matching; how to choose more robust features and match them accurately is still a challenge.
Summary of the invention
The objective of the present invention is to avoid the shortcomings of conventional methods, which require calibration and adopt features easily affected by noise and viewing angle, by providing a simple, robust multi-camera matching method for tracking people across multiple cameras.
To achieve the above objective, the pedestrian tracking method based on principal-axis matching under multiple cameras comprises the steps of:
(1) moving-object segmentation;
(2) extraction of people's principal-axis features;
(3) tracking under a single camera;
(4) finding all best matching pairs according to the principal-axis match likelihood function;
(5) fusing multi-view information to update the tracking results.
"Multiple cameras" means two or more single cameras. A precondition of the method is the assumption that the visible areas of the different cameras share a common plane; in general, this common plane is the ground plane.
The present invention extracts a person's principal axis as the matching feature; it is a novel principal-axis-based multi-camera matching method with good application prospects.
Description of drawings
Fig. 1 shows an example of principal-axis detection for a single person.
Fig. 2 shows an example of principal-axis detection for a group of people.
Fig. 3 shows an example of principal-axis detection for people under occlusion.
Fig. 4 is the block diagram of single-camera people tracking.
Fig. 5 shows the projection relations of people's principal axes across cameras.
Fig. 6(a) shows multi-person tracking results under two cameras of the NLPR database.
Fig. 6(b) shows multi-person tracking results under three cameras of the NLPR database.
Fig. 6(c) shows tracking results for a single occluded person under two cameras of the PETS2001 database.
Fig. 6(d) shows tracking results for an occluded group under two cameras of the PETS2001 database.
Fig. 7 is an overview of the pedestrian tracking method based on principal-axis matching under multiple cameras.
Embodiment
The main features of the present invention are: 1) the person's principal axis is extracted as the matching feature; because the principal axis is the symmetry axis of the human region, the errors of points distributed symmetrically about it cancel each other out, making the axis more robust, and the principal axis is also less affected by the results of motion detection and segmentation; 2) principal-axis detection methods are proposed for three situations: a single person, a group of people, and people under occlusion; 3) based on the geometric constraints between principal axes in different views, a principal-axis match likelihood function is defined to measure the similarity of axis pairs across views; 4) multi-view information is fused according to the matching result to optimize and update the tracking results.
The overall framework of the scheme is shown in Fig. 7. First, motion detection is performed on the image sequence of each single camera, people's principal-axis features are extracted, and tracking under the single camera is carried out; then principal axes in different views are matched under the homography constraint; finally, multi-view information is fused according to the matching result to update the tracking results.
Each technical issue involved in the present invention is explained in detail below.
(1) Moving-object segmentation
Moving-object segmentation is the first step of motion tracking. The algorithm uses background subtraction with a single-Gaussian model to detect moving regions. To reduce the influence of illumination changes and shadows, the normalized color rgs model is adopted, where r = R/(R+G+B), g = G/(R+G+B), s = R+G+B. A background image sequence is first filtered to obtain a single-Gaussian background model; the Gaussian parameters of each pixel are u_i and σ_i (i = r, g, s), the mean and variance of the background model. The current image is then differenced against the background image, and thresholding the difference image yields a binary image M_xy, in which foreground pixels are 1 and background pixels are 0:
$$M_{xy}=\begin{cases}1,&|I_i(x,y)-u_i(x,y)|>\alpha_i\,\sigma_i(x,y),\ i\in\{r,g,s\}\\0,&\text{otherwise}\end{cases}\qquad(1)$$
where I_i(x, y) (i = r, g, s) is the current measured value of pixel (x, y), and α_i is a threshold parameter determined empirically. After the binary image is obtained, morphological erosion and dilation are applied to further filter noise. Finally, connected-component analysis of the binary image extracts the simply connected foreground regions as the segmented moving regions.
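As an illustration of the background model and Eq. (1), the following is a minimal NumPy sketch; the function names rgs_normalize and foreground_mask and the default alpha are our assumptions, not part of the patent.

```python
import numpy as np

def rgs_normalize(img):
    """Convert an RGB image (H x W x 3, float) to the normalized r, g, s
    channels of the background model: r = R/(R+G+B), g = G/(R+G+B), s = R+G+B."""
    R, G, B = img[..., 0], img[..., 1], img[..., 2]
    s = R + G + B
    safe = np.where(s == 0, 1.0, s)  # avoid division by zero on black pixels
    return np.stack([R / safe, G / safe, s], axis=-1)

def foreground_mask(frame, mean, std, alpha=2.5):
    """Binary mask M_xy of Eq. (1): a pixel is foreground (1) if any of its
    r, g, s channels deviates from the background mean by more than alpha*sigma."""
    diff = np.abs(rgs_normalize(frame) - mean)
    return (diff > alpha * std).any(axis=-1).astype(np.uint8)
```

Morphological cleanup and connected-component extraction would follow as described above.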
(2) Detection of people's principal axes
For simplicity, people are assumed to walk upright, so that each person's principal axis exists. Under this assumption, we discuss principal-axis detection in three situations: a single person; a group of people; and people under occlusion. Which situation applies can be judged from the correspondences maintained during tracking.
Principal-axis detection for a single person. For a single person, the least-median-of-squares (LMedS) method is used to detect the principal axis. Let D(X_i, l) be the perpendicular distance from the i-th foreground pixel X_i to a candidate line l; according to LMedS, the principal axis is the line minimizing the median of the squared perpendicular distances of all foreground pixels:
$$L=\arg\min_{l}\ \operatorname{median}_{i}\{D(X_i,l)^2\}\qquad(2)$$
Fig. 1 shows an example of single-person principal-axis detection.
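A minimal sketch of the least-median-of-squares search of Eq. (2), simplified to candidate vertical lines x = c in keeping with the upright-walker assumption; the patent itself searches over general lines, and the function name and candidate grid are our assumptions.

```python
import numpy as np

def lmeds_vertical_axis(points, candidates=None):
    """LMedS estimate of a (vertical) principal axis, Eq. (2): choose the
    line x = c whose median squared horizontal distance to the foreground
    pixels is smallest. The median makes the estimate robust to outlier
    pixels, e.g. segmentation noise far from the body."""
    xs = points[:, 0].astype(float)
    if candidates is None:
        candidates = np.linspace(xs.min(), xs.max(), 101)
    costs = [np.median((xs - c) ** 2) for c in candidates]
    return candidates[int(np.argmin(costs))]
```

Even with a cluster of outlier pixels, the recovered axis stays near the body's symmetry line, illustrating the robustness argued for above.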
Principal-axis detection for a group. Group principal-axis detection consists of two parts: individual segmentation and axis detection.
For individual segmentation, the vertical projection histogram is introduced: each individual in the group corresponds to a peak region of the histogram. Only peak regions satisfying certain conditions correspond to single individuals; we call these significant peak regions. A significant peak region must satisfy two conditions:
a) the maximum of the peak region must be greater than a specific threshold, the peak threshold P_T;
b) the valley minima bounding the peak region must be less than a specific threshold, the valley threshold C_T.
Suppose P_1, P_2, ..., P_n are the local extrema within a peak region, and C_l, C_r are the valley values on its left and right. The two conditions can then be expressed mathematically as:
$$\max(P_1,P_2,\dots,P_n)>P_T\qquad(3)$$
$$C_l<C_T,\quad C_r<C_T\qquad(4)$$
The valley threshold C_T is chosen as the mean of the whole histogram, and the peak threshold P_T as 80 percent of a person's height in image coordinates. In the second step, axis detection, the single-person method described above can be applied to each segmented individual.
Fig. 2 gives an example of principal-axis detection for several people: (b) is the detected foreground region, (c) its vertical projection histogram, and (d) the result after segmentation. The vertical projection histogram (c) has three obvious peak regions, so the region can be segmented into three parts corresponding to three individuals. Their principal axes are shown in (e).
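The significant-peak-region test of conditions (3)-(4) can be sketched as follows; this is a simplified reading, and the function name and return convention are our assumptions.

```python
import numpy as np

def significant_peaks(hist, peak_thresh, valley_thresh):
    """Split a vertical-projection histogram into significant peak regions
    (conditions (3)-(4)): each region's maximum exceeds peak_thresh and the
    region is bounded on both sides by valley bins below valley_thresh.
    Returns (start, end) index pairs, end exclusive."""
    hist = np.asarray(hist, float)
    below = hist < valley_thresh
    regions, start = [], None
    for i, b in enumerate(below):
        if not b and start is None:
            start = i                      # peak region opens
        elif b and start is not None:
            regions.append((start, i))     # closed by a right valley
            start = None
    if start is not None:
        regions.append((start, len(hist)))
    # keep only regions with valleys on both sides and a high enough maximum
    return [(s, e) for s, e in regions
            if s > 0 and e < len(hist) and hist[s:e].max() > peak_thresh]
```

Each returned region would then be handed to the single-person axis detector.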
Principal-axis detection for people under occlusion. Each person is first segmented out and the pixels of his foreground region identified; the principal axis is then detected from those pixels by the least-median-of-squares method above. Segmentation under occlusion is realized with a color-template-based method: each model comprises a color model and an additional probability mask. When occlusion occurs, segmenting the moving objects becomes a classification problem, namely deciding which model each foreground pixel belongs to, which can be solved with the Bayes rule: foreground pixel X belongs to the k-th moving object (model) if
$$k=\arg\max_{i}P_i(X)\qquad(5)$$
Principal-axis detection under occlusion is illustrated in Fig. 3.
(3) Tracking under a single camera
A Kalman filter is used for tracking. The state vector X comprises the person's position (x, y) in the image and velocity (v_x, v_y); the observation vector Z is the person's position (x, y). The measured position is taken as the person's "land-point" in the image, i.e., the intersection of the person's principal axis with the lower edge of the bounding box. Then:
$$Z_t=[x_t,y_t]^T\qquad X_t=[x_t,v_{x,t},y_t,v_{y,t}]^T\qquad(6)$$
The person's walking speed can be regarded as approximately constant, so the state-transition and observation matrices are:
$$\Phi=\begin{bmatrix}1&\Delta t&0&0\\0&1&0&0\\0&0&1&\Delta t\\0&0&0&1\end{bmatrix}\qquad H=\begin{bmatrix}1&0&0&0\\0&0&1&0\end{bmatrix}\qquad(7)$$
The object's position in the current frame is predicted from its position in the previous frame and compared with the current frame's measurement (the intersection of the principal axis with the lower edge of the bounding box). If the distance between measurement and prediction is small, the measurement is considered highly credible and represents the object's current position; if the distance exceeds a certain threshold (for example, when the person's lower half is occluded), the measurement is not credible, and we update it with the intersection of the principal axis and another line, namely the perpendicular from the predicted point to the principal axis. The whole single-camera tracking framework is shown in Fig. 4.
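The constant-velocity Kalman filter with the matrices of Eq. (7) can be sketched as a generic textbook predict/update cycle; the gating logic described above is left to the caller, and all names are our assumptions.

```python
import numpy as np

def make_cv_model(dt=1.0):
    """State-transition and observation matrices of Eq. (7) for the
    constant-velocity state X = [x, v_x, y, v_y], observation Z = [x, y]."""
    F = np.array([[1, dt, 0, 0],
                  [0, 1,  0, 0],
                  [0, 0,  1, dt],
                  [0, 0,  0, 1]], float)
    H = np.array([[1, 0, 0, 0],
                  [0, 0, 1, 0]], float)
    return F, H

def kalman_step(x, P, z, F, H, Q, R):
    """One predict/update cycle. The patent's credibility test would gate z
    against H @ (F @ x) before calling this with the (possibly corrected) z."""
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    S = H @ P_pred @ H.T + R            # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S) # Kalman gain
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new
```

A measurement consistent with the prediction leaves the state essentially at the predicted position, matching the "credible measurement" branch above.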
(4) Finding all best principal-axis matching pairs
First, the homography matrices between the different image planes must be computed. A homography between two images describes the one-to-one correspondence of points lying on a common plane; its parameters are solved from simultaneous equations given corresponding points. The corresponding points are obtained manually, either by placing markers in the scene beforehand or by using special corresponding points already present in the scene.
Next, the principal-axis match likelihood function is defined; it measures the degree of match between people's principal axes in different views. Before the concrete definition, the geometric relations between principal axes in different views are shown in Fig. 5. Suppose there are two cameras i and j. Camera i observes the principal axis L_s^i of person s, whose projection on the ground plane is l_s^i; correspondingly, camera j observes the principal axis L_k^j of person k, with ground-plane projection l_k^j. Let H_ij be the homography from the image plane of camera i to the image plane of camera j. Transferring L_s^i through H_ij into the image-plane coordinates of camera j yields a line that intersects L_k^j at a point Q_sk^ij (symmetrically, transferring L_k^j into view i yields with L_s^i the intersection Q_ks^ji). By the properties of the homography, if person s observed by camera i and person k observed by camera j correspond to the same person in 3D space, then this intersection corresponds to that person's "land-point", i.e., the intersection of the person's principal axis with the ground plane. The distance between the measured land-point and the intersection can therefore be used to measure the degree of match between the axes: the smaller the distance, the better the two axes match.
According to the geometric relations of principal axes in different views, the match likelihood function between the axes of persons s and k is defined as
$$L(L_s^i,L_k^j)=p(X_s^i\mid Q_{ks}^{ji})\,p(X_k^j\mid Q_{sk}^{ij})\qquad(8)$$
where X_s^i is the land-point of person s observed by camera i, X_k^j is the land-point of person k observed by camera j, and Q_sk^ij is the intersection of axis s, transferred from view i into view j, with axis k (Q_ks^ji is defined symmetrically in view i).
Without loss of generality, the two probability density functions above are assumed to be Gaussian; then:
$$p(X_s^i\mid Q_{ks}^{ji})=(2\pi)^{-1}|\Sigma_s^i|^{-1/2}\exp\{-\tfrac12(X_s^i-Q_{ks}^{ji})(\Sigma_s^i)^{-1}(X_s^i-Q_{ks}^{ji})^{T}\}$$
$$p(X_k^j\mid Q_{sk}^{ij})=(2\pi)^{-1}|\Sigma_k^j|^{-1/2}\exp\{-\tfrac12(X_k^j-Q_{sk}^{ij})(\Sigma_k^j)^{-1}(X_k^j-Q_{sk}^{ij})^{T}\}\qquad(9)$$
Finally, all best matching pairs are found by the matching algorithm. Multi-camera matching can in fact be modeled as a maximum-likelihood problem: the match likelihood function of a truly corresponding axis pair is maximal among the match likelihoods of all candidate axis pairs. To simplify the problem, we define the principal-axis matching distance and convert the maximum-likelihood problem into a minimum-matching-distance problem:
$$\arg\max_{s,k}L(L_s^i,L_k^j)\Longleftrightarrow\arg\min_{s,k}D_{sk}^{ij}$$
where D_sk^ij is the principal-axis matching distance, defined as
$$D_{sk}^{ij}=(X_s^i-Q_{ks}^{ji})(\Sigma_s^i)^{-1}(X_s^i-Q_{ks}^{ji})^{T}+(X_k^j-Q_{sk}^{ij})(\Sigma_k^j)^{-1}(X_k^j-Q_{sk}^{ij})^{T}\qquad(10)$$
The smaller the matching distance D_sk^ij, the better the two axes match.
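The matching distance of Eq. (10) is straightforward to compute; in this minimal sketch the argument names are our assumptions.

```python
import numpy as np

def axis_match_distance(x_s, q_ks, cov_i, x_k, q_sk, cov_j):
    """Principal-axis matching distance of Eq. (10): the sum of two squared
    Mahalanobis distances between each view's measured land-point (x_s, x_k)
    and the intersection point obtained from the axis transferred across the
    homography from the other view (q_ks, q_sk)."""
    d1 = x_s - q_ks
    d2 = x_k - q_sk
    return float(d1 @ np.linalg.inv(cov_i) @ d1 + d2 @ np.linalg.inv(cov_j) @ d2)
```

With identity covariances this reduces to the sum of squared Euclidean distances in the two image planes.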
The principal-axis matching algorithm seeks the globally best set of matching pairs, i.e., the set minimizing the sum of their matching distances D_sk^ij. The algorithm between two cameras is as follows.
Suppose M principal axes L_1^i, L_2^i, ..., L_M^i are detected under camera i and N principal axes L_1^j, L_2^j, ..., L_N^j under camera j.
Step 1: combine the axes detected in the two views pairwise to form all possible matching pairs. Without loss of generality assume M ≤ N; then the M axes are paired in order with M axes chosen from the N, giving P_N^M combinations in total, each of the form
$$\theta_k=\{(L_1^i,L_{k_1}^j),(L_2^i,L_{k_2}^j),\dots,(L_M^i,L_{k_M}^j)\},\quad k=1,\dots,P_N^M\qquad(11)$$
Step 2: for each axis pair {m, n} in each combination, compute its matching distance D_mn^ij and require D_mn^ij < D_T, where D_T is an empirically determined threshold for judging whether the pair {m, n} matches. Pairs violating this constraint are deleted from θ_k.
Step 3: keep the combinations Θ_k with the maximum number l of matching pairs; all Θ_k form the set Θ:
$$\Theta=\{\Theta_k=(L_{k_1}^i,L_{k_1'}^j),(L_{k_2}^i,L_{k_2'}^j),\dots,(L_{k_l}^i,L_{k_l'}^j)\},\quad k\in P_N^M\qquad(12)$$
Step 4: in the set Θ, find the globally best combination λ, whose matching distances sum to the minimum:
$$\lambda=\arg\min_{k}\Big(\sum_{w=1}^{l}D_{(k_w,k_w')}^{(i,j)}\Big)\qquad(13)$$
Step 5: the resulting Θ_λ is the globally best matching combination, and each pair in it is a matched principal-axis pair.
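Steps 1 to 5 can be sketched as an exhaustive search over axis assignments; this brute-force illustration is suitable only for small M and N, and the names and tie-breaking order are our assumptions.

```python
from itertools import permutations

def best_axis_matching(dist, d_thresh):
    """dist[m][n] is the matching distance between axis m in camera i and
    axis n in camera j (M <= N). Enumerate injective assignments of the M
    axes to the N axes (the P_N^M combinations of Step 1), drop pairs whose
    distance is not below d_thresh (Step 2), and among the assignments with
    the most surviving pairs (Step 3) return the one with the smallest total
    distance (Steps 4-5)."""
    M, N = len(dist), len(dist[0])
    best, best_pairs, best_cost = [], -1, float("inf")
    for perm in permutations(range(N), M):
        pairs = [(m, n) for m, n in enumerate(perm) if dist[m][n] < d_thresh]
        cost = sum(dist[m][n] for m, n in pairs)
        if len(pairs) > best_pairs or (len(pairs) == best_pairs and cost < best_cost):
            best, best_pairs, best_cost = pairs, len(pairs), cost
    return best
```

In practice the number of people per view is small, so the P_N^M enumeration stays cheap.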
The above algorithm is easily extended to more than two cameras. Cameras are first combined pairwise; for each pair of cameras sharing a common ground-plane region, matching is established with the algorithm above. If the pairwise matches contradict each other, only the axis pairs with the minimum matching distance are kept.
(5) Fusing multi-view information to update tracking results
Once all matching axis pairs are found, the match information is used to update the tracking results under each single camera. In the two-camera case, this update step is effective only when the tracked person is inside the common ground-plane region of the two views.
Suppose the matching algorithm above finds the principal-axis pair of the same person in views i and j. The axis observed in view j is transferred into the image plane of view i through the homography between the two image planes; the intersection of the original axis in view i with the transferred line is the person's final position in image plane i, used to update the tracking result of the original single view i. The same holds for view j.
As in Fig. 5, if the axes L_s^i and L_k^j in the two views correspond to the same person, then transferring L_s^i from view i to view j yields a line that meets L_k^j at a point Q_sk^ij. This intersection is exactly the person's land-point in view j, i.e., the intersection of the person's principal axis with the ground plane.
For more than two cameras, a person has two or more such intersection points, and their mean is taken as the person's final land-point.
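The axis-transfer intersection and the multi-camera averaging described above can be sketched as a minimal 2D illustration; the function names are our assumptions.

```python
import numpy as np

def line_intersection(p1, d1, p2, d2):
    """Intersection of two 2D lines in point + direction form; used to
    intersect a view's own principal axis with the axis transferred from
    another view through the homography."""
    A = np.array([[d1[0], -d2[0]],
                  [d1[1], -d2[1]]], float)
    t = np.linalg.solve(A, np.asarray(p2, float) - np.asarray(p1, float))
    return np.asarray(p1, float) + t[0] * np.asarray(d1, float)

def fused_land_point(intersections):
    """With more than two cameras a person yields several intersection
    points; their mean is taken as the final land-point."""
    return np.mean(np.asarray(intersections, float), axis=0)
```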
When a person's lower half is occluded, the algorithm can still match robustly and accurately estimate the person's position (land-point) in the image, based on the predicted position and the detected principal axis.
To verify the concrete scheme of the present invention, extensive experiments were carried out on two databases, realizing pedestrian tracking based on principal-axis matching under multiple cameras. The experimental results further verify the validity and robustness of the method.
The experimental results are shown in Fig. 6. Tracked people are marked with bounding boxes; the number below each box is the person's index, the center line of the box is the detected principal axis, and the intersection of the axis with the lower edge of the box is the estimated position of the person in the image.
The numbers 1, 2, 3, 4 in Fig. 6(a) mark the multi-person tracking results under two cameras of the NLPR database.
The numbers 1, 2, 3, 4 in Fig. 6(b) mark the multi-person tracking results under three cameras of the NLPR database.
The numbers 1, 2 in Fig. 6(c) mark the tracking results for a single occluded person under two cameras of the PETS2001 database.
The numbers 1, 2, 3, 4 in Fig. 6(d) mark the tracking results for an occluded group under two cameras of the PETS2001 database.

Claims (8)

1.一种在多摄像机下基于主轴匹配的行人跟踪方法,包括步骤:1. A pedestrian tracking method based on main axis matching under multi-camera, comprising steps: 多摄像机是指使用两个或两个以上的单摄像机,该方法的前提条件是假设不同摄像机的可视区域中存在一个公共的平面,这个公共的平面是指地平面;Multi-camera refers to the use of two or more single cameras. The premise of this method is to assume that there is a common plane in the visible areas of different cameras, and this common plane refers to the ground plane; (1)对图像序列滤波,得到单高斯的背景模型,通过背景剪除方法进行运动目标分割,得到运动区域;(1) The image sequence is filtered to obtain a single Gaussian background model, and the moving target is segmented through the background pruning method to obtain the moving area; (2)提取运动区域中人的主轴特征作为匹配的特征,对三种情况下入的主轴检测为:单个人、一群人以及遮挡情况下的人的主轴检测;(2) Extract the main axis feature of the person in the motion area as the matching feature, and the main axis detection in the three cases is: the main axis detection of a single person, a group of people and people under occlusion; (3)结合检测到的主轴通过卡尔曼滤波器得到预测的主轴位置,实现单摄像机下的跟踪;(3) Combined with the detected main axis, the predicted main axis position is obtained through the Kalman filter, and the tracking under a single camera is realized; (4)计算不同图像平面之间的单映矩阵,采用最佳匹配对算法,依据预测的主轴位置来寻找所有的最佳匹配对;(4) Calculate the homography matrix between different image planes, use the best matching pair algorithm, and find all the best matching pairs according to the predicted main axis position; (5)当找到所有最佳匹配对后,融合匹配信息更新单视角下的跟踪结果。(5) When all the best matching pairs are found, the matching information is fused to update the tracking results under the single view. 2.按权利要求1所述的方法,其特征在于,运动目标分割,运动目标分割是运动跟踪的第一步,算法采用的是单高斯模型的背景剪除方法来检测运动区域,采用归一化的颜色rgs模型,其中r=R/(R+G+B),g=G/(R+G+B),s=R+G+B,先对一段背景图像序列进行滤波,得到单高斯的背景模型,然后将当前图像与背景图像进行差分处理,在对差分图像进行阈值处理得到二值化图像Mxy,得到二值化图像后,在对其进行形态学算子中腐蚀与膨胀操作来进一步滤除噪声,最后通过二值化的连通分量分析用于提取单连通的前景区域,作为分割后的运动区域。2. 
by the described method of claim 1, it is characterized in that, moving object segmentation, moving object segmentation is the first step of motion tracking, and what algorithm adopted is the background pruning method of single Gaussian model to detect moving area, adopts normalization The color rgs model, where r=R/(R+G+B), g=G/(R+G+B), s=R+G+B, first filter a sequence of background images to obtain a single Gaussian background model, and then perform differential processing on the current image and the background image, and perform threshold processing on the differential image to obtain a binary image M xy . After obtaining the binary image, perform erosion and expansion operations on it in the morphological operator To further filter the noise, and finally use the binary connected component analysis to extract the single-connected foreground area as the segmented motion area. 3.按权利要求1所述的方法,其特征在于,单摄像机下的跟踪采用卡尔曼滤波器来实现跟踪由前一帧物体的位置预测当前帧物体的位置,然后和当前帧的观测值进行比较,如果观测值和预测值之间的距离很小,则表明观测值可信度高,观测值代表当前帧物体的位置;如果两者的距离超过一定的阈值,则观测值不可信,定义主轴与另一条线的交点来更新观测值,这条线就是预测点到主轴的垂直线。3. by the described method of claim 1, it is characterized in that, the tracking under single camera adopts Kalman filter to realize tracking by the position of previous frame object prediction current frame object, then carries out with the observed value of current frame In comparison, if the distance between the observed value and the predicted value is small, it indicates that the observed value is highly reliable, and the observed value represents the position of the object in the current frame; if the distance between the two exceeds a certain threshold, the observed value is not credible, defined The intersection of the main axis and another line to update the observation value, this line is the perpendicular line from the predicted point to the main axis. 4.按权利要求1所述的方法,其特征在于,所述的寻找所有的最佳匹配对包括步骤:4. 
The method according to claim 1, wherein said finding all best matching pairs comprises the steps of:
computing the homography matrices between the different image planes;
the homography matrix describes the one-to-one correspondence between points of the same plane in different images; its parameters are solved from simultaneous equations over given corresponding points, and the corresponding points are obtained manually, i.e., by setting marker points in the scene in advance or by using special corresponding points already present in the scene;
computing the matching likelihood values of all principal axis pairs;
according to the geometric relationship of the principal axes under different views, the matching likelihood function between principal axes s and k is defined as

L(L_i^s, L_j^k) = p(X_s^i | Q_ks^ji) · p(X_k^j | Q_sk^ij)

where X_s^i is the "foot point" of person s observed by camera i, X_k^j is the "foot point" of person k observed by camera j, and Q_sk^ij is the intersection of principal axis s, transferred from view i to view j, with principal axis k (Q_ks^ji is defined symmetrically with the roles of the two views exchanged);
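As an illustration only (not the patent's implementation; the function names, the Gaussian form of the noise model, and the σ parameter are assumptions), the homography transfer of an image point and a two-sided axis-pair likelihood of this kind could be sketched as:

```python
import numpy as np

def transfer_point(H, p):
    """Map an image point p = (x, y) into the other view through a
    3x3 homography H, using homogeneous coordinates."""
    q = H @ np.array([p[0], p[1], 1.0])
    return q[:2] / q[2]

def match_likelihood(foot_i, q_in_i, foot_j, q_in_j, sigma=5.0):
    """Two-sided likelihood of an axis pair: in each view, penalize the
    distance between the observed "foot point" and the intersection of
    the axis transferred from the other view (Gaussian noise assumed)."""
    d_i = np.linalg.norm(np.asarray(foot_i, float) - np.asarray(q_in_i, float))
    d_j = np.linalg.norm(np.asarray(foot_j, float) - np.asarray(q_in_j, float))
    return float(np.exp(-(d_i**2 + d_j**2) / (2.0 * sigma**2)))
```

With H the identity a point maps to itself, and when both intersections coincide with the observed foot points the likelihood is 1, decaying as the distances grow.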
finding all best matching pairs according to the matching algorithm;
the multi-camera matching is modeled as a maximum-likelihood problem: the matching likelihood of a truly corresponding pair of principal axes is the largest among the matching likelihoods of all candidate axis pairs; to simplify the problem, the maximum-likelihood formulation is converted into a minimum-matching-distance problem;
the algorithm for finding all best matching pairs is as follows:
under camera i, M principal axes are detected: L_1^i, L_2^i, ..., L_M^i; under camera j, N principal axes are detected: L_1^j, L_2^j, ..., L_N^j;
Step 1: combine the principal axes detected in the two views pairwise to form all possible matching axis pairs; without loss of generality, assume M ≤ N, so that the M axes are matched, in order, against M axes selected from the N axes, forming P_N^M combinations, each of the form:
θ_k = {(L_1^i, L_k1^j), (L_2^i, L_k2^j), ..., (L_M^i, L_kM^j)},  k = 1, ..., P_N^M;
Step 2: for each axis pair {m, n} in each combination, compute its matching distance D_mn^ij and require D_mn^ij < D_T, where D_T is a threshold used to judge whether the axis pair {m, n} matches; if this constraint is not satisfied, the pair {m, n} is deleted from θ_k;
Step 3: select the combinations Θ_k with the largest number l of matching pairs; all such Θ_k form the set Θ:

Θ = {Θ_k = (L_k1^i, L_k1^j), (L_k2^i, L_k2^j), ..., (L_kl^i, L_kl^j)},  where k ∈ P_N^M;

Step 4: within the set Θ, find the globally best matching combination λ, such that the sum of all its matching distances D_(kw,kw)^(i,j) reaches the minimum, i.e., satisfies the following formula:
λ = arg min_k ( Σ_{w=1}^{l} D_(k_w,k_w)^(i,j) );
Step 5: the resulting Θ_λ is the globally best matching combination, and each pair in it is a matched principal axis pair.
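Assuming the pairwise matching distances are already available in a matrix, Steps 1-5 above can be sketched as a brute-force enumeration (the function name and data layout are illustrative; the P_N^M enumeration is factorial in cost and only practical for small M and N):

```python
from itertools import permutations

def best_matching_pairs(D, d_thresh):
    """D[m][n] is the matching distance between axis m in camera i and
    axis n in camera j (M <= N). Enumerate the P(N, M) assignments,
    discard pairs above the threshold, keep the assignments with the
    most surviving pairs, and return the one with the smallest total
    distance as a list of (m, n) index pairs."""
    M, N = len(D), len(D[0])
    best, best_len, best_cost = [], -1, float("inf")
    for assign in permutations(range(N), M):      # Step 1: all combinations
        pairs = [(m, n) for m, n in enumerate(assign)
                 if D[m][n] < d_thresh]           # Step 2: threshold pruning
        cost = sum(D[m][n] for m, n in pairs)
        # Steps 3-4: prefer more matched pairs, then a smaller distance sum
        if len(pairs) > best_len or (len(pairs) == best_len and cost < best_cost):
            best, best_len, best_cost = pairs, len(pairs), cost
    return best                                   # Step 5: global best pairs
```

For example, with two people seen by both cameras and distances D = [[0.1, 9.0], [9.0, 0.2]] and threshold 1.0, the assignment pairing axis 0 with axis 0 and axis 1 with axis 1 survives pruning and minimizes the total distance.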
5. The method according to claim 1, wherein said fusing the matching information to update the tracking results under a single view comprises the steps below. Once all matching principal axis pairs are found, the matching information is used to update the tracking results under each single camera; for the two-camera case, this update step is effective only when the tracked person is in the ground-plane region common to both views.
Suppose the matching algorithm above finds the axis pair corresponding to the same person under two views i and j; the axis under view j is transferred into the image plane of view i through the homography between the two image planes; the intersection of the original axis under view i with the transferred line is then the person's final position in image plane i, and is used to update the tracking result under the single view i; the same holds for view j.
For more than two cameras, a person may have two or more such intersection points, in which case the average of these intersections is taken as the person's final position.
6.
The method according to claim 3, wherein, for the detection of the principal axis of a single person, the least-median-of-squares method is used: let the perpendicular distance from the i-th foreground pixel X_i to a candidate line l be D(X_i, l); by the least-median-of-squares criterion, the principal axis L minimizes the median of the squared perpendicular distances over all foreground pixels,

L = arg min_l median_i { D(X_i, l)^2 }.

7. The method according to claim 3, wherein principal axis detection for a group of people comprises two parts, individual segmentation and principal axis detection. For individual segmentation a vertical projection histogram is introduced, in which the individuals of the group correspond to peak regions of the histogram; a peak region satisfying the conditions below corresponds to a single individual and is called a significant peak region. A significant peak region must satisfy two conditions:
a) the maximum of the peak region must be greater than a specific threshold, the peak threshold P_T;
b) the minima of the peak region must be less than a specific threshold, the valley threshold C_T.
Suppose that within a peak region, P_1, P_2, ..., P_n are its local extrema and C_l, C_r are the left and right valley values of the region; the two conditions are then expressed mathematically as

max(P_1, P_2, ..., P_n) > P_T
C_l < C_T,  C_r < C_T.

The valley threshold C_T is chosen as the mean of the whole histogram, and the peak threshold P_T is chosen as eighty percent of a person's height in image coordinates; single-person principal axis detection is then applied in the axis-detection part.

8. The method according to claim 3, wherein, for detecting people's principal axes under occlusion, the people are first segmented apart to find the pixels of each foreground region, and the least-median-of-squares method is then applied to the segmented foreground pixels to detect each person's principal axis. Segmentation under occlusion is realized with a color-template-based method, the model comprising a color model and an additional probability mask. When occlusion occurs, segmenting the moving objects is posed as a classification problem, namely judging which foreground pixels belong to which model; the classification problem is solved with Bayes' rule: a foreground pixel X belongs to the k-th moving object if

k = arg max_i P_i(X).
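For illustration, the least-median-of-squares axis fit of claim 6 can be sketched with random two-point line hypotheses (the sampling scheme, trial count, and function names are assumptions; the claims do not specify how the minimization is carried out):

```python
import random
import statistics

def point_line_dist(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    return abs(dy * (px - ax) - dx * (py - ay)) / (dx * dx + dy * dy) ** 0.5

def lmeds_axis(points, n_trials=200, seed=0):
    """Least-median-of-squares line fit: repeatedly hypothesize a line
    through two random foreground pixels and keep the one minimizing the
    median squared perpendicular distance over all pixels."""
    rng = random.Random(seed)
    best_line, best_med = None, float("inf")
    for _ in range(n_trials):
        a, b = rng.sample(points, 2)
        if a == b:  # degenerate hypothesis (duplicate coordinates)
            continue
        med = statistics.median(point_line_dist(p, a, b) ** 2 for p in points)
        if med < best_med:
            best_line, best_med = (a, b), med
    return best_line, best_med
```

Because the median is insensitive to up to half the pixels being outliers, a single stray pixel (e.g. an arm or a shadow fragment) does not pull the fitted axis away from the body's vertical line.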
CNB200510108137XA 2005-09-29 2005-09-29 Pedestrian tracting method based on principal axis marriage under multiple vedio cameras Expired - Fee Related CN100525395C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB200510108137XA CN100525395C (en) 2005-09-29 2005-09-29 Pedestrian tracting method based on principal axis marriage under multiple vedio cameras


Publications (2)

Publication Number Publication Date
CN1941850A CN1941850A (en) 2007-04-04
CN100525395C true CN100525395C (en) 2009-08-05

Family

ID=37959590

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB200510108137XA Expired - Fee Related CN100525395C (en) 2005-09-29 2005-09-29 Pedestrian tracting method based on principal axis marriage under multiple vedio cameras

Country Status (1)

Country Link
CN (1) CN100525395C (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5205337B2 (en) * 2009-06-18 2013-06-05 富士フイルム株式会社 Target tracking device, image tracking device, operation control method thereof, and digital camera
CN101930603B (en) * 2010-08-06 2012-08-22 华南理工大学 Method for fusing image data of medium-high speed sensor network
CN102385750B (en) * 2011-06-22 2013-07-10 清华大学 Line matching method and line matching system on basis of geometrical relationship
CN103099623B (en) * 2013-01-25 2014-11-05 中国科学院自动化研究所 Extraction method of kinesiology parameters
CN103164855B (en) * 2013-02-26 2016-04-27 清华大学深圳研究生院 A kind of Bayesian decision foreground extracting method in conjunction with reflected light photograph
CN103729620B (en) * 2013-12-12 2017-11-03 北京大学 A kind of multi-view pedestrian detection method based on multi-view Bayesian network
CN103700106A (en) * 2013-12-26 2014-04-02 南京理工大学 Distributed-camera-based multi-view moving object counting and positioning method
CN104978735B (en) * 2014-04-14 2018-02-13 航天信息股份有限公司 It is suitable for the background modeling method of random noise and illumination variation
JP6800628B2 (en) * 2016-06-22 2020-12-16 キヤノン株式会社 Tracking device, tracking method, and program
CN109521450B (en) * 2017-09-20 2020-12-29 阿里巴巴(中国)有限公司 Positioning drift detection method and device
CN110276233A (en) * 2018-03-15 2019-09-24 南京大学 A multi-camera collaborative tracking system based on deep learning
CN109002825A (en) * 2018-08-07 2018-12-14 成都睿码科技有限责任公司 Hand bandage for dressing detection method based on video analysis
CN109165600B (en) * 2018-08-27 2021-11-26 浙江大丰实业股份有限公司 Intelligent search platform for stage performance personnel
WO2020061792A1 (en) * 2018-09-26 2020-04-02 Intel Corporation Real-time multi-view detection of objects in multi-camera environments
CN110503097A (en) * 2019-08-27 2019-11-26 腾讯科技(深圳)有限公司 Training method, device and the storage medium of image processing model

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1508755A (en) * 2002-12-17 2004-06-30 中国科学院自动化研究所 Sensitive video detection method
CN1508756A (en) * 2002-12-17 2004-06-30 中国科学院自动化研究所 Sensitive image recognition method based on human body part and shape information
WO2004081895A1 (en) * 2003-03-10 2004-09-23 Mobotix Ag Monitoring device


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
A robust real-time tracking method that handles partial occlusion. Tian Yuan, Tan Tieniu, Sun Hongzan, Hu Weiming. Journal of Engineering Graphics, 2001 Supplement. 2001 *


Similar Documents

Publication Publication Date Title
CN100525395C (en) Pedestrian tracting method based on principal axis marriage under multiple vedio cameras
Zhou et al. Seamless fusion of LiDAR and aerial imagery for building extraction
CN106203274B (en) Real-time pedestrian detection system and method in video monitoring
CN102663743B (en) Personage&#39;s method for tracing that in a kind of complex scene, many Kameras are collaborative
CN110502965A (en) A Construction Helmet Wearing Monitoring Method Based on Computer Vision Human Pose Estimation
CN110175576A (en) A kind of driving vehicle visible detection method of combination laser point cloud data
CN111488804A (en) Method for detection and identification of labor protection equipment wearing condition based on deep learning
CN106204640A (en) A kind of moving object detection system and method
CN106600625A (en) Image processing method and device for detecting small-sized living thing
CN103077539A (en) Moving object tracking method under complicated background and sheltering condition
CN104978567B (en) Vehicle checking method based on scene classification
CN103824070A (en) A Fast Pedestrian Detection Method Based on Computer Vision
CN102622769A (en) Multi-target tracking method by taking depth as leading clue under dynamic scene
CN105160649A (en) Multi-target tracking method and system based on kernel function unsupervised clustering
Börcs et al. Extraction of vehicle groups in airborne LiDAR point clouds with two-level point processes
Manduchi et al. Distinctiveness maps for image matching
CN103400120A (en) Video analysis-based bank self-service area push behavior detection method
Hautière et al. Road scene analysis by stereovision: a robust and quasi-dense approach
CN104200483A (en) Human body central line based target detection method under multi-camera environment
CN115410097A (en) Low-altitude unmanned video tracking method
Wang et al. Self-calibration of traffic surveillance cameras based on moving vehicle appearance and 3-D vehicle modeling
Raikar et al. Automatic building detection from satellite images using internal gray variance and digital surface model
Maithil et al. Semantic Segmentation of Urban Area Satellite Imagery Using DensePlusU-Net
Unno et al. Vehicle motion tracking using symmetry of vehicle and background subtraction
Wu et al. Enhanced roadway geometry data collection using an effective video log image-processing algorithm

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20090805

Termination date: 20180929