CN100447820C

CN100447820C - Bus Passenger Flow Statistics Method and System Based on Stereo Vision

Info

Publication number: CN100447820C
Application number: CNB2005100602882A
Authority: CN
Inventors: 刘济林; 于海滨; 沈晔湖
Original assignee: Zhejiang University ZJU
Current assignee: Zhejiang University ZJU
Priority date: 2005-08-04
Filing date: 2005-08-04
Publication date: 2008-12-31
Anticipated expiration: 2025-08-04
Also published as: CN1731456A

Abstract

The invention discloses a bus passenger flow statistics method and system based on stereo vision. The present invention adopts stereo vision device and algorithm. The stereo vision image acquisition device is placed on the top of the bus door to collect the images of passengers getting on and off the bus in real time. The processor uses the stereo vision algorithm to process the collected binocular images to obtain the image depth information, combined with the processing information of the monocular image to obtain the head of the passengers. The position, size, grayscale and other information of the top, and then the tracking part will track the information on the top of the passenger's head in real time, and the number of passengers getting on and off the bus can be known from the tracking results, so as to obtain real-time and accurate passenger flow information. The invention improves the accuracy rate of the bus passenger flow statistics system, and the obtained passenger flow information can be used as a basis for deploying urban bus frequency density and bus line planning and design, and plays an important role in promoting the healthy development of the automobile passenger transport business.

Description

Bus Passenger Flow Statistics Method and System Based on Stereo Vision

技术领域： Technical field:

本发明涉及公交客流统计，尤其涉及一种基于立体视觉的公交客流统计方法及其系统。The invention relates to bus passenger flow statistics, in particular to a method and system for bus passenger flow statistics based on stereo vision.

背景技术： Background technique:

近些年来，展览馆、体育场、图书馆、机场、地铁及公交车等公共场所都安装有智能监控与调度系统，对这些系统来说，实时准确的客流信息的重要性不言而喻。而目前客流信息统计的主要手段为红外遮挡系统及压力传感系统，这两种传统方法的优势在于实现简单，成本低廉，但其主要缺陷在于统计不准确，对客流高峰期拥挤状况无能为力。In recent years, public places such as exhibition halls, stadiums, libraries, airports, subways and buses have installed intelligent monitoring and dispatching systems. For these systems, the importance of real-time and accurate passenger flow information is self-evident. At present, the main means of passenger flow information statistics are infrared occlusion system and pressure sensor system. The advantages of these two traditional methods are simple implementation and low cost, but their main disadvantages are inaccurate statistics, and they are powerless to solve the congestion situation during peak passenger flow.

相对于其他信息，图像信息的容量更大，更丰富，因此图像处理技术的兴起与不断的发展为传统的客流统计技术面临的一系列问题的解决提出了很多新方法。目前，已经有很多基于图像处理的方法应用于客流统计系统，但它们所采用的方法都是基于单目二维图像的，应用于公交客流统计系统时，还存在很多无法克服的问题，这些问题主要有：Compared with other information, the capacity of image information is larger and richer. Therefore, the rise and continuous development of image processing technology has proposed many new methods for solving a series of problems faced by traditional passenger flow counting technology. At present, there are many methods based on image processing applied to the passenger flow statistics system, but the methods they adopt are all based on monocular two-dimensional images. When applied to the bus passenger flow statistics system, there are still many insurmountable problems. These problems There are:

1)背景随光线变化复杂，不利于前景(人头部)的提取；1) The background changes complexly with the light, which is not conducive to the extraction of the foreground (human head);

2)客流高峰拥挤情况下，人与人之间结合紧密，基于二维信息的方法很难将拥挤在一起的几个人准确的区分开来，而且拥挤情况又是非常频繁的；2) In the case of peak passenger flow congestion, people are closely connected, and the method based on two-dimensional information is difficult to accurately distinguish several people who are crowded together, and the crowding situation is very frequent;

3)由于公交车台阶的存在，导致同一人的头部在上下车时在位置固定的单目摄像机看来大小是时刻变化的，非常不利于识别与跟踪。3) Due to the existence of the steps of the bus, the head of the same person appears to change in size from the monocular camera at a fixed position when getting on and off the bus, which is very unfavorable for recognition and tracking.

发明内容： Invention content:

本发明的目的是提供一种基于立体视觉的公交客流统计方法及其系统。The purpose of the present invention is to provide a method and system for bus passenger flow statistics based on stereo vision.

基于立体视觉的公交客流统计方法：处理器对立体视觉装置获取的双目图像进行立体视觉处理，得到场景中的各点到摄像机之间的距离，然后在距离上设置阈值，得到距离摄像机某一距离范围内的场景中的所有点，通过对这些点的去噪、拟和，再结合单目图像的特征识别方法，将那些近似组成圆的场景中点的集合作为人的头部，从而实现了人头部的检测，再将人头部检测的结果的位置、半径、灰度信息交由跟踪算法实施有效跟踪，便可以正确判断出客流的运动方向，从而完成客流信息统计。Bus passenger flow statistics method based on stereo vision: the processor performs stereo vision processing on the binocular images acquired by the stereo vision device, obtains the distance between each point in the scene and the camera, and then sets a threshold on the distance to obtain a certain distance from the camera All points in the scene within the distance range, through denoising and fitting of these points, combined with the feature recognition method of the monocular image, the set of points in the scene that approximately form a circle is used as the head of the person, so as to realize After the detection of the human head, the position, radius, and gray level information of the human head detection result are handed over to the tracking algorithm for effective tracking, so that the movement direction of the passenger flow can be correctly judged, and the passenger flow information statistics can be completed.

立体视觉处理：指的是利用两台略有位置偏移的摄像机通过三角运算，获得场景的深度信息，立体视觉处理方法步骤如下：1)在不同的图像之间建立图像特征点之间的对应关系，2)计算图像特征点位置之间的相对偏移，3)通过已知的摄像机参数计算图中特征点的三维位置信息。Stereo vision processing: refers to the use of two slightly shifted cameras to obtain the depth information of the scene through triangulation operations. The steps of the stereo vision processing method are as follows: 1) Establish correspondence between image feature points between different images 2) Calculate the relative offset between the image feature point positions, 3) Calculate the three-dimensional position information of the feature points in the image through the known camera parameters.

单目图像的特征识别方法：将人体头部的形状作为特征，取双目图像中的任一目图像作为单目图像，对其进行图像的边缘检测，在边缘检测结果上实施改进的哈夫变换，在哈夫变换的结果中进行模糊聚类，从而获得场景中所有类圆形目标作为人体头部的检测结果。Feature recognition method of monocular image: take the shape of the human head as a feature, take any image in the binocular image as a monocular image, perform image edge detection on it, and implement improved Hough transform on the edge detection result , fuzzy clustering is performed on the results of the Hough transform, so as to obtain the detection results of all circular objects in the scene as human heads.

基于立体视觉的公交客流统计装置：处理器分别与控制电路、网络传输模块、协议转换模块、存储器相接，控制电路分别与双目图像采集模块、网络传输模块相接，网络传输模块与网络接口相接，存储器与双目图像采集模块相接。Bus passenger flow statistics device based on stereo vision: the processor is connected to the control circuit, network transmission module, protocol conversion module, and memory, the control circuit is connected to the binocular image acquisition module and the network transmission module, and the network transmission module is connected to the network interface connected, the memory is connected with the binocular image acquisition module.

双目图像采集模块：时序控制电路分别与摄像机1和摄像机2相接。Binocular image acquisition module: the timing control circuit is connected to camera 1 and camera 2 respectively.

本发明可以克服现有的公交客流统计系统存在的很多问题，对于城市公交汽车，客流统计信息对于调配公交车辆班次密度和对公交线路的规划设计具有十分重要的参考意义；对于长途客运汽车，客流统计信息对于超载、私自收费等威胁乘客人身安全、造成国家经济损失的不良现象具有良好的监督作用。总之，本发明不仅可以为公交规划、市政建设提供更加可靠的客流数据，对汽车客运运输事业的健康发展同样起着重要的作用。The present invention can overcome many problems existing in the existing bus passenger flow statistics system. For urban buses, the passenger flow statistics information has very important reference significance for the deployment of bus frequency density and the planning and design of bus lines; for long-distance passenger vehicles, the passenger flow Statistical information has a good supervisory effect on overloading, private toll collection and other undesirable phenomena that threaten the personal safety of passengers and cause national economic losses. In a word, the present invention can not only provide more reliable passenger flow data for public transportation planning and municipal construction, but also play an important role in the healthy development of automobile passenger transportation.

附图说明： Description of drawings:

图1是基于立体视觉的公交客流统计系统结构框图；Fig. 1 is a block diagram of the bus passenger flow statistics system based on stereo vision;

图2是本发明的双目图像采集模块结构框图；Fig. 2 is a structural block diagram of the binocular image acquisition module of the present invention;

图3是本发明实现人头部检测算法软件流程图；Fig. 3 is that the present invention realizes the human head detection algorithm software flowchart;

图4是本发明跟踪算法软件流程图。Fig. 4 is a software flow chart of the tracking algorithm of the present invention.

具体实施方式： Detailed ways:

如图1所示，基于立体视觉的公交客流统计系统：处理器分别与控制电路、网络传输模块、协议转换模块、存储器相接，控制电路分别与双目图像采集模块、网络传输模块相接，网络传输模块与网络接口相接，存储器与双目图像采集模块相接。为确保实时处理，图像分辨率不宜过大，不采用彩色图像，直接由摄像头获取灰度图像，在这种情况下，图像数据量不是很大，不需要考虑与处理器接口复杂的SDRAM，直接采用处理器片上SRAM或者外部扩展SRAM。立体视觉算法与一般图像处理算法比较，运算量大，浮点运算多，对内存的需求大，为确保实时性要求，处理器的选择很重要。TMS320C6000系列是德州仪器公司生产的高档DSP产品，这种DSP可以提高系统性价比，减少开发时间，增加可靠性，得到了十分广泛的应用。其主要特性为：①定点/浮点系列兼容DSP，主频100MHz-600MHz；②具有VelciTI^TM先进VLIW结构内核；③具有类似RISC的指令集；④片内集成大容量SRAM，可达8M；⑤高效率协处理器(C64x)；⑥片内提供多种集成外设。包括多通道缓冲串口McBSP，多通道音频串口McASP和多通道DMA/EDMA控制器等；其特性决定了C6000系列特别适合于开发图像产品，因此作为立体视觉客流统计系统的中央处理器，C6000系列是个合适的选择。嵌入式系统开发中，高效的调试手段是异常重要的，不仅在产品研发阶段需要调试接口，在产品上市初期，为了修补各种研发阶段没有遭遇的问题，也必须保留调试接口。考虑到图像的数据量，该调试接口必须保证较高的传输速率，综合考虑传输率与开发成本，网络接口是一个比较理想的选择。采用百兆以太网卡芯片(如RTL8139)作为网络接口，可以在客流统计系统与宿主机(可以为PC或专门开发的带有网络接口的嵌入式系统)之间传送各种必要的调试数据甚至是实时的双目图像数据，为高效的调试提供保障。As shown in Figure 1, the bus passenger flow statistics system based on stereo vision: the processor is connected with the control circuit, network transmission module, protocol conversion module, and memory respectively, and the control circuit is connected with the binocular image acquisition module and the network transmission module respectively. The network transmission module is connected with the network interface, and the memory is connected with the binocular image acquisition module. In order to ensure real-time processing, the image resolution should not be too large, do not use color images, and directly obtain grayscale images from the camera. Use processor on-chip SRAM or external expansion SRAM. Compared with the general image processing algorithm, the stereo vision algorithm has a large amount of computation, many floating-point operations, and a large demand for memory. In order to ensure real-time requirements, the choice of processor is very important. TMS320C6000 series is a high-end DSP product produced by Texas Instruments. This kind of DSP can improve the cost performance of the system, reduce the development time, increase the reliability, and has been widely used. Its main features are: ① Fixed-point/floating-point series compatible with DSP, main frequency 100MHz-600MHz; ② VelciTI ^TM advanced VLIW structure core; ③ RISC-like instruction set; ④ On-chip integrated large-capacity SRAM, up to 8M; ⑤ High-efficiency coprocessor (C64x); ⑥ provides a variety of integrated peripherals on-chip. Including multi-channel buffer serial port McBSP, multi-channel audio serial port McASP and multi-channel DMA/EDMA controller, etc.; its characteristics determine that the C6000 series is especially suitable for developing image products, so as the central processor of the stereo vision passenger flow statistics system, the C6000 series is a suitable choice. In the development of embedded systems, efficient debugging methods are extremely important. Not only are debugging interfaces needed in the product development stage, but also in the initial stage of product launch, in order to repair various problems that have not been encountered in the development stage, debugging interfaces must also be retained. Considering the amount of image data, the debugging interface must ensure a high transmission rate. Considering the transmission rate and development cost comprehensively, the network interface is an ideal choice. Using a 100M Ethernet card chip (such as RTL8139) as a network interface, various necessary debugging data and even Real-time binocular image data provides guarantee for efficient debugging.

如图2所示，双目图像采集模块：时序控制电路分别与摄像机1和摄像机2相接。立体视觉系统采用双目摄像机，且双目摄像机为水平放置，即两摄像机镜头中心连线处于同一水平线上。考虑到摄像机安装高度有限，为了兼顾两摄像机具有较大的公共视场，需要采用广角镜头。镜头焦距不大于600像素宽度。考虑实时性要求，两个摄像机采集灰度图像，结合立体视觉精度要求，采集图像分辨率为640*480；两摄像机之间基线长度为15厘米。As shown in Figure 2, the binocular image acquisition module: the timing control circuit is connected to the camera 1 and the camera 2 respectively. The stereo vision system adopts binocular cameras, and the binocular cameras are placed horizontally, that is, the line connecting the centers of the lenses of the two cameras is on the same horizontal line. Considering the limited installation height of the cameras, in order to take into account that the two cameras have a larger common field of view, a wide-angle lens is required. The focal length of the lens is not greater than 600 pixels wide. Considering the real-time requirements, two cameras collect grayscale images, combined with the stereo vision accuracy requirements, the resolution of the collected images is 640*480; the baseline length between the two cameras is 15 cm.

如图3所示，检测模块的算法流程图，首先对双目图像进行立体视觉处理，采用基于窗口的匹配算法获得含深度信息的图像I(x，y，z)，在Z方向设定相应的阈值后，可能为人头部的区域就都被提取出来。随后在双目图像中任意取出一幅作为单目图像进行平面类圆物体提取。提取过程先要对单目图像进行边缘检测、背景去除等预处理工作，接着对经过预处理的前景边缘图像进行改进的哈夫变换，对哈夫变换的结果引入模糊度量，在模糊置信度的指导下去除哈夫变换结果中的虚假类圆物体，获得单目图像中所有类圆物体的位置与半径。最后，将立体视觉算法处理的结果与单目图像哈夫变换的结果结合起来，提取出前景中所有人头部所在的位置，并同时计算出跟踪部分所需要的每个人头部的特征向量(x，y，z，R，G)。由于人的头部明显高于人体肩部等其他部位，也就是说人体头部与摄像机之间的距离比其他部分近。利用立体视觉的方法可以得到场景中的各点到摄像机之间的距离，因此我们可以得到距离摄像机某一范围内的场景中的所有点。通过对这些点的去噪、拟和过程，将那些近似组成圆的场景中点的集合作为人的头部，从而实现了人头部的检测，再将人头部检测的结果(位置、半径、灰度等信息)交由跟踪算法实施有效跟踪，便可以正确判断出此人运动方向，从而完成计数。立体视觉技术能够利用两台略有位置偏移的摄像机通过三角运算，获得场景的深度信息。摄像机模块同时获取场景的两幅图像。场景中的点分别在两幅图像中存在像点。在不同的图像中像点的位置不同，这个位置的不同被称为视差，场景中的点距离摄像机距离不同导致视差大小也不相同，距离近则视差大，距离远则视差小。立体视觉正是基于不同的偏移程度，通过三角运算来确定场景中的物体到摄像机的距离。立体视觉大体上可以分为以下三步：1、在不同的图像之间建立图像特征点之间的对应关系；2、计算图像特征点位置之间的相对偏移；3、通过已知的摄像机参数计算图中特征点的三维位置信息。其中最关键的是第一步。为了建立两幅图中的对应点对，我们采用基于窗口的匹配方法，该方法适合硬件加速并且具有较为稳定的输出结果，相似度衡量函数采用SAD，表达式如下：As shown in Figure 3, the algorithm flow chart of the detection module, firstly, stereo vision processing is performed on the binocular image, and the image I(x, y, z) containing depth information is obtained by using the window-based matching algorithm, and the corresponding value is set in the Z direction. After the threshold value, the regions that may be human heads are extracted. Then randomly select one of the binocular images as a monocular image for plane circle-like object extraction. The extraction process first requires preprocessing such as edge detection and background removal on the monocular image, and then performs an improved Hough transform on the preprocessed foreground edge image, and introduces a fuzzy measure to the result of the Hough transform. Under the guidance, the false circle-like objects in the Hough transform result are removed, and the positions and radii of all circle-like objects in the monocular image are obtained. Finally, the results of the stereo vision algorithm are combined with the results of the Hough transform of the monocular image to extract the positions of the heads of all people in the foreground, and at the same time calculate the eigenvectors of each head required for the tracking part ( x, y, z, R, G). Since the head of the human body is obviously higher than other parts such as the shoulder of the human body, that is to say, the distance between the head of the human body and the camera is closer than other parts. The distance between each point in the scene and the camera can be obtained by using the method of stereo vision, so we can obtain all points in the scene within a certain range from the camera. Through the denoising and fitting process of these points, the set of points in the scene that approximately form a circle is used as the head of the person, thereby realizing the detection of the head, and then the detection results of the head (position, radius) , grayscale and other information) is effectively tracked by the tracking algorithm, and the direction of the person's movement can be correctly judged, thereby completing the counting. Stereo vision technology can use two slightly offset cameras to obtain the depth information of the scene through triangulation. The camera module acquires two images of the scene at the same time. The points in the scene have image points in the two images respectively. In different images, the position of the image point is different. This difference in position is called parallax. The distance between the point in the scene and the camera is different, resulting in different parallax. The distance is close, the parallax is large, and the distance is small. Stereo vision is based on different degrees of offset and uses triangulation to determine the distance from objects in the scene to the camera. Stereo vision can be roughly divided into the following three steps: 1. Establish the correspondence between image feature points between different images; 2. Calculate the relative offset between the image feature point positions; 3. Through the known camera The three-dimensional position information of the feature points in the parameter calculation graph. The most critical of these is the first step. In order to establish corresponding point pairs in the two images, we use a window-based matching method, which is suitable for hardware acceleration and has relatively stable output results. The similarity measurement function uses SAD, and the expression is as follows:

${min min}_{d d = = {d d}_{min min}}^{{d d}_{max max}} {Σ Σ}_{i i = = - - \frac{m m}{22}}^{\frac{m m}{22}} {Σ Σ}_{j j = = - - \frac{m m}{22}}^{\frac{m m}{22}} | | {I I}_{right right} [[x x + + i i]] [[y the y + + j j]] - - {I I}_{left left} [[x x + + i i + + d d]] [[y the y + + j j]] | |$

其中d_min和d_max分别为最小和最大视差值。m为模板窗口的大小。I_left和I_right分别为左图和右图。where d _min and d _max are the minimum and maximum disparity values, respectively. m is the size of the template window. I _left and I _right are the left picture and the right picture respectively.

特征点对建立后，我们可以对场景中的任意一点A计算其视差，公式如下：D(A)＝x(A_left)-x(A_right)。其中x(A_left)和x(A_right)分别为点A在左图和右图中像点的x坐标。After the feature point pair is established, we can calculate its parallax for any point A in the scene, the formula is as follows: D(A)=x(A _left )-x(A _right ). Where x(A _left ) and x(A _right ) are the x-coordinates of the image points of point A in the left and right images respectively.

最后我们通过下式对场景中的点进行三维重建：Finally, we perform 3D reconstruction of the points in the scene by the following formula:

$\{\begin{matrix} Xc Xc = = \frac{b b \times \times ((U u - - {U u}_{00}))}{d d} \\ Yc Yc = = \frac{b b \times \times ((V V - - {V V}_{00}))}{d d} \\ Zc Zc = = \frac{b b \times \times f f}{d d} \end{matrix}$

其中U₀、V₀为图像中心坐标，f为焦距，b为摄像机基线长度，d为场景点的视差，Xc，Yc，Zc为场景点在摄像机坐标系下的三维坐标。在获取场景的深度信息之后，再取双目图像中的任一目图像作为单目图像，对其首先进行边缘检测，对边缘检测结果实施改进的哈夫变换以检测类圆形物体的存在，主要原理为：圆心计算公式： $\{\begin{matrix} x_{c} = x_{i} - r \cos θ \\ y_{c} = y_{i} - r \sin θ \end{matrix} - - - (1),$ Where U ₀ and V ₀ are the image center coordinates, f is the focal length, b is the camera baseline length, d is the parallax of the scene point, Xc, Yc, Zc are the three-dimensional coordinates of the scene point in the camera coordinate system. After obtaining the depth information of the scene, take any one of the binocular images as a monocular image, first perform edge detection on it, and implement an improved Hough transform on the edge detection results to detect the existence of circular objects, mainly The principle is: the formula for calculating the center of a circle: $\{\begin{matrix} x_{c} = x_{i} - r \cos θ \\ {the y}_{c} = {the y}_{i} - r \sin θ \end{matrix} - - - (1),$

其中，(x_i，y_i)为边缘点在单目图像中的坐标值，(x_c，y_c)为对应半径r的圆心坐标值，θ＝arctg(g_y/g_x)，g_y，g_x分别是两个方向的梯度，在边缘检测时可以预先得到。这样，给定人头部半径范围，在每个半径下对所有检测区域内的边缘点按(1)式计算其对应的圆心坐标，就可以得到一张所有半径下圆心位置的映射表，在这张表中按一定的聚类规则进行聚类运算，再给定一个落到同一个圆上点数的阈值就可以得到所有类圆物体的圆心位置和半径了。当然，这其中会存在很多虚假圆，包括一些相交圆、相容圆的情况，还要针对这些情况作进一步的处理，以尽可能的消除虚假圆，使这一步得到的结果尽可能都是人头部。对于哈夫变换得到的结果，有可能出现曲线不闭合、边缘信息不完整等近似圆程度不定的情况，这种不定量的情况需要引入模糊概念进行处理。在这里引入两个置信度来衡量类圆物体近似圆的程度，其隶属函数分别为：Among them, (x _i , y _i ) is the coordinate value of the edge point in the monocular image, (x _c , y _c ) is the coordinate value of the center of the circle corresponding to the radius r, θ=arctg(g _y /g _x ), g _y , g _x are the gradients in two directions, which can be obtained in advance during edge detection. In this way, given the radius range of the human head, according to formula (1) to calculate the corresponding center coordinates of all the edge points in the detection area under each radius, a mapping table of the center positions under all radii can be obtained. In this table, the clustering operation is performed according to certain clustering rules, and a threshold value of the number of points falling on the same circle can be given to obtain the center position and radius of all circle-like objects. Of course, there will be many false circles, including some cases of intersecting circles and compatible circles, and further processing should be done on these cases to eliminate false circles as much as possible, so that the results obtained in this step are as artificial as possible. head. For the results obtained by the Hough transform, there may be cases where the degree of approximation to the circle is uncertain, such as the curve is not closed, the edge information is incomplete, and such uncertain cases need to be dealt with by introducing fuzzy concepts. Two confidence levels are introduced here to measure the degree to which a circle-like object approximates a circle, and their membership functions are:

1)μ₁＝T/2πr；2)μ₂＝A/πr² 1) μ ₁ =T/2πr; 2) μ ₂ =A/πr ²

其中T为该类圆哈夫变换得到的点数，可以理解为弧长，则当弧长等于圆周长，即T＝2πr时，隶属度为1，其余情况下，0＜μ₁＜1。A为该类圆中，哈夫变换得到的点依次连接所得到的圆内接多边形的面积，则极限情况下，A＝πr²，隶属度为1，其余情况下，0＜μ₂＜1。可以看到，μ₁和μ₂可以很好的描述曲线不闭合，边缘信息不完整情况下的类圆物体近似圆的程度，将这两个置信度分别或以一定的线性组合(系数可由神经网络或支撑向量机经训练得到)作用于哈夫变换的结果，就可以按照置信度的大小来消除、合并虚假圆，使得输出结果尽可能为真实的人头部。Where T is the number of points obtained by the Hough transform of this type of circle, which can be understood as the arc length, then when the arc length is equal to the circumference of the circle, that is, T=2πr, the degree of membership is 1, and in other cases, 0<μ ₁ <1. A is the area of the inscribed polygon obtained by sequentially connecting the points obtained by the Hough transformation in this type of circle, then in the limit case, A=πr ² , the degree of membership is 1, and in other cases, 0<μ ₂ <1 . It can be seen that μ ₁ and μ ₂ can well describe the degree to which a circle-like object approximates a circle under the condition that the curve is not closed and the edge information is incomplete, and these two confidence levels can be combined separately or in a certain linear combination (the coefficient can be determined by the neural network Network or support vector machine trained) acts on the result of Hough transformation, and the false circles can be eliminated and merged according to the size of the confidence, so that the output result is as real as possible.

如图4所示，跟踪模块的算法流程图，每个人在进出视场的过程中每次出现时由检测部分提取的特征向量构成了一个特征向量的序列。每个序列除了需要记录每次出现时的特征向量外，还要通过卡尔曼预测器预测下一帧中该序列特征向量的预测值。这样，所有出现在视场中的人每个人都有一个跟踪序列，所有人的跟踪序列组成了跟踪序列组或称为跟踪序列矩阵。跟踪模块的任务就是从前一帧的跟踪序列矩阵中提取出与当前帧的检测向量的向量距离最近的跟踪序列，如果该最近向量距离低于所设定的阈值，则匹配成功，将该检测向量加入到相应的跟踪序列中，否则，匹配失败，此时如果检测向量的模糊置信度高于给定的阈值说明一个此前未出现的人进入视场，则与此人相对应的新的跟踪序列产生，而该检测向量即为新的跟踪序列中的第一个向量值。值得一提的是，这里采用了向量距离，而不是通常意义上的几何距离。其原因是由于人上下公交车，尤其是上下台阶时运动很不规则，如果只采用几何距离的话，跟踪出错的几率很大，而采用向量距离，由于兼顾了头部大小与灰度信息，则跟踪出错的几率大大降低。当一个人在视场中消失(可以定义为两帧或多帧未匹配)时，其对应的跟踪序列任务结束，将其中记录的所有向量值取出作为判断其上下车的依据。As shown in Figure 4, the algorithm flow chart of the tracking module, the feature vectors extracted by the detection part each time each person appears in the process of entering and leaving the field of view constitute a sequence of feature vectors. In addition to recording the feature vector of each occurrence, each sequence also needs to predict the predicted value of the sequence feature vector in the next frame through the Kalman predictor. In this way, all the people appearing in the field of view each have a tracking sequence, and the tracking sequences of all people form a tracking sequence group or a tracking sequence matrix. The task of the tracking module is to extract the tracking sequence with the closest vector distance to the detection vector of the current frame from the tracking sequence matrix of the previous frame. If the distance of the closest vector is lower than the set threshold, the matching is successful, and the detection vector Add to the corresponding tracking sequence, otherwise, the matching fails. At this time, if the fuzzy confidence of the detection vector is higher than the given threshold, it means that a person who has not appeared before enters the field of view, and the new tracking sequence corresponding to this person Generated, and the detection vector is the first vector value in the new tracking sequence. It is worth mentioning that the vector distance is used here instead of the geometric distance in the usual sense. The reason is that when people get on and off the bus, especially when going up and down the steps, the movement is very irregular. If only the geometric distance is used, the probability of tracking errors is very high. However, if the vector distance is used, due to the consideration of the size of the head and the grayscale information, then The chances of tracking errors are greatly reduced. When a person disappears in the field of view (can be defined as two or more frames do not match), the corresponding tracking sequence task ends, and all the vector values recorded in it are taken out as the basis for judging whether he gets on or off the car.

Claims

1. public traffice passenger flow statistical method based on stereoscopic vision, it is characterized in that, processor carries out stereopsis to the binocular image that stereo vision apparatus obtains, obtain the distance between the video camera of each point in the described stereo vision apparatus in the scene, threshold value is set on distance then, obtain apart from the scene in a certain distance range of video camera have a few, by denoising to these points, fit, again in conjunction with the characteristic recognition method of monocular image, the set of the scene mid point that those proximate compositions are round is as people's head, thereby realized head part's detection, the result's that the head part is detected position again, radius, half-tone information is transferred to track algorithm and is implemented effectively to follow the tracks of, just can judge the direction of motion of passenger flow, thereby finish the passenger flow information statistics.

2. a kind of public traffice passenger flow statistical method according to claim 1 based on stereoscopic vision, be primarily characterized in that, described stereopsis: refer to utilize two slightly the video camera of offset pass through triangulo operation, obtain the depth information of scene, the stereopsis method step is as follows: 1) set up the corresponding relation between the image characteristic point between different images, 2) relativity shift between the computed image characteristic point position, the 3) three dimensional local information by unique point in the known camera parameters calculating chart.

3. a kind of public traffice passenger flow statistical method according to claim 1 based on stereoscopic vision, be primarily characterized in that, the characteristic recognition method of described monocular image: with the shape of human body head as feature, get arbitrary order image in the binocular image as monocular image, it is carried out edge of image detects, on edge detection results, implement improved hough transform, in the result of hough transform, carry out fuzzy clustering, thereby all similar round targets are as the testing result of human body head in the acquisition scene.

4. bus passenger flow statistical system based on stereoscopic vision, it is characterized in that, processor respectively with control circuit, network transmission module, protocol conversion module, storer joins, control circuit respectively with the binocular image capture module, network transmission module joins, network transmission module and network interface join, storer and binocular image capture module join, described binocular image capture module: sequential control circuit joins with video camera 1 and video camera 2 respectively, processor is a flush bonding processor, processor carries out stereopsis to the binocular image that stereo vision apparatus obtains, obtain the distance between the video camera of each point in the described stereo vision apparatus in the scene, threshold value is set on distance then, obtain apart from the scene in a certain distance range of video camera have a few, by denoising to these points, fit, again in conjunction with the characteristic recognition method of monocular image, the set of the scene mid point that those proximate compositions are round is as people's head, thereby realized head part's detection, the result's that the head part is detected position again, radius, half-tone information is transferred to track algorithm and is implemented effectively to follow the tracks of, just can judge the direction of motion of passenger flow, thereby finish the passenger flow information statistics.