CN116434160A - Expressway casting object detection method and device based on background model and tracking
- Publication number
- CN116434160A (application number CN202310417221.8A)
- Authority
- CN
- China
- Prior art keywords
- frame
- image
- detection
- background
- tracking
- Prior art date
- Legal status (the status listed is an assumption, not a legal conclusion)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
- G06V20/54—Surveillance or monitoring of activities, e.g. for recognising suspicious objects of traffic, e.g. cars on the road, trains or boats
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
Abstract
Description
Technical Field
The present invention belongs to the technical field of moving-object detection, and in particular relates to a method for detecting objects spilled or thrown onto expressways based on a background model and tracking.
Background Art
With economic development, passenger and freight transport on expressways has grown rapidly, and driving safety on expressways has received increasing attention; spilled objects are one of the major causes of expressway accidents. Detection of spilled objects has generally relied on manual visual observation, which is inefficient and hard to sustain. To improve detection efficiency, the prior art also performs efficient around-the-clock (7x24) detection from expressway camera video to address this problem. However, because spilled objects come in many varieties and share no uniform appearance features, most existing spilled-object detection algorithms first detect the moving objects in the video and then identify the spilled objects among them according to their characteristic motion pattern of first moving and then coming to rest. Although these algorithms achieve a certain detection effect on spilled objects, they are limited by the quality of the moving-object detection and target tracking within them, so there is still room to improve their detection accuracy. In addition, these algorithms are not robust to people, vehicles and off-road distractors in expressway scenes, and are easily affected, producing false alarms.
Summary of the Invention
The main purpose of the present invention is to overcome the shortcomings and deficiencies of the prior art by providing a method and device for detecting spilled objects on expressways based on a background model and tracking, which solves the problem of low recognition accuracy when the features of the training samples are indistinct and easily confused.
In order to achieve the above object, the present invention adopts the following technical solutions:
In a first aspect, the present invention provides a method for detecting spilled objects on expressways based on a background model and tracking, comprising the following steps:
acquiring a video stream to be detected and decoding it to obtain the current video frame image, the current video frame image comprising a background image and moving objects;
background modeling and moving-object detection: using a pre-established Gaussian mixture model (GMM) to model the background with K Gaussian distributions for each pixel of the background image, partitioning out the background image, and matching the pixels of the current video frame against the background image so as to detect the moving objects in the frame image; according to the result of moving-object detection, the GMM generates a binary image of the same size as the frame image;
processing the foreground-detection binary image with mathematical morphology to remove noise from the binary image;
tracking each moving object with a tracking algorithm based on IOU and Hungarian matching, associating the detection boxes of the same moving object in different frames of the video according to the IOU values between the detection boxes of the current frame and those of historical frames, thereby completing the tracking of the objects;
fusing a perceptual loss function, binarizing the perceptual maps of the whole image and of the image inside the candidate box with Otsu thresholds, performing an AND operation on them to obtain candidate spilled-object targets, and tracking the candidate targets;
using a binary classification network to exclude road-surface interference and improve the accuracy of spilled-object detection.
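By way of illustration only, the following minimal Python sketch shows how the steps listed above could be chained; it is not part of the claimed subject matter. The `tracker`, `perceptual_filter` and `is_non_road` callables stand in for the components detailed in the later sections, and OpenCV's built-in mixture-of-Gaussians subtractor is used as a stand-in for the GMM described below.
```python
import cv2

def boxes_from_mask(mask, min_area=100):
    """Bounding boxes (x, y, w, h) of the connected foreground regions of a 0/255 mask."""
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    return [cv2.boundingRect(c) for c in contours if cv2.contourArea(c) >= min_area]

def detect_spilled_objects(video_path, tracker, perceptual_filter, is_non_road):
    """Hypothetical driver loop; tracker, perceptual_filter and is_non_road are
    callables standing in for the components described in the later sections."""
    gmm = cv2.createBackgroundSubtractorMOG2(detectShadows=False)  # stand-in for the per-pixel K-Gaussian model
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
    cap = cv2.VideoCapture(video_path)
    spilled = []
    while True:
        ok, frame = cap.read()                                # decode the current video frame image
        if not ok:
            break
        fg = gmm.apply(frame)                                 # background modeling -> binary foreground image
        fg = cv2.morphologyEx(fg, cv2.MORPH_OPEN, kernel)     # morphological noise removal
        tracks = tracker(boxes_from_mask(fg))                 # IOU + Hungarian association
        for box in perceptual_filter(frame, tracks):          # Otsu-thresholded perceptual-map candidates
            if is_non_road(frame, box):                       # binary network rejects road-surface patches
                spilled.append(box)
    cap.release()
    return spilled
```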
As a preferred technical solution, the background modeling and moving-object detection are specifically:
parameter initialization: for the first frame image of the input video stream to be detected, the value of each pixel of this first frame is taken as the mean μ1,1 of the first Gaussian distribution of the corresponding pixel in the background image, and a relatively large weight ω1,1 and a variance Σ1,1 are set for this Gaussian distribution; for the remaining K-1 Gaussian distributions of each pixel, their means are set to 0, their variances are initialized to a relatively large value, and their weights are all set to (1-ω1,1)/(K-1);
background division: let the priority of each of the K Gaussian distributions of a pixel at the t-th frame of the video stream to be detected be pi,t, computed as pi,t = ωi,t/|σi,t|; a larger pi,t means a larger weight of that Gaussian distribution, making it more suitable for describing the background content; on the other hand, a smaller variance also gives a larger pi,t, which means the pixel value has changed little over the historical frames, so the pixel is more likely to belong to the background at frame t;
moving-object detection: after background division, the GMM compares each pixel of the input frame image with the first K0 Gaussian distributions to determine whether the pixel belongs to a moving object; if a pixel of the frame image matches one of the first K0 Gaussian distributions of the corresponding pixel of the background image, the pixel is considered a background pixel; otherwise, the pixel is judged to be a moving-object pixel; according to the result of moving-object detection, the GMM generates a binary image of the same size as the frame image, in which pixels belonging to moving objects take the value 255 and appear as white regions, and pixels belonging to the background take the value 0 and appear as black regions, yielding the binary map;
GMM parameter update: to enable the background model to keep up with background changes in the video, the parameters of the GMM are updated; if a pixel of the frame image successfully matches the i-th Gaussian distribution of the corresponding pixel of the background image, the parameters of that Gaussian distribution are updated; if the pixel matches none of the corresponding K Gaussian distributions, it is considered to belong to a newly appearing moving object in the frame image, and a new Gaussian distribution is established for the corresponding pixel of the background image.
As a preferred technical solution, in the background-division step, the first K0 Gaussian distributions after sorting are taken and regarded as describing the background content of the video, letting K0 = argmin_b { Σi=1..b ωi,t > D }, where D is a preset threshold.
As a preferred technical solution, processing the foreground-detection binary image with mathematical morphology to remove noise from the binary image is specifically:
to deal with possible slight illumination changes and noise produced by the recording equipment during video capture, the binary image A to be processed is cleaned morphologically: an opening operation and a closing operation are applied with a probe B1, a 3x3 square structuring element, followed by a dilation with a probe B2, yielding the processed result A';
to deal with interference with spilled-object detection from moving vehicles and, in a few special cases, pedestrians who have got out of a vehicle, YOLOx detection is used: every set number of frames one frame image is extracted and input into a pre-established YOLO network model for person and vehicle detection, obtaining the specific positions of pedestrians and vehicles in the image; from this position information a person-vehicle detection binary image is generated; the center point of each moving-object detection box of the current frame image is then taken, and if that center point lies in the black region of the person-vehicle detection binary image, the detection box is marked in the algorithm as a person-vehicle detection box; in the subsequent spilled-object detection steps, such marked detection boxes will not be judged to be spilled objects;
to deal with distractors that may exist in the off-road area of the expressway, the road area obtained by semantic segmentation is used to exclude them: the output part of the semantic segmentation network model is modified so that in the segmentation result image the pixel values of the regions classified as road become [255,255,255] and the pixel values of the remaining regions become [0,0,0], and the image is converted into a single-channel grayscale image, thereby obtaining a road-segmentation binary image; in the road-segmentation binary image, the white regions are road areas and the black regions are non-road areas.
As a preferred technical solution, tracking each moving object with the tracking algorithm based on IOU and Hungarian matching, and associating the detection boxes of the same moving object in different frames of the video according to the IOU values between the detection boxes of the current frame and those of historical frames so as to complete the tracking, is specifically:
the moving-object detection boxes of the current video frame image are matched against the moving-object detection boxes of historical frames; let Sc be the set of moving-object detection boxes in the current frame image, containing m detection boxes, and Sh the set of moving-object detection boxes of the historical frame images, containing n detection boxes; a cost matrix C is constructed in which the element c(i,j) is the negative of the IOU value between the i-th detection box in Sc and the j-th detection box in Sh, where 1≤i≤m and 1≤j≤n; if n≠m, the vacant columns or rows of the cost matrix are padded with 0; the detection boxes of Sc and Sh are matched with the goal of minimizing the total matching cost, i.e., minimizing the sum of the negated IOU values, which is equivalent to maximizing the sum of the IOU values; the Hungarian matching algorithm is then used to obtain the optimal solution under this cost matrix; when the frame rate is high, the detection boxes of a moving object overlap substantially in adjacent frame images, so the optimal solution that maximizes the sum of IOU values associates each moving object correctly, thereby completing the tracking of the objects.
As a preferred technical solution, fusing the perceptual loss function, binarizing the perceptual maps of the whole image and of the image inside the candidate box with Otsu thresholds, performing an AND operation on them to obtain candidate spilled-object targets, and tracking the candidate targets so as to exclude interference from the road-surface region, is specifically:
for the current video frame, the edge contours of two consecutive stationary frames are tracked and matched, giving a long background image and a short background image; the perceptual map within a region four times the size of the candidate box is then binarized with Otsu's threshold, after which the perceptual map of the whole image is binarized with Otsu's threshold and processed morphologically; an AND operation is performed between the perceptual map inside the candidate box and the perceptual map of the whole image, rectangular boxes are extracted from the resulting map, and the IOU between the overall box and the map box is compared with a set threshold: if it is smaller than the set threshold the candidate is discarded, and if it is greater than the set threshold the new spilled object is added to the spilled-object box array.
As a preferred technical solution, the binary classification network adopts ResNet-34, specifically:
the collected samples are input into the ResNet-34 model for training; the two classes are road surface and non-road-surface; the non-road-surface samples among the selected training samples include boxes, tires, traffic cones, bottles, bags and paper, and the road-surface samples include road surface with lane lines, rainy-day water-marked road surface, reflective road surface, waterlogged road surface and shadowed road surface.
In a second aspect, the present invention provides a system for detecting spilled objects on expressways based on a background model and tracking, applied to the above method for detecting spilled objects on expressways based on a background model and tracking, and comprising a data acquisition module, a background modeling and moving-object detection module, a noise processing module, a target tracking module, a spilled-object detection module, and a classification module;
the data acquisition module is configured to acquire a video stream to be detected and decode it to obtain the current video frame image, the current video frame image comprising a background image and moving objects;
the background modeling and moving-object detection module is configured for background modeling and moving-object detection, using a pre-established Gaussian mixture model (GMM) to model the background with K Gaussian distributions for each pixel of the background image, partitioning out the background image, and matching the pixels of the current video frame against the background image so as to detect the moving objects in the frame image, with the GMM generating, according to the result of moving-object detection, a binary image of the same size as the frame image;
the noise processing module is configured to process the foreground-detection binary image with mathematical morphology to remove noise from the binary image;
the target tracking module is configured to track each moving object with the tracking algorithm based on IOU and Hungarian matching, associating the detection boxes of the same moving object in different frames of the video according to the IOU values between the detection boxes of the current frame and those of historical frames, thereby completing the tracking of the objects;
the spilled-object detection module is configured to fuse the perceptual loss function, binarize the perceptual maps of the whole image and of the image inside the candidate box with Otsu thresholds, perform an AND operation on them to obtain candidate spilled-object targets, and track the candidate targets;
the classification module is configured to use the binary classification network to exclude road-surface interference and improve the accuracy of spilled-object detection.
In a third aspect, the present invention provides an electronic device, the electronic device comprising:
at least one processor; and
a memory communicatively connected to the at least one processor; wherein
the memory stores computer program instructions executable by the at least one processor, and the computer program instructions are executed by the at least one processor to enable the at least one processor to perform the above method for detecting spilled objects on expressways based on a background model and tracking.
In a fourth aspect, the present invention provides a computer-readable storage medium storing a program which, when executed by a processor, implements the above method for detecting spilled objects on expressways based on a background model and tracking.
Compared with the prior art, the present invention has the following advantages and beneficial effects:
Compared with existing methods, the present invention provides a method for detecting spilled objects on expressways based on a background model and tracking, which detects spilled objects according to their characteristic motion pattern of first moving and then coming to rest: a Gaussian mixture model is first used to detect the moving objects in the video frame images, the tracking algorithm based on IOU and Hungarian matching is then used to obtain the motion-state information of the moving objects, and finally the spilled objects among the moving objects are detected according to their motion characteristics. In addition, the present invention uses a YOLO network model and a semantic segmentation network model to remove distractors from people, vehicles and areas outside the road, uses a threshold binarization method to AND the perceptual maps of the candidate box and the original image, and uses a binary classification network to exclude road-surface interference, which effectively improves the accuracy of the algorithm in detecting spilled objects. The method can effectively detect many types of spilled objects in expressway scenes and is highly robust to environmental noise.
Brief Description of the Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present application, and those of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is a flowchart of the method for detecting spilled objects on expressways based on a background model and tracking according to an embodiment of the present invention;
FIG. 2 is a flowchart of background modeling and moving-object detection according to an embodiment of the present invention;
FIG. 3 is a flowchart of perceptual-map difference comparison according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of the system for detecting spilled objects on expressways based on a background model and tracking according to an embodiment of the present invention;
FIG. 5 is a structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description of the Embodiments
In order to enable those skilled in the art to better understand the solution of the present application, the technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the drawings of the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. Based on the embodiments of the present application, all other embodiments obtained by those skilled in the art without creative effort fall within the scope of protection of the present application.
Reference to an "embodiment" in the present application means that a particular feature, structure or characteristic described in connection with the embodiment can be included in at least one embodiment of the present application. The appearance of this phrase in various places in the specification does not necessarily refer to the same embodiment, nor to an independent or alternative embodiment mutually exclusive with other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described in the present application can be combined with other embodiments.
In the method for detecting spilled objects on expressways based on a background model and tracking provided by the present invention, the moving objects in the video frame images are obtained through the background model, the moving objects are then tracked with the tracking algorithm based on IOU and Hungarian matching and their motion-state information is analyzed, and the spilled objects are identified among them by their motion characteristics; the background subtraction method is chosen herein for moving-object detection.
Referring to FIG. 1, the method for detecting spilled objects on expressways based on a background model and tracking of this embodiment comprises six parts: data acquisition, background modeling and moving-object detection, noise processing and distractor exclusion, moving-object tracking and spilled-object detection, perceptual-map difference comparison, and binary classification recognition, as follows:
S1. A video stream to be detected is acquired and decoded to obtain the current video frame image, the current video frame image including a background image and moving objects.
Exemplarily, the video stream to be detected may be a video stream extracted from traffic surveillance video; it is necessary to detect from the surveillance video stream whether there are spilled objects and, if there are, to perform target tracking on them.
S2. Background modeling and moving-object detection;
To detect spilled objects on an expressway, the moving objects must first be detected in the video, and the spilled objects among them are then selected according to their motion characteristics. In this embodiment a Gaussian mixture model (GMM) is used for background modeling and moving-object detection. Referring to FIG. 2, the specific process of background modeling and moving-object detection is as follows:
S2.1. The GMM models the background using K Gaussian distributions for each pixel of the background image. K generally takes a value between 3 and 5; the larger K is, the more comprehensive and the more robust the description of each pixel by the background model. Each Gaussian distribution describes a range of values of its corresponding pixel in the background image, and multiple Gaussian distributions can represent complex situations in the background more accurately.
S2.2. Parameter initialization: for the first frame image of the input video, the value of each pixel of this image is taken as the mean μ1,1 of the first Gaussian distribution of the corresponding pixel in the background image, and a relatively large weight ω1,1 (between 0.5 and 0.9) and a variance Σ1,1 are set for this Gaussian distribution. For the remaining K-1 Gaussian distributions of each pixel, their means are set to 0, their variances are initialized to a relatively large value, and their weights are all set to (1-ω1,1)/(K-1).
S2.3. Background division: let the priority of each of the K Gaussian distributions of a pixel at the t-th frame of the surveillance video be pi,t, computed as pi,t = ωi,t/|σi,t|; a larger pi,t means a larger weight of that Gaussian distribution, making it more suitable for describing the background content of the video; on the other hand, a smaller variance also gives a larger pi,t, which means the pixel value has changed little over the historical frames, so the pixel is more likely to belong to the background. In this embodiment the first K0 Gaussian distributions after sorting are taken and regarded as describing the background content of the video, letting K0 = argmin_b { Σi=1..b ωi,t > D } (D is the established threshold, set to 0.8).
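By way of illustration, a minimal NumPy sketch of the background-division rule above, assuming the per-pixel weights and standard deviations are already maintained as arrays of shape (H, W, K) (the array layout is an assumption, not prescribed by this document):
```python
import numpy as np

def background_distributions(w, sigma, D=0.8):
    """w, sigma: arrays of shape (H, W, K) holding the weight and standard deviation
    of the K Gaussians of every pixel. Returns a boolean mask of shape (H, W, K)
    marking the distributions regarded as describing the background."""
    priority = w / np.abs(sigma)                     # p_{i,t} = w_{i,t} / |sigma_{i,t}|
    order = np.argsort(-priority, axis=-1)           # sort the Gaussians by decreasing priority
    w_sorted = np.take_along_axis(w, order, axis=-1)
    cum = np.cumsum(w_sorted, axis=-1)
    # keep the first K0 distributions, i.e. those whose preceding cumulative weight
    # has not yet exceeded the threshold D
    keep_sorted = (cum - w_sorted) <= D
    background = np.zeros_like(keep_sorted)
    np.put_along_axis(background, order, keep_sorted, axis=-1)   # scatter back to original order
    return background
```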
S2.4. Moving-object detection: after background division, the GMM compares each pixel of the input frame image with the first K0 Gaussian distributions to determine whether the pixel belongs to a moving object; if a pixel of the frame image matches one of the first K0 Gaussian distributions of the corresponding pixel of the background image, the pixel is considered a background pixel; otherwise, the pixel is judged to be a moving-object pixel.
Further, according to the result of moving-object detection, the GMM generates a binary image of the same size as the frame image; pixels belonging to moving objects take the value 255 and appear as white regions, and pixels belonging to the background take the value 0 and appear as black regions, yielding the binary map.
S2.5. GMM parameter update: to enable the background model to keep up with background changes in the video, the parameters of the GMM are updated; if a pixel of the frame image successfully matches the i-th Gaussian distribution of the corresponding pixel of the background image, the parameters of that Gaussian distribution are updated; if the pixel matches none of the corresponding K Gaussian distributions, it is considered to belong to a newly appearing moving object in the frame image, and a new Gaussian distribution is established for the corresponding pixel of the background image.
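For reference, the per-pixel model of S2.1 to S2.5 follows the classical mixture-of-Gaussians background subtraction scheme, which is also available as a built-in OpenCV component. The sketch below uses that component as a practical stand-in; the history, variance-threshold and mixture-count values are illustrative and are not taken from this document:
```python
import cv2

# Mixture-of-Gaussians background subtractor as a stand-in for steps S2.1-S2.5.
# history controls how quickly the model adapts, varThreshold plays the role of
# the match test against the background Gaussians, and setNMixtures sets K.
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16,
                                                detectShadows=False)
subtractor.setNMixtures(5)                     # K Gaussians per pixel (illustrative value)

cap = cv2.VideoCapture("expressway.mp4")       # hypothetical input video
while True:
    ok, frame = cap.read()
    if not ok:
        break
    fg_mask = subtractor.apply(frame)          # 255 = moving-object pixel, 0 = background pixel
cap.release()
```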
S3. Noise processing and distractor exclusion, as follows:
S3.1. The foreground-detection binary image is processed with mathematical morphology to remove noise from the binary image. This step mainly addresses possible slight illumination changes and noise produced by the recording equipment during video capture: the binary image A to be processed is cleaned with an opening operation and a closing operation using a probe B1, a 3x3 square structuring element, followed by a dilation using a probe B2, a 5x5 square structuring element, yielding the processed result A'.
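A minimal OpenCV sketch of the morphological clean-up of S3.1; the order opening, then closing, then dilation is an assumption, since the original formula is reproduced here only in words:
```python
import cv2

def clean_foreground(mask):
    """Morphological clean-up of the 0/255 foreground mask A. B1 is the 3x3 square
    probe used for opening and closing, B2 the 5x5 square probe used for dilation;
    applying opening, then closing, then dilation is an assumed order."""
    B1 = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
    B2 = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))
    a = cv2.morphologyEx(mask, cv2.MORPH_OPEN, B1)   # remove isolated noise pixels
    a = cv2.morphologyEx(a, cv2.MORPH_CLOSE, B1)     # fill small holes inside foreground blobs
    a = cv2.dilate(a, B2)                            # slightly enlarge the foreground blobs
    return a
```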
S3.2. YOLOx detection is used to handle interference with spilled-object detection from moving vehicles and, in a few special cases, pedestrians who have got out of a vehicle. One frame image is extracted every 25 frames and input into the YOLO network model for person and vehicle detection, which gives the specific positions of pedestrians and vehicles in the image; from this position information a person-vehicle detection binary image is generated. The center point of each moving-object detection box of the current frame image is taken; if the center point lies in the black region of the person-vehicle detection binary image, the detection box is marked in the algorithm as a person-vehicle detection box, and in the subsequent spilled-object detection steps these marked detection boxes will not be judged to be spilled objects.
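A minimal sketch of the box-filtering logic of S3.2, assuming the pedestrian and vehicle boxes have already been obtained from the YOLOx detector (the detector call itself is omitted); painting the detected person/vehicle regions black in the binary image is an assumption consistent with the convention stated above:
```python
import numpy as np

def person_vehicle_mask(image_shape, person_vehicle_boxes):
    """Build the person-vehicle detection binary image: the image is white (255)
    everywhere and the detected person/vehicle boxes (x1, y1, x2, y2) are painted
    black (0), matching the convention used in S3.2."""
    mask = np.full(image_shape[:2], 255, dtype=np.uint8)
    for (x1, y1, x2, y2) in person_vehicle_boxes:
        mask[y1:y2, x1:x2] = 0
    return mask

def mark_person_vehicle_boxes(motion_boxes, mask):
    """Return the moving-object boxes (x, y, w, h) whose center falls in a black
    region of the mask; these are flagged and excluded from spilled-object judgment."""
    flagged = []
    for (x, y, w, h) in motion_boxes:
        cx, cy = x + w // 2, y + h // 2
        if mask[cy, cx] == 0:
            flagged.append((x, y, w, h))
    return flagged
```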
S3.3. For distractors that may exist in the off-road area of the expressway, the road area obtained by semantic segmentation is used to exclude them. The output part of the semantic segmentation network model is modified so that in the segmentation result image the pixel values of the regions classified as road become [255,255,255] and the pixel values of the remaining regions become [0,0,0], and the image is converted into a single-channel grayscale image, thereby obtaining a road-segmentation binary image. In the road-segmentation binary image, the white regions are road areas and the black regions are non-road areas.
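A minimal sketch of the output modification of S3.3, assuming the segmentation network yields a per-pixel class-index map and that `road_class_id` is the (hypothetical) index of the road class:
```python
import numpy as np

def road_binary_image(class_map, road_class_id=0):
    """class_map: (H, W) array of per-pixel class indices output by the semantic
    segmentation model. Returns a single-channel image in which road pixels are
    255 (white) and all other pixels are 0 (black)."""
    return np.where(class_map == road_class_id, 255, 0).astype(np.uint8)
```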
S4. Moving-object tracking;
Each moving object is tracked with a tracking algorithm based on IOU and Hungarian matching; based on the IOU values between the detection boxes of the current frame and those of historical frames, the detection boxes of the same moving object in different frames of the video are associated, thereby completing their tracking.
Further, the specific process of moving-object tracking is as follows:
Suppose the moving-object detection boxes of the current video frame image are matched against the moving-object detection boxes of historical frames; here the historical frames are the 5 frames preceding the current frame. Let Sc be the set of moving-object detection boxes in the current frame image, containing m detection boxes, and Sh the set of moving-object detection boxes of the historical frame images, containing n detection boxes. A cost matrix C is constructed in which the element c(i,j) is the negative of the IOU value between the i-th detection box in Sc and the j-th detection box in Sh, where 1≤i≤m and 1≤j≤n. If n≠m, the vacant columns or rows of the cost matrix are padded with 0. The detection boxes of Sc and Sh are matched with the goal of minimizing the total matching cost, i.e., minimizing the sum of the negated IOU values, which is equivalent to maximizing the sum of the IOU values. The Hungarian matching algorithm is then used to obtain the optimal solution under this cost matrix. When the frame rate is high, the detection boxes of a moving object overlap substantially in adjacent frame images, so the optimal solution that maximizes the sum of IOU values associates each moving object correctly, thereby completing their tracking.
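A minimal sketch of the association step of S4 using SciPy's implementation of the Hungarian algorithm; boxes are assumed to be (x1, y1, x2, y2) tuples:
```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(a, b):
    """IOU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def associate(current_boxes, history_boxes):
    """Match the current-frame boxes Sc (m boxes) to the historical boxes Sh (n boxes).
    The cost c(i, j) is the negative IOU; vacant rows/columns are padded with 0."""
    m, n = len(current_boxes), len(history_boxes)
    size = max(m, n)
    cost = np.zeros((size, size))
    for i, cb in enumerate(current_boxes):
        for j, hb in enumerate(history_boxes):
            cost[i, j] = -iou(cb, hb)
    rows, cols = linear_sum_assignment(cost)          # Hungarian algorithm, minimum total cost
    # keep only real pairs with a positive IOU (padded entries have cost 0)
    return [(i, j) for i, j in zip(rows, cols) if i < m and j < n and cost[i, j] < 0]
```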
S5. Spilled-object detection;
The perceptual loss function is fused: the perceptual maps of the whole image and of the image inside the candidate box are both binarized with Otsu thresholds, an AND operation is performed on them to obtain candidate spilled-object targets, and the candidate targets are tracked, thereby excluding interference from the road-surface region.
Further, referring to FIG. 3, for the current video frame the edge contours of two consecutive stationary frames are tracked and matched, giving a long background image and a short background image; the perceptual map within a region four times the size of the candidate box is then binarized with Otsu's threshold, after which the perceptual map of the whole image is binarized with Otsu's threshold and processed morphologically; an AND operation is performed between the perceptual map inside the candidate box and the perceptual map of the whole image, rectangular boxes are extracted from the resulting map, and the IOU between the overall box and the map box is compared with a set threshold: if it is smaller than the set threshold the candidate is discarded, and if it is greater than the set threshold the new spilled object is added to the spilled-object box array.
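A minimal sketch of the candidate extraction of S5, assuming the perceptual maps of the whole image and of the enlarged candidate region have already been computed elsewhere; the IOU threshold value is illustrative:
```python
import cv2
import numpy as np

def box_iou(a, b):
    """IOU of two boxes given as (x, y, w, h)."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    iw = max(0, min(ax2, bx2) - max(a[0], b[0]))
    ih = max(0, min(ay2, by2) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def otsu_binarize(img):
    """Otsu-threshold an arbitrary single-channel map after scaling it to 8 bits."""
    img8 = cv2.normalize(img, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    _, binary = cv2.threshold(img8, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return binary

def candidate_boxes(full_map, roi_map, roi_xywh, iou_thresh=0.3):
    """full_map: perceptual map of the whole image; roi_map: perceptual map of the
    enlarged candidate region located at roi_xywh (same height/width as the region).
    Returns the rectangles kept after the AND operation and the IOU test."""
    x, y, w, h = roi_xywh
    roi_bin = np.zeros(full_map.shape[:2], dtype=np.uint8)
    roi_bin[y:y + h, x:x + w] = otsu_binarize(roi_map)              # Otsu inside the candidate region
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
    full_bin = cv2.morphologyEx(otsu_binarize(full_map), cv2.MORPH_OPEN, kernel)
    anded = cv2.bitwise_and(roi_bin, full_bin)                      # AND of the two binarized maps
    kept = []
    contours, _ = cv2.findContours(anded, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    for c in contours:
        rect = cv2.boundingRect(c)                                  # rectangle extracted from the map
        if box_iou(roi_xywh, rect) > iou_thresh:                    # IOU(overall box, map box) vs. threshold
            kept.append(rect)
    return kept
```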
S6. Road / non-road binary classification model;
Training uses a network with the ResNet-34 structure.
The collected samples are input into the ResNet-34 model for training. The two classes are road surface and non-road-surface. The non-road-surface samples among the training samples selected herein include boxes, tires, traffic cones, bottles, bags and paper; the road-surface samples include road surface with lane lines, rainy-day water-marked road surface, reflective road surface, waterlogged road surface and shadowed road surface.
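A minimal sketch of the road / non-road classifier of S6 using torchvision's ResNet-34; the input size, normalization constants and optimizer settings are illustrative choices, not values taken from this document:
```python
import torch
import torch.nn as nn
from torchvision import models, transforms

# ResNet-34 with a two-way head: class 0 = road surface, class 1 = non-road surface.
model = models.resnet34(weights=None)                  # torchvision >= 0.13 API
model.fc = nn.Linear(model.fc.in_features, 2)

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),                     # illustrative input size
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

def train_step(images, labels):
    """One illustrative optimization step on a batch of cropped sample patches."""
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()

def is_non_road(pil_patch):
    """True if the patch is classified as non-road surface (a potential spilled object)."""
    model.eval()
    with torch.no_grad():
        x = preprocess(pil_patch).unsqueeze(0)
        return model(x).argmax(dim=1).item() == 1
```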
Spilled objects are detected according to their characteristic motion pattern of first moving and then coming to rest: the Gaussian mixture model is first used to detect the moving objects in the video frame images, the tracking algorithm based on IOU and Hungarian matching is then used to obtain the motion-state information of the moving objects, and finally the spilled objects among the moving objects are detected according to their motion characteristics. In addition, the algorithm uses the YOLO network model and the semantic segmentation network model to remove distractors from people, vehicles and areas outside the road, uses the threshold binarization method to AND the perceptual maps of the candidate box and the original image, and uses the ResNet-34 binary classification network to exclude road-surface interference, which effectively improves the accuracy of the algorithm in detecting spilled objects.
It should be noted that the foregoing method embodiments are all described as a series of combined actions for simplicity of description; however, those skilled in the art should be aware that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in other orders or simultaneously.
Based on the same idea as the method for detecting spilled objects on expressways based on a background model and tracking in the above embodiments, the present invention also provides a system for detecting spilled objects on expressways based on a background model and tracking, which can be used to perform the above method. For ease of description, the schematic structural diagram of the system embodiment shows only the parts related to the embodiment of the present invention; those skilled in the art will understand that the illustrated structure does not limit the device, which may include more or fewer components than shown, or combine certain components, or have a different arrangement of components.
Referring to FIG. 4, in another embodiment of the present application a system 100 for detecting spilled objects on expressways based on a background model and tracking is provided, the system comprising a data acquisition module 101, a background modeling and moving-object detection module 102, a noise processing module 103, a target tracking module 104, a spilled-object detection module 105, and a classification module 106;
the data acquisition module 101 is configured to acquire a video stream to be detected and decode it to obtain the current video frame image, the current video frame image comprising a background image and moving objects;
the background modeling and moving-object detection module 102 is configured for background modeling and moving-object detection, using a pre-established Gaussian mixture model (GMM) to model the background with K Gaussian distributions for each pixel of the background image, partitioning out the background image, and matching the pixels of the current video frame against the background image so as to detect the moving objects in the frame image, with the GMM generating, according to the result of moving-object detection, a binary image of the same size as the frame image;
the noise processing module 103 is configured to process the foreground-detection binary image with mathematical morphology to remove noise from the binary image;
the target tracking module 104 is configured to track each moving object with the tracking algorithm based on IOU and Hungarian matching, associating the detection boxes of the same moving object in different frames of the video according to the IOU values between the detection boxes of the current frame and those of historical frames, thereby completing the tracking of the objects;
the spilled-object detection module 105 is configured to fuse the perceptual loss function, binarize the perceptual maps of the whole image and of the image inside the candidate box with Otsu thresholds, perform an AND operation on them to obtain candidate spilled-object targets, and track the candidate targets;
the classification module 106 is configured to use the binary classification network to exclude road-surface interference and improve the accuracy of spilled-object detection.
It should be noted that the system for detecting spilled objects on expressways based on a background model and tracking of the present invention corresponds one-to-one to the method for detecting spilled objects on expressways based on a background model and tracking of the present invention; the technical features and beneficial effects set forth in the embodiments of the above method are applicable to the embodiments of the system, and for details reference may be made to the description in the method embodiments of the present invention, which will not be repeated here.
In addition, in the implementation of the system of the above embodiment, the logical division into program modules is only an example; in practical applications the above functions may be allocated to different program modules as required, for example according to the configuration requirements of the corresponding hardware or for convenience of software implementation, that is, the internal structure of the system for detecting spilled objects on expressways based on a background model and tracking may be divided into different program modules to complete all or part of the functions described above.
Referring to FIG. 5, in one embodiment an electronic device for implementing the method for detecting spilled objects on expressways based on a background model and tracking is provided; the electronic device 200 may include a first processor 201, a first memory 202 and a bus, and may further include a computer program stored in the first memory 202 and executable on the first processor 201, such as an expressway spilled-object detection program 203.
The first memory 202 includes at least one type of readable storage medium, including flash memory, a removable hard disk, a multimedia card, a card-type memory (for example, SD or DX memory), a magnetic memory, a magnetic disk, an optical disc, and the like. In some embodiments the first memory 202 may be an internal storage unit of the electronic device 200, for example a removable hard disk of the electronic device 200. In other embodiments the first memory 202 may also be an external storage device of the electronic device 200, for example a plug-in removable hard disk, a smart media card (SMC), a secure digital (SD) card or a flash card equipped on the electronic device 200. Further, the first memory 202 may include both an internal storage unit and an external storage device of the electronic device 200. The first memory 202 can be used not only to store application software installed on the electronic device 200 and various data, such as the code of the expressway spilled-object detection program 203, but also to temporarily store data that has been output or is to be output.
In some embodiments the first processor 201 may be composed of integrated circuits, for example a single packaged integrated circuit, or multiple integrated circuits packaged with the same or different functions, including one or more central processing units (CPUs), microprocessors, digital processing chips, graphics processors, combinations of various control chips, and the like. The first processor 201 is the control unit of the electronic device; it connects the components of the whole electronic device through various interfaces and lines, and executes the various functions of the electronic device 200 and processes data by running or executing the programs or modules stored in the first memory 202 and calling the data stored in the first memory 202.
FIG. 5 shows only an electronic device with certain components; those skilled in the art will understand that the structure shown in FIG. 5 does not limit the electronic device 200, which may include fewer or more components than shown, or combine certain components, or have a different arrangement of components.
The expressway spilled-object detection program 203 stored in the first memory 202 of the electronic device 200 is a combination of multiple instructions which, when run on the first processor 201, can implement:
acquiring a video stream to be detected and decoding it to obtain the current video frame image, the current video frame image comprising a background image and moving objects;
background modeling and moving-object detection: using a pre-established Gaussian mixture model (GMM) to model the background with K Gaussian distributions for each pixel of the background image, partitioning out the background image, and matching the pixels of the current video frame against the background image so as to detect the moving objects in the frame image; according to the result of moving-object detection, the GMM generates a binary image of the same size as the frame image;
processing the foreground-detection binary image with mathematical morphology to remove noise from the binary image;
tracking each moving object with a tracking algorithm based on IOU and Hungarian matching, associating the detection boxes of the same moving object in different frames of the video according to the IOU values between the detection boxes of the current frame and those of historical frames, thereby completing the tracking of the objects;
fusing a perceptual loss function, binarizing the perceptual maps of the whole image and of the image inside the candidate box with Otsu thresholds, performing an AND operation on them to obtain candidate spilled-object targets, and tracking the candidate targets;
using a binary classification network to exclude road-surface interference and improve the accuracy of spilled-object detection.
Further, if the modules/units integrated in the electronic device 200 are implemented in the form of software functional units and sold or used as independent products, they may be stored in a non-volatile computer-readable storage medium. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, or a read-only memory (ROM).
Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be realized by instructing the relevant hardware through a computer program; the program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. Any reference to memory, storage, database or other media used in the embodiments provided in the present application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM) or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM) and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For brevity of description, not all possible combinations of the technical features of the above embodiments are described; however, as long as there is no contradiction in the combination of these technical features, they should all be considered to be within the scope of this specification.
The above embodiments are preferred implementations of the present invention, but the implementations of the present invention are not limited by the above embodiments; any other change, modification, substitution, combination or simplification made without departing from the spirit and principle of the present invention shall be an equivalent replacement and shall be included in the scope of protection of the present invention.
Claims (10)
Priority Applications (1)
- CN202310417221.8A (priority date 2023-04-18, filing date 2023-04-18): Expressway casting object detection method and device based on background model and tracking
Applications Claiming Priority (1)
- CN202310417221.8A (priority date 2023-04-18, filing date 2023-04-18): Expressway casting object detection method and device based on background model and tracking
Publications (1)
- CN116434160A, published 2023-07-14
Family ID: 87079431
Family Applications (1)
- CN202310417221.8A (priority date 2023-04-18, filing date 2023-04-18): Expressway casting object detection method and device based on background model and tracking
Country Status (1)
- CN: CN116434160A (en)
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130170557A1 (en) * | 2011-12-29 | 2013-07-04 | Pelco, Inc. | Method and System for Video Coding with Noise Filtering |
CN108564597A (en) * | 2018-03-05 | 2018-09-21 | 华南理工大学 | A kind of video foreground target extraction method of fusion gauss hybrid models and H-S optical flow methods |
CN114612506A (en) * | 2022-02-19 | 2022-06-10 | 西北工业大学 | A simple, high-efficiency and anti-jamming method for high-altitude parabolic trajectory identification and localization |
CN114694060A (en) * | 2022-03-10 | 2022-07-01 | 海信集团控股股份有限公司 | Road shed object detection method, electronic equipment and storage medium |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117315616A (en) * | 2023-09-05 | 2023-12-29 | 武汉极目智能技术有限公司 | Environment sensing method, device, computer equipment and medium |
CN117689881A (en) * | 2024-02-02 | 2024-03-12 | 盛视科技股份有限公司 | Casting object tracking method based on event camera and CMOS camera |
CN117689881B (en) * | 2024-02-02 | 2024-05-28 | 盛视科技股份有限公司 | Casting object tracking method based on event camera and CMOS camera |
CN117746028A (en) * | 2024-02-08 | 2024-03-22 | 暗物智能科技(广州)有限公司 | Visual detection method, device, equipment and medium for unlabeled articles |
CN117746028B (en) * | 2024-02-08 | 2024-06-11 | 暗物智能科技(广州)有限公司 | Visual detection method, device, equipment and medium for unlabeled articles |
CN119741659A (en) * | 2024-12-13 | 2025-04-01 | 中国铁塔股份有限公司四川省分公司 | A road spill identification method, device, medium and equipment |
CN119741659B (en) * | 2024-12-13 | 2025-07-08 | 中国铁塔股份有限公司四川省分公司 | Road throwing identification method, device, medium and equipment |
CN119625663A (en) * | 2025-02-14 | 2025-03-14 | 华南理工大学 | An algorithm for detecting spilled objects on highways at low frame rates based on image segmentation |
CN119625663B (en) * | 2025-02-14 | 2025-05-27 | 华南理工大学 | Image blocking-based algorithm for detecting highway casting object at low frame rate |
Legal Events
- PB01: Publication
- SE01: Entry into force of request for substantive examination