CN108416307B - An aerial image pavement crack detection method, device and equipment

Info

Publication number: CN108416307B
Application number: CN201810205751.5A
Authority: CN
Inventors: 王霞; 王博
Original assignee: Beijing Institute of Technology BIT
Current assignee: Beijing Institute of Technology BIT
Priority date: 2018-03-13
Filing date: 2018-03-13
Publication date: 2020-08-14
Anticipated expiration: 2038-03-13
Also published as: CN108416307A

Abstract

Embodiments of the invention provide a method, device and equipment for detecting pavement cracks in aerial images. The method comprises: extracting deep high-dimensional features of the pavement area of an aerial pavement image, and obtaining a high-dimensional feature map from those features; based on the deep high-dimensional features of the pavement area, screening the high-dimensional feature map for positive and negative samples to distinguish pavement crack targets from the pavement background; and classifying the pavement crack targets and locating their coordinates to obtain classification information and coordinate information for the pavement crack targets. The embodiments can be applied to pavement crack detection in aerial images with high-altitude moving backgrounds and complex scenes. Compared with commonly used crack detection algorithms, they are better suited to images acquired by UAV-borne systems and detect cracks in aerial images more robustly.

Description

An aerial image pavement crack detection method, device and equipment

Technical Field

Embodiments of the present invention relate to the fields of deep learning and pattern recognition, and in particular to a method, device and equipment for detecting pavement cracks in aerial images.

Background

At present, pavement cracks are one of the main forms of highway pavement damage; on expressways in China, the cracks are mainly transverse and longitudinal. If cracks can be found at an early stage and their development tracked, pavement maintenance costs can be greatly reduced and expressway driving safety can be ensured. Regular inspection and maintenance of highway pavement condition is therefore extremely important.

Pavement crack detection began with manual inspection. With the application of image processing technology, vehicle-mounted acquisition devices were combined with image processing and applied to pavement crack detection, which greatly improved detection efficiency. In recent years, unmanned aerial vehicle (UAV) technology has developed rapidly and its applications have multiplied. Compared with other methods, pavement crack detection based on UAV acquisition is fast and efficient, offers a large field of view, and reduces the volume of stored data. Compared with vehicle-mounted images, however, aerial images contain interference from roadside scenery, vehicles, wires and shadows, and are also rich in noise.

Commonly used crack identification methods rely mainly on threshold segmentation, feature detection, texture analysis and seed growing, along with machine learning and fuzzy sets. These existing methods were essentially developed for images from vehicle-mounted acquisition devices, and are not suited to aerial images, which contain more interference and noise.

Summary of the Invention

In view of the problems in the prior art, embodiments of the present invention provide a method, device and equipment for detecting pavement cracks in aerial images, which use the advantages of aerial acquisition to make crack detection more efficient and convenient, and which address the limited robustness of traditional image processing algorithms.

In a first aspect, an embodiment of the present invention provides a method for detecting pavement cracks in aerial images, including:

extracting deep high-dimensional features of the pavement area of an aerial pavement image, and obtaining a high-dimensional feature map according to the deep high-dimensional features;

based on the deep high-dimensional features of the pavement area, screening the high-dimensional feature map for positive and negative samples to distinguish pavement crack targets from the pavement background; and

classifying the pavement crack targets and locating their coordinates to obtain classification information and coordinate information of the pavement crack targets.

In a second aspect, an embodiment of the present invention provides an aerial image pavement crack detection device, including:

a high-dimensional feature map module, configured to extract deep high-dimensional features of the pavement area of an aerial pavement image and to obtain a high-dimensional feature map according to the deep high-dimensional features;

a crack identification module, configured to screen the high-dimensional feature map for positive and negative samples, based on the deep high-dimensional features of the pavement area, so as to distinguish pavement crack targets from the pavement background; and

a classification and positioning module, configured to classify the pavement crack targets and locate their coordinates, obtaining classification information and coordinate information of the pavement crack targets.

In a third aspect, an embodiment of the present invention provides aerial image pavement crack detection equipment, characterized by comprising:

at least one processor; and

at least one memory communicatively connected to the processor, wherein:

the memory stores program instructions executable by the processor, and the processor invokes the program instructions to perform the aerial image pavement crack detection method of the first aspect of the embodiments of the present invention or of any optional embodiment thereof.

In a fourth aspect, a non-transitory computer-readable storage medium is provided. The non-transitory computer-readable storage medium stores computer instructions that perform the aerial image pavement crack detection method of the first aspect of the embodiments of the present invention or of any optional embodiment thereof.

The method for detecting pavement cracks in aerial images provided by the embodiments of the present invention extracts a high-dimensional feature map of the pavement area, makes a preliminary distinction between pavement crack targets and the pavement background, and then classifies and precisely locates the crack targets. It thereby locates and classifies crack targets on aerially photographed pavement and obtains their coordinate information. The embodiments can be applied to pavement crack detection in aerial images with high-altitude moving backgrounds and complex scenes. Compared with commonly used crack detection algorithms, they are better suited to images acquired by UAV-borne systems and detect cracks in aerial images more robustly.

Description of Drawings

To illustrate the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort.

FIG. 1 is a flowchart of a method for detecting pavement cracks in aerial images according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of an example of pavement crack identification in an aerial image according to an embodiment of the present invention;

FIG. 3 is a schematic block diagram of aerial image pavement crack detection equipment according to an embodiment of the present invention.

Detailed Description

To make the objectives, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly below with reference to the accompanying drawings. Obviously, the described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.

FIG. 1 is a flowchart of a method for detecting pavement cracks in aerial images according to an embodiment of the present invention. As shown in FIG. 1, the method includes:

S100, extracting deep high-dimensional features of the pavement area of an aerial pavement image, and obtaining a high-dimensional feature map according to the deep high-dimensional features;

The embodiments of the present invention are suited to crack detection in aerially photographed pavement images. Aerial images are characterized by high altitude, long distance, capture while in motion, and complex scenes, that is, the background outside the pavement is complex.

Preferably, before step S100 the method further includes: coarsely segmenting the aerial pavement image to remove invalid roadside areas and obtain the pavement area of the aerial pavement image.

Specifically, an aerial pavement image includes the pavement area and invalid roadside areas outside it. The embodiment of the present invention removes the invalid roadside areas from the aerial pavement image, and what remains is the pavement area. High-dimensional features are then extracted from the pavement area, and a high-dimensional feature map is obtained from the extracted deep high-dimensional features. Here, a high-dimensional feature is an image feature carrying the overall semantic information of a crack target, as distinct from shallow features that carry only local edge information.

S200, based on the deep high-dimensional features of the pavement area, screening the high-dimensional feature map for positive and negative samples to distinguish pavement crack targets from the pavement background;

Positive and negative sample screening is a preliminary identification of pavement crack targets that separates them from the pavement background: positive samples are crack targets, and negative samples are pavement background.

S300, classifying the pavement crack targets and locating their coordinates to obtain classification information and coordinate information of the pavement crack targets.

After the preliminary localization in step S200 has separated crack targets from the pavement background, step S300 classifies the preliminarily located crack targets and precisely locates their positions, finally obtaining detailed classification information and coordinate position information for each crack target. The classification information is the type of the crack target, for example a transverse crack or a longitudinal crack; the coordinate positioning gives the precise, quantitative position coordinates of the crack target.

Based on the above embodiment, after classifying the pavement crack targets and locating their coordinates to obtain the classification information and coordinate information of the pavement crack targets, the method further includes:

S400, calculating the length of each pavement crack target according to its classification information and coordinate information.

Using the detailed classification information and coordinate position information of the crack targets, step S400 calculates the length of each pavement crack target. From an aerial pavement image, the embodiment of the present invention can therefore finally obtain the located crack targets, their detailed classification, their precise coordinate positions, and quantitative length data.
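The text does not state how step S400 derives length from class and coordinates. One plausible reading, sketched below purely as an assumption, takes the bounding-box side that matches the crack class and converts pixels to metres. The `CrackDetection` type, the `metres_per_pixel` ground sampling distance, and the convention that the road runs along the image's vertical axis are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CrackDetection:
    """Hypothetical container for the output of step S300."""
    label: str    # classification information: "transverse" or "longitudinal"
    x_min: float  # coordinate information: bounding-box corners in pixels
    y_min: float
    x_max: float
    y_max: float


def crack_length_m(det: CrackDetection, metres_per_pixel: float) -> float:
    """Approximate crack length from the box side that matches its class.

    Assumes the road runs along the image's vertical axis, so a transverse
    crack spans the box width and a longitudinal crack spans the box height.
    """
    if det.label == "transverse":
        length_px = det.x_max - det.x_min
    elif det.label == "longitudinal":
        length_px = det.y_max - det.y_min
    else:
        raise ValueError(f"unsupported crack class: {det.label}")
    return length_px * metres_per_pixel


if __name__ == "__main__":
    detection = CrackDetection("transverse", 10, 20, 110, 30)
    print(crack_length_m(detection, metres_per_pixel=0.01))
```

A box-side estimate overstates or understates the length of a crack that is not axis-aligned, so a real system would likely trace the crack inside the box instead.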

The embodiments of the present invention overcome the image processing difficulties caused by UAV acquisition, can be applied to pavement crack detection against high-altitude moving backgrounds and in complex scenes, are more robust than commonly used crack detection algorithms, and identify cracks in aerial images more effectively. Applied to aerial crack detection, the invention not only provides the observer with images in which crack targets are more prominent, but also supports quantitative analysis of crack length, providing a reference for subsequent road maintenance.

In an optional embodiment, step S100, extracting deep high-dimensional features of the pavement area of the aerial pavement image and obtaining a high-dimensional feature map according to the deep high-dimensional features, specifically includes:

S100.1, constructing a feature extraction network using a convolutional neural network, and adding to the feature extraction network a coarse road segmentation layer based on the K-means clustering algorithm;

S100.2, using the coarse road segmentation layer to filter out the invalid roadside areas of the aerial pavement image, obtaining the pavement area of the aerial pavement image;

S100.3, using the feature extraction network to combine low-dimensional features of the pavement area into high-dimensional features, obtaining a high-dimensional feature map.

In step S100.1, the embodiment of the present invention learns and trains on the basis of a convolutional neural network. By analysing mature network structures and taking into account the task of detecting small targets in complex scenes, a feature extraction network is constructed using a convolutional neural network; this network extracts the high-dimensional features of the pavement area. Further, using the K-means algorithm, a coarse road segmentation layer based on K-means clustering is added to the feature extraction network. The coarse road segmentation layer has a high recall rate and can remove invalid roadside areas.

In step S100.2, the coarse road segmentation layer is used to filter out the invalid roadside areas of the aerial pavement image and obtain its pavement area. The specific processing is as follows:

To ensure that the feature extraction network extracts high-dimensional features from the entire pavement area, the coarse road segmentation layer based on the K-means clustering algorithm coarsely segments the input aerial pavement image and removes invalid roadside areas.

For a given sample set, starting from K initial cluster centres, the distance between each cluster centre and every data element is calculated. In each iteration, every data element is assigned to its nearest cluster centre, forming K clusters, and the cluster centres are then recalculated from the new assignment. Assignment is repeated over all data elements until no cluster changes. The sample set is thus divided into a fixed number K of clusters, such that the points within a cluster lie as close together as possible and the distance between clusters is as large as possible.

Assume that the number of clusters is K and the maximum number of iterations is N. The input sample set is:

$$D = \{x_1, x_2, \ldots, x_m\} \tag{1}$$

where $x_i$ is an input sample, $i = 1, 2, \ldots, m$.

The k samples randomly selected from the data set D as centroid vectors are:

$$\mu = \{\mu_1, \mu_2, \ldots, \mu_k\} \tag{2}$$

In each of the N iterations, the cluster partition is initialized as:

For each input sample, the distance between the sample $x_i$ ($i = 1, 2, \ldots, m$) and each centroid vector $\mu_j$ ($j = 1, 2, \ldots, k$) is calculated; its expression is:

Each sample $x_i$ is assigned to the category $\lambda_i$ for which its distance $d_{ij}$ is smallest; the cluster update rule is:

The centroid update rule is:
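Expressions (3) to (6) are not reproduced in this text. The block below gives the standard K-means forms that match the surrounding definitions, offered as an assumption rather than as the original expressions. Writing the distance as a $\beta$-norm is itself an assumption, based on the later statement that $\beta$ selects the distance type.

```latex
\begin{align}
C_j &= \varnothing, \qquad j = 1, 2, \ldots, k \tag{3} \\
d_{ij} &= \lVert x_i - \mu_j \rVert_{\beta} \tag{4} \\
\lambda_i &= \arg\min_{j} d_{ij}, \qquad
  C_{\lambda_i} \leftarrow C_{\lambda_i} \cup \{x_i\} \tag{5} \\
\mu_j &= \frac{1}{\lvert C_j \rvert} \sum_{x \in C_j} x \tag{6}
\end{align}
```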

Steps (4), (5) and (6) are repeated until N iterations have been completed or the clusters no longer change during iteration. The final K clusters are:

$$C = \{C_1, C_2, \ldots, C_k\} \tag{7}$$

In short, training drives the squared-error objective E to its minimum; its expression is:

where β is the distance type of the cost function: β = 1 gives the Manhattan distance, and β = 2 gives the Euclidean distance.
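Expression (8) is not reproduced in this text. A form consistent with the description, given here as an assumption, sums the squared $\beta$-norm distance from every sample to the centroid of its cluster:

```latex
\begin{equation}
E = \sum_{j=1}^{k} \sum_{x \in C_j} \lVert x - \mu_j \rVert_{\beta}^{2} \tag{8}
\end{equation}
```

With $\beta = 2$ this is the usual K-means within-cluster sum of squared Euclidean distances.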

The cluster update rule shows that, after the K-means coarse segmentation layer, image regions with similar features are assigned to the same cluster. The pavement area generally has a similar feature distribution throughout, whereas roadside features are more scattered. Coarsely segmenting the image with K = 2 therefore separates features similar to the pavement area from features dissimilar to it, giving a pavement segmentation with a high recall rate.
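The clustering procedure above can be sketched in a few lines of NumPy. This is a minimal illustration and not the segmentation layer itself: the function name `kmeans`, the fixed random seed, and the choice of per-pixel feature vectors as samples are assumptions.

```python
import numpy as np


def kmeans(samples, k=2, max_iter=100, beta=2, seed=0):
    """Plain K-means: returns (labels, centroids).

    beta selects the distance type: 1 for Manhattan, 2 for Euclidean.
    """
    rng = np.random.default_rng(seed)
    samples = np.asarray(samples, dtype=float)
    # Pick k distinct samples as the initial centroid vectors.
    centroids = samples[rng.choice(len(samples), size=k, replace=False)]
    labels = np.full(len(samples), -1)
    for _ in range(max_iter):
        # Distance from every sample to every centroid.
        diff = samples[:, None, :] - centroids[None, :, :]
        dist = np.linalg.norm(diff, ord=beta, axis=2)
        # Assign each sample to its nearest centroid.
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # clusters no longer change
        labels = new_labels
        # Recompute each centroid as the mean of its members.
        for j in range(k):
            members = samples[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    return labels, centroids


if __name__ == "__main__":
    # Two toy feature groups standing in for "pavement" and "roadside".
    features = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                         [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
    labels, centroids = kmeans(features, k=2)
    print(labels, centroids)
```

For an image, `samples` could be the colour or texture vector of each pixel, with the resulting label map reshaped back to the image grid. Which of the two clusters is the pavement still has to be decided by a separate rule that the text does not specify.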

In step S100.3, the feature extraction network is used to combine the low-dimensional features of the pavement area into high-dimensional features and obtain a high-dimensional feature map. The specific processing is as follows:

The feature extraction network constructed from a convolutional neural network extracts high-dimensional features from the pavement area. Neurons are organized in layers, and their weights and biases are trained and updated by the back-propagation algorithm. Training exploits the relationship between local and global structure in the input data to combine low-dimensional features into high-dimensional features, capturing the spatial correlation between features of different dimensions. Sharing convolution kernel parameters within a layer gives local connectivity and weight sharing, which introduces prior knowledge into the convolutional neural network, greatly reduces the difficulty of training, and makes convolutional neural networks well suited to image data.

A convolutional neural network has four basic kinds of layer: convolutional layers, pooling layers, fully connected layers, and activation layers.

A convolutional layer is a filter with trainable parameters. Convolutions fall into two main kinds according to padding: those with no edge padding, and those in which the border is padded with zero-valued pixels to a width of half the kernel size. The latter has the advantage of preventing the image from shrinking severely after many convolution layers, so that the feature-map size is not affected by the convolution operation. Kernels within the same layer have the same size, and each kernel extracts one kind of image shape feature; kernel sizes are not constrained between different layers.

Suppose a two-dimensional convolution H is applied to a two-dimensional feature image F; its expression is:

where $F_{layer}$ is the two-dimensional feature map of layer $layer$, $F_{layer+1}$ is the two-dimensional feature map of layer $(layer+1)$, i and j are the image coordinates of the convolution centre, and m and n are the length and width of the two-dimensional convolution, respectively.
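Expression (9) is not reproduced in this text. A centred form consistent with the symbols defined above is given below as an assumption. The summation indices $u$ and $v$ are introduced here for illustration, and the form is written as the cross-correlation that convolutional layers typically compute.

```latex
\begin{equation}
F_{layer+1}(i, j) = \sum_{u=0}^{m-1} \sum_{v=0}^{n-1}
  F_{layer}\!\left(i + u - \left\lfloor \tfrac{m}{2} \right\rfloor,\;
                   j + v - \left\lfloor \tfrac{n}{2} \right\rfloor\right) H(u, v) \tag{9}
\end{equation}
```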

After a feature map passes through a convolutional layer, the size of the next layer's feature map is:

$$\mathrm{height}_{layer+1} = (\mathrm{height}_{layer} - m + 2 \cdot \mathrm{padding}) / \mathrm{stride} + 1 \tag{10}$$

$$\mathrm{width}_{layer+1} = (\mathrm{width}_{layer} - n + 2 \cdot \mathrm{padding}) / \mathrm{stride} + 1 \tag{11}$$

where $\mathrm{height}_{layer}$ and $\mathrm{width}_{layer}$ are the height and width of the two-dimensional feature map of layer $layer$, $\mathrm{height}_{layer+1}$ and $\mathrm{width}_{layer+1}$ are the height and width of the two-dimensional feature map of layer $(layer+1)$, padding is the edge padding size of the feature map, and stride is the step size of the convolution.
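As a quick check of formulas (10) and (11), the sketch below evaluates them for a few common settings. The function name is hypothetical, and rounding the division down when the stride does not divide evenly is an assumption, since the text does not say how a remainder is handled.

```python
def conv_output_size(height, width, m, n, padding=0, stride=1):
    """Feature-map size after an m x n convolution, per formulas (10) and (11).

    Floor division is an assumption for strides that do not divide evenly.
    """
    out_height = (height - m + 2 * padding) // stride + 1
    out_width = (width - n + 2 * padding) // stride + 1
    return out_height, out_width


if __name__ == "__main__":
    # No padding: a 3 x 3 kernel trims one pixel from every border.
    print(conv_output_size(224, 224, 3, 3))
    # Padding of half the kernel size keeps the size unchanged at stride 1.
    print(conv_output_size(224, 224, 3, 3, padding=1))
    # A 7 x 7 kernel with padding 3 and stride 2 halves the size.
    print(conv_output_size(224, 224, 7, 7, padding=3, stride=2))
```

The second call illustrates the padding remark made earlier: with padding equal to half the kernel size and a stride of 1, the feature map keeps its size.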

The pooling layer serves as a transition between adjacent convolutional layers. It effectively reduces the amount of data and the number of network parameters, and also helps to prevent the network from overfitting. There are two pooling operations: max pooling and average pooling. Max pooling divides the feature map into regions according to the size of the pooling block and takes the largest value in each region as the pooled feature-map value. Average pooling averages the features in each region and takes the mean as the pooled feature-map value. Max pooling is suited to extracting features of image targets, and average pooling is suited to averaging features of the image background.
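The two pooling operations can be illustrated with a small NumPy sketch. It assumes non-overlapping square blocks and drops any rows or columns that do not fill a whole block; the function name `pool2d` is hypothetical.

```python
import numpy as np


def pool2d(feature_map, block=2, mode="max"):
    """Non-overlapping pooling over block x block regions of a 2-D map."""
    height, width = feature_map.shape
    out_h, out_w = height // block, width // block
    # Crop to a whole number of blocks, then expose each block on axes 1 and 3.
    tiles = feature_map[:out_h * block, :out_w * block]
    tiles = tiles.reshape(out_h, block, out_w, block)
    if mode == "max":
        return tiles.max(axis=(1, 3))
    if mode == "average":
        return tiles.mean(axis=(1, 3))
    raise ValueError(f"unknown pooling mode: {mode}")


if __name__ == "__main__":
    fm = np.array([[1.0, 2.0, 5.0, 6.0],
                   [3.0, 4.0, 7.0, 8.0],
                   [0.0, 0.0, 1.0, 1.0],
                   [0.0, 4.0, 1.0, 1.0]])
    print(pool2d(fm, mode="max"))      # [[4. 8.] [4. 1.]]
    print(pool2d(fm, mode="average"))  # [[2.5 6.5] [1.  1. ]]
```

In the lower-left block a single strong response (4) survives max pooling but is diluted to 1.0 by averaging, which matches the remark that max pooling favours target features.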

The fully connected layer is one of the earliest forms of neural network. It joins the high-dimensional features output by the preceding network into a single long feature vector and maps that vector into a linearly separable space, so that the output layer can classify the feature targets. Its output size equals the number of classes.

The activation layer adds nonlinearity to an otherwise linear model, removes redundancy in the data, and retains and maps out the features extracted by the convolutional layers, so that the model fits better. To avoid vanishing gradients and dead neurons, and to make the loss curve converge faster, the ELU activation function is used; its expression is:

where x is the input to the activation layer and α is a preset parameter of the activation layer. In addition, Dropout and Batch Normalization are used to reduce the risk of the deep network overfitting the data.
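The ELU expression is not reproduced in this text. In its standard definition, assumed here, ELU(x) equals x for x > 0 and α(eˣ − 1) otherwise. A minimal NumPy sketch follows, with α = 1.0 as an assumed default.

```python
import numpy as np


def elu(x, alpha=1.0):
    """Standard ELU: x for x > 0, alpha * (exp(x) - 1) otherwise."""
    x = np.asarray(x, dtype=float)
    # expm1 is evaluated only on the non-positive part to avoid overflow.
    negative_branch = alpha * np.expm1(np.minimum(x, 0.0))
    return np.where(x > 0, x, negative_branch)


if __name__ == "__main__":
    print(elu([-5.0, -1.0, 0.0, 2.0]))
```

Unlike ReLU, the negative branch keeps a non-zero gradient and saturates smoothly at −α, which is the usual argument for ELU avoiding dead units.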

Dropout randomly deactivates a proportion of the neurons in the current layer. Forward propagation computes the loss between the predicted values and the labels, back-propagation updates the network parameters, and the process is repeated. Because a different random set of neurons is deactivated at each iteration, the parameters are in effect updated across different network structures, which helps to avoid overfitting. Dropout also reduces complex co-adaptation between neurons: weight updates no longer depend on particular interrelated neurons, so the network is forced to select features that generalize and to learn more robust features.
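A minimal sketch of the random deactivation step is given below. It uses the common "inverted dropout" convention, rescaling the surviving activations by 1/(1 − rate) so that no change is needed at inference time. That rescaling is an assumption, since the text does not describe it, and the function name is hypothetical.

```python
import numpy as np


def dropout(activations, drop_rate, rng, training=True):
    """Randomly zero a fraction drop_rate of activations during training."""
    if not 0.0 <= drop_rate < 1.0:
        raise ValueError("drop_rate must be in [0, 1)")
    if not training or drop_rate == 0.0:
        return activations  # inference: every neuron stays active
    # A fresh random mask is drawn on every call, i.e. on every iteration.
    keep_mask = rng.random(activations.shape) >= drop_rate
    # Inverted dropout (assumed): rescale so the expected value is unchanged.
    return activations * keep_mask / (1.0 - drop_rate)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    layer_output = np.ones(10)
    print(dropout(layer_output, 0.5, rng))
    print(dropout(layer_output, 0.5, rng, training=False))
```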

Batch Normalization is a method of normalizing the distribution of data. When a deep network is trained, the distribution of the input-layer data is not fixed, so the distribution seen by each layer also varies. The model must then keep adapting to new data distributions, the learning rate has to be set very small, and parameter initialization becomes very demanding. Batch Normalization addresses this by normalizing the data distribution to a fixed range before the activation function, so that the model does not have to adapt to a different distribution at every step and the gradients can update the parameters more effectively.

In the training phase, the mini-batch sample mean is calculated as:

where m is the number of samples in the mini-batch.

From the sample mean, the mini-batch sample standard deviation is calculated as:

where δ is a preset parameter and X denotes each sample in the data set. The normalized sample distribution is:

The batch-normalized sample distribution is then replaced by:

where γ is the learnable parameter for the sample mean, and the second symbol in the expression is the learnable parameter for the sample standard deviation.
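The four batch-normalization expressions are not reproduced in this text. The sketch below follows the standard training-time computation and is an assumption, not the original formulas: the mean over the mini-batch, a standard deviation stabilized by δ inside the square root, normalization, and a learnable affine transform. The two learnable parameters are named `scale` and `shift` here to avoid guessing the original symbols.

```python
import numpy as np


def batch_norm_train(batch, scale, shift, delta=1e-5):
    """Training-time batch normalization over axis 0 (the mini-batch axis)."""
    # Mini-batch mean: (1/m) * sum of the m samples.
    mean = batch.mean(axis=0)
    # Mini-batch standard deviation, with delta keeping it away from zero.
    std = np.sqrt(((batch - mean) ** 2).mean(axis=0) + delta)
    # Normalize to zero mean and (approximately) unit variance.
    normalized = (batch - mean) / std
    # Learnable affine transform applied to the normalized samples.
    return scale * normalized + shift


if __name__ == "__main__":
    mini_batch = np.array([[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]])
    print(batch_norm_train(mini_batch, scale=1.0, shift=0.0))
```

At inference time, implementations normally replace the mini-batch statistics with running averages collected during training, a step the text does not cover.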

The gradient update algorithm is RMSprop, an adaptive learning-rate method that converges faster. It uses the gradient from the previous moment to update the current gradient and learning rate. Its gradient update formula is:

where $E[g^2]_{t-1}$ is the mean of the squared gradients at the previous moment, $E[g^2]_t$ is the mean of the squared gradients at the current moment, the remaining gradient term in the expression is the squared gradient at the current moment, η is the learning rate, $\theta_t$ is the weight parameter at the previous moment, and $\theta_{t+1}$ is the weight parameter at the current moment.
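The update formula is not reproduced in this text. The sketch below uses the standard RMSprop rule, assumed here: E[g²]ₜ = ρ·E[g²]ₜ₋₁ + (1 − ρ)·gₜ², followed by θₜ₊₁ = θₜ − η·gₜ / √(E[g²]ₜ + ε). The decay ρ = 0.9, the small constant ε, and the function name are assumptions.

```python
import numpy as np


def rmsprop_step(theta, grad, mean_sq, lr=0.01, decay=0.9, eps=1e-8):
    """One RMSprop update; returns the new weights and running mean square."""
    # Exponential moving average of the squared gradient.
    mean_sq = decay * mean_sq + (1.0 - decay) * grad ** 2
    # Scale the step by the root of that running average.
    theta = theta - lr * grad / np.sqrt(mean_sq + eps)
    return theta, mean_sq


if __name__ == "__main__":
    # Minimize f(theta) = theta^2, whose gradient is 2 * theta.
    theta = np.array([1.0])
    mean_sq = np.zeros_like(theta)
    for _ in range(500):
        theta, mean_sq = rmsprop_step(theta, 2.0 * theta, mean_sq)
    print(theta)
```

Because each step is divided by the recent gradient magnitude, the effective step size stays near the learning rate regardless of the raw gradient scale.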

In addition, for convolutions over small feature maps, combinations of small convolution kernels replace large kernels: at the back end of the network, an n×n kernel is replaced by the combination of an n×1 kernel and a 1×n kernel, which effectively reduces the number of network parameters and speeds up network convergence.
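The saving from this factorization can be checked by counting weights. The sketch below ignores biases and assumes the intermediate feature map between the n×1 and 1×n kernels has the same number of channels as the output; both are simplifying assumptions, and the function names are hypothetical.

```python
def conv_weights(kernel_h, kernel_w, in_channels, out_channels):
    """Number of weights in one convolutional layer (biases ignored)."""
    return kernel_h * kernel_w * in_channels * out_channels


def factorized_weights(n, in_channels, out_channels):
    """Weights of an n x 1 convolution followed by a 1 x n convolution."""
    first = conv_weights(n, 1, in_channels, out_channels)
    second = conv_weights(1, n, out_channels, out_channels)
    return first + second


if __name__ == "__main__":
    n, channels = 7, 64
    print(conv_weights(n, n, channels, channels))    # 200704
    print(factorized_weights(n, channels, channels)) # 57344
```

With equal input and output channels the ratio is n²/2n = n/2, so the saving grows with kernel size.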

Based on the above embodiment, step S200, screening positive and negative samples of the high-dimensional feature map based on the deep high-dimensional features of the pavement area so as to distinguish pavement crack targets from the pavement background, specifically includes:

S200.1: based on the deep high-dimensional features of the pavement area, traverse the high-dimensional feature map with an anchor sliding window to obtain candidate sample boxes with preset area scales and preset aspect ratios; the candidate sample boxes are the candidate sample regions.

The preset area scales may be three area scales, preferably 128², 256², and 512². The preset aspect ratios may be three aspect ratios, preferably 1:1, 1:3, and 3:1.

Specifically, positive and negative samples are divided as follows:

Two kinds of candidate sample regions are classified as positive samples: candidate sample boxes whose IOU with any annotated sample box is greater than a first preset threshold, and the candidate sample boxes that have the largest IOU with each of the remaining annotated sample boxes.

Candidate sample boxes, other than the positive samples, whose IOU with the annotated sample boxes is less than a second preset threshold are classified as negative samples.

Here,

$$\mathrm{IOU} = \frac{\text{candidate sample box} \cap \text{annotated sample box}}{\text{candidate sample box} \cup \text{annotated sample box}}$$

Preferably, the first preset threshold is 0.7 and the second preset threshold is 0.3.

S200.2: using the classification loss function, the localization loss function, and the multi-task loss function of the region proposal network of the feature extraction network, train on the candidate sample boxes to screen positive and negative samples, obtaining pavement crack target samples and pavement background samples. Positive samples are pavement crack target samples and negative samples are pavement background samples.

Specifically, the classification loss function of the region proposal network is:

$$l_{cls}(p) = -(1 - p_u)^{\gamma} \log p_u$$

where $p_u$ is the probability that the annotated sample box is a positive or negative sample, and γ is the Focal Loss training parameter.

The localization loss function of the region proposal network is:

where $v_i$ is the coordinate information of the candidate sample box, the other argument of the loss is the regression correction parameter of the foreground and background predicted sample boxes, u denotes the u-th predicted sample box, and $t_i$ is the coordinate information of the annotated sample box. The underlying loss function is:

The multi-task loss function of the region proposal network is:

where $n_{cls}$ is the number of all samples, $n_{reg}$ is the number of positive samples, μ is the normalization value, set to 0.2, $p_i$ is the predicted probability of being a crack target, and the accompanying label term is a discrete label value.

Based on the above features, step S200.1 is carried out as follows:

First, on the high-dimensional feature map extracted by the last convolutional layer of the feature extraction network, a fixed-size anchor sliding window traverses the whole feature map. The center of each sliding window yields a different local receptive field at each layer, given by:

$$size_{l-1} = stride \times (size_l - 1) + size_{conv} - 2 \times padding \tag{19}$$

where $size_{l-1}$ is the size of the receptive field at the previous layer, $size_l$ is the output size of the feature map, stride is the convolution stride, and padding is the convolution padding size.
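
Equation (19) can be applied layer by layer, from the last convolution back to the input. The sketch below does this for a list of layers; the layer configurations in the usage line and the function name `receptive_field` are made-up illustrations, not the network of this description.

```python
def receptive_field(layers, out_size=1):
    """Apply equation (19) from the last layer back to the input.

    layers: list of (kernel, stride, padding) tuples in forward order.
    out_size: size on the final feature map (1 = a single window center).
    """
    size = out_size
    for kernel, stride, padding in reversed(layers):
        size = stride * (size - 1) + kernel - 2 * padding
    return size


# Three 3 x 3 convolutions with stride 1 and no padding.
print(receptive_field([(3, 1, 0)] * 3))
```

Stacking stride-2 layers enlarges the field much faster than stride-1 layers, which is why a single position on the last feature map can cover a large region of the original image.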

Each sliding-window center corresponds to nine candidate sample boxes: within the receptive-field region of the window center, anchor boxes with the three aspect ratios and three areas described above select the candidate sample regions. Because crack targets are slender, three preferred area scales and three preferred aspect ratios are chosen: the area scales are 128², 256², and 512², and the aspect ratios are 1:1, 1:3, and 3:1.
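
The nine boxes per window center can be generated as in the sketch below. It assumes that each ratio is read as width:height and that boxes are expressed as corner coordinates (x1, y1, x2, y2) centered on the window center; neither convention is stated in this description, and `make_anchors` is a hypothetical helper name.

```python
import math


def make_anchors(cx, cy,
                 areas=(128 ** 2, 256 ** 2, 512 ** 2),
                 ratios=((1, 1), (1, 3), (3, 1))):
    """Return 9 anchor boxes (x1, y1, x2, y2) centered at (cx, cy)."""
    boxes = []
    for area in areas:
        for rw, rh in ratios:
            # Solve width * height = area with width : height = rw : rh.
            w = math.sqrt(area * rw / rh)
            h = math.sqrt(area * rh / rw)
            boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return boxes


for box in make_anchors(300.0, 300.0):
    print(tuple(round(v, 1) for v in box))
```

The 1:3 and 3:1 boxes keep the same area as the square box while stretching along one axis, which suits long, thin transverse or longitudinal cracks.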

The coordinates of each candidate region are then mapped to a 1024-dimensional feature vector, matching the number of output channels of the convolutional layer, and two loss functions are computed separately: the classification loss and the localization loss. First, the positive and negative samples that can be used for training must be picked from the candidate sample boxes produced by the anchor mechanism. IOU measures the overlap between a candidate sample box and the neighboring annotated boxes of the training data and is used to select the training samples:

$$\mathrm{IOU} = \frac{\text{candidate sample box} \cap \text{annotated sample box}}{\text{candidate sample box} \cup \text{annotated sample box}} \tag{20}$$

Positive and negative samples are divided as follows. Candidate sample boxes whose IOU with any annotated sample box is greater than 0.7, together with the candidate sample boxes that have the largest IOU with each of the remaining annotated sample boxes, are training positives. Candidate sample boxes, other than the positives, whose IOU with the annotated sample boxes is less than 0.3 are training negatives.
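
The division rule above can be sketched as follows. Boxes are assumed to be (x1, y1, x2, y2) corner tuples, at least one annotated box is assumed to exist, and candidates that fall between the two thresholds are marked as ignored (-1), which is the usual convention but is not spelled out in this description. The names `iou` and `label_anchors` are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def label_anchors(anchors, gt_boxes, hi=0.7, lo=0.3):
    """Return 1 (positive), 0 (negative) or -1 (ignored) per candidate box."""
    overlaps = [[iou(a, g) for g in gt_boxes] for a in anchors]
    labels = []
    for row in overlaps:
        best = max(row)
        labels.append(1 if best > hi else 0 if best < lo else -1)
    # Each annotated box also claims the candidate that overlaps it most.
    for j in range(len(gt_boxes)):
        best_idx = max(range(len(anchors)), key=lambda i: overlaps[i][j])
        if overlaps[best_idx][j] > 0:
            labels[best_idx] = 1
    return labels


anchors = [(0, 0, 10, 10), (5, 0, 15, 10), (100, 100, 110, 110)]
print(label_anchors(anchors, [(0, 0, 10, 10)]))
```

The second rule guarantees that every annotated crack gets at least one positive candidate, even when no candidate reaches the 0.7 threshold.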

Based on the above features, step S200.2 is carried out as follows:

The classification loss function of the region proposal network separates two classes of samples, crack targets and background regions, using the actual annotated sample boxes that correspond to the divided positive and negative samples; annotated target samples are labeled 1 and background regions are labeled 0. Each annotated sample box corresponds to a discrete probability distribution:

$$p = (p_0, p_1) \tag{21}$$

where $p_0$ is the probability that the box is a background region and $p_1$ is the probability that it is a crack target region.

Next, the anchor mechanism cannot guarantee a balanced ratio of positive to negative samples, which leads to class imbalance. Most negative samples are easy regions that are trivially judged negative; because they are so numerous, they still influence training and keep the loss function from converging to a good result. Therefore, following the idea of Focal Loss, the classification loss function of the region proposal network is:

$$l_{cls}(p) = -(1 - p_u)^{\gamma} \log p_u \tag{22}$$

where $p_u$ is the probability that the annotated sample box is a positive or negative sample, and γ is the Focal Loss training parameter, with an initial value of 5.
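
A short sketch of equation (22) shows how the modulating factor suppresses easy samples. Here `p_u` is taken to be the probability the network assigns to the true class of a sample, which is an interpretation of the text; the function name `focal_loss` is hypothetical.

```python
import math


def focal_loss(p_u, gamma=5.0):
    """Focal loss of equation (22) for one sample.

    p_u: probability assigned to the sample's true class (0 < p_u <= 1).
    gamma: focusing parameter; 5 is the initial value given in the text.
    """
    return -((1.0 - p_u) ** gamma) * math.log(p_u)


print(focal_loss(0.9))  # easy sample: loss is almost zero
print(focal_loss(0.1))  # hard sample: loss stays large
print(focal_loss(0.5, gamma=0.0))  # gamma = 0 reduces to cross-entropy
```

With γ = 5, a sample classified with 90% confidence contributes several orders of magnitude less than one classified with 10% confidence, so the many easy background boxes no longer dominate training.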

Finally, the localization loss corrects the positions of the candidate sample boxes divided by receptive field, using the training positive and negative samples. Each training sample corresponds to a vector of box position and size coordinates, computed by mapping between the coordinates of the divided positive and negative samples and the sample coordinates annotated in the data set. The mapping between the boxes is:

where $f(P_x, P_y, P_w, P_h)$ is the coordinate information of the divided positive and negative sample boxes, the intermediate term is the coordinate information of the corrected sample box, and $g(G_x, G_y, G_w, G_h)$ is the coordinate information of the annotated sample box. Each consists of four quantities: the horizontal and vertical coordinates of the upper-left corner and the width and height of the sample box.

Following the bounding-box regression correction used in Faster-RCNN, a series of translations and scalings converts these into two groups of four feature components each.

The regression correction parameters of the annotated sample box are:

The regression correction parameters of the predicted sample box are:

The loss function for the bounding-box correction uses an $L_1$ loss function, which is more robust to outliers:

where the loss function inside it is given by:
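
For orientation, the sketch below follows the bounding-box parameterization and the smooth L1 loss published with Faster R-CNN, which this passage cites as its reference. Treating it as the form intended here is an assumption; the names `regression_targets`, `smooth_l1`, and `localization_loss` are hypothetical.

```python
import math


def regression_targets(p, g):
    """Translation and log-scale offsets that map box p onto box g.

    p, g: boxes as (x, y, w, h).
    """
    px, py, pw, ph = p
    gx, gy, gw, gh = g
    return ((gx - px) / pw, (gy - py) / ph,
            math.log(gw / pw), math.log(gh / ph))


def smooth_l1(x):
    """Quadratic near zero, linear for large errors."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5


def localization_loss(t, v):
    """Sum of smooth L1 over the four regression components."""
    return sum(smooth_l1(ti - vi) for ti, vi in zip(t, v))


t = regression_targets((10, 10, 20, 40), (12, 14, 20, 40))
print(t, localization_loss(t, (0.0, 0.0, 0.0, 0.0)))
```

The linear tail of smooth L1 is what limits the influence of badly placed boxes compared with a squared error.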

The multi-task loss function of this region proposal network is then:

where $n_{cls}$ is the number of all samples, $n_{reg}$ is the number of positive samples, μ is the normalization value, set to 0.2, $p_i$ is the predicted probability of being a crack target, and the accompanying label term is a discrete label value, given by:

The multi-task loss function of the region proposal network screens the positive and negative sample regions needed for training and performs a preliminary classification into crack targets and pavement background. Combined with the annotated sample boxes, it achieves a preliminary regression localization of the crack target sample boxes.
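
One plausible way to combine the two terms, modeled on the Faster R-CNN multi-task loss, is sketched below. The exact formula is not reproduced in this text, so the structure (classification term averaged over all samples, localization term counted only for positives and weighted by μ) is an assumption, as is the function name `multitask_loss`.

```python
def multitask_loss(cls_losses, loc_losses, labels, mu=0.2):
    """Combine per-sample classification and localization losses.

    cls_losses, loc_losses: one value per sample.
    labels: 1 for crack targets, 0 for background (the discrete label).
    mu: weight of the localization term (0.2 in the text).
    """
    n_cls = len(cls_losses)   # all samples
    n_reg = sum(labels)       # positive samples only
    cls_term = sum(cls_losses) / n_cls
    # Background samples have label 0, so they add nothing to localization.
    loc_term = sum(l * y for l, y in zip(loc_losses, labels)) / max(n_reg, 1)
    return cls_term + mu * loc_term


print(multitask_loss([0.2, 0.4], [1.0, 3.0], [1, 0]))
```

Multiplying the localization loss by the label means that box regression is learned only from crack samples, while classification is learned from both classes.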

Based on the above embodiment, step S300, classifying the pavement crack targets and locating their coordinates to obtain their classification information and coordinate information, specifically includes:

S300.1: use the ROI pooling layer of the feature extraction network to normalize the positive and negative samples screened by the region proposal network to feature maps of uniform size, and output their classes, obtaining classification information that covers transverse cracks, longitudinal cracks, and pavement background.

Specifically, when the sample boxes produced by the region proposal network, which have already been classified in training and refined by bounding-box regression, are combined with the convolutional neural network, an ROI pooling layer with a spatial pyramid pooling structure divides the feature map evenly into an n×n grid and applies max pooling to the part of the feature map inside each grid cell, keeping only the maximum value.
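
The grid-and-maximum operation can be sketched on a plain 2-D list. The region format (row0, col0, row1, col1, end-exclusive), the rounding of cell boundaries, and the function name `roi_max_pool` are assumptions made for this illustration.

```python
def roi_max_pool(feature, roi, n):
    """Max-pool the region roi of a 2-D feature map into an n x n grid.

    feature: list of rows; roi: (row0, col0, row1, col1), end-exclusive.
    """
    r0, c0, r1, c1 = roi
    h, w = r1 - r0, c1 - c0
    pooled = []
    for i in range(n):
        ra = r0 + (i * h) // n                                 # cell top
        rb = max(r0 + ((i + 1) * h + n - 1) // n, ra + 1)      # cell bottom
        row = []
        for j in range(n):
            ca = c0 + (j * w) // n                             # cell left
            cb = max(c0 + ((j + 1) * w + n - 1) // n, ca + 1)  # cell right
            row.append(max(feature[r][c]
                           for r in range(ra, rb)
                           for c in range(ca, cb)))
        pooled.append(row)
    return pooled


feature = [[4 * r + c for c in range(4)] for r in range(4)]
print(roi_max_pool(feature, (0, 0, 4, 4), 2))
```

Whatever the size of the input region, the output is always n×n, which is what allows regions of different sizes to feed the same fully connected layer.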

The output feature map of the last convolutional layer of the feature extraction network, together with the original-image box coordinates of the candidate sample boxes output by the region proposal network, passes through the ROI pooling layer so that the feature sizes are unified, and is then connected to a fully connected layer. The output size of the fully connected layer is the number of classes of the classification network plus the background class, that is, three outputs: transverse crack, longitudinal crack, and pavement background. Compared with the preliminary classification into crack target and pavement background in step S200, the classification in this step is finer.

S300.2: use the classification loss function to classify the specific crack category of the positive samples, and perform bounding-box regression to correct the coordinate information of the crack target boxes.

Specifically, the classification loss function classifies the specific crack category of each positive candidate sample box. Each candidate sample box corresponds to a discrete probability distribution:

$$P = (P_0, P_1, P_2) \tag{36}$$

where $P_0$ is the probability that the box is a background region, $P_1$ is the probability that it is a transverse crack region, and $P_2$ is the probability that it is a longitudinal crack region.

In classifying crack targets, the numbers of transverse crack samples and longitudinal crack samples are imbalanced, so the classification loss function is designed with Focal Loss:

$$L_{cls}(P) = -(1 - P_U)^{\gamma} \log P_U \tag{37}$$

where $P_U$ is the probability that the candidate sample box is a positive or negative sample.

The loss function of the bounding-box regression network locates the coordinates of the classified positive candidate sample boxes. The correction uses the regression correction parameters $T = (T_x, T_y, T_w, T_h)$ of the corresponding annotated sample box and $V = (V_x, V_y, V_w, V_h)$ of the predicted sample box, as described in S2.1:

For this crack classification network, the multi-task loss function that combines the classification loss and the localization loss is:

where $N_{cls}$ is the number of all samples, $N_{reg}$ is the number of positive samples, $P_i$ is the predicted probability of being a crack target, and the accompanying label term is a discrete label value, given by:

The multi-task loss function classifies the crack targets preliminarily identified by the region proposal network into specific crack types. Combined with the annotated sample boxes, it performs a second regression localization of the crack target sample boxes and corrects the coordinate information of the crack target boxes.

Based on the above embodiment, step S400, calculating the length of the pavement crack target according to its classification information and coordinate information, specifically includes:

S400.1: according to the classification information and coordinate information of the pavement crack target, extract the single-pixel skeleton of the crack target with the morphological hit-or-miss transform and calculate the pixel length of the crack target; crack targets include transverse cracks and longitudinal cracks.

S400.2: according to the pixel coordinates of the aerial pavement image and the road length of the actual pavement, convert the pixel length of the crack target to obtain the length of the crack target.

Specifically, in step S400.1, morphological dilation first connects the crack fragments that were broken apart during detection. The dilated, complete crack is then thinned with the morphological hit-or-miss transform, which removes redundant edge pixels while keeping the geometric scale of the crack unchanged and eliminates the multi-pixel width produced by detection. The resulting skeleton, one pixel wide, reflects the detailed characteristics of the crack fairly accurately.

For the crack detection target, structuring elements in multiple directions are applied iteratively:

where f is the input structure image, s is a set of suitable structuring elements, and c is the number of iterations.

Suppose the sequence of structuring elements s is defined as:

$$\{s\} = \{s^1, s^2, \ldots, s^n\} \tag{42}$$

Then:

Using the method of equations (3.21) to (3.23), all structuring elements are applied iteratively. If the result has not converged, the operation is repeated for each structuring element in turn until the result no longer changes.
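
The iteration described above can be sketched with a textbook thinning scheme: two 3×3 structuring elements and their 90° rotations are applied in turn, and every pixel matched by the hit-or-miss transform is removed, until a full pass changes nothing. The specific structuring elements are a commonly used set and are an assumption, since this description does not list its own; the dilation step that precedes thinning is omitted, and all function names are hypothetical.

```python
def rotate(se):
    """Rotate a 3 x 3 structuring element by 90 degrees clockwise."""
    return [[se[2 - c][r] for c in range(3)] for r in range(3)]


def hit_or_miss(fg, se):
    """Foreground pixels whose 3 x 3 neighbourhood matches se.

    fg: set of (row, col) foreground pixels.
    se: 3 x 3 list with 1 = foreground, 0 = background, None = don't care.
    """
    hits = set()
    for r, c in fg:
        if all(se[i][j] is None
               or ((r + i - 1, c + j - 1) in fg) == bool(se[i][j])
               for i in range(3) for j in range(3)):
            hits.add((r, c))
    return hits


def thin(fg):
    """Thin a pixel set until no structuring element removes anything."""
    b1 = [[0, 0, 0], [None, 1, None], [1, 1, 1]]
    b2 = [[None, 0, 0], [1, 1, 0], [None, 1, None]]
    elements = []
    for _ in range(4):            # four orientations of each element
        elements += [b1, b2]
        b1, b2 = rotate(b1), rotate(b2)
    fg = set(fg)
    while True:
        before = len(fg)
        for se in elements:
            fg = fg - hit_or_miss(fg, se)   # thinning: A minus (A hit-or-miss B)
        if len(fg) == before:
            return fg


bar = {(r, c) for r in range(3) for c in range(7)}   # 3-pixel-thick bar
print(sorted(thin(bar)))
```

The number of pixels left in the skeleton is one simple measure of the crack's pixel length, although short spurs at the ends of the skeleton may need pruning before it is used.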

Next, least-squares fitting with a Huber weight function fits the discrete points to a skeleton curve. The standard least-squares criterion requires the sum of the distances from the discrete points to the fitted skeleton curve to be minimal, but it is not robust to outliers far from the skeleton curve, so a weight threshold is needed.

The Huber weight function is:

where τ is the distance threshold and δ is the distance to the adjacent curve.

A suitable threshold is chosen according to the crack recognition quality in the image and the distances δ to the adjacent curve, taking both noise interference and crack targets into account, so as to constrain the fit.

When the distance from a point to the curve is less than or equal to the threshold τ, the weight is 1. When the distance is greater than the threshold, the weight equals the threshold multiplied by the reciprocal of the distance, so the farther the point, the smaller the weight. Preferably, τ = 3.
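
The weighting rule stated above translates directly into code. The sketch below pairs it with an iteratively reweighted least-squares fit of a straight line; using a straight line instead of a general skeleton curve, using the vertical residual as the point-to-curve distance, and the fixed iteration count are simplifying assumptions, and the function names are hypothetical.

```python
def huber_weight(dist, tau=3.0):
    """Weight 1 within the threshold, tau / distance beyond it."""
    d = abs(dist)
    return 1.0 if d <= tau else tau / d


def weighted_line_fit(points, weights):
    """Weighted least squares for y = a * x + b."""
    sw = sum(weights)
    sx = sum(w * x for w, (x, _) in zip(weights, points))
    sy = sum(w * y for w, (_, y) in zip(weights, points))
    sxx = sum(w * x * x for w, (x, _) in zip(weights, points))
    sxy = sum(w * x * y for w, (x, y) in zip(weights, points))
    a = (sw * sxy - sx * sy) / (sw * sxx - sx * sx)
    b = (sy - a * sx) / sw
    return a, b


def robust_line_fit(points, tau=3.0, iterations=10):
    """Refit repeatedly, down-weighting points far from the current line."""
    weights = [1.0] * len(points)
    for _ in range(iterations):
        a, b = weighted_line_fit(points, weights)
        weights = [huber_weight(y - (a * x + b), tau) for x, y in points]
    return a, b


points = [(x, 2 * x + 1) for x in range(10)] + [(5, 100)]  # one outlier
print(weighted_line_fit(points, [1.0] * len(points)))      # pulled upward
print(robust_line_fit(points))                             # close to (2, 1)
```

Because the weight decays as the reciprocal of the distance, a single distant noise point keeps a small, bounded influence instead of dragging the whole fit toward it.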

Finally, step S400.2 combines the pixel length of the crack target computed in S400.1 with parameters such as the flight height and speed of the UAV to obtain the ratio between image pixels and actual length, which yields the actual length of the crack target.
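
A minimal sketch of the conversion, assuming that the scale is obtained from a reference of known real length that is visible in the image (for example a stretch of road of known length). The reference values, the use of an ordered list of skeleton points, and the function names are illustrative assumptions; deriving the scale from flight height and camera parameters is not shown.

```python
import math


def polyline_length_px(points):
    """Length in pixels of an ordered list of (x, y) skeleton points."""
    return sum(math.hypot(q[0] - p[0], q[1] - p[1])
               for p, q in zip(points, points[1:]))


def to_meters(length_px, reference_m, reference_px):
    """Scale a pixel length by a reference of known real length."""
    return length_px * reference_m / reference_px


skeleton = [(0, 0), (3, 4), (3, 10)]
length_px = polyline_length_px(skeleton)
print(length_px, to_meters(length_px, 3.75, 150.0))
```

The same scale factor applies to every crack in the image only if the camera looks straight down and the pavement is flat, which is a reasonable first approximation for nadir aerial images.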

Embodiments of the present invention may be combined with a display device to show relevant information, including the road length, the number of cracks, the crack types, and the length of each crack segment.

In summary, the deep-learning method for detecting pavement cracks in aerial images proposed by the embodiments of the present invention uses a high-recall K-means rough road segmentation layer to remove invalid roadside areas; provides a feature extraction network structure, suited to small-target detection in complex scenes, that extracts high-dimensional features of the pavement area; provides a region proposal structure for the feature extraction network, suited to pavement crack detection in aerial images, that generates candidate sample boxes and preliminarily screens positive and negative samples of crack targets and pavement background; and uses a classification and localization structure to classify the detected crack targets and perform a second bounding-box regression, then extracts the single-pixel crack skeleton from the detected regions to compute the length of each crack target. The method is robust for crack detection in aerial images.

FIG. 2 is a schematic diagram of an embodiment of pavement crack recognition in aerial images. Referring to FIG. 2, the embodiment takes an aerial pavement image as input. In step S100, a high-recall K-means rough road segmentation layer removes the invalid roadside areas of the aerial image, and a feature extraction network structure designed for small-target detection in complex scenes extracts high-dimensional features of the pavement area. In step S200, a region proposal structure of the feature extraction network, suited to pavement crack detection in aerial images, generates candidate sample boxes on the feature map of the last layer of the network of step 2 and preliminarily screens positive and negative samples of crack targets and pavement background. In step S300, a classification and localization structure classifies the crack targets and performs a second bounding-box regression. In step S400, the hit-or-miss transform extracts the single-pixel crack skeleton from the detected regions, and the crack length is then computed from the ratio between image coordinates and actual pavement dimensions. Finally, the crack image and related information, including the road length, the number of cracks, the crack types, and the length of each crack segment, are output for display on a display device.

For aerial images, the embodiments of the present invention overcome the image-processing difficulties caused by UAV acquisition and can be applied to pavement crack detection against high-altitude moving backgrounds and in complex scenes. Compared with commonly used crack detection algorithms, they are more robust and recognize cracks in aerial images more effectively. Applied to aerial crack detection, the invention not only detects, classifies, and locates crack targets automatically but also analyzes crack length quantitatively, providing a reference for subsequent road maintenance.

The embodiments of the present invention process pavement images with deep learning in three main steps: first, extract a deep high-dimensional feature map of the pavement area; second, generate and screen valid positive and negative samples for training and preliminarily distinguish pavement crack targets from the pavement background; third, classify the crack targets, locate their specific coordinates, and then compute the skeleton length of each crack target. Each processing step uses a different algorithm network to process and analyze the image data, and different hyperparameters are set for the initial separation of crack targets from background, the detailed classification, and the coordinate localization, achieving automated, accurate classification and localization of pavement cracks and quantitative skeleton calculation.

The algorithms chosen in the embodiments of the present invention are particularly suited to pavement crack recognition in aerial images with large data volumes and heavy interference noise. They remedy the deficiencies of the prior art and provide good beneficial effects.

An embodiment of the present invention also provides an aerial image pavement crack detection device, including:

The device of this embodiment can be used to carry out the technical solution of the method embodiment for aerial image pavement crack detection shown in FIG. 1. Its implementation principle and technical effects are similar and are not repeated here.

FIG. 3 is a schematic block diagram of aerial image pavement crack detection equipment according to an embodiment of the present invention. Referring to FIG. 3, the equipment includes a processor 310, a communications interface 320, a memory 330, and a bus 340; the processor 310, the communications interface 320, and the memory 330 communicate with one another over the bus 340. The processor 310 can invoke logic instructions in the memory 330 to perform the following method: extracting deep high-dimensional features of the pavement area of an aerial pavement image and obtaining a high-dimensional feature map from them; screening positive and negative samples of the high-dimensional feature map based on the deep high-dimensional features of the pavement area, so as to distinguish pavement crack targets from the pavement background; and classifying the pavement crack targets and locating their coordinates to obtain their classification information and coordinate information.

An embodiment of the present invention discloses a computer program product. The computer program product includes a computer program stored on a non-transitory computer-readable storage medium, and the computer program includes program instructions. When the program instructions are executed by a computer, the computer can perform the methods provided by the method embodiments above, for example: extracting deep high-dimensional features of the pavement area of an aerial pavement image and obtaining a high-dimensional feature map from them; screening positive and negative samples of the high-dimensional feature map based on the deep high-dimensional features of the pavement area, so as to distinguish pavement crack targets from the pavement background; and classifying the pavement crack targets and locating their coordinates to obtain their classification information and coordinate information.

An embodiment of the present invention provides a non-transitory computer-readable storage medium that stores computer instructions. The computer instructions cause a computer to perform the methods provided by the method embodiments above, for example: extracting deep high-dimensional features of the pavement area of an aerial pavement image and obtaining a high-dimensional feature map from them; screening positive and negative samples of the high-dimensional feature map based on the deep high-dimensional features of the pavement area, so as to distinguish pavement crack targets from the pavement background; and classifying the pavement crack targets and locating their coordinates to obtain their classification information and coordinate information.

本领域普通技术人员可以理解：实现上述设备实施例或方法实施例仅仅是示意性的，其中所述处理器和所述存储器可以是物理上分离的部件也可以不是物理上分离的，即可以位于一个地方，或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部模块来实现本实施例方案的目的。本领域普通技术人员在不付出创造性的劳动的情况下，即可以理解并实施。Those of ordinary skill in the art can understand that the implementation of the above device embodiments or method embodiments is merely illustrative, wherein the processor and the memory may be physically separated components or may not be physically separated, that is, they may be located in One place, or it can be distributed over multiple network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution in this embodiment. Those of ordinary skill in the art can understand and implement it without creative effort.

通过以上的实施方式的描述，本领域的技术人员可以清楚地了解到各实施方式可借助软件加必需的通用硬件平台的方式来实现，当然也可以通过硬件。基于这样的理解，上述技术方案本质上或者说对现有技术做出贡献的部分可以以软件产品的形式体现出来，该计算机软件产品可以存储在计算机可读存储介质中，如U盘、移动硬盘、ROM/RAM、磁碟、光盘等，包括若干指令用以使得一台计算机设备(可以是个人计算机，服务器，或者网络设备等)执行各个实施例或者实施例的某些部分所述的方法。From the description of the above embodiments, those skilled in the art can clearly understand that each embodiment can be implemented by means of software plus a necessary general hardware platform, and certainly can also be implemented by hardware. Based on this understanding, the above-mentioned technical solutions can be embodied in the form of software products in essence or the parts that make contributions to the prior art, and the computer software products can be stored in computer-readable storage media, such as U disk, mobile hard disk , ROM/RAM, magnetic disk, optical disk, etc., including several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to perform the methods described in various embodiments or parts of embodiments.

Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments can still be modified, or some of their technical features can be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. An aerial image pavement crack detection method, characterized in that it comprises:

extracting deep high-dimensional features of the pavement area of an aerial road surface image, and obtaining a high-dimensional feature map according to the deep high-dimensional features;

screening positive and negative samples of the high-dimensional feature map based on the deep high-dimensional features of the pavement area, so as to distinguish the pavement crack target from the pavement background;

classifying and coordinate positioning the pavement crack target to obtain classification information and coordinate information of the pavement crack target;

wherein the extracting of deep high-dimensional features of the pavement area of the aerial road surface image, and the obtaining of a high-dimensional feature map according to the deep high-dimensional features, specifically comprise:

constructing a feature extraction network using a convolutional neural network, and adding to the feature extraction network a rough road segmentation layer based on the K-means clustering algorithm;

using the rough road segmentation layer to filter out the invalid roadside area of the aerial road surface image, so as to obtain the pavement area of the aerial road surface image;

using the feature extraction network to combine the low-dimensional features of the pavement area into high-dimensional features, so as to obtain the high-dimensional feature map;

the screening of positive and negative samples of the high-dimensional feature map based on the deep high-dimensional features of the pavement area, so as to distinguish the pavement crack target from the pavement background, specifically comprises:

based on the deep high-dimensional features of the pavement area, traversing the high-dimensional feature map with an anchor sliding window to obtain candidate sample frames with a preset area scale and a preset aspect ratio;

training on the candidate sample frames using the classification loss function, localization loss function and multi-task loss function of the region proposal network of the feature extraction network, so as to screen positive and negative samples and obtain pavement crack target samples and pavement background samples, wherein the positive samples are the pavement crack target samples and the negative samples are the pavement background samples;

the classifying and coordinate positioning of the pavement crack target to obtain classification information and coordinate information of the pavement crack target specifically comprises:

using the ROI pooling layer of the feature extraction network to normalize the positive and negative samples screened by the region proposal network into feature maps of uniform size, and performing classification output to obtain classification information covering transverse cracks, longitudinal cracks and pavement background;

using the classification loss function to classify the specific crack category of the positive samples, and performing frame regression to correct the coordinate information of the crack target frame.
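The anchor sliding-window step of claim 1 can be illustrated with a minimal sketch. It assumes that an "area scale" s means a box of area s², that the aspect ratio is height divided by width, and that each feature-map cell maps back to the image with a fixed stride. The function name `make_anchors`, the stride, and the example scales and ratios are hypothetical and are not taken from the patent.

```python
import math

def make_anchors(feat_h, feat_w, stride, scales, ratios):
    """Return anchor boxes (x1, y1, x2, y2) in image pixels.

    Assumption: one anchor per (cell, scale, ratio); ratio = height / width.
    """
    anchors = []
    for row in range(feat_h):
        for col in range(feat_w):
            # Centre of this feature-map cell projected back onto the image.
            cx = (col + 0.5) * stride
            cy = (row + 0.5) * stride
            for scale in scales:
                for ratio in ratios:
                    # Box area is scale**2 and height / width equals ratio.
                    w = scale / math.sqrt(ratio)
                    h = scale * math.sqrt(ratio)
                    anchors.append((cx - w / 2, cy - h / 2,
                                    cx + w / 2, cy + h / 2))
    return anchors

# Hypothetical settings: a 2 x 3 feature map, stride 16, two scales, three ratios.
anchors = make_anchors(feat_h=2, feat_w=3, stride=16,
                       scales=(64, 128), ratios=(0.5, 1.0, 2.0))
```

Elongated ratios such as 0.5 and 2.0 are a plausible choice here because transverse and longitudinal cracks are long and thin, but the values actually used by the method are not stated in this section.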

2. The method according to claim 1, wherein the classifying and coordinate positioning of the pavement crack target to obtain classification information and coordinate information of the pavement crack target further comprises:

calculating the length of the pavement crack target according to the classification information and coordinate information of the pavement crack target.

3. The method according to claim 1, wherein the positive and negative samples are divided on the following basis:

dividing into positive samples the candidate sample frames whose IOU with any calibration sample frame is greater than a first preset threshold, together with the 2 candidate sample regions having the largest IOU with the remaining calibration sample frames;

dividing into negative samples the candidate sample frames, other than the positive samples, whose IOU with the calibration sample frames is less than a second preset threshold;

wherein

IOU = (candidate sample frame ∩ calibration sample frame) / (candidate sample frame ∪ calibration sample frame).
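The IOU formula and the division rule of claim 3 can be sketched as follows. This is one possible reading rather than the patented implementation: "remaining calibration sample frames" is taken to mean calibration frames that no candidate overlaps above the first threshold, and candidates that end up neither positive nor negative are left unused. The names `iou` and `label_candidates` and the threshold values 0.7 and 0.3 are assumptions.

```python
def iou(a, b):
    """IOU of two frames given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def label_candidates(candidates, calibrated, pos_thr=0.7, neg_thr=0.3, top_k=2):
    """Return one label per candidate: 1 positive, 0 negative, -1 unused."""
    scores = [[iou(c, g) for g in calibrated] for c in candidates]
    labels = [-1] * len(candidates)
    # Rule 1: IOU with any calibration frame above the first threshold.
    for i, row in enumerate(scores):
        if row and max(row) > pos_thr:
            labels[i] = 1
    # Rule 2 (assumed reading): a calibration frame matched by no candidate
    # keeps its top_k best-overlapping candidates as positives.
    for j in range(len(calibrated)):
        if any(scores[i][j] > pos_thr for i in range(len(candidates))):
            continue
        ranked = sorted(range(len(candidates)),
                        key=lambda i: scores[i][j], reverse=True)
        for i in ranked[:top_k]:
            if scores[i][j] > 0:
                labels[i] = 1
    # Rule 3: non-positive candidates below the second threshold are negatives.
    for i, row in enumerate(scores):
        if labels[i] != 1 and (not row or max(row) < neg_thr):
            labels[i] = 0
    return labels

# Hypothetical frames: two calibration frames and four candidates.
calibrated = [(0, 0, 10, 10), (100, 100, 110, 110)]
candidates = [(0, 0, 10, 10), (0, 0, 10, 8), (50, 50, 60, 60), (100, 100, 104, 110)]
labels = label_candidates(candidates, calibrated)
```

In this example the last candidate overlaps the second calibration frame with an IOU of only 0.4, yet it is still kept as a positive because that frame would otherwise have no positive sample at all.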

4. The method according to claim 2, wherein the calculating of the length of the pavement crack target according to the classification information and coordinate information of the pavement crack target specifically comprises:

extracting, according to the classification information and coordinate information of the pavement crack target, the single-pixel skeleton of the crack target by the morphological hit-or-miss transform, and calculating the pixel length of the crack target, the crack target comprising transverse cracks and longitudinal cracks;

converting the pixel length of the crack target into the length of the crack target according to the pixel coordinates of the aerial road surface image and the actual road length of the pavement.
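The length calculation of claim 4 can be sketched under simplifying assumptions. The single-pixel skeleton is assumed to have been extracted already (the hit-or-miss thinning itself is not reproduced here), the pixel length is approximated by summing links between 8-connected skeleton pixels, and the conversion assumes that a known physical road length spans a known number of pixels in the image. The function names and the numbers in the example are hypothetical.

```python
import math

def skeleton_pixel_length(skeleton):
    """Approximate length, in pixels, of a binary single-pixel skeleton.

    Orthogonal links count 1 and diagonal links count sqrt(2).
    """
    rows, cols = len(skeleton), len(skeleton[0])

    def on(r, c):
        return 0 <= r < rows and 0 <= c < cols and skeleton[r][c] == 1

    length = 0.0
    for r in range(rows):
        for c in range(cols):
            if not on(r, c):
                continue
            # Count each link once by looking only right and downwards.
            length += on(r, c + 1) + on(r + 1, c)
            # Skip a diagonal link when an orthogonal path already joins the pixels.
            if on(r + 1, c + 1) and not (on(r, c + 1) or on(r + 1, c)):
                length += math.sqrt(2)
            if on(r + 1, c - 1) and not (on(r, c - 1) or on(r + 1, c)):
                length += math.sqrt(2)
    return length

def pixels_to_metres(pixel_length, road_length_m, road_length_px):
    """Scale a pixel length by the ratio of real road length to its pixel extent."""
    return pixel_length * road_length_m / road_length_px

# Hypothetical example: a 5-pixel transverse crack, with 30 m of road spanning 600 px.
transverse = [[0, 0, 0, 0, 0],
              [1, 1, 1, 1, 1]]
crack_px = skeleton_pixel_length(transverse)
crack_m = pixels_to_metres(crack_px, road_length_m=30.0, road_length_px=600)
```

Counting links rather than pixels keeps diagonal runs from being underestimated, which matters for cracks that are not aligned with the image axes.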

5. An aerial image pavement crack detection device, characterized in that it comprises:

a high-dimensional feature map module, used for extracting deep high-dimensional features of the pavement area of an aerial road surface image, and obtaining a high-dimensional feature map according to the deep high-dimensional features;

a crack identification module, used for screening positive and negative samples of the high-dimensional feature map based on the deep high-dimensional features of the pavement area, so as to distinguish the pavement crack target from the pavement background; and

a classification and positioning module, used for classifying and coordinate positioning the pavement crack target, and obtaining classification information and coordinate information of the pavement crack target;

Wherein, the high-dimensional feature map module is also used for,

The classification and positioning module is also used for:

6. Aerial image pavement crack detection equipment, characterized in that it comprises:

at least one processor; and

at least one memory communicatively coupled to the processor, wherein:

the memory stores program instructions executable by the processor, and the processor invokes the program instructions to perform the method according to any one of claims 1 to 4.

7. A non-transitory computer-readable storage medium, wherein the non-transitory computer-readable storage medium stores computer instructions, and the computer instructions cause a computer to execute the method according to any one of claims 1 to 4.