
CN114882367B - A method for detecting and evaluating airport pavement defects - Google Patents

A method for detecting and evaluating airport pavement defects

Info

Publication number
CN114882367B
CN114882367B
Authority
CN
China
Prior art keywords
depth
video frame
image
size
map
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210590510.3A
Other languages
Chinese (zh)
Other versions
CN114882367A (en)
Inventor
姜晓燕
王柏涵
方志军
黄哲栩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai University of Engineering Science
Original Assignee
Shanghai University of Engineering Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai University of Engineering Science filed Critical Shanghai University of Engineering Science
Priority to CN202210590510.3A priority Critical patent/CN114882367B/en
Publication of CN114882367A publication Critical patent/CN114882367A/en
Application granted granted Critical
Publication of CN114882367B publication Critical patent/CN114882367B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes
    • G06V20/176 Urban or other man-made structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/41 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

The present invention relates to an airport pavement defect detection and condition assessment method, comprising the following steps: S1: determine the target airport pavement range, capture video of the target airport pavement, and obtain video frame images; S2: input all video frame images into a depth estimation network to obtain the depth map corresponding to each video frame image; S3: stitch all video frame images into a full-size pavement RGB map, and stitch the depth maps into a full-size pavement depth map; S4: process the full-size pavement RGB map and the full-size pavement depth map into local RGB maps and local grayscale maps, respectively, and input them into a semantic segmentation network to obtain segmentation result mask maps; S5: stitch the segmentation result mask maps into a full-size defect mask map, combine it with the full-size pavement depth map to judge the severity of the different defects of the target airport pavement, and then evaluate the pavement quality condition according to the pavement condition index. Compared with the prior art, the invention achieves efficient and accurate evaluation of airport pavement conditions.

Description

A method for detecting and evaluating airport pavement defects

Technical Field

The invention relates to the technical field of computer image segmentation, and in particular to an airport pavement defect detection and condition assessment method.

Background Art

Due to heavy passenger, cargo, and mail traffic, airport pavements inevitably develop various distresses. The pavement condition is calculated according to the severity of these distresses, and the resulting Pavement Condition Index (PCI) is used to prompt and assist pavement inspection, so as to ensure safe aircraft takeoff and landing.

The Pavement Condition Index (PCI) is an indicator for evaluating the degree of pavement damage. It combines quantitative measures of three aspects: the type of pavement defects, the degree of damage, and the extent or density of damage. The pavement damage level assessment criteria are shown in Table 1.

Table 1 Pavement structure damage level assessment standard

Pavement damage level | Poor    | Fair     | Medium   | Good     | Excellent
PCI range             | [0, 40) | [40, 55) | [55, 70) | [70, 85) | [85, 100)

Traditional airport pavement defect detection requires on-site measurement by professionals, which has the following problems:

(1) Because the airport pavement carries a large number of takeoffs and landings, the visual contrast of the pavement surface is relatively low, which increases the difficulty of detection and limits detection efficiency;

(2) Because aircraft takeoff and landing activities must be suspended during manual inspection, the inspection cost is further increased.

In recent years, deep learning has developed rapidly, and replacing manual inspection with deep-learning-based methods has become one of the trends in airport pavement defect detection. The current mainstream approach is to collect 2D image data and detect airport pavement defects with neural-network-based semantic segmentation. However, such methods can only distinguish different defects by their surface appearance; for defects with small differences in texture features but large differences in depth features (for example, cracks versus scratches), detection accuracy is poor.

Therefore, a method that purposefully fuses 2D and 3D pavement features is needed to improve the efficiency and accuracy of airport pavement defect detection. However, synchronously acquiring RGB images and depth images easily introduces errors, and annotating two data modalities simultaneously requires high labor costs.

Summary of the Invention

The purpose of the present invention is to overcome the above-mentioned defects of the prior art by providing an airport pavement defect detection and condition assessment method that enables efficient and accurate evaluation of airport pavement conditions.

The purpose of the present invention can be achieved by the following technical solutions:

The present invention provides an airport pavement defect detection and condition assessment method, comprising the following steps:

S1: Determine the target airport pavement range, capture video of the target airport pavement, and extract video frames to obtain video frame images;

S2: Input all video frame images obtained in S1 into a depth estimation network in sequence to obtain the depth map corresponding to each video frame image;

S3: Stitch all video frame images obtained in S1 into a full-size pavement RGB map, and stitch the depth maps obtained in S2 into a full-size pavement depth map;

S4: Process the full-size pavement RGB map and the full-size pavement depth map obtained in S3 into local RGB maps and local grayscale maps, respectively, and input them into a semantic segmentation network to obtain segmentation result mask maps;

S5: Stitch the segmentation result mask maps obtained in S4 into a full-size defect mask map, combine it with the full-size pavement depth map obtained in S3 to judge the severity of the different defects of the target airport pavement, and then evaluate the pavement quality condition according to the pavement condition index.

Preferably, the depth estimation network in S2 is a PackNet-based monocular unsupervised depth estimation network, which includes a PackNet network branch for generating depth information from 2D images and a Pose ConvNet network branch for generating pose information between adjacent video frame images.

Preferably, S2 comprises the following steps:

S2.1: Input the video frame image I_d at time d into the PackNet network branch to generate an initial depth map of I_d, and input the video frame images I_s and I_d at times s and d into the Pose ConvNet network branch to generate the pose information between the video frame images at times s and d;

S2.2: Obtain the reconstructed image Î_d at time d from the initial depth map of I_d obtained in S2.1, the pose information between the video frame images at times s and d, and the video frame image I_s at time s;

S2.3: Compute the loss function of the PackNet network branch from the reconstructed image Î_d at time d obtained in S2.2 and the video frame image I_d at time d, and compute the loss function of the Pose ConvNet network branch from the translation vector t_{d->s}, the instantaneous speed v of the monocular camera at time d, and the time difference Δt_{d->s} between times s and d, so that the depth estimation network updates itself iteratively and generates the depth map corresponding to the video frame image at time d;

S2.4: Repeat S2.1 to S2.3 to obtain the depth maps corresponding to all video frame images.

Preferably, S3 comprises the following steps:

S3.1: Extract feature points of two adjacent video frame images, namely the first video frame image and the second video frame image, generate feature descriptors, and perform fast approximate nearest-neighbor matching to obtain the matching points corresponding to the two adjacent video frame images;

S3.2: Eliminate the matching points with large errors in S3.1 to obtain multiple sets of reliable matching point pairs;

S3.3: Decompose the two video frame images in S3.1 to obtain the homography matrix H_12 of the two video frame images;

S3.4: According to the homography matrix H_12, transform the second video frame image onto the pixel plane of the first video frame image and stitch the images to obtain a stitched video frame image, and transform the depth map corresponding to the second video frame image onto the pixel plane of the depth map corresponding to the first video frame image and stitch them to obtain a stitched depth map;

S3.5: Perform the operations of S3.1 to S3.4 on all video frame images obtained in S1 and on their corresponding depth maps obtained in S2, thereby obtaining the full-size pavement RGB map and the full-size pavement depth map.

Preferably, the semantic segmentation network in S4 is a U-Net semantic segmentation network with feature-layer fusion.

Preferably, the semantic segmentation network in S4 includes an encoder and a decoder; the encoder includes an RGB encoder and a depth encoder, the RGB encoder and the depth encoder each include a plurality of encoding layers, and each encoding layer includes a convolution layer, a pooling layer, and an activation layer; the decoder includes a plurality of decoding layers, and each decoding layer includes a convolution layer, an upsampling layer, and an activation layer.

Preferably, the RGB encoder and the depth encoder each include four encoding layers, and the decoder includes three decoding layers.

Preferably, S4 comprises the following steps:

S4.1: Process the full-size pavement RGB map and the full-size pavement depth map obtained in S3 into multiple local RGB maps and local grayscale maps, respectively;

S4.2: Convolve the local RGB maps and the local grayscale maps separately to obtain first RGB feature maps and first depth feature maps;

S4.3: Input the first RGB feature map into the RGB encoder to obtain a second RGB feature map group, and input the first depth feature map into the depth encoder to obtain a second depth feature map group;

S4.4: Concatenate, along the channel dimension, the feature maps of the same size in the two feature map groups obtained in S4.3, obtaining the corresponding concatenated feature maps in sequence;

S4.5: Use the concatenated feature map with the smallest size in S4.4 as the input of the decoder and, combining the remaining concatenated feature maps in S4.4, decode and output with the decoder to obtain a third feature map;

S4.6: Apply a convolution operation to the third feature map obtained in S4.5 to obtain mask maps of the segmentation results for different types of defects.

Preferably, S5 is specifically:

Stitch the segmentation result mask maps obtained in S4 into a full-size defect mask map, obtain the areas of the different types of defects from the full-size defect mask map, calculate the depth and length of the defects in the full-size defect mask map from the full-size pavement depth map obtained in S3, and then judge the damage severity of the different defects of the target airport pavement according to the pavement condition index.

Preferably, the calculation formula of the pavement condition index PCI is specifically:

PCI = 100 − a0 · DR^a1

where a0 and a1 are pavement material coefficients and DR is the pavement damage rate (%).
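
As an illustrative check of this formula (assuming the reconstructed form given above): for a cement pavement with a0 = 10.66 and a1 = 0.461, a damage rate of DR = 10% gives PCI ≈ 100 − 10.66 × 10^0.461 ≈ 69.2, which falls in the [55, 70) band of Table 1, i.e., a "medium" rating.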

Compared with the prior art, the present invention has the following advantages:

1. The present invention uses a depth estimation network to mine the 3D depth information contained in 2D video frame images, which removes the heavy dependence of the prior art on synchronous multi-modal data acquisition and thus also removes the high labor cost of annotating multiple data modalities simultaneously;

2. The present invention uses an improved U-Net semantic segmentation network to fuse the 2D texture data and the 3D depth data of the pavement at the feature layer, which solves the loss of accuracy caused in the prior art by missing 3D pavement information;

3. Based on the semantic segmentation masks and the estimated depth values, and in combination with the national standard for PCI measurement, the present invention achieves efficient and accurate evaluation of airport pavement conditions.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic flow chart of the airport pavement defect detection and condition assessment method provided by this embodiment;

FIG. 2 is a schematic structural diagram of the acquisition device in the embodiment shown in FIG. 1;

FIG. 3 is a schematic diagram of the network structure of the PackNet-based monocular unsupervised depth estimation network in the embodiment shown in FIG. 1;

FIG. 4 is a schematic diagram of the network structure of the semantic segmentation network in the embodiment shown in FIG. 1;

FIG. 5 is a flow chart of the SIFT-descriptor-based stitching algorithm in the embodiment shown in FIG. 1;

FIG. 6 is a flow chart of pavement defect condition assessment based on the segmentation result mask in the embodiment shown in FIG. 1.

Reference numerals in the figures:

1. acquisition vehicle; 2. fixed bracket; 3. monocular camera; 4. target airport pavement; 5. pavement area captured in a single shot.

DETAILED DESCRIPTION

The present invention is described in detail below with reference to the accompanying drawings and specific embodiments.

Embodiment

This embodiment provides an airport pavement defect detection and condition assessment method, comprising the following steps:

S1: Determine the target airport pavement range, capture video of the target airport pavement with an acquisition device, and extract video frames to obtain video frame images;

As an optional implementation, the target airport pavement range is described by a rectangle uniquely determined by four vertices.

As an optional implementation, referring to FIG. 2, the acquisition device includes an acquisition vehicle 1, a fixed bracket 2, and a monocular camera 3; one end of the fixed bracket 2 is fixed to the acquisition vehicle 1, the monocular camera 3 is mounted at the other end of the fixed bracket 2, and the lens plane of the monocular camera 3 is parallel to the target airport pavement 4.

As an optional implementation, the acquisition device is driven over the target airport pavement 4 at a speed not exceeding 45 km/h while video is recorded. After recording, video frames are extracted at 20 frames per second to obtain video frame images; the coverage of each video frame image equals the size of the pavement area 5 captured in a single shot.

Specifically, the purpose of extracting video frames in this way is to ensure overlapping regions between adjacent video frames, which facilitates the subsequent depth estimation.
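
The following is a minimal frame-extraction sketch, assuming the video was recorded as described above and that OpenCV (cv2) is available; the file names, output directory, and sampling helper are illustrative and not part of the original method.

```python
import cv2
from pathlib import Path

def extract_frames(video_path: str, out_dir: str, target_fps: float = 20.0) -> int:
    """Sample frames from the pavement survey video at roughly `target_fps`."""
    cap = cv2.VideoCapture(video_path)
    if not cap.isOpened():
        raise IOError(f"cannot open video: {video_path}")
    src_fps = cap.get(cv2.CAP_PROP_FPS) or target_fps
    step = max(1, round(src_fps / target_fps))      # keep every `step`-th frame
    Path(out_dir).mkdir(parents=True, exist_ok=True)

    saved, idx = 0, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % step == 0:
            cv2.imwrite(str(Path(out_dir) / f"frame_{saved:06d}.png"), frame)
            saved += 1
        idx += 1
    cap.release()
    return saved

# Example (hypothetical paths): extract_frames("runway_survey.mp4", "frames/", 20.0)
```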

S2: Input all video frame images obtained in S1 into the depth estimation network in sequence to generate the depth map corresponding to each video frame image;

As an optional implementation, the depth estimation network in S2 is a PackNet-based monocular unsupervised depth estimation network. Referring to FIG. 3, the network includes a PackNet network branch for generating depth information from 2D images and a Pose ConvNet network branch for generating pose information between adjacent video frame images.

The PackNet network has an encoder-decoder structure and uses packing and unpacking in place of the conventional downsampling and upsampling operations. In the encoder, residual blocks replace ordinary convolution layers, and packing is used between residual blocks to perform downsampling; after each downsampling operation the spatial size of the feature map is halved. In the decoder, unpacking is used between convolution blocks to perform upsampling; after each upsampling operation the spatial size of the feature map is doubled, finally producing a depth map with the same height and width as the original image.
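
Below is a minimal PyTorch sketch of the packing/unpacking idea (space-to-depth followed by a convolution, and the inverse for unpacking). It only illustrates the general mechanism assumed from the description above; the channel counts, the use of residual and 3D convolution blocks, and other details of the actual PackNet layers may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PackLayer(nn.Module):
    """Downsample by 2 via space-to-depth, then mix channels with a convolution."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.conv = nn.Conv2d(in_ch * 4, out_ch, kernel_size=3, padding=1)

    def forward(self, x):
        x = F.pixel_unshuffle(x, downscale_factor=2)   # (B, 4C, H/2, W/2)
        return self.conv(x)

class UnpackLayer(nn.Module):
    """Upsample by 2 via a convolution followed by depth-to-space."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch * 4, kernel_size=3, padding=1)

    def forward(self, x):
        x = self.conv(x)
        return F.pixel_shuffle(x, upscale_factor=2)    # (B, out_ch, 2H, 2W)

# Shape check: packing halves H and W, unpacking restores them.
x = torch.randn(1, 16, 64, 128)
y = PackLayer(16, 32)(x)       # -> (1, 32, 32, 64)
z = UnpackLayer(32, 16)(y)     # -> (1, 16, 64, 128)
```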

S2.1: Input the video frame image I_d at time d into the PackNet network branch to generate the initial depth map of I_d, and input the video frame images I_s and I_d at times s and d into the Pose ConvNet network branch to generate the pose information between the video frame images at times s and d.

Specifically, the pose information includes a rotation matrix R and a translation vector t.

S2.2: Obtain the reconstructed image Î_d at time d from the initial depth map of I_d obtained in S2.1, the pose information between the video frame images at times s and d, and the video frame image I_s at time s.

S2.3: Compute the loss function of the PackNet network branch from the reconstructed image Î_d at time d obtained in S2.2 and the video frame image I_d at time d, and compute the loss function of the Pose ConvNet network branch from the translation vector t_{d->s}, the instantaneous speed v of the monocular camera at time d, and the time difference Δt_{d->s} between times s and d, so that the depth estimation network updates itself iteratively and generates the depth map corresponding to the video frame image at time d.

Specifically, t_{d->s} is the translation vector t between the video frame image I_s at time s and the video frame image I_d at time d.

Specifically, the loss function of the PackNet network branch is described by the following formula:

L_p(I_d, Î_d) = a · (1 − SSIM(I_d, Î_d)) / 2 + (1 − a) · ||I_d − Î_d||

where SSIM(I_d, Î_d) is the structural similarity between the video frame image I_d at time d and the reconstructed image Î_d at time d, a is a hyperparameter taking a value between 0 and 1, ||I_d − Î_d|| is the norm of the difference between the video frame image I_d at time d and the reconstructed image Î_d at time d, and L_p is the loss function of the PackNet network branch.

Specifically, the loss function of the Pose ConvNet network branch is described by the following formula:

L_v(t_{d->s}, v) = ||t_{d->s}|| − |v| · |Δt_{d->s}|

where ||t_{d->s}|| is the norm of the translation vector between the video frame image I_d at time d and the video frame image I_s at time s, |v| is the magnitude of the instantaneous speed of the monocular camera at time d, and |Δt_{d->s}| is the magnitude of the time difference between time d and time s.
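
A minimal PyTorch sketch of the two loss terms described above follows. The (1 − SSIM)/2 weighting mirrors the photometric-loss form commonly used in self-supervised monocular depth estimation and is an assumption here, as are the precomputed `ssim_map`, the tensor shapes, and the 0.05 weight in the usage comment.

```python
import torch

def photometric_loss(I_d: torch.Tensor, I_d_hat: torch.Tensor,
                     ssim_map: torch.Tensor, a: float = 0.85) -> torch.Tensor:
    """Appearance loss between the frame at time d and its reconstruction.

    ssim_map: per-pixel SSIM(I_d, I_d_hat), assumed precomputed and in [0, 1].
    a: hyperparameter in (0, 1) trading off the SSIM and L1 terms.
    """
    l1 = (I_d - I_d_hat).abs().mean()
    ssim_term = ((1.0 - ssim_map) / 2.0).mean()
    return a * ssim_term + (1.0 - a) * l1

def velocity_loss(t_ds: torch.Tensor, v: float, dt_ds: float) -> torch.Tensor:
    """Velocity supervision: the predicted translation magnitude should match the
    distance |v| * |Δt| travelled by the camera between times s and d.
    The outer absolute value keeps the penalty non-negative (a common choice)."""
    return (t_ds.norm() - abs(v) * abs(dt_ds)).abs()

# Usage with dummy tensors (hypothetical shapes):
# I_d, I_d_hat, s = torch.rand(1, 3, 192, 640), torch.rand(1, 3, 192, 640), torch.rand(1, 1, 192, 640)
# loss = photometric_loss(I_d, I_d_hat, s) + 0.05 * velocity_loss(torch.rand(3), v=10.0, dt_ds=0.05)
```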

S2.4: Repeat S2.1 to S2.3 to obtain the depth maps corresponding to all video frame images.

S3: Stitch all video frame images obtained in S1 into a full-size pavement RGB map, and stitch the depth maps obtained in S2 into a full-size pavement depth map;

As an optional implementation, a stitching algorithm based on SIFT descriptors is used to stitch all video frame images obtained in S1 into the full-size pavement RGB map. The purpose of this step is to remove the overlapping regions between different video frames, avoid inconsistent segmentation results for the same region appearing in different images during semantic segmentation, and reduce the subsequent amount of computation.

Referring to FIG. 5, taking two adjacent video frame images RGB_1 and RGB_2 and their corresponding depth maps Depth_1 and Depth_2 as an example, S3 includes the following steps:

S3.1: Extract the SIFT feature points of the video frame images RGB_1 and RGB_2 respectively, generate SIFT feature descriptors, and perform fast approximate nearest-neighbor matching to obtain the matching points corresponding to the two adjacent video frame images;

S3.2: Use the RANSAC algorithm to eliminate the matching points with large errors in S3.1 and obtain multiple sets of reliable matching point pairs;

S3.3: Use SVD to decompose and obtain the homography matrix H_12 of the video frame images RGB_1 and RGB_2;

S3.4: According to the homography matrix H_12, transform the video frame image RGB_2 onto the pixel plane of RGB_1 and stitch them to obtain the stitched video frame image RGB_12, and transform Depth_2 onto the pixel plane of Depth_1 and stitch them to obtain the stitched depth map Depth_12.

S3.5: Perform the operations of S3.1 to S3.4 on all video frame images obtained in S1 and on their corresponding depth maps obtained in S2, thereby obtaining the full-size pavement RGB map and the full-size pavement depth map.
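
A minimal OpenCV sketch of the pairwise stitching in S3.1-S3.4 follows, assuming an OpenCV build with SIFT support. cv2.findHomography with the RANSAC flag folds the outlier rejection and the SVD-based estimation into a single call; the ratio-test threshold, canvas size, and overlay strategy are illustrative assumptions.

```python
import cv2
import numpy as np

def stitch_pair(rgb1, rgb2, depth1, depth2):
    """Warp frame 2 (RGB and depth) into frame 1's pixel plane and stitch them."""
    sift = cv2.SIFT_create()
    k1, d1 = sift.detectAndCompute(cv2.cvtColor(rgb1, cv2.COLOR_BGR2GRAY), None)
    k2, d2 = sift.detectAndCompute(cv2.cvtColor(rgb2, cv2.COLOR_BGR2GRAY), None)

    # Fast approximate nearest-neighbour matching (FLANN) with Lowe's ratio test.
    flann = cv2.FlannBasedMatcher({"algorithm": 1, "trees": 5}, {"checks": 50})
    good = [m for m, n in flann.knnMatch(d2, d1, k=2) if m.distance < 0.7 * n.distance]

    src = np.float32([k2[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([k1[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)

    # Homography H_12 mapping RGB_2 into RGB_1's plane; RANSAC removes bad matches.
    H12, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

    h, w = rgb1.shape[:2]
    canvas = (w * 2, h)                       # illustrative canvas size
    rgb12 = cv2.warpPerspective(rgb2, H12, canvas)
    depth12 = cv2.warpPerspective(depth2, H12, canvas)
    rgb12[0:h, 0:w] = rgb1                    # keep frame 1 in its own pixel plane
    depth12[0:h, 0:w] = depth1
    return rgb12, depth12
```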

S4: Divide the full-size pavement RGB map and the full-size pavement depth map obtained in S3 into local RGB maps and local grayscale maps, respectively, input them into the semantic segmentation network, and obtain the segmentation result mask maps;

As an optional implementation, the semantic segmentation network is a U-Net semantic segmentation network with feature-layer fusion. Referring to FIG. 4, the network includes an encoder and a decoder; the input of the network is the processed RGB map and the corresponding depth map, and the output of the network is the segmentation result mask map.

Specifically, the encoder includes an RGB encoder and a depth encoder, both of which include multiple encoding layers; each encoding layer includes one convolution layer, one pooling layer, and one activation layer, and after each encoding layer the height and width of the feature map are halved and the number of channels is doubled. The decoder includes multiple decoding layers; each decoding layer includes one convolution layer, one upsampling layer, and one activation layer, and after each decoding layer the height and width of the feature map are doubled and the number of channels is halved.

As an optional implementation, the RGB encoder and the depth encoder each include four encoding layers, and the decoder includes three decoding layers.

Preferably, the full-size pavement RGB map and the full-size pavement depth map are processed into 1024*512*3 local RGB maps and 1024*512*1 local grayscale maps, respectively, and S4 specifically includes the following steps:

S4.1: Process the full-size pavement RGB map and the full-size pavement depth map obtained in S3 into multiple 1024*512*3 local RGB maps and 1024*512*1 local grayscale maps, respectively;

S4.2: Apply a 1*1 convolution to the processed local RGB map and local grayscale map separately to obtain a first RGB feature map and a first depth feature map, both of shape 1024*512*16;

S4.3: Input the first RGB feature map into the RGB encoder to obtain the second RGB feature map group, and input the first depth feature map into the depth encoder to obtain the second depth feature map group;

Specifically, when the first RGB feature map is input into the RGB encoder, one second RGB feature map is obtained after each encoding layer; since there are four encoding layers, the second RGB feature map group contains four second RGB feature maps, and correspondingly the second depth feature map group also contains four second depth feature maps.

The feature maps in the second RGB feature map group and the second depth feature map group have correspondingly equal sizes. From input to output, the sizes of the feature maps in the two groups are: 512*256*32, 256*128*64, 128*64*128, and 64*32*256;

S4.4: Concatenate, along the channel dimension, the feature maps of the same size in the two feature map groups obtained in S4.3 to obtain 4 concatenated feature maps, with shapes 512*256*64, 256*128*128, 128*64*256, and 64*32*512, respectively;

S4.5: Use the concatenated feature map with the smallest size in S4.4 as the input of the decoder and, combining the remaining concatenated feature maps in S4.4, decode and output with the decoder to obtain a third feature map.

Specifically, the 64*32*512 concatenated feature map is input into the decoder. In the first decoding layer, the 64*32*512 concatenated feature map and the 128*64*256 concatenated feature map are processed into a 128*64*512 feature map; the 128*64*512 feature map, combined with the 256*128*128 concatenated feature map, is processed by the second decoding layer into a 256*128*256 feature map; and the 256*128*256 feature map, combined with the 512*256*64 concatenated feature map, is processed by the third decoding layer to finally obtain a third feature map of size 512*256*128.

S4.6: Apply two convolution operations to the third feature map to obtain mask maps of the segmentation results for different types of defects, each of size 1024*512*1.

To fuse the features of different modalities, the RGB map and its corresponding depth map are input into separate encoders, referred to as the RGB encoder and the depth encoder, and the feature maps of the two branches at the different stages are concatenated along the channel dimension.

To fuse features of different scales, the size of the feature map output by the encoder must be kept consistent with the size of the feature map output by the decoder, and the two feature maps are concatenated along the channel dimension.
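
A compact PyTorch sketch of this dual-encoder, feature-layer-fusion U-Net is given below. The layer composition (convolution + pooling/upsampling + activation), channel progression, and number of encoding/decoding layers follow the description above; the kernel sizes, the bilinear upsampling, the final upsampling to full resolution, and the number of output defect classes are assumptions for illustration only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def enc_layer(in_ch, out_ch):
    # conv + pooling + activation: halves H, W and doubles the channel count
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=1),
                         nn.MaxPool2d(2), nn.ReLU(inplace=True))

class DualEncoderUNet(nn.Module):
    def __init__(self, n_classes: int = 1):
        super().__init__()
        self.rgb_in = nn.Conv2d(3, 16, 1)   # 1x1 conv: 1024*512*3 -> 1024*512*16
        self.dep_in = nn.Conv2d(1, 16, 1)   # 1x1 conv: 1024*512*1 -> 1024*512*16
        chs = [16, 32, 64, 128, 256]
        self.rgb_enc = nn.ModuleList(enc_layer(chs[i], chs[i + 1]) for i in range(4))
        self.dep_enc = nn.ModuleList(enc_layer(chs[i], chs[i + 1]) for i in range(4))
        # three decoding layers: conv + upsampling + activation
        self.dec = nn.ModuleList([
            nn.Conv2d(512 + 256, 512, 3, padding=1),  # (64*32*512 cat 128*64*256) -> 128*64*512
            nn.Conv2d(512 + 128, 256, 3, padding=1),  # (... cat 256*128*128)      -> 256*128*256
            nn.Conv2d(256 + 64, 128, 3, padding=1),   # (... cat 512*256*64)       -> 512*256*128
        ])
        self.head = nn.Sequential(                    # two final convolutions -> mask
            nn.Conv2d(128, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, n_classes, 1))

    def forward(self, rgb, depth):
        r, d = self.rgb_in(rgb), self.dep_in(depth)
        fused = []                                    # channel-wise fusion per stage
        for re, de in zip(self.rgb_enc, self.dep_enc):
            r, d = re(r), de(d)
            fused.append(torch.cat([r, d], dim=1))    # 64, 128, 256, 512 channels
        x = fused[-1]                                 # smallest concatenated map
        for conv, skip in zip(self.dec, reversed(fused[:-1])):
            x = F.interpolate(x, scale_factor=2, mode="bilinear", align_corners=False)
            x = F.relu(conv(torch.cat([x, skip], dim=1)))
        x = F.interpolate(x, scale_factor=2, mode="bilinear", align_corners=False)
        return self.head(x)                           # full-resolution defect mask

# Shape check (H=512, W=1024 in tensor layout):
# net = DualEncoderUNet(n_classes=1)
# mask = net(torch.randn(1, 3, 512, 1024), torch.randn(1, 1, 512, 1024))  # -> (1, 1, 512, 1024)
```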

S5: Stitch the segmentation result mask maps obtained in S4 into a full-size defect mask map, combine it with the full-size pavement depth map obtained in S3 to judge the severity of the different defects of the target airport pavement, and then evaluate the pavement quality condition according to the pavement condition index.

Specifically, referring to FIG. 6, the areas of the different types of defects are obtained from the full-size defect mask map, and the depth and length of the defects in the full-size defect mask map are calculated from the full-size pavement depth map obtained in S3, so as to judge the damage severity of the different defects of the target airport pavement.
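
A minimal NumPy sketch of extracting per-defect measurements from the mask and depth map is given below. The pixel-to-metre scale factors, the use of the maximum depth within a region as its representative depth, and the bounding-box side as the length are simplifying assumptions for illustration; the actual severity grading would follow the criteria referenced in Tables 2 and 3.

```python
import numpy as np

def measure_defects(defect_mask: np.ndarray, depth_map: np.ndarray,
                    pixel_area_m2: float, pixel_len_m: float) -> dict:
    """Per-class area (m^2), representative depth, and length from a full-size mask.

    defect_mask: integer label map (0 = intact pavement, k > 0 = defect class k).
    depth_map:   full-size pavement depth map aligned with the mask.
    """
    stats = {}
    for k in np.unique(defect_mask):
        if k == 0:
            continue
        region = defect_mask == k
        ys, xs = np.nonzero(region)
        stats[int(k)] = {
            "area_m2":  float(region.sum() * pixel_area_m2),
            "depth":    float(depth_map[region].max()),               # deepest point in the region
            "length_m": float(max(ys.ptp(), xs.ptp()) * pixel_len_m), # longest bounding-box side
        }
    return stats
```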

The calculation formula of the pavement condition index PCI is specifically:

PCI = 100 − a0 · DR^a1

where a0 and a1 are pavement material coefficients: when the target airport pavement is a cement pavement, a0 = 10.66 and a1 = 0.461; when the target airport pavement is an asphalt pavement, a0 = 15 and a1 = 0.412. DR is the pavement damage rate (%), i.e., the sum of the converted (weighted) damaged areas of the different defect types as a percentage of the total pavement area. The calculation formula of the pavement damage rate is:

DR = 100 × ( Σ_{i=1}^{i0} wi · Ai ) / A

where i0 is the total number of defect categories: when the target airport pavement is a cement pavement, i0 = 15; when the target airport pavement is an asphalt pavement, i0 = 16; A is the total area of the target pavement (square meters);

Ai is the area of the i-th type of defect (square meters);

wi is the weight of the i-th type of defect; its values for cement pavement and asphalt pavement, and the criteria for judging the degree of damage, are shown in Table 2 and Table 3, respectively:

Table 2 Cement pavement defect weights

Table 3 Asphalt pavement defect weights
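
A short Python sketch of the PCI computation described above is given below, assuming the PCI = 100 − a0 · DR^a1 form reconstructed earlier; the example defect areas and weights are purely illustrative and are not taken from Tables 2 and 3.

```python
def damage_rate(areas_m2, weights, total_area_m2):
    """DR (%): weighted (converted) damaged area as a percentage of total pavement area."""
    return 100.0 * sum(w * a for w, a in zip(weights, areas_m2)) / total_area_m2

def pavement_condition_index(dr_percent, material="cement"):
    """PCI = 100 - a0 * DR^a1, with the material coefficients given in the description."""
    a0, a1 = (10.66, 0.461) if material == "cement" else (15.0, 0.412)
    return 100.0 - a0 * dr_percent ** a1

# Illustrative example (hypothetical defect areas and weights for three defect types):
dr = damage_rate(areas_m2=[12.0, 3.5, 0.8], weights=[1.0, 0.8, 0.6], total_area_m2=5000.0)
pci = pavement_condition_index(dr, material="cement")   # grading then follows the Table 1 bands
```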

The preferred specific embodiments of the present invention have been described in detail above. It should be understood that a person of ordinary skill in the art can make many modifications and changes according to the concept of the present invention without creative effort. Therefore, any technical solution that a person skilled in the art can obtain on the basis of the prior art through logical analysis, reasoning, or limited experimentation according to the concept of the present invention shall fall within the scope of protection defined by the claims.

Claims (4)

1. An airport pavement defect detection and state assessment method is characterized by comprising the following steps:
S1: determining a target airport pavement range, acquiring a target airport pavement video, extracting video frames, and acquiring video frame images;
S2: sequentially inputting all the video frame images acquired in S1 into a depth estimation network to acquire depth maps corresponding to all the video frame images;
S3: stitching all video frame images acquired in S1 into a full-size pavement RGB map, and stitching the depth maps acquired in S2 into a full-size pavement depth map;
S4: processing the full-size pavement RGB map and the full-size pavement depth map obtained in S3 into local RGB maps and local grayscale maps respectively, inputting them into a semantic segmentation network, and obtaining segmentation result mask maps;
S5: stitching the segmentation result mask maps obtained in S4 into a full-size defect mask map, judging the severity of different defects of the target airport pavement by combining the full-size pavement depth map obtained in S3, and evaluating the pavement quality according to the pavement condition index;
the depth estimation network in S2 is a PackNet-based monocular unsupervised depth estimation network, wherein the monocular unsupervised depth estimation network comprises a PackNet network branch used for generating 2D image depth information and a Pose ConvNet network branch used for generating pose information between adjacent video frame images;
wherein the PackNet network comprises an encoding-decoding structure and uses packing and unpacking to replace the traditional upsampling and downsampling operations; in the encoding part, residual blocks are adopted to replace ordinary convolution layers, packing is used between residual blocks to realize the downsampling operation, and the size of the feature map is halved after each downsampling operation; in the decoding part, unpacking is adopted between convolution blocks to realize the upsampling operation, the size of the feature map is doubled after each upsampling operation, and finally a depth information map with the same length and width as the original image is generated;
S2.1: inputting the video frame image I_d at time d into the PackNet network branch to generate an initial depth map of I_d, and inputting the video frame images I_s and I_d at times s and d into the Pose ConvNet network branch to generate pose information between the video frame images at times s and d;
specifically, the pose information includes a rotation matrix R and a translation vector t;
S2.2: acquiring the reconstructed image Î_d at time d according to the initial depth map of I_d acquired in S2.1, the pose information between the video frame images at times s and d, and the video frame image I_s at time s;
S2.3: calculating the loss function of the PackNet network branch according to the reconstructed image Î_d at time d acquired in S2.2 and the video frame image I_d at time d, and acquiring the loss function of the Pose ConvNet network branch according to the translation vector t_{d->s}, the instantaneous speed v of the monocular camera at time d, and the time difference Δt_{d->s} between time s and time d, so that the depth estimation network automatically updates and iterates to generate the depth map corresponding to the video frame image at time d;
specifically, t_{d->s} is the translation vector t between the video frame image I_s at time s and the video frame image I_d at time d;
specifically, the formula describing the loss function of the PackNet network branch is:
L_p(I_d, Î_d) = a · (1 − SSIM(I_d, Î_d)) / 2 + (1 − a) · ||I_d − Î_d||
wherein SSIM(I_d, Î_d) is the structural similarity between the video frame image I_d at time d and the reconstructed image Î_d at time d, a is a hyperparameter with a value between 0 and 1, ||I_d − Î_d|| is the norm of the difference between the video frame image I_d at time d and the reconstructed image Î_d at time d, and L_p is the loss function of the PackNet network branch;
specifically, the formula describing the loss function of the Pose ConvNet network branch is:
L_v(t_{d->s}, v) = ||t_{d->s}|| − |v| · |Δt_{d->s}|
wherein ||t_{d->s}|| is the norm of the translation vector between the video frame image I_d at time d and the video frame image I_s at time s, |v| is the magnitude of the instantaneous speed of the monocular camera at time d, and |Δt_{d->s}| is the magnitude of the time difference between time d and time s;
S2.4: repeating S2.1 to S2.3 to obtain the depth maps corresponding to all video frame images;
the semantic segmentation network in S4 is a U-Net semantic segmentation network with feature-layer fusion; the network comprises an encoder and a decoder, the input of the network is the processed RGB map and the corresponding depth map, and the output of the network is the segmentation result mask map;
specifically, the encoder comprises an RGB encoder and a depth encoder, each comprising a plurality of encoding layers, each encoding layer comprising a convolution layer, a pooling layer and an activation layer, the height and width of the feature map being halved and the number of channels doubled after each encoding layer; the decoder comprises a plurality of decoding layers, each decoding layer comprising a convolution layer, an upsampling layer and an activation layer, the height and width of the feature map being doubled and the number of channels halved after each decoding layer;
the RGB encoder and the depth encoder each comprise four encoding layers, and the decoder comprises three decoding layers;
the full-size pavement RGB map and the full-size pavement depth map are processed into 1024 x 512 x 3 local RGB maps and 1024 x 512 x 1 local grayscale maps respectively, wherein S4 specifically comprises the following steps:
S4.1: processing the full-size pavement RGB map and the full-size pavement depth map obtained in S3 into a plurality of 1024 x 512 x 3 local RGB maps and 1024 x 512 x 1 local grayscale maps respectively;
S4.2: performing a 1 x 1 convolution on the processed local RGB map and local grayscale map respectively to obtain a first RGB feature map and a first depth feature map, both of shape 1024 x 512 x 16;
S4.3: inputting the first RGB feature map into the RGB encoder to obtain a second RGB feature map group, and inputting the first depth feature map into the depth encoder to obtain a second depth feature map group;
specifically, when the first RGB feature map is input into the RGB encoder, one second RGB feature map is obtained after each encoding layer, and since four encoding layers are provided, the second RGB feature map group contains four second RGB feature maps; correspondingly, the second depth feature map group also contains four second depth feature maps;
the feature maps of the second RGB feature map group and the second depth feature map group have correspondingly equal sizes; from input to output, the sizes of the feature maps in the two groups are: 512 x 256 x 32, 256 x 128 x 64, 128 x 64 x 128, and 64 x 32 x 256;
S4.4: concatenating, along the channel dimension, the feature maps of the same size in the two feature map groups obtained in S4.3 to obtain 4 concatenated feature maps, the shapes of the concatenated feature maps being 512 x 256 x 64, 256 x 128 x 128, 128 x 64 x 256, and 64 x 32 x 512 respectively;
S4.5: taking the concatenated feature map with the smallest size in S4.4 as the input of the decoder, and combining the remaining concatenated feature maps in S4.4, decoding and outputting by the decoder to obtain a third feature map;
specifically, the concatenated feature map of size 64 x 32 x 512 is input into the decoder; in the first decoding layer, the concatenated feature map of size 64 x 32 x 512 and the concatenated feature map of size 128 x 64 x 256 are processed into a feature map of size 128 x 64 x 512; the feature map of size 128 x 64 x 512, combined with the concatenated feature map of size 256 x 128 x 128, is processed by the second decoding layer into a feature map of size 256 x 128 x 256; and the feature map of size 256 x 128 x 256, combined with the concatenated feature map of size 512 x 256 x 64, is processed by the third decoding layer to finally obtain a third feature map of size 512 x 256 x 128;
S4.6: performing two convolution operations on the third feature map to obtain mask maps of the segmentation results for different types of defects, each of size 1024 x 512 x 1;
in order to ensure feature fusion of different modalities, the RGB map and its corresponding depth map are respectively input into different encoders, namely the RGB encoder and the depth encoder, and the feature maps of the two branches at different stages are concatenated along the channel dimension;
in order to ensure feature fusion of different scales, the size of the feature map output by encoding is kept consistent with the size of the feature map output by decoding, and the two feature maps are concatenated along the channel dimension.
2. The method for detecting and evaluating airport pavement defects according to claim 1, wherein S3 comprises the following steps:
S3.1: respectively extracting feature points of two adjacent video frame images, namely a first video frame image and a second video frame image, generating feature descriptors, and performing fast approximate nearest-neighbor matching to obtain matching points corresponding to the two adjacent video frame images;
S3.2: removing the matching points with large errors in S3.1 to obtain a plurality of groups of reliable matching point pairs;
S3.3: decomposing the two video frame images in S3.1 to obtain the homography matrix H_12 of the two video frame images;
S3.4: according to the homography matrix H_12, converting the second video frame image to the pixel plane of the first video frame image for image stitching to obtain a stitched video frame image, and converting the depth map corresponding to the second video frame image to the pixel plane of the depth map corresponding to the first video frame image for image stitching to obtain a stitched depth map;
S3.5: performing the operations of S3.1 to S3.4 on all the video frame images acquired in S1 and on all their corresponding depth maps acquired in S2, so as to acquire the full-size pavement RGB map and the full-size pavement depth map.
3. The method for detecting and evaluating airport pavement defects according to claim 1, wherein S5 is specifically:
stitching the segmentation result mask maps obtained in S4 into a full-size defect mask map, obtaining the areas of different types of defects according to the full-size defect mask map, calculating the depth and length of the defects in the full-size defect mask map according to the full-size pavement depth map obtained in S3, and judging the damage severity of different defects of the target airport pavement according to the pavement condition index.
4. The method for detecting and evaluating airport pavement defects according to claim 3, wherein the calculation formula of the pavement condition index PCI is specifically:
PCI = 100 − a0 · DR^a1
where a0 and a1 are pavement material coefficients and DR is the pavement damage rate (%).
CN202210590510.3A 2022-05-26 2022-05-26 A method for detecting and evaluating airport pavement defects Active CN114882367B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210590510.3A CN114882367B (en) 2022-05-26 2022-05-26 A method for detecting and evaluating airport pavement defects

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210590510.3A CN114882367B (en) 2022-05-26 2022-05-26 A method for detecting and evaluating airport pavement defects

Publications (2)

Publication Number Publication Date
CN114882367A CN114882367A (en) 2022-08-09
CN114882367B true CN114882367B (en) 2024-09-27

Family

ID=82676883

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210590510.3A Active CN114882367B (en) 2022-05-26 2022-05-26 A method for detecting and evaluating airport pavement defects

Country Status (1)

Country Link
CN (1) CN114882367B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115272300B (en) * 2022-09-20 2023-01-06 中电信数字城市科技有限公司 Pavement disease detection method, system, device, equipment and medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113255678A (en) * 2021-06-17 2021-08-13 云南航天工程物探检测股份有限公司 Road crack automatic identification method based on semantic segmentation
CN113393522A (en) * 2021-05-27 2021-09-14 湖南大学 6D pose estimation method based on monocular RGB camera regression depth information

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013090830A1 (en) * 2011-12-16 2013-06-20 University Of Southern California Autonomous pavement condition assessment
CN111739078B (en) * 2020-06-15 2022-11-18 大连理工大学 A Monocular Unsupervised Depth Estimation Method Based on Contextual Attention Mechanism

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113393522A (en) * 2021-05-27 2021-09-14 湖南大学 6D pose estimation method based on monocular RGB camera regression depth information
CN113255678A (en) * 2021-06-17 2021-08-13 云南航天工程物探检测股份有限公司 Road crack automatic identification method based on semantic segmentation

Also Published As

Publication number Publication date
CN114882367A (en) 2022-08-09

Similar Documents

Publication Publication Date Title
CN109190752B (en) Image Semantic Segmentation Based on Deep Learning Global and Local Features
CN111768388B (en) A product surface defect detection method and system based on positive sample reference
CN113240623B (en) Pavement disease detection method and device
CN111415329A (en) A detection method for workpiece surface defects based on deep learning
CN111222519B (en) Construction method, method and device of hierarchical colored drawing manuscript line extraction model
CN107316064A (en) A kind of asphalt pavement crack classifying identification method based on convolutional neural networks
CN116563237B (en) A hyperspectral image detection method for chicken carcass defects based on deep learning
CN111986170A (en) Defect detection algorithm based on Mask R-CNN (deep neural network)
CN113763384B (en) Defect detection method and defect detection device in industrial quality inspection
CN115830004A (en) Surface defect detection method, device, computer equipment and storage medium
CN111080609A (en) Brake shoe bolt loss detection method based on deep learning
CN114612803B (en) Improved CENTERNET transmission line insulator defect detection method
CN117876836A (en) Image fusion method based on multi-scale feature extraction and object reconstruction
CN114155375B (en) Method, device, electronic equipment and storage medium for detecting airport pavement defects
Shit et al. An encoder‐decoder based CNN architecture using end to end dehaze and detection network for proper image visualization and detection
CN114037693A (en) A deep learning-based evaluation method for rock pore-crack and impurity characteristics
CN114882367B (en) A method for detecting and evaluating airport pavement defects
CN117350925A (en) Infrared and visible light image fusion method, device and equipment for inspection images
CN115049640B (en) A road crack detection method based on deep learning
CN114998713B (en) Pavement disease identification method, device and system, electronic equipment and storage medium
Cao et al. A novel image multitasking enhancement model for underwater crack detection
CN114428110A (en) Method and system for detecting defects of fluorescent magnetic powder inspection image of bearing ring
CN114821351A (en) Railway hazard identification method, device, electronic equipment and storage medium
CN114897790A (en) Method for quickly identifying surface cracks of reinforced concrete member based on deep learning
CN111145178A (en) Multi-scale segmentation method for high-resolution remote sensing images

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant