CN117746252A

CN117746252A - A landslide detection method based on improved lightweight YOLOv7

Info

Publication number: CN117746252A
Application number: CN202311747742.6A
Authority: CN
Inventors: 杜宏煜; 季一木; 冯保龙; 刘尚东; 何俊杰; 张伟滔; 吴欣冉; 吴佳鸣
Original assignee: Nanjing University of Posts and Telecommunications
Current assignee: Nanjing University of Posts and Telecommunications
Priority date: 2023-12-19
Filing date: 2023-12-19
Publication date: 2024-03-22

Abstract

The invention belongs to the technical field of computer image processing and relates to a landslide detection method based on an improved lightweight YOLOv7. First, mountain satellite images are collected and preprocessed with the GAN super-resolution algorithm; the images are then stitched, rotated and eroded, and annotated with ground-truth boxes and categories. Second, the original YOLOv7 feature-extraction backbone is replaced with the lightweight network MobileNetV3, a small-target detection layer and a HAT attention mechanism are added to the model, and the model is trained repeatedly under simulated weather conditions to obtain the improved lightweight YOLOv7 model, which is then applied to mountain satellite images to obtain the landslide detection result. The invention improves image resolution, addresses the uneven distribution of positive and negative samples, provides efficient small-target detection, and adapts better to complex and changeable weather conditions.

Description

A landslide detection method based on improved lightweight YOLOv7

Technical field

The invention belongs to the technical field of computer image processing and specifically relates to a landslide detection method based on an improved lightweight YOLOv7.

Background art

Landslides are a serious natural disaster, usually triggered by factors such as rainfall, geological structure and geological conditions. They can cause property loss, casualties and damage to the ecological environment, so timely detection and monitoring of landslides is crucial to mitigating potential risks. Traditional landslide monitoring relies mainly on manual inspection and on fixed monitoring equipment such as inclinometers, displacement sensors and depth sounders. Although these methods provide useful data to a certain extent, they have the following limitations: ① Reliance on manual inspection: manual inspection requires a great deal of time and manpower; it is inefficient and difficult to carry out in bad weather or at night. ② Limited monitoring range: traditional monitoring equipment is usually installed at a limited number of locations and can hardly cover the large areas in which landslides may occur. ③ High cost: installing, maintaining and operating monitoring equipment requires substantial investment, which is a serious constraint for many regions.

Existing landslide detection methods face several challenges: ① Complex terrain and background: landslides usually occur in varied and complex terrain, where interference from terrain and background is severe and makes detection more difficult. ② Small-target detection: the early signs of a landslide, such as slight movement of soil and rock or the formation of cracks, are usually relatively small, so the model needs efficient small-target detection capability. ③ Data imbalance: because landslides are rare events, the imbalance between positive and negative samples can make model training unstable and high accuracy hard to achieve. ④ Blurry data sets: because landslide events are highly dangerous, the publicly available videos and satellite images are of low quality, and the environments they cover are limited in variety.

In recent years a large number of object detection algorithms have been proposed. A typical traditional algorithm is SIFT (Scale-Invariant Feature Transform), which detects local feature points in an image and can be used for object detection and matching. With the rise of deep convolutional neural networks (CNNs), advances in computer vision and deep learning have opened new possibilities for landslide detection. Deep-learning-based object detection algorithms fall into two main families: two-stage detectors, represented by Faster R-CNN, and one-stage detectors, represented by YOLO and SSD. Faster R-CNN is a classic object detection algorithm that improves detection speed by introducing a Region Proposal Network (RPN); although it is accurate, it may face challenges in real-time performance and model size. SSD is a single-stage object detection algorithm that detects multi-scale targets at relatively high speed. RetinaNet is an object detection algorithm that uses Focal Loss to address the imbalance between positive and negative samples; although it performs well in some specific scenarios, it may require more computing resources.

Therefore, a lightweight landslide detection method with efficient small-target detection capability is needed.

Summary of the invention

In view of the above problems in the prior art, the present invention provides a landslide detection method based on an improved lightweight YOLOv7. The lightweight network MobileNetV3 replaces the original YOLOv7 feature-extraction backbone; a small-target detection layer is then added to the model, and a HAT attention mechanism is added to mitigate the low resolution of the data set.

To achieve the above object, the present invention adopts the following technical solution: a landslide detection method based on an improved lightweight YOLOv7, comprising the following steps:

Step 1: collect mountain satellite images;

Step 2: preprocess the mountain satellite images with the GAN super-resolution algorithm; then, to address the uneven distribution of positive and negative samples, apply image augmentation operations such as stitching, rotation and erosion, which increases the number of negative samples and also improves the generalization ability of the model; then divide the processed mountain satellite images into a training set, a validation set and a test set, and annotate the images with ground-truth boxes and categories;

Step 3: build and train the improved lightweight YOLOv7 model, which comprises a backbone network, a neck network and a detection head network; the backbone network comprises a MobileNet1 network, a MobileNet2 network, a MobileNet3 network and an MP-ELAN downsampling network;

Step 4: input the mountain satellite images processed in step 2 into the backbone network of the improved lightweight YOLOv7 model, and extract the feature information of the MobileNet1, MobileNet2, MobileNet3 and MP-ELAN downsampling networks respectively to obtain feature maps of four different sizes;

Step 5: input the four feature maps obtained in step 4 into the neck network; the neck network adopts a PAFPN feature pyramid and performs feature fusion through top-down and bottom-up feature extraction and the FPA enhanced feature-extraction network, yielding enhanced feature maps of four different sizes; in addition, to further improve network performance, a HAT attention mechanism is introduced into the neck network to strengthen the network's attention to and learning of key features, so that important information in the mountain satellite images is captured more effectively;

Step 6: the detection head network comprises the structurally re-parameterized convolution RepConv and IDetect detection heads for four different target sizes; the four enhanced feature maps obtained in step 5 are input into the detection head network to obtain predicted feature maps of four different sizes, and prior boxes are calculated; three prior boxes are generated for the predicted feature map of each size and compared with the ground-truth box, and the prior box with the largest intersection over union (IoU) with the ground-truth box is selected for prediction; training is repeated under simulated weather conditions to complete the training of the improved lightweight YOLOv7 network;

The IoU is calculated as follows:

$$\mathrm{IoU} = \frac{\text{Intersection\_Area}}{\text{Union\_Area}}$$

where Intersection_Area is the area of the intersection of the two bounding boxes and Union_Area is the area of their union;

Step 7: input the test set into the trained improved lightweight YOLOv7 model; the images pass through the backbone network, the neck network and the detection head network in turn to produce predicted feature maps of four different sizes, and model inference outputs the final target boxes and target labels, giving the landslide detection result.
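The IoU comparison used in step 6 can be sketched in a few lines of Python. This is a minimal illustration, assuming boxes are given as (x1, y1, x2, y2) corner coordinates; the function names `iou` and `best_prior` are hypothetical and do not come from the patent.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (empty if the boxes do not overlap).
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    intersection_area = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union_area = area_a + area_b - intersection_area
    return intersection_area / union_area if union_area > 0 else 0.0


def best_prior(priors, truth):
    """Index of the prior box with the largest IoU against the ground-truth box."""
    return max(range(len(priors)), key=lambda i: iou(priors[i], truth))
```

With three prior boxes per predicted feature map, `best_prior(priors, truth)` returns the one that step 6 would select for prediction.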

Further, preprocessing the mountain satellite images with the GAN super-resolution algorithm in step 2 specifically comprises: applying GAN super-resolution processing to the collected mountain satellite images to improve their resolution, and then applying geometric transformations to the images, the geometric transformations including random expansion, random cropping, random stitching and scaling to a fixed ratio.
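Two of the augmentation operations named in step 2, rotation and erosion, can be illustrated on a small grayscale grid. The sketch below is an assumption about how such operations might look: it uses plain nested lists, a square structuring element, and hypothetical function names `erode` and `rotate90`; the patent does not specify the kernel size or rotation angles.

```python
def erode(image, radius=1):
    """Grayscale erosion: each pixel becomes the minimum of its
    (2 * radius + 1) x (2 * radius + 1) neighborhood, clipped at the border."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            neighborhood = [
                image[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            row.append(min(neighborhood))
        out.append(row)
    return out


def rotate90(image):
    """Rotate a grid 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]
```

When an image is rotated, its ground-truth boxes have to be transformed in the same way; that bookkeeping is omitted here.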

Further, in step 2, the ratio of the training set, validation set and test set is 8:1:1.
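An 8:1:1 split can be sketched as follows. This is a minimal illustration, assuming the images are identified by file names and shuffled with a fixed seed; `split_dataset` and its parameters are hypothetical names, and the patent does not state how the split is drawn.

```python
import random


def split_dataset(items, ratios=(8, 1, 1), seed=0):
    """Shuffle items and split them into train/validation/test by the given ratios."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    total = sum(ratios)
    n_train = len(shuffled) * ratios[0] // total
    n_val = len(shuffled) * ratios[1] // total
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # any remainder goes to the test set
    return train, val, test
```

For 1000 images this yields 800 training, 100 validation and 100 test images.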

Further, in step 2, the LabelImg tool is used to annotate the mountain satellite images with ground-truth boxes and categories.

Further, the HAT attention mechanism consists of two parts: a local window self-attention mechanism and a fused channel attention mechanism;

Local window self-attention mechanism: the input features are normalized and then divided into local windows by the window self-attention mechanism, so that attention focuses on local correlations;

Fused channel attention mechanism: global information is introduced and used to weight the features, and the channel attention mechanism activates more pixels to obtain more feature information.
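The two ingredients of the HAT attention mechanism can be illustrated with a deliberately simplified sketch. It shows only the window partitioning and a global-average channel gate on nested lists; the learned projections, the attention computation inside each window and the way the two branches are fused are all omitted, and the names `window_partition` and `channel_attention` are hypothetical.

```python
import math


def window_partition(feature, window):
    """Split an H x W grid (list of rows) into non-overlapping window x window blocks."""
    h, w = len(feature), len(feature[0])
    assert h % window == 0 and w % window == 0, "grid must tile evenly"
    return [
        [row[c:c + window] for row in feature[r:r + window]]
        for r in range(0, h, window)
        for c in range(0, w, window)
    ]


def channel_attention(channels):
    """Reweight each channel (an H x W grid) by a sigmoid gate of its global mean.

    A real channel-attention block passes the pooled values through learned
    layers before the sigmoid; that part is left out of this sketch.
    """
    out = []
    for grid in channels:
        values = [v for row in grid for v in row]
        gate = 1.0 / (1.0 + math.exp(-sum(values) / len(values)))
        out.append([[v * gate for v in row] for row in grid])
    return out
```

Self-attention would then be computed independently inside each block returned by `window_partition`, while `channel_attention` injects a per-channel weight derived from the whole feature map.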

Further, the PAFPN feature pyramid in step 5 comprises an SPPCSPC module, ELAN-H modules, UP upsampling, MP downsampling and convolutional layers, and the HAT attention mechanism is introduced at the tail of the fusion of the SPPCSPC module and the ELAN-H module.

Further, step 2 yields 640*640 images. The backbone network outputs a 32× downsampled feature map, and the SPPCSPC module then reduces the number of channels from 1024 to 512. This result is fused top-down with the outputs of the MobileNet1 and MobileNet3 layers to obtain a 160*160 feature map; bottom-up feature fusion with the ELAN-H output and the SPPCSPC output then yields 20*20 and 40*40 feature maps; finally, the MobileNet2 layer information is fused with the ELAN-H output feature map to obtain an 80*80 feature map.
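The four feature-map sizes follow directly from the input size and the downsampling factors. The short sketch below assumes strides of 4, 8, 16 and 32; only the 32× factor is stated in the text, and the other three are inferred from the 160, 80 and 40 sizes.

```python
def feature_map_sizes(input_size=640, strides=(4, 8, 16, 32)):
    """Side length of the feature map at each downsampling stride."""
    return {stride: input_size // stride for stride in strides}


sizes = feature_map_sizes()
for stride, side in sizes.items():
    print(f"stride {stride:>2}: {side} x {side}")
```

The 160*160 map corresponds to the added small-target detection layer, since it keeps the finest spatial resolution of the four.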

Further, during training the matching degree and the loss value are calculated. The matching degree is obtained mainly by comparing the IoU between each predicted box and the ground-truth bounding box and evaluating it against the matching criterion. All predicted boxes are then assigned positive or negative sample labels, and the corresponding ground-truth object labels are determined to facilitate the subsequent loss calculation. Once the loss value has been calculated, its gradient is backpropagated to update the weights and biases of the improved lightweight YOLOv7 network, thereby minimizing the loss function. The loss function to be minimized is calculated as follows:

$$\mathrm{FocalLoss}(p_t) = -\alpha_t (1 - p_t)^{\gamma} \log(p_t)$$

where p_t represents the proportion of difficult samples, (1-p_t)^γ is called the modulation coefficient, α_t defaults to 0.25, and γ defaults to 1.5.
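The focal loss can be evaluated for a single sample as in the sketch below. It uses the default values α_t = 0.25 and γ = 1.5 given in the text and treats p_t, as in the usual focal-loss formulation, as the predicted probability of the true class; that reading of p_t is an assumption about the patent's intent, and the function name `focal_loss` is hypothetical.

```python
import math


def focal_loss(p_t, alpha_t=0.25, gamma=1.5):
    """Focal loss for one sample; p_t must lie in (0, 1]."""
    modulation = (1.0 - p_t) ** gamma  # the modulation coefficient
    return -alpha_t * modulation * math.log(p_t)


for p in (0.1, 0.5, 0.9):
    print(f"p_t = {p}: loss = {focal_loss(p):.4f}")
```

Because of the modulation coefficient, a well-classified sample (p_t close to 1) contributes far less to the loss than a hard one, which is how the loss counteracts the imbalance between positive and negative samples.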

Further, to make the predicted boxes more precise, an improved non-maximum suppression algorithm is introduced. Its inputs are B = {b_1, b_2, …, b_N}, S = {s_1, s_2, …, s_N} and N_t, where B is the list of input predicted boxes, S contains the confidence score of each predicted box, and N_t is the non-maximum suppression threshold. The predicted-box score is calculated as follows:

where M and σ are constants; s_i is the confidence score of the i-th predicted box; b_i is the i-th predicted box, i ∈ [1, N];

By calculating the confidence scores of the predicted boxes and updating them continuously, the precision of the predicted boxes is improved.
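The score formula itself is not reproduced in the text. The symbols M and σ resemble those of Gaussian Soft-NMS, in which M is the currently highest-scoring box and each remaining score is decayed as s_i ← s_i · exp(−IoU(M, b_i)² / σ). The sketch below implements that standard technique purely as an assumption about what the improved algorithm may look like; it is not confirmed by the patent, it does not use N_t, and the names `soft_nms` and `score_floor` are hypothetical.

```python
import math


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_floor=0.001):
    """Gaussian Soft-NMS: returns (kept_indices, kept_scores) in selection order."""
    scores = list(scores)
    remaining = list(range(len(boxes)))
    kept, kept_scores = [], []
    while remaining:
        # M: the remaining box with the highest current score.
        m = max(remaining, key=lambda i: scores[i])
        remaining.remove(m)
        kept.append(m)
        kept_scores.append(scores[m])
        survivors = []
        for i in remaining:
            # Decay the score of b_i according to its overlap with M.
            scores[i] *= math.exp(-(iou(boxes[m], boxes[i]) ** 2) / sigma)
            if scores[i] >= score_floor:
                survivors.append(i)
        remaining = survivors
    return kept, kept_scores
```

Unlike classical NMS, which discards every box whose IoU with M exceeds the threshold, this variant only lowers the scores of overlapping boxes, so closely spaced targets are less likely to be suppressed.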

The present invention has the following technical effects: (1) The present invention proposes a landslide detection method based on an improved lightweight YOLOv7. The GAN super-resolution algorithm is used to preprocess the images, greatly improving their resolution; image operations such as stitching, rotation and erosion are applied to address the uneven distribution of positive and negative samples; the lightweight MobileNetV3 backbone replaces the YOLOv7 backbone, a small-target detection layer is added, and a HAT attention mechanism is introduced into the network, which makes the model lightweight, further mitigates the low resolution of targets, improves network performance and strengthens the network's attention to and learning of key features; in addition, to make the model more robust, image augmentation is used to simulate different weather conditions.

(2) By building and training the improved lightweight YOLOv7 network, the present invention overcomes some limitations of traditional landslide detection methods and improves the performance of the existing YOLO model in landslide detection: ① Weather adaptability: the model is optimized to adapt better to complex and changeable weather conditions. ② Small-target detection: the improved model has efficient small-target detection capability and can identify the early signs of landslides. ③ Data balancing strategy: a data balancing strategy is introduced to address the imbalance between positive and negative samples and to improve the stability and accuracy of the model.
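The text does not say how weather conditions are simulated. One common way to imitate fog, shown here only as an assumed example, is to blend every pixel toward a uniform bright "airlight" value; the function name `add_fog` and the parameters `density` and `airlight` are hypothetical, and rain or snow would need different operations.

```python
def add_fog(image, density=0.5, airlight=255):
    """Blend every pixel of a grayscale grid toward a uniform airlight value.

    density = 0 leaves the image unchanged; density = 1 yields pure airlight.
    """
    if not 0.0 <= density <= 1.0:
        raise ValueError("density must be in [0, 1]")
    return [
        [round((1.0 - density) * pixel + density * airlight) for pixel in row]
        for row in image
    ]
```

Training on copies of each image at several densities is one way to expose the detector to reduced contrast without collecting new imagery.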

Brief description of the drawings

Figure 1 is a schematic structural diagram of the improved lightweight YOLOv7 of the present invention;

Figure 2 is a schematic structural diagram of the ELAN-H network;

Figure 3 is a schematic structural diagram of the SPPCSPC module.

Detailed description

The content of the present invention and the drawings are described in detail below. This embodiment is implemented on the basis of the technical solution of the present invention and gives a detailed implementation and operating procedure, but the protection scope of the present invention is not limited to the following specific embodiment. The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the invention.

A landslide detection method based on an improved lightweight YOLOv7, comprising the following steps:

Step 2: preprocess the mountain satellite images with the GAN super-resolution algorithm to improve their resolution, and then apply geometric transformations to the images, including random expansion, random cropping, random stitching and scaling to a fixed ratio; then, to address the uneven distribution of positive and negative samples, apply image augmentation operations such as stitching, rotation and erosion, which increases the number of negative samples and also improves the generalization ability of the model; then divide the processed mountain satellite images into a training set, a validation set and a test set in the ratio 8:1:1, and use the LabelImg tool to annotate the images with ground-truth boxes and categories;

Step 5: input the four feature maps obtained in step 4 into the neck network; the neck network adopts a PAFPN feature pyramid and performs feature fusion through top-down and bottom-up feature extraction and the FPA enhanced feature-extraction network, yielding enhanced feature maps of four different sizes; in addition, to further improve network performance, a HAT attention mechanism is introduced into the neck network to strengthen the network's attention to and learning of key features, so that important information in the mountain satellite images is captured more effectively; the PAFPN feature pyramid comprises an SPPCSPC module, ELAN-H modules, UP upsampling, MP downsampling and convolutional layers, and the HAT attention mechanism is introduced at the tail of the fusion of the SPPCSPC module and the ELAN-H module; in the SPPCSPC module, Maxpool_9, Maxpool_7, Maxpool_5 and Maxpool_3 are max-pooling operations at four different scales, giving four receptive fields that distinguish large targets from small targets;

Step 6: the detection head network comprises the structurally re-parameterized convolution RepConv and IDetect detection heads for four different target sizes; the four enhanced feature maps obtained in step 5 are input into the detection head network to obtain predicted feature maps of four different sizes, and prior boxes are calculated; three prior boxes are generated for the predicted feature map of each size and compared with the ground-truth box, and the prior box with the largest intersection over union (IoU) with the ground-truth box is selected for prediction; training is repeated under simulated weather conditions (rainy, foggy and snowy) to complete the training of the improved lightweight YOLOv7 network;

The IoU is calculated as follows:

$$\mathrm{IoU} = \frac{\text{Intersection\_Area}}{\text{Union\_Area}}$$

During training, the matching degree and the loss value are calculated. The matching degree is obtained mainly by comparing the IoU between each predicted box and the ground-truth bounding box and evaluating it against the matching criterion. All predicted boxes are then assigned positive or negative sample labels, and the corresponding ground-truth object labels are determined to facilitate the subsequent loss calculation. Once the loss value has been calculated, its gradient is backpropagated to update the weights and biases of the improved lightweight YOLOv7 network, thereby minimizing the loss function. The loss function to be minimized is calculated as follows:

$$\mathrm{FocalLoss}(p_t) = -\alpha_t (1 - p_t)^{\gamma} \log(p_t)$$

where p_t represents the proportion of difficult samples, (1-p_t)^γ is called the modulation coefficient, α_t defaults to 0.25, and γ defaults to 1.5;

The HAT attention mechanism consists of two parts: a local window self-attention mechanism and a fused channel attention mechanism;

Step 2 yields 640*640 images. The backbone network outputs a 32× downsampled feature map, and the SPPCSPC module then reduces the number of channels from 1024 to 512. This result is fused top-down with the outputs of the MobileNet1 and MobileNet3 layers to obtain a 160*160 feature map; bottom-up feature fusion with the ELAN-H output and the SPPCSPC output then yields 20*20 and 40*40 feature maps; finally, the MobileNet2 layer information is fused with the ELAN-H output feature map to obtain an 80*80 feature map.
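The four max-pooling branches of the SPPCSPC module (kernel sizes 3, 5, 7 and 9) can only be combined if they all return feature maps of the same spatial size. The sketch below checks this with the usual pooling output-size formula, assuming stride 1 and padding of kernel // 2 as in common spatial-pyramid-pooling implementations; neither value is stated in the patent, and `pooled_size` is a hypothetical helper.

```python
def pooled_size(size, kernel, stride=1, padding=None):
    """Output side length of a max-pooling layer (floor mode)."""
    if padding is None:
        padding = kernel // 2  # "same" padding for odd kernels; an assumption here
    return (size + 2 * padding - kernel) // stride + 1


# The 32x downsampled map of a 640*640 input has side length 20.
branch_sizes = {k: pooled_size(20, k) for k in (3, 5, 7, 9)}
print(branch_sizes)
```

Under these assumptions each branch preserves the 20*20 resolution while covering a different receptive field, which is what lets the module respond to both large and small targets.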

Further, to make the predicted boxes more precise, an improved non-maximum suppression algorithm is introduced. Its inputs are B = {b_1, b_2, …, b_N}, S = {s_1, s_2, …, s_N} and N_t, where B is the list of input predicted boxes, S contains the confidence score of each predicted box, and N_t is the non-maximum suppression threshold. The predicted-box score is calculated as follows:

Comparative experiments were carried out on four models using three indicators (parameter count, precision and accuracy): YOLOv7, YOLOv7_MobileNet (backbone replaced), YOLOv7_MobileNet_addlayer (small-target detection layer added) and YOLOv7_MobileNet_addlayer_HAT (HAT attention mechanism added). The results are shown in the following table.

Table 1 Comparison of results

Model                            Parameters   Precision   Accuracy
YOLOv7                           72M          71.36%      72.78%
YOLOv7_MobileNet                 16.7M        70.64%      73.12%
YOLOv7_MobileNet_addlayer        17.3M        73.56%      76.74%
YOLOv7_MobileNet_addlayer_HAT    17.3M        75.23%      79.51%

The results show that the YOLOv7_MobileNet_addlayer_HAT model, with 17.3M parameters against 72M for the original YOLOv7, achieves the highest precision (75.23%) and accuracy (79.51%) of the four models.
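The relative changes against the YOLOv7 baseline can be computed directly from the values in Table 1. The short script below is only an arithmetic aid; the dictionary layout and the function name `compare` are hypothetical.

```python
# (parameters in millions, precision %, accuracy %) as listed in Table 1.
results = {
    "YOLOv7": (72.0, 71.36, 72.78),
    "YOLOv7_MobileNet": (16.7, 70.64, 73.12),
    "YOLOv7_MobileNet_addlayer": (17.3, 73.56, 76.74),
    "YOLOv7_MobileNet_addlayer_HAT": (17.3, 75.23, 79.51),
}


def compare(name, baseline="YOLOv7"):
    """Parameter reduction (%) and precision/accuracy gains (percentage points)."""
    params, precision, accuracy = results[name]
    base_params, base_precision, base_accuracy = results[baseline]
    return {
        "param_reduction_pct": round(100 * (1 - params / base_params), 2),
        "precision_gain_pts": round(precision - base_precision, 2),
        "accuracy_gain_pts": round(accuracy - base_accuracy, 2),
    }


print(compare("YOLOv7_MobileNet_addlayer_HAT"))
```

For the final model this gives a parameter reduction of about 76% together with gains of 3.87 points in precision and 6.73 points in accuracy.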

The above description of the disclosed embodiments enables those skilled in the art to use the present invention. The above embodiments merely illustrate the technical idea of the present invention and do not limit its protection scope; any modification made to the technical solution in accordance with the technical idea proposed by the present invention falls within the protection scope of the present invention.

Claims

1. A landslide detection method based on an improved lightweight YOLOv7, characterized by comprising the following steps:

step 1, collecting mountain satellite images;

step 2, preprocessing the mountain satellite images with the GAN super-resolution algorithm, then applying stitching, rotation and erosion operations to the images, dividing the processed mountain satellite images into a training set, a validation set and a test set, and annotating the mountain satellite images with ground-truth boxes and categories;

step 3, building and training an improved lightweight YOLOv7 model, wherein the improved lightweight YOLOv7 model comprises a backbone network, a neck network and a detection head network; the backbone network comprises a MobileNet1 network, a MobileNet2 network, a MobileNet3 network and an MP-ELAN downsampling network;

step 4, inputting the mountain satellite images processed in step 2 into the backbone network of the improved lightweight YOLOv7 model, and extracting the feature information of the MobileNet1 network, the MobileNet2 network, the MobileNet3 network and the MP-ELAN downsampling network respectively to obtain feature maps of four different sizes;

step 5, inputting the four feature maps of different sizes obtained in step 4 into the neck network, wherein the neck network adopts a PAFPN feature pyramid, enhanced feature maps of four different sizes are obtained through top-down feature extraction and feature fusion by the FPA enhanced feature-extraction network, and a HAT attention mechanism is introduced into the neck network;

step 6, the detection head network comprising the structurally re-parameterized convolution RepConv and detection heads for four different target sizes; inputting the four enhanced feature maps of different sizes obtained in step 5 into the detection head network to obtain predicted feature maps of four different sizes, calculating prior boxes, generating three prior boxes for the predicted feature map of each size, comparing them with the ground-truth box, selecting the prior box with the largest intersection over union (IoU) with the ground-truth box for prediction, and training repeatedly under simulated weather conditions to complete the training of the improved lightweight YOLOv7 network;

the IoU being calculated as follows:

$$\mathrm{IoU} = \frac{\text{Intersection\_Area}}{\text{Union\_Area}}$$

wherein Intersection_Area is the area of the intersection of the two bounding boxes, and Union_Area is the area of their union;

step 7, inputting the test set into the trained improved lightweight YOLOv7 model, passing it through the backbone network, the neck network and the detection head network in turn to obtain predicted feature maps of four different sizes, and outputting the final target boxes and target labels through model inference to obtain the landslide detection result.

2. The landslide detection method based on an improved lightweight YOLOv7 of claim 1, wherein preprocessing the mountain satellite images with the GAN super-resolution algorithm in step 2 specifically comprises: applying GAN super-resolution processing to the collected mountain satellite images, and then applying geometric transformations to the images, the geometric transformations comprising random expansion, random cropping, random stitching and scaling to a fixed ratio.

3. The landslide detection method based on an improved lightweight YOLOv7 of claim 1, wherein in step 2 the ratio of the training set, validation set and test set is 8:1:1.

4. The landslide detection method based on an improved lightweight YOLOv7 of claim 1, wherein in step 2 the LabelImg tool is used to annotate the mountain satellite images with ground-truth boxes and categories.

5. The landslide detection method based on an improved lightweight YOLOv7 of claim 1, wherein the HAT attention mechanism consists of two parts: a local window self-attention mechanism and a fused channel attention mechanism;

local window self-attention mechanism: the input features are normalized and then divided into local windows by the window self-attention mechanism, so that attention focuses on local correlations;

fused channel attention mechanism: global information is introduced and used to weight the features, and the channel attention mechanism activates more pixels to obtain more feature information.

6. The landslide detection method based on an improved lightweight YOLOv7 of claim 1, wherein the PAFPN feature pyramid in step 5 comprises an SPPCSPC module, ELAN-H modules, UP upsampling, MP downsampling and convolutional layers, and the HAT attention mechanism is introduced at the tail of the fusion of the SPPCSPC module and the ELAN-H module.

7. The landslide detection method based on an improved lightweight YOLOv7 of claim 6, wherein step 2 yields 640 x 640 images, the backbone network outputs a 32× downsampled feature map, the SPPCSPC module then reduces the number of channels from 1024 to 512, and the result is fused top-down with the outputs of the MobileNet1 and MobileNet3 layers to obtain a 160 x 160 feature map; bottom-up feature fusion with the ELAN-H output and the SPPCSPC output then yields 20 x 20 and 40 x 40 feature maps; and the MobileNet2 layer information is then fused with the ELAN-H output feature map to obtain an 80 x 80 feature map.

8. The landslide detection method based on an improved lightweight YOLOv7 of claim 1, wherein during training the matching degree and the loss value are calculated; the matching degree is obtained by comparing the IoU between each predicted box and the ground-truth bounding box and evaluating it against the matching criterion; all predicted boxes are then assigned positive or negative sample labels, and the corresponding ground-truth object labels are determined to facilitate the subsequent loss calculation; once the loss value has been calculated, its gradient is backpropagated to update the weights and biases of the improved lightweight YOLOv7 network, thereby minimizing the loss function; the loss function to be minimized is calculated as follows:

$$\mathrm{FocalLoss}(p_t) = -\alpha_t (1 - p_t)^{\gamma} \log(p_t)$$

wherein p_t represents the proportion of difficult samples, (1-p_t)^γ is called the modulation coefficient, α_t defaults to 0.25, and γ defaults to 1.5.

9. The landslide detection method based on an improved lightweight YOLOv7 of claim 1, wherein, to make the predicted boxes more precise, an improved non-maximum suppression algorithm is introduced, whose inputs are B = {b_1, b_2, …, b_N}, S = {s_1, s_2, …, s_N} and N_t; wherein B is the list of input predicted boxes, S contains the confidence score of each predicted box, and N_t is the non-maximum suppression threshold; the predicted-box score is calculated as follows:

wherein M and σ are constants; s_i is the confidence score of the i-th predicted box; b_i is the i-th predicted box, i ∈ [1, N];

by calculating the confidence scores of the predicted boxes and updating them continuously, the precision of the predicted boxes is improved.