
CN110232362B - Ship size estimation method based on convolutional neural network and multi-feature fusion - Google Patents


Info

Publication number: CN110232362B (grant of application publication CN110232362A)
Application number: CN201910526996.2A
Authority: CN (China)
Legal status: Active
Inventors: 王英华, 王聪, 刘宏伟, 何敬鲁
Applicant and assignee: Xidian University
Original language: Chinese (zh)

Classifications

    • G06F18/24323 Tree-organised classifiers
    • G06F18/253 Fusion techniques of extracted features
    • G06N3/045 Combinations of networks
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G06T5/20 Image enhancement or restoration using local operators
    • G06T5/70 Denoising; smoothing
    • G06T7/11 Region-based segmentation
    • G06T7/136 Segmentation; edge detection involving thresholding
    • G06T7/60 Analysis of geometric attributes
    • G06V20/13 Satellite images
    • G06T2207/10032 Satellite or aerial image; remote sensing
    • G06T2207/10044 Radar image
    • G06T2207/20024 Filtering details
    • G06T2207/20036 Morphological image processing
    • Y02A90/10 Information and communication technologies (ICT) supporting adaptation to climate change


Abstract

The invention discloses a ship size estimation method based on a convolutional neural network and multi-feature fusion, which mainly solves the prior-art problem of large target size estimation errors at medium and low resolution. The scheme is: 1) obtain training samples and test samples and preprocess them; 2) obtain the target-region amplitude image and compute multi-dimensional features; 3) construct a target size estimation network framework and train it on the training samples to obtain a trained model; 4) use the trained model to estimate preliminary size features of the test samples; 5) combine the multi-dimensional features with the preliminary size features into new multi-dimensional features; 6) train a gradient boosting decision tree (GBDT) with the new multi-dimensional features; 7) use the trained GBDT model to estimate the final size features of the test samples. By exploiting the ability of a CNN to learn SAR image target features autonomously, the invention improves target size estimation accuracy and can be used for recognition and classification of ship targets in SAR images.

Description

Ship size estimation method based on convolutional neural network and multi-feature fusion

Technical Field

The invention belongs to the technical field of radar image processing, mainly relates to a method for estimating ship size in SAR images, and can be used for recognizing and classifying ship targets in SAR images.

Background Art

Synthetic aperture radar (SAR) can observe the marine environment over large areas regardless of weather and other environmental factors. It has become an effective means of maritime management and is also widely used in military reconnaissance, disaster forecasting and other fields, giving it broad research and application prospects. In recent years, building on the monitoring of marine targets, research has been carried out on ship detection and classification, iceberg detection, wind inversion, oil spill detection, sea ice monitoring, ship wake detection and related topics, and its practicality has been proven. Among these, size estimation of ship targets is the key to, and basis of, ship target detection and classification; precise geometric parameter estimation is also key to SAR image interpretation, making it an important research topic.

Research on ship target size estimation has already produced many results. However, practical limitations make it difficult to obtain large numbers of high-resolution SAR images. In medium- and low-resolution SAR images, ship target detail is not rich enough, which degrades the accuracy of target size estimation. To address this problem, Bjorn Tings et al. proposed a dynamic adaptive ship parameter estimation method that uses cross entropy and multivariate linear regression to optimize the algorithm parameters. The method achieved high estimation accuracy on TerraSAR-X data, but it is strongly affected by environmental factors such as sea clutter.

In addition, for medium- and low-resolution SAR images, Lui Bedini et al. proposed a ship target size extraction method that performs morphological analysis on SAR images, aiming to find the target outline in the cluttered image and thereby extract the target's size features. Machine learning has also begun to be applied to ship target size estimation, with good results. Boying Li et al. proposed a ship size estimation method based on dual-polarization fusion and nonlinear regression, using a gradient boosting decision tree (GBDT) to reduce the impact of imaging quality on estimation accuracy. However, these algorithms all rely on hand-designed features and are not robust.

Summary of the Invention

The purpose of the invention is to address the shortcomings of existing SAR target size estimation methods by proposing a ship target size estimation method based on a convolutional neural network and multi-feature fusion, so as to improve the performance of target size estimation at medium and low resolution and thereby improve its accuracy.

The technical idea of the invention is as follows: image segmentation is performed on the training samples and test samples to obtain the target-region amplitude image of each sample, and features are extracted from the target-region amplitude image to obtain the target's multi-dimensional features; the training samples are fed into a target size estimation network framework based on a convolutional neural network (CNN) for training to obtain a trained network model, and the test samples are fed into the trained network model to obtain preliminary target size estimation results. A gradient boosting decision tree (GBDT) is then used to further refine the preliminary results and obtain the final target size features. The implementation steps are as follows:

(1) Select five classes of ship targets from the public OpenSARShip dataset as the experimental dataset, and randomly select 70% of the experimental dataset as training samples and 30% as test samples;

(2) Apply histogram equalization and then mean filtering to each original image G in the dataset to remove noise interference, obtaining the filtered image P;

(3) Apply threshold segmentation to the filtered image P, then apply morphological filtering and clustering to the segmentation result to obtain the target-region binary image Q, and perform sidelobe removal to obtain the binary image Q' with sidelobes removed;

(4) Mask the original image G with the sidelobe-removed binary image Q' to obtain the target-region amplitude image, and compute its multi-dimensional features to obtain the target's multi-dimensional feature vector h;

(5) Construct a target size estimation network framework Ψ based on a convolutional neural network (CNN):

5a) Set up six convolutional layers, namely the first convolutional layer L1, second convolutional layer L2, third convolutional layer L3, fourth convolutional layer L4, fifth convolutional layer L5 and sixth convolutional layer L6;

5b) Set up four max-pooling layers, namely the first max-pooling layer P1, second max-pooling layer P2, third max-pooling layer P3 and fourth max-pooling layer P4;

5c) Interleave the six convolutional layers of 5a) with the four max-pooling layers of 5b), i.e. connect in series L1, P1, L2, P2, L3, P3, L4, P4, L5, L6 to form the target size estimation network framework Ψ;

(6) Feed the training samples into the constructed target size estimation network framework Ψ for training to obtain the trained network model Ψ′, and at the same time obtain preliminary size estimates for the training samples;

(7) Feed the test samples into the trained network model Ψ′ to obtain preliminary size estimates for the test samples;

(8) Combine the training-sample part of the multi-dimensional feature vector h with the preliminary size estimates of the training samples to form the training samples' new multi-dimensional feature vector h1, and combine the test-sample part of h with the preliminary size estimates of the test samples to form the test samples' new multi-dimensional feature vector h2;

(9) Train a gradient boosting decision tree (GBDT) with the training samples' multi-dimensional feature vector h1 to obtain a trained GBDT model;

(10) Feed the test samples' multi-dimensional feature vector h2 into the trained GBDT model to obtain the final size features of the test samples.

Compared with the prior art, the invention has the following advantages:

1) The invention uses a convolutional neural network to make a preliminary size estimate of the ship target in the SAR image, without hand-designing features extracted from the SAR image, which greatly reduces the manual burden and improves the robustness of the size estimation algorithm.

2) The invention draws on deep learning to extract the size features of ship targets in SAR images, while also using traditional algorithms to extract the scattering information of the SAR image, and fuses the multiple features, improving the accuracy of ship target size estimation.

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1 is a flow chart of the implementation of the invention;

Fig. 2 is a diagram of the target size estimation network framework of the invention;

Fig. 3 shows segmentation results of some sample images using the invention;

Fig. 4 compares target sizes estimated by the invention with the true target sizes.

DETAILED DESCRIPTION

The implementation and effects of the invention are described in detail below with reference to the accompanying drawings.

Referring to Fig. 1, the implementation steps of this embodiment are as follows:

Step 1: obtain training samples and test samples.

Five classes of ship targets are selected from the public OpenSARShip dataset as the experimental dataset; 70% of the experimental dataset is randomly selected as training samples and 30% as test samples.

Step 2: preprocess the original images G in the experimental dataset.

Each original image G is first histogram-equalized to increase the contrast between the target region and the background, and then mean-filtered to remove noise interference, yielding the filtered image P.

Step 3: apply threshold segmentation to the filtered image P.

3a) Obtain the maximum amplitude value Imax and minimum amplitude value Imin of the image P;

3b) Set the first scale parameter γ and compute the target/background segmentation threshold thr:

thr = γ(Imax − Imin) + Imin

3c) Binarize the image P:

Imask(x, y) = 1 if I(x, y) ≥ thr, and Imask(x, y) = 0 otherwise,

where I(x, y) is the amplitude of the pixel at position (x, y) in image P and Imask(x, y) is the binary code value of the pixel at (x, y) after binarization; the coded image is the segmented binary image.
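As an illustration, the threshold rule of 3b)–3c) can be sketched in a few lines; the value of γ here is a hypothetical choice, since the patent treats the first scale parameter as tunable:

```python
import numpy as np

def threshold_segment(P, gamma=0.35):
    """Binarize a filtered amplitude image P using the linear threshold
    thr = gamma * (Imax - Imin) + Imin from step 3b); pixels at or above
    thr are coded 1 (target), the rest 0 (background)."""
    I_max, I_min = float(P.max()), float(P.min())
    thr = gamma * (I_max - I_min) + I_min
    return (P >= thr).astype(np.uint8)

P = np.array([[0.0, 10.0],
              [4.0, 2.0]])
mask = threshold_segment(P)   # thr = 0.35 * (10 - 0) + 0 = 3.5
```

On this toy image only the pixels with amplitudes 10 and 4 exceed the threshold, so they are coded 1 and the rest 0.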

Step 4: correct the segmented binary image.

4a) Apply morphological filtering and then clustering to the segmented binary image to obtain the target-region binary image Q;

4b) Extract the principal axis of the target region in the binary image Q by principal component analysis and obtain the principal-axis length LPA;

4c) Extract all straight lines in the target region parallel to the principal axis and compute each line's length Lline;

4d) Choose the second scale parameter ρ and compare the length of every line in the target region with that of the principal axis: if Lline < ρ × LPA, erase all points on that line Lline, i.e. set its pixels to 0, obtaining the binary image Q' with sidelobes removed.
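A minimal sketch of the principal-axis extraction in 4b), assuming PCA is applied to the coordinates of the foreground pixels (the patent does not fix the exact implementation):

```python
import numpy as np

def principal_axis(Q):
    """Principal axis of the foreground of a binary image Q via PCA:
    returns the unit direction vector (dx, dy) and the extent of the
    region projected onto that axis, in pixels."""
    ys, xs = np.nonzero(Q)
    pts = np.stack([xs, ys], axis=1).astype(float)
    pts -= pts.mean(axis=0)
    cov = pts.T @ pts / len(pts)      # 2x2 coordinate covariance
    w, v = np.linalg.eigh(cov)
    axis = v[:, np.argmax(w)]         # eigenvector with largest eigenvalue
    proj = pts @ axis                 # projection onto the principal axis
    return axis, proj.max() - proj.min()

Q = np.zeros((3, 7), dtype=np.uint8)
Q[1, 1:6] = 1                         # a horizontal 5-pixel "ship"
axis, L_PA = principal_axis(Q)        # axis is (±1, 0); extent is 4 pixels
```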

Step 5: compute the target's multi-dimensional feature vector h.

5a) Mask the original image G with the sidelobe-removed binary image Q' to obtain the target-region amplitude image;

5b) Extract the principal axis of the target-region amplitude image by principal component analysis to obtain the target azimuth θ, and compute the cosine θcos of the azimuth θ;

5c) Compute the amplitude sum s1, mean μ1 and variance σ1 of the pixels in the target region, and let the first target feature be f1 = (s1, μ1, σ1);

5d) Extract the edge pixels of the target region with an edge operator, compute their amplitude sum s2, mean μ2 and variance σ2, and let the second target feature be f2 = (s2, μ2, σ2);

5e) Divide the target region into three equal parts along the principal-axis direction, namely bow, midship and stern; compute the amplitude sum s3, mean μ3 and variance σ3 of the bow part, the amplitude sum s4, mean μ4 and variance σ4 of the midship part, and the amplitude sum s5, mean μ5 and variance σ5 of the stern part, and let the third target feature be f3 = (s3, μ3, σ3, s4, μ4, σ4, s5, μ5, σ5);

5f) Combine the features obtained above into the multi-dimensional feature h = (θcos, f1, f2, f3).

The amplitude sum, mean and variance above are computed as:

Amplitude sum: sj = Σ(i=1..N) I(xi, yi)

Mean: μj = (1/N) Σ(i=1..N) I(xi, yi)

Variance: σj = (1/N) Σ(i=1..N) (I(xi, yi) − μj)²

where N is the number of pixels in the region being measured, I(xi, yi) is the amplitude of the pixel at (xi, yi), and j indexes the statistics group, j = 1, …, 5.
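These three statistics can be written directly from the formulas above (note that the variance uses 1/N, not 1/(N−1)):

```python
import numpy as np

def region_stats(amplitudes):
    """Amplitude sum s, mean mu and variance sigma of one pixel region,
    exactly as in the formulas above (population variance, 1/N)."""
    a = np.asarray(amplitudes, dtype=float)
    s = a.sum()
    mu = a.mean()
    sigma = ((a - mu) ** 2).mean()
    return s, mu, sigma

s, mu, sigma = region_stats([1.0, 2.0, 3.0])
```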

Step 6: construct the CNN-based target size estimation network framework Ψ and its parameters.

6a) Referring to Fig. 2, construct the target size estimation network framework Ψ:

6a1) Set up six convolutional layers, namely the first convolutional layer L1, second convolutional layer L2, third convolutional layer L3, fourth convolutional layer L4, fifth convolutional layer L5 and sixth convolutional layer L6;

6a2) Set up four max-pooling layers, namely the first max-pooling layer P1, second max-pooling layer P2, third max-pooling layer P3 and fourth max-pooling layer P4;

6a3) Interleave the six convolutional layers of 6a1) with the four max-pooling layers of 6a2), i.e. connect in series L1, P1, L2, P2, L3, P3, L4, P4, L5, L6 to form the target size estimation network framework Ψ;

6b) Set the parameters of each layer of the network framework Ψ:

First convolutional layer L1: input image data x1 of size 128×128×1; convolution kernel K1 window 5×5, stride SL1 = 1, padding P = 0; convolves the input and outputs 16 feature maps Y1 of size 124×124×16.

First max-pooling layer P1: input Y1; padding P = 0, pooling kernel U1 window 2×2, stride SP1 = 2; downsamples the input and outputs the feature map Y2 of size 62×62×16.

Second convolutional layer L2: input Y2; kernel K2 window 5×5, stride SL2 = 1, padding P = 0; outputs 32 feature maps Y3 of size 58×58×32.

Second max-pooling layer P2: input Y3; padding P = 0, kernel U2 window 2×2, stride SP2 = 2; outputs the feature map Y4 of size 29×29×32.

Third convolutional layer L3: input Y4; kernel K3 window 6×6, stride SL3 = 1, padding P = 0; outputs 64 feature maps Y5 of size 24×24×64.

Third max-pooling layer P3: input Y5; padding P = 0, kernel U3 window 2×2, stride SP3 = 2; outputs the feature map Y6 of size 12×12×64.

Fourth convolutional layer L4: input Y6; kernel K4 window 3×3, stride SL4 = 1, padding P = 0; outputs 128 feature maps Y7 of size 10×10×128.

Fourth max-pooling layer P4: input Y7; padding P = 0, kernel U4 window 2×2, stride SP4 = 2; outputs the feature map Y8 of size 5×5×128.

Fifth convolutional layer L5: input Y8; kernel K5 window 3×3, stride SL5 = 1, padding P = 0; outputs 64 feature maps Y9 of size 3×3×64.

Sixth convolutional layer L6: input Y9; kernel K6 window 3×3, stride SL6 = 1, padding P = 0; outputs 1 feature map Y10 of size 1×1×1, giving the preliminary target size.
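The spatial sizes listed in 6b) can be verified with the standard valid-convolution formula floor((n + 2P − k)/S) + 1; a quick trace (channel counts omitted):

```python
def out_size(n, k, s=1, p=0):
    """Output spatial size of a conv/pool layer with kernel k, stride s,
    padding p on an n x n input (valid convolution when p = 0)."""
    return (n + 2 * p - k) // s + 1

# (name, kernel, stride) for L1, P1, ..., P4, L5, L6 as given in step 6b)
layers = [("L1", 5, 1), ("P1", 2, 2), ("L2", 5, 1), ("P2", 2, 2),
          ("L3", 6, 1), ("P3", 2, 2), ("L4", 3, 1), ("P4", 2, 2),
          ("L5", 3, 1), ("L6", 3, 1)]

n, trace = 128, []
for name, k, s in layers:
    n = out_size(n, k, s)
    trace.append((name, n))
# trace reproduces 124, 62, 58, 29, 24, 12, 10, 5, 3, 1
```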

Step 7: train the target size estimation network framework Ψ to obtain the trained network model Ψ′, and at the same time obtain preliminary size estimates for the training samples.

7a) Feed the training samples into the target size estimation network framework Ψ and compute the loss at the network output layer:

loss = (ytrue − yprediction)²,

where ytrue is the sample's true size and yprediction is the estimated size of the sample target;

7b) Propagate the output-layer loss backward with the backpropagation algorithm, compute the gradient of the loss function by stochastic gradient descent, and update the parameters of every layer of the network;

7c) Repeat 7b), iterating and updating the parameters until the loss function converges, to obtain the trained network model Ψ′;

7d) Feed the training samples into the trained network model Ψ′ to obtain their preliminary size estimates.
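The update rule of 7a)–7c), squared-error loss driven down by stochastic gradient descent, can be illustrated on a one-parameter stand-in for the CNN (the model, data and learning rate here are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(1.0, 2.0, size=64)   # stand-in inputs
y_true = 3.0 * x                     # "true sizes"; the target weight is 3

w, lr = 0.0, 0.1                     # model y_pred = w * x
for _ in range(200):
    i = rng.integers(len(x))         # SGD: one random sample per step
    y_pred = w * x[i]
    grad = 2.0 * (y_pred - y_true[i]) * x[i]  # d/dw (y_true - y_pred)^2
    w -= lr * grad                   # gradient step on the loss

loss = float(np.mean((y_true - w * x) ** 2))
```

After 200 stochastic steps the loss has converged, mirroring step 7c)'s stopping rule.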

Step 8: feed the test samples into the trained network model Ψ′ to obtain their preliminary size estimates.

Step 9: fuse the preliminary target size estimates with the multi-dimensional feature vector.

Combine the training-sample part of the multi-dimensional feature vector h with the preliminary size estimates of the training samples to form the training samples' new multi-dimensional feature vector h1, and combine the test-sample part of h with the preliminary size estimates of the test samples to form the test samples' new multi-dimensional feature vector h2.

步骤10,训练梯度提升决策树GBDT,得到训练好的梯度提升决策树GBDT模型。Step 10: Train the gradient boosted decision tree GBDT to obtain a trained gradient boosted decision tree GBDT model.

10a)初始化梯度提升回归树模型f0=0;10a) Initialize the gradient boosting regression tree model f 0 = 0;

10b)训练梯度提升回归树模型,设循环次数t=1,…,10:10b) Train the gradient boosting regression tree model, assuming the number of cycles t = 1,…,10:

10b1) Take the multi-dimensional features h1 of the training samples as input and fit the residual rt-1 to obtain a trained regression tree model ft:

rt-1 = ytrue - ypre

where ytrue is the true size of the sample and ypre is the estimated size output by the regression tree model ft-1;

10b2) Update the gradient boosting regression tree model ft = ft-1 + v·ft, where v is the learning rate;

10b3) Iterate, updating the gradient boosting regression tree model, until the iteration count reaches t = 10, obtaining the trained gradient boosting regression tree model ζ.
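The loop in 10a)-10b3) is gradient boosting on the squared loss, for which the negative gradient is exactly the residual rt-1. A hypothetical numpy-only sketch follows; a depth-1 regression stump stands in for each regression tree ft, the learning rate v = 0.5 matches the experiments, and the data are synthetic.

```python
import numpy as np

def fit_stump(x, r):
    """Fit a depth-1 regression tree (stump) on feature x to residual r."""
    best = None
    for thr in np.quantile(x, np.linspace(0.1, 0.9, 9)):
        left = x <= thr
        pred = np.where(left, r[left].mean(), r[~left].mean())
        sse = np.sum((r - pred) ** 2)
        if best is None or sse < best[0]:
            best = (sse, thr, r[left].mean(), r[~left].mean())
    return best[1:]

rng = np.random.default_rng(0)
x = rng.normal(size=200)              # stand-in 1-D feature taken from h1
y_true = 100 + 50 * x                 # stand-in true ship sizes (metres)

v = 0.5                               # learning rate, as in the experiments
stumps = []
y_pre = np.zeros(200)                 # f0 = 0
for t in range(10):                   # t = 1, ..., 10
    r = y_true - y_pre                # residual r_{t-1} = ytrue - ypre
    thr, cl, cr = fit_stump(x, r)     # fit tree f_t to the residual
    stumps.append((thr, cl, cr))
    y_pre += v * np.where(x <= thr, cl, cr)   # f_t = f_{t-1} + v * f_t

final_mse = np.mean((y_true - y_pre) ** 2)
```

Each round fits the next tree to what the ensemble still gets wrong, so the training error decreases monotonically with t.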

Step 11: Feed the multi-dimensional feature vector h2 of the test samples into the trained GBDT model ζ to obtain the final size estimates of the test samples.

The effect of the present invention is further illustrated by the following experimental data:

1. Experimental conditions:

1) Experimental data:

The data used in the experiments is the OpenSARShip dataset compiled by Shanghai Jiao Tong University, downloadable from the university's OpenSAR platform at https://opensar.sjtu.edu.cn/. The experimental data come from the VH-polarization interferometric wide-swath (IW) mode, with a resolution of 20 m × 20 m and a pixel spacing of 10 m. The ground-truth sizes of the ship images are provided by the Automatic Identification System (AIS). The dataset contains five target classes: tanker, cargo, bulk carrier, general cargo, and container ship.

The selected dataset contains 2467 target images of size 128×128; 70% of the dataset (1727 images) is randomly selected as training samples, and the remaining 30% (740 images) as test samples.

The targets used in the experiments range from 42 m to 399 m in length and from 6 m to 65 m in width.

Relative error and absolute error are used as the two evaluation criteria for the estimation results.
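As a small, hypothetical illustration of the two criteria (the size values below are made up, not taken from the experiments):

```python
import numpy as np

y_true = np.array([120.0, 250.0, 80.0])   # hypothetical true lengths (m)
y_est = np.array([115.0, 260.0, 84.0])    # hypothetical estimated lengths

abs_err = np.mean(np.abs(y_est - y_true))            # mean absolute error (m)
rel_err = np.mean(np.abs(y_est - y_true) / y_true)   # mean relative error
```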

2. Experimental content:

Experiment 1: The above experimental data were processed with the present invention, setting the first scale parameter γ = 0.67, the second scale parameter ρ = 0.59, and the learning rate v = 0.5. The image segmentation results are visualized in Figure 3, where:

Figure 3(a) shows the original images of some samples;

Figure 3(b) shows the corresponding target-region segmentation images.

As Figure 3 shows, the target-region segmentation is not very accurate, so estimating target size directly from the segmentation result would produce a large error.

Experiment 2: The method of the present invention was compared against an existing method on the above experimental data. To verify the size-extraction performance of the present invention, the results are compared with those of other methods in Table 1.

Table 1. Comparison of size-estimation accuracy between the method of the present invention and the existing method


In Table 1, the existing method is a ship size estimation method based on dual-polarization fusion and nonlinear regression, comprising two main stages: image preprocessing and nonlinear regression. The reproduced implementation may differ from the original method in details.

As Table 1 shows, the ship size estimation error of the present invention is smaller than that of the existing method, achieving better size-estimation performance and higher robustness. Comparing the preliminary and final size-estimation results shows that correcting the preliminary estimates with the gradient boosting regression tree improves the accuracy of target size estimation, indicating that the multi-dimensional features obtained from the binary image of the target region are helpful for target size regression.

Experiment 3: The above experimental data were processed with the method of the present invention, and the estimated target sizes were compared against the true sizes. The results are visualized in Figure 4, where:

Figure 4(a) compares the target lengths estimated by the present invention with the true target lengths;

Figure 4(b) compares the target widths estimated by the present invention with the true target widths.

As Figure 4 shows, the size estimates obtained with the present invention are highly correlated with the true target sizes.

The above description is only one specific example of the present invention and does not limit it in any way. It is obvious that, after understanding the content and principles of the present invention, those skilled in the art may make various modifications and changes in form and detail without departing from its principles and structure; such modifications and changes based on the idea of the present invention remain within the scope of protection of the claims.

Claims (7)

1. A ship target size estimation method based on a convolutional neural network and multi-feature fusion, comprising:
(1) selecting five classes of ship targets from the public OpenSARShip dataset as the experimental dataset, randomly taking 70% as training samples and 30% as test samples;
(2) performing histogram equalization and mean filtering on the original image G in the dataset to remove noise interference, obtaining the filtered image P;
(3) performing threshold segmentation on the filtered image P, then morphological filtering and clustering on the segmentation result to obtain the binary image Q of the target region, and then performing sidelobe removal to obtain the sidelobe-free binary image Q';
(4) masking the original image G with the sidelobe-free binary image Q' to obtain the target-region amplitude image, and computing the multi-dimensional features of the target-region amplitude image to obtain the multi-dimensional feature h of the target;
(5) constructing a target size estimation network framework Ψ based on a convolutional neural network (CNN):
5a) setting six convolutional layers, namely the first convolutional layer L1, the second convolutional layer L2, the third convolutional layer L3, the fourth convolutional layer L4, the fifth convolutional layer L5, and the sixth convolutional layer L6;
5b) setting four max-pooling layers, namely the first max-pooling layer P1, the second max-pooling layer P2, the third max-pooling layer P3, and the fourth max-pooling layer P4;
5c) interleaving the six convolutional layers of 5a) with the four max-pooling layers of 5b), i.e., forming the target size estimation network framework Ψ by connecting in series L1, P1, L2, P2, L3, P3, L4, P4, L5, and L6;
(6) inputting the training samples into the constructed target size estimation network framework Ψ for training, obtaining the trained network model Ψ′ together with preliminary size estimates of the training samples;
(7) inputting the test samples into the trained network model Ψ′ to obtain preliminary size estimates of the test samples;
(8) combining the training-sample part of the multi-dimensional feature vector h with the preliminary size estimates of the training samples to form a new multi-dimensional feature vector h1 of the training samples, and combining the test-sample part of h with the preliminary size estimates of the test samples to form a new multi-dimensional feature vector h2 of the test samples;
(9) training a gradient boosting decision tree (GBDT) classifier with the multi-dimensional feature vector h1 of the training samples, obtaining a trained GBDT model;
(10) feeding the multi-dimensional feature vector h2 of the test samples into the trained GBDT model to obtain the final size estimates of the test samples.
2. The method according to claim 1, wherein the threshold segmentation of the image P in (3) is performed as follows:
3a) obtaining the maximum amplitude value Imax and the minimum amplitude value Imin of the image P;
3b) setting the first scale parameter γ and computing the target/background segmentation threshold thr:
thr = γ(Imax - Imin) + Imin
3c) binarizing the image P:
Imask(x,y) = 1 if I(x,y) ≥ thr, and Imask(x,y) = 0 otherwise,
where I(x,y) is the amplitude of the pixel at position (x,y) in image P and Imask(x,y) is the binarized value of the pixel at (x,y); the encoded image is the segmented binary image.
3. The method according to claim 1, wherein the sidelobe removal in (3) is performed as follows:
3d) extracting the principal axis of the target region in the binary image Q by principal component analysis and computing the principal-axis length LPA;
3e) extracting all straight lines in the target region parallel to the principal axis and computing their lengths Lline;
3f) selecting the second scale parameter ρ and comparing the lengths of the principal axis and of every other line in the target region: if Lline < ρ × LPA, erasing all points on that line Lline, i.e., setting the pixels to 0, to obtain the sidelobe-free binary image Q'.
4. The method according to claim 1, wherein computing the multi-dimensional features of the target-region amplitude image in (4) to obtain the multi-dimensional feature h of the target is performed as follows:
4a) extracting the principal axis of the target-region amplitude image by principal component analysis to obtain the target azimuth θ, and computing the cosine value θcos of the azimuth θ;
4b) computing the pixel-amplitude sum s1, mean μ1, and variance σ1 of the target region, and letting the first target feature f1 = (s1, μ1, σ1);
4c) extracting the edge pixels of the target region with an edge operator, computing their amplitude sum s2, mean μ2, and variance σ2, and letting the second target feature f2 = (s2, μ2, σ2);
4d) dividing the target region into three equal parts along the principal axis, namely the bow, the midship, and the stern; computing the amplitude sum s3, mean μ3, and variance σ3 of the bow, the amplitude sum s4, mean μ4, and variance σ4 of the midship, and the amplitude sum s5, mean μ5, and variance σ5 of the stern, and letting the third target feature f3 = (s3, μ3, σ3, s4, μ4, σ4, s5, μ5, σ5);
4e) combining the features above into the multi-dimensional feature h = (θcos, f1, f2, f3).
5. The method according to claim 1, wherein the ship size estimation network framework Ψ constructed in 5c) has the following layer parameters:
the first convolutional layer L1 takes the input image data x1 of size 128×128×1; its convolution kernel K1 has a 5×5 window, sliding stride SL1 = 1, and padding P = 0; it convolves the input data and outputs 16 feature maps Y1 of size 124×124×16;
the first max-pooling layer P1 takes Y1 as input, with padding P = 0, a 2×2 pooling kernel U1, and stride SP1 = 2; it downsamples the input data and outputs the feature map Y2 of size 62×62×16;
the second convolutional layer L2 takes Y2 as input; its kernel K2 has a 5×5 window, stride SL2 = 1, and padding P = 0; it outputs 32 feature maps Y3 of size 58×58×32;
the second max-pooling layer P2 takes Y3 as input, with padding P = 0, a 2×2 kernel U2, and stride SP2 = 2; it outputs the feature map Y4 of size 29×29×32;
the third convolutional layer L3 takes Y4 as input; its kernel K3 has a 6×6 window, stride SL3 = 1, and padding P = 0; it outputs 64 feature maps Y5 of size 24×24×64;
the third max-pooling layer P3 takes Y5 as input, with padding P = 0, a 2×2 kernel U3, and stride SP3 = 2; it outputs the feature map Y6 of size 12×12×64;
the fourth convolutional layer L4 takes Y6 as input; its kernel K4 has a 3×3 window, stride SL4 = 1, and padding P = 0; it outputs 128 feature maps Y7 of size 10×10×128;
the fourth max-pooling layer P4 takes Y7 as input, with padding P = 0, a 2×2 kernel U4, and stride SP4 = 2; it outputs the feature map Y8 of size 5×5×128;
the fifth convolutional layer L5 takes Y8 as input; its kernel K5 has a 3×3 window, stride SL5 = 1, and padding P = 0; it outputs 64 feature maps Y9 of size 3×3×64;
the sixth convolutional layer L6 takes Y9 as input; its kernel K6 has a 3×3 window, stride SL6 = 1, and padding P = 0; it outputs one feature map Y10 of size 1×1×1, giving the preliminary target size.
6. The method according to claim 1, wherein inputting the training samples into the constructed target size estimation network framework Ψ for training in (6) is implemented as follows:
6a) inputting the training samples into the framework Ψ and computing the loss of the network output layer:
loss = (ytrue - ypre)²
where ytrue is the true size of the sample and ypre is the estimated size of the sample target;
6b) propagating the output-layer loss back through the network with the back-propagation algorithm, computing the gradient vector of the loss function by stochastic gradient descent, and updating the weights of every layer in the network;
6c) repeating 6b), iterating and updating the weights until the loss function converges, to obtain the trained network model.
7. The method according to claim 1, wherein training the GBDT classifier with the multi-dimensional feature vector h1 of the training samples in (9) is implemented as follows:
9a) initializing the gradient boosting regression tree model f0 = 0;
9b) training the gradient boosting regression tree model for iterations t = 1, ..., 10:
9b1) taking the multi-dimensional features h1 of the training samples as input and fitting the residual rt-1 to obtain a trained regression tree model ft:
rt-1 = ytrue - ypre
where ytrue is the true size of the sample and ypre is the estimated size output by the regression tree model ft-1;
9b2) updating the gradient boosting regression tree model ft = ft-1 + ν·ft, where ν is the learning rate;
9b3) iterating, updating the gradient boosting regression tree model, until the iteration count reaches t = 10, obtaining the trained gradient boosting regression tree model ζ.
CN201910526996.2A 2019-06-18 2019-06-18 Ship size estimation method based on convolutional neural network and multi-feature fusion Active CN110232362B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910526996.2A CN110232362B (en) 2019-06-18 2019-06-18 Ship size estimation method based on convolutional neural network and multi-feature fusion


Publications (2)

Publication Number Publication Date
CN110232362A CN110232362A (en) 2019-09-13
CN110232362B true CN110232362B (en) 2023-04-07

Family

ID=67859639

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910526996.2A Active CN110232362B (en) 2019-06-18 2019-06-18 Ship size estimation method based on convolutional neural network and multi-feature fusion

Country Status (1)

Country Link
CN (1) CN110232362B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111027454B (en) * 2019-12-06 2023-03-10 西安电子科技大学 SAR ship target classification method based on deep dense connection and metric learning
CN111222574B (en) * 2020-01-07 2022-04-05 西北工业大学 Target detection and classification method for ships and civilian ships based on multi-model decision-level fusion
CN111414972B (en) * 2020-03-30 2023-09-05 王雁 XGBoost-based eye refraction correction multisource data automatic classification method
CN113050090B (en) * 2021-03-28 2023-08-15 自然资源部国土卫星遥感应用中心 Dual-polarized HH, HV radar image feature fusion enhancement method
CN113658147B (en) * 2021-08-23 2024-03-29 宁波棱镜空间智能科技有限公司 Workpiece size measuring device and method based on deep learning
CN113989557A (en) * 2021-10-25 2022-01-28 电子科技大学 A ship classification method based on SAR images fused with dual-polarization features
CN114781528B (en) * 2022-04-24 2025-03-18 西安理工大学 SAR image scene classification method based on online gradient boosting

Citations (3)

Publication number Priority date Publication date Assignee Title
CN106874889A (en) * 2017-03-14 2017-06-20 西安电子科技大学 Multiple features fusion SAR target discrimination methods based on convolutional neural networks
WO2018000752A1 (en) * 2016-06-27 2018-01-04 浙江工商大学 Monocular image depth estimation method based on multi-scale cnn and continuous crf
CN109063594A (en) * 2018-07-13 2018-12-21 吉林大学 Remote sensing images fast target detection method based on YOLOv2


Also Published As

Publication number Publication date
CN110232362A (en) 2019-09-13


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant