CN117437530A

CN117437530A - Synthetic aperture sonar twin matching identification method and system for small targets of interest

Info

Publication number: CN117437530A
Application number: CN202311320138.5A
Authority: CN
Inventors: 李宝奇; 黄海宁; 刘纪元; 刘正君; 韦琳哲
Original assignee: Institute of Acoustics CAS
Current assignee: Institute of Acoustics CAS
Priority date: 2023-10-12
Filing date: 2023-10-12
Publication date: 2024-01-23

Abstract

The invention provides a synthetic aperture sonar interesting small target twin matching identification method and a system. The method comprises the following steps: processing the received echo data of the synthetic aperture sonar to obtain a real-time synthetic aperture sonar image, and performing coarse recognition by adopting a target detection method to obtain a small target image of interest; inputting the small target image of interest into a pre-established and trained twin matching model, and matching with different reference image groups one by one to obtain the similarity with each reference image group; determining a recognition result according to the sorting result of the similarity, and realizing refined recognition; the twin matching model adopts a Siam FC network structure, and the backbone network is an improved MobileNet V3 network. The invention provides an effective solution for the task of autonomous underwater small target identification of the SAS image under the condition of rare samples.

Description

Synthetic aperture sonar twin matching identification method and system for small targets of interest

技术领域Technical field

本发明涉及水声信号处理领域，尤其涉及合成孔径声纳感兴趣小目标孪生匹配识别方法及系统。The invention relates to the field of underwater acoustic signal processing, and in particular to a synthetic aperture sonar twin matching identification method and system for small targets of interest.

背景技术Background technique

合成孔径声纳(Synthetic Aperture Sonar，SAS)是一种高分辨率水下成像声纳，其基本原理是利用小孔径基阵的移动形成虚拟大孔径，从而获得方位向的高分辨率。与普通侧扫声纳相比，SAS最为显著的优点是方位向分辨率较高，且理论分辨率与目标距离以及采用的声波频段无关。合成孔径声纳图像目标检测任务在水下无人平台自主导航和搜索发挥着重要作用。目标检测可以定位感兴趣小目标的位置，不过受限于样本数量少无法实现进一步精确化识别。Synthetic Aperture Sonar (SAS) is a high-resolution underwater imaging sonar. Its basic principle is to use the movement of a small aperture matrix to form a virtual large aperture, thereby obtaining high resolution in the azimuth direction. Compared with ordinary side scan sonar, the most significant advantage of SAS is its higher azimuth resolution, and the theoretical resolution has nothing to do with the target distance and the acoustic frequency band used. The synthetic aperture sonar image target detection task plays an important role in the autonomous navigation and search of underwater unmanned platforms. Object detection can locate the location of small targets of interest, but is limited by the small number of samples and cannot achieve further precise identification.

目标匹配是目标识别的重要手段，通常目标匹配识别可以分为基于特征的匹配和基于卷积神经网络的匹配。基于特征的匹配识别是通过提取模板中目标的特征与待识别图像中的特征进行匹配计算。不过，该类方法并不适用于水下感兴趣小目标。因为SAS图像内小目标的尺度较小，这也导致了目标中纹理信息的缺失,相应地也难以提取到有效的特征描述子。另外，尺度过小的目标容易受到环境噪声的影响，而具有稀疏检测特性的特征点对于噪声十分敏感，这导致特征点方法很难准确地描述水下感兴趣小目标。Target matching is an important means of target recognition. Generally, target matching recognition can be divided into feature-based matching and convolutional neural network-based matching. Feature-based matching recognition is performed by extracting the features of the target in the template and the features in the image to be recognized for matching calculation. However, this type of method is not suitable for small underwater targets of interest. Because the scale of small targets in SAS images is small, this also leads to the lack of texture information in the targets, and accordingly it is difficult to extract effective feature descriptors. In addition, targets that are too small are easily affected by environmental noise, and feature points with sparse detection characteristics are very sensitive to noise, which makes it difficult for the feature point method to accurately describe small underwater targets of interest.

近些年来，由于卷积神经网络(Convolutional Neural Networks，CNN)的发展，一些基于CNN网络模型的判别式目标匹配算法以其优异的匹配性能获得了研究人员广泛的关注。一些学者提出基于CNN网络的目标匹配方法，该类方法采用CNN提取候选样本与目标模板的深度特征，通过学习一个精确、鲁棒的相似度量函数，实现深度特征之间的精确匹配。最近几年，孪生卷积神经网络凭借在学习相似度量函数上的极大优势，Bertinetto等人提出的Siam FC在特征匹配上表现优异，但算法的实时性有待提升。Howard等提出了轻量化的卷积神经网络MobileNet V1。MobileNet V1用深度可分离卷积(depthwise separableconvolution，DSC)替换标准卷积来减少模型的参数和计算量。Sandler等提出了MobileNetV1的改进版本MobileNet V2。MobileNet V2在深度可分离卷积的基础上引入了跨连接(shortcut connections)结构，并设计了新的特征提取模块IRB(inverted residualblock)。新模块将原来的先“压缩”后“扩张”调整为先“扩张”后“压缩”，同时为了降低激活函数在高维信息向低维信息转换时的丢失和破坏，将最后卷积层的激活层由非线性更改为线性。Howard等在MobileNet V1和MobileNet V2的基础上提出了改进版本MobileNet V3和特征提取模块IRB⁺，IRB⁺引入了SE(squeeze and excitation)注意力机制。SE首先对卷积得到的特征进行squeeze操作，得到全局特征，然后对全局特征进行excitation操作，得到不同特征的权重，最后乘以对应通道的特征得到最终特征。本质上，SE组件是在特征维度上做选择，这可以更加关注信息量最大的特征，而抑制那些不重要的特征。IRB⁺在保持较低计算量的同时，具有更好的特征提取能力。不过，IRB⁺捕获所有通道的依赖关系是低效并且是不必要的。In recent years, due to the development of convolutional neural networks (CNN), some discriminative target matching algorithms based on CNN network models have received widespread attention from researchers due to their excellent matching performance. Some scholars have proposed target matching methods based on CNN networks. This method uses CNN to extract deep features of candidate samples and target templates, and achieves accurate matching between deep features by learning an accurate and robust similarity measure function. In recent years, Siamese convolutional neural networks have great advantages in learning similarity metric functions. Siam FC proposed by Bertinetto et al. has excellent performance in feature matching, but the real-time performance of the algorithm needs to be improved. Howard et al. proposed a lightweight convolutional neural network MobileNet V1. MobileNet V1 replaces standard convolution with depthwise separable convolution (DSC) to reduce the parameters and calculation amount of the model. Sandler et al. proposed MobileNet V2, an improved version of MobileNetV1. MobileNet V2 introduces a shortcut connections structure based on depth-separable convolution, and designs a new feature extraction module IRB (inverted residualblock). The new module adjusts the original "compression" and then "expansion" to "expansion" and then "compression". At the same time, in order to reduce the loss and damage of the activation function when converting high-dimensional information to low-dimensional information, the final convolution layer is The activation layer is changed from non-linear to linear. Howard et al. proposed an improved version of MobileNet V3 and feature extraction module IRB ⁺ based on MobileNet V1 and MobileNet V2. IRB ⁺ introduced the SE (squeeze and excitation) attention mechanism. SE first performs a squeeze operation on the features obtained by convolution to obtain the global features, then performs an excitation operation on the global features to obtain the weights of different features, and finally multiplies the features of the corresponding channels to obtain the final features. In essence, the SE component makes selections in the feature dimension, which can pay more attention to the features with the most information and suppress those unimportant features. IRB ⁺ has better feature extraction capabilities while maintaining a low computational load. However, IRB ⁺ capturing all channel dependencies is inefficient and unnecessary.

综上所述，目前急需一种适用水下感兴趣小目标的识别方法，以提高样本稀少条件下小目标精细化识别的准确率与效率。In summary, there is an urgent need for an identification method suitable for small targets of interest underwater to improve the accuracy and efficiency of refined identification of small targets under conditions of sparse samples.

发明内容Contents of the invention

本发明的目的在于克服现有技术缺陷，提出了合成孔径声纳感兴趣小目标孪生匹配识别方法及系统。The purpose of the present invention is to overcome the shortcomings of the existing technology and propose a synthetic aperture sonar twin matching identification method and system for small targets of interest.

为了实现上述目的，本发明提出了一种合成孔径声纳感兴趣小目标孪生匹配识别方法，所述方法包括：In order to achieve the above objectives, the present invention proposes a synthetic aperture sonar twin matching identification method for small targets of interest, which method includes:

对接收到的合成孔径声纳的回波数据进行处理，得到实时合成孔径声纳图像，采用目标检测方法进行粗识别，得到感兴趣小目标图像；Process the received echo data of synthetic aperture sonar to obtain real-time synthetic aperture sonar images, and use the target detection method to perform rough identification to obtain images of small targets of interest;

将感兴趣小目标图像输入预先建立和训练好的孪生匹配模型，与不同的参考图像组逐一进行匹配，得到与每个参考图像组的相似度；Input the small target image of interest into the pre-established and trained twin matching model, match it with different reference image groups one by one, and obtain the similarity with each reference image group;

根据相似度的排序结果确定识别结果，实现精细化识别；Determine the recognition result based on the ranking results of similarity to achieve refined recognition;

所述孪生匹配模型为改进的Siam FC网络结构，骨干网络为改进的MobileNet V3网络。The twin matching model is an improved Siam FC network structure, and the backbone network is an improved MobileNet V3 network.

优选的，所述采用目标检测方法进行粗识别，得到感兴趣小目标图像；具体包括：Preferably, the target detection method is used for rough identification to obtain images of small targets of interest; specifically including:

采用目标检测方法进行粗识别，将检测结果保存为固定尺寸的图像，得到感兴趣小目标图像。The target detection method is used for rough recognition, and the detection results are saved as fixed-size images to obtain small target images of interest.

优选的，所述改进的MobileNet V3网络采用EIRB模块替换原MobileNet V3中的IRB⁺模块；所述EIRB模块采用反残差网络结构，包括两个支路网络，其中上侧支路网络保持输入特征D不变，下侧支路网络用于对水下感兴趣小目标的特征提取和选择，并与上侧支路网络的输出特征相加；所述下侧支路网络包括：扩张层、通道可选择组件和压缩层，其中，Preferably, the improved MobileNet V3 network uses an EIRB module to replace the IRB ⁺ module in the original MobileNet V3; the EIRB module adopts an inverse residual network structure, including two branch networks, in which the upper branch network maintains the input characteristics. D remains unchanged, the lower branch network is used to extract and select features of small underwater targets of interest, and is added to the output features of the upper branch network; the lower branch network includes: expansion layer, channel Optional components and compression layers, where

所述扩张层，用于对输入特征通道的扩张；卷积核尺寸为1×1，卷积核的数量为输入特征通道的K倍；The expansion layer is used to expand the input feature channels; the convolution kernel size is 1×1, and the number of convolution kernels is K times the input feature channels;

所述通道可选择组件，用于通过学习权重选择包含重要信息的通道；The channel selectable component is used to select channels containing important information by learning weights;

所述压缩层，用于将特征通道压缩成与输入特征一致的数量。The compression layer is used to compress the feature channel into a quantity consistent with the input feature.

优选的，所述EIRB模块的处理过程包括：Preferably, the processing process of the EIRB module includes:

输入特征D∈Φ^H×H×M分别进入两个支路网络，其中H×H为输入特征的尺寸，M为输入特征的通道数，Φ为特征尺寸；The input features D∈Φ ^H×H×M enter the two branch networks respectively, where H×H is the size of the input feature, M is the number of channels of the input feature, and Φ is the feature size;

对于下侧支路网络，输入特征D经过扩张层F_ex后的特征D_ex为：For the lower branch network, the feature D _ex of the input feature D after passing through the expansion layer F _ex is:

D_ex＝F_ex(D)D _ex =F _ex (D)

D_ex进入通道可选择组件后的输出特征D_se为：The output feature D _se after D _ex enters the channel selectable component is:

D_se＝s·D_ex D _se =s·D _ex

s＝f_h(f₃(P_g(D_ex)))s＝f _h (f ₃ (P _g (D _ex )))

式中，D_se∈Φ^H×H×(K×M)，s为通道的选择系数，s∈Φ^1×(K×M)；P_g为全局池化函数，Φ^1×(K×M)为输出特征维度，f₃为卷积核尺寸为3的一维卷积层，f_h为hard swish激活函数， In the formula, D _se ∈Φ ^H×H×(K×M) , s is the selection coefficient of the channel, s∈Φ ^1×(K×M) ; P _g is the global pooling function, Φ ^{1×(K×M )} is the output feature dimension, f ₃ is a one-dimensional convolution layer with a convolution kernel size of 3, f _h is the hard swish activation function,

由压缩层对D_se进行通道压缩，得到通道压缩后的特征D'，输出特征维度为Φ^H×H×M：The compression layer performs channel compression on D _se to obtain the channel-compressed feature D', and the output feature dimension is Φ ^H×H×M :

D′＝F_sq(D_se)D′＝F _sq (D _se )

将经过上侧支路网络的输入特征D与下侧支路网络输出的D'相加，得到输出特征为：Add the input feature D that passes through the upper branch network and the output D' of the lower branch network to get the output feature for:

式中， In the formula,

优选的，，所述孪生匹配模型还包括相似度结果计算模块，用于采用余弦距离作为相似度量函数，计算待识别的感兴趣小目标图像的输出特征x与每个参考图像组输出特征y的平均相似度sim(x,y)：Preferably, the twin matching model also includes a similarity result calculation module for using cosine distance as a similarity measure function to calculate the difference between the output feature x of the small target image of interest to be identified and the output feature y of each reference image group. Average similarity sim(x,y):

其中，N为该参考图像组的参考图像个数，y_i第i个参考图像的输出特征。Among them, N is the number of reference images in the reference image group, and y _is the output feature of the i-th reference image.

优选的，所述方法还包括孪生匹配模型的训练步骤，具体包括：Preferably, the method also includes a training step of the twin matching model, specifically including:

建立训练集；Create a training set;

将训练集数据依次输入改进的Siam FC网络进行模型训练，待满足训练要求，得到训练好的孪生匹配模型。The training set data is sequentially input into the improved Siam FC network for model training. When the training requirements are met, the trained twin matching model is obtained.

另一方面，本发明提出了一种合成孔径声纳感兴趣小目标孪生匹配识别系统，所述系统包括：On the other hand, the present invention proposes a synthetic aperture sonar twin matching and identification system for small targets of interest, which system includes:

粗识别模块，用于对接收到的合成孔径声纳的回波数据进行处理，得到实时合成孔径声纳图像，采用目标检测方法进行粗识别，得到感兴趣小目标图像；The rough identification module is used to process the received echo data of synthetic aperture sonar to obtain real-time synthetic aperture sonar images, and uses the target detection method to perform rough identification to obtain images of small targets of interest;

精细识别模块，用于将感兴趣小目标图像输入预先建立和训练好的孪生匹配模型，与不同的参考图像组逐一进行匹配，实现精细化识别，得到每个参考图像组的相似度；和The fine recognition module is used to input small target images of interest into the pre-established and trained twin matching model, and match them with different reference image groups one by one to achieve fine recognition and obtain the similarity of each reference image group; and

结果输出模块，用于根据相似度的排序结果确定识别结果；The result output module is used to determine the recognition result based on the ranking results of similarity;

优选的，所述孪生匹配模型部署在边缘计算平台。Preferably, the twin matching model is deployed on an edge computing platform.

与现有技术相比，本发明的优势在于：Compared with the existing technology, the advantages of the present invention are:

1、本发明将合成孔径声纳图像目标检测结果与特征匹配结果相结合，提出了一种水下感兴趣小目标识别方法，以粗精两级的方式解决样本稀少条件下现有方法对水下感兴趣小目标识别准确率低的问题；该方法首先利用目标检测模型对感兴趣小目标进行检测，再利用孪生网络对感兴趣小目标进行识别，为样本稀少条件下的SAS图像水下小目标自主识别任务提供了一种有效解决的手段；1. The present invention combines the target detection results of synthetic aperture sonar images with the feature matching results, and proposes a method for identifying small underwater targets of interest, which solves the problem of existing methods under the condition of sparse samples in a two-level coarse-fine and fine-grained manner. This method solves the problem of low identification accuracy of small targets of interest; this method first uses a target detection model to detect small targets of interest, and then uses the twin network to identify small targets of interest, and provides underwater small targets for SAS images under the condition of sparse samples. The autonomous target recognition task provides an effective means to solve it;

2、改进的MobileNet V3网络引入了ECA注意力机制，ECA的核心思想是关注通道(channel)之间的关系，而不是像传统的自注意力机制(如Transformer)那样关注不同位置之间的关系，因此可以显著提高网络的性能；另外，ECA注意力机制使用简单的1x1卷积核来计算通道注意力权重，因此引入的额外参数非常有限；2. The improved MobileNet V3 network introduces the ECA attention mechanism. The core idea of ECA is to focus on the relationship between channels, rather than focusing on the relationship between different positions like traditional self-attention mechanisms (such as Transformer). , so the performance of the network can be significantly improved; in addition, the ECA attention mechanism uses a simple 1x1 convolution kernel to calculate the channel attention weight, so the additional parameters introduced are very limited;

3、改进的Siam FC网络结构将待识别图像与一组参考图像进行匹配，并计算平均相似度，这大大提高了匹配结果的可靠性，并且参考图像组可以现场添加和删减，提高了网络的快速部署能力。3. The improved Siam FC network structure matches the image to be recognized with a set of reference images and calculates the average similarity, which greatly improves the reliability of the matching results, and the reference image group can be added and deleted on-site, improving the network rapid deployment capabilities.

附图说明Description of the drawings

图1是本发明提供的合成孔径声纳水下感兴趣小目标识别方法与系统实现框架；Figure 1 is a synthetic aperture sonar underwater small target identification method and system implementation framework provided by the present invention;

图2是是改进的特征提取模块结构示意图；Figure 2 is a schematic structural diagram of the improved feature extraction module;

图3是孪生匹配模型结构示意图。Figure 3 is a schematic diagram of the structure of the twin matching model.

具体实施方式Detailed ways

为了解决上述技术问题，本发明提出了一种合成孔径声纳图像感兴趣小目标识别方法及系统实现，实现了水下感兴趣小目标的精细化识别；该方法是一种粗精两级的水下目标识别方法，通过目标检测模型和孪生匹配网络，提高了样本稀少条件下的水下感兴趣小目标的识别精度。In order to solve the above technical problems, the present invention proposes a method and system implementation for identifying small targets of interest in synthetic aperture sonar images, achieving refined identification of small targets of interest underwater; the method is a two-level coarse and fine method. The underwater target recognition method improves the recognition accuracy of underwater small targets of interest under the condition of sparse samples through the target detection model and twin matching network.

为了实现上述目的，本发明提供了一种合成孔径声纳图像感兴趣小目标识别方法及系统实现，其特征在于，所述模型包括：感兴趣小目标检测模块、数据集制作模块、模型训练模块和特征匹配模块；In order to achieve the above purpose, the present invention provides a method and system implementation for identifying small targets of interest in synthetic aperture sonar images, which is characterized in that the model includes: a small target of interest detection module, a data set production module, and a model training module. and feature matching module;

所述数据集制作模块，用于标注、并制作目标分类数据集；The data set production module is used to annotate and produce target classification data sets;

所述模型训练模块，用于对孪生网络进行参数初始化、训练和测试；The model training module is used for parameter initialization, training and testing of the twin network;

所述特征匹配模块，用于感兴趣小目标图像与目标模板的匹配，实时感兴趣小目标的精细化识别。The feature matching module is used for matching small target images of interest and target templates, and for refined identification of small targets of interest in real time.

所述回波提取模块进一步包括：合成孔径声纳图像子模块、目标检测子模块和待识别感兴趣小目标子模块；The echo extraction module further includes: a synthetic aperture sonar image sub-module, a target detection sub-module and a small target of interest sub-module to be identified;

所述合成孔径声纳图像子模块，用于对接收到的阵元数据进行处理，得到实时合成孔径声纳图像；The synthetic aperture sonar image sub-module is used to process the received array element data to obtain real-time synthetic aperture sonar images;

所述目标检测子模块，用于对合成孔径声纳图像中感兴趣小目标进行检测；The target detection sub-module is used to detect small targets of interest in synthetic aperture sonar images;

所述待识别感兴趣小目标子模块，用于将目标检测子模块的检测结果保存为固定尺寸的图像；The small target of interest to be identified sub-module is used to save the detection results of the target detection sub-module as a fixed-size image;

可选的，所述数据集制作模块子模块进一步包括：数据采集子模块、数据标注子模块和目标分类数据集制作子模块；Optionally, the data set production module sub-module further includes: data collection sub-module, data annotation sub-module and target classification data set production sub-module;

所述数据采集子模块，收集目标图像；The data collection sub-module collects target images;

所述数据标注子模块，结合任务需求对采集图像进行标注；The data annotation sub-module annotates the collected images in combination with task requirements;

所述目标检测数据集制作子模块，按照标准目标分类数据集格式，将数据随机划分为训练集和测试集。The target detection data set production sub-module randomly divides the data into a training set and a test set according to the standard target classification data set format.

可选的，其特征在于，所述模型训练子模块进一步包括：参数设置模块，孪生网络模块和模型测试模块。Optionally, it is characterized in that the model training sub-module further includes: a parameter setting module, a twin network module and a model testing module.

所述参数设置子模块，用于完成模型训练所需参数初始化工作；The parameter setting sub-module is used to complete the initialization of parameters required for model training;

所述孪生网络子模块，用于实现感兴趣小目标匹配识别；The twin network sub-module is used to achieve matching and identification of small targets of interest;

所述模型测试模块子模块，用于实时监视模型训练状态。The model testing module sub-module is used to monitor the model training status in real time.

可选的，所述特征匹配子模块进一步包括：目标模板子模块、孪生网络子模块和结果输出子模块；Optionally, the feature matching sub-module further includes: a target template sub-module, a twin network sub-module and a result output sub-module;

所述目标模板子模块，用于管理和存储水下感兴趣小目标模板；The target template submodule is used to manage and store underwater small target templates of interest;

所述孪生网络子模块，用于待识别感兴趣小目标图像与目标模板的匹配；The twin network sub-module is used to match the small target image of interest to be identified and the target template;

所述结果输出子模块，用于待识别感兴趣小目标图像相似度的显示输出。The result output sub-module is used to display and output the image similarity of the small target of interest to be identified.

下面结合附图和实施例对本发明的技术方案进行详细的说明。The technical solution of the present invention will be described in detail below with reference to the accompanying drawings and examples.

实施例1Example 1

本发明包括点合成孔径声纳图像预处理模块、数据集制作模块、模型训练模块和平台部署模块。第一步，采集目标图像、对样本进行标注并生成图像分类数据集；第二步，初始化训练参数，对孪生网络进行训练，并对训练结果进行质量评估；第三步，将检测到的水下感兴趣小目标与目标模板进行匹配，实现精细化识别。其总体流程框图如图1所示，具体步骤如下：The invention includes a point synthetic aperture sonar image preprocessing module, a data set production module, a model training module and a platform deployment module. The first step is to collect the target image, label the sample and generate an image classification data set; the second step is to initialize the training parameters, train the twin network, and evaluate the quality of the training results; the third step is to collect the detected water. Small targets of interest are matched with the target template to achieve refined recognition. The overall flow chart is shown in Figure 1, and the specific steps are as follows:

步骤1、目标分类数据集制作Step 1. Creation of target classification data set

步骤1-1、收集目标图像样本；Step 1-1. Collect target image samples;

步骤1-2、对图像样本进行分类；Step 1-2, classify the image samples;

步骤1-3、按照目标分类数据集格式，采用随机划分的原则，对标注好的图像进行划分，分为训练样本集和测试样本集。Steps 1-3: According to the format of the target classification data set, use the principle of random division to divide the annotated images into training sample sets and test sample sets.

步骤2、模型训练Step 2. Model training

步骤2-1、在深度学习服务器上搭建训练平台所需的环境，包括开源软件Anaconda、Pytorch以及Torchvision等，同时对模型训练初始化参数进行设定，包括batchsize、epoch和validation_epochs等；Step 2-1. Build the environment required for the training platform on the deep learning server, including open source software Anaconda, Pytorch, and Torchvision, etc., and set the model training initialization parameters, including batchsize, epoch, validation_epochs, etc.;

步骤2-2、搭建改进的特征提取模块(EIRB,Efficient Inverted ResidualBlock)，如图2所示。EIRB模块采用了反残差网络结构，即先对通道采取先“扩张”后“压缩”的策略，并由扩张层、通道可选择组件和压缩层组成，其中扩张层负责输入特征通道扩张；通道可选择组件通过学习权重选择包含重要信息的通道；压缩层负责将特征通道压缩成与输入特征一致的数量。Step 2-2. Build an improved feature extraction module (EIRB, Efficient Inverted ResidualBlock), as shown in Figure 2. The EIRB module adopts an inverse residual network structure, that is, the channel is first "expanded" and then "compressed", and is composed of an expansion layer, a channel selectable component, and a compression layer. The expansion layer is responsible for the expansion of the input feature channel; the channel The selectable component selects channels containing important information by learning weights; the compression layer is responsible for compressing the feature channels into a number consistent with the input features.

对于一个任意的输入特征D∈Φ^H×H×M，其中H×H为输入特征的尺寸，M为输入特征的通道数。输入特征D进入EIRB模块的两个支路网络：下侧支路负责水下感兴趣小目标特征提取和选择；上侧支路保持输入特征D不变，并最后与顶层支路网络的输出特征相加。对于下侧支路网络，输入特征D首先经过扩张层，其输出特征的数学表达为：For an arbitrary input feature D∈Φ ^H×H×M , where H×H is the size of the input feature and M is the number of channels of the input feature. The input feature D enters the two branch networks of the EIRB module: the lower branch is responsible for extracting and selecting features of small underwater targets of interest; the upper branch keeps the input feature D unchanged, and finally combines it with the output features of the top branch network Add up. For the lower branch network, the input feature D first passes through the expansion layer, and the mathematical expression of its output feature is:

D_ex＝F_ex(D),D∈Φ^H×H×M (1)D _ex ＝F _ex (D),D∈Φ ^H×H×M (1)

式中,D为原始输入特征；D_ex为经过扩张层后的特征，扩张层的卷积核尺寸为1×1，卷积核的数量为输入特征通道的K倍，即K×M。In the formula, D is the original input feature; D _ex is the feature after the expansion layer. The convolution kernel size of the expansion layer is 1×1, and the number of convolution kernels is K times the input feature channels, that is, K×M.

然后，输出特征D_ex送入ECA通道选择组件，其输出特征的数学表达式为：Then, the output feature D _ex is sent to the ECA channel selection component, and the mathematical expression of its output feature is:

D_se＝s·D_ex (2)D _se =s·D _ex (2)

s＝f_h(f₃(P_g(D_ex))) (3)s＝f _h (f ₃ (P _g (D _ex ))) (3)

式中,D_se为通道选择后的通道特征；s为通道的选择系数，s∈Φ^1×(K×M)；P_g()为全局池化函数，输出特征维度为Φ^1×(K×M)；f₃为卷积核尺寸为3的一维卷积层，输出特征维度为Φ^1×(K×M)；f_h为hard swish激活函数， In the formula, D _se is the channel feature after channel selection; s is the selection coefficient of the channel, s∈Φ ^1×(K×M) ; P _g () is the global pooling function, and the output feature dimension is Φ ^{1×(K ×M)} ; f ₃ is a one-dimensional convolution layer with a convolution kernel size of 3, and the output feature dimension is Φ ^1×(K×M) ; f _h is a hard swish activation function,

接着，对D_se进行通道压缩，数学表达式为：Next, channel compression is performed on D _se , and the mathematical expression is:

D'＝F_sq(D_se),D_se∈Φ^H×H×(K×M) (4)D'＝F _sq (D _se ),D _se ∈Φ ^H×H×(K×M) (4)

式中,D'为通道压缩后的特征,输出特征维度为Φ^H×H×M。In the formula, D' is the feature after channel compression, and the output feature dimension is Φ ^H×H×M .

通过上面的计算，最后可以得到EIRB模块的输出特征数学表达式为：Through the above calculation, the mathematical expression of the output characteristics of the EIRB module can finally be obtained:

式中,为EIRB模块的输出特征，/>特征图尺寸为H×H，通道数为M。In the formula, is the output feature of the EIRB module,/> The feature map size is H×H, and the number of channels is M.

步骤2-3搭建改进的MobileNet V3网络。改进的MobileNet V3网络如表1所示，其中各卷积层输入通道、中间通道、输出通道、步长和激活函数(RE表示ReLU激活函数，HS表示H-Swich激活函数)与原MobileNet V3网络保持一致；不同的是改进的MobileNet V3利用EIRB替换原MobileNet V3中IRB⁺模块。Step 2-3 build an improved MobileNet V3 network. The improved MobileNet V3 network is shown in Table 1, in which the input channel, intermediate channel, output channel, step size and activation function of each convolutional layer (RE represents the ReLU activation function, HS represents the H-Swich activation function) are the same as the original MobileNet V3 network Remain consistent; the difference is that the improved MobileNet V3 uses EIRB to replace the IRB ⁺ module in the original MobileNet V3.

表1改进的MobileNet V3网络结构Table 1 Improved MobileNet V3 network structure

步骤2-4、实时监测孪生网络的训练过程、以及测试结果，当评价指标满足要求时停止训练。Steps 2-4: Monitor the training process and test results of the twin network in real time, and stop training when the evaluation indicators meet the requirements.

步骤3、感兴趣小目标精细化识别Step 3. Refined identification of small targets of interest

步骤3-1、在边缘计算平台上搭建目标检测和孪生匹配所需的运行环境。Step 3-1. Build the operating environment required for target detection and twin matching on the edge computing platform.

步骤3-2、目标检测模型部署。将训练好的目标检测模型部署到平台端，并设定输入数据格式以及输出数据格式。Step 3-2. Target detection model deployment. Deploy the trained target detection model to the platform, and set the input data format and output data format.

步骤3-3、孪生匹配网络搭建和部署。搭建如图3所示的孪生网络结构，骨干网络为改进的MobileNet V3。如果模板与待匹配图像来自同一类正样本对，则CNN提取出的特征会非常相似。反之，若模板与待匹配图像来自负样本对或不同类别的样本，则CNN提取出的特征会有明显的差距。孪生卷积神经网络就是采用这样的方式提升了泛化能力，使其可以在只有少量样本的情况下实现准确、鲁棒地匹配。Step 3-3. Twin matching network construction and deployment. Build a twin network structure as shown in Figure 3, and the backbone network is the improved MobileNet V3. If the template and the image to be matched come from the same category of positive sample pairs, the features extracted by the CNN will be very similar. On the contrary, if the template and the image to be matched come from negative sample pairs or samples of different categories, there will be a significant gap in the features extracted by the CNN. The twin convolutional neural network uses this method to improve the generalization ability, allowing it to achieve accurate and robust matching with only a small number of samples.

步骤3-4、相似度计算和结果输出。在进行相似度计算时，采用余弦距离作为相似度量函数。具体的计算公式如下：Step 3-4, similarity calculation and result output. When calculating similarity, cosine distance is used as the similarity measure function. The specific calculation formula is as follows:

其中x为待识别图像在改进的MobileNet V3网络上的输出特征；y_i为第i个参考图像在改进的MobileNet V3网络上的输出特征；sim(x,y)为待识别图像与参考图像组的平均相似度。where x is the output feature of the image to be recognized on the improved MobileNet V3 network; y _i is the output feature of the i-th reference image on the improved MobileNet V3 network; sim(x,y) is the group of the image to be recognized and the reference image the average similarity.

下面结合仿真实验，对本发明的技术效果作进一步的说明：The following is a further explanation of the technical effects of the present invention in combination with simulation experiments:

实验硬件平台为i7-8750H、内存为32GB(16GB*2)、GPU为2070s(8G)，软件环境为win10、python3.7、Torch1.3.1和Torchvision0.4.2等。输入图像的尺寸为224像素*224像素。为了保证算法公平性，参考图像为一组图像，即同一类型目标包含多幅图像，共设置三类参考目标，非目标、可以目标和疑似目标。实验结果如图表1所示。The experimental hardware platform is i7-8750H, the memory is 32GB (16GB*2), the GPU is 2070s (8G), and the software environment is win10, python3.7, Torch1.3.1 and Torchvision0.4.2, etc. The size of the input image is 224 pixels * 224 pixels. In order to ensure the fairness of the algorithm, the reference image is a set of images, that is, the same type of target contains multiple images, and a total of three types of reference targets are set, namely non-target, possible target and suspected target. The experimental results are shown in Figure 1.

表2目标检测模型性能比较Table 2 Performance comparison of target detection models

非目标non-target 可疑目标suspicious target 疑似目标suspected target 疑似目标(待识别)Suspected target (to be identified) 0.82670.8267 0.88790.8879 0.93030.9303 可疑目标(待识别)Suspicious target (to be identified) 0.86550.8655 0.91920.9192 0.88520.8852 非目标(待识别)Non-target (to be identified) 0.85890.8589 0.84120.8412 0.83480.8348

从表2的实验结果可以发现，本文改进孪生网络可有效对感兴趣小目标类型做进一步的判断。对疑似目标(待识别)而言，相似度最高的为参考疑似目标组(0.9303)，相似度最低的为参考非目标组(0.8267)；对可疑目标(待识别)而言，相似度最高的为参考可疑目标组(0.9192)，相似度最低的为参考非目标组(0.8655)；对非目标(待识别)而言，相似度最高的为参考非目标组(0.8589)，相似度最低的为参考疑似目标组(0.8348)。三类待识别目标与同类目标均具有最大的相似度。From the experimental results in Table 2, it can be found that the improved twin network in this paper can effectively make further judgments on the types of small targets of interest. For the suspected target (to be identified), the highest similarity is the reference suspected target group (0.9303), and the lowest similarity is the reference non-target group (0.8267); for the suspicious target (to be identified), the highest similarity For the reference suspicious target group (0.9192), the lowest similarity is the reference non-target group (0.8655); for non-targets (to be identified), the highest similarity is the reference non-target group (0.8589), and the lowest similarity is Reference suspected target group (0.8348). The three types of targets to be identified have the greatest similarity with similar targets.

实施例2Example 2

本发明的实施例2提出了一种合成孔径声纳感兴趣小目标孪生匹配识别系统，基于实施例1的方法实现，所述系统包括：Embodiment 2 of the present invention proposes a synthetic aperture sonar twin matching and identification system for small targets of interest, which is implemented based on the method of Embodiment 1. The system includes:

粗识别模块，用于对接收到的合成孔径声纳的回波数据进行处理，得到实时合成孔径声纳图像，采用目标检测方法进行粗识别，得到感兴趣小目标图像；The rough identification module is used to process the received echo data of synthetic aperture sonar to obtain real-time synthetic aperture sonar images, and use the target detection method to perform rough identification to obtain images of small targets of interest;

精细识别模块，用于将感兴趣小目标图像输入预先建立和训练好的孪生匹配模型，与不同的参考图像组逐一进行匹配，实现精细化识别，得到与每个参考图像组的相似度；The fine recognition module is used to input the small target image of interest into the pre-established and trained twin matching model, and match it with different reference image groups one by one to achieve fine recognition and obtain the similarity with each reference image group;

所述孪生匹配模型采用改进的Siam FC网络结构，骨干网络为改进的MobileNetV3网络。The twin matching model adopts an improved Siam FC network structure, and the backbone network is an improved MobileNetV3 network.

最后所应说明的是，以上实施例仅用以说明本发明的技术方案而非限制。尽管参照实施例对本发明进行了详细说明，本领域的普通技术人员应当理解，对本发明的技术方案进行修改或者等同替换，都不脱离本发明技术方案的精神和范围，其均应涵盖在本发明的权利要求范围当中。Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention and are not limiting. Although the present invention has been described in detail with reference to the embodiments, those of ordinary skill in the art will understand that modifications or equivalent substitutions may be made to the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention, and they shall all be covered by the scope of the present invention. within the scope of the claims.

Claims

1. A synthetic aperture sonar small target twinning matching identification method, the method comprising:

processing the received echo data of the synthetic aperture sonar to obtain a real-time synthetic aperture sonar image, and performing coarse recognition by adopting a target detection method to obtain a small target image of interest;

inputting the small target image of interest into a pre-established and trained twin matching model, and matching with different reference image groups one by one to obtain the similarity with each reference image group;

determining a recognition result according to the sorting result of the similarity, and realizing refined recognition;

the twin matching model is an improved Siam FC network structure, and the backbone network is an improved MobileNet V3 network.

2. The synthetic aperture sonar small target twinning matching identification method of claim 1, wherein the coarse identification is performed by using a target detection method to obtain a small target image of interest; the method specifically comprises the following steps:

and (3) performing coarse recognition by adopting a target detection method, and storing a detection result as an image with a fixed size to obtain a small target image of interest.

3. The method for identifying twin matching of small target of interest by using synthetic aperture sonar as recited in claim 3, wherein the improved MobileNet V3 network replaces IRB in original MobileNet V3 by EIRB module ⁺ A module; the EIRB module adopts an inverse residual error network structure and comprises two branch networks, wherein an upper branch network keeps an input characteristic D unchanged, and a lower branch network is used for extracting and selecting characteristics of a small underwater object of interest and adding the characteristics with output characteristics of the upper branch network; the lower leg network includes: an expansion layer, a channel selectable component, and a compression layer, wherein,

the expansion layer is used for expanding the input characteristic channel; the size of the convolution kernel is 1 multiplied by 1, and the number of the convolution kernels is K times of the number of the input characteristic channels;

the channel selectable component is used for selecting a channel containing important information through learning weights;

the compression layer is used for compressing the characteristic channels into the quantity consistent with the input characteristics.

4. A synthetic aperture sonar small target twinning match identification method as defined in claim 3, wherein the processing of the EIRB module comprises:

input features D ε Φ ^H×H×M Respectively entering two branch networks, wherein H multiplied by H is the size of an input feature, M is the number of channels of the input feature, and phi is the feature size;

for the lower branch network, the input feature D passes through the expansion layer F _ex Post feature D _ex The method comprises the following steps:

D _ex ＝F _ex (D)

D _ex output feature D after entering channel selectable component _se The method comprises the following steps:

D _se ＝s·D _ex

s＝f _h (f ₃ (P _g (D _ex )))

wherein D is _se ∈Φ ^H×H×(K×M) S is the selection coefficient of the channel, s ε Φ ^1×(K×M) ；P _g For global pooling function, Φ ^1×(K×M) To output feature dimensions, f ₃ One-dimensional convolution layer with convolution kernel size 3, f _h For the hard swish to activate the function,

from compressed layer pair D _se Channel compression is carried out to obtain a characteristic D' after channel compression, and the output characteristic dimension is phi ^H×H×M ：

D′＝F _sq (D _se )

Adding the input characteristic D passing through the upper branch network and the D' output by the lower branch network to obtain an output characteristicThe method comprises the following steps:

in the method, in the process of the invention,

5. a synthetic aperture sonar small target-of-interest twin matching recognition method according to claim 3, wherein the twin matching model further comprises a similarity result calculation module for calculating an average similarity sim (x, y) of the output feature x of the small target image of interest to be recognized and the output feature y of each reference image group using the cosine distance as a similarity metric function:

wherein N is the number of reference images of the reference image group, y _i Output characteristics of the i-th reference image.

6. The method for identifying twin matching of small objects of interest by using synthetic aperture sonar according to claim 1, wherein the method further comprises a training step of a twin matching model, and specifically comprises:

building a training set;

and sequentially inputting the training set data into an improved Siam FC network to perform model training, and obtaining a trained twin matching model when the training requirement is met.

7. A synthetic aperture sonar small target twinning match recognition system, the system comprising:

the coarse recognition module is used for processing the received echo data of the synthetic aperture sonar to obtain a real-time synthetic aperture sonar image, and performing coarse recognition by adopting a target detection method to obtain a small target image of interest;

the fine recognition module is used for inputting the small target image of interest into a pre-established and trained twin matching model, and matching the small target image with different reference image groups one by one to realize fine recognition and obtain the similarity with each reference image group; and

the result output module is used for determining a recognition result according to the sorting result of the similarity;

8. The synthetic aperture sonar small target twinning match recognition system of claim 8, wherein the twinning match model is deployed on an edge computing platform.