CN113435376B - Bidirectional feature fusion deep convolution neural network construction method based on discrete wavelet transform
- Publication number
- CN113435376B (application CN202110760099.5A)
- Authority
- CN
- China
- Prior art keywords
- feature fusion
- fusion module
- wavelet transform
- discrete wavelet
- bidirectional
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS; G06—COMPUTING; CALCULATING OR COUNTING; G06F—ELECTRIC DIGITAL DATA PROCESSING
  - G06F2218/00—Aspects of pattern recognition specially adapted for signal processing
    - G06F2218/08—Feature extraction
      - G06F2218/10—Feature extraction by analysing the shape of a waveform, e.g. extracting parameters relating to peaks
  - G06F18/00—Pattern recognition; G06F18/20—Analysing
    - G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
      - G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    - G06F18/25—Fusion techniques
      - G06F18/253—Fusion techniques of extracted features
- G—PHYSICS; G06—COMPUTING; CALCULATING OR COUNTING; G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
  - G06N3/00—Computing arrangements based on biological models; G06N3/02—Neural networks
    - G06N3/04—Architecture, e.g. interconnection topology
      - G06N3/045—Combinations of networks
    - G06N3/08—Learning methods
Abstract
The invention discloses a method for constructing a bidirectional feature fusion deep convolutional neural network based on the discrete wavelet transform, belonging to the technical fields of image classification and artificial intelligence. The method comprises the following steps: S101, construct a bidirectional feature fusion module composed of a spatial-domain feature fusion module, a pooling operation, and a channel-domain feature fusion module; S102, embed the bidirectional feature fusion module into mainstream network architectures in place of the original pooling method; S103, train and test the network containing the embedded bidirectional feature fusion module on classic image classification datasets. By exploiting the discrete wavelet transform and its inverse, the invention provides a novel spatial-domain feature fusion method that effectively suppresses the information loss caused by direct pooling operations and improves image classification accuracy.
Description
Technical Field
The invention belongs to the technical fields of image classification and artificial intelligence, and specifically relates to a method for constructing a bidirectional feature fusion deep convolutional neural network based on the discrete wavelet transform.
Background
Deep convolutional neural networks have become essential tools for computer vision and image processing tasks such as image classification, object detection, and image restoration. The pooling layer is an important component of a deep convolutional neural network: it enlarges the receptive field, reduces network complexity, increases nonlinearity, and improves the model's generalization ability. Commonly used pooling methods include max pooling, average pooling, mixed pooling, and stochastic pooling. The classic max and average pooling are widely used because their design principles are simple and efficient; however, a major limitation of both is that part of the feature information in the image is lost or weakened as the resolution decreases. Mixed pooling and stochastic pooling establish a probabilistic connection between max and average pooling, and although they inherit the advantages of both, the problems of information loss and weakening remain. Feature information lost or weakened by pooling during feature extraction directly impairs the expressive power of the network, and classification accuracy suffers accordingly.
Researchers have continually sought to reduce the feature information lost by pooling. Strided convolution lowers the resolution of the feature map without discarding data, but it substantially increases the computation and parameter count of the network being trained. It has also been observed that pooling by direct downsampling ignores the differing spatial and channel distributions of low-frequency and high-frequency features, causing aliasing among frequency-domain features; applying a low-pass filter to remove high-frequency features before downsampling can effectively avoid this aliasing. However, because the high-frequency part of a feature map contains a large amount of detail and edge information, simply discarding the high-frequency features still loses feature information and weakens the expressive power of the network.
To solve the above problems, we propose a method for constructing a bidirectional feature fusion deep convolutional neural network based on the discrete wavelet transform.
Summary of the Invention
In view of the deficiencies of the prior art, the present invention provides a method for constructing a bidirectional feature fusion deep convolutional neural network based on the discrete wavelet transform, so as to solve the problems raised in the background above.
To achieve the above object, the present invention provides the following technical solution: a method for constructing a bidirectional feature fusion deep convolutional neural network based on the discrete wavelet transform, comprising the following steps:
S101. Construct a bidirectional feature fusion module based on the discrete wavelet transform and its inverse, the module being composed of a spatial-domain feature fusion module, a pooling operation, and a channel-domain feature fusion module;
S102. Embed the bidirectional feature fusion module into mainstream network architectures in place of the original pooling method;
S103. Train and test the deep convolutional neural network with the embedded bidirectional feature fusion module on classic image classification datasets.
As a further refinement of this technical solution, the spatial-domain feature fusion module performs the fusion of spatial features through the following steps (a sketch follows the list):
S201. Decompose the input features with the discrete wavelet transform, obtaining low-frequency features and high-frequency features in the horizontal, vertical, and diagonal directions;
S202. Reconstruct the sub-bands in groups with the inverse discrete wavelet transform, obtaining four feature groups with the same spatial dimensions as the original features, thereby fusing the spatial features.
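The two steps above can be illustrated with a short sketch. The following is a minimal example, not the patented implementation, using the PyWavelets library on a single 2-D feature map; passing `None` for a sub-band in `pywt.idwt2` treats it as zeros, which is how each grouped reconstruction keeps only its selected bands. Function and variable names are ours.

```python
import numpy as np
import pywt

def spatial_feature_fusion(x, wavelet="haar"):
    # S201: one-level DWT -> low-frequency band (cA) plus horizontal (cH),
    # vertical (cV) and diagonal (cD) high-frequency sub-bands
    cA, (cH, cV, cD) = pywt.dwt2(x, wavelet)
    # S202: grouped IDWT reconstructions; None sub-bands are treated as zeros,
    # so every output has the same spatial size as the input
    f_ll = pywt.idwt2((cA, (None, None, None)), wavelet)  # low-frequency only
    f_lh = pywt.idwt2((cA, (cH, None, None)), wavelet)    # low + horizontal detail
    f_hl = pywt.idwt2((cA, (None, cV, None)), wavelet)    # low + vertical detail
    f_hh = pywt.idwt2((cA, (None, None, cD)), wavelet)    # low + diagonal detail
    return f_ll, f_lh, f_hl, f_hh

x = np.random.rand(32, 32).astype(np.float32)
print([f.shape for f in spatial_feature_fusion(x)])  # four (32, 32) maps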
As a further refinement, the grouped reconstruction consists of four reconstructions: the low-frequency features alone; the low-frequency with the horizontal high-frequency features; the low-frequency with the vertical high-frequency features; and the low-frequency with the diagonal high-frequency features.
As a further refinement, the four feature groups obtained by the spatial-domain feature fusion module are spatial features with similar distributions and comparable influence on the loss function; each of the four fused feature groups obtained from the spatial domain is pooled separately.
As a further refinement, the spatial-domain feature fusion module adds no parameters when performing spatial feature fusion with the discrete wavelet transform and its inverse.
As a further refinement, the channel-domain feature fusion module concatenates the pooled feature groups along the channel dimension and then applies a 1x1 convolution for dimensionality reduction, enabling information interaction so that information from different frequency bands complements and propagates across the channel domain.
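As a sketch of this channel-domain step in PyTorch: four pooled groups of `channels` feature maps each are concatenated and projected back down with a 1x1 convolution. Restoring exactly the original channel width is our assumption; the text only specifies concatenation followed by 1x1 dimensionality reduction.

```python
import torch
import torch.nn as nn

class ChannelFusion(nn.Module):
    """Channel-domain feature fusion: concatenate along channels, then 1x1 conv."""
    def __init__(self, channels):
        super().__init__()
        # 4 pooled groups in, original channel width out (our assumption)
        self.reduce = nn.Conv2d(4 * channels, channels, kernel_size=1)

    def forward(self, f_ll, f_lh, f_hl, f_hh):
        x = torch.cat([f_ll, f_lh, f_hl, f_hh], dim=1)  # channel concatenation
        return self.reduce(x)                           # 1x1 dimensionality reduction
```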
As a further refinement, the feature map passes through the bidirectional feature fusion module, i.e. through the spatial-domain feature fusion module, the pooling operation, and the channel-domain feature fusion module in turn. The dual-domain feature fusion module can therefore be summarized as:
LL, LH, HL, HH = Pooling(f_s(x))
x' = f_c(concat[LL, LH, HL, HH])
Given a feature map x as input, it passes through the spatial feature fusion module f_s, the pooling operation Pooling, and the channel-domain feature fusion module f_c in turn; x' denotes the result after the feature map passes through the bidirectional feature fusion module.
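Putting the three stages together, the following self-contained PyTorch sketch realizes x' = f_c(concat[Pooling(f_s(x))]) with a hand-rolled one-level orthonormal Haar DWT/IDWT (the description also allows other wavelets such as bior2.2). The use of max pooling and the output channel width are our assumptions, not choices fixed by the text.

```python
import torch
import torch.nn as nn

class BidirectionalFeatureFusion(nn.Module):
    """Sketch of x' = f_c(concat[Pooling(f_s(x))]) with a one-level Haar wavelet."""
    def __init__(self, channels):
        super().__init__()
        self.pool = nn.MaxPool2d(2)                         # Pooling (assumed max)
        self.reduce = nn.Conv2d(4 * channels, channels, 1)  # f_c: 1x1 channel fusion

    @staticmethod
    def _haar_dwt(x):
        # Orthonormal 2-D Haar analysis on 2x2 blocks of an (N, C, H, W) tensor
        a, b = x[..., 0::2, 0::2], x[..., 0::2, 1::2]
        c, d = x[..., 1::2, 0::2], x[..., 1::2, 1::2]
        ll = (a + b + c + d) / 2
        lh = (a + b - c - d) / 2   # horizontal detail
        hl = (a - b + c - d) / 2   # vertical detail
        hh = (a - b - c + d) / 2   # diagonal detail
        return ll, lh, hl, hh

    @staticmethod
    def _haar_idwt(ll, lh, hl, hh):
        # Exact inverse of _haar_dwt (the orthonormal Haar matrix is its own inverse)
        a = (ll + lh + hl + hh) / 2
        b = (ll + lh - hl - hh) / 2
        c = (ll - lh + hl - hh) / 2
        d = (ll - lh - hl + hh) / 2
        n, ch, h, w = ll.shape
        out = ll.new_zeros(n, ch, 2 * h, 2 * w)
        out[..., 0::2, 0::2], out[..., 0::2, 1::2] = a, b
        out[..., 1::2, 0::2], out[..., 1::2, 1::2] = c, d
        return out

    def forward(self, x):
        ll, lh, hl, hh = self._haar_dwt(x)         # S201: DWT decomposition
        z = torch.zeros_like(ll)
        groups = [self._haar_idwt(ll, z, z, z),    # S202: grouped reconstructions
                  self._haar_idwt(ll, lh, z, z),
                  self._haar_idwt(ll, z, hl, z),
                  self._haar_idwt(ll, z, z, hh)]
        pooled = [self.pool(g) for g in groups]    # pool each group separately
        return self.reduce(torch.cat(pooled, dim=1))  # channel-domain fusion

x = torch.randn(1, 64, 32, 32)
print(BidirectionalFeatureFusion(64)(x).shape)  # torch.Size([1, 64, 16, 16])
```

Like a standard pooling layer, the module halves the spatial resolution, which is what lets it drop into existing architectures unchanged.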
As a further refinement, the bidirectional feature fusion module can be packaged as a standalone module to replace the pooling operations in a deep convolutional neural network.
As a further refinement, the deep convolutional neural network with the bidirectional feature fusion module embedded in a mainstream network architecture is trained and tested on public image classification data, so as to improve the classification accuracy of the model.
Compared with the prior art, the present invention provides a method for constructing a bidirectional feature fusion deep convolutional neural network based on the discrete wavelet transform, with the following beneficial effects:
(1) The spatial-domain feature fusion module establishes information fusion among different features in the spatial domain and expands the channel dimension. The grouped reconstruction makes the distributions of the different feature groups similar, with comparable influence on the loss function; because every reconstruction includes the relatively important low-frequency wavelet features, the features are redundant in both the spatial and channel domains, which reduces the feature loss caused by pooling.
(2) The four feature groups are pooled separately and the pooled results are passed to the channel-domain feature fusion module. Compared with classic networks, popular networks with embedded attention mechanisms, and recent wavelet-based deep convolutional neural networks, a deep convolutional neural network embedded with the bidirectional feature fusion module improves classification accuracy with only a small increase in parameters, which gives it a clear advantage.
(3) The method uses the discrete wavelet transform and its inverse to provide a novel spatial-domain feature fusion approach that overcomes the shortcomings of existing wavelet-based methods. It can be embedded directly into popular neural network architectures in place of traditional pooling operations, suppressing the information loss of pooling in both the spatial and channel domains.
Brief Description of the Drawings
Figure 1 is a flow diagram of the proposed method for constructing a bidirectional feature fusion deep convolutional neural network based on the discrete wavelet transform;
Figure 2 is a flow diagram of the bidirectional feature fusion module in the proposed method;
Figure 3 is a structural diagram of the bidirectional feature fusion module in the proposed method.
Detailed Description of the Embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below in conjunction with the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without creative effort fall within the protection scope of the present invention.
Embodiment 1:
Referring to Figures 1-3, a method for constructing a bidirectional feature fusion deep convolutional neural network based on the discrete wavelet transform comprises the following steps:
S101. Construct a bidirectional feature fusion module based on the discrete wavelet transform and its inverse, the module being composed of a spatial-domain feature fusion module, a pooling operation, and a channel-domain feature fusion module;
S102. Embed the bidirectional feature fusion module into mainstream network architectures in place of the original pooling method;
S103. Train and test the deep convolutional neural network with the embedded bidirectional feature fusion module on classic image classification datasets.
Specifically, the spatial-domain feature fusion module performs the fusion of spatial features through the following steps:
S201. Decompose the input features with the discrete wavelet transform, obtaining low-frequency features and high-frequency features in the horizontal, vertical, and diagonal directions;
S202. Reconstruct the sub-bands in groups with the inverse discrete wavelet transform, obtaining four feature groups with the same spatial dimensions as the original features, thereby fusing the spatial features.
Specifically, the grouped reconstruction consists of four reconstructions: the low-frequency features alone; the low-frequency with the horizontal high-frequency features; the low-frequency with the vertical high-frequency features; and the low-frequency with the diagonal high-frequency features.
Specifically, the four feature groups obtained by the spatial-domain feature fusion module are spatial features with similar distributions and comparable influence on the loss function; each of the four fused feature groups obtained from the spatial domain is pooled separately.
Specifically, the spatial-domain feature fusion module adds no parameters when performing spatial feature fusion with the discrete wavelet transform and its inverse.
Specifically, the channel-domain feature fusion module concatenates the pooled feature groups along the channel dimension and then applies a 1x1 convolution for dimensionality reduction, achieving feature selection and information interaction so that information from different frequency bands complements and propagates across the channel domain.
Specifically, the feature map passes through the bidirectional feature fusion module, i.e. through the spatial-domain feature fusion module, the pooling operation, and the channel-domain feature fusion module in turn. The dual-domain feature fusion module can therefore be summarized as:
LL, LH, HL, HH = Pooling(f_s(x))
x' = f_c(concat[LL, LH, HL, HH])
Given a feature map x as input, it passes through the spatial feature fusion module f_s, the pooling operation Pooling, and the channel-domain feature fusion module f_c in turn; x' denotes the result after the feature map passes through the bidirectional feature fusion module.
Specifically, the bidirectional feature fusion module can be packaged as a standalone module to replace the pooling operations in a deep convolutional neural network.
Specifically, the deep convolutional neural network with the bidirectional feature fusion module embedded in a mainstream network architecture is trained and tested on public image classification data, so as to improve the classification accuracy of the model.
Embodiment 2:
Using the construction method described in Embodiment 1, feature fusion image classification models built on the image features extracted by the discrete wavelet transform-based deep convolutional neural network are trained and validated on three classic image classification datasets: CIFAR-10, CIFAR-100, and Mini-ImageNet. An SGD optimizer iteratively updates the parameters of the convolution kernels and neurons in the model with the following settings: 100 training epochs, a batch size of 32, a learning rate of 0.001, a weight penalty (weight decay) of 0.0001, and a momentum of 0.8. When the losses on the training set and its validation split converge, the model is considered stable and the trained classification model is obtained. A minimal sketch of this setup follows.
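The sketch below mirrors the optimizer configuration quoted above in PyTorch, with a stand-in network and random tensors in place of the real fusion-embedded network and datasets (both are assumptions for illustration):

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Stand-in network and data so the loop runs; in the experiments these would be
# the fusion-embedded network and the CIFAR / Mini-ImageNet loaders.
net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
data = TensorDataset(torch.randn(64, 3, 32, 32), torch.randint(0, 10, (64,)))
train_loader = DataLoader(data, batch_size=32, shuffle=True)  # batch size 32

optimizer = torch.optim.SGD(net.parameters(),
                            lr=0.001,             # learning rate 0.001
                            momentum=0.8,         # momentum 0.8
                            weight_decay=0.0001)  # weight penalty 0.0001
criterion = nn.CrossEntropyLoss()

for epoch in range(100):                 # 100 training epochs
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(net(images), labels)
        loss.backward()
        optimizer.step()
```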
Based on the described construction method, the bidirectional feature fusion module is built first: the discrete wavelet transform decomposes the input features into low-frequency features and high-frequency features in the horizontal, vertical, and diagonal directions, and the inverse discrete wavelet transform then reconstructs the frequency-domain features in groups. Specifically, the low-frequency features are reconstructed alone, and the low-frequency features are reconstructed together with the horizontal, vertical, and diagonal high-frequency features respectively, yielding four feature groups with the same spatial dimensions as the original features. The second component, the pooling operation, pools each of the four feature groups separately and passes the results to the third component, the channel-domain feature fusion module. This module realizes feature fusion across channels: the features are first concatenated along the channel dimension, and a 1x1 convolution then reduces the dimensionality to limit information redundancy between channels. The bidirectional feature fusion module is embedded into the mainstream deep network architectures VGG, ResNet, and DenseNet. Taking VGG16 as an example, the network contains 13 convolutional layers, 5 pooling layers, and 3 fully connected layers; its 5 pooling layers are replaced with bidirectional feature fusion modules (a sketch of this replacement follows the paragraph). The deep convolutional neural network with the embedded module is trained and tested on the public image classification datasets CIFAR-10, CIFAR-100, and Mini-ImageNet. CIFAR-10 contains 50,000 training images and 10,000 test images in 10 classes, each of size 32x32. CIFAR-100 contains 50,000 training images and 10,000 test images in 100 classes, each of size 32x32. Mini-ImageNet contains 100 classes, each with 500 training images and 100 test images.
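As one reading of the VGG16 embedding, the sketch below walks torchvision's VGG16 feature extractor, tracks the channel width of the preceding convolution, and swaps each of the five `MaxPool2d` layers for a fusion module. `BidirectionalFeatureFusion` refers to the sketch given earlier, and the swap strategy is our assumption rather than the patent's exact procedure.

```python
import torch.nn as nn
from torchvision.models import vgg16

# Assumes the BidirectionalFeatureFusion class from the earlier sketch is in scope.
def embed_fusion(model, fusion_factory):
    channels, layers = 3, []
    for m in model.features:
        if isinstance(m, nn.Conv2d):
            channels = m.out_channels            # remember the current feature width
        if isinstance(m, nn.MaxPool2d):
            layers.append(fusion_factory(channels))  # replace pooling (S102)
        else:
            layers.append(m)
    model.features = nn.Sequential(*layers)
    return model

net = embed_fusion(vgg16(num_classes=10), BidirectionalFeatureFusion)
```

Because the fusion module halves the spatial resolution exactly as the pooling layer it replaces, the rest of the architecture needs no change.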
In this embodiment, extensive experiments on the CIFAR-10, CIFAR-100, and Mini-ImageNet datasets demonstrate that replacing the pooling operations of a model with the bidirectional feature fusion module, i.e. constructing the bidirectional feature fusion deep convolutional neural network based on the discrete wavelet transform, improves image classification accuracy. As Table 1 shows, every wavelet type tested improves classification accuracy over the original methods, while also confirming that different wavelet types affect the classification results differently; the bidirectional feature fusion modules built with the Haar wavelet and the biorthogonal bior2.2 wavelet achieve the highest classification accuracy in most experiments.
Table 1. Classification accuracy comparison (%) of deep convolutional neural networks embedded with the bidirectional feature fusion module
In this specification, schematic descriptions of the above terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, those skilled in the art may combine the different embodiments or examples described in this specification, and the features thereof, provided they do not contradict each other.
Although embodiments of the present invention have been shown and described, those of ordinary skill in the art will understand that various changes, modifications, substitutions, and variations can be made to these embodiments without departing from the principle and spirit of the present invention. The scope of the invention is defined by the appended claims and their equivalents.
Claims (8)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110760099.5A CN113435376B (en) | 2021-07-05 | 2021-07-05 | Bidirectional feature fusion deep convolution neural network construction method based on discrete wavelet transform |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113435376A CN113435376A (en) | 2021-09-24 |
CN113435376B (en) | 2023-04-18
Family
ID=77759171
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110760099.5A Active CN113435376B (en) | 2021-07-05 | 2021-07-05 | Bidirectional feature fusion deep convolution neural network construction method based on discrete wavelet transform |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113435376B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114419336B * | 2022-01-25 | 2024-11-08 | Nanjing University of Science and Technology | A method and system for image classification based on discrete wavelet attention module |
CN114581717B * | 2022-03-09 | 2024-06-28 | Baoji University of Arts and Sciences | Deep convolution neural network classification method based on three-dimensional wavelet transformation |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107506822A * | 2017-07-26 | 2017-12-22 | Tianjin University | A deep neural network method based on spatial fusion pooling |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105894477B * | 2016-06-03 | 2019-04-02 | Shenzhen Fanxi Electronics Co., Ltd. | Astronomical image noise elimination method |
WO2017214970A1 * | 2016-06-17 | 2017-12-21 | Nokia Technologies Oy | Building convolutional neural network |
CN109978057A * | 2019-03-28 | 2019-07-05 | Baoji University of Arts and Sciences | A hardware image recognition algorithm based on deep learning |
CN111179173B * | 2019-12-26 | 2022-10-14 | Fuzhou University | Image splicing method based on discrete wavelet transform and gradient fusion algorithm |
CN111680176B * | 2020-04-20 | 2023-10-10 | Wuhan University | Remote sensing image retrieval method and system based on attention and bidirectional feature fusion |
CN111967516B * | 2020-08-14 | 2024-02-06 | Xidian University | Pixel-by-pixel classification method, storage medium and classification equipment |
CN112200161B * | 2020-12-03 | 2021-03-02 | Beijing Telecom Yitong Information Technology Co., Ltd. | A face recognition detection method based on a hybrid attention mechanism |
- 2021-07-05: CN application CN202110760099.5A filed, granted as patent CN113435376B (en); status: Active
Also Published As
Publication number | Publication date |
---|---|
CN113435376A (en) | 2021-09-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Li et al. | Wavelet integrated CNNs for noise-robust image classification | |
CN113435376B (en) | Bidirectional feature fusion deep convolution neural network construction method based on discrete wavelet transform | |
Lan et al. | Image denoising via deep residual convolutional neural networks | |
CN103020935B (en) | The image super-resolution method of the online dictionary learning of a kind of self-adaptation | |
CN110675326A (en) | Reconstruction and Restoration Method of Computational Ghost Imaging Based on U-Net Network | |
CN115564649B (en) | Image super-resolution reconstruction method, device and equipment | |
CN113920043A (en) | Double-current remote sensing image fusion method based on residual channel attention mechanism | |
CN112560719B (en) | High-resolution image water body extraction method based on multi-scale convolution-multi-core pooling | |
CN117408924A (en) | A low-light image enhancement method based on multiple semantic feature fusion networks | |
CN114063168B (en) | An Artificial Intelligence Noise Reduction Method for Seismic Signals | |
Zhu et al. | Generative adversarial image super‐resolution through deep dense skip connections | |
CN114460648B (en) | Self-supervised 3D seismic data random noise suppression method based on 3D convolutional neural network | |
CN111681188A (en) | Image Deblurring Method Based on Combining Image Pixel Prior and Image Gradient Prior | |
US20240169488A1 (en) | Wavelet-driven image synthesis with diffusion models | |
Li et al. | Lightweight adaptive weighted network for single image super-resolution | |
Fei et al. | A GNN architecture with local and global-attention feature for image classification | |
CN111489306A (en) | Image denoising method based on reinforcement learning | |
CN113066023B (en) | SAR image speckle removing method based on self-calibration convolutional neural network | |
CN113191947B (en) | Image super-resolution method and system | |
CN114529482A (en) | Image compressed sensing reconstruction method based on wavelet multi-channel depth network | |
CN117611484B (en) | Image denoising method and system based on denoising self-decoding network | |
CN114581717B (en) | Deep convolution neural network classification method based on three-dimensional wavelet transformation | |
CN118037574A (en) | A GPR image denoising method based on DnNet | |
CN118314027A (en) | An image restoration method based on diffusion posterior sampling | |
CN114509814A (en) | Pre-stack seismic data random noise suppression method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |