CN114782298B - Infrared and visible light image fusion method with regional attention - Google Patents
Infrared and visible light image fusion method with regional attention
- Publication number
- CN114782298B (application number CN202210434625.3A)
- Authority
- CN
- China
- Prior art keywords
- image
- fusion
- visible light
- infrared
- encoder
- Prior art date
- Legal status (assumed, not a legal conclusion): Active
Classifications
- G — Physics; G06 — Computing; Calculating or Counting
- G06T5/50 — Image enhancement or restoration using two or more images, e.g. averaging or subtraction
- G06N3/045 — Combinations of networks (neural network architectures based on biological models)
- G06N3/084 — Backpropagation, e.g. using gradient descent (neural network learning methods)
- G06T2207/10048 — Infrared image (image acquisition modality)
- G06T2207/20081 — Training; Learning (special algorithmic details)
- G06T2207/20084 — Artificial neural networks [ANN] (special algorithmic details)
- G06T2207/20221 — Image fusion; Image merging (image combination)
Abstract
Infrared and visible light image fusion aims to exploit the complementarity of the two modalities by fusing the thermal radiation, texture detail and other information of the same scene, so that the fused image is more comprehensive and clearer and better serves human observation and subsequent tasks. Image fusion usually proceeds in three steps: feature extraction, feature fusion and image reconstruction. The present invention proposes a fusion method with regional attention: an encoder first extracts high-dimensional features, a fusion strategy with salient-region attention then fuses the features, and a decoder finally reconstructs the image. The invention is aimed at image fusion in poorly illuminated scenes. The results show that it fully retains the fine texture details of the visible light image while using the infrared image to supplement the content of under-exposed regions. In addition, its attention to salient regions keeps regions that are highlighted in the source images highlighted in the fused image, so that the advantages of the infrared and visible light images complement each other well.
Description
Technical Field
The invention belongs to the technical field of image processing, and in particular relates to an infrared and visible light image fusion method with regional attention.
Background Art
With the steady development of the hardware and software industries, the ability to collect information with sensors and to transmit and process that information keeps growing. Against this background, vision-based sensors are widely used because they provide rich environmental information. A single type of sensor only captures information characteristic of one aspect of a scene and cannot describe the monitored environment comprehensively, so multi-sensor systems have been attracting more and more attention and application. Multi-source sensor imaging systems fill the gap left by the limited expressive power of single-sensor images. Image fusion technology already provides significant application value in remote sensing, safe navigation, medical image analysis, anti-terrorism inspection, environmental protection, traffic monitoring, clear image reconstruction, disaster detection and forecasting, and especially in computer vision.
For visual multi-source sensor systems, infrared and visible light images can be acquired with relatively simple equipment, and their fusion is the most typical case. Because the two imaging mechanisms differ, visible images usually have high spatial resolution and contrast and suit human visual perception, but they are highly susceptible to harsh conditions such as insufficient brightness, heavy rain and haze. Infrared images, in contrast, resist such scene interference well, and targets warmer than their surroundings, such as pedestrians, stand out far more clearly; however, infrared images usually have low resolution and poor detail. Fusing the two displays multiple kinds of information in a single image, highlights targets, offers richer detail than either source alone and withstands harsh environments. Infrared and visible light image fusion therefore aims to fuse the two images of the same scene carefully, preserving both the highlighted thermal-radiation targets of the infrared image and the high-resolution background texture details of the visible light image, so that the fused image is more informative and better suited to recognition by the human eye and automatic detection by machines, to human observation and aesthetics, and to subsequent image processing by computers.
Existing technology and its shortcomings.
The general steps of image fusion are feature extraction, feature fusion and feature reconstruction, where feature reconstruction is the inverse of feature extraction; feature extraction and fusion are the two most critical elements of image fusion. Among traditional methods, the multi-scale transform (MST) is the most commonly used; its main strength is that it accurately characterizes the spatial structure of an image while maintaining spatial and spectral consistency. Many multi-scale transforms have been proposed, such as pyramid transforms, wavelet transforms, contourlet transforms and related variants. In addition, fusion algorithms based on sparse representation (SR) and subspace-based methods such as principal component analysis and independent component analysis have also been proposed.
In recent years, deep learning has demonstrated state-of-the-art performance in many fields and has been applied successfully to image fusion. These algorithms fall roughly into three categories: autoencoder (AE)-based, CNN-based and GAN-based methods. Li et al. proposed a simple autoencoder fusion architecture consisting of an encoder, a fusion layer and a decoder; they later increased the complexity of the encoder and proposed a nested autoencoder-based fusion method to obtain more comprehensive feature fusion. The drawback of these methods is that the fusion strategy is designed by hand, which limits fusion performance. Zhang et al. developed a general image fusion framework with a generic network structure of feature extraction, fusion and image reconstruction layers, learning feature extraction, feature fusion and image reconstruction under the guidance of a family of complex loss functions; such methods attend only to global-level fusion and do not highlight the target regions of interest. Ma et al. creatively introduced GANs into the image fusion community, using a discriminator to force the generator to synthesize fused images with rich textures, and further introduced a detail loss and an edge-enhancement loss to improve the quality of detail information and sharpen the edges of hot targets. Because GANs are difficult to train, this approach fails to achieve good fusion quality and cannot highlight salient information.
Summary of the Invention
In order to overcome the above shortcomings of the prior art, the purpose of the present invention is to provide an infrared and visible light image fusion method with regional attention that solves the problem of fusing infrared and visible light images in poorly illuminated scenes. The proposed method brings out the respective strengths of infrared and visible light images in scene representation: by extracting and fusing high-dimensional image features, the thermal radiation information of the infrared image and the texture information of the visible light image can be fully integrated even under insufficient lighting. Moreover, the regional attention module in the fusion network attends to the salient regions of the high-dimensional features, such as highlighted targets in the infrared image and well-exposed regions in the visible light image, and increases the pixel intensity of these regions during fusion, achieving region-aware image fusion in which the advantages of the infrared and visible light images complement each other.
In order to achieve the above objects, the technical solution adopted by the present invention is as follows:
An infrared and visible light image fusion method with regional attention, comprising:
Step 1: train an autoencoder, the autoencoder comprising an encoder and a decoder;
Step 1.1: read an image I from the training set in RGB format, resize it, and convert it to the YCbCr color space;
Step 1.2: feed the luminance channel I_Y of the image into the encoder to obtain a high-dimensional feature map F;
Step 1.3: feed the high-dimensional feature map F into the decoder to output a luminance channel map O_Y;
Step 1.4: compute the feature loss between I_Y and O_Y according to the loss function, then optimize the gradient and back-propagate it to update the model parameters of the autoencoder;
Step 1.5: repeat steps 1.1 to 1.4 until the number of iterations over the whole training set reaches a set threshold, yielding the trained autoencoder;
Step 2: build the fusion training set
Obtain infrared and visible light image pairs for training and expand the data set by cropping sub-images whose size matches the image size adjusted in step 1, obtaining the fusion training set;
Step 3: train the fusion network
Step 3.1: convert the infrared and visible light image pairs (I_R, I_V) of the fusion training set to the YCbCr color space and extract the respective luminance channel maps, obtaining (I_RY, I_VY);
Step 3.2: feed I_RY and I_VY separately into the encoder trained in step 1 and compute the feature maps (F_R, F_V);
Step 3.3: concatenate (F_R, F_V) along the feature dimension, feed them into the fusion network, and compute the fused feature map F_F;
Step 3.4: feed F_F into the decoder to obtain the fused luminance channel image O_FY;
Step 3.5: compute the loss value according to the loss function, then optimize the gradient and back-propagate it to update the model parameters of the fusion network;
Step 3.6: repeat steps 3.1 to 3.5 until the number of passes over the whole fusion training set reaches a set value, yielding the trained fusion network;
Step 4: obtain the fused image
Step 4.1: process the infrared and visible light image pair to be fused according to steps 3.1 to 3.4 to obtain the fused luminance channel image O_FY;
Step 4.2: concatenate O_FY with the CbCr channels of the visible light image along the feature dimension to obtain a YCbCr image, then convert it to RGB to obtain the fused image.
In one embodiment, the encoder has four convolutional layers with dense connections, and the decoder has four directly connected convolutional layers.
In one embodiment, the convolution kernels of the encoder and decoder are 3×3 with stride 1 and padding 1, and the ReLU activation function is used. In step 1.2 the input size is 256×256×1 and the resulting high-dimensional feature map F has size 256×256×128; in step 1.3 the luminance channel map O_Y has size 256×256×1.
In one embodiment, after step 1.5 the training data are replaced with test data and steps 1.1 to 1.3 are executed to obtain O_Y; O_Y is then concatenated with the CbCr channels from step 1.1 along the feature dimension to obtain a YCbCr image, which is converted to RGB to obtain the output image O, and O is verified subjectively for consistency with I.
In one embodiment, step 3.3 is computed as follows:
(1) concatenate (F_R, F_V) along the feature dimension and pass them through convolutional layers Conv_1, Conv_2 and Conv_3 to obtain the global-information fusion feature map F_F_0;
(2) feed F_R and F_V separately into the same regional attention module RAB and compute the attention feature maps (M_R, M_V); concatenate (M_R, M_V) along the feature dimension and feed them into convolutional layer Conv_Att to obtain the fused attention feature map M_RV;
(3) compute the fused feature map F_F = F_F_0 + M_RV, i.e. add the pixel values at corresponding positions.
In one embodiment, the Adam optimizer is used to optimize the gradient in both step 1.4 and step 3.5; in step 3.5 the model parameters of the autoencoder are fixed and only the model parameters of the fusion network are updated.
In one embodiment, in step 2, images containing poorly illuminated scenes with salient targets are selected from the public TNO data set to form the training and test sets, and the training set is expanded offline by cropping 256×256 sub-images from the original infrared and visible light images with a cropping stride of 16.
Compared with the prior art, the beneficial effects of the present invention are:
First, in poorly illuminated scenes the texture information of the visible light image and the thermal radiation information of the infrared image are fully integrated. The trained encoder extracts the high-dimensional features of the image adequately, and because the loss is computed on these high-dimensional features, the features of every dimension can be fused deeply.
Second, on top of fusing the global content, the method attends to the markedly highlighted regions of the source images so that they remain highlighted in the fused image. The fusion network contains two fusion paths, global fusion and salient-region fusion. The regional attention module extracts the salient regions of the image at multiple scales, and adding the results of the two paths gives the salient regions higher luminance, producing the highlighting effect.
Third, the fused image has good contrast and clarity. During training, the structural loss is measured in terms of luminance, contrast and structural similarity, while the gradient loss gives the fused image fine texture detail and increases its clarity. In addition, the strategy of fusing only the luminance channel of the images allows the invention to process both grayscale and color images; since the CbCr channels of the visible light image do not take part in the computation, the fusion result restores the colors of the visible light image well.
Description of the Drawings
Fig. 1 shows the overall framework of the proposed scheme. The input is the pair of infrared and visible light images to be fused and the output is the fused image. The network consists of the Encoder, the Attention FusionNet and the Decoder. The dashed box indicates that the loss function has three parts: the feature loss, the structural similarity loss (ssim loss) and the gradient loss.
Fig. 2 shows the structure of the autoencoder and the composition of the loss function required for its training.
Fig. 3 shows the structure of the attention fusion network. The input is the feature map pair (F_R, F_V) and the output is the fused feature map F_F.
Fig. 4 shows the network structure of the regional attention module. It takes a feature map F as input and outputs an attention map M.
Fig. 5 shows three example groups of fused images. The boxes mark the fusion effect on salient targets.
Detailed Description of the Embodiments
The embodiments of the present invention are described in detail below with reference to the drawings and examples.
Under sufficient lighting, a visible light sensor usually captures images that are clear enough and that match the viewing habits of the human eye. The scenes that best showcase the advantages of fusing infrared and visible light images, however, are usually those with insufficient lighting. The problem at hand is how to make the fusion result compensate for under-exposure and highlight the targets of interest, so that it better serves human observation and subsequent high-level tasks.
Most previous fusion methods design the fusion strategy from a global perspective and focus on fusing image texture details and similar content; as a result, targets that are salient in the infrared image, such as people and vehicles, lose brightness in the fused image because components of the visible light image are mixed in. Some methods do introduce attention to salient targets, but they require an additional algorithm to produce a binary target-segmentation map in advance. Moreover, existing methods pay insufficient attention to nighttime scenes, where infrared imaging is most widely used.
On this basis, the present invention provides an infrared and visible light image fusion method with regional attention. The overall architecture is shown in Fig. 1 and the steps are as follows:
Step 1: train the autoencoder. Its structure is shown in Fig. 2 and comprises an Encoder and a Decoder. Each rectangle in the figure represents one layer; both the Encoder and the Decoder are composed of convolutional layers and activation layers. The loss consists of a structural loss (ssim loss) and a content loss (pixel loss). In this embodiment, the Encoder has four convolutional layers with dense connections and the Decoder has four directly connected convolutional layers; the convolution kernels are 3×3 with stride 1 and padding 1, and the activation layers use the ReLU function. The per-layer parameters of the Encoder and Decoder are configured accordingly; a sketch of one possible configuration is given below.
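The following is a minimal PyTorch sketch of one possible configuration, assuming channel widths that the text above does not fix: each densely connected encoder layer is taken to emit 32 channels (so four layers yield the 128-channel feature map F), and the decoder is taken to narrow from 128 channels back to 1 through intermediate widths of 64, 32 and 16. These widths, and the absence of an activation after the last decoder layer, are assumptions rather than the patent's published layer table.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Densely connected encoder: four 3x3 conv layers, stride 1, padding 1, ReLU.
    Assumed widths: each layer emits 32 channels and all layer outputs are
    concatenated, so a 256x256x1 luminance map becomes a 256x256x128 feature map F."""
    def __init__(self, growth=32):
        super().__init__()
        self.convs = nn.ModuleList()
        in_ch = 1
        for _ in range(4):
            self.convs.append(nn.Sequential(
                nn.Conv2d(in_ch, growth, 3, stride=1, padding=1),
                nn.ReLU(inplace=True)))
            in_ch += growth            # dense connection: next layer sees all previous outputs

    def forward(self, x):
        feats = []
        for conv in self.convs:
            y = conv(torch.cat([x] + feats, dim=1) if feats else x)
            feats.append(y)
        return torch.cat(feats, dim=1)   # N x 128 x H x W

class Decoder(nn.Module):
    """Decoder: four directly connected 3x3 conv layers mapping the 128-channel
    feature map back to the 1-channel luminance map O_Y (intermediate widths assumed)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(128, 64, 3, 1, 1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 32, 3, 1, 1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 16, 3, 1, 1), nn.ReLU(inplace=True),
            nn.Conv2d(16, 1, 3, 1, 1))   # last layer left linear

    def forward(self, f):
        return self.net(f)
```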
Step 1.1: read an image I from the training set with OpenCV's imread function; the image I is read in RGB format and resized to 256×256×3. It is then converted from RGB to the YCbCr color space, which can be done with OpenCV's cvtColor library function. Finally, every pixel of the image is divided by 255 so that the pixel values are normalized to [0, 1], giving the input image.
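A minimal preprocessing sketch along these lines is shown below; the function name is illustrative. Note that cv2.imread returns the channels in BGR order by default and that OpenCV's converter uses the YCrCb channel ordering (Y, Cr, Cb), so the sketch converts from BGR and then reorders the chroma channels to Cb, Cr.

```python
import cv2
import numpy as np

def load_luminance_and_chroma(path, size=256):
    """Read an image, resize it to size x size, convert it to YCbCr and normalize to [0, 1].
    Returns the Y channel (H x W x 1) and the CbCr channels (H x W x 2)."""
    bgr = cv2.imread(path, cv2.IMREAD_COLOR)           # OpenCV reads BGR by default
    bgr = cv2.resize(bgr, (size, size))
    ycrcb = cv2.cvtColor(bgr, cv2.COLOR_BGR2YCrCb)     # channel order: Y, Cr, Cb
    ycrcb = ycrcb.astype(np.float32) / 255.0           # normalize pixel values to [0, 1]
    y = ycrcb[:, :, 0:1]                               # luminance channel I_Y
    cbcr = ycrcb[:, :, [2, 1]]                         # reorder chroma to Cb, Cr
    return y, cbcr
```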
Step 1.2: feed the luminance channel I_Y of the image into the Encoder; the input size is 256×256×1 and the resulting high-dimensional feature map F has size 256×256×128.
Step 1.3: feed the high-dimensional feature map F into the Decoder to obtain the output luminance channel map O_Y, of size 256×256×1.
Step 1.4: compute the feature loss between I_Y and O_Y according to the loss function, defined as
L = μ·(1 − SSIM(O_Y, I_Y)) + (1/(H·W))·||O_Y − I_Y||2
where μ·(1 − SSIM(O_Y, I_Y)) is the structural loss and SSIM(·) is the structural similarity function; (1/(H·W))·||O_Y − I_Y||2 is the content loss, i.e. the Euclidean distance between I_Y and O_Y; μ is a hyperparameter used to balance the two losses; and H and W are the height and width of the image.
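A sketch of this training loss in PyTorch is given below. It assumes the third-party pytorch_msssim package for the SSIM term, and the content term follows the form above (a per-sample Euclidean distance normalized by the image area); both choices are assumptions about implementation details the text does not fix.

```python
import torch
from pytorch_msssim import ssim   # assumed third-party SSIM implementation

def autoencoder_loss(o_y, i_y, mu=1.0):
    """Structural loss + content loss for autoencoder training (step 1.4).
    o_y, i_y: N x 1 x H x W tensors with values in [0, 1]."""
    ssim_loss = 1.0 - ssim(o_y, i_y, data_range=1.0)                               # structural loss
    h, w = i_y.shape[-2:]
    pixel_loss = torch.norm((o_y - i_y).flatten(1), p=2, dim=1).mean() / (h * w)   # Euclidean content loss
    return mu * ssim_loss + pixel_loss
```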
Step 1.5: optimize the gradient, e.g. with the Adam optimizer, and back-propagate it to update the model parameters of the autoencoder.
Step 1.6: repeat steps 1.1 to 1.5 until the number of iterations over the whole training set reaches the set threshold, yielding the trained autoencoder.
This embodiment uses the open-source color image data set MS-COCO, which contains 80,000 images. The algorithm is implemented in Python and PyTorch and trained on a single NVIDIA TITAN V GPU, with the number of epochs set to 2, the batch size to 16 and the hyperparameter μ to 1.
Step 1.7: to verify the above training, the training data can be replaced with test data and steps 1.1 to 1.3 executed to obtain O_Y; O_Y is then concatenated with the CbCr channel map from step 1.1 along the feature dimension to obtain a YCbCr image, which is converted to RGB to obtain the output image O, and whether O is consistent with the input image I is verified subjectively.
Step 2: build the fusion training and test sets.
From the public infrared and visible light image fusion data set TNO, images containing poorly illuminated scenes with salient targets are selected to form the training and test sets. In this embodiment, 41 darker image pairs are selected as the training set and 25 pairs as the test set. The training set is then expanded offline by cropping sub-images from the original infrared and visible light images; each sub-image has the size adjusted in step 1, i.e. 256×256, and the cropping stride is 16, which finally yields 13,940 infrared and visible light image pairs in total.
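A sketch of this sliding-window sub-image cropping is shown below; the function name and return format are illustrative. The same offsets are applied to a registered infrared/visible pair so that the cropped sub-images stay aligned across the two modalities.

```python
def crop_subimages(image, size=256, stride=16):
    """Slide a size x size window over an H x W (x C) image with the given stride
    and return the list of crops used to expand the training set offline."""
    h, w = image.shape[:2]
    crops = []
    for top in range(0, h - size + 1, stride):
        for left in range(0, w - size + 1, stride):
            crops.append(image[top:top + size, left:left + size])
    return crops
```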
Step 3: train the fusion network.
Step 3.1: read the infrared and visible light image pairs (I_R, I_V) from the fusion training set, then apply the operations of step 1.1 to each image, i.e. convert it to the YCbCr color space and extract its luminance channel map, obtaining (I_RY, I_VY).
Step 3.2: feed I_RY and I_VY separately into the Encoder trained in step 1 and compute the feature maps (F_R, F_V).
Step 3.3: concatenate (F_R, F_V) along the feature dimension, feed them into the fusion network, and compute the fused feature map F_F. The structure of the fusion layer is shown in Fig. 3. The fusion process follows two paths, global information fusion and attention feature map fusion, i.e. a global information fusion network and an attention feature map fusion network. The former contains three convolutional layers, Conv_1, Conv_2 and Conv_3; the latter contains a regional attention module RAB and one convolutional layer Conv_Att, whose layer parameters are configured accordingly in this embodiment.
The computation steps in the fusion network are as follows:
(1) Concatenate (F_R, F_V) along the feature dimension and pass them through Conv_1, Conv_2 and Conv_3 to obtain the global-information fusion feature map F_F_0.
(2) Compute the attention feature maps.
Feed F_R and F_V separately into the regional attention module RAB to obtain the attention feature maps (M_R, M_V), of size 256×256×128; note that the same RAB module is used for both computations. The structure of the RAB is shown in Fig. 4. It comprises max pooling, global average pooling, a fully connected layer, an activation layer, an upsampling operation and a normalization operation; the weights are multiplied with the feature maps, and the resulting feature maps are added together. To extract feature-map weights at multiple scales, the module uses three max-pooling kernel sizes.
The specific computation is as follows. The input feature map F is max-pooled to obtain feature maps f_s of size (H/s)×(W/s)×128, where H and W denote the image size (both 256 in this example) and s takes the values 1, 2 and 4, i.e. the pooling kernel sizes are 1×1, 2×2 and 4×4. Global average pooling is then applied to f_s to obtain a vector of dimension 1×1×128, which is passed through a fully connected layer and an activation layer to give the weight vector ω_s of dimension 1×1×128; the weight of the k-th feature dimension, ω_s^k, measures the importance of the k-th feature layer f_s^k. On the other hand, to obtain a feature map of the same size as F, f_s is upsampled, and ω_s is multiplied with the upsampled feature map in the corresponding dimensions to give the weighted feature map f̂_s^k = ω_s^k · H_up(f_s^k), where k indexes the feature dimension and H_up(·) denotes the upsampling function. Finally, the feature maps of the three scales are added and normalized to obtain the attention feature map of dimension H×W×128, M = σ(Σ_s f̂_s), where σ(·) denotes the normalization operation.
(3) Concatenate (M_R, M_V) along the feature dimension and feed them into the convolutional layer Conv_Att to obtain the fused attention feature map M_RV, of size H×W×128.
(4) Compute the final fused feature map F_F = F_F_0 + M_RV, i.e. add the pixel values at corresponding positions. A code sketch of this fusion computation, including the RAB, is given below.
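The following is a minimal PyTorch sketch of the regional attention module and the two-path fusion network described above. The channel widths of Conv_1, Conv_2, Conv_3 and Conv_Att, the use of a sigmoid after the fully connected layer, the per-scale (rather than shared) fully connected layers, and the choice of sigmoid for the final normalization σ(·) are all assumptions; only the overall structure (multi-scale pooling, per-channel weighting, upsampling, normalization, and summing the global and attention paths) follows the description.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RAB(nn.Module):
    """Regional attention module: pool the feature map at scales s = 1, 2, 4, derive
    per-channel weights via global average pooling + FC + activation, re-weight the
    upsampled maps, sum the three scales and normalize."""
    def __init__(self, channels=128, scales=(1, 2, 4)):
        super().__init__()
        self.scales = scales
        self.fcs = nn.ModuleList(
            nn.Sequential(nn.Linear(channels, channels), nn.Sigmoid()) for _ in scales)

    def forward(self, x):                                             # x: N x C x H x W
        n, c, h, w = x.shape
        out = 0
        for s, fc in zip(self.scales, self.fcs):
            f_s = F.max_pool2d(x, kernel_size=s) if s > 1 else x      # f_s: N x C x H/s x W/s
            w_s = fc(F.adaptive_avg_pool2d(f_s, 1).flatten(1))        # weight vector, N x C
            up = F.interpolate(f_s, size=(h, w), mode='nearest')      # H_up(f_s)
            out = out + up * w_s.view(n, c, 1, 1)                     # weight each channel
        return torch.sigmoid(out)        # normalization sigma(.); exact form is an assumption

class AttentionFusionNet(nn.Module):
    """Two fusion paths: global fusion (Conv_1..Conv_3 on concatenated features)
    and attention fusion (shared RAB + Conv_Att), summed element-wise."""
    def __init__(self, channels=128):
        super().__init__()
        self.global_path = nn.Sequential(
            nn.Conv2d(2 * channels, channels, 3, 1, 1), nn.ReLU(inplace=True),  # Conv_1
            nn.Conv2d(channels, channels, 3, 1, 1), nn.ReLU(inplace=True),      # Conv_2
            nn.Conv2d(channels, channels, 3, 1, 1))                             # Conv_3
        self.rab = RAB(channels)                                                # shared by F_R and F_V
        self.conv_att = nn.Conv2d(2 * channels, channels, 3, 1, 1)              # Conv_Att

    def forward(self, f_r, f_v):
        f_f0 = self.global_path(torch.cat([f_r, f_v], dim=1))                   # F_F_0
        m_rv = self.conv_att(torch.cat([self.rab(f_r), self.rab(f_v)], dim=1))  # M_RV
        return f_f0 + m_rv                                                      # F_F = F_F_0 + M_RV
```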
Step 3.4: feed F_F into the Decoder to obtain the fused luminance channel image O_FY.
Step 3.5: compute the loss value according to the loss function L, optimize the loss gradient, e.g. with the Adam optimizer, and back-propagate it to update the model parameters of the fusion network; note that the model parameters of the autoencoder are fixed here and only those of the fusion network are updated.
The loss function L has three parts, the structural loss L_ssim, the feature loss L_pixel and the gradient loss L_gradient, and is computed as:
L = ω·L_ssim + λ·L_pixel + L_gradient
where ω and λ are hyperparameters used to balance the different losses.
The structural loss L_ssim is computed as:
L_ssim = δ·(1 − SSIM(I_RY, O_Y)) + (1 − δ)·(1 − SSIM(I_VY, O_Y))
where δ is a hyperparameter used to balance the two loss terms.
The feature loss L_pixel is computed from the Euclidean distance between feature maps, where η is a hyperparameter, the feature maps have size H×W×C, and ||·||2 denotes the Euclidean distance of the feature maps.
The gradient loss L_gradient is computed from the image gradients obtained with the Sobel operator, which measures the fine-grained texture information of the image.
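A sketch of the combined loss in PyTorch follows. The structural term implements the L_ssim formula above; the exact formulas for L_pixel and L_gradient are not reproduced in the text, so the feature term (an L2 distance pulling the fused features toward both source feature maps, weighted by η) and the gradient term (an L1 distance between the Sobel gradients of the fused image and the element-wise maximum of the source gradients) are illustrative assumptions, as is the use of the pytorch_msssim package.

```python
import torch
import torch.nn.functional as F
from pytorch_msssim import ssim   # assumed third-party SSIM implementation

SOBEL_X = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]).view(1, 1, 3, 3)
SOBEL_Y = SOBEL_X.transpose(2, 3)

def sobel_grad(img):
    """Sobel response of an N x 1 x H x W image, a proxy for fine-grained texture."""
    gx = F.conv2d(img, SOBEL_X.to(img.device), padding=1)
    gy = F.conv2d(img, SOBEL_Y.to(img.device), padding=1)
    return gx.abs() + gy.abs()

def fusion_loss(o_fy, i_ry, i_vy, f_f, f_r, f_v,
                omega=1.0, lam=2.7, delta=0.5, eta=0.5):
    """L = omega * L_ssim + lam * L_pixel + L_gradient (step 3.5)."""
    l_ssim = delta * (1 - ssim(i_ry, o_fy, data_range=1.0)) \
             + (1 - delta) * (1 - ssim(i_vy, o_fy, data_range=1.0))
    c, h, w = f_f.shape[1:]
    l_pixel = (torch.norm(f_f - f_r) + eta * torch.norm(f_f - f_v)) / (h * w * c)       # assumed form
    l_grad = F.l1_loss(sobel_grad(o_fy), torch.max(sobel_grad(i_ry), sobel_grad(i_vy)))  # assumed form
    return omega * l_ssim + lam * l_pixel + l_grad
```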
Step 3.6: repeat steps 3.1 to 3.5 until the number of iterations over the whole fusion training set reaches the set threshold, yielding the trained fusion network. In this embodiment, training runs on a single NVIDIA TITAN V GPU with the Adam optimizer, a batch size of 4 and 2 epochs; the initial learning rate is set to 1×10⁻⁴, and the loss hyperparameters ω, λ, δ and η are set to 1, 2.7, 0.5 and 0.5, respectively.
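A sketch of this training setup, with the autoencoder frozen and only the fusion network updated, might look as follows; Encoder, Decoder, AttentionFusionNet and fusion_loss refer to the sketches above, and train_loader stands for an assumed data loader that yields batches of registered infrared/visible luminance pairs.

```python
import torch

encoder, decoder = Encoder().cuda(), Decoder().cuda()        # pretrained in step 1
fusion_net = AttentionFusionNet().cuda()

for p in list(encoder.parameters()) + list(decoder.parameters()):
    p.requires_grad = False                                  # fix the autoencoder parameters

optimizer = torch.optim.Adam(fusion_net.parameters(), lr=1e-4)

for epoch in range(2):
    for i_ry, i_vy in train_loader:                          # batch size 4
        i_ry, i_vy = i_ry.cuda(), i_vy.cuda()
        with torch.no_grad():
            f_r, f_v = encoder(i_ry), encoder(i_vy)          # step 3.2
        f_f = fusion_net(f_r, f_v)                           # step 3.3
        o_fy = decoder(f_f)                                  # step 3.4 (decoder frozen, but gradients
                                                             # still flow through it to fusion_net)
        loss = fusion_loss(o_fy, i_ry, i_vy, f_f, f_r, f_v)
        optimizer.zero_grad()
        loss.backward()                                      # step 3.5
        optimizer.step()
```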
Step 4: input the test data and obtain the fused image.
Step 4.1: process the test data, or the infrared and visible light image pair to be fused, according to steps 3.1 to 3.4 to obtain the fused luminance channel image O_FY.
Step 4.2: concatenate O_FY with the CbCr channels of the visible light image along the feature dimension to obtain a YCbCr image, then convert it to RGB to obtain the fused image.
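A sketch of this color reconstruction with OpenCV follows; it reuses the CbCr channels produced by the preprocessing sketch above (stored in Cb, Cr order) and again accounts for OpenCV's Y, Cr, Cb channel ordering.

```python
import cv2
import numpy as np

def rebuild_color(o_fy, cbcr_visible):
    """Combine the fused luminance channel O_FY with the visible image's CbCr channels
    and convert back to an 8-bit color image."""
    ycrcb = np.concatenate([o_fy,                       # Y  (H x W x 1), values in [0, 1]
                            cbcr_visible[:, :, 1:2],    # Cr
                            cbcr_visible[:, :, 0:1]],   # Cb
                           axis=2)
    ycrcb = np.clip(ycrcb * 255.0, 0, 255).astype(np.uint8)
    return cv2.cvtColor(ycrcb, cv2.COLOR_YCrCb2BGR)     # BGR; swap channels if RGB is needed
```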
Three groups of fused images selected from the tests are shown in Fig. 5. The fused images incorporate the texture details of the visible light images, as marked by the dashed boxes, and the overall brightness of the images is improved to a certain extent. At the same time, regions that are salient only in the infrared images are also rendered well in the fused images, as marked by the solid boxes.
Claims (6)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210434625.3A CN114782298B (en) | 2022-04-24 | 2022-04-24 | Infrared and visible light image fusion method with regional attention |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210434625.3A CN114782298B (en) | 2022-04-24 | 2022-04-24 | Infrared and visible light image fusion method with regional attention |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114782298A CN114782298A (en) | 2022-07-22 |
CN114782298B true CN114782298B (en) | 2024-03-12 |
Family
ID=82433252
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210434625.3A Active CN114782298B (en) | 2022-04-24 | 2022-04-24 | Infrared and visible light image fusion method with regional attention |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114782298B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115311186B (en) * | 2022-10-09 | 2023-02-03 | 济南和普威视光电技术有限公司 | Cross-scale attention confrontation fusion method and terminal for infrared and visible light images |
CN115423734B (en) * | 2022-11-02 | 2023-03-24 | 国网浙江省电力有限公司金华供电公司 | Infrared and visible light image fusion method based on multi-scale attention mechanism |
CN116363036B (en) * | 2023-05-12 | 2023-10-10 | 齐鲁工业大学(山东省科学院) | Infrared and visible light image fusion method based on visual enhancement |
CN119339201A (en) * | 2024-12-20 | 2025-01-21 | 泰山学院 | An image multimodal fusion method and system for complex dynamic environments |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111161201A (en) * | 2019-12-06 | 2020-05-15 | 北京理工大学 | Infrared and visible light image fusion method based on detail-enhanced channel attention |
CN111709902A (en) * | 2020-05-21 | 2020-09-25 | 江南大学 | Infrared and visible light image fusion method based on self-attention mechanism |
CN111797779A (en) * | 2020-07-08 | 2020-10-20 | 兰州交通大学 | Remote sensing image semantic segmentation method based on regional attention multi-scale feature fusion |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111080724B (en) * | 2019-12-17 | 2023-04-28 | 大连理工大学 | Fusion method of infrared light and visible light |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111161201A (en) * | 2019-12-06 | 2020-05-15 | 北京理工大学 | Infrared and visible light image fusion method based on detail-enhanced channel attention |
CN111709902A (en) * | 2020-05-21 | 2020-09-25 | 江南大学 | Infrared and visible light image fusion method based on self-attention mechanism |
CN111797779A (en) * | 2020-07-08 | 2020-10-20 | 兰州交通大学 | Remote sensing image semantic segmentation method based on regional attention multi-scale feature fusion |
Non-Patent Citations (3)
Title |
---|
- An infrared and visible light image fusion method based on multi-scale low-rank decomposition; Chen Chaoqi; Meng Xiangchao; Shao Feng; Fu Randi; Acta Optica Sinica; 2020-06-10 (No. 11); full text *
- Periocular gender attribute recognition based on an attention mechanism; He Yong; Enterprise Science and Technology & Development; 2020-06-10 (No. 06); full text *
- Visible and infrared image fusion algorithm based on visual attention; Chen Yanfei; Sang Nong; Wang Hongwei; Dan Zhiping; Journal of Huazhong University of Science and Technology (Natural Science Edition); 2014-01-10 (No. S1); full text *
Also Published As
Publication number | Publication date |
---|---|
CN114782298A (en) | 2022-07-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN114782298B (en) | Infrared and visible light image fusion method with regional attention | |
CN112287940B (en) | Semantic segmentation method of attention mechanism based on deep learning | |
US11238602B2 (en) | Method for estimating high-quality depth maps based on depth prediction and enhancement subnetworks | |
CN112861729B (en) | Real-time depth completion method based on pseudo-depth map guidance | |
CN116681636B (en) | Light infrared and visible light image fusion method based on convolutional neural network | |
CN113052210A (en) | Fast low-illumination target detection method based on convolutional neural network | |
US20180231871A1 (en) | Depth estimation method for monocular image based on multi-scale CNN and continuous CRF | |
CN109741256A (en) | Image super-resolution reconstruction method based on sparse representation and deep learning | |
CN101916436B (en) | Multi-scale spatial projecting and remote sensing image fusing method | |
CN108269244B (en) | An Image Dehazing System Based on Deep Learning and Prior Constraints | |
CN110363215A (en) | A Method of Converting SAR Image to Optical Image Based on Generative Adversarial Network | |
CN110517306B (en) | Binocular depth vision estimation method and system based on deep learning | |
CN114049335B (en) | Remote sensing image change detection method based on space-time attention | |
CN110909690A (en) | A detection method for occluded face images based on region generation | |
CN112232134B (en) | Human body posture estimation method based on hourglass network and attention mechanism | |
CN113554032B (en) | Remote sensing image segmentation method based on multi-path parallel network of high perception | |
CN114170286B (en) | Monocular depth estimation method based on unsupervised deep learning | |
CN113283444A (en) | Heterogeneous image migration method based on generation countermeasure network | |
CN105184779A (en) | Rapid-feature-pyramid-based multi-dimensioned tracking method of vehicle | |
CN113111740B (en) | A feature weaving method for remote sensing image target detection | |
CN116363036B (en) | Infrared and visible light image fusion method based on visual enhancement | |
CN114639002A (en) | Infrared and visible light image fusion method based on multi-mode characteristics | |
CN117670965B (en) | Unsupervised monocular depth estimation method and system suitable for infrared image | |
CN117576402B (en) | Deep learning-based multi-scale aggregation transducer remote sensing image semantic segmentation method | |
CN115423734A (en) | Infrared and visible light image fusion method based on multi-scale attention mechanism |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
- PB01 | Publication | |
- SE01 | Entry into force of request for substantive examination | |
- GR01 | Patent grant | |