Summary of the invention
Embodiments of the present invention provide an image fusion method, an image fusion device, and a readable storage medium, in which a fused image of an infrared image and a visible light image is obtained through a dual-channel convolutional neural network. As a deep learning algorithm, the convolutional neural network selects image features automatically, remedies the one-sidedness of manual feature extraction, and avoids the defects of existing infrared and visible light image fusion methods. The specific technical solutions are as follows:
To achieve the above objectives, an embodiment of the present invention provides an image fusion method, comprising:
registering an infrared image with a visible light image to obtain a registered first image and second image, wherein the first image is a partial image of the infrared image, and the second image is a partial image of the visible light image;
inputting the first image and the second image into a trained convolutional neural network, which classifies them and outputs a first score map and a second score map;
comparing corresponding pixels of the first score map and the second score map to obtain a binary map;
obtaining a first fused image based on the binary map, the first image, and the second image;
calculating a first structural similarity map between the first image and the first fused image, and calculating a second structural similarity map between the second image and the first fused image;
obtaining a difference map of the first structural similarity map and the second structural similarity map;
obtaining a second fused image based on the difference map, the first image, and the second image.
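For orientation, the flow of these steps can be sketched in a few lines of Python. This is an illustrative sketch only: `score`, `ssim_map`, and `extract_target` are hypothetical callables standing in for the trained network, the structural similarity calculation, and the target extraction described below, not components defined by the invention.

```python
import numpy as np

def fuse(A, B, score, ssim_map, extract_target):
    """Sketch of the overall pipeline; the three callables are
    caller-supplied stand-ins for the trained CNN, the SSIM map
    computation, and the morphological target extraction."""
    SA, SB = score(A), score(B)                    # first / second score maps
    D1 = (SA > SB).astype(np.float64)              # binary decision map
    F1 = D1 * A + (1.0 - D1) * B                   # first (initial) fused image
    S = np.abs(ssim_map(A, F1) - ssim_map(B, F1))  # difference map
    D2 = extract_target(S)                         # target feature extraction image
    return D2 * A + (1.0 - D2) * B                 # second (final) fused image
```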
In one implementation, the step of comparing corresponding pixels of the first score map and the second score map to obtain the binary map comprises:
for a first pixel on the first score map, judging whether its pixel value is greater than that of a second pixel, wherein the first pixel is any pixel on the first score map, and the second pixel is the pixel on the second score map corresponding to the first pixel;
if so, setting the pixel value of a third pixel in the binary map to 1; otherwise, setting the pixel value of the third pixel to 0, wherein the third pixel is the pixel in the binary map at the position corresponding to the first pixel.
In one implementation, the first fused image is given by the formula:
F1(x, y) = D1(x, y)A(x, y) + (1 - D1(x, y))B(x, y)
where D1 is the binary map, A is the first image, B is the second image, F1 is the first fused image, and (x, y) are the coordinates of a pixel.
In one implementation, the step of obtaining the difference map of the first structural similarity map and the second structural similarity map comprises:
obtaining the difference between the first structural similarity map and the second structural similarity map;
taking the absolute value of the difference as the difference map of the first structural similarity map and the second structural similarity map.
In one implementation, the step of obtaining the second fused image based on the difference map, the first image, and the second image comprises:
removing, based on a target region, the regions of the difference map unrelated to the target to obtain a target feature extraction image;
obtaining the second fused image according to the target feature extraction image, the first image, and the second image.
In one implementation, the second fused image is given by the formula:
F2(x, y) = D2(x, y)A(x, y) + (1 - D2(x, y))B(x, y)
where D2 is the target feature extraction image, A is the first image, B is the second image, (x, y) are the coordinates of a pixel, and F2 is the second fused image.
Taking the binary map as the decision map, an initial fused image is obtained by the weighted fusion rule; finally, a saliency map of the target region is extracted using SSIM and a second fusion is performed to obtain the final fused image.
In one implementation, the training of the convolutional neural network comprises:
extracting a first number of original images of size 32 × 32 from a first image set, and adding a second number of visible light images from a second image set;
converting the original images and the visible light images into grayscale images, and cutting these grayscale images into 16 × 16 sub-blocks to serve as a high-resolution image set;
applying Gaussian blur to the first number of original images in the first image set, adding a second number of infrared images from the second image set, and cutting the first number of original images and the second number of infrared images into 16 × 16 sub-blocks to serve as a blurred image set;
training the convolutional neural network on the blurred image set and the high-resolution image set thus constructed.
In one implementation, the convolutional neural network is a dual-channel network, each channel consisting of a 5-layer convolutional neural network comprising 3 convolutional layers, 1 max-pooling layer, and 1 fully connected layer, with a softmax classifier as the final output layer.
In addition, an embodiment of the present invention further provides an image fusion device, comprising:
a registration module, configured to register an infrared image with a visible light image to obtain a registered first image and second image, wherein the first image is a partial image of the infrared image, and the second image is a partial image of the visible light image;
a classification module, configured to input the first image and the second image into a trained convolutional neural network, which classifies them and outputs a first score map and a second score map;
a comparison module, configured to compare corresponding pixels of the first score map and the second score map to obtain a binary map;
a first fusion module, configured to obtain a first fused image based on the binary map, the first image, and the second image;
a calculation module, configured to calculate a first structural similarity map between the first image and the first fused image, and to calculate a second structural similarity map between the second image and the first fused image;
an acquisition module, configured to obtain a difference map of the first structural similarity map and the second structural similarity map;
a second fusion module, configured to obtain a second fused image based on the difference map, the first image, and the second image.
An embodiment of the present invention further provides a readable storage medium on which a computer program is stored; when executed by a processor, the program implements the steps of any one of the above image fusion methods.
With the image fusion method, device, and readable storage medium provided by the embodiments of the present invention, a fused image of infrared and visible light images is obtained through a convolutional neural network, which selects image features automatically, remedies the one-sidedness of feature extraction, and avoids the defects of existing infrared and visible light image fusion methods. Because binary segmentation does not separate the target region from the background region with complete accuracy, shadows can appear in the later fused image; a salient target region map is therefore obtained from the difference between the structural similarity maps of the infrared and visible source images with the initial fused image, and a second fusion step is taken to improve the fused image quality. The saliency-based fusion method preserves the integrity of the salient target region and improves the visual quality of the fused image, thereby better serving subsequent image understanding, recognition, and the like.
Specific embodiment
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
It should be noted that, in the image processing arts, a target in an infrared image has strong thermal radiation, so its grayscale differs greatly from, or is even opposite to, that of the visible light image, while the background of an infrared image has low grayscale and no obvious thermal contrast. Compared with a visible light image, an infrared image lacks spectral information but likewise contains detail information. Therefore, the fusion effect can be further improved only if the fusion retains as much information of the original images as possible.
Referring to Fig. 1, an embodiment of the present invention provides an image fusion method comprising the following steps:
S101: registering an infrared image with a visible light image to obtain a registered first image and second image, wherein the first image is a partial image of the infrared image, and the second image is a partial image of the visible light image.
It should be noted that geometric registration refers to the operation of geometrically transforming images (data) of the same area acquired at different times, in different wavebands, or by different remote sensing systems, so that corresponding image points coincide completely in position and orientation. The specific geometric registration process is prior art and is not repeated here.
It can be understood that the sliding window is a commonly used tool in image processing; its size may be 3×3, 5×5, 16×16, or the like, which the embodiments of the present invention do not specifically limit.
Illustratively, taking the first image as an example, a 16×16 sliding window may start from the first pixel in the upper left corner as the first center pixel of the window, and then move step by step. In this way, every pixel in the first image has the chance to serve as a center pixel, and the same holds for the second image; by this principle, the structural similarity between any center pixel in the first image and the corresponding center pixel in the second image can be calculated.
A sliding window of size 16 × 16 with a stride of 1 is defined, and the registered infrared image and visible light image are traversed from left to right and from top to bottom, yielding the infrared image sub-block as the first image VA, as shown in Fig. 2, and the visible light image sub-block as the second image VB, as shown in Fig. 3.
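A minimal NumPy sketch of this stride-1 window traversal (assuming 2-D grayscale arrays; `sliding_window_view` requires NumPy 1.20 or later):

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def extract_patches(image, size=16):
    """Slide a size x size window over a grayscale image with stride 1
    and return one patch per window position."""
    windows = sliding_window_view(image, (size, size))
    return windows.reshape(-1, size, size)
```

VA and VB then correspond to `extract_patches(infrared)` and `extract_patches(visible)` respectively.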
S102: inputting the first image and the second image into the trained convolutional neural network, which classifies them and outputs a first score map and a second score map.
It should be noted that the convolutional neural network is a deep feedforward neural network in machine learning that has been successfully applied to image recognition. As a feedforward network whose artificial neurons respond to surrounding units, it can carry out large-scale image processing and comprises convolutional layers and pooling layers.
In one implementation, the training of the convolutional neural network comprises: extracting a first number of original images of size 32 × 32 from a first image set, and adding a second number of visible light images from a second image set; converting the original images and the visible light images into grayscale images, and cutting these grayscale images into 16 × 16 sub-blocks to serve as a high-resolution image set; applying Gaussian blur to the first number of original images in the first image set, adding a second number of infrared images from the second image set, and cutting the first number of original images and the second number of infrared images into 16 × 16 sub-blocks to serve as a blurred image set.
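A sketch of this training-set construction using scikit-image is given below, under two stated assumptions: the Gaussian blur radius (sigma) is not specified in the text, and the visible and infrared images are taken to be single-channel already.

```python
import numpy as np
from skimage.color import rgb2gray
from skimage.filters import gaussian

def to_blocks(img, size=16):
    """Cut an image into non-overlapping size x size sub-blocks."""
    h = img.shape[0] // size * size
    w = img.shape[1] // size * size
    b = img[:h, :w].reshape(h // size, size, w // size, size)
    return b.transpose(0, 2, 1, 3).reshape(-1, size, size)

def build_sets(originals, visibles, infrareds, sigma=2.0):
    """originals: 32x32 RGB images; visibles/infrareds: grayscale images.
    Returns (high-resolution set, blurred set); sigma is an assumption."""
    clear, blurred = [], []
    for img in originals:
        g = rgb2gray(img)
        clear.extend(to_blocks(g))                     # high-resolution set
        blurred.extend(to_blocks(gaussian(g, sigma=sigma)))  # blurred set
    for img in visibles:
        clear.extend(to_blocks(img))
    for img in infrareds:
        blurred.extend(to_blocks(img))
    return np.asarray(clear), np.asarray(blurred)
```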
Illustratively, 2000 clear original images of size 32 × 32 are extracted from the Cifar-10 image set, 200 visible light images from the TNO_Image_Fusion_Dataset image set are added, all images are converted into grayscale and cut into 16 × 16 sub-blocks, and the result serves as the high-resolution image set. Next, Gaussian blur is applied to all of the Cifar-10 image sub-blocks (since the background region of an infrared image has lower resolution than a visible light image), 200 infrared images from the TNO_Image_Fusion_Dataset image set (likewise cut entirely into 16 × 16 sub-blocks) are added, and the result serves as the blurred image set.
A dual-channel network is used, each channel consisting of a 5-layer convolutional neural network comprising 3 convolutional layers, 1 max-pooling layer, and 1 fully connected layer, with a softmax classifier as the final output layer. The input image block size is 16 × 16; the convolution kernel size of the convolutional layers is set to 3 × 3 with a stride of 1; the max-pooling layer has a 2 × 2 kernel with a stride of 2; the activation function is ReLU. Momentum and weight decay are set to 0.9 and 0.0005 respectively, and the learning rate to 0.0001.
It can be understood that when the first image is input into the trained convolutional neural network, the network evaluates each pixel of the first image and assigns it a score, so that after all pixels of the first image have been processed, the first score map SA is obtained; similarly, the second score map SB corresponding to the second image can be obtained. The detailed process is shown in Fig. 4: the convolutional neural network outputs the result after two convolutions, max pooling, a further convolution, and a full connection.
S103: comparing corresponding pixels of the first score map and the second score map to obtain a binary map.
Specifically, for a first pixel on the first score map, it is judged whether its pixel value is greater than that of a second pixel, wherein the first pixel is any pixel on the first score map, and the second pixel is the pixel on the second score map corresponding to the first pixel; if so, the pixel value of a third pixel in the binary map is 1; otherwise, the pixel value of the third pixel is 0, wherein the third pixel is the pixel in the binary map at the position corresponding to the first pixel.
For the binary map T, the first score map and the second score map are compared pixel by pixel: for any pixel at position (m, n), if the value of SA is greater than the corresponding value of SB, the value of the binary map at (m, n) is 1; otherwise it is 0, as the following formula shows:
T(m, n) = 1 if SA(m, n) > SB(m, n), otherwise T(m, n) = 0
Illustratively, based on Fig. 2 and Fig. 3, the binary map obtained through the neural network of Fig. 4 is shown in Fig. 5.
A binary map of the target region and the background region is thus obtained, in which the white region indicates the target region of the infrared image and the black region indicates the background region; this binary map can serve as the decision map for image fusion.
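In array form this comparison is a single vectorized operation; a minimal sketch assuming SA and SB are equally sized NumPy arrays:

```python
import numpy as np

def binary_decision_map(SA, SB):
    """T(m, n) = 1 where SA(m, n) > SB(m, n), else 0 (formula above)."""
    return (SA > SB).astype(np.uint8)
```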
S104: obtaining a first fused image based on the binary map, the first image, and the second image.
Weighting the first image and the second image according to the binary map yields the initial fusion result; the purpose of the initial fusion is to integrate the target region of the infrared image and the background region of the high-resolution visible light image into one image. Based on Fig. 2, Fig. 3, and Fig. 5, the first fused image shown in Fig. 6 is obtained.
In one implementation, the first fused image is given by the formula:
F1(x, y) = D1(x, y)A(x, y) + (1 - D1(x, y))B(x, y)
where D1 is the binary map, A is the first image, B is the second image, F1 is the first fused image, and (x, y) are the coordinates of a pixel.
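The weighted fusion rule translates directly into array arithmetic; a minimal sketch, assuming D1, A, and B are equally sized arrays (the same function serves the second fusion of S107 with D2 in place of D1):

```python
import numpy as np

def weighted_fusion(D, A, B):
    """F(x, y) = D(x, y) * A(x, y) + (1 - D(x, y)) * B(x, y)."""
    D = D.astype(np.float64)
    return D * A + (1.0 - D) * B
```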
S105: calculating a first structural similarity map between the first image and the first fused image, and calculating a second structural similarity map between the second image and the first fused image.
There is a strong correlation between the pixels of an infrared image and a visible light image, and this correlation carries a large amount of structural information. The structural similarity index SSIM (structural similarity index) is a measure used to assess image quality. From the perspective of image composition, the structural similarity index defines structural information in terms of luminance, contrast, and structure to reflect the structure of objects in an image. For two images C and D, the similarity measure function is defined as:
SSIM(C, D) = [l(C, D)]^α [c(C, D)]^β [s(C, D)]^γ
with l(C, D) = (2μaμb + C1)/(μa² + μb² + C1), c(C, D) = (2σaσb + C2)/(σa² + σb² + C2), s(C, D) = (σab + C3)/(σaσb + C3)
where μa and μb are the mean grayscales of images C and D, σa and σb are the standard deviations of images C and D, σab is the covariance of images C and D, and C1, C2, C3 are small positive constants whose purpose is to avoid instability when a denominator approaches 0; α, β, γ > 0 are weights used to adjust the luminance, contrast, and structure functions.
Therefore, the first structural similarity map SAF between the first image A and the first fused image F1 is calculated; illustratively, based on Fig. 2 and Fig. 6, the first structural similarity map shown in Fig. 7 is obtained. The second structural similarity map SBF between the second image B and the first fused image F1 is calculated; based on Fig. 3 and Fig. 6, the second structural similarity map shown in Fig. 8 is obtained.
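A per-pixel SSIM map can be obtained with scikit-image's structural_similarity by requesting the full map; the sketch below assumes the images are scaled to [0, 1] (hence data_range=1.0) and also computes the difference map used in step S106:

```python
import numpy as np
from skimage.metrics import structural_similarity

def similarity_maps(A, B, F1):
    """Per-pixel SSIM maps SAF, SBF and the difference map S = |SAF - SBF|."""
    _, SAF = structural_similarity(A, F1, data_range=1.0, full=True)
    _, SBF = structural_similarity(B, F1, data_range=1.0, full=True)
    return SAF, SBF, np.abs(SAF - SBF)
```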
S106: obtaining a difference map of the first structural similarity map and the second structural similarity map.
In one implementation, the step of obtaining the difference map of the first structural similarity map and the second structural similarity map comprises: obtaining the difference between the first structural similarity map and the second structural similarity map, and taking the absolute value of the difference as the difference map of the first structural similarity map and the second structural similarity map. Specifically, the difference map of the first structural similarity map and the second structural similarity map is:
S = |SAF - SBF|
where SAF is the first structural similarity map, SBF is the second structural similarity map, and S is the difference map. Illustratively, the difference map obtained from Fig. 7 and Fig. 8 is shown in Fig. 9.
S107: obtaining a second fused image based on the difference map, the first image, and the second image.
Since the first fused image obtained by the initial fusion does not separate the target region from the background region with complete accuracy, shadows appear in the later fused image; a second fusion step is therefore taken to improve the fused image quality.
In one implementation, the step of obtaining the second fused image based on the difference map, the first image, and the second image comprises: removing, based on the target region, the regions of the difference map unrelated to the target to obtain a target feature extraction image; and obtaining the second fused image according to the target feature extraction image, the first image, and the second image.
Illustratively, based on the difference map shown in Fig. 9, the target feature extraction image shown in Fig. 10 is obtained.
In one implementation, the second fused image is given by the formula:
F2(x, y) = D2(x, y)A(x, y) + (1 - D2(x, y))B(x, y)
where D2 is the target feature extraction image, A is the first image, B is the second image, (x, y) are the coordinates of a pixel, and F2 is the second fused image.
The second fusion can be regarded as infrared and visible light image fusion based on saliency target extraction. The difference map S contains the salient region of the infrared image. Morphological image processing is used to remove the regions of the difference map unrelated to the target, yielding the target feature extraction map. It can be understood that the target region is the infrared signature of the target person captured by the infrared sensor; enhancing the saliency of the target region therefore improves the detail information retained in the fused image. As shown in Fig. 11, the second fused image is obtained based on Fig. 10, Fig. 2, and Fig. 3.
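The text prescribes morphological processing without fixing the concrete operations; the sketch below is one plausible realization, in which Otsu thresholding and a disk-shaped opening are assumed choices rather than steps stated in the text. The result D2 feeds the same weighted fusion rule shown above.

```python
import numpy as np
from skimage.filters import threshold_otsu
from skimage.morphology import binary_opening, disk

def extract_target(S):
    """Threshold the difference map S and open it morphologically,
    removing small regions unrelated to the target; the result is
    the target feature extraction image D2."""
    mask = S > threshold_otsu(S)          # Otsu threshold: an assumed choice
    mask = binary_opening(mask, disk(3))  # structuring-element radius assumed
    return mask.astype(np.float64)

# Second fused image, reusing the weighted fusion rule:
# F2 = weighted_fusion(extract_target(S), A, B)
```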
Using the idea of binary segmentation, a fused image of infrared and visible light images is obtained through a dual-channel convolutional neural network. As a deep learning algorithm, the convolutional neural network selects image features automatically, remedying the one-sidedness of feature extraction and avoiding the defects of existing infrared and visible light image fusion methods (most require manually designed features, and the extracted features are single and easily lost). Secondly, because binary segmentation does not separate the target region from the background region with complete accuracy, shadows appear in the later fused image; a salient target region map is obtained from the difference between the structural similarity maps of the infrared and visible source images with the initial fused image, and a second fusion step is taken to improve the fused image quality. The saliency-based fusion method preserves the integrity of the salient target region and improves the visual quality of the fused image, thereby better serving subsequent image understanding, recognition, and the like.
In addition, an embodiment of the present invention further provides an image fusion device, comprising:
a registration module, configured to register an infrared image with a visible light image to obtain a registered first image and second image, wherein the first image is a partial image of the infrared image, and the second image is a partial image of the visible light image;
a classification module, configured to input the first image and the second image into a trained convolutional neural network, which classifies them and outputs a first score map and a second score map;
a comparison module, configured to compare corresponding pixels of the first score map and the second score map to obtain a binary map;
a first fusion module, configured to obtain a first fused image based on the binary map, the first image, and the second image;
a calculation module, configured to calculate a first structural similarity map between the first image and the first fused image, and to calculate a second structural similarity map between the second image and the first fused image;
an acquisition module, configured to obtain a difference map of the first structural similarity map and the second structural similarity map;
a second fusion module, configured to obtain a second fused image based on the difference map, the first image, and the second image.
An embodiment of the present invention further provides a readable storage medium on which a computer program is stored; when executed by a processor, the program implements the steps of any one of the above image fusion methods.
The foregoing is merely a description of preferred embodiments of the present invention and is not intended to limit the protection scope of the present invention. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.