
CN111292264B - A high dynamic range image reconstruction method based on deep learning - Google Patents

A high dynamic range image reconstruction method based on deep learning

Info

Publication number: CN111292264B
Application number: CN202010072803.3A
Authority: CN (China)
Prior art keywords: image, network, HDR, LDR, dynamic range
Priority and filing date: 2020-01-21
Publication dates: CN111292264A on 2020-06-16; CN111292264B (grant) on 2023-04-21
Legal status: Active (granted)
Other languages: Chinese (zh)
Other versions: CN111292264A (en)
Inventors: 肖春霞, 刘文焘
Current and original assignee: Wuhan University (WHU)
Application filed by Wuhan University (WHU); priority to CN202010072803.3A

Classifications

    • G — Physics › G06 — Computing; Calculating or Counting › G06T — Image Data Processing or Generation, in General › G06T 5/00 — Image enhancement or restoration › G06T 5/90 — Dynamic range modification of images or parts thereof
    • G — Physics › G06 — Computing; Calculating or Counting › G06N — Computing Arrangements Based on Specific Computational Models › G06N 3/00 — Computing arrangements based on biological models › G06N 3/02 — Neural networks › G06N 3/04 — Architecture, e.g. interconnection topology › G06N 3/045 — Combinations of networks
    • G — Physics › G06 — Computing; Calculating or Counting › G06N 3/00 — Computing arrangements based on biological models › G06N 3/02 — Neural networks › G06N 3/08 — Learning methods
    • G — Physics › G06 — Computing; Calculating or Counting › G06T — Image Data Processing or Generation, in General › G06T 2207/00 — Indexing scheme for image analysis or image enhancement › G06T 2207/20 — Special algorithmic details › G06T 2207/20172 — Image enhancement details › G06T 2207/20208 — High dynamic range [HDR] image processing
    • Y — General Tagging of New Technological Developments; General Tagging of Cross-Sectional Technologies Spanning Over Several Sections of the IPC; Technical Subjects Covered by Former USPC Cross-Reference Art Collections [XRACs] and Digests › Y02 — Technologies or Applications for Mitigation or Adaptation Against Climate Change › Y02T — Climate Change Mitigation Technologies Related to Transportation › Y02T 10/00 — Road transport of goods or passengers › Y02T 10/10 — Internal combustion engine [ICE] based vehicles › Y02T 10/40 — Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a deep-learning-based method for reconstructing high dynamic range images, belonging to the fields of computational photography and digital image processing. The invention establishes a mapping network from a single LDR image to an HDR image using a deep-learning approach. The method first generates, from a collected HDR data set, LDR training data, HDR sample labels with aligned brightness units, and mask images of high-brightness regions. A neural network is then constructed and trained to obtain a network model encoding the LDR-to-HDR mapping. Finally, using the trained generative network model, an LDR image is fed directly into the model, which outputs the reconstructed HDR image. The method can effectively reconstruct the dynamic range of a real scene from a single ordinary digital image, and can be used for HDR display simulation of ordinary digital images or to provide more realistic rendering for image-based lighting techniques.

Description

A method for image high dynamic range reconstruction based on deep learning

Technical Field

The present invention belongs to the fields of computational photography and digital image processing, and relates to a method for reconstructing the high dynamic range of an image, in particular to a deep-learning-based image high dynamic range reconstruction method.

Background Art

High Dynamic Range Imaging (HDRI) is an image representation technique that achieves a larger exposure range than ordinary digital images. High Dynamic Range (HDR) images provide a wider range of brightness variation and more detail in light and dark regions than ordinary digital images, allowing them to present brightness information much closer to that of the real scene. In recent years, with the continuing evolution of display devices and the growing demands of physically based rendering, high dynamic range imaging has become increasingly important in practical applications. However, current methods for directly acquiring HDR images require considerable professional skill and are costly and time-consuming. Among methods that reconstruct HDR from a single ordinary digital image, traditional approaches can only mitigate the ill-posedness of the problem by adding constraints, which makes them effective only in certain specific application scenarios. Some researchers have also done fruitful work based on deep learning, but they failed to account for factors such as the scale invariance of brightness across HDR images, which limits the quality of their reconstructions. The present invention can effectively reconstruct the dynamic range of a real scene from a single ordinary digital image, and can be used for HDR display simulation of ordinary digital images or to provide more realistic results for image-based lighting.

Summary of the Invention

The purpose of the present invention is to recover, as faithfully as possible, a high dynamic range image of the original scene from a single ordinary digital image. Here, an ordinary digital image refers to a low dynamic range (LDR) image stored with 8-bit color depth and 256 tone levels, and a high dynamic range image refers to an image stored in a format such as ".EXR" or ".HDR" that closely captures the light and dark variations of the real scene.

To achieve the above purpose, the present invention adopts a deep-learning-based approach to build a mapping network from LDR images to HDR images, and trains the network on training data so that it learns an end-to-end mapping from LDR images to HDR images; the overall framework is shown in Figure 1. The algorithm is divided into two parts: data preprocessing and deep neural network training. The data preprocessing part consists of three steps: generation of training sample pairs, alignment of HDR image brightness units, and generation of image highlight-region masks. The neural network is divided into a basic HDR reconstruction network and a training optimization network, as shown in Figure 2. Its loss function comprises three terms: the scale-invariant loss of the HDR reconstructed image, the cross-entropy loss of highlight-region classification, and the generative adversarial loss.

The method specifically includes the following contents and steps:

1. Data Preprocessing

1) Generating LDR training sample inputs

Before a deep neural network can be trained in a supervised manner, a training data set matching the network's inputs and outputs must be obtained. The training data set contains a number of LDR-HDR image pairs. The HDR image data can come from existing, publicly available HDR images; they serve as the sample labels that supervise the training of the network. The LDR image data, which serve as the sample inputs corresponding to the HDR images, must be generated from the original HDR images. There are two ways to generate them: one is to use a tone mapping algorithm to produce an LDR image from the HDR image; the other is to construct a virtual camera that treats the HDR image as a simulated scene and performs a simulated capture to obtain the LDR image.

Generating LDR images with a tone mapping algorithm: select an appropriate tone mapping algorithm and feed the HDR image directly into it to obtain the corresponding LDR image output.

Obtaining LDR images by constructing a virtual camera: first, the range of possible dynamic ranges for the virtual camera is determined with reference to commonly used digital SLR cameras, and for each LDR capture a value is randomly drawn from this range as the dynamic range of the simulated camera; the virtual camera then auto-exposes according to the input HDR image, clamps pixels whose brightness exceeds the camera's dynamic range to the boundary value, and linearly maps the result into the low dynamic range of the LDR image; finally, the resulting image is mapped from linear space into an ordinary digital image through a randomly selected approximate camera response function, yielding the required LDR image.
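As a concrete illustration, the following is a minimal NumPy sketch of such a virtual camera. The dynamic-range interval, the median-anchored auto-exposure, and the gamma-family stand-in for the randomly chosen camera response function are illustrative assumptions, not parameter choices stated in the patent.

```python
import numpy as np

def virtual_camera_ldr(hdr, dr_range=(1000.0, 4000.0), rng=None):
    """Simulate one LDR capture of a linear-RGB HDR radiance map.

    dr_range, the auto-exposure rule, and the gamma-family camera
    response below are illustrative assumptions for this sketch.
    """
    rng = rng or np.random.default_rng()
    dr = rng.uniform(*dr_range)               # random camera dynamic range
    exposure = 0.5 / (np.median(hdr) + 1e-8)  # auto-expose: median to mid-range
    lin = hdr * exposure
    lo, hi = 1.0 / np.sqrt(dr), np.sqrt(dr)   # representable luminance span
    lin = np.clip(lin, lo, hi)                # out-of-range pixels -> boundary value
    lin = (lin - lo) / (hi - lo)              # linear map into [0, 1]
    gamma = rng.uniform(1.8, 2.4)             # randomly chosen approximate CRF
    return np.round(255 * lin ** (1.0 / gamma)).astype(np.uint8)
```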

2) Aligning HDR sample labels

For HDR images stored in the relative brightness domain, their brightness units must be aligned before they are used as training sample labels. Let the original HDR image be H, and let L be the LDR image converted to linear space and normalized to [0,1]; H_{l,c} and L_{l,c} denote the pixel values of each image at position l and channel c. The alignment takes the form

[alignment formula, given only as an image in the source (Figure BDA0002377717760000021)]

where H̃ denotes the aligned HDR image and m_{l,c} is defined as

[weighting formula, given only as an image in the source (Figure BDA0002377717760000023)]

where τ is a constant in [0,1]. Each aligned HDR image and its corresponding LDR image form a training sample pair for neural network training.

3) Generating the highlight mask image

Once the aligned HDR image has been obtained, a mask image of the high-brightness regions can be produced by binarization. Regions with value 1 in the mask represent objects or surfaces of high brightness in the scene, such as light sources and strong specular reflections. These highlight regions usually have their brightness clipped in the LDR image due to overexposure. The highlight mask generated from the HDR image serves as a sample label for the training-optimization part of the network and is used to guide the training process.

2. Training the Neural Network

1) Neural network structure

The network consists of two parts, a generative network and a discriminative network; the structure is shown in Figure 2. The generative network is a U-Net: it takes an LDR image as input and, after an encoder built from the ResNet50 model and a decoder composed of six "upsampling + convolution layer" modules, outputs an HDR image and a highlight mask image. The HDR image output by the network is the HDR reconstruction of the LDR input, while the highlight mask image is the network's prediction of the highlight regions in the LDR image and serves as data for optimizing network training. The discriminator network is a fully convolutional network of four convolutional layers; it takes an HDR image and a highlight mask image as input and outputs a feature map representing the probability that the input HDR image is a real HDR image rather than a fake HDR image generated by the network. This feature map serves as data for training the neural network.

2) Neural network training method

The invention trains the above networks by supervised learning. Training uses the Adam optimizer to alternately back-propagate and update the generative network and the discriminative network. The generative network is driven by three loss terms, combined as follows:

L_G = α1·L_hdr + α2·L_mask + α3·L_gan

The total loss is controlled by three loss functions: the scale-invariant loss of the HDR reconstructed image, the cross-entropy loss of highlight-region classification, and the generative adversarial loss.

Scale-invariant loss of the HDR reconstructed image: this loss term is motivated by the scale invariance of HDR images in the relative brightness domain; it drives the HDR image output by the network to be as close as possible to the HDR label value. It is defined as follows:

L_hdr = (1/n) · Σ_{l,c} d_{l,c}² − (1/n²) · ( Σ_{l,c} d_{l,c} )²,   with d_{l,c} = log(y_{l,c} + ε) − log(ỹ_{l,c} + ε)

where y denotes the HDR image output by the network, ỹ is the target image, the subscripts l and c denote pixel position and color channel respectively, n is the number of pixel-channel entries, d_{l,c} is the difference between the network output and the sample label in the logarithmic domain, and ε is a small value that prevents taking the logarithm of zero. (The source gives the formula only as an image; the form above is the standard scale-invariant loss implied by the surrounding text.) The first term is an ordinary L2 loss; with the second term introduced, the loss depends only on the difference between the predicted value and the sample label, and is unaffected by their absolute magnitudes.

Cross-entropy loss of highlight-region classification: this loss governs the network's detection of high-brightness regions in the image. The highlight mask image output by the network is a per-pixel classification of the input LDR image into highlight and non-highlight regions, and should match, as closely as possible, the highlight mask generated in the preprocessing step. The loss function is defined as follows:

L_mask = −(1/n) · Σ_l [ m̃_l · log(m_l) + (1 − m̃_l) · log(1 − m_l) ]

where m and m̃ are the network prediction value and the label value, respectively. (The source gives the formula only as an image; the form above is the standard binary cross-entropy the text describes.)

Generative adversarial loss: this loss pushes the distribution of the network's predicted HDR images toward that of real HDR images, preventing the optimization from merely shrinking numerical differences between predictions and labels while ignoring differences in overall distribution. The loss function is defined as follows:

L_gan = − E[ D(y) ]

where D(y) is the result computed by the discriminative network with the generative network's output y as input. (The source gives the formula only as an image; the form above is the WGAN-style generator loss matching the WGAN-GP critic below.)

The loss function of the discriminative network is the standard WGAN-GP loss, which drives the discriminator to judge, as accurately as possible, whether an image input to it is a real HDR image. It is defined as follows:

L_D = E[ D(ŷ) ] − E[ D(y) ] + λ · E[ ( ‖∇_x̂ D(x̂)‖₂ − 1 )² ],   with x̂ = ε·y + (1 − ε)·ŷ

where y is a real HDR image, ŷ is the generated HDR image, λ is the gradient-penalty weight, and ε is a random number in [0,1]. (The source gives the two formulas only as images; the form above is the standard WGAN-GP loss the text names.)
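A PyTorch sketch of this critic loss; the gradient-penalty weight of 10 is the common default from the WGAN-GP literature, not a value stated in the patent:

```python
import torch

def wgan_gp_d_loss(critic, real, fake, gp_weight=10.0):
    """L_D sketch: standard WGAN-GP critic loss with gradient penalty."""
    eps = torch.rand(real.size(0), 1, 1, 1, device=real.device)
    x_hat = (eps * real + (1 - eps) * fake).requires_grad_(True)
    d_hat = critic(x_hat)
    grads = torch.autograd.grad(d_hat, x_hat,
                                grad_outputs=torch.ones_like(d_hat),
                                create_graph=True)[0]
    penalty = (grads.flatten(1).norm(2, dim=1) - 1).pow(2).mean()
    return critic(fake).mean() - critic(real).mean() + gp_weight * penalty
```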

The network is trained according to the above method; once the loss function converges, the generative network has established a mapping from a single LDR image to an HDR image. With this trained generative network, the goal of reconstructing, as faithfully as possible, a realistic high dynamic range image from a single ordinary digital image is achieved.

Compared with the prior art, the present invention has the following advantages:

1. The present invention constructs an end-to-end neural network that reconstructs a sufficiently realistic HDR image from a single picture, with no manual interaction required;

2. The present invention aligns the HDR data and trains the network on the basis of scale invariance, achieving better reconstruction quality;

3. The present invention optimizes network training by generating highlight mask images, achieving better reconstruction in high-brightness regions.

Brief Description of the Drawings

Figure 1 is the overall framework diagram of the present invention;

Figure 2 is the deep neural network structure diagram of the present invention;

Figure 3 is a schematic diagram of the effect of the present invention.

Detailed Description

The technical solution of the present invention is further described below with reference to the accompanying drawings and an embodiment.

As shown in Figure 1, the present invention is a deep-learning-based image high dynamic range reconstruction method comprising the following steps:

Step 1: preprocess the HDR data set to construct the training data for the neural network. First, LDR data are generated from the collected HDR data set; the LDR-HDR pairs are then used to align the HDR data; next, highlight mask images are generated from the aligned HDR images; finally, the three are combined into the training data for the neural network. The LDR data serve as the input, while the aligned HDR data and the highlight mask images serve as label data.

Step 2: build and train the neural network to obtain a network model embodying the LDR-to-HDR mapping. Following the training strategy, the generative network and the discriminative network are back-propagated and updated alternately until the loss function converges. At that point, the generative network is the final network model used to reconstruct high dynamic range images.

Step 3: with the generative network model trained in step 2, the LDR image to be reconstructed is fed directly into the model, which outputs the reconstructed HDR image.

The method is described in detail below with reference to an example.

1. Data Preprocessing

The HDR data set is assembled from existing public HDR data sets, with the collected data cropped, scaled, and otherwise processed into a set of images of uniform size and type. From this consolidated data set, LDR data are obtained both with the Display Adaptive Tone Mapping algorithm and with the virtual-camera capture procedure described above. The tone mapping operator is chosen according to the type of pictures expected in the actual application: if most pictures in the application are captured without post-processing, the method of this embodiment is chosen; if the pictures have undergone some particular processing, an operator whose output resembles that post-processing is selected. Specifically, for each HDR image, one LDR image is obtained with the display-adaptive tone mapping algorithm and one LDR image is obtained with a single capture by a virtual camera with random parameters; that is, each HDR image yields two LDR images produced by different methods. The generated LDR images are converted to linear space and normalized from integer values in [0,255] to fractional values in [0,1].

Each HDR image is then brightness-aligned against its paired LDR image. The alignment follows the formula given in the Summary of the Invention:

[alignment formula, given only as an image in the source (Figure BDA0002377717760000051)]

where H̃ is the aligned HDR image and m_{l,c} is defined as

[weighting formula, given only as an image in the source (Figure BDA0002377717760000053)]

where τ can be a constant in [0,1]; here τ = 0.08 is used. Applying this formula to every LDR-HDR pair yields the aligned HDR image data, which serve as the sample labels during network training, while the LDR image data serve as the sample inputs.

Finally, based on the aligned HDR data, the highlight mask image is computed by the following formula:

m̃_l = 1 if H̄_l ≥ t, and m̃_l = 0 otherwise

where H̄ is the channel-mean image of the aligned HDR image and t is a constant, here t = e^0.1. (The source gives the formula only as an image; the thresholding form above is the binarization described in the text.) This mask image serves as a further sample label during network training and, together with the HDR image, supervises the learning process.

2. Neural Network Training

The neural network is built according to the structure shown in Figure 2. Specifically, the ResNet50 part is initialized with an existing model pre-trained on the ImageNet classification task; every other network block is a convolution block consisting of a convolutional layer, an instance normalization operation, and a ReLU activation; the input to each stage of the decoder in the generative network is the concatenation of the previous convolution block's output with the output of the encoder at the symmetric position; the discriminator network takes the generative network's output as its input and, after four convolution blocks, outputs a feature map of probabilities that the input HDR image is real.
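The following PyTorch sketch illustrates the building blocks just described. The channel widths, the nearest-neighbor upsampling mode, the LeakyReLU slope, and the 4-channel critic input (HDR image concatenated with its mask) are illustrative assumptions; only the ResNet50/ImageNet initialization, the "upsample + conv + instance norm + ReLU" decoder blocks, the symmetric skip concatenation, and the 4-layer fully convolutional discriminator come from the text.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

class UpBlock(nn.Module):
    """One decoder stage: upsample, concat the symmetric encoder feature
    (U-Net skip connection), then conv + instance norm + ReLU."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.up = nn.Upsample(scale_factor=2, mode="nearest")
        self.conv = nn.Sequential(
            nn.Conv2d(c_in, c_out, 3, padding=1),
            nn.InstanceNorm2d(c_out),
            nn.ReLU(inplace=True),
        )

    def forward(self, x, skip=None):
        x = self.up(x)
        if skip is not None:
            x = torch.cat([x, skip], dim=1)   # encoder feature, symmetric stage
        return self.conv(x)

# Encoder initialized from an ImageNet-pretrained ResNet50, per the text
# (torchvision >= 0.13 weights API).
encoder = resnet50(weights="IMAGENET1K_V1")

class Critic(nn.Module):
    """Fully convolutional 4-layer discriminator over HDR image + mask."""
    def __init__(self, c_in=4):               # 3 HDR channels + 1 mask channel
        super().__init__()
        chans, layers, c = [64, 128, 256, 1], [], c_in
        for i, c_out in enumerate(chans):
            layers.append(nn.Conv2d(c, c_out, 4, stride=2, padding=1))
            if i < len(chans) - 1:
                layers.append(nn.LeakyReLU(0.2, inplace=True))
            c = c_out
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)                    # per-patch realness feature map
```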

The generator loss is then computed from the generative network's output, the corresponding input's sample label data, and the probability feature map that the discriminative network computes from that output, according to the following formula:

L_G = α1·L_hdr + α2·L_mask + α3·L_gan

The total loss is controlled by three loss functions: the scale-invariant loss of the HDR reconstructed image, the cross-entropy loss of highlight-region classification, and the generative adversarial loss.

Scale-invariant loss of the HDR reconstructed image: this loss term is motivated by the scale invariance of HDR images in the relative brightness domain; it drives the HDR image output by the network to be as close as possible to the HDR label value. It is defined as follows:

L_hdr = (1/n) · Σ_{l,c} d_{l,c}² − (1/n²) · ( Σ_{l,c} d_{l,c} )²,   with d_{l,c} = log(y_{l,c} + ε) − log(ỹ_{l,c} + ε)

where y denotes the HDR image output by the network, ỹ is the target image, the subscripts l and c denote pixel position and color channel, d_{l,c} is the log-domain difference between the network output and the sample label, and ε is a small value preventing a logarithm of zero. The first term is an ordinary L2 loss; with the second term, the loss depends only on the difference between prediction and label, independent of their absolute magnitudes.

Cross-entropy loss of highlight-region classification: this loss governs the network's detection of high-brightness regions in the image. The highlight mask image output by the network is a per-pixel classification of the input LDR image into highlight and non-highlight regions, and should match, as closely as possible, the highlight mask generated in the preprocessing step. The loss function is defined as follows:

L_mask = −(1/n) · Σ_l [ m̃_l · log(m_l) + (1 − m̃_l) · log(1 − m_l) ]

where m and m̃ are the network prediction value and the label value, respectively.

Generative adversarial loss: this loss pushes the distribution of the network's predicted HDR images toward that of real HDR images, preventing the optimization from merely shrinking numerical differences between predictions and labels while ignoring differences in overall distribution. The loss function is defined as follows:

L_gan = − E[ D(y) ]

where D(y) is the result computed by the discriminative network with the generative network's output as input.

The loss function of the discriminative network is the standard WGAN-GP loss, which drives the discriminator to judge, as accurately as possible, whether an image input to it is a real HDR image. It is defined as follows:

L_D = E[ D(ŷ) ] − E[ D(y) ] + λ · E[ ( ‖∇_x̂ D(x̂)‖₂ − 1 )² ],   with x̂ = ε·y + (1 − ε)·ŷ

where ε is a random number in [0,1].

Here α1, α2, α3 are set to 1, 0.1, and 0.02 respectively; the formula for each loss term is as described in the Summary of the Invention. From the computed loss value, the network is back-propagated and its weights updated with the Adam optimization algorithm. In addition, after each update of the generative network's weights, the loss of the discriminative network is computed and its weights updated in turn, likewise with the Adam algorithm, using the formulas described in the Summary.

Following this training procedure, one or more pairs of training data are fed to the network at each step, iterating over the whole training data set in cycles until the loss function converges, at which point training is complete.
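Putting the pieces together, a sketch of this alternating training loop, reusing the loss sketches above; the learning rate and step count are illustrative assumptions, while the weights (1, 0.1, 0.02) come from this embodiment:

```python
import itertools
import torch

def train(gen, critic, loader, steps=100_000, lr=1e-4,
          alphas=(1.0, 0.1, 0.02)):
    """Alternating Adam updates for the generator and the critic (sketch)."""
    opt_g = torch.optim.Adam(gen.parameters(), lr=lr)
    opt_d = torch.optim.Adam(critic.parameters(), lr=lr)
    a1, a2, a3 = alphas
    data = itertools.islice(itertools.cycle(loader), steps)
    for ldr, hdr_label, mask_label in data:
        hdr_pred, mask_pred = gen(ldr)
        fake = torch.cat([hdr_pred, mask_pred], dim=1)
        # Generator step: L_G = a1*L_hdr + a2*L_mask + a3*L_gan.
        loss_g = (a1 * scale_invariant_hdr_loss(hdr_pred, hdr_label)
                  + a2 * highlight_mask_loss(mask_pred, mask_label)
                  - a3 * critic(fake).mean())          # L_gan = -E[D(y)]
        opt_g.zero_grad(); loss_g.backward(); opt_g.step()
        # Critic step: standard WGAN-GP loss on real vs. generated pairs.
        real = torch.cat([hdr_label, mask_label], dim=1)
        loss_d = wgan_gp_d_loss(critic, real, fake.detach())
        opt_d.zero_grad(); loss_d.backward(); opt_d.step()
```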

3. Network Model Application

After training is complete, the generative sub-network and its weight parameters are extracted to obtain the final high dynamic range image reconstruction model. With this model, a single LDR picture as input suffices to produce an approximately true-to-life HDR picture. Figure 3 shows an application example, in which the generative network model is the generative part of the neural network trained by the above method; the model accepts an LDR picture of arbitrary size as input and directly outputs an HDR reconstruction of the same size.
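A usage sketch of the extracted generator; the OpenCV file I/O and the gamma-2.2 linearization of the input are assumptions (the embodiment feeds linear-space inputs normalized to [0,1]):

```python
import cv2
import numpy as np
import torch

@torch.no_grad()
def reconstruct_hdr(gen, ldr_path, out_path="out.hdr"):
    """Run the trained generator on one LDR image."""
    bgr = cv2.imread(ldr_path).astype(np.float32) / 255.0
    rgb = bgr[..., ::-1] ** 2.2                  # approximate linearization
    x = torch.from_numpy(rgb.copy()).permute(2, 0, 1).unsqueeze(0)
    hdr_pred, _ = gen.eval()(x)                  # highlight mask output ignored
    hdr = hdr_pred[0].permute(1, 2, 0).numpy()[..., ::-1]
    cv2.imwrite(out_path, np.ascontiguousarray(hdr))  # Radiance .hdr file
    return hdr
```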

Based on deep learning, the present invention proposes a method for reconstructing high dynamic range from a single ordinary digital image, enabling effective, realistic high dynamic range reconstruction for images of general scenes. The invention is widely applicable: by training on different data sets it can adapt to scenarios with different requirements, and for a given training set it needs to be trained only once to be applied many times.

Claims (4)

1. A deep-learning-based image high dynamic range reconstruction method, characterized by comprising the following steps:
step 1: a neural network is established based on a deep-learning method, comprising a generative network from a low dynamic range image to a high dynamic range image and a discriminative network for judging whether a high dynamic range image is real;
step 2: an HDR data set is preprocessed to form training data, the data preprocessing comprising three parts: generation of LDR data, alignment of HDR image brightness units, and generation of image highlight-region masks; the preprocessed LDR data serve as the training input of the generative network, whose outputs are an HDR image and a highlight mask image; the aligned HDR data and the highlight mask images obtained by the preprocessing serve as the sample label data for training; the discriminative network accepts an HDR image and a highlight mask image as input and outputs a feature map representing the probability that the input HDR image is a real HDR image rather than a fake HDR image generated by the network;
step 3: the neural network is trained on the basis of three loss functions in a supervised-learning manner, with an Adam optimizer performing back-propagation optimization of the generative network and the discriminative network in turn; the three loss functions are, respectively, a scale-invariant loss function of the HDR reconstructed image, a cross-entropy loss function of highlight-region classification, and a generative adversarial loss function, combined as follows:
L_G = α1·L_hdr + α2·L_mask + α3·L_gan
the scale-invariant loss function of the HDR reconstructed image is defined as follows:

L_hdr = (1/n) · Σ_{l,c} d_{l,c}² − (1/n²) · ( Σ_{l,c} d_{l,c} )²,   with d_{l,c} = log(y_{l,c} + ε) − log(ỹ_{l,c} + ε)

where y represents the HDR image output by the network, ỹ is the aligned HDR image, d_{l,c} represents the difference between the network output and the sample label in the logarithmic domain, ε is a small value preventing a logarithm of zero, and the subscripts l and c denote pixel position and color channel, respectively;
the cross-entropy loss function of highlight-region classification is defined as follows:

L_mask = −(1/n) · Σ_l [ m̃_l · log(m_l) + (1 − m̃_l) · log(1 − m_l) ]

where m and m̃ are the network prediction value and the label value, respectively;
the generative adversarial loss function is defined as follows:

L_gan = − E[ D(y) ]

where D(y) is the result computed by the discriminative network taking the generative network's output as input;
the network is trained according to these loss functions, and after the loss functions converge, the generative network model is extracted as the final algorithm model.
2. The method according to claim 1, characterized in that: the low dynamic range image refers to a low dynamic range image stored at 8-bit color depth with 256 tone levels, and the high dynamic range image refers to a high dynamic range image stored in the ".EXR" or ".HDR" format that approximates the brightness variations of a real scene.
3. The method according to claim 1, characterized in that: the neural network described in step 1 comprises a generative network and a discriminative network; the generative network has a U-Net structure that receives an LDR image as input and, after an encoding network formed by a ResNet50 model and a decoding network formed by six "upsampling + convolution layer" modules, outputs an HDR image and a highlight mask image respectively; the discriminative network is a fully convolutional network formed by four convolutional layers that receives an HDR image and a highlight mask image as inputs and outputs a feature map representing the probability that the input HDR image is a real HDR image rather than a fake HDR image generated by the network.
4. The method according to claim 1, characterized in that: the data preprocessing described in step 2 comprises the following specific processes:
step 2.1: generation of LDR data. To generate the LDR training-sample inputs, each HDR image is captured both with a tone mapping algorithm and with a virtual camera to obtain LDR images: an appropriate tone mapping algorithm is selected, and the HDR image is fed directly to it to obtain the corresponding LDR image output; meanwhile, an LDR image is also obtained by constructing a virtual camera: first, the range of possible dynamic ranges of the virtual camera is determined with reference to common digital single-lens reflex cameras, and for each LDR capture a value within this range is randomly selected as the dynamic range of the simulated camera; the virtual camera then auto-exposes according to the input HDR image, clamps pixels whose brightness exceeds the camera's dynamic range to the boundary value, and linearly maps the result to the low dynamic range of the LDR image; finally, the obtained image is mapped from linear space to an ordinary digital image through a randomly selected approximate camera response function, yielding the required LDR image;
step 2.2: alignment of HDR image brightness units. For HDR images stored in the relative brightness domain, their brightness units are aligned before they are used as training-sample labels; let the original HDR image be H, let the LDR image be converted to linear space, normalized to [0,1], and denoted L, and let H_{l,c}, L_{l,c} be the pixel values of each image at position l and channel c; the alignment takes the form

[alignment formula, given only as an image in the source (Figure FDA0004105937280000021)]

where H̃ is the aligned HDR image and m_{l,c} is defined as

[weighting formula, given only as an image in the source (Figure FDA0004105937280000023)]

where τ is a constant in [0,1], and the aligned HDR image and the corresponding LDR image form a training sample pair for training the neural network;
step 2.3: generation of the image highlight-region mask. With the aligned HDR image brightness units obtained, a mask image of the highlight regions in the image is produced by binarization, by the formula:

m̃_l = 1 if H̄_l ≥ t, and m̃_l = 0 otherwise

where H̄ is the channel-mean image of the aligned HDR image and t is a constant; regions with value 1 in the mask image represent objects or surfaces of higher brightness in the scene, including light sources and strongly reflective surfaces.
CN202010072803.3A (priority and filing date 2020-01-21) — A high dynamic range image reconstruction method based on deep learning — Active — CN111292264B (en)

Priority Applications (1)

Application Number: CN202010072803.3A — Priority/Filing Date: 2020-01-21 — Title: A high dynamic range image reconstruction method based on deep learning (CN111292264B)

Applications Claiming Priority (1)

Application Number: CN202010072803.3A — Priority/Filing Date: 2020-01-21 — Title: A high dynamic range image reconstruction method based on deep learning (CN111292264B)

Publications (2)

Publication Number Publication Date
CN111292264A CN111292264A (en) 2020-06-16
CN111292264B (en) 2023-04-21

Family

ID=71023475

Family Applications (1)

Application Number: CN202010072803.3A — Status: Active — Publication: CN111292264B (en) — Title: A high dynamic range image reconstruction method based on deep learning

Country Status (1)

Country Link
CN (1) CN111292264B (en)

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111784598B (en) * 2020-06-18 2023-06-02 Oppo(重庆)智能科技有限公司 Training method of tone mapping model, tone mapping method and electronic equipment
CN113920040A (en) * 2020-07-09 2022-01-11 阿里巴巴集团控股有限公司 Video processing method and model construction method
CN111986106B (en) * 2020-07-30 2023-10-13 南京大学 A high dynamic image reconstruction method based on neural network
CN112435306A (en) * 2020-11-20 2021-03-02 上海北昂医药科技股份有限公司 G banding chromosome HDR image reconstruction method
CN112738392A (en) * 2020-12-24 2021-04-30 上海哔哩哔哩科技有限公司 Image conversion method and system
WO2022226771A1 (en) * 2021-04-27 2022-11-03 京东方科技集团股份有限公司 Image processing method and image processing device
CN113344773B (en) * 2021-06-02 2022-05-06 电子科技大学 Single picture reconstruction HDR method based on multi-level dual feedback
CN113379698B (en) * 2021-06-08 2022-07-05 武汉大学 Illumination estimation method based on step-by-step joint supervision
CN117441186A (en) * 2021-06-24 2024-01-23 Oppo广东移动通信有限公司 Image decoding and processing method, device and equipment
CN113784175B (en) * 2021-08-02 2023-02-28 中国科学院深圳先进技术研究院 A kind of HDR video conversion method, device, equipment and computer storage medium
CN113674231B (en) * 2021-08-11 2022-06-07 宿迁林讯新材料有限公司 Method and system for detecting iron scale in rolling process based on image enhancement
US20240202891A1 (en) * 2021-09-17 2024-06-20 Boe Technology Group Co., Ltd. Method for training image processing model, and method for generating high dynamic range image
CN114820373B (en) * 2022-04-28 2023-04-25 电子科技大学 Single image reconstruction HDR method based on knowledge heuristic
CN114998138B (en) * 2022-06-01 2024-05-28 北京理工大学 A high dynamic range image artifact removal method based on attention mechanism
CN115297254B (en) * 2022-07-04 2024-03-29 北京航空航天大学 A portable high-dynamic imaging fusion system under high radiation conditions
CN115641333B (en) * 2022-12-07 2023-03-21 武汉大学 A method and system for indoor illumination estimation based on spherical harmonic Gaussian
CN117456313B (en) * 2023-12-22 2024-03-22 中国科学院宁波材料技术与工程研究所 Training method, estimation, mapping method and system of tone curve estimation network

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103413286A (en) * 2013-08-02 2013-11-27 北京工业大学 United reestablishing method of high dynamic range and high-definition pictures based on learning
CN103413285A (en) * 2013-08-02 2013-11-27 北京工业大学 HDR and HR image reconstruction method based on sample prediction
CN104969259A (en) * 2012-11-16 2015-10-07 汤姆逊许可公司 Processing high dynamic range images
WO2019001701A1 (en) * 2017-06-28 2019-01-03 Huawei Technologies Co., Ltd. Image processing apparatus and method

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120288217A1 (en) * 2010-01-27 2012-11-15 Jiefu Zhai High dynamic range (hdr) image synthesis with user input
US20160286226A1 (en) * 2015-03-24 2016-09-29 Nokia Technologies Oy Apparatus, a method and a computer program for video coding and decoding
US10048413B2 (en) * 2016-06-07 2018-08-14 Goodrich Corporation Imaging systems and methods

Also Published As

Publication number Publication date
CN111292264A (en) 2020-06-16

Similar Documents

Publication Publication Date Title
CN111292264B (en) A high dynamic range image reconstruction method based on deep learning
CN108986050B (en) Image and video enhancement method based on multi-branch convolutional neural network
CN110889813B (en) Low-light image enhancement method based on infrared information
Ren et al. Low-light image enhancement via a deep hybrid network
Fan et al. Multiscale low-light image enhancement network with illumination constraint
CN109523617B (en) Illumination estimation method based on monocular camera
CN110197229B (en) Training method and device of image processing model and storage medium
CN112884758B (en) Defect insulator sample generation method and system based on style migration method
CN111915525A (en) Low-illumination image enhancement method based on improved depth separable generation countermeasure network
CN110675462A (en) A Colorization Method of Grayscale Image Based on Convolutional Neural Network
WO2023212997A1 (en) Knowledge distillation based neural network training method, device, and storage medium
CN116993975A (en) Panoramic camera semantic segmentation method based on deep learning unsupervised field adaptation
CN111652864A (en) A casting defect image generation method based on conditional generative adversarial network
Hovhannisyan et al. AED-Net: A single image dehazing
CN112215100B (en) Target detection method for degraded image under unbalanced training sample
Fu et al. Low-light image enhancement base on brightness attention mechanism generative adversarial networks
CN116485791A (en) Method and system for automatic detection of double-view breast tumor lesion area based on absorption
CN113066074A (en) Visual saliency prediction method based on binocular parallax offset fusion
CN114511475B (en) Image generation method based on improved Cycle GAN
CN116152571A (en) Kitchen waste identification and classification method based on deep learning
CN114663315A (en) Image bit enhancement method and device for generating countermeasure network based on semantic fusion
CN114612658A (en) Image semantic segmentation method based on dual-class-level confrontation network
Zhang et al. Single image relighting based on illumination field reconstruction
Yoon et al. Shadow detection and removal from photo-realistic synthetic urban image using deep learning
CN117593222A (en) Low-illumination image enhancement method for progressive pixel level adjustment

Legal Events

Code — Description
PB01 — Publication
SE01 — Entry into force of request for substantive examination
GR01 — Patent grant