
CN115841523A - Double-branch HDR video reconstruction algorithm based on Raw domain - Google Patents


Info

Publication number: CN115841523A
Application number: CN202211113812.8A
Authority: CN (China)
Prior art keywords: raw, hdr, video, domain, branch
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other languages: Chinese (zh)
Inventors: 岳焕景, 彭昱博, 杨敬钰
Original/Current Assignee: Tianjin University
Application filed by Tianjin University


Landscapes

  • Image Processing (AREA)

Abstract

The invention discloses a dual-branch HDR video reconstruction algorithm based on the Raw domain, belonging to the technical field of video signal processing. The algorithm comprises the following steps: S1, establishing a synthetic Raw-domain video HDR dataset; S2, designing a dual-branch Raw video HDR reconstruction algorithm based on S1; S3, training the model; S4, inputting the low-dynamic-range (LDR) Raw video sequences of the test set into the model to obtain the corresponding high-dynamic-range output. The invention synthesizes the first video HDR dataset simulating a real noise distribution in the Raw domain, providing a benchmark dataset for training and evaluating HDR reconstruction methods at night or in extreme scenes. Meanwhile, the proposed content enhancement module achieves dynamic-range expansion of noisy LDR video in difficult scenes.

Description

A Dual-Branch HDR Video Reconstruction Algorithm Based on the Raw Domain

Technical Field

The present invention relates to the technical field of video signal processing, and in particular to a dual-branch HDR video reconstruction algorithm based on the Raw domain.

Background Art

High dynamic range (HDR) technology uses multiple low-dynamic-range (LDR) images with different exposures to expand the dynamic range of an image, enriching image details and improving contrast. Scene irradiance in natural scenes can range from roughly 10^-5 to 10^9, while the photos recorded by ordinary cameras may have a bit depth of only 8 or 10 bits, which cannot fully capture the brightness range of the scene. When the dynamic range captured by the camera is smaller than that of the scene, the captured image may contain overexposed or underexposed regions, degrading visual quality.
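The magnitude of this gap can be made concrete with a quick back-of-the-envelope calculation (an illustrative sketch, not part of the patent; "stops" here simply means doublings of intensity):

```python
import math

# Natural scene irradiance spans roughly 1e-5 to 1e9, i.e. 14 orders
# of magnitude, while an n-bit capture distinguishes only 2**n levels.
scene_stops = math.log2(1e9 / 1e-5)    # ~46.5 photographic stops
stops_8bit = math.log2(2 ** 8)         # 8 stops of linear levels
stops_10bit = math.log2(2 ** 10)       # 10 stops

print(f"scene range: {scene_stops:.1f} stops")
print(f"8-bit: {stops_8bit:.0f} stops, 10-bit: {stops_10bit:.0f} stops")
```

The natural scene covers over four times as many doublings as an 8-bit capture can linearly represent, which is why single-exposure capture must clip either highlights or shadows.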

With the development of HDR technology, reconstruction has gradually shifted from traditional statistics-based fusion to deep-learning-based methods. These deep learning methods usually involve two steps: first, aligning LDR images taken at different times with different exposures, and second, fusing the aligned LDR images into an HDR image. Compared with image HDR reconstruction, video HDR reconstruction must reconstruct an HDR result for every frame of the original LDR sequence. Existing methods typically target alternately exposed video sequences (e.g. -2EV, +2EV, -2EV, ...), using a sliding-window scheme that inputs three or five adjacent LDR frames of different exposures and reconstructs the HDR result of the middle frame after alignment and fusion.
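The sliding-window frame selection over an alternately exposed sequence can be sketched as follows; the `sliding_windows` helper and the EV values are illustrative, not from the patent:

```python
def sliding_windows(num_frames, window=3):
    """Yield (neighbor indices, center index) for each reconstructable
    frame: a window of adjacent LDR frames is used to reconstruct the
    HDR result of its middle frame."""
    half = window // 2
    for center in range(half, num_frames - half):
        yield list(range(center - half, center + half + 1)), center

# Alternating exposure pattern (-2EV, +2EV, ...) for an 8-frame sequence.
exposures = [-2 if i % 2 == 0 else +2 for i in range(8)]
for neighbors, center in sliding_windows(len(exposures), window=3):
    evs = [exposures[i] for i in neighbors]
    print(f"frames {neighbors} (EV {evs}) -> HDR frame {center}")
```

Note that each window mixes both exposure levels, so every reconstructed frame sees both short- and long-exposure information from its neighbors.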

Previous video HDR reconstruction methods usually operate directly on sRGB images. However, sRGB images have passed through the camera's complex internal image signal processing (ISP) pipeline, including black level correction, demosaicing, white balance, gamma correction, and color gamut conversion. As a result, sRGB images lose part of the original information, and some nonlinear mapping operations make HDR reconstruction harder. Using the Raw-domain data output by the camera sensor solves these problems well: Raw data has a wider bit depth and richer scene information, and, being unaffected by subsequent ISP processing, it has better linearity and is therefore better suited to HDR reconstruction.

On the other hand, HDR reconstruction of very dark scenes must account for severe noise. Raw-domain data can model noise more accurately, allowing a model to learn realistic denoising, and is widely used in image and video denoising tasks. Meanwhile, existing methods such as Kalantari et al., "Deep HDR video from sequences with alternating exposures," Computer Graphics Forum 38, 193–205 (2019), lack designs targeting noise in very dark regions, so such networks reconstruct noisy images poorly. Chen et al., "HDR video reconstruction: A coarse-to-fine network and a real-world benchmark dataset," Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 2502–2511 (2021), use two-stage alignment and fusion, which over-smooths details when a short-exposure frame is the reference and retains noise when a long-exposure frame is the reference.

In summary, deep-learning-based HDR image and video reconstruction methods are often limited by noise. Existing methods mainly train on images in the sRGB domain and cannot model real noise; moreover, their network structures are designed mainly for alignment and fusion and lack modules for handling noise, resulting in low HDR reconstruction quality in difficult scenes. To solve these problems, the present invention proposes a dual-branch HDR video reconstruction algorithm based on the Raw domain.

Summary of the Invention

The purpose of the present invention is to establish a Raw video HDR dataset and, on this basis, to propose a dual-branch HDR video reconstruction algorithm based on the Raw domain that solves the problems mentioned in the background art.

To achieve the above object, the present invention adopts the following technical solution:

A dual-branch HDR video reconstruction algorithm based on the Raw domain, specifically comprising the following steps:

S1. Establish a synthetic Raw-domain video HDR dataset, specifically including the following:

S101. Source sRGB video data selection: 21 sRGB-domain HDR videos shot by Froehlich, Kronander, et al. and the high-quality video dataset Vimeo-90K are selected as source sRGB video data. Each HDR video is re-exposed with selected exposure parameters to simulate an alternately exposed LDR video sequence;

S102. Raw dataset establishment: the sRGB videos are converted into a Raw video dataset by simulating the camera imaging pipeline. The specific steps are: inverting the camera response function (CRF) curve, simulating the bayer format, data augmentation, re-exposure, and adding noise, yielding Raw-domain HDR and LDR images that serve as Raw-domain training data pairs;

S2. Design the reconstruction algorithm: based on the data pairs obtained in S1, use LDR Raw frames I_i^raw and HDR Raw frames H_i^raw as training pairs to design the dual-branch Raw video HDR reconstruction algorithm;

S3. Train the model: build the model based on the reconstruction algorithm designed in S2 and train it with the PyTorch deep learning framework; iterate for 15 epochs over the entire dataset, then reduce the learning rate to 0.00001 and continue iterating until the loss converges, obtaining the final model;

S4. Output results: input the low-dynamic-range Raw video sequences of the test set into the final model obtained in S3 to obtain the corresponding high-dynamic-range outputs.

Preferably, converting the sRGB videos into Raw videos by simulating the camera imaging pipeline in S102 specifically comprises the following steps:

S1021. For the Vimeo-90K dataset, convert the video frames from the nonlinear domain to the linear domain by estimating the CRF curve;

S1022. Downsample the 3-channel sRGB frame into four quarter-resolution channels (red, green, green, blue) and combine them into a mosaic image according to the GRBG bayer pattern;

S1023. Convert the mosaic image into a 4-channel G, R, B, G image, then apply random scaling, translation, and rotation;

S1024. Convert the Raw-domain HDR frames into Raw-domain LDR frames according to specific exposure parameters;

S1025. Add simulated Gaussian and Poisson noise to the Raw-domain LDR frames.
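The bayer simulation of S1022–S1023 can be sketched as follows; `srgb_to_grbg_channels` is a hypothetical helper illustrating the GRBG packing, assuming the frame has already been linearized per S1021:

```python
import numpy as np

def srgb_to_grbg_channels(rgb):
    """Simulate a GRBG Bayer mosaic from a 3-channel linear frame (S1022).

    Each 2x2 Bayer cell samples G R / B G, so every channel ends up at
    quarter resolution (half width, half height). Channel order here is
    G, R, B, G, matching the 4-channel packing described in S1023.
    """
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return np.stack([
        g[0::2, 0::2],   # G at top-left of each 2x2 cell
        r[0::2, 1::2],   # R at top-right
        b[1::2, 0::2],   # B at bottom-left
        g[1::2, 1::2],   # G at bottom-right
    ], axis=0)

frame = np.random.rand(8, 8, 3).astype(np.float32)
packed = srgb_to_grbg_channels(frame)
print(packed.shape)   # (4, 4, 4): four quarter-resolution channels
```

The random scaling, translation, and rotation of S1023 would then be applied to this 4-channel packing, which keeps each augmented pixel aligned with its bayer phase.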

Preferably, designing the dual-branch Raw video HDR reconstruction algorithm mentioned in S2 specifically comprises the following steps:

S201. Noise estimation: three consecutive raw-domain LDR frames I_{i-1}^raw, I_i^raw, I_{i+1}^raw are input each time, and a noise estimation network estimates a noise level map:

Figure SMS_5

S202. Data processing and feature extraction: three consecutive raw-domain LDR frames I_{i-1}^raw, I_i^raw, I_{i+1}^raw and their corresponding exposure coefficients t_{i-1}, t_i, t_{i+1} are input, and the exposure coefficients are used to apply exposure correction to the input LDR images:

Figure SMS_8

The above formula maps the inputs to the same exposure level;
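The correction formula itself appears only as a figure in this record; for linear Raw data, such a correction commonly reduces to dividing each frame by its exposure coefficient t_i, which the sketch below assumes:

```python
import numpy as np

def exposure_correct(ldr_frames, exposures):
    """Map alternately exposed linear Raw LDR frames to one exposure
    level by dividing each frame by its exposure coefficient t_i (an
    assumption: the patent's exact formula is given only as a figure,
    but division by t_i is the usual correction for linear data)."""
    return [frame / t for frame, t in zip(ldr_frames, exposures)]

# A constant-radiance scene captured short (t=0.25) and long (t=1.0)
# differs by 4x in recorded value; after correction the frames agree.
scene = np.full((2, 2), 0.1)
frames = [scene * 0.25, scene * 1.0]
short, long_ = exposure_correct(frames, [0.25, 1.0])
print(np.allclose(short, long_))   # True
```

This only works because Raw data is linear in irradiance; on gamma-encoded sRGB frames the same division would not equalize the exposures, which is one motivation for working in the Raw domain.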

The frames then pass through a feature extraction module, which extracts features by convolution:

Figure SMS_10

where F_i denotes the features extracted from the i-th frame; the input LDR image helps detect overexposed and underexposed regions, the input HDR image helps the subsequent alignment, and the noise level map helps detect noisy regions;

S203. Feature alignment: a cascaded pyramid deformable convolution structure first downsamples the input features twice to obtain features at multiple scales:

Figure SMS_11

where Figure SMS_12 denotes the downsampling operator;

At the s-th scale, the features (Figure SMS_13) are concatenated with the intermediate-frame features (Figure SMS_14) to estimate the offsets (Figure SMS_15):

Figure SMS_16

The computed offsets (Figure SMS_17) serve as the offsets of a deformable convolution, which processes the previous frame's features to obtain the alignment result at the current scale:

Figure SMS_18

Figure SMS_19

where ↑2 denotes 2x bilinear upsampling; the alignment result at each scale is further fused by convolution with the alignment result from the previous scale. This coarse-to-fine joint prediction in the feature domain estimates large displacements more accurately;

S204. Temporal fusion: a spatial attention structure obtains attention correlations between adjacent frames through convolution, helping the network reconstruct ghost-free, correctly exposed HDR images:

Figure SMS_20

Figure SMS_21

where A_i denotes the predicted spatial attention, ⊙ denotes element-wise multiplication, and Figure SMS_22 denotes the temporally fused features;
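A minimal sketch of the attention-weighted fusion in S204, with the learned attention convolution replaced by a hand-written similarity score (an illustrative assumption, not the patent's network):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def attention_fuse(ref_feat, neighbor_feats):
    """Spatial-attention fusion in the spirit of S204. The learned
    attention convolution is replaced by a simple similarity score:
    A_i = sigmoid(score(F_i, F_ref)), and the fused feature is the
    attention-weighted sum of neighbor features."""
    fused = np.zeros_like(ref_feat)
    for feat in neighbor_feats:
        score = -np.abs(feat - ref_feat)   # stand-in for a learned conv
        attn = sigmoid(score)              # A_i, one weight per element
        fused += attn * feat               # element-wise (⊙) weighting
    return fused

ref = np.random.rand(4, 4)
neighbors = [ref + 0.01 * np.random.randn(4, 4) for _ in range(3)]
out = attention_fuse(ref, neighbors)
print(out.shape)   # (4, 4)
```

The per-element weights are what suppress ghosting: regions of a neighbor frame that disagree with the reference receive low attention and contribute little to the fused feature.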

S205. Content enhancement branch: the aligned features (Figure SMS_23) pass through a residual estimation branch (REB), which extracts the high-frequency information of the input features and helps restore content missing from the other branch:

Figure SMS_24

where H_res denotes the estimated residual information and REB is the residual estimation branch, which contains multiple dense residual blocks;

S206. HDR reconstruction: the fused features (Figure SMS_25) pass through a series of residual blocks with skip connections, are added to H_res, and then pass through a sigmoid layer to obtain the final raw-domain HDR result (Figure SMS_26);
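The dataflow of S205–S206 can be sketched at the array level as follows; the residual-block body is a placeholder standing in for the learned layers, not the patent's network:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def residual_block(x):
    # Placeholder for a learned residual block: identity plus a small residual.
    return x + 0.1 * np.tanh(x)

def reconstruct_hdr(fused_feat, h_res, num_blocks=3):
    """Shape-level dataflow of S205-S206: fused features pass through
    residual blocks with a skip connection, the content-enhancement
    residual H_res is added, and a sigmoid bounds the final raw-domain
    HDR output. The block bodies stand in for the learned layers."""
    x = fused_feat
    for _ in range(num_blocks):
        x = residual_block(x)
    x = x + fused_feat   # skip connection over the residual blocks
    x = x + h_res        # add the content-enhancement branch output
    return sigmoid(x)    # sigmoid layer bounds the output to (0, 1)

feat = np.random.randn(4, 4)
h_res = 0.05 * np.random.randn(4, 4)
hdr = reconstruct_hdr(feat, h_res)
print(hdr.min() > 0 and hdr.max() < 1)   # True: sigmoid-bounded
```

The additive combination is the key design point: the main branch can denoise aggressively while the content-enhancement residual H_res reinjects high-frequency detail before the final sigmoid.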

S207. Loss function: the loss is the difference between the tone-mapped output (Figure SMS_27) and the tone-mapped ground truth (Figure SMS_28):

Figure SMS_29

Figure SMS_30

where μ is 5000, and T_i^raw and Figure SMS_31 denote the tone-mapped raw-domain ground-truth image and prediction, respectively.
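The tone-mapping operator is given only as a figure, but with μ = 5000 it matches the widely used μ-law compressor T(H) = log(1 + μH)/log(1 + μ); the sketch below assumes that form and an L1 norm for the loss:

```python
import numpy as np

MU = 5000.0   # μ as stated in S207

def mu_law_tonemap(h):
    # μ-law compressor T(H) = log(1 + μH) / log(1 + μ), assumed here
    # because the patent gives the operator only as a figure.
    return np.log1p(MU * h) / np.log1p(MU)

def hdr_loss(pred, target):
    # L1 difference between tone-mapped prediction and ground truth
    # (the choice of L1 is an assumption).
    return np.abs(mu_law_tonemap(pred) - mu_law_tonemap(target)).mean()

gt = np.linspace(0.0, 1.0, 5)
print(mu_law_tonemap(np.array([0.0, 1.0])))   # endpoints map to [0. 1.]
print(hdr_loss(gt, gt))                        # 0.0 for a perfect prediction
```

Computing the loss after tone mapping concentrates the penalty in dark regions, where linear HDR values are tiny but perceptual errors (and noise) are most visible.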

Compared with the prior art, the present invention provides a dual-branch HDR video reconstruction algorithm based on the Raw domain with the following beneficial effects:

(1) The invention synthesizes, in the Raw domain, the first video HDR dataset simulating a real noise distribution, providing a benchmark dataset for training and evaluating HDR reconstruction methods at night or in extreme scenes;

(2) Based on the proposed Raw video HDR dataset, the invention proposes a dual-branch HDR video reconstruction method that uses the proposed deformable convolution alignment module and content enhancement module to achieve dynamic-range expansion of noisy LDR video in difficult scenes;

(3) Comparative experiments between the proposed reconstruction algorithm and mainstream reconstruction methods show that it outperforms current mainstream sRGB-based HDR reconstruction methods and is better than or comparable to directly transferring those methods to the Raw domain. It is hoped that this research will inspire further work on Raw-domain video HDR reconstruction.

Brief Description of the Drawings

FIG. 1 is a flow chart of the dual-branch HDR video reconstruction algorithm based on the Raw domain proposed by the present invention;

FIG. 2 is a visual comparison, on the test set, of the proposed dual-branch Raw-domain HDR video reconstruction algorithm against other video/image HDR reconstruction algorithms.

Detailed Description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention.

Embodiment 1:

A dual-branch HDR video reconstruction algorithm based on the Raw domain, specifically comprising the following steps:

S1. Establish a synthetic Raw-domain video HDR dataset, specifically including the following:

S101. Source sRGB video data selection: 21 sRGB-domain HDR videos shot by Froehlich, Kronander, et al. and the high-quality video dataset Vimeo-90K are selected as source sRGB video data. Each HDR video is re-exposed with selected exposure parameters to simulate an alternately exposed LDR video sequence;

S102. Raw dataset establishment: the sRGB videos are converted into a Raw video dataset by simulating the camera imaging pipeline. The specific steps are: inverting the camera response function (CRF) curve, simulating the bayer format, data augmentation, re-exposure, and adding noise, yielding Raw-domain HDR and LDR images that serve as Raw-domain training data pairs;

The conversion of sRGB video into Raw video by simulating the camera imaging pipeline in S102 specifically comprises the following steps:

S1021. For the Vimeo-90K dataset, convert the video frames from the nonlinear domain to the linear domain by estimating the CRF curve;

S1022. Downsample the 3-channel sRGB frame into four quarter-resolution channels (red, green, green, blue) and combine them into a mosaic image according to the GRBG bayer pattern;

S1023. Convert the mosaic image into a 4-channel G, R, B, G image, then apply random scaling, translation, and rotation;

S1024. Convert the Raw-domain HDR frames into Raw-domain LDR frames according to specific exposure parameters;

S1025. Add simulated Gaussian and Poisson noise to the Raw-domain LDR frames.

Raw-domain data has a different format from sRGB, so the original three RGB channels are first resampled into four RGGB channels and rearranged according to the bayer pattern to simulate the Raw format; then Gaussian and Poisson noise is added to the images to simulate real-world noise distributions;
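The Poisson–Gaussian noise synthesis of S1025 can be sketched as follows; the `photons` and `read_sigma` parameters are illustrative placeholders, not the patent's calibrated noise levels:

```python
import numpy as np

def add_raw_noise(clean, photons=1000.0, read_sigma=0.01, seed=None):
    """Simulate Raw capture noise: signal-dependent Poisson (shot) noise
    plus signal-independent Gaussian (read) noise. `photons` and
    `read_sigma` are illustrative levels, not calibrated parameters."""
    rng = np.random.default_rng(seed)
    shot = rng.poisson(clean * photons) / photons     # Poisson component
    read = rng.normal(0.0, read_sigma, clean.shape)   # Gaussian component
    return np.clip(shot + read, 0.0, 1.0)

clean = np.full((256, 256), 0.2)
noisy = add_raw_noise(clean, seed=0)
print(round(float(noisy.mean()), 2))   # 0.2: the noise is zero-mean
```

Because the Poisson variance scales with the signal, this model reproduces the signal-dependent noise of a real sensor, which an additive-Gaussian-only model (or noise added after the nonlinear sRGB pipeline) cannot.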

S2. Design the reconstruction algorithm: based on the data pairs obtained in S1, use LDR Raw frames I_i^raw and HDR Raw frames H_i^raw as training pairs to design the dual-branch Raw video HDR reconstruction algorithm;

S3. Train the model: build the model based on the reconstruction algorithm designed in S2. The model takes 3 input frames, the input video is cropped into 256×256 patches, and each batch contains 16 samples. The Adam optimizer is used with an initial learning rate of 0.0001. The model is trained with the PyTorch deep learning framework, iterating for 15 epochs over the entire dataset; the learning rate is then reduced to 0.00001 and iteration continues until the loss curve converges, yielding the final model;
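The two-stage learning-rate schedule just described can be written down directly (a small sketch of the stated 15-epoch, 1e-4 then 1e-5 schedule):

```python
def learning_rate(epoch, warm_epochs=15, base_lr=1e-4, final_lr=1e-5):
    """Two-stage schedule from the training setup: the base learning
    rate of 1e-4 for the first 15 epochs, then 1e-5 until the loss
    converges."""
    return base_lr if epoch < warm_epochs else final_lr

schedule = [learning_rate(e) for e in range(20)]
print(schedule[0], schedule[14], schedule[15])   # 0.0001 0.0001 1e-05
```

In a PyTorch training loop this would typically be applied by updating the optimizer's parameter-group learning rate at the epoch-15 boundary.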

S4. Output results: input the low-dynamic-range Raw video sequences of the test set into the final model obtained in S3 to obtain the corresponding high-dynamic-range outputs.

The first video HDR dataset simulating a real noise distribution is thus synthesized in the Raw domain, providing a benchmark dataset for training and evaluating HDR reconstruction methods at night or in extreme scenes.

Embodiment 2:

Referring to FIG. 1, this embodiment is based on Embodiment 1 but differs in that:

Designing the dual-branch Raw video HDR reconstruction algorithm mentioned in S2 specifically comprises the following steps:

S201. Noise estimation: three consecutive raw-domain LDR frames I_{i-1}^raw, I_i^raw, I_{i+1}^raw are input each time, and a noise estimation network estimates a noise level map:

Figure SMS_36

S202. Data processing and feature extraction: three consecutive raw-domain LDR frames I_{i-1}^raw, I_i^raw, I_{i+1}^raw and their corresponding exposure coefficients t_{i-1}, t_i, t_{i+1} are input, and the exposure coefficients are used to apply exposure correction to the input LDR images:

Figure SMS_39

The above formula maps the inputs to the same exposure level;

The frames then pass through a feature extraction module, which extracts features by convolution:

Figure SMS_41

where F_i denotes the features extracted from the i-th frame; the input LDR image helps detect overexposed and underexposed regions, the input HDR image helps the subsequent alignment, and the noise level map helps detect noisy regions;

S203. Feature alignment: a cascaded pyramid deformable convolution structure first downsamples the input features twice to obtain features at multiple scales:

Figure SMS_42

where Figure SMS_43 denotes the downsampling operator;

At the s-th scale, the features (Figure SMS_44) are concatenated with the intermediate-frame features (Figure SMS_45) to estimate the offsets (Figure SMS_46):

Figure SMS_47

The computed offsets (Figure SMS_48) serve as the offsets of a deformable convolution, which processes the previous frame's features to obtain the alignment result at the current scale:

Figure SMS_49

Figure SMS_50

where ↑2 denotes 2x bilinear upsampling; the alignment result at each scale is further fused by convolution with the alignment result from the previous scale. This coarse-to-fine joint prediction in the feature domain estimates large displacements more accurately;

S204. Temporal fusion: a spatial attention structure obtains attention correlations between adjacent frames through convolution, helping the network reconstruct ghost-free, correctly exposed HDR images:

Figure SMS_51

Figure SMS_52

where A_i denotes the predicted spatial attention, ⊙ denotes element-wise multiplication, and Figure SMS_53 denotes the temporally fused features.

S205. Content enhancement branch: the aligned features (Figure SMS_54) pass through a residual estimation branch (REB), which extracts the high-frequency information of the input features and helps restore content missing from the other branch:

Figure SMS_55

where H_res denotes the estimated residual information.

S206. HDR reconstruction: the fused features (Figure SMS_56) pass through a series of residual blocks with skip connections, are added to H_res, and then pass through a sigmoid layer to obtain the final raw-domain HDR result (Figure SMS_57);

S207. Loss function: the loss is the difference between the tone-mapped output (Figure SMS_58) and the tone-mapped ground truth (Figure SMS_59):

Figure SMS_60

Figure SMS_61

where μ is 5000, and T_i^raw and Figure SMS_62 denote the tone-mapped raw-domain ground-truth image and prediction, respectively.

Based on the Raw video HDR dataset, the present invention proposes a dual-branch HDR reconstruction method that uses the proposed deformable convolution alignment module and content enhancement module to achieve dynamic-range expansion of noisy LDR video in difficult scenes.

Embodiment 3:

Referring to FIG. 2, this embodiment is based on Embodiments 1-2 but differs in that:

The proposed dual-branch Raw-domain HDR video reconstruction algorithm is compared in the Raw domain with mainstream methods on the market; the results on the test set are shown in FIG. 2 and Table 1.

Figure SMS_63

Table 1. Metric comparison table

As can be seen from FIG. 2 and Table 1, the proposed dual-branch Raw-domain HDR video reconstruction algorithm better reduces the influence of noise while retaining original image detail, through the joint learning and complementary information of the two branches. Combining the actual image results with the data in the table, the proposed reconstruction algorithm clearly achieves better visual quality and better metrics.

The above is only a preferred embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any equivalent replacement or change made by a person skilled in the art within the technical scope disclosed by the present invention, according to its technical solution and inventive concept, shall be covered by the protection scope of the present invention.

Claims (3)

1.一种基于Raw域的双支路HDR视频重建算法,其特征在于,具体包括以下步骤:1. A dual-branch HDR video reconstruction algorithm based on Raw domain, characterized by comprising the following steps: S1、建立合成的Raw域视频HDR数据集:建立Raw数据集,具体包括以下内容:S1. Establish a synthetic Raw domain video HDR dataset: Establish a Raw dataset, specifically including the following contents: S101、源sRGB视频数据选择:选择若干个现有的sRGB域HDR视频和高质量视频数据集作为源sRGB视频数据,每个HDR视频通过选定的曝光参数重新曝光来模拟交替曝光LDR视频序列;S101, source sRGB video data selection: select a number of existing sRGB domain HDR videos and high-quality video datasets as source sRGB video data, and re-expose each HDR video by using selected exposure parameters to simulate an alternately exposed LDR video sequence; S102、Raw数据集建立:通过模拟相机成像管道流程,将sRGB视频转换为Raw视频数据集,具体步骤为:获取逆相机响应曲线→模拟bayer格式→数据增强→重新曝光→添加噪声得到Raw域的HDR和LDR图像,将所得的Raw域的HDR和LDR图像作为Raw域的训练数据对;S102, Raw data set establishment: by simulating the camera imaging pipeline process, the sRGB video is converted into a Raw video data set. The specific steps are: obtain the inverse camera response curve → simulate the bayer format → data enhancement → re-exposure → add noise to obtain the HDR and LDR images in the Raw domain, and use the obtained HDR and LDR images in the Raw domain as the training data pairs in the Raw domain; S2、设计重建算法:基于S1中所得的数据对,采用LDR的Raw帧Ii raw和HDR的Raw帧Hi raw作为训练对来设计双支路Raw视频HDR重建算法;S2. Design a reconstruction algorithm: Based on the data pair obtained in S1, use the LDR Raw frame I i raw and the HDR Raw frame H i raw as training pairs to design a dual-branch Raw video HDR reconstruction algorithm; S3、训练模型:基于S2中所设计的重建算法来搭建模型,并利用深度学习框架Pytorch平台训练模型,在整个数据集上迭代15个epcoh,随后减小学习率至0.00001,继续迭代直到损失收敛,得到最终模型;S3, training model: build a model based on the reconstruction algorithm designed in S2, and use the deep learning framework Pytorch platform to train the model. 
Iterate 15 epcohs on the entire dataset, then reduce the learning rate to 0.00001, and continue to iterate until the loss converges to obtain the final model; S4、输出结果:将测试集中的低动态范围的Raw视频序列输入到S3中所得的最终模型中,得到相应的高动态范围的输出结果。S4, output results: Input the low dynamic range Raw video sequence in the test set into the final model obtained in S3 to obtain the corresponding high dynamic range output results. 2.根据权利要求1所述的一种基于Raw域的双支路HDR视频重建算法,其特征在于,所述S102中提到模拟相机成像流程将sRGB视频转换为Raw视频数据集,具体包括以下步骤:2. According to the Raw domain-based dual-branch HDR video reconstruction algorithm of claim 1, it is characterized in that the analog camera imaging process mentioned in S102 converts the sRGB video into a Raw video data set, specifically comprising the following steps: S1021、对于高质量视频数据集,通过估计CRF曲线,将视频帧由非线性域转换到线性域;S1021. For a high-quality video dataset, convert the video frames from a nonlinear domain to a linear domain by estimating a CRF curve; S1022、将3通道sRGB帧下采样为四分之一分辨率的四个通道:红、绿、绿、蓝,按照GRBG的bayer格式组合为mosaic图像;S1022, down-sample the 3-channel sRGB frame into four channels of one-quarter resolution: red, green, green, and blue, and combine them into a mosaic image according to the GRBG bayer format; S1023、将mosaic图像转换为G、R、B、G的4通道图像,然后随机进行缩放、平移、旋转;S1023, converting the mosaic image into a 4-channel image of G, R, B, G, and then randomly scaling, translating, and rotating; S1024、按照特定的曝光参数将Raw域HDR帧转换为Raw域LDR帧;S1024, converting the Raw domain HDR frame into a Raw domain LDR frame according to specific exposure parameters; S1025、为Raw域LDR帧添加模拟的高斯和泊松噪声。S1025. Add simulated Gaussian and Poisson noise to the Raw domain LDR frame. 3.根据权利要求1所述的一种基于Raw域的双支路HDR视频重建算法,其特征在于,所述S2中提到的设计双支路Raw视频HDR重建算法,具体包括以下步骤:3. According to the Raw domain-based dual-branch HDR video reconstruction algorithm of claim 1, it is characterized in that the design of the dual-branch Raw video HDR reconstruction algorithm mentioned in S2 specifically comprises the following steps: S201、噪声估计:每次输入三帧连续的raw域LDR帧
(frame symbols: images FDA0003844670250000021, FDA0003844670250000022) and estimate their noise level maps with a noise estimation network:
(formula: image FDA0003844670250000023)
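The noise that this estimation step must handle is synthesized in step S1025. A minimal sketch of Poisson–Gaussian noise synthesis on a linear Raw frame follows; the Poisson–Gaussian model matches the claim, but the `gain` and `read_std` values are hypothetical sensor parameters, not taken from the patent:

```python
import math
import random

def sample_poisson(lam):
    """Knuth's Poisson sampler (adequate for the small rates used here)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= random.random()
        if p <= threshold:
            return k
        k += 1

def add_poisson_gaussian_noise(raw, gain=0.01, read_std=0.002):
    """Add simulated shot (Poisson) and read (Gaussian) noise to a linear
    Raw frame given as a nested list of values in [0, 1].
    gain and read_std are illustrative, not values from the patent."""
    noisy = []
    for row in raw:
        out = []
        for v in row:
            photons = max(v, 0.0) / gain           # expected photon count
            shot = sample_poisson(photons) * gain  # signal-dependent shot noise
            read = random.gauss(0.0, read_std)     # signal-independent read noise
            out.append(min(max(shot + read, 0.0), 1.0))
        noisy.append(out)
    return noisy
```

Because the Poisson variance scales with the signal, dark regions carry a lower absolute but higher relative noise level, which is what a per-pixel noise level map is meant to capture.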
S202. Data preprocessing and feature extraction: take the three consecutive Raw-domain LDR frames (images FDA0003844670250000024, FDA0003844670250000025) and their corresponding exposure coefficients t_{i-1}, t_i, t_{i+1} as input, and perform exposure correction on the input LDR images with the correction formula:
(formula: image FDA0003844670250000026)
This formula maps the inputs (image FDA0003844670250000027) to the same exposure level;
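Since Raw data is linear, a common realization of such a correction simply normalizes each frame by its exposure coefficient. The claim's exact formula is given only as an image, so the linear form below is an assumption:

```python
def exposure_correct(frame, t):
    """Map a linear Raw LDR frame with exposure coefficient t to a
    common exposure level (assumed linear form: divide by t)."""
    return [[v / t for v in row] for row in frame]

# The same scene captured with a 4x longer exposure maps to
# (approximately) the same corrected values:
short = exposure_correct([[0.1, 0.2]], 1.0)
long_ = exposure_correct([[0.4, 0.8]], 4.0)
```

Saturated pixels in the long exposure and noisy pixels in the short exposure still disagree after correction; those regions are precisely where the attention fusion of S204 and the content-enhancement branch of S205 have to intervene.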
The features are then extracted by convolution in a feature extraction module:
(formula: image FDA0003844670250000031)
where F_i denotes the features extracted from the i-th frame; the input LDR image helps detect over- and under-exposed regions, the input HDR image aids the subsequent alignment, and the noise level map helps identify noisy regions.
S203. Feature alignment: a cascaded pyramidal deformable-convolution structure first downsamples the input features twice to obtain features at multiple scales:
(formula: image FDA0003844670250000032)
where (image FDA0003844670250000033) denotes downsampling;
At the s-th scale, the feature (image FDA0003844670250000034) is concatenated with the center-frame feature (image FDA0003844670250000035) to estimate the offsets (image FDA0003844670250000036):
(formula: image FDA0003844670250000037)
The computed offsets (image FDA0003844670250000038) serve as the offsets of a deformable convolution, which is applied to the neighboring-frame features to obtain the alignment result at the current scale:
(formula: image FDA0003844670250000039)
(formula: image FDA00038446702500000310)
where ↑2 denotes 2× bilinear-interpolation upsampling; the alignment result at each scale is further fused by convolution with the result from the coarser scale. This coarse-to-fine joint prediction in the feature domain estimates large displacements more accurately.
S204. Temporal fusion: a spatial-attention structure computes attention correlations between adjacent frames by convolution, helping the network reconstruct ghost-free, correctly exposed HDR images:
(formulas: images FDA00038446702500000311, FDA00038446702500000312)
where A_i denotes the predicted spatial attention, ⊙ denotes element-wise multiplication, and (image FDA00038446702500000313) denotes the temporally fused features;
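A minimal sketch of such attention-weighted fusion follows; the attention here is a hand-written similarity gate standing in for the learned convolution of S204, so the gate shape and constants are assumptions:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def attention_fuse(center, neighbor):
    """Fuse a neighboring frame's feature vector into the center frame's.
    Positions where the two features agree get attention near 1; positions
    where they disagree (e.g. ghosting from motion) are suppressed.
    The 5/10 gate constants are illustrative, not learned weights."""
    fused = []
    for c, n in zip(center, neighbor):
        a = sigmoid(5.0 - 10.0 * abs(c - n))  # attention weight in (0, 1)
        fused.append(c + a * n)               # element-wise gated sum
    return fused
```

In the patented network these weights are predicted by convolutions from the concatenated features, but the effect is the same: misaligned or saturated neighbor content contributes little to the fused feature.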
S205. Content enhancement branch: the aligned features are passed through a residual estimation branch (REB) that extracts the high-frequency information of the input features, helping to restore content missing from the other branch:
(formula: image FDA0003844670250000041)
where H_res denotes the estimated residual information and (image FDA0003844670250000042) denotes the aligned features;
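The REB itself is learned; as a rough analogue only, high-frequency residual extraction can be illustrated with a hand-written high-pass filter (signal minus its local mean). This is an assumption for illustration, not the patented branch:

```python
def highpass_residual(signal, radius=1):
    """Return signal minus its local box-filtered mean: a crude stand-in
    for the high-frequency information the residual branch extracts."""
    n = len(signal)
    res = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        mean = sum(signal[lo:hi]) / (hi - lo)   # local average
        res.append(signal[i] - mean)            # keep only the detail
    return res
```

A flat region yields a zero residual, while edges and texture survive, which is the kind of detail the main branch tends to lose in noisy or saturated areas.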
S206. HDR reconstruction: the fused features (image FDA0003844670250000043) pass through a series of residual blocks with skip connections, are added to H_res, and then go through a sigmoid layer to produce the final Raw-domain HDR result (image FDA0003844670250000044).
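The final step (add the residual branch's output, then squash with a sigmoid) can be sketched per element; the feature and residual values are placeholders:

```python
import math

def reconstruct_hdr(features, residual):
    """Add the residual branch's output to the main branch's features
    and squash with a sigmoid into the (0, 1) Raw HDR range."""
    return [1.0 / (1.0 + math.exp(-(f + r)))
            for f, r in zip(features, residual)]
```

The sigmoid guarantees the network output stays in a bounded linear range regardless of how large the pre-activation features grow.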
S207. Loss function: the network is trained on the difference between the tone-mapped output (image FDA0003844670250000045) and the tone-mapped ground truth (image FDA0003844670250000046):
(formulas: images FDA0003844670250000047, FDA0003844670250000048)
where μ is 5000, and (images FDA0003844670250000049, FDA00038446702500000410) denote the tone-mapped Raw-domain ground-truth image and prediction, respectively.
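The μ-law tone mapping commonly used for such losses, consistent with the μ = 5000 stated above, can be sketched as follows together with an L1 distance on the tone-mapped pair. The exact loss formula appears only as an image, so the choice of L1 is an assumption:

```python
import math

MU = 5000.0  # value stated in the claim

def tonemap(h):
    """mu-law tone mapping T(H) = log(1 + mu*H) / log(1 + mu), H in [0, 1]."""
    return math.log(1.0 + MU * h) / math.log(1.0 + MU)

def tonemapped_l1(pred, gt):
    """Mean L1 distance between tone-mapped prediction and ground truth
    (the L1 choice is an assumption, not taken from the patent)."""
    return sum(abs(tonemap(p) - tonemap(g))
               for p, g in zip(pred, gt)) / len(pred)
```

The logarithmic compression makes errors in dark regions count nearly as much as errors in highlights, which is why the loss is computed after tone mapping rather than on the linear Raw values.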
CN202211113812.8A 2022-09-14 2022-09-14 Double-branch HDR video reconstruction algorithm based on Raw domain Pending CN115841523A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211113812.8A CN115841523A (en) 2022-09-14 2022-09-14 Double-branch HDR video reconstruction algorithm based on Raw domain

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211113812.8A CN115841523A (en) 2022-09-14 2022-09-14 Double-branch HDR video reconstruction algorithm based on Raw domain

Publications (1)

Publication Number Publication Date
CN115841523A true CN115841523A (en) 2023-03-24

Family

ID=85575406

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211113812.8A Pending CN115841523A (en) 2022-09-14 2022-09-14 Double-branch HDR video reconstruction algorithm based on Raw domain

Country Status (1)

Country Link
CN (1) CN115841523A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116596779A (en) * 2023-04-24 2023-08-15 天津大学 Raw video denoising method based on Transformer
CN116596779B (en) * 2023-04-24 2023-12-01 天津大学 Raw video denoising method based on Transformer

Similar Documents

Publication Publication Date Title
CN111539879A (en) Video blind denoising method and device based on deep learning
CN111986084B (en) Multi-camera low-illumination image quality enhancement method based on multi-task fusion
Niu et al. Low cost edge sensing for high quality demosaicking
CN111784570A (en) Video image super-resolution reconstruction method and device
CN110225260B (en) A Stereo High Dynamic Range Imaging Method Based on Generative Adversarial Networks
US20250037244A1 (en) High Dynamic Range View Synthesis from Noisy Raw Images
CN111105376B (en) Single-exposure high-dynamic-range image generation method based on double-branch neural network
CN115393227B (en) Self-adaptive enhancement method and system for low-light full-color video images based on deep learning
CN113096029A (en) High dynamic range image generation method based on multi-branch codec neural network
CN113284061B (en) Underwater image enhancement method based on gradient network
CN112508812A (en) Image color cast correction method, model training method, device and equipment
WO2023005818A1 (en) Noise image generation method and apparatus, electronic device, and storage medium
CN114862698A (en) Method and device for correcting real overexposure image based on channel guidance
CN116757959A (en) A HDR image reconstruction method based on Raw domain
CN118552442A (en) Image defogging method based on mixed large-scale convolution and attention fusion
CN116563183A (en) High dynamic range image reconstruction method and system based on single RAW image
CN115841523A (en) Double-branch HDR video reconstruction algorithm based on Raw domain
Lu et al. Event camera demosaicing via swin transformer and pixel-focus loss
CN114463189A (en) Image information analysis modeling method based on dense residual UNet
CN111161189A (en) A single image re-enhancement method based on detail compensation network
CN116245968A (en) Method for generating HDR image based on LDR image of transducer
Zhang et al. Joint Luminance Adjustment and Color Correction for Low-Light Image Enhancement Network.
CN113935928B (en) Rock core image super-resolution reconstruction based on Raw format
CN115937045A (en) An Iterative Color Scale Reconstruction Method
Liu et al. Learning to Generate Realistic Images for Bit-Depth Enhancement via Camera Imaging Processing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination