
CN115841523A - Double-branch HDR video reconstruction algorithm based on Raw domain - Google Patents


Info

Publication number: CN115841523A
Application number: CN202211113812.8A
Authority: CN (China)
Prior art keywords: raw, hdr, video, domain, branch
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other languages: Chinese (zh)
Inventors: 岳焕景, 彭昱博, 杨敬钰
Original/Current Assignee: Tianjin University
Application filed by Tianjin University


Landscapes

  • Image Processing (AREA)

Abstract

The invention discloses a dual-branch HDR video reconstruction algorithm based on the Raw domain, belonging to the technical field of video signal processing. The algorithm comprises the following steps: S1, establishing a synthetic Raw-domain video HDR dataset; S2, designing a dual-branch Raw video HDR reconstruction algorithm based on S1; S3, training the model; S4, inputting the low-dynamic-range (LDR) Raw video sequences of the test set into the model to obtain the corresponding high-dynamic-range output. The invention synthesizes the first video HDR dataset simulating a real noise distribution in the Raw domain, providing a benchmark dataset for training and evaluating HDR reconstruction methods at night or in extreme scenes. Meanwhile, the proposed content enhancement module achieves dynamic-range expansion of noisy LDR video in difficult scenes.

Description

A Dual-Branch HDR Video Reconstruction Algorithm Based on the Raw Domain

Technical Field

The present invention relates to the technical field of video signal processing, and in particular to a dual-branch HDR video reconstruction algorithm based on the Raw domain.

Background Art

High dynamic range (HDR) technology uses multiple low-dynamic-range (LDR) images with different exposures to expand the dynamic range of an image, enriching image details and improving contrast. Scene irradiance in natural scenes can range from roughly 10^-5 to 10^9, while the photos recorded by ordinary cameras may have a bit depth of only 8 or 10 bits, which cannot fully capture the brightness range of the scene. When the dynamic range captured by the camera is smaller than that of the scene, the captured image may contain overexposed or underexposed regions, degrading visual quality.
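The magnitude of this gap can be made concrete with a quick back-of-the-envelope calculation (an illustrative sketch, not part of the patent; "stops" here simply means doublings of intensity):

```python
import math

# Natural scene irradiance spans roughly 1e-5 to 1e9, i.e. 14 orders
# of magnitude, while an n-bit capture distinguishes only 2**n levels.
scene_stops = math.log2(1e9 / 1e-5)    # ~46.5 photographic stops
stops_8bit = math.log2(2 ** 8)         # 8 stops of linear levels
stops_10bit = math.log2(2 ** 10)       # 10 stops

print(f"scene range: {scene_stops:.1f} stops")
print(f"8-bit: {stops_8bit:.0f} stops, 10-bit: {stops_10bit:.0f} stops")
```

The natural scene covers over four times as many doublings as an 8-bit capture can linearly represent, which is why single-exposure capture must clip either highlights or shadows.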

With the development of HDR technology, reconstruction has gradually shifted from traditional statistics-based fusion to deep-learning-based methods. These deep learning methods usually involve two steps: first, aligning LDR images taken at different times with different exposures, and second, fusing the aligned LDR images into an HDR image. Compared with image HDR reconstruction, video HDR reconstruction must reconstruct an HDR result for every frame of the original LDR sequence. Existing methods typically target alternately exposed video sequences (e.g. -2EV, +2EV, -2EV, ...), using a sliding-window scheme that inputs three or five adjacent LDR frames of different exposures and reconstructs the HDR result of the middle frame after alignment and fusion.
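The sliding-window frame selection over an alternately exposed sequence can be sketched as follows; the `sliding_windows` helper and the EV values are illustrative, not from the patent:

```python
def sliding_windows(num_frames, window=3):
    """Yield (neighbor indices, center index) for each reconstructable
    frame: a window of adjacent LDR frames is used to reconstruct the
    HDR result of its middle frame."""
    half = window // 2
    for center in range(half, num_frames - half):
        yield list(range(center - half, center + half + 1)), center

# Alternating exposure pattern (-2EV, +2EV, ...) for an 8-frame sequence.
exposures = [-2 if i % 2 == 0 else +2 for i in range(8)]
for neighbors, center in sliding_windows(len(exposures), window=3):
    evs = [exposures[i] for i in neighbors]
    print(f"frames {neighbors} (EV {evs}) -> HDR frame {center}")
```

Note that each window mixes both exposure levels, so every reconstructed frame sees both short- and long-exposure information from its neighbors.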

Previous video HDR reconstruction methods usually operate directly on sRGB images. However, sRGB images have passed through the camera's complex internal image signal processing (ISP) pipeline, including black level correction, demosaicing, white balance, gamma correction, and color gamut conversion. As a result, sRGB images lose part of the original information, and some nonlinear mapping operations make HDR reconstruction harder. Using the Raw-domain data output by the camera sensor solves these problems well: Raw data has a wider bit depth and richer scene information, and, being unaffected by subsequent ISP processing, it has better linearity and is therefore better suited to HDR reconstruction.

On the other hand, HDR reconstruction of very dark scenes must account for severe noise. Raw-domain data can model noise more accurately, allowing a model to learn realistic denoising, and is widely used in image and video denoising tasks. Meanwhile, existing methods such as Kalantari et al., "Deep HDR video from sequences with alternating exposures," Computer Graphics Forum 38, 193–205 (2019), lack designs targeting noise in very dark regions, so such networks reconstruct noisy images poorly. Chen et al., "HDR video reconstruction: A coarse-to-fine network and a real-world benchmark dataset," Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 2502–2511 (2021), use two-stage alignment and fusion, which over-smooths details when a short-exposure frame is the reference and retains noise when a long-exposure frame is the reference.

In summary, deep-learning-based HDR image and video reconstruction methods are often limited by noise. Existing methods mainly train on images in the sRGB domain and cannot model real noise; moreover, their network structures are designed mainly for alignment and fusion and lack modules for handling noise, resulting in low HDR reconstruction quality in difficult scenes. To solve these problems, the present invention proposes a dual-branch HDR video reconstruction algorithm based on the Raw domain.

Summary of the Invention

The purpose of the present invention is to establish a Raw video HDR dataset and, on this basis, to propose a dual-branch HDR video reconstruction algorithm based on the Raw domain that solves the problems mentioned in the background art.

To achieve the above object, the present invention adopts the following technical solution:

A dual-branch HDR video reconstruction algorithm based on the Raw domain, specifically comprising the following steps:

S1. Establish a synthetic Raw-domain video HDR dataset, specifically including the following:

S101. Source sRGB video data selection: 21 sRGB-domain HDR videos shot by Froehlich, Kronander, et al. and the high-quality video dataset Vimeo-90K are selected as source sRGB video data. Each HDR video is re-exposed with selected exposure parameters to simulate an alternately exposed LDR video sequence;

S102. Raw dataset establishment: the sRGB videos are converted into a Raw video dataset by simulating the camera imaging pipeline. The specific steps are: inverting the camera response function (CRF) curve, simulating the bayer format, data augmentation, re-exposure, and adding noise, yielding Raw-domain HDR and LDR images that serve as Raw-domain training data pairs;

S2. Design the reconstruction algorithm: based on the data pairs obtained in S1, use LDR Raw frames I_i^raw and HDR Raw frames H_i^raw as training pairs to design the dual-branch Raw video HDR reconstruction algorithm;

S3. Train the model: build the model based on the reconstruction algorithm designed in S2 and train it with the PyTorch deep learning framework; iterate for 15 epochs over the entire dataset, then reduce the learning rate to 0.00001 and continue iterating until the loss converges, obtaining the final model;

S4. Output results: input the low-dynamic-range Raw video sequences of the test set into the final model obtained in S3 to obtain the corresponding high-dynamic-range outputs.

Preferably, converting the sRGB videos into Raw videos by simulating the camera imaging pipeline in S102 specifically comprises the following steps:

S1021. For the Vimeo-90K dataset, convert the video frames from the nonlinear domain to the linear domain by estimating the CRF curve;

S1022. Downsample the 3-channel sRGB frame into four quarter-resolution channels (red, green, green, blue) and combine them into a mosaic image according to the GRBG bayer pattern;

S1023. Convert the mosaic image into a 4-channel G, R, B, G image, then apply random scaling, translation, and rotation;

S1024. Convert the Raw-domain HDR frames into Raw-domain LDR frames according to specific exposure parameters;

S1025. Add simulated Gaussian and Poisson noise to the Raw-domain LDR frames.
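The bayer simulation of S1022–S1023 can be sketched as follows; `srgb_to_grbg_channels` is a hypothetical helper illustrating the GRBG packing, assuming the frame has already been linearized per S1021:

```python
import numpy as np

def srgb_to_grbg_channels(rgb):
    """Simulate a GRBG Bayer mosaic from a 3-channel linear frame (S1022).

    Each 2x2 Bayer cell samples G R / B G, so every channel ends up at
    quarter resolution (half width, half height). Channel order here is
    G, R, B, G, matching the 4-channel packing described in S1023.
    """
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return np.stack([
        g[0::2, 0::2],   # G at top-left of each 2x2 cell
        r[0::2, 1::2],   # R at top-right
        b[1::2, 0::2],   # B at bottom-left
        g[1::2, 1::2],   # G at bottom-right
    ], axis=0)

frame = np.random.rand(8, 8, 3).astype(np.float32)
packed = srgb_to_grbg_channels(frame)
print(packed.shape)   # (4, 4, 4): four quarter-resolution channels
```

The random scaling, translation, and rotation of S1023 would then be applied to this 4-channel packing, which keeps each augmented pixel aligned with its bayer phase.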

Preferably, designing the dual-branch Raw video HDR reconstruction algorithm mentioned in S2 specifically comprises the following steps:

S201. Noise estimation: three consecutive raw-domain LDR frames I_{i-1}^raw, I_i^raw, I_{i+1}^raw are input each time, and a noise estimation network estimates a noise level map:

Figure SMS_5

S202. Data processing and feature extraction: three consecutive raw-domain LDR frames I_{i-1}^raw, I_i^raw, I_{i+1}^raw and their corresponding exposure coefficients t_{i-1}, t_i, t_{i+1} are input, and the exposure coefficients are used to apply exposure correction to the input LDR images:

Figure SMS_8

The above formula maps the inputs to the same exposure level;
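The correction formula itself appears only as a figure in this record; for linear Raw data, such a correction commonly reduces to dividing each frame by its exposure coefficient t_i, which the sketch below assumes:

```python
import numpy as np

def exposure_correct(ldr_frames, exposures):
    """Map alternately exposed linear Raw LDR frames to one exposure
    level by dividing each frame by its exposure coefficient t_i (an
    assumption: the patent's exact formula is given only as a figure,
    but division by t_i is the usual correction for linear data)."""
    return [frame / t for frame, t in zip(ldr_frames, exposures)]

# A constant-radiance scene captured short (t=0.25) and long (t=1.0)
# differs by 4x in recorded value; after correction the frames agree.
scene = np.full((2, 2), 0.1)
frames = [scene * 0.25, scene * 1.0]
short, long_ = exposure_correct(frames, [0.25, 1.0])
print(np.allclose(short, long_))   # True
```

This only works because Raw data is linear in irradiance; on gamma-encoded sRGB frames the same division would not equalize the exposures, which is one motivation for working in the Raw domain.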

The frames then pass through a feature extraction module, which extracts features by convolution:

Figure SMS_10

where F_i denotes the features extracted from the i-th frame; the input LDR image helps detect overexposed and underexposed regions, the input HDR image helps the subsequent alignment, and the noise level map helps detect noisy regions;

S203. Feature alignment: a cascaded pyramid deformable convolution structure first downsamples the input features twice to obtain features at multiple scales:

Figure SMS_11

where Figure SMS_12 denotes the downsampling operator;

At the s-th scale, the features (Figure SMS_13) are concatenated with the intermediate-frame features (Figure SMS_14) to estimate the offsets (Figure SMS_15):

Figure SMS_16

The computed offsets (Figure SMS_17) serve as the offsets of a deformable convolution, which processes the previous frame's features to obtain the alignment result at the current scale:

Figure SMS_18

Figure SMS_19

where ↑2 denotes 2x bilinear upsampling; the alignment result at each scale is further fused by convolution with the alignment result from the previous scale. This coarse-to-fine joint prediction in the feature domain estimates large displacements more accurately;

S204. Temporal fusion: a spatial attention structure obtains attention correlations between adjacent frames through convolution, helping the network reconstruct ghost-free, correctly exposed HDR images:

Figure SMS_20

Figure SMS_21

where A_i denotes the predicted spatial attention, ⊙ denotes element-wise multiplication, and Figure SMS_22 denotes the temporally fused features;
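A minimal sketch of the attention-weighted fusion in S204, with the learned attention convolution replaced by a hand-written similarity score (an illustrative assumption, not the patent's network):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def attention_fuse(ref_feat, neighbor_feats):
    """Spatial-attention fusion in the spirit of S204. The learned
    attention convolution is replaced by a simple similarity score:
    A_i = sigmoid(score(F_i, F_ref)), and the fused feature is the
    attention-weighted sum of neighbor features."""
    fused = np.zeros_like(ref_feat)
    for feat in neighbor_feats:
        score = -np.abs(feat - ref_feat)   # stand-in for a learned conv
        attn = sigmoid(score)              # A_i, one weight per element
        fused += attn * feat               # element-wise (⊙) weighting
    return fused

ref = np.random.rand(4, 4)
neighbors = [ref + 0.01 * np.random.randn(4, 4) for _ in range(3)]
out = attention_fuse(ref, neighbors)
print(out.shape)   # (4, 4)
```

The per-element weights are what suppress ghosting: regions of a neighbor frame that disagree with the reference receive low attention and contribute little to the fused feature.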

S205. Content enhancement branch: the aligned features (Figure SMS_23) pass through a residual estimation branch (REB), which extracts the high-frequency information of the input features and helps restore content missing from the other branch:

Figure SMS_24

where H_res denotes the estimated residual information and REB is the residual estimation branch, which contains multiple dense residual blocks;

S206. HDR reconstruction: the fused features (Figure SMS_25) pass through a series of residual blocks with skip connections, are added to H_res, and then pass through a sigmoid layer to obtain the final raw-domain HDR result (Figure SMS_26);
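The dataflow of S205–S206 can be sketched at the array level as follows; the residual-block body is a placeholder standing in for the learned layers, not the patent's network:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def residual_block(x):
    # Placeholder for a learned residual block: identity plus a small residual.
    return x + 0.1 * np.tanh(x)

def reconstruct_hdr(fused_feat, h_res, num_blocks=3):
    """Shape-level dataflow of S205-S206: fused features pass through
    residual blocks with a skip connection, the content-enhancement
    residual H_res is added, and a sigmoid bounds the final raw-domain
    HDR output. The block bodies stand in for the learned layers."""
    x = fused_feat
    for _ in range(num_blocks):
        x = residual_block(x)
    x = x + fused_feat   # skip connection over the residual blocks
    x = x + h_res        # add the content-enhancement branch output
    return sigmoid(x)    # sigmoid layer bounds the output to (0, 1)

feat = np.random.randn(4, 4)
h_res = 0.05 * np.random.randn(4, 4)
hdr = reconstruct_hdr(feat, h_res)
print(hdr.min() > 0 and hdr.max() < 1)   # True: sigmoid-bounded
```

The additive combination is the key design point: the main branch can denoise aggressively while the content-enhancement residual H_res reinjects high-frequency detail before the final sigmoid.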

S207. Loss function: the loss is the difference between the tone-mapped output (Figure SMS_27) and the tone-mapped ground truth (Figure SMS_28):

Figure SMS_29

Figure SMS_30

where μ is 5000, and T_i^raw and Figure SMS_31 denote the tone-mapped raw-domain ground-truth image and prediction, respectively.
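The tone-mapping operator is given only as a figure, but with μ = 5000 it matches the widely used μ-law compressor T(H) = log(1 + μH)/log(1 + μ); the sketch below assumes that form and an L1 norm for the loss:

```python
import numpy as np

MU = 5000.0   # μ as stated in S207

def mu_law_tonemap(h):
    # μ-law compressor T(H) = log(1 + μH) / log(1 + μ), assumed here
    # because the patent gives the operator only as a figure.
    return np.log1p(MU * h) / np.log1p(MU)

def hdr_loss(pred, target):
    # L1 difference between tone-mapped prediction and ground truth
    # (the choice of L1 is an assumption).
    return np.abs(mu_law_tonemap(pred) - mu_law_tonemap(target)).mean()

gt = np.linspace(0.0, 1.0, 5)
print(mu_law_tonemap(np.array([0.0, 1.0])))   # endpoints map to [0. 1.]
print(hdr_loss(gt, gt))                        # 0.0 for a perfect prediction
```

Computing the loss after tone mapping concentrates the penalty in dark regions, where linear HDR values are tiny but perceptual errors (and noise) are most visible.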

Compared with the prior art, the present invention provides a dual-branch HDR video reconstruction algorithm based on the Raw domain with the following beneficial effects:

(1) The invention synthesizes, in the Raw domain, the first video HDR dataset simulating a real noise distribution, providing a benchmark dataset for training and evaluating HDR reconstruction methods at night or in extreme scenes;

(2) Based on the proposed Raw video HDR dataset, the invention proposes a dual-branch HDR video reconstruction method that uses the proposed deformable convolution alignment module and content enhancement module to achieve dynamic-range expansion of noisy LDR video in difficult scenes;

(3) Comparative experiments between the proposed reconstruction algorithm and mainstream reconstruction methods show that it outperforms current mainstream sRGB-based HDR reconstruction methods and is better than or comparable to directly transferring those methods to the Raw domain. It is hoped that this research will inspire further work on Raw-domain video HDR reconstruction.

Brief Description of the Drawings

FIG. 1 is a flow chart of the dual-branch HDR video reconstruction algorithm based on the Raw domain proposed by the present invention;

FIG. 2 is a visual comparison, on the test set, of the proposed dual-branch Raw-domain HDR video reconstruction algorithm against other video/image HDR reconstruction algorithms.

Detailed Description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention.

Embodiment 1:

A dual-branch HDR video reconstruction algorithm based on the Raw domain, specifically comprising the following steps:

S1. Establish a synthetic Raw-domain video HDR dataset, specifically including the following:

S101. Source sRGB video data selection: 21 sRGB-domain HDR videos shot by Froehlich, Kronander, et al. and the high-quality video dataset Vimeo-90K are selected as source sRGB video data. Each HDR video is re-exposed with selected exposure parameters to simulate an alternately exposed LDR video sequence;

S102. Raw dataset establishment: the sRGB videos are converted into a Raw video dataset by simulating the camera imaging pipeline. The specific steps are: inverting the camera response function (CRF) curve, simulating the bayer format, data augmentation, re-exposure, and adding noise, yielding Raw-domain HDR and LDR images that serve as Raw-domain training data pairs;

The conversion of sRGB video into Raw video by simulating the camera imaging pipeline in S102 specifically comprises the following steps:

S1021. For the Vimeo-90K dataset, convert the video frames from the nonlinear domain to the linear domain by estimating the CRF curve;

S1022. Downsample the 3-channel sRGB frame into four quarter-resolution channels (red, green, green, blue) and combine them into a mosaic image according to the GRBG bayer pattern;

S1023. Convert the mosaic image into a 4-channel G, R, B, G image, then apply random scaling, translation, and rotation;

S1024. Convert the Raw-domain HDR frames into Raw-domain LDR frames according to specific exposure parameters;

S1025. Add simulated Gaussian and Poisson noise to the Raw-domain LDR frames.

Raw-domain data has a different format from sRGB, so the original three RGB channels are first resampled into four RGGB channels and rearranged according to the bayer pattern to simulate the Raw format; then Gaussian and Poisson noise is added to the images to simulate real-world noise distributions;
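The Poisson–Gaussian noise synthesis of S1025 can be sketched as follows; the `photons` and `read_sigma` parameters are illustrative placeholders, not the patent's calibrated noise levels:

```python
import numpy as np

def add_raw_noise(clean, photons=1000.0, read_sigma=0.01, seed=None):
    """Simulate Raw capture noise: signal-dependent Poisson (shot) noise
    plus signal-independent Gaussian (read) noise. `photons` and
    `read_sigma` are illustrative levels, not calibrated parameters."""
    rng = np.random.default_rng(seed)
    shot = rng.poisson(clean * photons) / photons     # Poisson component
    read = rng.normal(0.0, read_sigma, clean.shape)   # Gaussian component
    return np.clip(shot + read, 0.0, 1.0)

clean = np.full((256, 256), 0.2)
noisy = add_raw_noise(clean, seed=0)
print(round(float(noisy.mean()), 2))   # 0.2: the noise is zero-mean
```

Because the Poisson variance scales with the signal, this model reproduces the signal-dependent noise of a real sensor, which an additive-Gaussian-only model (or noise added after the nonlinear sRGB pipeline) cannot.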

S2. Design the reconstruction algorithm: based on the data pairs obtained in S1, use LDR Raw frames I_i^raw and HDR Raw frames H_i^raw as training pairs to design the dual-branch Raw video HDR reconstruction algorithm;

S3. Train the model: build the model based on the reconstruction algorithm designed in S2. The model takes 3 input frames, the input video is cropped into 256×256 patches, and each batch contains 16 samples. The Adam optimizer is used with an initial learning rate of 0.0001. The model is trained with the PyTorch deep learning framework, iterating for 15 epochs over the entire dataset; the learning rate is then reduced to 0.00001 and iteration continues until the loss curve converges, yielding the final model;
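The two-stage learning-rate schedule just described can be written down directly (a small sketch of the stated 15-epoch, 1e-4 then 1e-5 schedule):

```python
def learning_rate(epoch, warm_epochs=15, base_lr=1e-4, final_lr=1e-5):
    """Two-stage schedule from the training setup: the base learning
    rate of 1e-4 for the first 15 epochs, then 1e-5 until the loss
    converges."""
    return base_lr if epoch < warm_epochs else final_lr

schedule = [learning_rate(e) for e in range(20)]
print(schedule[0], schedule[14], schedule[15])   # 0.0001 0.0001 1e-05
```

In a PyTorch training loop this would typically be applied by updating the optimizer's parameter-group learning rate at the epoch-15 boundary.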

S4. Output results: input the low-dynamic-range Raw video sequences of the test set into the final model obtained in S3 to obtain the corresponding high-dynamic-range outputs.

The first video HDR dataset simulating a real noise distribution is thus synthesized in the Raw domain, providing a benchmark dataset for training and evaluating HDR reconstruction methods at night or in extreme scenes.

Embodiment 2:

Referring to FIG. 1, this embodiment is based on Embodiment 1 but differs in that:

Designing the dual-branch Raw video HDR reconstruction algorithm mentioned in S2 specifically comprises the following steps:

S201. Noise estimation: three consecutive raw-domain LDR frames I_{i-1}^raw, I_i^raw, I_{i+1}^raw are input each time, and a noise estimation network estimates a noise level map:

Figure SMS_36

S202. Data processing and feature extraction: three consecutive raw-domain LDR frames I_{i-1}^raw, I_i^raw, I_{i+1}^raw and their corresponding exposure coefficients t_{i-1}, t_i, t_{i+1} are input, and the exposure coefficients are used to apply exposure correction to the input LDR images:

Figure SMS_39

The above formula maps the inputs to the same exposure level;

The frames then pass through a feature extraction module, which extracts features by convolution:

Figure SMS_41

where F_i denotes the features extracted from the i-th frame; the input LDR image helps detect overexposed and underexposed regions, the input HDR image helps the subsequent alignment, and the noise level map helps detect noisy regions;

S203. Feature alignment: a cascaded pyramid deformable convolution structure first downsamples the input features twice to obtain features at multiple scales:

Figure SMS_42

where Figure SMS_43 denotes the downsampling operator;

At the s-th scale, the features (Figure SMS_44) are concatenated with the intermediate-frame features (Figure SMS_45) to estimate the offsets (Figure SMS_46):

Figure SMS_47

The computed offsets (Figure SMS_48) serve as the offsets of a deformable convolution, which processes the previous frame's features to obtain the alignment result at the current scale:

Figure SMS_49

Figure SMS_50

where ↑2 denotes 2x bilinear upsampling; the alignment result at each scale is further fused by convolution with the alignment result from the previous scale. This coarse-to-fine joint prediction in the feature domain estimates large displacements more accurately;

S204. Temporal fusion: a spatial attention structure obtains attention correlations between adjacent frames through convolution, helping the network reconstruct ghost-free, correctly exposed HDR images:

Figure SMS_51

Figure SMS_52

where A_i denotes the predicted spatial attention, ⊙ denotes element-wise multiplication, and Figure SMS_53 denotes the temporally fused features.

S205. Content enhancement branch: the aligned features (Figure SMS_54) pass through a residual estimation branch (REB), which extracts the high-frequency information of the input features and helps restore content missing from the other branch:

Figure SMS_55

where H_res denotes the estimated residual information.

S206. HDR reconstruction: the fused features (Figure SMS_56) pass through a series of residual blocks with skip connections, are added to H_res, and then pass through a sigmoid layer to obtain the final raw-domain HDR result (Figure SMS_57);

S207. Loss function: the loss is the difference between the tone-mapped output (Figure SMS_58) and the tone-mapped ground truth (Figure SMS_59):

Figure SMS_60

Figure SMS_61

where μ is 5000, and T_i^raw and Figure SMS_62 denote the tone-mapped raw-domain ground-truth image and prediction, respectively.

Based on the Raw video HDR dataset, the present invention proposes a dual-branch HDR reconstruction method that uses the proposed deformable convolution alignment module and content enhancement module to achieve dynamic-range expansion of noisy LDR video in difficult scenes.

Embodiment 3:

Referring to FIG. 2, this embodiment is based on Embodiments 1-2 but differs in that:

The proposed dual-branch Raw-domain HDR video reconstruction algorithm is compared in the Raw domain with mainstream methods on the market; the results on the test set are shown in FIG. 2 and Table 1.

Figure SMS_63

Table 1. Metric comparison table

As can be seen from FIG. 2 and Table 1, the proposed dual-branch Raw-domain HDR video reconstruction algorithm better reduces the influence of noise while retaining original image detail, through the joint learning and complementary information of the two branches. Combining the actual image results with the data in the table, the proposed reconstruction algorithm clearly achieves better visual quality and better metrics.

The above is only a preferred embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any equivalent replacement or change made by a person skilled in the art within the technical scope disclosed by the present invention, according to its technical solution and inventive concept, shall be covered by the protection scope of the present invention.

Claims (3)

1.一种基于Raw域的双支路HDR视频重建算法,其特征在于,具体包括以下步骤:1. A dual-branch HDR video reconstruction algorithm based on Raw domain, characterized by comprising the following steps: S1、建立合成的Raw域视频HDR数据集:建立Raw数据集,具体包括以下内容:S1. Establish a synthetic Raw domain video HDR dataset: Establish a Raw dataset, specifically including the following contents: S101、源sRGB视频数据选择:选择若干个现有的sRGB域HDR视频和高质量视频数据集作为源sRGB视频数据,每个HDR视频通过选定的曝光参数重新曝光来模拟交替曝光LDR视频序列;S101, source sRGB video data selection: select a number of existing sRGB domain HDR videos and high-quality video datasets as source sRGB video data, and re-expose each HDR video by using selected exposure parameters to simulate an alternately exposed LDR video sequence; S102、Raw数据集建立:通过模拟相机成像管道流程,将sRGB视频转换为Raw视频数据集,具体步骤为:获取逆相机响应曲线→模拟bayer格式→数据增强→重新曝光→添加噪声得到Raw域的HDR和LDR图像,将所得的Raw域的HDR和LDR图像作为Raw域的训练数据对;S102, Raw data set establishment: by simulating the camera imaging pipeline process, the sRGB video is converted into a Raw video data set. The specific steps are: obtain the inverse camera response curve → simulate the bayer format → data enhancement → re-exposure → add noise to obtain the HDR and LDR images in the Raw domain, and use the obtained HDR and LDR images in the Raw domain as the training data pairs in the Raw domain; S2、设计重建算法:基于S1中所得的数据对,采用LDR的Raw帧Ii raw和HDR的Raw帧Hi raw作为训练对来设计双支路Raw视频HDR重建算法;S2. Design a reconstruction algorithm: Based on the data pair obtained in S1, use the LDR Raw frame I i raw and the HDR Raw frame H i raw as training pairs to design a dual-branch Raw video HDR reconstruction algorithm; S3、训练模型:基于S2中所设计的重建算法来搭建模型,并利用深度学习框架Pytorch平台训练模型,在整个数据集上迭代15个epcoh,随后减小学习率至0.00001,继续迭代直到损失收敛,得到最终模型;S3, training model: build a model based on the reconstruction algorithm designed in S2, and use the deep learning framework Pytorch platform to train the model. 
Iterate 15 epcohs on the entire dataset, then reduce the learning rate to 0.00001, and continue to iterate until the loss converges to obtain the final model; S4、输出结果:将测试集中的低动态范围的Raw视频序列输入到S3中所得的最终模型中,得到相应的高动态范围的输出结果。S4, output results: Input the low dynamic range Raw video sequence in the test set into the final model obtained in S3 to obtain the corresponding high dynamic range output results. 2.根据权利要求1所述的一种基于Raw域的双支路HDR视频重建算法,其特征在于,所述S102中提到模拟相机成像流程将sRGB视频转换为Raw视频数据集,具体包括以下步骤:2. According to the Raw domain-based dual-branch HDR video reconstruction algorithm of claim 1, it is characterized in that the analog camera imaging process mentioned in S102 converts the sRGB video into a Raw video data set, specifically comprising the following steps: S1021、对于高质量视频数据集,通过估计CRF曲线,将视频帧由非线性域转换到线性域;S1021. For a high-quality video dataset, convert the video frames from a nonlinear domain to a linear domain by estimating a CRF curve; S1022、将3通道sRGB帧下采样为四分之一分辨率的四个通道:红、绿、绿、蓝,按照GRBG的bayer格式组合为mosaic图像;S1022, down-sample the 3-channel sRGB frame into four channels of one-quarter resolution: red, green, green, and blue, and combine them into a mosaic image according to the GRBG bayer format; S1023、将mosaic图像转换为G、R、B、G的4通道图像,然后随机进行缩放、平移、旋转;S1023, converting the mosaic image into a 4-channel image of G, R, B, G, and then randomly scaling, translating, and rotating; S1024、按照特定的曝光参数将Raw域HDR帧转换为Raw域LDR帧;S1024, converting the Raw domain HDR frame into a Raw domain LDR frame according to specific exposure parameters; S1025、为Raw域LDR帧添加模拟的高斯和泊松噪声。S1025. Add simulated Gaussian and Poisson noise to the Raw domain LDR frame. 3.根据权利要求1所述的一种基于Raw域的双支路HDR视频重建算法,其特征在于,所述S2中提到的设计双支路Raw视频HDR重建算法,具体包括以下步骤:3. According to the Raw domain-based dual-branch HDR video reconstruction algorithm of claim 1, it is characterized in that the design of the dual-branch Raw video HDR reconstruction algorithm mentioned in S2 specifically comprises the following steps: S201、噪声估计:每次输入三帧连续的raw域LDR帧
(frame symbols: images FDA0003844670250000021, FDA0003844670250000022) and estimate their noise level maps with a noise estimation network:
(formula: image FDA0003844670250000023)
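The noise that this estimation step must handle is synthesized in step S1025. A minimal sketch of Poisson–Gaussian noise synthesis on a linear Raw frame follows; the Poisson–Gaussian model matches the claim, but the `gain` and `read_std` values are hypothetical sensor parameters, not taken from the patent:

```python
import math
import random

def sample_poisson(lam):
    """Knuth's Poisson sampler (adequate for the small rates used here)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= random.random()
        if p <= threshold:
            return k
        k += 1

def add_poisson_gaussian_noise(raw, gain=0.01, read_std=0.002):
    """Add simulated shot (Poisson) and read (Gaussian) noise to a linear
    Raw frame given as a nested list of values in [0, 1].
    gain and read_std are illustrative, not values from the patent."""
    noisy = []
    for row in raw:
        out = []
        for v in row:
            photons = max(v, 0.0) / gain           # expected photon count
            shot = sample_poisson(photons) * gain  # signal-dependent shot noise
            read = random.gauss(0.0, read_std)     # signal-independent read noise
            out.append(min(max(shot + read, 0.0), 1.0))
        noisy.append(out)
    return noisy
```

Because the Poisson variance scales with the signal, dark regions carry a lower absolute but higher relative noise level, which is what a per-pixel noise level map is meant to capture.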
S202. Data preprocessing and feature extraction: take the three consecutive Raw-domain LDR frames (images FDA0003844670250000024, FDA0003844670250000025) and their corresponding exposure coefficients t_{i-1}, t_i, t_{i+1} as input, and perform exposure correction on the input LDR images with the correction formula:
(formula: image FDA0003844670250000026)
This formula maps the inputs (image FDA0003844670250000027) to the same exposure level;
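Since Raw data is linear, a common realization of such a correction simply normalizes each frame by its exposure coefficient. The claim's exact formula is given only as an image, so the linear form below is an assumption:

```python
def exposure_correct(frame, t):
    """Map a linear Raw LDR frame with exposure coefficient t to a
    common exposure level (assumed linear form: divide by t)."""
    return [[v / t for v in row] for row in frame]

# The same scene captured with a 4x longer exposure maps to
# (approximately) the same corrected values:
short = exposure_correct([[0.1, 0.2]], 1.0)
long_ = exposure_correct([[0.4, 0.8]], 4.0)
```

Saturated pixels in the long exposure and noisy pixels in the short exposure still disagree after correction; those regions are precisely where the attention fusion of S204 and the content-enhancement branch of S205 have to intervene.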
The features are then extracted by convolution in a feature extraction module:
(formula: image FDA0003844670250000031)
where F_i denotes the features extracted from the i-th frame; the input LDR image helps detect over- and under-exposed regions, the input HDR image aids the subsequent alignment, and the noise level map helps identify noisy regions.
S203. Feature alignment: a cascaded pyramidal deformable-convolution structure first downsamples the input features twice to obtain features at multiple scales:
(formula: image FDA0003844670250000032)
where (image FDA0003844670250000033) denotes downsampling;
At the s-th scale, the feature (image FDA0003844670250000034) is concatenated with the center-frame feature (image FDA0003844670250000035) to estimate the offsets (image FDA0003844670250000036):
(formula: image FDA0003844670250000037)
The computed offsets (image FDA0003844670250000038) serve as the offsets of a deformable convolution, which is applied to the neighboring-frame features to obtain the alignment result at the current scale:
(formula: image FDA0003844670250000039)
(formula: image FDA00038446702500000310)
where ↑2 denotes 2× bilinear-interpolation upsampling; the alignment result at each scale is further fused by convolution with the result from the coarser scale. This coarse-to-fine joint prediction in the feature domain estimates large displacements more accurately.
S204. Temporal fusion: a spatial-attention structure computes attention correlations between adjacent frames by convolution, helping the network reconstruct ghost-free, correctly exposed HDR images:
(formulas: images FDA00038446702500000311, FDA00038446702500000312)
where A_i denotes the predicted spatial attention, ⊙ denotes element-wise multiplication, and (image FDA00038446702500000313) denotes the temporally fused features;
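A minimal sketch of such attention-weighted fusion follows; the attention here is a hand-written similarity gate standing in for the learned convolution of S204, so the gate shape and constants are assumptions:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def attention_fuse(center, neighbor):
    """Fuse a neighboring frame's feature vector into the center frame's.
    Positions where the two features agree get attention near 1; positions
    where they disagree (e.g. ghosting from motion) are suppressed.
    The 5/10 gate constants are illustrative, not learned weights."""
    fused = []
    for c, n in zip(center, neighbor):
        a = sigmoid(5.0 - 10.0 * abs(c - n))  # attention weight in (0, 1)
        fused.append(c + a * n)               # element-wise gated sum
    return fused
```

In the patented network these weights are predicted by convolutions from the concatenated features, but the effect is the same: misaligned or saturated neighbor content contributes little to the fused feature.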
S205. Content enhancement branch: the aligned features are passed through a residual estimation branch (REB) that extracts the high-frequency information of the input features, helping to restore content missing from the other branch:
(formula: image FDA0003844670250000041)
where H_res denotes the estimated residual information and (image FDA0003844670250000042) denotes the aligned features;
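The REB itself is learned; as a rough analogue only, high-frequency residual extraction can be illustrated with a hand-written high-pass filter (signal minus its local mean). This is an assumption for illustration, not the patented branch:

```python
def highpass_residual(signal, radius=1):
    """Return signal minus its local box-filtered mean: a crude stand-in
    for the high-frequency information the residual branch extracts."""
    n = len(signal)
    res = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        mean = sum(signal[lo:hi]) / (hi - lo)   # local average
        res.append(signal[i] - mean)            # keep only the detail
    return res
```

A flat region yields a zero residual, while edges and texture survive, which is the kind of detail the main branch tends to lose in noisy or saturated areas.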
S206. HDR reconstruction: the fused features (image FDA0003844670250000043) pass through a series of residual blocks with skip connections, are added to H_res, and then go through a sigmoid layer to produce the final Raw-domain HDR result (image FDA0003844670250000044).
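The final step (add the residual branch's output, then squash with a sigmoid) can be sketched per element; the feature and residual values are placeholders:

```python
import math

def reconstruct_hdr(features, residual):
    """Add the residual branch's output to the main branch's features
    and squash with a sigmoid into the (0, 1) Raw HDR range."""
    return [1.0 / (1.0 + math.exp(-(f + r)))
            for f, r in zip(features, residual)]
```

The sigmoid guarantees the network output stays in a bounded linear range regardless of how large the pre-activation features grow.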
S207. Loss function: the network is trained on the difference between the tone-mapped output (image FDA0003844670250000045) and the tone-mapped ground truth (image FDA0003844670250000046):
(formulas: images FDA0003844670250000047, FDA0003844670250000048)
where μ is 5000, and (images FDA0003844670250000049, FDA00038446702500000410) denote the tone-mapped Raw-domain ground-truth image and prediction, respectively.
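The μ-law tone mapping commonly used for such losses, consistent with the μ = 5000 stated above, can be sketched as follows together with an L1 distance on the tone-mapped pair. The exact loss formula appears only as an image, so the choice of L1 is an assumption:

```python
import math

MU = 5000.0  # value stated in the claim

def tonemap(h):
    """mu-law tone mapping T(H) = log(1 + mu*H) / log(1 + mu), H in [0, 1]."""
    return math.log(1.0 + MU * h) / math.log(1.0 + MU)

def tonemapped_l1(pred, gt):
    """Mean L1 distance between tone-mapped prediction and ground truth
    (the L1 choice is an assumption, not taken from the patent)."""
    return sum(abs(tonemap(p) - tonemap(g))
               for p, g in zip(pred, gt)) / len(pred)
```

The logarithmic compression makes errors in dark regions count nearly as much as errors in highlights, which is why the loss is computed after tone mapping rather than on the linear Raw values.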
CN202211113812.8A 2022-09-14 2022-09-14 Double-branch HDR video reconstruction algorithm based on Raw domain Pending CN115841523A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211113812.8A CN115841523A (en) 2022-09-14 2022-09-14 Double-branch HDR video reconstruction algorithm based on Raw domain

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211113812.8A CN115841523A (en) 2022-09-14 2022-09-14 Double-branch HDR video reconstruction algorithm based on Raw domain

Publications (1)

Publication Number Publication Date
CN115841523A true CN115841523A (en) 2023-03-24

Family

ID=85575406

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211113812.8A Pending CN115841523A (en) 2022-09-14 2022-09-14 Double-branch HDR video reconstruction algorithm based on Raw domain

Country Status (1)

Country Link
CN (1) CN115841523A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116596779A (en) * 2023-04-24 2023-08-15 天津大学 Raw video denoising method based on Transformer
CN116596779B (en) * 2023-04-24 2023-12-01 天津大学 Raw video denoising method based on Transformer

Similar Documents

Publication Publication Date Title
CN111539879A (en) Video blind denoising method and device based on deep learning
CN111986084B (en) Multi-camera low-illumination image quality enhancement method based on multi-task fusion
Niu et al. Low cost edge sensing for high quality demosaicking
CN111784570A (en) Video image super-resolution reconstruction method and device
CN110225260B (en) A Stereo High Dynamic Range Imaging Method Based on Generative Adversarial Networks
US20250037244A1 (en) High Dynamic Range View Synthesis from Noisy Raw Images
CN111105376B (en) Single-exposure high-dynamic-range image generation method based on double-branch neural network
CN115393227B (en) Self-adaptive enhancement method and system for low-light full-color video images based on deep learning
CN113096029A (en) High dynamic range image generation method based on multi-branch codec neural network
CN113284061B (en) Underwater image enhancement method based on gradient network
CN112508812A (en) Image color cast correction method, model training method, device and equipment
WO2023005818A1 (en) Noise image generation method and apparatus, electronic device, and storage medium
CN114862698A (en) Method and device for correcting real overexposure image based on channel guidance
CN116757959A (en) A HDR image reconstruction method based on Raw domain
CN118552442A (en) Image defogging method based on mixed large-scale convolution and attention fusion
CN116563183A (en) High dynamic range image reconstruction method and system based on single RAW image
CN115841523A (en) Double-branch HDR video reconstruction algorithm based on Raw domain
Lu et al. Event camera demosaicing via swin transformer and pixel-focus loss
CN114463189A (en) Image information analysis modeling method based on dense residual UNet
CN111161189A (en) A single image re-enhancement method based on detail compensation network
CN116245968A (en) Method for generating HDR image based on LDR image of transducer
Zhang et al. Joint Luminance Adjustment and Color Correction for Low-Light Image Enhancement Network.
CN113935928B (en) Rock core image super-resolution reconstruction based on Raw format
CN115937045A (en) An Iterative Color Scale Reconstruction Method
Liu et al. Learning to Generate Realistic Images for Bit-Depth Enhancement via Camera Imaging Processing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination