CN117830096A

CN117830096A - A super-resolution reconstruction method for face images under low light conditions

Info

Publication number: CN117830096A
Application number: CN202311690044.7A
Authority: CN
Inventors: 江俊君; 王晨阳; 江奎; 刘贤明
Original assignee: Harbin Institute of Technology Shenzhen
Current assignee: Harbin Institute of Technology Shenzhen
Priority date: 2023-12-11
Filing date: 2023-12-11
Publication date: 2024-04-05

Abstract

The present invention discloses a method for super-resolution reconstruction of facial images under low-light conditions. Step 1: synthesize a low-light low-resolution facial image I _LLR ; Step 2: construct a brightness-corrected facial super-resolution network IC‑FSRNet; Step 3: input the image synthesized in step 1 into step 2, improve the brightness of the facial image and restore the facial structure information, and obtain I _SR1 ; Step 4: construct a detail enhancement model DENet; Step 5: input the image obtained in step 3 into step 4, improve the facial details of the facial image, so that the facial image has a better visual effect, and obtain I _SR2 . The present invention can effectively improve the visual quality of low-light low-resolution facial images and solve the problem of loss of important facial information in the existing cascade technology.

Description

A super-resolution reconstruction method for face images under low light conditions

技术领域Technical Field

本发明属于图像领域，具体涉及一种低光照条件下的人脸图像超分辨率重建方法。The invention belongs to the field of images, and in particular relates to a super-resolution reconstruction method for face images under low-light conditions.

背景技术Background technique

人脸超分辨率(FSR)，也称为人脸幻觉，是一种从低分辨率人脸图像中恢复出相应的高分辨率人脸图像的技术。人脸超分辨率已被研究了很多年，并且广泛应用于户外计算机视觉系统。尽管现有人脸超分辨率方法在处理正常光照条件下捕获的人脸图像时非常有效，但在暗光照条件下仍有改进的空间。在实际场景中，由于成像环境的多样性和复杂性(例如夜间照明不足或视频监控场景中曝光时间有限)，捕获的人脸图像会同时受到低分辨率和低光照退化干扰的情况。因此，迫切需要设计低光照低分辨率人脸图像超分辨率技术。Face super-resolution (FSR), also known as face hallucination, is a technique to recover a corresponding high-resolution face image from a low-resolution face image. Face super-resolution has been studied for many years and is widely used in outdoor computer vision systems. Although existing face super-resolution methods are very effective in processing face images captured under normal lighting conditions, there is still room for improvement under low-light conditions. In practical scenarios, due to the diversity and complexity of imaging environments (such as insufficient lighting at night or limited exposure time in video surveillance scenarios), the captured face images are simultaneously interfered by low resolution and low-light degradation. Therefore, there is an urgent need to design low-light low-resolution face image super-resolution techniques.

在深度学习之前，传统方法通过设计手工先验和重建算子来提高人脸图像分辨率，但在开放世界场景下，传统方法的泛化能力具有一定局限性。最近，深度卷积神经网络已经出现，并且在人脸超分辨率任务上比传统方法表现出更优越的性能。然而，这些方法依然无法从低光照低分辨率人脸图像中重建出视觉上令人愉悦且内容保真的结果，因为它们是为恢复正常光照条件下捕获的人脸图像而定制的。一种可能的方法是依次执行人脸超分辨率和低光照图像增强(LLIE)，比如先超分辨率然后再对超分辨率结果进行增强(FSR+LLIE)或先对图像进行增强而后进行超分辨率(LLIE+FSR)。然而，简单的级联解决方案很难生成具有令人满意的面部结构和视觉质量的结果。原因在于，单独的方法仅关注单一退化(比如低光照图像增强方法仅仅关注低光照退化，而人脸超分辨率方法仅仅关注低分辨率降质)，而输入人脸图像同时承受低光照和低分辨率退化。也就是说，前一个过程不可避免地会产生伪影和错误，这会干扰后一个过程并导致错误累积，最终导致重要面部信息的丢失。因此，迫切需要一种能够有效恢复低光照低分辨率人脸图像的方法。Before deep learning, traditional methods improve the resolution of face images by designing handcrafted priors and reconstruction operators, but their generalization ability is limited in open-world scenarios. Recently, deep convolutional neural networks have emerged and have shown superior performance on face super-resolution tasks than traditional methods. However, these methods still cannot reconstruct visually pleasing and content-preserving results from low-light low-resolution face images because they are tailored to restore face images captured under normal lighting conditions. One possible approach is to perform face super-resolution and low-light image enhancement (LLIE) sequentially, such as super-resolution first and then enhancement of the super-resolution result (FSR+LLIE) or image enhancement first and then super-resolution (LLIE+FSR). However, simple cascade solutions are difficult to generate results with satisfactory facial structure and visual quality. The reason is that individual methods only focus on a single degradation (e.g., low-light image enhancement methods only focus on low-light degradation, while face super-resolution methods only focus on low-resolution degradation), while the input face images suffer from both low-light and low-resolution degradation. That is, the former process will inevitably produce artifacts and errors, which will interfere with the latter process and cause error accumulation, ultimately leading to the loss of important facial information. Therefore, there is an urgent need for a method that can effectively restore low-light and low-resolution face images.

发明内容Summary of the invention

本发明提供一种低光照条件下的人脸图像超分辨率重建方法，用以解决现有技术中错误累积，导致重要面部信息丢失的问题。The present invention provides a super-resolution reconstruction method for facial images under low-light conditions, which is used to solve the problem in the prior art of error accumulation leading to loss of important facial information.

本发明通过以下技术方案实现：The present invention is achieved through the following technical solutions:

一种低光照条件下的人脸图像超分辨率重建方法，所述重建方法包括以下步骤：A method for super-resolution reconstruction of face images under low-light conditions, the reconstruction method comprising the following steps:

步骤1：合成低光照低分辨率人脸图像I_LLR；Step 1: Synthesize a low-light low-resolution face image I _LLR ;

步骤2：构建亮度校正人脸超分辨率网络IC-FSRNet；Step 2: Construct the brightness correction face super-resolution network IC-FSRNet;

步骤3：将步骤1合成的图像输入到步骤2中，改善人脸图像的亮度并恢复人脸结构信息，得到I_SR1；Step 3: Input the image synthesized in step 1 into step 2 to improve the brightness of the face image and restore the face structure information to obtain I _SR1 ;

步骤4：构建细节增强模型DENet；Step 4: Build the detail enhancement model DENet;

步骤5：将步骤3得到的图像输入到步骤4中，改善人脸图像的面部细节，从而使人脸图像具有更好的视觉效果，得到I_SR2。Step 5: Input the image obtained in step 3 into step 4 to improve the facial details of the face image, so that the face image has a better visual effect, and obtain I _SR2 .

进一步的，所述构建亮度校正人脸超分辨率网络具体包括用于估计亮度系数以进行亮度调整的亮度估计IE分支和用于重建超分辨率结果的人脸超分辨率FSR分支；Furthermore, the said constructing brightness correction face super-resolution network specifically includes a brightness estimation IE branch for estimating a brightness coefficient for brightness adjustment and a face super-resolution FSR branch for reconstructing a super-resolution result;

首先低光照低分辨率人脸图像I_LLR被送到两个分支；First, the low-light low-resolution face image I _LLR is sent to two branches;

然后，亮度校正超分辨率块ICSRB将上一步的输出特征作为输入来编码它们的相互关系。Then, the brightness correction super-resolution block ICSRB takes the output features of the previous step as input to encode their mutual relations.

进一步的，所述亮度校正超分辨率块ICSRB将上一步的输出特征作为输入来编码它们的相互关系具体为，首先采用三个级联的卷积层从/>预测双边亮度网格B₁；同时，通过卷积层从上采样的I_LLR中提取引导图G；有了B₁和G，通过3D切片得到亮度系数C₁，表示为，Furthermore, the brightness correction super-resolution block ICSRB converts the output features of the previous step Specifically, three cascaded convolutional layers are used as input to encode their mutual relationship. Predict the bilateral brightness grid B ₁ ; at the same time, extract the guidance map G from the upsampled I _LLR through the convolution layer; with B ₁ and G, the brightness coefficient C ₁ is obtained through 3D slicing, expressed as,

C₁＝f_Slice(B₁,G),C ₁ = f _Slice (B ₁ ,G),

其中f_Slice表示3D切片操作；Where f _Slice represents a 3D slice operation;

然后将亮度系数C₁送入到亮度调整块中以调整FSR分支中的特征S₁的亮度，产生调整后的结果 The brightness coefficient _C1 is then fed into the brightness adjustment block to adjust the brightness of the feature _S1 in the FSR branch, producing the adjusted result

进一步的，所述3D切片先利用G将亮度网格B₁投影到3D网格上；Further, the 3D slice first projects the brightness grid B ₁ onto the 3D grid using G;

然后用高斯模糊对网格进行模糊处理；The mesh is then blurred using Gaussian blur;

最后，根据模糊的双边网格和引导图像G，通过使用三线性插值访问网格值来获得亮度系数C₁。Finally, according to the blurred bilateral grid and the guidance image G, the brightness coefficient C ₁ is obtained by accessing the grid values using trilinear interpolation.

进一步的，除了利用来自IE分支的亮度系数来改善FSR分支中的亮度外，还将亮度调整后的结果反馈到IE分支的亮度细化块以细化原始亮度并改善接下来的亮度估计；这样，人脸超分辨率便可以促进亮度估计；然后，将生成的超分辨率结果和亮度细化结果输入到后面的L-1个ICSRB中，以进行更准确的相互细化和利用；Furthermore, in addition to using the brightness coefficient from the IE branch to improve the brightness in the FSR branch, the brightness adjustment result is also Feedback to the brightness refinement block of the IE branch to refine the original brightness and improve the subsequent brightness estimation; in this way, face super-resolution can promote brightness estimation; then, the generated super-resolution result and brightness refinement result are input into the following L-1 ICSRBs for more accurate mutual refinement and utilization;

经过L次亮度估计和人脸超分辨率的协作后，在FSR分支的结果上应用卷积层，生成重建结果I_SR1；为了优化网络，采用L1像素损失，After L times of brightness estimation and face super-resolution, a convolutional layer is applied to the result of the FSR branch to generate the reconstruction result I _SR1 ; in order to optimize the network, L1 pixel loss is used,

其中I_HR是高质量人脸图像的参考标准。Among them, I _HR is the reference standard for high-quality face images.

进一步的，所述亮度调整具体为，首先通过卷积层和Sigmoid函数，将学习到的系数C_i映射到[-1,1]间，得到亮度系数C‘_i。而超分辨率特征S_i，将被输入到级联的卷积层生成S‘_i，然后通过以下转换函数进行亮度调整：Furthermore, the brightness adjustment is specifically to first map the learned coefficient _Ci to [-1,1] through the convolution layer and the Sigmoid function to obtain the brightness coefficient _C'i . The super-resolution feature _Si will be input into the cascaded convolution layer to generate _S'i , and then the brightness is adjusted through the following conversion function:

其中表示亮度调整后的特征，而*表示逐像素相乘。in represents the brightness-adjusted feature, and * indicates pixel-wise multiplication.

进一步的，所述亮度细化具体为，给定超分辨结果和原始亮度B_i，使用不同的全连接层从B_i生成查询Q，并从/>获得键K和值V；然后计算它们之间的交叉注意力，Furthermore, the brightness refinement is specifically as follows: given the super-resolution result and the original brightness _Bi , using different fully connected layers to generate query Q from _Bi , and from/> Get the key K and value V; then calculate the cross attention between them,

其中F_Att表示注意力图，f_Softmax表示softmax，d是超参数；Where F _Att represents the attention map, f _Softmax represents softmax, and d is a hyperparameter;

然后，应用前馈网络来生成细化结果；利用超分辨特征来细化下一步的亮度估计。Then, a feed-forward network is applied to generate the refined result; the super-resolution features are used to refine the brightness estimation in the next step.

进一步的，所述构建细节增强模型DENet具体为，DENet的训练目标如下，Furthermore, the construction of the detail enhancement model DENet is specifically as follows:

其中∈～N(0,1)，(x,y)对应I_SR1和I_HR，γ～p(γ)，f_DENet则表示DENet。Where ∈～N(0,1), (x,y) corresponds to I _SR1 and I _HR , γ～p(γ), and f _DENet represents DENet.

本发明的有益效果是：The beneficial effects of the present invention are:

本发明将低光照人脸图像超分辨率任务分为结构保真度重建和纹理一致性学习，并设计了亮度校正人脸超分辨率网络(IC-FSRNet)和细节增强模型(DENet)；前者旨在重现具有可信结构信息的人脸图像，而后者负责消除扰动以提高纹理一致性。The present invention divides the low-light face image super-resolution task into structure fidelity reconstruction and texture consistency learning, and designs a brightness correction face super-resolution network (IC-FSRNet) and a detail enhancement model (DENet); the former aims to reproduce face images with credible structural information, while the latter is responsible for eliminating disturbances to improve texture consistency.

本发明充分利用亮度校正和人脸超分辨率之间的互补信息以使得二者相互完善。The present invention makes full use of the complementary information between brightness correction and face super-resolution so that the two complement each other.

本发明的实验结果表明，所提出的方法在视觉质量和定量指标方面实现了最先进的性能。Our experimental results demonstrate that the proposed method achieves state-of-the-art performance in terms of both visual quality and quantitative metrics.

本发明可以有效改善低光照低分辨率人脸图像视觉质量，解决现有级联技术中重要面部信息丢失的问题。The present invention can effectively improve the visual quality of low-light and low-resolution facial images and solve the problem of loss of important facial information in existing cascade technology.

附图说明BRIEF DESCRIPTION OF THE DRAWINGS

图1是本发明的流程图，其中IC-FSRNet主要用于恢复人脸结构，实现结构保真重建，而DENet则可以进一步增强面部细节并提升视觉质量。FIG1 is a flow chart of the present invention, wherein IC-FSRNet is mainly used to restore the face structure and realize structure-fidelity reconstruction, while DENet can further enhance facial details and improve visual quality.

图2是本发明的IC-FSRNet的总体结构示意图。FIG2 is a schematic diagram of the overall structure of IC-FSRNet of the present invention.

图3是本发明与其他几个SOTA方法的主观结果对比，(a)：I_LLR主观结果；(b)：SCTANet+LLformer方法的主观结果；(c)：LLformer+SCTANet方法的主观结果；(d)：SISN+FECNet方法的主观结果；(e)：FECNet+SISN方法的主观结果；(f)：SFMNet+LEDNet方法的主观结果；(g)：LEDNet+SFMNet方法的主观结果；(h)：IC-FSRNet方法的主观结果；(i)：IC-FSRDENet方法的主观结果(j)：对应的高质量人脸图像参考标准。Figure 3 is a comparison of the subjective results of the present invention with several other SOTA methods, (a): I _LLR subjective results; (b): subjective results of the SCTANet+LLformer method; (c): subjective results of the LLformer+SCTANet method; (d): subjective results of the SISN+FECNet method; (e): subjective results of the FECNet+SISN method; (f): subjective results of the SFMNet+LEDNet method; (g): subjective results of the LEDNet+SFMNet method; (h): subjective results of the IC-FSRNet method; (i): subjective results of the IC-FSRDENet method (j): corresponding high-quality face image reference standards.

图4是本发明与其他几个SOTA方法的余弦相似性对比图。FIG4 is a cosine similarity comparison diagram of the present invention and several other SOTA methods.

具体实施方式Detailed ways

下面将结合本发明实施例中的附图对本发明实施例中的技术方案进行清楚、完整地描述，显然，所描述的实施例仅仅是本发明一部分实施例，而不是全部的实施例。基于本发明中的实施例，本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例，都属于本发明保护的范围。The technical solutions in the embodiments of the present invention will be described clearly and completely below in conjunction with the drawings in the embodiments of the present invention. Obviously, the described embodiments are only part of the embodiments of the present invention, not all of the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by ordinary technicians in this field without creative work are within the scope of protection of the present invention.

为了改善低光照低分辨率人脸图像质量，一个自然的想法是设计一个联合低光照低分辨率人脸超分辨率框架。但是简单的级联方案(LLIE+FSR或FSR+LLIE)在捕捉复杂的面部特征方面存在缺陷，导致超分辨率结果不理想，存在明显的色差和结构损失。为了解决这个困境，本发明提出将这个任务分解为结构保真度重建和纹理一致性学习。前者专为提高低分辨率低光照人脸图像质量同时保持结构保真度而定制，而后者则侧重于消除低光退化和重建引起的扰动和伪影。In order to improve the quality of low-light low-resolution face images, a natural idea is to design a joint low-light low-resolution face super-resolution framework. However, simple cascade schemes (LLIE+FSR or FSR+LLIE) are defective in capturing complex facial features, resulting in unsatisfactory super-resolution results with obvious chromatic aberration and structural loss. To solve this dilemma, the present invention proposes to decompose this task into structural fidelity reconstruction and texture consistency learning. The former is tailored for improving the quality of low-resolution low-light face images while maintaining structural fidelity, while the latter focuses on eliminating perturbations and artifacts caused by low-light degradation and reconstruction.

考虑到，亮度调整后的区域可以提供互补信息来改善人脸图像超分辨率，而反过来，超分辨率后的人脸图像也可以促进人脸图像的亮度调整。因此本发明引入了交互学习来利用亮度调整后的区域和重建的信息，并实现迭代细化。鉴于此，IC-FSRNet构建了用于估计亮度调整系数的亮度估计(IE)分支和结合这些系数以联合执行超分辨率和亮度调整的人脸超分辨率(FSR)分支。IC-FSRNet通过充分探索超分辨率和亮度调整之间的相互增强，采用迭代方式来增强人脸重建。在保持结构保真后，本发明利用扩散模型在合成精细图像细节方面的优点，设计了基于扩散概率模型的细节增强网络(DENet)来消除伪影并提高视觉质量。方法总称为IC-FSRDENet。Considering that the brightness-adjusted region can provide complementary information to improve the super-resolution of face images, and conversely, the super-resolution face images can also promote the brightness adjustment of face images. Therefore, the present invention introduces interactive learning to utilize the brightness-adjusted region and the reconstructed information, and realizes iterative refinement. In view of this, IC-FSRNet constructs a brightness estimation (IE) branch for estimating brightness adjustment coefficients and a face super-resolution (FSR) branch that combines these coefficients to jointly perform super-resolution and brightness adjustment. IC-FSRNet uses an iterative approach to enhance face reconstruction by fully exploring the mutual enhancement between super-resolution and brightness adjustment. After maintaining structural fidelity, the present invention utilizes the advantages of the diffusion model in synthesizing fine image details and designs a detail enhancement network (DENet) based on the diffusion probability model to eliminate artifacts and improve visual quality. The method is generally referred to as IC-FSRDENet.

本发明的目标是从给定的低光照低分辨率人脸图像I_LLR生成高质量的超分辨率人脸图像。现有方法通常关注单个任务、分辨率放大或亮度校正，但很少同时考虑两种退化。因此，它们无法有效地处理这项复杂的任务。尽管引入了级联解决方案(即先执行人脸超分辨率后执行亮度调整(FSR+LLIE)或先对图像进行亮度调整后对图像进行超分辨率(LLIE+FSR))来恢复这些退化的人脸图像，但级联的方法忽略了这两个任务之间的内在关系，性能改善有限。级联方案生成的人脸图像通常缺乏人脸结构和细节，同时存在明显的颜色偏差。为了缓解这个问题，本发明提出了一种新颖的方法，将该任务分解为结构保真度重建和纹理一致性学习。前者专为提高I_LLR人脸的质量而设计，同时保持结构保真度。后者侧重于消除由低光照退化和重建引起的扰动和伪影。如图1所示，一种低光照条件下的人脸图像超分辨率重建方法，所述重建方法包括以下步骤：The goal of the present invention is to generate high-quality super-resolution face images from a given low-light low-resolution face image I _LLR . Existing methods usually focus on a single task, resolution upscaling or brightness correction, but rarely consider both degradations at the same time. Therefore, they cannot effectively handle this complex task. Although cascade solutions (i.e., performing face super-resolution first and then brightness adjustment (FSR+LLIE) or performing brightness adjustment on the image first and then super-resolution on the image (LLIE+FSR)) have been introduced to restore these degraded face images, the cascade method ignores the intrinsic relationship between the two tasks and has limited performance improvement. The face images generated by the cascade scheme usually lack face structure and details, and there is obvious color deviation. To alleviate this problem, the present invention proposes a novel method to decompose the task into structural fidelity reconstruction and texture consistency learning. The former is designed to improve the quality of I _LLR faces while maintaining structural fidelity. The latter focuses on eliminating perturbations and artifacts caused by low-light degradation and reconstruction. As shown in Figure 1, a super-resolution reconstruction method for face images under low-light conditions, the reconstruction method comprises the following steps:

首先将I_LLR人脸图像输入IC-FSRNet进行亮度调整，同时生成粗略恢复的超分辨率人脸图像；考虑到亮度调整和人脸超分辨率的改进特征是互补的，超分辨率人脸图像可以改善亮度调整，并且亮度调整后的区域可以提供补充信息以促进人脸超分辨率；鉴于此，IC-FSRNet包括两个分支，一个亮度估计分支(IE)，用于估计亮度调整系数，一个超分辨率分支来结合这些系数以联合执行超分辨率和亮度调整；此外，IC-FSRNet迭代执行亮度估计和人脸超分辨率，探索它们之间的相互信息，以促进人脸超分辨率；虽然IC-FSRNet可以很好地恢复人脸结构并完成亮度调整，但恢复出来的人脸图像总是缺乏高频细节，并且存在低光照退化和重建带来的噪声；因此，进一步构建基于扩散模型的细节增强网络DENet，并将IC-FSRNet恢复的结果输入其中，以改善人脸细节和视觉质量。方法总称为IC-FSRDENet。First, the I _LLR face image is input into IC-FSRNet for brightness adjustment, and a roughly restored super-resolution face image is generated at the same time; considering that the improved features of brightness adjustment and face super-resolution are complementary, the super-resolution face image can improve the brightness adjustment, and the brightness-adjusted area can provide complementary information to promote face super-resolution; in view of this, IC-FSRNet includes two branches, a brightness estimation branch (IE) for estimating the brightness adjustment coefficient, and a super-resolution branch to combine these coefficients to jointly perform super-resolution and brightness adjustment; in addition, IC-FSRNet iteratively performs brightness estimation and face super-resolution, exploring the mutual information between them to promote face super-resolution; although IC-FSRNet can well restore the face structure and complete the brightness adjustment, the restored face image always lacks high-frequency details and has noise caused by low-light degradation and reconstruction; therefore, a detail enhancement network DENet based on a diffusion model is further constructed, and the results restored by IC-FSRNet are input into it to improve face details and visual quality. The method is collectively called IC-FSRDENet.

进一步的，所述构建亮度校正人脸超分辨率网络具体包括用于估计亮度系数以进行亮度调整的亮度估计IE分支和用于重建超分辨率结果的人脸超分辨率FSR分支；考虑到亮度调整和人脸超分辨率的改进特征是互补的，超分辨率后的人脸图像可以改善亮度调整，并且亮度调整后的区域可以提供补充信息来促进人脸超分辨率，因此IC-FSRNet迭代地执行亮度系数估计和超分辨率以探索二者的互补关系进行相互细化；Furthermore, the brightness correction face super-resolution network constructed specifically includes a brightness estimation IE branch for estimating the brightness coefficient for brightness adjustment and a face super-resolution FSR branch for reconstructing the super-resolution result; considering that the improved features of brightness adjustment and face super-resolution are complementary, the super-resolution face image can improve the brightness adjustment, and the brightness-adjusted area can provide supplementary information to promote face super-resolution, so IC-FSRNet iteratively performs brightness coefficient estimation and super-resolution to explore the complementary relationship between the two for mutual refinement;

首先低光照低分辨率人脸图像I_LLR送到两个分支；First, the low-light low-resolution face image I _LLR is sent to two branches;

然后，亮度校正超分辨率块ICSRB将上一步的输出特征作为输入来编码它们的相互关系；Then, the brightness correction super-resolution block ICSRB takes the output features of the previous step as input to encode their mutual relations;

ICSRB首先并行地执行人脸超分辨率和亮度估计；前者由常用的RCAB(Imagesuper-resolution using very deep residual channel attention networks)完成，生成增强特征S₁；后者是通过高效率的双边网格学习来实现的。多亏了双边网格学习，本发明可以在低分辨率维度估计亮度系数网格，然后用引导图切片到高分辨率空间，效率更高。ICSRB first performs face super-resolution and brightness estimation in parallel; the former is done by the commonly used RCAB (Imagesuper-resolution using very deep residual channel attention networks) to generate enhanced features S ₁ ; the latter is achieved through efficient bilateral grid learning. Thanks to bilateral grid learning, the present invention can estimate the brightness coefficient grid in the low-resolution dimension and then slice it into the high-resolution space with a guide map, which is more efficient.

进一步的，所述亮度校正超分辨率块ICSRB将上一步的输出特征作为输入来编码它们的相互关系，具体为，首先采用三个级联的卷积层从/>预测双边亮度网格B₁；同时，通过卷积层从上采样的I_LLR中提取引导图G；有了B₁和G，通过3D切片得到亮度系数C₁，表示为，Furthermore, the brightness correction super-resolution block ICSRB converts the output features of the previous step As input to encode their mutual relationship, specifically, firstly three cascaded convolutional layers are used from/> Predict the bilateral brightness grid B ₁ ; at the same time, extract the guidance map G from the upsampled I _LLR through the convolution layer; with B ₁ and G, the brightness coefficient C ₁ is obtained through 3D slicing, expressed as,

C₁＝f_Slice(B₁,G),C ₁ = f _Slice (B ₁ ,G),

进一步的，所述3D切片先利用G将亮度网格B₁投影到3D网格上，该网格的前两个维度表示图像平面中的2D位置，第三个维度表示G的图像强度；然后用高斯模糊对网格进行模糊处理；最后，根据模糊的双边网格和引导图像G，通过使用三线性插值访问网格值来获得亮度系数C₁。Furthermore, the 3D slice first projects the brightness grid B ₁ onto a 3D grid using G, where the first two dimensions of the grid represent the 2D position in the image plane and the third dimension represents the image intensity of G; then the grid is blurred by Gaussian blur; finally, based on the blurred bilateral grid and the guide image G, the brightness coefficient C ₁ is obtained by accessing the grid value using trilinear interpolation.

然后将亮度系数C₁送入到亮度调整块中以调整FSR分支中的特征S₁的照明，产生调整后的结果鉴于更准确的亮度系数可以促进人脸超分辨率，而更高分辨率的人脸图像可以改善亮度估计，ICSRB以迭代的方式实现亮度估计和人脸超分辨率之间的相互增强。The brightness coefficient _C1 is then fed into the brightness adjustment block to adjust the illumination of feature _S1 in the FSR branch, producing the adjusted result Given that more accurate brightness coefficients can facilitate face super-resolution, while higher-resolution face images can improve brightness estimation, ICSRB achieves mutual enhancement between brightness estimation and face super-resolution in an iterative manner.

进一步的，所述亮度调整具体为，本发明采用了一个具有可学习亮度系数的转换函数来进行亮度调整。具体而言，首先通过卷积层和Sigmoid函数，将学习到的系数C_i映射到[-1,1]间，得到亮度系数C‘_i。而超分辨率特征S_i，将被输入到级联的卷积层生成S‘_i，然后通过以下转换函数进行亮度调整：Furthermore, the brightness adjustment is specifically that the present invention uses a conversion function with a learnable brightness coefficient to perform brightness adjustment. Specifically, first, the learned coefficient _Ci is mapped to [-1,1] through a convolution layer and a Sigmoid function to obtain a brightness coefficient _C'i . The super-resolution feature _Si will be input into the cascaded convolution layer to generate _S'i , and then the brightness adjustment is performed through the following conversion function:

进一步的，所述亮度细化具体为，考虑到人脸超分辨率和亮度估计可以相互促进，本发明将人脸超分辨率分支的超分辨结果反馈到亮度估计分支以提炼亮度估计；考虑到亮度信息的全局特征，引入了交叉注意机制，从超分辨结果中探索全局信息；具体而言，给定超分辨结果和原始亮度B_i，使用不同的全连接层从B_i生成查询Q，并从/>获得键K和值V；然后计算它们之间的交叉注意力，Furthermore, the brightness refinement is specifically as follows: considering that face super-resolution and brightness estimation can promote each other, the present invention feeds back the super-resolution result of the face super-resolution branch to the brightness estimation branch to refine the brightness estimation; considering the global characteristics of brightness information, a cross-attention mechanism is introduced to explore global information from the super-resolution results; specifically, given the super-resolution result and the original brightness _Bi , using different fully connected layers to generate query Q from _Bi , and from/> Get the key K and value V; then calculate the cross attention between them,

然后，应用前馈网络来生成细化结果；通过这种方式，可以利用超分辨特征来细化下一步的亮度估计。Then, a feed-forward network is applied to generate the refined result; in this way, the super-resolution features can be utilized to refine the brightness estimation in the next step.

进一步的，所述构建细节增强模型DENet具体为，尽管IC-FSRNet可以改善人脸图像亮度并恢复出人脸结构，但IC-FSRNet恢复出来的人脸图像缺乏人脸关系细节并且具有明显的伪影，无法提供令人愉悦的视觉体验。为此，建立了细节增强网络(DENet)来提高视觉质量。受到扩散概率模型DDPM强大生成能力的启发，本发明设计了基于条件DDPM的DENet，并以I_SR1作为附加辅助信息来帮助逆转扩散过程。具体来说，DENet的训练目标如下，Furthermore, the construction of the detail enhancement model DENet is specifically as follows: although IC-FSRNet can improve the brightness of the face image and restore the face structure, the face image restored by IC-FSRNet lacks face relationship details and has obvious artifacts, and cannot provide a pleasant visual experience. For this reason, a detail enhancement network (DENet) is established to improve the visual quality. Inspired by the powerful generation ability of the diffusion probability model DDPM, the present invention designs a DENet based on conditional DDPM, and uses I _SR1 as additional auxiliary information to help reverse the diffusion process. Specifically, the training objectives of DENet are as follows,

其中∈～N(0,1)，(x,y)对应I_SR1和I_HR，γ～p(γ)，f_DENet则表示DENet。多亏了DENet，本发明的方法可以恢复出令人视觉愉悦的人脸图像。Where ∈ ～ N (0, 1), (x, y) corresponds to I _SR1 and I _HR , γ ～ p (γ), and f _DENet represents DENet. Thanks to DENet, the method of the present invention can restore visually pleasing face images.

验证试验Verification test

为了研究低光照低分辨率人脸超分辨率并验证所提方法的有效性，低光照低分辨率人脸和对应的高质量人脸图像数据集是必不可少的。然而，目前没有成对的低光照低分辨率人脸数据集，构建真实的人脸图像对又特别困难。因此，本发明模拟退化过程并使用现有的人脸数据集来合成低光照低分辨率人脸图像和正常光高分辨率人脸图像对。In order to study low-light low-resolution face super-resolution and verify the effectiveness of the proposed method, low-light low-resolution face and corresponding high-quality face image datasets are essential. However, there is currently no paired low-light low-resolution face dataset, and it is particularly difficult to construct a real face image pair. Therefore, the present invention simulates the degradation process and uses the existing face dataset to synthesize low-light low-resolution face images and normal light high-resolution face image pairs.

低光照低分辨率人脸图像模拟退化模拟的关键部分包括亮度调整和噪声添加。下面本发明详细阐述退化模拟。The key parts of the degradation simulation of low-light low-resolution face images include brightness adjustment and noise addition. The present invention describes the degradation simulation in detail below.

亮度调整。亮度调整的目的是将正常人脸图像转变为低光照人脸图像。本发明没有直接使用伽玛校正，而是利用线性变换和伽玛变换的结合来进行亮度调整，这可以更好地近似低光图像。具体来说，给定一个正常光照的人脸图像I_HR，亮度调整可以表示为：I_LL＝β×(α×I_HR)^γ，其中β∈U(0.5,1)、α∈U(0.9,1)、γ∈U(1.5,5)，而I_LL则表示低光照人脸图像。Brightness adjustment. The purpose of brightness adjustment is to transform a normal face image into a low-light face image. The present invention does not directly use gamma correction, but uses a combination of linear transformation and gamma transformation to perform brightness adjustment, which can better approximate low-light images. Specifically, given a normal-light face image I _HR , the brightness adjustment can be expressed as: I _LL = β×(α×I _HR ) ^γ , where β∈U(0.5,1), α∈U(0.9,1), γ∈U(1.5,5), and I _LL represents a low-light face image.

噪声添加。除了光照的变化之外，低光照环境还会引入噪声。因此，本发明向低光照人脸图像添加噪声。在模拟真实的低光照噪声中，本发明考虑了相机内图像信号处理(ISP)的特性和影响。ISP是指相机硬件和软件执行的将原始传感器数据转换为最终图像的处理步骤。在模拟低光照噪声时，考虑ISP的特性也很重要。因此，噪声添加过程可以表示为：其中f_ISP和/>表示ISP函数和逆ISP函数，N_P和N_G则对应泊松噪声和高斯噪声，I_LLN是生成的低光照带噪声的人脸图像。这里，ISP函数由白平衡、去马赛克、CAM-to-XYZ、XYZ-to-RGB和色调映射组成。Noise addition. In addition to changes in illumination, low-light environments also introduce noise. Therefore, the present invention adds noise to low-light face images. In simulating real low-light noise, the present invention takes into account the characteristics and effects of in-camera image signal processing (ISP). ISP refers to the processing steps performed by camera hardware and software to convert raw sensor data into the final image. When simulating low-light noise, it is also important to consider the characteristics of ISP. Therefore, the noise addition process can be expressed as: Where _fISP and/> represents the ISP function and the inverse ISP function, _NP and _NG correspond to Poisson noise and Gaussian noise, and _ILLN is the generated low-light noisy face image. Here, the ISP function consists of white balance, demosaicing, CAM-to-XYZ, XYZ-to-RGB, and tone mapping.

下采样。最后本发明用Bicubic将I_LLN下采样×16倍生成低光照低分辨率人脸图像I_LLR。Downsampling. Finally, the present invention uses Bicubic to downsample I _LLN by a factor of ×16 to generate a low-light and low-resolution face image I _LLR .

数据集和指标本发明的模型是在CelebAMaskHQ上训练的，本发明从该数据中随机选择3050张人脸图像作为训练集，并使用额外的300张人脸图像作为测试集。首先，本发明将人脸图像裁剪至256×256来作为高分辨率人脸图像，然后利用上述降质过程将其下采样到16×16作为I_LLR图像。本发明选用峰值信噪比(PSNR)、结构相似度(SSIM)、学习的感知图像块相似度(LPIPS)、自然图像质量评估器(NIQE)指标作为评估指标。Datasets and Metrics The model of the present invention is trained on CelebAMaskHQ. The present invention randomly selects 3050 face images from the data as the training set, and uses an additional 300 face images as the test set. First, the present invention crops the face images to 256×256 as high-resolution face images, and then downsamples them to 16×16 as I _LLR images using the above-mentioned degradation process. The present invention selects peak signal-to-noise ratio (PSNR), structural similarity (SSIM), learned perceptual image patch similarity (LPIPS), and natural image quality evaluator (NIQE) indicators as evaluation indicators.

实验细节ICSRB的个数L为14，IC-FSRNet和DENet是分别训练的，先训练IC-FSRNet，然后固定IC-FSRNet的参数，利用IC-FSRNet的输出为辅助信息，训练DENet。本发明选择Adam作为本发明模型的优化器，并且在整个训练阶段的学习率均设为1e-4。Experimental details: The number L of ICSRBs is 14. IC-FSRNet and DENet are trained separately. IC-FSRNet is trained first, and then the parameters of IC-FSRNet are fixed. DENet is trained using the output of IC-FSRNet as auxiliary information. Adam is selected as the optimizer of the model of the present invention, and the learning rate is set to 1e-4 throughout the training stage.

比较方法在本发明的实验中，由于没有低光照低分辨率人脸超分辨率方法，本发明选择几个有代表性的人脸超分辨率方法和图像增强方法，将其级联起来，构成18个对比方法。人脸超分辨率方法包括SISN(Face Hallucination via Split-Attention inSplit-Attention Network)，SCTANet(A Spatial Attention-Guided CNN-TransformerAggregation Networkfor Deep Face Image Super-Resolution)，SFMNet(Spatial-FrequencyMutualLearningforFace Super-Resolution)，低光照图像增强方法包括FECNet(Deep Fourier-Based Exposure Correction Network with Spatial-FrequencyInteraction)、LLformer(Ultra-high-definition low-light image enhancement:Abenchmark and transformer-based method)和LEDNet(Lednet:Joint low-lightenhancement and deblurring in the dark)。Comparison Method In the experiment of the present invention, since there is no low-light low-resolution face super-resolution method, the present invention selects several representative face super-resolution methods and image enhancement methods, cascades them, and forms 18 comparison methods. The face super-resolution methods include SISN (Face Hallucination via Split-Attention in Split-Attention Network), SCTANet (A Spatial Attention-Guided CNN-Transformer Aggregation Network for Deep Face Image Super-Resolution), SFMNet (Spatial-Frequency Mutual Learning for Face Super-Resolution), and the low-light image enhancement methods include FECNet (Deep Fourier-Based Exposure Correction Network with Spatial-Frequency Interaction), LLformer (Ultra-high-definition low-light image enhancement: A benchmark and transformer-based method) and LEDNet (Lednet: Joint low-light enhancement and deblurring in the dark).

对比试验Comparative Test

主观结果图3显示了从测试集中选择的几张人脸图像的超分辨率结果。可以看到，简单的级联解决方案会出现伪影，无法有效恢复人脸结构。原因在于，单个方法只关注单一退化，而输入同时承受了低光照和低分辨率退化。也就是说，前一个过程不可避免地会产生伪影和错误，这会干扰后一个过程并导致错误累积，最终导致重要人脸信息的丢失。相比之下，本发明的方法考虑了低光照和低分辨率退化，并提出了一个用于恢复I_LLR人脸图像的联合框架。在视觉质量方面，IC-FSRNet成功恢复了级联解决方案无法恢复的清晰人脸结构。然而，IC-FSRNet重建的人脸仍然存在噪声并且缺乏精细细节。得益于DENet，本发明的方法能够恢复视觉上令人愉悦的高质量人脸图像。Subjective Results Figure 3 shows the super-resolution results of several face images selected from the test set. It can be seen that the simple cascade solution will produce artifacts and cannot effectively restore the face structure. The reason is that the individual methods only focus on a single degradation, while the input suffers from both low-light and low-resolution degradation. That is, the former process will inevitably produce artifacts and errors, which will interfere with the latter process and cause error accumulation, ultimately leading to the loss of important face information. In contrast, the method of the present invention takes into account low-light and low-resolution degradations and proposes a joint framework for restoring I _LLR face images. In terms of visual quality, IC-FSRNet successfully restores clear face structures that the cascade solution cannot restore. However, the faces reconstructed by IC-FSRNet are still noisy and lack fine details. Thanks to DENet, the method of the present invention is able to restore visually pleasing high-quality face images.

表1.本发明方法与其他几种SOTA方法的客观比较，最好的结果被标记为加粗，次优的结果使用下划线标记。Table 1. Objective comparison of our method with several other SOTA methods. The best results are marked in bold and the suboptimal results are underlined.

客观结果表1展示了对比方法和本发明提出的方法的客观性能。具体来说，表1的上半部分给出了先执行人脸超分辨率后执行低光照图像增强方法的结果，而中间部分列出了先执行低光照图像增强后执行人脸超分辨率的结果。底部展示了本发明提出的方法的性能。从表1的比较可以看出，简单的级联解决方案很难获得令人满意的性能，而本发明的联合方法明显优于级联解决方案。特别是，与级联解决方案相比，IC-FSRNet在除NIQE之外的三个评估指标中表现出最佳性能，尤其在PSNR和SSIM方面表现出色。与性能第二好的方法相比，它在使用更少的参数的情况下，提高PSNR3.04 dB，提高SSIM0.0661。DENet的加入进一步改善了LPIPS和NIQE，但会在一定程度上影响PSNR和SSIM。尽管如此，IC-FSRDENet的PSNR和SSIM仍然明显高于其他方法。总的来说，表1中的结果证实了本发明提出的方法相对于简单级联解决方案的优越性。Objective Results Table 1 shows the objective performance of the comparison methods and the proposed method. Specifically, the upper part of Table 1 gives the results of the method that performs face super-resolution first and then performs low-light image enhancement, while the middle part lists the results of the method that performs low-light image enhancement first and then performs face super-resolution. The bottom shows the performance of the proposed method. From the comparison in Table 1, it can be seen that it is difficult for a simple cascade solution to achieve satisfactory performance, while the joint method of the present invention is significantly better than the cascade solution. In particular, compared with the cascade solution, IC-FSRNet shows the best performance in the three evaluation indicators except NIQE, especially in PSNR and SSIM. Compared with the second best performing method, it improves PSNR by 3.04 dB and SSIM by 0.0661 while using fewer parameters. The addition of DENet further improves LPIPS and NIQE, but affects PSNR and SSIM to a certain extent. Despite this, the PSNR and SSIM of IC-FSRDENet are still significantly higher than those of other methods. In general, the results in Table 1 confirm the superiority of the proposed method over the simple cascade solution.

人脸识别结果对比除了恢复高质量的人脸图像之外，人脸超分辨率方法还应该提高人脸识别等下游任务的性能。因此，本发明在人脸识别性能方面对本发明提出的方法和现有级联方法进行了比较分析。为了评估人脸识别性能，采用预训练的人脸识别模型Deepface从不同方法重建的人脸图像以及对应的高质量人脸图像参考标准中提取人脸身份特征。随后，计算这些人脸身份特征之间的余弦相似度作为人脸识别性能的度量指标。比较结果如图4所示。如图所示，本发明的方法的余弦相似度明显高于对比方法。这一结果清楚地表明本发明的方法在人脸识别性能方面优于其他方法，凸显了本发明提出的方法在改进人脸识别下游任务方面的有效性和优越性。Comparison of face recognition results In addition to restoring high-quality face images, face super-resolution methods should also improve the performance of downstream tasks such as face recognition. Therefore, the present invention compares and analyzes the method proposed in the present invention and the existing cascade method in terms of face recognition performance. In order to evaluate the face recognition performance, the pre-trained face recognition model Deepface is used to extract face identity features from face images reconstructed by different methods and the corresponding high-quality face image reference standards. Subsequently, the cosine similarity between these face identity features is calculated as a measure of face recognition performance. The comparison results are shown in Figure 4. As shown in the figure, the cosine similarity of the method of the present invention is significantly higher than that of the comparison method. This result clearly shows that the method of the present invention is superior to other methods in terms of face recognition performance, highlighting the effectiveness and superiority of the method proposed in the present invention in improving downstream tasks of face recognition.

鉴于亮度调整和人脸超分辨率可以提供互补信息并相互促进，IC-FSRNet构建了用于估计亮度调整系数的亮度估计分支和用于人脸超分辨率的人脸超分辨率分支，并迭代地执行亮度系数估计和人脸超分辨率，让它们相辅相成，促进人脸重建。之后，使用基于扩散概率模型的DENet去除伪影并提高粗略结果的质量。实验结果表明本发明的方法达到了最先进的性能。Considering that brightness adjustment and face super-resolution can provide complementary information and promote each other, IC-FSRNet constructs a brightness estimation branch for estimating brightness adjustment coefficients and a face super-resolution branch for face super-resolution, and iteratively performs brightness coefficient estimation and face super-resolution, allowing them to complement each other and promote face reconstruction. Afterwards, DENet based on the diffusion probability model is used to remove artifacts and improve the quality of rough results. Experimental results show that the method of the present invention achieves state-of-the-art performance.

Claims

1. A method for super-resolution reconstruction of facial images under low light conditions, characterized in that the reconstruction method comprises the following steps:

Step 1: Synthesize a low-light low-resolution face image I _LLR ;

Step 2: Construct the brightness correction face super-resolution network IC-FSRNet;

Step 3: Input the image synthesized in step 1 into step 2 to improve the brightness of the face image and restore the face structure information to obtain I _SR1 ;

Step 4: Build the detail enhancement model DENet;

Step 5: Input the image obtained in step 3 into step 4 to improve the facial details of the face image, so that the face image has a better visual effect, and obtain I _SR2 .

2. According to the method for super-resolution reconstruction of facial images under low-light conditions in claim 1, it is characterized in that the construction of the brightness correction face super-resolution network specifically includes a brightness estimation IE branch for estimating the brightness coefficient for brightness adjustment and a face super-resolution FSR branch for reconstructing the super-resolution result;

First, the low-light low-resolution face image I _LLR is sent to two branches;

Then, the brightness correction super-resolution block ICSRB takes the output features of the previous step as input to encode their mutual relations.

3. According to the method for super-resolution reconstruction of face images under low light conditions in claim 2, it is characterized in that the brightness correction super-resolution block ICSRB converts the output feature Specifically, three cascaded convolutional layers are used as input to encode their mutual relationship. Predict the bilateral brightness grid B ₁ ; at the same time, extract the guidance map G from the upsampled I _LLR through the convolution layer; with B ₁ and G, the brightness coefficient C ₁ is obtained through 3D slicing, expressed as,

C ₁ = f _Slice (B ₁ ,G),

Where f _Slice represents a 3D slice operation;

The brightness coefficient _C1 is then fed into the brightness adjustment block to adjust the brightness of the feature _S1 in the FSR branch, producing the adjusted result

4. According to the method for super-resolution reconstruction of facial images under low-light conditions in claim 3, it is characterized in that the 3D slice first uses G to project the brightness grid _B1 onto the 3D grid,

The mesh is then blurred using Gaussian blur;

Finally, according to the blurred bilateral grid and the guidance image G, the brightness coefficient C ₁ is obtained by accessing the grid values using trilinear interpolation.

5. According to claim 3, a method for super-resolution reconstruction of face images under low light conditions is characterized in that in addition to using the brightness coefficient from the IE branch to improve the brightness in the FSR branch, the brightness adjusted result is also Feedback to the brightness refinement block of the IE branch to refine the original brightness and improve the subsequent brightness estimation; in this way, face super-resolution can promote brightness estimation; then, the generated super-resolution result and brightness refinement result are input into the following L-1 ICSRBs for more accurate mutual refinement and utilization;

After L times of brightness estimation and face super-resolution, a convolutional layer is applied to the result of the FSR branch to generate the reconstruction result I _SR1 ; in order to optimize the network, L1 pixel loss is used,

Among them, I _HR is the reference standard for high-quality face images.

6. According to claim 5, a method for super-resolution reconstruction of facial images under low-light conditions, characterized in that the brightness adjustment is specifically to first map the learned coefficient _Ci to [-1,1] through a convolution layer and a Sigmoid function to obtain a brightness coefficient _C'i . The super-resolution feature _Si will be input into the cascaded convolution layer to generate _S'i , and then brightness adjustment is performed through the following conversion function:

in represents the brightness-adjusted feature, and * indicates pixel-wise multiplication.

7. According to the method for super-resolution reconstruction of face images under low light conditions in claim 6, the brightness refinement is specifically: given the super-resolution result and the original brightness _Bi , using different fully connected layers to generate query Q from _Bi , and from/> Get the key K and value V; then calculate the cross attention between them,

Where F _Att represents the attention map, f _Softmax represents softmax, and d is a hyperparameter;

Then, a feed-forward network is applied to generate the refined result; the super-resolution features are used to refine the brightness estimation in the next step.

8. According to the method for super-resolution reconstruction of face images under low-light conditions in claim 2, it is characterized in that the construction of the detail enhancement model DENet is specifically that the training objective of DENet is as follows:

Where ∈～N(0,1), (x,y) corresponds to I _SR1 and I _HR , γ～p(γ), and f _DENet represents DENet.