
WO2024088061A1 - Face reconstruction and occlusion region recognition method, apparatus and device, and storage medium - Google Patents

Face reconstruction and occlusion region recognition method, apparatus and device, and storage medium

Info

Publication number
WO2024088061A1
WO2024088061A1 (PCT application PCT/CN2023/123840)
Authority
WO
WIPO (PCT)
Prior art keywords
face
information
dimensional
reconstruction
network structure
Prior art date
Application number
PCT/CN2023/123840
Other languages
French (fr)
Chinese (zh)
Inventor
汪叶娇
约翰·尤迪·阿迪库苏马
Original Assignee
广州市百果园信息技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 广州市百果园信息技术有限公司 filed Critical 广州市百果园信息技术有限公司
Publication of WO2024088061A1 publication Critical patent/WO2024088061A1/en


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions

Definitions

  • the embodiments of the present application relate to the field of image processing technology, and in particular to a method, device, equipment and storage medium for face reconstruction and occlusion area recognition.
  • with the development and popularization of virtual reality technology, people are no longer satisfied with ordinary two-dimensional interactions, and have an urgent need for three-dimensional applications such as 3D beautification and 3D face pinching.
  • most 3D face reconstruction applications such as 3D beautification and 3D stylization are deployed in live broadcast, social networking and other scenarios, which places high demands on real-time performance and reconstruction quality.
  • the input data does not always contain a complete face and may exhibit self-occlusion, object occlusion and the like, so the model needs to be robust to varied input data.
  • in the related art, convolutional neural networks are typically used to extract 3D face features and to segment the 2D face for occlusion determination as two separate prediction tasks, which requires more deployed resources and cannot satisfy application scenarios with many parallel tasks and strict real-time requirements.
  • occlusion area determination performed in this way is also poorly robust.
  • moreover, most existing occlusion area determination methods rely on a conventional semantic segmentation task, and their determination accuracy is low.
  • the embodiments of the present application provide a face reconstruction and occlusion area recognition method, apparatus, device and storage medium, which address the above problems in the related art, optimize overall resource deployment, can serve application scenarios with many parallel tasks and high real-time requirements, and determine occluded areas more accurately.
  • an embodiment of the present application provides a method for face reconstruction and occlusion area recognition, the method comprising:
  • inputting a face image into a first network structure, and outputting, through the first network structure, a reconstruction parameter vector of the face image for three-dimensional reconstruction; performing three-dimensional reconstruction of the face image based on the reconstruction parameter vector to generate three-dimensional face information; rendering and mapping the three-dimensional face information to generate two-dimensional face information containing network segmentation data; obtaining a depth feature map extracted by the first network structure, and constructing graph structure information based on the depth feature map and the two-dimensional face information; and inputting the graph structure information into a second network structure, and outputting the occluded area corresponding to the face image through the second network structure.
  • the embodiment of the present application further provides a face reconstruction and occlusion area recognition device, comprising:
  • a parameter vector generation module configured to input a face image into a first network structure, and output a reconstruction parameter vector of the face image when three-dimensional reconstruction is performed through the first network structure;
  • a three-dimensional information determination module configured to perform three-dimensional reconstruction of the face image based on the reconstruction parameter vector to generate three-dimensional face information;
  • a rendering and mapping module configured to render and map the three-dimensional face information to generate two-dimensional face information containing network segmentation data;
  • a graph information construction module configured to obtain a depth feature map extracted by the first network structure, and construct graph structure information based on the depth feature map and the two-dimensional face information;
  • the occlusion area determination module is configured to input the graph structure information into a second network structure, and output the occlusion area corresponding to the face image through the second network structure.
  • the embodiment of the present application further provides a face reconstruction and occlusion area recognition device, the device comprising:
  • one or more processors;
  • a storage device for storing one or more programs,
  • wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the face reconstruction and occlusion area recognition method described in the embodiments of the present application.
  • the embodiments of the present application further provide a non-volatile storage medium storing computer-executable instructions, wherein the computer-executable instructions, when executed by a computer processor, are used to execute the face reconstruction and occlusion area recognition method described in the embodiments of the present application.
  • an embodiment of the present application also provides a computer program product, which includes a computer program stored in a computer-readable storage medium; at least one processor of a device reads and executes the computer program from the computer-readable storage medium, so that the device performs the face reconstruction and occluded area recognition method described in the embodiments of the present application.
  • a face image is input into a first network structure, a reconstruction parameter vector of the face image during three-dimensional reconstruction is output through the first network structure, and then three-dimensional reconstruction of the face image is performed based on the reconstruction parameter vector to generate three-dimensional face information, and the three-dimensional face information is rendered and mapped to generate two-dimensional face information containing network segmentation data.
  • a depth feature map extracted by the first network structure is further obtained, and graph structure information is constructed based on the depth feature map and the two-dimensional face information.
  • the graph structure information is input into a second network structure, and the occlusion area corresponding to the face image is output through the second network structure.
  • the face reconstruction and occlusion area recognition method combines face reconstruction and occlusion area recognition into the same model.
  • the relevant data in the face reconstruction process is used to model the occlusion problem on the two-dimensional plane as the recognition of three-dimensional face occlusion, which is more in line with the way of human perception.
  • because the relevant data of the three-dimensional face is used, it is easier to mine hidden correlated features, thereby obtaining a more accurate occlusion area recognition result.
  • the common characteristics of the face reconstruction and occluded area recognition tasks are fully utilized, so that the two tasks reinforce each other while the model is compressed as much as possible; this optimizes overall resource deployment, can serve application scenarios with many parallel tasks and high real-time requirements, and makes occluded area determination more accurate.
  • FIG. 1 is a flow chart of a face reconstruction and occlusion area recognition method provided by an embodiment of the present application;
  • FIG. 2 is a flow chart of a method for performing three-dimensional reconstruction of a face image based on a reconstruction parameter vector provided by an embodiment of the present application;
  • FIG. 3 is a schematic diagram of the reconstruction and processing of a face image provided by an embodiment of the present application;
  • FIG. 4 is a flow chart of a method for constructing graph structure information based on a depth feature map and two-dimensional face information provided by an embodiment of the present application;
  • FIG. 5 is a schematic diagram of special effects processing of an image provided by an embodiment of the present application;
  • FIG. 6 is a structural block diagram of a face reconstruction and occlusion area recognition device provided by an embodiment of the present application;
  • FIG. 7 is a schematic diagram of the structure of a face reconstruction and occlusion area recognition device provided by an embodiment of the present application.
  • the terms "first", "second", etc. in the specification and claims of this application are used to distinguish similar objects, and are not used to describe a specific order or sequence. It should be understood that data used in this way are interchangeable under appropriate circumstances, so that the embodiments of the present application can be implemented in an order other than those illustrated or described here; the objects distinguished by "first", "second", etc. are generally of one type, and the number of objects is not limited.
  • for example, the first object can be one or more.
  • "and/or" in the specification and claims represents at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the associated objects.
  • the face reconstruction and occlusion area recognition method provided in the embodiments of the present application can be applied to various scenarios that require 3D face reconstruction, and can also accurately identify whether the current face is occluded and the corresponding occlusion area.
  • for example, face images from short videos or live broadcasts are processed to achieve face reconstruction and occlusion area recognition.
  • after the occluded area is identified, makeup, beautification and other special effects processing can further be performed based on the occlusion area recognition result, to guarantee the quality of the special effects.
  • FIG1 is a flow chart of a method for face reconstruction and occlusion area recognition provided by an embodiment of the present application, which specifically includes the following steps:
  • Step S101 input a face image into a first network structure, and output a reconstruction parameter vector of the face image when three-dimensional reconstruction is performed through the first network structure.
  • the face image is an image containing a face area
  • the face image may be an image captured by a camera or a picture input by a user.
  • a face detector may be used to perform relevant face detection and alignment correction on the face image.
  • the first network structure may be a convolutional neural network, such as a mobilenet-v3 network (a lightweight convolutional neural network, the third generation of the mobilenet series), a VGG network, or a backbone network such as resnet.
  • the first network structure is a pre-trained network structure, which takes a two-dimensional image as input during training, and outputs a reconstruction parameter vector for three-dimensional face reconstruction through a series of convolutional layers.
  • the first network structure uses a reconstruction loss function of weakly supervised training during training.
  • the reconstruction parameter vector output by the first network structure exemplarily includes: a face feature vector, a face expression vector and a three-dimensional face coefficient vector.
  • other vectors used for three-dimensional reconstruction may also be included, such as an illumination vector, a reflectivity vector, a pose vector and a translation vector.
  • Step S102 Perform three-dimensional reconstruction of the face image based on the reconstruction parameter vector to generate three-dimensional face information.
  • FIG2 is a flow chart of a method for three-dimensional reconstruction of a face image based on a reconstruction parameter vector provided in an embodiment of the present application, specifically comprising:
  • Step S1021 reconstructing a three-dimensional face point cloud based on the face feature vector, the face expression vector, the three-dimensional face coefficient vector and the preset average face shape information to obtain point cloud reconstruction information.
  • the three-dimensional face reconstruction includes the reconstruction of the three-dimensional face point cloud and the generation of face texture information.
  • the reconstruction is performed based on the face feature vector, the face expression vector, the three-dimensional face coefficient vector in the determined reconstruction parameter vector, and the preset average face shape information.
  • for example, denote the face feature vector by Bid, the face expression vector by Bexp, the components of the three-dimensional face coefficient vector by α, β and δ, and the preset average face shape information by S̄. The point cloud reconstruction information S is then determined by the formula S = S̄ + α·Bid + β·Bexp.
  • Step S1022 reconstruct the facial texture based on the three-dimensional facial coefficient vector and the preset average facial texture information and facial base information to obtain facial texture information.
  • when generating the face texture information, the generation is based on the three-dimensional face coefficient vector in the reconstruction parameter vector determined above, the acquired face base information and the preset average face texture information.
  • the face base information may use the base information of a public face model.
  • for example, with the three-dimensional face coefficient vector including α, β and δ, the face base information denoted by Bt and the preset average face texture information denoted by T̄, the face texture information T is determined by the formula T = T̄ + δ·Bt.
  • Step S1023 Generate three-dimensional face information based on the point cloud reconstruction information and the face texture information.
  • the final three-dimensional face information is generated based on the point cloud reconstruction information and the face texture information.
  • the three-dimensional face information can be obtained by superimposing and fitting the point cloud reconstruction information and the face texture information.
  • Step S103 Render and map the three-dimensional face information to generate two-dimensional face information containing network segmentation data.
  • the rendering may be performed by a renderer that renders the three-dimensional face information into a two-dimensional face image, or by a configured rendering model or another rendering algorithm; the mapping may be performed as follows: based on the topology in the three-dimensional face information, the three-dimensional face point data in the three-dimensional face information is mapped into the two-dimensional face image.
  • FIG. 3 is a schematic diagram of the reconstruction and processing of a face image provided by an embodiment of the present application, in which the original input face image is IT and the image obtained after three-dimensional reconstruction is VR.
  • the image VR can be rendered in two dimensions using a differentiable renderer to obtain a two-dimensional face image IR.
  • combining the topology of the reconstructed mesh VR, the triangular faces over the 1293 three-dimensional face points are projected into the two-dimensional face image IR to generate two-dimensional face information TrR containing network segmentation data.
  • Step S104 Obtain a depth feature map extracted by the first network structure, and construct graph structure information based on the depth feature map and the two-dimensional face information.
  • the depth feature map extracted by the first network structure during the three-dimensional reconstruction process is used, and the graph structure information is constructed based on the depth feature map and the two-dimensional face information generated in step S103, and the occluded area is further identified based on the graph structure information.
  • the depth feature map is the feature map of a layer obtained after the convolutional layers in the first network structure are computed.
  • FIG. 4 is a flow chart of a method for constructing graph structure information based on a depth feature map and two-dimensional face information provided by an embodiment of the present application, specifically comprising:
  • Step S1041 Segment the depth feature map based on the network segmentation data in the two-dimensional face information to obtain a triangular face segmentation result.
  • when segmenting the depth feature map, the same segmentation strategy as the network segmentation in the two-dimensional face information is used.
  • before segmentation, the depth feature map is resized to match the image size in the two-dimensional face information, so that the two are the same size.
  • the same segmentation strategy is then executed to obtain a triangular face segmentation result.
  • Step S1042 construct an adjacency matrix based on the vertex connectivity and point pair distances in the triangle face segmentation result to obtain graph structure information.
  • after the triangular face segmentation result is obtained, the adjacency matrix is constructed using the vertex connectivity of the three-dimensional face mesh itself and the point pair distances between the point pairs after projection onto the two-dimensional plane, thereby completing the construction of the graph structure information. For example, take the construction of the adjacency matrix A ∈ R^(N×N) as an example, where N represents the number of vertices of the three-dimensional face point cloud (1293 vertices in one embodiment); the matrix entries are constructed as Aij = Cij * Dij.
  • Cij indicates whether the two vertices are connected, taking the value 0 if they are not connected and 1 otherwise.
  • Dij represents the point pair distance between the two vertices after projection onto the two-dimensional plane.
  • Step S105 input the graph structure information into a second network structure, and output the occluded area corresponding to the face image through the second network structure.
  • the second network structure may be a graph convolutional neural network
  • the graph structure information generated in step S104 is input into the second network structure, and the occlusion area corresponding to the face image is output through the second network structure.
  • the three-dimensional face vertices in the graph structure information may be classified through a graph convolutional neural network, and the occlusion area corresponding to the face image is output based on the classification result.
  • the framework of the second network structure mainly uses a graph attention network, which integrates and classifies the three-dimensional face vertices and finally outputs a segmentation mask MR of the occlusion area.
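  • As an illustration only (the embodiments do not disclose the exact attention architecture), a minimal single-head graph attention layer over the face vertices could look as follows; all feature dimensions here are assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphAttentionLayer(nn.Module):
    """Minimal single-head graph attention layer; adj is the (N, N) adjacency matrix Aij."""
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)
        self.a = nn.Linear(2 * out_dim, 1, bias=False)

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        z = self.W(h)                                   # (N, out_dim)
        n = z.size(0)
        zi = z.unsqueeze(1).expand(n, n, -1)            # features of vertex i, broadcast
        zj = z.unsqueeze(0).expand(n, n, -1)            # features of vertex j, broadcast
        e = F.leaky_relu(self.a(torch.cat([zi, zj], dim=-1)).squeeze(-1))
        e = e.masked_fill(adj == 0, float("-inf"))      # attend only along graph edges
        attn = torch.softmax(e, dim=-1)                 # attention over each vertex's neighbours
        return F.elu(attn @ z)

# Toy usage: per-vertex features -> per-vertex occlusion logits (the real model uses N = 1293).
N, FEAT = 300, 64                                       # assumed sizes for the demo
gat = GraphAttentionLayer(FEAT, 32)
classifier = nn.Linear(32, 2)                           # occluded / not occluded
h = torch.randn(N, FEAT)
adj = (torch.rand(N, N) > 0.97).float()
adj.fill_diagonal_(1.0)                                 # keep self-edges so each softmax row is defined
logits = classifier(gat(h, adj))
print(logits.shape)                                     # torch.Size([300, 2])
```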
  • the second network structure uses a supervised segmentation loss function during training, for example a dice loss function that supervises the predicted occlusion mask against the ground-truth mask.
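  • For reference, a common formulation of such a dice loss is sketched below; the smoothing constant eps is an assumption:

```python
import torch

def dice_loss(pred_mask: torch.Tensor, true_mask: torch.Tensor, eps: float = 1.0) -> torch.Tensor:
    """Dice loss between a predicted occlusion mask (probabilities in [0, 1]) and the ground-truth mask."""
    pred, true = pred_mask.flatten(), true_mask.flatten()
    intersection = (pred * true).sum()
    return 1.0 - (2.0 * intersection + eps) / (pred.sum() + true.sum() + eps)

pred = torch.sigmoid(torch.randn(1293))   # per-vertex occlusion probabilities
true = (torch.rand(1293) > 0.8).float()   # toy ground-truth occlusion labels
print(dice_loss(pred, true).item())
```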
  • in the embodiments of the present application, a face image is input into the first network structure, the reconstruction parameter vector of the face image for three-dimensional reconstruction is output through the first network structure, the face image is then reconstructed in three dimensions based on the reconstruction parameter vector to generate three-dimensional face information, and the three-dimensional face information is rendered and mapped to generate two-dimensional face information containing network segmentation data.
  • the depth feature map extracted by the first network structure is further obtained, and the graph structure information is constructed based on the depth feature map and the two-dimensional face information.
  • the graph structure information is input into the second network structure, and the occlusion area corresponding to the face image is output through the second network structure.
  • the face reconstruction and occlusion area recognition method combines face reconstruction and occlusion area recognition into the same model.
  • the relevant data in the face reconstruction process is used to model the occlusion problem on the two-dimensional plane as the recognition of three-dimensional face occlusion, which is more in line with the way of human perception.
  • because the relevant data of the three-dimensional face is used, it is easier to mine hidden correlated features, thereby obtaining a more accurate occlusion area recognition result.
  • the common characteristics of the face reconstruction and occluded area recognition tasks are fully utilized, so that the two tasks reinforce each other while the model is compressed as much as possible; this optimizes overall resource deployment, can serve application scenarios with many parallel tasks and high real-time requirements, and makes occluded area determination more accurate.
  • face reconstruction and occluded area recognition are integrated into one model, and feature extraction is performed for the two related tasks at the same time, thereby solving the pain point that traditional solutions cannot mine the hidden correlated features of the two tasks, compressing the model size, and achieving better reconstruction and segmentation results.
  • Figure 5 is a schematic diagram of special effect processing of an image provided by an embodiment of the present application.
  • the face image IT is the original input image
  • the image MR is the image output after the occlusion area is identified
  • the image IR is the image corresponding to the two-dimensional face information after rendering and mapping.
  • the three are processed together to finally obtain the image IF.
  • the occluded area can be rendered using the occluder from the original image, and the part of the face image outside the occluded area can be rendered based on the three-dimensional face information.
  • the black part of the image MR represents the unoccluded area, indicating that the original image is not occluded there; in the final rendering, this area should be rendered from the reconstructed image IR. Conversely, the white area of MR represents the area of the original input image that is occluded, for which the reconstructed image should not be rendered. Since three-dimensional face reconstruction reconstructs a complete face regardless of whether the subject's face is occluded, for cases such as the hand self-occlusion and sunglasses occlusion in the example of FIG. 5, only the whole face is reconstructed, as shown in the image IR, and no occluders are reconstructed.
  • the occluder is therefore identified using the area predicted by the mask MR, and for the occluded part the original input image is used in the final rendering.
  • in the final rendered image IF, the parts occluded by the sunglasses and hands are rendered using the sunglasses and hands from the image IT, thereby avoiding the problem of being unable to render the occluding objects or, when three-dimensional makeup post-processing is applied, of mistakenly rendering the makeup onto the occluders.
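  • A sketch of this final compositing follows; the mask convention (1 marks occluded pixels that keep the original input IT, 0 marks pixels rendered from the reconstruction IR) is an assumption for illustration:

```python
import numpy as np

def composite(i_t: np.ndarray, i_r: np.ndarray, m_r: np.ndarray) -> np.ndarray:
    """Final image IF: occluded pixels keep the input IT, the rest come from the reconstruction IR.

    i_t, i_r: (H, W, 3) images; m_r: (H, W) mask, 1 where the face is occluded (assumed convention).
    """
    m = m_r[..., None].astype(np.float32)
    return m * i_t + (1.0 - m) * i_r

h, w = 224, 224
i_f = composite(np.zeros((h, w, 3)), np.ones((h, w, 3)), np.zeros((h, w)))
print(i_f.shape)  # (224, 224, 3)
```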
  • FIG6 is a structural block diagram of a face reconstruction and occlusion region identification device provided in an embodiment of the present application.
  • the device is used to execute the face reconstruction and occlusion region identification method provided in the above embodiment, and has the corresponding functional modules and beneficial effects of the execution method.
  • the device specifically includes: a parameter vector generation module 101, a three-dimensional information determination module 102, a rendering mapping module 103, a graph information construction module 104 and an occlusion region determination module 105, wherein:
  • the parameter vector generation module 101 is configured to input a face image into a first network structure, and output a reconstruction parameter vector of the face image when performing three-dimensional reconstruction through the first network structure;
  • a three-dimensional information determination module 102 configured to perform three-dimensional reconstruction of the face image based on the reconstruction parameter vector to generate three-dimensional face information
  • a rendering and mapping module 103 is configured to render and map the three-dimensional face information to generate two-dimensional face information including network segmentation data;
  • a graph information construction module 104 configured to obtain a depth feature graph extracted by the first network structure, and construct graph structure information based on the depth feature graph and the two-dimensional face information;
  • the occlusion region determination module 105 is configured to input the graph structure information into a second network structure, and output the occlusion region corresponding to the face image through the second network structure.
  • the reconstruction parameter vector of the face image for three-dimensional reconstruction is output through the first network structure, three-dimensional reconstruction of the face image is then performed based on the reconstruction parameter vector to generate three-dimensional face information, and the three-dimensional face information is rendered and mapped to generate two-dimensional face information containing network segmentation data.
  • the depth feature map extracted by the first network structure is further obtained, and the graph structure information is constructed based on the depth feature map and the two-dimensional face information.
  • the graph structure information is input into the second network structure, and the occlusion area corresponding to the face image is output through the second network structure.
  • the face reconstruction and occlusion area recognition method provided by this scheme merges face reconstruction and occlusion area recognition into the same model.
  • the relevant data in the face reconstruction process is used to model the occlusion problem on the two-dimensional plane as the recognition of three-dimensional face occlusion, which is more in line with the way of human perception.
  • because the relevant data of the three-dimensional face is used, it is easier to mine the hidden correlated features, so as to obtain a more accurate occlusion area recognition result.
  • the common characteristics of the two tasks of face reconstruction and occlusion area recognition are fully utilized, so that the two tasks promote each other while the model is compressed as much as possible; this optimizes overall resource deployment, can serve application scenarios with many parallel tasks and high real-time requirements, and makes occlusion area determination more accurate.
  • the reconstruction parameter vector includes a face feature vector, a face expression vector and a three-dimensional face coefficient vector
  • the three-dimensional information determination module 102 is configured as follows:
  • the rendering mapping module 103 is configured as follows:
  • the three-dimensional face point pair data in the three-dimensional face information is mapped to the two-dimensional face image to generate two-dimensional face information containing network segmentation data.
  • the graph information construction module 104 is configured as follows:
  • the graph structure information is obtained by constructing an adjacency matrix based on the vertex connectivity and point pair distances in the triangle face segmentation result.
  • the occlusion area determination module 105 is configured as follows:
  • the occluded area corresponding to the face image is output based on the classification result.
  • the device further includes a special effect processing module configured to:
  • the special effect processing module is configured as follows:
  • the occluded area is rendered using the occluder from the original image, and the part of the face image outside the occluded area is rendered based on the three-dimensional face information.
  • FIG7 is a schematic diagram of the structure of a face reconstruction and occlusion area recognition device provided by an embodiment of the present application.
  • the device includes a processor 201, a memory 202, an input device 203, and an output device 204; the number of processors 201 in the device can be one or more, and FIG7 takes one processor 201 as an example; the processor 201, the memory 202, the input device 203, and the output device 204 in the device can be connected by a bus or other means, and FIG7 takes the connection by a bus as an example.
  • the memory 202 can be used to store software programs, computer executable programs, and modules, such as program instructions/modules corresponding to the face reconstruction and occlusion area recognition method in the embodiment of the present application.
  • the processor 201 executes various functional applications and data processing of the device by running the software programs, instructions and modules stored in the memory 202, that is, implementing the above-mentioned face reconstruction and occlusion area recognition method.
  • the input device 203 can be used to receive input digital or character information, and generate key signal input related to the user settings and function control of the device.
  • the output device 204 may include a display device such as a display screen.
  • the embodiment of the present application further provides a non-volatile storage medium containing computer executable instructions, wherein the computer executable instructions are used to execute a face reconstruction and occlusion area recognition method described in the above embodiment when executed by a computer processor, including:
  • inputting a face image into a first network structure, and outputting, through the first network structure, a reconstruction parameter vector of the face image for three-dimensional reconstruction; performing three-dimensional reconstruction of the face image based on the reconstruction parameter vector to generate three-dimensional face information; rendering and mapping the three-dimensional face information to generate two-dimensional face information containing network segmentation data; obtaining a depth feature map extracted by the first network structure, and constructing graph structure information based on the depth feature map and the two-dimensional face information; and inputting the graph structure information into a second network structure, and outputting the occluded area corresponding to the face image through the second network structure.
  • various aspects of the method provided by the present application may also be implemented in the form of a program product, which includes a program code.
  • when the program product is run on a computer device, the program code is used to enable the computer device to execute the steps of the methods according to the various exemplary embodiments of the present application described above in this specification.
  • the computer device may execute the face reconstruction and occlusion area recognition method recorded in the embodiment of the present application.
  • the program product may be implemented in any combination of one or more readable media.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Human Computer Interaction (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of the present application provide a face reconstruction and occlusion region recognition method, apparatus and device, and a storage medium. The method comprises: inputting a face image into a first network structure, and outputting a reconstruction parameter vector of the face image for three-dimensional reconstruction by means of the first network structure; performing three-dimensional reconstruction of the face image on the basis of the reconstruction parameter vector to generate three-dimensional face information; performing rendering and mapping processing on the three-dimensional face information to generate two-dimensional face information comprising network segmentation data; acquiring a depth feature map extracted by the first network structure, and constructing graph structure information on the basis of the depth feature map and the two-dimensional face information; and inputting the graph structure information into a second network structure, and outputting an occlusion region corresponding to the face image by means of the second network structure. The present solution optimizes overall resource deployment, can meet application scenarios having multiple parallel tasks and high real-time requirements, and achieves higher accuracy in occlusion region determination.

Description

Face reconstruction and occlusion area recognition method, apparatus, device and storage medium

This application claims priority to the Chinese patent application No. 202211328264.0, filed with the China Patent Office on October 27, 2022, the entire contents of which are incorporated into this application by reference.

Technical Field

The embodiments of the present application relate to the field of image processing technology, and in particular to a face reconstruction and occlusion area recognition method, apparatus, device and storage medium.

Background

With the development and popularization of virtual reality technology, people are no longer satisfied with ordinary two-dimensional interactions, and have an urgent need for three-dimensional applications such as 3D beautification and 3D face pinching. Currently, most 3D face reconstruction applications such as 3D beautification and 3D stylization are deployed in live broadcast, social networking and other scenarios, which places high demands on real-time performance and reconstruction quality. Because the actual production environment is much more complicated than expected, the input data does not always contain a complete face and may exhibit self-occlusion, object occlusion and the like, so the model needs to be robust to varied input data. In addition, it is often necessary to determine the actual occluded area in order to perform a series of post-processing operations.

In the related art, convolutional neural networks are typically used to extract 3D face features and to segment the 2D face for occlusion determination as two separate prediction tasks. This requires more deployed resources and cannot satisfy application scenarios with many parallel tasks and strict real-time requirements, and occlusion determination performed this way is poorly robust. Moreover, most existing occlusion determination methods rely on a conventional semantic segmentation task, and their determination accuracy is low.

Summary of the Invention

The embodiments of the present application provide a face reconstruction and occlusion area recognition method, apparatus, device and storage medium, which address the above problems in the related art, optimize overall resource deployment, can serve application scenarios with many parallel tasks and high real-time requirements, and determine occluded areas more accurately.

In a first aspect, an embodiment of the present application provides a face reconstruction and occlusion area recognition method, the method comprising:

inputting a face image into a first network structure, and outputting, through the first network structure, a reconstruction parameter vector of the face image for three-dimensional reconstruction;

performing three-dimensional reconstruction of the face image based on the reconstruction parameter vector to generate three-dimensional face information;

rendering and mapping the three-dimensional face information to generate two-dimensional face information containing network segmentation data;

obtaining a depth feature map extracted by the first network structure, and constructing graph structure information based on the depth feature map and the two-dimensional face information;

inputting the graph structure information into a second network structure, and outputting the occluded area corresponding to the face image through the second network structure.

In a second aspect, the embodiments of the present application further provide a face reconstruction and occlusion area recognition apparatus, comprising:

a parameter vector generation module, configured to input a face image into a first network structure, and output, through the first network structure, a reconstruction parameter vector of the face image for three-dimensional reconstruction;

a three-dimensional information determination module, configured to perform three-dimensional reconstruction of the face image based on the reconstruction parameter vector to generate three-dimensional face information;

a rendering and mapping module, configured to render and map the three-dimensional face information to generate two-dimensional face information containing network segmentation data;

a graph information construction module, configured to obtain a depth feature map extracted by the first network structure, and construct graph structure information based on the depth feature map and the two-dimensional face information;

an occlusion area determination module, configured to input the graph structure information into a second network structure, and output the occluded area corresponding to the face image through the second network structure.

In a third aspect, the embodiments of the present application further provide a face reconstruction and occlusion area recognition device, the device comprising:

one or more processors;

a storage device for storing one or more programs,

wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the face reconstruction and occlusion area recognition method described in the embodiments of the present application.

In a fourth aspect, the embodiments of the present application further provide a non-volatile storage medium storing computer-executable instructions, wherein the computer-executable instructions, when executed by a computer processor, are used to execute the face reconstruction and occlusion area recognition method described in the embodiments of the present application.

In a fifth aspect, the embodiments of the present application further provide a computer program product, which includes a computer program stored in a computer-readable storage medium; at least one processor of a device reads and executes the computer program from the computer-readable storage medium, so that the device performs the face reconstruction and occlusion area recognition method described in the embodiments of the present application.

In the embodiments of the present application, a face image is input into a first network structure, a reconstruction parameter vector of the face image for three-dimensional reconstruction is output through the first network structure, three-dimensional reconstruction of the face image is then performed based on the reconstruction parameter vector to generate three-dimensional face information, and the three-dimensional face information is rendered and mapped to generate two-dimensional face information containing network segmentation data. A depth feature map extracted by the first network structure is further obtained, graph structure information is constructed based on the depth feature map and the two-dimensional face information, the graph structure information is input into a second network structure, and the occluded area corresponding to the face image is output through the second network structure. The face reconstruction and occlusion area recognition method provided by this scheme merges face reconstruction and occlusion area recognition into the same model. When recognizing the occluded area, the relevant data from the face reconstruction process is used to model the occlusion problem on the two-dimensional plane as the recognition of three-dimensional face occlusion, which better matches human perception; at the same time, because the relevant data of the three-dimensional face is used, it is easier to mine hidden correlated features, thereby obtaining a more accurate occlusion area recognition result. In addition, the common characteristics of the face reconstruction and occlusion area recognition tasks are fully exploited, so that the two tasks reinforce each other while the model is compressed as much as possible; this optimizes overall resource deployment, can serve application scenarios with many parallel tasks and high real-time requirements, and makes occluded area determination more accurate.

Brief Description of the Drawings

FIG. 1 is a flow chart of a face reconstruction and occlusion area recognition method provided by an embodiment of the present application;

FIG. 2 is a flow chart of a method for three-dimensional reconstruction of a face image based on a reconstruction parameter vector provided by an embodiment of the present application;

FIG. 3 is a schematic diagram of the reconstruction and processing of a face image provided by an embodiment of the present application;

FIG. 4 is a flow chart of a method for constructing graph structure information based on a depth feature map and two-dimensional face information provided by an embodiment of the present application;

FIG. 5 is a schematic diagram of special effects processing of an image provided by an embodiment of the present application;

FIG. 6 is a structural block diagram of a face reconstruction and occlusion area recognition apparatus provided by an embodiment of the present application;

FIG. 7 is a schematic diagram of the structure of a face reconstruction and occlusion area recognition device provided by an embodiment of the present application.

Detailed Description

The embodiments of the present application are further described in detail below in conjunction with the accompanying drawings. It should be understood that the specific embodiments described herein are only used to explain the embodiments of the present application, rather than to limit them. It should also be noted that, for ease of description, only the parts related to the embodiments of the present application, rather than the entire structure, are shown in the accompanying drawings.

The terms "first", "second", etc. in the specification and claims of this application are used to distinguish similar objects, and are not used to describe a specific order or sequence. It should be understood that data used in this way are interchangeable under appropriate circumstances, so that the embodiments of the present application can be implemented in an order other than those illustrated or described here; the objects distinguished by "first", "second", etc. are generally of one type, and the number of objects is not limited. For example, the first object can be one or more. In addition, "and/or" in the specification and claims represents at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the associated objects.

The face reconstruction and occlusion area recognition method provided in the embodiments of the present application can be applied to all kinds of scenarios that require 3D face reconstruction, and can also accurately identify whether the current face is occluded and the corresponding occluded area, for example, processing face images from short videos or live broadcasts to achieve face reconstruction and occlusion area recognition. After the occluded area is identified, makeup, beautification and other special effects processing can further be performed based on the occlusion area recognition result, to guarantee the quality of the special effects.

FIG. 1 is a flow chart of a face reconstruction and occlusion area recognition method provided by an embodiment of the present application, which specifically includes the following steps:

Step S101: input a face image into a first network structure, and output, through the first network structure, a reconstruction parameter vector of the face image for three-dimensional reconstruction.

In one embodiment, the face image is an image containing a face area; it may, for example, be an image captured by a camera or a picture input by a user. Optionally, before the face image is input into the first network structure, a face detector may be used to perform face detection and alignment correction on the face image.

In one embodiment, the first network structure may be a convolutional neural network, such as a mobilenet-v3 network (a lightweight convolutional neural network, the third generation of the mobilenet series), a VGG network, or a backbone network such as resnet. The first network structure is a pre-trained network structure; during training it takes a two-dimensional picture as input and outputs, through a series of convolutional layers, a reconstruction parameter vector for three-dimensional face reconstruction. Optionally, the first network structure is trained with a weakly supervised reconstruction loss function.

In one embodiment, after the face image is input into the first network structure, the reconstruction parameter vector output by the first network structure exemplarily includes: a face feature vector, a face expression vector and a three-dimensional face coefficient vector. Other vectors used for three-dimensional reconstruction may also be included, such as an illumination vector, a reflectivity vector, a pose vector and a translation vector.
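Purely as an illustration of how such a parameter vector can be consumed downstream (the slice sizes below are assumptions, not values disclosed in this embodiment), the backbone output can be treated as one flat vector that is sliced into named parts:

```python
import numpy as np

# Hypothetical sizes for the slices of the reconstruction parameter vector;
# the embodiment does not fix these dimensions, so they are assumptions.
SLICES = {
    "alpha": 80,         # identity-related coefficients (alpha)
    "beta": 64,          # expression-related coefficients (beta)
    "delta": 80,         # texture-related coefficients (delta)
    "illumination": 27,  # e.g. 9 spherical-harmonics coefficients per RGB channel
    "pose": 3,           # rotation / pose vector
    "translation": 3,    # translation vector
}

def split_parameter_vector(x: np.ndarray) -> dict:
    """Slice the backbone's flat output vector into named reconstruction parameters."""
    out, start = {}, 0
    for name, size in SLICES.items():
        out[name] = x[start:start + size]
        start += size
    return out

params = split_parameter_vector(np.random.randn(sum(SLICES.values())))
print({k: v.shape for k, v in params.items()})
```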

Step S102: perform three-dimensional reconstruction of the face image based on the reconstruction parameter vector to generate three-dimensional face information.

In one embodiment, after the reconstruction parameter vector corresponding to the input face image is obtained through the first network structure, three-dimensional reconstruction of the face image is performed based on the reconstruction parameter vector to obtain three-dimensional face information. An exemplary three-dimensional reconstruction process is shown in FIG. 2, which is a flow chart of a method for three-dimensional reconstruction of a face image based on a reconstruction parameter vector provided by an embodiment of the present application, specifically comprising:

Step S1021: reconstruct a three-dimensional face point cloud based on the face feature vector, the face expression vector, the three-dimensional face coefficient vector and the preset average face shape information, to obtain point cloud reconstruction information.

In one embodiment, the three-dimensional face reconstruction includes the reconstruction of the three-dimensional face point cloud and the generation of face texture information. Optionally, the three-dimensional face point cloud is reconstructed based on the face feature vector, the face expression vector and the three-dimensional face coefficient vector in the determined reconstruction parameter vector, together with the preset average face shape information.

For example, denote the face feature vector by Bid, the face expression vector by Bexp, the components of the three-dimensional face coefficient vector by α, β and δ, and the preset average face shape information by S̄. The point cloud reconstruction information S is then computed with the following formula:

S = S̄ + α·Bid + β·Bexp

Step S1022: reconstruct the face texture based on the three-dimensional face coefficient vector, the preset average face texture information and the face base information, to obtain face texture information.

In one embodiment, the face texture information is generated based on the three-dimensional face coefficient vector in the reconstruction parameter vector determined above, the acquired face base information and the preset average face texture information. Optionally, the face base information may use the base information of a public face model.

For example, with the three-dimensional face coefficient vector including α, β and δ, the face base information denoted by Bt and the preset average face texture information denoted by T̄, the face texture information T is computed with the following formula:

T = T̄ + δ·Bt

Step S1023: generate three-dimensional face information based on the point cloud reconstruction information and the face texture information.

In one embodiment, after the point cloud reconstruction information and the face texture information are obtained, the final three-dimensional face information is generated based on them; for example, the three-dimensional face information can be obtained by superimposing and fitting the point cloud reconstruction information and the face texture information.
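The following minimal numpy sketch ties steps S1021 to S1023 together under the formulas above; the basis matrices and coefficient dimensions are placeholders standing in for a real face model, not values disclosed in this embodiment:

```python
import numpy as np

N = 1293                          # number of vertices in the 3D face point cloud
K_ID, K_EXP, K_TEX = 80, 64, 80   # assumed coefficient dimensions

# Placeholder model data; in practice these would come from a public face model.
S_bar = np.zeros((3 * N,))             # preset average face shape (S̄)
T_bar = np.zeros((3 * N,))             # preset average face texture (T̄), RGB per vertex
B_id = np.random.randn(3 * N, K_ID)    # identity basis (Bid)
B_exp = np.random.randn(3 * N, K_EXP)  # expression basis (Bexp)
B_t = np.random.randn(3 * N, K_TEX)    # texture basis (Bt)

def reconstruct(alpha: np.ndarray, beta: np.ndarray, delta: np.ndarray):
    # Step S1021: point cloud reconstruction  S = S̄ + Bid·alpha + Bexp·beta
    S = S_bar + B_id @ alpha + B_exp @ beta
    # Step S1022: texture reconstruction      T = T̄ + Bt·delta
    T = T_bar + B_t @ delta
    # Step S1023: combine into 3D face information (vertex positions + per-vertex color)
    return S.reshape(N, 3), T.reshape(N, 3)

verts, colors = reconstruct(np.zeros(K_ID), np.zeros(K_EXP), np.zeros(K_TEX))
print(verts.shape, colors.shape)  # (1293, 3) (1293, 3)
```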

Step S103: render and map the three-dimensional face information to generate two-dimensional face information containing network segmentation data.

In one embodiment, after the three-dimensional face information is obtained, it is further rendered and mapped to generate two-dimensional face information, and the generated two-dimensional face information contains network segmentation data. Optionally, the rendering may be performed by a renderer that renders the three-dimensional face information into a two-dimensional face image, or by a configured rendering model or another rendering algorithm; the mapping may be performed as follows: based on the topology in the three-dimensional face information, the three-dimensional face point data in the three-dimensional face information is mapped into the two-dimensional face image. Exemplarily, as shown in FIG. 3, a schematic diagram of the reconstruction and processing of a face image provided by an embodiment of the present application, the original input face image is IT and the image obtained after three-dimensional reconstruction is VR; the image VR can be rendered in two dimensions with a differentiable renderer to obtain a two-dimensional face image IR, and then, combining the topology of the reconstructed mesh VR, the triangular faces over the 1293 three-dimensional face points are projected into the two-dimensional face image IR to generate two-dimensional face information TrR containing network segmentation data.
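The embodiment performs this step with a differentiable renderer; the sketch below only illustrates the mapping idea with an assumed pinhole camera, projecting the mesh vertices and carrying the triangle topology into image space as per-triangle 2D coordinates:

```python
import numpy as np

def project_vertices(verts: np.ndarray, f: float = 500.0, cx: float = 112.0, cy: float = 112.0) -> np.ndarray:
    """Pinhole projection of (N, 3) camera-space vertices to (N, 2) pixels; intrinsics are assumed."""
    x, y, z = verts[:, 0], verts[:, 1], verts[:, 2]
    z = np.maximum(z, 1e-6)          # guard against division by zero
    u = f * x / z + cx
    v = f * y / z + cy
    return np.stack([u, v], axis=1)

def project_triangles(verts: np.ndarray, triangles: np.ndarray) -> np.ndarray:
    """triangles: (M, 3) vertex indices from the mesh topology of V_R.

    Returns (M, 3, 2): each row holds the three projected (u, v) corners of one
    triangular face, i.e. the 2D segmentation data Tr_R in this toy setting.
    """
    pts2d = project_vertices(verts)
    return pts2d[triangles]

verts = np.random.rand(1293, 3) + np.array([0.0, 0.0, 2.0])  # toy camera-space mesh
tris = np.random.randint(0, 1293, size=(100, 3))             # toy topology
print(project_triangles(verts, tris).shape)                   # (100, 3, 2)
```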

步骤S104、获取所述第一网络结构提取的深度特征图,基于所述深度特征图以及所述二维人脸信息构造图结构信息。Step S104: Obtain a depth feature map extracted by the first network structure, and construct graph structure information based on the depth feature map and the two-dimensional face information.

在一个实施例中，进行遮挡区域的识别时，利用三维重建过程中第一网络结构提取的深度特征图，基于该深度特征图以及步骤S103中生成的二维人脸信息构造图结构信息，后续进一步基于该图结构信息进行遮挡区域的识别。其中，该深度特征图为第一网络结构中卷积层计算后输出的特征图。In one embodiment, when identifying the occluded area, the depth feature map extracted by the first network structure during the three-dimensional reconstruction is used, and the graph structure information is constructed based on this depth feature map and the two-dimensional face information generated in step S103; the occluded area is then identified based on the graph structure information. The depth feature map is the feature map output by a convolutional layer of the first network structure.
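
One common way to reuse an intermediate feature map from an existing backbone is a forward hook, sketched below in PyTorch. The layer name and module layout are assumptions; the patent only states that the map comes from a convolutional layer of the first network structure:

```python
import torch

def capture_feature_map(backbone, layer_name, image):
    """Grab the intermediate feature map of a chosen convolutional layer.

    backbone:   the first network structure, any torch.nn.Module
    layer_name: name of the conv layer to tap (assumed identifier)
    image:      (1, 3, H, W) input face image tensor
    """
    feats = {}

    def hook(_module, _inp, out):
        feats["map"] = out.detach()

    handle = dict(backbone.named_modules())[layer_name].register_forward_hook(hook)
    backbone(image)        # the normal forward pass also yields the 3DMM coefficients
    handle.remove()
    return feats["map"]    # (1, C, h, w) depth feature map
```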

可选的,利用该深度特征图以及二维人脸信息构造图结构信息的方式示例 性的如图4所示,图4为本申请实施例提供的一种基于深度特征图以及二维人脸信息构造图结构信息的方法的流程图,具体包括:Optionally, an example of a method of constructing graph structure information using the depth feature map and two-dimensional face information As shown in FIG. 4 , FIG. 4 is a flow chart of a method for constructing graph structure information based on a depth feature map and two-dimensional face information provided by an embodiment of the present application, specifically comprising:

步骤S1041、基于二维人脸信息中的网络分割数据对深度特征图进行分割处理得到三角片面分割结果。Step S1041: Segment the depth feature map based on the network segmentation data in the two-dimensional face information to obtain a triangular face segmentation result.

其中，在对深度特征图进行分割时，采用与二维人脸信息中的网络分割相同的分割策略进行分割。其中，在进行分割时进一步包括将深度特征图的尺寸调节至与二维人脸信息中的图像尺寸一致，以使二者大小相同，此时执行相同的分割策略得到三角片面分割结果。When segmenting the depth feature map, the same segmentation strategy as the network segmentation in the two-dimensional face information is used. The segmentation further includes resizing the depth feature map to match the image size in the two-dimensional face information so that the two are the same size; the same segmentation strategy is then executed to obtain the triangular face segmentation result.
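
The patent segments the resized feature map with the same triangle layout as the 2D face information; a common simplification, used in the sketch below, is to resize the map and sample one feature vector per projected vertex. The pooling scheme, sampling mode, and names are assumptions:

```python
import torch
import torch.nn.functional as F

def pool_vertex_features(feature_map, points_2d, image_size):
    """Resize the depth feature map to the rendered image size and sample
    a per-vertex feature vector at each projected vertex location.

    feature_map: (1, C, h, w) tapped from the first network structure
    points_2d:   (N, 2) projected vertex coordinates in pixels (float tensor)
    image_size:  (H, W) size of the rendered two-dimensional face image
    """
    H, W = image_size
    fmap = F.interpolate(feature_map, size=(H, W), mode="bilinear",
                         align_corners=True)
    # Normalize pixel coordinates to [-1, 1] for grid_sample.
    grid = points_2d.clone().float()
    grid[:, 0] = grid[:, 0] / (W - 1) * 2 - 1
    grid[:, 1] = grid[:, 1] / (H - 1) * 2 - 1
    grid = grid.view(1, -1, 1, 2)                        # (1, N, 1, 2)
    sampled = F.grid_sample(fmap, grid, align_corners=True)
    return sampled.squeeze(-1).squeeze(0).T              # (N, C) per-vertex features
```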

步骤S1042、依据三角片面分割结果中的顶点联通关系以及点对距离进行邻接矩阵的构造得到图结构信息。Step S1042: construct an adjacency matrix based on the vertex connectivity and point pair distances in the triangle face segmentation result to obtain graph structure information.

其中，在得到三角片面分割结果后，利用三维人脸网格本身的顶点联通关系及投影到二维平面后点对之间的点对距离构造邻接矩阵，从而完成图结构信息的构建。示例性的，以构造邻接矩阵A_ij∈ℝ^{N×N}为例，其中N代表三维人脸点云的顶点个数，在一个实施例中使用1293个顶点，具体的基于顶点联通关系以及点对距离进行邻接矩阵的构造方式为：

A_ij = C_ij · D_ij

After obtaining the triangular face segmentation result, the adjacency matrix is constructed using the vertex connectivity of the three-dimensional face mesh itself and the point-pair distances between the vertices after projection onto the two-dimensional plane, thereby completing the construction of the graph structure information. For example, taking the construction of the adjacency matrix A_ij ∈ ℝ^{N×N} as an example, where N is the number of vertices of the three-dimensional face point cloud (1293 vertices in one embodiment), the adjacency matrix is constructed from the vertex connectivity and the point-pair distances as:

A_ij = C_ij · D_ij

其中，C_ij代表两个顶点是否联通，如果不联通取值为0，反之取值为1；D_ij表示投影到二维平面后两个顶点之间的点对距离。Here, C_ij indicates whether two vertices are connected: it is 0 if they are not connected and 1 otherwise; D_ij is the point-pair distance between the two vertices after projection onto the two-dimensional plane.
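
The formula A_ij = C_ij · D_ij translates directly into code. The sketch below derives C from the mesh triangles and D from the projected coordinates; only the function name and argument layout are assumed:

```python
import numpy as np

def build_adjacency(triangles, points_2d, num_vertices):
    """Construct A_ij = C_ij * D_ij from mesh connectivity and the
    point-pair distances of vertices projected onto the 2D plane.

    triangles:    (M, 3) vertex indices of the face mesh
    points_2d:    (N, 2) projected vertex coordinates
    num_vertices: N (e.g. 1293 in the described embodiment)
    """
    C = np.zeros((num_vertices, num_vertices))
    for a, b, c in triangles:                  # edges of each triangle
        for i, j in ((a, b), (b, c), (a, c)):
            C[i, j] = C[j, i] = 1.0            # C_ij = 1 if connected
    # D_ij: Euclidean distance between projected vertices.
    diff = points_2d[:, None, :] - points_2d[None, :, :]
    D = np.linalg.norm(diff, axis=-1)
    return C * D                               # A_ij = C_ij * D_ij
```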

步骤S105、将所述图结构信息输入至第二网络结构,通过所述第二网络结构输出所述人脸图像对应的遮挡区域。Step S105: input the graph structure information into a second network structure, and output the occluded area corresponding to the face image through the second network structure.

在一个实施例中，该第二网络结构可以是图卷积神经网络，将步骤S104中生成的图结构信息输入至该第二网络结构，通过该第二网络结构输出人脸图像对应的遮挡区域。可选的，可以是通过图卷积神经网络对图结构信息中的三维人脸顶点进行分类，基于分类结果输出人脸图像对应的遮挡区域。可选的，该第二网络结构的框架主要使用图注意力网络，其通过对三维人脸顶点进行整合分类，最终输出遮挡区域的分割mask M_R。In one embodiment, the second network structure may be a graph convolutional neural network; the graph structure information generated in step S104 is input into this second network structure, which outputs the occluded area corresponding to the face image. Optionally, the three-dimensional face vertices in the graph structure information may be classified by the graph convolutional neural network, and the occluded area corresponding to the face image is output based on the classification result. Optionally, the framework of the second network structure mainly uses a graph attention network, which integrates and classifies the three-dimensional face vertices and finally outputs a segmentation mask M_R of the occluded area.
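 
A minimal sketch of such a vertex classifier follows: a single-head graph attention layer over the dense adjacency A_ij, stacked into a two-layer network that labels each vertex as occluded or visible. Layer widths, head count, and the two-class output are assumptions, not the patent's architecture:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseGATLayer(nn.Module):
    """Single-head graph attention layer over a dense adjacency matrix."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim, bias=False)
        self.att = nn.Linear(2 * out_dim, 1, bias=False)

    def forward(self, x, adj):
        # x: (N, in_dim) per-vertex features; adj: (N, N) adjacency A_ij
        h = self.lin(x)                                      # (N, out_dim)
        n = h.size(0)
        pairs = torch.cat([h.unsqueeze(1).expand(n, n, -1),
                           h.unsqueeze(0).expand(n, n, -1)], dim=-1)
        e = F.leaky_relu(self.att(pairs).squeeze(-1))        # (N, N) raw scores
        mask = adj + torch.eye(n, device=adj.device)         # keep self-loops
        e = e.masked_fill(mask == 0, float("-inf"))
        alpha = torch.softmax(e, dim=-1)                     # attend to neighbors
        return alpha @ h

class OcclusionVertexClassifier(nn.Module):
    """Two GAT layers classifying each 3D face vertex as occluded/visible."""

    def __init__(self, in_dim, hidden=64):
        super().__init__()
        self.gat1 = DenseGATLayer(in_dim, hidden)
        self.gat2 = DenseGATLayer(hidden, 2)

    def forward(self, x, adj):
        h = torch.relu(self.gat1(x, adj))
        return self.gat2(h, adj)                             # (N, 2) logits
```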

可选的，该第二网络结构在训练学习时使用监督的分割损失函数，如采用dice损失函数对遮挡mask与真实mask进行监督训练，实验表明其对于分割mask的预测相比传统的交叉熵损失函数可以得到更精确的结果。Optionally, the second network structure is trained with a supervised segmentation loss, for example a dice loss between the predicted occlusion mask and the ground-truth mask. Experiments show that, compared with the conventional cross-entropy loss, this yields more accurate mask predictions.
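
A standard dice-loss formulation is sketched below; the smoothing constant eps and the per-vertex probability layout are assumptions:

```python
import torch

def dice_loss(pred_mask, true_mask, eps=1e-6):
    """Dice loss between the predicted occlusion mask and the ground truth.

    pred_mask: (N,) per-vertex occlusion probabilities in [0, 1]
    true_mask: (N,) binary ground-truth occlusion labels
    """
    inter = (pred_mask * true_mask).sum()
    denom = pred_mask.sum() + true_mask.sum()
    return 1.0 - (2.0 * inter + eps) / (denom + eps)
```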

由上述方案可知,通过将人脸图像输入至第一网络结构,通过第一网络结构输出人脸图像在进行三维重建时的重建参数向量,再基于重建参数向量进行人脸图像的三维重建生成三维人脸信息,对三维人脸信息进行渲染和映射处理生成包含网络分割数据的二维人脸信息,此时进一步的获取第一网络结构提取的深度特征图,基于深度特征图以及所述二维人脸信息构造图结构信息,将图结构信息输入至第二网络结构,通过第二网络结构输出所述人脸图像对应的遮挡区域。本方案提供的人脸重建和遮挡区域识别方法,将人脸重建和遮挡区域识别合并到同一个模型中,在遮挡区域识别时,利用了人脸重建过程中的相关数据,将二维平面上的遮挡问题建模为三维人脸遮挡的识别,其更加符合人类感知的方式,同时由于利用了三维人脸的相关数据,更容易挖掘隐藏的关联特征,从而得到更为精确的遮挡区域识别结果。同时,充分利用了人脸重建和遮挡区域识别二者任务的共性特征,在尽量压缩模型的同时使两项任务形成互相促进的效果,优化了整体的资源部署,能够满足并行任务多且实时性要求高的应用场景,遮挡区域判定的精确性更高。It can be seen from the above scheme that by inputting a face image into the first network structure, the reconstruction parameter vector of the face image during three-dimensional reconstruction is output through the first network structure, and then the face image is reconstructed in three dimensions based on the reconstruction parameter vector to generate three-dimensional face information, and the three-dimensional face information is rendered and mapped to generate two-dimensional face information containing network segmentation data. At this time, the depth feature map extracted by the first network structure is further obtained, and the graph structure information is constructed based on the depth feature map and the two-dimensional face information. The graph structure information is input into the second network structure, and the occlusion area corresponding to the face image is output through the second network structure. The face reconstruction and occlusion area recognition method provided by this scheme combines face reconstruction and occlusion area recognition into the same model. When recognizing the occlusion area, the relevant data in the face reconstruction process is used to model the occlusion problem on the two-dimensional plane as the recognition of three-dimensional face occlusion, which is more in line with the way of human perception. At the same time, because the relevant data of the three-dimensional face is used, it is easier to mine hidden related features, thereby obtaining a more accurate occlusion area recognition result. At the same time, the common characteristics of face reconstruction and occluded area recognition tasks are fully utilized, and the two tasks are mutually reinforcing while compressing the model as much as possible, optimizing the overall resource deployment, and being able to meet application scenarios with many parallel tasks and high real-time requirements, and the accuracy of occluded area judgment is higher.

上述方案中,将人脸重建和遮挡区域识别整合至一个模型中进行处理,对本身具有关联性的两项任务同时进行特征提取,从而解决传统解决方案无法挖掘两项任务隐藏的关联特征的痛点,实现了对模型大小进行压缩,达成了更好的重建和分割效果的目的。In the above solution, face reconstruction and occluded area recognition are integrated into one model for processing, and feature extraction is performed on the two related tasks at the same time, thereby solving the pain point that traditional solutions cannot mine the hidden related features of the two tasks, compressing the model size, and achieving better reconstruction and segmentation effects.

在上述方案的基础上，在通过所述第二网络结构输出所述人脸图像对应的遮挡区域之后，还包括：基于人脸图像、二维人脸信息以及遮挡区域进行特效渲染处理，对处理结果进行显示。即在进行三维重建和确定遮挡区域后，可进一步基于此施加各类特效处理。示例性的，如图5所示，图5为本申请实施例提供的一种对图像进行特效处理的示意图。其中，人脸图像I_T为原始输入的图像，图像M_R为进行遮挡区域识别后输出的图像，图像I_R为渲染和映射处理后的二维人脸信息对应的图像，通过对三者的处理以最终得到图像I_F。可选的，在进行特效渲染处理的过程中，可以对遮挡区域使用原图遮挡物进行渲染，对遮挡区域以外的人脸图像部分，基于三维人脸信息进行渲染。On the basis of the above scheme, after the occluded area corresponding to the face image is output through the second network structure, the method further includes: performing special-effect rendering based on the face image, the two-dimensional face information and the occluded area, and displaying the processing result. That is, after the three-dimensional reconstruction and the determination of the occluded area, various special effects can be further applied. Exemplarily, as shown in FIG. 5, which is a schematic diagram of special-effect processing of an image provided by an embodiment of the present application, the face image I_T is the original input image, the image M_R is the image output after occluded-area recognition, and the image I_R is the image corresponding to the two-dimensional face information after rendering and mapping; the three are processed together to finally obtain the image I_F. Optionally, during the special-effect rendering, the occluded area may be rendered using the occluder from the original image, and the part of the face image outside the occluded area may be rendered based on the three-dimensional face information.

其中，图像M_R中黑色部分表示未被遮挡的区域，表明原始图片该区域未被遮挡，最终渲染时该区域应该渲染重建还原的图像I_R。相对的，图像M_R中的白色区域代表原始输入图片该区域被遮挡，最终渲染时不应该渲染重建还原的图片。由于三维人脸重建时无论对象人脸是否存在遮挡，都会重建出完整人脸，因此针对类似图5示例中的手部自遮挡以及墨镜遮挡的情况，重建时只会如图像I_R所示将整个人脸重建出来，不会重建任何遮挡物。如果还存在类似三维妆容的后处理流程，由于重建出的是完整的三维人脸，因此整个三维妆容素材都会保留，而不会考虑有遮挡物的存在而屏蔽遮挡部分的妆容。因此为了解决这一常见问题，在一个实施例中利用预测出的图像M_R的白色区域识别出遮挡物，针对遮挡物的部分，最终渲染展示的是输入图像。对于被墨镜以及手遮挡的部分，在最终渲染的图像I_F中使用的是图像I_T中的墨镜和手进行渲染，从而避免了无法渲染遮挡物、或者在有三维妆容后处理的情况下错误地将三维妆容渲染到遮挡物上的问题。The black part of the image M_R indicates the unoccluded area, meaning that this area of the original picture is not occluded and should be rendered with the reconstructed image I_R in the final rendering. Conversely, the white area of the image M_R indicates that this area of the original input picture is occluded, and the reconstructed picture should not be rendered there. Since three-dimensional face reconstruction always reconstructs a complete face regardless of whether the subject's face is occluded, for cases such as the hand self-occlusion and sunglasses occlusion in the example of FIG. 5, only the entire face is reconstructed, as shown in the image I_R, and no occluder is reconstructed. If there is also a post-processing pipeline such as three-dimensional makeup, the entire three-dimensional makeup material is retained because a complete three-dimensional face is reconstructed, without masking the makeup on the occluded parts. To solve this common problem, in one embodiment the white area of the predicted image M_R is used to identify the occluder, and for the occluder parts the final rendering shows the input image. The parts occluded by the sunglasses and the hand are rendered in the final image I_F using the sunglasses and the hand from the image I_T, which avoids the problems of being unable to render the occluder or, when three-dimensional makeup post-processing exists, mistakenly rendering the three-dimensional makeup onto the occluder.
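
The rendering rule just described (occluded pixels keep the original frame, visible pixels show the reconstruction) reduces to a per-pixel blend. The sketch below assumes a binary pixel mask and illustrative names:

```python
import numpy as np

def composite_final_frame(input_image, rendered_face, occlusion_mask):
    """Compose the final frame I_F from the input I_T and the render I_R.

    input_image:    (H, W, 3) original frame I_T
    rendered_face:  (H, W, 3) rendered reconstruction I_R
    occlusion_mask: (H, W) 1 where the face is occluded, 0 where visible
    """
    m = occlusion_mask[..., None].astype(np.float32)
    # Occluded pixels keep the original occluder; the rest shows the render.
    return m * input_image + (1.0 - m) * rendered_face   # image I_F
```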

图6为本申请实施例提供的一种人脸重建和遮挡区域识别装置的结构框图,该装置用于执行上述实施例提供的人脸重建和遮挡区域识别方法,具备执行方法相应的功能模块和有益效果。如图6所示,该装置具体包括:参数向量生成模块101、三维信息确定模块102、渲染映射模块103、图信息构造模块104和遮挡区域确定模块105,其中,FIG6 is a structural block diagram of a face reconstruction and occlusion region identification device provided in an embodiment of the present application. The device is used to execute the face reconstruction and occlusion region identification method provided in the above embodiment, and has the corresponding functional modules and beneficial effects of the execution method. As shown in FIG6, the device specifically includes: a parameter vector generation module 101, a three-dimensional information determination module 102, a rendering mapping module 103, a graph information construction module 104 and an occlusion region determination module 105, wherein:

参数向量生成模块101,配置为将人脸图像输入至第一网络结构,通过所述第一网络结构输出所述人脸图像在进行三维重建时的重建参数向量;The parameter vector generation module 101 is configured to input a face image into a first network structure, and output a reconstruction parameter vector of the face image when performing three-dimensional reconstruction through the first network structure;

三维信息确定模块102,配置为基于所述重建参数向量进行所述人脸图像的三维重建生成三维人脸信息;A three-dimensional information determination module 102, configured to perform three-dimensional reconstruction of the face image based on the reconstruction parameter vector to generate three-dimensional face information;

渲染映射模块103,配置为对所述三维人脸信息进行渲染和映射处理生成包含网络分割数据的二维人脸信息;A rendering and mapping module 103 is configured to render and map the three-dimensional face information to generate two-dimensional face information including network segmentation data;

图信息构造模块104,配置为获取所述第一网络结构提取的深度特征图,基于所述深度特征图以及所述二维人脸信息构造图结构信息;A graph information construction module 104, configured to obtain a depth feature graph extracted by the first network structure, and construct graph structure information based on the depth feature graph and the two-dimensional face information;

遮挡区域确定模块105,配置为将所述图结构信息输入至第二网络结构,通过所述第二网络结构输出所述人脸图像对应的遮挡区域。The occlusion region determination module 105 is configured to input the graph structure information into a second network structure, and output the occlusion region corresponding to the face image through the second network structure.

由上述方案可知，将人脸图像输入至第一网络结构，通过第一网络结构输出人脸图像在进行三维重建时的重建参数向量，再基于重建参数向量进行人脸图像的三维重建生成三维人脸信息，对三维人脸信息进行渲染和映射处理生成包含网络分割数据的二维人脸信息，此时进一步的获取第一网络结构提取的深度特征图，基于深度特征图以及所述二维人脸信息构造图结构信息，将图结构信息输入至第二网络结构，通过第二网络结构输出所述人脸图像对应的遮挡区域。本方案提供的人脸重建和遮挡区域识别方法，将人脸重建和遮挡区域识别合并到同一个模型中，在遮挡区域识别时，利用了人脸重建过程中的相关数据，将二维平面上的遮挡问题建模为三维人脸遮挡的识别，其更加符合人类感知的方式，同时由于利用了三维人脸的相关数据，更容易挖掘隐藏的关联特征，从而得到更为精确的遮挡区域识别结果。同时，充分利用了人脸重建和遮挡区域识别二者任务的共性特征，在尽量压缩模型的同时使两项任务形成互相促进的效果，优化了整体的资源部署，能够满足并行任务多且实时性要求高的应用场景，遮挡区域判定的精确性更高。It can be seen from the above scheme that a face image is input into the first network structure, the first network structure outputs the reconstruction parameter vector of the face image for three-dimensional reconstruction, and three-dimensional reconstruction of the face image is then performed based on the reconstruction parameter vector to generate three-dimensional face information; the three-dimensional face information is rendered and mapped to generate two-dimensional face information containing network segmentation data. The depth feature map extracted by the first network structure is further obtained, the graph structure information is constructed based on the depth feature map and the two-dimensional face information, the graph structure information is input into the second network structure, and the occluded area corresponding to the face image is output through the second network structure. The face reconstruction and occluded-area recognition method provided by this scheme merges face reconstruction and occluded-area recognition into the same model. When recognizing the occluded area, the relevant data from the face reconstruction process is used to model the occlusion problem on the two-dimensional plane as the recognition of three-dimensional face occlusion, which is more in line with the way of human perception; and since the relevant data of the three-dimensional face is used, it is easier to mine hidden associated features, yielding a more accurate occluded-area recognition result. At the same time, the common features of the two tasks are fully utilized, the two tasks mutually reinforce each other while the model is compressed as much as possible, the overall resource deployment is optimized, application scenarios with many parallel tasks and high real-time requirements can be met, and the accuracy of occluded-area determination is higher.

在一个可能的实施例中,所述重建参数向量包括人脸特征向量、人脸表情向量和三维人脸系数向量,所述三维信息确定模块102,配置为:In a possible embodiment, the reconstruction parameter vector includes a face feature vector, a face expression vector and a three-dimensional face coefficient vector, and the three-dimensional information determination module 102 is configured as follows:

基于所述人脸特征向量、所述人脸表情向量、所述三维人脸系数向量以及预设的平均人脸形状信息进行三维人脸点云的重建,得到点云重建信息;Reconstructing a three-dimensional face point cloud based on the face feature vector, the face expression vector, the three-dimensional face coefficient vector and preset average face shape information to obtain point cloud reconstruction information;

基于所述三维人脸系数向量以及预设的平均人脸纹理信息和人脸基底信息进行人脸纹理的重建,得到人脸纹理信息;Reconstructing facial texture based on the three-dimensional facial coefficient vector and preset average facial texture information and facial base information to obtain facial texture information;

基于所述点云重建信息和所述人脸纹理信息生成三维人脸信息。Generate three-dimensional face information based on the point cloud reconstruction information and the face texture information.

在一个可能的实施例中,所述渲染映射模块103,配置为:In a possible embodiment, the rendering mapping module 103 is configured as follows:

通过渲染器进行所述三维人脸信息的渲染得到二维人脸图像;Rendering the three-dimensional face information by a renderer to obtain a two-dimensional face image;

基于所述三维人脸信息中的拓扑结构,将所述三维人脸信息中的人脸三维点对数据映射至所述二维人脸图像中生成包含网络分割数据的二维人脸信息。Based on the topological structure in the three-dimensional face information, the three-dimensional face point pair data in the three-dimensional face information is mapped to the two-dimensional face image to generate two-dimensional face information containing network segmentation data.

在一个可能的实施例中,所述图信息构造模块104,配置为:In a possible embodiment, the graph information construction module 104 is configured as follows:

基于所述二维人脸信息中的网络分割数据对所述深度特征图进行分割处理得到三角片面分割结果;Segment the depth feature map based on the network segmentation data in the two-dimensional face information to obtain a triangular face segmentation result;

依据所述三角片面分割结果中的顶点联通关系以及点对距离进行邻接矩阵的构造得到图结构信息。The graph structure information is obtained by constructing an adjacency matrix based on the vertex connectivity and point pair distances in the triangle face segmentation result.

在一个可能的实施例中,所述遮挡区域确定模块105,配置为: In a possible embodiment, the occlusion area determination module 105 is configured as follows:

通过所述图卷积神经网络对所述图结构信息中的三维人脸顶点进行分类;Classifying the three-dimensional face vertices in the graph structure information by using the graph convolutional neural network;

基于分类结果输出所述人脸图像对应的遮挡区域。The occluded area corresponding to the face image is output based on the classification result.

在一个可能的实施例中,该装置还包括特效处理模块,配置为:In a possible embodiment, the device further includes a special effect processing module configured to:

在所述通过所述第二网络结构输出所述人脸图像对应的遮挡区域之后,基于所述人脸图像、所述二维人脸信息以及所述遮挡区域进行特效渲染处理,对处理结果进行显示。After the occluded area corresponding to the face image is output through the second network structure, special effects rendering processing is performed based on the face image, the two-dimensional face information and the occluded area, and the processing result is displayed.

在一个可能的实施例中,所述特效处理模块,配置为:In a possible embodiment, the special effect processing module is configured as follows:

对所述遮挡区域使用原图遮挡物进行渲染,对所述遮挡区域以外的人脸图像部分,基于所述三维人脸信息进行渲染。The occluded area is rendered using the original image occluder, and the facial image portion outside the occluded area is rendered based on the three-dimensional facial information.

图7为本申请实施例提供的一种人脸重建和遮挡区域识别设备的结构示意图，如图7所示，该设备包括处理器201、存储器202、输入装置203和输出装置204；设备中处理器201的数量可以是一个或多个，图7中以一个处理器201为例；设备中的处理器201、存储器202、输入装置203和输出装置204可以通过总线或其他方式连接，图7中以通过总线连接为例。存储器202作为一种计算机可读存储介质，可用于存储软件程序、计算机可执行程序以及模块，如本申请实施例中的人脸重建和遮挡区域识别方法对应的程序指令/模块。处理器201通过运行存储在存储器202中的软件程序、指令以及模块，从而执行设备的各种功能应用以及数据处理，即实现上述的人脸重建和遮挡区域识别方法。输入装置203可用于接收输入的数字或字符信息，以及产生与设备的用户设置以及功能控制有关的键信号输入。输出装置204可包括显示屏等显示设备。FIG. 7 is a schematic structural diagram of a face reconstruction and occluded-area recognition device provided by an embodiment of the present application. As shown in FIG. 7, the device includes a processor 201, a memory 202, an input apparatus 203, and an output apparatus 204; the number of processors 201 in the device may be one or more, and one processor 201 is taken as an example in FIG. 7; the processor 201, the memory 202, the input apparatus 203 and the output apparatus 204 in the device may be connected by a bus or other means, and connection by a bus is taken as an example in FIG. 7. The memory 202, as a computer-readable storage medium, may be used to store software programs, computer-executable programs and modules, such as the program instructions/modules corresponding to the face reconstruction and occluded-area recognition method in the embodiments of the present application. The processor 201 executes the various functional applications and data processing of the device by running the software programs, instructions and modules stored in the memory 202, that is, implements the above face reconstruction and occluded-area recognition method. The input apparatus 203 may be used to receive input digital or character information and to generate key signal inputs related to user settings and function control of the device. The output apparatus 204 may include a display device such as a display screen.

本申请实施例还提供一种包含计算机可执行指令的非易失性存储介质,所述计算机可执行指令在由计算机处理器执行时用于执行一种上述实施例描述的人脸重建和遮挡区域识别方法,其中,包括:The embodiment of the present application further provides a non-volatile storage medium containing computer executable instructions, wherein the computer executable instructions are used to execute a face reconstruction and occlusion area recognition method described in the above embodiment when executed by a computer processor, including:

将人脸图像输入至第一网络结构,通过所述第一网络结构输出所述人脸图像在进行三维重建时的重建参数向量;Inputting a face image into a first network structure, and outputting a reconstruction parameter vector of the face image when three-dimensional reconstruction is performed through the first network structure;

基于所述重建参数向量进行所述人脸图像的三维重建生成三维人脸信息;Performing three-dimensional reconstruction of the face image based on the reconstruction parameter vector to generate three-dimensional face information;

对所述三维人脸信息进行渲染和映射处理生成包含网络分割数据的二维人脸信息;Rendering and mapping the three-dimensional face information to generate two-dimensional face information containing network segmentation data;

获取所述第一网络结构提取的深度特征图，基于所述深度特征图以及所述二维人脸信息构造图结构信息；Obtaining a depth feature map extracted by the first network structure, and constructing graph structure information based on the depth feature map and the two-dimensional face information;

将所述图结构信息输入至第二网络结构,通过所述第二网络结构输出所述人脸图像对应的遮挡区域。The graph structure information is input into a second network structure, and the occluded area corresponding to the face image is output through the second network structure.

值得注意的是,上述人脸重建和遮挡区域识别装置的实施例中,所包括的各个单元和模块只是按照功能逻辑进行划分的,但并不局限于上述的划分,只要能够实现相应的功能即可;另外,各功能单元的具体名称也只是为了便于相互区分,并不用于限制本申请实施例的保护范围。It is worth noting that in the above-mentioned embodiment of the face reconstruction and occlusion area recognition device, the various units and modules included are only divided according to functional logic, but are not limited to the above-mentioned division, as long as the corresponding functions can be achieved; in addition, the specific name of each functional unit is only for the convenience of distinguishing each other, and is not used to limit the protection scope of the embodiment of the present application.

在一些可能的实施方式中,本申请提供的方法的各个方面还可以实现为一种程序产品的形式,其包括程序代码,当所述程序产品在计算机设备上运行时,所述程序代码用于使所述计算机设备执行本说明书上述描述的根据本申请各种示例性实施方式的方法中的步骤,例如,所述计算机设备可以执行本申请实施例所记载的人脸重建和遮挡区域识别方法。所述程序产品可以采用一个或多个可读介质的任意组合实现。 In some possible implementations, various aspects of the method provided by the present application may also be implemented in the form of a program product, which includes a program code. When the program product is run on a computer device, the program code is used to enable the computer device to execute the steps of the method according to various exemplary embodiments of the present application described above in this specification. For example, the computer device may execute the face reconstruction and occlusion area recognition method recorded in the embodiment of the present application. The program product may be implemented in any combination of one or more readable media.

Claims (11)

1. 人脸重建和遮挡区域识别方法，其中，包括：A method for face reconstruction and occlusion region recognition, comprising:
将人脸图像输入至第一网络结构，通过所述第一网络结构输出所述人脸图像在进行三维重建时的重建参数向量；Inputting a face image into a first network structure, and outputting a reconstruction parameter vector of the face image when three-dimensional reconstruction is performed through the first network structure;
基于所述重建参数向量进行所述人脸图像的三维重建生成三维人脸信息；Performing three-dimensional reconstruction of the face image based on the reconstruction parameter vector to generate three-dimensional face information;
对所述三维人脸信息进行渲染和映射处理生成包含网络分割数据的二维人脸信息；Rendering and mapping the three-dimensional face information to generate two-dimensional face information containing network segmentation data;
获取所述第一网络结构提取的深度特征图，基于所述深度特征图以及所述二维人脸信息构造图结构信息；Obtaining a depth feature map extracted by the first network structure, and constructing graph structure information based on the depth feature map and the two-dimensional face information;
将所述图结构信息输入至第二网络结构，通过所述第二网络结构输出所述人脸图像对应的遮挡区域。The graph structure information is input into a second network structure, and the occluded area corresponding to the face image is output through the second network structure.

2. 根据权利要求1所述的人脸重建和遮挡区域识别方法，其中，所述重建参数向量包括人脸特征向量、人脸表情向量和三维人脸系数向量，所述基于所述重建参数向量进行所述人脸图像的三维重建生成三维人脸信息，包括：The method for face reconstruction and occlusion area recognition according to claim 1, wherein the reconstruction parameter vector includes a face feature vector, a face expression vector and a three-dimensional face coefficient vector, and the three-dimensional reconstruction of the face image based on the reconstruction parameter vector to generate three-dimensional face information comprises:
基于所述人脸特征向量、所述人脸表情向量、所述三维人脸系数向量以及预设的平均人脸形状信息进行三维人脸点云的重建，得到点云重建信息；Reconstructing a three-dimensional face point cloud based on the face feature vector, the face expression vector, the three-dimensional face coefficient vector and preset average face shape information to obtain point cloud reconstruction information;
基于所述三维人脸系数向量以及预设的平均人脸纹理信息和人脸基底信息进行人脸纹理的重建，得到人脸纹理信息；Reconstructing facial texture based on the three-dimensional facial coefficient vector and preset average facial texture information and facial base information to obtain facial texture information;
基于所述点云重建信息和所述人脸纹理信息生成三维人脸信息。Generating three-dimensional face information based on the point cloud reconstruction information and the face texture information.

3. 根据权利要求1所述的人脸重建和遮挡区域识别方法，其中，所述对所述三维人脸信息进行渲染和映射处理生成包含网络分割数据的二维人脸信息，包括：The method for face reconstruction and occlusion area recognition according to claim 1, wherein the rendering and mapping of the three-dimensional face information to generate two-dimensional face information containing network segmentation data comprises:
通过渲染器进行所述三维人脸信息的渲染得到二维人脸图像；Rendering the three-dimensional face information by a renderer to obtain a two-dimensional face image;
基于所述三维人脸信息中的拓扑结构，将所述三维人脸信息中的人脸三维点对数据映射至所述二维人脸图像中生成包含网络分割数据的二维人脸信息。Based on the topological structure in the three-dimensional face information, mapping the three-dimensional face point pair data in the three-dimensional face information to the two-dimensional face image to generate two-dimensional face information containing network segmentation data.
4. 根据权利要求1-3中任一项所述的人脸重建和遮挡区域识别方法，其中，所述基于所述深度特征图以及所述二维人脸信息构造图结构信息，包括：The method for face reconstruction and occlusion area recognition according to any one of claims 1 to 3, wherein the constructing graph structure information based on the depth feature map and the two-dimensional face information comprises:
基于所述二维人脸信息中的网络分割数据对所述深度特征图进行分割处理得到三角片面分割结果；Segmenting the depth feature map based on the network segmentation data in the two-dimensional face information to obtain a triangular face segmentation result;
依据所述三角片面分割结果中的顶点联通关系以及点对距离进行邻接矩阵的构造得到图结构信息。Obtaining the graph structure information by constructing an adjacency matrix based on the vertex connectivity and the point-pair distances in the triangular face segmentation result.

5. 根据权利要求1-4中任一项所述的人脸重建和遮挡区域识别方法，其中，所述第二网络结构包括图卷积神经网络，所述通过所述第二网络结构输出所述人脸图像对应的遮挡区域，包括：The method for face reconstruction and occlusion area recognition according to any one of claims 1 to 4, wherein the second network structure comprises a graph convolutional neural network, and outputting the occlusion area corresponding to the face image through the second network structure comprises:
通过所述图卷积神经网络对所述图结构信息中的三维人脸顶点进行分类；Classifying the three-dimensional face vertices in the graph structure information by using the graph convolutional neural network;
基于分类结果输出所述人脸图像对应的遮挡区域。Outputting the occluded area corresponding to the face image based on the classification result.

6. 根据权利要求1-5中任一项所述的人脸重建和遮挡区域识别方法，其中，在所述通过所述第二网络结构输出所述人脸图像对应的遮挡区域之后，还包括：The method for face reconstruction and occlusion area recognition according to any one of claims 1 to 5, wherein, after outputting the occlusion area corresponding to the face image through the second network structure, the method further comprises:
基于所述人脸图像、所述二维人脸信息以及所述遮挡区域进行特效渲染处理，对处理结果进行显示。Performing special effects rendering processing based on the face image, the two-dimensional face information and the occluded area, and displaying the processing result.

7. 根据权利要求6所述的人脸重建和遮挡区域识别方法，其中，所述基于所述人脸图像、所述二维人脸信息以及所述遮挡区域进行特效渲染处理，包括：The method for face reconstruction and occlusion area recognition according to claim 6, wherein the special effects rendering processing based on the face image, the two-dimensional face information and the occlusion area comprises:
对所述遮挡区域使用原图遮挡物进行渲染，对所述遮挡区域以外的人脸图像部分，基于所述三维人脸信息进行渲染。Rendering the occluded area using the occluder from the original image, and rendering the part of the face image outside the occluded area based on the three-dimensional face information.
8. 人脸重建和遮挡区域识别装置，其中，包括：A face reconstruction and occlusion area recognition apparatus, comprising:
参数向量生成模块，配置为将人脸图像输入至第一网络结构，通过所述第一网络结构输出所述人脸图像在进行三维重建时的重建参数向量；a parameter vector generation module, configured to input a face image into a first network structure, and output a reconstruction parameter vector of the face image when three-dimensional reconstruction is performed through the first network structure;
三维信息确定模块，配置为基于所述重建参数向量进行所述人脸图像的三维重建生成三维人脸信息；a three-dimensional information determination module, configured to perform three-dimensional reconstruction of the face image based on the reconstruction parameter vector to generate three-dimensional face information;
渲染映射模块，配置为对所述三维人脸信息进行渲染和映射处理生成包含网络分割数据的二维人脸信息；a rendering and mapping module, configured to render and map the three-dimensional face information to generate two-dimensional face information containing network segmentation data;
图信息构造模块，配置为获取所述第一网络结构提取的深度特征图，基于所述深度特征图以及所述二维人脸信息构造图结构信息；a graph information construction module, configured to obtain a depth feature map extracted by the first network structure, and construct graph structure information based on the depth feature map and the two-dimensional face information;
遮挡区域确定模块，配置为将所述图结构信息输入至第二网络结构，通过所述第二网络结构输出所述人脸图像对应的遮挡区域。an occlusion area determination module, configured to input the graph structure information into a second network structure, and output the occlusion area corresponding to the face image through the second network structure.

9. 一种人脸重建和遮挡区域识别设备，所述设备包括：一个或多个处理器；存储装置，用于存储一个或多个程序，当所述一个或多个程序被所述一个或多个处理器执行，使得所述一个或多个处理器实现权利要求1-7中任一项所述的人脸重建和遮挡区域识别方法。A face reconstruction and occlusion area recognition device, comprising: one or more processors; and a storage apparatus for storing one or more programs, which, when executed by the one or more processors, cause the one or more processors to implement the face reconstruction and occlusion area recognition method according to any one of claims 1-7.

10. 一种存储计算机可执行指令的非易失性存储介质，所述计算机可执行指令在由计算机处理器执行时用于执行权利要求1-7中任一项所述的人脸重建和遮挡区域识别方法。A non-volatile storage medium storing computer-executable instructions, wherein the computer-executable instructions, when executed by a computer processor, are used to execute the face reconstruction and occlusion area recognition method according to any one of claims 1-7.

11. 一种计算机程序产品，包括计算机程序，其中，所述计算机程序被处理器执行时实现权利要求1-7中任一项所述的人脸重建和遮挡区域识别方法。A computer program product, comprising a computer program, wherein the computer program, when executed by a processor, implements the face reconstruction and occlusion area recognition method according to any one of claims 1-7.
PCT/CN2023/123840 2022-10-27 2023-10-10 Face reconstruction and occlusion region recognition method, apparatus and device, and storage medium WO2024088061A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211328264.0A CN115880748A (en) 2022-10-27 2022-10-27 Face reconstruction and occlusion region identification method, device, equipment and storage medium
CN202211328264.0 2022-10-27

Publications (1)

Publication Number Publication Date
WO2024088061A1 true WO2024088061A1 (en) 2024-05-02

Family

ID=85759023

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/123840 WO2024088061A1 (en) 2022-10-27 2023-10-10 Face reconstruction and occlusion region recognition method, apparatus and device, and storage medium

Country Status (2)

Country Link
CN (1) CN115880748A (en)
WO (1) WO2024088061A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115880748A (en) * 2022-10-27 2023-03-31 百果园技术(新加坡)有限公司 Face reconstruction and occlusion region identification method, device, equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020037678A1 (en) * 2018-08-24 2020-02-27 太平洋未来科技(深圳)有限公司 Method, device, and electronic apparatus for generating three-dimensional human face image from occluded image
CN113781640A (en) * 2021-09-27 2021-12-10 华中科技大学 A method for establishing a 3D face reconstruction model based on weakly supervised learning and its application
CN114549501A (en) * 2022-02-28 2022-05-27 佛山虎牙虎信科技有限公司 Face occlusion recognition method, three-dimensional face processing method, device, equipment and medium
CN115880748A (en) * 2022-10-27 2023-03-31 百果园技术(新加坡)有限公司 Face reconstruction and occlusion region identification method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN115880748A (en) 2023-03-31

Similar Documents

Publication Publication Date Title
CN111598998B (en) Three-dimensional virtual model reconstruction method, three-dimensional virtual model reconstruction device, computer equipment and storage medium
CN110675487B (en) Three-dimensional face modeling and recognition method and device based on multi-angle two-dimensional face
CN107852533B (en) Three-dimensional content generating device and method for generating three-dimensional content
JP6553285B2 (en) Scene reconstruction method, apparatus, terminal device, and storage medium
US20220222892A1 (en) Normalized three-dimensional avatar synthesis and perceptual refinement
WO2022121895A1 (en) Binocular living body detection method, apparatus, and device, and storage medium
CN108491848B (en) Image saliency detection method and device based on depth information
WO2019019828A1 (en) Target object occlusion detection method and apparatus, electronic device and storage medium
CN111008935B (en) Face image enhancement method, device, system and storage medium
US20120313937A1 (en) Coupled reconstruction of hair and skin
CN111127631B (en) Three-dimensional shape and texture reconstruction method, system and storage medium based on single image
CN113628327B (en) Head three-dimensional reconstruction method and device
CN107924571A (en) Three-dimensional reconstruction is carried out to human ear from a cloud
CN107481101B (en) Dressing recommendation method and device
CN110930503B (en) Method, system, storage medium and electronic device for establishing a three-dimensional model of clothing
CN113313832B (en) Semantic generation method and device of three-dimensional model, storage medium and electronic equipment
US20220157016A1 (en) System and method for automatically reconstructing 3d model of an object using machine learning model
CN110264573A (en) Three-dimensional rebuilding method, device, terminal device and storage medium based on structure light
CN115035235A (en) Three-dimensional reconstruction method and device
US20140198177A1 (en) Realtime photo retouching of live video
WO2023034100A1 (en) Systems for generating presentations of eyebrow designs
WO2024088061A1 (en) Face reconstruction and occlusion region recognition method, apparatus and device, and storage medium
CN114926593B (en) SVBRDF material modeling method and system based on Shan Zhanggao light images
CN116167925A (en) Image restoration method, device, electronic equipment and computer readable storage medium
CN116342831A (en) Three-dimensional scene reconstruction method, three-dimensional scene reconstruction device, computer equipment and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23881640

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE