CN106228507B - Depth image processing method based on light field - Google Patents
Depth image processing method based on light field
- Publication number
- CN106228507B (application CN201610541262.8A; application publication CN201610541262A)
- Authority
- CN
- China
- Prior art keywords
- initial
- shooting target
- preset shooting
- preset
- light field
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/06—Topological mapping of higher dimensional structures onto lower dimensional surfaces
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/10—Processing, recording or transmission of stereoscopic or multi-view image signals
- H04N13/106—Processing image signals
- H04N13/111—Transformation of image signals corresponding to virtual viewpoints, e.g. spatial image interpolation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/10—Processing, recording or transmission of stereoscopic or multi-view image signals
- H04N13/106—Processing image signals
- H04N13/128—Adjusting depth or disparity
Abstract
The invention discloses a depth image processing method based on a light field, comprising the steps of: capturing an initial 4D light field color image and an initial depth image of a preset shooting target with a light field acquisition device; preprocessing them to obtain an initial 3D mesh model of the preset shooting target and its corresponding initial normal field; analyzing and calculating the surface reflectivity of the preset shooting target; modeling its light field image according to the initial normal field and surface reflectivity to obtain an illumination model and its illumination parameters; optimizing the initial normal field of the preset shooting target according to its surface reflectivity and the illumination parameters of the illumination model; and, after performing depth enhancement on the initial depth image according to the optimized normal field, reconstructing the 3D mesh model of the preset shooting target. Based on a 4D light field, the invention can reconstruct the shape of a shot target, realize stereoscopic light field imaging display of the shot target, and obtain a high-quality depth image.
Description
Technical Field
The invention relates to the technical fields of light field imaging, image processing, computer vision and the like, in particular to a depth image processing method based on a light field.
Background
At present, with the continuous development of science and technology, three-dimensional scene information provides more possibilities for computer vision applications such as image segmentation, target detection and object tracking. Compared with a two-dimensional image, a depth image carries the three-dimensional characteristic information of an object, namely depth information, and is therefore widely used as a general representation of three-dimensional scene information. Detecting and identifying three-dimensional objects with an imaging device that captures color and depth information simultaneously is thus bound to become a new hotspot in the field of computer vision, and the acquisition of depth images is a key technology therein.
In computer vision systems, methods for obtaining depth images fall into two categories: passive and active. Passive depth image acquisition mainly relies on ambient imaging conditions; the most common method is binocular stereo vision, while light field imaging, as an emerging passive imaging mode, is attracting increasing attention for depth estimation. Light field imaging is an important branch of computational imaging. The light field is a radiation field that simultaneously contains the position and direction information of light in space; compared with the traditional imaging mode, which records only two-dimensional data, light field imaging therefore captures richer image information and offers many new directions for computational imaging.
Light field imaging relies on its special imaging structure to acquire four-dimensional light field data, which contains not only brightness information but also the direction information of light; thanks to its strong post-processing capability, it is widely applied in three-dimensional display, imaging depth-of-field extension, depth estimation and other fields. Light field imaging takes three main forms: microlens array, camera array, and mask. The microlens-array type is currently the most common light field imaging method, acquiring light field data through a microlens array placed between the main lens and the sensor.
Furthermore, with the rapid development of depth cameras, high-precision 3D shape modeling has become both more practical and more challenging. Active stereo imaging techniques (such as laser scanning, structured light and Kinect) are generally expensive, low in resolution, and restricted to indoor imaging environments, while passive stereo imaging techniques (such as binocular stereo vision and multi-view stereo reconstruction, MVS) rely on very complicated and time-consuming algorithms; it is therefore difficult to achieve high resolution, high precision, real-time performance, practicality and universality in 3D shape modeling at the same time. The advent of commercial light field cameras (Lytro, Raytrix) has brought new developments to 3D stereoscopic display and shape modeling.
The commercial Lytro light field camera has low spatial resolution; it generally matches a corresponding white image according to the parameter settings used during shooting, decodes the microlens image to obtain 4D light field data, and then performs algorithmic processing such as depth estimation, refocusing and stereoscopic display. As a passive imaging technology, the light field camera estimates depth from multiple 2D images, and the accuracy of the resulting depth map is low. Unlike active depth acquisition technologies such as Kinect, whose depth maps are globally smooth with small depth-value deviations, depth estimated from the 4D light field describes texture details well but yields no estimate on surfaces that are textureless, repetitively textured or weakly textured, and its noisy depth values deviate greatly from the true values.
Shape-from-shading (SFS), multi-view stereo reconstruction (MVS) and photometric stereo reconstruction (PS) are three classical passive stereo imaging techniques. SFS reconstructs shape from the shading cues of a luminance image; however, in some scenes it cannot be determined whether the luminance change of an object is caused by a change in geometry or by a difference in reflection properties, so in practice SFS algorithms usually assume imaging conditions such as a Lambertian reflector, uniform reflectivity and a distant point light source. MVS reconstructs shape from multiple calibrated 2D images shot at different viewpoints: features are extracted and matched across images at adjacent viewpoints to generate an initial depth map or sparse 3D point cloud, and a high-precision shape model is finally produced by optimization. MVS algorithms are therefore highly complex and time-consuming, their feature extraction and matching are very sensitive to changes in texture, occlusion, illumination and reflection properties, and most of them cannot be applied to all scenes. PS requires multiple light sources in a controllable indoor illumination environment: multiple images are shot, the light-source directions are accurately calculated, and the normal field of the object surface is computed from the brightness changes across the images to model the shape.
Therefore, the particular structure of light field imaging and the acquired 4D data determine that conventional passive stereo imaging techniques cannot be directly applied for depth estimation.
There is thus an urgent need for a technology that can reconstruct the shape of a shot target based on a 4D light field, realize stereoscopic light field imaging display of the shot target, and obtain a high-quality depth image, thereby ensuring imaging quality and helping to expand the popularization and application range of light field imaging.
Disclosure of Invention
In view of the above, an object of the present invention is to provide a depth image processing method based on a light field which can reconstruct the shape of a shot target from a 4D light field, realize stereoscopic light field imaging display of the shot target, and obtain a high-quality depth image. It thereby ensures imaging quality, helps expand the popularization and application range of light field imaging, promotes the application and development of light field imaging, and improves the user's product experience, which is of great significance for production practice.
Therefore, the invention provides a depth image processing method based on a light field, which comprises the following steps:
the first step: shooting with a light field acquisition device to obtain an initial 4D light field color image and an initial depth image of a preset shooting target;
the second step: preprocessing the obtained initial 4D light field color image and initial depth image of the preset shooting target to obtain an initial 3D mesh model of the preset shooting target and its corresponding initial normal field;
the third step: analyzing and calculating the surface reflectivity of the preset shooting target according to its initial color image and initial normal field;
the fourth step: modeling the light field image corresponding to the preset shooting target according to its initial normal field and surface reflectivity, obtaining an illumination model of the preset shooting target and the illumination parameters of the illumination model;
the fifth step: optimizing the initial normal field corresponding to the preset shooting target according to its surface reflectivity and the illumination parameters of the illumination model;
the sixth step: performing depth enhancement on the initial depth image of the preset shooting target according to the optimized normal field, obtaining the depth-enhanced initial depth image;
the seventh step: projecting the depth-enhanced initial depth image into 3D space and reconstructing the 3D mesh model of the preset shooting target.
Wherein the second step comprises the sub-steps of:
establishing a mask for the initial 4D light field color image and the initial depth image, and removing the background interference therein;
preprocessing the depth image, projecting the depth image into a 3D space, and obtaining an initial 3D grid model of a preset shooting target;
and obtaining an initial normal field corresponding to the preset shooting target based on the initial 3D grid model of the preset shooting target.
Wherein the third step comprises the substeps of:
processing the initial color image of the preset shooting target to obtain a corresponding chromaticity diagram;
segmenting the chromaticity diagram by thresholding, and extracting the edge point information of the chromaticity diagram;
performing reflectivity division on all surface areas included in the chromaticity diagram according to edge point information or chromaticity values of the chromaticity diagram, and establishing different marks for the surface areas with different reflectivities;
respectively calculating the chroma mean value of each surface area with a different reflectivity, and judging whether the area is an ambiguous pixel area by checking whether the chroma difference between its chroma mean value and a preset chroma value reaches a preset threshold; if so, defining it as an ambiguous pixel area, and filtering out the ambiguous pixel area based on the Euclidean distance;
and calculating the reflectivity of all the surface areas in the filtered chromaticity diagram, and finally obtaining the surface reflectivity of the preset shooting target.
The operation of dividing the reflectivity of all the surface areas of the chromaticity diagram according to the edge point information of the chromaticity diagram specifically comprises the following steps:
and judging whether edge points exist on the connecting line of any two pixel points in the chromaticity diagram, if so, defining that the pixel points belong to surface areas with different reflectivities, and setting different marks.
The operation of dividing the reflectivity of all the surface areas included in the chromaticity diagram according to the chromaticity value of the chromaticity diagram specifically includes the following steps:
and judging whether the chromaticity difference between any two surface areas in the chromaticity diagram reaches a preset value, if so, defining the surface areas as surface areas with different reflectivities, and marking different marks.
In the fourth step, according to an initial normal field and a surface reflectivity corresponding to the preset shooting target, modeling a light field image of the preset shooting target by adopting a preset quadratic function related to the normal direction and the reflectivity to obtain an illumination model of the preset shooting target and an illumination parameter of the illumination model;
the formula of the quadratic function is:
I=s(η)=ηTAη+bTη+c;
ηx,y=ρx,y·nx,y;
wherein, ηx,yIs the reflectance ρx,yAnd unit normal nx,yAnd (4) calculating to obtain the illumination parameters by a linear least square optimization algorithm, wherein A, b and c are the illumination parameters of the illumination model.
Wherein the fifth step comprises the substeps of:
optimizing an initial normal field by using a preset energy function comprising color image brightness constraint, local normal smooth constraint, normal prior constraint and unit vector constraint according to the surface reflectivity of the preset shooting target and the illumination parameters of the illumination model;
and solving the preset energy function with a nonlinear least-squares Levenberg-Marquardt (LM) optimization algorithm to obtain the optimized normal field.
Compared with the prior art, the depth image processing method based on a light field provided by the invention can reconstruct the shape of the shot target from the 4D light field, realize stereoscopic light field imaging display of the shot target, and obtain a high-quality depth image. It thereby ensures imaging quality, helps expand the popularization and application range of light field imaging, promotes the application and development of light field imaging, and improves the user's product experience, which is of great significance for production practice.
Drawings
FIG. 1 is a flow chart of a depth image processing method based on a light field according to the present invention;
fig. 2 is a diagram illustrating an initial color image of a preset shooting target in a depth image processing method based on a light field according to the present invention;
fig. 3 is a diagram illustrating an initial depth image of a preset shooting target in a depth image processing method based on a light field according to the present invention;
fig. 4 is a schematic diagram of the initial 3D mesh model of the preset shooting target obtained by smoothing and denoising the depth image in the depth image processing method based on a light field according to the present invention;
fig. 5 is a locally enlarged view of the initial 3D mesh model shown in fig. 4;
fig. 6 is a schematic diagram of a normal field corresponding to a preset shot target obtained based on an initial 3D mesh model of the preset shot target in the depth image processing method based on a light field according to the present invention;
fig. 7 is the normal map of the normal field shown in fig. 6, obtained from the initial 3D mesh model of the preset shooting target in the depth image processing method based on a light field according to the present invention;
fig. 8 is a chromaticity diagram obtained by processing an initial color image of the preset shooting target in the depth image processing method based on the light field according to the present invention;
fig. 9 is a diagram of an illumination model of the preset shooting target in the depth image processing method based on the light field according to the present invention;
fig. 10 is a schematic diagram of an optimized normal field of a preset shooting target in a depth image processing method based on a light field according to the present invention;
fig. 11 is a normal map of an optimized normal field of a preset shooting target in a depth image processing method based on a light field according to the present invention;
fig. 12 is a schematic diagram of a three-dimensional 3D mesh model of a preset shooting target finally obtained by the depth image processing method based on the light field according to the present invention;
FIG. 13 is an enlarged schematic view of section I of FIG. 12;
fig. 14 is an enlarged schematic view of a portion II shown in fig. 12.
Detailed Description
In order that those skilled in the art will better understand the technical solution of the present invention, the following detailed description of the present invention is provided in conjunction with the accompanying drawings and embodiments.
FIG. 1 is a flow chart of a depth image processing method based on a light field according to the present invention;
referring to fig. 1, the depth image processing method based on a light field provided by the present invention includes the following steps:
step S101: shooting by using light field acquisition equipment to obtain an initial 4D light field color image and an initial depth image of a preset shooting target;
referring to fig. 2 and 3, they respectively show the 4D light field color image and the initial depth image of a preset shooting target captured by a light field acquisition device such as one based on a color image sensor.
It should be noted that the current commercial handheld light field cameras are mainly the Lytro and Raytrix cameras: the Lytro cameras include the Lytro 1.0 and Lytro Illum, and the Raytrix cameras include the R5, R12, R29, R42 and other models; they can be used for light field image acquisition, depth estimation, refocusing, three-dimensional imaging and the like of a real scene. Alternatively, a robotic arm can be combined with an ordinary camera to simulate the light field imaging mode, collecting the light field through small movements.
Step S102: preprocessing the obtained initial 4D light field color image and the initial depth image of the preset shooting target to obtain an initial 3D grid model and a corresponding initial normal field of the preset shooting target;
in the present invention, the step S102 specifically includes the following sub-steps:
step S1021: creating a mask for the initial 4D light field color image and the initial depth image, and removing background interference therein (which can be manually removed by a user);
in the invention, in a specific implementation, saliency detection and segmentation of the image can be performed based on color differences, so that a mask is established for the target object in the initial 4D light field color image and the initial depth image, background information is deleted, and only the target object is processed subsequently.
Step S1022: smoothing and denoising the initial depth image, projecting the image into a 3D space to obtain an initial 3D mesh model (as shown in fig. 4 and 5) of a preset shooting target;
in the invention, the smoothing and denoising processing is carried out on the initial depth image through mean filtering and bilateral filtering.
In particular, it should be noted that the depth image is generally considered to be 2.5D: the depth value z of the three-dimensional coordinates (x, y, z) is projected into two-dimensional space and expressed as a gray value in the range 0-255. If the parameters of the camera (such as the light field camera Lytro Illum) are known, the depth information can be projected into three-dimensional space according to the projection model of the camera to obtain the (x, y, z) coordinates of the target; if the camera parameters cannot be acquired, the depth values are scaled to spatial z values at a preset ratio according to the size of the object and of the image, approximately expressing the three-dimensional shape of the target object.
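As an illustration of this projection step, the following sketch assumes a standard pinhole camera model; the function name and the intrinsic parameters fx, fy, cx, cy are illustrative stand-ins for calibrated values, and the fallback scaling path for unknown parameters is omitted:

```python
# Minimal sketch of back-projecting a depth map to 3D points under a pinhole
# camera model; fx, fy, cx, cy are assumed intrinsics from calibration.
import numpy as np

def backproject_depth(depth, fx, fy, cx, cy):
    """Lift an HxW depth map Z(x, y) to an HxWx3 map of (X, Y, Z) coordinates."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel grid
    z = depth.astype(np.float64)
    x = (u - cx) * z / fx  # X = (u - cx) * Z / fx
    y = (v - cy) * z / fy  # Y = (v - cy) * Z / fy
    return np.dstack([x, y, z])
```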
It should be noted that for the Lytro Illum and other light field cameras, the camera parameters need to be obtained through camera calibration: following the traditional camera calibration method, 10-20 checkerboard images at different angles are shot and the internal and external parameters of the camera are calculated.
Step S1023: based on the initial 3D mesh model of the preset shooting target, the initial normal field corresponding to the preset shooting target (i.e., the initial surface normal vectors, as shown in fig. 6 and 7; fig. 7 shows the normal map formed by color-coding the normal values) is obtained.
In the present invention, it should be noted that in the three-dimensional mesh model, for each spatial point p(X, Y, Z), the vector perpendicular to the tangent plane of the mesh surface on which the point lies is called the normal vector, denoted n_p. The tangent plane at the point is calculated from all mesh faces containing the pixel point p, the normal vector of the point is then obtained, and finally a normal field expressing the three-dimensional shape of the target object is generated.
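A minimal sketch of this normal-field computation, assuming the model is organized as a per-pixel point map so that the tangent plane can be taken from finite-difference tangents (a simplification of the per-face computation described above):

```python
import numpy as np

def normal_field(points):
    """Estimate per-pixel unit normals of an HxWx3 point map by crossing the
    finite-difference tangents along the image axes."""
    du = np.gradient(points, axis=1)  # tangent along image x
    dv = np.gradient(points, axis=0)  # tangent along image y
    n = np.cross(du, dv)
    norm = np.linalg.norm(n, axis=2, keepdims=True)
    return n / np.maximum(norm, 1e-12)  # normalize, guarding degenerate regions
```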
Step S103: analyzing and calculating the surface reflectivity of the preset shooting target according to its initial color image and initial normal field (once most ambiguous points have been eliminated, the reflectivity accurately expresses the reflection attributes of the surface pixels of the preset shooting target);
in the present invention, the step S103 specifically includes the following substeps:
step S1031: processing an initial color image (shown in fig. 2) of the preset shooting target to obtain a corresponding chromaticity diagram (shown in fig. 8);
in the present invention, it should be noted that, it can be found through the chromaticity diagram that ambiguity occurs in the pixel chromaticity value of the shadow area caused by occlusion and mutual reflection, and the reflection attribute of the preset shooting target cannot be correctly expressed. By carrying out clustering processing of the existing K-means algorithm on the chromaticity diagram, an ambiguous pixel region can be found, namely a brightness value change region (namely the chromaticity diagram corresponding to the initial color image) caused by shielding and mutual reflection can be obtained;
step S1032: performing threshold segmentation on the chromaticity diagram, and extracting edge point information of the chromaticity diagram (because the chromaticity values of shadow areas caused by occlusion and mutual reflection have ambiguity and cannot express the reflection attribute of an object);
in the present invention, it should be noted that, the chroma map is divided by a threshold, specifically: edge detection is carried out on the chromaticity diagram by using an edge detection operator (such as an edge detection algorithm like Canny or Sobel), edge pixel points in the chromaticity diagram are extracted, and simultaneously all the edge pixel points can be extracted as far as possible by expanding and optimizing discrete edge points.
Step S1033: performing reflectivity division on all surface areas included in the chromaticity diagram according to edge point information or chromaticity values of the chromaticity diagram, and establishing different marks for the surface areas with different reflectivities (namely different reflection attributes);
for the present invention, the operation of dividing the reflectivity of all the surface areas included in the chromaticity diagram according to the chromaticity value of the chromaticity diagram specifically includes the following steps:
and judging whether the chromaticity difference between any two surface areas in the chromaticity diagram reaches a preset value, if so, defining the surface areas as surface areas with different reflectivities (namely different reflection attributes), and marking different marks.
It should be noted that the mark may be any mark that can distinguish two surface regions, for example, a letter mark such as a or B, a number mark such as 1 and 2, or other marks.
It should be further noted that, in a specific implementation, the operation of performing reflectivity division on all surface areas included in the chromaticity diagram according to the edge point information of the chromaticity diagram specifically includes the following steps:
if there are other edge points (i.e. edge points matching the edge point information of the chromaticity diagram) on the connection line between any two pixel points p and q in the chromaticity diagram, if so, they are defined as surface areas with different reflectivities (i.e. different reflection attributes), and different marks are marked, because this can also indicate that they are surface areas with different reflectivities (i.e. different reflection attributes).
In the invention, the reflectivity division of all surface areas in the chromaticity diagram is performed according to the edge point information or the chroma values of the chromaticity diagram: for any pixel points p and q taken from two surface areas of the chromaticity diagram, if other edge points exist on the line connecting them, the two surface areas to which they belong have different reflection attributes, as sketched below.
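The edge test above can be sketched as follows; sampling the connecting segment point by point is an assumption, since the patent does not specify how the line is traversed:

```python
import numpy as np

def separated_by_edge(edges, p, q):
    """Return True when an edge point lies on the segment joining pixels p and q,
    i.e. the two pixels belong to surface areas of different reflectivity."""
    (r0, c0), (r1, c1) = p, q
    n = max(abs(r1 - r0), abs(c1 - c0)) + 1  # one sample per pixel step
    rows = np.linspace(r0, r1, n).round().astype(int)
    cols = np.linspace(c0, c1, n).round().astype(int)
    return bool(edges[rows, cols].any())
```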
Step S1034: calculating the chroma mean value of each surface area with a different reflectivity (i.e., a different reflection attribute), and judging whether the area is an ambiguous pixel area (i.e., contains ambiguous pixel points) by checking whether the chroma difference between its chroma mean value and a preset chroma value reaches a preset threshold; if so, the area is defined as an ambiguous pixel area, and the ambiguous pixel area is filtered out based on the Euclidean distance;
it should be noted that for each surface area with a different reflectivity, the chroma mean value Ch_mean is obtained; if |Ch_p − Ch_mean| > τ, the point p is an ambiguous point. Here Ch_mean is the average chroma value of the area, Ch_p is the chroma value of any point p, and τ is a truncation threshold whose value can be set in advance through experiments.
In particular, in implementation, the filtering based on the Euclidean distance and the chromaticity difference is calculated as:

Ch_p = (1/γ) · Σ_{q∈Ω_p} ω_d(d_p, d_q) · ω_c(Ch_p, Ch_q) · Ch_q;

where Ω_p is the local neighborhood of pixel p, I_p is the color brightness value of pixel point p, γ is the normalization coefficient, ω_d is the spatial-distance (Euclidean distance) weight between pixel p and its neighborhood pixel q, ω_c is the chroma-difference weight of pixel p and its neighborhood pixel q, d_p and d_q are the spatial positions (coordinate values) of pixel points p and q, and Ch_p and Ch_q are the chroma values of pixel points p and q.
If, according to the chroma difference between the chroma mean value and the preset chroma value, the difference for a surface area is larger than the preset chroma threshold, the surface area is judged to consist of ambiguous pixel points caused by shadow and mutual reflection.
In a specific implementation, the filtering eliminates the ambiguous pixel points caused by shadow and mutual reflection: the correct chroma values of the ambiguous points are computed by mean filtering over the chroma values of the local neighborhood pixel points, thereby eliminating the influence of shadow and mutual reflection.
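A sketch of this repair, with Gaussian forms assumed for the distance weight ω_d and the chroma-difference weight ω_c, and illustrative window and bandwidth parameters:

```python
import numpy as np

def repair_ambiguous_chroma(chroma, ambiguous, radius=3, sigma_d=2.0, sigma_c=0.05):
    """Recompute the chroma of flagged ambiguous pixels as a neighborhood mean
    weighted by Euclidean distance (omega_d) and chroma difference (omega_c)."""
    h, w, _ = chroma.shape
    out = chroma.copy()
    for py, px in zip(*np.nonzero(ambiguous)):
        y0, y1 = max(py - radius, 0), min(py + radius + 1, h)
        x0, x1 = max(px - radius, 0), min(px + radius + 1, w)
        patch = chroma[y0:y1, x0:x1]
        gy, gx = np.mgrid[y0:y1, x0:x1]
        w_d = np.exp(-((gy - py) ** 2 + (gx - px) ** 2) / (2 * sigma_d ** 2))
        w_c = np.exp(-((patch - chroma[py, px]) ** 2).sum(axis=2)
                     / (2 * sigma_c ** 2))
        wgt = w_d * w_c
        out[py, px] = (wgt[..., None] * patch).sum(axis=(0, 1)) / wgt.sum()
    return out
```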
Step S1035: calculating the reflectivity of all surface areas in the filtered chromaticity diagram, finally obtaining the surface reflectivity of the preset shooting target.
It should be noted that the known preset algorithm (Qifeng Chen and Vladlen Koltun, "A simple model for intrinsic image decomposition with depth cues," in Computer Vision (ICCV), IEEE International Conference on, 2013) may specifically be used to calculate the reflectivity of all surface areas in the filtered chromaticity diagram, finally obtaining the surface reflectivity of the preset shooting target.
In the present invention, it should be noted that the input data are the light field central-view sub-image and its corresponding initial depth map: the initial normal field is obtained from the initial depth map through step S102, and an accurate chromaticity diagram of the target surface is obtained from the central-view sub-image through step S103. Based on the initial normal field and the chromaticity diagram, the image content attributes are decomposed: the reflection attribute A_p of each pixel point of the object surface is extracted with the preset algorithm (Qifeng Chen and Vladlen Koltun, "A simple model for intrinsic image decomposition with depth cues," Computer Vision (ICCV), IEEE International Conference on, 2013), an energy function containing a data term and regularization terms is established, and the reflectivity of all surface areas in the filtered chromaticity diagram is finally calculated by linear least squares. The regularization energy is:

E_reg = ω_A·E_A + ω_D·E_D + ω_N·E_N + ω_C·E_C;

where I_p is the color brightness value of pixel point p, A_p is the reflectance, D_p is the direct irradiance, N_p is the indirect irradiance, C_p is the illumination color, and ω_A, ω_D, ω_N, ω_C are the weights of the corresponding content attributes. In the regularization energy functions E_A, E_D, E_N and E_C corresponding to the content attributes, α_{p,q} is the chroma-difference weight of the local neighborhood pixels, a_p is the reflectivity of pixel p, d_p is the spatial position (coordinate value) of pixel point p, Ch_p is the chroma value of pixel p, and n_p is the normal vector of pixel p.
Step S104: modeling a light field image corresponding to the preset shooting target according to the initial normal field and the surface reflectivity corresponding to the preset shooting target, and obtaining an illumination model (shown in fig. 9) of the preset shooting target and illumination parameters of the illumination model;
in the present invention, in a specific implementation, illumination modeling is performed on the preset shooting target according to its initial normal field and surface reflectivity. Since it is difficult to measure illumination attributes under continuously changing natural illumination, a preset quadratic function of the normal direction and the reflectivity is used to model the light field image (i.e., the light field luminance image) corresponding to the preset shooting target, yielding the illumination model of the preset shooting target and its illumination parameters. The formula of the quadratic function is:

I = s(η) = η^T A η + b^T η + c;

η_{x,y} = ρ_{x,y} · n_{x,y};

where η_{x,y} is the product of the reflectivity ρ_{x,y} and the unit normal n_{x,y}, and A, b and c are the illumination parameters of the illumination model, calculated by a linear least-squares optimization algorithm. A globally smooth surface of the preset shooting target can be obtained based on this global illumination model (as shown in fig. 9).
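Because s(η) is linear in the entries of A, b and c, fitting the illumination parameters reduces to ordinary linear least squares. A minimal sketch follows (the function name is illustrative, and A is taken symmetric, an assumption the patent does not state):

```python
import numpy as np

def fit_quadratic_illumination(eta, I):
    """Fit I = eta^T A eta + b^T eta + c by linear least squares.
    eta: (P, 3) array of rho * n per pixel; I: (P,) brightness values."""
    ex, ey, ez = eta[:, 0], eta[:, 1], eta[:, 2]
    # Design matrix: quadratic, cross, linear and constant terms
    # (10 unknowns once A is assumed symmetric).
    M = np.column_stack([ex**2, ey**2, ez**2,
                         2*ex*ey, 2*ex*ez, 2*ey*ez,
                         ex, ey, ez, np.ones_like(ex)])
    p, *_ = np.linalg.lstsq(M, I, rcond=None)
    A = np.array([[p[0], p[3], p[4]],
                  [p[3], p[1], p[5]],
                  [p[4], p[5], p[2]]])
    return A, p[6:9], p[9]  # A, b, c
```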
Step S105: optimizing an initial normal field corresponding to the preset shooting target according to the surface reflectivity of the preset shooting target and the illumination parameters of the illumination model, and recovering geometric details of the surface of the object;
it should be noted that the illumination parameters are introduced into the normal field optimization so that the surface shape of the finally constructed three-dimensional 3D model of the preset shooting target is smooth.
In the invention, in particular, the initial normal field corresponding to the preset shooting target is optimized according to the surface reflectivity of the preset shooting target and the illumination parameters of the illumination model. It should be noted that the surface reflectivity and the illumination model are computed and optimized based on the initial normal field, but the initial normal field is generated from the initial depth map and contains much noise and many ambiguous values. The surface of the preset shooting target contains many high-frequency local geometric details, the normal direction of each point is unique, and an accurate normal field is an essential factor for three-dimensional reconstruction; therefore, to recover the local geometric details of the target surface, optimal reconstruction of the normal field is necessary. According to the surface reflectivity of the preset shooting target and the illumination parameters of the illumination model, a minimized energy function is established from constraint terms expressing illumination consistency, local smoothness, the initial prior knowledge and the unit-vector property, and the initial normal field corresponding to the preset shooting target is optimized.
For the present invention, the step S105 specifically includes the following steps:
step S1051: according to the surface reflectivity of the preset shooting target and the illumination parameters of the illumination model, the normal direction of each pixel on the surface of the preset shooting target is optimized with a preset energy function, i.e., the initial normal field is optimized (as shown in fig. 10 and fig. 11);
step S1052: the preset energy function is optimized and solved with the nonlinear least-squares Levenberg-Marquardt (LM) algorithm to obtain the optimized normal field (as shown in figures 12 and 13).
For the present invention, it should be noted that the light field image is collected under natural illumination, where measuring the illumination properties is difficult and the illumination environment changes. The invention therefore models the light field brightness image with a quadratic shading function of the normal direction n and the reflectivity ρ; the global illumination parameters are estimated with a least-squares optimization; and the normal of each pixel of the object surface is optimized, specifically by minimizing an energy function that takes the reflectance ρ and the illumination model parameters as inputs.
The preset energy function E(n) comprises an image brightness constraint E_i(n), a local normal smoothness constraint E_sh(n), an initial normal constraint E_r(n) and a unit vector constraint E_u(n), and is calculated as:

E(n) = λ_i·E_i(n) + λ_sh·E_sh(n) + λ_r·E_r(n) + λ_u·E_u(n);

E_i(n) = Σ_p (I_p − s(η_p))²;  E_sh(n) = Σ_p Σ_{q∈Ω_p} ||n_p − n_q||²;

E_r(n) = Σ_p ||n_p − n_p⁰||²;  E_u(n) = Σ_p (n_pᵀ·n_p − 1)²;

where I_p is the color brightness value of pixel point p and s(η_p) is the brightness value output by the illumination model for pixel p, so E_i(n) constrains the image brightness to be consistent with the brightness values output by the illumination model; n_p is the normal vector of pixel p, and E_sh(n) imposes a smoothness constraint over local neighborhoods of the surface of the preset shooting target; n_p⁰ is the initial normal vector of pixel p, and E_r(n) constrains the optimized normal to be consistent with the initial normal; n_pᵀ is the transpose of the normal of pixel p, and E_u(n) constrains the optimized normal vector to be a unit vector.
Step S106: performing depth enhancement on the initial depth image of the preset shooting target according to the optimized normal field, obtaining the final depth-enhanced depth image (i.e., the initial depth image is enhanced using the high-precision optimized normals);
in the present invention, in a specific implementation, the initial depth map is enhanced based on the optimized normals to obtain a high-quality depth map, after which the three-dimensional mesh of the geometric shape is reconstructed. Depth enhancement uses the preset algorithm (Diego Nehab, Szymon Rusinkiewicz, James Davis, and Ravi Ramamoorthi, "Efficiently combining positions and normals for precise 3D geometry," ACM Transactions on Graphics (TOG), 2005). Specifically, the spatial coordinates, local information and the normal field are jointly combined, and an energy function is minimized by weighted least squares to obtain a high-precision depth map, defined as:

E = λ·E_p + (1 − λ)·E_n;

E_p = Σ_i ||P_i − P_i⁰||²;  E_n = Σ_i [(T_x·n_i)² + (T_y·n_i)²];

where E_p is the energy function of the spatial coordinates; P(x, y) is the three-dimensional spatial coordinate of image-plane pixel point (x, y), Z(x, y) is the depth value, and f_x and f_y are the focal lengths of the camera; P_i is the optimal spatial coordinate of the target and P_i⁰ is the measured spatial coordinate obtained from the initial depth map. E_n is the energy function of the normal field, T_x and T_y are the surface tangents at pixel point (x, y) of the preset shooting target, and n_i is the normal vector corresponding to the optimized spatial coordinate P_i.
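A simplified sketch of this fusion that solves for the depth Z only: under the pinhole model P(u, v) = Z(u, v)·d(u, v), both the position term and the tangent-normal term are linear in Z, so a sparse least-squares solve suffices (this scalar-depth reduction and the forward-difference tangents are simplifying assumptions; Nehab et al. optimize full 3D positions):

```python
import numpy as np
from scipy.sparse import coo_matrix, identity, vstack
from scipy.sparse.linalg import lsqr

def fuse_depth_normals(Z0, N, fx, fy, cx, cy, lam=0.5):
    """Weighted least-squares fusion of a measured depth map Z0 (HxW) with an
    optimized normal field N (HxWx3): keep Z near Z0 while making the
    forward-difference surface tangents orthogonal to the normals."""
    h, w = Z0.shape
    npx = h * w
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    # Ray directions d(u, v) such that P(u, v) = Z(u, v) * d(u, v).
    d = np.dstack([(u - cx) / fx, (v - cy) / fy, np.ones((h, w))]).reshape(npx, 3)
    n = N.reshape(npx, 3)
    idx = np.arange(npx).reshape(h, w)

    blocks = [np.sqrt(lam) * identity(npx, format='csr')]  # position term E_p
    rhs = [np.sqrt(lam) * Z0.ravel()]
    for dv, du in [(0, 1), (1, 0)]:  # tangents T_x and T_y (normal term E_n)
        p = idx[:h - dv, :w - du].ravel()
        q = idx[dv:, du:].ravel()
        # Residual n_p . (Z_q d_q - Z_p d_p) = a*Z_q - b*Z_p, linear in Z.
        a = (n[p] * d[q]).sum(axis=1)
        b = (n[p] * d[p]).sum(axis=1)
        m = len(p)
        rows = np.tile(np.arange(m), 2)
        cols = np.concatenate([q, p])
        vals = np.sqrt(1 - lam) * np.concatenate([a, -b])
        blocks.append(coo_matrix((vals, (rows, cols)), shape=(m, npx)).tocsr())
        rhs.append(np.zeros(m))
    Z = lsqr(vstack(blocks), np.concatenate(rhs))[0]
    return Z.reshape(h, w)
```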
Step S107: the depth-enhanced initial depth image is projected into 3D space, and the 3D mesh model of the preset shooting target is reconstructed (as shown in fig. 14). The invention can thereby realize stereoscopic light field imaging display of the preset shooting target, obtain a high-quality depth image, ensure imaging quality, and help expand the popularization and application range of light field imaging.
it should be noted that the known algorithm (Diego Nehab, Szymon Rusinkiewicz, James Davis, and Ravi Ramamoorthi, "Efficiently combining positions and normals for precise 3D geometry," ACM Transactions on Graphics (TOG), 2005) may specifically be used, so that the initial depth image of the preset shooting target is depth-enhanced according to the optimized normal field, yielding the final, high-quality depth-enhanced depth image.
In the present invention, it should be noted that, according to the projection model from two-dimensional space to three-dimensional space, the depth-enhanced initial depth image is projected into 3D space by the formula:

Z·[x, y, 1]ᵀ = K·(R·[X, Y, Z]ᵀ + T),  with  K = [[f_x, 0, c_x], [0, f_y, c_y], [0, 0, 1]];

where (x, y) is the image-plane coordinate of the preset shooting target, (X, Y, Z) is the coordinate of the surface of the preset shooting target in 3D space, f_x, f_y and (c_x, c_y) are respectively the focal lengths and the center (principal point) coordinates of the camera, and R and T are respectively the rotation and translation matrices of the projection transformation.
In summary, compared with the prior art, the depth image processing method based on a light field provided by the invention can reconstruct the shape of the shot target from the 4D light field, realize stereoscopic light field imaging display of the shot target, and obtain a high-quality depth image. It thereby ensures imaging quality, helps expand the popularization and application range of light field imaging, promotes the application and development of light field imaging, and improves the user's product experience, which is of great significance for production practice.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various modifications and refinements without departing from the principle of the present invention, and these modifications and refinements shall also fall within the protection scope of the present invention.
Claims (7)
1. A depth image processing method based on a light field is characterized by comprising the following steps:
the first step: shooting with a light field acquisition device to obtain an initial 4D light field color image and an initial depth image of a preset shooting target;
the second step: preprocessing the obtained initial 4D light field color image and initial depth image of the preset shooting target to obtain an initial 3D mesh model of the preset shooting target and its corresponding initial normal field;
the third step: analyzing and calculating the surface reflectivity of the preset shooting target according to its initial color image and initial normal field;
the fourth step: modeling the light field image corresponding to the preset shooting target according to its initial normal field and surface reflectivity, obtaining an illumination model of the preset shooting target and the illumination parameters of the illumination model;
the fifth step: optimizing the initial normal field corresponding to the preset shooting target according to its surface reflectivity and the illumination parameters of the illumination model;
the sixth step: performing depth enhancement on the initial depth image of the preset shooting target according to the optimized normal field, obtaining the depth-enhanced initial depth image;
the seventh step: projecting the depth-enhanced initial depth image into 3D space and reconstructing the 3D mesh model of the preset shooting target.
2. The method of claim 1, wherein the second step comprises the sub-steps of:
establishing a mask for the initial 4D light field color image and the initial depth image, and removing the background interference therein;
preprocessing the depth image, projecting the depth image into a 3D space, and obtaining an initial 3D grid model of a preset shooting target;
and obtaining an initial normal field corresponding to the preset shooting target based on the initial 3D grid model of the preset shooting target.
3. The method of claim 1, wherein the third step comprises the sub-steps of:
processing the initial color image of the preset shooting target to obtain a corresponding chromaticity diagram;
segmenting the chromaticity diagram by thresholding, and extracting the edge point information of the chromaticity diagram;
performing reflectivity division on all surface areas included in the chromaticity diagram according to edge point information or chromaticity values of the chromaticity diagram, and establishing different marks for the surface areas with different reflectivities;
respectively calculating the chroma mean value of each surface area with a different reflectivity, and judging whether the area is an ambiguous pixel area by checking whether the chroma difference between its chroma mean value and a preset chroma value reaches a preset threshold; if so, defining it as an ambiguous pixel area, and filtering out the ambiguous pixel area based on the Euclidean distance;
and calculating the reflectivity of all the surface areas in the filtered chromaticity diagram, and finally obtaining the surface reflectivity of the preset shooting target.
4. The method as claimed in claim 3, wherein the operation of dividing the reflectivity of the entire surface area of the chromaticity diagram according to the edge point information of the chromaticity diagram specifically comprises the following steps:
and judging whether edge points exist on a connecting line between any two pixel points in the chromaticity diagram, if so, defining that the pixel points belong to surface areas with different reflectivities, and setting different marks.
5. The method according to claim 3, wherein the operation of dividing the reflectivity of all surface areas included in the chromaticity diagram according to the chromaticity value of the chromaticity diagram specifically comprises the steps of:
and judging whether the chromaticity difference between any two surface areas in the chromaticity diagram reaches a preset value, if so, defining the surface areas as surface areas with different reflectivities, and marking different marks.
6. The method according to claim 4, wherein in the fourth step, a light field image of the preset shooting target is modeled by using a preset quadratic function about normal and reflectivity according to an initial normal field and surface reflectivity corresponding to the preset shooting target, and an illumination model of the preset shooting target and illumination parameters of the illumination model are obtained;
the formula of the quadratic function is:
I = s(η) = η^T A η + b^T η + c;

η_{x,y} = ρ_{x,y} · n_{x,y};

where η_{x,y} is the product of the reflectivity ρ_{x,y} and the unit normal n_{x,y}, and A, b and c are the illumination parameters of the illumination model, calculated by a linear least-squares optimization algorithm.
7. The method according to any one of claims 1 to 6, characterized in that said fifth step comprises the sub-steps of:
optimizing an initial normal field by using a preset energy function comprising color image brightness constraint, local normal smooth constraint, normal prior constraint and unit vector constraint according to the surface reflectivity of the preset shooting target and the illumination parameters of the illumination model;
and solving the preset energy function with a nonlinear least-squares Levenberg-Marquardt (LM) optimization algorithm to obtain the optimized normal field.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610541262.8A CN106228507B (en) | 2016-07-11 | 2016-07-11 | A kind of depth image processing method based on light field |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610541262.8A CN106228507B (en) | 2016-07-11 | 2016-07-11 | A kind of depth image processing method based on light field |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106228507A CN106228507A (en) | 2016-12-14 |
CN106228507B true CN106228507B (en) | 2019-06-25 |
Family
ID=57519550
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610541262.8A Active CN106228507B (en) | 2016-07-11 | 2016-07-11 | A kind of depth image processing method based on light field |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106228507B (en) |
Families Citing this family (26)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106780705B (en) * | 2016-12-20 | 2020-10-16 | 南阳师范学院 | A Robust Depth Map Smoothing Filter Applicable to DIBR Preprocessing |
EP3343506A1 (en) * | 2016-12-28 | 2018-07-04 | Thomson Licensing | Method and device for joint segmentation and 3d reconstruction of a scene |
CN107228625B (en) * | 2017-06-01 | 2023-04-18 | 深度创新科技(深圳)有限公司 | Three-dimensional reconstruction method, device and equipment |
CN109427086A (en) * | 2017-08-22 | 2019-03-05 | 上海荆虹电子科技有限公司 | 3-dimensional image creation device and method |
US10776995B2 (en) * | 2017-10-17 | 2020-09-15 | Nvidia Corporation | Light fields as better backgrounds in rendering |
CN108805921B (en) * | 2018-04-09 | 2021-07-06 | 奥比中光科技集团股份有限公司 | Image acquisition system and method |
CN109146934A (en) * | 2018-06-04 | 2019-01-04 | 成都通甲优博科技有限责任公司 | A kind of face three-dimensional rebuilding method and system based on binocular solid and photometric stereo |
CN109087347B (en) * | 2018-08-15 | 2021-08-17 | 浙江光珀智能科技有限公司 | Image processing method and device |
CN109166176B (en) * | 2018-08-23 | 2020-07-07 | 百度在线网络技术(北京)有限公司 | Three-dimensional face image generation method and device |
CN111080689B (en) * | 2018-10-22 | 2023-04-14 | 杭州海康威视数字技术股份有限公司 | Method and device for determining face depth map |
EP3671645A1 (en) * | 2018-12-20 | 2020-06-24 | Carl Zeiss Vision International GmbH | Method and device for creating a 3d reconstruction of an object |
CN110417990B (en) * | 2019-03-25 | 2020-07-24 | 浙江麦知网络科技有限公司 | APP startup system based on target analysis |
CN109974625B (en) * | 2019-04-08 | 2021-02-09 | 四川大学 | A three-dimensional measurement method of structured light for colored objects based on hue-optimized grayscale |
CN110471061A (en) * | 2019-07-16 | 2019-11-19 | 青岛擎鹰信息科技有限责任公司 | A kind of emulation mode and its system for realizing airborne synthetic aperture radar imaging |
CN110455815B (en) * | 2019-09-05 | 2023-03-24 | 西安多维机器视觉检测技术有限公司 | Method and system for detecting appearance defects of electronic components |
CN110686652B (en) * | 2019-09-16 | 2021-07-06 | 武汉科技大学 | A depth measurement method based on the combination of deep learning and structured light |
CN111147745B (en) * | 2019-12-30 | 2021-11-30 | 维沃移动通信有限公司 | Shooting method, shooting device, electronic equipment and storage medium |
CN111207762B (en) * | 2019-12-31 | 2021-12-07 | 深圳一清创新科技有限公司 | Map generation method and device, computer equipment and storage medium |
CN111343444B (en) * | 2020-02-10 | 2021-09-17 | 清华大学 | Three-dimensional image generation method and device |
CN111325780B (en) * | 2020-02-17 | 2021-07-27 | 天目爱视(北京)科技有限公司 | 3D model rapid construction method based on image screening |
CN112149348A (en) * | 2020-09-18 | 2020-12-29 | 北京每日优鲜电子商务有限公司 | Simulation space model training data generation method based on unmanned container scene |
CN113052970B (en) * | 2021-04-09 | 2023-10-13 | 杭州群核信息技术有限公司 | A design method, device, system and storage medium for light intensity and color of light |
CN113436325B (en) * | 2021-07-30 | 2023-07-28 | 北京达佳互联信息技术有限公司 | Image processing method and device, electronic equipment and storage medium |
CN113989473B (en) * | 2021-12-23 | 2022-08-12 | 北京天图万境科技有限公司 | Method and device for relighting |
CN116109520B (en) * | 2023-04-06 | 2023-07-04 | 南京信息工程大学 | A Depth Image Optimization Method Based on Ray Tracing Algorithm |
CN116447978B (en) * | 2023-06-16 | 2023-10-31 | 先临三维科技股份有限公司 | Hole site information detection method, device, equipment and storage medium |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104966289A (en) * | 2015-06-12 | 2015-10-07 | 北京工业大学 | Depth estimation method based on 4D light field |
CN105357515A (en) * | 2015-12-18 | 2016-02-24 | 天津中科智能识别产业技术研究院有限公司 | Color and depth imaging method and device based on structured light and light-field imaging |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9524556B2 (en) * | 2014-05-20 | 2016-12-20 | Nokia Technologies Oy | Method, apparatus and computer program product for depth estimation |
- 2016-07-11: CN application CN201610541262.8A granted as patent CN106228507B (active)
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104966289A (en) * | 2015-06-12 | 2015-10-07 | 北京工业大学 | Depth estimation method based on 4D light field |
CN105357515A (en) * | 2015-12-18 | 2016-02-24 | 天津中科智能识别产业技术研究院有限公司 | Color and depth imaging method and device based on structured light and light-field imaging |
Non-Patent Citations (1)
Title |
---|
Light field imaging technology and its applications in computer vision; Zhang Chi et al.; Journal of Image and Graphics (中国图象图形学报); 2016-03-31; Vol. 21, No. 3; pp. 263-281
Also Published As
Publication number | Publication date |
---|---|
CN106228507A (en) | 2016-12-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106228507B (en) | A kind of depth image processing method based on light field | |
CN101785025B (en) | System and method for three-dimensional object reconstruction from two-dimensional images | |
Bemana et al. | Eikonal fields for refractive novel-view synthesis | |
US10176564B1 (en) | Collaborative disparity decomposition | |
Varol et al. | Monocular 3D reconstruction of locally textured surfaces | |
WO2012096747A1 (en) | Forming range maps using periodic illumination patterns | |
Serna et al. | Data fusion of objects using techniques such as laser scanning, structured light and photogrammetry for cultural heritage applications | |
WO2018133119A1 (en) | Method and system for three-dimensional reconstruction of complete indoor scene based on depth camera | |
Liu et al. | High quality depth map estimation of object surface from light-field images | |
CN110378995B (en) | Method for three-dimensional space modeling by using projection characteristics | |
CN106952247A (en) | A kind of dual camera terminal and its image processing method and system | |
EP3906530B1 (en) | Method for 3d reconstruction of an object | |
Brahimi et al. | Sparse views near light: A practical paradigm for uncalibrated point-light photometric stereo | |
TWI595446B (en) | Method for improving the quality of shadowed edges based on depth camera in augmented reality | |
Grochulla et al. | Combining photometric normals and multi-view stereo for 3d reconstruction | |
Kim et al. | Multi-view object extraction with fractional boundaries | |
Cushen et al. | Markerless real-time garment retexturing from monocular 3d reconstruction | |
Coupry et al. | Assessing the quality of 3D reconstruction in the absence of ground truth: application to a multimodal archaeological dataset | |
Nguyen et al. | Modelling of 3D objects using unconstrained and uncalibrated images taken with a handheld camera | |
Savakar et al. | A relative 3D scan and construction for face using meshing algorithm | |
Feris et al. | Multiflash stereopsis: Depth-edge-preserving stereo with small baseline illumination | |
Liu et al. | Synthesis and identification of three-dimensional faces from image (s) and three-dimensional generic models | |
Ku et al. | Differentiable appearance acquisition from a flash/no-flash RGB-D pair | |
Liu et al. | Albedo assisted high-quality shape recovery from 4D light fields | |
Feng et al. | Learning Photometric Feature Transform for Free-Form Object Scan |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CP03 | Change of name, title or address |
Address after: 300457 unit 1001, block 1, msd-g1, TEDA, No.57, 2nd Street, Binhai New Area Economic and Technological Development Zone, Tianjin Patentee after: Tianjin Zhongke intelligent identification Co.,Ltd. Address before: 300457 No. 57, Second Avenue, Economic and Technological Development Zone, Binhai New Area, Tianjin Patentee before: TIANJIN ZHONGKE INTELLIGENT IDENTIFICATION INDUSTRY TECHNOLOGY RESEARCH INSTITUTE Co.,Ltd. |