
CN103493482B - Method and apparatus for extracting and optimizing an image depth map - Google Patents

Method and apparatus for extracting and optimizing an image depth map

Info

Publication number
CN103493482B
CN103493482B (application CN201280004184.8A)
Authority
CN
China
Prior art keywords
pixel
value
correlation
depth
sampling
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201280004184.8A
Other languages
Chinese (zh)
Other versions
CN103493482A (en)
Inventor
赵兴朋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hisense Visual Technology Co Ltd
Original Assignee
Qingdao Hisense Xinxin Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qingdao Hisense Xinxin Technology Co Ltd filed Critical Qingdao Hisense Xinxin Technology Co Ltd
Publication of CN103493482A publication Critical patent/CN103493482A/en
Application granted granted Critical
Publication of CN103493482B publication Critical patent/CN103493482B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/59Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial sub-sampling or interpolation, e.g. alteration of picture size or resolution
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51Motion estimation or motion compensation
    • H04N19/513Processing of motion vectors
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/597Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/80Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of the present invention relate to a method and apparatus for extracting and optimizing an image depth map. The method comprises: obtaining a current source image and the scene correlation of each pixel in the current source image; successively down-sampling the current source image and obtaining the scene correlation of each pixel in each down-sampled source image; performing block-matching motion-vector calculation between each pixel in the current down-sampled source image and the corresponding pixel in the preceding down-sampled source image, to obtain the motion vector value of each pixel in the current down-sampled source image; accumulating the motion vector values of the pixels in the current down-sampled source image and extracting an initial depth value for each pixel from the accumulated motion vectors, the initial depth values forming an initial depth map; and applying super smoothing filtering and successive up-sampling to each pixel in the initial depth map, using the scene correlations of the pixels in the source image and in each down-sampled source image, to obtain the source-image depth map.

Description

Method and apparatus for extracting and optimizing an image depth map
Technical field
The present invention relates to the field of image processing, and in particular to a method and apparatus for extracting and optimizing an image depth map.
Background technology
Three-dimensional (3D) stereoscopic display technology has developed rapidly in recent years; the rapid rise of 3D terminal display devices such as 3D televisions and 3D game consoles is a natural outcome of technological progress. Because 3D film sources are scarce and expensive, 2D-to-3D conversion, which turns ordinary two-dimensional (2D) video frame sequences into 3D video frame sequences, has become a hot topic in stereoscopic display technology.
In 2D-to-3D conversion, the key difficulty is extracting the depth map of the 2D video image. The physical meaning of a depth map is how far each piece of picture content in the 2D frame sequence is from the viewer; it is the most important source of information for forming 3D parallax images. Many methods for extracting depth maps currently exist, including extracting object depth information from object contours, extracting depth maps from image color segmentation, extracting depth maps from virtual-space vanishing points, extracting depth maps from object motion vectors, and semi-automatic extraction based on key frames. Most of these techniques, however, suffer serious defects: blurry depth maps, excessive computation, or too much manual intervention, which makes it difficult to meet the display requirements of 3D terminal devices.
The prior art provides a method for generating a depth map for a two-dimensional video sequence, as shown in Fig. 1. The method first selects key frames in the video frame sequence and manually generates depth maps for them; it then matches and estimates the motion displacement of feature points between successive video frames, and derives the depth map of the current frame from the key-frame depth maps and the motion displacement. This method can extract the current frame's depth map to a certain extent, but it requires manually selecting key frames and computing their depth maps, which is unfavorable to fully automatic depth-map generation and therefore hard to promote industrially. In addition, point-based matching estimation easily produces matching errors, which propagate into the depth information, so the extracted depth maps often have blurred edges and uneven depth.
Summary of the invention
The object of the present invention is to solve the prior-art problems that key-frame depth maps must be chosen and extracted manually, giving low-precision, error-prone depth maps, by providing a method and apparatus for extracting and optimizing an image depth map.
In a first aspect, an embodiment of the present invention provides a method for extracting and optimizing an image depth map. The method comprises: obtaining a current source image and the scene correlation of each pixel in the current source image, the current source image being a frame of a continuous video frame sequence;
successively down-sampling the current source image and obtaining the scene correlation of each pixel in each current down-sampled source image;
performing block-matching motion-vector calculation between each pixel in the current down-sampled source image and the corresponding pixel in the preceding down-sampled source image, to obtain the motion vector value of each pixel in the current down-sampled source image;
accumulating the motion vector value of each pixel in the current down-sampled source image, and extracting an initial depth value for each pixel from the accumulated motion vectors, the initial depth values forming the initial depth map of the source image;
applying super smoothing filtering and successive up-sampling to each pixel in the initial depth map, using the scene correlation of each pixel in the source image and in each down-sampled source image, to obtain the depth map of the source image.
In a second aspect, an embodiment of the present invention provides an apparatus for extracting and optimizing an image depth map. The apparatus comprises: a first acquiring unit, configured to obtain a current source image and the scene correlation of each pixel in the current source image, the current source image being a frame of a continuous video frame sequence;
a second acquiring unit, configured to successively down-sample the current source image and obtain the scene correlation of each pixel in each current down-sampled source image;
a third acquiring unit, configured to perform block-matching motion-vector calculation between each pixel in the current down-sampled source image and the corresponding pixel in the preceding down-sampled source image, to obtain the motion vector value of each pixel in the current down-sampled source image;
a computing unit, configured to accumulate the motion vector value of each pixel in the current down-sampled source image, and to extract an initial depth value for each pixel from the accumulated motion vectors, the initial depth values forming the initial depth map of the source image;
a first processing unit, configured to apply super smoothing filtering and successive up-sampling to each pixel in the initial depth map, using the scene correlation of each pixel in the source image and in each down-sampled source image, to obtain the depth map of the source image.
With the method and apparatus disclosed in the embodiments of the present invention, the corresponding scene correlation is obtained at each down-sampling stage, the initial depth map is extracted from the accumulated motion vectors, and the scene correlations of the different sampling stages are used to apply iterative super smoothing to the initial depth map, finally generating the source-image depth map. This improves the image quality of the depth map and makes its contours clearer, while keeping the computational cost of the whole process within a reasonable range.
Brief description of the drawings
Fig. 1 is a flow chart of prior-art depth map extraction;
Fig. 2 is a flow chart of the depth-map extraction and optimization method disclosed in an embodiment of the present invention;
Fig. 3 is a schematic diagram of a current source image according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of computing the scene correlation of an arbitrary pixel according to an embodiment of the present invention;
Fig. 5 is a schematic diagram of computing the scene correlation of an arbitrary pixel on the image boundary according to an embodiment of the present invention;
Fig. 6 is the initial depth map of the source image according to an embodiment of the present invention;
Fig. 7 is a schematic diagram of assigning weight coefficients to an arbitrary pixel according to an embodiment of the present invention;
Fig. 8 is a schematic diagram of assigning weight coefficients to an arbitrary pixel on the image boundary according to an embodiment of the present invention;
Fig. 9 is a structural diagram of the super smoothing filter according to an embodiment of the present invention;
Fig. 10 is the optimized depth map of the source image according to an embodiment of the present invention;
Fig. 11 is a diagram of the apparatus for extracting and optimizing an image depth map disclosed in an embodiment of the present invention.
Detailed description of the embodiments
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
To facilitate understanding of the embodiments of the present invention, further explanation is given below with reference to the accompanying drawings and specific embodiments; the embodiments do not limit the present invention.
The image-processing method disclosed in the embodiments of the present invention is described in detail below. Fig. 2 is a flow chart of the depth-map extraction and optimization method disclosed in an embodiment of the present invention.
As shown in Fig. 2, in an embodiment of the present invention a continuous current source image is first obtained; the current source image is a frame of a two-dimensional continuous frame sequence. The scene correlation of each pixel in the current source image is extracted; the current source image is then put through two successive 1/2 down-sampling operations, each applied horizontally and vertically, and after each 1/2 down-sampling the scene correlation of each pixel is extracted accordingly, giving the scene correlation of each pixel of the source image at each resolution stage.
After the two 1/2 down-samplings, block-matching motion-vector calculation is performed between the pixels of the 1/16-resolution source image and the corresponding pixels of the preceding 1/16-resolution source image, giving the motion vector value of each pixel in the current 1/16-resolution source image. These motion vector values are accumulated, and the initial depth value of each pixel is extracted from the accumulated motion vectors; the initial depth values of all pixels form the initial depth map. Because the resolution of this initial depth map is 1/16 of the source image, its edges are blurred and unclear, so the initial depth map must still undergo rigorous optimization.
Super smoothing filtering based on the 1/16-resolution scene correlation is applied to the 1/16-resolution initial depth map, with four filtering iterations. The iteratively filtered 1/16-resolution depth map is then up-sampled by a factor of two horizontally and vertically, giving a 1/4-resolution initial depth map; super smoothing filtering based on the 1/4-resolution scene correlation is applied to the 1/4-resolution depth map, with two iterations. The processed 1/4-resolution depth map is further up-sampled by a factor of two to the original resolution, and one more iteration of scene-correlation-based super smoothing filtering is applied, finally yielding the optimized depth map. The specific implementation steps are as follows:
Step 210: obtain a two-dimensional source image and extract the scene correlation from it.
Specifically, a continuous current source image is first obtained; the current source image is a frame of a two-dimensional continuous frame sequence, as shown in Fig. 3, which is a schematic diagram of a current source image according to an embodiment of the present invention.
Meanwhile, the scene correlation is extracted from the current source image.
In embodiments of the present invention, the scene correlation is the correlation between an arbitrary pixel in a frame (any chosen pixel, taken as the center pixel) and its surrounding neighbor pixels. The R (red), G (green) and B (blue) values at the center pixel are subtracted in turn from the R, G and B values at each surrounding neighbor pixel, and the absolute value of each difference is taken. If the absolute difference for the neighbor in some direction is no greater than a preset correlation threshold, the correlation flag bit for that direction in the correlation marker slot is set to 1; otherwise it is set to 0. The correlation marker slot buffer[] is an 8-bit-wide buffer; its 8 correlation flag bits, from lowest to highest, store in turn the correlation information of the 8 neighbors nearest the center pixel in clockwise order. A stored 1 means the center pixel is correlated with that neighbor; a stored 0 means it is not. One correlation marker slot corresponds to each pixel in the source image.
As shown in Fig. 4, a schematic diagram of computing the scene correlation of an arbitrary pixel according to an embodiment of the present invention: the red pixel is the chosen center pixel, whose coordinates are (x, y), and the pixels adjacent to it are its neighbor pixels. In Fig. 4 there are 8 neighbor pixels, numbered clockwise 0-7; the numbers correspond respectively to the lowest through highest bits of the scene-correlation marker slot.
The correlation between each neighbor pixel and the center pixel is judged by the foregoing method; for example, if the neighbor numbered 1 is correlated with the center pixel, a 1 is stored in correlation marker slot buffer[1]. The formula for the correlation marker slot buffer[m] is therefore:
\[
buffer[m] =
\begin{cases}
1, & |f(x,y)-f(x\pm k,\,y\pm k)|_R + |f(x,y)-f(x\pm k,\,y\pm k)|_G + |f(x,y)-f(x\pm k,\,y\pm k)|_B \le A \\
0, & |f(x,y)-f(x\pm k,\,y\pm k)|_R + |f(x,y)-f(x\pm k,\,y\pm k)|_G + |f(x,y)-f(x\pm k,\,y\pm k)|_B > A
\end{cases}
\tag{Formula 1}
\]
where k = 0, 1; 0 ≤ m ≤ 7; m ∈ Z; f(x, y) and f(x±k, y±k) are the R (red), G (green), B (blue) component values of the pixels; A is the scene-correlation threshold; buffer[m] is bit m of the correlation marker; and m is the neighbor index.
For illustration, consider computing the correlation of a pixel on the boundary, as shown in Fig. 5, a schematic diagram of computing the scene correlation of an arbitrary boundary pixel according to an embodiment of the present invention. The red pixel is the chosen center pixel with coordinates (x, y), and the adjacent pixels are its neighbors. In Fig. 5 the center pixel has x = 0 and y = 0, i.e. it lies in the first row and first column of the current source image. With the same numbering as before, only the correlations of the pixels numbered 2, 3 and 4 inside the dashed box need to be computed; the pixels numbered 0, 1, 5, 6 and 7 do not exist, so the correlation marker slots buffer[0], buffer[1], buffer[5], buffer[6] and buffer[7] are directly assigned 0. Pixels at other boundary positions are handled in the same way and are not described again.
Note that the correlation computation described above takes a single pixel as an example; in actual computation it is performed for every pixel. After every pixel in the source image has been computed, each has one correlation marker slot, and the correlation marker slots of all pixels constitute the scene correlation of the current source image.
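The per-pixel correlation marker just described can be sketched as follows. This is a minimal illustration of formula 1, not the patent's implementation; the patent fixes only a clockwise numbering of the 8 neighbors, so the particular starting neighbor (upper-left) chosen here is an assumption, and the function name is hypothetical.

```python
def scene_correlation_flags(img, x, y, threshold):
    """Compute the 8-bit correlation marker slot for pixel (x, y).

    img is a 2-D grid of (R, G, B) tuples. Bit m of the result is 1 when
    the sum of absolute R/G/B differences between the center pixel and
    its m-th clockwise neighbor is <= threshold (formula 1). Neighbors
    falling outside the image keep a 0 bit, matching the boundary
    handling described for Fig. 5.
    """
    h, w = len(img), len(img[0])
    # 8 neighbor offsets in clockwise order; starting at the upper-left
    # neighbor is an assumption (the patent only fixes clockwise order).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    buffer = 0
    for m, (dx, dy) in enumerate(offsets):
        nx, ny = x + dx, y + dy
        if 0 <= nx < h and 0 <= ny < w:
            diff = sum(abs(a - b) for a, b in zip(img[x][y], img[nx][ny]))
            if diff <= threshold:
                buffer |= 1 << m
    return buffer
```

For a corner pixel such as (0, 0), only three neighbors exist, so at most three bits of the marker can be set, exactly as in the Fig. 5 discussion.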
Step 220: down-sample the source image by 1/2 horizontally and vertically to obtain a 1/4-resolution source image, and extract the scene correlation from it.
Specifically, the obtained current source image is put through a 1/2 down-sampling operation, horizontal and vertical; after the 1/2 down-sampling, the scene correlation is extracted for each pixel of the 1/4-resolution source image, giving the 1/4-resolution source image and the scene correlation of each of its pixels.
At the current 1/4-resolution sampling stage, the scene correlation of each pixel in the 1/4-resolution source image is computed by the method described in step 210.
Note that down-sampling the source image is prior art and is not described further here.
Step 230: down-sample the 1/4-resolution source image again by 1/2 horizontally and vertically to obtain a 1/16-resolution source image, and extract the scene correlation from it.
Specifically, the obtained 1/4-resolution source image is put through a 1/2 down-sampling operation, horizontal and vertical; after the 1/2 down-sampling, the scene correlation is extracted from the 1/16-resolution source image, giving the 1/16-resolution source image and the scene correlation of each of its pixels.
At the current 1/16-resolution sampling stage, the scene correlation of each pixel in the 1/16-resolution source image is computed by the method described in step 210.
Note that down-sampling the 1/4-resolution source image is prior art and is not described further here.
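Since the patent treats the 1/2 down-sampling as prior art without fixing a filter, a minimal sketch using 2×2 box averaging on grayscale values can stand in for it; the averaging kernel and the function name are assumptions for illustration only.

```python
def downsample_half(img):
    """Halve resolution horizontally and vertically by averaging each
    2x2 block of grayscale values. Applying this twice to a source
    image yields the 1/16-resolution image used in steps 230-250."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * i][2 * j] + img[2 * i][2 * j + 1] +
              img[2 * i + 1][2 * j] + img[2 * i + 1][2 * j + 1]) // 4
             for j in range(w)] for i in range(h)]
```

Two applications of this function reduce an image to 1/16 of its original pixel count, matching the two-stage pyramid of steps 220 and 230.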
Step 240: perform block-matching motion-vector calculation between the current 1/16-resolution source image and the preceding 1/16-resolution source image.
Specifically, the 1/16-resolution source image produced by the down-sampling in step 230 is block-matched against the preceding 1/16-resolution source image to obtain the motion vector value of each pixel in the current 1/16-resolution source image, and the obtained motion vector values are accumulated.
Note that motion-vector accumulation based on block matching is also prior art and is not described further here.
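The patent likewise treats block matching as prior art without fixing a cost function; one common choice is an exhaustive search minimizing the sum of absolute differences (SAD), sketched below under that assumption (block size, search radius, and all names are illustrative).

```python
def block_match_vector(curr, prev, bx, by, bsize=4, radius=2):
    """Find the motion vector of the block whose top-left corner is
    (bx, by) in curr, by exhaustively searching prev within +/-radius
    in each direction and minimizing the SAD. curr and prev are 2-D
    grids of grayscale values; (bx, by) must leave a full block inside
    curr. Returns the (row, col) displacement of the best match."""
    h, w = len(prev), len(prev[0])
    best = None
    for dx in range(-radius, radius + 1):
        for dy in range(-radius, radius + 1):
            # Skip candidate positions that fall outside prev.
            if not (0 <= bx + dx and bx + dx + bsize <= h and
                    0 <= by + dy and by + dy + bsize <= w):
                continue
            sad = sum(abs(curr[bx + i][by + j] -
                          prev[bx + dx + i][by + dy + j])
                      for i in range(bsize) for j in range(bsize))
            if best is None or sad < best[0]:
                best = (sad, dx, dy)
    return best[1], best[2]
```

In the pipeline, each block's vector modulus would then be mapped to a gray value as in step 250 and accumulated per pixel.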
Step 250: extract initial depth values from the accumulated motion vectors to form the initial depth map.
Specifically, following step 240, the motion vector value of each pixel in the current 1/16-resolution source image is accumulated, and the initial depth value of each pixel is extracted from the accumulated motion vectors. Initial depth values are extracted for all pixels in the 1/16-resolution source image, and together they form the initial depth map of the 1/16-resolution source image.
For illustration, consider extracting the initial depth value of each pixel from the accumulated motion vectors.
Assume that the maximum displacement of a moving object between two consecutive source images is 3.5% of the current source image's width, and that the gray value representing this maximum motion vector value is 255. The gray value represented by a unit pixel displacement is then given by:
\[
I = \frac{255}{W \times 3.5\%}
\tag{Formula 2}
\]
where W is the width of the image.
Suppose the computed moduli of the image-block motion vectors are the values shown in Table 1; by way of example, 9 motion vector moduli are listed.
Table 1: motion vector moduli
a b c
d e f
g h k
The gray values corresponding to the above 9 motion vector moduli are then:
Table 2: motion vector gray values
a*I b*I c*I
d*I e*I f*I
g*I h*I k*I
As stated above, the source image obtained is a frame of a two-dimensional video frame sequence, and the depth information of each pixel is extracted from that frame sequence. So that a pixel's depth value persists after the pixel stops moving, and its motion information can be retrieved at any time, the depth value of each pixel is stored in a depth buffer. Otherwise, once a pixel stops moving in the current source image, its motion vector in subsequent source images will be zero, and computing depth information directly from the current pixel's motion vector value would give an erroneous result. The depth buffer therefore holds the accumulated value of each pixel's previous depth information; because the depth buffer has a maximum, the maximum of the depth buffer is controlled to be D_total.
Since the gray value of each pixel has been obtained above, let the sum of all gray values of the current depth map be D_new, and let the sum of all gray values of all previous depth maps in the depth buffer be D_acc. If D_new were simply added to D_acc, the sum could eventually exceed the depth buffer's maximum D_total and overflow, losing pixel depth information. Therefore, in embodiments of the present invention:
\[
\text{if } D_{new} + D_{acc} < D_{total}: \quad D_{acc\_depth}(x,y) = D_{acc\_depth}(x,y) + D_{new\_depth}(x,y)
\tag{Formula 3}
\]
\[
\text{if } D_{new} + D_{acc} > D_{total}: \quad D_{acc\_depth}(x,y) = D_{acc\_depth}(x,y) \cdot \frac{D_{total} - D_{new}}{D_{acc}}
\tag{Formula 4}
\]
\[
\text{if } \frac{D_{total} - D_{new}}{D_{acc}} < 0, \text{ then } \frac{D_{total} - D_{new}}{D_{acc}} = 0
\tag{Formula 5}
\]
\[
\text{if } \frac{D_{total} - D_{new}}{D_{acc}} > 1, \text{ then } \frac{D_{total} - D_{new}}{D_{acc}} = 1
\tag{Formula 6}
\]
where 0 ≤ x ≤ h−1 and 0 ≤ y ≤ w−1; h and w are respectively the height and width of the down-sampled 1/16-resolution source image; D_acc_depth(x, y) denotes the previously accumulated motion-vector gray value of each pixel, and D_new_depth(x, y) denotes the current motion-vector gray value of each pixel.
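The overflow-controlled accumulation of formulas 3-6 can be sketched as follows; this follows the scheme as stated in the text (plain addition with headroom, otherwise a clamped rescale of the old accumulator) and is an illustration, not the patent's exact implementation.

```python
def accumulate_depth(acc, new, d_total):
    """Accumulate the current frame's motion-vector gray values into the
    per-pixel depth buffer. acc and new are same-sized 2-D lists of
    gray values; d_total is the buffer's maximum total (D_total)."""
    d_new = sum(map(sum, new))  # total gray value of current depth map
    d_acc = sum(map(sum, acc))  # total previously accumulated gray value
    if d_new + d_acc < d_total:
        # Headroom available: plain per-pixel accumulation (formula 3).
        return [[a + n for a, n in zip(ra, rn)]
                for ra, rn in zip(acc, new)]
    # Otherwise rescale the old accumulator by (d_total - d_new) / d_acc,
    # clamped to [0, 1] (formulas 4-6), to prevent overflow.
    scale = (d_total - d_new) / d_acc
    scale = min(1.0, max(0.0, scale))
    return [[a * scale for a in row] for row in acc]
```

The clamp guarantees the accumulator never grows when the buffer would overflow and never goes negative when the current frame alone exceeds the budget.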
Note that the extraction of initial depth values from accumulated motion vectors described above uses a few pixels as examples; in actual computation it is performed for every pixel. After the initial depth value of every pixel in the 1/16-resolution source image has been extracted, they form the initial depth map of the 1/16-resolution source image, as shown in Fig. 6, which is the initial depth map of the source image according to an embodiment of the present invention.
Step 260: apply super smoothing filtering and up-sampling to the 1/16-resolution initial depth map.
Specifically, as can be seen from the initial depth map of Fig. 6, the block-matching motion-vector calculation in step 250 introduces relatively large errors, leaving the contours of the extracted initial depth map blurred; in this step the 1/16-resolution initial depth map therefore undergoes rigorous optimization.
Per step 230, the scene correlation of the 1/16-resolution source image has already been obtained, so super smoothing filtering based on that scene correlation is applied to the 1/16-resolution initial depth map, with four filtering iterations. The iteratively filtered 1/16-resolution initial depth map is then up-sampled by a factor of two horizontally and vertically, giving a 1/4-resolution initial depth map.
In embodiments of the present invention, following step 230, the scene correlation of each 1/16-resolution source-image pixel has been obtained, and the 1/16-resolution initial depth map is optimized according to it. The pixels adjacent to the center pixel were defined in the correlation computation; in the super smoothing filtering of this step, each neighbor pixel is additionally assigned a different weight coefficient, as shown in Fig. 7, a schematic diagram of assigning weight coefficients to an arbitrary pixel according to an embodiment of the present invention.
The weight coefficients of the neighbor pixels serve as the filter tap coefficients of the super smoothing filter. The super smoothing filter is a low-pass filter; because it has 8 tap coefficients, regularly distributed over the 8 directions around the center pixel, its filtering performance is high, and it can effectively smooth the sharp, uneven high-frequency noise and high-frequency components in the initial depth map.
When the chosen center pixel is not on the boundary of the initial depth map, the correlation between the center pixel and its 8 neighbor pixels is read from the center pixel's correlation marker slot. If the correlation between a neighbor pixel and the center pixel is 1, that neighbor's gray value is multiplied by its weight coefficient; if the correlation is 0, the center pixel's gray value is multiplied by that neighbor's weight coefficient instead. Finally, the 8 products are summed to give the smoothed result for the 1/16-resolution initial depth map.
Fig. 8 is a schematic diagram of assigning weight coefficients to an arbitrary boundary pixel according to an embodiment of the present invention. As shown in Fig. 8, the red pixel is the chosen center pixel with coordinates (x, y), and the adjacent pixels are its neighbors. In Fig. 8 the center pixel has x = 0 and y = 0, i.e. it lies in the first row and first column of the current source image. With the same numbering as before, filtering only needs the correlations of the pixels numbered 2, 3 and 4 inside the dashed box; the pixels numbered 0, 1, 5, 6 and 7 do not exist, so the values of correlation marker slots buffer[0], buffer[1], buffer[5], buffer[6] and buffer[7] are 0. Pixels at other boundary positions are handled in the same way and are not described again.
Fig. 9 is a structural diagram of the super smoothing filter according to an embodiment of the present invention. As shown in Fig. 9, coef0-coef7 are the weight coefficients of the neighbor pixels, i.e. the tap coefficients of the super smoothing filter. In embodiments of the present invention, coef0 = coef2 = coef4 = coef6 = 1/6 and coef1 = coef3 = coef5 = coef7 = 1/12. When setting the weight coefficients, the weights of the 8 neighbor pixels must sum to 1, i.e. coef0 + coef1 + coef2 + coef3 + coef4 + coef5 + coef6 + coef7 = 1.
The formula for filtering an arbitrary pixel is therefore:
\[
f(x,y) = \left[ \frac{1}{6}\sum_{n=0}^{3} \left[\,{\sim}buffer[2n]\,\right] + \frac{1}{12}\sum_{n=0}^{3} \left[\,{\sim}buffer[2n+1]\,\right] \right] \cdot f(x,y) + \frac{1}{6}\sum_{n=0}^{3} \left[ buffer[2n] \cdot f(2n) \right] + \frac{1}{12}\sum_{n=0}^{3} \left[ buffer[2n+1] \cdot f(2n+1) \right]
\tag{Formula 7}
\]
where n ∈ Z, n = 0, 1, 2, 3; buffer[] is the scene correlation of the neighbor pixels; ~buffer[] is the negation of the scene correlation of the neighbor pixels; f(x, y) is the gray value at the center pixel (x, y); and f(2n), f(2n+1) are the gray values at the correspondingly numbered neighbor pixels.
After the four iterations of super smoothing filtering on the 1/16-resolution initial depth map are complete, each pixel in the filtered 1/16-resolution initial depth map is up-sampled by a factor of two horizontally and vertically, giving the depth value of each 1/4-resolution pixel; these depth values form the 1/4-resolution depth map.
Note that the correlation-based filtering of the initial depth map described above takes a single pixel as an example; in actual computation every pixel is filtered.
Up-sampling iteration being surpassed to 1/16 complete resolution ID figure of the disposal of gentle filter is treated to prior art, does not repeat them here.
Step 270, apply super-smoothing filtering and up-sampling to the 1/4-resolution depth map;
Specifically, the 1/4-resolution depth map was obtained in step 260, and the scene correlation of the 1/4-resolution source image was obtained in step 220. The 1/4-resolution depth map is therefore super-smoothed according to that scene correlation, with two filtering iterations; the filtered 1/4-resolution depth map is then up-sampled by a factor of two horizontally and vertically, yielding the depth value of each original-resolution pixel; these depth values form the original-resolution depth map, i.e. the initial depth map at the original resolution;
The filtering in this step applies super-smoothing to the 1/4-resolution depth map by the method described in step 260.
Step 280, obtain the optimized depth map;
Specifically, the original-resolution depth map was obtained in step 270, and the scene correlation of the source image was obtained in step 210. One iteration of super-smoothing filtering is therefore applied to the original-resolution depth map according to that scene correlation, finally yielding the source-image depth map.
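The coarse-to-fine schedule of steps 260–280 — four filter iterations at 1/16 resolution, up-sample, two iterations at 1/4 resolution, up-sample, one final iteration at full resolution — can be sketched as follows. `smooth` and `upsample2x` are hypothetical stand-ins for the super-smoothing filter and the horizontal-and-vertical 2× up-sampler, which the patent treats as prior art.

```python
def refine_depth(depth_1_16, corr_1_16, corr_1_4, corr_full, smooth, upsample2x):
    """Coarse-to-fine depth refinement per steps 260-280.

    depth_1_16 -- initial depth map at 1/16 resolution
    corr_*     -- scene correlations obtained at each down-sampling stage
    """
    d = depth_1_16
    for _ in range(4):                 # step 260: 4 iterations at 1/16 resolution
        d = smooth(d, corr_1_16)
    d = upsample2x(d)                  # 1/16 -> 1/4 resolution
    for _ in range(2):                 # step 270: 2 iterations at 1/4 resolution
        d = smooth(d, corr_1_4)
    d = upsample2x(d)                  # 1/4 -> original resolution
    return smooth(d, corr_full)        # step 280: 1 final iteration
```

Most of the filtering work happens at the coarsest level, which is why the overall computational cost stays within a reasonable range.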
Fig. 10 shows the depth map of the source image after optimization. Compared with the initial depth map of the source image in Fig. 6 — which has lower resolution, fewer pixels and indistinct contours — the depth map after the several iterations of super-smoothing filtering and up-sampling has higher resolution, more pixels and clearer contours, improving the picture quality of the depth map.
With the method disclosed in this embodiment of the present invention, the scene correlation is obtained at each down-sampling stage, the initial depth map is extracted from the accumulated motion vectors, and the scene correlations of the different down-sampling stages are used to iteratively super-smooth and up-sample the initial depth map, finally producing the source-image depth map. This improves the picture quality of the depth map and makes its contours clearer, while keeping the computational cost of the whole process within a reasonable range.
The above embodiments describe a method of extracting and optimizing an image depth map; correspondingly, it can also be realized by an image-processing device. Fig. 11 is the diagram of the device for extracting and optimizing an image depth map disclosed in the embodiment of the present invention. The device comprises: a first acquiring unit 1110, for obtaining the current source image and the scene correlation of each pixel in the current source image, the current source image being a sequence of consecutive frames of the current video;
a second acquiring unit 1120, for continuously down-sampling the current source image and obtaining the scene correlation of each pixel in each current down-sampled source image;
a third acquiring unit 1130, for performing block-matching motion vector computation between each pixel in the current down-sampled source image and the corresponding pixel in the previous down-sampled source image, obtaining the motion vector value of each pixel in the current down-sampled source image;
a computing unit 1140, for respectively accumulating the motion vector values of the pixels in the current down-sampled source image and extracting the initial depth value of each pixel from the accumulated motion vectors, the initial depth values forming the initial depth map of the source image;
a first processing unit 1150, for using the scene correlation of each pixel in the source image and in each down-sampled source image to continuously super-smooth and up-sample each pixel of the initial depth map, obtaining the source-image depth map.
In the device, the first acquiring unit 1110 is specifically configured to: select an arbitrary pixel as the central pixel, and number the pixels adjacent to the central pixel;
obtain the differences between the red R, green G and blue B component values of the central pixel and those of each neighbor pixel, and take the absolute value of each difference;
compare the absolute value with the scene-correlation threshold; if the absolute value is less than the threshold, set the scene correlation of the neighbor pixel to 1 (correlated), otherwise set it to 0 (uncorrelated);
store the scene correlation of the neighbor pixels in a buffer, the buffer being specifically
buffer[m] = 1, if |f(x,y) − f(x±k, y±k)|_R + |f(x,y) − f(x±k, y±k)|_G + |f(x,y) − f(x±k, y±k)|_B ≤ A; 0, if |f(x,y) − f(x±k, y±k)|_R + |f(x,y) − f(x±k, y±k)|_G + |f(x,y) − f(x±k, y±k)|_B > A;
where k = 0, 1; 0 ≤ m ≤ 7; m ∈ Z; f(x, y) and f(x±k, y±k) are the values of the red R, green G and blue B components of the pixels; A is the scene-correlation threshold; buffer[m] is bit m of the correlation; and m is the neighbor pixel number.
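A minimal sketch of this correlation test follows. The neighbor numbering and the threshold value A = 30 are illustrative assumptions (the patent fixes neither here); out-of-bounds neighbors at image borders are flagged 0, matching the handling described for Fig. 8.

```python
# Assumed numbering of the 8 neighbors, matching the filter sketch above.
OFFSETS = [(-1, 0), (-1, 1), (0, 1), (1, 1),
           (1, 0), (1, -1), (0, -1), (-1, -1)]

def scene_correlation(img, x, y, A=30):
    """Compute buffer[0..7] for the pixel at (x, y).

    img -- list of rows; img[y][x] is an (R, G, B) tuple
    A   -- scene-correlation threshold (illustrative value)
    Returns 8 flags: 1 = correlated, 0 = uncorrelated or out of bounds.
    """
    h, w = len(img), len(img[0])
    buf = []
    for dy, dx in OFFSETS:
        ny, nx = y + dy, x + dx
        if not (0 <= ny < h and 0 <= nx < w):
            buf.append(0)                      # border: neighbor does not exist
            continue
        # Sum of absolute R, G, B differences against the center pixel.
        sad = sum(abs(a - b) for a, b in zip(img[y][x], img[ny][nx]))
        buf.append(1 if sad <= A else 0)
    return buf
```

The flags are then consumed by the super-smoothing filter: a 1 admits the neighbor's depth value, a 0 substitutes the center's own value.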
In the device, the second acquiring unit 1120 is specifically configured to: down-sample the current source image by 1/2 horizontally and vertically, obtaining the current 1/4-resolution source image and the scene correlation of each pixel in it;
down-sample the current 1/4-resolution source image by 1/2 horizontally and vertically, obtaining the current 1/16-resolution source image and the scene correlation of each pixel in it.
In the device, the third acquiring unit 1130 is specifically configured to: perform block-matching motion vector computation between each pixel in the current 1/16-resolution source image and the corresponding pixel in the previous 1/16-resolution source image, obtaining the motion vector value of each pixel in the current 1/16-resolution source image;
respectively accumulate the motion vector values of the pixels in the current 1/16-resolution source image.
In the device, the computing unit 1140 is specifically configured to: obtain the motion vector modulus of each pixel's displacement and the unit pixel-displacement gray value, W being the width of the image;
multiply the motion vector modulus by the unit pixel-displacement gray value to obtain the motion vector gray value of each pixel;
store the accumulated sum of the motion vector gray values of each pixel in a depth buffer;
if D_new + D_acc < D_total, then D_acc_depth(x, y) = D_acc_depth(x, y) + D_new_depth(x, y);
if D_new + D_acc > D_total, then D_acc_depth(x, y) = D_acc_depth(x, y) · (D_total − D_new) / D_acc;
if (D_total − D_new) / D_acc < 0, then (D_total − D_new) / D_acc = 0;
if (D_total − D_new) / D_acc > 1, then (D_total − D_new) / D_acc = 1;
where (x, y) is the coordinate of each pixel; D_acc_depth(x, y) is the previous accumulated sum of motion vector gray values at the pixel; D_new_depth(x, y) is the pixel's current motion vector gray value; D_total is the maximum of the depth buffer's accumulated sum; D_new is the sum of all gray values of the current depth map; and D_acc is the sum of all gray values of all previous depth maps in the depth buffer.
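A sketch of the computing unit's two operations follows. The unit pixel-displacement gray-value formula did not survive extraction in this text, so `max_gray / width` below is only an assumption consistent with "W is the width of the image"; the saturating branch follows the formulas above literally (as written, it rescales the old accumulation only).

```python
def motion_vector_gray(mv_x, mv_y, width, max_gray=255.0):
    """Map a motion vector to a gray value.

    max_gray / width is an ASSUMED unit pixel-displacement gray value;
    the actual formula is not given in this text.
    """
    modulus = (mv_x ** 2 + mv_y ** 2) ** 0.5
    return modulus * (max_gray / width)

def accumulate_depth(acc_depth, new_depth, d_acc, d_new, d_total):
    """Saturating accumulation of motion-vector gray values (updated in place).

    acc_depth -- 2-D list, D_acc_depth(x, y)
    new_depth -- 2-D list, D_new_depth(x, y)
    d_acc     -- sum of gray values of all previous depth maps in the buffer
    d_new     -- sum of gray values of the current depth map
    d_total   -- maximum of the depth buffer's accumulated sum
    """
    if d_new + d_acc < d_total:
        for y, row in enumerate(new_depth):        # room left: add the new map
            for x, v in enumerate(row):
                acc_depth[y][x] += v
    else:
        # Would overflow: rescale the old accumulation so the total stays
        # bounded; the scale factor is clamped to [0, 1] per the formulas.
        scale = min(max((d_total - d_new) / d_acc, 0.0), 1.0)
        for y, row in enumerate(acc_depth):
            for x, v in enumerate(row):
                acc_depth[y][x] = v * scale
    return acc_depth
```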
In the device, the first processing unit 1150 is specifically configured to: assign a weight coefficient to each neighbor pixel, the weight coefficient being a tap coefficient of the super-smoothing filter;
retrieve from the buffer the stored scene correlation of the neighbor pixels;
if the scene correlation of a neighbor pixel is 1 (correlated), multiply the neighbor pixel's gray value by the weight coefficient assigned to it;
if the scene correlation of a neighbor pixel is 0 (uncorrelated), multiply the central pixel's gray value by the weight coefficient assigned to the neighbor pixel;
accumulate the products of the neighbor pixel gray values with their assigned weight coefficients, and the products of the central pixel gray value with the weight coefficients assigned to the neighbor pixels;
f(x, y) = [ (1/6)·Σ_{n=0}^{3} ~buffer[2n] + (1/12)·Σ_{n=0}^{3} ~buffer[2n+1] ]·f(x, y) + (1/6)·Σ_{n=0}^{3} buffer[2n]·f(2n) + (1/12)·Σ_{n=0}^{3} buffer[2n+1]·f(2n+1)
where n ∈ Z, n = 0, 1, 2, 3; buffer[] is the scene correlation of a neighbor pixel; ~buffer[] is the negation of that scene correlation; and f(x, y) is the gray value at the central pixel (x, y).
In the device, the first processing unit 1150 is further specifically configured to: perform four iterations of super-smoothing filtering on the initial depth map, then up-sample the four-times-filtered depth map by a factor of two horizontally and vertically;
obtain the depth value of each quarter-resolution pixel, the depth values forming the 1/4-resolution depth map;
perform two iterations of super-smoothing filtering on the 1/4-resolution depth map, then up-sample the twice-filtered depth map by a factor of two horizontally and vertically;
obtain the depth value of each original-resolution pixel, the depth values forming the original-resolution depth map;
perform one iteration of super-smoothing filtering on the original-resolution depth map, obtaining the source-image depth map.
In the device, assigning a weight coefficient to each neighbor pixel is specifically: the weight coefficients of the neighbor pixels sum to 1.
With the device disclosed in this embodiment of the present invention, the scene correlation is obtained at each down-sampling stage, the initial depth map is extracted from the accumulated motion vectors, and the scene correlations of the different down-sampling stages are used to iteratively super-smooth and up-sample the initial depth map, finally producing the source-image depth map. This improves the picture quality of the depth map and makes its contours clearer, while keeping the computational cost of the whole process within a reasonable range.
Those skilled in the art will further recognize that the units and algorithm steps of the examples described in the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are performed in hardware or software depends on the particular application and the design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of the embodiments of the present invention.
The steps of the method or algorithm described in connection with the embodiments disclosed herein can be implemented in hardware, in a software module executed by a processor, or in a combination of the two. The software module can reside in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.
The above embodiments further describe the objects, technical solutions and beneficial effects of the embodiments of the present invention. It should be understood that the foregoing are only embodiments of the present invention and are not intended to limit their protection scope; any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the embodiments of the present invention shall fall within their protection scope.

Claims (16)

1. A method of extracting and optimizing an image depth map, characterized in that the method comprises:
obtaining the current source image and the scene correlation of each pixel in the current source image, the current source image being a sequence of consecutive frames of the current video;
continuously down-sampling the current source image and obtaining the scene correlation of each pixel in each current down-sampled source image;
performing block-matching motion vector computation between each pixel in the current down-sampled source image and the corresponding pixel in the previous down-sampled source image, obtaining the motion vector value of each pixel in the current down-sampled source image;
respectively accumulating the motion vector values of the pixels in the current down-sampled source image and extracting the initial depth value of each pixel from the accumulated motion vectors, the initial depth values forming the initial depth map of the source image;
using the scene correlation of each pixel in the source image and in each down-sampled source image to continuously super-smooth and up-sample each pixel in the initial depth map, obtaining the source-image depth map.
2. The method of extracting and optimizing an image depth map according to claim 1, characterized in that obtaining the current source image and the scene correlation of each pixel in the current source image is specifically:
selecting an arbitrary pixel as the central pixel and numbering the pixels adjacent to the central pixel;
obtaining the differences between the red R, green G and blue B component values of the central pixel and those of each neighbor pixel, and taking the absolute value of each difference;
comparing the absolute value with the scene-correlation threshold; if the absolute value is less than the threshold, setting the scene correlation of the neighbor pixel to 1 (correlated), otherwise setting it to 0 (uncorrelated);
storing the scene correlation of the neighbor pixels in a buffer, the buffer being specifically
buffer[m] = 1, if |f(x,y) − f(x±k, y±k)|_R + |f(x,y) − f(x±k, y±k)|_G + |f(x,y) − f(x±k, y±k)|_B ≤ A; 0, if |f(x,y) − f(x±k, y±k)|_R + |f(x,y) − f(x±k, y±k)|_G + |f(x,y) − f(x±k, y±k)|_B > A; where k = 0, 1; 0 ≤ m ≤ 7; m ∈ Z; f(x, y) and f(x±k, y±k) are the values of the red R, green G and blue B components of the pixels; A is the scene-correlation threshold; buffer[m] is bit m of the correlation; and m is the neighbor pixel number.
3. The method of extracting and optimizing an image depth map according to claim 1, characterized in that continuously down-sampling the current source image and obtaining the scene correlation of each pixel in each current down-sampled source image is specifically:
down-sampling the current source image by 1/2 horizontally and vertically, obtaining the current 1/4-resolution source image and the scene correlation of each pixel in it;
down-sampling the current 1/4-resolution source image by 1/2 horizontally and vertically, obtaining the current 1/16-resolution source image and the scene correlation of each pixel in it.
4. The method of extracting and optimizing an image depth map according to claim 3, characterized in that performing block-matching motion vector computation between each pixel in the current down-sampled source image and the corresponding pixel in the previous down-sampled source image, obtaining the motion vector value of each pixel in the current down-sampled source image, is specifically:
performing block-matching motion vector computation between each pixel in the current 1/16-resolution source image and the corresponding pixel in the previous 1/16-resolution source image, obtaining the motion vector value of each pixel in the current 1/16-resolution source image;
respectively accumulating the motion vector values of the pixels in the current 1/16-resolution source image.
5. The method of extracting and optimizing an image depth map according to claim 1, characterized in that extracting the initial depth value of each pixel from the accumulated motion vectors is specifically:
obtaining the motion vector modulus of each pixel's displacement and the unit pixel-displacement gray value, W being the width of the image;
multiplying the motion vector modulus by the unit pixel-displacement gray value to obtain the motion vector gray value of each pixel;
storing the accumulated sum of the motion vector gray values of each pixel in a depth buffer;
if D_new + D_acc < D_total, then D_acc_depth(x, y) = D_acc_depth(x, y) + D_new_depth(x, y);
if D_new + D_acc > D_total, then D_acc_depth(x, y) = D_acc_depth(x, y) · (D_total − D_new) / D_acc;
if (D_total − D_new) / D_acc < 0, then (D_total − D_new) / D_acc = 0;
if (D_total − D_new) / D_acc > 1, then (D_total − D_new) / D_acc = 1;
where (x, y) is the coordinate of each pixel; D_acc_depth(x, y) is the previous accumulated sum of motion vector gray values at the pixel; D_new_depth(x, y) is the pixel's current motion vector gray value; D_total is the maximum of the depth buffer's accumulated sum; D_new is the sum of all gray values of the current depth map; and D_acc is the sum of all gray values of all previous depth maps in the depth buffer.
6. The method of extracting and optimizing an image depth map according to claim 2, characterized in that using the scene correlation of each pixel in the source image and in each down-sampled source image to continuously super-smooth each pixel in the initial depth map is specifically:
assigning a weight coefficient to each neighbor pixel, the weight coefficient being a tap coefficient of the super-smoothing filter;
retrieving from the buffer the stored scene correlation of the neighbor pixels;
if the scene correlation of a neighbor pixel is 1 (correlated), multiplying the neighbor pixel's gray value by the weight coefficient assigned to it;
if the scene correlation of a neighbor pixel is 0 (uncorrelated), multiplying the central pixel's gray value by the weight coefficient assigned to the neighbor pixel;
accumulating the products of the neighbor pixel gray values with their assigned weight coefficients, and the products of the central pixel gray value with the weight coefficients assigned to the neighbor pixels;
f(x, y) = [ (1/6)·Σ_{n=0}^{3} ~buffer[2n] + (1/12)·Σ_{n=0}^{3} ~buffer[2n+1] ]·f(x, y) + (1/6)·Σ_{n=0}^{3} buffer[2n]·f(2n) + (1/12)·Σ_{n=0}^{3} buffer[2n+1]·f(2n+1)
where n ∈ Z, n = 0, 1, 2, 3; buffer[] is the scene correlation of a neighbor pixel; ~buffer[] is the negation of that scene correlation; and f(x, y) is the gray value at the central pixel (x, y).
7. The method of extracting and optimizing an image depth map according to claim 6, characterized in that continuously up-sampling each pixel in the initial depth map is specifically:
performing four iterations of super-smoothing filtering on the initial depth map, then up-sampling the four-times-filtered depth map by a factor of two horizontally and vertically;
obtaining the depth value of each quarter-resolution pixel, the depth values forming the 1/4-resolution depth map;
performing two iterations of super-smoothing filtering on the 1/4-resolution depth map, then up-sampling the twice-filtered depth map by a factor of two horizontally and vertically;
obtaining the depth value of each original-resolution pixel, the depth values forming the original-resolution depth map;
performing one iteration of super-smoothing filtering on the original-resolution depth map, obtaining the source-image depth map.
8. The method of extracting and optimizing an image depth map according to claim 6, characterized in that assigning a weight coefficient to each neighbor pixel is specifically: the weight coefficients of the neighbor pixels sum to 1.
9. A device for extracting and optimizing an image depth map, characterized in that the device comprises:
a first acquiring unit, for obtaining the current source image and the scene correlation of each pixel in the current source image, the current source image being a sequence of consecutive frames of the current video;
a second acquiring unit, for continuously down-sampling the current source image and obtaining the scene correlation of each pixel in each current down-sampled source image;
a third acquiring unit, for performing block-matching motion vector computation between each pixel in the current down-sampled source image and the corresponding pixel in the previous down-sampled source image, obtaining the motion vector value of each pixel in the current down-sampled source image;
a computing unit, for respectively accumulating the motion vector values of the pixels in the current down-sampled source image and extracting the initial depth value of each pixel from the accumulated motion vectors, the initial depth values forming the initial depth map of the source image;
a first processing unit, for using the scene correlation of each pixel in the source image and in each down-sampled source image to continuously super-smooth and up-sample each pixel of the initial depth map, obtaining the source-image depth map.
10. The device for extracting and optimizing an image depth map according to claim 9, characterized in that the first acquiring unit is specifically configured to:
select an arbitrary pixel as the central pixel, and number the pixels adjacent to the central pixel;
obtain the differences between the red R, green G and blue B component values of the central pixel and those of each neighbor pixel, and take the absolute value of each difference;
compare the absolute value with the scene-correlation threshold; if the absolute value is less than the threshold, set the scene correlation of the neighbor pixel to 1 (correlated), otherwise set it to 0 (uncorrelated);
store the scene correlation of the neighbor pixels in a buffer, the buffer being specifically
buffer[m] = 1, if |f(x,y) − f(x±k, y±k)|_R + |f(x,y) − f(x±k, y±k)|_G + |f(x,y) − f(x±k, y±k)|_B ≤ A; 0, if |f(x,y) − f(x±k, y±k)|_R + |f(x,y) − f(x±k, y±k)|_G + |f(x,y) − f(x±k, y±k)|_B > A;
where k = 0, 1; 0 ≤ m ≤ 7; m ∈ Z; f(x, y) and f(x±k, y±k) are the values of the red R, green G and blue B components of the pixels; A is the scene-correlation threshold; buffer[m] is bit m of the correlation; and m is the neighbor pixel number.
11. The device for extracting and optimizing an image depth map according to claim 9, characterized in that the second acquiring unit is specifically configured to:
down-sample the current source image by 1/2 horizontally and vertically, obtaining the current 1/4-resolution source image and the scene correlation of each pixel in it;
down-sample the current 1/4-resolution source image by 1/2 horizontally and vertically, obtaining the current 1/16-resolution source image and the scene correlation of each pixel in it.
12. The device for extracting and optimizing an image depth map according to claim 11, characterized in that the third acquiring unit is specifically configured to:
perform block-matching motion vector computation between each pixel in the current 1/16-resolution source image and the corresponding pixel in the previous 1/16-resolution source image, obtaining the motion vector value of each pixel in the current 1/16-resolution source image;
respectively accumulate the motion vector values of the pixels in the current 1/16-resolution source image.
13. The device for extracting and optimizing an image depth map according to claim 9, characterized in that the computing unit is specifically configured to:
obtain the motion vector modulus of each pixel's displacement and the unit pixel-displacement gray value, W being the width of the image;
multiply the motion vector modulus by the unit pixel-displacement gray value to obtain the motion vector gray value of each pixel;
store the accumulated sum of the motion vector gray values of each pixel in a depth buffer;
if D_new + D_acc < D_total, then D_acc_depth(x, y) = D_acc_depth(x, y) + D_new_depth(x, y);
if D_new + D_acc > D_total, then D_acc_depth(x, y) = D_acc_depth(x, y) · (D_total − D_new) / D_acc;
if (D_total − D_new) / D_acc < 0, then (D_total − D_new) / D_acc = 0;
if (D_total − D_new) / D_acc > 1, then (D_total − D_new) / D_acc = 1;
where (x, y) is the coordinate of each pixel; D_acc_depth(x, y) is the previous accumulated sum of motion vector gray values at the pixel; D_new_depth(x, y) is the pixel's current motion vector gray value; D_total is the maximum of the depth buffer's accumulated sum; D_new is the sum of all gray values of the current depth map; and D_acc is the sum of all gray values of all previous depth maps in the depth buffer.
14. The device for extracting and optimizing an image depth map according to claim 10, characterized in that the first processing unit is specifically configured to:
assign a weight coefficient to each neighbor pixel, the weight coefficient being a tap coefficient of the super-smoothing filter;
retrieve from the buffer the stored scene correlation of the neighbor pixels;
if the scene correlation of a neighbor pixel is 1 (correlated), multiply the neighbor pixel's gray value by the weight coefficient assigned to it;
if the scene correlation of a neighbor pixel is 0 (uncorrelated), multiply the central pixel's gray value by the weight coefficient assigned to the neighbor pixel;
accumulate the products of the neighbor pixel gray values with their assigned weight coefficients, and the products of the central pixel gray value with the weight coefficients assigned to the neighbor pixels;
f(x, y) = [ (1/6)·Σ_{n=0}^{3} ~buffer[2n] + (1/12)·Σ_{n=0}^{3} ~buffer[2n+1] ]·f(x, y) + (1/6)·Σ_{n=0}^{3} buffer[2n]·f(2n) + (1/12)·Σ_{n=0}^{3} buffer[2n+1]·f(2n+1)
where n ∈ Z, n = 0, 1, 2, 3; buffer[] is the scene correlation of a neighbor pixel; ~buffer[] is the negation of that scene correlation; and f(x, y) is the gray value at the central pixel (x, y).
15. The device for extracting and optimizing an image depth map according to claim 14, characterized in that the first processing unit is further specifically configured to:
perform four iterations of super-smoothing filtering on the initial depth map, then up-sample the four-times-filtered depth map by a factor of two horizontally and vertically;
obtain the depth value of each quarter-resolution pixel, the depth values forming the 1/4-resolution depth map;
perform two iterations of super-smoothing filtering on the 1/4-resolution depth map, then up-sample the twice-filtered depth map by a factor of two horizontally and vertically;
obtain the depth value of each original-resolution pixel, the depth values forming the original-resolution depth map;
perform one iteration of super-smoothing filtering on the original-resolution depth map, obtaining the source-image depth map.
16. The device for extracting and optimizing an image depth map according to claim 15, characterized in that assigning a weight coefficient to each neighbor pixel is specifically: the weight coefficients of the neighbor pixels sum to 1.
CN201280004184.8A 2012-05-08 2012-05-08 The method and apparatus of a kind of extraction and optimized image depth map Active CN103493482B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2012/075187 WO2013166656A1 (en) 2012-05-08 2012-05-08 Method and device for extracting and optimizing depth map of image

Publications (2)

Publication Number Publication Date
CN103493482A CN103493482A (en) 2014-01-01
CN103493482B true CN103493482B (en) 2016-01-20

Family

ID=49550062

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201280004184.8A Active CN103493482B (en) 2012-05-08 2012-05-08 The method and apparatus of a kind of extraction and optimized image depth map

Country Status (2)

Country Link
CN (1) CN103493482B (en)
WO (1) WO2013166656A1 (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104978554B (en) * 2014-04-08 2019-02-05 联想(北京)有限公司 The processing method and electronic equipment of information
CN104318586B (en) * 2014-09-26 2017-04-26 燕山大学 Adaptive morphological filtering-based motion blur direction estimation method and device
CN104394399B (en) * 2014-10-31 2016-08-24 天津大学 Three limit filtering methods of deep video coding
CN105721852B (en) * 2014-11-24 2018-12-14 奥多比公司 For determining the method, storage equipment and system of the capture instruction of depth refined image
TWI672677B (en) * 2017-03-31 2019-09-21 鈺立微電子股份有限公司 Depth map generation device for merging multiple depth maps
CN107204011A (en) * 2017-06-23 2017-09-26 万维云视(上海)数码科技有限公司 A kind of depth drawing generating method and device
CN110049242B (en) * 2019-04-18 2021-08-24 腾讯科技(深圳)有限公司 Image processing method and device
CN114943793B (en) * 2021-02-10 2025-03-18 北京字跳网络技术有限公司 Fluid rendering method, device, electronic device and storage medium
CN114240751A (en) * 2021-12-16 2022-03-25 海宁奕斯伟集成电路设计有限公司 Image processing apparatus, method, and program

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101141648A (en) * 2007-09-20 2008-03-12 上海广电(集团)有限公司中央研究院 Column diagram based weight predicting method
CN101582171A (en) * 2009-06-10 2009-11-18 清华大学 A method and device for creating a depth map
CN101754040A (en) * 2008-12-04 2010-06-23 三星电子株式会社 Method and appratus for estimating depth, and method and apparatus for converting 2d video to 3d video
WO2010123909A1 (en) * 2009-04-20 2010-10-28 Dolby Laboratories Licensing Corporation Directed interpolation and data post-processing
CN101945288A (en) * 2010-10-19 2011-01-12 浙江理工大学 H.264 compressed domain-based image depth map generation method
CN101951511A (en) * 2010-08-19 2011-01-19 深圳市亮信科技有限公司 Method for layering video scenes by analyzing depth
CN101969564A (en) * 2010-10-29 2011-02-09 清华大学 Upsampling method for depth video compression of three-dimensional television
CN102098527A (en) * 2011-01-28 2011-06-15 清华大学 Method and device for transforming two dimensions into three dimensions based on motion analysis


Also Published As

Publication number Publication date
WO2013166656A1 (en) 2013-11-14
CN103493482A (en) 2014-01-01

Similar Documents

Publication Publication Date Title
CN103493482B (en) The method and apparatus of a kind of extraction and optimized image depth map
CN101287143B (en) Method of converting planar video to stereoscopic video based on real-time man-machine dialogue
CN101635859B (en) A method and device for converting flat video to stereoscopic video
CN102254348B (en) Virtual viewpoint mapping method based o adaptive disparity estimation
CN102625127B (en) Optimization method suitable for virtual viewpoint generation of 3D television
CN111325693B (en) A Large-scale Panoramic Viewpoint Synthesis Method Based on Single Viewpoint RGB-D Image
CN102665086B (en) Method for obtaining parallax by using region-based local stereo matching
US12283029B2 (en) Image inpainting method and electronic device
CN103248906B (en) Method and system for acquiring depth map of binocular stereo video sequence
Pearson et al. Plenoptic layer-based modeling for image based rendering
CN104954780A (en) DIBR (depth image-based rendering) virtual image restoration method applicable to high-definition 2D/3D (two-dimensional/three-dimensional) conversion
CN109509163B (en) A method and system for multi-focus image fusion based on FGF
Eder et al. Mapped convolutions
CN107958481A (en) A kind of three-dimensional rebuilding method and device
CN101557534B (en) Method for generating disparity map from video close frames
CN109118434A (en) A kind of image pre-processing method
CN106127691A (en) Panoramic picture mapping method
CN102957936B (en) Virtual viewpoint generation method from video single viewpoint to multiple viewpoints
CN106028020B (en) A kind of virtual perspective image cavity complementing method based on multi-direction prediction
CN102761764B (en) Upper sampling method used for depth picture of three-dimensional stereo video
CN102075777B (en) Method for converting planar video image into three-dimensional video image based on moving object
CN103559701A (en) Two-dimensional single-view image depth estimation method based on DCT coefficient entropy
CN102447932B (en) Reconstruction method of view point of free view point video
CN102750681A (en) Processing device and method for sharpening edge of image
CN108198140A (en) Three-dimensional collaboration filtering and noise reduction method based on NCSR models

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20170222

Address after: 266100 No. 151, Zhuzhou Road, Laoshan District, Shandong, Qingdao

Patentee after: QINGDAO HISENSE ELECTRONICS Co.,Ltd.

Address before: 266100 No. 151, Zhuzhou Road, Laoshan District, Shandong, Qingdao

Patentee before: HISENSE HIVIEW TECH Co.,Ltd.

CP01 Change in the name or title of a patent holder

Address after: 266100 No. 151, Zhuzhou Road, Laoshan District, Shandong, Qingdao

Patentee after: Hisense Visual Technology Co., Ltd.

Address before: 266100 No. 151, Zhuzhou Road, Laoshan District, Shandong, Qingdao

Patentee before: QINGDAO HISENSE ELECTRONICS Co.,Ltd.