CN109087346A - Training method, training device, and electronic device for a monocular depth model - Google Patents
Publication: CN109087346A (China) — legal status: Granted
Classifications: G06T7/50 (depth or shape recovery in image analysis); G06T2207/20081 (training; learning)
Abstract
A training method, training device, and electronic device for a monocular depth model are disclosed. The method includes: obtaining multiple binocular images for training the monocular depth model; randomly selecting at least one monocular image from the multiple binocular images; computing, for each monocular image in the multiple binocular images other than the at least one monocular image, a corresponding first disparity image and a first mask image corresponding to the first disparity image; for each of the at least one monocular image, computing the disparity image after the monocular image is flipped and flipping it again to obtain a second disparity image, and computing a second mask image corresponding to the second disparity image; and training the monocular depth model by shielding the backward gradients in the regions of the first disparity image occluded by the first mask image and in the regions of the second disparity image occluded by the second mask image. In this way, the depth-blurring problem at object edges can be effectively resolved while the overall prediction accuracy of the model is improved.
Description
Technical field
The present application relates to the field of model training and, more particularly, to a training method, training device, and electronic device for a monocular depth model.
Background
At present, computer vision techniques are built on two-dimensional planar images. How to extract depth information from a two-dimensional planar image or a video sequence, and then reconstruct a three-dimensional structure from the predicted depth image, is therefore a very important technique. It greatly benefits applications such as estimating object size, occlusion relationships, shape, and segmentation, and can be widely applied in scenarios such as converting 2D films to 3D films, autonomous navigation of intelligent robots, robotic-arm grasping, and augmented reality.
Among depth estimation techniques, monocular depth estimation uses images captured by a camera to estimate the depth of each pixel in an image, and unsupervised monocular depth estimation means that training the model requires neither per-pixel depth information nor other annotations.
As machine-learning-based depth estimation algorithms receive increasingly extensive research attention, performing depth estimation with a monocular depth model is not restricted by specific scene conditions and has good applicability. It is therefore desirable to provide an improved training scheme for monocular depth models.
Summary of the invention
The present application is proposed to solve the above technical problem. Embodiments of the present application provide a training method, training device, and electronic device for a monocular depth model, which compute occlusion masks during model training and shield the backward gradients of occluded regions, while randomly flipping some input images and then flipping the prediction results back before backpropagating gradients. This effectively resolves the depth-blurring problem at object edges while improving the overall prediction accuracy of the model.
According to one aspect of the present application, a training method for a monocular depth model is provided, including: obtaining multiple binocular images for training the monocular depth model; randomly selecting at least one monocular image from the multiple binocular images; computing, for each monocular image in the multiple binocular images other than the at least one monocular image, a corresponding first disparity image and a first mask image corresponding to the first disparity image; for each of the at least one monocular image, computing the disparity image after the monocular image is flipped and flipping it again to obtain a second disparity image, and computing a second mask image corresponding to the second disparity image; and training the monocular depth model by shielding the backward gradients in the regions of the first disparity image occluded by the first mask image and in the regions of the second disparity image occluded by the second mask image.
According to another aspect of the present application, a training device for a monocular depth model is provided, including: an image obtaining unit for obtaining multiple binocular images for training the monocular depth model; an image selection unit for randomly selecting at least one monocular image from the multiple binocular images; a first computing unit for computing, for each monocular image in the multiple binocular images other than the at least one monocular image, a corresponding first disparity image and a first mask image corresponding to the first disparity image; a second computing unit for computing, for each of the at least one monocular image, the disparity image after the monocular image is flipped and flipping it again to obtain a second disparity image, and computing a second mask image corresponding to the second disparity image; and a model training unit for training the monocular depth model by shielding the backward gradients in the regions of the first disparity image occluded by the first mask image and in the regions of the second disparity image occluded by the second mask image.
According to yet another aspect of the present application, an electronic device is provided, including: a processor; and a memory in which computer program instructions are stored, the computer program instructions, when run by the processor, causing the processor to execute the training method for a monocular depth model as described above.
According to still another aspect of the present application, a computer-readable medium is provided, on which computer program instructions are stored, the computer program instructions, when run by a processor, causing the processor to execute the training method for a monocular depth model as described above.
Compared with the prior art, the training method, training device, and electronic device for a monocular depth model of the present application can obtain multiple binocular images for training the monocular depth model; randomly select at least one monocular image from the multiple binocular images; compute, for each monocular image other than the at least one monocular image, a corresponding first disparity image and a first mask image corresponding to the first disparity image; for each of the at least one monocular image, compute the disparity image after the monocular image is flipped and flip it again to obtain a second disparity image, and compute a second mask image corresponding to the second disparity image; and train the monocular depth model by shielding the backward gradients in the regions of the first disparity image occluded by the first mask image and in the regions of the second disparity image occluded by the second mask image. In this way, by computing occlusion masks during model training and shielding the backward gradients of occluded regions, while randomly flipping input images and then flipping the prediction results back before backpropagating gradients, the depth-blurring problem at object edges can be effectively resolved while the overall prediction accuracy of the model is improved.
Brief description of the drawings
The above and other objectives, features, and advantages of the present application will become more apparent from the following detailed description of embodiments of the present application taken in conjunction with the accompanying drawings. The drawings provide a further understanding of the embodiments, constitute a part of the specification, and serve, together with the embodiments, to explain the present application without limiting it. In the drawings, identical reference labels generally denote identical components or steps.
Fig. 1 illustrates a flowchart of the training method for a monocular depth model according to an embodiment of the present application.
Fig. 2 illustrates a schematic diagram of the process of generating disparity images and mask images according to an embodiment of the present application.
Fig. 3 illustrates a schematic diagram of a first example of a network structure according to an embodiment of the present application.
Fig. 4 illustrates a schematic diagram of a second example of a network structure according to an embodiment of the present application.
Fig. 5 illustrates the effect of the training method for a monocular depth model according to an embodiment of the present application.
Fig. 6 illustrates a block diagram of the training device for a monocular depth model according to an embodiment of the present application.
Fig. 7 illustrates a block diagram of an electronic device according to an embodiment of the present application.
Specific embodiments
Hereinafter, example embodiments of the present application will be described in detail with reference to the drawings. Obviously, the described embodiments are only some rather than all of the embodiments of the present application, and it should be understood that the present application is not limited by the example embodiments described herein.
Application overview
As described above, monocular depth models are being used more and more widely for depth estimation of two-dimensional images.
There are currently two main ways of implementing monocular depth estimation: one trains the model on binocular image pairs, mainly exploiting the geometric relationship between the two images of each pair; the other trains on video from a monocular camera, mainly exploiting the information in consecutive frames. There are also methods that use binocular images and video at the same time.
Unsupervised training on binocular images allows depth estimation to be carried out easily; however, the estimation methods described above currently lead to relatively blurry object edges.
Through study, the present inventors found that this blurring of object edges is mainly caused by image warping being unable to handle object occlusion during training.
In view of the above technical problem, the basic conception of the present application is to compute occlusion masks during model training and shield the backward gradients of occluded regions, while randomly flipping some input images and then flipping the prediction results back before backpropagating gradients.
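The flip-and-flip-back idea can be illustrated with a minimal sketch. This is not the application's implementation: `predict_fn` merely stands in for the monocular depth network, and the disparity is treated purely as an image to be re-aligned, ignoring any sign convention a real stereo pipeline might need.

```python
import numpy as np

def second_disparity(image, predict_fn):
    """Sketch of the 'second disparity image' computation:
    flip the input horizontally, predict the disparity of the
    flipped image, then flip that prediction back so it is
    aligned with the original (un-flipped) image."""
    flipped = image[:, ::-1]               # horizontal flip of an H x W array
    disp_of_flipped = predict_fn(flipped)  # disparity of the flipped image
    return disp_of_flipped[:, ::-1]        # flip the prediction back
```

With an identity `predict_fn`, the two flips cancel and the output aligns with the input; in training, it is this flipped-back prediction that enters image synthesis and gradient backpropagation.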
Specifically, the training method, training device, and electronic device for a monocular depth model provided by the present application first obtain multiple binocular images for training the monocular depth model, then randomly select at least one monocular image from the multiple binocular images; compute, for each monocular image in the multiple binocular images other than the at least one monocular image, a corresponding first disparity image and a first mask image corresponding to the first disparity image; for each of the at least one monocular image, compute the disparity image after the monocular image is flipped and flip it again to obtain a second disparity image, and compute a second mask image corresponding to the second disparity image; and finally train the monocular depth model by shielding the backward gradients in the regions of the first disparity image occluded by the first mask image and in the regions of the second disparity image occluded by the second mask image. In this way, the depth-blurring problem at object edges can be effectively resolved while the overall prediction accuracy of the model is improved.
Having described the basic principle of the present application, various non-limiting embodiments of the present application will now be specifically introduced with reference to the drawings.
Exemplary method
Fig. 1 illustrates a flowchart of the training method for a monocular depth model according to an embodiment of the present application.
As shown in Fig. 1, the training method for a monocular depth model according to an embodiment of the present application includes: S110, obtaining multiple binocular images for training the monocular depth model; S120, randomly selecting at least one monocular image from the multiple binocular images; S130, computing, for each monocular image in the multiple binocular images other than the at least one monocular image, a corresponding first disparity image and a first mask image corresponding to the first disparity image; S140, for each of the at least one monocular image, computing the disparity image after the monocular image is flipped and flipping it again to obtain a second disparity image, and computing a second mask image corresponding to the second disparity image; and S150, training the monocular depth model by shielding the backward gradients in the regions of the first disparity image occluded by the first mask image and in the regions of the second disparity image occluded by the second mask image.
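How steps S110–S150 fit together in one training pass can be sketched compactly. The helper names below (`predict_disp`, `make_mask`, `update_model`) are placeholders, not names from the application, and images are simplified to 1-D pixel lists:

```python
import random

def flip(img):
    """Horizontal flip of a 1-D row-of-pixels image."""
    return img[::-1]

def training_step(binocular_images, predict_disp, make_mask, update_model, rng=random):
    # S110: gather monocular images from the binocular pairs
    monocular = [eye for pair in binocular_images for eye in pair]
    # S120: randomly select a subset of monocular images to flip
    flipped = set(rng.sample(range(len(monocular)), len(monocular) // 2))
    batch = []
    for i, img in enumerate(monocular):
        if i in flipped:
            # S140: flip, predict disparity, flip the prediction back
            disp = flip(predict_disp(flip(img)))
        else:
            # S130: predict disparity directly
            disp = predict_disp(img)
        # first / second mask image for the first / second disparity image
        batch.append((img, disp, make_mask(disp)))
    # S150: the update shields backward gradients in mask-occluded regions
    update_model(batch)
    return batch
```

The selection policy (which and how many images to flip) is a free choice; halving the batch here is only for illustration.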
In step S110, multiple binocular images for training the monocular depth model are obtained. That is, in the training method for a monocular depth model according to an embodiment of the present application, the monocular depth model is trained with an unsupervised training method based on binocular images.
Here, each binocular image includes a left-eye image and a right-eye image, each serving as a monocular image. During model training, a left disparity image is generated from the left-eye image and used to synthesize the right-eye image corresponding to that left-eye image. Likewise, a right disparity image is generated from the right-eye image and used to synthesize the left-eye image corresponding to that right-eye image.
In step S120, at least one monocular image is randomly selected from the multiple binocular images. As described above, in the training method for a monocular depth model according to an embodiment of the present application, a randomly selected subset of the input images is flipped, and the disparities predicted for the flipped images are later flipped back for image synthesis and gradient backpropagation.
Moreover, in embodiments of the present application, the selection is not limited to choosing the left-eye image and the right-eye image of the same binocular image at the same time. That is, it is possible to select only the left-eye images of some binocular images, only the right-eye images of some binocular images, or the left-eye images of one part of the binocular images together with the right-eye images of another part, and so on. Of course, in embodiments of the present application, it is also possible to randomly select at least one of the multiple binocular images and use both the left-eye image and the right-eye image of each selected binocular image as images to be flipped.
That is, in the training method for a monocular depth model according to an embodiment of the present application, randomly selecting at least one monocular image from the multiple binocular images may include: randomly selecting at least one binocular image from the multiple binocular images to obtain the left-eye image and the right-eye image of the at least one binocular image, and using both the left-eye image and the right-eye image as the at least one monocular image.
In this way, since the left-eye image and the right-eye image of a binocular image are flipped at the same time, one of them can be processed in a manner similar to the other, reducing computational complexity. Furthermore, since the flipped images include both left-eye and right-eye images, the diversity of the flipped training samples is improved, which can further improve the prediction accuracy of the model.
In step S130, a corresponding first disparity image and a first mask image corresponding to the first disparity image are computed for each monocular image in the multiple binocular images other than the at least one monocular image. In the following, the process of generating disparity images and mask images for un-flipped and flipped images in the training method for a monocular depth model according to an embodiment of the present application will be explained with reference to Fig. 2. Fig. 2 illustrates a schematic diagram of the process of generating disparity images and mask images according to an embodiment of the present application.
As shown in the left half of Fig. 2, for an input image that does not need to be flipped, for example the input left-eye image shown in Fig. 2, a left disparity image corresponding to the left-eye image is generated, and then a mask image corresponding to the left disparity image is generated. Although Fig. 2 shows the operations performed on an input left-eye image, the same operations are performed on an input right-eye image.
Therefore, in embodiments of the present application, the first disparity image refers to a disparity image generated for an un-flipped input image, and the first mask image refers to a mask image generated for an un-flipped input image; each may comprise a disparity image and mask image for a left-eye image as well as a disparity image and mask image for a right-eye image.
In step S140, for each of the at least one monocular image, the disparity image after the monocular image is flipped is computed and flipped again to obtain a second disparity image, and a second mask image corresponding to the second disparity image is computed.
Again with reference to Fig. 2, as shown in the right half of Fig. 2, an input image, for example the input left-eye image shown in the left half of Fig. 2, is first flipped to obtain a flipped input image; the disparity image of that flipped input image, i.e. the disparity image of the flipped image shown in Fig. 2, is then computed; this disparity image is then flipped again to obtain the flipped-back disparity image shown in Fig. 2. Finally, a corresponding mask image is generated for the flipped-back disparity image.
Therefore, in embodiments of the present application, the second disparity image refers to a disparity image generated for a selected input image that has been flipped, and the second mask image refers to a mask image generated for a selected input image that has been flipped. As described above, these may include only the disparity images and mask images for left-eye images, only those for right-eye images, or those for both the left-eye and right-eye images of a binocular image.
That is, in embodiments of the present application, the input images are partitioned with each monocular image of the input binocular images as the unit: for one part, the disparity images and corresponding mask images, i.e. the first disparity images and first mask images described above, are computed directly; for the other part, the disparity images and corresponding mask images, i.e. the second disparity images and second mask images described above, are computed after flipping.
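The application does not spell out how a mask image is derived from a disparity image, so the following is only one plausible sketch: forward-map each pixel by its (integer) disparity and mark target positions that no source pixel reaches as occluded.

```python
import numpy as np

def occlusion_mask(disp_row):
    """Sketch: binary visibility mask for one image row.
    Each source pixel x lands at x - d(x) in the other view;
    positions hit by no source pixel are treated as occluded
    (mask value 0), so their gradients can later be shielded."""
    w = disp_row.shape[0]
    hit = np.zeros(w, dtype=bool)
    for x in range(w):
        tx = x - int(disp_row[x])
        if 0 <= tx < w:
            hit[tx] = True
    return hit.astype(np.float32)   # 1 = visible, 0 = occluded
```

Any mechanism that flags pixels with no cross-view correspondence would serve the same role as the first and second mask images described above.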
Finally, in step S150, the monocular depth model is trained by shielding the backward gradients in the regions of the first disparity image occluded by the first mask image and in the regions of the second disparity image occluded by the second mask image.
In this way, by computing occlusion masks and shielding the backward gradients of occluded regions, the mask images strengthen the object regions in the image and suppress the non-object regions, which effectively resolves the depth-blurring problem at object edges.
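A minimal sketch of the gradient-shielding idea: if the per-pixel loss is multiplied by the mask, occluded pixels contribute zero loss and hence zero backward gradient. The L1 photometric difference here is an assumption for illustration; the application only requires some difference function.

```python
import numpy as np

def masked_photometric_loss(pred, target, mask):
    """Per-pixel L1 difference between the synthesized predicted
    image and the real image, weighted by the occlusion mask:
    pixels with mask == 0 are shielded and contribute no loss,
    hence no backward gradient during backpropagation."""
    per_pixel = np.abs(pred - target) * mask
    return per_pixel.sum() / max(mask.sum(), 1.0)  # mean over visible pixels
```

In an automatic-differentiation framework the same multiplication inside the loss is what stops gradients from flowing into the occluded regions.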
Specifically, network models of different structures can be used in the training method for a monocular depth model according to an embodiment of the present application. Fig. 3 illustrates a schematic diagram of a first example of a network structure according to an embodiment of the present application. As shown in Fig. 3, for an input left-eye image I_l and right-eye image I_r, the corresponding left disparity image d_l and right disparity image d_r are computed separately. Here, those skilled in the art will appreciate that the disparity image d_l corresponding to the left-eye image I_l may be a first disparity image corresponding to an un-flipped input image as described above, or a second disparity image corresponding to a flipped input image as described above; likewise, the disparity image d_r corresponding to the right-eye image I_r may be either a first disparity image or a second disparity image as described above.
Next, the left disparity image d_l is combined with its corresponding right-eye image I_r, and the right disparity image d_r is combined with its corresponding left-eye image I_l, to generate predicted images Î_l and Î_r. Then, a difference function between the predicted images Î_l and Î_r and the real images I_l and I_r is computed, and the monocular depth model is trained with the difference function as at least part of the loss function. Also, as described above, during training the predicted images are occluded with the mask images, and the backward gradients of the occluded regions are shielded. Here, the difference function may be the image difference between the predicted images Î_l and Î_r and the real images I_l and I_r, or the sum of squares of the image differences, and so on.
Here, the network structure shown in Fig. 3 computes disparity images and synthesizes predicted images for the left-eye image and the right-eye image at the same time, which can improve the prediction accuracy of the model.
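The synthesis step above — combining d_l with I_r to generate the predicted image Î_l — is, in essence, warping one view by the disparity. A simplified per-row sketch with nearest-neighbour sampling follows; real implementations typically use differentiable bilinear sampling instead, and the sign convention for the disparity shift is an assumption here.

```python
import numpy as np

def warp_row(source_row, disp_row):
    """Sketch: synthesize one row of the predicted image by
    sampling the other eye's image at positions shifted by the
    disparity.  out[x] = source[x - d(x)], nearest-neighbour."""
    w = source_row.shape[0]
    out = np.zeros_like(source_row)
    for x in range(w):
        sx = x - int(round(disp_row[x]))   # sample position in the source view
        if 0 <= sx < w:
            out[x] = source_row[sx]
    return out
```

With zero disparity the warp is the identity; occluded positions, where the sampled content repeats or is wrong, are exactly what the mask images are meant to shield.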
Therefore, in the training method for a monocular depth model according to an embodiment of the present application, training the monocular depth model by shielding the backward gradients in the regions of the first disparity image occluded by the first mask image and in the regions of the second disparity image occluded by the second mask image may include: synthesizing a predicted image from each disparity image in the first disparity images and the second disparity images together with its corresponding monocular image; computing a difference function between the predicted image and the real image; and training the monocular depth model with the difference function as at least part of the loss function, shielding, during training, the backward gradients in the regions of the predicted image occluded by the mask image.
As another example of a network structure, Fig. 4 illustrates a schematic diagram of a second example of a network structure according to an embodiment of the present application. As shown in Fig. 4, this example network structure can be trained for only one of the left-eye image I_l and the right-eye image I_r. For example, for the left-eye image I_l, its left disparity image d_l is first computed and then combined with the corresponding right-eye image I_r to obtain a predicted image Î_l. Next, a difference function between the predicted image Î_l and the real image I_l is computed, and the monocular depth model is trained with the difference function as at least part of the loss function. Similarly, during training, the predicted image is occluded with the mask image, and the backward gradients of the occluded regions are shielded.
In addition, those skilled in the art will understand that the network structure shown in Fig. 4 can equally be applied to the right-eye image I_r. That is, for the right-eye image I_r, its right disparity image d_r is first computed and then combined with the corresponding left-eye image I_l to obtain a predicted image Î_r. Next, a difference function between the predicted image Î_r and the real image I_r is computed, and the monocular depth model is trained with the difference function as at least part of the loss function. Similarly, during training, the predicted image is occluded with the mask image, and the backward gradients of the occluded regions are shielded. Here, the difference function may be the image difference between the predicted images Î_l and Î_r and the real images I_l and I_r, or the sum of squares of the image differences, and so on.
Here, the network structure shown in Fig. 4 computes a disparity image and synthesizes a predicted image for only one of the left-eye image and the right-eye image, so that the computation process is fairly simple, and it is also compatible with some existing network structures.
That is, in the training method for a monocular depth model according to an embodiment of the present application, training the monocular depth model by shielding the backward gradients in the regions of the first disparity image occluded by the first mask image and in the regions of the second disparity image occluded by the second mask image may include: synthesizing a predicted image from each left disparity image or each right disparity image, in the first disparity images and the second disparity images, that corresponds to one of the left-eye image and the right-eye image, together with its corresponding right-eye image or left-eye image; computing a difference function between the predicted image and the real image; and training the monocular depth model with the difference function as at least part of the loss function, shielding, during training, the backward gradients in the regions of the predicted image occluded by the mask image.
Fig. 5 illustrates the effect of the training method for a monocular depth model according to an embodiment of the present application. In Fig. 5, (a) shows the left-eye image I_l, (b) shows the right-eye image I_r, (c) shows the disparity image d_l aligned with the left-eye image, (d) shows the reconstructed left-eye predicted image Î_l, (e) shows the mask image corresponding to the disparity image d_l, and (f) shows the reconstructed left-eye image after masking with the mask image. As can be seen from (d), the reconstructed left-eye predicted image Î_l exhibits obvious repetition and artifacts. By using the mask image (e) generated from the disparity image (c) to shield the backpropagation of those repetitions and artifacts, it can be seen that the white regions have been occluded in the final result (f).
Exemplary apparatus
Fig. 6 illustrates a block diagram of the training device for a monocular depth model according to an embodiment of the present application.
As shown in Fig. 6, the training device 200 for a monocular depth model according to an embodiment of the present application includes: an image obtaining unit 210 for obtaining multiple binocular images for training the monocular depth model; an image selection unit 220 for randomly selecting at least one monocular image from the multiple binocular images; a first computing unit 230 for computing, for each monocular image in the multiple binocular images other than the at least one monocular image, a corresponding first disparity image and a first mask image corresponding to the first disparity image; a second computing unit 240 for computing, for each of the at least one monocular image, the disparity image after the monocular image is flipped and flipping it again to obtain a second disparity image, and computing a second mask image corresponding to the second disparity image; and a model training unit 250 for training the monocular depth model by shielding the backward gradients in the regions of the first disparity image occluded by the first mask image and in the regions of the second disparity image occluded by the second mask image.
In one example, in the training device 200 for a monocular depth model described above, each binocular image includes a left-eye image and a right-eye image serving as monocular images; the disparity image corresponding to the left-eye image is a left disparity image; and the disparity image corresponding to the right-eye image is a right disparity image.
In one example, in the training device 200 for a monocular depth model described above, the image selection unit 220 is configured to: randomly select at least one binocular image from the multiple binocular images to obtain the left-eye image and the right-eye image of the at least one binocular image, using both as the at least one monocular image.
In one example, in the training device 200 for a monocular depth model described above, the model training unit 250 is configured to: synthesize a predicted image from each disparity image in the first disparity images and the second disparity images together with its corresponding monocular image; compute a difference function between the predicted image and the real image; and train the monocular depth model with the difference function as at least part of the loss function, shielding, during training, the backward gradients in the regions of the predicted image occluded by the mask image.
In one example, in the training device 200 for a monocular depth model described above, the model training unit 250 is configured to: synthesize a predicted image from each left disparity image or each right disparity image, in the first disparity images and the second disparity images, that corresponds to one of the left-eye image and the right-eye image, together with its corresponding right-eye image or left-eye image; compute a difference function between the predicted image and the real image; and train the monocular depth model with the difference function as at least part of the loss function, shielding, during training, the backward gradients in the regions of the predicted image occluded by the mask image.
Here, those skilled in the art will understand that the specific functions and operations of the units and modules in the training device 200 for a monocular depth model described above have already been introduced in detail in the description of the training method for a monocular depth model with reference to Figs. 1 to 5, and repeated description thereof is therefore omitted.
As described above, the training device 200 for a monocular depth model according to an embodiment of the present application can be implemented in various terminal devices, such as a server running the monocular depth model. In one example, the device 200 according to an embodiment of the present application can be integrated into the terminal device as a software module and/or a hardware module. For example, the device 200 can be a software module in the operating system of the terminal device, or an application developed for the terminal device; of course, the device 200 can equally be one of the many hardware modules of the terminal device.
Alternatively, in another example, the training device 200 for a monocular depth model and the terminal device can also be separate devices, and the device 200 can be connected to the terminal device through a wired and/or wireless network and transmit interactive information according to an agreed data format.
Example electronic device
Hereinafter, an electronic device according to an embodiment of the present application is described with reference to Fig. 7.
Fig. 7 illustrates a block diagram of the electronic device according to an embodiment of the present application.
As shown in Fig. 7, the electronic device 10 includes one or more processors 11 and a memory 12.
The processor 11 may be a central processing unit (CPU) or another form of processing unit having data processing capability and/or instruction execution capability, and may control other components in the electronic device 10 to perform desired functions.
The memory 12 may include one or more computer program products, and the computer program product may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory. The non-volatile memory may include, for example, read-only memory (ROM), a hard disk, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 11 may run the program instructions to implement the training method of the monocular depth model of the embodiments of the present application described above and/or other desired functions. Various contents such as the input binocular images, disparity images, and mask images may also be stored in the computer-readable storage medium.
In one example, the electronic device 10 may further include an input device 13 and an output device 14, these components being interconnected by a bus system and/or another form of connection mechanism (not shown).
For example, the input device 13 may include a binocular camera for capturing binocular images. In addition, the input device 13 may also include, for example, a keyboard, a mouse, and the like.
The output device 14 may output various information to the outside, and may include, for example, a display, a loudspeaker, a printer, a communication network and remote output devices connected thereto, and the like.
Of course, for simplicity, only some of the components of the electronic device 10 related to the present application are illustrated in Fig. 7; components such as buses and input/output interfaces are omitted. Beyond this, the electronic device 10 may further include any other appropriate components depending on the specific application.
Illustrative computer program product and computer readable storage medium
In addition to the above methods and devices, an embodiment of the present application may also be a computer program product comprising computer program instructions that, when run by a processor, cause the processor to execute the steps of the training method of the monocular depth model according to the various embodiments of the present application described in the "Exemplary Methods" section of this specification.
The computer program product may include program code, written in any combination of one or more programming languages, for carrying out the operations of the embodiments of the present application; the programming languages include object-oriented programming languages, such as Java and C++, as well as conventional procedural programming languages, such as the "C" language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
In addition, an embodiment of the present application may also be a computer-readable storage medium having stored thereon computer program instructions that, when run by a processor, cause the processor to execute the steps of the training method of the monocular depth model according to the various embodiments of the present application described in the "Exemplary Methods" section of this specification.
The computer-readable storage medium may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may include, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
The basic principles of the present application have been described above in conjunction with specific embodiments. However, it should be noted that the merits, advantages, effects, and the like mentioned in the present application are merely examples rather than limitations, and it must not be assumed that these merits, advantages, and effects are indispensable to every embodiment of the present application. In addition, the specific details disclosed above are only for the purpose of illustration and ease of understanding, rather than limitation, and they do not limit the present application to being implemented with those specific details.
The block diagrams of devices, apparatuses, equipment, and systems involved in the present application are merely illustrative examples and are not intended to require or imply that they must be connected, arranged, or configured in the manner shown in the block diagrams. As those skilled in the art will recognize, these devices, apparatuses, equipment, and systems may be connected, arranged, or configured in any manner. Words such as "include", "comprise", and "have" are open-ended terms that mean "including but not limited to" and may be used interchangeably therewith. The words "or" and "and" as used herein refer to the word "and/or" and may be used interchangeably therewith, unless the context clearly indicates otherwise. The word "such as" used herein refers to the phrase "such as, but not limited to" and may be used interchangeably therewith.
It should also be noted that, in the devices, apparatuses, and methods of the present application, each component or each step may be decomposed and/or recombined. Such decompositions and/or recombinations should be regarded as equivalent schemes of the present application.
The above description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present application. Various modifications to these aspects will be readily apparent to those skilled in the art, and the general principles defined herein may be applied to other aspects without departing from the scope of the present application. Therefore, the present application is not intended to be limited to the aspects shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The above description has been presented for the purposes of illustration and description. Furthermore, this description is not intended to restrict the embodiments of the present application to the forms disclosed herein. Although a number of exemplary aspects and embodiments have been discussed above, those skilled in the art will recognize certain variations, modifications, changes, additions, and sub-combinations thereof.
Claims (12)
1. A training method of a monocular depth model, comprising:
obtaining a plurality of binocular images for training the monocular depth model;
randomly selecting at least one monocular image in the plurality of binocular images;
calculating a first disparity image corresponding to each monocular image in the plurality of binocular images other than the at least one monocular image, and a first mask image corresponding to the first disparity image;
for each monocular image in the at least one monocular image, calculating a disparity image of said each monocular image after flipping and flipping the disparity image again as a second disparity image, and calculating a second mask image corresponding to the second disparity image; and
training the monocular depth model by shielding a reversed gradient of a region of the first disparity image that the first mask image blocks and a reversed gradient of a region of the second disparity image that the second mask image blocks.
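Read as pseudocode, the flip branch and the mask computation in claim 1 can be sketched as follows. `predict_disparity` is a hypothetical stand-in for the monocular depth model, and the forward-warp coverage rule used for the occlusion mask is one common construction; the claim itself does not fix how the mask images are obtained.

```python
import numpy as np

def second_disparity(predict_disparity, image):
    """Claim 1's second disparity image: flip the monocular image
    horizontally, predict its disparity, then flip the result back."""
    return predict_disparity(image[:, ::-1])[:, ::-1]

def occlusion_mask(disparity):
    """Hypothetical occlusion mask: forward-warp every source column by
    its disparity; target columns that receive no sample are dis-occluded
    and marked 0 (blocked), all others 1 (valid)."""
    h, w = disparity.shape
    mask = np.zeros((h, w), dtype=np.uint8)
    cols = np.arange(w)
    for row in range(h):
        hit = np.clip((cols + disparity[row]).round().astype(int), 0, w - 1)
        mask[row, hit] = 1
    return mask
```

Flipping before and after prediction exposes the model to occlusion boundaries on the opposite side of objects, which is why the second disparity image carries its own mask.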
2. The training method of the monocular depth model as claimed in claim 1, wherein:
the binocular image comprises a left-eye image and a right-eye image as monocular images;
the disparity image corresponding to the left-eye image is a left disparity image; and
the disparity image corresponding to the right-eye image is a right disparity image.
3. The training method of the monocular depth model as claimed in claim 2, wherein randomly selecting at least one monocular image in the plurality of binocular images comprises:
randomly selecting at least one binocular image in the plurality of binocular images to obtain the left-eye image and the right-eye image in the at least one binocular image, and taking the left-eye image and the right-eye image as the at least one monocular image.
4. The training method of the monocular depth model as claimed in claim 1, wherein training the monocular depth model by shielding the reversed gradient of the region of the first disparity image that the first mask image blocks and the reversed gradient of the region of the second disparity image that the second mask image blocks comprises:
synthesizing each disparity image in the first disparity image and the second disparity image with its corresponding monocular image into a prediction image;
calculating a difference function between the prediction image and a real image; and
training the monocular depth model with the difference function as at least a part of a loss function, wherein, in the training process, a reversed gradient of a region of the prediction image that the mask image blocks is shielded.
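The gradient shielding that claim 4 describes can be illustrated with an explicit gradient computation. In an autograd framework the same effect is usually obtained by multiplying the per-pixel loss by the mask before reduction; here the reversed gradient of an L1 difference function is formed analytically and zeroed wherever the mask blocks the prediction. The L1 choice and the normalization over valid pixels are illustrative assumptions.

```python
import numpy as np

def shielded_l1_grad(pred, target, mask):
    """Gradient of the L1 difference function with respect to the
    prediction image, with the reversed gradient shielded (set to zero)
    in the regions the mask image blocks (mask == 0)."""
    n = max(int(mask.sum()), 1)           # normalize over valid pixels only
    grad = np.sign(pred - target) / n     # per-pixel dL1/dpred
    return grad * mask                    # blocked regions propagate nothing
```

Because the occluded regions contribute no gradient, the model is never penalized for depth it cannot observe, which is the mechanism behind the sharper object edges claimed for this method.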
5. The training method of the monocular depth model as claimed in claim 2, wherein training the monocular depth model by shielding the reversed gradient of the region of the first disparity image that the first mask image blocks and the reversed gradient of the region of the second disparity image that the second mask image blocks comprises:
synthesizing each left disparity image or each right disparity image in the first disparity image and the second disparity image, which corresponds to one of the left-eye image and the right-eye image, with its corresponding right-eye image or left-eye image into a prediction image;
calculating a difference function between the prediction image and a real image; and
training the monocular depth model with the difference function as at least a part of a loss function, wherein, in the training process, a reversed gradient of a region of the prediction image that the mask image blocks is shielded.
6. A training device of a monocular depth model, comprising:
an image obtaining unit for obtaining a plurality of binocular images for training the monocular depth model;
an image selecting unit for randomly selecting at least one monocular image in the plurality of binocular images;
a first calculating unit for calculating a first disparity image corresponding to each monocular image in the plurality of binocular images other than the at least one monocular image, and a first mask image corresponding to the first disparity image;
a second calculating unit for, for each monocular image in the at least one monocular image, calculating a disparity image of said each monocular image after flipping and flipping the disparity image again as a second disparity image, and calculating a second mask image corresponding to the second disparity image; and
a model training unit for training the monocular depth model by shielding a reversed gradient of a region of the first disparity image that the first mask image blocks and a reversed gradient of a region of the second disparity image that the second mask image blocks.
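The division of labour among claim 6's units can be summarized in a small skeleton. The two callables are hypothetical stand-ins for the depth model and the mask construction, neither of which the claim specifies:

```python
from dataclasses import dataclass
from typing import Callable, List, Set, Tuple
import numpy as np

@dataclass
class MonocularDepthTrainingDevice:
    """Skeleton mirroring claim 6's units; illustrative only."""
    predict_disparity: Callable[[np.ndarray], np.ndarray]
    compute_mask: Callable[[np.ndarray], np.ndarray]

    def select(self, pairs: List[Tuple[np.ndarray, np.ndarray]], k: int) -> Set[int]:
        """Image selecting unit: randomly pick k binocular pairs whose
        monocular images take the flip-and-flip-back branch.
        A fixed seed is used here only for reproducibility."""
        idx = np.random.default_rng(0).choice(len(pairs), size=k, replace=False)
        return set(int(i) for i in idx)

    def first_branch(self, image: np.ndarray):
        """First calculating unit: direct disparity plus its mask."""
        disp = self.predict_disparity(image)
        return disp, self.compute_mask(disp)

    def second_branch(self, image: np.ndarray):
        """Second calculating unit: disparity of the horizontally flipped
        image, flipped back again, plus its mask."""
        disp = self.predict_disparity(image[:, ::-1])[:, ::-1]
        return disp, self.compute_mask(disp)
```

The model training unit would then consume both branches' disparities and masks when forming the shielded loss.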
7. The training device of the monocular depth model as claimed in claim 6, wherein:
the binocular image comprises a left-eye image and a right-eye image as monocular images;
the disparity image corresponding to the left-eye image is a left disparity image; and
the disparity image corresponding to the right-eye image is a right disparity image.
8. The training device of the monocular depth model as claimed in claim 7, wherein the image selecting unit is configured to:
randomly select at least one binocular image in the plurality of binocular images to obtain the left-eye image and the right-eye image in the at least one binocular image as the at least one monocular image.
9. The training device of the monocular depth model as claimed in claim 6, wherein the model training unit is configured to:
synthesize each disparity image in the first disparity image and the second disparity image with its corresponding monocular image into a prediction image;
calculate a difference function between the prediction image and a real image; and
train the monocular depth model with the difference function as at least a part of a loss function, wherein, in the training process, a reversed gradient of a region of the prediction image that the mask image blocks is shielded.
10. The training device of the monocular depth model as claimed in claim 6, wherein the model training unit is configured to:
synthesize each left disparity image or each right disparity image in the first disparity image and the second disparity image, which corresponds to one of the left-eye image and the right-eye image, with its corresponding right-eye image or left-eye image into a prediction image;
calculate a difference function between the prediction image and a real image; and
train the monocular depth model with the difference function as at least a part of a loss function, wherein, in the training process, a reversed gradient of a region of the prediction image that the mask image blocks is shielded.
11. An electronic device, comprising:
a processor; and
a memory, wherein computer program instructions are stored in the memory, and the computer program instructions, when run by the processor, cause the processor to execute the training method of the monocular depth model according to any one of claims 1 to 5.
12. A computer-readable medium having computer program instructions stored thereon, wherein the computer program instructions, when run by a processor, cause the processor to execute the training method of the monocular depth model according to any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811106152.4A CN109087346B (en) | 2018-09-21 | 2018-09-21 | Monocular depth model training method and device and electronic equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811106152.4A CN109087346B (en) | 2018-09-21 | 2018-09-21 | Monocular depth model training method and device and electronic equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109087346A true CN109087346A (en) | 2018-12-25 |
CN109087346B CN109087346B (en) | 2020-08-11 |
Family
ID=64842277
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811106152.4A Active CN109087346B (en) | 2018-09-21 | 2018-09-21 | Monocular depth model training method and device and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109087346B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103413298A (en) * | 2013-07-17 | 2013-11-27 | 宁波大学 | Three-dimensional image objective evaluation method based on visual characteristics |
CN105374039A (en) * | 2015-11-16 | 2016-03-02 | 辽宁大学 | Monocular image depth information estimation method based on contour acuity |
EP2747427B1 (en) * | 2012-12-21 | 2016-03-16 | imcube labs GmbH | Method, apparatus and computer program usable in synthesizing a stereoscopic image |
CN107578436A (en) * | 2017-08-02 | 2018-01-12 | 南京邮电大学 | A Depth Estimation Method for Monocular Image Based on Fully Convolutional Neural Network FCN |
CN107945265A (en) * | 2017-11-29 | 2018-04-20 | 华中科技大学 | Real-time dense monocular SLAM method and systems based on on-line study depth prediction network |
CN108269278A (en) * | 2016-12-30 | 2018-07-10 | 杭州海康威视数字技术股份有限公司 | A kind of method and device of scene modeling |
Non-Patent Citations (1)
Title |
---|
LIN Yimin et al.: "Robot vision system for three-dimensional reconstruction of weakly textured scenes", CNKI * |
Cited By (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111476834B (en) * | 2019-01-24 | 2023-08-11 | 北京地平线机器人技术研发有限公司 | Method and device for generating image and electronic equipment |
CN111476834A (en) * | 2019-01-24 | 2020-07-31 | 北京地平线机器人技术研发有限公司 | Method and device for generating image and electronic equipment |
CN111508010B (en) * | 2019-01-31 | 2023-08-08 | 北京地平线机器人技术研发有限公司 | Method and device for estimating depth of two-dimensional image and electronic equipment |
CN111508010A (en) * | 2019-01-31 | 2020-08-07 | 北京地平线机器人技术研发有限公司 | Method and device for depth estimation of two-dimensional image and electronic equipment |
CN111696145B (en) * | 2019-03-11 | 2023-11-03 | 北京地平线机器人技术研发有限公司 | Depth information determining method, depth information determining device and electronic equipment |
CN111696145A (en) * | 2019-03-11 | 2020-09-22 | 北京地平线机器人技术研发有限公司 | Depth information determination method, depth information determination device and electronic equipment |
CN110070056B (en) * | 2019-04-25 | 2023-01-10 | 腾讯科技(深圳)有限公司 | Image processing method, device, storage medium and equipment |
CN110070056A (en) * | 2019-04-25 | 2019-07-30 | 腾讯科技(深圳)有限公司 | Image processing method, device, storage medium and equipment |
CN112149458B (en) * | 2019-06-27 | 2025-02-25 | 商汤集团有限公司 | Obstacle detection method, intelligent driving control method, device, medium and equipment |
CN112149458A (en) * | 2019-06-27 | 2020-12-29 | 商汤集团有限公司 | Obstacle detection method, intelligent driving control method, device, medium and equipment |
CN111105451B (en) * | 2019-10-31 | 2022-08-05 | 武汉大学 | Driving scene binocular depth estimation method for overcoming occlusion effect |
CN111105451A (en) * | 2019-10-31 | 2020-05-05 | 武汉大学 | A Binocular Depth Estimation Method for Driving Scenes Overcoming Occlusion Effect |
CN111292425A (en) * | 2020-01-21 | 2020-06-16 | 武汉大学 | A View Synthesis Method Based on Monocular Hybrid Dataset |
CN111292425B (en) * | 2020-01-21 | 2022-02-01 | 武汉大学 | View synthesis method based on monocular and binocular mixed data set |
CN111178547A (en) * | 2020-04-10 | 2020-05-19 | 支付宝(杭州)信息技术有限公司 | Method and system for model training based on private data |
CN111583152B (en) * | 2020-05-11 | 2023-07-07 | 福建帝视科技集团有限公司 | Image artifact detection and automatic removal method based on U-net structure |
CN111583152A (en) * | 2020-05-11 | 2020-08-25 | 福建帝视信息科技有限公司 | Image artifact detection and automatic removal method based on U-net structure |
CN112634147A (en) * | 2020-12-09 | 2021-04-09 | 上海健康医学院 | PET image noise reduction method, system, device and medium for self-supervision learning |
CN112634147B (en) * | 2020-12-09 | 2024-03-29 | 上海健康医学院 | PET image noise reduction method, system, device and medium for self-supervision learning |
CN113128601A (en) * | 2021-04-22 | 2021-07-16 | 北京百度网讯科技有限公司 | Training method of classification model and method for classifying images |
CN113538258A (en) * | 2021-06-15 | 2021-10-22 | 福州大学 | Image deblurring model and method based on mask |
CN113538258B (en) * | 2021-06-15 | 2023-10-13 | 福州大学 | Mask-based image deblurring model and method |
Also Published As
Publication number | Publication date |
---|---|
CN109087346B (en) | 2020-08-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109087346A (en) | Training method, training device and the electronic equipment of monocular depth model | |
Xiao et al. | Deepfocus: Learned image synthesis for computational display | |
US8538138B2 (en) | Global registration of multiple 3D point sets via optimization on a manifold | |
JP2022524891A (en) | Image processing methods and equipment, electronic devices and computer programs | |
CN102835119B (en) | Support the multi-core processor that the real-time 3D rendering on automatic stereoscopic display device is played up | |
EP3448032B1 (en) | Enhancing motion pictures with accurate motion information | |
CN118053090A (en) | Generating video using potential diffusion models | |
KR20140096532A (en) | Apparatus and method for generating digital hologram | |
US20140354633A1 (en) | Image processing method and image processing device | |
CN105530502B (en) | According to the method and apparatus for the picture frame generation disparity map that stereoscopic camera is shot | |
Kellnhofer et al. | Optimizing disparity for motion in depth | |
Lee et al. | Automatic 2d-to-3d conversion using multi-scale deep neural network | |
US11532122B2 (en) | Method and apparatus for processing holographic image | |
CN106169179A (en) | Image denoising method and image noise reduction apparatus | |
CN113132706A (en) | Controllable position virtual viewpoint generation method and device based on reverse mapping | |
CN116977167A (en) | Video processing method and device, electronic equipment and storage medium | |
Mori et al. | Exemplar-based inpainting for 6dof virtual reality photos | |
EP3882852A1 (en) | Training alignment of a plurality of images | |
Lüke et al. | Near Real‐Time Estimation of Super‐resolved Depth and All‐In‐Focus Images from a Plenoptic Camera Using Graphics Processing Units | |
Thatte | Cinematic virtual reality with head-motion parallax | |
Shi et al. | ImmersePro: End-to-End Stereo Video Synthesis Via Implicit Disparity Learning | |
Zhang et al. | Efficient variational light field view synthesis for making stereoscopic 3D images | |
CN108197248A (en) | A kind of method, apparatus and system of 3Dization 2D web displayings | |
KR101784208B1 (en) | System and method for displaying three-dimension image using multiple depth camera | |
CN119273591A (en) | Three-dimensional image generation method, device, storage medium and program product |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||