
CN109087346A - Training method, training device, and electronic device for a monocular depth model - Google Patents

Training method, training device, and electronic device for a monocular depth model

Info

Publication number
CN109087346A
CN109087346A
Authority
CN
China
Prior art keywords
image
disparity image
monocular
training
depth model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811106152.4A
Other languages
Chinese (zh)
Other versions
CN109087346B (en)
Inventor
耿益锋
胡义涵
罗恒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Horizon Robotics Technology Research and Development Co Ltd
Original Assignee
Beijing Horizon Robotics Technology Research and Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Horizon Robotics Technology Research and Development Co Ltd filed Critical Beijing Horizon Robotics Technology Research and Development Co Ltd
Priority to CN201811106152.4A priority Critical patent/CN109087346B/en
Publication of CN109087346A publication Critical patent/CN109087346A/en
Application granted granted Critical
Publication of CN109087346B publication Critical patent/CN109087346B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/50 Depth or shape recovery
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Processing (AREA)

Abstract

A training method, a training device, and an electronic device for a monocular depth model are disclosed. The method comprises: obtaining a plurality of binocular images for training the monocular depth model; randomly selecting at least one monocular image from the plurality of binocular images; computing, for each monocular image in the plurality of binocular images other than the at least one monocular image, a corresponding first disparity image and a first mask image corresponding to the first disparity image; for each monocular image of the at least one monocular image, flipping the monocular image, computing the disparity image of the flipped image and flipping it back again as a second disparity image, and computing a second mask image corresponding to the second disparity image; and training the monocular depth model by blocking the backward gradient of the region of the first disparity image occluded by the first mask image and the backward gradient of the region of the second disparity image occluded by the second mask image. In this way, depth blur at object edges can be effectively resolved while the overall prediction accuracy of the model is improved.

Description

Training method, training device, and electronic device for a monocular depth model
Technical field
The present application relates to the field of model training, and more particularly, to a training method, a training device, and an electronic device for a monocular depth model.
Background
Most current computer vision techniques are built on two-dimensional images. How to extract depth information from a two-dimensional image or a video sequence, and then reconstruct a three-dimensional structure from the predicted depth image, is therefore an important problem. Depth greatly benefits tasks such as estimating object size, occlusion relationships, shape, and segmentation, and can be widely applied in scenarios such as 2D-to-3D film conversion, autonomous navigation of intelligent robots, robotic-arm grasping, and augmented reality.
In depth estimation, monocular depth estimation uses images captured by a single camera to estimate the depth of each pixel in the image, and unsupervised monocular depth estimation means that training the model requires neither per-pixel depth ground truth nor other annotations.
As depth estimation algorithms based on machine learning receive more and more extensive research, depth estimation with a monocular depth model is not limited by specific scene conditions and therefore has good applicability. Accordingly, an improved training scheme for monocular depth models is desirable.
Summary
To solve the above technical problem, the present application is proposed. Embodiments of the present application provide a training method, a training device, and an electronic device for a monocular depth model. During model training, occlusion masks are computed and the backward gradients of the occluded regions are blocked; at the same time, part of the input images are randomly flipped and their predictions are flipped back again before the gradients are back-propagated. This effectively resolves depth blur at object edges while improving the overall prediction accuracy of the model.
According to one aspect of the present application, a training method of a monocular depth model is provided, comprising: obtaining a plurality of binocular images for training the monocular depth model; randomly selecting at least one monocular image from the plurality of binocular images; computing, for each monocular image in the plurality of binocular images other than the at least one monocular image, a corresponding first disparity image and a first mask image corresponding to the first disparity image; for each monocular image of the at least one monocular image, computing the disparity image of the flipped monocular image and flipping it back again as a second disparity image, and computing a second mask image corresponding to the second disparity image; and training the monocular depth model by blocking the backward gradient of the region of the first disparity image occluded by the first mask image and the backward gradient of the region of the second disparity image occluded by the second mask image.
According to another aspect of the present application, a training device of a monocular depth model is provided, comprising: an image obtaining unit for obtaining a plurality of binocular images for training the monocular depth model; an image selection unit for randomly selecting at least one monocular image from the plurality of binocular images; a first computing unit for computing, for each monocular image in the plurality of binocular images other than the at least one monocular image, a corresponding first disparity image and a first mask image corresponding to the first disparity image; a second computing unit for computing, for each monocular image of the at least one monocular image, the disparity image of the flipped monocular image and flipping it back again as a second disparity image, and computing a second mask image corresponding to the second disparity image; and a model training unit for training the monocular depth model by blocking the backward gradient of the region of the first disparity image occluded by the first mask image and the backward gradient of the region of the second disparity image occluded by the second mask image.
According to yet another aspect of the present application, an electronic device is provided, comprising: a processor; and a memory in which computer program instructions are stored, the computer program instructions, when run by the processor, causing the processor to perform the training method of the monocular depth model described above.
According to still another aspect of the present application, a computer-readable medium is provided, on which computer program instructions are stored, the computer program instructions, when run by a processor, causing the processor to perform the training method of the monocular depth model described above.
Compared with the prior art, the training method, the training device, and the electronic device of the monocular depth model of the present application can obtain a plurality of binocular images for training the monocular depth model; randomly select at least one monocular image from the plurality of binocular images; compute, for each monocular image other than the at least one monocular image, a corresponding first disparity image and its first mask image; for each of the at least one monocular image, compute the disparity image of the flipped image and flip it back again as a second disparity image, and compute its second mask image; and train the monocular depth model by blocking the backward gradient of the region of the first disparity image occluded by the first mask image and the backward gradient of the region of the second disparity image occluded by the second mask image. In this way, by computing occlusion masks during training and blocking the backward gradients of the occluded regions, and by randomly flipping part of the input images and flipping their predictions back again before back-propagating the gradients, depth blur at object edges can be effectively resolved while the overall prediction accuracy of the model is improved.
Brief description of the drawings
The above and other objects, features, and advantages of the present application will become more apparent from the following detailed description of the embodiments of the present application taken in conjunction with the accompanying drawings. The drawings are provided for a further understanding of the embodiments of the present application and constitute a part of the specification; they serve, together with the embodiments, to explain the present application and do not limit it. In the drawings, the same reference numerals generally denote the same components or steps.
Fig. 1 illustrates a flow chart of the training method of the monocular depth model according to an embodiment of the present application.
Fig. 2 illustrates a schematic diagram of the process of generating disparity images and mask images according to an embodiment of the present application.
Fig. 3 illustrates a schematic diagram of a first example of a network structure according to an embodiment of the present application.
Fig. 4 illustrates a schematic diagram of a second example of a network structure according to an embodiment of the present application.
Fig. 5 illustrates example results of the training method of the monocular depth model according to an embodiment of the present application.
Fig. 6 illustrates a block diagram of the training device of the monocular depth model according to an embodiment of the present application.
Fig. 7 illustrates a block diagram of the electronic device according to an embodiment of the present application.
Detailed description of embodiments
Hereinafter, example embodiments according to the present application will be described in detail with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of the present application rather than all of them, and it should be understood that the present application is not limited by the example embodiments described herein.
Overview of the application
As described above, monocular depth models are increasingly widely used for depth estimation of two-dimensional images.
Current monocular depth estimation mainly has two implementations: one trains the model with binocular image pairs and mainly exploits the geometric relation between the two images of a pair; the other trains on video from a monocular camera and mainly exploits the information between consecutive frames. There are also methods that use binocular images and video at the same time.
With the unsupervised training method based on binocular images, depth estimation can be carried out conveniently; however, the current estimation methods described above lead to relatively blurry object edges.
Through research, the inventors of the present application found that this blur at object edges is mainly caused by the image warping used in the training process being unable to handle object occlusion.
In view of the above technical problem, the basic idea of the present application is to compute occlusion masks during model training and block the backward gradients of the occluded regions, while randomly flipping part of the input images and then flipping their predictions back again before back-propagating the gradients.
Specifically, the training method, the training device, and the electronic device of the monocular depth model provided by the present application first obtain a plurality of binocular images for training the monocular depth model, then randomly select at least one monocular image from the plurality of binocular images, compute, for each monocular image other than the at least one monocular image, a corresponding first disparity image and a first mask image corresponding to the first disparity image, and, for each monocular image of the at least one monocular image, compute the disparity image of the flipped monocular image and flip it back again as a second disparity image and compute the second mask image corresponding to the second disparity image, and finally train the monocular depth model by blocking the backward gradient of the region of the first disparity image occluded by the first mask image and the backward gradient of the region of the second disparity image occluded by the second mask image. In this way, depth blur at object edges can be effectively resolved while the overall prediction accuracy of the model is improved.
Having described the basic principle of the present application, various non-limiting embodiments of the present application will be described in detail below with reference to the accompanying drawings.
Exemplary method
Fig. 1 illustrates a flow chart of the training method of the monocular depth model according to an embodiment of the present application.
As shown in Fig. 1, the training method of the monocular depth model according to the embodiment of the present application comprises: S110, obtaining a plurality of binocular images for training the monocular depth model; S120, randomly selecting at least one monocular image from the plurality of binocular images; S130, computing, for each monocular image in the plurality of binocular images other than the at least one monocular image, a corresponding first disparity image and a first mask image corresponding to the first disparity image; S140, for each monocular image of the at least one monocular image, computing the disparity image of the flipped monocular image and flipping it back again as a second disparity image, and computing a second mask image corresponding to the second disparity image; and S150, training the monocular depth model by blocking the backward gradient of the region of the first disparity image occluded by the first mask image and the backward gradient of the region of the second disparity image occluded by the second mask image.
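For illustration only (this sketch is not part of the original disclosure), steps S110 to S150 could be realized in a PyTorch-style training iteration roughly as follows, where `disp_net`, `make_occlusion_mask`, and `warp_with_disparity` are assumed helper components (sketched further below) rather than elements defined by the patent:

```python
# Hypothetical sketch of one training iteration covering steps S110-S150.
# `disp_net` is an assumed disparity-prediction network; `make_occlusion_mask`
# and `warp_with_disparity` are assumed helpers sketched later in this text.
import torch

def training_step(disp_net, left, right, optimizer, flip_prob=0.5):
    # S110: `left` and `right` are batches of binocular images, shape (B, 3, H, W).
    # S120: randomly choose which samples in the batch will be flipped.
    flip = torch.rand(left.shape[0], device=left.device) < flip_prob

    total_loss = 0.0
    for img, other in ((left, right), (right, left)):
        # S130: disparity for the non-flipped samples.
        disp = disp_net(img)
        # S140: for the selected samples, predict on the flipped image,
        # then flip the prediction back (the "second disparity image").
        disp_flipped_back = torch.flip(disp_net(torch.flip(img, dims=[3])), dims=[3])
        disp = torch.where(flip.view(-1, 1, 1, 1), disp_flipped_back, disp)
        mask = make_occlusion_mask(disp)         # first / second mask image
        pred = warp_with_disparity(other, disp)  # synthesize the predicted image
        # S150: the mask zeroes the loss, and hence the backward gradient,
        # in the occluded regions.
        total_loss = total_loss + (mask * (pred - img).abs()).mean()

    optimizer.zero_grad()
    total_loss.backward()
    optimizer.step()
    return total_loss.item()
```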
In step S110, a plurality of binocular images for training the monocular depth model are obtained. That is, in the training method of the monocular depth model according to the embodiment of the present application, the monocular depth model is trained with an unsupervised training scheme based on binocular images.
Here, each binocular image comprises a left-eye image and a right-eye image, each serving as a monocular image. During model training, a left disparity image is generated based on the left-eye image and used to synthesize the right-eye image corresponding to that left-eye image. Likewise, a right disparity image is generated based on the right-eye image and used to synthesize the left-eye image corresponding to that right-eye image.
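As an illustration of how one monocular image can be synthesized from the other using a disparity image (an implementation assumption; the patent does not prescribe a particular warping routine), a minimal PyTorch sketch might look like this:

```python
import torch
import torch.nn.functional as F

def warp_with_disparity(src, disp):
    """Resample `src` (B, C, H, W) at horizontally shifted positions given by
    `disp` (B, 1, H, W), with disparity expressed in pixels. This is a common
    building block of unsupervised stereo training, shown here only as a sketch."""
    b, _, h, w = src.shape
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, h, device=src.device),
        torch.linspace(-1, 1, w, device=src.device),
        indexing="ij",
    )
    # Shift x coordinates by the disparity, converted to normalized [-1, 1] units.
    x_shifted = xs.unsqueeze(0) + 2.0 * disp.squeeze(1) / max(w - 1, 1)
    grid = torch.stack((x_shifted, ys.unsqueeze(0).expand_as(x_shifted)), dim=3)
    return F.grid_sample(src, grid, padding_mode="border", align_corners=True)
```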
In step S120, at least one monocular image in the plurality of binocular images is randomly selected. As described above, in the training method of the monocular depth model according to the embodiment of the present application, a randomly selected part of the input images is flipped, the disparity predicted for each flipped image is then flipped back, and image synthesis and back-propagation of the gradients are carried out on that basis.
Moreover, in the embodiment of the present application, the selected images need not be restricted to selecting the left-eye image and the right-eye image of a binocular image at the same time. That is, only the left-eye images of some binocular images may be selected, or only the right-eye images of some binocular images, or the left-eye images of one part of the binocular images together with the right-eye images of another part, and so on. Of course, in the embodiment of the present application, at least one of the plurality of binocular images may also be randomly selected, and both the left-eye image and the right-eye image of each selected binocular image may be used as images to be flipped.
That is, in the training method of the monocular depth model according to the embodiment of the present application, randomly selecting at least one monocular image from the plurality of binocular images comprises: randomly selecting at least one binocular image from the plurality of binocular images to obtain the left-eye image and the right-eye image of the at least one binocular image, and using the left-eye image and the right-eye image as the at least one monocular image.
In this way, because the left-eye image and the right-eye image of a binocular image are flipped together, one of them can be processed in the same way as the other, which reduces computational complexity. Moreover, because the flipped images include both left-eye and right-eye images, the diversity of the training samples after flipping is increased, which can further improve the prediction accuracy of the model.
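Purely as an illustration of the selection strategies described above (the helper below is hypothetical and not taken from the patent), the choice of which monocular images to flip could be written as:

```python
import random

def select_images_to_flip(pairs, per_pair=True, p=0.5):
    """`pairs` is a list of (left_image, right_image) tuples. Returns a set of
    (pair_index, side) keys marking the monocular images to be flipped."""
    selected = set()
    for i, _ in enumerate(pairs):
        if per_pair:
            # Select whole binocular pairs: both eyes are flipped together.
            if random.random() < p:
                selected.update({(i, "left"), (i, "right")})
        else:
            # Select left-eye and right-eye images independently.
            for side in ("left", "right"):
                if random.random() < p:
                    selected.add((i, side))
    return selected
```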
In step S130, for each monocular image in the plurality of binocular images other than the at least one monocular image, the corresponding first disparity image and the first mask image corresponding to the first disparity image are computed. In the following, the process of generating disparity images and mask images for the non-flipped and flipped images in the training method of the monocular depth model according to the embodiment of the present application is described with reference to Fig. 2. Fig. 2 illustrates a schematic diagram of the process of generating disparity images and mask images according to an embodiment of the present application.
As shown in the left half of Fig. 2, for an input image that does not need to be flipped, for example the input left-eye image shown in Fig. 2, a left disparity image corresponding to the left-eye image is generated, and then a mask image corresponding to the left disparity image is generated. Although Fig. 2 shows the operations performed on an input left-eye image, the operations performed on an input right-eye image are identical. Therefore, in the embodiment of the present application, the first disparity image refers to the disparity image generated for an input image that is not flipped, and the first mask image refers to the mask image generated for an input image that is not flipped; each includes both the disparity images and mask images for left-eye images and those for right-eye images.
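The patent does not fix a particular construction for the mask image, so the following is only one plausible, hypothetical way to derive an occlusion mask from a disparity image: scatter each source column to its disparity-shifted target column and mark columns that receive no contribution as occluded.

```python
import torch

def make_occlusion_mask(disp):
    """Return a mask that is 1 where the synthesized view is expected to be
    visible and 0 where it is occluded. `disp` has shape (B, 1, H, W) with
    disparity in pixels. This construction is an assumption for illustration,
    not the construction prescribed by the patent."""
    b, _, h, w = disp.shape
    xs = torch.arange(w, device=disp.device).view(1, 1, 1, w).expand(b, 1, h, w)
    target = (xs + disp).round().long().clamp(0, w - 1)
    mask = torch.zeros_like(disp)
    # Every target column hit by at least one source pixel is marked visible.
    mask.scatter_(3, target, 1.0)
    return mask.detach()  # the mask itself carries no gradient
```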
In step S140, for each monocular image of the at least one monocular image, the disparity image of the flipped monocular image is computed and flipped back again as the second disparity image, and the second mask image corresponding to the second disparity image is computed.
Again with reference to Fig. 2, as shown in the right half of Fig. 2, an input image, for example the input left-eye image shown in the left half of Fig. 2, is first flipped to obtain a flipped input image; the disparity image of that flipped input image, that is, the disparity image of the flipped image shown in Fig. 2, is then computed; this disparity image is flipped again to obtain the flipped-back disparity image shown in Fig. 2. Finally, a corresponding mask image is generated for the flipped-back disparity image.
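The flip, predict, flip-back path in the right half of Fig. 2 amounts to two horizontal flips around a disparity prediction. A minimal sketch (with `disp_net` an assumed disparity network) is:

```python
import torch

def predict_disparity_with_flip(disp_net, image):
    """Second-disparity-image path of Fig. 2: flip the input horizontally,
    predict the disparity of the flipped image, then flip that prediction back
    so it is aligned with the original image. Depending on the disparity sign
    convention used by `disp_net`, an additional sign change may be needed."""
    flipped = torch.flip(image, dims=[3])                 # flip along the width axis
    disp_of_flipped = disp_net(flipped)                   # disparity of the flipped image
    second_disp = torch.flip(disp_of_flipped, dims=[3])   # flip back
    return second_disp
```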
Therefore, in the embodiment of the present application, the second disparity image refers to the disparity image generated for a selected input image that is flipped, and the second mask image refers to the mask image generated for a selected input image that is flipped. As described above, they may include only the disparity images and mask images for left-eye images, only those for right-eye images, or those for both the left-eye image and the right-eye image of a binocular image.
That is, in the embodiment of the present application, the input images are partitioned with each monocular image of an input binocular image as the unit: for one part, the disparity image and the corresponding mask image are computed directly, namely the first disparity image and the first mask image described above; for the other part, the disparity image and the corresponding mask image are computed after flipping, namely the second disparity image and the second mask image described above.
Finally, in step S150, the monocular depth model is trained by blocking the backward gradient of the region of the first disparity image occluded by the first mask image and the backward gradient of the region of the second disparity image occluded by the second mask image.
In this way, by computing the occlusion masks and blocking the backward gradients of the occluded regions, the object regions in the image are emphasized through the mask images while the non-object regions are suppressed, so that depth blur at object edges can be effectively resolved.
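One way to block the backward gradient of the occluded regions in practice (an implementation assumption, not the only possibility) is to multiply the per-pixel difference by a mask that itself carries no gradient, so occluded pixels contribute nothing to back-propagation:

```python
import torch

def masked_photometric_loss(pred, target, mask):
    """Absolute difference between the predicted and real image, with occluded
    pixels (mask == 0) excluded so that no gradient flows back through them.
    The L1 difference is an assumption; the patent only requires some
    difference function between the predicted and real images."""
    mask = mask.detach()                       # do not train through the mask
    per_pixel = (pred - target).abs() * mask   # zero loss => zero gradient where occluded
    # Normalize by the number of visible pixels to keep the scale stable.
    return per_pixel.sum() / mask.sum().clamp(min=1.0)
```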
Specifically, network models of different structures may be used in the training method of the monocular depth model according to the embodiment of the present application. Fig. 3 illustrates a schematic diagram of a first example of the network structure according to an embodiment of the present application. As shown in Fig. 3, for an input left-eye image I_l and right-eye image I_r, the corresponding left disparity image d_l and right disparity image d_r are computed. Here, those skilled in the art will appreciate that the disparity image d_l corresponding to the left-eye image I_l may be the first disparity image corresponding to a non-flipped input image as described above, or the second disparity image corresponding to a flipped input image as described above; likewise, the disparity image d_r corresponding to the right-eye image I_r may also be the first disparity image or the second disparity image as described above.
Next, the left disparity image d_l is synthesized with the corresponding right-eye image I_r, and the right disparity image d_r is synthesized with the corresponding left-eye image I_l, to generate the predicted images Î_l and Î_r. Then, a difference function between the predicted images Î_l and Î_r and the real images I_l and I_r is computed, and the monocular depth model is trained with the difference function as at least a part of the loss function. Moreover, as described above, during training the predicted images are occluded with the mask images, and the backward gradients of the occluded regions are blocked. Here, the difference function may be the image difference between the predicted images Î_l, Î_r and the real images I_l, I_r, or the sum of squares of the image differences, and so on.
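For concreteness, writing M for the mask image, I for the real image, Î for the predicted image, and p for a pixel, the two difference functions mentioned above can be expressed as follows (illustrative notation only; multiplying by M is what blocks the loss, and hence the backward gradient, in the occluded regions):

```latex
L_{\mathrm{abs}}(\hat{I}, I) = \sum_{p} M(p)\,\bigl|\hat{I}(p) - I(p)\bigr|
\qquad\text{or}\qquad
L_{\mathrm{sq}}(\hat{I}, I) = \sum_{p} M(p)\,\bigl(\hat{I}(p) - I(p)\bigr)^{2}
```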
Here, the network structure shown in Fig. 3 computes disparity images and synthesizes predicted images for both the left-eye image and the right-eye image at the same time, which can improve the prediction accuracy of the model.
Therefore, in the training method of the monocular depth model according to the embodiment of the present application, training the monocular depth model by blocking the backward gradient of the region of the first disparity image occluded by the first mask image and the backward gradient of the region of the second disparity image occluded by the second mask image comprises: synthesizing each disparity image in the first disparity image and the second disparity image with the monocular image corresponding to it to obtain a predicted image; computing a difference function between the predicted image and the real image; and training the monocular depth model with the difference function as at least a part of the loss function, wherein during training the backward gradient of the region of the predicted image occluded by the mask image is blocked.
As another example of the network structure, Fig. 4 illustrates a schematic diagram of a second example of the network structure according to an embodiment of the present application. As shown in Fig. 4, the network structure of this example may be trained with only one of the left-eye image I_l and the right-eye image I_r. For example, for the left-eye image I_l, its left disparity image d_l is first computed and then synthesized with the corresponding right-eye image I_r to obtain the predicted image Î_l. Next, the difference function between the predicted image Î_l and the real image I_l is computed, and the monocular depth model is trained with the difference function as at least a part of the loss function. Likewise, during training the predicted image is occluded with the mask image, and the backward gradient of the occluded region is blocked.
In addition, those skilled in the art will appreciate that the network structure shown in Fig. 4 can equally be applied to the right-eye image I_r. That is, for the right-eye image I_r, its right disparity image d_r is first computed and then synthesized with the corresponding left-eye image I_l to obtain the predicted image Î_r. Next, the difference function between the predicted image Î_r and the real image I_r is computed, and the monocular depth model is trained with the difference function as at least a part of the loss function. Likewise, during training the predicted image is occluded with the mask image, and the backward gradient of the occluded region is blocked. Here, the difference function may be the image difference between the predicted image and the real image, or the sum of squares of the image differences, and so on.
Here, the network structure shown in Fig. 4 computes a disparity image and synthesizes a predicted image for only one of the left-eye image and the right-eye image, so that the computation is relatively simple; it is also compatible with some existing network structures.
That is, in the training method of the monocular depth model according to the embodiment of the present application, training the monocular depth model by blocking the backward gradient of the region of the first disparity image occluded by the first mask image and the backward gradient of the region of the second disparity image occluded by the second mask image comprises: synthesizing each left disparity image or each right disparity image, in the first disparity image and the second disparity image, that corresponds to one of the left-eye image and the right-eye image with the corresponding right-eye image or left-eye image to obtain a predicted image; computing a difference function between the predicted image and the real image; and training the monocular depth model with the difference function as at least a part of the loss function, wherein during training the backward gradient of the region of the predicted image occluded by the mask image is blocked.
Fig. 5 illustrates example results of the training method of the monocular depth model according to an embodiment of the present application. In Fig. 5, (a) shows the left-eye image I_l, (b) shows the right-eye image I_r, (c) shows the disparity image d_l aligned with the left-eye image, (d) shows the reconstructed left-eye predicted image Î_l, (e) shows the mask image corresponding to the disparity image d_l, and (f) shows the reconstructed left-eye image after masking with the mask image. As can be seen from (d), the reconstructed left-eye predicted image contains obvious repetitions and artifacts. By using the mask image (e) generated from the disparity image (c) to block the back-propagation of those repetitions and artifacts, the white regions are occluded in the final result (f).
Exemplary device
Fig. 6 illustrates a block diagram of the training device of the monocular depth model according to an embodiment of the present application.
As shown in Fig. 6, the training device 200 of the monocular depth model according to an embodiment of the present application comprises: an image obtaining unit 210 for obtaining a plurality of binocular images for training the monocular depth model; an image selection unit 220 for randomly selecting at least one monocular image from the plurality of binocular images; a first computing unit 230 for computing, for each monocular image in the plurality of binocular images other than the at least one monocular image, a corresponding first disparity image and a first mask image corresponding to the first disparity image; a second computing unit 240 for computing, for each monocular image of the at least one monocular image, the disparity image of the flipped monocular image and flipping it back again as a second disparity image, and computing a second mask image corresponding to the second disparity image; and a model training unit 250 for training the monocular depth model by blocking the backward gradient of the region of the first disparity image occluded by the first mask image and the backward gradient of the region of the second disparity image occluded by the second mask image.
In one example, in the training device 200 of the monocular depth model described above, the binocular image comprises a left-eye image and a right-eye image each serving as a monocular image; the disparity image corresponding to the left-eye image is a left disparity image; and the disparity image corresponding to the right-eye image is a right disparity image.
In one example, in the training device 200 of the monocular depth model described above, the image selection unit 220 is configured to: randomly select at least one binocular image from the plurality of binocular images to obtain both the left-eye image and the right-eye image of the at least one binocular image as the at least one monocular image.
In one example, in the training device 200 of the monocular depth model described above, the model training unit 250 is configured to: synthesize each disparity image in the first disparity image and the second disparity image with the monocular image corresponding to it to obtain a predicted image; compute a difference function between the predicted image and the real image; and train the monocular depth model with the difference function as at least a part of the loss function, wherein during training the backward gradient of the region of the predicted image occluded by the mask image is blocked.
In one example, in the training device 200 of the monocular depth model described above, the model training unit 250 is configured to: synthesize each left disparity image or each right disparity image, in the first disparity image and the second disparity image, that corresponds to one of the left-eye image and the right-eye image with the corresponding right-eye image or left-eye image to obtain a predicted image; compute a difference function between the predicted image and the real image; and train the monocular depth model with the difference function as at least a part of the loss function, wherein during training the backward gradient of the region of the predicted image occluded by the mask image is blocked.
Here, those skilled in the art will appreciate that the specific functions and operations of the units and modules in the training device 200 of the monocular depth model described above have already been described in detail in the training method of the monocular depth model described with reference to Fig. 1 to Fig. 5, and repeated description thereof is therefore omitted.
As described above, the training device 200 of the monocular depth model according to the embodiment of the present application may be implemented in various terminal devices, for example a server running the monocular depth model. In one example, the device 200 according to the embodiment of the present application may be integrated into a terminal device as a software module and/or a hardware module. For example, the device 200 may be a software module in the operating system of the terminal device, or may be an application developed for the terminal device; of course, the device 200 may equally be one of the many hardware modules of the terminal device.
Alternatively, in another example, the training device 200 of the monocular depth model and the terminal device may also be separate devices, and the device 200 may be connected to the terminal device through a wired and/or wireless network and transmit interactive information in an agreed data format.
Exemplary electronic device
Hereinafter, an electronic device according to an embodiment of the present application is described with reference to Fig. 7.
Fig. 7 illustrates a block diagram of the electronic device according to an embodiment of the present application.
As shown in Fig. 7, the electronic device 10 comprises one or more processors 11 and a memory 12.
The processor 11 may be a central processing unit (CPU) or another form of processing unit having data processing capability and/or instruction execution capability, and may control other components in the electronic device 10 to perform desired functions.
The memory 12 may include one or more computer program products, and the computer program product may include computer-readable storage media of various forms, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or a cache. The non-volatile memory may include, for example, read-only memory (ROM), a hard disk, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 11 may run the program instructions to implement the training method of the monocular depth model of the embodiments of the present application described above and/or other desired functions. Various contents such as input binocular images, disparity images, and mask images may also be stored on the computer-readable storage medium.
In one example, the electronic device 10 may further comprise an input device 13 and an output device 14, and these components are interconnected through a bus system and/or other forms of connection mechanisms (not shown).
For example, the input device 13 may include a binocular camera for capturing binocular images. In addition, the input device 13 may also include, for example, a keyboard, a mouse, and the like.
The output device 14 may output various information to the outside and may include, for example, a display, a loudspeaker, a printer, and a communication network and the remote output devices connected thereto.
Of course, for simplicity, only some of the components of the electronic device 10 related to the present application are shown in Fig. 7, and components such as a bus and input/output interfaces are omitted. In addition, depending on the specific application, the electronic device 10 may also include any other appropriate components.
Exemplary computer program product and computer-readable storage medium
In addition to the method and device described above, an embodiment of the present application may also be a computer program product comprising computer program instructions which, when run by a processor, cause the processor to perform the steps of the training method of the monocular depth model according to the various embodiments of the present application described in the "Exemplary method" section of this specification.
The computer program product may be written, in any combination of one or more programming languages, with program code for carrying out the operations of the embodiments of the present application; the programming languages include object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
Furthermore, an embodiment of the present application may also be a computer-readable storage medium on which computer program instructions are stored, which, when run by a processor, cause the processor to perform the steps of the training method of the monocular depth model according to the various embodiments of the present application described in the "Exemplary method" section of this specification.
The computer-readable storage medium may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may include, but is not limited to, an electric, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any appropriate combination of the above.
The basic principles of the present application have been described above in conjunction with specific embodiments. However, it should be noted that the merits, advantages, effects, and the like mentioned in the present application are merely examples rather than limitations, and these merits, advantages, and effects must not be regarded as necessary for every embodiment of the present application. In addition, the specific details disclosed above are provided only for the purpose of illustration and ease of understanding rather than limitation; the above details do not limit the present application to being implemented with those specific details.
The block diagrams of the devices, apparatuses, equipment, and systems involved in the present application are only illustrative examples and are not intended to require or imply that they must be connected, arranged, or configured in the manner shown in the block diagrams. As those skilled in the art will appreciate, these devices, apparatuses, equipment, and systems may be connected, arranged, or configured in any manner. Words such as "include", "comprise", and "have" are open-ended terms that mean "including but not limited to" and may be used interchangeably with it. The words "or" and "and" used herein refer to "and/or" and may be used interchangeably with it, unless the context clearly indicates otherwise. The words "such as" used herein refer to the phrase "such as, but not limited to" and may be used interchangeably with it.
It should also be noted that, in the devices, apparatuses, and methods of the present application, each component or each step may be decomposed and/or recombined. Such decompositions and/or recombinations should be regarded as equivalent solutions of the present application.
The above description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present application. Various modifications to these aspects will be readily apparent to those skilled in the art, and the general principles defined herein may be applied to other aspects without departing from the scope of the present application. Therefore, the present application is not intended to be limited to the aspects shown herein but accords with the widest scope consistent with the principles and novel features disclosed herein.
The foregoing description has been presented for the purposes of illustration and description. Furthermore, this description is not intended to limit the embodiments of the present application to the forms disclosed herein. Although a number of example aspects and embodiments have been discussed above, those skilled in the art will recognize certain variations, modifications, changes, additions, and sub-combinations thereof.

Claims (12)

1. A training method of a monocular depth model, comprising:
obtaining a plurality of binocular images for training the monocular depth model;
randomly selecting at least one monocular image from the plurality of binocular images;
computing, for each monocular image in the plurality of binocular images other than the at least one monocular image, a corresponding first disparity image and a first mask image corresponding to the first disparity image;
for each monocular image of the at least one monocular image, computing the disparity image of the flipped monocular image and flipping it back again as a second disparity image, and computing a second mask image corresponding to the second disparity image; and
training the monocular depth model by blocking the backward gradient of the region of the first disparity image occluded by the first mask image and the backward gradient of the region of the second disparity image occluded by the second mask image.
2. The training method of the monocular depth model of claim 1, wherein
the binocular image comprises a left-eye image and a right-eye image each serving as a monocular image;
the disparity image corresponding to the left-eye image is a left disparity image; and
the disparity image corresponding to the right-eye image is a right disparity image.
3. The training method of the monocular depth model of claim 2, wherein randomly selecting at least one monocular image from the plurality of binocular images comprises:
randomly selecting at least one binocular image from the plurality of binocular images to obtain the left-eye image and the right-eye image of the at least one binocular image, and using the left-eye image and the right-eye image as the at least one monocular image.
4. The training method of the monocular depth model of claim 1, wherein training the monocular depth model by blocking the backward gradient of the region of the first disparity image occluded by the first mask image and the backward gradient of the region of the second disparity image occluded by the second mask image comprises:
synthesizing each disparity image in the first disparity image and the second disparity image with the monocular image corresponding to it to obtain a predicted image;
computing a difference function between the predicted image and the real image; and
training the monocular depth model with the difference function as at least a part of the loss function, wherein during training the backward gradient of the region of the predicted image occluded by the mask image is blocked.
5. The training method of the monocular depth model of claim 2, wherein training the monocular depth model by blocking the backward gradient of the region of the first disparity image occluded by the first mask image and the backward gradient of the region of the second disparity image occluded by the second mask image comprises:
synthesizing each left disparity image or each right disparity image, in the first disparity image and the second disparity image, that corresponds to one of the left-eye image and the right-eye image with the corresponding right-eye image or left-eye image to obtain a predicted image;
computing a difference function between the predicted image and the real image; and
training the monocular depth model with the difference function as at least a part of the loss function, wherein during training the backward gradient of the region of the predicted image occluded by the mask image is blocked.
6. A training device of a monocular depth model, comprising:
an image obtaining unit for obtaining a plurality of binocular images for training the monocular depth model;
an image selection unit for randomly selecting at least one monocular image from the plurality of binocular images;
a first computing unit for computing, for each monocular image in the plurality of binocular images other than the at least one monocular image, a corresponding first disparity image and a first mask image corresponding to the first disparity image;
a second computing unit for computing, for each monocular image of the at least one monocular image, the disparity image of the flipped monocular image and flipping it back again as a second disparity image, and computing a second mask image corresponding to the second disparity image; and
a model training unit for training the monocular depth model by blocking the backward gradient of the region of the first disparity image occluded by the first mask image and the backward gradient of the region of the second disparity image occluded by the second mask image.
7. The training device of the monocular depth model of claim 6, wherein
the binocular image comprises a left-eye image and a right-eye image each serving as a monocular image;
the disparity image corresponding to the left-eye image is a left disparity image; and
the disparity image corresponding to the right-eye image is a right disparity image.
8. The training device of the monocular depth model of claim 7, wherein the image selection unit is configured to:
randomly select at least one binocular image from the plurality of binocular images to obtain both the left-eye image and the right-eye image of the at least one binocular image as the at least one monocular image.
9. The training device of the monocular depth model of claim 6, wherein the model training unit is configured to:
synthesize each disparity image in the first disparity image and the second disparity image with the monocular image corresponding to it to obtain a predicted image;
compute a difference function between the predicted image and the real image; and
train the monocular depth model with the difference function as at least a part of the loss function, wherein during training the backward gradient of the region of the predicted image occluded by the mask image is blocked.
10. The training device of the monocular depth model of claim 6, wherein the model training unit is configured to:
synthesize each left disparity image or each right disparity image, in the first disparity image and the second disparity image, that corresponds to one of the left-eye image and the right-eye image with the corresponding right-eye image or left-eye image to obtain a predicted image;
compute a difference function between the predicted image and the real image; and
train the monocular depth model with the difference function as at least a part of the loss function, wherein during training the backward gradient of the region of the predicted image occluded by the mask image is blocked.
11. An electronic device, comprising:
a processor; and
a memory in which computer program instructions are stored, the computer program instructions, when run by the processor, causing the processor to perform the training method of the monocular depth model of any one of claims 1 to 5.
12. A computer-readable medium on which computer program instructions are stored, the computer program instructions, when run by a processor, causing the processor to perform the training method of the monocular depth model of any one of claims 1 to 5.
CN201811106152.4A 2018-09-21 2018-09-21 Monocular depth model training method and device and electronic equipment Active CN109087346B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811106152.4A CN109087346B (en) 2018-09-21 2018-09-21 Monocular depth model training method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811106152.4A CN109087346B (en) 2018-09-21 2018-09-21 Monocular depth model training method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN109087346A true CN109087346A (en) 2018-12-25
CN109087346B CN109087346B (en) 2020-08-11

Family

ID=64842277

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811106152.4A Active CN109087346B (en) 2018-09-21 2018-09-21 Monocular depth model training method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN109087346B (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110070056A (en) * 2019-04-25 2019-07-30 腾讯科技(深圳)有限公司 Image processing method, device, storage medium and equipment
CN111105451A (en) * 2019-10-31 2020-05-05 武汉大学 A Binocular Depth Estimation Method for Driving Scenes Overcoming Occlusion Effect
CN111178547A (en) * 2020-04-10 2020-05-19 支付宝(杭州)信息技术有限公司 Method and system for model training based on private data
CN111292425A (en) * 2020-01-21 2020-06-16 武汉大学 A View Synthesis Method Based on Monocular Hybrid Dataset
CN111476834A (en) * 2019-01-24 2020-07-31 北京地平线机器人技术研发有限公司 Method and device for generating image and electronic equipment
CN111508010A (en) * 2019-01-31 2020-08-07 北京地平线机器人技术研发有限公司 Method and device for depth estimation of two-dimensional image and electronic equipment
CN111583152A (en) * 2020-05-11 2020-08-25 福建帝视信息科技有限公司 Image artifact detection and automatic removal method based on U-net structure
CN111696145A (en) * 2019-03-11 2020-09-22 北京地平线机器人技术研发有限公司 Depth information determination method, depth information determination device and electronic equipment
CN112149458A (en) * 2019-06-27 2020-12-29 商汤集团有限公司 Obstacle detection method, intelligent driving control method, device, medium and equipment
CN112634147A (en) * 2020-12-09 2021-04-09 上海健康医学院 PET image noise reduction method, system, device and medium for self-supervision learning
CN113128601A (en) * 2021-04-22 2021-07-16 北京百度网讯科技有限公司 Training method of classification model and method for classifying images
CN113538258A (en) * 2021-06-15 2021-10-22 福州大学 Image deblurring model and method based on mask

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103413298A (en) * 2013-07-17 2013-11-27 宁波大学 Three-dimensional image objective evaluation method based on visual characteristics
CN105374039A (en) * 2015-11-16 2016-03-02 辽宁大学 Monocular image depth information estimation method based on contour acuity
EP2747427B1 (en) * 2012-12-21 2016-03-16 imcube labs GmbH Method, apparatus and computer program usable in synthesizing a stereoscopic image
CN107578436A (en) * 2017-08-02 2018-01-12 南京邮电大学 A Depth Estimation Method for Monocular Image Based on Fully Convolutional Neural Network FCN
CN107945265A (en) * 2017-11-29 2018-04-20 华中科技大学 Real-time dense monocular SLAM method and systems based on on-line study depth prediction network
CN108269278A (en) * 2016-12-30 2018-07-10 杭州海康威视数字技术股份有限公司 A kind of method and device of scene modeling

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2747427B1 (en) * 2012-12-21 2016-03-16 imcube labs GmbH Method, apparatus and computer program usable in synthesizing a stereoscopic image
CN103413298A (en) * 2013-07-17 2013-11-27 宁波大学 Three-dimensional image objective evaluation method based on visual characteristics
CN105374039A (en) * 2015-11-16 2016-03-02 辽宁大学 Monocular image depth information estimation method based on contour acuity
CN108269278A (en) * 2016-12-30 2018-07-10 杭州海康威视数字技术股份有限公司 A kind of method and device of scene modeling
CN107578436A (en) * 2017-08-02 2018-01-12 南京邮电大学 A Depth Estimation Method for Monocular Image Based on Fully Convolutional Neural Network FCN
CN107945265A (en) * 2017-11-29 2018-04-20 华中科技大学 Real-time dense monocular SLAM method and systems based on on-line study depth prediction network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
林义闽 et al.: "A robot vision system for three-dimensional reconstruction of weakly textured scenes", CNKI *

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111476834B (en) * 2019-01-24 2023-08-11 北京地平线机器人技术研发有限公司 Method and device for generating image and electronic equipment
CN111476834A (en) * 2019-01-24 2020-07-31 北京地平线机器人技术研发有限公司 Method and device for generating image and electronic equipment
CN111508010B (en) * 2019-01-31 2023-08-08 北京地平线机器人技术研发有限公司 Method and device for estimating depth of two-dimensional image and electronic equipment
CN111508010A (en) * 2019-01-31 2020-08-07 北京地平线机器人技术研发有限公司 Method and device for depth estimation of two-dimensional image and electronic equipment
CN111696145B (en) * 2019-03-11 2023-11-03 北京地平线机器人技术研发有限公司 Depth information determining method, depth information determining device and electronic equipment
CN111696145A (en) * 2019-03-11 2020-09-22 北京地平线机器人技术研发有限公司 Depth information determination method, depth information determination device and electronic equipment
CN110070056B (en) * 2019-04-25 2023-01-10 腾讯科技(深圳)有限公司 Image processing method, device, storage medium and equipment
CN110070056A (en) * 2019-04-25 2019-07-30 腾讯科技(深圳)有限公司 Image processing method, device, storage medium and equipment
CN112149458B (en) * 2019-06-27 2025-02-25 商汤集团有限公司 Obstacle detection method, intelligent driving control method, device, medium and equipment
CN112149458A (en) * 2019-06-27 2020-12-29 商汤集团有限公司 Obstacle detection method, intelligent driving control method, device, medium and equipment
CN111105451B (en) * 2019-10-31 2022-08-05 武汉大学 Driving scene binocular depth estimation method for overcoming occlusion effect
CN111105451A (en) * 2019-10-31 2020-05-05 武汉大学 A Binocular Depth Estimation Method for Driving Scenes Overcoming Occlusion Effect
CN111292425A (en) * 2020-01-21 2020-06-16 武汉大学 A View Synthesis Method Based on Monocular Hybrid Dataset
CN111292425B (en) * 2020-01-21 2022-02-01 武汉大学 View synthesis method based on monocular and binocular mixed data set
CN111178547A (en) * 2020-04-10 2020-05-19 支付宝(杭州)信息技术有限公司 Method and system for model training based on private data
CN111583152B (en) * 2020-05-11 2023-07-07 福建帝视科技集团有限公司 Image artifact detection and automatic removal method based on U-net structure
CN111583152A (en) * 2020-05-11 2020-08-25 福建帝视信息科技有限公司 Image artifact detection and automatic removal method based on U-net structure
CN112634147A (en) * 2020-12-09 2021-04-09 上海健康医学院 PET image noise reduction method, system, device and medium for self-supervision learning
CN112634147B (en) * 2020-12-09 2024-03-29 上海健康医学院 PET image noise reduction method, system, device and medium for self-supervision learning
CN113128601A (en) * 2021-04-22 2021-07-16 北京百度网讯科技有限公司 Training method of classification model and method for classifying images
CN113538258A (en) * 2021-06-15 2021-10-22 福州大学 Image deblurring model and method based on mask
CN113538258B (en) * 2021-06-15 2023-10-13 福州大学 Mask-based image deblurring model and method

Also Published As

Publication number Publication date
CN109087346B (en) 2020-08-11

Similar Documents

Publication Publication Date Title
CN109087346A (en) Training method, training device and the electronic equipment of monocular depth model
Xiao et al. Deepfocus: Learned image synthesis for computational display
US8538138B2 (en) Global registration of multiple 3D point sets via optimization on a manifold
JP2022524891A (en) Image processing methods and equipment, electronic devices and computer programs
CN102835119B (en) Support the multi-core processor that the real-time 3D rendering on automatic stereoscopic display device is played up
EP3448032B1 (en) Enhancing motion pictures with accurate motion information
CN118053090A (en) Generating video using potential diffusion models
KR20140096532A (en) Apparatus and method for generating digital hologram
US20140354633A1 (en) Image processing method and image processing device
CN105530502B (en) According to the method and apparatus for the picture frame generation disparity map that stereoscopic camera is shot
Kellnhofer et al. Optimizing disparity for motion in depth
Lee et al. Automatic 2d-to-3d conversion using multi-scale deep neural network
US11532122B2 (en) Method and apparatus for processing holographic image
CN106169179A (en) Image denoising method and image noise reduction apparatus
CN113132706A (en) Controllable position virtual viewpoint generation method and device based on reverse mapping
CN116977167A (en) Video processing method and device, electronic equipment and storage medium
Mori et al. Exemplar-based inpainting for 6dof virtual reality photos
EP3882852A1 (en) Training alignment of a plurality of images
Lüke et al. Near Real‐Time Estimation of Super‐resolved Depth and All‐In‐Focus Images from a Plenoptic Camera Using Graphics Processing Units
Thatte Cinematic virtual reality with head-motion parallax
Shi et al. ImmersePro: End-to-End Stereo Video Synthesis Via Implicit Disparity Learning
Zhang et al. Efficient variational light field view synthesis for making stereoscopic 3D images
CN108197248A (en) A kind of method, apparatus and system of 3Dization 2D web displayings
KR101784208B1 (en) System and method for displaying three-dimension image using multiple depth camera
CN119273591A (en) Three-dimensional image generation method, device, storage medium and program product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant