
CN109087346A - Training method, training device, and electronic device for a monocular depth model - Google Patents

Training method, training device, and electronic device for a monocular depth model

Info

Publication number
CN109087346A
CN109087346A
Authority
CN
China
Prior art keywords
image
disparity image
monocular
training
depth model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811106152.4A
Other languages
Chinese (zh)
Other versions
CN109087346B (en)
Inventor
耿益锋
胡义涵
罗恒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Horizon Robotics Technology Research and Development Co Ltd
Original Assignee
Beijing Horizon Robotics Technology Research and Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Horizon Robotics Technology Research and Development Co Ltd filed Critical Beijing Horizon Robotics Technology Research and Development Co Ltd
Priority to CN201811106152.4A priority Critical patent/CN109087346B/en
Publication of CN109087346A publication Critical patent/CN109087346A/en
Application granted granted Critical
Publication of CN109087346B publication Critical patent/CN109087346B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/50 Depth or shape recovery
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Processing (AREA)

Abstract

A training method, a training device, and an electronic device for a monocular depth model are disclosed. The method comprises: obtaining a plurality of binocular images for training the monocular depth model; randomly selecting at least one monocular image from the plurality of binocular images; computing, for each monocular image in the plurality of binocular images other than the at least one monocular image, a corresponding first disparity image and a first mask image corresponding to the first disparity image; for each monocular image of the at least one monocular image, flipping the monocular image, computing the disparity image of the flipped image and flipping it back again as a second disparity image, and computing a second mask image corresponding to the second disparity image; and training the monocular depth model by blocking the backward gradient of the region of the first disparity image occluded by the first mask image and the backward gradient of the region of the second disparity image occluded by the second mask image. In this way, depth blur at object edges can be effectively resolved while the overall prediction accuracy of the model is improved.

Description

Training method, training device, and electronic device for a monocular depth model
Technical field
The present application relates to the field of model training, and more particularly, to a training method, a training device, and an electronic device for a monocular depth model.
Background
Most current computer vision techniques are built on two-dimensional images. How to extract depth information from a two-dimensional image or a video sequence, and then reconstruct a three-dimensional structure from the predicted depth image, is therefore an important problem. Depth greatly benefits tasks such as estimating object size, occlusion relationships, shape, and segmentation, and can be widely applied in scenarios such as 2D-to-3D film conversion, autonomous navigation of intelligent robots, robotic-arm grasping, and augmented reality.
In depth estimation, monocular depth estimation uses images captured by a single camera to estimate the depth of each pixel in the image, and unsupervised monocular depth estimation means that training the model requires neither per-pixel depth ground truth nor other annotations.
As depth estimation algorithms based on machine learning receive more and more extensive research, depth estimation with a monocular depth model is not limited by specific scene conditions and therefore has good applicability. Accordingly, an improved training scheme for monocular depth models is desirable.
Summary
To solve the above technical problem, the present application is proposed. Embodiments of the present application provide a training method, a training device, and an electronic device for a monocular depth model. During model training, occlusion masks are computed and the backward gradients of the occluded regions are blocked; at the same time, part of the input images are randomly flipped and their predictions are flipped back again before the gradients are back-propagated. This effectively resolves depth blur at object edges while improving the overall prediction accuracy of the model.
According to one aspect of the present application, a training method of a monocular depth model is provided, comprising: obtaining a plurality of binocular images for training the monocular depth model; randomly selecting at least one monocular image from the plurality of binocular images; computing, for each monocular image in the plurality of binocular images other than the at least one monocular image, a corresponding first disparity image and a first mask image corresponding to the first disparity image; for each monocular image of the at least one monocular image, computing the disparity image of the flipped monocular image and flipping it back again as a second disparity image, and computing a second mask image corresponding to the second disparity image; and training the monocular depth model by blocking the backward gradient of the region of the first disparity image occluded by the first mask image and the backward gradient of the region of the second disparity image occluded by the second mask image.
According to another aspect of the present application, a training device of a monocular depth model is provided, comprising: an image obtaining unit for obtaining a plurality of binocular images for training the monocular depth model; an image selection unit for randomly selecting at least one monocular image from the plurality of binocular images; a first computing unit for computing, for each monocular image in the plurality of binocular images other than the at least one monocular image, a corresponding first disparity image and a first mask image corresponding to the first disparity image; a second computing unit for computing, for each monocular image of the at least one monocular image, the disparity image of the flipped monocular image and flipping it back again as a second disparity image, and computing a second mask image corresponding to the second disparity image; and a model training unit for training the monocular depth model by blocking the backward gradient of the region of the first disparity image occluded by the first mask image and the backward gradient of the region of the second disparity image occluded by the second mask image.
According to yet another aspect of the present application, an electronic device is provided, comprising: a processor; and a memory in which computer program instructions are stored, the computer program instructions, when run by the processor, causing the processor to perform the training method of the monocular depth model described above.
According to still another aspect of the present application, a computer-readable medium is provided, on which computer program instructions are stored, the computer program instructions, when run by a processor, causing the processor to perform the training method of the monocular depth model described above.
Compared with the prior art, the training method, the training device, and the electronic device of the monocular depth model of the present application can obtain a plurality of binocular images for training the monocular depth model; randomly select at least one monocular image from the plurality of binocular images; compute, for each monocular image other than the at least one monocular image, a corresponding first disparity image and its first mask image; for each of the at least one monocular image, compute the disparity image of the flipped image and flip it back again as a second disparity image, and compute its second mask image; and train the monocular depth model by blocking the backward gradient of the region of the first disparity image occluded by the first mask image and the backward gradient of the region of the second disparity image occluded by the second mask image. In this way, by computing occlusion masks during training and blocking the backward gradients of the occluded regions, and by randomly flipping part of the input images and flipping their predictions back again before back-propagating the gradients, depth blur at object edges can be effectively resolved while the overall prediction accuracy of the model is improved.
Brief description of the drawings
The above and other objects, features, and advantages of the present application will become more apparent from the following detailed description of the embodiments of the present application taken in conjunction with the accompanying drawings. The drawings are provided for a further understanding of the embodiments of the present application and constitute a part of the specification; they serve, together with the embodiments, to explain the present application and do not limit it. In the drawings, the same reference numerals generally denote the same components or steps.
Fig. 1 illustrates a flow chart of the training method of the monocular depth model according to an embodiment of the present application.
Fig. 2 illustrates a schematic diagram of the process of generating disparity images and mask images according to an embodiment of the present application.
Fig. 3 illustrates a schematic diagram of a first example of a network structure according to an embodiment of the present application.
Fig. 4 illustrates a schematic diagram of a second example of a network structure according to an embodiment of the present application.
Fig. 5 illustrates example results of the training method of the monocular depth model according to an embodiment of the present application.
Fig. 6 illustrates a block diagram of the training device of the monocular depth model according to an embodiment of the present application.
Fig. 7 illustrates a block diagram of the electronic device according to an embodiment of the present application.
Detailed description of embodiments
Hereinafter, example embodiments according to the present application will be described in detail with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of the present application rather than all of them, and it should be understood that the present application is not limited by the example embodiments described herein.
Overview of the application
As described above, monocular depth models are increasingly widely used for depth estimation of two-dimensional images.
Current monocular depth estimation mainly has two implementations: one trains the model with binocular image pairs and mainly exploits the geometric relation between the two images of a pair; the other trains on video from a monocular camera and mainly exploits the information between consecutive frames. There are also methods that use binocular images and video at the same time.
With the unsupervised training method based on binocular images, depth estimation can be carried out conveniently; however, the current estimation methods described above lead to relatively blurry object edges.
Through research, the inventors of the present application found that this blur at object edges is mainly caused by the image warping used in the training process being unable to handle object occlusion.
In view of the above technical problem, the basic idea of the present application is to compute occlusion masks during model training and block the backward gradients of the occluded regions, while randomly flipping part of the input images and then flipping their predictions back again before back-propagating the gradients.
Specifically, the training method, the training device, and the electronic device of the monocular depth model provided by the present application first obtain a plurality of binocular images for training the monocular depth model, then randomly select at least one monocular image from the plurality of binocular images, compute, for each monocular image other than the at least one monocular image, a corresponding first disparity image and a first mask image corresponding to the first disparity image, and, for each monocular image of the at least one monocular image, compute the disparity image of the flipped monocular image and flip it back again as a second disparity image and compute the second mask image corresponding to the second disparity image, and finally train the monocular depth model by blocking the backward gradient of the region of the first disparity image occluded by the first mask image and the backward gradient of the region of the second disparity image occluded by the second mask image. In this way, depth blur at object edges can be effectively resolved while the overall prediction accuracy of the model is improved.
Having described the basic principle of the present application, various non-limiting embodiments of the present application will be described in detail below with reference to the accompanying drawings.
Exemplary method
Fig. 1 illustrates a flow chart of the training method of the monocular depth model according to an embodiment of the present application.
As shown in Fig. 1, the training method of the monocular depth model according to the embodiment of the present application comprises: S110, obtaining a plurality of binocular images for training the monocular depth model; S120, randomly selecting at least one monocular image from the plurality of binocular images; S130, computing, for each monocular image in the plurality of binocular images other than the at least one monocular image, a corresponding first disparity image and a first mask image corresponding to the first disparity image; S140, for each monocular image of the at least one monocular image, computing the disparity image of the flipped monocular image and flipping it back again as a second disparity image, and computing a second mask image corresponding to the second disparity image; and S150, training the monocular depth model by blocking the backward gradient of the region of the first disparity image occluded by the first mask image and the backward gradient of the region of the second disparity image occluded by the second mask image.
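For illustration only (this sketch is not part of the original disclosure), steps S110 to S150 could be realized in a PyTorch-style training iteration roughly as follows, where `disp_net`, `make_occlusion_mask`, and `warp_with_disparity` are assumed helper components (sketched further below) rather than elements defined by the patent:

```python
# Hypothetical sketch of one training iteration covering steps S110-S150.
# `disp_net` is an assumed disparity-prediction network; `make_occlusion_mask`
# and `warp_with_disparity` are assumed helpers sketched later in this text.
import torch

def training_step(disp_net, left, right, optimizer, flip_prob=0.5):
    # S110: `left` and `right` are batches of binocular images, shape (B, 3, H, W).
    # S120: randomly choose which samples in the batch will be flipped.
    flip = torch.rand(left.shape[0], device=left.device) < flip_prob

    total_loss = 0.0
    for img, other in ((left, right), (right, left)):
        # S130: disparity for the non-flipped samples.
        disp = disp_net(img)
        # S140: for the selected samples, predict on the flipped image,
        # then flip the prediction back (the "second disparity image").
        disp_flipped_back = torch.flip(disp_net(torch.flip(img, dims=[3])), dims=[3])
        disp = torch.where(flip.view(-1, 1, 1, 1), disp_flipped_back, disp)
        mask = make_occlusion_mask(disp)         # first / second mask image
        pred = warp_with_disparity(other, disp)  # synthesize the predicted image
        # S150: the mask zeroes the loss, and hence the backward gradient,
        # in the occluded regions.
        total_loss = total_loss + (mask * (pred - img).abs()).mean()

    optimizer.zero_grad()
    total_loss.backward()
    optimizer.step()
    return total_loss.item()
```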
In step S110, a plurality of binocular images for training the monocular depth model are obtained. That is, in the training method of the monocular depth model according to the embodiment of the present application, the monocular depth model is trained with an unsupervised training scheme based on binocular images.
Here, each binocular image comprises a left-eye image and a right-eye image, each serving as a monocular image. During model training, a left disparity image is generated based on the left-eye image and used to synthesize the right-eye image corresponding to that left-eye image. Likewise, a right disparity image is generated based on the right-eye image and used to synthesize the left-eye image corresponding to that right-eye image.
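As an illustration of how one monocular image can be synthesized from the other using a disparity image (an implementation assumption; the patent does not prescribe a particular warping routine), a minimal PyTorch sketch might look like this:

```python
import torch
import torch.nn.functional as F

def warp_with_disparity(src, disp):
    """Resample `src` (B, C, H, W) at horizontally shifted positions given by
    `disp` (B, 1, H, W), with disparity expressed in pixels. This is a common
    building block of unsupervised stereo training, shown here only as a sketch."""
    b, _, h, w = src.shape
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, h, device=src.device),
        torch.linspace(-1, 1, w, device=src.device),
        indexing="ij",
    )
    # Shift x coordinates by the disparity, converted to normalized [-1, 1] units.
    x_shifted = xs.unsqueeze(0) + 2.0 * disp.squeeze(1) / max(w - 1, 1)
    grid = torch.stack((x_shifted, ys.unsqueeze(0).expand_as(x_shifted)), dim=3)
    return F.grid_sample(src, grid, padding_mode="border", align_corners=True)
```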
In step S120, at least one monocular image in the plurality of binocular images is randomly selected. As described above, in the training method of the monocular depth model according to the embodiment of the present application, a randomly selected part of the input images is flipped, the disparity predicted for each flipped image is then flipped back, and image synthesis and back-propagation of the gradients are carried out on that basis.
Moreover, in the embodiment of the present application, the selected images need not be restricted to selecting the left-eye image and the right-eye image of a binocular image at the same time. That is, only the left-eye images of some binocular images may be selected, or only the right-eye images of some binocular images, or the left-eye images of one part of the binocular images together with the right-eye images of another part, and so on. Of course, in the embodiment of the present application, at least one of the plurality of binocular images may also be randomly selected, and both the left-eye image and the right-eye image of each selected binocular image may be used as images to be flipped.
That is, in the training method of the monocular depth model according to the embodiment of the present application, randomly selecting at least one monocular image from the plurality of binocular images comprises: randomly selecting at least one binocular image from the plurality of binocular images to obtain the left-eye image and the right-eye image of the at least one binocular image, and using the left-eye image and the right-eye image as the at least one monocular image.
In this way, because the left-eye image and the right-eye image of a binocular image are flipped together, one of them can be processed in the same way as the other, which reduces computational complexity. Moreover, because the flipped images include both left-eye and right-eye images, the diversity of the training samples after flipping is increased, which can further improve the prediction accuracy of the model.
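Purely as an illustration of the selection strategies described above (the helper below is hypothetical and not taken from the patent), the choice of which monocular images to flip could be written as:

```python
import random

def select_images_to_flip(pairs, per_pair=True, p=0.5):
    """`pairs` is a list of (left_image, right_image) tuples. Returns a set of
    (pair_index, side) keys marking the monocular images to be flipped."""
    selected = set()
    for i, _ in enumerate(pairs):
        if per_pair:
            # Select whole binocular pairs: both eyes are flipped together.
            if random.random() < p:
                selected.update({(i, "left"), (i, "right")})
        else:
            # Select left-eye and right-eye images independently.
            for side in ("left", "right"):
                if random.random() < p:
                    selected.add((i, side))
    return selected
```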
In step S130, for each monocular image in the plurality of binocular images other than the at least one monocular image, the corresponding first disparity image and the first mask image corresponding to the first disparity image are computed. In the following, the process of generating disparity images and mask images for the non-flipped and flipped images in the training method of the monocular depth model according to the embodiment of the present application is described with reference to Fig. 2. Fig. 2 illustrates a schematic diagram of the process of generating disparity images and mask images according to an embodiment of the present application.
As shown in the left half of Fig. 2, for an input image that does not need to be flipped, for example the input left-eye image shown in Fig. 2, a left disparity image corresponding to the left-eye image is generated, and then a mask image corresponding to the left disparity image is generated. Although Fig. 2 shows the operations performed on an input left-eye image, the operations performed on an input right-eye image are identical. Therefore, in the embodiment of the present application, the first disparity image refers to the disparity image generated for an input image that is not flipped, and the first mask image refers to the mask image generated for an input image that is not flipped; each includes both the disparity images and mask images for left-eye images and those for right-eye images.
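The patent does not fix a particular construction for the mask image, so the following is only one plausible, hypothetical way to derive an occlusion mask from a disparity image: scatter each source column to its disparity-shifted target column and mark columns that receive no contribution as occluded.

```python
import torch

def make_occlusion_mask(disp):
    """Return a mask that is 1 where the synthesized view is expected to be
    visible and 0 where it is occluded. `disp` has shape (B, 1, H, W) with
    disparity in pixels. This construction is an assumption for illustration,
    not the construction prescribed by the patent."""
    b, _, h, w = disp.shape
    xs = torch.arange(w, device=disp.device).view(1, 1, 1, w).expand(b, 1, h, w)
    target = (xs + disp).round().long().clamp(0, w - 1)
    mask = torch.zeros_like(disp)
    # Every target column hit by at least one source pixel is marked visible.
    mask.scatter_(3, target, 1.0)
    return mask.detach()  # the mask itself carries no gradient
```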
In step S140, for each monocular image of the at least one monocular image, the disparity image of the flipped monocular image is computed and flipped back again as the second disparity image, and the second mask image corresponding to the second disparity image is computed.
Again with reference to Fig. 2, as shown in the right half of Fig. 2, an input image, for example the input left-eye image shown in the left half of Fig. 2, is first flipped to obtain a flipped input image; the disparity image of that flipped input image, that is, the disparity image of the flipped image shown in Fig. 2, is then computed; this disparity image is flipped again to obtain the flipped-back disparity image shown in Fig. 2. Finally, a corresponding mask image is generated for the flipped-back disparity image.
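The flip, predict, flip-back path in the right half of Fig. 2 amounts to two horizontal flips around a disparity prediction. A minimal sketch (with `disp_net` an assumed disparity network) is:

```python
import torch

def predict_disparity_with_flip(disp_net, image):
    """Second-disparity-image path of Fig. 2: flip the input horizontally,
    predict the disparity of the flipped image, then flip that prediction back
    so it is aligned with the original image. Depending on the disparity sign
    convention used by `disp_net`, an additional sign change may be needed."""
    flipped = torch.flip(image, dims=[3])                 # flip along the width axis
    disp_of_flipped = disp_net(flipped)                   # disparity of the flipped image
    second_disp = torch.flip(disp_of_flipped, dims=[3])   # flip back
    return second_disp
```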
Therefore, in the embodiment of the present application, the second disparity image refers to the disparity image generated for a selected input image that is flipped, and the second mask image refers to the mask image generated for a selected input image that is flipped. As described above, they may include only the disparity images and mask images for left-eye images, only those for right-eye images, or those for both the left-eye image and the right-eye image of a binocular image.
That is, in the embodiment of the present application, the input images are partitioned with each monocular image of an input binocular image as the unit: for one part, the disparity image and the corresponding mask image are computed directly, namely the first disparity image and the first mask image described above; for the other part, the disparity image and the corresponding mask image are computed after flipping, namely the second disparity image and the second mask image described above.
Finally, in step S150, the monocular depth model is trained by blocking the backward gradient of the region of the first disparity image occluded by the first mask image and the backward gradient of the region of the second disparity image occluded by the second mask image.
In this way, by computing the occlusion masks and blocking the backward gradients of the occluded regions, the object regions in the image are emphasized through the mask images while the non-object regions are suppressed, so that depth blur at object edges can be effectively resolved.
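One way to block the backward gradient of the occluded regions in practice (an implementation assumption, not the only possibility) is to multiply the per-pixel difference by a mask that itself carries no gradient, so occluded pixels contribute nothing to back-propagation:

```python
import torch

def masked_photometric_loss(pred, target, mask):
    """Absolute difference between the predicted and real image, with occluded
    pixels (mask == 0) excluded so that no gradient flows back through them.
    The L1 difference is an assumption; the patent only requires some
    difference function between the predicted and real images."""
    mask = mask.detach()                       # do not train through the mask
    per_pixel = (pred - target).abs() * mask   # zero loss => zero gradient where occluded
    # Normalize by the number of visible pixels to keep the scale stable.
    return per_pixel.sum() / mask.sum().clamp(min=1.0)
```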
Specifically, network models of different structures may be used in the training method of the monocular depth model according to the embodiment of the present application. Fig. 3 illustrates a schematic diagram of a first example of the network structure according to an embodiment of the present application. As shown in Fig. 3, for an input left-eye image I_l and right-eye image I_r, the corresponding left disparity image d_l and right disparity image d_r are computed. Here, those skilled in the art will appreciate that the disparity image d_l corresponding to the left-eye image I_l may be the first disparity image corresponding to a non-flipped input image as described above, or the second disparity image corresponding to a flipped input image as described above; likewise, the disparity image d_r corresponding to the right-eye image I_r may also be the first disparity image or the second disparity image as described above.
Next, the left disparity image d_l is synthesized with the corresponding right-eye image I_r, and the right disparity image d_r is synthesized with the corresponding left-eye image I_l, to generate the predicted images Î_l and Î_r. Then, a difference function between the predicted images Î_l and Î_r and the real images I_l and I_r is computed, and the monocular depth model is trained with the difference function as at least a part of the loss function. Moreover, as described above, during training the predicted images are occluded with the mask images, and the backward gradients of the occluded regions are blocked. Here, the difference function may be the image difference between the predicted images Î_l, Î_r and the real images I_l, I_r, or the sum of squares of the image differences, and so on.
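For concreteness, writing M for the mask image, I for the real image, Î for the predicted image, and p for a pixel, the two difference functions mentioned above can be expressed as follows (illustrative notation only; multiplying by M is what blocks the loss, and hence the backward gradient, in the occluded regions):

```latex
L_{\mathrm{abs}}(\hat{I}, I) = \sum_{p} M(p)\,\bigl|\hat{I}(p) - I(p)\bigr|
\qquad\text{or}\qquad
L_{\mathrm{sq}}(\hat{I}, I) = \sum_{p} M(p)\,\bigl(\hat{I}(p) - I(p)\bigr)^{2}
```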
Here, the network structure shown in Fig. 3 computes disparity images and synthesizes predicted images for both the left-eye image and the right-eye image at the same time, which can improve the prediction accuracy of the model.
Therefore, in the training method of the monocular depth model according to the embodiment of the present application, training the monocular depth model by blocking the backward gradient of the region of the first disparity image occluded by the first mask image and the backward gradient of the region of the second disparity image occluded by the second mask image comprises: synthesizing each disparity image in the first disparity image and the second disparity image with the monocular image corresponding to it to obtain a predicted image; computing a difference function between the predicted image and the real image; and training the monocular depth model with the difference function as at least a part of the loss function, wherein during training the backward gradient of the region of the predicted image occluded by the mask image is blocked.
As another example of the network structure, Fig. 4 illustrates a schematic diagram of a second example of the network structure according to an embodiment of the present application. As shown in Fig. 4, the network structure of this example may be trained with only one of the left-eye image I_l and the right-eye image I_r. For example, for the left-eye image I_l, its left disparity image d_l is first computed and then synthesized with the corresponding right-eye image I_r to obtain the predicted image Î_l. Next, the difference function between the predicted image Î_l and the real image I_l is computed, and the monocular depth model is trained with the difference function as at least a part of the loss function. Likewise, during training the predicted image is occluded with the mask image, and the backward gradient of the occluded region is blocked.
In addition, those skilled in the art will appreciate that the network structure shown in Fig. 4 can equally be applied to the right-eye image I_r. That is, for the right-eye image I_r, its right disparity image d_r is first computed and then synthesized with the corresponding left-eye image I_l to obtain the predicted image Î_r. Next, the difference function between the predicted image Î_r and the real image I_r is computed, and the monocular depth model is trained with the difference function as at least a part of the loss function. Likewise, during training the predicted image is occluded with the mask image, and the backward gradient of the occluded region is blocked. Here, the difference function may be the image difference between the predicted image and the real image, or the sum of squares of the image differences, and so on.
Here, the network structure shown in Fig. 4 computes a disparity image and synthesizes a predicted image for only one of the left-eye image and the right-eye image, so that the computation is relatively simple; it is also compatible with some existing network structures.
That is, in the training method of the monocular depth model according to the embodiment of the present application, training the monocular depth model by blocking the backward gradient of the region of the first disparity image occluded by the first mask image and the backward gradient of the region of the second disparity image occluded by the second mask image comprises: synthesizing each left disparity image or each right disparity image, in the first disparity image and the second disparity image, that corresponds to one of the left-eye image and the right-eye image with the corresponding right-eye image or left-eye image to obtain a predicted image; computing a difference function between the predicted image and the real image; and training the monocular depth model with the difference function as at least a part of the loss function, wherein during training the backward gradient of the region of the predicted image occluded by the mask image is blocked.
Fig. 5 illustrates example results of the training method of the monocular depth model according to an embodiment of the present application. In Fig. 5, (a) shows the left-eye image I_l, (b) shows the right-eye image I_r, (c) shows the disparity image d_l aligned with the left-eye image, (d) shows the reconstructed left-eye predicted image Î_l, (e) shows the mask image corresponding to the disparity image d_l, and (f) shows the reconstructed left-eye image after masking with the mask image. As can be seen from (d), the reconstructed left-eye predicted image contains obvious repetitions and artifacts. By using the mask image (e) generated from the disparity image (c) to block the back-propagation of those repetitions and artifacts, the white regions are occluded in the final result (f).
Exemplary device
Fig. 6 illustrates a block diagram of the training device of the monocular depth model according to an embodiment of the present application.
As shown in Fig. 6, the training device 200 of the monocular depth model according to an embodiment of the present application comprises: an image obtaining unit 210 for obtaining a plurality of binocular images for training the monocular depth model; an image selection unit 220 for randomly selecting at least one monocular image from the plurality of binocular images; a first computing unit 230 for computing, for each monocular image in the plurality of binocular images other than the at least one monocular image, a corresponding first disparity image and a first mask image corresponding to the first disparity image; a second computing unit 240 for computing, for each monocular image of the at least one monocular image, the disparity image of the flipped monocular image and flipping it back again as a second disparity image, and computing a second mask image corresponding to the second disparity image; and a model training unit 250 for training the monocular depth model by blocking the backward gradient of the region of the first disparity image occluded by the first mask image and the backward gradient of the region of the second disparity image occluded by the second mask image.
In one example, in the training device 200 of the monocular depth model described above, the binocular image comprises a left-eye image and a right-eye image each serving as a monocular image; the disparity image corresponding to the left-eye image is a left disparity image; and the disparity image corresponding to the right-eye image is a right disparity image.
In one example, in the training device 200 of the monocular depth model described above, the image selection unit 220 is configured to: randomly select at least one binocular image from the plurality of binocular images to obtain both the left-eye image and the right-eye image of the at least one binocular image as the at least one monocular image.
In one example, in the training device 200 of the monocular depth model described above, the model training unit 250 is configured to: synthesize each disparity image in the first disparity image and the second disparity image with the monocular image corresponding to it to obtain a predicted image; compute a difference function between the predicted image and the real image; and train the monocular depth model with the difference function as at least a part of the loss function, wherein during training the backward gradient of the region of the predicted image occluded by the mask image is blocked.
In one example, in the training device 200 of the monocular depth model described above, the model training unit 250 is configured to: synthesize each left disparity image or each right disparity image, in the first disparity image and the second disparity image, that corresponds to one of the left-eye image and the right-eye image with the corresponding right-eye image or left-eye image to obtain a predicted image; compute a difference function between the predicted image and the real image; and train the monocular depth model with the difference function as at least a part of the loss function, wherein during training the backward gradient of the region of the predicted image occluded by the mask image is blocked.
Here, those skilled in the art will appreciate that the specific functions and operations of the units and modules in the training device 200 of the monocular depth model described above have already been described in detail in the training method of the monocular depth model described with reference to Fig. 1 to Fig. 5, and repeated description thereof is therefore omitted.
As described above, the training device 200 of the monocular depth model according to the embodiment of the present application may be implemented in various terminal devices, for example a server running the monocular depth model. In one example, the device 200 according to the embodiment of the present application may be integrated into a terminal device as a software module and/or a hardware module. For example, the device 200 may be a software module in the operating system of the terminal device, or may be an application developed for the terminal device; of course, the device 200 may equally be one of the many hardware modules of the terminal device.
Alternatively, in another example, the training device 200 of the monocular depth model and the terminal device may also be separate devices, and the device 200 may be connected to the terminal device through a wired and/or wireless network and transmit interactive information in an agreed data format.
Exemplary electronic device
Hereinafter, an electronic device according to an embodiment of the present application is described with reference to Fig. 7.
Fig. 7 illustrates a block diagram of the electronic device according to an embodiment of the present application.
As shown in Fig. 7, the electronic device 10 comprises one or more processors 11 and a memory 12.
The processor 11 may be a central processing unit (CPU) or another form of processing unit having data processing capability and/or instruction execution capability, and may control other components in the electronic device 10 to perform desired functions.
The memory 12 may include one or more computer program products, and the computer program product may include computer-readable storage media of various forms, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or a cache. The non-volatile memory may include, for example, read-only memory (ROM), a hard disk, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 11 may run the program instructions to implement the training method of the monocular depth model of the embodiments of the present application described above and/or other desired functions. Various contents such as input binocular images, disparity images, and mask images may also be stored on the computer-readable storage medium.
In one example, the electronic device 10 may further comprise an input device 13 and an output device 14, and these components are interconnected through a bus system and/or other forms of connection mechanisms (not shown).
For example, the input device 13 may include a binocular camera for capturing binocular images. In addition, the input device 13 may also include, for example, a keyboard, a mouse, and the like.
The output device 14 may output various information to the outside and may include, for example, a display, a loudspeaker, a printer, and a communication network and the remote output devices connected thereto.
Of course, for simplicity, only some of the components of the electronic device 10 related to the present application are shown in Fig. 7, and components such as a bus and input/output interfaces are omitted. In addition, depending on the specific application, the electronic device 10 may also include any other appropriate components.
Exemplary computer program product and computer-readable storage medium
In addition to the method and device described above, an embodiment of the present application may also be a computer program product comprising computer program instructions which, when run by a processor, cause the processor to perform the steps of the training method of the monocular depth model according to the various embodiments of the present application described in the "Exemplary method" section of this specification.
The computer program product may be written, in any combination of one or more programming languages, with program code for carrying out the operations of the embodiments of the present application; the programming languages include object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
Furthermore, an embodiment of the present application may also be a computer-readable storage medium on which computer program instructions are stored, which, when run by a processor, cause the processor to perform the steps of the training method of the monocular depth model according to the various embodiments of the present application described in the "Exemplary method" section of this specification.
The computer-readable storage medium may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may include, but is not limited to, an electric, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any appropriate combination of the above.
The basic principles of the present application have been described above in conjunction with specific embodiments. However, it should be noted that the merits, advantages, effects, and the like mentioned in the present application are merely examples rather than limitations, and these merits, advantages, and effects must not be regarded as necessary for every embodiment of the present application. In addition, the specific details disclosed above are provided only for the purpose of illustration and ease of understanding rather than limitation; the above details do not limit the present application to being implemented with those specific details.
The block diagrams of the devices, apparatuses, equipment, and systems involved in the present application are only illustrative examples and are not intended to require or imply that they must be connected, arranged, or configured in the manner shown in the block diagrams. As those skilled in the art will appreciate, these devices, apparatuses, equipment, and systems may be connected, arranged, or configured in any manner. Words such as "include", "comprise", and "have" are open-ended terms that mean "including but not limited to" and may be used interchangeably with it. The words "or" and "and" used herein refer to "and/or" and may be used interchangeably with it, unless the context clearly indicates otherwise. The words "such as" used herein refer to the phrase "such as, but not limited to" and may be used interchangeably with it.
It should also be noted that, in the devices, apparatuses, and methods of the present application, each component or each step may be decomposed and/or recombined. Such decompositions and/or recombinations should be regarded as equivalent solutions of the present application.
The above description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present application. Various modifications to these aspects will be readily apparent to those skilled in the art, and the general principles defined herein may be applied to other aspects without departing from the scope of the present application. Therefore, the present application is not intended to be limited to the aspects shown herein but accords with the widest scope consistent with the principles and novel features disclosed herein.
The foregoing description has been presented for the purposes of illustration and description. Furthermore, this description is not intended to limit the embodiments of the present application to the forms disclosed herein. Although a number of example aspects and embodiments have been discussed above, those skilled in the art will recognize certain variations, modifications, changes, additions, and sub-combinations thereof.

Claims (12)

1. A training method of a monocular depth model, comprising:
obtaining a plurality of binocular images for training the monocular depth model;
randomly selecting at least one monocular image from the plurality of binocular images;
computing, for each monocular image in the plurality of binocular images other than the at least one monocular image, a corresponding first disparity image and a first mask image corresponding to the first disparity image;
for each monocular image of the at least one monocular image, computing the disparity image of the flipped monocular image and flipping it back again as a second disparity image, and computing a second mask image corresponding to the second disparity image; and
training the monocular depth model by blocking the backward gradient of the region of the first disparity image occluded by the first mask image and the backward gradient of the region of the second disparity image occluded by the second mask image.
2. The training method of the monocular depth model of claim 1, wherein
the binocular image comprises a left-eye image and a right-eye image each serving as a monocular image;
the disparity image corresponding to the left-eye image is a left disparity image; and
the disparity image corresponding to the right-eye image is a right disparity image.
3. The training method of the monocular depth model of claim 2, wherein randomly selecting at least one monocular image from the plurality of binocular images comprises:
randomly selecting at least one binocular image from the plurality of binocular images to obtain the left-eye image and the right-eye image of the at least one binocular image, and using the left-eye image and the right-eye image as the at least one monocular image.
4. The training method of the monocular depth model of claim 1, wherein training the monocular depth model by blocking the backward gradient of the region of the first disparity image occluded by the first mask image and the backward gradient of the region of the second disparity image occluded by the second mask image comprises:
synthesizing each disparity image in the first disparity image and the second disparity image with the monocular image corresponding to it to obtain a predicted image;
computing a difference function between the predicted image and the real image; and
training the monocular depth model with the difference function as at least a part of the loss function, wherein during training the backward gradient of the region of the predicted image occluded by the mask image is blocked.
5. The training method of the monocular depth model of claim 2, wherein training the monocular depth model by blocking the backward gradient of the region of the first disparity image occluded by the first mask image and the backward gradient of the region of the second disparity image occluded by the second mask image comprises:
synthesizing each left disparity image or each right disparity image, in the first disparity image and the second disparity image, that corresponds to one of the left-eye image and the right-eye image with the corresponding right-eye image or left-eye image to obtain a predicted image;
computing a difference function between the predicted image and the real image; and
training the monocular depth model with the difference function as at least a part of the loss function, wherein during training the backward gradient of the region of the predicted image occluded by the mask image is blocked.
6. A training device of a monocular depth model, comprising:
an image obtaining unit for obtaining a plurality of binocular images for training the monocular depth model;
an image selection unit for randomly selecting at least one monocular image from the plurality of binocular images;
a first computing unit for computing, for each monocular image in the plurality of binocular images other than the at least one monocular image, a corresponding first disparity image and a first mask image corresponding to the first disparity image;
a second computing unit for computing, for each monocular image of the at least one monocular image, the disparity image of the flipped monocular image and flipping it back again as a second disparity image, and computing a second mask image corresponding to the second disparity image; and
a model training unit for training the monocular depth model by blocking the backward gradient of the region of the first disparity image occluded by the first mask image and the backward gradient of the region of the second disparity image occluded by the second mask image.
7. The training device of the monocular depth model of claim 6, wherein
the binocular image comprises a left-eye image and a right-eye image each serving as a monocular image;
the disparity image corresponding to the left-eye image is a left disparity image; and
the disparity image corresponding to the right-eye image is a right disparity image.
8. The training device of the monocular depth model of claim 7, wherein the image selection unit is configured to:
randomly select at least one binocular image from the plurality of binocular images to obtain both the left-eye image and the right-eye image of the at least one binocular image as the at least one monocular image.
9. The training device of the monocular depth model of claim 6, wherein the model training unit is configured to:
synthesize each disparity image in the first disparity image and the second disparity image with the monocular image corresponding to it to obtain a predicted image;
compute a difference function between the predicted image and the real image; and
train the monocular depth model with the difference function as at least a part of the loss function, wherein during training the backward gradient of the region of the predicted image occluded by the mask image is blocked.
10. The training device of the monocular depth model of claim 6, wherein the model training unit is configured to:
synthesize each left disparity image or each right disparity image, in the first disparity image and the second disparity image, that corresponds to one of the left-eye image and the right-eye image with the corresponding right-eye image or left-eye image to obtain a predicted image;
compute a difference function between the predicted image and the real image; and
train the monocular depth model with the difference function as at least a part of the loss function, wherein during training the backward gradient of the region of the predicted image occluded by the mask image is blocked.
11. An electronic device, comprising:
a processor; and
a memory in which computer program instructions are stored, the computer program instructions, when run by the processor, causing the processor to perform the training method of the monocular depth model of any one of claims 1 to 5.
12. A computer-readable medium on which computer program instructions are stored, the computer program instructions, when run by a processor, causing the processor to perform the training method of the monocular depth model of any one of claims 1 to 5.
CN201811106152.4A 2018-09-21 2018-09-21 Monocular depth model training method and device and electronic equipment Active CN109087346B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811106152.4A CN109087346B (en) 2018-09-21 2018-09-21 Monocular depth model training method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811106152.4A CN109087346B (en) 2018-09-21 2018-09-21 Monocular depth model training method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN109087346A true CN109087346A (en) 2018-12-25
CN109087346B CN109087346B (en) 2020-08-11

Family

ID=64842277

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811106152.4A Active CN109087346B (en) 2018-09-21 2018-09-21 Monocular depth model training method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN109087346B (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110070056A (en) * 2019-04-25 2019-07-30 腾讯科技(深圳)有限公司 Image processing method, device, storage medium and equipment
CN111105451A (en) * 2019-10-31 2020-05-05 武汉大学 A Binocular Depth Estimation Method for Driving Scenes Overcoming Occlusion Effect
CN111178547A (en) * 2020-04-10 2020-05-19 支付宝(杭州)信息技术有限公司 Method and system for model training based on private data
CN111292425A (en) * 2020-01-21 2020-06-16 武汉大学 A View Synthesis Method Based on Monocular Hybrid Dataset
CN111476834A (en) * 2019-01-24 2020-07-31 北京地平线机器人技术研发有限公司 Method and device for generating image and electronic equipment
CN111508010A (en) * 2019-01-31 2020-08-07 北京地平线机器人技术研发有限公司 Method and device for depth estimation of two-dimensional image and electronic equipment
CN111583152A (en) * 2020-05-11 2020-08-25 福建帝视信息科技有限公司 Image artifact detection and automatic removal method based on U-net structure
CN111696145A (en) * 2019-03-11 2020-09-22 北京地平线机器人技术研发有限公司 Depth information determination method, depth information determination device and electronic equipment
CN112149458A (en) * 2019-06-27 2020-12-29 商汤集团有限公司 Obstacle detection method, intelligent driving control method, device, medium and equipment
CN112634147A (en) * 2020-12-09 2021-04-09 上海健康医学院 PET image noise reduction method, system, device and medium for self-supervision learning
CN113128601A (en) * 2021-04-22 2021-07-16 北京百度网讯科技有限公司 Training method of classification model and method for classifying images
CN113538258A (en) * 2021-06-15 2021-10-22 福州大学 Image deblurring model and method based on mask

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103413298A (en) * 2013-07-17 2013-11-27 宁波大学 Three-dimensional image objective evaluation method based on visual characteristics
CN105374039A (en) * 2015-11-16 2016-03-02 辽宁大学 Monocular image depth information estimation method based on contour acuity
EP2747427B1 (en) * 2012-12-21 2016-03-16 imcube labs GmbH Method, apparatus and computer program usable in synthesizing a stereoscopic image
CN107578436A (en) * 2017-08-02 2018-01-12 南京邮电大学 A Depth Estimation Method for Monocular Image Based on Fully Convolutional Neural Network FCN
CN107945265A (en) * 2017-11-29 2018-04-20 华中科技大学 Real-time dense monocular SLAM method and systems based on on-line study depth prediction network
CN108269278A (en) * 2016-12-30 2018-07-10 杭州海康威视数字技术股份有限公司 A kind of method and device of scene modeling

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2747427B1 (en) * 2012-12-21 2016-03-16 imcube labs GmbH Method, apparatus and computer program usable in synthesizing a stereoscopic image
CN103413298A (en) * 2013-07-17 2013-11-27 宁波大学 Three-dimensional image objective evaluation method based on visual characteristics
CN105374039A (en) * 2015-11-16 2016-03-02 辽宁大学 Monocular image depth information estimation method based on contour acuity
CN108269278A (en) * 2016-12-30 2018-07-10 杭州海康威视数字技术股份有限公司 A kind of method and device of scene modeling
CN107578436A (en) * 2017-08-02 2018-01-12 南京邮电大学 A Depth Estimation Method for Monocular Image Based on Fully Convolutional Neural Network FCN
CN107945265A (en) * 2017-11-29 2018-04-20 华中科技大学 Real-time dense monocular SLAM method and systems based on on-line study depth prediction network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
林义闽 et al.: "A robot vision system for three-dimensional reconstruction of weakly textured scenes", CNKI *

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111476834B (en) * 2019-01-24 2023-08-11 北京地平线机器人技术研发有限公司 Method and device for generating image and electronic equipment
CN111476834A (en) * 2019-01-24 2020-07-31 北京地平线机器人技术研发有限公司 Method and device for generating image and electronic equipment
CN111508010B (en) * 2019-01-31 2023-08-08 北京地平线机器人技术研发有限公司 Method and device for estimating depth of two-dimensional image and electronic equipment
CN111508010A (en) * 2019-01-31 2020-08-07 北京地平线机器人技术研发有限公司 Method and device for depth estimation of two-dimensional image and electronic equipment
CN111696145B (en) * 2019-03-11 2023-11-03 北京地平线机器人技术研发有限公司 Depth information determining method, depth information determining device and electronic equipment
CN111696145A (en) * 2019-03-11 2020-09-22 北京地平线机器人技术研发有限公司 Depth information determination method, depth information determination device and electronic equipment
CN110070056B (en) * 2019-04-25 2023-01-10 腾讯科技(深圳)有限公司 Image processing method, device, storage medium and equipment
CN110070056A (en) * 2019-04-25 2019-07-30 腾讯科技(深圳)有限公司 Image processing method, device, storage medium and equipment
CN112149458B (en) * 2019-06-27 2025-02-25 商汤集团有限公司 Obstacle detection method, intelligent driving control method, device, medium and equipment
CN112149458A (en) * 2019-06-27 2020-12-29 商汤集团有限公司 Obstacle detection method, intelligent driving control method, device, medium and equipment
CN111105451B (en) * 2019-10-31 2022-08-05 武汉大学 Driving scene binocular depth estimation method for overcoming occlusion effect
CN111105451A (en) * 2019-10-31 2020-05-05 武汉大学 A Binocular Depth Estimation Method for Driving Scenes Overcoming Occlusion Effect
CN111292425A (en) * 2020-01-21 2020-06-16 武汉大学 A View Synthesis Method Based on Monocular Hybrid Dataset
CN111292425B (en) * 2020-01-21 2022-02-01 武汉大学 View synthesis method based on monocular and binocular mixed data set
CN111178547A (en) * 2020-04-10 2020-05-19 支付宝(杭州)信息技术有限公司 Method and system for model training based on private data
CN111583152B (en) * 2020-05-11 2023-07-07 福建帝视科技集团有限公司 Image artifact detection and automatic removal method based on U-net structure
CN111583152A (en) * 2020-05-11 2020-08-25 福建帝视信息科技有限公司 Image artifact detection and automatic removal method based on U-net structure
CN112634147A (en) * 2020-12-09 2021-04-09 上海健康医学院 PET image noise reduction method, system, device and medium for self-supervision learning
CN112634147B (en) * 2020-12-09 2024-03-29 上海健康医学院 PET image noise reduction method, system, device and medium for self-supervision learning
CN113128601A (en) * 2021-04-22 2021-07-16 北京百度网讯科技有限公司 Training method of classification model and method for classifying images
CN113538258A (en) * 2021-06-15 2021-10-22 福州大学 Image deblurring model and method based on mask
CN113538258B (en) * 2021-06-15 2023-10-13 福州大学 Mask-based image deblurring model and method

Also Published As

Publication number Publication date
CN109087346B (en) 2020-08-11

Similar Documents

Publication Publication Date Title
CN109087346A (en) Training method, training device and the electronic equipment of monocular depth model
Xiao et al. Deepfocus: Learned image synthesis for computational display
US8538138B2 (en) Global registration of multiple 3D point sets via optimization on a manifold
JP2022524891A (en) Image processing methods and equipment, electronic devices and computer programs
CN102835119B (en) Support the multi-core processor that the real-time 3D rendering on automatic stereoscopic display device is played up
EP3448032B1 (en) Enhancing motion pictures with accurate motion information
CN118053090A (en) Generating video using potential diffusion models
KR20140096532A (en) Apparatus and method for generating digital hologram
US20140354633A1 (en) Image processing method and image processing device
CN105530502B (en) According to the method and apparatus for the picture frame generation disparity map that stereoscopic camera is shot
Kellnhofer et al. Optimizing disparity for motion in depth
Lee et al. Automatic 2d-to-3d conversion using multi-scale deep neural network
US11532122B2 (en) Method and apparatus for processing holographic image
CN106169179A (en) Image denoising method and image noise reduction apparatus
CN113132706A (en) Controllable position virtual viewpoint generation method and device based on reverse mapping
CN116977167A (en) Video processing method and device, electronic equipment and storage medium
Mori et al. Exemplar-based inpainting for 6dof virtual reality photos
EP3882852A1 (en) Training alignment of a plurality of images
Lüke et al. Near Real‐Time Estimation of Super‐resolved Depth and All‐In‐Focus Images from a Plenoptic Camera Using Graphics Processing Units
Thatte Cinematic virtual reality with head-motion parallax
Shi et al. ImmersePro: End-to-End Stereo Video Synthesis Via Implicit Disparity Learning
Zhang et al. Efficient variational light field view synthesis for making stereoscopic 3D images
CN108197248A (en) A kind of method, apparatus and system of 3Dization 2D web displayings
KR101784208B1 (en) System and method for displaying three-dimension image using multiple depth camera
CN119273591A (en) Three-dimensional image generation method, device, storage medium and program product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant