WO2021102948A1 - Image processing method and device - Google Patents
Image processing method and device
- Publication number
- WO2021102948A1 (PCT/CN2019/122090)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- dimensional
- target image
- image area
- dimensional space
- area
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T15/00—3D [Three Dimensional] image rendering
- G06T15/005—General purpose rendering architectures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/40—Analysis of texture
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/50—Depth or shape recovery
- G06T7/55—Depth or shape recovery from multiple images
- G06T7/593—Depth or shape recovery from multiple images from stereo images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10028—Range image; Depth image; 3D point clouds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
Definitions
- This application relates to the field of image processing technology, and specifically to an image processing method and device.
- 3D reconstruction technology has been widely used in various fields.
- multiple images can be collected, and the depth maps corresponding to these images can be determined, and then a three-dimensional point cloud can be obtained from the multiple depth maps to construct a three-dimensional model.
- Feature points can be extracted from one image and their matching points found in another image; the depth of each pixel in the image is then determined from these feature points and matching points to obtain a depth map. For objects with smooth surfaces and weak textures, feature points cannot be extracted, so the depth of the object cannot be obtained. The depth information corresponding to these objects is therefore missing from the final depth map, which leaves holes in the finally constructed three-dimensional point cloud, so the missing depth information in the depth map needs to be filled in.
- The effect of filling the missing depth information in the depth map is not ideal, which leads to layering of the points corresponding to the same object in three-dimensional space in the finally reconstructed three-dimensional point cloud. The method of filling the depth information of the depth map therefore needs to be improved to obtain a uniform and complete point cloud.
- this application provides an image processing method and device.
- an image processing method including:
- each depth map includes one or more target image regions, and the target image regions meet a preset type condition
- the depth value of the target image area corresponding to each three-dimensional point set is filled.
- an image processing apparatus including a processor, a memory, and a computer program stored on the memory; the processor implements the following steps when executing the computer program:
- each depth map includes one or more target image regions, and the target image regions meet a preset type condition
- the depth value of the target image area corresponding to each three-dimensional point set is filled.
- Fig. 1 is a flowchart of an image processing method provided by an embodiment of the present invention.
- Fig. 2 is a schematic diagram of a semantic image provided by an embodiment of the present invention.
- Fig. 3 is a schematic diagram of a depth map or semantic image including a target area provided by an embodiment of the present invention.
- Fig. 4 is a schematic diagram of the intersection of water areas on two images provided by an embodiment of the present invention.
- Fig. 5 is a flowchart of another image processing method provided by an embodiment of the present invention.
- Fig. 6 is a schematic diagram of the point cloud effect obtained after filling the water area of the depth map in the prior art.
- Fig. 7 is a schematic diagram of the point cloud effect obtained after filling the water area of the depth map according to an embodiment of the present invention.
- Fig. 8 is a logical structure block diagram of an image processing device provided by an embodiment of the present invention.
- Three-dimensional reconstruction technology has been widely used in various fields.
- multiple images can be collected, and the depth maps corresponding to these images can be determined, and then a three-dimensional point cloud can be obtained from the multiple depth maps to construct a three-dimensional model.
- The feature points on the surface of an object can be extracted from one image, matching points for these feature points found in another image, and the depth of each pixel of the object then determined from the feature points and matching points to obtain a depth map.
- Feature points are generally corner points, inflection points, or intersections of contour lines on an object, that is, pixels whose values differ strongly from those of surrounding pixels. For objects with smooth surfaces and weak textures, feature points cannot be extracted and the depth of such objects cannot be accurately calculated, so the areas corresponding to these objects in the final depth map lack depth information, which makes holes appear in the final three-dimensional point cloud. For example, for objects such as water and flat glass, the surface is smooth and weakly textured, so feature points cannot be extracted from the image and the depth of these surfaces cannot be calculated from feature points; the areas corresponding to water, glass, and similar objects in the final depth map are therefore missing.
- the regions corresponding to such objects in the depth map can be filled.
- In the related art, the edge pixels of this type of object are usually determined from a single depth map, a depth is calculated from the depths of those edge pixels in that single depth map, and the corresponding area of the object in that depth map is filled with it.
- However, each depth map may contain only part of the object, so the depth calculated from the pixels of a single depth map carries a certain deviation; that is, the depth calculated from each depth map differs even though all of them represent the same object in three-dimensional space.
- When each depth map is filled with a depth calculated separately from the edge pixels of that single depth map, the finally reconstructed three-dimensional point cloud exhibits layering. The method of filling the depth information of the depth map therefore needs to be improved to obtain a uniform and complete point cloud.
- this application provides an image processing method. Specifically, as shown in FIG. 1, the image processing method provided by this application includes the following steps:
- each depth map includes one or more target image regions, and the target image regions meet a preset type condition
- S106 Project the edge pixels of each target image area into a three-dimensional space to determine at least one three-dimensional point set, and the three-dimensional points included in each three-dimensional point set correspond to the same target object in the three-dimensional space;
- S108 Fill in the depth value of the target image area corresponding to each three-dimensional point set according to the space coordinates of the three-dimensional point in each of the three-dimensional point sets.
- the image processing method provided in this application can be used in various electronic devices that perform image processing, such as mobile phones, notebook computers, tablet computers, desktop computers, and so on.
- the image processing method provided in this application may also be executed in a cloud processor.
- the image processing method provided in this application can also be applied to various types of 3D reconstruction software.
- These depth maps can be obtained from multiple RGB images collected by a camera: feature points are extracted from one RGB image, corresponding matching points are found in another RGB image, and from the feature points and their matching points the depth information of each pixel in the RGB image is obtained, yielding the depth map corresponding to that RGB image.
- These depth maps include one or more target image areas, where the target image area is an area that meets a preset type condition, that is, it may be a corresponding image area of a certain type of three-dimensional object.
- these target image areas may be image areas that need to be filled with depth information. For example, they may be image areas corresponding to some three-dimensional objects with a smooth surface and weak texture.
- For example, the target image area may be the image area corresponding to a water area in three-dimensional space, or the image area corresponding to flat glass; of course, it may also be the image area of other three-dimensional objects with similar characteristics, and this application is not limited in this respect.
- the target object in this application is an object in a three-dimensional space corresponding to the target image area, that is, an object in the three-dimensional space with a relatively smooth surface and weak texture.
- the target image area is an image area of a water area, that is, the target object is a water area.
- the target image area is the image area of the flat glass, and the target object is the flat glass.
- The one or more target image regions included in each depth map may be image regions corresponding to the same target object in three-dimensional space, or image regions corresponding to different target objects in three-dimensional space.
- the edge pixels of the target image region can be determined from the depth map.
- the edge pixels can be obtained by pre-marking the depth map.
- the target image area in the depth map can be determined, and then the target image area can be marked in the depth map.
- The marking can be done manually, or an automatic marking method can of course also be adopted.
- In some embodiments, the edge pixels can be determined from the depth map through the semantic image corresponding to the depth map, where the semantic image is an image divided into multiple image regions, each of which corresponds to one type of target object in the three-dimensional space.
- Semantic images can be obtained by semantic segmentation of images, which distinguishes objects of different categories and marks objects of the same category as one class. As shown in Figure 2, in the semantic segmentation result all people in the scene are assigned to one category and marked with the same color (rendered as different gray levels in Figure 2); the same holds for trees, streets, buildings, the sky, and so on.
- the semantic image corresponding to each depth map can be obtained through a pre-trained calculation model.
- the RGB image corresponding to each depth map can be input to the pre-trained calculation model, and then the semantic image corresponding to each depth map can be output.
- To train the calculation model, a large number of RGB images containing the target object can be collected, the image areas corresponding to the target object annotated, and the model trained on the annotated RGB images to obtain the calculation model.
- the calculation model may be a deep learning model, for example, it may be an FCN (Fully Convolutional Networks) model.
- Each target image area and its edge pixels can be determined from the depth map according to the classification of the objects in the semantic image.
- For example, if the target image area is the image area of a water area, the image area of the water can be determined from the semantic image, and the corresponding image area and edge pixels then located on the depth map through the pixel-to-pixel correspondence between the two images.
- the edge pixels may be pixels located at the boundary of the target image area and belonging to the target image area. In some embodiments, the edge pixels may also be pixels located at the boundary of the target image area but belonging to other image areas.
- Figure 3 is a schematic diagram of the target image area in a depth map or semantic image: the pixels filled in gray belong to the target image area, and the unfilled white pixels belong to other image areas.
- The gray pixels 301 marked with a cross are pixels located at the boundary of the target image area and belonging to the target image area; the white pixels 302 marked with diagonal lines are pixels located at the boundary of the target image area but belonging to other image areas.
- In some embodiments, 301, that is, a pixel located at the boundary of the target image area and belonging to the target image area, may be taken as an edge pixel.
- However, because the depth information of pixels inside the target image area is inaccurate, the subsequent depth calculation for the target image area would also be inaccurate. Therefore, in some embodiments, 302, a pixel located at the boundary of the target image area but belonging to another image area, is taken as an edge pixel instead.
- Specifically, the pixels of each target image area can be determined from the depth map according to the pixel correspondence between the semantic image and the depth map. For each pixel in the target image area, it can be determined one by one whether all the neighboring pixels around it belong to the target image area. If so, the pixel lies inside the target image area rather than on its edge. If not, the pixel lies at the boundary, so the pixels of other image areas adjacent to it can be determined to be edge pixels of the target image area, as shown by 302 in Fig. 3.
- Taking the water area as an example, each pixel of the water image area can be determined from the semantic image corresponding to the depth map. Since the pixels of the two images correspond one to one, the pixels of the water image area can be found in the depth map; it is then determined one by one whether the neighboring pixels around each such pixel all represent water, and if not, the pixels of the other image areas adjacent to that pixel are determined to be edge pixels of the water area.
- Of course, the pixels corresponding to the target image area can also be determined in the semantic image directly: it is determined one by one whether the neighboring pixels around each pixel all belong to the target image area, and if not, the adjacent pixels of other image areas are edge pixels, as shown by 302 in Fig. 3. After the edge pixels are determined in the semantic image, their corresponding points on the depth map are found; these are the edge pixels on the depth map.
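The neighbor check described above can be sketched in a few lines. The sketch below is an illustration only, assuming the semantic segmentation result is available as a boolean mask and using 4-connectivity; it marks as edge pixels the boundary pixels lying outside the target area, as in case 302:

```python
import numpy as np

def water_edge_pixels(semantic_mask: np.ndarray) -> np.ndarray:
    """Return (row, col) coordinates of edge pixels: pixels OUTSIDE the
    target region that are 4-adjacent to at least one target pixel.

    `semantic_mask` is a boolean array where True marks the target image
    area (e.g. water) in the semantic image.
    """
    m = semantic_mask
    # Shift the mask one pixel in each of the four directions; a
    # non-target pixel with any target neighbor is an edge pixel.
    neighbor_is_target = np.zeros_like(m, dtype=bool)
    neighbor_is_target[1:, :] |= m[:-1, :]   # target pixel above
    neighbor_is_target[:-1, :] |= m[1:, :]   # target pixel below
    neighbor_is_target[:, 1:] |= m[:, :-1]   # target pixel to the left
    neighbor_is_target[:, :-1] |= m[:, 1:]   # target pixel to the right
    return np.argwhere(neighbor_is_target & ~m)

# Toy 4x4 mask with a 2x2 water patch in the center: the 8 white pixels
# that touch the patch (4-adjacency) come back as edge pixels.
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True
edges = water_edge_pixels(mask)
```

The same routine applies unchanged whether the mask comes from the semantic image or from a pre-marked depth map, since the two are in pixel-to-pixel correspondence.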
- After the edge pixels of each target image area are determined, they can be projected into three-dimensional space to determine which target image areas correspond to the same target object in three-dimensional space. The three-dimensional points corresponding to the edge pixels of the target image areas that correspond to the same target object are then collected into a set, yielding one or more three-dimensional point sets, where each three-dimensional point set corresponds to one target object in three-dimensional space. Then, according to the three-dimensional points in each three-dimensional point set, the depth value of the target image areas corresponding to that set is determined.
- The three-dimensional point to which an edge pixel of the target image area projects can be determined from the pixel coordinates of the edge pixel and the intrinsic and extrinsic parameters of the camera device, where the extrinsic parameters include the rotation matrix and the translation vector of the camera device.
- Specifically, formula (1) can be used for the conversion; in the standard pinhole-camera form implied by the variable definitions below, it reads:
- Z·[u, v, 1]^T = K·(R·Pw + T)  (1)
- (u, v) is the coordinate of the edge pixel;
- Z is the depth at the pixel coordinate (u, v);
- K is the camera intrinsic matrix;
- R is the rotation matrix of the camera device that captured the image;
- T is the translation vector of the camera device that captured the image;
- Pw is the spatial coordinate of the three-dimensional point in three-dimensional space corresponding to the edge pixel at (u, v).
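Formula (1) can be sketched in code as follows. This is an illustration under the assumed common convention Z·[u, v, 1]^T = K·(R·Pw + T) with orthonormal R (so R⁻¹ = Rᵀ); the patent itself does not reproduce the exact convention:

```python
import numpy as np

def backproject(u, v, Z, K, R, T):
    """Project pixel (u, v) with known depth Z into world coordinates.

    Assumes Z*[u, v, 1]^T = K @ (R @ Pw + T), i.e. R and T map world
    coordinates into the camera frame, and R is orthonormal.
    """
    cam = Z * (np.linalg.inv(K) @ np.array([u, v, 1.0]))  # camera frame
    return R.T @ (cam - T)                                # world point Pw
```

For example, with K = R = identity and T = 0, the pixel (10, 5) at depth 2 back-projects to the world point (20, 10, 2).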
- Specifically, the edge pixels of each target image area can first be projected into three-dimensional space to obtain the three-dimensional points corresponding to the edge pixels of each target image area; then, according to the spatial coordinates of those three-dimensional points, the three-dimensional points corresponding to the same target object in three-dimensional space are determined and placed in the same three-dimensional point set.
- For example, suppose the three-dimensional space contains water area 1 and water area 2, and the water image area in each depth map contains only part of these two water areas. The edge pixels of the water image areas in the multiple depth maps are projected into three-dimensional space to determine their corresponding three-dimensional points; from these, the three-dimensional points corresponding to the edge pixels of water area 1 and those corresponding to the edge pixels of water area 2 are determined, and two three-dimensional point sets are constructed.
- In some embodiments, the intersection of the three-dimensional space regions corresponding to the target image areas can first be determined according to the spatial coordinates of the three-dimensional points corresponding to the edge pixels of each target image area, and the three-dimensional points corresponding to the same target object in three-dimensional space then determined according to those intersections.
- For example, suppose the three-dimensional points corresponding to the edge pixels of target image area A are A1, A2, A3, A4, A5, and those of target image area B are B1, B2, B3, B4, B5. Connecting A1, A2, A3, A4, and A5 yields a three-dimensional space region a, which is the physical space of the real object corresponding to target image area A.
- Likewise, connecting B1, B2, B3, B4, and B5 yields a three-dimensional space region b, the physical space of the real object corresponding to target image area B. Whether region a and region b intersect can then be checked: if they intersect, target image areas A and B correspond to the same target object, so the three-dimensional points corresponding to the edge pixels of A and B are three-dimensional points of the same target object.
- When determining whether the three-dimensional space regions intersect, the three-dimensional points corresponding to the edge pixels of each target image area can be projected onto the same plane to determine the plane region corresponding to each target image area; the intersection of the three-dimensional space regions corresponding to the target image areas is then determined from the intersection of the corresponding plane regions.
- the three-dimensional points corresponding to the edge pixels of the target image area A are A1, A2, A3, A4, A5
- the three-dimensional points corresponding to the edge pixels of the target image area B are B1, B2, B3, B4, and B5.
- Points A1, A2, A3, A4, A5 and points B1, B2, B3, B4, B5 are projected onto one plane, for example the plane formed by the X and Y axes, giving two plane regions a1 and b1 respectively; whether a1 and b1 intersect is then judged. If they intersect, the three-dimensional space regions corresponding to target image areas A and B intersect.
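The plane-projection intersection check above can be sketched as follows. As a deliberate simplification, this illustration compares axis-aligned bounding boxes of the projected points rather than the actual polygons; a production implementation would test polygon intersection:

```python
import numpy as np

def regions_intersect_xy(points_a: np.ndarray, points_b: np.ndarray) -> bool:
    """Rough intersection test for two 3D edge-point sets.

    Projects both point sets onto the X-Y plane (drops Z) and checks
    whether their axis-aligned bounding boxes overlap. This stands in
    for the polygon-intersection test of the plane regions a1 and b1.
    """
    a, b = points_a[:, :2], points_b[:, :2]
    a_min, a_max = a.min(axis=0), a.max(axis=0)
    b_min, b_max = b.min(axis=0), b.max(axis=0)
    # Boxes overlap iff each box's minimum lies below the other's maximum.
    return bool(np.all(a_min <= b_max) and np.all(b_min <= a_max))
```

Two water patches whose projected footprints overlap are reported as intersecting and hence as candidates for the same water body.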
- If the three-dimensional space regions corresponding to two target image areas intersect, the three-dimensional points corresponding to the edge pixels of the two target image areas are determined to be three-dimensional points corresponding to the same target object in three-dimensional space; or, if the three-dimensional space regions corresponding to two target image areas each intersect the same third three-dimensional space region, the three-dimensional points corresponding to the edge pixels of the two target image areas are likewise determined to correspond to the same target object in three-dimensional space.
- the corresponding three-dimensional space area of target image area A in three-dimensional space is a
- the corresponding three-dimensional space area of target image area B in three-dimensional space is b
- the corresponding three-dimensional space area of target image area C in three-dimensional space is c
- If region a intersects region c and region b also intersects region c, then target image area A and target image area B also correspond to the same target object in three-dimensional space; that is, target image areas A, B, and C all correspond to the same target object in three-dimensional space, and the three-dimensional points corresponding to the edge pixels of the three image areas are three-dimensional points of the same target object.
- After the three-dimensional point sets are determined, the depth information of the target object can be determined from the spatial coordinates of the three-dimensional points in each set. Since the target object is often an object with flat characteristics, such as water or glass, it can be treated as a plane; therefore, in some embodiments, the average of the depth values of these three-dimensional points can be taken as the depth value of the target object and used to fill in the depth information of the target image area in each depth map.
- the spatial coordinates of each three-dimensional point in the three-dimensional point set are fitted to obtain a fitting plane, and then the depth information of each target image area is determined according to the fitting plane.
- The plane equation of the fitting plane can be written as formula (2):
- a·x + b·y + c·z + d = 0  (2)
- a, b, c, d are the coefficients of the plane equation obtained by fitting the three-dimensional point coordinates in the three-dimensional point set. The depth information of each target image area is then determined from the fitting plane: combining formula (2) with formula (1), Z·[u, v, 1]^T = K·(R·Pw + T), immediately yields the depth Z of each pixel corresponding to the target object in each depth map, after which the target image area corresponding to the target object in each depth map is filled.
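The plane fit of formula (2) and the closed-form depth from combining it with formula (1) can be sketched as follows. This is an illustration under the same assumed pinhole convention as before; `fit_plane` parameterizes the plane as z = p·x + q·y + r, which assumes the plane is not vertical (reasonable for a water surface):

```python
import numpy as np

def fit_plane(points):
    """Least-squares fit of a plane to the pooled 3D edge points.

    The plane a*x + b*y + c*z + d = 0 of formula (2) is parameterized
    as z = p*x + q*y + r, then rewritten with normal n = (p, q, -1)
    and offset d = r, so that n @ P + d = 0 on the plane.
    """
    A = np.c_[points[:, 0], points[:, 1], np.ones(len(points))]
    (p, q, r), *_ = np.linalg.lstsq(A, points[:, 2], rcond=None)
    return np.array([p, q, -1.0]), r

def depth_on_plane(u, v, K, R, T, n, d):
    """Depth Z at pixel (u, v) whose back-projection lies on the plane.

    Assuming Z*[u, v, 1]^T = K @ (R @ Pw + T) with orthonormal R,
    Pw = R^T @ (Z * K^-1 @ [u, v, 1]^T - T).  Substituting Pw into
    n @ Pw + d = 0 and solving for Z gives:
        Z = (n @ R^T @ T - d) / (n @ R^T @ K^-1 @ [u, v, 1]^T)
    """
    p = np.linalg.inv(K) @ np.array([u, v, 1.0])
    return (n @ (R.T @ T) - d) / (n @ (R.T @ p))
```

For a horizontal water plane z = 5 seen by an identity camera at the origin, `depth_on_plane(0, 0, np.eye(3), np.eye(3), np.zeros(3), *fit_plane(pts))` recovers a depth of 5 at the image center.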
- The calculation of the depth map depends on the distinctiveness and stability of the surface texture of the object: feature points must be extracted from multiple RGB images of the three-dimensional objects and matched in order to determine the depth information of each three-dimensional object in the image.
- For three-dimensional objects with reflective surfaces, weak surface textures, or unfixed textures, such as a water surface, three-dimensional reconstruction has always been a problem.
- Traditional three-dimensional reconstruction methods have difficulty obtaining correct reconstruction results for these objects: the water-surface point cloud is usually missing, or its height information is confused.
- In view of this, this embodiment provides an image processing method that can fill in the water areas in the depth maps used to construct the three-dimensional point cloud, so that the constructed three-dimensional point cloud does not exhibit layering and is relatively uniform and complete. The specific method is as follows:
- the deep learning model can be an FCN model.
- the trained model can calculate the pixel-level water segmentation results of each image through a series of processing, and then determine the pixel points corresponding to the water area in each image.
- The RGB images corresponding to the depth maps used to construct the three-dimensional point cloud can be input into the trained deep learning model, which outputs images in which the water and non-water areas are marked; such an image is called a semantic image.
- In this way, the original color RGB image and the depth map and semantic image of the same scale corresponding to that RGB image can be obtained, with the positions of the pixels of the three images in one-to-one correspondence.
- From each depth map and semantic image, the edge pixels of each water area in the depth map can be determined. Since the pixels of the water area in the semantic image all carry labels, each pixel of the water area can be determined from the semantic image; it is then judged one by one whether the neighboring pixels around each such pixel all belong to the water area, and if not, the pixels of the non-water image area adjacent to that pixel are taken as edge pixels of the water area. In this way the edge pixels of the water area in each depth map are determined.
- After the edge pixels of the water area in each depth map are determined, these edge pixels can be projected into the world coordinate system to determine the three-dimensional point corresponding to each edge pixel in the world coordinate system.
- The spatial coordinates of the three-dimensional point obtained by projecting each edge pixel into world coordinates can be calculated with formula (1):
- Z·[u, v, 1]^T = K·(R·Pw + T)  (1)
- (u, v) is the coordinate of the edge pixel;
- Z is the depth at the pixel coordinate (u, v);
- K is the camera intrinsic matrix;
- R is the rotation matrix of the camera device that captured the image;
- T is the translation vector of the camera device that captured the image;
- Pw is the spatial coordinate of the three-dimensional point in three-dimensional space corresponding to the edge pixel at (u, v).
- After the three-dimensional points corresponding to the edge pixels of each water area in each depth map are determined, the water areas can be compared, and intersecting water areas marked as the same water body; as shown in Figure 4, the water areas on the two images intersect and can be marked as the same water body.
- Specifically, for each water area in each image, it is determined whether it belongs to an already counted water area; if not, it is recorded as a new water area.
- If the water area intersects a single counted water area (S506), the three-dimensional points corresponding to its edge pixels are added to the three-dimensional point set of that water area (S508).
- If the water area intersects multiple counted water areas, those water areas are merged into one water area, and the three-dimensional points corresponding to the edge pixels of all of them are placed into one three-dimensional point set (S509). In this way, a statistical result of the global water areas is obtained.
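The merge bookkeeping can be sketched with a disjoint-set (union-find) structure. The class below is an illustration, not the patent's data structure; the caller supplies the ids of the already-counted water areas that the new area intersects (e.g. from the plane-projection test):

```python
class WaterAreas:
    """Global water-area statistics: intersecting areas are merged so
    that every final set corresponds to one real water body."""

    def __init__(self):
        self.parent = []   # union-find parent pointers, one per area id
        self.points = []   # 3D edge points kept at each root area

    def _find(self, i):
        while self.parent[i] != i:
            self.parent[i] = self.parent[self.parent[i]]  # path halving
            i = self.parent[i]
        return i

    def add(self, pts, intersecting):
        """Register a new water area with edge points `pts` that
        intersects the existing areas whose ids are in `intersecting`;
        returns the root id of the (possibly merged) area."""
        i = len(self.parent)
        self.parent.append(i)
        self.points.append(list(pts))
        for j in intersecting:           # merge every intersecting area
            ri, rj = self._find(i), self._find(j)
            if ri != rj:
                self.parent[rj] = ri
                self.points[ri] += self.points[rj]
                self.points[rj] = []
        return self._find(i)
```

Adding an area that bridges two previously separate areas collapses all three into one set, matching step S509.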
- After the three-dimensional point set of each water area is obtained, the spatial coordinates of the points in the set can be fitted to the plane of formula (2), a·x + b·y + c·z + d = 0, where a, b, c, d are the coefficients obtained by fitting the three-dimensional point coordinates in the set.
- Combining formula (2) with formula (1), Z·[u, v, 1]^T = K·(R·Pw + T), immediately yields the depth Z of each pixel corresponding to the target object in each depth map, after which the target image area corresponding to the target object in each depth map is filled.
- The related technology fills in the depth of the water-surface area directly from the water-edge pixel information of a single depth map; because the depth noise near the water-surface pixels is large, the reconstructed point cloud exhibits serious layering.
- The image processing method provided by this application collects the edge pixels corresponding to the same water area across all depth maps, determines the depth information from these edge pixels from a global perspective, and fills in the depth of the water-surface area accordingly. The final point cloud is therefore not layered, and is more uniform and complete.
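A toy numerical sketch (with made-up edge depths) of why pooling the edge points globally avoids layering: per-map filling gives each patch of the same water surface its own depth, while global filling gives one shared depth.

```python
import numpy as np

# Edge depths recovered from two depth maps of the SAME water surface;
# the values are illustrative, not measured data.
edges_map1 = np.array([5.2, 5.1, 5.3])   # noisy edge depths, map 1
edges_map2 = np.array([4.8, 4.7, 4.9])   # noisy edge depths, map 2

# Per-map filling (related art): each map gets its own fill depth, so
# the two point-cloud patches sit at different heights -> layering.
fill1, fill2 = edges_map1.mean(), edges_map2.mean()

# Global filling (this method): pool the edge points of the merged
# water area and fill every map with the single shared depth.
fill_global = np.concatenate([edges_map1, edges_map2]).mean()
```

Here `fill1` and `fill2` differ by about 0.4, producing two offset layers, whereas `fill_global` assigns one consistent height to both patches.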
- Figure 6(a) is a schematic diagram of a three-dimensional point cloud reconstructed without filling the water area, and Figure 6(b) is a schematic diagram of a three-dimensional point cloud reconstructed after filling based on a single depth map; it can be seen that the reconstructed three-dimensional point cloud exhibits non-uniform layering.
- Figure 7(a) is a schematic diagram of a three-dimensional point cloud reconstructed using the filling method provided by an embodiment of the present invention, and Figure 7(b) is a three-dimensional point cloud reconstructed after filling based on a single depth map.
- the present application also provides an image processing device.
- The image processing device 80 includes a processor 81, a memory 82, and a computer program stored on the memory; the processor implements the following steps when executing the computer program:
- each depth map includes one or more target image regions, and the target image regions meet a preset type condition
- the depth value of the target image area corresponding to each three-dimensional point set is filled.
- When the processor is configured to project the edge pixels of each target image area into a three-dimensional space to determine at least one three-dimensional point set, the processing includes:
- When the processor is configured to determine the three-dimensional points corresponding to the same target object in the three-dimensional space according to the spatial coordinates of the three-dimensional points corresponding to the edge pixels of the respective target image areas, the processing includes:
- a three-dimensional point corresponding to the same target object in the three-dimensional space is determined based on the intersection of the three-dimensional space regions.
- the method when the processor is configured to determine the intersection of the three-dimensional space regions corresponding to each target image region according to the spatial coordinates of the three-dimensional points corresponding to the edge pixel points of each target image region, the method includes:
- the method when the processor is configured to determine the three-dimensional point corresponding to the same target object in the three-dimensional space based on the intersection of the three-dimensional space regions, the method includes:
- the three-dimensional space areas corresponding to the two target image areas intersect, determine that the three-dimensional points corresponding to the edge pixel points of the two target image areas are three-dimensional points corresponding to the same target object in the three-dimensional space; or
- the three-dimensional space regions corresponding to the two target image regions intersect with the same three-dimensional space region, it is determined that the three-dimensional points corresponding to the edge pixel points of the two target image regions are three-dimensional points corresponding to the same target object in the three-dimensional space.
- the method when the processor is configured to fill in the depth value of the target image region corresponding to each three-dimensional point set according to the spatial coordinates of the three-dimensional point in each three-dimensional point set, the method includes:
- the method when the processor is used to determine the edge pixels of each target image area in the multiple depth maps, the method includes:
- the semantic image is obtained based on the RGB image corresponding to the depth map and a pre-trained calculation model.
- the determining the edge pixels of each target image area in the multiple depth maps based on the semantic image includes:
- the target image area is an image area of a water area
- the target object is a water area
- an embodiment of the present specification also provides a computer storage medium in which a program is stored, and when the program is executed by a processor, the image processing method in any of the foregoing embodiments is implemented.
- the embodiments of this specification may adopt the form of a computer program product implemented on one or more storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing program code.
- Computer usable storage media include permanent and non-permanent, removable and non-removable media, and information storage can be realized by any method or technology.
- the information can be computer-readable instructions, data structures, program modules, or other data.
- Examples of computer storage media include, but are not limited to: phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, CD-ROM, digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission media that can be used to store information accessible by computing devices.
- for relevant parts, refer to the description of the method embodiment.
- the device embodiments described above are merely illustrative: the units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected according to actual needs to achieve the objectives of the solutions of the embodiments. Those of ordinary skill in the art can understand and implement them without creative work.
Abstract
An image processing method and device. The method comprises: obtaining a plurality of depth maps, each depth map comprising one or more target image areas, the target image areas satisfying a preset type condition; determining edge pixels of the target image areas in the plurality of depth maps; projecting the edge pixels of the target image areas into a three-dimensional space to determine at least one set of three-dimensional points, the three-dimensional points in each set corresponding to the same target object in the three-dimensional space; and filling in the depth value of the target image area corresponding to each set according to the spatial coordinates of the three-dimensional points in that set. With this method, defects such as holes, layering, and unevenness in point clouds constructed from depth maps are effectively mitigated.
Description
This application relates to the field of image processing technology, and in particular, to an image processing method and device.
At present, three-dimensional reconstruction technology is widely used in many fields. When performing three-dimensional reconstruction, multiple images can be collected and the depth maps corresponding to these images determined; a three-dimensional point cloud is then obtained from the multiple depth maps to construct a three-dimensional model. When determining the depth of each pixel of an object in an image, feature points can be extracted from one image and the matching points of these feature points found in another image; the depth of each pixel in the image is then determined based on these feature points and matching points, yielding a depth map. For objects with smooth surfaces and weak textures, feature points cannot be extracted, so the depth of such an object cannot be obtained. As a result, the regions corresponding to these objects in the resulting depth map lack depth information, which leaves holes in the constructed three-dimensional point cloud, and the missing depth information in the depth map therefore needs to be filled in.
In the related art, the effect of filling in the missing depth information in the depth map is not ideal, so that in the finally reconstructed three-dimensional point cloud, the points corresponding to the same object in three-dimensional space appear layered. It is therefore necessary to improve the method of filling in the depth information of the depth map so as to obtain a uniform and complete point cloud.
Summary of the Invention
In view of this, this application provides an image processing method and device.
According to the first aspect of the present application, an image processing method is provided, the method including:
acquiring multiple depth maps, each depth map including one or more target image areas, the target image areas meeting a preset type condition;
determining the edge pixels of each target image area in the multiple depth maps;
projecting the edge pixels of each target image area into a three-dimensional space to determine at least one set of three-dimensional points, the three-dimensional points contained in each set corresponding to the same target object in the three-dimensional space; and
filling in the depth value of the target image area corresponding to each set of three-dimensional points according to the spatial coordinates of the three-dimensional points in that set.
According to the second aspect of the present application, an image processing device is provided, the device including a processor, a memory, and a computer program stored on the memory, the processor implementing the following steps when executing the computer program:
acquiring multiple depth maps, each depth map including one or more target image areas, the target image areas meeting a preset type condition;
determining the edge pixels of each target image area in the multiple depth maps;
projecting the edge pixels of each target image area into a three-dimensional space to determine at least one set of three-dimensional points, the three-dimensional points contained in each set corresponding to the same target object in the three-dimensional space; and
filling in the depth value of the target image area corresponding to each set of three-dimensional points according to the spatial coordinates of the three-dimensional points in that set.
With the solution of this application, after multiple depth maps are acquired, the edge pixels of the target image area in each depth map are determined and projected into three-dimensional space to obtain their corresponding three-dimensional points. From these three-dimensional points, the sets of points corresponding to the same target object in three-dimensional space are determined, and the depth value of the target image area corresponding to each set is determined from the spatial coordinates of the three-dimensional points in that set, so as to fill in the depth information of the depth map. This method effectively alleviates the holes, layering, and unevenness in point clouds constructed from depth maps.
In order to describe the technical solutions in the embodiments of the present application more clearly, the accompanying drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present application; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
Fig. 1 is a flowchart of an image processing method provided by an embodiment of the present invention.
Fig. 2 is a schematic diagram of a semantic image provided by an embodiment of the present invention.
Fig. 3 is a schematic diagram of a depth map or semantic image containing a target area, provided by an embodiment of the present invention.
Fig. 4 is a schematic diagram of water areas intersecting across two images, provided by an embodiment of the present invention.
Fig. 5 is a flowchart of another image processing method provided by an embodiment of the present invention.
Fig. 6 is a schematic diagram of the point cloud obtained after filling the water area of a depth map in the prior art.
Fig. 7 is a schematic diagram of the point cloud obtained after filling the water area of a depth map according to an embodiment of the present invention.
Fig. 8 is a logical structure block diagram of an image processing device provided by an embodiment of the present invention.
The technical solutions in the embodiments of the present application will be described clearly and completely below in conjunction with the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of the present application, rather than all of them. Based on the embodiments in this application, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of this application.
Three-dimensional reconstruction technology is widely used in many fields. When performing three-dimensional reconstruction, multiple images can be collected and their corresponding depth maps determined; a three-dimensional point cloud is then obtained from the multiple depth maps to construct a three-dimensional model. When determining the depth of each pixel of an object in an image, feature points on the surface of the object can be extracted from one image, and the matching points of these feature points found in another image; the depth of each pixel of the object is then determined based on these feature points and matching points, yielding a depth map. Since feature points are generally corner points, inflection points, or intersections of contour lines in an object, that is, pixels whose values differ markedly from those of surrounding pixels, no feature points can be extracted for objects with smooth surfaces and weak textures, and the depth of such objects cannot be accurately calculated. The regions corresponding to these objects in the resulting depth map therefore lack depth information, leaving holes in the constructed three-dimensional point cloud. For example, for objects such as water surfaces and flat glass, the surface is relatively smooth and weakly textured, so feature points cannot be extracted from the image and the depth of these surfaces cannot be calculated from them; in the resulting depth map, the regions corresponding to water, glass, and similar objects will be missing.
In order to complete the depth information of these smooth-surfaced, weakly textured objects in the depth map, the regions corresponding to such objects can be filled in. In the related art, when filling in such objects, the edge pixels of the object are usually determined from a single depth map, the depth is calculated based on the depths of the edge pixels in that single depth map, and the corresponding region of the object in that depth map is filled in. However, since an object often appears in multiple depth maps, each of which may contain only part of the object, calculating the depth from the pixels of a single depth map introduces a certain deviation: the depth calculated from each depth map differs, even though all represent the same object in three-dimensional space. Consequently, filling in each depth map separately based on the depths of its own edge pixels causes layering in the finally reconstructed three-dimensional point cloud. It is therefore necessary to improve the method of filling in the depth information of the depth map so as to obtain a uniform and complete point cloud.
In order to solve the above problems, this application provides an image processing method. Specifically, as shown in Fig. 1, the image processing method provided by this application includes the following steps:
S102. Acquire multiple depth maps, each depth map including one or more target image areas that meet a preset type condition.
S104. Determine the edge pixels of each target image area in the multiple depth maps.
S106. Project the edge pixels of each target image area into three-dimensional space to determine at least one set of three-dimensional points, the points in each set corresponding to the same target object in three-dimensional space.
S108. Fill in the depth value of the target image area corresponding to each set of three-dimensional points according to the spatial coordinates of the points in that set.
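Step S108 can be sketched as follows: once the edge pixels of the same target object across all depth maps are collected into one set of three-dimensional points, a single depth is estimated globally from that set and used to fill every corresponding target image area. This is a minimal illustrative sketch, not the patent's exact computation; it assumes the target surface (e.g. a water surface) is level and uses the median height as the global estimate, and the function name and data are hypothetical.

```python
import numpy as np

def fill_water_depth(edge_points_per_map):
    """Estimate one global surface height from edge points collected
    across all depth maps, instead of estimating it per depth map."""
    # Stack the edge points of every depth map into one (N, 3) array.
    all_points = np.vstack(edge_points_per_map)
    # Use the median height as a robust global estimate of the surface.
    return float(np.median(all_points[:, 2]))

# Edge points of the same water area seen in three depth maps
# (hypothetical data; each row is an (x, y, z) point in world space).
per_map = [
    np.array([[0.0, 0.0, 10.1], [1.0, 0.0, 10.0]]),
    np.array([[2.0, 1.0, 9.9],  [3.0, 1.0, 10.0]]),
    np.array([[4.0, 2.0, 10.2]]),
]
print(fill_water_depth(per_map))  # 10.0: one consistent height for all maps
```

Because every depth map's water area receives the same global value, the reconstructed point cloud cannot split into layers, which is the effect described for Fig. 7(a).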
The image processing method provided in this application can be used in various electronic devices that perform image processing, such as mobile phones, notebook computers, tablet computers, and desktop computers. In an optional embodiment, the image processing method provided in this application may also be executed on a cloud processor. Of course, in some embodiments, the image processing method provided in this application can also be applied in various kinds of three-dimensional reconstruction software.
Multiple depth maps can be acquired first. These depth maps can be obtained from multiple RGB images collected by a camera: feature points are extracted from one RGB image and the corresponding matching points found in another RGB image; from the feature points and their matching points, the depth information of each pixel in the RGB image can be computed, yielding the depth map corresponding to that RGB image. These depth maps include one or more target image areas, where a target image area is an area that meets a preset type condition, that is, the image area corresponding to a certain class of three-dimensional object. Typically, the target image areas are image areas that need to be filled with depth information, for example, image areas corresponding to three-dimensional objects with smooth surfaces and weak textures. In some embodiments, the target image area may be the image area corresponding to a water area in three-dimensional space, or the image area corresponding to flat glass; of course, it may also be the image area of other three-dimensional objects with similar characteristics, which is not limited in this application.
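For the special case of a rectified image pair, the depth recovered from a feature point and its match reduces to the familiar disparity relation Z = f·B/d. This is an illustrative sketch of that special case only (the patent does not assume rectified stereo); the camera numbers are hypothetical.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth of a matched feature point in a rectified image pair:
    Z = f * B / d, with focal length in pixels, baseline in meters,
    and disparity (horizontal pixel offset between the match) in pixels."""
    return focal_px * baseline_m / disparity_px

# Hypothetical camera: f = 700 px, baseline 0.5 m, disparity 35 px.
print(depth_from_disparity(700.0, 0.5, 35.0))  # 10.0 meters
```

Pixels with no reliable match (e.g. on water or glass) get no disparity and hence no depth, which is exactly the gap the filling method addresses.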
The target object in this application is the object in three-dimensional space corresponding to the target image area, that is, an object with a relatively smooth surface and weak texture in three-dimensional space. For example, if the target image area is the image area of a water area, the target object is the water area; if the target image area is the image area of flat glass, the target object is the flat glass. The one or more target image areas included in each depth map may be image areas corresponding to the same target object in three-dimensional space, or image areas corresponding to different target objects.
After a depth map containing one or more target image areas is acquired, the edge pixels of the target image areas can be determined from the depth map. In some embodiments, the edge pixels can be obtained by marking the depth map in advance: for example, the target image area in the depth map can be determined and then marked in the depth map, either manually or automatically.
In some embodiments, the edge pixels can be determined from the depth map through the semantic image corresponding to the depth map, where a semantic image is an image divided into multiple image areas, each corresponding to one class of target object in three-dimensional space. A semantic image can be obtained by performing semantic segmentation on an image, which distinguishes objects of different categories and marks objects of the same category as one class. As shown in Fig. 2, in the result of semantic segmentation, all people in the scene are grouped into one class and marked with the same color (rendered as different gray levels in Fig. 2), and likewise for trees, streets, buildings, the sky, and so on.
In some embodiments, the semantic image corresponding to each depth map can be obtained through a pre-trained calculation model: the RGB image corresponding to each depth map is input to the pre-trained model, which outputs the corresponding semantic image. To train this model, a large number of RGB images containing the target object can be collected, the image areas corresponding to the target object annotated, and the model trained on the annotated RGB images. The calculation model may be a deep learning model, for example an FCN (Fully Convolutional Networks) model.
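An FCN-style segmentation model outputs one score map per class, and the semantic image is the per-pixel argmax over those maps. A minimal sketch with fabricated logits follows; the two-class layout and values are hypothetical stand-ins for a trained model's output, not the patent's actual network.

```python
import numpy as np

def semantic_image(class_logits):
    """Turn per-class score maps of shape (C, H, W), as produced by a
    fully convolutional segmentation network, into a semantic image of
    shape (H, W) holding per-pixel class labels via argmax."""
    return np.argmax(class_logits, axis=0)

# Hypothetical 2-class logits (class 0 = background, class 1 = water)
logits = np.zeros((2, 2, 3)) # (classes, height, width)
logits[1, :, :2] = 5.0       # the network scores the left 2 columns as water
labels = semantic_image(logits)
print(labels)                # left 2 columns labeled 1, rightmost column 0
```

The resulting label image is what the method reads the water area (and its boundary) from, pixel-aligned with the depth map.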
After the semantic image corresponding to the depth map is determined, since the pixels of the two images correspond one to one, each target image area and its edge pixels can be determined from the depth map according to the classification of objects in the semantic image. For example, if the target image area is the image area of a water area, the image area of the water area can be determined from the semantic image, and then, according to the correspondence between the pixels of the two images, the image area and edge pixels of the water area can be determined on the depth map. Of course, it is also possible to first determine the image area and edge pixels of the water area directly in the semantic image, and then determine the edge pixels in the depth map according to the pixel correspondence.
In some embodiments, the edge pixels may be pixels located on the boundary of the target image area and belonging to the target image area. In other embodiments, the edge pixels may be pixels located on the boundary of the target image area but belonging to other image areas. Fig. 3 is a schematic diagram of a target image area in a depth map or semantic image, where the pixels filled in gray correspond to the target image area and the unfilled white pixels belong to other image areas. The pixels 301, filled in gray and marked with a cross, are located on the boundary of the target image area and belong to it; the pixels 302, filled in white and marked with diagonal lines, are located on the boundary of the target image area but belong to other image areas. Therefore, in some embodiments, the pixels 301, located on the boundary of and belonging to the target image area, can be taken as the edge pixels. However, since the depth information of pixels within the target image area is not very accurate, the subsequently calculated depth of the target image area would also be inaccurate; therefore, in some embodiments, the pixels 302, which lie on the boundary of the target image area but belong to other image areas, are taken as the edge pixels.
In some embodiments, the pixels of each target image area can be determined from the depth map according to the pixel correspondence between the semantic image and the depth map. Then, for each pixel of the target image area, it is determined one by one whether all of its neighboring pixels also belong to the target image area. If so, there are further target-area pixels beyond this one, so this pixel is not on the edge of the target image area. If not, other image areas lie beyond this pixel, and the neighboring pixels belonging to those other image areas can be determined as edge pixels of the target image area, such as 302 in Fig. 3. For example, if the target image area is a water image area, each pixel of the water image area can be determined from the semantic image corresponding to the depth map; since the pixels of the two images correspond one to one, these pixels can be located in the depth map, and for each of them it is judged one by one whether all of its surrounding neighboring pixels also represent the water image. If not, the neighboring pixels of the other image areas are determined to be the edge of the water area.
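The neighbor check described above can be sketched as follows: for each target-area pixel, examine its 4-neighbors, and collect those neighbors that fall outside the target area (the 302-style edge pixels). This is an illustrative sketch; the mask and function name are hypothetical.

```python
import numpy as np

def outside_edge_pixels(region_mask):
    """Return the pixels that are NOT in the target region but are
    4-adjacent to it, i.e. the '302' edge pixels described above."""
    h, w = region_mask.shape
    edges = set()
    for y in range(h):
        for x in range(w):
            if not region_mask[y, x]:
                continue  # only inspect neighbors of target-area pixels
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and not region_mask[ny, nx]:
                    edges.add((ny, nx))
    return sorted(edges)

# Hypothetical 4x4 mask with a 2x2 water region in the middle.
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True
print(outside_edge_pixels(mask))  # the 8 pixels ringing the 2x2 block
```

Because these edge pixels lie outside the water area, their depth values come from well-textured surroundings and are reliable inputs for the later projection step.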
Of course, in some embodiments, the pixels corresponding to the target image area can also be determined in the semantic image, judging one by one whether the neighboring pixels around each pixel all belong to the target image area; if not, the neighboring pixels of the other image areas are determined to be edge pixels, such as 302 in Fig. 3. After the edge pixels are determined in the semantic image, their corresponding points are found on the depth map; these are the edge pixels on the depth map.
After the edge pixels of each target area in the multiple depth maps are determined, the edge pixels of each target image area can be projected into three-dimensional space to determine which target image areas correspond to the same target object in three-dimensional space. The three-dimensional points corresponding to the edge pixels of the target image areas that correspond to the same target object are then taken as one set, yielding one or more sets of three-dimensional points, where each set corresponds to one target object in three-dimensional space. The depth value of the target image area corresponding to each set is then determined from the three-dimensional points in that set. The three-dimensional point obtained by projecting an edge pixel into three-dimensional space can be determined from the pixel coordinates of the edge pixel and the internal and external parameters of the camera, the external parameters including the rotation matrix and translation matrix of the camera. Specifically, formula (1) can be used for the conversion:
Z · [u, v, 1]^T = K (R · Pw + T)    (1)

where (u, v) are the coordinates of the edge pixel, Z is the depth at pixel coordinates (u, v), K is the camera intrinsic matrix, R is the rotation matrix of the camera that captured the image, T is the translation matrix of that camera, and Pw is the spatial coordinate of the three-dimensional point in three-dimensional space corresponding to the edge pixel (u, v). Pw can thus be recovered as Pw = R^(-1) (Z · K^(-1) · [u, v, 1]^T - T).
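Formula (1) can be inverted to recover Pw from a pixel and its depth. A minimal numpy sketch follows; the identity intrinsics/extrinsics are hypothetical values chosen only as a sanity check, not real calibration data.

```python
import numpy as np

def backproject(u, v, Z, K, R, T):
    """Back-project pixel (u, v) with depth Z to a world-space point Pw
    by inverting Z * [u, v, 1]^T = K (R * Pw + T)."""
    uv1 = np.array([u, v, 1.0])
    return np.linalg.inv(R) @ (Z * np.linalg.inv(K) @ uv1 - T)

# Identity intrinsics/extrinsics as a sanity check (hypothetical values).
K = np.eye(3)
R = np.eye(3)
T = np.zeros(3)
Pw = backproject(2.0, 3.0, 4.0, K, R, T)
print(Pw)  # with identity K, R and zero T this is Z * [u, v, 1] = [8, 12, 4]
```

Each depth map has its own R and T, so edge pixels from different views land in one shared world frame, which is what makes the later intersection test possible.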
In some embodiments, the edge pixels of each target image area can first be projected into three-dimensional space to obtain the three-dimensional points corresponding to the edge pixels of each target image area; then, according to the spatial coordinates of those three-dimensional points, the three-dimensional points corresponding to the same target object in three-dimensional space are determined, and the three-dimensional points corresponding to the same target object are placed in the same three-dimensional point set. For example, suppose the three-dimensional space contains water area 1 and water area 2, while the water image area on each individual depth map covers only part of these two water areas. The edge pixels of the water image areas on the multiple depth maps can all be projected into three-dimensional space to determine the three-dimensional points they correspond to; from these, the three-dimensional points corresponding to the edge pixels of water area 1 and those corresponding to the edge pixels of water area 2 are identified, and two three-dimensional point sets are constructed.
In some embodiments, when determining the three-dimensional points corresponding to the same target object in three-dimensional space, the intersection of the three-dimensional space regions corresponding to the respective target image areas can first be determined according to the spatial coordinates of the three-dimensional points corresponding to the edge pixels of each target image area; the three-dimensional points corresponding to the same target object are then determined according to that intersection. For example, suppose the three-dimensional points corresponding to the edge pixels of target image area A are A1, A2, A3, A4 and A5, and those of target image area B are B1, B2, B3, B4 and B5. Connecting A1 through A5 yields a three-dimensional space region a, i.e., the physical space of the real object corresponding to target image area A; likewise, connecting B1 through B5 yields a three-dimensional space region b, the physical space of the real object corresponding to target image area B. Whether region a and region b intersect can then be checked: if they do, target image area A and target image area B correspond to the same target object, and the three-dimensional points corresponding to the edge pixels of A and B are three-dimensional points of that same target object.
In some embodiments, to better determine the intersection of the three-dimensional space regions corresponding to the target image areas, the three-dimensional points corresponding to the edge pixels of each target image area can be projected onto the same plane to determine the plane region corresponding to each target image area; the intersection of the three-dimensional space regions is then determined from the intersection of these plane regions. For example, the three-dimensional points A1 through A5 of target image area A and B1 through B5 of target image area B can be projected onto one plane, such as the plane formed by the X and Y axes, yielding two plane regions a1 and b1; if a1 and b1 intersect, the three-dimensional space regions corresponding to target image areas A and B intersect.
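A minimal sketch of this plane-projection test, substituting axis-aligned bounding boxes of the projected points for a full polygon-intersection test (a simplifying assumption; a real implementation might intersect the polygons themselves):

```python
def xy_bbox(points3d):
    """Axis-aligned bounding box of 3D points projected onto the XY plane."""
    xs = [p[0] for p in points3d]
    ys = [p[1] for p in points3d]
    return (min(xs), min(ys), max(xs), max(ys))

def bboxes_intersect(a, b):
    """True if two (xmin, ymin, xmax, ymax) boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]
```

The bounding-box test is conservative: if the boxes do not overlap, the underlying plane regions certainly do not intersect.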
In some embodiments, if the three-dimensional space regions corresponding to two target image areas intersect, the three-dimensional points corresponding to the edge pixels of the two target image areas are determined to correspond to the same target object in three-dimensional space; alternatively, if the three-dimensional space regions corresponding to two target image areas both intersect the same third three-dimensional space region, the three-dimensional points corresponding to the edge pixels of the two target image areas are likewise determined to correspond to the same target object. For example, suppose target image areas A, B and C correspond to three-dimensional space regions a, b and c, respectively. If a and b intersect, target image areas A and B are considered to correspond to the same target object in three-dimensional space, and the three-dimensional points corresponding to the edge pixels of the two image areas are three-dimensional points of that same object. If a intersects c and b also intersects c, then target image areas A and B are also considered to correspond to the same target object; that is, target image areas A, B and C all correspond to the same target object in three-dimensional space, and the three-dimensional points corresponding to the edge pixels of all three image areas are three-dimensional points of that same target object.
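This transitive grouping (a~c and b~c place A, B and C in one object) is a connected-components problem; a hypothetical union-find sketch, where pairwise intersections are treated as graph edges:

```python
def group_regions(n, intersecting_pairs):
    """Group n target image areas into objects given pairwise intersections.

    Each intersection is an edge; connected components are target objects,
    so A~C together with B~C transitively puts A, B and C in one group.
    """
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in intersecting_pairs:
        parent[find(a)] = find(b)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())
```

For instance, with regions 0 and 1 each intersecting region 2 but not each other, all three end up in a single group.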
After the three-dimensional point set corresponding to each target object is determined, the depth information of that target object can be determined according to the spatial coordinates of the three-dimensional points in the set. Since the target object is often an object with planar characteristics, such as a water area or glass, it can be treated as a plane; therefore, in some embodiments, the average of the depth values of these three-dimensional points can be taken as the depth value of the target object and used to fill in the depth information of the target image areas in the depth maps.
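The averaging option can be sketched in a few lines; the row-major depth-map layout and the use of the points' third coordinate as the depth value are illustrative assumptions:

```python
def mean_fill_depth(depth, region_pixels, region_points3d):
    """Fill a planar target region with the mean depth of its 3D point set.

    depth: 2D list (row-major depth map), modified in place.
    region_pixels: (row, col) pixels of the target image area to fill.
    region_points3d: the object's 3D point set; the third coordinate
    of each point is taken as its depth value.
    """
    mean_z = sum(p[2] for p in region_points3d) / len(region_points3d)
    for r, c in region_pixels:
        depth[r][c] = mean_z
    return depth
```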
To determine the depth value of each target image area more accurately and thereby obtain a more realistic three-dimensional point cloud model, in some embodiments, after the three-dimensional point set of each target object is determined, a fitting plane can be obtained for each set by fitting the spatial coordinates of its three-dimensional points; the depth information of each target image area is then determined from that fitting plane. For example, according to the three-dimensional point set corresponding to one target object, the plane equation of the fitting plane can be determined as follows:
a·x_w + b·y_w + c·z_w + d = 1    formula (2)
Here, a, b, c and d are the coefficients of the plane equation fitted from the coordinates of the three-dimensional points in the set. The depth information of each target image area is then determined from the fitting plane. Specifically, solving formula (2) together with formula (1) below yields the depth Z of the pixels corresponding to the target object on each depth map, which is then used to fill the corresponding target image area on each depth map. Formula (1) is as follows:

Z·[u, v, 1]^T = K·(R·P_w + T)    formula (1)
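The plane fit itself can be sketched with the parameterization z = a·x + b·y + c, which is equivalent to formula (2) up to reparameterization whenever the plane is not vertical (a reasonable assumption for water surfaces); the 3×3 normal equations are solved here with Cramer's rule:

```python
def fit_plane(points):
    """Least-squares plane z = a*x + b*y + c through 3D points.

    Builds the 3x3 normal equations of the least-squares problem and
    solves them with Cramer's rule; needs at least three non-collinear
    points so the system is non-degenerate.
    """
    sxx = sum(x * x for x, y, z in points)
    sxy = sum(x * y for x, y, z in points)
    syy = sum(y * y for x, y, z in points)
    sx = sum(x for x, y, z in points)
    sy = sum(y for x, y, z in points)
    sxz = sum(x * z for x, y, z in points)
    syz = sum(y * z for x, y, z in points)
    sz = sum(z for x, y, z in points)
    n = len(points)

    def det3(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    M = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    rhs = [sxz, syz, sz]
    D = det3(M)
    coeffs = []
    for col in range(3):
        Mc = [row[:] for row in M]
        for r in range(3):
            Mc[r][col] = rhs[r]
        coeffs.append(det3(Mc) / D)
    return tuple(coeffs)  # (a, b, c)
```

Points lying exactly on a plane recover its coefficients exactly, which makes the fit easy to verify on synthetic data.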
To further explain the image processing method provided in this application, a specific embodiment is described below.
In three-dimensional reconstruction, the computation of a depth map depends on the distinctiveness and stability of object surface textures: feature points must be extracted from multiple RGB images of the three-dimensional objects and matched in order to determine the depth information of each object in the images. For three-dimensional objects with reflective, weakly textured, or unstable-textured surfaces, such as water surfaces, reconstruction has long been difficult; traditional methods struggle to produce correct results for such objects, and the water-surface point cloud is typically missing or has disordered height information.
To solve the problem of missing water areas in three-dimensional reconstruction, this embodiment provides an image processing method that fills in the water areas in the depth maps used to construct the three-dimensional point cloud, so that the constructed point cloud is relatively uniform and complete, without layering artifacts. The method is as follows:
1. Training the deep learning model
First, a large number of RGB images containing water scenes can be collected and the pixels of the water scenes in these images annotated; the annotated RGB images are then input into a deep learning model, which is trained to obtain a model that can segment water areas from non-water areas. The deep learning model may be an FCN model. Once trained, the model can compute a pixel-level water segmentation result for each image, i.e., determine the pixels corresponding to the water areas in each image.
2. Determining the semantic image corresponding to each depth map
After the deep learning model is trained, the RGB image corresponding to each depth map used to construct the three-dimensional point cloud can be input into the trained model, which outputs an image with the water and non-water areas labeled, referred to here as a semantic image. This yields, at the same scale, the original color RGB image together with its corresponding depth map and semantic image, with the pixels of the three images in one-to-one positional correspondence.
3. Determining the edge pixels of each water area in the depth maps
The edge pixels of each water area in a depth map can be determined from the correspondence between the depth map and the semantic image. Since the pixels of the water areas in the semantic image all carry labels, every pixel of a water area can be identified from the semantic image; each such pixel is then checked, one by one, to see whether all of its neighboring pixels are also water pixels. If not, the adjacent pixels belonging to the non-water image area are taken as the edge pixels of that water area, thereby determining the edge pixels of the water areas in each depth map.
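The neighbor test above can be sketched over a label mask; the 4-neighborhood and the water label value 1 are illustrative assumptions:

```python
def water_edge_pixels(labels, water=1):
    """Find edge pixels of water areas in a semantic label mask.

    For every water pixel, its 4-neighbours are checked; any neighbour
    that is NOT water is collected as an edge pixel, following the text:
    the adjacent non-water pixels (whose depth is reliable) mark the edge.
    """
    h, w = len(labels), len(labels[0])
    edges = set()
    for r in range(h):
        for c in range(w):
            if labels[r][c] != water:
                continue
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < h and 0 <= nc < w and labels[nr][nc] != water:
                    edges.add((nr, nc))
    return edges
```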
4. Determining the intersections of the water areas in the depth maps
After the edge pixels of the water areas in each depth map are determined, these edge pixels can be projected into the world coordinate system to determine the three-dimensional point corresponding to each edge pixel. The spatial coordinates of the three-dimensional point corresponding to each projected edge pixel can be calculated with formula (1):

Z·[u, v, 1]^T = K·(R·P_w + T)    formula (1)
Here, (u, v) are the coordinates of the edge pixel, Z is the depth at pixel coordinate (u, v), K is the intrinsic parameter matrix of the camera, R is the rotation matrix of the camera device that captured the image, T is the translation matrix of the camera device that captured the image, and P_w is the spatial coordinate of the three-dimensional point in three-dimensional space corresponding to the edge pixel at (u, v).
When multiple images capture the same water area, the edges of the water areas will intersect; the intersections among the water areas can therefore be determined from the three-dimensional points corresponding to the edge pixels of each water area in the depth maps, and intersecting water areas are marked as the same water area. As illustrated in Figure 4, the water areas in the two images intersect and can be marked as the same water area.
Figure 5 shows a flowchart for determining intersecting water areas. After each depth map used to construct the three-dimensional point cloud is obtained (S501), the edge pixels of the water areas in each depth map are determined (S502). The edge pixels of the water area of each image are then projected into the world coordinate system to obtain the corresponding three-dimensional points (S503), and it is determined from these points whether the water area intersects any already-counted water area (S504). If it intersects none of the counted water areas, it is counted as a new water area (S505). If it does intersect a counted water area, the water area in the image is judged to belong to a known water area, and it is further determined whether it intersects a single counted water area or multiple counted water areas: if a single one (S506), the three-dimensional points of the water area are added to the three-dimensional point set of that known water area (S508); if multiple (S507), those water areas are merged into one, and the three-dimensional points corresponding to the edge pixels of all of them are placed in a single three-dimensional point set (S509). In this way, a statistical result over the global water areas is obtained.
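One step of the S504–S509 flow can be sketched as follows; the `overlaps` predicate is a hypothetical stand-in for the projected-area intersection test described above:

```python
def merge_into_waters(waters, new_points, overlaps):
    """Merge one new water region's 3D points into the counted waters.

    waters: list of 3D point sets, one per counted water area.
    new_points: the new region's edge-pixel 3D points.
    overlaps: predicate(set_a, set_b) -> bool, standing in for the
    projected-area intersection test (S504).
    """
    hits = [i for i, w in enumerate(waters) if overlaps(w, new_points)]
    if not hits:                       # S505: count as a new water area
        waters.append(set(new_points))
    elif len(hits) == 1:               # S506/S508: extend the known water
        waters[hits[0]].update(new_points)
    else:                              # S507/S509: merge all hits into one
        merged = set(new_points)
        for i in reversed(hits):
            merged |= waters.pop(i)
        waters.append(merged)
    return waters
```

Running this over every depth map's water regions yields the global per-water-area point sets the flowchart describes.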
The three-dimensional points corresponding, in three-dimensional space, to the edge pixels belonging to the same water area are placed in one three-dimensional point set, and a plane is fitted from the coordinates of the three-dimensional points in that set, giving the plane equation:
a·x_w + b·y_w + c·z_w + d = 1    formula (2)
Here, a, b, c and d are the coefficients of the plane equation fitted from the coordinates of the three-dimensional points in the set. Solving formula (2) together with formula (1) yields the depth Z of the pixels corresponding to the target object on each depth map, which is then used to fill the corresponding target image area on each depth map. Formula (1) is as follows:

Z·[u, v, 1]^T = K·(R·P_w + T)    formula (1)
When multiple images capture the same water area, the related art fills the depth of the water-surface region directly with the edge-pixel information of a single depth map; because the depth of pixels near the water surface is noisy, the reconstructed point cloud exhibits severe layering. The image processing method provided in this application instead gathers, across the depth maps, the edge pixels corresponding to the same water area, determines the depth from all of these edge pixels, and fills the depth of the water-surface region from this global perspective; the resulting point cloud therefore shows no layering and is more uniform and complete.
As shown in Figure 6, Figure 6(a) is a schematic diagram of a three-dimensional point cloud reconstructed without filling the water area, and Figure 6(b) is a schematic diagram of a point cloud reconstructed with filling based on a single depth map; it can be seen that after the water area is filled with the prior-art method, the reconstructed point cloud shows uneven layering. As shown in Figure 7, Figure 7(a) is a schematic diagram of a point cloud reconstructed with the filling method provided by an embodiment of the present invention, and Figure 7(b) is a point cloud reconstructed with filling based on a single depth map; compared with single-depth-map filling, the water surface in the point cloud obtained by the method of this application is more uniform, exhibits no surface layering, and has greatly reduced water-surface noise.
In addition, this application further provides an image processing device. As shown in Figure 8, the image processing device 80 includes a processor 81, a memory 82, and a computer program stored in the memory; when executing the computer program, the processor implements the following steps:
acquiring multiple depth maps, each depth map including one or more target image areas, the target image areas satisfying a preset type condition;
determining the edge pixels of each target image area in the multiple depth maps;
projecting the edge pixels of each target image area into three-dimensional space to determine at least one three-dimensional point set, the three-dimensional points contained in each three-dimensional point set corresponding to the same target object in the three-dimensional space;
filling in the depth value of the target image area corresponding to each three-dimensional point set according to the spatial coordinates of the three-dimensional points in that set.
In some embodiments, when the processor is configured to project the edge pixels of each target image area into three-dimensional space to determine at least one three-dimensional point set, the steps include:
projecting the edge pixels of each target image area into three-dimensional space to obtain the three-dimensional points corresponding to the edge pixels of each target image area;
determining, according to the spatial coordinates of the three-dimensional points corresponding to the edge pixels of each target image area, the three-dimensional points that correspond to the same target object in three-dimensional space;
placing the three-dimensional points corresponding to the same target object in the same three-dimensional point set.
In some embodiments, when the processor is configured to determine, according to the spatial coordinates of the three-dimensional points corresponding to the edge pixels of each target image area, the three-dimensional points corresponding to the same target object in three-dimensional space, the steps include:
determining, according to those spatial coordinates, the intersection of the three-dimensional space regions to which the respective target image areas correspond in three-dimensional space;
determining the three-dimensional points corresponding to the same target object in three-dimensional space based on the intersection of the three-dimensional space regions.
In some embodiments, when the processor is configured to determine, according to the spatial coordinates of the three-dimensional points corresponding to the edge pixels of each target image area, the intersection of the three-dimensional space regions corresponding to the target image areas, the steps include:
projecting the three-dimensional points corresponding to the edge pixels of each target image area onto the same plane to determine the plane region corresponding to each target image area;
determining the intersection of the three-dimensional space regions corresponding to the target image areas according to the intersection of their corresponding plane regions.
In some embodiments, when the processor is configured to determine, based on the intersection of the three-dimensional space regions, the three-dimensional points corresponding to the same target object in three-dimensional space, the steps include:
if the three-dimensional space regions corresponding to two target image areas intersect, determining that the three-dimensional points corresponding to the edge pixels of the two target image areas correspond to the same target object in three-dimensional space; or
if the three-dimensional space regions corresponding to two target image areas intersect the same third three-dimensional space region, determining that the three-dimensional points corresponding to the edge pixels of the two target image areas correspond to the same target object in three-dimensional space.
In some embodiments, when the processor is configured to fill in the depth value of the target image area corresponding to each three-dimensional point set according to the spatial coordinates of the three-dimensional points in each set, the steps include, for each three-dimensional point set:
determining a fitting plane according to the spatial coordinates of the three-dimensional points in the set;
filling in the depth value of the target image area corresponding to the set according to the fitting plane.
In some embodiments, when the processor is configured to determine the edge pixels of each target image area in the multiple depth maps, the steps include:
acquiring the semantic images corresponding to the multiple depth maps, each semantic image being segmented into multiple image regions, each image region corresponding to one type of target object in three-dimensional space;
determining the edge pixels of each target image area in the multiple depth maps based on the semantic images.
In some embodiments, the semantic image is obtained based on the RGB image corresponding to the depth map and a pre-trained computational model.
In some embodiments, determining the edge pixels of each target image area in the multiple depth maps based on the semantic image includes:
determining the pixels of each target image area from the depth map based on the pixel correspondence between the semantic image and the depth map;
determining the edge pixels of the target image area according to whether the neighboring pixels around each such pixel are all pixels of the target image area.
In some embodiments, the target image area is an image area of a water area, and the target object is a water area.
Correspondingly, the embodiments of this specification further provide a computer storage medium storing a program which, when executed by a processor, implements the image processing method of any of the foregoing embodiments.
The embodiments of this specification may take the form of a computer program product implemented on one or more storage media containing program code (including, but not limited to, disk storage, CD-ROM, and optical storage). Computer-usable storage media include permanent and non-permanent, removable and non-removable media, and information storage may be implemented by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to: phase-change memory (PRAM), static random-access memory (SRAM), dynamic random-access memory (DRAM), other types of random-access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible to a computing device.
As for the device embodiments, since they substantially correspond to the method embodiments, reference may be made to the description of the method embodiments for relevant details. The device embodiments described above are merely illustrative: the units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units, i.e., they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the objectives of the solutions of the embodiments. Those of ordinary skill in the art can understand and implement them without creative effort.
It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between those entities or operations. The terms "include", "comprise", and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device including a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article, or device that includes that element.
The methods and devices provided by the embodiments of the present invention have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present invention; the descriptions of the above embodiments are intended only to help in understanding the method of the present invention and its core idea. Meanwhile, those of ordinary skill in the art may, in accordance with the idea of the present invention, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.
Claims (20)
- An image processing method, characterized in that the method comprises:
acquiring a plurality of depth maps, each depth map including one or more target image areas, the target image areas satisfying a preset type condition;
determining the edge pixels of each target image area in the plurality of depth maps;
projecting the edge pixels of each target image area into three-dimensional space to determine at least one three-dimensional point set, the three-dimensional points contained in each three-dimensional point set corresponding to the same target object in three-dimensional space; and
filling the depth values of the target image area corresponding to each three-dimensional point set according to the spatial coordinates of the three-dimensional points in that set.
- The image processing method according to claim 1, characterized in that projecting the edge pixels of each target image area into three-dimensional space to determine at least one three-dimensional point set comprises:
projecting the edge pixels of each target image area into three-dimensional space to obtain the three-dimensional points corresponding to those edge pixels;
determining, according to the spatial coordinates of the three-dimensional points corresponding to the edge pixels of each target image area, the three-dimensional points that correspond to the same target object in three-dimensional space; and
placing the three-dimensional points corresponding to the same target object in the same three-dimensional point set.
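The projection in this claim can be sketched with a standard pinhole-camera model: each edge pixel is back-projected along its camera ray by the measured depth, then transformed into the world frame. This is only one way to realize the step; the intrinsic matrix `K` and pose `(R, t)` below are illustrative placeholders, not values from the application.

```python
import numpy as np

def backproject_edge_pixels(edge_pixels, depth_map, K, R, t):
    """Project depth-map edge pixels (u, v) into world-space 3-D points.

    edge_pixels: iterable of (u, v) pixel coordinates (u = column, v = row)
    depth_map:   H x W array of depth values along the camera z-axis
    K:           3x3 camera intrinsic matrix
    R, t:        camera-to-world rotation and translation
    """
    K_inv = np.linalg.inv(K)
    points = []
    for u, v in edge_pixels:
        z = depth_map[v, u]
        # Pixel -> camera-frame point: unit ray scaled by the measured depth.
        p_cam = z * (K_inv @ np.array([u, v, 1.0]))
        # Camera frame -> world frame.
        points.append(R @ p_cam + t)
    return np.array(points)
```

With an identity pose, a pixel at the principal point and depth 2 maps to the point (0, 0, 2) on the optical axis, which is a quick sanity check for the convention used.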
- The image processing method according to claim 2, characterized in that determining, according to the spatial coordinates of the three-dimensional points corresponding to the edge pixels of each target image area, the three-dimensional points that correspond to the same target object in three-dimensional space comprises:
determining, according to the spatial coordinates of the three-dimensional points corresponding to the edge pixels of each target image area, the intersection relationships of the three-dimensional space regions to which the target image areas correspond in three-dimensional space; and
determining the three-dimensional points corresponding to the same target object in three-dimensional space based on the intersection relationships of the three-dimensional space regions.
- The image processing method according to claim 3, characterized in that determining the intersection relationships of the three-dimensional space regions corresponding to the target image areas according to the spatial coordinates of the three-dimensional points corresponding to the edge pixels of each target image area comprises:
projecting the three-dimensional points corresponding to the edge pixels of each target image area onto the same plane to determine the planar region corresponding to each target image area; and
determining the intersection relationships of the three-dimensional space regions corresponding to the target image areas according to the intersection relationships of the corresponding planar regions.
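One concrete realization of this claim is to project each region's edge points orthographically onto a common horizontal plane and test the resulting 2-D regions for overlap. The sketch below assumes the z = 0 plane and approximates each planar region by the axis-aligned bounding box of its points; both choices are simplifications of this example, not requirements of the claim.

```python
import numpy as np

def planar_regions_intersect(points_a, points_b):
    """Check whether two 3-D edge-point sets overlap after projection onto
    a common plane (here the z = 0 plane, as a simplifying assumption).

    Each region is approximated by the axis-aligned bounding box of its
    projected points; overlapping boxes are treated as intersecting.
    """
    a = np.asarray(points_a, dtype=float)[:, :2]  # drop z: orthographic projection
    b = np.asarray(points_b, dtype=float)[:, :2]
    a_min, a_max = a.min(axis=0), a.max(axis=0)
    b_min, b_max = b.min(axis=0), b.max(axis=0)
    # Boxes intersect iff their intervals overlap on both planar axes.
    return bool(np.all(a_min <= b_max) and np.all(b_min <= a_max))
```

A tighter test (e.g. convex-hull or polygon intersection) would follow the same pattern, only with a more precise planar-region representation.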
- The image processing method according to claim 3, characterized in that determining the three-dimensional points corresponding to the same target object in three-dimensional space based on the intersection relationships of the three-dimensional space regions comprises:
if the three-dimensional space regions corresponding to two target image areas intersect, determining that the three-dimensional points corresponding to the edge pixels of the two target image areas are three-dimensional points corresponding to the same target object in three-dimensional space; or
if the three-dimensional space regions corresponding to two target image areas each intersect the same three-dimensional space region, determining that the three-dimensional points corresponding to the edge pixels of the two target image areas are three-dimensional points corresponding to the same target object in three-dimensional space.
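Grouping regions that intersect directly, or through a shared third region, amounts to computing connected components over an intersection graph. A small union-find sketch of that grouping logic (the region identifiers and the `intersects` predicate are illustrative):

```python
def group_regions(region_ids, intersects):
    """Group region ids into sets belonging to the same target object.

    intersects(i, j) -> bool reports whether regions i and j intersect in
    3-D space.  Regions linked directly, or through a chain of shared
    intersections, end up in the same group (connected components).
    """
    parent = {r: r for r in region_ids}

    def find(r):
        while parent[r] != r:
            parent[r] = parent[parent[r]]  # path halving
            r = parent[r]
        return r

    def union(a, b):
        parent[find(a)] = find(b)

    ids = list(region_ids)
    for i in range(len(ids)):
        for j in range(i + 1, len(ids)):
            if intersects(ids[i], ids[j]):
                union(ids[i], ids[j])

    groups = {}
    for r in ids:
        groups.setdefault(find(r), set()).add(r)
    return list(groups.values())
```

Note how a chain 0–1, 1–2 places regions 0 and 2 in the same group even though they never intersect directly, which mirrors the "intersect the same region" branch of the claim.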
- The image processing method according to any one of claims 1-5, characterized in that filling the depth values of the target image area corresponding to each three-dimensional point set according to the spatial coordinates of the three-dimensional points in each set comprises, for each three-dimensional point set:
determining a fitting plane according to the spatial coordinates of the three-dimensional points in the set; and
filling the depth values of the target image area corresponding to the set according to the fitting plane.
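The fitting plane of this claim can be obtained, for example, by a least-squares fit of z = a·x + b·y + c over the set's points; missing depths are then read off the plane. The least-squares formulation is one common choice of this sketch — the claim itself does not fix the fitting method.

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane z = a*x + b*y + c through a set of 3-D points."""
    pts = np.asarray(points, dtype=float)
    # Design matrix [x, y, 1]; solve for (a, b, c) minimizing |A w - z|^2.
    A = np.c_[pts[:, 0], pts[:, 1], np.ones(len(pts))]
    coeffs, *_ = np.linalg.lstsq(A, pts[:, 2], rcond=None)
    return coeffs  # (a, b, c)

def plane_depth(coeffs, x, y):
    """Depth value implied by the fitted plane at planar position (x, y)."""
    a, b, c = coeffs
    return a * x + b * y + c
```

For a water surface (the example of claims 10 and 20) the fitted plane is close to horizontal, so the filled depths vary smoothly across the hole.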
- The image processing method according to any one of claims 1-6, characterized in that determining the edge pixels of each target image area in the plurality of depth maps comprises:
acquiring semantic images corresponding to the plurality of depth maps, wherein each semantic image is segmented into multiple image regions, each image region corresponding to one class of target object in three-dimensional space; and
determining the edge pixels of each target image area in the plurality of depth maps based on the semantic images.
- The image processing method according to claim 7, characterized in that the semantic image is obtained based on the RGB image corresponding to the depth map and a pre-trained computation model.
- The image processing method according to claim 7 or 8, characterized in that determining the edge pixels of each target image area in the plurality of depth maps based on the semantic image comprises:
determining the pixels of each target image area from the depth map based on the pixel correspondence between the semantic image and the depth map; and
determining the edge pixels of the target image area according to whether all of the neighboring pixels around a given pixel belong to the target image area.
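The neighbour test of this claim amounts to: a pixel of the target area is an edge pixel if at least one of its surrounding neighbours falls outside the area. A sketch over a boolean area mask, using 4-connectivity (the choice of 4- versus 8-neighbourhood is an assumption of this example):

```python
import numpy as np

def edge_pixels(mask):
    """Return (row, col) edge pixels of a boolean target-area mask.

    A mask pixel is an edge pixel when at least one of its 4-neighbours
    lies outside the target area (or outside the image bounds).
    """
    h, w = mask.shape
    edges = []
    for r in range(h):
        for c in range(w):
            if not mask[r, c]:
                continue
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if not (0 <= nr < h and 0 <= nc < w) or not mask[nr, nc]:
                    edges.append((r, c))
                    break
    return edges
```

The mask itself would come from transferring the semantic image's target-class labels onto the depth map via the pixel correspondence described in the claim.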
- The image processing method according to any one of claims 1-9, characterized in that the target image area is an image area of a water region, and the target object is a water region.
- An image processing device, characterized in that the device comprises a processor, a memory, and a computer program stored on the memory, the processor executing the computer program to implement the following steps:
acquiring a plurality of depth maps, each depth map including one or more target image areas, the target image areas satisfying a preset type condition;
determining the edge pixels of each target image area in the plurality of depth maps;
projecting the edge pixels of each target image area into three-dimensional space to determine at least one three-dimensional point set, the three-dimensional points contained in each three-dimensional point set corresponding to the same target object in three-dimensional space; and
filling the depth values of the target image area corresponding to each three-dimensional point set according to the spatial coordinates of the three-dimensional points in that set.
- The image processing device according to claim 11, characterized in that, when projecting the edge pixels of each target image area into three-dimensional space to determine at least one three-dimensional point set, the processor is configured to:
project the edge pixels of each target image area into three-dimensional space to obtain the three-dimensional points corresponding to those edge pixels;
determine, according to the spatial coordinates of the three-dimensional points corresponding to the edge pixels of each target image area, the three-dimensional points that correspond to the same target object in three-dimensional space; and
place the three-dimensional points corresponding to the same target object in the same three-dimensional point set.
- The image processing device according to claim 12, characterized in that, when determining, according to the spatial coordinates of the three-dimensional points corresponding to the edge pixels of each target image area, the three-dimensional points that correspond to the same target object in three-dimensional space, the processor is configured to:
determine, according to the spatial coordinates of the three-dimensional points corresponding to the edge pixels of each target image area, the intersection relationships of the three-dimensional space regions to which the target image areas correspond in three-dimensional space; and
determine the three-dimensional points corresponding to the same target object in three-dimensional space based on the intersection relationships of the three-dimensional space regions.
- The image processing device according to claim 13, characterized in that, when determining the intersection relationships of the three-dimensional space regions corresponding to the target image areas according to the spatial coordinates of the three-dimensional points corresponding to the edge pixels of each target image area, the processor is configured to:
project the three-dimensional points corresponding to the edge pixels of each target image area onto the same plane to determine the planar region corresponding to each target image area; and
determine the intersection relationships of the three-dimensional space regions corresponding to the target image areas according to the intersection relationships of the corresponding planar regions.
- The image processing device according to claim 13, characterized in that, when determining the three-dimensional points corresponding to the same target object in three-dimensional space based on the intersection relationships of the three-dimensional space regions, the processor is configured to:
if the three-dimensional space regions corresponding to two target image areas intersect, determine that the three-dimensional points corresponding to the edge pixels of the two target image areas are three-dimensional points corresponding to the same target object in three-dimensional space; or
if the three-dimensional space regions corresponding to two target image areas each intersect the same three-dimensional space region, determine that the three-dimensional points corresponding to the edge pixels of the two target image areas are three-dimensional points corresponding to the same target object in three-dimensional space.
- The image processing device according to any one of claims 11-15, characterized in that, when filling the depth values of the target image area corresponding to each three-dimensional point set according to the spatial coordinates of the three-dimensional points in each set, the processor is configured to, for each three-dimensional point set:
determine a fitting plane according to the spatial coordinates of the three-dimensional points in the set; and
fill the depth values of the target image area corresponding to the set according to the fitting plane.
- The image processing device according to any one of claims 11-16, characterized in that, when determining the edge pixels of each target image area in the plurality of depth maps, the processor is configured to:
acquire semantic images corresponding to the plurality of depth maps, wherein each semantic image is segmented into multiple image regions, each image region corresponding to one class of target object in three-dimensional space; and
determine the edge pixels of each target image area in the plurality of depth maps based on the semantic images.
- The image processing device according to claim 17, characterized in that the semantic image is obtained based on the RGB image corresponding to the depth map and a pre-trained computation model.
- The image processing device according to claim 17 or 18, characterized in that, when determining the edge pixels of each target image area in the plurality of depth maps based on the semantic image, the processor is configured to:
determine the pixels of each target image area from the depth map based on the pixel correspondence between the semantic image and the depth map; and
determine the edge pixels of the target image area according to whether all of the neighboring pixels around a given pixel belong to the target image area.
- The image processing device according to any one of claims 11-19, characterized in that the target image area is an image area of a water region, and the target object is a water region.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2019/122090 WO2021102948A1 (en) | 2019-11-29 | 2019-11-29 | Image processing method and device |
CN201980049930.7A CN112513929A (en) | 2019-11-29 | 2019-11-29 | Image processing method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2019/122090 WO2021102948A1 (en) | 2019-11-29 | 2019-11-29 | Image processing method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021102948A1 true WO2021102948A1 (en) | 2021-06-03 |
Family
ID=74923731
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2019/122090 WO2021102948A1 (en) | 2019-11-29 | 2019-11-29 | Image processing method and device |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN112513929A (en) |
WO (1) | WO2021102948A1 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115439543B (en) * | 2022-09-02 | 2023-11-10 | 北京百度网讯科技有限公司 | Method for determining hole position and method for generating three-dimensional model in meta universe |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130106849A1 (en) * | 2011-11-01 | 2013-05-02 | Samsung Electronics Co., Ltd. | Image processing apparatus and method |
CN103379350A (en) * | 2012-04-28 | 2013-10-30 | 中国科学院深圳先进技术研究院 | Virtual viewpoint image post-processing method |
CN103905813A (en) * | 2014-04-15 | 2014-07-02 | 福州大学 | DIBR hole filling method based on background extraction and partition recovery |
CN103945206A (en) * | 2014-04-22 | 2014-07-23 | 冠捷显示科技(厦门)有限公司 | Three-dimensional picture synthesis system based on comparison between similar frames |
CN104159093A (en) * | 2014-08-29 | 2014-11-19 | 杭州道玄影视科技有限公司 | Time-domain-consistent cavity region repairing method for static scene video shot in motion |
CN104780355A (en) * | 2015-03-31 | 2015-07-15 | 浙江大学 | Depth-based cavity repairing method in viewpoint synthesis |
CN109064542A (en) * | 2018-06-06 | 2018-12-21 | 链家网(北京)科技有限公司 | Threedimensional model surface hole complementing method and device |
CN110223383A (en) * | 2019-06-17 | 2019-09-10 | 重庆大学 | A kind of plant three-dimensional reconstruction method and system based on depth map repairing |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120188234A1 (en) * | 2011-01-20 | 2012-07-26 | University Of Southern California | Image processing apparatus and method |
CN104751508B (en) * | 2015-03-14 | 2017-07-14 | 杭州道玄影视科技有限公司 | The full-automatic of new view is quickly generated and complementing method in the making of 3D three-dimensional films |
CN105374019B (en) * | 2015-09-30 | 2018-06-19 | 华为技术有限公司 | A kind of more depth map fusion methods and device |
CN106791773B (en) * | 2016-12-30 | 2018-06-01 | 浙江工业大学 | A kind of novel view synthesis method based on depth image |
CN107622244B (en) * | 2017-09-25 | 2020-08-28 | 华中科技大学 | Indoor scene fine analysis method based on depth map |
2019
- 2019-11-29: CN application CN201980049930.7A, published as CN112513929A (status: active, Pending)
- 2019-11-29: WO application PCT/CN2019/122090, published as WO2021102948A1 (status: active, Application Filing)
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113808251A (en) * | 2021-08-09 | 2021-12-17 | 杭州易现先进科技有限公司 | Dense reconstruction method, system, device and medium based on semantic segmentation |
CN113808251B (en) * | 2021-08-09 | 2024-04-12 | 杭州易现先进科技有限公司 | Dense reconstruction method, system, device and medium based on semantic segmentation |
CN113837943A (en) * | 2021-09-28 | 2021-12-24 | 广州极飞科技股份有限公司 | Image processing method, apparatus, electronic device and readable storage medium |
CN114373008A (en) * | 2022-01-11 | 2022-04-19 | 国网新疆电力有限公司电力科学研究院 | Method and device for measuring creepage distance of disc-shaped insulators |
Also Published As
Publication number | Publication date |
---|---|
CN112513929A (en) | 2021-03-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021102948A1 (en) | Image processing method and device | |
CN106033621B (en) | A kind of method and device of three-dimensional modeling | |
CN104820991B (en) | A kind of multiple soft-constraint solid matching method based on cost matrix | |
CN112489099B (en) | Point cloud registration method and device, storage medium and electronic equipment | |
US20150138193A1 (en) | Method and device for panorama-based inter-viewpoint walkthrough, and machine readable medium | |
CN103761397A (en) | Three-dimensional model slice for surface exposure additive forming and projection plane generating method | |
CN111369435B (en) | Color image depth up-sampling method and system based on self-adaptive stable model | |
CN110728707A (en) | Multi-view depth prediction method based on asymmetric depth convolution neural network | |
CN104778869A (en) | Immediately updated three-dimensional visualized teaching system and establishing method thereof | |
CN103051915A (en) | Manufacture method and manufacture device for interactive three-dimensional video key frame | |
CN107992588B (en) | Terrain display system based on elevation tile data | |
CN116778288A (en) | A multi-modal fusion target detection system and method | |
WO2022126921A1 (en) | Panoramic picture detection method and device, terminal, and storage medium | |
CN114387386A (en) | Rapid modeling method and system based on three-dimensional lattice rendering | |
CN114565722A (en) | Three-dimensional model monomer realization method | |
CN117237546A (en) | Three-dimensional profile reconstruction method and system for material-adding component based on light field imaging | |
CN111273877A (en) | Linkage display platform and linkage method for live-action three-dimensional data and two-dimensional grid picture | |
EP3906530B1 (en) | Method for 3d reconstruction of an object | |
CN106327576A (en) | Urban scene reconstruction method and system | |
CN103955886A (en) | 2D-3D image conversion method based on graph theory and vanishing point detection | |
CN117456076B (en) | Material map generation method and related equipment | |
CN118015197A (en) | A method, device and electronic device for real-scene three-dimensional logic monomerization | |
CN116152458B (en) | Three-dimensional simulation building generation method based on images | |
CN111738061A (en) | Binocular Vision Stereo Matching Method and Storage Medium Based on Region Feature Extraction | |
CN115496908B (en) | A method and system for automatically stratifying high-rise building oblique photography models |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 19954572; Country of ref document: EP; Kind code of ref document: A1 |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 19954572; Country of ref document: EP; Kind code of ref document: A1 |