CN111260564A - Image processing method and device and computer storage medium
- Publication number
- CN111260564A (application number CN201811458101.8A)
- Authority
- CN
- China
- Prior art keywords
- building
- feature data
- sampling
- probability map
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/73—Deblurring; Sharpening
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10032—Satellite or aerial image; Remote sensing
Abstract
The application discloses an image processing method and device. The method comprises the following steps: acquiring a building probability map of an image to be processed, wherein the building probability map comprises probability information that a plurality of pixel points in the image to be processed belong to a building area; sharpening the building probability map to obtain a sharpened building probability map; and obtaining building boundary information of the image to be processed based on the sharpened building probability map. A corresponding apparatus is also disclosed. Sharpening the building probability map makes building areas and non-building areas easier to distinguish, thereby improving the precision of building extraction.
Description
Technical Field
The present application relates to the field of image processing technologies, and in particular, to an image processing method and apparatus, and a computer storage medium.
Background
With the growing number of satellites offering high spatial, temporal, and spectral resolution, remote-sensing data from these satellites are widely applied across many fields, greatly improving the efficiency of information acquisition and promoting industrial development. Detecting specified objects in remote-sensing data has long been an application hotspot in these fields.
Building extraction in remote sensing images is an important problem in remote sensing image processing. The extraction of buildings in the remote sensing image has very wide application, such as the change detection of buildings for predicting the urban development, the detection of illegal buildings, the analysis of buildings and the like.
However, remote-sensing images usually contain a very large number of pixels, while resolution limits make individual objects small on the image, sometimes only a few pixels, so that even visual interpretation requires integrating various kinds of information over a long time. In real environments, objects of the same type take different forms, which poses difficulties for traditional automatic detection techniques based on prior features and makes them hard to apply in practical scenarios.
Disclosure of Invention
The application provides an image processing technique.
In a first aspect, an image processing method is provided, including: acquiring a building probability map of an image to be processed, wherein the building probability map comprises probability information that a plurality of pixel points in the image to be processed belong to a building area; sharpening the building probability map to obtain a sharpened building probability map; and obtaining the building boundary information of the image to be processed based on the sharpened building probability map.
In one possible implementation manner, the obtaining of the building probability map of the image to be processed includes: step-by-step down-sampling processing is carried out on the image to be processed through a plurality of down-sampling modules to obtain first characteristic data; performing convolution processing on the first characteristic data to obtain second characteristic data; performing progressive upsampling processing on the second feature data through a plurality of upsampling modules to obtain third feature data; and obtaining the building probability map based on the third characteristic data.
In another possible implementation manner, the step-by-step downsampling of the image to be processed by the plurality of downsampling modules to obtain the first feature data includes: performing pooling on input information of a first down-sampling module of the plurality of down-sampling modules to obtain first down-sampling feature data, wherein the input information of the first down-sampling module is the image to be processed or feature data output by the down-sampling module preceding the first down-sampling module; performing feature extraction processing on the first downsampling feature data to obtain second downsampling feature data; and performing fusion processing on the first downsampling feature data and the second downsampling feature data to obtain feature data output by the first downsampling module.
In another possible implementation manner, the performing the feature extraction process on the first downsampled feature data to obtain second downsampled feature data includes: performing feature extraction processing on the first downsampling feature data to obtain first intermediate feature data; fusing the first downsampling feature data and the first intermediate feature data to obtain second intermediate feature data; and performing feature extraction processing on the second intermediate feature data to obtain second downsampling feature data.
In another possible implementation manner, the performing convolution processing on the first feature data to obtain second feature data includes: carrying out reduction processing on the first characteristic data to obtain fourth characteristic data; performing feature extraction processing on the fourth feature data to obtain fifth feature data; and amplifying the fifth characteristic data to obtain the second characteristic data.
In another possible implementation manner, the step-by-step upsampling the second feature data by a plurality of upsampling modules to obtain third feature data includes: the feature data input by a first up-sampling module and the feature data output by the down-sampling module corresponding to the first up-sampling module are subjected to fusion processing to obtain first up-sampling feature data, wherein the plurality of up-sampling modules comprise the first up-sampling module, and the feature data output by the down-sampling module corresponding to the first up-sampling module and the feature data output by the first up-sampling module have the same dimensionality; performing feature extraction processing on the first up-sampling feature data to obtain second up-sampling feature data; and amplifying the second up-sampling feature data to obtain feature data output by the first up-sampling module.
In another possible implementation manner, the sharpening the building probability map to obtain a sharpened building probability map includes: filtering the building probability map to obtain a filtering probability map; overlapping the filtering probability map and the building probability map to obtain an overlapping probability map; and carrying out binarization processing on the superimposed probability map to obtain the sharpened building probability map.
In another possible implementation manner, the obtaining, based on the sharpened building probability map, of the building boundary information of the image to be processed includes: determining a connected region based on probability values of a plurality of pixel points included in the sharpened building probability map; performing edge correction processing on the original boundary of the connected region to obtain a corrected boundary of the connected region; and performing acute-angle correction processing on the corrected boundary of the connected region to obtain the building boundary information.
In yet another possible implementation manner, the method further includes: storing information of the building boundary.
In a second aspect, there is provided an image processing apparatus comprising: the system comprises an acquisition module, a storage module and a processing module, wherein the acquisition module is used for acquiring a building probability map of an image to be processed, and the building probability map comprises probability information that a plurality of pixel points in the image to be processed belong to a building area; the sharpening module is used for sharpening the building probability map to obtain a sharpened building probability map; and the processing module is used for obtaining the building boundary information of the image to be processed based on the sharpened building probability map.
In one possible implementation manner, the obtaining module includes: the down-sampling sub-module is used for performing down-sampling processing on the image to be processed step by step to obtain first characteristic data; the convolution submodule is used for performing convolution processing on the first characteristic data to obtain second characteristic data; the up-sampling sub-module is used for performing up-sampling processing on the second characteristic data step by step to obtain third characteristic data; and the processing submodule is used for obtaining the building probability map based on the third characteristic data.
In another possible implementation manner, the downsampling sub-module is configured to: pool input information of a first down-sampling sub-module of the plurality of down-sampling sub-modules to obtain first down-sampling feature data, wherein the input information of the first down-sampling sub-module is the image to be processed or feature data output by the down-sampling sub-module preceding the first down-sampling sub-module; perform feature extraction processing on the first downsampling feature data to obtain second downsampling feature data; and perform fusion processing on the first downsampling feature data and the second downsampling feature data to obtain feature data output by the first downsampling sub-module.
In yet another possible implementation manner, the down-sampling sub-module is further configured to: perform feature extraction processing on the first downsampling feature data to obtain first intermediate feature data; fuse the first downsampling feature data and the first intermediate feature data to obtain second intermediate feature data; and perform feature extraction processing on the second intermediate feature data to obtain the second downsampling feature data.
In yet another possible implementation manner, the convolution sub-module is configured to: carrying out reduction processing on the first characteristic data to obtain fourth characteristic data; performing feature extraction processing on the fourth feature data to obtain fifth feature data; and amplifying the fifth characteristic data to obtain the second characteristic data.
In yet another possible implementation manner, the upsampling sub-module is configured to: performing fusion processing on feature data input by a first up-sampling sub-module and feature data output by a down-sampling sub-module corresponding to the first up-sampling sub-module to obtain first up-sampling feature data, wherein the plurality of up-sampling sub-modules include the first up-sampling sub-module, and the feature data output by the down-sampling sub-module corresponding to the first up-sampling sub-module and the feature data output by the first up-sampling sub-module have the same dimensionality; performing feature extraction processing on the first up-sampling feature data to obtain second up-sampling feature data; and amplifying the second up-sampling feature data to obtain feature data output by the first up-sampling sub-module.
In yet another possible implementation manner, the sharpening module includes: the filtering submodule is used for filtering the building probability map to obtain a filtering probability map; the superposition submodule is used for carrying out superposition processing on the filtering probability map and the building probability map to obtain a superposition probability map; and the binarization submodule is used for carrying out binarization processing on the superposition probability map to obtain the sharpened building probability map.
In yet another possible implementation manner, the processing module includes: a determining submodule, configured to determine a connected region based on probability values of a plurality of pixel points included in the sharpened building probability map; a boundary correction submodule, configured to perform edge correction processing on the original boundary of the connected region to obtain a corrected boundary of the connected region; and an angle correction submodule, configured to perform acute-angle correction processing on the corrected boundary of the connected region to obtain the building boundary information.
In yet another possible implementation manner, the apparatus further includes: and the storage module is used for storing the information of the building boundary.
In a third aspect, an image processing apparatus is provided, comprising a processor and a memory; the memory is configured to store computer instructions, and the processor is configured to invoke the computer instructions stored in the memory to perform the method of the first aspect or any possible implementation thereof.
Optionally, the apparatus may further comprise an input/output interface for supporting communication between the apparatus and other apparatuses.
In a fourth aspect, there is provided a computer-readable storage medium having stored therein instructions, which, when run on a computer, cause the computer to perform the method of the first aspect or any possible implementation thereof.
In a fifth aspect, there is provided a computer program product containing instructions which, when run on a computer, cause the computer to perform the method of the first aspect or any possible implementation thereof.
According to the embodiments of the application, sharpening the building probability map makes building areas and non-building areas easier to distinguish, thereby improving the precision of building extraction.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments or the background art of the present application, the drawings required to be used in the embodiments or the background art of the present application will be described below.
Fig. 1 is a schematic flowchart of an image processing method according to an embodiment of the present application;
fig. 2 is another schematic flow chart of an image processing method according to an embodiment of the present disclosure;
fig. 3 is a schematic structural diagram of a building extraction network provided in an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a feature fusion layer provided in an embodiment of the present application;
fig. 5 is another schematic flow chart of an image processing method according to an embodiment of the present application;
fig. 6 is another schematic flowchart of an image processing method according to an embodiment of the present application;
FIG. 7 is a schematic diagram of an angle correction algorithm provided in an embodiment of the present application;
fig. 8 is a comparison diagram of a to-be-processed image and a building vector diagram provided in an embodiment of the present application;
fig. 9 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present application;
fig. 10 is a schematic diagram of a hardware structure of an image processing apparatus according to an embodiment of the present application.
Detailed Description
In order to make the technical solutions of the present application better understood, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The terms "first," "second," and the like in the description and claims of the present application and in the above-described drawings are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
The embodiments of the present application are described below with reference to the drawings in the embodiments of the present application, and it should be understood that the embodiments of the present application are mainly applied to processing of remote sensing images, but may also be applied to processing of other types of images including buildings, and the embodiments of the present application are not limited thereto.
Referring to fig. 1, fig. 1 is a schematic flowchart of an image processing method according to an embodiment of the present disclosure.
101. Acquire a building probability map of the image to be processed.
In the embodiment of the present application, the building probability map of the image to be processed may be acquired in various ways. In some possible implementations, the building probability map is obtained from another device, for example, by receiving a building probability map sent by a terminal device. In other possible implementations, the building probability map is obtained by processing the image to be processed, for example, by performing feature extraction and prediction on it. In an optional example, the image to be processed is down-sampled to obtain down-sampled feature data; the down-sampled feature data are then up-sampled to further enrich their semantic information, and the processed feature data are enlarged to a target size to obtain up-sampled feature data; finally, the pixel points in the image to be processed are predicted from the up-sampled feature data to obtain the probability that each pixel point belongs to a building area, i.e., the building probability map.
102. Sharpen the building probability map to obtain a sharpened building probability map.
In some possible implementation manners, the probability values in the building probability map range between 0 and 1: the closer a value is to 1, the more likely the pixel belongs to a building area, and the closer it is to 0, the more likely it belongs to a non-building area. Because the probability values change continuously across the image, pixels with intermediate values may belong to either a building area or a non-building area, forming a fuzzy region that reduces the accuracy of extracting building areas from the image to be processed. The present application therefore sharpens the building probability map so that the boundary between building areas and non-building areas becomes clearer, improving the accuracy of building extraction.
At 102, the building probability map may be sharpened in a variety of ways. In some possible implementations, the probability value in the building probability map is binarized to obtain a sharpened building probability map, for example, the building probability map is binarized with a specific value as a boundary. In one example, bounded by 0.5, a probability value greater than or equal to 0.5 is adjusted to 1, and a probability value less than 0.5 is adjusted to 0, but the embodiment of the present application is not limited thereto.
103. Obtain the building boundary information of the image to be processed based on the sharpened building probability map.
Specifically, a connected region corresponding to a building area in the image to be processed may be obtained based on the sharpened building probability map, and a building boundary may be obtained based on the connected region. In some possible implementations, after the connected region is obtained, its boundary may be further corrected, and the building boundary is obtained based on the corrected boundary. For example, building boundary information of the image to be processed, including coordinate information of the building boundary, is obtained by correcting the vertices of the connected region with the Douglas-Peucker algorithm and/or performing acute-angle correction processing on the boundary.
In the embodiment of the application, optionally, a building area is extracted from the image to be processed through a neural network, and a geometric shape of a target rule is constructed, so that automatic labeling of a building on a map is realized.
Referring to fig. 2, fig. 2 is a flowchart illustrating a possible implementation manner of step 101 in an image processing method according to an embodiment of the present disclosure. In the example shown in fig. 2, the image to be processed is processed by a neural network to obtain its building probability map. For convenience of description, the neural network is hereinafter referred to as a dense building network (DSN); processing the image to be processed through the DSN yields its building probability map. Fig. 3 shows an example of a DSN structure in the embodiment of the present application, which specifically includes: a plurality of down-sampling modules, a down-sampling (TD) layer, a dense block (DB) layer, an up-sampling (TU) layer, and a plurality of up-sampling modules, wherein the plurality of down-sampling modules comprise a first, a second, and a third down-sampling module, and the plurality of up-sampling modules comprise a first, a second, and a third up-sampling module. It should be understood that the DSN structure shown in fig. 3 is only illustrative; optionally, the DSN may omit some modules in fig. 3, such as the TD layer or the DB layer, and may include other modules or take other structures, which is not limited in this disclosure.
201. Perform down-sampling processing on the image to be processed step by step through a plurality of down-sampling modules to obtain first feature data.
Optionally, before the image to be processed is input to the neural network, the image to be processed may be preprocessed, and the preprocessed image to be processed is input to the neural network for prediction, so as to obtain the building probability map. In some possible implementations, the pre-processing includes scaling processing, for example, the input image size of the neural network is fixed to 513 × 513, at which time, for an image to be processed with a size greater than 513 × 513, the size of the image to be processed may be reduced to 513 × 513, and for an image to be processed with a size less than 513 × 513, the size of the image to be processed may be enlarged to 513 × 513. In other possible implementations, the preprocessing includes one or more of brightness adjustment, color dithering, rotation, clipping, and the like, which is not limited in this application.
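As a minimal illustration of the scaling preprocessing just described, the sketch below resizes an arbitrary input to the fixed 513 × 513 network input size; the use of OpenCV and bilinear interpolation here is an assumption, not something specified by the application.

```python
import cv2
import numpy as np

def preprocess(image: np.ndarray, target: int = 513) -> np.ndarray:
    """Resize an image to the fixed network input size (513 x 513 in the
    text). Bilinear interpolation is an assumed choice."""
    return cv2.resize(image, (target, target), interpolation=cv2.INTER_LINEAR)
```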
In 201, a down-sampling module in the DSN performs down-sampling processing step by step on an image to be processed, and extracts feature information in the image to be processed to obtain first feature data.
As shown in fig. 3, in one possible implementation of the down-sampling module, the down-sampling module includes: TD layer, DB layer and characteristic fusion layer.
The downsampling layer may be implemented in various ways, such as convolution or pooling, which is not limited in this application. In some possible implementations, the down-sampling layer includes a pooling layer: pooling reduces the resolution of the image to be processed to a specific size and reduces the number of sampling points, so that the features subsequently extracted from the image have a smaller dimension, greatly reducing the computation of subsequent processing. The pooling may be average pooling or maximum pooling. In one embodiment, assuming the size of the image to be processed is H × W and the desired target size is h × w, the image to be processed may be divided into an h × w grid so that each cell has size (H/h) × (W/w); the average value or the maximum value of the pixels in each cell is then computed, yielding the image at the target size. In this way, the down-sampling layer reduces the input information of the down-sampling module to obtain first down-sampling feature data, for example, a first down-sampling feature map.
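A minimal NumPy sketch of the grid pooling just described, assuming H and W are divisible by h and w; the function name and that divisibility assumption are illustrative.

```python
import numpy as np

def grid_pool(x: np.ndarray, h: int, w: int, mode: str = "mean") -> np.ndarray:
    """Divide an H x W map into an h x w grid of (H/h) x (W/w) cells and
    reduce each cell by its mean or max, as described above."""
    H, W = x.shape
    cells = x.reshape(h, H // h, w, W // w)
    return cells.mean(axis=(1, 3)) if mode == "mean" else cells.max(axis=(1, 3))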
The output end of the down-sampling layer is connected to the input end of the DB layer, which performs feature extraction on the feature map output by the down-sampling layer. In the example shown in fig. 4, the DB layer includes 2 convolutional layers and 2 connection layers; the first downsampled feature data is convolved by the convolutional layers to further extract semantic information. In some possible implementations, the convolution operation slides a convolution kernel over the first downsampling feature map, multiplies the values of the feature map by the corresponding values of the kernel, and sums the products to replace the original value in the feature map, thereby extracting features from the first downsampling feature map. In the example shown in fig. 4, the features extracted by each convolutional layer are concatenated with that layer's input features, and the concatenated features serve as the input of the next convolutional layer. This iterative feature extraction and connection maximizes the fusion of shallow and deep features, takes both shallow detail and high-level abstract features into account to the greatest extent, and improves prediction accuracy.
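The following PyTorch sketch shows the connect-then-convolve pattern of the DB layer with 2 convolutional layers and 2 concatenation steps; the channel counts, kernel size, and exact BN/ReLU placement are assumptions rather than the application's fixed design.

```python
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    """Sketch of the DB layer described above: each convolution's output is
    concatenated with its input and fed to the next convolution."""
    def __init__(self, in_ch: int, growth: int = 16):
        super().__init__()
        self.conv1 = nn.Sequential(
            nn.BatchNorm2d(in_ch), nn.ReLU(inplace=True),
            nn.Conv2d(in_ch, growth, kernel_size=3, padding=1))
        self.conv2 = nn.Sequential(
            nn.BatchNorm2d(in_ch + growth), nn.ReLU(inplace=True),
            nn.Conv2d(in_ch + growth, growth, kernel_size=3, padding=1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        f1 = self.conv1(x)
        x1 = torch.cat([x, f1], dim=1)      # connect extracted and input features
        f2 = self.conv2(x1)
        return torch.cat([x1, f2], dim=1)   # fused features for the next stage
```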
Optionally, a batch normalization (BN) layer and an activation layer (not shown in the figure), such as a rectified linear unit (ReLU) layer, may be further connected between the two convolutional layers. Specifically, when extracting features, the data distribution changes after each layer of the network processes the data, which makes extraction by the next layer more difficult. Therefore, normalization is applied to the feature data output by the convolutional layers: if the output of each layer is normalized to a distribution with mean 0 and variance 1, the learned feature distribution is normalized. The BN layer normalizes the data with trainable parameters, which accelerates training, decorrelates the data, and highlights distribution differences between feature data. In one example, the operation of the BN layer is as follows:
assume that the input data is β ═ x1→mM data in total, the output is yiBN (x), the BN layer will perform the following operations on the data:
Finally, based on the scaling variable γ and the translation variable β, a batch normalization result is obtained, i.e., the result is
Through the processing of the BN layer and the ReLU activation layer, the current feature space is mapped into another space, the nonlinearity of the feature data is increased, and the feature data can be classified better.
Feature extraction processing is then performed on the first downsampling feature data through the DB layer to obtain second downsampling feature data. Finally, the second downsampling feature data and the first downsampling feature data are fused, further enriching the semantic information in the feature data.
It should be understood that, in the example shown in fig. 3, the three downsampling modules have the same structure, in practical applications, the number of downsampling modules and the specific implementation of the downsampling modules may be adjusted according to requirements, and some or all of the downsampling modules in the plurality of downsampling modules may have different structures, which is not limited in this disclosure.
The first down-sampling module receives an input image to be processed or a preprocessed image to be processed or a part of image blocks in the image to be processed, performs first-stage down-sampling processing on input information to obtain characteristic data, and outputs the obtained characteristic data. The characteristic data output by the first down-sampling module is used as the input of a second down-sampling module, the second down-sampling module receives the characteristic data output by the first down-sampling module, carries out second-stage down-sampling processing on the input characteristic data to obtain characteristic data, and outputs the obtained characteristic data. The feature data output by the second down-sampling module is used as the input of a third down-sampling module, the third down-sampling module receives the feature data output by the second down-sampling module, carries out third-stage down-sampling processing on the input feature data to obtain first feature data, and outputs the first feature data. The plurality of down-sampling modules abstract the features in the image to be processed step by step to finally obtain first feature data.
202. Perform convolution processing on the first feature data to obtain second feature data.
As shown in fig. 3, the third down-sampling module is connected to the TD layer, the TD layer performs reduction processing on the first feature data to obtain fourth feature data, the DB layer performs feature extraction processing on the fourth feature data to obtain fifth feature data, and the TU layer performs amplification processing on the fifth feature data to obtain the second feature data. The specific composition of the TD layer and the DB layer and the processing procedure for the feature data are detailed in 201, and the specific processing procedure for the fifth feature data by the TU layer is detailed in 203.
203. Perform progressive upsampling processing on the second feature data through a plurality of upsampling modules to obtain third feature data.
In the example shown in fig. 3, the plurality of upsampling modules comprises a first up-sampling module, a second up-sampling module, and a third up-sampling module. In one possible implementation, an up-sampling module includes a feature fusion layer, a DB layer, and a TU layer.
The up-sampling is the inverse process of down-sampling, and the features are gradually amplified through an up-sampling layer, so that target feature data are finally obtained.
When the down-sampling modules extract features, some relatively minor feature data may be discarded, yet such detail is still retained in the shallow features of the image. Therefore, during up-sampling, the down-sampling feature data obtained by a down-sampling module can be fused with up-sampling feature data of the same size to obtain richer feature information, making the final probability values for the building area more accurate. First, the feature data input to the first up-sampling module and the feature data output by its corresponding down-sampling module (i.e., the third down-sampling module) are fused to obtain first up-sampling feature data. In the course of progressive up-sampling, the down-sampling feature data output by the second down-sampling module is fused with the up-sampling feature data output by the first up-sampling module, and the down-sampling feature data output by the first down-sampling module is fused with the up-sampling feature data output by the second up-sampling module.
The DB layer performs feature extraction processing on the first up-sampling feature data to obtain second up-sampling feature data.
The TU layer connected behind the DB layer performs an amplification process (e.g., bilinear interpolation) on the feature data, and amplifies the feature data step by step. In this way, the second up-sampling feature data is amplified through the TU layer, and feature data output by the up-sampling module is obtained. Optionally, the TU layer may also be implemented in other ways, which is not limited in this disclosure.
The second feature data are up-sampled step by step through the plurality of up-sampling modules, and the third feature data are finally output.
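A sketch of one up-sampling module under the description above: skip fusion with the same-sized down-sampling (encoder) features, DB feature extraction, then TU enlargement. Concatenation as the fusion operation and a 2x bilinear TU step are assumptions.

```python
import torch
import torch.nn.functional as F

def upsample_module(x: torch.Tensor, skip: torch.Tensor, dense_block) -> torch.Tensor:
    """One decoder stage: fuse decoder features with the corresponding
    down-sampling module's output, extract features, then enlarge."""
    fused = torch.cat([x, skip], dim=1)          # feature fusion with encoder output
    feats = dense_block(fused)                   # DB feature extraction
    return F.interpolate(feats, scale_factor=2,  # TU: progressive 2x enlargement
                         mode="bilinear", align_corners=False)
```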
204. Obtain a building probability map based on the third feature data.
The output ends of the plurality of upsampling modules are connected to a classification layer (such as a softmax layer). The built-in softmax function maps the input features to values between 0 and 1 in one-to-one correspondence, so each input feature receives a prediction in the form of a numerical probability.
Optionally, when the DSN is trained, the upsampled feature data output by the third upsampling module may be predicted by the softmax layer. Specifically, the softmax layer predicts the probability that the region belongs to the building region according to the features of different regions in the up-sampled feature data, and finally, each region obtains a probability value. The probability value in the probability map output by the softmax layer is matched with the actual situation by adjusting the parameter of the softmax function, namely, the corresponding probability value in the probability map is close to 1 for the building area, and the corresponding probability value in the probability map is close to 0 for the non-building area.
In practical application, after the third feature data are input to the softmax layer by the third up-sampling module, the softmax layer obtains the probability that the pixel points in the image to be processed belong to the building area based on the third feature data, and outputs a building probability map.
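As a small sketch of this classification step, assuming two output channels (non-building, building) with channel 1 as "building" — a layout the text does not fix:

```python
import torch

def building_probability_map(logits: torch.Tensor) -> torch.Tensor:
    """Apply softmax over the class channel and keep the building channel,
    yielding per-pixel building probabilities in [0, 1]."""
    probs = torch.softmax(logits, dim=1)  # (N, 2, H, W) -> per-class probabilities
    return probs[:, 1]                    # probability of belonging to a building
```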
In this embodiment, feature data are extracted by down-sampling the image to be processed and then enlarged by up-sampling, so that the feature information of the image can be fully extracted while the computation of the whole process is reduced; the classification layer then predicts the content of the image to be processed from the decoded features, obtaining the probability map autonomously and quickly.
Referring to fig. 5, fig. 5 is a flowchart illustrating a possible implementation manner of step 102 in the image processing method according to the embodiment of the present application. In the example shown in fig. 5, the building probability map after sharpening is obtained by processing the building probability map.
501. Filter the building probability map to obtain a filtering probability map.
Because the probability values from building areas to non-building areas in the building probability map usually change gradually from 1 to 0, fuzzy regions often exist where building areas and non-building areas meet, i.e., it is difficult to tell whether a fuzzy region belongs to a building area or a non-building area. Such fuzzy regions clearly affect the accuracy of extracting building areas from the image to be processed, so the building probability map is sharpened to make the boundary between building and non-building areas clearer.
In some possible implementations, the building probability map is filtered with the Laplacian operator to enhance image detail and find image edges. In one embodiment, when the probability value at the center of a neighborhood in the fuzzy region is smaller than the average probability value of the other pixels in that neighborhood, the center pixel's probability value is further reduced; when it is larger than the average of the other pixels in the neighborhood, the center pixel's probability value is further increased, thereby sharpening the image.
Assuming that the building probability map is f(x, y), the Laplacian operator is as follows:

∇²f = ∂²f/∂x² + ∂²f/∂y²   formula (1)
based on the above formula, by using the differential property of fourier transform, a corresponding frequency domain Laplacian (Laplacian) filter can be derived, and the expression of the filter is as follows:
H(u,v)=-4π2(u2+v2) Equation (2)
where u = 0, 1, 2, ..., M − 1 and v = 0, 1, 2, ..., N − 1 are frequency variables, and M and N are the size of the input image f. The corresponding centered filter is as follows:

H(u, v) = −4π²[(u − P/2)² + (v − Q/2)²] = −4π²D²(u, v)   formula (3)
where P and Q are the size of the filter H, with P = 2M and Q = 2N. The building probability map f is filtered using formula (3), and the filtered result is inverse-transformed to obtain the filtering probability map, specifically:

∇²f(x, y) = ℱ⁻¹{H(u, v) · F(u, v)}   formula (4)

where F(u, v) is the Fourier transform of the building probability map f(x, y) and ℱ⁻¹ denotes the inverse Fourier transform.
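A NumPy sketch of formulas (2)-(4), assuming a zero-padded P × Q = 2M × 2N transform and spectrum shifting to apply the centered filter; the trailing normalization is an added assumption to keep the result on a scale comparable to f.

```python
import numpy as np

def laplacian_filter_probability_map(f: np.ndarray) -> np.ndarray:
    """Filter the building probability map with the centered frequency-domain
    Laplacian H(u, v) = -4*pi^2*D^2(u, v) of formula (3), then invert
    (formula (4)) and crop the padding."""
    M, N = f.shape
    P, Q = 2 * M, 2 * N
    F = np.fft.fft2(f, s=(P, Q))                  # zero-padded Fourier transform
    u = np.arange(P).reshape(-1, 1)
    v = np.arange(Q).reshape(1, -1)
    H = -4 * np.pi ** 2 * ((u - P / 2) ** 2 + (v - Q / 2) ** 2)
    G = np.fft.ifftshift(np.fft.fftshift(F) * H)  # apply filter about the center
    g = np.real(np.fft.ifft2(G))[:M, :N]          # inverse transform, crop padding
    return g / max(np.abs(g).max(), 1e-12)        # normalization (an assumption)
```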
Optionally, before the building probability map is sharpened, the building probability map may be smoothed, and accordingly, the smoothed building probability map is sharpened, so as to avoid that the sharpening on noise causes the reduction of the building extraction accuracy.
502. Superimpose the filtering probability map and the building probability map to obtain a superimposed probability map.
The filtering probability map obtained in 501 is superimposed with the building probability map; this distinguishes the boundary between building and non-building areas while retaining the probability values of both. The superposition formula of the filtering probability map and the building probability map is as follows:

g(x, y) = f(x, y) + c · ∇²f(x, y)   formula (5)

where g is the superimposed probability map, f is the building probability map, ∇²f is the filtering probability map of formula (4), and c is a non-zero integer. The filtering in 501 makes the edge contours of the building probability map clearer, so the boundary between building and non-building areas is sharpened while the information of the building areas is retained. The superposition strengthens weak edges and weakens strong edges in the building probability map; by superimposing the filtering probability map on the building probability map, the contrast at probability-value gradients is enhanced while the probability values of building and non-building areas are preserved, so the resulting superimposed probability map looks more natural and is better suited to human observation.
503. Binarize the superimposed probability map to obtain a sharpened building probability map.
Through the processing of 501 and 502, the boundaries between building areas and non-building areas are determined, i.e., the two kinds of area are distinguished. However, the probability values within the superimposed map are still not uniform, and different probability values correspond to different visual brightness: areas with values close to 1 appear brighter and areas with values close to 0 appear darker. The final building label map only needs to mark the building areas in the image to be processed correctly and distinguish them clearly from non-building areas. Therefore, the probability values in the superimposed probability map are binarized with 0.5 as the boundary: all values greater than or equal to 0.5 are set to 1 and all values less than 0.5 are set to 0, giving the sharpened building probability map.
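Combining 501-503 into one pipeline, the sketch below performs the superposition and binarization; c = -1 is an assumed sign choice consistent with the negative-valued filter of formula (3), and it reuses the laplacian_filter_probability_map sketch above.

```python
import numpy as np

def sharpen_probability_map(f: np.ndarray, c: int = -1) -> np.ndarray:
    """Superpose the filtering probability map on the building probability
    map (formula (5)) and binarize at 0.5, per steps 502 and 503."""
    g = f + c * laplacian_filter_probability_map(f)  # superposition, formula (5)
    g = np.clip(g, 0.0, 1.0)                         # keep probabilities in [0, 1]
    return (g >= 0.5).astype(np.uint8)               # >= 0.5 -> 1, < 0.5 -> 0
```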
The embodiment sharpens the boundaries of the building area and the non-building area in the building probability map, so that the building area and the non-building area are more clearly distinguished, and the accuracy of extracting the building area from the image to be processed can be improved.
Referring to fig. 6, fig. 6 is a flowchart illustrating a possible implementation manner of step 103 in the image processing method according to the embodiment of the present application.
601. Determine a connected region based on the probability values of a plurality of pixel points included in the sharpened building probability map.
After 503, the pixel values in the sharpened building probability map are only 0 and 1. Therefore, based on the probability values of the pixel points in the sharpened building probability map, a connected region corresponding to a building area, i.e., a region formed by adjacent pixel points with probability value 1, can be determined, along with the boundary of the connected region, which is a polygon surrounding it. This boundary may also be called the vector layer of the building; it contains boundary information of the connected region corresponding to the building, such as boundary vertex coordinates.
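One possible implementation of step 601, sketched with OpenCV contour extraction; the library choice is an assumption.

```python
import cv2
import numpy as np

def connected_region_boundaries(binary_map: np.ndarray) -> list:
    """Extract the boundary polygon of each connected building region from
    the sharpened 0/1 probability map."""
    contours, _ = cv2.findContours(binary_map.astype(np.uint8),
                                   cv2.RETR_EXTERNAL,        # outer boundaries only
                                   cv2.CHAIN_APPROX_SIMPLE)  # compress straight runs
    return [c.reshape(-1, 2) for c in contours]  # (n, 2) vertex coordinate arrays
```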
602. Perform vertex correction or edge correction on the original boundary of the connected region to obtain a corrected boundary of the connected region.
The original boundary of the connected region is a polygon with many redundant vertices. To make the extracted building boundary straighter and more regular, closer to the effect of manual plotting, the redundant vertices in the original boundary can be deleted.
In one possible implementation, the edge or vertex correction is performed by the Douglas-Peucker algorithm: redundant vertices in the original boundary are deleted to realize edge correction of the original boundary of the building area. In one example implementation, the following steps are performed on the original boundary (a code sketch follows the steps):
(1) connect a straight line AB between the head point A and the tail point B of the polygon;
(2) determine the distances between the points on the original boundary lying between A and B and the straight line AB, and from these distances obtain the point C farthest from segment AB and its distance d to the line;
(3) if the distance d is greater than a preset threshold, split AB at point C into two segments AC and CB, apply steps (1) to (3) to each segment, and iterate until, for every resulting segment, the maximum distance from the points between its endpoints to the segment is less than the preset threshold.
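A recursive sketch of steps (1)-(3); this hand-rolled version is only illustrative, and cv2.approxPolyDP performs the same simplification in practice.

```python
import numpy as np

def douglas_peucker(points: np.ndarray, threshold: float) -> np.ndarray:
    """Simplify a polyline of (n, 2) points: find the interior point C
    farthest from segment AB; if its distance d exceeds the threshold,
    split at C and recurse, otherwise keep only A and B."""
    if len(points) < 3:
        return points
    a, b = points[0], points[-1]
    dx, dy = b - a
    norm = max((dx * dx + dy * dy) ** 0.5, 1e-12)
    # Perpendicular distance of each interior point to line AB.
    d = np.abs(dx * (points[1:-1, 1] - a[1]) - dy * (points[1:-1, 0] - a[0])) / norm
    i = int(np.argmax(d)) + 1                   # index of point C in `points`
    if d[i - 1] > threshold:
        left = douglas_peucker(points[:i + 1], threshold)
        right = douglas_peucker(points[i:], threshold)
        return np.vstack([left[:-1], right])    # merge, dropping duplicate C
    return np.vstack([a, b])                    # all interior points within threshold
```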
Performing edge correction on the building vector layer removes the redundant vertices in the original boundary of the connected region while keeping the topological relation of the building unchanged, so that the extracted building boundary is straighter and more regular, approximating the effect of manual plotting.
603. Perform acute-angle correction processing on the corrected boundary of the connected region to obtain the building boundary information.
To make the extracted building area better fit the shape characteristics of actual buildings, the embodiment of the present application provides an angle correction algorithm. Without changing the topological shape of the building, the algorithm converts all corners in the corrected boundary of the connected region that are close to right angles into right angles, so that the resulting building area better conforms to the actual building shape.
In the example shown in fig. 7, the angle correction algorithm traverses all vertices of the corrected boundary of the connected region and determines the included angle between the current vertex and the two subsequent vertices. In one specific implementation, if the current vertex is P0, the included angle a at P1 formed with the two subsequent vertices P1 and P2 is computed. If a is greater than or equal to 70 degrees and less than or equal to 110 degrees, a perpendicular is dropped from P2 onto the line P0P1, with foot P1', and the value of d(P1', P1)/d(P1, P0) is computed, where d(P1', P1) and d(P1, P0) are the distance between P1' and P1 and the distance between P1 and P0, respectively. If d(P1', P1)/d(P1, P0) is less than 0.3, P1 is replaced by the foot P1', so that the replaced point is not too far from the original point and the topological relation of the building is preserved. After several iterations, corners close to right angles become regular right angles.
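A sketch of the angle correction under the projection reading adopted above (foot P1' of the perpendicular from P2 onto line P0P1); the 70-110 degree window and the 0.3 ratio come from the text, while the closed-polygon indexing is an assumption.

```python
import numpy as np

def correct_near_right_angles(vertices: np.ndarray,
                              lo: float = 70.0, hi: float = 110.0,
                              ratio: float = 0.3) -> np.ndarray:
    """For each near-right angle at P1, replace P1 with the foot P1' of the
    perpendicular from P2 to line P0P1 when the displacement is small,
    making the corner an exact right angle."""
    pts = vertices.astype(float).copy()
    n = len(pts)
    for k in range(n):
        p0, p1, p2 = pts[k], pts[(k + 1) % n], pts[(k + 2) % n]
        v1, v2 = p0 - p1, p2 - p1
        cosang = np.dot(v1, v2) / max(np.linalg.norm(v1) * np.linalg.norm(v2), 1e-12)
        ang = np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0)))
        if lo <= ang <= hi:
            d01 = p1 - p0
            t = np.dot(p2 - p0, d01) / max(np.dot(d01, d01), 1e-12)
            foot = p0 + t * d01                 # P1' on line P0P1
            if np.linalg.norm(foot - p1) < ratio * np.linalg.norm(p1 - p0):
                pts[(k + 1) % n] = foot         # angle at P1' becomes 90 degrees
    return pts
```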
When the correction of all interior angles of the building area is completed, a building vector map is obtained. The building vector map contains the coordinate information of the boundaries of the connected regions (i.e., the building area boundaries).
604. Store the information of the building boundary.
As can be seen from the comparison in fig. 8 between the building vector map and the image to be processed, the boundaries of the building areas in the building vector map (the second line drawing in fig. 8) are straight and regular, approaching the effect of manual plotting, and they match the coordinates and sizes of the actual buildings in the image to be processed. Meanwhile, since the building vector map has no background content, the memory space occupied by the building extraction result is greatly reduced, lowering the requirement on hardware configuration.
The urban planning industry needs to analyze urban development over the last decade from city maps, and the expansion trend of buildings is an important index. The building extraction method of the present application can effectively detect the change trend of a city's buildings in remote-sensing images. In addition, the DSN provided by the application can be used to detect illegal buildings in cities, saving labor cost. The scheme of the present application can also perform building extraction on a monitored area regularly or irregularly: newly added buildings are found by comparing successive extraction results, and all newly added buildings are regarded as potentially illegal. Relevant workers can then conduct on-site verification based on the potentially illegal buildings reported by the scheme, ensuring that illegal buildings are found accurately. By monitoring illegal urban buildings with the present application, the workload of staff can be greatly reduced, and labor and time costs can likewise be greatly reduced.
Owing to rapid national development, the number of buildings in all regions keeps increasing, which makes updating electronic maps very difficult. An electronic map needs its building data updated at intervals, and manual inspection is time-consuming and laborious. Using the scheme of the present application, building vector data are detected and generated automatically from remote-sensing images; detection accuracy is ensured while the workload of annotators and the storage space required by the extraction results are reduced, saving both labor and time costs.
By correcting the boundary of the building area, the present application makes the building area better conform to the actual shape of the building. The building vector map obtained from the boundary-corrected building area greatly reduces the storage size, which brings great convenience for storing the vector map and for subsequent processing based on it.
Referring to fig. 9, fig. 9 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present application. The apparatus 1 includes: an acquisition module 11, a sharpening module 12, a processing module 13, and a storage module 14. Wherein:
the acquiring module 11 is configured to acquire a building probability map of an image to be processed, where the building probability map includes probability information that a plurality of pixel points in the image to be processed belong to a building area;
the sharpening module 12 is configured to sharpen the building probability map to obtain a sharpened building probability map;
and the processing module 13 is configured to obtain building boundary information of the image to be processed based on the sharpened building probability map.
Further, the obtaining module 11 includes: a down-sampling sub-module 111, configured to perform progressive down-sampling processing on the image to be processed to obtain first feature data; a convolution sub-module 112, configured to perform convolution processing on the first feature data to obtain second feature data; an up-sampling sub-module 113, configured to perform progressive up-sampling processing on the second feature data to obtain third feature data; and a processing sub-module 114, configured to obtain the building probability map based on the third feature data.
Further, the down-sampling sub-module 111 is configured to: perform pooling on the input information of a first down-sampling sub-module of the plurality of down-sampling sub-modules to obtain first down-sampling feature data, where the input information of the first down-sampling sub-module is the image to be processed or the feature data output by the down-sampling sub-module preceding the first down-sampling sub-module; perform feature extraction processing on the first down-sampling feature data to obtain second down-sampling feature data; and perform fusion processing on the first down-sampling feature data and the second down-sampling feature data to obtain the feature data output by the first down-sampling sub-module.
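For illustration, a minimal PyTorch sketch of one such down-sampling sub-module follows. The 2×2 max pooling, the channel counts and the use of channel concatenation as the fusion processing are assumptions made for the example, not details taken from the patent.

```python
import torch
import torch.nn as nn

class DownSampleBlock(nn.Module):
    """One down-sampling stage: pool, extract features, fuse."""
    def __init__(self, in_ch: int, growth: int):
        super().__init__()
        self.pool = nn.MaxPool2d(2)              # pooling -> first down-sampling feature data
        self.extract = nn.Sequential(            # feature extraction processing
            nn.BatchNorm2d(in_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_ch, growth, kernel_size=3, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        d1 = self.pool(x)                        # first down-sampling feature data
        d2 = self.extract(d1)                    # second down-sampling feature data
        return torch.cat([d1, d2], dim=1)        # fusion by channel concatenation
```

Under these assumptions, each stage halves the spatial resolution and widens the channel dimension by `growth`, so successive stages accumulate increasingly abstract building features.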
Further, the convolution sub-module 112 is configured to: perform reduction processing on the first feature data to obtain fourth feature data; perform feature extraction processing on the fourth feature data to obtain fifth feature data; and perform amplification processing on the fifth feature data to obtain the second feature data.
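One plausible reading of the reduction and amplification steps is a channel bottleneck, sketched below; the 1×1 convolutions and the choice of channel widths are assumptions, and the patent could equally intend spatial down- and up-scaling.

```python
import torch.nn as nn

class Bottleneck(nn.Module):
    """Reduce -> extract -> amplify, read here as a channel bottleneck."""
    def __init__(self, ch: int, mid: int):
        super().__init__()
        self.reduce = nn.Conv2d(ch, mid, kernel_size=1)               # reduction -> fourth feature data
        self.extract = nn.Conv2d(mid, mid, kernel_size=3, padding=1)  # feature extraction -> fifth feature data
        self.expand = nn.Conv2d(mid, ch, kernel_size=1)               # amplification -> second feature data

    def forward(self, x):
        return self.expand(self.extract(self.reduce(x)))
```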
Further, the upsampling sub-module 113 is configured to: performing fusion processing on feature data input by a first up-sampling sub-module and feature data output by a down-sampling sub-module corresponding to the first up-sampling sub-module to obtain first up-sampling feature data, wherein the plurality of up-sampling sub-modules include the first up-sampling sub-module, and the feature data output by the down-sampling sub-module corresponding to the first up-sampling sub-module and the feature data output by the first up-sampling sub-module have the same dimensionality; performing feature extraction processing on the first up-sampling feature data to obtain second up-sampling feature data; and amplifying the second up-sampling feature data to obtain feature data output by the first up-sampling sub-module.
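The skip-connection structure described above can be sketched as follows; bilinear interpolation for the amplification step and channel concatenation for the fusion step are assumed choices, not prescribed by the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class UpSampleBlock(nn.Module):
    """One up-sampling stage: fuse with encoder features, extract, amplify."""
    def __init__(self, in_ch: int, skip_ch: int, out_ch: int):
        super().__init__()
        self.extract = nn.Conv2d(in_ch + skip_ch, out_ch, kernel_size=3, padding=1)

    def forward(self, x: torch.Tensor, skip: torch.Tensor) -> torch.Tensor:
        u1 = torch.cat([x, skip], dim=1)             # fusion with matching down-sampling features
        u2 = F.relu(self.extract(u1))                # feature extraction -> second up-sampling feature data
        return F.interpolate(u2, scale_factor=2,     # amplification to the next resolution
                             mode="bilinear", align_corners=False)
```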
Further, the down-sampling sub-module 111 is further configured to: perform feature extraction processing on the first down-sampling feature data to obtain first intermediate feature data; fuse the first down-sampling feature data and the first intermediate feature data to obtain second intermediate feature data; and perform feature extraction processing on the second intermediate feature data to obtain the second down-sampling feature data.
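This two-stage extraction with an intermediate fusion resembles a residual block; the sketch below assumes element-wise addition as the fusion operation.

```python
import torch.nn as nn

class ResidualExtract(nn.Module):
    """Extract -> fuse with input -> extract, as a residual-style block."""
    def __init__(self, ch: int):
        super().__init__()
        self.conv1 = nn.Conv2d(ch, ch, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(ch, ch, kernel_size=3, padding=1)

    def forward(self, x):
        mid = self.conv1(x)       # first intermediate feature data
        fused = x + mid           # fusion of input and intermediate features
        return self.conv2(fused)  # second down-sampling feature data
```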
Further, the sharpening module 12 includes: the filtering submodule 121 is configured to perform filtering processing on the building probability map to obtain a filtering probability map; the superposition submodule 122 is configured to perform superposition processing on the filtering probability map and the building probability map to obtain a superposition probability map; and the binarization submodule 123 is configured to perform binarization processing on the superimposed probability map to obtain the sharpened building probability map.
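Taken together, the filter/superpose/binarize sequence behaves much like an unsharp mask followed by thresholding. The sketch below illustrates this reading; the Gaussian filter, the superposition rule and the 0.5 threshold are assumptions, not values given in the patent.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def sharpen_probability_map(prob: np.ndarray, sigma: float = 2.0,
                            threshold: float = 0.5) -> np.ndarray:
    """Sharpen a [0, 1] building probability map and binarize it."""
    blurred = gaussian_filter(prob, sigma)               # filtering -> filter probability map
    sharpened = np.clip(2.0 * prob - blurred, 0.0, 1.0)  # superposition with the original map
    return (sharpened >= threshold).astype(np.uint8)     # binarization
```

Pushing each probability away from its blurred local average sharpens the transition between building and non-building areas before the hard threshold is applied.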
Further, the processing module 13 includes: a determining submodule 131, configured to determine a connected region based on the probability values of a plurality of pixel points included in the sharpened building probability map; a boundary correction submodule 132, configured to perform edge correction processing on the original boundary of the connected region to obtain a corrected boundary of the connected region; and an angle correction submodule 133, configured to perform acute-angle correction processing on the corrected boundary of the connected region to obtain the building boundary information.
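A possible realization of the connected-region and boundary steps using off-the-shelf OpenCV routines is sketched below; contour tracing with Douglas-Peucker simplification stands in for the patent's edge and acute-angle corrections, whose exact rules are not restated here, and the epsilon and area parameters are illustrative.

```python
import cv2
import numpy as np

def building_boundaries(binary_map: np.ndarray, epsilon: float = 3.0,
                        min_area: float = 20.0) -> list:
    """Trace connected building regions and simplify their boundaries."""
    contours, _ = cv2.findContours(binary_map.astype(np.uint8),
                                   cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    polygons = []
    for contour in contours:                      # one connected region per contour
        if cv2.contourArea(contour) < min_area:   # discard small noise regions
            continue
        # Simplify the traced boundary into a compact closed polygon (vector form)
        polygons.append(cv2.approxPolyDP(contour, epsilon, True))
    return polygons
```

The resulting polygons can be written out directly as the building vector diagram, which is what keeps the stored result so much smaller than the raster image.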
Further, the apparatus 1 further comprises: a storage module 14, configured to store the building boundary information.
According to the present application, the building feature data in the image to be processed is extracted by performing down-sampling processing, up-sampling processing and feature fusion processing on the image to be processed, and the content of the image to be processed is predicted from the building feature data to obtain the building probability map. Sharpening the building probability map makes building and non-building areas easier to distinguish; correcting the boundaries and corners of the building area makes it conform better to the actual shape of the building; and the building vector diagram finally obtained from the building area greatly reduces the size of the image, which brings great convenience to users for storing the building vector diagram and performing subsequent processing based on it.
Fig. 10 is a schematic structural diagram of image processing hardware according to an embodiment of the present application. The processing device 2 comprises a processor 21 and may further comprise an input device 22, an output device 23 and a memory 24. The input device 22, the output device 23, the memory 24 and the processor 21 are connected to each other via a bus.
The memory includes, but is not limited to, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM) or a compact disc read-only memory (CD-ROM), and is used for storing instructions and data.
The input means are for inputting data and/or signals and the output means are for outputting data and/or signals. The output means and the input means may be separate devices or may be an integral device.
The processor may include one or more processors, for example, one or more Central Processing Units (CPUs), and in the case of one CPU, the CPU may be a single-core CPU or a multi-core CPU.
The memory is used to store the program code and data of the device.
The processor is used for calling the program codes and data in the memory and executing the steps in the method embodiment. Specifically, reference may be made to the description of the method embodiment, which is not repeated herein.
It will be appreciated that fig. 10 shows only a simplified design of an image processing device. In practical applications, the image processing device may further include other necessary components, including but not limited to any number of input/output devices, processors, controllers and memories, and all image processing devices that can implement the embodiments of the present application fall within the scope of protection of the present application.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the described division into units is only a division by logical function, and other divisions are possible in actual implementation; for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection between devices or units through some interfaces, and may be electrical, mechanical or in other forms.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In the above embodiments, the implementation may be realized wholly or partially by software, hardware, firmware or any combination thereof. When software is used, the implementation may take the form, in whole or in part, of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions according to the embodiments of the present application are generated in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network or another programmable device. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one website, computer, server or data center to another website, computer, server or data center by wire (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wirelessly (e.g., infrared, radio, microwave, etc.). The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a read-only memory (ROM) or a random access memory (RAM), a magnetic medium such as a floppy disk, hard disk, magnetic tape or magnetic disk, an optical medium such as a digital versatile disc (DVD), or a semiconductor medium such as a solid state disk (SSD).
Claims (10)
1. An image processing method, comprising:
acquiring a building probability map of an image to be processed, wherein the building probability map comprises probability information that a plurality of pixel points in the image to be processed belong to a building area;
sharpening the building probability map to obtain a sharpened building probability map;
and obtaining the building boundary information of the image to be processed based on the sharpened building probability map.
2. The method of claim 1, wherein obtaining the building probability map of the image to be processed comprises:
step-by-step down-sampling processing is carried out on the image to be processed through a plurality of down-sampling modules to obtain first characteristic data;
performing convolution processing on the first characteristic data to obtain second characteristic data;
performing progressive upsampling processing on the second feature data through a plurality of upsampling modules to obtain third feature data;
and obtaining the building probability map based on the third characteristic data.
3. The method according to claim 2, wherein the down-sampling the image to be processed by a plurality of down-sampling modules step by step to obtain first feature data, comprises:
performing pooling on input information of a first down-sampling module of the plurality of down-sampling modules to obtain first down-sampling feature data, wherein the input information of the first down-sampling module is the image to be processed or feature data output by the down-sampling module preceding the first down-sampling module;
performing feature extraction processing on the first downsampling feature data to obtain second downsampling feature data;
and performing fusion processing on the first downsampling feature data and the second downsampling feature data to obtain feature data output by the first downsampling module.
4. The method according to claim 3, wherein performing the feature extraction processing on the first down-sampling feature data to obtain second down-sampling feature data comprises:
performing feature extraction processing on the first downsampling feature data to obtain first intermediate feature data;
fusing the first downsampling feature data and the first intermediate feature data to obtain second intermediate feature data;
and performing feature extraction processing on the second intermediate feature data to obtain second downsampling feature data.
5. The method according to any one of claims 2 to 4, wherein the convolving the first feature data to obtain second feature data comprises:
carrying out reduction processing on the first characteristic data to obtain fourth characteristic data;
performing feature extraction processing on the fourth feature data to obtain fifth feature data;
and amplifying the fifth characteristic data to obtain the second characteristic data.
6. The method according to any one of claims 2 to 5, wherein the step-by-step upsampling the second feature data by a plurality of upsampling modules to obtain third feature data comprises:
performing fusion processing on feature data input to a first up-sampling module and feature data output by the down-sampling module corresponding to the first up-sampling module to obtain first up-sampling feature data, wherein the plurality of up-sampling modules comprise the first up-sampling module, and the feature data output by the down-sampling module corresponding to the first up-sampling module and the feature data output by the first up-sampling module have the same dimensionality;
performing feature extraction processing on the first up-sampling feature data to obtain second up-sampling feature data;
and amplifying the second up-sampling feature data to obtain feature data output by the first up-sampling module.
7. The method according to any one of claims 1 to 6, wherein the sharpening the building probability map to obtain a sharpened building probability map comprises:
filtering the building probability map to obtain a filtering probability map;
overlapping the filtering probability map and the building probability map to obtain an overlapping probability map;
and carrying out binarization processing on the superimposed probability map to obtain the sharpened building probability map.
8. An image processing apparatus characterized by comprising:
the system comprises an acquisition module, a storage module and a processing module, wherein the acquisition module is used for acquiring a building probability map of an image to be processed, and the building probability map comprises probability information that a plurality of pixel points in the image to be processed belong to a building area;
the sharpening module is used for sharpening the building probability map to obtain a sharpened building probability map;
and the processing module is used for obtaining the building boundary information of the image to be processed based on the sharpened building probability map.
9. An image processing apparatus comprising a memory having computer-executable instructions stored thereon and a processor that, when executing the computer-executable instructions on the memory, implements the method of any of claims 1 to 7.
10. A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the method of any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811458101.8A CN111260564A (en) | 2018-11-30 | 2018-11-30 | Image processing method and device and computer storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111260564A true CN111260564A (en) | 2020-06-09 |
Family
ID=70951879
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811458101.8A Pending CN111260564A (en) | 2018-11-30 | 2018-11-30 | Image processing method and device and computer storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111260564A (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104484871A (en) * | 2014-11-27 | 2015-04-01 | 小米科技有限责任公司 | Method and device for extracting edges |
US20180129899A1 (en) * | 2016-11-07 | 2018-05-10 | Gracenote, Inc. | Recurrent Deep Neural Network System for Detecting Overlays in Images |
CN108537192A (en) * | 2018-04-17 | 2018-09-14 | 福州大学 | A kind of remote sensing image ground mulching sorting technique based on full convolutional network |
CN108830221A (en) * | 2018-06-15 | 2018-11-16 | 北京市商汤科技开发有限公司 | The target object segmentation of image and training method and device, equipment, medium, product |
Non-Patent Citations (2)
Title |
---|
SIMON JÉGOU et al.: "The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation", ARXIV.ORG, pages 1 - 4 *
HUANG Zuoji et al.: "Classification and Optimization of Target Ground Objects from Multi-source Remote Sensing Data", University of Science and Technology of China Press *
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111986087A (en) * | 2020-08-27 | 2020-11-24 | 贝壳找房(北京)科技有限公司 | House vector diagram splicing method and device and computer readable storage medium |
CN112163062A (en) * | 2020-10-21 | 2021-01-01 | 腾讯科技(深圳)有限公司 | Data processing method and device, computer equipment and storage medium |
CN112163062B (en) * | 2020-10-21 | 2022-10-21 | 腾讯科技(深圳)有限公司 | Data processing method and device, computer equipment and storage medium |
CN112800915A (en) * | 2021-01-20 | 2021-05-14 | 北京百度网讯科技有限公司 | Building change detection method, building change detection device, electronic device, and storage medium |
CN112800915B (en) * | 2021-01-20 | 2023-06-27 | 北京百度网讯科技有限公司 | Building change detection method, device, electronic equipment and storage medium |
CN117726541A (en) * | 2024-02-08 | 2024-03-19 | 北京理工大学 | A dark-light video enhancement method and device based on binary neural network |
CN117726541B (en) * | 2024-02-08 | 2024-06-28 | 北京理工大学 | A dark light video enhancement method and device based on binarization neural network |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111008597B (en) | Space identification method and device for CAD drawing, electronic equipment and storage medium | |
Gavankar et al. | Automatic building footprint extraction from high-resolution satellite image using mathematical morphology | |
Liasis et al. | Building extraction in satellite images using active contours and colour features | |
Yu et al. | Efficient crack detection method for tunnel lining surface cracks based on infrared images | |
CN111260564A (en) | Image processing method and device and computer storage medium | |
CN111652218A (en) | Text detection method, electronic device and computer readable medium | |
Tarsha Kurdi et al. | Automatic filtering and 2D modeling of airborne laser scanning building point cloud | |
Coleman et al. | Multi-scale edge detection on range and intensity images | |
CN107545223B (en) | Image recognition method and electronic equipment | |
CN116958145B (en) | Image processing method and device, visual detection system and electronic equipment | |
CA3136990A1 (en) | A human body key point detection method, apparatus, computer device and storage medium | |
CN110414649B (en) | DM code positioning method, device, terminal and storage medium | |
Samsonov et al. | Shape-adaptive geometric simplification of heterogeneous line datasets | |
Adu-Gyamfi et al. | Functional evaluation of pavement condition using a complete vision system | |
CN113935896A (en) | Image stitching method, device, computer equipment and storage medium | |
Prerna et al. | Evaluation of LiDAR and image segmentation based classification techniques for automatic building footprint extraction for a segment of Atlantic County, New Jersey | |
CN114612628A (en) | Map beautifying method, device, robot and storage medium | |
Chang et al. | A visual basic program for ridge axis picking on DEM data using the profile-recognition and polygon-breaking algorithm | |
Serati et al. | Digital surface model generation from high-resolution satellite stereo imagery based on structural similarity | |
EP3692507A1 (en) | Method and device for hole filling of a point cloud | |
CN114255352A (en) | A kind of river channel extraction method, device and computer readable storage medium | |
CN116385889B (en) | Railway identification-based power inspection method and device and electronic equipment | |
CN113569600A (en) | Object re-identification method, device, electronic device and storage medium | |
CN117315479A (en) | Landslide real-time identification method, device, equipment and medium based on remote sensing image | |
Lee et al. | Determination of building model key points using multidirectional shaded relief images generated from airborne LiDAR data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20200609 |