Method for correcting elevation error of digital elevation model based on particle swarm optimization random forest
Technical Field
The invention relates to a digital elevation model elevation error correction method based on particle swarm optimization random forest, and belongs to the technical field of remote sensing.
Background
Digital elevation models (DEMs), as digital representations of ground surface elevation, are widely used in geomorphology, hydrology, surveying and mapping, and disaster monitoring and control. However, because they are affected by factors such as the observation technique, terrain conditions and vegetation cover, their elevations contain non-negligible errors, and elevation accuracy often differs considerably between regions.
To address these errors, some researchers have tried to correct digital elevation models using higher-precision elevation data, such as high-precision GPS survey points, airborne LiDAR elevation data and high-precision DEM data. However, such data cover limited areas and are difficult to acquire and produce, so they cannot be used to correct the elevation errors of a model such as SRTM (Shuttle Radar Topography Mission) over large areas at arbitrary locations. ICESat satellite altimetry data have gradually been applied to digital elevation model correction because of their global coverage and high altimetric precision, but ICESat ceased operation in 2009 and its data are no longer updated, which limits the SRTM correction performance that can be achieved with it. The ICESat-2 altimetry satellite is currently in operation and provides high-precision altimetry data with near-global coverage. Based on ICESat-2 data combined with Landsat 8 imagery, Magruder et al. established an elevation error correction model for digital elevation models using polynomial regression. However, the relationship between the elevation error of a digital elevation model and its influencing factors is often complex and nonlinear, and a polynomial regression equation with a simple mathematical form cannot fully express it, so the elevation accuracy of a model corrected in this way remains considerably limited.
Random forest, a machine learning algorithm suited to nonlinear regression problems, offers high accuracy, strong noise resistance and resistance to overfitting, but its accuracy depends on the hyperparameters chosen. The invention therefore couples it with a particle swarm algorithm to search for the optimal combination of random forest hyperparameters, and corrects the elevation error of the digital elevation model with the resulting particle swarm optimization random forest method.
Disclosure of Invention
In order to achieve a higher-precision digital elevation model correction result, the invention provides a digital elevation model elevation error correction method based on a particle swarm optimization random forest.
The technical scheme adopted by the invention for achieving the purpose is as follows:
A method for correcting elevation errors of a digital elevation model based on particle swarm optimization random forest comprises the following steps:
a. extracting reference elevation control points from altimetry satellite data, calculating the elevation error of the digital elevation model relative to the reference elevation control points, and extracting the longitude and latitude, terrain parameters and surface coverage type parameter corresponding to each reference elevation control point;
b. constructing a digital elevation model elevation error correction model based on a particle swarm optimization random forest;
c. using the longitude and latitude, terrain parameters and surface coverage type parameter at the reference elevation control points as the input data of the correction model and the elevation error as its target data, thereby establishing the training set for the model;
d. training the correction model with the training set, and applying the trained correction model to the digital elevation model to correct its elevation error.
In step a, the extracted terrain parameters are the slope Sl, the aspect (slope direction) As and the terrain relief Re of the digital elevation model, and the surface coverage type parameter Gl is taken from global surface coverage data. The terrain parameters corresponding to a reference elevation control point are computed at that point by bilinear interpolation, whereas the surface coverage type parameter is extracted directly at the point, without interpolation.
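The two sampling rules described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names are hypothetical, the rasters are plain nested lists, and control point positions are assumed to be already converted to fractional row/column indices with pixel centres at integer indices.

```python
import math

def bilinear_sample(grid, row, col):
    """Bilinearly interpolate a 2D grid (list of rows) at fractional (row, col).

    Used for continuous terrain parameters such as Sl, As and Re.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    if not (0 <= row <= n_rows - 1 and 0 <= col <= n_cols - 1):
        raise ValueError("sample point lies outside the grid")
    # Index of the upper-left pixel of the 2x2 neighbourhood.
    r0 = min(int(math.floor(row)), n_rows - 2)
    c0 = min(int(math.floor(col)), n_cols - 2)
    dr, dc = row - r0, col - c0
    top = grid[r0][c0] * (1 - dc) + grid[r0][c0 + 1] * dc
    bottom = grid[r0 + 1][c0] * (1 - dc) + grid[r0 + 1][c0 + 1] * dc
    return top * (1 - dr) + bottom * dr

def direct_sample(grid, row, col):
    """Return the value of the pixel containing the point.

    Used for the categorical surface coverage code Gl, which must not be
    averaged between classes.
    """
    return grid[int(math.floor(row + 0.5))][int(math.floor(col + 0.5))]

slope = [[10.0, 20.0], [30.0, 40.0]]   # toy slope raster
land_cover = [[10, 20], [30, 40]]      # toy class codes
print(bilinear_sample(slope, 0.5, 0.5))      # 25.0
print(direct_sample(land_cover, 0.2, 0.9))   # 20
```

Interpolating the class codes would produce values that correspond to no surface coverage class, which is why the categorical layer is read directly.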
In step b, the elevation error correction model is as follows:
$$H_{corrected} = H_{original} + H_{error}, \qquad H_{error} = f_{PSO\text{-}RF}(Lat, Lon, Sl, As, Re, Gl)$$
where $H_{corrected}$ is the elevation of the corrected digital elevation model, $H_{original}$ is the elevation of the original digital elevation model, $H_{error}$ is the predicted elevation error of the digital elevation model, PSO is the particle swarm algorithm, RF is the random forest algorithm, Lat and Lon are latitude and longitude respectively, and Sl, As, Re and Gl are the slope, aspect, terrain relief and surface coverage type parameters respectively. The hyperparameters are optimized by the PSO algorithm, and the RF elevation error model with the optimal hyperparameter combination is trained on the training set. The trained model takes the predictor variables [Lat, Lon, Sl, As, Re, Gl] of each SRTM pixel and predicts the SRTM elevation error $H_{error}$, which is combined with the original SRTM elevation $H_{original}$ to obtain the corrected SRTM elevation $H_{corrected}$.
In step d, the model training and correction flow is shown in FIG. 1. The original training set is first randomly divided into 5 groups, of which 4 are used to train a random forest model and 1 is used to verify model accuracy. The random forest model is:
$$H_{error\text{-}RF} = f_{RF}(Lat, Lon, Sl, As, Re, Gl)$$
The verification metric is the mean squared error regression loss (MSE):
$$MSE = \frac{1}{N}\sum_{i=1}^{N}\left(H_{error\text{-}RF,i} - H_{error\text{-}ref,i}\right)^{2}$$
where N is the number of data points used for verification, $H_{error\text{-}RF,i}$ is the elevation error predicted by the random forest model at the i-th point, and $H_{error\text{-}ref,i}$ is the elevation error of the digital elevation model relative to the reference elevation control point at that point. Each group is used in turn for accuracy verification, and the mean of the 5 verification results is taken as the fitness function value of the model; the smaller the fitness function value, the higher the model accuracy.
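The 5-group verification scheme can be sketched as a fitness function. This is an illustrative sketch under stated assumptions: `kfold_fitness` and `fit_mean` are hypothetical names, the regressor is passed in as a `fit` callable, and a trivial mean predictor stands in for the random forest so that the example needs only the standard library.

```python
import random

def mse(predicted, observed):
    """Mean squared error between predicted and observed elevation errors."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)

def kfold_fitness(features, targets, fit, k=5, seed=0):
    """Mean of the k validation MSEs (lower is better).

    `fit(X, y)` must return a callable that maps a list of feature rows
    to a list of predictions.
    """
    if len(targets) < k:
        raise ValueError("need at least k samples")
    indices = list(range(len(targets)))
    random.Random(seed).shuffle(indices)          # random split into k groups
    folds = [indices[i::k] for i in range(k)]
    scores = []
    for held_out in folds:                        # each group validates once
        held = set(held_out)
        train = [i for i in indices if i not in held]
        predict = fit([features[i] for i in train], [targets[i] for i in train])
        predicted = predict([features[i] for i in held_out])
        scores.append(mse(predicted, [targets[i] for i in held_out]))
    return sum(scores) / k

def fit_mean(X, y):
    """Stand-in regressor (NOT a random forest): predicts the training mean."""
    mean = sum(y) / len(y)
    return lambda rows: [mean for _ in rows]

X = [[float(i)] for i in range(20)]
y = [0.5 * i for i in range(20)]
print(kfold_fitness(X, y, fit_mean))
```

In practice `fit` would wrap a random forest trained with one candidate hyperparameter combination, so that each combination receives a single comparable score.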
Next, the solution space is defined for the hyperparameters that affect the accuracy of the random forest model, namely the number of decision trees and the maximum number of feature variables that may be selected at each node split. Within this solution space, the particle swarm algorithm iteratively updates the velocity and position of each particle and compares the fitness function values obtained under the different hyperparameter combinations. The combination with the smallest fitness function value is taken as the optimal hyperparameter combination of the random forest model, and the random forest model is then trained with this combination on the whole training set.
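The particle swarm search over the two hyperparameters can be sketched as below. The inertia weight, the acceleration coefficients, the swarm size, the bounds and the toy fitness function are all assumptions made for illustration; the source does not specify them. In a real run the fitness function would be the 5-group cross-validated MSE of a random forest built with the candidate hyperparameters.

```python
import random

def pso_minimize(fitness, bounds, n_particles=10, n_iter=30,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise `fitness` over integer parameters inside `bounds`.

    bounds: list of (low, high) pairs, one per hyperparameter.
    Returns (best_parameters, best_fitness_value).
    """
    rng = random.Random(seed)

    def to_params(position):
        # Clamp to the solution space and round to integer hyperparameters.
        return tuple(int(round(min(max(x, lo), hi)))
                     for x, (lo, hi) in zip(position, bounds))

    positions = [[rng.uniform(lo, hi) for lo, hi in bounds]
                 for _ in range(n_particles)]
    velocities = [[0.0] * len(bounds) for _ in range(n_particles)]
    best_pos = [p[:] for p in positions]                 # personal bests
    best_val = [fitness(to_params(p)) for p in positions]
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]           # global best

    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                velocities[i][d] = (inertia * velocities[i][d]
                                    + c1 * r1 * (best_pos[i][d] - positions[i][d])
                                    + c2 * r2 * (g_pos[d] - positions[i][d]))
                positions[i][d] = min(max(positions[i][d] + velocities[i][d], lo), hi)
            value = fitness(to_params(positions[i]))
            if value < best_val[i]:
                best_pos[i], best_val[i] = positions[i][:], value
                if value < g_val:
                    g_pos, g_val = positions[i][:], value
    return to_params(g_pos), g_val

def toy_fitness(params):
    """Stand-in for the cross-validated MSE (assumed shape, not real data)."""
    n_trees, max_features = params
    return (n_trees - 120) ** 2 / 1e4 + (max_features - 3) ** 2

# Hypothetical bounds: 10..500 trees, 1..6 candidate features per split.
best_params, best_value = pso_minimize(toy_fitness, [(10, 500), (1, 6)])
print(best_params, best_value)
```

The upper bound of 6 for the second parameter mirrors the six input variables [Lat, Lon, Sl, As, Re, Gl]; rounding inside `to_params` is one simple way to handle integer hyperparameters with a continuous swarm.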
Finally, for each pixel of the digital elevation model, the trained model takes the corresponding [Lat, Lon, Sl, As, Re, Gl] and predicts the elevation error $H_{error}$ of that pixel. Adding the original elevation $H_{original}$ of the pixel gives the corrected elevation $H_{corrected}$, which completes the elevation error correction of the digital elevation model.
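The per-pixel correction amounts to one prediction and one addition per pixel. The sketch below assumes rasters stored as nested lists on a common grid and a hypothetical `predict_error` callable standing in for the trained model; neither the function name nor the data layout comes from the source.

```python
def correct_dem(dem, lats, lons, slope, aspect, relief, land_cover, predict_error):
    """Return the corrected DEM: H_corrected = H_original + H_error per pixel.

    lats[r] and lons[c] give the latitude of row r and longitude of column c.
    predict_error takes [Lat, Lon, Sl, As, Re, Gl] and returns H_error.
    """
    corrected = []
    for r, row in enumerate(dem):
        out_row = []
        for c, h_original in enumerate(row):
            x = [lats[r], lons[c], slope[r][c], aspect[r][c],
                 relief[r][c], land_cover[r][c]]
            out_row.append(h_original + predict_error(x))
        corrected.append(out_row)
    return corrected

# Toy 2x2 example with a constant stand-in predictor (illustrative values only).
dem = [[100.0, 101.0], [102.0, 103.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]
codes = [[10, 10], [20, 20]]
result = correct_dem(dem, [1.0, 0.9], [2.0, 2.1], zeros, zeros, zeros, codes,
                     predict_error=lambda x: -2.0)
print(result)  # [[98.0, 99.0], [100.0, 101.0]]
```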
The beneficial effect of the invention is that correction with the particle swarm optimization random forest method further improves the accuracy of the corrected digital elevation model. In the test, SRTM was selected as the digital elevation model, ICESat-2 strong-beam ground photon data as the reference elevation control points, GlobeLand30 as the global surface coverage data, and the airborne LiDAR DTM released by NEON as the verification data. The method of the invention (PSO-RF) and the polynomial regression-based correction method (PR) were both tested, with the root mean square error (RMSE) as the verification metric. The results show that the method of the invention effectively reduces the elevation error of SRTM: the elevation error of the corrected SRTM is 42% to 46% lower than before correction, and the correction accuracy is better than that of the polynomial regression-based method.
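The verification metric and the relative error reduction can be computed as in the sketch below. The function names are hypothetical and the numbers are toy values chosen only to exercise the arithmetic; they are not the test results reported above.

```python
import math

def rmse(dem_heights, reference_heights):
    """Root mean square error of DEM heights against verification heights."""
    n = len(reference_heights)
    return math.sqrt(sum((d - r) ** 2
                         for d, r in zip(dem_heights, reference_heights)) / n)

def relative_reduction(rmse_before, rmse_after):
    """Percentage by which the RMSE decreased after correction."""
    return 100.0 * (rmse_before - rmse_after) / rmse_before

reference = [10.0, 20.0, 30.0, 40.0]          # toy verification heights
original = [13.0, 16.0, 33.0, 44.0]           # toy uncorrected DEM heights
corrected = [11.0, 19.0, 31.0, 42.0]          # toy corrected DEM heights

before = rmse(original, reference)
after = rmse(corrected, reference)
print(before, after, relative_reduction(before, after))
```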
Drawings
FIG. 1 is a flow chart for correcting the elevation error of a digital elevation model.
FIG. 2 is an evaluation of the elevation accuracy of the original SRTM, the PR-corrected SRTM and the PSO-RF-corrected SRTM along three lines.
Detailed Description
The invention is further described below with reference to the examples and the accompanying drawings, which are not intended to limit the invention.
The invention is implemented on a computer, which carries out the elevation error correction of the digital elevation model based on the particle swarm optimization random forest. Taking the SRTM digital elevation model, ICESat-2 satellite altimetry data and GlobeLand30 surface coverage data of a certain area as an example, the method for correcting the elevation error of SRTM comprises the following steps:
Step a, reading the ICESat-2 satellite altimetry data, selecting the photons classified as ground (classed_pc_flag = 1) as reference elevation control points, and obtaining the elevation error of SRTM relative to the reference elevation control points at those points.
Step b, performing geoprocessing on the SRTM to obtain slope, aspect and terrain relief data, extracting the terrain parameters [Sl, As, Re] at each reference elevation control point by bilinear interpolation, and taking the value of the GlobeLand30 pixel in which the reference elevation control point falls directly as the surface coverage type parameter Gl at that point.
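Step a can be sketched as a filter followed by a difference. Several points here are assumptions rather than statements from the source: the photon records are plain dictionaries with hypothetical keys, `srtm_height_at` is a hypothetical lookup function, both heights are assumed to be already referenced to the same vertical datum, and the error is taken as reference minus SRTM so that adding it to the SRTM elevation yields the reference, consistent with the correction being applied by addition.

```python
GROUND = 1  # classed_pc_flag value the text uses for ground photons

def control_point_errors(photons, srtm_height_at):
    """Keep ground photons and compute the SRTM elevation error at each one.

    photons: iterable of dicts with keys 'lat', 'lon', 'h', 'classed_pc_flag'
             (record layout assumed for illustration).
    srtm_height_at: callable (lat, lon) -> SRTM elevation at that location.
    """
    points = []
    for p in photons:
        if p["classed_pc_flag"] != GROUND:
            continue  # skip photons not classified as ground
        h_srtm = srtm_height_at(p["lat"], p["lon"])
        points.append({
            "lat": p["lat"],
            "lon": p["lon"],
            # Assumed sign convention: H_original + error == reference height.
            "error": p["h"] - h_srtm,
        })
    return points

photons = [
    {"lat": 40.0, "lon": -105.0, "h": 1612.0, "classed_pc_flag": 0},
    {"lat": 40.1, "lon": -105.0, "h": 1598.5, "classed_pc_flag": 1},
    {"lat": 40.2, "lon": -105.0, "h": 1620.0, "classed_pc_flag": 2},
]
points = control_point_errors(photons, srtm_height_at=lambda lat, lon: 1600.0)
print(points)  # [{'lat': 40.1, 'lon': -105.0, 'error': -1.5}]
```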
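The derivation of the three terrain parameters from the elevation grid can be sketched as follows. The source does not say which formulas are used, so everything here is an assumption: central differences on a square grid with cell size in metres, rows running north to south, aspect measured clockwise from north towards the downslope direction (with -1 for flat cells), and relief taken as the elevation range in a 3x3 window.

```python
import math

def terrain_parameters(dem, r, c, cell_size):
    """Return (slope_deg, aspect_deg, relief) at interior pixel (r, c)."""
    if not (0 < r < len(dem) - 1 and 0 < c < len(dem[0]) - 1):
        raise ValueError("only interior pixels are supported in this sketch")
    # Elevation gradients towards east and north (rows increase southward).
    dz_east = (dem[r][c + 1] - dem[r][c - 1]) / (2.0 * cell_size)
    dz_north = (dem[r - 1][c] - dem[r + 1][c]) / (2.0 * cell_size)
    slope = math.degrees(math.atan(math.hypot(dz_east, dz_north)))
    if dz_east == 0 and dz_north == 0:
        aspect = -1.0  # flat cell: no defined downslope direction
    else:
        # Bearing of the downslope vector (-dz_east, -dz_north).
        aspect = math.degrees(math.atan2(-dz_east, -dz_north)) % 360.0
    window = [dem[i][j] for i in (r - 1, r, r + 1) for j in (c - 1, c, c + 1)]
    relief = max(window) - min(window)
    return slope, aspect, relief

# Toy plane rising 1 m per 1 m towards the east: it faces west.
dem = [[0.0, 1.0, 2.0],
       [0.0, 1.0, 2.0],
       [0.0, 1.0, 2.0]]
print(terrain_parameters(dem, 1, 1, cell_size=1.0))
```

SRTM is usually distributed in geographic coordinates, so a real implementation would need to convert the pixel spacing to metres or reproject the grid first; the sketch leaves that step out.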
Step c, constructing the digital elevation model elevation error correction model based on the particle swarm optimization random forest.
Step d, using the longitude and latitude, terrain parameters and surface coverage type parameter at the reference elevation control points as the input data of the correction model and the elevation error as its target data, thereby establishing the training set for the model.
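Assembling the training set is a matter of fixing the column order of the inputs and pairing each row with its target. The record keys and the rule of skipping incomplete points are assumptions made for this sketch.

```python
# Column order matching [Lat, Lon, Sl, As, Re, Gl]; key names are hypothetical.
FEATURE_ORDER = ("lat", "lon", "slope", "aspect", "relief", "land_cover")

def build_training_set(control_points):
    """Split control point records into input rows X and target values y."""
    X, y = [], []
    for p in control_points:
        row = [p.get(name) for name in FEATURE_ORDER]
        if any(v is None for v in row) or p.get("error") is None:
            continue  # assumed rule: drop points with missing values
        X.append(row)
        y.append(p["error"])
    return X, y

control_points = [
    {"lat": 40.1, "lon": -105.0, "slope": 3.2, "aspect": 181.0,
     "relief": 4.0, "land_cover": 30, "error": -1.5},
    {"lat": 40.2, "lon": -105.0, "slope": None, "aspect": 90.0,
     "relief": 2.0, "land_cover": 20, "error": 0.7},
]
X, y = build_training_set(control_points)
print(X, y)  # [[40.1, -105.0, 3.2, 181.0, 4.0, 30]] [-1.5]
```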
Step e, the original training set is randomly divided into 5 groups, of which 4 are used to train a random forest model and 1 is used to verify model accuracy; the random forest model is $H_{error\text{-}RF} = f_{RF}(Lat, Lon, Sl, As, Re, Gl)$. Each group is used in turn for accuracy verification, and the mean of the 5 verification results is taken as the fitness function value of the model; the smaller the fitness function value, the higher the model accuracy. Next, the solution space is defined for the hyperparameters that affect the accuracy of the random forest model, namely the number of decision trees and the maximum number of feature variables that may be selected at each node split. Within this solution space, the particle swarm algorithm iteratively updates the velocity and position of each particle and compares the fitness function values obtained under the different hyperparameter combinations. The combination with the smallest fitness function value is taken as the optimal hyperparameter combination of the random forest model, and the random forest model is then trained with this combination on the whole training set.
Step f, for each pixel of the digital elevation model, the trained model takes the corresponding [Lat, Lon, Sl, As, Re, Gl] and predicts the elevation error $H_{error}$ of that pixel; adding the original elevation $H_{original}$ of the pixel gives the corrected elevation $H_{corrected}$, which completes the elevation error correction of the digital elevation model.
The present application has been described in terms of embodiments, and it will be appreciated by those of skill in the art that various changes can be made to the features and embodiments, or equivalents can be substituted, without departing from the spirit and scope of the application. In addition, many modifications may be made to adapt a particular situation or material to the teachings of the application without departing from the essential scope thereof. Therefore, it is intended that the application not be limited to the particular embodiment disclosed, but that the application will include all embodiments falling within the scope of the appended claims.