CN115983317B - A Method for Correcting Elevation Errors in Digital Elevation Models Based on Particle Swarm Optimization and Random Forest

Info

Publication number: CN115983317B
Application number: CN202310007751.5A
Authority: CN
Inventors: 张立华; 刘翔; 贾帅东; 戴泽源
Original assignee: PLA Dalian Naval Academy
Current assignee: PLA Naval University of Engineering
Priority date: 2023-01-04
Filing date: 2023-01-04
Publication date: 2025-11-25
Anticipated expiration: 2043-01-04
Also published as: CN115983317A

Abstract

This invention discloses a method for correcting elevation errors in digital elevation models (DEMs) based on particle swarm optimization (PSO) and random forest, belonging to the field of remote sensing technology. The invention corrects the DEM with a PSO-optimized random forest method to further improve the accuracy of the corrected DEM. In the experiment, SRTM was selected as the DEM, ICESat-2 strong-beam ground photon data were used as the reference elevation control points, Globeland30 was used as the global land cover data, and the airborne LiDAR DTM released by NEON was used as the validation data. The proposed method was compared experimentally with a polynomial regression-based correction method, using root mean square error (RMSE) as the validation index. The results show that the invention effectively reduces the elevation error of SRTM: the elevation error of the corrected SRTM is 42%-46% lower than before correction, and the correction accuracy is superior to that of the polynomial regression-based correction method.

Description

Method for correcting elevation error of digital elevation model based on particle swarm optimization random forest

Technical Field

The invention relates to a digital elevation model elevation error correction method based on particle swarm optimization random forest, and belongs to the technical field of remote sensing.

Background

As a digital representation of ground surface elevation, the digital elevation model is widely used in fields such as geomorphology, hydrology, surveying and mapping, and disaster monitoring and control. However, the digital elevation model is affected by factors such as the observation method, terrain conditions, and vegetation cover, so its elevations contain non-negligible errors, and elevation precision often differs considerably between areas.

To address these elevation errors, some researchers have tried to correct digital elevation models using higher-precision elevation data, such as high-precision GPS survey points, airborne LiDAR elevation data, and high-precision DEM data. However, such data cover limited areas and are difficult to acquire and produce, so they cannot be used to correct SRTM elevation errors over a wide range in arbitrary regions. ICESat satellite altimetry data have gradually been applied to digital elevation model correction because of their global coverage and high altimetric precision, but ICESat ceased operation in 2009 and its data are no longer updated, which limits further improvement of SRTM correction. The ICESat-2 altimetry satellite is currently in operation and provides high-precision altimetry data with near-global coverage. Magruder et al. established a digital elevation model elevation error correction model based on ICESat-2 data, combined with Landsat 8 imagery, using polynomial regression. However, because the relationship between the elevation error of a digital elevation model and its influencing factors is often complex and nonlinear, it is difficult to express fully with a polynomial regression equation of simple mathematical form, and the elevation precision of a digital elevation model corrected in this way remains considerably limited.

Random forest, a machine learning algorithm suited to nonlinear regression problems, offers high precision, strong robustness to noise, and resistance to overfitting, but its precision depends on the hyperparameters that are set. The invention therefore uses a particle swarm algorithm to search for the optimal combination of random forest hyperparameters, and corrects the elevation error of the digital elevation model with the resulting particle swarm optimization random forest method.

Disclosure of Invention

In order to achieve a higher-precision digital elevation model correction result, the invention provides a digital elevation model elevation error correction method based on a particle swarm optimization random forest.

The technical scheme adopted by the invention for achieving the purpose is as follows:

A method for correcting elevation errors of a digital elevation model based on particle swarm optimization random forest comprises the following steps:

a. extracting reference elevation control points from satellite altimetry data, calculating the elevation error of the digital elevation model relative to the reference elevation control points, and extracting the longitude and latitude, terrain parameters, and land cover type parameter corresponding to each reference elevation control point;

b. constructing a digital elevation model elevation error correction model based on particle swarm optimization random forest;

c. using the longitude and latitude, terrain parameters, and land cover type parameter at the reference elevation control points as the input data of the correction model, using the elevation error as the target data of the correction model, and establishing a training set for training the model;

d. training the correction model with the training set, applying the trained correction model to the digital elevation model, and correcting the elevation error.

In step a, the extracted terrain parameters are the slope Sl, aspect As, and topographic relief Re of the digital elevation model, and the extracted land cover type parameter comes from global land cover data Gl. The terrain parameters corresponding to a reference elevation control point are extracted at that point by bilinear interpolation, whereas the land cover type parameter is extracted at the point directly.
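The two extraction modes can be sketched as follows. This is a minimal illustration, assuming each raster is a list of rows and the control point has already been converted to fractional column/row coordinates; the function names `bilinear_sample` and `direct_sample` are hypothetical and do not come from the patent.

```python
import math

def bilinear_sample(grid, x, y):
    """Bilinearly interpolate a continuous raster at fractional column x, row y."""
    rows, cols = len(grid), len(grid[0])
    if not (0 <= x <= cols - 1 and 0 <= y <= rows - 1):
        raise ValueError("point lies outside the raster")
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, cols - 1), min(y0 + 1, rows - 1)
    tx, ty = x - x0, y - y0
    # Blend along the row direction first, then between the two rows.
    top = grid[y0][x0] * (1 - tx) + grid[y0][x1] * tx
    bottom = grid[y1][x0] * (1 - tx) + grid[y1][x1] * tx
    return top * (1 - ty) + bottom * ty

def direct_sample(grid, x, y):
    """Read a categorical raster (e.g. land cover) at the nearest cell, unblended."""
    return grid[int(math.floor(y + 0.5))][int(math.floor(x + 0.5))]

# Illustrative values only: a 2x2 slope raster and a 2x2 land cover raster.
slope_grid = [[0.0, 10.0], [20.0, 30.0]]
cover_grid = [[10, 20], [30, 40]]
sl = bilinear_sample(slope_grid, 0.5, 0.5)  # 15.0
gl = direct_sample(cover_grid, 0.9, 0.2)    # 20
```

Land cover codes are class labels, so averaging them would be meaningless, which is why they are read directly. Aspect is an angle, and interpolating it across the 0/360 degree boundary needs care; the source does not state how that case is handled.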

In step b, the elevation error correction model is as follows:

H_corrected = H_original + H_error

H_error = f_PSO-RF(Lat, Lon, Sl, As, Re, Gl)

where H_corrected is the elevation of the corrected digital elevation model, H_original is the elevation of the original digital elevation model, H_error is the predicted elevation error of the digital elevation model, PSO is the particle swarm algorithm, RF is the random forest algorithm, Lat and Lon are latitude and longitude respectively, and Sl, As, Re, and Gl are the slope, aspect, topographic relief, and land cover type parameters respectively. The hyperparameters are optimized by the PSO algorithm, and the RF elevation error model with the optimal parameter combination is obtained by training on the training set. The input variables [Lat, Lon, Sl, As, Re, Gl] of each SRTM pixel are then used to predict the SRTM elevation error H_error, and the corrected SRTM elevation H_corrected is obtained by combining the predicted elevation error with the original SRTM elevation H_original.

In step d, the specific flow of model training and correction is shown in FIG. 1. The original training set is first randomly divided into 5 groups, of which 4 groups are used for training a random forest model and 1 group is used for model accuracy verification, wherein the random forest model is as follows:

H_error-RF = f_RF(Lat, Lon, Sl, As, Re, Gl)

The verification evaluation index is the mean squared error regression loss (MSE):

MSE = (1/N) · Σ_{i=1..N} (H_error-RF,i − H_error-ref,i)^2

where N is the number of data points used for verification, H_error-RF is the elevation error predicted by the random forest model, and H_error-ref is the elevation error of the digital elevation model relative to the reference elevation control point. Each group of data is used in turn for accuracy verification, and the average of the 5 accuracy verification results is taken as the fitness function value of the model; the smaller the fitness function value, the higher the model accuracy.
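The 5-group verification scheme can be sketched in plain Python as below. The helper names (`kfold_indices`, `cv_fitness`, `fit_mean`, `predict_mean`) are hypothetical, and the stand-in regressor simply predicts the training mean; it is not a random forest and is used only so the sketch runs without external libraries.

```python
import random

def kfold_indices(n, k=5, seed=0):
    """Randomly partition indices 0..n-1 into k groups."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def mse(predicted, observed):
    """Mean squared error between predicted and observed elevation errors."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)

def cv_fitness(X, y, fit, predict, k=5, seed=0):
    """Mean of the k validation MSEs; a smaller value means a more accurate model."""
    folds = kfold_indices(len(y), k, seed)
    scores = []
    for i, val_idx in enumerate(folds):
        # Train on the other k-1 groups, verify on group i.
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        model = fit([X[j] for j in train_idx], [y[j] for j in train_idx])
        predicted = predict(model, [X[j] for j in val_idx])
        scores.append(mse(predicted, [y[j] for j in val_idx]))
    return sum(scores) / k

# Stand-in regressor (NOT a random forest): always predicts the training mean.
def fit_mean(X, y):
    return sum(y) / len(y)

def predict_mean(model, X):
    return [model] * len(X)

# Toy data: 10 control points, 6 features each, constant elevation error of 2 m.
X = [[0.0] * 6 for _ in range(10)]
y = [2.0] * 10
fitness = cv_fitness(X, y, fit_mean, predict_mean)  # 0.0 for this toy data
```

In practice `fit` and `predict` would wrap a random forest regressor configured with the hyperparameter combination under evaluation.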

Next, the solution space range of the hyperparameters that influence the precision of the random forest model (the number of decision trees and the maximum number of feature variables available for node splitting) is determined. The particle swarm algorithm iteratively updates the velocity and position of the particles within this solution space, the fitness function values under different hyperparameter combinations are compared, and the hyperparameter combination with the minimum fitness function value is taken as the optimal hyperparameter combination of the random forest model. The random forest model with the optimal parameter combination is then trained on the whole training set.
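The hyperparameter search can be illustrated with a generic global-best particle swarm, sketched below. The patent does not give the swarm size, iteration count, inertia weight, or acceleration coefficients, so the values of `n_particles`, `n_iter`, `w`, `c1`, and `c2` are assumptions, as are the bounds and the toy fitness function, which stands in for the cross-validated MSE of a random forest.

```python
import random

def pso_minimize(fitness, bounds, n_particles=10, n_iter=30,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Search integer hyperparameters within bounds for the smallest fitness."""
    rng = random.Random(seed)

    def to_params(p):
        # Hyperparameters are integers, so positions are rounded before scoring.
        return tuple(int(round(v)) for v in p)

    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * len(bounds) for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [fitness(to_params(p)) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull to personal best + pull to global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Position update, clipped to the solution space.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = fitness(to_params(pos[i]))
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return to_params(gbest), gbest_val

# Toy fitness (assumption): stands in for the cross-validated MSE of a forest
# with params = (number of trees, max features per split).
def toy_fitness(params):
    n_trees, max_features = params
    return (n_trees - 120) ** 2 + 25 * (max_features - 3) ** 2

best, best_val = pso_minimize(toy_fitness, bounds=[(10, 500), (1, 6)])
```

With a real fitness function, each evaluation trains and verifies five random forests, so the swarm size and iteration count directly control the computational cost of the search.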

Finally, the trained model predicts the elevation error H_error of each pixel of the digital elevation model from the [Lat, Lon, Sl, As, Re, Gl] corresponding to that pixel, and the prediction is added to the original elevation H_original of the pixel to obtain the corrected elevation H_corrected, completing the elevation error correction of the digital elevation model.
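The per-pixel correction can be sketched as follows, using the additive convention stated in the text (corrected elevation = original elevation + predicted error). The function `correct_dem` and the stub predictor are hypothetical; in practice `predict_error` would be the trained random forest, and all numeric values here are illustrative.

```python
def correct_dem(h_original, features, predict_error):
    """Return H_corrected = H_original + H_error for every pixel.

    h_original: 2D list of elevations.
    features:   2D list of (Lat, Lon, Sl, As, Re, Gl) tuples, same shape.
    """
    corrected = []
    for h_row, f_row in zip(h_original, features):
        corrected.append([h + predict_error(f) for h, f in zip(h_row, f_row)])
    return corrected

# Stub predictor (assumption): a fixed error per land cover code.
def stub_predict_error(feature):
    gl = feature[5]
    return -2.0 if gl == 20 else 0.5

h_original = [[100.0, 102.0]]
features = [[(38.9, 121.6, 5.0, 180.0, 12.0, 20),
             (38.9, 121.7, 2.0, 90.0, 4.0, 10)]]
h_corrected = correct_dem(h_original, features, stub_predict_error)
# h_corrected == [[98.0, 102.5]]
```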

The beneficial effects of the invention are as follows. Correction with the particle swarm optimization random forest method further improves the accuracy of the corrected digital elevation model. In the test, SRTM was selected as the digital elevation model, ICESat-2 strong-beam ground photon data as the reference elevation control points, Globeland30 as the global land cover data, and the airborne LiDAR DTM released by NEON as the validation data. The method of the invention (PSO-RF) was tested against a polynomial regression-based correction method (PR), with root mean square error (RMSE) as the validation index. The results show that the method effectively reduces the elevation error of SRTM: the elevation error of the corrected SRTM is 42%-46% smaller than before correction, and the correction accuracy is superior to that of the polynomial regression-based method.
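For reference, the validation index and the relative error reduction can be computed as in the sketch below. The numbers are made up for illustration and are not the patent's experimental data; the helper names are hypothetical.

```python
import math

def rmse(dem, reference):
    """Root mean square error of DEM elevations against reference elevations."""
    return math.sqrt(sum((d - r) ** 2 for d, r in zip(dem, reference)) / len(reference))

def error_reduction_percent(rmse_before, rmse_after):
    """Relative RMSE reduction achieved by the correction, in percent."""
    return 100.0 * (rmse_before - rmse_after) / rmse_before

# Made-up elevations (metres) at four validation points.
reference = [0.0, 0.0, 0.0, 0.0]
original = [3.0, -3.0, 3.0, -3.0]
corrected = [1.5, -1.5, 1.5, -1.5]

before = rmse(original, reference)                    # 3.0
after = rmse(corrected, reference)                    # 1.5
reduction = error_reduction_percent(before, after)    # 50.0
```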

Drawings

FIG. 1 is a flow chart for correcting the elevation error of a digital elevation model.

FIG. 2 is an evaluation of the elevation accuracy of the original SRTM, the PR-corrected SRTM, and the PSO-RF-corrected SRTM for three lines.

Detailed Description

The following detailed description of the invention is further illustrated in conjunction with the examples and the accompanying drawings, but is not intended to limit the invention.

The invention is implemented on a computer, which carries out the elevation error correction of the digital elevation model based on particle swarm optimization random forest. Taking an SRTM digital elevation model, ICESat-2 satellite altimetry data, and Globeland30 land cover data for a certain area as an example, the method for correcting the elevation error of the SRTM comprises the following steps:

Step a, reading the ICESat-2 satellite altimetry data, selecting photons classified as ground (classed_pc_flag=1) as reference elevation control points, and obtaining the elevation error of SRTM relative to the reference elevation control points at those points.

Step b, performing terrain analysis on the SRTM to obtain slope, aspect, and topographic relief data, extracting the terrain parameters [Sl, As, Re] at each reference elevation control point by bilinear interpolation, and directly taking the value of the Globeland30 pixel containing the reference elevation control point as the land cover type parameter Gl.
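The patent does not say which algorithm derives slope, aspect, and relief from the SRTM. The sketch below assumes central differences for the gradient and a 3x3 window for relief; it also assumes a square cell size in metres and that row 0 is the northern edge. The function name `terrain_at` is hypothetical.

```python
import math

def terrain_at(dem, r, c, cell_size):
    """Slope (deg), aspect (deg clockwise from north), relief (m) at interior cell (r, c)."""
    # Central differences; row index increases southward.
    dz_dx = (dem[r][c + 1] - dem[r][c - 1]) / (2 * cell_size)  # eastward gradient
    dz_dy = (dem[r - 1][c] - dem[r + 1][c]) / (2 * cell_size)  # northward gradient
    slope = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    # Aspect is the downslope direction, i.e. the azimuth of the negative gradient.
    # It is not meaningful on flat cells, where both gradients are zero.
    aspect = math.degrees(math.atan2(-dz_dx, -dz_dy)) % 360.0
    # Relief: elevation range within the 3x3 neighbourhood (window size is an assumption).
    window = [dem[i][j] for i in (r - 1, r, r + 1) for j in (c - 1, c, c + 1)]
    relief = max(window) - min(window)
    return slope, aspect, relief

# Illustrative plane rising 30 m per 30 m cell toward the east.
dem = [[0.0, 30.0, 60.0],
       [0.0, 30.0, 60.0],
       [0.0, 30.0, 60.0]]
sl, asp, re_ = terrain_at(dem, 1, 1, cell_size=30.0)
# slope is 45 degrees, the surface faces west (aspect 270), relief is 60 m
```

Because SRTM is distributed in geographic coordinates, the cell size in metres varies with latitude; a real workflow would reproject the DEM or scale the east-west spacing accordingly.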

Step c, constructing a digital elevation model elevation error correction model based on particle swarm optimization random forest.

Step d, using the longitude and latitude, terrain parameters, and land cover type parameter at the reference elevation control points as the input data of the correction model, using the elevation error as the target data of the correction model, and establishing a training set for training the model.

Step e, the original training set is randomly divided into 5 groups, of which 4 groups are used for training a random forest model and 1 group is used for model accuracy verification, the random forest model being H_error-RF = f_RF(Lat, Lon, Sl, As, Re, Gl). Each group of data is used in turn for accuracy verification, and the average of the 5 accuracy verification results is taken as the fitness function value of the model; the smaller the fitness function value, the higher the model accuracy. Next, the solution space range of the hyperparameters that influence the precision of the random forest model (the number of decision trees and the maximum number of feature variables available for node splitting) is determined. The particle swarm algorithm iteratively updates the velocity and position of the particles within this solution space, the fitness function values under different hyperparameter combinations are compared, and the hyperparameter combination with the minimum fitness function value is taken as the optimal hyperparameter combination of the random forest model. The random forest model with the optimal parameter combination is then trained on the whole training set.

Step f, the trained model predicts the elevation error H_error of each pixel of the digital elevation model from the [Lat, Lon, Sl, As, Re, Gl] of that pixel, and the prediction is added to the original elevation H_original of the pixel to obtain the corrected elevation H_corrected, completing the elevation error correction of the digital elevation model.

The present application has been described in terms of embodiments, and it will be appreciated by those of skill in the art that various changes can be made to the features and embodiments, or equivalents can be substituted, without departing from the spirit and scope of the application. In addition, many modifications may be made to adapt a particular situation or material to the teachings of the application without departing from the essential scope thereof. Therefore, it is intended that the application not be limited to the particular embodiment disclosed, but that the application will include all embodiments falling within the scope of the appended claims.

Claims

1. A method for correcting elevation errors in a digital elevation model based on particle swarm optimization and random forest, characterized in that the method includes the following steps:

a. Extract reference elevation control points from altimetry satellite data, calculate the elevation error of the digital elevation model relative to the reference elevation control points, and extract the latitude and longitude, terrain parameters, and land cover type parameters corresponding to the reference elevation control points;

b. Construct a digital elevation model elevation error correction model based on particle swarm optimization random forest.

The elevation error correction model is as follows:

H_corrected = H_original + H_error

H_error = f_PSO-RF(Lat, Lon, Sl, As, Re, Gl)

In the formula, H_corrected is the elevation of the corrected digital elevation model, H_original is the elevation of the original digital elevation model, H_error is the predicted elevation error of the digital elevation model, PSO is the particle swarm optimization algorithm, and RF is the random forest algorithm; Lat and Lon are latitude and longitude, respectively, and Sl, As, Re, and Gl are slope, aspect, topographic relief, and land cover type parameters, respectively. Through parameter optimization of the PSO algorithm, the RF elevation error model with the optimal parameter combination is obtained by training with the training set. The input variables [Lat, Lon, Sl, As, Re, Gl] of each SRTM pixel are used to predict the SRTM elevation error result H_error. Then, the corrected SRTM elevation H_corrected is obtained by combining the predicted elevation error with the original SRTM elevation H_original.

c. Use the latitude and longitude, topographic parameters, and land cover parameters at the reference elevation control point as input data for the correction model, and the elevation error as the target data for the correction model to establish a training set for the training model.

d. Train the correction model using the training set, and apply the digital elevation model to the trained correction model to correct the elevation error.

2. The method for correcting elevation errors in a digital elevation model based on particle swarm optimization and random forest as described in claim 1, characterized in that, in step a, the extracted terrain parameters are the slope Sl, aspect As, and topographic relief Re of the digital elevation model, and the extracted land cover type parameters are derived from global land cover data Gl; for the calculation of terrain parameters corresponding to reference elevation control points, bilinear interpolation is used to extract them at the reference elevation control points; and for the land cover type parameters, they are directly extracted at the reference elevation control points.

3. The method for correcting elevation errors in a digital elevation model based on particle swarm optimization and random forest as described in claim 1 or 2, characterized in that, in step d, the specific process of model training and correction is as follows: First, the original training set is randomly divided into 5 groups, of which 4 groups are used for training the random forest model and 1 group is used for model accuracy verification, wherein the random forest model is:

H_error-RF = f_RF(Lat, Lon, Sl, As, Re, Gl)

The validation and evaluation index uses the mean squared error regression loss (MSE):

MSE = (1/N) · Σ_{i=1..N} (H_error-RF,i − H_error-ref,i)^2

In the formula, N is the number of data points used for validation, H_error-RF is the elevation error predicted by the random forest model, and H_error-ref is the elevation error of the digital elevation model relative to the reference elevation control point. Each group of data is used for accuracy verification in turn. Finally, the mean of the five accuracy verification results is used as the fitness function value of the model. The smaller the fitness function value, the higher the model accuracy.

Then, the solution space range of the hyperparameters that affect the accuracy of the random forest model is determined. The velocity and position of the particles in the solution space are iteratively updated by the particle swarm algorithm. The fitness function values under different combinations of hyperparameters are compared. The hyperparameter combination with the smallest fitness function value is searched. This hyperparameter combination is taken as the optimal hyperparameter combination of the random forest model. The random forest model with the optimal parameter combination is trained using the entire training set.

Finally, the trained model predicts the elevation error H_error of each pixel based on [Lat, Lon, Sl, As, Re, Gl], and adds the original elevation H_original to obtain the corrected elevation H_corrected, thus completing the elevation error correction of the digital elevation model.

4. The elevation error correction method for a digital elevation model based on particle swarm optimization and random forest according to claim 1, characterized in that, in step d, the specific process of model training and correction is as follows: First, the original training set is randomly divided into 5 groups, of which 4 groups are used for training the random forest model and 1 group is used for model accuracy verification, wherein the random forest model is:

H_error-RF = f_RF(Lat, Lon, Sl, As, Re, Gl)