Article

Validation of Multi-Temporal Land-Cover Products Considering Classification Error Propagation

1
College of Surveying and Geo-Informatics, Tongji University, Shanghai 200092, China
2
Shanghai Key Laboratory of Space Mapping and Remote Sensing for Planetary Exploration, Tongji University, Shanghai 200092, China
3
Shanghai Institute of Intelligent Science and Technology, Tongji University, Shanghai 200092, China
*
Author to whom correspondence should be addressed.
Remote Sens. 2024, 16(16), 2968; https://doi.org/10.3390/rs16162968
Submission received: 5 July 2024 / Revised: 10 August 2024 / Accepted: 12 August 2024 / Published: 13 August 2024
Figure 1. Flowchart of time-series sampling.
Figure 2. Minimum sample size calculated by combining the predicted user precision error rate and significance level.
Figure 3. Relationship between strata and map classes.
Figure 4. Misclassification propagation of pixels in the changed strata. The unchanged strata include pixels 1 to 6. The changed strata include pixels 7 to 9. ① indicates that the map and reference data in the changed strata are consistent in the two years; ② is the inconsistency between the map class and the reference in the year (T + 1); Light green means grass; Dark green means forest.
Figure 5. Interpretation results for the ESA CCI LC data in Google Earth. The red frame refers to the area range of pixel.
Figure 6. Spatial layout of samples in the base year.
Figure 7. Proportion of feature categories between the randomly revisited samples and total samples.
Figure 8. Spatial distribution of the samples. (a) Samples in 2010. (b–f) Samples randomly revisited in the years from 2011 to 2015.
Figure 9. Single-year precision results for the average of 10 times extracted from the single-temporal samples in 2015. (a) Difference between the accuracy when eliminating the misclassification and including the misclassification. (b) Difference between the single-year accuracy and single-temporal accuracy. WMC represents the difference between the accuracy when eliminating misclassification and the single-temporal accuracy (reference value). IMC represents the difference between the accuracy when including the misclassification and the single-temporal accuracy.
Figure 10. Single-year precision results of the average of ten times extracted from single-temporal samples in 2010. (a) Difference between the accuracy of eliminating the misclassification and including the misclassification. (b) Difference between the single-year accuracy and single-temporal accuracy.
Figure 11. Single-year precision results of the average of ten times extracted from single-temporal samples in 2010.
Figure 12. The accuracy of the ESA CCI LC for 2010–2015.
Figure 13. User’s accuracy and producer’s accuracy of each class. (a) Crop; (b) Forest; (c) Grass; (d) Shrub; (e) Water; (f) Bareland; (g) Urban; (h) Snow; (i) Sparse Veg.
Figure 14. Stability of the producer’s accuracy for the ESA CCI LC product.
Figure 15. Stability of the user’s accuracy for the ESA CCI LC product.

Abstract

Reducing the lag in the accuracy assessment of multi-temporal land-cover products has been a hot research topic. By identifying the changed strata, the annual accuracy in multi-temporal products can be quickly evaluated. However, there are still two limitations in the accuracy assessment of multi-temporal products. Firstly, the setting of the parameters (e.g., the total sample size, allocation of samples in the changed strata, etc.) in the fundamental sampling design is not based on specific setting criteria. Therefore, this evaluation method is not always applicable when the product or research area changes. Secondly, the accuracy evaluation of multi-temporal products does not consider the influence of misclassification. This can lead to an overestimation of the accuracy of changed strata in single-year evaluations. In this paper, we describe how the total sample and the assignment of samples in every stratum can be adjusted according to the characteristics of the land-cover product, which improves the applicability of the evaluation. The samples in the changed strata that propagate misclassification are essentially pixels that have not undergone any land-cover change. Therefore, in order to eliminate the propagation of this inter-annual classification error, the misclassified samples are reclassified as unchanged strata. This method was used in the multi-temporal ESA CCI land-cover product. The experimental results indicate that the single-year accuracy, considering classification error, is closer to the traditional evaluation accuracy of single-temporal data. For the categories with a small ratio of unchanged strata samples to changed strata samples, the accuracy improvement, after eliminating the classification errors, is more obvious. For the urban class, in particular, the misclassification affects its estimated accuracy by 9.72%.


1. Introduction

Land cover is the basis of human existence, and therefore, research on this subject has always been topical [1,2,3,4]. Land-cover change presents a clear picture of the processes of interaction between humanity and nature, helping us to understand and analyze the range of globalization issues caused by human social processes [5,6,7,8,9,10]. For example, the identification of land-cover change can be used to monitor the impact of climate change [11], population increase [12], and ecological change [13], and allows us to take timely and appropriate conservation measures [14].
The accuracy assessment of land-cover data is the path to their practical applications [15]. However, the early land-cover data products could not be strictly evaluated due to the lack of reference data, which limited their practical applications [16,17]. With the emergence of more and more large-scale and even global land-cover products, many studies have stressed the priority of assessing the accuracy of land-cover data products [18,19]. Sampling is an indispensable means of accuracy estimation, which involves selecting representative samples and inferring the accuracy of the data product through probabilistic statistics [20]. At present, there are more than 20 different global land-cover datasets [21], for example, IGBP DISCover [22], GLC2000 [23,24], GLC-SHARE [25], GlobCover [26], FROM-GLC [27], GlobeLand30 [28,29], and so on. However, these products cannot capture short-term land-cover change. With advances in medium- and high-resolution remote sensing technology, it is now possible to produce multi-temporal land-cover data at a global scale. Multi-temporal global land-cover data products have become the key data source for many related applications, and they can be used to evaluate short-term and continuous change information of feature attributes [30,31]. The current state of research on the accuracy evaluation of multi-temporal land-cover products is as follows: ➊ Reducing the evaluation lag in multi-temporal land-cover data [32,33,34,35,36]. To quickly estimate accuracy for all years in a multi-temporal land-cover data product, Wickham et al. (2013) created changed strata using change information from the product. This approach estimates single-year accuracy when map labels and stratum labels cannot be matched for all pixels [33]. Tsendbazar et al. (2021) built a continuous verification framework for map accuracy evaluation, which was applied to the verification of the Copernicus Global Land Service GLC products [36]. Despite the availability of many time-series land-cover products, timely validation is challenging because it requires up-to-date reference labels for a large number of samples. This framework regularly updates the entire validation dataset by making small partial updates to invariant regions and adding new samples to changed regions. This approach improves the dataset’s utilization and significantly reduces the lag in product evaluation. ➋ Monitoring the change of a single class in different periods [36,37,38]. For example, Tang et al. (2019) conducted near real-time monitoring of tropical forest disturbance from 2013 to 2015 and evaluated the results using stratified random sampling [37]. Potapov et al. (2021) examined the global variations in farmland area over the first 20 years of the 21st century based on satellite data, estimated the area, and calculated the accuracy according to the stable and changed regions of farmland [38]. Druce et al. (2021) advanced the satellite-based dynamic monitoring of surface water by fusing optical and synthetic aperture radar (SAR) data. The results were then verified by stratified random sampling and the construction of permanent water, seasonal water, and non-water layers [39]. ➌ Improving the stratification efficiency of land-cover changes [34,35,40,41,42,43]. Arévalo et al. (2020) created six stable layers and five change layers to monitor and estimate the land change activity areas. They also added buffer layers to reduce the impact of omission errors in area estimation [42]. Gong et al. (2023) analyzed a stratified sampling method for multi-temporal land-cover data that considers the frequency of change and the conversion types. Moreover, binary coding was used to reduce the complexity of the stratification [43].
All these methods mentioned above have their own characteristics and advantages in multi-temporal accuracy assessment, but there are still some problems. Firstly, strict accuracy assessment should be considered as an important component of land-cover products [44]. Therefore, sampling design is needed to guide accuracy assessment practices. Repeatability is an important criterion that cannot be ignored in sampling design [45,46]. Some methods do not provide clear parameter settings for the total sample size and specific allocation of the sample size (i.e., the allocation methods for changed and unchanged strata), which greatly reduces their adaptability. Secondly, in multi-temporal land-cover accuracy evaluation, land cover often changes, causing inconsistencies between the constructed strata and map classes. As a result, traditional single-temporal accuracy evaluation methods cannot be applied. Although corresponding multi-temporal accuracy evaluation methods are available [32,33,34,35,36], they ignore the propagation of the classification error between years. Therefore, in this paper, in view of the above problems, we clearly provide the specific settings of the parameters required for sample design. The total sample size and the assignment of the samples in every stratum can be adjusted according to the characteristics of the land-cover product. Classification error, in essence, refers to the samples in the changed strata that have not, in fact, changed. Current evaluation methods overestimate the precision of change strata by not accounting for the classification errors [47,48], which would affect the authenticity of the multi-temporal single-year accuracy. In order to eliminate the spread of misclassification in the changed strata, we reclassify the misclassified samples into the unchanged strata.
In this study, based on a continuous verification framework, we considered the propagation of misclassification in the changed strata and evaluated the accuracy of the ESA CCI land cover (LC) data from 2010 to 2015. This process involved (1) the evaluation of single-year accuracy from 2010 to 2015, (2) the quantitative analysis of the impact of misclassification on the single-year accuracy in the multi-temporal product, and (3) the evaluation of the stability of the time-series product.

2. Methodology

We optimized an operational verification framework for annual time-series products, with the aims of preventing misclassification propagation and quickly obtaining a more reliable accuracy estimate for each year. This method is called time-series sampling. It includes two important parts: (1) a sampling design for determining the sample size, allocating samples (unchanged and changed strata), and conducting the evaluation analysis; and (2) an evaluation of the accuracy of the remaining years of the time-series land-cover product by partially updating the total samples for the base year and supplementing the samples in the changed regions of the land-cover product. Moreover, the misclassifications in the changed strata samples are eliminated. The flowchart of time-series sampling is illustrated in Figure 1 (the land-cover product is updated annually, and the range of the years is from year n to year n + k).

2.1. Sampling Design

The design of the complete sampling scheme includes the determination of the target population, the formulation of the sampling method, and the consideration of the requirements of accuracy and reliability.

2.1.1. Determination of the Total Samples

Increasing the sample size provides more data for estimating the parameters and, therefore, increases the reliability of the estimates. This is because a larger sample can better reflect the characteristics of the population and reduce the impact of random errors on the estimation results. Under the same conditions, large samples yield more precise estimates than small samples [49,50]. However, the cost increases with the sample size. Therefore, the sample size should be minimized while still assessing the population accurately, with the aim of balancing inspection costs and accuracy. Xie et al. (2015) developed an improved sample size estimation model based on the above principles [51], as follows:
$$n = \frac{\dfrac{\mu_{1-\alpha/2}^{2} \times q_e}{r^{2}\,(1 - q_e)}}{1 + \dfrac{1}{N}\left(\dfrac{\mu_{1-\alpha/2}^{2} \times q_e}{r^{2}\,(1 - q_e)} - 1\right)} \tag{1}$$
$$q_e = \sum_{i=1}^{m} \omega_i \cdot q_{ei} \tag{2}$$
where $q_e$ refers to the expected classification accuracy, $m$ denotes the number of map categories, $\omega_i$ represents the area ratio of the $i$th map category, $r$ refers to the maximum relative error between the estimated and true values, $q_{ei}$ refers to the expected classification accuracy of the $i$th map category and is obtained empirically from pre-experiments, $\mu_{1-\alpha/2}$ signifies the boundary value of the standard normal distribution at the confidence level of $1 - \alpha/2$, and $n$ refers to the total sample.
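As a rough illustration of Equations (1) and (2), the following minimal sketch computes the area-weighted expected accuracy and the resulting total sample size. It assumes that Equation (1) has the form of an initial size divided by a finite-population correction, and that N is the population size (number of pixels). The function names and the input values used in any call are hypothetical and are not taken from the paper.

```python
from statistics import NormalDist


def expected_accuracy(area_ratios, class_accuracies):
    """Equation (2): area-weighted expected accuracy q_e."""
    return sum(w * q for w, q in zip(area_ratios, class_accuracies))


def total_sample_size(q_e, r, population, alpha=0.05):
    """Equation (1), read as n0 / (1 + (n0 - 1) / N) (assumed form)."""
    # Boundary value of the standard normal distribution at 1 - alpha/2.
    mu = NormalDist().inv_cdf(1 - alpha / 2)
    # Initial sample size before the finite-population correction.
    n0 = mu**2 * q_e / (r**2 * (1 - q_e))
    return n0 / (1 + (n0 - 1) / population)
```

For a very large population the correction term vanishes and the result approaches the initial size, which is the behaviour one would expect for a global product.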

2.1.2. Selection of Sampling Method

To select representative samples and better adjust the sample size when updating the samples in the time series, the stratified sampling method is selected, and stratification is conducted according to the map categories. To improve the accuracy for the rare classes [52], the following sample distribution method is adopted:
$$n_i = \left[\, p_i \times \left(\frac{n}{2}\right) + \left(\frac{1}{k}\right) \times \left(\frac{n}{2}\right) \right] \tag{3}$$
where $n_i$ is the number of samples to be allocated in stratum $i$, $p_i$ is the percentage of area in map category $i$, and $k$ is the number of map categories.
This model adaptively determines the total samples and allocates the samples according to the map categories and the area ratio of each category of the products to be evaluated. Therefore, it can be applied to different multi-temporal land-cover products that may have different map categories.
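The allocation rule of Equation (3) can be sketched as follows: half of the total sample is distributed in proportion to area and the other half is split equally among the categories. This is a minimal sketch; the square brackets in the equation are assumed here to mean rounding to the nearest integer, and the function name is hypothetical.

```python
def allocate_samples(area_ratios, n_total):
    """Equation (3): half proportional to area, half shared equally.

    area_ratios maps each map category to its area share (summing to 1).
    Rounding to the nearest integer is an assumption about the brackets.
    """
    k = len(area_ratios)
    half = n_total / 2
    return {name: round(p * half + half / k) for name, p in area_ratios.items()}
```

With two categories covering 90% and 10% of the area and 1000 samples in total, the rare category receives 300 samples rather than the 100 it would get under purely proportional allocation.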

2.1.3. Determination of the Samples for the Changed Strata

In addition to partial revisits of the base year between adjacent years, it is necessary to extract some samples in the changed areas of two images to ensure an accurate assessment of the land-cover change. Before confirming the samples in the changed strata, the minimum number of samples required for every stratum of the changed strata needs to be determined [53,54,55]. The specific calculation is shown in Equation (4). Because of the flexibility of stratified sampling for sample adjustment, Equations (4) and (5) are used in the stratified sampling as follows:
$$n_c = \frac{p_c\,(1 - p_c)}{\sigma_c^{2}}, \quad c = 1, \ldots, L \tag{4}$$
where $n_c$ refers to the sample size required for stratum $c$ of the changed strata, $p_c$ refers to the expected user precision error rate for stratum $c$, $L$ refers to the number of changed strata, and $\sigma_c$ refers to the accepted standard error of the error of commission for stratum $c$.
As shown in Figure 2, for an accepted standard error of the error of commission ($\sigma_c$) of 0.05, a sample size of at least 100 is required to ensure the expected user accuracy.
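The figure of 100 follows from Equation (4) at the worst case p_c = 0.5, where p_c(1 - p_c) reaches its maximum of 0.25. A minimal sketch of that arithmetic is given below; the function name is hypothetical, and rounding up to a whole sample is an assumption.

```python
import math


def min_changed_stratum_samples(p_c, sigma_c):
    """Equation (4): minimum samples for one changed stratum."""
    raw = p_c * (1 - p_c) / sigma_c**2
    # Round away floating-point noise before taking the ceiling.
    return math.ceil(round(raw, 9))
```

Any other error rate gives a smaller requirement at the same standard error, so 100 samples per changed stratum is a conservative floor.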
However, such a sample distribution would undermine the statistical analysis of the unchanged area due to the scarcity of land-cover changes in most applications. Therefore, the sample allocation for every stratum in the stratified sample needs to be combined with the unchanged strata, as shown in Equation (5) [53,54]:
$$n_{Tc} = n\,\frac{N_c\, p_c}{\sum_{k=1}^{L} N_k\, p_k} \tag{5}$$
where $p_c$ represents the estimated error rate of the changed strata, $N_c$ represents the total number of pixels in the changed strata, $\sum_{k=1}^{L} N_k p_k$ is the sum over the unchanged and changed strata, and $n_{Tc}$ refers to the total sample size allocated to the changed strata when considering the unchanged region. After obtaining the total sample size for the changed strata, if $n_{Tc}$ is less than $L_c \times 100$, then $n_{Tc}$ is set to $L_c \times 100$; otherwise, the value of $n_{Tc}$ remains unchanged. $L_c$ denotes the number of changed strata.
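The allocation in Equation (5), together with the floor of 100 samples per changed stratum, can be sketched as below. This sketch treats the changed strata and the unchanged strata as two groups, each with a pixel count and an assumed error rate; that grouping, the rounding, and all names are assumptions made for illustration.

```python
def changed_strata_samples(n, changed_pixels, unchanged_pixels,
                           p_changed, p_unchanged,
                           num_changed_strata, floor_per_stratum=100):
    """Equation (5) with the per-stratum floor applied afterwards."""
    weight_changed = changed_pixels * p_changed
    weight_total = weight_changed + unchanged_pixels * p_unchanged
    n_tc = n * weight_changed / weight_total
    # If the share is below L_c * 100, raise it to that floor.
    return max(round(n_tc), floor_per_stratum * num_changed_strata)
```

Because land-cover change is usually rare, the floor will often be the binding constraint, which is the situation the threshold is meant to handle.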
The construction of changed strata in this paper is consistent with the evaluation of NLCD accuracy [34], including urban gain, forest loss, forest gain, agriculture loss, agriculture gain, and catch-all (other changed strata, besides urban, forest, and crop, are all in the catch-all stratum). Each pixel belongs to exactly one stratum. For example, if a pixel appears in the urban gain stratum, it cannot also appear in any of the loss strata.

2.1.4. Evaluation of the Accuracy

With the emergence of multi-temporal products, there will be inconsistency between the strata and map classes (Figure 3). The traditional accuracy evaluation approaches for single-temporal products are not suitable for the single-year accuracy evaluation of multi-temporal products.
In view of the inconsistency between strata and categories, Stehman (2014) proposed calculating the accuracy for a single year by combining the unchanged strata and changed strata [56]. When combining the changed strata to calculate the accuracy of two adjacent land-cover data, the accuracy of the map class for the previous year is calculated from the previous year’s class plus the loss stratum for that class; the latter year’s accuracy is calculated from the previous year’s class combined with the gain stratum for that class.
$$OA = \sum_{i=1}^{k} \frac{N_i}{N} \times \bar{y}_i \tag{6}$$
where $\bar{y}_i = \sum_{u \in i} y_u / n_i$ is the proportion of correctly classified samples in stratum $i$. The range of $\bar{y}_i$ is [0, 1]. If sample pixel $u$ is classified correctly, then $y_u = 1$; otherwise, $y_u = 0$. $u \in i$ indicates that the sample pixel $u$ is selected from stratum $i$, and $k$ denotes the number of strata. $N$ is the total number of pixels in the region of interest, whereas $N_i$ is the number of pixels in stratum $i$.
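Equation (6) is a stratum-weighted mean of per-stratum correctness. A minimal sketch is given below, assuming each stratum is supplied as a pair of its pixel count and the list of 0/1 correctness indicators of its samples; the data layout and function name are hypothetical.

```python
def overall_accuracy(strata):
    """Equation (6): sum over strata of (N_i / N) * mean(y_u).

    strata is a list of (N_i, [y_u, ...]) pairs; N is taken as the
    sum of the N_i, i.e. the strata are assumed to cover the region.
    """
    total_pixels = sum(n_pixels for n_pixels, _ in strata)
    return sum(n_pixels / total_pixels * (sum(ys) / len(ys))
               for n_pixels, ys in strata)
```

For example, a stratum of 900 pixels with 3 of 4 samples correct and a stratum of 100 pixels with 1 of 2 samples correct combine to 0.9 × 0.75 + 0.1 × 0.5 = 0.725.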
The user’s accuracy and the producer’s accuracy are calculated using the following equations:
$$R = \frac{\sum_{i=1}^{k} N_i \times \bar{y}_i}{\sum_{i=1}^{k} N_i \times \bar{x}_i} \tag{7}$$
$$y_u = \begin{cases} 1, & u \in \text{condition } A \\ 0, & u \notin \text{condition } A \end{cases} \tag{8}$$
$$x_u = \begin{cases} 1, & u \in \text{condition } B \\ 0, & u \notin \text{condition } B \end{cases} \tag{9}$$
For the user’s accuracy, $A$ denotes that the pixel $u$ is classified correctly and has map class $i$, and $B$ indicates that the pixel $u$ has map class $i$. For the producer’s accuracy, $A$ denotes that the pixel $u$ is classified correctly and has reference class $i$, and $B$ indicates that the pixel $u$ has reference class $i$. $\bar{x}_i$ is the mean of $x_u$ for stratum $i$, and $\bar{y}_i$ is the mean of $y_u$ for stratum $i$.
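The ratio estimator for user’s and producer’s accuracy can be sketched as follows. The sketch assumes each sample is stored as a (map class, reference class) pair and that a pixel counts as correctly classified when the two labels agree; the function name, data layout, and class labels are hypothetical.

```python
def class_accuracy(strata, cls, kind="user"):
    """Ratio estimator R for one class.

    strata is a list of (N_i, [(map_class, ref_class), ...]) pairs.
    kind="user" conditions on the map class, "producer" on the reference.
    """
    numerator = denominator = 0.0
    for n_pixels, samples in strata:
        # x_u: condition B, the pixel has class cls (map or reference).
        x = [(m if kind == "user" else r) == cls for m, r in samples]
        # y_u: condition A, condition B holds and the labels agree.
        y = [xu and m == r for xu, (m, r) in zip(x, samples)]
        numerator += n_pixels * sum(y) / len(samples)
        denominator += n_pixels * sum(x) / len(samples)
    return numerator / denominator
```

The denominator is zero when no sample carries the class, so a caller would need to handle that case separately.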
However, this accuracy evaluation method does not consider the propagation of classification errors. As shown in Figure 4, there are two situations in which the map is inconsistent with the reference data in the changed strata: ② and ③. Situation ② is an inconsistency between the map class and the reference in year T + 1, and situation ③ is an inconsistency in year T. Situation ① indicates that the map and reference data in the changed strata are consistent in both years. According to Equation (6), calculating the single-year overall accuracy of a multi-temporal product requires counting the samples correctly interpreted in year T + 1 in all the unchanged strata and gain strata (taking year T + 1 as an example). However, as can be seen from Figure 4, although pixel 7 is correctly interpreted in year T + 1, this sample pixel propagates the misclassification in 2010. The samples that propagate the misclassification are actually pixels whose reference data remain constant over the two years (i.e., samples that belong in the unchanged strata). Therefore, in the proposed approach, the samples that propagate the misclassification in the changed strata are first screened out. These samples are then merged into the unchanged strata to obtain a more accurate single-year accuracy.
The revised equation is as follows:
$$OA = \frac{\sum_{i=1}^{k} \left\{ \left[ \left( N_{unchange\_i} + \dfrac{n_{m\_change\_i}}{n_{original\_change\_i}} \times N_{change\_i} \right) \times \bar{y}_{(unchange\_i + m\_change\_i)} \right] + \left[ \dfrac{n_{nm\_change\_i}}{n_{original\_change\_i}} \times N_{change\_i} \times \bar{y}_{nm\_change\_i} \right] \right\}}{N} \tag{10}$$
where $n_{original\_change}$ refers to the total samples of the changed strata, $n_{nm\_change}$ refers to the samples of the changed strata without misclassification, $n_{m\_change}$ refers to the sample size in the changed strata that propagate the misclassification, $N_{change}$ refers to the total number of pixels in the map of the changed strata, and $N_{unchange}$ refers to the total number of pixels in the unchanged strata.
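One way to read the revised estimator is sketched below: the changed-stratum samples flagged as propagating misclassification are pooled with the unchanged samples, and the changed-stratum pixel count is split between the two groups in proportion to the sample counts. The data layout, the flag name, and the choice to take N as the sum of the stratum pixel counts are assumptions made for illustration.

```python
def revised_overall_accuracy(classes):
    """Revised overall accuracy with propagating samples moved.

    Each entry of classes is a dict with keys:
      N_unchange, N_change : pixel counts of the two strata
      unchanged_y          : list of 0/1 indicators for unchanged samples
      changed              : list of (y_u, propagates) for changed samples
    """
    total_pixels = sum(c["N_unchange"] + c["N_change"] for c in classes)
    weighted = 0.0
    for c in classes:
        moved = [y for y, propagates in c["changed"] if propagates]
        kept = [y for y, propagates in c["changed"] if not propagates]
        n_original = len(c["changed"])
        # Unchanged stratum grows by the share of moved samples.
        pooled = c["unchanged_y"] + moved
        w_unchanged = c["N_unchange"] + len(moved) / n_original * c["N_change"]
        weighted += w_unchanged * sum(pooled) / len(pooled)
        # Remaining changed stratum keeps the rest of the pixel weight.
        if kept:
            w_changed = len(kept) / n_original * c["N_change"]
            weighted += w_changed * sum(kept) / len(kept)
    return weighted / total_pixels
```

When no sample is flagged, the expression reduces to the unrevised stratum-weighted overall accuracy over the unchanged and changed strata.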

2.2. Process of Updating and Supplementing Time-Series Samples

The specific steps of the sample update and supplementation (the land-cover product is updated annually, and the range of years is from year n to year n + k) are as follows:
(1) Starting from the nth year (base year) of the time series, the samples of the nth year are obtained by stratified sampling (see Section 2.1.1 and Section 2.1.2 for the specific sample size and sample distribution of each stratum);
(2) Determining the samples of the changed strata: Changes in the multi-temporal product are detected pixel by pixel in year order, and samples are selected from the changed areas to supplement the corresponding “changed stratum” (see Section 2.1.3 for the number of samples to be distributed in each changed stratum);
(3) Supplementing samples: The samples of the changed strata are screened, and the pixels that propagate misclassification are eliminated;
(4) Updating samples: To achieve rapid map verification after the release of land-cover products and to avoid too long a lag between the verification results and the release of the products, the updating of verification datasets needs to be cost-effective without affecting the statistical rigor. Therefore, partial revisits are used to update the verification dataset. The update is divided into targeted and random revisits. Targeted revisit: if samples in the base year fall within a detected change area, they are removed from the base-year set and added to the corresponding changed stratum. Random revisit: in each year after the base year, (100/k)% of the sample set selected in the base year is revisited by random sampling without repetition, so that every sample from step (1) has been revisited once by year n + k.
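The random revisit in step (4) amounts to partitioning the base-year sample set into k disjoint batches, one per year after the base year. A minimal sketch under that reading is given below; the function name and the fixed seed are hypothetical choices.

```python
import random


def random_revisit_schedule(sample_ids, k, seed=0):
    """Split the base-year samples into k disjoint yearly batches.

    Batch j is revisited in year n + j + 1; each batch holds roughly
    (100/k)% of the samples and no sample is revisited twice.
    """
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    return [ids[j::k] for j in range(k)]
```

Shuffling once and slicing guarantees non-repetition by construction, which avoids having to track which samples were already drawn in earlier years.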
Additionally, compared to the existing classical multi-temporal land-cover product evaluation methods (Table 1), this paper accounts for and eliminates the transmission of classification errors in accuracy evaluation. Furthermore, instead of empirically determining the sampling parameters, this study provides a sampling design that adaptively determines the sample size and allocates samples for different land-cover types. Taking the validation of ESA CCI LC as an example, the sampling design includes the following procedures: (1) determining the total samples required at a global scale based on Equations (1) and (2) in Section 2.1.1, (2) deriving the sample size for each feature category of the product according to Equation (3) in Section 2.1.2, and (3) computing the number of samples for each changed stratum based on Equations (4) and (5). Since the framework for verifying multi-temporal land-cover products requires continuous updating of samples for the changed strata, the number of samples for each changed stratum is calculated as outlined in Section 2.1.3. Finally, Section 4.1 provides the detailed calculation results.

3. Materials

3.1. Land-Cover Data

The ESA CCI LC product is an annual land-cover dataset with a 300 m resolution that is continuously updated by the CCI-LC project of the European Space Agency (ESA). An important feature of this product is that it ensures global consistency while offering high temporal resolution [57,58,59,60,61,62]. In this study, the sample set of the base year was supplemented and updated by the time-series sampling described in Section 2. The accuracy of the ESA CCI LC product was then validated at a global scale for 2010–2015. The 37 land-cover classes of the product were reclassified into nine classes [63,64,65], as shown in Table 2:

3.2. Reference Data

The reference data were obtained by manual visual interpretation of the selected samples in Google Earth, and the interpretation results were divided into three confidence levels. Confidence level 1: completely correct/wrong (the image is clear, and the pixels are homogeneous objects); Confidence level 2: the judgment is fuzzy (the image is clear, but there are many objects in the pixels); Confidence level 3: unable to judge (blurred image or no image). To allow comparison with the accuracy of the samples screened according to the homogeneity and complete interpretation standard of pixel areas in an official document, only the samples with confidence level 1 were considered when evaluating the accuracy; samples at the other levels were excluded. Figure 5 shows the judgment results for the ground objects in the ESA CCI LC data with a confidence level of 1 in Google Earth.

4. Results

4.1. Analysis of the Results of the Sampling Design

4.1.1. Selection of Samples in the Base Year

The total sample size required for the accuracy evaluation in 2010 (base year) for the ESA CCI LC product was obtained as described in Section 2.1.1 (Table 3). The predicted classification accuracy was determined according to the pre-experiment and the user’s guide for the ESA CCI LC data [60]. According to Equations (1) and (2), 14,645 samples were needed to evaluate the accuracy of the ESA CCI LC data for 2010 at a global scale.
According to the ESA CCI LC data user guide [60] and the official accuracy evaluation report for GlobCover 2009 [66], the reference data used to evaluate the accuracy of the CCI 2015 product came from the validation dataset of the GlobCover 2009 product. Only inland water (extracted using the CCI Global Map of Open Water Bodies) and ice areas other than Antarctic ice were considered. As many of the classes in the GlobCover 2009 validation dataset have few validation samples (e.g., water, ice, or urban), they were area-weighted for accuracy evaluation. The most commonly used sample allocation methods in stratified sampling are proportional allocation and Neyman allocation. Proportional allocation is simple to analyze and operate because it is an equal-probability sampling design. Neyman allocation additionally considers the standard deviation of each feature category: the total amount and standard deviation of a feature determine the sample size of its stratum, so the greater the total amount and the internal variability, the more samples are allocated to that feature type. However, neither method solves the problem of insufficient sample allocation for rare categories. The standard deviations of the features in the ESA CCI LC product are all small and close to each other (range: 0.30–0.50), so the area ratio dominates the sample allocation. For the urban class, the area ratio is about 0.44% and the standard deviation is about 0.36; the Neyman allocation therefore yields 57 samples and the proportional allocation yields 65. At a global scale, such allocations do not allow an accurate evaluation of the product. The allocation technique described in Section 2.1.2 was, therefore, used within the stratified random sampling to obtain the sample allocations for each class (Table 4). The sample selection for 2010 can be seen in Figure 6.
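The urban example can be checked with simple arithmetic. The sketch below uses the approximate figures quoted above (area ratio of about 0.44%, 14,645 total samples, nine classes). Rounding up for the proportional case and rounding to the nearest integer for Equation (3) are assumptions, so the Equation (3) value computed here may differ slightly from the entry in Table 4.

```python
import math

N_TOTAL = 14645          # total samples for the base year
URBAN_AREA_RATIO = 0.0044  # approximate urban area share
NUM_CLASSES = 9          # reclassified map classes

# Proportional allocation: samples follow the area share only.
proportional = math.ceil(URBAN_AREA_RATIO * N_TOTAL)

# Equation (3): half proportional to area, half shared equally.
half = N_TOTAL / 2
equation_3 = round(URBAN_AREA_RATIO * half + half / NUM_CLASSES)

print(proportional, equation_3)
```

Under these assumptions the equal-share term contributes roughly 814 samples to every class, which is what lifts the urban allocation far above the proportional figure.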

4.1.2. Determination of Supplementary Samples

Following [34], the changed strata are divided, according to the types of ground object transformation, into the urban gain stratum, forest loss stratum, forest gain stratum, agriculture loss stratum, agriculture gain stratum, and catch-all stratum (other changed strata, besides urban, forest, and crop, are all in the catch-all stratum). Previous accuracy evaluations of the ESA CCI LC data have reported that the overall accuracy of the unchanged strata is 71% [60,67,68], but the accuracy of the changed strata is unknown. According to Equation (5), as $p_c$ of the changed strata increases, more samples are allocated to the changed strata. Therefore, by assuming that the value of $p_k$ in the changed and unchanged strata is 99% and 29%, respectively, the maximum total samples to be allocated to the changed strata can be obtained, as shown in Table 5. It is worth emphasizing that the total pixels of the unchanged strata and the samples of the unchanged strata are not fixed and need to be adjusted continually according to the changed areas and targeted revisits.
Once the total samples required for the changed strata are determined, the allocation method in Equation (3) can be applied to determine the samples required for every changed stratum (Figure 2 shows that each stratum needs at least 100 samples, so any allocation below 100 is raised to 100). The specific allocation is shown in Table 6. With this method, the sample size required by the changed strata can be reasonably calculated while maintaining the accuracy of the unchanged strata. Therefore, compared with other studies, the sampling scheme in this paper not only expands the applicability of the parameters but also improves the repeatability of the sampling design.

4.2. Evaluation of the Advantages of Sample Revisiting

In the process of sample updating/random revisiting, the revisited samples for each year from 2011 to 2015 were obtained by randomly selecting 20% of the total samples from 2010 without repetition. By 2014–2015, all the samples had been randomly revisited. According to the randomly revisited samples in these years, the proportion of categories in the corresponding years was counted. Figure 7 shows that the proportion of categories in the other years is basically the same as that in the base year.
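The random-revisit schedule can be sketched as a shuffle of the base-year sample identifiers followed by a split into five equal, non-overlapping annual batches. This is a minimal sketch under stated assumptions: the function name `plan_random_revisits`, the integer identifiers, and the fixed seed are hypothetical, and the sketch does not stratify the draw by class.

```python
import random


def plan_random_revisits(n_samples, years, seed=0):
    """Split base-year sample ids into one non-overlapping batch per year.

    Each sample is revisited exactly once over the whole period, which is
    what "20% per year without repetition" implies for five years.
    """
    rng = random.Random(seed)
    ids = list(range(n_samples))
    rng.shuffle(ids)
    share = n_samples // len(years)
    plan = {}
    for k, year in enumerate(years):
        start = k * share
        # The last batch absorbs any remainder when n_samples is not divisible.
        end = (k + 1) * share if k < len(years) - 1 else n_samples
        plan[year] = ids[start:end]
    return plan


plan = plan_random_revisits(14645, [2011, 2012, 2013, 2014, 2015])
for year, batch in plan.items():
    print(year, len(batch))
```

With 14,645 base-year samples and five years, each batch holds 2929 samples, matching the 20% share stated in the text.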
Figure 8 shows the spatial distribution of the total samples in 2010 and of the samples updated by random revisits from 2011 to 2015, where (a) gives the spatial positions of all the samples (including nine classes) in 2010. The upper and right curves are the probability density function curves of these samples, showing their distribution in longitude (upper curve) and latitude (right curve). The area under each probability density curve is one. The fluctuations in the skewness and kurtosis of the features over these years are very small. The distribution of the feature types in the randomly revisited samples for the other years therefore aligns well with the distribution of the total samples for the base year.
Combined with Figure 7, it can be seen that the randomly revisited samples for the other years, although accounting for only 20% of the total samples, effectively reflect the spatial distribution of the total samples. The proportion of feature types in the randomly revisited samples is also consistent with that in the total samples. The randomly revisited samples are therefore selected uniformly and reasonably. The targeted revisit identifies the base-year samples in which the feature has changed and moves these changed samples to the changed stratum. Because land-cover change over time is a rare event [69], sample revisiting can save considerable labor and time compared with the traditional accuracy assessment of multi-temporal data, while allowing the base-year samples to be updated.

4.3. Analysis of the Accuracy Evaluation Results

As can be seen from the forest gain stratum marked in red in Figure 4, although the sample was correctly interpreted in 2011, the sample pixel carries forward the misclassification from 2010. As shown in Table 7, in every year, $\overline{y_{ci}}$ computed with misclassification considered is lower than or equal to $\overline{y_{ci}}$ computed without it, where $\overline{y_{ci}}$ is the proportion of correctly classified samples in the changed strata. For the ESA CCI LC product in 2013 in particular, the difference in $\overline{y_{ci}}$ between considering and not considering the classification errors reached 0.9412 ($\overline{y_{ci}} \in [0, 1]$). The accuracy evaluation method of Stehman (2014) [56] allows a year-by-year evaluation of the ESA CCI LC data from 2010 to 2015, but it does not take the propagation of misclassification into account. Because such an evaluation is affected by misclassification, the assessment approach proposed in this paper eliminates this classification error.
The overall accuracy of ESA CCI LC for 2015, given by the official documents, is 71.1%. However, the reference data used in the official document was the verification dataset of GlobCover 2009. Because the classification systems of GlobCover 2009 and the ESA CCI LC product are inconsistent, only some samples in the verification dataset can be used [66]. This practice leads to a very small sample size for some features, which increases the interference of regional heterogeneity on the overall accuracy. Therefore, in the proposed approach, we kept the total sample size (2329) the same as the official document. According to the sampling design scheme proposed in this paper, the samples of the ESA CCI LC data in 2015 were selected, and the reference data were collected. Finally, the overall accuracy for 2015 was determined to be 66.89% (Table A1). The samples in Table A1, Table A2 and Table A3 are all sample points with a confidence level of 1 (in order to better compare with the overall accuracy obtained by homogeneous pixels in the official document [60]).
To compare the single-temporal accuracy objectively with the single-year accuracy obtained from the multi-temporal calculation, for 2015, 14,645 × 20% = 2929 samples were randomly revisited in 2014–2015, and the 2329 recollected samples were merged with them to form the total samples for 2015. Based on these 5258 samples, the single-temporal overall accuracy for 2015 was 66.71% (Table A2). We then randomly selected 20% of the samples of each feature (repeated 10 times) and performed targeted revisiting to obtain the samples for the 2014–2015 changed strata. Since the ratio of the total pixels of the changed strata to the total pixels of the unchanged strata was approximately 0.0009, the number of changed-strata samples to be allocated, obtained with Equation (5), was the same as in Section 4.1.2, so the samples of the changed strata for 2014–2015 in Table 6 were directly adopted. The accuracies without and with consideration of misclassification can be calculated using Equations (6) and (10), respectively. The averaged result is shown in Figure 9. Figure 9a shows that misclassification has a significant impact on the urban class (9.72%), while the other features remain unchanged. The main type of feature that changed in 2015 was urban (Table 7); although there were also some samples in the bare land gain stratum, $\overline{y_{ci}}$ of this stratum is always zero regardless of whether misclassification is considered, so the accuracy of bare land remains unchanged. In addition, the urban accuracy with misclassification eliminated is closer to the single-temporal accuracy obtained from the total samples in 2015.
The same procedure was carried out for 2010, where we randomly selected 20% of the total samples (14,645) of each feature (repeated 10 times). The mean of the ten accuracy results for this year is shown in Figure 10. Figure 10 shows that the WMC of almost all the features is smaller than the IMC, indicating that the accuracy after eliminating misclassification is closer to the single-temporal accuracy (Table A3). Because the difference between eliminating and including misclassification is small for each feature, the difference in overall accuracy is also very small.
Compared with the other features, urban pixels more often contain a single land-cover class. Therefore, after the confidence judgment, more samples remain in the urban changed strata. In addition, urban accounts for only 0.44% of the total, so very few unchanged-strata samples are allocated to urban. Consequently, the ratio of unchanged-strata samples to changed-strata samples is smaller for urban than for the other features, and the elimination of misclassification has a particularly significant impact on urban. Therefore, for forest in 2010 (the feature with the largest share of the total), the share of samples randomly selected from the total samples of the feature was changed from 20% to 5% and the samples in the forest loss stratum were increased, in order to reduce the ratio of the unchanged strata to the changed strata. The values in Figure 10 were retained for the other features, and Figure 11 was ultimately obtained. Compared with Figure 10b, the forest in Figure 11 is more obviously affected by misclassification, and the overall accuracy is significantly closer to the single-temporal accuracy for 2010. The standard deviation of the single-year forest accuracy is 2.53% after eliminating the influence of misclassification and 6.61% when misclassification is included, so the result is also more stable when misclassification is eliminated. This method is more effective in improving the accuracy of features with a small ratio of unchanged-strata samples to changed-strata samples.
Figure 11 shows that the accuracy is better after removing the effect of misclassification, and the steps in Section 2.2 allow an evaluation of the global accuracy for 2010–2015, as shown in Figure 12. The accuracy of the CCI product was stable over 2010–2015, ranging from 67.69% to 67.82%.

4.4. Stability Evaluation of Land-Cover Products

In this paper, we have described how to evaluate the product accuracy of a global annual time-series land-cover product on a year-by-year basis through a time-series sampling process framework. The inter-annual stability of the global time-series land-cover product can be evaluated in conjunction with Equation (11):
$$SI = \frac{\left| ca_{n+1} - ca_{n} \right|}{ca_{n}} \times 100 \qquad (11)$$
where $ca_{n}$ refers to the category accuracy for the year $n$, and $ca_{n+1}$ refers to the category accuracy for the year $n+1$.
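As a minimal sketch of how the stability index can be computed for a sequence of annual accuracies, the following Python functions apply the formula to each consecutive pair of years. The function names and the example accuracies are hypothetical and are not values from the paper.

```python
def stability_index(acc_n, acc_next):
    """Stability index (in percent) between year n and year n + 1."""
    if acc_n <= 0:
        raise ValueError("accuracy for year n must be positive")
    return abs(acc_next - acc_n) / acc_n * 100.0


def stability_series(accuracies):
    """Stability index for each consecutive pair of annual accuracies."""
    return [stability_index(a, b) for a, b in zip(accuracies, accuracies[1:])]


# Hypothetical category accuracies for three consecutive years.
example = [0.80, 0.84, 0.84]
series = stability_series(example)
print([round(si, 2) for si in series])
```

A year-to-year change from 0.80 to 0.84 gives an index of about 5, and an unchanged accuracy gives 0.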
Figure 13 shows the user’s accuracy and the producer’s accuracy of each feature over the years. Both accuracies are low for grasslands, shrubs, and sparse vegetation, while those for water bodies are high.
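For readers who want to see how user’s and producer’s accuracies follow from a confusion matrix, the sketch below computes them from the sample counts of Table A1 (rows are map classes, columns are reference classes). It is a count-based illustration only: it does not apply the area weighting used in the paper, so its values are not expected to match the reported accuracies. The function name `users_and_producers` is hypothetical.

```python
CLASSES = ["Crop", "Forest", "Grass", "Shrub", "Water",
           "Bare", "Urban", "Snow/Ice", "Sparse Veg"]

# Sample counts transcribed from Table A1 (rows: map, columns: reference).
TABLE_A1 = [
    [110, 15, 16, 7, 0, 3, 0, 0, 3],
    [0, 79, 4, 9, 0, 2, 0, 0, 2],
    [9, 1, 14, 10, 0, 36, 0, 1, 3],
    [2, 1, 10, 46, 0, 1, 0, 0, 24],
    [2, 0, 1, 0, 68, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 69, 0, 1, 17],
    [5, 1, 1, 0, 0, 1, 39, 0, 0],
    [0, 0, 0, 0, 2, 1, 0, 140, 0],
    [0, 0, 3, 1, 1, 11, 0, 4, 11],
]


def users_and_producers(matrix):
    """Return (user's, producer's) accuracies from unweighted sample counts."""
    n = len(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    # User's accuracy: correct samples divided by the map-class (row) total.
    users = [matrix[i][i] / row_totals[i] if row_totals[i] else float("nan")
             for i in range(n)]
    # Producer's accuracy: correct samples divided by the reference (column) total.
    producers = [matrix[j][j] / col_totals[j] if col_totals[j] else float("nan")
                 for j in range(n)]
    return users, producers


users, producers = users_and_producers(TABLE_A1)
for name, ua, pa in zip(CLASSES, users, producers):
    print(f"{name}: UA={ua:.3f}, PA={pa:.3f}")
```

User’s accuracy divides the diagonal by the row (map) total and producer’s accuracy divides it by the column (reference) total; an area-weighted estimate would additionally scale each row by its stratum weight.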
Figure 14 and Figure 15 represent the stability of the ESA CCI LC data in terms of the producer’s accuracy and the user’s accuracy for 2010–2015. According to the authors of [36], a stability index below 15% indicates a stable product. Figure 14 and Figure 15 clearly show that the ESA CCI LC product not only provides annual global land-cover data but also maintains a high level of stability. Except for the urban class, the other features are below 5% for both the user’s and the producer’s accuracy. The class of snow, in particular, has maintained a constant stability of the user’s accuracy over the period of 2010–2015.

5. Discussion

In this paper, the problem of classification error propagation in a homogeneous pixel is considered and eliminated. In order to ensure the evaluation efficiency of products, the verification framework updates some sample points, so there will be some error propagation in reference data [36]. This is also a problem that needs to be studied and solved in the future. However, the samples of the changed strata are always re-selected, and the results obtained by combining the method in this paper do not have the cumulative propagation of classification errors in the changed strata. In addition, the main changed strata types in this paper are determined based on the evaluation method of NLCD. Therefore, for time series products with time intervals greater than 5 years, further research is needed on the main changed strata types. This method is suitable for products with a relatively long time span and supports change detection. Nonetheless, products with different resolutions will require further verification and explanation in the future.

6. Conclusions

In this paper, we clearly described the setting of the parameters, including the sample size and the allocation of samples in the changed/unchanged strata, which can be adjusted according to the characteristics of the land-cover product. Then, the influence of misclassification was considered in the evaluation method.
For the ESA CCI LC product in 2013 in particular, the difference in $\overline{y_{ci}}$ between considering and not considering classification errors reached 0.9412 ($\overline{y_{ci}} \in [0, 1]$). In contrast with previous accuracy evaluations of multi-temporal products, we have described how to account for the influence of misclassification propagation in the changed strata on the single-year accuracy. The experimental results showed that the overall accuracy and the accuracy of individual categories, after eliminating misclassification, are closer to the single-temporal accuracy obtained with more samples than they are when misclassification is not considered. For the urban category, which typically had few mixed pixels and a small proportion in the classification map, the ratio of samples in the unchanged strata to samples in the changed strata was low. After eliminating misclassification, the accuracy improved by 9.72%. The forest class (the largest proportion of land cover) showed an improved accuracy after considering the classification error when the ratio was reduced. Therefore, in a specific area with a large proportion of changed areas, the method proposed in this paper will improve the reliability of the evaluation more noticeably. In addition, the stability index of all features of the ESA CCI LC product was below 15% for both the user’s and the producer’s accuracy. Thus, the time-series product was shown to have good stability.

Author Contributions

Conceptualization, S.L. and H.X.; methodology, S.L., H.X., and Y.J.; software, S.L.; validation, S.L. and Y.G.; formal analysis, S.L., X.X., P.C., X.T., H.X., and Y.J.; writing—original draft preparation, S.L.; writing—review and editing, H.X. and Y.J.; supervision, H.X. and Y.J.; project administration, H.X.; funding acquisition, H.X., X.T., and Y.J. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Natural Science Foundation of China under Grants 42221002, 42325106, and 42071372, the Shanghai Academic Research Leader Program under grant 23XD1404100, the Shanghai Science and Technology Innovation Action Plan Program under Grant 22511102900, and the Fundamental Research Funds for the Central Universities.

Data Availability Statement

The data presented in this study can be found here: https://maps.elie.ucl.ac.be/CCI/viewer/ (accessed on 8 April 2023).

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

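Table A1. Single-temporal accuracy for 2015 (the total sample size is 2329): the weighted-area overall accuracy is 66.89% (1.58%).

| Map \ Reference Data | Crop | Forest | Grass | Shrub | Water | Bare | Urban | Ice | Sparse Veg | Sum |
|---|---|---|---|---|---|---|---|---|---|---|
| Crop | 110 | 15 | 16 | 7 | 0 | 3 | 0 | 0 | 3 | 154 |
| Forest | 0 | 79 | 4 | 9 | 0 | 2 | 0 | 0 | 2 | 96 |
| Grass | 9 | 1 | 14 | 10 | 0 | 36 | 0 | 1 | 3 | 74 |
| Shrub | 2 | 1 | 10 | 46 | 0 | 1 | 0 | 0 | 24 | 84 |
| Water | 2 | 0 | 1 | 0 | 68 | 0 | 0 | 1 | 0 | 72 |
| Bare | 0 | 0 | 0 | 0 | 0 | 69 | 0 | 1 | 17 | 87 |
| Urban | 5 | 1 | 1 | 0 | 0 | 1 | 39 | 0 | 0 | 47 |
| Snow | 0 | 0 | 0 | 0 | 2 | 1 | 0 | 140 | 0 | 143 |
| Sparse Veg | 0 | 0 | 3 | 1 | 1 | 11 | 0 | 4 | 11 | 31 |
| Sum | 128 | 97 | 49 | 73 | 71 | 124 | 39 | 147 | 60 | 788 |
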
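Table A2. Single-temporal accuracy for 2015 (the total sample size is 5258): the weighted-area overall accuracy is 66.71% (1.02%).

| Map \ Reference Data | Crop | Forest | Grass | Shrub | Water | Bare | Urban | Ice | Sparse Veg | Sum |
|---|---|---|---|---|---|---|---|---|---|---|
| Crop | 235 | 33 | 24 | 32 | 0 | 9 | 1 | 0 | 3 | 337 |
| Forest | 5 | 212 | 10 | 27 | 1 | 5 | 1 | 0 | 2 | 263 |
| Grass | 21 | 2 | 26 | 16 | 1 | 75 | 0 | 4 | 3 | 148 |
| Shrub | 5 | 4 | 11 | 98 | 0 | 24 | 0 | 1 | 31 | 174 |
| Water | 2 | 0 | 1 | 0 | 140 | 2 | 0 | 1 | 0 | 146 |
| Bare | 0 | 0 | 0 | 0 | 0 | 151 | 1 | 2 | 23 | 177 |
| Urban | 7 | 2 | 3 | 0 | 0 | 2 | 140 | 0 | 0 | 154 |
| Snow | 0 | 0 | 0 | 0 | 4 | 3 | 0 | 330 | 0 | 337 |
| Sparse Veg | 0 | 1 | 6 | 1 | 1 | 32 | 0 | 10 | 26 | 77 |
| Sum | 275 | 254 | 81 | 174 | 147 | 303 | 143 | 348 | 88 | 1813 |
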
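Table A3. Single-temporal accuracy in 2010 (the total sample size is 14,645): the weighted-area overall accuracy is 67.80% (0.63%).

| Map \ Reference Data | Crop | Forest | Grass | Shrub | Water | Bare | Urban | Ice | Sparse Veg | Sum |
|---|---|---|---|---|---|---|---|---|---|---|
| Crop | 700 | 86 | 60 | 119 | 1 | 26 | 2 | 0 | 0 | 994 |
| Forest | 16 | 657 | 46 | 102 | 1 | 12 | 1 | 0 | 1 | 836 |
| Grass | 50 | 11 | 101 | 45 | 2 | 227 | 1 | 11 | 5 | 453 |
| Shrub | 14 | 23 | 14 | 264 | 0 | 87 | 0 | 2 | 33 | 437 |
| Water | 2 | 1 | 0 | 0 | 362 | 8 | 0 | 8 | 0 | 381 |
| Bare | 3 | 1 | 0 | 1 | 0 | 434 | 4 | 11 | 37 | 491 |
| Urban | 29 | 11 | 5 | 0 | 0 | 8 | 586 | 0 | 0 | 639 |
| Snow | 0 | 0 | 0 | 0 | 7 | 20 | 0 | 943 | 0 | 970 |
| Sparse Veg | 10 | 5 | 7 | 9 | 0 | 101 | 0 | 32 | 100 | 264 |
| Sum | 824 | 795 | 233 | 540 | 373 | 923 | 594 | 1007 | 176 | 5465 |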

References

  1. Schewe, J.; Gosling, S.N.; Reyer, C.; Zhao, F.; Ciais, P.; Elliott, J.; Francois, L.; Huber, V.; Lotze, H.K.; Seneviratne, S.I.; et al. State-of-the-art global models underestimate impacts from climate extremes. Nat. Commun. 2019, 10, 1005.
  2. Gómez, C.; White, J.C.; Wulder, M.A. Optical remotely sensed time series data for land cover classification: A review. ISPRS-J. Photogramm. Remote Sens. 2016, 116, 55–72.
  3. Houghton, R.A.; House, J.I.; Pongratz, J.; van der Werf, G.R.; DeFries, R.S.; Hansen, M.C.; Le Quere, C.; Ramankutty, N. Carbon emissions from land use and land-cover change. Biogeosciences 2012, 9, 5125–5142.
  4. Wulder, M.A.; Coops, N.C.; Roy, D.P.; White, J.C.; Hermosilla, T. Land cover 2.0. Int. J. Remote Sens. 2018, 39, 4254–4284.
  5. Foley, J.A.; DeFries, R.; Asner, G.P.; Barford, C.; Bonan, G.; Carpenter, S.R.; Chapin, F.S.; Coe, M.T.; Daily, G.C.; Gibbs, H.K.; et al. Global consequences of land use. Science 2005, 309, 570–574.
  6. Li, J.Y.; Gao, Y.; Huang, X. The impact of urban agglomeration on ozone precursor conditions: A systematic investigation across global agglomerations utilizing multisource geospatial datasets. Sci. Total Environ. 2020, 704, 135458.
  7. Xiao, R.; Liu, Y.; Huang, X.; Shi, R.; Yu, W.; Zhang, T. Exploring the driving forces of farmland loss under rapid urbanization using binary logistic regression and spatial regression: A case study of Shanghai and Hangzhou Bay. Ecol. Indic. 2018, 95, 455–467.
  8. Yang, Q.; Huang, X.; Tang, Q. The footprint of urban heat island effect in 302 Chinese cities: Temporal trends and associated factors. Sci. Total Environ. 2019, 655, 652–662.
  9. Chapin, F.S.; Zavaleta, E.S.; Eviner, V.T.; Naylor, R.L.; Vitousek, P.M.; Reynolds, H.L.; Hooper, D.U.; Lavorel, S.; Sala, O.E.; Hobbie, S.E.; et al. Consequences of changing biodiversity. Nature 2000, 405, 234–242.
  10. Turner, B.L.; Lambin, E.F.; Reenberg, A. The emergence of land change science for global environmental change and sustainability. Proc. Natl. Acad. Sci. USA 2007, 104, 20666–20671.
  11. Buchhorn, M.; Lesiv, M.; Tsendbazar, N.E.; Herold, M.; Bertels, L.; Smets, B. Copernicus global Land Cover layers—Collection 2. Remote Sens. 2020, 12, 1044.
  12. Liu, H.; Gong, P.; Wang, J.; Clinton, N.; Bai, Y.; Liang, S. Annual dynamics of global land cover and its long-term changes from 1982 to 2015. Earth Syst. Sci. Data 2020, 12, 1217–1243.
  13. Justice, C.; Gutman, G.; Vadrevu, K.P. NASA Land Cover and Land Use Change (LCLUC): An interdisciplinary research program. J. Environ. Manag. 2015, 148, 4–9.
  14. Brown, J.F.; Tollerud, H.J.; Barber, C.P.; Zhou, Q.; Dwyer, J.L.; Vogelmann, J.E.; Loveland, T.R.; Woodcock, C.E.; Stehman, S.V.; Zhu, Z.; et al. Lessons learned implementing an operational continuous United States national land change monitoring capability: The Land Change Monitoring, Assessment, and Projection (LCMAP) approach. Remote Sens. Environ. 2020, 238, 111356.
  15. Zhu, Z.; Qiu, S.; Ye, S. Remote sensing of land change: A multifaceted perspective. Remote Sens. Environ. 2022, 282, 113266.
  16. Townshend, J.; Justice, C.; Li, W.; Gurney, C.; Mcmanus, J. Global land cover classification by remote sensing: Present capabilities and future possibilities. Remote Sens. Environ. 1991, 35, 243–255.
  17. Olofsson, P.; Stehman, S.V.; Woodcock, C.E.; Sulla-Menashe, D.; Sibley, A.M.; Newell, J.D.; Friedl, M.A.; Herold, M. A global land-cover validation data set, part I: Fundamental design principles. Int. J. Remote Sens. 2012, 33, 5768–5788.
  18. Findell, K.L.; Berg, A.; Gentine, P.; Krasting, J.P.; Lintner, B.R.; Malyshev, S.; Santanello, J.A.; Shevliakova, E. The impact of anthropogenic land use and land cover change on regional climate extremes. Nat. Commun. 2017, 8, 989.
  19. Stehman, S.V.; Czaplewski, R.L. Design and Analysis for Thematic Map Accuracy Assessment: Fundamental Principles. Remote Sens. Environ. 1998, 64, 331–344.
  20. Stehman, S.V. Statistical Rigor and Practical Utility in Thematic Map Accuracy Assessment. Photogramm. Eng. Remote Sens. 2001, 67, 727–734.
  21. Grekousis, G.; Mountrakis, G.; Kavouras, M. An overview of 21 global and 43 regional land-cover mapping products. Int. J. Remote Sens. 2015, 36, 5309–5335.
  22. Loveland, T.R.; Reed, B.C.; Brown, J.F.; Ohlen, D.O.; Zhu, Z.; Yang, L.; Merchant, J.W. Development of a global land cover characteristics database and IGBP DISCover from 1 km AVHRR data. Int. J. Remote Sens. 2000, 21, 1303–1330.
  23. Bartholomé, E.; Belward, A.S. GLC2000: A new approach to global land cover mapping from Earth Observation data. Int. J. Remote Sens. 2005, 26, 1959–1977.
  24. Herold, M.; Mayaux, P.; Woodcock, C.E.; Baccini, A.; Schmullius, C. Some challenges in global land cover mapping: An assessment of agreement and accuracy in existing 1 km datasets. Remote Sens. Environ. 2008, 112, 2538–2556.
  25. Latham, J.; Cumani, R.; Rosati, I.; Bloise, M. Global Land Cover SHARE (GLC-SHARE) Database Beta-Release Version 1.0; Land and Water Division: Rome, Italy, 2014.
  26. Kaptue Tchuenté, A.T.; Roujean, J.L.; De Jong, S.M. Comparison and relative quality assessment of the GLC2000, GLOBCOVER, MODIS and ECOCLIMAP land cover data sets at the African continental scale. Int. J. Appl. Earth Obs. Geoinf. 2011, 13, 207–219.
  27. Gong, P.; Liu, H.; Zhang, M.N.; Li, C.C.; Wang, J.; Huang, H.B.; Clinton, N.; Ji, L.Y.; Li, W.Y.; Bai, Y.Q.; et al. Stable classification with limited sample: Transferring a 30-m resolution sample set collected in 2015 to mapping 10-m resolution global land cover in 2017. Sci. Bull. 2019, 64, 370–373.
  28. Chen, J.; Chen, J.; Liao, A.P.; Cao, X.; Chen, L.J.; Chen, X.H.; He, C.Y.; Han, G.; Peng, S.; Lu, M.; et al. Global land cover mapping at 30 m resolution: A POK-based operational approach. ISPRS-J. Photogramm. Remote Sens. 2015, 103, 7–27.
  29. Chen, J.; Chen, L.J.; Chen, F.; Ban, Y.F.; Li, S.N.A.; Han, G.; Tong, X.H.; Liu, C.; Stamenova, V.; Stamenov, S. Collaborative validation of GlobeLand30: Methodology and practices. Geo-Spat. Inf. Sci. 2021, 24, 134–144.
  30. Woodcock, C.E.; Loveland, T.R.; Herold, M.; Bauer, M.E. Transitioning from change detection to monitoring with remote sensing: A paradigm shift. Remote Sens. Environ. 2020, 238, 111558.
  31. Wulder, M.A.; Roy, D.P.; Radeloff, V.C.; Loveland, T.R.; Anderson, M.C.; Johnson, D.M.; Healey, S.; Zhu, Z.; Scambos, T.A.; Pahlevan, N.; et al. Fifty years of Landsat science and impacts. Remote Sens. Environ. 2022, 280, 113195.
  32. Wickham, J.; Stehman, S.V.; Gass, L.; Dewitz, J.; Fry, J.A.; Wade, T.G. Accuracy assessment of NLCD 2006 land cover and impervious surface. Remote Sens. Environ. 2013, 130, 294–304.
  33. Wickham, J.; Stehman, S.V.; Gass, L.; Dewitz, J.A.; Sorenson, D.G.; Granneman, B.J.; Poss, R.V.; Baer, L.A. The accuracy assessment of the 2011 National Land Cover Database (NLCD). Remote Sens. Environ. 2017, 191, 328–341.
  34. Wickham, J.; Stehman, S.V.; Sorenson, D.G.; Gass, L.; Dewitz, J.A. Thematic accuracy assessment of the NLCD 2016 land cover for the conterminous United States. Remote Sens. Environ. 2021, 257, 112357.
  35. Wickham, J.; Stehman, S.V.; Sorenson, D.G.; Gass, L.; Dewitz, J.A. Thematic accuracy assessment of the NLCD 2019 land cover for the conterminous United States. GISci. Remote Sens. 2023, 60, 2181143.
  36. Tsendbazar, N.; Herold, M.; Li, L.; Tarko, A.; de Bruin, S.; Masiliunas, D.; Lesiv, M.; Fritz, S.; Buchhorn, M.; Smets, B.; et al. Towards operational validation of annual global land cover maps. Remote Sens. Environ. 2021, 266, 112686.
  37. Tang, X.J.; Bullock, E.L.; Olofsson, P.; Estel, S.; Woodcock, C.E. Near real-time monitoring of tropical forest disturbance: New algorithms and assessment framework. Remote Sens. Environ. 2019, 224, 202–218.
  38. Potapov, P.; Turubanova, S.; Hansen, M.C.; Tyukavina, A.; Zalles, V.; Khan, A.; Song, X.P.; Pickens, A.; Shen, Q.; Cortez, J. Global maps of cropland extent and change show accelerated cropland expansion in the twenty-first century. Nat. Food 2021, 3, 19–28.
  39. Druce, D.; Tong, X.Y.; Lei, X.; Guo, T.; Kittel, C.M.M.; Grogan, K.; Tottrup, C. An Optical and SAR Based Fusion Approach for Mapping Surface Water Dynamics over Mainland China. Remote Sens. 2021, 13, 1663.
  40. Song, X.P.; Hansen, M.C.; Stehman, S.V.; Potapov, P.V.; Tyukavina, A.; Vermote, E.F.; Townshend, J.R. Global land change from 1982 to 2016. Nature 2018, 560, 639–643.
  41. Potapov, P.; Hansen, M.C.; Pickens, A.; Hernandez-Serna, A.; Tyukavina, A.; Turubanova, S.; Zalles, V.; Li, X.Y.; Khan, A.; Stolle, F.; et al. The Global 2000-2020 Land Cover and Land Use Change Dataset Derived from the Landsat Archive: First Results. Front. Remote Sens. 2022, 3, 856903.
  42. Arévalo, P.; Olofsson, P.; Woodcock, C.E. Continuous monitoring of land change activities and post-disturbance dynamics from Landsat time series: A test methodology for REDD+ reporting. Remote Sens. Environ. 2020, 238, 111051.
  43. Gong, Y.L.; Xie, H.; Liao, S.C.; Lu, Y.; Jin, Y.M.; Wei, C.; Tong, X.H. Assessing the Accuracy of Multi-Temporal GlobeLand30 Products in China Using a Spatiotemporal Stratified Sampling Method. Remote Sens. 2023, 15, 4593.
  44. Strahler, A.H.; Boschetti, L.; Foody, G.M.; Friedl, M.A.; Hansen, M.C.; Herold, M.; Mayaux, P.; Morisette, J.T.; Stehman, S.V.; Woodcock, C.E. Global Land Cover Validation: Recommendations for Evaluation and Accuracy Assessment of Global Land Cover Maps; Office for Official Publications of the European Communities: Luxembourg, 2006.
  45. Stehman, S.V.; Foody, G.M. Key issues in rigorous accuracy assessment of land cover products. Remote Sens. Environ. 2019, 231, 111199.
  46. Ye, S.; Pontius, R.G.; Rakshit, R. A review of accuracy assessment for object-based image analysis: From per-pixel to per-polygon approaches. ISPRS-J. Photogramm. Remote Sens. 2018, 141, 137–147.
  47. Van Oort, P.A.J. Improving land cover change estimates by accounting for classification errors. Int. J. Remote Sens. 2005, 26, 3009–3024.
  48. Pouliot, D.; Latifovic, R.; Zabcic, N.; Guindon, L.; Olthof, I. Development and assessment of a 250 m spatial resolution MODIS annual land cover time series (2000–2011) for the forest region of Canada derived from change-based updating. Remote Sens. Environ. 2014, 140, 731–743.
  49. Asiamah, N.; Mensah, H.K.; Oteng-abayie, E.F. Do Larger Samples Really Lead to More Precise Estimates? A Simulation Study. Am. J. Educ. Res. 2017, 5, 9–17.
  50. Milligan, G.W. Is sampling really dead? Qual. Prog. 1991, 24, 77–81.
  51. Xie, H.; Tong, X.H.; Meng, W.; Liang, D.; Wang, Z.H.; Shi, W.Z. A Multilevel Stratified Spatial Sampling Approach for the Quality Assessment of Remote-Sensing-Derived Products. IEEE J. Sel. Top. Appl. Earth Observ. Remote Sens. 2015, 8, 4699–4713.
  52. Czaplewski, R.L.; Patterson, P.L. Classification accuracy for stratification with remotely sensed data. For. Sci. 2003, 49, 402–408.
  53. Szantoi, Z.; Brink, A.; Lupi, A.; Mammone, C.; Jaffrain, G. Key landscapes for conservation land cover and change monitoring, thematic and validation datasets for sub-Saharan Africa. Earth Syst. Sci. Data 2020, 12, 3001–3019.
  54. Gallaun, H.; Steinegger, M.; Wack, R.; Schardt, M.; Kornberger, B.; Schmitt, U. Remote Sensing Based Two-Stage Sampling for Accuracy Assessment and Area Estimation of Land Cover Changes. Remote Sens. 2015, 7, 11992–12008.
  55. Bossard, M.; Feranec, J.; Otahel, J. CORINE Land Cover Technical Guide: Addendum 2000; European Environment Agency Copenhagen: Copenhagen, Denmark, 2000.
  56. Stehman, S.V. Estimating area and map accuracy for stratified random sampling when the strata are different from the map classes. Int. J. Remote Sens. 2014, 35, 4923–4939.
  57. Karra, K.; Kontgis, C.; Statman-Weil, Z.; Mazzariello, J.C.; Mathis, M.; Brumby, S.P. Global Land Use/Land Cover with Sentinel 2 and Deep Learning; IEEE: Manhattan, NY, USA, 2021.
  58. Venter, Z.S.; Barton, D.N.; Chakraborty, T.; Simensen, T.; Singh, G. Global 10 m Land Use Land Cover Datasets: A Comparison of Dynamic World, World Cover and Esri Land Cover. Remote Sens. 2022, 14, 4101.
  59. Defourny, P.; Lamarche, C.; Marissiaux, Q.; Carsten, B.; Martin, B.; Grit, K. Product User Guide and Specification: ICDR Land Cover 2016–2020; European Centre for Medium-Range Weather Forecasts (ECMWF): Reading, UK, 2021.
  60. Defourny, P.; Lamarche, C.; Bontemps, S.; De Maet, T.; Van Bogaert, E.; Moreau, I.; Brockmann, C.; Boettcher, M.; Kirches, G.; Wevers, J.; et al. Land Cover CCI Product User Guide–Version 2.0; UCL-Geomatics: Louvain-la-Neuve, Belgium, 2017.
  61. Sun, W.Y.; Ding, X.T.; Su, J.B. Land use and cover changes on the Loess Plateau: A comparison of six global or national land use and cover datasets. Land Use Pol. 2022, 119, 106165.
  62. Xu, Y.D.; Yu, L.; Feng, D.L.; Peng, D.L.; Li, C.C.; Huang, X.M.; Lu, H.; Gong, P. Comparisons of three recent moderate resolution African land cover datasets: CGLS-LC100, ESA-S2-LC20, and FROM-GLC-Africa30. Int. J. Remote Sens. 2019, 40, 6185–6202.
  63. Liu, X.X.; Yu, L.; Si, Y.L.; Zhang, C.; Lu, H.; Yu, C.Q.; Gong, P. Identifying patterns and hotspots of global land cover transitions using the ESA CCI Land Cover dataset. Remote Sens. Lett. 2018, 9, 972–981.
  64. Mousivanda, A.; Arsanjani, J.J. Insights on the historical and emerging global land cover changes: The case of ESA-CCI-LC datasets. Appl. Geogr. 2019, 106, 82–92.
  65. Jiang, L.; Yu, L. Analyzing land use intensity changes within and outside protected areas using ESA CCI-LC datasets. Glob. Ecol. Conserv. 2019, 20, e00789.
  66. Bontemps, S.; Defourny, P.; Van Bogaert, E.; Arino, O.; Kalogirou, V.; Perez, J.R. GLOBCOVER 2009 Products Description and Validation Report; UCLouvain & ESA Team: Louvain-la-Neuve, Belgium, 2011.
  67. Liu, P.Y.; Pei, J.; Guo, H.; Tian, H.F.; Fang, H.J.; Wang, L. Evaluating the Accuracy and Spatial Agreement of Five Global Land Cover Datasets in the Ecologically Vulnerable South China Karst. Remote Sens. 2022, 14, 3090.
  68. Liu, S.; Liu, X.X.; Yu, L.; Wang, Y.; Zhang, G.J.; Gong, P.; Huang, W.Y.; Wang, B.; Yang, M.M.; Cheng, Y.Q. Climate response to introduction of the ESA CCI land cover data to the NCAR CESM. Clim. Dyn. 2021, 56, 4109–4127.
  69. Liu, S.S.; Su, H.; Cao, G.F.; Wang, S.Q.; Guan, Q.F. Learning from data: A post classification method for annual land cover analysis in urban areas. ISPRS-J. Photogramm. Remote Sens. 2019, 154, 202–215.
Figure 1. Flowchart of time-series sampling.
Figure 1. Flowchart of time-series sampling.
Remotesensing 16 02968 g001
Figure 2. Minimum sample size calculated by combining the predicted user precision error rate and significance level.
Figure 3. Relationship between strata and map classes.
Figure 4. Misclassification propagation of pixels in the changed strata. The unchanged strata include pixels 1 to 6. The changed strata include pixels 7 to 9. ① indicates that the map and reference data in the changed strata are consistent in both years; ② indicates an inconsistency between the map class and the reference in year (T + 1). Light green denotes grass; dark green denotes forest.
Figure 5. Interpretation results for the ESA CCI LC data in Google Earth. The red frame marks the extent of the pixel.
Figure 6. Spatial layout of samples in the base year.
Figure 7. Proportion of feature categories between the randomly revisited samples and total samples.
Figure 8. Spatial distribution of the samples. (a) Samples in 2010. (b–f) Samples randomly revisited in the years from 2011 to 2015.
Figure 9. Single-year precision results, averaged over 10 extractions from the single-temporal samples in 2015. (a) Difference between the accuracy when eliminating the misclassification and when including the misclassification. (b) Difference between the single-year accuracy and the single-temporal accuracy. WMC is the difference between the accuracy when eliminating the misclassification and the single-temporal accuracy (reference value). IMC is the difference between the accuracy when including the misclassification and the single-temporal accuracy.
Figure 10. Single-year precision results, averaged over ten extractions from the single-temporal samples in 2010. (a) Difference between the accuracy when eliminating the misclassification and when including the misclassification. (b) Difference between the single-year accuracy and the single-temporal accuracy.
Figure 11. Single-year precision results, averaged over ten extractions from the single-temporal samples in 2010.
Figure 12. The accuracy of the ESA CCI LC for 2010–2015.
Figure 13. User’s accuracy and producer’s accuracy of each class. (a) refers to Crop; (b) refers to Forest; (c) refers to Grass; (d) refers to Shrub; (e) refers to Water; (f) refers to Bareland; (g) refers to Urban; (h) refers to Snow; (i) refers to Sparse Veg.
Figure 14. Stability of the producer’s accuracy for the ESA CCI LC product.
Figure 15. Stability of the user’s accuracy for the ESA CCI LC product.
Table 1. Summary of existing methods.
| Methods | Advantages | Sampling Parameters | Considering Classification Error |
|---|---|---|---|
| Wickham et al. (2013) [32] | Reducing the evaluation lag in multi-temporal data of land-cover | Empirically determined sample size | No |
| Tsendbazar et al. (2021) [36] | Reducing the evaluation lag in multi-temporal data of land-cover | Empirically determined sampling parameters | No |
| Tang et al. (2019) [37] | Monitoring the change of a single class in different periods | Cochran sampling model; fixed sample size for the rare land type | No |
| Potapov et al. (2021) [38] | Monitoring the change of a single class in different periods | Empirically determined sample size | No |
| Druce et al. (2021) [39] | Monitoring the change of a single class in different periods | Empirically determined sample size | No |
| Arévalo et al. (2020) [42] | Improving the stratification efficiency of land-cover change | Cochran sampling model; fixed sample size for the rare land type | No |
| Gong et al. (2023) [43] | Improving the stratification efficiency of land-cover change | Probabilistic statistical sampling model; proportional distribution; fixed sample size for the rare land type | No |
Table 2. Nine classes after reclassification.
| Classes Considered in This Study | Land-Use Types Used in the CCI LC Maps | Description |
|---|---|---|
| Agriculture | 10, 11, 12 | Rainfed cropland |
| Agriculture | 20 | Irrigated cropland |
| Agriculture | 30 | Mosaic cropland (>50%)/natural vegetation (tree, shrub, herbaceous cover) (<50%) |
| Agriculture | 40 | Mosaic natural vegetation (tree, shrub, herbaceous cover) (>50%)/cropland (<50%) |
| Forest | 50 | Tree cover, broad-leaved, evergreen, closed to open (>15%) |
| Forest | 60, 61, 62 | Tree cover, broad-leaved, deciduous, closed to open (>15%) |
| Forest | 70, 71, 72 | Tree cover, needle-leaved, evergreen, closed to open (>15%) |
| Forest | 80, 81, 82 | Tree cover, needle-leaved, deciduous, closed to open (>15%) |
| Forest | 90 | Tree cover, mixed leaf type (broad-leaved and needle-leaved) |
| Forest | 100 | Mosaic tree and shrub (>50%)/herbaceous cover (<50%) |
| Forest | 110 | Mosaic herbaceous cover (>50%)/tree and shrub (<50%) |
| Forest | 160 | Tree cover, flooded, fresh, or brackish water |
| Forest | 170 | Tree cover, flooded, saline water |
| Grass | 130 | Grassland |
| Grass | 140 | Lichens and mosses |
| Grass | 180 | Shrub or herbaceous cover, flooded, fresh-saline, or brackish water |
| Shrub | 120, 121, 122 | Shrubland |
| Water | 210 | Water |
| Bare land | 200, 201, 202 | Bare areas |
| Urban | 190 | Urban areas |
| Ice/snow | 220 | Permanent snow and ice |
| Sparse vegetation | 150, 151, 152, 153 | Sparse vegetation (tree, shrub, herbaceous cover) |
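
The reclassification in Table 2 amounts to a lookup from CCI LC codes to the nine study classes. The following is a minimal sketch of such a lookup, not the authors' implementation; the names `CCI_TO_CLASS`, `_register`, and `reclassify`, and the `"Unmapped"` label for codes outside Table 2, are assumptions introduced here for illustration.

```python
# Hypothetical lookup built from Table 2: CCI LC code -> class used in this study.
CCI_TO_CLASS = {}


def _register(class_name, codes):
    """Assign every listed CCI LC code to one study class."""
    for code in codes:
        CCI_TO_CLASS[code] = class_name


_register("Agriculture", [10, 11, 12, 20, 30, 40])
_register("Forest", [50, 60, 61, 62, 70, 71, 72, 80, 81, 82,
                     90, 100, 110, 160, 170])
_register("Grass", [130, 140, 180])
_register("Shrub", [120, 121, 122])
_register("Water", [210])
_register("Bare land", [200, 201, 202])
_register("Urban", [190])
_register("Ice/snow", [220])
_register("Sparse vegetation", [150, 151, 152, 153])


def reclassify(codes, unknown="Unmapped"):
    """Map an iterable of CCI LC codes to study classes.

    Codes that do not appear in Table 2 receive the `unknown` label
    (an assumption of this sketch, not something the table specifies).
    """
    return [CCI_TO_CLASS.get(code, unknown) for code in codes]


print(reclassify([10, 62, 180, 190, 0]))
```

A dictionary keeps the mapping explicit and easy to audit against the table; for full rasters, the same mapping could be turned into an array lookup.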
Table 3. Total samples to be selected in the base year.
| No. | Surface Features | Number of Pixels | Predicted Classification Accuracy | Area Ratio (Number of Pixels of Surface Feature/Total Pixels) | Total Sample Size |
|---|---|---|---|---|---|
| 1 | Crop | 332019546 | 80% | 3.95% | 14,645 |
| 2 | Forest | 668301893 | 85% | 7.96% | |
| 3 | Grass | 253197417 | 50% | 3.01% | |
| 4 | Shrub | 175707096 | 60% | 2.09% | |
| 5 | Water | 5674946201 | 90% | 67.57% | |
| 6 | Bare | 253115927 | 80% | 3.01% | |
| 7 | Urban | 8842764 | 85% | 0.11% | |
| 8 | Ice | 871449834 | 90% | 10.38% | |
| 9 | Sparse Veg | 160499322 | 30% | 1.91% | |
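
The area ratio column in Table 3 is defined as the number of pixels of a surface feature divided by the total number of pixels. As a quick arithmetic check, the sketch below recomputes those ratios from the pixel counts in the table; the names `PIXELS` and `area_ratios` are illustrative assumptions, not part of the original study.

```python
# Pixel counts per surface feature, copied from Table 3.
PIXELS = {
    "Crop": 332019546,
    "Forest": 668301893,
    "Grass": 253197417,
    "Shrub": 175707096,
    "Water": 5674946201,
    "Bare": 253115927,
    "Urban": 8842764,
    "Ice": 871449834,
    "Sparse Veg": 160499322,
}


def area_ratios(pixel_counts):
    """Return each class's share of the total pixel count, in percent."""
    total = sum(pixel_counts.values())
    return {name: 100.0 * n / total for name, n in pixel_counts.items()}


ratios = area_ratios(PIXELS)
for name, pct in ratios.items():
    print(f"{name:<10} {pct:6.2f}%")
```

Rounded to two decimals, the computed shares agree with the area ratios listed in the table.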
Table 4. Results of the sample distribution.
| No. | Surface Features | Number of Pixels | Allocation | Area Ratio |
|---|---|---|---|---|
| 1 | Crop | 332019546 | 2082 | 16.57% |
| 2 | Forest | 668301893 | 3345 | 33.35% |
| 3 | Grass | 253197417 | 1786 | 12.64% |
| 4 | Shrub | 175707096 | 1496 | 8.77% |
| 5 | Water | 66389169 | 1085 | 3.31% |
| 6 | Bare | 253115927 | 1786 | 12.63% |
| 7 | Urban | 8842764 | 870 | 0.44% |
| 8 | Ice | 85822331 | 1158 | 4.28% |
| 9 | Sparse Veg | 160499322 | 1438 | 8.01% |
Table 5. Total samples to be distributed in the changed strata between two adjacent years.
| Adjacent Years | The Total Samples in the Changed Strata |
|---|---|
| 2010–2011 | 65 |
| 2011–2012 | 49 |
| 2012–2013 | 41 |
| 2013–2014 | 72 |
| 2014–2015 | 10 |
Table 6. Sample size to be allocated for each changed stratum.
| Stratum | 2010–2011 | 2011–2012 | 2012–2013 | 2013–2014 | 2014–2015 |
|---|---|---|---|---|---|
| Urban gain | 100 | 100 | 109 | 100 | 111 |
| Forest loss | 100 | 100 | 112 | 161 | 138 |
| Forest gain | 167 | 160 | 133 | 140 | - |
| Agriculture loss | 100 | 100 | 100 | 100 | - |
| Agriculture gain | 100 | 100 | 100 | 100 | - |
| Catch-all | 135 | 136 | 108 | 100 | 100 |
Table 7. Correct proportion of sample classification in the changed stratum ($\overline{y_{ci}}$). CIM refers to $\overline{y_{ci}}$ which contains misclassification, and CEM refers to $\overline{y_{ci}}$ which eliminates misclassification.
| Strata (Year 2010) | CIM | CEM | Strata (Year 2011) | CIM | CEM | Strata (Year 2012) | CIM | CEM |
|---|---|---|---|---|---|---|---|---|
| Crop loss | 0.2750 | 0 | Crop gain | 0.6154 | 0 | Crop gain | 0.5833 | 0.1111 |
| Forest loss | 0.6667 | 0 | Forest gain | 0.7692 | 0 | Forest gain | 0.5417 | 0.1667 |
| Grass loss | 0 | 0 | Grass gain | 0 | 0 | Grass gain | 0 | 0 |
| Shrub loss | 0 | - | Shrub gain | - | - | Shrub gain | - | - |
| Water loss | 1 | - | Water gain | 0.5 | 0 | Water gain | - | - |
| Bare loss | 0.6842 | 0 | Bare gain | 0.5 | 0 | Bare gain | 0.6667 | 0 |
| Urban loss | - | - | Urban gain | 0.9216 | 0.2 | Urban gain | 0.9333 | 0.4 |
| Snow loss | - | - | Snow gain | - | - | Snow gain | - | - |
| Sparse loss | 0.4286 | 0 | Sparse gain | 0.3750 | 0 | Sparse gain | 0.1429 | 0 |

| Strata (Year 2013) | CIM | CEM | Strata (Year 2014) | CIM | CEM | Strata (Year 2015) | CIM | CEM |
|---|---|---|---|---|---|---|---|---|
| Crop gain | 0.9048 | - | Crop gain | 0.4667 | 0 | Crop gain | - | - |
| Forest gain | 0.3684 | 0 | Forest gain | 0.5750 | 0 | Forest gain | - | - |
| Grass gain | 0 | 0 | Grass gain | 0 | 0 | Grass gain | - | - |
| Shrub gain | - | - | Shrub gain | - | - | Shrub gain | - | - |
| Water gain | - | - | Water gain | 1 | - | Water gain | - | - |
| Bare gain | 0 | - | Bare gain | 0.4 | 0 | Bare gain | 0 | 0 |
| Urban gain | 0.9412 | 0 | Urban gain | 0.8966 | 0 | Urban gain | 0.9079 | 0 |
| Snow gain | - | - | Snow gain | - | - | Snow gain | - | - |
| Sparse gain | 0.25 | 0 | Sparse gain | 0 | 0 | Sparse gain | - | - |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Liao, S.; Xie, H.; Gong, Y.; Jin, Y.; Xu, X.; Chen, P.; Tong, X. Validation of Multi-Temporal Land-Cover Products Considering Classification Error Propagation. Remote Sens. 2024, 16, 2968. https://doi.org/10.3390/rs16162968