Bootstrap approaches for spatial data

Pilar García-Soidán¹,
Raquel Menezes² &
Óscar Rubiños³


Abstract

Generating replicates of the available data enables researchers to solve different statistical problems, such as the estimation of standard errors, inference on parameters or even the approximation of distribution functions. With this aim, Bootstrap approaches are suggested in the current work, specifically designed for spatial data, as they take into account the dependence structure of the underlying random process. The key idea is to construct nonparametric distribution estimators, adapted to the spatial setting, which are themselves distribution functions associated with discrete or continuous random variables. The Bootstrap samples are then obtained by drawing at random from the estimated distribution. Consistency of the suggested approaches is proved by assuming stationarity of the random process or by relaxing this hypothesis to admit a deterministic trend. Numerical studies for simulated data and a real data set, obtained from environmental monitoring, are included to illustrate the application of the proposed Bootstrap methods.


References

Bowman A, Hall P, Prvan T (1998) Bandwidth selection for the smoothing of distribution functions. Biometrika 85(4):799–808. doi:10.1093/biomet/85.4.799
Crujeiras RM, Fernández-Casal R, González-Manteiga W (2010) Goodness-of-fit tests for the spatial spectral density. Stoch Environ Res Risk Assess 24(1):67–79. doi:10.1007/s00477-008-0300-0
De Angelis D, Young GA (1992) Smoothing the bootstrap. Int Stat Rev 60(1):45–56
Efron B (1979) Bootstrap methods: another look at the Jackknife. Ann Stat 7(1):1–26. doi:10.1214/aos/1176344552
García-Soidán P (2007) Asymptotic normality of the Nadaraya–Watson semivariogram estimators. TEST 16(3):479–503. doi:10.1007/s11749-006-0016-8
Goovaerts P (1997) Geostatistics for natural resources evaluation. Oxford University Press, New York
Govaerts B, Beck B, Lecoutre E, Le Bailly C, VandenEeckaut P (2005) From monitoring data to regional distributions: a practical methodology applied to water risk assessment. Environmetrics 16(2):109–127. doi:10.1002/env.665
Hall P (1985) Resampling a coverage pattern. Stoch Process Appl 20(2):231–246. doi:10.1016/0304-4149(85)90212-1
Hall P (1992) The bootstrap and Edgeworth expansion. Springer, New York
Hall P, Maiti T (2006) On parametric bootstrap methods for small area prediction. J R Stat Soc B 68(2):221–238. doi:10.1111/j.1467-9868.2006.00541.x
Hall P, Patil P (1994) Properties of nonparametric estimators of autocovariance for stationary random fields. Probab Theory Relat Fields 99(3):399–424. doi:10.1007/BF01199899
Hyun-Han K, Young-Il M (2006) Improvement of overtopping risk evaluations using probabilistic concepts for existing dams. Stoch Environ Res Risk Assess 20(4):223–237. doi:10.1007/s00477-005-0017-2
Iranpanah N, Mansourian A, Tashayo B, Haghighi F (2011) Spatial semi-parametric bootstrap method for analysis of kriging predictor of random field. Procedia Environ Sci 3:81–86. doi:10.1016/j.proenv.2011.02.015
Lejeune M, Sarda P (1992) Smooth estimators of distribution and density functions. Comput Stat Data Anal 14:457–471. doi:10.1016/0167-9473(92)90061-J
Li B, Genton M, Sherman M (2007) A nonparametric assessment of properties of space–time covariance functions. JASA 102(478):736–744. doi:10.1198/016214507000000202
Loh JM (2008) A valid and fast spatial Bootstrap for correlation functions. Astrophys J 681(1):726–734. doi:10.1086/588631
Maglione DS, Diblasi AM (2004) Exploring a valid model for the variogram of an isotropic spatial process. Stoch Environ Res Risk Assess 18(6):366–376. doi:10.1007/s00477-003-0143-7
Martins A, Figueira R, Sousa A, Sérgio C (2012) Spatio-temporal patterns of Cu contamination in mosses using geostatistical estimation. Environ Pollut 170:276–284. doi:10.1016/j.envpol.2012.07.004
Menezes R, García-Soidán P, Ferreira C (2010) Nonparametric spatial prediction under stochastic sampling design. J Nonparametr Stat 22(3):363–377. doi:10.1080/10485250903094294
Olea RA, Pardo-Igúzquiza E (2011) Generalized Bootstrap method for assessment of uncertainty in semivariogram inference. Math Geosci 43(2):203–228. doi:10.1007/s11004-010-9269-6
Politis DN, Romano JP, Wolf M (1999) Subsampling. Springer, Berlin
Shapiro A, Botha JD (1991) Variogram fitting with a general class of conditionally nonnegative definite functions. Comput Stat Data Anal 11(1):87–96. doi:10.1016/0167-9473(91)90055-7
Silverman BW, Young GA (1987) The bootstrap: to smooth or not to smooth? Biometrika 74(3):469–479. doi:10.1093/biomet/74.3.469

Acknowledgments

The authors would like to thank the Reviewers for their helpful suggestions and comments. The authors are also grateful to Dr. K. J. Duncan-Barlow (University of Vigo) for her contribution to the language revision. The first and third authors acknowledge financial support from the Project TEC2011-28683-C02-02 of the Spanish Ministry of Science and Innovation and the Project CN2012/279 from the European Regional Development Fund and the Galician Regional Government (Xunta de Galicia). The second author’s work has been supported by the Project PTDC/MAT/112338/2009 (FEDER support included) of the Portuguese Ministry of Science, Technology and Higher Education.

Author information

Authors and Affiliations

Department of Statistics and Operations Research, University of Vigo, Vigo, Spain
Pilar García-Soidán
Department of Mathematics and Applications, University of Minho, Braga, Portugal
Raquel Menezes
Department of Signal Theory and Communications, University of Vigo, Vigo, Spain
Óscar Rubiños

Corresponding author

Correspondence to Pilar García-Soidán.

Appendices

Appendix 1: Consistency of $\hat{F}_{{\rm s}_1,{\ldots}, {\rm s}_k}^{(2)} \left( x_1,{\ldots},x_k \right)$

To check that consistency follows for $\hat{F}_{{\rm s}_1,{\ldots}, {\rm s}_k}^{(2)} \left( x_1,{\ldots},x_k \right), $ the hypotheses described below will be assumed:

(i) $\{ Z ( {\rm s} ) \in {I\!R} : {\rm s} \in D \subset {I\!R}^d \}$ can be modeled as given in (1).
(ii) $D = \lambda D_0,$ for some $\lambda=\lambda (n) \mathop{\longrightarrow}\limits^{n \rightarrow + \infty} +\infty$ and bounded $D_0 \subset {I\!R}^d. $
(iii) ${\rm t}_i = \lambda {\rm u}_i,$ for $1 \leq i \leq n,$ where ${\rm u}_1, {\ldots}, {\rm u}_n$ denotes a realization of a random sample of size $n$ drawn from a density function $g_0$ considered on $D_0.$
(iv) $Z(\cdot)$ is $\alpha$-mixing, with $\alpha(r) = O(r^{-a}),$ for $r > 0$ and some constant $a > 0.$
(v) $K$ is a $d$-variate, symmetric density function with compact support.
(vi) $\{ h_{1}^{2}+ {\cdots} + h_{k-1}^{2}+ \lambda^{-1} + n^{-k} \lambda^{d(k-1)} h_{1}^{-d} {\ldots} h_{k-1}^{-d} \} \mathop{\longrightarrow}\limits^{n \rightarrow + \infty} 0. $
(vii) $F_{{\rm s}_1,{\ldots}, {\rm s}_k} (x_1,{\ldots},x_k)$ is three-times continuously differentiable as a function of $({\rm s}_1,{\ldots}, {\rm s}_k). $
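
Conditions (ii), (iii) and (vi) can be explored numerically. The following Python sketch is only illustrative: it assumes that $g_0$ is the uniform density on $D_0 = [-1/2, 1/2]^d$, and the function names `sample_locations` and `condition_vi_terms` are hypothetical, not taken from the paper.

```python
import numpy as np

def sample_locations(n, lam, d=2, rng=None):
    """Draw u_1, ..., u_n from g_0 and return t_i = lam * u_i, as in (ii)-(iii).

    Assumption: g_0 is uniform on D_0 = [-1/2, 1/2]^d.
    """
    rng = np.random.default_rng(rng)
    u = rng.uniform(-0.5, 0.5, size=(n, d))
    return lam * u

def condition_vi_terms(n, lam, h, d=2):
    """Return the three groups of terms in condition (vi).

    h holds the bandwidths (h_1, ..., h_{k-1}); all three values
    should tend to zero along the chosen sequences.
    """
    h = np.asarray(h, dtype=float)
    k = h.size + 1
    bias_term = float(np.sum(h ** 2))
    domain_term = 1.0 / float(lam)
    var_term = float(n) ** (-k) * float(lam) ** (d * (k - 1)) * float(np.prod(h ** (-d)))
    return bias_term, domain_term, var_term
```

For instance, with $d = 2$ and $k = 2$, sequences such as $\lambda = n^{1/4}$ and $h_1 = n^{-1/10}$ give a third term of order $n^{-1.3}$, so all three terms vanish as $n$ grows.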

We will prove that the bias and the variance of $\hat{F}_{{\rm s}_1,{\ldots}, {\rm s}_k}^{(2)} \left( x_1,{\ldots},x_k \right)$ are of the respective orders $( h_{1}^{2}+ {\cdots} + h_{k-1}^{2})$ and $(n^{-k} \lambda^{d(k-1)} h_{1}^{-d} {\ldots} h_{k-1}^{-d} + \lambda^{-d})$ and, therefore, tend to zero as the sample size n increases, which yields the consistency of the distribution estimator. To this end, conditions (i)–(vii) will be applied, following a procedure similar to that in the proof of Theorem 3.1 in Hall and Patil (1994).

Write $A_{i_1,i_2}=\frac{ p_{{\rm s}_1,{\rm s}_2}^{{\rm t}_{i_1},{\rm t}_{i_2},h_1} }{\sum_{i_1=1}^{n} \sum_{i_2=1}^{n} p_{{\rm s}_1,{\rm s}_2}^{{\rm t}_{i_1},{\rm t}_{i_2},h_1}}$ and $A_{i_{j-1},i_j}=\frac{ p_{{\rm s}_{j-1},{\rm s}_j}^{{\rm t}_{i_{j-1}},{\rm t}_{i_j},h_{j-1}} }{\sum_{i_j=1}^{n} p_{{\rm s}_{j-1},{\rm s}_j}^{{\rm t}_{i_{j-1}},{\rm t}_{i_j},h_{j-1}}}$ for $j=3,{\ldots},k. $ Firstly, we can take into account that, for large n:

$$ \begin{aligned} {\rm E} \left[\hat{F}_{{\rm s}_1,{\ldots}, {\rm s}_k}^{(2)} \left( x_1,{\ldots},x_k \right)\right] &={\rm E} \left[ {\rm E} \left[ \left. \hat{F}_{{\rm s}_1,{\ldots}, {\rm s}_k}^{(2)} \left( x_1,{\ldots},x_k \right) \right/ {\rm t}_{i_j}, \forall j \right] \right] \\
&=\sum\limits_{i_1=1}^{n} {\ldots} \sum\limits_{i_k=1}^{n} {\rm E} \left[A_{i_1,i_2} {\ldots} A_{i_{k-1},i_k} {\rm E} \left[ \left. I_{\{ X({\rm t}_{i_1}) \leq x_1 \}} {\ldots} I_{\{ X({\rm t}_{i_k}) \leq x_k \}} \right/ {\rm t}_{i_j}, \forall j \right]\right] \\
&= \sum\limits_{i_1=1}^{n} {\ldots} \sum\limits_{i_k=1}^{n} {\rm E} \left[ A_{i_1,i_2} {\ldots} A_{i_{k-1},i_k} F_{{\rm t}_{i_1},{\ldots}, {\rm t}_{i_k}} \left( x_1 + \mu \left( {\rm s}_1 + {\rm t}_{i_1}- {\rm s}_1\right) - \mu \left( {\rm s}_1 \right),{\ldots},x_k + \mu \left( {\rm s}_k + {\rm t}_{i_1}- {\rm s}_1\right) - \mu \left( {\rm s}_k \right)\right)\right] \\
&=\sum\limits_{i_1=1}^{n} {\ldots} \sum\limits_{i_k=1}^{n} {\rm E} \left[ A_{i_1,i_2} {\ldots} A_{i_{k-2},i_{k-1}} \cdot {\rm E} \left[ \left. A_{i_{k-1},i_k} F_{{\rm t}_{i_1}-{\rm t}_{i_1}+{\rm s}_1, {\ldots},{\rm t}_{i_k}-{\rm t}_{i_1}+{\rm s}_1 } \left( x_1,{\ldots},x_k \right) \right/ {\rm t}_{i_j}, j\leq k-1 \right] \right] \end{aligned} $$

on account of (3).
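
For $k = 2$, the weights $A_{i_1,i_2}$ and the resulting discrete estimator can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the unnormalised weight $p$ is taken to be $K\left( ({\rm s}_1-{\rm s}_2-({\rm t}_{i_1}-{\rm t}_{i_2}))/h_1 \right)$, in line with the kernel argument used in this appendix, $K$ is chosen as an Epanechnikov-type kernel, and all function names are hypothetical.

```python
import numpy as np

def kernel(z):
    """Symmetric d-variate kernel with compact support (unit ball).

    Assumption: Epanechnikov-type shape; it is left unnormalised because
    the constant cancels in the weights A.
    """
    return np.clip(1.0 - np.sum(z ** 2, axis=-1), 0.0, None)

def pair_weights(t, s1, s2, h1):
    """Matrix A with A[i1, i2] proportional to K((s1 - s2 - (t_i1 - t_i2)) / h1),
    normalised so that the double sum over (i1, i2) equals 1."""
    t = np.asarray(t, dtype=float)
    lag = np.asarray(s1, dtype=float) - np.asarray(s2, dtype=float)
    diff = t[:, None, :] - t[None, :, :]      # t_i1 - t_i2, shape (n, n, d)
    p = kernel((lag - diff) / h1)
    total = p.sum()
    if total == 0.0:
        raise ValueError("no pair of locations falls within the kernel support")
    return p / total

def discrete_cdf(x_obs, A, x1, x2):
    """Sum over (i1, i2) of A[i1, i2] * 1{X(t_i1) <= x1} * 1{X(t_i2) <= x2}."""
    left = (x_obs <= x1).astype(float)
    right = (x_obs <= x2).astype(float)
    return float(left @ A @ right)

def draw_pairs(x_obs, A, size, rng=None):
    """Bootstrap draws: pick the index pair (i1, i2) with probability A[i1, i2]."""
    rng = np.random.default_rng(rng)
    n = x_obs.size
    flat = rng.choice(n * n, size=size, p=A.ravel())
    i1, i2 = np.divmod(flat, n)               # row-major: flat = i1 * n + i2
    return np.column_stack([x_obs[i1], x_obs[i2]])
```

Here `x_obs` holds the values $X({\rm t}_i)$ entering the indicators. Because the weights are nonnegative and sum to one, the estimator is itself the distribution function of a discrete random vector, which is what makes direct resampling possible.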

Now, the last conditional expectation will be approximated. With this aim, bear in mind that:

$$ \begin{aligned} {\rm E} \left[ K\left( \frac{{\rm s}_{k-1}-{\rm s}_k-({\rm t}_{i_{k-1}}-{\rm t}_{i_k})}{h_{k-1}} \right) F_{{\rm t}_{i_1}-{\rm t}_{i_1}+{\rm s}_1, {\ldots},{\rm t}_{i_k}-{\rm t}_{i_1}+{\rm s}_1 } \left( x_1,{\ldots},x_k \right)\right] &= \int K\left( \frac{{\rm s}_{k-1}-{\rm s}_k-({\rm t}_{i_{k-1}}-\lambda {\rm u})}{h_{k-1}} \right) F_{{\rm t}_{i_1}-{\rm t}_{i_1}+{\rm s}_1, {\ldots},\lambda {\rm u}-{\rm t}_{i_1}+{\rm s}_1 } \left( x_1,{\ldots},x_k \right) g_0({\rm u}) d{\rm u} \\
&\approx \lambda^d h_{k-1}^{d} g_0(0) \int K\left( {\rm z}_1 \right) F_{{\rm t}_{i_1}-{\rm t}_{i_1}+{\rm s}_1, {\ldots},{\rm s}_k-{\rm s}_{k-1}+{\rm t}_{i_{k-1}} -{\rm t}_{i_1}+{\rm s}_1+h_{k-1} {\rm z}_1 } \left( x_1,{\ldots},x_k \right) d{\rm z}_1 \\
{\rm E} \left[ K\left( \frac{{\rm s}_{k-1}-{\rm s}_k-({\rm t}_{i_{k-1}}-{\rm t}_{i_k})}{h_{k-1}} \right) \right] &= \int K\left( \frac{{\rm s}_{k-1}-{\rm s}_k-({\rm t}_{i_{k-1}}-\lambda {\rm u})}{h_{k-1}} \right) g_0({\rm u}) d{\rm u} \\
&\approx \lambda^d h_{k-1}^{d} g_0(0) \end{aligned} $$

From the previous relations, it follows that:

$${\rm E} \left[ \left. A_{i_{k-1},i_k} F_{{\rm t}_{i_1}-{\rm t}_{i_1}+{\rm s}_1, {\ldots},{\rm t}_{i_k}-{\rm t}_{i_1}+{\rm s}_1 } \left( x_1,{\ldots},x_k \right)\right/ {\rm t}_{i_j}, j\leq k-1 \right]\approx n^{-1} \int K\left( {\rm z}_1 \right) F_{{\rm t}_{i_1}-{\rm t}_{i_1}+{\rm s}_1, {\ldots},{\rm s}_k-{\rm s}_{k-1}+{\rm t}_{i_{k-1}} -{\rm t}_{i_1}+{\rm s}_1+h_{k-1}{\rm z}_1 } \left( x_1,{\ldots},x_k \right) d{\rm z}_1 $$

and, therefore:

$${\rm E} \left[\hat{F}_{{\rm s}_1,{\ldots}, {\rm s}_k}^{(2)} \left( x_1,{\ldots},x_k \right)\right] \approx \sum\limits_{i_1=1}^{n} {\ldots} \sum\limits_{i_{k-1}=1}^{n} {\rm E} \left[ A_{i_1,i_2} {\ldots} A_{i_{k-2},i_{k-1}} \int K\left( {\rm z}_1 \right) \cdot F_{{\rm t}_{i_1}-{\rm t}_{i_1}+{\rm s}_1, {\ldots},{\rm s}_k-{\rm s}_{k-1}+{\rm t}_{i_{k-1}} -{\rm t}_{i_1}+{\rm s}_1 +h_{k-1}{\rm z}_1} \left( x_1,{\ldots},x_k \right) d{\rm z}_1 \right].$$

We can iterate the strategy above, based on applying an appropriate conditional expectation and developing the resulting term, to achieve that:

$${\rm E} \left[\hat{F}_{{\rm s}_1,{\ldots}, {\rm s}_k}^{(2)} \left( x_1,{\ldots},x_k \right)\right]\approx \int {\ldots} \int F_{{\rm s}_1,{\rm s}_2+h_1{\rm z}_{k-1},{\ldots}, {\rm s}_k+h_1{\rm z}_{k-1}+ {\cdots} +h_{k-1}{\rm z}_1} \left( x_1,x_2,{\ldots},x_k \right) \cdot K \left( {\rm z}_1 \right) {\ldots} K \left( {\rm z}_{k-1} \right) d{\rm z}_1 {\ldots} d{\rm z}_{k-1} = F_{{\rm s}_1,{\rm s}_2,{\ldots}, {\rm s}_k} \left( x_1,x_2,{\ldots},x_k \right) + O \left( h_{1}^{2}+{\cdots}+ h_{k-1}^{2}\right) $$
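
The order of the bias can be made explicit in the simplest case $k = 2$. The following is a sketch under the smoothness assumed in (vii), where $\nabla_{{\rm s}_2}$ and ${\rm H}_{{\rm s}_2}$ (notation introduced here only for illustration) denote the gradient and the Hessian matrix of $F_{{\rm s}_1,{\rm s}_2}(x_1,x_2)$ with respect to ${\rm s}_2$. A second-order Taylor expansion gives:

$$ F_{{\rm s}_1,{\rm s}_2+h_1{\rm z}} \left( x_1,x_2 \right) = F_{{\rm s}_1,{\rm s}_2} \left( x_1,x_2 \right) + h_1 {\rm z}^{T} \nabla_{{\rm s}_2} F_{{\rm s}_1,{\rm s}_2} \left( x_1,x_2 \right) + \frac{h_1^2}{2} {\rm z}^{T} {\rm H}_{{\rm s}_2} F_{{\rm s}_1,{\rm s}_2} \left( x_1,x_2 \right) {\rm z} + o \left( h_1^2 \right) $$

Since $K$ is a symmetric density, $\int K({\rm z}) d{\rm z} = 1$ and $\int {\rm z} K({\rm z}) d{\rm z} = 0,$ so that the first-order term vanishes after integration:

$$ \int F_{{\rm s}_1,{\rm s}_2+h_1{\rm z}} \left( x_1,x_2 \right) K({\rm z}) d{\rm z} = F_{{\rm s}_1,{\rm s}_2} \left( x_1,x_2 \right) + \frac{h_1^2}{2} {\rm tr} \left( {\rm H}_{{\rm s}_2} F_{{\rm s}_1,{\rm s}_2} \left( x_1,x_2 \right) \int {\rm z} {\rm z}^{T} K({\rm z}) d{\rm z} \right) + o \left( h_1^2 \right) $$

The same argument, applied to each of the $k-1$ kernel integrals in turn, accounts for the term $O \left( h_{1}^{2}+{\cdots}+ h_{k-1}^{2}\right)$ in the general case.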

With regard to the variance, one has for large n that:

$$ {\rm Var} \left[\hat{F}_{{\rm s}_1,{\ldots}, {\rm s}_k}^{(2)} \left( x_1,{\ldots},x_k \right)\right]={\rm E} \left[ \left(\hat{F}_{{\rm s}_1,{\ldots}, {\rm s}_k}^{(2)} \left( x_1,{\ldots},x_k \right) -{\rm E} \left[\hat{F}_{{\rm s}_1,{\ldots}, {\rm s}_k}^{(2)} \left( x_1,{\ldots},x_k \right)\right]\right)^2 \right] \approx V_1+V_2 $$

where:

$$ \begin{aligned} V_1 &= \sum\limits_{i_1=1}^{n} {\ldots}\sum\limits_{i_k=1}^{n} {\rm E} \left[ A_{i_1,i_2}^{2} {\ldots}A_{i_{k-1},i_k}^{2}\cdot \left( I_{\{ X({\rm t}_{i_1}) \leq x_1\}} {\ldots} I_{\{ X({\rm t}_{i_k}) \leq x_k \}} - F_{{\rm s}_1,{\ldots},{\rm s}_k } \left( x_1,{\ldots},x_k \right)^2 \right)\right]\\ V_2 &= \sum\limits_{i_1=1}^{n} {\ldots}\sum\limits_{i_k=1}^{n} \sum\limits_{j_1=1}^{n} {\ldots}\sum\limits_{j_k=1}^{n}\cdot{\rm E} \left[ A_{i_1,i_2} {\ldots}A_{i_{k-1},i_k} A_{j_1,j_2} {\ldots} A_{j_{k-1},j_k}\left( I_{\{X({\rm t}_{i_1}) \leq x_1 \}} {\ldots} I_{\{ X({\rm t}_{i_k}) \leq x_k \}} \cdot I_{\{ X({\rm t}_{j_1}) \leq x_1 \}} {\ldots} I_{\{X({\rm t}_{j_k}) \leq x_k \}} -F_{{\rm s}_1, {\ldots},{\rm s}_k }\left( x_1,{\ldots},x_k \right)^2 \right) \right] \end{aligned}$$

By using similar arguments as above, we could check that:

$$ \begin{aligned} V_1 &\approx \frac{n^{-k} \lambda^{d(k-1)} h_{1}^{-d} {\ldots} h_{k-1}^{-d} \left( F_{{\rm s}_1,{\ldots}, {\rm s}_k} \left( x_1,{\ldots},x_k \right)- F_{{\rm s}_1,{\ldots}, {\rm s}_k} \left( x_1,{\ldots},x_k \right)^2 \right)}{ g_0(0)^{k}\left( \int K \left( {\rm z} \right)^2 d{\rm z} \right)^{k-1} } \\
V_2 &\approx \lambda^{-d} \int \left( F_{{\rm s}_1,{\ldots}, {\rm s}_k,{\rm s}_1+{\rm t},{\ldots}, {\rm s}_k+{\rm t}} \left( x_1,{\ldots},x_k ,x_1,{\ldots},x_k\right) - F_{{\rm s}_1,{\ldots}, {\rm s}_k} \left( x_1,{\ldots},x_k \right)^2 \right) d{\rm t} \end{aligned} $$

Consequently:

$$ {\rm Var} \left[\hat{F}_{{\rm s}_1,{\ldots}, {\rm s}_k}^{(2)} \left( x_1,{\ldots},x_k \right)\right]= O \left( n^{-k} \lambda^{d(k-1)} h_{1}^{-d} {\ldots} h_{k-1}^{-d} + \lambda^{-d} \right) $$

We could derive the dominant terms of the bias and the variance of the distribution estimator as well as asymptotically minimize the mean squared error (MSE) of the distribution estimator, namely:

$${\rm MSE} \left[\hat{F}_{{\rm s}_1,{\ldots}, {\rm s}_k}^{(2)} \left( x_1,{\ldots},x_k \right) \right]={\rm Bias} \left[ \hat{F}_{{\rm s}_1,{\ldots}, {\rm s}_k}^{(2)} \left( x_1,{\ldots},x_k \right) \right]^2 + {\rm Var} \left[\hat{F}_{{\rm s}_1,{\ldots}, {\rm s}_k}^{(2)} \left( x_1,{\ldots},x_k \right)\right] $$

to obtain the optimal bandwidths $h_j,$ for $j=1,{\ldots},k-1, $ which would be dependent on unknown terms, such as the multivariate distribution function itself and its second-order derivatives.
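
As an illustration of this trade-off, consider $k = 2$ and suppose, purely as a working assumption, that the leading bias and variance terms can be written with positive constants $C_1$ and $C_2$ (hypothetical symbols, standing for the unknown quantities mentioned above). The term of order $\lambda^{-d}$ does not depend on $h_1$ and plays no role in the minimization:

$$ {\rm MSE} \left( h_1 \right) \approx C_1 h_1^{4} + C_2 n^{-2} \lambda^{d} h_1^{-d} + O \left( \lambda^{-d} \right) $$

Setting the derivative with respect to $h_1$ equal to zero, $4 C_1 h_1^{3} - d C_2 n^{-2} \lambda^{d} h_1^{-d-1} = 0,$ leads to:

$$ h_1^{opt} = \left( \frac{d C_2}{4 C_1} \right)^{1/(d+4)} \left( n^{-2} \lambda^{d} \right)^{1/(d+4)} $$

Under these assumptions, the optimal bandwidth decreases at the rate $\left( n^{-2} \lambda^{d} \right)^{1/(d+4)},$ while its constant remains unknown in practice.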

Appendix 2: Consistency of $\tilde{F}_{{\rm s}_1,{\ldots}, {\rm s}_k} \left( x_1,{\ldots},x_k \right)$

To derive this proof, we will assume conditions (i)–(v), together with:

(vi′) $\{ h_{1}^{2}+ {\cdots} + h_{k-1}^{2}+ h^2+\lambda^{-1} + n^{-k} \lambda^{d(k-1)} h_{1}^{-d} {\ldots} h_{k-1}^{-d} \} \mathop{\longrightarrow}\limits^{n \rightarrow + \infty} 0. $
(vii′) $F_{{\rm s}_1,{\ldots}, {\rm s}_k} (x_1,{\ldots},x_k)$ is three-times continuously differentiable as a function of $({\rm s}_1,{\ldots}, {\rm s}_k)$ and as a function of $(x_1,{\ldots},x_k). $
(viii) $L$ is a univariate and symmetric density function with compact support.

For large n, the aforementioned hypotheses yield that:

$$ \begin{aligned} {\rm E} \left[\tilde{F}_{{\rm s}_1,{\ldots}, {\rm s}_k} \left( x_1,{\ldots},x_k \right)\right] &={\rm E} \left[ {\rm E} \left[ \left. \tilde{F}_{{\rm s}_1,{\ldots}, {\rm s}_k} \left( x_1,{\ldots},x_k \right) \right/ {\rm t}_{i_j}, \forall j \right] \right] \\
&=\sum\limits_{i_1=1}^{n} {\ldots} \sum\limits_{i_k=1}^{n} {\rm E} \left[ A_{i_1,i_2} {\ldots} A_{i_{k-1},i_{k}} {\rm E} \left[ \left. {\cal L} \left( \frac{x_1 -X ( {\rm t}_{i_1} )}{h} \right) {\ldots} {\cal L} \left( \frac{x_k -X ( {\rm t}_{i_k} )}{h} \right) \right/ {\rm t}_{i_j}, \forall j \right]\right] \\
&=\sum\limits_{i_1=1}^{n} {\ldots} \sum\limits_{i_k=1}^{n} {\rm E} \left[ A_{i_1,i_2}\cdot{\ldots}\cdot A_{i_{k-1},i_{k}} \cdot \int {\cal L} \left( \frac{x_1 -u_1+\mu ( {\rm s}_1-{\rm t}_{i_1}+{\rm s}_1)-\mu({\rm s}_1)}{h} \right) {\ldots} {\cal L} \left( \frac{x_k -u_k+\mu ( {\rm s}_k-{\rm t}_{i_1}+{\rm s}_1)-\mu({\rm s}_k)}{h} \right) \cdot f_{{\rm t}_{i_1},{\ldots}, {\rm t}_{i_k}} \left( u_1,{\ldots},u_k\right) du_1 {\ldots} du_k \right] \end{aligned} $$

where $f_{{\rm t}_1,{\ldots}, {\rm t}_k}$ denotes the joint density function of $(Z({\rm t}_1),{\ldots},Z({\rm t}_k)). $
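
For $k = 2$, the continuous estimator can be sketched in Python as a kernel-smoothed version of the discrete one. The sketch assumes that ${\cal L}$ is the distribution function associated with the density $L$ (as suggested by the integration by parts used in this appendix) and that $L$ is the Epanechnikov density on $[-1, 1]$; the function names `L_cdf` and `smooth_cdf` are hypothetical.

```python
import numpy as np

def L_cdf(y):
    """Distribution function of the Epanechnikov density L(y) = 0.75 * (1 - y^2)
    on [-1, 1] (an assumed choice of L): 0 below -1, 1 above 1."""
    y = np.clip(y, -1.0, 1.0)
    return 0.5 + 0.75 * y - 0.25 * y ** 3

def smooth_cdf(x_obs, A, x1, x2, h):
    """Sum over (i1, i2) of A[i1, i2] * L_cdf((x1 - X(t_i1)) / h) * L_cdf((x2 - X(t_i2)) / h).

    A is the (n, n) matrix of normalised pair weights and x_obs holds the
    n observed values X(t_i).
    """
    left = L_cdf((x1 - x_obs) / h)
    right = L_cdf((x2 - x_obs) / h)
    return float(left @ A @ right)
```

As $h$ decreases, `L_cdf((x - x_obs) / h)` approaches the indicator of `x_obs <= x` (away from ties at `x`), so the smoothed estimator approaches the discrete one.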

We can integrate by parts and apply relation (3) to obtain that:

$$ \begin{aligned} {\rm E} \left[\tilde{F}_{{\rm s}_1,{\ldots}, {\rm s}_k} \left( x_1,{\ldots},x_k \right)\right] &= \sum\limits_{i_1=1}^{n} {\ldots} \sum\limits_{i_k=1}^{n} {\rm E} \left[ A_{i_1,i_2} {\ldots} A_{i_{k-1},i_{k}} \int L\left( y_1\right) {\ldots} L\left( y_k\right) \cdot F_{{\rm t}_{i_1},{\ldots}, {\rm t}_{i_k}} \left( x_1 -hy_1+\mu ( {\rm s}_1-{\rm t}_{i_1}+{\rm s}_1)-\mu({\rm s}_1) , {\ldots}, x_k -hy_k+\mu ( {\rm s}_k-{\rm t}_{i_1}+{\rm s}_1)-\mu({\rm s}_k)\right) dy_1 {\ldots} dy_k \right] \\
&= \sum\limits_{i_1=1}^{n} {\ldots} \sum\limits_{i_k=1}^{n} {\rm E} \left[ A_{i_1,i_2} {\ldots} A_{i_{k-1},i_{k}} \int L\left( y_1\right) {\ldots} L\left( y_k\right) \cdot F_{{\rm t}_{i_1}-{\rm t}_{i_1}+{\rm s}_1, {\ldots},{\rm t}_{i_k}-{\rm t}_{i_1}+{\rm s}_1 } \left( x_1 -hy_1 , {\ldots}, x_k -hy_k\right) dy_1 {\ldots} dy_k \right] \end{aligned} $$

By proceeding with arguments analogous to those used for the bias of $\hat{F}_{{\rm s}_1,{\ldots}, {\rm s}_k}^{(2)}, $ it follows that:

$${\rm E} \left[\tilde{F}_{{\rm s}_1,{\ldots}, {\rm s}_k} \left( x_1,{\ldots},x_k \right)\right] \approx \int {\ldots} \int \int {\ldots} \int L\left( y_1\right) {\ldots} L\left( y_k\right) \cdot K \left( {\rm z}_1 \right) {\ldots} K \left( {\rm z}_{k-1} \right) F_{{\rm s}_1,{\rm s}_2+h_1{\rm z}_{k-1},{\ldots}, {\rm s}_k+h_1{\rm z}_{k-1}+ {\cdots} +h_{k-1}{\rm z}_1} \left( x_1 -hy_1, {\ldots}, x_k -hy_k\right) d{\rm z}_1 {\ldots} d{\rm z}_{k-1} dy_1 {\ldots} dy_k= F_{{\rm s}_1,{\rm s}_2,{\ldots}, {\rm s}_k} \left( x_1,{\ldots},x_k \right) + O \left( h_{1}^{2}+{\cdots}+ h_{k-1}^{2}+h^2\right)$$

Finally, the approximation of the variance of the continuous estimator will be addressed as given below:

$$ {\rm Var} \left[\tilde{F}_{{\rm s}_1,{\ldots}, {\rm s}_k} \left( x_1,{\ldots},x_k \right)\right] \approx W_1+W_2 $$

with:

$$ \begin{aligned} W_1 &= \sum\limits_{i_1=1}^{n} {\ldots} \sum\limits_{i_k=1}^{n} {\rm E} \left[ A_{i_1,i_2}^{2} {\ldots} A_{i_{k-1},i_k}^{2} \left( {\cal L} \left( \frac{x_1 -X ( {\rm t}_{i_1} )}{h} \right) {\ldots} {\cal L} \left( \frac{x_k -X ( {\rm t}_{i_k} )}{h} \right) - F_{{\rm s}_1, {\ldots},{\rm s}_k } \left( x_1,{\ldots},x_k \right)^2 \right) \right] \\
W_2 &= \sum\limits_{i_1=1}^{n} {\ldots} \sum\limits_{i_k=1}^{n} \sum\limits_{j_1=1}^{n} {\ldots} \sum\limits_{j_k=1}^{n} {\rm E} \left[ A_{i_1,i_2} {\ldots} A_{i_{k-1},i_k} A_{j_1,j_2} {\ldots} A_{j_{k-1},j_k}\cdot \left( {\cal L} \left( \frac{x_1 -X ( {\rm t}_{i_1} )}{h} \right) {\ldots} {\cal L} \left( \frac{x_k -X ( {\rm t}_{i_k} )}{h}\right) {\cal L} \left( \frac{x_1 -X ( {\rm t}_{j_1} )}{h} \right) {\ldots} {\cal L} \left( \frac{x_k -X ( {\rm t}_{j_k} )}{h} \right)-F_{{\rm s}_1, {\ldots},{\rm s}_k } \left( x_1,{\ldots},x_k \right)^2 \right) \right] \end{aligned} $$

Now, we can combine the arguments used for the bias of the continuous estimator with those applied for the variance of the discrete estimator to check that both terms, $W_1$ and $W_2,$ are asymptotically negligible, as established next:

$$ \begin{aligned} W_1 &\approx \sum\limits_{i_1=1}^{n} {\ldots} \sum\limits_{i_k=1}^{n} {\rm E} \left[ A_{i_1,i_2}^2 {\ldots} A_{i_{k-1},i_{k}}^2 \int L( y_1) {\ldots} L( y_k)\cdot ( F_{{\rm s}_1, {\ldots},{\rm s}_k} ( x_1 -hy_1 , {\ldots}, x_k -hy_k ) - F_{{\rm s}_1, {\ldots},{\rm s}_k } ( x_1,{\ldots},x_k )^2 ) dy_1 {\ldots} dy_k\right] \\
&\approx \sum\limits_{i_1=1}^{n} {\ldots} \sum\limits_{i_k=1}^{n} {\rm E} \left[ A_{i_1,i_2}^2 {\ldots} A_{i_{k-1},i_{k}}^2 ( F_{{\rm s}_1, {\ldots},{\rm s}_k } ( x_1,{\ldots},x_k ) - F_{{\rm s}_1, {\ldots},{\rm s}_k } ( x_1,{\ldots},x_k )^2)\right] \\
&\approx\frac{n^{-k} \lambda^{d(k-1)} h_{1}^{-d} {\ldots} h_{k-1}^{-d} ( F_{{\rm s}_1,{\ldots}, {\rm s}_k} ( x_1,{\ldots},x_k )- F_{{\rm s}_1,{\ldots}, {\rm s}_k} ( x_1,{\ldots},x_k )^2 )}{ g_0(0)^{k}\left( \int K ( {\rm z} )^2 d{\rm z} \right)^{k-1} }= O \left( n^{-k} \lambda^{d(k-1)} h_{1}^{-d} {\ldots} h_{k-1}^{-d} \right) \\
W_2 &\approx \sum\limits_{i_1=1}^{n} {\ldots} \sum\limits_{i_k=1}^{n} \sum\limits_{j_1=1}^{n} {\ldots} \sum\limits_{j_k=1}^{n} {\rm E} \left[ A_{i_1,i_2} {\ldots} A_{i_{k-1},i_{k}} A_{j_1,j_2} {\ldots} A_{j_{k-1},j_{k}}\right]\cdot \int \int {\ldots} \int \int {\ldots} \int L( y_1) {\ldots} L( y_k) L( w_1) {\ldots} L( w_k) ( F_{{\rm s}_1, {\ldots},{\rm s}_k,{\rm s}_1+{\rm t}, {\ldots},{\rm s}_k+{\rm t}} ( x_1 -hy_1 , {\ldots}, x_k -hy_k,x_1 -hw_1 , {\ldots}, x_k -hw_k) - F_{{\rm s}_1, {\ldots},{\rm s}_k } ( x_1,{\ldots},x_k )^2) d{\rm t} dy_1 {\ldots} dy_k \cdot dw_1 {\ldots} dw_k \\
&\approx \lambda^{-d} \int ( F_{{\rm s}_1,{\ldots}, {\rm s}_k,{\rm s}_1+{\rm t},{\ldots}, {\rm s}_k+{\rm t}} ( x_1,{\ldots},x_k ,x_1,{\ldots},x_k) - F_{{\rm s}_1,{\ldots}, {\rm s}_k} ( x_1,{\ldots},x_k )^2) d{\rm t}= O (\lambda^{-d}) \end{aligned} $$

Consistency of $\tilde{F}_{{\rm s}_1,{\ldots}, {\rm s}_k} \left( x_1,{\ldots},x_k \right)$ then follows, since its bias and variance tend to zero as the sample size increases.
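
Drawing from the continuous estimator can be sketched for $k = 2$ as a smoothed Bootstrap step: an index pair is selected with probability $A_{i_1,i_2}$ and independent noise with density $L$, scaled by $h$, is added to each coordinate. The Python sketch below is illustrative only; it assumes that $L$ is the Epanechnikov density on $[-1, 1]$, sampled through its closed-form quantile function, and the function names are hypothetical.

```python
import numpy as np

def epanechnikov_draws(size, rng):
    """Inverse-transform draws from L(y) = 0.75 * (1 - y^2) on [-1, 1].

    The quantile function solves 0.5 + 0.75*y - 0.25*y^3 = u, which gives
    y = 2 * sin(arcsin(2u - 1) / 3).
    """
    u = rng.uniform(size=size)
    return 2.0 * np.sin(np.arcsin(2.0 * u - 1.0) / 3.0)

def smooth_bootstrap_pairs(x_obs, A, h, size, rng=None):
    """Draw pairs (X(t_i1) + h*e1, X(t_i2) + h*e2), with (i1, i2) chosen with
    probability A[i1, i2] and e1, e2 independent with density L."""
    rng = np.random.default_rng(rng)
    n = x_obs.size
    flat = rng.choice(n * n, size=size, p=A.ravel())
    i1, i2 = np.divmod(flat, n)               # row-major: flat = i1 * n + i2
    base = np.column_stack([x_obs[i1], x_obs[i2]])
    return base + h * epanechnikov_draws(base.shape, rng)
```

Under these assumptions, the joint distribution function of each drawn pair is the weighted sum of products ${\cal L}((x_1 - X({\rm t}_{i_1}))/h) \, {\cal L}((x_2 - X({\rm t}_{i_2}))/h)$, so the replicates take values in a continuum rather than being restricted to the observed data.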

About this article

Cite this article

García-Soidán, P., Menezes, R. & Rubiños, Ó. Bootstrap approaches for spatial data. Stoch Environ Res Risk Assess 28, 1207–1219 (2014). https://doi.org/10.1007/s00477-013-0808-9

Published: 11 October 2013
Issue Date: July 2014
DOI: https://doi.org/10.1007/s00477-013-0808-9
