Abstract
Generating replicates of the available data enables researchers to solve different statistical problems, such as the estimation of standard errors, inference on parameters, or even the approximation of distribution functions. With this aim, Bootstrap approaches are proposed in the current work that are specifically designed for spatial data, as they take into account the dependence structure of the underlying random process. The key idea is to construct nonparametric distribution estimators, adapted to the spatial setting, which are themselves distribution functions associated with discrete or continuous random variables. The Bootstrap samples are then obtained by drawing at random from the estimated distribution. Consistency of the proposed approaches is proved by assuming stationarity of the random process or by relaxing the latter hypothesis to admit a deterministic trend. Numerical studies with simulated data and with a real data set, obtained from environmental monitoring, are included to illustrate the application of the proposed Bootstrap methods.
References
Bowman A, Hall P, Prvan T (1998) Bandwidth selection for the smoothing of distribution functions. Biometrika 85(4):799–808. doi:10.1093/biomet/85.4.799
Crujeiras RM, Fernández-Casal R, González-Manteiga W (2010) Goodness-of-fit tests for the spatial spectral density. Stoch Environ Res Risk Assess 24(1):67–79. doi:10.1007/s00477-008-0300-0
De Angelis D, Young GA (1992) Smoothing the bootstrap. Int Stat Rev 60(1):45–56
Efron B (1979) Bootstrap methods: another look at the Jackknife. Ann Stat 7(1):1–26. doi:10.1214/aos/1176344552
García-Soidán P (2007) Asymptotic normality of the Nadaraya–Watson semivariogram estimators. TEST 16(3):479–503. doi:10.1007/s11749-006-0016-8
Goovaerts P (1997) Geostatistics for natural resources evaluation. Oxford University Press, New York
Govaerts B, Beck B, Lecoutre E, Le Bailly C, VandenEeckaut P (2005) From monitoring data to regional distributions: a practical methodology applied to water risk assessment. Environmetrics 16(2):109–127. doi:10.1002/env.665
Hall P (1985) Resampling a coverage pattern. Stoch Process Appl 20(2):231–246. doi:10.1016/0304-4149(85)90212-1
Hall P (1992) The bootstrap and Edgeworth expansion. Springer, New York
Hall P, Maiti T (2006) On parametric bootstrap methods for small area prediction. J R Stat Soc B 68(2):221–238. doi:10.1111/j.1467-9868.2006.00541.x
Hall P, Patil P (1994) Properties of nonparametric estimators of autocovariance for stationary random fields. Probab Theory Relat Fields 99(3):399–424. doi:10.1007/BF01199899
Hyun-Han K, Young-Il M (2006) Improvement of overtopping risk evaluations using probabilistic concepts for existing dams. Stoch Environ Res Risk Assess 20(4):223–237. doi:10.1007/s00477-005-0017-2
Iranpanah N, Mansourian A, Tashayo B, Haghighi F (2011) Spatial semi-parametric bootstrap method for analysis of kriging predictor of random field. Procedia Environ Sci 3:81–86. doi:10.1016/j.proenv.2011.02.015
Lejeune M, Sarda P (1992) Smooth estimators of distribution and density functions. Comput Stat Data Anal 14:457–471. doi:10.1016/0167-9473(92)90061-J
Li B, Genton M, Sherman M (2007) A nonparametric assessment of properties of space–time covariance functions. J Am Stat Assoc 102(478):736–744. doi:10.1198/016214507000000202
Loh JM (2008) A valid and fast spatial Bootstrap for correlation functions. Astrophys J 681(1):726–734. doi:10.1086/588631
Maglione DS, Diblasi AM (2004) Exploring a valid model for the variogram of an isotropic spatial process. Stoch Environ Res Risk Assess 18(6):366–376. doi:10.1007/s00477-003-0143-7
Martins A, Figueira R, Sousa A, Sérgio C (2012) Spatio-temporal patterns of Cu contamination in mosses using geostatistical estimation. Environ Pollut 170:276–284. doi:10.1016/j.envpol.2012.07.004
Menezes R, García-Soidán P, Ferreira C (2010) Nonparametric spatial prediction under stochastic sampling design. J Nonparametr Stat 22(3):363–377. doi:10.1080/10485250903094294
Olea RA, Pardo-Igúzquiza E (2011) Generalized Bootstrap method for assessment of uncertainty in semivariogram inference. Math Geosci 43(2):203–228. doi:10.1007/s11004-010-9269-6
Politis DN, Romano JP, Wolf M (1999) Subsampling. Springer, Berlin
Shapiro A, Botha JD (1991) Variogram fitting with a general class of conditionally nonnegative definite functions. Comput Stat Data Anal 11(1):87–96. doi:10.1016/0167-9473(91)90055-7
Silverman BW, Young GA (1987) The bootstrap: to smooth or not to smooth? Biometrika 74(3):469–479. doi:10.1093/biomet/74.3.469
Acknowledgments
The authors would like to thank the Reviewers for their helpful suggestions and comments. The authors are also grateful to Dr. K. J. Duncan-Barlow (University of Vigo) for her contribution to the language revision. The first and third authors acknowledge financial support from the Project TEC2011-28683-C02-02 of the Spanish Ministry of Science and Innovation and the Project CN2012/279 from the European Regional Development Fund and the Galician Regional Government (Xunta de Galicia). The second author’s work has been supported by the Project PTDC/MAT/112338/2009 (FEDER support included) of the Portuguese Ministry of Science, Technology and Higher Education.
Appendices
Appendix 1: Consistency of \(\hat{F}_{{\rm s}_1,{\ldots}, {\rm s}_k}^{(2)} \left( x_1,{\ldots},x_k \right)\)
To check that consistency follows for \(\hat{F}_{{\rm s}_1,{\ldots}, {\rm s}_k}^{(2)} \left( x_1,{\ldots},x_k \right), \) the hypotheses described below will be assumed:
(i) \(\{ Z ( {\rm s} ) \in {I\!R} : {\rm s} \in D \subset {I\!R}^d \}\) can be modeled as given in (1).
(ii) \(D = \lambda D_0\), for some \(\lambda=\lambda (n) \mathop{\longrightarrow}\limits^{n \rightarrow + \infty} +\infty\) and bounded \(D_0 \subset {I\!R}^d. \)
(iii) \({\rm t}_i = \lambda {\rm u}_i\), for \(1 \le i \le n\), where \({\rm u}_1, {\ldots}, {\rm u}_n\) denotes a realization of a random sample of size \(n\) drawn from a density function \(g_0\) defined on \(D_0\).
(iv) \(Z(\cdot)\) is α-mixing, with \(\alpha(r) = O(r^{-a})\), for \(r > 0\) and some constant \(a > 0\).
(v) \(K\) is a \(d\)-variate, symmetric density function with compact support.
(vi) \(\{ h_{1}^{2}+ {\cdots} + h_{k-1}^{2}+ \lambda^{-1} + n^{-k} \lambda^{d(k-1)} h_{1}^{-d} {\ldots} h_{k-1}^{-d} \} \mathop{\longrightarrow}\limits^{n \rightarrow + \infty} 0. \)
(vii) \(F_{{\rm s}_1,{\ldots}, {\rm s}_k} (x_1,{\ldots},x_k)\) is three-times continuously differentiable as a function of \(({\rm s}_1,{\ldots}, {\rm s}_k). \)
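The sampling design in (ii) and (iii) and the rate condition in (vi) can be illustrated numerically. The following is a minimal sketch under assumptions that are not part of the hypotheses: \(D_0\) is taken as the unit cube \([0,1]^d\), \(g_0\) as the uniform density on it, and the rates \(\lambda = n^{1/4}\), \(h_1 = n^{-1/4}\) (with \(d = 2\), \(k = 2\)) are chosen only as one example for which the expression in (vi) decreases. The function names `increasing_domain_sites` and `rate_vi` are hypothetical.

```python
import numpy as np


def increasing_domain_sites(n, lam, d=2, rng=None):
    """Sites t_i = lam * u_i, with u_i drawn uniformly on D_0 = [0, 1]^d.

    The uniform choice of g_0 and the unit cube D_0 are assumptions made
    for illustration; (iii) allows any density g_0 on a bounded D_0.
    """
    rng = np.random.default_rng() if rng is None else rng
    u = rng.uniform(size=(n, d))
    return lam * u


def rate_vi(n, lam, h, d=2):
    """Evaluate the expression in (vi) for bandwidths h = (h_1, ..., h_{k-1})."""
    h = np.atleast_1d(np.asarray(h, dtype=float))
    k = h.size + 1
    bias_part = np.sum(h ** 2)
    domain_part = 1.0 / lam
    variance_part = float(n) ** (-k) * lam ** (d * (k - 1)) * np.prod(h ** (-d))
    return float(bias_part + domain_part + variance_part)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sites = increasing_domain_sites(100, 100 ** 0.25, d=2, rng=rng)
    print("first site:", sites[0])
    for n in (100, 10_000, 1_000_000):
        # Assumed example rates: lambda = n^(1/4), h_1 = n^(-1/4).
        print(n, rate_vi(n, n ** 0.25, [n ** -0.25], d=2))
```

Under these assumed rates the three summands in (vi) behave as \(n^{-1/2}\), \(n^{-1/4}\) and \(n^{-1}\), so the printed values shrink as \(n\) grows; other rate choices would need to be checked in the same way.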
We will prove that the bias and the variance of \(\hat{F}_{{\rm s}_1,{\ldots}, {\rm s}_k}^{(2)} \left( x_1,{\ldots},x_k \right)\) are of the respective orders \(( h_{1}^{2}+ {\cdots} + h_{k-1}^{2})\) and \((n^{-k} \lambda^{d(k-1)} h_{1}^{-d} {\ldots} h_{k-1}^{-d} + \lambda^{-d})\) and, therefore, tend to zero as the sample size \(n\) increases, which yields the consistency of the distribution estimator. To this end, conditions (i)–(vii) will be applied, following a procedure similar to that in the proof of Theorem 3.1 in Hall and Patil (1994).
Write \(A_{i_1,i_2}=\frac{ p_{{\rm s}_1,{\rm s}_2}^{{\rm t}_{i_1},{\rm t}_{i_2},h_1} }{\sum_{i_1=1}^{n} \sum_{i_2=1}^{n} p_{{\rm s}_1,{\rm s}_2}^{{\rm t}_{i_1},{\rm t}_{i_2},h_1}}\) and \(A_{i_{j-1},i_j}=\frac{ p_{{\rm s}_{j-1},{\rm s}_j}^{{\rm t}_{i_{j-1}},{\rm t}_{i_j},h_{j-1}} }{\sum_{i_j=1}^{n} p_{{\rm s}_{j-1},{\rm s}_j}^{{\rm t}_{i_{j-1}},{\rm t}_{i_j},h_{j-1}}}\) for \(j=3,{\ldots},k. \) Firstly, we can take into account that, for large n:
on account of (3).
Now, the last conditional expectation will be approximated. With this aim, bear in mind that:
From the previous relations, it follows that:
and, therefore:
We can iterate the strategy above, based on applying an appropriate conditional expectation and developing the resulting term, to achieve that:
With regard to the variance, one has for large n that:
where:
By using similar arguments as above, we could check that:
Consequently:
We could derive the dominant terms of the bias and the variance of the distribution estimator as well as asymptotically minimize the mean squared error (MSE) of the distribution estimator, namely:
to obtain the optimal bandwidths \(h_j\), for \(j=1,{\ldots},k-1, \) which would be dependent on unknown terms, such as the multivariate distribution function itself and its second-order derivatives.
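As orientation only, the standard decomposition of the MSE into squared bias plus variance can be combined with the orders stated at the beginning of this appendix. The display below is a sketch derived from those orders, not a reproduction of the authors' expression, and it leaves the constants of the dominant terms unspecified:

```latex
\[
\mathrm{MSE}\left[ \hat{F}_{{\rm s}_1,\ldots,{\rm s}_k}^{(2)}(x_1,\ldots,x_k) \right]
= \mathrm{Bias}^2 + \mathrm{Var}
= O\left( \left( h_1^2 + \cdots + h_{k-1}^2 \right)^2
  + n^{-k} \lambda^{d(k-1)} h_1^{-d} \cdots h_{k-1}^{-d}
  + \lambda^{-d} \right).
\]
```

The first term grows with the bandwidths while the second shrinks, which is the usual trade-off that a bandwidth minimizing the MSE has to balance.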
Appendix 2: Consistency of \(\tilde{F}_{{\rm s}_1,{\ldots}, {\rm s}_k} \left( x_1,{\ldots},x_k \right)\)
To derive this proof, we will assume conditions (i)–(v), together with:
(vi′) \(\{ h_{1}^{2}+ {\cdots} + h_{k-1}^{2}+ h^2+\lambda^{-1} + n^{-k} \lambda^{d(k-1)} h_{1}^{-d} {\ldots} h_{k-1}^{-d} \} \mathop{\longrightarrow}\limits^{n \rightarrow + \infty} 0. \)
(vii′) \(F_{{\rm s}_1,{\ldots}, {\rm s}_k} (x_1,{\ldots},x_k)\) is three-times continuously differentiable as a function of \(({\rm s}_1,{\ldots}, {\rm s}_k)\) and as a function of \((x_1,{\ldots},x_k). \)
(viii) \(L\) is a univariate and symmetric density function with compact support.
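To make the role of the kernel \(L\) and the bandwidth \(h\) more concrete, the sketch below draws from a generic kernel-smoothed distribution in one dimension. It is not the multivariate estimator analyzed in this appendix: the weights are arbitrary placeholders, the Epanechnikov kernel is assumed as one density that satisfies (viii), and the names `sample_epanechnikov` and `smooth_bootstrap_draw` are hypothetical. It relies on the standard fact that a draw from a weighted kernel mixture is obtained by selecting a data value with its weight and adding \(h\) times a draw from \(L\).

```python
import numpy as np


def sample_epanechnikov(size, rng):
    """Draw from L(x) = 0.75 * (1 - x^2) on [-1, 1] by rejection sampling."""
    out = np.empty(0)
    while out.size < size:
        x = rng.uniform(-1.0, 1.0, size=2 * size)
        # Accept a proposal x with probability 1 - x^2 (proportional to L).
        keep = rng.uniform(size=x.size) < 1.0 - x ** 2
        out = np.concatenate([out, x[keep]])
    return out[:size]


def smooth_bootstrap_draw(values, weights, h, size, rng):
    """Draw from the mixture sum_i w_i * L((x - z_i) / h) / h.

    `weights` are hypothetical nonnegative weights (normalized here);
    h = 0 reduces to drawing from the discrete weighted distribution.
    """
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    probs = weights / weights.sum()
    idx = rng.choice(values.size, size=size, p=probs)
    return values[idx] + h * sample_epanechnikov(size, rng)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z = np.array([1.0, 2.0, 4.0])
    w = np.array([1.0, 1.0, 2.0])
    print(smooth_bootstrap_draw(z, w, h=0.5, size=5, rng=rng))
```

Because \(L\) has compact support, every smoothed draw stays within \(h\) of an observed value, which is one practical consequence of condition (viii).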
For large n, the aforementioned hypotheses yield that:
where \(f_{{\rm t}_1,{\ldots}, {\rm t}_k}\) denotes the joint density function of \((Z({\rm t}_1),{\ldots},Z({\rm t}_k)). \)
We can integrate by parts and apply relation (3) to obtain that:
By arguments analogous to those used for the bias of \(\hat{F}_{{\rm s}_1,{\ldots}, {\rm s}_k}^{(2)}, \) it follows that:
Finally, the approximation of the variance of the continuous estimator will be addressed as given below:
with:
Now, we can combine the arguments used for the bias of the continuous estimator with those applied for the variance of the discrete estimator to check that both terms, \(W_1\) and \(W_2\), are asymptotically negligible, as established next:
Consistency then follows for \(\tilde{F}_{{\rm s}_1,{\ldots}, {\rm s}_k} \left( x_1,{\ldots},x_k \right), \) since its bias and variance tend to zero as the sample size increases.
García-Soidán, P., Menezes, R. & Rubiños, Ó. Bootstrap approaches for spatial data. Stoch Environ Res Risk Assess 28, 1207–1219 (2014). https://doi.org/10.1007/s00477-013-0808-9
DOI: https://doi.org/10.1007/s00477-013-0808-9