An Introduction to Kriging Using SAS
19th New England Statistics Symposium April 23rd, 2005 Prof. Roger Bilisoly Department Mathematical Sciences Central Connecticut State University
1. Example of Spatial Data
Magnetometer readings to detect metal Goal here is to find unexploded ordnance (UXO) In general, need spatial data that is spatially correlated
2. In This Talk We Consider Mining Data
Black = lowest value White = highest value (Think of burning embers) The goal is to find high concentrations of ore, which are then mined.
Some Background on Spatial Modeling
Spatial model: Z(s), s = spatial location, and Z(s) is a random variable for each s. If Z(s) has not been measured at a location s, then its value is uncertain. We model this uncertainty by a probability distribution.
. Z(s1) . Z(s2) x Z(s) . Z(s3)
Kriging = Spatial Interpolation
Kriging is BLUE: B for Best meaning minimum variance L for Linear U for Unbiased E for Estimate . Z(s1) . Z(s2) x Z(s) Form of . Z(s3) Kriging Estimate:
3
Z ( s)
i 1
Z ( si )
Spatial Autocorrelation
Points close together tend to be similar which allows spatial prediction.
All points uncorrelated here: No hope for spatial prediction.
Variogram = Sill - Covariance
Goal: We want to know the correlation between Z(s) and Z(s+h) as a function of the distance |h|.
Sill Range
.
. . . . Find pairs of data values separated by the same distance .
Empirical Variogram
Assume that the variogram is a function of s1 s2 = h and E(Z(s)) is constant. (s1 s2) = (h) = Var(Z(s1) Z(s2))
1 # pairs si
. .
( Z (si ) Z ( s j ))2
sj h
.
. .
Valid Models
Any spatial prediction must have non-negative variance:
N N i i 1 N i i 1 j 1 j
Var (
Z ( si ))
Cov ( si , s j ) 0
This restricts our choice of variogram models (h)
Example of Kriging with SAS 7. Goal: Strike it Rich!
Walker data set from Isaaks and Srivastava(1989) Sampling design: two stage adaptive sampling
Measured: Ore concentration
Goal: Find high concentrations of ore so it can be profitably mined.
8. Map of Ore Concentration
Produced by PROC G3GRID PROC GCONTOUR Using PATTERN
SAS Code for Map
axis1 order=(0 to 250 by 25) width=3 minor=(n=4) label=('Easting'); axis2 order=(0 to 300 by 25) width=3 minor=(n=4) label=('Northing'); symbol1 interpol=none line=1 value=plus color=black;
proc g3grid data=walker out=walker2; grid y*x=v / naxis1=25 naxis2=30; run;
pattern1 pattern2 pattern3 pattern31 value=solid color=CX000000; value=solid color=CX180000; value=solid color=CX300000; value=solid color=CXFFFFFF;
proc gcontour data=walker2; title 'Smoothed Data Plot'; plot y*x=v/haxis=axis1 vaxis=axis2 levels=0 to 1500 by 50 pattern nolegend; run;
3. Read in the Data
data walker; infile "M:\My Work\Spatial Stat Talks\walker data set.txt"; drop title nvar; length title $ 50; if _n_ = 1 then do; input title $ &; put title; input nvar; input #(nvar+2); end; else do; input id x y v u t; output; x and y = spatial coordinates end; v = variable analyzed here run;
4. Variogram Estimation
proc variogram data=walker outvar=outv; compute lagdistance=5 maxlag=20; coordinates xc=x yc=y; var v; run;
Need spatial spacing info.
This creates the empirical variogram, which quantifies spatial autocorrelation.
4. PROC NLIN to Fit a Parametric Variogram
proc nlin data=outv; parms c0=20000 c1=80000 a1=80; model variog = (c0 + c1*(1.5*(distance/a1) 0.5*(distance/a1)**3))*(distance < a1) + (c0 + c1)*(distance >= a1); der.c0 = 1; der.c1 = 1.5*(distance/a1) 0.5*(distance/a1)**3; der.a1 = c1*(-1.5*(distance/a1**2) + 1.5*(distance**3/a1**4)); output out=fitted predicted=pvari parms=c0 c1 a1; Need to give run; derivative
information.
Need to specify a parametric model
4. PROC KRIGE2D Does the Kriging
proc krige2d data=walker outest=est; predict var=v radius=200; model form=spherical range=&a1 scale=&c1 This is a specific nugget=&c0; parametric shape. coord xc=x yc=y; grid x=0 to 250 by 2 y=0 to 300 by 2; run;
This specifies locations to make spatial predictions.
Isotropic Variogram
Model for h < a1: c0 + c1*(1.5*h/a1 - 0.5*(h/a1)3)
Best fit by PROC NLIN: c0 = 29474 c1 = 64313 a1 = 43.39
6. Kriging vs. Map
Note that the kriging predictions are smoother than the data.
7. Why Kriging?
Both mapping and kriging interpolate Mapping is quicker BUT Kriging comes with an error estimate
6. Kriging Estimate & Error Estimate
5. Conclusions
1. Kriging is based on modeling, which makes it harder to do, but it also comes with an error estimate 2. SAS has kriging built into SAS/STAT via PROC NLIN, PROC VARIOGRAM and PROC KRIGE2D 3. SAS/GRAPH can be used for mapping 4. Note that hypothesis testing is not prominent for these PROCs.