This is not a peer-reviewed article.
Pp. 168-176 in Total Maximum Daily Load (TMDL) Environmental Regulations–II
Proceedings of the 8-12 November 2003 Conference (Albuquerque, New Mexico USA),
Publication Date 8 November 2003.
ASAE Publication Number 701P1503, ed. A. Saleh.
Stochastic Validation of SWAT Model
I. Chaubey1, T.A. Costello, K.L. White, and A.S. Cotter
Abstract. The Soil and Water Assessment Tool (SWAT) model was developed to be used in
ungaged watersheds where no measured watershed response data are available for model
calibration. The uncertainty associated with model parameters often limits the usability of
SWAT in ungaged watersheds. This uncertainty was studied using the SWAT model to simulate
runoff from a small watershed in Northwest Arkansas (Moores Creek, Washington County). The
SWAT model was validated using a stochastic procedure which transformed parameter
uncertainty into model output uncertainty using probability density functions. A sensitivity
analysis was performed to identify the most sensitive model parameters for flow in an
agricultural watershed. A Monte Carlo simulation was performed using curve number (CN),
the most sensitive model parameter, to quantify uncertainty in model output. The results
indicated that model output uncertainty did indeed depend upon input parameter uncertainty. A
decision regarding model acceptability can be made by placing confidence intervals on model
output and using measured watershed response data and predetermined performance criteria on
the model.
Keywords. SWAT, Stochastic validation, Monte Carlo Simulation, Output uncertainty
Introduction
The use of distributed parameter hydrologic/water quality (H/WQ) models in
predicting watershed response has increased considerably during the last decade. Even
though these models are developed to mimic natural processes, actual processes
occurring in the field are more complex and variable than what can currently be represented
in the most sophisticated models (Haan et al., 1995). Spatial heterogeneity of watershed
processes and of model parameters has long been identified as one of the principal
sources of uncertainty affecting model performance (Luis and McLaughlin, 1992). The
distributed parameter models address the problems associated with the spatial
heterogeneity of watershed processes by dividing the watershed into smaller
homogeneous areas and by defining model parameters for each of these areas. However,
Beven (1989) concluded that defining a consistent effective parameter value to reproduce
the response of a spatially-variable pattern of parameter values was not possible.
Problems associated with model structure, estimation of parameter values, and the
specifications of initial and boundary conditions may result in significant model
prediction uncertainty (Beven, 1989).
1
Respectively, Assistant Professor, Associate Professor, Graduate Student, and Former Graduate Student
Department of Biological and Agricultural Engineering; Director, University of Arkansas, Fayetteville, AR 72701.
Corresponding Author: I. Chaubey. Email: chaubey@uark.edu
Model calibration is most often used to estimate model parameter
values and to minimize model output uncertainty. In the model calibration
process, the model parameters are adjusted until the model predictions match measured
data within a predefined accuracy level. In the watersheds where no measured data are
available to calibrate H/WQ models, considerable uncertainty exists in model predictions.
Even when parameter estimates are available for a watershed of interest, they should be
treated as random variables since their values depend on observed data which themselves
are random variables (Haan, 1989). Since any function of a random variable is also a
random variable, the model outputs can be viewed as random variables having a
probabilistic structure. There is a need to describe such model outputs using a probability
density function, and confidence intervals (Haan, 1989). Such a description would help
validate models where there are no observed data on the watershed response being
modeled (Haan et al., 1995).
The objective of this study was to stochastically validate the SWAT model to
determine its applicability in making watershed runoff predictions in ungaged
watersheds.
Methods and Materials
The validation of the SWAT model to predict annual runoff response was
performed in the Moores Creek watershed located in Northwest Arkansas. This is a
small (1900 ha) watershed dominated by agriculture (55%) and forest (39%) land use. A
detailed description of the watershed characteristics is provided in Cotter et al. (2003).
Description of the SWAT Model
The SWAT model is a physically-based distributed-parameter watershed scale
model. It divides the study watershed into sub-basins, or smaller homogeneous areas
(Arnold et al., 1998; Neitsch et al., 2000). SWAT considers all hydrologic processes
within the sub-basins of the watershed. The EPA currently supports this model for
developing TMDLs in agricultural watersheds. SWAT consists of three major
components: (1) sub-basin, (2) reservoir routing, and (3) channel routing. The sub-basin
component consists of eight major divisions: hydrology, weather,
sedimentation, soil temperature, crop growth, nutrients, agricultural management, and
pesticides. Channel inputs include reach length, channel slope, channel depth, channel
top width, channel side slope, flood plain slope, channel roughness factor, and flood plain
roughness factor. GIS interfaces have been developed for the SWAT model to facilitate
the aggregation of input data for simulating watersheds. The ArcView interface
developed for the SWAT model was used to prepare input data files in this study. This
interface requires a land cover map, soils map, and DEM as spatial inputs. Currently, a
detailed description of the model can be found at the SWAT website:
http://www.brc.tamus.edu/swat/swat2000doc.html
Stochastic Validation Methodology
Stochastic validation evaluates the ability of a model to simulate watershed
response by transforming parameter uncertainty into prediction uncertainty using
probability density functions (PDFs). Model validation (or lack thereof) is then
determined by calculating predetermined confidence intervals (CI) of the model
predictions and comparing predicted output with the measured watershed response data.
In general, stochastic validation is performed as follows (Haan et al., 1995; Luis and
McLaughlin, 1992):
1. Perform a sensitivity analysis (SA) of model parameters
2. Generate input parameter PDF
3. Generate output PDF
4. Assess model performance
Detailed model sensitivity analyses (Cotter et al., 2003) showed that the runoff response of
the watershed was most affected by curve number (CN), so CN was used as the uncertain
parameter in this study. The output of interest was mean annual flow for 1997 and 1998. Values of
CN are based on physical variables such as land cover and soil type, and do not follow a
specific distribution. Therefore, the required input PDF could not be generated directly
from CN. However, the retention parameter, S, has been shown to be lognormally distributed
(Haan and Schulze, 1987). S is related to CN by:
$$S = \frac{1000}{CN} - 10 \qquad (1)$$
Mean and standard deviation (SD) for S were required to describe the lognormal
distribution. The mean value of S for each hydrologic response unit (HRU) was assumed
to be the default S (calculated from the original CN for that HRU). The SD of S was
assumed to be 0.5 of the mean, as suggested by Haan and Schulze (1987).
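The lognormal description above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the moment-matching formulas are the standard relations between the mean and SD of a lognormal variable and the mean and SD of its logarithm.

```python
import math


def cn_to_s(cn):
    """Retention parameter S from curve number CN (Equation 1)."""
    return 1000.0 / cn - 10.0


def s_to_cn(s):
    """Inverse of Equation 1: curve number CN from retention parameter S."""
    return 1000.0 / (s + 10.0)


def lognormal_params(mean_s, cv=0.5):
    """Return (mu, sigma) of ln(S) for a lognormal S with the given mean
    and SD = cv * mean. cv = 0.5 follows the assumption stated in the text."""
    sigma_sq = math.log(1.0 + cv ** 2)
    mu = math.log(mean_s) - 0.5 * sigma_sq
    return mu, math.sqrt(sigma_sq)


# Illustrative default CN for one HRU (hypothetical value, not from the paper)
cn_default = 75.0
s_mean = cn_to_s(cn_default)
mu, sigma = lognormal_params(s_mean)
print(f"S mean = {s_mean:.3f}, mu = {mu:.3f}, sigma = {sigma:.3f}")
```

Because S is always positive under the lognormal assumption, every CN obtained from `s_to_cn` stays between 0 and 100.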
SWAT is a distributed parameter model and, as such, assigns a unique CN to each HRU.
The parameter PDF was created by randomly generating 500 values of S for each HRU in
the watershed using a lognormal distribution with the mean and SD described above, then
calculating the corresponding CN using Equation 1. The model was run for each set of
CN, resulting in 500 unique flow predictions for each simulation, which formed the output PDF. The
output PDF was used to establish the flow prediction CI. The CI represents a 95% chance that the
bounds will include the actual value of annual flow, assuming the model and parameter
estimates are valid. The upper and lower bounds of the 95% CI were calculated by
the following equations (Haan, 1977):
$$CI_{UP} = \bar{X} + t_{1-\alpha/2,\,n-1}\, s_X \qquad (2)$$
$$CI_{LOW} = \bar{X} - t_{1-\alpha/2,\,n-1}\, s_X \qquad (3)$$
where $\bar{X}$ was the mean of the output PDF, $s_X$ was the SD of the output PDF, α was 0.05,
and $t_{1-\alpha/2,\,n-1}$ was 1.96 (assuming infinite degrees of freedom) (Haan, 1977, Appendix E).
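The sampling and interval steps described above can be sketched as follows. This is only an illustration under stated assumptions: the default CN values are hypothetical, and `toy_annual_flow` is a stand-in that applies the SCS curve number runoff equation to a single rainfall depth. It is not SWAT and only marks the place where a SWAT run would occur.

```python
import numpy as np


def sample_cn_sets(cn_default, n_runs=500, cv=0.5, seed=42):
    """Return an (n_runs, n_hru) array of CN values.

    S for each HRU is drawn from a lognormal distribution whose mean is the
    default S and whose SD is cv * mean, then mapped back through Equation 1.
    """
    rng = np.random.default_rng(seed)
    s_mean = 1000.0 / np.asarray(cn_default, dtype=float) - 10.0
    sigma = np.sqrt(np.log(1.0 + cv ** 2))
    mu = np.log(s_mean) - 0.5 * sigma ** 2
    s = rng.lognormal(mean=mu, sigma=sigma, size=(n_runs, s_mean.size))
    return 1000.0 / (s + 10.0)


def toy_annual_flow(cn_row, rain_in=4.0):
    """Hypothetical stand-in for one model run (NOT SWAT): mean SCS curve
    number runoff depth (inches) over all HRUs for one rainfall depth."""
    s = 1000.0 / cn_row - 10.0
    excess = np.maximum(rain_in - 0.2 * s, 0.0)
    return float(np.mean(excess ** 2 / (rain_in + 0.8 * s)))


def confidence_bounds(flows, t_value=1.96):
    """Lower and upper bounds from Equations 2 and 3."""
    flows = np.asarray(flows, dtype=float)
    mean = flows.mean()
    sd = flows.std(ddof=1)  # sample SD of the output distribution
    return mean - t_value * sd, mean + t_value * sd


cn_sets = sample_cn_sets([60.0, 75.0, 85.0])  # hypothetical default CN per HRU
flows = np.array([toy_annual_flow(row) for row in cn_sets])
ci_low, ci_up = confidence_bounds(flows)
print(f"95% CI: {ci_low:.3f} to {ci_up:.3f}")
```

Note that Equations 2 and 3 use the SD of the simulated outputs directly rather than a standard error, so the bounds describe the spread of individual predictions and do not narrow as more runs are added.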
Finally, model flow predictions were compared to measured stream flow by plotting
measured data on the PDF of the model response and comparing them to the CI. The model
was judged to perform satisfactorily from a statistical point of view if measured data fell
within the 95% confidence interval of the output PDF. If the measured data fell outside the
95% confidence interval, it indicated inadequacies in modeling and/or possible errors in
measured data. A very wide CI indicates that the model structure and uncertainties in
input parameters result in very uncertain model predictions for the desired application.
Under such conditions, the model may be unacceptable for a particular application, even
if it was validated stochastically (Haan et al., 1995).
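The decision rule above can be expressed as a small helper. This is a hedged sketch: the function name, the relative-width measure, and the `max_rel_width` threshold are assumptions introduced for illustration, since the paper does not give a numeric criterion for when a CI is too wide. The numbers in the usage line are made up.

```python
def assess_model(measured, ci_low, ci_up, max_rel_width=None):
    """Check a measured value against the CI of the model output.

    max_rel_width is a hypothetical, user-chosen limit on CI width relative
    to the CI midpoint; None means no width criterion is applied.
    """
    inside = ci_low <= measured <= ci_up
    rel_width = (ci_up - ci_low) / (0.5 * (ci_up + ci_low))
    too_wide = max_rel_width is not None and rel_width > max_rel_width
    return {
        "inside_ci": inside,
        "relative_width": rel_width,
        "acceptable": inside and not too_wide,
    }


# Made-up numbers, only to show the call
result = assess_model(measured=5.0, ci_low=4.0, ci_up=6.0, max_rel_width=0.5)
print(result)
```

Keeping the two checks separate reflects the point made in the text: a measured value inside a very wide CI passes the statistical test but may still leave the model unsuitable for a given application.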
Results and Discussion
Values of flow parameters, excluding CN, were adjusted to represent three different
validation conditions (Table 1), hereafter referred to as validation A, B, or C. The SWAT
model was run for each set of CN to generate 500 unique annual flow predictions.
Figure 1 shows the cumulative distribution functions (CDFs) for 1997 and 1998 annual flow
predictions for validation A. The CDF was numerically differentiated, resulting in the
PDF of 1997 and 1998 annual flow predictions, as shown in Figure 2. The 95% CI is
represented by the dashed vertical lines; the measured flow is represented by a solid
vertical line. Results showed that the measured annual flow for 1998 was encompassed
within the 95% CI, indicating that the model is useful for predicting annual flow in
uncalibrated conditions or in ungaged watersheds.
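The step from CDF to PDF described above can be sketched as follows. The paper does not say how the numerical differentiation was done, so the approach here is an assumption: the empirical CDF is evaluated at equally spaced flow values and differenced. The flow values are synthetic and the bin count is arbitrary.

```python
import numpy as np


def empirical_cdf(samples):
    """Sorted sample values and their cumulative probabilities."""
    x = np.sort(np.asarray(samples, dtype=float))
    p = np.arange(1, x.size + 1) / x.size
    return x, p


def pdf_from_cdf(samples, n_bins=20):
    """Finite-difference the empirical CDF on an equally spaced grid.

    Returns bin centers, probability per bin, and density (probability
    divided by bin width).
    """
    x, _ = empirical_cdf(samples)
    edges = np.linspace(x[0], x[-1], n_bins + 1)
    cdf = np.searchsorted(x, edges, side="right") / x.size
    cdf[0] = 0.0  # start from zero so the smallest sample is counted
    prob = np.diff(cdf)
    density = prob / np.diff(edges)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, prob, density


# Synthetic stand-in for 500 annual flow predictions (m3), not study data
rng = np.random.default_rng(0)
flows = rng.normal(loc=5.0e6, scale=2.0e5, size=500)
centers, prob, density = pdf_from_cdf(flows)
```

The per-bin probabilities sum to one, which matches the way the figures label their vertical axis as a probability rather than a density.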
However, this was not the case for 1997. The model overestimated flow
for this year; even the minimum of the 500 flow predictions was 35% greater
than the measured flow. Since the measured 1997 flow falls outside the CI, it must be
concluded that either the model is not useful for estimating annual flow at 95%
confidence, or the estimated values of CN are invalid. It is well known that some uncertainty
is associated with CN estimation, especially concerning antecedent moisture conditions
(AMC) of the soil. There are three moisture conditions, known as AMC I (driest), AMC
II (medium), and AMC III (wettest); the corresponding CN are designated CN I, CN II, and
CN III. Flow was overpredicted for 1997, suggesting that simulated conditions (e.g., AMC
II) were too wet and that AMC I might be more appropriate for estimating CN.
CN II was converted to CN I and a new set of random CN was generated, resulting
in validation B (Table 1). Figure 3 shows the resulting PDF for 1997 and 1998 annual
flow predictions for 500 model runs. Under these conditions, predicted flow was reduced;
however, neither the 1997 nor the 1998 measured flow fell within the 95% CI.
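The paper does not state which relation was used to convert CN II to CN I. As an illustration only, the sketch below uses the widely cited textbook conversions between antecedent moisture conditions; SWAT applies its own internal formulation, so values from a SWAT run may differ. The function names and the example CN are hypothetical.

```python
def cn2_to_cn1(cn2):
    """Dry-condition curve number (AMC I) from the AMC II value,
    using a common textbook approximation (assumed, not from the paper)."""
    return 4.2 * cn2 / (10.0 - 0.058 * cn2)


def cn2_to_cn3(cn2):
    """Wet-condition curve number (AMC III) from the AMC II value,
    using a common textbook approximation (assumed, not from the paper)."""
    return 23.0 * cn2 / (10.0 + 0.13 * cn2)


cn2 = 75.0  # hypothetical AMC II curve number
print(f"CN I = {cn2_to_cn1(cn2):.1f}, CN II = {cn2:.1f}, CN III = {cn2_to_cn3(cn2):.1f}")
```

A lower CN gives a larger retention parameter S through Equation 1 and therefore less runoff, which is consistent with the reduced flow predictions reported for validation B.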
To more closely simulate the environmental conditions in 1997, the model
parameters were further refined to validation C (Table 1). CANMX and ESCO were not
changed and CN remained at AMC I. Additional parameter changes were: GW_REVAP
= 0.04, ALPHABF = 0.02 and GWQMN = 50. Figure 4 shows the PDF of flow
predictions for validation C. Under these conditions, the measured 1997 flow was within the 95% CI,
indicating that the parameter estimation is valid and the model is useful for predicting annual
flow at 95% confidence. The measured flow for 1998, however, was outside the 95% CI. These
results suggest that environmental conditions (i.e., AMC, temperature, rainfall) in 1997 and
1998 were very different. SWAT was unable to accurately simulate conditions of both
years with the given input data.
Table 1. Parameter values for three calibrations of stochastic validation.

                                Parameter Values
Calibration   AMC   CANMX   ESCO   GW_REVAP   ALPHABF     GWQMN
A             II    50      0.3    0.2        No change   No change
B             I     50      0.3    0.2        No change   No change
C             I     50      0.3    0.04       0.02        50
[Figure 1 plot: curves for 1997 and 1998; x-axis: Flow (m3); y-axis: Cumulative Probability]
Figure 1. Cumulative distribution functions of predicted annual flow based on validation A.
[Figure 2 plot: two panels, 1997 and 1998; x-axis: Flow (m3); y-axis: Probability]
Figure 2. PDF of predicted annual flow based on calibration A. Solid vertical line
represents measured value; dashed vertical lines represent 95% CI.
[Figure 3 plot: two panels, 1997 and 1998; x-axis: Flow (m3); y-axis: Probability]
Figure 3. PDF of predicted annual flow based on calibration B. Solid vertical line
represents measured value; dashed vertical lines represent 95% CI.
[Figure 4 plot: two panels, 1997 and 1998; x-axis: Flow (m3); y-axis: Probability]
Figure 4. PDF of predicted annual flow based on calibration C. Solid vertical line
represents measured value; dashed vertical lines represent 95% CI.
Summary and Conclusions
Stochastic validation of the SWAT model was performed for 1997 and 1998 annual flow. The CN
was found to be the most sensitive model parameter affecting output uncertainty. The CN
was indirectly used to generate a parameter probability density function (PDF). Then,
based on the parameter PDF, an output PDF was generated. The output was assessed using a
95% confidence interval (CI). The measured 1998 flow was within the 95% CI of the
predicted values. However, the CN had to be adjusted for drier conditions to accurately
predict flow for 1997. Flow for 1998 was not within the 95% CI when CN was adjusted for drier
environmental conditions, indicating that SWAT predictions were sensitive to antecedent
moisture conditions.
Literature Cited
Arnold, J. G., R. Srinivasan, R. S. Muttiah, and J. R. Williams. 1998. Large area hydrologic
modeling and assessment, Part I: Model development. J. Amer. Water Resour. Assoc.
34(1): 1-17.
Beven, K. 1989. Changing ideas in hydrology: the case of physically based models. J.
Hydrology 105:157-172.
Cotter, A.S., I. Chaubey, T.A. Costello, T.S. Soerens, and M.A. Nelson. 2003. Water quality
model output uncertainty as affected by spatial resolution of input data. J. Amer. Water
Resour. Assoc. 39(4): 977-986.
Haan, C.T., B. Allred, D.E. Storm, G.J. Sabbagh, and S. Prabhu. 1995. Statistical procedure for
evaluating hydrologic/water quality models. Trans. ASAE 38:725-773.
Haan, C.T. 1989. Parametric uncertainty in hydrologic modeling. Trans. ASAE 32:132-146.
Haan, C.T. 1977. Statistical methods in hydrology. Iowa State University Press. Ames, Iowa.
Haan, C.T., and R.E. Schulze. 1987. Return periods flow prediction with uncertain
parameters. Trans. ASAE 30(3): 665-669.
Luis, S.J., and D. McLaughlin. 1992. A stochastic approach to model validation. Advances in
Water Resources 15:15-32.
Neitsch, S. L., J. G. Arnold, J. R. Kiniry, and J. R. Williams. 2000. Soil and Water Assessment Tool
user’s manual. Blackland Research Center. Temple, TX.