04/11/2006
Frequency Analysis
Reading: Applied Hydrology Chapter 12
Slides Prepared by Venkatesh Merwade
Hydrologic extremes
Extreme events
Floods
Droughts
Magnitude of extreme events is related to their
frequency of occurrence
[Figure: magnitude versus frequency of occurrence]
The objective of frequency analysis is to relate the
magnitude of events to their frequency of occurrence
through a probability distribution
It is assumed the events (data) are independent and
identically distributed
Return Period
Random variable: $X$
Threshold level: $x_T$
Extreme event occurs if: $X \ge x_T$
Recurrence interval: $\tau$ = time between occurrences of $X \ge x_T$
Return period: $T = E(\tau)$
Average recurrence interval between events equalling or
exceeding a threshold
If $p$ is the probability of occurrence of an extreme
event, then
$$E(\tau) = T = \frac{1}{p} \quad \text{or} \quad P(X \ge x_T) = \frac{1}{T}$$
More on return period
If p is probability of success, then (1-p) is the probability
of failure
Find probability that (X ≥ xT) at least once in N years.
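$$p = P(X \ge x_T)$$
$$P(X < x_T) = 1 - p$$
$$P(X \ge x_T \text{ at least once in } N \text{ years}) = 1 - P(X < x_T \text{ all } N \text{ years})$$
$$P(X \ge x_T \text{ at least once in } N \text{ years}) = 1 - (1 - p)^N = 1 - \left(1 - \frac{1}{T}\right)^N$$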
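The at-least-once probability can be checked numerically with a minimal sketch like the one below. The function name `prob_at_least_once` is illustrative and not from the slides; it assumes independent years with a constant annual exceedance probability p = 1/T.

```python
def prob_at_least_once(T: float, N: int) -> float:
    """Probability that X >= x_T occurs at least once in N years.

    Assumes independent years, each with exceedance probability p = 1/T.
    """
    p = 1.0 / T
    return 1.0 - (1.0 - p) ** N


if __name__ == "__main__":
    # A 100-year event over a 100-year horizon
    print(round(prob_at_least_once(100, 100), 3))
```

Note that the chance of seeing a T-year event within T years is well below 1, which is why the return period should be read as an average and not as a schedule.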
Return period example
Dataset – annual maximum discharge for 106 years on Colorado River near Austin
[Figure: annual maximum flow (10^3 cfs) versus year]
xT = 200,000 cfs
No. of occurrences = 3
2 recurrence intervals in 106 years
T = 106/2 = 53 years
If xT = 100,000 cfs
7 recurrence intervals
T = 106/7 = 15.2 yrs
P(X ≥ 100,000 cfs at least once in the next 5 years) = 1 - (1 - 1/15.2)^5 = 0.29
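The counting procedure in this example can be sketched as follows. The sketch follows the slide's convention (record length divided by the number of recurrence intervals, which is one fewer than the number of exceedances). The function name and the ten-year flow list are hypothetical; they are not the Colorado River record.

```python
def empirical_return_period(annual_max, threshold):
    """Record length divided by the number of recurrence intervals.

    The number of recurrence intervals is one fewer than the number of
    years in which the annual maximum equals or exceeds the threshold.
    """
    n_exceed = sum(1 for q in annual_max if q >= threshold)
    n_intervals = n_exceed - 1
    if n_intervals < 1:
        raise ValueError("need at least two exceedances to form an interval")
    return len(annual_max) / n_intervals


# Hypothetical 10-year annual maximum series in 10^3 cfs (illustration only)
flows = [45, 210, 80, 120, 60, 300, 95, 110, 70, 250]

if __name__ == "__main__":
    print(empirical_return_period(flows, 200))  # 3 exceedances, 2 intervals
    print(empirical_return_period(flows, 100))  # 5 exceedances, 4 intervals
```

Lowering the threshold increases the number of exceedances, so the estimated return period shortens, as in the 200,000 cfs versus 100,000 cfs comparison.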
Data series
[Figure: annual maximum flow (10^3 cfs) versus year]
Considering the annual maximum series, T for 200,000 cfs = 53 years.
The annual maximum flow for 1935 is 481,000 cfs. The annual maximum series probably
excludes other flows that are greater than 200,000 cfs and less than 481,000 cfs.
Will T change if we consider a monthly maximum series or a weekly maximum series?
Hydrologic data series
Complete duration series
All the data available
Partial duration series
Magnitude greater than base value
Annual exceedance series
Partial duration series with # of values = # years
Extreme value series
Includes largest or smallest values in equal intervals
Annual series: interval = 1 year
Annual maximum series: largest values
Annual minimum series: smallest values
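The differences between these series are easiest to see on a small record. The sketch below is illustrative only: the function names and the `peaks` list of (year, flow) pairs are hypothetical, and flows are in arbitrary units.

```python
from collections import defaultdict


def annual_maximum_series(peaks):
    """Largest value in each year."""
    by_year = defaultdict(list)
    for year, q in peaks:
        by_year[year].append(q)
    return {year: max(qs) for year, qs in sorted(by_year.items())}


def partial_duration_series(peaks, base):
    """All values with magnitude greater than the base value."""
    return [q for _, q in peaks if q > base]


def annual_exceedance_series(peaks):
    """The n largest values overall, where n is the number of years."""
    n_years = len({year for year, _ in peaks})
    return sorted((q for _, q in peaks), reverse=True)[:n_years]


# Hypothetical complete record of flood peaks: (year, flow)
peaks = [(2001, 90), (2001, 250), (2001, 220),
         (2002, 60), (2002, 40),
         (2003, 300), (2003, 130)]

if __name__ == "__main__":
    print(annual_maximum_series(peaks))
    print(partial_duration_series(peaks, base=100))
    print(annual_exceedance_series(peaks))
```

In this toy record the second-largest flow of 2001 (220) appears in the partial duration and annual exceedance series but is dropped from the annual maximum series, while the small 2002 maximum (60) is kept only by the annual maximum series.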
Probability distributions
Normal family
Normal, lognormal, lognormal-III
Generalized extreme value family
EV1 (Gumbel), GEV, and EVIII (Weibull)
Exponential/Pearson type family
Exponential, Pearson type III, Log-Pearson type III
Normal distribution
Central limit theorem – if X is the sum of n independent
and identically distributed random variables with finite variance,
then with increasing n the distribution of X becomes normal
regardless of the distribution of random variables
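pdf for normal distribution
$$f_X(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2\right]$$
$\mu$ is the mean and $\sigma$ is the standard deviation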
Hydrologic variables such as annual precipitation, annual average streamflow, or
annual average pollutant loadings tend to follow a normal distribution
Standard Normal distribution
A standard normal distribution is a normal
distribution with mean ($\mu$) = 0 and standard
deviation ($\sigma$) = 1
Normal distribution is transformed to standard
normal distribution by using the following
formula:
$$z = \frac{X - \mu}{\sigma}$$
z is called the standard normal variable
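As a small illustration of the transformation, the sketch below standardizes a value and looks up its exceedance probability with Python's standard-library `statistics.NormalDist`. The function names and the example numbers (mean 30, standard deviation 5) are hypothetical and not taken from the slides.

```python
from statistics import NormalDist


def standardize(x: float, mu: float, sigma: float) -> float:
    """Standard normal variable z = (x - mu) / sigma."""
    return (x - mu) / sigma


def normal_exceedance_prob(x: float, mu: float, sigma: float) -> float:
    """P(X >= x) for a normal variable, via the standard normal cdf."""
    z = standardize(x, mu, sigma)
    return 1.0 - NormalDist().cdf(z)


if __name__ == "__main__":
    # Hypothetical annual precipitation: mean 30 in, standard deviation 5 in
    print(standardize(40, 30, 5))
    print(normal_exceedance_prob(40, 30, 5))
```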
Lognormal distribution
If the pdf of X is skewed, it’s not
normally distributed
If the pdf of Y = log (X) is
normally distributed, then X is
said to be lognormally distributed.
$$f(x) = \frac{1}{x\,\sigma_y\sqrt{2\pi}} \exp\left[-\frac{(y - \mu_y)^2}{2\sigma_y^2}\right], \quad x > 0, \text{ and } y = \log x$$
Hydraulic conductivity and the distribution of raindrop sizes in a storm follow
a lognormal distribution.
Extreme value (EV) distributions
Extreme values – maximum or minimum values
of sets of data
Annual maximum discharge, annual minimum
discharge
When the number of selected extreme values is
large, the distribution converges to one of the
three forms of EV distributions called Type I, II
and III
EV type I distribution
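Let M1, M2, …, Mn be a set of daily rainfall or streamflow values,
and let X = max(Mi) be the maximum for the year. If the Mi
are independent and identically distributed, then for large
n, X has an extreme value type I or Gumbel distribution.
$$f(x) = \frac{1}{\alpha} \exp\left[-\frac{x - u}{\alpha} - \exp\left(-\frac{x - u}{\alpha}\right)\right]$$
$$\alpha = \frac{\sqrt{6}\, s_x}{\pi}, \quad u = \bar{x} - 0.5772\,\alpha$$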
Distribution of annual maximum streamflow follows an EV1 distribution
EV type III distribution
If Wi are the minimum streamflows in
different days of the year, let X =
min(Wi) be the smallest. X can be
described by the EV type III or
Weibull distribution.
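$$f(x) = \frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1} \exp\left[-\left(\frac{x}{\lambda}\right)^{k}\right], \quad x > 0; \ \lambda, k > 0$$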
Distribution of low flows (e.g., 7-day minimum flow)
follows an EV3 distribution.
Exponential distribution
Poisson process – a stochastic process
in which the number of events
occurring in two disjoint subintervals
are independent random variables.
In hydrology, the interarrival time
(time between stochastic hydrologic
events) is described by exponential
distribution
$$f(x) = \lambda e^{-\lambda x}, \quad x \ge 0; \quad \lambda = \frac{1}{\bar{x}}$$
Interarrival times of polluted runoffs, rainfall intensities, etc. are described by
the exponential distribution.
Gamma Distribution
The time taken for a number of events
(b) in a Poisson process is described
by the gamma distribution
Gamma distribution – a distribution
of sum of b independent and identical
exponentially distributed random
variables.
$$f(x) = \frac{\lambda^{b} x^{b-1} e^{-\lambda x}}{\Gamma(b)}, \quad x \ge 0; \quad \Gamma = \text{gamma function}$$
Skewed distributions (e.g., hydraulic conductivity)
can be represented using the gamma distribution without a log
transformation.
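The link between the gamma and exponential distributions can be checked directly from the density. The sketch below is a minimal illustration with hypothetical function names; it assumes b ≥ 1 and x ≥ 0, and uses `math.gamma` from the standard library for the gamma function. With b = 1 the gamma pdf reduces to the exponential pdf with the same rate λ.

```python
import math


def exponential_pdf(x: float, lam: float) -> float:
    """Exponential pdf with rate lam, zero for x < 0."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0


def gamma_pdf(x: float, lam: float, b: float) -> float:
    """Gamma pdf with rate lam and shape b (sketch assumes b >= 1)."""
    if x < 0:
        return 0.0
    return lam ** b * x ** (b - 1) * math.exp(-lam * x) / math.gamma(b)


if __name__ == "__main__":
    # With b = 1 the two densities coincide
    print(gamma_pdf(2.0, 0.5, 1), exponential_pdf(2.0, 0.5))
    # Crude Riemann-sum check that the b = 3 density integrates to about 1
    dx = 0.01
    print(sum(gamma_pdf(i * dx, 0.5, 3) * dx for i in range(1, 10001)))
```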
Pearson Type III
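Named after the statistician Pearson, it is also
called the three-parameter gamma distribution. A
lower bound is introduced through the third
parameter ($\epsilon$)
$$f(x) = \frac{\lambda^{b} (x - \epsilon)^{b-1} e^{-\lambda (x - \epsilon)}}{\Gamma(b)}, \quad x \ge \epsilon; \quad \Gamma = \text{gamma function}$$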
It is also a skewed distribution first applied in hydrology for
describing the pdf of annual maximum flows.
Log-Pearson Type III
If log X follows a Pearson Type III distribution,
then X is said to have a log-Pearson Type III
distribution
$$f(x) = \frac{\lambda^{b} (y - \epsilon)^{b-1} e^{-\lambda (y - \epsilon)}}{x\,\Gamma(b)}, \quad y = \log x \ge \epsilon$$
Frequency analysis for extreme events
Q. Find a flow (or any other event) that has a return period of T years
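EV1 pdf and cdf:
$$f(x) = \frac{1}{\alpha} \exp\left[-\frac{x - u}{\alpha} - \exp\left(-\frac{x - u}{\alpha}\right)\right]$$
$$F(x) = \exp\left[-\exp\left(-\frac{x - u}{\alpha}\right)\right]$$
$$\alpha = \frac{\sqrt{6}\, s_x}{\pi}, \quad u = \bar{x} - 0.5772\,\alpha$$
Define a reduced variable $y = \frac{x - u}{\alpha}$
$$F(x) = \exp\left[-\exp(-y)\right]$$
$$y = -\ln\left[-\ln F(x)\right] = -\ln\left[-\ln(1 - p)\right], \quad \text{where } p = P(x \ge x_T)$$
$$y_T = -\ln\left[-\ln\left(1 - \frac{1}{T}\right)\right]$$
If you know T, you can find $y_T$, and once $y_T$ is known, $x_T$ can be computed by
$$x_T = u + \alpha\, y_T$$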
Example 12.2.1
Given annual maxima for 10-minute storms
Find 5- & 50-year return period 10-minute
storms
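$$\bar{x} = 0.649 \text{ in}, \quad s = 0.177 \text{ in}$$
$$\alpha = \frac{\sqrt{6}\, s}{\pi} = \frac{\sqrt{6} \times 0.177}{\pi} = 0.138$$
$$u = \bar{x} - 0.5772\,\alpha = 0.649 - 0.5772 \times 0.138 = 0.569$$
$$y_5 = -\ln\left[\ln\left(\frac{T}{T - 1}\right)\right] = -\ln\left[\ln\left(\frac{5}{5 - 1}\right)\right] = 1.5$$
$$x_5 = u + \alpha\, y_5 = 0.569 + 0.138 \times 1.5 = 0.78 \text{ in}$$
$$x_{50} = 1.11 \text{ in}$$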
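The arithmetic of this example can be reproduced with a short sketch. The function names are illustrative; the parameters are estimated by the method of moments from the sample mean and standard deviation given above, with Euler's constant rounded to 0.5772 as on the slides.

```python
import math

EULER_GAMMA = 0.5772  # rounded value used on the slides


def gumbel_parameters(mean: float, std: float):
    """Method-of-moments estimates (u, alpha) for the EV1 distribution."""
    alpha = math.sqrt(6.0) * std / math.pi
    u = mean - EULER_GAMMA * alpha
    return u, alpha


def gumbel_quantile(mean: float, std: float, T: float) -> float:
    """Event magnitude x_T with return period T under an EV1 fit."""
    u, alpha = gumbel_parameters(mean, std)
    y_T = -math.log(math.log(T / (T - 1.0)))  # reduced variate
    return u + alpha * y_T


if __name__ == "__main__":
    for T in (5, 50):
        print(T, round(gumbel_quantile(0.649, 0.177, T), 2))
```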
Frequency Factors
The previous example only works if the distribution is
invertible; many are not.
Once a distribution has been selected and its
parameters estimated, how do we use it?
Chow proposed using: $x_T = \bar{x} + K_T s$
where
$x_T$ = Estimated event magnitude
$K_T$ = Frequency factor
$T$ = Return period
$\bar{x}$ = Sample mean
$s$ = Sample standard deviation
[Figure: pdf $f_X(x)$ with $\bar{x}$ and $x_T$ marked, the distance $K_T s$ between them, and the area $P(X \ge x_T) = 1/T$]
Normal Distribution
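Normal distribution
$$f_X(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2\right]$$
$$K_T = \frac{x_T - \bar{x}}{s} = z_T$$
So the frequency factor for the Normal
Distribution is the standard normal variate
$$x_T = \bar{x} + K_T s = \bar{x} + z_T s$$
Example: 50 year return period
$$T = 50; \quad p = \frac{1}{50} = 0.02; \quad K_{50} = z_{50} = 2.054$$
Look in Table 11.2.1 or use –NORMSINV (.) in EXCEL or see page 390 in the text book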
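As an alternative to the table lookup or Excel, the frequency factor for the normal distribution can be computed with Python's standard-library `statistics.NormalDist`. This is a minimal sketch with illustrative function names; it uses the non-exceedance probability 1 − 1/T, which is equivalent to negating the inverse cdf evaluated at p = 1/T.

```python
from statistics import NormalDist


def normal_frequency_factor(T: float) -> float:
    """K_T = z_T, the standard normal variate exceeded with probability 1/T."""
    return NormalDist().inv_cdf(1.0 - 1.0 / T)


def normal_quantile(mean: float, std: float, T: float) -> float:
    """x_T = mean + K_T * std for a normal distribution."""
    return mean + normal_frequency_factor(T) * std


if __name__ == "__main__":
    print(round(normal_frequency_factor(50), 3))
```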
EV-I (Gumbel) Distribution
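$$F(x) = \exp\left[-\exp\left(-\frac{x - u}{\alpha}\right)\right], \quad \alpha = \frac{\sqrt{6}\, s}{\pi}, \quad u = \bar{x} - 0.5772\,\alpha, \quad y_T = -\ln\left[\ln\left(\frac{T}{T - 1}\right)\right]$$
$$x_T = u + \alpha\, y_T$$
$$= \bar{x} - 0.5772\,\frac{\sqrt{6}}{\pi}\, s - \frac{\sqrt{6}}{\pi}\, s \ln\left[\ln\left(\frac{T}{T - 1}\right)\right]$$
$$= \bar{x} - \frac{\sqrt{6}}{\pi}\left\{0.5772 + \ln\left[\ln\left(\frac{T}{T - 1}\right)\right]\right\} s$$
$$x_T = \bar{x} + K_T s$$
$$K_T = -\frac{\sqrt{6}}{\pi}\left\{0.5772 + \ln\left[\ln\left(\frac{T}{T - 1}\right)\right]\right\}$$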
Example 12.3.2
Given annual maximum rainfall, calculate 5-yr
storm using frequency factor
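$$K_T = -\frac{\sqrt{6}}{\pi}\left\{0.5772 + \ln\left[\ln\left(\frac{T}{T - 1}\right)\right]\right\}$$
$$K_5 = -\frac{\sqrt{6}}{\pi}\left\{0.5772 + \ln\left[\ln\left(\frac{5}{5 - 1}\right)\right]\right\} = 0.719$$
$$x_T = \bar{x} + K_T s = 0.649 + 0.719 \times 0.177 = 0.78 \text{ in}$$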
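The frequency-factor calculation for EV1 can be sketched in a few lines. The function names are illustrative; the constant 0.5772 and the sample statistics are the ones used in this example.

```python
import math


def gumbel_frequency_factor(T: float) -> float:
    """EV1 frequency factor K_T for return period T."""
    return -(math.sqrt(6.0) / math.pi) * (
        0.5772 + math.log(math.log(T / (T - 1.0)))
    )


def gumbel_quantile_kt(mean: float, std: float, T: float) -> float:
    """x_T = mean + K_T * std."""
    return mean + gumbel_frequency_factor(T) * std


if __name__ == "__main__":
    print(round(gumbel_frequency_factor(5), 3))
    print(round(gumbel_quantile_kt(0.649, 0.177, 5), 2))
```

The result matches the value obtained from the reduced variate in Example 12.2.1, as expected, since the frequency factor is only an algebraic rearrangement of the same relation.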
Probability plots
Probability plot is a graphical tool to assess whether
or not the data fits a particular distribution.
The data are plotted against a theoretical distribution
in such a way that the points should form
approximately a straight line (the distribution function
is linearized)
Departures from a straight line indicate departure
from the theoretical distribution
Normal probability plot
Steps
1. Rank the data from largest (m = 1) to smallest (m = n)
2. Assign plotting position to the data
   1. Plotting position – an estimate of exceedance probability
   2. Use p = (m - 3/8)/(n + 1/4)
3. Find the standard normal variable z corresponding to the
plotting position (use -NORMSINV (.) in Excel)
4. Plot the data against z
If the data falls on a straight line, the data comes from a
normal distribution
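The steps above can be sketched as a function that returns the (z, x) pairs to plot. The function name and the sample data are hypothetical. The sketch uses Blom's plotting position p = (m − 3/8)/(n + 1/4), taken here as the intended form of step 2, and `statistics.NormalDist().inv_cdf` in place of Excel's NORMSINV.

```python
from statistics import NormalDist


def normal_plot_points(data, b=0.375):
    """Return (z, x) pairs for a normal probability plot.

    Data are ranked from largest (m = 1) to smallest (m = n); p is the
    plotting-position estimate of exceedance probability, and
    z = -inv_cdf(p) is the matching standard normal variable.
    """
    ranked = sorted(data, reverse=True)
    n = len(ranked)
    std_normal = NormalDist()
    points = []
    for m, x in enumerate(ranked, start=1):
        p = (m - b) / (n + 1 - 2 * b)  # b = 3/8 gives (m - 3/8)/(n + 1/4)
        z = -std_normal.inv_cdf(p)
        points.append((z, x))
    return points


if __name__ == "__main__":
    # Hypothetical sample, for illustration only
    for z, x in normal_plot_points([30, 10, 50, 20, 40]):
        print(round(z, 3), x)
```

Plotting x against z with any plotting tool then gives the probability plot; the largest observation gets the largest z.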
Normal Probability Plot
[Figure: Q (1000 cfs) versus standard normal variable (z); series: Data, Normal]
Annual maximum flows for Colorado River near Austin, TX
The pink line you see on the plot is xT for T = 2, 5, 10, 25, 50, 100, 500 derived using
the frequency factor technique for normal distribution.
EV1 probability plot
Steps
1. Sort the data from largest to smallest
2. Assign plotting position using Gringorten formula
pi = (m – 0.44)/(n + 0.12)
3. Calculate reduced variate yi = -ln(-ln(1-pi))
4. Plot sorted data against yi
If the data falls on a straight line, the data
comes from an EV1 distribution
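The four steps can be sketched as a function that returns the (y, x) pairs to plot. The function name and the sample data are hypothetical; the plotting position and reduced variate follow the formulas in the steps above.

```python
import math


def ev1_plot_points(data):
    """Return (y, x) pairs for an EV1 (Gumbel) probability plot.

    Data are sorted from largest (m = 1) to smallest (m = n), the
    Gringorten plotting position estimates exceedance probability, and
    y = -ln(-ln(1 - p)) is the EV1 reduced variate.
    """
    ranked = sorted(data, reverse=True)
    n = len(ranked)
    points = []
    for m, x in enumerate(ranked, start=1):
        p = (m - 0.44) / (n + 0.12)
        y = -math.log(-math.log(1.0 - p))
        points.append((y, x))
    return points


if __name__ == "__main__":
    # Hypothetical sample, for illustration only
    for y, x in ev1_plot_points([30, 10, 50, 20, 40]):
        print(round(y, 3), x)
```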
EV1 probability plot
[Figure: Q (1000 cfs) versus EV1 reduced variate; series: Data, EV1]
Annual maximum flows for Colorado River near Austin, TX
The pink line you see on the plot is xT for T = 2, 5, 10, 25, 50, 100, 500 derived using
the frequency factor technique for EV1 distribution.