Stochastic Hydrology
Prediction is very difficult, especially if it's about the future.
- Niels Bohr
Section 1b Basic Hydrology
Statistics and Probability
Many hydrologic data represent random phenomena: temperature, rainfall, wind
We must include extreme events in engineering analysis: floods, droughts
There is always some risk; the question is how much?
Definition of probability and statistics
Section 3 Probability
Probability Rule I
The probability of obtaining either outcome A or B, with A and B mutually exclusive, is the sum of the probabilities of obtaining each:
P(A or B) = P(A ∪ B) = P(A) + P(B)
where:
P(A or B) = probability of obtaining either A or B
P(A) = probability of obtaining event A
P(B) = probability of obtaining event B
[Figure: Venn diagram]
Probability
If events are not mutually exclusive, then
P(A or B) = P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
[Figure: Venn diagram]
Example
Example: What is the probability of getting a 1 or a 3 when rolling a fair die once?
Define Event A = rolling a 1, Event B = rolling a 3
P(A) = 1/6, P(B) = 1/6, so
P(A or B) = P(A ∪ B) = P(A) + P(B) = 1/6 + 1/6 = 2/6 = 1/3
Probability Rule II
The probability of obtaining both outcomes A and B, with A and B independent, is the product of their individual probabilities. Thus,
P(A and B) = P(A ∩ B) = P(A) * P(B)
Example: What is the probability of getting a 1 and then a 3 when rolling a fair die twice?
Define Event A = rolling a 1 on the first roll
Define Event B = rolling a 3 on the second roll
P(A) = 1/6, P(B) = 1/6
P(A and B) = P(A ∩ B) = P(A) * P(B) = 1/6 * 1/6 = 1/36
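Both die examples can be checked by brute-force enumeration of equally likely outcomes. The sketch below is only an illustration; the helper name `probability` is an assumption and does not come from the slides.

```python
from fractions import Fraction
from itertools import product

FACES = range(1, 7)  # faces of a fair six-sided die

def probability(outcomes, event):
    """Exact probability of `event` over equally likely `outcomes`."""
    outcomes = list(outcomes)
    favorable = sum(1 for o in outcomes if event(o))
    return Fraction(favorable, len(outcomes))

# Rule I: one roll, mutually exclusive events "1" or "3".
p_one_or_three = probability(FACES, lambda face: face in (1, 3))

# Rule II: two independent rolls, "1" first and then "3".
p_one_then_three = probability(product(FACES, repeat=2), lambda roll: roll == (1, 3))

print(p_one_or_three)    # 1/3
print(p_one_then_three)  # 1/36
```

Exact fractions are used so the results match the hand calculations without rounding.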
Bernoulli Variables and Binomial Experiments
A Bernoulli variable is one that can take on only one of two values (e.g., Success or Failure; True or False; 0 or 1).
A binomial experiment is a fixed number of independent trials in which the outcome of each trial is a Bernoulli variable with the same probability of success.
Probability Rule III
The probability p(n, N) of having exactly n occurrences (successes) in N trials is
$$p(n, N) = \frac{N!}{n!\,(N-n)!}\, p^{n} (1-p)^{N-n}$$
where p = probability of success in any one trial
n! = factorial of n; remember: 4! = 4*3*2*1 = 24; 0! = 1
Note: the order of the successes is not important, just the total number of successes over the number of trials.
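Rule III translates directly into a few lines of code. This is a minimal sketch; the function name `binomial_probability` and its argument checks are assumptions made for illustration.

```python
from math import comb

def binomial_probability(n, N, p):
    """Probability of exactly n successes in N independent trials (Rule III)."""
    if not 0 <= n <= N:
        raise ValueError("n must satisfy 0 <= n <= N")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    # comb(N, n) equals N! / (n! (N - n)!)
    return comb(N, n) * p**n * (1.0 - p) ** (N - n)

# Four heads in ten tosses of a fair coin: 210 / 1024.
print(round(binomial_probability(4, 10, 0.5), 3))  # 0.205
```

Using `math.comb` avoids computing large factorials explicitly.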
Probability Example
In a roulette game (1-49, 0, 00), if I bet on only 0 and 5, what is the probability of winning?
Let Event A = marble lands on 0, Event B = marble lands on 5
Note: the events are mutually exclusive, so P(A or B) = P(A ∪ B) = P(A) + P(B)
P(winning) = P(marble on 0) + P(marble on 5) = 1/51 + 1/51 = 2/51 = 0.0392 = 3.92%
Probability Example
What is the probability of tossing a coin and getting three heads in a row?
Let Event A = head on 1st toss, Event B = head on 2nd toss, Event C = head on 3rd toss
Note: independent events, so P(A ∩ B ∩ C) = P(A) * P(B) * P(C)
P(3 heads in a row) = P(head on 1st toss) * P(head on 2nd toss) * P(head on 3rd toss)
P(3 heads in a row) = 1/2 * 1/2 * 1/2 = 1/8 = 0.125 = 12.5%
Probability Example
What is the probability of getting a 1 exactly two times in 6 throws of a fair die?
Note: the number of trials is fixed, and we count only the total number of successes.
Here n = 2, N = 6, p = P(S) = P(getting a 1) = 1/6
P(2 successes in 6 tries):
$$p(n, N) = \frac{N!}{n!\,(N-n)!}\, p^{n} (1-p)^{N-n}$$
$$p(2, 6) = \frac{6!}{2!\,(6-2)!}\, (0.1667)^{2} (1-0.1667)^{6-2} = 0.201$$
Probability Example
What is the probability of getting a 1 at most two times in 6 throws of a fair die?
Let event A = getting a 1 at most twice
P(A) = P(0 in 6) + P(1 in 6) + P(2 in 6)
$$p(0, 6) = \frac{6!}{0!\,(6-0)!}\, (0.1667)^{0} (1-0.1667)^{6-0} = 0.335$$
$$p(1, 6) = \frac{6!}{1!\,(6-1)!}\, (0.1667)^{1} (1-0.1667)^{6-1} = 0.402$$
$$p(2, 6) = \frac{6!}{2!\,(6-2)!}\, (0.1667)^{2} (1-0.1667)^{6-2} = 0.201$$
P(getting a 1 at most twice) = 0.335 + 0.402 + 0.201 = 0.938
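An "at most" probability is a sum of binomial terms, which is easy to get wrong by hand. The sketch below repeats the calculation numerically; the function name `at_most` is an assumption for illustration.

```python
from math import comb

def at_most(k, N, p):
    """Probability of at most k successes in N independent trials."""
    return sum(comb(N, n) * p**n * (1.0 - p) ** (N - n) for n in range(k + 1))

# At most two 1s in six throws of a fair die.
print(round(at_most(2, 6, 1 / 6), 3))  # 0.938
```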
Return Period
Probability is often expressed as a return period T.
T is the average number of years between occurrences of an event.
A storm that is equaled or exceeded on average once in 100 years is an event with a 100-year return period.
The 100-year flood has a 1% chance of being exceeded in any year.
Pr(occurrence in any year): p = 1/T
The probability that the event will not occur in a given year is
$$\bar{p} = 1 - p = 1 - 1/T$$
Risk and Reliability
If the probability of a storm occurrence is the same from year to year, what is the probability that it occurs in N years?
For an N-year period, what is the probability that the T-yr event occurs at least once? This probability is called the risk R.
Risk = P{1 event, or 2 events, or 3, ..., or N events in an N-year period}
or
$$\text{Risk} = 1 - P(\text{no occurrence in } N \text{ years}) = 1 - p(0, N) = 1 - (1-p)^{N} = 1 - (1 - 1/T)^{N}$$
Reliability is defined as 1 - risk, or
$$\text{Reliability} = (1 - 1/T)^{N}$$
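The risk and reliability formulas can be wrapped in two small functions. This is a sketch under the slide's assumption that the annual probability p = 1/T is constant and years are independent; the function names are illustrative only.

```python
def risk(T, N):
    """Probability that the T-year event occurs at least once in N years."""
    if T < 1 or N < 0:
        raise ValueError("require T >= 1 and N >= 0")
    return 1.0 - (1.0 - 1.0 / T) ** N

def reliability(T, N):
    """Probability that the T-year event does not occur in N years."""
    return 1.0 - risk(T, N)

# A 25-year event over a 10-year period.
print(round(risk(25, 10), 3))  # 0.335
```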
Class Exercise
What is the probability that at least one 50-yr flood will occur during the 30-yr life of a flood control project?
The probability of a 50-year flood occurring in any given year is p = 1/T = 1/50 = 0.02
The probability of at least one flood is the risk, thus
$$\text{Risk} = 1 - (1 - 1/T)^{N} = 1 - (1 - 0.02)^{30} = 0.455$$
Fairly high! Let's use a larger, 100-yr flood:
$$\text{Risk} = 1 - (1 - 0.01)^{30} = 0.26$$
In Class Example
What is the probability that the 100-yr flood will not occur in 10 years or 100 years?
p = 1/T = 1/100 = 0.01
For N = 10, $P(n = 0) = (1-p)^{10} = 0.99^{10} = 0.90$
For N = 100, $P(n = 0) = (1-p)^{100} = 0.99^{100} = 0.37$
In Class Example
In general, what is the probability of having no floods greater than the T-yr flood during a sequence of T years?
Here $p(n = 0) = (1 - 1/T)^{T}$
Remember the definition of e:
$$e = \lim_{x \to 0} (1 + x)^{1/x} = \lim_{n \to \infty} \left(1 + \frac{1}{n}\right)^{n} = 2.71828\ldots$$
So as T gets large,
$P(n = 0) = e^{-1} = 0.368$ (reliability)
$P(n > 0) = 1 - e^{-1} = 0.632$ (risk)
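The convergence toward 1/e can be seen numerically by evaluating the expression for increasing T. A small sketch; the function name is an assumption for illustration.

```python
import math

def no_exceedance_in_T_years(T):
    """Probability of no T-year event during a sequence of T years."""
    return (1.0 - 1.0 / T) ** T

for T in (2, 10, 100, 1000):
    print(T, round(no_exceedance_in_T_years(T), 4))

print("limit", round(math.exp(-1.0), 4))
```

The values increase toward the limit from below, so for short return periods the reliability over T years is somewhat lower than 0.368.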
In Class Exercise
Compute the return period of a design storm to be used for the design of a culvert. There is a 5% probability that the design storm will occur in the next 5 years.
$$R = 1 - (1 - 1/T)^{N}$$
$$0.05 = 1 - (1 - 1/T)^{5}$$
$$T = \frac{1}{1 - (0.95)^{0.2}} = 97.98 \text{ years}$$
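Solving the risk equation for T gives T = 1 / [1 - (1 - R)^(1/N)], which can be sketched as a helper. The function name `return_period_for_risk` is hypothetical, and the same constant-probability assumption applies.

```python
def return_period_for_risk(R, N):
    """Return period T whose risk of at least one occurrence in N years is R."""
    if not 0.0 < R < 1.0:
        raise ValueError("R must lie strictly between 0 and 1")
    if N <= 0:
        raise ValueError("N must be positive")
    annual_p = 1.0 - (1.0 - R) ** (1.0 / N)  # annual exceedance probability
    return 1.0 / annual_p

# Accepting a 10% risk over a 20-year design life.
print(round(return_period_for_risk(0.10, 20), 1))
```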
In Class Exercise
Compute the probability that exactly two 10-yr floods occur in a single 30-yr period.
Here p = 1/10 = 0.1
Hint: use Rule III
$$p(2, 30) = \frac{30!}{2!\,(30-2)!}\, (0.1)^{2} (1-0.1)^{30-2} = 0.228$$
In Class Exercise
What is the probability of a flood equal to or greater than a 5-yr flood during the next 3 years?
Probability of having a 5-yr flood next year = 1/5 = 0.2
Probability of not having a 5-yr flood next year = 1 - 0.2 = 0.8
Probability of not having a 5-yr flood in the next 3 years = $(0.8)^{3}$
Probability of having a 5-yr flood in the next 3 years:
Risk $R = 1 - (0.8)^{3} = 1 - 0.512 = 0.488 = 48.8\%$
FLOOD FREQUENCY
Flood Frequency Analysis
Statistical methods to evaluate the probability of flood occurrence
Used to determine return periods of rainfall
Used to determine 100-yr flows for floodplain mapping purposes
Used for datasets that have no obvious trends
Continuous and Discrete
Probability Distributions
CDF is the most useful form for analysis
Discrete:
$$F(x) = P(X \le x) = \sum_{x_i \le x} P(x_i)$$
Continuous:
$$F(x_1) = P(-\infty \le x \le x_1) = \int_{-\infty}^{x_1} f(x)\,dx$$
$$P(x_1 \le x \le x_2) = F(x_2) - F(x_1)$$
Moments of a Distribution
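Nth moment about the origin
Discrete:
$$\mu'_N = \sum_i x_i^{N} P(x_i)$$
Continuous:
$$\mu'_N = \int_{-\infty}^{\infty} x^{N} f(x)\,dx$$
First Moment about the Origin
Discrete mean:
$$E(x) = \mu = \sum_i x_i P(x_i)$$
Continuous mean:
$$E(x) = \mu = \int_{-\infty}^{\infty} x f(x)\,dx$$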
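Var(x) = second moment about the mean
Discrete:
$$\mathrm{Var}(x) = \sigma^{2} = \sum_i (x_i - \mu)^{2} P(x_i)$$
Continuous:
$$\mathrm{Var}(x) = \sigma^{2} = \int_{-\infty}^{\infty} (x - \mu)^{2} f(x)\,dx$$
$$\mathrm{Var}(x) = E(x^{2}) - (E(x))^{2}$$
$$c_v = \frac{\sigma}{\mu} = \text{coefficient of variation}$$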
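The discrete moment formulas can be checked on a simple distribution such as a fair die, where both expressions for the variance must agree. The sketch below uses exact fractions; the helper name `moment_about_origin` and the die example are assumptions for illustration.

```python
from fractions import Fraction

def moment_about_origin(pmf, order):
    """N-th moment about the origin of a discrete distribution {x_i: P(x_i)}."""
    return sum(x**order * p for x, p in pmf.items())

# Fair six-sided die: each face has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

mean = moment_about_origin(die, 1)                          # E(x)
variance = moment_about_origin(die, 2) - mean**2            # E(x^2) - (E(x))^2
central = sum((x - mean) ** 2 * p for x, p in die.items())  # direct definition

print(mean, variance, central)  # 7/2 35/12 35/12
```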
Estimates of Moments from Data
Mean of data:
$$\bar{x} = \frac{1}{n} \sum_{i} x_i$$
Variance:
$$s_x^{2} = \frac{1}{n-1} \sum_{i} (x_i - \bar{x})^{2}$$
Skewness Coefficient
Used to evaluate high or low data points - flood or drought data
Skewness:
$$\frac{\mu_3}{\sigma^{3}}, \quad \mu_3 = \text{third central moment}$$
Skewness coefficient:
$$C_s = \frac{n \sum_i (x_i - \bar{x})^{3}}{(n-1)(n-2)\, s_x^{3}}$$
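The three sample estimators (mean, variance, skewness coefficient) can be computed together. This is a minimal sketch; the function name `sample_moments` and the sample values are made up for illustration.

```python
from math import sqrt

def sample_moments(data):
    """Return (mean, variance, skewness coefficient) using the n-1 and
    (n-1)(n-2) estimators."""
    n = len(data)
    if n < 3:
        raise ValueError("need at least three observations")
    mean = sum(data) / n
    variance = sum((x - mean) ** 2 for x in data) / (n - 1)
    s = sqrt(variance)
    if s == 0:
        raise ValueError("skewness undefined for constant data")
    skew = n * sum((x - mean) ** 3 for x in data) / ((n - 1) * (n - 2) * s**3)
    return mean, variance, skew

# Made-up sample with one large value, so the skew is positive.
mean, variance, skew = sample_moments([2.0, 3.0, 3.0, 4.0, 13.0])
print(mean, variance, skew)
```

A single high outlier, as in flood records, is enough to produce a clearly positive skewness coefficient.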
Data with Long Right Tail
Siletz River Data
Stationary Data Showing No Obvious Trends
Data with Trends
Frequency Histogram
Probability that Q is 10,000 to 15,000 = 17%
Cumulative Histogram
Probability that Q < 20,000 is 55%
Venn Diagram
Probability Density Function
Mean, Median, Mode
Positive skew moves the mean to the right
Negative skew moves the mean to the left
Normal distribution has mean = median = mode
Mode has the highest probability of occurrence
Generalized Skew
Major Distributions
Binomial - P(x successes in n trials)
Exponential - decays rapidly to low probability
Normal - symmetric, based on μ and σ
Lognormal - logs of the data are normally distributed
Gamma - skewed distribution
Log Pearson III - skewed, and recommended by the Interagency Advisory Committee (IAC) on Water Data
Binomial Distribution
Exponential Distribution
Normal, Log Normal, Log Pearson Type 3
Graphical Methods
Need to fit some distribution to observed data
Data are arranged in order and assigned a rank
If order is descending in magnitude:
Highest value has rank of 1
Lowest value has rank of n
Gives estimate of exceedance probability
Pr(event ≥ ranked value)
Graphical Methods
If order is ascending in magnitude:
Lowest value has rank of 1
Highest value has rank of n
Gives estimate of non-exceedance probability
Pr(event ≤ ranked value)
Many plotting formulas are available; all can be expressed as
$$p_m = \frac{m - a}{N + b}$$
where a, b = constants
p_m = Pr(observation ≤ mth value of the N ordered observations, ordered such that $P_1 < P_2 < \ldots < P_N$)
Graphical Methods
The most common formula is
$$p_m = \frac{m}{N + 1}$$
where p_m = probability of the mth data point (observed value)
Clearly, then, the return period of the mth data point (with data ranked in descending order) is
$$T_m = \frac{N + 1}{m}$$
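The ranking and plotting-position steps can be sketched as a short function that ranks values in descending order and applies p_m = m / (N + 1). The function name and the sample peaks are hypothetical, and tied values are simply given consecutive ranks, which is an assumption rather than something stated on the slides.

```python
def weibull_plotting_positions(values):
    """Rank values in descending order and return (value, rank m, p_m, T_m) rows.

    p_m = m / (N + 1) estimates the exceedance probability,
    T_m = (N + 1) / m is the corresponding return period.
    """
    ordered = sorted(values, reverse=True)
    N = len(ordered)
    return [(x, m, m / (N + 1), (N + 1) / m) for m, x in enumerate(ordered, start=1)]

# Hypothetical annual peaks, for illustration only.
for value, m, p, T in weibull_plotting_positions([120.0, 85.0, 240.0, 60.0]):
    print(value, m, round(p, 2), round(T, 2))
```

With N + 1 in the denominator, the largest observation never plots at a probability of exactly 0 or 1, which keeps it on probability paper.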
Normal Probability Paper
Place mean at F = 50%
Place mean - s_x and mean + s_x at F = 15.9% and 84.1%
Connect points with a straight line
Plot data with plotting position formula
Determine P(X < x) by reading the graph for F
Lognormal Probability Paper
LogN Plot of Siletz R.
Straight Line Fits Data Well
Siletz River Flow Data
Various Distributions
Normal Distribution
Flow Duration Curves
Rules - Analysis of Data
Take mean and variance of data
Take skewness Cs of data
If Cs is near zero, assume normal distribution
If Cs is large, take Y = log x and compute the mean and variance of Y
Take skewness of log data, Cs(Y)
Fit data to Log Pearson III or lognormal
In Class Exercise
Prepare a flood frequency curve
Determine the probability of a flow of 20,000 cfs
Determine the magnitude of flow corresponding to an exceedance probability of 0.5
Determine the magnitude of flow with a return period of 100 years
Use the normal distribution to find extreme values
Make a frequency plot using a lognormal distribution
In Class Exercise
The following data are maximum flows (cfs) of the Cedar River in Minnesota. A peak flow of 30,000 cfs was recorded in 1905.
Year   Flow (cfs)   Rank   Plot. %   |   Year   Flow (cfs)   Rank   Plot. %
1950   14,400                        |   1962    6,240
1951    6,720                        |   1963   22,700
1952   13,390                        |   1964   11,140
1953   15,360                        |   1965    4,560
1954    8,856                        |   1966    5,376
1955    5,136                        |   1967   12,480
1956    6,770                        |   1968   19,200
1957    9,600                        |   1969   12,984
1958      980                        |   1970    5,450
1959    4,030                        |   1971   13,440
1960   10,440                        |   1972   22,680
1961    3,100                        |   1973    8,400
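One way to approach the lognormal part of the exercise is to fit a normal distribution to Y = log10(Q) and read off flows for chosen return periods. This is only a sketch under stated assumptions: it uses the 24 systematic annual peaks listed above, ignores the 1905 historical peak, and the function name `lognormal_flow` is hypothetical. It is not presented as the intended class solution.

```python
from math import log10
from statistics import NormalDist, mean, stdev

# Annual maximum flows (cfs), 24 values, as listed above.
flows = [14400, 6720, 13390, 15360, 8856, 5136, 6770, 9600, 980, 4030, 10440, 3100,
         6240, 22700, 11140, 4560, 5376, 12480, 19200, 12984, 5450, 13440, 22680, 8400]

logs = [log10(q) for q in flows]
# Sample mean and (n - 1) standard deviation of Y = log10(Q).
mu_y, s_y = mean(logs), stdev(logs)

def lognormal_flow(T):
    """Flow with return period T years under the fitted lognormal distribution."""
    # Standard normal variate for non-exceedance probability 1 - 1/T.
    z = NormalDist().inv_cdf(1.0 - 1.0 / T)
    return 10 ** (mu_y + z * s_y)

for T in (2, 10, 100):
    print(T, round(lognormal_flow(T)))
```

The T = 2 value corresponds to an exceedance probability of 0.5, that is, the median of the fitted distribution.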