Mixture and Hidden Markov Models for Estimating Flood Quantiles and Risk
Luc Perreault, Vincent Fortin (Hydro-Québec Research Institute)
Jose D. Salas (Colorado State University)
AGU-CGU 2004 Joint Assembly, 17-21 May, Montréal, Canada
Outline
- Annual streamflow records exhibit abrupt changes
- And what about the extremes?
- A simple hidden Markov model for peak flow
- Bivariate hidden Markov model for peak flow and volume
- Conclusion
2 Institut de recherche
Annual streamflow records exhibit abrupt changes
Some sites in Québec
May-June-July volume
[Figure: May-June-July (MJJ) inflow volume by year, 1965-1995, one panel per site: Manic 5 + Hart-Jaune, Outardes 4, Lac Sainte-Anne, Sainte-Marguerite, Romaine, Churchill Falls, Ashuanipi, Caniapiscau. Axes: MJJ inflows vs. years.]
Some models currently in use
- Change-point models: Smith (1975); Perreault et al. (1999, 2000, 2004)
- Shifting-level models: Salas and Boes (1982); Fortin et al. (2004)
- Mixture models: West (1992); unpublished report, Perreault (2003)
- Hidden Markov models: Albert and Chib (1993); Thyer and Kuczera (2000)

The Bayesian perspective
- Prior knowledge about θ: prior distribution p(θ)
- Model (likelihood): p(y | θ)
- Observations: y = (y1, ..., yn)
- Bayes' theorem: posterior distribution p(θ | y)
- Uses of the posterior: estimation, forecasting, decision
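The Bayesian perspective above can be written out as a single formula. The integral form of the normalizing constant is the standard one and is not shown on the slide; θ' is only a dummy integration variable.

```latex
p(\theta \mid y)
  = \frac{p(y \mid \theta)\, p(\theta)}{\int p(y \mid \theta')\, p(\theta')\, d\theta'}
  \;\propto\; p(y \mid \theta)\, p(\theta)
```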
And what about the extremes?
Why do we need to analyse extremes?
- Management of existing hydroelectric equipment and reservoirs (peak and spring volume)
- Further hydroelectric developments
Case study: Manic 5 + Hart-Jaune
Independent and identically distributed?
- Heterogeneity?
- Persistence?
[Figure: spring peak flow by year, 1950-2000.]
[Figure: autocorrelation of spring peak flow vs. lag (1 to 10); r(1) = 0.55.]
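The persistence check above relies on the sample autocorrelation. A minimal sketch of how r(1) could be computed, assuming the usual biased estimator; the `flows` series is hypothetical and is not the Manic 5 + Hart-Jaune record:

```python
import numpy as np

def autocorr(x, lag):
    """Sample autocorrelation r(lag) of a 1-D series, for lag >= 1."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    # Lagged cross-products divided by the total sum of squares
    return float(np.sum(d[:-lag] * d[lag:]) / np.sum(d * d))

# Hypothetical peak flows with a visible shift in level (illustration only)
flows = np.array([3000, 3200, 2900, 3100, 5200, 5600, 5000, 5400, 3100, 2800],
                 dtype=float)
r1 = autocorr(flows, 1)
```

A clearly positive r(1), as on the slide, is what one would expect from a series that stays in one regime for several consecutive years.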
A simple hidden Markov model for extremes
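Two-state hidden Markov model (HMM)
- Hidden regimes: WET (W) and DRY (D), following a Markov process z1, z2, ..., zi, ..., zn with transition probabilities Pr(D → D), Pr(D → W), Pr(W → D), Pr(W → W)
- Transition matrix: $P = \begin{pmatrix} p_{DD} & p_{DW} \\ p_{WD} & p_{WW} \end{pmatrix}$
- Observations x1, x2, ..., xi, ..., xn: 2-parameter lognormal given the regime, LN(x | µD, σD) in the DRY regime and LN(x | µW, σW) in the WET regime
[Figure: graph of the model with nodes ak, bk, P, z1, ..., zn, x1, ..., xn, µzi, σzi, mzi, czi, azi, bzi.]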
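To make the structure of the two-state model concrete, here is a minimal simulation sketch. The function name, the parameter values and the uniform choice of the initial state are assumptions made for illustration; none of them come from the slides.

```python
import numpy as np

def simulate_hmm(n, P, mu, sigma, seed=0):
    """Simulate n years from a two-state HMM with lognormal emissions.

    States: 0 = DRY, 1 = WET. P[i][j] = Pr(z_t = j | z_{t-1} = i).
    mu, sigma: log-scale lognormal parameters, one pair per state.
    """
    rng = np.random.default_rng(seed)
    P = np.asarray(P, dtype=float)
    z = np.empty(n, dtype=int)
    z[0] = rng.integers(0, 2)          # initial state drawn uniformly (assumption)
    for t in range(1, n):
        z[t] = rng.choice(2, p=P[z[t - 1]])   # Markov transition
    # Each observation is lognormal with the parameters of its hidden state
    x = rng.lognormal(mean=np.asarray(mu)[z], sigma=np.asarray(sigma)[z])
    return z, x

# Hypothetical parameter values, chosen only for illustration
P = [[0.75, 0.25], [0.35, 0.65]]
z, x = simulate_hmm(50, P, mu=[8.0, 8.7], sigma=[0.25, 0.25])
```

Simulated series of this kind tend to show the two features discussed earlier: heterogeneity (two levels) and persistence (runs of consecutive wet or dry years).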
Inference about the parameters
[Figure: DRY and WET panels for the transition probabilities (pWD, pDW; horizontal axis 0 to 1; annotations pDD = 0.74 and pDD = 0.66) and for µD and µW (horizontal axis 7.5 to 9.5).]
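Transition probabilities translate directly into regime durations. The sketch below uses standard two-state Markov chain results; the helper names are hypothetical. The value 0.74 is the annotated pDD. The second annotated value, 0.66, appears in the WET panel and is assumed here to be the wet-regime persistence pWW, which the slide does not confirm.

```python
def expected_sojourn(p_stay):
    """Mean number of consecutive years spent in a regime (geometric law)."""
    return 1.0 / (1.0 - p_stay)

def stationary_dry(p_dd, p_ww):
    """Long-run probability of the DRY regime for a two-state chain."""
    p_dw = 1.0 - p_dd   # probability of leaving DRY
    p_wd = 1.0 - p_ww   # probability of leaving WET
    return p_wd / (p_dw + p_wd)

dry_spell = expected_sojourn(0.74)       # about 3.85 years
# ASSUMPTION: 0.66 is interpreted as pWW (see the lead-in)
pi_dry = stationary_dry(0.74, 0.66)      # about 0.57
```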
Retrospective classification
[Figure: spring peak flow by year, 1950-2000.]
[Figure: Prob(Z(i) = WET) by year, 1950-2000, for the mixture model and the HMM.]
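The slides do not state how Prob(Z(i) = WET) was computed. One common way to obtain such retrospective probabilities for fixed parameter values is the forward-backward recursion, sketched below. The function names, the data and the parameter values are hypothetical, and plugging in fixed parameters ignores the parameter uncertainty that a full Bayesian analysis would carry.

```python
import numpy as np

def lognormal_pdf(x, mu, sigma):
    """Density of a two-parameter lognormal, (mu, sigma) on the log scale."""
    x = np.asarray(x, dtype=float)
    z = (np.log(x) - mu) / sigma
    return np.exp(-0.5 * z ** 2) / (x * sigma * np.sqrt(2.0 * np.pi))

def smoothed_wet_prob(x, P, mu, sigma, init=(0.5, 0.5)):
    """Pr(z_i = WET | x_1..x_n) by scaled forward-backward; 0 = DRY, 1 = WET."""
    x = np.asarray(x, dtype=float)
    P = np.asarray(P, dtype=float)
    n = len(x)
    # Emission likelihoods, one column per regime, shape (n, 2)
    B = np.column_stack([lognormal_pdf(x, mu[k], sigma[k]) for k in range(2)])
    alpha = np.zeros((n, 2))
    beta = np.ones((n, 2))
    a = np.asarray(init, dtype=float) * B[0]
    alpha[0] = a / a.sum()
    for t in range(1, n):                  # forward pass (filtered probabilities)
        a = (alpha[t - 1] @ P) * B[t]
        alpha[t] = a / a.sum()
    for t in range(n - 2, -1, -1):         # backward pass (rescaled each step)
        b = P @ (B[t + 1] * beta[t + 1])
        beta[t] = b / b.sum()
    post = alpha * beta
    post /= post.sum(axis=1, keepdims=True)
    return post[:, 1]

# Hypothetical flows and parameters (illustration only)
x = [2500.0, 2700.0, 2600.0, 6000.0, 6500.0, 5800.0]
w = smoothed_wet_prob(x, [[0.74, 0.26], [0.34, 0.66]], mu=[7.9, 8.7], sigma=[0.2, 0.2])
```

A mixture model corresponds to the special case where both rows of P are identical, so that the regime of one year carries no information about the next.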
One-year-ahead predictive quantiles (2003)
$p(x_{n+1} \mid x_1, x_2, \ldots, x_n)$
[Figure: predictive quantiles Xp(n+1) of spring peak flow (0 to 16000) vs. non-exceedance probability (0.001, 0.023, 0.159, 0.5, 0.841, 0.977, 0.999). Curves: HMM, mixture, DRY, WET, standard two-parameter lognormal distribution; points: data.]
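For fixed parameter values, the next-year distribution under a two-regime model is a weighted mixture of the two lognormals, and its quantiles can be found numerically. The sketch below is a plug-in approximation under that assumption, not the full Bayesian predictive shown on the slide; the function names, search bounds and parameter values are hypothetical.

```python
import math

def lognormal_cdf(x, mu, sigma):
    """CDF of a two-parameter lognormal, (mu, sigma) on the log scale."""
    return 0.5 * (1.0 + math.erf((math.log(x) - mu) / (sigma * math.sqrt(2.0))))

def mixture_quantile(p, w_wet, mu, sigma, lo=1e-6, hi=1e7, tol=1e-8):
    """Quantile of (1 - w_wet) * LN(dry) + w_wet * LN(wet); index 0 = DRY, 1 = WET."""
    def cdf(x):
        return ((1.0 - w_wet) * lognormal_cdf(x, mu[0], sigma[0])
                + w_wet * lognormal_cdf(x, mu[1], sigma[1]))
    # Bisection on the monotone mixture CDF, to a relative tolerance
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical values: 30% chance that next year is WET
q50 = mixture_quantile(0.5, 0.3, mu=[8.0, 8.7], sigma=[0.25, 0.25])
q99 = mixture_quantile(0.99, 0.3, mu=[8.0, 8.7], sigma=[0.25, 0.25])
```

For an HMM, `w_wet` would be the probability that year n+1 is wet given the record, which depends on the most recent regime; for a mixture model it is the same every year.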
Joint forecast of spring peak flow and freshet volume
Preliminary results
Peak flow vs volume
Joint inference of peak flow and volume is needed for reservoir management.
- Freshet volume is a good predictor of peak flow in this region
- Freshet volume is less variable (Cv = 0.18 compared to Cv = 0.39)
- Easier to discriminate between the regimes
[Figure: scatter plot of log(spring peak flow) (7.4 to 9.4) against volume; R2 = 0.46.]
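The two summary statistics quoted above, the coefficient of variation Cv and the R2 of log peak flow against volume, can be computed as in this minimal sketch. The helper names and the data are hypothetical and are not the case-study values.

```python
import numpy as np

def coeff_variation(x):
    """Cv = sample standard deviation divided by the mean."""
    x = np.asarray(x, dtype=float)
    return float(x.std(ddof=1) / x.mean())

def r_squared(x, y):
    """R2 of a simple linear regression of y on x (squared correlation)."""
    r = np.corrcoef(x, y)[0, 1]
    return float(r ** 2)

# Hypothetical values (illustration only)
volume = np.array([40.0, 45.0, 50.0, 55.0, 60.0])
peak = np.array([2800.0, 3500.0, 3300.0, 4600.0, 5200.0])
cv_volume = coeff_variation(volume)
cv_peak = coeff_variation(peak)
r2 = r_squared(volume, np.log(peak))
```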
Bivariate HMM model
[Figure: graph of the model with nodes ak, bk, P, z1, ..., zn, x1, ..., xn, µzi, σzi, mzi, czi, azi, bzi.]
Classification
[Figure: spring peak flow, volume and Prob(Z(i) = WET) by year, 1950-2000.]
Predictive quantiles for year n+1
[Figure: predictive quantiles of spring peak flow (0 to 16000) vs. probability (0.001, 0.023, 0.159, 0.5, 0.841, 0.977, 0.999; axis labelled "Exceedance probability"). Curves: univariate HMM, bivariate HMM, lognormal (LN); points: data.]
Observed values for 2003
[Figure: spring peak flow (annotation: 3079 m³/s), volume and Prob(Z(i) = WET) by year, 1950-2000.]
Conclusion
In conclusion, we proposed …
- A design flood analysis framework for samples exhibiting abrupt changes
- Predictive quantile estimates
- Graphical modeling (DAG)
- Bayesian analysis
Perspectives
- Dealing with large uncertainties in quantile estimates
  persistence + heterogeneity => smaller effective sample size
  Adding more information: prior, indirect data (tree rings)
- Regional analysis using a hierarchical Bayesian model (Perreault et al. (2004), in French)
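The link between persistence and effective sample size noted above can be illustrated with a standard large-sample approximation for the mean of an AR(1) series. This is an assumption made for illustration, not the authors' calculation. Here r1 = 0.55 is the lag-1 autocorrelation reported for spring peak flow, while n = 50 is a hypothetical record length.

```python
def effective_sample_size(n, r1):
    """Approximate effective sample size for the mean of an AR(1) series."""
    return n * (1.0 - r1) / (1.0 + r1)

# n = 50 is hypothetical; r1 = 0.55 is the value reported for spring peak flow
n_eff = effective_sample_size(50, 0.55)   # about 14.5
```

Under this approximation, a persistent 50-year record would carry roughly as much information about the mean as 15 independent years, which is why additional information (priors, indirect data) becomes valuable.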