Everything You Always Wanted to Know about

DISCUSSIONS AND CLOSURES

Discussion of “Gumbel-Hougaard Copula for Trivariate Rainfall Frequency Analysis” by L. Zhang and Vijay P. Singh

July/August 2007, Vol. 12, No. 4, pp. 409–419.

DOI: 10.1061/(ASCE)1084-0699(2007)12:4(409)

Downloaded from ascelibrary.org by UNIVERSITE DE MONCTON on 01/09/15. Copyright ASCE. For personal use only; all rights reserved.

H. Chowdhary¹

¹Graduate Research Assistant, Department of Civil and Environmental Engineering, Louisiana State Univ., Baton Rouge, LA 70802. E-mail: [email protected]

Consideration of the multivariate nature of hydro-meteorological processes associated with hydrologic systems is important when assessing the availability of water resources and the risks caused by extreme events and when conducting rainfall-runoff simulation and modeling. Detailed statistical analysis of rainfall, as one of the main factors affecting water availability and extreme conditions, is required for estimating the design parameters. Storm depth, duration, inter-arrival duration, and areal spread are among the main features of the rainfall process. One or more of these factors need to be analyzed depending on the type of hydrological application at hand. However, this is further complicated by the fact that a wide range of areal coverage and thereby durations of rainfall are usually of interest (Linsley and Franzini 1979, p. 122). Intensity-duration-frequency curves are aimed at the requirements of obtaining extreme rainfall values for different durations and aid the design of small to medium-sized drainage systems. Analysis of storm depth and duration, or intensity and duration, along with the inter-arrival duration, on the other hand, is required for applications such as deriving flood frequency distributions from climatic characteristics (Eagleson 1972) and simulating rainfall events for the purpose of modeling rainfall-runoff, assessing water availability, and drought studies, among other activities. The application of the Gumbel-Hougaard copula for trivariate rainfall frequency analysis in the paper involves the three rainfall variables storm depth, duration, and mean intensity of the annual largest rainfall events. The largest annual rainfall event is identified as the storm having the highest rainfall depth. The mean intensity is obtained as the ratio of the corresponding storm depth and duration. Any two of these three variables constitute a bivariate data set, as the joint occurrence of the constituents is random in nature.

However, considering all three variables simultaneously makes this a nonsubstantive trivariate case, as any one of the variables could be computed from the knowledge of the other two. In other words, as the third variable is algebraically determinable and does not have any randomness in its occurrence when the other two are given, a bivariate frequency analysis may suffice. This one-to-one relationship of mean intensity with depth and duration, as given by Eq. (18b) in Zhang and Singh (2006), is illustrated in the perspective plot in Fig. 1(a). As in the paper, the annual largest storms at Liberty rainfall station for the period of 1980 to 2006 have been considered in this illustration. A minimum of six hours of dry period between storms is employed as the criterion to define the storms. This helps combine rain spells that are separated by less than six hours into single storms, as such intermittent spells are typically part of the same storm. This is also the criterion employed by the authors to identify the individual storms. It is seen, as expected, that all the observed data points fall exactly on the surface given by

$$
I = V/D \tag{1}
$$

where I, V, and D are the storms' mean intensity, depth (i.e., volume of rainfall), and duration in inch/hour, inch, and hours, respectively. For such cases that involve a variable that has a functional relationship with one or more variables, the dimensionality of the problem can be reduced by employing a suitable transformation function. For a bivariate case, this corresponds to the Fréchet-Hoeffding upper and lower bounds, depending on whether the function between the two variables is increasing or decreasing. For example, considering bivariate random data (X, Y) and a nondecreasing function Y = g(X), the joint cumulative distribution function (cdf), as given by Nelsen (2006), is

$$
F_{X,Y}(x,y) = \min[F_X(x), F_Y(y)] = \min\{F_X(x), F_X[g^{-1}(y)]\} = \min(u,v) = C(u,v)
$$

where $F_{X,Y}(x,y)$, $F_X(x)$, $F_Y(y)$, and C are the joint cdf of (X, Y), the marginal cdfs of X and Y, and the copula function, respectively, and U and V are uniformly distributed random variables. In this case, the bivariate joint cdf is fully determinable in terms of the marginal distribution of X. For the trivariate random variables (I, V, D), which are the subject matter of the present discussion, I is a function of (V, D) as given by Eq. (1), that is, I = V/D. Thus the joint trivariate cdf is given as

$$
F_{I,V,D}(i,v,d) = P[(I \le i),(V \le v),(D \le d)] = P[(V/D \le i),(V \le v),(D \le d)] \tag{2}
$$

The right-hand side of Eq. (2) is essentially a bivariate expression. The condition V/D ≤ i further restricts the bivariate space [0, v] × [0, d] such that the ratio V/D is not more than i. The joint cdf in Eq. (2) may be further expressed as

$$
F_{I,V,D}(i,v,d) =
\begin{cases}
\displaystyle \int_{s=0}^{s=d} \int_{r=0}^{r=is} f_{V,D}(r,s)\,dr\,ds, & \forall\, i \le v/d \\[2ex]
\displaystyle \int_{r=0}^{r=v} \int_{s=r/i}^{s=d} f_{V,D}(r,s)\,ds\,dr, & \forall\, i \ge v/d
\end{cases}
\tag{3}
$$

where r and s are dummy variables for V and D, respectively. It may be seen from Eq. (3) that the trivariate cdf is determinable from the knowledge of the bivariate distribution of (V, D). Thus the functional relationship between I, V, and D in this case, viz. I = V/D, helps reduce the trivariate distribution into a bivariate distribution.

Fig. 1. Perspective plots of depth, duration, and mean intensity at Liberty rainfall station for (a) annual largest storms for period from 1980 to 2006; (b) 91 storms of hydrological year 1992

Fig. 2. Pairwise plots of depth, duration, and mean intensity of annual largest storms at Liberty rainfall station for period from 1980 to 2006

992 / JOURNAL OF HYDROLOGIC ENGINEERING © ASCE / OCTOBER 2008

J. Hydrol. Eng. 2008.13:995-996.
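The reduction expressed by Eq. (3) can be checked numerically. The following is a minimal sketch under stated assumptions: the joint density of (V, D) is taken to be that of two independent unit-exponential variables, which is a hypothetical choice made only for illustration and is not the distribution fitted in the paper, and all function names are invented for this example. The two branches of Eq. (3) are evaluated with a midpoint rule over the (V, D) plane only, and the result is compared with a Monte Carlo estimate of P(V/D ≤ i, V ≤ v, D ≤ d).

```python
import math
import random


def joint_density(r, s):
    """Hypothetical f_{V,D}(r, s): independent unit exponentials (assumption)."""
    return math.exp(-r - s)


def trivariate_cdf(i, v, d, n=200):
    """Evaluate Eq. (3) with a midpoint rule; requires i > 0, v > 0, d > 0."""
    total = 0.0
    if i <= v / d:
        # Branch i <= v/d: outer integral over s in [0, d], inner over r in [0, i*s].
        hs = d / n
        for a in range(n):
            s = (a + 0.5) * hs
            hr = i * s / n
            inner = sum(joint_density((b + 0.5) * hr, s) for b in range(n)) * hr
            total += inner * hs
    else:
        # Branch i >= v/d: outer integral over r in [0, v], inner over s in [r/i, d].
        hr = v / n
        for a in range(n):
            r = (a + 0.5) * hr
            lower = r / i
            hs = (d - lower) / n
            inner = sum(joint_density(r, lower + (b + 0.5) * hs) for b in range(n)) * hs
            total += inner * hr
    return total


def empirical_cdf(i, v, d, n_samples=200_000, seed=1):
    """Monte Carlo estimate of P(V/D <= i, V <= v, D <= d) under the same assumption."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        depth = rng.expovariate(1.0)
        duration = rng.expovariate(1.0)
        # depth <= i * duration is the event V/D <= i, written without a division.
        if depth <= v and duration <= d and depth <= i * duration:
            hits += 1
    return hits / n_samples


print(trivariate_cdf(0.5, 2.0, 2.0), empirical_cdf(0.5, 2.0, 2.0))
print(trivariate_cdf(2.0, 1.0, 2.0), empirical_cdf(2.0, 1.0, 2.0))
```

In each printed pair the two numbers should agree to within sampling error, even though the first is computed from the bivariate density of (V, D) alone.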


Another aspect that is important to consider is the selection of storm events for this frequency analysis. The authors first identified the annual largest storms on the basis of the total storm depth and then used the durations of those storm events to obtain the mean intensity. It may be stated that the depth-wise largest storms may not always yield the higher mean intensities that are of interest in most design applications. This is illustrated in Fig. 1(b) as a perspective plot of depth, duration, and mean intensity at Liberty rainfall station for all of the 91 storms of the hydrological year 1992. It may be seen from this plot that the mean rainfall intensities for the two depth-wise largest storms (of about 6 in. depth and 25 h duration) are much smaller than those of many other storms (e.g., those having about 4 in. depth and less than 5 h duration). Thus the results of the frequency analysis presented in the paper are specifically applicable to the bivariate characteristics of depth and duration or mean intensity and duration of the “depth-wise” annual maximum storms rather than to storms in general. Furthermore, applications such as the simulation of rainfall storms for deriving flood frequency distribution, water availability, and drought analysis would require consideration of all storm events rather than only the largest storms used in the study. All the studies referred to in the “Introduction” section of the paper are related to rainfall frequency analysis considering all significant events rather than only the largest storms. The implications of selecting storms on the basis of such criteria have also been emphasized recently by Kao and Govindaraju (2007), who compared results of bivariate frequency analysis involving selection of the largest storms on the basis of three criteria: (a) annual maximum depth, (b) annual maximum intensity, and (c) annual maximum cumulative probability.
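The effect of the selection criterion can be sketched in a few lines. The example below is illustrative only: the hourly record is synthetic and hypothetical (it is not Liberty station data), the function names are invented, and the six-hour dry-period rule is implemented as described in this discussion. It separates a record into storms, computes depth, duration, and mean intensity I = V/D for each, and shows that the storm with the largest depth need not be the storm with the largest mean intensity.

```python
def split_into_storms(hourly_rain, min_dry_hours=6):
    """Return (first_hour, last_hour) index pairs, one per storm.

    Wet spells separated by fewer than `min_dry_hours` dry hours are
    merged into a single storm.
    """
    storms = []
    start = None
    last_wet = None
    for hour, depth in enumerate(hourly_rain):
        if depth <= 0:
            continue
        if start is None:
            start = hour
        elif hour - last_wet - 1 >= min_dry_hours:
            # The dry gap is long enough: close the previous storm.
            storms.append((start, last_wet))
            start = hour
        last_wet = hour
    if start is not None:
        storms.append((start, last_wet))
    return storms


def storm_summary(hourly_rain, storm):
    """Depth V (in.), duration D (h), and mean intensity I = V/D (in./h)."""
    first, last = storm
    depth = sum(hourly_rain[first:last + 1])
    duration = last - first + 1
    return {"depth": depth, "duration": duration, "intensity": depth / duration}


# Synthetic hourly depths in inches (hypothetical values for illustration).
record = (
    [0.2, 0.2, 0.2] + [0.0] * 4 + [0.3, 0.3]   # two spells, 4 h apart: one storm
    + [0.0] * 8 + [2.0, 1.5]                   # short, intense storm
    + [0.0] * 10 + [0.25] * 24                 # long, deep, low-intensity storm
)

summaries = [storm_summary(record, s) for s in split_into_storms(record)]
largest_by_depth = max(summaries, key=lambda s: s["depth"])
largest_by_intensity = max(summaries, key=lambda s: s["intensity"])
print(largest_by_depth)
print(largest_by_intensity)
```

With this record the depth-based and intensity-based selections pick different storms, which is the situation described for the 1992 hydrological year.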
Some of the mean intensities as shown in the related article (Zhang and Singh 2006), such as about 4.2, 4.9, and 8.8 in./h for the Liberty station and about 5.0, 6.8, and 7.0 in./h for the Clinton station, appear to be unusually high and are not supported by the data, as available from the National Climatic Data Center (NCDC). Pairwise plots of depth, duration, and mean intensity for annual largest storms for the Liberty station for the hydrological years from 1980 to 2005 are shown in Fig. 2. Data for Liberty station is available from NCDC for 1980 to 2006, not from 1960, as mentioned in the paper. The source of data for Denham Springs could not be ascertained as none of the national or state data repositories carry the same. It may be helpful if the sources of data are provided so that the same may be obtained for possible comparative studies.

References

Eagleson, P. S. (1972). “Dynamics of flood frequency.” Water Resour. Res., 8(4), 878–898.
Kao, S.-C., and Govindaraju, R. S. (2007). “A bivariate frequency analysis of extreme rainfall with implications for design.” J. Geophys. Res., 112, D13119.
Linsley, R. K., and Franzini, J. B. (1979). Water-resources engineering, McGraw-Hill, New York.
Nelsen, R. B. (2006). An introduction to copulas, Springer, New York.
Zhang, L., and Singh, V. P. (2006). “Bivariate rainfall frequency distributions using Archimedean copulas.” J. Hydrol., 332(1–2), 93–109.

Closure to “Gumbel-Hougaard Copula for Trivariate Rainfall Frequency Analysis” by L. Zhang and Vijay P. Singh

July/August 2007, Vol. 12, No. 4, pp. 409–419.

DOI: 10.1061/(ASCE)1084-0699(2007)12:4(409)

L. Zhang¹ and V. P. Singh²

¹Independent Consultant, 3403 Maple Avenue, Brookfield, IL, 60513. E-mail: [email protected]
²Caroline & William N. Lehrer Distinguished Chair in Water Engineering, Professor of Civil & Environmental Engineering, and Professor of Biological & Agricultural Engineering, Department of Biological and Agricultural Engineering, Texas A&M Univ., College Station, TX 77843-2117. E-mail: [email protected]

The authors appreciate the discussion by Chowdhary and concur with his discussion on the need for trivariate analyses of such rainfall characteristics as rainfall depth (or intensity), duration, and inter-arrival times. The objective of the paper was to present a trivariate analysis of rainfall using copulas. The three variables could be any. In the paper they were depth, intensity, and duration. Values of these rainfall variables analyzed in the paper were obtained from Southern Regional Climate Center and National Weather Service records.

The discusser makes two points. First, if the average intensity is obtained by dividing the depth by the associated duration, then depth and intensity are uniquely related and the three variables (depth, intensity, and duration) reduce to two independent variables. Therefore, in this case trivariate analysis reduces to bivariate analysis. The authors agree with this point. However, if the depth is obtained by employing the weighted average intensity, then this depth will be different from the depth obtained by simply using the average intensity. In many cases, storage rainfall gauges are used and all that is measured is depth. From this depth a rainfall intensity distribution is obtained using a standard rainfall distribution developed for that area. Thus, weighted intensity and average intensity will not be identical.

Second, the discusser makes a good point that higher depth storms do not always yield higher mean intensities, unless duration is appropriately taken into consideration. This also means that extreme depth values do not uniquely correspond to extreme intensity values unless the duration is kept the same. Thus, a trivariate analysis of extreme values of depth, duration, and intensity may be needed.


Discussion of “Everything You Always Wanted to Know about Copula Modeling but Were Afraid to Ask” by C. Genest and A.-C. Favre

July/August 2007, Vol. 12, No. 4, pp. 347–368.

DOI: 10.1061/(ASCE)1084-0699(2007)12:4(347)

F. Ashkar, Ph.D.¹

¹Dept. of Mathematics and Statistics, Université de Moncton, Moncton, NB, Canada, E1A 3E9. E-mail: [email protected]

This paper by Professors Genest and Favre (2007) was published in a recent special issue of the Journal of Hydrologic Engineering dedicated to copula modeling in hydrology. The discusser thinks that it would have been more beneficial to the hydrologic community had this special issue been directed more broadly to the subject of multivariate frequency and risk modeling in hydrology so that presentations would not have been restricted only to copula modeling. The discusser has some remarks on some of the issues and methodologies that Genest and Favre propose in their paper.

In the Introduction section of their paper, the authors refer to “classical families of bivariate distributions” (e.g., bivariate normal, lognormal, gamma, . . .) and state that “the main limitation of this approach is that the individual behavior of the two variables (or transformations thereof) must then be characterized by the same parametric family of univariate distributions.” It needs to be pointed out that, from the practical standpoint, this so-called limitation is only superficial and not serious. For if we consider the bivariate gamma distribution for (X, Y) as an example, it is easy to transform each of its marginal distributions into any other distribution that we desire (extreme value, beta, lognormal, Pareto, . . .) by simply transforming from the marginal gamma to the standard uniform [U(0, 1)], and then transforming back from U(0, 1) to the other distribution that we are interested in. By performing this two-step marginal transformation for each of X and Y, we will have a transformed variable (X*, Y*) whose marginal distributions do not have to belong to the same parametric family (the marginal distribution for X* could be extreme value, for instance, while that for Y* could be lognormal). The bivariate pdf or cdf of (X*, Y*) is generally not difficult to obtain by standard probabilistic calculations, as outlined for instance by Ashkar (2007).
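The two-step marginal transformation described above can be sketched as follows. This is an illustration under explicit assumptions, not the construction used by Ashkar (2007): the bivariate gamma pair is built by a hypothetical shared-component scheme (X = E0 + E1, Y = E0 + E2 with independent unit exponentials), which gives Gamma(shape 2, scale 1) margins with a closed-form cdf, and the target margins (Gumbel for X*, lognormal for Y*) and all function names are chosen only for the example.

```python
import math
import random
from statistics import NormalDist


def sample_bivariate_gamma(n, seed=0):
    """Draw n dependent pairs with Gamma(shape=2, scale=1) margins (assumed model)."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        e0, e1, e2 = (rng.expovariate(1.0) for _ in range(3))
        # The shared component e0 makes X and Y positively dependent.
        pairs.append((e0 + e1, e0 + e2))
    return pairs


def gamma2_cdf(x):
    """cdf of Gamma(shape=2, scale=1): 1 - exp(-x) * (1 + x)."""
    return 1.0 - math.exp(-x) * (1.0 + x)


def _clip(u, eps=1e-12):
    """Keep u strictly inside (0, 1) so the quantile functions stay finite."""
    return min(max(u, eps), 1.0 - eps)


def to_gumbel(x, mu=0.0, beta=1.0):
    """Step 1: gamma margin to U(0, 1). Step 2: U(0, 1) to a Gumbel quantile."""
    u = _clip(gamma2_cdf(x))
    return mu - beta * math.log(-math.log(u))


def to_lognormal(y, mu=0.0, sigma=1.0):
    """Step 1: gamma margin to U(0, 1). Step 2: U(0, 1) to a lognormal quantile."""
    u = _clip(gamma2_cdf(y))
    return math.exp(NormalDist(mu, sigma).inv_cdf(u))


pairs = sample_bivariate_gamma(500)
transformed = [(to_gumbel(x), to_lognormal(y)) for x, y in pairs]
print(pairs[0], transformed[0])
```

Because both steps are increasing functions, the ordering of the observations is unchanged; only the marginal distributions of X* and Y* differ from those of X and Y.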
This transformation approach, from “classical families of bivariate distributions” to bivariate distributions with fewer restrictions on the marginals, is an interesting one that needs to be researched more in hydrology. This transformation from (X, Y) to (X*, Y*) is in fact essentially the same kind of transformation that is used in the “copula-based” approach where (X, Y) replaces (U, V), where U and V are U(0, 1) distributed, and where their bivariate cdf is C_θ(u, v) [same notation is used here as in Genest and Favre (2007)].

Genest and Favre (2007) present valuable information and discussion about dependence modeling by the copula approach. They discuss, among other things, how to detect dependence, how to uncover and measure dependence, how to test for independence, and how to obtain point and interval estimates for dependence parameters from copula models. However, they provide almost no information on the broader problem of multivariate frequency modeling for the purpose of hydrologic risk analysis, which most often is the central problem to hydrologists. It seems that the reason that the authors have felt satisfied with limiting their attention to dependence modeling using the copula approach can be understood from two of their statements:

1. “The main advantage provided to the hydrologist by this approach is that the selection of an appropriate model for the dependence between X and Y, represented by the copula, can then proceed independently from the choice of the marginal distributions.”
2. “to us, association is also a margin-free notion, . . .”

It should be remembered that modeling the dependence between X and Y is seldom the ultimate goal in hydrology, but is only a part of a broader hydrologic risk model that involves the variable (X, Y).

Genest and Favre (2007) base their entire methodology for modeling the dependence between X and Y on the not-so-perfectly substantiated suggestion that “statistical inference concerning dependence structures should always be based on ranks” (a statement found in the Introduction section of their paper). The discusser has qualified this suggestion as “not-so-perfectly substantiated” because modeling the dependence between X and Y is usually not the ultimate goal in hydrologic modeling. Therefore when searching for an appropriate bivariate model for (X, Y), it is far from clear why the dependence between X and Y should necessarily be based on ranks. Do the authors suggest that use of the parametric two-step “inference from margins” (IFM) procedure, which does not rely on ranks, should be discouraged in hydrology? Note that the IFM procedure has already been favored by a number of statisticians over the rank-based approach that is advocated by the authors (as they in fact admit in the “Other Estimation Methods” section of their paper). The IFM procedure has also already been shown to be useful in a number of hydrological publications (e.g., Favre et al. 2004; Dupuis 2007; Ashkar 2007).
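The practical difference between the two routes lies in how the data are mapped to the unit interval before a copula is fitted. The sketch below is a simplified illustration, not the procedure of any of the cited papers: the rank-based route uses rescaled ranks R_i/(n + 1), a common convention, while the first step of IFM is represented by fitting an assumed exponential margin by maximum likelihood and evaluating its cdf at the data. The exponential margin, the lognormal test data, and the function names are all assumptions made for the example.

```python
import math
import random


def pseudo_observations(sample):
    """Rank-based route: u_i = rank_i / (n + 1); no marginal model is needed."""
    n = len(sample)
    order = sorted(range(n), key=sample.__getitem__)
    u = [0.0] * n
    for rank, index in enumerate(order, start=1):
        u[index] = rank / (n + 1)
    return u


def ifm_first_step(sample):
    """IFM step 1 under an assumed exponential margin.

    The rate is estimated by maximum likelihood (1 / sample mean) and the
    fitted cdf is evaluated at each observation.
    """
    rate = len(sample) / sum(sample)
    return [1.0 - math.exp(-rate * x) for x in sample]


# Data drawn from a lognormal law, so the exponential margin is misspecified.
rng = random.Random(42)
data = [rng.lognormvariate(0.0, 1.0) for _ in range(200)]

u_rank = pseudo_observations(data)
u_ifm = ifm_first_step(data)
largest_gap = max(abs(a - b) for a, b in zip(u_rank, u_ifm))
print(largest_gap)
```

In both routes the resulting values would then be passed to the copula likelihood; the printed gap gives a rough sense of how far a misspecified margin can move those inputs, which is one of the points on which comparisons between the two approaches turn.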
More comparisons need to be done in hydrology between procedures of the IFM type (for example) for measuring the dependence between X and Y, and rank-based procedures of the types proposed by Genest and Favre. It would be helpful if the authors would make their hydrological data set available to the scientific community to assist in making these types of comparisons. The discusser also thinks that hydrologists interested in copula modeling should read the paper by Mikosch (2006a), which presents a point of view on how copulas fall within the broader area of stochastic dependence and risk. Although the paper makes a somewhat harsh attack on copulas, it raises many legitimate questions regarding copula modeling and stochastic dependence that deserve serious reflection. The discussions on that paper by Genest and Rémillard (2006) and by others, along with the reply by Mikosch (2006b), are also informative and thought stimulating. It is clear that copula modeling is simple and convenient for use in hydrological applications, and the discusser agrees with the statement by Genest and Rémillard (2006) that: “Simplicity, interpretability, convenience (. . .) are also important considerations in selecting a model, whether it be copula-based or not.” However, as interest in multivariate stochastic modeling is beginning to intensify within the hydrologic community, hydrologists need to make use of the research that has been done on copula modeling but at the same time be careful not to see stochastic dependence only through the “dark glasses of copulas” [a quote borrowed from Mikosch (2006b)].


References

Ashkar, F. (2007). “Bivariate generalized-Pareto-based models for hydrological frequency analysis.” Proc., 18th Canadian Hydrotechnical Conf. and Symp. of the Canadian Society for Civil Engineering (CD-ROM), Winnipeg, Manitoba, Canada.
Dupuis, D. J. (2007). “Using copulas in hydrology: Benefits, cautions, and issues.” J. Hydrol. Eng., 12(4), 381–393.
Favre, A.-C., El Adlouni, I. S., Perreault, L., Thiémonge, N., Bobée, B. (2004). “Multivariate hydrological frequency analysis using copulas.” Water Resour. Res., 40(W01101), 1–12.
Genest, C., and Favre, A. C. (2007). “Everything you always wanted to know about Copula modeling but were afraid to ask.” J. Hydrol. Eng., 12(4), 347–368.
Genest, C., and Rémillard, B. (2006). “Discussion of ‘Copulas: Tales and facts’ by T. Mikosch.” Extremes, 9(1), 27–36.
Mikosch, T. (2006a). “Copulas: Tales and facts.” Extremes, 9(1), 3–20.
Mikosch, T. (2006b). “Copulas: Tales and facts - Rejoinder.” Extremes, 9(1), 55–62.

Closure to “Everything You Always Wanted to Know about Copula Modeling but Were Afraid to Ask” by C. Genest and A.-C. Favre

July/August 2007, Vol. 12, No. 4, pp. 347–368.

DOI: 10.1061/(ASCE)1084-0699(2007)12:4(347)

Christian Genest¹ and Anne-Catherine Favre²

¹Professor, Dépt. de mathématiques et de statistique, Univ. Laval, Québec QC, Canada G1K 7P4.
²Professor, Chaire en Hydrologie Statistique, INRS, Eau, Terre et Environnement, Québec QC, Canada G1K 9A9. E-mail: [email protected]

Dr. Ashkar’s comments on our paper are welcome. They call for only a brief response. In his discussion, Dr. Ashkar mentions that classical families of bivariate distributions can be made more flexible through suitable transformations of their margins. When he writes that “this transformation approach . . . is an interesting one that needs to be researched more in hydrology,” he is expressing support for copula modeling, albeit indirectly. In his example, the pairs (X, Y) and (X*, Y*) would have different margins (F, G) and (F*, G*) but they would have the same copula by construction. Similarly, the transformation model described in Ashkar (2007) is an example of copula modeling. Because transformations of margins affect neither a distribution’s underlying copula nor the relative position of points in the scatter plot, rank-based methods are well suited for selecting the dependence structure, C. Once this is done, a copula model for the pair (X, Y) is given by

$$
H(x,y) = C\{F(x), G(y)\} \tag{1}
$$

Alternatively, one could choose to model the transformed pair (X*, Y*) by replacing F and G by F* and G* in (1). Both are equally simple from the copula modeling perspective, but it is often more natural in applications to work with a model based on the original variables. As Dr. Ashkar states, rank-based inference for copulas is not an invitation to ignore the variables’ individual behavior completely. This point hardly needs emphasis, as C, F, and G are essential ingredients in (1), and in any subsequent hydrologic risk analysis.

When parametric families of margins and copulas are deemed appropriate in a given context, maximum likelihood and IFM estimation work well. However, rank-based methods may be preferable if possible misspecification of the margins is a concern or when the latter are not of primary interest, e.g., when comparing dependence structures from different populations. Dr. Ashkar’s suggestion that these inference approaches be compared is a good one. Kim et al. (2007), who recently considered this issue, conclude that rank-based estimation is superior unless the marginals are known with certainty, which is seldom the case in practice.

We join Dr. Ashkar in encouraging hydrologists to learn more about copulas and to remain critical in their use. Minds, like parachutes, work best when they are open.
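Two of the statements above lend themselves to a short numerical illustration: that increasing transformations of the margins leave rank-based dependence measures unchanged, and that a joint model is assembled as H(x, y) = C{F(x), G(y)}. The sketch below is only an example under stated assumptions. Kendall's tau is computed by direct pair counting, the Gumbel-Hougaard family stands in for C, exponential margins stand in for F and G, and the dependent sample is generated by a hypothetical shared-component scheme; none of these choices comes from the paper under discussion.

```python
import math
import random


def kendall_tau(xs, ys):
    """Kendall's tau by counting concordant minus discordant pairs (no ties assumed)."""
    n = len(xs)
    balance = 0
    for a in range(n):
        for b in range(a + 1, n):
            product = (xs[a] - xs[b]) * (ys[a] - ys[b])
            balance += (product > 0) - (product < 0)
    return balance / (n * (n - 1) / 2)


def gumbel_hougaard(u, v, theta):
    """Gumbel-Hougaard copula C(u, v) for theta >= 1 and u, v in (0, 1)."""
    total = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-total ** (1.0 / theta))


def joint_cdf(x, y, theta, margin_x, margin_y):
    """H(x, y) = C{F(x), G(y)} with the copula and margins supplied separately."""
    return gumbel_hougaard(margin_x(x), margin_y(y), theta)


def exponential_cdf(x):
    """Assumed unit-exponential margin, used for both F and G here."""
    return 1.0 - math.exp(-x)


# Dependent positive sample from a hypothetical shared-component construction.
rng = random.Random(7)
xs, ys = [], []
for _ in range(300):
    shared = rng.expovariate(1.0)
    xs.append(shared + rng.expovariate(1.0))
    ys.append(shared + rng.expovariate(1.0))

# Increasing transformations of each margin: (X*, Y*) = (exp(X), Y**3).
xs_star = [math.exp(x) for x in xs]
ys_star = [y ** 3 for y in ys]

print(kendall_tau(xs, ys), kendall_tau(xs_star, ys_star))
print(joint_cdf(1.0, 2.0, 2.0, exponential_cdf, exponential_cdf))
```

The two printed values of tau coincide because only the ordering of the points enters the calculation, whereas the value of H depends on the copula and on both margins.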

References

Kim, G., Silvapulle, M. J., and Silvapulle, P. (2007). “Comparison of semiparametric and parametric methods for estimating copulas.” Comput. Stat. Data Anal., 51(6), 2836–2850.
