GENERALIZED METHOD OF MOMENTS APPLIED TO LP3 DISTRIBUTION

Downloaded from ascelibrary.org by UNIVERSITE DE MONCTON on 01/06/15. Copyright ASCE. For personal use only; all rights reserved.

By Bernard Bobee¹ and Fahim Ashkar²

¹Prof., Institut National de la Recherche Scientifique (INRS-Eau), P.O. Box 7500, Ste-Foy, Quebec, Canada, G1V 4C7.
²Res. Assoc., Institut National de la Recherche Scientifique (INRS-Eau).
Note. Discussion open until January 1, 1989. To extend the closing date one month, a written request must be filed with the ASCE Manager of Journals. The manuscript for this paper was submitted for review and possible publication on November 7, 1986. This paper is part of the Journal of Hydraulic Engineering, Vol. 114, No. 8, August, 1988. ©ASCE, ISSN 0733-9420/88/0008-0899/$1.00 + $.15 per page. Paper No. 22681.

J. Hydraul. Eng. 1988.114:899-909.

ABSTRACT: The log-Pearson type 3 (LP3) distribution is recommended in the United States by the Water Resources Council (WRC) as the parent distribution for maximum annual flood series. The parameter estimation technique proposed for this distribution consists of applying a logarithmic transformation to the observed flood sample and then applying the method of moments to these logarithmic values using moments of order 1, 2, and 3 (mean, variance, and coefficient of skew) in log space. Other methods have been proposed which use only moments in real space, such as the mean, variance, geometric mean, harmonic mean, etc. of the observed sample, or a combination of moments in real and log space (method of "mixed moments"). There are many other versions of the method of moments which have yet to be explored and evaluated. To help motivate this kind of exploration, we derive a general formula for the variance of the T-year event $X_T$ obtained by a method called the Generalized Method of Moments (GMM), which combines any three moments of the LP3 distribution. We also present a special case of the GMM, which we call the Sundry Averages Method (SAM), which uses the harmonic, geometric, and arithmetic means of the distribution.

INTRODUCTION

Following the recommendation of the hydrology committee of the United States Water Resources Council (WRC 1967; Benson 1968), the log-Pearson type 3 (LP3) distribution has been widely used in many countries, especially in North America and Australia, for representing flood flows. In recent years, several methods of fitting this distribution have been proposed. One method that has often been preferred by hydrologists because of its relative ease of computation is the method of moments (MM). One apparent disadvantage of the MM is that there is no unique way of applying it to a given distribution, because the order of the moments to be used in the estimation is not unique. This flexibility of the method, however, can be considered, in a sense, to be an advantage rather than a disadvantage. In the case of the LP3 distribution, the fitting procedure proposed by the WRC was to apply a logarithmic transformation $y_i = \log x_i$ to the observed annual flood series $(x_1, \ldots, x_N)$ and then to equate the mean $\bar{y}$, variance $s_y^2$, and skew coefficient $(C_s)_y$ of these logarithmic values to the corresponding population moments of the Pearson type 3 (P3) distribution. This approach has sometimes been criticized in that it gives the same weight to the logarithms of the observed flood values and not to the observed values themselves, thereby reducing the relative importance of the larger elements of the sample. For this reason, Bobee (1975) recommended applying the method of moments directly to the sample of observed values.

One common property between Bobee's method and that of the WRC is that both use sample skew in their application (one in real space, the other in log space). It has been remarked by some investigators that sample skew coefficients, especially those estimated from small samples, have large variability and produce pronounced error in the estimates. It has also been observed that skew estimates are constrained: Kirby (1974) showed that a functional inequality exists, dependent only on sample size, which limits the values of sample skew that can be observed, regardless of the skewness of the parent population. Some authors have therefore cautioned that this boundedness and instability of sample skew may have important negative consequences on the accuracy of the estimated parameters and of $X_T$, and as a result on the engineering decisions that are based on these estimates.

To avoid the direct use of sample skew, Rao (1980) introduced the method of "mixed moments", which uses the mean and variance of the logarithmic data along with the mean of the real data, or the mean and variance of the real data along with the mean of the logarithmic data, to fit the LP3 distribution. Phien and Hira (1983) carried Rao's idea further by using other combinations of information from real and log space. From a statistical point of view, these departures from the traditional MM are perfectly justified, because there is no theoretical reason on the basis of statistical efficiency why one should stick to moments of order 1, 2, and 3 (mean, variance, and coefficient of skewness) in applying the MM to a 3-parameter distribution.
In the case of the LP3 distribution, one might find it more reasonable to think that the "best" MM ("best" in terms of minimizing mean square error of estimates of extreme flood events, for instance) is not fixed, but is rather dependent on the skewness of the population. Avoiding the use of sample skew can therefore be considered wise or unwise depending on the form of the distribution that is being considered. What is important for achieving stability in quantile estimates is not only the stability of the sample moments that are being employed, but also the degree of correlation that exists between these sample moments. This point does not seem to have been well emphasized in the hydrologic literature in the past.

What we have mentioned so far are only a few of the many ways of applying the MM to the LP3 distribution; other versions of the MM have yet to be explored and evaluated. To help motivate this kind of exploration we shall derive in the present study a general formula for the variance of the T-year event $X_T$ obtained by combining any three moments of the LP3 distribution. This derivation is based on a first-order asymptotic approximation. For reasons of mathematical convenience, the calculations will be done only for the case where moments in real space are used. The formula that will be obtained applies, therefore, to the method of Bobee (1975) and to the methods MM1 (Rao 1980) and SAM (introduced later in this study), but not to the method proposed by the WRC (which uses moments in log space). The corresponding formula for the method of the WRC has already been given in Bobee (1973). It is to be noted that by carefully choosing the order of the three moments to be used in the estimation, one can achieve a significant reduction in the variance of the T-year event, $X_T$, as compared to estimation by the traditional method of moments.

SOME BRIEF PROPERTIES OF LP3 DISTRIBUTION

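The LP3 distribution is derived from the Pearson type 3 (P3) distribution whose probability density function (p.d.f.) is given by

$$f(y) = \frac{|\alpha|}{\Gamma(\lambda)}\, e^{-\alpha(y-m)} \left[\alpha(y-m)\right]^{\lambda-1} \tag{1}$$

where $\Gamma(\cdot)$ is the gamma function. If $Y = \log_a X$ follows a P3 distribution then $X$ follows an LP3 distribution with p.d.f.

$$g(x) = \frac{|\alpha|\,k}{x\,\Gamma(\lambda)}\, e^{-\alpha(\log_a x - m)} \left[\alpha(\log_a x - m)\right]^{\lambda-1} \tag{2}$$

with $k = \log_a e$ ($e \approx 2.71828$).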
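As an illustration of Eq. 2, the density can be evaluated numerically. The following is a minimal sketch, not the authors' code: the function names `lp3_pdf` and `integrate_pdf` are hypothetical, and the trapezoidal check assumes $\lambda > 1$ so that the density stays bounded at the edge of its support.

```python
import math

def lp3_pdf(x, alpha, lam, m, a=math.e):
    """LP3 density g(x) of Eq. 2 for logarithm base a (hypothetical helper)."""
    k = 1.0 / math.log(a)              # k = log_a(e)
    if x <= 0.0:
        return 0.0
    z = alpha * (k * math.log(x) - m)  # alpha * (log_a x - m)
    if z <= 0.0:
        return 0.0                     # x lies outside the support
    # Evaluate in logs so that Gamma(lam) cannot overflow for large lam.
    log_g = (math.log(abs(alpha) * k / x) - math.lgamma(lam)
             - z + (lam - 1.0) * math.log(z))
    return math.exp(log_g)

def integrate_pdf(alpha, lam, m, a=math.e, steps=20000, span=40.0):
    """Trapezoidal integral of g(x) dx on a grid that is uniform in y = log_a x."""
    y0, y1 = sorted((m, m + span / alpha))
    h = (y1 - y0) / steps
    total = 0.0
    for i in range(steps + 1):
        x = a ** (y0 + i * h)
        w = 0.5 if i in (0, steps) else 1.0
        # dx = x * ln(a) * dy converts the y-grid step into an x-space step.
        total += w * lp3_pdf(x, alpha, lam, m, a) * x * math.log(a) * h
    return total
```

Calling `integrate_pdf(2.0, 3.0, 0.0)` should return a value close to 1, which is a quick sanity check on the Jacobian factor $k/x$. A negative `alpha` gives a distribution bounded above at $x = a^m$ instead of below.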

In the present study we shall use a general logarithmic base, $a$, in moving from the P3 to the LP3 distribution, although it is the common logarithm (base 10) and natural logarithm (base $e$) which are most frequently used. The $r$th noncentral moment of the LP3 distribution can be expressed (Bobee 1975) under the single general form:

$$\mu'_r(x) = \frac{e^{mr/k}}{\left(1 - \dfrac{r}{\beta}\right)^{\lambda}} \tag{3}$$

where $\beta = \alpha k$. This $r$th moment exists for all $r$, positive or negative, as long as the following condition is satisfied:

$$1 - \frac{r}{\beta} > 0, \quad \text{i.e.,} \quad r < \beta \ \text{for} \ \beta > 0 \quad \text{and} \quad r > \beta \ \text{for} \ \beta < 0 \tag{4}$$
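The moment formula and its existence condition translate directly into a few lines of code. This is a minimal sketch under the assumption that $\beta = \alpha k$ with $k = \log_a e$; the function name `lp3_moment` is hypothetical.

```python
import math

def lp3_moment(r, alpha, lam, m, a=math.e):
    """Noncentral moment of order r of the LP3 distribution (Eq. 3).

    Raises ValueError when the existence condition 1 - r/beta > 0 fails.
    """
    k = 1.0 / math.log(a)   # k = log_a(e)
    beta = alpha * k
    if 1.0 - r / beta <= 0.0:
        raise ValueError("moment of order r does not exist for this beta")
    return math.exp(m * r / k) / (1.0 - r / beta) ** lam
```

For example, with natural logarithms, $\alpha = 2$, $\lambda = 3$, and $m = 0$, the mean is $1/(1 - 1/2)^3 = 8$, while the moment of order 2 does not exist because $r = \beta$.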

[...]

$$\lim_{r \to 0} A(r) = g \tag{13}$$

where $g$ is the geometric mean of the sample, which can therefore be regarded as the moment of order "quasi zero". Note that $A(1) > A(0) > A(-1)$, which represents the classical inequality between the three means (arithmetic, geometric, and harmonic).

By using equations similar to Eqs. 12 and 13 for the population rather than for the sample, one can show that the geometric mean ($G$) of the population is the moment of order quasi zero, and therefore in the limit as $r \to 0$, the equation

$$m'_r(x) = \mu'_r(x) \tag{14}$$

becomes equivalent to

$$g_x = G_x \tag{15}$$

Taking logs on both sides of Eq. 15 for the LP3 distribution we obtain

$$\bar{y} = \mu_y = m + \frac{\lambda}{\alpha} \tag{16}$$

where $\bar{y}$ is the arithmetic mean of the sample of logarithmic values ($y_i = \log_a x_i$) and $\mu_y$ is the population mean of the P3 variable $Y = \log_a X$. (We shall use in the remainder of the study the simplified notation $r = \underline{0}$, with a dash under the zero, to mean "$r \to 0$".)
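The limit in Eq. 13 and the inequality between the three means can be checked numerically. The sketch below assumes that $A(r)$ denotes the sample power mean of order $r$, that is, the $r$th sample moment raised to the power $1/r$; this reading is an assumption, and the name `power_mean` is hypothetical.

```python
import math

def power_mean(x, r):
    """Power mean of order r; r = 0 is taken as the limiting geometric mean."""
    n = len(x)
    if r == 0:
        return math.exp(sum(math.log(v) for v in x) / n)
    return (sum(v ** r for v in x) / n) ** (1.0 / r)

sample = [1.0, 2.0, 4.0]
arithmetic = power_mean(sample, 1)    # A(1)
geometric = power_mean(sample, 0)     # A(0), the "quasi zero" order
harmonic = power_mean(sample, -1)     # A(-1)
near_zero = power_mean(sample, 1e-6)  # approaches the geometric mean
```

With this sample, `near_zero` agrees with `geometric` to several decimal places, and the three means are ordered as `arithmetic > geometric > harmonic`.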


We consider now the sundry averages method (SAM) applied to a sample $(x_1, \ldots, x_i, \ldots, x_N)$ of size $N$ drawn from a log-Pearson type 3 distribution. This method is equivalent to GMM($-1$, 0, 1) because it uses $s = -1$, $t = 0$, $u = 1$ (Eq. 6). From Eq. 4, it can be shown that for the method SAM to be applicable, the condition $|\beta| > 1$ must be satisfied (because if $|\beta| > 1$, the moments of order $-1$, 0, and 1 exist). Piegorsch and Casella (1985) give some general conditions for the existence of the first negative moment ($r = -1$) independently of the distribution considered. For the LP3 distribution, these conditions lead to the same conclusions that we have presented in Eq. 4 ($r = -1$). It can also be noted that the method MM1 (Rao 1980) is equivalent to GMM(0, 1, 2) and that this method is applicable if and only if $\beta > 2$ or $\beta < 0$.

When fitting an LP3 distribution with parameters $\alpha$, $\lambda$, and $m$ to a sample $x_1, \ldots, x_N$, the three equations of the method SAM are given by

Arithmetic Mean

$$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i = \mu'_1(x) = \frac{e^{m/k}}{\left(1 - \dfrac{1}{\beta}\right)^{\lambda}} \tag{17}$$

Harmonic Mean

$$\frac{1}{H} = \frac{1}{N}\sum_{i=1}^{N} \frac{1}{x_i} = \mu'_{-1}(x) = \frac{e^{-m/k}}{\left(1 + \dfrac{1}{\beta}\right)^{\lambda}} \tag{18}$$

Geometric Mean

$$\log_a g = \frac{1}{N}\sum_{i=1}^{N} \log_a x_i = \frac{1}{N}\sum_{i=1}^{N} y_i = m + \frac{\lambda}{\alpha} \tag{19}$$

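After some computations, it is possible to show that the system of Eqs. 17, 18, and 19 is equivalent to Eqs. 20, 21, and 22 such that

$$f(\alpha) = \frac{\log_a\!\left(1 - \dfrac{1}{\alpha k}\right) + \dfrac{1}{\alpha}}{\log_a\!\left[1 - \dfrac{1}{\alpha^2 k^2}\right]} - \frac{\log_a \bar{x} - \log_a g}{\log_a \bar{x} - \log_a H} = 0 \tag{20}$$

$$\lambda = \frac{\log_a H - \log_a \bar{x}}{\log_a\!\left[1 - \dfrac{1}{\alpha^2 k^2}\right]} \tag{21}$$

$$m = \log_a g - \frac{\lambda}{\alpha} \tag{22}$$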
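The SAM fit can be sketched in code by eliminating $\lambda$ and $m$ from the three mean equations, which leaves a single equation in $\alpha$. The sketch below is an illustration, not the authors' implementation: it solves that equation by bisection rather than Newton-Raphson, works with the unknown $u = 1/(\alpha k)$ on $(-1, 1)$ so that the condition $|\beta| > 1$ holds by construction, and uses a hypothetical function name `sam_fit`.

```python
import math

def sam_fit(x, a=math.e, tol=1e-13):
    """Sketch of an LP3 fit by the sundry averages method; returns (alpha, lam, m)."""
    n = len(x)
    k = 1.0 / math.log(a)                                  # k = log_a(e)
    log_mean = k * math.log(sum(x) / n)                    # log_a of arithmetic mean
    log_harm = k * math.log(n / sum(1.0 / v for v in x))   # log_a of harmonic mean
    log_geom = k * sum(math.log(v) for v in x) / n         # log_a of geometric mean
    target = (log_mean - log_geom) / (log_mean - log_harm)

    def h(u):
        # [log_a(1 - u) + k*u] / log_a(1 - u^2) - target; the factor k cancels.
        if abs(u) < 1e-3:
            return 0.5 + u / 3.0 + u ** 3 / 30.0 - target  # series near u = 0
        return (math.log1p(-u) + u) / math.log1p(-u * u) - target

    lo, hi = -1.0 + 1e-9, 1.0 - 1e-9
    if h(lo) > 0.0 or h(hi) < 0.0:
        raise ValueError("no root with |beta| > 1 was bracketed")
    while hi - lo > tol:                                   # plain bisection on u
        mid = 0.5 * (lo + hi)
        if h(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    u = 0.5 * (lo + hi)
    alpha = 1.0 / (u * k)
    lam = (log_harm - log_mean) / (k * math.log1p(-u * u))
    m = log_geom - lam / alpha
    return alpha, lam, m
```

For instance, `sam_fit([1.0, 2.0, 3.0, 10.0])` returns parameters whose population arithmetic, harmonic, and geometric means reproduce the sample values. A sample that is exactly symmetric in log space drives $u$ to zero (the lognormal limit), a case this sketch does not handle.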

Eq. 20 can be solved for $\alpha$ by a Newton-Raphson method, considering $f'(\alpha)$ such that:

$$f'(\alpha) = \frac{\dfrac{\log_a C}{k\alpha^3 B} - \dfrac{2}{k\alpha^3 C}\left(\log_a B + \dfrac{1}{\alpha}\right)}{\left(\log_a C\right)^2}$$

with

$$B = 1 - \frac{1}{\alpha k} \quad \text{and} \quad C = 1 - \frac{1}{\alpha^2 k^2}$$

After $\alpha$ is estimated, Eqs. 21 and 22 can be used to calculate $\lambda$ and $m$.

CALCULATION OF VARIANCE OF $X_T$

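If $X$ follows an LP3 distribution and $Y = \log_a X$ follows a P3 distribution then for large sample sizes we have

$$\operatorname{var} X_T = \left(\frac{\partial X_T}{\partial Y_T}\right)^2 \operatorname{var} Y_T \tag{23}$$

which together with Eq. 7 gives

$$\operatorname{var} X_T = X_T^2 (\ln a)^2 \operatorname{var} Y_T = \left(\frac{X_T}{k}\right)^2 \operatorname{var} Y_T \tag{24}$$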
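The conversion from log space to real space in Eq. 24 is a first-order (delta method) step and is easy to sketch. The example below assumes that the T-year event satisfies $X_T = a^{Y_T}$; the function name and the numerical inputs are hypothetical and are not taken from the paper.

```python
import math

def real_space_quantile_variance(y_t, var_y_t, a=10.0):
    """Return (X_T, var X_T) from Y_T and var Y_T using Eq. 24."""
    k = 1.0 / math.log(a)               # k = log_a(e), so ln(a) = 1/k
    x_t = a ** y_t                      # assumed back-transform of the quantile
    var_x_t = (x_t / k) ** 2 * var_y_t  # first-order approximation
    return x_t, var_x_t

# Illustrative values only: Y_T = 3 in base-10 logs with var Y_T = 0.01.
x_t, var_x_t = real_space_quantile_variance(3.0, 0.01)

# A central finite difference of a**y approximates dX_T/dY_T = X_T / k.
step = 1e-6
slope = (10.0 ** (3.0 + step) - 10.0 ** (3.0 - step)) / (2.0 * step)
```

Because the approximation is first order, it is most trustworthy when the standard deviation of $Y_T$ is small relative to $k$, so that $a^{Y_T}$ is nearly linear over the range of sampling error.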

Thus var $Y_T$ will be calculated first, and Eq. 24 will then be used to deduce var $X_T$. In subsequent calculations we shall drop the tilde, for simplicity, and shall use $\mu'_r$ for $m'_r$. Now $Y_T$ is a function of the parameters of the P3 distribution:

$$Y_T = Z(\alpha, \lambda, m) = m + \frac{\lambda}{\alpha} + K\,\frac{\lambda^{1/2}}{\alpha} \tag{25}$$

therefore, for large sample sizes we have:

v - (YT) - j ; ( | - ) 2 var &J + ^ |

(26)

( | ) ( *-) cov