Minimax estimation of multivariate normal mean under balanced loss function

Younshik Chung¹ and Chansoo Kim²

¹ Department of Statistics, Pusan National University, Pusan, 609-735, Korea
² Department of Statistics, University of Connecticut, Storrs, CT 06269, U.S.A.

This study was supported (in part) by Pusan National University, Korea, 1996-1998.

Abstract. This paper considers simultaneous estimation of the multivariate normal mean vector under Zellner's (1994) balanced loss function when $\sigma^2$ is known and when it is unknown. We show that the usual estimator $\bar X$ is minimax and obtain a class of minimax estimators that have uniformly smaller risk than $\bar X$. We also obtain the proper Bayes estimator relative to the balanced loss function and find the conditions on the hyperparameters under which it is minimax.

Key Words: Balanced loss function, Hierarchical Bayes estimator, James-Stein type estimator, Minimax estimator, Simultaneous estimation.

1. INTRODUCTION

Let us consider the problem of estimating the mean vector of a multivariate normal distribution when the common variance $\sigma^2$ is known and when it is unknown.


Suppose a normal model is specified by the following independent distributions:
$$X_{ij}\mid\theta_i \sim N(\theta_i,\sigma^2),\qquad i=1,\dots,p,\; j=1,\dots,n. \qquad (1.1)$$
We consider simultaneous estimation of $\theta=(\theta_1,\dots,\theta_p)^t$ under the balanced loss function (BLF), introduced by Zellner (1994), given by

$$L_B(\hat\theta,\theta)=\sum_{i=1}^{p}\left(\frac{w}{n}\sum_{j=1}^{n}(X_{ij}-\hat\theta_i)^2+(1-w)(\hat\theta_i-\theta_i)^2\right), \qquad (1.2)$$

where $0<w<1$, $i=1,\dots,p$, $j=1,\dots,n$ and $\hat\theta$ is an estimate of $\theta$. The BLF is formulated to reflect two criteria, namely goodness of fit and precision of estimation. For the model (1.1) with $n=1$, James and Stein (1961), Baranchik (1970), Strawderman (1971) and Stein (1981) discussed the problem of estimating the multivariate normal mean vector under the squared error loss. Also, Sun (1995) investigated risk ratios and minimaxity under the squared error loss. Under the BLF (1.2), Rodrigue and Zellner (1995) considered estimation of the exponential mean time to failure, and Chung, Kim and Song (1998) investigated an admissible linear estimator of a Poisson mean. Zellner (1994) derived the optimal estimators of a scalar mean, a vector mean and a vector of regression coefficients with $p=1$, and Chung and Kim (1997) showed that $\bar X$ is admissible for $p\le 2$ under the BLF. Also, in order to show that $\bar X$ is inadmissible for $p\ge 3$, Chung and Kim (1997) obtained the James-Stein type estimator given componentwise by
$$\delta_i^{JS}(X)=\left(1-\frac{(1-w)(p-2)}{n\sum_{i=1}^{p}\bar X_i^2}\right)\bar X_i, \qquad (1.3)$$
where $\bar X_i=\frac{1}{n}\sum_{j=1}^{n}X_{ij}$, $i=1,\dots,p$.

In this paper, our objective is to find the minimax conditions under the BLF (1.2) when $\sigma^2$ is known and when it is unknown. The paper is organized as follows. In Section 2, we show that the usual estimator $\bar X=(\bar X_1,\dots,\bar X_p)^t$ is minimax under the BLF and then find a family of minimax estimators which have uniformly smaller risk than the usual estimator $\bar X$ when $\sigma^2$ is known and unknown. Thus the JS-type estimator in (1.3) is minimax under the BLF.

In Section 3, for a given prior distribution, a proper Bayes estimator is derived by a hierarchical approach under the BLF (1.2). It is shown that this Bayes estimator is minimax under some mild conditions. In Section 4, using Monte Carlo simulation, we compare the performance of the JS-type estimator with that of the usual estimator $\bar X$.
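To make the objects above concrete, here is a small illustrative sketch (ours, not from the paper): it evaluates the balanced loss (1.2) for a given estimate and computes the JS-type estimate (1.3), assuming $\sigma^2=1$; all function names and the toy data are hypothetical.

```python
import numpy as np

def balanced_loss(X, theta_hat, theta, w):
    """Balanced loss (1.2): w-weighted goodness of fit plus (1-w)-weighted precision."""
    n = X.shape[1]
    fit = (w / n) * np.sum((X - theta_hat[:, None]) ** 2)      # (w/n) * sum_ij (X_ij - hat(theta)_i)^2
    precision = (1 - w) * np.sum((theta_hat - theta) ** 2)     # (1-w) * sum_i (hat(theta)_i - theta_i)^2
    return fit + precision

def js_known_var(X, w):
    """JS-type estimator (1.3); sigma^2 = 1 assumed known."""
    p, n = X.shape
    Xbar = X.mean(axis=1)                                      # coordinatewise sample means
    shrink = 1.0 - (1 - w) * (p - 2) / (n * np.sum(Xbar ** 2))
    return shrink * Xbar

# toy usage on simulated data from model (1.1)
rng = np.random.default_rng(0)
p, n, w = 6, 10, 0.3
theta = rng.uniform(0.0, 1.0, size=p)
X = theta[:, None] + rng.standard_normal((p, n))
print(balanced_loss(X, js_known_var(X, w), theta, w))
print(balanced_loss(X, X.mean(axis=1), theta, w))              # loss of the usual estimator
```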

2. A CLASS OF MINIMAX ESTIMATORS

In this section, it is shown that the usual estimator $\delta^0(X)=(\bar X_1,\dots,\bar X_p)$ is minimax under the BLF (1.2). Also, we find a minimax condition for a class of estimators which have smaller risk than the usual estimator $\delta^0(X)=\bar X$ when $\sigma^2$ is known and unknown. For notational convenience, let $\tilde X=\frac{1}{p}\sum_{i=1}^{p}\bar X_i$, $\|\bar X\|^2=\bar X^t\bar X=\sum_{i=1}^{p}\bar X_i^2$ and $S^2=\sum_{i=1}^{p}\sum_{j=1}^{n}(X_{ij}-\bar X_i)^2$.

Case 1: $\sigma^2$ is known

Since $\sigma^2$ is known, without loss of generality assume that $\sigma^2=1$.

Theorem 2.1. For the model (1.1), the usual estimator $\delta^0(X)=(\bar X_1,\dots,\bar X_p)$ is minimax under the BLF (1.2).

Proof. Consider the proper prior sequence $\pi_m(\theta)$ under which $\theta$ has a multivariate normal distribution with mean vector $0$ and covariance matrix $mI$. Then the posterior distribution of $\theta$ is multivariate normal with mean vector $\frac{n}{n+1/m}\bar X$ and covariance matrix $\frac{1}{n+1/m}I$. Thus the Bayes estimator $\delta^{B_m}(X)$ is
$$\delta^{B_m}(X)=w\bar X+(1-w)E(\theta\mid X)=\frac{w+nm}{1+nm}\,\bar X. \qquad (2.1)$$
Moreover, the risk function of $\delta^0(X)$ and the Bayes risk of $\delta^{B_m}(X)$ are, respectively,
$$R(\theta,\bar X)=\frac{1}{n}\bigl(wp(n-1)+p(1-w)\bigr),$$
and
$$r(\pi_m,\delta^{B_m}(X))=\frac{wp(n-1)}{n}+\frac{pw(1-w)^2}{nm(n+1/m)}+\frac{p(1-w)}{n}\left(\frac{n+w/m}{n+1/m}\right)^2+\frac{p(1-w)^3}{m(n+1/m)^2}. \qquad (2.2)$$

To show that $\delta^0(X)$ is minimax, it suffices to show that $R(\theta,\bar X)\le\lim_{m\to\infty}r(\pi_m)<\infty$ (see Lehmann (1983), Theorem 2.2, p. 256). It then follows from (2.2) that
$$R(\theta,\delta^0(X))\le\lim_{m\to\infty}r(\pi_m,\delta^{B_m}(X))=\frac{wp(n-1)}{n}+\frac{p(1-w)}{n}<\infty.$$
That is, $r(\pi_m,\delta^{B_m}(X))$ converges to $R(\theta,\delta^0(X))$ as $m\to\infty$. Hence $\delta^0(X)=(\bar X_1,\dots,\bar X_p)$ is minimax. $\Box$

Theorem 2.2. Let $\delta^1(X)=\left(1-\frac{\phi(n\|\bar X\|^2)}{n\|\bar X\|^2}\right)\bar X$, where $\phi(\cdot)$ is a real-valued function. If $\phi(\cdot)$ is a nondecreasing function and $0<\phi(\cdot)\le 2(1-w)(p-2)$, then $\delta^1(X)$ is minimax for $p\ge 3$ under the BLF (1.2).

Proof. The risk function of $\delta^1(X)=(\delta_1^1(X),\dots,\delta_p^1(X))$ is
$$R(\theta,\delta^1(X))=R(\theta,\delta^0(X))+E\left\{\sum_{i=1}^{p}(\delta_i^1(X)-\bar X_i)^2+2(1-w)\sum_{i=1}^{p}(\bar X_i-\theta_i)(\delta_i^1(X)-\bar X_i)\right\}.$$

Let $\Delta R=R(\theta,\delta^1(X))-R(\theta,\delta^0(X))$. It is known that $n\|\bar X\|^2$ has a non-central $\chi^2$ distribution with $p$ degrees of freedom and non-centrality parameter $n\|\theta\|^2/2$, and the expectation of a function of a non-central $\chi^2$ random variable can be computed as an infinite sum of central $\chi^2$ expectations with Poisson weights. Then
$$\Delta R=E\left\{\frac{\phi^2(n\|\bar X\|^2)}{n^2\|\bar X\|^2}-2(1-w)\frac{\phi(n\|\bar X\|^2)}{n\|\bar X\|^2}\sum_{i=1}^{p}(\bar X_i-\theta_i)\bar X_i\right\}$$
$$=\sum_{K=0}^{\infty}\frac{(n\|\theta\|^2/2)^K e^{-n\|\theta\|^2/2}}{K!}\,\frac{1}{n}\,E\left\{\frac{\phi^2(\chi^2_{2K+p})}{\chi^2_{2K+p}}-2(1-w)\phi(\chi^2_{2K+p})+4(1-w)K\frac{\phi(\chi^2_{2K+p})}{\chi^2_{2K+p}}\right\}, \qquad (2.3)$$





where $K$ has a Poisson distribution with parameter $n\|\theta\|^2/2$. In order to show that $\delta^1(X)$ is a minimax estimator, it suffices to show that the expectation in (2.3) is nonpositive. By the condition on $\phi(\cdot)$,
$$E\{\cdot\}=E\left\{\phi(\chi^2_{2K+p})\left(\frac{\phi(\chi^2_{2K+p})}{\chi^2_{2K+p}}-2(1-w)+\frac{4(1-w)K}{\chi^2_{2K+p}}\right)\right\}$$
$$\le(1-w)E\left\{\phi(\chi^2_{2K+p})\left(\frac{2p+4K-4}{\chi^2_{2K+p}}-2\right)\right\}$$
$$=(1-w)\left[\mathrm{Cov}\left(\phi(\chi^2_{2K+p}),\,\frac{2p+4K-4}{\chi^2_{2K+p}}-2\right)+E\,\phi(\chi^2_{2K+p})\,E\left(\frac{2p+4K-4}{\chi^2_{2K+p}}-2\right)\right].$$
Note that, given $K$, $\frac{2p+4K-4}{\chi^2_{2K+p}}-2$ is a nonincreasing function of $\chi^2_{2K+p}$. From the fact that the covariance between a nondecreasing function and a nonincreasing function is nonpositive, and $E\left(\frac{2p+4K-4}{\chi^2_{2K+p}}-2\right)=0$, it follows that the expectation in (2.3) is nonpositive. Therefore $\delta^1(X)$ is minimax for $p\ge 3$. $\Box$

Corollary 2.1. The JS-type estimator in (1.3) is minimax under the BLF.

Proof. Take $\phi(n\|\bar X\|^2)=(1-w)(p-2)$. Then $\delta^1(X)$ takes the form in (1.3). $\Box$

Theorem 2.3. An estimator of the form $\delta^2(X)=(\delta_1^2(X),\dots,\delta_p^2(X))$ defined componentwise by
$$\delta_i^2(X)=\left(1-\frac{\phi\bigl(n\sum_{i=1}^{p}(\bar X_i-\tilde X)^2\bigr)}{n\sum_{i=1}^{p}(\bar X_i-\tilde X)^2}\right)(\bar X_i-\tilde X)+\tilde X \qquad (2.4)$$
is minimax for $p\ge 4$, if (a) $\phi(\cdot)$ is a nondecreasing function, and (b) $0<\phi(\cdot)\le 2(1-w)(p-3)$.

Proof. It is similar to the proof of Theorem 2.2. $\Box$
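As a rough illustration of the two known-variance families just described (a sketch under our own naming, not the authors' code), the following computes $\delta^1(X)$ from Theorem 2.2 and $\delta^2(X)$ from Theorem 2.3 with the constant choices $\phi(\cdot)=(1-w)(p-2)$ and $\phi(\cdot)=(1-w)(p-3)$, which satisfy the respective bounds; $\sigma^2=1$ is assumed.

```python
import numpy as np

def delta1(Xbar, n, w, phi=None):
    """Theorem 2.2: delta^1(X) = (1 - phi(n||Xbar||^2)/(n||Xbar||^2)) Xbar, sigma^2 = 1.
    Default phi is the constant (1-w)(p-2), within the bound 0 < phi <= 2(1-w)(p-2)."""
    p = len(Xbar)
    t = n * np.sum(Xbar ** 2)
    phi_t = (1 - w) * (p - 2) if phi is None else phi(t)
    return (1.0 - phi_t / t) * Xbar

def delta2(Xbar, n, w, phi=None):
    """Theorem 2.3: shrinkage toward the grand mean Xtilde (needs p >= 4).
    Default phi is the constant (1-w)(p-3), within the bound 0 < phi <= 2(1-w)(p-3)."""
    p = len(Xbar)
    Xtilde = Xbar.mean()
    t = n * np.sum((Xbar - Xtilde) ** 2)
    phi_t = (1 - w) * (p - 3) if phi is None else phi(t)
    return (1.0 - phi_t / t) * (Xbar - Xtilde) + Xtilde
```

Any other nondecreasing $\phi$ respecting the bounds in conditions (a) and (b) could be passed in instead; the constants simply reproduce James-Stein-type shrinkage toward $0$ and toward the grand mean, respectively.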

Case 2: $\sigma^2$ is unknown

Theorem 2.4. Under the BLF in (1.2), an estimator of the form
$$\delta^3(X)=\left(1-\frac{\phi(n\|\bar X\|^2/S^2)\,S^2}{n\|\bar X\|^2}\right)\bar X \qquad (2.5)$$
is minimax for $p\ge 3$ if (a) $\phi(\cdot)$ is a nondecreasing function, and (b) $0<\phi(\cdot)\le\frac{2(1-w)(p-2)}{p(n-1)+2}$.

Proof. Let $g(U)=1-\frac{\phi(U)}{U}$, where $U=\frac{n\|\bar X\|^2}{S^2}$. Then the risk function of $\delta^3(X)$ is expressed as
$$R(\theta,\delta^3(X))=R(\theta,\delta^0(X))+E\bigl\{(1-g(U))^2\bar X^t\bar X-2(1-w)(1-g(U))\bar X^t\bar X+2(1-w)(1-g(U))\theta^t\bar X\bigr\}.$$
For a given $S^2=s^2$, the difference between the risks of $\delta^3(X)$ and $\delta^0(X)$ is
$$R(\theta,\delta^3(X))-R(\theta,\delta^0(X))=\sum_{K=0}^{\infty}\frac{\bigl(\frac{n\|\theta\|^2}{2\sigma^2}\bigr)^K e^{-\frac{n\|\theta\|^2}{2\sigma^2}}}{K!}\,\frac{\sigma^2}{n}\,E\Biggl\{\chi^2_{2K+p}\Bigl(1-g\Bigl(\frac{\sigma^2\chi^2_{2K+p}}{s^2}\Bigr)\Bigr)^2$$
$$\qquad{}-2(1-w)\chi^2_{2K+p}\Bigl(1-g\Bigl(\frac{\sigma^2\chi^2_{2K+p}}{s^2}\Bigr)\Bigr)+4(1-w)K\Bigl(1-g\Bigl(\frac{\sigma^2\chi^2_{2K+p}}{s^2}\Bigr)\Bigr)\Biggr\}, \qquad (2.6)$$
where $K$ has a Poisson distribution with mean $\frac{n\|\theta\|^2}{2\sigma^2}$.

In order to show that $\delta^3(X)$ is a minimax estimator, it suffices to show that the expectation in (2.6) is nonpositive. From condition (b) and the fact that $S^2/\sigma^2$ has a chi-square distribution with $p(n-1)$ degrees of freedom, taking the expectation over $S^2$ as well,
$$E\{\cdot\}=E\left\{\phi\!\left(\frac{\chi^2_{2K+p}}{\chi^2_{p(n-1)}}\right)\chi^2_{p(n-1)}\left(\frac{\phi\bigl(\chi^2_{2K+p}/\chi^2_{p(n-1)}\bigr)\,\chi^2_{p(n-1)}}{\chi^2_{2K+p}}-2(1-w)+\frac{4(1-w)K}{\chi^2_{2K+p}}\right)\right\}$$
$$\le E\left\{\phi\!\left(\frac{\chi^2_{2K+p}}{\chi^2_{p(n-1)}}\right)\chi^2_{p(n-1)}\left(-2(1-w)+\frac{C_0\chi^2_{p(n-1)}+4(1-w)K}{\chi^2_{2K+p}}\right)\right\},$$
where $C_0=\frac{2(1-w)(p-2)}{p(n-1)+2}$ is the upper bound in condition (b).

Note that, given $\chi^2_{p(n-1)}$ and $K$, $\phi(\cdot)$ is nondecreasing in $\chi^2_{2K+p}$ while $-2(1-w)+\frac{C_0\chi^2_{p(n-1)}+4(1-w)K}{\chi^2_{2K+p}}$ is nonincreasing in $\chi^2_{2K+p}$, changing sign at $\chi^2_{2K+p}=2K+\frac{C_0\chi^2_{p(n-1)}}{2(1-w)}$. Examining the cases $\chi^2_{2K+p}>2K+\frac{C_0\chi^2_{p(n-1)}}{2(1-w)}$ and $\chi^2_{2K+p}\le 2K+\frac{C_0\chi^2_{p(n-1)}}{2(1-w)}$ separately, and using the fact that the covariance between a nondecreasing and a nonincreasing function is nonpositive together with $E(1/\chi^2_{2K+p})=1/(2K+p-2)$, we have
$$E\{\cdot\}\le E\left\{\phi\!\left(\frac{2K}{\chi^2_{p(n-1)}}+\frac{C_0}{2(1-w)}\right)\frac{\chi^2_{p(n-1)}\bigl(-2(1-w)(p-2)+C_0\chi^2_{p(n-1)}\bigr)}{2K+p-2}\right\}.$$
From condition (a), we have
$$E\{\cdot\}\le\phi\!\left(\frac{2KC_0}{2(1-w)(p-2)}+\frac{C_0}{2(1-w)}\right)E\left\{\frac{\chi^2_{p(n-1)}\bigl(-2(1-w)(p-2)+C_0\chi^2_{p(n-1)}\bigr)}{2K+p-2}\right\}=0,$$
since $E\bigl(\chi^2_{p(n-1)}\bigr)^2=p(n-1)\bigl(p(n-1)+2\bigr)$ and $C_0\bigl(p(n-1)+2\bigr)=2(1-w)(p-2)$.


Hence $\delta^3(X)$ is minimax for $p\ge 3$. $\Box$

Remark. If we take $\phi(\cdot)=\frac{(1-w)(p-2)}{p(n-1)+2}$ in (2.5), then the estimator has components
$$\delta_i^{JS}(X)=\left(1-\frac{(1-w)(p-2)S^2}{n\bigl(p(n-1)+2\bigr)\sum_{i=1}^{p}\bar X_i^2}\right)\bar X_i,\qquad i=1,\dots,p, \qquad (2.7)$$
and it can be regarded as a JS-type estimator.

Theorem 2.5. An estimator of the form $\delta^4(X)=(\delta_1^4(X),\dots,\delta_p^4(X))$ defined componentwise by
$$\delta_i^4(X)=\left(1-\frac{\phi\bigl(n\sum_{i=1}^{p}(\bar X_i-\tilde X)^2/S^2\bigr)\,S^2}{n\sum_{i=1}^{p}(\bar X_i-\tilde X)^2}\right)(\bar X_i-\tilde X)+\tilde X$$
is minimax for $p\ge 4$, if (a) $\phi(\cdot)$ is a nondecreasing function, and (b) $0\le\phi(\cdot)\le\frac{2(1-w)(p-3)}{p(n-1)+2}$.

Proof. Its proof is similar to the proof of Theorem 2.4. $\Box$
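For the unknown-variance case, a minimal sketch (again ours, with hypothetical names) of the JS-type estimator (2.7) and of the Theorem 2.5 form with the constant choice $\phi(\cdot)=(1-w)(p-3)/(p(n-1)+2)$:

```python
import numpy as np

def js_unknown_var(X, w):
    """JS-type estimator (2.7): constant phi = (1-w)(p-2)/(p(n-1)+2), sigma^2 unknown."""
    p, n = X.shape
    Xbar = X.mean(axis=1)
    S2 = np.sum((X - Xbar[:, None]) ** 2)                      # S^2 = sum_ij (X_ij - Xbar_i)^2
    c = (1 - w) * (p - 2) * S2 / (n * (p * (n - 1) + 2) * np.sum(Xbar ** 2))
    return (1.0 - c) * Xbar

def delta4(X, w):
    """Theorem 2.5 form with the constant choice phi = (1-w)(p-3)/(p(n-1)+2) (needs p >= 4)."""
    p, n = X.shape
    Xbar = X.mean(axis=1)
    Xtilde = Xbar.mean()
    S2 = np.sum((X - Xbar[:, None]) ** 2)
    ss = n * np.sum((Xbar - Xtilde) ** 2)                      # n * sum_i (Xbar_i - Xtilde)^2
    c = (1 - w) * (p - 3) * S2 / ((p * (n - 1) + 2) * ss)
    return (1.0 - c) * (Xbar - Xtilde) + Xtilde
```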

3. HIERARCHICAL BAYES ESTIMATOR

In this section, we consider a hierarchical Bayes approach when $\sigma^2$ is known, and we find the Bayes estimator and minimax conditions under the BLF in (1.2). Let the conditional distribution of $\theta=(\theta_1,\dots,\theta_p)$ given $\lambda$ be multivariate normal with mean vector $0$ and covariance matrix $\frac{1-\lambda}{n\lambda}I$, $0<\lambda<1$, and let the marginal prior distribution of $\lambda$ be a beta distribution with known parameters $\alpha$ and $\beta$, where $\alpha>0$ and $\beta>0$.

Lemma 3.1. Under the above assumptions, the Bayes estimator is of the form
$$\delta^{HB}(X)=\left\{1-\frac{2(1-w)}{n\|\bar X\|^2}\left[\Bigl(\frac{p}{2}+\alpha\Bigr)-(\beta-1)\frac{\Gamma(\beta-1)\Gamma(1+\alpha+\frac{p}{2})}{\Gamma(\beta)\Gamma(\alpha+\frac{p}{2})}\,\frac{\Phi\bigl(1+\alpha+\frac{p}{2},\,\alpha+\beta+\frac{p}{2},\,-\frac{n\|\bar X\|^2}{2}\bigr)}{\Phi\bigl(\alpha+\frac{p}{2},\,\alpha+\beta+\frac{p}{2},\,-\frac{n\|\bar X\|^2}{2}\bigr)}\right]\right\}\bar X, \qquad (3.1)$$
where $\Phi(a,b,z)$ is the confluent hypergeometric function, $\Phi(a,b,z)=\frac{\Gamma(b)}{\Gamma(b-a)\Gamma(a)}\int_0^1\lambda^{a-1}(1-\lambda)^{b-a-1}e^{z\lambda}\,d\lambda$ (see Abramowitz and Stegun (1964)).

Proof. To find the posterior mean, consider the distribution of $\theta$ given $\lambda$ and $X$. Then $\theta\mid\lambda,X$ is multivariate normal with mean $(1-\lambda)\bar X$ and covariance matrix $\frac{1-\lambda}{n}I$. Hence, from (2.1),
$$\delta^{HB}(X)=w\bar X+(1-w)\bigl(1-E(\lambda\mid X)\bigr)\bar X=\bigl(1-(1-w)E(\lambda\mid X)\bigr)\bar X.$$
Also, the conditional expectation of $\lambda$ given $X$ is
$$E(\lambda\mid X)=\frac{\int_0^1\lambda f(x\mid\lambda)g(\lambda)\,d\lambda}{\int_0^1 f(x\mid\lambda)g(\lambda)\,d\lambda}=\frac{\int_0^1\lambda^{p/2+\alpha}(1-\lambda)^{\beta-1}e^{-n\lambda\|\bar X\|^2/2}\,d\lambda}{\int_0^1\lambda^{p/2+\alpha-1}(1-\lambda)^{\beta-1}e^{-n\lambda\|\bar X\|^2/2}\,d\lambda}.$$
Applying integration by parts to the numerator and using the definition of the confluent hypergeometric function leads to the estimator $\delta^{HB}(X)$ in (3.1). $\Box$

Theorem 3.1. If $\beta>1$ and $\alpha\le\frac{p}{2}-2$, then $\delta^{HB}(X)$ in (3.1) is minimax under the BLF for $p\ge 5$.

Proof. To show that the Bayes estimator $\delta^{HB}(X)$ dominates $\delta^0(X)$ in terms of risk, we use Theorem 2.2. Let
$$\phi(n\|\bar X\|^2)=2(1-w)\left[\frac{p}{2}+\alpha-(\beta-1)\frac{\Gamma(\beta-1)\Gamma(1+\alpha+\frac{p}{2})}{\Gamma(\beta)\Gamma(\alpha+\frac{p}{2})}\,\frac{\Phi\bigl(1+\alpha+\frac{p}{2},\,\alpha+\beta+\frac{p}{2},\,-\frac{n\|\bar X\|^2}{2}\bigr)}{\Phi\bigl(\alpha+\frac{p}{2},\,\alpha+\beta+\frac{p}{2},\,-\frac{n\|\bar X\|^2}{2}\bigr)}\right].$$
Then $\delta^{HB}(X)$ has the form of $\delta^1(X)$ in Theorem 2.2. Thus, if the two conditions of Theorem 2.2 hold, then $\delta^{HB}(X)$ dominates $\delta^0(X)$ in terms of risk and is minimax. From the asymptotic property of the confluent hypergeometric function, namely $\Phi(a,b,c)=\frac{\Gamma(b)}{\Gamma(b-a)}(-c)^{-a}\bigl[1+O(|c|^{-1})\bigr]$ for $c<0$, we obtain, for large $n\|\bar X\|^2$,
$$\phi(n\|\bar X\|^2)=2(1-w)\left(\frac{p}{2}+\alpha-(\beta-1)\frac{\Gamma(1+\alpha+\frac{p}{2})}{\Gamma(\alpha+\frac{p}{2})}\,\frac{2}{n\|\bar X\|^2}+O\bigl[(n\|\bar X\|^2)^{-2}\bigr]\right).$$

Hence $\phi(n\|\bar X\|^2)$ is a nondecreasing function of $n\|\bar X\|^2$ for $\beta>1$. By the second condition in Theorem 2.2,
$$\phi(n\|\bar X\|^2)\le 2(1-w)\Bigl(\frac{p}{2}+\alpha\Bigr)\le 2(1-w)(p-2),$$
so $\alpha\le\frac{p}{2}-2$. Therefore $\delta^{HB}(X)$ dominates $\delta^0(X)$ in terms of risk and thus is minimax. $\Box$
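Assuming SciPy is available, the hierarchical Bayes estimator (3.1) can be evaluated numerically with Kummer's function $\Phi(a,b,z)$, which is provided as `scipy.special.hyp1f1`; the sketch below is ours and only illustrates the formula, with $\sigma^2=1$.

```python
import numpy as np
from scipy.special import hyp1f1, gammaln

def hb_estimator(Xbar, n, w, alpha, beta):
    """Hierarchical Bayes estimator (3.1); Phi(a, b, z) is Kummer's confluent
    hypergeometric function, scipy.special.hyp1f1. Requires beta > 1 and sigma^2 = 1."""
    p = len(Xbar)
    t = n * np.sum(Xbar ** 2)                                   # n * ||Xbar||^2
    z = -t / 2.0
    num = hyp1f1(1 + alpha + p / 2.0, alpha + beta + p / 2.0, z)
    den = hyp1f1(alpha + p / 2.0, alpha + beta + p / 2.0, z)
    # (beta-1) Gamma(beta-1) Gamma(1+alpha+p/2) / (Gamma(beta) Gamma(alpha+p/2)), via log-gammas
    ratio = (beta - 1) * np.exp(gammaln(beta - 1) + gammaln(1 + alpha + p / 2.0)
                                - gammaln(beta) - gammaln(alpha + p / 2.0))
    phi = 2 * (1 - w) * (p / 2.0 + alpha - ratio * num / den)   # phi(n||Xbar||^2) of Theorem 3.1
    return (1.0 - phi / t) * Xbar
```

Under Theorem 3.1, choosing $\beta>1$ and $0<\alpha\le p/2-2$ (so $p\ge 5$) makes this estimator minimax.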

4. RISK SIMULATION STUDY

In this section we compute the risk improvement percentage (RIP) of the James-Stein type estimator $\delta^{JS}(X)$ in (2.7) over $\delta^0(X)=\bar X$ for the simultaneous estimation of multivariate normal means with unknown variance, using Monte Carlo simulation. The RIP is computed as
$$\mathrm{RIP}=\frac{R(\theta,\bar X)-R(\theta,\delta^{JS}(X))}{R(\theta,\bar X)}\times 100.$$
Using the IMSL subroutine GGUBT, we generate $\theta_1,\dots,\theta_p$ from a uniform distribution on a given interval, and the simulation is repeated 1000 times to compute the averages of $R(\theta,\bar X)$ and $R(\theta,\delta^{JS}(X))$.

In Table 1 we report the RIP of $\delta^{JS}(X)$ in (2.7) over $\delta^0(X)$ for different values of $n$ and $p$ when $w=0.1$, $0.5$ and $0.9$. From Table 1, as the magnitude of the $\theta_i$'s increases, the corresponding RIP decreases. Also, the RIP is always positive. If $w$ is close to zero, the balanced loss function (1.2) is dominated by the precision-of-estimation part, whereas if $w$ is close to 1 the loss depends mainly on the goodness-of-fit part. It is also seen that as $p$ increases, the corresponding RIPs increase; when $w=0.9$ and $n$ is larger, the RIP does not always follow this trend. As seen above, the RIP is influenced by the values of $n$, $p$, $w$ and the magnitude of $\theta$. When $w$ is close to zero, the allowable range of $\phi(\cdot)$ becomes large and hence the estimator $\delta^{JS}(X)$ yields a greater improvement over the usual estimator $\delta^0(X)$; otherwise, $\delta^{JS}(X)$ collapses back to $\delta^0(X)$ and shows little improvement over $\delta^0(X)$.

Table 1. Risk improvement percentage (RIP) of $\delta^{JS}(X)$ in (2.7) over $\delta^0(X)$ when $\sigma^2=2$.

                               p=3      p=5      p=7      p=9      p=20
  $\theta_i\in(0,1)$
    w=0.1   n=5             9.0821   17.473   21.938   24.276   29.412
            n=10            4.2112   8.5789   10.819   12.140   14.677
    w=0.5   n=5             1.4568   2.8041   3.5213   3.8958   4.7205
            n=10            .46764   .95297   1.2022   1.3488   1.6308
    w=0.9   n=5             .03853   .07421   .09536   .10594   .12686
            n=10            .01045   .02388   .03038   .03084   .04077
  $\theta_i\in(0,2)$
    w=0.1   n=5             3.0747   6.6804   8.6576   9.8023   12.150
            n=10            1.2290   2.8373   3.5332   4.0145   4.9611
    w=0.5   n=5             .49304   1.0721   1.3901   1.5732   1.9499
            n=10            .14413   .31523   .39279   .44604   .55102
    w=0.9   n=5             .01243   .02789   .03799   .04301   .05195
            n=10            .00269   .00844   .01048   .00772   .01438
  $\theta_i\in(0,5)$
    w=0.1   n=5             .56812   1.2632   1.5819   1.8418   2.3578
            n=10            .22496   .47989   .60415   .69786   .88162
    w=0.5   n=5             .09088   .20273   .25404   .29560   .37831
            n=10            .02491   .05328   .06709   .07751   .09793
    w=0.9   n=5             .00141   .00433   .00726   .00835   .00911
            n=10            .00041   .00319   .00318   .00061   .00401
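A sketch of the Monte Carlo computation of the RIP described in this section (our reading of the procedure, with NumPy's generator in place of the IMSL routine GGUBT; the function names and the exact averaging scheme are our assumptions):

```python
import numpy as np

def balanced_loss(X, est, theta, w):
    n = X.shape[1]
    return (w / n) * np.sum((X - est[:, None]) ** 2) + (1 - w) * np.sum((est - theta) ** 2)

def rip(p, n, w, upper, sigma2=2.0, reps=1000, seed=0):
    """Monte Carlo risk improvement percentage of the JS-type estimator (2.7) over Xbar."""
    rng = np.random.default_rng(seed)
    theta = rng.uniform(0.0, upper, size=p)                     # theta_i drawn from U(0, upper)
    loss_usual = loss_js = 0.0
    for _ in range(reps):
        X = theta[:, None] + np.sqrt(sigma2) * rng.standard_normal((p, n))
        Xbar = X.mean(axis=1)
        S2 = np.sum((X - Xbar[:, None]) ** 2)
        c = (1 - w) * (p - 2) * S2 / (n * (p * (n - 1) + 2) * np.sum(Xbar ** 2))
        loss_usual += balanced_loss(X, Xbar, theta, w)
        loss_js += balanced_loss(X, (1.0 - c) * Xbar, theta, w)
    return 100.0 * (loss_usual - loss_js) / loss_usual

print(rip(p=5, n=5, w=0.1, upper=1.0))
```

The output depends on the particular $\theta$ draw and the random seed, so a rerun of such a sketch will not match the Table 1 entries exactly.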

REFERENCES

(1) Abramowitz, M. and Stegun, I. A. (1964), Handbook of Mathematical Functions with Formulas, Graphs and Mathematical Tables, National Bureau of Standards Applied Mathematics Series 55.
(2) Baranchik, A. J. (1970), A family of minimax estimators of the mean of a multivariate normal distribution, The Annals of Mathematical Statistics, 22, 22-42.
(3) Chung, Y. and Kim, C. (1997), Simultaneous estimation of the multivariate normal mean under balanced loss function, Communications in Statistics - Theory and Methods, 26, 1599-1611.
(4) Chung, Y., Kim, C. and Song, S. (1998), Linear estimators of Poisson mean under balanced loss functions, Statistics and Decisions, 16.
(5) James, W. and Stein, C. (1961), Estimation with quadratic loss, Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, 1, 361-371.
(6) Lehmann, E. L. (1983), Theory of Point Estimation, Wiley, New York.
(7) Rodrigue, J. and Zellner, A. (1995), Weighted balanced loss function for the exponential mean time to failure, Communications in Statistics - Theory and Methods, 23, 3609-3616.
(8) Stein, C. (1981), Estimation of the mean of a multivariate normal distribution, The Annals of Statistics, 9, 1135-1151.
(9) Strawderman, W. E. (1971), Proper Bayes minimax estimators of the multivariate normal mean, The Annals of Mathematical Statistics, 42, 385-388.
(10) Sun, L. (1995), Risk ratio and minimaxity in estimating the multivariate normal mean with unknown variance, Scandinavian Journal of Statistics, 22, 105-120.
(11) Zellner, A. (1994), Bayesian and Non-Bayesian estimation using balanced loss functions, in: J. O. Berger and S. S. Gupta, eds., Statistical Decision Theory and Related Topics V, Springer, New York, pp. 377-390.
