Comparison of Methods for the Computation of Multivariate Normal Probabilities

Alan Genz
Department of Mathematics
Washington State University
Pullman, WA 99164-3113
[email protected]

Abstract

This paper compares acceptance-rejection sampling and methods of Deak, Genz and Schervish for the numerical computation of multivariate normal probabilities. Tests using randomly chosen problems show that the most efficient numerical methods use a transformation developed by Genz (1992). The methods allow moderately accurate multivariate normal probabilities to be quickly computed for problems with as many as ten variables.

1 Introduction

A common problem in many statistics computations is computing the multivariate normal distribution function

$$P(b) = \frac{1}{\sqrt{|\Sigma| (2\pi)^m}} \int_{-\infty}^{b_1} \int_{-\infty}^{b_2} \cdots \int_{-\infty}^{b_m} e^{-\frac{1}{2} x^t \Sigma^{-1} x} \, dx,$$

where $x = (x_1, x_2, \ldots, x_m)^t$, $\Sigma$ is an $m \times m$ symmetric positive definite covariance matrix and $b_i < \infty$ for all $i$. There is reliable and efficient software available for computing $P(b)$ for $m = 1$ and $m = 2$ (for $m = 2$ see Donnelly, 1973 and Drezner and Wesolowsky, 1990), so assume $m > 2$. Perhaps the simplest method uses acceptance-rejection sampling, but this is not expected to be efficient for high accuracy work. Other methods for $m > 2$ use algorithms developed by Deak (1980, 1986 and 1990), Schervish (1984) and Genz (1992). The Schervish method has been compared with Genz's methods (Genz, 1992), but Deak's methods have not been compared with the other methods. Improvements to Genz's methods have recently been proposed by Beckers and Haegemans (1992), and Gibson, Glasbey and Elston (1992).

Published in Computing Science and Statistics 25, pp. 400-405, 1993. This work was supported in part by NSF grant DMS9211640.

The purpose of this paper is to present results from tests comparing acceptance-rejection sampling, Deak's methods, Schervish's methods and improved versions of Genz's methods. In Section 2 brief descriptions of the different methods are given, in Section 3 test results are reported, and Section 4 provides some concluding remarks.
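As a rough illustration of the acceptance-rejection idea mentioned in the introduction, the following minimal sketch estimates $P(b)$ as the fraction of simulated normal vectors that fall inside the integration region. It is not the REJNRM implementation tested in this paper; the function names `cholesky` and `mvn_prob_rejection`, the sample count and the seed are all assumptions made for this example.

```python
import math
import random


def cholesky(sigma):
    """Return lower-triangular C with sigma = C C^t (plain-Python Cholesky)."""
    m = len(sigma)
    c = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1):
            s = sigma[i][j] - sum(c[i][k] * c[j][k] for k in range(j))
            c[i][j] = math.sqrt(s) if i == j else s / c[j][j]
    return c


def mvn_prob_rejection(b, sigma, n_samples=100_000, seed=0):
    """Estimate P(b) as the fraction of N(0, sigma) samples with x <= b.

    Hypothetical helper for illustration; returns (estimate, standard error).
    """
    rng = random.Random(seed)
    c = cholesky(sigma)
    m = len(b)
    hits = 0
    for _ in range(n_samples):
        y = [rng.gauss(0.0, 1.0) for _ in range(m)]
        # x = C y has covariance C C^t = sigma; accept if every x_i <= b_i.
        if all(sum(c[i][j] * y[j] for j in range(i + 1)) <= b[i]
               for i in range(m)):
            hits += 1
    p = hits / n_samples
    std_err = math.sqrt(p * (1.0 - p) / n_samples)
    return p, std_err
```

The standard error of such an estimate shrinks only like the inverse square root of the number of samples, which is consistent with the remark that this approach is not expected to be efficient for high accuracy work.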

2 The Methods

2.1 Deak's Methods

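These methods use a transformation to a spherical coordinate system. First, let $x = Cy$, where $\Sigma = CC^t$ is the Cholesky decomposition of $\Sigma$. Then $x^t \Sigma^{-1} x = y^t C^t C^{-t} C^{-1} C y = y^t y$, so

$$P(b) = (2\pi)^{-\frac{m}{2}} \int_{-\infty < Cy \le b} e^{-\frac{1}{2} y^t y} \, dy.$$

Next, let $y = rz$ with $\|z\|_2 = 1$, so that $y^t y = r^2$ and $P(b)$ becomes the average of $F(\rho(z))$ over the surface of the unit $m$-sphere, where

$$\rho(z) = \begin{cases} \min_{i:\, w_i > 0} \, b_i / w_i & \text{if } w_i > 0 \text{ for some } i, \\ \infty & \text{otherwise,} \end{cases}$$

with $w_i = \sum_{j=1}^{i} c_{i,j} z_j$, and

$$F(\rho) = H_m \int_0^{\rho} r^{m-1} e^{-\frac{r^2}{2}} \, dr,$$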

with $H_m$ chosen so that $F(\infty) = 1$. $\rho(z)$ is the distance, in the $z$ direction, from the origin to the boundary of the integration region. The definition given here assumes $b_i > 0$ for all $i$, but this can be generalized (see Deak, 1986) to include negative $b_i$'s. Deak's methods all compute estimates of the form

$$\frac{1}{N} \sum_{j=1}^{N} F(\rho(z_j))$$

using points $\{z_j\}$ that are randomly chosen from the surface of the $m$-sphere. His simplest method uses uniformly random $\{z_j\}$ from the surface of the $m$-sphere. Better methods improve the sampling by using antithetic variates. Let $Z$ be an $m \times m$ random orthogonal matrix with columns $\{z_j\}$ and let [...]

[...] Finally, let $z_i = e_i w_i$, for $i = 1, \ldots, m$, so $dz_i = e_i \, dw_i$, and $P(b) = e_1$ [...]
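The simplest of Deak's methods described above, which averages $F(\rho(z))$ over uniformly random directions $z$, can be sketched as follows. This is a minimal illustration under stated assumptions, not the SPHNRM code: it handles only the case $b_i > 0$ for all $i$, it omits the antithetic-variate improvements, and the names `cholesky`, `radial_cdf` and `deak_simple` are hypothetical. The radial function $F$ is evaluated with the integration-by-parts recurrence $I_{k+2}(\rho) = k I_k(\rho) - \rho^k e^{-\rho^2/2}$ for $I_k(\rho) = \int_0^\rho r^{k-1} e^{-r^2/2} \, dr$.

```python
import math
import random


def cholesky(sigma):
    """Return lower-triangular C with sigma = C C^t."""
    m = len(sigma)
    c = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1):
            s = sigma[i][j] - sum(c[i][k] * c[j][k] for k in range(j))
            c[i][j] = math.sqrt(s) if i == j else s / c[j][j]
    return c


def radial_cdf(rho, m):
    """F(rho) = H_m * int_0^rho r^(m-1) exp(-r^2/2) dr, with F(inf) = 1."""
    if rho > 40.0:  # tail is negligible here; also avoids overflow in rho**k
        return 1.0
    tail = math.exp(-0.5 * rho * rho)
    if m % 2 == 1:
        k, total = 1, math.sqrt(math.pi / 2.0)   # I_1(inf)
        f = math.erf(rho / math.sqrt(2.0))       # F for m = 1
    else:
        k, total = 2, 1.0                        # I_2(inf)
        f = 1.0 - tail                           # F for m = 2
    while k < m:
        total *= k                               # I_{k+2}(inf) = k * I_k(inf)
        f -= rho ** k * tail / total
        k += 2
    return f


def deak_simple(b, sigma, n_samples=20_000, seed=0):
    """Average F(rho(z)) over uniformly random unit vectors z (b_i > 0 only)."""
    if min(b) <= 0:
        raise ValueError("this sketch assumes b_i > 0 for all i")
    rng = random.Random(seed)
    c = cholesky(sigma)
    m = len(b)
    total = 0.0
    for _ in range(n_samples):
        g = [rng.gauss(0.0, 1.0) for _ in range(m)]
        norm = math.sqrt(sum(v * v for v in g))
        z = [v / norm for v in g]                # uniform on the unit sphere
        rho = math.inf                           # no boundary hit if all w_i <= 0
        for i in range(m):
            w = sum(c[i][j] * z[j] for j in range(i + 1))
            if w > 0:
                rho = min(rho, b[i] / w)
        total += radial_cdf(rho, m)
    return total / n_samples
```

Because the radial integral is done exactly for each direction, only the angular part is sampled, which is why this kind of estimator can be expected to have lower variance than plain acceptance-rejection for the same number of points.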

3 Test Results

[...] $m > 6$ because of its rapidly increasing computation times. The REJNRM average times for $m > 8$ gradually increased to approximately two seconds at $m = 14$. The SPHNRM average times for $m > 8$ gradually increased to approximately six seconds at $m = 14$. All three methods appear to be robust and reliable at this level of requested accuracy.

The next tests used $\epsilon = 0.001$ (Tables 3 and 4).

Table 3: Constant $\sigma_{i,j}$ results, $\epsilon = 0.001$

            SPHNRM          MULNOR
   m       E       T       E       T
   3     0.19   10.50    0.05    0.00
 sd's    0.16    6.48    0.03    0.00
   4     0.27    7.84    0.04    0.13
 sd's    0.20    3.86    0.03    0.18
   5     0.22    9.93    0.03    8.08
 sd's    0.13    5.25    0.02   15.52
   6     0.28    6.96      -       -
 sd's    0.19    4.39      -       -
   7     0.22   14.58      -       -
 sd's    0.16    7.92      -       -
   8     0.25   14.11      -       -
 sd's    0.19    6.81      -       -

Table 2: Constant $\sigma_{i,j}$ results, $\epsilon = 0.01$

            RANNRM          SADNRM          KRONRM
   m       E       T       E       T       E       T
   3     0.13    0.01    0.00    0.00    0.00    0.08
 sd's    0.16    0.00    0.00    0.01    0.00    0.00
   4     0.16    0.01    0.00    0.01    0.00    0.12
 sd's    0.19    0.01    0.01    0.00    0.00    0.01
   5     0.15    0.02    0.01    0.01    0.01    0.16
 sd's    0.12    0.02    0.01    0.00    0.01    0.01
   6     0.12    0.02    0.01    0.02    0.01    0.21
 sd's    0.13    0.02    0.02    0.00    0.01    0.05
   7     0.14    0.03    0.00    0.03    0.00    0.25
 sd's    0.13    0.02    0.01    0.00    0.00    0.02
   8     0.13    0.03    0.01    0.06    0.05    0.29
 sd's    0.12    0.02    0.03    0.01    0.08    0.03
   9     0.12    0.04    0.00    0.10    0.00    0.34
 sd's    0.11    0.02    0.01    0.01    0.01    0.04
  10     0.13    0.05    0.00    0.18    0.00    0.38
 sd's    0.13    0.02    0.01    0.01    0.00    0.03
  11     0.15    0.06    0.01    0.35    0.00    0.43
 sd's    0.14    0.05    0.01    0.04    0.00    0.02
  12     0.10    0.07    0.01    0.72    0.01    0.49
 sd's    0.11    0.04    0.01    0.06    0.02    0.05
  13     0.15    0.08    0.02    0.52    0.00    0.53
 sd's    0.15    0.06    0.02    0.71    0.01    0.04
  14     0.14    0.09    0.03    0.89    0.06    0.60
 sd's    0.13    0.06    0.02    1.29    0.06    0.09

E = scaled average error and T = average time

Table 4: Constant $\sigma_{i,j}$ results, $\epsilon = 0.001$

            RANNRM          SADNRM          KRONRM
   m       E       T       E       T       E       T
   3     0.22    0.24    0.02    0.00    0.01    0.10
 sd's    0.19    0.52    0.02    0.00    0.01    0.04
   4     0.20    0.85    0.04    0.01    0.03    0.14
 sd's    0.14    1.18    0.05    0.03    0.05    0.06
   5     0.20    0.96    0.03    0.01    0.09    0.17
 sd's    0.18    2.30    0.05    0.00    0.13    0.05
   6     0.20    0.76    0.04    0.02    0.04    0.21
 sd's    0.17    1.36    0.04    0.01    0.04    0.01
   7     0.25    1.25    0.06    0.04    0.05    0.26
 sd's    0.30    2.63    0.12    0.02    0.07    0.08
   8     0.27    1.79    0.04    0.06    0.13    0.64
 sd's    0.18    3.06    0.03    0.01    0.15    0.41
   9     0.22    1.40    0.04    0.13    0.03    0.35
 sd's    0.21    2.34    0.04    0.08    0.04    0.04
  10     0.18    1.58    0.05    0.23    0.03    0.39
 sd's    0.17    2.01    0.04    0.11    0.03    0.03
  11     0.22    3.23    0.06    0.45    0.04    0.43
 sd's    0.16    5.84    0.06    0.27    0.05    0.02
  12     0.19    4.01    0.05    0.84    0.04    0.56
 sd's    0.15   10.46    0.04    0.38    0.06    0.36
  13     0.21    5.97    0.09    3.05    0.04    0.56
 sd's    0.14    8.84    0.16    3.08    0.06    0.16
  14     0.19    3.57    0.09    4.05    0.16    0.94
 sd's    0.14    5.96    0.21    3.26    0.24    0.42

E = scaled average error and T = average time

All three of the methods RANNRM, SADNRM and KRONRM are robust and reliable, and usually faster and more accurate (except for $m = 3$ and $m = 4$) than the other three methods. There is not much variation in the time taken by a particular method until $m$ reaches ten or so.


Results for REJNRM are not given, but some tests were done. Typical times were approximately one hundred times longer than the times required for $\epsilon = 0.01$. The SPHNRM times show a similar pattern, as is expected from a Monte-Carlo method, where the error should decrease by a factor of ten when the number of sample points increases by a factor of one hundred. All three of the methods RANNRM, SADNRM and KRONRM continue to be robust and reliable and, except perhaps for $m = 3$ or $4$, are faster than the other methods. The RANNRM times increased by a factor of approximately one hundred compared to the times for $\epsilon = 0.01$, as is expected for a Monte-Carlo method. Overall, SADNRM is faster for $m < 12$ and then KRONRM is faster.

The final tests for the constant covariance matrices used $\epsilon = 0.0001$ (Table 5).

Table 5: Constant $\sigma_{i,j}$ results, $\epsilon = 0.0001$

            SADNRM          KRONRM
   m       E       T       E       T
   3     0.11    0.00    0.05    0.09
 sd's    0.11    0.00    0.05    0.02
   4     0.25    0.01    0.15    0.19
 sd's    0.42    0.01    0.16    0.21
   5     0.20    0.03    0.19    0.33
 sd's    0.21    0.03    0.27    0.16
   6     0.18    0.13    0.20    1.18
 sd's    0.20    0.13    0.14    1.86
   7     0.13    0.28    0.16    0.99
 sd's    0.13    0.23    0.21    0.84
   8     0.17    0.54    0.22    4.89
 sd's    0.17    0.45    0.33    5.77
   9     0.17    1.24    0.37    6.69
 sd's    0.08    0.90    1.04   15.18
  10     0.16    2.36    0.28   14.72
 sd's    0.16    1.82    0.38   24.11
  11     0.14    4.60    0.20   21.43
 sd's    0.10    4.33    0.24   46.30
  12     0.16   12.40    0.17   28.64
 sd's    0.13    9.17    0.22   50.60

The last tests use $\epsilon = 0.01$ with "completely" random $\Sigma$'s for each $P(b)$. For this test run, fifty random $\Sigma$'s were generated using a method described by Marsaglia and Olkin (1984). With this method, a lower triangular matrix $\hat{C}$ is first generated, with elements uniformly random from $[-1, 1]$. The columns of $\hat{C}$ are then scaled so that they have unit 2-norms and positive diagonal entries. The result is a lower triangular matrix $C$ that is used to produce a random covariance matrix $\Sigma = CC^t$. For each test run, fifty of these random covariance matrices were generated for each $m$; the random $b$'s were generated in the same manner as they were generated for the other tests. Because exact $P(b)$ values were not known for these test problems, "accurate" $P(b)$ values were computed using KRONRM with $\epsilon = 0.0005$. Selected test results are given in Tables 6 and 7.

Table 6: Random $\sigma_{i,j}$ results, $\epsilon = 0.01$

            REJNRM          SPHNRM          MULNOR
   m       E       T       E       T       E       T
   3     0.24    0.82    0.27    0.14    0.05    0.01
 sd's    0.17    0.45    0.23    0.10    0.04    0.01
   4     0.20    0.96    0.27    0.08    0.04    4.96
 sd's    0.17    0.48    0.21    0.05    0.04   23.76
   5     0.22    1.18    0.20    0.15      -       -
 sd's    0.17    0.61    0.17    0.08      -       -
   6     0.27    1.27    0.15    0.17      -       -
 sd's    0.24    0.65    0.14    0.01      -       -
   7     0.23    1.19    0.15    0.43      -       -
 sd's    0.20    0.73    0.12    0.18      -       -
   8     0.74    1.34    0.63    0.68      -       -
 sd's    3.54    0.84    3.55    0.35      -       -

Table 7: Random $\sigma_{i,j}$ results, $\epsilon = 0.01$

            SPHNRM          RANNRM          SADNRM
   m       E       T       E       T       E       T
   3     0.16    0.08    0.25    0.01    0.03    0.10
 sd's    0.16    0.12    0.61    0.01    0.06    0.04
   4     0.21    0.11    0.19    0.01    0.05    0.14
 sd's    0.19    0.19    0.36    0.01    0.07    0.07
   5     0.16    0.10    0.15    0.01    0.07    0.19
 sd's    0.13    0.13    0.20    0.01    0.07    0.09
   6     0.25    0.13    0.16    0.02    0.07    0.21
 sd's    0.20    0.13    0.20    0.02    0.05    0.04
   7     0.24    0.15    0.13    0.05    0.11    0.28
 sd's    0.19    0.22    0.13    0.04    0.11    0.15
   8     0.24    0.15    0.13    0.07    0.14    0.45
 sd's    0.18    0.16    0.14    0.05    0.16    0.32
   9     0.27    0.13    0.12    0.29    0.08    0.36
 sd's    0.20    0.13    0.11    0.89    0.08    0.08
  10     0.21    0.19    0.16    1.07    0.08    0.42
 sd's    0.17    0.18    0.15    5.20    0.09    0.21
  11     0.20    0.23    0.21    2.93    0.11    1.12
 sd's    0.15    0.22    0.28   10.95    0.15    4.42
  12     0.19    0.23    0.19    4.33    0.12    0.60
 sd's    0.15    0.23    0.23   11.48    0.19    0.54
  13     0.25    0.20    0.16   11.98    0.06    0.55
 sd's    0.21    0.23    0.32   28.41    0.08    0.11
  14     0.20    0.24    0.36   16.62    0.12    0.68
 sd's    0.16    0.42    1.19   44.78    0.11    0.27

E = scaled average error and T = average time

The "random $\sigma_{i,j}$" problems are apparently harder for the different methods. SADNRM appears to be best for $m < 10$, and then KRONRM is faster.
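The construction of the "completely" random covariance matrices used for these last tests can be sketched as follows. This is a minimal illustration that follows the wording of the description literally (a lower triangular matrix with entries uniform on $[-1, 1]$, columns scaled to unit 2-norm with positive diagonal entries, then $\Sigma = CC^t$); it is not the code used for the tests, and the function name `random_covariance` is an assumption made for this example.

```python
import math
import random


def random_covariance(m, rng):
    """Return (C, Sigma) with Sigma = C C^t from a scaled lower-triangular C."""
    # Lower-triangular C-hat with entries uniformly random from [-1, 1].
    c = [[rng.uniform(-1.0, 1.0) if j <= i else 0.0 for j in range(m)]
         for i in range(m)]
    # Scale each column to unit 2-norm; flip its sign if the diagonal
    # entry is negative so that the diagonal of C is positive.
    for j in range(m):
        norm = math.sqrt(sum(c[i][j] ** 2 for i in range(j, m)))
        scale = math.copysign(1.0, c[j][j]) / norm
        for i in range(j, m):
            c[i][j] *= scale
    # Sigma = C C^t; only k <= min(i, j) contributes because C is lower
    # triangular.
    sigma = [[sum(c[i][k] * c[j][k] for k in range(min(i, j) + 1))
              for j in range(m)] for i in range(m)]
    return c, sigma
```

Since $C$ is lower triangular with a positive diagonal, it is nonsingular and the resulting $\Sigma$ is symmetric positive definite. With column scaling the diagonal of $\Sigma$ is not generally equal to one; scaling the rows of $C$ instead would give a matrix with unit diagonal.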

4 Concluding Remarks

These results provide strong evidence that multivariate normal probabilities can be robustly and reliably computed at low to moderate accuracy levels in less than a second of workstation time for problems with up to ten dimensions. When at least moderate accuracy is required, the implementations SADNRM and KRONRM are usually much faster than the other methods. High accuracy or high dimension problems can require long computation times for these methods, and it is still not clear which method is best for this type of problem. Software for all of the methods discussed here is available from the author.

References

Beckers, M. and Haegemans, A. (1992) 'Comparison of Numerical Integration Techniques for Multivariate Normal Integrals', Computer Science Department preprint, Catholic University of Leuven, Belgium.

Berntsen, J., Espelid, T. O. and Genz, A. (1991) 'Algorithm 698: DCUHRE: An Adaptive Multidimensional Integration Routine for a Vector of Integrals', ACM Transactions on Mathematical Software 17, pp. 452-456.

Cranley, R. and Patterson, T. N. L. (1976) 'Randomization of Number Theoretic Methods for Multiple Integration', SIAM J. Numer. Anal. 13, pp. 904-914.

Deak, I. (1980) 'Three Digit Accurate Multiple Normal Probabilities', Numer. Math. 35, pp. 369-380.

Deak, I. (1986) 'Computing Probabilities of Rectangles in Case of Multinormal Distribution', J. Statist. Comput. Simul. 26, pp. 101-114.

Deak, I. (1990) Random Number Generation and Simulation, Akademiai Kiado, Budapest, Chapter 7.

Drezner, Z. and Wesolowsky, G. O. (1990) 'On the Computation of the Bivariate Normal Integral', Journal of Statistical Computation and Simulation 35, pp. 101-107.

Drezner, Z. (1992) 'Computation of the Multivariate Normal Integral', ACM TOMS 18, pp. 450-460.

Genz, A. (1992) 'Numerical Computation of the Multivariate Normal Probabilities', J. Comput. Graph. Stat. 1, pp. 141-150.

Gibson, G. J., Glasbey, C. A. and Elston, D. A. (1992) 'Monte-Carlo Evaluation of Multivariate Normal Integrals', Scottish Agricultural Statistics Service preprint, University of Edinburgh, Scotland.

Keast, P. (1973) 'Optimal Parameters for Multidimensional Integration', SIAM J. Numer. Anal. 10, pp. 831-838.

Lohr, S. (1990) 'Accurate Multivariate Estimation using Triple Sampling', Ann. Statist. 18, pp. 1615-1633.

Marsaglia, G. and Olkin, I. (1984) 'Generating Correlation Matrices', SIAM Journal of Scientific and Statistical Computing 5, pp. 470-475.

Schervish, M. (1984) 'Multivariate Normal Probabilities with Error Bound', Applied Statistics 33, pp. 81-87.

Tong, Y. L. (1990) The Multivariate Normal Distribution, Springer-Verlag, New York.
