Running head: A Sequential Triangular Test

A Sequential Triangular Test of a Correlation Coefficient’s Null-Hypothesis: 0 < ρ ≤ ρ0

Berthold Schneider, Hannover Medical School, Germany
Dieter Rasch, University of Natural Resources and Life Sciences, Austria
Klaus D. Kubinger, University of Vienna, Austria
Takuya Yanagida, University of Applied Sciences Upper Austria, Austria


Author Note
Berthold Schneider, Institute for Biometry, Hannover Medical School; Dieter Rasch, Department of Landscape, Spatial and Infrastructure Sciences, University of Natural Resources and Life Sciences; Klaus D. Kubinger, Division of Psychological Assessment and Applied Psychometrics, Faculty of Psychology, University of Vienna; Takuya Yanagida, School of Applied Health and Social Sciences, University of Applied Sciences Upper Austria.
Special thanks to Karl Moder of the Department of Landscape, Spatial and Infrastructure Sciences at the University of Natural Resources and Life Sciences for helpful comments.
Correspondence concerning this article should be sent to Klaus D. Kubinger, Faculty of Psychology, University of Vienna, Liebiggasse 5, 1010 Vienna, Austria. E-mail: [email protected]


Abstract
A sequential triangular test for the null-hypothesis H0: 0 < ρ ≤ ρ0 is derived, given a normal distribution of a two-dimensional quantitative vector of random variables (x, y). The test is based on an approximately normally distributed test statistic obtained by Fisher’s transformation of a sample’s correlation coefficient. A simulation study shows that, for given precision requirements (type-I-risk, type-II-risk, and a practically relevant effect δ = ρ1 − ρ0), the results are feasible and appealing in terms of utility: the average sample size of the sequential triangular test is smaller than the sample size of the pertinent fixed sample size test.
Keywords: sequential test, correlation analyses, correlation coefficient


Introduction
Due to the fact that most correlation analyses – at least within the social sciences – focus on testing the null-hypothesis H0: ρ = 0 (versus the two-sided alternative hypothesis HA: ρ ≠ 0), we emphatically advocate an alternative approach: to test the composite null-hypothesis H0: 0 < ρ ≤ ρ0, for some 0 < ρ0 < 1, instead. Rejecting the traditional null-hypothesis H0: ρ = 0 (meaning a significant correlation coefficient, regardless of how small the type-I-risk α may be) very often has no practical meaning. Confirming a population’s correlation coefficient as being merely unequal to zero does not entail much gain of content information unless this correlation coefficient is sufficiently large; and this in turn means that the correlation explains a relevant amount of variance. For this, the suggested alternative approach might be based on a value of ρ0 chosen according to the square root of the expected relative proportion of variance which is explained by a linear regression (that is, the so-called coefficient of determination, ρ2). Of course, this approach has been offered for a long time even in applied textbooks, cf. e.g. a more recent one, Rasch, Kubinger, and Yanagida (2011), and in particular Kubinger, Rasch, and Šimečkova (2007). Both references moreover give an SPSS syntax for the respective test, in order to fill the gap in the existing SPSS menu for researchers restricted to the use of SPSS; the former reference also illustrates the application of this test in R. The latter additionally checks the practical consequences of the respective test statistic being only asymptotically normally distributed: a simulation study proved that the actual type-I-risk is hardly larger than the nominal one. Finally, as concerns the planning of a study, that is the calculation of the sample size n given, apart from a certain null-hypothesis’ ρ0, also the necessary precision requirements (the type-I-risk α and the type-II-risk β as well as δ, the (minimal) relevant difference between ρ1 and ρ0; ρ1 > ρ0), the former reference refers to a specific R package (OPDOE; Rasch, Pilz, Verdooren, & Gebhardt, 2011).
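For the fixed sample size case, the necessary n can also be approximated in a few lines of base R via Fisher’s transformation. The following is a hedged sketch of that standard approximation, not the OPDOE routine itself; the function name n.fix.cor is our own. It reproduces, for instance, n = 88 for ρ0 = .5, ρ1 = .7, α = .05, β = .1 (the value used as nfix later in Table 1).

# Approximate sample size for testing H0: rho <= rho0 against H1: rho > rho0
# with type-I-risk alpha and power 1 - beta at rho = rho1 (Fisher-z approximation).
# Hypothetical helper, not taken from OPDOE.
n.fix.cor <- function(rho0, rho1, alpha = 0.05, beta = 0.1) {
  delta.z <- atanh(rho1) - atanh(rho0)   # difference on Fisher's z scale
  ceiling(((qnorm(1 - alpha) + qnorm(1 - beta)) / delta.z)^2 + 3)
}
n.fix.cor(0.5, 0.7)   # 88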


However, the most effective strategy within statistics, that is sequential testing, has not yet been established for testing the composite null-hypothesis H0: 0 < ρ ≤ ρ0. Beyond the first concepts of sequential testing by Wald (1947), a special group of sequential tests with a maximal number of research units needed are the sequential triangular tests. Their principle goes back to Whitehead (1992) and Schneider (1992). Those tests are of fundamental advantage as their average sample size is considerably smaller than that of the corresponding tests with a planned and therefore fixed sample size. Accordingly, a sequential test strategy for the given null-hypothesis is derived in the following.
Method
As indicated, H0: 0 < ρ ≤ ρ0 is the matter of interest (for simplicity we do not deal in detail with the analogous problem H0: ρ0 ≤ ρ < 0). Given that the distribution of a two-dimensional quantitative vector of random variables (x, y)¹ is normal, the finite second moments σx², σy², and σxy, and the correlation coefficient ρ = σxy/(σxσy) exist. Then the null-hypothesis is to be tested against the alternative hypothesis H1: ρ0 < ρ < 1 with a type-I-risk α, i.e. the probability of wrongly rejecting H0, and a type-II-risk β, i.e. the probability of wrongly accepting H0 (in particular as long as ρ ≥ ρ1, with a ρ1 > ρ0 to be fixed in advance).

The empirical correlation coefficient r and Fisher’s z-statistic.
The empirical correlation coefficient r = sxy/(sx·sy), based on n observations (xi, yi) (i = 1, …, n), i.e. realizations of (x, y), offers an estimate of the parameter ρ (sxy, sx², sy² are the empirical covariance and variances, respectively). When using r as a test statistic we refer to the fact that the distribution of r, i.e. of the corresponding random variable, was derived by R. A. Fisher (1915) under the assumption of a bivariate normal distribution of (x, y). He proved that the distribution of r then depends only on n and ρ. Later, Fisher (1921) suggested the transformed value


z = ln((1 + r)/(1 − r)) as a test statistic and proved that the distribution of the corresponding random variable z = ln((1 + r)/(1 − r)) is approximately normal even if n is rather small. More concretely, Cramer (1946) proved that for n = 10 the approximation is actually sufficient as long as –0.8 ≤ ρ ≤ 0.8. The expectation E of z, being a function ζ of ρ, and the variance of z amount to

E(z) = ζ(ρ) = ln((1 + ρ)/(1 − ρ)) + ρ/(n − 1),   var(z) ≈ 4/(n − 3)        (1).

The statistic z can be used to test, for data with a fixed sample size n, the hypothesis H0: ρ ≤ ρ0 against the alternative hypothesis H1: ρ > ρ0 (respectively H0: ρ ≥ ρ0 against H1: ρ < ρ0).


For the sequential test the observations are collected in consecutive sub-samples of size k > 3 each. For each sub-sample j (j = 1, 2, …, m) we calculate a statistic whose distribution is only a function of ρ and k. This is again Fisher’s statistic z = ln((1 + r)/(1 − r)). As a triangular test must be based on a statistic with expectation 0, given the null-hypothesis, we transform z into the standardized variable

z* = (z − ζ(ρ0)) · √(k − 3)/2 = (z − ln((1 + ρ0)/(1 − ρ0)) − ρ0/(k − 1)) · √(k − 3)/2        (2)

which is (for not too small values of k) approximately normally distributed with variance 1 and the expectation:

  Ez *        0 

1  0   0  k  3 k  3  1   ln  ln   2 1  0 k 1  2  1 

(3).

Now θ is used as a specific parameter; that is, θ replaces ρ as the test parameter. For ρ = ρ0 the parameter θ is 0 (as demanded). For ρ = ρ1 we obtain:

 1  1 1   0 1   0  k  3  ln   1  0 k 1  2  1  1

1  ln

(4)
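Formulas (1) to (4) translate directly into a few lines of R. The following helper functions are a sketch under our own naming (zeta, z.star, theta1), not code from the authors’ program.

# zeta() is the expectation of z = ln((1 + r)/(1 - r)) from a sub-sample of size k,
# z.star() the standardized statistic (2), theta1() the standardized effect (4).
zeta   <- function(rho, k) log((1 + rho) / (1 - rho)) + rho / (k - 1)
z.star <- function(r, rho0, k) (log((1 + r) / (1 - r)) - zeta(rho0, k)) * sqrt(k - 3) / 2
theta1 <- function(rho0, rho1, k) (zeta(rho1, k) - zeta(rho0, k)) * sqrt(k - 3) / 2
# e.g. zeta(0.5, 12) = 1.144, zeta(0.6, 12) = 1.441, theta1(0.5, 0.6, 12) = 0.445 (cf. Example 1)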

The difference δ = ρ1 − ρ0 is the practically relevant difference which should be detected with the power 1 − β. Given m independent realizations z1*, …, zm* of z*, the log-likelihood ln L(z1*, …, zm*; θ) is the logarithm of the probability density of m independent normally distributed random variables with expectation θ and variance 1: ln L(z1*, …, zm*; θ) = −(m/2)·ln(2π) − (1/2)·Σj=1,…,m (zj* − θ)². This leads to:


Zm = ∂ ln L(z1*, …, zm*; θ)/∂θ |θ=0 = Σj=1,…,m zj*   and   Vm = −∂² ln L(z1*, …, zm*; θ)/∂θ² = m        (5).

From each sub-sample j we now calculate the sample correlation coefficient rj as well as its transformed values zj = ln((1 + rj)/(1 − rj)) and zj* = (zj − ln((1 + ρ0)/(1 − ρ0)) − ρ0/(k − 1)) · √(k − 3)/2 (j = 1, 2, …, m); in the following we recommend to choose k = 12. Now the sequential path is defined by the points (Vm, Zm) for m = 1, 2, …, up to the maximum of V or until a decision can be reached. The continuation region is a triangle whose three sides depend on α, β, and θ1 via

 z1 1  z1 a

  1   ln    2 

1

and

(6)

c

1  z 21  1  z1

  

,

with the percentiles zP of the standard normal distribution. That is, one side of the looked-for triangle lies between –a and a on the ordinate of the (V, Z) plane (V = 0). The two other borderlines are defined by the lines L1: Z = a+cV and L2: Z = -a+3cV, which intersect at

(Vmax = a/c, Zmax = 2a)        (7).

The maximum sample size is of course k·Vmax. If θ = θ1 > 0 we get a > 0 and c > 0, and if θ1 < 0 we get a < 0 and c < 0. The decision rule now is: Continue sampling as long as −a + 3cVm < Zm < a + cVm if θ1 > 0, or −a + 3cVm > Zm > a + cVm if θ1 < 0; accept H1 as soon as Zm reaches or exceeds L1, and accept H0 as soon as Zm reaches or falls below L2. If the point (Vmax = a/c, Zmax = 2a) is reached, H1 is to be accepted.
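This decision rule can be put into a short R function. The following is a minimal sketch for the case θ1 > 0 (H1: ρ > ρ0) with our own function and argument names (seq.tri.cor; r.sub is the vector of sub-sample correlations in sampling order); it is not the authors’ program.

# Sequential triangular test of H0: rho <= rho0 against H1: rho > rho0, following (2)-(7).
seq.tri.cor <- function(r.sub, rho0, rho1, alpha = 0.05, beta = 0.2, k = 12) {
  zeta   <- function(rho) log((1 + rho) / (1 - rho)) + rho / (k - 1)
  theta1 <- (zeta(rho1) - zeta(rho0)) * sqrt(k - 3) / 2
  h  <- 1 + qnorm(1 - beta) / qnorm(1 - alpha)
  a  <- h * log(1 / (2 * alpha)) / theta1              # intercept, see (6)
  cc <- theta1 / (2 * h)                               # slope, see (6)
  Z <- 0
  for (m in seq_along(r.sub)) {                        # walk along the path (V_m, Z_m)
    z <- log((1 + r.sub[m]) / (1 - r.sub[m]))
    Z <- Z + (z - zeta(rho0)) * sqrt(k - 3) / 2        # Z_m = sum of z*_j; V_m = m, see (5)
    if (Z >= a + cc * m)                               # reaches or exceeds L1 (checked first,
      return(list(decision = "accept H1", m = m, Z = Z, V = m, a = a, c = cc))  # so the apex gives H1)
    if (Z <= -a + 3 * cc * m)                          # reaches or falls below L2
      return(list(decision = "accept H0", m = m, Z = Z, V = m, a = a, c = cc))
  }
  list(decision = "continue sampling", m = length(r.sub), Z = Z, V = length(r.sub), a = a, c = cc)
}

Because the two borderlines meet at V = a/c, one of the two conditions necessarily triggers at the latest when Vm reaches Vmax, so the loop always terminates with a decision if enough sub-samples are supplied.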

Example 1
We want to test the null hypothesis ρ ≤ .5 against the alternative hypothesis ρ > .5 with α = .05, β = .2, and δ = ρ1 − ρ0 = .6 − .5 = .1. We use k = 12. We hence obtain

 1  0.5  0.5  1.144   1  0.5  11

 0.5  ln and

 1  0.6  0.6 = 1.441.   1  0.6  11

 0.6  ln Because k  3 = 3 we obtain

1 

3 1.4411.144  0.445 . 2

According to (6) we find z.8 = 0.842 and z.95 = 1.645, and hence

a = (1 + 0.842/1.645) · ln(1/0.1) / 0.445 = 7.819

and

c = 0.445 / (2 · (1 + 0.842/1.645)) = 0.147.

From (7) we get Vmax = a/c = 53.104 and Zmax = 2a = 15.638.

Let us assume that the first two correlation coefficients calculated from 12 pairs each are r1 = .42 and r2 = .51; this leads to z1 = 0.895 and z2 = 1.125, and from zj* = (zj − 1.144)·(3/2) we obtain z1* = −0.374 and z2* = −0.028. The points (V1 = 1, Z1 = −0.374) and (V2 = 2, Z2 = −0.402) lie inside the continuation region, as shown in Figure 1.

– Insert Figure 1 about here –
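Using the sketch given after the decision rule (seq.tri.cor, our own naming), the path of Example 1 can be traced as follows; the resulting boundary values agree with a = 7.819 and c = 0.147 up to rounding.

# Assumes seq.tri.cor() from the sketch above has been sourced.
ex <- seq.tri.cor(r.sub = c(0.42, 0.51), rho0 = 0.5, rho1 = 0.6,
                  alpha = 0.05, beta = 0.2, k = 12)
ex$decision    # "continue sampling": Z_2 is about -0.40 and still inside the triangle
c(ex$a, ex$c)  # about 7.819 and 0.147, as computed above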

Simulation study
By a simulation study we will check whether, in the case of normally distributed random variables, the nominal risks actually hold and which values of k offer a sufficient approximation. The feasibility² of the sequential triangular test for hypotheses on the correlation coefficient was tested by simulated paths (Z, V) generated from bivariate normally distributed random numbers x and y with means μx = μy = 0, variances σx² = σy² = 1, and covariance σxy = ρ, so that the correlation coefficient equals ρ. Simulations were performed with the nominal risks αnom = .05 and βnom = .1 and .2, respectively, and several values of ρ0, ρ1, and k. For each parameter combination 100 000 paths were generated. The simulations as well as the calculation of the introduced sequential triangular test were done in R (R Core Team, 2013)³.
As criteria for the test quality we calculated: the relative frequency of wrongly accepting H1, given ρ = ρ0, which is an estimate of the actual type-I-risk, say αact; the relative frequency of rejecting H1, given ρ = ρ1, which is an estimate of the actual type-II-risk, say βact; the average number of sub-samples used for the calculation of r and z until data sampling is to stop (that is, until the path leaves the continuation region) for ρ0 and ρ1, respectively; as well as the average number of sample pairs (x, y), usually called the “average sample number” (ASN), for ρ0 and ρ1. Here, ASN is the mean number of sample pairs over all runs of the simulation study. Bear in mind that in a particular case the number of sample pairs needed for a terminal decision may lie either below or above the value ASN. The procedure is feasible (appealing, so to say) if αact ≤ α and βact ≤ β. Of course, the sequential test should also lead to E(ASN) < nfix, with nfix the sample size necessary in a fixed sample size test with power 1 − β, given ρ = ρ1, when testing the hypothesis ρ ≤ ρ0 with the type-I-risk α = .05. The results are shown in Table 1. Table 1 shows only those two values of k for which αact is just below and just above the value .05, with the exception of a single case where exactly αact = .05. Table 2 offers a summary of the whole simulation study for those k resulting in feasible values of αact and βact. The ASN strongly depends on the real value of ρ. As an illustration of the ASN’s dependency on ρ we give an example below.

– Insert Table 1 about here –

– Insert Table 2 about here –
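The following sketch indicates how such a simulation can be set up in base R (our own code, not the authors’ program mentioned in footnote 3; it assumes the function seq.tri.cor from the sketch above): bivariate normally distributed sub-samples with correlation ρ are generated until the path leaves the continuation region, and the acceptance frequency of H1 as well as the ASN are tallied.

# Estimate the actual type-I-risk (use rho = rho0) or the power (use rho = rho1)
# of the sequential triangular test by simulation; names are our own.
sim.tri.cor <- function(rho, rho0, rho1, alpha = 0.05, beta = 0.2, k = 12, runs = 10000) {
  accept.H1 <- logical(runs)
  n.used    <- numeric(runs)
  for (i in seq_len(runs)) {
    r.sub <- numeric(0)
    repeat {
      x <- rnorm(k)
      y <- rho * x + sqrt(1 - rho^2) * rnorm(k)   # bivariate normal pairs with correlation rho
      r.sub <- c(r.sub, cor(x, y))
      res <- seq.tri.cor(r.sub, rho0, rho1, alpha, beta, k)
      if (res$decision != "continue sampling") break
    }
    accept.H1[i] <- res$decision == "accept H1"
    n.used[i]    <- length(r.sub) * k             # number of sample pairs used in this run
  }
  list(rel.freq.H1 = mean(accept.H1), ASN = mean(n.used))
}
# e.g. an estimate of alpha_act: sim.tri.cor(rho = 0.5, rho0 = 0.5, rho1 = 0.7, beta = 0.1, k = 16)$rel.freq.H1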

Example 2: As an illustration we consider the special case of Table 1 with αact = .05, that is ρ0 = .6, ρ1 = .75, α = .05, β = .1, and k = 20. Simulating 10 000 runs each for the values ρ = .05, .1, .15, .2, .25, .3, .35, .4, .45, .5, .55, .65, .7, .8, .85, .9, .95 (in addition to the values .6 and .75 which are already given in Table 1), and calculating the ASN as well as the relative frequency of rejecting H0, resulted in Figure 2 and Figure 3. The relative frequencies of rejecting H0 disclose the empirical (actual) power function. Both curves were smoothed by least squares. As a matter of fact, the ASN-curve tends to 30 for ρ → 0 and to 20 for ρ → 1, and takes its maximum between ρ = .6 and ρ = .75.

– Insert Figure 2 about here –

– Insert Figure 3 about here –
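Curves of this kind can be approximated with the simulation sketch given above, evaluated over a grid of ρ values (again our own, hypothetical code, here with a reduced number of runs).

# Rejection rate of H0 and ASN over a grid of rho for the design of Example 2
# (rho0 = .6, rho1 = .75, alpha = .05, beta = .1, k = 20); assumes sim.tri.cor().
rho.grid <- seq(0.05, 0.95, by = 0.05)
out   <- lapply(rho.grid, sim.tri.cor, rho0 = 0.6, rho1 = 0.75,
                alpha = 0.05, beta = 0.1, k = 20, runs = 1000)
power <- sapply(out, "[[", "rel.freq.H1")   # empirical probability of rejecting H0
asn   <- sapply(out, "[[", "ASN")           # empirical average sample number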


To summarize: If one actually wants to check whether a (Pearson) correlation coefficient is large enough, we suggest first fixing a lower bound of the coefficient of determination, ρ0², as the value which must at least be reached in order to speak of a meaningful percentage of the variance of y which can be explained by x, and vice versa. Another value of the coefficient of determination, ρ1², is the one which the researcher does not want to overlook; if it holds, it should be detected with a probability of at least 1 − β. For example, one might decide for ρ0² = .5, as a consequence of which ρ0 = .7; suppose ρ1 = .8, α = .05, β = .2; then using k = 10 (by interpolation) a total sample size of about 80 will result on average, instead of 119 for a fixed sample size study. In Table 3 we give a part of the results of the empirical ASN in dependence on ρ. As can be seen, the maximum of the empirical ASN function lies between the ρ values of the two hypotheses but is still smaller than nfix.

– Insert Table 3 about here –
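For planning a study like the one sketched in the preceding paragraph (ρ0 = .7, ρ1 = .8, α = .05, β = .2, k = 10), the design constants of the triangular test and nfix can be computed as in the following sketch (our own code; the average sample size of about 80 quoted above is a simulation result and is not reproduced analytically here, only the maximum sample size and nfix are).

rho0 <- 0.7; rho1 <- 0.8; alpha <- 0.05; beta <- 0.2; k <- 10
zeta   <- function(rho) log((1 + rho) / (1 - rho)) + rho / (k - 1)
theta1 <- (zeta(rho1) - zeta(rho0)) * sqrt(k - 3) / 2
h  <- 1 + qnorm(1 - beta) / qnorm(1 - alpha)
a  <- h * log(1 / (2 * alpha)) / theta1
cc <- theta1 / (2 * h)
k * a / cc                                           # maximum sample size of the sequential test (about 268)
ceiling(((qnorm(1 - alpha) + qnorm(1 - beta)) /
         (atanh(rho1) - atanh(rho0)))^2 + 3)         # n_fix, about 119 as quoted above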


References
Cramer, H. (1946). Mathematical Methods of Statistics. Princeton: Princeton University Press.
Fisher, R. A. (1915). Frequency distribution of the values of the correlation coefficient in samples from an indefinitely large population. Biometrika, 10, 507-521.
Fisher, R. A. (1921). On the “probable error” of a coefficient of correlation deduced from a small sample. Metron, 1, 3-32.
Kubinger, K. D., Rasch, D., & Šimečkova, M. (2007). Testing a correlation coefficient’s significance: Using H0: 0 < ρ ≤ λ is preferable to H0: ρ = 0. Psychology Science, 49, 74-87.
Rasch, D., Kubinger, K. D., & Yanagida, T. (2011). Statistics in Psychology – Using R and SPSS. Chichester: Wiley.
R Core Team (2013). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. URL: http://www.R-project.org/.
Rasch, D., Pilz, J., Verdooren, R. L., & Gebhardt, A. (2011). Optimal experimental design with R. New York: Chapman & Hall/CRC.
Schneider, B. (1992). An interactive computer program for design and monitoring of sequential clinical trials. In Proceedings of the XVIth International Biometric Conference (pp. 237-250). Hamilton, New Zealand.
Wald, A. (1947). Sequential Analysis. New York: Wiley.
Whitehead, J. (1992). The Design and Analysis of Sequential Clinical Trials (2nd ed.). Chichester: Ellis Horwood.


Footnotes
¹ Random variables are printed in bold.
² “Feasibility” is used here as a technical term with respect to the given context: it concerns the corroboration of a method’s utility.
³ The program will be integrated into OPDOE (optimal design of experiments; see Rasch, Pilz, Verdooren, & Gebhardt, 2011). Interested users may request the source code from the authors.


Table 1
Simulation results for α = .05. Only feasible results and k ≤ 50 are given (symbols are explained in the text).

ρ0    ρ1    β    k    αact    βact    ASN|ρ0    ASN|ρ1    nfix
.5    .7    .1   12   .060    .043     74.2      72.2      88
.5    .7    .1   16   .049    .053     71.5      72.3      88
.5    .7    .2   12   .053    .114     55.7      62.1      65
.5    .7    .2   16   .042    .130     54.5      62.3      65
.6    .75   .1   20   .050    .049     90.0      90.0     113
.6    .75   .2   12   .063    .103     71.4      77.0      82
.6    .75   .2   16   .052    .112     67.9      76.1      82
.6    .8    .1   12   .052    .040     49.       48.2      56
.6    .8    .1   16   .041    .047     47.       49.1      56
.6    .8    .2   12   .048    .109     37.1      41.6      41
.6    .8    .2   16   .038    .117     37.0      42.8      41
.7    .8    .1   20   .064    .044    128.7     124.3     164
.7    .8    .1   50   .036    .053    131.8     137.0     164
.7    .8    .2   16   .064    .102     98.2     104.9     119
.7    .8    .2   20   .057    .112     96.1     105.5     119
.7    .9    .1    8   .058    .029     28.2      26.3      27
.7    .9    .1   12   .041    .039     25.8      25.9      27
.7    .9    .2    6   .066    .059     24.9      25.2      20
.7    .9    .2    8   .047    .085     21.2      23.1      20
.8    .9    .1   16   .054    .038     56.9      55.4      65
.8    .9    .1   20   .045    .046     56.2      56.4      65
.8    .9    .2   12   .059    .094     44.7      47.8      48
.8    .9    .2   16   .047    .110     43.3      48.4      48
.9    .95   .1   16   .058    .039     61.6      58.8      70
.9    .95   .1   20   .048    .040     60.8      60.3      70
.9    .95   .2   16   .051    .106     46.4      51.6      51
.9    .95   .2   20   .041    .108     46.9      52.7      51


Table 2
Feasible results for all simulated combinations of δ, ρ0, and β for α = .05

               β = .1              β = .2
ρ0     δ       k                   k
.5     .1      50                  20 < k < 50
.5     .15     20 < k < 50         20
.5     .2      12 < k < 16         12 < k < 16
.6     .1      20 < k < 50         20 < k < 50
.6     .15     20                  12 < k < 16
.6     .2      12 < k < 16         12 < k < 16
.7     .1      20 < k < 50         16 < k < 20
.7     .15     12 < k < 16         8 < k < 12
.7     .2      8 < k < 12          6