Educational and Psychological Measurement http://epm.sagepub.com

Tetrachoric Correlation: A Permutation Alternative
Michael A. Long, Kenneth J. Berry, and Paul W. Mielke, Jr.
Educational and Psychological Measurement 2009; 69; 429
Originally published online Oct 15, 2008
DOI: 10.1177/0013164408324463
The online version of this article can be found at: http://epm.sagepub.com/cgi/content/abstract/69/3/429

Published by: http://www.sagepublications.com


Downloaded from http://epm.sagepub.com at COLORADO STATE UNIV LIBRARIES on May 8, 2009

Tetrachoric Correlation: A Permutation Alternative

Educational and Psychological Measurement Volume 69 Number 3 June 2009 429-437 © 2009 SAGE Publications 10.1177/0013164408324463 http://epm.sagepub.com hosted at http://online.sagepub.com

Michael A. Long
Kenneth J. Berry
Paul W. Mielke, Jr.
Colorado State University

An exact permutation test is provided for the tetrachoric correlation coefficient. Comparisons with the conventional test employing Student’s t distribution demonstrate the necessity of using the permutation approach for small sample sizes and/or disproportionate marginal frequency totals.

Keywords: tetrachoric correlation coefficient; dichotomous variables; small sample; permutation

The tetrachoric correlation coefficient is a product-moment correlation coefficient between two bivariate normal variables, each of which is measured on a dichotomous scale (Pearson, 1900). Whereas the tetrachoric correlation coefficient is typically used to measure the correlation between two independent dichotomous variables, it also can be used to assess the reliability of a single rater when the same two raters independently rate n objects on a dichotomous scale (Bonett & Price, 2005; Fleiss, 1981). In addition, the tetrachoric correlation coefficient is often used to measure rater agreement (Uebersax, 2006) and is preferred by some researchers over Cohen’s kappa (Cohen, 1960) for this purpose (Hutchinson, 1993). Because of the extensive calculations necessary to compute the tetrachoric correlation coefficient, it has not been a popular statistic, despite its usefulness. With the advent of high-speed computing, the tetrachoric correlation coefficient has seen a resurrection in fields such as psychology, psychopathology, radiology, and genetics (Greer, Dunlap, & Beatty, 2003). Within psychology, examples of quantitative variables measured on a dichotomous scale include test items scored as correct/incorrect, assessment of students having/not having a learning disability, students passing/not passing a motor skills or other test, and children classified as having/not having attention deficit hyperactivity disorder or other emotional or behavioral problems.

Authors’ Note: Please address correspondence to Michael A. Long, Department of Sociology, Colorado State University, B258 Clark Building–1784, Fort Collins, CO 80523-1784; e-mail: mlong78@lamar.colostate.edu.


An asymptotic approximation to the standard error of the tetrachoric correlation was given by Pearson (1913). However, the accuracy and, therefore, the usefulness of Pearson’s standard error have repeatedly been called into question. For example, Kendall and Stuart (1961) noted that the sampling distribution and the standard error of the tetrachoric correlation coefficient are not known with any precision, and they noted further that it is not known for what sample size the standard error may safely be used. Permutation tests have long been used to assess the necessity of meeting the assumptions of asymptotic tests and the quality of the theoretical standard errors (Eden & Yates, 1933; Fisher, 1925; Geary, 1927; Pitman, 1937a, 1937b, 1938). This article introduces an exact permutation approach for the tetrachoric correlation and compares this new approach with the traditional asymptotic approach.

The tetrachoric correlation is, quite possibly, the single most difficult correlation coefficient to compute. Following Brown (1977), denote the four cell frequencies of a 2 × 2 contingency table as a, b, c, and d; the two row marginal frequency totals as (a + b) and (c + d); the two column marginal frequency totals as (a + c) and (b + d); and let n = a + b + c + d denote the total table frequency. Let $z_1$ and $z_2$ denote the standard normal deviates of the marginal probabilities, that is,

$$z_1 = \Phi^{-1}\left(\frac{a+c}{n}\right) \quad\text{and}\quad z_2 = \Phi^{-1}\left(\frac{a+b}{n}\right),$$

where $\Phi$ is the cdf of the standard normal distribution. Then, the tetrachoric correlation coefficient, $r_t$, is the correlation coefficient that satisfies

$$\frac{a}{n} = \int_{-\infty}^{z_2}\int_{-\infty}^{z_1} \phi(x_1, x_2, r_t)\,dx_1\,dx_2, \tag{1}$$

where $\phi(x_1, x_2, r_t)$ is the bivariate normal pdf given by

$$\phi(x_1, x_2, r_t) = \left[2\pi\left(1 - r_t^2\right)^{1/2}\right]^{-1} \exp\left[-\frac{x_1^2 - 2r_t x_1 x_2 + x_2^2}{2\left(1 - r_t^2\right)}\right],$$

and where $x_1 = z_1$ and $x_2 = z_2$ define the point that divides the bivariate normal distribution into four quadrants with probabilities corresponding to the probabilities of the four cells in the 2 × 2 contingency table (Castellan, 1966). When only one cell has zero frequency, the zero is changed to .5 and all other cell frequencies are correspondingly adjusted by .5 to maintain the original row and column marginal frequency totals. When $a = d = 0$, $r_t = -1$, and when $b = c = 0$, $r_t = +1$. When $z_1 = z_2 = 0$, an explicit solution exists where

$$r_t = -\cos\left(\frac{2\pi a}{n}\right).$$

In all other cases, $r_t$ must be found by iteration as a root of Equation 1. Pearson (1900) and Everitt (1910) approximated the bivariate normal integral by the tetrachoric series expansion

$$I = \int_{-\infty}^{z_2}\int_{-\infty}^{z_1} \phi(x_1, x_2, r_t)\,dx_1\,dx_2 = \left(\frac{a+b}{n}\right)\left(\frac{a+c}{n}\right) + \sum_{j=1}^{\infty} \frac{r_t^{\,j}}{j!}\,\phi(z_1, z_2, 0)\,v_{j-1}\,w_{j-1},$$

where $v_0 = 1$, $v_1 = z_1$, and $v_j = z_1 v_{j-1} - (j-1)v_{j-2}$ for $j > 1$, and $w_0 = 1$, $w_1 = z_2$, and $w_j = z_2 w_{j-1} - (j-1)w_{j-2}$ for $j > 1$, respectively.

The standard error of $r_t$ is given by

$$\sigma_t = \left[n^{3/2}\,\phi(z_1, z_2, r_t)\right]^{-1} \left[\frac{(a+d)(b+c)}{4} + (a+c)(b+d)\Phi_2^2 + (a+b)(c+d)\Phi_1^2 + 2(ad-bc)\Phi_1\Phi_2 - (ab-cd)\Phi_2 - (ac-bd)\Phi_1\right]^{1/2}, \tag{2}$$

where

$$\Phi_1 = \Phi\left[\frac{z_1 - r_t z_2}{\left(1 - r_t^2\right)^{1/2}}\right] - \frac{1}{2} \quad\text{and}\quad \Phi_2 = \Phi\left[\frac{z_2 - r_t z_1}{\left(1 - r_t^2\right)^{1/2}}\right] - \frac{1}{2}$$

(Pearson, 1913). Under the null hypothesis, $E[r_t] = 0$ and Equation 2 simplifies to

$$\sigma_0 = \left[n^{5/2}\,\phi(z_1, z_2, 0)\right]^{-1} \left[(a+b)(c+d)(a+c)(b+d)\right]^{1/2}.$$
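The computation of $r_t$ described above can be sketched in a few lines of Python. This is a minimal illustration rather than Brown's (1977) algorithm: it evaluates the integral in Equation 1 with a truncated tetrachoric series and locates the root by bisection. The function names, the 100-term truncation, the search bracket of ±.999, and the tolerance are all assumptions made for this sketch, and the series converges slowly as $|r_t|$ approaches 1. Tables with a zero marginal total are not handled.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def series_integral(z1, z2, r, terms=100):
    """Truncated tetrachoric series for the double integral in Equation 1."""
    phi0 = STD_NORMAL.pdf(z1) * STD_NORMAL.pdf(z2)   # phi(z1, z2, 0)
    total = STD_NORMAL.cdf(z1) * STD_NORMAL.cdf(z2)  # product of marginal proportions
    v_prev, v = 0.0, 1.0   # v_{j-2} and v_{j-1}, starting from v_0 = 1
    w_prev, w = 0.0, 1.0   # w_{j-2} and w_{j-1}, starting from w_0 = 1
    coef = 1.0             # r**j / j!, built up one factor at a time
    for j in range(1, terms + 1):
        coef *= r / j
        total += coef * phi0 * v * w
        # Recurrence from the text: v_j = z1 * v_{j-1} - (j - 1) * v_{j-2}
        v_prev, v = v, z1 * v - (j - 1) * v_prev
        w_prev, w = w, z2 * w - (j - 1) * w_prev
    return total


def tetrachoric(a, b, c, d, tol=1e-10):
    """Illustrative tetrachoric correlation for the table [[a, b], [c, d]]."""
    if a == 0 and d == 0:
        return -1.0
    if b == 0 and c == 0:
        return 1.0
    if 0 in (a, b, c, d):
        # Single zero cell: add .5 on its diagonal, subtract .5 off it,
        # which leaves the row and column totals unchanged.
        shift = 0.5 if (a == 0 or d == 0) else -0.5
        a, b, c, d = a + shift, b - shift, c - shift, d + shift
    n = a + b + c + d
    if 2 * (a + b) == n and 2 * (a + c) == n:
        # z1 = z2 = 0, so the explicit solution applies.
        return -math.cos(2.0 * math.pi * a / n)
    z1 = STD_NORMAL.inv_cdf((a + c) / n)
    z2 = STD_NORMAL.inv_cdf((a + b) / n)
    lo, hi = -0.999, 0.999  # assumed bracket; the integral increases with r
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if series_integral(z1, z2, mid) < a / n:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


print(tetrachoric(4, 1, 1, 4))  # explicit-solution case (n = 10, margins .5/.5)
print(tetrachoric(6, 2, 2, 0))  # one zero cell (n = 10, margins .8/.2)
```

For these two tables the sketch should agree closely with the values +.8090 and +.1222 reported in Table 1.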

It is well known that the sampling distributions of correlation coefficients sampled from populations with $\rho \neq 0$ are skewed. Because there is no provision for translating $\sigma_t$ into a z parameter that is symmetrically distributed, as is true for the Fisher z transformation for the ordinary Pearson product-moment correlation coefficient, the only reasonable approach is the assumption of the null hypothesis $H_0\colon \rho = 0$ (Guilford & Lyons, 1942). In any case, the Fisher z transformation for the Pearson product-moment correlation coefficient has been shown to perform poorly for skewed and heavy-tailed distributions (Berry & Mielke, 2000), which are prevalent in psychology (Micceri, 1989). Under $H_0\colon r_t = 0$,

$$T = \frac{r_t}{\sigma_0}$$

is distributed as Student’s t with n − 2 degrees of freedom (Everitt, 1910).
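The conventional test can be illustrated with a short Python sketch. It assumes that the quantity under the radical in $\sigma_0$ is the product of the four marginal frequency totals, $(a+b)(c+d)(a+c)(b+d)$, and it obtains the upper-tail Student t probability by Simpson's rule because the Python standard library does not provide a t distribution. The function names and the number of integration steps are assumptions of this illustration.

```python
import math
from statistics import NormalDist


def sigma_null(a, b, c, d):
    """Standard error of r_t under H0, computed from the marginal totals."""
    n = a + b + c + d
    nd = NormalDist()
    z1 = nd.inv_cdf((a + c) / n)
    z2 = nd.inv_cdf((a + b) / n)
    phi0 = nd.pdf(z1) * nd.pdf(z2)  # phi(z1, z2, 0)
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    return math.sqrt(margins) / (n ** 2.5 * phi0)


def t_upper_tail(t, df, steps=2000):
    """P(T >= t) for Student's t via Simpson's rule over [0, |t|]."""
    const = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2))
    const /= math.sqrt(df * math.pi)

    def pdf(x):
        return const * (1.0 + x * x / df) ** (-(df + 1) / 2)

    h = abs(t) / steps  # steps must be even for Simpson's rule
    area = pdf(0.0) + pdf(abs(t))
    for k in range(1, steps):
        area += (4 if k % 2 else 2) * pdf(k * h)
    area *= h / 3.0
    return 0.5 - area if t >= 0 else 0.5 + area


# Table 1 arrangement with n = 10, margins .5/.5, cell a = 4, r_t = +.8090
s0 = sigma_null(4, 1, 1, 4)
T = 0.8090 / s0
print(s0, T, t_upper_tail(T, df=10 - 2))
```

With these inputs the sketch gives $\sigma_0$ of about .497 and T of about 1.63, and the resulting upper-tail probability is consistent with the entry $P_t$ = .0710 in Table 1.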

Permutation Test

Permutation tests have advantages over conventional statistical tests because permutation tests are completely data dependent and are free of the usual assumptions associated with traditional asymptotic tests. Exact permutation tests consider all possible arrangements of the observed data. Given the fixed row and column marginal frequency totals in a 2 × 2 contingency table, it is necessary to generate only all possible values of a single cell, for example, cell a, where the lower and upper limits of cell a are given by

$$L = \max(0, a - d) \quad\text{and}\quad U = \min(a + b, a + c),$$

respectively. A tetrachoric correlation coefficient is then calculated for each of the $U - L + 1$ possible arrangements. Let $r_o$ denote the tetrachoric correlation coefficient calculated on the observed data and let $r_i$ denote a tetrachoric correlation coefficient calculated on each arrangement, $i = L, \ldots, U$. The probability (P) of $r_o$ is given by

$$P = \sum_{i=L}^{U} \psi(r_i),$$

where

$$\psi(r_i) = \begin{cases} P(a \mid a+b, a+c, n) & \text{if } r_i \geq r_o \\ 0 & \text{otherwise} \end{cases}$$

if $r_o$ is positive,

$$\psi(r_i) = \begin{cases} P(a \mid a+b, a+c, n) & \text{if } r_i \leq r_o \\ 0 & \text{otherwise} \end{cases}$$

if $r_o$ is negative, and the hypergeometric probability for a 2 × 2 contingency table is given by

$$P(a \mid a+b, a+c, n) = \frac{(a+b)!\,(c+d)!\,(a+c)!\,(b+d)!}{n!\,a!\,b!\,c!\,d!}.$$

If $r_o = 0$, then $P = 1.0$ because direction is undefined.

Table 1
Tetrachoric Correlation Coefficients ($r_t$) With Marginal Proportions, Cell Frequencies for Cell a, Exact P Values ($P_e$), P Values Based on the t Distribution ($P_t$), and Absolute Differences Between the P Values ($|P_e - P_t|$), With n = 10

Marginal
Proportion   Cell a   $r_t$      $P_e$     $P_t$    $|P_e - P_t|$
.5/.5        0        −1.0000    .0040     .0394    .0355
             1        −.8090     .1032     .0710    .0322
             2        −.3090     .5000     .2756    .2244
             3        +.3090     .5000     .2756    .2244
             4        +.8090     .1032     .0710    .0322
             5        +1.0000    .0040     .0394    .0355
.6/.4        2        −.6990     .0714     .1032    .0318
             3        −.3979     .4524     .2282    .2242
             4        +.2629     .5476     .3095    .2381
             5        +.7963     .1190     .0780    .0411
             6        +1.0000    .0048     .0424    .0376
.7/.3        4        −.3449     .2917     .2738    .0179
             5        +.0817     .7083     .4427    .2656
             6        +.7482     .1833     .1052    .0782
             7        +1.0000    .0083     .0531    .0448
.8/.2        6        +.1222     1.0000    .4273    .5727
             7        +.6060     .3778     .1877    .1901
             8        +1.0000    .0222     .0800    .0578
.9/.1        8        +.7366     1.0000    .2242    .7758
             9        +1.0000    .1000     .1554    .0554
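The exact test defined in this section can be sketched in Python as follows. The function `exact_p` enumerates cell a from L to U, recomputes the statistic for every arrangement, and accumulates the hypergeometric probabilities of the arrangements at least as extreme as the observed one. The names `hypergeom`, `exact_p`, and `r_equal_margins` are assumptions of this illustration. To keep the block self-contained, the statistic passed in here is a stand-in that uses the explicit solution $-\cos(2\pi a/n)$, which is valid only when both marginal proportions are .5/.5; any routine that returns $r_t$ for a 2 × 2 table could be substituted.

```python
import math


def hypergeom(a, b, c, d):
    """Hypergeometric probability of the table [[a, b], [c, d]] given its margins."""
    n = a + b + c + d
    return math.comb(a + b, a) * math.comb(c + d, c) / math.comb(n, a + c)


def exact_p(a, b, c, d, stat):
    """One-sided exact probability of the observed statistic `stat(a, b, c, d)`."""
    r_obs = stat(a, b, c, d)
    if abs(r_obs) < 1e-12:
        return 1.0  # direction is undefined when r_o = 0
    row1, col1, n = a + b, a + c, a + b + c + d
    lower, upper = max(0, a - d), min(row1, col1)
    p = 0.0
    for i in range(lower, upper + 1):
        # Cell a = i fixes the other three cells through the margins.
        cells = (i, row1 - i, col1 - i, n - row1 - col1 + i)
        r_i = stat(*cells)
        as_extreme = r_i >= r_obs if r_obs > 0 else r_i <= r_obs
        if as_extreme:
            p += hypergeom(*cells)
    return p


def r_equal_margins(a, b, c, d):
    """Stand-in statistic: explicit solution, valid only for .5/.5 margins."""
    return -math.cos(2.0 * math.pi * a / (a + b + c + d))


# Table 1, n = 10, margins .5/.5
print(exact_p(4, 1, 1, 4, r_equal_margins))  # cell a = 4
print(exact_p(3, 2, 2, 3, r_equal_margins))  # cell a = 3
```

For cell a = 4 the sum is 26/252, and for cell a = 3 it is 126/252, matching the $P_e$ entries .1032 and .5000 in Table 1.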

Discussion

The conventional approach to establish a probability value for $r_t$ under $H_0$ employs the Student t distribution with n − 2 degrees of freedom. The permutation approach provides an alternative to the t distribution that is distribution free. Tables 1 and 2 present comparisons between exact probability values ($P_e$) and probability values based on the Student t distribution ($P_t$) for a variety of marginal proportions and n = 10 and 50, respectively. The marginal proportions in Tables 1 and 2 range from .5/.5 to .9/.1 and are identical for both rows and columns. For example, given n = 10 and marginal proportions of .6/.4 in Table 1, the row and column marginal frequency totals are {6, 4} and {6, 4}, respectively.

Table 2
Tetrachoric Correlation Coefficients ($r_t$) With Marginal Proportions, Cell Frequencies for Cell a, Exact P Values ($P_e$), P Values Based on the t Distribution ($P_t$), and Absolute Differences Between the P Values ($|P_e - P_t|$), With n = 50

Marginal
Proportion   Cell a   $r_t$      $P_e$     $P_t$    $|P_e - P_t|$
.5/.5        0        −1.0000    .0000     .0000    .0000
             5        −.8090     .0000     .0003    .0003
             10       −.3090     .1289     .0853    .0436
             15       +.3090     .1289     .0853    .0436
             20       +.8090     .0000     .0003    .0003
             25       +1.0000    .0000     .0000    .0000
.6/.4        10       −.9087     .0000     .0001    .0001
             15       −.3979     .0692     .0433    .0260
             20       +.2629     .1883     .1267    .0617
             25       +.7963     .0000     .0005    .0005
             30       +1.0000    .0000     .0000    .0000
.7/.3        20       −.7378     .0014     .0021    .0007
             25       +.0817     .4925     .3704    .1221
             30       +.7482     .0005     .0019    .0014
             35       +1.0000    .0000     .0001    .0001
.8/.2        30       −.4662     .0825     .0565    .0261
             35       +.6060     .0181     .0205    .0024
             40       +1.0000    .0000     .0006    .0006
.9/.1        40       .0000      1.0000    .5000    .5000
             45       +1.0000    .0000     .0097    .0097

Approximating a skewed discrete probability distribution with a symmetrical continuous distribution (i.e., t) is fraught with difficulties. To illustrate the problems with the use of $P_t$ in tetrachoric correlation, consider the difference between $P_e$ and $P_t$ in Table 1 for n = 10 and .8/.2 marginal proportions. Given the row and column marginal frequency totals, {8, 2} and {8, 2}, respectively, there are only three possible arrangements of cell frequencies (i.e., a = 6, 7, and 8), with $r_t$ values of +.1222, +.6060, and +1.0000, respectively. It is obvious that the upper-tail probability of $r_t = +.1222$ must be 1.0, because all three coefficients are positive and $r_t = +.1222$ is the smallest of the three coefficients. Thus, the probability of an $r_t$ value this large or larger is 1.0. However, the t probability is only $P_t = .4273$, indicating that only 43% of possible $r_t$ values are as large or larger than $r_t = +.1222$, yielding a difference of $|P_e - P_t| = .5727$.
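The .8/.2 example can be checked by direct enumeration. The Python sketch below lists the three admissible values of cell a for marginal totals {8, 2} and {8, 2}, computes their hypergeometric probabilities, and sums them from each observed value upward. The variable names are assumptions of this illustration, and the upper-tail sums rely on the fact, visible in Table 1, that $r_t$ increases with cell a for these margins.

```python
from math import comb

row1, col1, n = 8, 8, 10          # first row total, first column total, table total
lower = max(0, row1 + col1 - n)   # smallest admissible value of cell a
upper = min(row1, col1)           # largest admissible value of cell a

# Hypergeometric probability of each admissible value of cell a
probs = {a: comb(row1, a) * comb(n - row1, col1 - a) / comb(n, col1)
         for a in range(lower, upper + 1)}

# r_t is +.1222, +.6060, +1.0000 for a = 6, 7, 8 (Table 1), so the
# probability of an r_t this large or larger sums over cells >= a.
upper_tail = {a: sum(p for k, p in probs.items() if k >= a) for a in probs}

for a in probs:
    print(a, round(probs[a], 4), round(upper_tail[a], 4))
```

The three probabilities are 28/45, 16/45, and 1/45, so the upper-tail sums are 1, 17/45, and 1/45, in agreement with the $P_e$ entries 1.0000, .3778, and .0222 in Table 1.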


Table 3
$|P_e - P_t|$ Values for Five Marginal Proportions With Common $r_t$ Values for n = 10, 20, 50, 100, 500, and 1000

               Marginal Proportion ($r_t$)
n        .5/.5       .6/.4       .7/.3       .8/.2       .9/.1
         (+.3090)    (+.2629)    (+.0817)    (+.6060)    (+.2649)
10       .2244       .2381       .2656       .1901       —
20       .1329       .1513       .1948       .0616       —
50       .0436       .0617       .1221       .0024       .1611
100      .0097       .0200       .0819       .0012       .0781
500      .0000       .0000       .0222       .0000       .0032
1000     .0000       .0000       .0083       .0000       .0010

Table 4
Marginal Proportions, Cell Frequencies for Cell d, Tetrachoric Correlation Coefficients ($r_t$), Exact Skewness Values ($\gamma_t$), Exact P Values ($P_e$), P Values Based on the t Distribution ($P_t$), and Absolute Differences Between the P Values ($|P_e - P_t|$)

Marginal
Proportion   Cell d   $r_t$     $\gamma_t$   $P_e$     $P_t$    $|P_e - P_t|$
.50/.50      0        −1.00     .0000        .0011     .0260    .0249
.60/.40      3        −.79      .0296        .0168     .0399    .0231
.70/.30      8        −.58      .2040        .0775     .0771    .0004
.80/.20      18       −.34      .5730        .2267     .1845    .0422
.90/.10      48       −.06      1.1653       .5159     .4405    .0754
.94/.06      88       +.09      1.5907       1.0000    .4118    .5882
.95/.05      108      +.13      1.7570       1.0000    .3733    .6267
.96/.04      138      +.18      1.9766       1.0000    .3355    .6645
.97/.03      188      +.23      2.2952       1.0000    .2993    .7007
.98/.02      288      +.30      2.8267       1.0000    .2671    .7329
.99/.01      588      +.39      4.0270       1.0000    .2461    .7539

It is evident from even a cursory inspection of Tables 1 and 2 that two factors are contributing to the poor performance of the t distribution: sample size and disproportionate marginal frequency totals. Tables 3 and 4 examine these two factors, respectively.

Table 3 presents $|P_e - P_t|$ values for five marginal proportions (.5/.5, .6/.4, .7/.3, .8/.2, and .9/.1) with common $r_t$ values for n = 10, 20, 50, 100, 500, and 1000. As in Tables 1 and 2, the marginal proportions in Table 3 are identical for both rows and columns. The $r_t$ values for each marginal proportion were chosen simply for convenience and provide illustrations of the effect of increasing sample sizes on $|P_e - P_t|$ differences. Inspection of the column values in Table 3 reveals that $|P_e - P_t|$ is large with small sample sizes up to n = 50 and in some cases even for n = 100. The entries for n = 10 and 20 under the marginal proportions of .9/.1 in Table 3 are blank because there are too few choices with those sample sizes and marginal proportions.

Table 4 presents $|P_e - P_t|$ values for 11 marginal proportions (.5/.5, .6/.4, .7/.3, .8/.2, .9/.1, .94/.06, .95/.05, .96/.04, .97/.03, .98/.02, and .99/.01) and demonstrates the effect of increasingly unequal marginal proportions on $|P_e - P_t|$. As in Tables 1, 2, and 3, the marginal proportions in Table 4 are identical for both rows and columns. For Table 4, a series of 2 × 2 contingency tables was constructed with cell a = 0, cells b and c = 6, and cell d as listed in the second column of Table 4. Examination of Table 4 shows that as the marginal proportions become more unequal, $|P_e - P_t|$ increases. This indicates that $P_t$ becomes more inaccurate with increasingly unequal marginal proportions, despite increasing sample sizes as indicated in the second column of Table 4.

This relationship is related to the level of skewness ($\gamma_t$) of $r_t$. The values of $\gamma_t$ in Table 4 were obtained from the permutation distribution generated from all possible arrangements of cell frequencies with fixed marginal frequency totals. A comparison of the first and fourth columns of Table 4 demonstrates the relationship between skewness and marginal proportions. As the marginal proportions become increasingly unequal, $\gamma_t$ increases. The relationship can be summarized as follows. Given a 2 × 2 contingency table where either b = c or a = d and n is much larger than a + b = m, then the approximate skewness of $r_t$ is either $n^{1/2}/m$ or $-n^{1/2}/m$, respectively. Thus, the skewness of the tetrachoric correlation’s distribution may be arbitrarily large in either the positive or negative direction. Therefore, as $\gamma_t$ increases, $|P_e - P_t|$ increases.

Under $H_0\colon r_t = 0$, the test statistic ($T = r_t/\sigma_0$) is asymptotically distributed as the Student t distribution with n − 2 degrees of freedom. The use of the standard error, $\sigma_0$, and the t distribution has been called into question by Kendall and Stuart (1961). In particular, the appropriateness of the t distribution is problematic for small samples and/or widely disproportionate marginal frequency totals. An exact permutation test provides a data-dependent distribution-free alternative that is unaffected by small sample sizes and disproportionate marginal frequency distributions.
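As a quick numerical check of the skewness approximation, the following Python sketch evaluates it for three of the Table 4 configurations (a = 0, b = c = 6). It reads the approximation as $n^{1/2}/m$ with m = a + b; that reading, the function name, and the choice of rows are assumptions of this illustration, and the tabled skewness values are copied from Table 4 rather than recomputed.

```python
import math


def approx_skewness(n, m):
    """Approximate skewness of r_t when b = c and n is much larger than m = a + b."""
    return math.sqrt(n) / m


# (cell d, exact skewness reported in Table 4) for tables with a = 0, b = c = 6
rows = [(88, 1.5907), (288, 2.8267), (588, 4.0270)]
for d, tabled in rows:
    n = 0 + 6 + 6 + d   # total frequency
    m = 0 + 6           # a + b
    print(f"n={n:4d}  approx={approx_skewness(n, m):.4f}  tabled={tabled:.4f}")
```

For n = 600 the approximation gives roughly 4.08 against the tabled 4.0270, and the relative gap is somewhat larger for the n = 100 row, which is in line with the requirement that n be much larger than m.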

References

Berry, K. J., & Mielke, P. W. (2000). A Monte Carlo investigation of the Fisher z transformation for normal and nonnormal distributions. Psychological Reports, 87, 1101-1114.
Bonett, D. G., & Price, R. M. (2005). Inferential methods for the tetrachoric correlation coefficient. Journal of Educational and Behavioral Statistics, 30, 213-225.
Brown, M. B. (1977). Algorithm AS 116: The tetrachoric correlation and its asymptotic standard error. Applied Statistics, 26, 343-351.
Castellan, N. J., Jr. (1966). On the estimation of the tetrachoric correlation coefficient. Psychometrika, 31, 67-73.
Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20, 37-46.
Eden, T., & Yates, F. (1933). On the validity of Fisher’s z test when applied to an actual example of non-normal data. Journal of Agricultural Statistics, 33, 6-17.
Everitt, P. F. (1910). Tables of the tetrachoric functions for fourfold correlation tables. Biometrika, 7, 437-451.
Fisher, R. A. (1925). Statistical methods for research workers. Edinburgh: Oliver and Boyd.
Fleiss, J. L. (1981). Statistical methods for rates and proportions (2nd ed.). New York: John Wiley.
Geary, R. C. (1927). Some properties of correlation and regression in a limited universe. Metron Rivista Internazionale de Statistica, 7, 83-119.
Greer, T., Dunlap, W. P., & Beatty, G. O. (2003). A Monte Carlo evaluation of the tetrachoric correlation coefficient. Educational and Psychological Measurement, 63, 931-950.
Guilford, J. P., & Lyons, T. C. (1942). On determining the reliability and significance of a tetrachoric coefficient of correlation. Psychometrika, 7, 243-249.
Hutchinson, T. P. (1993). Kappa muddles together two sources of disagreement: Tetrachoric correlation is preferable. Research in Nursing & Health, 16, 313-315.
Kendall, M. G., & Stuart, A. (1961). The advanced theory of statistics (Vol. 2). New York: Hafner.
Micceri, T. (1989). The unicorn, the normal curve, and other improbable creatures. Psychological Bulletin, 105, 156-166.
Pearson, K. (1900). Mathematical contributions to the theory of evolution. VII. On the correlation of characters not quantitatively measurable. Philosophical Transactions of the Royal Society of London, 195A, 1-47.
Pearson, K. (1913). On the probable error of a coefficient of correlation as found from a fourfold table. Biometrika, 9, 22-27.
Pitman, E. J. G. (1937a). Significance tests which may be applied to samples from any populations. Supplement to the Journal of the Royal Statistical Society, 4, 119-130.
Pitman, E. J. G. (1937b). Significance tests which may be applied to samples from any populations, II: The correlation coefficient test. Supplement to the Journal of the Royal Statistical Society, 4, 225-232.
Pitman, E. J. G. (1938). Significance tests which may be applied to samples from any populations, III: The analysis of variance test. Biometrika, 29, 322-335.
Uebersax, J. S. (2006). The tetrachoric and polychoric correlation coefficients. Retrieved December 20, 2007, from http://ourworld.compuserve.com/homepages/jsuebersax/tetra.htm
