Interpreting Interactions and Interaction Hierarchies in ... - CiteSeerX

19 downloads 0 Views 348KB Size Report
Their contribution analyzes the papers of Justices Thurgood Marshall .... experience and knowledge in this area of the law, and minority status. .... of this public policy area is startling: 45.6 million public school students in the 1996-7 school.
Interpreting Interactions and Interaction Hierarchies in Generalized Linear Models: Issues and Applications

Jeff Gill University of Florida [email protected]

Abstract There is substantial confusion in political science and related literatures about the meaning and interpretation of interaction effects in models with non-interval, non-normal outcome variables. Often these terms are casually thrown into the model specification without observing that their presence fundamentally changes the interpretation of the resulting coefficients. This article addresses this lack of clarity and rigor by explaining the conditional nature of reported coefficients and their standard errors in models with interactions, defining the necessarily different interpretation of interactions in generalized linear models, and introducing useful hierarchies of interaction effects in these specifications. Unfortunately, there is little written about specifying the uncertainty of these interaction effects through reported standard errors, and much of which is currently reported is actually misleading. This article fills a gap in the methodological literature by providing a general analytical method for correctly calculating coefficient standard errors in models with second-order or higher interactions and complex interaction specifications. The methodology is demonstrated with applications to current work in political science. These examples demonstrate the utility of interaction hierarchy specifications in generalized linear models by providing analyses of data from comparative politics, judicial decision-making, and education public policy. Presented at the American Political Science Association Annual Meeting, San Francisco, September 2001. Thanks to Ryan Bakker, Neal Beck, Bill Berry, Michael Chege, Goran Hyden, Rob Franzese, Gary King, Jonathan Nagler, and Nick Theobald for comments on an earlier draft. Thanks as well to Forrest Maltzman and Paul Wahlbeck for sharing their carefully collected data. All data, codebooks, and computer code for replicating the examples are available at the author’s webpage: .

Introduction Generalized linear models, limited dependent variable models in particular, are now quite common in empirical political science studies. Multiplicative interaction terms have been incorporated in linear model applications for many years. Yet there has been insufficient attention paid to the role of these interaction terms in such nonlinear forms in the political science literature. Exceptions include Berry (1999), Berry and Berry (1991), Frant (1991), King (1989), and Friedrich (1982). Furthermore, only a few authors have noted the additional complexity imposed by the conditional nature of interaction specifications in the course of their analyses (Nagler 1994, Kahn and Kenney 1997; Lacy and Paolino 1998; Moon and Dixon 1985) A central problem is that the interpretation of interactions is more complex in generalized linear models than in the basic linear construct, and it is not widely known that interaction effects are necessarily introduced into all generalized linear models by the link function. This essay reviews the meaning and interpretation of interactions in linear and generalized linear models, focusing on their conditional nature. It is shown that interactions are produced in generalized linear models, even when the researcher does not provide a specific interaction parameterization. Failure to account for these effects can easily produce models that are misleading and wrong. A general method for correctly calculating conditional standard errors for these terms is shown, and higher order interaction models are introduced. There is currently no other broadly applicable technique available for obtaining correct standard errors in hierarchical interaction specifications for generalized linear models.

Interactions in Linear Models The typical treatment of interactions in linear models is to consider the interaction as a product term of the main effects variables.1 This takes the form Yi = β0 + β1 Xi1 + β2 Xi2 + β3 Xi1 Xi2

(1)

where β3 is the coefficient estimate corresponding to the product. The complete product term, β3 Xi1 Xi2 , is called a first-order interaction or sometimes a two-factor interaction, where for obvious reasons the order is one less than the number of factors. Subject to the standard Gauss-Markov 1

Unfortunately, the term “main effect” is not very revealing or accurate. The contribution provided by a nonmultiplicative explanatory variable specification for a variable that is specified as interacting with others, is the contribution for that variable at zero levels of the interacting variables. There is no mathematically implied primacy and a more accurate term would be “nonmultiplicative” or “single contribution.” However, I will stay consistent with an overwhelming number of authors, dating from as far back as Fisher and Mackenzie (1923), and begrudgingly use the term “main effect.” Because the term is so pervasive across many disciplines in the social, behavioral, and natural sciences, there is unfortunately little chance of affecting a change.

1

conditions, the sampling distribution of β3 is Student’s-t with N − k degrees of freedom. There is no requirement for the form of this interaction term to be a product of the main effects and others have β1 β2 been suggested, such as β0 Xi1 Xi2 (Wonnacott and Wonnacott 1970) and βj Xi1 /Xi2 (Allison 1977).2

The meaning of interactions in the linear model is actually easier to interpret if (1) is rearranged as follows:

Yi = β0 + β1 Xi1 + (β2 + β3 Xi1 )Xi2 .

(2)

If we are interested in the effect that the explanatory variable Xi2 has on the outcome variable, it is sufficient to take the first derivative of (2) with respect to this variable to obtain the marginal effect as a composite coefficient estimate:

∂Yi = β2 + β3 Xi1 . (3) ∂Xi2 This is useful because it demonstrates that the effect of levels of Xi2 on the outcome variable are intrinsically tied to specific levels of Xi1 : the marginal contribution of Xi2 is conditional on Xi1 . Two scenarios can occur, one when high levels of one variable have an accelerating effect on the

other (β3 has the same sign as β2 ), and the other when high levels of one variable have a dampening effect on the other (β3 has the opposite sign of β2 ). So the sign on first-order interaction effects tells us quite a bit about the conditional effect that a given explanatory variable has on the outcome variable. The interpretation of a given coefficient’s effect is now complicated by the requirement that it occur at a specified level of the other explanatory variable. Often there are theoretically important levels of Xi1 that can be substituted into (3). In the absence of some theory-driven level or point of particular interest, the quantiles of the interaction explanatory variable provide convenient points of analysis. It is important to remember that including an interaction component in linear models (as well as those discussed below) but omitting the corresponding marginal main effects produces nonsensical models (see Nelder (1977) for a thorough critique). This is true even if a main effect is not statistically significant and even if the researcher has an a prior hypothesis that the main effect is not relevant (Cox 1984). In terms of equation (1), such an ill-advised specification would be Yi = β0 + β3 Xi1 Xi2 . This form is equivalent to the assertion that there exist two explanatory effects in which neither had any influence on the outcome variable in the absence of the other (Wood and Games 1990). If this is really the case, then essentially Xi1 Xi2 is a single variable, not an interaction term between two 2 Actually all such models are subsets of polynomial regression specifications in which (sometimes high degree) polynomial forms operate on the right-hand side of expressions like (1) to make local approximations to continuous, multidimensional response surfaces. For extended discussions, see Box and Draper (1987), Khuri and Cornell (1987), and Cornell and Montgomery (1996).

2

variables.3 Omitting lower-order terms also means that common transformations, such as centering and standardization, actually substantively change the coefficient values: violating the principle of noninvariance (Allison 1977; Friedrich 1982). In addition, the variance of the interaction term is interpretable only in the context of the variance of its main effect components (Jaccard & Wan 1995). Despite these well-documented problems, examples of models omitting main effects but including the associated higher-order terms can be found in recent political science work.4

Interactions in Generalized Linear Models The statistical treatment of interaction effects in generalized linear models, dichotomous choice models in particular, is a well-developed tool in bioassay and epidemiology, where researchers are often concerned with the effects of combinations of various behaviors and exposures on disease rates. A typical example is the apparent interaction of exposure to some airborne industrial toxins with smoking as a contributor to lung cancer risk (Hertz-Picciotto, et al. 1992). In this context an interaction is defined as the statistical departure from an additive specification that affects the success rate (Assmann, et al. 1996; Hosmer and Lemeshow 1992; Pearce 1989). Here, the incidence of lung cancer is accelerated by the combination of smoking and exposure in a non-linear fashion that cannot be adequately modeled by the additive marginal effects of smoking and exposure alone. This is typical of the type of effect that political scientists are attempting to model with multiplicative interaction terms. The Effect of the Link Function Most of the applications in this research area involve dichotomous or polychotomous explanatory factors and their corresponding interaction terms (Rothman 1986). However, the methodological issue is actually much broader, encompassing outcome variables treated as counts, operating on truncated sample spaces, and defined by specific discrete events. The family of generalized linear models encompasses most of these forms, and the approach to interactions in generalized linear models 3

Sometimes this is exactly the case and perfectly appropriate. For instance, including gross domestic product (GDP) without separate terms for price and quantity in a model specification implies that price and quantity are important only when considered together. A reader would be quite confused to see price×quantity in this situation as if the constituent parts of GDP had some significance beyond their joint contribution but were not worthy of consideration as individual explanatory variables. 4 A systematic review of articles published in the leading nonspecialized political science journals from 1993 to 1998 supports this point. This mistake appeared in a total of twelve articles: three in the American Political Science Review, three in the Journal of Politics, five in the American Journal of Political Science, and one in Political Research Quarterly.

3

provided here gives a uniform treatment for interactions in nearly all specified empirical models in political science. Interaction effects are more complicated in generalized linear models (a family which include dichotomous choice models) due to the link function between the systematic component and the outcome variable. In generalized linear models the systematic component is related to the mean of the outcome variable by a smooth, invertible function, g(), according to µ = Xβ

Y ) = g−1 (µ µ). E(Y

where

(4)

This is a very flexible arrangement and it allows the modeling of non-normal, bounded, and noncontinuous outcome variables in a manner that generalizes the standard linear model (Fahrmeir and Tutz 2001; Gill 2000; McCullagh and Nelder 1989). Using the link function we can change (2) to the more general form: E(Yi ) = g−1 (β0 + β1 Xi1 + (β2 + β3 Xi1 )Xi2 )

(5)

which obviously reduces to the linear model if g() is the identity function. Common forms of the µ ) = log(µ µ ) for Poisson link function for different assumed distributions of the outcome variable are g(µ µ µ) = − µ1 for modeling truncation at zero with the gamma, and logit (log( 1−µ treatment of counts, g(µ µ )),

µ)), or cloglog (log(−log(1 − µ ))) for dichotomous forms. probit (Φ−1 (µ A poorly understood ramification of interactions in generalized linear models is that by including a link function, we automatically specify interactions. To see that this is true, revisit the calculation of the marginal effect of a single coefficient by taking the derivative of (5) with regard to some variable of interest as we did for the linear model (3) but without an explicitly specified multiplicative term for the interaction. If the form of the model implied no interactions, then we would obtain a marginal effect free of other variables, but this is clearly not so: ∂Yi ∂ −1 = g (β0 + β1 Xi1 + β2 Xi2 ) = (g−1 )′ (β0 + β1 Xi1 + β2 Xi2 ) β2 . ∂Xi2 ∂Xi2

(6)

This just demonstrates that the presence of a link function requires the use of the chain rule and therefore retains other terms on the right-hand side besides β2 . If the link function were the identity function, then its derivative with respect to Xi2 would be 1 and (6) would simplify to β2 as in the standard linear model without specified interaction components. However, with the generalized linear model we always get partial effects for a given variable that are dependent on the levels of the other explanatory variables. To give a more specific example, consider a dichotomous choice outcome variable indicating 4

whether an individual votes for a Republican congressional candidate along with an associated interval measure of ideology. One certainly would not be surprised to observe that more conservative individuals tend to vote for Republicans and more liberal individuals tend not to vote for Republicans. Now add a second explanatory variable indicating whether or not the individual owns a gun. A simple probit model is specified for these data with no directly indicated interaction term: P (Yi = 1) = Φ(β0 + β1 IDEOLOGY + β2 GU N )

(7)

where Φ denotes the standard normal cumulative distribution function. This model is depicted in Figure 1, where gun owners and non gun owners are separated. If this were the standard linear model, (7) but without the probit link function, then Figure 1 would show two parallel lines because no interaction was specified. However, even though we deliberately did not directly specify an interaction term, it is clear from Figure 1 that gun ownership interacts with ideology in affecting the probability of voting for the Republican candidate. For very liberal and very conservative individuals, gun ownership does not greatly affect the probability of voting for the Republican. Yet for individuals without a strong ideological orientation, gun ownership matters considerably: a difference of about 50% at the center. Figure 1: Probit Model for Vote Choice

1 3 2

0.8

Freshman

1

Freshman

0.6

g−1(Xβ)



0.4

−1 −2

Non−Freshman

Non−Freshman

0.2

−3 0

Far

Medium

Close

Far

Medium

Close

Issue Distance to Author of Opposing Opinion

Correctly Interpreting Interaction Effects in Generalized Linear Models From the preceding argument it is clear that interactions are produced in generalized linear models, regardless of whether they are recognized or desired. Yet this observation does not really help in testing 5

for the existence and statistical reliability of hypothesized interactions, or in determining overall model quality in the acknowledged presence of such terms. There are two ways of thinking about the outcome variable of interest in generalized linear models. Since the link function is defined to be smooth and invertible, it is always possible to pass this function back to the left-hand side of equation (4) and consider a specification in which a function of the µ) = Xβ expected value of the outcome variable is modeled linearly: g(µ Xβ. The important difference is that with this form the automatic interaction terms cancel out. This does not mean that such terms do not exist and it does not mean that by strictly considering a model of this form that one does not have to worry about potential interactions. Instead it means that if one is interested in a form of the outcome variable that is modified directly by the appropriate link function, and if the desired specification excludes interaction terms then the model can be greatly simplified. To make this point more concretely consider a case where the conceptual outcome variable of interest is dichotomous and we are using logit, probit, or cloglog as a convenience to link the additive component with support over the entire real line to the necessarily bounded probability of getting a success. Using a logit specification in this context produces: X ) = (1 + exp [−β0 − β1 Xi1 − β2 Xi2 ])−1 . P (Yi = 1|X

(8)

Here we will get the inadvertent interaction effects discussed previously. However, sometimes we are instead interested in the indirectly obtained unbounded variable, and the observed outcome variable is the only measurable manifestation available. Returning to the example above, our real interest might not lie in the binary outcome of voting for the Republican or not for some particular combination of levels of the explanatory variables, but rather in the effects of particular explanatory variables on the inclination to vote for the Republican expressed through the odds of voting Republican versus the odds of not voting Republican. In logit models this can be thought of as the log odds of a successful outcome (i.e. the outcome coded 1), and is obtained by moving the GLM link function to the left-hand side of (8):



 X) P (Yi = 1|X = β0 + β1 Xi1 + β2 Xi2 . log X) P (Yi = 0|X

(9)

There is an important, and under-appreciated, consequence of this model specification. By stipulating that the outcome variable of interest is the unbounded log odds of success and specifying the linear additive right-hand side without interaction terms as done in (9), the researcher is making the strong statement that there are no interactions of consequence to be modeled and no associated hypotheses 6

therefore tested. Regarding this specification, McCullagh and Nelder (1989, p. 110) unambiguously state that: “It is important here that x1 be held fixed and not be permitted to vary as a consequence of the change in x2 .” In other words: no interactions. This does not mean that the form of (9) precludes the inclusion of interaction terms. Quite the contrary. Now that the right-hand side is a simple additive form, specifying interaction terms is as straightforward as with the standard linear model discussed in the first section.   X) P (Yi = 1|X log = β0 + β1 Xi1 + β2 Xi2 + β3 Xi1 Xi2 . X) P (Yi = 0|X

(10)

It is just a consequence of (9) that any non-specified interaction terms that would have been automatically accounted for using (8) are now cancelled out and therefore not modeled. To see that this is true expand the left-hand side of (9) and note the cancellations:     X) P (Yi = 1|X (1 + exp(−β0 − β1 Xi1 − β2 Xi2 ))−1 log = log X) P (Yi = 0|X 1 − (1 + exp(−β0 − β1 Xi1 − β2 Xi2 ))−1   exp(β0 + β1 Xi1 + β2 Xi2 ))−1 = log 1 + exp(β0 + β1 Xi1 + β2 Xi2 ))−1   1 − log 1 + exp(β0 + β1 Xi1 + β2 Xi2 ))−1   = (β0 + β1 Xi1 + β2 Xi2 ) − log 1 + exp(β0 + β1 Xi1 + β2 Xi2 ))−1   − log[1] + log 1 + exp(β0 + β1 Xi1 + β2 Xi2 ))−1

= (β0 + β1 Xi1 + β2 Xi2 )

(11)

This derivation, which produces (8), is evidence that with the log odds of success expression of the logit GLM model interactions can exist in the data, but they are cancelled out in the researcherimposed specification. If the model is calculated with the standard numerical procedure for GLM (maximum likelihood estimation through iterative weighted least squares, see Fahrmeir and Tutz (2001, p. 42), Gill (2000 p. 44), Green (1984), or del Pino (1989) for details), then existing interaction effects are estimated but then ignored due to the cancellation above. If the model is calculated by first transforming the outcome variable according to the link function and then using least squares, then existing interaction effects are not estimated and again ignored. The coefficients estimated with these two methods will be exactly the same and correct, provided that there are no interaction effects in the data. Cox (1984), in particular, advocates the log odds expression for dichotomous choice specifications 7

with specified interaction terms as an interpretational simplification of the analysis by using an unconstrained outcome variable, “. . .since such representations have the greatest scope for extrapolation” (p. 21). Nagler (1994) calls model components such as β3 Xi1 Xi2 in (10) “variable-specific” interaction terms to distinguish them from the automatic variety. If specifying these variable-specific terms in the model leads to improved fit (comparable with likelihood ratio tests), then we have successfully captured through parameterization at least some of the necessarily existent interaction between variables by the model specification. Unfortunately, other forms the link function (i.e. other generalized linear models) do not share the naturally intuitive appeal of the log odds of success outcome variable that the logit model provides. Therefore researchers that move the link function to the left-hand side of the GLM specification equation purely to avoid considering imposed interaction considerations, do so at a great interpretational expense. Understanding Generalized Linear Model Interaction Effects with First Differences The best way to understand generalized linear model results (including those with dichotomous choice outcome variables) is by first differences (King 1989), a term that originates in time series work. The principle of first differences, sometimes called attributable risk, is to select two levels of interest for a given explanatory variable and calculate the difference in impact on the outcome variable holding all of the other variables constant at some value, usually the mean. This principle is only slightly more complicated with interaction specifications. Since the total effect of any given variable is a composite of its main effect and associated interaction effects, it is necessary to include all interacting coefficients in the first difference calculation making sure that all other variables in these terms are set at their means in both their main effects and in the interaction effects. Therefore when looking at a variable of interest in a table of first differences, the observed difference includes the main effect as well as all of the interaction effects that include that particular variable. The mechanics of this process are relatively simple. Suppose that we return to the generalized linear model form in (5): E(Yi ) = g−1 (β0 + β1 Xi1 + (β2 + β3 Xi1 )Xi2 ) where the explanatory variable ¯1 . X2 is of primary interest. The coefficients β0 , β1 , β2 are estimated and X1 is set at its mean: X [1]

Now we select two values of X2 of interest: X2

[2]

and X2 . This is usually very straightforward

for categorical variables, and only slightly less so for interval measured variables where we can pick 8

extreme values or appropriate quantiles. Of course there are often theoretically interesting values to assign as well. These assignments allow us to produce two theoretical expected values of the outcome   variable: ¯ 1 + (β2 + β3 X ¯ 1 )X [1] E(Y )[1] = g−1 β0 + β1 X 2   ¯ 1 + (β2 + β3 X ¯ 1 )X [2] , E(Y )[2] = g−1 β0 + β1 X (12) 2

and take the difference: F D [2,1] = E(Y )[2] −E(Y )[1] . This value therefore gives the expected difference for the outcome variable across the selected range of one explanatory variable of interest, holding the other(s) constant at the mean. These are really “conditional first differences” here because not only do we hold the other variables constant at their mean, but also the first difference values and the standard errors of these first differences are calculated by including the effects of the relevant interaction terms. For this example, with X2 , the first difference is calculated by summing the contribution from the main effect as well as the first-order interaction with X1 . Therefore the first difference for the variable of interest is conditional on the set value for the other variable that it interacts with. The consequence of this conditionality is that the measure of uncertainty for the first difference is a composite term conditional on more than one singular standard error. A procedure for calculating this quantity is provided in a later section. Resolving a Methodological Controversy in Political Science The treatment of interaction terms in generalized linear models is at the core of a discussion between Frant (1991) and Berry & Berry (1991) concerning the specification of dichotomous models with a logit link function. Berry & Berry assert that deliberately specifying multiplicative interaction components is appropriate for testing hypotheses when the left-hand side specification is a log odds form such as (9), but when an expected probability is modeled the automatically generated interaction effects in the marginal first differences for individual coefficients are sufficient for such tests. Frant (1991) disagrees, stating that no explicit hypothesis of interaction can be tested by such indirect means in cases where the outcome variable is treated as the expected probability of success. Both Berry & Berry and Frant do agree however that when the hypothesis test is for a log odds outcome variable, any desired test of interaction must be directly specified with a multiplicative term. So their disagreement is really over whether it is necessary to specify the β3 coefficient in (10), or whether looking at the main effects coefficients alone through first differences is sufficient. Berry (1999) revisits this question with an updated argument for expected probability models. 9

He asserts that specifically including the β3 Xi1 Xi2 term is necessary and that the previously recommended approach of first differences works only if all terms that specify the variable of interest, including interactions, are included in the calculation. The debate between Berry and Frant is therefore over whether sufficient information to perform an hypothesis test of interactions can be found in the magnitude of the interaction term alone (Frant) or is it necessary to use this term and the appropriate main effects terms to judge the impact the variable of interest at different levels of the interacting variable through first differences (Berry). Berry is correct here. The existence of the link function means that the effect of any particular explanatory variable on the left-hand side probability depends on the values of every other explanatory variable and therefore must be included in an hypothesis test. To see how wrong the singular approach of Frant can be imagine that the main effect coefficients for two explanatory variables are both large and positive: β1 + + +, β2 + + +, whereas the interaction coefficient was small and negative: β3 −. The singular approach would be to interpret a purely negative (dampening) effect of X1 on X2 , and vice-versa. However, it is likely that these main effect contributions will overwhelm this term and we would have no way of knowing unless we calculated the full contribution of all interacting variables expressed through the link function using first differences as done in (12) (Berry finds such a case). This is due both to the magnitude of the coefficients (assumed large here) and the measurement of the variables involved. Furthermore, some link functions provide directional results that are surprising given the signs on the tabulated coefficients, such as the gamma link function which imposes a minus sign and an inverse on the linear predictor. In other words, you have no way of fully appreciating the effect of interacting variables until you wind the complete levels (means and first difference values) through the GLM link function using every associated term. There are two prominent mistakes often made here: assuming that only the interaction coefficient is necessary to perform hypothesis tests for interactions (Frant), and neglecting to specify an interaction coefficient at all, assuming that the interaction effects would be revealed through the covariance imposed by the link function (Berry and Berry). This debate highlights the different way that (8) and (9) are treated, even though they are exactly the same mathematical model of the data. The primary difference lies in how we choose to think about the outcome variable, and all of these authors agree that the log odds metric is easier to handle from a modeling perspective, given the caveat that any desired interactions must be specifically stipulated as

10

in the linear form. However, while first differences of log odds are less intuitive than first differences of probabilities, the ease with which interaction terms can be added to the linear predictor of log odds specifications makes this form attractive. As long as it is understood that the GLM link function can be made to operate on either side of the equality and that interaction implications differ, there is no added complexity in using both forms.

Higher-Order Interactions There is no mathematical or statistical restriction to developing higher-order interactions in linear or nonlinear models (Allison 1977; Cornell and Montgomery 1996; Edwards 1990). These are interaction effects between lower-order interaction effects and other coefficients as well as interaction effects between interaction effects. Specifying such terms naturally introduces a hierarchy into the model, not just because higher-order interactions are not defined in the absence of the main effects and the lower-order interaction effects that constitute their foundations, but also because the interpretation of these effects depends on the assumed levels of constituent terms. The primary impediment to widespread use of interaction hierarchies in model specifications is that there is currently no general method for analytically calculating the correct measure of uncertainty for the resulting explanatory effects, although they can sometimes be obtained by simulation (King, et al. 2000). Instead, the analyst is usually required to perform lengthy derivations for each differing specification, This problem is addressed in Appendix B with a general, matrix-based procedure for calculating standard errors in hierarchical interaction model specifications for linear and nonlinear models. Several motivations exist for developing models with higher-order interactions. They are superior to general polynomial models (of which they are a subclass) because the higher-order terms reveal more about the structure of dependency effects in the explanatory variables (Cornell and Montgomery 1996). De Leeuw (1990) points out that a hierarchical linear model “links the analysis of discrete and continuous variables convincingly, and makes it possible to discuss the interpretation of a great variety of statistical techniques in terms of conditional independence.” Constructing hypothesis tests for higher-order interaction terms is no more complicated than for first-order interaction terms (Goodman 1964). Also, Aiken and West (1991) argue that the hierarchical nature of increasing levels of interaction effects easily facilitates exploratory hypotheses as these levels can be increased systematically. 11

There is an extensive literature on testing for the existence of first- and higher-order interactions in factorial experiments and with contingency tables. This is because whenever a factor is crossed with another, three sources of variation are generated: variation from each source and variation from the residuals after removing the marginal effects (Rosnow and Rosenthal 1989, p. 144). There are well-developed tests for the existence of these interaction effects under a broad range of data types, dimensions, and restrictions (Fienberg 1980; Lancaster 1969; McCullagh & Nelder 1989; Thyregod & Spliid 1991; Uesaka 1993), and some authors in this area have paid particular attention to higherorder interactions (Goodman 1964; O’Neill 1980). However, there is considerably less to draw on in terms of generalized linear models with continuous random variables (the more typical application in political science). Ironically, these are more difficult cases to analyze because one loses the naturally intuitive appeal of the effect of being in one of a finite number of cells in a multidimensional rectangle of states. Suppose for the moment that there exist three main effects of interest, and we desire a secondorder interaction hierarchy model. Using the logit specification of the generalized linear model from (5), we produce the following specification for the log odds of success:   X) P (Yi = 1|X = β0 + β1 Xi1 + β2 Xi2 + β3 Xi3 log X) P (Yi = 0|X + β4 Xi1 Xi2 + β5 Xi2 Xi3 + β6 Xi1 Xi3 + β7 Xi1 Xi2 Xi3 ,

(13)

where the log odds form is used to make the imposition of multiple interactions more straightforward. So now in addition to having three first-order interaction effects, there is a second-order interaction effect with contributions from all three explanatory variables. Computationally there is no difficulty specifying and estimating models with higher-order interaction effects such as (13). The interpretation of the second-order interaction effect is straightforward: if we are primarily interested in the explanatory variable X3 then we would consider the coefficient β7 to be the increment to β3 for a particular combination of values of X1 and X2 . Actually this term can be interpreted as a first-order interaction between any of the first-order interaction effects and the remaining main effect: β7 (Xi1 Xi2 )Xi3 = β7 (Xi2 Xi3 )Xi1 = β7 (Xi1 Xi3 )Xi2 .

(14)

Each of the three explanatory variables in (14) can be individually considered as the primary variable of interest and the other two as “moderator” variables (Jaccard, et al. 1990, p. 41). An interpretational motivation for considering higher-order interaction effects, such as in (14), is

12

that a combination of levels of two variables expressed through a first-order interaction term can alter how a third variable affects the outcome variable. For instance, if the interaction between age and race altered the effect of income on the probability of voting for a Republican candidate for Congress, then we might want to add this to a proposed model specification. Suppose that older voters are more likely to vote Republican and white voters are also more likely to vote Republican, but older white voters are far more likely to vote Republican than the total pool of older or white voters. It is also possible that this older-white acceleration effect is mitigated for low-income voters. Thus there exist differing effects by levels of age, race, and income, which can be separated out by a second-order interaction effect parameterization. It is easy to see that a reasonable number of main effects typical of a social science empirical model can produce a huge number of interaction hierarchies. In fact, the number of possible interactions   Pν ν for ν > 1 main effects is determined by the rapidly increasing function: . Therefore the i=2 i selection of a useful subset of possible interaction effects is a vital part of the art of model specification. Although it is argued here that interaction effects provide useful information and lead to better empirical models, researchers should use the same sort of theory-driven procedure that is involved in picking main effects. A “kitchen sink” (or, worse yet, some variant of a stepwise) interaction specification approach is likely to lead to a massively parameterized and completely useless model (Cornell and Montgomery 1996, p.2545; O’Neill 1980; Rovine and von Eye 1990). In addition, just as specifying first-order interaction effects while omitting the corresponding main effects produces poor models, specifying second- and higher-order effects without the component lower-order interaction effects is also ill-advised (Allison 1977; Rosnow and Rosenthal 1991).

Conditional Standard Errors in Interaction Specifications When a model with interaction terms is specified and run with statistical software, the resulting table of coefficient values and standard errors is somewhat misleading. If a variable interacts with others in the model specification, then the main effect coefficient is just the contribution of that variable assuming that all of the other interacting variable coefficients equal zero. The contribution of this variable is better assessed at assigned levels of the interacting variables that are more realistic. Thus the researcher is required to provide theoretically interesting points in the data that hopefully reveal important characteristics of the primary interacting variable. Since the complete effect of the explanatory variables specified as interacting in the model is 13

conditional on the values of the other co-interacting specified variables, then so is the associated standard error. The standard generalized linear model output from statistical software gives only a partial summary of the effects of these terms, so it is required that one specify the appropriate form of the conditional standard errors. Start with the simple case when there are only two explanatory variables in the model, X1 and X2 , specified as in (1). The standard error of X1 conditional on X2 = x2 is provided by 1

sβ1 +β2 X2 |X2 =x2 = (var[β1 ] + x22 var[β3 ] + 2x2 cov[β1 , β3 ]) 2

(15)

(Friedrich 1982, p. 810; Jaccard, et al. 1990, p. 27). The derivation of this form is provided in Appendix A. The variances and covariance in (15) are directly from the variance-covariance matrix in the original output calculated from the negative inverse of the Hessian matrix at the maximum likelihood values for the coefficient estimates. It is easy to see from the structure of (15) that not only is the contribution of X1 to the explained behavior of the outcome variable conditional on the value selected for X2 , but so is the reliability of this contribution as well. Jaccard, et al. (1990, p. 80-81) provide an algorithm to produce a linear model conditional standard error that corresponds to three-variable specifications such as (13). While this is a very useful form, it is given in a relatively nonintuitive series of steps and cannot be generalized to any other model specification. Obviously, given the specifications typically developed in empirical research in the social sciences, using such a basic approach is limiting. One strategy is to obtain the conditional standard error of each explanatory variable by calculating the expected value of the square of the marginal effect (obtained by calculating an integral with respect to the variable of interest). While this approach is guaranteed to produce the desired quantities, it is extremely time-consuming, and likely to produce human factor errors. For example, suppose we wanted to obtain the three conditional standard errors for the simple model specification in (13). The variance of the marginal effect of X1 at specified levels of X2 and X3 is (Appendix A): (sβ1 +β4 X2 +β6 X3 +β7 X2 X3 |X2 =x2 ,X3 =x3 )2 = var[β1 ] + 2x2 cov[β1 , β4 ] + 2x3 cov[β1 , β6 ] + x22 var[β4 ] + x23 var[β6 ] + 2x2 x3 cov[β4 , β6 ] + 2x2 x3 cov[β1 , β7 ] + 2x22 x3 cov[β4 , β7 ] + 2x2 x23 cov[β6 , β7 ] + x22 x23 var[β7 ]. (16) This calculation is performed above for one variable in a very simple three-variable linear interaction specification. Consider this process with a more typical specification containing many more explanatory variables and possibly more complex interactions. The algebraic steps (outlined in Appendix A) 14

would have to be replicated for every variable that contributes to an interaction term, each starting with a different form describing the marginal effect for that explanatory variable. To address this problem, Appendix B provides a matrix-based procedure to obtain conditional standard errors without resorting to lengthy scalar calculations. This new method is flexible with regard to different models, thereby allowing complex hierarchical specifications.5 The new procedure is a more convenient method for calculating coefficient standard errors when these coefficients are composite due to dependence on the levels of other interacting variables. The comparative advantage is that more complicated and hierarchical forms are easily accommodated, and there is less reliance on individualized calculations for each interacting term. Furthermore, the procedure is easily automated in commonly used statistical programming environments. The example sections provide work with these conditional standard errors using the outlined procedure. In each case the conditional standard errors are calculated for the linear predictor equation where the link function operates on the left-hand side of the specification equation. We then apply the inverse link function on the first differences of interest and their standard errors to obtain expected outcome variable quantities of interest. This approach works well in practice since it makes the calculation of conditional standard errors relatively simple through the two step process.

Statistical Significance In previous sections it was demonstrated that the total effect of a given interacting explanatory variable on the outcome variable is conditional on levels of the variables specified as interacting with that variable. This is true for the linear model and generalized linear models, as well as for any hierarchical specification of interactions. In addition, because the conditional standard error of the total effect of an interacting variable is also conditional on levels of the corresponding interaction variables, so is the statistical significance of this total effect. Fortunately, all composite total interaction effects are distributed Student’s-t around zero under the null hypothesis assumption, just as the coefficients are in the standard additive (noninteractive) specification. The test statistic is simply the total effect divided by the conditional standard error at some specified levels of the interacting variables. This may be confusing in the situation where one of the interacting variable’s main effects is not statistically significant. The question naturally arises as to whether or not the interaction contribution from this variable should be included in the composite 5

Computer code for running the matrix-based calculation of conditional standard errors is available for Gauss, SAS, Splus/R, and Stata at the authors’ website.

15

total interaction effect of interest. When the contribution of a composite total interaction effect is evaluated, all constituent terms from every interacting variable must be included regardless of their individual statistical significance. The reason is that a variable could have a statistically significant effect on the outcome variable at specified levels of the interacting outcome variables even when it does not in the situation where all interacting variables are set equal to zero (the solitary main effect for that variable). Returning to the vote choice example with gun ownership and ideology interacting to determine the probability of voting for the Republican congressional candidate, it is possible that the reported coefficient main effect for ideology is not itself statistically significant, but that for moderate ideology voters only, the first difference for gun ownership is statistically reliable. Thus the contribution of ideology to the fit of the model is expressed only at a restricted range. A variable can have a statistically significant contribution only at specified levels of interacting variables. A strictly additive model with these two explanatory variables would miss this feature of the data.

Analyzing Interaction Effects in a Model of Comparative Politics Regime instability is one of the hallmarks of late twentieth century politics in sub-Saharan africa. For a variety of reasons, including ethnic fragmentation, arbitrary borders, economic problems, outside intervention, and poorly developed governmental institutions, a disproportionately high proportion of regime changes have occurred as a result of military takeovers. The bulk of current knowledge about systematic effects in African politics and political systems is synthesized from a range of detailed narratives on individual countries. There is relatively little comparative empirical scholarship that explains changes in government from an overarching perspective (Bratton and Van De Walle 1994), and theories have tended to be more fragmented than integrative (Wiarda 1985). A counter-recent trend in the literature is to analyze the status of regimes through Hyden’s (1988, 1992) delineation of governance (legitimate exercise of state power, establishment of known reciprocities, development of trust between institutions, and visible accountability) through comparative studies of African regime types (c.f. Bienen 1993; Bratton and Rothchild 1992; Bratton and Van De Walle 1994; O’Donnell, et al. 1986; Jackson and Rosberg 1982). The Data This example uses data collected by Bratton and Van De Walle (1997), made available with a codebook through the Inter-University Consortium for Political and Social Research. These authors 16

collected data on regime transition for 47 sub-Saharan countries over the period from each country’s colonial independence to 1989, with some additional variables collected for the period 1990 to 1994. Included are 99 variables describing governmental, economic, and social conditions for the 47 cases. Also provided are data from 106 presidential and 185 parliamentary elections, including information about parties, turnout, and political openness. The focus here is an outcome variable included in Bratton and Van De Walle’s work (1994, p. 479), but not featured as a modeled result: regime change through military coups. This is a wellstudied issue (Bienen 1979; Decalo 1976a and 1976b; Feit 1968; Jackman, et al. 1986; Johnson, et al. 1984), but not necessarily so from a statistical perspective. Military Coups is operationalized as the successful number of military coups for a country over the period from independence to 1989 (ranging from zero to six events). This outcome variable is defined only over a positive integer sample space and therefore requires a generalized linear model link function appropriate to counts. The most basic µ ) = log(µ µ )), which is applicable when there is no evidence of underlink function is the Poisson, (g(µ or over-dispersion: the mean and the variance are approximately equal (King 1988, 1989; McCullagh and Nelder 1989; Zorn 2001). Seven explanatory variables are chosen to model the count of military coups. Countries with a history of military rule are more likely to suffer a military coup in future periods. This proclivity, emblemized by Nigeria, is modeled with the variable Military Oligarchy which counts the years that the country was ruled by a military oligarchy during the period from independence to 1989. Societies granted wide freedom to organize in groups to express political opinions are expected to be less inclined to lead to forced military rule. The variable Political Liberalization takes on values: 0 for no observable civil rights for political expression, 1 for limited civil rights of expression but no ability to organize these interests into competitive political parties, and 2 for all widely recognized civil rights of political expression including the ability to form political parties. Distinct from the idea of liberalization to create political parties is the resulting array of parties that may be created. Small numbers of parties may suggest centralization of political power around strong tribal, economic, or religious interests, whereas a large number of political parties may be indicative of fragmented power and widely divergent political interests. The variable that measures this phenomenon, Parties, is the simple count of legally registered political parties ranging from 0 to 38 in the year 1993.

17

Participation in the electoral process is important in emerging democracies in that it demonstrates concern by citizens as well as a willingness to forgo other activities in order to cast a vote. In order to measure this effect with respect to both total population and total registration, two turnout variables are specified relative to the legislative election occurring most recently to 1990. Percent Legislative Voting (7.5-77.4%) measures the percentage of the total country population that participated in the election and Percent Registered Voting (30.0-99.8%) measures the percentage of those registered participating in the last election. The difference between the two variables thus measures the net effect of enfranchisement and interest through registration. Two broad descriptive variables are specified to account for the diversity of geographic and population sizes (a vast range in Africa). Size indicates the total geographical area covered by the country in one thousand square kilometer units. This includes inland water areas and varies from 0.5 to 2,506. Population measures the total country population in millions as of 1989, ranging from 0.067 to 113.8. These variables are used to specify a multiplicative interaction term: (Size)(Population), with the theory that the ability of the military to draw power from a population through recruits and resources is also conditional on geographic size and the associated need to regulate and protect this territory. The last two variables are centered on political activity over the period of interest. Regimes measures the number of different regime types (numbered one through four), and Elections measures the total number of both presidential and legislative elections ranging from one to fourteen. These two variables are also used to specify a multiplicative interaction term: (Regimes)(Elections). The theoretical idea behind this interaction term is the expectation that large variability in regimes along with a greater number of elections should be a suppressant on military coups since frequent participation accompanied by more changes in regimes indicate widely varying and effectively expressed citizen influence. Model and Results The generalized linear model for these data with the Poisson link function is summarized by:

18

β ) = exp [Xβ β] g−1 (θθ ) = g−1 (Xβ | {z } 47×1

 = exp 1β0 + (Military Oligarchy)β1 + (Political Liberalization)β2

+ (Parties)β3 + (%Legislative Voting)β4 + (%Registered Voting)β5

+ (Size)β6 + (Population)β7 + (Regimes)β8 + (Elections)β9  + (Size)(Population)β10 + (Regimes)(Elections)β11

= E[Y] = E[Military Coups].

The systematic component here is Xβ Xβ, the stochastic component is Y = Military Coups, and the µ) Moving the link function to the left-hand side of the specification, we link function is θ = log(µ can re-express this model as: g(µ) = log(E[Y]) = Xβ (although this is now a less intuitive form for understanding the outcome variable). The goal is to estimate the coefficient vector: β = {β0 , . . . , β11 } in the context above, and this produces the model in The first three columns of Table 1 provide the generalized linear model coefficient estimates and their unconditional standard errors followed by the 95% confidence interval. [Table 1 About Here.] The model appears to fit the data quite well: an improvement from the null deviance of 62 on 46 degrees of freedom to a residual deviance of 7.5 on 35 degrees of freedom (evidence that the model does not fit would be supplied by a deviance value in the tail of a χ2n−k distribution), an Akaike Information Criterion (AIC equals minus two times the likelihood function at the MLE values plus two times the number of parameters, a quantity to minimized) superior to close alternatives, and all but two of the individual coefficients having 95% confidence intervals bounded away from zero. In fact, the AIC for the model excluding the two interaction terms is 111.48, indicating a substantial improvement by including these terms. The second section of Table 1 gives first differences for the complete range of each explanatory variable and conditional standard error calculated using the algorithm for conditional standard errors in Appendix B. These first differences are provided on the scale of the outcome variable metric, β ]). The first differences are produced meaning that they are passed through the link function (exp [Xβ by setting all other variables at their means and observing the difference in expected counts over the range of this single variable of interest. For Size, Population, Regimes, Elections the mean 19

values of the associated interacting variables is included in the interaction contributions to the first differences. A common mistake, mentioned previously, is the exclusion of the interaction contribution to the calculation of variables with corresponding interaction terms, and treating the interaction terms as if they can be considered separately. To illustrate this point, recalculating the first differences for the interacting variables purposely excluding this component gives: Size = −1.6311, Population = −2.3367, Regimes = −2.1962, and Elections = −12.9756. The difference between these values and the correct first differences given in Table 1 is substantial, perhaps especially so for Elections. The main effects and the interaction term for Size and Population are noted to be statistically reliable as evidenced by 95% confidence intervals bounded away from zero. The first differences for both variables, which are calculated at the mean values of all other variables (including of course each other), are also both statistically reliable and negatively signed as well. Evidence that the interaction term between them is not very influential is found by observing that it is not sufficiently large enough to impose its positive sign on either of the first differences (although it is statistically reliable). The two interaction terms have dramatically different individual coefficient values: that for (Regimes)(Elections) is more than three orders of magnitude greater that for (Size)(Population), even though the main effects are roughly comparable. In general the results for (Regimes) and (Elections) are relatively unsatisfactory: there is no evidence that the number of regime types affects military coups: the main effect and the first difference fail to achieve statistical reliability in the model. Note that if we did not test for the statistical reliability of the first difference we might be inclined to trust the first difference for (Regimes). This is also true for (Elections), but we are even more likely to have made this mistake in the absence of a confidence interval for the first difference since the main effect and the interaction effect are both statistically reliable. One of the explanatory variables, Parties, has a statistically reliable coefficient but not first difference. Exactly what does this mean? The answer is that in a very specific circumstance, evaluating the range of party counts from 0 to 62 and setting all other variables set at their means, the conditional standard error for Parties is sufficiently large as to make a causal inference between increased numbers of parties and increased tendency toward military coups not trustworthy. This distinction between the reliability of the main effect and the reliability of the first difference highlights an added conceptual challenge in interpreting interaction models.

20

Since the first differences and their conditional standard errors in Table 1 are given on the scale of the outcome variable. This provides a far more intuitive presentation than giving these values on the linear predictor scale without applying the link function. So, for instance, the range of percent participation in legislative elections in sub-Saharan elections (0-77%) imposes an expected increase of 10 military coups. Conversely the range of percent registered that are voting (0-100%) gives an expected decrease of almost 9 military coups. This is an example of where two first differences (both statistically reliable here) provide more substantive information then their associated main effect coefficients.

Extending Hierarchies: An Application to Judicial Decision Making In fully argued and decided Supreme Court cases, there are two occasions when the justices vote. The first, called the “original vote on the merits,” occurs in secret conference one to five days after oral arguments. The second, called the “final vote on the merits,” occurs much later as the majority and minority opinions are being written. Therefore Supreme Court justices have the opportunity to change their votes at any time between the conference and the announcement of the final decision. This may happen due to personal reflection, persuasion by other justices, or dissatisfaction with the drafting of an initially favored opinion (Murphy 1964). Because the Court is fairly secretive compared to other American political institutions, it is relatively difficult to analyze its internal decision-making processes. However, as more retired justices’ private papers are made public, more information is available about how the Supreme Court arrives at judicial outcomes. The Data and Hypotheses Maltzman and Wahlbeck (1996) provide a broad empirical look at the reasons that justices switch their votes, building on the empirical approach of Dorff and Brenner (1992) but covering the complete era of a chief justice. Their contribution analyzes the papers of Justices Thurgood Marshall and William Brennan to build a dataset of covariates covering the entire Burger Court (1969–1985). Maltzman and Wahlbeck focus on three general hypothesized causes for vote switching: policy preferences, uncertainty, and institutional factors. Maltzman and Wahlbeck’s data provide 15,459 cases, each of which is a record of an individual justice voting twice on a specific case that was orally argued before the Burger Court and terminated with a signed opinion. The outcome variable is coded 1 if the conference vote for a given justice differed from the final vote (7.5% of the values). 21

To measure the spatial preference distance between a justice and the author of the opposing opinion, an ideology scale from Epstein, et al. (1994) is used whereby greater distance implies a lower probability of switching the vote. Specifically, the variable Distance is a measure of the ideological difference between author of the opinion that the justice initially sides with and the author of the opposing opinion. To obtain a measure of each justice’s placement relative to the coalitions that form during and after the conference, Maltzman and Wahlbeck first rank the justices by their ideology scores and establish a mean ideology placement for each of the two coalitions. Cases are assigned +1 if they land at their coalition’s mean or farther away from the opposite coalition mean, 0 if they are between the two coalition means, and −1 if they are at or on the other side of the opposite coalition mean. To capture the “freshman effect” (Snyder 1958; Howard 1968; Wood, et al. 1998) in the model, a justice who has served two years or less at the time of the vote has the variable Freshman coded 1, otherwise coded 0. On the opposite end of the experience spectrum, justices who have written extensively on the topic, or were expert in a policy area related to the case, have the Expert variable coded 1, otherwise coded 0. Two variables are provided to incorporate the different voting patterns of the chief justice: Chief Justice, an indicator function, and Minority Chief Justice, coded 1 if the chief justice voted with the minority on the conference vote, 0 otherwise. Another dichotomous explanatory variable, Minority, is used to indicate a justice voting with the minority at the conference, not just the chief justice. If at the conference vote the majority is decided by the smallest possible margin, then the explanatory variable Minimum Majority is coded 1, otherwise coded 0. The variable Complexity contains a positive direction factor for the complexity of the specific case. Similarly, cases that are highly salient have the variable Landmark coded 1, otherwise coded 0. The size of the dissenting group is likely to matter in a justice’s calculation to switch votes and side with the majority. Accordingly, the variable Dissent is coded with the scheme 4−# of codissenters. So lone dissenters are coded 4 and those in the majority are coded 0. Finally, the study also includes a dichotomous explanatory variable indicating unanimity at the conference vote: Unanimous. This example extends Maltzman and Wahlbeck’s analysis using an added hierarchical level of interactions with the objective of testing two new hypotheses. First, do members in the minority

22

react to increased case complexity and greater issue distance to the opposite coalition? Second, are justices who are expert on the policy issue in the case and find themselves in the minority away from their expected ideological coalition different from those who lack such expertise. In both cases, these hypotheses are tested by looking at the effects of blocks of explanatory variables with their associated interactions on the probability of switching votes (the judicial behavior of interest here). The Model and Results This section gives a reanalysis of Maltzman and Wahlbeck’s study using the same logit model and specifying both first- and second-order interaction effects, whereas they limited the original model to first-order interaction effects. In their study the key interaction variable was Minority, and their substantive purpose with this approach was to test the direction of observed fluidity. If there is a positive and significant first-order interaction effect between another explanatory variable and Minority, then this implies that changing a vote to be in the majority in the final vote is positively related to this explanatory variable. The first model in Table 2 shows the original specification without interaction effects, and the second model in Table 2 includes the minority interaction effects, both exactly as Maltzman and Wahlbeck developed. All values in Table 2 are standard GLM coefficients operating through the (logit) link function on the expected value of the outcome variable. The 95% confidence intervals in the table are thus the unconditional measures of coefficient reliability. The conditional standard errors are calculated later in the discussion on first differences. [Table 2 About Here.] Only the main effect for Minimum Majority changes significance when the interaction effects are added to the specification: the coefficient estimate has a 95% confidence interval bounded away from zero in the noninteractive specification but not when interactions are added. In their discussion, Maltzman and Wahlbeck note that the first-order Minority interactions with Minimum Majority and Complexity are statistically significant and negatively signed, inferring that the presence of a minimum winning coalition at the conference or increased technical considerations are both likely to reduce fluidity. The first-order interaction of Minority with Distance is also statistically significant (95% confidence interval bounded away from zero), but has a positive sign. So the evidence suggests that minority members at the conference have a greater probability of switching their votes than 23

majority members for a higher level of ideological distance from the writer of the opposing opinion! The third model of Table 2 adds two second-order interaction terms to the specification that continue the hierarchical focus on the minority status of the justice and facilitate a test of the two new hypotheses. The first new interaction term specifies a relationship between levels of Distance, Complexity, and Minority. This addresses an idea that Maltzman and Wahlbeck “suspected” but did not otherwise investigate in their work (p. 590): that minority members on complex cases tend to have a decreased inclination to change their vote with increasing issue distance to the mean majority opinion due to the intensity of their convictions. There is also now a second-order interaction term between Coalition, Expert, and Minority. This tests for the presence of an interaction effect between a justice’s positional ranking relative to the formed coalitions, the justice’s personal experience and knowledge in this area of the law, and minority status. The hypothesis is that justices with extensive expertise are more comfortable in the minority and away from their expected coalitional placement than those without such expertise. Therefore these justices have a declining probability of changing their vote with increased coalition ideology difference given minority and expert status. In both cases the second-order interaction effects are statistically significant (95% confidence intervals bounded away from zero using unconditional standard errors), and improve the overall fit of the model as seen by the residual deviance and the Akaike Information Criterion. Interestingly, the rest of the model is virtually unchanged from the first-order interaction specification: every coefficient that was statistically significant remains significant, every coefficient that was not statistically significant remains not statistically significant, and there is virtually no change in the magnitudes of the significant coefficients. The negative sign on the second-order interaction term between Distance, Complexity, and Minority indicates that for members of the minority considering a complex case, increasing distance to the majority mean ideology score reduces the probability of switching. Also for a given ideology distance to the majority, a minority member is more likely to switch for relatively simple cases than for complex cases. Finally, for complex cases and a given ideology distance, minority members are less inclined to switch than majority members. The second-order interaction term between Coalition, Expert, and Minority is negative. Noting that Expert and Minority are dichotomous explanatory variables, we can say that there is evidence that experts in the minority have a decreasing probability of vote switching with greater issue distance from their fellow coalition members.

24

The two second-order interaction terms have quite different effects and interpretations for the model specified. The two first-order interaction coefficient estimates for (Distance)(Minority) and (Complexity)(Minority) are statistically significant, as is the associated second-order-interaction coefficient for (Distance)(Complexity)(Minority). However, this second-order interaction coefficient is small, suggesting only slightly more explanatory power. Conversely, the first-order interaction coefficient estimates for (Coalition)(Minority) and (Expert)(Minority) are not statistically significant but the associated second-order interaction coefficient (Coalition)(Expert)-(Minority) is statistically significant (and large). So the interactions between coalition distance and minority status between expert status and minority status, are evidenced only when considered together. Furthermore, the rest of the model remains essentially unchanged: every second-order model coefficient is contained in its corresponding first-order model 95% confidence interval, and no coefficient estimate changes statistical significance. This directly demonstrates the value of specifying hierarchies of interactions: we would have missed a large and seemingly reliable contribution to explaining outcome variable behavior without specifying (Coalition)(Expert)(Minority). Figure 2: Second-Order Interaction First Differences At Levels Complexity= 0: Coalition = −1 Minority = 0

-0.0156

Coalition = 0 Minority = 0

-0.0092

Coalition = 1 Minority = 0

0.0391 0.0411

0.0299

0.0119

0.0567 Coalition = 0 Minority = 1

-0.0188

Coalition = 0 Minority = 0

0.0394

0.0294

Coalition = 1 Minority = 0 0.0211

0.0406

0.0582 Coalition = −1 Minority = 1

-0.0112

0.0342

Complexity= 1: Coalition = −1 Minority = 0

Coalition = 1 Minority = 1 0.0240

0.0474

Coalition = −1 Minority = 1

0.0211

Coalition = 0 Minority = 1

Coalition = 1 Minority = 1

Table 2 shows that the term (Coalition)(Expert)(Minority) contributes to the quality of the 25

model fit as an added interaction term. Now we look at specific levels of these discrete explanatory variables, calculating conditional first differences to explore changes affecting the probability of vote switching. Recall that since first differences are built on random quantities, they are random quantities as well. In Figure 2, the first difference values are given for the specified explanatory variable changes in the cases where these first differences have 95% confidence intervals bounded away from zero, calculated using the conditional standard errors from Appendix B.6 This visual helps show the probability differences in the outcome variable where more than one explanatory variable is controlled with first differences. Without such a device one would be forced to produce a rather large and cumbersome table. Figure 2 clearly demonstrates that higher-order interactions can matter. For instance, the first difference probability for switching votes for minority (0 to 1), coalition (1 to 0), and complexity at 0 is 0.0391, but the first difference for minority (0 to 1), coalition (0 to -1), and complexity at 1 is 0.0582. So first differences matter in the context of simultaneous changes in higher-order interacting explanatory variables. Note also that the evidence supporting the existence of directional higher-order interaction effects varies as shown by the presence or absence of statistical significance between levels in Figure 2.

Structured Hierarchies: An Application to Education Public Policy Education policy is never far removed from the American public agenda. Although it is rare to observe the kind of shocks or crises that raise other public policy areas to the forefront of the nation’s attention, education often dominates reports of the public’s list of concerns (cf. Pew Report, January 1998, “Spending Favored Over Tax Cuts Or Debt Reduction”). In addition, Education expenditures constitute the largest single component of public service sector budgets in the United States.7 6

The model summary provides the “conditional first differences” because not only are the other variables constant at their mean, but also the first difference values and the standard errors of these first differences are calculated by including the effects of the relevant interaction terms. For example, with Expert the first difference is calculated by summing the contribution from the main effect as well as the first-order interaction with Minority and the second-order interaction with Coalition and Minority at the two levels of Expert: 0 and 1. Therefore there are three terms that produce a first difference of 0.0018 for Expert, where the other main effects supporting the interaction terms (and the other explanatory variables) are held constant at their means. In all cases the first difference effects and their standard errors are calculated first for the GLM linear predictor and then run through the link function for the final expected effect on the outcome variable. 7 The magnitude of this public policy area is startling: 45.6 million public school students in the 1996-7 school year and total government expenditures of roughly $318 billion in 1997 (National Center for Education Statistics 1999, http://nces.ed.gov/).

26

The Data The education testing data used in this example were collected by the California Department of Education (CDE) in 1998 through the requirement that all students in the 2nd through 11th grades take a standardized test (Stanford 9) in a variety of subjects. These data are recorded at the school level, aggregated through the district and state levels, and are freely available.8 Demographic data are provided by CDE’s Educational Demographics Unit, and income data are provided by the National Center for Education Statistics.9 The data contain 303 unified school districts that support kindergarten through twelfth grade. The test scores selected are the California Standardized Testing and Reporting (STAR) data for the ninth grade math test. There is considerable controversy in the choice of test and grade, and this particular selection represents a compromise between various positions (see

2000, Chapter 4), as well as representing a common political measure of teaching

and administrative success at the policy evaluation level. The STAR data report raw test scores for students. The level of measurement and subsequent treatment of these outcome variables are currently under debate in the literature. It is easy to side with skeptics who assert that teacher effects, environmental effects, and district-level heterogeneity make relatively precise distinctions impossible. The approach taken here is to reduce the level of measurement in a meaningful way.10 Consequently, mean test scores at the district level are replaced with the proportion of students in the district scoring over the national median. It is much less demanding on the data in terms of required precision (many authors have called into question the efficacy of making relatively fine level distinctions with standardized educational testing data (Hanushek 1986; Koretz 1996; Meyer 1996; Strenio 1981)). The demographic block of explanatory variables consists of standard factors in the literature: Low Income (Fuller, et al. 1996; Hanushek 1986; Murnane 1975; Levin 1996) measured by percent of students who qualify for reduced or free lunch plans, and Percent Asian, Percent Black, Percent Hispanic (Kickbusch 1985; Meier & Stewart 1991; Rong & Grant 1992). 8

http://goldmine.cde.ca.gov. These data, as released, contained serious flaws in the recording of student demographics (Anderson 1998). The state decided not to make the corrections for cost reasons. As a result, I corrected missing and erroneous values using other sources such as census data. For details of this project, see (1999). The replication dataset along with the accompanying codebook we developed are freely available for downloading at: http://www.calpoly.edu/~jgill. 10 Shively (1997) warns that reducing the level of measurement (primarily to simplify the analysis) wastes information. I agree, except when there is little or no information wasted by doing so. The feeling thermometer questions in the American National Election Study are a classic example of this phenomenon. A scan of the raw data for any of these questions for any year of the survey reveals very few “non-round” numbers like 63, 91, 17, 28, etc. 9

27

The teacher block of explanatory variables contains those specifically describing attributes of the instructional staff: percent who are minorities (Percent Minority Teachers), mean teaching experience (Average Years Experience), and mean salary ($1,000 units) including benefits11 (Average Salary), and the within-block hierarchy of interaction effects. Teachers’ race is used as a policy measure, although California’s 1996 Proposition 209 barred the use of race as a determinant in hiring.12 Teacher experience can be considered a policy issue since there is evidence that salary and districts’ treatment of new teachers affect teachers’ longevity in the job (Murnane and Olsen 1989). Hanushek claims that teacher quality is a key factor in educational outcome (1971, 1986, 1989A), but does not believe that it can be measured in traditional methods, such as holding extra degrees (1992). The student block of explanatory variables consists of the amount spent per student in the district (Per-Pupil Spending), mean class size (Class Size), and the percent enrolled in local college credit programs (Percent in College Courses). The final block, school structure, includes variables that measure the percent of charter schools in the district (Percent Charter) and the percent of yearround schools (Percent Year-round) in the district. 11

Unlike many other occupations, the distribution of salaries for teachers at public elementary and secondary schools is not right-skewed. 12 There is evidence that black teachers are more effective in teaching black students and that an increase in black teachers leads to an increase in graduation rates for black students (Meier, Stewart and England 1991; Murnane 1975). Also, it appears that this improvement does not come at the expense of nonminority students (Meier, et al. 1999).

28

The Model and Results The interaction hierarchy model in log odds form is specified by 

 X) P (Yi = 1|X log = β0 . . . X) P (Yi = 0|X +β (Low Income) + β2 (Percent Asian) + β3 (Percent Black) + β4 (Percent Hispanic) | 1 {z } Demographic Block

+ β5 (Percent Minority Teachers) + β6 (Average Years Experience) + β7 (Average Salary) + β8 (Percent Minority Teachers) × (Average Years Experience) + β9 (Percent Minority Teachers) × (Average Salary) + β10 (Average Years Experience) × (Average Salary) + β11 (Percent Minority Teachers) × (Average Years Experience) × (Average Salary) {z | Teacher Block

}

+ β12 (Per-Pupil Spending) + β13 (Class Size) + β14 (Percent in College Courses) + β15 (Per-Pupil Spending) × (Class Size) + β16 (Per-Pupil Spending) × (Percent in College Courses) + β17 (Class Size) × (Percent in College Courses) + β18 (Per-Pupil Spending) × (Class Size) × (Percent in College Courses) {z | Student Block

}

+β19 (Percent Charter) + β20 (Percent Year-round). | {z } School Structure Block

The specification is set up as a second-order interaction model to capture the interrelatedness of the explanatory factors in determining educational outcomes at two different levels in addition to the main effects. In doing so, the model addresses a serious statistical error with the current literature: it is confusing and illogical to claim or imply that some main effect is zero when there exists an interaction effect that includes this explanatory factor (Nelder 1977; Edwards 1990). So stating, for instance, that increasing per-pupil spending has no effect on learning outcomes but acknowledging that there is an interaction effect between per-pupil spending and teacher’s salary is inappropriate. Or more succinctly stated: “Even in the presence of significant interaction, the main effect of columns is clearly

29

meaningful and readily interpretable” (Rosnow and Rosenthal 1991, p. 575). This principle, termed marginality by Nelder (1977), is well known in the literature on cell means models and hierarchical log-linear models, but is less familiar elsewhere. The structure of the model implies that interactions are confined within the teacher and student blocks, obviously a strong parametric assumption. However, the coefficient values for the main effects differed little in the atheoretic approach of including all combinatorially possible first- and secondorder effects. Of course such a model is useless both in terms of theories about how the education process works and in terms of interpreting the results. Table 3 provides the results of applying the model specification to the STAR data for the ninth grade mathematics test. The model results show every single coefficient estimate producing 99% confidence intervals bounded away from zero (functionally equivalent to p < 0.01).13 [Table 3 About Here.] The signs of the first- and second-order coefficients in Table 3 are interesting. Higher percentages of minority teachers are associated with lower levels of salary and experience in school systems since most of these teachers are relatively recent hires. Yet the second-order interaction has a positive coefficient. This contradiction is deciphered by thinking parenthetically. Higher percentages of minority teachers and higher levels of experience are positively associated with higher salaries in the effect on learning outcomes. Higher percentages of minority teachers and higher levels of salaries are positively associated with higher levels of experience in the effect on learning outcomes. Finally, higher combined levels of experience and salary are positively associated with higher percentages of minority teacher in affecting learning outcomes. This can be confusing because second-order interaction effects cannot be fully understood without first-order interaction effects, which cannot be fully understood without main effects (Nelder 1977). Nowhere is this more apparent than in the student block, where two of the second-order interaction effects appear to be wrongly signed: β15 (Per-Pupil Spending) × (Class Size) and β17 (Class Size) × (Percent in College Courses), i.e. all three student block main effects are signed in the same direction of the second-order interaction. 13

Even though the model residual deviance is relatively large, it is a reduction of about 8.5 to 1 over the saturated model. There is no question that 4078.8 is a large deviance on 282 degrees of freedom, although the variances of the coefficient estimates indicate an outstanding fit to the data. In fact, we know exactly where the excess residual deviance comes from: the very important but omitted variable of parental participation in school governance. This variable, probably best measured by PTA involvement, is not only not included in the California STAR data, but is not currently a component of any comparable education policy study.

30

To move away from this more confusing approach to analyzing the output, the next section focuses on first differences as a way of clarifying these results. Conditional First Differences In Table 4, conditional first differences for each main effect variable in the two models are provided over the interquartile range (75th percentile minus the (25th percentile), and over the whole range. Reported statistical reliability is assessed using the conditional standard error calculations in Appendix B. Thus, for example, the interquartile range for Percent Low Income is 27 − 55%, and the first difference for this explanatory variable over this interval is −11.89%. In other words, districts do about 12% worse at the third quartile than at the first quartile. [Table 4 About Here.] More emphasis is placed on the first differences across the interquartile range than on the first differences across the full range of the data. The motivation for this is practical and prescriptive. Public managers, particularly in educational settings, are unlikely to be able to change a policy emphasis to the degree implied by the full range of levels. So while the first difference across the full spectrum of the data for a given variable is often interesting and instructive, it is probably not the case that it gives practical guidance on improving education delivery. Conversely, the first difference from the first to the third quartile encompasses cases that are slightly lower than the typical case to slightly higher than the typical case, and it is far more likely that this range is near the levels obtainable by policy makers. Looking at first differences clarifies some of the perplexing results from Table 3. First, it seemed paradoxical that Per-Pupil Spending would have a negative coefficient (−1.95217). Of course this is the contribution of this explanatory variable in the nonsensical scenario, from a public policy view, where all other variables are set at zero. Recall that the first differences here fix means to all other variables and this is applied through all of the (nonlisted in Table 4) interaction variables as well. Thus, if there is a large discrepancy between the magnitude of the effect in the estimation table and the first difference result, it means that the interactions dominate the zero-effect marginal. Returning to the first differences for Per-Pupil Spending, moving from the first quartile to the third quartile improves the expected pass rate about 1% and moving across the whole range of spending for districts improves the expected pass rate slightly less than 6%. 31

Conclusion To a great extent, the mission here has been to build on the excellent, but perhaps forgotten, essay of Friedrich (1982). This work goes beyond Friedrich in that he does not consider the generalized linear model and focuses instead strictly on the interpretation of interactions in the linear model. Friedrich addresses a laundry list of criticisms, demonstrates why they are overblown, and then goes on to carefully develop the correct conditional nature of interaction coefficients and their standard errors. His work is extended here by building a framework for interpreting and understanding higher-order interaction effects and the more complicated measures of reliability of interaction effects imposed by the generalized linear model. This essay argues strongly that interactions are important components of researcher specifications in models of non-normal, non-interval outcome variables. When interactions effects are known to exist in a given dataset, they cannot be removed by transformation, case-deletion, or a change of method (Freeman 1985). So rather than considering interactions as annoying features that need to be mitigated, one should view such effects as critical components of the data. Interaction specifications are useful because it is often the case that the full effect of explanatory variables is not purely additive (through the link function). Although there are many ways to specify such models (nonparametrics, polynomials, mixtures, etc.), specifying hierarchical interactions is recommended because of the ease of implementation and directly computable solution. Furthermore, specifying interactions as product components is widely applicable across many data forms. Cohen (1978, p. 863) reinforces this point: In summary, partialed products of variables are correctly interpreted as interactions whatever their level of scaling, whether or not they are correlated, whether or not their means are zero, whether they are observational or experimental in origin, and whether single variables or sets of variables are at issue.

The primary methodological contribution provided here is a widely applicable procedure for calculating conditional standard errors in generalized linear models with hierarchical interaction terms. Thus far only limited forms have been offered, and it has been unclear whether researchers even need to use conditional forms at all. However, ignoring the conditional nature of standard errors in these specifications leads to incorrect estimation of overall coefficient uncertainty. The hope is that the described technique for obtaining standard errors in higher-order models will reduce the number of instances where the zero-effect standard errors are used simply because they were easily produced by the statistical software. 32

Appendix A: Scalar Derivations of Conditional Standard Errors In this appendix I derive conditional standard errors for two forms discussed. Assume, without loss of generality, that the parameter β1 is of interest and that levels of Xi , i 6= 1, values corresponding to the other parameters have been selected. The form here can be used for the linear model or for a generalized linear model where the link function is expressed on the left-hand side of the specification (this is not a serious restriction as the form of link functions for GLMs require that they can be expressed on either side of the model equation). For the simplest first-order interaction model:   X) P (Yi = 1|X log = β0 + β1 Xi1 + β2 Xi2 + β3 Xi1 Xi2 X) P (Yi = 0|X Friedrich (1982, p. 810) and Jaccard, et al. (1990, p. 27) provide the conditional standard error for β1 without its derivation. The variance can easily be produced as follows: (sβ1 +β2 X2 |X2 =x2 )2 = E[(β1 + β3 x2 )2 ] − (E[β1 + β3 x2 ])2 = E[β12 ] + E[2β1 β3 x2 ] + E[β32 x22 ] − E[β1 ]2 − 2E[β1 ]E[β3 x2 ] − E[β3 x2 ]2 = var[β1 ] + E[β1 ]2 + 2x2 cov[β1 , β3 ] + 2x2 E[β1 ]E[β3 ] + x22 var[β3 ] + x22 E[β3 ]2 − E[β1 ]2 − 2x2 E[β1 ]E[β3 ] − x22 E[β3 ]2 = var[β1 ] + 2x2 cov[β1 , β3 ] + x22 var[β3 ].

(17)

For the second-order interaction model specification:   X) P (Yi = 1|X log = β0 + β1 Xi1 + β2 Xi2 + β3 Xi3 + β4 Xi1 Xi2 + β5 Xi2 Xi3 + β6 Xi1 Xi3 + β7 Xi1 Xi2 Xi3 X) P (Yi = 0|X we can use the following procedure to get the conditional variance of the estimate of X1 at assigned levels of X2 and X3 :

33

(sβ1 +β4 X2 +β6 X3 +β7 X2 X3 |X2 =x2 ,X3 =x3 )2 = E[(β1 + β4 x2 + β6 x3 + β7 x2 x3 )2 ] − (E[β1 + β4 x2 + β6 x3 + β7 x2 x3 ])2 = E[β12 ] + 2x2 E[β1 β4 ] + 2x3 E[β1 β6 ] + x22 E[β42 ] + x23 E[β62 ] + 2x2 x3 E[β4 β6 ] + 2x2 x3 E[β1 β7 ] + 2x22 x3 E[β4 β7 ] + 2x2 x23 E[β6 β7 ] + x22 x23 E[β72 ] − (E[β1 ] + E[β4 x2 ] + E[β6 x3 ] + E[β7 x2 x3 ])2 = var[β1 ] + E[β1 ]2 + 2x2 (cov[β1 , β4 ] + E[β1 ]E[β4 ]) + 2x3 (cov[β1 , β6 ] + E[β1 ]E[β6 ]) + x22 (var[β4 ] + E[β4 ]2 ) + x23 (var[β6 ] + E[β6 ]2 ) + 2x2 x3 (cov[β4 , β6 ] + E[β4 ]E[β6 ]) + 2x2 x3 (cov[β1 , β7 ] + E[β1 ]E[β7 ]) + 2x22 x3 (cov[β4 , β7 ] + E[β4 ]E[β7 ]) + 2x2 x23 (cov[β6 , β7 ] + E[β6 ]E[β7 ]) + x22 x23 (var[β7 ] + E[β7 ]2 ) − E[β1 ]2 − 2x2 E[β1 ]E[β4 ] − 2x3 E[β1 ]E[β6 ] − 2x2 x3 E[β1 ]E[β7 ] − 2x2 x3 E[β4 ]E[β6 ] − x22 E[β4 ]2 − x23 E[β6 ]2 − 2x22 x3 E[β4 ]E[β7 ] − 2x2 x23 E[β6 ]E[β7 ] − x22 x23 E[β7 ]2 = var[β1 ] + 2x2 cov[β1 , β4 ] + 2x3 cov[β1 , β6 ] + x22 var[β4 ] + x23 var[β6 ] + 2x2 x3 cov[β4 , β6 ] + 2x2 x3 cov[β1 , β7 ] + 2x22 x3 cov[β4 , β7 ] + 2x2 x23 cov[β6 , β7 ] + x22 x23 var[β7 ].

(18)

In both cases the quantities in the final form come directly from the variance-covariance matrix. Note that (18) reduces to (17) if X3 is set to zero. Observe also that this last derivation provides the conditional variance of β1 only, an additional six different procedures are required to get the full set of conditional variances.

Appendix B: A Generalized Calculation of Conditional Standard Errors For the sake of readability I will use the example of the model specified in (13), but generalizability is preserved as none of the following steps is confined to this form. This expositional strategy has the additional advantage that the result can be compared with Jaccard, et al. for the model in (13). The procedure has five steps (here we assume that X1 is of interest): 1: Define a “specification vector” that summarizes the parameterization: S = [X1 , X2 , X3 , X1 X2 , X1 X3 , X2 X3 , X1 X2 X3 ] . 34

Now take the derivative of the specification vector with respect to the variable for which we want the coefficient’s standard error, drop any zero terms, and square the result:   S 2X1 = 1, X22 , X32 , X22 X32 . dS

(19)

The zero terms are excluded as they do not contribute to the variance calculations and preserving them in the notation makes the subsequent linear algebra more cumbersome.

2: Now define the lower triangular matrix taking partial derivatives according to  S 2X1 [1] dS   ∂  S2 S 2X1 [2] dS  ∂X2 dS X1 [2] ∆S 2X1 =   1 ∂ ∂ ∂ S2 S2 S2 S 2X1 [3] dS  ∂X3 dS X1 [3] 2 ∂X2 ∂X3 dS X1 [2]dS X1 [3]  ∂ ∂ ∂ 1 ∂ S2 S2 S2 S2 2 ∂X2 ∂X3 dS X1 [4] ∂X3 dS X1 [4] ∂X2 dS X1 [4] dS X1 [4]

        

S 2X1 [i] indicates the ith entry in the dS S 2X1 vector, and the zeros are left out of the upper triangle where dS for visual clarity. The observant reader will notice that

1 ∂ ∂ S2 S2 2 ∂X2 ∂X3 dS X1 [2]dS X1 [3]

and

1 ∂ ∂ S2 2 ∂X2 ∂X3 dS X1 [4]

are identical. The second is used for notational consistency with the rest of the final row. The rules for creating the matrix are as follows. First, the operand in the ith row is always the ith value of the S 2X1 vector. Second, the partial derivatives are determined by the constituent components of the dS S 2X1 working right to left on a given row from the diagonal element. Therefore the diagonal values dS S 2X1 vector is always 1. This rule is illustrated most are unaffected because the first element of the dS clearly in the last row. Third, if the column/row combination contains an interaction pair (counting rows again from the diagonal element as one), then the relevant partial is specified. Here the term ∂ 1 ∂ S2 S2 2 ∂X2 ∂X3 dS X1 [2]dS X1 [3]

comes from the necessity of including

∂ S2 ∂X3 dS X1 [3]

and

1 ∂ S2 2 ∂X2 dS X1 [2]

due to

the specified interaction. 3: Substitute the fixed values, x2 and x3 , for X2 and X3 into the matrix ∆S 2X1 . As noted before, these are most often the means, but need not be so if some other values are of theoretical interest. 4: Modify the variance-covariance matrix from the original model estimation by removing the rows and columns corresponding to the values set as fixed (fixed values obviously do not contribute to variance calculations). In this example, we remove the first row and column corresponding to the constant, and the third, fourth, and sixth rows and columns, corresponding to terms where the first derivative is zero h i X) P (Y =1|X ∂ log as indicated by the marginal effect of X1 in (13): ∂X X1 X ) = β1 + β4 X2 + β6 X3 + β7 X2 X3 . P (Y =0|X 35

The resulting matrix is given as an upper triangular form: 

Σ X1

 var(β1 ) cov(β1 , β4 ) cov(β1 , β6 ) cov(β1 , β7 )   var(β4 ) cov(β4 , β6 ) cov(β4 , β7 )  =  var(β6 ) cov(β6 , β7 )   var(β7 )



    .   

Again, the zeros are left out of the expression here just for convenience. 5: Finally calculate the conditional standard error by the square root of the trace of the product of these two matrices:

 1 ∆S 2X1 Σ X1 ) 2 . sβ1 |X2 =x2 ,X3 =x3 = trace(∆

(20)

It is straightforward to see from the structure of the ∆S 2X1 matrix that this can be quickly generalized to higher-order interaction effects and different subsets of effects. For example, in a complete thirdorder multiplicative specification, the derivative squared specification vector would be   S 2X1 = 1, X22 , X32 , X42 , X22 X32 , X22 X42 , X32 X42 , X22 X32 X42 , dS

and the associated specification matrix is  [1]   ∂  [2] ∂X2 [2]   ∂ 1 ∂ ∂  ∂X3 [3] 2 ∂X2 ∂X3 [2][3]   ∂ 1 ∂ ∂   ∂X4 [4] 2 ∂X3 ∂X4 [3][4] 2 ∆S X1 =   ∂ ∂ 1 ∂  2 ∂X2 ∂X3 [5] ∂X4 [5]   ∂ 1 ∂ ∂ 1 ∂  2 ∂X2 ∂X4 [6] 2 ∂X2 ∂X3 [6]   ∂ 1 ∂ ∂ 1 ∂  2 ∂X3 ∂X4 [7] 2 ∂X2 ∂X4 [7]  ∂ ∂ 1 ∂ ∂ 1 ∂ 4 ∂X2 ∂X3 ∂X4 [8] 2 ∂X3 ∂X4 [8]

 [3] 1 ∂ ∂ 2 ∂X2 ∂X4 [2][4]

[4]

∂ ∂X3 [5]

∂ ∂X2 [5]

[5]

∂ ∂X4 [6]

∂ ∂X3 [6]

∂ ∂X2 [6]

[6]

1 ∂ ∂ 2 ∂X2 ∂X3 [7]

∂ ∂X4 [7]

∂ ∂X3 [7]

∂ ∂X2 [7]

[7]

1 ∂ ∂ 2 ∂X2 ∂X4 [8]

1 ∂ ∂ 2 ∂X2 ∂X3 [8]

∂ ∂X4 [8]

∂ ∂X3 [8]

∂ ∂X2 [8]

[8]

S 2X1 vector. So as the variables and levels increase, the where [i] indicates the ith entry in the dS mechanical aspects of this procedure increase, but it is not theoretically any more complicated. The key analytical work is the specification of the ∆S 2Xi matrix (here we used Xi = X1 as an example), which is done for each variable that is specified as interacting with others.

36

                   

References Allison, Paul D. 1977. “Testing for Interaction in Multiple Regression.” American Journal of Sociology 83(1): 144-53. Anderson, Nick. 1998. “Data Foul-Ups Delay Analysis of School Tests.” Los Angeles Times (August 28): 1. Assmann, Susan F., David W. Hosmer, Stanley Lemeshow, Kenneth A. Mundt. 1996. “Confidence Intervals for Measures of Interaction.” Epidemiology 7(3): 286-90. Berry, Frances Stokes, and William D. Berry. 1991. “Specifying a Model of State Policy Innovation.” (Comment in response to Frant). American Political Science Review 85(2): 573-9. Berry, William D. 1999. “Testing for Interaction in Models with Binary Dependent Variables.” Political Methodology Working Paper Archive: http://polmeth.calpoly.edu. Bienen, Henry. 1993. “Leaders, Violence, and the Absence of Change in Africa.” Political Science Quarterly 108, 271-82. Bienen, Henry. 1979. Armies and Parties in Africa. New York: Africana Publishing. Bratton, Michael, and Nicholas Van De Walle. 1997. “Political Regimes and Regime Transitions in Africa, 1910-1994.” Study Number I06996. Ann Arbor: Inter-University Consortium for Political and Social Research. Bratton, Michael, and Nicholas Van De Walle. 1994. “Neopatrimonial Regimes and Political Transitions in Africa.” World Politics 46, 453-89. Bratton, Michael, and Nicholas Rothchild. 1992. “The Institutional Bases of Governance in Africa.” In Governance and Politics in Africa, Goran Hyden and Michael Bratton (eds.). Boulder, CO: Lynne Rienner. Box, G. E. P., and N. R. Draper. 1987. Empirical Model-Building and Response Surfaces. New York: Wiley & Sons. Aiken, Leona S., and Stephen G. West. 1991. Multiple Regression: Testing and Interpreting Interactions. Newbury Park, CA: Sage. Cohen, Jacob. 1978. “Partialed Products Are Interactions: Partialed Powers Are Curve Components.” Psychological Bulletin 85 (July): 858-66. Cornell, John A., and Douglas C. Montgomery. 1996. “Fitting Models to Data: Interaction Versus Polynomial? Your Choice!” Communications in Statistics–Theory and Methods 25(11): 2531-55. Cox, D.R. 1984. “Interaction.” International Statistical Review 52(1): 1-31. Decalo, Samual. 1976a. Coups and Army Rule in Africa. New Haven: Yale University Press. Decalo, Samual. 1976b. “The Morphology of Military Rule in Africa.” In Military Marxist Regimes in Africa, John Markakis and Michael Waller (eds.). London: Frank Cass. De Leeuw, Jan. 1990. “Discussion of the Papers by Edwards, and Wermuth and Lauritzen.” Journal of the Royal Statistical Society, Series B 52(1): 21-50. Del Pino, G. 1989. “The Unifying Role of Iterative Generalized Least Squares in Statistical Algorithms.” Statistical Science 4: 394-408. Dorff, Robert H., and Saul Brenner. 1992. “Conformity Voting on the United States Supreme Court.” Journal of Politics 54(3): 762-75. Edwards, David. 1990. “Hierarchical Interaction Models.” Journal of the Royal Statistical Society, Series B 52(1): 3-20. Epstein, Lee, Jeffrey A. Segal, Harold J. Spaeth, and Thomas G. Walker. 1994. The Supreme Court Compendium: Data, Decisions, and Developments. Washington, D.C.: Congressional Quarterly Press. Fahrmeir, Ludwig, and Gerhard Tutz. 2001. Multivariate Statistical Modelling Based on Generalized Linear Models. Second Edition. New York: Springer.

37

Feit, Edward. 1968. “Military Coups and Political Development: Some Lessons from Ghana and Nigeria.” World Politics 20, 179-93. Fienberg, S. E. 1980. The Analysis of Cross-classified Categorical Data. Second Edition. Cambridge, MA: MIT Press. Fisher, R. A., and W. A. Mackenzie. 1923. “Studies in Crop Variation. II. The Manurial Response of Different Potato Varieties.” Journal of the Agricultural Society 13: 311-20. Frant, Howard. 1991. “Specifying A Model of State Policy Innovation.” American Political Science Association 85(2): 571-3. Freeman, G. H. 1985. “The Analysis and Interpretation of Interactions.” The Statistician 12(1), 3-10. Friedrich, Robert J. 1982. “In Defense of Multiplicative Terms in Multiple Regression Equations.” American Journal of Political Science 26(4), 797-833. Fuller, Bruce, Costanza Eggers-Pierola, Susan D. Holloway, Xiaoyam Liang, and Marylee F. Rambaud. 1996. “Rich Culture, Poor Markets: Why do Latino Parents Forego Preschooling?” Teachers College Record 97(3): 400-18. Gill, Jeff. 2000. Generalized Linear Models: A Unified Approach. Newbury Park, CA: Sage. Goodman, Leo A. 1964. “Interactions in Multidimensional Contingency Tables.” Annals of Mathematical Statistics 35(2): 632-46. Green, P. J. 1984. “Iteratively Reweighted Least Squares for Maximum Likelihood Estimation, and some Robust and Resistant Alternatives.” Journal of the Royal Statistical Society, Series B 46: 149-92. Hanushek, Eric A. 1992. “The Trade-off Between Child Quantity and Quality.” The Journal of Political Economy 100(1): 84-117. Hanushek, Eric A. 1989A. “The Impact of Differential Expenditures on School Performance.” Educational Researcher 23(4): 45-65. Hanushek, Eric A. 1986. “The Economics of Schooling: Production and Efficiency in Public Schools.” Journal of Economic Literature 24(3): 1141-77. Hanushek, Erik A. 1971. “Teacher Characteristics and Gains in Student Achievement: Estimation Using Micro Data.” American Economic Review 61(2): 280-288. Hertz-Picciotto, I., A. H. Smith, D. Holtzman, M. Lipsett, and G. Alexeeff. 1992. “Synergism Between Occupational Arsenic Exposure and Smoking in the Induction of Lung Cancer.” Epidemiology 3(1): 23-31. Hosmer, David W., and Stanley Lemeshow. 1992. “Confidence Interval Estimation of Interaction.” Epidemiology 3(5): 452-6. Howard, J. Woodford. 1968. “On the Fluidity of Judicial Choice.” American Political Science Review 62(1): 43-56. Hyden, Goran. 1992. “Governance and the Study of Politics.” In Governance and Politics in Africa, Goran Hyden and Michael Bratton (eds.). Boulder, CO: Lynne Rienner. Hyden, Goran. 1988. “Governance: A New Approach to Comparative Politics.” Paper presented at: the Annual Meeting of the African Studies Association, Chicago, October 28-31. Jaccard, James, Robert Turrisi, and Choi K. Wan. 1990. Interaction Effects in Multiple Regression. Newbury Park, CA: Sage. Jaccard, James, and Choi K. Wan. 1995. “Measurement Error in the Analysis of Interaction Effects between Continuous Predictors Using Multiple Regression: Multiple Indicator and Structural Equation Approaches.” Psychological Bulletin 117(2): 348-57. Jackman, Robert W., Rosemary H. T. O’Kane, Thomas H. Johnson, Pat McGowan, and Robert O. Slater. 1986. “Explaining African Coups d’Etat.” The American Political Science Review 80, 225-50.

38

Jackson, Robert H., and Carl G. Rosberg. 1982. “Why Africa’s Weak States Persist: The Empirical and the Juridical in Statehood.” World Politics 35, 1-24. Johnson, Thomas H., Robert O. Slater, and Pat McGowan. 1984. “Explaining African Military Coups d’Etat, 1960-1982.” The American Political Science Review 78, 622-40. Kahn, Kim Fridkin, and Patrick J. Kenney. 1997. “A Model of Candidate Evaluations in Senate Elections: The Impact of Campaign Intensity.” The Journal of Politics 59(4), 1173-205. Khuri, A. I., and J. A. Cornell. 1987. Response Surfaces: Designs and Analyses. New York: Marcel Dekker. Kickbusch, Karla. 1985. “Minority Students in Mathematics: The Reading Skill Connection.” Sociological Inquiry 55(4): 402. King, Gary. 1989. Unifying Political Methodology. Cambridge: Cambridge University Press. King, Gary. 1989. “Variance Specification in Event Count Models: From Restrictive Assumptions to a Generalized Estimator.” American Journal of Political Science 33, 762-84. King, Gary. 1988. “Statistical Models for Political Science Event Counts: Bias in Conventional Procedures and Evidence for the Exponential Poisson Regression Model.” American Journal of Political Science 32, 838-63. King, Gary, Michael Tomz, and Jason Wittenberg. 2000. “Making the Most of Statistical Analyses: Improving Interpretation and Presentation,” American Journal of Political Science 44(2): 341-355. Koretz, Daniel. 1996. “Using Student Assessments for Educational Accountability.” In Improving America’s Schools: The Role of Incentives, eds. Eric A. Hanushek and Dale W. Jorgenson. Washington, D.C.: National Academy Press. Lacy, Dean, and Phillip Paolino. 1998. “Downsian Voting and the Separation of Powers.” American Journal of Political Science 42(4), 1180-99. Lancaster, H. O. 1969. The Chi-squared Distribution. New York: Wiley & Sons. Levin, Henry M. 1996 “Economics of School Reform for At-Risk Students.” In Improving America’s Schools: The Role of Incentives, eds. Eric A. Hanushek, and Dale W. Jorgenson. Washington, D.C.: National Academy Press. Maltzman, Forrest, and Paul J. Wahlbeck. 1996. “Strategic Policy Considerations and Voting Fluidity on the Burger Court.” American Political Science Review 90(3): 581-92. McCullagh, P., and J. A. Nelder. 1989. Generalized Linear Models. Second Edition. New York: Chapman & Hall. Meier, Kenneth J., and Joseph Stewart, Jr. 1991. The Politics of Hispanic Education. Albany, NY: State University of New York Press. Meier, Kenneth J., Joseph Stewart, Jr., and Robert E. England. 1991. “The Politics of Bureaucratic Discretion: Education Access as an Urban Service.” American Journal of Political Science 35 (1): 155-177. Meier, Kenneth J., Robert D. Wrinkle, and J. L. Polinard. 1999. “Representative Bureaucracy and Distributional Equity: Addressing the Hard Question.” Journal of Politics 61(4): 1025-1039. Meyer, Robert H. 1996. “Value-Added Indicators of School Performance.” In Improving America’s Schools: The Role of Incentives, eds. Eric A. Hanushek, and Dale W. Jorgenson. Washington, D.C.: National Academy Press. Moon, Bruce E., William J. Dixon. 1985. “Politics, the State, and Basic Human Needs: A Cross-National Study.” American Journal of Political Science 29(4), 661-94. Murnane, Richard J. 1975. The Impact of School Resources on the Learning of Inner City Children. Cambridge, MA.: Ballinger Publishing Co. Murnane, Richard J., and Randall J. Olsen. 1989. “The Effect of Salaries and Opportunity Costs on Duration in Teaching: Evidence From Michigan.” The Review of Economics and Statistics 71(2): 347-352.

39

Murphy, Walter. 1964. Elements of Judicial Strategy. Chicago: University of Chicago Press. Nagler, Jonathan. 1994 “Scobit: An Alternative Estimator to Logit and Probit.” American Journal of Political Science 38(1): 230-55. Nelder, J. A. 1977. “A Reformulation of Linear Models” (with discussion). Journal of the Royal Statistical Society, Series A, 140(1): 48-76. O’Donnell, Guillermo, and Phillip Schmitter. 1986. Transitions from Authoritarian Rule: Tentative Conclusions about Uncertain Democracies. Baltimore: Johns Hopkins University Press. O’Neill, M. E. 1980. “The Distribution of Higher-Order Interactions in Contingency Tables.” Journal of the Royal Statistical Society, Series B 42(3): 357-65. Pearce, N. 1989. “Analytical Implications of Epidemiological Concepts of Interaction.” International Journal of Epidemiology 18: 976-80. Rong, Zue Lan, and Linda Grant. 1992. “Ethnicity, Generation, and School Attainment of Asians, Hispanics and Non-Hispanic Whites.” Sociological Quarterly 33 (4): 625-636. Rosnow, Ralph L., and Robert Rosenthal. 1991. “If You’re Looking at the Cell Means, You’re Not Looking at Only the Interaction (Unless All Main Effects Are Zero).” Psychological Bulletin 110(3): 574-6. Rosnow, Ralph L., and Robert Rosenthal. 1989. “Definition and Interpretation of Interaction Effects.” Psychological Bulletin 105(1): 143-6. Rothman, K. J. 1986 Modern Epidemiology. Boston: Little, Brown. Rovine, Michael J., and Alexander von Eye. 1990. “Patterns of Yielding Significant Regression Interaction Terms.” In Computing Science and Statistics: Statistics of Many Parameters, Curves, Images, Spatial Models. Proceedings of the 22nd Symposium on the Interface, East Lansing, MI, May 16-19. New York: Springer-Verlag. Shively, W. Phillips. 1997. The Craft of Political Research. Fourth Edition. New York: Prentice Hall. Snyder, Eloise. 1958. “The Supreme Court as a Small Group.” Social Forces 36 (March): 236-8. Stigler, George J. 1966. The Theory of Price. London: Macmillan. Strenio, Andrew J. 1981. The Testing Trap. New York: Rawson, Wade Publishers, Inc. Thyregod, Poul, and Henrik Spliid. 1991. “A Hypothesis of No Interaction in Factorial Experiments with a Binary Response.” Scandinavian Journal of Statistics 18: 197-209. Uesaka, Hiroyuki. 1993. “Test for Interaction Between Treatment and Stratum with Ordinal Responses.” Biometrics 49 (March): 123-9. Wiarda, Howard J. 1985. New Directions in Comparative Politics. Boulder, CO: Westview Press. Wonnacott, Ronald J., and Thomas H. Wonnacott. 1970. Econometrics. New York: Wiley & Sons. Wood, Phillip Karl, and Paul Games. 1990. “Rationale, Detection, and Implications of Interactions Between Independent Variables and Unmeasured Variables in Linear Models.” Multivariate Behavioral Research 25(3), 295-311. Wood, Sandra L., Linda Camp Keith, Drew Noble Lanier, and Ayo Ogundele. 1998. “’Acclimation Effects’ for Supreme Court Justices: A Cross-Validation, 1888-1940.” American Journal of Political Science 42(2), 690-7. Zorn, Christopher J. 2001. “Generalized Estimating Equation Models for Correlated Data: A Review With Applications.” American Journal of Political Science 45, 470-90.

40

Table 1: Parameter Estimates, Military Coups In Africa Parameter

Standard

95% Confidence

Data

First

Conditional

95% Confidence

Estimate

Error

Interval

Range

Difference

SE

Interval

Constant

2.9209

1.3368

[ 0.3008: 5.5410]

[1.0: 1.0]

0.0000

3.8069

[ -7.4615: 7.4615]

Military Oligarchy

0.1709

0.0509

[ 0.0711: 0.2706]

[0.0: 18.0]

6.1984

1.0522

[ 4.1361: 8.2608]

-0.4654

0.3319

[-1.1160: 0.1851]

[0.0: 2.0]

-0.8436

1.3936

[ -3.5752: 1.8879]

Parties†

0.0248

0.0109

[ 0.0035: 0.0460]

[0.0: 62.0]

1.5715

1.0109

[ -0.4099: 3.5529]

Percent Legislative Voting

0.0613

0.0218

[ 0.0187: 0.1040]

[0.0: 77.4]

10.3679

1.0220

[ 8.3648:12.3710]

Percent Registered Voting

-0.0361

0.0137

[-0.0629:-0.0093]

[0.0: 99.8]

-8.6260

1.0138

[-10.6130:-6.6390]

Size (sq. km)

-0.0018

0.0007

[-0.0033:-0.0004]

[0.5:2506.0]

-0.2909

0.0003

[ -0.2916:-0.2902]

Population (millions)

-0.1188

0.0397

[-0.1965:-0.0411]

[0.1: 113.8]

-0.6222

0.0189

[ -0.6593:-0.5852]

Regimes

-0.8662

0.4571

[-1.7621: 0.0298]

[1.0: 4.0]

0.4028

0.5794

[ -0.7329: 1.5385]

Elections†

-0.4859

0.2118

[-0.9010:-0.0709]

[0.0: 14.0]

-0.2231

0.2358

[ -0.6853: 0.2391]

(Size)(Population)

0.0001

0.0001

[ 0.0001: 0.0002]

(Regimes)(Elections)

0.1810

0.0689

[ 0.0459: 0.3161]

Political Liberalization

41

Null deviance: 62.3716 on 46 degrees of freedom Residual deviance: 7.5474 on 35 degrees of freedom AIC: 91.742

Bold Coefficient Names Denote 95% CI Bounded Away From Zero, †Indicates Significance For Main Effects Model Only

Table 2: Parameter Estimates, Vote Fluidity on the Supreme Court Main Effects Model

First-Order Interaction Model

Second-Order Interaction Model

42

Parameter Estimate

Std. Error

95% Confidence Interval

Parameter Estimate

Std. Error

95% Confidence Interval

Parameter Estimate

Std. Error

95% Confidence Interval

Constant

-3.0304

0.0669

[-3.1614:-2.8993]

-2.9993

0.0721

[-3.1406:-2.8580]

-2.9995

0.0721

[-3.1409:-2.8582]

Distance

0.0608

0.0017

[ 0.0575: 0.0641]

0.0534

0.0021

[ 0.0492: 0.0575]

0.0534

0.0021

[ 0.0492: 0.0575]

Coalition

-0.4586

0.0538

[-0.5640:-0.3531]

-0.5528

0.0701

[-0.6901:-0.4156]

-0.5529

0.0703

[-0.6902:-0.4156]

Minimum Majority†

-0.3368

0.1128

[-0.5578:-0.1158]

-0.1893

0.1319

[-0.4478: 0.0692]

-0.1891

0.1319

[-0.4477: 0.0694]

Complexity

0.1121

0.0358

[ 0.0420: 0.1822]

0.2031

0.0418

[ 0.1212: 0.2850]

0.2032

0.0418

[ 0.1213: 0.2850]

Landmark

0.0709

0.1548

[-0.2324: 0.3743]

0.1189

0.1870

[-0.2477: 0.4855]

0.1188

0.187

[-0.2478: 0.4854]

Freshman

0.2665

0.1316

[ 0.0086: 0.5243]

0.3438

0.1653

[ 0.0198: 0.6678]

0.3439

0.1653

[ 0.0199: 0.6679]

Expert

-0.1522

0.1126

[-0.3730: 0.0685]

-0.1024

0.1475

[-0.3916: 0.1867]

-0.1025

0.1475

[-0.3916: 0.1867]

Chief Justice

-0.0886

0.1535

[-0.3894: 0.2122]

-0.0560

0.1503

[-0.3507: 0.2387]

-0.0560

0.1503

[-0.3507: 0.2387]

Minority Chief Justice

0.1657

0.2749

[-0.3731: 0.7045]

0.0643

0.2871

[-0.4985: 0.6271]

0.0125

0.2889

[-0.5538: 0.5788]

Dissent

0.3821

0.0700

[ 0.2450: 0.5193]

0.2252

0.0952

[ 0.0386: 0.4119]

0.2179

0.0955

[ 0.0307: 0.4051]

Unanimous

-0.1427

0.1298

[-0.3972: 0.1118]

-0.0739

0.1291

[-0.3270: 0.1793]

-0.0729

0.1291

[-0.3260: 0.1803]

Minority

0.9132

0.1846

[ 0.5515: 1.2750]

1.3242

0.2764

[ 0.7825: 1.8660]

1.3285

0.2769

[ 0.7858: 1.8712]

(Distance)(Minority)

0.0182

0.0037

[ 0.0110: 0.0254]

0.0184

0.0037

[ 0.0111: 0.0256]

(Coalition)(Minority)

0.1172

0.1126

[-0.1036: 0.3379]

0.1789

0.118

[-0.0523: 0.4101]

(Minimum Majority)(Minority)

-0.5623

0.2454

[-1.0433:-0.0813]

-0.5918

0.246

[-1.0739:-0.1097]

(Complexity)(Minority)

-0.2410

0.0784

[-0.3947:-0.0874]

-0.2188

0.0765

[-0.3687:-0.0688]

(Landmark)(Minority)

-0.0691

0.3317

[-0.7193: 0.5812]

-0.0666

0.3312

[-0.7157: 0.5825]

(Freshman)(Minority)

-0.2433

0.2748

[-0.7818: 0.2952]

-0.2129

0.2752

[-0.7522: 0.3264]

(Expert)(Minority)

-0.1113

0.2325

[-0.5671: 0.3445]

-0.0666

0.2342

[-0.5257: 0.3925]

(Distance)(Complexity)(Minority)

-0.0063

0.0030

[-0.0121:-0.0005]

(Coalition)(Expert)(Minority)

-0.5468

0.2629

[-1.0620:-0.0316]

Null Deviance: 8259.0 df: 15458

Residual Deviance: 5271.5

Residual Deviance: 5224.7

Residual Deviance: 5216.2

df: 15446, AIC: 5297.5

df: 15439, AIC: 5264.7

df: 15437, AIC: 5260.2

Bold Coefficient Names Denote 95% CI Bounded Away From Zero, †Indicates Significance For Main Effects Model Only

Table 3: Parameter Estimates for 1998 STAR Math Test Standard

99% Confidence

Estimate

Error

Interval

(Intercept)

2.95889

1.54654

[-1.02474: 6.94251]

Percent Low Income

-0.01682

0.00043

[-0.01793:-0.01569]

Percent Asian

0.00993

0.00060

[ 0.00836: 0.01147]

Percent Black

-0.01872

0.00074

[-0.02064:-0.01681]

Percent Hispanic

-0.01424

0.00043

[-0.01536:-0.01312]

Percent Minority Teachers

0.25449

0.02994

[ 0.19580: 0.31318]

Average Experience (years)

0.24069

0.05714

[ 0.12871: 0.35268]

Average Salary

0.08041

0.00139

[ 0.05312: 0.10770]

(Percent Minority Teachers)(Average Experience)

-0.01408

0.00190

[-0.01898:-0.00917]

(Percent Minority Teachers)(Average Salary)

-0.00401

0.00047

[-0.00523:-0.00279]

-0.00391

0.00096

[-0.00639:-0.00143]

(Percent Minority Teachers)(Average Experience)(Average Salary)

0.00022

0.00003

[ 0.00015: 0.00030]

Per-Student Spending

-1.95217

0.31677

[-2.57302:-1.33131]

Class Size

-0.33409

0.06126

[-0.45415:-0.21403]

-0.16902

0.03270

[-0.23310:-0.10494]

(Per-Student Spending)(Class Size)

0.09172

0.01451

[ 0.05435: 0.12908]

(Per-Student Spending)(Percent in College Courses)

0.04899

0.00745

[ 0.02980: 0.06818]

(Class Size)(Percent in College Courses)

0.00804

0.00150

[ 0.00418: 0.01190]

(Per-Student Spending)(Class Size)(Percent in College Courses)

-0.00225

0.00035

[-0.00315:-0.00135]

Percent Charter Schools

0.00492

0.00125

[ 0.00169: 0.00815]

Percent Year-Round

-0.00358

0.00023

[-0.00416:-0.00299]

Demographic Block

Teacher Block

Parameter

43

(Average Experience)(Average Salary)

Student

Percent in College Courses

Block

School Structure Null deviance: 34345.4 on 302 degrees of freedom Residual deviance: 4078.8 on 282 degrees of freedom AIC: 6039.2

Bold Coefficient Names Denote 99% CI Bounded Away From Zero

Table 4: Conditional First Differences for 1998 STAR Math Test Interquartile Range Total Effect

Full Range

44

Values

First Difference

Values

First Difference

Percent Low Income

[26.68:55.46]

-0.1189

[ 0.00: 92.33]

-0.3620

Percent Asian

[ 0.88: 7.19]

0.0154

[ 0.00: 63.20]

0.1555

Percent Black

[ 0.85: 5.96]

-0.0237

[ 0.00: 76.88]

-0.2955

Percent Hispanic

[13.92:47.62]

-0.1184

[ 2.25: 98.82]

-0.3155

Percent Minority Teachers

[ 6.33:19.18]

0.0144

[ 0.00: 80.17]

0.0907

Average Years Experience

[13.03:15.51]

-0.0024

[ 8.42: 20.55]

-0.0117

Average Salary

[55.32:62.21]

0.0210

[39.73: 80.57]

0.1243

Per-Pupil Spending

[ 3.94: 4.51]

0.0080

[ 2.91: 6.91]

0.0559

Class Size

[21.15:24.12]

0.0042

[14.32: 28.21]

0.0197

Percent in College Courses

[23.45:41.80]

0.0224

[ 0.00: 89.13]

0.1093

Percent Charter

[ 0.00: 0.00]

0.0000

[ 0.00: 71.43]

0.0875

Percent Year-Round

[ 0.00:12.31]

-0.0109

[ 0.00:100.00]

-0.0863

Bold Coefficient Names Denote 99% CI Bounded Away From Zero