Copyright 1992 by the American Psychological Association, Inc. 0033-2909/92/$3.00

Psychological Bulletin 1992, Vol. 111, No. 1, 127-155

Externalizing Behavior Problems and Academic Underachievement in Childhood and Adolescence: Causal Relationships and Underlying Mechanisms

Stephen P. Hinshaw
University of California, Berkeley

Conceptual and measurement issues surrounding externalizing behavior problems and academic underachievement, the strength and specificity of the covariation between these domains, and the viability of explanatory models that link these areas are reviewed. In childhood, inattention and hyperactivity are stronger correlates of academic problems than is aggression; by adolescence, however, antisocial behavior and delinquency are clearly associated with underachievement. Whereas investigations with designs that allow accurate causal inference are scarce, unidirectional paths from 1 domain to the other have received little support. Indeed, the overlap of externalizing problems with cognitive and readiness deficits early in development suggests the influence of antecedent variables. Low socioeconomic status, family adversity, subaverage IQ, language deficits, and neurodevelopmental delay are explored as possible underlying factors.

Links between academic underachievement and difficulties in behavioral adjustment have long been noted (Sampson, 1966; see also historical review of McGee, Share, Moffitt, Williams, & Silva, 1988). When several British epidemiologic investigations in the 1960s and early 1970s yielded clear evidence of overlap between reading deficits and behavioral problems of an acting-out or externalizing nature, interest in the phenomenon was rekindled (see review of Rutter, 1974). This association continues to generate research and lively debate (e.g., McGee & Share, 1988; Patterson, 1990; Rourke, 1988; Rutter, 1989; Schonfeld, 1990; Silver, 1990) for several reasons: First, in terms of prevalence rates, personal and societal suffering, and resistance to most intervention strategies, both externalizing behavior problems and achievement difficulties constitute major problems of childhood (Kazdin, 1987; Taylor, 1988). Second, each domain strongly predicts later maladjustment, in that externalizing problems often lead to antisocial behavior and substance abuse (Eron, 1987; Gittelman, Mannuzza, Shenker, & Bonagura, 1985), and severe underachievement in reading not only persists but also carries a poor prognosis for other domains (Rutter & Yule, 1975; Spreen, 1988). Third, elucidation of underlying mechanisms may yield theoretical insights into behavior-cognition links in both normal and atypical development, an important tenet of the field of developmental psychopathology (Cicchetti, 1989). Fourth, the association has direct implications for policy, as evidenced by recent efforts toward modifying U.S. law to include attentional deficits as a distinct category deserving of special education services (Professional Group for ADD and Related Disorders, 1991).

Definition of Domains

Externalizing Behavior

Childhood behaviors marked by defiance, impulsivity, disruptiveness, aggression, antisocial features, and overactivity are called undercontrolled, or externalizing (see Achenbach & Edelbrock, 1978). The distinctiveness of such features from behavior patterns termed overcontrolled, or internalizing (evidenced by withdrawal, dysphoria, and anxiety), has been established in numerous investigations (for a review, see Quay, 1986).1 Most notably, externalizing problems are more stable than internalizing behaviors, carrying (except in instances of severe inhibition or depression) a worse prognosis as well as resistance to most forms of intervention (Robins, 1979).

In child psychopathology, behavioral deviance can be characterized in several ways. For example, to convey a dimension of behavior, investigators typically sum or average data gathered from rating scales or behavior observations, yielding a quantitative score. Through application of cutoff scores or multivariate clustering strategies, the same instruments may yield a category, that is, a subgroup of children with common characteristics. When investigators use extensive clinical judgments, inclusionary markers, and clear exclusionary criteria, they obtain diagnoses. Finally, alternative definitions, such as the legal entity of juvenile delinquency, are sometimes invoked with respect to antisocial activities. This review includes reports that make use

Work on this article was supported by National Institute of Mental Health Grant No. MH 45064. I acknowledge the insightful comments of three anonymous reviewers, whose critiques facilitated the development of the arguments herein. Correspondence concerning this article should be addressed to Stephen P. Hinshaw, Department of Psychology, Tolman Hall, University of California, Berkeley, California 94720.

1 The domains are not completely independent, however. In fact, correlations between externalizing and internalizing behaviors are often moderate to high, especially when considerable psychopathology exists and particularly in young children (Achenbach & Edelbrock, 1983; Rose, Rose, & Feldman, 1989).


of each of these means of defining externalizing behavior problems. A major issue for the field is the validity of narrower dimensions or categories within the externalizing domain. Such validity depends on the potential separability of dimensions or subgroups, not only on the basis of defining criteria but, more important, on their degree of independence or divergent validity concerning such external variables as family history, pathophysiology, course, and response to intervention. Although opinion was strong in the recent past that undercontrolled behavior was unidimensional and that externalizing subtypes of children were not distinct (e.g., Quay, 1979; Sandberg, Wieselberg, & Shaffer, 1980), consensus has emerged that two major types of externalizing behavior, inattention and hyperactivity on the one hand and aggression-conduct problems on the other, show at least partial independence and some degree of divergent validity (Hinshaw, 1987). Thus, despite their considerable overlap (e.g., Offord, Alder, & Boyle, 1986; Szatmari, Boyle, & Offord, 1989a), I consider these features separately in relation to academic underachievement.2

Prevalence estimates for externalizing disorders vary with the stringency of definitional criteria. For categories that are based on quantitative instruments, cutoff scores of 1.5 or 2 standard deviations above the population mean are often used, yielding from about 2% to over 15% of the population, depending on the skewness of the distribution of scores. Because of the heterogeneity of categories defined solely on the basis of cutoff scores, their validity is often limited. Regarding formal diagnostic criteria, the official psychiatric nosology in the U.S., the Diagnostic and Statistical Manual of Mental Disorders (DSM-III-R), lists several disruptive behavior disorders (American Psychiatric Association [APA], 1987).
Conduct disorder, which involves persistent patterns of rule-breaking and violent behavior, is estimated to have a prevalence of 9% for boys and 2% for girls (see also Offord et al., 1986). Attention-deficit hyperactivity disorder (ADHD), signifying developmentally inappropriate levels of inattention, impulsivity, and overactivity, is believed to have an overall prevalence of approximately 3%; boys outnumber girls by a considerable margin, particularly in clinical samples (APA, 1987; see also Szatmari, Offord, & Boyle, 1989). Oppositional-defiant disorder, a controversial category that may constitute a precursor of conduct disorder, has an unknown prevalence, in large measure because of its marginal reliability and questionable validity (Key et al., 1988). Precise prevalence figures for these disorders await (a) investigations of the validity of various definitions and (b) continued refinement of sampling and instrumentation in epidemiologic studies.

The significance of externalizing behaviors is indicated by their correlates and their persistent course. Attentional problems are associated with such variegated features as developmental immaturity, language delays, and accidents (e.g., poisoning and bone fractures); conduct problems, on the other hand, are correlated with both low family income and dysfunctional family systems (e.g., Offord et al., 1986; Szatmari, Offord, & Boyle, 1989). Furthermore, both attention-disordered and conduct-disordered children have noteworthy difficulties in peer relationships (Milich & Landau, 1989), and follow-up studies have documented the intransigence of both inattentive-hyperactive behavior patterns (e.g., Barkley, Fischer, Edelbrock, & Smallish, 1990; Weiss & Hechtman, 1986) and aggressive-conduct-disordered behavior (e.g., Eron, 1987; Robins, 1970; Rutter, Tizard, Yule, Graham, & Whitmore, 1976). In all, given their prevalence, correlates, and persistence, externalizing behavior problems constitute a major problem for society (Kazdin, 1987).
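The earlier observation that cutoffs of 1.5 or 2 standard deviations capture different shares of the population depending on skewness can be made concrete with a short calculation. The sketch below is illustrative only: it assumes a standard normal distribution for the symmetric case and, as a hypothetical stand-in for a right-skewed rating scale, an exponential distribution whose mean and standard deviation both equal 1. Neither distribution comes from the studies reviewed here, and the function names are invented for the example.

```python
import math
from statistics import NormalDist


def normal_tail(k: float) -> float:
    """Share of a normal population scoring more than k SDs above the mean."""
    return 1.0 - NormalDist().cdf(k)


def exponential_tail(k: float) -> float:
    """Same share for an exponential distribution with mean = SD = 1.

    Hypothetical right-skewed example: P(X > mean + k * SD) = exp(-(1 + k)).
    """
    return math.exp(-(1.0 + k))


for k in (1.5, 2.0):
    print(
        f"cutoff {k} SD: normal {normal_tail(k):.1%}, "
        f"skewed {exponential_tail(k):.1%}"
    )
```

Under these assumptions the skewed distribution places more children above either cutoff than the normal one does, which is the direction of the effect described above; real rating-scale distributions may behave differently.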

Academic Underachievement

Below-average academic attainment can be classified in several ways. In the first place, the achievement domain of interest must be specified. Because of the need for literacy in industrialized societies, proficiency in reading is essential, and those who investigate achievement difficulties focus nearly exclusively on problems in reading decoding and reading comprehension. Difficulties in spelling, particularly those of a phonetically inaccurate nature, are correlated strongly with reading deficiencies (Sweeney & Rourke, 1985); I do not consider them separately. Although increasing interest has been shown in children with arithmetic and mathematics disorders (e.g., Strang & Rourke, 1985), I focus on deficient reading achievement herein.

Next, the degree of underachievement and its relation to measured intelligence must be considered. In the classic study of Rutter and Yule (1975), subaverage reading achievement was subdivided into two categories: (a) reading achievement significantly behind the level expected for the child's age (termed general reading backwardness, or GRB) versus (b) reading achievement significantly behind the level predicted from intelligence as well as age (specific reading retardation, or SRR). Both the nature of these definitional criteria and the sizable association between IQ and reading scores dictate that GRB children usually display somewhat subaverage intellectual performance (particularly in the verbal domain), which is invoked as the major explanation for their reading difficulties, whereas youngsters with SRR typically have average or above-average IQ scores. Although considerable overlap between these groups exists (i.e., nearly all children with SRR can also be classified as GRB; about half of the youngsters with GRB also have specific reading deficits), meaningful comparisons can be made between "pure" GRB children (those without IQ-discrepant reading scores) and those with "specific" reading deficits.
Rutter and Yule (1975) revealed a pattern of differential correlates for these two groups. Compared with GRB children, children with SRR overwhelmingly were boys, had fewer neurological signs, and displayed a somewhat better outcome in arithmetic but had a far worse prognosis in reading and spelling. Although replication efforts have not confirmed all such differences (e.g., Silva, McGee, & Williams, 1985), the separability of

2 Recent reports have explored the potential validity of further subdivisions. For example, evidence exists that inattention and motoric overactivity constitute separable subdimensions of hyperactivity and that antisocial behaviors may profitably be subdivided into those that are overtly aggressive (fighting, defiance, tantrums) versus those that are clandestine or covert (fire setting, truancy, cheating, stealing; see Loeber & Lahey, 1989, for a review). In focusing on the two major subdivisions, I use the terms attention deficits, attention-deficit hyperactivity disorder (ADHD), and hyperactivity interchangeably to refer to one type and aggression, antisocial behavior, and conduct disorder to denote the other dimension/category.


these two groups is at least partially validated, mandating the specification of IQ levels in children with reading difficulties (see also Aman & Singh, 1983).3 Furthermore, to ensure that measures of intellectual ability are not confounded with reading skill, individual (as opposed to group-administered) IQ tests are necessary.

As with externalizing behavior, the prevalence of underachievement varies with the stringency of the defining criteria. With a relatively stringent cutoff of reading scores at least 28 months below age norms, the prevalence of GRB for 10-year-old boys was found to be approximately 7% in a rural setting (Isle of Wight) and nearly 20% in inner London (Rutter et al., 1974). Regarding SRR, defined by a reading score at least 2 standard errors of estimate below the level predicted from IQ, the respective rates were approximately 4% and 10%. Current American estimates for specific (i.e., IQ-discrepant) reading disabilities are similar, from 2% to 8% of the school-age population (APA, 1987). Although boys are considered at higher risk to display specific (but not necessarily general) reading delays, recent data indicate that the preponderance of boys referred for learning difficulties may reflect biases in teacher identification procedures and in the greater amounts of problem behavior displayed by reading disabled boys (McGee et al., 1988; Shaywitz, Shaywitz, Fletcher, & Escobar, 1990).

The problems of children who display academic underachievement are not limited to the academic domain. Indeed, self-esteem deficits, problems in language skills, and interpersonal difficulties are common (e.g., Mann & Brady, 1988; Stone & LaGreca, 1990). Also, as noted above, youngsters with IQ-discrepant reading deficits tend to remain well behind their peers in reading skills during their educational careers (McGee et al., 1988; Rutter et al., 1976; see review of Spreen, 1988).
Thus, like externalizing behavior, underachievement in reading has meaningful correlates and a persistent course. Perhaps because of the rather inexplicable nature of the persistent reading problems of children with SRR (who, with normal IQs and few neurological signs, would be expected to make adequate gains), considerable research interest has focused on this group. In the United States, the educational term learning disability and the psychiatric term specific developmental disorder (APA, 1987) are used to describe the problems of these children, who are the subjects of most investigations in the field.4 Yet I do not limit my coverage to specific (IQ-discrepant) reading problems. In the first place, the majority of children with achievement deficits do not display intelligence-attainment disparities (e.g., Rutter, Tizard, & Whitmore, 1970); thus, the exclusion of such problems as low grades, special class placement, or general reading problems could severely limit generalizability. Second, the separation of specific from general reading delay is often problematic, especially for young children: The categories overlap considerably and their distinctiveness is unstable across development (e.g., McGee, Williams, Share, Anderson, & Silva, 1986). Finally, because recent longitudinal findings demonstrate provocative links between early verbal deficits and externalizing behavior in adolescence (Schonfeld, Shaffer, O'Connor, & Portnoy, 1988), low verbal intelligence may be an important domain in its own right. On the other hand, the relatively unequivocal causal link between globally retarded intellectual performance and a host of behavioral and


emotional problems (see Rutter et al., 1970) dictates that I do not consider studies of children with clear mental retardation.

Conceptual, Developmental, and Measurement Issues

Before reviewing relevant empirical research, I should highlight several thorny issues regarding the domains under consideration. Discussion of these conceptual and methodologic points focuses attention on the many difficulties that arise in attempting to draw conclusions regarding causal precedence or explanatory mechanisms. In the first place, as noted above, both domains under consideration are extremely heterogeneous: Externalizing behavior problems comprise a wide variety of constituent behaviors, and definitions of underachievement also include a host of problems (e.g., subaverage IQ, specific achievement deficits, grade retention, poor marks in school). Because some of the components within each area (e.g., hyperactivity vs. aggression; specific vs. general reading deficits) have been validated as partially independent, global statements concerning linkages between underachievement and externalizing behavior should often be replaced by models involving narrower variables. Yet much research in the field does not make use of measurement strategies that can adequately distinguish among such subdivisions (see discussion of Loney & Milich, 1982), limiting the specificity of causal models that can be formulated.5 Next, some of the components of each domain may be quite contaminated with aspects of the alternate domain. For example, retention and poor marks may be as closely related to behavior as to achievement deficits per se (e.g., Mantzicopoulos, Morrison, Hinshaw, & Carte, 1989). Thus, I focus primarily, but

3 Considerable controversy exists about the use of intelligence scores as the criterion from which specific deficiencies in reading achievement are judged. For example, intelligence does not constitute a fixed capacity for academic attainment, as attested by recent evidence that it does not place a rigid ceiling on reading levels (Share, McGee, & Silva, 1989). Furthermore, the achievement problems of children with marked reading deficits are real, regardless of their IQ scores. Yet knowledge of a child's intellectual level in relation to reading level may still be important.

4 Indeed, the underlying deficits of this group often are presumed to be of a subtle neurological variety, given their adequate intelligence and given the typical exclusion of such factors as poor schooling and cultural deprivation in diagnosis. Yet such diagnostic labels as dyslexia, which connotes an underlying neurological deficit for reading disabilities, have been severely criticized because of the heterogeneity of children with severe reading deficits and the lack of a consistent neurological or neuropsychological explanation for specific reading deficiencies (e.g., Rourke, 1985; Rutter & Yule, 1975). I make neither assumptions of homogeneity nor automatic etiological inferences about either academic underachievement or externalizing behavior problems in this review.

5 Note that although components of a broadband construct may be partially independent, these components may still coexist. For example, regarding the externalizing domain, hyperactivity and aggression often overlap (Hinshaw, 1987). Because achievement difficulties may pertain chiefly to those youngsters displaying both types of externalizing difficulty (e.g., McGee, Williams, & Silva, 1984b), a focus on narrowband conceptions or on differentiated subgroups should not overlook the frequent comorbidity between them.


not exclusively, on appraisal of underachievement by individual test administration. In addition, substantial debate centers around whether inattention, considered heretofore as a type of externalizing behavior, is actually a cognitive deficit (for debate, see Barkley, 1989; Douglas, 1983; Sergeant, 1989). An association between inattention (even if measured by adult ratings) and underachievement may thus signify a link within the general realm of cognitive deficits rather than a behavior-cognition correlation.

Third, manifestations of both externalizing behavior and underachievement change markedly over time. For example, the same youngsters who display delinquency in adolescence and severe antisocial behavior in adulthood often show comorbid aggression and hyperactivity during the grade school years (Magnusson, 1988; Moffitt, 1990). Furthermore, many of these same youngsters are quite likely to have displayed difficult temperaments in infancy and marked oppositionality in the preschool era. In short, there may be a heterotypic continuity across seemingly different aspects of externalizing behavior. Similarly, many grade school children with reading deficits evidence language delays during the preschool years, suggesting that early linguistic deficits become commensurate with underachievement when academic curricula are introduced (e.g., Mann & Brady, 1988). It follows that (a) measures must be tailored to children's developmental levels and (b) the status of a variable as a cause of problems in the counterpart domain (as opposed to its status as a consequence of the alternate domain) may be confused unless careful attention is paid to developmentally sensitive measurement.

A hypothetical example may help to illustrate this last point. Suppose it is discovered that in a given sample, acting-out behavior in the first grade, not accompanied by underachievement, predicts IQ-discrepant reading deficits 3 years later.
One might be tempted to conclude that a unidirectional causal linkage between early externalizing behavior and subsequent learning disabilities is supported. For one thing, however, it may be nearly impossible to find IQ-discrepant reading failure in first graders, because children cannot be 1 to 2 years behind predicted reading levels when they have been exposed to less than one grade of formal schooling. In other words, methods of defining underachievement in later years may be notably insensitive at earlier phases of development. Furthermore, these hypothetical first graders might well display delays (preexisting as well as concurrent) in receptive or expressive language abilities, delays that often translate directly into difficulties with the phonologic processing necessary for proficiency in reading. Such language deficits could actually be causative of early externalizing behavior and of subsequent underachievement. Without assessment of antecedent variables and without knowledge of developmental trajectories, erroneous causal precedence may well be inferred. An additional issue pertains to the source of assessment data regarding the domains under consideration, particularly externalizing behavior. Although adult informants provide the primary information regarding behavioral symptomatology, key informants (e.g., parents vs. teachers) show rather modest correspondence regarding their appraisals (see Achenbach, McConaughy, & Howell, 1987). Furthermore, recent evidence suggests that not only the prevalence but also the correlates of childhood

disorders vary when different sources are used to identify disordered behavior (Offord, Boyle, & Racine, 1989). Thus, associations with underachievement might pertain to externalizing behavior as defined by one set of informants but not another. This latter scenario would be particularly likely, for example, when teacher ratings were used to define externalizing behavior and when grades (given by teachers) indicated underachievement: Shared method variance might account for a good deal of the cross-domain linkage.

For all of the reasons highlighted in this section, measurement issues play a key role in explanatory accounts of overlap between externalizing behavior and underachievement. Investigators must use both developmentally sensitive measures and designs that afford inference of causal precedence, including assessment of antecedent variables, if cross-domain links are to be elucidated fully. I therefore highlight factors related to methodology in this review.

Evidence for the Association

Primarily to discover the specific types of underachievement that are linked to subcategories of externalizing behavior, I now review empirical evidence regarding the strength of the linkage between domains. An initial question pertains to the degree of covariation that would be meaningful or significant. Assuming an upper bound estimate of 10% for the domain of IQ-discrepant underachievement and a similar upper bound estimate of 10% for noteworthy externalizing behavior problems, the degree of overlap would be 1% (the product of the independent probabilities) if the domains were associated at chance levels. In fact, however, some contend that the overlap between attentional disorders and underachievement exceeds 50% (McGee & Share, 1988). The wide range of comorbidity estimates currently in the literature mandates careful appraisal of the evidence.

Because of the well-documented problems associated with inferring both prevalence and comorbidity rates from clinical samples, which are typically biased toward extreme pathology and toward frequent co-occurrence of a wide range of associated problems (see Rutter, 1989), the preferred database would comprise epidemiologic investigations of unselected populations. Inspection of the major child epidemiologic studies from the 1970s and 1980s, however, reveals that very few studies of this scope used individualized tests of achievement or intelligence, reflecting the need for most epidemiologic investigators to trade depth of assessment for breadth of coverage (for a review of recent population surveys of child disorders, see Brandenburg, Friedman, & Silver, 1990). Thus, I include only a handful of relevant epidemiologic reports, beginning with groundbreaking work by Rutter and colleagues (see Rutter, 1974, for a review of earlier studies).
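The chance-level benchmark in this argument is simple arithmetic, sketched below. The 10% base rates are the upper bound estimates assumed in the text; the function names are hypothetical, and reading a reported "overlap" figure as a conditional rate (the share of one group that also falls in the other) is an interpretive assumption rather than something the cited reports specify.

```python
def chance_joint_rate(p_underachievement: float, p_externalizing: float) -> float:
    """Expected share of children in both groups if the domains were independent."""
    return p_underachievement * p_externalizing


def chance_conditional_rate(p_other_domain: float) -> float:
    """Under independence, the rate of one problem within the other group
    equals its base rate in the whole population."""
    return p_other_domain


joint = chance_joint_rate(0.10, 0.10)
conditional = chance_conditional_rate(0.10)
print(f"chance joint rate: {joint:.1%}")
print(f"chance conditional rate: {conditional:.1%}")
```

On this reading, a conditional overlap above 50% would be more than five times the 10% expected by chance, which is why the spread of published comorbidity estimates matters.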

Key Epidemiologic Investigations

Isle of Wight and inner London studies. The epidemiologic study that reawakened the field to the association of interest was the Isle of Wight investigation of Rutter et al. (1970). This pioneering study made use of a two-stage procedure for assessing both underachievement and psychiatric disturbance. First, the general population of 9- to 11-year-olds on this rural island was screened for (a) intellectual and academic retardation, by means of group achievement tests, and (b) psychiatric disturbance, by means of parent and teacher behavioral questionnaires. Children considered at risk for intellectual-scholastic impairment, plus a randomly selected group of children from the general population, then received individual testing; those thought to manifest psychiatric disorder received semistructured parent-and-child interviews and additional adult ratings. So as not to confound appraisal of the overlap between domains, the cognitive and psychiatric evaluations were performed independently.

Relevant to the topic of interest, children with SRR were over four times more likely than the general population to display antisocial behavior, whether the latter was appraised from teacher ratings, parent ratings, or child interviews. This overlap was significantly larger than the association that would be expected by chance. For example, using data from the teacher rating instrument, Rutter's Scale B (see Rutter, 1967), 24% of the youngsters with SRR displayed above-cutoff levels of antisocial behavior, compared with approximately 5% of the general population. Furthermore, the reverse conditional probabilities were similar, in that over a third of the youngsters meeting criteria for antisocial behavior had SRR, a rate far higher than would be expected by chance (see Rutter & Yule, 1970).

This epidemiologic investigation was replicated in an inner London borough (see Berger, Yule, & Rutter, 1975), with similar but not identical screening methods. The base rates of both SRR and antisocial behavior were higher than on the Isle of Wight, and the degree of overlap was also greater: Nearly half of the children with SRR scored above cutoffs for antisocial behavior on teacher ratings (see Sturge, 1982).
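The "over four times" comparison follows directly from the Isle of Wight teacher-rating percentages quoted above. A minimal sketch, with a hypothetical helper name, treats the general-population rate as the reference rate; because the general population includes the children with SRR, the result is an approximation rather than a formal relative risk.

```python
def rate_ratio(rate_in_group: float, rate_in_reference: float) -> float:
    """How many times more common a behavior is in a subgroup than in a reference group."""
    if rate_in_reference <= 0:
        raise ValueError("reference rate must be positive")
    return rate_in_group / rate_in_reference


# 24% of children with SRR vs. roughly 5% of the general population (teacher ratings).
isle_of_wight = rate_ratio(0.24, 0.05)
print(f"antisocial behavior is about {isle_of_wight:.1f} times as common with SRR")
```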
As regards specificity, several findings are noteworthy. First, significant comorbidity was not found between SRR and classifications of internalizing (neurotic) behavior on the Isle of Wight. That is, SRR children did not have high levels of neurotic behavior, and children classified as neurotic did not have elevated levels of SRR, arguing for a specific link between SRR and acting-out behavior problems. Second, although children with GRB also had greater levels of problem behavior than did the general population, this association was not specific to antisocial behavior; GRB overlapped significantly with neurotic as well as with antisocial behavior. It was therefore concluded that antisocial behavior correlates more strongly and specifically with achievement failure than with subaverage intelligence. Although the formal classification of antisocial disturbance in this investigation was made from interviews as well as ratings, the figures presented above arose from the use of cutoffs on teacher rating scales. The common use of such ratings to ascertain externalizing behavior mandates brief comment. First, although such ratings constitute an economical and generally accepted mechanism for obtaining information about children's behavior, they are subject to distortions and bias (Saal, Downey, & Lahey, 1980). They may be less sensitive to internalizing than to externalizing aspects of behavior, especially when teachers are informants (e.g., Aman & Singh, 1983; Hinshaw, Han, Erhardt, & Dressier, 1991); for adolescents, they may be less sensitive to acting-out behavior than are self-reports (Offord et al., 1986). Next, the dimensions and categories yielded


by a particular scale can incorporate only those domains adequately represented in the item pool. As for the Rutter Scale B, only two general, broadband dimensions (antisocial vs. neurotic behavior) were available; the few items pertaining to inattention and overactivity did not form a separate scale with early use of this instrument (Rutter, 1967).6 Refinement of rating instruments and interview protocols has led to respecification of the linkage between externalizing behavior and underachievement for preadolescents.

Waltham Forest report. In a replication of the Isle of Wight and inner London methodologies with a younger cohort, Richman, Stevenson, and Graham (1982) studied linkages between cognitive-reading difficulties and behavior problems in a middle-class outer borough of London. Data collection began with over 800 three-year-olds, with follow-up at ages 4 and 8 for both behavior-problem and nonrisk subsamples. I focus here on age 8 data; preschool precursors of both behavioral disturbance and learning problems are addressed in a subsequent section. To avoid "empty sets" of reading-delayed children, Richman et al. (1982) eased the criteria for defining GRB and SRR from 28 to 18 months below age- or IQ-expected performance. Regarding cross-domain associations, children with behavioral deviance (defined by parent and teacher ratings in addition to global clinical severity ratings) were more likely than comparison youngsters to display underachievement, although the finding fell short of statistical significance. In addition, for boys only, GRB was associated with neurotic (but not with antisocial) behavior. As noted by Richman et al., the relatively small numbers of reading-delayed or behaviorally deviant children may have reduced the power to ascertain overlap.

Dunedin study. A more recent source of epidemiologic data is the Dunedin Multidisciplinary Child Development Study, a major prospective study of a birth cohort in New Zealand.
Throughout this continuing investigation, which has followed the cohort during odd-numbered years of life beginning at age 3, the association between cognitive-achievement delays and externalizing behavior has been explored. When the cohort was 7 years old, the Dunedin team performed factor analyses with the Rutter teacher scale, yielding a three-item Hyperactivity factor (composed of the items restless/overactive, squirmy, and poor concentration/short attention span) that was separate from the usual externalizing dimension of aggressive and antisocial behavior (McGee, Williams, et al., 1985; see also Schachar, Rutter, & Smith, 1981). Each of the three dimensions of deviant behavior (Aggressive-Antisocial, Hyperactive, and Anxious-Fearful, the last corresponding to neurotic or internalizing behavior) was examined for partial correlations with Verbal and Performance IQ and with the Burt Reading Test, controlling for the effects of the other two behavioral dimensions. The partial correlations of the aggression and internalizing factor scores with the cognitive variables were essentially zero, whereas the Hyperactivity dimension correlated significantly (and negatively) with Verbal IQ, Performance IQ, and reading level, even when the other behavioral dimensions were controlled. The main conclusion is thus that hyperactivity-inattention is specifically associated with below-average intelligence and with underachievement in childhood, helping to establish the validity of this domain as a subtype of externalizing behavior (see Hinshaw, 1987). In a closely related report, McGee, Williams, and Silva (1984b) used cutoff scores from the two externalizing factors to classify the 7-year-olds into hyperactive, aggressive, hyperactive-aggressive, or non-behaviorally disturbed categories. The two hyperactive groups displayed deficits in IQ and achievement, but the aggressive-only children typically did not differ from the control population. Yet aggression played an interactive role regarding specific reading delays: At both 7 and 9 years, the hyperactive-aggressive subgroup had a rate of SRR (36%) twice as high as rates for the single-disorder groups and five times that of the control children. Such results again point to the value of differentiating dimensions and categories within the externalizing domain. When the cohort had reached the age of 9, the Rutter parent and teacher scales were supplemented with additional items of inattentive, impulsive, and hyperactive behavior (McGee, Williams, & Silva, 1985). Not only did a more differentiated factor structure emerge, with inattention forming an independent dimension, but also only this factor correlated with IQ, reading, spelling, and speech measures (negatively in each case). Thus, an even finer distinction among types of externalizing behavior was partially validated, in that inattention was specifically associated with achievement and IQ.

6 Also, because diagnostic interviews used in the Isle of Wight study relied on British conceptions of hyperkinesis as a severe syndrome of pervasively overactive behavior, usually accompanied by mental retardation, the more formal diagnoses yielded by the Isle of Wight team did not include categories congruent with current conceptions of attentional deficits or hyperactivity.
At age 11, the cohort's externalizing and internalizing features were appraised by means of formal diagnostic interviews with the children, which were supplemented by parent and teacher reports to yield categories of attention-deficit disorder, conduct-oppositional disorders, and internalizing disorders. Only the children with attentional deficits (whether paired with the conduct-oppositional or internalizing categories or alone) displayed IQ, reading, and spelling performance scores below those of the nondiagnosed and internalizing children (Anderson, Williams, McGee, & Silva, 1989). Also, 62% of the 45 children classified as pervasively ADHD—that is, categorized on the basis of at least two independent sources—had severe SRR, defined by scores on the Burt Reading Test that were below the entire sample's average score at age 9 (see also McGee & Share, 1988). Thus, the earlier findings relating attentional problems to underachievement were replicated with a more formal diagnostic approach; such cross-method replicability adds to the viability of the findings. Dimensional IQ or reading scores were more often used in the Dunedin reports than were the categories of SRR or GRB. Indeed, these investigations raise the issue of the viability of categorical classifications of reading deficits, particularly for young children. First, as noted earlier, the typical requirement of rather severe (28-month) disparities between reading level and either intelligence or age is quite restrictive for young children. Indeed, it is nearly impossible for children with even slightly subaverage IQ scores to obtain reading scores lower than 1.5 or 2 standard errors of estimate below prediction, leaving the SRR category viable only for those young children with

relatively high IQ scores. In addition, even among somewhat older children, the distinction between SRR and GRB across the span of 9 to 11 years is relatively unstable (Share, McGee, McKenzie, Williams, & Silva, 1987). Given the importance of younger age groups for discerning underlying mechanisms that could mediate the overlap between externalizing behavior and underachievement, these difficulties in accurately categorizing reading problems are noteworthy. Overall, the Dunedin reports challenge the specificity of the linkage between antisocial behavior and SRR found in the Isle of Wight and inner London studies. Here, the externalizing features of hyperactivity and inattention were associated with both subaverage IQ and reading delays.7 Note, however, that youngsters with the combination of hyperactivity and aggression were at greatly elevated risk for SRR in the Dunedin reports. Other epidemiologic investigations. Several other studies using epidemiologic methods or population sampling merit brief mention. First, Lambert and Sandoval (1980) used various psychometric criteria to classify a community sample as either (a) hyperactive—that is, either home, school, or physician criteria —or (b) learning disabled—that is, various discrepancy formulas involving Wechsler Intelligence Scale for Children-Revised (WISC-R) IQ and Peabody Individual Achievement Test (PIAT) achievement scores. Under lenient criteria, 53.5% of hyperactive subjects were defined as learning disabled, but so were 20% of controls: a greatly inflated estimation. With more conservative definitions, lesser percentages of both hyperactive and control youngsters met criteria for learning disability (e.g., 16% of hyperactive children and 11% of controls meeting three of five strict criteria). 
Second, although lacking individual cognitive assessments, the Ontario Child Health Study provided demographic data for Canadian 4- to 16-year-olds regarding both poor school performance, defined as the child's having failed a grade or needing full-time remedial education, and the use of special education services (Szatmari, Boyle, & Offord, 1989). Whereas children with carefully defined attention deficits displayed poorer school performance and had higher rates of special education usage than did nondiagnosed agemates, these rates were not significantly different from those of children with other disorders (a global category of conduct disorders and emotional disorders). Given that retention and special education placement may reflect general behavioral difficulties as well as achievement problems per se, the lack of divergent validity among clinical disorders is not surprising.

7 The British and New Zealand epidemiologic results may not be as discrepant as they appear. Rutter and Graham (1970) divided the Rutter Scale B items into informal clusters, one of which comprised four "motoric" items (restless/overactive, fidgety, twitches, and poor concentration/short attention span). Inspection of Table 7.3 (Rutter & Graham, 1970, p. 109) reveals that for children with specific reading retardation (SRR), teachers endorsed these items much more frequently than any from the antisocial cluster; indeed, the endorsement of poor concentration was nearly 85% for SRR youngsters, almost double the rate for any other item on the entire scale. Yet because such items were not consolidated into a formal dimension with early use of the Rutter scales, inattention-hyperactivity could not be independently examined for covariation with reading delays.

In an Australian epidemiologic report, Holborow and Berry (1986) used the Conners Abbreviated Symptom Questionnaire (Goyette, Conners, & Ulrich, 1978) to measure hyperactivity and used four additional teacher-completed items to determine the prevalence of learning difficulties. Among children meeting the typical cutoff score for hyperactivity on the Conners, 26.5% met rating criteria for learning difficulties, compared with 5% of the remainder of the sample. Reversing the conditional probabilities, over 40% of the children rated as having learning problems were also considered hyperactive. The chief limitation of this report is the ad hoc, teacher-rated definition of learning difficulties, which contributes to shared method variance for defining both underachievement and externalizing problems. A salient issue is the age at which associations between behavioral problems and achievement difficulties first emerge. Hinshaw, Morrison, Carte, and Cornsweet (1987) performed a population study of kindergarten children in several suburban school districts, with a participation rate of over 80%. Parent and teacher ratings from the Revised Behavior Problem Checklist (RBPC; Quay, 1983) were examined for associations with group tests of academic readiness and individual intellectual assessments. For teachers, whereas the Conduct Disorder factor displayed nearly zero correlations with cognitive and early achievement indexes, the Attention Problem and several internalizing scales showed negative correlations ranging from -.25 to -.35. No significant correlations were noted for parent-rated factors, however, suggesting that situational specificity may exist for the correlates of behavioral problems (see also Szatmari, Offord, Siegel, Finlayson, & Tuff, 1990).
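The reversal of conditional probabilities in the Holborow and Berry (1986) figures can be made explicit with Bayes' rule. The sketch below is illustrative only: the 26.5% and 5% rates are those quoted in the text, but the proportion of children exceeding the hyperactivity cutoff (`base_rate_hyperactive`) is an assumed value that the review does not report, and the function name is hypothetical.

```python
def reversed_probability(p_ld_given_hyper, p_ld_given_other, base_rate_hyperactive):
    """Return P(hyperactive | learning difficulty) via Bayes' rule."""
    # Joint probability of being hyperactive AND rated as having learning difficulties.
    joint_hyper = p_ld_given_hyper * base_rate_hyperactive
    # Joint probability of being non-hyperactive AND rated as having learning difficulties.
    joint_other = p_ld_given_other * (1.0 - base_rate_hyperactive)
    return joint_hyper / (joint_hyper + joint_other)


# 0.265 and 0.05 are the rates quoted in the text.
# The base rate of 0.12 is an ASSUMED illustrative value, not a reported figure.
p = reversed_probability(0.265, 0.05, base_rate_hyperactive=0.12)
print(round(p, 3))
```

Under this assumed base rate the reversed probability comes out slightly above 40%, in line with the figure quoted. A smaller base rate would lower it, which is why the two conditional probabilities need not resemble each other.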
In a companion report with additional cohorts, Morrison, Mantzicopoulos, and Carte (1989) found that children at risk for learning disabilities (categorized on the basis of low perceptual and prereading skills) were rated as behaviorally deviant across all RBPC teacher dimensions. Yet partial correlations (not calculated in published study) are noteworthy: The correlation between attention problems and prereading achievement, partialing conduct problems, was a significant -.25, but the correlation of conduct problems with prereading, partialing attention problems, was .00. Thus, even before children experience a formal academic curriculum, hyperactivity is associated with readiness problems. Summary. The specific link between antisocial behavior and IQ-discrepant reading deficits from the seminal reports of Rutter et al. (1970) and Berger et al. (1975) was not replicated in the Dunedin studies, in which attentional deficits and hyperactivity comprised the externalizing scales or categories that were associated with both subaverage IQ and reading delay. A possible explanation is that the original Rutter Scale B used in England did not yield a separate inattention-hyperactivity factor; only through addition of pertinent items and subsequent factor analyses did divergently valid dimensions (and distinct subgroups) of aggression versus hyperactivity emerge. Other epidemiologic results essentially confirm the finding that among kindergarten and grade school children, inattention and hyperactivity are the most consistent correlates of underachievement.
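The partial correlations reported for the kindergarten data rest on the standard first-order formula, which removes from the association between two variables whatever each shares with a third. A minimal sketch follows, assuming three pairwise correlations that are invented purely for illustration (they are not values from Morrison et al., 1989); the function and variable names are likewise hypothetical.

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


# HYPOTHETICAL pairwise correlations, chosen only to illustrate the pattern.
r_attention_reading = -0.30
r_conduct_reading = -0.15
r_attention_conduct = 0.50

# Attention problems with prereading, partialing conduct problems.
pr_attention = partial_corr(r_attention_reading, r_attention_conduct, r_conduct_reading)
# Conduct problems with prereading, partialing attention problems.
pr_conduct = partial_corr(r_conduct_reading, r_attention_conduct, r_attention_reading)

print(round(pr_attention, 2), round(pr_conduct, 2))
```

With these assumed inputs, the attention-reading association is largely retained once conduct problems are partialed (about -.26), whereas the conduct-reading association falls to zero once attention problems are partialed. This is the same qualitative pattern as the one described in the text.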


Clinical Reports

Although the nonrepresentativeness of clinical samples is potentially problematic for inferring estimates of comorbidity, most of the evidence for overlap between the domains of interest comes from such sources. Given increasing sophistication in sampling and instrumentation, I highlight key clinical reports from the last dozen years. Because of the volume of such studies, I make no claims for exhaustive coverage. Links between externalizing disorders and underachievement in elementary grades. Most clinical reports that examine overlap between externalizing problems and underachievement before adolescence focus on hyperactivity-attention deficits rather than aggression or conduct disorder. In an early report that made use of an IQ-achievement discrepancy formula for ascertaining specific achievement deficits, Cantwell and Satterfield (1978) found that 39% of a clinical sample of hyperactive children (vs. 9% of controls) were two or more grades behind their predicted grade level in reading. The same percentage of overlap (39%) was found by August and Garfinkel (1990), who used a DSM-III-R definition of ADHD and a non-regression-based discrepancy formula (Wide Range Achievement Test-Revised [WRAT-R] scores 1 standard deviation below Peabody Picture Vocabulary Test-Revised [PPVT-R] IQ scores) to define reading disability (see also August & Garfinkel, 1989). On the other hand, among a clinical sample of 241 hyperactive children, defined by cutoffs on the Conners Teacher Rating Scale and clinical interviews with parents, only 22 (9%) met criteria for specific reading disability, which was indicated by a 1 standard deviation disparity between WRAT Reading and WISC IQ (Halperin, Gittelman, Klein, & Rudel, 1984).
Furthermore, in a recent investigation with rigorously defined samples of ADHD and ADHD-aggressive boys, Forness, Youpa, Hanna, Cantwell, and Swanson (in press) found that only 6% met strict criteria for reading disability (i.e., a reading score at least 1.5 standard deviations of the difference below IQ), with no difference in prevalence of reading problems between the subgroups. With a slightly more lenient criterion of a 1 standard deviation discrepancy, 10% (7 of the 71 boys) were reading disabled.8 In short, unless one uses estimates of underachievement that also classify overly large numbers of comparison children as underachieving (Lambert & Sandoval, 1980), the contention of McGee and Share (1988) that hyperactivity-attentional deficits overlap with learning disabilities at rates above 50% does not seem warranted. Although these and other clinical reports have yielded meager evidence for the differential relationship of hyperactivity versus aggression with underachievement (e.g., McConaughy, Achenbach, & Gent, 1988; Reeves, Werry, Elkind, & Zametkin, 1987), the recent investigation of Frick et al. (1991) is noteworthy because of its exemplary methodology. Here, externalizing children were categorized into ADHD, conduct disorder, or clinic control diagnoses on the basis of combined parent, child, and teacher interview data from the Diagnostic Interview Schedule for Children. Underachievement was determined on the basis of regression-formula discrepancies between individual IQ and achievement scores that also controlled for the age of the child. Depending on the stringency of the defining formula for underachievement, from about 6% to 20% of either conduct-disordered or attention-disordered children showed underachievement, reflecting a modest degree of association between each domain and learning problems. Yet multivariate logit analyses that controlled for the co-occurrence of ADHD and conduct disorder revealed that only attention-disorder status was uniquely associated with underachievement. Indeed, the apparent overlap between conduct disorder and underachievement was related to the presence of attentional problems in many conduct-disordered children. Thus, mirroring the epidemiologic results of the Dunedin investigation, Frick et al. (1991) concluded that attentional difficulties constitute the externalizing domain that is uniquely associated with underachievement during childhood.

8 Because children with attentional disorders often score below norms on an empirically derived cluster of Wechsler Intelligence Scale for Children-Revised items (Arithmetic, Coding, and Digit Span; see Kavale & Forness, 1984), their full-scale IQ scores may be somewhat depressed, meaning that they must attain extremely low achievement scores to meet stringent criteria for learning disability (e.g., a 1.5 standard deviation discrepancy from IQ). Thus, a 1 standard deviation disparity is considered more valid by some investigators (Forness, Youpa, Hanna, Cantwell, & Swanson, in press).

To summarize, although rates of overlap between externalizing disorders and psychometrically defined learning disabilities are above chance levels, recent investigations with rigorous formulas for defining underachievement have yielded comorbidity estimates that are lower than earlier figures. These rates are still higher, however, than those reported for internalizing youngsters. In addition, children with externalizing disorders often have achievement-related difficulties that may not be classifiable as formal learning disabilities, including retention and school suspension (Barkley et al., 1990; Forness et al., in press).
Furthermore, whereas many initial clinical reports did not find differences between hyperactive and aggressive children in rates of underachievement, the most methodologically sophisticated investigation in the field (Frick et al., 1991) indicates that hyperactivity-inattention is the externalizing domain most clearly associated with academic failure during the childhood years (for cogent discussion and provocative longitudinal findings, see Loney, Kramer, & Milich, 1981). Furthermore, the academic status of hyperactive children continues to be compromised during adolescent follow-up (Fischer, Barkley, Edelbrock, & Smallish, 1990). Links between externalizing disorders and cognitive-verbal deficits. Although the focus of this review is on underachievement, more basic cognitive deficits might also be associated with externalizing behavior problems. Whereas space limitations preclude a comprehensive review of this issue, I consider two recent reports that address the link between verbal, cognitive, and perceptual deficits with externalizing behavior. First, with the intention of distinguishing among clinical diagnostic groups regarding mechanisms of psychopathology, Werry, Elkind, and Reeves (1987) examined hyperactive, hyperactive plus conduct-disordered, anxiety-disordered, and comparison subjects on a host of laboratory measures of cognition, attention, motor coordination, impulsivity, and reading recognition (see also Reeves et al., 1987). The most striking finding was that when age, sex, and PPVT scores were partialed from the analyses that compared categories, almost none of the many

initial group differences remained. For instance, once equated for PPVT scores, the three clinical groups did not differ in reading achievement, arguing against a higher rate of specific reading problems in the externalizing categories. Furthermore, only the combined hyperactive plus conduct-disordered group differed significantly in reading from the comparison subjects when sex and PPVT were controlled. The conclusion was that the major deficit of youngsters with externalizing problems (particularly attention-deficit disorders) is in the area of verbal abilities. Indeed, long-standing verbal deficits could potentially be a powerful third variable, mediating both externalizing behavior and underachievement, a point addressed subsequently. Second, Szatmari et al. (1990) measured neurocognitive functioning in a clinic sample of externalizing and internalizing youngsters. Their battery included measures of visual-perceptual processing and problem solving. Carefully diagnosed groups of attention-disordered versus conduct-disordered children did not differ regarding rates of neurocognitive deficits, whereas both categories displayed more problems (primarily of a visual-perceptual type) than did the internalizing youngsters. When neurocognitive scores were used to predict dimensional teacher ratings of problem behavior, teacher-rated school performance mediated the relationship, suggesting that neurocognitive impairments may lead to disturbed behavior by means of underachievement. I discuss further the possible role of perceptual-neurodevelopmental delay in predicting both externalizing behavior and underachievement in a later section. Links between juvenile delinquency and cognitive-achievement problems. Because of interest in tracking the overlap between domains into adolescence, I briefly summarize findings regarding the cognitive-achievement status of delinquent youngsters. 
The volume of empirical reports on links between delinquency and underachievement requires that I rely on several major reviews (Hawkins & Lishner, 1987; Hirschi & Hindelang, 1977; Loeber & Stouthamer-Loeber, 1987; Quay, 1987; Rutter & Giller, 1983). First, delinquent adolescents have lower IQ scores than do their peers, particularly in the verbal domain. Although such intelligence deficits are not typically sufficient to place delinquents in a mentally retarded range, they are robust (on the order of half a standard deviation) and are particularly evident in subgroups of delinquent youngsters with aggressive-psychopathic features (Quay, 1987). Such IQ deficits are not explainable by socioeconomic status (SES) or racial differences between delinquent and comparison groups (Hirschi & Hindelang, 1977; Wilson & Herrnstein, 1985). Furthermore, from the reverse perspective, high IQ scores serve as a protective factor against delinquent outcomes for aggressive boys (White, Moffitt, & Silva, 1989). Note also that recent reports challenge the differential detection hypothesis, which contends that it is not delinquency per se but rather detection and incarceration that are associated with lowered intelligence. Notably, Moffitt and Silva (1988a) found that both adjudicated delinquents and a severity-matched group of self-reported delinquents had comparably depressed IQ scores. Second, delinquent adolescents have subaverage academic achievement, including an elevated rate of learning disabilities. Indeed, the link between academic failure and delinquency is claimed to be stronger than the association between lowered verbal IQ and delinquency (Hawkins & Lishner, 1987). Furthermore, among several skill deficits, achievement difficulties have the strongest concurrent associations with official and self-reported delinquency (Dishion, Loeber, Stouthamer-Loeber, & Patterson, 1984); poor academic performance by the end of elementary school significantly enhances the prediction of adolescent delinquency (Loeber & Dishion, 1983). These findings suggest that early underachievement is causally related to subsequent antisocial activity. Yet because underachievement and hyperactivity are often linked during grade school and because hyperactivity predicts later delinquency (Gittelman et al., 1985), support for a direct path from low achievement to delinquent behavior is weakened (see Loeber, 1990, for methodologic discussion). On the other hand, delinquency rates decline when adolescents are not in school, for example, during school vacations or after school dropout (see review of Phillips & Kelly, 1979). Furthermore, as is discussed subsequently, some reading disabled children without externalizing features develop delinquency by late adolescence (Maughan, Gray, & Rutter, 1985). Both of these findings suggest that at least in some instances, school failure may predispose to acting-out behavior. As is evident, causal paths between domains may be variegated and complex. To summarize, delinquency is associated with low verbal IQ and both general and specific achievement problems, but achievement difficulties are more strongly predictive of delinquent behavior than is low intellectual ability per se. Thus, by adolescence, there is a clear link between aggressive-antisocial acts and underachievement, whereas during childhood the more specific relationship pertains to hyperactivity-inattention. Such findings mandate examination of the causal nature of this complex association across development, the topic area to which I now turn.
Causal Pathways and Underlying Mechanisms

Hypothetical Causal Models

When two variables are associated, it is often assumed that the first caused the second or that the second caused the first. Within such unidirectional models, additional factors can be hypothesized to mediate paths from cause to effect, yielding both direct and indirect causal linkages. Even simple causal models can therefore quickly increase in complexity. Also, each factor might cause the other, suggesting bidirectional influence. Furthermore, a third factor (or set of factors) may cause both variables of interest, or distinct (but correlated) background factors may cause each domain, leading to a largely spurious association between the main variables (Olweus, 1983; Sturge, 1982). In basic terms, several causal models might explain the covariation between externalizing behavior and underachievement (see Huesmann, Eron, & Yarmel, 1987; Olweus, 1983, for cogent discussion):

1. Underachievement leads to externalizing behavior. This model requires a history of learning failure that precedes (or exacerbates) the emergence of externalizing features. This causal relationship might include such additional variables as frustration, lowered self-image, demoralization, or lack of school attachment, consequences of poor achievement that may mediate subsequent antisocial activity (e.g., Hirschi, 1969).
2. Externalizing behavior leads to underachievement. Here, behavioral disturbance predating school entry or appearing during early schooling would be viewed as primary; its interference with proper classroom behavior might be the key mediator of underachievement. For this model to be viable, the early externalizing features should predict subsequent underachievement independently of poor readiness skills, which might accompany the behavioral features.
3. Both domains lead to the other. This bidirectional model acknowledges that both of the previous unidirectional models occur simultaneously (see Olweus, 1983).
4. Underlying variables result in both problem domains. Such antecedent variables could be intraindividual (e.g., temperament, language difficulties) or environmental (e.g., discordant homes, large family size). Because this model requires that they causally precede the association, preliminary evidence for third variables would include the joint presence of externalizing behaviors and cognitive difficulties in early years. More comprehensive investigations would require prospective, longitudinal evaluations that include sensitive measures of the hypothesized causal variables and their statistical control in analyses of explanatory factors.

The heterogeneity of the domains under review and the probable complexity of the links between them may render linear causal models, even those with bidirectional implications, overly simplistic. Indeed, investigators of adult psychopathology are increasingly cognizant of the need to seek reciprocally deterministic, multifactorial models of causation (Ohman & Magnusson, 1987).
Because of the rapid development of children and because of the plethora of individual, familial, and school variables that could enter into causal equations, such complex models may be especially pertinent for developmental psychopathology. Much of the early literature regarding the current topic comes from cross-sectional or retrospective research designs. For example, Rutter and Yule (1970) compared three subgroups of boys from the Isle of Wight investigation: (a) antisocial only (ASB), (b) SRR only, and (c) jointly ASB and SRR (see discussion in Yule & Rutter, 1985). Because the latter, mixed subgroup had a later onset of antisocial tendencies than did the exclusively antisocial boys and because they resembled the SRR-only youngsters regarding key background variables, Rutter and Yule (1970) concluded that either antisocial behavior results from reading deficits or that prior variables must underlie the association. In similar analyses with the inner-London sample, however, Sturge (1982) did not find such a clear-cut pattern of results. Because these designs have clear limitations in detecting causal relationships, I turn to prospective, longitudinal studies, which are potentially more informative about causal inferences (Loeber, 1990).

Prospective, Longitudinal Investigations

First, because of the aforementioned trend for inattention-hyperactivity to be correlated with underachievement during grade school and for antisocial behavior-delinquency to show association with academic failure in adolescence, I subgroup the following reports on the basis of the age range (elementary vs. secondary grades) in which follow-up data were collected. Next, I consider only those reports with an interval of at least 1 year between initial and follow-up assessment periods, seemingly a minimum time period for appraising the relationships of interest. Third, whereas consensus is growing that meta-analytic methods are the preferred means for amalgamating findings across independent investigations, the extreme differences in measures, methods, and analytic plans across the relatively few studies to be reviewed preclude their meaningful combination for meta-analysis. Fourth, I limit my discussion to reports that have appeared since the review of Rutter (1974). To yield data that would shed light on causal relationships, investigations should include (a) assessments of both achievement-related and behavioral variables at initial as well as follow-up periods, (b) measures of relevant antecedent variables, and (c) statistical analyses that allow for causal inference, through control for the effects of antecedent factors or correlated predictors. Extremely few reports in the literature include all such features; these receive the strongest weight in my discussion of causal mechanisms. Nevertheless, because of their heuristic value, I also include additional studies without such fully informative designs, realizing their limitations regarding explanations of causal mechanisms. Elementary grades. Table 1 contains details of the investigations in which follow-up evaluations were made during elementary school. All reports used community samples of moderate-to-large size, a necessary feature to obtain enough children with either deviant behavior patterns or underachievement. Even so, those studies with categorical definitions of either poor academic performance or behavior problems often had a relatively small number of children who were classified (e.g., McGee et al., 1986).
Examination of the table reveals a diversity of methods for assessing cognitive deficits during initial assessment periods and for evaluating underachievement during follow-up periods; dimensional as well as categorical indexes are used in many reports. Indeed, only Richman et al. (1982), Jorm, Share, Matthews, and Maclean (1986), and McGee et al. (1986) used categories of GRB and SRR, partly because of the aforementioned difficulties in obtaining stable classifications of these categories during early and middle childhood. Teacher ratings predominate as the chief means of assessing behavior. Perhaps the most salient overall finding from these reports is that in all six instances in which relevant analyses were performed, patterns of problem behavior during the initial assessment were associated with early cognitive-readiness deficiencies (Lambert & Nicoll, 1977, did not perform relevant analyses; Stott, 1981, did not include measures of cognitive functioning during the initial assessment). Specifically, Kellam, Branch, Agrawal, and Ensminger (1975) found that teacher ratings of internalizing and externalizing items, which predicted lowered IQ and poor reading status 2 years later, were associated with low readiness performance in first grade. Next, antisocial behavior in kindergarten, which predicted poor reading 1.5 years later in the report of McMichael (1979), was also correlated with deficient readiness skills during the initial assessment. Such associations emerged even earlier in the report of Richman et al. (1982): At 3 years, general behavioral deviance overlapped with cognitive and language delay. Furthermore, in Palfrey, Levine, Walker, and Sullivan (1985), early measures of inattention correlated with poor reading readiness skills in kindergarten. Finally, both Jorm, Share, Matthews, and Maclean (1986) and McGee et al. (1986), who made classifications of GRB or SRR in grade school, showed externalizing behavior by one or both reading-delayed groups in kindergarten, before exposure to academic curricula. Such early associations between the domains of interest strongly suggest the influence of prior antecedent variables. Furthermore, these associations mandate statistical control when investigators attempt to establish causal precedence. That is, without such techniques as multiple regression, which allow for examination of the unique contribution of a predictor variable once its correlates are partialed, the status of a predictor as an independent cause is suspect. Unfortunately, several reports in Table 1 failed to make use of such analyses but still went on to posit a unidirectional influence from one domain to the other. In those reports in which such analyses were performed, it was found in one instance that internalizing rather than externalizing behavior uniquely predicted reading (Lambert & Nicoll, 1977) and in another that early readiness skills predicted subsequent underachievement more strongly than did behavioral measures (McMichael, 1979; see also Jorm, Share, Maclean, & Matthews, 1986). In short, associations in elementary school between underachievement and externalizing behavior are typically predated by correlated precursors, mandating a search for underlying variables that may be responsible for the early association and requiring adequate statistical controls in making causal inference. When such controls were performed in the studies reviewed, support for unidirectional causation was sharply mitigated. Several additional conclusions are suggested from Table 1.
In the first place, despite the caveats just raised, some suggestive evidence exists that the experience of reading failure may exacerbate initial externalizing behavior. For example, Jorm, Share, Matthews, and Maclean (1986) found that whereas hyperactive-inattentive behavior was elevated at school entry for children later classified as GRB, antisocial behavior in these youngsters increased after first and second grade. Also, the careful analyses of McGee et al. (1986) demonstrated that although teacher-rated externalizing behavior was elevated at school entry for boys later categorized as reading delayed, boys with GRB showed a relative increase in hyperactivity from age 5 to age 7, and boys with SRR displayed a relative increase in hyperactivity from age 7 to age 9. Second, at least in the preschool and early elementary years, links between behavior problems and underachievement are not always specific to the externalizing domain. For instance, in Richman et al. (1982), behavior-problem status at 3 years was relatively nonspecific: No consistent internalizing or externalizing dimensions were yielded from the rating scale and interview that were used. This generic behavior-problem category was predictive of GRB 5 years later. Also, in Kellam et al. (1975), the associations between each of six diverse behavioral items and either concurrent or subsequent achievement status were of nearly the same magnitude (see Tables 6 and 8; see also Stott, 1981, for similar findings). In addition, Lambert and Nicoll (1977) found that internalizing rather than externalizing ratings made independent predictions of later reading scores.

On the other hand, the investigations of Jorm, Share, Matthews, and Maclean (1986) and McGee et al. (1986) revealed that teacher-rated dimensions or categories of externalizing (hyperactive, aggressive) problems were correlated with delayed reading status but that internalizing behaviors were not. Although those reports with better-validated measures of problem behavior tend to converge on the primacy of externalizing behavior in relation to underachievement, complete specificity cannot be presumed.

Third, the contention that behavior problems are associated specifically with IQ-discrepant reading delay (SRR; see Berger et al., 1975; Rutter et al., 1970) is not supported by these investigations. Indeed, in many of the reports, indexes of behavioral maladjustment were correlated with either IQ measures or with reading scores/categories that were not adjusted for intelligence. Also, in direct opposition to the early British investigations, those studies with categorizations of GRB and SRR (Jorm, Share, Matthews, & Maclean, 1986; McGee et al., 1986; Richman et al., 1982) typically indicated that behavior problems were related more strongly to the former than to the latter category. Thus, little support exists for a preferential association of specific, IQ-discrepant reading deficits to behavioral problems.

Fourth, the magnitude of association or prediction across domains is typically modest to moderate. One perspective on this issue is provided by McMichael (1979), who found that although kindergarteners with antisocial problems tended to develop reading problems by the end of the first grade, over half of these youngsters failed to show subsequent underachievement. Thus, the relationship of interest may pertain to only a subgroup of children with externalizing behavior.⁹

A final point is that in addition to failing to control for correlated measures at preassessment, most reports did not analyze for bidirectional relationships.
That is, few investigators examined both whether early behavior-problem status predicted subsequent underachievement and whether early cognitive-readiness problems predicted later misbehavior. Unless both predictive relationships are examined, however, the full set of causal possibilities cannot be adequately appraised.

Secondary grades. Table 2 contains highlights of investigations that examined follow-up status during adolescence. Note that none of these reports made use of categories of SRR or GRB to define underachievement during the follow-up assessment; instead, investigators used such indexes as school grades, special classroom placement, early school withdrawal, and latent variables measured by multiple indicators. The aforementioned caveats about the overlap of such variables with behavior per se pertain here (see also Shaywitz et al., 1990). Externalizing behavior was also assessed in diverse ways: Peer sociometric evaluations, diagnostic interviews, examiner appraisal of aggressive behavior during testing sessions, and self-reports of delinquency were used in addition to the more typical teacher ratings.

Two of the investigations depicted in Table 2 included (a) measures of each domain at multiple time points, (b) assessments of relevant antecedent variables, and (c) structural equation modeling to analyze data, features that allow more confidence in causal inferences. First, in an explicit attempt to establish the credibility of unidirectional versus reciprocal versus
third-variable models, Olweus (1983) obtained peer assessments of aggression as well as averaged school grades at both 6th and 9th grades in a Swedish sample of boys; he also obtained measures of a number of familial and environmental background variables. Cross-lagged correlations provided no evidence for the low-grades-predict-aggression unidirectional model, but a near-significant effect was found for the aggression-predicts-low-grades path. However, given critiques of cross-lagged correlations as an analytic tool, Olweus also performed structural equation modeling. Once the background variables of social class, parents' ages, divorce, birth out of wedlock, and birth order were entered into the causal models, no unidirectional or reciprocal causation paths remained significant, leading to the conclusion that antecedent variables accounted for the association between poor grades and aggressive status. As mentioned earlier, however, poor grades constitute an index of underachievement that may well be directly contaminated by externalizing behavior. Second, the report of Schonfeld et al. (1988), which measured relevant background variables and used path analyses, provided some evidence for a link between cognitive deficits in childhood and conduct disorder 10 years later. Here, an indirect path between age 7 WISC Verbal IQ and severity ratings of conduct disorder at age 17—mediated by adolescent Wechsler Adult Intelligence Scale (WAIS) IQ scores—held up even when (a) early aggressive behavior and (b) the background variables of environmental disadvantage, neurological soft signs, and parental psychopathology were controlled. Yet measurement issues cloud the viability of these findings, in that initial aggression was indexed by behavior ratings from the psychologists who performed the cognitive testing. 
Such individualized appraisals of externalizing behavior in the laboratory, doctor's office, or testing room are not sensitive indicators (see, for example, Sleator & Ullmann, 1981). Because more valid indexes might have altered the paths, the unidirectional link of Schonfeld et al. from verbal deficits to conduct disorder should be viewed cautiously.

Several additional investigations, without such stringent analyses, provided apparent support for the unidirectional model that underachievement leads to externalizing behavior. For example, Farrington (1979) showed that both low IQ and low vocabulary status at 8-10 years of age predicted self-reported and official delinquency 6 years later. Because, however, teacher ratings of externalizing behavior at preassessment also predicted delinquent outcomes, and because analyses controlling for behavior measures were not performed, the causal role (text continues on page 140)

⁹ Another issue here has to do with the antisocial nature of the children studied by McMichael (1979), who used Rutter's (1967) method for classifying behavior-problem subgroups. In this procedure, an overall behavior-problem score on the total scale is used to classify a child as deviant; antisocial versus neurotic categorizations are then made on the basis of simple numeric comparisons. Thus, if a child received many endorsements for items pertaining to attention deficit-hyperactivity, he or she could still be classified as antisocial if the antisocial score were just 1 point higher than the neurotic score (e.g., 1 vs. 0; see Jorm, Share, Matthews, & Maclean, 1986). Accordingly, McMichael's findings might pertain more to hyperactivity than to antisocial tendencies per se.


Table 1
Longitudinal Investigations with Follow-Up Assessments in Elementary Grades

Initial assessment: sample and cognitive measure

Study: Kellam, Branch, Agrawal, & Ensminger (1975)
Location: Chicago
N: 649
Grade (age): 1st
Cognitive measure: IQ (Kuhlman-Anderson) and readiness (Metropolitan) tests
Scale type: Categorical

Study: Lambert & Nicoll (1977)
Location: San Francisco Bay Area
N: —ᵇ
Grade (age): 1st
Cognitive measure: Readiness battery (Let's Look at Children)
Scale type: Dimensional

Study: McMichael (1979)
Location: Edinburgh, Scotland
N: 198 (boys only)
Grade (age): Kᶜ
Cognitive measure: Readiness battery (Thackray)
Scale type: Dimensional, categorical

Study: Stott (1981)
Location: Guelph, Ontario, Canada
N: 1,292
Grade (age): K

Study: Richman, Stevenson, & Graham (1982)
Location: Waltham Forest, England
N: 185r
Grade (age): 3 yr
Cognitive measure: Language development (Reynell), picture vocabulary, general ability test (Griffith)

Study: Palfrey, Levine, Walker, & Sullivan (1985)
Location: Greater Boston
N: 285
Grade (age): 2 wk-5 yr'
Cognitive measure: Varied depending on child's age1
Scale type: Dimensional and categorical

Behavioral measure Content

Scale type

Teacher ratings on 6 items, including concentration and authority acceptance (externalizing), social contact (internalizing)

Teacher ratings on items including interpersonal (externalizing) and intrapersonal adjustment

Categorical

Teacher ratings on Rutter Scale B

Dimensional, categorical

Teacher ratings on 6 items: timid, distant, lethargic (internalizing); hyperactive, impulsive, hostile (externalizing)

Parent ratings and semistructured parent interview

Categorical

Parent ratings, teacher ratings, psychologist impressions on items and on scales measuring attentionj

Categorical

Dimensional

Categorical


Follow-up assessment Achievement measure

Behavioral measure

Grade

N 365

Content

Scale type

Content

Scale type

Key findings*

3rd

IQ (Kuhlman-Anderson) and reading (Metropolitan) tests

Categorical

Although teacher ratings in 1st grade predicted both IQ and reading categories in 3rd grade, the ratings were also correlated with IQ and readiness categories in 1st grade; further analyses were not performed

2nd

Group-administered reading test (Comprehensive Primary)

Dimensional

167 (boys only)

1stᶜ

Group-administered reading tests (Southgate)

Dimensional, categorical

Same as initial assessment

1,100

2nd

Group-administered reading and arithmetic tests

Categorical

Same as initial assessment

Teacher ratings of intrapersonal adjustment in 1st grade significantly predicted reading scores in 2nd grade, with sex, ethnicity, SES, and readiness level controlled (incremental R² = .064); ratings of interpersonal adjustment did not make significant prediction

Although teacher ratings of antisocial behavior in K predicted poor reading status at the end of 1st grade, these ratings were also associated with poor readiness in K. The latter were a stronger predictor of 1st-grade reading (incremental R² of antisocial scores = .024)

All 6 behavioral items in K predicted reading and arithmetic status in 2nd grade, with largest associations for lethargic (fi = .21 for reading) and hyperactive (