JOURNAL OF RESEARCH IN SCIENCE TEACHING

VOL. 31, NO. 8, PP. 857-871 (1994)

Gender Differences in Science Achievement: Do School Effects Make a Difference? Deidra J. Young and Barry J. Fraser

Science and Mathematics Education Centre, Curtin University of Technology, GPO Box U1987, Perth, Western Australia 6001, Australia

Abstract

The problem of the underrepresentation of girls in science in Australian schools is often attributed to their poor performance. Yet the role of both the home and the school in affecting female science achievement is rarely examined empirically. The comprehensiveness of the Second International Science Study database provided an excellent opportunity to investigate the presence of gender differences in science achievement. Although previous studies of gender differences in science achievement have relied on methodology that has not adequately accounted for the school effects, this study used the design effect and hierarchical linear modeling (multilevel analysis) to explore whether there were significant gender differences. The relative contribution of schools to student achievement was examined, and school-level differences were found to contribute significantly toward explaining variations in student performance. Although statistically significant sex differences were found in physics achievement for 10-year-old, 14-year-old, and year-12 students, school effects were much more powerful in explaining student differences (9-19%) when compared with gender (3%).

Introduction

The International Association for the Evaluation of Educational Achievement (IEA) initiated the first international science education study in 1959 (Husén, 1969). A nongovernment organization formed to conduct cross-national empirical educational research, the IEA sponsored the First International Science Study (FISS) between 1966 and 1972 in 19 countries (Comber & Keeves, 1973). The purpose of the FISS was to evaluate various national educational systems and to explore those variables that were thought to be associated with educational achievement. Although sex differences in science achievement were found during analyses of the FISS, the reasons for these sex differences have remained obscure.
© 1994 by the National Association for Research in Science Teaching. Published by John Wiley & Sons, Inc. CCC 0022-4308/94/080857-15

The Second International Science Study (SISS) was conducted in 1984 in 24 countries and included a much broader range of science achievement, attitude, and home background items than was available in the FISS and has proven to be a significant database for science education researchers (IEA, 1988; Rosier, 1987, 1988). The initial investigations of sex differences in science achievement by Keeves (1973), using the FISS data, suggested that girls consistently performed less well than did boys in mathematics and science. In addition, Kelly (1978) found that girls had less favorable attitudes toward
science than did boys. Unfortunately, the methodology of both of these studies relied upon the use of standardized sex differences to measure the size of the sex differences, without adequately accounting for the complex sample design used in the FISS. That is, both the FISS and SISS consisted of stratified sample designs with a separate stratum for each state and school type, as well as a two-stage student/school selection. Initially, schools were randomly selected from within each school type and state; then students were selected randomly from within each school. This type of sample design required the use of different techniques because most commonly used statistical analyses assume simple random sampling. In addition, the problems of ignoring the multilevel nature of the students grouped into schools and the lack of multivariate analyses were also evident in the FISS and SISS studies. For example, Kelly’s analysis of FISS (1978, p. 28) did not use an adequate methodology to analyze data both at the school and student levels, nor did she examine other factors simultaneously with science achievement, such as student attitudes, school characteristics, and home background. Similarly, Postlethwaite and Wiley (1992, pp. 75-77) presented sex differences in science achievement as standardized score differences for the 24 countries participating in the SISS. The influence of attitudes and home background of students in co-determining student achievement should not be ignored by educational researchers.

Consequently, the present study provided an opportunity to explore some of these important research questions using a new, more comprehensive database and taking advantage of recent advancements in research methodology. This article describes the initial analyses of the Australian SISS database, the purpose of which was to investigate sex differences in science achievement in Australian schools.
Additional factors associated with the differences in science achievement between girls and boys were examined in an attempt to derive implications for improving the participation and achievement levels of girls in science education in other studies (Young, 1991a, 1991b; Young & Fraser, 1992a, 1992b). These studies investigated the influence on science achievement of factors such as school type, home background, and attitudes toward science, which required an adequate national database that involved students from all types of schools and states. The Australian SISS provided such a database and therefore was chosen for use in the present study and in other studies. Previously reported investigations into sex differences for the SISS database by Humrich in the United States (1988), Keys in England (1986), Alting and Pelgrum in the Netherlands (1990), Rosier and colleagues in Australia (IEA, 1988; Rosier & Banks, 1990; Rosier & Keeves, 1991; Rosier & Long, 1991), and Keeves and others for all participating countries (Keeves, 1992; Postlethwaite & Wiley, 1992) have revealed sex differences in science achievement and attitudes existing among students of different ages and backgrounds irrespective of the science content area. The results of the SISS in Australia revealed that the average science achievement of 10-year-old and 14-year-old male students was significantly higher than for female students (Rosier & Banks, 1990). The mean science achievement for males was higher than for females in every Australian state, with the largest sex differences found in the Australian Capital Territory and the smallest in South Australia.
However, although the 10-year-old and 14-year-old student samples were representative of Australian students in these age groups, the year-12 science groups consisted of students with differential participation rates (i.e., there were more female year-12 students taking biology than male year-12 students; there were more male year-12 students taking physics than female year-12 students). In Australia, year-12 students range in age from 16 to 18, with 17 being the median age. There is an increasing tendency for students to stay in school until the end of year 12; however, this is a relatively new phenomenon.

A summary of results by Rosier and Banks (1990) revealed that males achieved higher scores than did females, particularly in chemistry and physics. The year-12 biology student sample appeared to have different characteristics from those of the year-12 chemistry and physics student samples, such as student verbal and quantitative ability and student attitudes toward science. For this reason, the three year-12 samples could not be considered as representative of students in the general community of Australian students in this age group (i.e., students still in school at year 12 are not similar to those students who left school early). At the time of this sample collection, year-12 students tended to be of higher social status and cognitive ability; however, more and more students are opting to remain in Australian high schools. Rosier and Banks’ analyses attempted to investigate sex differences using effect sizes, which did not adequately account for the nesting effect of students within schools (Rosier & Banks, 1990; Rosier & Long, 1991). In contrast, the present study examined sex differences in science achievement using the same database employed by Rosier, but with a technique that accommodated the hierarchical structure of students nested within schools.

The Sample Design

The sample design used in this study of the Australian SISS was a stratified two-stage cluster design composed of five different age groups. There were 4,259 10-year-old students (mostly Grade 5), 4,917 14-year-old students (mostly Grade 9), 1,631 year-12 students studying biology, 1,177 year-12 students studying chemistry, and 1,073 year-12 students studying physics (mostly 17-year-old students). Stratification was used to ensure that each state and school type was adequately represented. In addition, clustering was used to represent the natural structure of the school as one where students are clustered in classes and schools.
Random sampling was then done first in the selection of schools within each stratum and, secondly, in the selection of the students within each school. The 24 strata consisted of a matrix of the eight Australian states and territories (the Australian Capital Territory, New South Wales, Victoria, Queensland, South Australia, Western Australia, Tasmania, and Northern Territory) and the three school types (government, Catholic, and independent schools). In Australia, there are two types of private schools: Catholic and non-Catholic. Both types of private schools can be single-sex or coeducational and vary in social status depending on the location of the school. Many private schools target the poorer sections of the Australian community, whereas many others are elitist. Non-Catholic private schools tend to be called independent schools. Similarly, Australian public schools are usually referred to as government schools. They vary in social status and can be single-sex or coeducational. There has been a general trend in Australia, as in the United States, to transform single-sex schools into coeducational schools, particularly in the public sector. (See Table 1.)

Table 1. Sample Sizes for Each Student Population by Sex

Student population           Males   Females   Unknown   Total
10-year-old students         2078    2177      4         4259
14-year-old students         2565    2352      0         4917
Year-12 biology students     533     1069      29        1631
Year-12 chemistry students   663     502       12        1177
Year-12 physics students     735     332       6         1073

The first stage of sampling involved the selection of a sample of 40 schools from most states, with smaller numbers of schools from the smaller states. Each state had different sampling fractions owing to oversampling of Catholic and independent (non-Catholic nongovernment) schools in order to obtain a large enough number of schools to provide stable estimates of population parameters by strata. Schools were selected from a list of all primary and secondary Australian schools using a random-start, constant-interval method (arranged in numerical postcode order, reflecting a systematic geographical distribution of the school location). If a school did not wish to participate in SISS, a replacement school was then selected.

The second stage of sampling involved sampling a cluster of 24 target population students selected at random from within the selected schools. Each school participating in SISS provided a list of all students in the school belonging to each target population. Twenty-four students were selected by date of birth from the list by selecting all students born on the first day of any valid month, and then selecting students born on the second day of any valid month until the required number of 24 students was obtained.

Sex Differences in Science Achievement

The science achievement test for each of the five samples in Table 1 consisted of a common test and four rotated tests, with each student attempting the same common test and two of the rotated tests. For the purposes of this study, the science-achievement test items were categorized by content into two subtests for the 10-year-old and 14-year-old students, that is, biology and physics subtests (examples of both biology and physics test items are provided in Figures 1 and 2, respectively).
The chemistry subtest was discarded because of insufficient reliability (not enough chemistry items were available), whereas the reliability of the biology and physics subtests varied from 0.50 to 0.80 depending on the number of items available for analysis (see Table 2). Additionally, only common test items were used to provide a valid basis for comparing boys’ and girls’ achievement in different schools. Unfortunately, the SISS data only provided a small number of common test items, and this reduced the reliability of this study. Further studies by the authors used Rasch analysis of all science test items (common core test items and rotated test items) to produce ability estimates that have been adjusted for test item difficulty and differences in student abilities (Young & Fraser, 1992b).

Table 2. Reliabilities of Science Subtests for Populations 1, 2, and 3 (Kuder-Richardson 20)

Population   Achievement test   Number of items   KR-20
1            Biology            5                 0.49
1            Chemistry          2                 0.34*
1            Physics            7                 0.58
2            Biology            7                 0.55
2            Chemistry          2                 0.28*
2            Physics            8                 0.53
3            Biology            26                0.71
3            Chemistry          39                0.83
3            Physics            31                0.76

*These two subtests were discarded from further use in this thesis.

Figure 1. Examples of biology test items.

1. The diagram below shows an example of interdependence among aquatic organisms. During the day the organisms either use up or give off (a) or (b) as shown by the arrows.

[Diagram: floating water plant, small water animals, and water plant with roots, connected by arrows]

Choose the right answer for (a) and (b) from the alternatives given:
A (a) is oxygen and (b) is carbon dioxide
B (a) is oxygen and (b) is carbohydrate
C (a) is nitrogen and (b) is carbon dioxide
D (a) is carbon dioxide and (b) is oxygen
E (a) is carbon dioxide and (b) is carbohydrate

2. A girl found the skull of an animal. She did not know what the animal was, but she was sure that it preyed on other animals for its food. What clue led to this conclusion?
A The eye sockets faced sideways.
B The skull was much longer than it was wide.
C There was a projecting ridge along the top of the skull.
D Four of the teeth were long and pointed.
E The jaws could move sideways as well as up and down.

3. Some seeds germinate (start to grow) best in the dark, others in the light, while others germinate equally well in the dark or the light. A girl wanted to find out, by means of an experiment, to which group a certain kind of seed belonged. She should put some of the seeds on damp newspaper and
A keep them in a warm place in the dark.
B keep one batch in the light and another in the dark.
C keep them in a warm place in the light.
D put some on dry newspaper and keep them in the light.
E put some on dry newspaper and keep them in the dark.

4. Flowers cannot usually produce seeds unless
A they are visited by insects.
B they appear in the summer.
C they are on plants growing in good soil.
D they produce nectar.
E suitable pollen is placed on their stigmas.

Figure 2. Examples of physics test items.

1. Which of the following particles are gained, lost or shared during chemical changes?
A electrons furthest from the nucleus of the atom
B electrons closest to the nucleus of the atom
C electrons from the nucleus of the atom
D protons from the nucleus of the atom
E neutrons from the nucleus of the atom

2. How long is the block of wood shown in the diagram?

[Diagram: a block of wood above a scale marked 0, 10, 20, 30, 40, 50; length in cm (centimetres)]

A 10 cm
B 20 cm
C 25 cm
D 30 cm
E 35 cm

3. Mary and Jane each bought the same kind of rubber ball. Mary said, “My ball bounces better than yours.” Jane replied, “I’d like to see you prove that.” What should Mary do?
A Drop both balls from the same height and notice which bounces higher.
B Throw both balls against a wall and see how far each ball bounces off the wall.
C Drop the two balls from different heights and notice which bounces higher.
D Throw the balls down against the floor and see how high they bounce.
E Feel the balls by hand to find which is the harder.

4. An iron container is weighed after air in it has been pumped out (evacuated). Then it is filled with hydrogen gas and weighed again. What is the weight of the container full of hydrogen, compared to the weight of the evacuated container?
A less
B greater
C the same
D greater or less depending on the volume of the gas in the container
E greater or less depending on the temperature of the gas in the container

Table 3. Sex Differences in Science Achievement

                          Male             Female              M-F mean   M-F mean diff   Ratio   Ratio
Dependent variable        mean %   Male N  mean %   Female N   diff %     effect size     SRS*    complex*

10-year-old students
  Biology achievement     47.03    2078    45.18    2177       1.85       0.06            1.31    0.66
  Physics achievement     63.07    2078    54.44    2177       8.63       0.25            5.75    2.94
14-year-old students
  Biology achievement     65.90    2565    61.60    2352       4.30       0.13            3.17    1.18
  Physics achievement     68.11    2565    62.12    2352       5.99       0.21            5.10    2.03
Year-12 students
  Biology achievement     60.20    533     59.39    1069       0.81       0.03            0.46    0.28
  Chemistry achievement   56.68    663     49.38    502        7.30       0.31            3.79    2.25
  Physics achievement     57.20    735     51.94    332        5.26       0.22            2.58    1.62

*Ratio stands for the ratio of male-female mean difference and 2 standard errors of difference between male and female students. Ratio is statistically significant at the 0.05 level of confidence if ratio is ≥ 1.00 or ≤ -1.00.

For the year-12 students, science
achievement was categorized by the science being studied at pretertiary level, namely biology, chemistry, and physics (see Table 1). Although biology, chemistry, and physics test items were used for each age group, they were not the same test items, but rather varied in content and difficulty. Table 3 reports the male and female mean percent science achievement and sample sizes for each age group and science subtest. Sex differences were calculated by subtracting the female mean percent score from the male mean percent score. The difference between male and female means was then divided by the pooled standard error to provide an estimate of simple effect size (see column M-F Mean Diff Effect Size in Table 3). For comparison purposes with more sophisticated analyses described below, the significance of the sex differences was represented by the simple random sample (SRS) ratio of the male-female mean difference to two SRS standard errors of difference (Ratio SRS column in Table 3). For this ratio, any figure greater than 1 was statistically significant. That is, there were statistically significant sex differences favoring boys in biology and physics achievement for all age groups, except year-12 biology students. However, this ratio did not account for the complex sample design. To provide a simple statistic that did account for the complex sample design, the pooled standard error of difference was used to calculate the ratio between the male-female mean percent difference and two standard errors (complex) of difference (Kish, 1965; Ross, 1976). The adjustment used the design effect, which reflects how close the sample is to a simple random sample (column Ratio Complex in Table 3). The design effect was calculated using a technique called the bootstrap, developed by Efron (Efron, 1979, 1982; Efron & Tibshirani, 1986). However, this ratio did not adequately address the problem that the standard deviation is influenced by both student and school-level effects.
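The two ratios reported in Table 3 share the same numerator and differ only in the standard error used, so their relationship can be illustrated with a short sketch. The Python below is illustrative only: the function names are hypothetical, and the design effect is back-calculated from the published ratios on the assumption that it equals the complex-design variance of the male-female difference divided by its simple-random-sample variance. It is not the bootstrap procedure used by the authors.

```python
def significance_ratio(mean_diff, se_diff):
    """Male-female mean difference divided by two standard errors of the difference."""
    return mean_diff / (2.0 * se_diff)


def is_significant(ratio):
    """Table 3 convention: significant at the 0.05 level if ratio >= 1 or <= -1."""
    return abs(ratio) >= 1.0


def implied_design_effect(ratio_srs, ratio_complex):
    """Assumed definition: deff = (SE_complex / SE_srs) ** 2.

    Both ratios have the same numerator, so SE_complex / SE_srs
    equals ratio_srs / ratio_complex.
    """
    return (ratio_srs / ratio_complex) ** 2


# 10-year-old biology row of Table 3: Ratio SRS = 1.31, Ratio complex = 0.66
deff = implied_design_effect(1.31, 0.66)
print(round(deff, 2))          # about 3.94
print(is_significant(1.31))    # True under simple random sampling
print(is_significant(0.66))    # False once the sample design is accounted for
```

Under this assumed definition, a design effect near 4 would mean that the clustered sample carries roughly as much information about the sex difference as a simple random sample one quarter of its size.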
This problem is addressed later in this article. Using standard errors adjusted for the complex sample design, statistically significant sex differences in biology achievement were found among 14-year-old students, but not among 10-year-old students nor year-12 biology students (see Table 3). Note that the use of traditional simple random sampling standard errors for the same analyses indicated statistically significant
sex differences in biology achievement among 10-year-old students, with boys outperforming girls (Ratio SRS in Table 3), but differences were not found to be significant when the sample design effect was incorporated (Ratio Complex in Table 3). This finding, and numerous others, clearly highlights the importance of taking the complex nature of sampling designs into account when estimating the statistical significance of sex differences. Statistically significant sex differences in chemistry achievement were also found among year-12 chemistry students favoring boys. There were statistically significant sex differences in physics achievement among students from all age groups. These results suggested that the sex differences in physics achievement among Australian students both in primary and secondary schools decreased slightly with age, although the year-12 physics students were self-selected (i.e., tended to be of higher cognitive ability). In further analyses not presented here, these sex differences appeared to be greater among students from the higher socioeducational groups; however, both male and female students appeared to achieve higher physics scores when from the more advantaged home backgrounds (i.e., parents of better education and more professional occupations) and when attending independent single-sex schools. Of most significance, the aggregated socioeducational level of the school was found to be more highly associated with improved physics achievement than was the school type (see Young & Fraser, 1992a; Young, 1993).

The Hierarchical Linear Model

Although the previous section presented results suggesting that sex differences in science achievement were present, the methodology used was not consistent with the sampling methodology. A two-stage sampling procedure was used, firstly, with schools being sampled from each state and, secondly, students sampled from each school.
The nesting of groups of students within schools leads to homogeneity in student outcomes within schools and a confounding of the variability. This can give rise to misleading tests of significance, because the standard errors combine variation between students within schools with variation between schools. It is important, therefore, that the variance be decomposed into student and school levels before a test of significance is applied. In this section, the decomposition of the variance is investigated using the hierarchical linear model. Most educational research involves students who receive schooling in classrooms located within schools, within school districts, within states, and so forth, yet this structure is often ignored. The grouping of students, classes, and schools occurs in a hierarchical order with each group influencing the members of the group in thought and behavior. The nature of these hierarchical structures produces multilevel data. Traditional linear models on which most researchers rely require the assumption that subjects respond independently and that the sample is a simple random sample. However, most subjects in educational research are grouped into schools with resulting homogeneity within the school and variability between the schools. Educational researchers who ignore this problem could produce misleading statistical tests of significance, which underestimate the true standard errors (Raudenbush, 1988). Names for these types of models include multilevel linear models, mixed linear models, random coefficient models, and hierarchical linear models. Hierarchical Linear Model (HLM) is the label used by Raudenbush (1988), and HLM2 is the name given to the computer program developed by Bryk, Raudenbush, Seltzer, and Congdon (1989) for these types of analyses and used in this study. Any analysis of gender differences in science achievement between different school types
must take into account variations from school to school unless inferences are to remain doubtful. Raudenbush and Bryk (1986) pointed out the fallacies of research findings that ignore the potential effects of the school or classroom as sociological units, citing many research studies with doubtful inferences. For example, their reanalysis of data from a random sample of American high schools illustrated technical and conceptual advances facilitated by HLM and showed that the relationship between socioeconomic status and mathematics achievement varied substantially across American high schools, and that much of this variation was attributable to school type (public versus Catholic). Distinguishing between microparameter variance (such as school or classroom) and the sampling variance was possible with HLM, where it was possible to partition the socioeducational level effect into within-group and between-group components, which yielded an estimate of the school type effect substantially different from earlier estimates. Similarly, Lee’s (1986) reanalysis of data from the “High School and Beyond” study (Coleman et al., 1966; Haertel, 1987) revealed that differences between public and Catholic schools were attributable to the curriculum and the discipline policies of the schools. Use of the Hierarchical Linear Model for the investigation of the influence of the organizational structure of the school on student performance has been documented by Bryk and Raudenbush (1989, pp. 159-204), Lee and Bryk (1989), and Raudenbush and Bryk (1986). The present study sought to examine the role of school effects in explaining science achievement and sex differences in science achievement. Research on school effects is described as a set of data analyzed at the individual student level, with the assumption that classrooms and schools affect students equally. 
However, when the effects vary among individuals and their contexts, this type of statistical analysis can be misleading (Bryk & Raudenbush, 1987). This study endeavored to explain variations in student outcomes by first decomposing observed relationships into between-school and within-school components. Once the total variance was separated into the student (within) and school (between) levels, sex was included in the model to determine whether it was a significant factor in accounting for unexplained variance. In further analyses described elsewhere (Young, 1991a, 1993; Young & Fraser, 1992a, 1992b), examination of the sex differences in physics achievement revealed that the size of sex differences varied with socioeducational level, school type, and sex composition of the school. Explanation of sex differences in physics achievement in terms of the social class of the school, the school environment, and location is described in Young and Fraser (1992a, 1992b) and Young (1993).

Analysis of the Variance Components

The HLM2 program (Bryk et al., 1989) was used in the present research to partition the total variance in biology and physics achievement into the within-school and between-school components. These variance components were estimated by first fitting an HLM model for which only a random base coefficient is specified for the within-school model and the between-school model is unconditional (no school-level variables included). The variance components were estimated for 10-year-old and 14-year-old students initially by fitting a Hierarchical Linear Model (Equations 1 and 2), where only a random average biology and physics achievement coefficient was specified for the within-school model (no student or school-level variables). This was also done for year-12 chemistry and physics students. Year-12 biology students were not investigated further owing to the lack of sex differences found in biology achievement.

$$\text{Biology}_{ij} = \beta_{0j} + R_{ij} \qquad \text{(Equation 1)}$$

where $i = 1, \ldots, n_j$ students in school $j$; $j = 1, \ldots, J$ schools; $\text{Biology}_{ij}$ represents biology achievement of student $i$ in school $j$; $\beta_{0j}$ represents the mean biology achievement for students in school $j$; and $R_{ij}$ represents random error of student $i$ in school $j$.

$$\text{Physics}_{ij} = \beta_{0j} + R_{ij} \qquad \text{(Equation 2)}$$

where $i = 1, \ldots, n_j$ students in school $j$; $j = 1, \ldots, J$ schools; $\text{Physics}_{ij}$ represents physics achievement of student $i$ in school $j$; $\beta_{0j}$ represents the mean physics achievement for students in school $j$; and $R_{ij}$ represents random error of student $i$ in school $j$. A similar approach was used for year-12 chemistry and year-12 physics students. An unconditional between-school model was also specified for Equations 1 and 2:

$$\beta_{0j} = \mu_j + U_j \qquad \text{(Equation 3)}$$

where $\beta_{0j}$ represents the mean biology/physics achievement for students in school $j$; $\mu_j$ is equal to $\beta_{0j}$; and $U_j$ represents random error of school $j$. Equation 3 represents the model for school-level variations in student achievement, without fitting any school effects to the model.

To examine the influence of the student variable, namely sex of the student, the model was estimated with and without sex of the student so as to examine the difference in explanatory power (reduction in variance explained by the model). This model is shown in Equations 4 and 5:

$$\text{Biology}_{ij} = \beta_{0j} + \beta_{1j}\,\text{Sex}_{ij} + R_{ij} \qquad \text{(Equation 4)}$$

where Biology represents student biology achievement and Sex represents gender of the student. The beta coefficients are described as follows:

$\beta_{0j}$ = Mean biology achievement for students in school $j$.
$\beta_{1j}$ = The degree to which sex differences in biology achievement related to students’ achievement.

$$\text{Physics}_{ij} = \beta_{0j} + \beta_{1j}\,\text{Sex}_{ij} + R_{ij} \qquad \text{(Equation 5)}$$

where Physics represents student physics achievement and Sex represents gender of the student. The beta coefficients are described as follows:

$\beta_{0j}$ = Mean physics achievement for students in school $j$.
$\beta_{1j}$ = The degree to which sex differences in physics achievement related to students’ achievement.
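The logic of Equations 1 to 5 (split the variance into within-school and between-school parts, then see how much within-school variance is removed by adding sex) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the HLM2 estimation used in the study: it uses a method-of-moments (one-way ANOVA) estimator for balanced data, fits a single common within-school sex slope rather than a school-specific coefficient, and runs on made-up scores for three hypothetical schools. All function and variable names are invented for the example.

```python
from statistics import mean


def variance_components(scores):
    """Within- and between-school variance for balanced data (ANOVA estimator).

    `scores` maps a school id to a list of achievement scores, with the
    same number of students in every school.
    """
    n = len(next(iter(scores.values())))      # students per school
    J = len(scores)                           # number of schools
    school_means = {j: mean(ys) for j, ys in scores.items()}
    grand_mean = mean(school_means.values())
    ss_within = sum((y - school_means[j]) ** 2
                    for j, ys in scores.items() for y in ys)
    ms_within = ss_within / (J * (n - 1))
    ms_between = n * sum((m - grand_mean) ** 2
                         for m in school_means.values()) / (J - 1)
    between = max(0.0, (ms_between - ms_within) / n)
    return ms_within, between


def within_variance_after_sex(records):
    """Common within-school sex slope and the residual within-school variance.

    `records` maps a school id to a list of (sex, score) pairs,
    with sex coded 1 = male, 0 = female (an assumed coding).
    """
    sxy = sxx = ss_within = 0.0
    n_total = 0
    for rows in records.values():
        x_bar = mean(x for x, _ in rows)
        y_bar = mean(y for _, y in rows)
        for x, y in rows:
            sxy += (x - x_bar) * (y - y_bar)
            sxx += (x - x_bar) ** 2
            ss_within += (y - y_bar) ** 2
        n_total += len(rows)
    slope = sxy / sxx
    df = n_total - len(records) - 1           # one mean per school, one slope
    return slope, (ss_within - slope * sxy) / df


# Hypothetical data: three schools, four students each.
records = {
    "A": [(1, 60), (1, 64), (0, 50), (0, 54)],
    "B": [(1, 70), (1, 74), (0, 66), (0, 62)],
    "C": [(1, 55), (1, 51), (0, 45), (0, 49)],
}
scores = {j: [y for _, y in rows] for j, rows in records.items()}
within, between = variance_components(scores)
slope, within_sex = within_variance_after_sex(records)
print(f"within={within:.2f} between={between:.2f}")
print(f"sex slope={slope:.2f} reduction={100 * (within - within_sex) / within:.1f}%")
```

The last printed figure plays the role of the "% reduction in variance due to sex" idea: the drop in within-school variance once sex enters the student-level model.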

In this article the variable sex of the student was added to the explanatory model to determine whether sex differences were significant within a multilevel model. Other significant student and school-level variables were investigated, but space does not allow these results to be presented here (see Young, 1991a, and Young & Fraser, 1992a, 1992b, for further investigations of student and school-level effects that were found to explain student differences in science and physics achievement). For the 10-year-old student population, biology and physics achievement subtest scores were analyzed for variations in student outcomes using the Hierarchical Linear Model (Table 4).

Table 4. Hierarchical Linear Modeling for Gender Differences in Science Achievement

                          Mean school   Within-schools   Between-schools   Total      Within-schools   % Reduction in
Dependent variable        score         variance (%)     variance (%)      variance   variance + sex   variance due to sex

10-year-old students
  Biology achievement     51.10%        721.97 (91.0)    71.46 (9.0)       793.43     716.81           0.7%
  Physics achievement     59.45%        605.42 (90.3)    65.20 (9.7)       670.62     587.24           3.0%
14-year-old students
  Biology achievement     63.37%        498.01 (89.2)    60.55 (10.8)      558.56     491.63           1.3%
  Physics achievement     64.69%        370.24 (88.2)    49.99 (11.8)      422.35     359.15           3.0%
Year-12 students
  Biology achievement     59.83%        227.67 (87.3)    32.98 (12.7)      260.65     226.40           0.6%
  Chemistry achievement   52.51%        218.44 (81.2)    50.45 (18.8)      268.89     212.44           2.7%
  Physics achievement     55.56%        204.71 (80.6)    49.20 (19.4)      253.91     198.80           2.9%

Variance component analyses of biology and physics achievement showed that 91.0% and 90.3% of unexplained variance was attributable to within-school differences, and 9.0% and
9.7% to differences between schools, respectively. The very small between-school differences were also investigated further, but the within-school differences were of greater importance in explaining variance.

When the 14-year-old student population was investigated, biology and physics achievement subtest scores were also analyzed for variance in student outcomes, as shown in Table 4. The biology and physics achievement subtest mean scores were out of 100% and revealed larger explained variance in achievement when compared with the 10-year-old students. Of the variance in biology and physics achievement, 89.2% and 88.2% were attributable to within-school differences and 10.8% and 11.8% to differences between schools, respectively. Although a substantial amount of variance was attributable to between-school differences, the greater amount was attributable to within-school differences. School effects appear to contribute to variations in student achievement to a greater extent among the 14-year-old students than among the 10-year-old students.

The year-12 biology, chemistry, and physics students were investigated in the multilevel analyses next. Although 9% to 10% of explained variance in physics achievement was attributable to school effects among 10-year-old and 14-year-old students, the between-school share of variance was approximately double among year-12 chemistry and physics students (18.8% and 19.4%; Table 4). School effects did not appear to be as great for the year-12 biology students. Within-school variance contributed less to the total variation in chemistry achievement (81.2%), with the school contributing 18.8%. The slightly smaller within-schools variance among the year-12 chemistry students (218.44) is quite probably due to the students participating in year-12 chemistry courses being somewhat more similar to one another. Similarly, the year-12 physics students tended to have lower within-schools variance (204.71) when compared with 14-year-old students’ within-schools variance in biology and physics achievement. Differences in school characteristics contributed 19.4% toward variations in physics achievement among year-12 physics students. However, the school-level differences in year-12 biology achievement were smaller than those in chemistry and physics achievement. These variance-components results indicated that the schools contributed slightly to variations in science achievement and


that this contribution appeared to be more substantial among the year-12 students when compared with the 10-year-old and 14-year-old students.

Explanatory Power of Gender

Gender differences in science achievement and student attitudes were examined by Rosier and Banks (1990) and Rosier and Long (1991) with a post hoc adjustment of the sampling errors using the design effect. As elaborated on earlier in this article, however, those studies did not account for the nested structure of the school setting (students nested in classes and classes nested in schools). This section presents a Hierarchical Linear Model (HLM) approach to testing gender differences in science achievement by adding gender to the model described in the previous section (the unconditional model). The analysis was first conducted without any explanatory variables. Then the variable sex was added to the model to determine whether there was a substantial reduction in the amount of unexplained variance. The two-level model takes the following form:

Within-school model:
    Science Achievement = Base + Sex + Error

Between-school model:
    Base = Base + Error
    Sex = Base + Error
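In the notation of Equation 5, the verbal model above can be written out as follows. This is a sketch in conventional HLM notation rather than a formula reproduced from this article: the symbols $\gamma_{00}$, $\gamma_{10}$, $U_{0j}$, and $U_{1j}$ are assumed here, with $\gamma_{10}$ standing for the gamma coefficient for the sex effect and $Y_{ij}$ for whichever achievement subtest is being modeled.

```latex
\begin{align*}
\text{Within-school model:}\quad
  & Y_{ij} = \beta_{0j} + \beta_{1j}\,\text{Sex}_{ij} + R_{ij} \\
\text{Between-school model:}\quad
  & \beta_{0j} = \gamma_{00} + U_{0j} \\
  & \beta_{1j} = \gamma_{10} + U_{1j}
\end{align*}
```

Read this way, each school's "Base" and "Sex" coefficient is an overall average ($\gamma_{00}$ or $\gamma_{10}$) plus a school-specific deviation ($U_{0j}$ or $U_{1j}$).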

Hierarchical Linear Models were used to investigate the explanatory power of sex in science achievement. For the population of 10-year-old students, there was no significant gender effect for biology achievement, but the gender effect for physics achievement was statistically significant. For the 14-year-old students, the sex effect was found to be significant for both biology and physics achievement. When the gamma coefficient for the sex effect among year-12 biology students was examined, no significant effect was found, unlike year-12 chemistry and physics students, for whom the sex effect was more substantial.

When the total within-schools variance of the model with sex excluded was compared with that of the model with sex included, the reduction in variance was minimal. For the 10-year-old students, the inclusion of sex in the models of biology and physics achievement contributed 0.7% and 3.0%, respectively, toward explaining residual student-level variance (Table 4). For the 14-year-old students, the inclusion of sex in the models of biology and physics achievement contributed 1.3% and 3.0%, respectively, toward explaining within-schools variance. Finally, for year-12 students, the unexplained variance was reduced by 0.6%, 2.7%, and 2.9% for biology, chemistry, and physics achievement, respectively, when gender was included in the model.

The HLM approach, when compared with the post hoc design effect adjustment to standard error described previously in this article (Sex Differences in Science Achievement), confirmed that significant sex differences existed in science achievement among students and that the magnitude of these sex differences was likely to vary across schools. It should also be noted that the HLM approach reports sex differences similar to those obtained with the complex sample design adjustment used previously.
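The percentages quoted in this section can be recomputed from the variance components in Table 4. The sketch below is illustrative only: the function names are invented here, the rows are transcribed from Table 4, the between-schools share is taken as the between-schools variance divided by the printed total variance, and the reduction due to sex is taken as the proportional drop in within-schools variance once sex is added. That the published figures were rounded to one decimal place in this way is an assumption.

```python
# Rows transcribed from Table 4:
# (total variance, within-schools variance, between-schools variance,
#  within-schools variance with sex in the model)
TABLE_4 = {
    "10-year-old biology": (793.43, 721.97, 71.46, 716.81),
    "10-year-old physics": (670.62, 605.42, 65.20, 587.24),
    "14-year-old biology": (558.56, 498.01, 60.55, 491.63),
    "14-year-old physics": (422.35, 370.24, 49.99, 359.15),
    "year-12 biology": (260.65, 227.67, 32.98, 226.40),
    "year-12 chemistry": (268.89, 218.44, 50.45, 212.44),
    "year-12 physics": (253.91, 204.71, 49.20, 198.80),
}


def between_school_share(total, between):
    """Percentage of the total variance that lies between schools."""
    return 100.0 * between / total


def reduction_due_to_sex(within, within_with_sex):
    """Percentage drop in within-schools variance after adding sex."""
    return 100.0 * (within - within_with_sex) / within


summary = {}
for name, (total, within, between, within_with_sex) in TABLE_4.items():
    summary[name] = (
        round(between_school_share(total, between), 1),
        round(reduction_due_to_sex(within, within_with_sex), 1),
    )

for name, (share, reduction) in summary.items():
    print(f"{name}: between-schools {share}%, reduction due to sex {reduction}%")
```

Under these assumptions the recomputed values agree with the percentages printed in Table 4, which makes the contrast between the school share (roughly 9% to 19%) and the sex effect (at most 3%) easy to check.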
Sex differences in biology achievement proved to be statistically nonsignificant among the 10-year-old and year-12 student populations. When HLM was used to model sex differences in biology, chemistry, and physics achievement among year-12 students, there did not appear to be any sex differences in biology achievement (see Table 4). Boys appeared to outperform girls in chemistry and physics, but there were no significant school effects related to these sex differences. Caution should be observed in drawing


inferences from these gender differences, owing to the probable differences in student characteristics. In further studies, Young and Fraser (in press) investigated the role of student and school characteristics that help to explain these sex differences. Students electing to study biology at the year-12 level could be very different from the students electing to study chemistry and/or physics. In addition, significant school effects were noted for student differences in chemistry and physics achievement.

Discussion

Recent research into sex differences in science achievement has rarely examined the effect of the interaction between the school environment and school processes on student performance. When educational researchers ignore school effects, the student differences (or variance) become confounded by school differences, resulting in biased statistical significance tests. This study attempted to separate out the influence of school effects on student achievement and to examine gender as a determinant of student achievement.

Gender differences in science achievement arise from a number of social factors both at home and at school. Gender differences appear to be greater in some schools than in others, and this variation between schools is often neglected by researchers. This study examined gender differences both within the schools and between the schools using methodology that accounted for the hierarchical nature of student databases. When sex differences in biology and physics achievement were compared, they were found to be greater in physics achievement. However, these sex differences were not as substantial as the school effects. The use of multilevel analyses revealed that gender differences contributed relatively little toward explaining the overall amount of variation in student performance.
While school effects appeared to account for between 10% and 20% of the variance in physics achievement, the sex of the student contributed only 3% toward explaining variance in the Hierarchical Linear Model. This would lead us to doubt the validity of focusing so much research on reducing sex differences as a pathway toward equity in science education. Perhaps equity issues are better addressed by examining other factors associated with student differences.

A limitation of this study was the limited reliability of the subtests constructed for these analyses. The use of common core test items meant that there were only between 5 and 7 items in some of the subtests measuring biology and physics achievement. The subtests would have been much more reliable had both core and optional test items been used. However, the optional test items varied in difficulty and were not undertaken by all students in the sample. In further analyses, the Rasch probabilistic model will be used to equalize the difficulty of rotated test items and thus increase the reliability of the subtests and the efficacy of the multilevel analyses (Young & Fraser, 1992a, 1992b, in press).

This article presents some limited findings that gender differences may be related to school-level differences. Although these are not necessarily new findings, they do show the usefulness of the HLM for revealing the amount of unexplained variance due to school effects, when compared with the student-level variance. The traditional models using standard errors for measuring statistically significant sex differences (such as the t-test) are likely to misestimate the standard errors and provide educational researchers with misleading results. If this large database reveals that between 10% and 20% of the unexplained variability lies at the school level, then the implications are enormous for educational research. Unless researchers use the Hierarchical Linear Model for hierarchical data, such as students nested in schools, their results are almost certainly going to be of dubious quality, with faulty statistical tests and poor control of student and school-level variables.
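The warning about misestimated standard errors can be illustrated with the design effect referred to earlier in this article. A common approximation for equal-sized clusters (Kish, 1965) is deff = 1 + (m - 1)ρ, where ρ is the intraclass correlation and m is the number of students sampled per school. The sketch below is hypothetical: the cluster size of 20 is an assumed value rather than one reported here, the function name is invented, and ρ is set to the between-schools share for year-12 physics in Table 4.

```python
import math


def design_effect(icc, cluster_size):
    """Approximate design effect for equal-sized clusters: 1 + (m - 1) * icc."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1")
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return 1.0 + (cluster_size - 1) * icc


icc = 0.194        # between-schools share for year-12 physics (Table 4)
cluster_size = 20  # hypothetical number of students sampled per school

deff = design_effect(icc, cluster_size)
se_inflation = math.sqrt(deff)  # factor by which a naive standard error is too small

print(f"design effect = {deff:.3f}, standard error inflation = {se_inflation:.2f}")
```

Under these assumed values, a standard error computed as if the students were a simple random sample would be too small by a factor of a little over 2, which is the kind of distortion a test that ignores the nesting of students in schools would carry.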


References

Alting, A., & Pelgrum, W.J. (1990). The SISS in the Netherlands: Descriptive and gender differences. Studies in Educational Evaluation, 16, 421-441.

Bryk, A.S., & Raudenbush, S.W. (1987). Application of hierarchical linear models to assessing change. Psychological Bulletin, 101(1), 147-158.

Bryk, A.S., & Raudenbush, S.W. (1989). Toward a more appropriate conceptualisation of research on school effects: A three-level hierarchical linear model. In R.D. Bock (Ed.), Multilevel analysis of educational data (pp. 159-204). San Diego: Academic.

Bryk, A.S., Raudenbush, S.W., Seltzer, M., & Congdon, R.T. (1989). An introduction to HLM: Computer program and users’ guide. Chicago: University of Chicago Press. (Available from Scientific Software in the USA)

Coleman, J.S., Campbell, E.Q., Hobson, C.J., McPartland, J., Mood, A.M., Weinfeld, F.D., & York, R.L. (1966). Equality of educational opportunity. Washington, DC: U.S. Government Printing Office.

Comber, L.C., & Keeves, J.P. (1973). Science education in nineteen countries: An empirical study. International Studies in Evaluation I. Stockholm: Almqvist and Wiksell.

Efron, B. (1979). Bootstrap methods: Another look at the jackknife. The Annals of Statistics, 7(1), 1-26.

Efron, B. (1982). The jackknife, the bootstrap and other resampling plans. Philadelphia: Society for Industrial and Applied Mathematics.

Efron, B., & Tibshirani, R. (1986). Bootstrap methods for standard errors, confidence intervals, and other measures of statistical accuracy. Statistical Science, 1, 54-77.

Haertel, E.H. (1987). Comparing public and private schools using longitudinal data from the HSB study. In E.H. Haertel, T. James, & H.M. Levin (Eds.), Comparing public and private schools. Vol. 2: School achievement (pp. 9-32). New York: Falmer Press.

Humrich, E. (1988). Sex differences in the Second IEA Science Study: U.S. results in an international context. Paper presented at the annual meeting of the National Association for Research in Science Teaching.

Husén, T. (1969). International impact of evaluation. Educational evaluation: New roles, new means. In R.W. Tyler (Ed.), The sixty-eighth yearbook of the National Society for the Study of Education (Part 2). Chicago: University of Chicago Press.

IEA (1988). Science achievement in seventeen countries: A preliminary report. London: Pergamon.

Keeves, J.P. (1973). Differences between the sexes in mathematics and science courses. International Review of Education, 19(1), 47-63.

Keeves, J.P. (1992). Learning science in a changing world: Cross-national studies of science achievement: 1970 to 1984. The Hague, The Netherlands: The International Association for the Evaluation of Educational Achievement (IEA).

Kelly, A. (1978). Girls and science: An international study of sex differences in school science achievement. (International Association for the Evaluation of Educational Achievement IEA Monograph Studies No. 9.) Stockholm: Almqvist and Wiksell.

Keys, W. (1986). A comparison of A-level science students in schools, sixth-form colleges and colleges of further education. Educational Research, 28(3), 190-201.

Kish, L. (1965). Survey sampling. New York: Wiley.

Lee, V.E. (1986). Multi-level causal models for social class and achievement. Paper presented at the annual meeting of the American Educational Research Association, San Francisco.


Lee, V.E., & Bryk, A.S. (1989). A multilevel model of the social distribution of high school achievement. Sociology of Education, 62(2), 246.

Postlethwaite, T.N., & Wiley, D.E. (1992). The IEA study of science. II: Science achievement in twenty-three countries. Oxford: Pergamon.

Raudenbush, S.W. (1988). Educational applications of hierarchical linear models: A review. Journal of Educational Statistics, 13(2), 85-116.

Raudenbush, S.W., & Bryk, A.S. (1986, January). A hierarchical model for studying school effects. Sociology of Education, 59, 1-17.

Rosier, M.J. (1987). The Second International Science Study. Comparative Education Review, 31(1), 106-128.

Rosier, M.J. (1988). Results from the Second International Science Study: Some sex differences for Australian 14-year-old students. Research in Science Education, 18, 205-210.

Rosier, M.J., & Banks, D.K. (1990). The scientific literacy of Australian students: Science achievement of students in Australian primary and lower secondary schools. (ACER Research Monograph Number 39). Hawthorn, Victoria: Australian Council for Educational Research.

Rosier, M.J., & Keeves, J.P. (1991). The IEA study of science. I: Science education and curricula in twenty-three countries. Oxford: Pergamon.

Rosier, M.J., & Long, M.G. (1991). The science achievement of Year-12 students in Australia. Hawthorn, Victoria: Australian Council for Educational Research.

Ross, K.N. (1976). Searching for uncertainty: An empirical investigation of sampling errors in educational survey research (Occasional Paper No. 9). Hawthorn, Victoria: Australian Council for Educational Research.

Young, D.J. (1991a). Gender differences in science achievement: Secondary analysis of data from the Second International Science Study. Unpublished doctoral dissertation, Curtin University of Technology, Perth, Western Australia.

Young, D.J. (1991b). Multilevel analysis of sex and other factors influencing science achievement. Paper presented at the Gender and Science and Technology Sixth International Conference, University of Melbourne, Melbourne.

Young, D.J. (in press). Single-sex schools and physics achievement: Are girls really advantaged? International Journal of Science Education.

Young, D.J., & Fraser, B.J. (1992a, April). Sex differences in science achievement: A multilevel analysis. Paper presented at the annual meeting of the American Educational Research Association, San Francisco.

Young, D.J., & Fraser, B.J. (1992b, March). School effectiveness and science achievement: Are there any sex differences? Paper presented at the annual meeting of the National Association for Research in Science Teaching, Boston.

Young, D.J., & Fraser, B.J. (1993). Socioeconomic and gender effects on science achievement: An Australian perspective. School Effectiveness and School Improvement, 4, 265-289.

Manuscript accepted April 16, 1993.