Educational and Psychological Measurement http://epm.sagepub.com

Factor Structure of Scores From the Maslach Burnout Inventory: A Review and Meta-Analysis of 45 Exploratory and Confirmatory Factor-Analytic Studies Jody A. Worley, Matt Vassar, Denna L. Wheeler and Laura L. B. Barnes Educational and Psychological Measurement 2008; 68; 797 originally published online Mar 27, 2008; DOI: 10.1177/0013164408315268 The online version of this article can be found at: http://epm.sagepub.com/cgi/content/abstract/68/5/797

Published by: http://www.sagepublications.com


Downloaded from http://epm.sagepub.com at UNIV OF OKLAHOMA TULSA on October 28, 2008

Validity Studies

Factor Structure of Scores From the Maslach Burnout Inventory

Educational and Psychological Measurement Volume 68 Number 5 October 2008 797-823 © 2008 Sage Publications 10.1177/0013164408315268 http://epm.sagepub.com hosted at http://online.sagepub.com

A Review and Meta-Analysis of 45 Exploratory and Confirmatory Factor-Analytic Studies

Jody A. Worley
University of Oklahoma

Matt Vassar
Oklahoma State University Center for Health Sciences

Denna L. Wheeler
Connors State College

Laura L. B. Barnes
Oklahoma State University

This study provides a summary of 45 exploratory and confirmatory factor-analytic studies that examined the internal structure of scores obtained from the Maslach Burnout Inventory (MBI). It highlights characteristics of the studies that account for differences in reporting of the MBI factor structure. This approach includes an examination of the various sample characteristics, forms of the instrument, factor-analytic methods, and the reported factor structure across studies that have attempted to examine the dimensionality of the MBI. This study also investigates the dimensionality of MBI scale scores using meta-analysis. Both descriptive and empirical analysis supported a three-factor model. The pattern of reported dimensions across validation studies should enhance understanding of the structural dimensions that the MBI measures as well as provide a more meaningful interpretation of its test scores.

Keywords: burnout; factor structure; dimensionality; meta-analysis; construct validity

The Maslach Burnout Inventory (MBI; Maslach & Jackson, 1981), since its development in the early 1980s, has been the most widely used measure of occupational burnout in the empirical literature. Schaufeli and Enzmann (1998) estimated that as many as 90% of all studies examining occupational burnout have used the MBI. Despite its popularity, the internal structure of the MBI has been a source of considerable debate (cf. Schaufeli & Van Dierendonck, 1993). Although

Authors’ Note: Please address correspondence to Jody A. Worley, Department of Human Relations, University of Oklahoma, 4502 E 41st, Tulsa, OK 74135-2553; e-mail: [email protected].


the original three-dimensional structure of the MBI presented by Maslach and Jackson (1981) has been replicated (e.g., Belcastro, Gold, & Hays, 1983), other researchers have presented evidence for two (Brookings, Bolton, Brown, & McEvoy, 1985; Green, Walkey, & Taylor, 1991; Holland, Michael, & Kim, 1994; Walkey & Green, 1992), four (Firth, McIntee, McKeown, & Britton, 1985; Golembiewski, Munzenrider, & Carter, 1983; Iwanicki & Schwab, 1981; Mor & Laliberte, 1984; Powers & Gose, 1986), and five (Densten, 2001) dimensions. In addition, Maslach and Jackson (1981) conceptualized the three MBI dimensions as independent aspects of burnout. In studies where researchers have tested correlated factor models, however, the moderate to strong interfactor correlations provide strong evidence that the factors of the MBI are not independent (e.g., Byrne, 1991, 1994; Gold, Roth, & Wright, 1992).

This study provides a summary of 24 exploratory factor-analytic (EFA) studies, 13 confirmatory factor-analytic (CFA) studies, and 8 studies that used a combination of EFA and CFA techniques to examine the internal structure of the MBI scores. It investigates features of the studies that might account for differences in the reported factor structure of MBI scores and the relationship between factors. The purpose of the study is to address three related questions: (a) What is the factor structure of the MBI scores? (b) What is the nature of the relationship between the factors? and (c) How does the factor structure compare with Maslach and Jackson’s conceptualization of burnout? Particular attention is directed at the unique characteristics of the studies that may have influenced the reported factor structure. This approach considers the various sample characteristics, forms of the instrument, factor-analytic methods, and the reported factor structure across studies that have attempted to examine the dimensionality of MBI scale scores.

Development of the MBI

Maslach and Jackson (1981) initially used 47 items in the development of the MBI with a sample of 605 health and service workers. Following a principal-axis factor analysis (PAF) with varimax rotation, 25 items were retained comprising four factors. Scores from the reduced scale were then validated with a sample of 420 helping professionals. Similar results from the two samples led Maslach and Jackson to combine the samples for subsequent validity analysis. Burnout was therefore conceptualized by Maslach and Jackson (1981) as a syndrome experienced by workers in human service professions manifesting itself as Emotional Exhaustion (EE: 9 items), Personal Accomplishment (PA: 8 items), Depersonalization (DP: 5 items), and Involvement With People (a weak factor of 3 items with an eigenvalue less than 1.0). The last factor, Involvement With People, was later dismissed, leaving a three-factor solution composed of 22 items. The final version of the MBI was rated in terms of both frequency and intensity, with response categories ranging from 1 (a few times a year) to 6 (every day) for frequency and 1 (very mild, barely noticeable) to 7 (very strong, major) for intensity. Most research following development of the MBI has reported only frequency ratings. Item stems from the 22 MBI items are presented in Table 1.

Table 1
Maslach Burnout Inventory Item Stems and Frequency With Which Items Load on Expected Factors as Reported in Exploratory Factor-Analytic Studies

Column groups: Orthogonal Rotation (EE, PA, DP, DNL, OF) and Oblique Rotation (EE, PA, DP, DNL, OF).

Item | Loadings on expected factor, orthogonal rotation

Emotional Exhaustion (EE)
1. Emotionally drained | 16
2. Used up | 16
3. Fatigued in the morning | 16
6. Working with people | 13
8. Burned out | 16
13. Frustrated | 13
14. Working too hard | 14
16. Stress | 12
20. End of my rope | 14

Personal Accomplishment (PA)
4. I can understand | 11
7. I deal effectively | 12
9. I’m positively influencing | 13
12. Energetic | 13
17. Relaxed atmosphere | 14
18. I feel exhilarated | 14
19. Accomplished | 13
21. Deal with emotional problems | 12

Depersonalization (DP)
5. Impersonal objects | 11
10. Callous toward people | 12
11. Hardening me emotionally | 11
15. I don’t really care | 10
22. Recipients blame me | 8

Note: DNL = item failed to load; OF = item loaded on a unique other factor.

Of the three factors, emotional exhaustion was viewed as the primary manifestation of the burnout syndrome. Emotional exhaustion has been categorized as “feelings of being overextended and depleted of one’s emotional and physical resources” (Halbesleben & Demerouti, 2005, p. 208), thus reflecting the stress component underlying the construct. Moreover, EE is the most widely reported and thoroughly analyzed dimension, considered by some researchers (Shirom, 1989) as the only critical component to the burnout syndrome (Maslach, Schaufeli, & Leiter, 2001). Yet it is acknowledged that emotional exhaustion fails to account for the relationship that


employees have with their work, because emotional demands may reduce employee capacity to remain engaged with the needs of the recipients or clients (Maslach et al., 2001). Depersonalization (DP), the second dimension of burnout, takes into account this occurrence, reflecting the worker’s detached response to either the service recipients or aspects of the job itself. By placing distance between the worker and the service recipient, workers are able to view recipients as impersonal objects, making job demands more manageable (Maslach et al., 2001). The final dimension of burnout, personal accomplishment (PA), refers to workers feeling competent to perform the functions of the job or their sense of productivity at work. Maslach and Jackson (1981) stated, “It is important to note that the Personal Accomplishment subscale is independent of the other subscales . . . and cannot be assumed to be the opposite of Emotional Exhaustion and/or Depersonalization” (p. 104).

Although the 22 MBI items were developed for use with human service professionals, many practitioners saw a use for the burnout measure among teachers. Consequently, the initial set of 22 items became known as the MBI–Human Services Survey (MBI-HSS), and an MBI–Educators’ Survey (MBI-ES) was created by replacing the word recipient with student in the respective items (Maslach & Jackson, 1986). Both the MBI-HSS and the MBI-ES reflect dimensions (EE, DP, and PA) where the focus of burnout involves direct interaction with people. As such, some practitioners questioned the adequacy of the MBI to measure burnout in occupations outside human service sectors. Schaufeli, Leiter, Maslach, and Jackson (1996) developed the MBI–General Survey (MBI-GS) to address this perceived limitation. The three dimensions of the MBI-GS were conceptualized in slightly broader terms, taking into consideration the job itself rather than the personal relationships associated with the job (Maslach et al., 2001). The three dimensions of the 16-item MBI-GS were labeled Exhaustion (described previously); Cynicism, a distant attitude toward the job; and Reduced Professional Efficacy. These dimensions are cited as assessing the same dimensions as the original measures: exhaustion, depersonalization, and personal accomplishment, respectively (Maslach et al., 2001).

Score reliability has been assessed across the three versions of the MBI. Maslach and Jackson (1981) found internal consistency coefficients for the EE (α = .89), DP (α = .77), and PA (α = .74) subscale scores from a sample of human service personnel. Test score reliability estimates have also been demonstrated by the MBI-ES. Specifically, Maslach and Jackson (1986) reported reliability estimates for the EE (α = .90), DP (α = .79), and PA (α = .71) subscale scores in a sample of elementary school teachers. Others have also reported reliability estimates among teacher samples. The Depersonalization subscale scores tend to produce lower reliability estimates than the other subscales. Reliability of the DP subscale scores has been reported as between .50 and .80 in several empirical studies (cf. Aluja, Blanch, & Garcia, 2005; Beckstead, 2002; Gold, Bachelor, & Michael, 1989; Iwanicki & Schwab, 1981; Leiter & Maslach, 1988). In considering


the interpretation of reported reliability estimates, Nunnally (1978) recommended a minimum reliability estimate of .80 for basic research. Loo (2001) also suggested using .80 as a general cutoff rule for reliability estimates. A more comprehensive discussion on the interpretation of reliability estimates is provided by Henson (2001). Thus, the MBI appears to measure occupational burnout with internal consistency across samples from various organizational settings as well as across cultural contexts (cf. Lee & Ashforth, 1990). However, practitioners considering the use of the MBI are encouraged to investigate current literature on the interpretation of reported reliability estimates (Fan & Thompson, 2001; Henson, 2001; Loo, 2001).

The development of the various MBI versions (i.e., MBI-HSS, MBI-ES, MBI-GS) stimulated a host of studies examining dimensions of burnout in the workplace and the construct validity of the MBI. The most common of these approaches are EFA and CFA. With inconsistent results concerning the factor structure of MBI scores, we undertook a review and meta-analysis of these studies to identify and describe study features that may account for these differences. We intend for the current study to contribute to the understanding of the burnout landscape in general and the use of the MBI in particular. The pattern of dimensions across validation studies should lead to a greater understanding of the structural dimensions of the construct that the MBI measures as well as a more meaningful interpretation of its scores.
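The internal consistency coefficients discussed above are coefficient alpha values. As an illustration only, the following minimal sketch shows how such a coefficient could be computed from item-level scores and compared with the .80 cutoff mentioned in the text. The function names and the five-respondent data set are hypothetical and are not taken from any of the reviewed studies.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Coefficient alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(scores[0])
    # Transpose respondents x items into items x respondents.
    item_variances = [variance(item) for item in zip(*scores)]
    total_variance = variance([sum(person) for person in scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def meets_cutoff(alpha, cutoff=0.80):
    """Compare an estimate with the .80 guideline cited in the text."""
    return alpha >= cutoff


# Hypothetical responses of five people to five items (illustration only).
hypothetical_scores = [
    [0, 1, 0, 1, 0],
    [2, 2, 1, 3, 2],
    [3, 4, 3, 3, 4],
    [5, 4, 5, 6, 5],
    [6, 6, 5, 6, 6],
]

alpha = cronbach_alpha(hypothetical_scores)
print(f"alpha = {alpha:.2f}, meets .80 cutoff: {meets_cutoff(alpha)}")
```

Applied to the values reported by Maslach and Jackson (1981), `meets_cutoff(0.89)` would hold for EE, whereas `meets_cutoff(0.77)` and `meets_cutoff(0.74)` would not hold for DP and PA.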

Method

PsycINFO, ERIC, and Academic Search Elite were used to acquire validation studies of the MBI using the keywords Maslach Burnout Inventory and MBI in conjunction with validation and factor analysis. The reference lists of the retrieved articles were also checked against the initial findings to obtain additional articles. This search yielded 69 studies. The MBI form, items, and response formats varied considerably among the studies. As mentioned previously, the General Survey (GS) differs from the Human Services Survey (HSS) and the Educators’ Survey (ES). To provide some consistency across studies, articles that reported use of the 16-item GS were excluded. Furthermore, only studies that used primary data sufficient to determine that individual items were subjected to EFA, CFA, or a combination of the two were included. Forty-five studies, published from 1981 through 2006, were identified as meeting the inclusion criteria. All EFA articles (n = 32) meeting the criteria were catalogued according to the author’s name, publication year, sample characteristics, form of the instrument, factor-analytic methods, the reported factor structure of scores, and factor correlations (summarized in Table 2). The CFA articles (n = 21) were catalogued based on author’s name, publication year, sample characteristics, models tested, fit indices, and factor correlations (summarized in Table 3). The 8 studies


that analyzed data using both techniques are cross-listed on both summary tables. Furthermore, the results presented herein have been framed around the recommendations of Henson and Roberts (2006) concerning best practices of reporting EFA results.
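Because the 8 studies that used both techniques appear in both summary tables, the table counts (32 and 21) sum to more than the 45 studies reviewed. The short sketch below, with variable names chosen only for illustration, makes the arithmetic explicit using the counts stated in the text.

```python
# Study counts as stated in the text.
efa_only = 24      # studies that used EFA alone
cfa_only = 13      # studies that used CFA alone
both = 8           # studies that used EFA and CFA (cross-listed)
located = 69       # studies returned by the literature search

total_included = efa_only + cfa_only + both   # unique studies reviewed
efa_table_rows = efa_only + both              # studies summarized in Table 2
cfa_table_rows = cfa_only + both              # studies summarized in Table 3
excluded = located - total_included           # studies not meeting the criteria

print(total_included, efa_table_rows, cfa_table_rows, excluded)
```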

Results

Sample Characteristics: EFA Studies

Two studies (Byrne, 1991; Linehan, Cochran, Mar, Levensky, & Comtois, 2000) conducted separate analyses for distinct groups so that 35 distinct samples were analyzed. Sample sizes ranged from 30 to 5,730, with a median sample size of 234 and a mean sample size of 491. Most of the studies reported samples consisting of working professionals. More specifically, almost one half of the samples (n = 16) consisted of teachers, and the remaining studies used samples of social service workers (n = 7), students (n = 2), health care providers (n = 5), and business managers (n = 3). Three notable exceptions to the typical sample of working professionals were studies that administered the MBI to gifted and talented students 10 to 17 years old who participated in a summer enrichment program (Fimian, Fastenau, Tashner, & Cross, 1989), individuals with borderline personality disorders (Linehan et al., 2000), and mothers (Pelsma, Roland, Tollefson, & Wigington, 1989).

Fifteen studies reported the average participant age. Regardless of whether sample-size-weighted or -unweighted techniques were used, the mean participant age was 36 years. Gender distribution was reported in 27 samples. The studies analyzed reported a wide range of gender distributions from all male (n = 1) to all female (n = 4) with an average distribution of 64% female.

Although the majority of the studies were conducted with English-speaking U.S. samples (n = 18), additional English-speaking participants from Canada, Great Britain, New Zealand, and Australia were sampled. Several studies used translations, including Arabic (Abu-Hilal, 1995; Abu-Hilal & Salameh, 1992), Greek (Kantas & Vassilaki, 1997; Kokkinos, 2006), Japanese (Kitaoka-Higashiguchi et al., 2004), Swedish (Soderfeldt, Soderfeldt, Warg, & Ohlson, 1996), Dutch (Gorter, Albrecht, Hoogstraten, & Eijkman, 1999), Chinese (Tang, 1998), and Catalan (Aluja et al., 2005).
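The sample-size summary above can be checked against the N column of Table 2. The sketch below is an illustration, not the authors' procedure: it enters the 35 sample sizes listed in Table 2 and computes their range, median, and mean. It also includes a small helper for a sample-size-weighted mean of the kind mentioned for participant age; the age values passed to it are hypothetical, because per-study ages are not listed in the tables.

```python
from statistics import mean, median

# Sample sizes of the 35 EFA samples, transcribed from the N column of Table 2
# (studies with several samples contribute one entry per sample).
EFA_SAMPLE_SIZES = [
    162, 223, 631, 710, 135, 163, 162, 218, 480, 413, 311, 200,
    462, 147, 133, 296, 689, 244, 84, 150, 234, 469, 220, 237,
    771, 70, 30, 1281, 121, 750, 72, 171, 5730, 612, 387,
]


def weighted_mean(values, weights):
    """Sample-size-weighted mean: sum(w * x) / sum(w)."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


n_samples = len(EFA_SAMPLE_SIZES)
smallest, largest = min(EFA_SAMPLE_SIZES), max(EFA_SAMPLE_SIZES)
median_n = median(EFA_SAMPLE_SIZES)
mean_n = mean(EFA_SAMPLE_SIZES)
print(n_samples, smallest, largest, median_n, round(mean_n))

# Hypothetical mean ages for two samples of different size (illustration only).
hypothetical_ages = [30.0, 40.0]
hypothetical_ns = [100, 300]
print(mean(hypothetical_ages), weighted_mean(hypothetical_ages, hypothetical_ns))
```

The large gap between the median and the mean reflects the single sample of 5,730, which pulls the mean upward.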

Sample Characteristics: CFA Studies

Five CFA studies (Byrne, 1991, 1993, 1994; Evans & Fischer, 1993; Kalliath, O’Driscoll, Gillespie, & Bluedorn, 2000) reported results for 2 or more samples so that a total of 28 samples were analyzed. Sample size ranged from 113 to 1,590, with a mean of 491 participants. Participant age was reported for 23 samples with a mean of 41 years. One study (Richardsen & Martinussen, 2004) reported age as a range

Table 2
Author(s), Number of Items Analyzed, Number of Factors Retained, Form and Response Format, Sample Size, and Description for Exploratory Factor-Analytic Studies (n = 32)

Study | Items | Factors | Form^a | N | Sample Description
Abu-Hilal (1995) | 22 | 3 | T5a,b,c | 162 | Jordanian teachers
Abu-Hilal & Salameh (1992) | 22 | 3 | T5a,b,c | 223 | Emirati teachers
Aluja, Blanch, & Garcia (2005) | 22 | 3 | T4c | 631 | Catalan teachers
Belcastro, Gold, & Hays (1983) | 25^b | 3 | 1c | 710 | U.S. teachers
Brookings, Bolton, Brown, & McEvoy (1985) | 22 | 2 | 1c | 135 | U.S. human service professionals
Byrne (1991) | 22 | 3 | 2c | 163, 162, and 218 | Canadian teachers
Densten (2001) | 22 | 4 | 1b | 480 | Law enforcement officers
Fimian & Blanton (1987) | 22 | 3 | 2b | 413 | U.S. education students and 1st-year teachers
Fimian, Fastenau, Tashner, & Cross (1989) | 22 | 3 | 4b | 311 | U.S. gifted and talented students
Firth, McIntee, McKeown, & Britton (1985) | 22 | 4 | 4c | 200 | British nurses
Gold (1984) | 22 | 4 | 2b,c | 462 | U.S. elementary and junior high teachers
Gold, Bachelor, & Michael (1989) | 22 | 3 | 2c | 147 | U.S. beginning teachers
Gold, Roth, & Wright (1992) | 22 | 3 | 2c | 133 | U.S. elementary education students
Golembiewski, Munzenrider, & Carter (1983) | 25 | 4 | 4a | 296 | U.S. corporate employees
Gorter, Albrecht, Hoogstraten, & Eijkman (1999) | 20 | 3 | T4c | 689 | Dutch dentists
Green & Walkey (1988) | 22 | 3 | 1c | 244 | New Zealand nurses
Gupchup, Lively, Holiday-Goodman, Siganga, & Black (1994) | 22 | 3 | 1c | 84 | U.S. pharmacists
Holland, Michael, & Kim (1994) | 22 | 2 and 3 | 2c | 150 | U.S. middle school teachers
Huberty & Huebner (1998) | 22 | 3 | 1b,c | 234 | U.S. school psychologists
Iwanicki & Schwab (1981) | 22 | 4 | 2b,c | 469 | U.S. teachers
Kantas & Vassilaki (1997) | 22 | 3 | T2c | 220 | Greek teachers
Koeske & Koeske (1989) | 22 | 3 | 1c | 237 | U.S. social workers
Kokkinos (2006) | 22 | 3 | T2c | 771 | Greek Cypriot teachers
Linehan, Cochran, Mar, Levensky, & Comtois (2000) | 22 | 3 | 4c, 1c | 70 and 30 | U.S. borderline personality disorder clients and their therapists
Mor & Laliberte (1984) | 25 | 4 | 4c | 1,281 | U.S. hospice staff
Pelsma, Roland, Tollefson, & Wigington (1989) | 22 | 3 | 1b,c | 121 | U.S. mothers
Pierce & Molloy (1989) | 22 | 3 | 2b,c | 750 | Australian high school teachers
Powers & Gose (1986) | 22 | 3 | 5b,c | 72 | U.S. college students
Scherer, Cox, Key, Stickney, & Spangler (1992) | 16 | 3 | 4a,b | 171 | U.S. business professionals
Soderfeldt, Soderfeldt, Warg, & Ohlson (1996) | 22 | 3 | T4c | 5,730 | Swedish social security employees
Tang (1998) | 22 | 3 | T1 | 612 | Chinese human service professionals
Whitehead, Ryba, & O’Driscoll (2000) | 22 | 3 | 2c | 387 | New Zealand teachers and principals

a. 1 = Human Services Survey (HSS); 2 = Educators’ Survey (ES); 4 = modified form; 5 = unspecified; T1 = translation of HSS; T2 = translation of ES; T4 = translation of modified form; T5 = translation of unspecified form. Lowercase letters following the form code indicate, respectively, that (a) the study used an altered response format, (b) the study used intensity ratings, and (c) the study used frequency ratings.
b. Twenty-five items were used and produced a four-factor solution. However, reported use of the 22 items also produced the expected three-factor solution.

Table 3
Author(s), Best Fitting Model Reported, Sample Size, and Description for Confirmatory Factor-Analytic Studies (n = 21)

Study | Best Fitting Model | χ²/df | N | Sample Description
Aluja, Blanch, & Garcia (2005) | 3 correlated factors, with Items 1, 7, 12, 14, and 16 deleted | 3.056 | 631 | Catalonian teachers
Beckstead (2002) | 3 correlated factors, allowing Items 12 and 16 to cross-load with 23 residual covariances | 1.071 | 151 | U.S. nurses
Boles, Dean, Ricks, Short, & Wang (2000) | 3 correlated factors, with Items 2, 12, and 16 deleted | 2.065 | 182 | U.S. teachers and administrators
Boles et al. (2000) | 3 correlated factors, with Items 2, 12, and 16 deleted | 1.851 | 157 | U.S. small business owners
Byrne (1991) | 3 correlated factors, with Items 2, 12, 16, and 20 deleted and residual covariance between 10 and 11 | 2.076, 1.645, and 1.706 | 163, 162, and 218 | Canadian teachers
Byrne (1993) | 3 correlated factors, with Items 12 and 16 deleted and residual covariance between 1 and 2 and 10 and 11 | 2.878, 2.127, and 3.798 | 1,159, 396, and 1,394 | Canadian teachers
Byrne (1994) | 3 correlated factors, Items 11 and 12 cross-loaded on EE, residual covariance between 1 and 2, 10 and 11, and 6 and 16 | 2.181 and 1.742 | 1,544 and 1,381 | Canadian teachers
Cordes, Dougherty, & Blum (1997) | 3 correlated factors | 3.510 | 354 | Human resource employees
Densten (2001) | 5 correlated factors, with both EE and PA splitting into two factors | 1.815 | 480 | Australian law enforcement managers
Evans & Fischer (1993) | 3 correlated factors | 3.804 and 1.398 | 228 and 361 | Canadian private sector employees and teachers
Gold, Bachelor, & Michael (1989) | 3 correlated factors | 2.346 | 147 | U.S. college students
Gold, Roth, & Wright (1992) | 3 correlated factors | 1.925 | 133 | U.S. teachers
Gorter, Albrecht, Hoogstraten, & Eijkman (1999) | 3 correlated factors, modified to allow Item 2 to load on DP and Item 6 to load on PA | 4.539 | 689 | Dutch dentists
Holland, Michael, & Kim (1994) | 3 correlated factors | 2.213 | 150 | U.S. teachers
Kalliath, O’Driscoll, Gillespie, & Bluedorn (2000) | 2 correlated factors, using only Items 1, 3, 13, 14, and 20 from EE and 10 and 11 from DP | 1.020 | 197 | U.S. nurses
Kokkinos (2006) | 3 correlated factors | 4.751 | 771 | Cypriot teachers
Lee & Ashforth (1990) | 3 correlated factors | 1.663 | 219 | U.S. welfare agency managers
Richardsen & Martinussen (2004) | 3 correlated factors, with Items 12 and 16 deleted | 7.605 | 1,590 | Norwegian social service workers, nurses, and teachers
Schaufeli, Bakker, Hoogduin, Schaap, & Kladler (2001) | 3-factor model, with Items 12 and 16 deleted and the inclusion of 12 residual covariances | 2.843 and 2.421 | 139 | Dutch outpatients
Schaufeli, Daamen, & Van Mierlo (1994) | 3 factors, with Items 12, 13, 16, and 18 deleted | 1.429 | 326 | Dutch teachers
Schaufeli & Van Dierendonck (1993) | 3 correlated factors | 2.883 | 334 and 333 | Dutch nurses

Note: EE = Emotional Exhaustion; PA = Personal Accomplishment; DP = Depersonalization.

from 19 to 70 years. Gender distribution was reported for 27 samples. The proportion of female participants ranged from 1.9% to 94%, with an average of 58%. The majority of studies used samples of teachers (n = 14). The remaining samples included health care workers (n = 5) consisting primarily of nurses (Beckstead, 2002; Kalliath et al., 2000; Schaufeli & Van Dierendonck, 1993) but also included dentists (Gorter et al., 1999) and lab technicians (Kalliath et al., 2000). Additional study participants included social service workers (n = 3), business employees, owners, and managers (n = 4), students (n = 1), and outpatients (n = 1). Twenty samples used the original English version of the MBI. The additional 8 samples used various translations, including Dutch (n = 5), Catalan (n = 1), Greek (n = 1), and Norwegian (n = 1). Nine samples were associated with studies conducted in the United States and 10 were associated with Canadian studies. The remaining studies were conducted in Australia (n = 1), Catalonia (n = 1), Norway (n = 1), Cyprus (n = 1), and the Netherlands (n = 5). There appear to be no significant differences between the samples used in EFA studies and those in CFA studies. The participants in the CFA studies were slightly older, with a larger proportion of male participants reported than participants in the EFA studies. Other characteristics including professions sampled and MBI translations used are very similar. It appears that sample characteristics are not related to differences in factor structure of scores or relationships between factors reported in EFA and CFA studies.

EFA Items, Factor Extraction, and Rotation

Nine studies used forms with items modified from the original. For example, Linehan et al. (2000) asked individuals who were diagnosed with borderline personality disorder to rate their attitudes toward their therapists as an indicator of therapy burnout. Three studies did not identify the specific form. Most studies (n = 20) analyzed item frequency ratings, three analyzed intensity ratings, and nine analyzed both rating types. Lee and Ashforth (1996) suggested that the frequency and intensity rating formats are “largely redundant and only one is necessary” (p. 126). Although there is no apparent relationship between the various forms (see Table 2) and the reported factor structure of scores in this sample of EFA validity studies, it is instructive to note these differences between measures sharing the name Maslach Burnout Inventory.

The publications listed in Table 2 consist of several combinations of extraction and rotation methods. In some cases multiple extraction and/or rotation methods were used on the same sample in a single publication. For example, two of the publications listed in Table 2 used both principal components analysis (PCA) and principal-axis factor analysis (PAF; sometimes referred to as common factor analysis) as the extraction method. On the basis of Monte Carlo studies, there is evidence that PCA and PAF can produce similar outcomes (e.g., Velicer, Peacock, & Jackson, 1982; Velicer


& Jackson, 1990). However, PCA and PAF are more likely to generate different outcomes when the ratio of the number of measured variables to the number of factors is low (e.g., three variables per factor) and when the communalities in PAF are also low (e.g., .40; Widaman, 1993). A rank biserial correlation coefficient was calculated to determine the relation between extraction methods and the number of reported factors (or components). The resulting coefficient indicated that the use of PCA or PAF as the extraction method is not correlated with the number of factors or components reported (rbc = .006).

Although the criteria for determining the number of factors to retain varied across studies (e.g., eigenvalues greater than 1.0, structure coefficients greater than .40, analysis of scree plot, or some combination), those differences do not appear to have influenced the number of factors reported. Criteria for extraction were not reported in one publication that reported a two-factor solution (Holland et al., 1994). Brookings et al. (1985) extracted two factors based on the Cattell scree plot and eigenvalues greater than 1.0. In publications that reported a four-factor solution, two studies did not report criteria for retaining factors (Iwanicki & Schwab, 1981; Mor & Laliberte, 1984). Two studies retained factors when eigenvalues were greater than 1.0 (Gold, 1984; Golembiewski et al., 1983). Gold (1984) reported a four-factor solution despite not having met criteria for retention (eigenvalues greater than 1.0). Firth et al. (1985) was the only study in this sample that reported a four-factor solution using multiple criteria for determining the number of factors for extraction. It is informative to note that no studies used Horn’s parallel analysis for factor retention decisions (see Henson & Roberts, 2006, for a detailed discussion on this issue).

The majority of the studies (n = 26) used an orthogonal rotation with the retained factors. Only 19 studies used an oblique rotation method. Three studies did not report the type of rotation method used to improve interpretation of the retained factors. Four studies reported the use of both oblique and orthogonal rotation methods (Aluja et al., 2005; Gold et al., 1989; Holland et al., 1994; Koeske & Koeske, 1989). These authors reported that the pattern of factor loadings did not differ between oblique and orthogonal rotation methods, and all four chose to report the orthogonal factor solution without regard to the magnitude of factor correlations produced using oblique rotation techniques.
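The rank biserial coefficient mentioned above relates a dichotomy (here, PCA versus PAF extraction) to an ordered variable (the number of factors reported). One common way to compute it is the pairwise form sketched below, which is equivalent to the mean-rank formula with tied ranks. This is a hedged illustration rather than the authors' computation: the function name is an assumption, and the factor counts are hypothetical values, not the data behind the reported coefficient.

```python
def rank_biserial(group_a, group_b):
    """Rank biserial correlation between group membership and an ordered score.

    Every value in group_a is compared with every value in group_b.
    The coefficient is (pairs where a > b minus pairs where a < b)
    divided by the total number of pairs; ties contribute zero.
    """
    favorable = sum(1 for a in group_a for b in group_b if a > b)
    unfavorable = sum(1 for a in group_a for b in group_b if a < b)
    return (favorable - unfavorable) / (len(group_a) * len(group_b))


# Hypothetical numbers of factors reported by studies using each extraction
# method (illustration only).
factors_with_pca = [3, 3, 4, 3, 2, 3]
factors_with_paf = [3, 4, 3, 3, 3]

r_rb = rank_biserial(factors_with_pca, factors_with_paf)
print(f"rank biserial r = {r_rb:.3f}")
```

A value near zero, such as the .006 reported in the text, indicates that neither extraction method tends to be associated with more factors than the other.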

Frequency With Which Individual Items Load on the Expected Factor in EFA Studies

Maslach and Jackson (1981) provided the standard for comparison with a 9-item subscale of Emotional Exhaustion (EE), a 5-item subscale of Depersonalization (DP), and an 8-item subscale of Personal Accomplishment (PA). The test manual suggests omitting Items 12 (PA) and 16 (EE) from factor analyses because of a consistent failure to load on the expected factor. The frequency with which each of


the 22 items loads on a given factor as reported across studies that have reported results from an EFA are presented in Table 1. Emotional Exhaustion. Seven studies (Green & Walkey, 1998; Gupchup, Lively, Holiday-Goodman, Siganga, & Black, 1994; Iwanicki & Schwab, 1981; Kantas & Vassilaki, 1997; Koeske & Koeske, 1989; Pierce & Molloy, 1989; Soderfeldt et al., 1996) reported a factor identical to Maslach and Jackson’s EE factor. In several studies, the EE items split across subscales. For example, in Densten (2001), MBIEE Items 6, 16, and 20 loaded together to form an independent factor referred to as psychological strain, with the remaining EE items referred to as somatic strain. Similarly, Firth et al. (1985) interpreted the EE items as two factors labeled emotional draining (Items 1, 2, 3, 6, and 8) and frustration and discouragement (Items 3, 8, 13, and 20). The division of the EE subscale was reflected in several studies where Items 6 and 16 were commonly problematic in that those items did not load on the EE subscale (Abu-Hilal, 1995; Abu-Hilal & Salameh, 1992; Belcastro et al., 1983; Fimian & Blanton, 1987; Gold et al., 1992; Whitehead, Ryba, & O’Driscoll, 2000). It is noteworthy that Items 6 and 16 were problematic in several other studies in which the EE subscale was not consistent with Maslach and Jackson (1981; e.g., Densten, 2001; Firth et al., 1985; Golembiewski et al., 1983; Gorter et al., 1999; Pelsma et al., 1989). In studies that have included Item 16, and where this item does not load as expected according to Maslach and Jackson (1981), it loads most frequently with DP items. Depersonalization. Nine of the EFA studies also reported a factor identical to Maslach and Jackson’s DP factor (Densten, 2001; Fimian & Blanton, 1987; Green & Walkey, 1988; Gupchup et al., 1994; Kantas & Vassilaki, 1997; Kokkinos, 2006; Linehan et al., 2000; Soderfeldt et al., 1996; Whitehead et al., 2000). 
The remaining EFA studies failed to replicate the original DP structure, nor were any of the other proposed structures replicated with any consistency. For example, Iwanicki and Schwab (1981) suggested that the DP subscale splits into two factors that reflect depersonalization as affected by job-related concerns (Items 10, 11, and 15) and student-related concerns (Items 5, 15, and 22) in a sample of educators. Although this division of the DP subscale is not replicated in any other study, Powers and Gose (1986) attempted to replicate Iwanicki and Schwab (1981) using a college student sample. They reported that the DP factor consisted only of Items 5, 10, and 11. Firth et al. (1985) reported that Items 10, 11, and 22 loaded on a factor labeled hardening. However, Item 11 loaded on EE in Fimian et al. (1989) and Aluja et al. (2005). Items 5 and 15 did not load on any factor in Firth et al. (1985) or in Abu-Hilal (1992). Item 22 was also problematic in several studies in that it either did not load on any factor (Abu-Hilal, 1995; Abu-Hilal & Salameh, 1992; Golembiewski et al., 1983; Koeske & Koeske, 1989; Mor & Laliberte, 1984; Pierce & Molloy, 1989) or loaded on EE (Gold et al., 1992; Pelsma et al., 1989). Item 5 was the only DP item that did not load across other burnout dimensions in Gold et al. (1992).

Personal Accomplishment. The PA factor as reported in Maslach and Jackson (1981) was replicated in 10 studies (Abu-Hilal, 1995; Aluja et al., 2005; Belcastro et al., 1983; Green & Walkey, 1988; Iwanicki & Schwab, 1981; Kantas & Vassilaki, 1997; Koeske & Koeske, 1989; Pierce & Molloy, 1989; Soderfeldt et al., 1996; Whitehead et al., 2000). A similar PA factor with slight discrepancy from Maslach and Jackson (1981) was also reported by several studies where Item 12 did not load on any factor (Abu-Hilal & Salameh, 1992; Fimian & Blanton, 1987; Linehan et al., 2000) or cross-loaded on one or more factors (Fimian et al., 1989; Mor & Laliberte, 1984). The PA factor reported by Kokkinos (2006) was also similar to that reported by Maslach and Jackson (1981), with the exception that Item 14 (EE) cross-loaded with other PA items. Densten (2001) pointed out that the PA subscale consisted of two factors; the first contains items that reference self-accomplishment (Items 9, 18, and 19) and the second is formed by items associated with working with others (Items 7, 17, and 21).

Factor Structure of the MBI: Exploratory and Confirmatory Studies

Exploratory factor-analytic studies. Most studies included in this sample tended to replicate the original structure reported by Maslach and Jackson (1981). Some did not, but it is informative that all but one of the eight studies included Item 12 (I feel very energetic) and Item 16 (Working with people directly puts too much stress on me) in the factor analyses, thus ignoring recommendations in the test manual to remove those items when investigating the measurement structure. Densten (2001) removed Items 12, 13 (I feel frustrated by my job), and 14 (I feel I’m working too hard on my job), reporting that doing so improved interpretability of the final four-factor solution.

The six studies presented in Table 2 that reported a four-factor solution varied in terms of factor structure. Two studies (Golembiewski et al., 1983; Mor & Laliberte, 1984) used a 25-item form of the MBI that included the 3 personal involvement items. Interestingly, one study (Mor & Laliberte, 1984) replicated Maslach and Jackson’s original four-factor structure, whereas the other (Golembiewski et al., 1983) produced a distinct four-factor solution. Specifically, the first factor consisted of 6 items from the EE scale; the second factor consisted of 3 EE items and 3 PA items; the third factor was defined by 5 PA items, along with 2 of the 3 personal involvement (PI) items; and the fourth factor was formed from the 4 DP items as well as the 3rd PI item. Both the EE and PA items formed two factors, with the second factor consisting of items from both scales. The remaining studies that reported a four-factor solution (Densten, 2001; Firth et al., 1985; Gold, 1984; Iwanicki & Schwab, 1981) used the 22-item form of the MBI. Four-factor solutions were generally formed when one of the original three scales was interpreted as two factors (Firth et al., 1985; Iwanicki & Schwab, 1981). There was no real consistency as to which of the three scales divided to form two factors. Densten (2001) and Gold (1984) interpreted the PA scale as two factors. The items regarding self loaded separately from the items regarding recipients in both studies. Firth et al. (1985) interpreted two factors from the EE scale, frustration about work and emotional draining, and Iwanicki and Schwab (1981) interpreted two factors from the DP scale labeled job-related and student-related factors.

More consistency was found with studies reporting two-factor solutions. Two-factor solutions were reported when EE and DP items were interpreted together as a single factor (sometimes referred to as a core of burnout; Walkey & Green, 1992) with PA reported as the second factor (Brookings et al., 1985; Holland et al., 1994). In addition, two studies that reanalyzed previously published MBI data (Green et al., 1991; Walkey & Green, 1992) found support for this same two-factor structure in nine samples, lending support to the concept of EE and DP items forming the core of burnout. This model was tested in a number of CFA studies and the results will be discussed in the following section.

The criteria for determining the number of factors to retain were not always clearly reported and may have contributed to some deviations in interpreting factor structure. This evidence suggests that the interpretation of retained factors is the greatest contributor to the observed discrepancies in the reported factor structure. Perhaps a closer examination of structure coefficients across studies would be informative in terms of interpreting the factor structure of MBI scores.

Confirmatory factor-analytic studies. Twenty-one studies that met previously described criteria explored the structure of the MBI scores using CFA procedures (see Table 3). Several studies analyzed more than one sample so that 29 distinct samples were analyzed. The majority of studies (n = 11) tested a sequence of nested models that included null-, one-, two-, and three-factor models (Cordes, Dougherty, & Blum, 1997; Evans & Fischer, 1993; Gold et al., 1989; Gold et al., 1992; Gorter et al., 1999; Holland et al., 1994; Kokkinos, 2006; Lee & Ashforth, 1990; Schaufeli, Bakker, Hoogduin, Schaap, & Kladler, 2001; Schaufeli, Daamen, & Van Mierlo, 1994; Schaufeli & Van Dierendonck, 1993). Several authors were specifically testing the core of burnout model proposed by Walkey and Green (1992). The remaining studies generally tested various three-factor models, improving fit by allowing the factors to correlate, deleting items, and/or allowing for residual covariance. Only one study (Densten, 2001) analyzed a nested sequence of models that included null-, one-, two-, three-, four-, and five-factor models. In light of the modest support found for various four-factor models in the EFA studies, it was somewhat surprising that only one researcher empirically tested a four-factor model using CFA procedures.

The CFA studies produced a more consistent picture of the factor structure of the MBI than the EFA studies. Eighteen of the 21 studies reported a correlated three-factor model as providing the best fit to the data. The two exceptions included a two-factor model that was achieved by deleting all of the items associated with the PA scale leaving EE and DP as the two distinct factors (Kalliath et al., 2000), and a five-factor model where both the EE and PA scales formed two distinct factors (Densten, 2001).

A number of studies documented improved fit by allowing items to load on more than one factor and/or deleting poorly performing items. Fit was improved in seven studies (see Table 3) by deleting Items 12 and 16 as recommended by Maslach and Jackson (1986). Three studies deleted Item 2 and one study allowed this item to cross-load on the DP factor. Additional deletions that resulted in improved fit in individual studies included Items 1, 13, 14, and 20 from the EE scale and Items 7 and 18 from the PA scale. Additional improvements to model fit were accomplished by allowing for residual covariance between items. The most common documented residual covariance was between Items 10 and 11, two similarly worded items from the DP scale. Additional residual covariance pairs include Items 1 and 2 and Items 6 and 16, which are similarly worded item pairs from the EE scale.

Factor Correlations

There appears to be strong support for a three-factor model of burnout as conceptualized by Maslach and Jackson (1981). There is, however, less consistency with regard to the nature of the relationship between factors. The three theoretical components of burnout, EE, PA, and DP, were initially described as independent, uncorrelated factors (Maslach & Jackson, 1981). Most of the EFA studies selected orthogonal rotation for interpretation of factors following Maslach and Jackson. The studies that have used oblique factor rotation have produced varied results (see Table 4). As previously mentioned, a few studies conducted both orthogonal and oblique rotation and generally reported that the results were similar, choosing to report orthogonal results. In CFA studies, researchers consistently reported improved model fit by specifying correlated rather than orthogonal factors. Specifically, six EFA studies reported factor correlations between EE and DP factors ranging from .40 to .76 with a mean correlation of .56. Twelve CFA studies reported factor correlations between EE and DP factors ranging from .08 to .73, with a mean absolute correlation of .57. It is worth noting that the r value of .08 between EE and DP in one study was an extreme value. When this correlation was removed, the range was similar to that observed in the EFA studies (.43 to .73), and the mean correlation increased to .60. The reported correlations between EE and PA factors ranged from –.65 to .56 in the EFA studies and –.49 to .10 in the CFA studies, with mean absolute correlations of .30 and .28, respectively. Correlations between DP and PA factors ranged from –.62 to .10 in CFA studies and –.74 to .55 in EFA studies with mean absolute correlations of .35 and .38, respectively.
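The EE to DP summary figures quoted above can be checked against the coefficients listed in Table 4. The following is a minimal sketch, assuming the reported means are simple unweighted averages of the tabled values (the text does not state whether any weighting or transformation was applied); the helper name `mean_abs` is hypothetical.

```python
# EE-to-DP factor correlations as listed in Table 4.
cfa_ee_dp = [.59, .65, .73, .65, .43, .55, .55, .49, .67,
             .69, .65, .08, .56, .61, .64, .66, .46]
efa_ee_dp = [.42, .60, .40, .47, .76, .60, .75, .48]


def mean_abs(values):
    """Unweighted mean of absolute correlations (an assumed summary rule)."""
    return sum(abs(v) for v in values) / len(values)


cfa_mean = mean_abs(cfa_ee_dp)
# Drop the extreme value of .08 and recompute.
cfa_trimmed = mean_abs([v for v in cfa_ee_dp if v != .08])
efa_mean = mean_abs(efa_ee_dp)

print(round(cfa_mean, 2), round(cfa_trimmed, 2), round(efa_mean, 2))
```

Under that assumption the three rounded means agree with the values given in the text (.57, .60 after removing the extreme value, and .56).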

Table 4
Reported Factor Correlations in Confirmatory and Exploratory Factor-Analytic Studies

Study                                               EE to DP  EE to PA  DP to PA  Extraction Method

Confirmatory factor-analytic studies
Aluja, Blanch, & Garcia (2005)                        .59      −.49      −.39
Beckstead (2002)                                      .65      −.16      −.25
Byrne (1991): Intermediate                            .73      −.17      −.10
Byrne (1991): Secondary                               .65      −.34      −.44
Byrne (1991): University                              .43       .09       .26
Byrne (1993): Elementary                              .55      −.41      −.45
Byrne (1993): Intermediate                            .55      −.36      −.44
Byrne (1993): Secondary                               .49      −.35      −.38
Byrne (1994): Female                                  .67      −.42      −.62
Byrne (1994): Male                                    .69      −.31      −.45
Evans & Fischer (1993)                                .65      −.23      −.12
Gold, Bachelor, & Michael (1989)                      .08       .10       .10
Gold, Roth, & Wright (1992)                           .56      −.48      −.54
Gorter, Albrecht, Hoogstraten, & Eijkman (1999)       .61      −.22      −.38
Kalliath, O’Driscoll, Gillespie, & Bluedorn (2000)    .64
Lee & Ashforth (1990)                                 .66      −.05      −.20
Richardson & Martinussen (2004)                       .46

Exploratory factor-analytic studies
Aluja, Blanch, & Garcia (2005)                        .42      −.33      −.37     PCA
Fimian et al. (1989)                                  .60       .21       .16     PCA
Gold, Bachelor, & Michael (1989)                      .40       .05       .17     PCA
Gold, Bachelor, & Michael (1989)                      .47       .06       .20     PAF
Gold, Roth, & Wright (1992)                           .76      −.65      −.74     PAF
Koeske & Koeske (1989): Studies 1−4                   .60       .56       .55     PAF
Koeske & Koeske (1989): Study 5                       .75       .35       .44     PAF
Whitehead (2000)                                      .48      −.18      −.37     MLE

Note: EE = Emotional Exhaustion; DP = Depersonalization; PA = Personal Accomplishment; PCA = principal components analysis; PAF = principal-axis factor; MLE = maximum likelihood estimation. Densten (2001) reported factor correlations for a five-factor solution.

Because measurement error is treated differently across factor extraction methods in EFA (i.e., ignored in PCA and removed in PAF), a reasonable next step was to determine if the variability in factor correlations reported in the EFA studies could be explained by the factor extraction method that was used. Unfortunately, only eight studies reported the factor correlations. This lack of reporting factor correlations in EFA studies affirms reporting practices cited by Henson and Roberts (2006) and many other authors who have argued for more detailed factor-analytic information in published research (cf. Comrey, 1978; Gorsuch, 1983; Henson, Capraro, & Capraro, 2004). The available factor correlations are presented in the lower portion of Table 4, with the extraction method used. The range of factor correlations across factors does not appear to vary considerably across extraction methods. The factor correlations between EE and DP ranged from .40 to .60 when PCA was used, and from .47 to .76 when PAF was the extraction method. The factor correlations between EE and PA ranged from –.33 to .21 when PCA was used and from –.65 to .56 when PAF was the extraction method. Finally, the factor correlations between DP and PA ranged from –.37 to .17, and –.74 to .55 when PCA and PAF were used, respectively.
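The ranges by extraction method reported above can be recomputed from the exploratory rows of Table 4. The sketch below is illustrative only: the tuple layout and the names `efa_rows` and `ranges_by_method` are assumptions introduced here, and the values are copied from the table in row order.

```python
from collections import defaultdict

# (extraction method, EE-DP, EE-PA, DP-PA) for the EFA rows of Table 4.
efa_rows = [
    ("PCA", .42, -.33, -.37),
    ("PCA", .60, .21, .16),
    ("PCA", .40, .05, .17),
    ("PAF", .47, .06, .20),
    ("PAF", .76, -.65, -.74),
    ("PAF", .60, .56, .55),
    ("PAF", .75, .35, .44),
    ("MLE", .48, -.18, -.37),
]
pairs = ("EE-DP", "EE-PA", "DP-PA")


def ranges_by_method(rows):
    """Return {method: {factor pair: (min, max)}} for the tabled correlations."""
    grouped = defaultdict(lambda: defaultdict(list))
    for method, *values in rows:
        for pair, value in zip(pairs, values):
            grouped[method][pair].append(value)
    return {
        method: {pair: (min(vals), max(vals)) for pair, vals in by_pair.items()}
        for method, by_pair in grouped.items()
    }


ranges = ranges_by_method(efa_rows)
for method in ("PCA", "PAF"):
    print(method, ranges[method])
```

Only one study used MLE, so no range is meaningful for that method; the PCA and PAF ranges match those given in the text.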

Meta-Analysis of MBI Factor Structure Studies

Following the descriptive inquiry, a meta-analysis of MBI factor structure studies was undertaken to provide empirical evidence regarding the dimensionality of its scale scores. A number of meta-analytic methods have been proposed in the literature. Kaiser, Hunka, and Bianchini (1969) developed a pairwise comparison method to assess structural dimensions within a meta-analytic framework capable of comparing the factor solutions of two samples. The major limitation with this approach is that studies may only be investigated in a pairwise fashion such that a single result is not produced to summarize a set of research studies collectively.

Becker (1996) developed a method to analyze structural dimensions meta-analytically using aggregated correlation matrices from previous studies. After all studies reporting correlation matrices are found, raw correlations across studies are weighted and averaged so that sampling error is reduced. The averaged correlations are corrected for attenuation using weighted and averaged reliability coefficients. Next, a single factor analysis is conducted on the aggregate correlation matrix. Similarly, Beretvas and Furlow (2006) proposed a method of meta-analysis by synthesizing covariance matrices across studies within a structural equation modeling context.

Although Becker (1996) and Beretvas and Furlow (2006) provide insightful approaches to the handling of meta-analytic factor analysis, the majority of structure studies of the MBI scale scores failed to provide the inter-item correlation or covariance matrices prerequisite for conducting such analyses. Consequently, a method initially designed by Loeber and Schmaling (1985) and modified by Shafer (2005) was used in the present study.

Shafer’s technique involves extracting the factor loadings across factor-analytic studies to derive an item-by-factor matrix for each study. These matrices are then dichotomized according to salient factor loadings. Co-occurrence matrices are calculated for each study, indicating the number of times each pair of test items had their highest factor loadings on the same factor. From these co-occurrence matrices, a raw similarity matrix is formed by summing the elements of the individual co-occurrence matrices to create one aggregate matrix. The raw similarity matrix is then proportionalized to make it suitable for factor-analytic procedures. Principal components analysis is then performed on the proportionalized similarity matrix to assess structural dimensionality across studies.

Shafer’s method departs from Loeber and Schmaling’s in two ways. First, whereas Loeber and Schmaling used all salient factor loadings in the computation of the raw co-occurrence matrix, Shafer used only the highest loading. In addition, Shafer’s method departs from Loeber and Schmaling’s in that principal components analysis was used instead of multidimensional scaling. The method proposed by Shafer is attractive, because only factor loadings, which are readily available from most structure studies, are needed to run the analysis. Furthermore, a single result may be obtained to characterize dimensionality across studies. Owing to the strengths of this method, it was employed in the present study.
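The co-occurrence steps described above can be sketched on toy data. This is a minimal illustration, not the authors' implementation: the two small loading matrices are invented, the function name `cooccurrence` is hypothetical, and three details are assumptions made here (the highest loading is taken in absolute value, an item is counted only if that loading reaches a salience cutoff of .40, and the summed matrix is proportionalized by dividing by the number of matrices).

```python
import numpy as np


def cooccurrence(loadings, salient=0.40):
    """Item-by-item co-occurrence matrix for one study.

    loadings: (n_items, n_factors) array of factor loadings.
    Each item is assigned to the factor with its highest absolute loading,
    provided that loading is at least `salient` (assumed rule).
    """
    loadings = np.abs(np.asarray(loadings, dtype=float))
    n_items, n_factors = loadings.shape
    indicator = np.zeros((n_items, n_factors), dtype=int)
    rows = np.arange(n_items)
    top_factor = loadings.argmax(axis=1)
    keep = loadings[rows, top_factor] >= salient
    indicator[rows[keep], top_factor[keep]] = 1
    # Entry (i, j) is 1 when items i and j share their primary factor.
    return indicator @ indicator.T


# Hypothetical loadings for four items on two factors in two studies.
study_a = [[.80, .10], [.70, .20], [.15, .75], [.30, .35]]
study_b = [[.75, .05], [.10, .65], [.20, .70], [.05, .60]]
studies = [study_a, study_b]

raw_similarity = sum(cooccurrence(s) for s in studies)
proportions = raw_similarity / len(studies)

print(raw_similarity)
print(proportions)
```

Because each co-occurrence matrix is an indicator matrix times its own transpose, the summed matrix is symmetric and positive semidefinite, which is what makes it usable as input to a principal components analysis.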

Results of the Meta-Analysis

To qualify for inclusion in the present analysis, studies had to employ an exploratory factor-analytic procedure, use all 22 items of the MBI, and report factor loadings from their analyses. Twenty-five studies met these selection criteria and were included in the meta-analysis (studies included in the meta-analysis are marked with an asterisk in the reference section).

The raw similarity matrix was generated according to Shafer’s (2005) meta-analytic procedure. Individual factor-by-item matrices were generated for each of the 25 studies in the sample. Elements of the individual matrices were coded as 1 if the item loaded primarily on the factor, or 0 otherwise. Following the recommendation of Stevens (2002), factor loadings greater than or equal to .40 were considered salient. Each matrix was subsequently multiplied by its transpose to generate a co-occurrence matrix. The 27 co-occurrence matrices were summed to produce the raw similarity matrix (see Table 5). The elements of the raw similarity matrix were next converted into simple proportions to generate a Gramian co-occurrence matrix suitable for factor-analytic procedures. Results of the analysis are discussed in a similar fashion to conventional factor-analytic interpretations (see Nunnally, 1978, for a discussion of factor analysis interpretation using similarity data).

A principal components analysis (PCA) was chosen as the extraction method for the similarity data. Three components emerged with eigenvalues greater than 1 (the K1 rule), accounting for 86.7% of the variance. An examination of the scree plot also suggested that three components should be rotated to a final solution. The three components were considered to demonstrate conceptual clarity and were consistent with Maslach and Jackson’s initial conceptualization of burnout. Therefore, it was decided to retain and rotate the three components.
The components were rotated using orthogonal (varimax) rotation to aid in interpretation of the components. Table 6 presents the rotated pattern coefficients. Items have been arranged in descending order according to these loadings.
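The retention and rotation steps described above (eigenvalues greater than 1, then varimax) can be sketched as follows. This is an illustration under stated assumptions rather than the analysis actually run: the 4 x 4 similarity matrix is a toy example, the function names `pca_loadings` and `varimax` are hypothetical, and the rotation uses a commonly used SVD-based varimax iteration, which may differ from the software routine used in the study.

```python
import numpy as np


def pca_loadings(similarity):
    """Principal components of a similarity matrix, retained by the K1 rule."""
    eigenvalues, eigenvectors = np.linalg.eigh(similarity)
    order = np.argsort(eigenvalues)[::-1]          # largest eigenvalue first
    eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]
    n_keep = int((eigenvalues > 1.0).sum())         # K1 rule
    loadings = eigenvectors[:, :n_keep] * np.sqrt(eigenvalues[:n_keep])
    explained = eigenvalues[:n_keep].sum() / eigenvalues.sum()
    return loadings, explained


def varimax(loadings, max_iter=100, tol=1e-6):
    """Orthogonal varimax rotation via the usual SVD iteration."""
    n_items, n_components = loadings.shape
    rotation = np.eye(n_components)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        target = rotated**3 - rotated @ np.diag((rotated**2).sum(axis=0)) / n_items
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):   # no further improvement
            break
        criterion = new_criterion
    return loadings @ rotation


# Toy similarity matrix with two clear item clusters (hypothetical values).
toy = np.array([[1.0, 0.9, 0.0, 0.0],
                [0.9, 1.0, 0.0, 0.0],
                [0.0, 0.0, 1.0, 0.8],
                [0.0, 0.0, 0.8, 1.0]])

loadings, explained = pca_loadings(toy)
rotated = varimax(loadings)
print(loadings.shape[1], round(explained, 3))
print(np.round(np.abs(rotated), 2))
```

An orthogonal rotation redistributes variance among the retained components but leaves each item's communality (its row sum of squared coefficients) unchanged, which is a quick sanity check on any rotation routine.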

Table 5
Raw Co-Occurrence Matrix of Maslach Burnout Inventory Studies

Item  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22
   1 27
   2 27 27
   3 25 25 27
   4 20 20 19 26
   5 26 26 26 19 27
   6 22 22 23 17 23 26
   7 21 21 20 15 21 20 24
   8 15 15 15 20 15 13 12 24
   9 21 21 22 17 22 21 20 15 24
  10  1  1  1  5  1  3  2  7  2 23
  11  1  1  1  6  1  3  2  8  2 23 27
  12  4  4  4  8  4  6  3 10  4 21 25 28
  13  1  1  1  5  1  3  1  8  1 20 21 19 25
  14  2  2  2  3  2  4  2  4  2 14 15 13 15 18
  15  0  0  1  0  0  0  2  0  1  1  1  1  2  0 24
  16  0  0  1  0  0  0  1  1  0  0  0  0  3  0 23 26
  17  0  0  0  1  0  0  2  2  0  0  1  1  2  0 21 23 27
  18  3  3  3  4  3  4  3  3  2  0  1  2  0  0 15 16 18 23
  19  0  0  1  0  0  0  1  1  0  0  0  0  2  0 21 24 24 17 26
  20  1  1  2  1  1  1  2  0  0  0  0  1  1  0 21 22 24 19 24 27
  21  0  0  0  0  0  1  2  0  0  0  0  0  1  0 20 21 24 19 23 25 26
  22  0  0  0  0  0  0  1  0  0  0  0  0  1  0 18 20 20 15 21 20 20 21

Note: Items 1 to 9 form the Emotional Exhaustion subscale, Items 10 to 14 form the Depersonalization subscale, and Items 15 to 22 form the Personal Accomplishment subscale.

Discussion

The purpose of this study was to address three distinct but related questions pertaining to the Maslach Burnout Inventory. First, what is the factor structure of the MBI? Second, what is the nature of the relationship between the factors? Finally, how does the factor structure compare and contrast with the conceptualization of burnout in the original work by Maslach and Jackson (1981)? The various sample characteristics, instrumentation variation, factor-analytic methods, and the reported factor structure across EFA and CFA studies that have attempted to examine the dimensionality of the MBI were reviewed.

What Is the Factor Structure of the MBI?

On the basis of the EFA studies, there is substantial support for the three-factor model. We found support for this in the descriptive study and in the meta-analysis.

Table 6
Rotated Pattern Coefficients for the Maslach Burnout Inventory (MBI)

MBI Subscale Item   Emotional Exhaustion   Personal Accomplishment   Depersonalization
EE 5                        .98                     .00                    .00
EE 1                        .97                     .00                    .00
EE 2                        .97                     .00                    .00
EE 3                        .96                     .03                    .00
EE 9                        .94                     .00                    .02
EE 6                        .90                     .02                    .09
EE 7                        .89                     .00                    .01
EE 4                        .81                     .02                    .20
EE 8                        .69                     .02                    .33
PA 22                      −.01                     .98                    .00
PA 19                      −.01                     .97                    .01
PA 20                       .02                     .97                    .00
PA 17                       .00                     .96                    .03
PA 21                       .00                     .96                    .00
PA 16                       .00                     .94                    .02
PA 15                       .00                     .92                    .03
PA 18                       .14                     .82                    .01
DP 11                       .05                     .00                    .98
DP 10                       .05                     .00                    .97
DP 13                       .03                     .06                    .92
DP 12                       .16                     .01                    .92
DP 14                       .09                    −.01                    .87

This is particularly the case when orthogonal rotation is used to improve interpretation. This result is not surprising, given that oblique rotations require more parameters to be estimated. As a consequence, oblique solutions tend to yield less consistent results across replications (Thompson, 2004). There is modest support for the four-factor solution in EFA studies, although there is no consistency as to which of the hypothesized factors is split to form the fourth factor. Walkey and Green (1992) offered support for a two-factor solution when reanalyzing data from previously published studies. Indeed, there is more consistency among the studies that report a two-factor solution when EE and DP were interpreted together as the core of burnout. Likewise, there is strong support for a correlated three-factor model based on the CFA studies, despite discrepancies in how the models were constructed. Interestingly, in spite of substantial interfactor correlation, only one study (Boles, Dean, Ricks, Short, & Wang, 2000) tested a model that included a higher order factor (HOF). Unfortunately, because the model included three first-order factors, both a correlated factor model and an HOF model produced identical fit statistics. When “there are only three first-order factors, then . . . the model is just identified, and therefore the overall test of goodness of fit of the model does not test the second-order structure” (Rindskopf & Rose, 1988, p. 53). In addressing the need for HOF analysis, Gorsuch (1983) suggests that higher order factors “be extracted and examined so that the investigator may gain the fullest possible understanding of the data” (p. 255). Thus, the magnitude of the reported interfactor correlations in the MBI studies reviewed suggests that a higher order factor might exist and should therefore be explored in future research.

Densten (2001) provides the only known CFA study that has tested a model with more than three factors, finding good fit for a five-factor model. No known attempts have been made to replicate the five-factor model. Densten (2005) argues that this expanded, five-factor model “allows for greater sensitivity in understanding . . . burnout [and] . . . the development of a process model of burnout” (p. 107). Densten’s (2001, 2005) model suggests a need for further research regarding the nature of the five correlated factors unique to his research. Although there is no evidence of a direct replication of this model in the literature, partial support has been demonstrated. Using EFA, Kanste, Miettunen, and Kyngas (2006) found that a four-factor solution best accounted for the underlying dimensions of the MBI-HSS. Similar to Densten (2001), the PA factor separated into two distinct factors.
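The identification point quoted from Rindskopf and Rose (1988) comes down to a parameter count, which the short sketch below makes explicit. It assumes a single higher order factor whose variance is fixed for scaling; the function name `second_order_df` is hypothetical.

```python
def second_order_df(n_first_order):
    """Degrees of freedom gained by replacing freely correlated first-order
    factors with one higher order factor (assumed: single HOF, variance fixed).
    """
    # The correlated model estimates one covariance per pair of factors.
    factor_covariances = n_first_order * (n_first_order - 1) // 2
    # The HOF model replaces them with one second-order loading per factor.
    second_order_loadings = n_first_order
    return factor_covariances - second_order_loadings


for k in (3, 4, 5):
    print(k, second_order_df(k))
```

With three first-order factors the difference is zero, so the two models reproduce the data equally well and fit statistics cannot distinguish them; with four or five first-order factors the higher order structure imposes testable restrictions.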

What Is the Nature of the Relationship Between the MBI Factors?

There is consistency in the findings between the EFA and CFA studies with regard to the correlations between factors. Overall, the EFA and CFA studies produced similar mean factor correlations. These correlations between the three burnout factors have practical significance. What is less clear is the precise nature of the relationship between factors. The strongest and most consistent relationship appears to be between the EE and DP factors. A large positive factor correlation between EE and DP was reported for 25 distinct samples. The mean correlation between these two factors in both EFA and CFA studies was essentially equivalent, indicating that the shared variance between these two factors is greater than 32%. Factor correlations between EE and PA, or DP with PA, were largely negative in the CFA studies, providing support for the notion that PA is a distinct factor. However, there was a mixture of positive and negative results in the EFA studies. It is unclear why these factor correlations varied so dramatically in both sign and magnitude in the EFA studies, but it may have something to do with the way that measurement error is treated in exploratory factor analysis. Unfortunately, only eight studies provided enough information to further investigate the role of measurement error on factor correlations. Therefore, the role of measurement error on MBI factor correlations in EFA studies remains unknown.

How Does the Reported Factor Structure Compare With the Initial Conceptualization of Burnout?

Maslach and Jackson (1981) discussed the independence among the factors of burnout that are measured by the Maslach Burnout Inventory. There is a mixed response among researchers who have attempted to address the question of independence among factors. There is support for three factors, but the nature of the relationship is not consistent across studies. Specifically, the three factors are not independent, but they are related in different ways across studies. This might suggest differences in sample characteristics that could be addressed in a reliability generalization study. Differences in the reported relationships between factors may be explained in part if there are significant differences in sample characteristics that account for measurement error, and thus attenuate the estimated internal consistency reliability. Alternatively, the differences in direction and magnitude of correlated factors might also suggest that there is an alternative model for understanding the underlying structure of burnout that has not been tested. One example of an alternative model might impose a higher order factor, such as the core of burnout. Indeed, as mentioned earlier, Boles et al. (2000) compared an HOF model with the correlated factor model and found identical fit. However, the limited reporting of results from that study precludes making a decision about the existence of a higher order factor. The current study contributes most to questions concerning the interpretability of the three-factor structure of burnout. The possibility of a higher order factor, for example, may call into question the interpretation of burnout as measured by the MBI. The higher order factor might be the core of burnout that Walkey and Green (1992) introduced.
In any event, the findings reported here suggest that there is something more complex going on with the structure of burnout as measured by the MBI beyond the commonly accepted three-factor model. This point continues to be illustrated in part by the fact that studies are still being published that have produced sample-specific differences in the reported factor structure (e.g., Kanste et al., 2006).

All of these results must be understood in the context of some limitations of the current study. The reported factor structure and relationships between factors discussed here are, of course, limited by the comprehensiveness of the literature review. Although diligence was practiced in the search for pertinent exploratory and confirmatory studies that have examined the MBI, it would be presumptuous to claim that the reference list included here is exhaustive. Among studies not included here were those that used the MBI–General survey because the items and thus the interpretation of the factors are different. Also not included were unpublished manuscripts and dissertations.

Continued investigation of the underlying structure of the MBI scale scores and the identification of factors that may influence that structure is warranted. Although the MBI is one of several measures of burnout, it continues to be among the most frequently cited measures in the burnout literature. Future applications and interpretation of results from use of the MBI would benefit from research that explores the possibility of a higher order factor of burnout. In addition, studies that consider the influence of sample characteristics on measurement error when administering the MBI would contribute to more meaningful interpretations of the varied correlations between factors and may provide useful information with regard to the use of the MBI in different situational contexts. Finally, future research can enhance understanding of the underlying structure of burnout as measured by the MBI by conducting an empirical analysis of existing findings with regard to factor loadings or correlation matrices.

References
References marked with an asterisk indicate studies included in the meta-analysis.
* Abu-Hilal, M. M. (1995). Dimensionality of burnout: Testing for invariance across Jordanian and Emirati teachers. Psychological Reports, 77, 1367-1375.
* Abu-Hilal, M. M., & Salameh, K. M. (1992). Validity and reliability of the Maslach Burnout Inventory for a sample of non-Western teachers. Educational and Psychological Measurement, 52, 161-169.
* Aluja, A., Blanch, A., & Garcia, L. F. (2005). Dimensionality of the Maslach Burnout Inventory in school teachers: A study of several proposals. European Journal of Psychological Assessment, 21, 67-66.
Becker, G. (1996). The meta-analysis of factor analysis: An illustration based on the cumulation of correlation matrices. Psychological Methods, 1, 341-353.
Beckstead, J. W. (2002). Confirmatory factor analysis of the Maslach Burnout Inventory among Florida nurses. International Journal of Nursing Studies, 39, 785-792.
* Belcastro, P. A., Gold, R. S., & Hays, L. C. (1983). Maslach Burnout Inventory: Factor structures for samples of teachers. Psychological Reports, 53, 364-366.
Beretvas, S. N., & Furlow, C. F. (2006). Evaluations of an approximate method for synthesizing covariance matrices for use in meta-analytic SEM. Structural Equation Modeling: A Multidisciplinary Journal, 13(2), 153-185.
Boles, J. S., Dean, D. H., Ricks, J. M., Short, J. C., & Wang, G. (2000). The dimensionality of the Maslach Burnout Inventory across small business owners and educators. Journal of Vocational Behavior, 56, 13-34.
Brookings, J. B., Bolton, B., Brown, C. E., & McEvoy, A. (1985). Self-reported job burnout among female human service professionals. Journal of Occupational Behaviour, 6, 143-150.
Byrne, B. M. (1991). The Maslach Burnout Inventory: Validating factorial structure and invariance across intermediate, secondary, and university educators. Multivariate Behavioral Research, 26, 583-605.
Byrne, B. M. (1993). The Maslach Burnout Inventory: Testing for factorial validity and invariance across elementary, intermediate, and secondary teachers. Journal of Occupational and Organizational Psychology, 66, 197-212.
Byrne, B. M. (1994). Testing for the factorial validity, replication, and invariance of a measuring instrument: A paradigmatic application based on the Maslach Burnout Inventory. Multivariate Behavioral Research, 29, 289-311.
Comrey, A. L. (1978). Common methodological problems in factor analytic studies. Journal of Consulting and Clinical Psychology, 46, 648-659.

Cordes, C. L., Dougherty, T. W., & Blum, M. (1997). Patterns of burnout among managers and professionals: A comparison of models. Journal of Organizational Behavior, 18, 685-701.
* Densten, I. L. (2001). Re-thinking burnout. Journal of Organizational Behavior, 22, 833-847.
Densten, I. (2005). The relationship between visioning behaviours of leaders and follower burnout. British Journal of Management, 16, 105-118.
Evans, B. K., & Fischer, D. G. (1993). The nature of burnout: A study of the three-factor model of burnout in human service and non-human service samples. Journal of Occupational and Organizational Psychology, 66, 29-38.
Fan, X., & Thompson, B. (2001). Confidence intervals about score reliability coefficients, please: An EPM guidelines editorial. Educational and Psychological Measurement, 61, 517-531.
Fimian, M. J., & Blanton, L. P. (1987). Stress, burnout, and role problems among teacher trainees and first year teachers. Journal of Occupational Behaviour, 8, 157-165.
Fimian, M. J., Fastenau, P. A., Tashner, J. H., & Cross, A. H. (1989). The measure of classroom stress and burnout among gifted and talented students. Psychology in the Schools, 26, 139-153.
Firth, H., McIntee, J., McKeown, P., & Britton, P. G. (1985). Maslach Burnout Inventory: Factor structure and norms for British nursing staff. Psychological Reports, 57, 147-150.
Gold, Y. (1984). The factorial validity of the Maslach Burnout Inventory in a sample of California elementary and junior high school classroom teachers. Educational and Psychological Measurement, 44, 1009-1016.
Gold, Y., Bachelor, P., & Michael, W. B. (1989). The dimensionality of a modified form of the Maslach Burnout Inventory for university students in a teacher-training program. Educational and Psychological Measurement, 49, 549-561.
Gold, Y., Roth, R. A., & Wright, C. R. (1992). The factorial validity of a teacher burnout measure (Educators Survey) administered to a sample of beginning teachers in elementary and secondary schools in California. Educational and Psychological Measurement, 52, 761-768.
Golembiewski, R. T., Munzenrider, R., & Carter, D. (1983). Phases of progressive burnout and their work site covariants: Critical issues in OD research and praxis. Journal of Applied Behavioral Science, 19, 461-481.
Gorsuch, R. L. (1983). Factor analysis (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum.
Gorter, R. C., Albrecht, G., Hoogstraten, J., & Eijkman, M. A. J. (1999). Factorial validity of the Maslach Burnout Inventory–Dutch version (MBI-NL) among dentists. Journal of Organizational Behavior, 20, 209-217.
Green, D. E., & Walkey, F. H. (1988). A confirmation of the three-factor structure of the Maslach Burnout Inventory. Educational and Psychological Measurement, 48, 579-585.
* Green, D. E., Walkey, F. H., & Taylor, A. J. W. (1991). The three-factor structure of the Maslach Burnout Inventory: A multicultural, multinational confirmatory study. Journal of Social Behavior and Personality, 6, 453-472.
* Gupchup, G. V., Lively, B. T., Holiday-Goodman, M., Siganga, W. W., & Black, C. D. (1994). Maslach Burnout Inventory: Factor structures for pharmacists in health maintenance organizations and comparison with normative data for USA pharmacists. Psychological Reports, 74, 891-895.
Halbesleben, J. R. B., & Demerouti, E. (2005). The construct validity of an alternative measure of burnout: Investigating the English translation of the Oldenburg Burnout Inventory. Work & Stress, 19, 208-220.
Henson, R. K. (2001). Understanding internal consistency reliability estimates: A conceptual primer on coefficient alpha. Measurement and Evaluation in Counseling and Development, 34, 177-189.
Henson, R. K., Capraro, R. M., & Capraro, M. M. (2004). Reporting practices and use of exploratory factor analyses in educational research journals: Errors and explanation. Research in the Schools, 11, 61-72.
Henson, R. K., & Roberts, J. K. (2006). Use of exploratory factor analysis in published research: Common errors and some comment on improved practice. Educational and Psychological Measurement, 66, 393-416.

Holland, P. J., Michael, W. B., & Kim, S. (1994). Construct validity of the Educators Survey for a sample of middle school teachers. Educational and Psychological Measurement, 54, 822-829.
Huberty, T. J., & Huebner, E. S. (1988). A national survey of burnout among school psychologists. Psychology in the Schools, 25, 54-61.
* Iwanicki, E. F., & Schwab, R. L. (1981). A cross validation study of the Maslach Burnout Inventory. Educational and Psychological Measurement, 41, 1167-1174.
Kaiser, H. F., Hunka, S., & Bianchini, J. (1969). Relating factors between studies based upon different individuals. In H. J. Eysenck & S. B. G. Eysenck (Eds.), Personality structure and measurement (pp. 333-343). San Diego, CA: Knapp.
Kalliath, T. J., O’Driscoll, M. P., Gillespie, D. F., & Bluedorn, A. C. (2000). A test of the Maslach Burnout Inventory in three samples of healthcare professionals. Work & Stress, 14, 35-50.
* Kantas, A., & Vassilaki, E. (1997). Burnout in Greek teachers: Main findings and validity of the Maslach Burnout Inventory. Work & Stress, 11, 94-100.
Kanste, O., Miettunen, J., & Kyngas, H. (2006). Factor structure of the Maslach Burnout Inventory among Finnish nursing staff. Nursing and Health Sciences, 8, 201-207.
Kitaoka-Higashiguchi, K., Nakagawa, H., Morikawa, Y., Ishizaki, M., Miura, K., Naruse, Y., et al. (2004). Construct validity of the Maslach Burnout Inventory–General Survey. Stress and Health, 20, 255-260.
* Koeske, G. F., & Koeske, R. D. (1989). Construct validity of the Maslach Burnout Inventory: A critical review and reconceptualization. Journal of Applied Behavioral Science, 25, 131-144.
* Kokkinos, C. M. (2006). Factor structure and psychometric properties of the Maslach Burnout Inventory–Educator Survey among elementary and secondary school teachers in Cyprus. Stress and Health, 22, 25-33.
Lee, R. T., & Ashforth, B. E. (1990). On the meaning of Maslach’s three dimensions of burnout. Journal of Applied Psychology, 75, 743-747.
Lee, R. T., & Ashforth, B. E. (1996). A meta-analytic examination of correlates of the three dimensions of job burnout. Journal of Applied Psychology, 81, 123-133.
Leiter, M. P., & Maslach, C. (1988). The impact of interpersonal environment on burnout and organizational commitment. Journal of Organizational Behavior, 9, 297-308.
* Linehan, M. M., Cochran, B. N., Mar, C. M., Levensky, E. R., & Comtois, K. A. (2000). Therapeutic burnout among borderline personality disordered clients and their therapists: Development and evaluation of two adaptations of the Maslach Burnout Inventory. Cognitive and Behavioral Practice, 7, 329-337.
Loeber, R., & Schmaling, K. B. (1985). Empirical evidence for overt and covert patterns of anti-social conduct problems: A meta-analysis. Journal of Abnormal Child Psychology, 13, 337-353.
Loo, R. (2001). Motivational orientations toward work: An evaluation of the Work Preference Inventory (Student Form). Measurement and Evaluation in Counseling and Development, 33, 222-233.
* Maslach, C., & Jackson, S. E. (1981). The measurement of experienced burnout. Journal of Occupational Behaviour, 2, 99-113.
Maslach, C., & Jackson, S. E. (1986). Maslach Burnout Inventory manual (2nd ed.). Palo Alto, CA: Consulting Psychologists Press.
Maslach, C., Schaufeli, W. B., & Leiter, M. P. (2001). Job burnout. Annual Review of Psychology, 52, 397-422.
* Mor, V., & Laliberte, L. (1984). Burnout among hospice staff. Health and Social Work, 9, 274-283.
Nunnally, J. C. (1978). Psychometric theory (2nd ed.). New York: McGraw-Hill.
* Pelsma, D. M., Roland, B., Tollefson, N., & Wigington, H. (1989). Parent burnout: Validation of the Maslach Burnout Inventory with a sample of mothers. Measurement and Evaluation in Counseling and Development, 22, 81-87.
* Pierce, C. M. B., & Molloy, G. N. (1989). The construct validity of the Maslach Burnout Inventory: Some data from down under. Psychological Reports, 65, 1340-1342.

* Powers, S., & Gose, K. F. (1986). Reliability and construct validity of the Maslach Burnout Inventory in a sample of university students. Educational and Psychological Measurement, 46, 251-255.
Richardsen, A. M., & Martinussen, M. (2004). The Maslach Burnout Inventory: Factorial validity and consistency across occupational groups in Norway. Journal of Occupational and Organizational Psychology, 77, 377-384.
Rindskopf, D., & Rose, T. (1988). Some theory and applications of confirmatory second-order factor analysis. Multivariate Behavioral Research, 23(1), 51-67.
Schaufeli, W. B., Bakker, A. B., Hoogduin, K., Schaap, C., & Kladler, A. (2001). On the clinical validity of the Maslach Burnout Inventory and the Burnout Measure. Psychology and Health, 16, 565-582.
Schaufeli, W. B., Daamen, J., & Van Mierlo, H. (1994). Burnout among Dutch teachers: An MBI-validity study. Educational and Psychological Measurement, 54, 803-812.
Schaufeli, W. B., & Enzmann, D. (1998). The burnout companion to study and practice: A critical analysis. London: Taylor & Francis.
Schaufeli, W. B., Leiter, M. P., Maslach, C., & Jackson, S. E. (1996). The Maslach Burnout Inventory–General Survey. In C. Maslach, S. E. Jackson, & M. P. Leiter (Eds.), Maslach Burnout Inventory manual (3rd ed.). Palo Alto, CA: Consulting Psychologists Press.
Schaufeli, W. B., & Van Dierendonck, D. (1993). The construct validity of two burnout measures. Journal of Organizational Behavior, 14, 631-647.
Scherer, R. F., Cox, M. K., Key, C. C., Stickney, F. A., & Spangler, E. M. (1992). Assessing the similarity of burnout dimensions in two business samples. Psychological Reports, 71, 28-30.
Shafer, A. (2005). Meta-analysis of the Brief Psychiatric Rating Scale factor structure. Psychological Assessment, 17, 324-335.
Shirom, A. (1989). Burnout in work organizations. In C. L. Cooper & I. Robertson (Eds.), International review of industrial and organizational psychology (pp. 25-48). New York: Wiley.
* Soderfeldt, M., Soderfeldt, B., Warg, L., & Ohlson, C. (1996). The factor structure of the Maslach Burnout Inventory in two Swedish human service organizations. Scandinavian Journal of Psychology, 37, 437-443.
Stevens, J. P. (2002). Applied multivariate statistics for the social sciences (4th ed.). Mahwah, NJ: Lawrence Erlbaum.
Tang, C. S. (1998). Assessment of burnout for Chinese human service professionals: A study of factorial validity and invariance. Journal of Clinical Psychology, 54, 55-58.
Thompson, B. (2004). Oblique rotations and higher order factors. In Exploratory and confirmatory factor analysis: Understanding concepts and applications. Washington, DC: American Psychological Association.
Velicer, W. F., & Jackson, D. N. (1990). Component analysis versus common factor analysis: Some issues in selecting an appropriate procedure. Multivariate Behavioral Research, 25, 1-28.
Velicer, W. F., Peacock, A. C., & Jackson, D. N. (1982). A comparison of component and factor patterns: A Monte Carlo approach. Multivariate Behavioral Research, 17, 371-388.
Walkey, F. H., & Green, D. E. (1992). An exhaustive examination of the replicable factor structure of the Maslach Burnout Inventory. Educational and Psychological Measurement, 52, 309-323.
* Whitehead, A., Ryba, K., & O’Driscoll, M. (2000). Burnout among New Zealand primary school teachers. New Zealand Journal of Psychology, 29, 52-68.
Widaman, K. F. (1993). Common factor analysis versus principal component analysis: Differential bias in representing model parameters? Multivariate Behavioral Research, 28, 263-311.
