Implications for the Hispanic Paradox

American Journal of Epidemiology Copyright © 2004 by the Johns Hopkins Bloomberg School of Public Health All rights reserved

Vol. 159, No. 7 Printed in U.S.A. DOI: 10.1093/aje/kwh089

PRACTICE OF EPIDEMIOLOGY

Evaluation of Mortality Data for Older Mexican Americans: Implications for the Hispanic Paradox

Kushang V. Patel1,2, Karl Eschbach2,3,4,5, Laura A. Ray2,3, and Kyriakos S. Markides1,2

Received for publication June 18, 2003; accepted for publication October 2, 2003.

The authors evaluated underascertainment bias in Hispanic mortality rates from population surveys linked to the US National Death Index (NDI). They compared vital status through 7 years ascertained from an NDI search and from active follow-up for 2,886 Mexican-American subjects, aged ≥65 years at baseline in 1993–1994, from the Hispanic Established Populations for Epidemiologic Studies of the Elderly (EPESE). Estimates of NDI underascertainment were applied to mortality rate ratios for 66,667 older Mexican Americans and non-Hispanic Whites from the 1986–1994 National Health Interview Surveys linked to the NDI. The NDI and active follow-up agreed on vital status for 91.2% of Hispanic EPESE subjects. The NDI did not identify 177 deaths (20.7%) reported by proxies. Underascertainment was greater for women than for men and was more pronounced when stratified by age and nativity. The ratios of proxy-reported to NDI mortality rates were 1.31 (95% confidence interval (CI): 1.06, 1.62) for immigrant men and 1.65 (95% CI: 1.32, 2.08) for immigrant women. Before adjustment, National Health Interview Surveys–NDI age-standardized mortality rate ratios comparing Mexican Americans with non-Hispanic Whites were 0.77 (95% CI: 0.65, 0.92) for men and 0.92 (95% CI: 0.77, 1.09) for women but were 0.84 and 1.18, respectively, with adjustment for underascertainment. Findings suggest that NDI-based Hispanic mortality rates may be understated.

bias (epidemiology); databases; Hispanic Americans; longitudinal studies; mortality; vital statistics

Abbreviations: EPESE, Established Populations for Epidemiologic Studies of the Elderly; NDI, National Death Index; NHIS, National Health Interview Survey.

Since the middle 1980s, epidemiologists who study population health disparities have noted an apparent epidemiologic paradox. The Hispanic population in the United States has an average socioeconomic status comparable to that of African Americans but has age-adjusted mortality rates more similar to those for non-Hispanic Whites (1–4). This mortality advantage is commonly considered a paradox both because socioeconomic standing is a well-established determinant of mortality and because Hispanics (primarily

Mexican Americans) have an elevated prevalence of several risk factors for mortality, including diabetes and obesity (5– 9). Several substantive explanations for the Hispanic mortality advantage in the United States have been proposed, including health-selective immigration, return migration, and advantages in health-related behaviors and social support (10–14). However, before accepting such substantive explanations for the Hispanic mortality advan-

Correspondence to Dr. Karl Eschbach, Department of Internal Medicine, Division of Geriatrics, University of Texas Medical Branch, 301 University Boulevard, Galveston, TX 77555-0460 (e-mail: [email protected]).


Am J Epidemiol 2004;159:707–715


1 Department of Preventive Medicine and Community Health, Division of Sociomedical Sciences, University of Texas Medical Branch, Galveston, TX.
2 Sealy Center on Aging, University of Texas Medical Branch, Galveston, TX.
3 Department of Preventive Medicine and Community Health, Division of Epidemiology and Biostatistics, University of Texas Medical Branch, Galveston, TX.
4 Department of Internal Medicine, Division of Geriatrics, University of Texas Medical Branch, Galveston, TX.
5 Center for Immigration Research, University of Houston, Houston, TX.


mortality follow-up for select cohorts from the Current Population Survey. A second study has also ascertained mortality for nine successive cohorts (1986–1994) from the National Health Interview Survey (NHIS). Both data sets showed mortality advantages for Hispanics compared with non-Hispanic Whites (1, 2, 4). However, cohort studies are vulnerable to a potential bias resulting from underascertainment of mortality during the follow-up period. Mortality follow-up for both the National Longitudinal Mortality Study and the NHIS was accomplished by linking data on survey respondents to mortality records in the National Death Index (NDI). The NDI matching process uses the following information to identify cases: first and last names, middle initial, father’s surname, Social Security number, date of birth, state of birth, state of residence, sex, race, marital status, and age at death (19). Researchers have developed probabilistic approaches that classify cohort members as matched or unmatched to death records based on the level of agreement and disagreement between survey and death record information (20). This matching process has been carefully validated by using calibration samples from studies with an active follow-up of vital status (21). Both the NDI database and the matching method used for linkages between survey respondents and NDI records are considered the “gold standard” for ascertainment of mortality for a community cohort (22). Important questions remain, however, about the accuracy of death information when the NDI is used for Hispanic cohorts. Emigration from the United States presents potential problems for NDI-based studies because deaths outside of the country are not included in the NDI database. Even when a death occurs in the United States, automated matching algorithms may not work as well for Hispanics as they do for other groups, especially non-Hispanic Whites. There are several reasons for this concern. 
The NDI matching method relies heavily on Social Security number matches. Use and accuracy of Social Security numbers may be lower both in survey responses and on death certificates for segments of the Hispanic population that are undocumented or whose work experience is primarily in the informal agriculture or domestic service sectors. Name matches may also be less reliable for Hispanics than for non-Hispanic Whites because Hispanic naming practices differ from non-Hispanic White conventions in several ways that can affect how names are reported both to survey researchers and on the death certificate. For example, many Hispanics use both father’s and mother’s surnames as part of their name. By custom, at marriage, the bride adds her husband’s surname while retaining her parents’ surnames. For many Hispanics, the first surname listed is considered the primary surname, in contrast to the conventional US emphasis placed on “last name.” Identifying a single “middle name” may also be difficult for some Hispanics because this term may refer to one of several given names or one of several family surnames. Anglicized name variants, for example, “Mary” rather than “Maria,” may also be used in giving information to non-Hispanics (23). A lower NDI match rate for Hispanics for any of these reasons would increase the appearance of a Hispanic mortality advantage. Unfortunately, performance of the NDI


tage, it is important to explore whether systematic data errors create the appearance of a mortality advantage. Two kinds of evidence exist for a Hispanic mortality advantage compared with other groups: 1) mortality rates calculated by linking vital registration death data to census population counts and 2) mortality rates calculated from cohort studies in which mortality is ascertained over a follow-up period. Both kinds of evidence are subject to data errors that might lead to misstatement of a Hispanic mortality advantage. Mortality rates calculated by linking death counts from vital registration to population counts from census enumerations are subject to error because the numerator and denominator come from different sources. Both data systems are subject to coverage errors (i.e., incomplete enumeration and death registration), discrepancies in ethnic classification, and age misstatement. Each of these errors can produce systematic biases in the calculation of death rates that could create a false appearance of ethnic disparities (15). Hispanics, like other minority populations, are disproportionately exposed to several risk factors for both underregistration of deaths and underenumeration in a census. These factors include low levels of formal schooling and English language literacy; residency in poor, rural, and near-border environments; overrepresentation in agricultural and domestic service occupations; and undocumented immigration status. It is not clear a priori whether the undercount error will be larger in the numerator (deaths) or the denominator (census). However, one particular concern is that health-selective return migration of international immigrants to a country of origin may lead to underestimation of mortality rates in the United States by systematically removing report of deaths from the numerator but not the denominator of mortality calculations (10, 13, 14, 16). This possibility has been called the “salmon bias” hypothesis (12). 
Another especially important concern is that Hispanic ethnicity may be systematically underreported in death registration compared with census data. This underreporting may occur in particular because the ethnicity field on the death certificate is sometimes filled out by a physician or funeral director who may not know the decedent well (15, 17). In addition, overstatement of age on the death certificate lowers age-adjusted mortality estimates (18), which may occur disproportionately among Hispanics because of their relatively low levels of formal schooling and potentially limited access to a birth certificate. All of these concerns cumulate to raise questions about comparison of Hispanic and non-Hispanic mortality using vital registration death rates alone. The second kind of data documenting a Hispanic mortality advantage comes from studies that ascertain vital status for large cohorts over a period of follow-up. Data from cohort studies seemingly provide powerful corroborating evidence about the Hispanic mortality advantage because they eliminate concerns about inconsistent ethnic classification in census and vital registration and reduce concerns about age misstatement. Hispanic identification is fixed by self-report at the beginning of the study, so inconsistency of reporting ethnicity at subsequent points does not affect calculated mortality rates. Two important studies have used this design. The National Longitudinal Mortality Study reported


MATERIALS AND METHODS

Study samples

The Hispanic EPESE is a cohort of 3,050 community-dwelling Mexican Americans aged 65 years or older living in the southwestern region of the United States (25). An area probability sample was drawn in 1993–1994, with 83 percent of those originally sampled participating in the baseline survey. This sample is representative of approximately 500,000 elderly Mexican Americans living in five southwestern states (Arizona, California, Colorado, New Mexico, and Texas) during 1990 and was followed up through 2001, with three subsequent waves of data collection in 1995–1996, 1998–1999, and 2000–2001. The first vital status information for the Hispanic EPESE cohort was collected during each of the three follow-up interview waves. If a subject was missing at follow-up, then proxy respondents were asked whether the subject had died, had moved, or had been admitted to a hospital, nursing home, or hospice facility. Proxy informants who reported that the subject was deceased were interviewed by using a death questionnaire that asked the date, location, and cause of death as well as questions about hospital or nursing home admissions prior to the subject’s death. The overwhelming majority (86 percent) of proxy informants were family members of the missing subject. In 164 instances, interviews with proxy informants were not completed for subjects lost over the follow-up period (i.e., neither the subject nor a proxy informant was reinterviewed after baseline). Therefore, data were analyzed for 2,886 subjects, with vital status
ascertained through interviews with subjects or proxy informants. Excluded subjects for whom vital status information was missing were significantly younger (mean age, 69.2 years; standard deviation, 5.2 years) than those subjects included in the data analysis (mean age, 73.3 years; standard deviation, 6.8 years), although no differences were observed by sex and nativity. Vital status information obtained from interviews with subjects or proxy informants (hereafter referred to as “proxy reports”) was compared with information from a match of the Hispanic EPESE cohort to death records in the NDI database through a search performed in 2002. Deaths through December 31, 2000, were ascertained. The system used by the National Center for Health Statistics (Hyattsville, Maryland) to match the NHIS to the NDI was used to identify deaths of Hispanic EPESE subjects (26). This system involves a two-step process. In the first step, NDI records are identified as potential matches to subjects when one of nine criteria is met (formerly 12 criteria were used by the NDI, but, in 1999, four of the Social Security number–related criteria were collapsed into a single Social Security number match criterion). The nine criteria are various combinations of agreement between supplied survey information and NDI records on seven matching items. For a given subject, one or more NDI records may be identified as potential matches; for other subjects, no potential matches are identified. In the second step, potentially matched records are evaluated to determine whether one particular record can be identified as a “true” match to the survey subject. In this step, statistical weights derived from calibration samples provide investigators with probability scores to evaluate potential matches. All potential matches are categorized into mutually exclusive classes based on which items matched and the combination of matched items. 
Class-specific cutoffs of probability scores are then used to identify “true” matches (21). Using the NHIS-NDI cutoff scores, we identified 753 deaths among the 2,886 subjects in the Hispanic EPESE study sample. To evaluate the implications of the findings from the Hispanic EPESE for mortality differentials between Hispanics and non-Hispanic Whites, adjustment factors calculated from the Hispanic EPESE data were applied to mortality information estimated from the 1986–1994 NHIS cohort data that were linked to the NDI. The NHIS is an annual cross-sectional survey of approximately 75,000 persons, representative of the noninstitutionalized US adult population. Vital status searches for subjects from these surveys were performed through December 31, 1997, using the NDI. To provide survival time comparable to that in the Hispanic EPESE (7.5 years), data on subjects from the 1986–1991 NHIS cohorts were eligible for analysis. Data for all 66,667 non-Hispanic White and Mexican-American (self-identified as Mexican, Mexicano, Mexican American, or Chicano) subjects aged 65 years or older were used in the analysis.

Data analysis

To compare the vital status of Hispanic EPESE subjects by proxy reports and the NDI, both simple agreement and the kappa statistic were calculated. Stratified analyses of


matching algorithm has not been investigated for Hispanics because the calibration samples used thus far have contained few Hispanic members (24). These issues have contributed to continuing debate about the existence and magnitude of the Hispanic mortality advantage (11, 16). The purpose of this study was to determine empirically whether bias exists in Hispanic mortality estimates based on an NDI search for a large cohort of older Mexican Americans. Data from the Hispanic Established Populations for Epidemiologic Studies of the Elderly (EPESE) were used for this investigation. The Hispanic EPESE collected identifying information for all 12 items used in the NDI matching process, making it possible to reproduce closely the matching methods used in other population surveys linked to the NDI. Additionally, the Hispanic EPESE includes 7 years of active follow-up of vital status through interviews with subjects or reports from proxy informants. This combination of vital status information from proxy reports and the NDI makes the Hispanic EPESE a unique data set evaluating mortality information for Hispanics. We performed three tasks in this study. First, we compared vital status information from proxy informants and NDI matches. Second, we examined discrepancies in mortality ascertainment by age, sex, and nativity status. Finally, we derived adjustment factors for underascertainment from the Hispanic EPESE and applied them to 1986–1994 NHIS– NDI linked data to illustrate the impact of these adjustment factors on estimates of the Hispanic mortality advantage.


TABLE 1. Cross-tabulation of NDI* and proxy-reported vital status classifications, 1993/1994–2000 Hispanic EPESE* (n = 2,886)†

                  Proxy report (no.)
NDI               Dead       Alive      Total (no.)
Dead               677          76          753
Alive              177       1,956        2,133
Total              854       2,032        2,886

* NDI, National Death Index; EPESE, Established Populations for Epidemiologic Studies of the Elderly.
† Deaths were identified through December 31, 2000.

In calculating the preadjustment survival time from the NHIS-NDI data, survival time also began on the date of interview. To make 1986–1997 NHIS survival data comparable to the 1993/1994–2000 Hispanic EPESE data, subjects whose survival time exceeded 7.5 years were right censored at this maximal point (2,740 days). All other NHIS subjects were right censored with death dates from the NDI or December 31, 1997, whichever came first. Data were analyzed by using the SAS and SUDAAN statistical packages (31, 32).

RESULTS

Among Hispanic EPESE subjects, crude mortality based on proxy reports was 29.6 percent over the 7.5-year follow-up period (table 1). An NDI search over the same time period yielded a lower crude mortality estimate of 26.1 percent. Vital status classifications between the two sources agreed on 91.2 percent of the cases. A kappa value of 0.78 demonstrated reasonable concordance between the two sources while adjusting for chance agreement. Nonetheless, a total of 177 deaths (20.7 percent) reported by proxy informants were not identified by the NDI. Table 2 shows the demographic predictors of vital status classifications for Hispanic EPESE subjects by NDI and proxy reports. We found that the effect of age on mortality was stronger with proxy-reported information than when the NDI was used. This difference stemmed from fewer deaths identified by the NDI of very old Mexican Americans. For example, for subjects aged 80 years or older, proxy informants reported 60 deaths not identified by the NDI. (Note that only marginal totals are shown for each subgroup in table 2; therefore, counts reported in this paragraph cannot be directly calculated from table 2.) In addition, the mortality effect of male compared with female gender was weaker in the proxy-reported information than in the NDI data. A total of 116 women were not classified as dead by the NDI but had died according to proxy informants. Finally, foreign-born nativity was protective for mortality based on NDI classifications; however, this effect was not significant in the proxy death data because informants reported an additional 99 deaths of foreign-born subjects not classified as dead by the NDI.
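The agreement statistics quoted above can be checked directly from the four cells of table 1. The following is a minimal sketch, assuming the standard definitions of simple agreement and Cohen's kappa; the function and variable names are illustrative and do not come from the study.

```python
# Cell counts transcribed from table 1 (NDI status by proxy-report status).
ndi_dead_proxy_dead = 677
ndi_dead_proxy_alive = 76
ndi_alive_proxy_dead = 177
ndi_alive_proxy_alive = 1956


def agreement_and_kappa(a, b, c, d):
    """Return (observed agreement, Cohen's kappa) for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    observed = (a + d) / n
    # Agreement expected by chance, from the row and column marginal totals.
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    return observed, (observed - expected) / (1 - expected)


observed, kappa = agreement_and_kappa(
    ndi_dead_proxy_dead, ndi_dead_proxy_alive,
    ndi_alive_proxy_dead, ndi_alive_proxy_alive,
)
# Share of proxy-reported deaths that the NDI search did not identify.
missed_share = ndi_alive_proxy_dead / (ndi_dead_proxy_dead + ndi_alive_proxy_dead)

print(f"agreement={observed:.3f} kappa={kappa:.2f} missed={missed_share:.3f}")
```

Under these definitions the sketch gives 91.2 percent agreement, a kappa of 0.78, and 20.7 percent of proxy-reported deaths missed by the NDI, matching the figures in the text.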


mortality by age, sex, and nativity were also performed to investigate systematic differences in death ascertainment as a function of each of these characteristics. For each vital status source, logistic regression models were used to examine univariate associations of vital status with age, sex, and nativity. Multivariate proportional hazards models were also estimated to determine the effects of demographic factors on survival. Odds ratio and hazard ratio effect sizes were compared across each vital status source. To assess the magnitude of death underascertainment bias, ratios of proxy-reported to NDI mortality rates as well as 95 percent confidence intervals were calculated using the Hispanic EPESE data. To determine how this level of underascertainment would affect estimates of mortality for Mexican Americans in comparison to non-Hispanic Whites, these mortality ratios were then applied as adjustment factors to 1986–1994 NHIS Mexican-American (subjects aged ≥65 years) mortality rates. Multiplying the stratum-specific mortality ratio by the respective NHIS observed mortality rate provided an adjusted mortality rate. This methodology has been used previously to adjust mortality data for inconsistent race/ethnicity classifications of Native Americans in the United States and Māoris in New Zealand (27–29). Age standardizations for observed and adjusted mortality rates were made in the NHIS data by using the direct method. Standardized rates were prepared for each sex within each ethnic group (Mexican Americans and non-Hispanic Whites). All subjects aged 65 years or older in the entire 1986–1997 NHIS data set were considered the standard population. Age-standardized mortality rate ratios of NHIS Mexican Americans to non-Hispanic Whites were calculated with 95 percent confidence intervals by using the Mexican-American (exposed group) distribution as the reference standard (30).
Prior to applying adjustment factors from the Hispanic EPESE to the rates calculated from the NHIS–NDI, it was important to ensure that the mortality structure was comparable between the two data sets. The age distributions of NDI deaths in the Hispanic EPESE and the NHIS were found to be similar. For Hispanic EPESE subjects, survival time was estimated separately for each vital status source. The survival clock began with the baseline interview date for both sources. For mortality rates based on the NDI, subjects were right censored either on their date of death according to NDI or on December 31, 2000, for unmatched cases. Censoring subjects for whom proxy report information was available depended on reinterview status. If a subject refused to participate in the study at follow-up, then he or she was right censored at the midpoint of that data collection period (dates of refused interviews were not recorded). Proxy informants who reported that a subject had died were asked for the month and year of the death, which was used as the censoring date (the 15th was used as the day of death for all proxy-reported deaths). Subjects were right censored on December 31, 2000, if a censoring event occurred after this date. Finally, those subjects interviewed in the fourth wave of data collection were right censored either on December 31, 2000, or their date of interview that preceded January 1, 2001.
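The censoring rules for proxy-based survival time can be sketched as follows. This is an illustrative reading of the rules stated above, not the authors' code: the function name and argument layout are assumptions, and the midpoint rule for refused interviews is omitted because the wave dates are not given.

```python
from datetime import date

STUDY_END = date(2000, 12, 31)  # administrative censoring date stated in the text


def proxy_survival(baseline, death_year_month=None, last_interview=None):
    """Return (follow-up days, died) for one subject under the proxy-report rules.

    death_year_month: (year, month) reported by a proxy, or None if no death reported.
    last_interview:   date of the subject's final interview when no death was reported.
    """
    if death_year_month is not None:
        year, month = death_year_month
        # Proxies gave only month and year; the 15th is used as the day of death.
        event_date = date(year, month, 15)
        # Events after the study end are censored at December 31, 2000.
        died = event_date <= STUDY_END
    else:
        event_date = last_interview
        died = False
    end = min(event_date, STUDY_END)
    return (end - baseline).days, died


# Hypothetical subjects, for illustration only.
early_death = proxy_survival(date(1993, 10, 1), death_year_month=(1997, 3))
late_death = proxy_survival(date(1993, 10, 1), death_year_month=(2001, 2))
survivor = proxy_survival(date(1994, 2, 1), last_interview=date(2001, 1, 20))
print(early_death, late_death, survivor)
```

The baseline and event dates in the usage lines are invented to exercise each branch; they are not study data.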


TABLE 2. Demographic characteristics of NDI* and proxy-reported vital status classifications, 1993/1994–2000 Hispanic EPESE* (n = 2,886)†

NDI
                  Dead             Alive
                  No.      %       No.      %       OR*,‡   95% CI*       HR*,§   95% CI
Age (years)
  65–69           170     22.6     888     41.6     1.00                  1.00
  70–74           183     24.3     607     28.5     1.57    1.27, 1.96    1.47    1.20, 1.79
  75–79           135     17.9     349     16.4     2.02    1.55, 2.64    1.90    1.50, 2.41
  ≥80             265     35.2     289     13.6     4.79    3.76, 6.10    3.93    3.25, 4.76
Sex
  Female          348     46.2   1,312     61.5     1.00                  1.00
  Male            405     53.8     821     38.5     1.86    1.57, 2.20    1.79    1.54, 2.07
Nativity
  US born         427     56.7   1,195     56.0     1.00                  1.00
  Foreign born    326     43.3     938     44.0     0.97    0.84, 1.13    0.82    0.72, 0.93
Total             753            2,133

Proxy report
                  Dead             Alive
                  No.      %       No.      %       OR‡     95% CI        HR§     95% CI
Age (years)
  65–69           183     21.4     875     43.1     1.00                  1.00
  70–74           202     23.7     588     28.9     1.64    1.33, 2.03    1.51    1.24, 1.84
  75–79           157     18.4     327     16.1     2.30    1.83, 2.87    2.03    1.68, 2.46
  ≥80             312     36.5     242     11.9     6.16    4.80, 7.91    4.37    3.63, 5.26
Sex
  Female          427     50.0   1,233     60.7     1.00                  1.00
  Male            427     50.0     799     39.3     1.54    1.31, 1.82    1.51    1.32, 1.73
Nativity
  US born         470     55.0   1,152     56.9     1.00                  1.00
  Foreign born    384     45.0     880     43.3     1.07    0.92, 1.24    0.91    0.80, 1.02
Total             854            2,032

* NDI, National Death Index; EPESE, Established Populations for Epidemiologic Studies of the Elderly; OR, odds ratio; CI, confidence interval; HR, hazard ratio.
† Deaths were identified through December 31, 2000.
‡ Univariate odds ratios and 95% confidence intervals.
§ Multivariate hazard ratios and 95% confidence intervals; age, sex, and nativity were entered into the model.
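The univariate odds ratios in table 2 can be recovered from the dead and alive counts in each row. The sketch below assumes a plain 2x2 odds ratio against the reference category with a Woolf (log-scale) interval; the published intervals came from the authors' statistical software and need not match this simple-random-sampling approximation exactly. All names are illustrative.

```python
import math

# (dead, alive) counts transcribed from table 2 for two age groups, by source.
counts = {
    "NDI": {"65-69": (170, 888), ">=80": (265, 289)},
    "proxy": {"65-69": (183, 875), ">=80": (312, 242)},
}


def odds_ratio(exposed, reference, z=1.96):
    """Odds ratio of death with a Woolf-type 95% interval (an assumed formula)."""
    (d1, a1), (d0, a0) = exposed, reference
    estimate = (d1 / a1) / (d0 / a0)
    se_log = math.sqrt(1 / d1 + 1 / a1 + 1 / d0 + 1 / a0)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


for source, groups in counts.items():
    estimate, lower, upper = odds_ratio(groups[">=80"], groups["65-69"])
    print(f"{source}: OR={estimate:.2f} ({lower:.2f}, {upper:.2f})")

ndi_or = odds_ratio(counts["NDI"][">=80"], counts["NDI"]["65-69"])[0]
proxy_or = odds_ratio(counts["proxy"][">=80"], counts["proxy"]["65-69"])[0]
```

The point estimates reproduce the tabulated odds ratios for subjects aged 80 years or older (4.79 by the NDI and 6.16 by proxy report), which illustrates the stronger age effect seen with proxy-reported deaths.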

higher than NDI rates for each age stratum. For women, however, mortality ratios were larger, especially in the older age groups. Similarly, the proxy-informed mortality rate for foreign-born men was 1.1 times higher than the NDI rate; for women, it was 1.4 times higher. Overall, proxy-reported mortality rates were higher, by 9 percent for men and 28 percent for women, than rates based on the NDI. Table 4 shows stratified mortality rates (based on the NDI) for elderly Mexican Americans and non-Hispanic Whites from the 1986–1997 NHIS. For each age stratum, the origi-

Age- and nativity-specific mortality rates were calculated separately for Hispanic EPESE men and women because of significant sex-by-age and sex-by-nativity effect modification (interaction tests are not shown). Ratios of proxy report to NDI mortality rates are shown in table 3. Although mortality rates were higher for men than for women, discrepancies between the two vital status sources were larger for women than for men and were more pronounced when stratified by age and nativity. Among men, for example, proxy-informed mortality rates were approximately 1.1 times

TABLE 3. Mortality rates per 1,000 person-years and mortality rate ratios of proxy report to the NDI,* 1993/1994–2000 Hispanic EPESE* (n = 2,886)†

Male (n = 1,226)
                  Proxy report             NDI                      Ratio
                  Rate    95% CI*          Rate    95% CI           Ratio   95% CI
Age (years)
  65–69           32.0    25.6, 39.5       29.1    23.2, 36.2       1.10    0.81, 1.49
  70–74           54.2    44.7, 65.2       48.1    39.3, 58.3       1.13    0.86, 1.47
  75–79           69.5    55.0, 86.6       64.3    50.5, 80.7       1.08    0.79, 1.49
  ≥80            139.7    118.1, 164.1    134.2    113.3, 158.0     1.04    0.83, 1.31
Nativity
  US born         60.2    52.8, 68.5       57.2    50.0, 65.2       1.05    0.88, 1.27
  Foreign born    62.0    53.5, 71.4       55.0    47.3, 63.7       1.13    0.92, 1.38
Total             61.0    55.4, 67.1       56.2    50.9, 62.0       1.09    0.95, 1.24

Female (n = 1,660)
                  Proxy report             NDI                      Ratio
                  Rate    95% CI           Rate    95% CI           Ratio   95% CI
Age (years)
  65–69           25.0    20.2, 30.5       21.9    17.6, 27.0       1.14    0.85, 1.52
  70–74           32.9    26.4, 40.5       28.5    22.5, 35.5       1.16    0.86, 1.57
  75–79           45.7    36.1, 57.0       34.5    26.4, 44.3       1.32    0.95, 1.87
  ≥80             94.8    80.9, 110.5      64.3    53.3, 76.9       1.47    1.17, 1.88
Nativity
  US born         41.2    36.1, 46.8       34.0    29.5, 39.1       1.21    1.00, 1.46
  Foreign born    44.8    38.7, 51.6       32.6    27.5, 38.3       1.38    1.11, 1.71
Total             42.8    38.8, 47.0       33.4    30.0, 37.1       1.28    1.11, 1.48

* NDI, National Death Index; EPESE, Established Populations for Epidemiologic Studies of the Elderly; CI, confidence interval.
† Deaths were identified through December 31, 2000.
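The overall ratios in table 3 can be approximated from the total rates in that table and the sex-specific death counts in table 2. The sketch below uses the common log-scale Wald interval for a rate ratio, which treats the two sources as independent; the article does not state which interval formula was used, so this is an assumption, and the function name is illustrative.

```python
import math


def rate_ratio_ci(rate_a, deaths_a, rate_b, deaths_b, z=1.96):
    """Rate ratio a/b with a log-scale Wald interval based on the death counts."""
    ratio = rate_a / rate_b
    se_log = math.sqrt(1 / deaths_a + 1 / deaths_b)
    return ratio, ratio * math.exp(-z * se_log), ratio * math.exp(z * se_log)


# Total rates per 1,000 person-years (table 3) and deaths by sex (table 2):
# proxy report first, NDI second.
men = rate_ratio_ci(61.0, 427, 56.2, 405)
women = rate_ratio_ci(42.8, 427, 33.4, 348)

for label, (ratio, lower, upper) in (("men", men), ("women", women)):
    print(f"{label}: {ratio:.2f} ({lower:.2f}, {upper:.2f})")
```

With these inputs the sketch returns 1.09 (0.95, 1.24) for men and 1.28 (1.11, 1.48) for women, which agrees with the Total rows of table 3 to two decimal places.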





TABLE 4. Observed and adjusted mortality rates per 1,000 person-years for older Mexican Americans and non-Hispanic Whites, 1986–1997 NHIS* (n = 66,667)†

Men
                          Mexican American‡ (n = 491)               Non-Hispanic White (n = 26,583)
                          Observed                     Adjusted
                          Rate    95% CI*              rate          Rate    95% CI
Age (years)
  65–69                   31.5    22.7, 42.6           34.7          34.2    32.7, 35.6
  70–74                   39.4    25.2, 58.6           44.5          53.3    51.2, 55.4
  75–79                   53.8    36.2, 76.9           58.1          77.9    74.7, 81.3
  ≥80                     97.5    69.6, 132.9         101.4         132.9    128.0, 138.0
Age standardized data§    51.0    42.1, 59.9           55.1          67.4    66.0, 68.8
Nativity
  US born                 42.7    30.5, 58.1           44.8          59.1    57.4, 60.8
  Foreign born            48.6    39.4, 59.4           54.9          61.5    59.1, 63.2

Women
                          Mexican American‡ (n = 623)               Non-Hispanic White (n = 38,700)
                          Observed                     Adjusted
                          Rate    95% CI               rate          Rate    95% CI
Age (years)
  65–69                   21.8    14.9, 30.8           24.9          20.1    19.1, 21.1
  70–74                   28.5    18.9, 41.3           33.1          29.2    27.9, 30.5
  75–79                   41.6    26.9, 61.4           54.9          44.7    42.8, 46.6
  ≥80                     67.9    48.1, 93.1           99.8          88.3    85.6, 91.0
Age standardized data§    36.6    30.0, 43.1           47.7          40.8    40.0, 41.6
Nativity
  US born                 34.2    24.5, 46.4           44.3          41.0    39.8, 42.2
  Foreign born            34.1    27.1, 42.3           47.1          40.2    39.2, 41.4

nally observed mortality rates were lower for Mexican Americans (first column) than for non-Hispanic Whites (fourth column) for both men and women (except for women aged 65–69 years). This pattern extended to age-standardized mortality rates and rates by nativity, suggesting the existence of a Hispanic paradox. However, when Hispanic EPESE mortality ratios (of proxy report to the NDI) were applied as adjustment factors to NHIS Mexican-American mortality rates (third column), adjusted mortality rates for Mexican-American women exceeded rates observed for non-Hispanic White women for all age and nativity strata. Adjustments also affected death rates for Mexican-American men, particularly in the foreign-born category. The original rate ratio of the Mexican-American to the non-Hispanic White age-standardized mortality rate was 0.77 (95 percent confidence interval: 0.65, 0.92) for men and 0.92 (95 percent confidence interval: 0.77, 1.09) for women, but the adjusted rate ratios were 0.84 and 1.18, respectively. Thus, the adjusted age-standardized Mexican-American mortality rate for men remained lower compared with the age-standardized non-Hispanic White rate; however, for Mexican-American women, the adjusted rate was higher than the non-Hispanic White rate by 18 percent. DISCUSSION

Findings from this investigation indicate that deaths of older Mexican Americans are underascertained when the Hispanic EPESE cohort is linked to the NDI. If these findings generalize to other data sets, Hispanic mortality rates based on cohort studies linked to the NDI are understated.

Death underascertainment by the NDI was most apparent for Mexican-American women aged 75 years or older and women born outside of the United States. Once this pattern was accounted for, the age-adjusted mortality rate was substantially higher for older Mexican-American women compared with older non-Hispanic White women. Thus, the mortality disadvantage experienced by Hispanic women is undetected in studies linked to the NDI. This pattern of gender difference in mortality rates adjusted for underascertainment reproduces earlier reports that the Hispanic advantage, if it exists, is primarily true for men (33). Much of the discordance between proxy-reported and NDI-identified deaths likely stems from problems with matching of Social Security numbers. Although researchers evaluating the NDI report that death ascertainment is good when identifiers other than Social Security number (34–36) are used, the most important determinant of an NDI match is the Social Security number (37, 38). Indeed, the sensitivity of NDI death identification has exceeded 95 percent when Social Security numbers were used (22). Given the importance of the Social Security number for NDI death ascertainment, it is not surprising that misclassification was most concentrated among the oldest old and foreign-born Mexican-American women. It is likely that some of these women born before 1919 were never employed in the formal US economy or for a time had undocumented immigrant status and thus never acquired a valid Social Security number. For example, our tabulations of the California mortality master file for the years 1993–1998 confirmed that the Social Security number field was blank for 7.3 percent of Mexican-American women and 5.6 percent of Mexican


* NHIS, National Health Interview Survey; CI, confidence interval. † Deaths of subjects from the 1986–1991 NHIS were searched through December 31, 1997; survival was censored at 7.5 years of follow-up. ‡ Adjustments were made by multiplying ratios from table 2 by unadjusted Mexican-American mortality rates. § Age-adjusted mortality rates were calculated using the direct method by applying the entire 1986–1997 NHIS sample as the standard population.
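The two steps named in these table footnotes, multiplying NDI-based rates by an underascertainment ratio and then standardizing by the direct method, can be sketched in a few lines. This is only an illustration and not the authors' SAS or SUDAAN code: the age bands, rates, correction ratios, and standard-population counts below are hypothetical placeholders, the function names are introduced here, and applying the ratios within age bands is itself an assumption about how the table 2 ratios were used.

```python
def directly_standardized_rate(stratum_rates, standard_counts):
    """Direct method: weight each age-specific rate by the standard
    population's share of that age band, then sum the weighted rates."""
    total = sum(standard_counts.values())
    return sum(
        stratum_rates[age] * standard_counts[age] / total
        for age in stratum_rates
    )


def adjust_for_underascertainment(stratum_rates, correction_ratios):
    """Multiply each NDI-based rate by the ratio of proxy-reported to
    NDI-identified mortality for the same stratum."""
    return {
        age: rate * correction_ratios[age]
        for age, rate in stratum_rates.items()
    }


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
ndi_rates = {"65-74": 0.030, "75+": 0.080}      # deaths per person-year
correction_ratios = {"65-74": 1.10, "75+": 1.40}  # proxy / NDI rate ratios
standard_counts = {"65-74": 6000, "75+": 4000}   # standard population

unadjusted = directly_standardized_rate(ndi_rates, standard_counts)
adjusted = directly_standardized_rate(
    adjust_for_underascertainment(ndi_rates, correction_ratios),
    standard_counts,
)
print(f"unadjusted: {unadjusted:.4f}, adjusted: {adjusted:.4f}")
```

Because the same standard population is applied before and after adjustment, any change in the standardized rate comes only from the correction ratios, not from a shift in age structure.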


Am J Epidemiol 2004;159:707–715

example, researchers who are collecting survey data on Hispanic populations and anticipate linking these data to the NDI database may want to collect information about both mother’s and father’s surnames and to submit duplicate records with each parental surname as “last name.” All possible middle initials derived from parental surnames and from middle names may be tested. Records may be submitted without Social Security numbers to prevent false nonmatches from misreported Social Security numbers. Researchers should also carefully review potential matches at a lower threshold of probability than is the case for other groups. Until such alternative practices can be developed into calibrated scoring algorithms sensitive to different use of identifiers for Hispanic populations, NDI-based comparisons of mortality rates for Hispanics and other subpopulations should be made with caution. The current study is limited because death certificates were unavailable to permit a more nuanced evaluation of the NDI matching method for Hispanics. Such evaluation could identify matching elements that may need to be weighted differently for Hispanics to optimize the matching algorithm in this ethnic group. In addition, differences in data collection procedures between the Hispanic EPESE and the NHIS did not allow a direct assessment of the NHIS–NDI matching process. Application of bias estimates from the Hispanic EPESE to the NHIS–NDI study is reported only to illustrate the magnitude of underascertainment effects on the Hispanic mortality paradox. Finally, mortality estimates for non-Hispanic Whites were not adjusted for underascertainment because a second source of vital status information was not available for the NHIS–NDI cohort. In studies that report sensitivity of the NDI, estimates range from 92 percent to 98 percent for Whites (37, 41, 42). 
Considering that Hispanics are included in these estimates, the sensitivity of the NDI for non-Hispanic Whites is likely closer to the upper bound. Assuming that the NDI has a sensitivity of 95 percent for non-Hispanic White men and women, the mortality rate ratio adjusted for underascertainment would likely reduce to 0.80 for men and 1.12 for women. The strength and importance of this study is that it is the first known to evaluate NDI matching in a large sample of Hispanics with a second independent source of vital status information. The current study documents a pattern of NDI death underascertainment that has implications for comparing death rates across racial and ethnic groups. An important implication is that small classification errors or underperformance of the matching algorithm for subgroups can have an important effect on calculated ethnic differentials, especially for the older Hispanic population. The social and economic characteristics of this population likely contribute to data quality problems that lead to overstatement of the Hispanic mortality advantage in cohort studies linked to the NDI. Results from this study and others underscore the importance of interpreting mortality rates for ethnic minority groups with caution. Considering that death rates by race/ethnicity are used to set and evaluate “Healthy People 2010” goals (43), improving the quality of vital registration and data reporting systems for ethnic minority populations needs to become a greater public health priority.
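The 95 percent sensitivity scenario for non-Hispanic Whites can be checked with a short calculation. The sketch below assumes a simple model in which the NDI captures a fixed fraction of deaths in the reference group, so that the true reference rate is the observed rate divided by that fraction and the rate ratio shrinks by the same factor. The function name is introduced here for illustration, and the model is one reading that is consistent with the reported figures, not a documented description of the authors' procedure. The starting ratios (0.84 for men, 1.18 for women) are the underascertainment-adjusted values reported in the abstract.

```python
def correct_for_reference_sensitivity(rate_ratio, reference_sensitivity):
    """Rescale a rate ratio when the reference group's deaths are also
    underascertained.

    If only a fraction s of reference-group deaths is found, the true
    reference rate is observed / s, so the ratio is multiplied by s.
    """
    if not 0 < reference_sensitivity <= 1:
        raise ValueError("sensitivity must be in (0, 1]")
    return rate_ratio * reference_sensitivity


# Mexican American vs. non-Hispanic White rate ratios after adjusting
# only the Mexican-American rates (values from the abstract).
adjusted_ratios = {"men": 0.84, "women": 1.18}

corrected = {
    sex: round(correct_for_reference_sensitivity(rr, 0.95), 2)
    for sex, rr in adjusted_ratios.items()
}
print(corrected)
```

Under this model the ratios come out at 0.80 for men and 1.12 for women, matching the values given in the text. Substituting 0.92 or 0.98 for the sensitivity shows how far the ratios move across the range of published NDI sensitivity estimates for Whites.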


American men born before 1919 compared with just 0.4 percent of non-Hispanic White men and women in the same age group. Another possibility is that older Mexican Americans may use relatives’ or friends’ Social Security numbers or that women may use their husband’s. Thus, reliance on the Social Security number for NDI death matching may be problematic for older Mexican Americans, especially women. The findings from this study are concordant with these expectations. Other major identifiers, such as date of birth, sex, or race, did not yield many NDI matches for Hispanic EPESE subjects. Only 119 Hispanic EPESE subjects were identified as dead by the NDI when the Social Security number did not match, even though all other identifiers were supplied to the NDI. That is, the Hispanic EPESE had fewer NDI matches with probability scores meeting cutoffs for classification categories that do not require a Social Security number. This problem could result from misrecording of data by survey interviewers or misrecording of information on death certificates, such as name misspelling or use of a name variant. For example, our tabulation of 1993–1998 California State vital statistics data found that persons identified as Mexican American were more than twice as likely as non-Hispanic Whites (34.0 percent vs. 14.0 percent) to have a blank middle name field in the data and were more than seven times as likely (5.9 percent vs. 0.8 percent) to report mother’s maiden name as their own surname. This heightened potential for misrecording of items both on death certificates and in survey responses underscores the challenges of matching Hispanics to the NDI. 
The relatively high proportion of underascertained deaths of foreign-born persons is consistent with the “salmon bias” hypothesis, which predicts a higher rate of underascertainment for the foreign born because it is likely that emigration from the United States would be more common for the foreign born than for those born in the United States (10). That is, some of the foreign born may maintain active social ties in the home community that they left when they came to the United States (39). They may choose to return home for social support when they experience illness (40). However, only 21 of the 99 foreign-born Hispanic EPESE decedents (21.2 percent) not matched by the NDI were reported by proxy informants to have died in Mexico. Thus, “salmon bias” accounts for only a small proportion of deaths not ascertained by the NDI. Use of automated matching algorithms is a practical necessity for studies that seek to link large population-based cohorts to the NDI. That these algorithms can sometimes fail is well understood. NDI documentation emphasizes that the classification and scoring apparatus provided to users of this service are intended as tools rather than definitive determinants of vital status. It is the responsibility of the end user to use the results of the algorithm, along with any additional information available, to determine whether a record is matched (19). The experience of linking the Hispanic EPESE to the NDI suggests that standard algorithms calibrated for the general US population may not work as well for this population subgroup. Alternative practices for submitting survey records for an NDI search may improve the match rate for Hispanics. For

ACKNOWLEDGMENTS

Support for this study was provided by grants from the National Institute on Aging (F31 AG021872-01 and R01 AG10939) and the National Cancer Institute (1P50 CA105631-01). The authors thank Drs. Christine Cox and Bryan Sayer for providing useful comments on the manuscript.

REFERENCES



1. Hummer RA, Rogers RG, Amir SH, et al. Adult mortality differentials among Hispanic subgroups and non-Hispanic Whites. Soc Sci Q 2000;81:459–76.
2. Liao Y, Cooper RS, Cao G, et al. Mortality patterns among adult Hispanics: findings from the NHIS, 1986 to 1990. Am J Public Health 1998;88:227–32.
3. Markides KS, Coreil J. The health of Hispanics in the southwestern United States: an epidemiologic paradox. Public Health Rep 1986;101:253–65.
4. Sorlie PD, Backlund E, Johnson NJ, et al. Mortality by Hispanic status in the United States. JAMA 1993;270:2464–8.
5. Burchfiel CM, Hamman RF, Marshall JA, et al. Cardiovascular risk factors and impaired glucose tolerance: the San Luis Valley Diabetes Study. Am J Epidemiol 1990;131:57–70.
6. Flegal KM, Ezzati TM, Harris MI, et al. Prevalence of diabetes in Mexican Americans, Cubans, and Puerto Ricans from the Hispanic Health and Nutrition Examination Survey, 1982–1984. Diabetes Care 1991;14:628–38.
7. Haffner SM, Hazuda HP, Mitchell BD, et al. Increased incidence of type II diabetes mellitus in Mexican Americans. Diabetes Care 1991;14:102–8.
8. Hamman RF, Marshall JA, Baxter J, et al. Methods and prevalence of non-insulin-dependent diabetes mellitus in a biethnic Colorado population: The San Luis Valley Diabetes Study. Am J Epidemiol 1989;129:295–311.
9. Mitchell BD, Stern MP, Haffner SM, et al. Risk factors for cardiovascular mortality in Mexican Americans and non-Hispanic whites: San Antonio Heart Study. Am J Epidemiol 1990;131:423–33.
10. Abraido-Lanza AF, Dohrenwend BP, Ng-Mak DS, et al. The Latino mortality paradox: a test of the “salmon bias” and healthy migrant hypotheses. Am J Public Health 1999;89:1543–8.
11. Franzini L, Ribble JC, Keddie AM. Understanding the Hispanic paradox. Ethn Dis 2001;11:496–518.
12. Pablos-Mendez A. Mortality among Hispanics. (Letter). JAMA 1994;271:1237.
13. Palloni A, Arias E. A re-examination of the Hispanic mortality paradox. Madison, WI: Center for Demography and Ecology, University of Wisconsin-Madison, 2003. (Working paper no. 2003-01).
14. Shai D, Rosenwaike I. Mortality among Hispanics in metropolitan Chicago: an examination based on vital statistics data. J Chronic Dis 1987;40:445–51.
15. Rosenberg HM, Maurer JD, Sorlie PD, et al. Quality of death rates by race and Hispanic origin: a summary of current research, 1999. Vital Health Stat 2 1999 Sep:1–13.
16. Palloni A, Morenoff JD. Interpreting the paradoxical in the Hispanic paradox: demographic and epidemiologic approaches. Ann N Y Acad Sci 2001;954:140–74.
17. Swallen K, Guend A. Data quality and adjusted Hispanic mortality in the United States, 1989–1991. Ethn Dis 2003;13:126–33.
18. Kestenbaum B. A description of the extreme aged population based on improved Medicare enrollment data. Demography 1992;29:565–80.
19. National Death Index user’s manual. Hyattsville, MD: National Center for Health Statistics, 2000.
20. Rogot E, Sorlie P, Johnson NJ. Probabilistic methods in matching census samples to the National Death Index. J Chronic Dis 1986;39:719–34.
21. Horm J. Assignment of probabilistic scores to National Death Index record matches. In: Supplement to the National Death Index user’s manual. Hyattsville, MD: National Center for Health Statistics, 1999:A1–A13.
22. Cowper DC, Kubal JD, Maynard C, et al. A primer and comparative review of major US mortality databases. Ann Epidemiol 2002;12:462–8.
23. Ruiz-Pérez R, López-Cózar ED, Jiménez-Contreras E. Spanish personal name variations in national and international biomedical databases: implications for information retrieval and bibliometric studies. J Med Library Assoc 2002;90:411–30.
24. Herrera CR, Stern MP, Goff D, et al. Mortality among Hispanics. (Letter). JAMA 1994;271:1237.
25. Markides KS, Stroup-Benham CA, Goodwin JS, et al. The effect of medical conditions on the functional limitations of Mexican-American elderly. Ann Epidemiol 1996;6:386–91.
26. National Health Interview Survey multiple cause of death public use data files, 1986–1997. (Documentation). Hyattsville, MD: National Center for Health Statistics, 2000.
27. Blakely T, Robson B, Atkinson J, et al. Unlocking the numerator-denominator bias. I: adjustments ratios by ethnicity for 1991–94 mortality data. The New Zealand Census-Mortality Study. N Z Med J 2002;115:39–43.
28. Blakely T, Kiro C, Woodward A. Unlocking the numerator-denominator bias. II: adjustments to mortality rates by ethnicity and deprivation during 1991–94. The New Zealand Census-Mortality Study. N Z Med J 2002;115:43–8.
29. US Department of Health and Human Services, Indian Health Service. Final report: methodology for adjusting IHS mortality data for inconsistent classification of race-ethnicity of American Indians and Alaska Natives between state death certificate and IHS patient registration records. Rockville, MD: US DHHS, 1996.
30. Greenland S, Rothman KJ. Introduction to categorical statistics. In: Rothman KJ, Greenland S, eds. Modern epidemiology. 2nd ed. Philadelphia, PA: Lippincott-Raven, 1998:231–52.
31. SAS Institute, Inc. SAS procedures guide, version 8.1. Cary, NC: SAS Institute, Inc, 2000.
32. Shah BV, Barnwell BG, Bieler GS. SUDAAN user’s manual, release 7.5.6. Research Triangle Park, NC: Research Triangle Institute, 2000.
33. Markides KS, Rudkin L, Angel RJ, et al. Health status of Hispanic elderly. In: Martin LG, Soldo BJ, eds. Racial and ethnic differences in the health of older Americans. Washington, DC: National Academy Press, 1997:285–300.
34. Boyle CA, Decouflé P. National sources of vital status information: extent of coverage and possible selectivity in reporting. Am J Epidemiol 1990;131:160–8.
35. Fisher SG, Weber L, Goldberg J, et al. Mortality ascertainment in the veteran population: alternatives to the National Death Index. Am J Epidemiol 1995;141:242–50.
36. Stampfer MJ, Willett WC, Speizer FE, et al. Test of the National Death Index. Am J Epidemiol 1984;119:837–9.
37. Calle EE, Terrell DD. Utility of the National Death Index for ascertainment of mortality among Cancer Prevention Study II participants. Am J Epidemiol 1993;137:235–41.
38. Williams BC, Demitrack LB, Fries BE. The accuracy of the National Death Index when personal identifiers other than Social Security number are used. Am J Public Health 1992;82:1145–7.
39. Massey DS, Espinosa KE. What’s driving Mexico–U.S. migration? A theoretical, empirical, and policy analysis. Am J Sociol 1997;102:939–99.
40. Soldo B, Wong R, Palloni A. Migrant health selection: evidence from Mexico and the US. Presented at the Population Association of America Conference, Atlanta, Georgia, May 2002.
41. Curb JD, Ford CE, Pressel S, et al. Ascertainment of vital status through the National Death Index and the Social Security Administration. Am J Epidemiol 1985;121:754–66.
42. Acquavella JF, Donaleski D, Hanis NM. An analysis of mortality follow-up through the National Death Index for a cohort of refinery and petrochemical workers. Am J Ind Med 1986;9:181–7.
43. US Department of Health and Human Services. Healthy people 2010: understanding and improving health. 2nd ed. Washington, DC: US Government Printing Office, 2000:7–10.

