Comparing the Discriminative Validity of Two ... - Value in Health

Blackwell Science, Ltd, Oxford, UK. Value in Health (VHE), ISSN 1098-3015. © 2005 ISPOR. March/April 2005; 8(2): 168–174. Original Article.

Volume 8 • Number 2 • 2005 VALUE IN HEALTH

Comparing the Discriminative Validity of Two Generic and One Disease-Specific Health-Related Quality of Life Measures in a Sample of Patients with Dry Eye

Krithika Rajagopalan, PhD,1 Linda Abetz, MA,2 Polyxane Mertzanis, MPH,2 Derek Espindle, MA,2 Carolyn Begley, OD, MS,3 Robin Chalmers, OD,4 Barbara Caffery, OD, MS,5 Christopher Snyder, OD, MS,6 J. Daniel Nelson, MD,7 Trefford Simpson, PhD,8 Timothy Edrington, OD, MS9

1AstraZeneca Pharmaceuticals, Wilmington, DE (formerly K. Venkataraman of Alcon Laboratories, TX, USA); 2Mapi Values, Boston, MA, USA; 3Indiana University, Bloomington, IN, USA; 4Clinical Trial Consultant, Atlanta, GA, USA; 5Private Practice, Toronto, Ontario; 6University of Alabama, Birmingham, AL, USA; 7University of Minnesota School of Medicine, Minneapolis, MN, USA; 8University of Waterloo, Waterloo, Ontario; 9Southern California College of Optometry, Fullerton, CA, USA

ABSTRACT

Objective: The purpose of this study was to compare the discriminative properties of two generic health-related quality of life (QoL) instruments (SF-36 and EQ-5D) and a newly developed disease-specific patient-reported outcomes instrument (Impact of Dry Eye on Everyday Life (IDEEL)) to distinguish between different levels of dry eye severity. Methods: Assessment of 210 people: 130 with non-Sjögren’s Keratoconjunctivitis Sicca (non-SS KCS), 32 with Sjögren’s Syndrome (SS) and 48 controls; comparison of SF-36, EQ-5D, and IDEEL age-adjusted data by dry eye severity levels. Severity was assessed based on diagnosis (non-SS KCS, SS, control), patient-report (none, very mild, mild, moderate, severe, extremely severe) and clinician-report (none, mild, moderate, severe). Results: Discriminative validity results were consistent for all instruments. Significant differences between severity levels were found with most SF-36 scales (P < 0.05), all EQ-5D scales (P < 0.05), and all IDEEL scales (P < 0.0001), except for Treatment Satisfaction. IDEEL scales consistently outperformed the generic QoL measures regardless of the severity criterion used. Most SF-36 scales outperformed the EQ-5D QoL scale, but the EQ-5D visual analog scale outperformed the SF-36 scales, except for General Health Perceptions. Conclusions: The disease-specific IDEEL scales are better able to discriminate between severity levels than the majority of the generic QoL scales. Preliminary evidence demonstrates that the IDEEL will be sensitive to QoL changes over time, although further testing in controlled longitudinal studies is needed.

Keywords: dry eye, Impact of Dry Eye on Everyday Life (IDEEL), Keratoconjunctivitis Sicca, quality of life, SF-36, Sjögren’s Syndrome.

Introduction

Dry eye is a common condition caused by diminished production or increased evaporation of tears (non-Sjögren’s Keratoconjunctivitis Sicca (KCS)) or by a systemic immunologic disorder (Sjögren’s Syndrome (SS)) that causes insufficient moisture production in the salivary and tear-producing glands, among others [1–5]. Prevalence estimates range from 11% to 17% in the general population, with rates of up to 29% in clinical optometry practices [1,6–14]. Overall, individuals with non-SS KCS experience less severe symptoms than those with SS. Symptoms for both groups include eye irritation or discomfort and can result in a need to “rest” or treat the eyes. Environmental factors such as smoke-filled, heated or air-conditioned rooms can also trigger dry eye symptoms, thereby resulting in avoidance of these types of environments [15,16]. The majority of patients with dry eye have a mild form of the condition: 65% to 89% have mild dry eye, 12% to 33% have moderate dry eye and up to 2% of the population have severe dry eye [13]. Some reports suggest that prevalence of SS in the United States varies from 0.5% to 5.0%, depending on the criteria utilized to confirm the diagnosis [17].

Address correspondence to: Krithika Rajagopalan, AstraZeneca Pharmaceuticals, Brandywine 3B-703 A, 1800 Concord Pike, Wilmington, DE 76134-2099, USA. E-mail: [email protected] © ISPOR


Dry eye appears to affect women and the elderly disproportionately. More than 90% of those affected by the condition are women, predominantly between the ages of 40 and 65 years [18]. Additionally, approximately 10% to 15% of individuals over 65 years in the United States are known to be affected with dry eye [1].

Treatment of dry eye has generally included the use of artificial tears and ointments, punctal occlusion through a temporary or permanent plugging of the tear drainage openings, and behavioral modifications. Some common treatment methods include anti-inflammatory drugs [19] and corticosteroids [20]. Some of these patients follow a balanced diet and exercise to overcome pain and fatigue [21]. In moderate to severe dry eye, the treatments are reported to be of limited value to patients; patients often indicate they are constantly aware of discomfort in their eyes and are not able to enjoy a full health-related quality of life (QoL). Furthermore, these patients repeatedly visit doctors and specialists without much success and seek complementary and alternative medication therapy [22–25]. For such patients, dry eye is likely to have a significant impact on their QoL.

While generic QoL measures have been used to assess the impact of dry eye, these measures lack discriminative power when distinguishing between different levels of dry eye severity. Thus, the purpose of this study was to compare the ability of two generic QoL instruments and a newly developed disease-specific patient-reported outcomes instrument to distinguish between different levels of dry eye severity.

Methods

Study Population

This study involved six sites: four in the United States (Alabama, California, Indiana, and Minnesota) and two in Canada (Toronto and Waterloo). To participate in the study, subjects had to be at least 18 years old, to have had an eye exam in the past 18 months, and to have a confirmed diagnosis of either non-SS KCS or SS (except for the controls). Clinicians at two study sites recruited SS subjects using the ICD-9-CM or San Diego criteria, which include a positive gland biopsy, while clinicians at four other study sites recruited non-SS KCS subjects using ICD-9-CM codes. Potential study subjects were recruited from the clinicians’ records and were screened to ensure the presence of dry eye symptoms in the previous 4 weeks. Patients were excluded from the study if they had a punctal occlusion within the past 60 days or if they had experienced a systemic medication regimen change within the last 30 days.

Control subjects were recruited from lists of patients who did not have diagnostic codes for dry eye. For inclusion in the study, these subjects had to respond negatively to the question, “Do you think you have dry eye?” and were required to have dry eye symptoms less often than “sometimes,” and to report that they “never” or “rarely” use artificial tears. Finally, at least two-thirds of the controls had to be over age 35.

Subjects also had to be literate in English, willing and able to complete a series of questionnaires twice over a 2-week period and willing to undergo clinical testing for dry eye as part of the study. Patients were excluded from the study if they wore contact lenses or previously had refractive surgery. Subjects signed consent forms before study participation and were compensated for their time.

Measures Used

Generic measures. Medical Outcomes Study Short Form 36 Health Survey (SF-36) [26]. The SF-36 was included as a generic measure of health status. By means of 36 items, it assesses Physical Functioning, Role Limitations due to Physical Problems (Role-Physical), Bodily Pain, General Health Perceptions, Vitality, Social Functioning, Role Limitations due to Emotional Problems (Role-Emotional), and Mental Health. Summary scores for overall physical and mental health were calculated: physical component summary (PCS) and mental component summary (MCS), respectively. A 4-week recall period was used for the majority of these scales, except Physical Functioning and General Health Perceptions, which reflected subjects’ assessments at the time the questionnaires were completed. The SF-36 was scored according to the developer’s instructions [26]. Higher scores indicate better health status; the individual scale scores range from 0 to 100, and the component scores are scored such that in the general population the mean score is 50, with an SD of 10.

EuroQoL (EQ-5D) [27]. The EQ-5D is an indirect utility questionnaire. It consists of five attributes, with three levels per attribute. The dimensions of the EQ-5D include: Mobility, Self-Care, Usual Activities, Pain/Discomfort, and Anxiety/Depression. In addition to these five dimensions, a visual analog scale (VAS) for Overall QoL is included. A total score for the five domains is calculated; in addition, the VAS is scored as a separate measure.
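The norm-based scaling described for the SF-36 component summaries (general-population mean of 50, SD of 10) can be illustrated with a minimal sketch. This shows only the final standardization step; the developer's scoring algorithm [26] involves further steps that are not reproduced here, and the reference mean and SD below are hypothetical placeholders rather than published norms.

```python
def norm_based_score(raw, pop_mean, pop_sd):
    """Rescale a raw score so the reference population has mean 50 and SD 10."""
    z = (raw - pop_mean) / pop_sd  # standardize against the reference population
    return 50.0 + 10.0 * z


# Hypothetical reference values, for illustration only (not published norms).
HYPOTHETICAL_POP_MEAN = 0.0
HYPOTHETICAL_POP_SD = 1.0

# A subject exactly at the population mean, and one a full SD below it.
at_norm = norm_based_score(0.0, HYPOTHETICAL_POP_MEAN, HYPOTHETICAL_POP_SD)
one_sd_below = norm_based_score(-1.0, HYPOTHETICAL_POP_MEAN, HYPOTHETICAL_POP_SD)
```

Under this scaling, a component score of 40 sits one population SD below the general-population mean, which makes differences between severity groups directly interpretable in SD units.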


Dry eye specific measures. The Dry Eye Questionnaire (DEQ). The DEQ is a self-administered habitual symptom questionnaire. The Dry Eye Questionnaire, 2001 revision (DEQ 2001) includes categorical scales to measure the prevalence, frequency, diurnal intensity, and intrusiveness of common ocular surface symptoms. These scales assess environmental factors that can produce symptoms of ocular irritation, computer use, history of contact lens wear, use of systemic and ocular medications, artificial tear usage, self-assessment of dry eye and whether subjects have been previously diagnosed as having dry eye. This questionnaire can be completed in 15 minutes. For this article, the DEQ was used as an assessment of patient-reported severity.

The Impact of Dry Eye on Everyday Life (IDEEL). The IDEEL consists of three modules with a total of six scales: QoL (Activity Limitations, Emotional Well-Being, Work Impact), Treatment Satisfaction (Treatment Satisfaction and Treatment-Bother), and Symptom-Bother. The IDEEL is a self-administered instrument and can be completed in approximately 30 minutes. The IDEEL instrument fielded in this study consisted of 112 items, which were reduced to 57 items through systematic item reduction during psychometric validation. All the IDEEL scales demonstrated good internal consistency reliability with Cronbach’s alpha scores from 0.70 to 0.97. All scales, except for Work Impact, demonstrated good test-retest reliability (i.e., intraclass correlation coefficients of at least 0.70). Scale scores used in this article are based on the final shorter version. Higher scores indicate better QoL, worse symptoms and better treatment satisfaction; scores range from 0 to 100.

Clinician assessments. Clinicians were asked to provide the diagnosis for each subject (non-SS KCS, SS, control) and to rate the severity of the subject’s condition before and after clinical tests. The following clinical tests were performed during the first study visit: Snellen Visual Acuity, Schirmer 1 Tear Test, Fluorescein Tear Break Up Time, Corneal Fluorescein Staining, Lissamine Green Conjunctival Staining, and Biomicroscopy. For this study, the clinicians’ recruitment diagnoses and post-test assessments of severity were used as the severity ratings.
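The internal consistency statistic cited for the IDEEL scales, Cronbach's alpha, can be computed from item-level responses with the standard formula alpha = k/(k - 1) × (1 - Σ item variances / variance of total score). The sketch below is illustrative only: the item responses are hypothetical, since the article does not report item-level IDEEL data, and the function name is introduced here.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    items: list of k lists, one per item, each holding one response
    per respondent (same respondent order in every list).
    """
    k = len(items)
    # Total scale score for each respondent (sum across items).
    totals = [sum(responses) for responses in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical responses: 3 items answered by 5 respondents.
hypothetical_items = [
    [3, 4, 2, 5, 4],
    [3, 5, 2, 4, 4],
    [2, 4, 1, 5, 3],
]
alpha = cronbach_alpha(hypothetical_items)

# Sanity check: perfectly redundant items give alpha = 1.
alpha_identical = cronbach_alpha([[1, 2, 3, 4]] * 3)
```

A value of at least 0.70, the lower bound reported for the IDEEL scales, is the conventional threshold for acceptable internal consistency in group-level comparisons.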

Analysis

The following analyses were conducted using Statistical Analysis Software version 8.2 (SAS Institute Inc., Cary, NC, USA, 2003) and Multi-trait Analysis Program-Revised software version 1.0 (Health Assessment Laboratory, Boston, MA, USA, 1997). For all tests, a significance level of 0.05 was used unless otherwise indicated.

Discriminative validity for each measure was assessed by examining differences in scale scores for groups of subjects with different levels of dry eye severity and for each severity assessment method. Severity was defined in three ways during the study: patient diagnosis at study recruitment (non-SS KCS, SS, control), clinician report of severity (none, mild, moderate, severe) and patient report of severity (none, very mild, mild, moderate, severe, and extremely severe).

Although scores on the measures were not normally distributed, recent empiric testing showed that nonparametric testing methods produced results similar to standard parametric tests when analyzing QoL data, and thus nonparametric methods are not more appropriate than standard parametric tests for analyzing QoL data [28]. Therefore, analyses were performed using parametric analysis of covariance (ANCOVA) to adjust for significant age differences among the groups. Because cell sizes were unbalanced, the ANCOVAs were implemented using general linear models (GLM). The magnitude of the resulting F-statistics was used to evaluate the relative validity of each scale for discriminating among the groups [29–31].
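The age-adjusted comparison described above can be sketched as a nested-model F test: fit a reduced linear model (intercept plus age) and a full model (intercept, age, and group indicators), then compare their residual sums of squares. The following is a minimal pure-Python sketch under stated assumptions, not the authors' SAS GLM code; the function names and the toy data are hypothetical, and it covers only the main-effect F test for group with a single covariate.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def residual_ss(design, y):
    """Residual sum of squares from a least-squares fit of y on design."""
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in design]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def ancova_f(scores, ages, groups):
    """F statistic for the group effect after adjusting for age."""
    levels = sorted(set(groups))
    reduced = [[1.0, a] for a in ages]
    # Full model: dummy-code every group except the first (reference) level.
    full = [[1.0, a] + [1.0 if g == lv else 0.0 for lv in levels[1:]]
            for a, g in zip(ages, groups)]
    rss_reduced = residual_ss(reduced, scores)
    rss_full = residual_ss(full, scores)
    df_num = len(levels) - 1
    df_den = len(scores) - len(full[0])
    return ((rss_reduced - rss_full) / df_num) / (rss_full / df_den)


# Hypothetical toy data: 3 subjects per diagnosis group.
ages = [30, 40, 50, 35, 45, 55, 40, 50, 60]
groups = ["control"] * 3 + ["non-SS KCS"] * 3 + ["SS"] * 3
scale_strong = [90, 88, 87, 70, 69, 66, 50, 49, 47]  # clear group separation
scale_weak = [80, 70, 75, 78, 72, 69, 74, 77, 70]    # little group separation

f_strong = ancova_f(scale_strong, ages, groups)
f_weak = ancova_f(scale_weak, ages, groups)
```

Because both scales are tested against the same grouping variable with the same degrees of freedom, the larger F identifies the scale that separates the groups more sharply, which is the sense in which F magnitudes are compared across scales in this study.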

Results

Study Population

The subject demographic information is presented in Table 1. The population included 210 adult subjects: 130 with non-SS KCS, 32 with SS, and 48 controls. Chi-square tests indicated that there were no statistically significant sex or education differences among the three groups; however, a one-way analysis of variance (ANOVA) found statistically significant age differences (F = 26.32, P < 0.0001). Follow-up pairwise t-tests showed that the group of controls was significantly younger (39 years) than both the non-SS KCS (55 years) and SS groups (58 years), although the latter two groups did not differ by age. A Chi-square test also indicated statistically significant race/ethnicity differences between the groups (Caucasian vs. Non-Caucasian, χ² = 8.6130, P = 0.0135): the SS group consisted of fewer non-Caucasian subjects than expected, whereas the control group included more non-Caucasians than expected.
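As a worked check, the two group-comparison statistics reported above can be recomputed from the counts and summary statistics in Table 1. The sketch below is illustrative and uses hypothetical function names; the ANOVA is rebuilt from rounded means and SDs, so it can only approximate the published F value, and the closed-form P value used here holds only for 2 degrees of freedom.

```python
import math


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def anova_f_from_summary(ns, means, sds):
    """One-way ANOVA F statistic rebuilt from group sizes, means, and SDs."""
    n_total = sum(ns)
    grand_mean = sum(n * m for n, m in zip(ns, means)) / n_total
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    ss_within = sum((n - 1) * sd ** 2 for n, sd in zip(ns, sds))
    df_between = len(ns) - 1
    df_within = n_total - len(ns)
    return (ss_between / df_between) / (ss_within / df_within)


# Table 1 counts; columns are Control, Non-SS KCS, SS.
race_counts = [
    [34, 106, 31],  # Total Caucasian
    [14, 24, 1],    # Total Non-Caucasian
]
chi2, df = chi_square(race_counts)
# With df == 2 the chi-square survival function reduces to exp(-x / 2).
p_value = math.exp(-chi2 / 2)

# Table 1 age summaries (N, mean, SD) for Control, Non-SS KCS, SS.
age_f = anova_f_from_summary(
    ns=[48, 130, 32],
    means=[39.23, 55.18, 58.25],
    sds=[11.76, 15.26, 11.78],
)
```

The race/ethnicity statistic reproduces the reported χ² of 8.613 and P of 0.0135, and the age F comes out close to the reported 26.32, with the small gap attributable to rounding in the published means and SDs.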

Table 1 Demographic characteristics of the validation study population by recruited severity

| Characteristic | Control (N = 48) | Non-SS KCS (N = 130) | SS (N = 32) |
|---|---|---|---|
| Sex, n (%) | | | |
| Male | 13 (27) | 27 (21) | 3 (9) |
| Female | 35 (73) | 103 (79) | 29 (91) |
| Age | | | |
| Mean | 39.23 | 55.18 | 58.25 |
| SD | 11.76 | 15.26 | 11.78 |
| Range | 20.0–66.0 | 22.0–89.0 | 34.0–80.0 |
| Ethnicity, n (%) | | | |
| Total Caucasian | 34 (71) | 106 (82) | 31 (97) |
| Total Non-Caucasian | 14 (29) | 24 (18) | 1 (3) |
| African American | 6 (13) | 12 (9) | 1 (3) |
| Hispanic/Spanish American | 5 (10) | 5 (4) | 0 (0) |
| Asian/Oriental/Pacific Islander | 1 (2) | 6 (5) | 0 (0) |
| Other | 2 (4) | 1 (1) | 0 (0) |
| Highest level of education, n (%) | | | |
| High school diploma or less | 9 (19) | 23 (18) | 6 (19) |
| Some college | 13 (27) | 47 (36) | 8 (25) |
| College degree | 15 (31) | 29 (22) | 11 (34) |
| Graduate/postgraduate | 8 (17) | 31 (24) | 5 (16) |
| Other | 3 (6) | 0 (0) | 2 (6) |

Discriminative Validity

The differences in scale scores for the SF-36, EQ-5D, and IDEEL, by severity measure, are presented in Table 2 [32]. The age-adjusted mean scale scores by recruited severity for the three questionnaires are in Table 3 (scores by clinician and self-rated severity were similar to those presented).

For the SF-36, significant differences between the various severity levels were noted for all scales, with the exception of Role-Emotional by patients’ recruited diagnosis (F = 1.75, P = 0.1765) and self-rated severity (F = 2.60, P = 0.0533), Physical Functioning by clinician-rated severity (F = 2.18, P = 0.0916) and self-rated severity (F = 1.26, P = 0.2907), and Bodily Pain by clinician-rated severity (F = 2.59, P = 0.0537) (Table 2).

In observing the EQ-5D results, significant differences in scale scores at the varying severity levels were also consistently noted for the EQ-5D QoL scores (P < 0.05) and the VAS (P < 0.0001) across all severity measures (Table 2).

Significantly different scale scores were observed between different levels of severity in all the IDEEL scales (P < 0.0001) except Treatment Satisfaction, for recruited diagnosis (F = 2.26, P = 0.1087) and clinician-rated severity (F = 1.39, P = 0.2477) (Table 2).

When examining the strength of differences by ranking F-values (Table 2), the IDEEL Activity Limitations and Symptom-Bother scales consistently outperformed the generic QoL measures, regardless of the severity criterion used. The IDEEL Emotional Well-Being and Work Impact scales also consistently outperformed all SF-36 scales and the EQ-5D QoL score regardless of the criterion used, and the EQ-5D VAS when using clinician and self-rated severity levels. The IDEEL Treatment-Bother scale was more discriminative than all other generic QoL scales except the General Health Perceptions scale and the EQ-5D VAS when using recruited diagnosis as a criterion measure. Interestingly, the IDEEL Treatment Satisfaction scale was not able to discriminate between recruited and clinician-rated severity groups and was consistently outperformed by SF-36 and EQ-5D scales.
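The ranking of scales by F value described above can be illustrated with the SF-36 F values for recruited severity. The values below are read from the first data column of Table 2, under the assumption that they follow the listed order of the scale names (the Role-Emotional value of 1.75 agrees with the text). Expressing each F as a ratio to the largest F is one common convention for a relative validity coefficient and is an assumption here, not a method stated in the article.

```python
# SF-36 F values for recruited severity, as read from Table 2.
# Assumption: values follow the order in which the scale names are listed.
sf36_recruited_f = {
    "Physical functioning": 3.22,
    "Role-physical": 12.00,
    "Bodily pain": 6.64,
    "General health perceptions": 22.03,
    "Vitality": 10.37,
    "Social functioning": 13.71,
    "Role-emotional": 1.75,
    "Mental health": 4.58,
    "Physical component scale": 14.15,
    "Mental component scale": 4.13,
}


def rank_by_f(f_values):
    """Sort scales by F (largest first) and attach the ratio to the largest F."""
    best = max(f_values.values())
    ranked = sorted(f_values.items(), key=lambda item: item[1], reverse=True)
    return [(name, f, f / best) for name, f in ranked]


ranking = rank_by_f(sf36_recruited_f)
for name, f_value, ratio in ranking:
    print(f"{name:28s} F = {f_value:6.2f}  ratio = {ratio:.2f}")
```

On these values, General Health Perceptions is the most discriminating SF-36 scale for recruited severity and Role-Emotional the least, consistent with the pattern described in the text.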

Table 2 F and P values for age-adjusted ANCOVAs for each severity grouping variable (control patients included in sample)

Scale name
SF-36: Physical functioning, Role-physical, Bodily pain, General health perceptions, Vitality, Social functioning, Role-emotional, Mental health, Physical component scale, Mental component scale
EQ-5D: EQ-5D QoL score, EQ-5D VAS
IDEEL: Activity limitations, Emotional well-being, Work impact, Treatment satisfaction, Treatment-bother, Symptom-bother

Columns: Recruited severity (F value, P value); Clinician-rated severity (F value, P value); Self-rated severity (F value, P value)

3.22 12.00 6.64 22.03 10.37 13.71 1.75 4.58 14.15 4.13

0.0421