International Journal of Epidemiology © International Epidemiological Association 1997

Vol. 26, No. 3 Printed in Great Britain

Retrospective Assessment of Occupational Exposure to Chemicals in Community-Based Studies: Validity and Repeatability of Industrial Hygiene Panel Ratings

GEZA BENKE, MALCOLM SIM, ANDREW FORBES AND MICHAEL SALZBERG

Benke G (Department of Epidemiology and Preventive Medicine, Monash University, Melbourne, Australia), Sim M, Forbes A and Salzberg M. Retrospective assessment of occupational exposure to chemicals in community-based studies: Validity and repeatability of industrial hygiene panel ratings. International Journal of Epidemiology 1997; 26: 635–642.

Background. Occupational hygiene panels are increasingly being used to rate retrospective occupational exposures to chemicals in community-based studies. This study aimed to assess the validity, reliability and feasibility of using such an expert panel in a brain tumour case-control study.

Methods. A panel of five experts was recruited to rate exposure to 21 chemicals for 298 job descriptions to investigate the level of agreement. Validity was assessed by comparing the ratings of the experts for 49 of the jobs with objective quantitative exposure data which existed for these jobs. Repeatability was assessed by comparing the results for 50 resubmissions.

Results. Specificity was high for reporting that exposure occurred (all above 90%), but sensitivity was variable, with values between 48% and 79%. Weaker validity was found for rating exposure level and exposure frequency. The raters showed the greatest inter-rater agreement for exposure to three of the 21 chemicals considered (κ = 0.64 for cutting fluids, κ = 0.57 for welding fumes and κ = 0.42 for lubricating oils). Intra-rater reliability, based on the 50 resubmitted jobs, was fair to good (κ = 0.46 to 0.73).

Conclusions. The potential effect of exposure misclassification from using expert panels was quantified and found to be a significant source of bias. The optimum situation occurred when three of the five raters concurred, in which case an odds ratio of 2.2 was observed for a true odds ratio of 4.0. Future studies which plan to use expert panels should screen the experts for their suitability by validating their performance against jobs with known exposure data.

Keywords: epidemiology, exposure assessment, reliability, validation

Reliable and valid measurement of exposure to hazards is a key factor in the design of occupational epidemiological studies. Methods for the assessment of retrospective exposure to chemicals include personal interviews with subjects or next of kin, self-administered questionnaires, reference to personnel records and chemical measurements on the subjects or the environment.1 Rarely are chemical measurements available, and more often combinations of the subjective methods (i.e. questionnaires and interviews) are employed. However, exposure misclassification and recall bias are common problems with these methods.

A method increasingly used for exposure characterization is expert panels of industrial hygienists,2–6 who use their expert training, experience, supplementary questionnaires and local knowledge to assess historical exposure to chemicals. This method minimizes non-differential misclassification and might give more reliable results in exploratory studies of exposed and non-exposed, compared to other methods, e.g. job exposure matrices (JEM).6 It is particularly suited to community-based case-control studies and industry-specific studies where past air monitoring data are not available.

Validation of exposure assessment by experts can only be carried out by comparing the qualitative exposure estimates of the experts with recently collected or historical quantitative exposure measurements. A validation study by Kromhout et al.7 investigated agreement between qualitative exposure estimates rated by workers, supervisors and hygienists and current quantitative exposure measurements. This study found the industrial hygienists were more successful estimators of exposure than plant supervisors and workers, but validity was low for all groups. Hawkins and Evans,8 in a study of subjective estimation of toluene exposures by 24 industrial hygienists, found that 70% of estimates were within a factor of two and 95% were within a factor of four of measured toluene time weighted average (TWA) exposures. Teschke et al.9 studied chlorphenate exposure in sawmill workers and found that senior workers and industrial hygienists estimated exposures similarly, but a large amount of unexplained variance still existed.

In several recent studies,10–12 JEM have been validated against expert occupational hygiene assessments as the reference method or gold standard. Caution, however, was advised,10 since it was recognized that the quality of the questionnaire and the skill of the experts could not be measured in absolute terms in the absence of any objective reference measure of exposure. One of these studies, involving exposure to organic solvents and evaluation of a JEM, assumed that the expert assessments of exposure were 100% correct.12 However, the authors questioned the use of experts as a gold standard in the evaluation of JEM.

Clearly, it is important to quantitatively assess the validity of an expert panel in assessing an exposure in a study, since an estimate can then be made of the likely degree of misclassification of the exposure assessment, the degree of bias of the risk estimates and the implications for sample size calculations. The purpose of this study was to assess the feasibility, resource requirements, validity and repeatability of using an expert panel to assess retrospective exposures to workplace chemicals for a community-based study, using job histories collected by questionnaires.
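Validity criteria such as the "within a factor of two" benchmark used by Hawkins and Evans8 can be computed directly from paired subjective and measured values. A minimal illustrative sketch (the function name and all numbers are invented for illustration, not taken from any of the cited studies):

```python
def fraction_within_factor(estimates, measured, k):
    """Fraction of subjective exposure estimates within a factor k of measurement."""
    hits = sum(1 for e, m in zip(estimates, measured) if m / k <= e <= m * k)
    return hits / len(estimates)

# Invented subjective TWA estimates (ppm) against invented measured values (ppm)
est = [55, 120, 20, 380, 95]
meas = [100, 100, 100, 100, 100]
print(fraction_within_factor(est, meas, 2))  # factor-of-two criterion
print(fraction_within_factor(est, meas, 4))  # factor-of-four criterion
```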

MATERIALS AND METHODS

Study Design
This study was part of a community-based case-control study of glioma,13 which involved the collection of job histories to assess occupational chemical exposures implicated as causes of brain tumours in previous epidemiological and experimental studies.14–16 As an adjunct to the main study, a team of three experienced industrial hygienists and two occupational physicians was assembled to form the expert panel, to assess the validity, repeatability and feasibility of using such a panel in the main study. All five experts were experienced in the area of chemical exposure assessment, with an average of 20 years' experience in occupational health in the state of Victoria. Over a period of 6 months the panellists were required to assess four batches of job descriptions, totalling 298 jobs. For each job, the following information was given: employer, industry, occupation, main tasks, equipment used and start/end dates.

The 298 jobs consisted of:
(i) 199 jobs randomly selected from questionnaires administered as part of the glioma case-control study;
(ii) 49 ‘dummy’ jobs compiled by an industrial hygienist (not one of the panel members). These dummy jobs were selected from about 350 occupational hygiene reports and surveys undertaken at a large variety of Victorian workplaces over a 12-year period (1978–1989). Each dummy job was converted into the same format as the randomly selected jobs from the glioma case-control database, using fictitious start and end dates, and was therefore indistinguishable from the randomly selected jobs. These jobs were selected from workplaces where baseline quantitative personal air sampling data and a complete description of the workplace were available;
(iii) 50 jobs resubmitted from the first two batches, which were included in the last two batches. Since assessment of the job descriptions was undertaken in four batches over a 6-month period, there was at least a 4-month interval between the first assessment and the second. Thirty of the resubmissions were randomly selected from the dummy jobs in (ii) above and 20 were randomly selected from the real jobs in (i) above.

The panellists were ‘blind’ to the status of both the dummy and resubmission jobs and there was no communication between panellists during the study. Each panellist was required to indicate if there had been exposure to one or more of 21 listed chemicals for each job (see Table 1 for the complete list). If a panellist decided that there had been exposure, he or she was asked to nominate the level of exposure and the frequency of exposure. The trichotomy for exposure level was ‘low’, defined as below the threshold limit value (TLV) but above background, ‘medium’, defined as around the TLV, and ‘high’, defined as significantly above the TLV. Frequency of exposure was trichotomized to low (<5% of time at work exposed), medium (5–30%) and high (>30%).17 Finally, panellists were required to record the time taken to rate the jobs in each of the four batches.

Statistical Analysis
Statistical analysis was performed to determine the extent of agreement of exposure assessment between raters, the test/retest reliability of exposure assessment, and the validity of the assessments via comparison with a ‘gold standard’. Four sets of analyses were performed, as follows:
1. Measurement of agreement between the exposure classification across the 199 jobs for each of the 10 pairs of panellists was performed using the standard kappa


TABLE 1 Pairwise agreement statistics between raters assessing 199 jobs for exposures to 21 chemicals

Exposure                      % prevalence^a  (Range)       Pairwise agreement (%)   κ^b      (Range)
Other organic solvents             29.0       (8.0,54.3)           71.0              0.31     (0.14,0.54)
Lubricating oils and greases       17.5       (8.0,33.2)           83.1              0.42     (0.27,0.62)
Soldering fumes                     9.0       (2.5,15.6)           90.1              0.38     (0.17,0.56)
Welding fumes                       8.3       (4.0,13.6)           93.4              0.57     (0.42,0.74)
Cutting fluids                      8.1       (5.5,13.1)           94.5              0.64     (0.44,0.81)
PAHs^c                              7.4       (0.5,17.1)           89.4              0.22     (0.05,0.38)
Lead                                6.9       (0.5,15.1)           90.1              0.23     (0.06,0.36)
Toluene                             6.2       (1.5,17.1)           90.2              0.19     (0.08,0.56)
Benzene                             4.7       (0.0,13.1)           93.0              0.19     (0.0,0.49)
Chromates                           3.9       (0.5,8.5)            94.0              0.12     (–0.01,0.27)
Formaldehyde                        3.3       (1.0,7.0)            94.7              0.16     (–0.03,0.32)
Organochlorine pesticides           3.2       (1.5,6.0)            95.9              0.34     (0.18,0.50)
Arsenic                             1.5       (0.0,4.5)            97.1              0.02     (–0.02,0.13)
Mercury                             1.3       (0.0,3.0)            97.6              0.03     (–0.01,0.39)
Ethylene oxide                      0.9       (0.0,1.5)            98.5              0.13     (–0.02,0.80)
N-nitroso compounds                 0.9       (0.0,2.0)            98.4              0.05     (–0.01,0.56)
Jet-fuel                            0.7       (0.5,1.5)            99.0              0.30     (–0.01,1.0)
Phenol                              0.4       (0.0,1.0)            99.2             –0.003    (–0.01,0.0)
Vinyl chloride                      0.4       (0.0,1.0)            99.2             –0.003    (–0.01,0.0)
Acrylonitrile                       0.2       (0.0,0.5)            99.6             –0.001    (–0.01,0.0)
TDI^d                               0.2       (0.0,0.5)            99.6             –0.001    (–0.01,0.0)

^a % prevalence is the mean prevalence across the five raters per chemical exposure.
^b Summary kappa statistic (see text).
^c Polycyclic aromatic hydrocarbons.
^d Toluene di-isocyanate.

statistic (κ) of Fleiss18 for each exposure separately. An overall summary measure of pairwise agreement between the raters for each exposure was calculated using a generalization of the standard kappa statistic,19 which can be regarded as a weighted average of the 10 pairwise kappa statistics. Usually, values greater than 0.75 represent excellent agreement beyond chance, values between 0.40 and 0.75 represent fair to good agreement beyond chance and values below 0.40 represent poor agreement.18 A value of zero or less indicates that the agreement is no better than expected by chance alone.
2. Measurement of intra-rater test/retest reliability. This was performed as in (1) above using the 50 resubmissions for identification of exposure, but not for exposure level or frequency. The kappa statistic was employed by calculating the agreement between the first rating of a job and the second rating of the same job by the same panellist.
3. Measurement of validity, i.e. agreement between panellist ratings and real data for the dummy jobs, used as the ‘gold standard’. This was undertaken for presence of chemical exposure, level of exposure and frequency of exposure. The prevalence of exposure, sensitivity and specificity for each rater were calculated.20 This was repeated using all possible combinations of panellists.
4. A sensitivity analysis of the potential effect of panel misclassification on the odds ratio, given the calculated sensitivity and specificity for rater validity for individual raters and combinations of raters. Since the true prevalence of exposure was unknown, prevalences of 1% and 5% were used in this analysis.1,21
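The agreement and validity computations in analyses (1) and (3) reduce to simple counting over binary exposed/not-exposed calls. A minimal sketch (the function names and toy ratings below are illustrative, not the study's data):

```python
def pairwise_kappa(r1, r2):
    """Cohen's kappa for two raters' binary exposed (1) / not exposed (0) calls."""
    n = len(r1)
    p_obs = sum(a == b for a, b in zip(r1, r2)) / n   # observed agreement
    q1, q2 = sum(r1) / n, sum(r2) / n                 # each rater's reported prevalence
    p_exp = q1 * q2 + (1 - q1) * (1 - q2)             # agreement expected by chance
    return (p_obs - p_exp) / (1 - p_exp)

def sens_spec(rated, truth):
    """Sensitivity and specificity of binary calls against a gold standard."""
    tp = sum(1 for r, t in zip(rated, truth) if r and t)
    fn = sum(1 for r, t in zip(rated, truth) if not r and t)
    tn = sum(1 for r, t in zip(rated, truth) if not r and not t)
    fp = sum(1 for r, t in zip(rated, truth) if r and not t)
    return tp / (tp + fn), tn / (tn + fp)

def panel_call(ratings_by_rater, k):
    """Combined panel call: exposed where at least k raters say exposed."""
    return [int(sum(job) >= k) for job in zip(*ratings_by_rater)]

# Toy example: two raters over six job-chemical pairs
print(round(pairwise_kappa([1, 1, 0, 0, 0, 0], [1, 0, 0, 0, 0, 0]), 3))  # → 0.571

# Toy example: a three-rater panel scored against known exposures
truth = [1, 1, 0, 0, 0]
raters = [[1, 0, 0, 0, 0],
          [1, 1, 0, 1, 0],
          [0, 1, 0, 0, 0]]
print(sens_spec(panel_call(raters, 2), truth))  # majority rule → (1.0, 1.0)
```

The same `sens_spec` applied to `panel_call` outputs for different k is what underlies the rater-combination comparisons reported later (Table 6).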

RESULTS

Reliability
Table 1 demonstrates the inter-rater reliability for the presence of exposure for the 199 jobs randomly selected from the glioma case-control study. It shows the mean and range of kappas for agreement across the 10 pairwise combinations of raters for the 21 different chemical exposures. Where prevalence was low, kappas


TABLE 2 Intra-rater reliability of exposure identification by raters for all 21 chemicals (listed in Table 1) for 50 resubmission jobs

Rater            Prevalence^a    κ^b     95% CI^c
1 (Physician)       2.7%        0.46    (0.31,0.61)
2 (Physician)       7.8%        0.64    (0.53,0.75)
3 (Hygienist)       3.4%        0.60    (0.48,0.72)
4 (Hygienist)       5.9%        0.73    (0.65,0.81)
5 (Hygienist)       6.7%        0.54    (0.42,0.66)

^a Prevalence: total exposures identified across all chemicals for the 50 resubmission jobs by the particular rater.
^b Kappa statistic.
^c Confidence interval.

TABLE 3 Validity of exposure identification by raters for all 21 chemicals (listed in Table 1) for the 49 dummy jobs

Rater            Prevalence^a    Sensitivity    Specificity
1 (Physician)        4.2%           48.1%          97.9%
2 (Physician)        9.3%           69.2%          93.9%
3 (Hygienist)        7.6%           57.7%          94.9%
4 (Hygienist)       13.0%           78.9%          90.9%
5 (Hygienist)        9.5%           65.4%          93.3%

^a Prevalence: total exposures identified across all chemicals for the 49 dummy jobs by the particular rater.

tended to be low. In addition, kappas tended to be lower where there was a large difference in reported prevalence of exposure between two raters. Table 2 demonstrates the results for intra-rater reliability and identifies the occupational physicians and occupational hygienists.

Validity
Table 3 shows the results of the measurements of validity, i.e. agreement between panellist ratings and environmental data for the 49 dummy jobs. The prevalence, sensitivity and specificity for the 49 dummies are tabulated for each rater. The specificities were generally high and tended to be higher with lower prevalence. There was considerable variation in sensitivity, which tended to be higher with increasing prevalence.

Tables 4 and 5 demonstrate the exposure misclassification matrices for the rating of exposure level and frequency, respectively. These tables have been constructed from the 49 dummy jobs for the five raters. The results have been combined for all the raters since the numbers constituting many of the cells in the 3 × 3 tables were

TABLE 4 Exposure misclassification matrix for level ratings by the five raters for the 49 dummy jobs

                                      True exposure levels
Rater exposure levels        No exposure      Low level       Medium and high level
                             (n = 4885)       (n = 160)       (n = 100)
% no exposure                94.1             37.5            34.0
  (range)                    (90.7,97.7)      (21.9,46.9)     (20.0,60.0)
% low level                   4.4             22.5            25.0
  (range)                    (1.7,7.6)        (12.5,34.4)     (15.0,35.0)
% medium and high level       1.5             40.0            41.0
  (range)                    (0.6,3.0)        (28.1,56.2)     (25.0,50.0)
TOTAL                        100%             100%            100%

TABLE 5 Exposure misclassification matrix for frequency ratings by the five raters for the 49 dummy jobs

                                      True frequency levels
Rater frequency levels         No exposure     Low and medium       High frequency
                               (n = 4885)      frequency (n = 90)   (n = 170)
% no exposure                  94.1            43.3                 32.4
  (range)                      (90.7,97.7)     (16.7,61.1)          (23.5,47.0)
% low and medium frequency      5.5            47.8                 50.0
  (range)                      (1.7,9.3)       (22.2,83.3)          (26.5,76.5)
% high frequency                0.4             8.9                 17.6
  (range)                      (0.0,1.0)       (0.0,22.2)           (0.0,29.4)
TOTAL                          100%            100%                 100%

small for individual raters. The results are expressed as percentages of the total answers for the dummy results. The total numbers (n) in these two tables sum to 5145 each, i.e. 21 exposures by the 49 dummy jobs for five raters. As numbers were small for some categories, the four possible categories were condensed to three categories for each of exposure level and exposure frequency. For the exposure level the condensed categories were ‘no exposure’, ‘low’ and ‘medium and high’. For the exposure frequency the condensed categories were ‘no exposure’, ‘low and medium’ and ‘high’. The tables show that the raters tended to underestimate both level and frequency of exposure.

Table 6 demonstrates combinations of exposure identification for the five raters to assess overall validity of the panel, rather than individual raters. These results were based on the 49 dummy jobs. Once again the specificities were high, as shown for individual raters in Table 3, with the optimal positive predictive value achieved with at least four raters agreeing on exposure.

TABLE 6 Validity of panel using different combinations of raters assessing job exposure for the 49 dummy jobs

No. of raters^a   Sensitivity   Specificity   PPV^b    NPV^c
All 5                28.8%         99.3%      68.2%    96.3%
≥4                   42.3%         98.6%      73.3%    97.0%
≥3                   67.3%         98.5%      70.0%    98.3%
≥2                   82.7%         96.7%      57.3%    99.1%

^a Number of raters correctly assessing an exposure.
^b Positive predictive value.
^c Negative predictive value.

Table 7 demonstrates the potential effect on the odds ratio of misclassification of exposure for individual raters and combinations of panellists. The true prevalence of exposure in the cases and controls is unknown, but based on the ratings of the experts for the 199 jobs from the main study, it was considered to lie between 1% and 5%. The calculations in this table have been based on an assumed prevalence of exposure in the cases of 1% and 5%. True odds ratios (ORT) of 2, 3 and 4 were arbitrarily chosen, with the resultant observed odds ratios (ORO) calculated using a previously published formula,1 based on the sensitivity and specificity results in Tables 3 and 6. The results show the usual attenuation of the odds ratio estimates towards the null with the application of the formula. Additionally, the results demonstrate that the minimum exposure misclassification occurs where three raters from a panel of five correctly identify the presence of an exposure.

The prevalences of exposure for the 199 jobs from the case-control database were found to range from 1.8% for rater 1 to 10.0% for rater 4. The mean time per rater for the rating of all 300 jobs (i.e. case-control jobs plus dummies and resubmissions) was found to be 7 h 35 min, with a mean of 1 min 31 s per job. Raters varied from 4 h 16 min to 11 h 45 min to rate the 300 jobs. These data are presented to indicate the level of resources required for incorporating a panel assessment in an epidemiological study.

TABLE 7 Effects on odds ratio of rating misclassification for different combinations of raters for the 49 dummy jobs

Rater           Sensitivity   Specificity   Prevalence of        ORO^b        ORO          ORO
                                            exposure in cases    (ORT^a = 2)  (ORT = 3)    (ORT = 4)
1                  48.1          97.9          0.01              1.10         1.14         1.16
                                               0.05              1.39         1.60         1.72
4                  78.9          90.9          0.01              1.04         1.06         1.06
                                               0.05              1.19         1.27         1.32
All 5 correct      28.8          99.3          0.01              1.17         1.24         1.28
                                               0.05              1.55         1.89         2.12
≥4 correct         42.3          98.6          0.01              1.13         1.18         1.21
                                               0.05              1.47         1.73         1.90
≥3 correct         67.3          98.5          0.01              1.19         1.26         1.31
                                               0.05              1.59         1.96         2.22
≥2 correct         82.7          96.7          0.01              1.11         1.16         1.18
                                               0.05              1.43         1.66         1.80

^a True odds ratio.
^b Observed odds ratio.

DISCUSSION
In this study, the validity, inter-rater and intra-rater reliability of industrial hygiene experts in assessing retrospective occupational exposure, and the likely degree of exposure misclassification, have been quantitatively evaluated. The optimum situation in terms of minimizing bias of the odds ratio towards the null through exposure misclassification occurred when at least three members of the expert panel of five raters agreed on the presence of exposure. Even under these optimum conditions, significant attenuation of the odds ratio was observed, e.g. for a prevalence of exposure of 5%, a true odds ratio of 4 would only result in an observed odds ratio of 2.22. The best an individual rater could achieve


under these conditions was an observed odds ratio of 1.72. This therefore suggests that a panel of experts will perform better than a single rater, by reducing the impact of exposure misclassification on estimates of effect. These results have implications for sample size calculations in the design of future community-based studies which may consider expert panels for retrospective occupational exposure assessment.

Our results indicate that there can be significant differences in the reported presence, level and frequency of exposure by the experts, which varied for different chemicals. Lower reported prevalence of exposure, coupled with a large difference in the reported prevalence between the two raters being compared, tended to result in lower kappas. As has been previously reported,22 although kappa provides an overall measure of agreement dependent on the prevalence, it does not take into account the impact any bias may have between the two raters. The variation in the mean kappas for the various chemical exposures was most pronounced for the exposures with low prevalences. The three highest kappas (i.e. cutting fluids κ = 0.64, welding fumes κ = 0.57, lubricating oils and greases κ = 0.42) were all found for exposures with relatively high prevalences of 8.1%, 8.3% and 17.5% respectively. These results suggest that the use of experts for studies with a low prevalence of exposure may not be a satisfactory method of retrospective assessment of occupational exposure. Panels may be best suited to community-based studies of relatively common exposures, where the expected prevalence is greater than 5%.

The intra-rater reliability results demonstrate fair to good overall reliability. The kappa results in Table 2 indicated fair to good agreement between the two ratings for rater 4, while for rater 1, with the lowest kappa result, reliability was found to be fair.
The differences in the size of the confidence intervals between these two raters were a reflection of the finding that rater 4 reported exposures more commonly than rater 1. It was found that all but one rater identified more exposures in their first assessment of the job descriptions than in the reassessments, which may indicate a training effect. These results give confidence in the use of experts, since good test-retest reliability is an essential component of methods used for exposure assessment.

The validity of the assessments, when measured against a ‘gold standard’, did not result in high sensitivity where a rater reported a low prevalence of exposure. However, given that the overall prevalence of exposure to the 21 chemicals was low for the dummy jobs in Table 3, there appeared to be a crude correlation of increasing sensitivity with increasing prevalence. Conversely, rater specificity decreased as reported prevalence increased. The results for the quantitative evaluation of the exposure level and frequency were of limited value due to low numbers and, in the evaluation of exposure level, the performance of the experts was found to be very inconsistent.

Assessment of the misclassification of level of exposure in Table 4 suggested a tendency of the raters to underestimate the level of exposure, especially in the medium and high level category. In the low level category there appeared to be no consistent pattern of under- or overestimation. This disagrees with the findings of other similar studies,23 where the experts all tended to consistently overestimate the exposure levels. Table 5 demonstrates that the raters were consistent in underestimating the frequency of exposure for both the low and medium, and high frequency categories. This strong underestimation for the high frequency jobs was observed across all raters. Since, in general, workplace exposures to chemicals tend to decrease over time, the observed underestimations may be explained by the raters drawing on more recent knowledge of workplace exposures in their level and frequency evaluations of these retrospective jobs. It is also possible that the ratings of level and frequency were linked during the evaluation process. This was supported by the observation that, for the nine dummy jobs where both level and frequency were high, only five of the possible 45 evaluations were identified as such by the five raters. However, due to the relatively small numbers involved in this analysis, caution is advised; a more definitive analysis of these aspects of exposure would require a considerably larger number of dummy jobs.

Unlike previous validation studies,7–9 where the experts were industry-specific and required to assess current exposures, the experts in this study had to draw on their knowledge across a very broad range of industries, in some of which they had no practical experience.
The exposures were also retrospective and some of the jobs no longer existed. Variability in the dummy jobs due to inter- and intra-job exposure variability would also have complicated the evaluations by the experts.7 The validity results from the industry-specific studies may therefore be numerically greater than the results reported in this study.

Since the experts assessed the 300 jobs only in isolation over the 6 months of the study, no insight could be gained into the group dynamics that occur when experts confer to reach consensus. Whether better validity and repeatability could be achieved by panellists discussing results prior to evaluation could not be answered in this study. The processes involved in expert panel evaluations have been examined and previously reported, and clear


guidelines recommended on the decision-making process by the expert panel prior to job evaluations.5 The main advantages of individual evaluations compared with group evaluations are that each expert is required to provide a full evaluation for each job and that one person cannot dominate the process by influencing the others on the panel.

This study required exposures to 21 chemicals to be evaluated for the 49 dummy job descriptions (with, at most, only two exposures per job). As a result, only small numbers of dummy jobs were included for each specific chemical exposure. This limited the analysis, since it was not possible to assess with acceptable statistical precision the performance of the raters for specific chemicals, e.g. benzene, toluene etc. A more intensive analysis, involving a greater number of dummy jobs, may have revealed whether the experts' speciality areas dominated their resultant validity. It is recommended that future studies consider comparisons between the validity of individual raters and the group validity for specific exposures of interest.

The reported assessment times by the panellists for the 300 jobs were considered acceptable from a cost-effectiveness viewpoint. The rating times per job varied from just under 1 min for the fastest rater to about 2 min 20 s for the slowest rater. Given that the raters had 21 exposures to consider for each job, reducing the number of chemicals should increase rating speed and hence improve cost-effectiveness. This information should help to estimate the costs and resources required for incorporating this method of exposure assessment into study designs. Other methods that may reduce the cost of experts have been described elsewhere.24

Clearly, the reported prevalence of exposure by the raters has a highly significant effect upon the validity of expert panels in retrospectively assessing occupational exposure to chemicals in community-based studies. It is suggested that future study designs should consider, where feasible, screening potential experts with a batch of dummy jobs, to assess the experts' determinations of the prevalence of the exposures of interest. The suitability of the experts can then be assessed before they are engaged on an industrial hygiene panel.
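The attenuation quantified in Table 7 follows the standard non-differential misclassification calculation (e.g. Armstrong et al.1): apply a rater's sensitivity and specificity to the true exposure prevalences in cases and controls, then recompute the odds ratio from the misclassified prevalences. A sketch of that calculation, which reproduces the 1% prevalence entries for rater 1 in Table 7 (the function name is illustrative):

```python
def observed_or(true_or, prev_cases, sens, spec):
    """Observed odds ratio after non-differential exposure misclassification.

    prev_cases is the true exposure prevalence among cases; the control
    prevalence is derived from the true odds ratio.
    """
    odds_cases = prev_cases / (1 - prev_cases)
    odds_ctrls = odds_cases / true_or
    prev_ctrls = odds_ctrls / (1 + odds_ctrls)
    # Apply sensitivity/specificity to obtain the misclassified prevalences
    obs_cases = sens * prev_cases + (1 - spec) * (1 - prev_cases)
    obs_ctrls = sens * prev_ctrls + (1 - spec) * (1 - prev_ctrls)
    return (obs_cases / (1 - obs_cases)) / (obs_ctrls / (1 - obs_ctrls))

# Rater 1: sensitivity 48.1%, specificity 97.9%; 1% exposure prevalence in cases
print(round(observed_or(2.0, 0.01, 0.481, 0.979), 2))  # → 1.10
print(round(observed_or(4.0, 0.01, 0.481, 0.979), 2))  # → 1.16
```

With perfect classification (sens = spec = 1) the function returns the true odds ratio unchanged; at these low prevalences even a small loss of specificity drags the observed odds ratio sharply towards the null.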

ACKNOWLEDGEMENTS
This research was financially supported by a grant from the Victorian Health Promotion Foundation. The authors gratefully acknowledge the helpful suggestions and comments of Dr Michael Abramson of the Department of Epidemiology and Preventive Medicine, the data analysis support of Mr Geoff Aldred, and the data collection and provision by Ms Judy Snaddon.


REFERENCES
1 Armstrong B K, White E, Saracci R. Principles of Exposure Measurement in Epidemiology. Monographs in Epidemiology and Biostatistics Volume 21. New York: Oxford University Press, 1994, p. 25.
2 Siemiatycki J, Day N E, Fabry J, Cooper J A. Discovering carcinogens in the occupational environment: a novel epidemiologic approach. J Natl Cancer Inst 1981; 66: 217–25.
3 Gérin M, Siemiatycki J, Kemper H, Bégin D. Obtaining occupational exposure histories in epidemiologic case-control studies. J Occup Med 1985; 27: 420–26.
4 Goldberg M, Hémon D. Occupational epidemiology and assessment of exposure. Int J Epidemiol 1993; 22 (Suppl. 2): S5–9.
5 Clavel J, Glass D C, Cordier S, Hémon D. Standardisation in the retrospective evaluation by experts of occupational exposure to organic solvents in a population-based case-control study. Int J Epidemiol 1993; 22 (Suppl. 2): S121–26.
6 Miligi L, Masala G. Methods of exposure assessment for community-based studies: Aspects inherent to the validation of questionnaires. Appl Occup Environ Hyg 1991; 6: 502–07.
7 Kromhout H, Oostendorp Y, Heederik D, Boleij J S M. Agreement between qualitative exposure estimates and quantitative exposure measurements. Am J Ind Med 1987; 12: 551–62.
8 Hawkins N C, Evans J S. Subjective estimation of toluene exposure: A calibration study of industrial hygienists. Appl Ind Hyg 1989; 4: 61–68.
9 Teschke K, Hertzman C, Dimich-Ward H, Ostry A, Blair J, Hershler R. A comparison of exposure estimates by worker raters and industrial hygienists. Scand J Work Environ Health 1989; 15: 424–29.
10 Stücker I, Bouyer J, Mandereau L, Hémon D. Retrospective evaluation of the exposure to polycyclic aromatic hydrocarbons: Comparative assessments with a job exposure matrix and by experts in industrial hygiene. Int J Epidemiol 1993; 22 (Suppl. 2): S106–12.
11 Luce D, Gérin M, Berrino F, Pisani P, Leclerc A. Sources of discrepancies between a job exposure matrix and a case by case expert assessment for occupational exposure to formaldehyde and wood-dust. Int J Epidemiol 1993; 22 (Suppl. 2): S113–20.
12 Stengel B, Pisani P, Limasset J C, Bouyer J, Berrino F, Hémon D. Retrospective evaluation of occupational exposure to organic solvents: questionnaire and job exposure matrix. Int J Epidemiol 1993; 22 (Suppl. 2): S72–82.
13 Giles G, McNeil J, Donnan G et al. Dietary factors and the risk of glioma in adults: Results of a case-control study in Melbourne, Australia. Int J Cancer 1994; 59: 357–62.
14 Selikoff I J, Hammond E C (eds). Brain tumors in the chemical industry. Annals of the New York Academy of Sciences. New York: New York Academy of Science, 1982, pp. 43–53.
15 Moss A R. Occupational exposure and brain tumors. J Toxicol Environ Health 1985; 16: 703–11.
16 Thomas T L, Waxweiler R J. Brain tumors and occupational risk factors. Scand J Work Environ Health 1986; 12: 1–15.
17 Goldberg M S, Siemiatycki J, Gérin M. Inter-rater agreement in assessing occupational exposure in a case-control study. Br J Ind Med 1986; 43: 667–76.
18 Fleiss J L. Statistical Methods for Rates and Proportions. New York: John Wiley, 1981, pp. 212–36.
19 Dunn G. Design and Analysis of Reliability Studies. London: Edward Arnold, 1989, pp. 152–55.
20 Sackett D L, Haynes R B, Tugwell P. Clinical Epidemiology: A Basic Science for Clinical Medicine. 1st edn. Boston/Toronto: Little, Brown and Company, 1986, pp. 70–79.
21 Kelsey J L, Thompson W D, Evans A S. Methods in Observational Epidemiology. Monographs in Epidemiology and Biostatistics Volume 10. New York/Oxford: Oxford University Press, 1986, pp. 293–97.
22 Brennan P, Silman A. Statistical methods for assessing observer variability in clinical measures. Br Med J 1992; 304: 1491–94.
23 Boleij J, Buringh E, Heederik D, Kromhout H. Occupational Hygiene of Chemical and Biological Agents. Amsterdam: Elsevier, 1995, p. 24.
24 Stewart W, Stewart P. Occupational case-control studies: 1. Collecting information on work histories and work-related exposures. Am J Ind Med 1994; 26: 297–312.

(Revised version received November 1996)