ORIGINAL ARTICLE

Validation of a Lower Radiation Computed Tomography Enterography Imaging Protocol to Detect Crohn’s Disease in the Small Bowel

Hassan Siddiki, MD,* Joel G. Fletcher, MD,* Amy K. Hara, MD,* James M. Kofler, PhD,* Cynthia H. McCollough, PhD,* Jeff L. Fidler, MD,* Luis Guimaraes, MD,* James E. Huprich, MD,* William J. Sandborn, MD,† Edward V. Loftus, Jr., MD,† Jay Mandrekar, PhD,‡ and David H. Bruining, MD†

Background: The purpose was to validate a lower radiation dose computed tomography enterography (CTE) imaging protocol to detect the presence of Crohn’s disease (CD) in the small bowel using two different reference standards and to identify a prediction model based on CTE signs for the presence of active CD.

Methods: This retrospective study included patients with known or suspected CD who underwent CTE between January and October 2006 according to a lower radiation dose protocol. Two gastrointestinal radiologists blindly and independently classified each CTE as being active or inactive. Reference standards included ileocolonoscopy ± biopsy and a comprehensive clinical reference standard (retrospectively created by a gastroenterologist, also including history, physical, follow-up course, and subsequent endoscopy, imaging, or surgery). Logistic regression was used to identify CTE findings that predicted the presence of active CD based on the combined clinical reference standard.

Results: In all, 137 patients underwent CTE and ileocolonoscopy. Using an endoscopic reference standard, the sensitivity of CTE to detect active CD for the two readers was 81% and 89%, respectively. Using the clinical reference standard, the sensitivity of CTE to detect active CD was 89% and 98%, respectively. For both readers the sensitivity of CTE increased by 8%–9% when using the comprehensive reference standard. Multivariate analysis showed that a combination of mural thickness and hyperenhancement best predicted active CD (area under the curve [AUC] = 0.92–0.93, P < 0.0001).

Conclusions: Lower radiation dose CTE exams are sensitive for the detection of active small bowel CD. The combination of mural thickness and hyperenhancement is the best radiologic predictor of active CD.

(Inflamm Bowel Dis 2011;17:778–786)

Key Words: Crohn’s disease, CT enterography, ileocolonoscopy, radiation dose

Received for publication April 20, 2010; Accepted April 22, 2010.
From the *Department of Radiology, †Division of Gastroenterology & Hepatology, ‡Division of Biostatistics, Mayo Clinic, Rochester, Minnesota.
Reprints: Joel G. Fletcher, MD, 200 First Street SW, Rochester, MN 55905 (e-mail: [email protected])
Copyright © 2010 Crohn’s & Colitis Foundation of America, Inc.
DOI 10.1002/ibd.21364
Published online 1 June 2010 in Wiley Online Library (wileyonlinelibrary.com).

Comparative accuracy studies have demonstrated that computed tomography enterography (CTE) is more sensitive than fluoroscopic small bowel follow-through to detect Crohn’s disease (CD) in the small bowel, that CTE is more specific than capsule endoscopy for this imaging indication, and that CTE is complementary to direct visual assessment of the gut lumen with ileocolonoscopy.1–15 Based on these data, the combination of CTE and ileocolonoscopy has become the first-line imaging algorithm for evaluation of patients with known or suspected inflammatory bowel disease (IBD) at many institutions.1–5 Mural hyperenhancement, mural wall thickness, mural stratification, vascular engorgement of the vasa recta, fibrofatty proliferation, increased fat density of the perienteric fat, and asymmetrical bowel wall enhancement have all been described as CT signs indicative of inflammation.6,7 Hyperenhancement and wall thickening are sensitive inflammatory markers,6 and when measured quantitatively are significantly associated with endoscopic inflammation at ileocolonoscopy and histologic inflammation at biopsy.2 However, the predictive value of combining two or more CT signs to accurately detect CD in the small bowel has not been described.

Ileocolonoscopy with or without biopsy has been used as a reference standard in published CTE studies for the analysis of its diagnostic accuracy.2,3,8–11 However, cross-sectional radiographic and endoscopic evaluations of the small bowel are complementary techniques, each with its own unique advantages and limitations.5,12 Using a reference test that may miss active CD to evaluate another diagnostic test will potentially introduce errors into performance estimates. Thus, evaluating the diagnostic performance of CTE using a comprehensive clinical reference standard may be a more appropriate study design.
There is a theoretical risk of radiation-induced cancer for young patients undergoing multiple CT exams.13–16

Inflamm Bowel Dis  Volume 17, Number 3, March 2011

Many CD patients are adolescents or in the third or fourth decade of life at the time of diagnosis and experience multiple disease flares that may require imaging. The potential lifetime cumulative radiation dose in such clinical scenarios warrants the development of imaging protocols that lower the radiation dose while maintaining a dose sufficient to yield accurate diagnostic results. The effect of lowering the radiation dose on the diagnostic accuracy of CTE has not been systematically studied. Using a combination of CTE signs, we first sought to validate the performance of a lower radiation dose CTE protocol15 using both an endoscopic and a comprehensive clinical reference standard, and then to identify predictors of active CD based on a combination of CTE signs representing mural inflammation.

MATERIALS AND METHODS

This retrospective study was approved by the Institutional Review Board of Mayo Clinic. Included were adult outpatients with known or suspected CD who underwent a clinically indicated CTE between January and October 2006, and in whom the CTE was performed according to a lower radiation dose protocol. Patients were additionally required to have undergone ileocolonoscopy ± biopsy, or surgery with resection, within 30 days of the CTE exam without an intervening change in treatment. Excluded were patients who did not give authorization for the use of medical records for research purposes and those with a known inflammatory condition other than CD (e.g., ulcerative colitis or celiac disease). A total of 142 patients met the inclusion and exclusion criteria. Five were excluded from the analysis due to failure to reach a consensus when forming the clinical reference standard (see Reference Standard Assessment below), as clinical follow-up was felt to be inadequate, leaving a total of 137 subjects who were included in the analysis.

Organ doses were calculated using measured radiation exposures and published exposure-to-dose conversion factors based on Monte Carlo simulated data.17 Effective doses were calculated from the organ doses using ICRP-60 organ weighting factors.18 The effective dose calculation considers a standard-sized adult patient scanned using a fixed mA technique, which was employed in this study and is described in Table 1. The lower dose protocol used in the patients included in our study delivered an estimated effective dose of 12 milliSievert (mSv), compared to our prior practice, which delivered an effective dose of 16 mSv (a dose reduction of 25%) or 20 mSv (a dose reduction of 40%), depending on the scanning technology and CT scanner vendor we employed. The lower radiation dose CTE exams in this study were performed using a 16-slice

Validation of CTE Imaging Protocol

TABLE 1. Comparison of the Two Protocols Used in Our Clinical Practice

CT Parameters                            Prior Protocol         Lower Dose Protocol
Scanner                                  GE Light Speed Ultra   GE Light Speed Pro
MDCT system                              8                      16
Tube voltage (kV)                        120                    120
Tube current (mA)                        240                    310
Rotation time (sec)                      0.5                    0.5
Pitch (table speed/total collimation)    0.625                  0.937
Effective tube current (mAs/pitch)       192                    165
Detector configuration                   8 × 2.5                16 × 0.625
CTDI (air) (mGy/100 mAs)                 25.2                   26.6
Average effective dose (mSv)             16                     12
CTDI-Vol (mGy)                           24                     18

The standard protocol is given in the first column and the lower dose protocol, which was used empirically in the year 2006, is given in the second column. CTDI = computed tomography dose index, MDCT = multi-detector CT system, kV = kiloVolts, mA = milliAmpere, mSv = milliSievert.
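The effective tube current and dose reduction figures quoted for the two protocols can be checked with a few lines of arithmetic. The sketch below is only an illustration using the values from Table 1 and the effective doses stated in the text; the function names are hypothetical and are not part of the study's methods.

```python
# Illustrative check of Table 1 arithmetic; helper names are hypothetical.

def effective_mas(tube_current_ma, rotation_time_s, pitch):
    """Effective tube current-time product: mA x rotation time / pitch."""
    return tube_current_ma * rotation_time_s / pitch


def percent_reduction(old, new):
    """Relative reduction from old to new, in percent."""
    return 100.0 * (old - new) / old


prior_mas = effective_mas(240, 0.5, 0.625)  # prior protocol (Table 1)
lower_mas = effective_mas(310, 0.5, 0.937)  # lower dose protocol (Table 1)

print(f"effective mAs: prior {prior_mas:.0f}, lower dose {lower_mas:.0f}")
print(f"effective dose 16 -> 12 mSv: {percent_reduction(16, 12):.0f}% lower")
print(f"effective dose 20 -> 12 mSv: {percent_reduction(20, 12):.0f}% lower")
print(f"CTDI-Vol 24 -> 18 mGy: {percent_reduction(24, 18):.0f}% lower")
```

Under these inputs the higher pitch more than offsets the higher tube current, which is why the effective mAs falls even though the mA setting rises.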

MDCT scanner (GE LightSpeed Pro; GE Healthcare, Waukesha, WI) with the radiation dose reduction achieved by using a faster scan table speed (for GE scanners, from 0.625 mm/rotation to 0.9365 mm/rotation). Because the “lower radiation dose protocol” uses fewer photons, the resulting reconstructed images have higher noise (particularly in the bony pelvis). We sought to determine whether these noisier lower radiation dose examinations contained enough qualitative information to yield the correct diagnosis.

CT enterography was performed in each patient using a 16-channel MDCT system. A neutral oral contrast agent (VoLumen, Bracco Diagnostics, Princeton, NJ) was used for bowel distension. Patients consumed three 450-mL bottles of oral contrast agent at 60, 45, and 30 minutes before the scan. Just prior to the scan the participants were asked to drink an additional 500 mL of water. Immediately prior to scanning, patients were given 1 mg of glucagon intravenously. Contrast-enhanced CT was performed using 140 mL of intravenous contrast material (Omnipaque 300; Amersham Health, Princeton, NJ). The contrast agent was injected at a rate of 4 mL/sec, with scanning initiated after a 50-second delay.19 Images were obtained with a 2.5-mm section thickness and an interval of 1.25 mm. Coronal images were generated from a second axial dataset with a 1.25-mm slice thickness and a 1-mm reconstruction interval, generating coronal images 2-mm thick with a reconstruction increment of 1 mm. A detailed comparison of the

Siddiki et al

FIGURE 1. Transverse images from two CTE datasets of the same patient at two different timepoints, at a similar anatomical level in the pelvis. (A) The first image is from a dataset acquired with the old protocol using 16 mSv. The region of interest (ROI) defined by the circle shows a mean CT number of 113.0 Hounsfield units (HU) and a noise of 9.36. (B) The second image is from a dataset acquired after dose reduction using 12 mSv. The ROI shows that while the CT number did not change significantly (112.6 HU), the noise increased to 20.26.

protocol used before and after dose reduction is provided in Table 1. The size of each patient was recorded as the lateral width (at the level of the iliac crests) measured from the CT scout image. Illustrative examples (Figs. 1, 2) compare the CT number and noise for the same patient who underwent two CTE exams, one before and one after our CT protocol was changed.

Two gastrointestinal radiologists with 8 and 10 years of experience, who were blinded to all clinical, endoscopic, imaging, and pathologic information except for the extent of terminal ileal intubation (in centimeters) at ileoscopy or the length of small bowel resected, reviewed all CTE images. The radiologists were asked to evaluate the terminal ileum in each CTE dataset. CTE findings described in earlier published studies7 were measured on a five-point continuous scale for mural hyperenhancement, mural stratification, increased attenuation in the perienteric fat, fibrofatty proliferation, asymmetrical enhancement of the mesenteric border, and the comb sign. For mural hyperenhancement, attenuation of the terminal ileum was compared to that of nearby distal ileal loops, as jejunal enhancement is known to be greater than ileal enhancement,8 with the hyperenhancement scale ranging from 1 (enhancement equal to that of adjacent distal ileal loops) to 5 (attenuation similar to renal cortex). Mural thickening was measured objectively as a continuous variable, using a line measurement tool at the thickest portion of the wall of the involved portion of the terminal ileum. Each radiologist also classified each CTE scan into one of four ordinal categories (definitely active disease, probably active disease, inactive disease, and absent disease) based on their overall subjective assessment of disease activity in the small bowel, without predetermined criteria for individual CT findings.

FIGURE 2. Transverse images from two CTE datasets of the same patient at two different timepoints, at a similar anatomical level in the upper abdomen. The first image (A) is from the dataset acquired with the old protocol. The ROI in the liver shows a mean CT number of 136.7 HU and a noise of 16.9. Another ROI in the same image, placed over the gastric contents, shows a CT number of 22.7 HU and a noise of 14.3. The second image (B) is from the dataset acquired after dose reduction using 12 mSv. The ROI in the liver shows a CT number of 158.9 HU and a noise of 23.0. Similarly, the noise in the gastric contents has also increased (CT number 12.0 HU and noise 19.2).
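To put the ROI measurements in Figures 1 and 2 side by side, the short sketch below computes the ratio of noise after versus before the protocol change. The comparison value assumes that quantum-limited image noise scales as the inverse square root of dose and that the old protocol delivered 16 mSv in both figures; both are simplifying assumptions made here for illustration, not statements from the study, and the two exams also differed in scanner and acquisition settings.

```python
from math import sqrt

# (noise with old protocol, noise with lower dose protocol) from Figs. 1 and 2
roi_noise = {
    "pelvis (Fig. 1)": (9.36, 20.26),
    "liver (Fig. 2)": (16.9, 23.0),
    "gastric contents (Fig. 2)": (14.3, 19.2),
}

# Assumption: noise is proportional to 1/sqrt(dose), all else being equal,
# with 16 mSv for the old protocol and 12 mSv for the lower dose protocol.
expected_ratio = sqrt(16 / 12)

for roi, (old, new) in roi_noise.items():
    print(f"{roi}: observed noise ratio {new / old:.2f}")
print(f"ratio expected from the dose change alone: {expected_ratio:.2f}")
```

Single-patient ROI values are only illustrative, so any gap between the observed and expected ratios should be read as a reminder that anatomy, scanner model, and reconstruction settings also influence noise.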


Reference Standard Assessment

A gastroenterologist subspecializing in IBD (D.H.B.) reviewed the reports from ileocolonoscopy and histology (from ileal biopsy) for each patient to determine whether they had definitely active, probably active, inactive, or absent Crohn’s ileal inflammation using previously utilized criteria.2 Subsequently, a comprehensive clinical reference standard was constructed based on consensus agreement between this gastroenterologist and another gastrointestinal radiologist who did not participate in the blinded interpretation of the studies. In addition to revisiting the imaging datasets and endoscopic data included in the study, these physicians made use of prospective clinical data from the onset of the study in 2006 until January 2009, including the history and physical examination findings, any changes in the clinical follow-up course, serial radiologic imaging exams, endoscopy findings, subsequent operative notes, and biopsy findings, when available. The comprehensive clinical reference standard also categorized ileal inflammation as definitely active, probably active, inactive, or absent Crohn’s inflammation.

Sample Size Calculation and Comparison with Prior Studies

Sensitivity was chosen to calculate sample size, as this operating characteristic is clinically most relevant and often the primary variable of interest for clinicians when choosing a diagnostic test for CD. A pooled estimate of the sensitivity of CTE was 77%, based on high-quality CTE studies in the literature (defined as studies that used ileocolonoscopy with histopathology as a reference standard and that employed a CT slice thickness of 3 mm or less).20 To test the hypothesis that the sensitivity of a lower radiation dose CTE exam was not significantly different from published estimates, we assumed that a lower radiation dose CTE would detect about 77% of cases (i.e., 66 cases out of 85), yielding an estimated 95% exact binomial confidence interval (CI) of 67%–86%. Based on this information, we considered lower radiation dose CTE to be equivalent to standard radiation dose CTE if the sensitivity estimated in this study was at least the lower limit of this confidence interval (67%), i.e., within 10 percentage points of the pooled sensitivity of 77%.
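The exact binomial interval quoted above can be reproduced approximately with a Clopper-Pearson calculation. The sketch below is a minimal standard-library implementation that solves for each limit by bisection; it is offered as an illustration of the method, not as the software used in the study, and the function names are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(f, lo, hi, iters=100):
    """Root of a monotonic function f on [lo, hi]; a sign change is assumed."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for a proportion k/n."""
    # Lower limit: the p at which P(X >= k | p) equals alpha / 2.
    lower = 0.0 if k == 0 else _bisect(
        lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2, 0.0, 1.0)
    # Upper limit: the p at which P(X <= k | p) equals alpha / 2.
    upper = 1.0 if k == n else _bisect(
        lambda p: binom_cdf(k, n, p) - alpha / 2, 0.0, 1.0)
    return lower, upper


ci_low, ci_high = clopper_pearson(66, 85)
print(f"sensitivity {66 / 85:.1%}, 95% exact CI {ci_low:.1%} to {ci_high:.1%}")
```

The same function can be applied to the observed counts in Tables 3 and 4 to check the reported intervals, bearing in mind that small differences may arise from rounding or from the exact method a given statistics package uses.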

TABLE 2. Comparison of Categorization of Each Case into Disease Present and Disease Absent By Using Two Different Reference Standards

Contingency Table: Ileocolonoscopy vs. Clinical Reference Standard

                     Clinical Reference Standard
Ileocolonoscopy      Active       Absent       Total
Active               77 (89%)     10 (11%)     87
Absent               6 (12%)      44 (88%)     50
Total                83           54           137

Columns denote the reference standard per the comprehensive clinical reference standard, while rows denote the reference standard based on ileocolonoscopy ± biopsy. N = 142; five excluded (N = 142 − 5 = 137).

Statistical Analysis

The definitely active and probably active cases were combined into an “active disease” category, and the inactive disease and absent disease cases were combined into an “inactive disease” category, to make the outcome of interest a dichotomous variable for statistical analysis. The diagnostic capacity of lower radiation dose CTE was assessed by estimating the sensitivity, specificity, predictive values, and accuracy with 95% exact binomial CIs using an endoscopic/histologic reference standard as well as the comprehensive clinical reference standard. Likelihood ratios and diagnostic odds ratios (ORs) were also calculated. For each reader, logistic regression was used to report the unit OR with 95% CI, the area under the curve (AUC), and the sensitivity and specificity after applying optimal cutoff values for wall thickness and other variables. The optimal cutoff was identified by choosing the value of the CT sign that maximized the sum of sensitivity and specificity for correctly predicting disease activity. Using a multivariate logistic regression model, all CT signs were incorporated to determine which combination of CT signs best predicted the presence of small bowel disease activity, with the combined clinical reference standard as the gold standard. This was achieved using a stepwise method with a P-value threshold of 0.25 for entry into the model and of 0.05 for exiting the model. Statistical analysis was performed using JMP v. 8 (SAS Institute, Cary, NC).
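The cutoff rule described under Statistical Analysis, choosing the value that maximizes the sum of sensitivity and specificity, can be sketched as follows. The measurements below are hypothetical wall thickness values invented purely for illustration; they are not study data, and the function name is an assumption of this sketch.

```python
def best_cutoff(values, is_active):
    """Return (cutoff, sensitivity, specificity) maximizing sens + spec.

    A case is called positive when its value is >= cutoff.
    """
    n_pos = sum(is_active)
    n_neg = len(is_active) - n_pos
    best = None
    for cutoff in sorted(set(values)):
        tp = sum(v >= cutoff and a for v, a in zip(values, is_active))
        tn = sum(v < cutoff and not a for v, a in zip(values, is_active))
        sens, spec = tp / n_pos, tn / n_neg
        # Keep the first cutoff that strictly improves sens + spec.
        if best is None or sens + spec > best[1] + best[2]:
            best = (cutoff, sens, spec)
    return best


# Hypothetical wall thickness measurements in mm (NOT study data).
thickness = [2.0, 2.5, 3.0, 3.5, 3.5, 4.0, 4.5, 5.0, 6.0, 7.5, 9.0, 10.0]
active = [False, False, False, False, True, True,
          False, True, True, True, True, True]

cutoff, sens, spec = best_cutoff(thickness, active)
print(f"cutoff {cutoff} mm: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

Because the rule weights sensitivity and specificity equally, two readers measuring the same patients slightly differently can arrive at different optimal cutoffs.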

Ethical Considerations

The Institutional Review Board of our institution approved this retrospective study, conducted from data in institutional patient databases and archives. This article was presented in part at the 93rd Annual Meeting of the Radiological Society of North America (2007, Chicago, IL), but has not been previously published and is not under consideration for publication elsewhere. All authors have participated in the study to a significant extent.

RESULTS

Based on ileocolonoscopy findings, 87 patients were classified as having active ileal disease and 50 patients

TABLE 3. Performance of Lower Dose CTE for Active Ileal Crohn’s Disease Using an Endoscopic/Histologic Reference Standard

Parameter Estimate           Reader 1        95% CI       Reader 2        95% CI
Sensitivity, % (tp/tp+fn)    80.5 (70/87)    70.3–87.9    88.5 (77/87)    79.4–94.0
Specificity, % (tn/tn+fp)    82.0 (41/50)    68.0–91.0    62.0 (31/50)    47.2–75.0

tp, true positive; tn, true negative; fp, false positive; fn, false negative.

were deemed to have absent ileal disease (Table 2). Using the comprehensive consensus reference standard, 83 patients had active and 54 patients had inactive CD in the (neo)terminal ileum (Table 2). The mean patient size as measured by lateral width was 38.0 ± 6.4 cm (range 25.0–53.0 cm). Using the endoscopic reference standard, the CTE scans as reviewed by readers 1 and 2 had sensitivities of 80.5% (70/87) and 88.5% (77/87), which overlapped with the pooled published estimates of CTE sensitivity (77% ± 10%) (Table 3). The specificity of the lower dose protocol was 82% (41/50) for reader 1 and 62% (31/50) for reader 2. The sensitivity of CTE to detect the presence of overall active small bowel disease according to the clinical reference standard was 89.2% (74/83) for reader 1 and 97.6% (81/83) for reader 2 (Table 4). The specificity for reader 1 was 90.7% (49/54) and for reader 2 was 72.2% (39/54). The overall accuracy of CTE to correctly diagnose CD in the small bowel according to the comprehensive clinical reference standard was 89.8% (123/137) for reader 1 and 87.6% (120/137) for reader 2 (Table 4). The likelihood ratio for a positive test result for reader 1 was 9.6 (95% CI, 4.2–22.3) and for reader 2 was 3.5 (95% CI, 2.3–5.4). Conversely, the mean sensitivity, specificity, and overall accuracy of ileocolonoscopy to detect active CD using the consensus

clinical reference were 92.7% (77/83), 81.5% (44/54), and 88% (121/137), respectively. For both readers the sensitivity increased by 8%–9% when using the comprehensive reference standard, because six patients classified as “positive” by the clinical reference standard had negative ileocolonoscopies and 10 patients with a positive ileocolonoscopy were classified as having absent disease by the clinical reference standard. This change in reference standard meant that some CTE interpretations classified as false positive by ileocolonoscopy were actually true positive exams according to the combined clinical reference, indicating the presence of small bowel inflammation (Table 2). Accordingly, use of the combined clinical reference standard increased the specificity by 8%–9% for both readers.

False positive and false negative exams at CT enterography can arise from differences between readers (interobserver variability) or from the CT images misrepresenting Crohn’s inflammation (as present or absent). Both radiologists misclassified ileal inflammation as present in only two cases and as absent in one case. Consequently, the large majority of false positive and false negative CTE exams appear to arise from perceptual differences in radiologic findings (probably in equivocal cases) rather than from the selected radiologic findings themselves being insensitive or nonspecific. Increased performance of CTE using a combined clinical reference standard only mildly affects the performance of endoscopy in detecting disease using this same combined standard.

Using the comprehensive clinical reference standard, univariate performance characteristics for each CT sign for both readers are shown in Table 5. In univariate analysis, mural wall thickness, hyperenhancement, and stratification generated AUCs of 0.907, 0.906, and 0.832, respectively, for reader 1 and 0.900, 0.851, and 0.873 for reader 2. The ORs for mural hyperenhancement were 7.03 (95% CI 3.89, 14.98) for reader 1 and 7.59 (95% CI 3.99, 16.57) for reader 2. The sensitivities of mural hyperenhancement were 89.0% (73/82) for reader 1 and 80.7% (67/83) for

TABLE 4. Performance of Lower Dose CTE for Active Ileal Crohn’s Disease Using a Comprehensive Clinical Reference Standard

Parameter Estimate           Reader 1          95% CI       Reader 2          95% CI
Sensitivity, % (tp/tp+fn)    89.2 (74/83)      80.6–94.2    97.6 (81/83)      91.6–99.3
Specificity, % (tn/tn+fp)    90.7 (49/54)      80.1–96.0    72.2 (39/54)      59.1–82.4
Likelihood ratio of + test   9.6               4.20–22.3    3.5               2.30–5.40
NPV % (tn/tn+fn)             84.5 (49/58)      73.0–91.6    95.1 (39/41)      83.9–98.6
PPV % (tp/tp+fp)             93.7 (74/79)      86.0–97.3    84.4 (81/96)      75.8–90.3
Accuracy, % (tp+tn/N)        89.8 (123/137)    83.6–93.8    87.6 (120/137)    81.0–92.1

NPV, negative predictive value; PPV, positive predictive value.
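The point estimates in Table 4 follow directly from the 2 × 2 counts implied by its fractions (for reader 1: 74 true positives, 9 false negatives, 49 true negatives, 5 false positives). The sketch below recomputes them; the function name and dictionary keys are illustrative choices, and the diagnostic odds ratio is included only because the Statistical Analysis section mentions it.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Point estimates from a 2 x 2 table (assumes no zero denominators)."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "lr_positive": sens / (1 - spec),
        "diagnostic_or": (tp * tn) / (fp * fn),
    }


# Counts derived from the fractions reported in Table 4.
reader1 = diagnostic_metrics(tp=74, fn=9, tn=49, fp=5)
reader2 = diagnostic_metrics(tp=81, fn=2, tn=39, fp=15)

for name, metrics in (("Reader 1", reader1), ("Reader 2", reader2)):
    print(name, {key: round(value, 3) for key, value in metrics.items()})
```

The positive likelihood ratio is sensitive to small changes in specificity, which helps explain why it differs so much between the two readers despite similar overall accuracy.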


TABLE 5. Odds Ratio and 95% Confidence Interval for Correctly Diagnosing the Disease as per Clinical Reference Standard with Each Unit Increase for CT Signs of Inflammatory Crohn’s Disease, Using a Five Point Scale

CT sign                     Reader   OR (95% CI)             AUC    Sensitivity     Specificity
Mural hyperenhancement*     R1       7.03 (3.89, 14.98)      0.91   89.0 (73/82)    81.5 (44/54)
                            R2       7.59 (3.99, 16.57)      0.85   80.7 (67/83)    83.4 (45/54)
Maximum mural thickness*    R1       1.85 (1.55, 2.30)       0.91   78.0 (64/82)    87.0 (47/54)
                            R2       3.63 (2.11, 6.18)       0.90   80.2 (65/81)    94.5 (51/54)
Mural stratification*       R1       4.95 (3.00, 9.07)       0.83   78.0 (64/82)    81.5 (44/54)
                            R2       6.95 (3.86, 14.47)      0.87   83.1 (69/83)    85.2 (46/54)
Comb sign                   R1       27.61 (6.19, 491.01)    0.78   57.3 (47/82)    98.1 (53/54)
                            R2       NA                      0.72   44.6 (37/83)    100 (54/54)
Asym/mesenteric border      R1       17.96 (5.53, 111.45)    0.75   53.6 (44/82)    96.3 (52/54)
                            R2       4.43 (2.26, 11.15)      0.71   49.4 (41/83)    92.6 (50/54)
Increased fat density       R1       NA                      0.70   39.0 (32/82)    100 (54/54)
                            R2       26.09 (5.8, 464.2)      0.73   48.2 (40/83)    98.1 (53/54)
Fibrofatty proliferation    R1       2.58 (1.39, 6.13)       0.61   29.7 (24/82)    92.6 (50/54)
                            R2       6.77 (2.5, 37.7)        0.66   33.7 (28/83)    98.1 (53/54)

Area under the curve, sensitivity, and specificity for individual CT signs are also provided. The three CTE signs of active inflammation with the largest area under the curve (AUC) are marked with an asterisk. R1 = reader 1, R2 = reader 2, OR = unit odds ratio, AUC = area under the curve, NA = unable to calculate due to small outcomes.

reader 2. The sensitivities of mural stratification were 78.0% (64/82) for reader 1 and 83.1% (69/83) for reader 2. The specificities of maximum mural wall thickness were 87.0% (47/54) for reader 1 and 94.5% (51/54) for reader 2. The cutoffs for wall thickness that produced the optimum AUC were 6 mm for reader 1 and 4.0 mm for reader 2. The ORs for the comb sign and increased fat density were both approximately 26 to 28 where they could be calculated, and the specificities were at least 98.1% (53/54) for both readers (Table 5). However, the sensitivities of the comb sign and increased fat density were no higher than 57% for either reader. In multivariate analysis, after forward and backward stepwise logistic regression fit, mural thickness and hyperenhancement were selected using the specified probabilities to enter or leave the model. Using this combination, the regression model for reader 1 produced an AUC of 0.93 with P < 0.0001 (Fig. 3A) and that for reader 2 produced an AUC of 0.92 with P < 0.0001 (Fig. 3B). A comparison of the diagnostic performance of these two models is given in Table 6.
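The AUC values reported for the regression models can be read as the probability that a randomly chosen active case receives a higher model score than a randomly chosen inactive case. The sketch below computes that quantity directly from pairwise comparisons; the scores are hypothetical predicted probabilities made up for illustration and are not outputs of the study's models.

```python
def auc(scores_active, scores_inactive):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a random active case outscores a random
    inactive case, with ties counted as one half (Mann-Whitney form).
    """
    wins = 0.0
    for a in scores_active:
        for b in scores_inactive:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(scores_active) * len(scores_inactive))


# Hypothetical model outputs (NOT study data).
active_scores = [0.95, 0.90, 0.80, 0.70, 0.40]
inactive_scores = [0.60, 0.30, 0.20, 0.10]

print(f"AUC = {auc(active_scores, inactive_scores):.2f}")
```

This pairwise form is quadratic in the number of cases, which is acceptable for a cohort of this size; rank-based formulas give the same value more efficiently for larger datasets.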

DISCUSSION

Using endoscopy and/or histology as the reference standard, CTE in our study had a sensitivity of 80.5% (95% CI 70.3, 87.9) and 88.5% (95% CI 79.4, 94.0) for readers 1 and 2, which overlapped with the pooled sensitivity of CTE of 77% ± 8%3,4,8,20 reported in prior published studies with higher radiation dose levels, indicating that we have validated the performance of the technique with a lower radiation dose level in a large number of patients with a high degree of confidence. However, using visual mucosal assessment alone as a reference standard for active inflammatory CD may underestimate the true performance of CTE, as mucosal assessment and cross-sectional imaging assessment appear complementary.5 In a recent small prospective study comparing CTE and MRE, Siddiki et al12 found that in one-third of subjects (33.3%), cross-sectional imaging provided new information complementary to ileocolonoscopy. The reasons for these “endoscopic misses” included inability to cannulate the ileocecal valve, which can be stenotic in CD (36%); normal mucosa with mural inflammation indicated by CTE and MRE as well as by clinical assessment (27%); or sampling error (27%).

In the present study we found that by using a comprehensive clinical reference standard in which all available clinical information could be considered, both the sensitivity and specificity of CTE increased by 8%–9%, owing to misclassification of patients with ileal inflammation that was occult at endoscopic inspection. These findings, along with those reported by Siddiki et al,12 suggest that some of the prior studies in the literature may have underestimated the performance of CTE. A comprehensive clinical reference standard is a more appropriate reference standard that incorporates the complementary findings of both ileocolonoscopy and CTE. While CTE is better at diagnosing both mural and extramural manifestations of CD, ileocolonoscopy has the advantage of diagnosing more superficial and subtle mucosal defects. Indeed, despite a retrospective study design that was biased against ileocolonoscopy (i.e., ileocolonoscopy employed multiple operators unaware of the study design, as opposed to the two participating GI radiologists), the operating characteristics of ileocolonoscopy were impressive (sensitivity, 93%; specificity, 88%) compared to the combined clinical reference standard. The high performance of ileocolonoscopy in this setting bolsters the view that visual and cross-sectional imaging of the small bowel are complementary. Conversely, endoscopic biopsy may not eliminate visual misclassification, as sampling error can still occur. Hence, the results obtained in this study using a comprehensive clinical reference, as opposed to another diagnostic test, are likely to be more accurate and reproducible.

FIGURE 3. AUC and cutoff for sensitivity and specificity for the model consisting of mural thickness and hyperenhancement based on stepwise regression using multivariate analysis for reader 1 (A) and reader 2 (B).

Prior CTE studies have focused on validating specific CT findings that represent mural inflammation, rather than creating multivariate models that identify active CD.2,8 As in prior studies, mural hyperenhancement had a higher odds ratio than other CT signs for mural inflammation, indicating that when present, hyperenhancement is the strongest predictive CT finding associated with the presence of active CD. This finding highlights the need for scanning CD patients with neutral enteric contrast so that active inflammation can be detected. Maximum mural thickness was the most specific sign for both readers (87.0% and 94.5%). While the cutoff for mural thickness that produced the optimum AUC was slightly different for the two readers (6 versus 4 mm), this difference is small considering that each reader had to identify the location at which maximal bowel wall thickness should be measured, in addition to defining the luminal and serosal boundaries at this location. While the comb sign and increased fat density had the largest odds ratios (approximately 26 to 28) and were also the most specific CT signs (98% or higher), the low sensitivity of these findings renders them nondiagnostic if used alone. Perceptual differences in the assessment of mural hyperenhancement and wall thickness between the readers accounted for the majority of false positive and false negative exams for each reader. In the future, automated image analysis tools that reproducibly measure hyperenhancement and wall thickness promise to refine visual and manual estimates of mural enhancement and wall thickness. In our multivariate analysis using all the CTE signs for CD, mural thickness and mural hyperenhancement were found to be the combination that generated the best predictive model for both readers to correctly diagnose active CD. The model that produced the largest AUC (92%) for both readers employed both mural hyperenhancement and


TABLE 6. Best predicting model based on stepwise logistic regression comprising of mucosal hyperenhancement and mural wall thickness Parameter estimates

Reader 1 (n = 136)

Reader 2 (n = 135)

R2 P-value Intercept (95% CI) AUC Sensitivity Specificity

0.52