Preoperative prediction of sentinel lymph node

Eur Radiol DOI 10.1007/s00330-017-5005-7

BREAST

Preoperative prediction of sentinel lymph node metastasis in breast cancer based on radiomics of T2-weighted fat-suppression and diffusion-weighted MRI

Yuhao Dong 1,2 & Qianjin Feng 3 & Wei Yang 3 & Zixiao Lu 3 & Chunyan Deng 3 & Lu Zhang 1 & Zhouyang Lian 1 & Jing Liu 1 & Xiaoning Luo 1 & Shufang Pei 1 & Xiaokai Mo 1,2 & Wenhui Huang 1 & Changhong Liang 1 & Bin Zhang 1 & Shuixing Zhang 1

Received: 6 June 2017 / Revised: 12 July 2017 / Accepted: 24 July 2017
© European Society of Radiology 2017

Abstract

Objectives To predict sentinel lymph node (SLN) metastasis in breast cancer patients using radiomics based on T2-weighted fat suppression (T2-FS) and diffusion-weighted MRI (DWI).

Methods We enrolled 146 patients with histologically proven breast cancer. All underwent pretreatment T2-FS and DWI MRI scans. In all, 10,962 texture and four non-texture features were extracted for each patient. The 0.632+ bootstrap method and the area under the curve (AUC) were used to select the features. We constructed ten logistic regression models (orders of 1–10) based on different combinations of image features using a stepwise forward method.

Results For T2-FS, model 10 with ten features yielded the highest AUC of 0.847 in the training set and 0.770 in the validation set. For DWI, model 8 with eight features reached the highest AUC of 0.847 in the training set and 0.787 in the validation set. For joint T2-FS and DWI, model 10 with ten features yielded an AUC of 0.863 in the training set and 0.805 in the validation set.

Conclusions Full utilisation of breast cancer-specific textural features extracted from anatomical and functional MRI images improves the performance of radiomics in predicting SLN metastasis, providing a non-invasive approach in clinical practice.

Key Points
• SLN biopsy to assess breast cancer metastasis has multiple complications.
• Radiomics uses features extracted from medical images to characterise intratumour heterogeneity.
• We combined T2-FS and DWI textural features to predict SLN metastasis non-invasively.

Yuhao Dong, Qianjin Feng and Wei Yang contributed equally to this work.

Keywords Imaging . Breast cancer . Sentinel lymph node metastasis . Radiomics . Preoperative prediction

Electronic supplementary material The online version of this article (doi:10.1007/s00330-017-5005-7) contains supplementary material, which is available to authorized users.

* Shuixing Zhang [email protected]
Bin Zhang [email protected]

1 Department of Radiology, Guangdong General Hospital/Guangdong Academy of Medical Sciences, No. 106 Zhongshan Er Road, 510080 Guangzhou, Guangdong Province, People’s Republic of China

2 Graduate College, Shantou University Medical College, Shantou, Guangdong, People’s Republic of China

3 The Guangdong Provincial Key Laboratory of Medical Image Processing, School of Biomedical Engineering, Southern Medical University, Guangzhou, Guangdong, People’s Republic of China

Abbreviations
ALN Axillary lymph node
AUC Area under the curve
DWI Diffusion-weighted MRI
ER Oestrogen receptor
PR Progesterone receptor
SLN Sentinel lymph node
T2-FS T2-weighted fat suppression

Introduction

Breast cancer is the most frequently diagnosed cancer and the leading cause of cancer-related death among women worldwide, with an estimated 1.7 million cases annually [1]. Identification of the axillary lymph node (ALN) status remains important for breast cancer patients because it is an essential prognostic factor and guides adjuvant therapy decisions [2]. As the sentinel lymph node (SLN) is the first station of lymph node metastasis in breast cancer, SLN metastasis can accurately predict ALN status, especially for patients with clinically negative ALNs. Therefore, SLN biopsy is commonly used to evaluate ALN status and has become an alternative to ALN dissection [3, 4]. However, SLN biopsy is invasive and has significant complications, including shoulder dysfunction, nerve damage, upper arm numbness, and lymphoedema [5].

Although histopathological and clinical data of patients, such as lymphovascular invasion, extranodal extension, Ki-67 proliferation index, SLN size, histological grade, oestrogen receptor (ER) status and progesterone receptor (PR) status, are known to be predictors of SLN metastasis, they are available only postoperatively and cannot be used to guide decisions on performing SLN biopsy [6–10]. Preoperative knowledge of SLN status can potentially help in clinical decision-making concerning axillary surgery. Recently, a new method of SLN detection using contrast-enhanced ultrasonography (CEUS) with Sonazoid by evaluating microvessels was reported, but it is operator-dependent and its accuracy remains unclear [11, 12]. Hence, there is a great need for highly accurate, sensitive, and yet non-invasive methods to preoperatively predict SLN metastasis.

Radiomics has been a research hotspot in recent years; it is the process of converting medical images into high-dimensional, mineable and quantitative imaging features via high-throughput extraction of data-characterisation algorithms [13, 14].
Great advances in pattern recognition tools and data set sizes have facilitated the development of radiomics, which provides an unprecedented opportunity to improve decision support in oncology at low cost and non-invasively [14]. Previous studies have shown that objective and quantitative image features could potentially be used as prognostic or predictive biomarkers. However, almost all previous works focused on computed tomography (CT) and anatomical MRI, such as T1-weighted MRI (T1w-MRI) and T2-weighted MRI (T2w-MRI) [15–21]. Very few studies have used functional MRI in radiomics research. To the best of our knowledge, no publication has determined whether combining anatomical and functional MRI radiomics features would yield better prediction of SLN metastasis. Therefore, in the present study, we investigate the accuracy of T2-weighted fat suppression (T2-FS) and diffusion-weighted imaging (DWI) MRI-based radiomics features in the preoperative prediction of SLN metastasis in breast cancer patients.

Methods and materials

Study population and MRI images

Our institutional review board approved this retrospective study and waived the need to obtain informed consent from the patients. Consecutive patients with histologically confirmed breast cancer between March 2014 and June 2016 were retrospectively reviewed. None of the patients received treatment prior to MRI examination. SLN metastasis was confirmed by final histopathology. Patient and pretreatment tumour characteristics were collected from medical records, including age, histological grade, apparent diffusion coefficient (ADC) value (×10⁻³ mm²/s), ER status, PR status, cerB, HER2 status and Ki-67 proliferation index.

MR imaging was performed using a 1.5-T MR imager (Achieva 1.5 T, Philips Healthcare, Best, Netherlands) equipped with a 4-channel SENSE breast coil, with the patient in the prone position. T2-FS images of the breast were obtained (TR/TE = 3400/90 ms; FOV = 320 × 260 mm²; matrix = 348 × 299; slice thickness = 3 mm; slice gap = 0.3 mm). Axial DW images with bilateral breast coverage were obtained using single-shot spin-echo echo-planar imaging (EPI) (TR/TE = 5065/66 ms; FOV = 300 × 300 mm²; matrix = 200 × 196; slice thickness = 5 mm; slice gap = 1 mm; b values of 0 and 1000 s/mm²).

Figure 1a–c shows a 53-year-old woman with visible left ALN metastasis on the axillary image and two histopathologically confirmed SLN metastases. The breast tumour has a vague margin and is hyperintense on T2-FS and DWI images. Figure 1d–f shows a 62-year-old woman with suspicious right ALN metastasis on the axillary image and no SLN metastasis. The tumour has a relatively clear margin and is hyperintense on T2-FS and DWI images.

Radiomics workflow

The radiomics workflow, presented in Fig. 2, comprises (1) image segmentation; (2) feature extraction; (3) feature reduction; (4) feature selection; (5) predictive model building.
Image segmentation

We used axial T2-FS and DWI Digital Imaging and Communications in Medicine (DICOM) images that had been archived in the Picture Archiving and Communication System (PACS), without applying normalisation. Note that segmentation is required before the extraction of quantitative radiomics features. We used ITK-SNAP software for three-dimensional manual segmentation (open source software; www.itk-snap.org). All manual segmentations of the tumour were done by a radiologist who had 15 years of experience, and each segmentation was validated by a senior


Fig. 1 a–c A 53-year-old woman with visible left ALN metastasis on axillary image and two histopathologically confirmed SLN metastases. The breast tumour presents a vague margin and is hyperintense on T2-FS and DWI images. d–f A 62-year-old woman with suspicious right ALN metastasis on axillary image and no SLN metastasis. The tumour presents a relatively clear margin and is hyperintense on T2-FS and DWI images

Fig. 2 Radiomics workflow, including (1) MRI image segmentation; (2) feature extraction; (3) feature reduction; (4) feature selection; (5) predictive model building


radiologist, who had 20 years of experience (largely with breast cancer). The region of interest covered the whole tumour and was delineated on each slice of both the axial T2-FS images and the DWI images.

Feature extraction

The methodology used to extract radiomics features from the tumour region and the texture extraction parameters are described in the Supplementary Information. A total of four non-texture and 10,962 scan texture parameter features (5481 features from T2-FS images and the remaining 5481 from DWI images) were extracted for each patient. Feature extraction methods were implemented using MATLAB 2014a (MathWorks, Natick, MA, USA).

Univariate analysis

Univariate association between the whole set of features, namely four non-texture features and 10,962 scan texture parameter features, and SLN metastases was assessed using Spearman’s rank correlation (rs). The Bonferroni correction method was applied for multiple comparisons: the significance level was lowered to p < α/K, where α is the significance level, set to 0.05, and K is the number of comparisons.

Multivariable analysis

We were interested in finding a linear combination of p variables in a multivariable model so as to maximise the conditional probability of the set of outcome states {0,1} corresponding to the input data, which was achieved by the logistic regression defined as

$$ g(X_i) = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij}, \quad \text{for } i = 1, 2, \ldots, N, \tag{1} $$

where $x_{ij}$ is the value of the jth feature for the ith patient $X_i$, for a total of N patients, and $\beta = \{\beta_j \in \mathbb{R} : j = 0, 1, \ldots, p\}$ is the set of coefficients of the regression model. The 0.632+ bootstrap method and the AUC metric were adopted to estimate which model learned from our patient cohort would best predict SLN metastases on new prospective data. Let the imaging data set of our cohort be denoted as $X = \{X_i : i = 1, 2, \ldots, N\}$; then a bootstrap sample $X^{*} = \{X^{*}_i : i = 1, 2, \ldots, N\}$ is a sample of input variables $X_i$ of N patients randomly drawn with replacement from X.
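The bootstrap sample just defined can be illustrated with a minimal Python sketch. The study itself used MATLAB, so this is not the authors' code; the names `bootstrap_sample`, `in_bag` and `out_of_bag` are illustrative assumptions, and patients are represented only by their indices. It draws N indices with replacement and also records which patients were never drawn.

```python
import random


def bootstrap_sample(n_patients, rng):
    """Draw n_patients indices with replacement.

    Returns the drawn indices (duplicates allowed) and the sorted list of
    indices that were never drawn. Both names are illustrative.
    """
    in_bag = [rng.randrange(n_patients) for _ in range(n_patients)]
    drawn = set(in_bag)
    out_of_bag = [i for i in range(n_patients) if i not in drawn]
    return in_bag, out_of_bag


rng = random.Random(0)  # fixed seed so the sketch is repeatable
in_bag, out_of_bag = bootstrap_sample(146, rng)  # 146 = cohort size in the study
print(len(in_bag), len(set(in_bag)), len(out_of_bag))
```

Because sampling is with replacement, some patients appear several times in `in_bag` while others are left out entirely; the left-out patients can serve as test cases for a model fitted on the drawn ones.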
The generation of a large number B of randomly drawn bootstrap samples $X^{*b}$ (b = 1, 2, …, B) is used to estimate a statistical quantity of interest on the unknown true population distribution. Note that in the multivariable analysis, the probability of choosing a negative instance (Non-SLN-Mets group class) was made equal to that of choosing a positive instance (SLN-Mets group class) each time a bootstrap sample $X^{*b}$ was drawn from X; this procedure is denoted ‘imbalance-adjusted bootstrap resampling’.

Feature reduction

Prediction models were constructed for an initial feature set which contains four non-texture features and 10,962 MRI scan texture parameter features. First, feature set reduction was performed to create reduced feature sets containing 25 different scan texture features from the initial set based on the gain equation:

$$ \mathrm{Gain}_j = \gamma \cdot \left| \hat{r}_s(x_j, y) \right| + \delta_a \cdot \left[ \sum_{k=1}^{f} \frac{2(f - k + 1)}{f(f + 1)} \, \widehat{\mathrm{PIC}}(x_k, x_j) \right] + \delta_b \cdot \left[ \frac{1}{F} \sum_{l=1}^{F} \widehat{\mathrm{PIC}}(x_l, x_j) \right], \tag{2} $$

where

$$ \hat{r}_s(x_j, y) = \frac{1}{B} \sum_{b=1}^{B} r_s\!\left(x_j^{*b}, y\right) \quad \text{and} \quad \widehat{\mathrm{PIC}}(x_k, x_j) = \frac{1}{B} \sum_{b=1}^{B} \mathrm{PIC}\!\left(x_k^{*b}, x_j^{*b}\right). $$

In Eq. (2), rs(xj, y) is the Spearman’s rank correlation between feature j and the outcome vector y = {yi ∈ {0 : Non-SLN-Mets, 1 : SLN-Mets}, i = 1, 2, …, N}. PIC(xk, xj) is the potential information coefficient, defined as PIC(xk, xj) = 1 − MIC(xk, xj), where MIC(xk, xj) is the maximal information coefficient between features j and k as defined by Reshef et al. [22]. f is the number of features that have already been chosen for the reduced feature set, while F is the number of features that have not yet been removed from the initial set. γ, δa and δb are constants set to 0.5, 0.5 and 0, respectively. In each iteration, the feature with the largest gain was chosen for the reduced set, and a new gain was then calculated for all remaining features in the larger initial set using imbalance-adjusted bootstrap resampling (B = 1000). Since the first term of the gain equation uses Spearman’s rank correlation varying over the whole set of texture extraction parameters, it allows for ranking specific scan texture parameter features.

Feature selection

After feature reduction, stepwise forward feature selection was performed for logistic regression model orders of 1–10 by maximising the 0.632+ bootstrap AUC. The order of a model specifies the number of features to be selected as variables for the model. For a given model order and a given reduced feature set, the feature selection step was divided into 25 separate experiments, in each of which a different feature from the reduced set was assigned as the ‘starter’. For each given starter, 1000 logistic regression models were first created for the remaining features by imbalance-adjusted

bootstrap resampling (1000 samples), and then the single feature that maximised the 0.632+ bootstrap AUC, defined in Eq. (3), was chosen. This process was repeated up to order 10, after which the combination of features that yields the highest 0.632+ bootstrap AUC for each model was identified.

$$ \widehat{\mathrm{AUC}}_{0.632+} = \frac{1}{B} \sum_{b=1}^{B} \left[ (1 - \alpha(b)) \cdot \mathrm{AUC}(X, X) + \alpha(b) \cdot \mathrm{AUC}'\!\left(X^{*b}, X^{*b}(0)\right) \right], \tag{3} $$

where

$$ \mathrm{AUC}'\!\left(X^{*b}, X^{*b}(0)\right) = \max\left\{ 0.5, \, \mathrm{AUC}\!\left(X^{*b}, X^{*b}(0)\right) \right\}, \qquad \alpha(b) = \frac{0.632}{1 - 0.368 \cdot R(b)}, $$

and

$$ R(b) = \begin{cases} 1 & \text{if } \mathrm{AUC}\!\left(X^{*b}, X^{*b}(0)\right) \le 0.5, \\[4pt] \dfrac{\mathrm{AUC}(X, X) - \mathrm{AUC}\!\left(X^{*b}, X^{*b}(0)\right)}{\mathrm{AUC}(X, X) - 0.5} & \text{if } \mathrm{AUC}(X, X) > \mathrm{AUC}\!\left(X^{*b}, X^{*b}(0)\right) > 0.5, \\[4pt] 0 & \text{otherwise.} \end{cases} $$

The last step in the construction of the final prediction model was to compute the coefficients of the optimal combination of features. Let the logistic regression coefficient of feature j computed in a bootstrap sample $X^{*b}$ and its corresponding outcome vector y be modelled as $\beta_j(X^{*b}, y)$ for j = 0, 1, …, p, where p is the model order and j = 0 refers to the offset of the model $g(x_i)$. The final coefficients were computed as in Eq. (5), where B was set at 1000:

$$ \hat{\beta}_j = \frac{1}{B} \sum_{b=1}^{B} \beta_j\!\left(X^{*b}, y\right), \quad \text{for } j = 0, 1, \ldots, p. \tag{5} $$

The predictive performance of models was first assessed in the training set and then validated in the validation set.
Training data and validation data

In the 0.632 bootstrap, a particular piece of training data has a probability of 1 − 1/n of not being picked in a single draw; thus its probability of ending up in the validation data (not selected in any of the n draws) is

$$ \left(1 - \frac{1}{n}\right)^{n} \approx e^{-1} \approx 0.368. $$

In our study, there were approximately 146 × 0.368 ≈ 54 patients in the validation set. The training data will contain about 63.2% of the instances, i.e. 146 × 63.2% ≈ 92 patients.
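The arithmetic above can be checked with a few lines of Python. The variable names are illustrative assumptions; only the cohort size of 146 comes from the study. For n = 146 the exact probability is about 0.367, slightly below the large-n limit of 0.368, which is consistent with the rounded counts of about 54 and 92 patients.

```python
import math

n = 146  # cohort size reported in the study

# Probability that a given patient is never drawn in n draws with replacement.
p_not_drawn = (1 - 1 / n) ** n
limit = math.exp(-1)  # large-n limit of the expression above

expected_validation = n * p_not_drawn           # patients never drawn
expected_training = n - expected_validation     # distinct patients drawn

print(round(p_not_drawn, 4), round(limit, 4))
print(round(expected_validation, 1), round(expected_training, 1))
```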

Statistical analysis

The MATLAB 2014a packages used to analyse radiomics data are available at https://cn.mathworks.com/matlabcentral/fileexchange/51948-radiomics. We conducted a descriptive analysis, using cross-tabulations of histological grade, ER status, PR status, cerB and HER2 status. A two-tailed Student t test was used to compare the mean age, mean ADC value and Ki-67 proliferation index between patients with and without SLN metastasis. The data were analysed using Statistical Package for Social Sciences (SPSS) software version 23.0 (SPSS Inc., Chicago, IL, USA). A two-tailed p value less than 0.05 was considered statistically significant.

Predictive model building

Once the optimal combinations of features were identified for the models of different orders, prediction performance was again estimated on the basis of the 0.632+ bootstrap AUC. A single combination of features that possessed the best parsimonious properties in terms of the receiver operating characteristics (ROC) analysis was finally determined. ROC makes use of the TP rate and the FP rate, which are defined as

$$ \mathrm{TP}_{\mathrm{rate}} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}, \qquad \mathrm{FP}_{\mathrm{rate}} = \frac{\mathrm{FP}}{\mathrm{TN} + \mathrm{FP}}, \tag{4} $$

where TP (true positives), TN (true negatives), FP (false positives) and FN (false negatives) represent correctly classified positive cases, correctly classified negative cases, incorrectly classified negative cases and incorrectly classified positive cases, respectively.

Results

Tumour and patient characteristics

The clinical characteristics of the breast cancer patients are summarised in Table 1. There were no significant differences between the Non-SLN-Mets group (n = 91 patients) and the SLN-Mets group (n = 55 patients) in age, histological grade, mean ADC value, ER status, PR status, cerB, HER2 status and Ki-67 proliferation index (p = 0.100–0.946). In the SLN-Mets group, the mean number of metastatic SLNs per patient was 1.7 (±0.9). According to the final histopathological analysis, 86/91 (94.5%) SLNs had macrometastases, 5/91 (5.5%) had micrometastases and none had isolated tumour cells.

Univariate results

Table 2 presents the Spearman’s rank correlation (rs) between all features (non-texture and texture features) and SLN metastasis, along with their corresponding p values. For non-

Table 1 Patient and tumour characteristics

Characteristic | Non-SLN-Mets group (n = 91) | SLN-Mets group (n = 55) | p value
Mean age (SD) | 47.10 ± 11.0 | 48.0 ± 10.2 | 0.642
Histological grade | | | 0.148
I | 14 (15.3%) | 3 (5.5%) |
II | 35 (38.5%) | 27 (49.1%) |
III | 42 (46.2%) | 25 (45.4%) |
ER status | | | 0.100
Negative | 23 (25.3%) | 5 (9.1%) |
1+ | 14 (15.4%) | 8 (14.5%) |
2+ | 12 (13.2%) | 10 (18.2%) |
3+ | 42 (46.2%) | 32 (58.2%) |
PR status | | | 0.286
Negative | 19 (20.9%) | 5 (9.1%) |
1+ | 24 (26.4%) | 16 (29.1%) |
2+ | 9 (9.9%) | 8 (14.5%) |
3+ | 39 (42.9%) | 26 (47.3%) |
cerB | | | 0.681
Negative | 22 (24.2%) | 10 (18.2%) |
1+ | 19 (20.9%) | 13 (23.6%) |
2+ | 28 (30.8%) | 21 (38.2%) |
3+ | 22 (24.2%) | 11 (20.0%) |
HER2 status | | |
Positive | | |
Negative | | |
Ki-67 (%) | | |
Mean ADC value (SD) | | |

Table 2 Spearman rank correlation between features and sentinel lymph node metastasis

Type | Feature | rs | p value*
Non-texture
Volume
0.3091
0.00011
Size Solidity
0.2936 −0.0421
0.0801 0.00017
0.0511
0.4129
−0.2387 −0.3355
0.0017 0.00004
Eccentricity Texture Global
GLCM
Variance Skewness
DWI
0.0007
0.0029
Kurtosis
T2FS DWI
0.1107 0.1402
0.0079 0.000002
Energy
T2FS DWI
0.2194 0.2377