Consistency-based Respiratory Motion Estimation in ...

1 downloads 0 Views 1MB Size Report
E-mail: mathias[email protected]; web: www5.cs.fau.de/~unberath .... In X-ray imaging, contrasted vessels appear as narrow tubular structures that are bright.
1

Consistency-based Respiratory Motion Estimation in Rotational

2

Angiography

3

Mathias Unberath∗ and Andreas Maier

4

Pattern Recognition Lab, Computer Science Department,

5

Friedrich-Alexander-University Erlangen-Nuremberg and

6

Erlangen Graduate School in Advanced Optical Technologies (SAOT), Germany

7

Andr´e Aichert

8

Pattern Recognition Lab, Computer Science Department,

9

Friedrich-Alexander-University Erlangen-Nuremberg

10

Stephan Achenbach

11

Department of Cardiology, Friedrich-Alexander-University Erlangen-Nuremberg

12

(Dated: March 11, 2017)

13

Purpose: Rotational coronary angiography enables 3D reconstruction but suffers

14

from intra-scan cardiac and respiratory motion. While gating handles cardiac mo-

15

tion, respiratory motion requires compensation. State-of-the-art algorithms rely on

16

3D-2D registration that depends on initial reconstructions of sufficient quality. We

17

propose a compensation method that is applied directly in projection domain. It

18

overcomes the need for reconstruction and thus complements the state-of-the-art.

19

Methods: Virtual single-frame background subtraction based on vessel segmenta-

20

tion and spectral deconvolution yields non-truncated images of the contrasted lu-

21

men. This allows motion compensation based on data consistency conditions. We

22

compensate craniocaudal shifts by optimizing epipolar consistency to (a) devise an

2 23

image-based surrogate for cardiac motion and (b) compensate for respiratory mo-

24

tion. We validate our approach in two numerical phantom studies and three clinical

25

cases.

26

Results: Correlation of the image-based surrogate for cardiac motion with the

27

ECG-based ground truth was excellent yielding a Pearson correlation of 0.93 ± 0.04.

28

Considering motion compensation, the target error measure decreased by 98 % and

29

69 %, respectively, for the phantom experiments while for the clinical cases the same

30

figure of merit improved by 46 ± 21 %.

31

Conclusions: The proposed method is entirely image-based and accurately esti-

32

mates craniocaudal shifts due to respiration and cardiac contraction. Future work

33

will investigate experimental trajectories and possibilities for simplification of the

34

single-frame subtraction pipeline.



E-mail: [email protected]; web: www5.cs.fau.de/~unberath

3 I.

35

INTRODUCTION

36

Minimally invasive treatment of cardiovascular disease (CVD) heavily relies on inter-

37

ventional X-ray imaging [1]. Traditionally, coronary angiograms are acquired from few

38

viewing angles only, impeding assessment due to projective simplifications, such as overlap

39

and foreshortening [2–4]. These limitations can be alleviated by providing physicians with

40

3D reconstructions of the arterial tree, which is expected to improve diagnostic assessment

41

and interventional guidance [1, 5, 6]. Unfortunately, the low temporal resolution of C-arm

42

cone-beam CT (CBCT) scanners inhibits straight-forward 3D reconstruction. Rotational

43

sequences are acquired over several seconds and, therefore, are corrupted by cardiac and

44

respiratory motion. Consequently, both motion patterns have to be incorporated into re-

45

construction algorithms.

46

Cardiac motion is highly complex but repetitive with high frequency. Instead of estimating

47

a full 4D motion field, retrospective gating of the sequence is sufficient. After gating, only

48

images of the same cardiac phase contribute to the reconstruction. Gating approaches are

49

generic in the sense that they can be employed in both back-projection-type [7, 8] and sym-

50

bolic reconstruction algorithms [3, 4, 9]. Moreover, the choice of low-dimensional surrogate

51

signal is unrestricted. Simultaneously acquired electrocardiograms (ECG) are common sur-

52

rogate signals [8, 10]. Yet, image-based surrogates exist [3] and should be preferred as they

53

encode the motion-state rather than the electro-physiologic activation.

54

In contrast to cardiac motion, respiratory motion is low frequency such that only very few

55

recurrences are observed during a standard acquisition.

56

ning strategies followed by cyclic registration have proved robust in image-guided radiation

57

therapy [11–15]. Unfortunately, in the context of rotational angiography respiratory phase

Respiratory phase gating or bin-

4 58

binning is impossible as it would result in a highly ill-posed problem, impeding the appli-

59

cation of methods proposed by Sonke et al. [11], Li et al. [12], and Brehm et al. [13, 14]. A

60

prominent solution is to avoid the problem as a whole by requiring patients to hold their

61

breath throughout the acquisition. If breath-hold is impossible or imperfect, motion estima-

62

tion techniques must be applied to mitigate the corruption due to respiration and, finally,

63

enable 3D reconstruction.

64

State-of-the-art methods for respiratory motion compensation in standard protocols require

65

an initial 3D reconstruction of the structure of interest, that enables 3D/2D registration of

66

the model to subsequent motion states. In bi-plane angiography it has been shown that

67

this strategy allows for the estimation of 3D rotations and translations [16] but even dense

68

3D displacement fields can be recovered [17]. Single-plane rotational acquisitions also offer

69

multiple views enabling reconstruction. Blondel et al. used alternating reconstruction and

70

bundle adjustment steps to estimate detector shifts to account for respiratory motion [3, 18].

71

In the context of rotational angiography, however, motion estimation is far more challeng-

72

ing as supposedly consistent frames are acquired successively rather than simultaneously.

73

Hence, there is no guarantee that a particular motion state is observed more than once

74

which would prevent any uncompensated reconstruction in the first place. Aforementioned

75

approaches are similar in that they seek to establish correspondences among different views

76

by transformation of a 3D model. In case of non-residual, low frequency motion, the recon-

77

struction of such initial model is challenging or even impossible, making its requirement a

78

limiting factor for the applicability of the methods.

79

The good news is that the motion of the heart due to respiration is considerably less com-

80

plex than cardiac motion itself. It is linearly related to diaphragm motion [19] and can

81

be approximated as a 3D translation that exhibits its largest component in craniocaudal

5 82

direction [3, 20, 21]. In practice, the craniocaudal axis of the patient aligns with the vertical

83

detector coordinate such that the optimization for detector domain shifts should already

84

yield substantial improvement [3]. To omit the need for initial reconstructions, we build on

85

a recently published motion estimation technique that is applied in projection domain and

86

optimizes for consistency among all images. The method exploits redundancies in transmis-

87

sion imaging to derive data consistency conditions (DCC) from the epipolar geometry [22].

88

Unfortunately, the method cannot be applied to truncated images as truncation inherently

89

voids the consistency of an acquisition. Although the projection of the whole thorax is trun-

90

cated, the coronary arteries do not extend beyond the field-of-view. Separation of the con-

91

trasted coronary arteries from the remaining thorax yields non-truncated projections that

92

could be used as input to DCC-based motion estimation algorithms. A straight-forward

93

method to achieve material decomposition is background removal via digital subtraction.

94

For this task, projections of the same scene without contrast enhancement are estimated

95

from the acquired images and a segmentation mask [23]. The virtual background image

96

is subtracted from the original projections yielding non-truncated images of the contrast

97

agent only. Background suppression is an integral component of many vessel reconstruction

98

methods but is usually low-level and used for artifact reduction only [6]. Extending its use-

99

fulness to purely projection-based motion detection and compensation is, therefore, of high

100

relevance.

101

In this paper we show that virtual subtraction imaging of the coronary arteries enables

102

consistency-based respiratory motion estimation. Epipolar consistency is used to estimate

103

vertical detector shifts in numerical cone-beam CT simulations of an XCAT-like phantom

104

[24, 25] and on real patient data. Finally, the high frequency component of the estimated

105

shifts that is correlated to the superior-inferior motion of the heart during contraction [21]

6 106

is used to derive an image-based surrogate for cardiac motion.

II.

107

METHODS

108

We describe an algorithm targeted at translational motion estimation in rotational cone-

109

beam CT angiography using epipolar consistency conditions (ECC). First, the contrasted

110

vessels must be separated from the background via virtual subtraction imaging in order to

111

obtain non-truncated images and, therefore, enable consistency-based processing. Second,

112

horizontal and vertical shifts of the detector are estimated that maximize consistency among

113

all images in the sense of epipolar consistency conditions. Finally, we extract high frequency

114

components of the estimated motion patterns that correlate with the superior-inferior motion

115

of the heart during contraction to derive an image-based surrogate for the cardiac phase.

116

A.

Virtual subtraction imaging

117

Background removal is already used in many vessel reconstruction algorithms to reduce

118

artifact or to enable sparsity priors. To this end, simple image processing techniques such

119

as morphological closure [3] or top-hat filtering are sufficient [8]. For consistency-based pro-

120

cessing we seek to reliably separate the contrasted vessels from the background via digital

121

subtraction. Therefore, high fidelity estimates of the contrasted regions are needed. Subse-

122

quently, the segmented pixels are removed from the images and then filled with values that

123

are estimated from a local neighborhood. The estimated background is then subtracted

124

from the original projection, ideally yielding a non-truncated image of the contrasted vessels

125

only. A schematic of the algorithm is given in Figure 1.

7

Subtraction Segmentation algorithm

Inpainting algorithm

Masking

X-ray projection

Virtual mask image

Binary mask

Digital subtraction angiogram

FIG. 1. Schematic of the truncation removal. A binary segmentation of the object is used to remove the corresponding pixels and the background is estimated using an inpainting method. Finally, the background is subtracted from the original projection yielding the difference image.

1.

126

Segmentation

127

In X-ray imaging, contrasted vessels appear as narrow tubular structures that are bright

128

with respect to the surrounding background. It seems natural to exploit this prior knowledge

129

for segmentation in a multi-scale approach based on first and second order derivatives. The

130

acquired images I 0 are preprocessed using the well-known bilateral filter that suppresses

131

noise while preserving edges [26].

132

Let α(u, σ) be the local orientation of a structure at position u ∈ R2 and scale σ, and let eα

133

be a unit vector orthogonal to the respective orientation. If the structure at u is a contrasted

134

vessel of scale σ we expect high negative gradients in the direction orthogonal to α as well

135

as a high negative curvature along eα . Filters that enhance structures with aforementioned

136

properties are already known in literature. We briefly restate vessel enhancement and Koller

137

filtering following [27] and [28], respectively. Finally, we describe a strategy to merge both

138

responses in a binary segmentation mask.

139

In order to determine the respective responses at position u, first the Hessian matrix is

8 140

computed as 

    H(u, σ) =   

141

∂ 2 Iσ (u) ∂u21

∂ 2 Iσ (u) ∂u1 ∂u2

∂ 2 Iσ (u) ∂u2 ∂u1

∂ 2 Iσ (u) ∂u22

   ,  

(1)

142

where Iσ = I ∗Gσ , Gσ is a 2D Gaussian kernel with standard deviation σ, I is the projection

143

image after bilateral filtering, and ∗ denotes convolution. The Eigenvalues of the Hessian

144

matrix λ1/2 , |λ1 | ≥ |λ2 | are related to the tubularity of an underlying structure and allow for

145

the definition of meaningful quantities. The Blobness Rb = λ2/λ1 is close to zero for tubular

146

147

structures and close to unity for bright regions with similar extent in all directions, such p as blobs. The structureness S = λ21 + λ22 is given by the Frobenius norm of the Hessian

148

and is close to zero for uniform background and large in regions with high contrast [27].

149

Combining the two measures into a probability-like response yields the vesselness

V(u) = max exp(−c1 Rb ) · (1 − exp(−c2 S)) ,

150

σ

(2)

where c1/2 are constants that control the influence of Rb and S. Tuning c1/2 for sensitivity yields strong responses for vessels, but also for other structures with high curvature. Subsequent hysteresis thresholding and removal of small connected components suppresses isolated responses and yields a binary mask Vb . To remove some of the non-vessel segmentations still included in the binary mask the Koller filter can be used. We found it to have a higher specificity than the vesselness, yet, Koller filtering only yields strong responses on the centerline of arteries. Given the Hessian H(u, σ), we can compute the orientation α(u, σ) of the generating structure as tan(2α) = 2H12 (H11 − H22 ). For vessels with scale σ we expect large negative gradients at its boundaries u ± σeα , yielding the Koller filter response n n > > oo K(u) = max min − ∇Iσ (u + σeα ) eα , ∇Iσ (u − σeα ) eα . σ

9

TABLE I. Heuristically determined parameters used in the segmentation algorithms. Constant c2 is defined using the c-th percentile of the structureness Pc (S)). As the Koller filter response is not normalized, tK is dependent on image domain gradients and, hence, the contrast level. c1

c2

c3

tK

Ph1 and Ph2

1/0.52

1/(P0.94 (S))2

4.0 mm

0.5

R1, R2,and R3}

1/0.52

1/(P0.98 (S))2

4.0 mm

0.07

151

Thresholding of K yields an approximation of the set of centerline points QK = {u | K(u) > tK },

152

where tK is a heuristically determined threshold. We define g(u) = min{ku − vk2 } with

153

v ∈ QK as the distance of position u to its closest centerline point, and merge both responses

154

in a binary segmentation mask W such that (

155

W(u) =

1 for Vb (u) 6= 0 ∧ g(u) < c3

,

(3)

0 otherwise 156

where c3 is chosen to reflect the largest vessel radius to be expected.

157

The performance of aforementioned algorithms depends on the choice of parameters that

158

were tuned heuristically. Table I summarizes the values used for phantom and real data

160 159

experiments. Finally, we define the inverse segmentation mask W, where W(u) = 1−W(u).

161

2.

Background estimation

162

The segmentation masks obtained according to Section II A 1 are used to remove intensity

163

information at the locations of contrasted vessels, thereby artificially corrupting the projec-

164

tions. Formally, the corrupted image G is obtained by multiplying the original image with

165

the inverse mask such that G = I · W. The acquired image after noise filtering I and the

166

background image that we seek to estimate B are similar in that their corrupted versions

10 167

cannot be distinguished: !

168

G =I ·W =B·W .

169

The artificially corrupted observation G does not contain any information about the presence

170

of contrast agent. Consequently, the estimation solely relies on the remaining structures and

171

the resulting background estimate can be considered a virtual non-contrast image [23]. The

172

corrupted regions of B are narrow and sparse but they are connected. Simple methods, such

173

as morphological closure, may be rewarding for defect pixel interpolation [29], however, in

174

the case of background recovery they often yield patchy images with unnatural appearance.

175

A concurrent approach proposed by Aach et al. tries to recover the missing values using

176

spectral domain techniques [30]. According to the convolution theorem, Equation 4 can be

177

restated as a convolution in frequency domain g =

178

in the background image, and ∗ denotes convolution. Both G and B are real valued such that

179

their Fourier transforms g and b , respectively, are symmetric. Therefore, b (k) = b ∗ (N − k)

180

for a spectral line pair b (k) and b (N −k). Let the spectrum b of the undistorted background

181

B consist of only a single spectral line pair b (k) = bˆ(s) δ(k − s) + bˆ(N − s) δ(k − (N − s)).

182

Then, the observed line pair at s and N − s reads:

g (s) =

1 N



1 N



(4)

1 ( N

b ∗ w ), where N is the number of pixels



bˆ (s) · w (0) + bˆ ∗ (s) · w (2s)

(5)

183

g (N − s) = g ∗ (s) =

bˆ ∗ (s) · w ∗ (0) + bˆ (s) · w ∗ (2s)



184

where bˆ(s) and bˆ(N − s) = bˆ∗ (s) are the coefficients to be recovered. Equation 5 can be

185

solved for bˆ(s) yielding

186

bˆ(s) = N ·

g (s)w (0) − g ∗ (s)w (2s) . |w (0)|2 − |w (2s)|2

(6)

11 187

In practice, however, b consists of more than a single spectral line pair. After one such

188

deconvolution step the error is zero only at s and N − s. Therefore, deconvolution is applied

189

iteratively, estimating multiple line pairs s(i) and N − s(i) in succession. The line pairs

190

for each iteration are selected such that they maximize the decrease in Mean-Square Error

191

[30]. We apply spectral deconvolution in a patch wise manner to preserve the locality of

192

image appearance. The patches are weighted with a Blackman window to create artificially

193

continuous signals and to avoid spectral leakage [31].

194

Finally, the estimated background image is subtracted from the acquired images yielding

195

an artificial digital subtraction angiogram (aDSA) D(u) = I(u) − B(u) [32].

196

perfect segmentation and background estimation, D contains contributions of the contrast

197

agent only. As the contrast agent is confined to the coronary arteries that do not extend

198

beyond the field of view, the resulting aDSA images D are, ideally, not truncated.

199

enables the application of consistency-based methods for intra-scan motion detection that

200

are described in the following sections.

201

B.

Assuming

This

Epipolar consistency

202

Consistency-based processing requires a series of non-truncated images Di , i = 1, . . . , M

203

acquired in a known geometry that is defined by its projection matrices Pi ∈ R3×4 [22, 33].

204

The epipolar geometry describes the geometric relation between two views i1 and i2 , where

205

i1/2 ∈ {1, . . . , M } and i1 6= i2 . It allows for the definition of corresponding lines lii21 and

206

lii12 in the images Di1 and Di2 , respectively. The lines arise from the intersection of the

207

respective epipolar plane E, which is a homogeneous 3-vector, with the two image planes.

208

A schematic of the described relations is given in Figure 2. The resulting lines in the domain

209

of the respective images are defined by a signed distance to the image center t and an angle

12 210

β measured with respect to the u1 axis, such that l = (− sin(β), cos(β), −t)> .

211

Assuming an orthogonal acquisition geometry, integration over corresponding epipolar lines

212

lii21 and lii12 gives two redundant ways of calculating the integral over the epipolar plane E

213

through the three dimensional object. It holds

214

  ρi1 lii21 − ρi2 lii12 = 0 ,

215

where ρi (li ) denotes integration over the line li in image Di . Equation 7 also provides

216

intuition of why subtraction imaging as discussed in Section II A is necessary: In case of

217

truncation the integrals over epipolar lines would not yield the complete plane integral

218

and would, therefore, disagree even when no motion is present. For cone-beam projections

219

aforementioned equality does not apply as integrals over epipolar lines differ by a weighting

220

with the distance to the camera center [22]. Yet, they carry redundant information. Aichert

221

et al. use Grangeat’s theorem to cancel out the weighting and derive consistency conditions

222

similar to Equation 7 for cone-beam geometries:

(7)

223

 ∂  ∂ ρi1 lii21 − ρi2 lii12 ≈ 0. ∂t ∂t

224

In Equation 8, t is the distance of the line to the image origin. There exists a pencil of

225

epipolar planes Eii21 (κ) around the baseline, where κ is the angle of the epipolar plane and

226

the principal ray and the baseline is the join of the two camera centers [22]. This allows for

227

the definition of a global consistency metric EC by summation over all image pairs:

228

EC =

M X X i1 =1 i2 fresp

Γ(f ) if

  0

,

(11)

else

250

max is a constant related to the maximum where Γ denotes the Fourier transform of γ. fresp

251

respiratory and minimum cardiac frequency to be expected and set to 0.5 Hz.

252

max = 0.5 Hz proves reasonable considering an average ventilation rate of 0.33±0.08 Hz [34]. fresp

253

Moreover, the acquisition protocol discussed here is performed using breath-hold commands

254

suggesting residual respiratory motion.

255

correspond to the same cardiac phase and interpolate between them to obtain a normalized

256

cardiac time ci ∈ [0, 1] for each frame i. Then, we can calculate gating weights wi for a

257

reference heart phase cr following [8]

258

wi (cr ) =

   

a

cos

  0



Setting

We detect peaks in γcard that are assumed to

|ci −cr | π w



if |ci − cr |
, k ∈ {i2 , i3 } maps to the blue epipolar lines. Images i1 and i2 correspond to the same cardiac phase cr . The rightmost image i3 , however, corresponds to a drastically different phase and shows residual mismatch between the compensated epipolar line and the tracked key point. This observation highlights the influence of cardiac motion and motivates the use of gating weights in Equation 14.

294

state the Structural Similarity (SSIM) and the Root-Mean-Square Error (RMSE) of the

295

background estimate with respect to the non-contrasted, noise-free ground-truth. We also

296

state SSIM and RMSE of noisy realizations of the non-contrasted ground-truth to establish

297

an upper bound for both measures. To avoid bias introduced by pixels that never changed,

298

the measures are computed only in a region of interest that is obtained by dilating the

299

estimated segmentation mask by 10 pixels.

300

C.

Motion compensation

301

Evaluating the performance of motion estimation is difficult. Direct comparison of the

302

estimated shifts with ground truth is unrewarding, as cardiac motion is non-rigid and, hence,

303

cannot be compensated for by the proposed strategy. To overcome this fundamental limi-

304

tation, we evaluate the compensation error for different target heart phases separately. To

18 305

this end, we track the 2D position ξij ∈ R2 of key points j = 1, . . . , Q, such as vessel bifur-

306

cations, over all frames i = 1, . . . , M . In the phantom experiments the reference positions

307

are obtained by forward projection of the key points in 3D. For the real data experiments

308

vessel bifurcations are tracked manually in 2D. Finally, the error averaged over all tracked

309

points at a given heart phase cr reads Q

M

310

X 1 1 X P wi (cr ) · ζij , εcr = Q j=1 i wi (cr ) i=1

311

where the gating weights wi (cr ) are calculated from γ according to Section II C. Moreover,

312

ζij is the weighted average distance of point j to the corresponding epipolar lines given by ζij

313

>   X 1 j j k k > wk (ci ) · ξk · Fi · ξi + (0, ti , 0) , =P k6=i wk (ci ) k6=i

(13)

(14)

314

where we use underline notation to denote homogeneous extension, tki = γi − γk , and F ∈

315

R3×3 is the fundamental matrix that can be computed from the projection matrices [22].

316

The uncompensated error can be computed by setting tik = 0 in Equation 14. We also state

317

the Pearson correlation coefficient of the image-based surrogate for cardiac motion derived

318

according to Section II C with the gold standard surrogate that is derived from the ECG

319

signal.

IV.

320

A.

321

RESULTS

Segmentation and inpainting

322

Quantitative results of the segmentation quality assessment in the phantom study are

323

stated in Table IV. Sensitivity and specificity values are slightly smaller than unity indi-

324

cating minor segmentation errors. Representative examples of such mis-segmentations are

325

highlighted in Figure 4.

19

TABLE IV. Sensitivity and specificity for the segmentations of phantom data sets Ph1 and Ph2 at both photon fluence levels in the rectangular region of interest around the arteries with respect to the ground-truth. 5 · 102 photons

1 · 104 photons

Study Sensitivity

Specificity

Sensitivity

Specificity

Ph1

0.96 ± 0.017

0.99 ± 0.0033

0.96 ± 0.014

0.98 ± 0.0036

Ph2

0.96 ± 0.015

0.99 ± 0.0037

0.96 ± 0.014

0.99 ± 0.0036

326

SSIM and RMSE of the background estimation algorithm described in Section II A 2 can be

327

found in Table V and Table VI for the phantom studies with low and high photon counts per

328

pixel, respectively. Values of the SSIM and RMSE are worse for the more noisy realizations.

329

B.

330

Motion estimation

331

Ground truth motion patterns and the shifts estimated according to Section II B for

332

the phantom studies are shown in Figure 5. The cardiac motion is non-rigid which can be

333

understood from the dashed black lines in Figure 5 that correspond to the displacement

334

in u2 coordinate of the distinct bifurcation points ξ j of the artery tree. Qualitatively, the

335

estimated shifts are in very good agreement with the ground truth motion patterns.

336

For all phantom and real data sets, the high frequency component of the estimated shifts

337

correlated very well with cardiac motion. The Pearson correlation coefficients between sur-

338

rogates derived from the ECG and shifts γ are stated in Table VII. Correlation was higher

339

for the clinical cases.

340

Finally, we used the surrogates to assess the motion compensation error at distinct heart

R3

R2

R1

Ph2: 1·104 phot.

Ph1: 5·102 phot.

20

FIG. 4. From left to right: projection, segmentation, estimated background, and virtual subtraction image. Red and green arrows point to excess and missing segmentations, respectively. We show representative angiograms of the left coronary artery at different viewing angles for the two phantom data sets Ph1 and Ph2, and the three real data cases R1, R2, and R3. Individual data sets occupy one row each.

21

TABLE V. Quantitative results of the image inpainting step for phantom studies Ph1 and Ph2 with a photon fluence of 5 · 102 photons per pixel. We state the mean value and standard deviation of the structural similarity (SSIM) and the root-mean-squared error (RMSE) of the inpainted image Bˆ compared to the noise-free ground-truth. The values stated in brackets were obtained when comparing the a bilateral filtered version of the background estimate (see Section II A 1) to the noise-free ground-truth. Moreover, we provide an upper bound to aforementioned quantities by computing SSIM and RMSE for a noisy version of the ground-truth (noisy GT). @ @

SSIM

RMSE

inpainted

0.91 ± 0.03 (0.99 ± 0.01)

0.36 ± 0.04 (0.23 ± 0.06)

noisy GT

0.92 ± 0.03 (1.0 ± 0.0)

0.27 ± 0.03 (0.058 ± 0.006)

inpainted

0.91 ± 0.03 (0.99 ± 0.00)

0.36 ± 0.05 (0.23 ± 0.07)

noisy GT

0.92 ± 0.03 (1.0 ± 0.0)

0.27 ± 0.03 (0.057 ± 0.006)

@ @ Study @ @

Ph1

Ph2

341

phases according to Equation 13. The resulting errors are shown in Figure 7 and Figure 8

342

for the phantom and real data studies, respectively. We observed substantial reductions in

343

εcr for all studied cases, the numeric values are stated in Table VIII.

344

V.

345

346

A.

DISCUSSION

Segmentation and inpainting

347

From the phantom images shown in Figure 4 it becomes apparent that the proposed

348

pipeline yields segmentation masks that are marginally thicker than the underlying vessels.

22

TABLE VI. Inpainting results for both phantom studies when simulating 1 · 104 photons per pixel. Mean value and standard deviation of the SSIM and the RMSE of the inpainted image Bˆ and a bilateral filtered version (in brackets) compared to the noise-free ground-truth. Again, upper bounds for all measures are computed using a noisy version of the ground-truth (noisy GT). @ @

SSIM

RMSE

inpainted

0.98 ± 0.01 (0.99 ± 0.01)

0.26 ± 0.08 (0.25 ± 0.08)

noisy GT

1.0 ± 0.0 (1.0 ± 0.0)

0.060 ± 0.006 (0.034 ± 0.006)

inpainted

0.99 ± 0.01 (0.99 ± 0.00)

0.25 ± 0.09 (0.24 ± 0.09)

noisy GT

1.0 ± 0.0 (1.0 ± 0.0)

0.059 ± 0.006 (0.033 ± 0.005)

@ @ Study @ @

Ph1

Ph2

TABLE VII. Pearson correlation of the image-based surrogate for cardiac motion with the surrogate derived from the ECG. Study

Ph1

Ph2

Photons

5·102

1·104

5·102

1·104

Pearson R

0.88

0.88

0.91

0.91

R1

R2

R3

0.92

0.98

0.94

349

This effect arises from hysteresis thresholding and decreases specificity. Sensitivity is in-

350

fluenced by incomplete segmentations, e.g., at bifurcations. Although over-segmentation is

351

the dominant source of error, the impact of under-segmentation is more pronounced due

352

to the imbalanced foreground and background class labels.

353

slight decrease in specificity for phantom study Ph1when comparing high and low photon

354

count realizations, which suggests over-segmentation. This unexpected deterioration can be

355

attributed to the complex vessel enhancing filters, e.g., due to excess points in QK that have

356

responses barely higher than the allowed threshold tK . However, not having found sub-

Interestingly, we observed a

23 5·102 phot. γ + 10 mm 1·104 phot. γ + 5 mm Ground-truth

Shift in mm

30 20 10 0 -10 -20 1

1.5

2

2.5

3

3.5

4

4.5

5

5.5

Frame time in multiples of cardiac time

Shift in mm

(a) Study Ph1 5·102 phot. γ + 10 mm 1·104 phot. γ + 5 mm Ground-truth

10

0

1 1.5 2 2.5 3 3.5 4 4.5 5 5.5 6 6.5

Frame time in multiples of cardiac time

(b) Study Ph2

FIG. 5. Shifts γ obtained by optimizing for consistency are plotted for phantom studies Ph1 and Ph2 in the top and bottom figure, respectively. Shifts corresponding to different noise realizations are slightly offset to allow for improved visualization. The black, dotted curves show reference shifts of the vessel key points that were tracked in 3D.

357

stantial difference in performance for high and low noise realizations suggests that bilateral

358

filtering effectively mitigated the influences of noise (see Table IV).

359

more challenging in the clinical cases suggesting that the numerical phantom study was

360

limited with respect to anatomical variation and imaging physics. Obtaining good segmen-

361

tation quality was particularly difficult in views that exhibit overlap of the arterial tree with

362

the spine which drastically decreased vessel visibility. Moreover, we observed discontinuous

363

segmentations that arise from low vessel contrast.

Segmentation proved

Shift in mm, phase in a.u.

24

6

γ Cardiac phase

4 2 0 -2 -4 -6 20

40 60 80 100 Projection number

120 133

Shift in mm, phase in a.u.

(a) Clinical case R1 γ Cardiac phase

4 2 0 -2 -4 -6 20

40 60 80 100 Projection number

120 133

Shift in mm, phase in a.u.

(b) Clinical case R2 12 10 8 6 4 2 0 -2 -4 -6

γ Cardiac phase

20

40 60 80 100 Projection number

120 133

(c) Clinical case R3

FIG. 6. Shifts γ that maximize data consistency for the three clinical cases. Moreover, we show the corresponding cardiac time as derived from the ECG. Local maxima of the cardiac time coincide with R-peaks in the ECG and indicate ventricular systole.

364

Considering background estimation, we expect a deterioration of SSIM and RMSE for the

365

more noisy realizations. This is because the background estimate B is compared to noise-

366

free ground truth. The quantitative measures improved when noise-filtered [26] background

Error in mm

25 5·102 phot. εcr 1·104 phot. εcr No Comp. −10 mm

6 5 4 3 2 1 0.5 0 0

0.2

0.4

0.6

0.8

1

Reference heart phase

(a) Study Ph1 5·102 phot. εcr 1·104 phot. εcr No Comp.

Error in mm

2

1 0.5 0 0

0.2

0.4

0.6

0.8

1

Reference heart phase

(b) Study Ph2

FIG. 7. Compensation errors εcr at different reference heart phases cr . Note that the error curve for study Ph1 was shifted by −10 mm to improve visualization.

367

estimates were used indicating that the observed deteriorations originate, at least partly,

368

from noise characteristic. The proposed method was not able to achieve RMSE lower than

369

0.25 although theoretically possible (see right-most column in Table VI) indicating minor

370

inpainting errors. Yet, SSIM and RMSE of the background estimate are very close to the

371

respective upper bounds obtained using the noisy ground-truth. This observation suggests

372

that background estimation worked very well in most cases.

373

Segmentation errors propagate through the whole pipeline and erroneously segmented struc-

26 3

εcr No Comp.

Error in mm

2.5 2 1.5 1 0.5 0

0.2

0.4

0.6

0.8

1

Reference heart phase

(a) Clinical case R1

Error in mm

2.5

εcr No Comp.

2

1.5

1

0.5 0

0.2

0.4

0.6

0.8

1

Reference heart phase

(b) Clinical case R2

Error in mm

5

εcr No Comp. −2 mm

4

3 2.5 2 1.5 1 0

0.2

0.4

0.6

0.8

1

Reference heart phase

(c) Clinical case R3

FIG. 8. Compensation errors εcr at different reference heart phases cr for the clinical cases. Note that the error curve for study R3 was shifted by −2 mm to improve visualization.

27

TABLE VIII. Mean and minimal error with and without compensation stated in mm. For the phantom cases, we limit the analysis to the high photon count realization as the compensation errors are either the same or higher (see Figure 7). Ph1

Orig.

Ph2

Mean

Min.

Mean

Min.

14 ± 1

13

1.2 ± 0.6

0.39

Comp. 0.81 ± 0.23 0.26 0.75 ± 0.39 0.12

R1

R2

R3

Mean

Min.

Mean

Min.

Mean

Min.

1.9 ± 0.3

1.3

1.7 ± 0.2

1.4

6.2 ± 0.1

6.0

Comp. 1.0 ± 0.1 0.85 1.1 ± 0.1 0.95 2.7 ± 0.5

1.8

Orig.

374

tures, such as vertebrae, are pronounced in the binary mask as can be seen in the last row

375

of Figure 4. However, the presence of mis-segmentation is not as distinct in the subtraction

376

image, decreasing its influence on motion estimation. Moreover, we have found ECC to be

377

robust against statistical errors introduced in the subtraction pipeline which is supported

378

by the results presented in the following section.

379

B.

Motion estimation

380

From Figure 5 it is apparent that the respiratory motion modeled in study Ph1 is much

381

more drastic than for Ph2. A similar observation can be made when considering the clinical

382

cases shown in Figure 6, where cases R1 and R2 exhibit residual respiratory motion only

383

while scan R3 suffers from heavy intra-scan respiratory motion.

28 384

The surrogate for cardiac motion exhibited a higher correlation for the clinical cases. This

385

arises from the shape of the local maxima in γ that denote the start of ventricular con-

386

traction and are detected by the method detailed in Section II C: for the phantom data we

387

observed broader maxima that complicated exact localization (see Figure 5), whereas the

388

peaks signaling ventricular systole were sharp for the clinical cases, particularly for R1, as

389

can be seen in Figure 6. Small deviations in peak location heavily deteriorate Pearson corre-

390

lation as the surrogates have the form of sawtooth waves. Exemplarily, shifting a single peak

391

position to a neighboring frame yields a deterioration in Pearson correlation of 4.4 %. When

392

considering the motion compensation error at specific heart phases, we observed the lowest

393

error around ventricular diastole (cr ∼ 0.4 − 0.6). This result is to be expected, as cardiac

394

motion inside the gating window is negligible at diastole. However, some cases exhibit a

395

second local minimum at end systole that also coincides with an inflection point in γ. This

396

behavior is best observed in Figure 7b at cr ∼ 0.85 for the phantom study and in Figure 8b

397

at cr ∼ 0.9 for the clinical cases. In presence of drastic respiratory motion as in clinical case

398

R3, aforementioned behavior is suppressed. This observation is reflected in a high mean

399

error with a low standard deviation as can be understood from Table VIII and Figure 8c.

400

It is worth mentioning that the quantitative errors reported in Table VIII and Figures 7 and

401

8 depend on the choice of gating parameters (see Section II C), as they influence the gating

402

weights used in Equation 13. Wider gating windows increase the residual motion contained

403

in each gate which would, therefore, result in higher errors. The values a = 4 and w = 0.4

404

that were used here constitute a reasonable compromise between heart phase sensitivity and

405

residual cardiac motion [8, 35].

406

The reduction in error achieved for the clinical cases is substantial, yet an average minimal

407

error of 1.2 ± 0.5 mm (see Table VIII) at diastole remains. This residual error is a composite

29 408

of three different sources of error: incorrect estimates γi , a finite gating window width, but

409

also inaccuracies in manual key point tracking. Incorrect key point positions impose a fun-

410

damental lower bound on error reduction, unfortunately, direct assessment of this systematic

411

error is not possible for the real data cases. The results obtained on phantom data, however,

412

suggest that very accurate motion compensation even below the pixel size is possible with

413

the proposed algorithm.

C.

414

Limitations and challenges

415

First, the current optimization described in Section II B does not inhibit a systematic

416

shift of all detectors. Such shift is bounded by the maximally occurring motion and would

417

introduce cone-beam artifacts, the effect of which is negligibly compared to the drastic

418

motion artifacts. However, the method is, in principle, able to optimize for systematic drifts

419

[36], an extension that should be considered for future work.

420

Second, within this work we optimized for craniocaudal shifts only, a motion model that is

421

not sufficient in the most general case. However, respiratory motion of the coronary arteries

422

occurring in craniocaudal direction is substantially larger than in left-right or posterior-

423

anterior direction, respectively [21]. Compensation of the dominant motion component as

424

proposed here may allow for initial reconstructions and, therefore, enable methods based on

425

3D/2D registration that allow for more complex motion models [3, 16, 20].

426

Finally, the experiments were restricted to incomplete breathing cycles, an assumption that

427

was motivated by imperfect breath-hold. In previous work, we showed that the proposed

428

method is fit to handle multiple recurrences [23], yet, a demonstration on real data remains

429

subject of future work.

30 VI.

430

CONCLUSIONS

431

In summary, we proposed a projection domain algorithm targeted at intra-scan motion

432

assessment in rotational coronary angiography. The method is based on virtual single-

433

frame subtraction imaging, that uses segmentation and background estimation algorithms

434

to obtain non-truncated images of the contrast-filled lumen. Then, craniocaudal shifts were

435

estimated that maximize epipolar consistency within an acquisition. The shifts served as

436

basis for an image-based surrogate for cardiac motion and allow to compensate for intra-scan

437

respiratory motion. All figures of merit suggested good performance for both phantom and

438

clinical cases.

439

Future work will investigate extensions of the proposed method, e.g., with respect to motion

440

models, which is of particular interest for experimental imaging trajectories, such as dual axis

441

rotational angiography [37] or saddle trajectories [38]. Finally, the current single-frame sub-

442

traction pipeline relies on sophisticated segmentation and background estimation algorithms

443

that require thorough parameter tuning and exhibit long run times. Although the results

444

reported here are very promising, we believe that a less sophisticated background subtrac-

445

tion pipeline, e.g., based on morphological operations, would drastically ease the transition

446

to clinical application. Possibilities in this regard should, therefore, be investigated.

447

ACKNOWLEDGMENTS

448

The authors gratefully acknowledge funding of DFG MA 4898/3-1 and the Erlangen

449

Graduate School in Advanced Optical Technologies (SAOT) by the German Research Foun-

450

dation (DFG) in the framework of the German excellence initiative, and would like to thank

451

Dr. G¨ unter Lauritsch (Siemens Healthineers) for his valuable comments, and Dr. Christian

31 452

Schlundt and Dr. Michaela Hell (both University Hospital Erlangen) for providing the image

453

data.

CONFLICTS OF INTEREST

454

455

The authors have no relevant conflicts of interest to disclose.

456

[1] H Hetterich, T Redel, G Lauritsch, C Rohkohl, and J Rieber, “New X-ray imaging modal-

457

ities and their integration with intravascular imaging and interventions.” Int. J. Cardiovasc.

458

Imaging 26, 797–808 (2010).

459

[2] N E Green, S Y J Chen, A R Hansgen, J C Messenger, B M Groves, and J D Carroll, “Angio-

460

graphic Views Used for Percutaneous Coronary Interventions: A Three-Dimensional Analysis

461

of Physician-Determined vs. Computer-Generated Views,” Catheter. Cardiovasc. Interv. 64,

462

451–459 (2005).

463

[3] C Blondel, G Malandain, R Vaillant, and N Ayache, “Reconstruction of Coronary Arteries

464

from a Single rotational X-ray projection sequence.” IEEE Trans. Med. Imaging 25, 653–663

465

(2006).

466

467

[4] M Unberath, S Achenbach, R Fahrig, and A Maier, “Exhaustive Graph Cut-based Vasculature Reconstruction,” in Proc. ISBI (IEEE, 2016) pp. 1143–1146.

468

[5] J C Messenger, S Y J Chen, J D Carroll, J E B Burchenal, K Kioussopoulos, and B M

469

Groves, “3D coronary reconstruction from routine single-plane coronary angiograms: Clinical

470

validation and quantitative analysis of the right coronary artery in 100 patients,” Int. J. Card.

471

Imaging 16, 413–427 (2000).

32 472

473

[6] S C ¸ imen, A Gooya, M Grass, and A F Frangi, “Reconstruction of Coronary Arteries from X-ray Angiography: A Review,” Med. Image Anal. (2016).

474

[7] E Hansis, D Sch¨ afer, O D¨ ossel, and M Grass, “Projection-based motion compensation for

475

gated coronary artery reconstruction from rotational x-ray angiograms.” Phys. Med. Biol. 53,

476

3807–20 (2008).

477

[8] C Schwemmer, C Rohkohl, G Lauritsch, K M¨ uller,

and J Hornegger, “Residual motion

478

compensation in ECG-gated interventional cardiac vasculature reconstruction.” Phys. Med.

479

Biol. 58, 3717–37 (2013).

480

481

482

483

484

485

486

487

[9] U Jandt, D Sch¨ afer, M Grass, and V Rasche, “Automatic generation of 3D coronary artery centerlines using rotational X-ray angiography.” Med. Image Anal. 13, 846–58 (2009). [10] B Desjardins and E A Kazerooni, “ECG-gated Cardiac CT,” AJR Am. J. Roentgenol. 182, 993–1010 (2004). [11] J-J Sonke, L Zijp, P Remeijer, and M van Herk, “Respiratory correlated cone beam CT,” Med. Phys. 32, 1176 (2005). [12] T Li, A Koong, and L Xing, “Enhanced 4D cone-beam CT with inter-phase motion model,” Med. Phys. 34, 3688 (2007).

488

[13] M Brehm, P Paysan, M Oelhafen, P Kunz, and M Kachelrieß, “Self-adapting cyclic registra-

489

tion for motion-compensated cone-beam CT in image-guided radiation therapy,” Med. Phys.

490

39, 7603–7618 (2012).

491

[14] M Brehm, P Paysan, M Oelhafen, and M Kachelrieß, “Artifact-resistant motion estimation

492

with a patient-specific artifact model for motion-compensated cone-beam CT,” Med. Phys.

493

40, 1–13 (2013).

33 494

[15] M Brehm, S Sawall, J Maier, S Sauppe,

and M Kachelrieß, “Cardiorespiratory motion-

495

compensated micro-CT image reconstruction using an artifact model-based motion estima-

496

tion,” Med. Phys. 42, 1948–1958 (2015).

497

498

499

[16] A Brost, R Liao, N Strobel, and J Hornegger, “Respiratory motion compensation by modelbased catheter tracking during EP procedures,” Med. Image Anal. 14, 695–706 (2010). [17] G Shechter, F Devernay, E Coste-Mani`ere, A Quyyumi,

and E R McVeigh, “Three-

500

dimensional motion tracking of coronary arteries in biplane cineangiograms.” IEEE Trans.

501

Med. Imaging 22, 493–503 (2003).

502

[18] C Blondel, R Vaillant, G Malandain, and N Ayache, “3D tomographic reconstruction of

503

coronary arteries using a precomputed 4D motion field,” Phys. Med. Biol. 49, 2197–2208

504

(2004).

505

[19] Y Wang, S J Riederer, and R L Ehman, “Respiratory Motion of the Heart: Kinematics and

506

the Implications for the Spatial Resolution in Coronary Imaging,” Magn. Reson. Med. 33,

507

713–719 (1995).

508

[20] G Shechter, C Ozturk, J R Resar, and E R McVeigh, “Respiratory Motion of the Heart From

509

Free Breathing Coronary Angiograms.” IEEE Trans. Med. Imaging 23, 1046–56 (2004).

510

[21] G Shechter, J R Resar, and E R McVeigh, “Displacement and Velocity of the Coronary

511

512

513

514

515

Arteries: Cardiac and Respiratory Motion.” IEEE Trans. Med. Imaging 25, 369–75 (2006). [22] A Aichert, M Berger, J Wang, N Maass, A Doerfler, J Hornegger, and A Maier, “Epipolar Consistency in Transmission Imaging,” IEEE Trans. Med. Imaging 34, 2205–2219 (2015). [23] M Unberath, A Aichert, S Achenbach, and A Maier, “Single-frame Subtraction Imaging,” in Proc. CT Meeting (2016) pp. 89–92.

34 516

517

[24] W P Segars, M Mahesh, T J Beck, E C Frey, and B M W Tsui, “Realistic CT simulation using the 4D XCAT phantom,” Med. Phys. 35, 3800–3808 (2008).

518

[25] A Maier, H G Hofmann, C Schwemmer, J Hornegger, A Keil, and R Fahrig, “Fast simulation

519

of x-ray projections of spline-based surfaces using an append buffer,” Phys. Med. Biol. 57,

520

6193–6210 (2012).

521

522

523

524

525

526

[26] C Tomasi and R Manduchi, “Bilateral filtering for gray and color images,” in Proc. ICCV (IEEE, 1998) pp. 839–846. [27] A F Frangi, W J Niessen, K L Vincken, and M A Viergever, “Multiscale vessel enhancement filtering,” in Proc. MICCAI (Springer, 1998) pp. 130–137. [28] T M Koller, G Gerig, G Szekely, and D Dettwiler, “Multiscale detection of curvilinear structures in 2-D and 3-D image data,” in Proc. ICCV (IEEE, 1995) pp. 864–869.

527

[29] M Berger, C Forman, C Schwemmer, J H Choi, K M¨ uller, A Maier, J Hornegger, and R Fahrig,

528

“Automatic Removal of Externally Attached Fiducial Markers in Cone Beam C-arm CT,” in

529

Proc. BVM (2014) pp. 168–173.

530

531

532

533

[30] A Aach and V Metzler, “Defect interpolation in digital radiography - how object-oriented transform coding helps,” in Proc. SPIE Med. Imaging (2001) pp. 824–835. [31] F J Harris, “On the use of windows for harmonic analysis with the discrete Fourier transform,” Proc. IEEE 66, 51–83 (1978).

534

[32] W R Brody, “Digital subtraction angiography,” IEEE Trans. Nucl. Sci 29, 1176–1180 (1982).

535

[33] M Berger, A Maier, Y Xia, J Hornegger, and R Fahrig, “Motion Compensated Fan-Beam CT

536

by Enforcing Fourier Properties of the Sinogram,” in Proc. CT Meeting (2014) pp. 329–332.

537

[34] A Rodr´ıguez-Molinero, L Narvaiza, J Ruiz, and C G´alvez-Barr´on, “Normal respiratory rate

538

and peripheral blood oxygen saturation in the elderly population,” J. Am. Geriatr. Soc. 61,

35 539

2238–2240 (2013).

540

[35] M Unberath, K Mentl, O Taubmann, S Achenbach, R Fahrig, J Hornegger, and A Maier,

541

“Torsional Heart Motion in Cone-beam Computed Tomography Reconstruction,” in Proc.

542

Fully3D (2015) pp. 651–654.

543

[36] C Debbeler, N Maass, M Elter, F Dennerlein, and T M Buzug, “A new CT rawdata redun-

544

dancy measure applied to automated misalignment correction,” in Proc. Fully3D (2013) pp.

545

264–267.

546

547

548

549

[37] A J P Klein and J A Garcia, “Rotational coronary angiography,” Cardiol. Clin. 27, 395–405 (2009). [38] J D Pack, F Noo, and H Kudo, “Investigation of saddle trajectories for cardiac CT imaging in cone-beam geometry,” Phys. Med. Biol. 49, 2317 (2004).