
EUROCON 2007 The International Conference on “Computer as a Tool”

Warsaw, September 9-12

Unsupervised Algorithm for Radiographic Image Segmentation Based on the Gaussian Mixture Model

Faiza Mekhalfa*, Nafaâ Nacereddine* and Aïcha Baya Goumeïdane*

Welding and NDT Research Center/Image and Signal Processing Laboratory, Cheraga, Algeria
[email protected], [email protected], [email protected]

Abstract—In this paper we study an unsupervised algorithm for radiographic image segmentation based on Gaussian mixture models (GMMs). Gaussian mixture models constitute a well-known type of probabilistic neural network, and image segmentation is one of their many successful applications. The mixture model parameters are estimated using the expectation maximization (EM) algorithm. Numerical experiments on radiographic images illustrate the superior performance of the EM method, in terms of segmentation accuracy, compared to the fuzzy c-means algorithm.

Keywords—Weld defect, radiographic images, image segmentation, Gaussian mixture model, expectation maximization algorithm, fuzzy c-means algorithm.

1-4244-0813-X/07/$20.00 2007 IEEE.

I. INTRODUCTION

To evaluate the quality of welded joints, many nondestructive testing (NDT) methods are used. Radiographic testing is one of the major NDT methods for examining weld defects. Inspected zones may present various defects such as porosity, inclusion, crack, lack of penetration, etc. [1]. The radiographic films obtained are examined by interpreters, whose task is to detect, recognize and quantify possible defects and to accept or reject them by referring to nondestructive testing codes and standards. Detecting defects in a radiogram is sometimes very difficult because of the poor quality of the films, the weld thickness, and the small size of the defects. Accurate knowledge of the geometry of these weld defects is an essential step in assessing the quality of the weld. Progress in computer science and artificial intelligence techniques has allowed defect detection and classification to be carried out using digital image processing and pattern recognition tools [2] [3].

Segmentation is one of the most significant problems in an image analysis system, because the result obtained at the end of this stage governs the final quality of interpretation. There are primarily two image segmentation approaches, supervised and unsupervised. Often, image segmentation must be done in an unsupervised fashion, in that training data is not available and the class-conditioned feature vectors must be estimated directly from the data.

In this paper, we propose to use an unsupervised model-based segmentation approach based upon the expectation maximization (EM) algorithm to segment radiogram images of weld defects, mainly into three classes: the weld defect, the welded joint and the base metal. Here we cast the segmentation problem as an incomplete data problem in which the intensity data is observable, the underlying spatial state map is unobservable or missing, and the model parameters need to be determined [4] [5]. For performance comparison, we run the same experiments with the fuzzy c-means (FCM) algorithm [6].

The rest of the paper is organized as follows. In Section II, we explain the basic principles of the FCM algorithm, Gaussian mixtures and the EM algorithm. In Section III, we apply the EM algorithm to radiographic image segmentation. Some conclusions are given in Section IV.

II. PROBLEM FORMULATION

A. The Fuzzy C-Means Algorithm

The fuzzy c-means (FCM) algorithm is one of the most widely used fuzzy clustering algorithms. The FCM algorithm attempts to partition a finite collection of elements $X = \{x_1, x_2, \ldots, x_N\}$ into $c$ fuzzy clusters with respect to some given criterion. Given a finite set of data, the algorithm returns a list of $c$ cluster centers $V$ such that

$$V = \{v_k\}; \quad k = 1, 2, \ldots, c$$

and a partition matrix $U$ such that

$$U = [u_{ik}]; \quad i = 1, 2, \ldots, N; \quad k = 1, 2, \ldots, c$$

where $u_{ik}$ is a numerical value in [0, 1] that gives the degree to which the element $x_i$ belongs to the $k$th cluster. The following is a linguistic description of the FCM algorithm [7]:

Step 1: Fix the number of clusters $c$ ($2 \le c \le N$), the exponential weight $\beta$ ($\beta > 1$) and the termination criterion $\varepsilon$. Initialize the membership function $U^{(t=0)}$ to random values. Select the metric $d$.

Step 2: Calculate the fuzzy cluster centers $\{v_k^{(t)}\}$ using $U^{(t)}$:

$$v_k^{(t)} = \frac{\sum_{i=1}^{N} \left(u_{ik}^{(t)}\right)^{\beta} x_i}{\sum_{i=1}^{N} \left(u_{ik}^{(t)}\right)^{\beta}} \qquad (1)$$

Step 3: Update the membership (fuzzification) function $U^{(t)}$ using $\{v_k^{(t)},\ k = 1, \ldots, c\}$:

$$u_{ik}^{(t+1)} = \frac{1}{\sum_{j=1}^{c} \left(d_{ik} / d_{ij}\right)^{2/(\beta - 1)}} \qquad (2)$$

where $d_{ik}$ is the distance, under the metric $d$, between $x_i$ and the center $v_k^{(t)}$.

Step 4: Compare $U^{(t)}$ to $U^{(t+1)}$: if $\|U^{(t+1)} - U^{(t)}\| \le \varepsilon$, stop; otherwise set $t = t + 1$ and go to Step 2.

Step 5: Defuzzification: for each element $x_i$ ($1 \le i \le N$), if $u_{ik_0} = \max_k (u_{ik})$, then set $u_{ik_0} = 1$ and $u_{ik} = 0$ for all $k \ne k_0$ ($1 \le k \le c$). The pattern $x_i$ then belongs to the class $k_0$.

B. Gaussian Mixture Model

Consider a mixture model with $M$ ($M > 1$) components in $R^n$ for $n \ge 1$:

$$p(x) = \sum_{m=1}^{M} \alpha_m \, p(x / m) \qquad (3)$$

where $\alpha_m \in [0, 1]$ ($\forall m = 1, 2, \ldots, M$) are the mixing proportions, subject to

$$\sum_{m=1}^{M} \alpha_m = 1$$

For Gaussian mixtures, each component density $p(x / m)$ is a normal probability distribution:

$$p(x / \theta_m) = \frac{1}{(2\pi)^{n/2} \det(C_m)^{1/2}} \exp\left\{ -\frac{1}{2} (x - \mu_m)^T C_m^{-1} (x - \mu_m) \right\} \qquad (4)$$

where $T$ denotes the transpose operation. Here we encapsulate these parameters into a parameter vector, writing the parameters of each component as $\theta_m = (\mu_m, C_m)$, to get $\Theta = (\alpha_1, \alpha_2, \ldots, \alpha_M, \theta_1, \theta_2, \ldots, \theta_M)$. Then equation (3) can be rewritten as

$$p(x / \Theta) = \sum_{m=1}^{M} \alpha_m \, p(x / \theta_m) \qquad (5)$$

If we knew the component from which $x$ came, then it would be simple to determine the parameters $\Theta$. Similarly, if we knew the parameters $\Theta$, we could determine the component that would be most likely to have produced $x$. The difficulty is that we know neither. However, the EM algorithm [8] can be introduced to deal with this difficulty through the concept of missing data.

C. The EM Algorithm

The expectation maximization (EM) algorithm is a general approach to the iterative computation of local maximum likelihood (ML) estimates when the observations can be viewed as incomplete data. The EM algorithm alternates between finding a greatest lower bound to the likelihood function and then maximizing this bound. Given a set of samples $X = \{x_1, x_2, \ldots, x_N\}$, the complete data set $Z = (X, Y)$ consists of the sample set $X$ and a set $Y$ of variables indicating from which component of the mixtures the sample came.

Now we discuss how to estimate the parameters of the Gaussian mixtures with the EM algorithm [9]. The usual EM algorithm consists of an E-step and an M-step. Suppose that $\Theta^{(t)}$ denotes the estimate of $\Theta$ obtained after the $t$th iteration of the algorithm. Then, at the $(t+1)$th iteration, the E-step computes the expected complete data log-likelihood function

$$Q(\Theta / \Theta^{(t)}) = \sum_{k=1}^{N} \sum_{m=1}^{M} \left\{ \log \left[ \alpha_m \, p(x_k / \theta_m) \right] \right\} P(m / x_k; \Theta^{(t)}) \qquad (6)$$

where $P(m / x_k; \Theta^{(t)})$ is a posterior probability, computed as

$$P(m / x_k; \Theta^{(t)}) = \frac{\alpha_m^{(t)} \, p(x_k / \theta_m^{(t)})}{\sum_{l=1}^{M} \alpha_l^{(t)} \, p(x_k / \theta_l^{(t)})} \qquad (7)$$

and the M-step finds the $(t+1)$th estimate $\Theta^{(t+1)}$ of $\Theta$ by maximizing $Q(\Theta / \Theta^{(t)})$:

$$\alpha_m^{(t+1)} = \frac{1}{N} \sum_{k=1}^{N} P(m / x_k; \Theta^{(t)}) \qquad (8)$$

$$\mu_m^{(t+1)} = \frac{\sum_{k=1}^{N} x_k \, P(m / x_k; \Theta^{(t)})}{\sum_{k=1}^{N} P(m / x_k; \Theta^{(t)})} \qquad (9)$$

$$C_m^{(t+1)} = \frac{\sum_{k=1}^{N} P(m / x_k; \Theta^{(t)}) \left\{ (x_k - \mu_m^{(t+1)}) (x_k - \mu_m^{(t+1)})^T \right\}}{\sum_{k=1}^{N} P(m / x_k; \Theta^{(t)})} \qquad (10)$$

III. EM ALGORITHM FOR IMAGE SEGMENTATION

A. Statistical Image Model

We suppose that an image consists of a set of disjoint pixels labeled 1 to $N$, and each pixel is assumed to belong to one of $M$ distinct classes. We let $x_i$ denote the $n$-dimensional feature vector observed from the $i$th pixel ($i = 1, \ldots, N$), which encapsulates image attributes such as position, color and texture information. We use $y_i \in \{1, 2, \ldots, M\}$ to indicate from which class pixel $i$ came. We write $X = \{x_1, x_2, \ldots, x_N\}$ for the observed image data, and $Y = \{y_1, y_2, \ldots, y_N\}$ for a realization of an $M$-class label field. The process of segmentation is to find the $Y$ that represents the correct class at each pixel site given the image $X$; thus, the segmentation process is a labeling process.

To find a solution, we model the segmentation problem using a statistical model [4] [5]. We usually adopt a finite mixture model to represent the marginal distribution of the feature vector $x_i$ observed from the $i$th pixel ($i = 1, \ldots, N$). The segmentation consists of two steps. The first step is mixture estimation: the EM method is used to estimate the mixture parameters. The second step is pixel clustering: the segmentation is carried out by assigning each pixel to the proper cluster according to the maximum likelihood (ML) estimation.
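The two-step procedure described above (mixture estimation by EM, then pixel labeling) can be sketched for scalar pixel intensities as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `gaussian_pdf`, `em_gmm_1d` and `segment`, the fixed iteration count, the variance floor `min_var`, the optional `init_means` argument and the narrowed initial variance are all hypothetical choices. Labeling here picks the component with the largest weighted density, which is one common reading of the ML assignment mentioned in the text.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Scalar (n = 1) case of the normal component density, eq. (4)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(pixels, n_classes, n_iter=50, init_means=None, seed=0, min_var=1e-6):
    """Estimate mixing proportions, means and variances of a 1-D Gaussian mixture."""
    n = len(pixels)
    rng = random.Random(seed)
    # Initialisation: means sampled at random from the data unless supplied.
    if init_means is not None:
        means = list(init_means)
    else:
        means = rng.sample(list(pixels), n_classes)
    grand_mean = sum(pixels) / n
    grand_var = sum((x - grand_mean) ** 2 for x in pixels) / n
    # Assumption: every class starts with the same, deliberately narrowed variance.
    variances = [max(grand_var / n_classes ** 2, min_var)] * n_classes
    weights = [1.0 / n_classes] * n_classes

    for _ in range(n_iter):
        # E-step: posterior P(m / x_k; Theta^(t)), eq. (7).
        post = []
        for x in pixels:
            joint = [w * gaussian_pdf(x, mu, v)
                     for w, mu, v in zip(weights, means, variances)]
            total = sum(joint)
            if total > 0.0:
                post.append([j / total for j in joint])
            else:
                # Numerical underflow for every class: fall back to a uniform posterior.
                post.append([1.0 / n_classes] * n_classes)
        # M-step: closed-form updates, eqs. (8)-(10) in the scalar case.
        for m in range(n_classes):
            resp = max(sum(p[m] for p in post), 1e-300)
            weights[m] = resp / n
            means[m] = sum(p[m] * x for p, x in zip(post, pixels)) / resp
            var_m = sum(p[m] * (x - means[m]) ** 2 for p, x in zip(post, pixels)) / resp
            variances[m] = max(var_m, min_var)
    return weights, means, variances


def segment(pixels, weights, means, variances):
    """Label each pixel with the class of largest weighted density."""
    labels = []
    for x in pixels:
        scores = [w * gaussian_pdf(x, mu, v)
                  for w, mu, v in zip(weights, means, variances)]
        labels.append(max(range(len(scores)), key=scores.__getitem__))
    return labels


# Synthetic intensities standing in for a flattened gray-level image.
demo_rng = random.Random(1)
demo_pixels = [demo_rng.gauss(mu, 8.0) for mu in (50.0, 120.0, 200.0) for _ in range(300)]
demo_params = em_gmm_1d(demo_pixels, 3)
demo_labels = segment(demo_pixels, *demo_params)
```

Because EM only reaches a local optimum, the result depends on the initial means; passing `init_means` or trying several seeds is a simple way to probe that sensitivity.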

B. Experimental Results

To assess the performance of the EM-based segmentation method, a set of five radiographic testing images representing weld defects (porosity, external undercut, tungsten inclusion, lack of penetration and crack) is used in the experiments. Our objective is to apply this method to extract the defect from the welded joint on the one hand, and to extract the welded joint from the base metal on the other hand. Most of the radiographic films that we digitized were taken from the collection of standard films provided by the International Institute of Welding.

The expectation maximization algorithm is applied to perform the segmentation process. The radiographic film images contain weld defects located in the welded joint with different intensities. For such images, intensity is a distinguishing feature that can be used to segment the images. The Gaussian mixture parameters are initialized by random sampling, and the number of classes is specified by the user. In order to evaluate the performance of this method comparatively, we have also carried out radiographic image segmentation using the FCM algorithm.

The segmentation results for radiographic images of weld defects are illustrated in Fig. 1. First, we compare the segmentation results of the EM and FCM methods. Then, we present in Fig. 2 the Gaussian mixtures estimated by the EM algorithm for our test images, together with their corresponding histograms.

By examining the segmentation results, we can deduce that the EM algorithm gives more accurate results over the whole set of images. In the case of image 1, the result obtained by the EM algorithm is similar to that given by the FCM algorithm. For image 2, the defect (external undercut), the welded joint and the base metal are clearly brought out by the EM algorithm, whereas with the FCM method we cannot distinguish the defect. For image 3, the FCM algorithm merges the welded joint and the background (base metal) into one class, while the EM algorithm separates them correctly. In this case, the weld defect (tungsten inclusion) is well extracted by both algorithms. The better result for image 5 is also provided by the EM algorithm, which can differentiate the defect (crack) from the other region classes in spite of the noisy and blurred nature of the image.

For image 4, while there are some spurious regions obtained by the EM algorithm, the image is under-segmented by the FCM algorithm. This substandard performance can be explained by two facts: the EM algorithm tends to converge to a local optimum, and, in weld defect images, the welded joint intensity is variable, so the overlap between the region classes is large. This overlap is due to the weld thickness variations, the small size of the defect and the geometrical considerations related to the radiography technique used.
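For reference, the FCM baseline used in this comparison (Steps 1 to 5 of Section II-A) can be sketched for scalar pixel intensities as follows. This is an illustrative sketch rather than the code used in the experiments: the function name `fcm_1d`, the absolute-difference metric, the maximum-absolute-change norm in Step 4 and the default values of `beta`, `eps` and `max_iter` are assumptions.

```python
import random


def fcm_1d(data, c, beta=2.0, eps=1e-5, max_iter=200, seed=0):
    """Fuzzy c-means on scalar data; returns (centers, memberships, hard labels)."""
    rng = random.Random(seed)
    n = len(data)
    power = 2.0 / (beta - 1.0)

    # Step 1: random membership matrix, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-12 for _ in range(c)]
        total = sum(row)
        u.append([r / total for r in row])

    for _ in range(max_iter):
        # Step 2: fuzzy cluster centers, eq. (1).
        centers = []
        for k in range(c):
            num = sum((u[i][k] ** beta) * data[i] for i in range(n))
            den = sum(u[i][k] ** beta for i in range(n))
            centers.append(num / den)

        # Step 3: membership update, eq. (2), with d = |x - v| (assumed metric).
        new_u = []
        for x in data:
            d = [abs(x - v) for v in centers]
            if min(d) == 0.0:
                # The point coincides with a center: give it full membership there.
                k0 = d.index(0.0)
                new_u.append([1.0 if k == k0 else 0.0 for k in range(c)])
            else:
                new_u.append([1.0 / sum((d[k] / d[j]) ** power for j in range(c))
                              for k in range(c)])

        # Step 4: stop when the largest membership change is at most eps.
        change = max(abs(new_u[i][k] - u[i][k]) for i in range(n) for k in range(c))
        u = new_u
        if change <= eps:
            break

    # Step 5: defuzzification, each element goes to its highest-membership class.
    labels = [max(range(c), key=row.__getitem__) for row in u]
    return centers, u, labels


# Synthetic two-class intensities as a stand-in for image data.
demo_rng = random.Random(2)
demo_data = [demo_rng.gauss(0.0, 5.0) for _ in range(100)]
demo_data += [demo_rng.gauss(100.0, 5.0) for _ in range(100)]
demo_centers, demo_u, demo_labels = fcm_1d(demo_data, 2)
```

Unlike the mixture model, this scheme carries no per-class variance or mixing proportion, which is one plausible reason the two methods can disagree on classes of very different spread or size.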

Fig. 1. Radiographic image segmentation results: left: original images, middle: by FCM algorithm, right: by EM algorithm. Rows, from top to bottom: Image 1 (porosity), Image 2 (external undercut), Image 3 (tungsten inclusion), Image 4 (lack of penetration), Image 5 (crack).


Fig. 2. (a) From top to bottom: the Gaussian mixtures estimated by the EM algorithm for images (1)-(5), respectively (horizontal axis: X; vertical axis: probability density). (b) The corresponding histograms.
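A comparison of the kind shown in Fig. 2, a fitted mixture density against the gray-level histogram, can be computed along the following lines. The helper names `mixture_pdf` and `normalized_histogram` are hypothetical, and the parameter values in the example are made up for illustration; they are not the values estimated in the paper.

```python
import math


def mixture_pdf(x, weights, means, variances):
    """Value at x of a 1-D Gaussian mixture, eq. (5) in the scalar case."""
    return sum(
        w * math.exp(-0.5 * (x - mu) ** 2 / v) / math.sqrt(2.0 * math.pi * v)
        for w, mu, v in zip(weights, means, variances)
    )


def normalized_histogram(pixels, levels=256):
    """Relative frequency of each gray level, clamped to [0, levels - 1]."""
    counts = [0] * levels
    for p in pixels:
        counts[min(max(int(p), 0), levels - 1)] += 1
    n = len(pixels)
    return [count / n for count in counts]


# Illustrative (made-up) three-class parameters, evaluated on a gray-level grid.
example_weights = [0.5, 0.3, 0.2]
example_means = [60.0, 120.0, 200.0]
example_variances = [100.0, 225.0, 400.0]
curve = [mixture_pdf(float(g), example_weights, example_means, example_variances)
         for g in range(256)]
```

Plotting `curve` over the output of `normalized_histogram` for the same image gives a quick visual check of how well the estimated mixture follows the intensity distribution.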


IV. CONCLUSION

In this work we have investigated experimentally the effectiveness of the EM-based segmentation method on radiographic images of weld defects. The observed image is approximated by a mixture of multivariate normal densities. Each pixel is considered as a one-dimensional input to the mixture. The segmentation is carried out by ML estimation. We have shown some preliminary results obtained on a set of radiographic images. The practical behavior of the EM and FCM algorithms has been compared in different situations on radiogram images. We conclude from the experiments that the EM algorithm is useful and stable, provides satisfactory segmentation results and in many cases outperforms the FCM algorithm.

REFERENCES

[1] J. H. Rogerson, “Defects in welds: Their prevention and their significance”. Applied Science Publishers, 1983.

[2] Y. Suga, K. Kojima, T. Tominago, “Detection of weld defects by computer aided X ray radiography image processing”. International Journal of Offshore and Polar Engineering, vol. 5, no. 2, June 1995.

[3] N. Nacereddine, L. Hamami, D. Ziou, “Image thresholding for weld defect extraction in industrial radiographic testing”. International Journal of Signal Processing, vol. 3, no. 4, 2006, pp. 257-265.

[4] J. Cornelis, E. Nyssen, A. Katartizis, L. Van Kempen, P. Boekaerts, R. Deklerck, A. Salomie, “Statistical models for multidisciplinary applications of image segmentation and labelling”. Proceedings of ICSP 2000, pp. 2103-2110.

[5] Y. Wu, X. Yang, K. L. Chan, “Unsupervised color image segmentation based on Gaussian mixture model”. Fourth International Conference on Information, Communications & Signal Processing, Fourth IEEE Pacific-Rim Conference on Multimedia, Singapore, 15-18 Dec. 2003.

[6] M. B. Carvalho, C. Joe Gau, G. T. Herman, T. Yung Kong, “Algorithms for fuzzy segmentation”. Journal of Pattern Analysis and Applications, vol. 2, no. 1, April 1999, pp. 73-81.

[7] N. R. Pal, K. Pal, J. M. Keller, J. C. Bezdek, “A possibilistic Fuzzy C-Means clustering algorithm”. IEEE Transactions on Fuzzy Systems, vol. 13, no. 4, Aug. 2005.

[8] A. P. Dempster, N. M. Laird, D. B. Rubin, “Maximum likelihood from incomplete data via the EM algorithm”. Journal of the Royal Statistical Society, Ser. B, vol. 39, pp. 1-38, 1977.

[9] Z. Zhang, C. Chen, J. Sun, K. L. Chan, “EM algorithms for Gaussian mixtures with split and merge operation”. Pattern Recognition 36 (2003), pp. 1973-1983.
