
Image Thresholding by Histogram Segmentation Using Discriminant Analysis

Agus Zainal Arifin1 and Akira Asano2

1 Graduate School of Engineering, Hiroshima University. Email: [email protected]
2 Division of Mathematical and Information Sciences, Faculty of Integrated Arts and Sciences, Hiroshima University. Email: [email protected]

Abstract Image segmentation is often used to distinguish the foreground from the background. This paper proposes a novel image thresholding method that segments the histogram optimally by organizing clusters according to the similarity between adjacent clusters. Because the method is not based on the minimization of a function, it avoids the problem of selecting a threshold at a local minimum, which affects most conventional methods. Instead, it maximizes the between-class variance and minimizes the within-class variance. Agglomerative clustering is used to merge pairs of adjacent clusters in the histogram. The distance measure, based on discriminant analysis, is adapted from the criterion function defined by Otsu. It evaluates the goodness of every pair of adjacent clusters directly, and the closest pair is grouped automatically. The pair selected as most similar is the one whose union is most homogeneous and whose means are closest. These steps are repeated until only two clusters remain. The method extends straightforwardly to multi-level thresholding by stopping the grouping once the desired number of segments is reached. Results of automatic thresholding on experimental images demonstrate the validity of the method.

1. Introduction Image segmentation is essential to image processing and pattern recognition, and the quality of the final analysis depends on it. Image segmentation is the process of dividing an image into different regions. One special kind of segmentation is thresholding, which attempts to classify each image pixel into one of two categories (e.g. foreground and background). After thresholding, each object in the image, represented by a set of pixels, is isolated from the rest of the scene. The aim is therefore to find a critical value, or threshold. The most straightforward approach is to pick a fixed grayscale value as the threshold and classify each pixel by checking whether its gray level lies above or below this value. In general, the threshold should be located at an obvious, deep valley of the histogram. The histogram of a well-defined image, in particular, has a deep valley between two peaks, so the optimum threshold value can be found in the valley region [5]. One extremely simple way to find a suitable threshold is to locate the modes (local maxima) and then find the valley (minimum) between them [1].

Theoretically, the optimal threshold value can be determined according to the Bayes rule if the pixel distributions of both classes are known [1]. What we have in practice, however, is not two separate distributions but a mixture of both, as shown in the histogram. Some assumptions about the forms of the two distributions are therefore needed to simplify the problem. Several techniques have been proposed to approximate the mixture. One of them approximates the histogram, in the least-squares sense, by a sum of Gaussian distributions. A set of parameters that fits the probability model to the image histogram is found by minimizing the mean square error between the actual probability density function and the model. The parameters are selected iteratively using a nonlinear optimization method [2]. Because such a method relies on iterative computation, however, the final solution depends heavily on the initial value.

Many thresholding techniques use a criterion-based approach to select the most suitable gray level as the threshold value. One of the oldest is Otsu's thresholding method, which uses discriminant analysis to find the maximum separability of the classes [3]. For every candidate threshold value, Otsu (1979) evaluated how good that value would be as the threshold. This evaluation takes into account the heterogeneity between the two classes and the homogeneity within each class. Kittler and Illingworth (1986) also took a criterion-based approach, obtaining the minimum-error threshold under the assumption that the background and the foreground each follow a Gaussian distribution [4].
Criterion-based methods are effective and efficient for determining a threshold value. However, as the number of threshold values increases, the computational complexity grows exponentially. In addition, these methods work very well for bimodal or nearly bimodal histograms [5]; for unimodal and multimodal histograms, however, the separation between the two classes is not clear. In this paper, we propose a novel method that splits the image histogram based on a measure of similarity between sub-clusters of gray levels. Because the proposed method is not based on the minimization of a function, it avoids the problem of selecting a threshold value at a local minimum [4].
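
To make the criterion-based idea concrete, the following is a minimal Python sketch of an exhaustive search in the spirit of Otsu's method [3]: every candidate gray level is scored by the between-class variance it would produce, and the best-scoring level is returned. It is an illustration under stated assumptions, not the implementation used in this paper. The function name `otsu_threshold`, the plain-list histogram input, and the convention that class 0 holds gray levels less than or equal to the threshold are all choices made for this example.

```python
def otsu_threshold(hist):
    """Return the gray level t with the largest between-class variance.

    hist[k] is the number of pixels with gray level k (an assumed input
    format). Class 0 holds levels <= t and class 1 holds levels > t.
    If no level separates the pixels into two non-empty classes, 0 is returned.
    """
    total = sum(hist)
    if total <= 0:
        raise ValueError("histogram is empty")

    total_moment = sum(level * count for level, count in enumerate(hist))

    best_t, best_score = 0, -1.0
    n0 = 0        # number of pixels in class 0 so far
    moment0 = 0   # sum of gray levels of the pixels in class 0 so far

    for t, count in enumerate(hist):
        n0 += count
        moment0 += t * count
        n1 = total - n0
        if n0 == 0 or n1 == 0:
            continue  # one class is empty, so t does not split the histogram

        mean0 = moment0 / n0
        mean1 = (total_moment - moment0) / n1
        # Between-class variance: w0 * w1 * (mean0 - mean1)^2
        score = (n0 / total) * (n1 / total) * (mean0 - mean1) ** 2

        if score > best_score:  # strict '>' keeps the lowest level on ties
            best_t, best_score = t, score

    return best_t
```

With a threshold `t` from this sketch, a pixel would be labeled foreground when its gray level exceeds `t` and background otherwise. The search visits every gray level once for a single threshold; extending the same exhaustive scoring to several thresholds at once is what makes the cost grow quickly, as noted above.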

2. Proposed Method The multi-level thresholding problem is closely related to the clustering problem, in which the image is segmented into several classes [6]. The proposed method uses iterative cluster unification to build a dendrogram until two groups of gray levels are obtained. Initially, each gray level is assumed to belong to a different cluster. If K gray levels are used in the image, then there are K classes, C1, C2, … CK, where gray level Tk is contained in Ck, and the gray levels satisfy T1