AUTOMATIC LANDMARK DETECTION IN UTERINE CERVIX IMAGES FOR INDEXING IN A CONTENT-RETRIEVAL SYSTEM

Gali Zimmerman, Shiri Gordon and Hayit Greenspan

Biomedical Engineering Dept, Faculty of Engineering, Tel-Aviv University, Tel-Aviv 69978, Israel

ABSTRACT

This work is motivated by the need for visual information extraction and management in the growing field of content-based image retrieval from medical archives. In particular it focuses on a unique medical repository of cervicographic images (“Cervigrams”) collected by the National Cancer Institute, National Institutes of Health, to study the evolution of lesions related to cervical cancer. The paper briefly presents a framework for cervigram segmentation and labelling, focusing on the identification of two anatomical landmarks: the cervix boundary and the os. These landmarks are identified based on their convexity, using adequate mathematical tools. Segmentation results are exemplified and an initial validation is carried out on a subset of 120 manually labelled cervigrams.

1. INTRODUCTION

Cervicography is one of the methods for cervical cancer screening that uses visual testing based on the color change of cervix tissues when exposed to acetic acid. This method helps to detect abnormal cells that turn white (Aceto-White) following the application of the acetic acid. It is inexpensive and is suitable for resource-poor regions where cervical cancer is the most common cancer affecting women and the available treatment can be complex and expensive.

The National Cancer Institute (NCI) has collected a vast amount of biomedical information related to cervical cancer evolution, including around 60,000 cervigrams along with their diagnostic categorization by an expert and additional biomedical information (Pap smear, cytology). The NCI, together with the National Library of Medicine (NLM), is developing a unique Web-based database of the digitized cervix images.

This work is part of an on-going effort towards the generation of content-based image retrieval (CBIR) capabilities for the cervigrams database. Developing these CBIR capabilities requires the detection of regions of medical interest, as well as anatomical landmarks, which are important for cervigram content analysis and indexing. Important landmarks include, for example, the os, which is the opening of the cervix into the vagina, and the cervix boundary. The tissue types of interest include: the squamous epithelium (SE), which is smooth and pink; the columnar epithelium (CE), which appears red and irregular; and the acetowhite (AW) region, which is relatively smooth and light (Figure 1)¹. Various tissue parameters, such as color, size and location relative to the os, can be extracted from each of these tissues and can be used for indexing and retrieval.

Fig. 1. A typical cervigram and the various regions within it

Cervix images possess unique characteristics that hamper their automated analysis. Artifacts such as specular reflections and shading are generated in every cervigram due to the acquisition setup and the convex shape of the cervix. Complex texture patterns, the narrow dynamic range of the colors and the lack of clear boundaries between the regions generate additional challenges (Figure 1 and Figure 4). The large diversity in visual appearance of the cervigrams calls for robust and adaptive segmentation approaches.

A limited number of studies, all in their very preliminary stages, address the task of automated cervical image analysis (e.g. [1], [2], [3], [4]). Preliminary segmentation efforts were recently introduced by the authors [5]. The current paper briefly presents the overall automated cervigram segmentation system. The main focus of the paper is on the identification of the two anatomical landmarks: the cervix boundary and the os. Identifying these landmarks improves the content analysis quality of the cervigrams and enables the extraction of additional measurements.

¹ A colored version will be available at the author’s web page.

2. SYSTEM DESCRIPTION

The overall automated cervigram segmentation system is based on a multi-stage process, described in Figure 2.

Fig. 2. Block diagram of the multi-stage analysis process

The first preprocessing step locates the approximate cervical region within the cervigram. This region of interest (ROI) excludes as much irrelevant information as possible, while making sure that the entire cervical area is included. Image pixels are automatically clustered in the $\{a, d\}$ feature space, where $a$ is a color channel of the CIE-Lab color space and $d$ is the pixel's distance from the image center. The cluster that has the lowest mean($d$) and the highest mean($a$) is identified as the ROI, which is a relatively pink region located around the image center. Subsequent steps of the process are performed within this ROI.

The second preprocessing step resolves the problem of specular reflections. A novel algorithm that detects and eliminates the specularities in cervigrams while preserving the original image features was devised for this purpose [6].

Following the preprocessing steps, the cervix boundaries are extracted as described in Section 3. The os is detected next, as described in Section 4. The subsequent tissue segmentation steps, which are not part of this contribution, receive the detected landmarks as part of their input.

3. CERVIX BOUNDARY DETECTION

The coarse ROI extracted in the preprocessing step often includes large portions of the vagina that confuse the automatic identification of the tissues within the ROI. The active contours approach, which searches for an energy maximizing curve in the image, is used to contract the coarse ROI outline to the actual cervix boundaries. The method presented here is based on the curve evolution functional suggested for fast edge integration by Kimmel [7]. The compact representation of the framework in level set formulation is the following curve evolution equation:

$$\phi_t = \left[ f(x, y) + \mathrm{sign}\left(\langle \nabla\phi, \nabla I \rangle\right) \Delta I + \alpha\, \mathrm{div}\left( g(x, y) \frac{\nabla\phi}{|\nabla\phi|} \right) + \beta (c_2 - c_1) \left( I - \frac{c_1 + c_2}{2} \right) \right] |\nabla\phi|, \qquad (1)$$

where $I$ is a gray level function of the image and $\phi$ is the level-set function.

The first term of this equation is the weighted region term, which advances the curve through the image according to a scalar weight function $f(x, y)$. Next comes the robust alignment term, which helps to achieve a better alignment of the curve with the edges within the image. The third term is derived from geodesic active contours (GAC). It attains high values for the portions of the contour that overlap image edges, thus preventing these portions from further evolution. The function $g(x, y)$ is an inverse edge indicator, usually of the form $g(x, y) = \frac{1}{\sqrt{1 + |\nabla I|^2}}$. Finally, the minimal variance term attempts to separate the foreground and the background of the image with respect to their relative mean values. The two constants, $c_1$ and $c_2$, are the mean intensities in the interior and the exterior of the contour, respectively.

A direct application of the level set formulation of Equation (1) to the detection of cervix boundaries is inappropriate. There are irrelevant edges within the cervigram formed by skin folds, medical instruments, the cervigram frame and the various tissues within the cervix (as seen in Figure 1). These edges interfere with the curve attraction to the cervix boundaries. The original formulation of Equation (1) is therefore adapted to support the unique characteristics of the cervigrams, as shown in the subsequent paragraphs.

The input image is initially cropped around the coarse ROI. This ensures that the strong edges of the frame and the medical instruments do not attract the evolving curve. The initial curve is set to be the outline of the coarse ROI, which is always larger than the desired final contour. The sign of the weighted region term is set to be negative, thus ensuring that the contour moves inward from its initial state. The minimal variance term is assigned a very low weight, since the color difference between the interior and exterior of the cervix is not always significant. This term operates on the $a$ color channel (of the CIE-Lab color space), reflecting the pinkness inside the cervical region.

The multitude of irrelevant edges in the image makes the gradient-based terms (GAC, robust alignment) inappropriate for the current task of cervix boundary detection. An alternative edge indicator, based on the cervix convexity, is proposed. This is motivated by the fact that most of the cervix boundaries are outlined by folds of skin that look like narrow valleys and are distinctively concave. The boundaries are easily detected by their largest positive principal curvature, $k_1$ [8].
Edges generated by the color transition between two different tissues have strong gradients but low curvature, and will not interfere with the curve evolution in that case. The principal curvatures measure the maximum and minimum bending of a surface, $I(x, y)$, at each point [8]. The two principal curvatures and directions are obtained as the eigenvalues and eigenvectors, respectively, of the following matrix:

$$\frac{1}{g_{11} g_{22} - g_{12}^2} \begin{bmatrix} g_{22} & -g_{12} \\ -g_{12} & g_{11} \end{bmatrix} \begin{bmatrix} b_{11} & b_{12} \\ b_{12} & b_{22} \end{bmatrix}. \qquad (2)$$

$g_{ij}$ and $b_{ij}$ relate to the first and the second fundamental forms respectively and are given by:

$$\{g_{11};\ g_{12};\ g_{22}\} = \{I_x^2 + 1;\ I_x I_y;\ I_y^2 + 1\} \qquad (3)$$

$$\{b_{11};\ b_{12};\ b_{22}\} = \frac{1}{\sqrt{1 + I_x^2 + I_y^2}} \{I_{xx};\ I_{xy};\ I_{yy}\}. \qquad (4)$$
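As an illustration of Equations (2)-(4), the following is a minimal NumPy sketch that evaluates the two principal curvatures and the direction of the larger one at every pixel. The function name `principal_curvatures`, the use of `np.gradient` (finite differences without any pre-smoothing) and the closed-form 2x2 eigen-decomposition are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def principal_curvatures(I):
    """Per-pixel principal curvatures of the surface z = I(x, y), Eqs. (2)-(4).

    Returns (k1, k2, vx, vy): k1 >= k2 and (vx, vy) is the unit direction of k1.
    Hypothetical helper; derivatives are plain finite differences.
    """
    I = np.asarray(I, dtype=float)
    # np.gradient returns the derivative along rows (y) first, then columns (x).
    Iy, Ix = np.gradient(I)
    Ixy, Ixx = np.gradient(Ix)
    Iyy, _ = np.gradient(Iy)

    # First fundamental form, Eq. (3).
    g11, g12, g22 = Ix**2 + 1.0, Ix * Iy, Iy**2 + 1.0
    # Second fundamental form, Eq. (4).
    norm = np.sqrt(1.0 + Ix**2 + Iy**2)
    b11, b12, b22 = Ixx / norm, Ixy / norm, Iyy / norm

    # Matrix of Eq. (2): inverse of the first form times the second form.
    det_g = g11 * g22 - g12**2
    s11 = (g22 * b11 - g12 * b12) / det_g
    s12 = (g22 * b12 - g12 * b22) / det_g
    s21 = (g11 * b12 - g12 * b11) / det_g
    s22 = (g11 * b22 - g12 * b12) / det_g

    # Closed-form eigenvalues of a 2x2 matrix.
    mean = 0.5 * (s11 + s22)
    disc = np.sqrt(np.maximum((0.5 * (s11 - s22))**2 + s12 * s21, 0.0))
    k1, k2 = mean + disc, mean - disc

    # Eigenvector of k1: (s12, k1 - s11), or (k1 - s22, s21) if that vanishes.
    vx, vy = s12, k1 - s11
    degenerate = np.hypot(vx, vy) < 1e-12
    vx = np.where(degenerate, k1 - s22, vx)
    vy = np.where(degenerate, s21, vy)
    n = np.hypot(vx, vy)
    ok = n > 1e-12
    vx = np.where(ok, vx / np.maximum(n, 1e-12), 1.0)  # arbitrary direction on flat patches
    vy = np.where(ok, vy / np.maximum(n, 1e-12), 0.0)
    return k1, k2, vx, vy
```

On a synthetic vertical valley such as `I = 0.01 * (x - 10)**2`, this sketch gives a positive `k1` at the valley floor with a direction across the valley, which is the behaviour the text relies on for skin folds. Real cervigrams would most likely need smoothing before differentiation; that step is left out here.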

The largest principal curvature, $k_1$, and its direction, $\vec{V}$, are used as the curvature-based edge indicators in the GAC and the alignment terms of the modified energy functional. The inverse edge indicator function $g$, within the GAC term, is modified to $\tilde{g}(x, y) = \frac{1}{\sqrt{1 + |k_1|^2}}$. The robust alignment term is now driven by the vector field, $\vec{V}$, of the principal curvature directions, and the modified alignment term is $\mathrm{sign}(\langle \vec{V}, \nabla\phi \rangle)\, \mathrm{div}(\vec{V})\, |\nabla\phi|$. The resulting modified level set formulation used in this work is:

$$\phi_t = \left[ -C + \mathrm{sign}\left(\langle \vec{V}, \nabla\phi \rangle\right) \mathrm{div}(\vec{V}) + \alpha\, \mathrm{div}\left( \tilde{g}(x, y) \frac{\nabla\phi}{|\nabla\phi|} \right) + \beta (c_2 - c_1) \left( I - \frac{c_1 + c_2}{2} \right) \right] |\nabla\phi|, \qquad (5)$$

where $C$, $\alpha$ and $\beta$ are the energy functional parameters. Several resulting contours are presented in Figure 4.

4. OS DETECTION

The os size and shape vary widely with age, hormonal state, and whether the woman has had a vaginal birth. The os is the most obvious choice for a reference point in cervix images, since it is always visible. Automatic os detection has not been addressed in previous work. The os shape, color and relative location in the image vary widely between images. However, the os region is always concave, since it is an opening into the depth of the cervix.

A geometric measure of local concavity [9] of the gray-level image surface, $I(x, y)$, is used for the detection of the os region. This measure detects surface patches that are locally concave or convex by detecting the ray of discontinuity in the gradient argument, $\theta(x, y)$, of the function $I$. The gradient argument is the angle of the intensity gradient:

$$\theta(x, y) = \arg(\nabla I(x, y)) = \arctan\left( \frac{\partial}{\partial y} I(x, y),\ \frac{\partial}{\partial x} I(x, y) \right), \qquad (6)$$

where the two-dimensional arc tangent is defined by:

$$\arctan(y, x) = \begin{cases} \arctan\left(\frac{y}{x}\right) & \text{if } x \ge 0 \\ \arctan\left(\frac{y}{x}\right) + \pi & \text{if } x < 0,\ y \ge 0 \\ \arctan\left(\frac{y}{x}\right) - \pi & \text{if } x < 0,\ y < 0. \end{cases} \qquad (7)$$

It can be shown that for a locally concave or convex structure that can be described as a paraboloid, the gradient argument has a discontinuity along the negative x axis (if the coordinate system is placed in the center of the paraboloid, parallel to its axes). This structure can be easily detected by differentiating this argument, which leads to the convexity operator: $Y_{arg} = \frac{\partial}{\partial y} \theta(x, y)$.

The convexity operator will react strongly (in theory, infinitely) to the negative part of a horizontal axis of a convex or concave patch. Its response to linear gradients or abrupt changes in the image should be finite, and therefore much weaker than the response to convex patches. This operator is insufficient for the purpose of cervix boundary detection, since it does not provide the magnitude and direction of the principal curvatures. However, its stability and robustness make it very suitable for the purpose of os detection.

The convexity operator may be extended in order to react to other axes of the paraboloid, as shown in [9]. In the current work it is extended to the positive x axis. This results in a strong reaction to paraboloids with horizontal or nearly horizontal orientation, thus providing a strong reaction to the os region, which is circular or nearly horizontal. The extended version of $Y_{arg}$ is termed $D_{arg}$. It is obtained by rotating the original image by $\pi$ radians, calculating $Y_{arg}$ and then rotating the result back. The resulting response is summed with the original $Y_{arg}$. The $D_{arg}$ response for the cervigram image is illustrated in Figure 3(a). It can be seen that there is a strong positive response (light gray) for regions that look like concave horizontal cylinders. For horizontal convex patches, the response is negative (dark gray).
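The construction of $Y_{arg}$ and $D_{arg}$ described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `y_arg` and `d_arg` are hypothetical, derivatives are taken with `np.gradient`, and `np.arctan2` is used as the two-dimensional arc tangent of Equation (7).

```python
import numpy as np

def y_arg(I):
    """Convexity operator Y_arg: derivative along y of the gradient argument."""
    I = np.asarray(I, dtype=float)
    # Derivatives along rows (y) and columns (x).
    Iy, Ix = np.gradient(I)
    # Gradient argument theta(x, y) of Eq. (6), in (-pi, pi].
    theta = np.arctan2(Iy, Ix)
    # Differentiating along y turns the jump of theta into a strong response.
    return np.gradient(theta, axis=0)

def d_arg(I):
    """Extended operator D_arg: Y_arg plus Y_arg of the image rotated by pi."""
    I = np.asarray(I, dtype=float)
    rotated = np.rot90(I, 2)                     # rotate the image by pi
    rotated_back = np.rot90(y_arg(rotated), 2)   # rotate the response back
    return y_arg(I) + rotated_back
```

For a synthetic dark bowl such as `I = (x - 10)**2 + (y - 10)**2`, the response of this sketch is strongly positive along the whole horizontal axis through the center and close to zero elsewhere; negating the image flips the sign of the response.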

Fig. 3. Os detection: (a) $D_{arg}$ reaction within the cervix image; (b) Labelled mask pixels. The grey pixels were classified as close to the cervix boundary and the white pixels as close to the cervix center; (c) The cervix image with the os marked by a black diamond.

Looking at Figure 3(a), it can be seen that there are multiple regions with a high value of $D_{arg}$, most of which are located on the horizontal boundaries of the cervix and need to be discarded. For this purpose the $D_{arg}$ response is first automatically thresholded to generate a mask of candidate os regions. Next, the distance between the pixels within the mask and the cervix boundaries is calculated using the distance transform. Using this distance feature and the K-means algorithm, the pixels are clustered and labelled into two groups, one close to the cervix boundary and one close to its center (Figure 3(b)). The segment that includes the os region is identified as the largest segment that has a majority of pixels affiliated with the center cluster. The center of this segment is selected as a marker for the os landmark (marked by the diamond in Figure 3(c)).
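The candidate selection steps above can be sketched as follows. This is an illustrative sketch only: the text does not say how the automatic threshold is chosen, so `threshold` is passed in as a parameter. The function names `two_means_1d` and `detect_os`, the small hand-written two-cluster K-means and the default 4-connectivity of `scipy.ndimage.label` are all assumptions made for the example.

```python
import numpy as np
from scipy import ndimage

def two_means_1d(values, iters=50):
    """Tiny 1-D K-means with K=2; True marks the cluster with the larger mean."""
    lo, hi = values.min(), values.max()
    far = values > 0.5 * (lo + hi)
    for _ in range(iters):
        if far.all() or not far.any():
            break
        lo, hi = values[~far].mean(), values[far].mean()
        new_far = np.abs(values - hi) < np.abs(values - lo)
        if np.array_equal(new_far, far):
            break
        far = new_far
    return far

def detect_os(d_arg_response, cervix_mask, threshold):
    """Return the (row, col) os marker, or None if no central segment exists."""
    cervix_mask = np.asarray(cervix_mask, dtype=bool)
    # 1. Threshold the D_arg response into a mask of candidate os pixels.
    candidates = (np.asarray(d_arg_response) > threshold) & cervix_mask
    if not candidates.any():
        return None
    # 2. Distance of each cervix pixel from the cervix boundary.
    dist = ndimage.distance_transform_edt(cervix_mask)
    # 3. Cluster candidate pixels by distance: near boundary vs. near center.
    central = np.zeros(candidates.shape, dtype=bool)
    central[candidates] = two_means_1d(dist[candidates])
    # 4. Largest connected segment with a majority of center-cluster pixels.
    labels, count = ndimage.label(candidates)
    best, best_size = None, 0
    for lab in range(1, count + 1):
        segment = labels == lab
        size = int(segment.sum())
        if central[segment].mean() > 0.5 and size > best_size:
            best, best_size = segment, size
    if best is None:
        return None
    # 5. The center of that segment is the os marker.
    return ndimage.center_of_mass(best.astype(float))
```

Clustering on the distance feature rather than on segment size is what lets a small central blob win over a larger blob that hugs the cervix boundary.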

5. RESULTS

The testing set for the presented system consists of 120 cervigrams, some of which contain abnormal tissues. Each of the images was automatically analyzed using the suggested framework and the results were compared to manual segmentations by an NCI expert. Several typical results of cervix boundary and os detection are presented in Figure 4. The top row contains the cervix boundaries detected by the algorithm (white line). The os is marked by a diamond. The bottom row shows the expert segmentations of both regions. Visually, the detected cervix boundaries are very similar to the ones marked by the expert. Parts of the surrounding vaginal walls are sometimes included in the cervix region, since their characteristics are quite similar to those of the cervix.

In order to evaluate the proposed curvature-based functional (Equation (5)) for the cervix boundary detection, the suggested method was compared to the conventional gradient-based functional (Equation (1)). For each method the optimal set of functional parameters was experimentally selected and was fixed for the experiments. The Dice metric, $\frac{S \cap R}{S \cup R}$, and the Sensitivity metric, $\frac{S \cap R}{S}$, were calculated for each detected cervix region, relative to the manual segmentations ($S$ being the area of the automatically segmented region, while $R$ is the area of the expert’s segmentation). The results of Dice and Sensitivity for the two methods are summarized in Table 1. It can be seen that for both methods the Sensitivity is extremely high, meaning that in most cases all of the cervix pixels are included in the detected region. This is of utmost importance in our application, since no relevant information may be omitted. The Dice measure shows an improvement with curvature (from 0.597 to 0.66, with a similar standard deviation). This is as expected, since fewer irrelevant regions are now included.

             meanDice   stdDice   meanSens   stdSens
Curvature    0.66       0.13      0.94       0.08
Gradients    0.597      0.14      0.987      0.022

Table 1. Dice and sensitivity metrics for detection of cervix boundaries

The os detection was evaluated on a sub-set of 101 images with acceptable quality of cervix region detection. The minimal distance between the automatic and the manual os markings was found for each image. In 87% of the cases this distance is less than 10 pixels (which is the approximate width resolution of the os region, as marked by the expert), and in 76% of the cases the distance is less than 5 pixels. Such accuracy enables the future use of this landmark for detection of other tissues.

6. DISCUSSION

In this work we present an analysis scheme for cervical images, with an emphasis on automatic detection of two important landmarks: cervical boundaries and the os. A novel approach to cervix content analysis, which relies on the geometrical curvature characteristics of the image, is introduced, and its performance is evaluated on an extensive data set. The detected landmarks are intended to help further cervix content analysis by placing the various tissues within the cervix in the right context, using their locations relative to the landmarks. The landmark detection accuracy is already good enough for that use. It may be further improved by combining the landmark detection process with the subsequent tissue segmentation steps. Regarding the content retrieval task, future work includes the extraction of indexing features such as tissue (or cervix lesion) position relative to the os landmark.

Acknowledgement: We would like to thank the Communications Engineering Branch, NLM, NIH, for the data and support of the work and Dr. Nir Sochen, Tel Aviv University, for fruitful discussions.

Fig. 4. Results of cervix and os detection: automatic (top) vs. manual (bottom)

7. REFERENCES

[1] B. W. Pogue, M. A. Mycek, and D. Harper, “Image analysis for discrimination of cervical neoplasia,” Journal of Biomedical Optics, vol. 5, no. 1, pp. 72–82, 2000.
[2] Q. Ji, J. Engel, and E. Craine, “Texture analysis for classification of cervix lesions,” IEEE Trans. on Medical Imaging, vol. 19, no. 11, pp. 1144–1149, 2000.
[3] V. Van-Raad, “Frequency space analysis of cervical images using short time Fourier transform,” in Proc. of the IASTED International Conference of Biomedical Engineering, 2003, vol. 1, pp. 77–81.
[4] B. Tulpule et al., “A probabilistic approach to segmentation and classification of neoplasia in uterine cervix images using color and geometric features,” in Proc. of SPIE Medical Imaging, February 2005, vol. 5747, pp. 995–1003.
[5] G. Zimmerman, S. Gordon, and H. Greenspan, “Content-based indexing and retrieval of uterine cervix images,” in Proc. of 23rd IEEE Convention of Electrical and Electronics Engineers in Israel, 2004, pp. 181–185.
[6] G. Zimmerman and H. Greenspan, “Automatic detection of specular reflections in uterine cervix images,” to appear in Proc. of SPIE Medical Imaging, 2006.
[7] R. Kimmel, “Fast edge integration,” in Geometric Level-Set Methods in Imaging, Vision and Graphics, Springer-Verlag, New-York, 2003.
[8] M. Farber, Introduction to differential geometry, in Hebrew, Mea, 1999.
[9] A. Tankus and H. Yeshurun, “Convexity-based visual camouflage breaking,” Computer Vision and Image Understanding, vol. 82, pp. 208–237, 2001.