
Segmentation of Cell Clusters by Nearest Neighbour Graphs

J.M. Geusebroek (1,2), A.W.M. Smeulders (1), F. Cornelissen (2)

(1) Intelligent Sensory Information Systems, Faculty of WINS, University of Amsterdam, Kruislaan 403, 1098 SJ Amsterdam, The Netherlands; (2) Biological Imaging Laboratory, Life Sciences Department, Janssen Research Foundation, Turnhoutseweg 30, B2340 Beerse, Belgium. [email protected]

Keywords: cellular sociology, Nearest Neighbour graphs, tissue, segmentation.

Abstract

The image of a tissue can often be characterised by the topographical relation between its cells. As a consequence, distinction between tissue parts can be based on the clustering of cell markers. The characterisation of the tissue architecture is then reduced to the question: "is this a cloud of points?". In this respect, graph theory may be used as the modelling tool. Although most studies apply the Voronoi diagram (or a subset) to reveal cell topography, this method is not robust against cell loss and false cell detection. We propose a distance graph to model the tissue architecture. Simple adaptive filtering techniques can be applied to segment cell clusters. Advantages of such an approach are absence of cluster border effects and robustness against cell loss and false cell detection. Cell detection confidence is taken into account.

1 Introduction

Segmentation of tissue architecture is considered a non-trivial task. Biological variety and preparation or acquisition artefacts have a major influence on the result. Simplification can be obtained by representing the topography among cells in the tissue [2, 5]. Segmentation can be based on interdistances between cells [12], that is, on a clustering of characteristic points, e.g. the centre of gravity of the cells. Tissue architecture is then reduced to the question: "is this a cloud of points?". This leads to the application of neighbour graphs.

The Voronoi diagram (or a subset) is often applied as an architecture modelling tool [1, 3, 4, 8, 9, 10]. This graph determines the neighbours on the basis of touching zones of influence. Rodenacker et al. [10] used this for partitioning epithelial tissue. Segmentation was obtained by propagating the neighbours from the basal layer of the epithelial tissue to the surface. Borders between basal, intermediate and superficial area were determined by examining the occupied surface of propagation. Bigras et al. [1] directly derived parameters from the Voronoi diagram. The area of influence was used as a measure for cellular density. Although discriminating for lung carcinomas, this method is not robust, because cells on the border of clusters have larger zones of influence than the inner cells. In our view, the Voronoi diagram is certainly useful for determination of neighbours, but more robust parameters could be estimated from the Euclidean distance to these neighbours.

A drawback of tissue architecture segmentation based on cell type classification is the instability due to detection errors. False cell detection or missing cells in the segmentation process are reflected in the final result. Although cell segmentation in tissues is not at all a simple task, robustness is often neglected. In this paper we propose a method which is more robust and takes cell detection confidence into account.

The organization of the paper is as follows. In section (2), cell clustering methods are reviewed. Robustness against several artefacts is addressed in section (3). A demonstration by means of some experiments is given in section (4).

2 Review of Methods

In this section a review of existing methods for tissue architecture segmentation is given. Only methods based on cluster density are considered. This implies that the distances between the cells determine cluster membership.

A simple clustering method consists of counting the number of cells per area [6]. All cells within a fixed distance from a centre cell are assigned to the same cluster. A drawback is the determination of the optimal fixed distance for the problem at hand. A different image magnification will result in a different clustering. This implies that the method is scale variant. Methods based on (morphological) smoothing to merge cluster elements use this fixed distance principle.

The k-nearest neighbours method is scale invariant. All k-nearest neighbours of a centre cell are assigned to the same cluster. Distance to these neighbours is not taken into account. O'Callaghan [7] showed that this does not lead to an intuitive cluster segmentation. Schwarz and Exner [12] pointed out that discrimination can be based on the distance to one of the nearest neighbours. Evaluation of the distance distribution can be used for clustering. Shapiro [13] used a statistical analysis of all k-nearest neighbours.

Instead of taking a fixed number of neighbours, the selection can be based on the Voronoi diagram [14]. Raymond et al. [9] used Gabriel's graph, a subset of the Voronoi diagram, to obtain the neighbourhood of a cell, to which the above methods can be applied. The drawback here is the instability of the Voronoi diagram for cell detection errors. As discussed by Darro et al. [3], the Voronoi diagram is very sensitive to touching objects or detection errors. No cell can be eliminated without modifying the characteristics of cell clusters. This makes the Voronoi graph unsuitable for robust segmentation of tissue architecture.

An alternative, and straightforward, approach is the use of the distance transform of the background [11]. Clusters can be segmented by searching the ridges in the obtained mountain landscape. All-sized openings or closings propagate this maximum to the object borders. This gives a measure for the inscribing circle which fits locally in the background. The same result can be obtained by taking the distance transform of the Voronoi tessellation [11]. The grey value at the cell positions gives the distance to the closest zone of influence from a neighbour cell. Again, the drawback is the instability of the Voronoi diagram. Segmentation directly based upon the area of influence in the Voronoi diagram, as in [1, 4], is comparable. A major drawback in this is the presence of cluster border effects. In the distance transform methods, border effects are suppressed. However, these methods are based on the distance to the nearest neighbour. This does not guarantee correct discrimination between the clusters [7, 12].

We can conclude that a method based on the distance to one of the nearest neighbours is most applicable for our purpose. Discrimination is not limited to the first nearest neighbour. In the next section we will discuss the robustness of this method.
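The difference between the scale-variant fixed-distance rule and the scale-invariant k-nearest neighbours rule reviewed in this section can be illustrated with a small sketch. The point set, the radius of 2.5 and the helper names `neighbours_within`, `k_nearest` and `magnify` are assumptions made for this illustration only; they do not come from any of the cited methods.

```python
import math

def neighbours_within(points, centre, radius):
    """Fixed-distance rule: indices of all points within `radius` of points[centre]."""
    cx, cy = points[centre]
    return {i for i, (x, y) in enumerate(points)
            if i != centre and math.hypot(x - cx, y - cy) <= radius}

def k_nearest(points, centre, k):
    """k-nearest neighbours rule: indices of the k points closest to points[centre]."""
    cx, cy = points[centre]
    others = [i for i in range(len(points)) if i != centre]
    others.sort(key=lambda i: math.hypot(points[i][0] - cx, points[i][1] - cy))
    return set(others[:k])

def magnify(points, factor):
    """Simulate a different image magnification by scaling all coordinates."""
    return [(factor * x, factor * y) for x, y in points]

# Hypothetical cell centres; cell 0 is the centre cell.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.5), (4.0, 0.0), (0.0, 6.0)]
zoomed = magnify(points, 2.0)

fixed_before = neighbours_within(points, 0, radius=2.5)
fixed_after = neighbours_within(zoomed, 0, radius=2.5)
knn_before = k_nearest(points, 0, k=2)
knn_after = k_nearest(zoomed, 0, k=2)

print(fixed_before, fixed_after)  # membership changes with magnification
print(knn_before, knn_after)      # membership is unchanged
```

With these toy values the fixed-distance rule loses a neighbour when the magnification doubles, whereas the two nearest neighbours stay the same. The k-nearest neighbours rule still ignores how far away those neighbours are, which is the limitation noted by O'Callaghan [7].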

3 Graph Segmentation

3.1 Distance Graph

In order to obtain robust segmentation results, a cell cluster algorithm should be robust against magnification differences, cell loss, false cell detections and border effects. Robustness against cell loss implies that missing cells have limited influence on the result. If some cells are not detected, or absent in the image, this should give minor distortion in the final result. Artefacts which are classified as cells are a common source of error in detection algorithms. Cells at cluster borders should get the same classification as cells within the cluster.

Let G be a graph with each labelled node $N_x$ connected to all other nodes (total graph). The edge length is given by the distance d between the centre node and the neighbour. We can order the neighbours of node $N_x$ on Euclidean distance

$$N_x = \{d_1, d_2, \ldots, d_n \mid d_{i-1} < d_i,\; d_0 = 0\} \qquad (1)$$

Robustness against magnification differences can be achieved by dividing all distances by the distance to one of the nearest neighbours. In the case of cell loss, a robust scale measurement can be obtained by dividing all distance sets by the mean of the distances to one of the nearest neighbours, i.e.

$$\tilde{N}_x = \{\tilde{d}_1, \tilde{d}_2, \ldots, \tilde{d}_n\} \qquad (2)$$

$$\tilde{d}_i = \frac{d_i}{\bar{d}_C}, \quad 1 \le C \le n \qquad (3)$$

where $\bar{d}_C$ is the mean taken over the $C$th-nearest neighbour in the graph.

We can define the isodistance set of $\tilde{N}_x$ by assigning all cells on approximately equal distances $i_a$ to shell $a$. The distance $i_a$ is given by the nearest neighbour found, not already a shell member. The number of shells depends on the number of "jumps" in the distances $\tilde{N}_x$ and can be influenced by a minimum jump size or margin. Let the set of shells or isodistances, within margin marg, be

$$D_x^{marg} = \{i_1, i_2, \ldots, i_m\} \qquad (4)$$

Each isodistance can be interpreted as a shell of neighbouring cells. Each node has its own neighbour shells. Therefore, different cells have different sets $D_x^{marg}$ with a possibly different number of elements (shells). Only the margin marg is equal for all cells.

If there are several cells in each shell, missing some cells has no influence on the isodistance set $D_x^{marg}$. As long as one cell remains in each shell, missing the other cells during detection does not result in a different graph. This increases the robustness against cell loss.

We assume a suitable image magnification is chosen for the problem at hand. Border effects due to cluster edges are not present since isodistance shells are preserved within the cluster.

3.2 Cell Classification

Thus far, only (cell) detectors giving the centre of gravity as output have been examined. We can consider detectors which also calculate the probability that the detected object indeed was a cell. If this probability is introduced into the nearest neighbour configuration, more robust segmentation is possible. The errors made due to detected artefacts will become less significant. Given the probability of the neighbours in $N_x$ (eqn. 1), the corresponding probability set is given by

$$P_x = \{p_1, p_2, \ldots, p_n\} \qquad (5)$$

All probabilities are independent of each other (this can be seen as a detector restriction). Calculation of which node is most probably the first, second, etc. nearest neighbour is straightforward. For the first nearest neighbour, the probability of the first object being a cell is $p_1$. The probability that the second object is the nearest cell is the probability that the first object is not a cell and the second object is a cell, or $(1 - p_1)\,p_2$. After calculating the probabilities for all neighbours, the object with the highest probability of being the first nearest neighbour is chosen. A similar calculation can be made for the second, third, and so on. The resulting (re-ordered) neighbour set can be used as $N_x$. The probability of the centre cell itself can be passed for application specific control.
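The definitions of sections 3.1 and 3.2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the mean in eqn. (3) is interpreted as the average over all nodes of the distance to their C-th nearest neighbour, and a new shell is assumed to start when a distance exceeds the current isodistance by more than the margin.

```python
import math

def sorted_distances(points, centre):
    """Eqn. (1): ascending distances from points[centre] to all other nodes."""
    cx, cy = points[centre]
    return sorted(math.hypot(x - cx, y - cy)
                  for i, (x, y) in enumerate(points) if i != centre)

def normalised_distances(points, centre, c):
    """Eqns. (2)-(3): divide by the graph-wide mean C-th neighbour distance.

    Assumption: the mean is taken over all nodes of the graph.
    """
    d_bar = sum(sorted_distances(points, i)[c - 1]
                for i in range(len(points))) / len(points)
    return [d / d_bar for d in sorted_distances(points, centre)]

def isodistance_shells(distances, marg):
    """Eqn. (4): isodistances of the shells found in ascending `distances`.

    Assumed rule: a shell opens at the nearest distance that is not yet a
    shell member and absorbs every distance within `marg` of that value.
    """
    shells = []
    for d in distances:
        if not shells or d - shells[-1] > marg:
            shells.append(d)
    return shells

def first_neighbour_probabilities(probs):
    """P(object j is the nearest real cell) = p_j * prod_{i<j} (1 - p_i)."""
    result, none_before = [], 1.0
    for p in probs:
        result.append(none_before * p)
        none_before *= 1.0 - p
    return result

# Normalised distances do not change with magnification.
square = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
big = [(3.0 * x, 3.0 * y) for x, y in square]
norm_small = normalised_distances(square, 0, c=1)
norm_big = normalised_distances(big, 0, c=1)

# Shells before and after losing three of the seven neighbours.
full = [0.98, 1.0, 1.02, 1.05, 2.0, 2.03, 2.1]
lossy = [1.0, 1.05, 2.03, 2.1]
shells_full = isodistance_shells(full, marg=0.2)
shells_lossy = isodistance_shells(lossy, marg=0.2)

# Detector confidences of the three closest objects, in distance order.
first = first_neighbour_probabilities([0.3, 0.9, 0.8])
best = max(range(len(first)), key=first.__getitem__)
```

In this toy example both distance lists yield two shells, and each isodistance shifts by less than the margin after the cell loss. The closest object has a low detection confidence (0.3), so the second object is the most probable first nearest neighbour.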

4 Experiments

In this section some examples are shown that demonstrate the advantage of k-nearest neighbour discrimination over other methods. In order to investigate the robustness of the isodistance method, some experiments are made.

Figure 1-a shows a barbell with a one-layer thick bar. The distance to the first and second nearest neighbour is, for both the bar and the discs, determined by r, the fixed distance between two points. Within the bar, the distance to the third nearest neighbour is 2r. For the cells in the discs, the distance to the third nearest is still r. Segmentation based on the first (or second) nearest neighbour will result in one cluster, the complete barbell. If we take the third (or further) neighbour for segmentation, this results in all three components, the bar and two discs. This example cannot be clustered properly by methods taking only the nearest neighbour into account.

This is also the case for figure 1-b. Here, results for the fourth row from above are highly influenced by the dense cluster at the bottom. As pointed out by [7], the k-nearest neighbours method does not result in the desired segmentation. By taking distances into account, segmentation can be based on the distance to the third nearest neighbour. Within the dense cluster, this distance corresponds to the vertical distance between the cells. This is the smallest distance in the clusters; all cells with higher distances to the third nearest neighbour belong to the less dense cluster. This gives the correct segmentation.

Figure 1 (a, b): Examples of clusters not separable by nearest neighbour methods

Figure 2-a shows two clusters which can be segmented on the basis of their vertical spacing (the third or fourth nearest neighbour). In the centre, two cells are not detected. The edge cells at the right and left of the clusters have different arrangements of neighbours compared with the other cluster cells. This causes border effects. Also, cells around the missing cell have different arrangements. If shells are taken into account by looking at jumps in the distances, correct classification takes place. For the top cluster, the second shell describes all cells at distance $\sqrt{2}\,r$. For the bottom, the shell describes all cells at 2r.

In the field of histology, the use of shells is desirable. If more or less equally sized cells in tissue are clustered more or less homogeneously, they constitute a hexagonal grid. As demonstrated in figure 2-b, circles can be drawn through the centres of neighbouring cells. The radius of these circles or shells is {r, 2r, 3r, ...}. The number of cells fitting in the shells is approximately {6, 12, 18, ...}.

Figure 2 (a, b): Experiment to investigate the stability of the isodistance method

In figure 3-a a bright field image of a rat hippocampus is shown. The CA1 region is affected by brain damage and most of its typical cells are absent in the image. After cell detection, remaining objects are the pyramidal cell bodies, blood vessels and some other artefacts. Figure 3-b shows classification of the objects based on the second isodistance. The intensity of each cell corresponds to the distance from the cell to the second shell. The CA3 region is clearly visible as a dark "U" shaped region at the bottom of the image. Cells in the hilus (H) are damaged due to histological preparation. The dark, " " shaped cluster consists of small holes. Other cells are lighter due to the higher distance to the second shell. In this image, the non-trivial segmentation of the CA3 region can be obtained by simple thresholding of the shell distances. In addition, it is possible to differentiate damaged from normal cell layers.
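The barbell argument of figure 1-a can be checked numerically with a minimal sketch. The geometry below is an assumption made for illustration: the bar is modelled as seven collinear points with spacing r, and a disc as a small hexagonally packed patch with the same spacing. The helper `kth_neighbour_distance` is a hypothetical name.

```python
import math

def kth_neighbour_distance(points, centre, k):
    """Distance from points[centre] to its k-th nearest neighbour."""
    cx, cy = points[centre]
    distances = sorted(math.hypot(x - cx, y - cy)
                       for i, (x, y) in enumerate(points) if i != centre)
    return distances[k - 1]

r = 1.0

# One-layer thick bar: collinear points with spacing r.
bar = [(i * r, 0.0) for i in range(7)]
bar_centre = 3

# Disc interior: hexagonal packing with nearest-neighbour spacing r.
disc = [((i + 0.5 * j) * r, j * r * math.sqrt(3) / 2)
        for i in range(-2, 3) for j in range(-2, 3)]
disc_centre = disc.index((0.0, 0.0))

for k in (1, 2, 3):
    print(k,
          round(kth_neighbour_distance(bar, bar_centre, k), 6),
          round(kth_neighbour_distance(disc, disc_centre, k), 6))
```

For the first and second neighbour both structures give r, so they cannot be told apart. For the third neighbour the bar gives 2r while the disc still gives r, which is the separation described for figure 1-a.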

Figure 3 (a, b): Application of isodistance method on the hippocampus of a rat

5 Discussion

It was shown that selecting the kth-nearest neighbour serves classifying patterns better. It is more selective as well as more stable than methods based on the nearest neighbour. Instead of taking only one neighbour, a filter centred on this neighbour can be trained for discrimination. This could be even more robust, as shown for the isodistance method. Common adaptive filtering, eigenvector analysis or non-linear methods can be considered for implementation.

Another point for discussion is the assumption that neighbouring cells can be assigned to shells. For dense clusters, this seems to be the case for the first few shells. This is enough to prevent edge effects and to obtain robustness against cell loss. For non-cluster cells, each shell contains only one neighbour. For these cells, no improvement in robustness is gained. If we lose a neighbour of a non-cluster cell, it is very likely that the next neighbour is only a small fraction of the (high) distance further away. The relative error is smaller for non-cluster cells than for cells within clusters. This makes the need for robust measurements within clusters higher than for non-cluster cells. Within clusters, it is very likely that cells can be assigned to shells. If cells have equal diameter, the shells at distances {r, 2r, 3r, ...} most likely contain approximately {6, 12, 18, ...} cells, as shown in section (4).

Not implemented yet is a fuzzy detector of cells. This should detect as many cells as possible, in order to find all shells. In many cases, this results in an over-segmentation. The problem here is how to give all objects a correct classification between 0 (artefact) and 1 (cell). This can be obtained by designing rules for classification and using probability propagation to combine them. Another approach is to train a neural network to classify the objects.

References

[1] Bigras G, Marcelpoil R, Brambilla E, Brugal G: Cellular sociology applied to neuroendocrine tumors of the lung: quantitative model of neoplastic architecture. Cytometry 24:74-82 (1996).

[2] Chandebois R: Cell sociology: a way of reconsidering the current concepts of morphogenesis. Acta Biotheoretica 25:71-102 (1976).

[3] Darro F, Kruczynski A, Etievant C, Martinez J, Pasteels J-L, Kiss R: Characterization of the differentiation of human colorectal cancer cell lines by means of Voronoi diagrams. Cytometry 14:783-792 (1993).

[4] Duyckaerts C, Godefroy G, Hauw JJ: Evaluation of neuronal numerical density by Dirichlet tessellation. Journal of Neuroscience Methods 51:47-69 (1994).

[5] Honda H: Geometrical models for cells in tissues. International Review of Cytology 81:191-248 (1983).

[6] Koontz WLC, Fukunaga K: A nonparametric valley-seeking technique for cluster analysis. IEEE Transactions on Computers 21:171-178 (1972).

[7] O'Callaghan JF: An alternative definition for neighbourhood of a point. IEEE Transactions on Computers 24:1121-1125 (1975).

[8] Palmari J, Dussert C, Berthois Y, Penel C, Martin PM: Distribution of estrogen receptor heterogeneity in growing MCF-7 cells measured by quantitative microscopy. Cytometry 27:26-35 (1997).

[9] Raymond E, Raphael M, Grimaud M, Vincent L, Binet JL, Meyer F: Germinal center analysis with the tools of mathematical morphology on graphs. Cytometry 14:848-861 (1993).

[10] Rodenacker K, Bischoff P: Quantification of tissue sections: graph theory and topology as modelling tools. Pattern Recognition Letters 11:275-284 (1990).

[11] Russ JC: The Image Processing Handbook (2nd ed). CRC Press Inc., Boca Raton, 1995.

[12] Schwarz H, Exner HE: The characterization of the arrangement of feature centroids in planes and volumes. Journal of Microscopy 129:155-169 (1983).

[13] Shapiro MB, Schein SJ, Monasterio FM: Regularity and structure of the spatial pattern of blue cones of macaque retina. Journal of the American Statistical Association 80:803-812 (1985).

[14] Vincent L: Graphs and mathematical morphology. Signal Processing 16:365-388 (1989).