Visual Characterisation of Colon Images

Alison Todman^a, Raouf N. G. Naguib^a, Mark K. Bennett^b

^a BIOCORE, School of Mathematical & Information Sciences, Coventry University
^b Newcastle University and Freeman Hospital, Newcastle-upon-Tyne

Abstract

This paper describes a pilot study investigating the potential of metrics based on low-level processes underlying form perception in human vision to differentiate between normal and cancerous tissue in histopathological images of colon. We also discuss possibilities for further visual characterisation of these images in terms of low-level visual processes and their relationship to domain-based knowledge developed through clinical training.

1. Introduction

Colon cancer is one of the most common types of cancer currently affecting the population of the UK and, due to an increase in life expectancy, its incidence is rising among the ageing population. In general, the disease can be treated very effectively if detected in its early stages. Regular screening is therefore desirable, and there is a need for the development of automated, quantitative analysis techniques capable of supporting the increase in workload associated with such programmes.

Most colorectal cancers are resected at an advanced stage, and the prognosis for the patient depends upon the depth of growth and the spread of the tumour. There is significant inter-observer variation between pathologists in the assessment of several histological features that are potentially of prognostic importance. For screening to be effective, features observable in the early phases of the disease (during which a dysplastic polyp may be present for over a decade) should indicate whether rapid progression or an indolent course is likely. At present the only feature of established significance is size: a polyp greater than 1 cm in diameter is more likely to undergo malignant transformation than one of less than 1 cm.

A number of studies have investigated methods for classifying the histopathological images used in the diagnosis of colorectal disease [1-6]. These have examined the use of morphometric and densitometric techniques, as well as texture and fractal analysis, for the classification of images of normal and cancerous tissue. Some consideration has also been given to conditions such as dysplasia that signal potential malignancy. However, despite reports of promising results, problems remain and further research is required before we can deliver the accuracy required of a working automated system. Overwhelmingly, diagnosis still relies on the clinical expertise of a trained pathologist.

2. Methods

The images in Figure 1 show typical examples of normal and diseased tissue extracted from stained (using Haematoxylin & Eosin) histological specimens of colon. Regions of 256 x 256 pixels were extracted from the original images and converted from colour to 8-bit greyscale for structural analysis. As can be seen, normal colon tissue (Figure 1(a)) is characterised by a well-organised, strongly oriented structure. This progressively breaks down with disease, losing much of its orientational coherence in dysplastic conditions (Figure 1(b)) and almost all regular structure in a cancerous state (Figure 1(c)).

Based on a statistical analysis, previous studies [6] identified measures of elongation and parallelism of delineated structures in the images as discriminating features and subsequently performed classification through a morphometric analysis of the segmented images. However, given the general complexity of the images (poor contrast, touching features, irregular shapes and sizes) and the erratic nature of cancerous specimens, these approaches have yielded unreliable results. In addition, some measures, such as the stratification index used in [7], are simply undefined in cancerous conditions due to the absence of certain baseline features.

Texture-based approaches have examined the power of features such as the angular second moment, contrast function, correlation function, entropy and inverse difference moment (derived from the grey-level co-occurrence matrix) to discriminate between normal and cancerous specimens. While these studies claim improved results, the classification does not include dysplasia, a sign of potentially malignant change in the tissue that is crucial to early detection of the disease.

The statistical analysis of textural properties is clearly worthy of further attention. However, in parallel with analysis of the data itself, we consider the analysis of human performance to be of equal interest and are currently investigating the role of low- and higher-level visual processes in the classification of these images. This work currently includes psychophysical experiments, using techniques similar to those described in [8], aimed at understanding the role of clinical training and experience in this task.

It also includes an investigation of lower-level visual processes that have the potential to capture some of the essential characteristics that define these image types.

In this study we seek to demonstrate that metrics based on the responses of receptive field operators, modelling the orientational selectivity of neurons found in the early visual pathway, are capable of discriminating between images of normal, dysplastic and cancerous samples. These operators act directly on the grey-level image, giving a measure of the degree to which the perceived structure displays a coherent orientational preference on visual examination. We do not deny the presence of further cognitive processing in determining the final classification, but rather suggest that the activity of neurons at this level must surely contribute to the final outcome.

Orientational selectivity is central to human form perception. Founded on substantial neurophysiological research, computational models that simulate the behaviour of line- and edge-detecting neurons found in the primary visual cortex have been devised [9,10]. Such oriented mechanisms are now well-established computer vision tools [11]. Here, however, we explicitly return to their role within human vision as a first step towards the development of a perceptually-driven system.

As described in [12], we have investigated the use of two metrics based on receptive field operators that simulate the activity of simple and complex cells found in the primary visual cortex. A bank of asymmetric and symmetric simple cells, each covering six equally spaced orientations, is used. These stretched-Gabor filters (devised by Heitger et al. [10]) respond optimally to contrast edges and line-like features of a specific orientation, respectively. Complex cells with the corresponding orientational preference respond equally well to oriented contrast discontinuities of either type (edge or line). The response of these cells gives us a measure of the contrast boundary-induced neural activity associated with each orientation in the image. A local energy model that combines the outputs of quadrature pairs of simple cells at the same orientation is used to simulate this response. The complex cell response is then used to calculate the total activation ratio, T, and the orthogonal activation ratio, T⊥, which both vary according to the extent to which the structure in an image displays a coherent orientational preference. Theoretically, such metrics should yield high values for images of normal colon, while evidence of dysplasia or cancer should give rise to comparatively low metric values.

For an input image of size N x M pixels, the orientational response R_i, which measures the level of activation for a single orientation i, is defined as

R_i = \sum_{x=1}^{N} \sum_{y=1}^{M} C_i[x, y]    (1)

where C_i[x, y] denotes the complex cell response at a point in the image. T is then defined as the ratio of the maximum orientational response to the total neural activation, normalised according to the number of orientations over which activation is measured, i.e.,

T = \frac{R_{\max} \cdot z}{\sum_{i=1}^{z} R_i}    (2)

where R_max = max(R_i) denotes the maximum oriented response measured over the set of all orientations, and z is the total number of orientations used. T⊥ is defined as the inverse ratio of the maximum orientational response to the corresponding orthogonal response, i.e.

T_{\perp} = \frac{R_{\max\perp}}{R_{\max}}    (3)

where the subscript max⊥ denotes the orientation orthogonal to the orientation of maximum response. T⊥ varies between 0 and 1, so for comparative purposes 1 - T⊥ is quoted in the following results. Hence, high values of 1 - T⊥ indicate strong orientational cohesion, while low values signify a lack of orientational preference. Note that the values of both metrics are independent of the absolute orientation of features in the image. Furthermore, since the metrics rely on ratios of orientational activity within the image, they are robust to contrast differences between samples.
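To make the pipeline above concrete, the following is a minimal sketch rather than the implementation used in this study: standard Gabor quadrature pairs from scikit-image stand in for the stretched-Gabor operators of Heitger et al. [10], and the filter frequency, crop coordinates and file name are illustrative assumptions.

# Sketch only: Gabor quadrature pairs approximate the stretched-Gabor
# operators; frequency, crop position and file name are assumed values.
import numpy as np
from skimage import io, img_as_ubyte
from skimage.color import rgb2gray
from skimage.filters import gabor

N_ORIENTATIONS = 6   # six equally spaced orientations, as in the paper
FREQUENCY = 0.1      # spatial frequency of the Gabor filters (assumed)


def preprocess(path, top=0, left=0, size=256):
    """Load an RGB image, convert to 8-bit greyscale and crop a size x size region."""
    grey = img_as_ubyte(rgb2gray(io.imread(path)))
    return grey[top:top + size, left:left + size].astype(float)


def oriented_responses(image):
    """Eq. (1): R_i is the complex-cell response C_i[x, y] summed over the image."""
    responses = []
    for i in range(N_ORIENTATIONS):
        theta = i * np.pi / N_ORIENTATIONS
        # The even (line-like) and odd (edge-like) Gabor outputs form a
        # quadrature pair; their local energy approximates the complex-cell response.
        even, odd = gabor(image, frequency=FREQUENCY, theta=theta)
        complex_cell = np.sqrt(even ** 2 + odd ** 2)
        responses.append(complex_cell.sum())
    return np.array(responses)


def activation_ratios(responses):
    """Eqs. (2) and (3): return the pair (T, 1 - T_perp)."""
    z = len(responses)
    i_max = int(np.argmax(responses))
    r_max = responses[i_max]
    T = r_max * z / responses.sum()        # Eq. (2)
    i_orth = (i_max + z // 2) % z          # orientation 90 degrees away
    T_perp = responses[i_orth] / r_max     # Eq. (3)
    return T, 1.0 - T_perp


# Hypothetical usage:
# R = oriented_responses(preprocess("colon_sample.png"))
# T, one_minus_T_perp = activation_ratios(R)

As in the definitions above, both quantities are ratios of summed oriented activity, so the sketch is insensitive to overall contrast scaling of the input image.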

3. Results

As predicted, the strongly oriented normal samples in Figure 1(a) give rise to high values of T and 1 - T⊥ (shown in Table 1 and Figure 2) compared with the diseased and cancerous tissues in Figures 1(b) and 1(c).

However, while both metrics appear to differentiate between normal and abnormal tissue, 1 - T⊥ shows more overlap between the measures derived for the dysplastic and cancerous samples. In this case T (which is less sensitive to individual components of the total activity) appears to be marginally more effective.

Since the ultimate aim of the work is to provide quantitative measures to assist pathologists in arriving at consistent evaluations of an image, at a clinically higher resolution than considered so far, we must clearly establish the full range of features being used in the classification process. Further investigations are currently in progress. Through image similarity experiments and the use of multidimensional scaling and hierarchical clustering techniques, we are attempting to determine the actual dimensions used by clinicians to classify these images (a sketch of this style of analysis follows Figure 2). We are also attempting to determine the granularity of the scale along which images showing dysplasia are graded. Initial results suggest that structural orientation may indeed be an important factor. Modelling this situation, however, appears to require a multi-resolution approach that combines the outputs of simple and complex cell operators at different scales.

[Figure 1. Images showing normal, dysplastic and cancerous samples: (a) normal colon, N1-N5; (b) dysplasia, D1-D5; (c) colon cancer, C1-C5.]

Image   T      1 - T⊥
N1      2.24   0.72
N2      2.04   0.73
N3      1.78   0.60
N4      1.75   0.59
N5      1.49   0.52
D1      1.33   0.34
D2      1.31   0.38
D3      1.25   0.39
D4      1.14   0.07
D5      1.10   0.12
C1      1.12   0.15
C2      1.10   0.10
C3      1.06   0.08
C4      1.06   0.02
C5      1.04   0.01

Table 1. Total and orthogonal activation ratios for the images shown in Figure 1.

[Figure 2. Comparison of metrics: total activation ratio and orthogonal activation ratio.]
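For the image similarity experiments referred to in the Results, the sketch below is a hedged illustration of the intended style of analysis only: it applies multidimensional scaling and average-linkage hierarchical clustering to a random placeholder dissimilarity matrix standing in for pairwise similarity judgements. The data, number of clusters and library choices are not taken from the study.

# Placeholder analysis sketch; the dissimilarity matrix is random data.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform
from sklearn.manifold import MDS

rng = np.random.default_rng(0)
n_images = 15                                  # e.g. N1-N5, D1-D5, C1-C5
d = rng.random((n_images, n_images))
dissimilarity = (d + d.T) / 2                  # symmetrise the judgements
np.fill_diagonal(dissimilarity, 0.0)

# Recover a low-dimensional configuration of the images from the judgements.
mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(dissimilarity)

# Group the images by average-linkage hierarchical clustering.
Z = linkage(squareform(dissimilarity), method="average")
labels = fcluster(Z, t=3, criterion="maxclust")
print(coords.shape, labels)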

4. Discussion

Human vision has always been a source of inspiration for the development of new methods in the field of computer vision. However, for various reasons, methods that attempt to model human perception directly have largely been rejected in favour of generally more efficient image analysis techniques. While this is understandable, there remains a need to investigate perception more fully in order to find better ways of expressing what people see in relation to the features and mechanisms used in the machine analysis of an image. These issues have recently been discussed in relation to content-based image retrieval [8], where the success of a system relies not only on the underlying mechanisms but largely on the human-computer interface and the ability of a user to describe the images they are looking for. Our failure to obtain good results in such systems, even when we supply a query image that is visually similar to the images to be retrieved, highlights our still limited understanding of vision and of the knowledge and context within which it operates. Computerised systems for the classification of medical images also frequently require intervention by a clinical expert, particularly to highlight regions of interest on which to focus computational analysis. Here too it is difficult for clinicians to describe images adequately in terms that correspond well to the low-level features and processes we use to extract and analyse them.

A number of visual features are apparent in the colon images used in this study. In this paper we have focused on the orientational coherence evident in normal specimens, and we are currently considering other low-level visual mechanisms associated with texture and colour perception that may be used to aid the classification of these images. In particular, we are interested in comparing our metrics to human performance and in assessing differences between novice and expert clinicians. To this end, image similarity experiments akin to those described in [8] and [13], which have previously been used to identify dimensions of texture in greyscale and colour patterns, are currently being piloted. We hope to learn more about the dimensions of texture in natural images and to establish parameters that may allow us to quantify elements of learning associated with the progression of clinical training.

References

1. P. W. Hamilton, D. C. Allen, and P. C. H. Watt, "Classification of normal colorectal mucosa and adenocarcinoma by morphometry", Histopathology, 11, pp. 901-911, 1987.
2. P. W. Hamilton, P. H. Bartels, D. Thompson, N. H. Anderson, and R. Montorini, "Automated location of dysplastic fields in colorectal histology using image texture analysis", J. Pathol., 182, pp. 68-75, 1997.
3. M. Bibbo, F. Michelassi, P. H. Bartels, H. Dytch, C. Bania, E. Lerma, and A. G. Montag, "Karyometric marker features in normal-appearing glands adjacent to human colonic adenocarcinoma", Cancer Research, 50, pp. 147-151, 1990.
4. N. Esgiar, R. N. G. Naguib, B. S. Sharif, M. K. Bennett, and A. Murray, "Microscopic image analysis for quantitative measurement and feature identification of normal and cancerous colonic mucosa", IEEE Trans. Inform. Tech. Biomed., 2(3), pp. 197-203, 1998.
5. N. Esgiar, R. N. G. Naguib, B. S. Sharif, M. K. Bennett, and A. Murray, "Fractal analysis in the detection of colonic cancer images", IEEE Trans. Inform. Tech. Biomed., submitted July 2000.
6. N. Esgiar, R. N. G. Naguib, M. K. Bennett, and A. Murray, "Automated feature extraction and identification of colon carcinoma", Analytical and Quantitative Cytology and Histology, 20(4), pp. 297-301, 1998.
7. W. Polkowski, J. Baak, J. Van Lanschot, G. Meijer, L. Schuurmans, F. Ten Kate, H. Obertop, and G. Offerhaus, "Clinical decision making in Barrett's oesophagus", J. Pathol., 184, pp. 161-168, 1998.
8. A. Mojsilovic, J. Kovacevic, J. Hu, R. J. Safranek, and S. K. Ganapathy, "Matching and retrieval based on the vocabulary and grammar of color patterns", IEEE Trans. on Image Processing, 9(1), pp. 38-53, 2000.
9. M. Sun and A. B. Bonds, "Two-dimensional receptive field organisation in striate cortical neurons of the cat", Visual Neuroscience, 11, pp. 703-720, 1994.
10. F. Heitger, L. Rosenthaler, R. von der Heydt, E. Peterhans, and O. Kubler, "Simulation of neural contour mechanisms: From simple to end-stopped cells", Vision Res., 32(5), pp. 963-981, 1992.
11. W. T. Freeman and E. H. Adelson, "The design and use of steerable filters", IEEE Trans. Pattern Anal. and Mach. Intell., 13(9), pp. 891-906, 1991.
12. A. G. Todman, R. N. G. Naguib, and M. K. Bennett, "Orientational coherence metrics: classification of colon images based on human form perception", Proc. IEEE CCECE, pp. 1379-1385, 2001.
13. A. R. Rao and G. L. Lohse, "Towards a texture naming system: Identifying relevant dimensions of texture", Vision Res., 36(11), pp. 1649-1669, 1996.