Comparison of Fuzzy Rule Based Classification with Neural Network Approaches for Medical Diagnosis¹

Christoph S. Herrmann
Darmstadt University of Technology, Department of Computer Science, Institute of Intellectics
Alexanderstr. 10, 64283 Darmstadt
Tel.: (06151) 16-6651, Email: [email protected]

Saman K. Halgamuge, Manfred Glesner
Darmstadt University of Technology, Department of Computer Engineering, Institute of Microelectronic Systems
Karlstr. 15, 64283 Darmstadt
Tel.: (06151) 16-4337, Email: [email protected]

Abstract

Several methods of fuzzy rule based classification are applied to medical data. Features from patient data, collected in a clinic, are pre-processed before being fed into fuzzy neural networks, from which fuzzy rule based classification systems are generated. In addition, the generated fuzzy classifier systems are compared with results obtained from feedforward architectures such as standard backpropagation networks, radial basis function (RBF) networks and Dynamic Vector Quantization (DVQ). The major problem addressed in this work is the detection of certain phenomena in EEGs, so-called graphoelements. This kind of automation can ease the traditional task of trained medical practitioners reading pages of EEG diagrams. Results show the relative performance of the compared networks and possible applications to other medical data. The approaches sketched here, particularly the classifier and the concept of membership function generation, are not specific to EEG classification; they may equally be applied to any set of features in patient data that can be transformed into a fuzzy representation.

1 Introduction

Our main goal is to give an overview of possible fuzzy pattern classification methods using neural networks. Four different neural network architectures are compared with respect to their performance on medical data.

Figure 1: EEG time trace

In Section 2 we introduce the problem of diagnosing electroencephalograms (EEGs, see Fig. 1) with neural networks. The pre-processing and fuzzification steps are explained; they serve as a guideline for applying our methods to other data sets. The advantages and disadvantages of the compared network architectures are discussed with respect to medical diagnosis. Section 3 presents the results of our comparison experiments to help prospective users choose a suitable architecture for a given type of problem. Section 4 summarizes our proposals.

¹ This work was a cooperation with the Mainz University Clinic, Department of Neurology, Reisingerweg, 55101 Mainz.


2 Methods

For our comparison of multiple learning strategies, we chose real-world medical data for evaluation. Figure 1 shows 10 seconds of physiologic EEG in the time domain². In second zero, a bulbus artifact occurs, visible as the steep high-voltage transient. Such a disturbance of the brain's electric potential by eye movements is also seen twice, in seconds seven and eight. The base rhythm of this EEG is approximately 8 Hz, and low-voltage fast waves (ripples) are mixed in. For the purpose of feature extraction, this time domain signal is transformed into the frequency domain as a pre-processing step. Figure 2 (a) shows the spectrum of one second of EEG.
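The pre-processing step described above, transforming one second of EEG into its spectrum, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the sampling rate `fs`, the function name `amplitude_spectrum`, and the synthetic test signal are assumptions made for the example.

```python
import numpy as np

def amplitude_spectrum(window, fs):
    """Return (frequencies in Hz, amplitudes) for one EEG window."""
    window = np.asarray(window, dtype=float)
    n = window.size
    # Remove the DC offset so that it does not dominate the lowest band.
    centered = window - window.mean()
    # One-sided FFT, scaled so that a sine of amplitude A gives a peak of height A.
    amplitude = 2.0 * np.abs(np.fft.rfft(centered)) / n
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs, amplitude

# Synthetic one-second window: 8 Hz base rhythm plus a weaker 20 Hz ripple.
fs = 128  # assumed sampling rate in Hz
t = np.arange(fs) / fs
signal = 40.0 * np.sin(2 * np.pi * 8 * t) + 5.0 * np.sin(2 * np.pi * 20 * t)
freqs, amp = amplitude_spectrum(signal, fs)
peak_hz = freqs[np.argmax(amp)]  # frequency of the dominant component
```

With a one-second window the frequency resolution is 1 Hz, which is fine enough to separate the four frequency regions used in the remainder of this section.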

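(Figure 2 panels: (a) spectrum plotted as amplitude over frequency, with peaks labeled artifacts (delta, high), base rhythm (alpha, mid) and fast waves (beta, low); (b) grid of the frequency regions delta (0-4 Hz), theta (4-8 Hz), alpha (8-12 Hz) and beta (>12 Hz) against the amplitude attributes zero, low, mid and high.)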

Figure 2: EEG spectrum (a) and mapping scheme (b)

Here, we can see the three features described above (bulbus artifacts, base rhythm and fast waves) as peaks in the spectrum. The frequency range is divided into the regions delta, theta, alpha and beta according to medical standards [6]. The relative maximum of each frequency region is taken as a feature and brought into a fuzzy representation by assigning one of the amplitude attributes zero, low, mid or high to the region. If no maximum exists, the zero attribute is assigned.

Next, the membership functions have to be assigned to the input layer neurons of the neural network. This is done by a two-dimensional mapping technique [4]. The spectrum is mapped onto the 16 neurons N_{frequency,amplitude} of the mapping scheme by calculating the cross product of the fuzzy membership functions (see Figure 2 (b)). The four activated neurons in the lower left-hand corner illustrate the fuzzification of the highlighted membership functions. In addition, the labeled neurons represent the three features extracted from the spectrum: the neuron N_{delta,high} corresponds to the artifact in the spectrum, N_{alpha,mid} represents the base rhythm, and N_{beta,low} stands for the fast waves.

To allow the neural representation to be interpreted as an analogue of the fuzzy one, each fuzzy feature must result in an equal amount of neural activity. As seen in the fuzzification example in the lower left-hand corner, up to four neurons may be activated by a single feature. Hence, we require that the sum of the activations resulting from one feature equals 1. Sum-of-1-criterion:

$$
\forall f \in F,\; u \in U: \quad \sum_{\substack{i \in \{\mathrm{delta}, \ldots, \mathrm{beta}\} \\ j \in \{\mathrm{zero}, \ldots, \mathrm{high}\}}} N_{ij}(f, u) = 1 \tag{1}
$$

where F and U denote the two universes of discourse for the two fuzzy variables frequency and amplitude. Any data set that can be processed according to these steps is suitable for our method of fuzzy rule based classification, e.g. evoked potentials (EP), electromyography (EMG), or electrocardiography (ECG).

² Recorded from frontal electrodes Fp1-F3 in the left hemisphere.
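The feature extraction and two-dimensional mapping described in this section can be sketched as follows. The text does not give the shape of the membership functions, so this sketch assumes triangular functions that overlap pairwise and sum to 1 on each axis; the centers in `FREQ_CENTERS` and `AMP_CENTERS`, the amplitude units, and all function names are hypothetical. Under that assumption the product of the two membership vectors satisfies the sum-of-1 criterion automatically, because the double sum factorizes into the product of two sums that each equal 1.

```python
import numpy as np

# Band limits follow the regions named in the text; the centers below are
# assumed values chosen only for illustration.
BANDS = {"delta": (0.0, 4.0), "theta": (4.0, 8.0),
         "alpha": (8.0, 12.0), "beta": (12.0, np.inf)}
FREQ_CENTERS = (2.0, 6.0, 10.0, 16.0)  # Hz: delta, theta, alpha, beta
AMP_CENTERS = (0.0, 10.0, 30.0, 60.0)  # assumed units: zero, low, mid, high

def band_peak(freqs, amp, lo, hi):
    """Largest local maximum with lo <= frequency < hi, or None if there is none."""
    best = None
    for k in range(1, len(amp) - 1):
        is_local_max = amp[k] > amp[k - 1] and amp[k] >= amp[k + 1]
        if lo <= freqs[k] < hi and is_local_max:
            if best is None or amp[k] > amp[best]:
                best = k
    return None if best is None else (freqs[best], amp[best])

def memberships(x, centers):
    """Triangular memberships over `centers`; at most two are nonzero, sum is 1."""
    centers = np.asarray(centers, dtype=float)
    x = float(np.clip(x, centers[0], centers[-1]))
    eye = np.eye(centers.size)
    # Linear interpolation of each identity column yields one triangle per term.
    return np.array([np.interp(x, centers, eye[:, k])
                     for k in range(centers.size)])

def map_feature(freq, amplitude):
    """4x4 grid N[i, j]: rows are frequency regions, columns amplitude attributes."""
    return np.outer(memberships(freq, FREQ_CENTERS),
                    memberships(amplitude, AMP_CENTERS))

# Toy spectrum with a single peak of height 25 at 9 Hz.
freqs = np.arange(0.0, 33.0)
amp = np.zeros_like(freqs)
amp[9] = 25.0
peak = band_peak(freqs, amp, *BANDS["alpha"])
grid = map_feature(*peak)  # activates theta/alpha x low/mid, four neurons in total
```

A region without a relative maximum returns `None` from `band_peak`; in the scheme described above such a region would receive the zero attribute.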


The resulting 16 neurons serve as inputs for the different network architectures we compare here. A physician labeled the EEG traces to provide supervision for the learning process. A label of 1 denotes 'bulbus artifact', while 0 denotes 'no bulbus artifact'. For training the networks, we prepared two files, bulbus1 and bulbus2, both containing bulbus artifacts. The file bulbus2 additionally contains electrode artifacts, which closely resemble bulbus artifacts in shape and size. These additional electrode artifacts in bulbus2 led to a difference in performance, as will be seen in Section 3.

The network architectures not only performed differently; they also differ fundamentally in applicability. Before presenting the performance results in the next section, we give a compact overview in the following comparison list. These properties should also be considered when choosing an architecture.

Standard Backpropagation is implemented in most simulators [7] and thus available to a broad community of potential users. It provides no possibility of generating fuzzy systems.

Radial Basis Function Networks were among the first to be reported as functionally equivalent to fuzzy systems [5] and permit very simple rule bases with minimum numbers of rules [3].

Dynamic Vector Quantization has proven its applicability to classification problems [2] and allows extraction of fuzzy rules, due to its functional equivalence.

FuNe I has been applied to a variety of real-world problems [1] and is easily implemented in hardware. However, the derived fuzzy system is static and cannot be adapted on-line.
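The core identity behind the functional equivalence of radial basis function networks and fuzzy systems mentioned in the list above can be illustrated numerically. A Gaussian hidden unit with one width per input gives the same value as a fuzzy rule whose premise combines one Gaussian membership function per input with the product operator. The sketch below shows only this identity; the full equivalence in [5] holds under further conditions, and the function names and numeric values are assumptions made for the example.

```python
import numpy as np

def rbf_activation(x, center, width):
    """Gaussian RBF unit with a separate width for each input dimension."""
    z = (np.asarray(x, dtype=float) - center) / width
    return float(np.exp(-0.5 * np.sum(z ** 2)))

def rule_firing_strength(x, center, width):
    """Fuzzy rule premise: product (AND) of one Gaussian membership per input."""
    z = (np.asarray(x, dtype=float) - center) / width
    memberships = np.exp(-0.5 * z ** 2)
    return float(np.prod(memberships))

# Hypothetical unit with three inputs.
center = np.array([0.8, 0.1, 0.3])
width = np.array([0.2, 0.5, 0.4])
x = np.array([0.7, 0.0, 0.5])

as_neuron = rbf_activation(x, center, width)
as_rule = rule_firing_strength(x, center, width)  # same value up to rounding
```

Read in this way, each hidden unit corresponds to one rule, and its center and width along an input define the membership function for that input.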

3 Results

Two data sets are considered: data set 1 (bulbus1) as the training file and data set 2 (bulbus2) as the test file. The test and training files are then interchanged to obtain more reliable final results. It appears that networks with a generalizing character deliver better results than the Radial Basis Function Networks trained with restricted Coulomb energy learning. The best results are obtained with an extended fuzzy neural network based on Dynamic Vector Quantization with sub-Bayesian approximation [2].

A general problem of many neural networks that can be interpreted as fuzzy systems is that the resulting fuzzy systems are in most cases not readable. This is because all inputs are included in the rules, creating complicated knowledge structures with many redundant rules. The extended network is capable of optimizing the membership functions while keeping them limited in number, which yields more general, compact and human-readable systems.

The best result was obtained with data set 1 as the training set. Only 3 DVQ neurons were generated in the hidden layer (usually interpreted as 3 rules), and at most 2 different weights and radii were associated with each input (i.e. at most 2 membership functions per input). The performance on the second data set was 98.4%. For cross validation, a net consisting of 3 hidden neurons was generated from the second data set and tested on the first data set, which led to an accuracy of 90.6%.

Training set       bulbus1   bulbus2
Test set           bulbus2   bulbus1
DVQ performance    98.4 %    90.6 %

The performance of standard backpropagation and of FuNe I, which is based on modified backpropagation, reached 94%. With FuNe I, more compact fuzzy classifiers could be generated. Another advantage of FuNe I was the possibility of including the two-dimensional mapping or fuzzification (a priori knowledge) discussed earlier.
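The evaluation protocol used above, training on one file and testing on the other and then interchanging the two, can be sketched as follows. The classifier here is a plain nearest-class-mean prototype model used only as a stand-in; it is not the DVQ network of [2]. The data are synthetic 16-dimensional vectors generated for the example, so the resulting accuracies say nothing about the EEG data.

```python
import numpy as np

def fit_prototypes(X, y):
    """Stand-in model: one prototype per class, the mean of its training vectors."""
    classes = np.unique(y)
    prototypes = np.stack([X[y == c].mean(axis=0) for c in classes])
    return classes, prototypes

def predict(model, X):
    classes, prototypes = model
    # Squared Euclidean distance from every sample to every prototype.
    d2 = ((X[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=2)
    return classes[np.argmin(d2, axis=1)]

def accuracy(model, X, y):
    return float(np.mean(predict(model, X) == y))

def swap_evaluation(set_a, set_b):
    """Train on A and test on B, then train on B and test on A."""
    (Xa, ya), (Xb, yb) = set_a, set_b
    return (accuracy(fit_prototypes(Xa, ya), Xb, yb),
            accuracy(fit_prototypes(Xb, yb), Xa, ya))

def make_synthetic(rng, n=100):
    """Synthetic stand-in for one labeled file: 16 inputs, labels 0 or 1."""
    y = rng.integers(0, 2, size=n)
    X = 0.5 * y[:, None] + rng.normal(0.0, 0.05, size=(n, 16))
    return X, y

rng = np.random.default_rng(0)
acc_ab, acc_ba = swap_evaluation(make_synthetic(rng), make_synthetic(rng))
```

Reporting both directions, as in the table above, shows how strongly the result depends on which file is used for training.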

4 Conclusions

We have shown that the traditional task of trained medical practitioners reading pages of EEG diagrams can be eased by the proposed methods of automation. The results show the relative performance of the compared networks and possible applications to other medical data. Two basic advantages of fuzzy rule based classification over neural networks could be identified:

- the transparency of the structure, obtained by using neural networks that can be interpreted as human-readable fuzzy systems
- the possibility of including proposals based on a priori knowledge, formulated by medical practitioners

The second advantage is not limited to formulating rules; in the case of FuNe I it also extends to including fuzzification methods. The authors are working on expanding the network capacity in order to analyse more complicated patient data sets and to identify a larger number of different graphoelements and artifacts.

References

[1] S. K. Halgamuge and M. Glesner. Neural Networks in Designing Fuzzy Systems for Real World Applications. International Journal for Fuzzy Sets and Systems, 65(1):1-12, 1994. North Holland.

[2] S. K. Halgamuge, C. Grimm, and M. Glesner. A Sub Bayesian Nearest Prototype Neural Network with Fuzzy Interpretability for Diagnosis Problems. In ACM Symposium on Applied Computing (SAC'95) (invited session), Nashville, USA, February 1995.

[3] S. K. Halgamuge, W. Pochmueller, and M. Glesner. An Alternative Approach for Generation of Membership Functions and Fuzzy Rules Based on Radial and Cubic Basis Function Networks. International Journal of Approximate Reasoning (in press), 1995.

[4] C. S. Herrmann. A fuzzy neural network for detecting graphoelements in EEGs. In Supercomputers in Brain Research. World Scientific Publisher, 1995. To be published in March.

[5] J.-S. R. Jang and C.-T. Sun. Functional equivalence between radial basis function networks and fuzzy inference systems. IEEE Transactions on Neural Networks, 4(1):156-158, 1993.

[6] E. Niedermeyer and F. Lopes da Silva. Electroencephalography, Basic Principles, Clinical Applications and Related Fields. Williams & Wilkins, 1993.

[7] A. Zell, N. Mache, T. Sommer, and T. Korb. The SNNS neural network simulator. In T. Christaller, editor, German Workshop on Artificial Intelligence. Gesellschaft für Informatik, 1991.
