Electroencephalogram processing using Hidden Markov Models

D. Novák1, T. Al-ani2, A. Hamam2, L. Lhotská1

1 Department of Cybernetics, Czech Technical University in Prague, [email protected], http://cyber.felk.cvut.cz
2 A2SI Laboratory, Group ESIEE-Paris, France, [email protected]

ABSTRACT An approach for Electroencephalogram (EEG) processing is presented. Along with the theoretical development of stochastic processing techniques, two application areas are suggested: EEG sleep recording analysis and Brain Computer Interface (BCI). Many methods have already been developed in the area of sleep staging; nevertheless, automatic scoring is still not as effective as manual scoring. Our sleep scoring method has the advantage of better temporal resolution (1 second) compared to the classical manual approach (30 seconds). BCI is a fairly new approach, mainly offering support for disabled people in controlling a personal computer. An algorithm for the determination of cue movements has been designed, which detects the movements within a one second interval.

KEYWORDS Electroencephalogram, Sleep stage scoring, Brain-Computer Interface, Hidden Markov Models

1 Introduction

Sleep scoring is the fundamental basis of sleep research and sleep medicine. An excellent tutorial on the problem of sleep analysis based on polysomnography can be found in [1]. For more than 30 years, sleep scoring has been based on rules compiled by a committee chaired by Rechtschaffen and Kales [2]. It is now widely recognized that these rules have limitations which were not foreseen 30 years ago. The major drawbacks are: low temporal resolution, neglect of spatial information, low correspondence between electrophysiological activity and stages, and neglect of other physiological parameters such as autonomic nervous system activity and body motility [3].

BCI, along with biofeedback, is a fairly new approach in neurology [4]. Biofeedback is a coaching and training process which helps people learn how to change patterns of behavior and to take greater responsibility for their health and for their mental, physical, emotional and spiritual functioning. In contrast to biofeedback, we do not want people to adapt to computers. The basic idea is that the computer adapts rather than the human. This requires a phase in which the prospective user performs a particular fixed task, such as imagining making finger movements [5], whilst the EEG is recorded. A computer then adapts its recognition algorithm in order to discriminate between imagined movements and the background EEG. The performance achieved to date using dynamic Bayesian nets [6] or autoregressive methods [7] varies between 70 and 80%.

In this paper we first propose a sleep analysis method based on the theory of Hidden Markov Models. The method seems very promising for extending and correcting the rules of Rechtschaffen and Kales (R&K), mainly with respect to temporal resolution: R&K scoring is based on 30 second epochs, whereas our analysis is done at a resolution of 1 second. Secondly, regarding the BCI application [8], we cast the detection of cue movements in the framework of Hidden Markov Models. We set the following two objectives: to determine the most descriptive electrode from the set of 59 EEG electrodes, and then to perform the detection itself.

2 Method

The main framework of EEG processing has many elements in common in the sleep EEG and BCI areas. First, the important step of artifact removal must be performed. Second, suitable feature extraction methods must be selected. Finally, the analysis itself must be carried out.

2.1 EEG Preprocessing

We summarize the principal steps involved in our EEG analysis.

1. Removal of artifacts. The following artifacts should be removed: electrocardiogram (ECG), electrooculogram (EOG), electromyogram (EMG), body movement and respiration. Artifact rejection can be achieved by digital filtering, correlation analysis or independent component analysis. We have applied the last technique to remove EOG artifacts from the EEG signal [9]. To enhance the performance of the BCI system, the Laplacian transform [10] was applied. The Laplacian operator transforms the recordings at each time point by calculating the difference between each electrode reading and the average reading of its four neighbors. It has the effect of localizing the information received from each channel.

2. Feature extraction. The following features, which have been shown to carry important information about the development of sleep stages and about imaginary tasks, have been used: spectral entropy [11], a stochastic complexity measure [12], autoregressive parameters [13] and Hjorth parameters such as mobility and complexity [14].

Spectral entropy [11] quantifies the spectral complexity of a time series. It is the Shannon information entropy of the probability function derived from the power spectral density (psd), which is the squared magnitude of the Fourier coefficients. Normalizing the psd with respect to the total power of the Fourier coefficients yields the probability function used in the calculation of entropy.

The stochastic complexity measure [12] is computed using the method of delays to construct an embedding matrix for the temporal EEG analysis. The number of significant singular values of this matrix, obtained via singular value decomposition, serves as the measure of complexity by quantifying the temporal information content of the signal.

The third feature is the set of autoregressive parameters [13]. This is a linear predictive time series model in which the next value of the time series is modelled as a linear combination of the r previous values, where r is referred to as the order of the model. In our case, we have used r = 7 and the standard Yule-Walker approach for the computation of the autoregressive coefficients.

Finally, the Hjorth measures of mobility and complexity describe physiological and physical alterations, as was demonstrated in [14]. Mobility gives a measure of the deviation of the slope with respect to the deviation of the EEG amplitude, while complexity provides a measure of excessive details with regard to the softest possible curve shape.

3. Classification rules. The classification of a sleep stage or a BCI task based on the features is often done using machine learning [15], a knowledge-based approach [16], artificial neural networks [17] or chaos theory [18]. In this paper we propose to use Hidden Markov Model theory, which belongs to the statistical approaches and is described in the next section.
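The feature extraction step can be sketched as follows. This is a minimal illustration, not the authors' Matlab implementation: it computes spectral entropy, the Hjorth mobility and complexity, and order-7 Yule-Walker autoregressive coefficients for consecutive non-overlapping windows. The function names, the use of the biased autocovariance estimate and the choice of the natural logarithm are assumptions made for this example; the embedding-based complexity measure is omitted.

```python
import numpy as np

def spectral_entropy(x):
    """Shannon entropy of the normalized power spectral density."""
    psd = np.abs(np.fft.rfft(x)) ** 2   # squared magnitude of Fourier coefficients
    p = psd / psd.sum()                 # normalize by total power -> probability function
    p = p[p > 0]                        # empty bins contribute nothing to the entropy
    return float(-np.sum(p * np.log(p)))

def hjorth_mobility_complexity(x):
    """Hjorth mobility and complexity from first and second differences."""
    dx = np.diff(x)
    ddx = np.diff(dx)
    mobility = np.sqrt(np.var(dx) / np.var(x))
    complexity = np.sqrt(np.var(ddx) / np.var(dx)) / mobility
    return float(mobility), float(complexity)

def yule_walker_ar(x, order=7):
    """AR coefficients a_1..a_r of x[t] = sum_k a_k x[t-k] + e[t] (Yule-Walker)."""
    x = x - x.mean()
    n = len(x)
    # Biased autocovariance estimates r[0..order] (assumption of this sketch).
    r = np.array([np.dot(x[:n - k], x[k:]) / n for k in range(order + 1)])
    # Toeplitz system R a = r[1:].
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:])

def window_features(signal, fs, order=7):
    """One feature vector per non-overlapping one-second window (fs samples)."""
    rows = []
    for w in range(len(signal) // fs):
        seg = signal[w * fs:(w + 1) * fs]
        mob, comp = hjorth_mobility_complexity(seg)
        rows.append(np.concatenate(([spectral_entropy(seg), mob, comp],
                                    yule_walker_ar(seg, order))))
    return np.array(rows)
```

For a single channel sampled at 250 Hz, `window_features(x, 250)` returns one row of 10 values (entropy, mobility, complexity and seven AR coefficients) for each second of signal.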

2.2 HMM description

An HMM is a stochastic finite state automaton defined by the parameters λ = (A, p, B), where A is the state transition probability matrix, p is the initial state probability vector and B is the emission probability density function of each state, defined by a finite multivariate Gaussian mixture. Each model can be used to compute the probability P(O|λ) of observing a discrete-time input sequence O = O1, ..., OT; to find the corresponding state sequence Q that maximizes P(Q|O, λ); and to induce a new model λ' that increases the probability of a given sequence, so that P(O|λ') > P(O|λ). These are known as the three problems of an HMM: evaluation, generation, and training [19].
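The first two problems can be illustrated with a short sketch. It is written under simplifying assumptions that are not taken from the paper: each state emits from a single diagonal-covariance Gaussian rather than a mixture, all entries of A and p are strictly positive, and all function names are hypothetical. The forward recursion gives log P(O|λ), and the Viterbi recursion gives the most probable state sequence.

```python
import numpy as np

def log_gauss_diag(obs, means, variances):
    """Log emission densities; obs is (T, D), means and variances are (N, D)."""
    diff = obs[:, None, :] - means[None, :, :]
    terms = np.log(2 * np.pi * variances)[None] + diff ** 2 / variances[None]
    return -0.5 * terms.sum(axis=2)          # shape (T, N)

def _logsumexp(a, axis):
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))

def forward_loglik(log_b, A, p):
    """Evaluation: log P(O | lambda) by the forward recursion in the log domain."""
    log_A = np.log(A)
    alpha = np.log(p) + log_b[0]
    for t in range(1, len(log_b)):
        alpha = _logsumexp(alpha[:, None] + log_A, axis=0) + log_b[t]
    return float(_logsumexp(alpha, axis=0))

def viterbi(log_b, A, p):
    """Most probable state sequence given the observations."""
    T, N = log_b.shape
    log_A = np.log(A)
    delta = np.log(p) + log_b[0]
    back = np.zeros((T, N), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + log_A       # scores[i, j]: best path to i, then i -> j
        back[t] = np.argmax(scores, axis=0)
        delta = scores[back[t], np.arange(N)] + log_b[t]
    path = np.empty(T, dtype=int)
    path[-1] = int(np.argmax(delta))
    for t in range(T - 1, 0, -1):             # backtrack from the best final state
        path[t - 1] = back[t, path[t]]
    return path
```

Training (the third problem) would re-estimate A, p and the emission parameters from the forward and backward quantities with EM; it is left out to keep the sketch short.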

3 Results and Discussion

We have applied the methodology described above to the two application areas at hand: sleep stage analysis and brain computer interface. Next, we describe the experimental HMM set-up and summarize the results achieved in both application areas. We assume that all temporal feature values are continuous; thus, we use the continuous density HMM. Furthermore, we assume that the individual features are mutually independent, so the covariance matrix of each state reduces to a variance vector. Since the classical Expectation-Maximization (EM) approach [20] used during learning is a local optimization method, we used k-means initialization to increase the chance of starting the training procedure near the global optimum.
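A k-means initialization of this kind might look like the sketch below. It uses plain Lloyd iterations written with NumPy and returns one mean vector and one variance vector per state, matching the diagonal-covariance assumption. The function name, the fixed iteration count, the random seeding and the small variance floor are assumptions of this example, not details given in the paper.

```python
import numpy as np

def kmeans_init(features, n_states, n_iter=20, seed=0):
    """Initial state means and variance vectors from k-means clusters.

    features: (T, D) array of feature vectors; returns (means, variances, labels).
    """
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(features), size=n_states, replace=False)
    centers = features[idx].astype(float)

    def assign(c):
        # Squared Euclidean distance of every sample to every center.
        d2 = ((features[:, None, :] - c[None, :, :]) ** 2).sum(axis=2)
        return d2.argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(centers)
        for k in range(n_states):
            if np.any(labels == k):            # keep the old center if a cluster is empty
                centers[k] = features[labels == k].mean(axis=0)

    labels = assign(centers)
    variances = np.empty_like(centers)
    for k in range(n_states):
        members = features[labels == k]
        # Fall back to the global variance for an empty cluster; add a small floor.
        source = members if len(members) > 0 else features
        variances[k] = source.var(axis=0) + 1e-6
    return centers, variances, labels
```

The returned means and variances would then serve as starting values for the emission densities before EM training begins.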

3.1 Sleep EEG

The data was recorded from one male and five female subjects exhibiting poor sleep quality during an overnight session in a sleep laboratory. The data was manually scored by a sleep expert based on the standard R&K system. The data was collected using a bipolar montage and recorded at a sampling rate of 250 Hz. The features mentioned in Section 2.1, i.e. spectral entropy, stochastic complexity measure, autoregressive parameters and Hjorth parameters, were computed for consecutive non-overlapping windows of one second length. The features were calculated from two EEG channels (F7, T3) and one EMG channel.

The results are summarized in the confusion matrix, see Table 1. R&K scores are taken as true scores, and for each sleep stage separately the percentages of HMM classification are given in the confusion matrix. The number of HMM states, N, was set to N = 6 at the beginning of HMM training. However, after training, the rank of the transition matrix A was rank(A) = 4, so that in the end only N = 4 states were used by the HMM. Figure 1 shows the hypnogram together with the filtered Viterbi sequence. We have applied an average filter with a window length of 30 seconds to allow a fair comparison between the manually scored hypnogram and the HMM classification.

Looking at the results in Table 1, we can furthermore merge the first two HMM states into one, leaving only three HMM states. HMM state 1 corresponds mostly to the wake state W; there is also a weaker association with the lighter sleep stages S1 and S2. The same is valid for HMM state 2, where again much overlap exists among the stages W, S1 and S2. The deeper sleep stages S2 and S3 are associated with HMM state 3. Finally, HMM state 4 is predominantly visited when the subject is in deep sleep S4. It is interesting to note that REM sleep (R) exhibits the weakest association: it does not clearly correspond to any HMM state.

[Figure 1: two stacked panels plotted against time (0 to 800). Top panel: "R&K Manual Hypnogram", vertical axis "sleep stages" (W, S1, R, S2, S3, S4). Bottom panel: "HMM Classification (30 sec non-overlapping Average Filter)", vertical axis "HMM states" (s1 to s4).]

Figure 1. Example of whole night sleep segmentation for one subject. Each time sample corresponds to 30 seconds time resolution.

Table 1. Confusion matrix. HMM classification versus R&K is given in percentages. The sleep stages are coded as follows: W (wake), R (REM sleep), S1, S2, S3 and S4 (deep sleep).

R&K stage    HMM state
              1     2     3     4
W            46    42     8     7
S1           24    13     4     1
R             2     4    16    18
S2           27    41    40     9
S3            0     0    22    10
S4            0     0    10    54

We have shown that our approach is able to perform sleep staging at a better temporal resolution (1 second) than the standard R&K scoring system (30 seconds).
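The comparison between the 1 second HMM output and the 30 second R&K epochs can be sketched as follows. The first function applies a non-overlapping average filter to the Viterbi state sequence; the second builds a confusion matrix in percentages. Normalizing each HMM-state column to 100% is an assumption made here because the columns of Table 1 sum to roughly 100; the function names and the integer coding of stages and states are likewise hypothetical.

```python
import numpy as np

def epoch_average(path, epoch_len=30):
    """Non-overlapping average of a per-second state sequence over 30 s epochs."""
    n_epochs = len(path) // epoch_len
    trimmed = np.asarray(path[:n_epochs * epoch_len], dtype=float)
    return trimmed.reshape(n_epochs, epoch_len).mean(axis=1)

def confusion_percent(stages, states, n_stages, n_states):
    """Rows: R&K stages, columns: HMM states; each column is scaled to 100%."""
    counts = np.zeros((n_stages, n_states))
    for s, q in zip(stages, states):
        counts[s, q] += 1
    col = counts.sum(axis=0, keepdims=True)
    return 100.0 * counts / np.where(col == 0, 1, col)   # avoid division by zero
```

Averaging state indices yields fractional values, so to tabulate them against the R&K stages one would first have to round them to the nearest state, which is one possible choice rather than something the paper specifies.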

3.2 BCI

Our first objective was to determine the most descriptive electrode in terms of detecting cued hand movements. We used a 59-channel EEG system (sampling rate f = 500 Hz). We applied the HMM segmentation to EEG data recorded while a subject performed hand movements in response to cues. The same features as in the previous case were calculated for all EEG channels and for three records. We trained the HMM on the feature set; training converged within 15 EM iterations. Comparing the successful detection rate over all channels, we concluded that the most descriptive channel is number 12, which corresponds to the F2 electrode.

The second objective was the detection itself, which is shown in Figure 2. The Viterbi path is depicted along with the timing of the movement cue (dashed line). The EEG signal consists of three blocks of 10 seconds (dotted line); each block contains a 5 second section before and a 5 second section after the cue. We used N = 2 states, corresponding to present and absent movement. As can be seen from Figure 2, the HMM classification provides a reasonable partitioning of the data. It detects the cue movement within one second resolution (at the 5, 15 and 25 second time points). Moreover, the HMM Viterbi state sequence also detects the beginning of each new data block (at 10 and 20 seconds). We applied the same analysis to two other data sets and obtained very similar results.
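The paper does not define how the successful detection rate was computed. One possible reading, offered only as a sketch, counts a cue as detected when the Viterbi path changes state within one second of the cue and selects the channel with the highest fraction of detected cues. The function names, the tolerance parameter and the per-second sampling of the path are all assumptions of this example.

```python
import numpy as np

def detection_rate(path, cue_times, tol=1):
    """Fraction of cues with a state change within tol samples (assumed criterion)."""
    # Index at which each new state begins.
    changes = np.flatnonzero(np.diff(path) != 0) + 1
    hits = sum(bool(np.any(np.abs(changes - c) <= tol)) for c in cue_times)
    return hits / len(cue_times)

def most_descriptive_channel(paths, cue_times, tol=1):
    """Index of the channel whose Viterbi path detects the most cues, plus all rates."""
    rates = np.array([detection_rate(p, cue_times, tol) for p in paths])
    return int(np.argmax(rates)), rates
```

This criterion rewards any path that switches state often, so in practice it would need a penalty for state changes that match neither a cue nor a block boundary.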

[Figure 2: two stacked panels plotted against time [sec] (0 to 30). Top panel: "EEG channel 12", vertical axis "amplitude". Bottom panel: "HMM classification", vertical axis "states", with the three cues marked.]

Figure 2. HMM partitioning of hand cue movements.

4 Conclusion

We presented an approach towards automatic EEG analysis in two medical application areas: scoring of human sleep and brain computer interface. It applies the method of Hidden Markov Models to the analysis of EEG, which seems to be a natural choice given the multivariate temporal nature of the data. In both application areas our approach is unsupervised, so the HMM must be trained anew each time a new subject is processed. Nevertheless, we prefer unsupervised analysis, both because of the different protocols and hardware used in sleep laboratories and because of large differences between individuals. In addition, feature extraction and the subsequent HMM training on an overnight recording take several tens of minutes on an ordinary PC in the Matlab environment. However, we have not pursued any optimization to reduce the computational burden.

Regarding the sleep scoring results, we have already noted that although N = 6 HMM states were defined, this number was reduced to N = 4 by the end of the learning procedure and, after merging the first two states, to N = 3. These findings agree with [21], where the authors argued that human sleep is a mixture of three different processes: wakefulness, deep sleep and REM sleep.

Considering the hand movement analysis, we first determined the most descriptive electrode and then performed the detection. We found that the most information-bearing channel is F2. Regarding the detection, the difference between the detected time and the cue timing stays within one second, which is satisfactory in terms of potential use in the BCI domain. However, we could not perform a quantitative analysis (e.g. of the method's performance) because only a small set of recordings was available during the evaluation. An interesting question, which we leave for future work, is the recognition of hand movement direction, i.e. left versus right movement.

ACKNOWLEDGEMENTS This work has been supported by the research program MSM 210000012 'Transdisciplinary Biomedical Engineering Research' sponsored by the Ministry of Education, Youth and Sports of the Czech Republic. We gratefully acknowledge Martin Brunovsky, Prague Psychiatric Center, 3rd Medical Faculty of Charles University (3rd MFCHU), for providing the sleep EEG data, and Andrej Stančák, Institute of Physiology, 3rd MFCHU, for providing the cue movement EEG data.

REFERENCES
[1] T. Penzel and R. Condrat, “Computer based sleep recording and analysis,” Sleep Medicine Reviews, vol. 4, no. 2, pp. 131–148, 2000.
[2] A. Rechtschaffen and A. Kales, A Manual of Standardized Terminology, Techniques and Scoring System for Sleep Stages of Human Subjects, University of California, Los Angeles, CA: BIS/BRI, 1968.
[3] S. Himanen and J. Hasan, “Limitations of Rechtschaffen and Kales,” Sleep Medical Reviews, vol. 4, pp. 149–167, 2000.
[4] G. Pfurtscheller, D. Flotzinger, and C. Neuper, “Differentiation between finger, toe and tongue in man based on 40 Hz EEG,” Electroencephalography and Clinical Neurophysiology, vol. 90, pp. 456–460, 1994.
[5] A. Stančák, B. Freige, C. H. Lücking, and R. Kriteva-Frige, “Oscillatory cortical activity and movement-related potentials in proximal and distal movements,” Clinical Neurophysiology, vol. 111, pp. 636–650, 2000.
[6] P. Sykacek, S. Roberts, and M. Stokes, “Adaptive BCI based on variational Bayes: an empirical evaluation,” in Albany BCI workshop, June 2002.
[7] J. Wolpaw and D. McFarland, “Multichannel EEG-based brain-computer communication,” Electroencephalography and Clinical Neurophysiology, vol. 90, pp. 444–449, 1994.
[8] B. Obermaier, “Hidden Markov models for online classification of single trial EEG data,” Pattern Recognition Letters, vol. 22, pp. 1299–1309, 2001.
[9] D. Novak and L. Lhotska, “Independent component analysis and its applications,” in Intelligent and Adaptive Systems in Medicine, EUNITE Workshop 2003, 2003.
[10] B. Hjorth, “An online transformation of EEG scalp potentials into orthogonal source derivations,” Electroencephalography and Clinical Neurophysiology, vol. 39, pp. 526–530, 1975.
[11] I. Rezek and S. Roberts, “Stochastic complexity measures for physiological signal analysis,” IEEE Transactions on Biomedical Engineering, vol. 44, no. 9, pp. 1186–1191, 1998.
[12] S. Roberts, W. Penny, and I. Rezek, “Temporal and spatial complexity measures for EEG-based brain-computer interfacing,” Medical & Biological Engineering & Computing, vol. 37, no. 1, pp. 93–99, 1998.
[13] J. Pardey, S. Roberts, and L. Tarassenko, “A review of parametric modelling techniques for EEG analysis,” Med. Eng. Phys., vol. 18, no. 1, pp. 2–11, 1996.
[14] B. Hjorth, “EEG analysis based on time domain properties,” Electroencephalography and Clinical Neurophysiology, vol. 29, pp. 206–310, 1970.
[15] M. Kubat, G. Pfurtscheller, and D. Flotzinger, “AI-based approach to automatic sleep classification,” Biological Cybernetics, no. 70, pp. 227–250, 1994.
[16] M. J. Karim and B. Jansen, “Knowledge acquisition for multi-channel electroencephalogram interpretation,” Artificial Intelligence in Medicine, vol. 4, no. 5, pp. 315–328, 1992.
[17] J. Pardey, S. Roberts, and L. Tarassenko, “A new approach to the analysis of the human sleep/wakefulness continuum,” Journal of Sleep Research, vol. 5, pp. 201–210, 1996.
[18] P. Achermann, R. Hartmann, A. Gunzinger, W. Guggenbuhl, and A. Borbely, “All-night sleep EEG and artificial stochastic control signals have similar correlation dimensions,” Electroenceph Clin Neurophysiol, vol. 90, pp. 384–387, 1994.
[19] R. Rabiner, “A tutorial on hidden Markov models and selected applications in speech recognition,” Proceedings of the IEEE, vol. 77, 1989.
[20] A. Dempster, N. Laird, and D. Rubin, “Maximum-likelihood from incomplete data via the EM algorithm,” Journal of Royal Statistics, vol. 39, pp. 1–38, 1977.
[21] A. Flexer, P. Sykacek, I. Rezek, and G. Dorffner, “An automatic, continuous and probabilistic sleep stager based on a hidden Markov model,” Applied Artificial Intelligence, vol. 16, no. 3, pp. 199–207, 2002.