Sleep Stage Classification Based on Bioradiolocation ... - IEEE Xplore

Sleep Stage Classification Based on Bioradiolocation Signals

Alexander Tataraidze, Graduate Student Member, IEEE, Lesya Anishchenko, Lyudmila Korostovtseva, Bert Jan Kooij, Mikhail Bochkarev and Yurii Sviryaev

Abstract: This paper presents an algorithm for the detection of wakeful state, rapid eye movement sleep (REM) and non-REM sleep based on the analysis of respiratory movements acquired through a bioradar. We used the data from 29 subjects without sleep-related breathing disorders who underwent a polysomnography study at a sleep laboratory. A leave-one-subject-out cross-validation procedure was used for testing the classification performance. Cohen's kappa of 0.56 ± 0.16 and accuracy of 75.13 ± 9.81 % were achieved when compared to polysomnography results. The results of our work contribute to the development of home sleep monitoring systems.

I. INTRODUCTION

Figure 1. Simultaneous registration of BRL and PSG data during sleep.

Sleep disorders are very common among the general population and often go undiagnosed [1]. They increase the risk of accidents, psychiatric disorders, cardiovascular diseases and obesity [1]. Long-term monitoring of sleep at home could help to diagnose sleep disorders at an early stage, to control treatment effectiveness, to improve sleep quality and to determine the best time for waking up. One of the key issues in developing a sleep monitor is the recognition of sleep stages.

Normal sleep includes two different phases: rapid eye movement sleep (REM) and non-REM sleep (NREM), which alternate cyclically throughout the night. The gold standard for the evaluation of sleep stages is polysomnography (PSG). On the basis of PSG signals, namely the electroencephalogram, electrooculogram and electromyogram, an expert manually scores each epoch (a 30-second interval) as wakefulness (W), REM or NREM. Furthermore, NREM is categorized into three stages according to the American Academy of Sleep Medicine Scoring Rules (AASMSR) [2]. The first two NREM sleep stages are usually called Light Sleep (LS) and the third stage Deep Sleep (DS).

Certainly, PSG is unsuitable for long-term ambulatory sleep monitoring. The most common alternative method for sleep structure detection is the analysis of heart rate variability (HRV). Meanwhile, as breathing varies between sleep stages and wakefulness [1], some authors combine HRV analysis with the analysis of respiration signals [3]. Moreover, a few studies have demonstrated that sleep structure can be detected from respiratory analysis alone [4, 5].

Research is supported by Russian Foundation for Basic Research (15-07-02472 A) and the grant of President of Russian Federation (MK-889.2014.9).
A. Tataraidze, L. Anishchenko are with Bauman Moscow State Technical University, Moscow, 105005, Russian Federation (phone/fax: +7 (495) 632-22-19, e-mail: [email protected]).
L. Korostovtseva, M. Bochkarev, Y. Sviryaev are with Federal NorthWest Medical Research Centre, St. Petersburg, 197341, Russian Federation.
B.J. Kooij is with Delft University of Technology, Delft, 2628 CD, the Netherlands.

978-1-4244-9270-1/15/$31.00 ©2015 IEEE

Non-contact methods for the registration of physiological parameters provide high comfort and therefore seem to be a promising basis for sleep monitoring systems. The two most common methods for the non-contact monitoring of vital signs during sleep are Ballistocardiography (BCG) and Bioradiolocation (BRL). BRL is a method for the remote detection of biological objects and registration of their limb and organ motions using radar [6]. BRL allows breathing and movements to be registered during sleep. Although heartbeat detection by in-lab BRL has been reported [6], we could not obtain reliable heartbeat detection by BRL during sleep.

A few studies report results of human sleep stage classification based on BRL monitoring. Pallin et al. [7] and Hashizaki et al. [8] reported results of 2-stage (sleep/wakefulness) classification. Zaffaroni et al. [9] reported results of 2-stage, 3-stage (W/REM/NREM) and 4-stage (W/REM/LS/DS) classification. For commercial reasons, none of those studies provides details of the algorithms, such as the features or the classification method used. Sleep stage classification by BCG has been studied considerably more than by BRL [10-12]. A possible reason is that BCG enables detection of heartbeat during sleep. Additionally, breathing and body motion can be detected by video analysis, and Heinrich et al. [13] reported preliminary results of sleep efficiency estimation using that method. Our paper gives a comprehensive description of the algorithm for 3-stage classification based on BRL monitoring.

II. MATERIALS AND METHODS

A. Clinical Protocol

We analyzed the data of 29 subjects (Table I) who were referred for PSG to the Sleep Medicine Laboratory, North-West Federal Medical Research Centre (St. Petersburg, Russia) due to suspected sleep disorders. However, based on the PSG study (Embla N7000, Natus, USA), they did not have sleep-related breathing disorders (SBD). PSG records were scored by an experienced physician according to the AASMSR. Registration of respiratory movements by respiratory inductive plethysmography (RIP) was included in the PSG study.

TABLE I. THE DATASET CHARACTERISTICS

Total number of epochs                31260
Male:Female                           9:20
Age (years)                           45.38 ± 15.71 (22 - 67)
Body Mass Index (kg/m²)               27.31 ± 6.11 (17 - 48)
Apnea Hypopnea Index (episodes/h)     2.27 ± 1.36 (0 – 4.9)
Wakefulness (%)                       24.03 ± 12.31 (5.83 – 52.94)
REM (%)                               17.73 ± 5.61 (9.31 – 29.25)
NREM (%)                              58.24 ± 9.64 (33.92 – 73.85)
Sleep Efficiency (%)                  77.91 ± 11.90 (53.40 – 94.40)

Mean ± SD (range)

BRL monitoring (fig. 1) was performed simultaneously with PSG. We used a multi-frequency bioradar with a quadrature receiver, continuous wave signal and step frequency modulation, developed at the Remote Sensing Laboratory of Bauman Moscow State Technical University. The bioradar has 8 operating frequencies in the range from 3.6 to 4.0 GHz and a sampling rate of 50 Hz. The emitted power flux density is 1.36 μW/cm², which guarantees safety for both patients and medical staff during BRL monitoring.

B. Data preprocessing

Each BRL record consists of 16 signals (8 operating frequencies, each with I and Q quadratures), which were recorded simultaneously. Fig. 2 shows one of the BRL signals.

Figure 2. A BRL signal before preprocessing. [Axes: Time [h], Amplitude [a.u.].]

Figure 3. An interval of a BRL signal. Inspiratory peaks point up on the left side, rejected artifacts are in the center, inspiratory peaks point down on the right side. [Axes: Time [s], Amplitude [a.u.].]

Figure 4. An interval of a BRL signal. The peaks and troughs are represented by filled triangles and squares, respectively. An analyzable part of the signal (epoch) and the interval for extracting cycle-based features are represented by dotted and dashed lines, respectively. The areas between the curves and the baseline are filled in light and dark gray for inhalation and exhalation periods, respectively. [Axes: Time [s], Amplitude [a.u.].]

The best signal for analysis can change after a subject moves, because the distance between the bioradar and the subject changes. Thus, for each record we need to assemble the best parts of all signals into a single signal Sb for feature extraction. This was achieved by the following steps:

1. The BRL signals were filtered with a Butterworth low-pass filter with a cut-off frequency of 0.6 Hz, and the baseline was removed from them. The signal with the maximum mean amplitude was chosen as Sb.

2. Peak-to-peak manual synchronization of the signal Sb and the thorax RIP signal was then performed. After that, the signal Sb and the other BRL signals were truncated according to the beginning of the first scored epoch and the end of the last scored epoch. Thus, BRL signals were synchronized with the results of the manual scoring of PSG data by a physician.

3. Intervals with movement artifacts on Sb were identified and replaced with zeros. Entropy was calculated in a moving window of 5 seconds with a step of 2 seconds. Intervals whose entropy was more than three times the mean value for a subject were identified as artifacts and replaced.

4. Each inter-artifact interval (IAI) of Sb was replaced by the coincident interval with the maximum mean amplitude from one of the BRL signals.

5. If an IAI signal was inverted as a consequence of phase shifting (fig. 3), it was flipped over. Thus, all inspiratory peaks were turned up.

6. Z-normalization was performed for each IAI by subtracting the mean value of the signal and dividing by the standard deviation, because the amplitudes of IAIs might differ greatly.

7. Peaks and troughs were detected based on the search for turning points (fig. 4). Each breathing cycle was described by the peak, the left trough, the right trough, the width and the amplitude, where the width is the distance between the left and right troughs, and the amplitude is the height from the nearest trough to the peak. Breathing cycles with an amplitude or width less than half the average for a subject were removed as false.
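As a rough illustration of steps 6 and 7, the sketch below z-normalizes one inter-artifact interval, searches for turning points and describes each breathing cycle by its peak, troughs, width and amplitude. It is not the authors' implementation: the function names, the `BreathingCycle` container, the strict-inequality turning-point test and the synthetic sine signal are assumptions made for the example. Only the 50 Hz sampling rate and the rule of discarding cycles below half the average come from the text.

```python
import math
from dataclasses import dataclass

FS = 50  # sampling rate in Hz, as stated for the bioradar


@dataclass
class BreathingCycle:
    peak: int          # sample index of the peak
    left_trough: int   # sample index of the trough before the peak
    right_trough: int  # sample index of the trough after the peak
    width: float       # seconds between left and right troughs
    amplitude: float   # height from the nearest trough to the peak


def z_normalize(x):
    """Subtract the mean and divide by the standard deviation (step 6)."""
    mean = sum(x) / len(x)
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / len(x))
    return [(v - mean) / std for v in x]


def turning_points(x):
    """Return indices of strict local maxima and minima.

    Exact plateaus are ignored in this sketch (an assumption).
    """
    peaks, troughs = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] and x[i] > x[i + 1]:
            peaks.append(i)
        elif x[i - 1] > x[i] and x[i] < x[i + 1]:
            troughs.append(i)
    return peaks, troughs


def build_cycles(x, fs=FS):
    """Describe every peak that has a trough on both sides (step 7)."""
    peaks, troughs = turning_points(x)
    cycles = []
    for p in peaks:
        left = [t for t in troughs if t < p]
        right = [t for t in troughs if t > p]
        if not left or not right:
            continue  # incomplete cycle at the edge of the interval
        lt, rt = left[-1], right[0]
        nearest = lt if p - lt <= rt - p else rt
        cycles.append(
            BreathingCycle(p, lt, rt, (rt - lt) / fs, x[p] - x[nearest])
        )
    return cycles


def reject_small_cycles(cycles):
    """Drop cycles whose amplitude or width is below half the average."""
    if not cycles:
        return []
    mean_amp = sum(c.amplitude for c in cycles) / len(cycles)
    mean_width = sum(c.width for c in cycles) / len(cycles)
    return [
        c for c in cycles
        if c.amplitude >= 0.5 * mean_amp and c.width >= 0.5 * mean_width
    ]


# Synthetic 40 s "breathing" signal at 0.25 Hz (illustrative only).
signal = [math.sin(math.pi * i / 100) for i in range(2000)]
cycles = reject_small_cycles(build_cycles(z_normalize(signal)))
```

In the paper the averages are taken per subject, so in practice `reject_small_cycles` would be called on all cycles of one record rather than on a single interval.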

C. Feature extraction

A set of 23 features was extracted for each epoch. Cycle-based features were extracted from breathing cycles whose peaks were located in the analyzable part (an epoch or a window) of the signal. Thus, the interval for extracting these features might be slightly shorter or longer than the analyzable part, depending on the first breathing cycle's left trough and the last breathing cycle's right trough (fig. 4). Other features were extracted directly from the analyzable part.
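The selection rule for cycle-based features can be sketched as follows: a cycle belongs to an epoch if its peak falls inside the epoch, so the interval actually covered runs from the first selected cycle's left trough to the last one's right trough. This is a minimal sketch rather than the authors' code. The dictionary keys, the function names and the "inclusive" quartile convention used for the IQR are assumptions; the 30-second epoch length comes from the text.

```python
from statistics import median, quantiles

EPOCH_SECONDS = 30  # epoch length used for PSG scoring


def iqr(values):
    """Interquartile range; the quartile convention is an assumption."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


def epoch_cycle_features(cycles, epoch_index):
    """Median/IQR of amplitude and width for cycles peaking in one epoch.

    Each cycle is a dict with times in seconds: 'peak', 'left', 'right',
    plus an 'amplitude' value (hypothetical layout).
    """
    start = epoch_index * EPOCH_SECONDS
    end = start + EPOCH_SECONDS
    selected = sorted(
        (c for c in cycles if start <= c["peak"] < end),
        key=lambda c: c["peak"],
    )
    if len(selected) < 2:
        return None  # an IQR is not meaningful for fewer than two cycles
    amplitudes = [c["amplitude"] for c in selected]
    widths = [c["right"] - c["left"] for c in selected]
    return {
        # May be shorter or longer than the epoch itself.
        "interval": (selected[0]["left"], selected[-1]["right"]),
        "median_amplitude": median(amplitudes),
        "iqr_amplitude": iqr(amplitudes),
        "median_width": median(widths),
        "iqr_width": iqr(widths),
    }


# Illustrative cycles: five peak in epoch 0, one peaks in epoch 1.
cycles = [
    {"peak": p, "left": p - 2.0, "right": p + 2.0, "amplitude": a}
    for p, a in [(2.0, 1.0), (8.0, 2.0), (14.0, 3.0),
                 (20.0, 4.0), (26.0, 5.0), (31.0, 9.0)]
]
features = epoch_cycle_features(cycles, 0)
```

Here the interval for epoch 0 ends at the right trough of the last selected cycle, which is before the epoch boundary; a cycle peaking close to the boundary would instead extend the interval past it.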


Sample entropy [8] and spectral features were extracted from the signal Sb during an epoch. Spectral features were estimated using a Fourier transform with a Hamming window. The following frequency-domain features were extracted: the logarithm of the power in the very low frequency range (VLF) between 0.01 and 0.05 Hz, the logarithm of the power in the low frequency range (LF) between 0.05 and 0.15 Hz, the logarithm of the power in the high frequency range (HF) between 0.15 and 0.50 Hz, the ratio between LF and HF (LF/HF), and the peak frequency and its power in HF.

The following cycle-based features were extracted from the cycles related to an epoch: the median and interquartile range (IQR) of breathing cycle amplitudes, the median and IQR of breathing cycle widths, the median and IQR of peaks, the median and IQR of troughs, the median of areas between the signal and baseline during inhalation (MAI), the median of areas between the signal and baseline during exhalation (MAE), and the ratio between MAI and MAE (fig. 4).

The standard deviation of breathing frequency [6] and dynamic time and frequency warping [8] were extracted using a moving window of 5 epochs. The median of peaks divided by the IQR of peaks, and the median of troughs divided by the IQR of troughs, were extracted using a moving window of 25 epochs. Z-normalization was performed on each feature per subject in order to remove subject-to-subject variations.

D. Classification

A bagging classifier was used in the experiments. Moreover, a few simple heuristics were applied in the following order to improve classification performance:

1. The first 20 minutes were scored as wakefulness;

2. If an epoch did not belong to one of the nearest stages, it was scored as the previous epoch;

3. All REM epochs during the first 60 minutes of a record were scored as the previous epoch;

4. If the interval between REM epochs was less than 15 minutes, all epochs included in the interval were scored as REM.
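The four heuristics can be read as a post-processing pass over the per-epoch labels produced by the classifier. The sketch below is one possible reading, not the authors' code. In particular, heuristic 2 is interpreted here as "an epoch whose label differs from both its neighbours takes the label of the previous epoch", which is an assumption, as are the label codes `"W"`, `"R"`, `"N"` and the function names. The 30-second epoch length and the 20, 60 and 15 minute thresholds come from the text.

```python
EPOCH_SECONDS = 30  # one scored epoch


def epochs_in(minutes):
    """Number of whole epochs in the given number of minutes."""
    return minutes * 60 // EPOCH_SECONDS


def apply_heuristics(labels):
    """Apply the four heuristics, in order, to a list of 'W'/'R'/'N'."""
    out = list(labels)
    n = len(out)

    # 1. The first 20 minutes are scored as wakefulness.
    for i in range(min(epochs_in(20), n)):
        out[i] = "W"

    # 2. An isolated epoch (different from both neighbours) takes the
    #    label of the previous epoch (interpretation, see lead-in).
    for i in range(1, n - 1):
        if out[i] != out[i - 1] and out[i] != out[i + 1]:
            out[i] = out[i - 1]

    # 3. REM epochs in the first 60 minutes take the previous label.
    for i in range(1, min(epochs_in(60), n)):
        if out[i] == "R":
            out[i] = out[i - 1]

    # 4. Gaps shorter than 15 minutes between REM epochs become REM.
    max_gap = epochs_in(15)
    last_rem = None
    for i in range(n):
        if out[i] == "R":
            if last_rem is not None and 0 < i - last_rem - 1 < max_gap:
                for j in range(last_rem + 1, i):
                    out[j] = "R"
            last_rem = i
    return out


# Illustrative classifier output for 200 epochs (100 minutes).
raw = ["N"] * 200
raw[10] = "R"                 # inside the first 20 minutes
raw[50] = "R"                 # isolated epoch
raw[60:63] = ["R"] * 3        # REM run within the first 60 minutes
raw[100] = "W"                # isolated epoch
raw[130:135] = ["R"] * 5      # two REM runs separated by 15 epochs
raw[150:155] = ["R"] * 5
smoothed = apply_heuristics(raw)
```

Because the rules are applied in sequence and in place, their order matters: for example, rule 3 only sees REM epochs that survived rule 2.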

E. Experiments

A leave-one-subject-out cross-validation (LOSOCV) procedure was used for testing the classification performance. A training set was formed from the features of 28 subjects, and the data of the remaining subject were used as a testing set. That was repeated 29 times, changing the subjects included in the training and testing sets. For the evaluation of the classification performance, the classification accuracy, Cohen's kappa coefficient (k) and the confusion matrix were computed for the test subject on each LOSOCV iteration. Sleep stage classification is an imbalanced task because NREM takes around 75-80% of overall sleep time in a normal subject. In that situation, Cohen's kappa coefficient of inter-rater agreement, being insensitive to imbalance, is a more important metric than accuracy. Mean, standard deviation and range were calculated for the accuracy and k. The confusion matrix for the full dataset was computed as the sum of the confusion matrices for the test subjects.

III. RESULTS

An accuracy of 72.62 ± 8.91 %, 54.35 – 83.90 % (mean ± SD, range) and k of 0.49 ± 0.12, 0.25 – 0.69 were achieved with the classifier alone, while the classifier with heuristics achieved an accuracy of 75.13 ± 9.81 %, 54.10 – 88.87 % and k of 0.56 ± 0.16, 0.28 – 0.79. Table II shows the confusion matrix for the classifier with the heuristics. The comparison of our method with those reported by other authors is presented in Table III. Fig. 5 compares the hypnograms for subject #20 plotted by a physician, the classifier and the classifier with the heuristics.

TABLE II. CONFUSION MATRIX

Algorithm / Manual    Wakefulness    REM     NREM
Wakefulness           5091           572     1897
REM                   389            3359    1969
NREM                  1443           1719    14911
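As a worked example of the two metrics, the sketch below computes accuracy and Cohen's kappa from the pooled counts of Table II. The function name is an assumption. Both quantities are unchanged when the matrix is transposed, so the result does not depend on which axis holds the manual scoring. Note that pooled values need not equal the per-subject means reported in the Results.

```python
def accuracy_and_kappa(cm):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    # Observed agreement: share of epochs on the main diagonal.
    observed = sum(cm[i][i] for i in range(len(cm))) / total
    row_totals = [sum(row) for row in cm]
    col_totals = [sum(col) for col in zip(*cm)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Pooled counts from Table II (classifier with heuristics).
pooled = [
    [5091, 572, 1897],
    [389, 3359, 1969],
    [1443, 1719, 14911],
]
acc, kappa = accuracy_and_kappa(pooled)
print(f"accuracy = {acc:.3f}, kappa = {kappa:.3f}")
```

For these counts the pooled accuracy is about 0.745 and the pooled kappa about 0.55, close to the per-subject means of 75.13 % and 0.56.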

TABLE III. COMPARISON OF W/REM/NREM CLASSIFICATION PERFORMANCE

Previously published results (columns listed in table order):

First author/year: Xiao, 2013 [14]; Redmond, 2007 [3]; Long, 2014 [4]; Kortelainen, 2010 [11]; Mendez, 2010 [12]; Kurihara, 2012 [10]; Zaffaroni, 2014 [9]
Signals: HRV; HRV, RIP; RIP; BCG; BCG; BCG; BRL
Number of subjects: 45; 31; 48; 17; 17; 10; 40
Estimation method: LOSOCV; LOSOCV; 10-fold CV; LOSOCV; LOSOCV; training/test
Classifier: RF; LD; LD; kNN; HMM
Accuracy (%): 72.6 ± 6.7; 76.1 ± 5.9; 77.1 ± 7.6; 71.95 ± 7.47; 79 ± 10; 77.5 ± 6.2; 78.3
Cohen's kappa: 0.46 ± 0.09; 0.46 ± 0.10; 0.48 ± 0.17; 0.42 ± 0.10; 0.44 ± 0.19; 0.48 ± 0.07; 0.53

This paper (BRL, 29 subjects, LOSOCV):

Classifier: BAG; BAG+H
Accuracy (%): 72.6 ± 8.9; 75.13 ± 9.81
Cohen's kappa: 0.49 ± 0.12; 0.56 ± 0.16

RIP - Respiratory Inductance Plethysmography, HRV – Heart Rate Variability, BCG - Ballistocardiography. LD – Linear Discriminant, RF – Random Forest, BAG – Bagging, HMM - Hidden Markov Model, kNN - k-Nearest Neighbor, H - Heuristics. 10-fold CV – subjects are divided into 10 subsets; during each iteration data from 9 subsets are used for training and the remaining one is used for testing. Training/test – records were divided once into training and test sets.

Figure 5. Hypnograms of subject #20 (k of 0.69 and 0.79 for the classifier and classifier + heuristics, respectively). W – wakefulness, R – REM, N – NREM. [Panels: Classifier; Classifier + heuristics; Physician. Horizontal axis: Time [min].]

IV. DISCUSSION

To the best of our knowledge, this is the first paper describing an algorithm for sleep stage classification based on BRL monitoring. The results correspond to published data presenting automatic sleep stage classification based on information such as motion, breathing and heartbeat.

Some limitations of our study should be noted. The study is limited by data peculiarities. The dataset is small, some subjects were not healthy (having concomitant disease or sleep disorders other than SBD), and the male/female ratio is imbalanced, with a predominance of women. Thus, the results should be interpreted with caution.

It is important that all included subjects were free from SBD. The possibility of sleep structure detection in subjects with SBD, based on analysis of respiratory activity in general, and in particular using BRL monitoring, is a topic for further research. This topic has been poorly investigated, although Pallin et al. [7] and Hashizaki et al. [8] presented results of sleep/wakefulness classification based on BRL for subjects with SBD, and Redmond et al. [15] used a combination of respiratory and HRV analysis for sleep structure detection in subjects with SBD.

The algorithms of artifact detection and breathing cycle identification have not yet been verified by comparison with manual scoring. When implemented, such verification may increase the accuracy of cycle identification, resulting in better sleep stage classification.

Using the heuristics is a controversial strategy, as it might worsen classification results for some records. Certainly, the proposed heuristics are not optimal. Nonetheless, we think that using additional information, such as knowledge of normal human sleep structure, is particularly important for automated sleep staging based on non-direct data.

BRL allows detecting sleep structure as well as SBD. This technology is highly promising for long-term sleep monitoring at home and seems to be helpful for prevention, timely diagnostics of sleep disorders and life quality improvement.

V. CONCLUSION

The paper presents the results of sleep structure detection based on BRL monitoring. The technical characteristics of the bioradar designed at the Remote Sensing Laboratory are given. The algorithm for the 3-stage (W/REM/NREM) classification is described. Cohen's kappa of 0.56 ± 0.16 and accuracy of 75.13 ± 9.81 % were achieved when compared to polysomnography results. We conclude that BRL is a viable alternative for the development of long-term sleep monitoring systems.

REFERENCES

[1] T. Lee-Chiong, Sleep medicine: essentials and review. New York: Oxford University Press, 2008.
[2] C. Iber, S. Ancoli-Israel, A. L. Chesson, and S. F. Quan, The AASM manual for the scoring of sleep and associated events: rules, terminology and technical specification. Westchester, IL: American Academy of Sleep Medicine, 2007.
[3] S. Redmond, P. de Chazal, C. O’Brien, S. Ryan, W. T. McNicholas, and C. Heneghan, “Sleep staging using cardiorespiratory signal,” Somnologie, vol. 11, pp. 245–256, Oct. 2007.
[4] X. Long, J. Yang, T. Weysen, R. Haakma, J. Foussier, P. Fonseca, and R. M. Aarts, “Measuring dissimilarity between respiratory effort signals based on uniform scaling for sleep staging,” Physiol. Meas., vol. 35, pp. 2529–2542, Nov. 2014.
[5] X. Long, J. Foussier, P. Fonseca, R. Haakma, and R. M. Aarts, “Analyzing respiratory effort amplitude for automated sleep stage classification,” Biomed. Signal Process. Control, vol. 14, pp. 197–205, 2014.
[6] L. Anishchenko, M. Alekhin, A. Tataraidze, S. Ivashov, F. Soldovieri, and A. Bugaev, “Application of step-frequency radars in medicine,” in Proc. SPIE Symp. on Defense and Security, Radar Sensor Technol. XVIII Conf., Baltimore, 2014, pp. 90771N-1…N-7.
[7] M. Pallin, E. O’Hare, A. Zaffaroni, P. Boyle, C. Fagan, B. Kent, C. Heneghan, P. de Chazal, and W. T. McNicholas, “Comparison of a novel non-contact biomotion sensor with wrist actigraphy in estimating sleep quality in patients with obstructive sleep apnoea,” J. Sleep Research, vol. 23, pp. 475–484, 2014.
[8] M. Hashizaki, H. Nakajima, M. Tsutsumi, T. Shiga, S. Chiba, T. Yagi, Y. Ojima, A. Ikegami, M. Kawabata, and K. Kume, “Accuracy validation of sleep measurements by a contactless biomotion sensor on subjects with suspected sleep apnea,” Sleep Biol. Rhythms, vol. 12, pp. 106–115, Apr. 2014.
[9] A. Zaffaroni, L. Gahan, L. Collins, E. O’Hare, C. Heneghan, C. Garcia, I. Fietze, and T. Penzel, “Automated sleep staging classification using a non-contact biomotion sensor,” in Abstracts 22nd ESRS Congr., Tallinn, Estonia, 2014, pp. 105.
[10] Y. Kurihara, and K. Watanabe, “Sleep-stage decision algorithm by using heartbeat and body-movement signals,” IEEE Trans. Syst., Man, Cybern. A, Syst., Humans, vol. 42, pp. 1450–1459, Nov. 2012.
[11] J. M. Kortelainen, M. O. Mendez, A. M. Bianchi, M. Matteucci, and S. Cerutti, “Sleep staging based on signals acquired through bed sensor,” IEEE Trans. Inf. Technol. Biomed., vol. 14, pp. 776–785, May 2010.
[12] M. O. Mendez, M. Migliorini, J. M. Kortelainen, D. Nistico, E. Arce-Santana, S. Cerutti, and A. M. Bianchi, “Evaluation of the sleep quality based on bed sensor signals: time-variant analysis,” in Proc. 29th IEEE EMBS Annu. Int. Conf., Buenos Aires, Argentina, pp. 3994–3997, 2010.
[13] A. Heinrich, X. Aubert, and G. de Haan, “Body movement analysis during sleep based on video motion estimation,” in Proc. 15th Int. Conf. on e-Health Networking, Applications and Services, Lisbon, Portugal, pp. 539–543, 2013.
[14] M. Xiao, H. Yan, J. Song, Y. Yang, and X. Yang, “Sleep stages classification based on heart rate variability and random forest,” Biomed. Signal Process. & Control, vol. 8, pp. 624–633, 2013.
[15] S. J. Redmond, and C. Heneghan, “Cardiorespiratory-based sleep staging in subjects with obstructive sleep apnea,” IEEE Trans. Biomed. Eng., vol. 53, Mar. 2006.