Senior Health Monitoring Using Kinect - IEEE Xplore

Senior Health Monitoring Using Kinect

Monish Parajuli, Dat Tran, Wanli Ma and Dharmendra Sharma
Faculty of Information Sciences & Engineering, University of Canberra, Australia
[email protected]

Abstract: This paper presents a new senior health monitoring system that uses the Kinect device to monitor elderly people and detect when they are likely to fall by measuring their gait and analyzing the change in posture when they move from sitting to standing or vice versa. A support vector machine is used to analyze the gait and posture data obtained from the Kinect device. Several experiments were performed to evaluate the proposed system, and the experimental results as well as research experience on using the Kinect device are presented.

Keywords: senior health monitoring; Kinect; support vector machine; gait recognition.

I. INTRODUCTION

Falls are a major cause of death in the elderly. It is not the fall itself but the complications caused by the fall that lead to death [1]. Falls account for over 80% of all injury-related hospital admissions of people over 65 years. Falls are also the leading cause of unintentional injury death in these individuals and are responsible for appreciable morbidity [2]. According to statistics, falls account for 70% of accidental deaths in persons over 75 [3]. It is therefore imperative that falls be prevented. The elderly may fall for a variety of reasons, such as muscle weakness, impaired activities of daily living (ADL), depression and cognitive impairment [4]. Research has shown that gait is an indicator of whether a person is likely to fall [5, 6].

Currently, gait and falls in the elderly are monitored using technologies such as video recording and/or manual observation [7], reflective markers and special cameras [7], electrical sensors on muscles [7], visual sensor networks [8], combinations of detectors based on surroundings and worn devices [3], combinations of shoe-embedded devices and a computing system [9], wireless sensor networks [10, 11], accelerometers and mobile phone technology [12], wearable sensor packages [13], and smart inactivity monitors using array-based detectors [14]. Most of these methods depend on the seniors' willingness to wear fall-monitoring devices or extensive sensors, and the sensors require significant infrastructure and specialized devices. In contrast, the recently released Kinect device can track motion without requiring the person to wear sensors [15]. The relative popularity of the Kinect device [16], its low cost [17] and the free SDK for developing noncommercial applications [15] motivated us to explore its use in elderly health monitoring.

978-1-4673-2493-9/12/$31.00 ©2012 IEEE

In this paper, we propose a senior health monitoring system that uses the Kinect device to explore possibilities in healthcare research, such as monitoring a patient to detect when he/she is likely to fall by measuring his/her gait and analyzing the change in posture when the person moves from sitting to standing or vice versa. A support vector machine, a well-known pattern recognition method, is used to analyze the gait and posture data obtained from the Kinect device. We perform several experiments to evaluate the proposed system and present our experimental results as well as our research experience on using the Kinect device.

II. THE KINECT DEVICE AND DATA COLLECTION

The Kinect device obtains 3D scene information from its 3D depth sensor. This sensor consists of an infrared laser projector combined with a monochrome CMOS sensor, and it captures video data in 3D under any ambient light conditions. The device also has an RGB camera and a multi-array microphone for speech recognition.

Figure 1. The Kinect device

The skeletal tracking tool provided with the Kinect SDK was modified to collect the joint data. The Kinect device records the joints as points relative to the device itself; joints that are obstructed and cannot be resolved are inferred from the posture of the person being tracked. The data was collected in frames. Each frame represents a posture of the person being tracked and consists of twenty joints, namely: Hip Center, Spine, Shoulder Center, Head, Left Shoulder, Left Elbow, Left Wrist, Left Hand, Right Shoulder, Right Elbow, Right Wrist, Right Hand, Left Hip, Left Knee, Left Ankle, Left Foot, Right Hip, Right Knee, Right Ankle and Right Foot. The coordinates of all joints in a frame were concatenated to form a single feature vector in a higher-dimensional space.
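The concatenation step can be illustrated with a minimal Python sketch. It is not the authors' code: the dictionary keys in `JOINT_NAMES`, the function name `frame_to_feature_vector` and the joint ordering are assumptions made for illustration, and the sketch assumes all three coordinates of each joint are kept, which gives a 60-dimensional vector for twenty joints.

```python
from typing import Dict, List, Tuple

# Hypothetical joint keys, in the order the joints are listed in the text.
JOINT_NAMES: List[str] = [
    "HipCenter", "Spine", "ShoulderCenter", "Head",
    "ShoulderLeft", "ElbowLeft", "WristLeft", "HandLeft",
    "ShoulderRight", "ElbowRight", "WristRight", "HandRight",
    "HipLeft", "KneeLeft", "AnkleLeft", "FootLeft",
    "HipRight", "KneeRight", "AnkleRight", "FootRight",
]

Point3D = Tuple[float, float, float]


def frame_to_feature_vector(frame: Dict[str, Point3D]) -> List[float]:
    """Concatenate the (x, y, z) coordinates of every joint in a fixed order."""
    missing = [name for name in JOINT_NAMES if name not in frame]
    if missing:
        raise ValueError(f"frame is missing joints: {missing}")
    vector: List[float] = []
    for name in JOINT_NAMES:
        vector.extend(frame[name])
    return vector


# Synthetic frame used only to show the shape of the result.
example_frame = {name: (0.1 * i, 0.2 * i, 2.0) for i, name in enumerate(JOINT_NAMES)}
features = frame_to_feature_vector(example_frame)
print(len(features))  # 20 joints x 3 coordinates
```

Keeping the joint order fixed matters because a classifier compares feature vectors dimension by dimension, so the same position in every vector must always refer to the same joint coordinate.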


Our research focuses on distinguishing normal and abnormal gaits. The following two actions were considered in data collection and system evaluation:

1) Walking left to right and vice versa: This action was considered in order to show whether data captured using Kinect can be used to find abnormalities while performing the same action, i.e. walking. Each data set consisted of a sequence of frames captured while walking either normally or abnormally.

2) Standing up and sitting down: This action was considered in order to show whether data captured using Kinect can be used to distinguish between different postures. Each data set consisted of a sequence of frames captured while either standing up or sitting down.

The Kinect device was positioned at a height of 1 meter from the ground. For the walking data, the researcher walked in a straight line for 4 meters, and the perpendicular distance between the sensor and the line was 2 meters. For the sitting/standing data, a chair was positioned 2 meters in front of the sensor, which is the recommended distance.

III. DATA PROCESSING

Real-world data are usually incomplete, noisy and inconsistent. Preprocessing is required to fill in incomplete data, smooth out noisy data and remove inconsistent data [18]. The data preprocessing methods applied in our research include:

1) Data transformation: The collected data was distorted by the viewing angle. Objects closer to the sensor appeared larger than objects farther away. First, the researcher's height and shoulder width were measured using a measuring tape, and then either of the following approaches was used to adjust the joint positions:

• Adjustment with respect to height and shoulder width: The height was resolved by calculating the perpendicular distance from the head coordinates to the foot position. The shoulder width was calculated as the distance between the left shoulder and the right shoulder. A scale factor was then calculated in each case by dividing the measured value by the calculated value. Finally, all the joint position data was multiplied by the scale factor.

• Adjustment with respect to the calculated height and shoulder width: The height was calculated by adding the distances from head to shoulder center, shoulder center to spine, spine to hip center, hip center to left hip, left hip to left knee and left knee to left ankle. The shoulder width was calculated by adding the distances from left shoulder to shoulder center and from shoulder center to right shoulder. The same process of applying the scale factor was then followed.

2) Data cleaning: The data was inconsistent because of inferred joint positions. As previously mentioned, some joint positions are inferred; these did not always represent realistic body joint positions (e.g. abnormal bends in the arm), so the hand joint data was ignored.

3) Data reduction: To reduce the complexity of machine learning, the scale of the data is normally reduced. The following approaches were taken in this project:

• Change in position: To reduce the range of the collected data, each joint is stored as the percentage change in position with respect to its position in the previous frame. This method has the advantage that the measurements become independent of the data's dimensions, i.e. height, width and length.

• Ignore z-coordinates: To keep the model simple, the z-coordinates of the joints were ignored, as depth data does not change much while standing up or sitting down.

• Standard gait analysis: A model-based approach is used in this project because the Kinect device tracks the skeleton of a person [19]. The joints, in the form of coordinates, are used to classify the gaits. Since we are simply trying to distinguish between normal and abnormal gait, we can use pattern recognition techniques to classify the gaits.

IV. SUPPORT VECTOR MACHINE

A Support Vector Machine (SVM) was selected for classification in our research because of its high accuracy and ability to work with high-dimensional data [20], its ability to generate non-linear as well as high-dimensional classifiers, and the availability of a simple and fast library (LibSVM) [21].

Let $\{x_i, y_i\}$, $i = 1, \ldots, l$, $y_i \in \{-1, 1\}$, $x_i \in \mathbb{R}^d$ be the training data with labels $y_i$. The SVM using the C-Support Vector Classification (C-SVC) algorithm finds the optimal hyper-plane [22]

$$f(x) = w^T \Phi(x) + b \tag{1}$$

to separate the training data by solving the following optimization problem:

$$\min \; \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{l} \xi_i \tag{2}$$

subject to

$$y_i \left[ w^T \Phi(x_i) + b \right] \ge 1 - \xi_i \quad \text{and} \quad \xi_i \ge 0, \quad i = 1, \ldots, l \tag{3}$$

The optimization problem (2) maximizes the hyper-plane margin while minimizing the cost of errors. The $\xi_i$, $i = 1, \ldots, l$, are non-negative slack variables introduced to relax the constraints of the separable data problem to the constraint (3) of the non-separable data problem. For an error to occur, the corresponding $\xi_i$ in (3) must exceed unity, so $\sum_i \xi_i$ is an upper bound on the number of training errors. Hence an extra cost $C \sum_i \xi_i$ for errors is added to the objective function (2), where $C$ is a parameter chosen by the user. The Lagrangian formulation of the primal problem is:

$$L_P = \frac{1}{2}\|w\|^2 + C \sum_i \xi_i - \sum_i \alpha_i \left\{ y_i \left( w^T \Phi(x_i) + b \right) - 1 + \xi_i \right\} - \sum_i \mu_i \xi_i \tag{4}$$

where $\alpha_i \ge 0$ and $\mu_i \ge 0$ are Lagrange multipliers. Applying the Karush-Kuhn-Tucker conditions to the primal problem yields the dual problem:

$$L_D = \sum_i \alpha_i - \frac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j \Phi(x_i)^T \Phi(x_j) \tag{5}$$

subject to:

$$0 \le \alpha_i \le C \quad \text{and} \quad \sum_i \alpha_i y_i = 0 \tag{6}$$

The solution is given by:

$$w = \sum_{i=1}^{N_S} \alpha_i y_i \Phi(x_i) \tag{7}$$

where $N_S$ is the number of support vectors. Notice that the data only appear in the training problem (4) and (5) in the form of the dot product $\Phi(x_i)^T \Phi(x_j)$, which can be replaced by any kernel $K$ with $K(x_i, x_j) = \Phi(x_i)^T \Phi(x_j)$, where $\Phi$ is a mapping of the data to some other (possibly infinite-dimensional) Euclidean space. One example is the Radial Basis Function (RBF) kernel

$$K(x_i, x_j) = e^{-\gamma \|x_i - x_j\|^2}$$

In the test phase, an SVM is used by computing the sign of

$$f(x) = \sum_{i=1}^{N_S} \alpha_i y_i \Phi(s_i)^T \Phi(x) + b = \sum_{i=1}^{N_S} \alpha_i y_i K(s_i, x) + b \tag{8}$$

where the $s_i$ are the support vectors.

V. EXPERIMENTAL RESULTS

Four data sets, 1) normal walking, 2) abnormal walking, 3) standing and 4) sitting, were collected from the Kinect device for an hour each day over a period of a month. The data sets were divided randomly into training data sets and test data sets for system evaluation. All feature vectors were scaled to the range [-1, 1] to prevent some dimensions from dominating the general performance of the classifiers.

The following SVM types were pre-tested: C-SVC, ε-SVR, ν-SVR and SVDD. The first three types gave similar results, which were higher than those of SVDD, so C-SVC was chosen; its details were presented in the previous section. Different kernel functions for C-SVC were also tried, namely linear, polynomial, radial basis function and sigmoid. When tested on the same data, the linear kernel gave the lowest accuracy, followed by the sigmoid kernel. The radial basis function (RBF) and polynomial kernels gave the same, highest accuracy; this paper presents the results of C-SVC with the RBF kernel.

Choosing the parameters of the SVM training application manually was cumbersome, so a Python script supplied with LibSVM was used to perform a coarse grid search to find a good combination of the parameters $C$ and $\gamma$, followed by a fine grid search to find the optimal combination. The chosen values were $C = 2^{1}, 2^{3}, \ldots, 2^{15}$ and $\gamma = 2^{-15}, 2^{-13}, \ldots, 2^{3}$. 10-fold cross-validation was used with every pair of values of $C$ and $\gamma$.
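The coarse grid search with 10-fold cross-validation can be sketched in plain Python as follows. This is an illustrative sketch, not the script supplied with LibSVM: the function names, the `score_fn` callback and the `toy_score` stand-in are hypothetical, and in practice the callback would train a C-SVC with an RBF kernel on the training indices and return its accuracy on the held-out indices. The grids reproduce the values stated in the paper, $C = 2^{1}, 2^{3}, \ldots, 2^{15}$ and $\gamma = 2^{-15}, 2^{-13}, \ldots, 2^{3}$.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple


def exponent_grid(first: int, last: int, step: int = 2) -> List[float]:
    """Return [2**first, 2**(first + step), ..., 2**last]."""
    return [2.0 ** e for e in range(first, last + 1, step)]


def k_fold_indices(n_samples: int, k: int = 10, seed: int = 0) -> List[List[int]]:
    """Shuffle the sample indices and deal them into k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[fold::k] for fold in range(k)]


# Hypothetical callback: train with (C, gamma) on train_idx, return accuracy on test_idx.
ScoreFn = Callable[[Sequence[int], Sequence[int], float, float], float]


def grid_search(
    n_samples: int,
    score_fn: ScoreFn,
    c_values: Sequence[float],
    gamma_values: Sequence[float],
    k: int = 10,
) -> Tuple[float, Optional[float], Optional[float]]:
    """Return (best mean accuracy, best C, best gamma) over all (C, gamma) pairs."""
    folds = k_fold_indices(n_samples, k)
    best: Tuple[float, Optional[float], Optional[float]] = (-1.0, None, None)
    for c in c_values:
        for gamma in gamma_values:
            total = 0.0
            for held_out in range(k):
                # Train on every fold except the held-out one.
                train_idx = [i for f in range(k) if f != held_out for i in folds[f]]
                total += score_fn(train_idx, folds[held_out], c, gamma)
            mean_accuracy = total / k
            if mean_accuracy > best[0]:
                best = (mean_accuracy, c, gamma)
    return best


C_VALUES = exponent_grid(1, 15)        # 2^1, 2^3, ..., 2^15
GAMMA_VALUES = exponent_grid(-15, 3)   # 2^-15, 2^-13, ..., 2^3


def toy_score(train_idx: Sequence[int], test_idx: Sequence[int], c: float, gamma: float) -> float:
    """Stand-in for SVM training and testing; peaks at C = 2**5, gamma = 2**-3."""
    return 1.0 if (c, gamma) == (2.0 ** 5, 2.0 ** -3) else 0.5


best_accuracy, best_c, best_gamma = grid_search(100, toy_score, C_VALUES, GAMMA_VALUES)
print(best_accuracy, best_c, best_gamma)
```

A fine grid search would repeat the same loop with a smaller exponent step around the best coarse pair. Searching over exponents rather than raw values covers several orders of magnitude with only 8 × 10 = 80 parameter pairs.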

Figure 2. Experimental results for 9 experiments on the 4 data sets (normal walking versus abnormal walking and sitting versus standing)

Figure 2 presents the experimental results for the following experiments:

1) Z-coordinate used, absolute height used to scale data, arm coordinates used, SVM scaling not used.
2) Z-coordinate used, absolute shoulder width used to scale data, arm coordinates used, SVM scaling not used.
3) Z-coordinate not used, both absolute height and shoulder width used to scale, arm coordinates used, SVM scaling not used.
4) Z-coordinate used, both height and shoulder width used to scale, arm coordinates used, SVM scaling not used.
5) Z-coordinate not used, both height and shoulder width used to scale, arm coordinates used, SVM scaling used.
6) Z-coordinate used, both height and shoulder width used to scale, arm coordinates used, SVM scaling used.
7) Z-coordinate not used, change in position percent used, both height and shoulder width used to scale, arm coordinates used, SVM scaling used.
8) Z-coordinate not used, change in position not used, both shoulder width and height used to scale, arm coordinates not used, SVM scaling used.
9) Z-coordinate not used, both shoulder width and height data used to scale, hands not used.

As seen from the chart, the accuracy of the C-SVC clearly increases from experiment 4 to experiment 5. The only difference in learning between those two is that SVM scaling was applied from experiment 5 onwards, so SVM scaling of the data is critical for accuracy. After experiment 7, the accuracy of the SVM decreases slightly when arm coordinates are not used. This is probably due to the loss of information from the arm coordinate data. No difference was observed when the relative change in position, in percent, was used instead of coordinates. Both posture recognition (sitting versus standing) and gait recognition (normal walking versus abnormal walking) show the same pattern and accuracy.

VI. CONCLUSION

We have presented a senior health monitoring system that uses the Kinect device to collect data and performs posture recognition (sitting versus standing) and gait recognition (normal walking versus abnormal walking). The data sets that were scaled by height and shoulder width, included arm coordinates and used SVM scaling achieved the highest recognition results. The Kinect device shows potential for use in senior health monitoring because of its versatility, size and cost. Because falls are a major cause of death in the elderly and account for over 80% of all injury-related hospital admissions of people over 65 years, our proposed monitoring system can make a significant contribution to health services.

REFERENCES

[1] L. Menon and G. Menon, “Clinical: Falls in the elderly”, GP: General Practitioner, retrieved from http://ezproxy.canberra.edu.au/login?url=http://search.ebscohost.com/login.aspx?direct=true&db=hch&AN=67609963&site=ehost-live, 2011.
[2] P. Kannus, K. M. Khan and S. R. Lord, “Preventing falls among elderly people in the hospital environment”, The Medical Journal of Australia, vol. 184, no. 8, pp. 379-382, 2006.
[3] G. Andonegi, “HEBE: Detection of falls and monitoring of the elderly”, retrieved from http://www.eurekalert.org/pub_releases/2006-04/efhdo042406.php, 2011.
[4] H. Umegaki, Y. Suzuki, M. Yanagawa, Z. Nonogaki and H. Endo, “Falls in elderly at high risk of requiring care”, Geriatrics & Gerontology International, vol. 12, no. 1, pp. 147-148, 2012.
[5] F. E. Pollo, “Gait analysis: Techniques and recognition of abnormal gait”, retrieved from http://anatomy.org/Files2/Gait%20Analysis%20Techniques%20and%20Abnormal%20Patterns_AAA.ppt, 2011.
[6] Y. Lajoie and S. P. Gallagher, “Predicting falls within the elderly community: Comparison of postural sway, reaction time, the Berg balance scale and the activities-specific balance confidence (ABC) scale for comparing fallers and non-fallers”, Archives of Gerontology and Geriatrics, vol. 38, no. 1, pp. 11-26, 2004.
[7] H. M. Clayton and H. C. Schamhard, “Measurement techniques for Gait Analysis”, retrieved from http://www.elsevierhealth.co.uk/media/us/samplechapters/9780702024832/9780702024832.pdf, 2011.
[8] Ö. D. Incel and H. Alemdar, “Pervasive visual sensor networks for elderly care”, in Proceedings of the 5th International Conference on Intelligent Sensors, Sensor Networks and Information Processing (ISSNIP), 2009.
[9] J. Bae, K. Kong, N. Byl and M. Tomizuka, “A mobile gait monitoring system for gait analysis”, in Proceedings of the IEEE International Conference on Rehabilitation Robotics, pp. 73-79, 2009.
[10] A. Bagnasco, A. M. Scapolla and V. Spasova, “Design, implementation and experimental evaluation of a wireless fall detector”, in Proceedings of the 4th International Symposium on Applied Sciences in Biomedical and Communication Technologies, 2011.
[11] R. Paoli, F. J. Fernández-Luque, G. Doménech, F. Martínez, J. Zapata and R. Ruiz, “A system for ubiquitous fall monitoring at home via a wireless sensor network and a wearable mote”, Expert Systems with Applications, vol. 39, no. 5, pp. 5566-5575, 2012.

[12] R. Y. W. Lee and A. J. Carlisle, “Detection of falls using accelerometers and mobile phone technology”, Age & Ageing, vol. 40, no. 6, pp. 690-696, 2011.
[13] S. J. Morris and J. A. Paradiso, “A compact wearable sensor package for clinical gait monitoring”, Offspring, vol. 1, no. 1, pp. 7-15, 2003.
[14] A. Sixsmith and N. Johnson, “A smart sensor to detect the falls of the elderly”, Pervasive Computing, IEEE, vol. 3, no. 2, pp. 42-47, 2004.
[15] Kinect for Windows SDK beta. Retrieved 12/08, 2011, from http://research.microsoft.com/en-us/collaboration/focus/nui/kinectwindows.aspx
[16] Guinness World Records, Fastest-selling gaming peripheral. Retrieved 12/14, 2011, from http://www.guinnessworldrecords.com/records9000/fastest-selling-gaming-peripheral/
[17] M. Fisher, Kinect. http://graphics.stanford.edu/~mdfisher/Kinect.html
[18] J. Han and M. Kamber, in Stephan A. (Ed.), Data Mining: Concepts and Techniques (Second Edition). Morgan Kaufmann Publishers, 2011.
[19] J. Shotton, A. Fitzgibbon, M. Cook, T. Sharp, M. Finocchio and R. Moore, “Real-time human pose recognition in parts from a single depth image”, retrieved from http://research.microsoft.com/apps/pubs/?id=145347
[20] A. Ben-Hur and J. Weston, “A user's guide to support vector machines”, retrieved 12/09, 2011, from http://pyml.sourceforge.net/doc/howto.pdf
[21] C.-C. Chang and C.-J. Lin, “LIBSVM: A library for support vector machines”, ACM Trans. Intell. Syst. Technol., vol. 2, no. 3, pp. 1-27, 2011.
[22] C. J. C. Burges, “A tutorial on support vector machines for pattern recognition”, Knowledge Discovery and Data Mining, vol. 2, no. 2, pp. 121-167, 1998.
[23] I. H. Witten and E. Frank, Data Mining: Practical Machine Learning Tools and Techniques, 2nd edition. Morgan Kaufmann, San Francisco, 2005.
[24] C.-C. Chang and C.-J. Lin, LibSVM: a library for support vector machines, 2001. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm
[25] C. E. Colvin, J. H. Babcock, J. H. Forrest, C. M. Stuart, M. J. Tonnemacher and W.-S. Wang, “Multiple user motion capture and systems engineering”, in Proceedings of the IEEE Conf. on Systems and Information Engineering Design Symposium (SIEDS), pp. 137-140, 2011.
[26] R. B. Davis III, S. Õunpuu, D. Tyburski and J. R. Gage, “A gait analysis data collection and reduction technique”, Human Movement Science, vol. 10, no. 5, pp. 575-587, 1991.
[27] C. W. Hsu, C. C. Chang and C. J. Lin, “A practical guide to support vector classification”, 2010. Retrieved 12/18, 2011, from http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf
[28] M. J. W. She, S. Nahavandi and A. Kouzani, “A review of vision-based gait recognition methods for human identification”, in Proceedings of the International Conference on Digital Image Computing: Techniques and Applications (DICTA), pp. 320-327, 2010.
[29] J. K. Tan, S. Houman and S. Ishikawa, “Human motion representation using eigenspaces”, in Proceedings of the IEEE TENCON 2005, pp. 14, 2005.
[30] T. Leyvand, C. Meekhof, Yi-Chen Wei, Jian Sun and Baining Guo, “Kinect identity: Technology and experience”, Computer, vol. 44, no. 4, pp. 94-96, 2011.
[31] “Prevent falls to stop deaths”, Australian Nursing Journal, vol. 19, no. 4, pp. 15-15, 2011.
[32] M. Tang, “Recognizing hand gestures with Microsoft's Kinect”, retrieved from http://www.stanford.edu/class/ee368/Project_11/Reports/Tang_Hand_Gesture_Recognition.pdf
[33] R. Urtasun, “Human motion analysis”, 2010. Retrieved 19/12/2011, from http://ttic.uchicago.edu/~rurtasun/courses/ETH10/human_motion_analysis.html
[34] T. Wall, “Fall monitoring system add to seniors' independence”, retrieved 12/08, 2011, from http://engineering.missouri.edu/2011/04/fall-monitoring-systems-add-to-seniors%E2%80%99-independence/
