Advances in data stream mining for mobile and ubiquitous environments

Shonali Krishnaswamy
Faculty of Information Technology
Monash University, Australia

João Gama
Laboratory of Artificial Intelligence and Decision Support
University of Porto, Portugal

Mohamed Medhat Gaber
School of Computing
University of Portsmouth, UK

ABSTRACT
The tutorial presents the state of the art in mobile and ubiquitous data stream mining and discusses open research problems, issues, and challenges in this area.

Categories and Subject Descriptors
H.2.8 [Database Management]: Database Applications – Data Mining

General Terms
Algorithms

Keywords
Data Stream Mining, Mobile Computing, Ubiquitous Computing

1. INTRODUCTION
The phenomenal growth of mobile devices, coupled with their ever-increasing computational capacity, presents an exciting new opportunity for real-time, intelligent data analysis in mobile and ubiquitous environments. Ubiquitous/mobile data mining refers to the process of performing data stream mining using mobile and/or embedded devices (e.g. sensors) to support critical applications such as mobile healthcare, intelligent transportation systems, mobile activity recognition, smart homes, and emergency/disaster management (e.g. bushfires) [1, 2]. Thus, the key focus is on developing data stream mining algorithms that are highly scalable and computationally efficient, energy-efficient, and context/resource-aware. These features enable the continued operation of data stream mining algorithms in a highly dynamic mobile/ubiquitous environment.

The typical constraints that have to be addressed in performing mobile/ubiquitous data stream mining are: (1) streaming data: data are generated and sent in real time in a stream format with little or no potential for persistent storage; (2) resource constraints: limited computational resources such as memory, processor speed, network bandwidth, battery power, and screen real estate; (3) temporal constraints: real-time information and decision-making needs that, in turn, require the analysis to be online, incremental, and continuous; (4) mobility of users and devices and the resulting connectivity issues; and (5) adaptation and context-awareness of the analysis process to varying/dynamically changing resource levels and user needs.

In the last few years, rapid strides have been made in accurately and efficiently mining high-speed data streams on mobile devices such as smartphones, and there is a growing focus on “in-network” processing using embedded devices such as sensor nodes. These techniques leverage the existing body of work on mining data streams and aim to enable the operation of these algorithms in resource-constrained environments. There is also an emerging focus on context-aware data stream mining, which aims to make the learning process aware of its operational and application constraints so that it can self-adapt to changing situations and needs. The body of work in mobile/ubiquitous data stream mining ranges from algorithms and adaptation strategies for coping with mobile/ubiquitous environments to systems/toolkits for mobile data mining and innovative new applications.

The tutorial is organized as follows. We will present the fundamental techniques for data stream analysis, such as change detection, clustering, classification, frequent patterns, and time series analysis from distributed data streams. We will present the critical factors that need to be considered in order to develop and deploy data stream mining in mobile/ubiquitous environments, including the need for adaptation and context/situation-aware reasoning. We will then present state-of-the-art algorithms for mobile data stream mining, including the emerging topic of Pocket Data Mining (PDM), which focuses on ad hoc distributed data stream mining in mobile environments. The tutorial will also present the Open Mobile Miner (OMM) toolkit [3] for rapid deployment of mobile data stream mining, together with real-world applications, case studies, and demonstrations that illustrate the real need for this growing research field. Finally, the tutorial will conclude with open issues and future directions.

2. REFERENCES

[1] Gama, J. and May, M. 2011. Ubiquitous Knowledge Discovery. Intelligent Data Analysis 15(1): 1.

[2] Gaber, M. M., Krishnaswamy, S., and Zaslavsky, A. 2005. On-board Mining of Data Streams in Sensor Networks. Book chapter in Advanced Methods of Knowledge Discovery from Complex Data, S. Bandyopadhyay, U. Maulik, L. Holder, and D. Cook (Eds.), Springer.

[3] Krishnaswamy, S., Gaber, M. M., Harbach, M., Hugues, C., Sinha, A., Gillick, B., Delir Haghighi, P., and Zaslavsky, A. 2009. Open Mobile Miner: A Toolkit for Mobile Data Stream Mining. ACM Knowledge Discovery in Databases (ACM KDD 2009), Demo Paper.

Copyright is held by the author/owner(s). CIKM’11, October 24–28, 2011, Glasgow, Scotland, UK. ACM 978-1-4503-0717-8/11/10.
