Multimedia Data Mining: A Review

P. P. Shrishrimal, R. R. Deshmukh, V. B. Waghmare
Dept of Computer Science & IT, Dr. Babasaheb Ambedkar Marathwada University, Aurangabad (MS) India

Abstract— This paper discusses one of the subdomains of data mining, namely multimedia data mining. It covers what is meant by multimedia data mining, its background, its architecture, feature extraction, and the techniques and algorithms used for mining multimedia data. The paper is intended to help readers understand the basic concepts of multimedia data mining and the commonly used techniques and algorithms.
Keywords— Speech Database, Emotion, Speech, LPC, PCA

I. INTRODUCTION

Data mining is the analysis step of the Knowledge Discovery in Databases (KDD) process. It is an interdisciplinary subfield of computer science concerned with the computational process of finding patterns in large data sets. It draws on several subdomains of computer science, such as artificial intelligence, machine learning and database systems, and also requires knowledge of statistics. The main aim of data mining is to extract useful patterns hidden in large data sets and to transform the extracted information into a meaningful form [1]. Figure 1.1 shows data mining as a step in the knowledge discovery process. The current era is known as the information era, and with recent advances in electronic imaging, video devices, storage, networking and computing power, the amount of multimedia data has grown enormously. Data mining has become a popular way of discovering new knowledge from such large data sets [2]. The field of data mining has prospered and has expanded into new areas of human life through integration with, and advances in, statistics, databases, machine learning, pattern recognition, artificial intelligence and computational capabilities. The application areas of data mining include Life Sciences (LS), Customer Relationship Management (CRM), Web Applications, Manufacturing, Competitive Intelligence, Retail/Finance/Banking, Computer/Network/Security, Monitoring/Surveillance, Teaching Support, Climate Modeling, Astronomy and Behavioral Ecology [3].

The paper is organized as follows. Section 2 describes what multimedia data mining is, and Section 3 gives its background. The general architecture of multimedia data mining is discussed in Section 4, Section 5 focuses on feature extraction, Section 6 describes the techniques and algorithms for multimedia data mining, and the conclusion is given in Section 7.

II. MULTIMEDIA DATA MINING

Advances in multimedia acquisition and storage technology have led to tremendous growth in very large and detailed multimedia databases. Analyzing this huge amount of multimedia data to discover useful knowledge is a challenging problem. This challenge has opened the opportunity for research in Multimedia Data Mining (MDM). Multimedia data mining can be defined as “the process of finding interesting patterns from media data such as audio, video, image and text that are not ordinarily accessible by basic queries and associated results” [4].

Figure 1.1: Data mining as a step in the knowledge discovery process.

In recent years, multimedia data have grown at a phenomenal rate and have become ubiquitous. As a result, not only have methods and tools to organize, manage and search such data gained widespread attention, but methods and tools to discover hidden knowledge from such data have also become extremely important. Developing such methods and tools faces the major challenge of overcoming the semantic gap of multimedia data. “The semantic gap is the lack of coincidence between the information that one can extract from the multimedia data and the interpretation that the same data have for a user in a given situation” [5]. Multimedia mining deals with the extraction of implicit knowledge, multimedia data relationships, or other patterns not explicitly stored in multimedia files. Multimedia mining is more than just an extension of data mining: it is an interdisciplinary endeavor that draws upon expertise in computer vision, multimedia processing, multimedia retrieval, data mining, machine learning, databases and artificial intelligence [6]. Figure 2.1 illustrates multimedia data mining and its various aspects [7].

Figure 2.1: Multimedia Data Mining

III. MULTIMEDIA DATA MINING BACKGROUND

Multimedia data mining is a new research area, and it requires background from both the data mining and the multimedia processing domains. The typical multimedia data mining process consists of several stages, and the overall process is inherently interactive and iterative. The main stages of the multimedia data mining process are (1) Domain understanding; (2) Data selection; (3) Data pre-processing, cleaning and transformation; (4) Discovering patterns; (5) Interpretation; and (6) Reporting and using discovered knowledge [8].

A. Domain Understanding

The domain understanding stage requires learning how the results of multimedia data mining will be used, so as to gather all relevant prior knowledge before mining. For example, while mining sports video for a particular sport like cricket, it is important to have a good knowledge and understanding of the game in order to detect interesting strokes used by players.

B. Data Selection

The data selection stage requires the user to target a database or to select a subset of fields or data records to be used for data mining. A proper understanding of the domain at this stage helps in the identification of useful data. The quality and quantity of the raw data determine the overall achievable performance.

C. Data Pre-processing, Cleaning and Transformation

The goal of the preprocessing stage is to discover important features from raw data. The preprocessing step involves integrating data from different sources and/or making choices about representing or coding certain data fields that serve as inputs to the pattern discovery stage. Such representation choices are needed because certain fields may contain data at levels of detail not considered suitable for the pattern discovery stage. This stage is of considerable importance in multimedia data mining, given the unstructured and heterogeneous nature and sheer volume of multimedia data. The preprocessing stage includes data cleaning, normalization, transformation and feature selection. Cleaning removes noise from the data. Normalization is beneficial because there is often a large difference between the maximum and minimum values of the data. Constructing a new feature may yield higher semantic value and so enable semantically more meaningful knowledge. Selecting a subset of features reduces the dimensionality and makes learning faster and more effective. Computation in this stage depends on the modalities used and the application's requirements.
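As a minimal sketch of the normalization step in the preprocessing stage, the following Python snippet rescales each feature column to the range [0, 1]. Min-max scaling is an assumption made here for illustration (the text does not name a specific normalization method), and the function name and sample feature values are hypothetical.

```python
def min_max_normalize(rows):
    """Rescale each column of a list of equal-length numeric rows to [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    normalized = []
    for row in rows:
        scaled = []
        for value, low, high in zip(row, lows, highs):
            span = high - low
            # A constant column carries no information; map it to 0.0.
            scaled.append((value - low) / span if span else 0.0)
        normalized.append(scaled)
    return normalized


# Hypothetical feature vectors: (mean brightness 0-255, duration in seconds).
features = [[10.0, 2.0], [200.0, 4.0], [105.0, 3.0]]
normalized_features = min_max_normalize(features)
```

After scaling, features with very different ranges (brightness versus duration here) contribute on a comparable scale to any distance-based mining step.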

D. Discovering Patterns

The pattern discovery stage is the heart of the entire data mining process. It is the stage where the hidden patterns, relationships and trends in the data are actually uncovered. There are several approaches to the pattern discovery stage. These include association, classification, clustering, regression, time-series analysis, and visualization. Each of these approaches can be implemented through one of several competing methodologies, such as statistical data analysis, machine learning, neural networks, fuzzy logic and pattern recognition. It is because of the use of methodologies from several disciplines that data mining is often viewed as a multidisciplinary field.

E. Interpretation

The interpretation stage of the data mining process is used to evaluate the quality of the discovery and its value, in order to determine whether the previous stages should be revisited. Proper domain understanding is crucial at this stage to put a value on the discovered patterns.

F. Reporting

The final stage of the data mining process consists of reporting and putting to use the discovered knowledge to generate new actions, products and services, or marketing strategies, as the case may be. This stage is application dependent.

IV. MULTIMEDIA DATA MINING ARCHITECTURE

Various architectures have been designed and studied to date for developing multimedia data mining systems. Three different architectures designed and studied for multimedia data mining systems are described in the paper by Manjunath et al. [2]. The architecture of a multimedia data mining system may vary depending on the type of data used: it can be a single data type or a combination of different data types, as shown earlier in Figure 2.1. Generalizing an architecture for multimedia data mining therefore becomes somewhat difficult, as the system can involve different data types.

V. FEATURE EXTRACTION FOR MULTIMEDIA DATA MINING

Because multimedia mining involves different types of data, the features extracted differ for each data type. Color, edges, shape, and texture are the common image attributes used to extract features for mining. Feature extraction based on these attributes may be performed at the global or the local level.

To apply existing data mining techniques to video data, one of the most important steps is to transform video from non-relational data into a relational data set. Video as a whole is a very large amount of data to mine, so some preprocessing is needed to get the data into a format suitable for mining. Video data is composed of spatial, temporal and, optionally, audio features. All of these features can be used for mining, depending on the application's requirements. Commonly, video is hierarchically constructed of frames (key-frames), shots (segments), scenes, clips and full-length video. Every hierarchical unit has its own features that are useful for pattern mining [9].
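The distinction between global and local image features mentioned above can be illustrated with a small sketch. It assumes a grayscale image stored as a list of pixel rows and uses an intensity histogram as a stand-in for a color feature; the function names, the bin count and the tiny sample image are all hypothetical choices made for illustration.

```python
def gray_histogram(pixels, bins=4, max_value=256):
    """Count how many pixel intensities fall into each of `bins` equal ranges."""
    counts = [0] * bins
    width = max_value / bins
    for value in pixels:
        counts[min(int(value / width), bins - 1)] += 1
    return counts


def global_feature(image, bins=4):
    """One histogram computed over every pixel of the image."""
    return gray_histogram([v for row in image for v in row], bins)


def local_features(image, block, bins=4):
    """One histogram per non-overlapping block x block tile, in row-major order."""
    features = []
    for top in range(0, len(image), block):
        for left in range(0, len(image[0]), block):
            tile = [v for row in image[top:top + block]
                    for v in row[left:left + block]]
            features.append(gray_histogram(tile, bins))
    return features


# Hypothetical 4x4 grayscale image: dark left half, bright right half.
image = [[0, 10, 250, 255]] * 4
global_hist = global_feature(image)
local_hists = local_features(image, block=2)
```

The global histogram only records that half of the pixels are dark and half are bright, while the per-tile histograms also preserve where in the image those intensities occur.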

In the case of audio, both temporal and spectral domain features have been employed. Examples of the features used include short-time energy, pause rate, zero-crossing rate, normalized harmonicity, fundamental frequency, frequency spectrum, bandwidth, spectral centroid, spectral roll-off frequency and band energy ratio [10].

In text mining, feature extraction usually means identifying the keywords that summarize the contents of the document. One way is to look for words that occur frequently in the document, since these words tend to indicate what the document is about. Among the remaining words, a good heuristic is to look for words that occur frequently in documents of the same class but rarely in documents of other classes. To cope with documents of different lengths, relative frequency is preferred over absolute frequency.

Multimodal data mining has special features beyond the traditional single-modality features of the image, audio or video modality. Image annotations can be a very useful feature for cross-modal mining of text and image. Subtitles, movie scripts and Optical Character Recognition (OCR) text labels extracted from videos can be very useful features for cross-modal mining of video and text. Speech extracted from audio is semantically very rich.

An important issue with features extracted from multimodal data is how the features should be integrated for mining. Most multimodal analysis is performed separately on each modality, and the results are brought together at a later stage to arrive at the final decision about the input data. This approach is called late fusion or decision-level fusion. Although this is a simpler approach, it loses valuable information about the multimedia events or objects present in the data because, by processing each modality separately, it discards the inherent associations between different modalities.
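Late fusion, as described above, combines per-modality decisions after each modality has been analyzed on its own. The following sketch shows it next to the alternative of concatenating all modality features into one joint vector before mining, purely for contrast. The function names, the weighted-average combination rule and the sample values are assumptions made for illustration, not a method taken from the cited work.

```python
def early_fusion(modality_features):
    """Concatenate per-modality feature vectors into one joint vector."""
    joint = []
    for name in sorted(modality_features):  # fixed order keeps dimensions stable
        joint.extend(modality_features[name])
    return joint


def late_fusion(modality_scores, weights=None):
    """Combine per-modality decision scores with a weighted average."""
    names = sorted(modality_scores)
    if weights is None:
        weights = {name: 1.0 for name in names}
    total = sum(weights[name] for name in names)
    return sum(weights[name] * modality_scores[name] for name in names) / total


# Hypothetical features and classifier scores for one video segment.
features = {"audio": [0.2, 0.7], "image": [0.9, 0.1, 0.4], "text": [1.0]}
scores = {"audio": 0.5, "image": 0.75, "text": 1.0}

joint_vector = early_fusion(features)   # one vector handed to a single miner
fused_score = late_fusion(scores)       # one decision from three separate miners
```

In the late-fusion path the individual feature values never meet, which is why associations between modalities cannot be discovered there.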
Another approach for combining features is to represent features from all modalities together as components of a high-dimensional vector for further processing. This approach is known as early fusion. Data mining through this approach is known as cross-modal analysis, because such an approach allows the discovery of semantic associations between different modalities [11].

VI. TECHNIQUES AND ALGORITHMS FOR MULTIMEDIA DATA MINING

Various algorithms and techniques are used for data mining, and the choice of algorithms and techniques is central to multimedia data mining. Many of the available techniques have been applied to multimedia data. Within the supervised framework, three data mining methods have been used: classification, association and statistical modeling. Within unsupervised learning, clustering is another data mining methodology used.

A. Classification based Multimedia Data Mining

Meaningful information can only be realized when the requested information is identified and recognized by the system. In classification approach there absolute accurate rules. The object recognition problem can be regarded as a supervised labeling problem. Yu and Wolf used a one-dimensional Hidden Markov Model for classifying images and videos [12]. Decision trees can be translated into a set of rules by creating a separate rule for each path from the root node to a leaf node in the tree [13]. Rules can also be induced directly from the training data using different rule-based algorithms. The Artificial Neural Network is another method of inductive learning, based on computational models of biological neurons and their networks. Bayesian networks, which are graphical models of the probabilistic relationships among the extracted set of features, can also be used. A newer technique, the Support Vector Machine (SVM), is built around the notion of a margin. By maximizing the margin, and thereby creating the largest possible distance between the separating hyperplane and the instances on either side of it, the SVM reduces the upper bound on the expected generalization error [14].

B. Association based Multimedia Data Mining

Association rule learning is a popular and well-researched method for discovering interesting relations between variables in large databases. Most of the association rules studied focus on corporate data, which is mostly alphanumeric. There has been relatively little research on association for multimedia data mining compared with that for alphanumeric data [15]. There are different types of associations, such as associations between image content and non-image content features. Association mining in multimedia data can be transformed into a problem of association mining in traditional transactional databases. An image can be modeled as a transaction, assigned an ImageID, and the features of the image are the items contained in the transaction. Mining the frequently occurring patterns among different images therefore becomes mining the frequent patterns in a set of transactions. Zaïane, Han and Zhu extend the concept of content-based multimedia association rules using feature localization.
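The image-as-transaction model described above can be sketched as follows. Each image is keyed by an ImageID and holds a set of feature labels as its items; the snippet then counts support by brute-force enumeration of small itemsets. This is not Apriori or any algorithm from the cited papers, and the function name, feature labels and support threshold are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return {itemset: support count} for itemsets in >= min_support transactions."""
    counts = {}
    for items in transactions.values():
        for size in range(1, max_size + 1):
            # Sorting gives every itemset one canonical tuple form.
            for itemset in combinations(sorted(items), size):
                counts[itemset] = counts.get(itemset, 0) + 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


# Hypothetical image database: ImageID -> set of extracted feature labels.
images = {
    "img1": {"blue_sky", "green_texture", "round_shape"},
    "img2": {"blue_sky", "green_texture"},
    "img3": {"blue_sky", "red_region"},
}
frequent = frequent_itemsets(images, min_support=2)
```

Here the pair ("blue_sky", "green_texture") appears in two of the three images, so it survives the support threshold and could seed a rule such as "images with blue sky tend to contain green texture".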
Zaïane, Han and Zhu also introduced the concept of progressive refinement in the discovery of patterns in images [16]. More recent work in this area is due to Lei Wang et al. [17], who introduced a clustering method based on unsupervised neural nets and self-organizing maps [18].

C. Statistical Modeling

Statisticians were the first to use the term “data mining.” Originally, it was a derogatory term referring to attempts to extract information that was not supported by the data. Nowadays, statisticians view data mining as the construction of a statistical model, that is, an underlying distribution from which the visible data is drawn. A statistician might decide that the data comes from a Gaussian distribution and use a formula to compute the most likely parameters of this Gaussian distribution. The mean and standard deviation completely characterize the Gaussian distribution and would become the model of the data. Statistical mining models are used to determine the statistical validity of test parameters, and can be used to test hypotheses, undertake correlation studies, and transform and prepare data for further analysis. Pattern matching is used to find hidden characteristics within data, and the methods used to find patterns within the data include association rules [19].

D. Clustering

Clustering is the process of grouping physical or abstract objects into classes of similar objects. It is an unsupervised learning method commonly used in data mining. It is the main task of exploratory data mining and a common technique for data analysis, used in areas including information retrieval. Clusters are created on the basis of similarity, and the similarity measure can be computed for various types of data. Clustering algorithms can be categorized into partitioning methods, hierarchical methods, density-based methods, grid-based methods and model-based methods, the k-means algorithm and graph-based models [20]. Partitioning methods construct various partitions and then evaluate them using some criterion. The two major subcategories are the centroid and the medoid algorithms. Hierarchical methods create a hierarchical decomposition of the set of data or objects using some criterion, grouping the data instances into a tree of clusters. Density-based methods are based on connectivity and density functions. Grid-based methods are based on a multiple-level granularity structure. In model-based methods, a model is hypothesized for each of the clusters and the idea is to find the best fit of the data to that model. Recent trends also include unsupervised neural networks such as self-organizing maps.

VII. REFERENCES

[1] Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, 2nd edition, Morgan Kaufmann Publishers.
[2] Manjunath T. N., Ravindra S. Hegadi, Ravikumar G. K., “A Survey on Multimedia Data Mining and Its Relevance Today”, IJCSNS International Journal of Computer Science and Network Security, Vol. 10, No. 11, pp. 165-170, November 2010.
[3] Venkatadri M., Lokanatha C. Reddy, “A Review on Data mining from Past to the Future”, International Journal of Computer Application (IJCA), Vol. 15, No. 7, pp. 19-22, February 2011.
[4] Chitra Wasnik, “Tools, Techniques and Models for Multimedia Database Mining”, International Journal of Networking & Parallel Computing, Vol. 1, No. 2, pp. 1-5, November 2012.
[5] Arnold W. M. Smeulders, Marcel Worring, Simone Santini, Amarnath Gupta, and Ramesh Jain, “Content-based image retrieval at the end of the early years”, IEEE Transactions on Pattern Analysis and Machine Intelligence, 22:1349–1380, 2000.
[6] Chitra Wasnik, “Tools, Techniques and Models for Multimedia Database Mining”, International Journal of Networking & Parallel Computing, Vol. 1, No. 2, pp. 1-5, November 2012.
[7] Manjunath T. N., Ravindra S. Hegadi, Ravikumar G. K., “A Survey on Multimedia Data Mining and Its Relevance Today”, IJCSNS International Journal of Computer Science and Network Security, Vol. 10, No. 11, pp. 165-170, November 2010.
[8] Nilesh Patel and Ishwar Sethi, “Multimedia data mining: An overview”, Multimedia Data Mining and Knowledge Discovery, Springer, 2007.
[9] Bhatt C. A., “Probabilistic Temporal Multimedia Data Mining”, Ph.D. Thesis, National University of Singapore, 2012.
[10] Jadhav S. R., Kumbargoudar P., “Multimedia Data Mining in Digital Libraries: Standards and Features”, IEEE International Conference on Advances in Computer Vision and Information Technology ACVIT-07, Dr. Babasaheb Ambedkar Marathwada University, Aurangabad (MS) India, 2007.
[11] Li Dongge, Dimitrova Nevenka, Li Mingkun, and Sethi Ishwar K., “Multimedia content processing through cross-modal association”, in ACM International Conference on Multimedia, pp. 604–611, 2003.