ERPASD: A Novel Algorithm for Integrated Distributed ... - IEEE Xplore

ERPASD: A Novel Algorithm for Integrated Distributed Reliable Systems Using Data Mining Mechanisms

Arash Ghorbannia Delavar

Mehdi Zekriyapanah Gashti

Behrouz Noori Lahrood

Payame Noor University Tehran, Iran [email protected]

Payame Noor University Tehran, Iran [email protected]

Payame Noor University Tehran, Iran [email protected]

Information technology seeks to integrate a combination of methods that can provide the right solution for increasing the customer satisfaction level. Therefore, to increase dependability and reliability, the central integrated system has been replaced by integrated distributed systems. Further study of the proposed frameworks and models helped us find better solutions for creating a novel algorithm. Despite all the advantages of that algorithm, some problems remained, so a new method with a new methodology was created on top of it. With the novel proposed algorithm, these problems are solved by new techniques. The proposed algorithm uses data mining techniques such as the Apriori algorithm [4,5]. We first describe the base Apriori algorithm.

Abstract - In this paper, a novel algorithm for integrated distributed reliable systems using data mining mechanisms is proposed for utilizing and optimizing system resources, considering contention levels and customers' satisfaction level. The novel algorithm increases security and performance levels compared with other proposed methods. With the ERPASD algorithm, we propose a new technique for distributed databases that decreases the total cost of the integrated distributed system and also decreases the search time for replicated data using Apriori ERPSD. Implemented in integrated distributed systems, the algorithm eliminates or reduces replicated patterns in the database, optimizes the ERP system, and decreases the response time and waiting time for each user compared with the ERP methodology. Compared with ERPSD, ERPASD is more efficient and more dependable. Finally, ERPASD significantly decreases the service time for traversing records in databases with huge amounts of data and shows an improvement over prior algorithms.

II.

Data mining is a multi-disciplinary research field that combines the latest research results in database technology, artificial intelligence, machine learning, statistics, knowledge engineering, information retrieval, high-performance computing, data visualization technology and so on [6,7]. Association rule mining is one of the most important data mining problems. Its purpose is to discover association relationships among a set of items. The mining of association rules includes two sub-problems: (1) finding all frequent item sets that appear more often than a minimum support threshold, and (2) generating association rules using these frequent item sets. The first sub-problem plays an important role in association rule mining [8]. Once the frequent item sets have been generated from the database, strong association rules can be generated directly. The core of the algorithm is as follows: the Apriori algorithm calls two sub-processes, Apriori-gen() and subset(). The Apriori-gen() process produces candidates and then uses the Apriori property (all non-empty subsets of a frequent item set must also be frequent) to delete those candidates that have non-frequent subsets [9]. A two-step process is used to find the frequent item sets: join and prune actions [7].
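The join and prune actions can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names `apriori_gen` and `apriori`, the absolute-count form of `min_support`, and the use of "count >= min_support" as the frequency test are assumptions made for illustration. The join here is a simple set-union join rather than the prefix-based join of the classical algorithm; after pruning, both give the same candidates.

```python
from itertools import combinations


def apriori_gen(prev_frequent, k):
    """Build candidate k-itemsets from the frequent (k-1)-itemsets."""
    prev = set(prev_frequent)
    candidates = set()
    for a in prev:
        for b in prev:
            union = a | b
            if len(union) != k:
                continue  # join step: keep only unions of size k
            # Prune step (Apriori property): every (k-1)-subset must be frequent.
            if all(frozenset(s) in prev for s in combinations(union, k - 1)):
                candidates.add(union)
    return candidates


def apriori(transactions, min_support):
    """Return {itemset: count} for all itemsets with count >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)
    k = 2
    while current:
        candidates = apriori_gen(current.keys(), k)
        counts = {c: 0 for c in candidates}
        for t in transactions:  # one database scan per level
            for c in candidates:
                if c <= t:  # candidate is a subset of the transaction
                    counts[c] += 1
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent
```

Calling `apriori(transactions, min_support)` on a list of item collections returns every frequent item set with its count; the inner subset test plays the role that the text assigns to subset().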

Keywords - ERPSD; ERPASD; Data Mining; Scheduling; Dependability

I.

THE BASE IDEA OF APRIORI ALGORITHM

INTRODUCTION

For many years, programmers have believed that information technology can yield new methodologies that satisfy more customers. Information technology has been one of the important developments of the 21st century [1,2]. ERP is not a strategy but a software system that integrates all the information inside an organization and presents it to users at the proper time. ERP therefore covers all aspects of an organization, such as sales and distribution, manufacturing processes, provisioning of raw materials, human and organizational resource management, and control of structured commercial competitive processes [1,3]. As we know, uncategorized information turns into data, and proper data becomes information. This categorized information eventually becomes knowledge, which brings us closer to the edge of technology. The main point, however, is that the data should have the highest security so that it can be used safely and dependably in applications such as electronic government. The proposed ERPSD framework details every block, and using it can yield higher performance and efficiency in a general and complete information system [2, 3, 4].

978-1-4244-6928-4/10/$26.00 ©2010 IEEE


them into the new framework, which leads to updated software according to this algorithm [10].

a) The join step: To find $L_k$, a set of candidate k-item sets is generated by joining $L_{k-1}$ with itself. This set of candidates is denoted $C_k$.

A.1. CRM phase feedback:
a) Internet & Network Services: To obtain multipurpose services from the novel ERPSD algorithm for output customers, and to have a distributed and secure database, network bandwidth is one of the necessities. An increase in bandwidth can lead to the implementation of better services.
b) Virtual Services: Users' physical activities have input and output characteristics. The physical environment can be replaced with a virtual environment, in which all requests and responses are handled by the processing unit through scheduling and in a reciprocal way. This is possible when all the standard parameters required for exchanging information are available virtually.
c) Mobile Services: Mobiles are popular and common instruments that can be connected to processing resources easily and in a short time. A static IP gives users this capability by exchanging information from the original source to the destination.

b) The prune step: The members of $C_k$ may or may not be frequent, but all of the frequent k-item sets are included in $C_k$. A scan of the database determines the count of each candidate in $C_k$ [9].

III. PROPOSED PHASE OF ERPASD ALGORITHM

The Enterprise Resource Planning framework helps us to have a secure and distributed system. Here, secure means access to the ERPSD methodology in layers, which can be explained in three phases according to Figure 1.

Figure 1. Proposed phase of ERPASD. [Block labels in the figure: CRM phase feedback (Internet & Network Services, Virtual Services, Mobile Services); CRM proposed phase (Input Customers, Output Customers, Sales Management, Customer Services Management); Knowledge Discovery proposed phase (Selected Fn, Replicated Data, Data Warehouse, ERPASD); ERPSD proposed phase (Business Legacy, IT Legacy, Business Pressures, IT Strategic Review, Business Management Strategy, BPR Strategy, IT Strategy, MRP); further labels: Rule Based DB, UML, RUP, Distributed DB.]

B. ERPSD proposed phase
1) Business Legacy: This part of the system considers the attributes and specifications of an organization, such as structure, culture, and sales strategies, in a legacy manner.
2) IT Legacy: This is the available IT infrastructure, which can affect the functions and operations of an organization and can play an important role in its effectiveness. For example, software applications affect the operations of a company. It is important because of factors like cost, size and complexity.
3) Business Pressures: This part of the system deals with pressures like globalization, information technology and competition. Rapid changes in the market and changes in customers' requirements are examples of such commercial problems. Older systems and the problems mentioned above affect the implementation process. The implementation process includes four parts [1, 2]:
3.1) IT Strategic Reviews: The basis of this part of the system is strategic selection, which includes the creation, evaluation and selection of strategic options and needs a model with improved and optimized infrastructure, for example when the IT infrastructure of a company needs to be improved. Improving that part of the system considerably improves the service levels and the performance of the system.
3.2) Project Management Strategy: Strategies are considered in much detail in this part of the system. Processes like using internal or external professionals, decisions about using advisors from large corporations, or determining what expert knowledge is needed for a new project are all

A. CRM proposed phase
1) Input Customers: Input customers are users who use the resources directly through the algorithm and the proposed framework; they have independent features and the same indicators toward ERP. Input customers need a methodology and framework for faster access to information and higher satisfaction, which leads to improved dependability.
2) Output Customers: Output customers are those who reach a high satisfaction level through the new method and proposed algorithm, and whose service times are reduced.
3) Sales Management: It is one of the important entities in CRM, a fixed entity used in every designed CRM because it is the base for entering the other phases. Without this entity we cannot compare the other entities [10].
4) Customer Services Management: Usually, after secure integrated systems are updated and placed in the implemented system, they are subject to environmental conditions that affect the integrated system. To avoid these effects, we can present a novel algorithm to put


embedded in this part of the system. In project management strategy, an economic feasibility study can be an important parameter for project control.
3.3) BPR Strategy: It includes the changes in business that are involved in the project. When a software package is implemented in an organization, that organization reengineers its processes in the direction of that software. A formal study of the system regarding system resistance is also considered.
3.4) IT Strategy: It is used for implementation. For example, a company manager uses the "70/30" rule to implement software. In this case, installing and using the software satisfies 70% of the tasks, and the remaining 30% is satisfied by improvement of the external MIS.
4) MRP II Strategy: MRP II logic is very simple, but it is done by hand, which makes it very time-consuming. It is also executed with the cooperation of large software packages. MRP II is an effective technique for managing available resources, but it ignores other distributed resources because it is a central approach to managing resources. MRP II has been developed with the support of large corporations like IBM. As MRP II is costly and time-consuming because of the separate integration of the different available sub-systems, a new MRP II system has been developed by considering new facilities and sub-system integration [1,2].

1) Start
2) Select Fn (frequent itemsets of maximum cardinality)
3) Select K = minimum cardinality, N = maximum cardinality
4) Generate the proposed candidates with the function Apriori_gen
5) Generate all subsets of the proposed candidates
6) Search all priori subsets
7) Check all subsets and select the itemset with maximum count and minimum support
8) End
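The listed steps leave several details open, so the following Python sketch is only one possible reading of them and not the authors' implementation. It assumes the frequent itemsets and their counts are already available as a dictionary (for example from an Apriori pass, which covers step 4), and the names `select_best_itemset`, `frequent_counts`, `min_support` and `k_min` are hypothetical. Subsets shared by several maximal itemsets are examined only once.

```python
from itertools import combinations


def select_best_itemset(frequent_counts, min_support, k_min=1):
    """Pick the subset of the maximal frequent itemsets with the highest count.

    frequent_counts maps frozenset itemsets to their support counts.
    Returns (itemset, count), or None if nothing meets min_support.
    """
    if not frequent_counts:
        return None
    # Step 3: N is the maximum cardinality; k_min plays the role of K.
    n_max = max(len(s) for s in frequent_counts)
    # Step 2: Fn, the frequent itemsets of maximum cardinality.
    fn = [s for s in frequent_counts if len(s) == n_max]
    seen = set()
    best = None
    for itemset in fn:
        for size in range(k_min, n_max + 1):
            # Step 5: generate all subsets of the selected itemset.
            for sub in combinations(sorted(itemset), size):
                sub = frozenset(sub)
                if sub in seen:
                    continue  # replicated subset, already checked
                seen.add(sub)
                # Step 6: look up the previously computed count.
                count = frequent_counts.get(sub, 0)
                # Step 7: keep the subset with maximum count that meets min_support.
                if count >= min_support and (best is None or count > best[1]):
                    best = (sub, count)
    return best
```

Restricting the search to subsets of Fn means the database itself is not rescanned in this sketch; whether the original algorithm works the same way is not stated in the text.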



C. Knowledge Discovery proposed phase
To use the data and its existing rules from the database and the processing source, a search phase is necessary; it helps us optimize time complexity, waiting time and the scheduling of processing resources with the novel algorithm.
1) Selected Fn: The largest frequent item set, used for merging subsets to implement the novel proposed algorithm.
2) Replicated Data: Replicated data increases search time in distributed databases. To apply the new technique that omits or limits replicated data, the replicated data should first be recognized and then excluded from the search.
3) Data Warehouse: To study the data mining mechanism, a dynamic environment is needed to save all data according to the algorithm's changes.
4) ERPASD: The novel proposed algorithm helps improve data search time in the distributed integrated database through the data mining mechanism.

IV.





Figure 2. Flowchart of the novel proposed ERPASD algorithm

V. SIMULATION OF THE PROPOSED ERPASD ALGORITHM

The algorithm was simulated on minimal hardware and software, as shown in Table I. The execution time of the proposed algorithm, as a function of the number of records in the database of the Literacy Movement Organization of Iran, is reported in seconds.

ERPASD NOVEL PROPOSED ALGORITHM

The flowchart of the novel proposed ERPASD algorithm is presented in Figure 2. The proposed algorithm aims to reduce the number of database searches needed to find the best rules. It proceeds through the numbered steps from 1) Start to 8) End.


TABLE I. HARDWARE AND SOFTWARE INFORMATION USED IN SIMULATION

| No. | Hardware or Software | Information |
|---|---|---|
| 1 | Processor | Intel 1.73 GHz |
| 2 | Memory (RAM) | 1 GB |
| 3 | Operating system | Microsoft Windows XP Professional Version 2002 Service Pack 2 |
| 4 | System type | 32-bit Operating System |

After simulation, the response times of users as a function of the number of records received from the Literacy Movement Organization database in integrated systems with the ERPASD algorithm are presented in Figure 4.

The results of implementing the algorithms and examining the execution time of both Apriori and the novel ERPASD algorithm showed the positive effect of running the proposed algorithm compared with the Apriori algorithm. In the case study, the database of the Literacy Movement Organization of Iran (LMO) was surveyed precisely, from the tables with the minimum number of records (R1) to the tables with the maximum number of records (R5). The results revealed a significant difference in users' response time between the proposed ERPASD algorithm and the previous one. Results are shown in Table II.

Figure 4. Response times of ERPASD algorithm.

TABLE II. USERS' RESPONSE TIMES OF THE IMPLEMENTED ALGORITHMS ON THE LMO DATABASE

| No. | Table | Number of records | Users' response time in Apriori algorithm (s) | Users' response time in ERPASD algorithm (s) |
|---|---|---|---|---|
| 1 | R1 | 5000 | 0.017 | 0.016 |
| 2 | R2 | 50000 | 0.030 | 0.029 |
| 3 | R3 | 500000 | 0.500 | 0.093 |
| 4 | R4 | 3000000 | 1.533 | 0.789 |
| 5 | R5 | 5000000 | 2.647 | 1.321 |
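To read the rows of Table II side by side, the relative reduction in response time and the speedup can be computed directly from the tabulated values. The short Python sketch below does only this arithmetic; the helper name `reduction_percent` and the `rows` layout are illustrative choices and not part of the paper.

```python
# (table, number of records, Apriori time in s, ERPASD time in s), copied from Table II
rows = [
    ("R1", 5_000, 0.017, 0.016),
    ("R2", 50_000, 0.030, 0.029),
    ("R3", 500_000, 0.500, 0.093),
    ("R4", 3_000_000, 1.533, 0.789),
    ("R5", 5_000_000, 2.647, 1.321),
]


def reduction_percent(apriori_s, erpasd_s):
    """Relative reduction of ERPASD response time with respect to Apriori, in percent."""
    return 100.0 * (apriori_s - erpasd_s) / apriori_s


for name, records, t_apriori, t_erpasd in rows:
    reduction = reduction_percent(t_apriori, t_erpasd)
    speedup = t_apriori / t_erpasd
    print(f"{name} ({records} records): {reduction:.1f}% lower, {speedup:.2f}x speedup")
```

On these values the reduction is small for the two smallest tables (a few percent) and much larger from R3 onward.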

After simulation, the response times of users as a function of the number of records received from the Literacy Movement Organization database in integrated systems with the Apriori algorithm are presented in Figure 3.

Figure 3. Response times of Apriori algorithm.

The response times of users, calculated with independent parameters for the integrated and the distributed systems under the same conditions, were then compared; the result is presented in the charts of Figure 5.

Figure 5. Compared response times of Apriori and ERPASD algorithms.

VI. CONCLUSION

We presented the novel proposed ERPASD algorithm for secure distributed integrated systems. The combination of data mining mechanisms and this algorithm significantly decreases users' response time when working with the high-capacity database of the Literacy Movement Organization. Comparing the results of this algorithm with those of the compared one shows that, in the worst case, there is a 1% increase in performance during service time and, at best, on average a 21% improvement in service time compared with the previously simulated algorithms.

REFERENCES

[1] A. Ghorbannia Delavar, V. Aghazarian, S. Sadighi, "ERPSD: A New Model for Developing Distributed, Secure, and Dependable Organizational Softwares," IEEE CSIT 2009, Yerevan, Armenia, September 28, 2009, pp. 404-408.
[2] A. Ghorbannia Delavar, L. Saiedmoshir, "Analysis of a New Model for Developing Concurrent Organizational Behaviors with Distributed, Secure and Dependable ERPSD Technology," IEEE CSIT 2007, September 24-28, 2007, pp. 321-324.
[3] A. Ghorbannia Delavar, M. Nejadkheirallah, M. Motalleb, "A New Scheduling Algorithm for Dynamic Task and Fault Tolerant in Heterogeneous Grid Systems Using Genetic Algorithm," IEEE ICCSIT 2010, Chengdu, China.
[4] Xindong Wu, Vipin Kumar, J. Ross Quinlan, Joydeep Ghosh, Qiang Yang, Hiroshi Motoda, Geoffrey J. McLachlan, Angus Ng, Bing Liu, Philip S. Yu, Zhi-Hua Zhou, Michael Steinbach, David J. Hand, Dan Steinberg, "Top 10 algorithms in data mining," Knowl Inf Syst (2008), Springer, 14:1–37, DOI 10.1007/s10115-007-0114-2.
[5] H. Feng, Z. Shu-mao, D. Ying-shuang, "The analysis and improvement of Apriori algorithm," Journal of Communication and Computer, ISSN 1548-7709, Sep. 2008, Volume 5, No. 9 (Serial No. 46), USA.
[6] B. Wu, D. Zhang, Q. Lan, J. Zheng, "An Efficient Frequent Patterns Mining Algorithm based on Apriori Algorithm and the FP-tree Structure," Third 2008 International Conference on Convergence and Hybrid Information Technology, IEEE ICCIT 2008, November 11-13, 2008, Busan, Korea.
[7] Ch. Zhang, Zh. Li, D. Zheng, "An Improved Algorithm for Apriori," IEEE ETSC 2009, Wuhan, China, March 7-8, 2009, pp. 995-998.
[8] Q. Lan, D. Zhang, B. Wu, "A New Algorithm For Frequent Itemsets Mining Based On Apriori And FP-Tree," IEEE GCIS 2009, Xiamen, China, May 19-21, 2009, pp. 360-364.
[9] Y. Liu, "Study on Application of Apriori Algorithm in Data Mining," IEEE ICCMS 2010, Sanya, China, Jan. 22-24, 2010, pp. 111-114.
[10] A. Abdullah, S. Al-Mudimigh, B. Zahid Ullah, C. Farrukh Saleem, "A Framework of an Automated Data Mining Systems Using ERP Model," International Journal of Computer and Electrical Engineering (IJCEE 2009), Vol. 1, No. 5, December 2009, pp. 651-655.
