Empirical Study of Complexity Graphs for Sorting Algorithms

IJCCIT Vol.1, No.1, Dec.2013

Gaurav Kumar
Panipat Institute of Engineering and Technology, Panipat, Haryana, India

Harish Chugh
Panipat Institute of Engineering and Technology, Panipat, Haryana, India

Abstract-This study investigates the characteristics of sorting algorithms with reference to the number of comparisons made for a specific number of elements. Sorting algorithms are used by many applications to arrange elements in increasing or decreasing order, or in any other permutation. Sorting algorithms such as Quick Sort, Merge Sort, Heap Sort, Insertion Sort and Bubble Sort have different complexities depending on the number of elements to sort. The purpose of this investigation is to determine the number of comparisons and the number of swap operations, and then to plot line graphs of these counts in order to extract the coefficients of a polynomial equation. The values of a, b and c obtained are then used to draw the fitted curve. The study concludes which algorithm to use for a large number of elements. For larger arrays, the best choice is Quick Sort, which uses recursion to sort the elements and leads to faster results. The least squares method and the matrix inversion method are used to obtain the constants a, b and c for the polynomial equation of each sorting algorithm. After these values are calculated, a graph is drawn for each sorting algorithm from its polynomial equation, i.e. Y = AX² + BX + C or Y = AX lg X + BX + C.

Keywords: Bubble Sort, Heap Sort, Insertion Sort, Merge Sort, Quick Sort.

I. INTRODUCTION
Sorting is one of the most important operations that a computer performs on data. Searching is among the most heavily used operations in computing, and computers spend much of their time on it. Fast, efficient and inexpensive algorithms for sorting and ordering elements are therefore required, and the development of such algorithms for lists and arrays is a fundamental field of computing. This study investigates sorting algorithms applied to long lists and draws graphs for each of them, with the aim of identifying efficient algorithms for sorting long lists [1]. The investigation determines which of these algorithms is fastest at sorting lists of different lengths, and therefore which algorithm should be used for a given list length. Although the asymptotic analysis of the algorithms is taken into consideration, the main comparison discussed is an empirical assessment based on running each algorithm against random lists of different sizes.

II. BACKGROUND
A limitation of empirical comparison is that it is system-dependent. A more effective way of comparing algorithms is through an upper bound on their time complexity, which guarantees that an algorithm runs in at most a certain time of order O(f(n)), where n is the number of items in the list to be sorted. This type of comparison is called asymptotic analysis [2,3]. The time complexities of the algorithms studied are shown in Table I.

TABLE I
TIME COMPLEXITY OF SORTING ALGORITHMS

Algorithm        Best Case    Average Case   Worst Case
Bubble Sort      O(n)         O(n²)          O(n²)
Insertion Sort   O(n)         O(n²)          O(n²)
Quick Sort       O(n lg n)    O(n lg n)      O(n²)
Merge Sort       O(n lg n)    O(n lg n)      O(n lg n)
Heap Sort        O(n lg n)    O(n lg n)      O(n lg n)

Although Bubble Sort, Insertion Sort and Quick Sort all have a worst-case runtime of O(n²), of these three only Quick Sort has an average runtime of O(n lg n). This implies that Quick Sort will, on average, be faster than Bubble Sort and Insertion Sort if the list is sufficiently large [4,5].
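To make the gap between the two growth rates concrete, the following sketch evaluates n lg n and n² for a few list lengths. It only illustrates the asymptotic argument: the function name `growth_table` and the chosen sizes are assumptions made here, and the numbers are growth-rate values, not measured operation counts.

```python
import math


def growth_table(sizes):
    """Return one (n, n*lg n, n^2, n^2 / (n*lg n)) row per list length n."""
    rows = []
    for n in sizes:
        nlgn = n * math.log2(n)  # growth term of the O(n lg n) algorithms
        n2 = n * n               # growth term of the O(n^2) algorithms
        rows.append((n, nlgn, n2, n2 / nlgn))
    return rows


if __name__ == "__main__":
    # Hypothetical sizes; the last two are list lengths used in the study.
    for n, nlgn, n2, ratio in growth_table([1_000, 100_000, 1_500_000]):
        print(f"n={n:>9,}  n lg n={nlgn:>14,.0f}  n^2={n2:>18,}  ratio={ratio:,.0f}")
```

The ratio column grows with n, which is the sense in which the quadratic algorithms fall behind once the list is sufficiently large.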

III. METHODOLOGY USED
Each algorithm is tested on lists of 1, 3, 5, 7, 10 and 15 lakh elements (1 lakh = 100,000) using C#.net and VC++. The number of comparisons and the number of assignment/swap operations are recorded using a counter. Execution is done on Windows XP Professional SP2, with an Intel Core2 Duo 2.40 GHz processor and 1 GB of RAM. The raw results are recorded by writing them to a file and reading them back. These raw results are then tabulated, calculated and graphed using VC++ and MS-Excel. The numbers of comparisons and swap operations for each algorithm and list length are shown below in tabular and graphical form.

TABLE II
NUMBER OF COMPARISONS FOR (N LGN) BASED SORTING ALGORITHMS

Algorithm / No. of elements   1 lakh     3 lakh      5 lakh      7 lakh      10 lakh     15 lakh
Quick Sort                    2083013    7154514     12439847    18266103    25881883    41478603
Merge Sort                    1868928    6075712     10475712    15051424    21951424    33902848
Heap Sort                     5173930    16938139    29321482    42113995    61645978    95209435
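The counter-based measurement described in the methodology can be sketched as follows. This is a minimal Python illustration, not the C#.net/VC++ program used in the study; the function names and the exact points at which the counters are incremented are assumptions, so the counts it produces need not match the tabulated values exactly.

```python
import random


def bubble_sort_counts(items):
    """Bubble sort a copy of items; return (sorted_list, comparisons, swaps)."""
    a = list(items)
    comparisons = swaps = 0
    n = len(a)
    for i in range(n - 1):
        for j in range(n - 1 - i):
            comparisons += 1              # one key comparison per inner step
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
                swaps += 1                # one exchange of adjacent elements
    return a, comparisons, swaps


def insertion_sort_counts(items):
    """Insertion sort a copy of items; return (sorted_list, comparisons, shifts)."""
    a = list(items)
    comparisons = shifts = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0:
            comparisons += 1              # compare key with a[j]
            if a[j] <= key:
                break
            a[j + 1] = a[j]               # shift the larger element right
            shifts += 1
            j -= 1
        a[j + 1] = key
    return a, comparisons, shifts


if __name__ == "__main__":
    data = [random.randrange(1_000_000) for _ in range(1_000)]
    for sorter in (bubble_sort_counts, insertion_sort_counts):
        result, comparisons, writes = sorter(data)
        assert result == sorted(data)
        print(sorter.__name__, comparisons, writes)
```

Keeping the counters inside the sorting routine, rather than timing the run, gives figures that do not depend on the machine on which the program is executed.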

Figure 1. Total number of comparisons for (N lgN) sorting algorithms

TABLE III
NUMBER OF SWAP/ASSIGNMENT OPERATIONS FOR (N LGN) BASED SORTING ALGORITHMS

Algorithm / No. of elements   1 lakh     3 lakh     5 lakh     7 lakh      10 lakh     15 lakh
Quick Sort                    1026729    3609352    6335691    8665471     12252113    22301526
Merge Sort                    1668928    5475712    9475712    13651424    19951424    30902848
Heap Sort                     1674642    5496045    9523826    13687997    20048658    30986477

Figure 2. Total number of Swap/Assignment for (N lgN) based sorting algorithms

TABLE IV
NUMBER OF COMPARISONS FOR (N2) BASED SORTING ALGORITHMS

Algorithm / No. of elements   1 lakh        3 lakh         5 lakh          7 lakh          10 lakh         15 lakh
Bubble Sort                   4999950001    44999850001    124999750001    244999650001    499999500001    1124999250001
Insertion Sort                2503921057    22500033726    62489124089     122464377656    249931402775    562246099741
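The Bubble Sort comparison counts in Table IV can be checked against a closed form: each tabulated value equals n(n-1)/2 + 1 for its list length n. The sketch below performs that check. The closed form is an observation about the tabulated numbers rather than a formula stated by the study, the origin of the additional 1 is not given in the text, and the names `BUBBLE_COMPARISONS` and `closed_form` are introduced here for illustration.

```python
# Bubble Sort comparison counts as tabulated in Table IV (n -> comparisons).
BUBBLE_COMPARISONS = {
    100_000: 4_999_950_001,
    300_000: 44_999_850_001,
    500_000: 124_999_750_001,
    700_000: 244_999_650_001,
    1_000_000: 499_999_500_001,
    1_500_000: 1_124_999_250_001,
}


def closed_form(n):
    """n(n-1)/2 pairwise comparisons plus the one extra count seen in the table."""
    return n * (n - 1) // 2 + 1


if __name__ == "__main__":
    for n, measured in BUBBLE_COMPARISONS.items():
        status = "matches" if closed_form(n) == measured else "differs"
        print(f"n={n:>9,}  measured={measured:>19,}  {status}")
```

Because the count depends only on n and not on the input order, the Bubble Sort data lie exactly on a quadratic curve in n.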

Figure 3. Total number of comparisons for (N2) sorting algorithms

TABLE V
TOTAL NUMBER OF SWAP/ASSIGNMENT FOR (N2) BASED SORTING ALGORITHMS

Algorithm / No. of elements   1 lakh        3 lakh         5 lakh         7 lakh          10 lakh         15 lakh
Bubble Sort                   2503821057    22499733726    62488624089    122463677656    249930402775    562244599741
Insertion Sort                2503821057    22500033725    62489124088    122464377655    249931402774    562246099740

Figure 4. Total number of Swap/Assignment for (N2) sorting algorithms

IV. EVALUATION
The results are evaluated using least squares fitting. The calculation for each sorting algorithm is shown in tabular as well as graphical form. In these calculations, X denotes the number of elements and Y denotes the number of comparisons.

4.1 Bubble Sort

As per the methodology used, the calculations for bubble sort are given in Table VI and plotted in Figure 5.

TABLE VI
BUBBLE SORT CALCULATION USING LEAST SQUARE FITTING METHOD

X (No. of elements)   Y (No. of comparisons)   X²              X³               X⁴              YX               YX²
100000                4.9999E+9                1.0E+10         1E+15            1E+20           4.99995E+14      4.99995E+19
300000                4.4998E+10               9E+10           2.7E+16          8.1E+21         1.35E+16         4.04999E+21
500000                1.25E+11                 2.5E+11         1.25E+17         6.25E+22        6.24999E+16      3.12499E+22
700000                2.45E+11                 4.9E+11         3.43E+17         2.401E+23       1.715E+17        1.2005E+23
1000000               5E+11                    1E+12           1E+18            1E+24           5E+17            5E+23
1500000               1.125E+12                2.25E+12        3.375E+18        5.0625E+24      1.6875E+18       2.53125E+24
∑X=4100000            ∑Y=2.045E+12             ∑X²=4.09E+12    ∑X³=4.871E+18    ∑X⁴=6.373E+24   ∑YX=2.4355E+18   ∑YX²=3.18665E+24

These sums give the following system of equations:

6a1 + 4100000a2 + 4090000000000a3 = 2044997950006
4100000a1 + 4090000000000a2 + 4871000000000000000a3 = 2435497955004100000
4090000000000a1 + 4871000000000000000a2 + 6.3733E+24a3 = 3.18664756450409E+24

Calculating the values of the coefficients a1, a2 and a3 by the matrix inversion method gives:

a1 = 0.98144531250000, a2 = -0.50000003352761, a3 = 0.49999999999999

y = 0.98144531250000 - 0.50000003352761x + 0.49999999999999x²
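The least squares calculation above can be sketched in a few lines: form the sums ∑X, ∑X², ∑X³, ∑X⁴, ∑Y, ∑YX and ∑YX², assemble the three equations, and solve for a1, a2 and a3. The sketch below is an illustration under stated assumptions, not the program used in the study: it uses exact rational arithmetic from Python's standard `fractions` module and Gauss-Jordan elimination in place of an explicit matrix inverse, and the function names are hypothetical.

```python
from fractions import Fraction


def solve_linear_system(matrix, rhs):
    """Solve matrix * unknowns = rhs exactly by Gauss-Jordan elimination."""
    n = len(rhs)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Pick any row at or below the diagonal with a non-zero entry in this column.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


def fit_quadratic(xs, ys):
    """Least squares fit of y = a1 + a2*x + a3*x^2 through the normal equations."""
    def sum_x(power):
        return sum(Fraction(x) ** power for x in xs)

    def sum_yx(power):
        return sum(Fraction(y) * Fraction(x) ** power for x, y in zip(xs, ys))

    matrix = [
        [sum_x(0), sum_x(1), sum_x(2)],
        [sum_x(1), sum_x(2), sum_x(3)],
        [sum_x(2), sum_x(3), sum_x(4)],
    ]
    rhs = [sum_yx(0), sum_yx(1), sum_yx(2)]
    return solve_linear_system(matrix, rhs)


if __name__ == "__main__":
    xs = [100_000, 300_000, 500_000, 700_000, 1_000_000, 1_500_000]
    # Bubble Sort comparison counts from Table IV.
    ys = [4_999_950_001, 44_999_850_001, 124_999_750_001,
          244_999_650_001, 499_999_500_001, 1_124_999_250_001]
    a1, a2, a3 = fit_quadratic(xs, ys)
    print(float(a1), float(a2), float(a3))  # prints 1.0 -0.5 0.5
```

With exact arithmetic the Bubble Sort data give a1 = 1, a2 = -1/2 and a3 = 1/2, which agrees with the coefficients reported above to within a small difference in a1; a difference of that size would be consistent with rounding when sums of the order of 10²⁴ are handled in floating point.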

Figure 5. Bubble Sort Graph for X-Y values.

4.2 Insertion Sort

As per the methodology used, the calculations for insertion sort are given in Table VII and plotted in Figure 6.

TABLE VII
INSERTION SORT CALCULATION USING LEAST SQUARE FITTING METHOD

X (No. of elements)   Y (No. of comparisons)   X²              X³               X⁴              YX                YX²
100000                2503921057               1E+10           1E+15            1E+20           2.50392E+14       2.50392E+19
300000                2.25E+10                 9E+10           2.7E+16          8.1E+21         6.75001E+15       2.025E+21
500000                6.248E+10                2.5E+11         1.25E+17         6.25E+22        3.12446E+16       1.56223E+22
700000                1.2246E+11               4.9E+11         3.43E+17         2.401E+23       8.57251E+16       6.00075E+22
1000000               2.4993E+11               1E+12           1E+18            1E+24           2.49931E+17       2.49931E+23
1500000               5.6224E+11               2.25E+12        3.375E+18        5.0625E+24      8.43369E+17       1.26505E+24
∑X=4100000            ∑Y=1.0221E+12            ∑X²=4.09E+12    ∑X³=4.871E+18    ∑X⁴=6.373E+24   ∑YX=1.21727E+18   ∑YX²=1.59266E+24

These sums give the following system of equations:

6a1 + 4100000a2 + 4090000000000a3 = 1022139059039
4100000a1 + 4090000000000a2 + 4871000000000000000a3 = 1217274671010100000
4090000000000a1 + 4871000000000000000a2 + 6.3733E+24a3 = 1.592669E+24

Calculating the values of the coefficients a1, a2 and a3 by the matrix inversion method gives:

a1 = -9.2744141, a2 = 95.404689602554, a3 = 0.24983064989564

y = -9.2744141 + 95.404689602554x + 0.24983064989564x²

4.3 Merge Sort

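As per the methodology used, the calculations for merge sort use the values of Table II and are plotted in Figure 7. The model is y = a x lg x + b x + c. For x = 500000 it gives:

500000 lg(500000) a + 500000b + c = 10475712
9465784.2846621a + 500000b + c = 10475712

Eliminating b with the corresponding equations for x = 1000000 and x = 1500000, and then eliminating c, gives:

1000000a - c = 1000000
2377443.7510817a - 2c = 2475712
377443.75108170a = 475712

Calculating the values of the coefficients a, b and c by the matrix inversion method gives:

a = 1.2603520355992, b = -2.909018983, c = 260352.0356

y = 1.260352036x lg x - 2.909018983x + 260352.0356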
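The elimination steps for the model y = a x lg x + b x + c can be sketched as below, assuming three sample points whose x values are x1, 2·x1 and 3·x1, as is the case for the 5, 10 and 15 lakh rows of Table II. Subtracting twice and three times the first equation from the second and third removes b, which leaves two equations in a and c. The function names `fit_three_points` and `model` are hypothetical, and the code solves the three equations directly instead of inverting a matrix.

```python
import math


def model(a, b, c, x):
    """Evaluate y = a*x*lg(x) + b*x + c."""
    return a * x * math.log2(x) + b * x + c


def fit_three_points(x1, y1, y2, y3):
    """Fit y = a*x*lg(x) + b*x + c through (x1, y1), (2*x1, y2) and (3*x1, y3)."""
    f1 = x1 * math.log2(x1)
    f2 = 2 * x1 * math.log2(2 * x1)
    f3 = 3 * x1 * math.log2(3 * x1)
    # (eq2 - 2*eq1):  p*a - c   = r   (the b terms cancel)
    # (eq3 - 3*eq1):  q*a - 2*c = s   (the b terms cancel)
    p, r = f2 - 2 * f1, y2 - 2 * y1
    q, s = f3 - 3 * f1, y3 - 3 * y1
    a = (s - 2 * r) / (q - 2 * p)      # subtract twice the first reduced equation
    c = p * a - r                      # back-substitute into the first reduced equation
    b = (y1 - a * f1 - c) / x1         # back-substitute into eq1
    return a, b, c


if __name__ == "__main__":
    # Merge Sort comparison counts for 5, 10 and 15 lakh elements (Table II).
    a, b, c = fit_three_points(500_000, 10_475_712, 21_951_424, 33_902_848)
    for k, y in ((1, 10_475_712), (2, 21_951_424), (3, 33_902_848)):
        x = 500_000 * k
        print(x, y, round(model(a, b, c, x)))
```

A curve obtained this way passes through the three chosen points exactly; it is an interpolation through those points, not a least squares fit over all six list lengths.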

Figure 6. Insertion Sort Graph for X-Y values.

Figure 7. Merge Sort Graph for X-Y values.

4.4 Heap Sort

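As per the methodology used, the calculations for heap sort use the values of Table II and are plotted in Figure 8. The model is y = a x lg x + b x + c. For x = 500000 it gives:

500000 lg(500000) a + 500000b + c = 29321482
9465784.2846621a + 500000b + c = 29321482

Eliminating b with the corresponding equations for x = 1000000 and x = 1500000, and then eliminating c, gives:

1000000a - c = 3003014
2377443.7510817a - 2c = 7244989
377443.75108170a = 1238961

Calculating the values of the coefficients a, b and c by the matrix inversion method gives:

a = 3.2825049996173, b = -3.500006479, c = 279490.9996

y = 3.2825049996173x lg x - 3.500006479x + 279490.9996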

Figure 8. Heap Sort Graph for X-Y values.

4.5 Quick Sort

As per the methodology used, the calculations for quick sort use the values of Table II and are plotted in Figure 9.

500000 lg(500000) a + 500000b + c = 12439847
9465784.28466208a + 500000b + c = 12439847
19931568.5693241a + 1000000b + c = 25881883
30774796.605068a + 1500000b + c = 41478603

Calculating the values of the coefficients a, b and c by the matrix inversion method gives:

a = 5.7086227916717434, b = -12.6063574003357, c = 235321.896

Y = 5.7086227916717434X lg X - 92.606357400272373X + 4706433.79167138

Figure 9. Quick Sort Graph for X-Y values.

V. RESULT ANALYSIS
Based on the calculations in Section IV, the coefficients of the fitted equations are summarized in Table VIII. Graphs comparing the polynomial equations for the N lg N and N² algorithms are shown in Figures 10 and 11 respectively.

TABLE VIII
SUMMARY OF CALCULATIONS OF SORTING ALGORITHMS

Sorting Algorithm   A                   B                    C
Merge Sort          1.2603520355992     -2.909018983         260352.0356
Heap Sort           3.2825049996173     -3.500006479         279490.9996
Quick Sort          5.70862279167174    -12.60635740033      235321.896
Insertion Sort      -9.2744141          95.404689602554      0.24983064989564
Bubble Sort         0.98144531250000    -0.50000003352761    0.49999999999999

For Merge, Heap and Quick Sort, the columns are the coefficients of Y = A X lg X + B X + C. For Insertion and Bubble Sort, the columns hold a1, a2 and a3 from Sections 4.2 and 4.1, i.e. Y = A + B X + C X².

Figure 10. Polynomial Equation Comparison for NlgN algorithm (Y values plotted against X values from 10 to 2000 for Merge, Quick and Heap)
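The comparison of the fitted equations can also be carried out numerically. The sketch below evaluates each equation at a chosen X using the coefficients of Table VIII and orders the algorithms by the resulting Y. It assumes that the N lg N rows follow Y = A X lg X + B X + C and that the N² rows hold a1, a2 and a3 of Sections 4.1 and 4.2, i.e. Y = A + B X + C X²; the names used and the evaluation point X = 2000 (the upper end of the plotted X axis) are choices made here for illustration.

```python
import math

# Coefficients (A, B, C) from Table VIII.
NLGN_COEFFS = {  # Y = A*X*lg(X) + B*X + C
    "Merge Sort": (1.2603520355992, -2.909018983, 260352.0356),
    "Heap Sort": (3.2825049996173, -3.500006479, 279490.9996),
    "Quick Sort": (5.70862279167174, -12.60635740033, 235321.896),
}
N2_COEFFS = {  # Y = A + B*X + C*X^2
    "Insertion Sort": (-9.2744141, 95.404689602554, 0.24983064989564),
    "Bubble Sort": (0.98144531250000, -0.50000003352761, 0.49999999999999),
}


def evaluate_all(x):
    """Return {algorithm: fitted Y} at the given number of elements x."""
    values = {}
    for name, (a, b, c) in NLGN_COEFFS.items():
        values[name] = a * x * math.log2(x) + b * x + c
    for name, (a, b, c) in N2_COEFFS.items():
        values[name] = a + b * x + c * x * x
    return values


def rank(x):
    """Algorithm names ordered from smallest to largest fitted Y at x."""
    values = evaluate_all(x)
    return sorted(values, key=values.get)


if __name__ == "__main__":
    for name in rank(2000):
        print(f"{name:<15} {evaluate_all(2000)[name]:>14,.1f}")
```

Evaluating at other values of X shows how the ordering depends on the list length, since the constant term dominates the N lg N equations for small X.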

VI. CONCLUSION
The empirical data obtained from the program reveals the speed of each algorithm for very large lists. Ranked from fastest to slowest, they are:
1. Merge sort
2. Quick sort
3. Heap sort
4. Insertion sort
5. Bubble sort
There is a large difference in the time taken to sort very large lists between the fastest three and the slowest two. This is due to the efficiency that Merge Sort, Quick Sort and Heap Sort have over the others when the list to sort is sufficiently large.

Figure 11. Polynomial Equation Comparison for N2 algorithm (Y values plotted against X values from 10 to 2000 for Insertion and Bubble)

REFERENCES
1. L. M. Busse, M. H. Chehreghani, J. M. Buhmann, “The information content in sorting algorithms”, IEEE International Symposium on Information Theory Proceedings (ISIT), pp. 2746-2750, July 2012.
2. Wang Xiang, “Analysis of the Time Complexity of Quick Sort Algorithm”, International Conference on Information Management, Innovation Management and Industrial Engineering (ICIII), vol. 1, pp. 408-410, Nov. 2011.
3. Zhiyi Huang, S. Kannan, S. Khanna, “Algorithms for the Generalized Sorting Problem”, IEEE 52nd Annual Symposium on Foundations of Computer Science (FOCS), pp. 738-747, Oct. 2011.
4. T. Cormen, C. Leiserson, R. Rivest, “Introduction to Algorithms”, MIT Press and McGraw-Hill, 3rd edition, Aug. 2009.
5. Juliana Pena Ocampo, “An empirical comparison of the runtime of five sorting algorithms”, International Baccalaureate Extended Essay, 2008.
6. J. Pagter, T. Rauhe, “Optimal time-space trade-offs for sorting”, Proc. 39th Annual Symposium on Foundations of Computer Science, pp. 264-268, Nov. 1998.
7. A.V. Aho, J.E. Hopcroft, J.D. Ullman, “The Design and Analysis of Computer Algorithms”, Addison-Wesley, 1974.
8. Wikipedia, “Sorting algorithm”, retrieved on Aug. 20, 2013 from http://en.wikipedia.org/w/index.php?title=Sorting_algorithm&oldid=162949511
9. http://warp.povusers.org/SortComparison/index.html
10. Sorting Algorithms description, retrieved on Aug. 20, 2013 from http://www.cs.wm.edu/~wm/CS303/ln7.pdf