# Towards a Realistic Analysis of Some Popular Sorting Algorithms

## Abstract

We describe a general framework for the realistic analysis of sorting algorithms, and we apply it to the average-case analysis of three basic sorting algorithms (QuickSort, InsertionSort, BubbleSort). Usually the analysis deals with the mean number of key comparisons, but here we view keys as words produced by the same source, which are compared via their symbols in lexicographic order. The ‘realistic’ cost of the algorithm is now the total number of symbol comparisons performed by the algorithm, and, in this context, the average-case analysis aims to provide estimates for the mean number of symbol comparisons used by the algorithm. With respect to key comparisons, the average-case complexity of QuickSort is asymptotic to $2n \log n$, that of InsertionSort to $n^2/4$, and that of BubbleSort to $n^2/2$. With respect to symbol comparisons, we prove that their average-case complexities become $\Theta(n \log^2 n)$, $\Theta(n^2)$ and $\Theta(n^2 \log n)$, respectively. In these three cases, we describe the dominant constants, which exhibit the probabilistic behaviour of the source (namely entropy and coincidence) with respect to the algorithm.
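The distinction between the two cost measures can be illustrated with a minimal Python sketch. It is not taken from the paper: the helper names (`Counter`, `less_than`, `random_word`), the first-element pivot rule, and the choice of a memoryless binary source with fixed-length words are all assumptions made for illustration. Each call to `less_than` counts as one key comparison, and each pair of symbols it inspects counts as one symbol comparison.

```python
import random


class Counter:
    """Tallies key comparisons and symbol comparisons (hypothetical helper)."""

    def __init__(self):
        self.key = 0
        self.symbol = 0


def less_than(u, v, counter):
    """Lexicographic test u < v, counting every symbol pair inspected."""
    counter.key += 1
    for a, b in zip(u, v):
        counter.symbol += 1
        if a != b:
            return a < b
    # One word is a prefix of the other: the shorter one is smaller.
    return len(u) < len(v)


def insertion_sort(words, counter):
    """Plain insertion sort on a copy of the input."""
    out = list(words)
    for i in range(1, len(out)):
        x = out[i]
        j = i - 1
        while j >= 0 and less_than(x, out[j], counter):
            out[j + 1] = out[j]
            j -= 1
        out[j + 1] = x
    return out


def quicksort(words, counter):
    """Simple out-of-place QuickSort using the first word as pivot (an assumption)."""
    if len(words) <= 1:
        return list(words)
    pivot, rest = words[0], words[1:]
    smaller, larger = [], []
    for w in rest:
        (smaller if less_than(w, pivot, counter) else larger).append(w)
    return quicksort(smaller, counter) + [pivot] + quicksort(larger, counter)


def random_word(rng, length=32, p=0.5):
    """Word emitted by a memoryless binary source (assumed model)."""
    return "".join("1" if rng.random() < p else "0" for _ in range(length))


if __name__ == "__main__":
    rng = random.Random(0)
    words = [random_word(rng) for _ in range(200)]
    for name, sort in (("InsertionSort", insertion_sort), ("QuickSort", quicksort)):
        c = Counter()
        sort(words, c)
        print(name, "key comparisons:", c.key, "symbol comparisons:", c.symbol)
```

Running the script prints both counts for each algorithm on the same input, so the gap between the classical and the ‘realistic’ cost can be observed directly. Changing `p` in `random_word` alters the source, and with it the number of symbols that two words typically share before they differ.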

