Comparative Study of Sorting Algorithms to sort a set of n records using multiple algorithms and analysis of the runtime

Question

Comparative Study of Sorting Algorithms: sort a set of n records using multiple algorithms and analyze the runtime.

Deliverables:
- Source code in C++
- Input data file
- Output data file
- Report discussing each algorithm in brief, with a results section and an explanation of your code. The report should be in paper format, and any references used must be included. A flow chart should be included to help the reader understand your program.

Algorithms considered should be: Selection, Insertion, Merge, Randomized Quick Sort, and Heap Sort.

Each record consists of a person's first name (string), last name (string), birthdate (itself a struct with month (string), day (int), and year (int)), email address (string), and phone number (area code (int), number (int)). The input file can be constructed as a text file with all the data and read appropriately. Put sufficient comments in your program.

Explanation / Answer

Among non-stable algorithms, heapsort and shellsort are probably the most underappreciated, and quicksort is one of the most overhyped. Note that quicksort is a fragile algorithm that does not make particularly good use of the predictive, pipelined execution of instructions typical of modern CPUs. There are practically valuable input sequences ("nearly sorted" data sets) on which, for example, shellsort is twice as fast as quicksort (see the sketch below).
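A minimal shellsort sketch in C++, using the classic halving gap sequence (better gap sequences exist, but this shows the idea):

    #include <vector>

    // A minimal shellsort sketch with the classic gap sequence
    // (n/2, n/4, ..., 1). On nearly sorted input the gapped insertion
    // loop does almost no work, which is why shellsort shines there.
    void shellSort(std::vector<int>& a) {
        for (std::size_t gap = a.size() / 2; gap > 0; gap /= 2) {
            for (std::size_t i = gap; i < a.size(); ++i) {
                int key = a[i];
                std::size_t j = i;
                while (j >= gap && a[j - gap] > key) {
                    a[j] = a[j - gap];   // shift the gapped run right
                    j -= gap;
                }
                a[j] = key;
            }
        }
    }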

It is fragile because the choice of pivot amounts to guessing an important property of the data to be sorted, and if the guess goes wrong the performance can degrade to nearly quadratic. There is also some experimental evidence that on very large sets quicksort runs into "suboptimal partitioning" on a regular basis, so it becomes a feature, not a bug. It does not work well on already sorted or "almost sorted" data (with a couple of permutations out of place), nor on data sorted in reverse order. It also does not work well on data that contain many identical keys. These are important cases that occur frequently in real-world usage of sorting. Again, one would be surprised to find what percentage of sorting is performed on already sorted datasets, or on datasets with less than 10% of the elements out of place.

For those (and not only those) reasons you should be skeptical of the "quicksort lobby", with Robert Sedgewick (at least in the past) as its main cheerleader. Quicksort is a genuinely elegant algorithm, invented by Hoare around 1960 while he was a visiting student in Moscow, but in real life qualities other than elegance and speed on random data are more valuable. On the important practical cases of "semi-sorted" and "almost reverse-sorted" data, quicksort is far from optimal and often demonstrates dismal performance. A few experiments will show how horrible naive quicksort can be on already sorted data (simple variants exhibit quadratic behavior, a fact not mentioned in many textbooks on the subject) and how good shellsort is. And on typical corporate data sets, including log records, heapsort often beats quicksort, because quicksort's performance depends too much on the choice of pivot, and a series of bad choices increases the running time considerably. Each time the pivot element is close to the minimum (or maximum), the performance of that partitioning stage goes down the drain, and on large sets such degenerate cases are not only common, they can occur in series.
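To illustrate the usual mitigation (and the randomized variant the assignment above asks for), here is a minimal randomized-pivot quicksort sketch with a Lomuto-style partition. The random pivot makes the quadratic worst case on sorted or reverse-sorted input vanishingly unlikely; it does not cure the many-identical-keys problem, which needs a 3-way partition:

    #include <random>
    #include <utility>
    #include <vector>

    // Shared random generator for pivot selection.
    static std::mt19937 rng{std::random_device{}()};

    int partition(std::vector<int>& a, int lo, int hi) {
        std::uniform_int_distribution<int> pick(lo, hi);
        std::swap(a[pick(rng)], a[hi]);      // random pivot moved to the end
        int pivot = a[hi], i = lo;
        for (int j = lo; j < hi; ++j)
            if (a[j] < pivot) std::swap(a[i++], a[j]);
        std::swap(a[i], a[hi]);              // pivot lands in its final slot
        return i;
    }

    void quickSort(std::vector<int>& a, int lo, int hi) {
        if (lo >= hi) return;                // 0 or 1 elements: done
        int p = partition(a, lo, hi);
        quickSort(a, lo, p - 1);
        quickSort(a, p + 1, hi);
    }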

Generally, the more we know about the properties of the data to be sorted, the faster we can sort them. As already mentioned, the size of the key space is one of the most important dimensions: comparison-based algorithms that exploit a key space of k distinct values can sort any sequence in O(n log k) time, and counting-based methods in O(n + k). For example, if we are sorting a subset of a card deck we can take into account that there are only 52 possible keys in any input sequence and select an algorithm that uses the limited key space for a dramatic speed-up. In this case using a generic sorting algorithm is just a waste.
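As a sketch of the card-deck idea, here is a minimal counting sort; the 0..51 integer encoding of the cards is just an assumption for illustration. It runs in O(n + k) time with k = 52, regardless of the input order:

    #include <vector>

    // A minimal counting sort sketch for a small key space
    // (keys assumed to be integers in 0..k-1).
    std::vector<int> countingSort(const std::vector<int>& cards, int k = 52) {
        std::vector<int> count(k, 0);
        for (int c : cards) ++count[c];            // tally each key
        std::vector<int> sorted;
        sorted.reserve(cards.size());
        for (int key = 0; key < k; ++key)          // emit keys in order
            sorted.insert(sorted.end(), count[key], key);
        return sorted;
    }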

Moreover, the relative speed of the algorithms depends on the size of the data set: one algorithm can be faster than another for sorting fewer than, say, 64 or 128 items and slower on larger sequences. Simpler algorithms with minimal housekeeping tend to perform better on small sets even if they are O(n²) algorithms, especially algorithms whose compiled code contains few branch instructions. For example, insertion sort is competitive with more complex algorithms up to N = 25 or so.
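For reference, a minimal insertion sort sketch; the inner loop is a single predictable comparison plus a shift, which is why it wins on small arrays and serves as the usual small-subarray base case inside production quicksort/mergesort implementations:

    #include <vector>

    // A minimal insertion sort sketch: almost no housekeeping.
    void insertionSort(std::vector<int>& a) {
        for (std::size_t i = 1; i < a.size(); ++i) {
            int key = a[i];
            std::size_t j = i;
            while (j > 0 && a[j - 1] > key) {
                a[j] = a[j - 1];             // shift larger elements right
                --j;
            }
            a[j] = key;                      // drop key into its slot
        }
    }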

Bubble sort:

The simplest sorting algorithm is bubble sort. Bubble sort works by iterating over the array to be sorted from the first element to the last, comparing each pair of adjacent elements and swapping them if they are out of order. This process is repeated as many times as necessary, until the array is sorted. In the worst case the array is in reverse order, with the element that belongs first sitting in the last position, so as many as n − 1 passes are needed and the total number of exchanges grows as O(n²). Here is a simple example:

Given the array 23154, a bubble sort produces the following sequence of partially sorted arrays: 21354, 21345, 12345. On the first pass the 3 and 1 are compared and swapped, then the 5 and 4. On the next pass the 2 and 1 swap, and the array is in order.

The basic code for bubble sort, for sorting an integer array, can be sketched as follows (this version adds the common early-exit flag, so a pass with no swaps ends the sort):
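    #include <utility>
    #include <vector>

    // A minimal bubble sort sketch: sweep the array, swapping adjacent
    // out-of-order pairs; stop as soon as a full pass makes no swap.
    void bubbleSort(std::vector<int>& a) {
        bool swapped = true;
        for (std::size_t pass = 0; swapped && pass + 1 < a.size(); ++pass) {
            swapped = false;
            for (std::size_t i = 0; i + 1 < a.size() - pass; ++i) {
                if (a[i] > a[i + 1]) {
                    std::swap(a[i], a[i + 1]);
                    swapped = true;
                }
            }
        }
    }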
