Write a function SplittingFind that uses path splitting for the find operations
Question
Write a function SplittingFind that uses path splitting for the find operation instead of path collapsing. Path splitting is defined below:
Definition [Path Splitting]: In path splitting, the parent pointer in each node (except the root and its child) on the path from i to the root is changed to point to the node's grandparent.
Note that when path splitting is used, a single pass from i to the root suffices. Tarjan and Van Leeuwen have shown that Lemma 5.6 holds when path splitting is used in conjunction with either the weight or height rule for unions. (Exercise 5.2 of Chapter 5.10)
Explanation / Answer
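Before anything else, here is a minimal sketch of what the requested SplittingFind might look like in C, based only on the definition above. It assumes the common parent-array representation of disjoint sets in which a root node holds a negative value (as under the weight rule); the array name and that convention are assumptions, not part of the question.

int SplittingFind(int parent[], int i)
{
    /* single pass from i to the root, splitting as we go */
    while (parent[i] >= 0) {            /* assumed convention: roots hold a negative value */
        int p = parent[i];              /* i's parent */
        if (parent[p] >= 0)             /* p is not the root, so i has a grandparent */
            parent[i] = parent[p];      /* split: point i at its grandparent */
        i = p;                          /* move up one level */
    }
    return i;                           /* i is now the root */
}

Note how a single upward pass both finds the root and repoints every node (except the root and its child) at its grandparent, matching the note in the question.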
Mergesort is an O(n log n) sort that works recursively by splitting the list into two halves, recursively sorting each half, and then merging the results. The merge algorithm takes two sorted lists and produces a merged sorted list by comparing the front values of the two lists and putting the smaller of each pair into a new array. There are two pointers, one for each array of numbers, each starting at the beginning of its array. The pointer into the list from which the smaller value was taken is incremented. At some point one of the pointers hits the end of its array, and then the remainder of the other array is simply copied to the new array.
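As a concrete illustration of the merge step, here is a sketch in C; the function and parameter names are ours, not from the notes, and out is assumed to have room for na + nb values.

void merge(const int a[], int na, const int b[], int nb, int out[])
{
    int i = 0, j = 0, k = 0;
    /* compare the front values of the two runs, taking the smaller each time */
    while (i < na && j < nb)
        out[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
    /* one run is exhausted: copy the remainder of the other */
    while (i < na) out[k++] = a[i++];
    while (j < nb) out[k++] = b[j++];
}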
The time complexity for merging is proportional to the sum of the lengths of the two arrays being merged, because after each comparison one of the two pointers advances, and the algorithm terminates when both pointers reach the ends of their arrays. This gives a recurrence equation of T(n) = 2T(n/2) + O(n), T(2) = 1, whose solution is O(n log n).
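One way to see where the O(n log n) solution comes from is to unroll the recurrence, writing the O(n) term as cn (a sketch, not a formal proof):

\[
T(n) = 2T(n/2) + cn = 4T(n/4) + 2cn = \cdots = 2^k\,T(n/2^k) + kcn,
\]

so after about lg n halvings the subproblems reach the base case, and each of the lg n levels contributes cn work, for O(n log n) in total.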
One important feature of Mergesort is that it is not in-place. That is, it uses extra space proportional to the size of the list being sorted. Most sorts are in-place, including insertion sort, bubble sort, heapsort and quicksort. Mergesort has a positive feature as well, in that the whole array does not need to be in RAM at the same time. It is easy to merge files off disks or tapes in chunks, so for this kind of application mergesort is appropriate. You can find a C version of mergesort in assignment 1 of How Computers Work (month 3).
Heapsort
Heapsort is the first sort we discuss whose efficiency depends strongly on an abstract data type called a heap. A heap is a binary tree that is as complete as possible. That is, we fill it in one level at a time, from left to right on each level. It has the property that the data value at each node is less than or equal to the data value at its parent. (Note that there is another abstract data type called a binary search tree that is not the same as a heap. Also, there is another heap used in the context of dynamic allocation of storage and garbage collection for programming languages such as Java or Scheme. This other heap has nothing to do with our heap. The heap from dynamic memory allocation has more of the usual English meaning, as in a heap of free memory, and is actually more like a linked list.)
A heap supports a number of useful operations on a collection of data values, including GetMax(), Insert(x), and DeleteMax(). The easiest way to implement a heap is with a simple array, where A[1] is the root, and the successive elements fill each level from left to right. This makes the children of A[i] turn up at locations A[2i] and A[2i+1]. Hence moving from a parent to a child, or vice versa, is a simple multiplication or integer division. Heaps also allow changing any data value while maintaining the heap property, Modify(i, x), where i is the index of the array and x is the new value. Heaps are a useful way to implement priority queues, a commonly used abstract data type (ADT) like stacks and queues.
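In C, that index arithmetic is often captured with a few macros; these names are illustrative, not from the notes:

#define PARENT(i) ((i) / 2)     /* integer division moves to the parent */
#define LEFT(i)   (2 * (i))     /* left child of A[i] */
#define RIGHT(i)  (2 * (i) + 1) /* right child of A[i] */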
To GetMax(), we need only pull the data value from the root of the tree. The other operations Insert(x), DeleteMax() and Modify(i, x) require more careful work, because the tree itself needs to be modified to maintain the heap property. The modification and maintenance of the tree is done by two algorithms called Heapify (page 143) and Heap-Insert (page 150). These correspond to the need to push a value up through the heap (Heap-Insert) or down through the heap (Heapify). If a value satisfies the heap property with respect to its parent but is smaller than at least one of its children, we push it downwards. If a value is greater than or equal to both its children but larger than its parent, we push it upwards. Some texts call these two methods simply PushUp and PushDown. The details of these two methods will be shown in class using examples. The time complexity for these operations is O(h), where h is the height of the tree, and in a heap h is O(lg n) because the tree is so close to perfectly balanced. An alternate way to calculate the complexity is the recurrence T(n) = T(2n/3) + O(1), T(1) = 0, whose solution is O(log n). The recurrence comes from the fact that the worst case splitting of a heap is 2/3 and 1/3 (page 144) on the two children.
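Here is a sketch of the two pushes in C, for a max-heap stored 1-indexed in A[1..n]; the names sift_down and sift_up stand in for Heapify and Heap-Insert, and the details are our assumptions rather than the text's exact code.

/* push A[i] down until it is >= both of its children */
void sift_down(int A[], int n, int i)
{
    while (2 * i <= n) {                      /* i has at least a left child */
        int c = 2 * i;
        if (c + 1 <= n && A[c + 1] > A[c])    /* pick the larger child */
            c = c + 1;
        if (A[i] >= A[c])                     /* heap property holds: done */
            break;
        int t = A[i]; A[i] = A[c]; A[c] = t;  /* swap with the larger child */
        i = c;
    }
}

/* place x in the next free slot and push it up toward the root */
void sift_up(int A[], int *n, int x)
{
    int i = ++*n;
    A[i] = x;
    while (i > 1 && A[i / 2] < A[i]) {        /* parent is smaller: swap upward */
        int t = A[i]; A[i] = A[i / 2]; A[i / 2] = t;
        i = i / 2;
    }
}

Each loop moves one level per iteration, which is where the O(h) = O(lg n) bound comes from.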
Heapsort works in two phases. The first phase builds a heap out of an unstructured array. The second phase is:
for index = n down to 2 {
    swap(A[1], A[index]);   // move the current maximum into its final position
    Heapify(1);             // restore the heap property on the shrunken heap A[1..index-1]
}
We will discuss the buildheap phase in class, and there is a problem on it in your Pset. It is also discussed at length in the text. The second phase works assuming the array is a heap. It extracts the largest value in the array, then the next largest, and so on, by repeatedly removing the top of the heap and swapping it into the next available slot, working backwards from the end of the array. Every iteration needs to restore the heap property, since a potentially small value has been placed at the top of the heap. After n-1 iterations, the array is sorted. Since each PushUp and PushDown takes O(lg n) and we do O(n) of these, that gives O(n log n) total time.
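Putting the two phases together, here is a self-contained sketch in C; the 1-indexed layout follows the discussion above, while the bottom-up build loop and all names are our assumptions.

#include <stdio.h>

/* push A[i] down within the heap A[1..n] */
static void sift_down(int A[], int n, int i)
{
    while (2 * i <= n) {
        int c = 2 * i;
        if (c + 1 <= n && A[c + 1] > A[c]) c++;   /* pick the larger child */
        if (A[i] >= A[c]) break;
        int t = A[i]; A[i] = A[c]; A[c] = t;
        i = c;
    }
}

void heapsort(int A[], int n)
{
    /* phase 1 (buildheap): heapify every internal node, bottom up */
    for (int i = n / 2; i >= 1; i--)
        sift_down(A, n, i);
    /* phase 2: repeatedly move the max to the end and shrink the heap */
    for (int i = n; i >= 2; i--) {
        int t = A[1]; A[1] = A[i]; A[i] = t;
        sift_down(A, i - 1, 1);
    }
}

int main(void)
{
    int A[] = {0, 5, 2, 9, 1, 7, 3};   /* A[0] unused: the heap is 1-indexed */
    heapsort(A, 6);
    for (int i = 1; i <= 6; i++)
        printf("%d ", A[i]);           /* prints 1 2 3 5 7 9 */
    printf("\n");
    return 0;
}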
Heapsort has some nice generalizations and applications, as you will see in your Pset, and it shows off the use of heaps, but it is not the fastest practical sorting algorithm.
Quicksort
Quicksort and its variations are the most commonly used sorting algorithms. Quicksort is a recursive algorithm that first partitions the array in place into two parts where all the elements of one part are less than or equal to all the elements of the second part. After this step, the two parts are recursively sorted.
The only part of Quicksort that requires any discussion at all is how to do the partition. One way is to take the first element A[0] and split the list into parts based on which elements are smaller or larger than A[0]. There are a number of ways to do this, but it is important to try to do it without introducing O(n) extra space, and instead accomplish the partition in place. The issue is that depending on A[0], the sizes of the two parts may be similar or extremely unbalanced, in the worst case being 1 and n-1. The worst case of Quicksort therefore gives a recurrence equation of T(n) = T(n-1) + O(n), T(1) = 0, whose solution is O(n^2).
The partition method we will review in class keeps pointers to the two ends of the array, moving them toward each other and swapping elements that are in the wrong parts. It is described on pages 154-155. An alternative partition algorithm is described in problem 8-2 on page 168.
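Here is one standard realization of that two-pointer idea in C (Hoare's scheme); the pivot choice and names are our assumptions, and the details may differ from the versions in the text.

#include <stdio.h>

/* partition around the pivot A[lo]: two pointers walk in from the ends,
   swapping out-of-place elements, until they cross */
static int partition(int A[], int lo, int hi)
{
    int pivot = A[lo];
    int i = lo - 1, j = hi + 1;
    for (;;) {
        do { j--; } while (A[j] > pivot);   /* find an element <= pivot from the right */
        do { i++; } while (A[i] < pivot);   /* find an element >= pivot from the left */
        if (i >= j)
            return j;                       /* A[lo..j] <= A[j+1..hi]: done */
        int t = A[i]; A[i] = A[j]; A[j] = t;
    }
}

void quicksort(int A[], int lo, int hi)
{
    if (lo < hi) {
        int p = partition(A, lo, hi);
        quicksort(A, lo, p);                /* note: p, not p - 1, with this partition */
        quicksort(A, p + 1, hi);
    }
}

int main(void)
{
    int A[] = {4, 1, 9, 2, 8, 3};
    quicksort(A, 0, 5);
    for (int i = 0; i < 6; i++)
        printf("%d ", A[i]);                /* prints 1 2 3 4 8 9 */
    printf("\n");
    return 0;
}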
The question is: why is Quicksort called an O(n log n) algorithm even though it is clearly worst case O(n^2)? It happens to run as fast as or faster than other O(n log n) algorithms in practice, so we had better figure out where the theory is falling short. It turns out that if we calculate the average case time complexity of Quicksort, we get an O(n log n) result. This is very interesting in that it agrees with what we see in practice. The analysis requires the solution of a complicated recurrence equation, T(n) = (2/n)(T(1) + T(2) + … + T(n-1)) + O(n), T(1) = 0, whose solution is obtained by guessing O(n log n) and verifying by mathematical induction, a technique with which you may be familiar. The solution also requires a closed form bound on the summation of k log k for k = 1 to n, another technique from discrete mathematics that you have seen before.
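For reference, here is a sketch of the induction step, writing the O(n) term as cn, guessing T(k) <= a k lg k, and using the standard bound sum_{k=1}^{n-1} k lg k <= (1/2) n^2 lg n - (1/8) n^2:

\[
T(n) \le \frac{2}{n}\sum_{k=1}^{n-1} a\,k\lg k + cn
     \le \frac{2a}{n}\left(\frac{n^2\lg n}{2} - \frac{n^2}{8}\right) + cn
     = a\,n\lg n - \left(\frac{a}{4} - c\right)n
     \le a\,n\lg n
\]

once a >= 4c, which confirms the O(n log n) guess.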