Given k sorted lists L1;L2; : : : ;Lk of n=k real numbers each, with 1 k n,
ID: 3650248 • Letter: G
Question
Given k sorted lists L1;L2; : : : ;Lk of n=k real numbers each, with 1 k n, design an O(n log k) time divide-and-conquer algorithm for sorting the set of n numbers in the k sorted lists....Explanation / Answer
take help from this Divide and Conquer The divide and conquer is a useful paradigm in designing ecient algorithms. The technique involves splitting the original problem into subproblems, solving the subproblems recursively and combining the solutions of the subproblems to obtain the solution to the original problem. The running time analysis of a divide and conquer algorithm usually involves writing a recursion. Merge Sort Merge Sort is an algorithm for sorting based on the divide and conquer paradigm. Here, we split the problem of sorting an n-length array into two subproblems { each involves sorting an n=2-length array { and combine the solutions to obtain a sorted array. The combining step involves merging two sorted array to obtain a sorted array. Hence, the name of the algorithm. The algorithm runs as follows. MergeSort(A1;:::;n) if jAj=1, then return A else return Merge(MergeSort(L1;:::;bn=2c), MergeSort(Lbn=2c+1;:::;n)) The Merge function takes as input two sorted arrays and outputs a sorted array containing all elements in both arrays. Merge(A,B) C Empty while(jAj > 0 and jBj > 0) if A1 0 Append A to C if jBj > 0 Append B to C return C Time Complexity We now analyze the running time of the algorithm. For this purpose, we view the execution of the algorithm as a tree. At the root, we have the whole array. At the rst level, we still have the whole array, but split into two halves. At the second level, we once again have the whole array, but split into four quarters. And so on, until we reach the bottom level which contains individual elements. Clearly, the depth of this tree is O(log n). At each level, two smaller arrays are merged to obtain the larger array that is the parent. This is done one element at a time. Thus, n elements have to be copied at each level of the tree. Therefore, the total running time complexity is O(n log n). PAGE 1/5 Week 4 CS6505: Computability & Algorithms Jan 30, Feb 1 Notation We have seen the big-oh notation several times, but in the future, we will also use big-omega and big-theta notation. Here are the denitions. g(n) = O(f(n)) if 9c; n0 s.t., 8n > n0, g(n) cf(n). g(n) = (f(n)) if 9c; n0 s.t., 8n > n0, g(n) cf(n). g(n) = (f(n)) if 9c1; c2; n0 s.t., 8n > n0, c1f(n) g(n) c2f(n). Master Theorem If we have a recurrence of the form T(n) = ( aT(n=b) + nc; if n > 1 d; if n=1 then, -bounds on the non-recurrent solution is given by T(n) = 8>< >: (nc) if logb a c Lower Bound for Comparison Sorting Is it possible to sort in time faster than O(n log n) time using comparisons only? No. Here is a proof. Assume the array is distinct (repetitions can only increase the number of comparisons needed). When all elements are distinct, we may as well map them to the naturals. So, assume that the input is an array containing the rst n naturals. Suppose we have a sorting algorithm. We may assume that the sorting algorithm outputs a permutation associated with the input. For example, if the input array is of the form [2; 3; 1; 4], the associated permutation is , where (1) = 2, (2) = 3, (3) = 1, (4) = 1. For each permutation of the rst n naturals, there exists an input for which is the correct answer. The sorting algorithm provides enough information to pinpoint which permutation corresponds to the input. Since our algorithm uses only comparisons to determine the order of elements, we can lower bound the running time by the number of comparisons performed. Each comparison returns only one of two possible results (less than" or greater than"). Thus, each comparison reveals exactly 1 bit of information about the permutation. Now, how many bits do we need to specify a particular permutation? There are n! permutations and the number of bits needed to specify a particular permutation is at least log2 (n!). Thus, we need at least log2 (n!) comparisons. But, by Stirling's approximation log2 (n!) = (log (n!)) = (n log n). Thus, the number of comparisons needed is at least (n log n). Stirling's Approximation log n! n log n ?? n = (n log n). PAGE 2/5 Week 4 CS6505: Computability & Algorithms Jan 30, Feb 1 k-th Order Selection Next, we look at an algorithm for nding the k-th smallest element in an array. How long does it take to nd the smallest (k = 1) element? We may not see the minimum until we see the end of the list, so we will need to scan all n entries. Thus, it takes O(n) time. Similarly, for k = n, we are asking for the largest element in the array and it takes O(n) time. What about the time to look up the k-th smallest element? An obvious idea is to scan the entire input for the minimum and remove it. Then, we could scan for the minimum of the remaining n??1 elements, which would give us the second largest. We could repeat this, giving us an O(kn) algorithm. Now, what about the median? The median is the dn=2e-th element in the array. The above algorithm gives us an O(n2) algorithm. However, this is clearly not optimal since there is an obvious O(n log n) algorithm. Pivot based algorithm Suppose, we select a pivot p from the n-element array. Now, compare the element p against all of the element in the array, sorting elements that are less than p into L and those that are greater than p into R. If the number jLj of elements less than p is less than k, then p is less than the k-th smallest element. So, we can recurse on the elements in R searching for the k ?? jLj-th smallest element. If the number jLj of elements greater than p is greater than k, then p is greater than the k-th smallest element. So, we can recurse on the element in L searching for the k-th smallest element. We have not introduced randomness as a resource in this course, but we can show that this will eliminate one-fourth of the elements during recursion each time 1. Suppose, we had a technique that eliminated one-fourth of the elements at each stage. Then, we would be left with 3=4-th of array at each stage. If we iterate over the entire list to gauge how good our pivot was, then we need O(n) time at each stage. So, we have the following analysis for the running time. T(n) = n + T 3n 4 = n + 3n 4 + T 3 4 3n 4 n 1 ?? 3=4 = 4n: In other words, this was a geometric series that we evaluated. So, at each stage, we iterated over our entire list and removed a small portion of our list. Since we removed a small portion of the list, we ran in linear time. Now, we will show an algorithm that guarantees that we eliminate a constant fraction without using randomness, and so also gives an upper bound of O(n) time. This algorithm is not obvious; for a long time, many thought that O(n log n) was the best possible. 1See CLRS Section 9-2 PAGE 3/5 Week 4 CS6505: Computability & Algorithms Jan 30, Feb 1 Without Randomness We proceed as follows. We have an array containing n elements. Partition the array into groups of size 5. Now, take the median of each 5-element set and collect these into a set S of medians. We will now nd the median of the set S of medians using this same algorithm and take p =median(S). What does the median of medians tell us? Among the n=5 medians, n=10 of the medians are less than p. Now, consider the 5-element partitions with medians less than the median of medians. Both the median and the two elements less than the median will be smaller than the median of medians. Counting these two elements with the medians themselves, we have at least 3n=10 elements less than the median of medians. Thus, jLj 3n=10 and hence, jRj 7n=10. By the same argument, we have at least 3n=10 elements greater than the median of medians. Thus, jRj 3n=10 and hence, jLj 7n=10. Thus, by picking the median of medians as the choice for the pivot, we can eliminate at least 3n=10 elements from consideration, making our list at most 7n=10 long. Now, we can apply our pivot based algorithm to recurse and nd the median, or another k-th smallest element we seek. Time Complexity Let us now analyze the time complexity as follows. First, nding medians of n=5 sets each of size 5 takes time c(n=5) = O(n). Finding the median of these medians takes time T(n=5). Second, once we have the median of medians, we require O(n) time to compare all elements in the array to p and time at most T(7n=10) to recurse on the set that contains the element we seek. So, our recursion is T(n) = T(n=5) + T(7n=10) + O(n): This recursion cannot be attacked by the Master Theorem, but we already know to expect a linear time algorithm, so let us just guess that our algorithm runs in time cn and verify if there exists a constant c for which the above equation holds. Substituting, we get cn = cn 5 + 7cn 10 + O(n) cn = 9cn 10 + kn for some k. This equation is true if c = 10k. The existence of a constant c for which the algorithm runs in time cn proves that the algorithm runs in time O(n). Variations What happens if we partition into groups of size 7? First of all, the step of nding the median of each partition will take longer, which increases the constant in the O(n) term in the recurrence. Using the same analysis as above, we get the following recurrence. T(n) = T(n=7) + T(10n=14) + O(n): Guessing that it is a linear function, we seek a constant c so that cn = 6cn 7 + kn: Taking c = 7k satises the equation and therefore, the algorithm runs in time O(n). PAGE 4/5 Week 4 CS6505: Computability & Algorithms Jan 30, Feb 1 What happens if we use groups of size three? The recurrence becomes, T(n) = T(n=3) + T(2n=3) + O(n): Guessing that this runs in linear time gives us the equation cn = cn + kn; which clearly has no solution for c when k is positive. In other words, the problem does not get smaller each time we recurse. So, we have to do linear work at all log n levels of the recursion tree, resulting in an O(n log n) time algorithm. Reading CLRS - Chapter 2, Section 3, Chapter 9Related Questions
Hire Me For All Your Tutoring Needs
Integrity-first tutoring: clear explanations, guidance, and feedback.
Drop an Email at
drjack9650@gmail.com
drjack9650@gmail.com
Navigate
Integrity-first tutoring: explanations and feedback only — we do not complete graded work. Learn more.