Recall the problem of finding the number of inversions. As in the text, we are g
ID: 3673711 • Letter: R
Question
Recall the problem of finding the number of inversions. As in the text, we are given a sequence of n numbers a1, . . . , an, which we assume are all distinct, and we define an inversion to be a pair i < j such that ai > aj.
We motivated the problem of counting inversions as a good measure of how different two orderings are. However, one might feel that this measure is too sensitive. Let’s call a pair a significant inversion if i < j and ai > 2aj. Give an O(n log n) algorithm to count the number of significant inversions between two orderings.
EXPLAIN COMPLEXITY AND PROVE ITS CORRECTNESSS
PLEASE ALSO PROVIDE IMPLEMENTATION CODE
Explanation / Answer
The algorithm is very similar to the algorithm of counting inversions. The only change is that here we separate the counting of significant inversions from the merge-sort process.
Algorithm:
Let A = (a1, a2, . . . , an).
Function CountSigInv(A[1...n])
if n = 1 return 0; //base case
Let L := A[1...floor(n/2)]; // Get the first half of A
Let R := A[floor(n/2)+1...n]; // Get the second half of A
//Recurse on L. Return B, the sorted L,
//and x, the number of significant inversions in $L$
Let B, x := CountSigInv(L);
Let C, y := CountSigInv(R); //Do the counting of significant split inversions
Let i := 1;
Let j := 1;
Let z := 0;
// to count the number of significant split inversions while(i <= length(B) and j <= length(C)) if(B[i] > 2*C[j]) z += length(B)-i+1; j += 1; else i += 1;
//the normal merge-sort process i := 1; j := 1;
//the sorted A to be output Let D[1...n] be an array of length n, and every entry is initialized with 0; for k = 1 to n if B[i] < C[j] D[k] = B[i]; i += 1; else D[k] = C[j]; j += 1; return D, (x + y + z);
Runtime Analysis: At each level, both the counting of significant split inversions and the normal merge-sort process take O(n) time, because we take a linear scan in both cases. Also, at each level, we break the problem into two subproblems and the size of each subproblem is n/2. Hence, the recurrence relation is T(n) = 2T(n/2) + O(n). So in total, the time complexity is O(n log n).
Correctness Proof: We will prove a stronger statement that on any input A our algorithm produces correctly the sorted version of A and the number of significant inversions in A. And, we will prove it by induction. Base Case: When n = 1, the algorithm returns 0, which is obviously correct. Inductive Case: Suppose that the algorithm works correctly for 1, 2, . . . , n 1. We have to show that the algorithm also works for n. By the induction hypothesis, B, x, C, y are correct. We need to show that the counting of significant split inversions and the normal merge-sort process are both correct. In other words, we are to show that z is the correct number of significant split inversions, and that D is correctly sorted. First, we prove that z is correct. Due to the fact that B and C are both sorted by the induction hypothesis, for every C[j], where j = 1, . . . , length(C), we can always find the smallest i such that B[i] > 2C[j]. Therefore, B[1], B[2], . . . , B[i 1] are all smaller than or equal to 2C[j], and B[i], B[i + 1], . . . , B[length(B)] are all greater than 2C[j]. Thus, length(B) i + 1 is the correct significant split inversion count for pairs involving C[j]. After counting the number of correct significant split inversions for all C[j], where j = 1, . . . , length(C), z is the correct number of significant split inversions. Second, we show that D is the sorted version of A. This is actually the proof of merge-sort. Each time, we append one element from B or C to D. At each iteration k, since B and C are both sorted by the induction hypothesis, min(B[i], C[j]), the one to be appended to D, is also the smallest number among all the numbers that have not been added to D yet. Consequently, every time we append the smallest remaining element (of B and C) to D, which makes D a sorted array
Related Questions
Navigate
Integrity-first tutoring: explanations and feedback only — we do not complete graded work. Learn more.