Academic Integrity: tutoring, explanations, and feedback — we don’t complete graded work or submit on a student’s behalf.

I want to approximate how close is the greedy algorithm to the optimal solution

ID: 647070 • Letter: I

Question

I want to approximate how close is the greedy algorithm to the optimal solution for the Set Cover Problem, which I'm sure most of you are familiar with, but just in case, you can visit the link above.
The problem is NP-Hard, and I'm trying to find a bound on how well does the greedy algorithm perform.
I know it looks a lot, but please bare with me. I pretty much did most of the work, I'm just missing that last small piece.

Here is the pseudo code:

Input: U - set of elements, F - family of sets s.t. ?S?FS=U
Output: C - a family of sets; C?F s.t. ?S?CS=U

The algorithm is pretty straight forward, and it is easy to see that it is indeed polynomial.

This is my attempt to approximate:
Claim 1: In a set U of m objects, that can be covered with k sets, there has to be a set S?U whose size is at least 1km.
Proof: Trivial. (I decided not to prove it)

A corollary is that given the situation described in that claim, the greedy algorithm will choose a set whose size is at least 1km.

Claim 2: Given a universe U, if there exist a cover of size k, then after k iterations, the greedy algorithm will cover at least half of the elements, meaning at least 12n elements.
Proof: By claim 1, in the first iteration the algorithm will cover at least 1kn elements. Upon entering the second iteration, there are at most n?n1k elements, and so the greedy algorithm will at least cover additional 1k(n?n1k) elements. In general, on the i'th iteration, the algorithm will cover 1k(n?ni?1k) elements. So after k iterations:

?i=1k1k(n?ni?1k)=?i=0k?11k(n?nik)

=?i=0k?1nk??i=0k?1nik2=n?nk2?i=0k?1i

=n?nk2(k(k?1)2)=n?n2(k?1k)?12n

OK, now this is where I need help: I know that in the first k iterations, the algorithm picks at least half of the elements. After another k iterations, another half was covered, out of what's left (meaning another 14n). So in general, I know that the bound is klogn, I just can't figure out how to formalize it.
What formula represents this behaviour? T(ki)=T(12in) and solve for i? It didn't work...
What formula or equation should I solve to actually show that the number of iterations is bounded by klogn?

Explanation / Answer

You have shown that if there is a set cover of size k and there are n elements left uncovered, then the greedy algorithm finds k sets which cover at least n/2 elements, and so at most n/2 elements are left uncovered. You can prove by induction that if there is a set cover of size k and there are n elements left uncovered, then for each t, the first tk sets found by the greedy algorithm cover all but at most n/2t elements. The idea now is to pick a value of t such that n/2t<1. Since the number of elements left uncovered by the algorithm is an integer, if the number m of elements left uncovered satisfies m<1, then in fact m=0, i.e., all elements are covered.

As a side note, you can actually improve on your reasoning and show that the first k sets cover at least (1?1/e)n elements, and so leave only n/e elements uncovered. This improves the bound from your klog2n to the sharp klnn, which is known to be tight (unless P=NP).

Hire Me For All Your Tutoring Needs
Integrity-first tutoring: clear explanations, guidance, and feedback.
Drop an Email at
drjack9650@gmail.com
Chat Now And Get Quote