Academic Integrity: tutoring, explanations, and feedback — we don’t complete graded work or submit on a student’s behalf.

Data compression algorithms (like gzip) take as input a bit string i of length n

ID: 3627090 • Letter: D

Question

Data compression algorithms (like gzip) take as input a bit string i of length n and produce as output a
bit string o of length m for which it is typically the case that m < n. Lossless compression algorithms
are able to perfectly reconstruct i from o. Prove that it is not possible for a lossless compression
algorithm to actually compress every string of length n. That is, for some string(s) it must be the case
that m >= n.
Hint: Think about what happens if two distinct input strings i1 and i2 both compress to
the same string o when it comes time to decompress and recover the input

Explanation / Answer

Please rate This follows pigeonhole principle which states that if n items are put into m pigeonholes with n > m, then at least one pigeonhole must contain more than one item. If a strings for length n produce an output of length m with m =n.