Academic Integrity: tutoring, explanations, and feedback — we don’t complete graded work or submit on a student’s behalf.

Consider unlabeled, rooted binary trees. We can compress such trees: whenever th

ID: 647974 • Letter: C

Question

Consider unlabeled, rooted binary trees. We can compress such trees: whenever there are pointers to subtrees T and T? with T=T? (interpreting = as structural equality), we store (w.l.o.g.) T and replace all pointers to T? with pointers to T. See uli's answer for an example.

Give an algorithm that takes a tree in the above sense as input and computes the (minimal) number of nodes that remain after compression. The algorithm should run in time O(nlogn) (in the uniform cost model) with n the number of nodes in the input.

This has been an exam question and I have not been able to come up with a nice solution, nor have I seen one.

Explanation / Answer

Yes, you can perform this compression in O(nlogn) time, but it is not easy :) We first make some observations and then present the algorithm. We assume the tree is initially not compressed - this is not really needed but makes analysis easier.

Firstly, we characterize 'structural equality' inductively. Let T and T? be two (sub)trees. If T and T? are both the null trees (having no vertices at all), they are structurally equivalent. If T and T? are both not null trees, then they are structurally equivalent iff their left children are structurally equivalent and their right children are structurally equivalent. 'Structural equivalence' is the minimal fixed point over these definitions.

For example, any two leaf nodes are structurally equivalent, as they both have the null trees as both their children, which are structurally equivalent.

As it is rather annoying to say 'their left children are structurally equivalent and so are their right children', we will often say 'their children are structurally equivalent' and intend the same. Also note we sometimes say 'this vertex' when we mean 'the subtree rooted at this vertex'.

The above definition immediately gives us a hint how to perform the compression: if we know the structural equivalence of all subtrees with depth at most d, then we can easily compute the structural equivalence of the subtrees with depth d+1. We do have to do this computation in a smart way to avoid a O(n2) running time.

The algorithm will assign identifiers to every vertex during its execution. An identifier is a number in the set {1,2,3,

Hire Me For All Your Tutoring Needs
Integrity-first tutoring: clear explanations, guidance, and feedback.
Drop an Email at
drjack9650@gmail.com
Chat Now And Get Quote