compare the performance of simple binary search trees versus AVL trees by readin
ID: 3583691 • Letter: C
Question
compare the performance of simple binary search trees versus AVL trees by reading in a corpus of text, storing the word and phrases therein into a search tree, and then performing manipulations on the resulting tree. You will implement the following functions for both kinds of trees: insert which inserts a value into the tree (or updates a frequency count) delete which decreases a frequency count (or removes a value from the tree) find which reports the frequency of a value in the tree These operations should maintain bst ordering. You will also implement an interpreter which processes requests to manipulate the trees. Your interpreter should handle the following commands: i W insert word or phrase W into the tree d W delete word or phrase W from the tree f W report the frequency of word or phrase W s show the tree r report statistics Words are spans of non-whitespace characters. After reading a word, you should remove all characters that are not ASCII letters. Next, you should convert all upper-case letters to lower case. For example, the word Boy's. would be rendered as boys. Do not insert empty strings into the tree. Phrases will be delineated by double quotes and will consist of all the characters between the opening and closing quotes. The same processing that is applied to words should be applied to phrases, as well. In addition, all spans of whitespace (spaces, tabs, and newlines) should be replaced with a single space. For example, the phrase "Girls' night out!" would be rendered as girls night out As with words, do not insert empty strings into the tree. Inserting an item already in the tree would increase its frequency count by one. When deleting an item from the tree, you should reduce its frequency count by one. If the frequency count goes to zero, you should remove its corresponding node completely from the tree. When showing the tree, display the nodes with a breadth-first (left-first) traversal. All nodes at a given level should be on the same line of output. The level number, starting at zero, should precede the display of the nodes at that level. Display each node according to the following format: an optional equals sign (if the node is a leaf), followed by the node value, followed by a minus sign if the node favors the left child or a plus sign if the child favors the right child, followed by a parenthesized display of the parent's value and favortism, followed by the frequency count, followed by an X if the node is the root, an L if the node is a left child, and an R otherwise Note that the parent of the root is itself. Displaying the tree should take ( n ) time. Here is an example of a completely balanced AVL tree: 0: beta(beta)1X 1: =alpha(beta)1L =gamma(beta)1R Here is an example of a less-than-perfectly-balanced AVL tree: 0: beta+(beta+)1X 1: =alpha(beta+)1L gamma-(beta+)1R 2: =delta(gamma-)1L The words should be ordered within a tree in case-insensitive lexicographic ordering. Suppose the corpus was: The quick brown fox jumped over the girl and her lazy, lazy dog. found in a file named data. Suppose the file commands holds the command: s Then a display of a bst generated by this corpus would look something like: $ trees -b data commands 0: the(the)2X 1: quick(the)1L 2: brown(quick)1L 3: =and(brown)1L fox(brown)1R 4: =dog(fox)1L jumped(fox)1R 5: girl(jumped)1L over(jumped)1R 6: =her(girl)1R =lazy(over)2L When inserted into a AVL tree, the resulting structure might look like: $ trees -a data commands 0: jumped(jumped)1X 1: fox(jumped)1L quick-(jumped)1R 2: brown(fox)1L girl+(fox)1R over-(quick-)1L =the(quick-)2R 3: =and(brown)1L =dog(brown)1R =her(girl+)1R =lazy(over-)2L The statistics to be reported are: the number of nodes in the tree the distance from the root to the closest node with a null child pointer (the root is at distance 0) the distance from the root to the furthest node with a null child pointer (the root is at distance 0) The commands will be read from a free format text file; individual tokens may be separated by arbitrary amounts of whitespace. For example, these three text file contents are all legal and equivalent: i "spongebob squarepants " f Patrick s or i "spongebob squarepants " f Patrick s or i "spongebob squarepants " f Patrick s Error handling You should ignore, but report via stderr, an attempt to delete an item that does not exist in the tree. Thus you ought to be able to randomly generate a large number commands and have your interpreter run without failing. Program invocation Your program will process a free-format corpus of tokens and double-quote delimited strings stored in a text file and a free-format text file containing an arbitrary number of commands. The name of the corpus and the file of commands will be passed to the interpreter as a command line arguments. Switching between the two tree implementations is to be accomplished by providing the command line options -b (simple bst) and -a (AVL tree). Here is an example call to your interpreter: trees -b corpus commands where corpus is a file of text and commands is the name of a file which contains a sequence of commands. In executing this call, you would read the words found in corpus, store them into a simple binary search tree, and the process the sequence of commands found in commands. The commands file may be empty. Program output All output should go to the console (stdout). When processing commands, only the result should be echoed to the console with a terminating newline. The commands should not be echoed. The insert and delete commands do not have a printable result and therefore should be processed silently (other than the error message arising from a bad deletion). Documentation All code you hand in should be attributed to the author. Comment sparingly but well. Do explain the purpose of your program. Do not explain obvious code. If code is not obvious, consider rewriting the code rather than explaining what is going on through comments. Project Organization Your working directory will hold your makefile and your source code. The top-level rule in your makefile should look like: trees : gcc -Wall -std=c99 -g trees.o [other .o files go here] -o trees You should also have a test rule that tests your implementation on a relatively small set of commands: test : trees @echo testing simple BST trees -b mytestcorpus mytestcommands @echo testing AVL tree trees -a mytestcorpus mytestcommands Your commands file should feature all of the interpreter commands. You can name your test files anything you want. Finally, you should have a clean rule which deletes all your object files and the executable. You may not use any built-in data structures. All of your data structures should have worst-case behavior that is typical for that structure. You must implement your interpreter in C99.
Explanation / Answer
When we need to compare AVL tree with simple binary search tree (BST) without balancing, then AVL will consume more memory (reason is - each node has to remember its balance factor) and each operation can be slower (because we need to maintain the balance factor and sometimes perform rotations too).
As mentioned, BST without balancing has a very bad (linear) worst case. But if we know that this worst case won't happen to us, or if we are fine if the operation is slow in rare cases, BST without balancing might be better than AVL.
/*
* SIMPLE BST data structure without balancing info.
*
*/
#define RIGHT (1)
#define LEFT (0)
#define TREE_NUM_CHILDREN (2)
struct tree {
/* Now we will make this an array so that some operations symmetric can be made */
struct tree *child[TREE_NUM_CHILDREN];
int key;
int height; /* This is the height of this node */
size_t size; /* This will be the size of subtree rooted at this node */
};
#define TREE_EMPTY_HEIGHT (-1)
#define TREE_EMPTY (0)
/* This will free all elements of a tree, replacing it with TREE_EMPTY */
void treeDestroy(struct tree **root);
/* This will insert an element into a tree pointed to by root */
void treeInsert(struct tree **root, int newElement);
/* This will return 1 if target is in tree, 0 otherwise */
int treeContains(const struct tree *root, int target);
/* This will delete minimum element from the tree and return its key */
/* do not call this on an empty tree */
int treeDeleteMin(struct tree **root);
/* This will delete target from the tree */
/* has no effect if target is not in tree */
void treeDelete(struct tree **root, int target);
/* This will return height of tree */
int treeHeight(const struct tree *root);
/* This will return size of tree */
size_t treeSize(const struct tree *root);
/* This will pretty-print the contents of a tree */
void treePrint(const struct tree *root);
/* This will return the number of elements in tree less than target */
size_t treeRank(const struct tree *root, int target);
/* This will return an element with the given rank */
/* rank must be less than treeSize(root) */
int treeUnrank(const struct tree *root, size_t rank);
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include "tree.h" //Include tree.h
int
treeHeight(const struct tree *root) //treeHeight
{
if(root == 0) { //if statement
return TREE_EMPTY_HEIGHT;
} else {
return root->height;
}
}
/* This will re-compute height from height of kids */
static int
treeComputeHeight(const struct tree *root) //treeComputeHeight() function
{
int childHeight; //childHeight declared
int maxChildHeight; //maxChildHeight declared
int i;
if(root == 0) { //if statement begins here
return TREE_EMPTY_HEIGHT;
} else {
maxChildHeight = TREE_EMPTY_HEIGHT;
for(i = 0; i < TREE_NUM_CHILDREN; i++) {
childHeight = treeHeight(root->child[i]);
if(childHeight > maxChildHeight) { //IF Statement condition
maxChildHeight = childHeight;
}
}
return maxChildHeight + 1; //return condition
}
}
size_t //size_t declaration
treeSize(const struct tree *root)
{
if(root == 0) { //if statement
return 0;
} else {
return root->size;
}
}
/* This will recompute size from size of kids */
static int
treeComputeSize(const struct tree *root) //treeComputeSize() function declared here
{
int size;
int i;
if(root == 0) {
return 0;
} else {
size = 1;
for(i = 0; i < TREE_NUM_CHILDREN; i++) { //if statement
size += treeSize(root->child[i]);
}
return size;
}
}
/* This will fix the aggregate data in root */
/* after assuming children are correct */
static void
treeFix(struct tree *root)
{
if(root) { //if statement
root->height = treeComputeHeight(root);
root->size = treeComputeSize(root);
}
}
/* The below will rotate child in given direction to root */
static void
treeRotate(struct tree **root, int direction)
{
struct tree *x;
struct tree *y;
struct tree *b;
y = *root; assert(y); //y value declared
x = y->child[direction]; assert(x); //x value declared
b = x->child[!direction]; //b value declared
/* This will do the rotation */
*root = x;
x->child[!direction] = y; //x value
y->child[direction] = b; //y value
/* fix y then x */
treeFix(y);
treeFix(x);
}
int tallerKid;
if(*root) {
for(tallerKid = 0; tallerKid < TREE_NUM_CHILDREN; tallerKid++) { //FOR loop condition starts here
if(treeHeight((*root)->child[tallerKid]) >= treeHeight((*root)->child[!tallerKid]) + 2) {
/* The below will check if zig-zag: opposite-direction nephew is the tall one */
/* this also covers case where both nephews are too tall */
if(treeHeight((*root)->child[tallerKid]->child[!tallerKid]) //if statement
>= treeHeight((*root)->child[tallerKid]) - 1) {
/* zig zag case */
treeRotate(&(*root)->child[tallerKid], !tallerKid);
}
/* This will fall through to zig zig case */
treeRotate(root, tallerKid);
/* If don't bother with other kid */
break;
}
}
}
}
/* This will free all elements of a tree, replacing it with TREE_EMPTY */
void
treeDestroy(struct tree **root) //treeDestroy() function used here
{
int i;
if(*root) { //if statement
for(i = 0; i < TREE_NUM_CHILDREN; i++) {
treeDestroy(&(*root)->child[i]);
}
free(*root);
*root = 0;
}
}
Related Questions
Navigate
Integrity-first tutoring: explanations and feedback only — we do not complete graded work. Learn more.