Question
Can you help me, please!
Project Description – Web Crawler
For this project, you will be completing a web crawler. The web crawler will start at some web page, follow a random link on that page, then follow a random link on that new page, and so on, keeping a list of the pages it has visited. Your part of this project is to complete the linked list data structure that stores the web addresses the crawler visits. Note that this project deals with a singly-linked list (unlike the doubly-linked list we discussed in class). Singly-linked lists are simpler, because each node just points to the next node (i.e., no “previous” pointers). The last node in the list has a NULL next pointer to indicate the end. (FYI, the disadvantage of singly-linked lists is that you can’t traverse them backwards, while doubly-linked lists let you go forward and backward.)
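For reference, the node type in crawler.c probably looks something like the sketch below; the field names and the MAX_ADDR_LENGTH constant are assumptions based on the description, so check them against the actual file.

struct listNode {
    char addr[MAX_ADDR_LENGTH];     /* the web address stored in this node (assumed fixed-size buffer) */
    struct listNode *next;          /* pointer to the next node; NULL marks the end of the list */
};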
The file crawler.c contains some code already written. You should only modify that file in the places that are marked "TODO: complete this". There are comments to help guide you through the code. Note that there are four functions that operate on the linked list that you need to complete: contains, insertBack, printAddresses, and destroyList. Each of these functions must be recursive. The comments indicate how those functions should behave. Only "printAddresses" should print any output to the terminal.
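As an illustration of the recursive pattern each of these functions should follow, here is a sketch of a recursive traversal. countNodes is a hypothetical helper that is not part of crawler.c, and the next field name is assumed from the description above.

int countNodes(const struct listNode *pNode){
    if(pNode == NULL)                       /* base case: ran past the last node */
        return 0;
    return 1 + countNodes(pNode->next);     /* recursive case: count this node plus the rest */
}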
A couple of other notes that arise from dealing with real data:
1. You might get errors from the Python script "getLinks.py" that parses the web pages (because they might be malformed or in an unexpected language). If so, just rerun your program to get a different random crawling trajectory.
2. Because of the network traffic involved in fetching these pages, you should limit the number of hops to 50 or fewer when you are running your project.
To successfully complete the assignment, you must complete the four linked-list functions in crawler.c so that they are RECURSIVE functions that behave as the comments require. Your project should not have any memory leaks. Use the following command to find leaks:
valgrind --leak-check=yes ./crawler
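For valgrind’s report to include useful line numbers, compile with debugging symbols first. The exact gcc invocation (and any command-line arguments ./crawler expects) depends on your setup, so treat this as a sketch:

gcc -g -o crawler crawler.c
valgrind --leak-check=yes ./crawler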
Here is the code. There are four functions that you need to complete: contains, insertBack, printAddresses, and destroyList; I have bolded them below.
crawler.c
Explanation / Answer
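Below are recursive versions of the four functions. They assume the usual layout of crawler.c: struct listNode holds a fixed-size char array addr and a struct listNode *next pointer, and the file already includes <stdio.h>, <stdlib.h>, and <string.h> for printf, malloc/free, and strcmp/strcpy. Those details are assumptions from the question, not checked against your file.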
int contains(const struct listNode *pNode, const char *addr){
    if(pNode == NULL)                       /* base case: reached the end of the list without a match */
        return 0;
    if(strcmp(pNode->addr, addr) == 0)      /* base case: this node holds the address */
        return 1;
    return contains(pNode->next, addr);     /* recursive case: search the rest of the list */
}
void insertBack(struct listNode *pNode, const char *addr){
    if(pNode->next == NULL){
        /* base case: pNode is the last node, so attach the new node here */
        struct listNode *node = malloc(sizeof(struct listNode));
        strcpy(node->addr, addr);           /* copy the address into the node (addr is assumed to be a char array) */
        node->next = NULL;
        pNode->next = node;
    } else {
        insertBack(pNode->next, addr);      /* recursive case: keep walking toward the end */
    }
}
void printAddresses(const struct listNode *pNode){
    if(pNode == NULL)                       /* base case: end of the list, nothing left to print */
        return;
    printf("%s\n", pNode->addr);            /* print this node's address */
    printAddresses(pNode->next);            /* recursive case: print the rest of the list */
}
void destroyList(struct listNode *pNode){
    if(pNode == NULL)                       /* base case: nothing left to free */
        return;
    destroyList(pNode->next);               /* recursive case: free the rest of the list first */
    free(pNode);                            /* then free this node */
}
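To sanity-check these functions outside the crawler, you could drive them with a small test harness like the one below, placed after the four functions in the same file. The MAX_ADDR_LENGTH constant, the struct definition, and the way the head node is created are assumptions; adapt them to whatever crawler.c actually uses.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ADDR_LENGTH 1000                /* assumed constant; use the value crawler.c defines */

struct listNode {                           /* assumed to match the definition in crawler.c */
    char addr[MAX_ADDR_LENGTH];
    struct listNode *next;
};

int main(void){
    /* create a head node holding the first visited address */
    struct listNode *head = malloc(sizeof(struct listNode));
    strcpy(head->addr, "http://www.example.com");
    head->next = NULL;

    insertBack(head, "http://www.example.org");                   /* append a second address */

    printf("%d\n", contains(head, "http://www.example.org"));     /* expect 1 */
    printf("%d\n", contains(head, "http://www.example.net"));     /* expect 0 */

    printAddresses(head);                                         /* prints both addresses, one per line */
    destroyList(head);                                            /* frees every node; valgrind should report no leaks */
    return 0;
}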