Question
One of the services Twitter provides its users is the ability to track the most popular topics. For this part of the assignment you will do something similar. Your task is to keep track of the topics identified by users with the hashtag symbol ‘#’. You will also need to count the frequency of the hashtags you found and provide a ranking of hashtags based on their frequency. The output of your script should be one file, named top_hashtags.txt, with the N most popular hashtags, where N is a parameter to your function (Python). For example, assume this is the content of your twitter_data.txt file:
#lebron best athlete of our generation
ML 5 Demos! Lots of great stuff to come! Yes, I'm excited. :) http://htmlfive.appspot.com #io2009 #googleio
At GWT fireside chat #googleio
@khalid0456 No, Lebron is the best #lebron
If N is set to 2, then your script should generate a file top_hashtags.txt with the following content (note that in case of ties the order doesn’t matter):
#googleio 2
#lebron 2
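The example can be checked in memory before touching any files. The following is a minimal sketch, assuming a hypothetical helper `count_hashtags` (not part of the assignment) and the four sample tweets typed in as a list rather than read from twitter_data.txt:

```python
import re
from collections import Counter

# The four sample tweets from the example above, held in memory
# instead of being read from twitter_data.txt.
sample_tweets = [
    "#lebron best athlete of our generation",
    "ML 5 Demos! Lots of great stuff to come! Yes, I'm excited. :) http://htmlfive.appspot.com #io2009 #googleio",
    "At GWT fireside chat #googleio",
    "@khalid0456 No, Lebron is the best #lebron",
]


def count_hashtags(tweets):
    """Hypothetical helper: map each hashtag (with its '#') to its frequency."""
    counts = Counter()
    for tweet in tweets:
        # '#' followed by one or more word characters (letters, digits, underscore)
        counts.update(re.findall(r"#\w+", tweet))
    return counts


counts = count_hashtags(sample_tweets)
# Print the two most frequent hashtags; tied tags may appear in either order
for tag, count in counts.most_common(2):
    print(tag, count)
```

For this input the counts are 2 for #lebron, 2 for #googleio and 1 for #io2009, so the two tied tags make up the top 2 in either order.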
twitter_data.txt link : https://drive.google.com/open?id=0BzB5lIrANOIPNXJVb3ZnbksxVTg
Explanation / Answer
Solution: See the code below
---------------------------------------------------
# This script extracts the top N hashtags from Twitter data and stores them in a file.
import codecs
import re
from collections import Counter

def top_hashtags(n, filename="twitter_data.txt", output_filename="top_hashtags.txt"):
    # Read the whole Twitter data file as UTF-8 text
    with codecs.open(filename, "r", "utf-8") as data_file:
        twitter_data = data_file.read()
    # Extract hashtags: '#' followed by one or more word characters
    pattern = re.compile(r"#(\w+)")
    tags = pattern.findall(twitter_data)
    # Count the frequency of each hashtag
    tag_counts = Counter(tags)
    # Write the n most frequent hashtags, one "#tag count" pair per line
    with codecs.open(output_filename, "w", "utf-8") as output_file:
        for tag, count in tag_counts.most_common(n):
            output_file.write("#%s %d\n" % (tag, count))

top_hashtags(2)  # N = 2, as in the example from the question
--------------------------------------------------------
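The question allows tied hashtags in any order, but a reproducible order can make the output file easier to compare between runs. The following is a minimal sketch of one way to get that, assuming a hypothetical helper `rank_hashtags` that takes a `Counter` of tag names (stored without the '#') and breaks ties alphabetically. The alphabetical rule is an assumption for illustration, not something the assignment requires.

```python
from collections import Counter


def rank_hashtags(tag_counts, n):
    """Hypothetical helper: top-n (tag, count) pairs, ties broken alphabetically."""
    # Sort by descending count first, then by tag name for equal counts
    ordered = sorted(tag_counts.items(), key=lambda item: (-item[1], item[0]))
    return ordered[:n]


# Counts for the example data in the question, entered by hand
example_counts = Counter({"lebron": 2, "io2009": 1, "googleio": 2})
for tag, count in rank_hashtags(example_counts, 2):
    print("#%s %d" % (tag, count))
# prints:
# #googleio 2
# #lebron 2
```

Sorting every distinct tag costs more than `most_common(n)` when there are many of them, so this trade-off is only worthwhile if a stable order matters.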