In this post I continue my journey into the Twitterverse with R and capture the tweet frequency for the hashtags #NaMo, #AAP and #RaGa over the last 7 days. This seemed the most appropriate thing to do given that the 16th Indian General Election (2014) is just around the corner. The handshake that has to be established with Twitter is the same as described in my last post, “To R is human …”.
Here is a great blog post on measuring tweet frequencies – Getting Genetics Done by Stephen Turner.
Once the initial handshake is done, the tweets can be searched. It appears that searchTwitter can only search tweets from the last 7 days, and even then only returns a maximum of 1500 tweets.
The search is done as follows for the hashtag #NaMo. The dates variable holds 8 date strings, which bound 7 one-day windows. The for loop performs a searchTwitter for each of the last 7 days.
# Search the last 7 days for the hashtag #NaMo, one day at a time
library(twitteR)
dates <- paste("2014-03-", 11:18, sep="") # need to go to 18th to catch tweets from 17th
tweets <- list() # start with an empty list so the loop can append to it
for (i in 2:length(dates)) {
  print(paste(dates[i-1], dates[i]))
  tweets <- c(tweets, searchTwitter("#NaMo", since=dates[i-1], until=dates[i], n=1500))
}
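As a side note, the date strings can also be generated from Date objects rather than pasted together by hand, which keeps working when the week spans a month boundary. The following is only a sketch: date_windows is a hypothetical helper that is not part of the original code, and it assumes (as the comment in the loop above suggests) that the until date is exclusive.

```r
# Hypothetical helper (not from the original post): build consecutive
# (since, until) pairs covering n_days one-day windows ending on last_day.
date_windows <- function(last_day, n_days = 7) {
  last_day <- as.Date(last_day)
  # n_days + 1 boundaries are needed to bound n_days windows;
  # the final boundary is the day after last_day.
  bounds <- seq(last_day - (n_days - 1), last_day + 1, by = "day")
  data.frame(since = format(head(bounds, -1)),
             until = format(tail(bounds, -1)),
             stringsAsFactors = FALSE)
}

windows <- date_windows("2014-03-17")
print(windows)
```

Each row of windows could then supply the since and until arguments for one searchTwitter call.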
The tweets are then converted to a data frame for processing.
# Create a dataframe from the tweets
tweets <- twListToDF(tweets)
tweets <- unique(tweets)
Finally, the tweets are plotted using ggplot2.
# Plot the frequency of tweets in 2 hour windows
library(ggplot2)
minutes <- 120
ggplot(data=tweets, aes(x=created)) +
  geom_histogram(aes(fill=after_stat(count)), binwidth=60*minutes) + # binwidth is in seconds
  scale_x_datetime("Date") +
  scale_y_continuous("Frequency") +
  ggtitle("#NaMo Tweet Frequency March 11-17") + # opts() is no longer available in current ggplot2
  theme(legend.position="none")
ggsave(filename="NaMo-frequency.png", width=7, height=7, dpi=100)
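To make the 2 hour binning concrete, here is a rough sketch in base R of the kind of counting the plot performs. The timestamps are made up for illustration and stand in for the created column of the data frame; the exact bin edges chosen by ggplot2 may differ from the ones computed here.

```r
# Made-up timestamps standing in for the 'created' column (illustration only)
created <- as.POSIXct(c("2014-03-11 00:15:00", "2014-03-11 01:59:59",
                        "2014-03-11 02:00:00", "2014-03-11 05:30:00"),
                      tz = "UTC")

minutes <- 120
bin_seconds <- 60 * minutes  # POSIXct values are measured in seconds

# Round each timestamp down to the start of its 2 hour window
bin_start <- as.POSIXct(floor(as.numeric(created) / bin_seconds) * bin_seconds,
                        origin = "1970-01-01", tz = "UTC")

# Count the tweets that fall in each window
counts <- as.data.frame(table(window = format(bin_start, "%Y-%m-%d %H:%M")),
                        responseName = "tweets")
print(counts)
```

This is why the binwidth is given as 60*minutes: the x axis is a date-time, so widths are expressed in seconds.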
The plot for #NaMo is shown below.
The same is performed for #AAP.
And for #RaGa.
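Since the same search is repeated for each hashtag, the loop could be wrapped in a function. The sketch below is one possible way to do that and is not the code from the repository: collect_tweets and its search_fn parameter are hypothetical names. The search function defaults to searchTwitter from twitteR, but it can be swapped for a stand-in, which is what the dry run here does so that it runs without Twitter credentials.

```r
# Hypothetical wrapper (not from the original post): run one search per
# consecutive pair of dates and combine the results into a single list.
collect_tweets <- function(hashtag, dates, n = 1500,
                           search_fn = twitteR::searchTwitter) {
  tweets <- list()
  for (i in seq_len(length(dates) - 1)) {
    tweets <- c(tweets,
                search_fn(hashtag, since = dates[i], until = dates[i + 1], n = n))
  }
  tweets
}

# Dry run with a stand-in search function that just records its arguments
fake_search <- function(query, since, until, n) {
  list(paste(query, since, until))
}
dry_run <- collect_tweets("#AAP", c("2014-03-11", "2014-03-12", "2014-03-13"),
                          search_fn = fake_search)
print(dry_run)
```

With the handshake in place, calling collect_tweets("#RaGa", dates) with the vector of date strings built earlier would use the real searchTwitter instead of the stand-in.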
While the number of tweets for #NaMo is very high, #RaGa seems to occur in lower numbers but consistently every day.
Of course, we can also check whether the sentiment of the tweets for each hashtag is positive or negative. That's for another day though.
The code can be cloned at Rtweet-frequency