“To R is human, to dabble in it fun” one could say. In this post I try to be a little of Nate Silver looking at Twiiterverse. Since the Indian general election 2014 is around the corner for constituting the 16th Lok Sabha in India I wanted to play around a little bit. Anyway here goes.
To get started on Twitter, with R we first need to establish a handshake between Twitter and R. We need to authenticate our R application with Twitter to enable us to mine the tweets in Twitterverse.. The steps are fairly straightforward. The R app you create has to authenticated and authorized with Twitter.
The first step is to create an app at Twitter at http://dev.twitter.com.. Login to your twitter account. Click the drop down at your photo and choose “My applications”. Then click “Create new application”. Now do the following
– Enter a unique name for your application
– Enter a description
– For the ‘Website’ enter any valid URL
– Leave the Callback URL blank
– Accept the conditions
Leave this in your browser. The handshake between your R application and Twitter needs to be established as follows
#install the necessary packages
install.packages("ROAuth")
install.packages("twitteR")
install.packages("wordcloud")
install.packages("tm")
library("ROAuth")
library("twitteR")
library("wordcloud")
library("tm")
library(RCurl)
# Set SSL certs globally
options(RCurlOptions = list(cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl")))
require(twitteR)
reqURL <- "https://api.twitter.com/oauth/request_token"
accessURL <- "https://api.twitter.com/oauth/access_token"
authURL <- "https://api.twitter.com/oauth/authorize"
Now go to your browser. In the created Twitter application, choose the API Keys tab. Copy and paste the API key and API secret in the next 2 lines
apiKey <- "Your API key here"
apiSecret <- "Your API secret here"
twitCred twitCred$handshake(cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl"))
When you enter this you should see the following
To enable the connection, please direct your web browser to:
https://api.twitter.com/oauth/authorize?oauth_token=WnTGL4eHsiNJRFRiW1UU3GoYSvVZiYDBbO3WAsZO
Copy and paste the link given in a new tab in your browser. Copy the 7 digit PIN and paste it in the space below
When complete, record the PIN given to you and provide it here: 7377963
registerTwitterOAuth(twitCred)
This should complete the authorization. Now you are good to go.
Here is a short example of performing Text Mining with the help of package “tm”.
I wanted to create a word cloud around the hashtag #NaMo
So here is the code. We need to create a Corpus
#Search Twitter for the hashtag #NaMo
#Search Twitter for the hashtag #NaMo
r_stats<- searchTwitter("#NaMo",n=500, cainfo="cacert.pem")
# Now create a word cloud
wordcloud(r_stats_text_corpus)
This will create a Wordcloud of the words most used with the hashtag, in this case #NaMo
You can clone the code at Rwordcloud
Watch this space. Hasta la vista. I’ll be back!