To R is human …

“To R is human, to dabble in it fun,” one could say. In this post I try to be a little like Nate Silver looking at the Twitterverse. Since the 2014 Indian general election, which will constitute the 16th Lok Sabha, is around the corner, I wanted to play around a little. Anyway, here goes.

To get started with Twitter from R, we first need to establish a handshake between Twitter and R. The R app you create has to be authenticated and authorized with Twitter before it can mine tweets in the Twitterverse. The steps are fairly straightforward.

The first step is to create an app on Twitter at http://dev.twitter.com. Log in to your Twitter account, click the drop-down at your photo and choose “My applications”. Then click “Create new application” and do the following:
– Enter a unique name for your application
– Enter a description
– For the ‘Website’ enter any valid URL
– Leave the Callback URL blank
– Accept the conditions

Leave this page open in your browser. The handshake between your R application and Twitter is then established as follows:

# Install the necessary packages
install.packages("ROAuth")
install.packages("twitteR")
install.packages("wordcloud")
install.packages("tm")
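As an optional variation on the install step, the sketch below installs only the packages that are not already present. The variable names and the interactive() guard are my own choices rather than part of the original workflow, and RCurl is included only because it is loaded in the next step.

```r
# Packages used in this post (RCurl is loaded later, so it is listed here too)
needed <- c("ROAuth", "twitteR", "wordcloud", "tm", "RCurl")

# Names of packages already installed in the current library paths
installed <- rownames(installed.packages())

# Packages from the list that still need installing
missing_pkgs <- setdiff(needed, installed)

# Install only when something is missing, and only in an interactive session
if (interactive() && length(missing_pkgs) > 0) {
  install.packages(missing_pkgs)
}
```

Run from a script (non-interactively), this only computes missing_pkgs and installs nothing, which avoids surprise downloads in batch jobs.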

library("ROAuth")
library("twitteR")
library("wordcloud")
library("tm")
library(RCurl)
# Set SSL certs globally
options(RCurlOptions = list(cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl")))

require(twitteR)
reqURL <- "https://api.twitter.com/oauth/request_token"
accessURL <- "https://api.twitter.com/oauth/access_token"
authURL <- "https://api.twitter.com/oauth/authorize"

Now go back to your browser. In the Twitter application you created, choose the API Keys tab. Copy the API key and API secret and paste them into the next two lines:

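apiKey <- "Your API key here"
apiSecret <- "Your API secret here"
# Build the OAuth credential object from the keys and endpoints defined above
twitCred <- OAuthFactory$new(consumerKey = apiKey,
                             consumerSecret = apiSecret,
                             requestURL = reqURL,
                             accessURL = accessURL,
                             authURL = authURL)
twitCred$handshake(cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl"))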

When you enter this, you should see the following:
To enable the connection, please direct your web browser to:
https://api.twitter.com/oauth/authorize?oauth_token=WnTGL4eHsiNJRFRiW1UU3GoYSvVZiYDBbO3WAsZO

Copy the link and paste it into a new tab in your browser. Then copy the 7-digit PIN shown there and paste it at the prompt below:
When complete, record the PIN given to you and provide it here: 7377963

registerTwitterOAuth(twitCred)

This should complete the authorization. Now you are good to go.
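Since the handshake asks for a fresh PIN each time, one common option is to save the credential object to disk and reload it in later sessions. The following is a minimal sketch under my own assumptions: the helper names save_credential and load_credential and the file name are hypothetical, and a plain list stands in for twitCred so the example runs without a Twitter connection.

```r
# Hypothetical file location; any writable path would do
cred_file <- file.path(tempdir(), "twitter_credentials.RData")

# Save a credential object under the name 'twitCred'
save_credential <- function(cred, path) {
  twitCred <- cred
  save(twitCred, file = path)
  invisible(path)
}

# Load the saved object back without touching the global environment
load_credential <- function(path) {
  env <- new.env()
  load(path, envir = env)
  env$twitCred
}

# Stand-in object for illustration only (not a real ROAuth credential)
dummy_cred <- list(consumerKey = "abc", handshakeComplete = TRUE)

save_credential(dummy_cred, cred_file)
restored <- load_credential(cred_file)
print(identical(restored, dummy_cred))
```

In a real session you would pass twitCred itself after the handshake, and call registerTwitterOAuth() on the reloaded object. The saved file contains your API secret, so keep it out of shared folders and version control.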

Here is a short example of performing Text Mining with the help of package “tm”.

I wanted to create a word cloud around the hashtag #NaMo.

Here is the code. We first need to create a corpus from the tweets.

# Search Twitter for the hashtag #NaMo
r_stats <- searchTwitter("#NaMo", n = 500, cainfo = "cacert.pem")

# Save text

r_stats_text <- sapply(r_stats, function(x) x$getText())

# Create a corpus

r_stats_text_corpus <- Corpus(VectorSource(r_stats_text))
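The text-extraction and corpus steps above can be tried offline before touching the Twitter API. The sketch below is only an illustration: make_fake_status is a hypothetical helper that mimics the getText() method of a tweet object with a plain list, and the two sample tweets are made up.

```r
library(tm)

# Hypothetical stand-in for a tweet object: a list with a getText() function
make_fake_status <- function(text) {
  list(getText = function() text)
}

# Made-up sample data in place of the searchTwitter() result
fake_results <- list(
  make_fake_status("Big rally today #NaMo"),
  make_fake_status("Who wins in 2014? #NaMo")
)

# Same extraction pattern as above: one character string per tweet
fake_text <- sapply(fake_results, function(x) x$getText())

# Build a corpus with one document per tweet
fake_corpus <- Corpus(VectorSource(fake_text))

print(length(fake_corpus))        # number of documents in the corpus
print(content(fake_corpus[[1]]))  # text of the first document
```

Each element of the character vector becomes one document in the corpus, which is what the later clean-up and word cloud steps operate on.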

# Clean up the text