Literacy in India – A deepR dive

Published in R-bloggers: Literacy in India – A deepR dive
You can do magic!
You can have anything,
That you desire
Magic…
You can do magic – song by America (1982)

That is exactly how I feel when I write code in R. A few lines of R, lo behold, hundreds of rows and columns are magically transformed into  easily understandable graphs, regression curves or choropleth maps. (By the way, the song is a really cool! Listen to it if you have not heard it before). You really can do magic with R

In this post I do a deep dive into literacy in India The dataset is taken from Open Government Data (OGD) platform India was used for this purpose. This data is based on the 2001 census. Though the data is a little dated, it is extremely rich with literacy details across different age groups, and over all Indian States. The data includes the total number of persons/males/females who are in the primary, middle.matric, college,technical diploma, non-technical diploma and so on. In fact the data also includes the educational background of people in the districts in each state. I slice and dice the data across multiple parameters. I have created an interactive Shiny App which will provide very detailed visualization based on the parameters chosen

Do try out my interactive Shiny app : IndiaLiteracy

The entire code for this app is on GitHub. Feel free to download/clone/fork/modify or enhance the code – IndiaLiteracy

For analyzing   such a rich data set as the Census data of 2001, I create 4 tabs
1) State Literacy
2) Educational Levels vs Age
3) India Literacy and
4) District Literacy

Here are the details of these 4 tabs in my Shiny app

A) State Literacy
This tab provides the age wise distribution of people (Persons/Males/Females) who attend educational institutions. This is shown as a barplot. The plot also includes the national average. In the plot below which is for entire India we see that the national average

1

The distribution of females attending primary school in the state of Haryana is shown. Also included is the national average. As can be seen there are options for (Total/Urban/Rural) against (Persons/Males/Females) and whether these people attend educational institutions are illiterate of literate.

2

I also have another option under “Who’ which is “All” This will plot the age wise distribution of males/females/persons in urban/rural or entire state.

3

B. Educational Institutions vs Age plot

This plot displays the the educational institutions attended by people in a particular age group. So for example in the state of Orissa for the 18 year age group we can see that there persons who are in (Primary, Matric, Higher Secondary, Non-Technical Diploma and Technical Diploma). The bar length for each color is the percentage of the total persons at that level of education

4

C. Literacy across India
This tab plots a chorpleth map for a region(Urban+Rural, Urban, Rural), Who(Persons, Males, Females) and the literacy level (attending educational institutions, primary, higher secondary, Matric etc) across the whole of India.

6

D. Literacy within a state
This tab plots a chorpleth map of literacy in the districts of a state. A sample plot for Karnataka is shown below

7

E. Key observations

There is a wealth of insights you can glean by looking at the various charts. Here a few insights from my initial observations
1) The literacy in Kerala across ages is higher than the national average while in Bihar it is less than the national average

a) Kerala

8b) Bihar

9
2) In Rajasthan The Males Attending education instituions is higher than the national average while for females it less than the national average. However the situation is reverse in Chandigarh where there are the percentage of females attending education instiuons is higher than the national average and the males

a) Rajasthan

10b) Chandigarh

11
3) When we look at the number of persons attending educational institution across India the north-eastern states lead with Manipur, Nagaland and Sikkim in the top 3.

12

We have heard that Kerala is the most literate state. But  it looks like Manipur, Nagaland, Sikkim actually edge Kerala out. If we look at the State literacy chart for Kerala and Manipur this becomes more clear

a) Kerala

13

b) Manipur

14

It can be seen that in Manipur the number of persons attending educational instition in the age range 13-24 years it is much higher than the national average and much higher than Kerala

4) If we take a look at the District wise literacy for the state of Bihar we see that the literacy is lower in the north eastern districts.,

15

5) Here is another interesting observation I made. The top 3 states which are most ‘literate with no education’ are i) Rajasthan ii) Madhya Pradesh iii) Chhattisgarh

16

While I have included several charts with accompanying explanation, this is largely unnecessary as  most of the charts are self-explanatory.

Do try out the Shiny app and see for yourself the literacy in each state/district/age group educational  level etc –IndiaLiteracy

Feel free to clone/fork my code and make your own enhancements –IndiaLiteracy

You may also like
1.  Natural Language Processing: What would Shakespeare say?
2. Introducing cricketr! : An R package to analyze performances of cricketers
3. Revisiting crimes against women in India
4. Informed choices through Machine Learning : Analyzing Kohli, Tendulkar and Dravid
5. Re-working the Lucy-Richardson Algorithm in OpenCV
6.  What’s up Watson? Using IBM Watson’s QAAPI with Bluemix, NodeExpress – Part 1
7.  Bend it like Bluemix, MongoDB with autoscaling – Part 2
8. TWS-4: Gossip protocol: Epidemics and rumors to the rescue
9. Thinking Web Scale (TWS-3): Map-Reduce – Bring compute to data
10.  Simulating an Edge Shape in Android

Revisiting crimes against women in India

Here I go again, raking the muck about crimes against women in India. My earlier post “A crime map of India in R: Crimes against women in India” garnered a lot of responses from readers. In fact one of the readers even volunteered to create the only choropleth map in that post. The data for this post is taken from http://data.gov.in. You can download the data from the link “Crimes against women in India

I was so impressed by the choropleth map that I decided to do that for all crimes against women.(Wikipedia definition: A choropleth map is a thematic map in which areas are shaded or patterned in proportion to the measurement of the statistical variable being displayed on the map). Personally, I think pictures tell the story better. I am sure you will agree!

So here, I have it a Shiny app which will plot choropleth maps for a chosen crime in a given year.

You can try out my interactive Shiny app at  Crimes against women in India

Checkout out my book  on Amazon available in both  Paperback ($9.99) and a Kindle version($6.99/Rs449/). (see ‘Practical Machine Learning with R and Python – Machine Learning in stereo‘)

The following technique can be used to determine the ‘goodness’ of a hypothesis or how well the hypothesis can fit the data and can also generalize to new examples not in the training set.

In the picture below  are the details of  ‘Rape” in the year 2015.
1

Interestingly the ‘Total Crime against women’ in 2001 shows the Top 5 as
1) Uttar Pradresh 2) Andhra Pradesh 3) Madhya Pradesh 4) Maharashtra 5) Rajasthan

2

But in 2015 West Bengal tops the list, as the real heavy weight in crimes against women. The new pecking order in 2015 for ‘Total Crimes against Women’ is

1) West Bengal 2) Andhra Pradesh 3) Uttar Pradesh  4) Rajasthan 5) Maharashtra

3

Similarly for rapes, West Bengal is nowhere in the top 5 list in 2001. In 2015, it is in second only to the national rape leader Madhya Pradesh.  Also in 2001 West Bengal is not in the top 5 for any of 6 crime heads. But in 2015, West Bengal is in the top 5 of 6 crime heads. The emergence of West Bengal as the leader in Crimes against Women is due to the steep increase in crime rate  over the years.Clearly the law and order situation in West Bengal is heading south.

In Dowry Deaths, UP, Bihar, MP, West Bengal lead the pack, and in that order in 2015.

The usual suspects for most crime categories are West Bengal, UP, MP, AP & Maharashtra.

The state-wise crime charts plot the incidence of the crime (rape, dowry death, assault on women etc) over the years. Data for each state and for each crime was available from 2001-2013. The data for period 2014-2018 are projected using linear regression. The shaded portion in the plots indicate the 95% confidence level in the prediction (i.e in other words we can be 95% certain that the true mean of the crime rate in the projected years will lie within the shaded region)

4

Several  interesting requests came from readers to my earlier post. Some of them were to to plot the crimes as function of population and per capita income of the State/Union Territory to see if the plots  throw up new crime leaders. I have not got the relevant state-wise population distribution data yet. I intend to update this when I get my hands on this data.

I have included the crimes.csv which has been used to generate the visualization. However for the Shiny app I save this as .RData for better performance of the app.

You can clone/download  the code for the Shiny app from GitHub at  crimesAgainWomenIndia

Please checkout my Shiny app : Crimes against women

I also intend to add further interactivity to my visualizations in a future version. Watch this space. I’ll be back!

You may like
1. My book ‘Practical Machine Learning with R and Python’ on Amazon
2. Natural Language Processing: What would Shakespeare say?
3. Introducing cricketr! : An R package to analyze performances of cricketers
4. A peek into literacy in India: Statistical Learning with R
5. Informed choices through Machine Learning : Analyzing Kohli, Tendulkar and Dravid
6. Re-working the Lucy-Richardson Algorithm in OpenCV
7.  What’s up Watson? Using IBM Watson’s QAAPI with Bluemix, NodeExpress – Part 1
8.  Bend it like Bluemix, MongoDB with autoscaling – Part 2
9. TWS-4: Gossip protocol: Epidemics and rumors to the rescue
10. Thinking Web Scale (TWS-3): Map-Reduce – Bring compute to data
11.  Simulating an Edge Shape in Android