Googly: An interactive app for analyzing IPL players, matches and teams using R package yorkr

Presenting ‘Googly’, a cool Shiny app that I developed over the last couple of days. This interactive Shiny app was on my mind for quite some time, and I finally got down to implementing it. The Googly Shiny app is based on my R package ‘yorkr’, which is now available on CRAN. The R package, and hence this Shiny app, are based on data from Cricsheet.

If you are passionate about cricket, and love analyzing cricket performances, then check out my 2 racy books on cricket! In my books, I perform detailed yet compact analysis of the performances of both batsmen and bowlers, besides evaluating team & match performances in Tests, ODIs, T20s & IPL. You can buy my books on cricket from Amazon at $12.99 for the paperback and $4.99/$6.99 respectively for the kindle versions. The books can be accessed at Cricket analytics with cricketr and Beaten by sheer pace-Cricket analytics with yorkr. A must read for any cricket lover! Check it out!!

Googly is based on R package yorkr, and uses the data of all IPL matches from 2008 up to 2016, available on Cricsheet.

Googly can do detailed analyses of a) Individual IPL batsman b) Individual IPL bowler c) Any IPL match d) Head to head confrontation between 2 IPL teams e) All matches of an IPL team against all other teams.

With respect to the individual IPL batsman and bowler performance, I was in a bit of a ‘bind’ literally (pun unintended), as any IPL player could have played in more than 1 IPL team. Fortunately ‘rbind’ came to my rescue. I just get the batsman’s/bowler’s performance in each IPL team, and then consolidate these into a single large dataframe for the analyses.
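The consolidation step can be sketched roughly as follows. The data frames and column names here are made up for illustration and are not the actual per-team batting details that yorkr produces.

```r
# Hypothetical per-team batting records for one player (toy data, not yorkr output)
team_a <- data.frame(team = "Team A", runs = c(34, 12), balls = c(25, 10))
team_b <- data.frame(team = "Team B", runs = c(56, 3, 41), balls = c(38, 5, 30))

# Keep one data frame per team in a list, then stack them row-wise
per_team <- list(team_a, team_b)
career <- do.call(rbind, per_team)

# 'career' now holds every innings, whichever team the player represented
nrow(career)
sum(career$runs)
```

Using do.call(rbind, ...) on a list means the same line works whether the player turned out for one team or five.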

The Shiny app can be accessed at Googly

The code for Googly is available at Github. Feel free to clone/download/fork  the code from Googly

Check out my 2 books on cricket, a) Cricket analytics with cricketr b) Beaten by sheer pace – Cricket analytics with yorkr, now available in both paperback & kindle versions on Amazon!!! Pick up your copies today!

Also see my post GooglyPlus: yorkr analyzes IPL players, teams, matches with plots and tables

Based on the 5 detailed analysis domains there are 5 tabs

IPL Batsman: This tab can be used to perform analysis of any IPL batsman. If a batsman has played in more than 1 team, then the overall performance is considered. There are 10 functions for the IPL Batsman. They are shown below.

  1. Batsman Runs vs. Deliveries
  2. Batsman’s Fours & Sixes
  3. Dismissals of batsman
  4. Batsman’s Runs vs Strike Rate
  5. Batsman’s Moving Average
  6. Batsman’s Cumulative Average Run
  7. Batsman’s Cumulative Strike Rate
  8. Batsman’s Runs against Opposition
  9. Batsman’s Runs at Venue
  10. Predict Runs of batsman

IPL Bowler: This tab can be used to analyze individual IPL bowlers. The functions handle IPL bowlers who have played in more than 1 IPL team.

  1. Mean Economy Rate of bowler
  2. Mean runs conceded by bowler
  3. Bowler’s Moving Average
  4. Bowler’s Cumulative Avg. Wickets
  5. Bowler’s Cumulative Avg. Economy Rate
  6. Bowler’s Wicket Plot
  7. Bowler’s Wickets against opposition
  8. Bowler’s Wickets at Venues
  9. Bowler’s wickets prediction

IPL match: This tab can be used for analyzing individual IPL matches. The available functions are

  1. Batting Partnerships
  2. Batsmen vs Bowlers
  3. Bowling Wicket Kind
  4. Bowling Wicket Runs
  5. Bowling Wicket Match
  6. Bowler vs Batsmen
  7. Match Worm Graph

Head to head : This tab can be used for analyzing head-to-head confrontations, between any 2 IPL teams for e.g. all matches between Chennai Super Kings vs. Deccan Chargers or Kolkata Knight Riders vs. Delhi Daredevils. The available functions are

  1. Team Batsmen Batting Partnerships All Matches
  2. Team Batsmen vs Bowlers all Matches
  3. Team Wickets Opposition All Matches
  4. Team Bowler vs Batsmen All Matches
  5. Team Bowlers Wicket Kind All Matches
  6. Team Bowler Wicket Runs All Matches
  7. Win Loss All Matches

Overall performance : This tab can be used to analyze the overall performance of any IPL team. For this analysis all matches played by this team are considered. The available functions are

  1. Team Batsmen Partnerships Overall
  2. Team Batsmen vs Bowlers Overall
  3. Team Bowler vs Batsmen Overall
  4. Team Bowler Wicket Kind Overall

Below I include a random set of charts that are generated in each of the 5 tabs

A. IPL Batsman
a. A Symonds : Runs vs Deliveries

b. AB de Villiers – Cumulative Strike Rate

c. Gautam Gambhir – Runs at venues

d. CH Gayle – Predict runs

B. IPL Bowler
a. Ashish Nehra – Cumulative Average Wickets

b. DJ Bravo – Moving Average of wickets

c. R Ashwin – Mean Economy rate vs Overs

C. IPL Match
a. Chennai Super Kings vs Deccan Chargers (2008-05-06) – Batsmen Partnerships

Note: You can choose either team in the match from the drop down ‘Choose team’

b. Kolkata Knight Riders vs Delhi Daredevils (2013-04-02) – Bowling wicket runs

c. Mumbai Indians vs Kings XI Punjab (2010-03-30) – Match worm graph

D. Head to head confrontation
a. Rising Pune Supergiants vs Mumbai Indians in all matches – Team batsmen partnerships

Note: You can choose the partnership of either team in the drop down ‘Choose team’

b. Gujarat Lions – Royal Challengers Bangalore all matches – Bowlers performance against batsmen

E. Overall Performance
a. Royal Challengers Bangalore overall performance – Batsman Partnership (Rank=1)
This is Virat Kohli for RCB. Try out other ranks

b. Rajasthan Royals overall Performance – Bowler vs batsman (Rank=2)
This is Vinay Kumar.

The Shiny app Googly can be accessed at Googly. Feel free to clone/fork the code from Github at Googly

For details on my R package yorkr, please see my blog Giga thoughts. There are more than 15 posts detailing the functions and their usage.

Do bowl a Googly!!!

Also see my other posts

  1. Introducing QCSimulator: A 5-qubit quantum computing simulator in R
  2. Deblurring with OpenCV: Weiner filter reloaded
  3. Rock N’ Roll with Bluemix, Cloudant & NodeExpress
  4. Introducing cricket package yorkr: Part 1- Beaten by sheer pace!
  5. Fun simulation of a Chain in Android
  6. Beaten by sheer pace! Cricket analytics with yorkr in paperback and Kindle versions
  7. Introducing cricketr! : An R package to analyze performances of cricketers
  8. Cricket analytics with cricketr!!!

For more posts see Index of posts

Analyzing World Bank data with WDI, googleVis Motion Charts

Recently I was surfing the web, when I came across a really cool post, New R package to access World Bank data, by Markus Gesmann on using googleVis and motion charts with World Bank data. The post also introduced me to Hans Rosling, Professor at Sweden’s Karolinska Institute. Hans Rosling, the creator of the famous Gapminder chart, the “Health and Wealth of Nations”, displays global trends through animated charts (a must see!!!). As they say, in Hans Rosling’s hands, data dances and sings. Take a look at some of his TED talks, e.g. Hans Rosling: New insights on poverty. Prof Rosling developed the breakthrough software behind the visualizations at Gapminder. The free software, which can be loaded with any data, was purchased by Google in March 2007.

In this post, I recreate some of the Gapminder charts with the help of the R packages WDI and googleVis. The WDI package, by Vincent Arel-Bundock, provides a set of really useful functions to get data based on the World Bank Data indicators. googleVis provides motion charts with which you can animate the data. Incidentally, Datacamp has a very nice, short course on googleVis, “Having fun with googleVis”.

See an updated version of this post Revisiting World Bank data analysis with WDI and gVisMotionChart

You can clone/download the code from Github at worldBankAnalysis which is in the form of an Rmd file.

library(WDI)
library(ggplot2)
library(googleVis)
library(plyr)

1.Get the data from 1960 to 2016 for the following

  1. Population – SP.POP.TOTL
  2. GDP in US $ – NY.GDP.MKTP.CD
  3. Life Expectancy at birth (Years) – SP.DYN.LE00.IN
  4. GDP Per capita income – NY.GDP.PCAP.PP.CD
  5. Fertility rate (Births per woman) – SP.DYN.TFRT.IN
  6. Poverty headcount ratio – SI.POV.2DAY
# World population total
population = WDI(indicator='SP.POP.TOTL', country="all",start=1960, end=2016)
# GDP in US $
gdp= WDI(indicator='NY.GDP.MKTP.CD', country="all",start=1960, end=2016)
# Life expectancy at birth (Years)
lifeExpectancy= WDI(indicator='SP.DYN.LE00.IN', country="all",start=1960, end=2016)
# GDP Per capita
income = WDI(indicator='NY.GDP.PCAP.PP.CD', country="all",start=1960, end=2016)
# Fertility rate (births per woman)
fertility = WDI(indicator='SP.DYN.TFRT.IN', country="all",start=1960, end=2016)
# Poverty head count
poverty= WDI(indicator='SI.POV.2DAY', country="all",start=1960, end=2016)

2.Rename the columns

names(population)[3]="Total population"
names(lifeExpectancy)[3]="Life Expectancy (Years)"
names(gdp)[3]="GDP (US$)"
names(income)[3]="GDP per capita income"
names(fertility)[3]="Fertility (Births per woman)"
names(poverty)[3]="Poverty headcount ratio"
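Renaming by position assumes that the indicator is the third column of what WDI() returns. As a hedged alternative, the column can be renamed by its indicator code, which does not depend on column order. The data frame below is a toy stand-in with made-up values, not an actual WDI download.

```r
# Toy stand-in for a WDI() result; the real column order may differ by package version
population <- data.frame(iso2c = c("IN", "CN"),
                         country = c("India", "China"),
                         SP.POP.TOTL = c(1.3e9, 1.4e9),
                         year = c(2016, 2016))

# Rename the indicator column by its code instead of by position
names(population)[names(population) == "SP.POP.TOTL"] <- "Total population"
names(population)
```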

3.Join the data frames

Join the individual data frames to one large wide data frame with all the indicators for the countries


j1 <- join(population, gdp)
j2 <- join(j1,lifeExpectancy)
j3 <- join(j2,income)
j4 <- join(j3,poverty)
wbData <- join(j4,fertility)
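The chain of joins can also be written as a single fold over a list of data frames. The sketch below uses base R's merge() on toy tables with made-up values rather than plyr's join(), and the helper name left_join_all is my own for illustration.

```r
# Toy indicator tables sharing the key columns 'country' and 'year' (made-up values)
population <- data.frame(country = c("India", "China"), year = 2016, pop = c(1.3, 1.4))
gdp        <- data.frame(country = c("India", "China"), year = 2016, gdp = c(2.3, 11.2))
lifeExp    <- data.frame(country = "India", year = 2016, life = 68.9)

# Left-join each table onto the running result, keyed on country and year
left_join_all <- function(frames, keys) {
  Reduce(function(x, y) merge(x, y, by = keys, all.x = TRUE), frames)
}

wide <- left_join_all(list(population, gdp, lifeExp), keys = c("country", "year"))
wide
```

With all.x = TRUE a country that is missing from a later table is kept, with NA for that indicator.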

4.Use WDI_data

Use WDI_data to get the list of indicators and the countries. Join the countries and region

# WDI_data is a list with 2 elements
wdi_data = WDI_data
# The 1st element is the set of all World Bank indicators
indicators = wdi_data[[1]]
# The 2nd element gives the set of countries and regions
countries = wdi_data[[2]]
df = as.data.frame(countries)
aa <- df$region != "Aggregates"
# Remove the aggregates
countries_df <- df[aa,]
# Subset from the development data only those corresponding to the countries
bb = subset(wbData, country %in% countries_df$country)
cc = join(bb,countries_df)
dd = complete.cases(cc)
developmentDF = cc[dd,]
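To make the filtering logic easier to follow, here is a small self-contained sketch of the same three steps (drop aggregates, attach the region, keep complete rows). It uses toy data frames with made-up values in place of WDI_data and wbData.

```r
# Toy country metadata in the spirit of WDI_data (values are illustrative)
countries_df <- data.frame(country = c("India", "World", "China"),
                           region  = c("South Asia", "Aggregates", "East Asia & Pacific"))

# Toy indicator data, including an aggregate row and a missing value
wbData <- data.frame(country = c("India", "World", "China"),
                     year    = 2014,
                     lifeExp = c(68, 71, NA))

# Keep real countries only, attach the region, then drop incomplete rows
real       <- countries_df[countries_df$region != "Aggregates", ]
withRegion <- merge(wbData[wbData$country %in% real$country, ], real, by = "country")
complete   <- withRegion[complete.cases(withRegion), ]
complete
```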

5.Create and display the motion chart

# xvar, yvar and sizevar use the column names set in step 2
gg <- gvisMotionChart(cc,
                      idvar = "country",
                      timevar = "year",
                      xvar = "GDP (US$)",
                      yvar = "Life Expectancy (Years)",
                      sizevar = "Total population",
                      colorvar = "region")
plot(gg)
cat(gg$html$chart, file="chart1.html")

Note: Unfortunately it is not possible to embed the motion chart in WordPress. It has to be hosted on a server as a web page. After exploring several possibilities I came up with the following process to display the animated chart. The plot is saved as an HTML file using ‘cat’ as shown above. The chart1.html page is then hosted as a Github page (gh-page) on Github.

Here is the gvisMotionChart

Do give  World Bank Motion Chart1  a spin.  Here is how the Motion Chart has to be used

You can select Life Expectancy, Population, Fertility etc. by clicking the black arrows. The blue arrow shows the ‘play’ button that animates the motion chart. You can also select the countries and change the size of the circles. Do give it a try. Here are some quick analyses from playing around with the motion charts with different parameters chosen.

The set of charts below are screenshots captured by running the motion chart World Bank Motion Chart1

a. Life Expectancy vs Fertility chart

This chart is used by Hans Rosling in his TED talk. The left chart shows low life expectancy and high fertility rate for several sub-Saharan and East Asia Pacific countries in the early 1960’s. Today the fertility has dropped and the life expectancy has increased overall. However, the sub-Saharan countries still have a high fertility rate.

pic1

b. Population vs GDP

The chart below shows that India and China have about the same GDP from 1973-1994, with the US and Japan well ahead.

pic2

From 1998-2014 China really pulls away from India and Japan, as seen below.

pic3

c. Per capita income vs Life Expectancy

In the 1990’s the per capita income and life expectancy (42-50 years) of the sub-Saharan countries are low. Japan and the US have a good life expectancy in the 1990’s. In 2014 the per capita income of the sub-Saharan countries is still low, though the life expectancy has marginally improved.

pic4

d. Population vs Poverty headcount

pic5

In the early 1990’s China had a higher poverty head count ratio than India. By 2004 China had this all figured out and the poverty head count ratio drops significantly. This can also be seen in the chart below.

pop_pov3

In the chart above China shows a drastic reduction in poverty headcount ratio vs India. Strangely Zambia shows an increase in the poverty head count ratio.

6.Get the data for the 2nd set of indicators

  1. Total population  – SP.POP.TOTL
  2. GDP in US$ – NY.GDP.MKTP.CD
  3. Access to electricity (% population) – EG.ELC.ACCS.ZS
  4. Electricity consumption KWh per capita -EG.USE.ELEC.KH.PC
  5. CO2 emissions -EN.ATM.CO2E.KT
  6. Sanitation Access – SH.STA.ACSN
# World population
population = WDI(indicator='SP.POP.TOTL', country="all",start=1960, end=2016)
# GDP in US $
gdp= WDI(indicator='NY.GDP.MKTP.CD', country="all",start=1960, end=2016)
# Access to electricity (% population)
elecAccess= WDI(indicator='EG.ELC.ACCS.ZS', country="all",start=1960, end=2016)
# Electric power consumption Kwh per capita
elecConsumption= WDI(indicator='EG.USE.ELEC.KH.PC', country="all",start=1960, end=2016)
#CO2 emissions
co2Emissions= WDI(indicator='EN.ATM.CO2E.KT', country="all",start=1960, end=2016)
# Access to sanitation (% population)
sanitationAccess= WDI(indicator='SH.STA.ACSN', country="all",start=1960, end=2016)

7.Rename the columns

names(population)[3]="Total population"
names(gdp)[3]="GDP US($)"
names(elecAccess)[3]="Access to Electricity (% popn)"
names(elecConsumption)[3]="Electric power consumption (KWH per capita)"
names(co2Emissions)[3]="CO2 emissions"
names(sanitationAccess)[3]="Access to sanitation(% popn)"

8.Join the individual data frames

Join the individual data frames to one large wide data frame with all the indicators for the countries


j1 <- join(population, gdp)
j2 <- join(j1,elecAccess)
j3 <- join(j2,elecConsumption)
j4 <- join(j3,co2Emissions)
# Join onto j4 (not j3) so that the CO2 emissions column is kept
wbData1 <- join(j4,sanitationAccess)

9.Use WDI_data

Use WDI_data to get the list of indicators and the countries. Join the countries and region

# WDI_data is a list with 2 elements
wdi_data = WDI_data
# The 1st element is the set of all World Bank indicators
indicators = wdi_data[[1]]
# The 2nd element gives the set of countries and regions
countries = wdi_data[[2]]
df = as.data.frame(countries)
aa <- df$region != "Aggregates"
# Remove the aggregates
countries_df <- df[aa,]
# Subset from the development data only those corresponding to the countries
ee = subset(wbData1, country %in% countries_df$country)
ff = join(ee,countries_df)
## Joining by: iso2c, country

10.Create and display the motion chart

# xvar, yvar and sizevar use the column names set in step 7
gg1 <- gvisMotionChart(ff,
                       idvar = "country",
                       timevar = "year",
                       xvar = "GDP US($)",
                       yvar = "Access to Electricity (% popn)",
                       sizevar = "Total population",
                       colorvar = "region")
plot(gg1)
cat(gg1$html$chart, file="chart2.html")

This is World Bank Motion Chart2  which has a different set of parameters like Access to Energy, CO2 emissions etc

The set of charts below are screenshots of the motion chart World Bank Motion Chart 2

a. Access to Electricity vs Population
pic6

The above chart shows that in China 100% of the population has access to electricity. India has made decent progress from 50% in 1990 to 79% in 2012. However, Pakistan seems to have done much better in providing access to electricity, moving from 59% to close to 98%.

b. Power consumption vs population

powercon

The above chart shows the Power consumption vs Population. China and India have proportionally much lower consumption than Norway, the US and Canada.

c. CO2 emissions vs Population

pic7

In 1963 the CO2 emissions were fairly low and about comparable for all countries. US, India have shown a steady increase while China shows a steep increase. Interestingly UK shows a drop in CO2 emissions

d.  Access to sanitation
san

India shows an improvement, but it has a long way to go with only 40% of the population having access to sanitation. China has made much better strides, with 80% having access to sanitation in 2015. Strangely, Nigeria shows a drop in access to sanitation of almost 20% of the population.

The code is available on Github at worldBankAnalysis

Conclusion: So there you have it. I have shown some screenshots of some sample parameters of the World Bank indicators. Please try to play around with World Bank Motion Chart1 & World Bank Motion Chart 2 with your own set of parameters and countries. You can also create your own motion chart from the 100s of WDI indicators available at World Bank Data indicator.

Finally, I  would really like to thank Prof Hans Rosling, googleVis and  WDI (Vincent  Arel-Bundock) for making this visualization possible!

Also see
1.  Introducing QCSimulator: A 5-qubit quantum computing simulator in R
2. Dabbling with Wiener filter using OpenCV
3. Designing a Social Web Portal
4. Design Principles of Scalable, Distributed Systems
5. Re-introducing cricketr! : An R package to analyze performances of cricketers
6. Natural language processing: What would Shakespeare say?

To see all posts Index of posts

cricketr sizes up legendary All-rounders of yesteryear

Introduction

This is a post I have been wanting to write for several months, but had to put it off for one reason or another. In this post I use my R package cricketr to analyze the performance of All-rounder greats namely Kapil Dev, Ian Botham, Imran Khan and Richard Hadlee. All these players had talent that was natural and raw. They were good strikers of the ball and extremely lethal with their bowling. The ODI data for these players have been taken from ESPN Cricinfo.

Please be mindful of the ESPN Cricinfo Terms of Use

If you are passionate about cricket, and love analyzing cricket performances, then check out my racy book on cricket ‘Cricket analytics with cricketr and cricpy – Analytics harmony with R & Python’! This book discusses and shows how to use my R package ‘cricketr’ and my Python package ‘cricpy’ to analyze batsmen and bowlers in all formats of the game (Test, ODI and T20). The paperback is available on Amazon at $21.99 and  the kindle version at $9.99/Rs 449/-. A must read for any cricket lover! Check it out!!

You can download the latest PDF version of the book  at  ‘Cricket analytics with cricketr and cricpy: Analytics harmony with R and Python-6th edition

You can also read this post at Rpubs as cricketr-AR. Download this report as a PDF file from cricketr-AR

Important note 1: The latest release of ‘cricketr’ now includes the ability to analyze performances of teams!! See Cricketr adds team analytics to its repertoire!!!

Important note 2 : Cricketr can now do a more fine-grained analysis of players, see Cricketr learns new tricks : Performs fine-grained analysis of players

Important note 3: Do check out the python avatar of cricketr, ‘cricpy’, in my post ‘Introducing cricpy: A python package to analyze performances of cricketers’

Note: If you would like to do a similar analysis for a different set of batsmen and bowlers, you can clone/download my skeleton cricketr template from Github (which is the R Markdown file I have used for the analysis below). You will only need to make appropriate changes for the players you are interested in. Only a familiarity with R and R Markdown is needed.

Important note: Do check out my other posts using cricketr at cricketr-posts

All Rounders

  1. Kapil Dev (Ind)
  2. Ian Botham (Eng)
  3. Imran Khan (Pak)
  4. Richard Hadlee (NZ)

I have sprinkled the plots with a few of my comments. Feel free to draw your conclusions! The analysis is included below

# Install cricketr from CRAN if it is not already available
if (!require("cricketr")) {
    install.packages("cricketr")
}

library(cricketr)

The data for any particular ODI player can be obtained with the getPlayerDataOD() function. To do this, you will need to go to ESPN CricInfo Player and type in the name of the player, e.g. Kapil Dev. This will bring up a page which has the profile number for the player, e.g. for Kapil Dev this would be http://www.espncricinfo.com/india/content/player/30028.html. Hence, Kapil’s profile is 30028. This can be used to get the data for Kapil Dev as shown below. I have already executed the 4 commands below and I will use the resulting files to run further commands.

#kapil1 
#botham11 
#imran1 
#hadlee1 
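As a rough sketch, a call along these lines should fetch Kapil Dev’s ODI batting data into kapil1.csv. The wrapper function name is hypothetical, and the argument values other than the profile number 30028 are assumptions chosen to match the file names used in the rest of this post.

```r
# Sketch only: this downloads from ESPN Cricinfo, so it needs cricketr and network access
fetch_kapil_batting <- function() {
  if (!requireNamespace("cricketr", quietly = TRUE)) {
    stop("Please install the cricketr package first")
  }
  # 30028 is Kapil Dev's profile number; dir, file and type are assumed values
  cricketr::getPlayerDataOD(profile = 30028, dir = ".", file = "kapil1.csv",
                            type = "batting")
}

# Uncomment to run the download
# kapil1 <- fetch_kapil_batting()
```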

Analyses of batting performances of the All Rounders

The following plots give the analysis of the 4 ODI batsmen

  1. Kapil Dev (Ind) – Innings – 225, Runs = 3783, Average=23.79, Strike Rate= 95.07
  2. Ian Botham (Eng) – Innings – 116, Runs= 2113, Average=23.21, Strike Rate= 79.10
  3. Imran Khan (Pak) – Innings – 175, Runs= 3709, Average=33.41, Strike Rate= 72.65
  4. Richard Hadlee (NZ) – Innings – 115, Runs= 1751, Average=21.61, Strike Rate= 75.50

Plot of 4s, 6s and the scoring rate in ODIs

The 3 charts below give the number of

  1. 4s vs Runs scored
  2. 6s vs Runs scored
  3. Balls faced vs Runs scored

A regression line is fitted in each of these plots for each of the ODI batsmen
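For readers who want to see what fitting such a line involves, here is a minimal base R sketch on made-up innings data. It is not the code inside cricketr, just an ordinary least squares fit of runs against balls faced.

```r
# Toy innings data: balls faced and runs scored (made-up numbers)
innings <- data.frame(BF = c(10, 20, 30, 40, 50), Runs = c(8, 18, 25, 36, 43))

# Fit a straight line Runs ~ BF; abline(fit) would overlay it on a scatter plot
fit <- lm(Runs ~ BF, data = innings)
coef(fit)

# Expected runs after 50 deliveries according to the fitted line
runsAt50 <- predict(fit, newdata = data.frame(BF = 50))
runsAt50
```

Statements like "after facing 50 deliveries he scores around 43" are read off the fitted line in exactly this way.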

A. Kapil Dev
It can be seen that Kapil scores four 4’s when he scores 50. Also after facing 50 deliveries he scores around 43

par(mfrow=c(1,3))
par(mar=c(4,4,2,2))
batsman4s("./kapil1.csv","Kapil")
batsman6s("./kapil1.csv","Kapil")
batsmanScoringRateODTT("./kapil1.csv","Kapil")

kapil-4s6ssr-1

dev.off()
## null device 
##           1

B. Ian Botham
Botham scores around 39 runs after 50 deliveries

par(mfrow=c(1,3))
par(mar=c(4,4,2,2))
batsman4s("./botham1.csv","Botham")
batsman6s("./botham1.csv","Botham")
batsmanScoringRateODTT("./botham1.csv","Botham")

botham-4s6sr-1

dev.off()
## null device 
##           1

C. Imran Khan
Imran scores around 36 runs for 50 deliveries

par(mfrow=c(1,3))
par(mar=c(4,4,2,2))
batsman4s("./imran1.csv","Imran")
batsman6s("./imran1.csv","Imran")
batsmanScoringRateODTT("./imran1.csv","Imran")

imran-4s6ssr-1

dev.off()
## null device 
##           1

D. Richard Hadlee
Hadlee also scores around 30 runs facing 50 deliveries

par(mfrow=c(1,3))
par(mar=c(4,4,2,2))
batsman4s("./hadlee1.csv","Hadlee")
batsman6s("./hadlee1.csv","Hadlee")
batsmanScoringRateODTT("./hadlee1.csv","Hadlee")

hadlee-4s6sout-1

dev.off()
## null device 
##           1

Cumulative Average runs of batsman in career

Kapil’s cumulative average runs drops towards the last 15 innings, whereas Botham had a good run towards the end of his career. Imran’s performance as a batsman really peaks towards the end with a cumulative average of almost 25 runs. Hadlee has a steady performance.

par(mfrow=c(2,2))
par(mar=c(4,4,2,2))
batsmanCumulativeAverageRuns("./kapil1.csv","Kapil")

kbih-car-1

batsmanCumulativeAverageRuns("./botham1.csv","Botham")

kbih-car-2

batsmanCumulativeAverageRuns("./imran1.csv","Imran")

kbih-car-3

batsmanCumulativeAverageRuns("./hadlee1.csv","Hadlee")

kbih-car-4

dev.off()
## null device 
##           1
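The idea behind a cumulative average can be sketched in a couple of lines of base R. The numbers are made up, and cricketr may treat details such as not-outs differently.

```r
# Toy sequence of runs scored in successive innings (made-up numbers)
runs <- c(10, 30, 20, 0, 40)

# Cumulative average after each innings: running total / innings played so far
cumAvg <- cumsum(runs) / seq_along(runs)
cumAvg
```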

Cumulative Average strike rate of batsman in career

Kapil’s strike rate is superlative, touching the 90’s steadily. Botham’s strike rate drops dramatically towards the latter part of his career. Imran averages a steady 75 and Hadlee averages around 85.

par(mfrow=c(2,2))
par(mar=c(4,4,2,2))
batsmanCumulativeStrikeRate("./kapil1.csv","Kapil")

kbih-casr-1

batsmanCumulativeStrikeRate("./botham1.csv","Botham")

kbih-casr-2

batsmanCumulativeStrikeRate("./imran1.csv","Imran")

kbih-casr-3

batsmanCumulativeStrikeRate("./hadlee1.csv","Hadlee")

kbih-casr-4

dev.off()
## null device 
##           1
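Strike rate is runs scored per 100 balls faced, so a cumulative strike rate can be sketched as below. The data is made up and this is only an illustration of the formula, not the package’s internal code.

```r
# Toy runs and balls faced in successive innings (made-up numbers)
runs  <- c(10, 30, 20)
balls <- c(20, 30, 25)

# Strike rate = 100 * runs / balls faced, accumulated innings by innings
cumSR <- 100 * cumsum(runs) / cumsum(balls)
cumSR
```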

Relative Mean Strike Rate

Kapil tops the strike rate among all the all-rounders. This is really a revelation to me. This can also be seen in the original data, where Kapil’s strike rate is at a whopping 95.07 in comparison to Botham, Imran and Hadlee, who are at 79.10, 72.65 and 75.50 respectively.

par(mar=c(4,4,2,2))
frames <- list("./kapil1.csv","./botham1.csv","imran1.csv","hadlee1.csv")
names <- list("Kapil","Botham","Imran","Hadlee")
relativeBatsmanSRODTT(frames,names)

plot-1-1

Relative Runs Frequency Percentage

This plot shows that Imran has a much better average of runs scored than the other all-rounders, followed by Kapil.

frames <- list("./kapil1.csv","./botham1.csv","imran1.csv","hadlee1.csv")
names <- list("Kapil","Botham","Imran","Hadlee")
relativeRunsFreqPerfODTT(frames,names)

plot-2-1

Relative cumulative average runs in career

It can be seen clearly that Imran Khan leads the pack in cumulative average runs followed by Kapil Dev and then Botham

frames <- list("./kapil1.csv","./botham1.csv","imran1.csv","hadlee1.csv")
names <- list("Kapil","Botham","Imran","Hadlee")
relativeBatsmanCumulativeAvgRuns(frames,names)

kbih-relcar-1

Relative cumulative average strike rate in career

In the cumulative strike rate Hadlee and Kapil run a close race.

frames <- list("./kapil1.csv","./botham1.csv","imran1.csv","hadlee1.csv")
names <- list("Kapil","Botham","Imran","Hadlee")
relativeBatsmanCumulativeStrikeRate(frames,names)

kbih-relcsr-1

Percent 4’s,6’s in total runs scored

The plot below shows the contribution of 4s, 6s and runs from 1s, 2s and 3s to the total runs scored by each batsman.

frames <- list("./kapil1.csv","./botham1.csv","imran1.csv","hadlee1.csv")
names <- list("Kapil","Botham","Imran","Hadlee")
runs4s6s <-batsman4s6s(frames,names)

plot-46s-1

print(runs4s6s)
##                Kapil Botham Imran Hadlee
## Runs(1s,2s,3s) 72.08  66.53 77.53  73.27
## 4s             21.98  25.78 17.61  21.08
## 6s              5.94   7.68  4.86   5.65
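The percentages in a table like this come from simple arithmetic on the career totals. Here is a sketch with made-up totals for a single batsman; the variable names are my own.

```r
# Toy career totals for one batsman (made-up numbers)
totalRuns <- 1000
fours <- 60   # number of 4s hit
sixes <- 10   # number of 6s hit

runsFrom4s <- 4 * fours
runsFrom6s <- 6 * sixes

# Percentage of total runs from each source
pct <- c("Runs(1s,2s,3s)" = totalRuns - runsFrom4s - runsFrom6s,
         "4s" = runsFrom4s,
         "6s" = runsFrom6s) / totalRuns * 100
round(pct, 2)
```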

Runs forecast

The forecast for the batsman is shown below.

par(mfrow=c(2,2))
par(mar=c(4,4,2,2))
batsmanPerfForecast("./kapil1.csv","Kapil")
batsmanPerfForecast("./botham1.csv","Botham")
batsmanPerfForecast("./imran1.csv","Imran")
batsmanPerfForecast("./hadlee1.csv","Hadlee")

plot-fcst-1

dev.off()
## null device 
##           1

3D plot of Runs vs Balls Faced and Minutes at Crease

The plot is a scatter plot of Runs vs Balls faced and Minutes at Crease. A prediction plane is fitted

par(mfrow=c(1,2))
par(mar=c(4,4,2,2))
battingPerf3d("./kapil1.csv","Kapil")
battingPerf3d("./botham1.csv","Botham")

plot-3-1

dev.off()
## null device 
##           1
par(mfrow=c(1,2))
par(mar=c(4,4,2,2))
battingPerf3d("./imran1.csv","Imran")
battingPerf3d("./hadlee1.csv","Hadlee")

plot-4-1

dev.off()
## null device 
##           1

Predicting Runs given Balls Faced and Minutes at Crease

A multi-variate regression plane is fitted between Runs and Balls faced +Minutes at crease.

BF <- seq( 10, 200,length=10)
Mins <- seq(30,220,length=10)
newDF <- data.frame(BF,Mins)

kapil <- batsmanRunsPredict("./kapil1.csv","Kapil",newdataframe=newDF)
botham <- batsmanRunsPredict("./botham1.csv","Botham",newdataframe=newDF)
imran <- batsmanRunsPredict("./imran1.csv","Imran",newdataframe=newDF)
hadlee <- batsmanRunsPredict("./hadlee1.csv","Hadlee",newdataframe=newDF)

The fitted model is then used to predict the runs that the batsmen will score for hypothetical values of Balls faced and Minutes at crease. It can be seen that Kapil is the best bet for a given number of balls faced and minutes at crease, followed by Botham.

batsmen <-cbind(round(kapil$Runs),round(botham$Runs),round(imran$Runs),round(hadlee$Runs))
colnames(batsmen) <- c("Kapil","Botham","Imran","Hadlee")
newDF <- data.frame(round(newDF$BF),round(newDF$Mins))
colnames(newDF) <- c("BallsFaced","MinsAtCrease")
predictedRuns <- cbind(newDF,batsmen)
predictedRuns
##    BallsFaced MinsAtCrease Kapil Botham Imran Hadlee
## 1          10           30    16      6    10     15
## 2          31           51    33     22    22     28
## 3          52           72    49     38    33     42
## 4          73           93    65     54    45     56
## 5          94          114    81     70    56     70
## 6         116          136    97     86    67     84
## 7         137          157   113    102    79     97
## 8         158          178   130    117    90    111
## 9         179          199   146    133   102    125
## 10        200          220   162    149   113    139
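A prediction of this kind can be sketched with base R’s lm() and predict(). The innings data below is made up, and this is only an illustration of fitting a plane and predicting from it, not the internals of batsmanRunsPredict().

```r
# Toy innings: balls faced, minutes at crease and runs (made-up numbers)
innings <- data.frame(BF   = c(10, 25, 40, 60, 80, 100),
                      Mins = c(15, 30, 55, 70, 100, 130),
                      Runs = c(8, 20, 31, 50, 62, 85))

# Fit the plane Runs ~ BF + Mins
fit <- lm(Runs ~ BF + Mins, data = innings)

# Predict runs for hypothetical combinations of balls faced and minutes
newDF <- data.frame(BF   = seq(10, 200, length.out = 10),
                    Mins = seq(30, 220, length.out = 10))
predicted <- round(predict(fit, newdata = newDF))
cbind(newDF, Runs = predicted)
```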

Highest runs likelihood

The plots below show the runs likelihood of each batsman. This uses K-Means.

A. Kapil Dev

batsmanRunsLikelihood("./kapil1.csv","Kapil")

kapil11-1

## Summary of  Kapil 's runs scoring likelihood
## **************************************************
## 
## There is a 34.57 % likelihood that Kapil  will make  22 Runs in  24 balls over 34  Minutes 
## There is a 17.28 % likelihood that Kapil  will make  46 Runs in  46 balls over  65  Minutes 
## There is a 48.15 % likelihood that Kapil  will make  5 Runs in  7 balls over 9  Minutes

B. Ian Botham

batsmanRunsLikelihood("./botham1.csv","Botham")

devilliers-1

## Summary of  Botham 's runs scoring likelihood
## **************************************************
## 
## There is a 47.95 % likelihood that Botham  will make  9 Runs in  12 balls over 15  Minutes 
## There is a 39.73 % likelihood that Botham  will make  23 Runs in  32 balls over  44  Minutes 
## There is a 12.33 % likelihood that Botham  will make  59 Runs in  74 balls over 101  Minutes

C. Imran Khan

batsmanRunsLikelihood("./imran1.csv","Imran")

gaylecache-true-1

## Summary of  Imran 's runs scoring likelihood
## **************************************************
## 
## There is a 23.33 % likelihood that Imran  will make  36 Runs in  54 balls over 74  Minutes 
## There is a 60 % likelihood that Imran  will make  14 Runs in  18 balls over  23  Minutes 
## There is a 16.67 % likelihood that Imran  will make  53 Runs in  90 balls over 115  Minutes

D. Richard Hadlee

batsmanRunsLikelihood("./hadlee1.csv","Hadlee")

maxwell-1

## Summary of  Hadlee 's runs scoring likelihood
## **************************************************
## 
## There is a 6.1 % likelihood that Hadlee  will make  64 Runs in  79 balls over 90  Minutes 
## There is a 42.68 % likelihood that Hadlee  will make  25 Runs in  33 balls over  44  Minutes 
## There is a 51.22 % likelihood that Hadlee  will make  9 Runs in  11 balls over 15  Minutes
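To give a feel for how K-Means can lead to likelihood statements like these, here is a hedged sketch on made-up innings. Each cluster centre is read as a typical outcome, and the share of innings in that cluster as its likelihood. The details of batsmanRunsLikelihood() may well differ.

```r
set.seed(42)  # k-means starts from random centres

# Toy innings in three loose groups (made-up numbers)
innings <- data.frame(Runs = c(4, 6, 5, 7, 22, 25, 20, 48, 52),
                      BF   = c(6, 8, 7, 9, 25, 28, 24, 47, 55),
                      Mins = c(8, 10, 9, 12, 33, 36, 30, 64, 70))

# Cluster into 3 groups; each centre is a typical (Runs, BF, Mins) outcome
fit <- kmeans(innings, centers = 3, nstart = 10)

# Share of innings in each cluster, read as the likelihood of that outcome
likelihood <- round(100 * fit$size / nrow(innings), 2)
cbind(round(fit$centers), likelihood)
```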

Average runs at ground and against opposition

A. Kapil Dev

par(mfrow=c(1,2))
par(mar=c(4,4,2,2))
batsmanAvgRunsGround("./kapil1.csv","Kapil")
batsmanAvgRunsOpposition("./kapil1.csv","Kapil")

avgrg-1-1

dev.off()
## null device 
##           1

B. Ian Botham

par(mfrow=c(1,2))
par(mar=c(4,4,2,2))
batsmanAvgRunsGround("./botham1.csv","Botham")
batsmanAvgRunsOpposition("./botham1.csv","Botham")

avgrg-2-1

dev.off()
## null device 
##           1

C. Imran Khan

par(mfrow=c(1,2))
par(mar=c(4,4,2,2))
batsmanAvgRunsGround("./imran1.csv","Imran")
batsmanAvgRunsOpposition("./imran1.csv","Imran")

avgrg-3-1

dev.off()
## null device 
##           1

D. Richard Hadlee

par(mfrow=c(1,2))
par(mar=c(4,4,2,2))
batsmanAvgRunsGround("./hadlee1.csv","Hadlee")
batsmanAvgRunsOpposition("./hadlee1.csv","Hadlee")

avgrg-4-1

dev.off()
## null device 
##           1

Moving Average of runs over career

The moving average for the 4 batsmen indicate the following

Kapil’s performance drops significantly while there is a slump in Botham’s performance. On the other hand Imran and Hadlee’s performance were on the upswing.

par(mfrow=c(2,2))
par(mar=c(4,4,2,2))
batsmanMovingAverage("./kapil1.csv","Kapil")
batsmanMovingAverage("./botham1.csv","Botham")
batsmanMovingAverage("./imran1.csv","Imran")
batsmanMovingAverage("./hadlee1.csv","Hadlee")

sdgm-ma-1

dev.off()
## null device 
##           1
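For intuition, a plain trailing moving average can be computed as below. This is a simple k-innings rolling mean on made-up numbers, and not necessarily the smoother that batsmanMovingAverage() uses.

```r
# Toy runs in successive innings (made-up numbers)
runs <- c(10, 30, 20, 0, 40, 50)
k <- 3

# Equal weights with sides = 1 give the mean of the current and previous k-1 innings
movAvg <- as.numeric(stats::filter(runs, rep(1 / k, k), sides = 1))
movAvg
```

The first k-1 values are NA because there are not yet enough innings to average over.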

Check batsmen in-form, out-of-form

[1] “**************************** Form status of Kapil ****************************\n\n
Population size: 72
Mean of population: 19.38 \n
Sample size: 9 Mean of sample: 6.78 SD of sample: 6.14 \n\n
Null hypothesis H0 : Kapil ‘s sample average is within 95% confidence interval of population average\n
Alternative hypothesis Ha : Kapil ‘s sample average is below the 95% confidence interval of population average\n\n
Kapil ‘s Form Status: Out-of-Form because the p value: 8.4e-05 is less than alpha= 0.05

“**************************** Form status of Botham ****************************\n\n
Population size: 65
Mean of population: 21.29 \n
Sample size: 8 Mean of sample: 15.38 SD of sample: 13.19 \n\n
Null hypothesis H0 : Botham ‘s sample average is within 95% confidence interval of population average\n
Alternative hypothesis Ha : Botham ‘s sample average is below the 95% confidence interval of population average\n\n
Botham ‘s Form Status: In-Form because the p value: 0.120342 is greater than alpha= 0.05 \n

“**************************** Form status of Imran ****************************\n\n
Population size: 54
Mean of population: 24.94 \n
Sample size: 6 Mean of sample: 30.83 SD of sample: 25.4 \n\n
Null hypothesis H0 : Imran ‘s sample average is within 95% confidence interval of population average\n
Alternative hypothesis Ha : Imran ‘s sample average is below the 95% confidence interval of population average\n\n
Imran ‘s Form Status: In-Form because the p value: 0.704683 is greater than alpha= 0.05 \n

“**************************** Form status of Hadlee ****************************\n\n
Population size: 73
Mean of population: 18 \n
Sample size: 9 Mean of sample: 27 SD of sample: 24.27 \n\n
Null hypothesis H0 : Hadlee ‘s sample average is within 95% confidence interval of population average\n
Alternative hypothesis Ha : Hadlee ‘s sample average is below the 95% confidence interval of population average\n\n
Hadlee ‘s Form Status: In-Form because the p value: 0.85262 is greater than alpha= 0.05 \n *******************************************************************************************\n\n”
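
The form check above can be read as a one-sided test of the mean of the most recent innings against the career mean. The following is a minimal sketch in base R that works only from the summary figures printed above. The helper name `formPValue` is hypothetical, and this is not necessarily how cricketr computes its p value, so the exact numbers need not match the output.

```r
# Hypothetical helper (not part of cricketr): one-sided, one-sample t-test
# computed from summary statistics.
# H0: the recent mean is not below the career mean; Ha: it is below.
formPValue <- function(sampleMean, sampleSD, n, popMean) {
  tStat <- (sampleMean - popMean) / (sampleSD / sqrt(n))
  pt(tStat, df = n - 1)   # lower-tail probability
}

# Summary figures taken from the batting output above
pKapil  <- formPValue(6.78, 6.14, 9, 19.38)
pBotham <- formPValue(15.38, 13.19, 8, 21.29)

# Compare each p value against alpha to label the form status
alpha <- 0.05
status <- ifelse(c(Kapil = pKapil, Botham = pBotham) < alpha,
                 "Out-of-Form", "In-Form")
print(status)
```

With these inputs Kapil falls below the 0.05 threshold and Botham stays above it, in line with the statuses reported above.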

Analyses of bowling performances of the All Rounders

The following plots give the bowling analysis of the 4 ODI all-rounders

  1. Kapil Dev (Ind) – Innings = 225, Wickets = 253, Average = 27.45, Economy Rate = 3.71
  2. Ian Botham (Eng) – Innings = 116, Wickets = 145, Average = 28.54, Economy Rate = 3.96
  3. Imran Khan (Pak) – Innings = 175, Wickets = 182, Average = 26.61, Economy Rate = 3.89
  4. Richard Hadlee (NZ) – Innings = 115, Wickets = 158, Average = 21.56, Economy Rate = 3.30

Kapil Dev has the highest number of innings and wickets, followed by Imran Khan. Botham and Hadlee have relatively fewer innings.

To get the bowler’s data use

#kapil2 
#botham2 
#imran2 
#hadlee2 


Wicket Frequency percentage

This plot gives the percentage frequency of each wicket count (1, 2, 3, etc.) taken by the bowler.

par(mfrow=c(1,4))
par(mar=c(4,4,2,2))
bowlerWktsFreqPercent("./kapil2.csv","Kapil")
bowlerWktsFreqPercent("./botham2.csv","Botham")
bowlerWktsFreqPercent("./imran2.csv","Imran")
bowlerWktsFreqPercent("./hadlee2.csv","Hadlee")

relbowlfp-1

dev.off()
## null device 
##           1
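
As a rough illustration of what a wicket frequency percentage means, here is a minimal base R sketch on made-up data. The vector `wkts` is hypothetical, and cricketr may count things differently (for example in how it treats wicketless innings), so this is only an approximation of the idea.

```r
# Hypothetical wickets taken in successive innings
wkts <- c(0, 1, 1, 2, 0, 3, 1, 2, 4, 1, 0, 2)

# Percentage of innings for each wicket count
freqPercent <- round(100 * prop.table(table(wkts)), 1)
print(freqPercent)

# A bar plot of freqPercent would give a chart similar in spirit
# to the ones above: barplot(freqPercent)
```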

Wickets Runs plot

The plot below gives a boxplot of the runs ranges for each of the wickets taken by the bowlers.

par(mfrow=c(1,4))
par(mar=c(4,4,2,2))

bowlerWktsRunsPlot("./kapil2.csv","Kapil")
bowlerWktsRunsPlot("./botham2.csv","Botham")
bowlerWktsRunsPlot("./imran2.csv","Imran")
bowlerWktsRunsPlot("./hadlee2.csv","Hadlee")

wktsrun-1

dev.off()
## null device 
##           1

Cumulative average wicket plot

Botham has the best cumulative average wicket touching almost 1.6 wickets followed by Hadlee

par(mfrow=c(1,3))
par(mar=c(4,4,2,2))
bowlerCumulativeAvgWickets("./kapil2.csv","Kapil")

kwm-bowlcaw-1

bowlerCumulativeAvgWickets("./botham2.csv","Botham")

kwm-bowlcaw-2

bowlerCumulativeAvgWickets("./imran2.csv","Imran")

kwm-bowlcaw-3

bowlerCumulativeAvgWickets("./hadlee2.csv","Hadlee")

kwm-bowlcaw-4

dev.off()
## null device 
##           1

Cumulative average economy rate

par(mfrow=c(1,3))
par(mar=c(4,4,2,2))
bowlerCumulativeAvgEconRate("./kapil2.csv","Kapil")

kwm-bowlcer-1

bowlerCumulativeAvgEconRate("./botham2.csv","Botham")

kwm-bowlcer-2

bowlerCumulativeAvgEconRate("./imran2.csv","Imran")

kwm-bowlcer-3

bowlerCumulativeAvgEconRate("./hadlee2.csv","Hadlee")

kwm-bowlcer-4

dev.off()
## null device 
##           1

Average wickets in different grounds and opposition

A. Kapil Dev

par(mfrow=c(1,2))
par(mar=c(4,4,2,2))
bowlerAvgWktsGround("./kapil2.csv","Kapil")
bowlerAvgWktsOpposition("./kapil2.csv","Kapil")

gr-1-1

dev.off()
## null device 
##           1

B. Ian Botham

par(mfrow=c(1,2))
par(mar=c(4,4,2,2))
bowlerAvgWktsGround("./botham2.csv","Botham")
bowlerAvgWktsOpposition("./botham2.csv","Botham")

gr-2-1

dev.off()
## null device 
##           1

C. Imran Khan

par(mfrow=c(1,2))
par(mar=c(4,4,2,2))
bowlerAvgWktsGround("./imran2.csv","Imran")
bowlerAvgWktsOpposition("./imran2.csv","Imran")

gr-3-1

dev.off()
## null device 
##           1

D. Richard Hadlee

par(mfrow=c(1,2))
par(mar=c(4,4,2,2))
bowlerAvgWktsGround("./hadlee2.csv","Hadlee")
bowlerAvgWktsOpposition("./hadlee2.csv","Hadlee")

gr-4-1

dev.off()
## null device 
##           1

Relative bowling performance

It can be seen that Botham is the most effective wicket taker of the lot

frames <- list("./kapil2.csv","./botham2.csv","imran2.csv","hadlee2.csv")
names <- list("Kapil","Botham","Imran","Hadlee")
relativeBowlingPerf(frames,names)

relbowlperf-1

Relative Economy Rate against wickets taken

Hadlee has the best overall economy rate followed by Kapil Dev

frames <- list("./kapil2.csv","./botham2.csv","imran2.csv","hadlee2.csv")
names <- list("Kapil","Botham","Imran","Hadlee")
relativeBowlingERODTT(frames,names)

relbowler-1

Relative cumulative average wickets of bowlers in career

This plot confirms the wicket taking ability of Botham followed by Hadlee

frames <- list("./kapil2.csv","./botham2.csv","imran2.csv","hadlee2.csv")
names <- list("Kapil","Botham","Imran","Hadlee")
relativeBowlerCumulativeAvgWickets(frames,names)

rbcaw-1

Relative cumulative average economy rate of bowlers

frames <- list("./kapil2.csv","./botham2.csv","imran2.csv","hadlee2.csv")
names <- list("Kapil","Botham","Imran","Hadlee")
relativeBowlerCumulativeAvgEconRate(frames,names)

rbcer-1

Moving average of wickets over career

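The relative cumulative average economy rate plot above shows that Hadlee has the best economy rate followed by Kapil. The plots below give the moving average of wickets over the career of each bowler.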

par(mfrow=c(2,2))
par(mar=c(4,4,2,2))
bowlerMovingAverage("./kapil2.csv","Kapil")
bowlerMovingAverage("./botham2.csv","Botham")
bowlerMovingAverage("./imran2.csv","Imran")
bowlerMovingAverage("./hadlee2.csv","Hadlee")

jmss-bowlma-1

dev.off()
## null device 
##           1

Wickets forecast

par(mfrow=c(2,2))
par(mar=c(4,4,2,2))
bowlerPerfForecast("./kapil2.csv","Kapil")
bowlerPerfForecast("./botham2.csv","Botham")
bowlerPerfForecast("./imran2.csv","Imran")
bowlerPerfForecast("./hadlee2.csv","Hadlee")

jjmss-pfcst-1

dev.off()
## null device 
##           1

Check bowler in-form, out-of-form

“**************************** Form status of Kapil ****************************\n\n
Population size: 198
Mean of population: 1.2 \n Sample size: 23 Mean of sample: 0.65 SD of sample: 0.83 \n\n
Null hypothesis H0 : Kapil ‘s sample average is within 95% confidence interval \n of population average\n
Alternative hypothesis Ha : Kapil ‘s sample average is below the 95% confidence\n interval of population average\n\n
Kapil ‘s Form Status: Out-of-Form because the p value: 0.002097 is less than alpha= 0.05 \n

“**************************** Form status of Botham ****************************\n\n
Population size: 166
Mean of population: 1.58 \n Sample size: 19 Mean of sample: 1.47 SD of sample: 1.12 \n\n
Null hypothesis H0 : Botham ‘s sample average is within 95% confidence interval \n of population average\n
Alternative hypothesis Ha : Botham ‘s sample average is below the 95% confidence\n interval of population average\n\n
Botham ‘s Form Status: In-Form because the p value: 0.336694 is greater than alpha= 0.05 \n

“**************************** Form status of Imran ****************************\n\n
Population size: 137
Mean of population: 1.23 \n Sample size: 16 Mean of sample: 0.81 SD of sample: 0.91 \n\n
Null hypothesis H0 : Imran ‘s sample average is within 95% confidence interval \n of population average\n
Alternative hypothesis Ha : Imran ‘s sample average is below the 95% confidence\n interval of population average\n\n
Imran ‘s Form Status: Out-of-Form because the p value: 0.041727 is less than alpha= 0.05 \n

“**************************** Form status of Hadlee ****************************\n\n
Population size: 100
Mean of population: 1.38 \n Sample size: 12 Mean of sample: 1.67 SD of sample: 1.37 \n\n
Null hypothesis H0 : Hadlee ‘s sample average is within 95% confidence interval \n of population average\n
Alternative hypothesis Ha : Hadlee ‘s sample average is below the 95% confidence\n interval of population average\n\n
Hadlee ‘s Form Status: In-Form because the p value: 0.761265 is greater than alpha= 0.05 \n *******************************************************************************************\n\n”

Key findings

Here are some key conclusions on the 4 ODI all-rounders

  1. Kapil Dev’s strike rate stands high above the other 3
  2. Imran Khan has the best cumulative average runs followed by Kapil
  3. Botham is the most effective wicket taker followed by Hadlee
  4. Hadlee is the most economical bowler and is followed by Kapil Dev
  5. For a hypothetical number of balls faced and minutes at the crease, Kapil will score the most runs followed by Botham
  6. The moving average of runs indicates that the best was yet to come for Imran and Hadlee, while Kapil and Botham were on the decline

Also see my other posts in R

  1. A primer on Qubits, Quantum gates and Quantum operations
  2. Deblurring with OpenCV:Weiner filter reloaded
  3. Designing a Social Web Portal
  4. A crime map of India in R – Crimes against women
  5. Bend it like Bluemix, MongoDB with autoscaling – Part 2
  6. Mirror, mirror . the best batsman of them all?

For a full list of posts see Index of posts

Introducing cricket package yorkr:Part 4-In the block hole!

Introduction

“The nitrogen in our DNA, the calcium in our teeth, the iron in our blood, the carbon in our apple pies were made in the interiors of collapsing stars. We are made of starstuff.”

“If you wish to make an apple pie from scratch, you must first invent the universe.”

“We are like butterflies who flutter for a day and think it is forever.”

“The absence of evidence is not the evidence of absence.”

“We are star stuff which has taken its destiny into its own hands.”

                              Cosmos - Carl Sagan

This post is the 4th, and possibly the last, part of the introduction to my latest cricket package yorkr. The 3 earlier parts were

  1. Introducing cricket package yorkr-Part1:Beaten by sheer pace!.
  2. Introducing cricket package yorkr: Part 2-Trapped leg before wicket!
  3. Introducing cricket package yorkr: Part 3-Foxed by flight!

The 1st part included functions dealing with a specific match, the 2nd part dealt with functions between 2 opposing teams. The 3rd part dealt with functions between a team and all matches with all oppositions. This 4th part includes individual batting and bowling performances in ODI matches and deals with Class 4 functions.


This post has also been published at RPubs yorkr-Part4 and can also be downloaded as a PDF document from yorkr-Part4.pdf.

You can clone/fork the code for the package yorkr from Github at yorkr-package

Checkout my interactive Shiny apps GooglyPlus (plots & tables) and Googly (only plots) which can be used to analyze IPL players, teams and matches.

Important note 1: Do check out all the posts on the python avatar of yorkr, namely ‘yorkpy’, in my post ‘Pitching yorkpy … short of good length to IPL – Part 1’

Batsman functions

  1. batsmanRunsVsDeliveries
  2. batsmanFoursSixes
  3. batsmanDismissals
  4. batsmanRunsVsStrikeRate
  5. batsmanMovingAverage
  6. batsmanCumulativeAverageRuns
  7. batsmanCumulativeStrikeRate
  8. batsmanRunsAgainstOpposition
  9. batsmanRunsVenue
  10. batsmanRunsPredict

Bowler functions

  1. bowlerMeanEconomyRate
  2. bowlerMeanRunsConceded
  3. bowlerMovingAverage
  4. bowlerCumulativeAvgWickets
  5. bowlerCumulativeAvgEconRate
  6. bowlerWicketPlot
  7. bowlerWicketsAgainstOpposition
  8. bowlerWicketsVenue
  9. bowlerWktsPredict

Note: The yorkr package in its current avatar only supports ODI, T20 and IPL T20 matches.

library(yorkr)
library(gridExtra)
library(rpart.plot)
library(dplyr)
library(ggplot2)
rm(list=ls())

A. Batsman functions

1. Get Team Batting details

The function below gets the overall team batting details from the RData files of the individual ODI matches. These are currently also available on Github at (https://github.com/tvganesh/yorkrData/tree/master/ODI/ODI-matches). However, you may have to run this function again as future matches are added! The batting details of the team in each match are computed, and a huge data frame is created by rbinding the individual dataframes. This can be saved as an RData file.

setwd("C:/software/cricket-package/york-test/yorkrData/ODI/ODI-matches")
india_details <- getTeamBattingDetails("India",dir=".", save=TRUE)
dim(india_details)
## [1] 11085    15
sa_details <- getTeamBattingDetails("South Africa",dir=".",save=TRUE)
dim(sa_details)
## [1] 6375   15
nz_details <- getTeamBattingDetails("New Zealand",dir=".",save=TRUE)
dim(nz_details)
## [1] 6262   15
eng_details <- getTeamBattingDetails("England",dir=".",save=TRUE)
dim(eng_details)
## [1] 9001   15
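
The rbinding step described above can be sketched in base R as follows. The three small per-match data frames and their column values are made up for illustration, and the output file name is an assumption; yorkr's own function reads the real per-match RData files instead.

```r
# Hypothetical per-match batting data frames with identical columns
match1 <- data.frame(batsman = c("A", "B"), runs = c(45, 12), ballsPlayed = c(50, 20))
match2 <- data.frame(batsman = c("A", "C"), runs = c(7, 88), ballsPlayed = c(10, 95))
match3 <- data.frame(batsman = c("B", "C"), runs = c(30, 4), ballsPlayed = c(28, 9))

# Stack all matches into one large data frame
matches <- list(match1, match2, match3)
teamDetails <- do.call(rbind, matches)
dim(teamDetails)

# Optionally save the combined data frame so it can be reloaded later
outFile <- file.path(tempdir(), "Team-BattingDetails.RData")
save(teamDetails, file = outFile)
```

Using `do.call(rbind, ...)` on a list keeps the code the same whether there are 3 matches or several hundred, as long as every data frame has the same columns.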

2. Get batsman details

This function is used to get the individual batting record of the specified batsman of a country, as in the calls below. For analyzing the batting performances, the following cricketers have been chosen

  1. Virat Kohli (Ind)
  2. M S Dhoni (Ind)
  3. AB De Villiers (SA)
  4. Q De Kock (SA)
  5. J Root (Eng)
  6. M J Guptill (NZ)

setwd("C:/software/cricket-package/york-test/yorkrData/ODI/ODI-matches")
kohli <- getBatsmanDetails(team="India",name="Kohli",dir=".")
## [1] "./India-BattingDetails.RData"
dhoni <- getBatsmanDetails(team="India",name="Dhoni")
## [1] "./India-BattingDetails.RData"
devilliers <-  getBatsmanDetails(team="South Africa",name="Villiers",dir=".")
## [1] "./South Africa-BattingDetails.RData"
deKock <-  getBatsmanDetails(team="South Africa",name="Kock",dir=".")
## [1] "./South Africa-BattingDetails.RData"
root <-  getBatsmanDetails(team="England",name="Root",dir=".")
## [1] "./England-BattingDetails.RData"
guptill <-  getBatsmanDetails(team="New Zealand",name="Guptill",dir=".")
## [1] "./New Zealand-BattingDetails.RData"

3. Runs versus deliveries

Kohli, De Villiers and Guptill have a good cluster of points that head towards 150 runs at 150 deliveries.

p1 <-batsmanRunsVsDeliveries(kohli,"Kohli")
p2 <- batsmanRunsVsDeliveries(dhoni, "Dhoni")
p3 <- batsmanRunsVsDeliveries(devilliers,"De Villiers")
p4 <- batsmanRunsVsDeliveries(deKock,"Q de Kock")
p5 <- batsmanRunsVsDeliveries(root,"JE Root")
p6 <- batsmanRunsVsDeliveries(guptill,"MJ Guptill")
grid.arrange(p1,p2,p3,p4,p5,p6, ncol=3)

runsVsDeliveries-1

4. Batsman Total runs, Fours and Sixes

The plots below show the total runs, fours and sixes by the batsmen

kohli46 <- select(kohli,batsman,ballsPlayed,fours,sixes,runs)
p1 <- batsmanFoursSixes(kohli46,"Kohli")
dhoni46 <- select(dhoni,batsman,ballsPlayed,fours,sixes,runs)
p2 <- batsmanFoursSixes(dhoni46,"Dhoni")
devilliers46 <- select(devilliers,batsman,ballsPlayed,fours,sixes,runs)
p3 <- batsmanFoursSixes(devilliers46, "De Villiers")
deKock46 <- select(deKock,batsman,ballsPlayed,fours,sixes,runs)
p4 <- batsmanFoursSixes(deKock46,"Q de Kock")
root46 <- select(root,batsman,ballsPlayed,fours,sixes,runs)
p5 <- batsmanFoursSixes(root46,"JE Root")
guptill46 <- select(guptill,batsman,ballsPlayed,fours,sixes,runs)
p6 <- batsmanFoursSixes(guptill46,"MJ Guptill")
grid.arrange(p1,p2,p3,p4,p5,p6, ncol=3)

foursSixes-1

5. Batsman dismissals

The type of dismissal for each batsman is shown below

p1 <-batsmanDismissals(kohli,"Kohli")
p2 <- batsmanDismissals(dhoni, "Dhoni")
p3 <- batsmanDismissals(devilliers, "De Villiers")
p4 <- batsmanDismissals(deKock,"Q de Kock")
p5 <- batsmanDismissals(root,"JE Root")
p6 <- batsmanDismissals(guptill,"MJ Guptill")
grid.arrange(p1,p2,p3,p4,p5,p6, ncol=3)

dismissal-1

6. Runs versus Strike Rate

De Villiers has the best strike rate of all, as there are more points towards the right side of the plot for the same runs. Kohli and Dhoni do well too. Q De Kock and Joe Root also have a very good spread of points, though they have fewer innings.

p1 <-batsmanRunsVsStrikeRate(kohli,"Kohli")
p2 <- batsmanRunsVsStrikeRate(dhoni, "Dhoni")
p3 <- batsmanRunsVsStrikeRate(devilliers, "De Villiers")
p4 <- batsmanRunsVsStrikeRate(deKock,"Q de Kock")
p5 <- batsmanRunsVsStrikeRate(root,"JE Root")
p6 <- batsmanRunsVsStrikeRate(guptill,"MJ Guptill")
grid.arrange(p1,p2,p3,p4,p5,p6, ncol=3)

runsSR-1

7. Batsman moving average

Kohli’s average is on a gentle increase from below 50 to around the 60s. Joe Root’s performance is impressive, with his moving average of late tending towards the 70s. Q De Kock seems to have had a slump around 2015, but his performance is on the increase. De Villiers consistently averages around 50. Dhoni has also had a stable run in the last several years.

p1 <-batsmanMovingAverage(kohli,"Kohli")
p2 <- batsmanMovingAverage(dhoni, "Dhoni")
p3 <- batsmanMovingAverage(devilliers, "De Villiers")
p4 <- batsmanMovingAverage(deKock,"Q de Kock")
p5 <- batsmanMovingAverage(root,"JE Root")
p6 <- batsmanMovingAverage(guptill,"MJ Guptill")
grid.arrange(p1,p2,p3,p4,p5,p6, ncol=3)

ma-1
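
To make the idea of a moving average concrete, here is a minimal base R sketch of a simple trailing moving average over a fixed window of innings. The `runs` vector and the window size `k` are hypothetical, and this is not necessarily the smoothing that yorkr applies in the plots above.

```r
# Hypothetical runs scored in successive innings
runs <- c(12, 45, 3, 78, 56, 22, 101, 34, 67, 8)

# Trailing moving average over a window of k innings
k <- 3
movAvg <- as.numeric(stats::filter(runs, rep(1 / k, k), sides = 1))

# The first k - 1 values are NA because the window is not yet full
print(round(movAvg, 2))
```

A larger window smooths out individual big scores and failures, while a smaller one follows recent form more closely.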

8. Batsman cumulative average

The functions below provide the cumulative average of runs scored. As can be seen, Kohli and De Villiers have a cumulative average of around 48-50. Q De Kock seems to have had a rocky career with several highs and lows, as his cumulative average oscillates between 40 and 45. Root steadily improves to a cumulative average of around 42-43 from his 50th innings.

p1 <-batsmanCumulativeAverageRuns(kohli,"Kohli")
p2 <- batsmanCumulativeAverageRuns(dhoni, "Dhoni")
p3 <- batsmanCumulativeAverageRuns(devilliers, "De Villiers")
p4 <- batsmanCumulativeAverageRuns(deKock,"Q de Kock")
p5 <- batsmanCumulativeAverageRuns(root,"JE Root")
p6 <- batsmanCumulativeAverageRuns(guptill,"MJ Guptill")
grid.arrange(p1,p2,p3,p4,p5,p6, ncol=3)

cAvg-1
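
A cumulative average, as opposed to a moving average, uses every innings played so far. A minimal base R sketch, assuming a hypothetical `runs` vector and treating the average simply as runs per innings (ignoring not-outs, which yorkr may or may not handle differently):

```r
# Hypothetical runs scored in successive innings
runs <- c(12, 45, 3, 78, 56, 22, 101, 34, 67, 8)

# Cumulative average after each innings: total runs so far / innings so far
cumAvg <- cumsum(runs) / seq_along(runs)
print(round(cumAvg, 2))
```

Because each new innings is divided by an ever larger count, the curve swings widely early in a career and flattens out later, which is the pattern visible in the plots above.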

9. Cumulative Average Strike Rate

The plots below show the cumulative average strike rate of the batsmen. Dhoni and De Villiers have the best cumulative average strike rate, at around 90 runs per 100 balls. The rest average a strike rate of around 80. Guptill shows a slump towards the latter part of his career.

p1 <-batsmanCumulativeStrikeRate(kohli,"Kohli")
p2 <- batsmanCumulativeStrikeRate(dhoni, "Dhoni")
p3 <- batsmanCumulativeStrikeRate(devilliers, "De Villiers")
p4 <- batsmanCumulativeStrikeRate(deKock,"Q de Kock")
p5 <- batsmanCumulativeStrikeRate(root,"JE Root")
p6 <- batsmanCumulativeStrikeRate(guptill,"MJ Guptill")
grid.arrange(p1,p2,p3,p4,p5,p6, ncol=3)

cSR-1

10. Batsman runs against opposition

Kohli’s best performances are against Australia, West Indies and Sri Lanka

batsmanRunsAgainstOpposition(kohli,"Kohli")

runsOppn1-1

batsmanRunsAgainstOpposition(dhoni, "Dhoni")

runsOppn2-1

De Villiers’ best performances are against Australia, Pakistan and West Indies

batsmanRunsAgainstOpposition(devilliers, "De Villiers")

runsOppn3-1

Quentin de Kock averages almost 100 runs against India and 75 runs against England

batsmanRunsAgainstOpposition(deKock, "Q de Kock")

runsOppn4-1

Root’s best performances are against South Africa, Sri Lanka and West Indies

batsmanRunsAgainstOpposition(root, "JE Root")

runsOppn5-1

batsmanRunsAgainstOpposition(guptill, "MJ Guptill")

runsOppn6-1

11. Runs at different venues

The plots below give the performances of the batsmen at different grounds.

batsmanRunsVenue(kohli,"Kohli")

runsVenue1-1

batsmanRunsVenue(dhoni, "Dhoni")

runsVenue2-1

batsmanRunsVenue(devilliers, "De Villiers")

runsVenue3-1

batsmanRunsVenue(deKock, "Q de Kock")

runsVenue4-1

batsmanRunsVenue(root, "JE Root")

runsVenue5-1

batsmanRunsVenue(guptill, "MJ Guptill")

runsVenue6-1

12. Predict number of runs to deliveries

The plots below use an rpart classification tree to predict the number of deliveries required to score the runs in the leaf node. For example, Kohli takes 66 deliveries to score 64 runs, and for a higher number of deliveries scores around 115 runs. De Villiers needs

par(mfrow=c(1,3))
par(mar=c(4,4,2,2))
batsmanRunsPredict(kohli,"Kohli")
batsmanRunsPredict(dhoni, "Dhoni")
batsmanRunsPredict(devilliers, "De Villiers")

runsPredict1,runsVenue1-1

par(mfrow=c(1,3))
par(mar=c(4,4,2,2))
batsmanRunsPredict(deKock,"Q de Kock")
batsmanRunsPredict(root,"JE Root")
batsmanRunsPredict(guptill,"MJ Guptill")

runsPredict2,runsVenue1-1
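
The tree-based prediction can be sketched with rpart directly. The data below is simulated, the column names `runs` and `ballsPlayed` are assumed for illustration, and the model formula is my own guess rather than the one yorkr uses internally. Note that for a numeric response rpart fits a regression ("anova") tree by default.

```r
library(rpart)

# Simulated innings (hypothetical): runs scored and deliveries faced
set.seed(42)
runs <- round(runif(60, min = 0, max = 120))
ballsPlayed <- pmax(1, round(runs * 1.1 + rnorm(60, sd = 8)))
innings <- data.frame(runs = runs, ballsPlayed = ballsPlayed)

# Fit a tree that predicts deliveries faced from runs scored
fit <- rpart(ballsPlayed ~ runs, data = innings,
             control = rpart.control(minsplit = 10))

# Predicted deliveries for two target scores
targets <- data.frame(runs = c(30, 90))
pred <- predict(fit, newdata = targets)
print(pred)
```

Each leaf of the fitted tree holds the mean number of deliveries for the range of runs that falls into it, which is why the plots above show a small set of discrete values rather than a smooth curve.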

B. Bowler functions

13. Get bowling details

The function below gets the overall team bowling details from the RData files of the individual ODI matches. These are currently also available on Github at (https://github.com/tvganesh/yorkrData/tree/master/ODI/ODI-matches). The bowling details of the team in each match are computed, and a huge data frame is created by rbinding the individual dataframes. This can be saved as an RData file.

setwd("C:/software/cricket-package/york-test/yorkrData/ODI/ODI-matches")
ind_bowling <- getTeamBowlingDetails("India",dir=".",save=TRUE)
dim(ind_bowling)
## [1] 7816   12
aus_bowling <- getTeamBowlingDetails("Australia",dir=".",save=TRUE)
dim(aus_bowling)
## [1] 9191   12
ban_bowling <- getTeamBowlingDetails("Bangladesh",dir=".",save=TRUE)
dim(ban_bowling)
## [1] 5665   12
sa_bowling <- getTeamBowlingDetails("South Africa",dir=".",save=TRUE)
dim(sa_bowling)
## [1] 3806   12
sl_bowling <- getTeamBowlingDetails("Sri Lanka",dir=".",save=TRUE)
dim(sl_bowling)
## [1] 3964   12

14. Get bowling details of the individual bowlers

This function is used to get the individual bowling record for a specified bowler of the country as in the functions below. For analyzing the bowling performances the following cricketers have been chosen

  1. R A Jadeja (Ind)
  2. Ravichandran Ashwin (Ind)
  3. Mitchell Starc (Aus)
  4. Shakib Al Hasan (Ban)
  5. Ajantha Mendis (SL)
  6. Dale Steyn (SA)

jadeja <- getBowlerWicketDetails(team="India",name="Jadeja",dir=".")
ashwin <- getBowlerWicketDetails(team="India",name="Ashwin",dir=".")
starc <-  getBowlerWicketDetails(team="Australia",name="Starc",dir=".")
shakib <-  getBowlerWicketDetails(team="Bangladesh",name="Shakib",dir=".")
mendis <-  getBowlerWicketDetails(team="Sri Lanka",name="Mendis",dir=".")
steyn <-  getBowlerWicketDetails(team="South Africa",name="Steyn",dir=".")

15. Bowler Mean Economy Rate

Shakib Al Hasan is expensive in the first 3 overs, after which he is very economical with an economy rate of 3-4. Starc and Steyn average an ER of around 4.0.

p1<-bowlerMeanEconomyRate(jadeja,"RA Jadeja")
p2<-bowlerMeanEconomyRate(ashwin, "R Ashwin")
p3<-bowlerMeanEconomyRate(starc, "MA Starc")
p4<-bowlerMeanEconomyRate(shakib, "Shakib Al Hasan")
p5<-bowlerMeanEconomyRate(mendis, "A Mendis")
p6<-bowlerMeanEconomyRate(steyn, "D Steyn")
grid.arrange(p1,p2,p3,p4,p5,p6, ncol=3)

meanER-1
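
For reference, the economy rate is the number of runs conceded per over. The sketch below, on a made-up `spells` data frame with assumed column names, shows one way a mean economy rate for each number of overs bowled could be computed in base R; it is an illustration of the quantity, not yorkr's implementation.

```r
# Hypothetical bowling spells: overs bowled and runs conceded
spells <- data.frame(
  overs = c(3, 3, 5, 5, 8, 8, 10, 10),
  runs  = c(18, 21, 22, 28, 30, 34, 41, 45)
)

# Economy rate = runs conceded per over
spells$economyRate <- spells$runs / spells$overs

# Mean economy rate for each number of overs bowled
meanER <- aggregate(economyRate ~ overs, data = spells, FUN = mean)
print(meanER)
```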

16. Bowler Mean Runs conceded

Ashwin is expensive around 6 & 7 overs

p1<-bowlerMeanRunsConceded(jadeja,"RA Jadeja")
p2<-bowlerMeanRunsConceded(ashwin, "R Ashwin")
p3<-bowlerMeanRunsConceded(starc, "M A Starc")
p4<-bowlerMeanRunsConceded(shakib, "Shakib Al Hasan")
p5<-bowlerMeanRunsConceded(mendis, "A Mendis")
p6<-bowlerMeanRunsConceded(steyn, "D Steyn")
grid.arrange(p1,p2,p3,p4,p5,p6, ncol=3)

meanRunsConceded-1

17. Bowler Moving average

The performances of RA Jadeja and Mendis have dipped considerably, while Ashwin and Shakib show improving performances. Starc averages around 4 wickets.

p1<-bowlerMovingAverage(jadeja,"RA Jadeja")
p2<-bowlerMovingAverage(ashwin, "Ashwin")
p3<-bowlerMovingAverage(starc, "M A Starc")
p4<-bowlerMovingAverage(shakib, "Shakib Al Hasan")
p5<-bowlerMovingAverage(mendis, "Ajantha Mendis")
p6<-bowlerMovingAverage(steyn, "Dale Steyn")
grid.arrange(p1,p2,p3,p4,p5,p6, ncol=3)

bowlerMA-1

17. Bowler cumulative average wickets

Starc is clearly the most consistent performer with 3 wickets on average over his career, while Jadeja averages around 2.0. Ashwin seems to have dropped from 2.4 to 2.0 wickets, while Mendis drops from a high of 3.5 to 2.2 wickets. The fractional wickets only show a tendency to take another wicket.

p1<-bowlerCumulativeAvgWickets(jadeja,"RA Jadeja")
p2<-bowlerCumulativeAvgWickets(ashwin, "Ashwin")
p3<-bowlerCumulativeAvgWickets(starc, "M A Starc")
p4<-bowlerCumulativeAvgWickets(shakib, "Shakib Al Hasan")
p5<-bowlerCumulativeAvgWickets(mendis, "Ajantha Mendis")
p6<-bowlerCumulativeAvgWickets(steyn, "Dale Steyn")
grid.arrange(p1,p2,p3,p4,p5,p6, ncol=3)

cumWkts-1

18. Bowler cumulative Economy Rate (ER)

The plots below are interesting. All of the bowlers seem to average around 4.5 runs/over. RA Jadeja’s ER improves and heads to 4.5, while Mendis is seen to get more expensive as his career progresses: from an ER of 3.0, he increases towards 4.5.

p1<-bowlerCumulativeAvgEconRate(jadeja,"RA Jadeja")
p2<-bowlerCumulativeAvgEconRate(ashwin, "Ashwin")
p3<-bowlerCumulativeAvgEconRate(starc, "M A Starc")
p4<-bowlerCumulativeAvgEconRate(shakib, "Shakib Al Hasan")
p5<-bowlerCumulativeAvgEconRate(mendis, "Ajantha Mendis")
p6<-bowlerCumulativeAvgEconRate(steyn, "Dale Steyn")
grid.arrange(p1,p2,p3,p4,p5,p6, ncol=3)

cumER-1

19. Bowler wicket plot

The plot below gives the average wickets versus number of overs

p1<-bowlerWicketPlot(jadeja,"RA Jadeja")
p2<-bowlerWicketPlot(ashwin, "Ashwin")
p3<-bowlerWicketPlot(starc, "M A Starc")
p4<-bowlerWicketPlot(shakib, "Shakib Al Hasan")
p5<-bowlerWicketPlot(mendis, "Ajantha Mendis")
p6<-bowlerWicketPlot(steyn, "Dale Steyn")
grid.arrange(p1,p2,p3,p4,p5,p6, ncol=3)

wktPlot-1

20. Bowler wicket against opposition

#Jadeja's best performances are against England, Pakistan and West Indies
bowlerWicketsAgainstOpposition(jadeja,"RA Jadeja")

wktsOppn1-1

#Ashwin's best performances are against England, Pakistan and South Africa
bowlerWicketsAgainstOpposition(ashwin, "Ashwin")

wktsOppn2-1

#Starc has good performances against India, New Zealand, Pakistan, West Indies
bowlerWicketsAgainstOpposition(starc, "M A Starc")

wktsOppn3-1

bowlerWicketsAgainstOpposition(shakib,"Shakib Al Hasan")

wktsOppn4-1

bowlerWicketsAgainstOpposition(mendis, "Ajantha Mendis")

wktsOppn5-1

#Steyn has good performances against India, Sri Lanka, Pakistan, West Indies
bowlerWicketsAgainstOpposition(steyn, "Dale Steyn")

wktsOppn6-1

21. Bowler wicket at cricket grounds

bowlerWicketsVenue(jadeja,"RA Jadeja")

wktsAve1-1

bowlerWicketsVenue(ashwin, "Ashwin")

wktsAve2-1

bowlerWicketsVenue(starc, "M A Starc")
## Warning: Removed 2 rows containing missing values (geom_bar).

wktsAve3-1

bowlerWicketsVenue(shakib,"Shakib Al Hasan")

wktsAve4-1

bowlerWicketsVenue(mendis, "Ajantha Mendis")

wktsAve5-1

bowlerWicketsVenue(steyn, "Dale Steyn")

wktsAve6-1

22. Get Delivery wickets for bowlers

This function creates a dataframe of deliveries and the wickets taken

setwd("C:/software/cricket-package/york-test/yorkrData/ODI/ODI-matches")
jadeja1 <- getDeliveryWickets(team="India",dir=".",name="Jadeja",save=FALSE)
ashwin1 <- getDeliveryWickets(team="India",dir=".",name="Ashwin",save=FALSE)
starc1 <- getDeliveryWickets(team="Australia",dir=".",name="MA Starc",save=FALSE)
shakib1 <- getDeliveryWickets(team="Bangladesh",dir=".",name="Shakib",save=FALSE)
mendis1 <- getDeliveryWickets(team="Sri Lanka",dir=".",name="Mendis",save=FALSE)
steyn1 <- getDeliveryWickets(team="South Africa",dir=".",name="Steyn",save=FALSE)

23. Predict number of deliveries to wickets

#Jadeja and Ashwin need around 22 to 28 deliveries to make a break through
par(mfrow=c(1,2))
par(mar=c(4,4,2,2))
bowlerWktsPredict(jadeja1,"RA Jadeja")
bowlerWktsPredict(ashwin1,"RAshwin")

wktsPred1-1

#Starc and Shakib provide an early breakthrough, producing a wicket in around 16 balls. Starc's 2nd wicket comes around the 30th delivery
par(mfrow=c(1,2))
par(mar=c(4,4,2,2))
bowlerWktsPredict(starc1,"MA Starc")
bowlerWktsPredict(shakib1,"Shakib Al Hasan")

wktsPred2-1

#Steyn and Mendis take 20 deliveries to get their 1st wicket
par(mfrow=c(1,2))
par(mar=c(4,4,2,2))
bowlerWktsPredict(mendis1,"A Mendis")
bowlerWktsPredict(steyn1,"DSteyn")

wktsPred3-1

Conclusion

This concludes the 4 part introduction to my new R cricket package yorkr for ODIs. I will be enhancing the package to handle Twenty20 and IPL matches soon. You can fork/clone the code from Github at yorkr.

The yaml data from Cricsheet have already been converted into R consumable dataframes. The converted data can be downloaded from Github at yorkrData. There are 3 folders – ODI matches, ODI matches between 2 teams (oppnAllMatches), ODI matches between a team and the rest of the world (all matches, all oppositions).

As I have already mentioned, I have around 67 functions for analysis; however, I am certain that the data has a lot more secrets waiting to be tapped. So please do go ahead and run any machine learning or statistical learning algorithms on them. If you do come up with interesting insights, I would appreciate it if you attribute the source to Cricsheet (http://cricsheet.org), my package yorkr and my blog Giga thoughts, besides dropping me a note.

Hope you have a great time with my yorkr package!

Important note: Do check out my other posts using yorkr at yorkr-posts

Also see

  1. Introducing cricketr! : An R package to analyze performances of cricketers
  2. Cricket analytics with cricketr in paperback and Kindle versions
  3. My TEDx talk on the “Internet of Things”
  4. Bend it like Bluemix,MongoDB with autoscaling – Part 1
  5. The mind of a programmer
  6. Fun simulation of a chain in Android
  7. Taking cricketr for a spin-Part 1
  8. Latency,throughput implications for the cloud
  9. Hand detection through haar-training: A hands-on approach
  10. Cricket analytics with cricketr

Introducing cricket package yorkr: Part 2-Trapped leg before wicket!

“It was a puzzling thing. The truth knocks on the door and you say ‘Go away, I ’m looking for the truth,’ and so it goes away. Puzzling.”

“But even though Quality cannot be defined, you know what Quality is!”

“The Buddha, the Godhead, resides quite comfortably in the circuits of a digital computer or the gears of a cycle transmission as he does at the top of a mountain or in the petals of the flower. To think otherwise is to demean the Buddha – which is to demean oneself.”

                Zen and the Art of Motorcycle maintenance - Robert M Pirsig

Introduction

If we were to extend the last quote from Zen and the Art of Motorcycle Maintenance, by Robert M Pirsig, I think it would be fair to say that the Buddha also resides comfortably in the exquisite backhand cross-court return of Bjorn Borg, the graceful arc of the football in a Lionel Messi free kick and the smashing cover drive of Sunil Gavaskar.

In this post I continue to introduce my latest cricket package yorkr. This post is a continuation of my earlier post – Introducing cricket package yorkr-Part1:Beaten by sheer pace!. This post deals with Class 2 functions, namely the performances of a team in all matches against a single opposition, e.g. all matches of India-Australia, Pakistan-West Indies etc. You can clone/fork the code for my package yorkr from Github at yorkr

Note 1: The package currently only supports ODI, T20s and IPL T20 matches.

This post has also been published at RPubs yorkr-Part2 and can also be downloaded as a PDF document from yorkr-Part2.pdf

Checkout my interactive Shiny apps GooglyPlus (plots & tables) and Googly (only plots) which can be used to analyze IPL players, teams and matches.

Important note 1: Do check out all the posts on the python avatar of yorkr, namely ‘yorkpy’, in my post ‘Pitching yorkpy … short of good length to IPL – Part 1’

The functions in Class 2 are

  1. teamBatsmenPartnershiOppnAllMatches()
  2. teamBatsmenPartnershipOppnAllMatchesChart()
  3. teamBatsmenVsBowlersOppnAllMatches()
  4. teamBattingScorecardOppnAllMatches()
  5. teamBowlingPerfOppnAllMatches()
  6. teamBowlersWicketsOppnAllMatches()
  7. teamBowlersVsBatsmenOppnAllMatches()
  8. teamBowlersWicketKindOppnAllMatches()
  9. teamBowlersWicketRunsOppnAllMatches()
  10. plotWinLossBetweenTeams()

1. Install the package from CRAN

if (!require("yorkr")) {
  install.packages("yorkr") 
  library("yorkr")
}
library(plotly) 
rm(list=ls())

2. Get data for all matches between 2 teams

We can get all matches between any 2 teams using the function below. The dir parameter should point to the folder which contains the RData files of the individual matches. This function creates a data frame of all the matches and also saves the dataframe as RData

setwd("C:/software/cricket-package/york-test/yorkrData/ODI/ODI-matches")
matches <- getAllMatchesBetweenTeams("Australia","India",dir=".")
dim(matches)
## [1] 67428    25
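
Conceptually, combining matches comes down to loading each per-match RData file and stacking the rows with rbind. The base R sketch below illustrates the idea on toy files; it is not the package's actual implementation. The helper name combineMatchFiles, the toy data frames and the assumption that file names begin with the two team names are all hypothetical.

```r
# Hypothetical helper (not part of yorkr): stack every per-match data frame
# whose file name starts with the two team names, in either order.
combineMatchFiles <- function(team1, team2, dir = ".") {
  files <- list.files(dir, pattern = "\\.RData$", full.names = TRUE)
  prefixes <- c(paste0(team1, "-", team2, "-"), paste0(team2, "-", team1, "-"))
  keep <- startsWith(basename(files), prefixes[1]) |
          startsWith(basename(files), prefixes[2])
  frames <- lapply(files[keep], function(f) {
    e <- new.env()
    objs <- load(f, envir = e)   # load() returns the names of the restored objects
    get(objs[1], envir = e)      # assume one data frame per file
  })
  do.call(rbind, frames)
}

# Toy files standing in for converted matches (made-up data)
d <- file.path(tempdir(), "toy-matches")
dir.create(d, showWarnings = FALSE)
m1 <- data.frame(batsman = c("A", "B"), runs = c(4, 1))
m2 <- data.frame(batsman = "C", runs = 6)
m3 <- data.frame(batsman = "D", runs = 2)
save(m1, file = file.path(d, "Australia-India-2012-02-12.RData"))
save(m2, file = file.path(d, "India-Australia-2013-10-13.RData"))
save(m3, file = file.path(d, "England-New Zealand-2007-01-30.RData"))

matches_toy <- combineMatchFiles("Australia", "India", dir = d)
dim(matches_toy)
```

Only the two Australia-India toy files are picked up, so the result has one row per delivery across those two files.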

I have however already saved the matches for all possible combinations of opposing countries. The data for these matches for the individual teams/countries can be obtained from Github in the folder ODI-allmatches-between-two-teams

Note: The dataframes for the different head-to-head matches can be loaded directly into your code. The dataframes are 15000+ rows x 25 columns. While I have 10 functions to process the details between teams, feel free to let loose any statistical or machine learning algorithms on the dataframes. So go ahead with any insights that can be gleaned from random forests, ridge regression, SVM classifiers and so on. If you do come up with something interesting, I would appreciate it if you could drop me a note. Also please do attribute the source to Cricsheet (http://cricsheet.org), the package yorkr and my blog Giga thoughts

3. Save data for all matches between all combination of 2 teams

This can be done locally using the function below. You could use this function to combine all matches between any 2 teams into a single dataframe and save it in the current folder. The current implementation expects that the RData files of individual matches are in the ../data folder. Since I have already converted these, I will not be running this again

#saveAllMatchesBetweenTeams(dir=".",odir=".")

4. Load data directly for all matches between 2 teams

As in my earlier post I pick all matches between 2 random teams. I load the data directly from the stored RData files. When we load the RData file a “matches” object will be created. This object can be stored for the appropriate teams as below

setwd("C:/software/cricket-package/york-test/yorkrData/ODI/ODI-allmatches-between-two-teams")
load("India-Australia-allMatches.RData")
aus_ind_matches <- matches
dim(aus_ind_matches)
## [1] 21909    25
load("England-New Zealand-allMatches.RData")
eng_nz_matches <- matches
dim(eng_nz_matches)
## [1] 15343    25
load("Pakistan-South Africa-allMatches.RData")
pak_sa_matches <- matches
dim(pak_sa_matches)
## [1] 17083    25
load("Sri Lanka-West Indies-allMatches.RData")
sl_wi_matches <- matches
dim(sl_wi_matches)
## [1] 4869   25
load("Bangladesh-Ireland-allMatches.RData")
ban_ire_matches <-matches
dim(ban_ire_matches)
## [1] 1668   25
load("Kenya-Bermuda-allMatches.RData")
ken_ber_matches <- matches
dim(ken_ber_matches)
## [1] 1518   25
load("Scotland-Canada-allMatches.RData")
sco_can_matches <-matches
dim(sco_can_matches)
## [1] 1061   25
load("Netherlands-Afghanistan-allMatches.RData")
nl_afg_matches <- matches
dim(nl_afg_matches)
## [1] 402  25
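
Since every load() call above overwrites the same “matches” object, it is easy to mix up the pairs. One way around this is a small wrapper that loads into a private environment and returns the data frame. This is only a sketch: loadMatches is a hypothetical helper, not a yorkr function, and the toy file stands in for the real RData files.

```r
# Hypothetical helper: load an RData file into a private environment and
# return the 'matches' object it contains, leaving the workspace untouched.
loadMatches <- function(file) {
  e <- new.env()
  load(file, envir = e)
  get("matches", envir = e)
}

# Toy file standing in for e.g. "India-Australia-allMatches.RData"
f <- file.path(tempdir(), "Toy-allMatches.RData")
local({
  matches <- data.frame(batsman = c("A", "B"), runs = c(4, 1))
  save(matches, file = f)
})

toy_matches <- loadMatches(f)
dim(toy_matches)
```

With such a helper each assignment becomes a single line, for example aus_ind_matches <- loadMatches("India-Australia-allMatches.RData").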

5. Team Batsmen partnership (all matches with opposition)

This function will create a report of the batting partnerships in the teams. The report can be brief or detailed depending on the parameter ‘report’. The top batsmen in India-Australia clashes are Ricky Ponting from Australia and Mahendra Singh Dhoni of India.

m<- teamBatsmenPartnershiOppnAllMatches(aus_ind_matches,'Australia',report="summary")
m
## Source: local data frame [47 x 2]
## 
##       batsman totalRuns
##        (fctr)     (dbl)
## 1  RT Ponting       876
## 2  MEK Hussey       753
## 3   GJ Bailey       614
## 4   SR Watson       609
## 5   MJ Clarke       607
## 6   ML Hayden       573
## 7   A Symonds       536
## 8    AJ Finch       525
## 9   SPD Smith       467
## 10  DA Warner       391
## ..        ...       ...
m <-teamBatsmenPartnershiOppnAllMatches(aus_ind_matches,'India',report="summary")
m
## Source: local data frame [44 x 2]
## 
##         batsman totalRuns
##          (fctr)     (dbl)
## 1      MS Dhoni      1156
## 2     RG Sharma       918
## 3  SR Tendulkar       910
## 4       V Kohli       902
## 5     G Gambhir       536
## 6  Yuvraj Singh       524
## 7      SK Raina       509
## 8      S Dhawan       471
## 9      V Sehwag       289
## 10   RV Uthappa       283
## ..          ...       ...
m <-teamBatsmenPartnershiOppnAllMatches(aus_ind_matches,'Australia',report="detailed")
m <-teamBatsmenPartnershiOppnAllMatches(pak_sa_matches,'Pakistan',report="summary")
m
## Source: local data frame [40 x 2]
## 
##            batsman totalRuns
##             (fctr)     (dbl)
## 1    Misbah-ul-Haq       727
## 2      Younis Khan       657
## 3    Shahid Afridi       558
## 4  Mohammad Yousuf       539
## 5  Mohammad Hafeez       477
## 6     Shoaib Malik       452
## 7    Ahmed Shehzad       348
## 8     Abdul Razzaq       246
## 9     Kamran Akmal       241
## 10      Umar Akmal       215
## ..             ...       ...
m <-teamBatsmenPartnershiOppnAllMatches(eng_nz_matches,'England',report="summary")
m
## Source: local data frame [47 x 2]
## 
##           batsman totalRuns
##            (fctr)     (dbl)
## 1         IR Bell       654
## 2         JE Root       612
## 3  PD Collingwood       514
## 4      EJG Morgan       479
## 5         AN Cook       464
## 6       IJL Trott       362
## 7    KP Pietersen       358
## 8      JC Buttler       287
## 9         OA Shah       274
## 10      RS Bopara       222
## ..            ...       ...
m <-teamBatsmenPartnershiOppnAllMatches(sl_wi_matches,'Sri Lanka',report="summary")
m[1:50,]
## Source: local data frame [50 x 2]
## 
##             batsman totalRuns
##              (fctr)     (dbl)
## 1  DPMD Jayawardene       288
## 2     KC Sangakkara       238
## 3        TM Dilshan       224
## 4       WU Tharanga       220
## 5        AD Mathews       161
## 6     ST Jayasuriya       160
## 7       ML Udawatte        87
## 8   HDRL Thirimanne        67
## 9       MDKJ Perera        64
## 10    CK Kapugedera        57
## ..              ...       ...
m <- teamBatsmenPartnershiOppnAllMatches(ban_ire_matches,"Ireland",report="summary")
m
## Source: local data frame [16 x 2]
## 
##             batsman totalRuns
##              (fctr)     (dbl)
## 1   WTS Porterfield       111
## 2        KJ O'Brien        99
## 3        NJ O'Brien        75
## 4         GC Wilson        60
## 5          AR White        38
## 6       DT Johnston        36
## 7           JP Bray        31
## 8         JF Mooney        28
## 9          AC Botha        23
## 10         EC Joyce        16
## 11      PR Stirling        15
## 12      GH Dockrell         9
## 13        WB Rankin         9
## 14 D Langford-Smith         6
## 15       EJG Morgan         5
## 16        AR Cusack         0
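
To see what the summary report boils down to, here is a rough base R equivalent on a toy ball-by-ball data frame. It assumes the columns batsman, nonStriker and runs (the names seen in the outputs of this post); the data is made up and the package may well compute this differently.

```r
# Toy ball-by-ball data (made-up values)
toy <- data.frame(
  batsman    = c("P", "P", "Q", "Q", "P", "R"),
  nonStriker = c("Q", "Q", "P", "R", "R", "Q"),
  runs       = c(4, 1, 6, 2, 0, 1),
  stringsAsFactors = FALSE
)

# Total runs per batsman, highest first
totals <- aggregate(runs ~ batsman, data = toy, FUN = sum)
names(totals)[2] <- "totalRuns"
totals <- totals[order(-totals$totalRuns), ]
totals
```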

6. Team batsmen partnership (all matches with opposition)

This is plotted graphically in the charts below

teamBatsmenPartnershipOppnAllMatchesChart(aus_ind_matches,"India","Australia")

teamBatsmenPartnership-1

teamBatsmenPartnershipOppnAllMatchesChart(pak_sa_matches,main="South Africa",opposition="Pakistan")

teamBatsmenPartnership-2

m<- teamBatsmenPartnershipOppnAllMatchesChart(eng_nz_matches,"New Zealand",opposition="England",plot=FALSE)
m[1:30,]
##          batsman    nonStriker runs
## 1  KS Williamson   LRPL Taylor  354
## 2    BB McCullum    MJ Guptill  275
## 3    LRPL Taylor KS Williamson  273
## 4     MJ Guptill   BB McCullum  227
## 5    BB McCullum      JD Ryder  212
## 6     MJ Guptill KS Williamson  196
## 7  KS Williamson    MJ Guptill  179
## 8       JD Ryder   BB McCullum  175
## 9       JDP Oram     SB Styris  153
## 10   LRPL Taylor    GD Elliott  147
## 11    GD Elliott   LRPL Taylor  143
## 12   LRPL Taylor    MJ Guptill  140
## 13        JM How   BB McCullum  128
## 14    MJ Guptill   LRPL Taylor  125
## 15   BB McCullum        JM How  117
## 16   BB McCullum   LRPL Taylor  116
## 17     SB Styris      JDP Oram  100
## 18   LRPL Taylor        JM How   98
## 19        JM How   LRPL Taylor   98
## 20      JDP Oram   BB McCullum   84
## 21   LRPL Taylor     L Vincent   71
## 22      JDP Oram    DL Vettori   70
## 23   LRPL Taylor   BB McCullum   61
## 24     SB Styris        JM How   55
## 25      DR Flynn     SB Styris   54
## 26    DL Vettori      JDP Oram   53
## 27     L Vincent   LRPL Taylor   53
## 28    MJ Santner   LRPL Taylor   53
## 29    SP Fleming     L Vincent   52
## 30        JM How     SB Styris   50
teamBatsmenPartnershipOppnAllMatchesChart(sl_wi_matches,"Sri Lanka","West Indies")

teamBatsmenPartnership-3

teamBatsmenPartnershipOppnAllMatchesChart(ban_ire_matches,"Bangladesh","Ireland")

teamBatsmenPartnership-4
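
The data behind such a chart is essentially a batsman by non-striker table of runs. A minimal sketch with xtabs on made-up data, assuming the column names batsman, nonStriker and runs shown in the output above (this is an illustration, not the package's code):

```r
# Toy ball-by-ball data (made-up values)
toy <- data.frame(
  batsman    = c("P", "P", "Q", "Q", "P", "R"),
  nonStriker = c("Q", "Q", "P", "R", "R", "Q"),
  runs       = c(4, 1, 6, 2, 0, 1),
  stringsAsFactors = FALSE
)

# Runs scored by each batsman, split by partner at the other end
partnerships <- xtabs(runs ~ batsman + nonStriker, data = toy)
partnerships
```

Each row of this table corresponds to one bar of a stacked bar chart, with the partners as the segments.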

7. Team batsmen versus bowler (all matches with opposition)

The plots below provide information on how each of the top batsmen fared against the opposition bowlers

teamBatsmenVsBowlersOppnAllMatches(aus_ind_matches,"India","Australia")

batsmenvsBowler-1

teamBatsmenVsBowlersOppnAllMatches(pak_sa_matches,"South Africa","Pakistan",top=3)

batsmenvsBowler-2

m <- teamBatsmenVsBowlersOppnAllMatches(eng_nz_matches,"England","New Zealand",top=10,plot=FALSE)
m
## Source: local data frame [157 x 3]
## Groups: batsman [1]
## 
##    batsman       bowler  runs
##     (fctr)       (fctr) (dbl)
## 1  IR Bell JEC Franklin    63
## 2  IR Bell      SE Bond    13
## 3  IR Bell MR Gillespie    33
## 4  IR Bell     NJ Astle     0
## 5  IR Bell     JS Patel    20
## 6  IR Bell   DL Vettori    28
## 7  IR Bell     JDP Oram    48
## 8  IR Bell    SB Styris    12
## 9  IR Bell     KD Mills   124
## 10 IR Bell   TG Southee    84
## ..     ...          ...   ...
teamBatsmenVsBowlersOppnAllMatches(sl_wi_matches,"Sri Lanka","West Indies")

batsmenvsBowler-3

teamBatsmenVsBowlersOppnAllMatches(ban_ire_matches,"Bangladesh","Ireland")

batsmenvsBowler-4
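
The top parameter limits the analysis to the highest scoring batsmen. The sketch below shows one way this could work in base R: rank batsmen by total runs, keep the top N, then total the runs against each bowler. The toy data and the variable names are hypothetical, and the package's own logic may differ.

```r
# Toy ball-by-ball data (made-up values)
toy <- data.frame(
  batsman = c("P", "P", "Q", "Q", "R", "P"),
  bowler  = c("X", "Y", "X", "X", "Y", "X"),
  runs    = c(4, 2, 1, 3, 1, 6),
  stringsAsFactors = FALSE
)

top <- 2
totals <- tapply(toy$runs, toy$batsman, sum)
topBatsmen <- names(sort(totals, decreasing = TRUE))[seq_len(top)]

# Runs by each of the top batsmen against each bowler
sub <- toy[toy$batsman %in% topBatsmen, ]
vs <- aggregate(runs ~ batsman + bowler, data = sub, FUN = sum)
vs
```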

8. Team batting scorecard (all matches with opposition)

The following tables give the overall performances of the country’s batsmen against the opposition. For India-Australia matches Dhoni, Rohit Sharma and Tendulkar lead the way. For Australia it is Ricky Ponting, M Hussey and GJ Bailey. In South Africa-Pakistan matches it is AB de Villiers, Hashim Amla etc.

a <-teamBattingScorecardOppnAllMatches(aus_ind_matches,main="India",opposition="Australia")
## Total= 8331
a
## Source: local data frame [44 x 5]
## 
##         batsman ballsPlayed fours sixes  runs
##          (fctr)       (int) (int) (int) (dbl)
## 1      MS Dhoni        1406    78    22  1156
## 2     RG Sharma        1015    73    24   918
## 3  SR Tendulkar        1157   103     6   910
## 4       V Kohli         961    87     6   902
## 5     G Gambhir         677    44     2   536
## 6  Yuvraj Singh         664    52    11   524
## 7      SK Raina         536    43    11   509
## 8      S Dhawan         470    55     6   471
## 9      V Sehwag         305    42     4   289
## 10   RV Uthappa         295    29     7   283
## ..          ...         ...   ...   ...   ...
teamBattingScorecardOppnAllMatches(aus_ind_matches,"Australia","India")
## Total= 9995
## Source: local data frame [47 x 5]
## 
##       batsman ballsPlayed fours sixes  runs
##        (fctr)       (int) (int) (int) (dbl)
## 1  RT Ponting        1107    86     8   876
## 2  MEK Hussey         816    56     5   753
## 3   GJ Bailey         578    51    13   614
## 4   SR Watson         653    81    10   609
## 5   MJ Clarke         786    45     5   607
## 6   ML Hayden         660    72     8   573
## 7   A Symonds         543    43    15   536
## 8    AJ Finch         617    52     9   525
## 9   SPD Smith         431    44     7   467
## 10  DA Warner         385    40     6   391
## ..        ...         ...   ...   ...   ...
teamBattingScorecardOppnAllMatches(pak_sa_matches,"South Africa","Pakistan")
## Total= 6657
## Source: local data frame [36 x 5]
## 
##           batsman ballsPlayed fours sixes  runs
##            (fctr)       (int) (int) (int) (dbl)
## 1  AB de Villiers        1533   128    23  1423
## 2         HM Amla         864    88     3   815
## 3        GC Smith         726    68     3   597
## 4       JH Kallis         710    40     8   543
## 5       JP Duminy         620    35     3   481
## 6       CA Ingram         388    32     1   305
## 7    F du Plessis         363    30     4   278
## 8       Q de Kock         336    28     2   270
## 9       DA Miller         329    20     2   250
## 10       HH Gibbs         252    33     2   228
## ..            ...         ...   ...   ...   ...
teamBattingScorecardOppnAllMatches(sl_wi_matches,"West Indies","Sri Lanka")
## Total= 1800
## Source: local data frame [36 x 5]
## 
##          batsman ballsPlayed fours sixes  runs
##           (fctr)       (int) (int) (int) (dbl)
## 1       DM Bravo         353    20     6   265
## 2      RR Sarwan         315    11     3   205
## 3     MN Samuels         209    19     5   188
## 4       CH Gayle         198    18     8   176
## 5  S Chanderpaul         181     6     7   152
## 6      AB Barath         162     9     2   125
## 7       DJ Bravo         139     7     2   102
## 8       CS Baugh         102     5    NA    78
## 9    LMP Simmons          78     5     4    67
## 10     JO Holder          33     5     3    55
## ..           ...         ...   ...   ...   ...
teamBattingScorecardOppnAllMatches(eng_nz_matches,"England","New Zealand")
## Total= 6472
## Source: local data frame [47 x 5]
## 
##           batsman ballsPlayed fours sixes  runs
##            (fctr)       (int) (int) (int) (dbl)
## 1         IR Bell         871    74     7   654
## 2         JE Root         651    54     5   612
## 3  PD Collingwood         619    34    15   514
## 4      EJG Morgan         445    35    22   479
## 5         AN Cook         616    49     3   464
## 6       IJL Trott         421    26     1   362
## 7    KP Pietersen         481    30     6   358
## 8      JC Buttler         199    28    11   287
## 9         OA Shah         323    17     6   274
## 10      RS Bopara         350    21    NA   222
## ..            ...         ...   ...   ...   ...
teamBatsmenPartnershiOppnAllMatches(sco_can_matches,"Scotland","Canada")
## Source: local data frame [20 x 2]
## 
##          batsman totalRuns
##           (fctr)     (dbl)
## 1     CS MacLeod       177
## 2      MW Machan        68
## 3      CJO Smith        43
## 4    FRJ Coleman        40
## 5      RR Watson        14
## 6     JH Stander        12
## 7       MA Leask        12
## 8     RML Taylor        10
## 9     KJ Coetzer         8
## 10   GM Hamilton         7
## 11        RM Haq         7
## 12    PL Mommsen         6
## 13     CM Wright         5
## 14        JD Nel         5
## 15      MH Cross         4
## 16     SM Sharif         4
## 17     JAR Blain         2
## 18  NFI McCallum         1
## 19 RD Berrington         1
## 20     NS Poonia         0
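
A scorecard of this shape can be approximated directly from the ball-by-ball rows. The sketch below is a simplification on made-up data: it treats every row as a ball faced (ignoring wides and other extras) and assumes the columns batsman and runs. It is not the package's implementation.

```r
# Toy ball-by-ball data (made-up values)
toy <- data.frame(
  batsman = c("P", "P", "P", "Q", "Q"),
  runs    = c(4, 6, 1, 4, 0),
  stringsAsFactors = FALSE
)

# Group the runs by batsman and count balls, boundaries and total runs
byBatsman <- split(toy$runs, toy$batsman)
card <- data.frame(
  batsman     = names(byBatsman),
  ballsPlayed = sapply(byBatsman, length),
  fours       = sapply(byBatsman, function(r) sum(r == 4)),
  sixes       = sapply(byBatsman, function(r) sum(r == 6)),
  runs        = sapply(byBatsman, sum),
  row.names   = NULL
)
card <- card[order(-card$runs), ]
card
```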

9. Team performances of bowlers (all matches with opposition)

Like the function above, the following tables provide the top bowlers of the countries in the matches against the opposition. In India-Australia matches Ishant Sharma leads, in Pakistan-South Africa matches Shahid Afridi tops and so on.

teamBowlingPerfOppnAllMatches(aus_ind_matches,"India","Australia")
## Source: local data frame [36 x 5]
## 
##             bowler overs maidens  runs wickets
##             (fctr) (int)   (int) (dbl)   (dbl)
## 1         I Sharma    44       1   739      20
## 2  Harbhajan Singh    40       0   926      15
## 3        RA Jadeja    39       0   867      14
## 4        IK Pathan    42       1   702      11
## 5         UT Yadav    37       2   606      10
## 6          P Kumar    27       0   501      10
## 7           Z Khan    33       1   500      10
## 8      S Sreesanth    34       0   454      10
## 9         R Ashwin    43       0   684       9
## 10   R Vinay Kumar    31       1   380       9
## ..             ...   ...     ...   ...     ...
teamBowlingPerfOppnAllMatches(pak_sa_matches,main="Pakistan",opposition="South Africa")
## Source: local data frame [24 x 5]
## 
##             bowler overs maidens  runs wickets
##             (fctr) (int)   (int) (dbl)   (dbl)
## 1    Shahid Afridi    38       0  1053      17
## 2      Saeed Ajmal    39       0   658      14
## 3  Mohammad Hafeez    38       0   774      13
## 4   Mohammad Irfan    29       0   467      13
## 5   Iftikhar Anjum    29       1   257      12
## 6       Wahab Riaz    31       0   534      11
## 7      Junaid Khan    32       0   429      10
## 8    Sohail Tanvir    26       1   409       9
## 9    Shoaib Akhtar    22       1   313       9
## 10        Umar Gul    25       2   365       7
## ..             ...   ...     ...   ...     ...
teamBowlingPerfOppnAllMatches(eng_nz_matches,"New Zealand","England")
## Source: local data frame [33 x 5]
## 
##            bowler overs maidens  runs wickets
##            (fctr) (int)   (int) (dbl)   (dbl)
## 1      TG Southee    40       0   684      19
## 2        KD Mills    36       1   742      17
## 3      DL Vettori    35       0   561      16
## 4  MJ McClenaghan    34       0   515      14
## 5         SE Bond    17       1   205      11
## 6      GD Elliott    20       0   194      10
## 7    JEC Franklin    24       0   418       7
## 8   KS Williamson    21       1   225       7
## 9        TA Boult    18       2   195       7
## 10    NL McCullum    30       0   425       6
## ..            ...   ...     ...   ...     ...
teamBowlingPerfOppnAllMatches(sl_wi_matches,"Sri Lanka","West Indies")
## Source: local data frame [24 x 5]
## 
##             bowler overs maidens  runs wickets
##             (fctr) (int)   (int) (dbl)   (dbl)
## 1       SL Malinga    28       1   280      11
## 2       BAW Mendis    15       0   267       8
## 3  KMDN Kulasekara    13       1   185       7
## 4       AD Mathews    14       0   191       6
## 5   M Muralitharan    20       1   157       6
## 6      MF Maharoof     9       2    14       6
## 7       WPUJC Vaas     7       2    82       5
## 8       RAS Lakmal     7       0    55       4
## 9    ST Jayasuriya     1       0    38       4
## 10    HMRKB Herath    10       1   124       3
## ..             ...   ...     ...   ...     ...
teamBowlingPerfOppnAllMatches(ken_ber_matches,"Kenya","Bermuda")
## Source: local data frame [9 x 5]
## 
##        bowler overs maidens  runs wickets
##        (fctr) (int)   (int) (dbl)   (dbl)
## 1  JK Kamande    16       0   122       5
## 2  HA Varaiya    13       1    64       5
## 3   AS Luseno     6       0    32       4
## 4  PJ Ongondo     7       0    39       3
## 5    TM Odoyo     7       0    36       3
## 6  LN Onyango     7       0    37       2
## 7   SO Tikolo    18       0    81       1
## 8 NN Odhiambo    14       1    76       1
## 9    CO Obuya     4       0    20       0
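
For those who want to roll their own bowling summary, here is a rough sketch on made-up data. It assumes a wicketKind column in which deliveries without a dismissal are marked "not-out" (my assumption about the data, so do check your dataframe), and it follows the usual convention of not crediting run outs to the bowler.

```r
# Toy ball-by-ball data (made-up values; "not-out" marker is an assumption)
toy <- data.frame(
  bowler     = c("X", "X", "X", "Y", "Y", "Y"),
  runs       = c(1, 4, 0, 0, 2, 0),
  wicketKind = c("not-out", "not-out", "bowled", "run out", "not-out", "caught"),
  stringsAsFactors = FALSE
)

# A delivery counts as a bowler's wicket unless it is not-out or a run out
isWicket <- !(toy$wicketKind %in% c("not-out", "run out"))

perf <- data.frame(
  bowler  = sort(unique(toy$bowler)),
  runs    = as.vector(tapply(toy$runs, toy$bowler, sum)),
  wickets = as.vector(tapply(isWicket, toy$bowler, sum))
)
perf <- perf[order(-perf$wickets, perf$runs), ]
perf
```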

10. Team bowler’s wickets (all matches with opposition)

This provides a graphical plot of the tables above

teamBowlersWicketsOppnAllMatches(aus_ind_matches,"India","Australia")

bowlerWicketsOppn-1

teamBowlersWicketsOppnAllMatches(aus_ind_matches,"Australia","India")

bowlerWicketsOppn-2

teamBowlersWicketsOppnAllMatches(pak_sa_matches,"South Africa","Pakistan",top=10)

bowlerWicketsOppn-3

m <-teamBowlersWicketsOppnAllMatches(eng_nz_matches,"England","New Zealand",plot=FALSE)
m
## Source: local data frame [20 x 2]
## 
##            bowler wickets
##            (fctr)   (int)
## 1     JM Anderson      20
## 2       SCJ Broad      13
## 3         ST Finn      12
## 4  PD Collingwood      11
## 5        GP Swann      10
## 6   RJ Sidebottom       8
## 7       CR Woakes       8
## 8      A Flintoff       7
## 9     LE Plunkett       6
## 10      AU Rashid       6
## 11      BA Stokes       6
## 12     MS Panesar       5
## 13      LJ Wright       4
## 14     TT Bresnan       4
## 15      DJ Willey       4
## 16    JC Tredwell       3
## 17    CT Tremlett       2
## 18      RS Bopara       2
## 19      CJ Jordan       2
## 20        J Lewis       1
teamBowlersWicketsOppnAllMatches(ban_ire_matches,"Bangladesh","Ireland",top=7)

bowlerWicketsOppn-4

11. Team bowler vs batsmen (all matches with opposition)

These plots show how the bowlers fared against the batsmen. They show which of the opposing team’s batsmen were able to score the most runs

teamBowlersVsBatsmenOppnAllMatches(aus_ind_matches,'India',"Australia",top=5)

bowlerVsBatsmen-1

teamBowlersVsBatsmenOppnAllMatches(pak_sa_matches,"Pakistan","South Africa",top=3)

bowlerVsBatsmen-2

teamBowlersVsBatsmenOppnAllMatches(eng_nz_matches,"England","New Zealand")

bowlerVsBatsmen-3

teamBowlersVsBatsmenOppnAllMatches(eng_nz_matches,"New Zealand","England")

bowlerVsBatsmen-4

12. Team bowler’s wicket kind (caught,bowled,etc) (all matches with opposition)

The charts below show the wicket kind taken by the bowler (caught, bowled, lbw etc)

teamBowlersWicketKindOppnAllMatches(aus_ind_matches,"India","Australia",plot=TRUE)

bowlerWickets-1

m <- teamBowlersWicketKindOppnAllMatches(aus_ind_matches,"Australia","India",plot=FALSE)
m[1:30,]
##        bowler        wicketKind wicketPlayerOut runs
## 1  GD McGrath            caught    SR Tendulkar   69
## 2   SR Watson            caught        D Mongia  532
## 3  MG Johnson               lbw        V Sehwag 1020
## 4       B Lee            caught        R Dravid  671
## 5       B Lee            bowled          M Kaif  671
## 6  NW Bracken            caught        SK Raina  429
## 7  GD McGrath            caught       IK Pathan   69
## 8  NW Bracken               lbw        MS Dhoni  429
## 9  MG Johnson               lbw    SR Tendulkar 1020
## 10 MG Johnson            bowled       G Gambhir 1020
## 11   SR Clark            caught    SR Tendulkar  254
## 12   JR Hopes            caught    Yuvraj Singh  346
## 13   SR Clark               lbw      RV Uthappa  254
## 14    GB Hogg            caught        R Dravid  427
## 15  MJ Clarke           run out       IK Pathan  212
## 16  MJ Clarke           stumped Harbhajan Singh  212
## 17  MJ Clarke            bowled        RR Powar  212
## 18    GB Hogg            caught          Z Khan  427
## 19    GB Hogg            caught        MS Dhoni  427
## 20      B Lee               lbw       G Gambhir  671
## 21 MG Johnson               lbw      RV Uthappa 1020
## 22      B Lee            caught        R Dravid  671
## 23    GB Hogg            bowled    SR Tendulkar  427
## 24      B Lee            caught        MS Dhoni  671
## 25   JR Hopes            caught       RG Sharma  346
## 26    GB Hogg               lbw       IK Pathan  427
## 27 MG Johnson            bowled    Yuvraj Singh 1020
## 28    GB Hogg caught and bowled          Z Khan  427
## 29   SR Clark            bowled     S Sreesanth  254
## 30   JR Hopes            caught      SC Ganguly  346
teamBowlersWicketKindOppnAllMatches(sl_wi_matches,"Sri Lanka",'West Indies',plot=TRUE)

bowlerWickets-2
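
When plot=FALSE returns one row per dismissal as above, a quick cross-tabulation gives the counts behind the chart. A minimal sketch on made-up rows, using the column names bowler and wicketKind from the output above:

```r
# Toy dismissal rows (made-up values)
out <- data.frame(
  bowler     = c("X", "X", "Y", "Y", "Y"),
  wicketKind = c("caught", "bowled", "caught", "caught", "lbw"),
  stringsAsFactors = FALSE
)

# Number of dismissals of each kind per bowler
kinds <- table(out$bowler, out$wicketKind)
kinds
```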

13. Team bowler’s wicket taken and runs conceded (all matches with opposition)

teamBowlersWicketRunsOppnAllMatches(aus_ind_matches,"India","Australia")

wicketRuns-1

m <-teamBowlersWicketRunsOppnAllMatches(pak_sa_matches,"Pakistan","South Africa",plot=FALSE)
m[1:30,]
## Source: local data frame [30 x 5]
## 
##             bowler overs maidens  runs wickets
##             (fctr) (int)   (int) (dbl)   (dbl)
## 1         Umar Gul    25       2   365       7
## 2   Iftikhar Anjum    29       1   257      12
## 3     Yasir Arafat     5       0    33       1
## 4     Abdul Razzaq    16       0   290       4
## 5  Mohammad Hafeez    38       0   774      13
## 6    Shahid Afridi    38       0  1053      17
## 7     Shoaib Malik    18       0   219       4
## 8    Sohail Tanvir    26       1   409       9
## 9     Abdur Rehman    25       0   301       4
## 10   Mohammad Asif    10       1   204       2
## ..             ...   ...     ...   ...     ...

14. Plot of wins vs losses between teams.

setwd("C:/software/cricket-package/york-test/yorkrData/ODI/ODI-matches")
plotWinLossBetweenTeams("India","Sri Lanka")

winsLosses-1

plotWinLossBetweenTeams('Pakistan',"South Africa",".")

winsLosses-2

plotWinLossBetweenTeams('England',"New Zealand",".")

winsLosses-3

plotWinLossBetweenTeams("Australia","West Indies",".")

winsLosses-4

plotWinLossBetweenTeams('Bangladesh',"Zimbabwe",".")

winsLosses-5

plotWinLossBetweenTeams('Scotland',"Ireland",".")

winsLosses-6
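
If you have already combined the matches into a single dataframe, the win counts can also be derived by hand. The sketch below assumes that each row carries date and winner columns (an assumption on my part about the dataframe layout), and uses made-up data.

```r
# Toy ball-by-ball rows; every row repeats the match date and winner
toy <- data.frame(
  date   = c("2012-02-12", "2012-02-12", "2013-10-13", "2013-10-13", "2013-10-16"),
  winner = c("India", "India", "Australia", "Australia", "India"),
  stringsAsFactors = FALSE
)

# Collapse to one row per match, then count wins per team
perMatch <- unique(toy[, c("date", "winner")])
wins <- table(perMatch$winner)
wins
```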

Conclusion

This post included all functions for all matches between any 2 opposing countries. As before the data frames are already available. You can load the data and begin to use them. If more insights from the dataframe are possible do go ahead. But please do attribute the source to Cricsheet (http://cricsheet.org), my package yorkr and my blog. Do give the functions a spin for yourself.

There are 2 more posts required for the introduction of my yorkr package. So, Hasta la vista, baby! I’ll be back!

Important note: Do check out my other posts using yorkr at yorkr-posts

You may also like

  1. Introducing cricketr! : An R package to analyze performances of cricketers
  2. Cricket analytics with cricketr
  3. cricketr adapts to the Twenty20 International!
  4. The making of Total Control Android game
  5. De-blurring revisited with Wiener filter using OpenCV
  6. Rock N’ Roll with Bluemix, Cloudant & NodeExpress

Introducing cricket package yorkr: Part 1- Beaten by sheer pace!

“We need to regard statistical intuition with proper suspicion and replace impression formation by computation wherever possible”

“We are pattern seekers, believers in a coherent world”

“The hot hand is entirely in the eyes of the beholders, who are consistently too quick to perceive order and causality in randomness. The hot hand is a massive and widespread cognitive illusion”

                   "Thinking, Fast and Slow - Daniel Kahneman"

Introduction

Yorker (noun) :A yorker is a bowling delivery in cricket, that pitches at or around the batsman’s toes. Also known as ‘toe crusher’

My package ‘yorkr’ is now available on CRAN. This package is based on data from Cricsheet. Cricsheet has the data of ODIs, Tests, Twenty20 and IPL matches as yaml files. The yorkr package provides functions to convert the yaml files to entities more easily consumed in R, namely dataframes. In fact all ODI matches have already been converted and are available for use at yorkrData. However as future matches are added to Cricsheet, you will have to convert the match files yourself. More details below.

This post can be viewed at RPubs at yorkr-Part1 or can also be downloaded as a PDF document yorkr-1.pdf

Checkout my interactive Shiny apps GooglyPlus2021 (interactive plots ) and GooglyPlusPlus2021 (analysis in specific intervals) which can be used to analyze IPL players, teams and matches.

Important note: Do check out the python avatar of cricketr, ‘cricpy’, in my post ‘Introducing cricpy:A python package to analyze performances of cricketers’

Important note 1: Do check out all the posts on the python avatar of yorkr, namely ‘yorkpy’, in my post ‘Pitching yorkpy … short of good length to IPL – Part 1’

1. First things first

  1. yorkr currently has a total of 70 functions. I have intentionally avoided abbreviating function names by dropping vowels, as is the usual practice in coding, because the resulting abbreviated names would be very difficult to remember and use. So instead of naming a function tmBmenPrtshpOppnAllMtches(), I have used the longer form, e.g. teamBatsmenPartnershipOppnAllmatches(), which is much clearer and more intuitive. Moreover, RStudio prompts the different functions which have the same prefix, so one does not need to type in the entire function name.
  2. The package yorkr has 4 classes of functions
  • Class 1- Team performances in a match
  • Class 2- Team performances in all matches against a single opposition (e.g. all matches of India vs Australia or all matches of England vs Pakistan etc.)
  • Class 3- Team performance in all matches against all Opposition (India vs All,Pakistan vs All etc.)
  • Class 4- Individual performances of batsmen and bowlers

In this post I will be looking into Class 1 functions, namely the performances of opposing teams in a single match

The list of functions are

  1. teamBattingScorecardMatch()
  2. teamBatsmenPartnershipMatch()
  3. teamBatsmenVsBowlersMatch()
  4. teamBowlingScorecardMatch()
  5. teamBowlingWicketKindMatch()
  6. teamBowlingWicketRunsMatch()
  7. teamBowlingWicketRunsMatch()
  8. teamBowlingWicketMatch()
  9. teamBowlersVsBatsmenMatch()
  10. matchWormGraph()

2. Install the package from CRAN

library(yorkr)
rm(list=ls())

3. Convert and save yaml file to dataframe

This function will convert a yaml file in the format specified by Cricsheet to a dataframe. This will be saved as an RData file in the target directory. The name of the file will have the format team1-team2-date.RData. This is seen below.

convertYaml2RDataframe("225171.yaml","./source","./data")
## [1] "./source/225171.yaml"
## [1] "first loop"
## [1] "second loop"
setwd("./data")
dir()
## [1] "Australia-India-2012-02-12.RData"      
## [2] "Bangladesh-Zimbabwe-2009-10-27.RData"  
## [3] "convertedFiles.txt"                    
## [4] "England-New Zealand-2007-01-30.RData"  
## [5] "Ireland-England-2006-06-13.RData"      
## [6] "Pakistan-South Africa-2013-11-08.RData"
## [7] "Sri Lanka-West Indies-2011-02-06.RData"
setwd("..")

4. Convert and save all yaml files to dataframes

This function will convert all yaml files from a source directory to dataframes and save it in the target directory with the names as mentioned above.

convertAllYaml2RDataframes("./source",targetDirMen=".",targetDirWomen=".")
## [1] 1
## i= 1   file= ./source/225171.yaml 
## [1] "first loop"
## [1] "second loop"
## [1] 633  25

5. yorkrData – A Github repository

Cricsheet has ODI matches from 2006. There are a total of 1167 ODI matches (files), out of which 34 yaml files had format problems and were skipped. I have already converted the remaining 1133 yaml files in the ODI directory of Cricsheet to dataframes and saved them as RData, so they are available for use. All the converted RData files can be accessed from my Github link yorkrData under the folder ODI-matches. You will need to use the functions to convert new match files as they are added to Cricsheet. There is also a file named ‘convertedFiles’ which has the name of the original file and the converted file, as below

convertedFiles

  • 225171.yaml:Ireland-England-2006-06-13.RData
  • 225245.yaml:England-Pakistan-2006-08-30.RData
  • 225246.yaml:England-Pakistan-2006-09-02.RData …
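A minimal sketch of how the convertedFiles entries could be turned into a lookup table is given below. It assumes each entry has the form yamlName:RDataName, as in the listing above. In practice the entries could be read with readLines() from the convertedFiles.txt file seen in the directory listing earlier, which I assume has one entry per line.

```r
# Three entries copied from the convertedFiles listing above
entries <- c("225171.yaml:Ireland-England-2006-06-13.RData",
             "225245.yaml:England-Pakistan-2006-08-30.RData",
             "225246.yaml:England-Pakistan-2006-09-02.RData")

# Split each entry at the colon into the yaml name and the RData name
parts <- strsplit(entries, ":", fixed = TRUE)
lookup <- data.frame(yaml  = sapply(parts, `[`, 1),
                     rdata = sapply(parts, `[`, 2),
                     stringsAsFactors = FALSE)

# Which RData file did 225245.yaml become?
print(lookup$rdata[lookup$yaml == "225245.yaml"])
```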

You can download the zip of the files and use them directly in the functions, as shown in the sections below

Note 1: The package in its current form handles ODIs,T20s and IPL T20 matches

Note 2: The link to the converted data frames has been provided above. The dataframes are around 600 rows x 25 columns. In this post I have created functions that analyze team performances in a match. However you are free to slice and dice the dataframe in any way you like. If you do come up with interesting analyses, please do attribute the source of the data to Cricsheet, and my package yorkr and my blog. I would appreciate it if you could send me a note.

6. Load the match data as dataframes

As mentioned above, in this post I will be using the functions from Class 1. I will use the match data from 5 random matches between 10 different opposing teams/countries. For this I will directly load the converted RData files rather than getting the data through getMatchDetails().

With the RData we can load the data in 2 ways

A. With getMatchDetails()

  1. With getMatchDetails() using the 2 teams and the date on which the match occurred
aus_ind <- getMatchDetails("Australia","India","2012-02-12",dir="./data")

or

B. Directly load RData into your code.

The match details will be loaded into a dataframe called ‘overs’, which you can assign to a suitable name as below

The randomly selected matches are

  • Australia vs India – 2012-02-12, Adelaide
  • England vs New Zealand – 2007-01-30, Perth
  • Pakistan vs South Africa – 2013-11-08, UAE
  • Sri Lanka vs West Indies – 2011-02-06, Colombo (SSC)
  • Bangladesh vs Zimbabwe – 2009-10-27, Dhaka

Directly load RData from file

load("./data/Australia-India-2012-02-12.RData")
aus_ind <- overs
load("./data/England-New Zealand-2007-01-30.RData")
eng_nz <- overs
load("./data/Pakistan-South Africa-2013-11-08.RData")
pak_sa <- overs
load("./data/Sri Lanka-West Indies-2011-02-06.RData")
sl_wi<- overs
load("./data/Bangladesh-Zimbabwe-2009-10-27.RData")
ban_zim <- overs
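Since every load() call above overwrites the same ‘overs’ variable, one alternative is to load each file into its own environment. The sketch below is only an illustration: loadMatch() is a hypothetical helper, not part of yorkr, and the demo uses a tiny stand-in data frame saved to a temporary file rather than a real match file.

```r
# Hypothetical helper (not part of yorkr): load one match without
# creating or overwriting 'overs' in the global workspace
loadMatch <- function(path) {
  env <- new.env()
  load(path, envir = env)      # 'overs' is created inside env only
  get("overs", envir = env)
}

# Self-contained demo: save a tiny stand-in data frame named 'overs'
overs <- data.frame(batsman = c("G Gambhir", "V Sehwag"), runs = c(92, 20))
tmp <- file.path(tempdir(), "Demo-Match.RData")
save(overs, file = tmp)
rm(overs)

demo <- loadMatch(tmp)
print(demo)
```

With the real files this would read, for example, aus_ind <- loadMatch("./data/Australia-India-2012-02-12.RData").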

7. Team batting scorecard

Compute and display the batting scorecard of the teams in the match. The top batsmen are G Gambhir (Ind), DJ Hussey (Aus), Q de Kock (SA) and KC Sangakkara (SL).

teamBattingScorecardMatch(aus_ind,'India')
## Total= 258
## Source: local data frame [8 x 5]
## 
##     batsman ballsPlayed fours sixes  runs
##      (fctr)       (int) (dbl) (dbl) (dbl)
## 1 G Gambhir         110     7     0    92
## 2  V Sehwag          20     3     0    20
## 3   V Kohli          28     1     0    18
## 4 RG Sharma          41     1     1    33
## 5  SK Raina          30     3     1    38
## 6  MS Dhoni          57     0     1    44
## 7 RA Jadeja           8     0     0    12
## 8  R Ashwin           2     0     0     1
teamBattingScorecardMatch(aus_ind,'Australia')
## Total= 260
## Source: local data frame [9 x 5]
## 
##        batsman ballsPlayed fours sixes  runs
##         (fctr)       (int) (dbl) (dbl) (dbl)
## 1    DA Warner          23     2     0    18
## 2   RT Ponting          13     1     0     6
## 3    MJ Clarke          43     5     0    38
## 4   PJ Forrest          83     5     2    66
## 5    DJ Hussey          76     5     0    72
## 6 DT Christian          36     2     0    39
## 7      MS Wade          17     1     0    16
## 8    RJ Harris           2     0     0     2
## 9     CJ McKay           3     0     0     3
teamBattingScorecardMatch(pak_sa,'South Africa')
## Total= 256
## Source: local data frame [7 x 5]
## 
##          batsman ballsPlayed fours sixes  runs
##           (fctr)       (int) (dbl) (dbl) (dbl)
## 1      Q de Kock         132     9     1   112
## 2        HM Amla          50     6     0    46
## 3   F du Plessis          21     1     0    10
## 4 AB de Villiers          40     2     0    30
## 5      DA Miller           9     0     0     5
## 6      JP Duminy          20     1     1    25
## 7      R McLaren          21     3     1    28
teamBattingScorecardMatch(sl_wi,'Sri Lanka')
## Total= 261
## Source: local data frame [10 x 5]
## 
##             batsman ballsPlayed fours sixes  runs
##              (fctr)       (int) (dbl) (dbl) (dbl)
## 1       WU Tharanga          50     5     0    39
## 2        TM Dilshan          27     2     1    30
## 3     KC Sangakkara         103     4     1    75
## 4  DPMD Jayawardene          52     2     0    44
## 5     CK Kapugedera          17     0     0    17
## 6    TT Samaraweera           7     0     0     4
## 7       NLTC Perera           8     0     0     6
## 8        AD Mathews          22     1     1    36
## 9      HMRKB Herath           4     0     0     2
## 10       BAW Mendis           6     1     0     8
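Because the scorecard is an ordinary data frame, it is easy to derive further columns from it. A small sketch, using the India scorecard values printed above (typed in by hand here so that the example runs on its own), adds the strike rate (runs per 100 balls) and picks the top scorer.

```r
# India scorecard from the Australia-India match above (values copied by hand)
india <- data.frame(
  batsman     = c("G Gambhir", "V Sehwag", "V Kohli", "RG Sharma",
                  "SK Raina", "MS Dhoni", "RA Jadeja", "R Ashwin"),
  ballsPlayed = c(110, 20, 28, 41, 30, 57, 8, 2),
  runs        = c(92, 20, 18, 33, 38, 44, 12, 1),
  stringsAsFactors = FALSE
)

# Strike rate = runs scored per 100 balls faced
india$strikeRate <- round(100 * india$runs / india$ballsPlayed, 1)

# Sort by runs, highest first
print(india[order(-india$runs), ])

topScorer <- india$batsman[which.max(india$runs)]
print(topScorer)
```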

8. Plot the team batting partnerships

The functions below plot the team batting partnerships in the match.

Note: Many of the plots include an additional parameter, plot, which is either TRUE or FALSE. The default value is plot=TRUE. When plot=TRUE the plot will be displayed. When plot=FALSE the data frame will be returned to the user, who can use it to create an interactive chart with one of the packages like rCharts, ggvis, googleVis or plotly.

teamBatsmenPartnershipMatch(pak_sa,"Pakistan","South Africa")

batsmenPartnership-1

teamBatsmenPartnershipMatch(eng_nz,"New Zealand","England",plot=TRUE)

batsmenPartnership-2

teamBatsmenPartnershipMatch(ban_zim,"Bangladesh","Zimbabwe",plot=FALSE)
##              batsman        nonStriker runs
## 1        Tamim Iqbal   Junaid Siddique    0
## 2        Tamim Iqbal Mohammad Ashraful    5
## 3    Junaid Siddique       Tamim Iqbal    0
## 4  Mohammad Ashraful       Tamim Iqbal    0
## 5  Mohammad Ashraful     Raqibul Hasan   20
## 6      Raqibul Hasan Mohammad Ashraful   13
## 7      Raqibul Hasan   Shakib Al Hasan    3
## 8    Shakib Al Hasan     Raqibul Hasan   12
## 9    Shakib Al Hasan   Mushfiqur Rahim    1
## 10   Mushfiqur Rahim   Shakib Al Hasan    1
## 11   Mushfiqur Rahim       Naeem Islam   30
## 12   Mushfiqur Rahim      Abdur Razzak    6
## 13   Mushfiqur Rahim      Dolar Mahmud   11
## 14   Mushfiqur Rahim     Rubel Hossain    8
## 15       Mahmudullah   Mushfiqur Rahim    4
## 16       Naeem Islam   Mushfiqur Rahim   21
## 17      Abdur Razzak   Mushfiqur Rahim    3
## 18      Dolar Mahmud   Mushfiqur Rahim   41
teamBatsmenPartnershipMatch(aus_ind,"India","Australia", plot=TRUE)

batsmenPartnership-3
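As an example of what can be done with the data frame returned when plot=FALSE, the sketch below totals the partnership runs for each Bangladesh batsman. The rows are copied by hand from the output above so that the code runs without the match file.

```r
# Rows copied from the Bangladesh partnership output above (plot=FALSE)
partnerships <- data.frame(
  batsman = c("Tamim Iqbal", "Tamim Iqbal", "Junaid Siddique",
              "Mohammad Ashraful", "Mohammad Ashraful",
              "Raqibul Hasan", "Raqibul Hasan",
              "Shakib Al Hasan", "Shakib Al Hasan",
              rep("Mushfiqur Rahim", 5),
              "Mahmudullah", "Naeem Islam", "Abdur Razzak", "Dolar Mahmud"),
  runs = c(0, 5, 0, 0, 20, 13, 3, 12, 1, 1, 30, 6, 11, 8, 4, 21, 3, 41),
  stringsAsFactors = FALSE
)

# Total runs per batsman across all partners, highest first
totals <- aggregate(runs ~ batsman, data = partnerships, FUN = sum)
totals <- totals[order(-totals$runs), ]
print(totals)
```

The same totals data frame could then be handed to any plotting package of your choice.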

9. Batsmen vs Bowler

The function below computes and plots the performances of the batsmen vs the bowlers. As before, the plot parameter can be set to TRUE or FALSE; the default is plot=TRUE.

teamBatsmenVsBowlersMatch(pak_sa,'Pakistan',"South Africa", plot=TRUE)

batsmenVsBowler-1

teamBatsmenVsBowlersMatch(aus_ind,'Australia',"India",plot=TRUE)

batsmenVsBowler-2

teamBatsmenVsBowlersMatch(ban_zim,'Zimbabwe',"Bangladesh", plot=TRUE)

batsmenVsBowler-3

m <- teamBatsmenVsBowlersMatch(sl_wi,'West Indies',"Sri Lanka", plot=FALSE)
m
## Source: local data frame [35 x 3]
## Groups: batsman [?]
## 
##      batsman        bowler runsConceded
##       (fctr)        (fctr)        (dbl)
## 1   CH Gayle  CRD Fernando            0
## 2   DM Bravo  CRD Fernando           15
## 3   DM Bravo   NLTC Perera           21
## 4   DM Bravo    AD Mathews           10
## 5   DM Bravo    BAW Mendis           11
## 6   DM Bravo CK Kapugedera            1
## 7   DM Bravo    TM Dilshan            5
## 8   DM Bravo  HMRKB Herath           16
## 9  AB Barath   NLTC Perera            0
## 10 RR Sarwan  CRD Fernando            6
## ..       ...           ...          ...

10. Bowling Scorecard

This function provides the bowling performance of a team in the match: the number of overs bowled, maidens, runs conceded and wickets taken by each bowler

teamBowlingScorecardMatch(eng_nz,'England')
## Source: local data frame [6 x 5]
## 
##           bowler overs maidens  runs wickets
##           (fctr) (int)   (int) (dbl)   (dbl)
## 1    LE Plunkett     9       0    54       3
## 2    CT Tremlett    10       0    72       1
## 3     A Flintoff    10       0    66       0
## 4     MS Panesar    10       2    35       2
## 5  JWM Dalrymple     5       0    43       0
## 6 PD Collingwood     6       0    36       1
teamBowlingScorecardMatch(eng_nz,'New Zealand')
## Source: local data frame [6 x 5]
## 
##         bowler overs maidens  runs wickets
##         (fctr) (int)   (int) (dbl)   (dbl)
## 1 JEC Franklin     8       1    45       1
## 2      SE Bond    10       0    58       1
## 3     JDP Oram     5       0    23       0
## 4     JS Patel    10       0    53       1
## 5   DL Vettori    10       0    40       3
## 6  CD McMillan     7       1    38       2
teamBowlingScorecardMatch(aus_ind,'Australia')
## Source: local data frame [6 x 5]
## 
##         bowler overs maidens  runs wickets
##         (fctr) (int)   (int) (dbl)   (dbl)
## 1    RJ Harris    10       0    57       1
## 2     MA Starc     8       0    49       0
## 3     CJ McKay    10       1    53       3
## 4 DT Christian    10       0    45       0
## 5    DJ Hussey     3       0    13       0
## 6   XJ Doherty     9       0    51       2
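The bowling scorecard does not include the economy rate, but it follows directly from the runs and overs columns. A minimal sketch, using England's figures from the first scorecard above (copied by hand):

```r
# England bowling figures from the England-New Zealand match above
england <- data.frame(
  bowler = c("LE Plunkett", "CT Tremlett", "A Flintoff",
             "MS Panesar", "JWM Dalrymple", "PD Collingwood"),
  overs  = c(9, 10, 10, 10, 5, 6),
  runs   = c(54, 72, 66, 35, 43, 36),
  stringsAsFactors = FALSE
)

# Economy rate = runs conceded per over
england$economyRate <- round(england$runs / england$overs, 2)
print(england[order(england$economyRate), ])

mostEconomical <- england$bowler[which.min(england$economyRate)]
print(mostEconomical)
```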

11. Wicket Kind

The plots below show the kind of wicket taken by each bowler (caught, bowled, lbw etc.)

teamBowlingWicketKindMatch(aus_ind,"India","Australia")

bowlingWicketKind-1

teamBowlingWicketKindMatch(aus_ind,"Australia","India")

bowlingWicketKind-2

teamBowlingWicketKindMatch(pak_sa,"South Africa","Pakistan")

bowlingWicketKind-3

m <-teamBowlingWicketKindMatch(sl_wi,"Sri Lanka",plot=FALSE)
m
##           bowler wicketKind wicketPlayerOut runs
## 1   CRD Fernando     bowled        CH Gayle   45
## 2    NLTC Perera     caught       AB Barath   36
## 3   HMRKB Herath        lbw       RR Sarwan   54
## 4     BAW Mendis     caught   S Chanderpaul   46
## 5    NLTC Perera        lbw        DM Bravo   36
## 6    NLTC Perera     caught       DJG Sammy   36
## 7   CRD Fernando     caught        DJ Bravo   45
## 8     BAW Mendis     caught       NO Miller   46
## 9     BAW Mendis     caught        CS Baugh   46
## 10    BAW Mendis     caught         SJ Benn   46
## 11    AD Mathews   noWicket        noWicket   33
## 12 CK Kapugedera   noWicket        noWicket    7
## 13    TM Dilshan   noWicket        noWicket   25
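The data frame returned with plot=FALSE can be summarised with table(). The sketch below counts the wickets by kind and by bowler, using the Sri Lanka rows shown above (copied by hand, keeping only the two columns needed).

```r
# Bowler and wicket kind columns copied from the Sri Lanka output above
wk <- data.frame(
  bowler = c("CRD Fernando", "NLTC Perera", "HMRKB Herath", "BAW Mendis",
             "NLTC Perera", "NLTC Perera", "CRD Fernando", "BAW Mendis",
             "BAW Mendis", "BAW Mendis", "AD Mathews", "CK Kapugedera",
             "TM Dilshan"),
  wicketKind = c("bowled", "caught", "lbw", "caught", "lbw", "caught",
                 "caught", "caught", "caught", "caught",
                 "noWicket", "noWicket", "noWicket"),
  stringsAsFactors = FALSE
)

# Drop the placeholder rows for bowlers who took no wicket
taken <- wk[wk$wicketKind != "noWicket", ]

kinds     <- table(taken$wicketKind)   # wickets by kind
perBowler <- table(taken$bowler)       # wickets by bowler
print(kinds)
print(perBowler)
```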

12. Wicket vs Runs conceded

The plots below provide the wickets taken and the runs conceded by the bowler in the match

teamBowlingWicketRunsMatch(pak_sa,"Pakistan","South Africa")

wicketRuns-1

teamBowlingWicketRunsMatch(aus_ind,"Australia","India")

wicketRuns-2

m <-teamBowlingWicketRunsMatch(sl_wi,"West Indies","Sri Lanka", plot=FALSE)
m
## Source: local data frame [6 x 5]
## 
##      bowler overs maidens  runs wickets
##      (fctr) (int)   (int) (dbl)   (chr)
## 1 R Rampaul     5       0    44       1
## 2 DJG Sammy    10       1    61       1
## 3  DJ Bravo    10       0    58       3
## 4  CH Gayle    10       0    34       0
## 5   SJ Benn    10       1    38       4
## 6 NO Miller     5       0    35       0
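Note that in the output above the wickets column is of type character (chr). If you want to do arithmetic on it, convert it first. A small sketch, with the West Indies figures copied by hand, computes the runs conceded per wicket:

```r
# West Indies bowling figures from the output above; wickets kept as character
wi <- data.frame(
  bowler  = c("R Rampaul", "DJG Sammy", "DJ Bravo", "CH Gayle",
              "SJ Benn", "NO Miller"),
  runs    = c(44, 61, 58, 34, 38, 35),
  wickets = c("1", "1", "3", "0", "4", "0"),
  stringsAsFactors = FALSE
)

# Convert the character column to integer before doing arithmetic
wi$wickets <- as.integer(wi$wickets)

# Runs conceded per wicket; NA for bowlers who took no wicket
wi$runsPerWicket <- ifelse(wi$wickets > 0, wi$runs / wi$wickets, NA)
print(wi)
```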

13. Wickets taken by bowler

The plots provide the wickets taken by the bowler

m <-teamBowlingWicketMatch(eng_nz,'England',"New Zealand", plot=FALSE)
m
##           bowler wicketKind wicketPlayerOut runs
## 1    LE Plunkett        lbw      SP Fleming   54
## 2    LE Plunkett     caught       PG Fulton   54
## 3 PD Collingwood     caught     LRPL Taylor   36
## 4     MS Panesar    stumped     CD McMillan   35
## 5    LE Plunkett     caught       L Vincent   54
## 6     MS Panesar     caught     BB McCullum   35
## 7    CT Tremlett     caught    JEC Franklin   72
## 8     A Flintoff   noWicket        noWicket   66
## 9  JWM Dalrymple   noWicket        noWicket   43
teamBowlingWicketMatch(sl_wi,"Sri Lanka","West Indies")

bowlingWickets-1

teamBowlingWicketMatch(eng_nz,"New Zealand","England")

bowlingWickets-2

14. Bowler Vs Batsmen

The functions compute and display how the different bowlers of the country performed against the batting opposition.

teamBowlersVsBatsmenMatch(ban_zim,"Bangladesh","Zimbabwe")

bowlerVsBatsmen-1

teamBowlersVsBatsmenMatch(aus_ind,"India","Australia")

bowlerVsBatsmen-2

teamBowlersVsBatsmenMatch(eng_nz,"England","New Zealand")

bowlerVsBatsmen-3

m <- teamBowlersVsBatsmenMatch(pak_sa,"Pakistan",plot=FALSE)
m
## Source: local data frame [30 x 3]
## Groups: bowler [?]
## 
##            bowler        batsman runsConceded
##            (fctr)         (fctr)        (dbl)
## 1  Mohammad Irfan      Q de Kock           25
## 2  Mohammad Irfan        HM Amla           17
## 3  Mohammad Irfan   F du Plessis            0
## 4  Mohammad Irfan AB de Villiers            9
## 5   Sohail Tanvir      Q de Kock           11
## 6   Sohail Tanvir        HM Amla            6
## 7   Sohail Tanvir      JP Duminy            9
## 8   Sohail Tanvir      R McLaren           12
## 9     Junaid Khan      Q de Kock           24
## 10    Junaid Khan        HM Amla            6
## ..            ...            ...          ...

15. Match worm graph

The plots below provide the match worm graph for the matches

matchWormGraph(aus_ind,'Australia',"India")

matchWorm-1

matchWormGraph(sl_wi,'Sri Lanka',"West Indies")

matchWorm-2

Conclusion

This post included all the functions from the package yorkr that analyze a match between 2 opposing countries. As mentioned above, the yaml match files have already been converted to dataframes and are available for download from Github. Go ahead and give it a try.

To be continued. Watch this space!

Important note: Do check out my other posts using yorkr at yorkr-posts

The making of cricket package yorkr – Part 3

Introduction

This is the 3rd part of my cricket package yorkr in R. My 2 earlier posts were

  1. The making of cricket package yorkr – Part 1. This post analyzed the performance of a team in an ODI match. The batting and bowling performances of the team were analyzed. This post also performed analyses of a country in all matches against another country, e.g. India in all matches against Australia. The best performers with the bat and ball were determined, along with the best batting partnerships, the performances at different venues etc. The detailed performances of the bowlers of India and Australia in the confrontation were also analyzed.
  2. The making of cricket package yorkr – Part 2. This post includes all ODI matches between a country and others. For obvious reasons I have chosen India and selected all ODI matches played by India with other countries. This included batting and bowling performances of the country against all oppositions.

As mentioned in my earlier posts, the data is taken from Cricsheet.

Important note: Do check out my other posts using yorkr at yorkr-posts

Important note: Do check out all the posts on the python avatar of yorkr, namely ‘yorkpy’, in my post ‘Pitching yorkpy … short of good length to IPL – Part 1’

In this post I look at individual performances of batsmen and bowlers in ODIs. For this post I have chosen Virat Kohli & Mahendra Singh Dhoni from India. Kohli has been consistent and in great form right through. Dhoni follows Kohli very closely in ODIs. Dhoni, besides his shrewd captaincy, is one of the best ODI batsmen and a great finisher. I have included AB De Villiers from South Africa, who seems to invent new strokes and shots every time, much like Glenn Maxwell.

For bowling analyses I have selected RA Jadeja and Harbhajan Singh (the top Indian ODI bowlers) and Mitchell Johnson, who is among the best in the world.

This post is also available at RPubs at yorkr-3. You can also download this post as a pdf from yorkr-3.pdf

Check out my interactive Shiny apps GooglyPlus (plots & tables) and Googly (only plots) which can be used to analyze IPL players, teams and matches.

My earlier package ‘cricketr’ (see Introducing cricketr: An R package for analyzing performances of cricketers) was based on data from ESPN Cricinfo Statsguru. Take a look at my book with all my articles based on my package cricketr at – Cricket analytics with cricketr!!!. The book is also available in paperback and kindle versions at Amazon which has, by the way, better formatting!

I have added some quick observations on the plots below. However there is a lot more that can be discerned from the plots than I can possibly explain. The charts do display a wealth of insights. Do take a close look at the plots.

library(dplyr)
library(ggplot2)
library(yorkr)
library(reshape2)
library(gridExtra)
library(rpart.plot)

1. Batting Details

The following functions get the overall batting details for a country against all opposition.

a <- getTeamBattingDetails("India",save=TRUE)
b <- getTeamBattingDetails("South Africa",save=TRUE)

2. Get Batsman details

Now I get the details of the batsmen Virat Kohli and Mahendra Singh Dhoni from the saved India file and AB De Villiers from the saved South Africa file

kohli <- getBatsmanDetails(team="India",name="Kohli")
dhoni <- getBatsmanDetails(team="India",name="Dhoni")
devilliers <-  getBatsmanDetails(team="South Africa",name="Villiers")

3. Display the dataframe

The dataframe obtained from the calls above provides detailed information for the batsman in every ODI match. This dataframe has all the fields that can be obtained from ESPN Cricinfo

Untitled

Performance analyses of batsmen

4. Runs vs deliveries plot

It can be seen from the plots below that Kohli is very consistent in the runs scored: the points crowd near the regression curve. There is more variance in Dhoni’s and De Villiers’ performances. The band on either side of the regression curve represents the 95% confidence interval (a 95% confidence level means that 95% of such intervals would include the population parameter).

p1 <-batsmanRunsVsDeliveries(kohli,"Kohli")
p2 <- batsmanRunsVsDeliveries(dhoni, "Dhoni")
p3 <- batsmanRunsVsDeliveries(devilliers,"De Villiers")
grid.arrange(p1,p2,p3, ncol=3)

runsDel-1
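To make the idea of a confidence band concrete, here is a minimal sketch with a straight-line fit in base R. This is only an illustration and not necessarily how batsmanRunsVsDeliveries() draws its curve, which appears to be a smoothed fit. The numbers are the balls and runs of the 8 Indian batsmen from the Australia-India scorecard shown earlier, used as stand-in data rather than Kohli’s ODI record.

```r
# Stand-in data: balls faced and runs of the Indian batsmen in the
# Australia-India match shown earlier (NOT Kohli's ODI record)
balls <- c(110, 20, 28, 41, 30, 57, 8, 2)
runs  <- c(92, 20, 18, 33, 38, 44, 12, 1)

# Straight-line regression of runs on deliveries faced
fit <- lm(runs ~ balls)

# Fitted value with lower and upper limits of the 95% confidence interval
newdata <- data.frame(balls = c(30, 60, 90))
ci <- predict(fit, newdata = newdata, interval = "confidence", level = 0.95)
print(cbind(newdata, ci))
```

The columns lwr and upr are the edges of the band at each number of deliveries; the band is narrowest near the middle of the data and widens towards the ends.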

5. Total runs vs 4s vs 6s plot

The plots below show the runs (Total runs, Runs from fours & Runs from sixes) vs the deliveries faced. Kohli scores more runs and more fours, which can be evaluated from the slope of the blue and red regression lines (which reach 150+ and 50+ for Total runs and Runs from fours respectively). De Villiers has more Runs from sixes, as can be seen in the 3rd sub plot (green line)

kohli46 <- select(kohli,batsman,ballsPlayed,fours,sixes,runs)
p1 <- batsmanFoursSixes(kohli46,"Kohli")
dhoni46 <- select(dhoni,batsman,ballsPlayed,fours,sixes,runs)
p2 <- batsmanFoursSixes(dhoni46,"Dhoni")
devilliers46 <- select(devilliers,batsman,ballsPlayed,fours,sixes,runs)
p3 <- batsmanFoursSixes(devilliers46, "De Villiers")
grid.arrange(p1,p2,p3, ncol=3)

4s6s-1

6. Batsmen dismissals

Interestingly, it can be seen that Dhoni has remained unbeaten more often (47 times) than Kohli or De Villiers. Dhoni, despite being a great runner between wickets, has been run out more often.

p1 <-batsmanDismissals(kohli,"Kohli")
p2 <- batsmanDismissals(dhoni, "Dhoni")
p3 <- batsmanDismissals(devilliers, "De Villiers")
grid.arrange(p1,p2,p3, ncol=3)

dismissal-1

7. Batsmen Strike Rate

From the plot below Kohli has the best strike rate till 100 runs, as the slope seems to be steeper. De Villiers seems to do better after 100 runs.

p1 <-batsmanMeanStrikeRate(kohli,"Kohli")
p2 <- batsmanMeanStrikeRate(dhoni, "Dhoni")
p3 <- batsmanMeanStrikeRate(devilliers, "De Villiers")
grid.arrange(p1,p2,p3, ncol=3)

meanSR-1

8. Batsmen moving average

Kohli’s and De Villiers’ form can be seen to be improving over the years. Dhoni seems to have hit a slump in recent times. But we have to keep in mind that he has the second highest ODI runs for India and is just behind Kohli.

p1 <-batsmanMovingAverage(kohli,"Kohli")
p2 <- batsmanMovingAverage(dhoni, "Dhoni")
p3 <- batsmanMovingAverage(devilliers, "De Villiers")
grid.arrange(p1,p2,p3, ncol=3)

bmanMA-1

9. Batsmen against opposition

Kohli averages 50 runs against 6 countries, to Dhoni’s 4. Kohli performs well against Australia, New Zealand, West Indies, Pakistan and Bangladesh. Kohli’s performance against England has been mediocre. De Villiers averages around 50 against 5 countries.

batsmanRunsAgainstOpposition(kohli,"Kohli")

bmanOppn-1

batsmanRunsAgainstOpposition(dhoni, "Dhoni")

bmanOppn-2

batsmanRunsAgainstOpposition(devilliers, "De Villiers")

bmanOppn-3

10. Batsmen runs at different venues

Kohli’s favorite hunting grounds in ODIs are Adelaide, Sydney, Western Australia and Wankhede. Dhoni’s best performances are at Lords, Sydney and Chepauk.

batsmanRunsVenue(kohli,"Kohli")

bmanOppn1-1

batsmanRunsVenue(dhoni, "Dhoni")

bmanOppn1-2

batsmanRunsVenue(devilliers, "De Villiers")

bmanOppn1-3

11. Batsmen runs predict

The plots below predict the number of deliveries needed by each batsman to score the runs shown. For this I have used classification trees based on deliveries and runs using the package rpart. From the plot for Kohli it can be seen that in 58 deliveries he scores around 52 runs. On the other hand De Villiers needs just over 40 deliveries to score 52 runs.

par(mfrow=c(1,3))
par(mar=c(4,4,2,2))
batsmanRunsPredict(kohli,"Kohli")
batsmanRunsPredict(dhoni, "Dhoni")
batsmanRunsPredict(devilliers, "De Villiers")

runsPred-1

dev.off()
## null device 
##           1
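For readers who want to try the tree idea themselves, here is a rough sketch with rpart. It is one plausible formulation (deliveries as the response, runs as the predictor) and not necessarily what batsmanRunsPredict() does internally. The innings are simulated, not real data, purely so that the example runs on its own.

```r
library(rpart)

set.seed(42)
# Simulated innings (NOT real data): runs scored and deliveries faced
runs        <- round(runif(200, 0, 120))
ballsPlayed <- pmax(1, round(runs * 1.1 + rnorm(200, 0, 8)))
innings     <- data.frame(runs, ballsPlayed)

# Tree with deliveries as the response; method = "anova" is used because
# the response is numeric
fit <- rpart(ballsPlayed ~ runs, data = innings, method = "anova")

# Deliveries the tree predicts for an innings of 52 runs
pred <- predict(fit, newdata = data.frame(runs = 52))
print(pred)
```

With a real batsman data frame the columns runs and ballsPlayed would take the place of the simulated ones, and rpart.plot::rpart.plot(fit) could be used to draw the tree.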

12. Get team bowling details

The function below gets the bowling details from all the ODI matches between India or Australia and all other countries

c <- getTeamBowlingDetails("India",save=TRUE)
d <- getTeamBowlingDetails("Australia",save=TRUE)

13. Get wicket details

The functions below get the data frame for each bowler

jadeja <- getBowlerWicketDetails(team="India",name="Jadeja")
harbhajan <- getBowlerWicketDetails(team="India",name="Harbhajan")
ashwin <- getBowlerWicketDetails(team="India",name="Ashwin")
johnson <-  getBowlerWicketDetails(team="Australia",name="Johnson")

14. Display data frame

The details of the data frame are shown below

knitr::kable(head(jadeja))
|bowler    | overs| maidens| runs| wickets| economyRate|date       |opposition  |venue                                        |
|:---------|-----:|-------:|----:|-------:|-----------:|:----------|:-----------|:--------------------------------------------|
|RA Jadeja |     6|       0|   40|       0|        6.67|2009-02-08 |Sri Lanka   |R Premadasa Stadium                          |
|RA Jadeja |     7|       1|   34|       0|        4.86|2009-06-26 |West Indies |Sabina Park, Kingston                        |
|RA Jadeja |     2|       0|   12|       0|        6.00|2009-06-28 |West Indies |Sabina Park, Kingston                        |
|RA Jadeja |     9|       0|   39|       1|        4.33|2009-10-25 |Australia   |Reliance Stadium                             |
|RA Jadeja |     7|       1|   35|       4|        5.00|2009-10-28 |Australia   |Vidarbha Cricket Association Stadium, Jamtha |
|RA Jadeja |     7|       1|   35|       4|        5.00|2009-10-28 |Australia   |Vidarbha Cricket Association Stadium, Jamtha |
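The economyRate column in the table above is simply runs divided by overs. A quick check on the six rows shown (values copied by hand):

```r
# Overs, runs and economy rate from the six rows of the table above
jadejaHead <- data.frame(
  overs       = c(6, 7, 2, 9, 7, 7),
  runs        = c(40, 34, 12, 39, 35, 35),
  economyRate = c(6.67, 4.86, 6.00, 4.33, 5.00, 5.00)
)

# Recompute the economy rate and compare with the column in the table
recomputed <- round(jadejaHead$runs / jadejaHead$overs, 2)
print(recomputed)
print(all.equal(recomputed, jadejaHead$economyRate))
```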

15. Bowler Economy rate

Harbhajan and Ashwin have a better economy rate than RA Jadeja.

p1 <- bowlerEconomyRate(jadeja,"RA Jadeja")
p2<-bowlerEconomyRate(harbhajan, "Harbhajan")
p3<-bowlerEconomyRate(ashwin, "Ashwin")
p4<-bowlerEconomyRate(johnson, "MG Johnson")
grid.arrange(p1,p2,p3,p4, ncol=2)

ER-1

16. Mean runs conceded by bowler

p1<-bowlerMeanRuns(jadeja,"RA Jadeja")
p2<-bowlerMeanRuns(harbhajan, "Harbhajan")
p3<-bowlerMeanRuns(ashwin, "Ashwin")
p4<-bowlerMeanRuns(johnson, "MG Johnson")
grid.arrange(p1,p2,p3,p4, ncol=2)

meanRuns-1

17. Moving average of bowler

From the plots below, MG Johnson, Harbhajan and Ashwin have been performing very consistently. RA Jadeja’s bowling seems to be taking a nosedive, though he is at the top of all ODI bowlers of India.

p1<-bowlerMovingAverage(jadeja,"RA Jadeja")
p2<-bowlerMovingAverage(harbhajan, "Harbhajan")
p3<-bowlerMovingAverage(ashwin, "Ashwin")
p4<-bowlerMovingAverage(johnson, "MG Johnson")
grid.arrange(p1,p2,p3,p4, ncol=2)

bwlrMA-1
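As a rough illustration of what a moving average does, the sketch below computes a trailing 3-match average with stats::filter(). I am assuming here that the quantity being averaged is wickets per match; bowlerMovingAverage() may well use a different smoother. The six values are the wickets from the first rows of Jadeja’s data frame shown earlier.

```r
# Wickets in the first six rows of Jadeja's data frame shown earlier
wickets <- c(0, 0, 0, 1, 4, 4)

# Trailing 3-match moving average: each value is the mean of the current
# match and the two before it (the first two entries are NA)
ma <- as.numeric(stats::filter(wickets, rep(1/3, 3), sides = 1))
print(round(ma, 2))
```

A wider window gives a smoother curve at the cost of reacting more slowly to a change in form.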

18. Wicket average

Jadeja has a better wicket average than Harbhajan and Ashwin. Jadeja and Ashwin average around 2 wickets, while Harbhajan averages 1.5 wickets (tending to 2).

p1<-bowlerWicketPlot(jadeja,"RA Jadeja")
p2<-bowlerWicketPlot(harbhajan, "Harbhajan")
p3<-bowlerWicketPlot(ashwin, "Ashwin")
p4<-bowlerWicketPlot(johnson, "MG Johnson")
grid.arrange(p1,p2,p3,p4, ncol=2)

bwlrWkt-1

19. Wickets opposition

Jadeja’s best performances have been against England, Pakistan, New Zealand and Zimbabwe. For Harbhajan it has been New Zealand, Sri Lanka and Zimbabwe.

bowlerWicketsAgainstOpposition(jadeja,"RA Jadeja")

bwlrOppn-1

bowlerWicketsAgainstOpposition(harbhajan, "Harbhajan")

bwlrOppn-2

bowlerWicketsAgainstOpposition(ashwin, "Ashwin")

bwlrOppn-3

bowlerWicketsAgainstOpposition(johnson, "MG Johnson")

bwlrOppn-4

20. Wickets venue

The top 20 venues for each bowler are shown in the plots

bowlerWicketsVenue(jadeja,"RA Jadeja")

bwlrVenue-1

bowlerWicketsVenue(harbhajan, "Harbhajan")

bwlrVenue-2

bowlerWicketsVenue(ashwin, "Ashwin")

bwlrVenue-3

bowlerWicketsVenue(johnson, "MG Johnson")

bwlrVenue-4

21. Create a data frame with wickets and deliveries

jadeja1 <- getDeliveryWickets(team="India",name="Jadeja",save=FALSE)
harbhajan1 <- getDeliveryWickets(team="India",name="Harbhajan",save=FALSE)
ashwin1 <- getDeliveryWickets(team="India",name="Ashwin",save=FALSE)
johnson1 <- getDeliveryWickets(team="Australia",name="MG Johnson",save=FALSE)

22. Deliveries to wickets plots

The following plots try to predict the average number of deliveries required for the wickets taken. As in the batsman runs prediction, I have used classification trees on the deliveries at which a wicket was taken. The package rpart was used for the classification. The intermediate nodes are the number of deliveries and the leaf nodes are the wickets taken. Though the wickets are in decimals, we can interpret the tree as follows: RA Jadeja needs about 22 deliveries to take 1.6 wickets (~2 wickets). Interestingly Harbhajan needs …

par(mfrow=c(1,2))
par(mar=c(4,4,2,2))
bowlerWktsPredict(jadeja1,"RA Jadeja")
bowlerWktsPredict(harbhajan1,"Harbhajan Singh")

wktPrd1-1

dev.off()
## null device 
##           1

Similarly MG Johnson can provide a breakthrough with just around 14 deliveries.

par(mfrow=c(1,2))
par(mar=c(4,4,2,2))
bowlerWktsPredict(ashwin1,"Ravichandran Ashwin")
bowlerWktsPredict(johnson1,"MG Johnson")

wktPred2-1

dev.off()
## null device 
##           1

Conclusion

ODI batsman

  1. The top 2 Indian ODI batsmen (Kohli and Dhoni) and De Villiers of South Africa were considered.
  2. Kohli has a better strike rate till about 100 runs (steeper slope) and De Villiers beyond 100.
  3. Dhoni has remained unbeaten more often than the other 2. His average might have been higher if he had come in earlier.
  4. Kohli and De Villiers have performed consistently. Dhoni needs to get back his touch.

ODI bowlers

  1. RA Jadeja has a better wicket taking rate than Harbhajan and Ashwin.
  2. Ashwin and Harbhajan have a better economy rate than Jadeja.
  3. Harbhajan, Ashwin and MG Johnson have performed consistently while RA Jadeja’s performance has been on the decline.
  4. Harbhajan and MG Johnson need around 11 balls to make a breakthrough.

This was probably the last set of functions for my cricket package yorkr. Over the next several weeks I will be cleaning up, documenting and refining the functions, and removing any glitches. I hope to have the package released in the next 6-8 weeks.

Also see

  1. Cricket analytics with cricketr
  2. Sixer: An R package cricketr’s new Shiny avatar

The making of cricket package yorkr – Part 2

Introduction

In this post (The making of cricket package yorkr-Part 2), I continue to add new functionality to my cricket package yorkr in R. In my earlier post The making of cricket package yorkr-Part 1 I had included functionality that will plot batsman partnerships and bowlers’ performances with wicket-kind and wicket-runs in a specified ODI match. The earlier post also included functions that were based on confrontations between any 2 teams (I had chosen the ODI matches between India and Australia).

Check out my interactive Shiny apps GooglyPlus (plots & tables) and Googly (only plots) which can be used to analyze IPL players, teams and matches.

Important note: Do check out all the posts on the python avatar of yorkr, namely ‘yorkpy’, in my post ‘Pitching yorkpy … short of good length to IPL – Part 1’

This post includes all ODI matches between a country and others. For obvious reasons I have chosen India and selected all ODI matches played by India with other countries. As mentioned in my earlier post the data is taken from Cricsheet. There are a total of 262 ODI matches that India has played. These 262 ODI matches played by India are then combined into one large dataframe that is 140,655 rows x 22 columns.

The analysis is then done on India’s batting and bowling performances on this huge dataframe, e.g. who has the most runs and the highest batting partnerships, and which bowlers are most effective against a country. The functions also give details like which Indian bowlers have the worst performance or which bowlers have taken the most wickets against India. The functions also provide information on batsmen and bowlers of the opposing countries who have performed well against India. Since the dataset is large and rich, the possible insights are many. I am including some functions that I have created on this dataset below.

Also note that it is possible to choose all ODI matches played by Australia, Pakistan, South Africa etc with the rest of the world. Similar analysis can be done for these countries also by using the functions below.

As before the package ‘yorkr’ is still under development. I will be releasing the package and code in about 6-10 weeks’ time. Please be patient.

This post is also available at RPubs at yorkr-2. You can download this post as a PDF document at yorkr-2.pdf

My earlier package ‘cricketr’ (see Introducing cricketr: An R package for analyzing performances of cricketers) was based on data from ESPN Cricinfo Statsguru. Take a look at my book with all my articles based on my package cricketr at – Cricket analytics with cricketr!!!. The book is also available in paperback and kindle versions at Amazon which has, by the way, better formatting!

library(dplyr)
library(ggplot2)
library(yorkr)
matches <- getAllMatches("India",save=FALSE)
dim(matches)
## [1] 140655     22
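
The idea of stacking many per-match data frames into one large data frame can be sketched in base R as below. This is only an illustration and not necessarily how getAllMatches works internally: the two small data frames, their column names and their values are all made up.

```r
# Hypothetical per-match data frames (illustrative columns and values only)
match1 <- data.frame(batsman = c("Batsman A", "Batsman B"),
                     runs = c(4, 1),
                     stringsAsFactors = FALSE)
match2 <- data.frame(batsman = c("Batsman C", "Batsman A"),
                     runs = c(6, 2),
                     stringsAsFactors = FALSE)

# Stack all matches row-wise into one large data frame
allMatches <- do.call(rbind, list(match1, match2))
dim(allMatches)
```

With a list of several hundred match data frames the same do.call(rbind, ...) call applies unchanged, provided every data frame has the same columns.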

1. Team Batting details – India

The following function provides the overall batting performance of India against all opposition

Virat Kohli has the best performance with a total of 7023 runs in ODIs, followed closely by Mahendra Dhoni with 6885 runs and then Suresh Raina with 4964 runs. While Kohli leads in the number of 4s (662), Dhoni (129) and Raina (114) have nearly twice the number of 6s as compared to Kohli (65). However, Kohli has a better strike rate than Dhoni: (7023/7774) × 100 ≈ 90.3, while Dhoni has an overall strike rate of (6885/7878) × 100 ≈ 87.4.

df <-teamBattingDetailsAllOppn(matches,theTeam="India")
## Total= 58033
df
## Source: local data frame [71 x 5]
## 
##         batsman ballsPlayed fours sixes  runs
##          (fctr)       (int) (int) (int) (dbl)
## 1       V Kohli        7774   662    65  7023
## 2      MS Dhoni        7878   515   129  6885
## 3      SK Raina        5076   429   114  4964
## 4     G Gambhir        5138   470    15  4495
## 5     RG Sharma        5245   370    89  4377
## 6  SR Tendulkar        4708   504    43  4196
## 7  Yuvraj Singh        4472   403    96  3976
## 8      V Sehwag        3102   494    74  3679
## 9      S Dhawan        2956   314    37  2694
## 10    AM Rahane        2490   194    24  2005
## ..          ...         ...   ...   ...   ...
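
The strike rate arithmetic can be checked with a few lines of base R. The sketch below simply re-types the first three rows of the table above and treats strike rate as runs per 100 balls played; the column name strikeRate is my own addition and not something returned by the function.

```r
# First three rows of the India batting table above
top3 <- data.frame(batsman     = c("V Kohli", "MS Dhoni", "SK Raina"),
                   ballsPlayed = c(7774, 7878, 5076),
                   runs        = c(7023, 6885, 4964),
                   stringsAsFactors = FALSE)

# Strike rate = runs scored per 100 balls played
top3$strikeRate <- round(top3$runs / top3$ballsPlayed * 100, 2)
top3
```

On these figures Kohli comes out at about 90.3 and Dhoni at about 87.4, while Raina's ratio works out to about 97.8, which is higher than both.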

2. Team batting details – Other countries against India

When we use other countries in theTeam we get the performance of the batsmen of these countries against India in ODIs. This is because matches is a selection of all matches played by India against other countries. The following four calls show the performances of the batsmen of England, South Africa, Pakistan and Bangladesh against India.

df <-teamBattingDetailsAllOppn(matches,theTeam="England")
## Total= 7602
df
## Source: local data frame [43 x 5]
## 
##           batsman ballsPlayed fours sixes  runs
##            (fctr)       (int) (int) (int) (dbl)
## 1         IR Bell        1238   110     9  1085
## 2    KP Pietersen         990    89    10   847
## 3         AN Cook        1049   103     2   822
## 4       RS Bopara         632    42     8   534
## 5  PD Collingwood         450    38     6   393
## 6         OA Shah         394    40     7   385
## 7       IJL Trott         410    33     2   349
## 8         JE Root         408    32     4   336
## 9        SR Patel         336    25    10   329
## 10   C Kieswetter         309    34    13   313
## ..            ...         ...   ...   ...   ...
df <-teamBattingDetailsAllOppn(matches,theTeam="South Africa")
## Total= 6172
df
## Source: local data frame [36 x 5]
## 
##           batsman ballsPlayed fours sixes  runs
##            (fctr)       (int) (int) (int) (dbl)
## 1  AB de Villiers        1026   102    38  1179
## 2         HM Amla         796    74     1   704
## 3       Q de Kock         637    76     8   633
## 4       JH Kallis         666    50     4   554
## 5       JP Duminy         477    19     9   438
## 6    F du Plessis         470    30     8   421
## 7        GC Smith         355    25     3   252
## 8        HH Gibbs         318    26     3   242
## 9      MN van Wyk         270    23     1   202
## 10      DA Miller         188    19     4   193
## ..            ...         ...   ...   ...   ...
df <-teamBattingDetailsAllOppn(matches,theTeam="Pakistan")
## Total= 4660
df
## Source: local data frame [37 x 5]
## 
##            batsman ballsPlayed fours sixes  runs
##             (fctr)       (int) (int) (int) (dbl)
## 1      Younis Khan         752    56     8   686
## 2     Shoaib Malik         669    61     4   595
## 3    Misbah-ul-Haq         619    49     6   550
## 4      Salman Butt         617    69     4   535
## 5  Mohammad Yousuf         458    37     2   432
## 6    Nasir Jamshed         473    41     4   408
## 7  Mohammad Hafeez         423    36     3   347
## 8    Shahid Afridi         187    16     7   235
## 9     Kamran Akmal         235    20     5   192
## 10      Umar Akmal         146     7     2   103
## ..             ...         ...   ...   ...   ...
df <-teamBattingDetailsAllOppn(matches,theTeam="Bangladesh")
## Total= 3761
df
## Source: local data frame [39 x 5]
## 
##              batsman ballsPlayed fours sixes  runs
##               (fctr)       (int) (int) (int) (dbl)
## 1    Mushfiqur Rahim         658    34    13   517
## 2        Tamim Iqbal         573    61     6   504
## 3    Shakib Al Hasan         591    42     5   493
## 4        Mahmudullah         310    27     1   269
## 5      Raqibul Hasan         262    11     3   202
## 6      Nasir Hossain         187    21     1   183
## 7  Mohammad Ashraful         235    17    NA   158
## 8      Soumya Sarkar         164    18     5   157
## 9        Imrul Kayes         183    21     1   155
## 10     Sabbir Rahman         142    16     1   136
## ..               ...         ...   ...   ...   ...
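
Why changing theTeam flips the view from India to the opposition can be sketched as a filter on the batting side followed by a sum per batsman. This is a base R illustration under assumptions: the columns team, batsman and runs, the helper battingTotals and the data are all hypothetical, and the actual yorkr function may work differently.

```r
# Made-up ball-by-ball rows from matches involving India
balls <- data.frame(
  team    = c("India", "India", "England", "England", "England"),
  batsman = c("Batsman A", "Batsman B", "Batsman X", "Batsman X", "Batsman Y"),
  runs    = c(4, 1, 6, 2, 1),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep only balls faced by theTeam, then total per batsman
battingTotals <- function(df, theTeam) {
  sub <- df[df$team == theTeam, ]
  out <- aggregate(runs ~ batsman, data = sub, FUN = sum)
  out[order(-out$runs), ]
}

battingTotals(balls, "England")
```

Since every row comes from a match involving India, filtering on team == "England" leaves only the runs England's batsmen scored against India.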

3. Top batting partnership report – India

The following functions show the top partnerships among Indian batsmen in ODIs. Virat Kohli leads the way with 7023 runs, followed by Mahendra Singh Dhoni with 6885 runs and Suresh Raina in the 3rd place.

The detailed report gives the breakup of the partnerships. It can be seen that Kohli has had his best partnerships with Rohit Sharma and Suresh Raina. Dhoni’s best partnership is with Raina.

a <- batsmanPartnershiAllOppn(matches,theTeam="India",report="summary")
a
## Source: local data frame [71 x 2]
## 
##         batsman totalRuns
##          (fctr)     (dbl)
## 1       V Kohli      7023
## 2      MS Dhoni      6885
## 3      SK Raina      4964
## 4     G Gambhir      4495
## 5     RG Sharma      4377
## 6  SR Tendulkar      4196
## 7  Yuvraj Singh      3976
## 8      V Sehwag      3679
## 9      S Dhawan      2694
## 10    AM Rahane      2005
## ..          ...       ...
b <- batsmanPartnershiAllOppn(matches,theTeam="India",report="detailed")
b[1:50,]
##     batsman      nonStriker partnershipRuns totalRuns
## 1   V Kohli        S Dhawan             657      7023
## 2   V Kohli       AM Rahane             502      7023
## 3   V Kohli       RG Sharma            1073      7023
## 4   V Kohli      KD Karthik             139      7023
## 5   V Kohli    SR Tendulkar             272      7023
## 6   V Kohli        R Dravid             132      7023
## 7   V Kohli        V Sehwag             255      7023
## 8   V Kohli    Yuvraj Singh             420      7023
## 9   V Kohli        SK Raina            1072      7023
## 10  V Kohli        MS Dhoni             534      7023
## 11  V Kohli Harbhajan Singh              13      7023
## 12  V Kohli       IK Pathan               1      7023
## 13  V Kohli               4               0      7023
## 14  V Kohli       G Gambhir             962      7023
## 15  V Kohli      RV Uthappa              10      7023
## 16  V Kohli       RA Jadeja              91      7023
## 17  V Kohli        R Ashwin              71      7023
## 18  V Kohli       AT Rayudu             345      7023
## 19  V Kohli Gurkeerat Singh               1      7023
## 20  V Kohli       YK Pathan              68      7023
## 21  V Kohli       STR Binny               4      7023
## 22  V Kohli       MK Tiwary             105      7023
## 23  V Kohli        AR Patel              39      7023
## 24  V Kohli        PA Patel             180      7023
## 25  V Kohli               6               0      7023
## 26  V Kohli         M Vijay              33      7023
## 27  V Kohli       KM Jadhav              10      7023
## 28  V Kohli        AM Nayar              25      7023
## 29  V Kohli     S Badrinath               9      7023
## 30 MS Dhoni        S Dhawan              49      6885
## 31 MS Dhoni       AM Rahane              50      6885
## 32 MS Dhoni       RG Sharma             300      6885
## 33 MS Dhoni      KD Karthik             158      6885
## 34 MS Dhoni    SR Tendulkar             325      6885
## 35 MS Dhoni        R Dravid             239      6885
## 36 MS Dhoni        V Sehwag             188      6885
## 37 MS Dhoni    Yuvraj Singh             837      6885
## 38 MS Dhoni        SK Raina            1423      6885
## 39 MS Dhoni          M Kaif              47      6885
## 40 MS Dhoni        D Mongia              47      6885
## 41 MS Dhoni      AB Agarkar               8      6885
## 42 MS Dhoni Harbhajan Singh              90      6885
## 43 MS Dhoni        RP Singh              95      6885
## 44 MS Dhoni        MM Patel               0      6885
## 45 MS Dhoni       IK Pathan             156      6885
## 46 MS Dhoni       G Gambhir             596      6885
## 47 MS Dhoni      RV Uthappa             137      6885
## 48 MS Dhoni     S Sreesanth              23      6885
## 49 MS Dhoni        I Sharma              67      6885
## 50 MS Dhoni         P Kumar              64      6885
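
In the detailed output above, Kohli's partnershipRuns add up to his totalRuns of 7023, which suggests that each row counts the runs the batsman scored while that non-striker was at the other end. A breakup of that kind can be sketched in base R as follows; the data is made up and this is not the package's own code.

```r
# Made-up ball-by-ball rows: who faced, who was at the other end, runs scored
balls <- data.frame(
  batsman    = c("Batsman A", "Batsman A", "Batsman A", "Batsman B"),
  nonStriker = c("Batsman C", "Batsman C", "Batsman D", "Batsman D"),
  runs       = c(4, 2, 1, 6),
  stringsAsFactors = FALSE
)

# Runs per (batsman, nonStriker) pair
partnership <- aggregate(runs ~ batsman + nonStriker, data = balls, FUN = sum)
names(partnership)[3] <- "partnershipRuns"

# Overall runs per batsman
totals <- aggregate(runs ~ batsman, data = balls, FUN = sum)
names(totals)[2] <- "totalRuns"

# Attach the overall total to every partnership row and sort
detailed <- merge(partnership, totals, by = "batsman")
detailed[order(-detailed$totalRuns, -detailed$partnershipRuns), ]
```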

4. Top batting partnership report – Other countries against India

Since matches has already selected all matches played by India with every other country, calling the function with theTeam=“Australia” or “South Africa” will display those batsmen who had the best partnerships in matches against India. It can be seen that Ponting, Hussey and Bailey lead against India, while for the South Africans it is De Villiers, Hashim Amla and Q De Kock.

a <- batsmanPartnershiAllOppn(matches,theTeam="Australia",report="summary")
a
## Source: local data frame [48 x 2]
## 
##       batsman totalRuns
##        (fctr)     (dbl)
## 1  RT Ponting       876
## 2  MEK Hussey       753
## 3   GJ Bailey       610
## 4   SR Watson       609
## 5   MJ Clarke       607
## 6   ML Hayden       573
## 7   A Symonds       536
## 8    AJ Finch       525
## 9   SPD Smith       467
## 10  DA Warner       391
## ..        ...       ...
b <- batsmanPartnershiAllOppn(matches,theTeam="South Africa",report="summary")
b
## Source: local data frame [36 x 2]
## 
##           batsman totalRuns
##            (fctr)     (dbl)
## 1  AB de Villiers      1179
## 2         HM Amla       704
## 3       Q de Kock       633
## 4       JH Kallis       554
## 5       JP Duminy       438
## 6    F du Plessis       421
## 7        GC Smith       252
## 8        HH Gibbs       242
## 9      MN van Wyk       202
## 10      DA Miller       193
## ..            ...       ...

5. Top batting partnership plots

The following plots display the above partnership details graphically.

batsmanPartnershipAllOppnPlot(matches,"India","All")

partnership-1-1

batsmanPartnershipAllOppnPlot(matches,"India","Australia")

partnership-1-2

batsmanPartnershipAllOppnPlot(matches,"India","South Africa")

partnership-1-3

dim(matches)
## [1] 140655     22

6. Batsman vs bowlers report

The reports below show how the Indian batsmen fared against bowlers of other countries. Using rank=0 shows the top 10 batsmen of India. Specifying a rank ‘i’ shows the bowlers against whom the batsman at that rank scored the most runs. Kohli has made the most runs against Perera, Kulasekara and Malinga, and Dhoni against Muralitharan, Jayasuriya and Malinga. Surprisingly, Tendulkar has scored the most runs in these ODIs against Mitchell Johnson, Brett Lee and James Anderson.

a <- batsmanVsBowlersAllOppnRept(matches,theTeam="India",rank=0)
a
## Source: local data frame [10 x 2]
## 
##         batsman runsScored
##          (fctr)      (dbl)
## 1       V Kohli       7023
## 2      MS Dhoni       6885
## 3      SK Raina       4964
## 4     G Gambhir       4495
## 5     RG Sharma       4377
## 6  SR Tendulkar       4196
## 7  Yuvraj Singh       3976
## 8      V Sehwag       3679
## 9      S Dhawan       2694
## 10    AM Rahane       2005
b <- batsmanVsBowlersAllOppnRept(matches,theTeam="India",rank=1)
b
## Source: local data frame [50 x 3]
## Groups: batsman [1]
## 
##    batsman          bowler  runs
##     (fctr)          (fctr) (dbl)
## 1  V Kohli     NLTC Perera   242
## 2  V Kohli KMDN Kulasekara   196
## 3  V Kohli      SL Malinga   175
## 4  V Kohli      AD Mathews   155
## 5  V Kohli      BAW Mendis   132
## 6  V Kohli       R Rampaul   127
## 7  V Kohli     JW Dernbach   121
## 8  V Kohli     JP Faulkner   118
## 9  V Kohli       DJG Sammy   116
## 10 V Kohli    HMRKB Herath   113
## ..     ...             ...   ...
b <- batsmanVsBowlersAllOppnRept(matches,theTeam="India",rank=2)
b
## Source: local data frame [50 x 3]
## Groups: batsman [1]
## 
##     batsman         bowler  runs
##      (fctr)         (fctr) (dbl)
## 1  MS Dhoni M Muralitharan   195
## 2  MS Dhoni  ST Jayasuriya   183
## 3  MS Dhoni     SL Malinga   144
## 4  MS Dhoni      SR Watson   135
## 5  MS Dhoni        ST Finn   130
## 6  MS Dhoni     MG Johnson   128
## 7  MS Dhoni    JP Faulkner   125
## 8  MS Dhoni  Shahid Afridi   120
## 9  MS Dhoni     TT Bresnan   111
## 10 MS Dhoni     AD Mathews   111
## ..      ...            ...   ...
b <- batsmanVsBowlersAllOppnRept(matches,theTeam="India",rank=3)
b
## Source: local data frame [50 x 3]
## Groups: batsman [1]
## 
##     batsman           bowler  runs
##      (fctr)           (fctr) (dbl)
## 1  SK Raina         S Randiv   124
## 2  SK Raina      NLTC Perera   124
## 3  SK Raina       TT Bresnan   113
## 4  SK Raina Mashrafe Mortaza   108
## 5  SK Raina  KMDN Kulasekara   104
## 6  SK Raina       SL Malinga    96
## 7  SK Raina      JW Dernbach    94
## 8  SK Raina          ST Finn    93
## 9  SK Raina      JC Tredwell    86
## 10 SK Raina       T Thushara    84
## ..      ...              ...   ...
b <- batsmanVsBowlersAllOppnRept(matches,theTeam="India",rank=6)
b
## Source: local data frame [50 x 3]
## Groups: batsman [1]
## 
##         batsman          bowler  runs
##          (fctr)          (fctr) (dbl)
## 1  SR Tendulkar      MG Johnson   178
## 2  SR Tendulkar           B Lee   137
## 3  SR Tendulkar     JM Anderson   133
## 4  SR Tendulkar      SL Malinga   133
## 5  SR Tendulkar KMDN Kulasekara   127
## 6  SR Tendulkar        JR Hopes    94
## 7  SR Tendulkar        Umar Gul    92
## 8  SR Tendulkar       SCJ Broad    89
## 9  SR Tendulkar    IDR Bradshaw    85
## 10 SR Tendulkar      BAW Mendis    80
## ..          ...             ...   ...
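
The way the rank argument behaves in the calls above (0 for the top list, i for the breakup of the i-th batsman) can be sketched in base R as below. The function batsmanVsBowlersSketch and its data are hypothetical and only mimic the observed behaviour, they are not the yorkr implementation.

```r
# Made-up ball-by-ball rows
balls <- data.frame(
  batsman = c("Batsman A", "Batsman A", "Batsman A", "Batsman B", "Batsman B"),
  bowler  = c("Bowler X", "Bowler Y", "Bowler X", "Bowler X", "Bowler Z"),
  runs    = c(4, 6, 1, 2, 3),
  stringsAsFactors = FALSE
)

# Hypothetical helper mimicking the rank argument
batsmanVsBowlersSketch <- function(df, rank = 0) {
  totals <- aggregate(runs ~ batsman, data = df, FUN = sum)
  totals <- totals[order(-totals$runs), ]
  if (rank == 0) {
    return(head(totals, 10))          # top 10 batsmen by total runs
  }
  chosen <- totals$batsman[rank]      # batsman at the requested rank
  sub <- df[df$batsman == chosen, ]
  out <- aggregate(runs ~ batsman + bowler, data = sub, FUN = sum)
  out[order(-out$runs), ]             # bowlers he scored most against first
}

batsmanVsBowlersSketch(balls, rank = 0)
batsmanVsBowlersSketch(balls, rank = 1)
```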

7. Batsman vs bowlers report – Other countries against India

As before, using another team for theTeam, e.g. West Indies or Pakistan, will show the batsmen of those countries who made the most runs against India in ODIs. The reports below show the performances of batsmen from West Indies, Ireland and Zimbabwe.

a <- batsmanVsBowlersAllOppnRept(matches,theTeam="West Indies",rank=0)
a
## Source: local data frame [10 x 2]
## 
##        batsman runsScored
##         (fctr)      (dbl)
## 1    RR Sarwan        655
## 2   MN Samuels        653
## 3     DM Bravo        523
## 4  LMP Simmons        426
## 5     CH Gayle        414
## 6   KA Pollard        359
## 7    DJG Sammy        348
## 8   AD Russell        308
## 9     DJ Bravo        301
## 10     BC Lara        268
a <- batsmanVsBowlersAllOppnRept(matches,theTeam="Ireland",rank=0)
a
## Source: local data frame [10 x 2]
## 
##            batsman runsScored
##             (fctr)      (dbl)
## 1       NJ O'Brien        173
## 2  WTS Porterfield        158
## 3      DT Johnston         51
## 4      PR Stirling         42
## 5        AR Cusack         35
## 6      A Balbirnie         24
## 7        GC Wilson         19
## 8         DI Joyce         18
## 9        JF Mooney         17
## 10        AR White         13
a <- batsmanVsBowlersAllOppnRept(matches,theTeam="Zimbabwe",rank=0)
a
## Source: local data frame [10 x 2]
## 
##          batsman runsScored
##           (fctr)      (dbl)
## 1     BRM Taylor        328
## 2   E Chigumbura        322
## 3    H Masakadza        285
## 4  Sikandar Raza        202
## 5    SC Williams        186
## 6   CJ Chibhabha        158
## 7      V Sibanda        140
## 8      CR Ervine         94
## 9       P Utseya         71
## 10   R Mutumbami         61

8. Batsman vs bowlers plots

df <- batsmanVsBowlersAllOppnRept(matches,theTeam="India",rank=1)
batsmanVsBowlersAllOppnPlot(df)

batsmanvsbowler-1

df <- batsmanVsBowlersAllOppnRept(matches,theTeam="India",rank=2)
batsmanVsBowlersAllOppnPlot(df)

batsmanvsbowler-2

df <- batsmanVsBowlersAllOppnRept(matches,theTeam="South Africa",rank=1)
d <- complete.cases(df) # Remove NAs
df <- df[d,]
batsmanVsBowlersAllOppnPlot(df)

batsmanvsbowler-3

df <- batsmanVsBowlersAllOppnRept(matches,theTeam="Pakistan",rank=3)
d <- complete.cases(df) # Remove NAs
df <- df[d,]
batsmanVsBowlersAllOppnPlot(df)

batsmanvsbowler-4

9. Top ODI bowlers of India

The overall bowling performance of all Indian bowlers in all ODI matches played so far is computed in the function below. The top 5 Indian ODI bowlers by wickets taken are

  1. Ravindra Jadeja
  2. Ravichandran Ashwin
  3. Zaheer Khan
  4. Harbhajan Singh
  5. Ishant Sharma
df <- teamBowlingDetailsAllOppnMain(matches,theTeam="India")
df
## Source: local data frame [59 x 5]
## 
##             bowler overs maidens  runs wickets
##             (fctr) (int)   (int) (dbl)   (dbl)
## 1        RA Jadeja    43       0  4743     153
## 2         R Ashwin    49       0  4209     146
## 3           Z Khan    47       0  3686     141
## 4  Harbhajan Singh    45       0  4032     123
## 5         I Sharma    51       0  3216     113
## 6         MM Patel    49       1  2392      92
## 7          P Kumar    50       2  2748      84
## 8         UT Yadav    51       0  2442      80
## 9   Mohammed Shami    43       0  1802      80
## 10    Yuvraj Singh    38       0  2588      77
## ..             ...   ...     ...   ...     ...

10. Top ODI bowlers of other countries against India

The table below provides the details of the bowlers who have the best performances against India. This is obtained when theTeam=“India”. Mitchell Johnson has a haul of 44 wickets for 1012 runs, followed by Kulasekara who has 40 wickets for 1492 runs, and then Mendis who has taken 34 wickets for 810 runs.

df <- teamBowlingDetailsAllOppn(matches,theTeam="India")
df
## Source: local data frame [309 x 5]
## 
##             bowler overs maidens  runs wickets
##             (fctr) (int)   (int) (dbl)   (dbl)
## 1       MG Johnson    47       0  1012      44
## 2  KMDN Kulasekara    44       0  1492      40
## 3       BAW Mendis    37       0   810      34
## 4         DW Steyn    35       1   714      34
## 5       SL Malinga    48       1  1402      33
## 6      JM Anderson    31       0   991      33
## 7       AD Mathews    47       1   800      31
## 8      NLTC Perera    45       0   983      30
## 9          ST Finn    38       0   775      30
## 10       SCJ Broad    29       2   903      29
## ..             ...   ...     ...   ...     ...
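
One way to compare these hauls is runs conceded per wicket. The base R sketch below re-types the first three rows of the table above; the average column is my own addition for illustration and is not part of the function's output.

```r
# First three rows of the table of bowlers against India
top3 <- data.frame(bowler  = c("MG Johnson", "KMDN Kulasekara", "BAW Mendis"),
                   runs    = c(1012, 1492, 810),
                   wickets = c(44, 40, 34),
                   stringsAsFactors = FALSE)

# Runs conceded per wicket taken (lower is better)
top3$average <- round(top3$runs / top3$wickets, 2)
top3
```

On these numbers Johnson concedes 23 runs per wicket and Mendis about 23.8, while Kulasekara's 40 wickets cost about 37.3 runs each.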

11. Top ODI bowlers of India against other countries

The tables below give the performances of Indian bowlers against different opposition. Against Australia the top 3 bowlers are Ishant Sharma, Harbhajan Singh and Irfan Pathan. For ODI matches against England the top 3 are Jadeja, Ashwin and Munaf Patel. The tables for matches against South Africa and Pakistan are also included.

df <- teamBowlingDetailsAllOppn(matches,theTeam="Australia")
df
## Source: local data frame [37 x 5]
## 
##             bowler overs maidens  runs wickets
##             (fctr) (int)   (int) (dbl)   (dbl)
## 1         I Sharma    44       1   739      26
## 2  Harbhajan Singh    40       0   926      25
## 3        IK Pathan    42       1   702      22
## 4         UT Yadav    37       2   606      18
## 5      S Sreesanth    34       0   454      18
## 6        RA Jadeja    39       0   867      16
## 7           Z Khan    33       1   500      15
## 8         R Ashwin    43       0   680      14
## 9          P Kumar    27       0   501      14
## 10   R Vinay Kumar    31       1   380      14
## ..             ...   ...     ...   ...     ...
df <- teamBowlingDetailsAllOppn(matches,theTeam="England")
df
## Source: local data frame [32 x 5]
## 
##             bowler overs maidens  runs wickets
##             (fctr) (int)   (int) (dbl)   (dbl)
## 1        RA Jadeja    34       0   735      35
## 2         R Ashwin    32       0   792      34
## 3         MM Patel    16       0   478      18
## 4           Z Khan    26       1   518      17
## 5         RP Singh    19       1   438      12
## 6         I Sharma    32       1   418      12
## 7         RR Powar    22       0   259      11
## 8          B Kumar    17       1   367      10
## 9         SK Raina    17       0   238      10
## 10 Harbhajan Singh    15       0   293       9
## ..             ...   ...     ...   ...     ...
df <- teamBowlingDetailsAllOppn(matches,theTeam="South Africa")
df
## Source: local data frame [33 x 5]
## 
##             bowler overs maidens  runs wickets
##             (fctr) (int)   (int) (dbl)   (dbl)
## 1           Z Khan    19       1   552      25
## 2  Harbhajan Singh    31       0   580      15
## 3         MM Patel    19       0   310      15
## 4   Mohammed Shami     9       1   215      11
## 5     Yuvraj Singh    17       0   279       9
## 6        RA Jadeja    18       1   299       8
## 7          A Nehra    27       1   366       7
## 8        MM Sharma    16       0   307       7
## 9      S Sreesanth    18       0   266       7
## 10         B Kumar    11       0   374       6
## ..             ...   ...     ...   ...     ...
df <- teamBowlingDetailsAllOppn(matches,theTeam="Pakistan")
df
## Source: local data frame [28 x 5]
## 
##             bowler overs maidens  runs wickets
##             (fctr) (int)   (int) (dbl)   (dbl)
## 1         I Sharma    32       1   405      14
## 2           Z Khan    20       0   284      12
## 3        IK Pathan    30       3   504      10
## 4          P Kumar    24       0   387      10
## 5  Harbhajan Singh    29       0   339      10
## 6         RP Singh    25       1   319      10
## 7         R Ashwin    22       0   302      10
## 8        RA Jadeja    23       2   250      10
## 9          B Kumar    14       0   194       9
## 10    Yuvraj Singh    16       0   241       6
## ..             ...   ...     ...   ...     ...

12. Top Indian ODI bowlers vs batsman

The reports below give the performances of bowlers against opposition batsmen.

  1. The 1st call with theTeam=“India” and rank=0 gives the bowlers who have conceded the most runs against India.
  2. The 2nd call with rank=1 gives the Indian batsmen who scored the most runs against the top-ranked bowler, Kulasekara.
  3. The 3rd call with rank=2 gives the performance of Malinga, who has conceded the 2nd most runs in ODIs against India, and the batsmen who made these runs.

a <- bowlersVsBatsmanAllOppnRept(matches,theTeam="India",rank=0)
a
## Source: local data frame [10 x 2]
## 
##             bowler  runs
##             (fctr) (dbl)
## 1  KMDN Kulasekara  1448
## 2       SL Malinga  1319
## 3      NLTC Perera   959
## 4      JM Anderson   954
## 5       MG Johnson   931
## 6        SCJ Broad   877
## 7       BAW Mendis   783
## 8       AD Mathews   776
## 9          ST Finn   751
## 10        DJ Bravo   739
a <- bowlersVsBatsmanAllOppnRept(matches,theTeam="India",rank=1)
a
## Source: local data frame [31 x 3]
## Groups: bowler [1]
## 
##             bowler      batsman runsConceded
##             (fctr)       (fctr)        (dbl)
## 1  KMDN Kulasekara     V Sehwag          199
## 2  KMDN Kulasekara      V Kohli          196
## 3  KMDN Kulasekara    G Gambhir          157
## 4  KMDN Kulasekara SR Tendulkar          127
## 5  KMDN Kulasekara Yuvraj Singh          118
## 6  KMDN Kulasekara    RG Sharma          114
## 7  KMDN Kulasekara     SK Raina          104
## 8  KMDN Kulasekara     MS Dhoni           80
## 9  KMDN Kulasekara   KD Karthik           56
## 10 KMDN Kulasekara   SC Ganguly           51
## ..             ...          ...          ...
a <- bowlersVsBatsmanAllOppnRept(matches,theTeam="India",rank=2)
a
## Source: local data frame [31 x 3]
## Groups: bowler [1]
## 
##        bowler      batsman runsConceded
##        (fctr)       (fctr)        (dbl)
## 1  SL Malinga      V Kohli          175
## 2  SL Malinga    G Gambhir          170
## 3  SL Malinga     MS Dhoni          144
## 4  SL Malinga     V Sehwag          140
## 5  SL Malinga SR Tendulkar          133
## 6  SL Malinga     SK Raina           96
## 7  SL Malinga Yuvraj Singh           64
## 8  SL Malinga   KD Karthik           52
## 9  SL Malinga    RG Sharma           50
## 10 SL Malinga   RV Uthappa           47
## ..        ...          ...          ...

13. Top ODI bowlers of other countries vs batsman

When we use other teams in theTeam we get the Indian bowlers who have conceded the most runs against that team.

a <- bowlersVsBatsmanAllOppnRept(matches,theTeam="Sri Lanka",rank=0)
a
## Source: local data frame [10 x 2]
## 
##             bowler  runs
##             (fctr) (dbl)
## 1           Z Khan  1141
## 2        RA Jadeja   882
## 3         I Sharma   855
## 4  Harbhajan Singh   805
## 5          P Kumar   758
## 6         R Ashwin   736
## 7        IK Pathan   674
## 8          A Nehra   584
## 9         UT Yadav   544
## 10        MM Patel   484
a <- bowlersVsBatsmanAllOppnRept(matches,theTeam="England",rank=0)
a
## Source: local data frame [10 x 2]
## 
##          bowler  runs
##          (fctr) (dbl)
## 1      R Ashwin   777
## 2     RA Jadeja   729
## 3        Z Khan   503
## 4      MM Patel   459
## 5      RP Singh   410
## 6      I Sharma   396
## 7     PP Chawla   375
## 8  Yuvraj Singh   370
## 9       B Kumar   353
## 10   AB Agarkar   336
a <- bowlersVsBatsmanAllOppnRept(matches,theTeam="New Zealand",rank=0)
a
## Source: local data frame [10 x 2]
## 
##            bowler  runs
##            (fctr) (dbl)
## 1        R Ashwin   456
## 2       RA Jadeja   363
## 3    Yuvraj Singh   320
## 4  Mohammed Shami   304
## 5         A Nehra   302
## 6         P Kumar   289
## 7        I Sharma   281
## 8          Z Khan   238
## 9         B Kumar   233
## 10       MM Patel   213

14. Top ODI bowlers vs batsman plots

The plots below give the performances of bowlers against batsmen. The logic is the same as above.

df <- bowlersVsBatsmanAllOppnRept(matches,theTeam="India",rank=1)
bowlerVsBatsmanAllOppnPlot(df,"India","India")

bowlerBatsman-1

df <- bowlersVsBatsmanAllOppnRept(matches,theTeam="England",rank=1)
bowlerVsBatsmanAllOppnPlot(df,"India","England")

bowlerBatsman-2

df <- bowlersVsBatsmanAllOppnRept(matches,theTeam="Australia",rank=1)
bowlerVsBatsmanAllOppnPlot(df,"India","Australia")

bowlerBatsman-3

15. Top ODI bowlers wicket kind

The following plots give the top 8 bowlers against India and the wicket kind taken

teamBowlingWicketKindAllOppn(matches,t1="India",t2="All")

wicketKind-1-1

The plots below give the top 8 Indian bowlers against different countries

teamBowlingWicketKindAllOppn(matches,t1="India",t2="Bangladesh")

wicketKind-2-1

teamBowlingWicketKindAllOppn(matches,t1="India",t2="New Zealand")

wicketKind-2-2

teamBowlingWicketKindAllOppn(matches,t1="India",t2="West Indies")

wicketKind-2-3

teamBowlingWicketKindAllOppn(matches,t1="India",t2="Sri Lanka")

wicketKind-2-4
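
The counting behind a wicket-kind plot can be sketched as a simple cross-tabulation of bowler against the kind of dismissal. The base R example below uses made-up rows and a hypothetical column name wicketKind, so it only illustrates the idea and is not how the yorkr function is written.

```r
# Made-up dismissals: one row per wicket taken
wickets <- data.frame(
  bowler     = c("Bowler A", "Bowler A", "Bowler B", "Bowler A", "Bowler B"),
  wicketKind = c("caught", "bowled", "lbw", "caught", "caught"),
  stringsAsFactors = FALSE
)

# Cross-tabulate bowler against wicket kind
counts <- table(wickets$bowler, wickets$wicketKind)
counts

# Long form (one row per bowler/kind pair), convenient as input to a bar plot
as.data.frame(counts)
```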

16. Top ODI bowlers wicket runs

The plot below gives the top 8 performances of bowlers against India with wickets taken and runs conceded. The maximum number of wickets is 44 (pink), taken by Mitchell Johnson while conceding around 1000 runs. Kulasekara has 40 wickets (purple) conceding around 1500 runs.

teamBowlingWicketRunsAllOppn(matches,t1="India",t2="All")

wicketRuns-1-1

The plots below give the top 8 Indian bowlers against different countries. The rightmost bar belongs to the bowler with the most wickets, and the taller the bar the more runs conceded.

teamBowlingWicketRunsAllOppn(matches,t1="India",t2="Zimbabwe")

wicketRuns-2-1

teamBowlingWicketRunsAllOppn(matches,t1="India",t2="Australia")

wicketRuns-2-2

teamBowlingWicketRunsAllOppn(matches,t1="India",t2="Pakistan")

wicketRuns-2-3

teamBowlingWicketRunsAllOppn(matches,t1="India",t2="New Zealand")

wicketRuns-2-4

Important note: Do check out my other posts using yorkr at yorkr-posts

Conclusion

Here are some quick conclusions I have gleaned from the analysis

  1. Virat Kohli has the most runs in ODIs, followed by Mahendra Dhoni and then Suresh Raina.
  2. Though Kohli has a better strike rate than Dhoni, Dhoni and Raina have nearly twice the number of 6s as Kohli.
  3. Among batsmen from other countries, the ones to be feared are GJ Bailey, Younis Khan, AB de Villiers etc.
  4. Among Indian ODI bowlers Ravindra Jadeja, Ashwin and Zaheer Khan have the most wickets.
  5. Ishant Sharma and Harbhajan Singh performed well against Australia, and RA Jadeja and Ashwin against England, and so on.
  6. India has to be wary of Mitchell Johnson, Kulasekara and Malinga.

Also see

  1. Cricket analytics with cricketr in paperback and Kindle versions
  2. Introducing cricketr! : An R package to analyze performances of cricketers
  3. Cricketr plays the ODIs
  4. Cricketr adapts to Twenty20 International

You may also like

  1. Revisiting crimes against women in India
  2. Literacy in India – A deepR dive
  3. Bend it like Bluemix,MongoDB using Autoscaling – Part 2
  4. A closer look at Robot Horse on a trot in Android
  5. Programming Zen and now – Some essential tips
  6. Design principles of scalable distributed systems
  7. Sea shells on the sea shore

The making of cricket package yorkr – Part 1

Introduction

Here is a sneak preview of my latest cricket package yorkr in R. My earlier package ‘cricketr’ (see Introducing cricketr: An R package for analyzing performances of cricketers) was based on data from ESPN Cricinfo Statsguru. My current package ‘yorkr’ is based on data from Cricsheet. The data for Test, ODI and Twenty20 matches in Cricsheet are formatted as yaml files.

While the data available from ESPN Cricinfo Statsguru is a summary of the player’s performances, Cricsheet data is more detailed and granular. Cricsheet gives a ball-by-ball detail for each match as can be seen from the above website. Hence the type of analyses possible can be much more detailed and richer. Some cool functions in this package include charts for batsman partnerships, performance of batsmen against bowlers and how bowlers fared against batsmen, for a single ODI match or for all ODI matches between 2 opposing sides (for e.g. Australia-India or West Indies-Sri Lanka).

This current post includes my first stab at analysing ODI data from Cricsheet. To do this I had to parse the yaml files and flatten them out as data frames. That was a fairly involved task, and I think I have now done it. I then perform analyses on these thousands of flattened data frames. This post contains my initial analyses of the ODI data from Cricsheet.
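
To give a feel for what flattening means here, below is a minimal base R sketch. It assumes the yaml file has already been parsed into a nested R list, and the list shown is a heavily simplified, hand-written stand-in: the field names and values are made up, and real Cricsheet files carry much more detail.

```r
# Simplified stand-in for one parsed innings (hypothetical structure and values)
innings <- list(
  team = "Team A",
  deliveries = list(
    list(ball = 0.1, batsman = "Batsman A", bowler = "Bowler X", runs = 0),
    list(ball = 0.2, batsman = "Batsman A", bowler = "Bowler X", runs = 4),
    list(ball = 0.3, batsman = "Batsman B", bowler = "Bowler X", runs = 1)
  )
)

# Turn each delivery into a one-row data frame ...
rows <- lapply(innings$deliveries, function(d) {
  data.frame(team = innings$team, ball = d$ball, batsman = d$batsman,
             bowler = d$bowler, runs = d$runs, stringsAsFactors = FALSE)
})

# ... and stack the rows into a single flat data frame
flat <- do.call(rbind, rows)
flat
```

Once every innings is in this one-row-per-ball shape, the usual filtering and aggregation on data frames can be applied directly.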

The package ‘yorkr’ is still work in progress. I will be adding more functions, refining existing functions, and crossing the t’s and dotting the i’s. I hope to have the yorkr package wrapped up in about 6-10 weeks time. The package and code should be available after that. Please ‘hold your horses’ till this time.

This report is also available at RPubs at yorkr1. The report can also be downloaded as a PDF document at yorkr-1.pdf

Check out my interactive Shiny apps GooglyPlus (plots & tables) and Googly (only plots) which can be used to analyze IPL players, teams and matches.

The current set of functions developed fall into 4 main categories

  • batsmen performance in match
  • bowlers performance in match
  • batsmen performance against opposition
  • bowlers performance against opposition

In the first part of the post I have taken a single Australia-India ODI match played on 24 Feb 2008 at Sydney. (For details on this match look up Australia – India, Sydney)

The second part of the post looks at all ODI matches between Australia and India (there are 40 ODI matches between India and Australia).

While this post analyses 1 ODI match and all matches between 2 opposing sides (Australia vs India), the functions developed in yorkr (Part 1) can be used for any of the 1000+ ODI matches and any combination of opposing countries!

So without much ado let me dive into the functions created.

library(dplyr)
library(ggplot2)
library(yorkr)

Get the match details (Aus-Ind,24 Feb 2008,Sydney)

match <- getMatchDetails()

Team batting performances of the opposing teams

In this post I pick an ODI match played between India and Australia on 24 Feb 2008 at Sydney.

1. Team batting details (ODI Match)

This function gives the overall scores of the team for which the function is invoked.

teamBattingDetailsMatch(match,"India")
## Total= 272
## Source: local data frame [11 x 5]
## 
##            batsman ballsPlayed fours sixes  runs
##             (fctr)       (int) (dbl) (dbl) (dbl)
## 1         V Sehwag          18     3     0    17
## 2     SR Tendulkar           3     0     0     2
## 3        G Gambhir         118     9     1   113
## 4        RG Sharma           3     0     0     1
## 5     Yuvraj Singh           3     1     0     5
## 6         MS Dhoni          64     4     0    36
## 7       RV Uthappa          40     4     1    51
## 8        IK Pathan          20     2     0    22
## 9  Harbhajan Singh          11     3     0    20
## 10     S Sreesanth           4     0     0     3
## 11        I Sharma           3     0     0     2
teamBattingDetailsMatch(match,"Australia")
## Total= 303
## Source: local data frame [7 x 5]
## 
##        batsman ballsPlayed fours sixes  runs
##         (fctr)       (int) (dbl) (dbl) (dbl)
## 1 AC Gilchrist           7     3     0    16
## 2    ML Hayden          61     5     1    54
## 3   RT Ponting         132     7     1   124
## 4    MJ Clarke          38     0     0    31
## 5    A Symonds          48     6     2    59
## 6   MEK Hussey          10     1     0    15
## 7     JR Hopes           3     0     0     4
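As a rough illustration of how a table like the ones above can be derived from ball-by-ball data, here is a minimal dplyr sketch. The data frame `balls`, its column names and the team and batsman labels are hypothetical toy values, and the logic is a simplification (for example, it counts every delivery as a ball faced); it is not the actual body of `teamBattingDetailsMatch`.

```r
library(dplyr)

# Hypothetical ball-by-ball rows: one row per delivery
balls <- data.frame(
  team    = c("Team A", "Team A", "Team A", "Team A", "Team A", "Team A", "Team B"),
  batsman = c("Batsman P", "Batsman P", "Batsman P", "Batsman P",
              "Batsman Q", "Batsman Q", "Batsman R"),
  runs    = c(4, 1, 6, 0, 2, 4, 1),
  stringsAsFactors = FALSE
)

# Per-batsman totals for one team. 'runs' is summed last so that the
# fours and sixes counts still see the per-delivery values.
battingSummary <- balls %>%
  filter(team == "Team A") %>%
  group_by(batsman) %>%
  summarise(
    ballsPlayed = n(),
    fours       = sum(runs == 4),
    sixes       = sum(runs == 6),
    runs        = sum(runs)
  )

print(battingSummary)
cat("Total=", sum(battingSummary$runs), "\n")
```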

2. Batsmen partnership (ODI Match)

The plots below show the partnerships between batsmen. Gautam Gambhir scored the highest, followed by Uthappa. Gambhir had good partnerships with Sehwag, Dhoni and Uthappa. On the Australian side Ponting had good partnerships with Hayden, Clarke and Symonds.

batsmenPartnershipMatch(match,"India")

partnershipmatch-1

batsmenPartnershipMatch(match,"Australia")

partnershipmatch-2

3. Batsmen vs Bowlers (ODI Match)

This chart shows how each batsman fared against the bowlers. Gambhir scored the most runs off Hogg and Clarke, while Ponting scored the most off Pathan, Ishant Sharma and Sreesanth.

batsmenVsBowlersMatch(match,"India")

batsmenbowler-1

batsmenVsBowlersMatch(match,"Australia")

batsmenbowler-2

4. Team bowling details (ODI Match)

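The tables below give the bowling details against each team. Note that the team passed to the function is the batting side, so the call with "India" lists the Australian bowlers and vice versa.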

teamBowlingDetailsMatch(match,"India")
## Source: local data frame [6 x 5]
## 
##       bowler overs maidens  runs wickets
##       (fctr) (int)   (int) (dbl)   (dbl)
## 1      B Lee    10       2    58       5
## 2 NW Bracken    10       0    53       1
## 3   SR Clark    10       0    55       2
## 4   JR Hopes     6       0    27       1
## 5    GB Hogg     9       0    62       1
## 6  MJ Clarke     5       0    33       0
teamBowlingDetailsMatch(match,"Australia")
## Source: local data frame [6 x 5]
## 
##            bowler overs maidens  runs wickets
##            (fctr) (int)   (int) (dbl)   (dbl)
## 1     S Sreesanth     8       0    58       2
## 2        I Sharma    10       0    65       1
## 3       IK Pathan     9       0    73       0
## 4 Harbhajan Singh     9       0    50       2
## 5        V Sehwag     6       0    28       2
## 6    Yuvraj Singh     8       0    38       0
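A bowling table of this kind can be sketched in the same way from ball-by-ball rows. The snippet below is a simplified illustration under stated assumptions: the data frame `balls`, its columns (`over`, `totalRuns`, `wicketKind`) and the bowler labels are hypothetical, maidens are left out, and every dismissal is credited to the bowler, which a real implementation would not do for run outs.

```r
library(dplyr)

# Hypothetical ball-by-ball rows; wicketKind is NA when no wicket fell
balls <- data.frame(
  bowler     = c("Bowler X", "Bowler X", "Bowler Y", "Bowler Y", "Bowler X", "Bowler X"),
  over       = c(1, 1, 2, 2, 3, 3),
  totalRuns  = c(0, 4, 1, 0, 2, 0),
  wicketKind = c(NA, NA, NA, "caught", NA, "bowled"),
  stringsAsFactors = FALSE
)

# Overs bowled, runs conceded and wickets taken per bowler
bowlingSummary <- balls %>%
  group_by(bowler) %>%
  summarise(
    overs   = n_distinct(over),
    runs    = sum(totalRuns),
    wickets = sum(!is.na(wicketKind))
  ) %>%
  arrange(desc(wickets), runs)

print(bowlingSummary)
```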

5. Wicket kind (ODI Match)

This chart gives the wicket kind, i.e. the type of dismissal, for each bowler against the runs conceded.

teamBowlingWicketKindMatch(match,"India")

wicketKindmatch-1

teamBowlingWicketKindMatch(match,"Australia")

wicketKindmatch-2

6. Wickets Runs (ODI Match)

This plot gives the number of wickets taken and the runs conceded by each bowler.

teamBowlingWicketRunsMatch(match,"India")

wicketRunsMatch-1

teamBowlingWicketRunsMatch(match,"Australia")

wicketRunsMatch-2

7. Wicket (batsman) and total runs scored (ODI Match)

This plot gives the details of the wickets taken and the runs conceded. Brett Lee has the best performance with 5 scalps. On the Indian side Sreesanth, Harbhajan and Sehwag have 2 wickets apiece. Of these three, Sreesanth is the most expensive, conceding 58 runs in 8 overs.

teamBowlingWicketMatch(match,"India")

wicketMatch-1

teamBowlingWicketMatch(match,"Australia")

wicketMatch-2

8. Bowler vs Batsman (ODI Match)

The plot below shows which batsman was most brutal against each bowler, i.e. who scored the most against that bowler. Ponting scored the most against Pathan.

bowlersVsBatsmanMatch(match,"India")


batsmanMatch,-1

bowlersVsBatsmanMatch(match,"Australia")


batsmanMatch-2

9. Worm graph (ODI Match)

This chart gives the match worm of runs scored against the number of deliveries.

matchWormGraph(match,team1="Australia",team2="India")

worm-1
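A worm graph is simply the running total of runs plotted against the delivery count for each side. The sketch below shows that idea with base R and ggplot2 on toy numbers; the data frame `innings`, its columns and the team labels are assumptions for illustration and do not reproduce the internals of `matchWormGraph`.

```r
library(ggplot2)

# Hypothetical total runs off each delivery for two short innings
innings <- data.frame(
  team      = rep(c("Team A", "Team B"), each = 6),
  totalRuns = c(1, 0, 4, 2, 0, 6,
                0, 1, 1, 4, 0, 2)
)

# Delivery counter and running total within each team's innings
innings$delivery <- ave(innings$totalRuns, innings$team, FUN = seq_along)
innings$cumRuns  <- ave(innings$totalRuns, innings$team, FUN = cumsum)

# One line per team: cumulative runs against deliveries bowled
worm <- ggplot(innings, aes(x = delivery, y = cumRuns, colour = team)) +
  geom_line() +
  labs(x = "Deliveries", y = "Cumulative runs", title = "Match worm (toy data)")
print(worm)
```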

The following charts show the performances of the batsmen and bowlers against the opposition. In this case I have chosen India and Australia, so the plots below show the best performers (batsmen and bowlers) of either team against their adversary. The analyses below are based on all ODI confrontations between Australia and India; there are 40 such matches in total.

allMatches <- getOppositionDetails()
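One plausible way to build such a combined data set is to stack the per-match data frames with `rbind` and then aggregate over the result. The sketch below uses two tiny hypothetical match data frames (`matchA`, `matchB`) with made-up labels and values; it is only an illustration of the idea and not necessarily how `getOppositionDetails` works internally.

```r
# Two hypothetical per-match data frames with identical columns
matchA <- data.frame(matchId = 1,
                     batsman = c("Batsman A", "Batsman B"),
                     runs    = c(30, 12),
                     stringsAsFactors = FALSE)
matchB <- data.frame(matchId = 2,
                     batsman = c("Batsman A", "Batsman C"),
                     runs    = c(45, 8),
                     stringsAsFactors = FALSE)

# Stack all matches into one data frame
toyMatches <- do.call(rbind, list(matchA, matchB))

# Total runs per batsman across all stacked matches, highest first
totals <- aggregate(runs ~ batsman, data = toyMatches, FUN = sum)
totals <- totals[order(-totals$runs), ]
print(totals)
```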

10. Batsman partnership against opposition (all ODI matches)

The report below gives the batsmen who have had the best partnerships in Australia-India matches. On the Indian side the top 3 are Mahendra Singh Dhoni, Rohit Sharma and Tendulkar. Ponting, Hussey and Bailey are the top 3 for the Australians. As far as ODIs are concerned, Dhoni towers over all others. Of course, similar analyses can be done for India-Pakistan, India-South Africa etc. But at least against the Australians we need to have Dhoni and Rohit Sharma, I think.

The report below gives a summary of the partnership runs.

report <- batsmanPartnershipOppn(allMatches,"India",report="summary")
report
## Source: local data frame [44 x 2]
## 
##         batsman partnershipRuns
##          (fctr)           (dbl)
## 1      MS Dhoni            1156
## 2     RG Sharma             914
## 3  SR Tendulkar             910
## 4       V Kohli             902
## 5     G Gambhir             532
## 6  Yuvraj Singh             524
## 7      SK Raina             509
## 8      S Dhawan             471
## 9      V Sehwag             287
## 10   RV Uthappa             279
## ..          ...             ...
report <- batsmanPartnershipOppn(allMatches,"Australia",report="summary")
report
## Source: local data frame [48 x 2]
## 
##       batsman partnershipRuns
##        (fctr)           (dbl)
## 1  RT Ponting             876
## 2  MEK Hussey             753
## 3   GJ Bailey             610
## 4   SR Watson             609
## 5   MJ Clarke             607
## 6   ML Hayden             573
## 7   A Symonds             536
## 8    AJ Finch             525
## 9   SPD Smith             467
## 10  DA Warner             391
## ..        ...             ...

The report below gives a detailed breakup of the partnership runs

report <- batsmanPartnershipOppn(allMatches,"India",report="detailed")
report[1:40,]
##         batsman      nonStriker runs partnershipRuns
## 1      MS Dhoni    SR Tendulkar   71            1156
## 2      MS Dhoni        R Dravid   27            1156
## 3      MS Dhoni    Yuvraj Singh  128            1156
## 4      MS Dhoni        SK Raina  187            1156
## 5      MS Dhoni          M Kaif    6            1156
## 6      MS Dhoni        D Mongia   23            1156
## 7      MS Dhoni Harbhajan Singh   16            1156
## 8      MS Dhoni       IK Pathan   42            1156
## 9      MS Dhoni       G Gambhir  117            1156
## 10     MS Dhoni       RG Sharma   56            1156
## 11     MS Dhoni      RV Uthappa   51            1156
## 12     MS Dhoni     S Sreesanth   19            1156
## 13     MS Dhoni        I Sharma    4            1156
## 14     MS Dhoni         P Kumar    1            1156
## 15     MS Dhoni         V Kohli   78            1156
## 16     MS Dhoni       RA Jadeja  103            1156
## 17     MS Dhoni        R Ashwin   78            1156
## 18     MS Dhoni        R Sharma    2            1156
## 19     MS Dhoni   R Vinay Kumar   30            1156
## 20     MS Dhoni          Z Khan    6            1156
## 21     MS Dhoni       AM Rahane   47            1156
## 22     MS Dhoni       MK Pandey   34            1156
## 23     MS Dhoni Gurkeerat Singh    1            1156
## 24     MS Dhoni         B Kumar   26            1156
## 25     MS Dhoni        RR Powar    3            1156
## 26    RG Sharma    SR Tendulkar   66             914
## 27    RG Sharma    Yuvraj Singh    5             914
## 28    RG Sharma        SK Raina   69             914
## 29    RG Sharma        MS Dhoni   90             914
## 30    RG Sharma               4    0             914
## 31    RG Sharma       G Gambhir   35             914
## 32    RG Sharma         V Kohli  248             914
## 33    RG Sharma       RA Jadeja   13             914
## 34    RG Sharma        R Ashwin   11             914
## 35    RG Sharma        S Dhawan  247             914
## 36    RG Sharma       AM Rahane   77             914
## 37    RG Sharma       MK Pandey   53             914
## 38 SR Tendulkar        R Dravid   12             910
## 39 SR Tendulkar        V Sehwag  111             910
## 40 SR Tendulkar    Yuvraj Singh  173             910
report <- batsmanPartnershipOppn(allMatches,"Australia",report="detailed")
report[1:40,]
##       batsman   nonStriker runs partnershipRuns
## 1  RT Ponting    SR Watson  140             876
## 2  RT Ponting    DR Martyn   35             876
## 3  RT Ponting    MJ Clarke   63             876
## 4  RT Ponting    BJ Haddin   33             876
## 5  RT Ponting    ML Hayden  117             876
## 6  RT Ponting    A Symonds   41             876
## 7  RT Ponting   MEK Hussey   74             876
## 8  RT Ponting AC Gilchrist  113             876
## 9  RT Ponting     TD Paine   68             876
## 10 RT Ponting     CL White   84             876
## 11 RT Ponting    DA Warner    6             876
## 12 RT Ponting      MS Wade    9             876
## 13 RT Ponting    DJ Hussey   20             876
## 14 RT Ponting     SE Marsh   45             876
## 15 RT Ponting     BJ Hodge   28             876
## 16 MEK Hussey   RT Ponting   85             753
## 17 MEK Hussey    MJ Clarke   74             753
## 18 MEK Hussey    BJ Haddin   24             753
## 19 MEK Hussey      GB Hogg   19             753
## 20 MEK Hussey   MG Johnson   43             753
## 21 MEK Hussey     SR Clark    4             753
## 22 MEK Hussey    ML Hayden    5             753
## 23 MEK Hussey    A Symonds    5             753
## 24 MEK Hussey        B Lee   39             753
## 25 MEK Hussey   NW Bracken    3             753
## 26 MEK Hussey     JR Hopes   83             753
## 27 MEK Hussey     CL White  185             753
## 28 MEK Hussey    DA Warner   10             753
## 29 MEK Hussey      MS Wade   35             753
## 30 MEK Hussey    DJ Hussey   10             753
## 31 MEK Hussey   PJ Forrest   59             753
## 32 MEK Hussey     AC Voges   59             753
## 33 MEK Hussey MC Henriques   11             753
## 34  GJ Bailey    SR Watson   79             610
## 35  GJ Bailey    BJ Haddin    7             610
## 36  GJ Bailey            4    0             610
## 37  GJ Bailey    DA Warner    6             610
## 38  GJ Bailey     AJ Finch   22             610
## 39  GJ Bailey    SPD Smith  149             610
## 40  GJ Bailey   GJ Maxwell  133             610
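In the detailed reports above, `runs` is what the batsman scored with a given non-striker at the other end, and `partnershipRuns` is that batsman's total across all partners. A minimal dplyr sketch of this two-level grouping is shown below; the data frame `balls`, its columns and the batsman labels are toy assumptions, and this is not the actual code of `batsmanPartnershipOppn`.

```r
library(dplyr)

# Hypothetical ball-by-ball rows: striker, non-striker and runs off the bat
balls <- data.frame(
  batsman    = c("Batsman P", "Batsman P", "Batsman P", "Batsman Q", "Batsman Q", "Batsman P"),
  nonStriker = c("Batsman Q", "Batsman Q", "Batsman R", "Batsman P", "Batsman P", "Batsman R"),
  runs       = c(4, 1, 2, 6, 0, 3),
  stringsAsFactors = FALSE
)

# Detailed view: runs per (batsman, nonStriker) pair plus the batsman's total
detailed <- balls %>%
  group_by(batsman, nonStriker) %>%
  summarise(runs = sum(runs), .groups = "drop") %>%
  group_by(batsman) %>%
  mutate(partnershipRuns = sum(runs)) %>%
  ungroup() %>%
  arrange(desc(partnershipRuns))

# Summary view: one row per batsman
summaryReport <- distinct(detailed, batsman, partnershipRuns)

print(detailed)
print(summaryReport)
```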

11. Partnership runs against opposition (all ODI matches)

The chart below gives the overall partnership runs. It is a graphical representation of the report above.

batsmanPartnershipOppnChart(allMatches,"India")

partnershipOppnChart,-1

batsmanPartnershipOppnChart(allMatches,"Australia")

partnershipOppnChart,-2

12. Batsmen vs Bowlers against opposition (all ODI matches)

The charts below show how the batsmen fared against the bowlers of the opposition.

batsmanVsBowlersOppn(allMatches,"India")

batsmenVsBowlers,-1

batsmanVsBowlersOppn(allMatches,"Australia")

bowlersVsBatsmen,-2

13. Team batting details opposition (all ODI matches)

The table below gives the total runs scored by each batsman, displayed in descending order. Dhoni, Rohit Sharma and Tendulkar are the top 3 for India, and Ponting, Hussey and Bailey lead for Australia.

teamBattingDetailsOppn(allMatches,"India")
## Total= 8313
## Source: local data frame [44 x 5]
## 
##         batsman  runs fours sixes ballsPlayed
##          (fctr) (dbl) (int) (int)       (int)
## 1      MS Dhoni  1156    78    22        1406
## 2     RG Sharma   914    72    24        1015
## 3  SR Tendulkar   910   103     6        1157
## 4       V Kohli   902    87     6         961
## 5     G Gambhir   532    43     2         677
## 6  Yuvraj Singh   524    52    11         664
## 7      SK Raina   509    43    11         536
## 8      S Dhawan   471    55     6         470
## 9      V Sehwag   287    42     4         303
## 10   RV Uthappa   279    28     7         295
## ..          ...   ...   ...   ...         ...
teamBattingDetailsOppn(allMatches,"Australia")
## Total= 9993
## Source: local data frame [48 x 5]
## 
##       batsman  runs fours sixes ballsPlayed
##        (fctr) (dbl) (int) (int)       (int)
## 1  RT Ponting   876    86     8        1107
## 2  MEK Hussey   753    56     5         816
## 3   GJ Bailey   610    50    13         578
## 4   SR Watson   609    81    10         653
## 5   MJ Clarke   607    45     5         786
## 6   ML Hayden   573    72     8         660
## 7   A Symonds   536    43    15         543
## 8    AJ Finch   525    52     9         617
## 9   SPD Smith   467    44     7         431
## 10  DA Warner   391    40     6         385
## ..        ...   ...   ...   ...         ...

14. Bowler vs Batsman against opposition (all ODI matches)

The charts below give the performance of the bowlers against the batsmen.

bowlersVsBatsmanOppn(allMatches,"India")

bowlersVsBatsmen,-1

bowlersVsBatsmanOppn(allMatches,"Australia")

bowlersVsBatsmen,-2

15. Bowling details against opposition (all ODI matches)

For matches between Australia and India the top 3 wicket takers for Australia are Mitchell Johnson, Brett Lee and JP Faulkner. For India they are Ishant Sharma, Harbhajan Singh and RA Jadeja.

teamBowlingDetailsOppn(allMatches,"India")
## Source: local data frame [39 x 5]
## 
##          bowler overs maidens  runs wickets
##          (fctr) (int)   (int) (dbl)   (dbl)
## 1    MG Johnson    40       0  1012      18
## 2         B Lee    21       1   667      15
## 3   JP Faulkner    33       0   598      13
## 4     SR Watson    24       0   532      12
## 5       GB Hogg    15       0   427      12
## 6      CJ McKay    17       0   403      12
## 7    NW Bracken    28       2   429      11
## 8      MA Starc    12       2   251      11
## 9      JR Hopes    18       0   346       8
## 10 DE Bollinger    11       4   174       8
## ..          ...   ...     ...   ...     ...
teamBowlingDetailsOppn(allMatches,"Australia")
## Source: local data frame [37 x 5]
## 
##             bowler overs maidens  runs wickets
##             (fctr) (int)   (int) (dbl)   (dbl)
## 1         I Sharma    44       1   739      20
## 2  Harbhajan Singh    40       0   926      15
## 3        RA Jadeja    39       0   867      14
## 4        IK Pathan    42       1   702      11
## 5         UT Yadav    37       2   606      10
## 6          P Kumar    27       0   501      10
## 7           Z Khan    33       1   500      10
## 8      S Sreesanth    34       0   454      10
## 9         R Ashwin    43       0   680       9
## 10   R Vinay Kumar    31       1   380       9
## ..             ...   ...     ...   ...     ...

16. Wicket kind against opposition (all ODI matches)

These charts give the wicket kind for each of the top 9 bowlers from each side.

teamBowlingWicketKindOppn(allMatches,"India")

wicketKindOppn-1

teamBowlingWicketKindOppn(allMatches,"Australia")

wicketKindOppn-2

17. Wicket runs against opposition (all ODI matches)

These charts give the runs conceded by the bowlers.

teamBowlingWicketRunsOppn(allMatches,"India")

wicketRunsOppn-1

teamBowlingWicketRunsOppn(allMatches,"Australia")

wicketRunsOppn-2

18. Wickets against opposition (all ODI matches)

The charts below depict the wickets taken by each bowler. Note that Mitchell Johnson has the most wickets for Australia (18), while Ishant Sharma leads for India (20).

teamBowlingWicketsOppn(allMatches,"India")

wicketOppn,-1

teamBowlingWicketsOppn(allMatches,"Australia")

wicketOppn,-2

Conclusion

Some key findings

In the ODI confrontations between Australia and India the top 3 batsmen of India are

  1. Mahendra Singh Dhoni
  2. Rohit Sharma
  3. Sachin Tendulkar

The best bowlers for India are

  1. Ishant Sharma
  2. Harbhajan Singh
  3. R A Jadeja

For the Australian side the top 3 batsmen are

  1. R T Ponting
  2. M Hussey
  3. G J Bailey

The top 3 bowlers are

  1. Mitchell Johnson
  2. Brett Lee
  3. J P Faulkner

Note: This is the first part of my yorkr package. I will be adding more functions in the weeks to come. Clearly the data from Cricsheet is more granular and allows for more detailed analyses. I should have the next set of functions soon.

(Take a look at The making of cricket package yorkr – Part 2)

Important note: Do check out my other posts using yorkr at yorkr-posts

Watch this space!!!

Also see

  1. Cricket analytics with cricketr
  2. Introducing cricketr! : An R package to analyze performances of cricketers
  3. Sixer – R package cricketr’s new Shiny avatar
  4. Informed choices through Machine Learning – Analyzing Kohli, Tendulkar and Dravid

You may also like

  1. Natural language processing: What would Shakespeare say?
  2. Revisiting crimes against women in India
  3. Literacy in India – A deepR dive
  4. TWS-4: Gossip protocol: Epidemics and rumors to the rescue
  5. Singularity
  6. Simulating an Edge shape in Android
  7. Programming Zen and now – Some essential tips
  8. Rock N’ Roll with Bluemix, Cloudant & NodeExpress
  9. Architecting a cloud based IP Multimedia System (IMS)

Cricket analytics with cricketr in paperback and Kindle versions

My book “Cricket analytics with cricketr” is now available in paperback and Kindle versions. The paperback is available from Amazon (US, UK and Europe) for $48.99. The Kindle version can be downloaded from the Kindle store for $2.50 (Rs 169/-). Do pick up your copy. It should be a good read for a Sunday afternoon.

This book of mine contains my posts based on my R package ‘cricketr’ now in CRAN. The package cricketr can analyze both batsmen and bowlers for all formats of the game Test, ODI and Twenty20. The package uses the data from ESPN Cricinfo. The analyses include runs frequency charts, performances of batsmen and bowlers in different grounds and against different teams, moving  average of  runs/wickets over the career, mean strike rate, mean economy rate and so on.

The book includes the following chapters based on my R package cricketr. There are 2 additional articles where I use Machine Learning with Octave.

CONTENTS
Cricket Analytics with cricketr
1.1. Introducing cricketr! : An R package to analyze performances of cricketers
1.2. Taking cricketr for a spin – Part 1
1.2. cricketr digs the Ashes!
1.3. cricketr plays the ODIs!
1.4. cricketr adapts to the Twenty20 International!
1.5. Sixer – R package cricketr’s new Shiny avatar
2. Other cricket posts in R
2.1. Analyzing cricket’s batting legends – Through the mirage with R
2.2. Mirror, mirror … the best batsman of them all?
3. Appendix
Cricket analysis with Machine Learning using Octave
3.1. Informed choices through Machine Learning – Analyzing Kohli, Tendulkar and Dravid
3.2. Informed choices through Machine Learning-2 Pitting together Kumble, Kapil, Chandra
Further reading
Important Links

You can download the latest PDF version of the book at ‘Cricket analytics with cricketr and cricpy: Analytics harmony with R and Python-6th edition’

I do hope you have a great time reading it. Do pick up your copy. Feel free to get in touch with me with your comments and suggestions.  I have more interesting things lined up for the future.

Watch this space!

You may also like
1. Literacy in India : A deepR dive.
2. Natural Language Processing: What would Shakespeare say?
3. Revisiting crimes against women in India
4. Experiments with deblurring using OpenCV
5. TWS-4: Gossip protocol: Epidemics and rumors to the rescue
6. Bend it like Bluemix, MongoDB with autoscaling – Part 1
7. “Is it animal? Is it an insect?” in Android