Pitching yorkpy … short of good length to IPL – Part 1


I fear not the man who has practiced 10,000 kicks once, but I fear the man who has practiced one kick 10,000 times.
Bruce Lee

I’ve missed more than 9000 shots in my career. I’ve lost almost 300 games. 26 times, I’ve been trusted to take the game winning shot and missed. I’ve failed over and over and over again in my life. And that is why I succeed.
Michael Jordan

Man, it doesn’t matter where you come in to bat, the score is still zero
Viv Richards

Introduction

“If cricketr is to cricpy, then yorkr is to _____?”. Yes, you guessed it right, it is yorkpy. In this post, I introduce my 2nd python package, yorkpy, which is a python clone of my R package yorkr. This package is based on data from Cricsheet. yorkpy currently handles IPL T20 matches.

When I created cricpy, the python avatar, of my R package cricketr, see Introducing cricpy:A python package to analyze performances of cricketers, I had decided that I should avoid doing a python avatar of my R package yorkr (see Introducing cricket package yorkr: Part 1- Beaten by sheer pace!) , as it was more involved, and required the parsing of match data available as yaml files.

Just out of curiosity, I tried the python package ‘yaml’ to read the match data, and lo and behold, I was sucked into the developing the package and so, yorkpy was born. Of course, it goes without saying that, usually when I am in the thick of developing something, I occasionally wonder, why I am doing it, for whom and for what purpose? Maybe it is the joy of ideation, the problem-solving,  the programmer’s high, for sharing my ideas etc. Anyway, whatever be the reason, I hope you enjoy this post and also find yorkpy useful.

You can clone/download the code at Github yorkpy
This post has been published to RPubs at yorkpy-Part1
You can download this post as PDF at IPLT20-yorkpy-part1

The IPL T20 functions in yorkpy are

2. Install the package using ‘pip install’

import pandas as pd
import yorkpy.analytics as yka
#pip install yorkpy

3. Load a yaml file from Cricsheet

There are 2 functions that can be to convert the IPL Twenty20 yaml files to pandas dataframeare

  1. convertYaml2PandasDataframeT20
  2. convertAllYaml2PandasDataframesT20

Note 1: While I have already converted the IPL T20 files, you will need to use these functions for future IPL matches

4. Convert and save IPL T20 yaml file to pandas dataframe

This function will convert a IPL T20 IPL yaml file, in the format as specified in Cricsheet to pandas dataframe. This will be saved as as CSV file in the target directory. The name of the file wil have the following format team1-team2-date.csv. The IPL T20 zip file can be downloaded from Indian Premier League matches.  An example of how a yaml file can be converted to a dataframe and saved is shown below.

import pandas as pd
import yorkpy.analytics as yka
#convertYaml2PandasDataframe(".\\1082593.yaml","..\ipl", ..\\data")

5. Convert and save all IPL T20 yaml files to dataframes

This function will convert all IPL T20 yaml files from a source directory to dataframes, and save it in the target directory, with the names as mentioned above. Since I have already done this, I will not be executing this again. You can download the zip of all the converted RData files from Github at yorkpyData

import pandas as pd
import yorkpy.analytics as yka
#convertAllYaml2PandasDataframes("..\\ipl", "..\\data")

You can download the the zip of the files and use it directly in the functions as follows.For the analysis below I chosen a set of random IPL matches

The randomly selected IPL T20 matches are

  • Chennai Super Kings vs Kings Xi Punjab, 2014-05-30
  • Deccan Chargers vs Delhi Daredevils, 2012-05-10
  • Gujarat Lions vs Mumbai Indians, 2017-04-29
  • Kolkata Knight Riders vs Rajasthan Royals, 2010-04-17
  • Rising Pune Supergiants vs Royal Challengers Bangalore, 2017-04-29

6. Team batting scorecard

The function below computes the batting score card of a team in an IPL match. The scorecard gives the balls faced, the runs scored, 4s, 6s and strike rate. The example below is based on the CSK KXIP match on 30 May 2014.

You can check against the actual scores in this match Chennai Super Kings-Kings XI Punjab-2014-05-30

import pandas as pd
import yorkpy.analytics as yka
csk_kxip=pd.read_csv(".\\Chennai Super Kings-Kings XI Punjab-2014-05-30.csv")
scorecard,extras=yka.teamBattingScorecardMatch(csk_kxip,"Chennai Super Kings")
print(scorecard)
##         batsman  runs  balls  4s  6s          SR
## 0      DR Smith     7     12   0   0   58.333333
## 1  F du Plessis     0      1   0   0    0.000000
## 2      SK Raina    87     26  12   0  334.615385
## 3   BB McCullum    11     16   0   0   68.750000
## 4     RA Jadeja    27     22   2   0  122.727273
## 5     DJ Hussey     1      3   0   0   33.333333
## 6      MS Dhoni    42     34   3   0  123.529412
## 7      R Ashwin    10     11   0   0   90.909091
## 8     MM Sharma     1      3   0   0   33.333333
print(extras)
##    total  wides  noballs  legbyes  byes  penalty  extras
## 0    428     14        3        5     5        0      27
print("\n\n")
scorecard1,extras1=yka.teamBattingScorecardMatch(csk_kxip,"Kings XI Punjab")
print(scorecard1)
##       batsman  runs  balls  4s  6s          SR
## 0    V Sehwag   122     62  12   0  196.774194
## 1     M Vohra    34     33   1   0  103.030303
## 2  GJ Maxwell    13      8   1   0  162.500000
## 3   DA Miller    38     19   5   0  200.000000
## 4   GJ Bailey     1      2   0   0   50.000000
## 5     WP Saha     6      4   0   0  150.000000
## 6  MG Johnson     1      1   0   0  100.000000
print(extras1)
##    total  wides  noballs  legbyes  byes  penalty  extras
## 0    428     14        3        5     5        0      27

Let’s take another random match between Gujarat Lions and Mumbai Indian on 29 Apr 2017 Gujarat Lions-Mumbai Indians-2017-04-29

import pandas as pd
gl_mi=pd.read_csv(".\\Gujarat Lions-Mumbai Indians-2017-04-29.csv")
import yorkpy.analytics as yka
scorecard,extras=yka.teamBattingScorecardMatch(gl_mi,"Gujarat Lions")
print(scorecard)
##          batsman  runs  balls  4s  6s          SR
## 0   Ishan Kishan    48     38   6   0  126.315789
## 1    BB McCullum     6      4   1   0  150.000000
## 2       SK Raina     1      3   0   0   33.333333
## 3       AJ Finch     0      3   0   0    0.000000
## 4     KD Karthik     2      9   0   0   22.222222
## 5      RA Jadeja    28     22   2   0  127.272727
## 6    JP Faulkner    21     29   2   0   72.413793
## 7      IK Pathan     2      3   0   0   66.666667
## 8         AJ Tye    25     12   2   0  208.333333
## 9   Basil Thampi     2      4   0   0   50.000000
## 10    Ankit Soni     7      2   0   0  350.000000
print(extras)
##    total  wides  noballs  legbyes  byes  penalty  extras
## 0    306      8        3        1     0        0      12
print("\n\n")
scorecard1,extras1=yka.teamBattingScorecardMatch(gl_mi,"Mumbai Indians")
print(scorecard1)
##             batsman  runs  balls  4s  6s          SR
## 0          PA Patel    70     45   9   0  155.555556
## 1        JC Buttler     9      7   2   0  128.571429
## 2            N Rana    19     16   1   0  118.750000
## 3         RG Sharma     5     13   0   0   38.461538
## 4        KA Pollard    15     11   2   0  136.363636
## 5         KH Pandya    29     20   2   0  145.000000
## 6         HH Pandya     4      5   0   0   80.000000
## 7   Harbhajan Singh     0      1   0   0    0.000000
## 8    MJ McClenaghan     1      1   0   0  100.000000
## 9         JJ Bumrah     0      1   0   0    0.000000
## 10       SL Malinga     0      1   0   0    0.000000
print(extras1)
##    total  wides  noballs  legbyes  byes  penalty  extras
## 0    306      8        3        1     0        0      12

7. Plot the team batting partnerships

The functions below plot the team batting partnership in the match. It shows what the partnership were in the mtach

Note: Many of the plots include an additional parameters plot which is either True or False. The default value is plot=True. When plot=True the plot will be displayed. When plot=False the data frame will be returned to the user. The user can use this to create an interactive chart using one of the packages like rcharts, ggvis,googleVis or plotly.

import pandas as pd
import yorkpy.analytics as yka
dc_dd=pd.read_csv(".\\Deccan Chargers-Delhi Daredevils-2012-05-10.csv")
yka.teamBatsmenPartnershipMatch(dc_dd,'Deccan Chargers','Delhi Daredevils')

yka.teamBatsmenPartnershipMatch(dc_dd,'Delhi Daredevils','Deccan Chargers',plot=True)
# Print partnerships as a dataframe

rps_rcb=pd.read_csv(".\\Rising Pune Supergiant-Royal Challengers Bangalore-2017-04-29.csv")
m=yka.teamBatsmenPartnershipMatch(rps_rcb,'Royal Challengers Bangalore','Rising Pune Supergiant',plot=False)
print(m)
##            batsman     non_striker  runs
## 0   AB de Villiers         V Kohli     3
## 1         AF Milne         V Kohli     5
## 2        KM Jadhav         V Kohli     7
## 3           P Negi         V Kohli     3
## 4        S Aravind         V Kohli     0
## 5        S Aravind       YS Chahal     8
## 6         S Badree         V Kohli     2
## 7        STR Binny         V Kohli     1
## 8      Sachin Baby         V Kohli     2
## 9          TM Head         V Kohli     2
## 10         V Kohli  AB de Villiers    17
## 11         V Kohli        AF Milne     5
## 12         V Kohli       KM Jadhav     4
## 13         V Kohli          P Negi     9
## 14         V Kohli       S Aravind     2
## 15         V Kohli        S Badree     8
## 16         V Kohli     Sachin Baby     1
## 17         V Kohli         TM Head     9
## 18       YS Chahal       S Aravind     4

8. Batsmen vs Bowler

The function below computes and plots the performances of the batsmen vs the bowlers. As before the plot parameter can be set to True or False. By default it is plot=True

import pandas as pd
import yorkpy.analytics as yka
gl_mi=pd.read_csv(".\\Gujarat Lions-Mumbai Indians-2017-04-29.csv")
yka.teamBatsmenVsBowlersMatch(gl_mi,"Gujarat Lions","Mumbai Indians", plot=True)
# Print 

csk_kxip=pd.read_csv(".\\Chennai Super Kings-Kings XI Punjab-2014-05-30.csv")
m=yka.teamBatsmenVsBowlersMatch(csk_kxip,'Chennai Super Kings','Kings XI Punjab',plot=False)
print(m)
##          batsman           bowler  runs
## 0    BB McCullum         AR Patel     4
## 1    BB McCullum       GJ Maxwell     1
## 2    BB McCullum  Karanveer Singh     6
## 3      DJ Hussey          P Awana     1
## 4       DR Smith       MG Johnson     7
## 5       DR Smith          P Awana     0
## 6       DR Smith   Sandeep Sharma     0
## 7   F du Plessis       MG Johnson     0
## 8      MM Sharma         AR Patel     0
## 9      MM Sharma       MG Johnson     0
## 10     MM Sharma          P Awana     1
## 11      MS Dhoni         AR Patel    12
## 12      MS Dhoni  Karanveer Singh     2
## 13      MS Dhoni       MG Johnson    11
## 14      MS Dhoni          P Awana    15
## 15      MS Dhoni   Sandeep Sharma     2
## 16      R Ashwin         AR Patel     1
## 17      R Ashwin  Karanveer Singh     4
## 18      R Ashwin       MG Johnson     1
## 19      R Ashwin          P Awana     1
## 20      R Ashwin   Sandeep Sharma     3
## 21     RA Jadeja         AR Patel     5
## 22     RA Jadeja       GJ Maxwell     3
## 23     RA Jadeja  Karanveer Singh    19
## 24     RA Jadeja          P Awana     0
## 25      SK Raina       MG Johnson    21
## 26      SK Raina          P Awana    40
## 27      SK Raina   Sandeep Sharma    26

9. Bowling Scorecard

This function provides the bowling performance, the number of overs bowled, maidens, runs conceded. wickets taken and economy rate for the IPL match

import pandas as pd
import yorkpy.analytics as yka
dc_dd=pd.read_csv(".\\Deccan Chargers-Delhi Daredevils-2012-05-10.csv")
a=yka.teamBowlingScorecardMatch(dc_dd,'Deccan Chargers')
print(a)
##        bowler  overs  runs  maidens  wicket  econrate
## 0  AD Russell      4    39        0       0      9.75
## 1   IK Pathan      4    46        0       1     11.50
## 2    M Morkel      4    32        0       1      8.00
## 3    S Nadeem      4    39        0       0      9.75
## 4    VR Aaron      4    30        0       2      7.50
rps_rcb=pd.read_csv(".\\Rising Pune Supergiant-Royal Challengers Bangalore-2017-04-29.csv")
b=yka.teamBowlingScorecardMatch(rps_rcb,'Royal Challengers Bangalore')
print(b)
##               bowler  overs  runs  maidens  wicket  econrate
## 0          DL Chahar      2    18        0       0      9.00
## 1       DT Christian      4    25        0       1      6.25
## 2        Imran Tahir      4    18        0       3      4.50
## 3         JD Unadkat      4    19        0       1      4.75
## 4        LH Ferguson      4     7        1       3      1.75
## 5  Washington Sundar      2     7        0       1      3.50

10. Wicket Kind

The plots below provide the kind of wicket taken by the bowler (caught, bowled, lbw etc.) for the IPL match

import pandas as pd
import yorkpy.analytics as yka
kkr_rr=pd.read_csv(".\\Kolkata Knight Riders-Rajasthan Royals-2010-04-17.csv")
yka.teamBowlingWicketKindMatch(kkr_rr,'Kolkata Knight Riders','Rajasthan Royals')

csk_kxip=pd.read_csv(".\\Chennai Super Kings-Kings XI Punjab-2014-05-30.csv")
m = yka.teamBowlingWicketKindMatch(csk_kxip,'Chennai Super Kings','Kings-Kings XI Punjab',plot=False)
print(m)
##             bowler     kind  player_out
## 0         AR Patel  run out           1
## 1         AR Patel  stumped           1
## 2  Karanveer Singh  run out           1
## 3       MG Johnson   caught           1
## 4          P Awana   caught           2
## 5   Sandeep Sharma   bowled           1

11. Wicket vs Runs conceded

The plots below provide the wickets taken and the runs conceded by the bowler in the IPL T20 match

import pandas as pd
import yorkpy.analytics as yka
dc_dd=pd.read_csv(".\\Deccan Chargers-Delhi Daredevils-2012-05-10.csv")
yka.teamBowlingWicketMatch(dc_dd,"Deccan Chargers", "Delhi Daredevils",plot=True)

print("\n\n")
rps_rcb=pd.read_csv(".\\Rising Pune Supergiant-Royal Challengers Bangalore-2017-04-29.csv")
a=yka.teamBowlingWicketMatch(rps_rcb,"Royal Challengers Bangalore", "Rising Pune Supergiant",plot=False)
print(a)
##               bowler      player_out  kind
## 0       DT Christian         V Kohli     1
## 1        Imran Tahir        AF Milne     1
## 2        Imran Tahir          P Negi     1
## 3        Imran Tahir        S Badree     1
## 4         JD Unadkat         TM Head     1
## 5        LH Ferguson  AB de Villiers     1
## 6        LH Ferguson       KM Jadhav     1
## 7        LH Ferguson       STR Binny     1
## 8  Washington Sundar     Sachin Baby     1

12. Bowler Vs Batsmen

The functions compute and display how the different bowlers of the IPL team performed against the batting opposition.

import pandas as pd
import yorkpy.analytics as yka
csk_kxip=pd.read_csv(".\\Chennai Super Kings-Kings XI Punjab-2014-05-30.csv")
yka.teamBowlersVsBatsmenMatch(csk_kxip,"Chennai Super Kings","Kings XI Punjab")

print("\n\n")
kkr_rr=pd.read_csv(".\\Kolkata Knight Riders-Rajasthan Royals-2010-04-17.csv")
m =yka.teamBowlersVsBatsmenMatch(kkr_rr,"Rajasthan Royals","Kolkata Knight Riders",plot=False)
print(m)
##        batsman      bowler  runs
## 0     AC Voges    AB Dinda     1
## 1     AC Voges  JD Unadkat     1
## 2     AC Voges   LR Shukla     1
## 3     AC Voges    M Kartik     5
## 4     AJ Finch    AB Dinda     3
## 5     AJ Finch  JD Unadkat     3
## 6     AJ Finch   LR Shukla    13
## 7     AJ Finch    M Kartik     2
## 8     AJ Finch     SE Bond     0
## 9      AS Raut    AB Dinda     1
## 10     AS Raut  JD Unadkat     1
## 11    FY Fazal    AB Dinda     1
## 12    FY Fazal   LR Shukla     3
## 13    FY Fazal    M Kartik     3
## 14    FY Fazal     SE Bond     6
## 15     NV Ojha    AB Dinda    10
## 16     NV Ojha  JD Unadkat     5
## 17     NV Ojha   LR Shukla     0
## 18     NV Ojha    M Kartik     1
## 19     NV Ojha     SE Bond     2
## 20     P Dogra  JD Unadkat     2
## 21     P Dogra   LR Shukla     5
## 22     P Dogra    M Kartik     1
## 23     P Dogra     SE Bond     0
## 24  SK Trivedi    AB Dinda     4
## 25    SK Warne    AB Dinda     2
## 26    SK Warne    M Kartik     1
## 27    SK Warne     SE Bond     0
## 28   SR Watson    AB Dinda     2
## 29   SR Watson  JD Unadkat    13
## 30   SR Watson   LR Shukla     1
## 31   SR Watson    M Kartik    18
## 32   SR Watson     SE Bond    10
## 33   YK Pathan  JD Unadkat     1
## 34   YK Pathan   LR Shukla     7

13. Match worm chart

The plots below provide the match worm graph for the IPL Twenty 20 matches

import pandas as pd
import yorkpy.analytics as yka
dc_dd=pd.read_csv(".\\Deccan Chargers-Delhi Daredevils-2012-05-10.csv")
yka.matchWormChart(dc_dd,"Deccan Chargers", "Delhi Daredevils")

gl_mi=pd.read_csv(".\\Gujarat Lions-Mumbai Indians-2017-04-29.csv")
yka.matchWormChart(gl_mi,"Mumbai Indians","Gujarat Lions")

Feel free to clone/download the code from Github yorkpy

Conclusion

This post included all functions between 2 IPL teams from the package yorkpy for IPL Twenty20 matches. As mentioned above the yaml match files have been already converted to dataframes and are available for download from Github at yorkpyData

After having used Python and R for analytics, Machine Learning and Deep Learning, I have now realized that neither language is superior or inferior. Both have, some good packages and some that are not so well suited.

To be continued. Watch this space!

You may also like
1.My book ‘Deep Learning from first principles:Second Edition’ now on Amazon
2.My book ‘Practical Machine Learning in R and Python: Second edition’ on Amazon
2. Cricpy takes a swing at the ODIs
3. Introducing cricket package yorkr: Part 1- Beaten by sheer pace!
4. Big Data-1: Move into the big league:Graduate from Python to Pyspark
5. Simulating an Edge Shape in Android

To see all posts click Index of posts

Using Linear Programming (LP) for optimizing bowling change or batting lineup in T20 cricket


In my recent post, My travels through the realms of Data Science, Machine Learning, Deep Learning and (AI), I had recounted my journey in the domains of of Data Science, Machine Learning (ML), and more recently Deep Learning (DL) all of which are useful while analyzing data. Of late, I have come to the realization that there are many facets to data. And to glean insights from data, Data Science, ML and DL alone are not sufficient and one needs to also have a good handle on linear programming and optimization. My colleague at IBM Research also concurred with this view and told me he had arrived at this conclusion several years ago.

If you are passionate about cricket, and love analyzing cricket performances, then check out my 2 racy books on cricket! In my books, I perform detailed yet compact analysis of performances of both batsmen, bowlers besides evaluating team & match performances in Tests , ODIs, T20s & IPL. You can buy my books on cricket from Amazon at $12.99 for the paperback and $4.99/$6.99 respectively for the kindle versions. The books can be accessed at Cricket analytics with cricketr  and Beaten by sheer pace-Cricket analytics with yorkr  A must read for any cricket lover! Check it out!!

1

While ML & DL are very useful and interesting to make inferences and predictions of outputs from input variables, optimization computes the choice of input which results in maximizing or minimizing the output. So I made a small course correction and started on a course from India’s own NPTEL Introduction to Linear Programming by Prof G. Srinivasan of IIT Madras (highly recommended!). The lectures are delivered with remarkable clarity by the Prof and I am just about halfway through the course (each lecture is of 50-55 min duration), when I decided that I needed to try to formulate and solve some real world Linear Programming problem.

As usual, I turned towards cricket for some appropriate situations, and sure enough it was there in the open. For this LP formulation I take International T20 and IPL, though International ODI will also work equally well.  You can download the associated code and data for this from Github at LP-cricket-analysis

In T20 matches the captain has to make choice of how to rotate bowlers with the aim of restricting the batting side. Conversely, the batsmen need to take advantage of the bowling strength to maximize the runs scored.

Note:
a) A simple and obvious strategy would be
– If the ith bowler’s economy rate is less than the economy rate of the jth bowler i.e.
er_{i} < er_{j} then have bowler ‘i’ to bowl more overs as his/her economy rate is better

b)A better strategy would be to consider the economy rate of each bowler against each batsman. How often  have we witnessed bowlers with a great bowling average get thrashed time and again by the same batsman, or a bowler who is generally very poor being very effective against a particular batsman. i.e. er_{ij} < er_{ik} where the jth bowler is more effective than the kth bowler against the ith batsman. This now becomes a linear optimization problem as we can have several combinations of number of overs X economy rate for different bowlers and we will have to solve this algorithmically to determine the lowest score for bowling performance or highest score for batting order.

This post uses the latter approach to optimize bowling change and batting lineup.

Let is take a hypothetical situation
Assume there are 3 bowlers – bwlr_{1},bwlr_{2},bwlr_{3}
and there are 4 batsmen – bman_{1},bman_{2},bman_{3},bman_{4}

Let the economy rate er_{ij} be the Economy Rate of the jth bowler to the ith batsman. Also if remaining overs for the bowlers are o_{1},o_{2},o_{3}
and the total number of overs left to be bowled are
o_{1}+o_{2}+o_{3} = N then the question is

a) Given the economy rate of each bowler per batsman, how many overs should each bowler bowl, so that the total runs scored by all the batsmen are minimum?

b) Alternatively, if the know the individual strike rate of a batsman against the individual bowlers, how many overs should each batsman face with a bowler so that the total runs scored is maximized?

1. LP Formulation for bowling order

Let the economy rate er_{ij} be the Economy Rate of the jth bowler to the ith batsman.
Objective function : Minimize –
er_{11}*o_{11} + er_{12}*o_{12} +..+er_{1n}*o_{1n}+ er_{21}*o_{21} + er_{22}*o_{22}+.. + er_{22}*o_{2n}+ er_{m1}*o_{m1}+..+ er_{mn}*o_{mn}
i.e.
\sum_{i=1}^{i=m}\sum_{j=1}^{i=n}er_{ij}*o_{ij}
Constraints
Where o_{j} is the number of overs remaining for the jth bowler against  ‘k’ batsmen
o_{j1} + o_{j2} + .. o_{jk} < o_{j}
and if the total number of overs remaining to be bowled is N then
o_{1} + o_{2} +...+ o_{k} = N or
\sum_{j=1}^{j=k} o_{j} =N
The overs that any bowler can bowl is o_{j} >=0

2. LP Formulation for batting lineup

Let the strike rate sr_{ij}  be the Strike Rate of the ith batsman to the jth bowler
Objective function : Maximize –
sr_{11}*o_{11} + sr_{12}*o_{12} +..+ sr_{1n}*o_{1n}+ sr_{21}*o_{21} + sr_{22}*o_{22}+.. sr_{2n}*o_{2n}+ sr_{m1}*o_{m1}+..+ sr_{mn}*o_{mn}
i.e.
\sum_{i=1}^{i=4}\sum_{j=1}^{i=3}sr_{ij}*o_{ij}
Constraints
Where o_{j} is the number of overs remaining for the jth bowler against  ‘k’ batsmen
o_{j1} + o_{j2} + .. o_{jk} < o_{j}
and the total number of overs remaining to be bowled is N then
o_{1} + o_{2} +...+ o_{k} = N or
\sum_{j=1}^{j=k} o_{j} =N
The overs that any bowler can bowl is
o_{j} >=0

lpSolveAPI– For this maximization and minimization problem I used lpSolveAPI.

Below I take 2 simple examples (example1 & 2)  to ensure that my LP formulation and solution is correct before applying it on real T20 cricket data (Intl. T20 and IPL)

3. LP formulation (Example 1)

Initially I created a test example to ensure that I get the LP formulation and solution correct. Here the er1=4 and er2=3 and o1 & o2 are the overs bowled by bowlers 1 & 2. Also o1+o2=4 In this example as below

o1 o2 Obj Fun(=4o1+3o2)
1    3      13
2    2      14
3    1      15

library(lpSolveAPI)
library(dplyr)
library(knitr)
lprec <- make.lp(0, 2)
a <-lp.control(lprec, sense="min")
set.objfn(lprec, c(4, 3))  # Economy Rate of 4 and 3 for er1 and er2
add.constraint(lprec, c(1, 1), "=",4)  # o1 + o2 =4
add.constraint(lprec, c(1, 0), ">",1)  # o1 > 1
add.constraint(lprec, c(0, 1), ">",1)  # o2 > 1
lprec
## Model name: 
##             C1    C2       
## Minimize     4     3       
## R1           1     1   =  4
## R2           1     0  >=  1
## R3           0     1  >=  1
## Kind       Std   Std       
## Type      Real  Real       
## Upper      Inf   Inf       
## Lower        0     0
b <-solve(lprec)
get.objective(lprec) # 13
## [1] 13
get.variables(lprec) # 1    3 
## [1] 1 3

Note 1: In the above example 13 runs is the minimum that can be scored and this requires

LP solution:
Minimum runs=13

  • o1=1
  • o2=3

Note 2:The numbers in the columns represent the number of overs that need to be bowled by a bowler to the corresponding batsman.

4. LP formulation (Example 2)

In this formulation there are 2 bowlers and 2 batsmen o11,o12 are the oves bowled by bowler 1 to batsmen 1 & 2 and o21, o22 are the overs bowled by bowler 2 to batsmen 1 & 2 er11=4, er12=2,er21=2,er22=5 o11+o12+o21+o22=5

The solution for this manually computed is o11, o12, o21, o22 Runs
where B11, B12 are the overs bowler 1 bowls to batsman 1 and B21 and B22 are overs bowler 2 bowls to batsman 2

o11     o12    o21    o22      Runs=(4*o11+2*o12+2*o21+5*o22)
1            1             1            2           18
1           2              1             1           15
2           1              1            1            17
1           1               2            1            15

lprec <- make.lp(0, 4)
a <-lp.control(lprec, sense="min")
set.objfn(lprec, c(4, 2,2,5))
add.constraint(lprec, c(1, 1,0,0), "<=",8)
add.constraint(lprec, c(0, 0,1,1), "<=",7)
add.constraint(lprec, c(1, 1,1,1), "=",5)
add.constraint(lprec, c(1, 0,0,0), ">",1)
add.constraint(lprec, c(0, 1,0,0), ">",1)
add.constraint(lprec, c(0, 0,1,0), ">",1)
add.constraint(lprec, c(0, 0,0,1), ">",1)
lprec
## Model name: 
##             C1    C2    C3    C4       
## Minimize     4     2     2     5       
## R1           1     1     0     0  <=  8
## R2           0     0     1     1  <=  7
## R3           1     1     1     1   =  5
## R4           1     0     0     0  >=  1
## R5           0     1     0     0  >=  1
## R6           0     0     1     0  >=  1
## R7           0     0     0     1  >=  1
## Kind       Std   Std   Std   Std       
## Type      Real  Real  Real  Real       
## Upper      Inf   Inf   Inf   Inf       
## Lower        0     0     0     0
b<-solve(lprec)
get.objective(lprec) 
## [1] 15
get.variables(lprec) 
## [1] 1 2 1 1

Note: In the above example 15 runs is the minimum that can be scored and this requires

LP Solution:
Minimum runs=15

  • o11=1
  • o12=2
  • o21=1
  • o22=1

It is possible to keep the minimum to other values and solves also.

5. LP formulation for International T20 India vs Australia (Batting lineup)

To analyze batting and bowling lineups in the cricket world I needed to get the ball-by-ball details of runs scored by each batsman against each of the bowlers. Fortunately I had already created this with my R package yorkr. yorkr processes yaml data from Cricsheet. So I copied the data of all matches between Australia and India in International T20s. You can download my processed data for International T20 at Inswinger

load("Australia-India-allMatches.RData")
dim(matches)
## [1] 3541   25

The following functions compute the ‘Strike Rate’ of a batsman as

SR=1/oversRunsScored

Also the Economy Rate is computed as

ER=1/oversRunsConceded

Incidentally the SR=ER

# Compute the Strike Rate of the batsman
computeSR <- function(batsman1,bowler1){
    a <- matches %>% filter(batsman==batsman1 & bowler==bowler1) 
    a1 <- a %>% summarize(totalRuns=sum(runs),count=n()) %>% mutate(SR=(totalRuns/count)*6)
    a1
}

# Compute the Economy Rate of the batsman
computeER <- function(batsman1,bowler1){
    a <- matches %>% filter(batsman==batsman1 & bowler==bowler1) 
    a1 <- a %>% summarize(totalRuns=sum(runs),count=n()) %>% mutate(ER=(totalRuns/count)*6)
    a1
}

Here I compute the Strike Rate of Virat Kohli, Yuvraj Singh and MS Dhoni against Shane Watson, Brett Lee and MA Starc

 # Kohli
kohliWatson<- computeSR("V Kohli","SR Watson")
kohliWatson
##   totalRuns count       SR
## 1        45    37 7.297297
kohliLee <- computeSR("V Kohli","B Lee")
kohliLee
##   totalRuns count       SR
## 1        10     7 8.571429
kohliStarc <- computeSR("V Kohli","MA Starc")
kohliStarc
##   totalRuns count       SR
## 1        11     9 7.333333
# Yuvraj
yuvrajWatson<- computeSR("Yuvraj Singh","SR Watson")
yuvrajWatson
##   totalRuns count       SR
## 1        24    22 6.545455
yuvrajLee <- computeSR("Yuvraj Singh","B Lee")
yuvrajLee
##   totalRuns count       SR
## 1        12     7 10.28571
yuvrajStarc <- computeSR("Yuvraj Singh","MA Starc")
yuvrajStarc
##   totalRuns count SR
## 1        12     8  9
# MS Dhoni
dhoniWatson<- computeSR("MS Dhoni","SR Watson")
dhoniWatson
##   totalRuns count       SR
## 1        33    28 7.071429
dhoniLee <- computeSR("MS Dhoni","B Lee")
dhoniLee
##   totalRuns count  SR
## 1        26    20 7.8
dhoniStarc <- computeSR("MS Dhoni","MA Starc")
dhoniStarc
##   totalRuns count   SR
## 1        11     8 8.25

When we consider the batting lineup, the problem is one of maximization. In the LP formulation below V Kohli has a SR of 7.29, 8.57, 7.33 against Watson, Lee & Starc
Yuvraj has a SR of 6.5, 10.28, 9 against Watson, Lee & Starc
and Dhoni has a SR of 7.07, 7.8,  8.25 against Watson, Lee and Starc

The constraints are Watson, Lee and Starc have 3, 4 & 3 overs remaining respectively. The total number of overs remaining to be bowled is 9.The other constraints could be that a bowler bowls at least 1 over etc.

Formulating and solving

# 3 batsman x 3 bowlers
lprec <- make.lp(0, 9)
# Maximization
a<-lp.control(lprec, sense="max")

# Set the objective function
set.objfn(lprec, c(kohliWatson$SR, kohliLee$SR,kohliStarc$SR,
                   yuvrajWatson$SR,yuvrajLee$SR,yuvrajStarc$SR,
                   dhoniWatson$SR,dhoniLee$SR,dhoniStarc$SR))

#Assume the  bowlers have 3,4,3 overs left respectively
add.constraint(lprec, c(1, 1,1,0,0,0, 0,0,0), "<=",3)
add.constraint(lprec, c(0,0,0,1,1,1,0,0,0), "<=",4)
add.constraint(lprec, c(0,0,0,0,0,0,1,1,1), "<=",3)
#o11+o12+o13+o21+o22+o23+o31+o32+o33=8 (overs remaining)
add.constraint(lprec, c(1,1,1,1,1,1,1,1,1), "=",9) 


add.constraint(lprec, c(1,0,0,0,0,0,0,0,0), ">=",1) #o11 >=1
add.constraint(lprec, c(0,1,0,0,0,0,0,0,0), ">=",0) #o12 >=0
add.constraint(lprec, c(0,0,1,0,0,0,0,0,0), ">=",0) #o13 >=0
add.constraint(lprec, c(0,0,0,1,0,0,0,0,0), ">=",1) #o21 >=1
add.constraint(lprec, c(0,0,0,0,1,0,0,0,0), ">=",1) #o22 >=1
add.constraint(lprec, c(0,0,0,0,0,1,0,0,0), ">=",0) #o23 >=0
add.constraint(lprec, c(0,0,0,0,0,0,1,0,0), ">=",1) #o31 >=1
add.constraint(lprec, c(0,0,0,0,0,0,0,1,0), ">=",0) #o32 >=0
add.constraint(lprec, c(0,0,0,0,0,0,0,0,1), ">=",0) #o33 >=0

lprec
## Model name: 
##   a linear program with 9 decision variables and 13 constraints
b <-solve(lprec)
get.objective(lprec) #  
## [1] 77.16418
get.variables(lprec) # 
## [1] 1 2 0 1 3 0 1 0 1

This shows that the maximum runs that can be scored for the current strike rate is 77.16   runs in 9 overs The breakup is as follows

This is also shown below

get.variables(lprec) # 
## [1] 1 2 0 1 3 0 1 0 1

This is also shown below

e <- as.data.frame(rbind(c(1,2,0,3),c(1,3,0,4),c(1,0,1,2)))
names(e) <- c("S Watson","B Lee","MA Starc","Overs")
rownames(e) <- c("Kohli","Yuvraj","Dhoni")
e

LP Solution:
Maximum runs that can be scored by India against Australia is:77.164 if the 9 overs to be faced by the batsman are as below

##        S Watson B Lee MA Starc Overs
## Kohli         1     2        0     3
## Yuvraj        1     3        0     4
## Dhoni         1     0        1     2
#Total overs=9

Note: This assumes that the batsmen perform at their current Strike Rate. Howvever anything can happen in a real game, but nevertheless this is a fairly reasonable estimate of the performance

Note 2:The numbers in the columns represent the number of overs that need to be bowled by a bowler to the corresponding batsman.

Note 3:You could try other combinations of overs for the above SR. For the above constraints 77.16 is the highest score for the given number of overs

6. LP formulation for International T20 India vs Australia (Bowling lineup)

For this I compute how the bowling should be rotated between R Ashwin, RA Jadeja and JJ Bumrah when taking into account their performance against batsmen like Shane Watson, AJ Finch and David Warner. For the bowling performance I take the Economy rate of the bowlers. The data is the same as above

computeSR <- function(batsman1,bowler1){
    a <- matches %>% filter(batsman==batsman1 & bowler==bowler1) 
    a1 <- a %>% summarize(totalRuns=sum(runs),count=n()) %>% mutate(SR=(totalRuns/count)*6)
    a1
}
# RA Jadeja
jadejaWatson<- computeER("SR Watson","RA Jadeja")
jadejaWatson
##   totalRuns count       ER
## 1        60    29 12.41379
jadejaFinch <- computeER("AJ Finch","RA Jadeja")
jadejaFinch
##   totalRuns count       ER
## 1        36    33 6.545455
jadejaWarner <- computeER("DA Warner","RA Jadeja")
jadejaWarner
##   totalRuns count       ER
## 1        23    11 12.54545
# Ashwin
ashwinWatson<- computeER("SR Watson","R Ashwin")
ashwinWatson
##   totalRuns count       ER
## 1        41    26 9.461538
ashwinFinch <- computeER("AJ Finch","R Ashwin")
ashwinFinch
##   totalRuns count   ER
## 1        63    36 10.5
ashwinWarner <- computeER("DA Warner","R Ashwin")
ashwinWarner
##   totalRuns count       ER
## 1        38    28 8.142857
# JJ Bunrah
bumrahWatson<- computeER("SR Watson","JJ Bumrah")
bumrahWatson
##   totalRuns count  ER
## 1        22    20 6.6
bumrahFinch <- computeER("AJ Finch","JJ Bumrah")
bumrahFinch
##   totalRuns count       ER
## 1        25    19 7.894737
bumrahWarner <- computeER("DA Warner","JJ Bumrah")
bumrahWarner
##   totalRuns count ER
## 1         2     4  3

As can be seen from above RA Jadeja has a ER of 12.4, 6.54, 12.54 against Watson, AJ Finch and Warner also Ashwin has a ER of 9.46, 10.5, 8.14 against Watson, Finch and Warner. Similarly Bumrah has an ER of 6.6,7.89, 3 against Watson, Finch and Warner
The constraints are Jadeja, Ashwin and Bumrah have 4, 3 & 4 overs remaining and the total overs remaining to be bowled is 10.

Formulating solving the bowling lineup is shown below

lprec <- make.lp(0, 9)
a <-lp.control(lprec, sense="min")

# Set the objective function
set.objfn(lprec, c(jadejaWatson$ER, jadejaFinch$ER,jadejaWarner$ER,
                   ashwinWatson$ER,ashwinFinch$ER,ashwinWarner$ER,
                   bumrahWatson$ER,bumrahFinch$ER,bumrahWarner$ER))

add.constraint(lprec, c(1, 1,1,0,0,0, 0,0,0), "<=",4) # Jadeja has 4 overs
add.constraint(lprec, c(0,0,0,1,1,1,0,0,0), "<=",3)   # Ashwin has 3 overs left
add.constraint(lprec, c(0,0,0,0,0,0,1,1,1), "<=",4)   # Bumrah has 4 overs left
add.constraint(lprec, c(1,1,1,1,1,1,1,1,1), "=",10) # Total overs = 10
add.constraint(lprec, c(1,0,0,0,0,0,0,0,0), ">=",1)
add.constraint(lprec, c(0,1,0,0,0,0,0,0,0), ">=",0)
add.constraint(lprec, c(0,0,1,0,0,0,0,0,0), ">=",1)
add.constraint(lprec, c(0,0,0,1,0,0,0,0,0), ">=",0)
add.constraint(lprec, c(0,0,0,0,1,0,0,0,0), ">=",1)
add.constraint(lprec, c(0,0,0,0,0,1,0,0,0), ">=",0)
add.constraint(lprec, c(0,0,0,0,0,0,1,0,0), ">=",0)
add.constraint(lprec, c(0,0,0,0,0,0,0,1,0), ">=",1)
add.constraint(lprec, c(0,0,0,0,0,0,0,0,1), ">=",0)

lprec
## Model name: 
##   a linear program with 9 decision variables and 13 constraints
b <-solve(lprec)
get.objective(lprec) #  
## [1] 73.58775
get.variables(lprec) # 
## [1] 1 2 1 0 1 1 0 1 3

The minimum runs that will be conceded by these 3 bowlers in 10 overs is 73.58 assuming the bowling is rotated as follows

e <- as.data.frame(rbind(c(1,0,0),c(2,1,1),c(1,1,3),c(4,2,4)))
names(e) <- c("RA Jadeja","R Ashwin","JJ Bumrah")
rownames(e) <- c("S Watson","AJ Finch","DA Warner","Overs")
e 

LP Solution:
Minimum runs that will be conceded by India against Australia is 73.58 in 10 overs if the overs bowled are as follows

##           RA Jadeja R Ashwin JJ Bumrah
## S Watson          1        0         0
## AJ Finch          2        1         1
## DA Warner         1        1         3
## Overs             4        2         4
#Total overs=10  

7. LP formulation for IPL (Mumbai Indians – Kolkata Knight Riders – Bowling lineup)

As in the case of International T20s I also have processed IPL data derived from my R package yorkr. yorkr. yorkr processes yaml data from Cricsheet. The processed data for all IPL matches can be downloaded from GooglyPlus

load("Mumbai Indians-Kolkata Knight Riders-allMatches.RData")
dim(matches)
## [1] 4237   25
# Compute the Economy Rate of the Mumbai Indian bowlers against Kolkata Knight Riders

# Gambhir
gambhirMalinga <- computeER("G Gambhir","SL Malinga")
gambhirHarbhajan <- computeER("G Gambhir","Harbhajan Singh")
gambhirPollard <- computeER("G Gambhir","KA Pollard")

#Yusuf Pathan
yusufMalinga <- computeER("YK Pathan","SL Malinga")
yusufHarbhajan <- computeER("YK Pathan","Harbhajan Singh")
yusufPollard <- computeER("YK Pathan","KA Pollard")

#JH Kallis
kallisMalinga <- computeER("JH Kallis","SL Malinga")
kallisHarbhajan <- computeER("JH Kallis","Harbhajan Singh")
kallisPollard <- computeER("JH Kallis","KA Pollard")

#RV Uthappa
uthappaMalinga <- computeER("RV Uthappa","SL Malinga")
uthappaHarbhajan <- computeER("RV Uthappa","Harbhajan Singh")
uthappaPollard <- computeER("RV Uthappa","KA Pollard")

Here

gambhirMalinga, yusufMalinga, kallisMalinga, uthappaMalinga is the ER of Malinga against Gambhir, Yusuf Pathan, Kallis and Uthappa
gambhirHarbhajan, yusufHarbhajan, kallisHarbhajan, uthappaHarbhajan is the ER of Harbhajan against Gambhir, Yusuf Pathan, Kallis and Uthappa
gambhirPollard, yusufPollard, kallisPollard, uthappaPollard is the ER of Kieron Pollard against Gambhir, Yusuf Pathan, Kallis and Uthappa

The constraints are Malinga, Harbhajan and Pollard have 4 overs each and remaining overs to be bowled is 10.

Formulating and solving this for the bowling lineup of Mumbai Indians against Kolkata Knight Riders

 library("lpSolveAPI")
 lprec <- make.lp(0, 12)
 a=lp.control(lprec, sense="min")
 
 set.objfn(lprec, c(gambhirMalinga$ER, yusufMalinga$ER,kallisMalinga$ER,uthappaMalinga$ER,
                    gambhirHarbhajan$ER,yusufHarbhajan$ER,kallisHarbhajan$ER,uthappaHarbhajan$ER,
                    gambhirPollard$ER,yusufPollard$ER,kallisPollard$ER,uthappaPollard$ER))
 
 add.constraint(lprec, c(1,1,1,1, 0,0,0,0, 0,0,0,0), "<=",4)
 add.constraint(lprec, c(0,0,0,0,1,1,1,1,0,0,0,0), "<=",4)
 add.constraint(lprec, c(0,0,0,0,0,0,0,0,1,1,1,1), "<=",4)
 add.constraint(lprec, c(1,1,1,1,1,1,1,1,1,1,1,1), "=",10)
 
 add.constraint(lprec, c(1,0,0,0,0,0,0,0,0,0,0,0), ">=",0)
 add.constraint(lprec, c(0,1,0,0,0,0,0,0,0,0,0,0), ">=",1)
 add.constraint(lprec, c(0,0,1,0,0,0,0,0,0,0,0,0), ">=",0)
 add.constraint(lprec, c(0,0,0,1,0,0,0,0,0,0,0,0), ">=",0)
 add.constraint(lprec, c(0,0,0,0,1,0,0,0,0,0,0,0), ">=",0)
 add.constraint(lprec, c(0,0,0,0,0,1,0,0,0,0,0,0), ">=",1)
 add.constraint(lprec, c(0,0,0,0,0,0,1,0,0,0,0,0), ">=",0)
 add.constraint(lprec, c(0,0,0,0,0,0,0,1,0,0,0,0), ">=",1)
 add.constraint(lprec, c(0,0,0,0,0,0,0,0,1,0,0,0), ">=",0)
 add.constraint(lprec, c(0,0,0,0,0,0,0,0,0,1,0,0), ">=",1)
 add.constraint(lprec, c(0,0,0,0,0,0,0,0,0,0,1,0), ">=",0)
 add.constraint(lprec, c(0,0,0,0,0,0,0,0,0,0,0,1), ">=",0)
 
 lprec
## Model name: 
##   a linear program with 12 decision variables and 16 constraints
 b=solve(lprec)
 get.objective(lprec) #  
## [1] 55.57887
 get.variables(lprec) # 
##  [1] 3 1 0 0 0 1 0 1 3 1 0 0
e <- as.data.frame(rbind(c(3,1,0,0,4),c(0, 1, 0,1,2),c(3, 1, 0,0,4)))
names(e) <- c("Gambhir","Yusuf","Kallis","Uthappa","Overs")
rownames(e) <- c("Malinga","Harbhajan","Pollard") 
e

LP Solution: Mumbai Indians can restrict Kolkata Knight Riders to 55.87 in 10 overs
if the overs are bowled as below

##           Gambhir Yusuf Kallis Uthappa Overs
## Malinga         3     1      0       0     4
## Harbhajan       0     1      0       1     2
## Pollard         3     1      0       0     4
#Total overs=10  

8. LP formulation for IPL (Mumbai Indians – Kolkata Knight Riders – Batting lineup)

As I mentioned it is possible to perform a maximation with the same formulation since computeSR<==>computeER

This just flips the problem around and computes the maximum runs that can be scored for the batsman’s Strike rate (this is same as the bowler’s Economy rate) i.e.

gambhirMalinga, yusufMalinga, kallisMalinga, uthappaMalinga is the SR of Gambhir, Yusuf Pathan, Kallis and Uthappa against Malinga
gambhirHarbhajan, yusufHarbhajan, kallisHarbhajan, uthappaHarbhajan is the SR of Gambhir, Yusuf Pathan, Kallis and Uthappa against Harbhajan
gambhirPollard, yusufPollard, kallisPollard, uthappaPollard is the SR of Gambhir, Yusuf Pathan, Kallis and Uthappa against Kieron Pollard.

The constraints are Malinga, Harbhajan and Pollard have 4 overs each and remaining overs to be bowled is 10.

 library("lpSolveAPI")
 lprec <- make.lp(0, 12)
 a=lp.control(lprec, sense="max")
 
 a <-set.objfn(lprec, c(gambhirMalinga$ER, yusufMalinga$ER,kallisMalinga$ER,uthappaMalinga$ER,
                    gambhirHarbhajan$ER,yusufHarbhajan$ER,kallisHarbhajan$ER,uthappaHarbhajan$ER,
                    gambhirPollard$ER,yusufPollard$ER,kallisPollard$ER,uthappaPollard$ER))
 
 
 add.constraint(lprec, c(1,1,1,1, 0,0,0,0, 0,0,0,0), "<=",4)
 add.constraint(lprec, c(0,0,0,0,1,1,1,1,0,0,0,0), "<=",4)
 add.constraint(lprec, c(0,0,0,0,0,0,0,0,1,1,1,1), "<=",4)
 add.constraint(lprec, c(1,1,1,1,1,1,1,1,1,1,1,1), "=",11)
 
 add.constraint(lprec, c(1,0,0,0,0,0,0,0,0,0,0,0), ">=",0)
 add.constraint(lprec, c(0,1,0,0,0,0,0,0,0,0,0,0), ">=",1)
 add.constraint(lprec, c(0,0,1,0,0,0,0,0,0,0,0,0), ">=",0)
 add.constraint(lprec, c(0,0,0,1,0,0,0,0,0,0,0,0), ">=",0)
 add.constraint(lprec, c(0,0,0,0,1,0,0,0,0,0,0,0), ">=",0)
 add.constraint(lprec, c(0,0,0,0,0,1,0,0,0,0,0,0), ">=",1)
 add.constraint(lprec, c(0,0,0,0,0,0,1,0,0,0,0,0), ">=",0)
 add.constraint(lprec, c(0,0,0,0,0,0,0,1,0,0,0,0), ">=",1)
 add.constraint(lprec, c(0,0,0,0,0,0,0,0,1,0,0,0), ">=",0)
 add.constraint(lprec, c(0,0,0,0,0,0,0,0,0,1,0,0), ">=",1)
 add.constraint(lprec, c(0,0,0,0,0,0,0,0,0,0,1,0), ">=",0)
 add.constraint(lprec, c(0,0,0,0,0,0,0,0,0,0,0,1), ">=",0)
 lprec
## Model name: 
##   a linear program with 12 decision variables and 16 constraints
 b=solve(lprec)
 get.objective(lprec) #  
## [1] 94.22649
 get.variables(lprec) # 
##  [1] 0 3 0 0 0 1 0 3 0 1 3 0
e <- as.data.frame(rbind(c(0,3,0,0,3),c(0, 1, 0,3,4),c(0, 1, 3,0,4)))
names(e) <- c("Gambhir","Yusuf","Kallis","Uthappa","Overs")
rownames(e) <- c("Malinga","Harbhajan","Pollard") 
e

LP Solution: Kolkata Knight Riders can score a maximum of 94.22 in 11 overs against Mumbai Indians
if the the number of overs KKR face is as below

##           Gambhir Yusuf Kallis Uthappa Overs
## Malinga         0     3      0       0     3
## Harbhajan       0     1      0       3     4
## Pollard         0     1      3       0     4
#Total overs=11  

Conclusion: It is possible to thus determine the optimum no of overs to give to a specific bowler based on his/her Economy Rate with a particular batsman. Similarly one can determine the maximum runs that can be scored by a batsmen based on their strike rate with bowlers. Cricket like many other games is a game of strategy, skill, talent and some amount of luck. So while the LP formulation can provide some direction,  one must be aware anything could happen in a game of cricket!

Thoughts, comments, suggestions welcome!

Also see
1. Inswinger: yorkr swings into International T20s
2. Working with Node.js and PostgreSQL
3. Simulating the domino effect in Android using Box2D and AndEngine
4. Introducing cricket package yorkr: Part 1- Beaten by sheer pace!
5. Introducing QCSimulator: A 5-qubit quantum computing simulator in R
6. A Cloud medley with IBM Bluemix, Cloudant DB and Node.js

To see all posts see Index of Posts

Analysis of International T20 matches with yorkr templates


Introduction

In this post I create yorkr templates for International T20 matches that are available on Cricsheet. With these templates you can convert all T20 data which is in yaml format to R dataframes. Further I create data and the necessary templates for analyzing. All of these templates can be accessed from Github at yorkrT20Template. The templates are

  1. Template for conversion and setup – T20Template.Rmd
  2. Any T20 match – T20Matchtemplate.Rmd
  3. T20 matches between 2 nations – T20Matches2TeamTemplate.Rmd
  4. A T20 nations performance against all other T20 nations – T20AllMatchesAllOppnTemplate.Rmd
  5. Analysis of T20 batsmen and bowlers of all T20 nations – T20BatsmanBowlerTemplate.Rmd

Besides the templates the repository also includes the converted data for all T20 matches I downloaded from Cricsheet in Dec 2016, You can recreate the files as more matches are added to Cricsheet site. This post contains all the steps needed for T20 analysis, as more matches are played around the World and more data is added to Cricsheet. This will also be my reference in future if I decide to analyze T20 in future!

If you are passionate about cricket, and love analyzing cricket performances, then check out my 2 racy books on cricket! In my books, I perform detailed yet compact analysis of performances of both batsmen, bowlers besides evaluating team & match performances in Tests , ODIs, T20s & IPL. You can buy my books on cricket from Amazon at $12.99 for the paperback and $4.99/$6.99 respectively for the kindle versions. The books can be accessed at Cricket analytics with cricketr  and Beaten by sheer pace-Cricket analytics with yorkr  A must read for any cricket lover! Check it out!!

1

 

Feel free to download/clone these templates  from Github yorkrT20Template and perform your own analysis

There will be 5 folders at the root

  1. T20data – Match files as yaml from Cricsheet
  2. T20Matches – Yaml match files converted to dataframes
  3. T20MatchesBetween2Teams – All Matches between any 2 T20 teams
  4. allMatchesAllOpposition – A T20 countries match data against all other teams
  5. BattingBowlingDetails – Batting and bowling details of all countries
library(yorkr)
library(dplyr)

The first few steps take care of the data setup. This needs to be done before any of the analysis of T20 batsmen, bowlers, any T20 match, matches between any 2 T20 countries or analysis of a teams performance against all other countries

There will be 5 folders at the root

  1. T20data
  2. T20Matches
  3. T20MatchesBetween2Teams
  4. allMatchesAllOpposition
  5. BattingBowlingDetails

The source YAML files will be in T20Data folder

1.Create directory T20Matches

Some files may give conversions errors. You could try to debug the problem or just remove it from the T20data folder. At most 2-4 file will have conversion problems and I usally remove then from the files to be converted.

Also take a look at my Inswinger shiny app which was created after performing the same conversion on the Dec 16 data .

convertAllYaml2RDataframesT20("T20Data","T20Matches")

2.Save all matches between all combinations of T20 nations

This function will create the set of all matches between every T20 country against every other T20 country. This uses the data that was created in T20Matches, with the convertAllYaml2RDataframesT20() function.

setwd("./T20MatchesBetween2Teams")
saveAllMatchesBetweenTeams("../T20Matches")

3.Save all matches against all opposition

This will create a consolidated dataframe of all matches played by every T20 playing nation against all other nattions. This also uses the data that was created in T20Matches, with the convertAllYaml2RDataframesT20() function.

setwd("../allMatchesAllOpposition")
saveAllMatchesAllOpposition("../T20Matches")

4. Create batting and bowling details for each T20 country

These are the current T20 playing nations. You can add to this vector as more countries start playing T20. You will get to know all T20 nations by also look at the directory created above namely allMatchesAllOpposition. his also uses the data that was created in T20Matches, with the convertAllYaml2RDataframesT20() function.

setwd("../BattingBowlingDetails")
teams <-c("Australia","India","Pakistan","West Indies", 'Sri Lanka',
          "England", "Bangladesh","Netherlands","Scotland", "Afghanistan",
          "Zimbabwe","Ireland","New Zealand","South Africa","Canada",
          "Bermuda","Kenya","Hong Kong","Nepal","Oman","Papua New Guinea",
          "United Arab Emirates")

for(i in seq_along(teams)){
    print(teams[i])
    val <- paste(teams[i],"-details",sep="")
    val <- getTeamBattingDetails(teams[i],dir="../T20Matches", save=TRUE)

}

for(i in seq_along(teams)){
    print(teams[i])
    val <- paste(teams[i],"-details",sep="")
    val <- getTeamBowlingDetails(teams[i],dir="../T20Matches", save=TRUE)

}

5. Get the list of batsmen for a particular country

For e.g. if you wanted to get the batsmen of Canada you would do the following. By replacing Canada for any other country you can get the batsmen of that country. These batsmen names can then be used in the batsmen analysis

country="Canada"
teamData <- paste(country,"-BattingDetails.RData",sep="")
load(teamData)
countryDF <- battingDetails
bmen <- countryDF %>% distinct(batsman) 
bmen <- as.character(bmen$batsman)
batsmen <- sort(bmen)
batsmen

6. Get the list of bowlers for a particular country

The method below can get the list of bowler names for any T20 nation. These names can then be used in the bowler analysis below

country="Netherlands"
teamData <- paste(country,"-BowlingDetails.RData",sep="")
load(teamData)
countryDF <- bowlingDetails
bwlr <- countryDF %>% distinct(bowler) 
bwlr <- as.character(bwlr$bowler)
bowler <- sort(bwlr)
bowler

Now we are all set

A)  International T20 Match Analysis

Load any match data from the ./T20Matches folder for e.g. Afganistan-England-2016-03-23.RData

setwd("./T20Matches")
load("Afghanistan-England-2016-03-23.RData")
afg_eng<- overs
#The steps are
load("Country1-Country2-Date.Rdata")
country1_country2 <- overs

All analysis for this match can be done now

2. Scorecard

teamBattingScorecardMatch(country1_country2,"Country1")
teamBattingScorecardMatch(country1_country2,"Country2")

3.Batting Partnerships

teamBatsmenPartnershipMatch(country1_country2,"Country1","Country2")
teamBatsmenPartnershipMatch(country1_country2,"Country2","Country1")

4. Batsmen vs Bowler Plot

teamBatsmenVsBowlersMatch(country1_country2,"Country1","Country2",plot=TRUE)
teamBatsmenVsBowlersMatch(country1_country2,"Country1","Country2",plot=FALSE)

5. Team bowling scorecard

teamBowlingScorecardMatch(country1_country2,"Country1")
teamBowlingScorecardMatch(country1_country2,"Country2")

6. Team bowling Wicket kind match

teamBowlingWicketKindMatch(country1_country2,"Country1","Country2")
m <-teamBowlingWicketKindMatch(country1_country2,"Country1","Country2",plot=FALSE)
m

7. Team Bowling Wicket Runs Match

teamBowlingWicketRunsMatch(country1_country2,"Country1","Country2")
m <-teamBowlingWicketRunsMatch(country1_country2,"Country1","Country2",plot=FALSE)
m

8. Team Bowling Wicket Match

m <-teamBowlingWicketMatch(country1_country2,"Country1","Country2",plot=FALSE)
m
teamBowlingWicketMatch(country1_country2,"Country1","Country2")

9. Team Bowler vs Batsmen

teamBowlersVsBatsmenMatch(country1_country2,"Country1","Country2")
m <- teamBowlersVsBatsmenMatch(country1_country2,"Country1","Country2",plot=FALSE)
m

10. Match Worm chart

matchWormGraph(country1_country2,"Country1","Country2")

B)  International T20 Matches between 2 teams

Load match data between any 2 teams from ./T20MatchesBetween2Teams for e.g.Australia-India-allMatches

setwd("./T20MatchesBetween2Teams")
load("Australia-India-allMatches.RData")
aus_ind_matches <- matches
#Replace below with your own countries
country1<-"England"
country2 <- "South Africa"
country1VsCountry2 <- paste(country1,"-",country2,"-allMatches.RData",sep="")
load(country1VsCountry2)
country1_country2_matches <- matches

2.Batsmen partnerships

m<- teamBatsmenPartnershiOppnAllMatches(country1_country2_matches,"country1",report="summary")
m
m<- teamBatsmenPartnershiOppnAllMatches(country1_country2_matches,"country2",report="summary")
m
m<- teamBatsmenPartnershiOppnAllMatches(country1_country2_matches,"country1",report="detailed")
m
teamBatsmenPartnershipOppnAllMatchesChart(country1_country2_matches,"country1","country2")

3. Team batsmen vs bowlers

teamBatsmenVsBowlersOppnAllMatches(country1_country2_matches,"country1","country2")

4. Bowling scorecard

a <-teamBattingScorecardOppnAllMatches(country1_country2_matches,main="country1",opposition="country2")
a

5. Team bowling performance

teamBowlingPerfOppnAllMatches(country1_country2_matches,main="country1",opposition="country2")

6. Team bowler wickets

teamBowlersWicketsOppnAllMatches(country1_country2_matches,main="country1",opposition="country2")
m <-teamBowlersWicketsOppnAllMatches(country1_country2_matches,main="country1",opposition="country2",plot=FALSE)
teamBowlersWicketsOppnAllMatches(country1_country2_matches,"country1","country2",top=3)
m

7. Team bowler vs batsmen

teamBowlersVsBatsmenOppnAllMatches(country1_country2_matches,"country1","country2",top=5)

8. Team bowler wicket kind

teamBowlersWicketKindOppnAllMatches(country1_country2_matches,"country1","country2",plot=TRUE)
m <- teamBowlersWicketKindOppnAllMatches(country1_country2_matches,"country1","country2",plot=FALSE)
m[1:30,]

9. Team bowler wicket runs

teamBowlersWicketRunsOppnAllMatches(country1_country2_matches,"country1","country2")

10. Plot wins and losses

setwd("./T20Matches")
plotWinLossBetweenTeams("country1","country2")

C)  International T20 Matches for a team against all other teams

Load the data between for a T20 team against all other countries ./allMatchesAllOpposition for e.g all matches of India

load("allMatchesAllOpposition-India.RData")
india_matches <- matches
country="country1"
allMatches <- paste("allMatchesAllOposition-",country,".RData",sep="")
load(allMatches)
country1AllMatches <- matches

2. Team’s batting scorecard all Matches

m <-teamBattingScorecardAllOppnAllMatches(country1AllMatches,theTeam="country1")
m

3. Batting scorecard of opposing team

m <-teamBattingScorecardAllOppnAllMatches(matches=country1AllMatches,theTeam="country2")

4. Team batting partnerships

m <- teamBatsmenPartnershipAllOppnAllMatches(country1AllMatches,theTeam="country1")
m
m <- teamBatsmenPartnershipAllOppnAllMatches(country1AllMatches,theTeam='country1',report="detailed")
head(m,30)
m <- teamBatsmenPartnershipAllOppnAllMatches(country1AllMatches,theTeam='country1',report="summary")
m

5. Team batting partnerships plot

teamBatsmenPartnershipAllOppnAllMatchesPlot(country1AllMatches,"country1",main="country1")
teamBatsmenPartnershipAllOppnAllMatchesPlot(country1AllMatches,"country1",main="country2")

6, Team batsmen vs bowlers report

m <-teamBatsmenVsBowlersAllOppnAllMatchesRept(country1AllMatches,"country1",rank=0)
m
m <-teamBatsmenVsBowlersAllOppnAllMatchesRept(country1AllMatches,"country1",rank=1,dispRows=30)
m
m <-teamBatsmenVsBowlersAllOppnAllMatchesRept(matches=country1AllMatches,theTeam="country2",rank=1,dispRows=25)
m

7. Team batsmen vs bowler plot

d <- teamBatsmenVsBowlersAllOppnAllMatchesRept(country1AllMatches,"country1",rank=1,dispRows=50)
d
teamBatsmenVsBowlersAllOppnAllMatchesPlot(d)
d <- teamBatsmenVsBowlersAllOppnAllMatchesRept(country1AllMatches,"country1",rank=2,dispRows=50)
teamBatsmenVsBowlersAllOppnAllMatchesPlot(d)

8. Team bowling scorecard

teamBowlingScorecardAllOppnAllMatchesMain(matches=country1AllMatches,theTeam="country1")
teamBowlingScorecardAllOppnAllMatches(country1AllMatches,'country2')

9. Team bowler vs batsmen

teamBowlersVsBatsmenAllOppnAllMatchesMain(country1AllMatches,theTeam="country1",rank=0)
teamBowlersVsBatsmenAllOppnAllMatchesMain(country1AllMatches,theTeam="country1",rank=2)
teamBowlersVsBatsmenAllOppnAllMatchesRept(matches=country1AllMatches,theTeam="country1",rank=0)

10. Team Bowler vs bastmen

df <- teamBowlersVsBatsmenAllOppnAllMatchesRept(country1AllMatches,theTeam="country1",rank=1)
teamBowlersVsBatsmenAllOppnAllMatchesPlot(df,"country1","country1")

11. Team bowler wicket kind

teamBowlingWicketKindAllOppnAllMatches(country1AllMatches,t1="country1",t2="All")
teamBowlingWicketKindAllOppnAllMatches(country1AllMatches,t1="country1",t2="country2")

12.

teamBowlingWicketRunsAllOppnAllMatches(country1AllMatches,t1="country1",t2="All",plot=TRUE)
teamBowlingWicketRunsAllOppnAllMatches(country1AllMatches,t1="country1",t2="country2",plot=TRUE)

D) Batsman functions

Get the batsman’s details for a batsman

setwd("../BattingBowlingDetails")
kohli <- getBatsmanDetails(team="India",name="Kohli",dir=".")
batsmanDF <- getBatsmanDetails(team="country1",name="batsmanName",dir=".")

2. Runs vs deliveries

batsmanRunsVsDeliveries(batsmanDF,"batsmanName")

3. Batsman 4s & 6s

batsman46 <- select(batsmanDF,batsman,ballsPlayed,fours,sixes,runs)
p1 <- batsmanFoursSixes(batsman46,"batsmanName")

4. Batsman dismissals

batsmanDismissals(batsmanDF,"batsmanName")

5. Runs vs Strike rate

batsmanRunsVsStrikeRate(batsmanDF,"batsmanName")

6. Batsman Moving Average

batsmanMovingAverage(batsmanDF,"batsmanName")

7. Batsman cumulative average

batsmanCumulativeAverageRuns(batsmanDF,"batsmanName")

8. Batsman cumulative strike rate

batsmanCumulativeStrikeRate(batsmanDF,"batsmanName")

9. Batsman runs against oppositions

batsmanRunsAgainstOpposition(batsmanDF,"batsmanName")

10. Batsman runs vs venue

batsmanRunsVenue(batsmanDF,"batsmanName")

11. Batsman runs predict

batsmanRunsPredict(batsmanDF,"batsmanName")

12. Bowler functions

For example to get Ravicahnder Ashwin’s bowling details

setwd("../BattingBowlingDetails")
ashwin <- getBowlerWicketDetails(team="India",name="Ashwin",dir=".")
bowlerDF <- getBatsmanDetails(team="country1",name="bowlerName",dir=".")

13. Bowler Mean Economy rate

bowlerMeanEconomyRate(bowlerDF,"bowlerName")

14. Bowler mean runs conceded

bowlerMeanRunsConceded(bowlerDF,"bowlerName")

15. Bowler Moving Average

bowlerMovingAverage(bowlerDF,"bowlerName")

16. Bowler cumulative average wickets

bowlerCumulativeAvgWickets(bowlerDF,"bowlerName")

17. Bowler cumulative Economy Rate (ER)

bowlerCumulativeAvgEconRate(bowlerDF,"bowlerName")

18. Bowler wicket plot

bowlerWicketPlot(bowlerDF,"bowlerName")

19. Bowler wicket against opposition

bowlerWicketsAgainstOpposition(bowlerDF,"bowlerName")

20. Bowler wicket at cricket grounds

bowlerWicketsVenue(bowlerDF,"bowlerName")

21. Predict number of deliveries to wickets

setwd("./T20Matches")
bowlerDF1 <- getDeliveryWickets(team="country1",dir=".",name="bowlerName",save=FALSE)
bowlerWktsPredict(bowlerDF1,"bowlerName")

Inswinger: yorkr swings into International T20s


In this post I introduce ‘Inswinger’ an interactive Shiny app to analyze International T20 players, matches and teams. This app was a natural consequence to my earlier Shiny app ‘GooglyPlus’. Most of the structure for this app remained the same, I only had to work with a different dataset, so to speak.

The Googly Shiny app is based on my R package ‘yorkr’ which is now available in CRAN. The R package and hence this Shiny app is based on data from Cricsheet. Inswinger is based on the latest data dump from Cricsheet (Dec 2016) and includes all International T20 till then. There are a lot of new Internationation teams like Oman, Hong Kong, UAE, etc. In total there are 22 different International T20 teams in my Inswinger app.

The countries are a) Afghanistan b) Australia c) Bangladesh d) Bermuda e) Canada f) England g) Hong Kong h) India i) Ireland j) Kenya k) Nepal l) Netherlands m) New Zealand n) Oman o) Pakistan p) Papua New Guinea q) Scotland r) South Africa s) Sri Lanka t) United Arab Emirates u) West Indies v) Zimbabwe

If you are passionate about cricket, and love analyzing cricket performances, then check out my 2 racy books on cricket! In my books, I perform detailed yet compact analysis of performances of both batsmen, bowlers besides evaluating team & match performances in Tests , ODIs, T20s & IPL. You can buy my books on cricket from Amazon at $12.99 for the paperback and $4.99/$6.99 respectively for the kindle versions. The books can be accessed at Cricket analytics with cricketr  and Beaten by sheer pace-Cricket analytics with yorkr  A must read for any cricket lover! Check it out!!

1

 

My R package ‘yorkr’,  on which both these Shiny apps are based, has the ability to output either a dataframe or plot, depending on a parameter plot=TRUE or FALSE. Hence in the Inswinger Shiny app results can be displayed both as table or a plot depending on the choice of function.

Inswinger can do detailed analyses of a) Individual T20 batsman b) Individual T20 bowler c) Any T20 match d) Head to head confrontation between 2 T20 teams e) All matches of a T20 team against all other teams.

The Shiny app can be accessed at Inswinger

The code for Inswinger is available at Github. Feel free to clone/download/fork  the code from Inswinger

Based on the 5 detailed analysis domains there are 5 tabs
A) T20 Batsman: This tab can be used to perform analysis of all T20 batsman. If a batsman has played in more than 1 team, then the overall performance is considered. There are 10 functions for the T20 Batsman. They are shown below
– Batsman Runs vs. Deliveries
– Batsman’s Fours & Sixes
– Dismissals of batsman
– Batsman’s Runs vs Strike Rate
– Batsman’s Moving Average
– Batsman’s Cumulative Average Run
– Batsman’s Cumulative Strike Rate
– Batsman’s Runs against Opposition
– Batsman’s Runs at Venue
– Predict Runs of batsman

B) T20 Bowler: This tab can be used to analyze individual T20 bowlers. The functions handle T20 bowlers who have played in more than 1 T20 team.
– Mean Economy Rate of bowler
– Mean runs conceded by bowler
– Bowler’s Moving Average
– Bowler’s Cumulative Avg. Wickets
– Bowler’s Cumulative Avg. Economy Rate
– Bowler’s Wicket Plot
– Bowler’s Wickets against opposition
– Bowler’s Wickets at Venues
– Bowler’s wickets prediction

C) T20 match: This tab can be used for analyzing individual T20 matches. The available functions are
– Match Batting Scorecard – Table
– Batting Partnerships – Plot, Table
– Batsmen vs Bowlers – Plot, Table
– Match Bowling Scorecard   – Table
– Bowling Wicket Kind – Plot, Table
– Bowling Wicket Runs – Plot, Table
– Bowling Wicket Match – Plot, Table
– Bowler vs Batsmen – Plot, Table
– Match Worm Graph – Plot

D) Head to head: This tab can be used for analyzing head-to-head confrontations, between any 2 T20 teams for e.g. all matches between India vs Australia or West Indies vs Sri Lanka . The available functions are
-Team Batsmen Batting Partnerships All Matches – Plot, Table {Summary and Detailed}
-Team Batting Scorecard All Matches – Table
-Team Batsmen vs Bowlers all Matches – Plot, Table
-Team Wickets Opposition All Matches – Plot, Table
-Team Bowling Scorecard All Matches – Table
-Team Bowler vs Batsmen All Matches – Plot, Table
-Team Bowlers Wicket Kind All Matches – Plot, Table
-Team Bowler Wicket Runs All Matches – Plot, Table
– Win Loss All Matches – Plot

E) T20 team’s overall performance: this tab can be used analyze the overall performance of any T20 team. For this analysis all matches played by this team is considered. The available functions are
-Team Batsmen Partnerships Overall – Plot, Table {Summary and Detailed)}
-Team Batting Scorecard Overall –Table
-Team Batsmen vs Bowlers Overall – Plot, Table
-Team Bowler vs Batsmen Overall – Plot, Table
-Team Bowling Scorecard Overall – Table
-Team Bowler Wicket Kind Overall – Plot, Table

Below I include a random set of charts that are generated in each of the 5 tabs
A. IPL Batsman
a. Shakib-al-Hassan (Bangladesh) :  Runs vs Deliveries
untitled

b. Virat Kohli (India) – Cumulative Average
untitled

c.  AB Devilliers (South Africa) – Runs at venues
untitled

d. Glenn Maxwell (Australia)  – Predict runs vs deliveries faces
untitled

B. IPL Bowler
a. TG Southee (New Zealand) – Mean Economy Rate vs overs
untitled

b) DJ Bravo – Moving Average of wickets
untitled

c) AC Evans (Scotland) – Bowler Wickets Against Opposition
untitled

C.T20 Match
a. Match Score (Afghanistan vs Canada, 2012-03-18)
untitled

b)  Match batting partnerships (Plot) Hong Kong vs Oman (2015-11-21), Hong Kong
Hong Kong Partnerships
untitled

c) Match batting partnerships (Table) – Ireland vs Scotland(2012-03-18, Ireland)
Batting partnership can also be displayed as a table
untitled

d) Batsmen vs Bowlers (Plot) – India vs England (2012-12-22)
untitled

e) Match Worm Chart – Sri Lanka vs Pakistan (2015-08-01)
untitled

D.Head to head
a) Team Batsmen Partnership (Plot) – India vs Australia (all matches)
Virat Kohli has the highest total runs in partnerships against Australia
untitled

b)  Team Batsmen Partnership (Summary – Table) – Kenya vs Bangladesh
untitled

c) Team Bowling Scorecard (Table only) India vs South Africa all Matches
untitled

d) Wins- Losses New Zealand vs West Indies all Matches
untitled

C) Overall performances
a) Batting Scorecard All Matches  (Table only) – England’s overall batting performance
Eoin Morgan, Kevin Pieterson  & SJ Taylor have the best performance
untitled

b) Batsman vs Bowlers all Matches (Plot)
India’s best performing batsman (Rank=1) is Virat Kohli
untitled

c)  Batsman vs Bowlers all Matches (Table)
The plot above for Virat Kohli can also be displayed as a table. Kohli has score most runs DJ Bravo, SR Watson & Shahid Afridi
untitled

The Inswinger Shiny app can be accessed at Inswinger. Give it a swing!

The code for Inswinger is available at Github. Feel free to clone/download/fork  the code from Inswinger

Also see my other Shiny apps
1.GooglyPlus
2.What would Shakespeare say?
3.Sixer
4.Revisiting crimes against women in India

You may also like
1. Neural Networks: The mechanics of backpropagation
A primer on Qubits, Quantum gates and Quantum Operation
2. Re-working the Lucy Richardson algorithm in OpenCV
3.Design Principles of Scalable, Distributed Systems
4.Spicing up a IBM Bluemix cloud app with MongoDB and NodeExpress
5.Programming languages in layman’s language
7.Re-introducing cricketr! : An R package to analyze performances of cricketers

To see all posts take at a look at Index of Posts

GooglyPlus: yorkr analyzes IPL players, teams, matches with plots and tables


In this post I introduce my new Shiny app,“GooglyPlus”, which is a  more evolved version of my earlier Shiny app “Googly”. My R package ‘yorkr’,  on which both these Shiny apps are based, has the ability to output either a dataframe or plot, depending on a parameter plot=TRUE or FALSE. My initial version of the app only included plots, and did not exercise the yorkr package fully. Moreover, I am certain, there may be a set of cricket aficionados who would prefer, numbers to charts. Hence I have created this enhanced version of the Googly app and appropriately renamed it as GooglyPlus. GooglyPlus is based on the yorkr package which uses data from Cricsheet. The app is based on IPL data from  all IPL matches from 2008 up to 2016. Feel free to clone/fork or download the code from Github at GooglyPlus.

If you are passionate about cricket, and love analyzing cricket performances, then check out my 2 racy books on cricket! In my books, I perform detailed yet compact analysis of performances of both batsmen, bowlers besides evaluating team & match performances in Tests , ODIs, T20s & IPL. You can buy my books on cricket from Amazon at $12.99 for the paperback and $4.99/$6.99 respectively for the kindle versions. The books can be accessed at Cricket analytics with cricketr  and Beaten by sheer pace-Cricket analytics with yorkr  A must read for any cricket lover! Check it out!!

1

 

Click  GooglyPlus to access the Shiny app!

The changes for GooglyPlus over the earlier Googly app is only in the following 3 tab panels

  • IPL match
  • Head to head
  • Overall Performance

The analysis of IPL batsman and IPL bowler tabs are unchanged. These charts are as they were before.

The changes are only in  tabs i) IPL match ii) Head to head and  iii) Overall Performance. New functionality has been added and existing functions now have the dual option of either displaying a plot or a table.

The changes are

A) IPL Match
The following additions/enhancements have been done

-Match Batting Scorecard – Table
-Batting Partnerships – Plot, Table (New)
-Batsmen vs Bowlers – Plot, Table(New)
-Match Bowling Scorecard   – Table (New)
-Bowling Wicket Kind – Plot, Table (New)
-Bowling Wicket Runs – Plot, Table (New)
-Bowling Wicket Match – Plot, Table (New)
-Bowler vs Batsmen – Plot, Table (New)
-Match Worm Graph – Plot

B) Head to head
The following functions have been added/enhanced

-Team Batsmen Batting Partnerships All Matches – Plot, Table {Summary (New) and Detailed (New)}
-Team Batting Scorecard All Matches – Table (New)
-Team Batsmen vs Bowlers all Matches – Plot, Table (New)
-Team Wickets Opposition All Matches – Plot, Table (New)
-Team Bowling Scorecard All Matches – Table (New)
-Team Bowler vs Batsmen All Matches – Plot, Table (New)
-Team Bowlers Wicket Kind All Matches – Plot, Table (New)
-Team Bowler Wicket Runs All Matches – Plot, Table (New)
-Win Loss All Matches – Plot

C) Overall Performance
The following additions/enhancements have been done in this tab

-Team Batsmen Partnerships Overall – Plot, Table {Summary (New) and Detailed (New)}
-Team Batting Scorecard Overall –Table (New)
-Team Batsmen vs Bowlers Overall – Plot, Table (New)
-Team Bowler vs Batsmen Overall – Plot, Table (New)
-Team Bowling Scorecard Overall – Table (New)
-Team Bowler Wicket Kind Overall – Plot, Table (New)

Included below are some random charts and tables. Feel free to explore the Shiny app further

1) IPL Match
a) Match Batting Scorecard (Table only)
This is the batting score card for the Chennai Super Kings & Deccan Chargers 2011-05-11

untitled

b)  Match batting partnerships (Plot)
Delhi Daredevils vs Kings XI Punjab – 2011-04-23

untitled

c) Match batting partnerships (Table)
The same batting partnership  Delhi Daredevils vs Kings XI Punjab – 2011-04-23 as a table

untitled

d) Batsmen vs Bowlers (Plot)
Kolkata Knight Riders vs Mumbai Indians 2010-04-19

Untitled.png

e)  Match Bowling Scorecard (Table only)
untitled

B) Head to head

a) Team Batsmen Partnership (Plot)
Deccan Chargers vs Kolkata Knight Riders all matches

untitled

b)  Team Batsmen Partnership (Summary – Table)
In the following tables it can be seen that MS Dhoni has performed better that SK Raina  CSK against DD matches, whereas SK Raina performs better than Dhoni in CSK vs  KKR matches

i) Chennai Super Kings vs Delhi Daredevils (Summary – Table)

untitled

ii) Chennai Super Kings vs Kolkata Knight Riders (Summary – Table)
untitled

iii) Rising Pune Supergiants vs Gujarat Lions (Detailed – Table)
This table provides the detailed partnership for RPS vs GL all matches

untitled

c) Team Bowling Scorecard (Table only)
This table gives the bowling scorecard of Pune Warriors vs Deccan Chargers in all matches

untitled

C) Overall performances
a) Batting Scorecard All Matches  (Table only)

This is the batting scorecard of Royal Challengers Bangalore. The top 3 batsmen are V Kohli, C Gayle and AB Devilliers in that order

untitled

b) Batsman vs Bowlers all Matches (Plot)
This gives the performance of Mumbai Indian’s batsman of Rank=1, which is Rohit Sharma, against bowlers of all other teams

untitled

c)  Batsman vs Bowlers all Matches (Table)
The above plot as a table. It can be seen that Rohit Sharma has scored maximum runs against M Morkel, then Shakib Al Hasan and then UT Yadav.

untitled

d) Bowling scorecard (Table only)
The table below gives the bowling scorecard of CSK. R Ashwin leads with a tally of 98 wickets followed by DJ Bravo who has 88 wickets and then JA Morkel who has 83 wickets in all matches against all teams

Untitled.png

This is just a random selection of functions. Do play around with the app and checkout how the different IPL batsmen, bowlers and teams stack against each other. Do read my earlier post Googly: An interactive app for analyzing IPL players, matches and teams using R package yorkr  for more details about the app and other functions available.

Click GooglyPlus to access the Shiny app!

You can clone/fork/download the code from Github at GooglyPlus

Hope you have fun playing around with the Shiny app!

Note: In the tabs, for some of the functions, not all controls  are required. It is possible to enable the controls selectively but this has not been done in this current version. I may make the changes some time in the future.

Take a look at my other Shiny apps
a.Revisiting crimes against women in India
b. Natural language processing: What would Shakespeare say?

Check out some of my other posts
1. Analyzing World Bank data with WDI, googleVis Motion Charts
2. Video presentation on Machine Learning, Data Science, NLP and Big Data – Part 1
3. Singularity
4. Design principles of scalable, distributed systems
5. Simulating an Edge shape in Android
6. Dabbling with Wiener filter in OpenCV

To see all posts click Index of Posts

yorkr ranks IPL Players post 2016 season


Here is a short post which ranks IPL batsmen and bowlers post the 2016 IPL season. These are based on match data from Cricsheet. I had already ranked IPL players in my post yorkr ranks IPL batsmen and bowlers, but that was mid IPL 2016 season. This post will be final ranking post 2016 season

If you are passionate about cricket, and love analyzing cricket performances, then check out my 2 racy books on cricket! In my books, I perform detailed yet compact analysis of performances of both batsmen, bowlers besides evaluating team & match performances in Tests , ODIs, T20s & IPL. You can buy my books on cricket from Amazon at $12.99 for the paperback and $4.99/$6.99 respectively for the kindle versions. The books can be accessed at Cricket analytics with cricketr  and Beaten by sheer pace-Cricket analytics with yorkr  A must read for any cricket lover! Check it out!!

1

 

This post has also been published in RPubs RankIPLPlayers2016. You can download this as a pdf file at RankIPLPlayers2016.pdf.

You can take a look at the code at rankIPLPlayers2016

Checkout my interactive Shiny apps GooglyPlus (plots & tables) and Googly (only plots) which can be used to analyze IPL players, teams and matches.

rm(list=ls())
library(yorkr)
library(dplyr)
source('C:/software/cricket-package/cricsheet/ipl2016/final/R/rankIPLBatsmen.R', encoding = 'UTF-8')
source('C:/software/cricket-package/cricsheet/ipl2016/final/R/rankIPLBowlers.R', encoding = 'UTF-8')

Rank IPL batsmen post 2016

Chris Gayle, Shaun Marsh & David Warner are top 3 IPL batsmen. Gayle towers over everybody, with an 38.28 Mean Runs, and a Mean Strike Rate of 138.85. Virat Kohli comes in 4th, with 34.52 as his Average Runs per innings, and a Mean Strike Rate of 117.51

iplBatsmanRank <- rankIPLBatsmen()
as.data.frame(iplBatsmanRank[1:30,])
##             batsman matches meanRuns    meanSR
## 1          CH Gayle      92 38.28261 138.85120
## 2          SE Marsh      60 36.40000 118.97783
## 3         DA Warner     104 34.51923 124.88798
## 4           V Kohli     136 31.77941 117.51000
## 5         AM Rahane      89 31.46067 104.62989
## 6    AB de Villiers     109 29.93578 136.48945
## 7      SR Tendulkar      78 29.62821 108.58962
## 8         G Gambhir     133 28.94737 109.61263
## 9         RG Sharma     140 28.68571 117.79057
## 10         SK Raina     143 28.41259 121.55713
## 11        SR Watson      90 28.21111 125.80122
## 12         S Dhawan     110 28.09091 111.97282
## 13         R Dravid      79 27.87342 109.14544
## 14         DR Smith      76 27.55263 120.22329
## 15        JP Duminy      70 27.28571 122.99243
## 16      BB McCullum      94 26.86170 118.55606
## 17        JH Kallis      97 26.83505  95.47866
## 18         V Sehwag     105 26.26667 137.11562
## 19       RV Uthappa     132 26.18182 123.16326
## 20     AC Gilchrist      81 25.77778 122.69074
## 21          M Vijay      99 25.69697 106.02010
## 22    KC Sangakkara      70 25.67143 112.97529
## 23         MS Dhoni     131 25.14504 131.62206
## 24        DA Miller      60 24.76667 133.80983
## 25        AT Rayudu      99 23.35354 121.59313
## 26 DPMD Jayawardene      80 23.05000 114.54712
## 27     Yuvraj Singh     103 22.46602 118.15000
## 28        DJ Hussey      63 22.26984        NA
## 29        YK Pathan     121 22.25620 132.58793
## 30      S Badrinath      66 22.22727 114.97061

Rank IPL bowlers

The top 3 IPL T20 bowlers are SL Malinga, DJ Bravo and SP Narine

Don’t get hung up on the decimals in the average wickets for the bowlers. All it implies is that if 2 bowlers have average wickets of 1.0 and 1.5, it implies that in 2 matches the 1st bowler will take 2 wickets and the 2nd bowler will take 3 wickets.

setwd("C:/software/cricket-package/cricsheet/ipl2016/details")
iplBowlersRank <- rankIPLBowlers()
as.data.frame(iplBowlersRank[1:30,])
##             bowler matches meanWickets   meanER
## 1       SL Malinga      96    1.645833 6.545208
## 2         DJ Bravo      58    1.517241 7.929310
## 3        SP Narine      65    1.492308 6.155077
## 4          B Kumar      45    1.422222 7.355556
## 5        YS Chahal      41    1.414634 8.057073
## 6         M Morkel      37    1.405405 7.626216
## 7        IK Pathan      40    1.400000 7.579250
## 8         RP Singh      42    1.357143 7.966429
## 9         MM Patel      31    1.354839 7.282581
## 10   R Vinay Kumar      63    1.317460 8.342540
## 11  Sandeep Sharma      38    1.315789 7.697368
## 12       MM Sharma      46    1.304348 7.740652
## 13         P Awana      33    1.303030 8.325758
## 14        MM Patel      30    1.300000 7.569667
## 15          Z Khan      41    1.292683 7.735854
## 16         PP Ojha      53    1.245283 7.268679
## 17     JP Faulkner      40    1.225000 8.502250
## 18 Shakib Al Hasan      41    1.170732 7.103659
## 19     DS Kulkarni      32    1.156250 8.372188
## 20        UT Yadav      46    1.152174 8.394783
## 21        A Kumble      41    1.146341 6.567073
## 22       JA Morkel      73    1.136986 8.131370
## 23        SK Warne      53    1.132075 7.277170
## 24        A Mishra      55    1.127273 7.319455
## 25        UT Yadav      33    1.090909 8.853636
## 26        L Balaji      34    1.088235 7.186176
## 27       PP Chawla      35    1.085714 8.162000
## 28        R Ashwin      92    1.065217 6.812391
## 29  M Muralitharan      39    1.051282 6.470256
## 30 Harbhajan Singh     120    1.050000 7.134833

Beaten by sheer pace! Cricket analytics with yorkr in paperback and Kindle versions


Untitled

My book “Beaten by sheer pace! Cricket analytics with yorkr” is now available in paperback and Kindle versions. The paperback is available from Amazon (US, UK and Europe) for $ 54.95. The Kindle version can be downloaded from the Kindle store for $4.99 (Rs 332/-). Do pick up your copy. It should be a good read for a Sunday afternoon.

This book of mine contains my posts based on my R package ‘yorkr’ now in CRAN. The package yorkr uses the data from Cricsheet (http://cricsheet.org/) and can perform analysis of ODI and T20 matches. yorkr can analyze teams against a specific opposition or all oppositions, besides providing details on batsmen or bowlers individual performances The analyses include team batting partnerships, performances of batsmen against bowlers, bowlers against batsmen, bowlers best performances etc.  Individual analyses of batsmen strike rate, cumulative average, bowler economy rate, bowler moving average etc can be performances

The book includes the following chapters based on my R package yorkr.

CONTENTS
Preface
Foreword
1.Introducing cricket package yorkr: Part 1- Beaten by sheer pace!
2.Introducing cricket package yorkr: Part 2-Trapped leg before wicket!
3.Introducing cricket package yorkr: Part 3-Foxed by flight!
4.Introducing cricket package yorkr:Part 4-In the block hole!
5.yorkr pads up for the Twenty20s: Part 1- Analyzing team’s match performance!
6.yorkr pads up for the Twenty20s: Part 2-Head to head confrontation between teams
7.yorkr pads up for the Twenty20s:Part 3:Overall team performance against all oppositions!
8.yorkr pads up for Twenty20s:Part 4- Individual batting and bowling performances!
9.yorkr crashes the IPL party ! – Part 1
10.yorkr crashes the IPL party! – Part 2
11.yorkr crashes the IPL party! – Part 3!
12.yorkr crashes the IPL party! – Part 4
13.yorkr ranks IPL batsmen and bowlers
14.yorkr ranks T20 batsmen and bowlers
15.yorkr ranks ODI batsmen and bowlers
16.yorkr is generic!
Important links
Afterword
Other books by author
About the author

Checkout my interactive Shiny apps GooglyPlus (plots & tables) and Googly (only plots) which can be used to analyze IPL players, teams and matches.

yorkr is generic!


The features and functionality in my yorkr package is now complete. My R package yorkr, is totally generic, which means that the R package  can be used for all ODI, T20 matches. Hence yorkr can be used for professional or amateur ODI and T20 matches. The R package can be used for both men and women ODI, T20 international or domestic matches. The main requirement is, that the match data  be created as a Yaml file in the format Cricsheet (Required yaml format for the match data).

If you are passionate about cricket, and love analyzing cricket performances, then check out my 2 racy books on cricket! In my books, I perform detailed yet compact analysis of performances of both batsmen, bowlers besides evaluating team & match performances in Tests , ODIs, T20s & IPL. You can buy my books on cricket from Amazon at $12.99 for the paperback and $4.99/$6.99 respectively for the kindle versions. The books can be accessed at Cricket analytics with cricketr  and Beaten by sheer pace-Cricket analytics with yorkr  A must read for any cricket lover! Check it out!!

1

$4.99/Rs 320 and $6.99/Rs448 respectively

 

I have successfully used my R functions for the Indian Premier League (IPL) matches with changes only to the convertAllYamlFiles2RDataFramesXX (please see posts below)

The convertAllYamlFiles2RDataframes &convertAllYamlFiles2RDataFramesT20 will have to be customized for the names of the teams playing in the domestic professional or amateur matches. All other classes of functions namely Class1, Class2, Class 3 and Class 4 as discussed in my post Introducing cricket package yorkr-Part 1: Beaten by sheer pace can be used as is without any changes.

There are numerous professional & amateur T20 matches that are played around the world. Here are a list of domestic T20 tournaments that are played around the world (from Wikipedia). The yorkr package can be used for any of these matches once the match data is saved as yaml as mentioned above.

So do go ahead and have fun, analyzing cricket performances with yorkr!

Please take a look at my posts on how to use yorkr for ODI, Twenty20 matches.

  1. Introducing cricket package yorkr:Part 1- Beaten by sheer pace!
    2. Introducing cricket package yorkr:Part 2- Trapped leg before wicket!
    3.  Introducing cricket package yorkr:Part 3- foxed by flight!
    4. Introducing cricket package yorkr:Part 4-In the block hole!
    5. yorkr pads up for the Twenty20s: Part 1- Analyzing team”s match performance
    6. yorkr pads up for the Twenty20s: Part 2-Head to head confrontation between teams
    7. yorkr pads up for the Twenty20s:Part 3:Overall team performance against all oppositions!
    8. yorkr pads up for Twenty20s:Part 4- Individual batting and bowling performances!
    9. yorkr crashes the IPL party ! – Part 1
    10. yorkr crashes the IPL party! – Part 2
    11. yorkr crashes the IPL party! – Part 3
    12. yorkr crashes the IPL party! – Part 4
    13. yorkr ranks IPL batsmen and bowlers
    14. yorkr ranks T20 batsmen and bowlers
    15. yorkr ranks ODI batsmen and bowlers

yorkr ranks IPL batsmen and bowlers


Here is a short post which ranks IPL batsmen and bowlers. These are based on match data from Cricsheet. Ranking batsmen and bowlers in IPL is more challenging as the players can belong to different teams in different years. Hence I create a combined data frame of the batsmen and bowlers regardless of their IPL teams and calculate a) average runs and average strike rate for batsmen and c) average wickets and d) average economy rate for bowlers.

I will be doing this ranking for T20 and ODI batting and bowling performances shortly.

If you are passionate about cricket, and love analyzing cricket performances, then check out my 2 racy books on cricket! In my books, I perform detailed yet compact analysis of performances of both batsmen, bowlers besides evaluating team & match performances in Tests , ODIs, T20s & IPL. You can buy my books on cricket from Amazon at $12.99 for the paperback and $4.99/$6.99 respectively for the kindle versions. The books can be accessed at Cricket analytics with cricketr  and Beaten by sheer pace-Cricket analytics with yorkr  A must read for any cricket lover! Check it out!!

1

 

This post has also been published in RPubs RankIPLPlayers. You can download this as a pdf file at RankIPLPlayers.pdf.

You can take a look at the code at rankIPLPlayers (should be available in yorkr_0.0.5)

Checkout my interactive Shiny apps GooglyPlus (plots & tables) and Googly (only plots) which can be used to analyze IPL players, teams and matches.

The results are slightly surprising

rm(list=ls())
library(yorkr)
library(dplyr)
setwd("C:/software/cricket-package/cricsheet/cleanup/IPL/rank")
source("rankIPLBatsmen.R")
source("rankIPLBowlers.R")

Rank IPL batsmen

Chris Gayle, MEK Hussey and Shane Watson are top 3 IPL batsmen. Gayle towers over the others in mean runs and mean strike rate. Surprisingly Ajinkya Rahane is the top Indian T20 batsman, if we leave out Sachin Tendulkar (who tops India yet again!). The other top IPL T20 batsmen are Raina, Gambhir, Rohit Sharma in that order. Virat Kohli comes a distant 14th.

iplBatsmanRank <- rankIPLBatsmen()
as.data.frame(iplBatsmanRank[1:30,])
##             batsman matches meanRuns    meanSR
## 1          CH Gayle     128 40.00781 144.92188
## 2        MEK Hussey      64 33.57812 107.23500
## 3         SR Watson      75 31.46667 129.97733
## 4      SR Tendulkar     127 29.74803 108.86622
## 5         AM Rahane      77 29.14286 101.40065
## 6         DA Warner     134 29.10448 118.38313
## 7         JP Duminy      94 28.77660 124.61702
## 8          SK Raina     128 28.62500 122.12656
## 9         G Gambhir     210 28.13810 108.78090
## 10        RG Sharma     181 28.07182 118.57801
## 11         DR Smith      78 27.82051 119.64462
## 12      BB McCullum      98 27.81633 114.91255
## 13         S Dhawan     109 27.74312 112.21000
## 14          V Kohli     188 27.56915 113.81261
## 15   AB de Villiers     150 27.46000 136.70860
## 16         R Dravid     104 27.02885 107.78923
## 17        JH Kallis     167 26.54491  94.65641
## 18         V Sehwag     174 26.39655 140.29011
## 19       RV Uthappa     166 26.27711 120.48506
## 20       SC Ganguly      86 25.98837  96.39849
## 21     AC Gilchrist      81 25.77778 122.69074
## 22    KC Sangakkara      70 25.67143 112.97529
## 23         MS Dhoni     119 25.29412 130.99832
## 24       TM Dilshan      82 24.13415 101.12634
## 25          M Vijay      96 23.92708 102.01771
## 26        AT Rayudu     146 23.63014 117.91000
## 27 DPMD Jayawardene     109 22.95413 110.73862
## 28        MK Pandey     105 22.71429        NA
## 29     Yuvraj Singh     112 22.48214 114.51018
## 30      S Badrinath      66 22.22727 114.97061

Rank IPL bowlers

The top 3 IPL T20 bowlers are SL Malinga,SP Narine and DJ Bravo.

Don’t get hung up on the decimals in the average wickets for the bowlers. All it implies is that if 2 bowlers have average wickets of 1.0 and 1.5, it implies that in 2 matches the 1st bowler will take 2 wickets and the 2nd bowler will take 3 wickets.

iplBowlersRank <- rankIPLBowlers()
as.data.frame(iplBowlersRank[1:30,])
##             bowler matches meanWickets   meanER
## 1       SL Malinga      96    1.645833 6.545208
## 2        SP Narine      54    1.555556 5.967593
## 3         DJ Bravo      58    1.517241 7.929310
## 4         M Morkel      37    1.405405 7.626216
## 5        IK Pathan      40    1.400000 7.579250
## 6         RP Singh      42    1.357143 7.966429
## 7         MM Patel      31    1.354839 7.282581
## 8  Shakib Al Hasan      32    1.343750 6.911250
## 9    R Vinay Kumar      63    1.317460 8.342540
## 10       MM Sharma      46    1.304348 7.740652
## 11         P Awana      33    1.303030 8.325758
## 12        MM Patel      30    1.300000 7.569667
## 13          Z Khan      41    1.292683 7.735854
## 14        A Mishra      43    1.255814 7.226512
## 15         PP Ojha      53    1.245283 7.268679
## 16     JP Faulkner      40    1.225000 8.502250
## 17     DS Kulkarni      32    1.156250 8.372188
## 18        UT Yadav      46    1.152174 8.394783
## 19        A Kumble      41    1.146341 6.567073
## 20       JA Morkel      73    1.136986 8.131370
## 21        SK Warne      53    1.132075 7.277170
## 22 Harbhajan Singh     107    1.102804 7.014953
## 23        L Balaji      34    1.088235 7.186176
## 24        R Ashwin      92    1.065217 6.812391
## 25        AR Patel      31    1.064516 7.137097
## 26  M Muralitharan      39    1.051282 6.470256
## 27         P Kumar      36    1.027778 8.148056
## 28       PP Chawla      85    1.023529 8.017765
## 29       SR Watson      67    1.014925 7.695224
## 30        DJ Bravo      30    1.000000 7.966333

Conclusion: The results are somewhat surprising. The ranking was based on data from Cricsheet. The data in this site are available from 2008-2015. I hope to do this ranking for T20 and ODIs shortly

Watch this space!

  1. Introducing cricket package yorkr-Part1:Beaten by sheer pace!.
  2. yorkr pads up for the Twenty20s: Part 1- Analyzing team“s match performance.
  3. yorkr crashes the IPL party !Part 1
  4. Introducing cricketr! : An R package to analyze performances of cricketers
  5. Cricket analytics with cricketr in paperback and Kindle versions

yorkr crashes the IPL party! – Part 4


Introduction

I’ve missed more than 9000 shots in my career. I’ve lost almost 300 games. 26 times, I’ve been trusted to take the game winning shot and missed. I’ve failed over and over and over again in my life. And that is why I succeed.

                      Michael Jordan

Success is where preparation and opportunity meet.

                      Bobby Unser

It is not whether you get knocked down. It is whether you get up.

                      Vince Lombardi

Make sure your worst enemy doesn’t live between your own two ears.

                      Laird Hamilton

This post should be the last post for “yorkr crashes the IPL party!”. In fact it is final post for the whole ‘yorkr’ series. I have now covered the use of yorkr for ODIs, Twenty20s and IPL T20 formats. I will not be including functionality in yorkr to handle Test cricket from Cricsheet. I would recommend that you use my R package cricketr. Please see my post Introducing cricketr! : An R package to analyze performances of cricketers

In this last post on IPL T20 I look at the top individual batting and bowling performances in the IPL Twenty20s. Also please take a look at my 3 earlier post on yorkr’s handling of IPL Twenty20 matches

  1. yorkr crashes the IPL party ! – Part 1
  2. yorkr crashes the IPL party ! – Part 2
  3. yorkr crashes the IPL party ! – Part 3

If you are passionate about cricket, and love analyzing cricket performances, then check out my 2 racy books on cricket! In my books, I perform detailed yet compact analysis of performances of both batsmen, bowlers besides evaluating team & match performances in Tests , ODIs, T20s & IPL. You can buy my books on cricket from Amazon at $12.99 for the paperback and $4.99/$6.99 respectively for the kindle versions. The books can be accessed at Cricket analytics with cricketr  and Beaten by sheer pace-Cricket analytics with yorkr  A must read for any cricket lover! Check it out!!

1

 

This post has also been published at RPubs IPLT20-Part4 and can also be downloaded as a PDF document from IPLT20-Part4.pdf.

You can clone/fork the code for the package yorkr from Github at yorkr-package

Checkout my interactive Shiny apps GooglyPlus (plots & tables) and Googly (only plots) which can be used to analyze IPL players, teams and matches.

The list of Class 4 functions are shown below.The Twenty20 features will be available from yorkr_0.0.4

Batsman functions

  1. batsmanRunsVsDeliveries
  2. batsmanFoursSixes
  3. batsmanDismissals
  4. batsmanRunsVsStrikeRate
  5. batsmanMovingAverage
  6. batsmanCumulativeAverageRuns
  7. batsmanCumulativeStrikeRate
  8. batsmanRunsAgainstOpposition
  9. batsmanRunsVenue
  10. batsmanRunsPredict

Bowler functions

  1. bowlerMeanEconomyRate
  2. bowlerMeanRunsConceded
  3. bowlerMovingAverage
  4. bowlerCumulativeAvgWickets
  5. bowlerCumulativeAvgEconRate
  6. bowlerWicketPlot
  7. bowlerWicketsAgainstOpposition
  8. bowlerWicketsVenue
  9. bowlerWktsPredict
library(yorkr)
library(gridExtra)
library(rpart.plot)
library(dplyr)
library(ggplot2)
rm(list=ls())

A. Batsman functions

1. Get IPL Team Batting details

The function below gets the overall IPL team batting details based on the RData file available in IPL T20 matches. This is currently also available in Github at [IPL-T20-matches] (https://github.com/tvganesh/yorkrData/tree/master/IPL/IPL-T20-matches). The batting details of the IPL team in each match is created and a huge data frame is created by rbinding the individual dataframes. This can be saved as a RData file

setwd("C:/software/cricket-package/york-test/yorkrData/IPL/IPL-T20-matches")
csk_details <- getTeamBattingDetails("Chennai Super Kings",dir=".", save=TRUE)
dd_details <- getTeamBattingDetails("Delhi Daredevils",dir=".",save=TRUE)
kkr_details <- getTeamBattingDetails("Kolkata Knight Riders",dir=".",save=TRUE)
mi_details <- getTeamBattingDetails("Mumbai Indians",dir=".",save=TRUE)
rcb_details <- getTeamBattingDetails("Royal Challengers Bangalore",dir=".",save=TRUE)

2. Get IPL batsman details

This function is used to get the individual IPL T20 batting record for a the specified batsman of the team as in the functions below. For analyzing the batting performances I have chosen the top IPL T20 batsmen from the teams. This was based to a large extent on batting scorecard functions from yorkr crashes the IPL party!:Part 3 The top IPL batsmen chosen are the ones below

  1. Suresh Raina (CSK)
  2. MS Dhoni (CSK)
  3. Virendar Sehwag (DD)
  4. Rohit Sharma (MI)
  5. Gautham Gambhir (KKR)
  6. Virat Kohli (RCB)
setwd("C:/software/cricket-package/cricsheet/cleanup/IPL/part4")
raina <- getBatsmanDetails(team="Chennai Super Kings",name="SK Raina",dir=".")
## [1] "./Chennai Super Kings-BattingDetails.RData"
dhoni <- getBatsmanDetails(team="Chennai Super Kings",name="MS Dhoni")
## [1] "./Chennai Super Kings-BattingDetails.RData"
sehwag <-  getBatsmanDetails(team="Delhi Daredevils",name="V Sehwag",dir=".")
## [1] "./Delhi Daredevils-BattingDetails.RData"
gambhir <-  getBatsmanDetails(team="Kolkata Knight Riders",name="G Gambhir",dir=".")
## [1] "./Kolkata Knight Riders-BattingDetails.RData"
rsharma <-  getBatsmanDetails(team="Mumbai Indians",name="RG Sharma",dir=".")
## [1] "./Mumbai Indians-BattingDetails.RData"
kohli <-  getBatsmanDetails(team="Royal Challengers Bangalore",name="V Kohli",dir=".")
## [1] "./Royal Challengers Bangalore-BattingDetails.RData"

3. Runs versus deliveries (in IPL matches)

Sehwag has a superb strike rate. It can be seen that Sehwag averages around 80 runs for around 40 deliveries followed by Rohit Sharma. Raina and Dhoni average around 60 runs

p1 <-batsmanRunsVsDeliveries(raina, "SK Raina")
p2 <-batsmanRunsVsDeliveries(dhoni,"MS Dhoni")
p3 <-batsmanRunsVsDeliveries(sehwag,"V Sehwag")
p4 <-batsmanRunsVsDeliveries(gambhir,"G Gambhir")
p5 <-batsmanRunsVsDeliveries(rsharma,"RG Sharma")
p6 <-batsmanRunsVsDeliveries(kohli,"V Kohli")
grid.arrange(p1,p2,p3,p4,p5,p6, ncol=3)

runsVsDeliveries-1

4. Batsman Total runs, Fours and Sixes (in IPL matches)

Dhoni leads in the runs made from sixes in comparison to the others

raina46 <- select(raina,batsman,ballsPlayed,fours,sixes,runs)
p1 <-batsmanFoursSixes(raina46, "SK Raina")
dhoni46 <- select(dhoni,batsman,ballsPlayed,fours,sixes,runs)
p2 <-batsmanFoursSixes(dhoni46,"MS Dhoni")
sehwag46 <- select(sehwag,batsman,ballsPlayed,fours,sixes,runs)
p3 <-batsmanFoursSixes(sehwag46,"V Sehwag")
gambhir46 <- select(gambhir,batsman,ballsPlayed,fours,sixes,runs)
p4 <-batsmanFoursSixes(gambhir46,"G Gambhir")
rsharma46 <- select(rsharma,batsman,ballsPlayed,fours,sixes,runs)
p5 <-batsmanFoursSixes(rsharma46,"RG Sharma")
kohli46 <- select(kohli,batsman,ballsPlayed,fours,sixes,runs)
p6 <-batsmanFoursSixes(kohli46,"V Kohli")
grid.arrange(p1,p2,p3,p4,p5,p6, ncol=3)

foursSixes-1

5. Batsman dismissals (in IPL matches)

The type of dismissal for each batsman is shown below

p1 <-batsmanDismissals(raina, "SK Raina")
p2 <-batsmanDismissals(dhoni,"MS Dhoni")
p3 <-batsmanDismissals(sehwag,"V Sehwag")
p4 <-batsmanDismissals(gambhir,"G Gambhir")
p5 <-batsmanDismissals(rsharma,"RG Sharma")
p6 <-batsmanDismissals(kohli,"V Kohli")
grid.arrange(p1,p2,p3,p4,p5,p6, ncol=3)

dismissal-1

6. Runs versus Strike Rate (in IPL matches)

Raina, Dhoni and Kohli have an increasing strike rate with more runs scored

p1 <-batsmanRunsVsStrikeRate(raina, "SK Raina")
p2 <-batsmanRunsVsStrikeRate(dhoni,"MS Dhoni")
p3 <-batsmanRunsVsStrikeRate(sehwag,"V Sehwag")
p4 <-batsmanRunsVsStrikeRate(gambhir,"G Gambhir")
p5 <-batsmanRunsVsStrikeRate(rsharma,"RG Sharma")
p6 <-batsmanRunsVsStrikeRate(kohli,"V Kohli")
grid.arrange(p1,p2,p3,p4,p5,p6, ncol=3)

runsSR-1

7. Batsman moving average (in IPL matches)

Rohit Sharma seems to maintain an average of almost 30 runs, while Dhoni and Kohli average around 25.

p1 <-batsmanMovingAverage(raina, "SK Raina")
p2 <-batsmanMovingAverage(dhoni,"MS Dhoni")
p3 <-batsmanMovingAverage(sehwag,"V Sehwag")
p4 <-batsmanMovingAverage(gambhir,"G Gambhir")
p5 <-batsmanMovingAverage(rsharma,"RG Sharma")
p6 <-batsmanMovingAverage(kohli,"V Kohli")
grid.arrange(p1,p2,p3,p4,p5,p6, ncol=3)

ma-1

8. Batsman cumulative average (in IPL matches)

The cumulative runs average of Raina, Gambhir, Kohli and Rohit Sharma are around 28-30 runs.Dhoni drops to 25

p1 <-batsmanCumulativeAverageRuns(raina, "SK Raina")
p2 <-batsmanCumulativeAverageRuns(dhoni,"MS Dhoni")
p3 <-batsmanCumulativeAverageRuns(sehwag,"V Sehwag")
p4 <-batsmanCumulativeAverageRuns(gambhir,"G Gambhir")
p5 <-batsmanCumulativeAverageRuns(rsharma,"RG Sharma")
p6 <-batsmanCumulativeAverageRuns(kohli,"V Kohli")
grid.arrange(p1,p2,p3,p4,p5,p6, ncol=3)

cAvg-1

9. Cumulative Average Strike Rate (in IPL matches)

As seen above Sehwag has a phenomenal cumulative strike rate of around 150, followed by Dhoni around 130, then we Raina and finally Kohli.

p1 <-batsmanCumulativeStrikeRate(raina, "SK Raina")
p2 <-batsmanCumulativeStrikeRate(dhoni,"MS Dhoni")
p3 <-batsmanCumulativeStrikeRate(sehwag,"V Sehwag")
p4 <-batsmanCumulativeStrikeRate(gambhir,"G Gambhir")
p5 <-batsmanCumulativeStrikeRate(rsharma,"RG Sharma")
p6 <-batsmanCumulativeStrikeRate(kohli,"V Kohli")
grid.arrange(p1,p2,p3,p4,p5,p6, ncol=3)

cSR-1

10. Batsman runs against opposition (in IPL matches)

The following charts show the performance of te batsmen against opposing IPL teams


batsmanRunsAgainstOpposition(raina, "SK Raina")

runsOppn1-1

batsmanRunsAgainstOpposition(dhoni,"MS Dhoni")

runsOppn2-1

batsmanRunsAgainstOpposition(sehwag,"V Sehwag")

runsOppn3-1

batsmanRunsAgainstOpposition(gambhir,"G Gambhir")

runsOppn4-1

batsmanRunsAgainstOpposition(rsharma,"RG Sharma")

runsOppn5-1

batsmanRunsAgainstOpposition(kohli,"V Kohli")

runsOppn6-1

11. Runs at different venues (in IPL matches)

The plots below give the performances of the batsmen at different grounds.

batsmanRunsVenue(raina, "SK Raina")

runsVenue1-1

batsmanRunsVenue(dhoni,"MS Dhoni")

runsVenue2-1

batsmanRunsVenue(sehwag,"V Sehwag")

runsVenue3-1

batsmanRunsVenue(gambhir,"G Gambhir")

runsVenue4-1

batsmanRunsVenue(rsharma,"RG Sharma")

runsVenue5-1

batsmanRunsVenue(kohli,"V Kohli")

runsVenue6-1

12. Predict number of runs to deliveries (in IPL matches)

The plots below use rpart classification tree to predict the number of deliveries required to score the runs in the leaf node.

par(mfrow=c(1,3))
par(mar=c(4,4,2,2))
batsmanRunsPredict(raina, "SK Raina")
batsmanRunsPredict(dhoni,"MS Dhoni")
batsmanRunsPredict(sehwag,"V Sehwag")

runsPredict1,runsVenue1-1

par(mfrow=c(1,3))
par(mar=c(4,4,2,2))
batsmanRunsPredict(gambhir,"G Gambhir")
batsmanRunsPredict(rsharma,"RG Sharma")
batsmanRunsPredict(kohli,"V Kohli")

runsPredict2,runsVenue1-1

B. Bowler functions

13. Get bowling details in IPL matches

The function below gets the overall team IPL T20 bowling details based on the RData file available in IPL T20 matches. This is currently also available in Github at [yorkrData] (https://github.com/tvganesh/yorkrData/tree/master/IPL/IPL-T20-matches). The IPL T20 bowling details of the IPL team in each match is created, and a huge data frame is created by rbinding the individual dataframes. This can be saved as a RData file

setwd("C:/software/cricket-package/york-test/yorkrData/IPL/IPL-T20-matches")
kkr_bowling <- getTeamBowlingDetails("Kolkata Knight Riders",dir=".",save=TRUE)
csk_bowling <- getTeamBowlingDetails("Chennai Super Kings",dir=".",save=TRUE)
kxip_bowling <- getTeamBowlingDetails("Kings XI Punjab",dir=".",save=TRUE)
mi_bowling <- getTeamBowlingDetails("Mumbai Indians",dir=".",save=TRUE)
rcb_bowling <- getTeamBowlingDetails("Royal Challengers Bangalore",dir=".",save=TRUE)
rr_bowling <- getTeamBowlingDetails("Rajasthan Royals",dir=".",save=TRUE)
fl <- list.files(".","BowlingDetails.RData")
file.copy(fl, "C:/software/cricket-package/cricsheet/cleanup/IPL/part4")

14. Get bowling details of the individual IPL bowlers

This function is used to get the individual bowling record for a specified bowler of the country as in the functions below. For analyzing the bowling performances the following cricketers have been chosen based on the bowling scorecard from my post yorkr crashes the IPL party ! – Part 3

  1. Ravichander Ashwin (CSK)
  2. DJ Bravo (CSK)
  3. PP Chawla (KXIP)
  4. Harbhajan Singh (MI)
  5. R Vinay Kumar (RCB)
  6. SK Trivedi (RR)
setwd("C:/software/cricket-package/cricsheet/cleanup/IPL/part4")
ashwin <- getBowlerWicketDetails(team="Chennai Super Kings",name="R Ashwin",dir=".")
bravo <-  getBowlerWicketDetails(team="Chennai Super Kings",name="DJ Bravo",dir=".")
chawla <-  getBowlerWicketDetails(team="Kings XI Punjab",name="PP Chawla",dir=".")
harbhajan <-  getBowlerWicketDetails(team="Mumbai Indians",name="Harbhajan Singh",dir=".")
vinay <-  getBowlerWicketDetails(team="Royal Challengers Bangalore",name="R Vinay Kumar",dir=".")
sktrivedi <-  getBowlerWicketDetails(team="Rajasthan Royals",name="SK Trivedi",dir=".")

15. Bowler Mean Economy Rate (in IPL matches)

Ashwin & Chawla have the best economy rates of in the IPL teams, followed by Harbhajan Singh

p1<-bowlerMeanEconomyRate(ashwin,"R Ashwin")
p2<-bowlerMeanEconomyRate(bravo, "DJ Bravo")
p3<-bowlerMeanEconomyRate(chawla, "PP Chawla")
p4<-bowlerMeanEconomyRate(harbhajan, "Harbhajan Singh")
p5<-bowlerMeanEconomyRate(vinay, "R Vinay")
p6<-bowlerMeanEconomyRate(sktrivedi, "SK Trivedi")
grid.arrange(p1,p2,p3,p4,p5,p6, ncol=3)

meanER-1

16. Bowler Mean Runs conceded (in IPL matches)

p1<-bowlerMeanRunsConceded(ashwin,"R Ashwin")
p2<-bowlerMeanRunsConceded(bravo, "DJ Bravo")
p3<-bowlerMeanRunsConceded(chawla, "PP Chawla")
p4<-bowlerMeanRunsConceded(harbhajan, "Harbhajan Singh")
p5<-bowlerMeanRunsConceded(vinay, "R Vinay")
p6<-bowlerMeanRunsConceded(sktrivedi, "SK Trivedi")
grid.arrange(p1,p2,p3,p4,p5,p6, ncol=3)

meanRunsConceded-1

17. Bowler Moving average (in IPL matches)

Harbhajan’s moving average is the best hovering around 2 wickets

p1<-bowlerMovingAverage(ashwin,"R Ashwin")
p2<-bowlerMovingAverage(bravo, "DJ Bravo")
p3<-bowlerMovingAverage(chawla, "PP Chawla")
p4<-bowlerMovingAverage(harbhajan, "Harbhajan Singh")
p5<-bowlerMovingAverage(vinay, "R Vinay")
p6<-bowlerMovingAverage(sktrivedi, "SK Trivedi")
grid.arrange(p1,p2,p3,p4,p5,p6, ncol=3)

bowlerMA-1

17. Bowler cumulative average wickets (in IPL matches)

The cumulative average tells a different story. DJ Bravo and R Vinay have a cumulative average of 2 wickets. All others are around 1.5

p1<-bowlerCumulativeAvgWickets(ashwin,"R Ashwin")
p2<-bowlerCumulativeAvgWickets(bravo, "DJ Bravo")
p3<-bowlerCumulativeAvgWickets(chawla, "PP Chawla")
p4<-bowlerCumulativeAvgWickets(harbhajan, "Harbhajan Singh")
p5<-bowlerCumulativeAvgWickets(vinay, "R Vinay")
p6<-bowlerCumulativeAvgWickets(sktrivedi, "SK Trivedi")
grid.arrange(p1,p2,p3,p4,p5,p6, ncol=3)

cumWkts-1

18. Bowler cumulative Economy Rate (ER) (in IPL matches)

Ashwin & Harbhajan have the best cumulative economy rate

p1<-bowlerCumulativeAvgEconRate(ashwin,"R Ashwin")
p2<-bowlerCumulativeAvgEconRate(bravo, "DJ Bravo")
p3<-bowlerCumulativeAvgEconRate(chawla, "PP Chawla")
p4<-bowlerCumulativeAvgEconRate(harbhajan, "Harbhajan Singh")
p5<-bowlerCumulativeAvgEconRate(vinay, "R Vinay")
p6<-bowlerCumulativeAvgEconRate(sktrivedi, "SK Trivedi")
grid.arrange(p1,p2,p3,p4,p5,p6, ncol=3)

cumER-1

19. Bowler wicket plot (in IPL matches)

The plot below gives the average wickets versus number of overs

p1<-bowlerWicketPlot(ashwin,"R Ashwin")
p2<-bowlerWicketPlot(bravo, "DJ Bravo")
p3<-bowlerWicketPlot(chawla, "PP Chawla")
p4<-bowlerWicketPlot(harbhajan, "Harbhajan Singh")
p5<-bowlerWicketPlot(vinay, "R Vinay")
p6<-bowlerWicketPlot(sktrivedi, "SK Trivedi")
grid.arrange(p1,p2,p3,p4,p5,p6, ncol=3)

wktPlot-1

20. Bowler wicket against opposing IPL teams

bowlerWicketsAgainstOpposition(ashwin,"R Ashwin")

wktsOppn1-1

bowlerWicketsAgainstOpposition(bravo, "DJ Bravo")

wktsOppn2-1

bowlerWicketsAgainstOpposition(chawla, "PP Chawla")

wktsOppn3-1

bowlerWicketsAgainstOpposition(harbhajan, "Harbhajan Singh")

wktsOppn4-1

bowlerWicketsAgainstOpposition(vinay, "R Vinay")

wktsOppn5-1

bowlerWicketsAgainstOpposition(sktrivedi, "SK Trivedi")

wktsOppn6-1

21. Bowler wicket at cricket grounds in IPL

bowlerWicketsVenue(ashwin,"R Ashwin")

wktsAve1-1

bowlerWicketsVenue(bravo, "DJ Bravo")

wktsAve2-1

bowlerWicketsVenue(chawla, "PP Chawla")

wktsAve3-1

bowlerWicketsVenue(harbhajan, "Harbhajan Singh")

wktsAve4-1

bowlerWicketsVenue(vinay, "R Vinay")

wktsAve5-1

bowlerWicketsVenue(sktrivedi, "SK Trivedi")

wktsAve6-1

22. Get Delivery wickets for IPL bowlers

This function creates a dataframe of deliveries and the wickets taken

setwd("C:/software/cricket-package/york-test/yorkrData/IPL/IPL-T20-matches")
ashwin1 <- getDeliveryWickets(team="Chennai Super Kings",dir=".",name="R Ashwin",save=FALSE)
bravo1 <- getDeliveryWickets(team="Chennai Super Kings",dir=".",name="DJ Bravo",save=FALSE)
chawla1 <- getDeliveryWickets(team="Kings XI Punjab",dir=".",name="PP Chawla",save=FALSE)
harbhajan1 <- getDeliveryWickets(team="Mumbai Indians",dir=".",name="Harbhajan Singh",save=FALSE)
vinay1 <- getDeliveryWickets(team="Royal Challengers Bangalore",dir=".",name="R Vinay",save=FALSE)
sktrivedi1 <- getDeliveryWickets(team="Rajasthan Royals",dir=".",name="SK Trivedi",save=FALSE)

23. Predict number of deliveries to wickets in IPL T20

#Ashwin takes 
par(mfrow=c(1,2))
par(mar=c(4,4,2,2))

bowlerWktsPredict(ashwin1,"R Ashwin")
bowlerWktsPredict(bravo1, "DJ Bravo")

wktsPred1-1

par(mfrow=c(1,2))
par(mar=c(4,4,2,2))
bowlerWktsPredict(chawla1, "PP Chawla")
bowlerWktsPredict(harbhajan1, "Harbhajan Singh")

wktsPred2-1

par(mfrow=c(1,2))
par(mar=c(4,4,2,2))
bowlerWktsPredict(vinay1, "R Vinay")
bowlerWktsPredict(sktrivedi1, "SK Trivedi")

wktsPred3-1

Conclusion

This concludes the 4 part writeup of yorkr’s handling of IPL Twenty20’s. You can fork/clone the code from Github at yorkr.

As I mentioned earlier, this brings to a close to all my posts based on my R cricket package yorkr. I do have a couple of more ideas, but this will take some time I think.

Hope you have a great time with my yorkr package!

Till next time, adieu!

Also see

  1. Introducing cricket package yorkr: Part 3-Foxed by flight!
  2. Introducing cricketr! : An R package to analyze performances of cricketers
  3. Cricket analytics with cricketr in paperback and Kindle versions
  4. Bend it like Bluemix, MongoDB with auto-scaling – Part 1
  5. The dark side of the Internet
  6. Modeling a Car in Android
  7. yorkr pads up for the Twenty20s: Part 1- Analyzing team”s match performance
  8. Cricket analytics with cricketr