I fear not the man who has practiced 10,000 kicks once, but I fear the man who has practiced one kick 10,000 times.
Bruce Lee
I’ve missed more than 9000 shots in my career. I’ve lost almost 300 games. 26 times, I’ve been trusted to take the game winning shot and missed. I’ve failed over and over and over again in my life. And that is why I succeed.
Michael Jordan
Man, it doesn’t matter where you come in to bat, the score is still zero
Viv Richards
Introduction
“If cricketr is to cricpy, then yorkr is to _____?”. Yes, you guessed it right, it is yorkpy. In this post, I introduce my 2nd python package, yorkpy, which is a python clone of my R package yorkr. This package is based on data from Cricsheet. yorkpy currently handles IPL T20 matches.
When I created cricpy, the python avatar of my R package cricketr (see Introducing cricpy:A python package to analyze performances of cricketers), I had decided to avoid doing a python avatar of my R package yorkr (see Introducing cricket package yorkr: Part 1- Beaten by sheer pace!), as it was more involved and required parsing the match data available as yaml files.
Just out of curiosity, I tried the python package ‘yaml’ to read the match data, and lo and behold, I was sucked into developing the package, and so yorkpy was born. Of course, when I am in the thick of developing something, I occasionally wonder why I am doing it, for whom, and for what purpose. Maybe it is the joy of ideation, the problem-solving, the programmer’s high, or the chance to share my ideas. Anyway, whatever the reason, I hope you enjoy this post and also find yorkpy useful.
You can clone/download the code at Github yorkpy
This post has been published to RPubs at yorkpy-Part1
You can download this post as PDF at IPLT20-yorkpy-part1
Note: If you would like to do a similar analysis for a different set of batsmen and bowlers, you can clone/download my skeleton yorkpy-template from Github (which is the R Markdown file I have used for the analysis below).
The IPL T20 functions in yorkpy are
2. Install the package using ‘pip install’
import pandas as pd
import yorkpy.analytics as yka
#pip install yorkpy
3. Load a yaml file from Cricsheet
There are 2 functions that can be used to convert the IPL Twenty20 yaml files to pandas dataframes. They are
- convertYaml2PandasDataframeT20
- convertAllYaml2PandasDataframesT20
Note 1: While I have already converted the IPL T20 files, you will need to use these functions for future IPL matches
4. Convert and save IPL T20 yaml file to pandas dataframe
This function will convert an IPL T20 yaml file, in the format specified by Cricsheet, to a pandas dataframe. The dataframe will be saved as a CSV file in the target directory. The name of the file will have the format team1-team2-date.csv. The IPL T20 zip file can be downloaded from Indian Premier League matches. An example of how a yaml file can be converted to a dataframe and saved is shown below.
import pandas as pd
import yorkpy.analytics as yka
#yka.convertYaml2PandasDataframeT20(".\\1082593.yaml", "..\\ipl", "..\\data")
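To give a feel for what such a conversion involves, here is a minimal sketch. It is not yorkpy's actual implementation. The nested dict stands in for what a yaml parser would return for a Cricsheet-style file; its keys, the player and bowler values, and the helper names `flatten_match` and `csv_name` are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical, heavily trimmed stand-in for a parsed Cricsheet-style match.
# The keys are assumptions and the delivery values are made up.
match = {
    "info": {
        "dates": ["2014-05-30"],
        "teams": ["Chennai Super Kings", "Kings XI Punjab"],
    },
    "innings": [
        {"1st innings": {
            "team": "Kings XI Punjab",
            "deliveries": [
                {0.1: {"batsman": "Batsman A", "non_striker": "Batsman B",
                       "bowler": "Bowler X",
                       "runs": {"batsman": 1, "extras": 0, "total": 1}}},
                {0.2: {"batsman": "Batsman B", "non_striker": "Batsman A",
                       "bowler": "Bowler X",
                       "runs": {"batsman": 4, "extras": 0, "total": 4}}},
            ],
        }},
    ],
}


def flatten_match(match):
    """Return one row per delivery as a pandas DataFrame."""
    rows = []
    for innings in match["innings"]:
        for innings_name, details in innings.items():
            for delivery in details["deliveries"]:
                for ball, d in delivery.items():
                    rows.append({
                        "innings": innings_name,
                        "team": details["team"],
                        "ball": ball,
                        "batsman": d["batsman"],
                        "non_striker": d["non_striker"],
                        "bowler": d["bowler"],
                        "runs": d["runs"]["batsman"],
                        "extras": d["runs"]["extras"],
                        "total": d["runs"]["total"],
                    })
    return pd.DataFrame(rows)


def csv_name(match):
    """Build the team1-team2-date.csv file name from the match info."""
    team1, team2 = match["info"]["teams"]
    date = match["info"]["dates"][0]
    return f"{team1}-{team2}-{date}.csv"


df = flatten_match(match)
print(csv_name(match))
print(df)
```

The resulting dataframe could then be written out with `df.to_csv(...)` under the generated name.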
5. Convert and save all IPL T20 yaml files to dataframes
This function will convert all IPL T20 yaml files from a source directory to dataframes, and save them in the target directory, with the names as mentioned above. Since I have already done this, I will not be executing this again. You can download the zip of all the converted CSV files from Github at yorkpyData
import pandas as pd
import yorkpy.analytics as yka
#yka.convertAllYaml2PandasDataframesT20("..\\ipl", "..\\data")
You can download the zip of the files and use them directly in the functions as follows. For the analysis below I have chosen a set of random IPL matches
The randomly selected IPL T20 matches are
- Chennai Super Kings vs Kings XI Punjab, 2014-05-30
- Deccan Chargers vs Delhi Daredevils, 2012-05-10
- Gujarat Lions vs Mumbai Indians, 2017-04-29
- Kolkata Knight Riders vs Rajasthan Royals, 2010-04-17
- Rising Pune Supergiant vs Royal Challengers Bangalore, 2017-04-29
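Each of these matches corresponds to a CSV file named team1-team2-date.csv. As a small sketch, the helper below (`parse_match_filename`, a hypothetical name that is not part of yorkpy) splits such a file name back into its parts. It assumes the date is always the trailing YYYY-MM-DD and that the first team name contains no hyphen.

```python
from pathlib import PurePath


def parse_match_filename(filename):
    """Split 'team1-team2-YYYY-MM-DD.csv' into (team1, team2, date)."""
    stem = PurePath(filename).stem            # drop the .csv extension
    date = stem[-10:]                         # trailing YYYY-MM-DD
    team1, team2 = stem[:-11].split("-", 1)   # text before the date
    return team1, team2, date


print(parse_match_filename("Gujarat Lions-Mumbai Indians-2017-04-29.csv"))
```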
6. Team batting scorecard
The function below computes the batting score card of a team in an IPL match. The scorecard gives the balls faced, the runs scored, 4s, 6s and strike rate. The example below is based on the CSK KXIP match on 30 May 2014.
You can check against the actual scores in this match Chennai Super Kings-Kings XI Punjab-2014-05-30
import pandas as pd
import yorkpy.analytics as yka
csk_kxip=pd.read_csv(".\\Chennai Super Kings-Kings XI Punjab-2014-05-30.csv")
scorecard,extras=yka.teamBattingScorecardMatch(csk_kxip,"Chennai Super Kings")
print(scorecard)
## batsman runs balls 4s 6s SR
## 0 DR Smith 7 12 0 0 58.333333
## 1 F du Plessis 0 1 0 0 0.000000
## 2 SK Raina 87 26 12 6 334.615385
## 3 BB McCullum 11 16 0 0 68.750000
## 4 RA Jadeja 27 22 2 1 122.727273
## 5 DJ Hussey 1 3 0 0 33.333333
## 6 MS Dhoni 42 34 3 3 123.529412
## 7 R Ashwin 10 11 0 0 90.909091
## 8 MM Sharma 1 3 0 0 33.333333
print(extras)
## total wides noballs legbyes byes penalty extras
## 0 428 14 3 5 5 0 27
print("\n\n")
scorecard1,extras1=yka.teamBattingScorecardMatch(csk_kxip,"Kings XI Punjab")
print(scorecard1)
## batsman runs balls 4s 6s SR
## 0 V Sehwag 122 62 12 8 196.774194
## 1 M Vohra 34 33 1 2 103.030303
## 2 GJ Maxwell 13 8 1 1 162.500000
## 3 DA Miller 38 19 5 1 200.000000
## 4 GJ Bailey 1 2 0 0 50.000000
## 5 WP Saha 6 4 0 1 150.000000
## 6 MG Johnson 1 1 0 0 100.000000
print(extras1)
## total wides noballs legbyes byes penalty extras
## 0 428 14 3 5 5 0 27
Let’s take another random match, between Gujarat Lions and Mumbai Indians on 29 Apr 2017 Gujarat Lions-Mumbai Indians-2017-04-29
import pandas as pd
gl_mi=pd.read_csv(".\\Gujarat Lions-Mumbai Indians-2017-04-29.csv")
import yorkpy.analytics as yka
scorecard,extras=yka.teamBattingScorecardMatch(gl_mi,"Gujarat Lions")
print(scorecard)
## batsman runs balls 4s 6s SR
## 0 Ishan Kishan 48 38 6 2 126.315789
## 1 BB McCullum 6 4 1 0 150.000000
## 2 SK Raina 1 3 0 0 33.333333
## 3 AJ Finch 0 3 0 0 0.000000
## 4 KD Karthik 2 9 0 0 22.222222
## 5 RA Jadeja 28 22 2 1 127.272727
## 6 JP Faulkner 21 29 2 0 72.413793
## 7 IK Pathan 2 3 0 0 66.666667
## 8 AJ Tye 25 12 2 2 208.333333
## 9 Basil Thampi 2 4 0 0 50.000000
## 10 Ankit Soni 7 2 0 1 350.000000
print(extras)
## total wides noballs legbyes byes penalty extras
## 0 306 8 3 1 0 0 12
print("\n\n")
scorecard1,extras1=yka.teamBattingScorecardMatch(gl_mi,"Mumbai Indians")
print(scorecard1)
## batsman runs balls 4s 6s SR
## 0 PA Patel 70 45 9 1 155.555556
## 1 JC Buttler 9 7 2 0 128.571429
## 2 N Rana 19 16 1 1 118.750000
## 3 RG Sharma 5 13 0 0 38.461538
## 4 KA Pollard 15 11 2 0 136.363636
## 5 KH Pandya 29 20 2 1 145.000000
## 6 HH Pandya 4 5 0 0 80.000000
## 7 Harbhajan Singh 0 1 0 0 0.000000
## 8 MJ McClenaghan 1 1 0 0 100.000000
## 9 JJ Bumrah 0 1 0 0 0.000000
## 10 SL Malinga 0 1 0 0 0.000000
print(extras1)
## total wides noballs legbyes byes penalty extras
## 0 306 8 3 1 0 0 12
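For readers curious how such a scorecard can be derived from ball-by-ball data, here is a rough sketch using a pandas groupby. It is not the code inside yorkpy. The column names `batsman`, `runs` and `wides`, the sample rows and the function `batting_scorecard` are assumptions for illustration. It also simplifies by treating every scoring shot of 4 or 6 as a boundary.

```python
import pandas as pd

# Made-up ball-by-ball rows; the column names are assumptions for this sketch.
balls = pd.DataFrame({
    "batsman": ["A", "A", "A", "B", "B", "A"],
    "runs":    [4,   0,   6,   1,   0,   1],
    "wides":   [0,   0,   0,   0,   1,   0],
})


def batting_scorecard(balls):
    """Runs, balls faced, 4s, 6s and strike rate per batsman."""
    # A wide does not count as a ball faced by the batsman.
    faced = balls[balls["wides"] == 0].assign(
        is_four=lambda d: d["runs"] == 4,
        is_six=lambda d: d["runs"] == 6,
    )
    card = faced.groupby("batsman", sort=False).agg(
        runs=("runs", "sum"),
        balls=("runs", "count"),
        fours=("is_four", "sum"),
        sixes=("is_six", "sum"),
    ).reset_index()
    # Strike rate is runs per 100 balls faced.
    card["SR"] = card["runs"] / card["balls"] * 100
    return card


card = batting_scorecard(balls)
print(card)
```

The strike rate formula agrees with the output above: DR Smith's 7 runs off 12 balls gives 7 / 12 * 100 = 58.33.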
7. Plot the team batting partnerships
The functions below plot the team batting partnerships in the match. They show the runs each batsman scored with each partner at the other end.
Note: Many of the plots include an additional parameter plot, which is either True or False. The default value is plot=True. When plot=True the plot will be displayed. When plot=False the data frame will be returned to the user. The user can use this to create an interactive chart using one of the packages like rcharts, ggvis, googleVis or plotly.
import pandas as pd
import yorkpy.analytics as yka
dc_dd=pd.read_csv(".\\Deccan Chargers-Delhi Daredevils-2012-05-10.csv")
yka.teamBatsmenPartnershipMatch(dc_dd,'Deccan Chargers','Delhi Daredevils')
yka.teamBatsmenPartnershipMatch(dc_dd,'Delhi Daredevils','Deccan Chargers',plot=True)
# Print partnerships as a dataframe
rps_rcb=pd.read_csv(".\\Rising Pune Supergiant-Royal Challengers Bangalore-2017-04-29.csv")
m=yka.teamBatsmenPartnershipMatch(rps_rcb,'Royal Challengers Bangalore','Rising Pune Supergiant',plot=False)
print(m)
## batsman non_striker runs
## 0 AB de Villiers V Kohli 3
## 1 AF Milne V Kohli 5
## 2 KM Jadhav V Kohli 7
## 3 P Negi V Kohli 3
## 4 S Aravind V Kohli 0
## 5 S Aravind YS Chahal 8
## 6 S Badree V Kohli 2
## 7 STR Binny V Kohli 1
## 8 Sachin Baby V Kohli 2
## 9 TM Head V Kohli 2
## 10 V Kohli AB de Villiers 17
## 11 V Kohli AF Milne 5
## 12 V Kohli KM Jadhav 4
## 13 V Kohli P Negi 9
## 14 V Kohli S Aravind 2
## 15 V Kohli S Badree 8
## 16 V Kohli Sachin Baby 1
## 17 V Kohli TM Head 9
## 18 YS Chahal S Aravind 4
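As a sketch of what can be done with the dataframe returned when plot=False, the snippet below reshapes a few of the rows printed above into a batsman by partner matrix and also totals the runs per pair. The dataframe is rebuilt by hand here so the example runs on its own. The variable names `matrix` and `totals` are just illustrative.

```python
import pandas as pd

# A few rows copied from the partnership output above.
m = pd.DataFrame({
    "batsman":     ["AB de Villiers", "KM Jadhav", "V Kohli", "V Kohli"],
    "non_striker": ["V Kohli", "V Kohli", "AB de Villiers", "KM Jadhav"],
    "runs":        [3, 7, 17, 4],
})

# One row per batsman, one column per partner, runs in the cells.
matrix = m.pivot_table(index="batsman", columns="non_striker",
                       values="runs", aggfunc="sum", fill_value=0)
print(matrix)

# Runs scored off the bat while a pair was together, whoever was on strike.
pairs = [" & ".join(sorted(p)) for p in zip(m["batsman"], m["non_striker"])]
totals = m.assign(pair=pairs).groupby("pair")["runs"].sum()
print(totals)
```

A matrix in this shape is a convenient input for a stacked bar chart in whichever plotting package you prefer.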
8. Batsmen vs Bowler
The function below computes and plots the performances of the batsmen against the bowlers. As before, the plot parameter can be set to True or False. By default it is plot=True.
import pandas as pd
import yorkpy.analytics as yka
gl_mi=pd.read_csv(".\\Gujarat Lions-Mumbai Indians-2017-04-29.csv")
yka.teamBatsmenVsBowlersMatch(gl_mi,"Gujarat Lions","Mumbai Indians", plot=True)
# Print
csk_kxip=pd.read_csv(".\\Chennai Super Kings-Kings XI Punjab-2014-05-30.csv")
m=yka.teamBatsmenVsBowlersMatch(csk_kxip,'Chennai Super Kings','Kings XI Punjab',plot=False)
print(m)
## batsman bowler runs
## 0 BB McCullum AR Patel 4
## 1 BB McCullum GJ Maxwell 1
## 2 BB McCullum Karanveer Singh 6
## 3 DJ Hussey P Awana 1
## 4 DR Smith MG Johnson 7
## 5 DR Smith P Awana 0
## 6 DR Smith Sandeep Sharma 0
## 7 F du Plessis MG Johnson 0
## 8 MM Sharma AR Patel 0
## 9 MM Sharma MG Johnson 0
## 10 MM Sharma P Awana 1
## 11 MS Dhoni AR Patel 12
## 12 MS Dhoni Karanveer Singh 2
## 13 MS Dhoni MG Johnson 11
## 14 MS Dhoni P Awana 15
## 15 MS Dhoni Sandeep Sharma 2
## 16 R Ashwin AR Patel 1
## 17 R Ashwin Karanveer Singh 4
## 18 R Ashwin MG Johnson 1
## 19 R Ashwin P Awana 1
## 20 R Ashwin Sandeep Sharma 3
## 21 RA Jadeja AR Patel 5
## 22 RA Jadeja GJ Maxwell 3
## 23 RA Jadeja Karanveer Singh 19
## 24 RA Jadeja P Awana 0
## 25 SK Raina MG Johnson 21
## 26 SK Raina P Awana 40
## 27 SK Raina Sandeep Sharma 26
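The returned dataframe lends itself to quick follow-up questions. The sketch below, using only the MS Dhoni and SK Raina rows copied from the output above, finds the bowler each batsman scored most heavily against and the runs these two took off each bowler. The variable names are illustrative only.

```python
import pandas as pd

# Rows copied from the batsmen-vs-bowlers output above (MS Dhoni and SK Raina only).
m = pd.DataFrame({
    "batsman": ["MS Dhoni"] * 5 + ["SK Raina"] * 3,
    "bowler":  ["AR Patel", "Karanveer Singh", "MG Johnson", "P Awana",
                "Sandeep Sharma", "MG Johnson", "P Awana", "Sandeep Sharma"],
    "runs":    [12, 2, 11, 15, 2, 21, 40, 26],
})

# The bowler each batsman scored the most runs against.
favourite = m.loc[m.groupby("batsman")["runs"].idxmax()]
print(favourite)

# Runs these two batsmen scored off each bowler, highest first.
by_bowler = m.groupby("bowler")["runs"].sum().sort_values(ascending=False)
print(by_bowler)
```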
9. Bowling Scorecard
This function provides the bowling performance: the number of overs bowled, maidens, runs conceded, wickets taken and economy rate for the IPL match
import pandas as pd
import yorkpy.analytics as yka
dc_dd=pd.read_csv(".\\Deccan Chargers-Delhi Daredevils-2012-05-10.csv")
a=yka.teamBowlingScorecardMatch(dc_dd,'Deccan Chargers')
print(a)
## bowler overs runs maidens wicket econrate
## 0 AD Russell 4 39 0 0 9.75
## 1 IK Pathan 4 46 0 1 11.50
## 2 M Morkel 4 32 0 1 8.00
## 3 S Nadeem 4 39 0 0 9.75
## 4 VR Aaron 4 30 0 2 7.50
rps_rcb=pd.read_csv(".\\Rising Pune Supergiant-Royal Challengers Bangalore-2017-04-29.csv")
b=yka.teamBowlingScorecardMatch(rps_rcb,'Royal Challengers Bangalore')
print(b)
## bowler overs runs maidens wicket econrate
## 0 DL Chahar 2 18 0 0 9.00
## 1 DT Christian 4 25 0 1 6.25
## 2 Imran Tahir 4 18 0 3 4.50
## 3 JD Unadkat 4 19 0 1 4.75
## 4 LH Ferguson 4 7 1 3 1.75
## 5 Washington Sundar 2 7 0 1 3.50
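The economy rate in these tables is simply runs conceded divided by overs bowled, for example AD Russell's 39 runs in 4 overs gives 9.75. The sketch below shows one way such a scorecard could be built from ball-by-ball rows. It is not yorkpy's implementation. The column names `bowler`, `over`, `conceded` and `wicket` and the sample rows are assumptions, and the overs are trimmed to 3 balls each for brevity.

```python
import pandas as pd

# Made-up deliveries; column names are assumptions for this sketch.
# 'conceded' is the runs charged to the bowler, 'wicket' is 1 if the bowler took a wicket.
balls = pd.DataFrame({
    "bowler":   ["X", "X", "X", "X", "X", "X", "Y", "Y", "Y"],
    "over":     [1,   1,   1,   3,   3,   3,   2,   2,   2],
    "conceded": [0,   0,   0,   4,   1,   0,   6,   0,   2],
    "wicket":   [0,   1,   0,   0,   0,   0,   0,   0,   1],
})

# First summarise each over, so that maidens (overs with no runs) can be spotted.
per_over = (balls.groupby(["bowler", "over"])
                 .agg(runs=("conceded", "sum"), wickets=("wicket", "sum"))
                 .reset_index())
per_over["maiden"] = (per_over["runs"] == 0).astype(int)

# Then roll the overs up to one row per bowler.
card = (per_over.groupby("bowler")
                .agg(overs=("over", "count"), runs=("runs", "sum"),
                     maidens=("maiden", "sum"), wicket=("wickets", "sum"))
                .reset_index())
card["econrate"] = card["runs"] / card["overs"]
print(card)
```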
10. Wicket Kind
The plots below provide the kind of wicket taken by the bowler (caught, bowled, lbw etc.) for the IPL match
import pandas as pd
import yorkpy.analytics as yka
kkr_rr=pd.read_csv(".\\Kolkata Knight Riders-Rajasthan Royals-2010-04-17.csv")
yka.teamBowlingWicketKindMatch(kkr_rr,'Kolkata Knight Riders','Rajasthan Royals')
csk_kxip=pd.read_csv(".\\Chennai Super Kings-Kings XI Punjab-2014-05-30.csv")
m = yka.teamBowlingWicketKindMatch(csk_kxip,'Chennai Super Kings','Kings XI Punjab',plot=False)
print(m)
## bowler kind player_out
## 0 AR Patel run out 1
## 1 AR Patel stumped 1
## 2 Karanveer Singh run out 1
## 3 MG Johnson caught 1
## 4 P Awana caught 2
## 5 Sandeep Sharma bowled 1
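In the output above, the player_out column holds a count of dismissals rather than a name. A minimal sketch of how such a count can be produced from a list of dismissals is shown below. The rows are made up and the column names simply mirror the output, so treat this as an illustration and not as the package's code.

```python
import pandas as pd

# Made-up dismissals; column names mirror the output above.
wickets = pd.DataFrame({
    "bowler":     ["X", "X", "Y", "Y", "Y"],
    "kind":       ["caught", "bowled", "caught", "caught", "lbw"],
    "player_out": ["A", "B", "C", "D", "E"],
})

# Count the dismissals for every (bowler, kind) combination.
summary = (wickets.groupby(["bowler", "kind"])["player_out"]
                  .count()
                  .reset_index())
print(summary)
```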
11. Wicket vs Runs conceded
The plots below provide the wickets taken and the runs conceded by the bowler in the IPL T20 match
import pandas as pd
import yorkpy.analytics as yka
dc_dd=pd.read_csv(".\\Deccan Chargers-Delhi Daredevils-2012-05-10.csv")
yka.teamBowlingWicketMatch(dc_dd,"Deccan Chargers", "Delhi Daredevils",plot=True)
print("\n\n")
rps_rcb=pd.read_csv(".\\Rising Pune Supergiant-Royal Challengers Bangalore-2017-04-29.csv")
a=yka.teamBowlingWicketMatch(rps_rcb,"Royal Challengers Bangalore", "Rising Pune Supergiant",plot=False)
print(a)
## bowler player_out kind
## 0 DT Christian V Kohli 1
## 1 Imran Tahir AF Milne 1
## 2 Imran Tahir P Negi 1
## 3 Imran Tahir S Badree 1
## 4 JD Unadkat TM Head 1
## 5 LH Ferguson AB de Villiers 1
## 6 LH Ferguson KM Jadhav 1
## 7 LH Ferguson STR Binny 1
## 8 Washington Sundar Sachin Baby 1
12. Bowler Vs Batsmen
The functions compute and display how the different bowlers of the IPL team performed against the batting opposition.
import pandas as pd
import yorkpy.analytics as yka
csk_kxip=pd.read_csv(".\\Chennai Super Kings-Kings XI Punjab-2014-05-30.csv")
yka.teamBowlersVsBatsmenMatch(csk_kxip,"Chennai Super Kings","Kings XI Punjab")
print("\n\n")
kkr_rr=pd.read_csv(".\\Kolkata Knight Riders-Rajasthan Royals-2010-04-17.csv")
m =yka.teamBowlersVsBatsmenMatch(kkr_rr,"Rajasthan Royals","Kolkata Knight Riders",plot=False)
print(m)
## batsman bowler runs
## 0 AC Voges AB Dinda 1
## 1 AC Voges JD Unadkat 1
## 2 AC Voges LR Shukla 1
## 3 AC Voges M Kartik 5
## 4 AJ Finch AB Dinda 3
## 5 AJ Finch JD Unadkat 3
## 6 AJ Finch LR Shukla 13
## 7 AJ Finch M Kartik 2
## 8 AJ Finch SE Bond 0
## 9 AS Raut AB Dinda 1
## 10 AS Raut JD Unadkat 1
## 11 FY Fazal AB Dinda 1
## 12 FY Fazal LR Shukla 3
## 13 FY Fazal M Kartik 3
## 14 FY Fazal SE Bond 6
## 15 NV Ojha AB Dinda 10
## 16 NV Ojha JD Unadkat 5
## 17 NV Ojha LR Shukla 0
## 18 NV Ojha M Kartik 1
## 19 NV Ojha SE Bond 2
## 20 P Dogra JD Unadkat 2
## 21 P Dogra LR Shukla 5
## 22 P Dogra M Kartik 1
## 23 P Dogra SE Bond 0
## 24 SK Trivedi AB Dinda 4
## 25 SK Warne AB Dinda 2
## 26 SK Warne M Kartik 1
## 27 SK Warne SE Bond 0
## 28 SR Watson AB Dinda 2
## 29 SR Watson JD Unadkat 13
## 30 SR Watson LR Shukla 1
## 31 SR Watson M Kartik 18
## 32 SR Watson SE Bond 10
## 33 YK Pathan JD Unadkat 1
## 34 YK Pathan LR Shukla 7
13. Match worm chart
The plots below provide the match worm graph for the IPL Twenty20 matches
import pandas as pd
import yorkpy.analytics as yka
dc_dd=pd.read_csv(".\\Deccan Chargers-Delhi Daredevils-2012-05-10.csv")
yka.matchWormChart(dc_dd,"Deccan Chargers", "Delhi Daredevils")
gl_mi=pd.read_csv(".\\Gujarat Lions-Mumbai Indians-2017-04-29.csv")
yka.matchWormChart(gl_mi,"Mumbai Indians","Gujarat Lions")
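A worm chart is essentially the running total of each team's runs plotted against overs. The sketch below computes that running total with a cumulative sum. The column names `team`, `over` and `total` and the sample rows are assumptions for illustration, and this is not the code used by matchWormChart.

```python
import pandas as pd

# Made-up deliveries for two teams; column names are assumptions for this sketch.
balls = pd.DataFrame({
    "team":  ["T1"] * 6 + ["T2"] * 6,
    "over":  [1, 1, 2, 2, 3, 3, 1, 1, 2, 2, 3, 3],
    "total": [1, 4, 0, 2, 6, 1, 0, 1, 4, 4, 0, 2],
})

# Runs per over for each team, with one column per team.
per_over = balls.groupby(["team", "over"])["total"].sum().unstack("team")

# The 'worm' is the cumulative score after each over.
worm = per_over.cumsum()
print(worm)
```

Plotting `worm` as a line chart, with overs on the x-axis, gives one line per team.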
Feel free to clone/download the code from Github yorkpy
Conclusion
This post covered all the yorkpy functions that analyze an IPL Twenty20 match between 2 teams. As mentioned above, the yaml match files have already been converted to dataframes and are available for download from Github at yorkpyData
After having used Python and R for analytics, Machine Learning and Deep Learning, I have now realized that neither language is superior or inferior. Both have some good packages and some that are not so well suited.
To be continued. Watch this space!
Important note: Do check out my other posts using yorkpy at yorkpy-posts
You may also like
1. My book ‘Deep Learning from first principles:Second Edition’ now on Amazon
2. My book ‘Practical Machine Learning in R and Python: Second edition’ on Amazon
3. Cricpy takes a swing at the ODIs
4. Introducing cricket package yorkr: Part 1- Beaten by sheer pace!
5. Big Data-1: Move into the big league:Graduate from Python to Pyspark
6. Simulating an Edge Shape in Android
To see all posts click Index of posts