“Dear, dear! How queer everything is to-day! And yesterday things went on just as usual. I wonder if I’ve been changed in the night? Let me think: was I the same when I got up this morning? I almost think I can remember feeling a little different. But if I’m not the same, the next question is ’Who in the world am I? Ah, that’s the great puzzle!”
Alice's Adventures in Wonderland, Lewis Carroll
1. Introduction
In this post, yorkpy clean bowls the following T20 formats: International T20s, the Big Bash League and the Natwest T20 Blast. I take yorkpy on a spin through these T20 leagues. In the post below, I choose a random set of about 10-12 of the 63 functions that yorkpy has and execute them for each of the T20 leagues: Intl T20s, BBL and Natwest T20s. yorkpy is the Python avatar of my R package yorkr; see Introducing cricket package yorkr: Part 1- Beaten by sheer pace!
A couple of new functions had to be added for each of the T20 leagues (Intl T20, BBL and Natwest T20) to take into account the different teams in each league. Some bugs were also ironed out in the latest version of yorkpy. yorkpy uses data from Cricsheet. The match data is in the form of YAML files, which are very detailed and include a ball-by-ball account of each match. yorkpy converts these YAML files to pandas dataframes.
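To make the YAML-to-dataframe step concrete, here is a minimal sketch of flattening one innings of ball-by-ball data into flat rows. The nested dictionary is a hand-written stand-in for what a YAML parser would return for a Cricsheet-style file. The field layout, the player names and the helper `flatten_innings` are illustrative assumptions, not yorkpy's actual implementation.

```python
# Hand-written stand-in for one parsed innings (illustrative data, not a real match)
innings = {
    "team": "Team A",
    "deliveries": [
        {0.1: {"batsman": "Batter One", "non_striker": "Batter Two",
               "bowler": "Bowler X",
               "runs": {"batsman": 4, "extras": 0, "total": 4}}},
        {0.2: {"batsman": "Batter One", "non_striker": "Batter Two",
               "bowler": "Bowler X",
               "runs": {"batsman": 0, "extras": 1, "total": 1}}},
        {0.3: {"batsman": "Batter Two", "non_striker": "Batter One",
               "bowler": "Bowler X",
               "runs": {"batsman": 2, "extras": 0, "total": 2}}},
    ],
}


def flatten_innings(innings):
    """Turn the nested ball-by-ball records into one flat dict per delivery."""
    rows = []
    for delivery in innings["deliveries"]:
        # Each list entry is a single-key dict: {ball_number: details}
        for ball, detail in delivery.items():
            rows.append({
                "team": innings["team"],
                "delivery": ball,
                "batsman": detail["batsman"],
                "non_striker": detail["non_striker"],
                "bowler": detail["bowler"],
                "runs": detail["runs"]["batsman"],
                "extras": detail["runs"]["extras"],
                "totalRuns": detail["runs"]["total"],
            })
    return rows


rows = flatten_innings(innings)
for row in rows:
    print(row)
```

A list of flat dictionaries like `rows` can be passed straight to `pandas.DataFrame(rows)` to get one row per delivery.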
– You can clone/fork the latest code for yorkpy from github yorkpy
– This post has also been published in RPubs at yorkpy takes a hat-trick
– You can download the PDF version of this post at yorkpy takes a hat-trick
The data for IPL, Intl. T20, BBL and Natwest T20 have already been converted into pandas dataframes and saved as CSVs. You can download the converted files from Github at [allYorkpyT20Data](https://github.com/tvganesh/allYorkpyT20Data)
yorkpy has the following 4 main classes of functions
A. Functions that analyze an individual T20 match (Class 1)
These functions deal with individual T20 matches and were demonstrated in Pitching yorkpy … short of good length to IPL – Part 1. The functions are
- convertYaml2PandasDataframeT20()
- convertAllYaml2PandasDataframesT20()
- teamBattingScorecardMatch()
- teamBatsmenPartnershipMatch()
- teamBatsmenVsBowlersMatch()
- teamBowlingScorecardMatch()
- teamBowlingWicketKindMatch()
- teamBowlingWicketRunsMatch()
- teamBowlingWicketMatch()
- teamBowlersVsBatsmenMatch()
- matchWormChart()
B. Functions that analyze all matches between 2 T20 teams (Class 2)
Pitching yorkpy…on the middle and outside off-stump to IPL – Part 2 included functions that analyze the head-to-head confrontation between any 2 T20 teams. The functions are
- getAllMatchesBetweenTeams()
- saveAllMatchesBetween2IPLTeams()
- teamBatsmenPartnershiOppnAllMatches()
- teamBatsmenPartnershipOppnAllMatchesChart()
- teamBatsmenVsBowlersOppnAllMatches()
- teamBattingScorecardOppnAllMatches()
- teamBowlingScorecardOppnAllMatches()
- teamBowlingWicketKindOppositionAllMatches()
- teamBowlersVsBatsmenOppnAllMatches()
- plotWinLossBetweenTeams()
- plotWinsByRunOrWickets()
- plotWinsbyTossDecision()
C. Functions that analyze the performance of a T20 team against all other teams (Class 3)
The post Pitching yorkpy…swinging away from the leg stump to IPL – Part 3 is based on the Class 3 set of functions shown below
- getAllMatchesAllOpposition()
- saveAllMatchesAllOppositionIPLT20(dir1)
- teamBatsmenPartnershiAllOppnAllMatches()
- teamBatsmenPartnershipAllOppnAllMatchesChart()
- teamBatsmenVsBowlersAllOppnAllMatches()
- teamBattingScorecardAllOppnAllMatches()
- teamBowlingScorecardAllOppnAllMatches()
- teamBowlingWicketKindAllOppnAllMatches()
- teamBowlersVsBatsmenAllOppnAllMatches()
- plotWinLossByTeamAllOpposition()
- plotWinsByRunOrWicketsAllOpposition()
- plotWinsbyTossDecisionAllOpposition()
D. Functions that analyze performances of T20 batsmen and bowlers (Class 4)
This set of functions analyzes individual batsmen and bowlers and was used in Pitching yorkpy … in the block hole – Part 4. The functions are
- getTeamBattingDetails()
- getBatsmanDetails()
- batsmanRunsVsDeliveries()
- batsmanFoursSixes()
- batsmanDismissals()
- batsmanRunsVsStrikeRate()
- batsmanMovingAverage()
- batsmanCumulativeAverageRuns()
- batsmanCumulativeStrikeRate()
- batsmanRunsAgainstOpposition()
- batsmanRunsVenue()
- getTeamBowlingDetails()
- getBowlerWicketDetails()
- bowlerMeanEconomyRate()
- bowlerMeanRunsConceded()
- bowlerMovingAverage()
- bowlerCumulativeAvgWickets()
- bowlerCumulativeAvgEconRate()
- bowlerWicketPlot()
- bowlerWicketsAgainstOpposition()
- bowlerWicketsVenue()
Additional new functions were added to handle Intl T20s, Big Bash League and Natwest T20 Blast, since the teams are different. They are
59. saveAllMatchesBetween2IntlT20s()
60. saveAllMatchesAllOppositionIntlT20()
61. saveAllMatchesBetween2BBLTeams()
62. saveAllMatchesAllOppositionBBLT20()
63. saveAllMatchesBetween2NWBTeams()
64. saveAllMatchesAllOppositionNWBT20()
All other functions can be used as is! You can get the help of any function in yorkpy using
import yorkpy.analytics as yka
help(yka.teamBatsmenPartnershiOppnAllMatches)
## Help on function teamBatsmenPartnershiOppnAllMatches in module yorkpy.analytics:
##
## teamBatsmenPartnershiOppnAllMatches(matches, theTeam, report='summary', top=5)
## Team batting partnership against a opposition all IPL matches
##
## Description
##
## This function computes the performance of batsmen against all bowlers of an oppositions in
## all matches. This function returns a dataframe
##
## Usage
##
## teamBatsmenPartnershiOppnAllMatches(matches,theTeam,report="summary")
## Arguments
##
## matches
## All the matches of the team against the oppositions
## theTeam
## The team for which the the batting partnerships are sought
## report
## If the report="summary" then the list of top batsmen with the highest partnerships
## is displayed. If report="detailed" then the detailed break up of partnership is returned
## as a dataframe
## top
## The number of players to be displayed from the top
## Value
##
## partnerships The data frame of the partnerships
##
## Note
##
## Maintainer: Tinniam V Ganesh tvganesh.85@gmail.com
##
## Author(s)
##
## Tinniam V Ganesh
##
## References
##
## http://cricsheet.org/
## https://gigadom.wordpress.com/
##
##
## See Also
##
## teamBatsmenVsBowlersOppnAllMatchesPlot
## teamBatsmenPartnershipOppnAllMatchesChart
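Before calling help() on a function you need to know its name. The following is a small sketch of listing the public functions of a module with the standard inspect module. The helper `public_functions` is hypothetical, and the standard-library `json` module is used only as a stand-in so that the snippet runs without yorkpy installed; with yorkpy installed you would pass `yorkpy.analytics` instead.

```python
import inspect
import json  # stand-in module; with yorkpy installed, pass yorkpy.analytics instead


def public_functions(module):
    """Return the sorted names of the public functions found in a module."""
    return sorted(
        name
        for name, obj in inspect.getmembers(module, inspect.isfunction)
        if not name.startswith("_")
    )


names = public_functions(json)
print(names)

# The call signature of any one of them can be inspected without running it
print(inspect.signature(json.dumps))
```

Note that inspect.getmembers also reports functions that a module merely imports from elsewhere, so the list can be slightly longer than the set of functions the package itself defines.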
As mentioned above, I will randomly choose a set of about 12 functions from Classes 1, 2, 3 and 4 for each of the T20 leagues (Intl T20, BBL and NWB T20) for analysis.
2. International T20s
The following functions were added for handling Intl. T20s
- saveAllMatchesBetween2IntlT20s()
- saveAllMatchesAllOppositionIntlT20()
They handle the following countries in Intl. T20s:
Afghanistan, Australia, Bangladesh, Bermuda, Canada, England, Hong Kong, India, Ireland, Kenya, Nepal, Netherlands, New Zealand, Oman, Pakistan, Scotland, South Africa, Sri Lanka, United Arab Emirates, West Indies, Zimbabwe
import os
#os.chdir('C:\\software\\cricket-package\\yorkpyT20\\t20s')
#import yorkpy.analytics as yka
#1. Convert all YAML files to dataframes and CSV
#yka.convertAllYaml2PandasDataframesT20(".", "..\\data1")
#dir1='C:\\software\\cricket-package\\yorkpyT20\\IntlT20-Matches'
#2. Save all matches between 2 T20 teams
#yka.saveAllMatchesBetween2IntlT20s(dir1)
#3. Save all matches between a T20 team and all other teams
#dir1='C:\\software\\cricket-package\\yorkpyT20\\IntlT20-Matches'
#yka.saveAllMatchesAllOppositionIntlT20(dir1)
#4. Get batting details
#dir1='C:\\software\\cricket-package\\yorkpyT20\\IntlT20-Matches'
#yka.getTeamBattingDetails("Afghanistan",dir=dir1, save=True)
#yka.getTeamBattingDetails("Australia",dir=dir1,save=True)
#yka.getTeamBattingDetails("Bangladesh",dir=dir1,save=True)
#...
#5. Get bowling details
#dir1='C:\\software\\cricket-package\\yorkpyT20\\IntlT20-Matches'
#yka.getTeamBowlingDetails("Afghanistan",dir=dir1, save=True)
#yka.getTeamBowlingDetails("Australia",dir=dir1,save=True)
#yka.getTeamBowlingDetails("Bangladesh",dir=dir1,save=True)
# ...
To use the yorkpy functions for a new league, the YAML files first have to be converted into the format that the yorkpy functions expect, as in the commented-out steps above. This creates the files that are used in the functions below.
Once the data is converted you can use the yorkpy functions. The data for Intl T20 has already been converted and is available on Github at IntlT20
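The converted match files in this post are named Team1-Team2-YYYY-MM-DD.csv. As a small sketch, the path to one such file can be built with pathlib instead of hand-written backslashes. The helper `match_csv_path` is hypothetical and not part of yorkpy; PureWindowsPath is used so that the snippet prints a Windows-style path on any operating system.

```python
from pathlib import PureWindowsPath


def match_csv_path(data_dir, team1, team2, date):
    """Build the path of a converted match file named Team1-Team2-YYYY-MM-DD.csv."""
    return PureWindowsPath(data_dir) / f"{team1}-{team2}-{date}.csv"


path = match_csv_path(
    r"C:\software\cricket-package\yorkpyT20\IntlT20-Matches",
    "India", "New Zealand", "2007-09-16",
)
print(path)
```

The resulting path can be converted with str(path) and passed to pd.read_csv in the same way as the os.path.join results used in this post.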
2.1 Intl. T20 – Team score card (Class 1)
import os
import pandas as pd
import yorkpy.analytics as yka
dir1="C:\\software\\cricket-package\\yorkpyT20\\IntlT20-Matches"
path=os.path.join(dir1,".\\India-New Zealand-2007-09-16.csv")
ind_nz=pd.read_csv(path)
scorecard,extras=yka.teamBattingScorecardMatch(ind_nz,"India")
print(scorecard)
## batsman runs balls 4s 6s SR
## 0 G Gambhir 51 34 5 2 150.000000
## 1 V Sehwag 40 18 6 2 222.222222
## 2 RV Uthappa 0 2 0 0 0.000000
## 3 MS Dhoni 24 20 2 0 120.000000
## 4 Yuvraj Singh 5 7 0 0 71.428571
## 5 KD Karthik 17 12 3 0 141.666667
## 6 IK Pathan 11 10 2 0 110.000000
## 7 AB Agarkar 1 2 0 0 50.000000
## 8 Harbhajan Singh 7 6 1 0 116.666667
## 9 S Sreesanth 19 10 4 0 190.000000
## 10 RP Singh 1 1 0 0 100.000000
print(extras)
## total wides noballs legbyes byes penalty extras
## 0 370 6 0 8 0 0 14
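The SR column in the scorecard is the strike rate, that is, runs scored per 100 balls faced. The sketch below recomputes it for three rows of the scorecard printed above. The helper `strike_rate` is my own illustration and not a yorkpy function.

```python
def strike_rate(runs, balls):
    """Runs scored per 100 balls faced; 0.0 when no ball was faced."""
    return 100.0 * runs / balls if balls else 0.0


# (batsman, runs, balls) copied from the scorecard printed above
scores = [("G Gambhir", 51, 34), ("V Sehwag", 40, 18), ("RV Uthappa", 0, 2)]

for batsman, runs, balls in scores:
    print(f"{batsman:12s} {runs:3d} {balls:3d} {strike_rate(runs, balls):.6f}")
```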
2.2 Intl. T20 – Team batsmen partnership (Class 1)
import os
import pandas as pd
import yorkpy.analytics as yka
dir1="C:\\software\\cricket-package\\yorkpyT20\\IntlT20-Matches"
path=os.path.join(dir1,".\\South Africa-Australia-2009-03-27.csv")
sa_aus=pd.read_csv(path)
yka.teamBatsmenPartnershipMatch(sa_aus,'Australia','South Africa',plot=True)
2.3 Intl. T20 – Team bowling scorecard match (Class 1)
import os
import pandas as pd
import yorkpy.analytics as yka
dir1="C:\\software\\cricket-package\\yorkpyT20\\IntlT20-Matches"
path=os.path.join(dir1,".\\Sri Lanka-West Indies-2012-09-28.csv")
sl_wi=pd.read_csv(path)
a=yka.teamBowlingScorecardMatch(sl_wi,'Sri Lanka')
print(a)
## bowler overs runs maidens wicket econrate
## 0 A Mohammed 2 13 0 0 6.5
## 1 SA Campbelle 1 8 0 1 8.0
## 2 SC Selman 1 3 0 0 3.0
## 3 SF Daley 2 5 0 1 2.5
## 4 SR Taylor 2 4 0 1 2.0
## 5 TD Smartt 2 17 0 0 8.5
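The econrate column is runs conceded divided by overs bowled; for example, 13 runs in 2 overs gives 6.5. Here is a minimal sketch of deriving such a summary from ball-by-ball rows. The deliveries are made up, the helper `bowling_summary` is hypothetical, and treating every distinct over a bowler appears in as one full over is my simplifying assumption, not necessarily how yorkpy counts overs.

```python
from collections import defaultdict

# (over_number, bowler, runs_conceded_off_the_ball): made-up deliveries
deliveries = [
    (1, "Bowler X", 1), (1, "Bowler X", 0), (1, "Bowler X", 4),
    (2, "Bowler Y", 6), (2, "Bowler Y", 2),
    (3, "Bowler X", 0), (3, "Bowler X", 2),
]


def bowling_summary(deliveries):
    """Aggregate overs, runs and economy rate per bowler."""
    runs = defaultdict(int)
    overs = defaultdict(set)
    for over, bowler, conceded in deliveries:
        runs[bowler] += conceded
        overs[bowler].add(over)  # assumption: each distinct over counts as one over
    return {
        bowler: {
            "overs": len(overs[bowler]),
            "runs": runs[bowler],
            "econrate": runs[bowler] / len(overs[bowler]),
        }
        for bowler in runs
    }


summary = bowling_summary(deliveries)
for bowler, figures in summary.items():
    print(bowler, figures)
```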
2.4 Intl. T20 – Match Worm chart (Class 1)
import os
import pandas as pd
import yorkpy.analytics as yka
dir1="C:\\software\\cricket-package\\yorkpyT20\\IntlT20-Matches"
path=os.path.join(dir1,".\\England-India-2012-09-29.csv")
eng_ind=pd.read_csv(path)
yka.matchWormChart(eng_ind,"England", "India")
path=os.path.join(dir1,".\\Bangladesh-Ireland-2015-12-05.csv")
ban_ire=pd.read_csv(path)
yka.matchWormChart(ban_ire,"Bangladesh", "Ireland")
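A worm chart plots each team's cumulative runs against the overs bowled. The numbers behind such a chart can be sketched with itertools.accumulate. The runs per over below are made up and the team names are placeholders; this is not how yorkpy draws its plot, only the running-total idea behind it.

```python
from itertools import accumulate

# Made-up runs scored in each of the first five overs
runs_per_over = {
    "Team A": [6, 9, 4, 12, 7],
    "Team B": [8, 3, 10, 5, 11],
}

# The "worm" of a team is the running total of its runs, over by over
worm = {team: list(accumulate(runs)) for team, runs in runs_per_over.items()}

for team, totals in worm.items():
    print(team, totals)
```

Plotting each list against the over number gives the two worms of the chart.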
2.5 Intl. T20 – Team Batting partnerships all matches 2 teams (Class 2)
import os
import pandas as pd
import yorkpy.analytics as yka
dir1="C:\\software\\cricket-package\\yorkpyT20\\IntlT20-allMatchesBetween2Teams"
path=os.path.join(dir1,"India-England-allMatches.csv")
ind_eng_matches = pd.read_csv(path)
theTeam='India'
m=yka.teamBatsmenPartnershiOppnAllMatches(ind_eng_matches,theTeam,report="detailed", top=4)
print(m)
## batsman totalPartnershipRuns non_striker partnershipRuns
## 0 SK Raina 265 G Gambhir 2
## 1 SK Raina 265 KL Rahul 40
## 2 SK Raina 265 MK Tiwary 24
## 3 SK Raina 265 MS Dhoni 124
## 4 SK Raina 265 P Kumar 0
## 5 SK Raina 265 PP Chawla 4
## 6 SK Raina 265 R Ashwin 1
## 7 SK Raina 265 RG Sharma 16
## 8 SK Raina 265 V Kohli 47
## 9 SK Raina 265 Yuvraj Singh 7
## 10 MS Dhoni 264 A Mishra 1
## 11 MS Dhoni 264 AT Rayudu 18
## 12 MS Dhoni 264 HH Pandya 8
## 13 MS Dhoni 264 IK Pathan 2
## 14 MS Dhoni 264 JJ Bumrah 2
## 15 MS Dhoni 264 MK Pandey 3
## 16 MS Dhoni 264 Parvez Rasool 21
## 17 MS Dhoni 264 R Ashwin 11
## 18 MS Dhoni 264 RA Jadeja 11
## 19 MS Dhoni 264 RG Sharma 9
## 20 MS Dhoni 264 RR Pant 6
## 21 MS Dhoni 264 RV Uthappa 5
## 22 MS Dhoni 264 SK Raina 98
## 23 MS Dhoni 264 YK Pathan 36
## 24 MS Dhoni 264 Yuvraj Singh 33
## 25 V Kohli 236 AM Rahane 3
## 26 V Kohli 236 G Gambhir 78
## 27 V Kohli 236 KL Rahul 46
## 28 V Kohli 236 RG Sharma 2
## 29 V Kohli 236 RV Uthappa 4
## 30 V Kohli 236 S Dhawan 45
## 31 V Kohli 236 SK Raina 48
## 32 V Kohli 236 Yuvraj Singh 10
## 33 M Raj 176 A Sharma 2
## 34 M Raj 176 H Kaur 18
## 35 M Raj 176 J Goswami 6
## 36 M Raj 176 KV Jain 5
## 37 M Raj 176 L Kumari 5
## 38 M Raj 176 N Niranjana 3
## 39 M Raj 176 N Tanwar 17
## 40 M Raj 176 PG Raut 41
## 41 M Raj 176 R Malhotra 5
## 42 M Raj 176 S Mandhana 8
## 43 M Raj 176 S Naik 10
## 44 M Raj 176 S Pandey 19
## 45 M Raj 176 SK Naidu 37
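In the table above, totalPartnershipRuns for a batsman equals the sum of that batsman's partnershipRuns rows; the ten SK Raina rows, for instance, add up to 265. The sketch below shows that kind of two-level aggregation on made-up ball-by-ball rows. The helper `partnership_runs` and the player names are illustrative and not taken from yorkpy.

```python
from collections import defaultdict

# (batsman, non_striker, runs_off_the_bat): made-up deliveries
balls = [
    ("Batter One", "Batter Two", 4),
    ("Batter One", "Batter Two", 1),
    ("Batter Two", "Batter One", 2),
    ("Batter One", "Batter Three", 6),
    ("Batter Three", "Batter One", 1),
]


def partnership_runs(balls):
    """Return runs per (batsman, non_striker) pair and the total per batsman."""
    pair_runs = defaultdict(int)
    for batsman, non_striker, runs in balls:
        pair_runs[(batsman, non_striker)] += runs

    totals = defaultdict(int)
    for (batsman, _), runs in pair_runs.items():
        totals[batsman] += runs
    return dict(pair_runs), dict(totals)


pair_runs, totals = partnership_runs(balls)
for (batsman, non_striker), runs in pair_runs.items():
    print(batsman, totals[batsman], non_striker, runs)
```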
2.6 Intl. T20 – Team Batsmen vs Bowlers all matches 2 teams (Class 2)
import os
import pandas as pd
import yorkpy.analytics as yka
dir1="C:\\software\\cricket-package\\yorkpyT20\\IntlT20-allMatchesBetween2Teams"
path=os.path.join(dir1,"Ireland-Netherlands-allMatches.csv")
ire_nl_matches = pd.read_csv(path)
yka.teamBatsmenVsBowlersOppnAllMatches(ire_nl_matches,'Ireland',"Netherlands",plot=True,top=3,runsScored=10)
2.7 Intl. T20 – Team Bowling scorecard all matches 2 teams (Class 2)
import os
import pandas as pd
import yorkpy.analytics as yka
dir1="C:\\software\\cricket-package\\yorkpyT20\\IntlT20-allMatchesBetween2Teams"
path=os.path.join(dir1,"Bangladesh-Nepal-allMatches.csv")
bang_nep_matches = pd.read_csv(path)
scorecard=yka.teamBowlingScorecardOppnAllMatches(bang_nep_matches,'Bangladesh',"Nepal")
print(scorecard)
## bowler overs runs maidens wicket econrate
## 0 B Regmi 3 14 0 1 4.666667
## 3 SP Gauchan 4 40 0 1 10.000000
## 1 JK Mukhiya 2 16 0 0 8.000000
## 2 P Khadka 3 23 0 0 7.666667
## 4 Sagar Pun 1 16 0 0 16.000000
## 5 Sompal Kami 2 21 0 0 10.500000
2.8 Intl. T20 – Team Batsmen vs Bowlers all Oppositions (Class 3)
import os
import pandas as pd
import yorkpy.analytics as yka
dir1="C:\\software\\cricket-package\\yorkpyT20\\IntlT20-allMatchesAllOpposition\\"
path=os.path.join(dir1,"Australia-allMatchesAllOpposition.csv")
aus_matches = pd.read_csv(path)
yka.teamBatsmenVsBowlersAllOppnAllMatches(aus_matches,"Australia",plot=True,top=3,runsScored=40)
2.9 Intl. T20 – Wins vs Losses of a team against all other teams (Class 3)
import os
import pandas as pd
import yorkpy.analytics as yka
dir1="C:\\software\\cricket-package\\yorkpyT20\\IntlT20-allMatchesAllOpposition\\"
path=os.path.join(dir1,"South Africa-allMatchesAllOpposition.csv")
sa_matches = pd.read_csv(path)
team1='South Africa'
yka.plotWinLossByTeamAllOpposition(sa_matches,team1,plot="detailed")
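A win/loss plot of this kind boils down to counting outcomes per opposition. Here is a minimal sketch of that tally with collections.Counter. The results are made up, the helper `win_loss_by_opposition` is hypothetical, and ties or no-results are ignored for simplicity, so this is only the counting idea and not yorkpy's implementation.

```python
from collections import Counter

# (opposition, winner): made-up results for a hypothetical "Team A"
results = [
    ("Team B", "Team A"),
    ("Team B", "Team B"),
    ("Team C", "Team A"),
    ("Team C", "Team A"),
    ("Team D", "Team D"),
]


def win_loss_by_opposition(results, team):
    """Count wins and losses of `team` against each opposition."""
    tally = Counter()
    for opposition, winner in results:
        outcome = "won" if winner == team else "lost"
        tally[(opposition, outcome)] += 1
    return tally


tally = win_loss_by_opposition(results, "Team A")
for (opposition, outcome), count in sorted(tally.items()):
    print(opposition, outcome, count)
```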
2.10 Intl. T20 – Batsmen analysis (Class 4)
import os
import pandas as pd
import yorkpy.analytics as yka
dir1="C:\\software\\cricket-package\\yorkpyT20\\IntlT20-BattingBowlingDetails\\"
# Rohit Sharma
name="RG Sharma"
team='India'
df=yka.getBatsmanDetails(team,name,dir=dir1)
yka.batsmanCumulativeAverageRuns(df,name)
# MJ Guptill
name="MJ Guptill"
team='New Zealand'
df=yka.getBatsmanDetails(team,name,dir=dir1)
yka.batsmanCumulativeStrikeRate(df,name)
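A cumulative average is the running mean of a batsman's scores after each innings. The sketch below computes it for a made-up sequence of scores. The helper `cumulative_average` is my own, and whether yorkpy computes its curve in exactly this way is an assumption.

```python
from itertools import accumulate


def cumulative_average(values):
    """Running mean after each innings: (sum of the first n values) / n."""
    return [total / n for n, total in enumerate(accumulate(values), start=1)]


runs_by_innings = [10, 50, 0, 36, 24]  # made-up scores in chronological order
averages = cumulative_average(runs_by_innings)
print(averages)
```

The curve flattens as more innings are added, which is why it is a useful view of a player's long-run form.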
2.11 Intl. T20 – Bowler analysis (Class 4)
import os
import pandas as pd
import yorkpy.analytics as yka
dir1="C:\\software\\cricket-package\\yorkpyT20\\IntlT20-BattingBowlingDetails\\"
# Shakib Al Hasan
name="Shakib Al Hasan"
team='Bangladesh'
df=yka.getBowlerWicketDetails(team,name,dir=dir1)
yka.bowlerMeanEconomyRate(df,name)
# SL Malinga
name="SL Malinga"
team='Sri Lanka'
df=yka.getBowlerWicketDetails(team,name,dir=dir1)
yka.bowlerWicketsAgainstOpposition(df,name)
3. Big Bash League
The following functions were added to handle BBL teams
- saveAllMatchesBetween2BBLTeams()
- saveAllMatchesAllOppositionBBLT20()
The BBL teams included are Adelaide Strikers, Brisbane Heat, Hobart Hurricanes, Melbourne Renegades, Melbourne Stars, Perth Scorchers, Sydney Sixers and Sydney Thunder
To use the yorkpy functions, the YAML files first have to be converted into pandas dataframes and then saved as CSV, as shown below
import os
import yorkpy.analytics as yka
os.chdir('C:\\software\\cricket-package\\yorkpyBBL\\bbl')
#1. Convert all YAML files to dataframes and save as CSV
#yka.convertAllYaml2PandasDataframesT20(".", "..\\BBLT20-Matches")
#2. Save all matches between 2 BBL teams
dir1='C:\\software\\cricket-package\\yorkpyBBL\\BBLT20-Matches'
#yka.saveAllMatchesBetween2BBLTeams(dir1)
#3. Save T20 matches between a BBL team and all other teams
dir1='C:\\software\\cricket-package\\yorkpyBBL\\BBLT20-Matches'
#yka.saveAllMatchesAllOppositionBBLT20(dir1)
#4. Get the batting details
dir1='C:\\software\\cricket-package\\yorkpyBBL\\BBLT20-Matches'
#yka.getTeamBattingDetails("Adelaide Strikers",dir=dir1, save=True)
#yka.getTeamBattingDetails("Brisbane Heat",dir=dir1,save=True)
#yka.getTeamBattingDetails("Hobart Hurricanes",dir=dir1,save=True)
#...
#5. Get the bowling details
dir1='C:\\software\\cricket-package\\yorkpyBBL\\BBLT20-Matches'
#yka.getTeamBowlingDetails("Adelaide Strikers",dir=dir1, save=True)
#yka.getTeamBowlingDetails("Brisbane Heat",dir=dir1,save=True)
#yka.getTeamBowlingDetails("Hobart Hurricanes",dir=dir1,save=True)
#...
The functions below perform analysis on the generated files from above. The YAML files have already been converted and are available at Github at BBL
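The batting and bowling detail steps above repeat the same call once per team. A small loop keeps that manageable when a league has many teams. In this sketch `run_for_all_teams` is a hypothetical helper and `fake_details` is a stand-in so the snippet runs without yorkpy; in real use you would pass `yka.getTeamBattingDetails` or `yka.getTeamBowlingDetails` with the `dir` and `save` arguments shown above.

```python
def run_for_all_teams(func, teams, **kwargs):
    """Call func(team, **kwargs) once per team and collect the results by team."""
    return {team: func(team, **kwargs) for team in teams}


bbl_teams = ["Adelaide Strikers", "Brisbane Heat", "Hobart Hurricanes"]


# Stand-in so the sketch runs without yorkpy; replace with yka.getTeamBattingDetails
def fake_details(team, dir=".", save=False):
    return f"{team} -> {dir} (save={save})"


out = run_for_all_teams(fake_details, bbl_teams, dir="BBLT20-Matches", save=True)
for team, result in out.items():
    print(result)
```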
3.1 Big Bash League – Team score card (Class 1)
import os
import pandas as pd
import yorkpy.analytics as yka
dir1="C:\\software\\cricket-package\\yorkpyBBL\\BBLT20-Matches"
path=os.path.join(dir1,".\\Adelaide Strikers-Brisbane Heat-2012-12-13.csv")
as_bh=pd.read_csv(path)
scorecard,extras=yka.teamBattingScorecardMatch(as_bh,"Brisbane Heat")
print(scorecard)
## batsman runs balls 4s 6s SR
## 0 LA Pomersbach 65 42 8 2 154.761905
## 1 JR Hopes 1 2 0 0 50.000000
## 2 JA Burns 37 31 2 2 119.354839
## 3 DT Christian 12 15 0 0 80.000000
## 4 NLTC Perera 12 4 0 2 300.000000
## 5 CA Lynn 19 18 1 1 105.555556
## 6 BCJ Cutting 13 5 0 2 260.000000
## 7 PJ Forrest 12 8 0 1 150.000000
## 8 CD Hartley 5 2 1 0 250.000000
print(extras)
## total wides noballs legbyes byes penalty extras
## 0 371 10 2 5 0 0 17
3.2 Big Bash League – Team batsmen vs Bowlers (Class 1)
import os
import pandas as pd
import yorkpy.analytics as yka
dir1="C:\\software\\cricket-package\\yorkpyBBL\\BBLT20-Matches"
path=os.path.join(dir1,".\\Hobart Hurricanes-Melbourne Renegades-2012-01-18.csv")
hh_mr=pd.read_csv(path)
yka.teamBatsmenVsBowlersMatch(hh_mr,'Hobart Hurricanes','Melbourne Renegades',plot=True)
3.3 Big Bash League – Team bowling scorecard match (Class 1)
import os
import pandas as pd
import yorkpy.analytics as yka
dir1="C:\\software\\cricket-package\\yorkpyBBL\\BBLT20-Matches"
path=os.path.join(dir1,".\\Melbourne Stars-Sydney Thunder-2016-01-24.csv")
ms_st=pd.read_csv(path)
a=yka.teamBowlingScorecardMatch(ms_st,'Sydney Thunder')
print(a)
## bowler overs runs maidens wicket econrate
## 0 A Zampa 4 32 0 2 8.000000
## 1 BW Hilfenhaus 2 21 0 0 10.500000
## 2 DJ Hussey 1 9 0 1 9.000000
## 3 DJ Worrall 3 42 0 0 14.000000
## 4 EP Gulbis 2 19 0 0 9.500000
## 5 MA Beer 3 25 0 1 8.333333
## 6 MP Stoinis 4 30 0 3 7.500000
3.4 Big Bash League – Match Worm chart (Class 1)
import os
import pandas as pd
import yorkpy.analytics as yka
dir1="C:\\software\\cricket-package\\yorkpyBBL\\BBLT20-Matches"
path=os.path.join(dir1,".\\Sydney Sixers-Melbourne Stars-2011-12-27.csv")
ss_ms=pd.read_csv(path)
yka.matchWormChart(ss_ms,"Melbourne Stars", "Sydney Sixers")
path=os.path.join(dir1,".\\Hobart Hurricanes-Brisbane Heat-2015-01-02.csv")
hh_bh=pd.read_csv(path)
yka.matchWormChart(hh_bh,"Hobart Hurricanes", "Brisbane Heat")
3.5 Big Bash League – Team Batting partnerships all matches 2 teams (Class 2)
import os
import pandas as pd
import yorkpy.analytics as yka
dir1="C:\\software\\cricket-package\\yorkpyBBL\\BBLT20-allMatchesBetween2Teams"
path=os.path.join(dir1,"Brisbane Heat-Adelaide Strikers-allMatches.csv")
bh_as_matches = pd.read_csv(path)
yka.teamBatsmenPartnershipOppnAllMatchesChart(bh_as_matches,"Brisbane Heat","Adelaide Strikers",plot=True, top=4, partnershipRuns=20)
3.6 Big Bash League – Team Bowling wicket kind all matches 2 teams (Class 2)
import os
import pandas as pd
import yorkpy.analytics as yka
dir1="C:\\software\\cricket-package\\yorkpyBBL\\BBLT20-allMatchesBetween2Teams"
path=os.path.join(dir1,"Sydney Sixers-Perth Scorchers-allMatches.csv")
ss_ps_matches = pd.read_csv(path)
yka.teamBowlingWicketKindOppositionAllMatches(ss_ps_matches,'Perth Scorchers','Sydney Sixers',plot=True,top=5,wickets=1)
3.7 Big Bash League – Team Bowling scorecard all teams (Class 3)
import os
import pandas as pd
import yorkpy.analytics as yka
dir1="C:\\software\\cricket-package\\yorkpyBBL\\BBLT20-allMatchesAllOpposition"
path=os.path.join(dir1,"Hobart Hurricanes-allMatchesAllOpposition.csv")
hh_matches = pd.read_csv(path)
scorecard=yka.teamBowlingScorecardAllOppnAllMatches(hh_matches,"Hobart Hurricanes")
print(scorecard)
## bowler overs runs maidens wicket econrate
## 16 B Lee 20 132 0 9 6.600000
## 30 CJ McKay 13 110 0 9 8.461538
## 88 NJ Rimmington 16 103 1 9 6.437500
## 67 JW Hastings 15 88 0 8 5.866667
## 63 JP Faulkner 15 146 0 7 9.733333
## 27 CJ Gannon 17 147 1 7 8.647059
## 93 NM Lyon 8 51 0 7 6.375000
## 20 BCJ Cutting 27 226 0 7 8.370370
## 48 GB Hogg 22 167 0 7 7.590909
## 107 SM Boland 12 96 0 7 8.000000
## 15 B Laughlin 13 99 0 7 7.615385
## 87 MT Steketee 15 134 0 5 8.933333
## 121 Yasir Arafat 9 48 0 4 5.333333
## 96 PJ Cummins 8 83 0 4 10.375000
## 46 Fawad Ahmed 11 64 0 4 5.818182
## 76 MA Beer 12 63 0 4 5.250000
## 108 SNJ O'Keefe 15 104 0 4 6.933333
## 75 M Muralitharan 7 31 0 4 4.428571
## 10 AJ Tye 16 127 0 4 7.937500
## 52 J Botha 13 94 0 4 7.230769
## 56 JL Pattinson 7 71 0 4 10.142857
## 62 JP Behrendorff 16 119 0 4 7.437500
## 3 AC Agar 12 87 0 4 7.250000
## 24 BM Edmondson 4 40 0 4 10.000000
## 37 DJ Hussey 8 47 0 3 5.875000
## 49 GJ Maxwell 8 65 0 3 8.125000
## 84 MN Samuels 4 22 0 3 5.500000
## 81 MG Neser 5 54 0 3 10.800000
## 44 DT Christian 9 114 0 3 12.666667
## 50 GS Sandhu 7 51 0 3 7.285714
## .. ... ... ... ... ... ...
## 43 DP Nannes 8 58 0 1 7.250000
## 51 IA Moran 4 25 0 1 6.250000
## 55 JK Lalor 10 82 0 1 8.200000
## 54 JH Kallis 3 18 0 1 6.000000
## 73 LR Butterworth 4 25 0 1 6.250000
## 4 AC McDermott 2 28 0 1 14.000000
## 70 LA Doran 4 38 0 1 9.500000
## 69 KW Richardson 6 44 0 1 7.333333
## 119 WD Sheridan 2 6 0 0 3.000000
## 2 AB McDonald 1 15 0 0 15.000000
## 115 TD Andrews 3 23 0 0 7.666667
## 11 AK Heal 4 33 0 0 8.250000
## 7 AD Russell 4 40 0 0 10.000000
## 8 AJ Finch 2 15 0 0 7.500000
## 9 AJ Turner 3 28 0 0 9.333333
## 60 JM Mennie 1 20 0 0 20.000000
## 18 BA Stokes 1 9 0 0 9.000000
## 26 CH Gayle 1 16 0 0 16.000000
## 28 CJ Green 4 44 0 0 11.000000
## 95 PD Collingwood 2 20 0 0 10.000000
## 31 CJ Simmons 4 21 0 0 5.250000
## 59 JM Holland 3 34 0 0 11.333333
## 36 DJ Bravo 6 64 0 0 10.666667
## 38 DJ Pattinson 2 16 0 0 8.000000
## 41 DJ Worrall 8 90 0 0 11.250000
## 72 LN O'Connor 6 56 0 0 9.333333
## 71 LJ Wright 3 27 0 0 9.000000
## 68 KA Pollard 1 7 0 0 7.000000
## 58 JM Herrick 4 23 0 0 5.750000
## 92 NM Hauritz 5 42 0 0 8.400000
##
## [122 rows x 6 columns]
3.8 Big Bash League – Plot wins vs losses against all teams (Class 3)
import os
import pandas as pd
import yorkpy.analytics as yka
dir1="C:\\software\\cricket-package\\yorkpyBBL\\BBLT20-allMatchesAllOpposition"
path=os.path.join(dir1,"Sydney Sixers-allMatchesAllOpposition.csv")
ss_matches = pd.read_csv(path)
yka.plotWinLossByTeamAllOpposition(ss_matches,'Sydney Sixers')
3.9 Big Bash League – Wins by runs or wickets against all teams (Class 3)
import os
import pandas as pd
import yorkpy.analytics as yka
dir1="C:\\software\\cricket-package\\yorkpyBBL\\BBLT20-allMatchesAllOpposition"
path=os.path.join(dir1,"Adelaide Strikers-allMatchesAllOpposition.csv")
as_matches = pd.read_csv(path)
yka.plotWinsByRunOrWicketsAllOpposition(as_matches,'Adelaide Strikers')
3.10 Big Bash League – Batsmen Analysis (Class 4)
import os
import pandas as pd
import yorkpy.analytics as yka
dir1="C:\\software\\cricket-package\\yorkpyBBL\\BBLT20-BattingBowlingDetails"
# CA Lynn
name="CA Lynn"
team='Brisbane Heat'
df=yka.getBatsmanDetails(team,name,dir=dir1)
yka.batsmanRunsVsStrikeRate(df,name)
# UT Khawaja
name="UT Khawaja"
team='Sydney Thunder'
df=yka.getBatsmanDetails(team,name,dir=dir1)
yka.batsmanRunsAgainstOpposition(df,name)
3.11 Big Bash League – Bowler analysis (Class 4)
import os
import pandas as pd
import yorkpy.analytics as yka
dir1="C:\\software\\cricket-package\\yorkpyBBL\\BBLT20-BattingBowlingDetails"
# CJ McKay
name="CJ McKay"
team='Sydney Thunder'
df=yka.getBowlerWicketDetails(team,name,dir=dir1)
yka.bowlerCumulativeAvgWickets(df,name)
# AU Rashid
name="AU Rashid"
team='Adelaide Strikers'
df=yka.getBowlerWicketDetails(team,name,dir=dir1)
yka.bowlerCumulativeAvgEconRate(df,name)
4. Natwest T20 Blast
The following functions were added to handle Natwest T20 teams
- saveAllMatchesBetween2NWBTeams()
- saveAllMatchesAllOppositionNWBT20()
The Natwest teams are
Derbyshire, Durham, Essex, Glamorgan, Gloucestershire, Hampshire, Kent, Lancashire, Leicestershire, Middlesex, Northamptonshire, Nottinghamshire, Somerset, Surrey, Sussex, Warwickshire, Worcestershire, Yorkshire
In order to perform analysis with yorkpy, the YAML data has to be converted to pandas dataframes and saved as CSV, as shown
#import os
#import yorkpy.analytics as yka
#os.chdir('C:\\software\\cricket-package\\yorkpyNWB\\nwb')
#1. Convert YAML to dataframes and save as CSV
#yka.convertAllYaml2PandasDataframesT20(".", "..\\NWBT20-Matches")
#2. Save all matches between 2 NWBT20 teams
#dir1='C:\\software\\cricket-package\\yorkpyNWB\\NWBT20-Matches'
#yka.saveAllMatchesBetween2NWBTeams(dir1)
#3. Save all matches between a NWB T20 team and all other teams
#dir1='C:\\software\\cricket-package\\yorkpyNWB\\NWBT20-Matches'
#yka.saveAllMatchesAllOppositionNWBT20(dir1)
#4. Compute the batting details
dir1='C:\\software\\cricket-package\\yorkpyNWB\\NWBT20-Matches'
#yka.getTeamBattingDetails("Derbyshire",dir=dir1, save=True)
#yka.getTeamBattingDetails("Durham",dir=dir1,save=True)
#yka.getTeamBattingDetails("Essex",dir=dir1,save=True)
#..
#5. Compute bowling details
dir1='C:\\software\\cricket-package\\yorkpyNWB\\NWBT20-Matches'
#yka.getTeamBowlingDetails("Derbyshire",dir=dir1, save=True)
#yka.getTeamBowlingDetails("Durham",dir=dir1,save=True)
#yka.getTeamBowlingDetails("Essex",dir=dir1,save=True)
#...
Once the data is converted, all yorkpy functions can be used. The conversion has already been done and the data is available on Github at NWB
4.1 Natwest T20 Blast – Team score card (Class 1)
import os
import pandas as pd
import yorkpy.analytics as yka
dir1="C:\\software\\cricket-package\\yorkpyNWB\\NWBT20-Matches"
path=os.path.join(dir1,".\\Durham-Yorkshire-2016-08-20.csv")
d_y=pd.read_csv(path)
scorecard,extras=yka.teamBattingScorecardMatch(d_y,"Durham")
print(scorecard)
## batsman runs balls 4s 6s SR
## 0 MD Stoneman 25 20 4 0 125.000000
## 1 KK Jennings 11 13 1 0 84.615385
## 2 BA Stokes 56 37 4 3 151.351351
## 3 MJ Richardson 29 23 4 1 126.086957
## 4 JTA Burnham 17 15 1 1 113.333333
## 5 RD Pringle 10 9 1 0 111.111111
## 6 PD Collingwood 2 3 0 0 66.666667
## 7 U Arshad 1 1 0 0 100.000000
print(extras)
## total wides noballs legbyes byes penalty extras
## 0 305 2 0 5 0 0 7
4.2 Natwest T20 Blast – Team batsmen vs Bowlers (Class 1)
import os
import pandas as pd
import yorkpy.analytics as yka
dir1="C:\\software\\cricket-package\\yorkpyNWB\\NWBT20-Matches"
path=os.path.join(dir1,".\\Derbyshire-Lancashire-2016-07-13.csv")
d_l=pd.read_csv(path)
yka.teamBatsmenVsBowlersMatch(d_l,'Lancashire','Derbyshire',plot=True)
4.3 Natwest T20 Blast – Team bowling scorecard match (Class 1)
import os
import pandas as pd
import yorkpy.analytics as yka
dir1="C:\\software\\cricket-package\\yorkpyNWB\\NWBT20-Matches"
path=os.path.join(dir1,".\\Essex-Surrey-2016-05-20.csv")
e_s=pd.read_csv(path)
a=yka.teamBowlingScorecardMatch(e_s,'Essex')
print(a)
## bowler overs runs maidens wicket econrate
## 0 Azhar Mahmood 3 38 0 4 12.666667
## 1 GJ Batty 4 33 0 1 8.250000
## 2 JE Burke 1 18 0 0 18.000000
## 3 MW Pillans 3 28 0 0 9.333333
## 4 SM Curran 4 23 0 2 5.750000
## 5 TK Curran 4 21 0 3 5.250000
4.4 Natwest T20 Blast – Match Worm chart (Class 1)
import os
import pandas as pd
import yorkpy.analytics as yka
dir1="C:\\software\\cricket-package\\yorkpyNWB\\NWBT20-Matches"
path=os.path.join(dir1,".\\Gloucestershire-Glamorgan-2016-06-10.csv")
glo_gla=pd.read_csv(path)
yka.matchWormChart(glo_gla,"Gloucestershire", "Glamorgan")
path=os.path.join(dir1,".\\Leicestershire-Northamptonshire-2016-05-20.csv")
lei_nor=pd.read_csv(path)
yka.matchWormChart(lei_nor,"Northamptonshire", "Leicestershire")
4.5 Natwest T20 Blast – Team Batting partnerships all matches 2 teams (Class 2)
import os
import pandas as pd
import yorkpy.analytics as yka
dir1="C:\\software\\cricket-package\\yorkpyNWB\\NWBT20-allMatchesBetween2Teams"
path=os.path.join(dir1,"Hampshire-Sussex-allMatches.csv")
h_s_matches = pd.read_csv(path)
yka.teamBatsmenPartnershipOppnAllMatchesChart(h_s_matches,"Hampshire","Sussex",plot=True, top=4, partnershipRuns=10)
4.6 Natwest T20 Blast – Team Bowlers vs Batsmen all matches 2 teams (Class 2)
import os
import pandas as pd
import yorkpy.analytics as yka
dir1="C:\\software\\cricket-package\\yorkpyNWB\\NWBT20-allMatchesBetween2Teams"
path=os.path.join(dir1,"Kent-Somerset-allMatches.csv")
k_s_matches = pd.read_csv(path)
yka.teamBowlersVsBatsmenOppnAllMatches(k_s_matches,'Kent','Somerset',plot=True,
top=5,runsConceded=10)
4.7 Natwest T20 Blast – Team Bowling scorecard all teams (Class 3)
import os
import pandas as pd
import yorkpy.analytics as yka
dir1="C:\\software\\cricket-package\\yorkpyNWB\\NWBT20-allMatchesAllOpposition"
path=os.path.join(dir1,"Middlesex-allMatchesAllOpposition.csv")
m_matches = pd.read_csv(path)
scorecard=yka.teamBowlingScorecardAllOppnAllMatches(m_matches,"Middlesex")
print(scorecard)
## bowler overs runs maidens wicket econrate
## 1 AJ Tye 8 75 0 6 9.375000
## 5 BAC Howell 8 41 0 5 5.125000
## 26 GR Napier 7 65 0 5 9.285714
## 15 DI Stevens 4 31 0 4 7.750000
## 19 DW Lawrence 6 37 0 4 6.166667
## 32 JW Dernbach 4 33 0 3 8.250000
## 7 BTJ Wheal 4 43 0 3 10.750000
## 18 DR Briggs 4 24 0 3 6.000000
## 50 RK Kleinveldt 4 24 0 3 6.000000
## 46 R McLaren 7 59 0 3 8.428571
## 47 R Rampaul 3 21 0 3 7.000000
## 34 L Gregory 6 51 0 2 8.500000
## 33 KMDN Kulasekara 2 24 0 2 12.000000
## 40 MG Hogan 3 17 0 2 5.666667
## 43 MTC Waller 4 31 0 2 7.750000
## 49 RJ Gleeson 4 20 0 2 5.000000
## 48 RE van der Merwe 5 24 0 2 4.800000
## 51 RN ten Doeschate 4 32 0 2 8.000000
## 53 S Prasanna 4 20 0 2 5.000000
## 56 SW Tait 3 17 0 2 5.666667
## 57 Shahid Afridi 8 55 0 2 6.875000
## 59 T van der Gugten 3 13 1 2 4.333333
## 64 TS Mills 3 34 0 2 11.333333
## 65 WAT Beer 4 23 0 2 5.750000
## 31 JH Davey 4 28 0 2 7.000000
## 68 ZS Ansari 3 16 0 2 5.333333
## 25 GM Andrew 3 19 0 2 6.333333
## 23 GJ Batty 6 55 0 2 9.166667
## 16 DJ Bravo 3 27 0 2 9.000000
## 41 MR Quinn 6 65 0 1 10.833333
## .. ... ... ... ... ... ...
## 24 GL van Buuren 7 49 0 1 7.000000
## 37 MD Hunn 3 35 0 1 11.666667
## 36 LC Norwell 6 62 0 1 10.333333
## 29 JC Tredwell 4 35 0 1 8.750000
## 35 LA Dawson 6 53 0 1 8.833333
## 62 TL Best 4 51 0 0 12.750000
## 58 T Westley 2 12 0 0 6.000000
## 4 Azharullah 3 24 0 0 8.000000
## 60 TD Groenewald 1 21 0 0 21.000000
## 61 TK Curran 4 35 0 0 8.750000
## 38 MD Taylor 3 30 0 0 10.000000
## 30 JG Myburgh 1 5 0 0 5.000000
## 8 C Overton 2 18 0 0 9.000000
## 2 Ashar Zaidi 1 5 0 0 5.000000
## 66 WR Smith 2 25 0 0 12.500000
## 28 J Overton 2 24 0 0 12.000000
## 6 BJ Taylor 1 6 0 0 6.000000
## 22 GG White 4 31 0 0 7.750000
## 55 SP Crook 1 9 0 0 9.000000
## 39 ME Claydon 4 40 0 0 10.000000
## 52 RS Bopara 4 32 0 0 8.000000
## 10 CD Nash 2 19 0 0 9.500000
## 11 CH Morris 4 36 0 0 9.000000
## 12 DA Cosker 3 32 0 0 10.666667
## 13 DA Griffiths 4 39 0 0 9.750000
## 45 PD Trego 1 11 0 0 11.000000
## 44 PA van Meekeren 2 19 0 0 9.500000
## 42 MS Crane 2 25 0 0 12.500000
## 20 FK Cowdrey 1 19 0 0 19.000000
## 14 DD Masters 2 16 0 0 8.000000
##
## [69 rows x 6 columns]
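The scorecard above is ordered by wickets taken. A different question, such as who was most economical among bowlers with a reasonable workload, needs a filter and a re-sort. The sketch below does this on a few rows copied from the printed scorecard. The helper `most_economical` and the minimum-overs threshold are my own illustration and not part of yorkpy.

```python
# A few rows copied from the Middlesex scorecard printed above
rows = [
    {"bowler": "AJ Tye", "overs": 8, "runs": 75, "wicket": 6},
    {"bowler": "BAC Howell", "overs": 8, "runs": 41, "wicket": 5},
    {"bowler": "GR Napier", "overs": 7, "runs": 65, "wicket": 5},
    {"bowler": "DI Stevens", "overs": 4, "runs": 31, "wicket": 4},
    {"bowler": "R McLaren", "overs": 7, "runs": 59, "wicket": 3},
    {"bowler": "Shahid Afridi", "overs": 8, "runs": 55, "wicket": 2},
]


def most_economical(rows, min_overs=5):
    """Keep bowlers with at least min_overs and sort by runs per over, lowest first."""
    eligible = [r for r in rows if r["overs"] >= min_overs]
    return sorted(eligible, key=lambda r: r["runs"] / r["overs"])


ranked = most_economical(rows, min_overs=7)
for r in ranked:
    print(f'{r["bowler"]:15s} {r["overs"]:2d} {r["runs"]:3d} {r["runs"] / r["overs"]:.6f}')
```

With the dataframe returned by yorkpy, the same idea is a boolean filter on the overs column followed by sort_values on econrate.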
4.8 Natwest T20 Blast – Plot wins vs losses against all teams (Class 3)
import os
import pandas as pd
import yorkpy.analytics as yka
dir1="C:\\software\\cricket-package\\yorkpyNWB\\NWBT20-allMatchesAllOpposition"
path=os.path.join(dir1,"Warwickshire-allMatchesAllOpposition.csv")
w_matches = pd.read_csv(path)
yka.plotWinLossByTeamAllOpposition(w_matches,'Warwickshire')
4.9 Natwest T20 Blast – Batsmen Analysis (Class 4)
import os
import pandas as pd
import yorkpy.analytics as yka
dir1="C:\\software\\cricket-package\\yorkpyNWB\\NWBT20-BattingBowlingDetails"
# M Klinger
name="M Klinger"
team='Gloucestershire'
df=yka.getBatsmanDetails(team,name,dir=dir1)
yka.batsmanRunsAgainstOpposition(df,name)
# CA Ingram
name="CA Ingram"
team='Glamorgan'
df=yka.getBatsmanDetails(team,name,dir=dir1)
yka.batsmanCumulativeStrikeRate(df,name)
4.10 Natwest T20 Blast – Bowler analysis (Class 4)
import os
import pandas as pd
import yorkpy.analytics as yka
dir1="C:\\software\\cricket-package\\yorkpyNWB\\NWBT20-BattingBowlingDetails"
# BAC Howell
name="BAC Howell"
team='Gloucestershire'
df=yka.getBowlerWicketDetails(team,name,dir=dir1)
yka.bowlerCumulativeAvgEconRate(df,name)
# GR Napier
name="GR Napier"
team='Essex'
df=yka.getBowlerWicketDetails(team,name,dir=dir1)
yka.bowlerWicketsVenue(df,name)
Note: yorkpy will work for all T20 leagues which are in YAML format as specified in Cricsheet.
You can clone/fork the latest code for yorkpy from github yorkpy
The data for IPL, Intl. T20, BBL and Natwest T20 have already been converted into pandas dataframes and saved as CSVs. You can download the converted files from Github at [allYorkpyT20Data](https://github.com/tvganesh/allYorkpyT20Data)
Conclusion
This post shows the kind of detailed analysis that can be performed with yorkpy. In fact with all the converted data it should be possible to also train a Machine Learning model, which I will probably keep for another day. You could go ahead and use the data in other innovative ways. Do keep me posted if you do!!
Important note: Do check out my other posts using yorkpy at yorkpy-posts
Have fun with yorkpy!!
This post is a continuation of my earlier post Big Data-1: Move into the big league:Graduate from Python to Pyspark. While the earlier post discussed parallel constructs in Python and Pyspark, this post elaborates on similar key constructs in R and SparkR. Although this post focuses only on the programming part of R and SparkR, it is essential to understand and fully grasp the concepts of Spark and RDDs, and how data is distributed across the cluster. This post, like the earlier one, shows how, if you already have a good handle on R, you can easily graduate to Big Data with SparkR.
Note 1: This notebook has also been published at Databricks community site Big Data-2: Move into the big league:Graduate from R to SparkR