GooglyPlusPlus now with Win Probability Analysis for all T20 matches

In my 2 earlier posts, Computing Win-Probability of T20 matches and Boosting Win Probability accuracy with player embeddings, I discussed approaches to computing the ball-by-ball Win Probability of a T20 match. My best ML models were:

  • glmnet – Logistic Regression (LR) with lasso regularization (penalty) – Accuracy – 0.73
  • Random Forest (RF) – Accuracy – 0.92

Incidentally, both these models can be used on live streaming ball-by-ball data, if available.
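To give a feel for how a logistic model could be applied to a live feed, here is a minimal Python sketch. It is not the trained glmnet model from the app: the two features (required run rate and wickets in hand), all the coefficients and the function names below are made-up assumptions chosen purely for illustration.

```python
import math
from typing import Iterable, Iterator, NamedTuple


class ChaseState(NamedTuple):
    """State of the chasing side before a delivery (illustrative features only)."""
    balls_remaining: int
    runs_required: int
    wickets_in_hand: int


# Hypothetical coefficients, NOT the weights of the trained glmnet model.
INTERCEPT = 1.0
W_REQUIRED_RUN_RATE = -0.45   # a higher required rate hurts the chasing side
W_WICKETS_IN_HAND = 0.30      # more wickets in hand helps the chasing side


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def win_probability(state: ChaseState) -> float:
    """Win probability of the chasing side for one ball."""
    if state.runs_required <= 0:
        return 1.0   # target already reached
    if state.balls_remaining <= 0 or state.wickets_in_hand <= 0:
        return 0.0   # no balls or no wickets left
    required_run_rate = 6.0 * state.runs_required / state.balls_remaining
    z = (INTERCEPT
         + W_REQUIRED_RUN_RATE * required_run_rate
         + W_WICKETS_IN_HAND * state.wickets_in_hand)
    return sigmoid(z)


def stream_win_probability(feed: Iterable[ChaseState]) -> Iterator[float]:
    """Lazily score each ball as it arrives from a (possibly live) feed."""
    for state in feed:
        yield win_probability(state)


# Made-up feed: 11 needed off the last over with 3 wickets in hand.
feed = [ChaseState(6, 11, 3), ChaseState(5, 8, 3), ChaseState(4, 8, 2)]
for state, p in zip(feed, stream_win_probability(feed)):
    print(state, round(p, 3))
```

Because the scoring function is a generator, each ball is scored as soon as it arrives, which is the property that matters for a live stream. A real deployment would replace the hand-written coefficients with the fitted model.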

I have now integrated the trained Logistic Regression model with penalty into my Shiny app GooglyPlusPlus. Unfortunately, the Random Forest model, besides being computationally intensive, is also heavyweight (1.29 GB) compared to the LR model, which is just 91.2 MB. So I was not able to upload the Random Forest model to Shiny, as the memory required exceeded that allowed in my paid subscription.

However, I will demonstrate the performance of both models: LR (in my Web app) and RF (on my local machine). Incidentally, the Random Forest model takes a long time to load and even longer (~90 secs) to compute the Win Probability of a T20 match, while the LR model computes it in a few seconds. Interestingly, I find the LR model’s Win Probability more intuitive and explainable than the Random Forest’s. Possibly the RF model overfits. I need to explore this more. Anyway, take a look at some interesting Win Probability charts (fortune swings of teams!!!) over the course of a T20 match.

You can try out this latest version at GooglyPlusPlus!!

Some major upsets in the ICC T20 World Cup, 2022

A) Netherlands vs South Africa – 2022-11-06

B) Zimbabwe vs Pakistan – 2022-10-27

1a) Netherlands vs South Africa – ICC 2022-11-06 (Worm-wicket chart)

Netherlands shocked South Africa and ended South Africa’s hopes for a place in the semi-finals. The worm-wicket chart for this match is shown below.

The 2 circled areas are where South Africa lost the plot: around the 8th over (ball ~120+48=168) and around the 15th over (ball ~120+90=210).
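The ball numbers in brackets come from a simple conversion: the second innings starts after the 120 balls of the first, and each completed over adds 6 more. A small Python sketch of that arithmetic, assuming a full 20-over first innings and counting only legal deliveries (the function name is hypothetical):

```python
BALLS_PER_OVER = 6
OVERS_PER_INNINGS = 20


def match_ball_number(innings: int, completed_overs: int, balls_in_over: int = 0) -> int:
    """Convert (innings, overs, balls) into a ball number on a 1-240 match axis.

    Assumes the first innings ran its full 20 overs and ignores extras.
    """
    if innings not in (1, 2):
        raise ValueError("innings must be 1 or 2")
    if not 0 <= completed_overs <= OVERS_PER_INNINGS:
        raise ValueError("completed_overs must be between 0 and 20")
    if not 0 <= balls_in_over < BALLS_PER_OVER:
        raise ValueError("balls_in_over must be between 0 and 5")
    # Balls already bowled in earlier innings (0 for the first, 120 for the second).
    offset = (innings - 1) * OVERS_PER_INNINGS * BALLS_PER_OVER
    return offset + completed_overs * BALLS_PER_OVER + balls_in_over


print(match_ball_number(2, 8))    # 168
print(match_ball_number(2, 15))   # 210
```

Wides and no-balls add extra deliveries, so positions on the actual charts may be shifted a little from these values, hence the "~".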

Around balls 205-215, South Africa started to lose.

1b) Netherlands vs South Africa – ICC 2022-11-06 – Logistic Regression with regularisation (Shiny)

1c) Netherlands vs South Africa – ICC 2022-11-06 – Random Forest (not in Web app, local)

If you notice, for some reason the Random Forest model decided that the Netherlands were on the winning side right from the start. Why would this happen? Possibly overfitting, I presume…

2a) Zimbabwe vs Pakistan – ICC 2022-10-27 (Worm-wicket chart)

Pakistan seemed to be cruising along, finally needing 11 runs in the last over, but for some reason they panicked and lost.

2b) Zimbabwe vs Pakistan – ICC 2022-10-27 – Logistic Regression with regularisation (Shiny)

It can be seen that Pakistan did seem to have the upper hand, save for the last over.

2c) Zimbabwe vs Pakistan – ICC 2022-10-27 – Random Forest (not in Web app, local)

Again, the Random Forest model implies that Zimbabwe was on a winning footing except in brief stretches, e.g. at ball 248.

So while the accuracy of the Random Forest model is better by about 20 percentage points (0.92 vs 0.73), I feel that the Logistic Regression with penalty has generalised better and is more intuitive. Meanwhile, I will see if I can improve the LR model or try another model which provides better accuracy while also generalising well.
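A quick note on the arithmetic behind that comparison: the gap between 0.92 and 0.73 can be stated either in percentage points or as a relative improvement, and the two are different numbers. A small Python sketch using only the two accuracies quoted earlier (the helper name is hypothetical):

```python
from typing import Tuple


def accuracy_gap(baseline: float, other: float) -> Tuple[float, float]:
    """Return (absolute gap, relative improvement) of `other` over `baseline`."""
    absolute = other - baseline
    relative = absolute / baseline
    return absolute, relative


lr_accuracy = 0.73   # Logistic Regression with penalty
rf_accuracy = 0.92   # Random Forest

abs_gap, rel_gain = accuracy_gap(lr_accuracy, rf_accuracy)
print(f"Absolute gap: {abs_gap * 100:.0f} percentage points")   # 19
print(f"Relative improvement: {rel_gain * 100:.0f}%")           # 26
```

Neither number says anything about how well a model generalises. That would need a comparison of training accuracy against accuracy on held-out matches.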

Henceforth, I will only be using the LR model that is in the Shiny app.

3a) England vs New Zealand T20 Women – 2021-09-04

Another close match till the 15th over. After that, England seem to have had a slower strike rate and lost.

3b) England vs New Zealand T20 Women – 2021-09-04 – Logistic Regression

4a) Chennai Super Kings vs Gujarat Titans (IPL 2022) – Worm-wicket chart

4b) Chennai Super Kings vs Gujarat Titans (IPL 2022) – Logistic Regression

5a) Islamabad United vs Peshawar Zalmi – 2021-06-17 – Worm-wicket chart

This match seems to have been close, with both worms intertwined almost all the way.

5b) Islamabad United vs Peshawar Zalmi – 2021-06-17 – Logistic Regression

According to the model, Peshawar Zalmi lost the game around the 14th-15th over.
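One way to read a statement like this off a Win Probability curve is to look for the last ball at which a team's probability crosses 50%. A minimal Python sketch, using a made-up probability series rather than the actual match data (the function name and the numbers are hypothetical):

```python
from typing import Optional, Sequence


def last_lead_change(win_prob: Sequence[float], threshold: float = 0.5) -> Optional[int]:
    """Return the 0-based index of the last ball at which the probability
    moves from one side of `threshold` to the other, or None if it never does."""
    last = None
    for i in range(1, len(win_prob)):
        was_ahead = win_prob[i - 1] >= threshold
        is_ahead = win_prob[i] >= threshold
        if was_ahead != is_ahead:
            last = i
    return last


# Made-up ball-by-ball win probabilities for one team.
probs = [0.55, 0.60, 0.52, 0.48, 0.51, 0.45, 0.30, 0.10]
print(last_lead_change(probs))   # 5
```

Dividing the returned ball index by 6 gives the approximate over in which the game slipped away.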

Feel free to play around with the latest GooglyPlusPlus


Meanwhile, I will try to come up with a better model which executes fast, generalises well and is accurate. Tall order, no doubt!!!

Till then, play around with GooglyPlusPlus.

Also check out my other posts

  1. Using embeddings, collaborative filtering with Deep Learning to analyse T20 players
  2. Computer Vision: Ramblings on derivatives, histograms and contours
  3. Deep Learning from first principles in Python, R and Octave – Part 4
  4. TWS-4: Gossip protocol: Epidemics and rumors to the rescue
  5. How to program – Some essential tips
  6. Cricpy performs granular analysis of players
  7. Analyzing World Bank data with WDI, googleVis Motion Charts
  8. Practical Machine Learning with R and Python – Part 5
  9. Presentation on “Intelligent Networks, CAMEL protocol, services & applications”

To see all posts click Index of posts
