The first page of the Variations Rulebook
Using a variation score for T20 bowling to analyse phase-wise trends.
A batter’s ability to play a range of shots has its counterpart in a bowler’s ability to bowl a variety of deliveries. The bowler’s case is different, however, in the number of opportunities he gets to shoot all his arrows: just the 6 balls of an over. One could argue that it’s actually 24, but each over is really a discrete event in a T20 game. A batter is less likely to dwell on the type of ball the bowler delivered in a previous over than on what they have seen in the current one, having already faced, say, three balls from the same bowler. Hence, if there is to be a variation metric for bowlers, it has to be based on how much variety he produces within an over.
Why look at variations?
An important question to answer. How much difference do variations create? And if they do, what’s the trend? How does a given outcome, say economy, change as variation increases? But wait a minute. How do you calculate variation?
That’s the first matter of importance. We first need to know how much variation was produced in an over; only then can we plot it against, let’s say, economy to see whether it really makes a difference. For this I needed Hawk-Eye data, which I finally have thanks to Himanish Ganjoo (the best cricketer in France currently).
(The data covers the past three IPL seasons.)
The idea behind this variation score is pretty simple: take the variables of line, length, swing (drift for spin), seam (turn for spin) and speed, normalize them, and treat each ball as a point in a 5-dimensional space. This way you get 6 different points for the 6 balls of the over. Now just calculate the average distance between them, and there’s your variation score. And here I present to you the economy-variations plot:
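(A quick aside for anyone who wants to try this at home: here is a minimal sketch of how such a score could be computed. It is not the exact code behind the plots. It assumes z-score normalization, Euclidean distance and an average over all 15 pairs of balls, none of which are pinned down above, and the function name, column order and numbers are made up for illustration.)

```python
from itertools import combinations

import numpy as np


def variation_score(over, mean, std):
    """Average pairwise distance between the balls of one over.

    over : array of shape (n_balls, 5); columns are assumed to be
           line, length, swing/drift, seam/turn, speed.
    mean, std : per-feature constants used for z-score normalization
                (an assumption; the post only says "normalize").
    """
    z = (np.asarray(over, dtype=float) - mean) / std
    # Euclidean distance for every unordered pair of balls (15 pairs for 6 balls).
    dists = [np.linalg.norm(a - b) for a, b in combinations(z, 2)]
    return float(np.mean(dists))


# Made-up over of 6 balls, purely to show the call.
rng = np.random.default_rng(0)
feature_mean = np.array([0.0, 6.5, 0.5, 0.5, 135.0])  # hypothetical values
feature_std = np.array([0.3, 2.0, 0.4, 0.4, 6.0])     # hypothetical values
example_over = feature_mean + feature_std * rng.normal(size=(6, 5))
print(round(variation_score(example_over, feature_mean, feature_std), 2))
```

In practice the normalization constants would be computed over the whole dataset, probably separately for pace and spin, so that no single variable such as speed dominates the distance.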

You see that? It looks like two different halves altogether. I was a bit confused initially, and that’s where the fun is in this analysis stuff. This is how I moved forward: I calculated the average over variation of every bowler to look for a pattern. One pattern was consistent: a lot of middle-overs bowlers were dominating the first half of the plot! This led me to check whether it was the middle overs that were producing these consistently low variation scores. And yes, it was indeed the middle overs. For spinners, who are primarily middle-overs bowlers, and also for all the pace overs in the middle, the economy increased with variation (correlation = 0.94 and 0.88 respectively). I also plotted the percentage of seam overs bowled in the middle against variation score, and it followed a falling trend (correlation = -0.96). This explains the first half of the economy-variations (Batruns-Variation) plot.
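For the curious, a trend check like this could be sketched as follows. This is an illustration under assumptions, not the code behind the numbers above: it groups overs into variation-score bins, averages the runs conceded in each bin and takes the Pearson correlation of those averages. The bin width, the function name and the synthetic data are all made up.

```python
import numpy as np


def binned_trend(variation, runs, bin_width=0.2):
    """Mean runs per over in each variation-score bin, plus the Pearson
    correlation between bin centres and those means.

    Needs at least two non-empty bins for the correlation to be defined.
    """
    variation = np.asarray(variation, dtype=float)
    runs = np.asarray(runs, dtype=float)
    bins = np.floor(variation / bin_width).astype(int)

    centres, mean_runs = [], []
    for b in np.unique(bins):
        centres.append((b + 0.5) * bin_width)
        mean_runs.append(runs[bins == b].mean())

    centres = np.array(centres)
    mean_runs = np.array(mean_runs)
    r = float(np.corrcoef(centres, mean_runs)[0, 1])
    return centres, mean_runs, r


# Synthetic overs (not real IPL data): runs drift upward with variation.
rng = np.random.default_rng(1)
var_scores = rng.uniform(0.6, 2.0, size=500)
runs_conceded = 6 + 2 * var_scores + rng.normal(0, 3, size=500)
_, _, r = binned_trend(var_scores, runs_conceded)
print(f"correlation of binned means: {r:.2f}")
```

Correlating bin averages rather than individual overs smooths out over-to-over noise, which is one way to end up with correlations as strong as the ones quoted here; a correlation computed on raw overs would usually be much weaker.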
To summarize
To find out which variation scores actually work well for spinners and pacers, I plotted wickets against variation as well. These are the final findings:
As discussed above, for spinners the economy increases with variations. But their wicket taking ability also shows an increasing trend till a value of 1.4, after which no particular trend is seen. Thus a score around 1.4 is good for spinners to balance both aspects.
For pacers bowling in the middle, wickets have largely declined with variation (correlation = -0.68). Combined with the fact that economy has increased with variation, this further strengthens the argument to keep variations low and focus more on consistency.
For pacers in powerplay and death, wickets specifically show no particular trend with increasing variation, but economy decreased (second half of the plot). Hence, from a point of view of economy, I believe it’s better to have bowlers with a good variation score in these overs.
In one line – Consistency over variations for middle, variations over consistency for the rest.
The Names
Okay, coming to the selling point of the blog.
Spinners
As expected, Jadeja is the bowler with the lowest variation score. Kuldeep has the highest. The 4 players at the top left (Karn Sharma, Shahbaz, Lalit Yadav and Sundar) have really high economies despite having a low variation score and are exceptions to the trend. The rest of the group displays the trend in general. We can see, for example, how the majority of the bowlers with an economy below 8 lie below or close to 1.4 (red region). As for the value of 1.4 itself, Hasaranga, Chahal, Theekshana, Narine and Kartikeya are the players closest to it. There’s also a very evident split between off spinners and leg spinners at the 1.4 line.
The two bowlers at the top of the wickets plot are Hasaranga and Chawla. The transition from yellow to red in this plot supports the idea that wicket-taking drops at low variation. The green and red regions show two different breeds of spinners, the consistent ones and the wicket-taking ones: again a clear demarcation between offies and leggies.
Pacers
For the pacers, again as discussed, the density of bowlers with an economy greater than 8.5 decreases as the variation increases (blue to green). The majority of the bowlers in this plot are powerplay or death bowlers. Chris Jordan, who bowls mainly at the death, has been the most expensive and, as expected, has a very low variation score. A good range for pacers has been 1.2 to 1.4; there aren’t a lot of them beyond that. This is also the region where Jasprit Bumrah, the most economical of them, sits. Pretorius, Coetzee and Akash Deep (not shown in this scatter plot) are three bowlers with really high variation scores, all of them around 2, and the lowest economy among them was 8.88. Again, these are exceptions to the trend.
As I summarised earlier for powerplay and death bowlers, I couldn’t see a clear pattern between their variations and wicket-taking abilities. The wicket scatter plot confirms this: the density doesn’t appear to change much as we move towards the high-variation zones.
Use Cases
First of all, this piece provides a way to quantify a bowler’s variations, adding a new informative metric for bowlers. It also gives teams a way to relate this metric to the different phases of a T20 game, along with some data-backed observations that can help determine bowling combinations.
That’s it for now then. Do share your views!
Beautiful piece of work. The next step may be to check whether different types of variation are correlated differently with outcomes - maybe speed variation reaps more wickets but length variation curtails runs - and produce a weighted metric (if sample sizes don't become a problem).