Tuesday, July 28, 2009

Who would you want to bat for your life?

Disclaimer: I am a complete believer in the popular phrase - “Statistics are like mini-skirts, what they reveal is interesting but what they hide is vital”

It’s always a dicey proposition to make a science out of an art. But for good or bad, we as cricket fans have internalized it so much that we cannot leave statistics out of our evaluation of any player or team. Now that we have decided to live with it, let’s use numbers from as many dimensions as possible to arrive at a statistically fairer evaluation of a player.

Typically, the statistic that’s most abused in cricket is a batsman’s average. The first class in most statistics courses starts with this question on cricket: does a higher average translate into better consistency? That’s when the professor introduces the concept of Standard Deviation, which is nothing but a measurement of risk. Very simply put, it measures the variability around the expected score (which is the average) – the higher the variability, the higher the risk! For an intuitive explanation of this concept, click here.

But the Standard Deviation is so overhyped a concept that a lateral thinking guru once told me that most cricket followers mistake the average for a proxy for greatness when it should ideally be the Standard Deviation! Is a batsman who scores 10 runs every time he bats greater than Sachin or Lara, just because he is very, very consistent – no variability at all? So a standalone Standard Deviation may not be of great help, but when used along with the average, it gives a potent number (a crude version of what is famously called the “Sharpe Ratio” in the world of finance). Let’s call this the consistency index: the batsman’s average divided by his Standard Deviation, i.e. his average per unit of risk. “The higher the average per unit of risk, the more reliable the batsman” is not a bad doctrine to believe in.
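To make the idea concrete, here is a minimal sketch of the consistency index in Python. The two score lists are hypothetical, not real scorecards, and the function name `consistency_index` is mine. I am also assuming the population standard deviation (`pstdev`), since the post never says which variant was used, and the average here is simply runs per innings because not outs are not modelled.

```python
from statistics import mean, pstdev

def consistency_index(scores):
    """Average per unit of risk: mean score divided by standard deviation."""
    spread = pstdev(scores)  # population SD is an assumption, not the post's stated method
    if spread == 0:
        raise ValueError("standard deviation is zero; the ratio is undefined")
    return mean(scores) / spread

# Hypothetical innings-by-innings scores; both batsmen average exactly 50.
steady = [40, 55, 45, 60, 50]
flashy = [0, 120, 5, 110, 15]

print(round(consistency_index(steady), 2))  # 7.07
print(round(consistency_index(flashy), 2))  # 0.94
```

Both batsmen have the same average, but the one whose scores are bunched around 50 gets a far higher index than the one who alternates between failures and hundreds.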

In fact, if I remember right, S. Rajesh of Cricinfo did a similar exercise a few years back for Cricinfo Magazine, where he calculated the consistency index for all the great batsmen to arrive at statistically the most consistent of them all. In this exercise, I am trying to build on his work and add more dimensions to the analysis to possibly get a more complete picture. Since I don’t have the advantage of access to a comprehensive database like S. Rajesh, I will restrict this analysis to a shortlist of batsmen whom I have had the privilege of watching in my cricket-following career of the past 2 decades. This list is completely my personal choice of batsmen who have staked a claim to be the best in the world at different points in time. And they are – Sachin Tendulkar, Rahul Dravid, Brian Lara, Jacques Kallis, Ricky Ponting, Kevin Pietersen, Virender Sehwag, Matthew Hayden, and Adam Gilchrist. Just for the novelty factor, I have added Sunil Gavaskar to the mix as well!

Now to begin with, let’s use the primitive method of comparing them on averages.
Average 1
Ponting 56.31
Sachin 54.73
Kallis 54.66
Lara 52.89
Dravid 52.39
Gavaskar 51.12
Sehwag 50.82
Pietersen 50.82
Hayden 50.74
Gilchrist 47.61

Ponting clearly leads the pack, comfortably ahead of Tendulkar. Kallis is a hair’s breadth behind Tendulkar, with Lara and Dravid completing the top 5. Gilchrist might seem like a strange pick amongst the very best of the last 2 decades, but as I said before, it’s a completely personal choice with no quantitative criteria behind the list.

The “Not out” factor
The problem with using the average is that it is easily inflated by “Not Outs”. So if Lara scores a 400 and finishes not out, it can boost his average by almost 2 runs across his career. There was a phase in which Tendulkar scored nearly 700 runs without getting dismissed. Yet very rarely does a batsman staying not out make a critical difference to the fortunes of the team. Would it have made any difference to Team India if Tendulkar had got out for 241 against Australia in Sydney? So let’s look at the averages after discounting the “Not outs”: the average becomes nothing but the total number of runs divided by the number of times the batsman has ventured out to bat. Let’s call this Average 2.
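The difference between the two averages is easy to sketch in Python. The career figures below are hypothetical round numbers chosen for illustration, and the function name `both_averages` is my own.

```python
def both_averages(runs, innings, not_outs):
    """Return (Average 1, Average 2) for a career.

    Average 1 is the conventional batting average: runs per dismissal.
    Average 2 discounts the not outs: runs per innings batted.
    """
    dismissals = innings - not_outs
    average_1 = runs / dismissals
    average_2 = runs / innings
    return average_1, average_2

# Hypothetical career: 5000 runs from 110 innings, 10 of them not out.
avg1, avg2 = both_averages(runs=5000, innings=110, not_outs=10)
print(round(avg1, 2), round(avg2, 2))  # 50.0 45.45
```

Ten unbeaten innings are enough to open up a gap of more than four runs between the two averages in this made-up career.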

Average 2
Lara 51.52
Ponting 49.78
Sehwag 49.05
Sachin 48.98
Pietersen 48.85
Gavaskar 47.30
Hayden 46.88
Kallis 46.50
Dravid 46.21
Gilchrist 40.66

Oh…what a difference this makes. It disturbs the order of almost the entire list; only Gavaskar (sixth) and Gilchrist (tenth) keep their places. Lara jumps from fourth place to first, and that despite a 400 N.O. against his name! Sehwag climbs four positions, from 7th to 3rd. Kallis moves down five positions, from 3rd to 8th.

Let’s quantify the “Not Out” factor for each of these batsmen and see who the biggest beneficiary of it is:

The “Not Out” factor
Kallis 8.16
Gilchrist 6.95
Ponting 6.54
Dravid 6.18
Sachin 5.75
Hayden 3.86
Gavaskar 3.82
Pietersen 1.97
Sehwag 1.77
Lara 1.37

There you go: Kallis has nearly a good tail-ender’s average added to his Average 2, courtesy of his unfinished innings. Gilchrist’s record as a batsman is also well enhanced by the number of not outs in his career. Look at the bottom of the list: it’s not at all surprising to find Sehwag there, but it is Lara who enjoys the least benefit from not being dismissed. The amazing aspect of this list is that four of the top five batsmen bat at either 3 or 4 (in fact, except Tendulkar, they all bat at no. 3) and have still finished “Not out” with sizeable scores – oh boy, do these guys hate getting dismissed!
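The “Not Out” factor is simply Average 1 minus Average 2. As a rough check, the Python sketch below recomputes it from the rounded figures published in the two tables above; because those inputs are already rounded to two decimals, the last digit can differ slightly from the table (Ponting comes out at 6.53 here rather than 6.54). The variable names are my own.

```python
# Figures copied from the Average 1 and Average 2 tables in this post.
average_1 = {
    "Ponting": 56.31, "Sachin": 54.73, "Kallis": 54.66, "Lara": 52.89,
    "Dravid": 52.39, "Gavaskar": 51.12, "Sehwag": 50.82, "Pietersen": 50.82,
    "Hayden": 50.74, "Gilchrist": 47.61,
}
average_2 = {
    "Lara": 51.52, "Ponting": 49.78, "Sehwag": 49.05, "Sachin": 48.98,
    "Pietersen": 48.85, "Gavaskar": 47.30, "Hayden": 46.88, "Kallis": 46.50,
    "Dravid": 46.21, "Gilchrist": 40.66,
}

# Runs added to a batsman's average purely by his unbeaten innings.
not_out_factor = {name: average_1[name] - average_2[name] for name in average_1}

# Biggest beneficiary first.
ranked = sorted(not_out_factor, key=not_out_factor.get, reverse=True)
for name in ranked:
    print(f"{name:<10}{not_out_factor[name]:5.2f}")
```

The ordering it prints matches the table, with Kallis at the top and Lara at the bottom.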

Having seen both statistics, it’s fair to say that each is fair and unfair in its own way. So let’s use both of them to compute the consistency index and see what difference it makes to the final analysis.

Risk – Variability of expected value
Let’s look at our measurement of risk for each of these batsmen – Standard Deviation.

Standard Deviation
Gilchrist 43.66
Kallis 44.19
Pietersen 48.43
Dravid 48.51
Gavaskar 50.08
Hayden 50.57
Ponting 51.01
Sachin 52.41
Lara 62.37
Sehwag 62.77

If I have to go by what the renowned lateral thinking guru said, then Gilchrist is the least risky or, in other words, the most consistent batsman of this lot – now then, can you believe it? Is Tendulkar a less consistent batsman than Gilchrist? Even Gilchrist would dismiss it as a joke! But look at the last 2 names at the bottom: can you disagree with that? So Standard Deviation as a standalone measure also offers some interesting insights. But let’s refine it further; let’s see what the consistency index tells us…

Consistency Index (Average/Standard Deviation)

Using Average 1

Consistency Index – 1
Kallis 1.24
Ponting 1.10
Gilchrist 1.09
Dravid 1.08
Pietersen 1.05
Sachin 1.04
Gavaskar 1.02
Hayden 1.00
Lara 0.85
Sehwag 0.81

Using Average 2

Consistency index – 2
Kallis 1.05
Pietersen 1.01
Ponting 0.98
Dravid 0.95
Gavaskar 0.94
Sachin 0.93
Gilchrist 0.93
Hayden 0.93
Lara 0.83
Sehwag 0.78

Who would you want to bat for your life? If you believe in numbers, you will not look beyond Jacques Kallis – take “Not outs”, discount “Not outs”, do whatever, but Kallis stands tall in the consistency index, towering over the rest. And is anyone surprised by the 2 names at the bottom at all? It seems to be an apt reflection of their style of batting – high risk, high return! Except for Gilchrist, the positions of the rest of the batsmen don’t change much whether you factor in the “Not outs” or not. Gilchrist seems to be enjoying the privilege of being number 7 in the batting order, with too many “Not out” innings in his kitty.
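For anyone who wants to check the two consistency tables, the sketch below recomputes both from the rounded averages and Standard Deviations published above (Average divided by Standard Deviation). It is only a re-derivation from this post's own tables, not the original spreadsheet, and the helper name `consistency_table` is hypothetical.

```python
# Figures copied from the tables in this post.
average_1 = {
    "Ponting": 56.31, "Sachin": 54.73, "Kallis": 54.66, "Lara": 52.89,
    "Dravid": 52.39, "Gavaskar": 51.12, "Sehwag": 50.82, "Pietersen": 50.82,
    "Hayden": 50.74, "Gilchrist": 47.61,
}
average_2 = {
    "Lara": 51.52, "Ponting": 49.78, "Sehwag": 49.05, "Sachin": 48.98,
    "Pietersen": 48.85, "Gavaskar": 47.30, "Hayden": 46.88, "Kallis": 46.50,
    "Dravid": 46.21, "Gilchrist": 40.66,
}
std_dev = {
    "Gilchrist": 43.66, "Kallis": 44.19, "Pietersen": 48.43, "Dravid": 48.51,
    "Gavaskar": 50.08, "Hayden": 50.57, "Ponting": 51.01, "Sachin": 52.41,
    "Lara": 62.37, "Sehwag": 62.77,
}

def consistency_table(averages):
    """Return (name, average / standard deviation) pairs, most consistent first."""
    index = {name: averages[name] / std_dev[name] for name in averages}
    return sorted(index.items(), key=lambda item: item[1], reverse=True)

index_1 = consistency_table(average_1)  # using Average 1
index_2 = consistency_table(average_2)  # using Average 2

for label, table in (("Consistency Index 1", index_1), ("Consistency Index 2", index_2)):
    print(label)
    for name, value in table:
        print(f"  {name:<10}{value:.2f}")
```

Run as is, it reproduces both orderings, with Kallis first and Sehwag last in each.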

This completes the first part of the two-part analysis; let’s get a little more wacky in the second part, which I promise will be a much shorter one.

Note: All the data used for the analysis are as of March 18, 2009.
