I remember hearing an interview last season with a player from the Brazilian Serie A team Vasco, who said something like this: "We are leading the league in shots, but the goals are not going in. Sooner or later, our luck is going to change."
Unfortunately for Vasco, their "luck" didn't change: they finished tied for second in Serie A in shots, but tied for 13th in goals scored. The casual fan watching might have spotted why: Vasco was bombing a high percentage of its shots from long distance. Stated another way, their shot quality was not good. We've examined shot quality in great detail using multivariate regression analysis, but in this post I am going to look at just one shot quality factor, shot distance, and more specifically at shots inside the 18-yard box.
Looking at the overall data for the entire 2010 season for all 20 teams in Serie A, we see that the Vasco player did have a point. There is an obvious correlation between shots per 90 minutes and goals per 90 minutes: a simple correlation analysis gives a correlation coefficient of 0.63. You can also see that Vasco is way above the trend line, so maybe it was bad luck.

However, the same graph for shots inside the 18-yard box vs. goals per 90 minutes in the 18-yard box looks a bit more tightly correlated and more clearly upward sloping. Running the same correlation with shots in the 18-yard box per 90 minutes and goals per 90 minutes produces a correlation coefficient of around 0.73. So this is a better correlation, but not markedly better. The point here is that most teams are actually pretty similar in the share of their shots that come inside the 18-yard box, so we should expect similar correlations. Vasco was the team that really stood out for a low % of shots in the 18-yard box, with only 30% of its shots coming in the box. And in fact, Vasco is below the line in this graph, meaning they actually outperformed the number of goals that their shots in the 18-yard box would have predicted; so much for *bad* luck. Overall, though, the evidence from this particular analysis for the importance of shots in the 18-yard box is weak at best, partly because the data is aggregated by team and partly because teams are relatively consistent in the % of shots they take inside the 18-yard box.
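For readers who want to run the same kind of check on their own numbers, here is a minimal sketch of a Pearson correlation coefficient in plain Python. The function name `pearson_r` and the two short lists are assumptions made for illustration; the values are invented placeholders, not the 2010 Serie A data, so the result will not match the 0.63 or 0.73 quoted above.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one series is constant")
    return cov / math.sqrt(var_x * var_y)


# Invented placeholder values, one entry per team (NOT real league data).
shots_per_90 = [10.0, 12.0, 14.0, 16.0]
goals_per_90 = [1.0, 1.1, 1.6, 1.5]

print(round(pearson_r(shots_per_90, goals_per_90), 2))
```

With one entry per team, the same function could be run once on all shots and once on shots inside the 18-yard box to compare the two coefficients.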

We dug a little deeper, using our data on individual shots, and the results on the value of a shot inside the 18-yard box vs. outside the 18-yard box are pretty striking. Looking only at foot shots and excluding both penalties and free kicks, we found that 58% of foot shots were taken outside the 18-yard box.
For foot shots taken inside the 18-yard box, the chance of scoring was just over 14%; for foot shots taken outside the 18-yard box, it was only 3.2%. Put quite bluntly, a foot shot inside the 18-yard box is about 4.4 times as likely to score as one outside.
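As a quick check on the arithmetic, here is a small sketch that works only from the three figures quoted above. The constant and function names are assumptions for illustration, and "just over 14%" is rounded to 0.14, so the outputs are approximate.

```python
# Figures quoted in the post (foot shots, excluding penalties and free kicks).
P_GOAL_INSIDE = 0.14    # "just over 14%", rounded down here (assumption)
P_GOAL_OUTSIDE = 0.032  # 3.2%
SHARE_OUTSIDE = 0.58    # 58% of foot shots were taken outside the box


def relative_likelihood(p_inside, p_outside):
    """How many times as likely an inside shot is to score as an outside shot."""
    return p_inside / p_outside


def blended_conversion(share_inside, p_inside, p_outside):
    """Overall scoring rate implied by a given inside/outside shot mix."""
    return share_inside * p_inside + (1.0 - share_inside) * p_outside


def share_of_goals_from_inside(share_inside, p_inside, p_outside):
    """Fraction of all goals that come from shots inside the box."""
    goals_inside = share_inside * p_inside
    return goals_inside / blended_conversion(share_inside, p_inside, p_outside)


share_inside = 1.0 - SHARE_OUTSIDE

print(relative_likelihood(P_GOAL_INSIDE, P_GOAL_OUTSIDE))
print(blended_conversion(share_inside, P_GOAL_INSIDE, P_GOAL_OUTSIDE))
print(share_of_goals_from_inside(share_inside, P_GOAL_INSIDE, P_GOAL_OUTSIDE))
```

Taking the rounded figures at face value, the ratio comes out at about 4.4, the implied overall conversion rate for foot shots is roughly 7.7%, and about three quarters of foot-shot goals come from inside the box, even though only 42% of foot shots are taken there.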

What are the implications of this? We should place more value on passers who consistently create shots inside the 18-yard box, which means statistics on a player's overall shot creation are probably not that meaningful. Assists may be a better measure of this, but if we had a simple measure of shots created inside the 18-yard box, we could also give a passer credit for creating higher-probability shots that his teammates did not score. At the same time, shot selection, in particular the % of shots taken outside the 18-yard box, should be a key indicator we watch at both the player and team level as a measure of shot quality.
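To make the two indicators concrete, here is a rough sketch of how they might be tallied from shot-by-shot records. The `Shot` record layout, the function names, and the sample rows are all hypothetical and invented for illustration; they are not our actual data format or data.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class Shot:
    """Hypothetical shot record; the field names are assumptions."""
    team: str
    shooter: str
    passer: Optional[str]  # None when nobody set up the shot
    inside_box: bool
    scored: bool


def shots_created_by_passer(shots: Iterable[Shot]) -> Dict[str, Dict[str, int]]:
    """Count shots created, shots created inside the box, and assists per passer."""
    totals = defaultdict(lambda: {"created": 0, "created_in_box": 0, "assists": 0})
    for shot in shots:
        if shot.passer is None:
            continue  # unassisted shots give no passer any credit
        row = totals[shot.passer]
        row["created"] += 1
        row["created_in_box"] += int(shot.inside_box)
        row["assists"] += int(shot.scored)
    return dict(totals)


def pct_outside_box(shots: Iterable[Shot]) -> float:
    """Percentage of the given shots that were taken outside the 18-yard box."""
    shots = list(shots)
    if not shots:
        raise ValueError("no shots to summarise")
    outside = sum(1 for s in shots if not s.inside_box)
    return 100.0 * outside / len(shots)


# Invented records for illustration only.
sample = [
    Shot("Team A", "Striker 1", "Passer 1", True, False),
    Shot("Team A", "Striker 1", "Passer 1", True, True),
    Shot("Team A", "Striker 2", "Passer 2", False, False),
    Shot("Team A", "Striker 2", None, False, False),
]

print(shots_created_by_passer(sample))
print(pct_outside_box(sample))
```

Filtering the records by `team` or `shooter` before calling `pct_outside_box` would give the team-level or player-level version of the shot selection indicator, while `created_in_box` credits a passer for a good chance whether or not it was converted.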
Does this mean that shots outside the 18-yard box should never be taken? It really depends on the value of the alternatives to taking a shot compared to the value of the shot itself. We'll examine "how far is too far" in an upcoming post.