I’ve noticed a lot of people in the HBD-o-sphere, including some very prominent bloggers, don’t know diddlysquat about statistics, so I am going to do a brief series on some basic concepts.
The first is my absolute favorite: the standard deviation, represented by the Greek letter sigma, σ.
I remember learning about this in high school. I don’t know how I learned it, because God knows none of my teachers knew what a standard deviation was, but once I discovered it, I was so excited, because it’s one of the most fascinating concepts in the world.
Most of you know what a mean is. A mean, also known as an average, is simply the sum of a bunch of numbers, divided by the number of numbers.
So a sample of 40 Canadian adults (20 men and 20 women) had the following heights:
Now the mean height of these 40 Canadian adults is about 171 cm.
To determine the standard deviation, you subtract the mean from each of the 40 heights, square each difference, average those squared differences, and take the square root of that average. This number is sometimes adjusted for what’s known as degrees of freedom, but we won’t discuss that now.
The standard deviation is an enormous amount of work to calculate by hand, so you can use this excellent online calculator, which shows the standard deviation is about 9.5 cm.
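For anyone who would rather see the recipe than read it, here is a minimal sketch in Python. The heights below are made-up numbers for illustration only, not the 40-person sample from this post, and the variable names are my own. It also shows the degrees-of-freedom adjustment mentioned above, which just means dividing by n - 1 instead of n.

```python
import math
import statistics

# Hypothetical heights in cm (illustrative only, NOT the post's sample).
heights = [158, 162, 165, 170, 171, 175, 178, 183, 186, 190]

# Mean: the sum of the numbers divided by how many there are.
mean = sum(heights) / len(heights)

# Squared difference of each height from the mean.
squared_diffs = [(h - mean) ** 2 for h in heights]

# Standard deviation as described in the text:
# square root of the average squared difference from the mean.
population_sd = math.sqrt(sum(squared_diffs) / len(heights))

# Degrees-of-freedom version: divide by n - 1 instead of n.
sample_sd = math.sqrt(sum(squared_diffs) / (len(heights) - 1))

# The standard library gives the same two numbers.
assert math.isclose(population_sd, statistics.pstdev(heights))
assert math.isclose(sample_sd, statistics.stdev(heights))

print(f"mean = {mean:.1f} cm")
print(f"SD (divide by n) = {population_sd:.2f} cm")
print(f"SD (divide by n - 1) = {sample_sd:.2f} cm")
```

The adjusted version is always a little larger, and the gap shrinks as the sample grows.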
Why should you care?
Because when the distribution of numbers is normal, the standard deviation has some fascinating properties. Roughly two thirds of the sample (68%) will fall within one standard deviation of the mean, so in our sample, two thirds should fall between 162 cm and 181 cm. In our sample it’s a little more (about 78%).
In a normal distribution, about 95% should fall within two standard deviations of the mean, so in our sample, between 153 cm and 191 cm. In our sample, it’s a bit more than expected (98%). Our sample might not be perfectly normal because we combined men and women together.
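The 68% and 95% figures can be checked directly against the normal curve. The sketch below uses Python’s standard library; the function names share_within and observed_share are my own, and the mean of 171 cm and standard deviation of 9.5 cm are the rounded figures quoted above, so the printed ranges can differ by a centimetre or so from ranges worked out with unrounded numbers.

```python
from statistics import NormalDist


def share_within(k):
    """Theoretical share of a normal distribution within k SDs of the mean."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return std_normal.cdf(k) - std_normal.cdf(-k)


def observed_share(values, mean, sd, k):
    """Fraction of the given values that lie within k SDs of the mean."""
    inside = [v for v in values if abs(v - mean) <= k * sd]
    return len(inside) / len(values)


# Rounded figures from the text (assumed exact here for illustration).
mean_cm, sd_cm = 171, 9.5

for k in (1, 2):
    low, high = mean_cm - k * sd_cm, mean_cm + k * sd_cm
    print(f"within {k} SD: {low:.1f} to {high:.1f} cm, "
          f"expected share {share_within(k):.1%}")
```

Passing a list of real heights to observed_share gives the sample’s own percentages to compare against the theoretical ones.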
Who is the tallest Canadian?
There are about 27 million adults living in Canada. According to the Gaussian curve (commonly known as the bell curve), only about one person in 27 million should fall more than 5.4 standard deviations above the mean, so the highest value in a group of 27 million should be about 5.4 standard deviations above the mean. Since Canadian adults average about 171 cm tall with a standard deviation of 9.5 cm, the tallest Canadian should be:
5.4(9.5) + 171 = 222 cm tall.
Or roughly, seven feet three inches.
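Here is a rough sketch of where the 5.4 comes from, again in Python. It assumes the simple rule of thumb that the tallest of N people sits at the point on the bell curve that only about one person in N should exceed; that is one common approximation to the expected maximum, not the only one. The function name expected_extreme_z is my own, and the mean and standard deviation are the rounded figures from the text.

```python
from statistics import NormalDist


def expected_extreme_z(population):
    """z-score that only about one person in `population` should exceed."""
    return NormalDist().inv_cdf(1 - 1 / population)


# Figures from the text: 27 million adults, mean 171 cm, SD 9.5 cm.
z = expected_extreme_z(27_000_000)
tallest_cm = 171 + z * 9.5

# Convert centimetres to feet and inches (1 inch = 2.54 cm).
total_inches = tallest_cm / 2.54
feet, inches = divmod(total_inches, 12)

print(f"z = {z:.2f}")
print(f"predicted tallest = {tallest_cm:.0f} cm "
      f"(about {int(feet)} ft {inches:.0f} in)")
```

Because the population size enters only through the far tail of the curve, even doubling it moves the prediction by little more than a centimetre.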
The actual tallest Canadian was recently measured at seven feet six inches. So simply by taking the heights of 40 random Canadian adults, we can predict the height of the tallest person in all of Canada within a few inches. That’s why standard deviation is such a brilliant concept.
Of course this only works on variables that are normally distributed, and only works because Canada’s tallest person is part of the biologically normal population. If I were to use this same technique to try to predict the tallest person in world history, I doubt it would work, because people like Robert Wadlow are such freaks of nature that they are outside biologically normal variation. They perhaps had mutations of large effect, which override normal polygenic variation.