Agile Java Man: Great expectations

Monday, February 12, 2018

Great expectations

Since this took me a few hours to sort out in my head, I thought I would blog it.

Bessel's correction is used when we take a sample with mean x̄ and variance s² from a population whose true mean is μ and true variance is σ².

The standard deviation, s, is:

√(Σ_iⁿ(x_i-x̄)²/n)

but we'd be wrong if we thought we could use its square, s², as an estimate for σ². (We actually divide by n-1 rather than n.)

Note that sometimes we don't need to estimate anything about the population. In the case of a country conducting a census, we have all the information. However, for most use cases we can only sample the country's population.

[Aside: Interestingly, “such a basic concept as standard deviation, with an apparently impeccable mathematical pedigree, is socially constructed and a product of history” Leeds University].

The reasoning goes like this.

The expected value for x̄ is E[x̄] = μ because every x_i has expected value μ and the expectation of an average is the average of the expectations.

The variance of x̄ is:

E[(x̄ - μ)²] - (E[x̄ - μ])²

also by definition.

The second term is clearly zero when we expand it and substitute in the expected value for x̄ we've just stated above.

We then expand the first term and take the expected value of all the resulting terms. However, we have to be careful here as E[A.B] = E[A].E[B] is only guaranteed when A and B are independent.

To illustrate, the expected value of the product of rolling a pair of fair 6-sided dice is (3.5)², which is 12.25. But the expected value of rolling just one die and squaring the result is (1² + 2² + ... + 6²)/6, which is about 15.17. Clearly they do not have the same probability distribution as, for example, p(12) is non-zero in the first case but zero in the second (as you can't get a 12 by squaring an integer).
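The dice comparison can be checked exactly with a few lines of Python. This is only a sketch; the variable names are mine, and it simply enumerates every equally likely outcome using exact fractions.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# Expected value of a single fair die.
e_single = Fraction(sum(faces), 6)

# Product of two independent dice: average over all 36 equally likely pairs.
e_product = Fraction(sum(a * b for a, b in product(faces, repeat=2)), 36)

# Square of one die: the two factors are the same variable, so not independent.
e_square = Fraction(sum(a * a for a in faces), 6)

print(float(e_product), float(e_square))
```

Only the independent case factorises into E[A].E[B]; the squared die comes out larger by exactly the variance of a single die.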

So, let's represent the variance of x̄ as:

E[ (Σ_iⁿ(x_i-μ)/n) . (Σ_jⁿ(x_j-μ)/n) ]

For all terms where i≠j, x_i and x_j are independent, so each cross term becomes:

E[(x_i-μ)(x_j-μ)] = E[x_i x_j - x_iμ - x_jμ + μ²] = E[x_i]E[x_j] - 2μ² + μ² = μ² - 2μ² + μ² = 0

but when i=j the two factors are the same variable rather than independent ones, and the n such terms give

E[ Σ_iⁿ(x_i-μ)²/n² ] = Σ_iⁿ E[(x_i-μ)²]/n² = nσ²/n² = σ²/n
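As a sanity check on the σ²/n result, here is a small sketch that uses a fair die as the population (my choice of example, not part of the derivation) and enumerates every possible sample of n rolls. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
mu = Fraction(sum(faces), 6)                      # population mean
sigma2 = sum((f - mu) ** 2 for f in faces) / 6    # population variance

def variance_of_sample_mean(n):
    """Exact E[(x̄ - μ)²] over all 6**n equally likely samples of n rolls."""
    total = Fraction(0)
    for sample in product(faces, repeat=n):
        mean = Fraction(sum(sample), n)
        total += (mean - mu) ** 2
    return total / 6 ** n

for n in (1, 2, 3, 4):
    print(n, variance_of_sample_mean(n), sigma2 / n)
```

For each n the two printed fractions should agree, which is the statement that the variance of x̄ is σ²/n.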

Now, we can re-express our definition of

s²= Σ_iⁿ(x_i-x̄)²/n

by noting that:

(x_i-x̄)²= ((x_i-μ)-(x̄-μ))²
= (x_i-μ)²- 2(x_i-μ)(x̄-μ) + (x̄-μ)²

The expectation of the first term is σ², from our definition of the variance of the whole population. The expectation of the last term is σ²/n, as we've just shown. As for the middle term, we do the same trick as before: writing x̄-μ as Σ_jⁿ(x_j-μ)/n, the products where i≠j have zero expectation, but for the 1/n of cases where i=j the expectation is σ². So the expectation of the middle term must equal -2σ²/n. Adding the three gives σ² - 2σ²/n + σ²/n.
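The three expectations can also be checked term by term. The sketch below again assumes a fair die as the population and a sample size of n = 3 (both arbitrary choices of mine), and takes x_i to be the first roll in each sample.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
mu = Fraction(sum(faces), 6)
sigma2 = sum((f - mu) ** 2 for f in faces) / 6
n = 3

first = cross = last = Fraction(0)
for sample in product(faces, repeat=n):
    d_i = sample[0] - mu                    # x_i - μ, with i fixed as the first roll
    d_bar = Fraction(sum(sample), n) - mu   # x̄ - μ
    first += d_i ** 2
    cross += d_i * d_bar
    last += d_bar ** 2

count = 6 ** n
first, cross, last = first / count, cross / count, last / count

print(first, cross, last)
```

The cross product (x_i-μ)(x̄-μ) has expectation σ²/n, so once the factor of -2 is applied the middle term contributes -2σ²/n.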

Therefore:

E[s²]=(1-1/n)σ²
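To see the bias directly, this sketch computes the exact expected value of the divide-by-n s² over every possible sample of n rolls of a fair die. The die and the function name are assumptions for illustration only.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
mu = Fraction(sum(faces), 6)
sigma2 = sum((f - mu) ** 2 for f in faces) / 6    # true population variance

def expected_s2(n):
    """Exact E[s²], where s² divides by n, over all 6**n samples of n rolls."""
    total = Fraction(0)
    for sample in product(faces, repeat=n):
        mean = Fraction(sum(sample), n)
        total += sum((x - mean) ** 2 for x in sample) / n
    return total / 6 ** n

for n in (2, 3, 4):
    print(n, expected_s2(n), (1 - Fraction(1, n)) * sigma2)
```

The shortfall is largest for small samples: with n = 2 the uncorrected s² averages only half of σ².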

so we can estimate σ² as

σ² ≅ (n/(n-1))s²

So, don't forget your n-1 if you're taking a sample and not calculating values for the whole population.
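In practice a library usually offers both versions. As a sketch, Python's standard statistics module has pvariance (divide by n, for a whole population) and variance (divide by n-1, for a sample). The data values below are made up.

```python
import statistics

sample = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up values, for illustration only
n = len(sample)

biased = statistics.pvariance(sample)     # divides by n
corrected = statistics.variance(sample)   # divides by n - 1

# The two differ by exactly the factor n/(n-1).
print(biased, corrected, biased * n / (n - 1))
```

Use pvariance only when the data really is the entire population, as in the census case.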
