Ravi is an armchair futurist and an aspiring mad scientist. His mission is to create simplicity out of complexity and order out of chaos.

## Friday, June 15, 2012

### On Weak and Strong Laws of Large Numbers

#### Introduction

In statistics, we use the mean calculated from a sample as an estimate for the mean of the population. For example, if the average height of a random sample of a thousand people from a region is 6 feet, then we estimate that the average height of all people in that region is 6 feet. Why does this work? The weak and strong laws of large numbers provide the theoretical basis. Stated below are the two laws and the difference between them, which justifies their names.

#### Notation

1. Let $X_1$, $X_2$, $\cdots$, $X_n$ be independent and identically distributed random variables. (In layman's terms, $X_1$ is the first observation, $X_2$ is the second, and so on.)
2. Let $M_n$ be a random variable denoting the mean of $X_1$, $X_2$, $\cdots$, $X_n$. In other words, $M_n=\frac{1}{n}\sum_{i=1}^{n}X_i$. So $M_n$ is the mean of the sample.
3. Let $\mu$ be the mean of each of $X_1$, $X_2$, $\cdots$, $X_n$. In other words, $\mathbf{E}(X_i)=\mu$ for each $i$. So $\mu$ is the mean of the population (usually unknown, which is why we want to estimate it!).
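To make the notation concrete, here is a small NumPy sketch (the population parameters are assumed values, chosen to echo the height example above):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population of heights: true mean mu = 6 feet, sd = 0.3 feet.
mu, sigma, n = 6.0, 0.3, 1000

# X_1, ..., X_n: i.i.d. observations drawn from the population.
X = rng.normal(mu, sigma, size=n)

# M_n = (1/n) * sum of X_i: the sample mean.
M_n = X.mean()
print(M_n)
```

With a sample of a thousand people, $M_n$ lands very close to $\mu$, which is exactly what the two laws below make precise.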

#### Weak Law of Large Numbers

This law states that for any $\epsilon>0$,
$\lim_{n\to\infty}\mathbf{P}(|M_n-\mu|>\epsilon)=0$

Interpretation
• For large values of $n$ (i.e. $n>n_0$ for some $n_0$), the probability that the value of $M_n$ (the sample mean) differs from the population mean $\mu$ by more than any given number $\epsilon$ becomes arbitrarily small.
• Alternatively, in the limit, all probability is concentrated in an $\epsilon$-interval around $\mu$.
• Alternatively, for large samples, the sample mean is, with high probability, within an $\epsilon$-neighborhood of the population mean. (This is convergence in probability, not almost-sure convergence.)
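We can watch the weak law in action with a quick simulation (a hypothetical sketch, not from the original post: Uniform(0,1) observations, so $\mu=0.5$, and an arbitrary $\epsilon=0.05$). For each sample size $n$, we estimate $\mathbf{P}(|M_n-\mu|>\epsilon)$ by drawing many independent samples:

```python
import numpy as np

rng = np.random.default_rng(1)
mu, eps, trials = 0.5, 0.05, 2000  # Uniform(0,1) observations have mean 0.5

probs = []
for n in (10, 100, 1000):
    # "trials" independent samples of size n; each row mean is one draw of M_n.
    M = rng.random((trials, n)).mean(axis=1)
    # Empirical estimate of P(|M_n - mu| > eps).
    probs.append(np.mean(np.abs(M - mu) > eps))
print(probs)
```

The estimated probability shrinks toward 0 as $n$ grows, exactly as the limit in the weak law promises.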

#### Strong Law of Large Numbers

This law states that
$\mathbf{P}(\lim_{n\to\infty}M_n=\mu)=1$

Interpretation

• With probability 1, the sequence of sample means $M_n$ converges to the population mean $\mu$ as $n\to\infty$.
• Alternatively, in the limit, all probability is concentrated at $\mu$.
• Alternatively, almost surely, the sample mean gets, and stays, arbitrarily close to the population mean as the sample grows.
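The strong law is a statement about a single sequence of observations, so it can be illustrated with one long run (again a hypothetical Uniform(0,1) sketch, $\mu=0.5$): the running means $M_1, M_2, \cdots$ of that one sequence home in on $\mu$.

```python
import numpy as np

rng = np.random.default_rng(2)
mu, n = 0.5, 100_000

# One long sequence X_1, X_2, ..., X_n of Uniform(0,1) draws.
X = rng.random(n)

# Running means M_1, M_2, ..., M_n of that single sequence.
M = np.cumsum(X) / np.arange(1, n + 1)
print(abs(M[-1] - mu))
```

Note the contrast with the weak-law simulation above: there we drew many fresh samples per $n$; here a single trajectory of running means converges.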

#### Difference between the two laws

• The strong law is stronger than the weak law because it controls the whole sequence of sample means at once: it asserts that $M_n$ converges to $\mu$ itself, while the weak law only controls deviations larger than a fixed $\epsilon>0$, one value of $n$ at a time.
• Per the strong law, in the limit all probability is concentrated at $\mu$, while per the weak law it is only guaranteed to concentrate in the interval $(\mu-\epsilon,\mu+\epsilon)$ for each fixed $\epsilon>0$.
• The strong law implies that, with probability 1, $|M_n-\mu|>\epsilon$ for only finitely many values of $n$. In other words, along almost every sequence $X_1$, $X_2$, $\cdots$, the sample mean eventually enters the $\epsilon$-neighborhood of $\mu$ and never leaves it again. Now that is a very strong statement!
• The weak law, by itself, allows $|M_n-\mu|>\epsilon$ to occur for infinitely many values of $n$, as long as the probability of it occurring at any particular large $n$ is small. This is clearly weaker than the previous statement.
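The "only finitely many excursions" reading of the strong law can be checked empirically (a hypothetical sketch with Uniform(0,1) draws, $\mu=0.5$, $\epsilon=0.05$): along one long trajectory of running means, we count how many $M_n$ fall outside $(\mu-\epsilon,\mu+\epsilon)$ and record the last time it happens.

```python
import numpy as np

rng = np.random.default_rng(3)
mu, eps, n = 0.5, 0.05, 100_000

# One trajectory of running means M_1, ..., M_n.
X = rng.random(n)
M = np.cumsum(X) / np.arange(1, n + 1)

# Which M_n fall outside the eps-neighborhood of mu, and the last such n.
outside = np.abs(M - mu) > eps
last_exit = int(np.max(np.nonzero(outside)[0])) + 1 if outside.any() else 0
print(outside.sum(), last_exit)
```

All the excursions happen early; after `last_exit`, the trajectory stays inside the $\epsilon$-neighborhood for the rest of the run, matching the strong law's guarantee.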