##### Central Limit Theorem

- Assume a population with mean μ, standard deviation σ, and any probability distribution. Consider an n-member random sample from the population. The *Central Limit Theorem* says that, as n increases, the probability distribution of the sample mean approaches a normal distribution with mean μ and standard deviation σ/√n.
  - The normal distribution is the *sampling distribution* for the mean.
  - The standard deviation σ/√n is the *standard error* for the mean.
- Example
  - Consider a population with
    - μ = 69
    - σ = 11.75
  - The probability distribution of the mean of a 100-member sample is approximated by a normal distribution with mean = 69 and standard deviation = 11.75 / √100 = 1.175
  - Probabilities can be calculated from the distribution, e.g.:
    - the probability is 0.95 that the mean of a 100-member random sample is between 66.7 and 71.3.
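
The arithmetic in this example can be checked with a short script. The following is a minimal Python sketch using only the standard library; the variable names are illustrative and are not part of the original page.

```python
from math import sqrt
from statistics import NormalDist

# Population parameters and sample size from the example
mu, sigma, n = 69.0, 11.75, 100

# Standard error of the mean: sigma / sqrt(n)
standard_error = sigma / sqrt(n)

# Approximate sampling distribution of the mean under the CLT
sampling_dist = NormalDist(mu, standard_error)

# Central 95% interval: mu +/- z * standard_error, with z the 97.5th percentile
z = NormalDist().inv_cdf(0.975)
lower = mu - z * standard_error
upper = mu + z * standard_error

# Probability that the sample mean falls between 66.7 and 71.3
prob = sampling_dist.cdf(71.3) - sampling_dist.cdf(66.7)

print(f"standard error = {standard_error:.3f}")
print(f"95% interval   = ({lower:.1f}, {upper:.1f})")
print(f"P(66.7 < mean < 71.3) = {prob:.2f}")
```

Up to rounding, this should reproduce the standard error of 1.175 and the interval from 66.7 to 71.3 quoted in the example.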

##### Monte Carlo Simulation

- In this Monte Carlo simulation:
  - The population is arbitrarily assumed to have a Gamma Distribution with mean 4.0 and standard deviation 2.82843.
  - Random samples are taken from the population with sizes 100, 1000, 10000, and 100000.
  - For each sample size, the calculated standard error for the mean (= population SD / √sample size) is compared with the standard deviation of the means of 500 random samples.
  - For sample size 1000, for example:
    - standard error = 0.0894427
    - standard deviation of the means of 500 random samples = 0.0902963
- As the sample size increases, the standard error decreases and the graphs of the sampling distribution become narrower.
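
The comparison described above can be sketched in a few lines of Python. This is a hedged illustration, not the page's original simulation: the function name `compare`, the seed, and the shape/scale parameterization are assumptions. A gamma distribution with mean 4 and standard deviation √8 ≈ 2.82843 corresponds to shape 2 and scale 2.

```python
import random
from math import sqrt
from statistics import fmean, stdev

# Gamma parameters implied by mean 4.0 and SD 2.82843 (mean = shape * scale,
# variance = shape * scale**2)
SHAPE, SCALE = 2.0, 2.0
POP_SD = sqrt(SHAPE) * SCALE  # about 2.82843


def compare(sample_size, num_samples=500, seed=0):
    """Return (calculated standard error, SD of the simulated sample means)."""
    rng = random.Random(seed)
    means = [
        fmean(rng.gammavariate(SHAPE, SCALE) for _ in range(sample_size))
        for _ in range(num_samples)
    ]
    return POP_SD / sqrt(sample_size), stdev(means)


# Smaller sizes only, to keep the pure-Python run time short
for size in (100, 1000):
    se, sd_of_means = compare(size)
    print(f"n = {size}: standard error = {se:.7f}, SD of 500 means = {sd_of_means:.7f}")
```

The two numbers for each sample size should be close but not identical, since the standard deviation of 500 simulated means is itself a random estimate.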

##### The Theorem

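Let X_{1},…, X_{n} be independent random variables having a common distribution with expectation μ and standard deviation σ. Then for any real numbers *a* and *b*, as n → ∞:

$$P\left(a \le \frac{X_{1} + \cdots + X_{n} - n\mu}{\sigma\sqrt{n}} \le b\right) \to \frac{1}{\sqrt{2\pi}} \int_{a}^{b} e^{-x^{2}/2}\,dx$$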

Image Credit britannica.com/science/probability-theory/The-central-limit-theorem
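
The limiting probability in the theorem is the area under the standard normal density between *a* and *b*. As a small illustration, that area can be computed from the error function. This is a Python sketch, and the function names are hypothetical.

```python
from math import erf, inf, sqrt


def standard_normal_cdf(x: float) -> float:
    """Area under the standard normal density to the left of x."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def limit_probability(a: float, b: float) -> float:
    """Area under the standard normal density between a and b."""
    return standard_normal_cdf(b) - standard_normal_cdf(a)


print(round(limit_probability(0.0, 1.0), 6))  # 0.341345
print(round(limit_probability(1.0, inf), 6))  # 0.158655
```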

- Take an n-member random sample from a population with an arbitrary probability distribution, e.g. the ChiSquare Distribution with mean 69.
- A 20-member random sample, for example, might look like this:
- 31.4131, 62.8372, 63.247, 73.7571, 44.7476, 74.4283, 102.562, 67.8194, 72.8808, 61.512, 81.3126, 80.7713, 45.6086, 75.0849, 65.4396, 50.1927, 53.2385, 72.3477, 55.7154, 56.2447

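- Define the random variable q[n] = (x̄ − μ) / (σ / √n), where x̄ is the mean of the n-member sample.
- n is the sample size and q[n] is the standardized sample mean
- CLT says that as n increases, the distribution of q[n] approaches a normal distribution, specifically the standard normal distribution (with mean 0 and standard deviation 1).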
- To show that q[n] approaches a normal distribution, we use the fact that the probability of a randomly selected number being between numbers *a* and *b* under a continuous distribution is defined as the area under its curve between *a* and *b*.
- Two examples where, as n increases, P(a < q[n] ≤ b) approaches the area under the standard normal distribution between *a* and *b*:
- *Example 1: x between 0 and 1.*
  - According to the CLT, as n increases, P(0 < q[n] ≤ 1) approaches the area under the standard normal curve between 0 and 1.
  - As n increases, P(0 < q[n] ≤ 1) approaches 0.341345
  - In a Monte Carlo simulation of 1000 iterations, the proportion of times q[100000] was between 0 and 1 was 0.346
    - Monte Carlo Simulation Program
      - (* assumed definition of q for the chi-square population with mean 69; the original program does not show it *)
      - q[n_] := (Mean[RandomVariate[ChiSquareDistribution[69], n]] - 69) / Sqrt[138 / n];
      - tally = 0;
      - n = 1000; (* number of iterations *)
      - SeedRandom[RandomInteger[{1, 1000}]];
      - Do[tmp = q[100000];
      - If[tmp > 0 && tmp <= 1, tally = tally + 1], {n}];
      - Print["Proportion of times q[100000] is between 0 and 1 = ", N[tally / n]];
  - The area under the curve between 0 and 1 = 0.341345
    - Integrate[(1/Sqrt[2 Pi]) E^(-x^2 / 2), {x, 0, 1}] = 0.341345

- *Example 2: x ≥ 1.*
  - As n increases, P(q[n] > 1) approaches 0.158655
  - In a Monte Carlo simulation of 1000 iterations, the proportion of times q[100000] ≥ 1 = 0.1544
  - The area under the curve between 1 and ∞ = 0.158655
    - Integrate[(1/Sqrt[2 Pi]) E^(-x^2 / 2), {x, 1, ∞}] = 0.158655

- In general, for any two numbers *a* and *b*, as n increases, the probability that q[n] is between *a* and *b* approaches the area under the standard normal distribution between *a* and *b*. Thus, by definition, q[n] approaches the standard normal distribution.
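
The two examples can be reproduced approximately in Python. This is a scaled-down sketch under stated assumptions, not the page's Wolfram Language program: it uses n = 500 and 2000 iterations so that it runs quickly in pure Python, and it draws chi-square values with 69 degrees of freedom as gamma variates with shape 69/2 and scale 2. The names `q` and `estimate` and the seed are illustrative.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

DF = 69                 # chi-square degrees of freedom, so the mean is 69
MU = float(DF)          # population mean
SIGMA = sqrt(2 * DF)    # population SD, sqrt(138), about 11.75


def q(n, rng):
    """Standardized mean of an n-member chi-square(69) sample."""
    sample_mean = fmean(rng.gammavariate(DF / 2, 2.0) for _ in range(n))
    return (sample_mean - MU) / (SIGMA / sqrt(n))


def estimate(n, iterations, seed=1):
    """Estimate P(0 < q[n] <= 1) and P(q[n] > 1) by simulation."""
    rng = random.Random(seed)
    values = [q(n, rng) for _ in range(iterations)]
    p_between = sum(0 < v <= 1 for v in values) / iterations
    p_above = sum(v > 1 for v in values) / iterations
    return p_between, p_above


p_between, p_above = estimate(n=500, iterations=2000)

# Areas under the standard normal curve, for comparison
std_normal = NormalDist()
area_between = std_normal.cdf(1) - std_normal.cdf(0)
area_above = 1 - std_normal.cdf(1)

print(f"P(0 < q <= 1): simulated {p_between:.4f}, normal area {area_between:.6f}")
print(f"P(q > 1):      simulated {p_above:.4f}, normal area {area_above:.6f}")
```

With only 2000 iterations, the simulated proportions carry sampling error of roughly ±0.01 to ±0.02, so agreement with the normal areas to about two decimal places is what should be expected.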