Why This Matters
Roll a die once and you get any number from 1 to 6 with equal probability: that is a flat, uniform distribution, not a bell curve at all. But roll 100 dice and take their average, and something remarkable happens: that average will almost always land close to 3.5. Repeat the experiment thousands of times and the histogram of those averages closely follows a Normal (bell-shaped) curve. This is the Central Limit Theorem (CLT), arguably the most important result in all of statistics.
The CLT says that the sampling distribution of the mean approaches a Normal distribution as the sample size grows, regardless of the shape of the original distribution, provided the observations are independent and the population has a finite variance. The standard error (the standard deviation of this sampling distribution) shrinks as the sample size increases, specifically as sigma / sqrt(n), where sigma is the population standard deviation and n is the sample size. This is why larger samples give more precise estimates. The CLT is the foundation of confidence intervals, hypothesis tests, and polling accuracy.
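As a worked check of the sigma / sqrt(n) formula, the sketch below derives the population mean and standard deviation of a fair six-sided die from their definitions and prints the theoretical standard error for a few sample sizes. The helper names `dieStats` and `standardError` are illustrative and not part of any library.

```javascript
// Population mean and SD of a fair six-sided die, computed from the definition.
function dieStats() {
  const faces = [1, 2, 3, 4, 5, 6];
  const mean = faces.reduce((a, b) => a + b, 0) / faces.length;
  // Population variance: average squared deviation from the mean.
  const variance = faces.reduce((a, x) => a + (x - mean) ** 2, 0) / faces.length;
  return { mean, sd: Math.sqrt(variance) };
}

// Theoretical standard error of the mean of n independent draws.
function standardError(sigma, n) {
  return sigma / Math.sqrt(n);
}

const die = dieStats();
console.log(`mean=${die.mean}, sd=${die.sd.toFixed(3)}`);
for (const n of [5, 30, 100]) {
  console.log(`n=${n}: SE=${standardError(die.sd, n).toFixed(3)}`);
}
```

Because n sits under a square root, cutting the standard error in half takes four times as many observations.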
Define Terms
Visual Model
The CLT: no matter what shape the population has, sample means form a bell curve. Larger samples make the bell curve narrower.
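A rough way to see this shape without a plotting library is a text histogram. The sketch below is one possible approach, not the lesson's own visualisation: it bins simulated means of die rolls over the range 1 to 6 and draws each bin as a row of `#` characters. The function names and the choice of 20 bins are assumptions made for illustration.

```javascript
// Roll one fair six-sided die.
function rollOnce() {
  return Math.floor(Math.random() * 6) + 1;
}

// Mean of n die rolls.
function meanOfRolls(n) {
  let sum = 0;
  for (let i = 0; i < n; i++) sum += rollOnce();
  return sum / n;
}

// Count how many of numSamples sample means fall into each of `bins`
// equal-width bins spanning [1, 6], the full range a die mean can take.
function histogramOfMeans(n, numSamples, bins) {
  const counts = new Array(bins).fill(0);
  const width = (6 - 1) / bins;
  for (let i = 0; i < numSamples; i++) {
    const m = meanOfRolls(n);
    // Clamp so that a mean of exactly 6 lands in the last bin.
    const idx = Math.min(bins - 1, Math.floor((m - 1) / width));
    counts[idx]++;
  }
  return counts;
}

// Render the counts as rows of '#', scaled so the tallest bin is maxWidth wide.
// Each row is labelled with the left edge of its bin.
function renderHistogram(counts, maxWidth) {
  const peak = Math.max(...counts);
  const width = 5 / counts.length;
  return counts
    .map((c, i) => {
      const left = (1 + i * width).toFixed(2);
      const bar = "#".repeat(Math.round((c / peak) * maxWidth));
      return `${left} | ${bar}`;
    })
    .join("\n");
}

for (const n of [1, 2, 30]) {
  console.log(`n=${n}`);
  console.log(renderHistogram(histogramOfMeans(n, 5000, 20), 40));
}
```

With n=1 the bars are roughly equal spikes at the six faces; by n=30 they should pile up in a narrow hump around 3.5.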
Code Example
// Demonstrate the Central Limit Theorem
// Population: uniform die rolls (not Normal at all)
function rollDie() {
  return Math.floor(Math.random() * 6) + 1;
}
// Take a sample of size n and return the mean
function sampleMean(n) {
  let sum = 0;
  for (let i = 0; i < n; i++) {
    sum += rollDie();
  }
  return sum / n;
}
// Generate numSamples sample means, each from a sample of size sampleSize
function cltDemo(sampleSize, numSamples) {
  const means = [];
  for (let i = 0; i < numSamples; i++) {
    means.push(sampleMean(sampleSize));
  }
  // Compute mean and SD of the sample means
  const avg = means.reduce((a, b) => a + b, 0) / means.length;
  const variance = means.reduce((a, b) => a + (b - avg) ** 2, 0) / means.length;
  const sd = Math.sqrt(variance);
  return { mean: avg, sd: sd };
}
// Population: die has mean 3.5, SD = 1.708
console.log("Population mean: 3.5, SD: 1.708");
// CLT with increasing sample sizes
const n5 = cltDemo(5, 10000);
console.log(`n=5: mean=${n5.mean.toFixed(3)}, SE=${n5.sd.toFixed(3)} (theory: ${(1.708/Math.sqrt(5)).toFixed(3)})`);
const n30 = cltDemo(30, 10000);
console.log(`n=30: mean=${n30.mean.toFixed(3)}, SE=${n30.sd.toFixed(3)} (theory: ${(1.708/Math.sqrt(30)).toFixed(3)})`);
const n100 = cltDemo(100, 10000);
console.log(`n=100: mean=${n100.mean.toFixed(3)}, SE=${n100.sd.toFixed(3)} (theory: ${(1.708/Math.sqrt(100)).toFixed(3)})`);
Interactive Experiment
Try these exercises:
- Simulate the CLT with a highly skewed distribution (e.g., exponential: -Math.log(1 - Math.random()), which avoids taking the log of 0). Does the distribution of the sample mean still approach Normal as n grows?
- For n=5, n=30, and n=100, compute the theoretical standard error sigma/sqrt(n) and compare to your simulated SE. How close are they?
- At what sample size does the distribution of the sample mean from a uniform population start looking Normal? Try n=2, n=5, n=10, n=30 and compare histograms visually.
- Verify that the mean of the sampling distribution equals the population mean, regardless of n.
- Quadruple the sample size from 25 to 100. By what factor does the standard error decrease? Confirm it is sqrt(4) = 2.
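As a possible starting point for the skewed-population exercise, the sketch below draws sample means from an exponential distribution with rate 1 (population mean 1, SD 1) and summarises them. The function names and the three summary statistics are assumptions chosen for illustration: the simulated SD is compared with 1 / sqrt(n), skewness measures asymmetry (0 for a symmetric distribution), and `within196` is the share of means within 1.96 theoretical standard errors of the population mean, which would be about 0.95 for a Normal distribution.

```javascript
// Draw from an exponential distribution with rate 1 (mean 1, SD 1).
// 1 - Math.random() lies in (0, 1], which avoids taking log(0).
function drawExponential() {
  return -Math.log(1 - Math.random());
}

// Mean of n independent exponential draws.
function exponentialSampleMean(n) {
  let sum = 0;
  for (let i = 0; i < n; i++) sum += drawExponential();
  return sum / n;
}

// Simulate numSamples sample means of size n and summarise their distribution.
function summariseExponentialMeans(n, numSamples) {
  const means = Array.from({ length: numSamples }, () => exponentialSampleMean(n));
  const mean = means.reduce((a, b) => a + b, 0) / numSamples;
  const variance = means.reduce((a, m) => a + (m - mean) ** 2, 0) / numSamples;
  const sd = Math.sqrt(variance);
  // Sample skewness: average cubed standardised deviation.
  const skewness = means.reduce((a, m) => a + ((m - mean) / sd) ** 3, 0) / numSamples;
  // Population SD is 1, so the theoretical standard error is 1 / sqrt(n).
  const theoreticalSE = 1 / Math.sqrt(n);
  const within196 =
    means.filter((m) => Math.abs(m - 1) <= 1.96 * theoreticalSE).length / numSamples;
  return { mean, sd, skewness, theoreticalSE, within196 };
}

for (const n of [2, 5, 30, 100]) {
  const s = summariseExponentialMeans(n, 10000);
  console.log(
    `n=${n}: mean=${s.mean.toFixed(3)}, SE=${s.sd.toFixed(3)} ` +
      `(theory: ${s.theoreticalSE.toFixed(3)}), skew=${s.skewness.toFixed(2)}, ` +
      `within 1.96 SE=${s.within196.toFixed(3)}`
  );
}
```

The exponential population has skewness 2, and the mean of n such draws has skewness 2 / sqrt(n), so the printed skewness should drift toward 0 as n grows even though it never reaches it exactly.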
Quick Quiz
Coding Challenge
Write a function `simulateCLT(sampleSize, numSamples)` that: (1) generates numSamples random samples, each of size sampleSize, from a uniform distribution on [1, 6] (integers, like a die), (2) computes the mean of each sample, and (3) returns the mean and standard deviation of those sample means, rounded to 2 decimal places, as a string in the format 'mean,sd'. Use Math.floor(Math.random() * 6) + 1 for die rolls.
Real-World Usage
The Central Limit Theorem is the backbone of modern statistics:
- Polling and surveys: Political polls sample a few thousand people to estimate the opinions of millions. By the CLT, the sample proportion is approximately Normal around the true proportion, so a random sample's margin of error is about 1.96 standard errors at the 95% confidence level.
- A/B testing: When comparing two website variants, the average metrics (click rate, revenue) follow approximate Normal distributions due to the CLT, enabling z-tests and confidence intervals.
- Quality control: Manufacturing uses control charts based on the CLT. Sample means of batch measurements should fall within known bands if the process is stable.
- Finance: Portfolio returns are weighted averages of many asset returns. The CLT is often cited to motivate Normal-based risk models even when individual asset returns are not Normal, although correlated and heavy-tailed returns can make that approximation poor.
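- Machine learning: Stochastic gradient descent averages gradients over mini-batches. The same sigma / sqrt(n) scaling explains why larger batch sizes give more stable gradient estimates, and the CLT suggests the noise in those estimates is approximately Normal.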
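The polling and A/B testing cases above can be sketched with two small formulas that rely on the Normal approximation. This is a minimal illustration under stated assumptions: simple random sampling, a 95% confidence level (z = 1.96), and a pooled standard error for the two-proportion z statistic. The function names and all input numbers are hypothetical.

```javascript
// Approximate 95% margin of error for a sample proportion, using the
// Normal approximation: 1.96 * sqrt(p * (1 - p) / n).
function marginOfError(pHat, n) {
  return 1.96 * Math.sqrt((pHat * (1 - pHat)) / n);
}

// Two-proportion z statistic with a pooled standard error,
// a common choice when comparing conversion rates in an A/B test.
function twoProportionZ(successA, nA, successB, nB) {
  const pA = successA / nA;
  const pB = successB / nB;
  const pooled = (successA + successB) / (nA + nB);
  const se = Math.sqrt(pooled * (1 - pooled) * (1 / nA + 1 / nB));
  return (pB - pA) / se;
}

// Hypothetical poll: 520 of 1000 respondents favour a candidate.
const pHat = 520 / 1000;
console.log(`poll: ${pHat} +/- ${marginOfError(pHat, 1000).toFixed(3)}`);

// Hypothetical A/B test: 200 of 2000 visitors click variant A, 250 of 2000 click B.
console.log(`z = ${twoProportionZ(200, 2000, 250, 2000).toFixed(2)}`);
```

A z statistic beyond roughly 1.96 in absolute value is conventionally read as a difference that is significant at the 5% level for a two-sided test.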