
Random Variables

Assigning numbers to outcomes — expected value, variance, and the language of uncertainty


Why This Matters

A coin flip gives heads or tails. A die roll gives a number from 1 to 6. But in data science and engineering, we need to do math with outcomes — add them, average them, measure their spread. A random variable bridges the gap by assigning a number to each outcome in a sample space. Once outcomes become numbers, we can compute averages, measure variability, and build statistical models.

The expected value (also called the mean) tells you the long-run average of a random variable. The variance tells you how spread out the values are around that average. Together, these two quantities summarize the center and spread of a probability distribution. Machine learning loss functions, financial risk models, and quality control charts all rely on expected value and variance.

Visual Model

Sample Space S: e.g., {1, 2, 3, 4, 5, 6}
Random Variable X: maps outcomes to numbers
P(X = x): probability mass function
E[X] = sum of x*P(X=x): expected value (mean)
Var(X) = E[(X - mu)^2]: variance (spread)
SD(X) = sqrt(Var(X)): standard deviation

The full process at a glance.

A random variable maps outcomes to numbers. Expected value measures the center. Variance and standard deviation measure the spread.
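As a worked illustration of the formulas above, here is the arithmetic for a fair six-sided die, assuming each face has probability 1/6. The symbol \mu is shorthand for E[X].

```latex
\begin{align*}
\mu = E[X] &= \sum_{x=1}^{6} x \cdot \tfrac{1}{6} = \tfrac{21}{6} = 3.5 \\
\mathrm{Var}(X) &= \sum_{x=1}^{6} (x - 3.5)^2 \cdot \tfrac{1}{6}
  = \tfrac{6.25 + 2.25 + 0.25 + 0.25 + 2.25 + 6.25}{6}
  = \tfrac{17.5}{6} = \tfrac{35}{12} \approx 2.9167 \\
\mathrm{SD}(X) &= \sqrt{\tfrac{35}{12}} \approx 1.7078
\end{align*}
```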

Code Example

// Random variable: fair die
const values = [1, 2, 3, 4, 5, 6];
const probs = [1/6, 1/6, 1/6, 1/6, 1/6, 1/6];

// Expected value: E[X] = sum of x * P(X=x)
function expectedValue(vals, ps) {
  let sum = 0;
  for (let i = 0; i < vals.length; i++) {
    sum += vals[i] * ps[i];
  }
  return sum;
}

const mu = expectedValue(values, probs);
console.log("E[X] =", mu.toFixed(4)); // 3.5000

// Variance: Var(X) = E[(X - mu)^2]
function variance(vals, ps, mean) {
  let sum = 0;
  for (let i = 0; i < vals.length; i++) {
    sum += Math.pow(vals[i] - mean, 2) * ps[i];
  }
  return sum;
}

const v = variance(values, probs, mu);
console.log("Var(X) =", v.toFixed(4)); // 2.9167
console.log("SD(X) =", Math.sqrt(v).toFixed(4)); // 1.7078

// Simulate: roll die 10000 times, check average
let total = 0;
for (let i = 0; i < 10000; i++) {
  total += Math.floor(Math.random() * 6) + 1;
}
console.log("Simulated mean:", (total / 10000).toFixed(3));

// Weighted coin: P(H)=0.7, X=1 for H, X=0 for T
const coinVals = [1, 0];
const coinProbs = [0.7, 0.3];
console.log("E[biased coin]:", expectedValue(coinVals, coinProbs)); // 0.7

Interactive Experiment

Try these exercises:

  • Compute E[X] for a loaded die where P(6) = 1/2 and each other face has probability 1/10. Is the expected value higher than 3.5?
  • Create a random variable for two coin flips where X = number of heads. List the PMF and compute E[X] and Var(X).
  • Simulate 10,000 die rolls and compute the sample mean and sample variance. How close are they to the theoretical values?
  • For a random variable with values -1, 0, and 1 (each with equal probability), what are E[X] and Var(X)?
  • Show that Var(X) = E[X^2] - (E[X])^2 by computing both sides for the fair die.
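For the simulation exercise, one possible starting point is sketched below. The helper names (`simulateDie`, `sampleMean`, `sampleVariance`) and the optional `rng` parameter are assumptions made for this sketch, not part of the lesson's code. The sample variance here divides by n - 1, which is one common convention; dividing by n gives nearly the same value for large n.

```javascript
// Sketch: estimate the mean and variance of a fair die by simulation.
// `rng` is a hypothetical parameter (defaults to Math.random) so that
// a fixed generator can be supplied when repeatable results are needed.
function simulateDie(n, rng = Math.random) {
  const rolls = [];
  for (let i = 0; i < n; i++) {
    rolls.push(Math.floor(rng() * 6) + 1); // integer from 1 to 6
  }
  return rolls;
}

// Arithmetic mean of the observed values
function sampleMean(xs) {
  return xs.reduce((acc, x) => acc + x, 0) / xs.length;
}

// Sample variance using the n - 1 denominator
function sampleVariance(xs) {
  const m = sampleMean(xs);
  const sumSq = xs.reduce((acc, x) => acc + (x - m) ** 2, 0);
  return sumSq / (xs.length - 1);
}

const rolls = simulateDie(10000);
console.log("Sample mean:", sampleMean(rolls).toFixed(3));
console.log("Sample variance:", sampleVariance(rolls).toFixed(3));
```

Results vary from run to run because the rolls are random. Comparing them with the theoretical E[X] and Var(X) is the point of the exercise.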

Coding Challenge

Expected Value and Variance

Write two functions: `computeEV(values, probs)` that returns the expected value of a random variable given arrays of values and probabilities, and `computeVariance(values, probs)` that returns the variance. Round both results to 4 decimal places. Hint: first compute the mean, then use it to compute variance as E[(X-mean)^2].

Real-World Usage

Random variables and their properties appear throughout engineering and science:

  • Machine learning: Training typically minimizes the average loss over the training data, which estimates E[loss]. The variance of the loss across samples or mini-batches indicates how noisy the training signal is.
  • Finance: Stock returns are modeled as random variables. E[return] is the expected return, and SD(return) is a common measure of risk. The Sharpe ratio is (E[return] - risk-free rate)/SD(return).
  • Quality control: Manufacturing processes use expected value and variance to set tolerance ranges. If Var is too high, the process is inconsistent.
  • Game theory: Expected value drives rational decision-making. A gamble with positive expected value is favorable in the long run, provided it can be repeated many times.
  • Network engineering: Packet arrival times are random variables. E[delay] gives average latency, and Var(delay) measures jitter.
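To make the finance item concrete, the sketch below estimates a Sharpe-style ratio from a handful of observed returns. The return values are invented purely for illustration, the risk-free rate is assumed to be 0, and the helper names (`mean`, `sampleSD`, `sharpeRatio`) are hypothetical. In practice E[return] and SD(return) are unknown and are estimated from data like this.

```javascript
// Hypothetical per-period returns, made up for this example only
const hypotheticalReturns = [0.01, -0.02, 0.015, 0.003, -0.005];

// Sample mean: estimates E[return]
function mean(xs) {
  return xs.reduce((acc, x) => acc + x, 0) / xs.length;
}

// Sample standard deviation (n - 1 denominator): estimates SD(return)
function sampleSD(xs) {
  const m = mean(xs);
  const sumSq = xs.reduce((acc, x) => acc + (x - m) ** 2, 0);
  return Math.sqrt(sumSq / (xs.length - 1));
}

// Excess return per unit of risk; riskFreeRate defaults to 0 (an assumption)
function sharpeRatio(returns, riskFreeRate = 0) {
  return (mean(returns) - riskFreeRate) / sampleSD(returns);
}

console.log("Estimated mean return:", mean(hypotheticalReturns).toFixed(4));
console.log("Estimated SD:", sampleSD(hypotheticalReturns).toFixed(4));
console.log("Sharpe-style ratio:", sharpeRatio(hypotheticalReturns).toFixed(4));
```

The same `mean` and `sampleSD` helpers would apply to the other items in the list, for example to a set of measured packet delays.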
