
Conditional Probability and Bayes Theorem

How new evidence updates beliefs — the core of Bayesian reasoning


Why This Matters

Imagine a medical test for a rare disease that is 99% accurate. If you test positive, what is the probability you actually have the disease? Most people guess 99%, but the real answer depends on how rare the disease is. This is the domain of conditional probability — the probability of one event given that another event has occurred. Without it, you will make catastrophically wrong decisions about data.

Bayes theorem provides the formula for updating beliefs when new evidence arrives. You start with a prior belief (how likely the disease is before testing), observe evidence (the test result), and compute a posterior belief (how likely the disease is after testing). This prior-to-posterior update is the foundation of spam filters, medical diagnosis, search engines, and modern machine learning. Master it and you will think more clearly about evidence and uncertainty.


Visual Model

  • Prior P(H): belief before evidence
  • Evidence E: new data observed
  • Likelihood P(E|H): how likely E is if H is true
  • Marginal P(E): total probability of E
  • Bayes Theorem: P(H|E) = P(E|H) * P(H) / P(E)
  • Posterior P(H|E): updated belief after evidence
  • Disease example: P(disease | positive test)

The full process at a glance.

Bayes theorem updates your prior belief using evidence to produce a posterior. Even highly accurate tests can mislead when the prior is low.

Code Example

// Bayes Theorem: P(H|E) = P(E|H) * P(H) / P(E)
function bayesTheorem(priorH, likelihoodEgivenH, marginalE) {
  return (likelihoodEgivenH * priorH) / marginalE;
}

// Medical test example
const pDisease = 0.001;        // Prior: 1 in 1000
const pHealthy = 1 - pDisease;  // 0.999
const pPosGivenDisease = 0.99;  // Sensitivity (true positive rate)
const pPosGivenHealthy = 0.01;  // False positive rate

// Marginal: total probability of positive test
const pPositive = pPosGivenDisease * pDisease + pPosGivenHealthy * pHealthy;
console.log("P(positive):", pPositive.toFixed(5)); // 0.01098

// Posterior: P(disease | positive test)
const posterior = bayesTheorem(pDisease, pPosGivenDisease, pPositive);
console.log("P(disease|positive):", posterior.toFixed(4)); // 0.0902
console.log("Only about 9% chance of disease despite 99% accurate test!");

// Conditional probability: P(A|B) = P(A and B) / P(B)
function conditionalProb(pAandB, pB) {
  return pAandB / pB;
}

// Example: deck of cards
// P(King | Face card) = P(King and Face) / P(Face)
const pKingAndFace = 4/52;  // Kings are face cards
const pFace = 12/52;        // J, Q, K of each suit
console.log("P(King|Face):", conditionalProb(pKingAndFace, pFace).toFixed(4)); // 0.3333
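As a cross-check, the same card probability falls out of Bayes' theorem directly: since every king is a face card, P(Face|King) = 1, and the two formulas must agree. A minimal sketch (variable names chosen to avoid clashing with the code above):

```javascript
// Cross-check: P(King|Face) = P(Face|King) * P(King) / P(Face)
const pKing = 4 / 52;       // prior: draw any king
const pFaceGivenKing = 1;   // every king is a face card
const pFaceCard = 12 / 52;  // 12 face cards in a deck

const viaBayes = (pFaceGivenKing * pKing) / pFaceCard;
console.log("P(King|Face) via Bayes:", viaBayes.toFixed(4)); // 0.3333
```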

Interactive Experiment

Try these exercises:

  • Change the disease prevalence to 1 in 100 (P(disease)=0.01). How does the posterior change? What about 1 in 10?
  • If the test accuracy improves to 99.9% (false positive drops to 0.1%), recompute P(disease|positive) with the original 1/1000 prevalence.
  • Compute P(rain today | cloudy) if P(cloudy|rain)=0.9, P(rain)=0.3, and P(cloudy)=0.5.
  • Why does a second positive test dramatically increase the posterior? Compute it by using the first posterior as the new prior.
  • Simulate the medical test scenario: generate 100,000 people, apply the disease rate and test accuracy, and count what fraction of positive tests are true positives.
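The simulation exercise can be sketched with a simple Monte Carlo loop; exact counts will vary from run to run, but the fraction should land near the analytic posterior of about 0.09:

```javascript
// Monte Carlo check of the medical-test example
const N = 100000;
const pDiseaseRate = 0.001;    // 1 in 1000 prevalence
const sensitivity = 0.99;      // true positive rate
const falsePosRate = 0.01;     // false positive rate

let positives = 0;
let truePositives = 0;
for (let i = 0; i < N; i++) {
  const sick = Math.random() < pDiseaseRate;
  const testsPositive = Math.random() < (sick ? sensitivity : falsePosRate);
  if (testsPositive) {
    positives++;
    if (sick) truePositives++;
  }
}
console.log("Fraction of positives that are true:",
            (truePositives / positives).toFixed(3)); // typically near 0.090
```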

Quick Quiz

Coding Challenge

Apply Bayes Theorem

Write a function `bayesPosterior(prior, likelihood, falsePositiveRate)` that computes the posterior probability P(H|E). The arguments are: `prior` = P(H), `likelihood` = P(E|H), and `falsePositiveRate` = P(E|not H). First compute the marginal P(E) = likelihood * prior + falsePositiveRate * (1 - prior), then return the posterior. Round the result to 4 decimal places.


Real-World Usage

Conditional probability and Bayes theorem are used everywhere:

  • Spam filters: Naive Bayes classifiers compute P(spam | words) by combining the prior probability of spam with the likelihood of seeing each word in spam vs. legitimate emails.
  • Medical diagnosis: Doctors must account for disease prevalence (base rate) when interpreting test results. Rare diseases with accurate tests still produce many false positives.
  • Search engines: Search ranking uses Bayesian methods to estimate P(relevant | query, document) based on click data and content signals.
  • Machine learning: Bayesian neural networks, Gaussian processes, and probabilistic programming all use Bayes theorem to update model parameters given training data.
  • Criminal justice: DNA evidence interpretation requires Bayesian reasoning. A 1-in-a-million DNA match means different things depending on the suspect pool size (the prior).
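The spam-filter idea above can be sketched as a toy naive Bayes score over individual words. All probabilities below are invented for illustration; a real filter would estimate them from a labeled corpus:

```javascript
// Toy naive Bayes: P(spam|words) is proportional to P(spam) * product of P(word|spam)
// All numbers are made up for illustration.
const pSpamPrior = 0.4;
const pHamPrior = 0.6;
const wordLikelihood = {
  spam: { winner: 0.05, meeting: 0.005 },
  ham:  { winner: 0.001, meeting: 0.05 },
};

function spamPosterior(words) {
  let spamScore = pSpamPrior;
  let hamScore = pHamPrior;
  for (const w of words) {
    spamScore *= wordLikelihood.spam[w];
    hamScore *= wordLikelihood.ham[w];
  }
  return spamScore / (spamScore + hamScore); // normalize to get P(spam|words)
}

console.log(spamPosterior(["winner"]).toFixed(3));  // 0.971
console.log(spamPosterior(["meeting"]).toFixed(3)); // 0.062
```

The normalization step plays the role of dividing by the marginal P(E): the two unnormalized scores are summed, so the class probabilities add to 1.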

Connections