Why This Matters
A machine learning model needs a way to measure "how wrong am I?" before it can improve. The loss function is that measurement. It takes the model's prediction and the correct answer, then produces a single number indicating how far off the prediction was.
Without a loss function, a model has no compass. It cannot tell whether one set of parameters is better than another, and training becomes impossible. The choice of loss function shapes what the model optimizes for — and therefore what it learns.
Define Terms
Visual Model
The full process at a glance.
The loss function compares predictions to actual labels, producing a signal that guides parameter updates.
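That feedback loop can be made concrete with a minimal sketch. This is not part of the lesson's code: the data, the learning rate, and the one-parameter model y = w * x are invented for illustration. It shows the loss gradient telling the parameter which way to move.

```javascript
// Minimal sketch: one gradient step on a single parameter w,
// using MSE loss on the model y = w * x.
const x = [1, 2, 3], y = [2, 4, 6]; // true relationship is y = 2x
let w = 0.5;                        // deliberately bad initial guess
const lr = 0.1;                     // learning rate

// Gradient of MSE with respect to w: (2/n) * sum((w*x_i - y_i) * x_i)
let grad = 0;
for (let i = 0; i < x.length; i++) {
  grad += 2 * (w * x[i] - y[i]) * x[i];
}
grad /= x.length;

w -= lr * grad; // the update moves w toward the true value 2
console.log(w); // roughly 1.9 after a single step
```

One step is enough to see the direction: the loss was high because w was too small, so the gradient is negative and the update increases w.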
Code Example
// Mean Squared Error (MSE) - for regression
function mseLoss(predictions, actuals) {
  let sum = 0;
  for (let i = 0; i < predictions.length; i++) {
    sum += (predictions[i] - actuals[i]) ** 2;
  }
  return sum / predictions.length;
}

// Example: predicting house prices (in thousands)
const predicted = [250, 300, 180, 400];
const actual = [260, 290, 200, 410];
console.log("MSE:", mseLoss(predicted, actual));
// MSE: 175 (average squared error)
// Binary Cross-Entropy - for classification
function binaryCrossEntropy(predictions, actuals) {
  let sum = 0;
  for (let i = 0; i < predictions.length; i++) {
    // Clip predictions away from 0 and 1 so Math.log never returns -Infinity
    const p = Math.max(1e-15, Math.min(1 - 1e-15, predictions[i]));
    sum += actuals[i] * Math.log(p) + (1 - actuals[i]) * Math.log(1 - p);
  }
  return -sum / predictions.length;
}

// Example: spam classification (1=spam, 0=not spam)
const spamPred = [0.9, 0.1, 0.8, 0.3];
const spamActual = [1, 0, 1, 0];
console.log("Cross-Entropy:", binaryCrossEntropy(spamPred, spamActual).toFixed(4));
// Cross-Entropy: 0.1976 - low loss means predictions match well
Interactive Experiment
Try these exercises to build intuition:
- Change a prediction in the MSE example to be very wrong (e.g., predict 500 when actual is 260). How does MSE react? Why does squaring matter?
- In the cross-entropy example, change a prediction from 0.9 to 0.5 for a true spam email. How does the loss change?
- What happens to cross-entropy when a prediction is 0.0 for a true label of 1? (Hint: that is why we clip values.)
- Compare MSE vs. Mean Absolute Error (MAE = average of |predicted - actual|). When would you prefer one over the other?
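As a starting point for the first exercise, here is a sketch of what happens when one prediction is wildly wrong. The mseLoss function is repeated from the Code Example so the snippet runs on its own.

```javascript
// Exercise 1 sketch: a single outlier dominates MSE because errors are squared.
function mseLoss(predictions, actuals) {
  let sum = 0;
  for (let i = 0; i < predictions.length; i++) {
    sum += (predictions[i] - actuals[i]) ** 2;
  }
  return sum / predictions.length;
}

const actualPrices = [260, 290, 200, 410];
console.log(mseLoss([250, 300, 180, 400], actualPrices)); // 175
console.log(mseLoss([500, 300, 180, 400], actualPrices)); // 14550
// (500 - 260)^2 = 57600 on its own, so MSE jumps to
// (57600 + 100 + 400 + 100) / 4 = 14550 - one squared term swamps the rest
```

Squaring is the reason: a prediction 24x further off contributes 576x more loss, which is exactly what the MAE comparison in the last exercise probes.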
Quick Quiz
Coding Challenge
Write a function called `maeLoss` that computes the Mean Absolute Error between two arrays of numbers. MAE = (1/n) * sum(|predicted_i - actual_i|). Also write `compareLosses` that takes predictions and actuals, and returns an object with both `mse` and `mae` values.
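Try the challenge yourself before reading on. For reference, one possible solution might look like the sketch below; mseLoss is repeated from the Code Example so the snippet is self-contained.

```javascript
// One possible solution to the coding challenge.
function mseLoss(predictions, actuals) {
  let sum = 0;
  for (let i = 0; i < predictions.length; i++) {
    sum += (predictions[i] - actuals[i]) ** 2;
  }
  return sum / predictions.length;
}

// MAE = (1/n) * sum(|predicted_i - actual_i|)
function maeLoss(predictions, actuals) {
  let sum = 0;
  for (let i = 0; i < predictions.length; i++) {
    sum += Math.abs(predictions[i] - actuals[i]);
  }
  return sum / predictions.length;
}

function compareLosses(predictions, actuals) {
  return {
    mse: mseLoss(predictions, actuals),
    mae: maeLoss(predictions, actuals),
  };
}

console.log(compareLosses([250, 300, 180, 400], [260, 290, 200, 410]));
// { mse: 175, mae: 12.5 }
```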
Real-World Usage
Loss functions are central to training every ML model in production:
- Self-driving cars: Multi-task loss functions combine penalties for incorrect steering, speed, and obstacle detection.
- Language models: Cross-entropy loss on next-token prediction trains models like GPT to generate coherent text.
- Image generation: Perceptual loss functions compare generated images to targets at a feature level, not pixel by pixel.
- Recommendation systems: Ranking losses optimize for the order of recommendations, not just individual predictions.
- Custom objectives: Companies design domain-specific losses — e.g., penalizing false negatives more heavily in medical diagnosis.
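To make the first bullet concrete: a multi-task loss is often just a weighted sum of per-task losses. The sketch below is illustrative only - the task names and weights are invented, not taken from any production system.

```javascript
// Illustrative sketch: combine per-task losses into one training signal.
// Tasks missing from the weights object default to a weight of 1.
function multiTaskLoss(losses, weights) {
  let total = 0;
  for (const task of Object.keys(losses)) {
    total += (weights[task] ?? 1) * losses[task];
  }
  return total;
}

const taskLosses = { steering: 0.5, speed: 0.25, obstacles: 1.0 };
const taskWeights = { steering: 1.0, speed: 2.0, obstacles: 2.0 };
console.log(multiTaskLoss(taskLosses, taskWeights));
// 1.0*0.5 + 2.0*0.25 + 2.0*1.0 = 3
```

The weights encode priorities: doubling the obstacle weight tells the optimizer that obstacle-detection errors cost twice as much as steering errors.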