Mathematics · 30 min

Optimization & Derivatives

Rates of change, slopes, and finding minima — the math behind ML training


Why This Matters

Every time an ML model learns, it is solving an optimization problem: find the parameters that minimize the error. But how do you find the lowest point of a function? The answer is the derivative -- a measure of how fast a function is changing at any given point.

If the derivative is positive, the function is going up. If it is negative, the function is going down. If the derivative is zero, you are at a peak, a valley, or a flat inflection point. Gradient-based optimization uses derivatives to navigate the landscape of a function, descending toward a local minimum. This is not just theory -- it is the actual mechanism that trains every neural network, fits every regression model, and tunes every recommendation algorithm.
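The descent idea can be sketched in a few lines. This is a hypothetical illustration, not any library's API; the learning rate and step count are arbitrary choices:

```javascript
// Gradient descent sketch: repeatedly step against the slope.
// df is the derivative of the function we want to minimize.
function gradientDescent(df, x0, learningRate = 0.1, steps = 100) {
  let x = x0;
  for (let i = 0; i < steps; i++) {
    x -= learningRate * df(x); // positive slope pushes x left, negative pushes right
  }
  return x;
}

// f(x) = x^2 has derivative 2x; its minimum is at x = 0
const xMin = gradientDescent(x => 2 * x, 5);
console.log(xMin.toFixed(4)); // 0.0000
```

Each update shrinks the distance to the minimum by a constant factor here, which is why a modest number of steps suffices.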


Visual Model

  • f(x) = x^2: U-shaped curve
  • Point on curve: (x, f(x))
  • Tangent line: slope = df/dx
  • Derivative: f′(x) = 2x
  • Set f′(x) = 0
  • Minimum: x = 0

The full process at a glance.

The derivative tells you the slope. Where the slope is zero, you find minima and maxima.

Code Example

// Numerical derivative: approximate df/dx
// using the limit definition: (f(x+h) - f(x)) / h
function derivative(f, x, h = 0.0001) {
  return (f(x + h) - f(x)) / h;
}

// f(x) = x^2, derivative should be 2x
const f = x => x * x;
console.log("df(3):", derivative(f, 3).toFixed(2));   // 6.00
console.log("df(0):", derivative(f, 0).toFixed(2));   // 0.00
console.log("df(-2):", derivative(f, -2).toFixed(2)); // -4.00

// g(x) = x^3 - 3x + 1, derivative is 3x^2 - 3
const g = x => x ** 3 - 3 * x + 1;
console.log("dg(0):", derivative(g, 0).toFixed(2));  // -3.00
console.log("dg(1):", derivative(g, 1).toFixed(2));  // 0.00 (local min!)
console.log("dg(-1):", derivative(g, -1).toFixed(2)); // -0.00 (≈ 0: local max!)

// Find minimum numerically: scan for where derivative ~ 0
function findMinimum(f, start, end, steps = 1000) {
  let minX = start, minY = f(start);
  const step = (end - start) / steps;
  for (let i = 0; i <= steps; i++) {
    const x = start + i * step; // index loop avoids floating-point drift from repeated +=
    const y = f(x);
    if (y < minY) { minX = x; minY = y; }
  }
  return { x: +minX.toFixed(4), y: +minY.toFixed(4) };
}

console.log("min of x^2:", findMinimum(f, -5, 5));
console.log("local min of g:", findMinimum(g, 0, 3));

// Partial derivatives: f(x, y) = x^2 + y^2
function partialX(f, x, y, h = 0.0001) {
  return (f(x + h, y) - f(x, y)) / h;
}
function partialY(f, x, y, h = 0.0001) {
  return (f(x, y + h) - f(x, y)) / h;
}

const f2d = (x, y) => x * x + y * y;
console.log("df/dx at (3,4):", partialX(f2d, 3, 4).toFixed(2)); // 6.00
console.log("df/dy at (3,4):", partialY(f2d, 3, 4).toFixed(2)); // 8.00
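The two partial derivatives together form the gradient, which is enough to run descent in two variables. A minimal self-contained sketch (step size and iteration count are arbitrary choices for this illustration):

```javascript
// Gradient descent in two variables using numerical partial derivatives.
const h = 0.0001;
function numericalGradient(f, x, y) {
  return {
    gx: (f(x + h, y) - f(x, y)) / h, // df/dx
    gy: (f(x, y + h) - f(x, y)) / h, // df/dy
  };
}

function gradientDescent2D(f, x0, y0, lr = 0.1, steps = 200) {
  let x = x0, y = y0;
  for (let i = 0; i < steps; i++) {
    const { gx, gy } = numericalGradient(f, x, y);
    x -= lr * gx; // step each coordinate against its partial derivative
    y -= lr * gy;
  }
  return { x, y };
}

// The bowl f(x, y) = x^2 + y^2 has its minimum at the origin
const bowl = (x, y) => x * x + y * y;
const min2d = gradientDescent2D(bowl, 3, 4);
console.log(min2d); // x and y both very close to 0
```

This is exactly the shape of neural-network training, just with two parameters instead of millions.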

Interactive Experiment

Try these exercises:

  • Compute the numerical derivative of f(x) = x^3 at x = 2. The analytical answer is 12. How close is your approximation?
  • Find where the derivative of f(x) = x^2 - 4x + 3 equals zero. This is the minimum. Verify by evaluating f at that point.
  • Try different values of h in the numerical derivative (0.1, 0.01, 0.001, 0.0001). How does precision change?
  • For f(x) = sin(x), compute derivatives at x = 0 and x = pi/2. What do you get?
  • Compute both partial derivatives of f(x, y) = x^2 + y^2 at the origin. Why is (0,0) the minimum?
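As a starting point for the step-size exercise, one way to see the effect of h is to print the forward-difference error against a known analytical answer (here f(x) = x^3 at x = 2, where f′(2) = 12):

```javascript
// Forward-difference error shrinks roughly in proportion to h --
// until h gets so small that floating-point cancellation dominates.
const f3 = x => x ** 3;
const exact = 12; // analytical derivative of x^3 at x = 2 is 3 * 2^2

for (const hStep of [0.1, 0.01, 0.001, 0.0001]) {
  const approx = (f3(2 + hStep) - f3(2)) / hStep;
  console.log(`h=${hStep}: error=${Math.abs(approx - exact).toExponential(2)}`);
}
```

For x^3 the forward-difference error is 6h + h^2, so each factor-of-10 reduction in h cuts the error by roughly a factor of 10.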

Quick Quiz

Coding Challenge

Numerical Minimum Finder

Write a function called `findMin` that takes a function f, a search range [start, end], and finds the x value where f(x) is minimized. Use a simple approach: evaluate f at many evenly spaced points and return the x that gives the smallest f(x). Round x to 2 decimal places.


Real-World Usage

Optimization and derivatives power critical systems:

  • Machine learning training: Every neural network minimizes a loss function using gradient-based optimization. The derivative of the loss tells the model which direction to adjust its weights.
  • Logistics and operations: Companies minimize costs and maximize throughput by solving optimization problems over supply chains and delivery routes.
  • Computer graphics: Ray tracing finds ray-surface intersections with derivative-based root finding such as Newton's method. Animation curves use derivatives for smooth motion.
  • Economics and finance: Option pricing (Black-Scholes) and portfolio optimization rely heavily on calculus and derivatives.
  • Compiler optimization: Optimizing compilers find local minima in the space of possible code transformations.
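The machine-learning bullet can be made concrete with a toy model: fitting a single weight w in y = w·x by descending the derivative of the mean squared error. A hypothetical sketch (the data, learning rate, and step count are made up for illustration):

```javascript
// Fit y = w * x to data by gradient descent on mean squared error.
// The loss derivative is dLoss/dw = (2/n) * sum(x_i * (w * x_i - y_i)).
const xs = [1, 2, 3, 4];
const ys = [2, 4, 6, 8]; // generated with true weight w = 2

let w = 0;
const lr = 0.01;
for (let step = 0; step < 500; step++) {
  let grad = 0;
  for (let i = 0; i < xs.length; i++) {
    grad += 2 * xs[i] * (w * xs[i] - ys[i]); // accumulate per-example gradient
  }
  grad /= xs.length;
  w -= lr * grad; // adjust the weight opposite the loss gradient
}
console.log(w.toFixed(3)); // 2.000
```

The sign of the gradient tells the model which way to move w, and its magnitude shrinks as the fit improves; this is the same loop, scaled up, that trains a neural network.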

Connections