networking · 25 min

Load Balancing

Distributing traffic across multiple servers for reliability and performance

Why This Matters

A single server can only handle so many requests. When your app goes from 100 users to 100,000, one machine is not enough. A load balancer sits in front of a group of servers and distributes incoming traffic across them. If one server goes down, the load balancer routes traffic to the healthy ones. If traffic spikes, you add more servers behind the same load balancer.

Nearly every large-scale web application uses load balancing. When you visit a popular website, your request typically passes through at least one load balancer before it reaches an application server. Understanding how this works is essential for building systems that stay fast and available under real-world traffic.

Visual Model

[Diagram: clients send requests to a load balancer, which forwards them to Server 1 (healthy) and Server 2 (healthy). Requests to Server 3 (down) are blocked.]

A load balancer distributes requests across healthy backend servers and removes unhealthy ones from rotation.

Code Example

// Simple round-robin load balancer logic
class RoundRobinBalancer {
  constructor(servers) {
    this.servers = servers;
    this.currentIndex = 0;
  }

  getNextServer() {
    const server = this.servers[this.currentIndex];
    this.currentIndex = (this.currentIndex + 1) % this.servers.length;
    return server;
  }
}

const balancer = new RoundRobinBalancer([
  "server-1:8080",
  "server-2:8080",
  "server-3:8080"
]);

// Simulate 6 requests
for (let i = 0; i < 6; i++) {
  console.log("Request " + (i+1) + " -> " + balancer.getNextServer());
}
// Request 1 -> server-1:8080
// Request 2 -> server-2:8080
// Request 3 -> server-3:8080
// Request 4 -> server-1:8080 (wraps around)
// Request 5 -> server-2:8080
// Request 6 -> server-3:8080

// With health checks
class HealthAwareBalancer {
  constructor(servers) {
    this.servers = servers.map(s => ({ address: s, healthy: true }));
    this.currentIndex = 0;
  }

  markUnhealthy(address) {
    const server = this.servers.find(s => s.address === address);
    if (server) server.healthy = false;
  }

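  getNextServer() {
    // Only servers currently marked healthy are eligible
    const healthy = this.servers.filter(s => s.healthy);
    if (healthy.length === 0) return null; // nothing available to serve the request
    const server = healthy[this.currentIndex % healthy.length];
    // Wrap the counter so it stays in range instead of growing without bound
    this.currentIndex = (this.currentIndex + 1) % healthy.length;
    return server.address;
  }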
}

const lb = new HealthAwareBalancer(["s1", "s2", "s3"]);
lb.markUnhealthy("s2");
console.log(lb.getNextServer()); // s1
console.log(lb.getNextServer()); // s3
console.log(lb.getNextServer()); // s1
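
The health-aware balancer above can only mark a server unhealthy, and it does so on a single call. As a minimal sketch of how recovery might work, the class below flips a server's state only after a streak of consecutive probe results. The class name `HealthTracker`, the method names, and the `failThreshold` and `riseThreshold` values are assumptions made for this example, not part of the lesson's code.

```javascript
// Sketch: change a server's health state only after a streak of probe results.
// failThreshold and riseThreshold are illustrative defaults (assumptions).
class HealthTracker {
  constructor(addresses, { failThreshold = 3, riseThreshold = 2 } = {}) {
    this.failThreshold = failThreshold;
    this.riseThreshold = riseThreshold;
    this.state = new Map(
      addresses.map(a => [a, { healthy: true, failures: 0, successes: 0 }])
    );
  }

  // Record the outcome of one health probe (ok = true means the probe passed).
  recordCheck(address, ok) {
    const s = this.state.get(address);
    if (!s) return; // unknown server, ignore
    if (ok) {
      s.failures = 0;
      // Count successes only while the server is out of rotation
      if (!s.healthy && ++s.successes >= this.riseThreshold) {
        s.healthy = true;
        s.successes = 0;
      }
    } else {
      s.successes = 0;
      // Count failures only while the server is in rotation
      if (s.healthy && ++s.failures >= this.failThreshold) {
        s.healthy = false;
        s.failures = 0;
      }
    }
  }

  // Addresses currently eligible to receive traffic, in configured order.
  healthyServers() {
    return [...this.state.entries()]
      .filter(([, s]) => s.healthy)
      .map(([address]) => address);
  }
}

const tracker = new HealthTracker(["s1", "s2", "s3"]);
// Three failed probes in a row take s2 out of rotation
tracker.recordCheck("s2", false);
tracker.recordCheck("s2", false);
tracker.recordCheck("s2", false);
console.log(tracker.healthyServers().join(", ")); // s1, s3
// Two passing probes in a row bring it back
tracker.recordCheck("s2", true);
tracker.recordCheck("s2", true);
console.log(tracker.healthyServers().join(", ")); // s1, s2, s3
```

Requiring a streak in both directions keeps one slow response from ejecting a server, and keeps a server that is still flapping from rejoining too early. How the probe itself is performed (an HTTP request, a TCP connect, and so on) is left out of the sketch.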

Interactive Experiment

Try these exercises:

  • Modify the round-robin balancer to handle weighted servers (e.g., server-1 gets twice as many requests as server-2).
  • Simulate a server going down mid-rotation. How does the health-aware balancer adjust?
  • Implement a "least connections" strategy: track active request counts and always pick the server with the fewest.
  • What happens if all servers fail health checks? How should the balancer handle this?
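
As one possible starting point for the "least connections" exercise, the sketch below tracks in-flight requests per server and picks the lowest count. The class name `LeastConnectionsBalancer` and the `acquire`/`release` method names are hypothetical, chosen only for this example, and ties are broken by configured order.

```javascript
// Sketch of a least-connections picker.
// acquire/release are hypothetical method names chosen for this example.
class LeastConnectionsBalancer {
  constructor(addresses) {
    // Map of server address -> number of requests currently in flight
    this.active = new Map(addresses.map(a => [a, 0]));
  }

  // Pick the server with the fewest in-flight requests and count the new one.
  acquire() {
    let best = null;
    for (const [address, count] of this.active) {
      // Strict "<" means the earliest server wins a tie
      if (best === null || count < this.active.get(best)) best = address;
    }
    if (best === null) return null; // no servers configured
    this.active.set(best, this.active.get(best) + 1);
    return best;
  }

  // Call when a request finishes so the server's count drops again.
  release(address) {
    const count = this.active.get(address);
    if (count > 0) this.active.set(address, count - 1);
  }
}

const lc = new LeastConnectionsBalancer(["s1", "s2", "s3"]);
console.log(lc.acquire()); // s1 (all tied at 0, first wins)
console.log(lc.acquire()); // s2
console.log(lc.acquire()); // s3
lc.release("s2");          // s2 finishes its request
console.log(lc.acquire()); // s2 (the only server with 0 active)
```

Unlike round-robin, this strategy depends on every request calling `release` when it completes. A missed release makes a server look permanently busy, so real implementations usually tie the decrement to connection close or a timeout.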

Coding Challenge

Round-Robin Load Balancer

Write a function called `createBalancer` that takes an array of server names and returns a function. Each time the returned function is called, it should return the next server in round-robin order, cycling back to the beginning after reaching the end. For example, with servers ['a', 'b', 'c'], successive calls return 'a', 'b', 'c', 'a', 'b', ...

Real-World Usage

Load balancing is everywhere in production infrastructure:

  • Cloud providers: AWS Elastic Load Balancing, Google Cloud Load Balancing, and Azure Load Balancer are managed services that handle millions of requests per second for applications worldwide.
  • Nginx and HAProxy: Open-source load balancers used by Netflix, GitHub, and Airbnb to distribute traffic across server fleets.
  • Kubernetes: Services load-balance traffic across pods, and Ingress controllers (installed separately) route external HTTP traffic to those Services.
  • CDNs: Content delivery networks like Cloudflare use global load balancing to route users to the nearest edge server.
  • Database replicas: Read queries are load-balanced across multiple read replicas while writes go to the primary.
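
The database replica case in the list above can be sketched as a small router that sends writes to the primary and rotates reads across replicas. This is a simplified illustration under stated assumptions: the class name `ReadWriteRouter`, the host names, and the rule "anything that is not a SELECT is a write" are all invented for this example.

```javascript
// Sketch: send writes to the primary and rotate reads across replicas.
// Host names and the isWrite heuristic are assumptions for illustration.
class ReadWriteRouter {
  constructor(primary, replicas) {
    this.primary = primary;
    this.replicas = replicas;
    this.nextReplica = 0;
  }

  // Treat anything that does not start with SELECT as a write (deliberately crude).
  isWrite(sql) {
    return !/^\s*select\b/i.test(sql);
  }

  route(sql) {
    // Writes always go to the primary; so do reads when no replicas exist
    if (this.isWrite(sql) || this.replicas.length === 0) return this.primary;
    // Reads rotate across replicas in round-robin order
    const replica = this.replicas[this.nextReplica];
    this.nextReplica = (this.nextReplica + 1) % this.replicas.length;
    return replica;
  }
}

const router = new ReadWriteRouter("db-primary", ["db-replica-1", "db-replica-2"]);
console.log(router.route("SELECT * FROM users"));          // db-replica-1
console.log(router.route("INSERT INTO users VALUES (1)")); // db-primary
console.log(router.route("SELECT * FROM orders"));         // db-replica-2
console.log(router.route("SELECT 1"));                     // db-replica-1
```

The sketch ignores replication lag, so a read that must see a write made moments earlier may need to go to the primary instead. It also misclassifies statements such as locking reads, which is why the SQL check is labeled crude.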
