distributed systems30 min

Distributed Replication

How distributed systems copy data across nodes for reliability and performance

0/9Not Started

Why This Matters

A single server is a single point of failure. If it crashes, your data is gone. Replication copies data across multiple nodes so that if one fails, others can take over. But replication introduces hard problems: how do you keep copies in sync? What happens when two nodes accept conflicting writes? Which node is in charge?

Every major database -- PostgreSQL, MySQL, MongoDB, Cassandra -- uses replication. Understanding replication models is essential for building reliable systems and choosing the right database for your workload.

Define Terms

Visual Model

LeaderAccepts all writes
Follower 1Read replica
Follower 2Read replica
Follower 3Read replica
Client Write
Client Read
Write
Sync repl.
Async repl.
Async repl.
Read
Read
Read

The full process at a glance. Click Start tour to walk through each step.

Single-leader replication: writes go to the leader, reads can be served by any node.

Code Example

Code
// Simulating single-leader replication

class ReplicatedDatabase {
  constructor(followerCount) {
    this.leader = new Map();
    this.followers = Array.from({ length: followerCount }, () => new Map());
    this.replicationLog = [];
  }

  // All writes go through the leader
  write(key, value) {
    this.leader.set(key, value);
    this.replicationLog.push({ key, value, timestamp: Date.now() });
    console.log(`Leader wrote: ${key} = ${value}`);
  }

  // Replicate to followers (async in real systems)
  replicate() {
    for (const entry of this.replicationLog) {
      for (const follower of this.followers) {
        follower.set(entry.key, entry.value);
      }
    }
    this.replicationLog = [];
    console.log("Replication complete");
  }

  // Read from a follower to spread load
  read(key, nodeIndex = -1) {
    if (nodeIndex === -1) {
      return this.leader.get(key); // Read from leader
    }
    return this.followers[nodeIndex].get(key);
  }
}

// Quorum-based read/write
function quorumWrite(nodes, key, value, quorumSize) {
  let acks = 0;
  for (const node of nodes) {
    node.set(key, value);
    acks++;
    if (acks >= quorumSize) {
      console.log(`Quorum reached: ${acks}/${nodes.length} nodes confirmed`);
      return true;
    }
  }
  return false;
}

const db = new ReplicatedDatabase(2);
db.write("user:1", "Alice");
console.log("Leader read:", db.read("user:1"));
console.log("Follower read:", db.read("user:1", 0));
db.replicate();
console.log("After replication:", db.read("user:1", 0));

Interactive Experiment

Try these modifications and thought exercises:

  • Add a failover method that promotes a follower to leader when the current leader fails. What happens to unreplicated writes?
  • Implement a quorum read that reads from multiple followers and returns the most recent value.
  • Simulate a scenario where the leader crashes after writing locally but before replicating. How much data is lost?
  • Compare sync vs async replication: add a replicateSync method that blocks until all followers confirm.

Quick Quiz

Coding Challenge

Quorum Calculator

Write a function called `quorumConfig` that takes the total number of nodes (N) and returns an object with the minimum write quorum (W) and read quorum (R) such that W + R > N (guaranteeing overlap). Use the common strategy where W = floor(N/2) + 1 and R = floor(N/2) + 1.

Loading editor...

Real-World Usage

Replication is fundamental to every production database and storage system:

  • PostgreSQL streaming replication: A primary server streams WAL (write-ahead log) records to standby replicas for failover and read scaling.
  • MongoDB replica sets: A primary node handles writes while secondaries replicate data and can be promoted if the primary fails.
  • Cassandra (leaderless): Uses quorum-based replication with no single leader. Any node can accept reads and writes.
  • Redis Sentinel: Monitors Redis replicas and automatically promotes a replica to primary if the current primary fails.
  • Cloud storage (S3): Replicates objects across multiple availability zones for durability.

Connections