Why This Matters
When someone says a website "feels slow," they are usually describing latency -- the time it takes for a single request to travel to the server and back. When an engineering team says their system "can't keep up with traffic," they are describing throughput -- how many requests the system can handle per second. And when someone says "we need a bigger pipe," they mean bandwidth -- the maximum data rate the network connection can carry.
These three metrics are the language of network performance. You cannot diagnose a slow application, plan server capacity, or evaluate a CDN without understanding what each metric measures and how they relate. Every production system has monitoring dashboards tracking latency percentiles, throughput rates, and bandwidth utilization.
Define Terms
Visual Model
- Latency: the round-trip time of a single request.
- Throughput: the number of requests handled per second.
- Bandwidth: the maximum data rate of the link.
Code Example
// Measuring request latency with timestamps
async function measureLatency(url) {
  const start = performance.now();
  const response = await fetch(url);
  await response.text(); // include time to download the full body
  const end = performance.now();
  const latencyMs = end - start;
  console.log(`Latency: ${latencyMs.toFixed(2)} ms`);
  return latencyMs;
}
// Measuring throughput: requests per second
async function measureThroughput(url, numRequests) {
  const start = performance.now();
  const promises = [];
  for (let i = 0; i < numRequests; i++) {
    promises.push(fetch(url));
  }
  await Promise.all(promises);
  const elapsed = (performance.now() - start) / 1000; // seconds
  const throughput = numRequests / elapsed;
  console.log(`Throughput: ${throughput.toFixed(1)} req/s`);
  return throughput;
}
// Calculating percentiles from latency measurements
function percentile(sortedValues, p) {
  const index = Math.ceil((p / 100) * sortedValues.length) - 1;
  return sortedValues[Math.max(0, index)];
}

// Simulated latency measurements (ms)
const latencies = [12, 15, 18, 20, 22, 25, 30, 45, 120, 500];
latencies.sort((a, b) => a - b);
console.log(`P50: ${percentile(latencies, 50)}ms`); // 22ms
console.log(`P90: ${percentile(latencies, 90)}ms`); // 120ms
console.log(`P99: ${percentile(latencies, 99)}ms`); // 500ms
console.log(`Max: ${latencies[latencies.length - 1]}ms`); // 500ms
Interactive Experiment
Try these exercises:
- Run the latency measurement code against different URLs. How does latency change based on server distance?
- Modify the percentile function to handle P95 and P99.9. Why do engineers care about P99.9?
- If a server has 50ms average latency and sustains 100 concurrent requests, what is the maximum theoretical throughput?
- Calculate how long it takes to download a 1 GB file over a 100 Mbps connection. What about a 1 Gbps connection?
Coding Challenge
Write a function called `calculateThroughput` that takes an array of request-response timing pairs (each with a `start` and `end` timestamp in milliseconds) and returns the throughput in requests per second. Throughput is the total number of completed requests divided by the total time window (from the earliest start to the latest end).
Real-World Usage
Latency, throughput, and bandwidth metrics drive engineering decisions everywhere:
- Monitoring dashboards: Tools like Datadog, Grafana, and New Relic display P50, P90, and P99 latency for every API endpoint in real time.
- SLAs and SLOs: Service Level Agreements guarantee specific latency targets (e.g., "P99 latency under 200ms") with financial penalties for violations.
- CDNs: Cloudflare, Akamai, and AWS CloudFront reduce latency by caching content at 200+ edge locations worldwide.
- Database optimization: Slow queries increase latency. Connection pooling and read replicas increase throughput.
- Capacity planning: Engineers use throughput benchmarks to decide how many servers are needed for expected traffic.