File Systems

How the OS organizes and stores data on disk

Why This Matters

Nearly every program you write interacts with files. Configuration files, log files, databases, images, even your source code: all of it is stored in a file system. The file system is how the operating system organizes data on disk into a hierarchy of directories and files that humans and programs can navigate.

But a file system is more than just a folder tree. Under the hood, the OS tracks which files each process has open and hands the process a file descriptor (a small integer) for each one. Directory entries map filenames to inodes, and each inode records the file's metadata and where its data lives on disk. Understanding file systems helps you write reliable I/O code, avoid data corruption, set proper permissions, and debug issues like "too many open files" errors that crash production servers.
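To make file descriptors and inodes concrete, here is a minimal Node.js sketch. It assumes a writable temporary directory; the scratch file name `example.txt` is hypothetical and exists only for this illustration.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical scratch file inside a fresh temporary directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "fd-demo-"));
const file = path.join(dir, "example.txt");
fs.writeFileSync(file, "hello");

// openSync returns a file descriptor: a small integer the OS uses
// to identify this open file within the current process.
const fd = fs.openSync(file, "r");
console.log("file descriptor:", fd);

// fstatSync reads the metadata stored for the open file.
const info = fs.fstatSync(fd);
console.log("inode number:", info.ino);
console.log("size in bytes:", info.size);
console.log("hard link count:", info.nlink);

// Every open descriptor counts against the per-process limit,
// so close it as soon as it is no longer needed.
fs.closeSync(fd);

// Clean up the scratch directory.
fs.rmSync(dir, { recursive: true, force: true });
```

Descriptors that are opened and never closed are a common cause of "too many open files" errors in long-running processes.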

Visual Model

[Diagram: a directory tree rooted at / with the top-level directories /home, /etc, and /usr. Other entries shown: alice/, bob/, readme.txt, code/, nginx.conf, bin/.]

The file system is a tree of directories and files. Under the hood, directory entries map names to inodes, and inodes record where each file's data lives on disk.
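One way to see that a name and an inode are separate things is to give one file two names with a hard link. The sketch below assumes a file system that supports hard links (typical on Linux and macOS); the names `file1` and `file2` are placeholders.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

const dir = fs.mkdtempSync(path.join(os.tmpdir(), "inode-demo-"));
const original = path.join(dir, "file1");
const linked = path.join(dir, "file2");

fs.writeFileSync(original, "same data");
// linkSync adds a second directory entry that points at the same inode.
fs.linkSync(original, linked);

const a = fs.statSync(original);
const b = fs.statSync(linked);
console.log("same inode:", a.ino === b.ino);
console.log("link count:", a.nlink);

// Removing one name leaves the data reachable through the other name.
fs.unlinkSync(original);
const remaining = fs.readFileSync(linked, "utf-8");
console.log("still readable:", remaining);

// Clean up the scratch directory.
fs.rmSync(dir, { recursive: true, force: true });
```

The data is only freed once the last name is removed and no process still holds the file open.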

Code Example

const fs = require("fs");
const path = require("path");

// Reading a file (async)
fs.readFile("data.txt", "utf-8", (err, content) => {
  if (err) {
    // Report failures (missing file, no permission) on stderr
    console.error("Error:", err.message);
    return;
  }
  console.log("Content:", content);
});

// Reading a file (promises)
const fsPromises = require("fs").promises;

async function readConfig() {
  try {
    const data = await fsPromises.readFile("config.json", "utf-8");
    return JSON.parse(data);
  } catch (err) {
    console.error("Failed to read config:", err.message);
    return {};
  }
}

// Writing a file
fs.writeFileSync("output.txt", "Hello, file system!");

// Listing directory contents
const files = fs.readdirSync(".");
console.log("Files in current directory:", files);

// Path utilities
console.log(path.join("/home", "alice", "code"));  // /home/alice/code
console.log(path.resolve("./src")); // absolute path
console.log(path.extname("app.js")); // .js

// Streams: reading large files efficiently
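const stream = fs.createReadStream("large-file.txt", "utf-8");
stream.on("data", (chunk) => {
  // With an encoding set, each chunk is a string, so length counts characters
  console.log("Read chunk:", chunk.length, "characters");
});
stream.on("end", () => console.log("Done reading"));
// Without an error handler, a missing file would crash the process
stream.on("error", (err) => console.error("Stream error:", err.message));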

Interactive Experiment

Try these exercises:

  • In a terminal, run ls -li to see inode numbers for files in a directory. Create a hard link with ln file1 file2 and compare their inode numbers.
  • Write a script that creates a file, reads it back, and deletes it. Use try/catch to handle the case where the file does not exist.
  • Run ulimit -n to see the soft limit on open file descriptors for each process started from your shell. What happens if a program tries to open more files than this limit?
  • Compare reading a large file all at once (readFile) vs streaming it chunk by chunk (createReadStream). Which uses less memory?
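As a possible starting point for the create, read, and delete exercise, the sketch below treats "file does not exist" as an expected case and lets every other error propagate. The helper name `readIfExists` and the file name `note.txt` are hypothetical.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Returns the file's text, or null when the file does not exist.
function readIfExists(filePath) {
  try {
    return fs.readFileSync(filePath, "utf-8");
  } catch (err) {
    if (err.code === "ENOENT") return null; // missing file is expected
    throw err; // permission errors and the like should not be hidden
  }
}

const scratchDir = fs.mkdtempSync(path.join(os.tmpdir(), "exercise-"));
const target = path.join(scratchDir, "note.txt");

fs.writeFileSync(target, "temporary contents");
const beforeDelete = readIfExists(target);
fs.unlinkSync(target);
const afterDelete = readIfExists(target);

console.log("before delete:", beforeDelete);
console.log("after delete:", afterDelete);

// Clean up the scratch directory.
fs.rmSync(scratchDir, { recursive: true, force: true });
```

Checking `err.code` rather than catching everything keeps real problems, such as a permissions failure, visible.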

Coding Challenge

Word Frequency Counter

Write a function called `wordFrequency` that takes a string of text and returns an object (dictionary) mapping each lowercase word to its count. Words should be split on whitespace and converted to lowercase. Punctuation attached to words should be stripped (remove all non-letter characters from the start and end of each word). Ignore empty strings after stripping.

Real-World Usage

File systems are foundational to all software:

  • Databases like PostgreSQL and SQLite store data as files on disk. Understanding file I/O helps you optimize database performance.
  • Docker uses layered file systems (OverlayFS) where each image layer is a read-only filesystem stacked on top of the previous one, with a thin writable layer added for each running container.
  • Log management tools like logrotate manage log files by rotating, compressing, and deleting old files to prevent disk exhaustion.
  • Build tools like Webpack and Vite read source files, process them, and write output files. File watching (for example, with fs.watch) triggers rebuilds on change.
  • Cloud storage services like S3 are object stores rather than file systems, but their APIs (put/get objects in buckets) feel file-system-like even though the underlying storage is distributed.
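The rotation step that log tools perform can be sketched in a few lines. This is only an illustration of the rename idea, not how logrotate itself is implemented: the helper `rotateIfLarge`, the `.1` suffix, and the 1024-byte threshold are all assumptions, and compression and deletion of old files are left out.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: rename logPath to logPath + ".1" once it
// reaches maxBytes, so that new writes start a fresh file.
function rotateIfLarge(logPath, maxBytes) {
  let size;
  try {
    size = fs.statSync(logPath).size;
  } catch (err) {
    if (err.code === "ENOENT") return false; // nothing to rotate yet
    throw err;
  }
  if (size < maxBytes) return false;
  fs.renameSync(logPath, logPath + ".1"); // replaces any older .1 file
  return true;
}

const logDir = fs.mkdtempSync(path.join(os.tmpdir(), "logs-"));
const logFile = path.join(logDir, "app.log");

fs.appendFileSync(logFile, "x".repeat(2048)); // simulate a large log
const rotated = rotateIfLarge(logFile, 1024);
fs.appendFileSync(logFile, "fresh entry\n"); // recreates app.log

const names = fs.readdirSync(logDir).sort();
console.log("rotated:", rotated, "files:", names);

// Clean up the scratch directory.
fs.rmSync(logDir, { recursive: true, force: true });
```

A rename within one file system only changes directory entries, so it is fast regardless of how large the log has grown.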
