Running out of Disk Space in Production

Source: alt-romes.github.io

Date: April 1, 2026

Tags: devops nix nginx

The Setup

A simple server hosting the digital Kanjideck files (2.2GB downloadable files) on a small Hetzner machine running NixOS, with 4GB of RAM and 40GB of disk space. The architecture: a Haskell program serves the static files with authorization, proxied by nginx.

First, Panic

Within minutes of announcing the files were available, hundreds of customers hit the server. Logs showed repeated errors:

Mar 31 20:43:03 mogbit kanjideck-fulfillment[2528300]: user error (Unexpected reply to: MAIL "<...>", Expected reply code: 250, Got this instead: 452 "4.3.1 Insufficient system storage")

Disk space: 40GB/40GB used. 100% usage on /dev/sda.
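
The 100% figure can be reproduced programmatically. The sketch below is only an illustration and not part of the original setup: it uses Python's standard shutil.disk_usage, and the helper names percent_used and check_disk as well as the 90% threshold are assumptions made up for this example.

```python
import shutil

def percent_used(used, free):
    """Percentage of the space usable by ordinary processes that is taken."""
    if used + free == 0:
        return 0.0
    return 100.0 * used / (used + free)

def check_disk(path="/", threshold=90.0):
    """Return (percent, is_over_threshold) for the filesystem containing path."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    pct = percent_used(usage.used, usage.free)
    return pct, pct >= threshold

if __name__ == "__main__":
    pct, alarm = check_disk("/")
    print(f"/ is {pct:.1f}% full" + (" - time to clean up" if alarm else ""))
```

The ratio used / (used + free) is roughly what df shows in its Use% column, so a full 40GB disk comes out at 100%.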

Initial Triage

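Checking disk usage turned up the two largest culprits.

The first attempt, nix-collect-garbage -d, failed with "no space left on device": the garbage collector itself needs a little free space to run.

Solution: run journalctl --vacuum-time=1s first to clear the logs and free some space, then run the garbage collection again.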
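
The ordering is the important part here, and it can be written down as a tiny script. This is a sketch, not the commands as the author ran them: the two command lines are taken from the text above, while the names CLEANUP_STEPS and run_cleanup and the dry-run default are assumptions for this example.

```python
import subprocess

# Order matters: vacuuming the journal worked even on the full disk,
# and the space it frees is what lets the Nix garbage collector run.
CLEANUP_STEPS = [
    ["journalctl", "--vacuum-time=1s"],
    ["nix-collect-garbage", "-d"],
]

def run_cleanup(dry_run=True):
    """Handle each step in order; with dry_run=False, stop at the first failure."""
    handled = []
    for argv in CLEANUP_STEPS:
        if dry_run:
            print("would run:", " ".join(argv))
        else:
            subprocess.run(argv, check=True)  # raises CalledProcessError on failure
        handled.append(argv)
    return handled

if __name__ == "__main__":
    run_cleanup(dry_run=True)
```

With dry_run=True the script only prints the plan; passing dry_run=False on the server (as root) would actually delete logs and old Nix generations.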

Mounting the Nix Store on a Separate Volume

Hetzner had no larger cloud instance available. Plan B: attach a separate Volume.

The /nix/store is immutable and, at 12GB, was the largest system component, which made it a perfect candidate for migration.

Following the NixOS Wiki instructions worked flawlessly:

fileSystems."/nix" = {
  device = "/dev/disk/by-label/nix";
  fsType = "ext4";
  neededForBoot = true;
  options = [ "noatime" ];
};

Finding the Root Cause in Nginx

The large 2.2GB file download kept stopping halfway through.

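The bug: nginx's proxy_max_temp_file_size defaults to 1024m. This directive caps how much of a proxied response nginx will spool to a temporary file on disk, and the 2.2GB file exceeded that limit.

Fix: set proxy_max_temp_file_size 5000m; so that the whole file fits under the limit.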
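
The numbers can be checked with a little arithmetic. The sketch below assumes that "2.2GB" and "40GB" mean decimal gigabytes, and relies on nginx's convention that the m suffix means 1024 × 1024 bytes. The helper names parse_nginx_size and exceeds_limit are hypothetical, and the parser handles only the k and m suffixes. The last line is a worst-case estimate under the further assumption that every in-flight download is spooled to disk in full.

```python
def parse_nginx_size(text):
    """Convert an nginx size such as '1024m' or '8k' to bytes."""
    units = {"k": 1024, "m": 1024 * 1024}
    suffix = text[-1].lower()
    if suffix in units:
        return int(text[:-1]) * units[suffix]
    return int(text)  # no suffix: plain bytes

def exceeds_limit(file_bytes, limit):
    """True if a response of file_bytes cannot fit in one temp file."""
    return file_bytes > parse_nginx_size(limit)

FILE_BYTES = 2_200_000_000   # assumption: "2.2GB" in decimal gigabytes
DISK_BYTES = 40_000_000_000  # assumption: "40GB" in decimal gigabytes

if __name__ == "__main__":
    print("exceeds default 1024m:", exceeds_limit(FILE_BYTES, "1024m"))
    print("exceeds new 5000m:   ", exceeds_limit(FILE_BYTES, "5000m"))
    # Worst case: how many fully spooled copies fit on an otherwise empty disk.
    print("full copies that fit on disk:", DISK_BYTES // FILE_BYTES)
```

Raising the limit lets one download fit, but each concurrent download may then claim up to the full file size in temporary space, which is worth keeping in mind on a 40GB disk.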

The Hidden Culprit: Deleted Files

During the day, disk space spiked to 100% again. Investigation revealed:

lsof +L1 | grep nginx

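Result: 14.5 GB of deleted files held by nginx! These were unlinked open files: their directory entries were gone, so du -h could not see them, but their blocks stay allocated (and keep counting toward df) until the process holding them closes its file descriptors.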
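
The effect is easy to reproduce on a small scale. The following sketch is an illustration on a Unix-like system and has nothing to do with nginx itself; the function name demo_unlinked_file is made up for this example. It creates a temporary file, deletes its name while keeping the descriptor open, and reports what the kernel still sees.

```python
import os
import tempfile

def demo_unlinked_file(size=1024 * 1024):
    """Delete a file's name while it is still open and inspect the result."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"\0" * size)
        os.unlink(path)        # the name is gone, so du can no longer find it
        info = os.fstat(fd)    # the open descriptor still reaches the data
        return {
            "path_exists": os.path.exists(path),
            "link_count": info.st_nlink,  # 0 links: what lsof +L1 selects
            "size": info.st_size,
        }
    finally:
        os.close(fd)           # only now can the filesystem reclaim the space

if __name__ == "__main__":
    print(demo_unlinked_file())
```

Until the final os.close, the file has no name and a link count of zero, yet it keeps its full size on disk. This is the state the 14.5 GB held by nginx were in.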

Key Lessons