Click here if you just want instructions.
Apptainer (formerly Singularity) is a container platform designed for scientific computing and high-performance environments. It enables reproducible research by letting researchers package their entire computational environment (operating system, libraries, dependencies, and application code) into a single, portable container image that runs consistently across different computing systems, so workflows produce identical results regardless of the underlying infrastructure.
In practice, the Unity cluster I use primarily supports running custom-packaged software via Singularity (which can wrap Docker images). Docker is often touted as a way to produce reproducible environments, but it doesn't quite achieve that: Docker containers still have hidden external dependencies, non-deterministic build processes, and image drift over time as base images change. Nix, meanwhile, offers stronger reproducibility through purely functional package management, explicit declaration of all dependencies, bit-for-bit reproducible builds, and an immutable store that ensures identical outputs given the same inputs, making it far more reliable for recreating development and runtime environments across different systems.
In my opinion, Nix is the gold standard for reproducible software, and I use it for all of my computers, servers, and development environments. So how do I deploy Nix-built software to a cluster that supports Singularity?
Additionally, it's just nicer to use. A declaratively defined, programmatic system means you are aware of everything running on the system, and the system can be fully reproduced trivially.
Why can't I just copy my software over to the cluster? Let's go over a few failure cases.
The fun part about Nix is that it takes its commitment to reproducibility to the extreme. For example, most executables on Linux use an "interpreter" (the dynamic loader), typically stored at /lib64/ld-linux-x86-64.so.2 or somewhere similar. You can see the path using file my_program. But executables built on Nix?
my_program: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /nix/store/wn7v2vhyyyi6clcyn0s9ixvl7d4d87ic-glibc-2.40-36/lib/ld-linux-x86-64.so.2, for GNU/Linux 3.10.0, with debug_info, not stripped
You're not going to find a /nix/store on most systems, so the majority of executables built with Nix aren't going to be portable to other machines as-is.
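To make that concrete, here is the same check against a distro-provided binary and the Nix-built one on a typical glibc-based distro (the grep just trims file's output; the store hash is the one from above):
$ file /usr/bin/ls | grep -o 'interpreter [^,]*'
interpreter /lib64/ld-linux-x86-64.so.2
$ file my_program | grep -o 'interpreter [^,]*'
interpreter /nix/store/wn7v2vhyyyi6clcyn0s9ixvl7d4d87ic-glibc-2.40-36/lib/ld-linux-x86-64.so.2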
Nix offers an easy path to a simple environment with a bunch of packages installed. You stick the following in a shell.nix, and nix-shell drops you into a shell where Python and its dependencies are just installed.
{ pkgs ? import <nixpkgs> {} }:
pkgs.mkShell {
  buildInputs = with pkgs; [
    (python3.withPackages (ps: with ps; [
      matplotlib
      numpy
    ]))
  ];
}
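Usage looks something like this; it prints whatever numpy version the nixpkgs checkout you evaluate against happens to carry:
$ nix-shell
[nix-shell]$ python3 -c 'import numpy, matplotlib; print(numpy.__version__)'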
This is really nice, but to run it on a cluster I'd have to manually convert it to a requirements.txt and install that into a conda prefix. Yuck.
Nix allows building a Docker container from our reproducible packages. Let's start with packaging my Python program.
default.nix:
{ pkgs, python3Packages, wireshark-cli }:
python3Packages.buildPythonPackage {
  pname = "circle-tokenizer";
  version = "0.1";
  src = ./.;

  # Specify Python dependencies
  propagatedBuildInputs = with python3Packages; [
    setuptools # Required for Python packaging
    pyshark
    lxml
    wireshark-cli
  ];

  # Disable tests (enable if I add them)
  doCheck = false;
}
In this package, I pull in regular (non-Python) software alongside the Python dependencies.
You'll notice the reproducibility story here is still incomplete: everything is defined relative to whichever checkout of nixpkgs you evaluate against, so real reproducibility means pinning that checkout.
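For instance, outside of a flake I could build the package like this, but the output depends entirely on what <nixpkgs> resolves to on that particular machine (the ./tokenizer path matches the layout assumed in the flake below):
$ nix-build -E 'with import <nixpkgs> {}; callPackage ./tokenizer/default.nix {}'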
Let's move on to flakes. Flakes are a way to streamline Nix further. Each flake has inputs (nixpkgs versions, utilities, imported repositories) and outputs (packages, systems, development shells).
{
  # Import nixpkgs
  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-24.11";
    flake-utils.url = "github:numtide/flake-utils";
  };

  # This is just a bunch of fancy mumbo jumbo that lets us build for different architectures, another benefit of Nix
  outputs = { self, nixpkgs, flake-utils }:
    flake-utils.lib.eachDefaultSystem (system:
      let
        pkgs = (import nixpkgs) {
          inherit system;
        };
      in
      rec {
        # Here I import the package I defined earlier and build it using the nixpkgs I imported earlier
        packages.circle-tokenizer = (pkgs.callPackage ./tokenizer/default.nix { });

        # This is the big part, and shows how we build a Docker image using Nix
        packages.docker-image = pkgs.dockerTools.buildImage {
          # Name and tag, as usual with Docker
          name = "tokenizer";
          tag = "latest";

          # Here we build a bare-minimum Nix environment with some things imported
          copyToRoot = pkgs.buildEnv {
            name = "image-root";
            paths = [
              pkgs.bashInteractive
              pkgs.coreutils
              pkgs.findutils
              # Usually things in Nix propagate, but for some reason wireshark didn't.
              # That's fine, I just manually add it here.
              pkgs.wireshark-cli
              # The package from earlier
              packages.circle-tokenizer
            ];
            pathsToLink = [ "/bin" "/lib" "/usr" "/etc" "/nix/store" ];
          };

          # Some folder setup
          runAsRoot = ''
            #!${pkgs.runtimeShell}
            mkdir -p /tmp
            mkdir -p /root
            mkdir -p /home
          '';

          config = {
            # Use Entrypoint to ensure bash is available for command execution
            Entrypoint = [ "/bin/bash" "-c" ];
            # Default command that will be overridden by Apptainer arguments
            Cmd = [ "$@" ];
            Env = [
              "PATH=/bin:/usr/bin:/nix/store"
              "HOME=/root"
              # This is a hack I had to add at some point, not sure if it is needed
              "PYTHONPATH=${packages.circle-tokenizer}/${pkgs.python3.sitePackages}"
              "PS1='$ '"
            ];
          };
        };

        # This is actually completely unrelated to the Docker stuff. It lets me run `nix develop` on my machine
        # and drops me into a shell. Basically the flake version of the nix-shell thing we mentioned earlier.
        devShell = pkgs.mkShell {
          buildInputs = with pkgs; [ ];
          nativeBuildInputs = with pkgs; [
            pypy3
            (python311.withPackages (ps: with ps; [
              pyshark
              scipy
              lxml
            ]))
            wireshark-cli
          ];
        };
      }
    );
}
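One nice side effect of flakes: the first build writes a flake.lock pinning the exact nixpkgs revision, which is exactly the "fixed checkout of nixpkgs" we wanted earlier. The standard flake commands apply here (nothing below is specific to this project):
$ nix flake metadata   # show which nixpkgs revision flake.lock has pinned
$ nix flake update     # deliberately move that pin forward
$ nix develop          # enter the devShell defined above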
Okay, so that was a lot. What this flake.nix does is pull in my custom package, then set up a minimal Docker environment for running things from that package. I build the Docker image (and upload it to Unity) using this script I wrote, invoked as ./docker.sh unity:docker-images/
#!/bin/sh
set -eu
# Build the flake's docker-image output; nix leaves a ./result symlink behind
nix build .#docker-image
# Dereference the symlink into a regular file we can ship
cat result > docker-image.tar.gz
rm result
# Copy the image to the destination given as the first argument
rsync -av docker-image.tar.gz "$1/docker-image.tar.gz"
The intermediate cat is necessary because result is a symlink into the Nix store, not a regular file.
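Concretely, result is just a link into the store; the name and hash below are only illustrative:
$ readlink result
/nix/store/<hash>-docker-image-tokenizer.tar.gz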
Finally, we can run stuff from this Docker container on Unity. Here's my Slurm job file:
#!/bin/bash
#SBATCH --job-name=circle # Job name
#SBATCH --output=jobs/output/%j_output.log # Output log file
#SBATCH --error=jobs/output/%j_error.log # Error log file
#SBATCH --time=01:00:00 # Time limit hh:mm:ss
#SBATCH --ntasks=1 # Number of tasks
#SBATCH --cpus-per-task=1 # CPU cores per task
#SBATCH --mem=32G # Memory per node
#SBATCH --partition=cpu,cpu-preempt # Partition to submit to
set -eu
# Load Apptainer if not already in environment
module load apptainer/latest
# Unity pulls in a bunch of paths I don't want. Let's strip those out
unset APPTAINER_BINDPATH
# Set paths
DOCKER_TAR="docker-image.tar.gz"
SIF_FILE="${HOME}/apptainer/circle-tokenizer.sif" # The built container file
HASH_FILE="${HOME}/apptainer/circle-tokenizer.hash" # Store the hash
# This imports some aliases I wrote
source "${HOME}/.profile"
# don is a bash function I wrote that shoots off a message to a webhook
trap 'don "@everyone Job ${SLURM_JOB_ID} done. args: $*"' EXIT
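# (Sketch, not part of the original script: `don` is defined in my ~/.profile
#  and is roughly a curl one-liner like the one below; WEBHOOK_URL and the
#  JSON payload shape depend on whichever webhook service you point it at.)
# don() {
#     curl -s -H 'Content-Type: application/json' \
#         -d "{\"content\": \"$*\"}" "$WEBHOOK_URL"
# }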
# Get current hash of docker image
CURRENT_HASH=$(sha256sum "$DOCKER_TAR" | cut -d' ' -f1)
# Check if rebuild is needed
NEEDS_REBUILD=1
if [ -f "$HASH_FILE" ] && [ -f "$SIF_FILE" ]; then
    STORED_HASH=$(cat "$HASH_FILE")
    if [ "$CURRENT_HASH" = "$STORED_HASH" ]; then
        NEEDS_REBUILD=0
    fi
fi
if [ $NEEDS_REBUILD -eq 1 ]; then
    echo "Building new Apptainer image..."
    apptainer build --fakeroot --force "$SIF_FILE" "docker-archive://$DOCKER_TAR"
    echo "$CURRENT_HASH" > "$HASH_FILE"
else
    echo "Using existing Apptainer image..."
fi
# Allocate a workspace
WS_NAME=circle
if ! ws_list -s | grep -q "^${WS_NAME}$"; then
    echo "Creating a new workspace"
    ws_allocate "${WS_NAME}" 1
fi
echo "Using ${WS_NAME} as the temporary directory"
export TMPDIR=$(ws_find $WS_NAME)
export TMP=$TMPDIR
export TEMP=$TMPDIR
# Run the container
echo "Running Apptainer container..."
apptainer run -B "${HOME}:${HOME}:rw" "$SIF_FILE" "$*"
To use this, simply run sbatch ./base.slurm my_program_name arguments.
You can also call it from other job scripts, which is why I call it base.slurm.
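For instance, a small driver script that fans out one container job per input file might look roughly like this (the circle-tokenizer command name and the captures/ layout are made up for illustration):
#!/bin/bash
# Hypothetical driver: one Slurm job per capture file, all going through base.slurm.
for pcap in captures/*.pcap; do
    sbatch ./base.slurm circle-tokenizer "$pcap"
done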