What is Apptainer (formerly Singularity)

Until recently, Apptainer was called Singularity. In November 2021 the open-source Singularity project moved under the governance of the Linux Foundation and was renamed Apptainer, while the commercial fork is still called Singularity.

Why use a container

Idea: package and distribute the software environment along with the application, i.e. create a portable software environment.

Why:

  1. avoid compiling complex software chains from scratch for the host’s Linux OS
  2. run software in an environment where it might not be available as a package, or run older software
  3. use a familiar software environment everywhere you can run Apptainer, e.g. across different HPC centres
    • create a consistent testing environment independently of the underlying system
    • transfer pipelines from a test environment to a production environment
  4. popular, but somewhat dubious reason: data reproducibility (use the same software environment as the authors ➜ same result)

Why/when not to use a container

Do not use Apptainer if your software is already installed on the Alliance clusters. Learning and understanding Apptainer is more difficult than learning how to use our software modules or pre-compiled Python packages.

Create your own Apptainer images only if you have a compelling reason to require a custom image. In my experience, 95% of those who think they need one actually don’t.

Installing/running Apptainer on your own computer

Apptainer was developed primarily for use on HPC clusters, but there are ways to run it on your own computer:

  1. On Linux: install and use Apptainer natively. When running a longer version of this course after a Cloud course, we install Apptainer as a package inside our VM.
  2. On any host OS: in a VM running Linux.
  3. On Windows or macOS: inside Vagrant.
  4. On Windows or macOS: inside Docker (download a Docker image with Apptainer installed).

Glossary

An image is a bundle of files including an operating system, software and potentially data and other application-related files. Apptainer uses the Singularity Image Format (SIF), and images are provided as single .sif files.

A container is a virtual environment that is based on an image. You can start multiple container instances from an image.
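As a rough illustration of the image/container distinction, the sketch below starts two independent containers from one image file. It assumes a hypothetical image file ubuntu.sif already exists in the current directory; that file name is an assumption, not something defined in this course. The script skips the Apptainer commands if either the apptainer command or the image is missing.

```shell
#!/usr/bin/env bash
# Hypothetical image file; replace with a SIF image you actually have.
image="ubuntu.sif"

if command -v apptainer >/dev/null 2>&1 && [ -f "$image" ]; then
    # Print the metadata (labels) stored inside the image file.
    apptainer inspect "$image"
    # Each exec starts a new container from the same image,
    # runs one command in it, and exits.
    apptainer exec "$image" cat /etc/os-release
    apptainer exec "$image" ls /
    status="ran"
else
    echo "Skipping: need the apptainer command and $image in this directory."
    status="skipped"
fi
```

The image file itself is not modified by either run; each container is a short-lived environment built on top of it.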

An operating system (OS) is all the software that lets you interact with a computer (run applications, provide a user interface, etc.); it consists of the “kernel” and “userland” parts.

A kernel is the central piece of software that manages hardware and provides resources (CPU, I/O, memory, devices, filesystems) to the processes it is running.

A filesystem is an organized collection of files. Under UNIX/Linux, there is a single hierarchy under /, and additional filesystems are “mounted” somewhere under that hierarchy.
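To see the single hierarchy and its mounted filesystems on a Linux machine, something like the following can be run in any shell. It uses only standard tools (df, awk, head); the variable name root_mount is introduced here purely for illustration.

```shell
#!/usr/bin/env bash
# Which filesystem holds the root of the hierarchy? (-P = portable POSIX format)
df -P /

# Mount point of the filesystem containing /, taken from the last column.
root_mount=$(df -P / | awk 'NR==2 {print $NF}')
echo "filesystem containing / is mounted at: $root_mount"

# On Linux the kernel lists every mounted filesystem in /proc/mounts.
if [ -r /proc/mounts ]; then
    head -n 5 /proc/mounts
fi
```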

Containers vs virtual machines

  • Container = the OS-level mechanism to isolate some parts of the OS along with a given application.
    • virtualizes an operating system
    • lets you run an application compiled for a specific Linux OS on another Linux OS
    • almost no performance overhead
  • Virtual machine (VM) = complete isolation from the host OS via virtualized hardware
    • virtualizes hardware
    • maximum flexibility, can mix any combination of host and guest OSes
    • significant performance overhead, as you run on virtualized hardware
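One way to see the difference in practice is that a container reuses the host's kernel, while its userland comes from the image. The sketch below compares the two, assuming a hypothetical image file ubuntu.sif in the current directory; without it (or without the apptainer command) only the host kernel is printed.

```shell
#!/usr/bin/env bash
# Kernel release of the host.
host_kernel=$(uname -r)
echo "host kernel:      $host_kernel"

image="ubuntu.sif"   # hypothetical image file, an assumption for this sketch

if command -v apptainer >/dev/null 2>&1 && [ -f "$image" ]; then
    # The container has no kernel of its own, so this should match the host.
    container_kernel=$(apptainer exec "$image" uname -r)
    echo "container kernel: $container_kernel"
    # The userland, in contrast, is whatever distribution the image contains.
    apptainer exec "$image" grep PRETTY_NAME /etc/os-release
else
    echo "Skipping container check: apptainer or $image not available."
fi
```

A virtual machine would instead report the kernel of its guest OS, since it boots its own kernel on virtualized hardware.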

Docker: container platform for services, runs as root on the host system, uses cgroups for resource management between different containers on a given node, very popular with software developers. You can’t really use it on HPC systems: users on clusters have no root or sudo access, and cgroups resource management will conflict with HPC resource managers.

Apptainer: runs containers entirely in user space, as a user, can use existing Docker containers (Apptainer will convert them to proper SIF images for you), works seamlessly with the schedulers.
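A minimal sketch of the Docker-to-SIF conversion and of running as a regular user, assuming network access to Docker Hub. The ubuntu:22.04 tag and the output file name are illustrative choices, not part of the course material. On a cluster, run this inside a job rather than on a login node.

```shell
#!/usr/bin/env bash
# Illustrative source and target; both names are assumptions for this sketch.
docker_uri="docker://ubuntu:22.04"
sif_name="ubuntu.sif"

if command -v apptainer >/dev/null 2>&1; then
    # Download the Docker image layers and convert them into one SIF file.
    [ -f "$sif_name" ] || apptainer pull "$sif_name" "$docker_uri"
    if [ -f "$sif_name" ]; then
        ls -lh "$sif_name"
        # No root needed: the UID inside the container should equal the host UID.
        echo "host UID:      $(id -u)"
        echo "container UID: $(apptainer exec "$sif_name" id -u)"
    fi
else
    echo "apptainer not found; on a cluster, load its module first."
fi
```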

There are a few other container engines focusing on specific features.

Apptainer on HPC systems

Training cluster

We will now distribute usernames and passwords for our training cluster.

Let’s log in to the training cluster cass.vastcloud.org and try loading Apptainer:

module load apptainer/1.2.4   # the default version at the time of writing
apptainer --version
apptainer                     # see the list of available commands
Note

Apart from this short example, please do not run Apptainer on a cluster’s login node. Apptainer can be quite resource-demanding, so we will run on a compute node inside a Slurm job. I will explain how to do that in the next section. The same applies to our production clusters: always schedule either an interactive or a batch job to run Apptainer workflows.