Enhancing Portability and Consistency: A Dive into Containerisation Platforms

Containerisation platforms package an application and its dependencies into a standardized unit called a container, so the software behaves the same wherever it’s executed. Containers are lightweight, portable, and isolated from one another, which has made them an essential part of modern software development and deployment, particularly in data-intensive fields like bioinformatics.

Image created using DALL-E by OpenAI.

Key Concepts of Containerisation

  1. Isolation: Containers encapsulate all the necessary components (code, runtime, system tools, libraries) for an application, providing an isolated environment that prevents conflicts with other applications.
  2. Portability: Containers can run on any system that supports the containerisation platform, ensuring that the application behaves the same across different environments (e.g., development, testing, production).
  3. Reproducibility: By packaging the entire runtime environment, containers let an analysis be rerun later with the same software versions and configuration, which is crucial for scientific research and data analysis.
  4. Efficiency: Containers share the host system’s kernel instead of booting a guest operating system, making them lighter and faster to start than traditional virtual machines (VMs).
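
The kernel sharing described in the Efficiency point can be checked directly. The sketch below is illustrative only: it assumes Singularity is installed and that Docker Hub is reachable, and it skips the container step otherwise.

```shell
#!/bin/sh
# Compare the kernel release reported on the host with the one
# reported from inside a container.
host_kernel=$(uname -r)
echo "host kernel:      $host_kernel"

if command -v singularity >/dev/null 2>&1; then
    # docker://ubuntu:20.04 is fetched from Docker Hub on first use
    container_kernel=$(singularity exec docker://ubuntu:20.04 uname -r)
    echo "container kernel: $container_kernel"
else
    echo "singularity not found; skipping the container check"
fi
```

Because the container has no kernel of its own, both lines should report the same release, even though the userland inside the container is Ubuntu 20.04.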

Docker and Singularity are two popular containerisation platforms, each with its own strengths and use cases.

Side-by-Side Comparison

Feature           | Docker                                          | Singularity
------------------+-------------------------------------------------+--------------------------------------------------
Ease of Use       | Very user-friendly, extensive CLI tools         | User-friendly, especially in HPC environments
Security          | Daemon runs as root by default                  | Runs containers without root privileges
Portability       | High portability across different environments  | High portability, single-file containers
Ecosystem         | Large ecosystem with Docker Hub                 | Smaller ecosystem, but can use Docker images
Integration       | Integrated with Kubernetes, Docker Swarm        | Integrated with HPC job schedulers (SLURM, Torque)
Performance       | High performance, but less optimized for HPC    | Optimized for HPC performance
Community Support | Large, active community                         | Growing community, strong academic focus
Container Format  | Multi-layered image format                      | Single-file container format
Use Cases         | General-purpose containerisation, microservices | Scientific research, HPC environments
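
Two rows of the comparison, the reuse of Docker images and the single-file container format, can be seen in one command. This is a minimal sketch assuming Singularity is installed and Docker Hub is reachable; the output filename is an arbitrary choice made for the example.

```shell
#!/bin/sh
# Convert a Docker Hub image into a single SIF file.
sif_file="ubuntu_20.04.sif"          # arbitrary output name for this sketch
image_uri="docker://ubuntu:20.04"    # Docker Hub image to convert

if command -v singularity >/dev/null 2>&1; then
    # Download the Docker layers and flatten them into one .sif file
    singularity pull "$sif_file" "$image_uri"
    # The whole container is now one ordinary file that can be copied or archived
    ls -lh "$sif_file"
else
    echo "singularity not found; would run: singularity pull $sif_file $image_uri"
fi
```

The resulting .sif file can be moved to a cluster with scp or rsync like any other file, which is a large part of Singularity's appeal in HPC settings.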

Basic operations with Singularity

1. Create a definition file

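# Singularity.def
Bootstrap: docker
From: ubuntu:20.04

%labels
    Author Masih Sherafatian
    Version v1.0

%post
    # Prevent apt from stopping the build with interactive prompts
    export DEBIAN_FRONTEND=noninteractive

    # Update the package index
    apt-get update

    # Install Python and pip
    apt-get install -y python3 python3-pip

    # Install some Python packages (unpinned; pin versions for strict reproducibility)
    pip3 install --no-cache-dir numpy pandas

    # Clean up the apt cache and package lists to keep the image small
    apt-get clean
    rm -rf /var/lib/apt/lists/*

%environment
    # Use the plain C locale inside the container
    export LC_ALL=C
    export LANG=C

%runscript
    # Forward all arguments given to the container on to Python
    exec python3 "$@"

%test
    # Simple test to verify the container
    python3 --version
    pip3 --version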

Explanation:

  1. Bootstrap: Selects the bootstrap agent, i.e. where the base image comes from. Here, the docker agent pulls the base image from Docker Hub.
  2. From: Names the base image to pull, in this case ubuntu:20.04.
  3. %labels: Adds metadata to the image, such as the author and version.
  4. %post: Contains commands that run inside the container during the build. Here, it refreshes the package index, installs Python and pip, installs two Python packages (numpy and pandas), and cleans up the apt caches.
  5. %environment: Sets environment variables that are available inside the container at runtime.
  6. %runscript: Defines the command executed by singularity run, or when the image file itself is executed. In this case, it starts Python.
  7. %test: Contains commands that run at the end of the build, and again on demand with singularity test, to check the container. Here, it confirms that Python and pip are installed by printing their versions.
  8. "$@": A shell special parameter that expands to all positional parameters, each as a separate word. In the %runscript, it forwards every argument given to the container on to python3.
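
The quoting around "$@" matters more than it looks. The sketch below needs no container at all: it uses two hypothetical shell functions (forward_quoted and forward_unquoted, invented for this example) as stand-ins for the %runscript line, with printf in place of python3 so that each received argument is printed on its own line.

```shell
#!/bin/sh
# Stand-in for: exec python3 "$@"
# printf prints one bracketed line per argument it receives.
forward_quoted() {
    printf '[%s]\n' "$@"
}

# Unquoted $@ is split again on whitespace, so an argument
# containing a space falls apart into two.
forward_unquoted() {
    printf '[%s]\n' $@
}

# Count how many arguments each variant actually passes on.
quoted_count=$(forward_quoted "my script.py" --verbose | wc -l | tr -d ' ')
unquoted_count=$(forward_unquoted "my script.py" --verbose | wc -l | tr -d ' ')

echo "quoted:   $quoted_count arguments"
echo "unquoted: $unquoted_count arguments"
```

The quoted form passes on 2 arguments and the unquoted form 3, because "my script.py" is broken at the space. With the quoted form in the %runscript, a call such as singularity run on the built image followed by a script path and its options reaches python3 with every argument intact.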

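2. Build the Singularity container

Building from a definition file usually requires root privileges, so run the command with sudo or, where the administrator has enabled it, use the --fakeroot option instead.

sudo singularity build SingularityContainerImage.sif Singularity.def

3. Run a command inside the Singularity container

singularity exec \
    SingularityContainerImage.sif \
    python3 \
    -c "import numpy; print(numpy.__version__)"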
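
On a cluster, the same exec command would normally be submitted through the job scheduler rather than typed on a login node. The following is a rough sketch for SLURM, not a site-ready script: the job name, resource limits, output filename, and the container.sif path are all placeholders chosen for illustration, and the module line is commented out because module names differ between clusters.

```shell
#!/bin/sh
# Write a hypothetical SLURM job script that runs a command in the container.
cat > numpy_version.sbatch <<'EOF'
#!/bin/bash
#SBATCH --job-name=numpy-version
#SBATCH --ntasks=1
#SBATCH --time=00:05:00
#SBATCH --output=numpy-version-%j.out

# Some clusters require loading a module first; the name is site-specific.
# module load singularity

# Placeholder: replace with the path to the .sif file built in step 2.
SIF_IMAGE=./container.sif

singularity exec "$SIF_IMAGE" \
    python3 -c "import numpy; print(numpy.__version__)"
EOF

# Submit only where SLURM is available.
if command -v sbatch >/dev/null 2>&1; then
    sbatch numpy_version.sbatch
else
    echo "sbatch not found; wrote numpy_version.sbatch without submitting"
fi
```

Since Singularity runs as the submitting user and needs no daemon, the job script treats the container like any other executable, which is why it fits schedulers such as SLURM with so little ceremony.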