Containerisation platforms are tools that package an application and its dependencies into a standardised unit called a container. Because the container carries its own runtime environment, the application behaves the same on any host that runs the platform. Containers are lightweight, portable, and isolated, which makes them an essential part of modern software development and deployment, particularly in data-intensive fields like bioinformatics.
Key Concepts of Containerisation
- Isolation: Containers encapsulate all the necessary components (code, runtime, system tools, libraries) for an application, providing an isolated environment that prevents conflicts with other applications.
- Portability: Containers can run on any system that supports the containerisation platform, ensuring that the application behaves the same across different environments (e.g., development, testing, production).
- Reproducibility: By packaging the entire runtime environment, containers ensure that applications can be reproduced exactly, which is crucial for scientific research and data analysis.
- Efficiency: Containers share the host system’s kernel and resources, making them more lightweight and efficient compared to traditional virtual machines (VMs).
Docker and Singularity are two popular containerisation platforms, each with its strengths and use cases.
Side-by-Side Comparison
Feature | Docker | Singularity |
---|---|---|
Ease of Use | Very user-friendly, extensive CLI tools | User-friendly, especially in HPC environments |
Security | Daemon runs as root by default (a rootless mode is available) | Runs containers without root privileges |
Portability | High portability across different environments | High portability, single-file containers |
Ecosystem | Large ecosystem with Docker Hub | Smaller ecosystem, but can use Docker images |
Integration | Integrated with Kubernetes, Docker Swarm | Integrated with HPC job schedulers (SLURM, Torque) |
Performance | High performance, but less optimized for HPC | Optimized for HPC performance |
Community Support | Large, active community | Growing community, strong academic focus |
Container Format | Multi-layered image format | Single-file container format |
Use Cases | General-purpose containerisation, microservices | Scientific research, HPC environments |
Basic operations with Singularity
1. Create a definition file
# Singularity.def
Bootstrap: docker
From: ubuntu:20.04
%labels
Author Masih Sherafatian
Version v1.0
%post
# Update the package repository
apt-get update
# Install Python and pip
apt-get install -y python3 python3-pip
# Install some Python packages
pip3 install numpy pandas
# Clean up
apt-get clean
%environment
# Set environment variables
export LC_ALL=C
export LANG=C
%runscript
exec "python3" "$@"
%test
# Simple test to verify the container
python3 --version
pip3 --version
Explanation:
- Bootstrap: Specifies the base image source. In this example, we are using a Docker image (ubuntu:20.04) as the base.
- From: Indicates the base image to use, in this case, Ubuntu 20.04.
- %labels: Allows you to add metadata to the image, such as the author and version.
- %post: Contains commands that run during the build process. Here, it updates the package repository, installs Python and pip, installs some Python packages (numpy and pandas), and clears the apt package cache.
- %environment: Sets environment variables that will be available inside the container.
- %runscript: Defines the default command executed by singularity run, or when the image file is executed directly. In this case, it starts python3 and passes along any arguments given to the container.
- %test: Provides commands to test if the container is built correctly. This section ensures Python and pip are installed by checking their versions.
- "$@": A special shell parameter that expands to all the positional parameters passed to a script or a function, each kept as a separate word. In the %runscript, it forwards the container's arguments to python3.
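The quoting around "$@" in the %runscript matters. The following is a minimal shell sketch, independent of Singularity, that shows the difference; the function names are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: prints how many arguments it received.
count_args() {
    echo "$#"
}

# Quoted "$@" forwards every argument unchanged, as the %runscript does.
forward_quoted() {
    count_args "$@"
}

# Unquoted $@ lets the shell re-split arguments on whitespace.
forward_unquoted() {
    count_args $@
}

forward_quoted "my script.py" --verbose     # prints 2
forward_unquoted "my script.py" --verbose   # prints 3
```

With the quoted form, a file name containing a space reaches python3 as a single argument, which is why the definition file writes exec "python3" "$@" rather than leaving the parameter unquoted.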
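2. Build the Singularity container (building from a definition file typically needs root privileges, e.g. via sudo, or the --fakeroot option):
singularity build SingularityContainerImage.sif Singularity.def
3. Run a command inside the Singularity container:
singularity exec \
SingularityContainerImage.sif \
python3 \
-c "import numpy; print(numpy.__version__)"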
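Note that singularity exec runs the command you give it and does not go through %runscript. As a rough sketch of how the other sections of the definition file can be exercised, the helper below calls singularity run, singularity test, and singularity inspect in turn. The function name check_image is hypothetical, and the image path is whatever you built in step 2.

```shell
#!/bin/sh
# Sketch only: check_image is a hypothetical helper, not part of Singularity.
# It exercises the %runscript, %test and %labels sections of a built image.
check_image() {
    image="$1"

    # Fail early with a clear message if the image file is missing.
    if [ ! -f "$image" ]; then
        echo "image not found: $image" >&2
        return 1
    fi

    # Fail early if the singularity CLI is not installed.
    if ! command -v singularity >/dev/null 2>&1; then
        echo "singularity not found on PATH" >&2
        return 1
    fi

    # 'run' invokes %runscript, so this becomes: python3 --version
    singularity run "$image" --version || return 1

    # 'test' executes the commands in the %test section.
    singularity test "$image" || return 1

    # 'inspect' prints image metadata, including the %labels entries.
    singularity inspect "$image"
}
```

After sourcing the file, call check_image with the path to the .sif image built in step 2. For interactive exploration, singularity shell followed by the image path opens a shell inside the container instead.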