There is a myth that containers are stateless. In reality, a container can hold state: that state is ephemeral, unique to each container, and resides on the host machine.
Containers consist of read-only layers. Each layer holds the changes produced by the configuration and installation commands in the build file (the Dockerfile). After these commands are executed, the resulting filesystem changes are stored as a new image layer on disk.
An ephemeral writable layer is created on top of the image layers when the container runs. The writable layer is unique to every container on a given host and survives container restarts, but it is discarded when the container is removed.
This writable layer is not meant to hold stateful data, such as persistent application data. Its sole purpose is to temporarily store the data of the application running in the container.
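The writable layer can be observed directly with `docker diff`. The following is a minimal sketch, assuming Docker is installed; the container name `layer-demo`, the `nginx` image, and the file path are illustrative choices, not requirements:

```shell
# Start a throwaway container and write a file into its writable layer.
docker run -d --name layer-demo nginx
docker exec layer-demo sh -c 'echo hello > /tmp/scratch.txt'

# `docker diff` lists files added (A) or changed (C) in the container's
# writable layer relative to the read-only image layers.
docker diff layer-demo

# Removing the container discards the writable layer, and the file with it.
docker rm -f layer-demo
```

Note that restarting `layer-demo` would have kept `/tmp/scratch.txt`; only removing the container destroys the writable layer.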
Table of contents
- Where containers store state
- Backing up and restoring a Docker container
- Reasons why backup solutions fail
- Possible alternative
To follow along, you need to have Docker installed on your machine.
Where containers store state
As mentioned earlier, containers do not save state in the container image. State lives in the writable layer on the host and disappears when the container is removed.
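Persistent data should therefore live outside the container, typically in a volume. The following is a minimal sketch, assuming Docker is installed; the volume name `app-data`, the container names, and the mount path are illustrative:

```shell
# Create a named volume managed by Docker on the host.
docker volume create app-data

# Mount the volume into a container. Files written under /var/lib/data
# live in the volume, not in the container's writable layer.
docker run -d --name app --mount source=app-data,target=/var/lib/data nginx

# Even after the container is deleted, the data remains in the volume
# and can be mounted into a new container.
docker rm -f app
docker run --rm --mount source=app-data,target=/var/lib/data alpine ls /var/lib/data
```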
Backing up and restoring a Docker container
Docker assists developers in automating the development and deployment process of an application.
Developers can also build a packaged environment that runs applications, which makes them more portable and lightweight.
Docker containers also assist in maintaining application versions. Software that runs on Docker is platform-independent.
We will assume we have a container running in a local environment. We can take a snapshot (backup) of that container to undo any changes, or to roll it back to an earlier point in time in case of an emergency.
This section will cover how we can back up and restore Docker containers using built-in Docker commands.
Backing up a Docker container
We can back up a Docker container using built-in Docker commands. First, we list all containers and get their IDs, as shown below:
$ docker ps -a
Then we will copy the ID of the container we want to back up. To take a snapshot of the Docker container, we will execute the following command:
$ docker commit -p <CONTAINER_ID> <BACKUP_NAME>
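When scripting this, the container ID can be extracted from `docker ps` output with standard tools instead of being copied by hand. Here is a minimal sketch using `awk`; the sample rows are hard-coded for illustration and are not from a real host — in practice you would pipe `docker ps` directly into `awk`:

```shell
#!/bin/sh
# Sample `docker ps`-style output, hard-coded for illustration.
sample_ps_output='CONTAINER ID   IMAGE       COMMAND                  STATUS
1571dbfe094f   wordpress   "docker-entrypoint.s"    Up 2 hours
9a3c1de0b77e   redis       "redis-server"           Up 3 hours'

# Print the first column (the ID) of the row whose IMAGE column matches $1.
find_container_id() {
  printf '%s\n' "$sample_ps_output" | awk -v img="$1" '$2 == img {print $1}'
}

find_container_id wordpress   # prints 1571dbfe094f
```

The same function works against live output by replacing the `printf` with `docker ps`.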
For instance, we can pull a WordPress Docker image using the following command:
$ docker pull wordpress
We can then list all our containers using the following command:
$ docker ps -a
We can then take the snapshot of our container image by running the below command:
$ docker commit -p 1571dbfe094f wordpress-backup
And the output will be:
sha256:abe166f1f1ff6c59c978ab898dbc6f843c10c4a8415d7a2b012660420d205f8a
We store the container image in the form of a tar file in local storage, as illustrated below:
$ docker save --output wordpress-backup.tar wordpress-backup
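The commit and save steps above can be combined into a small backup script. This is a sketch, assuming Docker is installed; the argument handling and the timestamped tag format are my own choices, not a Docker convention:

```shell
#!/bin/sh
# Usage: ./backup.sh <container-name-or-id>
set -eu

container="$1"
# A timestamped tag ensures repeated backups do not overwrite each other.
tag="${container}-backup-$(date +%Y%m%d-%H%M%S)"

# -p pauses the container during the commit, for a consistent snapshot.
docker commit -p "$container" "$tag"

# Export the committed image to a portable tar archive.
docker save --output "${tag}.tar" "$tag"
echo "Backup written to ${tag}.tar"
```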
Restoring a Docker container
After we have created a backup, we can restore the Docker container, as demonstrated below:
$ docker load -i wordpress-backup.tar
The output will be:
Loaded image: wordpress-backup:latest
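When restoring in a script, the image tag can be parsed out of the `docker load` output so it does not have to be hard-coded. A minimal sketch; the output line is hard-coded here for illustration, whereas in practice you would capture the output of `docker load -i <archive>`:

```shell
#!/bin/sh
# Sample `docker load` output, hard-coded for illustration.
load_output='Loaded image: wordpress-backup:latest'

# The image tag is the third whitespace-separated field of that line.
image=$(printf '%s\n' "$load_output" | awk '/^Loaded image:/ {print $3}')
echo "$image"   # prints wordpress-backup:latest
```

The extracted `$image` can then be passed straight to `docker run`.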
We can check whether the image was restored successfully by executing the following command:
$ docker images
Since the image was loaded from a local tar archive, it is already available locally; there is no need to pull it from a registry (pulling would only work if the image had been pushed to one).
After restoring the Docker image, we can use the below command to execute a restored instance of the Docker container:
$ docker run -ti wordpress-backup:latest
Reasons why backup solutions fail
Storage-based snapshots are not enough for data mobility and backup.
They are periodic, require scheduling, and do not deliver the granularity that DevOps teams require today. In a fast-paced technological world, where containers start and terminate constantly, a periodic backup snapshot is not enough.
In addition, performing container backup at the storage layer exposes the organization to vendor lock-in. As the business grows, such solutions will fail to support the agility needed in the modern world.
Also, containers are not perfect for backing up data due to the following reasons:
- Containers are highly scalable, with numerous instances each performing a small part of the same task. This means no single container can act as the master of an application, and many containers may access the same persistent data at once, unlike virtual machines (VMs), where only one VM accesses the data.
- Containers are ephemeral and may not be running whenever a backup needs to be taken. This differs from virtual machines, which usually keep running continuously.
Architectural differences that come with containers demonstrate why backup solutions may fail.
A different approach to performing continuous backups of stateful application data is required.
A better solution should not rely on the container itself for backup and replication, and it cannot depend on a single storage solution. Zerto for Kubernetes, for example, deploys DaemonSets on the cluster nodes; these DaemonSets integrate with the persistent storage to gain access to persistent data independently of any container.
Unifying all cluster nodes and the cluster API allows Zerto for Kubernetes to work more efficiently. It channels persistent data without container duplication or performance issues.
Zerto can also be integrated with clusters, making persistent data replication easier. It is storage agnostic and supports CSI-compatible block storage. This makes it ideal for data migration and mobility solutions.
Organizations should ensure they go for a solution that stores stateful data and captures the Kubernetes state for each application.
It will also enhance data protection for components like ConfigMaps and services. These components can rebuild the application when performing data recovery on the same or another cluster.
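Capturing Kubernetes-level state such as ConfigMaps and Services can be done by exporting the objects through the API. A minimal sketch, assuming `kubectl` is installed and configured; the namespace `myapp` and the file name are illustrative:

```shell
# Export ConfigMaps and Services in a namespace as YAML manifests.
# Re-applying this file on the same or another cluster helps rebuild the
# application's Kubernetes-level state (it does not capture the contents
# of persistent volumes).
kubectl get configmaps,services -n myapp -o yaml > myapp-state-backup.yaml

# Restore on the target cluster:
kubectl apply -f myapp-state-backup.yaml
```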
Many users and developers do not back up their containers. Most argue that containers are stateless and cannot store data; thus, they do not require backup and recovery operations.
The container infrastructure and Kubernetes offer improved availability. Containers can be started and stopped as needed.
However, if a disaster occurs, entire clusters and container nodes, along with their associated data, can be destroyed or lost. This is why Kubernetes, Docker, and the applications running on them may need to be backed up.
Peer Review Contributions by: Briana Nzivu
About the author
Verah Ombui
Verah Ombui is an undergraduate Computer Science student at Jomo Kenyatta University Of Agriculture and Technology. Her interests are web development, cloud computing, and data science. Verah is a technology enthusiast.