Most of our applications are deployed in Kubernetes with persistent volumes (PVCs). This post focuses on those volumes, and more specifically on how to back them up to avoid losing data.

By Maud Laurent, Sys admin @ Objectif Libre

Target audience: Kubernetes admins

Kubernetes: Backup your Stateful apps

Why do we focus on Stateful apps?
Because stateless apps are defined by YAML files, which can be versioned in Git, deployed automatically with CI/CD, and so on. No data is stored inside these apps: their configuration lives in ConfigMaps or Secrets.

Stateful apps, on the other hand, save data, mostly on attached volumes, and it is these volumes that contain all the information the apps need to run properly. Backing them up is therefore a priority.

Tools to make your own backups

To back up volumes inside Kubernetes, there are two main applications: Velero and Stash.

Velero is a backup tool that is not limited to volume backups: it can also back up your whole cluster (pods, services, volumes, …), with filtering by labels or by Kubernetes object types.

Stash, on the other hand, is a tool focused solely on volume backups.

These two applications use the same tool to manage backups: Restic. Restic is a backup manager that creates and restores backups. It encrypts the backup data to guarantee its confidentiality and integrity:

Restic is built to secure your data against such attackers, by encrypting it with AES-256 in counter mode and authenticating it using Poly1305-AES. (source: https://restic.net)
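For reference, this is roughly what both tools do under the hood: they drive Restic against an S3 repository. A minimal standalone sketch (the bucket name restic-demo and the credential values are placeholders):

export AWS_ACCESS_KEY_ID=Your_Access_Key
export AWS_SECRET_ACCESS_KEY=Your_Secret_Key
export RESTIC_PASSWORD=Sup3rResticPwd   # key used to encrypt the repository

# Initialize an encrypted repository in an S3 bucket, then back up a directory
restic -r s3:http://rook-ceph-rgw-my-store.rook-ceph.svc.cluster.local/restic-demo init
restic -r s3:http://rook-ceph-rgw-my-store.rook-ceph.svc.cluster.local/restic-demo backup /var/log/nginx

# List the snapshots stored in the repository
restic -r s3:http://rook-ceph-rgw-my-store.rook-ceph.svc.cluster.local/restic-demo snapshots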

Our use case

To explain this in more detail, we are going to run an example and back up our data with both tools.
The running app is a simple Nginx web server exposed through a NodePort service. The Nginx logs are saved in a persistent volume.

The cluster used has one master and two workers, with Ceph object (S3) and block storage installed using Rook.

The Nginx application is deployed using a deployment YAML, a service YAML and a volume YAML. These files are shown below. The following command creates the demo-app namespace and applies the deployment YAMLs.

kubectl create ns demo-app && kubectl apply -n demo-app -f pvc-log-1Gi-no-ns.yml -f deployment-no-ns.yml -f service-no-ns.yml

Below is the YAML file used to create the deployment on Kubernetes:

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: demo-app
  name: demo-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: demo-app
  template:
    metadata:
      labels:
        app: demo-app
        name: web-app
    spec:
      containers:
      - name: nginx-app
        image: nginx
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - mountPath: /var/log/nginx/
          name: log-data
      restartPolicy: Always
      volumes:
      - name: log-data
        persistentVolumeClaim:
          claimName: demo-app-log

Below is the YAML file used to create the persistent volume claim on Kubernetes:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-app-log
spec:
  storageClassName: rook-ceph-block
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi

Below is the YAML file used to create the service on Kubernetes:

apiVersion: v1
kind: Service
metadata:
  name: demo-app
  labels:
    app: demo-app
spec:
  selector:
    app: demo-app
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  type: NodePort

To populate the Nginx logs, we used the command watch -n 30 curl IP_Master:NodePort.
Our volume stores the Nginx application logs. With the kubectl command kubectl exec -it -n demo-app demo-app-5b955c984d-x7pqf -c nginx-app -- wc -l /var/log/nginx/access.log we can see the number of lines in the access.log file and thus check the accesses to our web page. We will also use this command later to make sure the backup restoration worked.

Velero

Installation

  • Create a credentials file (here named velero-creds) containing the S3 storage access keys:
[default]
aws_access_key_id = Your_Access_Key
aws_secret_access_key = Your_Secret_Key
  • Install Velero with the CLI (add --dry-run -o yaml to only see the YAML that would be applied):
velero install --provider aws \
--bucket velero-backup \
--secret-file ./velero-creds \
--use-volume-snapshots=true \
--backup-location-config region=":default-placement",s3ForcePathStyle="true",s3Url=http://rook-ceph-rgw-my-store.rook-ceph.svc.cluster.local \
--snapshot-location-config region=":default-placement" \
--use-restic

Or with Helm:

  • Create a secret with the storage access information: kubectl create secret generic s3-velero-creds -n velero --from-file cloud=velero-creds
  • Create a settings file values.yml:
configuration:
  provider: aws
  backupStorageLocation:
    name: aws
    bucket: velero-backup
    config:
      region: ":default-placement"
      s3ForcePathStyle: true
      s3Url: http://rook-ceph-rgw-my-store.rook-ceph.svc.cluster.local

credentials:
  existingSecret: s3-velero-creds

deployRestic: true
  • Run the Helm install: helm-3 install -f values.yml velero --namespace velero --version 2.1.3 stable/velero
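Whichever installation method you choose, a quick sanity check is to verify that the Velero server pod and the Restic daemonset are running:

kubectl get pods -n velero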

Run a backup

To back up our volumes with Restic, Velero requires an annotation on the pods that contain these volumes: kubectl -n demo-app annotate pod/demo-app-57f87559b6-jhdfk backup.velero.io/backup-volumes=log-data. This annotation can also be written directly in the application deployment YAML file, under spec.template.metadata.annotations, as shown below.
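For example, in our deployment the annotated pod template would look like this (an excerpt of the deployment shown earlier, with only the annotation added):

spec:
  template:
    metadata:
      labels:
        app: demo-app
      annotations:
        backup.velero.io/backup-volumes: log-data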

Backups can be created with the command velero backup create demo-backup --include-namespaces demo-app or by applying a YAML file (https://velero.io/docs/v1.1.0/api-types/backup/).

apiVersion: velero.io/v1
kind: Backup
metadata:
  name: demo-app-backup
  namespace: velero # must match the Velero server namespace
spec:
  includedNamespaces:
  - demo-app
  ttl: 24h0m0s # default 720h0m0s
  storageLocation: default # backup storage location
  volumeSnapshotLocations:
  - default

With this file, only one backup is made, but it's possible to schedule it with the command velero schedule create demo-app-schedule-backup --schedule="@every 5m" --include-namespaces demo-app --ttl 0h30m00s or with the YAML file right below.

apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: demo-app-schedule-backup
  namespace: velero
spec:
  schedule: '@every 5m'
  template:
    includedNamespaces:
    - demo-app
    ttl: 0h30m00s

We can create several backup storage locations and choose, for each backup, where its data is saved.
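For example, an additional location can be declared with a BackupStorageLocation object (a sketch; the secondary bucket name velero-backup-secondary is an assumption):

apiVersion: velero.io/v1
kind: BackupStorageLocation
metadata:
  name: secondary
  namespace: velero
spec:
  provider: aws
  objectStorage:
    bucket: velero-backup-secondary
  config:
    region: ":default-placement"
    s3ForcePathStyle: "true"
    s3Url: http://rook-ceph-rgw-my-store.rook-ceph.svc.cluster.local

A backup can then target it with velero backup create demo-backup --storage-location secondary.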

PS: To delete a backup, it's better to use the velero CLI (velero backup delete demo-app-backup) rather than deleting the Backup object with kubectl: the CLI also removes the associated data in the storage backend.

Restore backed up data

Before restoring the previously saved backup, we are going to delete the deployment namespace: kubectl delete ns demo-app. The list of backups is available with velero backup get. We can then create a restore with the command velero restore create --from-backup demo-app-backup or with a YAML file like the one below.

apiVersion: velero.io/v1
kind: Restore
metadata:
  name: demo-app-restore
  namespace: velero
spec:
  backupName: demo-app-backup
  excludedResources:
  - nodes
  - events
  - events.events.k8s.io
  - backups.velero.io
  - restores.velero.io
  - resticrepositories.velero.io

Just like with backups, you can see the state of the restore with the command velero restore get, and get more details with the describe or logs subcommands.
Once the restore is complete, we have all our data available. To check, we can run the command kubectl get all,pvc -n demo-app.

In my case, the result is the following:

NAME                            READY   STATUS    RESTARTS   AGE
pod/demo-app-5b955c984d-x7pqf   2/2     Running   0          109s

NAME               TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
service/demo-app   NodePort   10.233.33.213   <none>        80:31010/TCP   105s

NAME                       READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/demo-app   1/1     1            1           108s

NAME                                  DESIRED   CURRENT   READY   AGE
replicaset.apps/demo-app-5b955c984d   1         1         1       108s

NAME                                 STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
persistentvolumeclaim/demo-app-log   Bound    pvc-7f302738-15ad-4294-8e8e-6407e42f8bf3   1Gi        RWO            rook-ceph-block   110s

We can see that a Velero restore keeps the exact pod name, and that the restore can be applied either in the same namespace as the original deployment or in a different one.
After the restore, running the command kubectl exec -it -n demo-app demo-app-5b955c984d-x7pqf -c nginx-app -- wc -l /var/log/nginx/access.log shows that our file already contains lines, even though no access has been made to the web server since the restore.

Stash

Installation

Stash can be installed using Helm, following the instructions below:

kubectl create ns stash
helm-3 repo add appscode https://charts.appscode.com/stable/
helm-3 repo update
helm-3 install --version 0.8.3 stash --namespace stash appscode/stash
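Once the chart is installed, the Stash operator pod should be running in the stash namespace (a quick check):

kubectl get pods -n stash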

Run a backup

All our backups are stored in S3. Stash needs a secret with the storage authentication information; it also requires a Restic password, since Restic encrypts the data it saves in S3 with that password. This secret is saved in the namespace of the project to back up, which lets you define a separate S3 access and Restic password for each project/namespace.

apiVersion: v1
kind: Secret
metadata:
  name: s3-secret
  namespace: demo-app
data:
  AWS_ACCESS_KEY_ID: czNfc3RvcmFnZV9rZXlfaWQ= # s3_storage_key_id
  AWS_SECRET_ACCESS_KEY: czNfc3RvcmFnZV9rZXlfc2VjcmV0 # s3_storage_key_secret
  RESTIC_PASSWORD: U3VwM3JSZXN0aWNQd2Q= # Sup3rResticPwd
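Alternatively, the same secret can be created directly with kubectl, using the decoded values from the comments above (a sketch):

kubectl create secret generic s3-secret -n demo-app \
  --from-literal=AWS_ACCESS_KEY_ID=s3_storage_key_id \
  --from-literal=AWS_SECRET_ACCESS_KEY=s3_storage_key_secret \
  --from-literal=RESTIC_PASSWORD=Sup3rResticPwd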

Stash offers two types of backup:

  • an online backup (hot backup): Stash adds a sidecar container to the deployment's pod. This sidecar mounts the volume read-only (RO) and makes backups while the app is running.
  • an offline backup (cold backup): Stash adds an init container to the pod and creates a Kubernetes cronjob. This cronjob runs the backup; the pod is recreated at each backup so that the init container executes.

In our case, we are going to set up an online backup with the YAML below.

apiVersion: stash.appscode.com/v1alpha1
kind: Restic
metadata:
  name: rook-restic
  namespace: demo-app
spec:
  type: online # default value
  selector:
    matchLabels:
      app: demo-app # Must match the label of the pod we want to back up.
  fileGroups:
  - path: /var/log/nginx
    retentionPolicyName: 'keep-last-5'
  backend:
    s3:
      endpoint: 'http://rook-ceph-rgw-my-store.rook-ceph.svc.cluster.local'
      bucket: stash-backup # Name of the bucket where you want to back up.
      prefix: demo-app # Prefix for the directory where the repository will be created (optional).
    storageSecretName: s3-secret
  schedule: '@every 5m'
  volumeMounts:
  - mountPath: /var/log/nginx
    name: log-data # name of the volume as set in the deployment, not the claimName
  retentionPolicies:
  - name: 'keep-last-5'
    keepLast: 5
    prune: true
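For an offline (cold) backup, the same manifest would simply change the type field, everything else staying identical:

spec:
  type: offline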

When this YAML is applied, the pods are recreated to add the sidecar that performs the backups. Backups can be paused whenever you want by patching the Restic object with the command kubectl patch restic -n demo-app rook-restic --type="merge" --patch='{"spec": {"paused": true}}'. The snapshots created by Stash are listed with the command kubectl get -n demo-app snapshots.repositories.stash.appscode.com.
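To resume the scheduled backups later, set paused back to false (the mirror of the pause command above): kubectl patch restic -n demo-app rook-restic --type="merge" --patch='{"spec": {"paused": false}}'.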

Restore backed up data

Before restoring our system, we are going to delete our data.
Stash needs two elements to restore a backup: the Repository object and the storage secret. Both live in the namespace of the backed-up project.
Deleting the namespace therefore means losing these two elements. It is still possible to recreate them, by reapplying the secret and a repository configuration, in order to run the restore.

In our case, we are going to delete our deployment, the volume, and our restic resource.

kubectl delete -n demo-app deployment demo-app
kubectl delete -n demo-app pvc demo-app-log
kubectl delete -n demo-app restics.stash.appscode.com rook-restic

To restore the system, we first create a new empty volume (of a size equal to or larger than the previous one).
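A sketch of what pvc-recovery.yml may contain (the name, size and storage class are deduced from the kubectl output below):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-app-log-recovery
spec:
  storageClassName: rook-ceph-block
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi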

kubectl apply -f pvc-recovery.yml -n demo-app
kubectl get pvc -n demo-app
persistentvolumeclaim/demo-app-log-recovery Bound pvc-5a0ba64e-7cf8-49d8-a5a9-ac071160da11 2Gi RWO rook-ceph-block 4s

After that, we run the restore. The YAML file below copies the backed-up data to the newly created volume: Stash creates a Kubernetes Job which mounts the volume and copies the data into it.

apiVersion: stash.appscode.com/v1alpha1
kind: Recovery
metadata:
  name: s3-recovery
  namespace: demo-app
spec:
  repository:
    name: deployment.demo-app
    namespace: demo-app
  snapshot: deployment.demo-app-70b545c2 # snapshot name to restore
  paths: # paths we want to restore
  - /var/log/nginx
  recoveredVolumes: # where we want to restore
  - mountPath: /var/log/nginx
    persistentVolumeClaim:
      claimName: demo-app-log-recovery
The progress of the restore can be followed with the command kubectl get recoveries.stash.appscode.com -n demo-app -w:

NAME          REPOSITORY-NAMESPACE   REPOSITORY-NAME       SNAPSHOT                       PHASE       AGE
s3-recovery   demo-app               deployment.demo-app   deployment.demo-app-70b545c2   Running     18s
s3-recovery   demo-app               deployment.demo-app   deployment.demo-app-70b545c2   Succeeded   39s

Once the restore has finished without error, we only need to reapply the previous deployment, changing the volume's claim name to the restored one.
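Concretely, only the volumes section of the deployment changes (a sketch, using the restored PVC created above):

      volumes:
      - name: log-data
        persistentVolumeClaim:
          claimName: demo-app-log-recovery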

So, Velero or Stash to back up your persistent volumes?

Velero

  • Uses the same Restic password for all backed-up volumes, so be careful who has access to the storage location.
  • All the elements used for backups are saved in the velero namespace, so deleting an application namespace doesn't compromise the restore.
  • Runs with a system of plugins and hooks to customize and extend backups.
  • Offers a metrics solution to monitor backups.
  • Volume backups are one part of the complete cluster backup system offered by Velero.
  • Storage providers: S3 (AWS, Ceph, Minio…), ABS, GCS. Volume snapshot providers: AWS EBS, Azure Managed Disks, GCE Persistent Disks, Restic, Portworx, DigitalOcean, OpenEBS, AlibabaCloud.
  • Works great for migrations or project backups, but permissions are difficult to manage; its use is best reserved for admins.

Stash

  • The Restic encryption password can be defined separately for each namespace in the cluster.
  • The elements needed to restore a backup are saved in the namespace of the project you back up; if you delete that namespace, the restore is more complex, but still possible.
  • Offers a metrics solution to monitor backups.
  • Focuses on volume (PVC) backups.
  • Storage providers: S3 (AWS, Ceph, Minio…), ABS, GCS, Openstack Swift, Backblaze B2, Local.
  • Can be easily used to resize a persistent volume.