Most of our applications are deployed in Kubernetes with persistent volumes (PVCs). This post is about those volumes, and in particular about how to back them up to avoid losing data.
By Maud Laurent, Sys admin @ Objectif Libre
Targeted audience: Kubernetes admins
Kubernetes: Backup your Stateful apps
Why do we focus on Stateful apps?
Because stateless apps are built from YAML files, which can be versioned with git, deployed automatically with CI/CD, and so on. No data is stored inside these apps; their configuration lives in ConfigMaps or Secrets.
Stateful apps, on the other hand, store data, mostly on attached volumes. These volumes hold all the information the apps need to run properly, which makes backing them up a priority.
Tools to make your own backups
To back up volumes inside Kubernetes, we are going to look at two applications: Velero and Stash.
Velero is a backup tool that is not limited to volume backups: it can also back up your whole cluster (pods, services, volumes, …), with filtering by labels or by Kubernetes object types.
Stash is a tool only focused on volume backups.
These two applications use the same underlying tool to manage backups: Restic. Restic is a backup manager that creates and restores backups, and it encrypts the backup data to guarantee its confidentiality and integrity.
Restic is built to secure your data against such attackers, by encrypting it with AES-256 in counter mode and authenticating it using Poly1305-AES. (source: https://restic.net)
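Neither Velero nor Stash requires you to drive Restic by hand, but to give an idea of what happens under the hood, here is a rough sketch of the Restic CLI against an S3 backend (the endpoint, bucket and credentials are placeholders):

export AWS_ACCESS_KEY_ID=Your_Access_Key
export AWS_SECRET_ACCESS_KEY=Your_Secret_Key
export RESTIC_PASSWORD=Sup3rResticPwd    # encryption password for the repository

# initialize an encrypted repository in an S3 bucket
restic -r s3:http://s3.example.local/my-backups init

# back up a directory, then list the snapshots stored in the repository
restic -r s3:http://s3.example.local/my-backups backup /var/log/nginx
restic -r s3:http://s3.example.local/my-backups snapshots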
Our use case
To explain this in more detail, we are going to run an example and back up our data with these two tools.
The running app is a simple Nginx web server with a NodePort service. The Nginx logs are saved in a persistent volume.
The cluster used has 1 master and 2 workers, with Ceph object (S3) and block storage installed using Rook.
The Nginx application is deployed using a deployment yaml, a service yaml and a volume yaml. These files are available below. The following command creates the test namespace and applies the deployment yamls.
kubectl create ns demo-app && kubectl apply -n demo-app -f pvc-log-1Gi-no-ns.yml -f deployment-no-ns.yml -f service-no-ns.yml
Below, you will find the yaml file used to create the deployment on Kubernetes:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: demo-app
  name: demo-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: demo-app
  template:
    metadata:
      labels:
        app: demo-app
      name: web-app
    spec:
      containers:
        - name: nginx-app
          image: nginx
          imagePullPolicy: IfNotPresent
          volumeMounts:
            - mountPath: /var/log/nginx/
              name: log-data
      restartPolicy: Always
      volumes:
        - name: log-data
          persistentVolumeClaim:
            claimName: demo-app-log
Below, you will find the yaml file used to create the persistent volume claim on Kubernetes:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-app-log
spec:
  storageClassName: rook-ceph-block
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
Below, you will find the yaml file used to create the service on Kubernetes:
apiVersion: v1
kind: Service
metadata:
  name: demo-app
  labels:
    app: demo-app
spec:
  selector:
    app: demo-app
  ports:
    - port: 80
      protocol: TCP
      targetPort: 80
  type: NodePort
To populate the Nginx logs, we used this command: watch -n 30 curl IP_Master:NodePort
Our volume stores the Nginx application logs. With this kubectl command:
kubectl exec -it -n demo-app demo-app-5b955c984d-x7pqf -c nginx-app -- wc -l /var/log/nginx/access.log
we can see the number of lines in the access.log file and check the accesses to our web page. We will also use this command later to make sure the backup restoration worked.
Velero
Installation
- Download the velero cli (here we used version 1.1.0): https://github.com/heptio/velero/releases/tag/v1.1.0
- Create a storage access file for S3:
[default]
aws_access_key_id = Your_Access_Key
aws_secret_access_key = Your_Secret_Key
- Install Velero with the CLI (add --dry-run -o yaml to only see the yaml that would be applied):
velero install --provider aws \
  --bucket velero-backup \
  --secret-file ./velero-creds \
  --use-volume-snapshots=true \
  --backup-location-config region=":default-placement",s3ForcePathStyle="true",s3Url=http://rook-ceph-rgw-my-store.rook-ceph.svc.cluster.local \
  --snapshot-location-config region=":default-placement" \
  --use-restic
Or with Helm:
- Create a secret with the storage access information:
kubectl create secret generic s3-velero-creds -n velero --from-file cloud=velero-creds
- Create a settings file values.yml:
configuration:
  provider: aws
  backupStorageLocation:
    name: aws
    bucket: velero-backup
    config:
      region: ":default-placement"
      s3ForcePathStyle: true
      s3Url: http://rook-ceph-rgw-my-store.rook-ceph.svc.cluster.local
credentials:
  existingSecret: s3-velero-creds
deployRestic: true
- Run the Helm install
helm-3 install -f values.yml velero --namespace velero --version 2.1.3 stable/velero
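Whichever installation method you choose, a quick sanity check is to verify that the Velero server and the restic daemonset pods are running before going further (pod names will of course differ):

kubectl get pods -n velero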
Run a backup
To back up our volumes with Restic, Velero requires an annotation on the pods that contain these volumes: kubectl -n demo-app annotate pod/demo-app-57f87559b6-jhdfk backup.velero.io/backup-volumes=log-data. This annotation can also be written directly in the application deployment yaml file, under spec.template.metadata.annotations.
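For reference, here is roughly what that looks like in the deployment manifest (only the relevant part of the pod template is shown):

spec:
  template:
    metadata:
      labels:
        app: demo-app
      annotations:
        backup.velero.io/backup-volumes: log-data   # name of the volume to back up with Restic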
Backups can be created with the command velero backup create demo-backup --include-namespaces demo-app or by applying a yaml file (https://velero.io/docs/v1.1.0/api-types/backup/).
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: demo-app-backup
  namespace: velero  # must match the velero server namespace
spec:
  includedNamespaces:
    - demo-app
  ttl: 24h0m0s  # default 720h0m0s
  storageLocation: default  # backup storage location
  volumeSnapshotLocations:
    - default
With this file, only one backup is made, but it is possible to schedule backups with the velero schedule create demo-app-schedule-backup --schedule="@every 5m" --include-namespaces demo-app --ttl 0h30m00s command or with the yaml file right below.
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: demo-app-schedule-backup
  namespace: velero
spec:
  schedule: '@every 5m'
  template:
    includedNamespaces:
      - demo-app
    ttl: 0h30m00s
We can create several backup storage locations and choose where each backup is saved.
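For example, an additional location can be declared with a BackupStorageLocation object; a minimal sketch, assuming a hypothetical second bucket velero-backup-2 on the same Ceph S3 endpoint:

apiVersion: velero.io/v1
kind: BackupStorageLocation
metadata:
  name: secondary
  namespace: velero
spec:
  provider: aws
  objectStorage:
    bucket: velero-backup-2   # hypothetical second bucket
  config:
    region: ":default-placement"
    s3ForcePathStyle: "true"
    s3Url: http://rook-ceph-rgw-my-store.rook-ceph.svc.cluster.local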
PS: To delete a backup in Velero, it is better to use the velero CLI rather than deleting it with the kubectl CLI.
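For example, to delete the backup created earlier:

velero backup delete demo-app-backup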
Restore backed up data
Before restoring a previously saved backup, we are going to delete the deployment namespace with kubectl delete ns demo-app. The list of backups is available with velero backup get. We can create a restoration with the command velero restore create --from-backup demo-app-backup or with a yaml file like this one:
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: demo-app-restore
  namespace: velero
spec:
  backupName: demo-app-backup
  excludedResources:
    - nodes
    - events
    - events.events.k8s.io
    - backups.velero.io
    - restores.velero.io
    - resticrepositories.velero.io
Just like with backups, it is possible to see the restoration state with the command velero restore get and to get more details with the describe or logs subcommands.
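For example:

velero restore get
velero restore describe demo-app-restore
velero restore logs demo-app-restore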
Once the restoration is complete, we have all our data available. To check it, we can run the command kubectl get all,pvc -n demo-app.
In my case, the result is the following:
NAME                            READY   STATUS    RESTARTS   AGE
pod/demo-app-5b955c984d-x7pqf   2/2     Running   0          109s

NAME               TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
service/demo-app   NodePort   10.233.33.213   <none>        80:31010/TCP   105s

NAME                       READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/demo-app   1/1     1            1           108s

NAME                                  DESIRED   CURRENT   READY   AGE
replicaset.apps/demo-app-5b955c984d   1         1         1       108s

NAME                                 STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
persistentvolumeclaim/demo-app-log   Bound    pvc-7f302738-15ad-4294-8e8e-6407e42f8bf3   1Gi        RWO            rook-ceph-block   110s
We can see that a Velero backup restoration keeps exactly the same pod name, and that the restoration can be applied either in the same namespace as the previous deployment or in a different one.
After the restoration, running the command kubectl exec -it -n demo-app demo-app-5b955c984d-x7pqf -c nginx-app -- wc -l /var/log/nginx/access.log shows that our file already contains lines, even though no access has been made to the web server yet.
Stash
Installation
Stash can be installed using Helm, following the instructions below:
kubectl create ns stash
helm-3 repo add appscode https://charts.appscode.com/stable/
helm-3 repo update
helm-3 install --version 0.8.3 stash --namespace stash appscode/stash
Run a backup
All our backups are stored in S3. Stash needs a secret with the storage authentication information, as well as a Restic password: Restic encrypts the data with this password before saving it to S3. This secret is created in the namespace of the project to back up, which allows you to specify an S3 access and a Restic password for each project/namespace.
apiVersion: v1
kind: Secret
metadata:
  name: s3-secret
  namespace: demo-app
data:
  AWS_ACCESS_KEY_ID: czNfc3RvcmFnZV9rZXlfaWQ=          # s3_storage_key_id
  AWS_SECRET_ACCESS_KEY: czNfc3RvcmFnZV9rZXlfc2VjcmV0  # s3_storage_key_secret
  RESTIC_PASSWORD: U3VwM3JSZXN0aWNQd2Q=                # Sup3rResticPwd
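The same secret can also be created directly on the command line, letting kubectl do the base64 encoding:

kubectl create secret generic s3-secret -n demo-app \
  --from-literal=AWS_ACCESS_KEY_ID=s3_storage_key_id \
  --from-literal=AWS_SECRET_ACCESS_KEY=s3_storage_key_secret \
  --from-literal=RESTIC_PASSWORD=Sup3rResticPwd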
Stash offers two types of backup:
- an online backup (hot backup): Stash adds a sidecar container to the current deployment pods. This sidecar mounts the volume read-only (RO) and makes backups while the app is running.
- an offline backup (cold backup): Stash adds an init container to the pod and creates a Kubernetes cronjob. This cronjob runs the backups; the pod is recreated for each backup so that the init container is executed.
In our case, we are going to apply an online backup with the yaml below.
apiVersion: stash.appscode.com/v1alpha1
kind: Restic
metadata:
  name: rook-restic
  namespace: demo-app
spec:
  type: online  # default value
  selector:
    matchLabels:
      app: demo-app  # must match the label of the pods we want to back up
  fileGroups:
    - path: /var/log/nginx
      retentionPolicyName: 'keep-last-5'
  backend:
    s3:
      endpoint: 'http://rook-ceph-rgw-my-store.rook-ceph.svc.cluster.local'
      bucket: stash-backup  # name of the bucket where you want to back up
      prefix: demo-app  # prefix of the directory where the repository will be created (optional)
    storageSecretName: s3-secret
  schedule: '@every 5m'
  volumeMounts:
    - mountPath: /var/log/nginx
      name: log-data  # name of the volume set in the deployment, not the claimName
  retentionPolicies:
    - name: 'keep-last-5'
      keepLast: 5
      prune: true
When the yaml is applied, the pods are recreated to add the sidecar that performs the backups. Backups can be paused whenever you want by patching the restic object with the command kubectl patch restic -n demo-app rook-restic --type="merge" --patch='{"spec": {"paused": true}}'. The snapshots created by Stash are listed with the command kubectl get -n demo-app snapshots.repositories.stash.appscode.com.
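To resume the backups, the same patch with paused set back to false should do the trick (symmetric to the pause command above):

kubectl patch restic -n demo-app rook-restic --type="merge" --patch='{"spec": {"paused": false}}'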
Restore backed up data
Before restoring our system, we are going to delete our data.
Stash needs two elements to restore a backup: the repository and the storage secret. Both are located in the namespace of the project that was backed up.
Deleting the namespace means losing both of them. It is still possible to recreate them, by reapplying the secret and a repository configuration, in order to run the restoration.
In our case, we are going to delete our deployment, the volume, and our restic resource.
kubectl delete -n demo-app deployment demo-app
kubectl delete -n demo-app pvc demo-app-log
kubectl delete -n demo-app restics.stash.appscode.com rook-restic
To restore the system, we first create a new empty volume (of a size equal to or larger than the previous one).
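The pvc-recovery.yml file is not reproduced here; a plausible version, deduced from the recovered PVC shown below (name, size and storage class), would be:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-app-log-recovery
spec:
  storageClassName: rook-ceph-block
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi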
kubectl apply -f pvc-recovery.yml -n demo-app
kubectl get pvc -n demo-app
persistentvolumeclaim/demo-app-log-recovery   Bound   pvc-5a0ba64e-7cf8-49d8-a5a9-ac071160da11   2Gi   RWO   rook-ceph-block   4s
After that, we run the recovery. The yaml file below copies the backed up data to the newly created volume: Stash creates a Kubernetes Job which mounts the volume and copies the data into it.
apiVersion: stash.appscode.com/v1alpha1
kind: Recovery
metadata:
  name: s3-recovery
  namespace: demo-app
spec:
  repository:
    name: deployment.demo-app
    namespace: demo-app
  snapshot: deployment.demo-app-70b545c2  # name of the snapshot to restore
  paths:  # paths we want to restore
    - /var/log/nginx
  recoveredVolumes:  # where we want to restore
    - mountPath: /var/log/nginx
      persistentVolumeClaim:
        claimName: demo-app-log-recovery
kubectl get recoveries.stash.appscode.com -n demo-app -w
NAME          REPOSITORY-NAMESPACE   REPOSITORY-NAME       SNAPSHOT                       PHASE       AGE
s3-recovery   demo-app               deployment.demo-app   deployment.demo-app-70b545c2   Running     18s
s3-recovery   demo-app               deployment.demo-app   deployment.demo-app-70b545c2   Succeeded   39s
Once the restoration has finished without errors, we only need to re-apply the previous deployment, changing the volume claim name to the restored one.
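Concretely, only the persistentVolumeClaim reference in the deployment's volumes section changes:

volumes:
  - name: log-data
    persistentVolumeClaim:
      claimName: demo-app-log-recovery   # instead of demo-app-log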
So, Velero or Stash to back up your persistent volumes?
Velero
- Uses the same Restic password for all backed up volumes, so be careful with access to the storage location.
- All elements used for backups are saved in the velero namespace. If an application namespace is deleted, the restoration is not compromised.
- Comes with a system of plugins and hooks to customize and extend backups.
- Offers a metrics solution to monitor backups.
- Volume backups are one part of the whole cluster backup system offered by Velero.
- Storage providers: S3 (AWS, Ceph, Minio…), Azure Blob Storage, GCS. Volume snapshot providers: AWS EBS, Azure Managed Disks, GCE Persistent Disks, Restic, Portworx, DigitalOcean, OpenEBS, Alibaba Cloud.
- Works great for migrations or project backups, but permissions are difficult to manage; its use should be reserved for admins.
Stash
- The Restic encryption password can be defined per namespace in the cluster.
- The data needed to restore a backup is saved in the namespace of the project you back up. In that case, if you delete the namespace, the restoration is more complex, but still possible.
- Offers a metrics solution to monitor backups.
- Focus on volume (pvc) backups.
- Storage providers: S3 (AWS, Ceph, Minio…), Azure Blob Storage, GCS, OpenStack Swift, Backblaze B2, local storage.
- Can be easily used to resize a persistent volume.