Datalore 2024.3 Help

Backup, migration & restore

This chapter provides general guidance on backing up and restoring a Datalore Enterprise deployment.

General considerations

There are two places where important data resides:

  • PostgreSQL database

    You can back up and restore data using native database tools (or cloud-native tools, if you are using a managed database such as Amazon RDS).

  • Block storage

    In both Kubernetes and Docker installation methods, Datalore provides no built-in backup and restore mechanism. Instead, use the tools of your underlying infrastructure provider to back up the volume used by Datalore.
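As an illustration of the two backup paths above, here is a minimal sketch that wraps pg_dump, pg_restore, and an AWS EBS snapshot in small shell functions. The host, port, user, database name, dump file name, and volume ID are hypothetical placeholders rather than Datalore defaults, and the snapshot function assumes the volume is an EBS volume; other providers have their own tools. The script only prints the commands unless DRY_RUN is set to 0.

```shell
#!/bin/sh
# Sketch only: every value below is a hypothetical placeholder.
DB_HOST="${DB_HOST:-localhost}"
DB_PORT="${DB_PORT:-5432}"
DB_USER="${DB_USER:-postgres}"
DB_NAME="${DB_NAME:-datalore}"
DUMP_FILE="${DUMP_FILE:-datalore.dump}"
VOLUME_ID="${VOLUME_ID:-vol-0123456789abcdef0}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = execute them

# In dry-run mode print the command (arguments joined by spaces,
# quoting is not preserved); otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Dump the database in PostgreSQL's custom archive format (-Fc).
backup_database() {
  run pg_dump -h "$DB_HOST" -p "$DB_PORT" -U "$DB_USER" -d "$DB_NAME" -Fc -f "$DUMP_FILE"
}

# Restore the dump, dropping existing objects first.
restore_database() {
  run pg_restore -h "$DB_HOST" -p "$DB_PORT" -U "$DB_USER" -d "$DB_NAME" --clean --if-exists "$DUMP_FILE"
}

# Snapshot the block storage volume (assumes AWS EBS).
snapshot_volume() {
  run aws ec2 create-snapshot --volume-id "$VOLUME_ID" --description datalore-backup
}
```

To review the commands, source the file and call backup_database or snapshot_volume; set DRY_RUN=0 to actually run them. pg_dump and pg_restore read the password from the PGPASSWORD environment variable or a ~/.pgpass file.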

Migration

Sometimes, you need to migrate an existing environment to another one rather than back it up (for example, moving from a PoC environment to production that runs in a different Kubernetes cluster).

If that's the case, proceed as follows:

  1. Export the metadata from the old environment

  2. Deploy pods to a new environment

  3. Import PersistentVolume/PersistentVolumeClaims metadata

  4. Patch PV with the correct UID of the just created PVC

  5. Patch PVs to prevent them from being automatically deleted:

    kubectl patch pv ${PV_NAME} -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'

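  6. Save metadata:

    kubectl get pv/${PV_NAME} --export -o yaml > ${PV_NAME}.yaml
    kubectl get pvc/${PVC_NAME} --export -o yaml > ${PVC_NAME}.yaml

    The --export flag was removed in kubectl 1.18. With newer versions, omit the flag and remove cluster-specific fields (such as metadata.uid, metadata.resourceVersion, and status) from the saved files before applying them.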

  7. Deploy a new Datalore installation to the new cluster; then delete the (new) PVC along with the (new) PV:

    kubectl delete pvc/${PVC_NAME}

  8. Kubernetes uses the UID to determine the connection between a PVC and a PV. Therefore, you need to re-create the PVC and the PV from the metadata saved in step 6 and patch the PV with the UID of the newly created PVC:

    kubectl apply -f ${PVC_NAME}.yaml
    PVC_UID=$(kubectl get pvc/${PVC_NAME} -o jsonpath='{.metadata.uid}')
    kubectl apply -f ${PV_NAME}.yaml
    kubectl patch pv ${PV_NAME} -p "{\"spec\":{\"claimRef\":{\"uid\":\"${PVC_UID}\"}}}"

  9. Stop/remove the old deployment so that you can use this volume in the new cluster.
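Step 8 is the easiest one to get wrong, so it can help to wrap it in a function and check the result. The sketch below is one possible way to do that, not an official procedure. The function names are hypothetical, and it assumes the saved ${PV_NAME}.yaml and ${PVC_NAME}.yaml files are in the current directory and that kubectl points at the new cluster and the Datalore namespace.

```shell
#!/bin/sh
# Sketch of step 8 plus a verification. Function names are illustrative.

# Build the JSON patch that sets spec.claimRef.uid on a PV.
claimref_patch() {
  printf '{"spec":{"claimRef":{"uid":"%s"}}}' "$1"
}

# Re-create the PVC and PV from the saved files, then point the PV
# at the UID of the newly created PVC.
# Usage: rebind_volume <pv-name> <pvc-name>
rebind_volume() {
  pv_name=$1
  pvc_name=$2
  kubectl apply -f "${pvc_name}.yaml"
  pvc_uid=$(kubectl get "pvc/${pvc_name}" -o jsonpath='{.metadata.uid}')
  kubectl apply -f "${pv_name}.yaml"
  kubectl patch pv "$pv_name" -p "$(claimref_patch "$pvc_uid")"
}

# Succeed only if the PV's claimRef carries the UID of the PVC.
# Usage: verify_binding <pv-name> <pvc-name>
verify_binding() {
  pv_name=$1
  pvc_name=$2
  pvc_uid=$(kubectl get "pvc/${pvc_name}" -o jsonpath='{.metadata.uid}')
  pv_claim_uid=$(kubectl get "pv/${pv_name}" -o jsonpath='{.spec.claimRef.uid}')
  [ -n "$pvc_uid" ] && [ "$pvc_uid" = "$pv_claim_uid" ]
}
```

For example, after sourcing the file, run rebind_volume "$PV_NAME" "$PVC_NAME" and then verify_binding "$PV_NAME" "$PVC_NAME" && echo "PV is bound to the new PVC".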

When deployed in Kubernetes, Datalore uses two volume claims (if installed without Hub):

    $ kubectl get pvc
    NAME                         STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
    postgresql-data-datalore-0   Bound    pvc-2d0f1d24-0ad0-438d-9066-568be44212ca   2Gi        RWO            gp2            17h
    storage-datalore-0           Bound    pvc-1d103578-c395-4d89-9b5f-778864d4dfac   10Gi       RWO            gp2            17h

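The VOLUME column holds the name of the PV bound to each claim, which is the value the migration commands expect in ${PV_NAME}. The following sketch looks it up per claim instead of copying it by hand. The function names are hypothetical, and the two claim names are taken from the output above, so adjust them if your installation uses different ones.

```shell
#!/bin/sh
# Sketch: resolve the PV bound to a PVC. Function names are illustrative.

# Print the name of the PV bound to the given PVC.
# Usage: pv_for_pvc <pvc-name>
pv_for_pvc() {
  kubectl get "pvc/$1" -o jsonpath='{.spec.volumeName}'
}

# Print one "<pvc-name> <pv-name>" line for each Datalore claim.
list_datalore_volumes() {
  for pvc in postgresql-data-datalore-0 storage-datalore-0; do
    printf '%s %s\n' "$pvc" "$(pv_for_pvc "$pvc")"
  done
}
```

For example, PV_NAME=$(pv_for_pvc storage-datalore-0) sets the variable used in the migration steps for the storage volume.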
Last modified: 25 June 2024