Persistent Storage on CloudFlow
Application developers can use standard Kubernetes Persistent Volumes to give their applications persistent storage that lives beyond the lifetime of a pod. Without persistent volumes, pods can read and write data to ephemeral disk, but the lifecycle of that data ends when the pod terminates: a database of any kind placed on an ephemeral disk lasts only as long as the pod, and then it is gone. Persistent storage solves this problem by giving data a lifecycle longer than that of a single pod.
Persistent storage can be used for:
- horizontal scaling of a pod, so that multiple replicas have access to common data, such as a cache
- different pods of a microservice application, giving those pods a common source of truth for whatever data they might need
- data that needs to survive a pod that crashes and restarts
- a database for your distributed application, such as Postgres, MySQL, SQLite, or others
- a document store
- a persistent cache, which might be used with our Varnish Cache tutorial
- a key-value (KV) store
- an object store, such as MinIO
How it Works
Persistent volumes are managed by standard Kubernetes tooling, such as kubectl or .yaml files that contain the required fields and object specification.
Persistent volumes in CloudFlow are created dynamically as a result of a claim: when your deployment to CloudFlow includes a Persistent Volume Claim, a persistent volume is created dynamically to satisfy it. Your pod then mounts a volume that references the claim to gain access to the data.
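As a minimal sketch of the claim-then-mount flow described above, the manifests below define a Persistent Volume Claim and a pod that mounts the volume bound to it. The names (`app-data`, `app`), the image, and the storage size are placeholder assumptions; check CloudFlow's documentation for any required storage class.

```yaml
# Hypothetical example: request 1Gi of dynamically provisioned storage.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
# A pod that mounts the volume bound to the claim above at /data.
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: nginx
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: app-data
```

Applying these manifests with `kubectl apply -f` triggers dynamic provisioning; `kubectl get pvc app-data` shows whether the claim was bound.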
The claim may fail if resources are not available, and writes to the volume can fail if you exceed its capacity.
Managing Data Between Locations
Persistent volumes are created within a cluster, within a CloudFlow location. Note that a persistent volume does not relocate with your project when CloudFlow moves it due to changes in traffic.
Strategies for sharing data between locations include configuring replication between persistent volumes residing in different locations. We have a guide that explains how you can set this up (coming). You might choose a static location configuration for your project and then replicate data between those locations to provide a globally distributed persistent data store.
Strategies for making your persistent data available in new locations as CloudFlow creates them include continuously streaming changes to object storage such as AWS S3 or Azure Blob Storage. Litestream.io supports this approach for SQLite, allowing you to quickly restore data when your project becomes live in a new location.
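To illustrate the streaming approach, here is a sketch of a Litestream configuration that continuously replicates a SQLite database to an S3 bucket. The database path and bucket name are placeholder assumptions for this example.

```yaml
# Hypothetical litestream.yml: stream SQLite changes to S3 so a new
# location can restore the latest state before serving traffic.
dbs:
  - path: /data/app.db
    replicas:
      - url: s3://my-bucket/app-db
```

In this setup, `litestream replicate` runs alongside the application (for example, as a sidecar container) to stream changes, and `litestream restore` rebuilds the database from the replica when the project starts in a new location.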
CloudFlow provisions the volume as resources allow, but the application developer remains responsible for replication, data backup, disaster recovery, compliance and cryptographic requirements, and disk destruction.