Kubernetes Storage Explained: Persistent Volumes, Storage Classes, and StatefulSets

Kubernetes storage is the foundation for running stateful workloads like MySQL, PostgreSQL, Kafka, and other data-critical systems reliably. While containers are designed to be ephemeral, production systems need data to survive restarts, rescheduling, and node failures. Persistent Volumes (PVs), PersistentVolumeClaims (PVCs), Storage Classes, and StatefulSets work together to provide durable, policy-driven storage.
This guide explains how these components fit together, when to use each one, and what has changed in recent Kubernetes releases to make stateful storage easier to manage.

Why Kubernetes Storage Needs PVs, PVCs, and StatefulSets
In Kubernetes, a Pod can be deleted and recreated at any time. If storage is tied to the Pod lifecycle, the data disappears. Kubernetes storage solves this by decoupling storage from Pods:
Persistent Volumes (PVs) represent actual storage capacity available to the cluster.
PersistentVolumeClaims (PVCs) are requests for storage made by applications.
Storage Classes define how storage is provisioned and what policies it follows.
StatefulSets manage Pods that need stable identities and stable storage.
Most modern clusters rely on dynamic provisioning through CSI (Container Storage Interface) drivers. CSI reached general availability in Kubernetes v1.13, and the older in-tree, platform-specific volume plugins have since been progressively migrated to CSI drivers. The CSI ecosystem supports a broad range of storage backends across public cloud and on-premises environments, making Kubernetes storage more portable than it was with the earlier plugins.
Persistent Volumes and PersistentVolumeClaims
What is a Persistent Volume?
A Persistent Volume is a cluster-wide resource that represents real storage. It can map to:
Cloud block volumes (for example, AWS EBS, GCE PD, Azure Disk)
Network storage (NFS, iSCSI)
Distributed storage systems via CSI drivers
hostPath, typically limited to local development and testing
PVs carry properties such as capacity, volume type, and access modes. A common access mode for databases is ReadWriteOnce (RWO), which allows a single node to mount the volume with read-write access. This aligns with how many database engines handle local file locks and consistency.
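As a minimal sketch of these properties, the manifest below defines a statically created PV backed by hostPath, which makes it suitable only for a single-node development cluster. The name pv-dev-data, the path /mnt/data, and the class name manual are illustrative assumptions, not values taken from any particular cluster.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-dev-data               # hypothetical name
spec:
  capacity:
    storage: 10Gi                 # capacity advertised to claims
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce               # read-write mount from a single node
  persistentVolumeReclaimPolicy: Retain
  storageClassName: manual        # hypothetical; a PVC must request the same class to bind
  hostPath:
    path: /mnt/data               # hypothetical directory on the node (dev/test only)
```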
What is a PersistentVolumeClaim?
A PersistentVolumeClaim is a request for storage, similar to how a Pod requests CPU and memory. It specifies:
Storage size (for example, 10Gi)
Access mode (for example, ReadWriteOnce)
Storage class, which determines provisioning behavior
Kubernetes binds a PVC to a matching PV. When dynamic provisioning is enabled, the PV is created automatically on demand by a provisioner, usually a CSI driver.
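The request described above might look like the following sketch. The claim name app-data, the Pod, and the busybox image are assumptions made for illustration, and the standard class is assumed to exist in the cluster.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data                  # hypothetical claim name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: standard      # assumes a class with this name exists
  resources:
    requests:
      storage: 10Gi
---
# A Pod consumes the claim by name; it never references the PV directly.
apiVersion: v1
kind: Pod
metadata:
  name: app                       # hypothetical Pod
spec:
  containers:
    - name: app
      image: busybox:1.36
      command: ["sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: app-data
```

Because the Pod refers only to the claim, the same manifest works whether the PV behind it was created statically or provisioned on demand.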
Static vs Dynamic Provisioning
Static provisioning means an administrator pre-creates PVs, and workloads bind to them. This approach suits local PVs, legacy storage, or tightly controlled environments, but it increases operational overhead.
Dynamic provisioning uses a Storage Class and a provisioner to create PVs automatically when PVCs are submitted. It is generally the better fit for production because it removes manual PV management and keeps pace as the number of stateful workloads grows.
Reclaim Policies and Avoiding Orphaned Storage
PVs have a reclaim policy that controls what happens after a PVC is deleted. The two primary policies are:
Delete: removes the underlying storage asset. This is the common default for dynamically provisioned volumes.
Retain: preserves the underlying storage asset for manual recovery or auditing.
Using Delete for dynamically provisioned volumes prevents storage from accumulating unnoticed after PVC cleanup. For regulated or safety-critical data, Retain may be preferred, but it requires a clear operational process to manage leftover volumes.
Storage Classes: the Policy Layer of Kubernetes Storage
A StorageClass defines a category of storage and how it should be provisioned. It typically includes:
Provisioner: usually a CSI driver
Parameters: encryption, performance tier, replication factor, filesystem type, and related settings
Reclaim policy: Delete or Retain
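Binding mode: volumeBindingMode controls when a volume is provisioned and bound. Immediate creates it as soon as the PVC is submitted; WaitForFirstConsumer delays creation until a Pod using the PVC is scheduled, so the volume lands in a zone that Pod can reach.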
If a PVC does not specify storageClassName, Kubernetes uses the cluster default Storage Class, if one is configured.
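The four fields listed above map directly onto a StorageClass manifest. The sketch below assumes the AWS EBS CSI driver is installed; parameter keys are driver-specific, so other provisioners will use different ones, and the class name is only an example.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3-encrypted             # example name; choose one that fits your conventions
provisioner: ebs.csi.aws.com      # assumes the AWS EBS CSI driver
parameters:
  type: gp3                       # performance tier (driver-specific key)
  encrypted: "true"               # encryption at rest (driver-specific key)
  csi.storage.k8s.io/fstype: ext4 # filesystem created on the volume
reclaimPolicy: Delete             # remove the backing disk when the PVC is deleted
volumeBindingMode: WaitForFirstConsumer
```

To make a class the cluster default, set the annotation storageclass.kubernetes.io/is-default-class to "true" in its metadata.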
Storage Class Examples Found in Real Clusters
General-purpose SSD: names like standard are common cluster defaults.
Encrypted SSD: classes like gp3-encrypted are frequently used for production databases where encryption at rest is required.
Highly available replicated storage: some CSI-based platforms provide classes with multi-replica durability and encryption for critical workloads.
As production usage has matured, encrypted and replicated volumes have become standard for critical stateful workloads, a trend closely tied to the near-universal adoption of CSI in production clusters.
StatefulSets: Stable Identities and Stable Storage
A StatefulSet is the Kubernetes workload API designed for stateful applications. It provides capabilities that a Deployment does not guarantee:
Stable network identities: predictable Pod names like mysql-0, mysql-1
Ordered operations: Pods are created sequentially (0 to N-1) and terminated in reverse order
Stable storage: each Pod receives its own PVC through volumeClaimTemplates
This design is why databases and log-based systems are typically deployed using StatefulSets. Many teams standardize on StatefulSets for persistent databases, message queues, and other systems where identity and storage must remain consistent across rescheduling events.
How volumeClaimTemplates Work
With volumeClaimTemplates, Kubernetes generates one PVC per replica. For example, a template named data results in PVCs like:
data-mysql-0
data-mysql-1
data-mysql-2
Each PVC binds to a PV that matches the request. If the Pod is rescheduled, it reattaches to the same volume, preserving data integrity.
MySQL StatefulSet Example (3 Replicas with 10Gi Each)
The following example shows the essential pattern: RWO storage per Pod, dynamically provisioned via a Storage Class.
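```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: "mysql"            # name of the governing headless Service
  replicas: 3
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql                # must match spec.selector, or the API server rejects the object
    spec:
      containers:
        - name: mysql
          image: mysql:8.0
          # The mysql image also needs root-password configuration
          # (for example MYSQL_ROOT_PASSWORD from a Secret); omitted here
          # to keep the focus on storage.
          volumeMounts:
            - name: data
              mountPath: /var/lib/mysql
  volumeClaimTemplates:
    - metadata:
        name: data                # yields PVCs data-mysql-0, data-mysql-1, data-mysql-2
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: "standard"
        resources:
          requests:
            storage: 10Gi
```
This pattern ensures each MySQL instance retains its own data and that the same PV is reattached after restarts.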
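The serviceName: "mysql" field refers to a headless Service that gives each Pod its stable DNS name. A minimal sketch of that Service follows; the port name and number are assumptions based on MySQL's usual default.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: mysql                     # must match the StatefulSet's serviceName
  labels:
    app: mysql
spec:
  clusterIP: None                 # headless: DNS resolves to individual Pod IPs
  selector:
    app: mysql
  ports:
    - name: mysql
      port: 3306                  # assumed default MySQL port
```

With this in place, and assuming the default cluster domain, each replica is reachable at a name of the form mysql-0.mysql.<namespace>.svc.cluster.local.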
Operational Realities: Deletion, Scaling, and Retention
A common point of confusion is that deleting a StatefulSet or scaling it down does not automatically delete the associated PVCs and PVs. This behavior is intentional: Kubernetes defaults to protecting data from accidental workload deletion.
Kubernetes v1.32: PVC Retention Policy for StatefulSets
As of Kubernetes v1.32 (stable), StatefulSets support .spec.persistentVolumeClaimRetentionPolicy to control PVC lifecycle during deletion and scaling events. Example behaviors include:
whenDeleted: Delete: removes PVCs when the StatefulSet is deleted
whenScaled: Retain: keeps PVCs when scaling down, supporting later scale-up reuse
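Before it reached stable, this capability was controlled by the StatefulSetAutoDeletePVC feature gate (alpha in v1.23, beta from v1.27). Both fields default to Retain, which preserves the historical behavior described above. The policy improves storage hygiene in environments where automated teardown is safe and desired, while still allowing retention where recovery is a priority.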
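The two behaviors listed above could be combined as in the fragment below. It is only a sketch of where the field sits in a StatefulSet spec, not a complete manifest, and it reuses the mysql name purely for illustration.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  persistentVolumeClaimRetentionPolicy:
    whenDeleted: Delete           # remove PVCs when the StatefulSet itself is deleted
    whenScaled: Retain            # keep PVCs of removed replicas for a later scale-up
  # serviceName, replicas, selector, template, and volumeClaimTemplates
  # are omitted from this fragment and are still required in a real manifest.
```

Whether a deleted PVC also removes the underlying disk still depends on the reclaim policy of the bound PV.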
Best Practices for Kubernetes Storage in Production
Use Storage Classes with clear policies: define encryption, replication, performance tier, and reclaim policy explicitly.
Prefer dynamic provisioning: it reduces manual PV management and improves scalability.
Match access modes to the workload: RWO is typical for single-writer databases; validate multi-writer requirements carefully before using ReadWriteMany.
Plan PVC lifecycle deliberately: choose Retain or Delete with intention, and consider the StatefulSet PVC retention policy where appropriate.
Set security and permissions correctly: many database images require correct filesystem ownership. Use settings like fsGroup when needed - for example, some PostgreSQL images run under a specific UID such as 999.
Allow graceful shutdown: stateful services often need longer termination grace periods to flush data safely before a Pod is removed.
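Several of these practices can be seen together in one manifest. The sketch below is an assumption-heavy illustration rather than a recommended configuration: the Secret postgres-credentials, the 120-second grace period, and the gp3-encrypted class are hypothetical, and the GID 999 should be checked against the image actually in use.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres           # assumes a headless Service named postgres exists
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      terminationGracePeriodSeconds: 120   # hypothetical value; allow time to flush and shut down
      securityContext:
        fsGroup: 999              # assumed GID of the postgres user in this image
      containers:
        - name: postgres
          image: postgres:16
          env:
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: postgres-credentials   # hypothetical Secret
                  key: password
            - name: PGDATA        # use a subdirectory so a lost+found entry at the mount root does not block initialization
              value: /var/lib/postgresql/data/pgdata
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]     # single-writer database
        storageClassName: gp3-encrypted    # assumes an encrypted class with this name exists
        resources:
          requests:
            storage: 10Gi
```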
Use Cases Where StatefulSets and Kubernetes Storage Excel
MySQL clusters: dedicated RWO PVCs per replica, resilient across restarts and rescheduling.
PostgreSQL: encrypted Storage Classes, correct fsGroup configuration, and controlled shutdown timing.
Kafka and messaging systems: stable Pod identities support partition assignments and log persistence.
Development and testing: hostPath or pre-provisioned PVs can serve local clusters but are not suitable for production use.
High-availability databases on distributed storage: replicated encrypted volumes through a CSI provider reduce risk from node or disk failure.
What to Expect Next for Kubernetes Storage
The Kubernetes storage roadmap points toward greater automation and observability. Capabilities like CSI volume health monitoring (the CSIVolumeHealth feature gate) aim to report abnormal volume conditions and support improved self-healing workflows. Multi-tenant platforms are also increasingly using multiple Storage Classes as quality-of-service tiers for AI/ML and data platforms. Edge Kubernetes distributions continue refining patterns around local PVs and lightweight stateful operations.
Conclusion
Kubernetes storage becomes straightforward to reason about once responsibilities are clearly separated: PVs are the underlying resources, PVCs are the requests, Storage Classes encode provisioning and policy, and StatefulSets provide the identity and lifecycle guarantees needed for persistent systems. For most teams, the most reliable approach combines CSI-based dynamic provisioning, encrypted Storage Classes, and StatefulSets with volumeClaimTemplates for databases and message queues.
To build practical skills in this area, consider structured learning paths covering Kubernetes administration, cloud security, and DevOps automation. Certifications focused on Kubernetes, DevOps, and cybersecurity can help teams manage stateful workloads with greater confidence and operational maturity.