Kubernetes Storage Explained: Persistent Volumes, Storage Classes, and StatefulSets

Suyash Raizada

Kubernetes storage is the foundation for running stateful workloads like MySQL, PostgreSQL, Kafka, and other data-critical systems reliably. While containers are designed to be ephemeral, production systems need data to survive restarts, rescheduling, and node failures. Persistent Volumes (PVs), PersistentVolumeClaims (PVCs), Storage Classes, and StatefulSets work together to provide durable, policy-driven storage.

This guide explains how these components fit together, when to use each one, and what has changed in recent Kubernetes releases to make stateful storage easier to manage.

Why Kubernetes Storage Needs PVs, PVCs, and StatefulSets

In Kubernetes, a Pod can be deleted and recreated at any time. If storage is tied to the Pod lifecycle, the data disappears. Kubernetes storage solves this by decoupling storage from Pods:

  • Persistent Volumes (PVs) represent actual storage capacity available to the cluster.

  • PersistentVolumeClaims (PVCs) are requests for storage made by applications.

  • Storage Classes define how storage is provisioned and what policies it follows.

  • StatefulSets manage Pods that need stable identities and stable storage.

Most modern clusters rely on dynamic provisioning through CSI (Container Storage Interface) drivers. CSI reached general availability in Kubernetes v1.13 and has since replaced the older in-tree volume plugins as the standard integration point across distributions. The CSI ecosystem supports a broad range of storage backends across public cloud and on-premises environments, making Kubernetes storage more portable than earlier platform-specific plugins.

Persistent Volumes and PersistentVolumeClaims

What is a Persistent Volume?

A Persistent Volume is a cluster-wide resource that represents real storage. It can map to:

  • Cloud block volumes (for example, AWS EBS, GCE PD, Azure Disk)

  • Network storage (NFS, iSCSI)

  • Distributed storage systems via CSI drivers

  • hostPath, typically limited to local development and testing

PVs carry properties such as capacity, volume type, and access modes. A common access mode for databases is ReadWriteOnce (RWO), which allows a single node to mount the volume with read-write access. Several Pods on that node can still share the volume; ReadWriteOncePod restricts access to a single Pod. RWO aligns with how many database engines handle local file locks and consistency.
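
As a rough illustration of these properties, a statically defined PV backed by NFS might look like the sketch below. The object name, the storage class label, the server address, and the export path are all placeholders chosen for this example, not values from a real cluster.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv-example              # hypothetical name
spec:
  capacity:
    storage: 10Gi                   # capacity advertised to claims
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce                 # read-write mount from a single node
  persistentVolumeReclaimPolicy: Retain
  storageClassName: manual          # assumed label used only to match PVCs
  nfs:
    server: nfs.example.internal    # placeholder address
    path: /exports/data             # placeholder export path
```

A PVC requesting the same storageClassName, a compatible access mode, and 10Gi or less would be a candidate to bind to this volume.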

What is a PersistentVolumeClaim?

A PersistentVolumeClaim is a request for storage, similar to how a Pod requests CPU and memory. It specifies:

  • Storage size (for example, 10Gi)

  • Access mode (for example, ReadWriteOnce)

  • Storage class, which determines provisioning behavior

Kubernetes binds a PVC to a matching PV. When dynamic provisioning is enabled, the PV is created automatically on demand by a provisioner, usually a CSI driver.
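
The request described above can be sketched as a PVC plus a Pod that mounts it. This is a minimal example under assumptions: the names app-data and app are hypothetical, and it presumes a Storage Class called standard exists in the cluster.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data                    # hypothetical claim name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: standard        # assumes this class exists
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: app                         # hypothetical Pod name
spec:
  containers:
    - name: app
      image: busybox:1.36
      command: ["sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /data          # where the volume appears in the container
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: app-data         # refers to the PVC above
```

The Pod refers only to the claim by name; it never names the PV or the storage backend directly, which is what keeps the workload portable.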

Static vs Dynamic Provisioning

Static provisioning means an administrator pre-creates PVs, and workloads bind to them. This approach suits local PVs, legacy storage, or tightly controlled environments, but it increases operational overhead.

Dynamic provisioning uses a Storage Class and a provisioner to create PVs automatically when PVCs are submitted. Production teams strongly favor dynamic provisioning because it reduces manual operations and scales more effectively as stateful workloads grow.

Reclaim Policies and Avoiding Orphaned Storage

PVs have a reclaim policy that controls what happens after a PVC is deleted. The two primary policies are:

  • Delete: removes the underlying storage asset. This is the common default for dynamically provisioned volumes.

  • Retain: preserves the underlying storage asset for manual recovery or auditing.

Using Delete for dynamically provisioned volumes prevents storage from accumulating unnoticed after PVC cleanup. For regulated or safety-critical data, Retain may be preferred, but it requires a clear operational process to manage leftover volumes.

Storage Classes: the Policy Layer of Kubernetes Storage

A StorageClass defines a category of storage and how it should be provisioned. It typically includes:

  • Provisioner: usually a CSI driver

  • Parameters: encryption, performance tier, replication factor, filesystem type, and related settings

  • Reclaim policy: Delete or Retain

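  • Binding mode: controls when a volume is provisioned and bound. Immediate does so as soon as the PVC is created; WaitForFirstConsumer waits until a Pod using the PVC is scheduled, so the volume lands in the right zone or node.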

If a PVC does not specify storageClassName, Kubernetes uses the cluster default Storage Class, if one is configured.
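
To make the four fields concrete, here is a sketch of an encrypted SSD class. It assumes the AWS EBS CSI driver is installed in the cluster; the class name gp3-encrypted is illustrative, and other drivers use a different provisioner string and different parameter keys.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3-encrypted               # illustrative name
  annotations:
    # Marks this class as the cluster default; omit if another class
    # already carries this annotation.
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: ebs.csi.aws.com        # assumes the AWS EBS CSI driver
parameters:
  type: gp3                         # driver-specific performance tier
  encrypted: "true"                 # driver-specific encryption at rest
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true          # lets PVCs be resized later
```

Parameters are passed through to the driver as opaque strings, so they should always be checked against the documentation of the specific CSI driver in use.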

Storage Class Examples Found in Real Clusters

  • General-purpose SSD: names like standard are common cluster defaults.

  • Encrypted SSD: classes like gp3-encrypted are frequently used for production databases where encryption at rest is required.

  • Highly available replicated storage: some CSI-based platforms provide classes with multi-replica durability and encryption for critical workloads.

As production usage has matured, encrypted and replicated volumes have become standard for critical stateful workloads, a trend closely tied to the near-universal adoption of CSI in production clusters.

StatefulSets: Stable Identities and Stable Storage

A StatefulSet is the Kubernetes workload API designed for stateful applications. It provides capabilities that a Deployment does not guarantee:

  • Stable network identities: predictable Pod names like mysql-0, mysql-1

  • Ordered operations: Pods are created sequentially (0 to N-1) and terminated in reverse order

  • Stable storage: each Pod receives its own PVC through volumeClaimTemplates

This design is why databases and log-based systems are typically deployed using StatefulSets. Many teams standardize on StatefulSets for persistent databases, message queues, and other systems where identity and storage must remain consistent across rescheduling events.
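
The stable network identities listed above depend on a headless Service that the StatefulSet references through serviceName. A minimal sketch, assuming Pods labelled app: mysql and the standard MySQL port:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: mysql                       # must equal the StatefulSet's serviceName
  labels:
    app: mysql
spec:
  clusterIP: None                   # headless: DNS resolves to individual Pods
  selector:
    app: mysql                      # assumed Pod label
  ports:
    - name: mysql
      port: 3306
```

With this in place, each Pod gets a DNS name of the form mysql-0.mysql.<namespace>.svc.<cluster-domain>, where the cluster domain is commonly cluster.local.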

How volumeClaimTemplates Work

With volumeClaimTemplates, Kubernetes generates one PVC per replica. For example, a template named data results in PVCs like:

  • data-mysql-0

  • data-mysql-1

  • data-mysql-2

Each PVC binds to a PV that matches the request. If the Pod is rescheduled, it reattaches to the same volume, preserving data integrity.

MySQL StatefulSet Example (3 Replicas with 10Gi Each)

The following example shows the essential pattern: RWO storage per Pod, dynamically provisioned via a Storage Class.

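apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: "mysql"              # headless Service that governs Pod DNS names
  replicas: 3
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql                  # must match spec.selector.matchLabels
    spec:
      containers:
      - name: mysql
        image: mysql:8.0
        env:
        - name: MYSQL_ROOT_PASSWORD # the image will not start without a root password setting
          valueFrom:
            secretKeyRef:
              name: mysql-secret    # assumed Secret, created separately
              key: root-password    # assumed key name
        ports:
        - containerPort: 3306
        volumeMounts:
        - name: data                # refers to the claim template below
          mountPath: /var/lib/mysql
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: "standard"
      resources:
        requests:
          storage: 10Gi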

This pattern ensures each MySQL instance retains its own data and that the same PV is reattached after restarts.

Operational Realities: Deletion, Scaling, and Retention

A common point of confusion is that deleting a StatefulSet or scaling it down does not automatically delete the associated PVCs and PVs. This behavior is intentional: Kubernetes defaults to protecting data from accidental workload deletion.

Kubernetes v1.32: PVC Retention Policy for StatefulSets

As of Kubernetes v1.32 (stable), StatefulSets support .spec.persistentVolumeClaimRetentionPolicy to control PVC lifecycle during deletion and scaling events. Example behaviors include:

  • whenDeleted: Delete: removes PVCs when the StatefulSet is deleted

  • whenScaled: Retain: keeps PVCs when scaling down, supporting later scale-up reuse

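Before reaching stable, this capability sat behind the StatefulSetAutoDeletePVC feature gate (alpha in v1.23, beta and enabled by default from v1.27). Both fields default to Retain, so existing StatefulSets keep their PVCs unless the policy is set explicitly. The policy improves storage hygiene in environments where automated teardown is safe and desired, while still allowing retention where recovery is a priority.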
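
The two behaviors above map onto a StatefulSet spec as in the fragment below. It is a sketch rather than a complete manifest: the remaining required fields are left out for brevity, and the name mysql is only an example.

```yaml
# Fragment of a StatefulSet; not valid to apply on its own.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql                       # example name
spec:
  persistentVolumeClaimRetentionPolicy:
    whenDeleted: Delete             # remove PVCs when the StatefulSet is deleted
    whenScaled: Retain              # keep PVCs of Pods removed by a scale-down
  # serviceName, replicas, selector, template, and volumeClaimTemplates
  # are omitted here.
```

Whether the underlying volume is also removed after a PVC is deleted still depends on the reclaim policy of the bound PV.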

Best Practices for Kubernetes Storage in Production

  • Use Storage Classes with clear policies: define encryption, replication, performance tier, and reclaim policy explicitly.

  • Prefer dynamic provisioning: it reduces manual PV management and improves scalability.

  • Match access modes to the workload: RWO is typical for single-writer databases; validate multi-writer requirements carefully before using ReadWriteMany.

  • Plan PVC lifecycle deliberately: choose Retain or Delete with intention, and consider the StatefulSet PVC retention policy where appropriate.

  • Set security and permissions correctly: many database images require correct filesystem ownership. Use settings like fsGroup when needed; for example, some PostgreSQL images run under a specific UID such as 999.

  • Allow graceful shutdown: stateful services often need longer termination grace periods to flush data safely before a Pod is removed.
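
The last two practices can be sketched as a Pod template fragment for a PostgreSQL StatefulSet. The values are assumptions for illustration: 999 matches the UID mentioned above but depends on the image, 120 seconds is an arbitrary grace period, and the volume name data presumes a matching volumeClaimTemplates entry.

```yaml
# Fragment of spec.template in a StatefulSet; not valid to apply on its own.
spec:
  terminationGracePeriodSeconds: 120    # assumed value; size it to the flush time
  securityContext:
    fsGroup: 999                        # group ownership applied to mounted volumes
  containers:
    - name: postgres
      image: postgres:16
      env:
        - name: PGDATA
          # A subdirectory avoids clashing with files such as lost+found
          # at the root of a freshly formatted volume.
          value: /var/lib/postgresql/data/pgdata
      volumeMounts:
        - name: data                    # assumed volumeClaimTemplates name
          mountPath: /var/lib/postgresql/data
  # Password and other required environment settings are omitted here.
```

Not every volume type honors fsGroup, so ownership should be verified inside a running Pod before relying on it.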

Use Cases Where StatefulSets and Kubernetes Storage Excel

  • MySQL clusters: dedicated RWO PVCs per replica, resilient across restarts and rescheduling.

  • PostgreSQL: encrypted Storage Classes, correct fsGroup configuration, and controlled shutdown timing.

  • Kafka and messaging systems: stable Pod identities support partition assignments and log persistence.

  • Development and testing: hostPath or pre-provisioned PVs can serve local clusters but are not suitable for production use.

  • High-availability databases on distributed storage: replicated encrypted volumes through a CSI provider reduce risk from node or disk failure.

What to Expect Next for Kubernetes Storage

The Kubernetes storage roadmap points toward greater automation and observability. Capabilities like CSI volume health monitoring aim to report abnormal volume conditions back to the cluster and to support improved self-healing workflows. Multi-tenant platforms are also increasingly using multiple Storage Classes as quality-of-service tiers for AI/ML and data platforms. Edge Kubernetes distributions continue refining patterns around local PVs and lightweight stateful operations.

Conclusion

Kubernetes storage becomes straightforward to reason about once responsibilities are clearly separated: PVs are the underlying resources, PVCs are the requests, Storage Classes encode provisioning and policy, and StatefulSets provide the identity and lifecycle guarantees needed for persistent systems. For most teams, the most reliable approach combines CSI-based dynamic provisioning, encrypted Storage Classes, and StatefulSets with volumeClaimTemplates for databases and message queues.

To build practical skills in this area, consider structured learning paths covering Kubernetes administration, cloud security, and DevOps automation. Certifications focused on Kubernetes, DevOps, and cybersecurity can help teams manage stateful workloads with greater confidence and operational maturity.
