Kubernetes makes it easy to create and replace Pods. That is one of its strengths. A Pod can disappear because a node dies, because a Deployment rolls out a new version, because a node is drained for maintenance, or because the kubelet evicts it under resource pressure. A failing liveness probe can also restart a container inside a Pod that otherwise stays put. For stateless workloads this is usually fine. A web frontend can be replaced. A small API replica can come back. A queue worker can restart and continue.

Data is less forgiving.

If a database writes to the container filesystem and the Pod is replaced, the data is gone with the container. If an upload service stores files inside /tmp and the node disappears, the files disappear too. Beginners often discover this the hard way because local Docker habits do not map cleanly to Kubernetes. A container is not a small virtual machine with a durable disk. It is a process running on top of a repeatable image filesystem, with a thin writable layer that lives only as long as that container does. Anything written into the writable container layer should be treated as temporary unless storage is attached deliberately.

That is why Kubernetes has a storage model. It is not there to make YAML longer. It exists because the lifecycle of compute and the lifecycle of data are different.

The mental model: Pods borrow storage

I think about Kubernetes storage in three layers.

First, there is the application. It knows that it needs a path such as /var/lib/postgresql/data, /data, or /uploads.

Second, there is a claim. The application does not usually ask for a specific disk by serial number. It asks for a kind of storage: “I need 20Gi, mounted read-write by one node, from the default storage class.” In Kubernetes this request is a PersistentVolumeClaim, usually shortened to PVC.

Third, there is the actual storage. That may be an AWS EBS volume, an Azure Disk, a Google Persistent Disk, a Ceph volume, a local disk, an NFS export, or something provided by a Container Storage Interface driver. In Kubernetes this backing object is represented as a PersistentVolume, usually shortened to PV.

The Pod mounts the PVC. The PVC binds to a PV. The PV points to real storage.

That indirection matters. Application teams should not need to know every cloud disk detail to ask for durable storage. Platform teams should not need to edit every Deployment when the storage backend changes. Kubernetes puts a contract between the two.

emptyDir is useful, but it is not persistent

Before jumping to PersistentVolumes, it helps to name a common trap. Kubernetes has a volume type called emptyDir. It is created when a Pod is assigned to a node and removed when the Pod is removed from that node.

That can be exactly right for scratch space:

apiVersion: v1
kind: Pod
metadata:
  name: worker-with-cache
spec:
  containers:
    - name: worker
      image: busybox:1.36
      command: ["sh", "-c", "while true; do date >> /cache/ticks; sleep 10; done"]
      volumeMounts:
        - name: cache
          mountPath: /cache
  volumes:
    - name: cache
      emptyDir: {}

This survives container restarts inside the same Pod. It does not survive Pod replacement. If the Pod is deleted and recreated, the emptyDir starts empty again. That makes it good for temporary caches, generated files, and shared scratch space between containers in one Pod. It makes it a bad home for a database.
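
The shared scratch space case can be sketched like this. It is a minimal illustration rather than a recommended layout: the Pod name, the container names, and the 100Mi sizeLimit are assumptions picked for the example.

```yaml
# Sketch: two containers in one Pod sharing an emptyDir as scratch space.
# All names and the size limit are illustrative assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: scratch-sharing-demo
spec:
  containers:
    - name: writer
      image: busybox:1.36
      # Overwrites a small file every few seconds.
      command: ["sh", "-c", "while true; do date > /scratch/latest; sleep 5; done"]
      volumeMounts:
        - name: scratch
          mountPath: /scratch
    - name: reader
      image: busybox:1.36
      # Reads whatever the writer last produced; ignores the file not existing yet.
      command: ["sh", "-c", "while true; do cat /scratch/latest 2>/dev/null; sleep 5; done"]
      volumeMounts:
        - name: scratch
          mountPath: /scratch
  volumes:
    - name: scratch
      emptyDir:
        # Optional cap on how much the volume may hold.
        sizeLimit: 100Mi
```

Both containers see the same directory, and both lose it when the Pod goes away.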

The beginner rule is simple: if losing it would surprise a user or break recovery, do not rely on emptyDir.

PersistentVolumeClaims: the app’s storage request

A PVC describes what the workload needs. Here is a small claim:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: standard

The important fields are:

resources.requests.storage is the size. This is not memory. It is disk capacity.

accessModes describe how the volume may be mounted. ReadWriteOnce means it can be mounted read-write by one node at a time. ReadOnlyMany means many nodes may mount it read-only. ReadWriteMany means many nodes may mount it read-write, but only storage backends that support it can provide that.

storageClassName selects the kind of storage. In a cloud cluster, a class may map to SSD, HDD, encrypted disks, a particular availability zone behavior, or a CSI driver.

After applying the claim, check it:

kubectl apply -f pvc.yaml
kubectl get pvc app-data
kubectl describe pvc app-data

A healthy dynamically provisioned claim usually moves to Bound. If it stays Pending, read the events at the bottom of kubectl describe pvc. They are often more useful than the status column.

Using the claim from a Pod

A PVC does nothing until a workload mounts it. This Pod writes a timestamp into the mounted volume:

apiVersion: v1
kind: Pod
metadata:
  name: storage-demo
spec:
  containers:
    - name: app
      image: busybox:1.36
      command: ["sh", "-c", "while true; do date >> /data/ticks.txt; sleep 10; done"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: app-data

Now the application path /data is backed by the PVC. If the container restarts, the file remains. If the Pod is recreated and the volume can be attached again, the data remains.

Useful checks:

kubectl get pod storage-demo
kubectl exec storage-demo -- sh -c "ls -l /data && tail /data/ticks.txt"
kubectl describe pod storage-demo

If the Pod is stuck in ContainerCreating, look for mount or attach errors in kubectl describe pod. If the PVC is still Pending, the Pod cannot mount it because there is no bound volume yet.

PersistentVolumes: the cluster’s storage object

In many modern clusters you do not manually create PVs for every application. Dynamic provisioning does it for you. Still, understanding PVs removes a lot of confusion.

A static PV can look like this:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: manual-data-volume
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: manual
  hostPath:
    path: /mnt/data

This example uses hostPath, which is fine for a learning cluster and dangerous as a general production pattern. It ties data to one node path and does not behave like real network or cloud storage. For production, the PV is usually backed by a CSI driver or managed storage service.

The field worth noticing is persistentVolumeReclaimPolicy.

Delete means the backing storage should be deleted when the claim is deleted. That is convenient for disposable environments and risky if the claim represented important data.

Retain means the PV and backing storage remain after the claim is deleted. That protects data, but it also means someone must clean up or rebind intentionally.

Check PVs with:

kubectl get pv
kubectl describe pv manual-data-volume

When a PVC binds, you can see which PV it got:

kubectl get pvc app-data -o wide
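
For completeness, here is a sketch of a claim that would bind to the static PV above. The claim name manual-data-claim is made up for this example. The storageClassName and access mode have to match the PV, the requested size must fit within its capacity, and volumeName pins the claim to that one volume.

```yaml
# Sketch: a claim intended to bind to the static PV named manual-data-volume.
# The claim name is a hypothetical choice for this example.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: manual-data-claim
spec:
  # Must match the PV's storageClassName, otherwise the two will not bind.
  storageClassName: manual
  # Optional: ask for this specific PV instead of any matching one.
  volumeName: manual-data-volume
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      # Must not exceed the PV's capacity (10Gi in the example above).
      storage: 10Gi
```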

StorageClasses: how dynamic provisioning works

A StorageClass tells Kubernetes how to create storage when a matching PVC appears. It connects the generic Kubernetes request to a real provisioner.

kubectl get storageclass
kubectl describe storageclass standard

You may see one class marked as default. If a PVC omits storageClassName, Kubernetes may use that default class. I say “may” because cluster configuration matters. On a beginner cluster this feels automatic. In a locked-down production cluster it may be intentionally disabled.
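
The difference between leaving the field out and setting it to an empty string is easy to miss, so here is a small sketch. Both claim names are invented for the example, and the behaviour of the first claim assumes the cluster actually has a default class.

```yaml
# Claim A: storageClassName is omitted.
# If the cluster has a default StorageClass, it is normally filled in for you.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: uses-default-class   # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
# Claim B: storageClassName is explicitly the empty string.
# This opts out of the default class; the claim can only bind to an
# existing PV that has no class, and nothing is provisioned dynamically.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: no-class             # hypothetical name
spec:
  storageClassName: ""
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
```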

A simplified StorageClass looks like this:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true

The exact provisioner and parameters depend on the environment. The useful beginner concept is volumeBindingMode.

Immediate means Kubernetes provisions or binds the volume as soon as the PVC is created.

WaitForFirstConsumer means Kubernetes waits until a Pod needs the claim, then considers scheduling constraints such as zones. This matters in cloud environments where a disk in one zone cannot attach to a node in another zone. If storage is created too early in the wrong zone, the Pod may be unschedulable later.

When a PVC is pending and uses WaitForFirstConsumer, the event may say it is waiting for a Pod to be scheduled. That is not always an error. It is the storage system waiting for enough context.
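
A small sketch makes the waiting behaviour visible. It assumes the fast class from the example above exists in the cluster; the claim and Pod names are made up. Applied on its own, the claim is expected to sit in Pending. Once the Pod is created and scheduled, the volume can be provisioned where that Pod lands.

```yaml
# Sketch: a claim against a WaitForFirstConsumer class, plus its first consumer.
# Assumes a StorageClass named "fast" like the one shown above.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: fast-data            # hypothetical name
spec:
  storageClassName: fast
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
---
# The first Pod that references the claim gives the provisioner
# the scheduling context it was waiting for.
apiVersion: v1
kind: Pod
metadata:
  name: fast-data-consumer   # hypothetical name
spec:
  containers:
    - name: app
      image: busybox:1.36
      command: ["sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: fast-data
```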

StatefulSets and stable identity

Beginners often ask: “Can I just mount one PVC into three replicas?” Sometimes yes, often no.

If a Deployment has three replicas and all replicas write to the same filesystem, the storage backend must support ReadWriteMany, and the application must be safe with concurrent writes. Many databases are not safe just because a shared disk exists. Kubernetes will not turn a single-node database into a correct distributed database.
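
For the case where sharing really is what you want, a rough sketch follows. The class name shared-files is hypothetical and stands in for whatever ReadWriteMany-capable class your cluster offers, if any. To stay out of concurrent-write trouble, each replica in this example appends to its own file, named after the Pod.

```yaml
# Sketch: one ReadWriteMany claim shared by three Deployment replicas.
# "shared-files" is a hypothetical StorageClass that must be backed by
# storage that supports ReadWriteMany.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data
spec:
  storageClassName: shared-files
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: shared-writer
spec:
  replicas: 3
  selector:
    matchLabels:
      app: shared-writer
  template:
    metadata:
      labels:
        app: shared-writer
    spec:
      containers:
        - name: writer
          image: busybox:1.36
          env:
            # Expose the Pod name so each replica writes to a separate file.
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
          command: ["sh", "-c", "while true; do date >> /shared/$POD_NAME.log; sleep 10; done"]
          volumeMounts:
            - name: shared
              mountPath: /shared
      volumes:
        - name: shared
          persistentVolumeClaim:
            claimName: shared-data
```

If the claim asked for ReadWriteOnce instead, replicas scheduled onto different nodes would typically fail to attach the volume.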

For stateful workloads, StatefulSet is often the better controller. It gives Pods stable names and can create one PVC per replica using volumeClaimTemplates.

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: web
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: nginx
          image: nginx:1.27
          volumeMounts:
            - name: data
              mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi

This creates claims like data-web-0 and data-web-1. Each replica gets its own storage. Scaling down does not automatically delete those PVCs. That is deliberate. Kubernetes avoids deleting state just because you changed a replica count.
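
One detail the manifest above leaves implicit: serviceName refers to a headless Service, which is what gives the Pods their stable DNS names. A minimal sketch of that Service, assuming nginx is serving on its default port 80, could look like this.

```yaml
# Sketch: the headless Service that the StatefulSet's serviceName points to.
# The port assumes nginx is listening on its default port 80.
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  # clusterIP: None makes the Service headless, so DNS resolves to the
  # individual Pods rather than to a single virtual IP.
  clusterIP: None
  selector:
    app: web
  ports:
    - name: http
      port: 80
```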

Check it with:

kubectl get statefulset web
kubectl get pod -l app=web
kubectl get pvc

Common misconceptions

The first misconception is that a PVC is the disk. It is not. It is a request and a binding. The real storage sits behind the PV and the storage provider.

The second is that ReadWriteOnce means one Pod. More precisely, it means read-write by one node at a time. Multiple Pods on the same node may be able to use the volume, depending on the driver and situation, but designing around that detail is fragile for beginners. Treat ReadWriteOnce as “not shared storage.”

The third is that deleting a Pod deletes the data. Usually it does not, if the data is on a PVC. Deleting the PVC is the dangerous operation. Even then, what happens depends on the reclaim policy.

The fourth is that Kubernetes storage makes backups unnecessary. It does not. A persistent disk protects against Pod churn. It does not automatically protect against accidental deletes, application corruption, ransomware, bad migrations, or a human running the wrong command. For anything important, backups and restore tests are part of the storage design.
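
On the second misconception: if the requirement really is "exactly one Pod may use this volume," recent Kubernetes versions offer a separate access mode, ReadWriteOncePod. It only works with CSI-backed volumes, so treat the following as a sketch that assumes your cluster version and driver support it. The claim name is made up.

```yaml
# Sketch: a claim that only a single Pod in the whole cluster may use.
# Assumes a CSI driver and a Kubernetes version that support ReadWriteOncePod.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: single-pod-data      # hypothetical name
spec:
  accessModes:
    - ReadWriteOncePod
  resources:
    requests:
      storage: 10Gi
  storageClassName: standard
```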

A practical debugging order

When a workload cannot start because of storage, I try not to jump straight into cloud consoles. I start with Kubernetes objects.

kubectl get pvc
kubectl describe pvc app-data
kubectl get pv
kubectl describe pod storage-demo
kubectl get events --sort-by=.lastTimestamp

I look for these clues:

Is the PVC Pending or Bound?

Does the PVC mention no available PV, no default StorageClass, or provisioning failures?

Does the Pod show FailedMount, FailedAttachVolume, or timeout events?

Is the volume already attached to another node?

Does the requested access mode match what the storage class can provide?

Is the storage class name spelled correctly?

Is there a namespace quota blocking storage requests?
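
For the quota question, it helps to know what such a limit looks like. Storage quotas are ordinary ResourceQuota objects. The sketch below uses a hypothetical namespace team-a and made-up numbers; `kubectl get resourcequota` in the affected namespace shows whether anything like it exists.

```yaml
# Sketch: a ResourceQuota that limits storage requests in one namespace.
# Namespace, name, and all numbers are illustrative assumptions.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: storage-quota
  namespace: team-a
spec:
  hard:
    # Maximum number of PVCs in the namespace.
    persistentvolumeclaims: "5"
    # Total storage that all PVCs together may request.
    requests.storage: 50Gi
    # Total storage that PVCs using the "standard" class may request.
    standard.storageclass.storage.k8s.io/requests.storage: 20Gi
```

A claim that would push the namespace over one of these limits is rejected when it is created, which looks quite different from a claim that is accepted and then stays Pending.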

If all Kubernetes objects look correct, then I check the CSI driver and cloud provider. In many clusters, CSI driver Pods run in kube-system or a storage namespace:

kubectl get pods -A | grep -i csi
kubectl get csidrivers

Driver logs and cloud events are platform-specific, but the Kubernetes events usually point you in the right direction.

What to practice in a lab

The best way to learn this is to run a small exercise. Create a PVC. Mount it into a Pod. Write a file. Delete the Pod. Recreate the Pod with the same claim. Confirm the file is still there. Then delete the PVC in a throwaway namespace and observe what happens to the PV.

Do this in a local or disposable cluster, not with production data.
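
A starting manifest for that exercise might look like the following. The namespace storage-lab and the object names are invented for the lab, and the claim leaves out storageClassName on the assumption that your practice cluster has a default class.

```yaml
# Sketch: everything needed for the lab in one throwaway namespace.
# All names are hypothetical; assumes the cluster has a default StorageClass.
apiVersion: v1
kind: Namespace
metadata:
  name: storage-lab
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: lab-data
  namespace: storage-lab
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
# Delete and re-apply only this Pod to confirm that /data/ticks.txt
# keeps its earlier lines across Pod replacement.
apiVersion: v1
kind: Pod
metadata:
  name: lab-writer
  namespace: storage-lab
spec:
  containers:
    - name: app
      image: busybox:1.36
      command: ["sh", "-c", "while true; do date >> /data/ticks.txt; sleep 10; done"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: lab-data
```

Deleting the namespace at the end removes the Pod and the claim together. What happens to the PV afterwards depends on the reclaim policy of the class that provisioned it.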

The lesson is not “storage is easy.” The lesson is that Kubernetes storage becomes less magical when you separate the pieces: the Pod needs a mount, the PVC asks for capacity and access, the StorageClass decides how storage is created, the PV represents the result, and the underlying provider does the real disk work.

Once that model clicks, the error messages become more readable. Pending stops meaning “Kubernetes is broken” and starts meaning “the request has not found suitable storage yet.” FailedMount stops being a vague startup problem and becomes a path to check claim binding, node attachment, access modes, and driver health.

That is enough to make storage less frightening. Not trivial, but workable.