In this section, we'll look at some of the common volume types available in the leading public cloud platforms. Managing storage at scale is a difficult task that eventually involves physical resources, similar to nodes. If you choose to run your Kubernetes cluster on a public cloud platform, you can let your cloud provider deal with all these challenges and focus on your system. But it's important to understand the various options, constraints, and limitations of each volume type.
AWS provides Elastic Block Store (EBS) as persistent storage for EC2 instances. An AWS Kubernetes cluster can use EBS as persistent storage with the following limitations:

- The pods must run on AWS EC2 instances as nodes
- EBS volumes can only be mounted by EC2 instances in the same availability zone
- An EBS volume can be attached to only a single EC2 instance
Those are severe limitations. The single-availability-zone restriction, while great for performance, eliminates the ability to share storage at scale or across a geographically distributed system without custom replication and synchronization. The limit of a single EBS volume per EC2 instance means that, even within the same availability zone, pods can't share storage (even for reading) unless you make sure they run on the same node.
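One way to satisfy that same-node requirement is pod affinity. The following sketch (the `app: ebs-writer` label and pod names are hypothetical, not from the text) co-schedules a reader pod onto whatever node already runs the pod that mounts the EBS volume:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: ebs-reader
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: ebs-writer            # hypothetical label on the pod that mounts the EBS volume
        topologyKey: kubernetes.io/hostname   # require scheduling on the exact same node
  containers:
  - name: reader
    image: some-container
```

With `topologyKey: kubernetes.io/hostname`, the scheduler treats each node as its own topology domain, so the affinity rule forces true node co-location rather than mere zone co-location.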
With all the disclaimers out of the way, let's see how to mount an EBS volume:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: some-pod
spec:
  containers:
  - image: some-container
    name: some-container
    volumeMounts:
    - mountPath: /ebs
      name: some-volume
  volumes:
  - name: some-volume
    awsElasticBlockStore:
      volumeID: <volume-id>
      fsType: ext4
```
You must create the EBS volume in AWS ahead of time, and then you just mount it into the pod. There is no need for a claim or a storage class, because you mount the volume directly by ID. The awsElasticBlockStore volume type is known to Kubernetes.
AWS recently released a new service called the Elastic File System (EFS). This is really a managed NFS service. It uses the NFS 4.1 protocol, and it has many benefits over EBS, such as automatic replication across multiple availability zones and shared access from multiple instances.

That said, EFS is more expensive than EBS, even when you consider the automatic replication to multiple AZs (assuming you fully utilize your EBS volumes). From a Kubernetes point of view, AWS EFS is just an NFS volume. You provision it as such:
```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-share
spec:
  capacity:
    storage: 200Gi
  accessModes:
  - ReadWriteMany
  nfs:
    server: eu-west-1b.fs-64HJku4i.efs.eu-west-1.amazonaws.com
    path: "/"
```
Once the persistent volume exists, you can create a claim for it, attach the claim as a volume to multiple pods (the ReadWriteMany access mode), and mount it into containers.
The gcePersistentDisk volume type is very similar to awsElasticBlockStore. You must provision the disk ahead of time, and it can only be used by GCE instances in the same project and zone. However, the same volume can be used as read-only on multiple instances, which means it supports ReadWriteOnce and ReadOnlyMany. You can use a GCE persistent disk to share data as read-only between multiple pods in the same zone.
A pod that uses a persistent disk in ReadWriteOnce mode must be controlled by a replication controller, a replica set, or a deployment with a replica count of 0 or 1. Trying to scale beyond 1 will fail, because only one pod at a time can attach the disk for writing:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: some-pod
spec:
  containers:
  - image: some-container
    name: some-container
    volumeMounts:
    - mountPath: /pd
      name: some-volume
  volumes:
  - name: some-volume
    gcePersistentDisk:
      pdName: <persistent disk name>
      fsType: ext4
```
The Azure data disk is a virtual hard disk stored in Azure storage. It's similar in capabilities to AWS EBS. Here is a sample pod configuration file:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: some-pod
spec:
  containers:
  - image: some-container
    name: some-container
    volumeMounts:
    - name: some-volume
      mountPath: /azure
  volumes:
  - name: some-volume
    azureDisk:
      diskName: test.vhd
      diskURI: https://someaccount.blob.microsoft.net/vhds/test.vhd
```
In addition to the mandatory diskName and diskURI parameters, it also has a few optional parameters:

- cachingMode: The disk caching mode. This must be one of None, ReadOnly, or ReadWrite. The default is None.
- fsType: The filesystem type to mount. The default is ext4.
- readOnly: Whether the filesystem is mounted read-only. The default is false.

Azure data disks are limited to 1,023 GB. Each Azure VM can have up to 16 data disks, and you can attach an Azure data disk to only a single Azure VM.
In addition to the data disk, Azure also has a shared filesystem similar to AWS EFS. However, Azure file storage uses the SMB/CIFS protocol (it supports SMB 2.1 and SMB 3.0). It is based on the Azure storage platform and has the same availability, durability, scalability, and geo-redundancy capabilities as Azure Blob, Table, or Queue storage.
In order to use Azure file storage, you need to install the cifs-utils package on each client VM. You also need to create a secret, which is a required parameter:
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: azure-file-secret
type: Opaque
data:
  azurestorageaccountname: <base64 encoded account name>
  azurestorageaccountkey: <base64 encoded account key>
```
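The base64-encoded values for the data fields can be produced with the standard base64 tool. For example, assuming a hypothetical account name of someaccount (substitute your real account name and key):

```shell
# Encode the storage account name for the secret's data field.
# -n suppresses the trailing newline, which would otherwise be encoded too.
echo -n 'someaccount' | base64
# → c29tZWFjY291bnQ=
```

The account key is encoded the same way, and both encoded strings go into the secret's data section.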
Here is a configuration file for Azure file storage:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: some-pod
spec:
  containers:
  - image: some-container
    name: some-container
    volumeMounts:
    - name: some-volume
      mountPath: /azure
  volumes:
  - name: some-volume
    azureFile:
      secretName: azure-file-secret
      shareName: azure-share
      readOnly: false
```
Azure file storage supports sharing within the same region, as well as connecting on-premises clients. Here is a diagram that illustrates the workflow: