Working with volumes

Files in a container are ephemeral: when the container is terminated, the files are gone. Docker introduced data volumes and data volume containers to help us manage data by mounting a directory from the host disk or from other containers. However, when it comes to a container cluster, it is hard to manage volumes across hosts, and their lifetimes, by using Docker alone.

Kubernetes introduces volumes, which live with a pod across container restarts. It supports the following volume types:

  • emptyDir
  • hostPath
  • nfs
  • iscsi
  • flocker
  • glusterfs
  • rbd
  • gitRepo
  • awsElasticBlockStore
  • gcePersistentDisk
  • secret
  • downwardAPI

In this section, we'll walk through the details of emptyDir, hostPath, nfs, and glusterfs. secret, which is used to store credentials, is introduced in the next section. Most volume types share a similar Kubernetes syntax; only the backend differs.

Getting ready

A storage provider is required when you start to use volumes in Kubernetes, except for emptyDir, which lives on the local host and is erased when the pod is removed. For the other volume types, the folders, servers, or clusters have to be built before they can be used in a pod definition.

Different volume types have different storage providers:

Volume Type             Storage Provider
emptyDir                Local host
hostPath                Local host
nfs                     NFS server
iscsi                   iSCSI target provider
flocker                 Flocker cluster
glusterfs               GlusterFS cluster
rbd                     Ceph cluster
gitRepo                 Git repository
awsElasticBlockStore    AWS EBS
gcePersistentDisk       GCE persistent disk
secret                  Kubernetes configuration file
downwardAPI             Kubernetes pod information

How to do it…

Volumes are defined in the volumes section of the pod definition, each with a unique name. Each volume type has its own configuration to set. Once you define the volumes, you can mount them in the volumeMounts section of the container spec. volumeMounts.name and volumeMounts.mountPath are required; they indicate the name of the volume you defined and the mount path inside the container.

We'll use Kubernetes configuration files in the YAML format to create pods with volumes in the following examples.

emptyDir

emptyDir is the simplest volume type. It creates an empty volume for the containers in the same pod to share, and it is created when the pod is created. When the pod is removed, the files in the emptyDir are erased as well. In the following configuration file, we'll create a pod running Ubuntu with a command to sleep for 3600 seconds. As you can see, one volume is defined in the volumes section with the name data, and it will be mounted under the /data-mount path in the Ubuntu container:

// configuration file of emptyDir volume
# cat emptyDir.yaml 
apiVersion: v1
kind: Pod
metadata:
  name: ubuntu
  labels:
    name: ubuntu
spec:
  containers:
    -
      image: ubuntu
      command:
        - sleep
        - "3600"
      imagePullPolicy: IfNotPresent
      name: ubuntu
      volumeMounts:
        -
          mountPath: /data-mount
          name: data
  volumes:
    -
      name: data
      emptyDir: {}

// create pod by configuration file emptyDir.yaml
# kubectl create -f emptyDir.yaml

Tip

Check which node the pod is running on

By using the kubectl describe pod <Pod name> | grep Node command, you can check which node the pod is running on.

After the pod is running, you can use docker inspect <container ID> on the target node to see the detailed mount points inside your container:

    "Mounts": [
        {
            "Source": "/var/lib/kubelet/pods/<id>/volumes/kubernetes.io~empty-dir/data",
            "Destination": "/data-mount",
            "Mode": "",
            "RW": true
        },
      ...
     ]

Here, you can see that Kubernetes simply creates an empty folder with the path /var/lib/kubelet/pods/<id>/volumes/kubernetes.io~empty-dir/<volume name> for the pod to use. If you create a pod with more than one container, all of them will mount the same source at the same destination, /data-mount, as in the sketch that follows.
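
As a minimal sketch (the second container's name and command are hypothetical), a pod with two containers sharing the same emptyDir volume could look like this:

// sketch: two containers sharing one emptyDir volume (container names are hypothetical)
apiVersion: v1
kind: Pod
metadata:
  name: ubuntu-shared
spec:
  containers:
    -
      image: ubuntu
      command:
        - sleep
        - "3600"
      name: writer
      volumeMounts:
        -
          mountPath: /data-mount
          name: data
    -
      image: ubuntu
      command:
        - sleep
        - "3600"
      name: reader
      volumeMounts:
        -
          mountPath: /data-mount
          name: data
  volumes:
    -
      name: data
      emptyDir: {}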

emptyDir can be mounted as tmpfs if we set emptyDir.medium to Memory in the previous configuration file, emptyDir.yaml:

  volumes:
    -
      name: data
      emptyDir:
        medium: Memory

We can also check the Volumes information with kubectl describe pods ubuntu to see whether it was set successfully:

# kubectl describe pods ubuntu
Name:        ubuntu
Namespace:      default
Image(s):      ubuntu
Node:        ip-10-96-219-192/
Status:        Running
...
Volumes:
  data:
    Type:  EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:  Memory

hostPath

hostPath acts like a data volume in Docker: the local folder on a node listed in hostPath is mounted into the pod. Since the pod can run on any node, reads and writes to the volume happen only on the node on which the pod is currently running, while in Kubernetes a pod is not supposed to be node-aware. Please note that the configuration and files might be different on different nodes when using hostPath. Therefore, the same pod, created by the same command or configuration file, might act differently on different nodes.

By using hostPath, you're able to read and write files between containers and the local host disk of a node. The only thing we need in the volume definition is hostPath.path, which specifies the target folder to mount on the node:

// configuration file of hostPath volume
# cat hostPath.yaml 
apiVersion: v1
kind: Pod
metadata:
  name: ubuntu
spec:
  containers:
    -
      image: ubuntu
      command:
        - sleep
        - "3600"
      imagePullPolicy: IfNotPresent
      name: ubuntu
      volumeMounts:
        -
          mountPath: /data-mount
          name: data
  volumes:
    -
      name: data
      hostPath:
        path: /target/path/on/host

Using docker inspect to check the volume details, you will see that the folder on the host is mounted at the /data-mount destination:

    "Mounts": [
        {
            "Source": "/target/path/on/host",
            "Destination": "/data-mount",
            "Mode": "",
            "RW": true
        },
      ...
    ]

Tip

Touching a file to validate that the volume is mounted successfully

Using kubectl exec <pod name> <command>, you can run a command inside a pod. In this case, if we run kubectl exec ubuntu touch /data-mount/sample, we should be able to see an empty file named sample under /target/path/on/host on the node.
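
As a quick sketch of that check (the listed output is illustrative), touch the file from the pod and then list the folder on the node:

// inside the pod: create a file on the mounted volume
# kubectl exec ubuntu touch /data-mount/sample

// on the node where the pod runs: the file appears on the host path
# ls /target/path/on/host
sample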

nfs

You can mount a Network File System (NFS) share into your pod as an nfs volume. Multiple pods can mount and share the files in the same nfs volume, and the data stored in it persists across the pod's lifetime. You have to create your own NFS server before using an nfs volume, and make sure that the nfs-utils package is installed on the Kubernetes nodes.

Note

Checking that the nfs server works before you go

You should check that the /etc/exports file has the proper sharing parameters and directory, and then use the mount -t nfs <nfs server>:<share name> <local mounted point> command to check whether it can be mounted locally.
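
A minimal sketch of those checks, assuming the NFS server exports its root path (the export options shown are illustrative):

// on the NFS server: confirm the export and its options
# cat /etc/exports
/ *(rw,sync,no_root_squash)

// on a Kubernetes node: try mounting the share manually
# mount -t nfs <your nfs server>:/ /mnt
# touch /mnt/test && ls /mnt
test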

The configuration file of an nfs volume is similar to the others, but nfs.server and nfs.path are required in the volume definition to specify the NFS server information and the path to mount from. nfs.readOnly is an optional field that specifies whether the volume is read-only or not (the default is false):

// configuration file of nfs volume
# cat nfs.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nfs
spec:
  containers:
    -
      name: nfs
      image: ubuntu
      command:
        - sleep
        - "3600"
      volumeMounts:
        - name: nfs
          mountPath: "/data-mount"
  volumes:
  - name: nfs
    nfs:
      server: <your nfs server>
      path: "/"

After you run kubectl create -f nfs.yaml, you can describe your pod by using kubectl describe pod <pod name> to check the mounting status. If it's mounted successfully, you should see the Ready condition set to True, along with the target nfs volume you have mounted:

Conditions:
  Type    Status
  Ready   True
Volumes:
  nfs:
    Type:  NFS (an NFS mount that lasts the lifetime of a pod)
    Server:  <your nfs server>
    Path:  /
    ReadOnly:  false

If we inspect the container by using the docker inspect command, we will see the volume information in the Mounts section:

    "Mounts": [
 {
            "Source": "/var/lib/kubelet/pods/<id>/volumes/kubernetes.io~nfs/nfs",
            "Destination": "/data-mount",
            "Mode": "",
            "RW": true
        },
      ...
     ]

Actually, Kubernetes just mounts your <nfs server>:<share name> into /var/lib/kubelet/pods/<id>/volumes/kubernetes.io~nfs/nfs and then mounts that folder into the container at its destination, /data-mount. You can also use kubectl exec to touch a file, as the previous tip mentions, to test whether it's properly mounted.

glusterfs

GlusterFS (https://www.gluster.org) is a scalable network-attached storage file system. The glusterfs volume type allows you to mount a GlusterFS volume into your pod. Just like an nfs volume, the data in a GlusterFS volume persists across the pod's lifetime: if the pod is terminated, the data is still accessible in the GlusterFS volume. You should build a GlusterFS system before using a glusterfs volume.

Note

Checking GlusterFS works before you go

By using gluster volume info on GlusterFS servers, you can see currently available volumes. By using mount -t glusterfs <glusterfs server>:/<volume name> <local mounted point> locally, you can check whether the GlusterFS system can be successfully mounted.
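
A minimal sketch of those checks (the mount point /mnt is arbitrary):

// on a GlusterFS server: list the currently available volumes
# gluster volume info

// on a Kubernetes node: try mounting the volume manually
# mount -t glusterfs <glusterfs server>:/<volume name> /mnt
# ls /mnt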

Since a GlusterFS volume's replica count must be greater than 1, let's assume we have two replicas on the servers gfs1 and gfs2, and that the volume name is gvol.

First, we need to create an endpoint acting as a bridge for gfs1 and gfs2:

# cat gfs-endpoint.yaml
kind: Endpoints
apiVersion: v1
metadata:
  name: glusterfs-cluster
subsets:
  -
    addresses:
      -
        ip: <gfs1 server ip>
    ports:
      -
        port: 1
  -
    addresses:
      -
        ip: <gfs2 server ip>
    ports:
      -
        port: 1

// create endpoints
# kubectl create -f gfs-endpoint.yaml

Then we can use kubectl get endpoints to check that the endpoints were created properly:

# kubectl get endpoints
NAME                ENDPOINTS                         AGE
glusterfs-cluster   <gfs1>:1,<gfs2>:1                 12m

After that, we should be able to create the pod with the GlusterFS volume using glusterfs.yaml. The parameters of the glusterfs volume definition are glusterfs.endpoints, which specifies the endpoints name we just created, and glusterfs.path, which is the volume name gvol. glusterfs.readOnly is optional and is used to set whether the volume is mounted in read-only mode:

# cat glusterfs.yaml
apiVersion: v1
kind: Pod
metadata:
  name: ubuntu
spec:
  containers:
    -
      image: ubuntu
      command:
        - sleep
        - "3600"
      imagePullPolicy: IfNotPresent
      name: ubuntu
      volumeMounts:
        -
          mountPath: /data-mount
          name: data
  volumes:
    -
      name: data
      glusterfs:
        endpoints: glusterfs-cluster
        path: gvol

Let's check the volume setting by kubectl describe:

Volumes:
  data:
    Type:    Glusterfs (a Glusterfs mount on the host that shares a pod's lifetime)
    EndpointsName:  glusterfs-cluster
    Path:    gvol
    ReadOnly:    false

Using docker inspect, you should be able to see that the mounted source is /var/lib/kubelet/pods/<id>/volumes/kubernetes.io~glusterfs/data and the destination is /data-mount.

iscsi

The iscsi volume is used to mount an existing iSCSI disk to your pod. Unlike an nfs volume, an iscsi volume can only be mounted in read-write mode by a single container. The data persists across the pod's lifecycle. The volume definition takes the following fields; a sketch follows the table:

Field Name      Field Definition
targetPortal    IP address of the iSCSI target portal
iqn             IQN of the target portal
lun             Target LUN for mounting
fsType          File system type on the LUN
readOnly        Specify read-only or not; the default is false
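
A minimal sketch of an iscsi volume definition; the target portal, IQN, and LUN below are placeholders, and fsType is assumed to be ext4:

// sketch of an iscsi volume definition (all values are placeholders)
  volumes:
    -
      name: data
      iscsi:
        targetPortal: <iscsi target portal ip>:3260
        iqn: iqn.2016-01.com.example:storage.disk1
        lun: 0
        fsType: ext4
        readOnly: false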

flocker

Flocker is an open-source container data volume manager. The flocker volume will be moved to the target node when the container moves. Before using Flocker with Kubernetes, the Flocker cluster (Flocker control service, Flocker dataset agent, Flocker container agent) is required. Flocker's official website (https://docs.clusterhq.com/en/1.8.0/install/index.html) has detailed installation instructions.

After you get your Flocker cluster ready, create a dataset and specify the dataset name in the flocker volume definition in the Kubernetes configuration file; the field is listed below, and a sketch follows the table:

Field Name     Field Definition
datasetName    Target dataset name in Flocker
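
A minimal sketch of a flocker volume definition, assuming a Flocker dataset named my-dataset already exists (the dataset name is a placeholder):

// sketch of a flocker volume definition (dataset name is a placeholder)
  volumes:
    -
      name: data
      flocker:
        datasetName: my-dataset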

rbd

A Ceph RADOS Block Device (http://docs.ceph.com/docs/master/rbd/rbd/) can be mounted into your pod by using an rbd volume. You need to install Ceph before using the rbd volume. The rbd volume definition supports a secret in order to keep the authentication secrets. The fields are listed in the following table; a sketch follows it:

Field Name    Field Definition                                                     Default Value
monitors      Ceph monitors
pool          The name of the RADOS pool                                           rbd
image         The image name that rbd created
user          RADOS user name                                                      admin
keyring       The path of the keyring; overwritten if a secret name is provided    /etc/ceph/keyring
secretName    Secret name
fsType        File system type
readOnly      Specify read-only or not                                             false
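
A minimal sketch of an rbd volume definition that authenticates with the default keyring path; the monitor address and image name are placeholders:

// sketch of an rbd volume definition (monitor and image are placeholders)
  volumes:
    -
      name: data
      rbd:
        monitors:
          - <ceph monitor ip>:6789
        pool: rbd
        image: <rbd image name>
        user: admin
        keyring: /etc/ceph/keyring
        fsType: ext4
        readOnly: false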

gitRepo

The gitRepo volume mounts as an empty directory and clones a Git repository at a certain revision into the pod for you to use. The fields are listed in the following table; a sketch follows it:

Field Name    Field Definition
repository    Your Git repository with SSH or HTTPS
revision      The revision of the repository
readOnly      Specify read-only or not
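
A minimal sketch of a gitRepo volume definition; the repository URL and revision are placeholders:

// sketch of a gitRepo volume definition (repository and revision are placeholders)
  volumes:
    -
      name: data
      gitRepo:
        repository: "https://github.com/<user>/<repo>.git"
        revision: "<commit hash>"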

awsElasticBlockStore

An awsElasticBlockStore volume mounts an AWS EBS volume into a pod. In order to use it, your pod has to be running on an AWS EC2 instance in the same availability zone as the EBS volume. Also, an EBS volume can only be attached to a single EC2 instance, which means you cannot attach one EBS volume to multiple EC2 instances. The fields are listed in the following table; a sketch follows it:

Field Name    Field Definition
volumeID      EBS volume info - aws://<availability-zone>/<volume-id>
fsType        File system type
readOnly      Specify read-only or not
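
A minimal sketch of an awsElasticBlockStore volume definition; the zone and volume ID are placeholders, and fsType is assumed to be ext4:

// sketch of an awsElasticBlockStore volume definition (volumeID is a placeholder)
  volumes:
    -
      name: data
      awsElasticBlockStore:
        volumeID: aws://<availability-zone>/<volume-id>
        fsType: ext4
        readOnly: false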

gcePersistentDisk

Similar to awsElasticBlockStore, a pod using a gcePersistentDisk volume must be running on GCE in the same project and zone as the disk. gcePersistentDisk supports only a single writer when readOnly is false. The fields are listed in the following table; a sketch follows it:

Field Name    Field Definition
pdName        GCE persistent disk name
fsType        File system type
readOnly      Specify read-only or not
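
A minimal sketch of a gcePersistentDisk volume definition; the disk name is a placeholder, and fsType is assumed to be ext4:

// sketch of a gcePersistentDisk volume definition (pdName is a placeholder)
  volumes:
    -
      name: data
      gcePersistentDisk:
        pdName: <gce disk name>
        fsType: ext4
        readOnly: false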

downwardAPI

The downwardAPI volume is a Kubernetes volume plugin that exposes pod information to a container as plain text files. The metadata currently supported by the downwardAPI volume is:

  • metadata.annotations
  • metadata.namespace
  • metadata.name
  • metadata.labels

The definition of the downwardAPI volume is a list of items. Each item contains a path and a fieldRef. Kubernetes dumps the metadata specified in the fieldRef into a file named path under the mountPath, and mounts the <volume name> into the destination you specified (a sketch of such a definition follows the inspect output below):

        {
            "Source": "/var/lib/kubelet/pods/<id>/volumes/kubernetes.io~downward-api/<volume name>",
            "Destination": "/tmp",
            "Mode": "",
            "RW": true
        }
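
A minimal sketch of a pod with a downwardAPI volume that dumps the pod's labels into a file named labels under /tmp (the pod name and file name are hypothetical):

// sketch of a downwardAPI volume definition (pod name and file name are hypothetical)
apiVersion: v1
kind: Pod
metadata:
  name: downward-sample
  labels:
    name: downward-sample
spec:
  containers:
    -
      image: ubuntu
      command:
        - sleep
        - "3600"
      name: ubuntu
      volumeMounts:
        -
          mountPath: /tmp
          name: podinfo
  volumes:
    -
      name: podinfo
      downwardAPI:
        items:
          -
            path: "labels"
            fieldRef:
              fieldPath: metadata.labels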

For the IP of the pod, using an environment variable in the pod spec to propagate it would be much easier:

spec:
  containers:
    - name: envsample-pod-info
      env:
        - name: MY_POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP

For more examples, look at the sample folder on Kubernetes GitHub (https://github.com/kubernetes/kubernetes/tree/master/docs/user-guide/downward-api), which covers both environment variables and the downwardAPI volume.

There's more…

In the previous cases, the user needed to know the details of the storage provider. Kubernetes provides PersistentVolume (PV) to abstract the details of the storage provider from the storage consumer. Kubernetes currently supports the following PV types:

  • GCEPersistentDisk
  • AWSElasticBlockStore
  • NFS
  • iSCSI
  • RBD (Ceph Block Device)
  • GlusterFS
  • HostPath (not workable in multi-node cluster)

PersistentVolume

The persistent volume workflow is shown in the following figure. First, the administrator provisions a PersistentVolume with its specification. Second, the consumer requests storage with a PersistentVolumeClaim. Finally, the pod mounts the volume through a reference to the PersistentVolumeClaim:

(Figure: PersistentVolume)

The administrator needs to provision and allocate the persistent volume first.

Here is an example using NFS:

// example of PV with NFS
# cat pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pvnfs01
spec:
  capacity:
    storage: 3Gi
  accessModes:
    - ReadWriteOnce
  nfs:
    path: /
    server: <your nfs server>
  persistentVolumeReclaimPolicy: Recycle

// create the pv
# kubectl create -f pv.yaml
persistentvolume "pvnfs01" created

We can see that there are three parameters here: capacity, accessModes, and persistentVolumeReclaimPolicy. capacity is the size of this PV. accessModes is based on the capability of the storage provider and can be set to a specific mode during provisioning. For example, NFS supports multiple readers and writers simultaneously, so we could specify the accessModes as ReadWriteOnce, ReadOnlyMany, or ReadWriteMany. A volume's accessModes can only be set to one mode at a time. persistentVolumeReclaimPolicy defines the behavior when the PV is released. Currently, the supported policies for nfs and hostPath are Retain and Recycle. In Retain mode you have to clean up the volume yourself; in Recycle mode, Kubernetes scrubs the volume.

A PV is a resource, just like a node. We can use kubectl get pv to see the currently provisioned PVs:

// list current PVs
# kubectl get pv
NAME      LABELS    CAPACITY   ACCESSMODES   STATUS    CLAIM               REASON    AGE
pvnfs01   <none>    3Gi        RWO           Bound     default/pvclaim01             37m

Next, we will need to bind PersistentVolume with PersistentVolumeClaim in order to mount it as a volume into the pod:

// example of PersistentVolumeClaim
# cat claim.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvclaim01
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi

// create the claim
# kubectl create -f claim.yaml
persistentvolumeclaim "pvclaim01" created

// list the PersistentVolumeClaim (pvc)
# kubectl get pvc
NAME        LABELS    STATUS    VOLUME    CAPACITY   ACCESSMODES   AGE
pvclaim01   <none>    Bound     pvnfs01   3Gi        RWO           59m

Constraints on accessModes and storage can be set in the PersistentVolumeClaim. If the claim is bound successfully, its status turns to Bound; conversely, if the status is not Bound, it means that no PV currently matches the request.

Then we are able to mount the PV as a volume by using PersistentVolumeClaim:

// example of mounting into Pod
# cat nginx.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    project: pilot
    environment: staging
    tier: frontend
spec:
  containers:
    -
      image: nginx
      imagePullPolicy: IfNotPresent
      name: nginx
      volumeMounts:
      - name: pv
        mountPath: "/usr/share/nginx/html"
      ports:
      - containerPort: 80
  volumes:
    - name: pv
      persistentVolumeClaim:
        claimName: "pvclaim01"

// create the pod
# kubectl create -f nginx.yaml
pod "nginx" created

The syntax is similar to the other volume types; just add the claimName of the persistentVolumeClaim in the volume definition. We are all set! Let's check the details to see whether it is mounted successfully:

// check the details of a pod
# kubectl describe pod nginx
...
Volumes:
  pv:
    Type:  PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  pvclaim01
    ReadOnly:  false
...

We can see that the pod nginx has a volume named pv, which references the PersistentVolumeClaim pvclaim01. Use docker inspect to see how it is mounted:

     "Mounts": [
        {
            "Source": "/var/lib/kubelet/pods/<id>/volumes/kubernetes.io~nfs/pvnfs01",
            "Destination": "/usr/share/nginx/html",
            "Mode": "",
            "RW": true
        },
      ...
    ]

Kubernetes mounts /var/lib/kubelet/pods/<id>/volumes/kubernetes.io~nfs/<persistentvolume name> into the destination in the pod.

See also

Volumes are defined inside the pod specifications of pods or replication controllers. Check out the following recipes to jog your memory:

  • Working with pods
  • Working with a replication controller