Leader election can be very useful for your application too, but it is notoriously difficult to implement. Luckily, Kubernetes comes to the rescue. There is a documented procedure for supporting leader election in your application through the leader-elector container from Google. The basic concept is to use a Kubernetes Endpoints object combined with its ResourceVersion and Annotations. When you couple this container as a sidecar in your application pod, you get leader-election capabilities in a very streamlined fashion.
Let's run the leader-elector container with three pods and an election called election:
> kubectl run leader-elector --image=gcr.io/google_containers/leader-elector:0.5 --replicas=3 -- --election=election --http=0.0.0.0:4040
After a while, you'll see three new pods in your cluster, named leader-elector-xxx:
> kubectl get pods | grep elect
leader-elector-57746fd798-7s886   1/1       Running   0         39s
leader-elector-57746fd798-d94zx   1/1       Running   0         39s
leader-elector-57746fd798-xcljl   1/1       Running   0         39s
OK. But who is the master? Let's query the election endpoints:
> kubectl get endpoints election -o json
{
    "apiVersion": "v1",
    "kind": "Endpoints",
    "metadata": {
        "annotations": {
            "control-plane.alpha.kubernetes.io/leader": "{\"holderIdentity\":\"leader-elector-57746fd798-xcljl\",\"leaseDurationSeconds\":10,\"acquireTime\":\"2018-01-08T04:16:40Z\",\"renewTime\":\"2018-01-08T04:18:26Z\",\"leaderTransitions\":0}"
        },
        "creationTimestamp": "2018-01-08T04:16:40Z",
        "name": "election",
        "namespace": "default",
        "resourceVersion": "1090942",
        "selfLink": "/api/v1/namespaces/default/endpoints/election",
        "uid": "ba42f436-f42a-11e7-abf8-080027c94384"
    },
    "subsets": null
}
If you look really hard, you can see it buried in metadata.annotations. To make it easy to extract, I recommend the fantastic jq program for slicing and dicing JSON (https://stedolan.github.io/jq/). It is very useful for parsing the output of the Kubernetes API or kubectl:
> kubectl get endpoints election -o json | jq -r .metadata.annotations[] | jq .holderIdentity
"leader-elector-57746fd798-xcljl"
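The double invocation works because jq -r emits the annotation's value as raw text, which the second jq then parses as JSON (and .annotations[] happens to grab the right value only because there is a single annotation). If your jq provides the fromjson builtin, a more robust one-liner that addresses the annotation key explicitly might look like this:

> kubectl get endpoints election -o json | jq -r '.metadata.annotations["control-plane.alpha.kubernetes.io/leader"] | fromjson | .holderIdentity'
leader-elector-57746fd798-xcljl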
To prove that leader election works, let's kill the leader and see if a new leader is elected:
> kubectl delete pod leader-elector-57746fd798-xcljl
pod "leader-elector-57746fd798-xcljl" deleted
And we have a new leader:
> kubectl get endpoints election -o json | jq -r .metadata.annotations[] | jq .holderIdentity
"leader-elector-57746fd798-d94zx"
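As a sanity check, the lease record's leaderTransitions counter should now report 1, since the lease has changed hands once (again assuming the fromjson builtin):

> kubectl get endpoints election -o json | jq '.metadata.annotations["control-plane.alpha.kubernetes.io/leader"] | fromjson | .leaderTransitions'
1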
You can also find the leader over HTTP, because each leader-elector container exposes the current leader through a local web server (running on port 4040), which you can reach through a proxy:
> kubectl proxy

In a separate console:

> curl http://localhost:8001/api/v1/proxy/namespaces/default/pods/leader-elector-57746fd798-d94zx:4040/ | jq .name
"leader-elector-57746fd798-d94zx"
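If you prefer not to go through the API server proxy, kubectl port-forward offers an alternative route to the same endpoint; any of the three pods will report the same leader:

> kubectl port-forward leader-elector-57746fd798-d94zx 4040:4040

In a separate console:

> curl -s http://localhost:4040 | jq .name
"leader-elector-57746fd798-d94zx"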
The local web server allows the leader-elector container to function as a sidecar to your main application container within the same pod. Because the two containers share the pod's network namespace, your application container can access http://localhost:4040 and get the name of the current leader. Only the application container that shares a pod with the elected leader runs the application; the application containers in the other pods stay dormant. If they receive requests, they can forward them to the leader, or you can apply some clever load-balancing tricks to send all requests to the current leader automatically.
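To make this concrete, here is a minimal sketch of an application container entrypoint that polls the sidecar and only does work while its own pod holds the lease. It assumes the image ships curl and jq, and relies on the fact that a container's hostname defaults to its pod name:

#!/bin/sh
# Poll the leader-elector sidecar over the pod-local network.
while true; do
    leader=$(curl -s http://localhost:4040 | jq -r .name)
    if [ "$leader" = "$(hostname)" ]; then
        # This pod holds the lease; perform the leader-only work.
        echo "I am the leader; doing work..."
    else
        echo "Current leader is $leader; staying dormant."
    fi
    sleep 5
done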