Chapter 6. Observability

One of the greatest challenges with the management of a microservices architecture is simply trying to understand the relationships between the individual components of the overall system. A single end-user transaction might flow through several, perhaps a dozen or more, independently deployed microservices, each running in its own pod, so knowing where performance bottlenecks or errors have occurred is valuable information.

In this chapter, we will touch on tracing via Jaeger, metrics via Grafana and Prometheus, plus service graphing via Kiali.

Istio’s Mixer capability is implemented as two different pods and services within the istio-system namespace. The following queries make them easier to spot because they share a common label:

oc get pods -l istio=mixer -n istio-system
oc get services -l istio=mixer -n istio-system

istio-policy and istio-telemetry are the services that make up Istio’s Mixer functionality.

Tracing

Often the first thing to understand about your microservices architecture is specifically which microservices are involved in an end-user transaction. When many teams deploy dozens of microservices, all independently of one another, it is often difficult to understand the dependencies across that “mesh” of services. Istio’s Mixer comes “out of the box” with the ability to pull tracing spans from your distributed microservices. This means that tracing is programming-language agnostic, so you can use this capability in a polyglot world where different teams, each with its own microservice, may be using different programming languages and frameworks.

Although Istio supports both Zipkin and Jaeger, for our purposes we focus on Jaeger, which implements OpenTracing, a vendor-neutral tracing API. Jaeger was originally open sourced by the Uber Technologies team and is a distributed tracing system specifically focused on microservices architecture.

An important term to understand here is span, which Jaeger defines as “a logical unit of work in the system that has an operation name, the start time of the operation, and the duration. Spans can be nested and ordered to model causal relationships. An RPC call is an example of a span.”

Another important term is trace, which Jaeger defines as “a data/execution path through the system, and can be thought of as a directed acyclic graph of spans.”
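
To make those two terms concrete, here is a minimal, hypothetical sketch using the OpenTracing Java API. It assumes the opentracing-api and opentracing-util artifacts are on the classpath, and the operation names are made up for illustration; a parent span and a nested child span together form a simple trace:

import io.opentracing.Span;
import io.opentracing.Tracer;
import io.opentracing.util.GlobalTracer;

public class SpanExample {
    public static void main(String[] args) {
        // Assumes a concrete tracer (e.g., Jaeger) has already been
        // registered with GlobalTracer; otherwise a no-op tracer is used.
        Tracer tracer = GlobalTracer.get();

        // A span: a named, timed unit of work
        Span parent = tracer.buildSpan("get-customer").start();

        // A nested span models a causal relationship, such as an RPC call
        // made while handling the parent operation
        Span child = tracer.buildSpan("call-preference")
                           .asChildOf(parent)
                           .start();
        child.finish();
        parent.finish();
        // Together, the parent and child spans form a simple trace:
        // a directed acyclic graph with two nodes.
    }
}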

Open the Jaeger console by using the following command:

minishift openshift service tracing --in-browser

You can then select Customer from the drop-down list box and explore the traces found, as illustrated in Figure 6-1.

Figure 6-1. Jaeger’s view of the customer-preference-recommendation trace

It’s important to remember that your programming logic must forward the OpenTracing headers from your inbound call to every outbound call:

x-request-id
x-b3-traceid
x-b3-spanid
x-b3-parentspanid
x-b3-sampled
x-b3-flags
x-ot-span-context

However, your chosen framework may support carrying those headers automatically. In the case of the customer and preference services, which are Spring Boot implementations, there is the opentracing-spring-cloud library.
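
If your framework does not do this for you, you must copy the headers from the inbound request onto every outbound call yourself. The following is a hedged sketch using Spring’s RestTemplate; it is not the actual code of the customer or preference services, and the class name, URL parameter, and method name are hypothetical:

import java.util.Arrays;
import java.util.List;

import javax.servlet.http.HttpServletRequest;

import org.springframework.http.HttpEntity;
import org.springframework.http.HttpHeaders;
import org.springframework.http.HttpMethod;
import org.springframework.web.client.RestTemplate;

public class HeaderForwarder {

    // The tracing headers listed earlier in this section
    private static final List<String> TRACING_HEADERS = Arrays.asList(
            "x-request-id", "x-b3-traceid", "x-b3-spanid",
            "x-b3-parentspanid", "x-b3-sampled", "x-b3-flags",
            "x-ot-span-context");

    private final RestTemplate restTemplate = new RestTemplate();

    // Copies any tracing headers present on the inbound request onto the
    // outbound call so that Istio can stitch both hops into a single trace.
    public String callDownstream(HttpServletRequest inbound, String url) {
        HttpHeaders outboundHeaders = new HttpHeaders();
        for (String name : TRACING_HEADERS) {
            String value = inbound.getHeader(name);
            if (value != null) {
                outboundHeaders.set(name, value);
            }
        }
        HttpEntity<Void> entity = new HttpEntity<>(outboundHeaders);
        return restTemplate
                .exchange(url, HttpMethod.GET, entity, String.class)
                .getBody();
    }
}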

Our customer and preference services use the TracerResolver library, so the concrete tracer can be loaded automatically without our code having a hard dependency on Jaeger. Because the Jaeger tracer can be configured via environment variables, we don’t need to do anything else to get a properly configured Jaeger tracer ready and registered with OpenTracing. That said, there are cases where it’s appropriate to manually configure a tracer; refer to the Jaeger documentation for more information on how to do that.
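
For reference, resolver-based registration looks roughly like the following. This is a sketch under the assumption that the opentracing-tracerresolver and opentracing-util libraries, plus the Jaeger client, are on the classpath; the environment variable names in the comment are the Jaeger Java client’s standard configuration knobs, not something specific to these services:

import io.opentracing.Tracer;
import io.opentracing.contrib.tracerresolver.TracerResolver;
import io.opentracing.util.GlobalTracer;

public class TracingBootstrap {

    public static void init() {
        // TracerResolver discovers whichever concrete tracer implementation
        // is on the classpath (Jaeger, in our case) so the application code
        // has no compile-time dependency on it.
        Tracer tracer = TracerResolver.resolveTracer();
        if (tracer != null) {
            GlobalTracer.register(tracer);
        }
        // The Jaeger tracer itself reads its configuration from environment
        // variables on the pod, for example JAEGER_SERVICE_NAME,
        // JAEGER_AGENT_HOST, JAEGER_SAMPLER_TYPE, and JAEGER_SAMPLER_PARAM.
    }
}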

By default, Istio captures or samples 100% of the requests flowing through the mesh. This is valuable in a development scenario where you are attempting to debug aspects of the application, but it might be too voluminous in a different setting such as a performance benchmark or a production environment. The sampling rate is defined by the PILOT_TRACE_SAMPLING environment variable on the Istio Pilot Deployment, which you can view or edit via the following command:

oc edit deployment istio-pilot -n istio-system

Metrics

By default, Istio gathers telemetry data across the service mesh, and it leverages Prometheus and Grafana to collect and visualize those metrics. You can get the URL to the Grafana console using the minishift service command:

minishift openshift service grafana --url

Make sure to select Istio Workload Dashboard in the upper left of the Grafana dashboard, as demonstrated in Figure 6-2.

Figure 6-2. The Grafana dashboard—selecting Istio Workload Dashboard

You can also visit the Prometheus dashboard directly with the following command:

minishift openshift service prometheus --in-browser

The Prometheus dashboard allows you to query for specific metrics and graph them. For instance, you can review the total request count for the recommendation service, specifically the “v2” version, as seen in Figure 6-3:

istio_requests_total{destination_app="recommendation",
 destination_version="v2"}
Figure 6-3. The Prometheus dashboard

You can also review other interesting data points, such as pod memory usage, with the following query string:

container_memory_rss{container_name="customer"}

Prometheus is a very powerful tool for gathering and extracting metric data from your Kubernetes/OpenShift cluster, and it is currently a top-level (graduated) project within the Cloud Native Computing Foundation, alongside Kubernetes itself. For more information on query syntax and alerting, review the documentation at the Prometheus website.

Service Graph

Istio has provided the basic, out-of-the-box Servicegraph visualization since its earliest days. Now a new, more comprehensive service graph tool and overall health monitoring solution called Kiali has been created by the Red Hat team, as depicted in Figure 6-4. The Kiali project answers interesting questions such as: Which microservices are part of my Istio service mesh, and how are they connected?

At the time of this writing, Kiali must be installed separately, and those installation steps are somewhat involved. Kiali needs to know the URLs for both Jaeger and Grafana, which requires some environment-variable substitution using the envsubst tool. The envsubst tool comes from a package called gettext and is available for Fedora via:

dnf install gettext

Or macOS:

brew install gettext

And the Kiali installation steps:

# URLs for Jaeger and Grafana
export JAEGER_URL="https://tracing-istio-system.$(minishift ip).nip.io"
export GRAFANA_URL="https://grafana-istio-system.$(minishift ip).nip.io"
export IMAGE_VERSION="v0.10.0"

curl -L http://git.io/getLatestKiali | bash

If you run into trouble with the installation process, make sure to visit the Kiali user forum. Like Istio itself, Kiali is a fast-moving project that continues to change rapidly.

Figure 6-4. The Kiali dashboard

As you can see from these out-of-the-box capabilities as well as the third-party additions, Istio makes the components of your overall application (its mesh of microservices) much more visible and more observable. In previous chapters you injected errors as well as network delays, and now, with these additional tools, you can better track down where the potential problems are.
