7
Building a reusable microservice framework

This chapter covers

  • Building a microservice chassis
  • Advantages of enforcing uniform practices across teams
  • Abstracting common concerns in a reusable framework

Once an organization fully embraces microservices and teams grow in number, it’s quite likely that each of those teams will start specializing in a given set of programming languages and tools. Sometimes, even when using the same programming language, different teams will choose a different combination of tools to achieve the same purpose. Nothing is wrong with this, but it can make it harder for engineers to move between teams: the ritual for setting up new services, as well as the code structure, may be quite different. Even if teams eventually end up solving the same challenges in different ways, we believe this potential duplication is better than having to add a synchronization layer.

Having strict rules on the tools and languages that teams can use and enforcing a canonical way of setting up new services across all teams may harm speed and innovation and will eventually lead to the use of the same tools for every problem. Fortunately, you can enforce some common practices while leaving teams rather free to choose the programming language for specific services. You can encapsulate a set of tools for each adopted language while making sure that engineers have access to resources that make it easy to abide by the practices across all teams. If team A decides to go with Elixir to create a service for managing notifications and team B decides to use Python for an image analysis service, both should have the tools that allow those two services to emit metrics to the common metrics collection infrastructure.

You should centralize logs in the same place and with the same format, and things like circuit breakers, feature flags, or the ability to share the same event bus should be available. That way, teams can make choices but also have the tools to stay aligned with the infrastructure available to run their services. These tools form a chassis, a foundation, that you can build new services on without much up-front investigation and ceremony. Let's consider how to build a chassis for your services—one that abstracts common concerns and architectural choices while at the same time enabling teams to quickly bootstrap new services.

7.1 A microservice chassis

Imagine an organization has eight different engineering teams with four engineers on each team. Now imagine one engineer on each team is responsible for bootstrapping a new service in Python, Java, or C#. Those languages, like most mainstream languages, offer a lot of options in the form of available libraries. From HTTP clients to logging libraries, the choice is plentiful. What are the odds that two teams using the same language would end up with the same combination of libraries? I’d say pretty slim! This issue isn’t exclusive to microservice applications; on a monolithic application I worked on, different programmers were using three distinct HTTP client libraries!

In figures 7.1 to 7.3, you can see the choices a team may face while choosing components to use in a new project.

As you can see in figures 7.1 to 7.3, the choice isn’t easy! No matter which language you choose, options are plentiful, so the time you take to select components can increase, along with the risk of picking a less-than-ideal library. An organization will most likely settle on two or three languages as the widely adopted ones, depending on the problems it needs to solve. As a result, teams using the same language will coexist. Once one team gains some experience with a set of libraries, why wouldn’t you use that experience to the benefit of other teams? You can provide a set of libraries and tools already used in production that people bootstrapping new projects could choose from without the burden of having to dig deeply into each library to weigh the pros and cons.

c07_01.png

Figure 7.1 Search results for object-relational mapping (ORM) libraries for the .NET ecosystem

To make it easier for your teams to create new services, it’s worth providing a basic structure and a set of vetted tools for each of the languages your organization uses to build and operate services. You also should make sure that structure abides by your standards regarding observability and the abstraction of infrastructure-related code and that it reflects your architectural choices regarding communication between services. For example, if the organization favors asynchronous communication between services, you’d provide the libraries needed to use an event bus infrastructure that’s already in place.

Not only would you be able to soft-enforce some practices, you also could make it easier to spawn new services quickly and allow fast prototyping. After all, it wouldn't make sense to take longer to bootstrap a service than to write the business logic that powers it.

c07_02.png

Figure 7.2 Search for Advanced Message Queuing Protocol (AMQP) libraries for the Java ecosystem

c07_03.png

Figure 7.3 Search for circuit breaker libraries for Python

The chassis structure allows teams to select a tech stack (language + libraries) and quickly set up a service. You might ask yourself: how hard is it to bootstrap a service without this so-called chassis? It can be easy, if you don’t have concerns like

  • Enabling deployments in the container scheduler from day one (CI/CD)
  • Setting up log aggregation
  • Collecting metrics
  • Having a mechanism for synchronous and asynchronous communication
  • Error reporting

At SimpleBank, no matter what programming language or tech stack a team chooses, services should provide all the functionality described in the list above. This type of setup isn’t trivial to achieve, and, depending on the stack you choose, it can take more than a day to set up. Also, the combination of libraries two teams would choose for the same purpose could be quite different. You mitigate any issues related to that difference by providing a microservice chassis, so each team can focus on delivering features that SimpleBank customers will be using.

7.2 What’s the purpose of a microservice chassis?

The purpose of a microservice chassis is to make services easier to create while ensuring you have a set of standards that all services abide by, no matter which team owns a service. Let’s look into some of the advantages of having a microservice chassis in place:

  • Making it easier to onboard team members
  • Getting a good understanding of the code structure and concerns regarding the tech stack that an engineering team uses
  • Limiting the scope of experimentation for production systems as the team builds common knowledge, even if not always in the same tech stack
  • Helping to adhere to best practices

Having a predictable code structure and commonly used libraries will make it easier for team members to quickly understand a service’s implementation. They’ll only need to bother with the business logic implementation, because any other code will be pretty much common throughout all services. For example, common code will include code to deal with or configure

  • Logging
  • Configuration fetching
  • Metrics collection
  • Data store setup
  • Health checks
  • Service registry and discovery
  • The chosen transport-related boilerplate (AMQP, HTTP)

If common code has already taken care of those concerns when someone is creating a new service, the need to write boilerplate is reduced or eliminated, and developers will be less likely to reinvent the wheel. Good practices within the organization will also be easier to enforce.
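To make one of these concerns concrete, consider configuration fetching. A chassis might expose a small helper so every service reads its settings the same way. The following is only a sketch; the variable names and defaults are illustrative rather than part of any particular framework.

import os

def get_config():
    """Read the settings every service needs from environment variables."""
    return {
        "AMQP_URI": os.environ.get("AMQP_URI", "amqp://guest:guest@rabbitmq"),
        "STATSD_HOST": os.environ.get("STATSD_HOST", "statsd-agent"),
        "STATSD_PORT": int(os.environ.get("STATSD_PORT", "8125")),
        "SENTRY_DSN": os.environ.get("SENTRY_DSN", ""),  # empty disables error reporting
    }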

From a knowledge sharing perspective, having a microservice chassis will also enable easy code reviews by members of different teams. If they’re using the same chassis, they’ll be familiar with the code structure and how things are done. This will increase visibility and allow you to gather opinions from engineers from other teams. It’s always desirable to have a different view on the problems a specific team is working on solving.

7.2.1 Reduced risk

By providing a microservice chassis, you reduce the risk you face, because you’ll have less of a chance of picking a combination of language and libraries that won’t work for a particular need. Imagine a service you’re creating needs to communicate fully asynchronously with other services using an already existing event bus. If your chassis already covers that use case, you’re not likely to end up with a setup that you need to tweak and that eventually won’t work well. You can cover that asynchronous communication use case as well as the synchronous one so you don’t need to expend further effort to find a working solution.

The chassis can be constantly evolving to incorporate the findings of different teams, allowing you to be always up to date with the organization’s practices and experience dealing with multiple use cases. All in all, there will be less chance for a team to face a challenge that other teams haven’t solved before. And in case no one has solved that type of challenge yet, only one team needs to solve it; then you can incorporate the solution into the chassis, reducing the risks other teams have to take in the future.

Having a microservice chassis that already selects a set of libraries for use will limit the dependency management an engineering team has to deal with. Referring to figures 7.1 to 7.3, if you have one ORM, one AMQP, and one circuit breaker library available, those will eventually be well known across multiple teams, and if someone finds a vulnerability in any of those libraries, you’ll be able to update them with ease.

7.2.2 Faster bootstrapping

It makes little sense to spend one or two days bootstrapping a service when it could take far less time to implement the business logic. Also, wiring the needed components that form a service is a repetitive task that can be error prone. Why make people have to go and set up components all over again every time they create a new service? Using, maintaining, and updating a microservice chassis will lead to a setup that’s sound, tested, and reusable. This will allow for faster service bootstrapping. Then you could use the extra time you gained by not having to write boilerplate code to develop, test, and deploy your features.

Having a sound foundation that teams use widely and know well allows you to experiment a lot more without worrying too much about the initial effort. If you can quickly turn a concept into a deployable service, you can easily validate it and decide to proceed with it or abandon it altogether. The key notion here is to be fast and to make it as easy as possible to create new functionality. Having a chassis in place also can significantly lower the entry barrier for new team members, because it’ll be quicker for them to jump into any project once they learn the structure that’s common to all services in each language.

7.3 Designing a chassis

At SimpleBank, the team responsible for implementing the purchasing and selling of stocks decided to create a chassis for the wider engineering team to use; they had faced a couple of challenges and wanted to share their experiences. We described a feature for selling stocks in chapter 2, figure 2.7. Let’s look at a flow diagram to better understand it (figure 7.4).

To sell stocks, a user issues a request via the web or a mobile application. An API gateway will pick up the request and will act as the interface between the user-facing application and all internal services that’ll collaborate to provide the functionality.

c07_04.png

Figure 7.4 The flow for selling stocks involves both synchronous and asynchronous communication between the intervening services.

Given that it can take a while to place the order to the stock exchange, most operations will be asynchronous, and you’ll return a message to the client indicating their request will be processed as soon as possible. Let’s look into the interactions between services and the type of communication:

  1. The gateway passes the user request to the orders service.
  2. The orders service sends an OrderCreated event to the event queue.
  3. The orders service requests the reservation of a stock position from the account transaction service.
  4. The orders service replies to the initial call from the gateway, then the gateway informs the user that the order is being processed.
  5. The market service consumes the OrderCreated event and places the order with the stock exchange.
  6. The market service emits an OrderPlaced event to the event queue.
  7. Both the fees service and the orders service consume the OrderPlaced event; they then charge the fees for the operation and update the status of the order to “placed,” respectively.

For this feature, you have four internal services collaborating, interactions with an external entity (stock exchange), and communication that’s a mix of synchronous and asynchronous. The use of the event queue allows other systems to react to changes; for instance, a service responsible for emailing or real-time notifications to clients can easily consume the OrderPlaced event, allowing it to send notifications of the placed order.

Given that the team owning this feature was comfortable with using Python, they created the initial prototype using the nameko framework (https://github.com/nameko/nameko). This framework offers, out of the box, a few things:

  • AMQP RPC and events (pub-sub)
  • HTTP GET, POST, and websockets
  • CLI for easy and rapid development
  • Utilities for unit and integration testing

But a few things were missing, like circuit breakers, error reporting, feature flags, and emitting metrics, so the team decided to create a code repository with libraries to take care of those concerns. They also created a Dockerfile and a Docker Compose file to allow building and running the feature with minimum effort and to offer a base for other teams to use when developing in Python. The code for the initial Python chassis (http://mng.bz/s4B2) and for the described feature (http://mng.bz/D19l) is available in the book’s code repository.
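Before we look at each part of the chassis in detail, the following condensed sketch gives a flavor of how the flow in figure 7.4 can map onto nameko code. It isn’t the actual repository code; the service and method names mirror the description above but are assumptions.

import uuid

from nameko.events import EventDispatcher
from nameko.rpc import rpc, RpcProxy

class OrdersService:
    name = "orders_service"
    dispatch = EventDispatcher()
    # synchronous dependency on the account transaction service (step 3)
    account_transactions = RpcProxy("account_transaction_service")

    @rpc
    def sell_shares(self, stock, quantity):
        order_id = str(uuid.uuid4())
        # step 2: emit OrderCreated so other services can react asynchronously
        self.dispatch("order_created", {"id": order_id,
                                        "stock": stock,
                                        "quantity": quantity})
        # step 3: blocking RPC call to reserve the stock position
        self.account_transactions.reserve_position(order_id, stock, quantity)
        # step 4: reply to the gateway, which tells the user the order is in progress
        return {"order_id": order_id, "status": "processing"}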

We’ll now look in more detail at how the built chassis deals with service discovery, observability, transport, and balancing and limiting.

7.3.1 Service discovery

Service discovery for the Python chassis that emerged from implementing the feature we previously described is quite simple. The communication between the services involved occurs either synchronously via RPC calls or asynchronously by publishing events. SimpleBank uses RabbitMQ (www.rabbitmq.com) as the message broker, so this indirectly provides a way of registering services for both the asynchronous and synchronous use cases. RabbitMQ allows synchronous request/response communication by implementing RPC over queues, and it’ll also load balance the consumers using a round-robin algorithm (https://en.wikipedia.org/wiki/Round-robin_scheduling) by default. This allows you to use the messaging infrastructure to register services as well as to automatically distribute load between multiple instances of the same service. Figure 7.5 shows the RPC exchange your different services connect to.

c07_05.png

Figure 7.5 Services communicating via RPC register in an exchange. Multiple instances for a given service use the same routing key, and RabbitMQ will route the incoming requests between those instances.

All running services register themselves in this exchange. This allows them to communicate seamlessly without the need for each one to know explicitly where any service is located. This is also the case for RPC communication over the AMQP protocol, which allows you to have the same request/response behavior you’d get by using HTTP.

Let’s take a look at how easy it is to have this feature available to you by using the capabilities that the chassis provides, in this case by using the nameko framework, as shown in the following listing.

Listing 7.1 microservices-in-action/chapter-7/chassis/rpc_demo.py

from nameko.rpc import rpc, RpcProxy

class RpcResponderDemoService:
    name = "rpc_responder_demo_service"    ①  
    @rpc    ②  
    def hello(self, name):
        return "Hello, {}!".format(name)

class RpcCallerDemoService:
    name = "rpc_caller_demo_service"
    remote = RpcProxy("rpc_responder_demo_service")    ③  

    @rpc
    def remote_hello(self, value="John Doe"):
        res = u"{}".format(value)
        return self.remote.hello(res)  ④  

In this example, we’ve defined two classes, a responder and a caller. In each class, we also defined a name variable that holds the identifier for the service. The @rpc annotation decorates the function, transforming what seems like an ordinary function into something that uses the underlying AMQP infrastructure (which RabbitMQ provides) to invoke a method in a service running elsewhere. Calling the remote_hello method from the RpcCallerDemoService class will result in invoking the hello function in the RpcResponderDemoService, because that service is registered as remote via an RpcProxy that the framework provides.

Once you run this example code, RabbitMQ will display something like figure 7.6.

In figure 7.6, you can observe that once you boot the services that rpc_demo.py defines, each one registers in a queue scoped to the service name: rpc-rpc_caller_demo_service and rpc-rpc_responder_demo_service. Two other queues—rpc.reply-rpc_caller_demo_service* and rpc.reply-standalone_rpc_proxy*—also appear, and they’ll relay the responses back to the caller service. This is a way of implementing blocking synchronous communication in RabbitMQ (http://mng.bz/4blSh).

c07_06.png

Figure 7.6 Caller and responder demo services registered in RabbitMQ queues
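The rpc.reply-standalone_rpc_proxy queue in figure 7.6 hints at another convenience: you can also invoke these services from outside a running nameko worker, for example from a script or a test, via the standalone proxy. The following is a minimal sketch, assuming a local RabbitMQ broker at its default URI.

from nameko.standalone.rpc import ClusterRpcProxy

config = {"AMQP_URI": "amqp://guest:guest@localhost"}  # assumed local broker

with ClusterRpcProxy(config) as cluster_rpc:
    # sends an RPC request over RabbitMQ and blocks until the reply arrives
    print(cluster_rpc.rpc_caller_demo_service.remote_hello("Jane Doe"))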

Your chassis makes it super easy to access this functionality so you can use the same infrastructure for both synchronous and asynchronous communication between services. This setup brings you huge speed gains while prototyping solutions, because the team can spend its time developing new features instead of having to build all the functionality from scratch. Whether you opt for an orchestrated behavior with blocking calls between services, a choreographed behavior where all communication is asynchronous, or a mix of the two, you can use the same infrastructure and library.

The following listing shows an example of how to use fully asynchronous communication between services by using the functionality of the chassis.

Listing 7.2 microservices-in-action/chapter-7/chassis/events_demo.py

from nameko.events import EventDispatcher, event_handler
from nameko.rpc import rpc
from nameko.timer import timer

class EventPublisherService:
    name = "publisher_service"    ①  
    dispatch = EventDispatcher()  ②  

    @rpc
    def publish(self, event_type, payload):
        self.dispatch(event_type, payload)

class AnEventListenerService:
    name = "an_event_listener_service"    ①  
    @event_handler("publisher_service", "an_event")    ③  
    def consume_an_event(self, payload):
        print("service {} received:".format(self.name), payload)

class AnotherEventListenerService:
    name = "another_event_listener_service"

    @event_handler("publisher_service", "another_event")
    def consume_another_event(self, payload):
        print("service {} received:".format(self.name), payload)

class ListenBothEventsService:
    name = "listen_both_events_service"    ①  
    @event_handler("publisher_service", "an_event")    ③  
    def consume_an_event(self, payload):
        print("service {} received:".format(self.name), payload)
    @event_handler("publisher_service", "another_event")    ③  
    def consume_another_event(self, payload):
        print("service {} received:".format(self.name), payload)

As with the previous code example, each service, implemented as a Python class, declares a name variable that the framework will use to set up the underlying queues that allow communication. When running the services that each class in this file defines, RabbitMQ will create four queues, one for each service. As you can see in figure 7.7, the publisher service registers an RPC queue without a reply queue, unlike the previous example that figure 7.6 illustrated. The other listener services register a queue per consumed event.

c07_07.png

Figure 7.7 The queues that RabbitMQ creates when you run the services defined in events_demo.py

The team chose nameko to be part of the microservice chassis because it makes it easy to abstract away the details of implementing and setting up these two types of communication over the existing message broker. In section 7.3.3, we’ll also look into another advantage that comes out of the box, because the message broker also takes care of load balancing.
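Dispatching one of these events from outside a running service is just as straightforward. The following minimal sketch uses nameko’s standalone event dispatcher, again assuming a local RabbitMQ broker; the payload is made up for illustration.

from nameko.standalone.events import event_dispatcher

config = {"AMQP_URI": "amqp://guest:guest@localhost"}  # assumed local broker

dispatch = event_dispatcher(config)
# the first argument must match the publishing service's name, the second the
# event type that the handlers in events_demo.py subscribe to
dispatch("publisher_service", "an_event", {"message": "hello, listeners"})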

7.3.2 Observability

To operate and maintain services, you need to be aware of what’s going on in production at all times. As a result, you’ll want the services to emit metrics to reflect the way they’re operating, report errors, and aggregate logs in a usable format. In part 4 of the book, we’ll focus on all these topics in more detail. But for now, let’s keep in mind that services should address these concerns from day one. Operating and maintaining services is as important as writing them in the first place, and, in most cases, they’ll spend a lot more time running than being developed.

Your microservice chassis has the dependencies shown in the following listing.

Listing 7.3 microservices-in-action/chapter-7/chassis/setup.py

(…)

    keywords='microservices chassis development',

    packages=find_packages(exclude=['contrib', 'docs', 'tests']),

    install_requires=[
        'nameko>=2.6.0',
        'statsd>=3.2.1',    ①  
        'nameko-sentry>=0.0.5',    ②  
        'logstash_formatter>=0.5.16',    ③  
        'circuitbreaker>=1.0.1',
        'gutter>=0.5.0',
        'request-id>=0.2.1',
    ],


(…)

Of the seven declared dependencies, you use three for observability purposes. These libraries will allow you to collect metrics, report errors and gather some contextual information around them, and adapt your logging to the format you use in all services deployed at SimpleBank.

Metrics

Let’s start with metrics collection and the use of StatsD. Etsy originally developed StatsD as a way to aggregate application metrics. It quickly became so popular that it’s now the de facto protocol to collect application metrics, with clients in multiple programming languages. To be able to use StatsD, you need to instrument your code to capture all the metrics you find relevant. Then a client library, in your case statsd for Python, will collect those metrics and send them to an agent that listens to UDP traffic from client libraries, aggregates the data, and periodically sends it to a monitoring system. Both commercial and open source solutions are available for the monitoring system.
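Under the hood, the protocol is deliberately simple: metrics travel as small plain-text datagrams of the form name:value|type over UDP. The following sketch shows roughly what a client emits; the host and port match the local agent described next, and the metric names are only illustrative.

import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# a timer metric: 320 ms measured for a hypothetical charge_fee operation
sock.sendto(b"simplebank-demo.fees.charge_fee:320|ms", ("statsd-agent", 8125))
# a counter increment
sock.sendto(b"simplebank-demo.fees.orders_processed:1|c", ("statsd-agent", 8125))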

In the code repository, you’ll be able to find a simple agent that’ll be running in its own Docker container to simulate metrics collection. It’s a trivial Ruby script that listens on port 8125 over UDP and outputs to the console, as follows.

Listing 7.4 microservices-in-action/chapter-7/feature/statsd-agent/statsd-agent.rb

#!/usr/bin/env ruby
#
# This script was originally found in a post by Lee Hambley
# (http://lee.hambley.name)
#
require 'socket'
require 'term/ansicolor'

include Term::ANSIColor

$stdout.sync = true

c = Term::ANSIColor
s = UDPSocket.new
s.bind("0.0.0.0", 8125)
while blob = s.recvfrom(1024)
  metric, value = blob.first.split(':')
  puts "StatsD Metric: #{c.blue(metric)} #{c.green(value)}"
end

This simple script allows you to simulate metrics collection while developing your services. Figure 7.8 shows metrics collection for services running when placing a sell order, the feature we use as an example for this chapter.

Using an annotation in each service’s code, you enable it to send metrics for some operations. Even though this is a simple example, because they’re only emitting timing metrics, it serves the purpose of showing how you can instrument your code to collect data you find relevant. Let’s look into one of the services to see how this is done. Consider listing 7.5.

c07_08.png

Figure 7.8 StatsD agent collecting metrics that services collaborating in a place sell order operation have emitted

Listing 7.5 microservices-in-action/chapter-7/feature/fees/app.py

import json
import datetime
from nameko.events import EventDispatcher, event_handler
from statsd import StatsClient    ①  

class FeesService:
    name = "fees_service"
    statsd = StatsClient('statsd-agent', 8125,
                         prefix='simplebank-demo.fees')    ②  

    @event_handler("market_service", "order_placed")
    @statsd.timer('charge_fee')    ③  
    def charge_fee(self, payload):
        print("[{}] {} received order_placed event ... charging fee".format(
            payload, self.name))

To collect metrics using the StatsD client library, you need to initialize the client by passing the hostname, in this case statsd-agent, the port, and an optional prefix for metrics collected in this service’s scope. If you annotate the charge_fee method with @statsd.timer('charge_fee'), the library will wrap the execution of that method in a timer, collect the resulting value, and send it to the agent. You can collect these metrics and feed them to monitoring systems that’ll allow you to observe your system behavior and set up alerts or even autoscale your services.

For example, imagine the fees service becomes too busy, and the execution time that StatsD reports increases over a threshold you set. You can automatically be alerted about that and immediately investigate to understand if the service is throwing errors or if you need to increase its capacity by adding more instances. Figure 7.9 shows an example of a dashboard displaying metrics that StatsD collected.
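The timer is only one of the metric types the statsd client exposes. The following sketch shows a few others you might sprinkle into service code; the metric names are made up for illustration.

from statsd import StatsClient

statsd = StatsClient('statsd-agent', 8125, prefix='simplebank-demo.fees')

statsd.incr('fees_charged')          # count how many fees were charged
statsd.gauge('pending_orders', 17)   # report a point-in-time value
with statsd.timer('db_lookup'):      # time an arbitrary block of code
    pass                             # ... query the data store here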

Error reporting

Metrics allow you to observe how the system is behaving on an ongoing basis, but, unfortunately, they aren’t the only thing you need to care about. Sometimes errors happen, and you need to be alerted about them and, if possible, gather some information about the context in which the error occurred. For example, you might get a stack trace so you can diagnose and try to replicate and solve the error. Several services provide alerting and aggregation of errors. It’s easy to integrate error reporting in your services, as shown in the following listing.

Listing 7.6 microservices-in-action/chapter-7/chassis/http_demo.py

import json
from nameko.web.handlers import http
from werkzeug.wrappers import Response
from nameko_sentry import SentryReporter    ①  

class HttpDemoService:
    name = "http_demo_service"
    sentry = SentryReporter()    ②  

    @http("GET", "/broken")
    def broken(self, request):
        raise ConnectionRefusedError()    ③  

    @http('GET', '/books/<string:uuid>')
    def demo_get(self, request, uuid):
        data = {'id': uuid, 'title': 'The unbearable lightness of being',
                'author': 'Milan Kundera'}
        return Response(json.dumps({'book': data}),
                        mimetype='application/json')

    @http('POST', '/books')
    def demo_post(self, request):
        return Response(json.dumps({'book': request.data.decode()}),
                        mimetype='application/json')

c07_09.png

Figure 7.9 Example of a dashboard displaying metrics that StatsD collected from an application

Setting up error reporting in the chassis you assembled is simple. You initialize the error reporter, and it’ll take care of capturing any exceptions and sending them over to the error reporting service backend. It’s common for the error reporter to send along some context with the errors, like a stack trace. Figure 7.10 shows the dashboard with the error you get if you access the /broken endpoint in the demo service.

c07_10.png

Figure 7.10 Dashboard for an error reporting service (Sentry) after accessing the /broken endpoint

Logging

Your services output information either to log files or to the standard output. These logs can record a given interaction, such as the result and timing of an HTTP call, or any other information developers find useful to record. With multiple services running, you potentially have many services logging information across the organization. In a microservice architecture, where interactions happen between multiple services, you need to make sure you can trace those interactions and have access to them in a consistent way.

Logging is a concern for all teams and plays an important role in any organization. This is the case either for compliance reasons, when you may need to keep track of specific operations, or for allowing you to understand the flow of execution between different systems. The importance of logging is a sound reason for making sure that teams, no matter what language they’re using to develop their services, keep logs in a consistent way and, preferably, aggregate them in a common place.

At SimpleBank, the log aggregation system allows complex searches in logs, so you agree to send logs to the same place and in the same format. You use the Logstash format for logging, so the Python chassis includes a library to emit logs in that format.

Logstash is an open source data processing pipeline that allows ingestion of data from multiple sources. The Logstash format became quite popular and is widely used; it’s a JSON message with some default fields, such as the ones you can find in the following listing.

Listing 7.7 Logstash JSON-formatted message

{
  "message"    => "hello world",
  "@version"   => "1",
  "@timestamp" => "2017-08-01T23:03:14.111Z",
  "type"       => "stdin",
  "host"       => "hello.local"
}

Figure 7.11 shows the log output that the gateway service generates when receiving a place sell order request from a client. In such cases, it generates two messages. They both contain a wealth of information, like the filename, module, and line of code executing, as well as the time it took for the operation to complete. The only information you passed explicitly to the logger was what appears in the message fields. The library you’re using inserts all the other information.

By sending this information to a log aggregation tool, you can correlate data in many interesting ways. In this case, here are some example queries:

  • Group by module and function name
  • Select all entries for operations that took longer than x milliseconds
  • Group by host

c07_11.png

Figure 7.11 Logstash formatted log messages that the gateway service generated

The most interesting thing is that the host, type, version, and timestamp fields will appear in all the messages that the services using the chassis generate, so you can correlate messages from different services.

In your Python chassis, the following listing shows the code responsible for generating the log entries you can see in figure 7.11.

Listing 7.8 Logstash logger configuration in the Python chassis

import logging
from logstash_formatter import LogstashFormatterV1

logger = logging.getLogger(__name__)
handler = logging.StreamHandler()
formatter = LogstashFormatterV1()
handler.setFormatter(formatter)
logger.addHandler(handler)

(…)

# to log a message …
logger.info("this is a sample message")

This code is responsible for initializing the logging and adding the handler that’ll format the output in the Logstash JSON format.
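If you need to attach request-scoped context to these JSON logs, the standard-library logging module passes anything supplied via extra through to the formatter, and the Logstash formatter serializes those attributes alongside the default fields. The following is a sketch building on listing 7.8; the field names are illustrative, not part of the chassis.

import logging
from logstash_formatter import LogstashFormatterV1

logger = logging.getLogger(__name__)
handler = logging.StreamHandler()
handler.setFormatter(LogstashFormatterV1())
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# anything passed via `extra` becomes an attribute on the log record
logger.info(
    "placing sell order",
    extra={"order_id": "a1b2c3", "request_id": "req-42", "elapsed_ms": 211},
)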

By using the microservice chassis, you create a standard way of accessing the tools to achieve the goal of running observable services. By choosing certain libraries, you’re able to enforce having all teams use the same underlying infrastructure without forcing any team to choose a particular language.

7.3.3 Balancing and limiting

We mentioned in section 7.3.1 on service discovery that the message broker provided not only a way for services to discover each other implicitly but also a load balancing capability.

While benchmarking the place sell order feature, say you realize you have a bottleneck in your processing. The market service has to interact with an external actor, the stock exchange, and only after a successful response does it emit the OrderPlaced event that both the fees service and the orders service will consume. Requests are accumulating because the HTTP call to the external service is slower than the rest of the processing in the system. For this reason, you decide to increase the number of instances running the market service. You deploy three instances to compensate for the extra time that placing the order with the stock exchange takes. This change is seamless, because once you add the new instances, they’re registered with the rpc-market_service queue in RabbitMQ. Figure 7.12 shows the three instances of the service connected.

As you can see, three instances are connected to the queue, each of them set to prefetch 10 messages from the queue as soon as they arrive. Now that you have multiple instances consuming from the same queue, you need to make sure only one of those instances processes each request. Once again, RabbitMQ makes your life easier because it deals with load balancing. By default, it’ll use a round-robin algorithm to schedule the delivery of messages between the service instances. This means it’ll deliver the first 10 messages to instance 1, then the next 10 to instance 2, and finally 10 to instance 3. It’ll keep repeating this over and over. This is a naïve approach to scheduling work, because one instance may take longer than another one, but it generally works quite well and is easy to understand.

c07_12.png

Figure 7.12 Multiple instances of the market service registered in the RPC queue

The only thing you need to be careful about is checking whether the connected instances are healthy so they don’t start accumulating messages. You can do so by using metrics, via StatsD, to monitor the number of messages each instance is processing and whether they’re accumulating. In your code, you also can implement health checks so that any instance not responding to those health check requests can be flagged and restarted. RabbitMQ also will work as a limiting buffer, storing messages until the service instances can process them. According to the configuration shown in figure 7.12, each instance will receive 10 messages to process at a time and will only be assigned new messages after it has finished processing previous ones.
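The health checks mentioned above don’t need to be elaborate. Here is a minimal sketch using the same nameko HTTP entrypoint as the earlier examples; what counts as healthy here (a trivial 200 response) is an assumption, and a real check might also verify broker or data store connectivity.

import json

from nameko.web.handlers import http
from werkzeug.wrappers import Response

class HealthDemoService:
    name = "health_demo_service"

    @http("GET", "/health")
    def health(self, request):
        # extend this to check RabbitMQ, the data store, and so on
        return Response(json.dumps({"status": "ok"}), mimetype="application/json")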

It’s worth mentioning that in the particular case of the market service, because it interacts with a third-party system, you also implement a circuit-breaking mechanism. Let’s look at the service code where the call to the stock exchange is implemented, as follows.

Listing 7.9 microservices-in-action/chapter-7/feature/market/app.py

import json
import requests
(…)
from statsd import StatsClient
from circuitbreaker import circuit    ①  

class MarketService:
    name = "market_service"
    statsd = StatsClient('statsd-agent', 8125,
                         prefix='simplebank-demo.market')

    (…)

    @statsd.timer('place_order_stock_exchange')
    @circuit(failure_threshold=5, expected_exception=ConnectionError)    ②  
    def __place_order_exchange(self, request):
        print("[{}] {} placing order to stock exchange".format(
            request, self.name))
        response = requests.get('https://jsonplaceholder.typicode.com/posts/1')
        return json.dumps({'code': response.status_code, 'body': response.text})

You make use of the circuit breaker library to configure the number of consecutive failures to connect that you’ll tolerate. In the example shown, if you have five consecutive failing calls with the ConnectionError exception, you’ll open the circuit, and no call will be made for 30 seconds. After those 30 seconds, you’ll enter the recovery stage, allowing one test call. If the call is successful, it’ll close the circuit again, resuming normal operation and allowing calls to the external service; otherwise, it’ll prevent calls for another 30 seconds.

You could use this technique not only for external calls but also for calls between internal components, because it allows you to degrade the service gracefully. In the case of the market service, using this technique would mean messages retrieved from the queue wouldn’t be acknowledged and would accumulate in the broker. Once connectivity to the external service resumed, you’d be able to start processing messages from the queue again. You could complete the call to the stock exchange and create the OrderPlaced event that allows both the fees service and the orders service to complete the execution of a place sell order request.
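If you prefer to make that recovery window explicit rather than rely on the library default, the circuitbreaker package also accepts a recovery_timeout argument (in seconds). The following sketch is illustrative; the function body is omitted.

from circuitbreaker import circuit

@circuit(failure_threshold=5, recovery_timeout=30, expected_exception=ConnectionError)
def place_order_exchange(request):
    ...  # call the stock exchange here, for example with requests.get(...)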

7.4 Exploring the feature implemented using the chassis

In the previous section, you saw code examples for the implementation of the place sell order feature. Let’s briefly look into the resulting feature prototype implemented using the chassis. Based on the chassis code that you can find in the code repository under chapter-7/chassis, say you’ve created five services:

  • Gateway
  • Orders service
  • Market service
  • Account transactions service
  • Fees service

Figure 7.13 shows the project structure and a Docker Compose file that allows you to locally start the five components and the StatsD agent we mentioned previously. The Docker Compose file will allow booting the services as well as the needed infrastructure components: RabbitMQ, Redis, and the local StatsD agent, which will simulate metrics collection.

We won’t go deep on Docker or Docker Compose right now, because we’ll cover them in upcoming chapters. But if you do have Docker and Docker Compose available, you can boot the services by entering the feature directory and running docker-compose up --build. This will build a Docker container for each service and boot everything up.

Figure 7.14 shows all services running and processing a POST request to the shares/sell gateway endpoint.

Even though the feature makes use of both synchronous and asynchronous communication between the different components, the chassis you have in place allows you to quickly prototype it and run initial benchmarks using a tool such as Siege that simulates concurrent requests, with results such as the following. (Note that these benchmarks ran locally on a development machine and are merely indicative.)

$ siege -c20 -t300S -H 'Content-Type: application/json' 'http://192.168.64.3:5001/shares/sell POST'

    (benchmark running for 5 minutes …)

Lifting the server siege...
Transactions:                  12663 hits
Availability:                  100.00 %
Elapsed time:                  299.78 secs
Data transferred:              0.77 MB
Response time:                 0.21 secs
Transaction rate:              42.24 trans/sec
Throughput:                    0.00 MB/sec
Concurrency:                   9.04
Successful transactions:       12663
Failed transactions:           0
Longest transaction:           0.52
Shortest transaction:          0.08

These numbers look good, but it’s worth mentioning that once the benchmark stopped, the market service still needed to consume 3000 messages—almost a quarter of the total requests that the gateway processed. This benchmark allows you to identify the bottleneck happening in the market service that we mentioned in section 7.3.3. Referring to figure 7.4, you can see that the gateway receives a response from the orders service, but asynchronous processing still happens after that.

c07_13.png

Figure 7.13 Project structure for the place sell order feature and the Docker Compose file that allows booting the services and the needed infrastructure components

c07_14.png

Figure 7.14 Services used in the place sell order running locally

The engineering team at SimpleBank certainly will continue to improve the Python chassis so it reflects continuous team learnings. For now though, it’s already usable to implement nontrivial functionality.

7.5 Wasn’t heterogeneity one of the promises of microservices?

In the previous sections, we covered building and using a chassis for Python applications at SimpleBank. You can apply the principles to any language used within your organization though. At SimpleBank, teams also use Java, Ruby, and Elixir for building services. Would you go and build a chassis for each of these languages and stacks? If the language is widely adopted within the organization and different teams bootstrap more than a couple of services, I’d say sure! But it’s not imperative that you create a chassis. The only thing to keep in mind is that with or without a chassis, you need to maintain principles like observability.

One of the advantages of a microservice architecture is enabling heterogeneity of languages, paradigms, and tooling. In the end, it enables teams to choose the right tool for the job. Although in theory the choices are limitless, the fact is, teams will specialize in a couple of technology stacks for their day-to-day development. They’ll naturally develop a deeper knowledge around one or two different languages and their supporting ecosystems. That supporting ecosystem matters: independent teams, such as the ones you need to have in place to successfully run a microservice architecture, will also focus on operations and will know about the platforms running their apps, such as the Java virtual machine (JVM) or the Erlang virtual machine (BEAM). Knowing about the infrastructure will help with delivering better and more efficient apps.

Netflix is a good example because they have a deep knowledge of the JVM. This enables them to be a proficient contributor of open source tools, allowing the community to benefit from the same tools they use to run their service. The fact that they have so many tools written targeting the JVM will make that ecosystem the first choice for their engineering teams. In some sense, it feels like: “You’re free to choose whatever you want, as long as it abides by our given set of rules and implements some interfaces..., or you can use this chassis that takes care of all of that!”

Having existing chassis for some of the languages and stacks an organization has adopted may help direct teams’ choices toward those languages and stacks. Not only will services be easier and faster to bootstrap, they’ll also become more maintainable from a risk standpoint. A chassis is a great way to indirectly enforce key concerns and practices of an engineering team.

Summary

  • A microservice chassis allows for quick bootstrapping of new services, enabling greater experimentation and reducing risk.
  • The use of a chassis allows you to abstract the implementation of certain infrastructure-related code.
  • Service discovery, observability, and different communication protocols are concerns of a microservice chassis, and it should provide them.
  • You can quickly prototype a complex feature like the place sell order example, if the proper tooling exists.
  • Although the microservice architecture is often associated with the possibility of building systems in any language, those systems, when in production, need to offer some guarantees and have mechanisms to allow their operation and maintenance to be manageable.
  • A microservice chassis is a way to provide those guarantees while allowing fast bootstrap and quick development for you to test ideas and, if proven, deploy them to production.