Chapter 7. Architecture of Microservice-Based Systems

This chapter discusses how microservices should behave when viewed from the outside and how an entire microservice system can be developed. Chapter 8, “Integration and Communication,” covers the possible communication technologies, which are an important technical component. Chapter 9, “Architecture of Individual Microservices,” focuses on the architecture of individual microservices.

Section 7.1 describes what the domain architecture of a microservice system should look like. Section 7.2 presents appropriate tools to visualize and manage the architecture. Section 7.3 shows how the architecture can be adapted in a stepwise manner. Only a constant evolution of the software architecture will ensure that the system remains maintainable in the long run and can be developed further. Section 7.4 discusses the goals and approaches that are important to enable further development.

Next, a number of approaches for the architecture of a microservice-based system are explained. Section 7.6 discusses the special challenges that arise when a legacy application is to be enhanced or replaced by microservices. Section 7.8 introduces event-driven architecture. This approach makes possible architectures that are very loosely coupled.

Finally, Section 7.9 deals with the technical aspects relevant to the architecture of a microservice-based system. Some of these aspects are presented in depth in the following sections: mechanisms for coordination and configuration (section 7.10), Service Discovery (section 7.11), Load Balancing (section 7.12), scalability (section 7.13), security (section 7.14), and finally documentation and metadata (section 7.15).

7.1 Domain Architecture

The domain architecture of a microservice-based system determines which microservices within the system should implement which domain. It defines how the entire domain is split into different areas, each of which is implemented by one microservice and thus one team. Designing such an architecture is one of the primary challenges when introducing microservices. It is, after all, an important motivation for the use of microservices that changes to the domain can ideally be implemented by just one team changing just one microservice, minimizing coordination and communication across teams. Done correctly, this ensures that microservices can support the scaling of software development, since even large teams need little communication and therefore can work productively.

To achieve this, it is important that the design of the domain architecture for the microservices makes it possible for changes to be limited to single microservices and thus individual teams. When the distribution into microservices does not support this, changes will require additional coordination and communication, and the advantages that a microservice-based approach can bring will not be achieved.

Strategic Design and Domain-Driven Design

Section 3.3 discussed the distribution of microservices based on strategic design, a concept taken from domain-driven design. A key element is that the microservices are distributed into contexts—that is, areas that represent separate functionality.

Often architects develop a microservice architecture based on entities from a domain model: a certain microservice implements the logic for a certain type of entity. Using this approach might give, for instance, one microservice for customers, one for items, and one for deliveries. However, this approach conflicts with the idea of the Bounded Context, which stipulates that a single uniform data model for the entire domain is impossible. Furthermore, this approach isolates changes very badly. When a process is to be modified and entities have to be adapted, the change is distributed across different microservices. As a result, changing the order process will impact the entity modeling for customers, items, and deliveries. When that is the case, the three microservices for the different entities have to be changed in addition to the microservice for the order process. To avoid this, it can be sensible to keep certain parts of the data for customers, items, and deliveries in the microservice for the order process. With this approach, when changes to the order process require the data model to be modified, the change can be limited to a single microservice.

However, this does not prevent a system from having services dedicated to the administration of certain entities. It may be necessary to manage the most fundamental data of a certain business entity in a service. For example, a service can certainly administer the core customer data but leave context-specific customer data, such as a bonus program number, to other microservices—for example, to the microservice for the order process, which likely has to know this number.
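A minimal Java sketch can illustrate this kind of split; all class and field names are invented for illustration. Each Bounded Context models only the customer data it actually needs:

// Registration context: administers the core customer data.
class Customer {
    private String customerId;
    private String name;
    private String email;
    // getters and setters omitted
}

// Order context: a deliberately different view of the customer.
// It references the core customer and adds only the data the
// order process cares about, such as the bonus program number.
class OrderCustomer {
    private String customerId;         // reference to the core customer
    private String bonusProgramNumber; // known only to the order process
    private String deliveryAddress;
    // getters and setters omitted
}

With this split, a change to the order process, for instance a new field related to the bonus program, stays within the order microservice and does not force a change in the registration service.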

Example Otto Shop

An example—the architecture of the Otto shop[1]—illustrates this concept. Otto GmbH is one of the biggest e-commerce companies. In the architecture there are, on the one hand, services like user, order, and product, which are oriented toward data, and on the other hand, areas like tracking, search and navigation, and personalization, which are not geared to data but to functionality. This is exactly the type of domain design that should be aimed for in a microservice-based system.

1. https://dev.otto.de/2016/03/20/why-microservices/

A domain architecture requires a precise understanding of the domain. It comprises not only the division of the system into microservices but also the dependencies between them. A dependency arises when one microservice uses another—for instance, by calling it, by using elements from its UI, or by replicating its data. Such a dependency means that changes to a microservice can also influence the microservice that depends on it. If, for example, a microservice modifies its interface, the dependent microservice has to be adapted to these changes. Also, new requirements affecting the dependent microservice might mean that the other microservice has to modify its interface: if the dependent microservice needs more data to implement the requirements, the other microservice has to offer this data and adjust its interface accordingly.

For microservices such dependencies cause problems beyond just software architecture: If the microservices involved in a change are implemented by different teams, then the change will require collaboration between those teams; this overhead can be time consuming and laborious.

Managing Dependencies

Managing dependencies between microservices is central to the architecture of a system. Having too many dependencies will prevent microservices from being changed in isolation—which goes against the objective of developing microservices independently of each other. Here are two fundamental rules to apply for good architecture:

• There should be a loose coupling between components such as microservices. This means that each microservice should have few dependencies on other microservices. This makes it easier to modify them since changes will only affect individual microservices.

• Within a component such as a microservice, the constituent parts should work closely together. This is referred to as having high cohesion. This ensures that all constituent parts within a microservice really belong together.

When these two prerequisites are not met, it will be difficult to change an individual microservice in an isolated manner, and changes will have to be coordinated across multiple teams and microservices—which is just what microservice-based architectures are supposed to avoid. However, this is often only a symptom: the fundamental problem is how the domain-based split of the functionality between the microservices was done. Evidently, pieces of functionality that should have been placed together in one microservice have been distributed across different microservices. An order process, for instance, also needs to generate a bill. These two pieces of functionality are different enough that they belong in at least two microservices. However, when each modification of the order process also affects the microservice that creates the bills, the domain-based modeling is not optimal and should be adjusted: the pieces of functionality have to be distributed differently to the microservices, as we will see.

Unintended Domain-Based Dependencies

It is not only the number of dependencies that can pose a problem. Certain domain-based dependencies can simply be nonsensical. For instance, it would be surprising in an e-commerce system if the team responsible for product search suddenly had an interface with the microservice for billing, because there should be no such relationship from a domain-based point of view. However, when it comes to domain modeling, there are always surprises. When a dependency is not meaningful from a domain-based point of view, something about the division of functionality between the microservices has to be wrong. Maybe one microservice implements features that belong in other microservices from a domain-based perspective. Perhaps in the context of product search a scoring of the customer is required, which is implemented as part of billing. In that case one should consider whether this functionality is really implemented in the right microservice. To keep the system maintainable over the long term, such dependencies have to be questioned and, if necessary, removed from the system. For instance, the scoring could be moved into a new, independent microservice or transferred into another existing microservice.

Cyclic Dependencies

Cyclic dependencies can present additional problems for a comprehensive architecture. Let us assume that the microservice for the order process calls the microservice for billing (see Figure 7.1), and the microservice for billing fetches data from the order process microservice. When the microservice for the order process is changed, modifications to the microservice for billing might be necessary, since billing fetches data from the order process. Conversely, changes to the billing microservice can require changes to the order microservice, as the order microservice calls the billing microservice. Cyclic dependencies are problematic: the components can no longer be changed in isolation, contrary to the underlying aim of splitting the system into separate components. For microservices great emphasis is placed on independence, which is violated in this case. In addition to the coordination of changes, the deployment may also have to be coordinated: when a new version of one microservice is rolled out, a new version of the other microservice might have to be rolled out as well.


Figure 7.1 Cyclic Dependency

The remainder of the chapter shows approaches to building microservice-based architectures in such a way that they have a sound structure from a domain-based perspective. Metrics like cohesion and loose coupling can help verify that the architecture is really appropriate. However, with approaches like event-driven architecture (section 7.8), microservices have hardly any direct technical dependencies since they only exchange messages. Which service sends a message and which one processes it is difficult to determine from the code, so the metrics may look very good even though, from a domain-based perspective, the system is far too complicated: the domain-based dependencies that arise when two microservices exchange messages are invisible to code analysis. Thus metrics can only suggest problems. Optimizing the metrics only treats the symptoms; the underlying problems remain unsolved. Worse, even systems with good metrics can have architectural weaknesses, so such metrics are of limited value in determining the quality of a software system.

A special problem in the case of microservices is that dependencies between microservices can also influence their independent deployment. If a microservice requires a new version of another microservice because it uses, for instance, a new version of an interface, the deployment becomes dependent as well: the other microservice has to be deployed before the dependent microservice can be deployed. In extreme cases this can result in a large number of microservices that have to be deployed in a coordinated manner—which is just what was supposed to be avoided. Microservices should be deployed independently of each other. Therefore, dependencies between microservices can present an even greater problem than dependencies between modules within a deployment monolith.

7.2 Architecture Management

For a domain architecture it is critical which microservices exist and what the communication relationships between the microservices look like. This is true in other systems as well, where the relationships between the components are very important. When domain-based components are mapped onto modules, classes, Java packages, JAR files, or DLLs, specific tools can determine the relationships between the components and enforce adherence to certain rules. This is achieved by static code analysis.

Tools for Architecture Management

If an architecture is not properly managed, then unintended dependencies will quickly creep in. The architecture will get more and more complex and hard to understand. Only with the help of architecture management tools can developers and architects keep track of the system. Within a development environment developers view only individual classes. The dependencies between classes can only be found in the source code and are not readily discernible.

Figure 7.2 depicts the analysis of a Java project by the architecture management tool Structure 101. The image shows classes and Java packages, which contain classes. A levelized structure map (LSM) presents an overview of them. Classes and packages that are higher up the LSM use classes and packages that are depicted lower down the LSM. To simplify the diagram, these relationships are not indicated.


Figure 7.2 Screenshot of the Architecture Management Tool Structure 101

Cycle-Free Software

Architectures should be free of cycles. Cyclic dependencies mean that two artifacts are using each other reciprocally. In the screenshot such cycles are presented by dashed lines. They always run from bottom to top. The reciprocal relationship in the cycle would be running from top to bottom and is not depicted.

In addition to cycles, packages that are located in the wrong position are also relevant. There is, for instance, a package util whose name suggests it is supposed to contain helper classes. However, it is not located at the very bottom of the diagram. Thus, it has to have dependencies on packages or classes that are further down—which should not be the case. Helper classes should be independent of other system components and should therefore appear at the very bottom of an LSM.

Architecture management tools like Structure 101 don’t just analyze architectures; they can also enable architects to define prohibited relationships between packages and classes. Developers who violate these rules will receive an error message and can modify the code.

With the help of tools like Structure 101 the architecture of a system can be easily visualized. The compiled code only has to be loaded into the tool for analysis.

Microservices and Architecture Management

For microservices the problem is much larger: relationships between microservices are not as easy to determine as the relationships between code components. After all, the microservices may even be implemented in different technologies and communicate only via the network. Their relationships cannot be managed at the code level because they appear only indirectly in the code. However, if the relationships between microservices are not known, architecture management becomes impossible.

There are different ways to visualize and manage the architecture:

• Each microservice can have associated documentation (see section 7.15) that lists all the microservices it uses. This documentation has to adhere to a predetermined format, which enables visualization (see the example format after this list).

• The communication infrastructure can deliver the necessary data. If Service Discovery (section 7.11) is used, it will be aware of all microservices and will know which microservices have access to which other microservices. This information can then be used for the visualization of the relationships between the microservices.

• If access between microservices is safeguarded by a firewall, the rules of the firewall will at least detail which microservice can communicate with which other microservice. This can also be used as a basis for the visualization of relationships.

• Traffic within the network also reveals which microservices communicate with which other microservices. Tools like Packetbeat (see section 11.3) can be very helpful here. They visualize the relationships between microservices based on the recorded network traffic.

• The distribution into microservices should correspond to the distribution into teams. If two teams cannot work independently of each other anymore, this is likely due to a problem in the architecture: The microservices of the two teams depend so strongly on each other that they can now only be modified together. The teams involved probably know already which microservices are problematic due to the increased communication requirement. To verify the problem, an architecture management tool or a visualization can be used. However, manually collected information might be sufficient.
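For the first point in the list above, a hypothetical documentation format could be as simple as one JSON file per microservice; all names are invented for illustration:

{
  "name": "order-process",
  "team": "team-order",
  "uses": ["registration", "billing", "delivery"]
}

A small importer can collect these files from all code repositories and turn them into input for a visualization or architecture management tool.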

Tools

Different tools are useful to evaluate data about dependencies:

• There are versions of Structure 101[2] that can use custom data structures as input. One still has to write an appropriate importer. Structure 101 will then recognize cyclic dependencies and can depict the dependencies graphically.

2. http://structure101.com

• Gephi[3] can generate complex graphs, which are helpful for visualizing the dependencies between microservices. Again, a custom importer has to be written for importing the dependencies between the microservices from an appropriate source into Gephi.

3. http://gephi.github.io/

• jQAssistant[4] is based on the graph database neo4j. It can be extended by a custom importer. The imported data model can then be checked against rules.

4. http://jqassistant.org/

For all these tools custom development is necessary. It is not possible to analyze a microservice-based architecture immediately; there is always some extra effort required. Since communication between microservices cannot be standardized, it is likely that custom development will always be required.
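As a sketch of the kind of custom development involved, assume the dependency data has already been collected, for instance from documentation files like the one shown above. A few lines of plain Java then suffice to detect cyclic dependencies between microservices with a depth-first search:

import java.util.*;

class DependencyChecker {

    // Microservice name -> names of the microservices it uses.
    private final Map<String, List<String>> dependencies = new HashMap<>();

    void addDependency(String from, String to) {
        dependencies.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
    }

    // Returns true if the dependency graph contains a cycle.
    boolean hasCycle() {
        Set<String> visited = new HashSet<>();
        Set<String> onStack = new HashSet<>();
        for (String service : dependencies.keySet()) {
            if (dfs(service, visited, onStack)) {
                return true;
            }
        }
        return false;
    }

    private boolean dfs(String service, Set<String> visited, Set<String> onStack) {
        if (onStack.contains(service)) return true;   // back edge: cycle found
        if (visited.contains(service)) return false;  // already checked
        visited.add(service);
        onStack.add(service);
        for (String used : dependencies.getOrDefault(service, List.of())) {
            if (dfs(used, visited, onStack)) return true;
        }
        onStack.remove(service);
        return false;
    }

    public static void main(String[] args) {
        DependencyChecker checker = new DependencyChecker();
        checker.addDependency("order-process", "billing");
        checker.addDependency("billing", "order-process"); // cyclic!
        System.out.println("Cycle detected: " + checker.hasCycle());
    }
}

The same graph could also be exported via a custom importer into Structure 101 or Gephi for visualization.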

Is Architecture Management Important?

The architecture management of microservices is important, as it is the only way to prevent chaos in the relationships between the microservices. Microservices are a special challenge in this respect: With modern tools, a deployment monolith can be quite easily and rapidly analyzed. For microservice-based architectures, there are no tools that can analyze the entire structure in a simple manner. The teams first have to create the necessary prerequisites for an analysis. Changing the relationships between microservices is difficult, as the next section will show. Therefore, it is even more important to continually review the architecture of the microservices in order to correct problems that arise as early as possible. It is a benefit of microservice-based architectures that the architecture is also reflected in the organization. Problems with communication will therefore point towards architectural problems. Even without formal architecture management, architectural problems often become obvious.

On the other hand, experiences with complex microservice-based systems teach us that in such systems, nobody understands the entire architecture. However, this is also not necessary since most changes are limited to individual microservices. If a certain use case involving multiple microservices is to be changed, it is sufficient to understand this interaction and the involved microservices. A global understanding is not absolutely necessary. This is a consequence of the independence of the individual microservices.

Context Map

Context Maps are a way to get an overview of the architecture of a microservice-based system.[5] They illustrate which domain models are used by which microservices and thereby visualize the different Bounded Contexts (see section 3.3). The Bounded Contexts not only influence the internal data representation in the microservices but also the calls between microservices where data is exchanged: the data exchanged in these calls has to conform to some model. However, the data models underlying communication can be distinct from the internal representations. For example, if a microservice is supposed to identify recommendations for customers of an e-commerce shop, complex models can be employed internally that contain a lot of information about customers, products, and orders and correlate them in complex ways. On the outside, however, these models can be much simpler.

5. Eric Evans. 2003. Domain-Driven Design: Tackling Complexity in the Heart of Software. Boston: Addison-Wesley.

Figure 7.3 shows an example of a Context Map:

• The registration microservice records the basic data of each customer. The order process also uses this data format to communicate with registration.

• In the order process the customer’s basic data is supplemented by data such as billing and delivery addresses to obtain the customer order data. This corresponds to a Shared Kernel (see section 3.3). The order process shares the kernel of the customer data with the registration process.

• The delivery and the billing microservices use customer order data for communication, and the delivery microservice uses it for the internal representation of the customer. This model is a kind of standard model for the communication of customer data.


Figure 7.3 An Example of a Context Map

• Billing uses an old mainframe data model. Therefore, the customer order data used for outside communication is decoupled from the internal representation by an anti-corruption layer. The mainframe data model is a very bad abstraction, which should not be allowed to affect other microservices.
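A minimal Java sketch of such an anti-corruption layer might look as follows; the mainframe record and all field names are invented for illustration. The layer is the only place that knows both models, so no other microservice ever sees the mainframe representation:

// Legacy representation as delivered by the mainframe (hypothetical).
class MainframeCustomerRecord {
    String custNo;
    String nameLine1;
    String addrLine1;
}

// Model used for communication with other microservices.
class CustomerOrderData {
    private final String customerId;
    private final String name;
    private final String billingAddress;

    CustomerOrderData(String customerId, String name, String billingAddress) {
        this.customerId = customerId;
        this.name = name;
        this.billingAddress = billingAddress;
    }
    // getters omitted
}

// Anti-corruption layer: translates the legacy model into the
// customer order data used for outside communication.
class BillingAntiCorruptionLayer {
    CustomerOrderData toCustomerOrderData(MainframeCustomerRecord record) {
        return new CustomerOrderData(
            record.custNo.trim(),     // assuming padded mainframe ids
            record.nameLine1.trim(),
            record.addrLine1.trim());
    }
}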

In this model it is clear that the internal data representation in registration propagates to the order process. There, it serves as the basis for the customer order data. This model is used in delivery as an internal data model as well as in the communication with billing and delivery. This leads to the model being hard to change since it is used by so many services. If this model was to be changed, all these services would have to be modified.

However, there are also advantages associated with this. When the same change to the data model is needed everywhere, a single change to the shared model updates all microservices at once. Nevertheless, this goes against the principle that changes should always affect only a single microservice. If the change remains limited to the model, the shared model is advantageous since all microservices automatically use the current modeling. However, when the change also requires code changes in the microservices, multiple microservices have to be modified—and brought into production together. This conflicts with the independent deployment of microservices.


Try and Experiment

• Download a tool for the analysis of architectures. Candidates are Structure 101,[6] Gephi,[7] or jQAssistant.[8] Use the tool to get an overview of an existing code base. What options are there to insert your own dependency graphs into the tool? This would enable you to analyze the dependencies within a microservice-based architecture with this tool.

6. http://structure101.com

7. http://gephi.github.io/

8. http://jqassistant.org

• spigo[9] is a simulation of the communication between microservices. It can be used to get an overview of more complex microservice-based architectures.

9. https://github.com/adrianco/spigo


7.3 Techniques to Adjust the Architecture

Microservices are useful in situations where the software is subject to numerous changes. Due to the distribution into microservices the system separates into deployment units, which can be developed independently of each other. This means that each microservice can implement its own stream of stories or requirements. Consequently, multiple changes can be worked on in parallel without much need for coordination.

Experience teaches us that the architecture of a system is subject to change. A certain distribution into domain-based components might seem sensible at first. However, once architects get to know the domain better, they might come to the conclusion that another distribution would be better. New requirements are hard to implement with the old architecture since it was devised based on different premises. This is especially common for agile processes, which demand less planning and more flexibility.

Where Does Bad Architecture Come From?

A system with a bad architecture does not normally arise because the wrong architecture was chosen at the outset. Based on the information available at the start of the project, the architecture is often good and consistent. The problem is frequently that the architecture is not modified when new insights suggest changes. The symptom of this was mentioned in the last section: new requirements cannot be implemented rapidly and easily anymore. To fix that, the architecture would have to be changed. When this pressure to introduce changes is ignored for too long, the architecture will, at some point, not fit at all. Continuous adjustment and modification of the architecture is essential to keeping it in a sustainable state.

This section describes some techniques that enable the interplay between microservices to be changed in order to adapt the overall system architecture.

Changes in Microservices

Within a microservice, adjustments are easy. The microservices are small and manageable, so it is no big deal to adjust structures. If the architecture of an individual microservice is completely insufficient, the microservice can be rewritten, since it is not very large. Within a microservice it is also easy to move components or to restructure the code in other ways. The term “refactoring”[10] describes techniques that serve to improve the structure of code. Many of these techniques can be automated using development tools. This enables an easy adjustment of the code of an individual microservice.

10. Martin Fowler. 1999. Refactoring: Improving the Design of Existing Code, Boston: Addison-Wesley.

Changes to the Overall Architecture

However, when the division of functionality between the microservices is no longer in line with the requirements, changing just one microservice will not be sufficient. To achieve the necessary adjustment of the complete architecture, functionality has to be moved between microservices. There can be different reasons for this:

• The microservice is too large and has to be divided. Indications for this can be that the microservice is no longer intelligible or is so large that a single team is not sufficient to develop it further. Another indication can be that the microservice contains more than one Bounded Context.

• A piece of functionality really belongs in another microservice. An indication for that can be that certain parts of a microservice communicate a lot with another microservice. In this situation the microservices no longer have a loose coupling. Such intense communication can imply that the component belongs in another microservice. Likewise, a low cohesion in a microservice can suggest that the microservice should be divided. In that case there are areas in a microservice that depend little on each other. Consequently, they do not really have to be in one microservice.

• A piece of functionality should be used by multiple microservices. For instance, this can become necessary when a microservice has to use logic from another microservice because of some new piece of functionality.

There are three main challenges: microservices have to be split, code has to be moved from one microservice into another, and multiple microservices are supposed to use the same code.

Shared Libraries

If two microservices are supposed to use code together, the code can be transferred into a shared library (see Figure 7.4). The code is removed from the microservice and packaged in a way that enables it to be used by the other microservices. A prerequisite for this is that the microservices are written in technologies that enable the use of a shared library. This is the case when they are written in the same language or at least use the same platform, such as JVM (Java Virtual Machine) or .NET Common Language Runtime (CLR).


Figure 7.4 Shared Library

A shared library means that the microservices become dependent on each other. Work on the library has to be coordinated, since features for both microservices have to be implemented in it. Through this backdoor, each microservice is affected by changes meant for the other microservice. This can result in errors, meaning that the teams have to coordinate the development of the library. Under certain conditions changes to a library can mean that a microservice has to be redeployed—for instance, because a security vulnerability has been fixed in the library.

It is also possible that through the shared library the microservices obtain additional code dependencies on third-party libraries. In a JVM, a third-party library can only be present in one version. If the shared library requires a certain version of a third-party library, the microservice also has to use this specific version and cannot use a different one. Additionally, libraries often impose a certain programming model: a library can provide code that is simply called, or it can be a framework into which custom code is integrated and is then called by the framework. The library might pursue an asynchronous or a synchronous model. Such approaches can fit more or less well to a given microservice.

Microservices do not focus on the reuse of code, since reuse leads to new dependencies between the microservices. An important aim of microservices is independence, so code reuse often causes more problems than it solves. This is a rejection of the ideal of code reuse: developers in the nineties still pinned their hopes on code reuse to increase productivity. Of course, moving code into a library also has advantages. Errors and security vulnerabilities have to be corrected only once, and the microservices always use the current library version and thus automatically get the fixes.

Another problem associated with code reuse is that it requires a detailed understanding of the code—especially in the case of frameworks into which the custom code has to embed itself. This kind of reuse is known as white-box reuse: the internal code structures have to be known, not only the interface. This requirement sets a high hurdle for reuse.

An example would be a library that makes it easier to generate metrics for system monitoring. It is used in the billing microservice, and other teams also want to use the code. Therefore, the code is extracted into a library. Since it is purely technical code, it does not need to be modified when domain-based changes are made. Therefore, the library does not hinder the independent deployment and the independent development of domain-based features. Such a library can be turned into an internal open-source project (see section 12.8).
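In a Maven build, using such a shared technical library then amounts to a simple dependency declaration; the coordinates below are invented for illustration:

<dependency>
    <groupId>com.example.infrastructure</groupId>
    <artifactId>metrics-support</artifactId>
    <version>1.2.0</version>
</dependency>

Because the library is versioned, each microservice can decide on its own when to upgrade, which keeps the deployments decoupled.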

Transferring domain code into a shared library, however, is problematic, as it can introduce deployment dependencies between microservices. When, for instance, the modeling of a customer is implemented in a library, each change to the data structure has to be passed on to all microservices, and they all have to be redeployed. Besides, a uniform modeling of a data structure like customer is difficult due to Bounded Contexts.

Transfer Code

Another way to change the architecture is to transfer code from one microservice to another (see Figure 7.5). This is sensible when doing so ensures a loose coupling and a high cohesion of the entire system. When two microservices communicate a lot, they are not loosely coupled. When the part of the microservice that communicates a lot with the other microservice is transferred, this problem can be solved.


Figure 7.5 Transferring Code

This approach is similar to the extraction into a shared library. However, the code does not become a common dependency, which solves the problem of coupling between the microservices. That said, the microservices may need a common interface in order to be able to use the functionality after the code transfer. This is a black-box dependency: only the interface has to be known, not the internal code structures.

In addition, it is possible to transfer the code into another microservice while keeping it in the original microservice. This causes redundancy. Errors will then have to be corrected in both versions, and the two versions can develop in different directions. However, this will ensure that the microservices are independent, especially with regard to deployment.

The technological limitations are the same as for a shared library—the two microservices have to use similar technologies; otherwise, the code cannot be transferred. However, in a pinch the code can also be rewritten in a new programming language or with a different programming model. Microservices are not very large. The code that has to be rewritten is only a part of a microservice. Consequently, the required effort is manageable.

However, there is the problem that the size of that microservice into which the code is transferred increases. Thus, the danger increases that the microservice turns into a monolith over time.

One example: the microservice for the order process frequently calls the billing microservice in order to calculate the price for the delivery. Both services are written in the same programming language, so the code can be transferred from one microservice into the other. From a domain perspective it turns out that the calculation of delivery costs belongs in the order-process microservice. The code transfer is only possible because both services use the same platform and programming language. It also means that the communication between the two microservices has been replaced by local calls.

Reuse or Redundancy?

Instead of attributing shared code to one or the other microservice, the code can also be maintained in both microservices. At first this sounds dangerous—after all, the code will then be redundant in two places, and bug fixes will have to be performed in both places. Most of the time developers try to avoid such situations. An established best practice is “Don’t Repeat Yourself” (DRY): each decision, and consequently all code, should exist in exactly one place in the system. In a microservice-based architecture, however, redundancy has a key advantage: the two microservices stay independent of each other and can be independently deployed and independently developed further. In this way the central characteristic of microservices is preserved.

It is questionable whether a system can be built without any redundancies at all. Especially in the beginning of object-orientation, many projects invested significant effort to transfer shared code into shared frameworks and libraries. This was meant to reduce the expenditure associated with the creation of the individual projects. In reality the code to be reused was often difficult to understand and thus hard to use. A redundant implementation in the different projects might have been a better alternative. It can be easier to implement code several times than to design it in a reusable manner and then to actually reuse it.

There are, of course, cases of successful reuse of code: hardly any project can get along nowadays without open-source libraries. At this level code reuse is taking place all the time. This approach can be a good template for the reuse of code between microservices. However, this has effects on the organization. Section 12.8 discusses organization and also code reuse using an open-source model.

Shared Service

Instead of transferring the code into a library, it can also be moved into a new microservice (see Figure 7.6). Here the typical benefits of a microservice-based architecture can be achieved: the technology of the new microservice does not matter, as long as it uses the universally defined communication technologies and can be operated like the other microservices. Its internal structure can be arbitrary—even the choice of programming language is free.


Figure 7.6 Shared Microservice

The use of a microservice is simpler than the use of a library. Only the interface of the microservice has to be known—the internal structure does not matter. Moving code into a new service reduces the average size of a microservice and therefore improves the intelligibility and replaceability of the microservices. However, the transfer replaces local calls with calls via the network, and changes for new features might no longer be limited to one microservice.

In software development big modules are often a problem. Therefore, transferring code into new microservices can be a good option for keeping modules small. The new microservice can be developed further by the team that was already responsible for the original microservice. This will facilitate the close coordination of new and old microservices since the required communication happens within only one team.

The split into two microservices also has the consequence that a call to the microservice-based system is not processed by just one single microservice but by several microservices. These microservices call each other. Some of those microservices will not have a UI but are pure backend services.

To illustrate this, let us turn again to the order process, which frequently calls the billing microservice to calculate the delivery costs. The calculation of delivery costs can be separated out into a microservice of its own. This is possible even when the billing service and the order process microservice use different platforms and technologies. However, a new interface will have to be established that enables the new delivery cost microservice to communicate with the remainder of the billing service.
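Assuming the services are built with Spring Boot, the new delivery cost microservice could expose its functionality through a small REST interface like the following sketch; the endpoint and the cost model are invented for illustration:

import java.math.BigDecimal;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@SpringBootApplication
@RestController
public class DeliveryCostService {

    // Both the order process and billing can call this endpoint.
    // Callers only need to know the HTTP interface, not the implementation.
    @GetMapping("/delivery-costs")
    public BigDecimal deliveryCosts(@RequestParam("weight") double weightInKg) {
        // Deliberately simplified cost model for illustration only.
        return BigDecimal.valueOf(4.99)
                .add(BigDecimal.valueOf(weightInKg * 0.5));
    }

    public static void main(String[] args) {
        SpringApplication.run(DeliveryCostService.class, args);
    }
}

Since only this HTTP interface has to be known, this is a black-box dependency, and the internal technology of the new microservice does not matter.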

Spawn a New Microservice

In addition, it is possible to use part of the code of a certain microservice to generate a new microservice (see Figure 7.7). The advantages and disadvantages are identical to the scenario in which code is transferred into a shared microservice. However, the motivation is different in this case: The size of the microservices is meant to be reduced to increase their maintainability or maybe to transfer the responsibility for a certain functionality to another team. Here, the new microservice is not supposed to be shared by multiple other microservices.


Figure 7.7 Spawning a New Microservice

For instance, the service for registration might have become too complex. Therefore, it is split into multiple services, each handling certain user groups. A separation along technical lines would also be possible—for instance according to CQRS (see section 9.2), event sourcing (section 9.3) or hexagonal architecture (section 9.4).

Rewriting

Finally, an additional way to handle microservices whose structure does not fit anymore is to rewrite them. This is more easily done with microservice-based architectures than with other architectural approaches due to the small size of microservices and their use via defined interfaces. This means that the entire system does not have to be rewritten—just a part. It is also possible to implement the new microservice in a different programming language, which may be better suited for the purpose. Rewriting a microservice can also be beneficial because new insights about the domain can leave their mark on the new implementation.

A Growing Number of Microservices

Experience with microservice-based systems teaches us that during the time a project is running, new microservices will be generated continuously. This involves greater effort around infrastructure and the operation of the system. The number of deployed services will increase all the time. For more traditional projects, such a development is unusual and may therefore appear problematic. However, as this section demonstrates, the generation of new microservices is the best alternative for the shared use of logic and for the ongoing development of a system. In any case the growing number of microservices ensures that the average size of individual microservices stays constant. Consequently, the positive characteristics of microservices are preserved.

Generating new microservices should be made as easy as possible, as this enables the properties of the microservice system to be preserved. Potential for optimization lies mainly in establishing the continuous delivery pipeline, the build infrastructure, and the required servers for the new microservice. Once these things are automated, new microservices can be generated comparatively easily.

Microservice-Based Systems Are Hard to Modify

This section has shown that it is difficult to adjust the overall architecture of a microservice-based system. New microservices have to be generated. This entails changes to the infrastructure and the need for additional continuous delivery pipelines. Shared code in libraries is rarely a sensible option.

In a deployment monolith such changes would be easy to introduce: Often the integrated development environments automate the transfer of code or other structural changes. Due to automation the changes are easier and less prone to errors. There are no effects whatsoever on the infrastructure or continuous delivery pipelines in the case of deployment monoliths.

Thus, changes are difficult at the level of the entire system—because it is hard to transfer functionality between different microservices. Ultimately, this is exactly the effect that was termed “strong modularization” and listed as an advantage in section 1.2: To cross the boundaries between microservices is difficult so that the architecture at the level between the microservices will remain intact in the long run. However, this means that the architecture is hard to adjust at this level.


Try and Experiment

• A developer has written a helper class, which facilitates the interaction with a logging framework that is also used by other teams. It is not very large and complex.

Should it be used by other teams?

Should the helper class be turned into a library or an independent microservice, or should the code simply be copied?


7.4 Growing Microservice-Based Systems

The benefits of microservices are seen most clearly in very dynamic environments. Due to the independent deployment of individual microservices, teams can work in parallel on different features without the need for significant coordination. This is especially advantageous when it is unclear which features are really meaningful and experiments on the market are necessary to identify promising approaches.

Planning Architecture?

In this sort of environment, it is difficult to plan a good split of the domain logic into microservices right from the start. The architecture has to be adjusted as new insights emerge:

• The separation of a system into its domain aspects is even more important for microservices than in the context of a traditional architectural approach. This is because the domain-based distribution also influences the distribution into teams and therefore the independent working of the teams—the primary benefit of microservices (section 7.1).

• Section 7.2 demonstrated that tools for architecture management cannot readily be used in microservice-based architectures.

• As section 7.3 discussed, it is difficult to modify the architecture of microservices—especially in comparison to deployment monoliths.

• Microservices are especially beneficial in dynamic environments—where it is even more difficult to determine a meaningful architecture right from the start.

The architecture has to be changeable; however, this is difficult due to the technical limitations. This section shows how the architecture of a microservice-based system can nevertheless be modified and developed further in a step-by-step manner.

Start Big

One way to handle this inherent problem is to start out with a few big microservices that are subsequently split step by step into more microservices (see Figure 7.8). Section 3.1 defined an upper limit for the size of a microservice as the amount of code that an individual team can still handle. At the start of a project it is hard to violate this upper limit. The same is true for the other upper limits: modularization and replaceability.


Figure 7.8 Start Big: A Few Microservices Develop into Progressively More Microservices

When the entire project consists of only one or a few microservices, pieces of functionality are still easy to move, because the transfer will mostly occur within one service rather than between services. Step by step, more people can be moved into the project so that additional teams can be assembled. In parallel, the system can be divided into progressively more microservices to enable the teams to work independently of each other. Such a ramp-up is also a good approach from an organizational perspective since the teams can be assembled in a stepwise manner.

Of course, it would also be possible to start off with a deployment monolith. However, starting with a monolith has a key disadvantage: there is the danger that dependencies and problems creep into the architecture that make a later separation into microservices difficult. Also, there will be only one continuous delivery pipeline. When the monolith gets distributed into microservices, the teams will have to generate new continuous delivery pipelines. This can be very onerous, especially when the pipeline for the deployment monolith was built manually; in that situation all the additional pipelines would most likely have to be built manually as well.

When projects start out with multiple microservices, this problem is avoided. There is no monolith that later has to be divided, and an approach for generating new continuous delivery pipelines has to exist right from the start. Thus the teams can work independently on their own microservices from the beginning. Over the course of the project the initial microservices are split into additional, smaller microservices.

“Start big” assumes that the number of microservices will increase over the course of the project. It is therefore sensible to start with a few big microservices and spawn new microservices in a stepwise manner. The most recent insights can always be integrated into the distribution of microservices. It is just not possible to define the perfect architecture right from the start. Instead, the teams should adapt the architecture step by step to new circumstances and insights and have the courage to implement the necessary changes.

This approach results in a uniform technology stack—this will facilitate operation and deployment. For developers it is also easier to work on other microservices.

Start Small?

It is also possible to start with a system split into a large number of microservices and use this structure as the basis for further development. However, getting this distribution of the services right from the start is very difficult. Building Microservices[11] provides an example where a team was tasked with developing a tool to support continuous delivery of a microservice-based system. The team was very familiar with the domain, had already created products in this area, and thus chose an architecture that distributed the system early on into numerous microservices. However, as the new product was supposed to be offered in the cloud, the architecture was, for subtle reasons, not suitable in some respects. Implementing changes became difficult because modifications to features had to be introduced in multiple microservices. To solve this problem and make it easier to change the software, the microservices were united again into a monolith. One year later the team decided on the final architecture and split the monolith back into microservices. This example demonstrates that splitting a system into microservices too early can be problematic—even if a team knows the domain very well.

11. Sam Newman. 2015. Building Microservices: Designing Fine-Grained Systems. Sebastopol, CA: O’Reilly Media.

Limits of Technology

However, this is in the end a limitation of the technology. If it were easier to move functionality between microservices (see section 7.3), the split into microservices could be corrected. In that case it would be much less risky to start off with a split into small microservices. When all microservices use the same technology, it is easier to transfer functionality between them. Chapter 14, “Technologies for Nanoservices,” discusses technologies for nanoservices, which are based on a number of compromises but in exchange enable smaller services and an easier transfer of functionality.

Replaceability as a Quality Criterion

An advantage of the microservice approach is the replaceability of the microservices. This is only possible when the microservices do not grow beyond a certain size and internal complexity. One objective during the continued development of microservices is to maintain the replaceability of microservices. Then a microservice can be replaced by a different implementation—for instance, if its further development is no longer feasible due to bad structure. In addition, replaceability is a meaningful aim to preserve the intelligibility and maintainability of the microservice. If the microservice is not replaceable anymore, it is probably also not intelligible anymore and therefore hard to develop any further.

The Gravity of Monoliths

One problem is that large microservices attract modifications and new features. They already cover several features; therefore, it seems a good idea to also implement new features in them. This is true for microservices that are too large but even more so for deployment monoliths. A microservice-based architecture can be aimed at replacing a monolith. In that case, however, the monolith contains so much functionality that care is needed not to introduce too many changes into it. For this purpose, new microservices can be created, even if they contain hardly any functionality at the beginning. Continuing to introduce changes and extensions into the monolith is exactly the course of action that rendered the deployment monolith unmaintainable in the first place and led to the decision to replace it with microservices.

Keep Splitting

As mentioned, most architectures do not have the problem that they were originally planned in a way that did not fit the task. In most cases the problem is more that the architecture did not keep up with the changes in the environment. A microservice-based architecture also has to be continuously adjusted; otherwise, at some point it will no longer be able to support the requirements. These adjustments include the management of the domain-based split as well as of the size of the individual microservices. This is the only way to ensure that the benefits of the microservice-based architecture are maintained over time. Since the amount of code in a system usually increases, the number of microservices should also grow in order to keep the average size constant. Thus an increase in the number of microservices is not a problem but rather a good sign.

Global Architecture?

However, the size of microservices is not the only problem. The dependencies of the microservices can also cause problems (see section 7.1). Such problems can be solved most of the time by adjusting a number of microservices—that is, those that have the problematic dependencies. This requires contributions only from the teams that work on these microservices. These teams are also the ones to spot the problems, because they are the ones affected by the bad architecture and the greater need for coordination. By modifying the architecture, they are able to solve these issues. In that case there is no need for a global management of dependencies. Metrics like a high number of dependencies or cyclic dependencies are only an indication of a problem. Whether such metrics actually show a problem can only be determined by evaluating them together with the involved teams. If the problematic components are, for instance, not going to be developed any further in the future, it does not matter if the metrics indicate a problem. Even if there is global architecture management, it can only work effectively in close cooperation with the different teams.

7.5 Don’t Miss the Exit Point or How to Avoid the Erosion of a Microservice (Lars Gentsch)

by Lars Gentsch, E-Post Development GmbH

Practically speaking, it is not too difficult to develop a microservice. But how can you ensure that the microservice remains a microservice and does not secretly become a monolith? An example illustrates the point at which a service starts to develop in the wrong direction and the measures that are necessary to ensure that the microservice remains a microservice.

Let’s envision a small web application for customer registration. This scenario can be found in nearly every web application. A customer wants to buy a product in an Internet shop (Amazon, Otto, etc.) or to register for a video-on-demand portal (Amazon Prime, Netflix, etc.). As a first step the customer is led through a small registration workflow and asked for a username, a password, an email address, and a street address. This is a small, self-contained piece of functionality, which is very well suited for a microservice.

Technologically, this service probably has a very simple structure. It consists of two or three HTML pages or an AngularJS single-page app, a bit of CSS, some Spring Boot, and a MySQL database. Maven is used to build the application.
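A minimal sketch of such a service, assuming a current Spring Boot with Spring Data JPA, might look like the following; the entity, repository, and endpoint are invented for illustration:

import jakarta.persistence.Entity;
import jakarta.persistence.GeneratedValue;
import jakarta.persistence.Id;
import org.springframework.data.repository.CrudRepository;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@Entity
class Customer {
    @Id @GeneratedValue Long id;
    String username;
    String password;
    String email;
    String address;
}

interface CustomerRepository extends CrudRepository<Customer, Long> {}

@RestController
@RequestMapping("/registrations")
class RegistrationController {

    private final CustomerRepository repository;

    RegistrationController(CustomerRepository repository) {
        this.repository = repository;
    }

    // Validate, map to the domain model, persist: pure CRUD.
    @PostMapping
    Customer register(@RequestBody Customer customer) {
        // validation of username, password, email, and address omitted
        return repository.save(customer);
    }
}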

As data is entered, it is validated, transferred into the domain model, and persisted in the database. How can this microservice grow step by step into a monolith?

Incorporation of New Functionality

Via the shop or the video-on-demand portal, items and content are supposed to be delivered that may only be accessed by people who are of age. For this purpose, the age of the customer has to be verified. One possibility is to store the birth date of the customer together with the other data and to incorporate an external service for the age verification.

Thus, the data model of our service has to be extended by the birth date. More interesting is the incorporation of the external service. To achieve this, a client for an external API has to be written, which should also be able to handle error situations like the unavailability of the provider.

The initiation of the age verification is most likely an asynchronous process, so our service might be forced to implement a callback interface, as sketched below. The microservice must then store data about the state of the process: When was the age verification process initiated? Is it necessary to remind the customer via email? Was the verification process successfully completed?
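What such a callback interface might look like is sketched here; the endpoint, the provider’s call, and the status values are all hypothetical, and the Customer entity from the sketch above is assumed to be extended by an ageVerificationStatus field. Note how process state now creeps into the service next to the domain data:

import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
class AgeVerificationCallbackController {

    private final CustomerRepository repository;

    AgeVerificationCallbackController(CustomerRepository repository) {
        this.repository = repository;
    }

    // Called asynchronously by the external age verification provider.
    @PostMapping("/age-verification/callback/{customerId}")
    void verificationResult(@PathVariable("customerId") Long customerId,
                            @RequestParam("verified") boolean verified) {
        Customer customer = repository.findById(customerId).orElseThrow();
        // Process state is stored next to the domain data: the first
        // step toward mixing concerns that belong in separate services.
        customer.ageVerificationStatus = verified ? "VERIFIED" : "REJECTED";
        repository.save(customer);
    }
}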

What Is Happening to the Microservice Here?

The following things are going on:

1. The customer data is extended by the birth date. That is not problematic.

2. In addition to customer data, there is now process data. Attention: here process data is mixed with domain data.

3. In addition to the original CRUD functionality of the service, some kind of workflow is now required. Synchronous processing is mixed with asynchronous processing.

4. An external system is incorporated. The testing effort for the registration microservice increases. An additional system and its behavior have to be simulated during test.

5. The asynchronous communication with the external system has different demands with regard to scaling. While the registration microservice requires an estimated ten instances due to load and failover, the age verification integration can be operated in a fail-safe and stable manner with just two instances. Thus, different runtime requirements are mixed here.

As the example demonstrates, an apparently small requirement like the incorporation of age verification can have tremendous consequences for the size of the microservice.

Criteria Arguing for a New Microservice Instead of Extending an Existing One

The criteria for deciding on when to start a new microservice include the following:

1. Introduction of different data models and data (domain versus process data)

2. Mixing of synchronous and asynchronous data processing

3. Incorporation of additional services

4. Different load scenarios for different aspects within one service

The example of the registration service could be extended further: the verification of the customer’s street address could also be performed by an external provider. This is commonly done to verify that the given address actually exists. Another scenario is the manual clearance of a customer in case of double registration. Incorporating a solvency check or customer scoring upon registration is likewise a frequent scenario.

All these domain-based aspects belong in principle to the customer registration and tempt developers and architects to integrate the corresponding requirements into the existing microservice. As a result the microservice grows far beyond the size a single microservice should have.

How to Recognize When a New Microservice Should Already Have Been Created

If your situation exhibits the following characteristics, then you probably already needed another microservice:

• The service can only be sensibly developed further as a Maven multimodule project or a Gradle multimodule project.

• Tests have to be divided into test groups and have to be parallelized for execution since the runtime of the tests surpasses five minutes (a violation of the “fast feedback” principle).

• The configuration of the service is grouped by domain within the configuration file, or the file is divided into separate configuration files to improve the overview.

• A complete build of the service takes long enough to have a coffee break. Fast feedback cycles are not possible anymore (a violation of the “fast feedback” principle).

Conclusion

As the example of the registration microservice illustrates, it is a significant challenge to let a microservice remain a microservice and not give in to the temptation of integrating new functionality into an existing microservice due to time pressure. This holds true even when the functionality clearly belongs, as in the example, to the same domain.

What defensive steps can be taken to prevent the erosion of a microservice? In principle, it has to be as simple as possible to create new services, including their own data storage. Frameworks like Spring Boot, Grails, and Play make a relevant contribution to this. Providing project templates like Maven archetypes and using container deployment with Docker are additional measures that simplify the creation and configuration of new microservices as well as their path into the production environment. By reducing the expenditure required to set up a new service, the barriers to introducing a new microservice decrease markedly, as does the temptation to implement new functionality in existing services.

7.6 Microservices and Legacy Applications

The transformation of a legacy application into a microservice-based architecture is a scenario frequently encountered in practice. Completely new developments are rather rare, and microservices promise advantages above all for long-term maintenance. This is especially interesting for applications that are already on the brink of becoming unmaintainable. Besides, the distribution into microservices makes continuous delivery easier to handle: instead of deploying and testing a monolith in an automated fashion, small microservices can be deployed and tested, at far lower expenditure. A continuous delivery pipeline for a microservice is not very complex, whereas for a deployment monolith the expenditure can be very large. For many companies this advantage alone justifies the effort of migrating to microservices.

In comparison to building up completely new systems, there are some important differences when migrating from a deployment monolith to microservices:

• For a legacy system the functionality is clear from the domain perspective. This can be a good basis for generating a clean domain architecture for the microservices. Such a clean domain-based division is especially important for microservices.

• However, there is already a large amount of code in existence. The code is often of poor quality. There are few tests for the code, and deployment times are often much too long. Microservices should remove these problems. Accordingly, the challenges in this area are often significant.

• Likewise, it is quite possible that the module boundaries in the legacy application do not correspond to the Bounded Context idea (see section 3.3). In that case migrating to a microservice-based architecture is a challenge because the domain-based design of the application has to be changed.

Breaking Up Code?

In a simple approach the code of the legacy application can be split into several microservices. This can be problematic when the legacy application does not have a good domain architecture, which is often the case. The code can be easily split into microservices when the microservices are geared to the existing modules of the legacy application. However, when those have a bad domain-based split, this bad division will be passed on to the microservice-based architecture. Additionally, the consequences of a bad domain-based design are even more profound in a microservice-based architecture: The design also influences the communication between teams. Besides, the initial design is hard to change later on in a microservice-based architecture.

Supplementing Legacy Applications

However, it is also possible to get by without dividing up the legacy application. An essential advantage of microservices is that the modules form a distributed system: the module boundaries are at the same time the boundaries of processes that communicate via the network. This has advantages for replacing a legacy application: it is not necessary to know the internal structures of the legacy application or to split it into microservices along those structures. Instead, microservices can supplement or modify the legacy application at its interface. For this it is very helpful when the system to be replaced is already built as an SOA (section 6.2): if there are individual services, they can be supplemented by microservices.

Enterprise Integration Patterns

Enterprise Integration Patterns12, 13 offer inspiration for possible integrations of legacy applications and microservices:

12. http://www.eaipatterns.com/toc.html

13. Gregor Hohpe, Bobby Woolf. 2003. Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions. Boston: Addison-Wesley.

• A Message Router routes certain messages to another service. For example, a microservice can select some messages that are then processed by the microservice instead of by the legacy application. In this way, the microservice-based architecture does not have to reimplement the entire logic at once but can initially take over selected parts.

• A special router is the Content Based Router. It determines based on the content of a message where the message is supposed to be sent. This enables the sending of specific messages to a specific microservice—even if the message differs only in one field.

• The Message Filter prevents a microservice from receiving uninteresting messages: it simply filters out all messages the microservice is not supposed to get.

• A Message Translator translates a message into another format. Therefore, the microservices architecture can use other data formats and does not necessarily have to employ the formats used by the legacy application.

• The Content Enricher can supplement data in the messages. If a microservice requires supplementary information in addition to the data of the legacy application, the Content Enricher can add this information without the legacy application or the microservice noticing anything.

• The Content Filter achieves the opposite: Certain data are removed from the messages so that the microservice obtains only the information relevant for it.

Figure 7.9 shows a simple example. A Message Router takes calls and sends them to a microservice or the legacy system. This enables implementation of certain functionalities in microservices. These functionalities are also still present in the legacy system but are not used there anymore. In this way the microservices are largely independent of the structures within the legacy system. For instance, microservices can start off with processing orders for certain customers or certain items. Because their scope is limited, they do not have to implement all special cases.

Image

Figure 7.9 Supplementing Legacy Applications by a Message Router

The patterns can serve as inspiration for how a legacy application can be supplemented by microservices. There are numerous additional patterns; the list provides only a glimpse of the entire catalog. As in other cases, the patterns can be implemented in different ways. Strictly speaking, they focus on messaging systems; however, it is possible to implement them with synchronous communication mechanisms as well, though this is less elegant. For instance, a REST service can accept a POST message, supplement it with additional data, and finally send it on to another microservice. That would then be a Content Enricher.
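A minimal sketch of such a synchronous Content Enricher, assuming Spring MVC; the target URL and the creditRating field are invented for illustration.

import java.util.Map;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.*;
import org.springframework.web.client.RestTemplate;

@RestController
class OrderContentEnricher {

    private final RestTemplate restTemplate = new RestTemplate();

    // accepts the POST message, enriches it, and passes it on
    @PostMapping("/orders")
    ResponseEntity<String> enrich(@RequestBody Map<String, Object> order) {
        // supplement the message with additional data
        order.put("creditRating",
            creditRatingFor((String) order.get("customerId")));
        // forward the enriched message to the actual recipient;
        // the sender never notices the additional step
        return restTemplate.postForEntity(
            "http://legacy-system/orders", order, String.class);
    }

    private int creditRatingFor(String customerId) {
        return 42; // placeholder for a call to another data source
    }
}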

To implement such patterns, the sender has to be decoupled from the recipient. This enables the integration of additional steps into the processing of requests without the sender noticing anything. With a messaging approach, this is easily possible since the sender knows only the queue into which it places messages; it does not know who fetches them. With synchronous communication via REST or SOAP, however, the message is sent directly to the recipient. Only Service Discovery (see section 7.11) decouples the sender from the recipient. Then one service can be replaced by another without changing the senders, which makes implementing the patterns easier. When the legacy application is supplemented by a Content Enricher, the Content Enricher is registered in Service Discovery instead of the legacy application, and no sender has to be modified. Introducing Service Discovery can therefore be a first step towards a microservice-based architecture since it enables the supplementation or replacement of individual services of the legacy application without modifying its users.

Limiting Integration

Especially for legacy applications, it is important that the microservices do not become too dependent on the legacy application. Often the bad structure of the old application is the very reason why it is supposed to be replaced in the first place. Therefore, certain dependencies should not be allowed at all. When microservices directly access the database of the legacy application, they become dependent on its internal data representation. Besides, neither the legacy application nor the microservices can change the schema anymore because such changes would have to be implemented in both. The shared use of a database by the legacy application and the microservices has to be avoided at all costs. However, replicating the data of the legacy application into a separate database schema is, of course, still an option.

Advantages

It is an essential advantage of such an approach that the microservices are largely independent of the architecture of the legacy application. After all, the replacement of a legacy application is mostly initiated because its architecture is no longer sustainable. This approach also makes it possible to supplement systems with microservices even when those systems were never meant to be extended. Though standard solutions in the areas of CRM, e-commerce, or ERP, for instance, are internally extensible, extending them via external interfaces can be a welcome alternative since such a supplement is often easier. Moreover, such systems often attract functionality that does not really belong there. Moving that functionality into a different deployment unit via a microservice ensures a permanent and clear delimitation.

Integration via UI and Data Replication

However, this approach only tackles the problem at the level of logic integration. Chapter 8 describes another level of integration, namely data replication. This allows a microservice to access comprehensive datasets of a legacy application with good performance. It is important that the replication is not based on the data model of the legacy application. Otherwise the data model of the legacy application would practically not be changeable anymore since it would also be used by the microservice. An integration based on the use of the same database would be even worse. Integration is also possible at the level of the UI. Links in web applications are especially attractive since they require only few changes in the legacy application.

Content Management Systems

In this manner, content management systems (CMS), for instance, which often contain many functionalities, can be supplemented by microservices. A CMS contains the data of a website and administrates the content so that editors can modify it. The microservices take over the handling of certain URLs: similar to a Message Router, an HTTP request can be sent to a microservice instead of to the CMS. Or the microservice changes elements of the CMS, as a Content Enricher does, or modifies the request, as a Message Translator does. Finally, the microservices could store data in the CMS and thereby use it as a kind of database. Besides, JavaScript that implements the UI of a microservice can be delivered via the CMS. In that case the CMS turns into a tool for delivering code to the browser.

Some examples could be:

• A microservice can import content from certain sources. Each source can have its own microservice.

• The functionality that enables a visitor of the web page to follow an author, for example, can be implemented in a separate microservice. The microservice can either have its own URL and be integrated via links, or it can modify the pages that the CMS delivers.

• While an author is still known in the CMS, there can be other logic that is completely separate from the CMS, such as vouchers or e-commerce functionality. In this case, too, a microservice can appropriately supplement the system.

Especially with CMSs that create static HTML, microservice-based approaches can be useful for dynamic content. The CMS moves into the background and is only necessary for certain content. There is a monolithic deployment of the CMS content, while the microservices can be deployed much more rapidly and independently. In this context the CMS is like a legacy application.

Conclusion

These integrations all have the advantage that the microservices are not bound to the architecture or the technology decisions of the legacy application. This gives the microservices a decisive advantage compared to modifications of the legacy application. However, migrating away from the legacy application with this approach poses a challenge at the level of architecture: microservice-based systems have to have a well-structured domain-based design to enable the implementation of features within one microservice and by an individual team. In a migration that follows the outlined approach, this cannot always be achieved since the migration is influenced by the interfaces of the legacy application. Therefore, the design cannot always be as clear-cut as desirable. Besides, domain-based features will still be implemented in the legacy application as well until a large part of the migration has been completed. During this time the legacy application cannot be removed for good. When the microservices confine themselves to transforming messages, the migration can take a very long time.

No Big Bang

The outlined approaches imply that the existing legacy application is supplemented in a stepwise manner by microservices or that individual parts of the legacy application are replaced by microservices. This has the advantage that the risk is minimized: replacing the entire legacy application in one single step entails high risk due to its size. In the end, all functionality has to be represented in the microservices, and numerous mistakes can creep in along the way. In addition, the deployment of the microservices becomes complex since they all have to be brought into production in a concerted manner in order to replace the legacy application in one step. A stepwise replacement suggests itself in the case of microservices since they can be deployed independently and can supplement the legacy application. Therefore, the legacy application can be replaced by microservices step by step.

Legacy = Infrastructure

Parts of a legacy application can also simply continue to be used as infrastructure for the microservices. For example, the database of the legacy application can also be used by the microservices. It is important that the schemas of the microservices are separate from each other and from the legacy application; after all, the microservices should not be closely coupled.

The microservices do not have to use the database of the legacy application; they can definitely use other solutions. However, the existing database is already established with regard to operations and backups. Using it can therefore be an advantage for the microservices as well. The same is true for other infrastructure components. A CMS, for instance, can likewise serve as common infrastructure to which functionality is added by the different microservices and into which the microservices can also deliver content.

Other Qualities

The migration approaches introduced so far focus on enabling the domain-based division into microservices in order to facilitate the long-term maintenance and continued development of the system. However, microservices have many additional advantages. When migrating, it is important to understand which advantage motivates the migration to microservices because, depending on this motivation, an entirely different strategy might be adopted. Microservices also offer, for instance, increased robustness and resilience since the communication with other services is handled accordingly (see section 9.5). If the legacy application currently has a deficit in this area, or if a distributed architecture already exists that has to be optimized with respect to these points, appropriate technology and architecture approaches can be defined without necessarily dividing the application into microservices.


Try and Experiment

• Do research on the remaining Enterprise Integration Patterns:

Can they be meaningfully employed when dealing with microservices? In which context?

Can they really only be implemented with messaging systems?


7.7 Hidden Dependencies (Oliver Wehrens)

by Oliver Wehrens, E-Post Development GmbH

In the beginning there is the monolith. Often it is sensible and happens naturally that software is created as a monolith. The code is clearly arranged, and the business domain is just coming into being. In that case it is better when everything has a common base. There is a UI, business logic, and a database. Refactoring is simple, deployment is easy, and everybody can still understand the entire code.

Over time the amount of code grows, and it becomes hard to see through. Not everybody knows all parts of the code anymore. Compiling takes longer, and the unit and integration tests invite developers to take a coffee break. Given a relatively stable business domain and a very large code base, many projects will at this point consider distributing the functionality into multiple microservices.

Depending on the status of the business and the understanding of the business/product owners, the necessary tasks will be completed. Source code is split up, continuous delivery pipelines are created, and servers are provisioned. During this step no new features are developed. The not-negligible effort is justified by the hope that, in the future, features can be developed faster and more independently by the teams. While developers tend to be quite convinced of this, other stakeholders often have to be persuaded first.

In principle everything has been done to reach a better architecture. There are different teams that have independent source code. They can bring their software into production at any time, independently of other teams.

Almost.

The Database

Every developer has a more or less pronounced affinity for the database. In my experience many developers view the database as a necessary evil that is somewhat cumbersome to refactor. Often tools are used that generate the database structure for the developers (e.g., Liquibase or Flyway in the JVM area). Tools and libraries (object-relational mapping) make it very easy to persist objects. A few annotations later, and the domain is saved in the database.

All these tools distance the database from typical developers, who “only” want to write their code. This sometimes has the consequence that not much attention is given to the database during development. For instance, missing indices will slow down searches on the database. This will not show up in a typical test, which does not work with large amounts of data, and thus makes it into production.

Let’s take the fictional case of an online shoe shop. The company requires a service that enables users to log in. A user service is created containing the typical fields like ID, first name, family name, address, and password. To offer fitting shoes to the users, only a selection of shoes in their actual size is supposed to be displayed. The shoe size is entered in the welcome form. What could be more sensible than to store this data in the already existing user service? Everybody is sure this is the right decision: this is user-associated data, and this is the right location for it.

Now the shoe shop expands and starts to sell additional types of clothing. Dress size, collar size, and all other related data are now also stored in the user service.

Several teams are now employed in the company. The code gets progressively more complex. At this point the monolith is split into domain-based services. The refactoring of the source code works well, and soon the monolith is split apart into many microservices.

Unfortunately, it turns out that it is still not easy to introduce changes. Because of international expansion, the team in charge of shoes wants to accept different currencies and has to modify the structure of the billing data to include the address format. During the upgrade, the database is locked; meanwhile, no dress size or favorite color can be changed. Moreover, the address data is used in various standard forms by other services and thus cannot be changed without coordination and effort. Therefore, the feature cannot be implemented promptly.

Even though the code is well separated, the teams are indirectly coupled via the database. Renaming columns in the user service database is nearly impossible because nobody knows in detail anymore who is using which columns. Consequently, the teams resort to workarounds. Either fields with names like ‘Userattribute1’ are created, which are then mapped onto the right description in the code, or separators are introduced into the data, like ‘#Color: Blue#Size:10.’ Nobody except the involved team knows what ‘Userattribute1’ means, and it is difficult to create an index on ‘#Color: #Size.’ Database structure and code become progressively harder to read and maintain.

It has to be essential for every software developer to think about how the data is persisted: not only about the database structures but also about where which data is stored. Is this table or this database really the place where the data should be located? From a business domain perspective, does this data have connections to other data? In order to remain flexible in the long term, it is worthwhile to carefully consider these questions every time. Typically, databases and tables are not created very often. However, they are a component that is very hard to modify later. Besides, databases and tables are often the origin of hidden dependencies between services. In general, data should be accessed directly in the database by exactly one service. All other services that want to use the data may only access it via the public interfaces of that service.

7.8 Event-Driven Architecture

Microservices can call each other in order to implement shared logic. For example, at the end of the order process the microservice for billing as well as the microservice for the order execution can be called to create the bill and make sure that the ordered items are indeed delivered (see Figure 7.10).

Image

Figure 7.10 Calls between Microservices

This requires that the order process knows the services for billing and delivery. If a completed order necessitates additional steps, the order service also has to call the services responsible for these steps.

Event-driven architecture (EDA) enables a different modeling: when the order processing has been successfully finished, the order process sends an event. It is an event emitter. This event signals to all interested microservices (event consumers) that there is a new successful order. One microservice can now print a bill, and another microservice can initiate the delivery (see Figure 7.11).

Image

Figure 7.11 Event-Driven Architecture

This procedure has a number of advantages:

• When other microservices are also interested in orders, they can easily register. Modifying the order process is not necessary anymore.

• Likewise, it is imaginable that other microservices also trigger identical events—again without changes to the order process.

• The processing of events is temporally decoupled: it does not have to happen immediately but can take place later.

At the architectural level, event-driven architectures have the advantage that they enable very loose coupling and thus facilitate changes: the microservices need to know very little about each other. However, the loose coupling requires the logic to be split up and therefore implemented in different microservices. This can lead to a split into microservices with UI and microservices with logic, which is not desirable: changes to the business logic often entail changes to both logic and UI, which then reside in separate microservices. Such a change can no longer readily take place in just one microservice and thus becomes more complex.

Technically, such architectures can be implemented without a lot of effort via messaging (see section 8.4). Microservices within such an architecture can very easily implement CQRS (section 9.2) or event sourcing (section 9.3).
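As an illustration, the following sketch uses Spring AMQP and RabbitMQ as one possible messaging technology; the exchange, queue, and class names are assumptions made for the example, and the RabbitTemplate is assumed to be provided by Spring Boot autoconfiguration.

import org.springframework.amqp.rabbit.annotation.RabbitListener;
import org.springframework.amqp.rabbit.core.RabbitTemplate;
import org.springframework.stereotype.Service;

// event emitter: the order process only publishes the event and does not
// know which microservices consume it
@Service
class OrderProcess {
    private final RabbitTemplate rabbitTemplate;

    OrderProcess(RabbitTemplate rabbitTemplate) {
        this.rabbitTemplate = rabbitTemplate;
    }

    void completeOrder(long orderId) {
        // domain logic for completing the order would go here; afterwards
        // the event is sent to a fanout exchange to which interested
        // microservices bind their own queues
        rabbitTemplate.convertAndSend("order-completed", "", orderId);
    }
}

// event consumer in the billing microservice: registering it requires no
// change to the order process
@Service
class Billing {
    @RabbitListener(queues = "billing-order-completed")
    void onOrderCompleted(long orderId) {
        // create the bill for the completed order
    }
}

The delivery microservice would bind its own queue to the same exchange, again without any modification to the order process.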

7.9 Technical Architecture

Defining a technology stack with which the system can be built is one of the main parts of an architecture. For individual microservices this is likewise a very important task. However, the focus of this chapter is the microservice-based system in its entirety. Of course, a certain technology can be made binding for all microservices. This has advantages: the teams can exchange knowledge about the technology, and refactorings are simpler because members of one team can easily help out on other teams.

However, defining standard technologies is not mandatory: if they are not defined, there will be a plethora of different technologies and frameworks. However, since typically only one team is in contact with each technology, such an approach can be acceptable. Generally, microservice-based architectures aim for the largest possible independence. With respect to the technology stack, this independence translates into the ability to use different technology stacks and to independently make technology decisions. However, this freedom can also be restricted.

Technical Decisions for the Entire System

Nevertheless, at the level of the entire system there are some technical decisions to make. However, other aspects are more important for the technical architecture of the microservice-based system than the technology stack for the implementation:

• As discussed in the last section, there might be technologies that can be used by all microservices, for instance databases for data storage. Using these technologies does not necessarily have to be mandatory. However, especially for persistence technologies like databases, backup and disaster recovery concepts have to exist, so at least these technical solutions have to be obligatory. The same is true for other basic systems, such as a CMS, which likewise have to be used by all microservices.

• The microservices have to adhere to certain standards with respect to monitoring, logging, and deployment. This ensures that the plethora of microservices can still be operated in a uniform manner. Without such standards this is hardly possible once there is a larger number of microservices.

• Additional aspects relate to configuration (section 7.10), Service Discovery (section 7.11) and security (section 7.14).

• Resilience (section 9.5) and Load Balancing (section 7.12) are concepts that have to be implemented in a microservice. Still, the overall architecture can demand that each microservice takes precautions in this area.

• An additional aspect is the communication of the microservices with each other (see Chapter 8). For the system in its entirety, a communication infrastructure has to be defined to which all microservices adhere.

The overall architecture does not necessarily restrict the choice of technologies. For logging, monitoring, and deployment, an interface could be defined so that there is a standard according to which all microservices log messages in the same manner and hand them over to a common log infrastructure. However, the microservices do not all have to use the same technologies for this. Similarly, it can be defined how data is handed over to the monitoring system and which data is relevant for monitoring. A microservice has to hand its data over to the monitoring, but a specific technology does not have to be prescribed. For deployment, a completely automated continuous delivery pipeline can be demanded that deploys software or deposits it into a repository in a certain manner. Which specific technology is used is, again, for the developers of the respective microservice to decide. Practically, there are advantages when all microservices employ the same technology: this reduces complexity, and there will be more experience in dealing with the technology. However, in case of specific requirements it is still possible to use a different technical solution when, for this special case, its advantages predominate. This is an essential advantage of the technology freedom of microservice-based architectures.

Sidecar

Even if certain technologies for implementing the demands on microservices are rigidly defined, it is still possible to integrate other technologies. For this, the concept of a sidecar can be very useful. A sidecar is a process that integrates into the microservice-based architecture via the standard technologies and offers an interface that enables another process to use these features. That other process can be implemented in an entirely different technology, so technology freedom is preserved. Figure 7.12 illustrates the concept: the sidecar uses the standard technologies and makes them accessible to another microservice in an arbitrary technology. Since the sidecar is an independent process, it can be called, for instance, via REST, so that microservices in arbitrary technologies can use it. Section 13.12 shows a concrete example of a sidecar.

Image

Figure 7.12 A Sidecar Renders All Standard Technologies Accessible via a Simple Interface

With this approach, even microservices whose technology would otherwise exclude the use of the common technical basis for configuration, Service Discovery, and security can be integrated into the architecture, because no client component has to be available for their technology.
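The following sketch assumes the sidecar support of Spring Cloud Netflix, which implements exactly this concept for Eureka-based systems; the port and health URI are example values.

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.netflix.sidecar.EnableSidecar;

// The sidecar is a separate Java process that registers a non-Java
// microservice with Eureka and offers it Service Discovery via HTTP.
@SpringBootApplication
@EnableSidecar
public class SidecarApplication {
    public static void main(String[] args) {
        SpringApplication.run(SidecarApplication.class, args);
    }
}

// application.yml of the sidecar (illustrative values):
//
// sidecar:
//   port: 8000                                # port of the accompanied service
//   health-uri: http://localhost:8000/health  # its health check endpoint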

In some regards the definition of the technology stack also affects other fields. The definition of technologies across all microservices also affects the organization or can be the product of a certain organization (see Chapter 12, “Organizational Effects of a Microservices-Based Architecture”).


Try and Experiment

• Suppose a microservice-based architecture is to be defined.

Which technical aspects could it comprise?

Which aspects would you prescribe to the teams? Why?

Which aspects should the teams decide on their own? Why?

In the end, the question is how much freedom you allow the teams to have. There are numerous possibilities, ranging from complete freedom up to the prescription of practically all aspects. However, some areas can only be centrally defined—the communication protocols, for example. Section 12.3 discusses in more detail who should make which decisions in a microservice-based project.


7.10 Configuration and Coordination

Configuring microservice-based systems is laborious. They comprise a plethora of microservices, which all have to be provided with the appropriate configuration parameters.

Some tools can store configuration values and make them available to all microservices. Ultimately, these are key/value stores, which save a certain value under a certain key:

• Zookeeper14 is a simple hierarchical system that can be replicated onto multiple servers in a cluster. Updates arrive at the clients in an orderly fashion. It can also be used in a distributed environment, for instance for synchronization. Zookeeper has a consistent data model: all nodes always have the same data. The project is implemented in Java and is available under the Apache license.

14. https://zookeeper.apache.org/

• etcd15 originates from the Docker/CoreOS environment. It offers an HTTP interface with JSON as the data format. etcd is implemented in Go and is also available under the Apache license. Like Zookeeper, etcd has a consistent data model and can be used for distributed coordination. For instance, etcd enables the implementation of locking in a distributed system.

15. https://github.com/coreos/etcd

• Spring Cloud Config16 likewise has a REST API. The configuration data can be provided by a Git backend, so Spring Cloud Config directly supports the versioning of data. The data can also be encrypted to protect passwords. The system is well integrated into the Java framework Spring and can be used in Spring systems without additional effort, for Spring itself already provides configuration mechanisms. Spring Cloud Config is written in Java and is available under the Apache license. It does not offer support for synchronizing distributed components.

16. http://cloud.spring.io/spring-cloud-config/
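As a sketch of the client side: with Spring Cloud Config, a microservice can obtain values from the configuration server via standard Spring mechanisms. The server URL and the property name are examples.

import org.springframework.beans.factory.annotation.Value;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
class GreetingController {

    // the value is fetched from the configuration server at startup
    @Value("${greeting.message}")
    private String message;

    @GetMapping("/greeting")
    String greeting() {
        return message;
    }
}

// bootstrap.properties of the client (illustrative values):
//
// spring.application.name=greeting-service
// spring.cloud.config.uri=http://config-server:8888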

Consistency as Problem

Some of the configuration solutions offer consistent data. This means that all nodes return the same data when called. In a sense this is an advantage. However, according to the CAP theorem, in the case of a network failure a node can then either return a possibly inconsistent response or none at all. In the end, without a network connection the node cannot know whether other nodes have already received other values. If the system allows only consistent responses, it cannot respond at all in this situation. For certain scenarios this is entirely sensible.

For instance, only one client should execute a certain piece of code at a given time, for example to initiate a payment exactly once. The necessary locking can be done by the configuration system: within the configuration system there is a variable that has to be set upon entering this code. Only then may the code be executed. In the end, it is better for the configuration system to return no response at all than for two clients to execute the code in parallel by chance.
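The following sketch shows such a lock with Zookeeper, using the Apache Curator library; the connection string and the lock path are examples.

import java.util.concurrent.TimeUnit;
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.locks.InterProcessMutex;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class PaymentLock {
    public static void main(String[] args) throws Exception {
        CuratorFramework client = CuratorFrameworkFactory.newClient(
            "zookeeper:2181", new ExponentialBackoffRetry(1000, 3));
        client.start();

        InterProcessMutex lock = new InterProcessMutex(client, "/locks/payment");
        // only one client at a time obtains the lock
        if (lock.acquire(10, TimeUnit.SECONDS)) {
            try {
                // initiate the payment exactly once
            } finally {
                lock.release();
            }
        }
        client.close();
    }
}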

However, for configurations such strict consistency requirements are often unnecessary. It may be better for a system to get an old value than no value at all. Within CAP, different compromises are possible. For instance, under certain conditions etcd returns a possibly stale response rather than no response at all.

Immutable Server

Another problem with the centralized storage of configuration data is that the microservices no longer depend only on the state of their own file system and its files but also on the state of the configuration server. A microservice therefore cannot be exactly replicated anymore; the state of the configuration server is relevant as well. This makes reproducing errors, and troubleshooting in general, more difficult.

In addition, a configuration server is in opposition to the concept of an immutable server. In this approach every software change leads to a new installation of the software: upon an update the old server is shut down, and a new server with an entirely new installation of the software is started. With an external configuration server, however, part of the configuration is not present on the server, so the server can still be changed after all by adjusting the configuration, which is exactly what is not supposed to happen. To prevent this, the configuration can be stored on the server itself instead of on the configuration server. In that case configuration changes can only be implemented by rolling out a new server.

Alternative: Installation Tools

The installation tools discussed in section 11.4 represent a completely different approach to configuring individual microservices. These tools support not only the installation of software but also its configuration. For instance, they can generate configuration files, which are subsequently read by the microservices. The microservice itself does not notice the central configuration since it only reads a configuration file. Still, this approach supports all scenarios that typically occur in a microservice-based architecture. It thus allows a central configuration and is not in opposition to the immutable server concept, since the configuration is completely transferred to the server.

7.11 Service Discovery

Service Discovery ensures that microservices can find each other. This is, in a sense, a very simple task: for instance, a configuration file detailing the IP address and port of each microservice could be delivered to all computers. Typical configuration management systems enable the rollout of such files. However, this approach is not sufficient:

• Microservices can come and go. This does not only happen due to server failures but also because of new deployments or the scaling of the environment by the start of new servers. Service Discovery has to be dynamic. A fixed configuration is not sufficient.

• Due to Service Discovery, the calling microservices are no longer so closely coupled to the called microservice. This has positive effects for scaling: a client is not bound to a concrete server instance anymore but can contact different instances, depending on the current load of the different servers.

• When all microservices use a common approach for Service Discovery, a central registry of all microservices arises. This can be helpful for an architecture overview (see section 7.2), or monitoring information can be retrieved from all systems.

In systems that employ messaging, Service Discovery can be dispensable. Messaging systems already decouple sender and recipient. Both know only the shared channel by which they communicate. However, they do not know the identity of their communication partner. The flexibility that Service Discovery offers is then provided by the decoupling via the channels.

Service Discovery = Configuration?

In principle it is conceivable to implement Service Discovery with configuration solutions (see section 7.10). In the end, only the information which service is reachable at which location has to be stored. However, configuration mechanisms are, in effect, the wrong tools for this. For Service Discovery, high availability is more important than for a configuration server: in the worst case, a failure of Service Discovery makes communication between microservices impossible. Consequently, the trade-off between consistency and availability is different than for configuration systems. Configuration systems should therefore be used for Service Discovery only when they offer appropriate availability. This can have consequences for the architecture of the Service Discovery system.

Technologies

There are many different technologies for Service Discovery:

• One example is DNS17 (Domain Name System). This protocol ensures that a host name like www.ewolff.com can be resolved to an IP address. DNS is an essential component of the Internet and has clearly proven its scalability and availability. DNS is hierarchically organized: there is a DNS server that administrates the .com domain. This DNS server knows which DNS server administrates the subdomain ewolff.com, and the DNS server of this subdomain finally knows the IP address of www.ewolff.com. In this way a namespace can be hierarchically organized, and different organizations can administrate different parts of it. If a server named server.ewolff.com is supposed to be created, this can easily be done by a change in the DNS server of the domain ewolff.com. This independence fits well with the concept of microservices, which especially focus on independence in their architecture. To ensure reliability, there are always several servers that administrate a domain. To achieve scalability, DNS supports caching, so that calls do not have to perform the entire resolution of a name via multiple DNS servers but can be served from a cache. This promotes not only performance but also reliability.

17. http://www.zytrax.com/books/dns/

• For Service Discovery it is not sufficient to resolve the name of a server into an IP address; in addition, there has to be a network port for each service. For this, DNS offers SRV records. These contain the information on which computer and port the service is reachable. In addition, a priority and a weight can be set for each server. These values can be used to select one of the servers and thereby prefer powerful servers. In this way, DNS offers reliability and Load Balancing across multiple servers. Apart from scalability, advantages of DNS include the availability of many different implementations and broad support in different programming languages.

• A frequently used implementation for a DNS server is BIND (Berkeley Internet Name Domain Server).18 BIND runs on different operating systems (Linux, BSD, Windows, Mac OS X), is written in the programming language C and is under an open-source license.

18. https://www.isc.org/downloads/bind/

• Eureka19 is part of the Netflix stack. It is written in Java and is available under the Apache license. The example application in this book uses Eureka for Service Discovery (see section 13.8). For every service, Eureka stores under the service name the host and port at which the service is available. Eureka can replicate this information onto multiple Eureka servers to increase availability. Eureka is a REST service, and a Java library for the clients belongs to it. Via the sidecar concept (section 7.9) this library can also be used by systems that are not written in Java: the sidecar takes over the communication with the Eureka server and offers Service Discovery to the microservice. On the clients, the information from the server can be held in a cache so that calls are possible without communicating with the server. The server regularly contacts the registered services to determine which services have failed. Eureka can serve as a basis for Load Balancing since several instances can be registered for one service, and the load can then be distributed across these instances. Eureka was originally designed for the Amazon Cloud. A minimal registration sketch follows after this list.

19. https://github.com/Netflix/eureka

• Consul20 is a key/value store and therefore also fits into the category of configuration servers (section 7.10). Apart from consistency, it can also optimize for availability.21 Clients can register with the server and react to certain events. In addition to a DNS interface it also has an HTTP/JSON interface. It can check whether services are still available by executing health checks. Consul is written in Go and is available under the Mozilla open-source license. Besides, Consul can create configuration files from templates, so a system that expects services in a configuration file can likewise be configured by Consul.

20. http://www.consul.io

21. https://aphyr.com/posts/316-call-me-maybe-etcd-and-consul
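As announced in the list above, the following is a minimal sketch of how a microservice might register with Eureka via Spring Cloud; the service name and the Eureka URL are examples.

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.netflix.eureka.EnableEurekaClient;

@SpringBootApplication
@EnableEurekaClient
public class OrderApplication {
    public static void main(String[] args) {
        SpringApplication.run(OrderApplication.class, args);
    }
}

// application.yml (illustrative values):
//
// spring:
//   application:
//     name: order                 # the name under which the service registers
// eureka:
//   client:
//     serviceUrl:
//       defaultZone: http://eureka:8761/eureka/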

Every microservice-based architecture should use a Service Discovery system. It forms the basis for the administration of a large number of microservices and for additional features like Load Balancing. If there is only a small number of microservices, it is still imaginable to get along without Service Discovery. However, for a large system Service Discovery is indispensable. Since the number of microservices tends to increase over time, Service Discovery should be integrated into the architecture right from the start. Besides, practically every system uses at least the name resolution of hosts, which is already a simple form of Service Discovery.

7.12 Load Balancing

It is one of the advantages of microservices that each individual service can be scaled independently. With messaging, multiple instances that share the load can simply be registered in the messaging solution (see section 8.4); the distribution of the individual messages is then performed by the messaging system. Messages are either delivered to one of the receivers (point-to-point) or to all receivers (publish/subscribe).

REST/HTTP

For REST and HTTP, a load balancer has to be used. To the outside, a load balancer behaves like a single instance, but it distributes incoming requests across multiple instances. Besides, a load balancer can be useful during deployment: instances of the new version of a microservice can initially start without getting much load. Afterwards, the load balancer can be reconfigured so that the new instances are put into operation, and the load can be increased in a stepwise manner. This decreases the risk of a system failure.

Figure 7.13 illustrates the principle of a proxy-based load balancer: the client sends its requests to a load balancer running on another server. This load balancer is responsible for sending each request to one of the known instances. There the request is processed.

Image

Figure 7.13 Proxy-Based Load Balancer

This approach is common for websites and relatively easy to implement. The load balancer can retrieve information from the service instances to determine their current load. In addition, the load balancer can remove a server from the Load Balancing when the node does not react to requests anymore.

On the other hand, this approach has the disadvantage that the entire traffic for one kind of service has to be directed via a load balancer. Therefore, the load balancer can turn into a bottleneck. Besides, a failure of the load balancer results in the failure of a microservice.

Central Load Balancer

A central load balancer for all microservices is not recommended, not only for these reasons but also because of the configuration. The configuration of the load balancer gets very complex when a single load balancer is responsible for many microservices, and the configuration has to be coordinated between all microservices. Especially when a new version of a microservice is deployed, a modification of the load balancer can be sensible in order to put the new microservice under load only after a comprehensive test. The need for coordination between microservices should be avoided especially with regard to deployment, in order to ensure the independent deployment of microservices. In case of such a reconfiguration, one has to make sure that the load balancer supports dynamic reconfiguration and, for instance, does not lose session information if the microservice uses sessions. For this reason, too, implementing stateful microservices is not recommended.

A Load Balancer per Microservice

There should be one load balancer per microservice that distributes the load between the instances of that microservice. This enables the individual microservices to distribute load independently, and different configurations per microservice are possible. Likewise, it is simple to appropriately reconfigure the load balancer upon the deployment of a new version. However, if this load balancer fails, the microservice will not be available anymore.

Technologies

For Load Balancing there are different approaches:

• The Apache httpd web server supports Load Balancing with the extension mod_proxy_balancer.22

22. http://httpd.apache.org/docs/2.2/mod/mod_proxy_balancer.html

• The web server nginx23 can likewise be configured to support Load Balancing; a minimal configuration is sketched after this list. Using a web server as load balancer has the advantage that it can also deliver static websites, CSS, and images. Besides, the number of technologies is reduced.

23. http://nginx.org/en/docs/http/load_balancing.html

• HAProxy24 is a solution for Load Balancing and high availability. It supports not only HTTP but all TCP-based protocols.

24. http://www.haproxy.org/

• Cloud providers frequently also offer Load Balancing. Amazon, for instance, offers Elastic Load Balancing.25 This can be combined with auto scaling so that higher loads automatically trigger the start of new instances, and thereby the application automatically scales with load.

25. http://aws.amazon.com/de/elasticloadbalancing/
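As announced in the list above, a minimal nginx configuration for proxy-based Load Balancing might look roughly like this; the host names and ports are examples.

upstream order-service {
    server 10.0.0.1:8080;
    server 10.0.0.2:8080;
}

server {
    listen 80;
    location / {
        # distribute incoming requests across the instances above
        proxy_pass http://order-service;
    }
}

To the outside, nginx behaves like a single instance of the service while spreading the requests across both servers.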

Service Discovery

Another possibility for Load Balancing is Service Discovery (see Figure 7.14 and section 7.11). When the Service Discovery returns different nodes for a service, the load can be distributed across several nodes. However, this approach redirects to another node only when a new Service Discovery lookup is performed. This makes fine-grained Load Balancing difficult: it can take some time before a new node gets a sufficient share of the load. Moreover, the failure of a node is hard to compensate for because that, too, requires a new Service Discovery lookup. It is helpful that DNS allows specifying how long a set of data is valid (time-to-live); afterwards, the Service Discovery has to be run again. This enables simple Load Balancing with DNS solutions and also with Consul. Unfortunately, however, this time-to-live is often not implemented entirely correctly.

Image

Figure 7.14 Load Balancing with Service Discovery

Load Balancing with Service Discovery is simple because Service Discovery has to be present in a microservice-based system anyhow; the Load Balancing therefore does not introduce additional software components. Besides, avoiding a central load balancer has the positive effect that there is no bottleneck and no central component whose failure would have tremendous consequences.

Client-Based Load Balancing

The client itself can also use a load balancer (see Figure 7.15). The load balancer can be implemented as part of the code of the microservice, or it can be a proxy-based load balancer such as nginx or Apache httpd running on the same computer as the microservice. In that case there is no bottleneck because each client has its own load balancer, and the failure of an individual load balancer has hardly any consequences. However, configuration changes have to be passed on to all load balancers, which can cause quite a lot of network traffic and load.

Image

Figure 7.15 Client-Based Load Balancing

Ribbon26 is an implementation of client-based Load Balancing. It is a library written in Java that can use Eureka to find service instances. Alternatively, a list of servers can be handed over to Ribbon. Ribbon implements different algorithms for Load Balancing. Especially in combination with Eureka, the individual load balancer does not need to be configured anymore. Thanks to the sidecar concept, Ribbon can also be used by microservices that are not implemented in Java. The example system uses Ribbon (see section 13.11).

26. https://github.com/Netflix/ribbon
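The following sketch shows the typical use of Ribbon with Spring Cloud: a RestTemplate annotated with @LoadBalanced resolves a logical service name via Eureka, and Ribbon selects one of the registered instances per call. The service name order and the URL path are examples.

import org.springframework.cloud.client.loadbalancer.LoadBalanced;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.stereotype.Service;
import org.springframework.web.client.RestTemplate;

@Configuration
class LoadBalancingConfiguration {
    @Bean
    @LoadBalanced // Ribbon intercepts calls made with this RestTemplate
    RestTemplate restTemplate() {
        return new RestTemplate();
    }
}

@Service
class OrderClient {
    private final RestTemplate restTemplate;

    OrderClient(RestTemplate restTemplate) {
        this.restTemplate = restTemplate;
    }

    String findOrder(long id) {
        // "order" is the name registered in Eureka, not a host name;
        // Ribbon picks an instance and replaces the name with host:port
        return restTemplate.getForObject("http://order/orders/{id}",
            String.class, id);
    }
}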

Consul offers the possibility to define a template for the configuration files of load balancers. This enables feeding the load balancer configuration with data from Service Discovery. Client-based Load Balancing can be implemented by defining a template for each client into which Consul writes all service instances; this process can be repeated regularly. In this manner a central system configuration is again possible, and client-based Load Balancing is relatively simple to implement.
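A sketch of such a template, assuming the consul-template tool and an nginx load balancer on each client; the service name order is an example. Consul fills in the address and port of every registered instance:

upstream order-service {
  {{range service "order"}}
  server {{.Address}}:{{.Port}};
  {{end}}
}

Whenever the set of instances changes, consul-template regenerates the file and the load balancer can be reloaded.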

Load Balancing and Architecture

It is hardly sensible to use more than one kind of Load Balancing within a single microservice-based system. Therefore, this decision should be made once for the entire system. Load Balancing and Service Discovery have a number of contact points: Service Discovery knows all service instances, while Load Balancing distributes the load between them. Both technologies have to work together, so the technology decisions in this area influence each other.

7.13 Scalability

To be able to cope with high loads, microservices have to scale. Scalability means that a system can process more load when it gets more resources.

There are two different kinds of scalability as represented in Figure 7.16:

Horizontal scalability—This means that more resources are used, each of which processes part of the load; that is, the number of resources increases.

Vertical scalability—This means that more powerful resources are employed to handle a higher load. Here, an individual resource will process more load, while the number of resources stays constant.

Horizontal scalability is often the better choice since the limit for the possible number of resources and therefore the limit for the scalability is very high. Besides, it is cheaper to buy more resources than more powerful ones. One fast computer is often more expensive than many slow ones.

Image

Figure 7.16 Horizontal and Vertical Scaling

Scaling, Microservices, and Load Balancing

Microservices mostly employ horizontal scaling: the load is distributed across several instances of a microservice via Load Balancing. For this, the microservices have to be stateless. More precisely, they should not hold any state that is specific to an individual user, because then the load could only be distributed to the nodes that have the respective state. The state for a user can be stored in a database or put into an external storage (for example, an in-memory store) that can be accessed by all instances of the microservice.

Dynamic Scaling

Scalability means only that the load can be distributed to multiple nodes; how the system actually reacts to load is not defined. In the end it is more important that the system really adapts to an increasing load. For that it is necessary that, depending on the load, a microservice starts new instances onto which the load can be distributed. This enables the microservice to cope with high loads as well. This process has to be automated, as manual processes would be too laborious.

There are different places in the continuous delivery pipeline (Chapter 11, “Operations and Continuous Delivery of Microservices”) where it is necessary to start a microservice in order to test it. For that, a suitable deployment system such as Chef or Puppet can be used. Alternatively, a new virtual machine or a new Docker container with the microservice is simply started. This mechanism can also be used for dynamic scaling; it only has to additionally register the new instances with the Load Balancing. However, an instance should be able to handle the production load right from the start. Therefore, its caches should, for instance, already be filled with data.

Dynamic scaling is especially simple with Service Discovery: the new microservice instance registers with the Service Discovery, and the Service Discovery can configure the load balancer so that it distributes load to the new instance.
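For illustration, a freshly started instance could register itself with Consul via its HTTP API; the service name, address, and port below are assumptions:

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class SelfRegistration {
        public static void main(String[] args) throws Exception {
            // Payload for Consul's agent API; ID, name, and port are assumptions.
            String service = "{\"ID\": \"order-1\", \"Name\": \"order\", "
                    + "\"Address\": \"10.0.0.5\", \"Port\": 8080}";
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create("http://localhost:8500/v1/agent/service/register"))
                    .header("Content-Type", "application/json")
                    .PUT(HttpRequest.BodyPublishers.ofString(service))
                    .build();
            HttpResponse<Void> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.discarding());
            System.out.println("Registered, HTTP status " + response.statusCode());
        }
    }

A Consul-aware load balancer configuration, like the template shown in section 7.12, then picks the new instance up automatically.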

Dynamic scaling has to be performed based on a metric. When the response time of a microservice is too long or the number of requests is very high, new instances have to be started. Dynamic scaling can be part of the monitoring (see section 11.3), since the monitoring should make it possible to react to unusual metric values. Most monitoring infrastructures offer the possibility to react to metric values by calling a script, and that script can start additional instances of the microservice. This is fairly easy to do with most cloud and virtualization environments. Environments like the Amazon Cloud offer suitable solutions for automatic scaling that work in a similar manner. A home-grown solution, however, is not very complicated: the scripts run only every few minutes anyhow, so failures are tolerable, at least for a limited time. Since the scripts are part of the monitoring, their availability will be similar to that of the monitoring and should therefore be sufficient.
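Such a home-grown scaling hook can be quite small. The following sketch assumes a monitoring system that invokes a Java program with the current average response time; startNewInstance() stands for whatever the environment offers to launch a VM or container and is purely hypothetical:

    public class ScaleUpHook {
        // Threshold above which additional capacity is added; an assumption.
        private static final double MAX_RESPONSE_TIME_MS = 500.0;

        public static void main(String[] args) {
            // The monitoring system is assumed to pass the current average
            // response time of the microservice as the first argument.
            double avgResponseTimeMs = Double.parseDouble(args[0]);
            if (avgResponseTimeMs > MAX_RESPONSE_TIME_MS) {
                startNewInstance();
            }
        }

        private static void startNewInstance() {
            // Hypothetical: start a new Docker container or cloud VM here and
            // register it with Service Discovery so that Load Balancing
            // includes it (see above).
            System.out.println("Starting an additional instance ...");
        }
    }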

Especially with cloud infrastructures it is important to shut instances down again when the load is low, because every running instance in a cloud costs money. Here, too, scripts can provide automated responses when metric values reach predefined levels.

Microservices: Advantages for Scaling

With regard to scaling, microservices have, first of all, the advantage that they can be scaled independently of each other. With a deployment monolith, starting each instance means starting the entire monolith. This fine-grained scaling may not appear to be an especially striking advantage at first glance. However, running many instances of an entire e-commerce shop just to speed up the search causes high expenditure: a lot of hardware is needed, a complex infrastructure has to be built, and system parts are kept available that are not all used. These parts also make deployment and monitoring more complex. In addition, the possibilities for dynamic scaling depend critically on the size of the services and on the speed with which new instances can be started. In this area microservices possess clear advantages.

In most cases microservices already have automated deployment and monitoring; without them, a microservice-based system can hardly be operated anyhow. If Load Balancing is in place as well, only a script is missing for automated scaling. Therefore, microservices represent an excellent basis for dynamic scaling.

Sharding

Sharding means that the managed data is divided and each instance is responsible for part of the data. For example, one instance can be responsible for the customers A–E or for all customers whose customer number ends in 9. Sharding is a variation of horizontal scaling: more servers are used, but the servers are not all equal; each is responsible for a different subset of the data. With microservices this type of scaling is easy to implement, since the domain is distributed across multiple microservices anyhow. Each microservice can then shard its data and scale horizontally via this sharding. A deployment monolith can hardly be scaled this way because it handles all the data: when it administrates customers and items, it can hardly be sharded for both types of data at once. Of course, to really implement sharding, the load balancer has to distribute the load appropriately to the shards (see the sketch below).
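A sketch of how a client or load balancer might route requests to shards, here using the last digit of the customer number as in the example above; the shard addresses are assumptions:

    public class ShardRouter {
        // One service address per shard: customers whose number ends in n
        // are handled by shard n. The addresses are assumptions.
        private final String[] shards = new String[10];

        public ShardRouter() {
            for (int i = 0; i < shards.length; i++) {
                shards[i] = "http://customer-shard-" + i + ".internal";
            }
        }

        // A customer number ending in 9 is routed to shard 9, and so on.
        public String shardFor(long customerNumber) {
            return shards[(int) (customerNumber % 10)];
        }
    }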

Scalability, Throughput, and Response Times

Scalability means that more load can be processed with more resources: the throughput increases—that is, the number of processed requests per unit of time increases. The response time, however, stays constant in the best case—depending on circumstances it might rise, but ideally not to such an extent that the system causes errors or becomes too slow for the user.

When faster response times are required, horizontal scaling does not help. However, there are some approaches to optimize the response time of microservices:

• The microservices can be deployed on faster computers. This is vertical scaling. The microservices can then process individual requests more rapidly. Thanks to automated deployment, vertical scaling is relatively simple to implement: the service only has to be deployed on faster hardware.

• Calls via the network have a high latency. A possible optimization is therefore to avoid such calls: caches can be used instead, or the data can be replicated. Caches can often be integrated very easily into the existing communication; for REST, a simple HTTP cache is sufficient (see the sketch after this list).

• If the domain architecture of the microservices is well designed, a request should be processed by a single microservice so that no communication via the network is necessary. With a good domain architecture, the logic for processing a request is implemented in one microservice, so changes to the logic also require changes to only one microservice. In that case microservices do not have longer response times than deployment monoliths. Regarding response times, microservices otherwise have the disadvantage that their communication via the network tends to lengthen response times; but there are, as shown, means to counteract this effect.
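To illustrate the caching point from the list above: with REST, a service only has to send suitable cache headers, and any standard HTTP cache between caller and service can then answer repeated requests without a network round trip. A minimal sketch with the HTTP server built into the JDK; path, payload, and cache lifetime are assumptions:

    import com.sun.net.httpserver.HttpServer;
    import java.net.InetSocketAddress;
    import java.nio.charset.StandardCharsets;

    public class CacheableResource {
        public static void main(String[] args) throws Exception {
            HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
            server.createContext("/items/42", exchange -> {
                byte[] body = "{\"id\":42,\"name\":\"example item\"}"
                        .getBytes(StandardCharsets.UTF_8);
                // Allow any HTTP cache to reuse this response for 60 seconds.
                exchange.getResponseHeaders().add("Cache-Control", "max-age=60");
                exchange.getResponseHeaders().add("Content-Type", "application/json");
                exchange.sendResponseHeaders(200, body.length);
                exchange.getResponseBody().write(body);
                exchange.close();
            });
            server.start();
        }
    }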

7.14 Security

In a microservice-based architecture, each microservice has to know which user triggered the current call and wants to use the system. Therefore, a uniform security architecture has to exist: after all, several microservices can work together on a request, and a different microservice might be responsible for each part of its processing. Thus, the security architecture has to be defined at the level of the entire system. Only then can a user's access be treated uniformly across the entire system.

Security comprises two essential aspects: authentication and authorization. Authentication is the process that validates the identity of the user. Authorization is the decision whether a certain user is allowed to execute a certain action. The two processes are independent of each other: validating the user's identity during authentication is not directly related to authorization.

Security and Microservices

In a microservice-based architecture, the individual microservices should not perform authentication themselves; it makes little sense for each microservice to validate user name and password. For authentication, a central server has to be used. For authorization, an interplay is necessary: there are often user groups or roles that have to be administered centrally. However, whether a certain user group or role is allowed to use certain features of a microservice should be decided by the microservice concerned. That way, changes to the authorization rules of a microservice can be limited to the implementation of that microservice.

OAuth2

One possible solution for this challenge is OAuth2. The protocol is also widely used on the Internet: Google, Microsoft, Twitter, XING, and Yahoo all offer support for it.

Figure 7.17 shows the workflow of the OAuth2 protocol as defined by the standard:27

27. http://tools.ietf.org/html/rfc6749

1. The client asks the resource owner whether it may execute a certain action. For example, the application can request access to the profile or to certain data that the resource owner has stored in a social network. The resource owner is usually the user of the system.

2. If the resource owner grants the client access, the client receives a respective response from the resource owner.

3. The client uses the response of the resource owner to send a request to the authorization server. In the example, the authorization server would be located in the social network.

Figure 7.17 The OAuth2 Protocol

4. The authorization server returns an access token.

5. With this access token the client can now call a resource server and obtain the necessary information there. For the call, the token can be put into an HTTP header, for instance (see the sketch after these steps).

6. The resource server answers the request.
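Step 5 usually amounts to nothing more than adding the token to a header. A sketch with the JDK HTTP client, assuming a bearer token (RFC 6750) and an arbitrary resource URL:

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class ResourceCall {
        public static void main(String[] args) throws Exception {
            String accessToken = args[0]; // obtained in step 4
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create("https://resource-server.example/customer/42"))
                    // RFC 6750: pass the OAuth2 access token as a bearer token.
                    .header("Authorization", "Bearer " + accessToken)
                    .build();
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.body());
        }
    }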

Possible Authorization Grants

The interaction with the authorization server can work in different ways:

• In the password grant, the client shows an HTML form to the user in step 1, and the resource owner enters user name and password there. In step 3 the client uses this information to obtain the access token from the authorization server via an HTTP POST. The disadvantage of this approach is that the client processes user name and password: if the client is implemented insecurely, these data are endangered.

• In the authorization code grant, the client directs the user in step 1 to a web page displayed by the authorization server. There the user can choose whether to permit the access. If access is granted, in step 2 the client obtains an authorization code via an HTTP URL. Since the authorization server chooses this URL, it can be sure that the correct client obtains the code. In step 3 the client then exchanges the authorization code for the access token via an HTTP POST. The approach is mainly implemented by the authorization server and is thus very easy for a client to use. In this scenario the client would be a web application on a server: it obtains the code from the authorization server and is the only party able to turn it into an access token via the HTTP POST.

• In the implicit grant, the procedure resembles the authorization code grant. After the redirect to the authorization server in step 1, however, the client directly gets an access token via an HTTP redirect; steps 3 and 4 are omitted. This enables the browser or a mobile application to read out the access token immediately. The access token is not as well protected against attacks, though, since the authorization server does not send it directly to the client. This approach is sensible when JavaScript code in the browser or a mobile application is supposed to use the access token.

• In the client credentials grant, the client uses a credential it knows in step 1 to obtain the access token from the authorization server. The client can therefore access the data without additional information from the resource owner. For example, statistics software could read out and analyze customer data in this manner, as the sketch below illustrates.
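For the client credentials grant, step 1 boils down to a single HTTP POST to the token endpoint of the authorization server (RFC 6749, section 4.4). A sketch with the JDK HTTP client; the endpoint URL and the client credentials are assumptions:

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;
    import java.util.Base64;

    public class ClientCredentialsGrant {
        public static void main(String[] args) throws Exception {
            // The client authenticates itself with its own credentials ...
            String basicAuth = Base64.getEncoder()
                    .encodeToString("statistics-client:secret".getBytes());
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create("https://auth.example/oauth/token"))
                    .header("Authorization", "Basic " + basicAuth)
                    .header("Content-Type", "application/x-www-form-urlencoded")
                    // ... and asks for a token without involving a resource owner.
                    .POST(HttpRequest.BodyPublishers.ofString(
                            "grant_type=client_credentials"))
                    .build();
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.body()); // JSON containing access_token
        }
    }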

Via the access token the client can access resources. The access token has to be protected: anyone who obtains the access token can trigger all the actions that the resource owner can trigger. Some information can be encoded within the token itself. For instance, in addition to the real name of the resource owner, the token can contain information that assigns certain rights to the user or states the membership in certain user groups.

JSON Web Token (JWT)

JSON Web Token (JWT) is a standard for the information contained in an access token, with JSON as the data structure. The access token can be validated via a digital signature with JWS (JSON Web Signature); likewise, it can be encrypted with JSON Web Encryption (JWE). The access token can contain information about the issuer of the token, the resource owner, the validity interval, or the addressee of the token. Custom data can also be included. The access token is optimized for use in an HTTP header by encoding the JSON with Base64, since headers are normally subject to size restrictions.
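A sketch of how an authorization server might issue such a token, assuming the open-source jjwt library (version 0.9); issuer, lifetime, and the groups claim are assumptions:

    import io.jsonwebtoken.Jwts;
    import io.jsonwebtoken.SignatureAlgorithm;
    import java.util.Date;

    public class TokenIssuer {
        public static String issueToken(String user, byte[] key) {
            return Jwts.builder()
                    .setIssuer("https://auth.example")        // who issued the token
                    .setSubject(user)                          // the resource owner
                    .setExpiration(new Date(System.currentTimeMillis() + 3_600_000))
                    .claim("groups", new String[]{"customer"}) // custom data
                    .signWith(SignatureAlgorithm.HS256, key)   // JWS signature
                    .compact();                                // Base64-encoded result
        }
    }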

OAuth2, JWT, and Microservices

In a microservice-based architecture, the user initially authenticates via one of the OAuth2 approaches. Afterwards the user can use the web page of a microservice or call a microservice via REST. With each further call, every microservice can hand the access token over to other microservices. Based on the access token, the microservices can decide whether a certain access is granted. To do so, the validity of the token is checked first: in the case of JWT, the token only has to be decoded and the signature of the authorization server verified. Subsequently, whether the user may use the microservice as intended can be decided based on the information in the token; for instance, the membership in certain user groups can be stored directly in the token.
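On the receiving side, a microservice might validate the token and decide about access roughly as follows. Again a sketch with the jjwt library; the claim name groups and the group customer are assumptions:

    import io.jsonwebtoken.Claims;
    import io.jsonwebtoken.Jwts;
    import io.jsonwebtoken.JwtException;
    import java.util.List;

    public class TokenChecker {
        public static boolean mayPlaceOrders(String accessToken, byte[] key) {
            try {
                // Verifies the signature of the authorization server and the
                // expiration date; throws JwtException if either check fails.
                Claims claims = Jwts.parser()
                        .setSigningKey(key)
                        .parseClaimsJws(accessToken)
                        .getBody();
                // The microservice itself decides what this group may do here.
                List<?> groups = (List<?>) claims.get("groups");
                return groups != null && groups.contains("customer");
            } catch (JwtException e) {
                return false;
            }
        }
    }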

It is important that the access token does not define which access to which microservice is allowed. The access token is issued by the authorization server: if the information about access rights were kept in the authorization server, every modification of the access rights would have to occur there—and not in the microservices. This would limit the changeability of the microservices, since modifications to the access rights would require changes to the authorization server as a central component. The authorization server should only administer the assignment to user groups; the microservices should then allow or prohibit access based on this information from the token.

Technologies

In principle, technical approaches other than OAuth2 can also be used, as long as they employ a central server for authorization and use a token to regulate access to the individual microservices. One example is Kerberos,28 which has a relatively long history but is not tuned to REST as well as OAuth2. Other alternatives are SAML and SAML 2.0.29 They define a protocol that uses XML and HTTP to perform authorization and authentication.

28. http://tools.ietf.org/html/rfc4556

29. https://www.oasis-open.org/committees/security/

Finally, signed cookies can be created by a home-grown security service. Via a cryptographic signature it can be verified whether a cookie has really been issued by the system. The cookie can then contain the rights or groups of the user; microservices examine the cookie and restrict access if necessary. There is the risk that the cookie is stolen; however, for that to occur the browser has to be compromised, or the cookie has to be transferred via an unencrypted connection. This risk is often acceptable.
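A home-grown signed cookie can be as simple as appending an HMAC to the cookie value. A sketch using only the JDK; the cookie layout and the shared secret are assumptions:

    import javax.crypto.Mac;
    import javax.crypto.spec.SecretKeySpec;
    import java.nio.charset.StandardCharsets;
    import java.security.MessageDigest;
    import java.util.Base64;

    public class SignedCookie {
        // Sign the cookie payload, e.g., "user=42;groups=customer".
        public static String sign(String payload, byte[] secret) throws Exception {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            String signature = Base64.getUrlEncoder().encodeToString(
                    mac.doFinal(payload.getBytes(StandardCharsets.UTF_8)));
            return payload + "|" + signature;
        }

        // A microservice recomputes the HMAC to check that the cookie was
        // really issued by the system and has not been tampered with.
        public static boolean verify(String cookie, byte[] secret) throws Exception {
            int separator = cookie.lastIndexOf('|');
            if (separator < 0) return false;
            String payload = cookie.substring(0, separator);
            // MessageDigest.isEqual compares in constant time, so the check
            // does not leak information to timing attacks.
            return MessageDigest.isEqual(
                    sign(payload, secret).getBytes(StandardCharsets.UTF_8),
                    cookie.getBytes(StandardCharsets.UTF_8));
        }
    }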

With such a token approach, microservices do not have to handle the authentication of the caller themselves but can still restrict access to certain user groups or roles.

There are good reasons for the use of OAuth2:

• There are numerous libraries for practically all established programming languages that implement OAuth2 or an OAuth2 server.30 The decision for OAuth2 therefore hardly restricts the technology choice for the microservices.

30. http://oauth.net/2/

• Between the microservices, only the access token has to be transferred. When REST is used, this can be done in a standardized manner via an HTTP header; for other communication protocols, similar mechanisms can be used. In this area, too, OAuth2 hardly limits the technology choice.

• Via JWT, the authorization server can place information into the token that the microservices use to allow or prohibit access. Therefore, in this area as well, the interplay between the individual microservice and the shared infrastructure is simple to implement—with standards that are widely supported.

Spring Cloud Security31 offers a good basis for implementing OAuth2 systems, especially for Java-based microservices.

31. http://cloud.spring.io/spring-cloud-security/

Additional Security Measures

OAuth2 solves, first of all, the problem of authentication and authorization—primarily for human users. There are additional measures for securing a microservice-based system:

• The communication between the microservices can be protected against wiretapping by SSL/TLS; all communication is then encrypted. Communication infrastructures like REST or messaging systems mostly support these protocols.

• Apart from authentication with OAuth2, certificates can be used to authenticate clients. A certificate authority creates the certificates, which can be used to verify digital signatures; this makes it possible to authenticate a client based on its digital signature. Since SSL/TLS supports client certificates, authentication via certificates is possible at least at this level.

• API keys are a similar concept. They are given to external clients to enable them to use the system: via the API key the external clients authenticate themselves and can obtain the appropriate rights. With OAuth2 this can be implemented with the client credentials grant.

• Firewalls can be used to protect the communication between microservices. Normally, firewalls secure a system against unauthorized access from outside; a firewall for the communication between the microservices ensures that not all microservices are endangered if an individual microservice has been taken over. In this way an intrusion can be restricted to one microservice.

• Finally, there should be intrusion detection to detect unauthorized access to the system. This topic is closely related to monitoring: the monitoring system can also be used to trigger an appropriate alarm in case of an intrusion.

• Datensparsamkeit32 is also an interesting concept. It is derived from the field of data protection and states that only data that is absolutely necessary should be saved. From a security perspective this has the advantage that large collections of data are avoided: the system becomes less attractive for attacks, and the consequences of a security breach are not as severe.

32. http://martinfowler.com/bliki/Datensparsamkeit.html

Hashicorp Vault

Hashicorp Vault33 is a tool that solves many problems in the area of microservice security. It offers the following features:

33. https://www.vaultproject.io/

• Secrets like passwords, API keys, encryption keys, or certificates can be stored. This can be useful for enabling users to administer their secrets. In addition, microservices can be equipped with certificates in such a manner that their communication with each other or with external servers is protected.

• Secrets are handed to services via a lease and can additionally be equipped with access control. This helps to limit the damage in case of a compromised service. Secrets can also be declared invalid, for instance.

• Data can be encrypted or decrypted directly with the keys, without the microservices themselves having to store these keys.

• Access is made traceable by an audit log. This enables tracing of who obtained which secret at what time.

• In the background, Vault can use HSMs, SQL databases, or Amazon IAM to store secrets. It can also, for instance, generate new access keys for the Amazon Cloud by itself.

In this manner Vault takes care of handling keys and relieves the microservices of this task. Handling keys really securely is a big challenge, and implementing something like that correctly is difficult.
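Reading a secret then becomes a simple authenticated HTTP call. A sketch against Vault's HTTP API; the Vault address, the secret path, and the token passed to the service are assumptions:

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class VaultRead {
        public static void main(String[] args) throws Exception {
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create("http://localhost:8200/v1/secret/order/db-password"))
                    // The service authenticates with the token from its lease.
                    .header("X-Vault-Token", args[0])
                    .build();
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            // The response is JSON; the secret itself is in the "data" field.
            System.out.println(response.body());
        }
    }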

Additional Security Goals

With regard to software architecture, security comes in very different shapes. Approaches like OAuth2 only help to achieve confidentiality: they prevent data access by unauthorized users. Even this confidentiality, however, is not entirely safeguarded by OAuth2 on its own: the communication in the network likewise has to be protected against wiretapping—for instance via HTTPS or other kinds of encryption.

Additional security aspects include the following:

Integrity—There are no unnoticed changes to the data. Every microservice has to solve this problem itself; for instance, data can be signed to ensure that it has not been manipulated. The concrete implementation has to be provided by each microservice.

Non-repudiation—Modifications made by someone cannot be denied afterwards. This can be achieved by signing the changes introduced by different users with keys that are specific to the individual user; it is then clear that exactly one specific user has modified the data. The overall security architecture has to provide the keys; the signing itself is the task of each individual service.

Data security—No data is lost. This issue can be handled by backup solutions and highly available storage. The microservices have to address this problem, since data storage is part of their responsibility; however, the shared infrastructure can offer databases that are equipped with appropriate backup and disaster recovery mechanisms.

Availability—The system is available. Here, too, each microservice has to contribute. However, since microservice-based architectures have to deal with the failure of individual microservices anyhow, such systems are often well prepared in this area; resilience (section 9.5), for instance, is useful for this.

These aspects are often not considered when devising security measures; however, the failure of a service often has even more dramatic consequences than unauthorized access to data. One danger is denial-of-service attacks, which overload servers to such an extent that they cannot perform any sensible work anymore. The technical hurdles for such attacks are often shockingly low, while defending against them is frequently very difficult.

7.15 Documentation and Metadata

To keep an overview of a microservice-based architecture, certain information about each microservice has to be available. The architecture therefore has to define how microservices provide such information: only when all microservices provide it in a uniform way can the information be collected easily. Information of interest includes, for instance:

• Fundamental information like the name of the service and the responsible contact person.

• Information about the source code: where the code can be found in version control and which libraries are used. The libraries used can be interesting for comparing their open-source licenses with company policies, or for identifying the affected microservices when a library has a security vulnerability. For such purposes this information has to be available outside the team, even though the decision about the use of a certain library essentially concerns only one microservice and can be made largely independently by the responsible team.

• Which other microservices the microservice works with. This information is central for architecture management (see section 7.2).

• Information about configuration parameters or feature toggles might also be interesting. Feature toggles can switch features on or off. This is useful for activating new features in production only when their implementation is really finished, or for avoiding the failure of a service by deactivating certain features.

It is not sensible to document all components of the microservices or to unify the entire documentation. Unification only makes sense for information that is relevant outside the team implementing the microservice: whenever the interplay of microservices has to be managed or licenses have to be checked, the relevant information must be available beyond the responsible team, and these questions have to be solved across microservices. Each team can create additional documentation about its own microservices; however, such documentation is only relevant for that one team and therefore does not have to be standardized.

Outdated Documentation

A common problem with the documentation of any software is that it easily becomes outdated and then documents a state that is no longer current. Therefore, the documentation should be versioned together with the code. Moreover, the documentation should be generated from information that is present in the system anyhow. For instance, the list of all used libraries can be taken from the build system, since exactly this information is needed to compile the system. Which other microservices are used can be obtained from Service Discovery; this information can, for instance, be used to create firewall rules when a firewall is supposed to protect the communication between the microservices. In summary, documentation does not have to be maintained separately but should be generated from information already present in the system.

Access to Documentation

The documentation can be part of the artifacts created during the build. In addition, there can be a run-time interface for reading out metadata. Such an interface can resemble the otherwise common interfaces for monitoring and, for instance, provide JSON documents via HTTP. In this way the metadata is just additional information that the microservices provide at run-time (see the sketch below).
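Such a run-time interface can be very small. A sketch with the HTTP server built into the JDK; the metadata fields follow the list above, and all concrete values are assumptions:

    import com.sun.net.httpserver.HttpServer;
    import java.net.InetSocketAddress;
    import java.nio.charset.StandardCharsets;

    public class MetadataEndpoint {
        public static void main(String[] args) throws Exception {
            // Name, contact, repository, and used services as JSON metadata.
            String metadata = "{\"name\": \"order\", "
                    + "\"contact\": \"team-order@example.com\", "
                    + "\"repository\": \"https://git.example.com/order\", "
                    + "\"usedServices\": [\"customer\", \"delivery\"]}";
            HttpServer server = HttpServer.create(new InetSocketAddress(8081), 0);
            server.createContext("/metadata", exchange -> {
                byte[] body = metadata.getBytes(StandardCharsets.UTF_8);
                exchange.getResponseHeaders().add("Content-Type", "application/json");
                exchange.sendResponseHeaders(200, body.length);
                exchange.getResponseBody().write(body);
                exchange.close();
            });
            server.start();
        }
    }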

A service template can show how the documentation is created and can then form the basis for the implementation of new microservices. When the service template already covers this aspect, it facilitates the implementation of standard-conform documentation. In addition, at least the formal characteristics of the documentation can be checked by a test.

7.16 Conclusion

The domain architecture of a microservice-based system is essential because it influences not only the structure of the system but also the organization (section 7.1). Unfortunately, tools for dependency management are rare, especially for microservices, so teams have to develop their own solutions. Often, however, an understanding of how the individual business processes are implemented will be sufficient, and an overview of the entire architecture is not really necessary (section 7.2).

For an architecture to be successful, it has to be permanently adjusted to changing requirements. For deployment monoliths there are numerous refactoring techniques to achieve this. Such possibilities also exist for microservices, though without tool support and with much higher hurdles (section 7.3). Still, microservice-based systems can sensibly be developed further—for instance, by starting initially with a few large microservices and creating more and more microservices over time (section 7.4). An early distribution into many microservices entails the risk of ending up with a wrong distribution.

A special case is the migration of a legacy application to a microservice-based architecture (section 7.6). Here the code base of the legacy application can be divided into microservices; however, this can lead to a bad architecture due to the often bad structure of the legacy application. Alternatively, the legacy application can be supplemented by microservices that replace its functionality step by step.

Event-driven architecture (section 7.8) can serve to decouple the logic in the microservices. This makes the system easy to extend.

Defining the technological basis is one of the tasks of an architecture (section 7.9). For microservice-based systems this does not mean defining a shared technology stack for the implementation but defining shared communication protocols, interfaces, monitoring, and logging. Additional technical functions of the entire system are coordination and configuration (section 7.10). In this area, tools can be selected that all microservices have to employ. Alternatively, one can do without a central configuration and let each microservice bring along its own configuration instead.

Likewise, a technology can be chosen for Service Discovery (section 7.11). A solution for Service Discovery is in any case sensible for a microservice-based system—unless messaging is used for communication. Based on Service Discovery, Load Balancing can be introduced (section 7.12) to distribute the load across the instances of the microservices: Service Discovery knows all instances; Load Balancing distributes the load to them. Load Balancing can be implemented via a central load balancer, via Service Discovery, or via one load balancer per client. This provides the basis for scalability (section 7.13): a microservice can process more load by running additional instances.

Microservices have a significantly higher technical complexity than deployment monoliths. Operating systems, networks, load balancers, Service Discovery, and communication protocols all become part of the architecture. Developers and architects of deployment monoliths are largely spared from these aspects; architects of microservice-based systems thus have to deal with entirely different technologies and carry out architecture at an entirely different level.

In the area of security, a central component has to take over at least authentication and parts of the authorization; the microservices should then settle the details of access (section 7.14). To obtain certain information from a system composed of many microservices, the microservices have to provide standardized documentation (section 7.15). This documentation can, for instance, provide information about the used libraries—to compare them with open-source license regulations or to identify the affected microservices when a library has a security vulnerability.

The architecture of a microservice-based system differs from that of classical applications. Many decisions are made only within the individual microservices, while topics like monitoring, logging, or continuous delivery are standardized for the entire system.

Essential Points

• Refactoring between microservices is laborious; therefore, it is hard to change the architecture at this level. Accordingly, the continued development of the architecture is a central concern.

• An essential part of the architecture is the definition of overarching technologies for configuration and coordination, Service Discovery, Load Balancing, security, documentation, and metadata.
