Chapter 3. Distributed Domain-Driven Design

As you explore the tools and techniques that Akka offers, you will find a need for something to tie all of the concepts together. You need a guiding set of principles that will help you design not just your actors, but also the larger components that make up an entire system. As you learn about new concepts like clustering, it will become apparent that you need a way to divide your application into smaller subsystems. You need to understand the boundaries for those subsystems and how they interact. The guiding principles we will use for this come from domain-driven design (DDD). When put together in a distributed Akka system, we will use the term distributed domain-driven design (DDDD).

DDD is a widely adopted concept initially created by Eric Evans. We will be covering some of the basics, but if you are interested in pursuing DDD further, you should look for Evans’ original book [1] or some of the follow-up books by people like Vaughn Vernon [2].

DDD Overview

DDD is a set of guiding principles for software architecture. The principles of DDD are not revolutionary. In fact, a common reaction from people hearing them for the first time is “Well, of course. That’s obvious.” It is not the principles themselves that are so powerful; rather, it’s how they are applied and how they are combined. Applying them consistently throughout a project can transform that project from something that is cumbersome and awkward into one that is elegant and considerably more useful.

The most important concept in DDD is the focus on the Domain Model. If you are not familiar with the term “domain,” it is the set of requirements, constraints, and concepts that make up the business or field that you are trying to model. In DDD, we focus on that business domain and we model our software around it. We are trying to model our software in a way that reflects the real world and how it operates. We want our model to capture the truth of the domain that we are trying to model. This involves conversations with domain experts. These experts are people who are knowledgeable about the domain but might not be technically savvy. They can include lawyers, marketing staff, support staff, business managers, or anyone else with a knowledge of the domain. This means that it is important to use language that makes sense to those experts, not just in our conversations, but also in our code. This language that we develop is shared between the developers and the domain experts and is called the ubiquitous language.

When we establish a ubiquitous language, it becomes easier to communicate about the software, not just among developers but with the domain experts, as well. It allows us to have a conversation about the software, referring to actions and objects within the application in a way that is still intelligible to nondevelopers. This in turn helps the domain experts feel involved in the software in a way that would otherwise be impossible. When you begin explaining the model using their language, they are able to point out flaws in the way that you are using that language. Those flaws often reflect deficiencies in the developers’ understanding of the domain that have crept into the software itself. Being able to speak in this common language is an invaluable tool in the development process.

This common language can change meaning from one area of the domain to another. The individual concepts, what actions are available, and how they interact within the domain might not always be the same. Each concept is bound within a particular context. Within that context, the language has a certain meaning. When we leave that context and move to another area of the domain, the meaning can change.

A key part of this is recognizing that the domain is not fixed. It is a fluid entity that can change over time. Sometimes, the rules of the business change. Sometimes, our understanding of those rules evolves. We need to be prepared for those changes and we need to be prepared to adapt the Domain Model to accommodate the changes. If we build a model expecting it to handle all cases and to never change, we are doomed to failure.

DDD focuses on building models that are equipped to evolve. We don’t build a model the first time and expect it to last. We build it with the understanding that it will fail at some point and we will need to adapt. Accordingly, we need to build it in a way that allows it to adapt. The cost of change must remain small. DDD gives us a set of tools to help keep that cost of change low.

The Benefits of DDD

One of the worst ways that you can cripple your software and prevent it from evolving is by creating too much coupling between areas of the system that are conceptually different. In particular, when you create coupling between portions of an application that are considered part of the domain and portions that are considered part of the infrastructure, you reduce your ability to adapt. DDD helps to distinguish what is domain and what is infrastructure, but it also helps to create the right abstractions around them so that you don’t violate those boundaries.

Let’s consider a very simple example. Suppose that your company has been using a particular email library for years. This library has been working well, but now a new web-based service has come onto the scene that has the business owners excited. This new service provides all sorts of tracking tools that were previously unavailable. It will give you great metrics and a great view into your customers. The business owners ask that you convert the system to use this new email service.

You begin digging into the code and realize how difficult a problem this is. The old email library has been forcibly integrated into the code. There are hooks everywhere. If you want to convert to using this new system, you need to begin modifying the application in many different areas. The scheduling engine uses emails for notifications; it needs to change. The user management system uses emails for sending out confirmation emails and invitations, so it needs to change, too. There are even emails being triggered by a stored procedure within your database. Where does it end?

DDD introduces a number of concepts that help solve problems like this. It helps you to recognize that the concerns about how emails are sent are not part of the domain of user management or of the scheduling engine. They are infrastructure concerns. The scheduling engine needs to send notifications, not emails. The fact that those notifications are delivered via email is not relevant. It is certainly not relevant that those emails be sent with a specific version of a particular library.

Recognizing the differences in the language being used (notification versus email) is only the first step. You also need to be able to recognize that how you manage users and make decisions about those users might be entirely different. It might be difficult or impossible to create a single model that captures all the needs of both.

The goal with DDD is to decompose the larger domain into smaller, easier-to-manage chunks. You can then model those chunks individually, and in doing so, you can develop a better understanding not just for yourself, but for the domain experts, as well. You can go back to the domain experts and talk about sending notifications within the scheduling engine. Sure, you are talking about emails, but by using the correct term of “notification,” you can prepare for the possibility that at some point this might become a text message, or a social media message, or whatever the next big thing happens to be. You can also talk to those domain experts specifically about sending emails and what is involved there, and you don’t need to blend your concerns. Your infrastructure does not need to leak into the domain.
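One way to picture this separation is to let the scheduling code depend only on a notification abstraction. The following is a minimal sketch under stated assumptions, not code from any real system: the names Notification, NotificationSender, EmailNotificationSender, and SchedulingEngine are hypothetical, and the email implementation only records what it would send instead of calling a real library.

```scala
// Hypothetical domain-level abstraction: the scheduling engine speaks of
// notifications and never mentions email.
final case class Notification(recipient: String, text: String)

trait NotificationSender {
  def send(notification: Notification): Unit
}

// Hypothetical infrastructure implementation. A real one would call an email
// library or web service; this one only records the formatted messages.
final class EmailNotificationSender extends NotificationSender {
  private var outbox = Vector.empty[String]

  override def send(notification: Notification): Unit = {
    outbox = outbox :+ s"To: ${notification.recipient} | ${notification.text}"
  }

  def sent: Vector[String] = outbox
}

// Domain code depends only on the trait, so replacing the delivery mechanism
// does not require touching it.
final class SchedulingEngine(sender: NotificationSender) {
  def notifyAssignment(person: String, job: String): Unit =
    sender.send(Notification(person, s"You have been scheduled on $job"))
}
```

Moving to the new web-based service, or to text messages, would then mean writing another NotificationSender implementation and passing it to the engine.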

Components of DDD

So, what are the tools that help you to decompose the application using DDD? More importantly, how are those tools related to Akka?

DDD provides a set of tools that you can apply directly in the code. These are your building blocks. But in addition to the smaller building blocks there are higher level concepts that help us understand how to take the pieces that we build with those small blocks and combine them to create even larger software systems.

Often these building blocks are perfect candidates for determining the right structure for your actors. When trying to decide whether a particular concept deserves a new actor or should be rolled into an existing one, you can use the building blocks of DDD to help make that determination. Many of these building blocks have natural parallels within Akka, so it can be very easy to simply map the domain concepts directly into the Actor Model.

Let’s begin by looking at some of the smaller pieces.

Domain Entities

DDD uses the concept of an Entity to refer to objects in the system that are uniquely identifiable by a key or composite key. Entities can be mutable. That is, if an Entity changes its state in some way but its key remains unaffected, it is considered to be the same Entity—its identity has not changed.
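Outside of the Actor Model, the same idea can be sketched with a plain class whose equality is based on its key alone. This is only an illustration; UserEntity and its fields are hypothetical names.

```scala
import java.util.UUID

// Hypothetical Entity: identity comes from `id` alone, so equality ignores
// the mutable `name` field.
final class UserEntity(val id: UUID, var name: String) {
  override def equals(other: Any): Boolean = other match {
    case that: UserEntity => this.id == that.id
    case _                => false
  }

  override def hashCode(): Int = id.hashCode()
}
```

Two UserEntity instances with the same id compare as equal even after the name of one of them changes, which is the defining property of an Entity.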

The nature of Entities, the fact that they can contain mutable state and are uniquely identifiable, maps directly to Akka actors. Actors are all about managing mutable state. And every actor in the system is uniquely identifiable using its path, regardless of the data that actor contains. It is therefore natural for us to use actors in our system the same way we might use Entities in a nonactor-based system. You can treat them as equivalent.

For example, if your system has a user Entity, you could model that user as an actor:

import java.util.UUID
import akka.actor.Actor

class User(id: UUID) extends Actor {
  override def receive: Receive = ...
}

When our User actor receives messages, the actor’s internal state may change. But the path to that actor and the ID of the User don’t change. Those are fixed values. This means that this actor is always uniquely identifiable either by the path, or by the ID. This makes the User an Entity.

Often, a good practice when building actors to represent entities is to use the entity ID as the name of the actor. In general, you should try to model your actor hierarchy to replicate the structure of the entities in the domain as closely as possible.

Domain Value Objects

Value Objects, on the other hand, are different from Entities. A Value Object has no identity beyond the attributes that it contains. Two Value Objects that contain the same data are considered to be equal; you don’t bother trying to distinguish between them. Value Objects, unlike Entities, are immutable. They must be immutable because if their properties change, they become different Value Objects and are no longer equal.

In Akka, the messages passed between actors are natural Value Objects. Those messages are immutable if we are following best practices and are usually not identifiable. They are just data containers. They might contain references to other Entities, but the message itself is not usually an Entity. We can also use Value Objects as the containers that hold the state of our actors. We can swap in different states as required, but the state itself doesn’t have any identity. It is only when it is present inside of the actor that it becomes identifiable. If we were to create two unique actors that both had the same state, the actors would be considered Entities, whereas the state would be considered Value Objects.

Our User actor might have a series of messages that need to be passed to that actor in order to alter its state. For example, if you need to change the name of the user, you might do that by using a message like SetName:

object User {
  case class SetName(firstName: String, lastName: String)
}

In this case, the SetName message is a Value Object. If you have two SetName objects for which the values of firstName and lastName are the same, those two messages can be considered equivalent. Whether you send one or the other doesn’t matter; the effect on the user is the same. Conversely, if you were to change the value of firstName in one of the messages, it would be a different message. Sending it to the user would have a completely different effect. There is no unique identifier for the SetName message. There is no way to distinguish one message from the other except by the contents of the message itself. This is what makes it a Value Object.
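Scala case classes give this behavior for free, because their equality is defined by their fields. The short sketch below illustrates the point; the SetNameEquality object and the sample names are hypothetical and exist only for the demonstration.

```scala
object User {
  final case class SetName(firstName: String, lastName: String)
}

// Hypothetical demonstration values.
object SetNameEquality {
  import User.SetName

  // Two separately constructed messages with the same contents.
  val first: SetName  = SetName("Ada", "Lovelace")
  val second: SetName = SetName("Ada", "Lovelace")

  // `copy` produces a new Value Object instead of mutating the original.
  val changed: SetName = first.copy(firstName = "Augusta")
}
```

Here first and second are distinct instances that compare as equal, whereas changed is a different message altogether.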

The messages that you use to pass between actors are called the message protocol. A good practice is to embed messages for a particular type of actor in a common protocol object. This can be either in the companion object for the individual actor or it can be a separate protocol object (e.g., UserProtocol). The latter case is particularly useful if you want multiple types of actors to handle the same set of messages.
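As a rough sketch of the separate protocol object, the following defines a UserProtocol and one actor that handles it. It assumes the classic akka-actor module is on the classpath. GetName, FullName, and NamedUser are hypothetical additions for illustration and do not come from the examples above.

```scala
import akka.actor.{Actor, Props}

// Separate protocol object that several actor types could share.
// GetName and FullName are hypothetical additions for this sketch.
object UserProtocol {
  final case class SetName(firstName: String, lastName: String)
  case object GetName
  final case class FullName(value: String)
}

// Hypothetical actor that understands the UserProtocol messages.
class NamedUser extends Actor {
  import UserProtocol._

  // Mutable state stays inside the actor; the messages remain immutable.
  private var name: Option[SetName] = None

  override def receive: Receive = {
    case msg: SetName =>
      name = Some(msg)
    case GetName =>
      val full = name.map(n => s"${n.firstName} ${n.lastName}").getOrElse("")
      sender() ! FullName(full)
  }
}

object NamedUser {
  def props: Props = Props(new NamedUser)
}
```

Any other actor type that should respond to the same messages can import UserProtocol without depending on NamedUser.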

Aggregates and Aggregate Roots

Aggregates are collections of objects within an application. An aggregate creates a logical grouping of many different elements of a system. Every aggregate is bound to an aggregate root. An aggregate root is a special entity within the aggregate that has been given responsibility for the other members of that aggregate. One of the properties of an aggregate is that other aggregates are forbidden from holding a reference to anything inside the aggregate. If you want to gain access to some element inside the aggregate, you must go through the aggregate root; you do not access the inner element directly. For instance, if your person aggregate root has an address entity, you don’t directly access the address, but instead access the appropriate person, and then reference the contained address.
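The person and address case might be sketched as follows in plain Scala. PersonAggregate, Address, and the validation rule are hypothetical; the point is only that the address is reachable and changeable through the root alone.

```scala
import java.util.UUID

// Value Object held inside the aggregate.
final case class Address(street: String, city: String)

// Hypothetical aggregate root: nothing outside holds a mutable reference to
// the address, so every change passes through the root and its rules.
final class PersonAggregate(val id: UUID, private var address: Address) {

  // Read access goes through the root and hands out an immutable value.
  def currentAddress: Address = address

  // Write access also goes through the root, which can enforce invariants.
  def moveTo(newAddress: Address): Unit = {
    require(newAddress.city.nonEmpty, "city must not be empty")
    address = newAddress
  }
}
```

Because the root owns the only mutable reference, it is the single place where rules about the address can be enforced.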

Aggregates, and their associated roots, are a tricky concept. It can be difficult to determine what is an aggregate in your system, or more importantly, what is the right aggregate root. Generally aggregate roots are the top-level pieces of a system. All of your interactions with the system will in one way or another interface with an aggregate root (with a few exceptions). So how do you determine what your aggregate roots are?

One simple rule is to consider deletion. If you pick a specific Entity in the system and delete it, does that delete other Entities in the system? If your system consists of people who have addresses, and you delete an address, does it delete other parts of the system? In this case, probably not. On the other hand, if you delete a person from the system, there is a good chance that you don’t need to keep that person’s address around anymore, so in this case, a person might aggregate an address.

Keep in mind, however, that although people are often aggregate roots in a system, it is not always the case. Take a bowling score-keeping system as an example. In this system, you might have the concept of a game and a player. The player might seem like a natural candidate for an aggregate root. They have scores, and if you delete a player the scores associated with that player are deleted, too. But if you step up another layer, what happens if you delete a game? Well, you could argue that deleting the game does not delete the player, but that isn’t quite accurate. Deleting the game does not delete the person, but if a person is not involved in any games, is that person actually still considered a player? In this case, it might make more sense to say that the game is the aggregate root.

Remember, though, that you might get it wrong. The point isn’t to pick the right aggregate root directly from the start. The important thing is to keep the cost of changing your mind later as low as possible.

In Akka, aggregate roots are often represented by parent actors. When you delete/stop those parents, all of their children go with them. But they don’t need to be top-level actors. Sometimes, it is beneficial to have a layer or two between the top level and the actual aggregate roots. For example, if your users are the aggregate roots, you might want a layer above the user that will be the parent of all users. In this case, your user is still the aggregate root, but you have another component in your system that is responsible for managing those users. This is especially true when you begin introducing concepts like cluster sharding (see Chapter 2).

Let’s look at a very quick example. In a scheduling system, we probably have people that need to be scheduled. We will represent a person using a Person actor. But that Person might also have a Schedule. It might be desirable (especially when using the Actor Model) to represent that Schedule by using an actor, as well, as demonstrated here:

object Schedule {
  def props = Props(new Schedule)
}

class Schedule extends Actor {
  ...
}

class Person(id: UUID) extends Actor {
  private val schedule = createSchedule()

  protected def createSchedule() = context.actorOf(Schedule.props)
}

You can see in this example that the Person actor definitely aggregates the Schedule. You can’t access the Schedule without going through the Person, and if you delete the Person the Schedule goes with it. This makes Person a candidate for an aggregate root. Of course, you can’t stop there. You would need to look at the whole picture. Is Person a child of some other actor? Does that mean that the other actor is actually the aggregate root? These are the types of questions that you need to ask when trying to find your aggregate root.

Repositories

Repositories are where we begin to abstract away our infrastructure concerns. They are used to create an abstraction layer over the top of our storage concerns. The basic approach when working with aggregates in DDD is to go to a repository, retrieve an aggregate from that repository, perform some operation, and then save the aggregate again. This sounds a lot like a database. In fact a repository can be an abstraction over a database, but you need to be careful to not limit yourself to that thinking. Although a repository could be accessing a database, it could also be pulling data from memory, or from disk, or from the web. It might in fact do all of these things. There is nothing that says that a single repository can’t have a dependency on multiple storage mechanisms or be transient or computed.

The key to using repositories is to understand that they are abstraction layers. For this reason, they are often represented as a trait in Scala. That trait defines how you interface with the repository but hides any implementation details about whether it talks to a database or some other storage mechanism. You then create infrastructure-specific implementations of that trait to be used by the domain code.
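For a non-actor setting, such a trait and one implementation might look like the sketch below. All of the names (PersonRecord, PersonStore, InMemoryPersonStore, RenamePerson) are hypothetical, and the in-memory map stands in for whatever storage mechanism the infrastructure actually uses.

```scala
import java.util.UUID

final case class PersonRecord(id: UUID, name: String)

// Domain-side interface: says what the repository does, not how.
trait PersonStore {
  def find(id: UUID): Option[PersonRecord]
  def save(person: PersonRecord): Unit
}

// Infrastructure-side implementation. This hypothetical version keeps data
// in memory; another could talk to a database without changing the trait.
final class InMemoryPersonStore extends PersonStore {
  private var records = Map.empty[UUID, PersonRecord]

  override def find(id: UUID): Option[PersonRecord] = records.get(id)

  override def save(person: PersonRecord): Unit = {
    records = records.updated(person.id, person)
  }
}

// Domain logic follows the retrieve, operate, save flow against the trait.
object RenamePerson {
  def apply(store: PersonStore, id: UUID, newName: String): Boolean =
    store.find(id) match {
      case Some(person) =>
        store.save(person.copy(name = newName))
        true
      case None =>
        false
    }
}
```

An in-memory implementation like this one is also a convenient substitute in tests, since RenamePerson never learns which implementation it was given.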

In Akka, when using the Actor Model, repositories can be a little tricky. The general flow of a repository involves something resembling an Akka ask. You ask the repository for an instance of some aggregate, you perform an operation on that aggregate, and then you instruct the repository to commit that change. The problem is that this violates the principle of “Tell, Don’t Ask” (more on that later). Often in Akka, our repositories take on a slightly different appearance. Instead of asking the repository for an instance of a particular aggregate, we instruct the repository to give a message to that aggregate. The repository acts like a “manager.” It is the parent of a certain set of actors. We inform that manager that we want a specific actor to process a message. It is then the responsibility of the repository to locate that actor and pass the message on.

We need to remember as well that the purpose of a repository is to abstract away from infrastructure concerns. Our goal when using a repository is usually to reconstitute an aggregate from a database or other storage mechanism. If we are using actors to represent our aggregates this will probably mean loading some data from a database, creating the appropriate actor using that data, and then passing a message to the aggregate. The aggregate itself can still be part of the domain. The interface to the repository is also part of the domain. However, the precise details of the repository implementation are part of the infrastructure, not part of the domain.

Why does this matter? Why is it important to treat the implementation as part of the infrastructure? Our goal is to create a system in which the domain is disconnected from the infrastructure. We don’t want to care about whether we are using a SQL database, a NoSQL database, a data file, or any other structure. Ideally, we want to be able to swap in different implementations of the repository if need be. This can be valuable for testing purposes, but it is also valuable for production code as the application evolves. We don’t want to assume that our current database implementation is static and will remain so. Instead, we want to assume that it will evolve and prepare for that. As our needs evolve we can write new repository implementations and make use of them without having to rewrite the logic that talks to that repository.

In our scheduling domain, assuming that we have decided that our Person is an aggregate, we might have a PersonRepository to manage instances of that aggregate. If you are using the Actor Model, that PersonRepository should be an actor, as well. In this case, you will want to define an interface to that repository inside of your domain. Because actors communicate through messages rather than through methods, it doesn’t make sense to use a trait here. Instead, we define the interface as a message protocol in the domain:

object PersonRepository {
  case class Send(userId: UUID, message: Any)
}

You can then define a PersonRepository in your infrastructure that makes use of that protocol, but how that protocol is used is left to the infrastructure. Here’s how you might implement this:

class CassandraPersonRepository extends Actor {
  ...
}

Because Akka communicates by using ActorRefs rather than actor instances, you can pass around a reference to this repository where you need to. Clients using that reference don’t need to know they are talking to a CassandraPersonRepository rather than a SQLPersonRepository. They need only know that the repository makes use of the PersonRepository protocol. You can then “Send” a message to the repository, identifying the user to which that message is directed. It is up to the PersonRepository to find the appropriate user (or create it) and then deliver the message.
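The find-or-create-then-deliver behavior could be sketched as follows, assuming the classic akka-actor module. InMemoryPersonRepository and EchoPerson are hypothetical stand-ins: a real repository would reconstitute the Person from storage, and a real Person would do more than echo.

```scala
import java.util.UUID
import akka.actor.{Actor, ActorRef, Props}

object PersonRepository {
  final case class Send(userId: UUID, message: Any)
}

// Hypothetical stand-in for the Person aggregate: it replies with its own ID
// and the message it was given.
class EchoPerson(id: UUID) extends Actor {
  override def receive: Receive = {
    case message => sender() ! (id -> message)
  }
}

// Hypothetical repository implementation. It creates children on demand
// instead of loading them from a database.
class InMemoryPersonRepository extends Actor {
  import PersonRepository.Send

  override def receive: Receive = {
    case Send(userId, message) =>
      val name = userId.toString
      // Locate the child named after the entity ID, or create it.
      val person: ActorRef = context
        .child(name)
        .getOrElse(context.actorOf(Props(new EchoPerson(userId)), name))
      // forward keeps the original sender, so replies bypass the repository.
      person.forward(message)
  }
}
```

Clients hold only an ActorRef and the PersonRepository protocol, so this implementation could be replaced by a storage-backed one without changing them.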

Factories and Object Creation

Factories exist to abstract away the complexities of new domain object creation. Sometimes, the creation of a new domain object is complicated. It may involve wiring multiple pieces together, or pulling some data from data storage. There can be any number of complex operations that need to be performed. Factories differ from repositories only slightly. A factory is intended to abstract away new object creation, whereas a repository is intended to abstract away existing object re-creation. However, often that subtle difference is not enough to warrant the creation of a new abstraction layer. For this reason factories and repositories are sometimes combined, providing methods that will create a new object or return an existing one if possible.
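A combined factory and repository can be as small as a single find-or-create method. The sketch below is a plain Scala illustration with hypothetical names (Project, ProjectRegistry); a real version would also consult a storage mechanism.

```scala
import java.util.UUID

final case class Project(id: UUID, name: String)

// Hypothetical combined factory/repository: callers never need to know
// whether the Project already existed.
final class ProjectRegistry {
  private var projects = Map.empty[UUID, Project]

  // `name` is passed by name, so it is evaluated only when a new Project
  // actually has to be created.
  def findOrCreate(id: UUID, name: => String): Project =
    projects.get(id) match {
      case Some(existing) => existing
      case None =>
        val created = Project(id, name)
        projects = projects.updated(id, created)
        created
    }
}
```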

In Akka, a factory operates much the same as a repository. As with a repository, instead of following an ask pattern, it is often better to use a tell pattern wherein you instruct the factory to create your object and then pass a message to the newly created instance. Again, because of its similarity to a repository, distinguishing between the two is often not really necessary.

Domain Services

When using DDD, the goal is always to try to put your logic into an existing domain object. Usually this means adding to an aggregate root. But sometimes this is difficult. For example, you might have an operation that does not naturally fit into any particular aggregate root, or conversely, you might have an operation that works over multiple aggregate roots. In these cases, it might be difficult to find a suitable domain object to fill the role required. For these situations, we can introduce something called a service. A service is a domain object that is there to handle actions that do not naturally fit into an aggregate, but services can interact with aggregates as required.
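As a small, non-actor illustration, consider an operation that needs two aggregates at once. The names below (PersonSchedule, ResourceSchedule, AvailabilityService) and the day-number representation are hypothetical simplifications.

```scala
// Hypothetical aggregates, reduced to the data this operation needs.
final case class PersonSchedule(personId: String, busyDays: Set[Int])
final case class ResourceSchedule(resourceId: String, busyDays: Set[Int])

// Hypothetical domain service: the operation spans two aggregates, so it
// belongs to neither of them.
object AvailabilityService {

  // Returns the first day in 1..horizonDays on which both are free, if any.
  def firstSharedFreeDay(
      person: PersonSchedule,
      resource: ResourceSchedule,
      horizonDays: Int
  ): Option[Int] =
    (1 to horizonDays).find { day =>
      !person.busyDays(day) && !resource.busyDays(day)
    }
}
```

The service holds no state of its own; it only coordinates the aggregates it is given.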

Generally, we try to leave services as a last resort. If an aggregate fits the role, you should use it. If no existing aggregate fits, you should ask yourself if there is an aggregate that you are missing. Only when you have exhausted other possibilities should you introduce a service instead.

In Akka, services can take many forms. They could be long-lived actors that operate against other aggregate actors. Or, they could be temporary actors that are created to perform some task and then terminated when the task is completed.

An example of a service in the scheduling domain example would be a worker actor that you create to fulfill a particular task. For example, you might want a temporary worker actor to handle a single request:

class ScheduleRequestService(request: ScheduleRequest) extends Actor {
  ...
}

The job of the ScheduleRequestService is to manage any state involved with that particular request and to communicate with whatever aggregates are needed during the fulfillment of the request. After the request has been processed completely, the ScheduleRequestService can be terminated. An alternative implementation for this would be to create the ScheduleRequestService as a long-lived actor. Instead of taking the request as a constructor parameter, it would instead receive the request as a message. However, to manage all the state for the request, you might still need to create a temporary actor (e.g., ScheduleRequestWorker).

Bounded Contexts

The real key to DDD lies not in the small pieces that we use to build our domain. It isn’t the aggregates or the repositories that make the real magic happen. What makes DDD special comes in the form of bounded contexts. Any system of significant size is going to naturally break down into smaller components. These components can have their own domain. Although that domain can share some elements with the overall system, how those elements are represented might be different depending on the context in which they are used. Trying to build a single, cohesive domain that fits all use cases breaks down quickly. Bounded contexts seek to avoid this problem by recognizing that each context that you operate in might have a different Domain Model.

Take the scheduling example we’ve been building. When attempting to schedule people on a particular job, we have a domain representation of those people. On the other hand, we have a separate piece of the system that is going to be responsible for maintaining those people and any information about them. Both pieces of the system are dealing with the same people, but they each have very different needs. Whereas the people management side might care about things like addresses and phone numbers, the scheduling system might not. If we were to try to build a Domain Model that suited both purposes, we could hopelessly clutter our code. For that matter, there are aspects of the scheduling side that are completely unrelated to managing the people. When scheduling, it might be necessary to schedule resources that aren’t people at all. Those resources might be pieces of hardware, machinery, or vehicles. Trying to shoehorn those into the same structure as we do people is certain to fail.
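In code, this often shows up as two deliberately different models of the same real-world person. The sketch below uses two Scala objects to stand in for the two contexts; the fields and the ContextMapping helper are hypothetical.

```scala
import java.util.UUID

// People management context: cares about contact details.
object PeopleManagement {
  final case class Person(id: UUID, name: String, address: String, phone: String)
}

// Scheduling context: a person is one kind of schedulable resource among
// several, and contact details are irrelevant here.
object Scheduling {
  sealed trait Resource { def id: UUID }
  final case class Person(id: UUID, name: String) extends Resource
  final case class Vehicle(id: UUID, plate: String) extends Resource
}

// Hypothetical translation at the boundary between the two contexts.
object ContextMapping {
  def toSchedulable(person: PeopleManagement.Person): Scheduling.Person =
    Scheduling.Person(person.id, person.name)
}
```

Only the shared identifier crosses the boundary unchanged; each context remains free to evolve its own model.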

In addition to a system for managing people, other bounded contexts might exist in the system. We might want a separate context for managing skills. Although skills management could end up being part of our people management, it might be large and complex enough that we want to keep it separate. We might want to do the same with our project management. Again, the details of a project might not be necessary for the actual scheduling. When creating a project, we might need to include information like the primary contact for the project. This information is important, but not for the scheduling engine. Again, separating project management from project scheduling can be valuable.

In Akka, a bounded context can take different forms. It might be desirable to create a series of top-level actors in the system, one for each bounded context. Often, however, bounded contexts represent services that are disconnected from one another. They might be separate actor systems tied together by using Akka HTTP. Or, they might be actors tied together by using Akka Remoting or Akka Cluster. They might reside in the same Java Virtual Machine (JVM), but they can also live in separate JVMs, separate machines, or even separate datacenters. In fact, when dividing up your application into bounded contexts, you might discover that although some contexts map naturally to the Actor Model, others might be more suitably implemented by using a more functional architecture or a more traditional object-oriented architecture. The key is that recognizing that each bounded context is separate and distinct allows you to make those decisions as required. You aren’t tied to any particular approach. You can experiment with what best fits the bounded context.

This approach to bounded contexts fits very well with modern microservices architectures. In this case, each microservice often represents a single bounded context. More importantly, by separating the application into its constituent parts, you can begin to make decisions about how to distribute and scale it.

In breaking an application into separate bounded contexts and distributing those contexts across multiple machines, you give rise to the idea of distributed domain-driven design (DDDD). Using tools like Akka clustering, cluster sharding, and Akka HTTP, you can take a large system, break it up into separate bounded contexts and then distribute those in ways that would be difficult without the tools Akka provides. Here, the concept of location transparency that the Actor Model provides gives you the freedom to distribute actors in myriad different patterns. Whether it is distributing your bounded contexts as Akka HTTP endpoints or distributing actors in a single bounded context using cluster sharding, you are not limited in how you deploy your system. The actual deployment of your actors becomes an implementation detail rather than being an intrinsic part of the application. This opens up the possibility of independently scaling different portions of the system.

Conclusion

Overall, DDD provides a means of giving a system the structure it needs. Applications built without such a structure tend to be much more difficult to understand and maintain, and their overall quality suffers as a result. This is especially true of Akka systems built with the Actor Model, because the desirable high level of isolation can make it difficult to see the overall design without DDD.

Now that we have introduced you to both Akka and the Actor Model (and shown you how this merges with the powerful design approach of DDD), you have the tools necessary to build powerful, scalable, and highly maintainable systems, keeping your system well structured with these established patterns.

In Chapter 4, we will discuss the attributes of good design with actors, and how to best organize your system for concurrent and effective data flow, while safely maintaining the scalability you want.

1 Evans, Eric. Domain-Driven Design: Tackling Complexity in the Heart of Software. Boston: Addison-Wesley, 2003.

2 Vernon, Vaughn. Implementing Domain-Driven Design. Boston: Addison-Wesley, 2013.
