Raul Estrada and Isaac Ruiz, Big Data SMACK, 10.1007/978-1-4842-2175-4_7

7. The Manager: Apache Mesos

Raul Estrada¹ and Isaac Ruiz¹

(1)Mexico City, Mexico

We are reaching the end of this trip. In this chapter, you will learn how to create your own cluster in a simple way.

The M in SMACK gives us the guidelines to reuse physical or virtual resources, so that your cluster has nothing to envy in a big data center.

Divide et Impera (Divide and Rule)

While writing this chapter, we participated in several meetups related to the topics discussed here, so as not to miss anything current.

These topics have generated much expectation and interest, and it is becoming more common to meet newcomers than people who already know the subject.

During these talks, several participants asked about distributed systems and distributed processes. A group activity¹ was a quick way to explain distributed computing. The activity seemed simple: find the rock-paper-scissors champion among the nearly 40 attendees.

The initial rule: choose someone nearby, play one match, and the winner looks for another winner to challenge. The losers serve as “spokespersons,” repeating the name of the winner; if a new champion beats your leader, the entire group becomes followers of the new champion.

Simple, right? Well, not so much.

As time passed and the first winners emerged, the shouting and disorder increased, because it was not easy to locate new challengers. The activity took a few minutes until the “furious” champion finally emerged (there was a lot of testosterone at the event).

In Figure 7-1, the circles represent each person and the diamond symbolizes a match. The line coming out of each diamond represents the winner of the match, and so on, until it reaches the winner.

Figure 7-1. Sequence to find the rock-paper-scissors champion

After the activity, the question was whether it is possible to find the champion in less time. It was definitely possible, but the question was meant to elicit a strategy. The proposal discussed to remove noise and disorder concluded that there should be people whose function was to locate and “match” winners, which should expedite the activity.

To keep order, these people should also be organized among themselves to perform their task more efficiently. These people could be called coordinators.

Each match requires two participants and generates a winner. You can see that to reach the final match, the finalists had to win more games than others.
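
The arithmetic behind this observation can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the function names, the random choice of moves, and the rule that an unpaired player advances without playing are not part of the meetup activity. With n participants there are always n - 1 matches, because each match eliminates exactly one person; what coordination changes is how many of those matches can happen at the same time, that is, the number of rounds.

```python
import random


def play_match(a, b, rng):
    """Play one rock-paper-scissors match, replaying ties. Returns the winner."""
    beats = {"rock": "scissors", "paper": "rock", "scissors": "paper"}
    while True:
        move_a = rng.choice(list(beats))
        move_b = rng.choice(list(beats))
        if move_a == move_b:
            continue  # tie: play again
        return a if beats[move_a] == move_b else b


def find_champion(players, seed=0):
    """Pair players round by round until only one remains.

    Returns (champion, matches_played, rounds_played).
    """
    if not players:
        raise ValueError("at least one player is required")
    rng = random.Random(seed)  # seeded so the run is repeatable
    remaining = list(players)
    matches = rounds = 0
    while len(remaining) > 1:
        next_round = []
        # Pair neighbours; all matches of a round could be played in parallel.
        for i in range(0, len(remaining) - 1, 2):
            next_round.append(play_match(remaining[i], remaining[i + 1], rng))
            matches += 1
        # Hypothetical rule: an unpaired player advances without playing.
        if len(remaining) % 2 == 1:
            next_round.append(remaining[-1])
        remaining = next_round
        rounds += 1
    return remaining[0], matches, rounds


if __name__ == "__main__":
    champion, matches, rounds = find_champion([f"p{i}" for i in range(40)])
    print(champion, matches, rounds)
```

For 40 participants, the sketch plays 39 matches in 6 rounds, so the finalists play far more matches than those eliminated in the first round.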

In Figure 7-2, you can see the first variant of the solution to make our championship more agile. You might call this activity a process.

Figure 7-2. Sequence to find the rock-paper-scissors champion using just one main coordinator

Our process has something different in this first variant: it uses coordinators.

Now, what happens if you want to more strictly validate each match?

While the coordinators are communicating, the participants may not perform a match properly, so you must rely on coordinators not only to gain speed but also to ensure that the process completes successfully.

A second approach is to generate two large groups (with few coordinators), and in each of these groups, the coordinator is responsible for validating each match. The winner is sent to a pool of partial winners (those who have won at least once but are not yet champions). Two participants are taken from this pool and a match is played (supervised by a third coordinator); the winner returns to the pool and the loser is eliminated from the game.

This pool coordinator is in constant communication with the other two to speed up (or pause) the participants’ flow. At the end, there is only one winner.

Figure 7-3 shows the implementation of the second variant.

Figure 7-3. Sequence to find the rock-paper-scissors champion using a main coordinator and ordering the execution

At this point, you are probably already considering a better strategy to achieve the successful completion of these tasks. And if theory is correct, the solution should have something to help coordinate and something to help distribute, so distributes et impera is the modern “divide and conquer.”

Surely, you’re thinking that this exercise illustrates only concurrency, and that is true. It is also true that distributed systems make use of concurrency to achieve their goals. If you are particularly interested in concurrency, there is another example that shows, clearly and in a fun way, the importance of having a coordinating agent in problem resolution.

In the presentation “Concurrency Is Not Parallelism,”² Rob Pike, one of the creators of the Go language, shows an interesting example where communication works as “coordinator.” In any case, we decided to tackle the example from the audience perspective, as we experienced it at the meetup.

With this little anecdote, you begin to see what the distribution task involves.

Distributed Systems

To talk about distributed systems today is to talk about how the Internet works.

The Internet is a huge system that shares hardware and software for different purposes; one of them is to enjoy the Web (HTTP). Today, saying that a system shares hardware and software seems very simple, but it is still complex.

Distributed systems are not new; they have emerged as computing capacity has increased and data manipulation has grown. Over the years, research focused on how to share resources in order to optimize tasks.

These resources could be a data repository, RAM, or a printer if you are looking at sharing both physical and logical resources. But what happens if two people try to access the same resource at the same time?

In earlier days, new requirements encouraged the creation of models that allowed concurrency. Once concurrency and resource sharing were satisfied, new needs appeared. How to add a new resource to the system? How to make the entire system scalable?
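
The question of two users reaching the same resource at the same time can be illustrated with a minimal Python sketch. It assumes the shared resource is a simple in-memory counter and that the "users" are threads in one process; the class and function names are ours. A lock is only one of several concurrency models, chosen here because it is the shortest to show.

```python
import threading


class SharedCounter:
    """A shared resource; the lock serializes access to it."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:  # only one thread at a time runs this block
            current = self.value
            self.value = current + 1


def _work(counter, increments):
    for _ in range(increments):
        counter.increment()


def run_workers(workers=4, increments=10_000):
    """Start several threads that all update the same counter."""
    counter = SharedCounter()
    threads = [
        threading.Thread(target=_work, args=(counter, increments))
        for _ in range(workers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value


if __name__ == "__main__":
    print(run_workers())  # workers * increments when access is serialized
```

Without the lock, two threads could read the same value and one update would be lost; with it, every increment is counted.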

In these past few paragraphs, we touched on the key points that evolved so that we could have today’s distributed systems.

It is still difficult to implement a distributed system and thus new tools emerge; one of them is Apache Mesos, which is what this chapter is about.

Figure 7-4 shows the importance of a distributed systems coordinator.

Figure 7-4. Distributed system anatomy

Why Are They Important?

Reliability and availability (also called high availability) are the two basic characteristics of today’s systems. Having information always available and having access to a vital process in economic, healthcare, and government settings are requirements that we take for granted. In addition, the number of users of a critical system (or a popular system, in the case of the Internet) grows almost geometrically in some cases, which requires these systems to be scalable and flexible.

All of these features were achieved years ago, but now they can be achieved at low cost. With few resources, and with the ease of use carried over from other systems, the benefits of distributed systems are available at new levels.

It Is Difficult to Have a Distributed System

Several tasks are inherent to operating a distributed system. This includes not only monitoring what is happening, but also making a deployment across a whole cluster. It is a task that must not jeopardize the current version but must reach all the nodes. Each deployment implies a process to ensure that all nodes are ready to accept it.

And finally, we need to know which resources are available within the entire cluster, which are down (a node could go down for any reason), and which were recently added (when a new node is added to the cluster).

All of this results in high-cost data centers, where every hour of wasted time can lead to unnecessary but costly business expenses.

In summary, it’s not easy to implement and maintain a distributed system. Figure 7-5 shows some of the main problems of working with distributed environments when increasing the number of nodes.

Figure 7-5. Typical distributed system tasks that increase the complexity regardless of the number of nodes

Part of the complexity of a distributed system is due to its foundation and the way it is implemented.

The presentation “What You Talk About When You Talk About Distributed Systems”³ discusses the importance of the fundamentals of distributed systems and reminds us of some of the models currently used to implement them.

Figure 7-6 summarizes some of these models.

Figure 7-6. The main distributed models and their implementations

Each of these models has its own implications, so it is possible that a feasible model for one scenario is not feasible for another scenario. Delving into the world of distributed systems requires a lot of theory. There is so much material on the subject that there is a post to complement this presentation. If you find this particular topic interesting, you can find more about it here.⁴

In short, it is not easy to implement and maintain a distributed system.

Ta-dah!! Apache Mesos

And it is in this context that Apache Mesos appeared on the scene in 2010.⁵ A few years ago, discussion of distributed systems was about data centers. Having different machines (physical or virtual) connected together and making them “seen” as a single large machine is something that data centers already do well, or at least reasonably well. The objective of Mesos is to abstract as many as possible of the layers that make up a distributed system; in this case, a data center system.

One goal of Mesos is to program and deploy an application in a data center in the same way that it is done on a PC or a mobile device. Achieving this goal and having it supported on different frameworks is discussed later.

Although tools such as Ansible or Puppet can handle a certain number of nodes, performing the packaging and deployment tasks usually generates some interesting challenges. Using the resources of a large machine represents another challenge for data centers: it is common to run into scenarios where, once more nodes are added to the cluster, CPU usage is uneven, thus wasting much of that large machine’s computing power. Usually, this large machine is actually a cluster with several nodes.

The tricky part comes when we have several clusters.

And it is here that Mesos comes in. Mesos is essentially a “general purpose cluster manager,” or a mechanism to administer several large machines used to drive a data center. This “general purpose” administration means that Mesos not only manages and schedules batch processes but also other processes.

Figure 7-7 shows the different types of processes that can be scheduled in Apache Mesos.

Figure 7-7. Mesos is a general-purpose cluster manager, not only focused on batch scheduling

Apache Mesos tries to solve problems inherent to distributed systems. It tries not only to be a manager but a whole cluster execution platform powered by the (Mesos) frameworks.

Mesos Framework

One way to better understand the Mesos architecture is with Figure 7-8. Mesos requires a layer that provides provisioning functions, and then a top layer to expose applications.

Figure 7-8. Level of abstraction in Mesos

In each of these layers, Mesos requires components that deploy services, find services, and keep those services running. The frameworks we discuss in this chapter cover some of these tasks.

ZooKeeper discovers services. Chronos is responsible for scheduling the services. Marathon and Aurora are responsible for executing the services. These are not the only frameworks to perform the tasks, but they are the most commonly used.

Architecture

The Apache Mesos official documentation begins with a chart that shows its powerful architecture.⁶

Mesos consists of master servers, which can handle as many slaves as they receive (up to 10,000 according to the documentation). These slaves manage software components called Mesos frameworks, which are responsible for task execution.

With this scheme, we can see the importance of the frameworks.

Every framework has two parts: a scheduler, which registers with the master to receive resource offers, and an executor, which runs on the slave nodes and processes the tasks to be executed.

You can read more about the architecture in the paper “Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center” by Benjamin Hindman, Andy Konwinski, Matei Zaharia, Ali Ghodsi, Anthony D. Joseph, Randy Katz, Scott Shenker, and Ion Stoica.⁷

Figures 7-9 and 7-10 are adapted from the official Mesos web site at http://mesos.apache.org. They show an overview of the Mesos architecture and the major frameworks with which it interacts.

Figure 7-9. Mesos architecture: master servers (only one at a time) interact with various slave servers

Figure 7-10. A framework obtains a scheduler and performs a task

In the first figure, we can see that the center of everything is the Mesos master, basically a daemon process that coordinates a certain number of agents (Mesos prefers to call the servers that are part of the cluster agents). These agents live on each node in the cluster. You also see an important component of the Mesos ecosystem: ZooKeeper. Later you will learn about the importance of this component.

The second figure describes the two components that all the frameworks need to run a task: a scheduler and an executor. While there are already a lot of Mesos frameworks ready to run, knowing the parts of a framework is useful not only to better understand how they execute, but also to know what creating your own framework involves.⁸
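
To make the division of labor between scheduler and executor more concrete, here is a toy model in Python. It is a sketch under strong assumptions and does not use the Mesos API: the classes Offer, Task, ToyScheduler, and ToyExecutor, and the function run_cluster, are hypothetical names invented for this illustration. The idea it shows is that the master offers resources, the scheduler decides which tasks fit into each offer, and an executor on the chosen agent runs them.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    """Resources that one agent makes available (hypothetical shape)."""
    agent: str
    cpus: float
    mem: int  # megabytes


@dataclass
class Task:
    name: str
    cpus: float
    mem: int


@dataclass
class ToyScheduler:
    """Framework side: decides which offered resources to use."""
    pending: list

    def resource_offers(self, offers):
        launches = []
        for offer in offers:
            cpus, mem = offer.cpus, offer.mem
            for task in list(self.pending):  # iterate over a copy
                if task.cpus <= cpus and task.mem <= mem:
                    cpus -= task.cpus
                    mem -= task.mem
                    self.pending.remove(task)
                    launches.append((offer.agent, task))
        return launches


class ToyExecutor:
    """Agent side: runs the tasks assigned to its agent."""

    def __init__(self, agent):
        self.agent = agent

    def launch(self, task):
        return f"{task.name} ran on {self.agent}"


def run_cluster(offers, tasks):
    """Play the role of the master: pass offers on and dispatch launches."""
    scheduler = ToyScheduler(pending=list(tasks))
    executors = {offer.agent: ToyExecutor(offer.agent) for offer in offers}
    results = [
        executors[agent].launch(task)
        for agent, task in scheduler.resource_offers(offers)
    ]
    return results, scheduler.pending
```

A task that fits no offer simply stays in the pending list until a later offer is large enough, which is the behavior a real framework would have to plan for as well.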

Mesos 101

Now that you have an overview of the Mesos ecosystem, let’s look at Apache Mesos in action.

The Mesos ecosystem tries to work as an operating system for the data center world; therefore, most documentation describes Mesos as the kernel of that operating system. In the following pages, you will learn the steps to install this kernel. Then you will learn how to install ZooKeeper, which is responsible for keeping information synchronized across all nodes. We continue with Chronos, which schedules the defined services. Finally, Marathon and Aurora are responsible for keeping the services running, each in its own way.

Here we go.

Installation

You are now ready to install Mesos. The installation has three steps: get the binary, get the dependencies, and start Mesos. Something very important to keep in mind is that there is no installer to perform all the steps. Mesos takes advantage of the capabilities of each operating system, and therefore its components must be compiled to leverage the capabilities of the native environment.

Generating native libraries by platform optimizes the performance.

If you’re not used to compiling with the make command, do not worry; the script is very complete and does almost all the work for you. As you shall see, the installation only requires that the dependencies be met in order to compile correctly.

Get the Installer

When this book was written, the stable version of Apache Mesos was 0.28.1.

For a step-by-step installation, follow the guide provided at this web page:

http://mesos.apache.org/gettingstarted/

This guide recommends downloading the latest stable distributable, as follows:

wget http://www.apache.org/dist/mesos/0.28.1/mesos-0.28.1.tar.gz

The following is the downloaded file size:

-rw-r--r--. 1 rugi rugi 29108023 Apr 13 15:07 mesos-0.28.1.tar.gz

Unpack it with this:

%>tar -zxf mesos-0.28.1.tar.gz

A folder called mesos-0.28.1 should be created with this content:

-rw-r--r--   1 isaacruiz  staff     414 Apr  5 20:25 mesos.pc.in
-rw-r--r--   1 isaacruiz  staff   60424 Apr  5 20:25 configure.ac
-rwxr-xr-x   1 isaacruiz  staff    3647 Apr  5 20:25 bootstrap
-rw-r--r--   1 isaacruiz  staff    1111 Apr  5 20:25 README.md
-rw-r--r--   1 isaacruiz  staff     162 Apr  5 20:25 NOTICE
-rw-r--r--   1 isaacruiz  staff    3612 Apr  5 20:25 Makefile.am
-rw-r--r--   1 isaacruiz  staff   28129 Apr  5 20:25 LICENSE
-rw-r--r--   1 isaacruiz  staff  324089 Apr  5 20:25 ltmain.sh
-rw-r--r--   1 isaacruiz  staff   46230 Apr  5 20:25 aclocal.m4
-rwxr-xr-x   1 isaacruiz  staff  860600 Apr  5 20:25 configure
-rwxr-xr-x   1 isaacruiz  staff    6872 Apr  5 20:25 missing
-rwxr-xr-x   1 isaacruiz  staff   14675 Apr  5 20:25 install-sh
-rwxr-xr-x   1 isaacruiz  staff   23566 Apr  5 20:25 depcomp
-rwxr-xr-x   1 isaacruiz  staff   35987 Apr  5 20:25 config.sub
-rwxr-xr-x   1 isaacruiz  staff   42938 Apr  5 20:25 config.guess
-rwxr-xr-x   1 isaacruiz  staff    7333 Apr  5 20:25 compile
-rwxr-xr-x   1 isaacruiz  staff    5826 Apr  5 20:25 ar-lib
-rw-r--r--   1 isaacruiz  staff   45159 Apr  5 20:25 Makefile.in
drwxr-xr-x   4 isaacruiz  staff     136 Apr  5 20:28 support
drwxr-xr-x   5 isaacruiz  staff     170 Apr  5 20:28 mpi
drwxr-xr-x   3 isaacruiz  staff     102 Apr  5 20:28 include
drwxr-xr-x  23 isaacruiz  staff     782 Apr  5 20:28 bin
drwxr-xr-x  13 isaacruiz  staff     442 Apr  5 20:28 3rdparty
drwxr-xr-x  43 isaacruiz  staff    1462 Apr  5 20:28 src
drwxr-xr-x  15 isaacruiz  staff     510 May  8 18:40 m4

Once unzipped, go to the mesos-0.28.1 directory, as follows:

%>cd mesos-0.28.1

Make a directory called build and then go to this directory:

%>mkdir build
%>cd build

Inside this folder, start generating the binaries for your machine:

%>../configure
%>make

The make operation can take a lot of time because, in addition to compiling the dependencies, it may also need to download and configure the scripts to run Mesos, depending on your machine configuration.

If you have a modern computer, you probably have more than one processor. You can speed up the process by telling the make command how many processors you have and by suppressing the console output (no verbose mode), as shown in the following:

%>make -j 2 V=0

This command indicates that you have a machine with two processors and you don’t want verbosity.

In Unix/Linux operating systems, you can check the number of processors with the nproc or lscpu commands.
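
If you prefer not to look up the processor count by hand, the same make invocation can be assembled programmatically. The following Python sketch is only a convenience under our own assumptions: the function name make_command is hypothetical, and os.cpu_count() reports logical processors, which may differ from the physical count.

```python
import os


def make_command(jobs=None, verbose=False):
    """Build the argument list for a parallel make run."""
    if jobs is None:
        jobs = os.cpu_count() or 1  # cpu_count() may return None
    return ["make", "-j", str(jobs), f"V={1 if verbose else 0}"]


if __name__ == "__main__":
    # Prints the command line instead of running it.
    print(" ".join(make_command()))
```

Passing the resulting list to subprocess.run from inside the build directory would start the compilation; printing it first lets you check the command before committing to a long build.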

Missing Dependencies

The installation process is still being improved, particularly because the compilation relies heavily on the characteristics of the computer on which we want to install Apache Mesos.

The compilation process assumes that we have all the required libraries for compiling; therefore, http://mesos.apache.org/gettingstarted/ lists the required libraries for the following operating systems:

Ubuntu 14.04
Mac OS: Yosemite and El Capitan
CentOS 6.6
CentOS 7.2

We strongly suggest investigating before installing; also, install the following libraries on your computer according to your operating system and version:

libcurl
libapr-1

These are the two dependencies that are typically missing.

If you use yum or apt-get, rely on the search options offered by each tool. If you are trying to install on a Mac, before anything else, run the following command to ensure that the command-line developer tools are up to date.

xcode-select --install

Note

Ensure the libraries’ compatibility. Running the make command could take more than 20 minutes. Be patient; it is always difficult to use something earlier than version 1.0.

Start Mesos

Once past the dependencies and make build stages, starting Mesos is very simple; you just need to run two lines.

Master Server

The first line starts the master server. Since the working directory points to a system directory, be sure to run the command as a privileged user:

%>cd build
%>./bin/mesos-master.sh --ip=127.0.0.1 --work_dir=/var/lib/mesos

This is the typical console output:

%>./bin/mesos-master.sh --ip=127.0.0.1 --work_dir=/var/lib/mesos
I0508 16:48:54.315554 2016645120 main.cpp:237] Build: 2016-05-08 16:20:08 by isaacruiz
I0508 16:48:54.315907 2016645120 main.cpp:239] Version: 0.28.1
I0508 16:48:54.315989 2016645120 main.cpp:260] Using 'HierarchicalDRF' allocator
I0508 16:48:54.320935 2016645120 leveldb.cpp:174] Opened db in 4589us
I0508 16:48:54.323814 2016645120 leveldb.cpp:181] Compacted db in 2845us
I0508 16:48:54.323899 2016645120 leveldb.cpp:196] Created db iterator in 32us
I0508 16:48:54.323932 2016645120 leveldb.cpp:202] Seeked to beginning of db in 18us
I0508 16:48:54.323961 2016645120 leveldb.cpp:271] Iterated through 0 keys in the db in 20us
I0508 16:48:54.325166 2016645120 replica.cpp:779] Replica recovered with log positions 0 -> 0 with 1 holes and 0 unlearned
I0508 16:48:54.336277 528384 recover.cpp:447] Starting replica recovery
I0508 16:48:54.338512 2016645120 main.cpp:471] Starting Mesos master
I0508 16:48:54.357270 528384 recover.cpp:473] Replica is in EMPTY status
I0508 16:48:54.368338 2016645120 master.cpp:375] Master 7f7d9b4b-c5e4-48be-bbb7-78e6fac701ea (localhost) started on 127.0.0.1:5050
I0508 16:48:54.368404 2016645120 master.cpp:377] Flags at startup: --allocation_interval="1secs" --allocator="HierarchicalDRF" --authenticate="false" --authenticate_http="false" --authenticate_slaves="false" --authenticators="crammd5" --authorizers="local" --framework_sorter="drf" --help="false" --hostname_lookup="true" --http_authenticators="basic" --initialize_driver_logging="true" --ip="127.0.0.1" --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" --max_completed_frameworks="50" --max_completed_tasks_per_framework="1000" --max_slave_ping_timeouts="5" --port="5050" --quiet="false" --recovery_slave_removal_limit="100%" --registry="replicated_log" --registry_fetch_timeout="1mins" --registry_store_timeout="20secs" --registry_strict="false" --root_submissions="true" --slave_ping_timeout="15secs" --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" --webui_dir="/Users/isaacruiz/Downloads/mesos/mesos-0.28.1/build/../src/webui" --work_dir="/var/lib/mesos" --zk_session_timeout="10secs"
W0508 16:48:54.378363 2016645120 master.cpp:380]
**************************************************
Master bound to loopback interface! Cannot communicate with remote schedulers or slaves. You might want to set '--ip' flag to a routable IP address.
**************************************************

Slave Server

The second line starts the first Mesos slave server:

%>cd build
%> ./bin/mesos-slave.sh --master=127.0.0.1:5050

This is the typical console output:

%>./bin/mesos-slave.sh --master=127.0.0.1:5050
I0508 16:49:09.586303 2016645120 main.cpp:223] Build: 2016-05-08 16:20:08 by isaacruiz
I0508 16:49:09.587652 2016645120 main.cpp:225] Version: 0.28.1
I0508 16:49:09.588884 2016645120 containerizer.cpp:149] Using isolation: posix/cpu,posix/mem,filesystem/posix
I0508 16:49:09.627917 2016645120 main.cpp:328] Starting Mesos slave
I0508 16:49:09.630908 3747840 slave.cpp:193] Slave started on 1)@192.168.1.5:5051
I0508 16:49:09.630956 3747840 slave.cpp:194] Flags at startup: --appc_simple_discovery_uri_prefix="http://" --appc_store_dir="/tmp/mesos/store/appc" --authenticatee="crammd5" --container_disk_watch_interval="15secs" --containerizers="mesos" --default_role="*" --disk_watch_interval="1mins" --docker="docker" --docker_kill_orphans="true" --docker_registry="https://registry-1.docker.io" --docker_remove_delay="6hrs" --docker_socket="/var/run/docker.sock" --docker_stop_timeout="0ns" --docker_store_dir="/tmp/mesos/store/docker" --enforce_container_disk_quota="false" --executor_registration_timeout="1mins" --executor_shutdown_grace_period="5secs" --fetcher_cache_dir="/tmp/mesos/fetch" --fetcher_cache_size="2GB" --frameworks_home="" --gc_delay="1weeks" --gc_disk_headroom="0.1" --hadoop_home="" --help="false" --hostname_lookup="true" --image_provisioner_backend="copy" --initialize_driver_logging="true" --isolation="posix/cpu,posix/mem" --launcher_dir="/Users/isaacruiz/Downloads/mesos/mesos-0.28.1/build/src" --logbufsecs="0" --logging_level="INFO" --master="127.0.0.1:5050" --oversubscribed_resources_interval="15secs" --port="5051" --qos_correction_interval_min="0ns" --quiet="false" --recover="reconnect" --recovery_timeout="15mins" --registration_backoff_factor="1secs" --sandbox_directory="/mnt/mesos/sandbox" --strict="true" --switch_user="true" --version="false" --work_dir="/tmp/mesos"
I0508 16:49:39.704506 3747840 slave.cpp:464] Slave resources: cpus(*):2; mem(*):7168; disk(*):482446; ports(*):[31000-32000]
I0508 16:49:39.704661 3747840 slave.cpp:472] Slave attributes: [  ]
I0508 16:49:39.704684 3747840 slave.cpp:477] Slave hostname: 192.168.1.5
I0508 16:49:39.719388 1064960 state.cpp:58] Recovering state from '/tmp/mesos/meta'
I0508 16:49:39.720755 4284416 status_update_manager.cpp:200] Recovering status update manager
I0508 16:49:39.721927 4284416 containerizer.cpp:407] Recovering containerizer
I0508 16:49:39.728039 2674688 provisioner.cpp:245] Provisioner recovery complete
I0508 16:49:39.728682 3211264 slave.cpp:4565] Finished recovery
I0508 16:49:39.732142 2138112 status_update_manager.cpp:174] Pausing sending status updates
I0508 16:49:39.732161 3211264 slave.cpp:796] New master detected at master@127.0.0.1:5050
I0508 16:49:39.733449 3211264 slave.cpp:821] No credentials provided. Attempting to register without authentication
I0508 16:49:39.733577 3211264 slave.cpp:832] Detecting new master
I0508 16:49:40.588644 2138112 slave.cpp:971] Registered with master master@127.0.0.1:5050; given slave ID 7f7d9b4b-c5e4-48be-bbb7-78e6fac701ea-S0
I0508 16:49:40.589226 528384 status_update_manager.cpp:181] Resuming sending status updates
I0508 16:49:40.589984 2138112 slave.cpp:1030] Forwarding total oversubscribed resources

As with the other frameworks covered in this book, both commands must keep running for this to work; first the master server and then the slave server.

In Figure 7-11, we can see both windows running simultaneously.

Figure 7-11. Two consoles showing the execution: the master server in front and a slave server in back

At this point, with the master and slave started, it’s already possible to access the Mesos web console. The console listens on port 5050. Open your favorite browser and go to http://127.0.0.1:5050 to see the Apache Mesos main screen running with one slave server. Figure 7-12 shows the main screen of the Mesos console.

Figure 7-12. Mesos server main screen running
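
If the browser shows nothing, a quick way to tell whether the master is listening at all is to try a plain TCP connection to the port. The Python sketch below does only that; the function name is_listening is ours, and a successful connection proves that something accepts connections on the port, not that it is Mesos.

```python
import socket


def is_listening(host="127.0.0.1", port=5050, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, ...
        return False


if __name__ == "__main__":
    print("master port open:", is_listening())
```

The same check with port 5051 applies to the slave, which, according to the startup flags shown earlier, listens on that port.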

Note

Before continuing, have at hand the location of the libmesos.so file (or the libmesos.dylib file if compiling on a Mac with OS X). This file is required to integrate with frameworks, as discussed next. Use the find . -name "libmesos.*" command to locate it.
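
The same search can be done from Python when the result is needed inside a script, for example to export the path for a framework later. This is a minimal sketch; the function name find_libmesos is hypothetical, and the directory to search (typically your build directory) is for you to supply.

```python
from pathlib import Path


def find_libmesos(root):
    """Return sorted paths under root whose file name starts with 'libmesos.'."""
    return sorted(p for p in Path(root).rglob("libmesos.*") if p.is_file())
```

Calling find_libmesos("build") from the Mesos source directory returns a list of matching paths, which is empty if the compilation has not produced the library yet.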

Teaming

Although Mesos has still not reached version 1.0, there are already several frameworks that contribute to a robust ecosystem and perfectly complement the Mesos objectives. In particular, there are four frameworks to know: ZooKeeper, Chronos, Marathon, and Aurora.

ZooKeeper

The official site of the ZooKeeper framework tells us that it is a centralized naming service that also maintains configuration information in a distributed way.

Figure 7-13 shows the Apache ZooKeeper home page.

Figure 7-13. The Apache ZooKeeper home page

Do you remember the diagram showing the Mesos architecture? ZooKeeper’s strength is keeping distributed processes coordinated through service replication; clients connect to a set of replicated servers (there is one main server), and from there they get the information they need.

To manage the information, ZooKeeper creates a distributed namespace (see Figure 7-14) across all nodes; this namespace is similar to a standard file system.

Figure 7-14. ZooKeeper namespace

ZooKeeper was designed to be simple to use. The API only has seven messages, as shown in Table 7-1.

Table 7-1. Messages in the ZooKeeper API

Message	Definition
create	Creates a node at a location in the tree.
delete	Deletes a node.
exists	Tests if a node exists at a location.
get data	Reads the data from the node.
set data	Writes data to a node.
get children	Retrieves a list of a node’s children.
sync	Waits for data to be propagated.
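
To get a feel for how these seven messages operate on a file-system-like namespace, the following Python sketch keeps a tree of nodes in memory. It is a toy under our own assumptions, not a ZooKeeper client: the class name ToyZNodeTree and its method names are invented here, there is no networking or replication, and sync has nothing to wait for because only one copy of the data exists.

```python
class ToyZNodeTree:
    """In-memory stand-in for a ZooKeeper-style namespace."""

    def __init__(self):
        self._nodes = {"/": b""}  # path -> data; the root always exists

    @staticmethod
    def _parent(path):
        parent = path.rsplit("/", 1)[0]
        return parent or "/"

    def create(self, path, data=b""):
        if path in self._nodes:
            raise KeyError(f"node exists: {path}")
        if self._parent(path) not in self._nodes:
            raise KeyError(f"missing parent for: {path}")
        self._nodes[path] = data

    def delete(self, path):
        if self.get_children(path):
            raise ValueError(f"node has children: {path}")
        del self._nodes[path]

    def exists(self, path):
        return path in self._nodes

    def get_data(self, path):
        return self._nodes[path]

    def set_data(self, path, data):
        if path not in self._nodes:
            raise KeyError(path)
        self._nodes[path] = data

    def get_children(self, path):
        prefix = path.rstrip("/") + "/"
        return sorted(
            p[len(prefix):]
            for p in self._nodes
            if p != path and p.startswith(prefix) and "/" not in p[len(prefix):]
        )

    def sync(self):
        """Nothing to wait for: there is a single in-memory copy."""


if __name__ == "__main__":
    tree = ToyZNodeTree()
    tree.create("/app")
    tree.create("/app/config", b"v1")
    print(tree.get_children("/app"))
```

In a real ensemble, each of these calls travels to a server and the data is replicated; the sketch only shows the shape of the namespace and of the operations.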

Figure 7-15 shows how ZooKeeper maintains high availability for the scheduled services. There is a main server (leader). All servers know each other and they all keep their state in memory. As long as a majority of the servers are active, the service remains available.

Figure 7-15. ZooKeeper service

Clients, meanwhile, connect to a single server. A TCP connection maintains communication with this server (sending heartbeats). If something happens to the server, the client simply connects to another server.

Installation

At the time of this writing, the stable version of ZooKeeper was 3.4.6. The installation process is simple. Download the binary file of the latest stable version from this web page:

http://zookeeper.apache.org/releases.html

The tar.gz file for this version has the following size:

Mode                LastWriteTime         Length Name
----                -------------         ------ ----
-a----    15/05/2016  08:17 a. m.       17699306 zookeeper-3.4.6.tar.gz

Once unzipped, the first step is to create a configuration file; it must be named zoo.cfg and it must be located in the conf/ directory.

By default, the conf/ directory has a sample file (zoo_sample.cfg) where the parameters to set are described in detail.

In short, a zoo.cfg file must have at least the following values:

/opt/apache/zookeeper/zookeeper-3.4.6/conf%> vi zoo.cfg
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/tmp/zookeeper
clientPort=2181

The same directory has a sample file; we can copy its contents to our zoo.cfg file.
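
Before starting the server, it can be handy to confirm that zoo.cfg contains the values listed above. The Python sketch below parses the key=value lines and reports what is missing. The function names and the CHAPTER_KEYS tuple are our own; the list of keys simply mirrors the five values shown in this section and is not an official ZooKeeper requirement list.

```python
# Keys taken from the minimal zoo.cfg shown in this chapter (an assumption,
# not an official list of mandatory settings).
CHAPTER_KEYS = ("tickTime", "initLimit", "syncLimit", "dataDir", "clientPort")


def parse_zoo_cfg(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines without '='
            settings[key.strip()] = value.strip()
    return settings


def missing_keys(settings, required=CHAPTER_KEYS):
    """Return the required keys that are absent, in their listed order."""
    return [key for key in required if key not in settings]


if __name__ == "__main__":
    sample = "tickTime=2000\ndataDir=/tmp/zookeeper\nclientPort=2181\n"
    print(missing_keys(parse_zoo_cfg(sample)))
```

To check a real file, read it with open("conf/zoo.cfg").read() and pass the text to parse_zoo_cfg.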

Once the zoo.cfg file is configured, run the following command to check the status; it validates the ZooKeeper configuration file location and the execution mode.

~/opt/apache/zookeeper/zookeeper-3.4.6/bin%>./zkServer.sh status
JMX enabled by default
Using config: /Users/isaacruiz/opt/apache/zookeeper/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: standalone

If we run only the zkServer.sh file, we can see the list of tasks that can be used with the ZooKeeper binary (be sure to execute instructions with a user that has enough privileges):

~/opt/apache/zookeeper/zookeeper-3.4.6/bin%>./zkServer.sh
Password:
JMX enabled by default
Using config: /users/isaacruiz/opt/apache/zookeeper/zookeeper-3.4.6/bin/../conf/zoo.cfg
Usage: ./zkServer.sh {start|start-foreground|stop|restart|status|upgrade|print-cmd}
{16-05-10 21:24}:~/opt/apache/zookeeper/zookeeper-3.4.6/bin isaacruiz%

As you can see, we have the following tasks:

start, start-foreground, stop, restart, status, upgrade, print-cmd

Now, start ZooKeeper:

~/opt/apache/zookeeper/zookeeper-3.4.6/bin%>./zkServer.sh start
JMX enabled by default
Using config: /users/isaacruiz/opt/apache/zookeeper/zookeeper-3.4.6/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
{16-05-10 21:24} opt/apache/zookeeper/zookeeper-3.4.6/bin isaacruiz%

Chronos

The second component is Chronos, the substitute for the cron utility in the Mesos context. The home page of this project is at https://mesos.github.io/chronos/ .

If you’ve used a Unix-based operating system, you have probably scheduled the repetitive execution of a process using this operating system utility. Chronos does the same; it is a task scheduler. In the Chronos context, however, it is a distributed task scheduler, and in addition to isolating tasks, it can orchestrate them.

Another advantage of Chronos is that it schedules tasks using the ISO 8601 standard; the notation offered by this standard is much friendlier for specifying the time intervals at which to execute tasks. Chronos runs directly on Mesos. It is the first line of interaction with the outside world. It integrates with both the master and the slave servers, and through this communication it manages the records of the jobs to be done.
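
As an illustration of the notation, an ISO 8601 repeating interval has three parts separated by slashes: a repetition count (R followed by a number, or R alone to repeat indefinitely), a start time, and the period between runs. The Python sketch below reads a simplified subset of that notation and lists the first run times. Everything in it is an assumption made for illustration: the function names are ours, only UTC timestamps ending in Z and periods built from days, hours, minutes, and seconds are accepted, the count is treated as the total number of runs, and none of this code is taken from Chronos.

```python
import re
from datetime import datetime, timedelta, timezone

# Simplified subset of ISO 8601 durations: P[nD][T[nH][nM][nS]]
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def parse_duration(text):
    """Turn a duration such as 'PT2H' or 'P1DT30M' into a timedelta."""
    match = _DURATION.match(text)
    parts = {k: int(v) for k, v in match.groupdict().items() if v} if match else {}
    if not parts:
        raise ValueError(f"unsupported duration: {text}")
    return timedelta(**parts)


def parse_schedule(text):
    """Split 'R[n]/<start>/<period>' into (repetitions, start, period).

    repetitions is None when the count is omitted (repeat indefinitely).
    """
    repeat, start, period = text.split("/")
    if not repeat.startswith("R"):
        raise ValueError(f"schedule must start with R: {text}")
    repetitions = int(repeat[1:]) if len(repeat) > 1 else None
    start_time = datetime.strptime(start, "%Y-%m-%dT%H:%M:%SZ").replace(
        tzinfo=timezone.utc
    )
    return repetitions, start_time, parse_duration(period)


def first_runs(text, limit=3):
    """Return up to `limit` run times described by the schedule."""
    repetitions, start, period = parse_schedule(text)
    count = limit if repetitions is None else min(limit, repetitions)
    return [start + i * period for i in range(count)]


if __name__ == "__main__":
    for run in first_runs("R/2016-05-08T20:00:00Z/PT2H"):
        print(run.isoformat())
```

The schedule R/2016-05-08T20:00:00Z/PT2H therefore reads as "starting on May 8, 2016 at 20:00 UTC, run every two hours, indefinitely," which is arguably easier to read than the equivalent cron expression.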

The Chronos architecture is shown in Figure 7-16.

Figure 7-16. Chronos architecture

Installation

At the time of this writing, the stable Chronos version was 2.4.0. Because it is a mature version, the installation process is simple, although generating the binary for the platform where it is used is a little slow.

The installation process assumes that you have already installed a version of Mesos (0.20.x) and ZooKeeper, as well as Apache Maven 3.x and JDK 1.6 or higher.

The installation process begins by downloading the release archive from the home page:⁹

curl -L -O https://github.com/mesos/chronos/archive/2.4.0.tar.gz
tar xvf 2.4.0.tar.gz

Once the file is decompressed, note that the main folder contains the POM file required by Maven to perform its tasks.

The directory contents should be similar to the following:

∼/opt/apache/chronos/chronos-2.4.0%> ls -lrt
total 104
drwxr-xr-x@  4 isaacruiz  staff    136 Aug 28  2015 src
-rw-r--r--@  1 isaacruiz  staff  17191 Aug 28  2015 pom.xml
drwxr-xr-x@ 14 isaacruiz  staff    476 Aug 28  2015 docs
-rw-r--r--@  1 isaacruiz  staff   2521 Aug 28  2015 changelog.md
-rw-r--r--@  1 isaacruiz  staff   1165 Aug 28  2015 build.xml
drwxr-xr-x@ 13 isaacruiz  staff    442 Aug 28  2015 bin
-rw-r--r--@  1 isaacruiz  staff   3087 Aug 28  2015 README.md
-rw-r--r--@  1 isaacruiz  staff    837 Aug 28  2015 NOTICE
-rw-r--r--@  1 isaacruiz  staff  11003 Aug 28  2015 LICENSE
-rw-r--r--@  1 isaacruiz  staff    470 Aug 28  2015 Dockerfile

The only thing remaining is to run mvn package to generate the .jar file used to start Chronos.

∼/opt/apache/chronos/chronos-2.4.0%> mvn package
[INFO] Scanning for projects...
[INFO]                                                                         
[INFO] ------------------------------------------------------------------------
[INFO] Building chronos 2.4.0
[INFO] ------------------------------------------------------------------------
Downloading: https://repo1.maven.org/maven2/org/apache/maven/plugins/maven-antrun-plugin/1.7/maven-antrun-plugin-1.7.pom
Downloaded: https://repo1.maven.org/maven2/org/apache/maven/plugins/maven-antrun-plugin/1.7/maven-antrun-plugin-1.7.pom (5 KB at 0.5 KB/sec)
Downloading: https://repo1.maven.org/maven2/org/apache/maven/plugins/maven-antrun-plugin/1.7/maven
...
...
...
Downloaded: https://repo.maven.apache.org/maven2/org/slf4j/slf4j-api/1.6.1/slf4j-api-1.6.1.jar (25 KB at 5.9 KB/sec)
[INFO] Dependency-reduced POM written at: /Users/isaacruiz/opt/apache/chronos/chronos-2.4.0/dependency-reduced-pom.xml
[INFO] Dependency-reduced POM written at: /Users/isaacruiz/opt/apache/chronos/chronos-2.4.0/dependency-reduced-pom.xml
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 03:55 min
[INFO] Finished at: 2016-05-19T01:45:13-05:00
[INFO] Final Memory: 68M/668M
[INFO] ------------------------------------------------------------------------

This step can take time, depending on your Internet connection and your machine’s characteristics.

Run

With the .jar file (found in the target directory), it is possible to start Chronos with the following line:

∼/opt/apache/chronos/chronos-2.4.0%> java -cp target/chronos*.jar org.apache.mesos.chronos.scheduler.Main --master zk://localhost:2181/mesos --zk_hosts localhost:2181

[2016-05-19 01:45:56,621] INFO --------------------- (org.apache.mesos.chronos.scheduler.Main$:26)
[2016-05-19 01:45:56,624] INFO Initializing chronos. (org.apache.mesos.chronos.scheduler.Main$:27)
[2016-05-19 01:45:56,627] INFO --------------------- (org.apache.mesos.chronos.scheduler.Main$:28)
[2016-05-19 01:45:59,109] INFO Wiring up the application (org.apache.mesos.chronos.scheduler.config.MainModule:39)
...
...
2016-05-19 01:46:01,446:3328(0x700002495000):ZOO_INFO@check_events@1703: initiated connection to server [::1:2181]
2016-05-19 01:46:01,449:3328(0x700002495000):ZOO_INFO@check_events@1750: session establishment complete on server [::1:2181], sessionId=0x154c75bb0940008, negotiated timeout=10000
I0519 01:46:01.451170 16019456 group.cpp:349] Group process (group(1)@192.168.1.5:53043) connected to ZooKeeper
I0519 01:46:01.452013 16019456 group.cpp:831] Syncing group operations: queue size (joins, cancels, datas) = (0, 0, 0)

Now Apache Chronos is running.

One Chronos advantage is that it is already mature; it has a web interface for managing scheduled jobs. This web interface is available at the following address:

http://localhost:8080/

Figure 7-17 shows this screen.

Figure 7-17. Chronos web interface

Part of the power of Chronos lies in its API,¹⁰ which lets you interact with it from other integration points. You can easily try it with curl, as follows:

∼/opt/apache/chronos/chronos-2.4.0%> curl -L -X GET http://localhost:8080/scheduler/jobs

[]%

Right now, there are no scheduled jobs, so both the web interface and the API report the same thing: no scheduled tasks.
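
As a sketch of how a job could be registered through the same API, the following Python code builds a job description and the corresponding POST request. It assumes the /scheduler/iso8601 endpoint and the field names described in the Chronos API documentation for the 2.x line; the job name, command, and owner are made-up values, and the request is only built, not sent.

```python
import json
from urllib import request

CHRONOS_URL = "http://localhost:8080"  # the address used in this chapter


def build_job(name, command, schedule, owner, epsilon="PT15M"):
    """Describe a scheduled job; `schedule` uses the R[n]/start/period form."""
    return {
        "name": name,
        "command": command,
        "schedule": schedule,
        "epsilon": epsilon,  # how late a run may still start
        "owner": owner,
        "async": False,
    }


def build_create_request(job, base_url=CHRONOS_URL):
    """Build (but do not send) the POST request that registers the job."""
    return request.Request(
        base_url + "/scheduler/iso8601",  # assumption: Chronos 2.x path
        data=json.dumps(job).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    job = build_job(
        name="sample-job",                      # hypothetical job
        command="echo hello >> /tmp/sample.out",
        schedule="R/2016-05-19T01:45:00Z/PT24H",
        owner="ops@example.com",
    )
    req = build_create_request(job)
    print(req.get_method(), req.full_url)
    print(json.dumps(job, indent=2))
    # request.urlopen(req) would submit it; that needs a running Chronos.
```

Once such a request is submitted to a running Chronos, the /scheduler/jobs call shown earlier should no longer return an empty list.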

Marathon

Marathon is another tool that fits very well with Mesos; although it can be used independently, with Mesos it is even more powerful, and given the integration, it is a lot easier to use.

The Marathon home page is at https://mesosphere.github.io/marathon/ . From this page we can download the latest stable version; at the time of this writing, the version was 1.1.1.

Figure 7-18 shows the Marathon home page.

Figure 7-18. Marathon home page

A quick way to understand Marathon is with this phrase: “A self-serve interface to your cluster. Distributed init for long-running services.”¹¹

You have probably used the init command on a Unix-based operating system; it starts tasks and processes already defined and configured on the operating system’s host.

Marathon has a particular way of managing task execution.¹² It aims to keep each task executed by Mesos 100% available.

Figure 7-19 shows the Marathon architecture. It is based on official documentation at https://mesosphere.github.io/marathon/ .

Figure 7-19. Marathon architecture

In our chart, we can add the Chronos framework.¹³

When Marathon starts, it launches two Chronos instances: one is operated by Marathon, and the other exists because Chronos is itself a framework that can work with Mesos. This also ensures that there will always be two Chronos instances running and ready to receive tasks.

Installation

The main binary is downloaded directly from the Marathon home page. Once the file is decompressed, we can start it (Marathon assumes that Mesos and ZooKeeper are already running).

Before running it, be sure to export the MESOS_NATIVE_JAVA_LIBRARY variable, pointing to the path already detected (if it is not set, the Marathon start file will look in /usr/lib and /usr/local/lib). When running it, check that the current OS user has read permission on system directories.

opt/apache/marathon/marathon-1.1.1/bin%>./start --master local
MESOS_NATIVE_JAVA_LIBRARY is not set. Searching in /usr/lib /usr/local/lib.
MESOS_NATIVE_LIBRARY, MESOS_NATIVE_JAVA_LIBRARY set to '/usr/local/lib/libmesos.dylib'
[2016-05-15 15:23:22,391] INFO Starting Marathon 1.1.1 with --master local (mesosphere.marathon.Main$:main)
[2016-05-15 15:23:26,322] INFO Connecting to ZooKeeper... (mesosphere.marathon.Main$:main)
[2016-05-15 15:23:26,346] INFO Client environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT (org.apache.zookeeper.ZooKeeper:main)
[2016-05-15 15:23:26,347] INFO Client environment:host.name=192.168.1.5 (org.apache.zookeeper.ZooKeeper:main)
[2016-05-15 15:23:26,348] INFO Client environment:java.version=1.8.0_51 (org.apache.zookeeper.ZooKeeper:main)
[2016-05-15 15:23:26,349] INFO Client environment:java.vendor=Oracle Corporation (org.apache.zookeeper.ZooKeeper:main)
[2016-05-15 15:23:26,349] INFO Client environment:java.home=/Library/Java/JavaVirtualMachines/jdk1.8.0_51.jdk/Contents/Home/jre (org.apache.zookeeper.ZooKeeper:main)
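
With Marathon running, a long-running service is described as an "app" and submitted over HTTP. The following is a minimal sketch in Python that builds such a description and the request for the /v2/apps endpoint of the Marathon REST API. The base URL, the app id, and the command are assumptions for illustration (if Chronos already occupies port 8080 on the same host, one of the two has to listen elsewhere), and the request is only built, not sent.

```python
import json
from urllib import request

MARATHON_URL = "http://localhost:8080"  # assumption: adjust to your setup


def build_app(app_id, cmd, instances=2, cpus=0.1, mem=32):
    """Describe a long-running service as a Marathon app definition."""
    return {
        "id": app_id if app_id.startswith("/") else "/" + app_id,
        "cmd": cmd,
        "instances": instances,  # Marathon tries to keep this many copies alive
        "cpus": cpus,            # CPU share per instance
        "mem": mem,              # memory per instance, in megabytes
    }


def build_launch_request(app, base_url=MARATHON_URL):
    """Build (but do not send) the POST request that creates the app."""
    return request.Request(
        base_url + "/v2/apps",
        data=json.dumps(app).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    app = build_app("sample-web", "python3 -m http.server 8000")  # hypothetical
    req = build_launch_request(app)
    print(req.get_method(), req.full_url)
    print(json.dumps(app, indent=2))
```

The instances field is where the "distributed init" idea shows up: the definition states how many copies should exist, and Marathon is responsible for starting replacements when one of them dies.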

Aurora

Aurora is one of the Mesos frameworks. (Do you remember the diagram?) Mesos relies on frameworks to provide a scheduler and to run tasks through it.

Aurora lets you run applications and services across a set of machines. Its primary responsibility is to keep them running as long as possible; the ideal is “always.”

Figure 7-20 shows the Aurora project home page.

Figure 7-20. Aurora home page http://aurora.apache.org/

If a machine fails, Aurora is responsible for rescheduling the tasks that were running on it, provided that one of the remaining machines is available.
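
The idea can be illustrated with a toy model in Python. This is not Aurora's scheduling logic or API; the function name, the task and machine names, and the "least-loaded machine first" rule are all assumptions chosen to keep the sketch small.

```python
def reschedule(assignments, failed, healthy):
    """Return a new task -> machine mapping after `failed` goes down.

    Tasks from the failed machine move to the least-loaded healthy machine;
    if no machine is left, they are mapped to None (still pending).
    """
    result = dict(assignments)
    # Count how many tasks each healthy machine already runs.
    load = {machine: 0 for machine in healthy}
    for machine in assignments.values():
        if machine in load:
            load[machine] += 1
    for task in sorted(assignments):
        if assignments[task] != failed:
            continue  # tasks on healthy machines stay where they are
        if not load:
            result[task] = None
            continue
        target = min(load, key=lambda m: (load[m], m))
        result[task] = target
        load[target] += 1
    return result


if __name__ == "__main__":
    before = {"web-0": "m1", "web-1": "m2", "web-2": "m1"}
    print(reschedule(before, failed="m1", healthy=["m2", "m3"]))
```

The point of the sketch is the contract rather than the placement rule: the user declares which tasks must exist, and the framework decides where they run when the set of machines changes.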

The use of Aurora¹⁴ at Twitter is the most widespread use case. It is used as a reference on the Aurora main page; the introduction¹⁵ is provided by a Twitter senior staff engineer.

Overview

Figure 7-21 shows the components of Aurora; this is based on the official documentation ( http://aurora.apache.org/ ).

Figure 7-21. Aurora components

At the time of this writing, Aurora was at version 0.13.0.

Let’s Talk About Clusters

Apache Mesos was designed to work in clusters. These Mesos clusters require a main server named MASTER and several secondary servers called SLAVES. Original, isn’t it?

Figure 7-22 shows the relationship between the coordinator and master-slave servers. It is based on official Mesos documentation.

Figure 7-22. Master and slaves server distribution

There are already several frameworks available for use with Mesos. The true potential lies in the ability to create special Mesos frameworks using any of the supported languages: Java, Scala, Python, and C++.

One of these ready-to-use frameworks is the Apache Kafka framework.

Apache Mesos and Apache Kafka

Apache Kafka is one of the frameworks ready to be used with Mesos. There is a GitHub repository in charge of this project. Figure 7-23 shows the main screen of the project on GitHub.

Figure 7-23. Mesos/Kafka GitHub project

Like any Mesos framework, Kafka requires a scheduler to run tasks. In Figure 7-24, we can see the relationship between the scheduler and the executor, the basic components of any framework in Mesos. The figure shows an already familiar scheme, applied here to Apache Kafka; it is based on official documentation.¹⁶

Figure 7-24. The Apache Kafka framework and its interaction with Apache Mesos
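
The division of labor between the two components can be sketched with a toy model in Python. It is not the Mesos API: the class names (Offer, Task, ToyScheduler, ToyExecutor), the first-fit rule, and the resource numbers are assumptions for illustration. The scheduler receives resource offers and decides which pending tasks fit; the executor runs what was placed on its machine.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    agent: str   # machine that offers the resources
    cpus: float
    mem: int     # megabytes


@dataclass
class Task:
    name: str
    cpus: float
    mem: int


class ToyScheduler:
    """Framework side: decides which offers to use for pending tasks."""

    def __init__(self, tasks):
        self.pending = list(tasks)

    def resource_offers(self, offers):
        """Place pending tasks first-fit; return (agent, task) launches."""
        launches = []
        for offer in offers:
            cpus, mem = offer.cpus, offer.mem
            for task in list(self.pending):
                if task.cpus <= cpus and task.mem <= mem:
                    cpus -= task.cpus
                    mem -= task.mem
                    self.pending.remove(task)
                    launches.append((offer.agent, task))
        return launches


class ToyExecutor:
    """Machine side: runs the tasks that the scheduler placed here."""

    def __init__(self, agent):
        self.agent = agent

    def launch(self, task):
        return f"{task.name} running on {self.agent}"


if __name__ == "__main__":
    scheduler = ToyScheduler([Task("broker-0", 1.0, 2048), Task("broker-1", 1.0, 2048)])
    offers = [Offer("slave-1", 2.0, 3072), Offer("slave-2", 1.0, 4096)]
    for agent, task in scheduler.resource_offers(offers):
        print(ToyExecutor(agent).launch(task))
```

In the Kafka case, the tasks are brokers: the scheduler accepts offers with enough CPU and memory for a broker, and an executor on the chosen machine starts it.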

Before building the binaries, check that you have already installed Java and Gradle (the automated build tool for Java projects).

JDK Validation

JDK validation is easy; you just need to ask for the active version.

/opt/apache/kafka> java -version
java version "1.8.0_51"
Java(TM) SE Runtime Environment (build 1.8.0_51-b16)
Java HotSpot(TM) 64-Bit Server VM (build 25.51-b03, mixed mode)

Gradle Validation

Gradle validation is just as easy. In this example, Gradle version 2.9 is installed:

∼/opt/apache/kafka%> gradle -version

------------------------------------------------------------
Gradle 2.9
------------------------------------------------------------

Build time:   2015-11-17 07:02:17 UTC
Build number: none
Revision:     b463d7980c40d44c4657dc80025275b84a29e31f

Groovy:       2.4.4
Ant:          Apache Ant(TM) version 1.9.3 compiled on December 23 2013
JVM:          1.8.0_51 (Oracle Corporation 25.51-b03)

{16-05-10 17:55}Isaacs-MacBook-Pro-2:∼/opt/apache/kafka isaacruiz%

Export libmesos Location

Follow the instructions provided in the guide. When compiling Mesos, several native libraries are generated. Find the file called libmesos.so (or libmesos.dylib if you’re on a Mac with OS X) and export its location, or declare it in the kafka-mesos.sh file:

# export MESOS_NATIVE_JAVA_LIBRARY=/usr/local/lib/libmesos.so
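
If you are not sure where the library ended up, a small search helps. The following is a minimal sketch in Python; the function names are hypothetical, and the default directories are simply the two that the Marathon start script reported searching earlier in this chapter, so extend the list to match your installation.

```python
import os
import sys

# The two directories that the Marathon start script searched in this chapter.
DEFAULT_DIRS = ("/usr/lib", "/usr/local/lib")


def library_name(platform=sys.platform):
    """libmesos.dylib on OS X, libmesos.so elsewhere."""
    return "libmesos.dylib" if platform == "darwin" else "libmesos.so"


def find_libmesos(search_dirs=DEFAULT_DIRS, platform=sys.platform):
    """Return the first existing libmesos path, or None if it is not found."""
    name = library_name(platform)
    for directory in search_dirs:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None


if __name__ == "__main__":
    # An already exported variable wins over the search.
    path = os.environ.get("MESOS_NATIVE_JAVA_LIBRARY") or find_libmesos()
    if path:
        print("export MESOS_NATIVE_JAVA_LIBRARY=" + path)
    else:
        print("libmesos not found; add your Mesos build directory to the search")
```

The printed export line can be pasted into the shell or into kafka-mesos.sh, and the same value serves Marathon and Spark.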

Now clone the repository. The console output will be similar to this:

∼/opt/apache/kafka%> git clone https://github.com/mesos/kafka
Cloning into 'kafka'...
remote: Counting objects: 4343, done.
remote: Total 4343 (delta 0), reused 0 (delta 0), pack-reused 4343
Receiving objects: 100% (4343/4343), 959.33 KiB | 44.00 KiB/s, done.
Resolving deltas: 100% (1881/1881), done.
Checking connectivity... done.

Once the repository is cloned, proceed to build the Kafka-Mesos binaries with the command:

./gradlew jar

Gradle places the artifacts required for compilation in directories according to the invocation structure. This task may take a while, depending on the script dependencies.

∼/opt/apache/kafka/kafka@master%> ./gradlew jar
Downloading https://services.gradle.org/distributions/gradle-2.8-all.zip
........................................................................
Unzipping /Users/isaacruiz/.gradle/wrapper/dists/gradle-2.8-all/ah86jmo43de9lfa8xg9ux3c4h/gradle-2.8-all.zip to /Users/isaacruiz/.gradle/wrapper/dists/gradle-2.8-all/ah86jmo43de9lfa8xg9ux3c4h
Set executable permissions for: /Users/isaacruiz/.gradle/wrapper/dists/gradle-2.8-all/ah86jmo43de9lfa8xg9ux3c4h/gradle-2.8/bin/gradle
:compileJava UP-TO-DATE
:compileScala
Download https://repo1.maven.org/maven2/org/scala-lang/scala-library/2.10.6/scala-library-2.10.6.pom
Download https://repo1.maven.org/maven2/org/apache/mesos/mesos/0.25.0/mesos-0.25.0.pom
...
Download https://repo1.maven.org/maven2/org/scala-lang/scala-reflect/2.10.6/scala-reflect-2.10.6.jar
:processResources UP-TO-DATE
:classes
:compileTestJava UP-TO-DATE
:compileTestScala
[ant:scalac] Element '/Users/isaacruiz/opt/apache/kafka/kafka/out/gradle/resources/main' does not exist.
:processTestResources UP-TO-DATE
:testClasses
:test
:jar

BUILD SUCCESSFUL

Total time: 29 mins 54.98 secs

This build could be faster, please consider using the Gradle Daemon: https://docs.gradle.org/2.8/userguide/gradle_daemon.html
{16-05-10 18:33}Isaacs-MacBook-Pro-2:∼/opt/apache/kafka/kafka@master isaacruiz%

After the Gradle compilation, you should have a structure similar to the following:

∼/opt/apache/kafka/kafka@master%> ls -lrt
total 34360
drwxr-xr-x  6 isaacruiz  staff       204 May 10 18:02 src
-rwxr-xr-x  1 isaacruiz  staff      1634 May 10 18:02 quickstart.sh
drwxr-xr-x  3 isaacruiz  staff       102 May 10 18:02 lib
-rwxr-xr-x  1 isaacruiz  staff       307 May 10 18:02 kafka-mesos.sh
-rw-r--r--  1 isaacruiz  staff       422 May 10 18:02 kafka-mesos.properties
-rwxr-xr-x  1 isaacruiz  staff      4971 May 10 18:02 gradlew
drwxr-xr-x  3 isaacruiz  staff       102 May 10 18:02 gradle
-rw-r--r--  1 isaacruiz  staff      1769 May 10 18:02 build.gradle
-rw-r--r--  1 isaacruiz  staff     29334 May 10 18:02 README.md
-rw-r--r--  1 isaacruiz  staff     11325 May 10 18:02 LICENSE
drwxr-xr-x  3 isaacruiz  staff       102 May 10 18:25 out
-rw-r--r--  1 isaacruiz  staff  17522191 May 10 18:33 kafka-mesos-0.9.5.0.jar

Now you can use the main shell script. Use the help command to learn about the valid commands.

{16-05-10 19:05}Isaacs-MacBook-Pro-2:∼/opt/apache/kafka/kafka> ./kafka-mesos.sh help     
Usage: <command>

Commands:
  help [cmd [cmd]] - print general or command-specific help
  scheduler        - start scheduler
  broker           - broker management commands
  topic            - topic management commands

Run `help <command>` to see details of specific command
{16-05-10 21:25}Isaacs-MacBook-Pro-2:∼/opt/apache/kafka/kafka%>

Or you can start the scheduler directly:

{16-05-10 21:25}Isaacs-MacBook-Pro-2:∼/opt/apache/kafka/kafka%> ./kafka-mesos.sh scheduler
Loading config defaults from kafka-mesos.properties
2016-05-10 21:25:33,573 [main] INFO  ly.stealth.mesos.kafka.Scheduler$  - Starting Scheduler$:
debug: true, storage: zk:/mesos-kafka-scheduler
mesos: master=master:5050, user=vagrant, principal=<none>, secret=<none>
framework: name=kafka, role=*, timeout=30d
api: http://192.168.3.5:7000, bind-address: <all>, zk: master:2181, jre: <none>

Mesos and Apache Spark

Since its earliest releases, Spark has been ready for Mesos. The web page that explains how to perform the integration is http://spark.apache.org/docs/latest/running-on-mesos.html.

It’s easy to start Spark to work with Mesos. Just be careful when establishing the libmesos file location (the native library compiled earlier).

First, validate that Mesos is running by opening a browser and checking that your host is active at the following address:

http://MESOS_HOST:5050/

The next step is to locate the file called libmesos.so (or libmesos.dylib if you’re on a Mac with OS X) and make it available as an environment variable:

export MESOS_NATIVE_JAVA_LIBRARY=<path to libmesos.so>

Once this is done, try running this line:

∼/opt/apache/spark/spark-1.6.1-bin-hadoop2.6/bin%> ./spark-shell --master mesos://MESOS_HOST:5050

If you receive this error:

Failed to load native library from Mesos
Failed to load native Mesos library from
/Users/your_user/Library/Java/Extensions:
/Users/your_user/Library/Java/Extensions/Library/Java/Extensions:
/Network/Library/Java/Extensions:
/System/Library/Java/Extensions:
/usr/lib/java:.

Copy the libmesos.so file (or libmesos.dylib on OS X) to any of these paths, preferably one within your user directory to avoid conflicts with another version after you compile. If the path does not exist, you must create it.

The following is the successful output. The prompt appears available, and more importantly, Mesos recognizes an active framework.

{16-05-15 12:47}Isaacs-MacBook-Pro-2:∼/opt/apache/spark/spark-1.6.1-bin-hadoop2.6/bin isaacruiz% ./spark-shell --master mesos://localhost:5050
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Using Spark's repl log4j profile: org/apache/spark/log4j-defaults-repl.properties
To adjust logging level use sc.setLogLevel("INFO")
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.6.1
      /_/
Using Scala version 2.10.5 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_51)
Type in expressions to have them evaluated.
Type :help for more information.
I0515 12:48:34.810873 62210048 sched.cpp:222] Version: 0.28.1
I0515 12:48:34.842321 57393152 sched.cpp:326] New master detected at [email protected]:5050
I0515 12:48:34.843389 57393152 sched.cpp:336] No credentials provided. Attempting to register without authentication
I0515 12:48:34.876587 60612608 sched.cpp:703] Framework registered with 7f7d9b4b-c5e4-48be-bbb7-78e6fac701ea-0000
Spark context available as sc.
SQL context available as sqlContext.
scala>

From the web console, we can see the running frameworks as shown in Figure 7-25.

Figure 7-25. Apache Spark already appears in the list of active frameworks
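
The same check can be scripted, because the master also publishes its state as JSON. The following is a minimal sketch in Python; it assumes that the state document is served at /master/state and contains a "frameworks" list whose entries carry "name" and "active" fields, and the sample document in the demo is made up rather than captured from a real master.

```python
import json
from urllib import request


def active_frameworks(state):
    """Names of the frameworks that the master reports as active."""
    return [fw["name"] for fw in state.get("frameworks", []) if fw.get("active")]


def fetch_state(master="http://localhost:5050"):
    """Download the master state; the /master/state path is an assumption."""
    with request.urlopen(master + "/master/state") as response:
        return json.load(response)


if __name__ == "__main__":
    # Made-up sample so that the sketch runs without a Mesos master.
    sample = {
        "frameworks": [
            {"name": "Spark shell", "active": True},
            {"name": "old-job", "active": False},
        ]
    }
    print(active_frameworks(sample))
    # Against a live master: print(active_frameworks(fetch_state()))
```

A check like this is handy in start-up scripts that should wait until a framework has registered before submitting work to it.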

The Best Is Yet to Come

At the beginning of this chapter, we mentioned that Mesos is still in pre-1.0 versions. All of Mesos’s versatility and usefulness is already available in version 0.28.0. (Imagine when we get to version 1.0!) The expectation level is high, and no wonder: in April 2016, Wired magazine published an article titled “You want to build an empire like Google’s? This is your OS.”¹⁷ Among other things, the article mentions that Google runs on architectures similar to Apache Mesos. It also mentions Mesos’s history and its creator, Ben Hindman; part of the original design is to run software across a data center in the same way that it runs on a mobile device or a laptop.

Through Mesosphere,¹⁸ the company launched by Hindman, any company (and any of us) can build an infrastructure similar to Google’s.

If anything is missing to give new impetus to startups, Mesos probably covers it.

At MesosCon Europe in 2015, Hindman presented the “State of Apache Mesos”¹⁹ and a brief summary with three main indicators of the growing Mesos community:

New users: Cisco, Apple, Yelp, Ericsson
New frameworks: Elastic, Kibana, MySQL, rial, CRATE, and Logstash
New books: This book is among the proof.

Every month and a half, a minor version (now at 0.28) is released; by the end of 2016, it will likely be at version 0.40 or 0.50. Mesos surely has many surprises ahead; as we say, the best is yet to come.

And if that is not enough, as shown in Figure 7-26, Mesos is designed to run both in the cloud and on physical machines, so the hardware layer is transparent.

Figure 7-26. Mesos can run on both physical machines and the cloud

In 2016, presentations at #MesosCon North America²⁰ highlighted growing interest in Mesos.

Summary

In this chapter, you learned a little more about distributed systems. Now you know how difficult it is to try to build one. You learned that Mesos is a general-purpose cluster manager. You know its architecture and part of its ecosystem. Within this ecosystem, you know about the main frameworks that Mesos interacts with to increase its potential. The chapter also gave an overview of why Mesos is considered an SDK for distributed environments.

Footnotes

1 https://twitter.com/isragaytan/status/736376936562233344

2 https://blog.golang.org/concurrency-is-not-parallelism

3 https://www.infoq.com/presentations/topics-distributed-systems

4 http://videlalvaro.github.io/2015/12/learning-about-distributed-systems.html

5 “Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center” by Benjamin Hindman, Andy Konwinski, Matei Zaharia, Ali Ghodsi, Anthony D. Joseph, Randy Katz, Scott Shenker, Ion Stoica, http://people.csail.mit.edu/matei/papers/2011/nsdi_mesos.pdf

6 http://mesos.apache.org/documentation/latest/architecture/

7 http://people.csail.mit.edu/matei/papers/2011/nsdi_mesos.pdf

8 Developing Frameworks for Apache Mesos. https://www.youtube.com/watch?v=ZYoXodSlycA

9 https://mesos.github.io/chronos/docs/

10 https://mesos.github.io/chronos/docs/api.html

11 Simplifying with Mesos and Marathon. https://www.youtube.com/watch?v=OgVaQPYEsVo

12 https://mesosphere.github.io/marathon/docs/

13 https://github.com/mesos/chronos

14 Introduction to Apache Aurora. https://www.youtube.com/watch?v=asd_h6VzaJc

15 Operating Aurora and Mesos at Twitter. https://www.youtube.com/watch?v=E4lxX6epM_U

16 https://mesosphere.com/blog/2015/07/16/making-apache-kafka-elastic-with-apache-mesos/

17 http://www.wired.com/2016/04/want-build-empire-like-googles-os/