Chapter 1. Getting Started with Zabbix

It's Friday night, and you are at a party outside the city with old friends. After a few beers, it looks as if this is going to be a great party, when suddenly your phone rings. A customer can't access some critical server that absolutely has to be available as soon as possible. You try to connect to the server using SSH, only to discover that the customer is right—it can't be accessed.

As driving after those few beers would quite likely lead to an inoperable server for quite some time, you get a taxi—expensive because of the distance (while many modern systems have out-of-band management cards installed that might have helped a bit in such a situation, our hypothetical administrator does not have one available). After arriving at the server room, you find out that some log files have been growing more than usual over the past few weeks and have filled up the hard drive.

While the preceding scenario is very simplistic, something similar has probably happened to most IT workers at one or another point in their careers. Most will have implemented a simple system monitoring and reporting solution soon after that.

We will learn how to set up and configure one such monitoring system—Zabbix. In this very first chapter, we will:

  • Decide which Zabbix version to use
  • Set up Zabbix either from packages or from the source
  • Configure the Zabbix frontend

The first steps in monitoring

Situations similar to the one just described are actually more common than desired. A system fault that had no symptoms visible before is relatively rare. A subsection of UNIX Administration Horror Stories (http://www-uxsup.csx.cam.ac.uk/misc/horror.txt) that only contains stories about faults that weren't noticed in time could probably be compiled easily.

As experience shows, problems tend to happen when we are least equipped to solve them. To work with them on our terms, we turn to a class of software commonly referred to as network monitoring software. Such software usually allows us to constantly monitor things happening in a computer network using one or more methods and notify the persons responsible, if a metric passes a defined threshold.

One of the first monitoring solutions most administrators implement is a simple shell script invoked from a crontab that checks some basic parameters such as disk usage or some service state, such as an Apache server. As the server and monitored-parameter count grows, a neat and clean script system starts to grow into a performance-hogging script hairball that costs more time in upkeep than it saves. While the do-it-yourself crowd claims that nobody needs dedicated software for most tasks (monitoring included), most administrators will disagree as soon as they have to add switches, UPSes, routers, IP cameras, and a myriad of other devices to the swarm of monitored objects.

So, what basic functionality can one expect from a monitoring solution? Let's take a look:

  • Data gathering: This is where everything starts. Usually, data is gathered using various methods, including Simple Network Management Protocol (SNMP), agents, and Intelligent Platform Management Interface (IPMI).
  • Alerting: Gathered data can be compared to thresholds and alerts sent out when required using different channels, such as e-mail or SMS.
  • Data storage: Once we have gathered the data, it doesn't make sense to throw it away, so we will often want to store it for later analysis.
  • Visualization: Humans are better at distinguishing visualized data than raw numbers, especially when there's a lot of data. As we have data already gathered and stored, it is easy to generate simple graphs from it.

Sounds simple? That's because it is. But then we start to want more features, such as easy and efficient configuration, escalations, and permission delegation. If we sit down and start listing the things we want to keep an eye out for, it may turn out that that area of interest extends beyond the network, for example, a hard drive that has Self-Monitoring, Analysis, and Reporting Technology (SMART) errors logged, an application that has too many threads, or a UPS that has one phase overloaded. It is much easier to manage the monitoring of all these different problem categories from a single configuration point.

In the quest for a manageable monitoring system, wondrous adventurers stumbled upon collections of scripts much like the way they themselves implemented obscure and not-so-obscure workstation-level software and heavy, expensive monitoring systems from big vendors.

Many went with a different category—free software. We will look at a free software monitoring solution, Zabbix.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset