Monitoring the CPU

Modern CPUs rarely become the performance bottleneck. Processing power is still far ahead of the data transfer speeds of I/O devices and networks, so the CPU generally spends a large part of its time waiting for synchronous IO to fetch data from the disk or from a network device. This makes exact CPU usage confusing to track: most of the time you will see high CPU use reported when, in reality, the CPU is waiting for data to become available.

In this recipe, we will focus on tracking CPU performance. We will look at some common tools used to get CPU usage details.

Getting ready

You may need sudo privileges to execute some commands.

How to do it…

Let's start with the most commonly used monitoring command, top. The top command shows a summarized view of various resource utilization metrics, including CPU usage, memory and swap utilization, and running processes with their respective resource consumption. All metrics are refreshed at a predefined interval, which is three seconds by default.

Follow these steps to monitor the CPU:

  1. To start top, simply type in top in your command prompt and press Enter:
    $ top
  2. As you can see in the preceding screenshot, a single Python process is using 80% of CPU time. The CPU is still underutilized, with 58% of its time spent idle:

    Optionally, you can use the htop command. This is the same process monitor as top, but a little easier to use, and it provides text graphs for CPU and memory utilization. You will need to install htop separately:

    $ sudo apt-get install htop    # one time command
    $ htop
  3. While top is used to get an overview of all running processes, the command pidstat can be used to monitor CPU utilization by an individual process or program. Use the following command to monitor CPU consumed by MySQL (or any other task name):
    $ pidstat -C mysql
  4. With pidstat, you can also query statistics for a specific process by its process ID or PID, as follows:
    $ pidstat -p 1134
  5. The other useful command is vmstat. This is primarily used to get details on virtual memory usage but also includes some CPU metrics similar to the top command:
    $ vmstat
  6. Another command for getting processor statistics is mpstat. This returns the same statistics as top or vmstat but is limited to CPU statistics. The mpstat command is not part of the default Ubuntu installation; you need to install the sysstat package to use it:
    $ sudo apt-get install sysstat -y
  7. By default, mpstat returns combined averaged stats for all CPUs. Flag -P can be used to get details of specific CPUs. The following command will display statistics for processor one (0) and processor two (1), and update at an interval of 3 seconds:
    $ mpstat -P 0,1 3
  8. One more command, sar (System Activity Reporter), gives details of system performance.

    The following command will extract the CPU metrics recorded by sar. The -u flag limits the details to CPU only, and -P ALL displays data for each available CPU separately. By default, the sar command already limits its output to CPU details:

    $ sar -u -P ALL
  9. To get current CPU utilization using sar, specify the interval and, optionally, the count. The following command will output 5 records at an interval of 2 seconds:
    $ sar -u 2 5
  10. All this data can be stored in a file specified by the (-o) flag. The following command will create a file named sarReport in your current directory, with details of CPU utilization:
    $ sar -u -o sarReport 3 5

Other options include the -u flag, to limit the counters to CPU, and the -A flag, to get system-wide counters that include network, disk, interrupts, and many more. Check the sar manual (man sar) for the flags that match your desired counters.
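A file written with -o is binary, so it has to be read back through sar itself. The following is a minimal sketch of that round trip, assuming the sysstat package is installed; the function names record_cpu and replay_cpu are hypothetical helpers, not part of sysstat. The -f flag tells sar to read samples from a file instead of sampling the live system.

```shell
# record_cpu FILE INTERVAL COUNT
# Sample CPU utilization and store the samples in FILE (binary format).
record_cpu() {
    sar -u -o "$1" "$2" "$3" > /dev/null
}

# replay_cpu FILE
# Print the CPU samples previously stored in FILE in readable form.
replay_cpu() {
    sar -u -f "$1"
}

# Usage (matches the sarReport example above):
#   record_cpu sarReport 3 5
#   replay_cpu sarReport
```

Keeping the recording and the reporting separate means the same file can later be replayed with other flags, for example -P ALL for a per-CPU view.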

How it works…

This recipe covers some well-known CPU monitoring tools, from the very commonly used top command to the background metric logging tool, sar.

In the preceding example, we used top to get a quick summarized view of the current state of the system. By default, top shows the average CPU usage, listed in the third row of its output. If you have more than one CPU, their usage is combined and displayed in a single row. You can press 1 while top is running to get details of all available CPUs. This should expand the CPU row to list each CPU separately. The following screenshot shows two CPUs available on my virtual machine:

The CPU row shows various different categories of CPU utilization, and the following is a list of their brief descriptions:

  • us: Time spent in running user space processes. This reflects the CPU consumption by your application.
  • sy: Time spent running kernel code, such as system calls made on behalf of processes. A higher number here can indicate too many processes, with the CPU spending more time on process scheduling.
  • ni: Time spent running user space processes whose priority has been lowered with a positive nice value.
  • id: Indicates the time spent in idle mode, where the CPU is doing nothing.
  • wa: Waiting for IO. This is time the CPU sat idle while IO operations were still outstanding. A higher value here means slow IO is holding your workload back. Try improving IO performance or reducing IO at application level.
  • hi/si: Time spent in hardware interrupts or software interrupts.
  • st: Stolen CPU cycles. The hypervisor assigned these CPU cycles to another virtual machine. If you see a higher number in this field, try reducing the number of virtual machines on the host. If you are using a cloud service, try to get a new server, or change your service provider.
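On Linux, these categories come from cumulative counters that the kernel exposes in /proc/stat. The following is a rough sketch of how the percentages can be derived from two samples of the aggregate cpu line. It assumes the usual field order (user, nice, system, idle, iowait, irq, softirq, steal) and ignores the trailing guest fields; the function names cpu_breakdown and sample_cpu_breakdown are hypothetical.

```shell
# cpu_breakdown EARLIER_LINE LATER_LINE
# Print the share of CPU time per category between two "cpu ..." lines
# taken from /proc/stat. Fields 2-9 are user, nice, system, idle, iowait,
# irq, softirq and steal; they are labelled with the names top uses.
cpu_breakdown() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        NR == 1 { for (i = 2; i <= 9; i++) prev[i] = $i }
        NR == 2 {
            split("us ni sy id wa hi si st", name, " ")
            for (i = 2; i <= 9; i++) { delta[i] = $i - prev[i]; total += delta[i] }
            if (total <= 0) { print "no CPU time elapsed between samples"; exit 1 }
            for (i = 2; i <= 9; i++)
                printf "%s%s=%.1f", (i > 2 ? " " : ""), name[i - 1], 100 * delta[i] / total
            printf "\n"
        }'
}

# sample_cpu_breakdown
# Take two samples of the aggregate "cpu" line one second apart.
sample_cpu_breakdown() {
    local first second
    first=$(grep '^cpu ' /proc/stat)
    sleep 1
    second=$(grep '^cpu ' /proc/stat)
    cpu_breakdown "$first" "$second"
}

# Usage:
#   sample_cpu_breakdown
```

Because the counters only ever increase, a single reading says little on its own; it is the difference between two readings that gives utilization for that interval.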

The second metric shown is the process level CPU utilization. This is listed in a tabular format under the column head, %CPU, and is the percentage of CPU used by each process. By default, the top output is automatically sorted in descending order of CPU utilization, so the processes using the most CPU are listed at the top. Another column, named TIME+, displays the total CPU time used by each process. Check the processes section on the screen, which should be similar to the following screenshot:

If you look at the processes listed by top, you should see that top itself appears in the process list. It runs as a separate process and also consumes CPU cycles.

To get help on the top screen, press h; this will show you various key combinations to modify top output. For additional details, check out the manual pages with the command, man top. When you are done with top, press q to exit, or use the exit combination, Ctrl + C.
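top can also run without its interactive screen, which is handy in scripts or when saving output to a file. The sketch below assumes the procps version of top shipped with Ubuntu, where -b selects batch mode and -n sets the number of refreshes; the function name top_snapshot is hypothetical.

```shell
# top_snapshot [LINES]
# Print a single non-interactive refresh of top, truncated to LINES lines
# (default 12: the summary rows plus the first few processes).
top_snapshot() {
    top -b -n 1 | head -n "${1:-12}"
}

# Usage:
#   top_snapshot
#   top_snapshot 20 > top-snapshot.txt
#
# To log several refreshes, set the delay with -d (in seconds):
#   top -b -n 3 -d 5 > top.log
```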

With top, you can get a list of the processes or tasks that are consuming most of the CPU time. To get more details of these tasks, you can use the pidstat command. By default, pidstat shows CPU statistics. It can be used with a process name or process ID (PID). With pidstat, you can also query memory usage, IO statistics, child processes, and various other process related details. Check the manual page for pidstat using the command man pidstat.
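As an illustration of those extra reports, the following sketch combines CPU (-u), memory (-r), and IO (-d) statistics for all processes with a given name. It assumes pidstat from the sysstat package and pgrep from procps are available; the function name pidstat_for is hypothetical.

```shell
# pidstat_for NAME [INTERVAL [COUNT]]
# Report CPU, memory and disk IO statistics for every process whose
# name matches NAME exactly. Defaults: 3 reports at 1 second intervals.
pidstat_for() {
    local pids
    # pgrep -x matches the exact name; -d, joins the PIDs with commas,
    # which is the list format that pidstat -p accepts.
    pids=$(pgrep -d, -x "$1") || { echo "no process named $1" >&2; return 1; }
    pidstat -u -r -d -p "$pids" "${2:-1}" "${3:-3}"
}

# Usage:
#   pidstat_for mysqld
#   pidstat_for mysqld 5 12     # one report every 5 seconds, 12 times
```

IO figures for processes owned by other users may require sudo privileges.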

Both top and pidstat give a summarized view of CPU utilization. The top output is refreshed in place at a specific interval, so you cannot extract utilization details over a specific time period. This is where another handy command, vmstat, comes in. When run without any parameters, vmstat outputs a single line with memory and CPU utilization, but you can ask vmstat to run indefinitely and print the latest metrics at specific intervals using the delay parameter. All the output lines are preserved and can be used to compare the system stats for a given period. The following command will render updated metrics every 5 seconds:

$ vmstat 5

Optionally, specify the count after the delay parameter to close vmstat after a specific number of repetitions. The following command will update the stats 5 times at 1 second intervals and then exit:

$ vmstat 1 5
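Because every vmstat line is preserved, the output can be summarized with standard text tools. The following is a minimal sketch that averages the idle (id) and IO wait (wa) columns. It assumes the usual vmstat layout, with a column header row that starts with r, and it skips the first report because vmstat documents that line as averages since the last reboot. The function name vmstat_cpu_average is hypothetical.

```shell
# vmstat_cpu_average
# Read vmstat output on stdin and print the average of the id and wa
# columns. Columns are located by header name, not by position.
vmstat_cpu_average() {
    awk '
        $1 == "r" && !have_header {        # header row: r b swpd ... us sy id wa st
            for (i = 1; i <= NF; i++) col[$i] = i
            have_header = 1
            next
        }
        have_header && $1 ~ /^[0-9]+$/ {   # data rows start with a number
            if (++rows == 1) next          # first report: averages since boot
            id_sum += $(col["id"]); wa_sum += $(col["wa"]); n++
        }
        END {
            if (n == 0) { print "not enough samples"; exit 1 }
            printf "samples=%d avg_id=%.1f avg_wa=%.1f\n", n, id_sum / n, wa_sum / n
        }'
}

# Usage: six reports at 1 second intervals, the first of which is discarded.
#   vmstat 1 6 | vmstat_cpu_average
```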

The details provided by vmstat are quite useful for real-time monitoring. The sar tool goes a step further: it stores this data in log files so that you can extract specific details whenever needed. sar collects data over a period of time from various internal counters maintained by the Linux kernel. Running sar without any parameters shows the data recorded so far in the current day's saved file. The data is stored in a binary format in the /var/log/sysstat directory. You may need to enable data collection in the /etc/default/sysstat file. When stats collection is enabled, sar automatically collects data every 10 minutes.

Like mpstat and pidstat, sar is available from the sysstat package. Along with sar, sysstat includes two utilities: sa1, which records daily system activity data in a binary format, and sa2, which extracts that data to a human readable format. Check the manual pages for both commands to get more details.
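Enabling collection usually means changing one setting. The sketch below assumes the Debian and Ubuntu packaging of sysstat, where /etc/default/sysstat contains a line reading ENABLED="false"; the function name enable_sysstat_collection is hypothetical, and the file path is a parameter so that the change can be tried on a copy first.

```shell
# enable_sysstat_collection [FILE]
# Switch ENABLED="false" to ENABLED="true" in a sysstat defaults file.
enable_sysstat_collection() {
    local conf
    conf=${1:-/etc/default/sysstat}
    sed -i 's/^ENABLED="false"/ENABLED="true"/' "$conf"
}

# Usage on a real system (assumes a systemd-based Ubuntu release):
#   sudo sed -i 's/^ENABLED="false"/ENABLED="true"/' /etc/default/sysstat
#   sudo systemctl restart sysstat
#
# Once data has been collected, limit a report to a time window
# with -s (start) and -e (end):
#   sar -u -s 09:00:00 -e 10:00:00
```

Older daily files in /var/log/sysstat can be read with the -f flag; check the directory for the exact file names, as the naming scheme depends on the sysstat version.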

There's more…

Similar to sar, one more well-known tool is collectd. It gathers and stores system statistics, which can later be used to plot graphs.

See also

