Modern CPUs are rarely the performance bottleneck. Processing power is still far ahead of the data transfer speeds of I/O devices and networks, so the CPU generally spends a large part of its time waiting for synchronous I/O to fetch data from a disk or a network device. This makes tracking exact CPU usage confusing: tools will often report high CPU use when, in reality, the CPU is waiting for data to become available.
In this recipe, we will focus on tracking CPU performance. We will look at some common tools used to get CPU usage details.
Let's start with the most commonly used monitoring command, top. The top command shows a summarized view of various resource utilization metrics, including CPU usage, memory and swap utilization, and running processes with their respective resource consumption. All metrics are updated at a predefined interval of three seconds.
Follow these steps to monitor the CPU:
Type top in your command prompt and press Enter:
$ top
Optionally, you can use the htop command. This is a process monitor like top, but a little easier to use, and it provides text graphs for CPU and memory utilization. You will need to install htop separately:
$ sudo apt-get install htop # one time command
$ htop
The pidstat command can be used to monitor CPU utilization by an individual process or program. Use the following command to monitor CPU consumed by MySQL (or any other task name):
$ pidstat -C mysql
With pidstat, you can also query statistics for a specific process by its process ID (PID), as follows:
$ pidstat -p 1134
Another useful command is vmstat. This is primarily used to get details on virtual memory usage but also includes some CPU metrics similar to the top command.
Yet another option is mpstat. This returns the same statistics as top or vmstat but is limited to CPU statistics. mpstat is not a part of the default Ubuntu installation; you need to install the sysstat package to use the mpstat command:
$ sudo apt-get install sysstat -y
By default, mpstat returns combined averaged stats for all CPUs. The -P flag can be used to get details of specific CPUs. The following command will display statistics for processor one (0) and processor two (1), and update at an interval of 3 seconds:
$ mpstat -P 0,1 3
One more command, sar (System Activity Reporter), gives details of system performance. The following command will extract the CPU metrics recorded by sar. The -u flag limits details to CPU only, and -P ALL displays data for all available CPUs separately. By default, the sar command limits the output to CPU details only:
$ sar -u -P ALL
To get current statistics with sar, specify the interval and, optionally, a count. The following command will output 5 records at an interval of 2 seconds:
$ sar -u 2 5
The output can be saved to a file with the -o flag. The following command will create a file named sarReport in your current directory, with details of CPU utilization:
$ sar -u -o sarReport 3 5
Other options include the -u flag, to limit the counters to CPU, and the -A flag, to get system-wide counters that include network, disk, interrupts, and many more. Check the sar manual (man sar) to find the flags for the counters you need.
This recipe covers some well-known CPU monitoring tools, from the very commonly used top command to the background metric logging tool sar.
In the preceding example, we used top to get a quick summarized view of the current state of the system. By default, top shows the average CPU usage. It is listed in the third row of top output. If you have more than one CPU, their usage is combined and displayed in one single column. You can press 1 when top is running to get details of all available CPUs. This should expand the CPU row to list all CPUs. The following screenshot shows two CPUs available on my virtual machine:
The CPU row shows several categories of CPU utilization. The following is a list of their brief descriptions:
us: Time spent running user space processes. This reflects the CPU consumption by your application.
sy: Time spent running kernel (system) code, such as system calls and process scheduling. A higher number here can indicate too many processes, with the CPU spending more time on process scheduling.
ni: Time spent running niced user space processes, that is, processes whose execution priority was adjusted with a nice value.
id: Time spent in idle mode, where the CPU is doing nothing.
wa: Time spent waiting for IO to complete. A higher value here means your CPU is often stalled on IO operations. Try improving IO performance or reducing IO at application level.
hi/si: Time spent in hardware interrupts or software interrupts.
st: Stolen CPU cycles. The hypervisor assigned these CPU cycles to another virtual machine. If you see a higher number in this field, try reducing the number of virtual machines on the host. If you are using a cloud service, try to get a new server, or change your service provider.
The second metric shown is the process level CPU utilization. This is listed in a tabular format under the column head %CPU. This is the percentage of CPU utilization by each process. By default, the top output is sorted in descending order of CPU utilization, so processes that use more CPU are listed at the top. Another column, named TIME+, displays the total CPU time used by each process. Check the processes section on the screen, which should be similar to the following screenshot:
If you look at the processes listed by top, you should see that top itself appears in the list. top runs as a separate process and also consumes CPU cycles.
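The categories in the CPU row correspond to cumulative counters that the Linux kernel exposes on the first line of /proc/stat (user, nice, system, idle, iowait, irq, softirq, steal). The following is a minimal sketch, not how top itself is implemented, of turning two samples of that line into percentages. The function name cpu_pct is made up for this illustration.

```shell
# cpu_pct PREV CUR: print per-category CPU percentages between two
# "cpu ..." lines taken from /proc/stat.
# Field order assumed: user nice system idle iowait irq softirq steal.
# cpu_pct is a hypothetical helper name, not a standard tool.
cpu_pct() {
  printf '%s\n%s\n' "$1" "$2" | awk '
    NR == 1 { for (i = 2; i <= 9; i++) prev[i] = $i; next }
    {
      total = 0
      for (i = 2; i <= 9; i++) { d[i] = $i - prev[i]; total += d[i] }
      if (total == 0) total = 1   # avoid division by zero
      printf "us=%.1f ni=%.1f sy=%.1f id=%.1f wa=%.1f hi=%.1f si=%.1f st=%.1f\n",
        100*d[2]/total, 100*d[3]/total, 100*d[4]/total, 100*d[5]/total,
        100*d[6]/total, 100*d[7]/total, 100*d[8]/total, 100*d[9]/total
    }'
}

# Live usage on a Linux host: take two samples one second apart.
if [ -r /proc/stat ]; then
  a=$(grep '^cpu ' /proc/stat)
  sleep 1
  b=$(grep '^cpu ' /proc/stat)
  cpu_pct "$a" "$b"
fi
```

Because the counters only ever increase, a single reading says little; it is the difference between two readings that gives utilization for that interval.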
With top, you can get a list of processes or tasks that are consuming most of the CPU time. To get more details of these tasks, you can use the pidstat command. By default, pidstat shows CPU statistics. It can be used with a process name or process ID (PID). With pidstat, you can also query memory usage, IO statistics, child processes, and various other process-related details. Check the manual page for pidstat using the command man pidstat.
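Per-process CPU figures such as the TIME+ column ultimately come from counters in /proc/<pid>/stat, where fields 14 (utime) and 15 (stime) hold user and kernel time in clock ticks. The sketch below converts those two fields to seconds. The helper name proc_cpu_seconds is invented for this example, and the fallback of 100 ticks per second is an assumption that holds on most Linux systems.

```shell
# proc_cpu_seconds STAT_LINE TICKS_PER_SEC: total CPU seconds (user + kernel)
# for one process, given the contents of /proc/<pid>/stat.
# proc_cpu_seconds is a hypothetical helper name.
proc_cpu_seconds() {
  # The command name (field 2) may contain spaces or parentheses, so drop
  # everything up to the last ") " first. After that, utime and stime
  # (fields 14 and 15 of the full line) become fields 12 and 13.
  printf '%s\n' "$1" | sed 's/^.*) //' |
    awk -v hz="$2" '{ printf "%.2f\n", ($12 + $13) / hz }'
}

# Live usage: CPU time consumed so far by the current shell.
if [ -r "/proc/$$/stat" ]; then
  # Assumption: fall back to 100 ticks per second if getconf is unavailable.
  hz=$(getconf CLK_TCK 2>/dev/null || echo 100)
  proc_cpu_seconds "$(cat "/proc/$$/stat")" "$hz"
fi
```

For day-to-day use pidstat is the more convenient route; the sketch is only meant to show where the numbers originate.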
Both commands, top as well as pidstat, give a summarized view of CPU utilization. The top output is refreshed in place at a specific interval, so you cannot review utilization over a specific time period. This is where vmstat comes in handy. When run without any parameters, vmstat outputs a single line with memory and CPU utilization, but you can ask vmstat to run indefinitely and print the latest metrics at specific intervals using the delay parameter. All the output lines are preserved and can be used to compare the system stats for a given period. The following command will render updated metrics every 5 seconds:
$ vmstat 5
Optionally, specify a count after the delay parameter to make vmstat exit after a specific number of repetitions. The following command will update the stats 5 times at 1 second intervals and then exit:
$ vmstat 1 5
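Since vmstat keeps every sample on screen, its output is easy to post-process. As a rough sketch, the following helper averages one CPU column (for example us, sy, id, or wa) over a run. It skips the first sample, because vmstat's first report gives averages since boot rather than for the interval. The name vmstat_avg is made up for this example.

```shell
# vmstat_avg NAME: read vmstat output on stdin and print the average of
# column NAME, ignoring the first sample (averages since boot).
# vmstat_avg is a hypothetical helper name.
vmstat_avg() {
  awk -v name="$1" '
    # Header row with column names: remember where NAME is.
    $1 == "r" { for (i = 1; i <= NF; i++) if ($i == name) col = i; next }
    # Data rows start with a number; skip the very first one.
    col && $1 ~ /^[0-9]+$/ {
      if (seen++) { sum += $col; n++ }
    }
    END { if (n) printf "%.1f\n", sum / n; else print "n/a" }'
}

# Live usage: average IO wait over two one-second intervals.
if command -v vmstat >/dev/null 2>&1; then
  vmstat 1 3 | vmstat_avg wa
fi
```

The column is located by its header name rather than by position, since the exact set of columns can differ between vmstat versions.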
The details provided by vmstat are quite useful for real-time monitoring. The sar tool helps you to store all this data in log files and then extract specific details whenever needed. sar collects data from various internal counters maintained by the Linux kernel. It collects data over a period of time, which can be extracted when required. Using sar without any parameters will show you the data extracted from the previously saved file. The data is collected in a binary format and is located in the /var/log/sysstat directory. You may need to enable data collection in the /etc/default/sysstat file. When the stats collection is enabled, sar automatically collects data every 10 minutes. sar is again available from the sysstat package. Along with sar, the sysstat package includes two utilities: sa1, to record daily system activity data in a binary format, and sa2, to write a daily summary of that data in a human-readable format. Check the manual pages for both commands to get more details.
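To read back a previously recorded day, sar accepts a data file through its -f flag. The sketch below assumes the default Ubuntu layout, where each day's file is named saDD (DD being the day of the month) under /var/log/sysstat; the layout may differ on other distributions or sysstat configurations. The helper name sa_file is hypothetical.

```shell
# sa_file [DAY]: path of the daily sar data file for a day of the month
# (defaults to today). Assumes the layout /var/log/sysstat/saDD.
# sa_file is a hypothetical helper name.
sa_file() {
  day="${1:-$(date +%d)}"
  # Strip one leading zero so printf does not read "08" as octal.
  printf '/var/log/sysstat/sa%02d\n' "${day#0}"
}

# Live usage: show today's recorded CPU utilization, if any exists.
f=$(sa_file)
if command -v sar >/dev/null 2>&1 && [ -r "$f" ]; then
  sar -u -f "$f"
else
  echo "no sar data at $f (is collection enabled in /etc/default/sysstat?)"
fi
```

Passing a day number, for example sa_file 15, gives the path for that day of the current retention window.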
Similar to sar, one more well-known tool is collectd. It gathers and stores system statistics, which can later be used to plot graphs.
To see details of the installed CPUs, read /proc/cpuinfo:
$ less /proc/cpuinfo
Read more about the /proc file system: http://tldp.org/LDP/Linux-Filesystem-Hierarchy/html/proc.html
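On x86 systems, /proc/cpuinfo contains one block per logical CPU, each starting with a "processor" line, so counting those lines is a quick way to find out which CPU numbers are valid for mpstat -P. This is a small sketch under that assumption; the file's layout varies on some other architectures, and count_cpus is a name invented here.

```shell
# count_cpus: count logical CPUs from /proc/cpuinfo-style text on stdin.
# Assumes one line beginning with "processor" per logical CPU.
# count_cpus is a hypothetical helper name.
count_cpus() {
  grep -c '^processor'
}

# Live usage on a Linux host.
if [ -r /proc/cpuinfo ]; then
  count_cpus < /proc/cpuinfo
fi
```

With a result of 2, for instance, the valid CPU numbers for mpstat -P are 0 and 1.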