Using Sysstat to Track System Performance

Sysstat is a set of system monitoring and logging utilities that makes it easy to troubleshoot performance issues and hardware bottlenecks. Sysstat logs data at regular intervals and archives it, so it can be used to generate reports of performance over time.

Preparing Sysstat

The first step in using this utility is to ensure that the sysstat package is installed. The installation method varies by distro, but for RedHat-based servers, including CentOS, it is 1):

yum -y install sysstat

On RH-based servers, that is all you should need to do. This installs the necessary binaries as well as a cron.d entry. You can customize data collection by editing /etc/cron.d/sysstat, but that is beyond the scope of this tutorial, as the default suffices for most purposes. To verify the installation, check that the log directory /var/log/sa/ (/var/log/sysstat/ on Debian) was created and has an entry for the present day. If it does not, start the daemon manually using the same step as in the Debian instructions below.

For Debian/Ubuntu servers you can use:

apt-get install sysstat

You will also need to configure the daemon to run by default. Edit the configuration file (vi /etc/default/sysstat) and enable sadc data collection by changing the ENABLED line to ENABLED="true". Finally, start the daemon:

/etc/init.d/sysstat start

Since sysstat logs data over time through its own logger (i.e. not through syslogd), it won't have any data to report initially. You can either give it some time to gather a bit of data, or come back later when you have a day or so's worth of data to play with.
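Before running any reports, it can help to confirm that the collector has actually written something for today. The following is a minimal sketch, assuming the RedHat-style layout described above (one file per day, named sa plus the two-digit day of the month, under /var/log/sa/). The function name sa_log_ready is made up for this example and is not part of sysstat; pass a different directory as the first argument on other distributions.

```shell
#!/bin/sh
# sa_log_ready is a hypothetical helper, not part of sysstat.
# It succeeds if today's data file exists and is non-empty.
# Assumption: files are named sa<DD>; some setups use another naming scheme.
sa_log_ready() {
    dir=${1:-/var/log/sa}
    file="$dir/sa$(date +%d)"
    [ -s "$file" ]
}

if sa_log_ready; then
    echo "sysstat has data for today"
else
    echo "no data yet; check the cron job or wait for the next collection run"
fi
```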

Basic Sysstat Usage

Assuming you have some data to analyze, you can simply run sar without any options to get a printout of the current day's CPU statistics. This is not very useful on its own and will likely scroll past without you being able to read much of it. It does, however, illustrate an important point: by default, sar reads from the current day's log file, and to change this we need to pass it an additional parameter. Log files are stored in /var/log/sa/ (/var/log/sysstat/ on Debian) and begin with 'sa' followed by a number indicating the day of the month. So if we wanted to read the data from the 13th of the month (or the 13th of last month if that date has not yet passed) we would run:

sar -f /var/log/sa/sa13
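Because the day in the file name is zero-padded (sa03, not sa3), a small helper can save some typos when scripting this. This is only a sketch: sa_path is a hypothetical name, and the default directory is the RedHat one mentioned above.

```shell
#!/bin/sh
# sa_path is a hypothetical helper: it prints the log file path for a
# given day of the month. Usage: sa_path DAY [DIR]
sa_path() {
    day=${1#0}                 # drop one leading zero so "08" is not read as octal
    dir=${2:-/var/log/sa}
    case $day in
        [1-9]|[12][0-9]|3[01]) printf '%s/sa%02d\n' "$dir" "$day" ;;
        *) echo "day must be between 1 and 31" >&2; return 1 ;;
    esac
}

sa_path 3      # prints /var/log/sa/sa03
```

It can then be combined with sar as sar -f "$(sa_path 13)".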

To specify intervals, we pass extra arguments to sar. By default sysstat logs at ten-minute intervals. If we only wanted to see five hourly intervals from the same log file as above, we would run:

sar -f /var/log/sa/sa13 3600 5

The first value is the number of seconds for the interval and the second is the number of intervals to display. The above becomes useful when combined with a time specification to view data from a specific interval:

sar -f /var/log/sa/sa13 1200 -s 03:00:00 -e 06:00:00

The resulting output would be log data in 20-minute intervals between 3 and 6 AM. There are plenty more useful sar options, but these are the ones you will likely use the most.

Performance Analysis with Sysstat

Below is a snippet of the report produced by the sar -u command:

12:00:01 AM  CPU  %user  %nice  %system  %iowait  %idle
12:10:01 AM  all   0.85   0.16     0.54     0.20  98.24
12:20:01 AM  all   0.73   0.18     0.51     0.21  98.36
12:30:01 AM  all   0.67   0.18     0.51     0.08  98.56
12:40:01 AM  all   0.57   0.17     0.48     0.17  98.61
12:50:01 AM  all   0.45   0.19     0.44     0.06  98.87
01:00:01 AM  all   0.55   0.18     0.48     0.06  98.73

Combined with the options listed above, you should be able to produce a report that you can use to diagnose slowdowns during a given period. The %user and %system columns specify the percentage of time the CPU spends in user and system mode. The %iowait and %idle columns are of most interest to us when doing performance analysis. The %iowait column specifies the percentage of time the CPU spends idle while waiting for I/O requests to complete. The %idle column tells us how much of the time the CPU has no work to do. A %idle value near zero indicates a CPU bottleneck, while a high %iowait value indicates unsatisfactory disk performance.
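On a long report it is easier to let a filter pick out the suspicious intervals than to scan the columns by eye. The sketch below reads sar -u output on standard input and prints only the rows where %iowait is above a limit or %idle is below one. The function name flag_busy and the default thresholds of 10 are arbitrary choices for this example, not recommendations from sysstat. Columns are located by their header names, since the exact set of columns differs between sysstat versions.

```shell
#!/bin/sh
# flag_busy is a hypothetical filter for "sar -u" output.
# Usage: sar -u | flag_busy [MAX_IOWAIT] [MIN_IDLE]
flag_busy() {
    awk -v max_iowait="${1:-10}" -v min_idle="${2:-10}" '
        # Header row: remember which columns hold %iowait and %idle.
        /%idle/ {
            for (i = 1; i <= NF; i++) {
                if ($i == "%iowait") io = i
                if ($i == "%idle")   id = i
            }
            next
        }
        # Data rows only; the Average: summary line has a different layout.
        id && $1 != "Average:" && NF >= id {
            if ((io && $io + 0 > max_iowait + 0) || ($id + 0 < min_idle + 0))
                print
        }
    '
}
```

For example, sar -u -f /var/log/sa/sa13 | flag_busy 20 5 would list the intervals on the 13th where %iowait exceeded 20 or %idle dropped below 5.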

We find that poor disk performance is typically associated with large amounts of disk swapping. This happens when the system runs low on physical memory and must move memory pages out to the swap space allocated on the disk to free up RAM.
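To check whether swapping coincides with a slow period, sar -W reports the pages swapped in and out per second (the pswpin/s and pswpout/s columns). The filter below is a rough sketch that keeps only the intervals where either value is non-zero; swap_active is a made-up name, and the columns are again found by header name rather than by position.

```shell
#!/bin/sh
# swap_active is a hypothetical filter for "sar -W" output.
# Usage: sar -W | swap_active
swap_active() {
    awk '
        # Header row: locate the swap-in and swap-out columns.
        /pswpin\/s/ {
            for (i = 1; i <= NF; i++) {
                if ($i == "pswpin/s")  in_col = i
                if ($i == "pswpout/s") out_col = i
            }
            next
        }
        # Print data rows with any swap activity; skip the Average: line.
        in_col && out_col && $1 != "Average:" && NF >= out_col {
            if ($in_col + $out_col > 0) print
        }
    '
}
```

If sar -W -f /var/log/sa/sa13 | swap_active prints rows for the same times at which %iowait was high, memory pressure is a likely suspect.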

Using Pidstat

The pidstat command is used to monitor processes and threads currently being managed by the Linux kernel. It can also monitor the children of those processes and threads. With its -d option, pidstat can report I/O statistics, which is very useful in pinpointing the source of a heavy I/O load. The -t option instructs pidstat to also print information about the threads of the selected processes. To specify a particular process, use the -p <pid> switch. For instance, if the Apache parent process had a PID of 1759 and we wanted to view disk I/O information for it and its threads, we would run:

pidstat -d -t -p 1759

The interval and count arguments described for sar above also work with pidstat.
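As a sketch of how the two fit together, the wrapper below runs the same pidstat command with an interval and a count appended, and first checks that pidstat is installed at all, since older sysstat packages do not ship it. The name io_for_pid and the defaults of 5 seconds and 3 samples are assumptions made for this example.

```shell
#!/bin/sh
# io_for_pid is a hypothetical wrapper around pidstat.
# Usage: io_for_pid PID [INTERVAL_SECONDS] [COUNT]
io_for_pid() {
    pid=$1
    interval=${2:-5}
    count=${3:-3}
    if ! command -v pidstat >/dev/null 2>&1; then
        echo "pidstat not found; this sysstat version may not include it" >&2
        return 1
    fi
    # -d: disk I/O statistics, -t: also list threads, -p: select the process
    pidstat -d -t -p "$pid" "$interval" "$count"
}
```

Calling io_for_pid 1759 10 6 would then take six samples ten seconds apart for the process with PID 1759.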

1)
As of the writing of this article, yum and apt-get are installing an older version of sysstat that does not include pidstat. You can find the latest version of sysstat on the developer's website if you desire this functionality:

http://pagesperso-orange.fr/sebastien.godard/download.html