SystemTap can monitor multiple system-wide synchronous and asynchronous events at the same time. It can do scriptable filtering and statistics collection. It’s a dynamic method of monitoring and tracing the operations of a running Linux kernel.

To instrument the running kernel, SystemTap uses Kprobes and return probes. With kernel debug information, it gets the addresses for functions and variables referenced in the script. With utrace, SystemTap supports probing user-space executables and shared libraries as well. SystemTap is, therefore, useful to systems administrators, kernel developers, support engineers, researchers and students.

Installation

To install SystemTap on Fedora, run the following commands as root:

yum install systemtap kernel-devel
debuginfo-install kernel

Working

To understand how SystemTap works, run a script in verbose mode (with the -v switch). The stap program is the front end to SystemTap. The -e switch tells it to execute the script given in the argument that follows:

$ stap -v -e 'probe syscall.read {printf("syscall %s arguments %s \n", name, argstr); exit()}'
Pass 1: parsed user script and 65 library script(s) using 83596virt/20428res/2412shr kb, in 150usr/10sys/249real ms.
Pass 2: analyzed script: 1 probe(s), 4 function(s), 0 embed(s), 0 global(s) using 216260virt/115660res/73964shr kb, in 560usr/20sys/946real ms.
Pass 3: translated to C into "/tmp/stapUGVeZi/stap_b40c8268c87acc683f75ded62a52ee66_2113.c" using 216260virt/117180res/75484shr kb, in 320usr/40sys/1014real ms.
Pass 4: compiled C into "stap_b40c8268c87acc683f75ded62a52ee66_2113.ko" in 3010usr/1210sys/12818real ms.
Pass 5: starting run.
syscall read arguments 4, 0x00007fffa773b4c0, 8196
Pass 5: run completed in 20usr/60sys/174real ms.

Let’s look at each of the passes mentioned:

  • Passes 1 and 2: The script we want to run is parsed, and the code is checked for semantic and syntactic errors. Any tapset reference is imported. Debug data (provided via debuginfo packages) is read to find the addresses for functions and variables referenced in the script.
  • Pass 3: The script is translated into C code.
  • Pass 4: The translated C code is compiled to create a kernel module.
  • Pass 5: The compiled module is inserted into the running kernel.

Probes are inserted at the proper locations as soon as the module is loaded. From then on, whenever a probe is hit, the handler for that probe is called.

The basic syntax for writing a probe for an event, together with the handler to run when that event occurs, is:

probe <event> { handler }

where,

  • event is one of kernel.function, process.statement, timer.ms, begin, end, or a (tapset) alias. For more information, see the stapprobes man page.
  • handler can have:
    • filtering/conditionals (if … next)
    • control structures (foreach, while)

Note:

You don’t need to declare the type of a variable; it is inferred from the context.
SystemTap has predefined functions such as pid, execname, log, etc.
The language reference is available in the installed package at /usr/share/doc/systemtap-<version>/langref.pdf.
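
The pieces described above (events, filtering, control structures, inferred types and predefined functions) can be combined in one short script. The following is only a minimal sketch: the five-second window, the array name reads and the choice to skip the stapio helper process are illustrative assumptions, not part of the original example.

```systemtap
# Count read() system calls per process name for five seconds.
# "reads" is a hypothetical name; its element type is inferred from usage.
global reads

probe begin {
    log("counting read syscalls for 5 seconds...")
}

probe syscall.read {
    # Filtering: skip events raised by SystemTap's own helper process
    # (an assumption made here to reduce noise in the output).
    if (execname() == "stapio") next
    reads[execname()]++
}

probe timer.ms(5000) {
    # Stop the session; the "end" probe below still runs.
    exit()
}

probe end {
    # Control structure: walk the array, highest counts first, top 10 only.
    foreach (proc in reads- limit 10)
        printf("%-20s %d\n", proc, reads[proc])
}
```

Saved to a file, this can be run as root with stap followed by the file name. Note that neither reads nor proc is given a type anywhere: the translator works out that the array is indexed by strings and holds integers.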

How to run stap

The stap program can be invoked in several ways:

stap -e '<script>' [-c <target program>]
stap script.stp [-c <target program>]
stap -l '<event*>'
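
As an illustration of the -c form, a script can restrict itself to the target program by comparing pid() with target(), which returns the process ID of the program started with -c. This is a sketch under assumptions: the file name reads.stp and the target command are hypothetical.

```systemtap
# reads.stp (hypothetical file name)
# Report only the read() calls made by the target program.
probe syscall.read {
    # target() is the PID of the program started with -c;
    # events from every other process are skipped.
    if (pid() != target()) next
    printf("%s(%d) read %s\n", execname(), pid(), argstr)
}
```

It could then be run as root with something like stap reads.stp -c 'cat /etc/hostname'. The session ends when the target program exits, so no explicit exit() call is needed here. Child processes of the target have different PIDs and would not match this simple test.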

Tapset libraries

In the example shown earlier, after probing on the read system call, we printed the name of the system call, and the arguments passed via name and argstr. This was possible because in one of the tapset libraries, /usr/share/systemtap/tapset/syscalls2.stp, the following is defined:

probe syscall.read = kernel.function("SyS_read").call !,
                   kernel.function("sys_read").call
{
        name = "read"
        fd = $fd
        buf_uaddr = $buf
        count = $count
        argstr = sprintf("%d, %p, %d", $fd, $buf, $count)
}

Tapsets provide abstractions over common probe points and define functions that you can use in your scripts. Tapset files are not runnable by themselves, because they contain probe aliases rather than probes.
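
Aliases can also be defined in your own script, using the same syntax as the tapset shown above. The sketch below builds a new alias on top of syscall.read; the alias name big_read, the variable is_big and the 4096-byte threshold are made up for illustration.

```systemtap
# Hypothetical alias: extends syscall.read with one derived variable.
# The alias body runs before the handler of any probe that uses it.
probe big_read = syscall.read {
    # count is set by the syscall.read alias from the tapset.
    is_big = (count >= 4096)
}

# The alias only becomes active once a real probe refers to it.
probe big_read {
    if (!is_big) next
    printf("%s requested %d bytes from fd %d\n", execname(), count, fd)
}
```

This script runs until it is interrupted with Ctrl-C. The first block on its own would do nothing, which is the sense in which aliases are not runnable by themselves.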
