Strace

From Christoph's Personal Wiki
Jump to: navigation, search

strace is a diagnostic, debugging, and instructional userspace utility for Linux. It is used to monitor interactions between processes and the Linux kernel, which include system calls, signal deliveries, and changes of process state. The operation of strace(1) is made possible by the kernel feature known as ptrace.

strace(1) is an indispensable tool for any Linux Systems Administrator. I use it on a daily basis at work.

The most common usage is to start a program using strace(1), which prints a list of system calls made by the program. This is useful if the program continually crashes, or does not behave as expected; for example using strace(1) may reveal that the program is attempting to access a file which does not exist or cannot be read.

An alternative application is to use the "-p" flag to attach to a running process. This is useful if a process has stopped responding, and might reveal, for example, that the process is blocking whilst attempting to make a network connection.

As strace(1) only details system calls, it cannot be used to detect as many problems as a code debugger such as GNU Debugger (gdb). It is, however, easier to use than a code debugger, and is an extremely useful tool for system administrators.

Example usage

  • Simple call:
$ strace ls

Most strace(1) outputs will produce a lot of system function call (e.g., open(), read(), write(), close(), etc.). One can specify which of (or filter by) these system calls, as shown below.

  • View the 'open' systems calls made by `ls`:
$ strace -e open ls
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
open("/lib/x86_64-linux-gnu/libselinux.so.1", O_RDONLY|O_CLOEXEC) = 3
open("/lib/x86_64-linux-gnu/libacl.so.1", O_RDONLY|O_CLOEXEC) = 3
open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
open("/lib/x86_64-linux-gnu/libpcre.so.3", O_RDONLY|O_CLOEXEC) = 3
open("/lib/x86_64-linux-gnu/libdl.so.2", O_RDONLY|O_CLOEXEC) = 3
open("/lib/x86_64-linux-gnu/libattr.so.1", O_RDONLY|O_CLOEXEC) = 3
open("/proc/filesystems", O_RDONLY)     = 3
open("/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3
foo bar baz
+++ exited with 0 +++

In the output above, one can see the system call name/function ("open") being used by ls(1) in various ways. ls(1) requires various library modules (e.g., libc, libattr, etc.) and is looking for those library modules in a specific path. If a given library module is missing (or is located in a path the system does not know about), the aspect of that part of ls(1) that depends on that library will fail to function. The above strace(1) was successful and all required library modules were located (the "= 3" after each "open" system call indicates success. Errors (typically a return value of -1) have the errno symbol and error string appended. Below is an example of such an error:

open("/foo/bar", O_RDONLY) = -1 ENOENT (No such file or directory)
  • View the 'open' systems calls made by `ls /tmp` with timestamp:
$ strace -t -e open ls /tmp
17:15:54 open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
...
  • View the 'open' systems calls made by `ls /tmp` with a relative timestamp upon entry to each system call (note: this records the time difference between the beginning of successive system calls):
$ strace -r -e open ls /tmp
    0.000000 open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
    0.000399 open("/lib/x86_64-linux-gnu/libselinux.so.1", O_RDONLY|O_CLOEXEC) = 3
  • Save the 'open' and 'read' system calls made by a process (with PID=14889) to a file:
$ strace -e trace=open,read -p 14889 -o /tmp/strace.14889
  • View the 'open' system calls made by a program called "firefox" (strace(1) = window into program functionality):
$ strace -e open firefox
#~OR~
$ strace -e open -p $(pidof firefox)
  • Hook into STDOUT and STDERR of a running process:
$ strace -p <PID> -e trace=write -e write=1,2
  • Generate statistics report of system calls. That is, count time, calls, and errors for each system call and report a summary on program exit. This attempts to show system time (CPU time spent running in the kernel) independent of wall clock time. If "-c" is used with "-f" or "-F" (see man page), only aggregate totals for all traced processes are kept:
$ strace -c ls /tmp
foo bar baz
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 20.82    0.000250          11        23           mmap
 13.32    0.000160          80         2           getdents
 11.24    0.000135          10        14           mprotect
 10.82    0.000130          14         9           open
  6.99    0.000084           8        10           fstat
  6.41    0.000077          10         8           read
  6.16    0.000074           6        12           close
  5.83    0.000070           9         8         8 access
  5.41    0.000065          22         3           brk
  5.33    0.000064          21         3           munmap
  2.00    0.000024          12         2         2 statfs
  1.58    0.000019          19         1           write
  1.50    0.000018           9         2           ioctl
  1.08    0.000013          13         1           stat
  0.92    0.000011          11         1           openat
  0.42    0.000005           5         1           execve
  0.17    0.000002           2         1           arch_prctl
------ ----------- ----------- --------- --------- ----------------
100.00    0.001201                   101        10 total

See also

External links