Difference between revisions of "Strace"
(New page: '''<tt>strace</tt>''' is a diagnostic, debugging, and instructional userspace utility for Linux. It is used to monitor interactions between processes and the Linux kernel, which includ...) |
(→Example usage) |
||
Line 13: | Line 13: | ||
* Simple call: | * Simple call: | ||
$ strace ls | $ strace ls | ||
+ | |||
+ | Most <code>strace(1)</code> outputs will produce a lot of system function call (e.g., <code>open(), read(), write(), close(), etc.). One can specify which of (or filter by) these system calls, as shown below. | ||
* View the 'open' systems calls made by <code>`ls`</code>: | * View the 'open' systems calls made by <code>`ls`</code>: | ||
Line 29: | Line 31: | ||
+++ exited with 0 +++ | +++ exited with 0 +++ | ||
</pre> | </pre> | ||
+ | |||
+ | In the output above, one can see the system call name/function ("open") being used by <code>ls(1)</code> in various ways. <code>ls(1)</code> requires various library modules (e.g., <code>libc</code>, <code>libattr</code>, etc.) and is looking for those library modules in a specific path. If a given library module is missing (or is located in a path the system does not know about), the aspect of that part of <code>ls(1)</code> that depends on that library will fail to function. The above <code>strace(1)</code> was successful and all required library modules were located (the "<code>= 3</code>" after each "open" system call indicates success. Errors (typically a return value of <code>-1</code>) have the errno symbol and error string appended. Below is an example of such an error: | ||
+ | |||
+ | open("/foo/bar", O_RDONLY) = -1 ENOENT (No such file or directory) | ||
+ | |||
+ | * View the 'open' systems calls made by <code>`ls /tmp`</code> with timestamp: | ||
+ | $ strace -t -e open ls /tmp | ||
+ | 17:15:54 open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3 | ||
+ | ... | ||
+ | |||
+ | * View the 'open' systems calls made by <code>`ls /tmp`</code> with a relative timestamp upon entry to each system call (note: this records the time difference between the beginning of successive system calls): | ||
+ | $ strace -r -e open ls /tmp | ||
+ | 0.000000 open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3 | ||
+ | 0.000399 open("/lib/x86_64-linux-gnu/libselinux.so.1", O_RDONLY|O_CLOEXEC) = 3 | ||
* Save the 'open' and 'read' system calls made by a process (with PID=14889) to a file: | * Save the 'open' and 'read' system calls made by a process (with PID=14889) to a file: | ||
Line 40: | Line 56: | ||
* Hook into STDOUT and STDERR of a running process: | * Hook into STDOUT and STDERR of a running process: | ||
$ strace -p <PID> -e trace=write -e write=1,2 | $ strace -p <PID> -e trace=write -e write=1,2 | ||
+ | |||
+ | * Generate statistics report of system calls. That is, count time, calls, and errors for each system call and report a summary on program exit. This attempts to show system time (CPU time spent running in the kernel) independent of wall clock time. If "<code>-c</code>" is used with "<code>-f</code>" or "<code>-F</code>" (see man page), only aggregate totals for all traced processes are kept: | ||
+ | |||
+ | $ strace -c ls /tmp | ||
+ | <pre> | ||
+ | foo bar baz | ||
+ | % time seconds usecs/call calls errors syscall | ||
+ | ------ ----------- ----------- --------- --------- ---------------- | ||
+ | 20.82 0.000250 11 23 mmap | ||
+ | 13.32 0.000160 80 2 getdents | ||
+ | 11.24 0.000135 10 14 mprotect | ||
+ | 10.82 0.000130 14 9 open | ||
+ | 6.99 0.000084 8 10 fstat | ||
+ | 6.41 0.000077 10 8 read | ||
+ | 6.16 0.000074 6 12 close | ||
+ | 5.83 0.000070 9 8 8 access | ||
+ | 5.41 0.000065 22 3 brk | ||
+ | 5.33 0.000064 21 3 munmap | ||
+ | 2.00 0.000024 12 2 2 statfs | ||
+ | 1.58 0.000019 19 1 write | ||
+ | 1.50 0.000018 9 2 ioctl | ||
+ | 1.08 0.000013 13 1 stat | ||
+ | 0.92 0.000011 11 1 openat | ||
+ | 0.42 0.000005 5 1 execve | ||
+ | 0.17 0.000002 2 1 arch_prctl | ||
+ | ------ ----------- ----------- --------- --------- ---------------- | ||
+ | 100.00 0.001201 101 10 total | ||
+ | </pre> | ||
* [[Redirection (Linux)#Swapping STDOUT and STDERR using file descriptors|Swapping STDOUT and STDERR using file descriptors]] | * [[Redirection (Linux)#Swapping STDOUT and STDERR using file descriptors|Swapping STDOUT and STDERR using file descriptors]] |
Revision as of 22:25, 28 August 2015
strace is a diagnostic, debugging, and instructional userspace utility for Linux. It is used to monitor interactions between processes and the Linux kernel, which include system calls, signal deliveries, and changes of process state. The operation of strace(1)
is made possible by the kernel feature known as ptrace
.
strace(1)
is an indispensable tool for any Linux Systems Administrator. I use it on a daily basis at work.
The most common usage is to start a program using strace(1)
, which prints a list of system calls made by the program. This is useful if the program continually crashes, or does not behave as expected; for example using strace(1)
may reveal that the program is attempting to access a file which does not exist or cannot be read.
An alternative application is to use the "-p
" flag to attach to a running process. This is useful if a process has stopped responding, and might reveal, for example, that the process is blocking whilst attempting to make a network connection.
As strace(1)
only details system calls, it cannot be used to detect as many problems as a code debugger such as GNU Debugger (gdb). It is, however, easier to use than a code debugger, and is an extremely useful tool for system administrators.
Example usage
- Simple call:
$ strace ls
Most strace(1)
outputs will produce a lot of system function call (e.g., open(), read(), write(), close(), etc.). One can specify which of (or filter by) these system calls, as shown below.
- View the 'open' systems calls made by <code>`ls`:
$ strace -e open ls
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3 open("/lib/x86_64-linux-gnu/libselinux.so.1", O_RDONLY|O_CLOEXEC) = 3 open("/lib/x86_64-linux-gnu/libacl.so.1", O_RDONLY|O_CLOEXEC) = 3 open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3 open("/lib/x86_64-linux-gnu/libpcre.so.3", O_RDONLY|O_CLOEXEC) = 3 open("/lib/x86_64-linux-gnu/libdl.so.2", O_RDONLY|O_CLOEXEC) = 3 open("/lib/x86_64-linux-gnu/libattr.so.1", O_RDONLY|O_CLOEXEC) = 3 open("/proc/filesystems", O_RDONLY) = 3 open("/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3 foo bar baz +++ exited with 0 +++
In the output above, one can see the system call name/function ("open") being used by ls(1)
in various ways. ls(1)
requires various library modules (e.g., libc
, libattr
, etc.) and is looking for those library modules in a specific path. If a given library module is missing (or is located in a path the system does not know about), the aspect of that part of ls(1)
that depends on that library will fail to function. The above strace(1)
was successful and all required library modules were located (the "= 3
" after each "open" system call indicates success. Errors (typically a return value of -1
) have the errno symbol and error string appended. Below is an example of such an error:
open("/foo/bar", O_RDONLY) = -1 ENOENT (No such file or directory)
- View the 'open' systems calls made by
`ls /tmp`
with timestamp:
$ strace -t -e open ls /tmp 17:15:54 open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3 ...
- View the 'open' systems calls made by
`ls /tmp`
with a relative timestamp upon entry to each system call (note: this records the time difference between the beginning of successive system calls):
$ strace -r -e open ls /tmp 0.000000 open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3 0.000399 open("/lib/x86_64-linux-gnu/libselinux.so.1", O_RDONLY|O_CLOEXEC) = 3
- Save the 'open' and 'read' system calls made by a process (with PID=14889) to a file:
$ strace -e trace=open,read -p 14889 -o /tmp/strace.14889
- View the 'open' system calls made by a program called "
vpnc
" (strace(1)
= window into program functionality):
$ strace -e open vpnc #~OR~ $ strace -e open -p `pidof vpnc`
- Hook into STDOUT and STDERR of a running process:
$ strace -p <PID> -e trace=write -e write=1,2
- Generate statistics report of system calls. That is, count time, calls, and errors for each system call and report a summary on program exit. This attempts to show system time (CPU time spent running in the kernel) independent of wall clock time. If "
-c
" is used with "-f
" or "-F
" (see man page), only aggregate totals for all traced processes are kept:
$ strace -c ls /tmp
foo bar baz % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 20.82 0.000250 11 23 mmap 13.32 0.000160 80 2 getdents 11.24 0.000135 10 14 mprotect 10.82 0.000130 14 9 open 6.99 0.000084 8 10 fstat 6.41 0.000077 10 8 read 6.16 0.000074 6 12 close 5.83 0.000070 9 8 8 access 5.41 0.000065 22 3 brk 5.33 0.000064 21 3 munmap 2.00 0.000024 12 2 2 statfs 1.58 0.000019 19 1 write 1.50 0.000018 9 2 ioctl 1.08 0.000013 13 1 stat 0.92 0.000011 11 1 openat 0.42 0.000005 5 1 execve 0.17 0.000002 2 1 arch_prctl ------ ----------- ----------- --------- --------- ---------------- 100.00 0.001201 101 10 total
See also
- Lsof
-
ltrace(1), time(1), ptrace(2), proc(5)