DTrace stores the results of aggregating functions in objects called aggregations. The aggregation results are indexed using a tuple of expressions similar to those used for associative arrays. In D, the syntax for an aggregation is:
@name
[keys
] =aggfunc
(args
);
where name
is the name of the aggregation, keys
is a comma-separated list of D expressions, aggfunc
is one of the DTrace aggregating functions, and args
is a comma-separated list of arguments appropriate for the aggregating function. The aggregation name
is a D identifier that is prefixed with the special character @
. All aggregations named in your D programs are global variables; there are no thread- or clause-local aggregations. The aggregation names are kept in a separate identifier namespace from other D global variables. Remember that a
and @a
are not the same variable if you reuse names. The special aggregation name @
can be used to name an anonymous aggregation in simple D programs. The D compiler treats this name as an alias for the aggregation name @_
.
The DTrace aggregating functions are shown in the following table. Most aggregating functions take just a single argument that represents the new datum.
Table 9.1. DTrace Aggregating Functions
Function Name |
Arguments |
Result |
---|---|---|
|
none |
The number of times called. |
|
scalar expression |
The total value of the specified expressions. |
|
scalar expression |
The arithmetic average of the specified expressions. |
|
scalar expression |
The smallest value among the specified expressions. |
|
scalar expression |
The largest value among the specified expressions. |
|
scalar expression, lower bound, upper bound, step value |
A linear frequency distribution, sized by the specified range, of the values of the specified expressions. Increments the value in the highest bucket that is less than the specified expression. |
|
scalar expression |
A power-of-two frequency distribution of the values of the specified expressions. Increments the value in the highest power-of-two bucket that is less than the specified expression. |
For example, to count the number of
write
(
2
)
system calls in the system, you could use an informative string as a key and the count
aggregating function:
syscall::write:entry { @counts["write system calls"] = count(); }
The dtrace command prints aggregation results by default when the process terminates, either as the result of an explicit END
action or when the user presses Control-C. The following example output shows the result of running this command, waiting for a few seconds, and pressing Control-C:
# dtrace -s writes.d
dtrace: script './writes.d' matched 1 probe^C
write system calls 179 #
You can count system calls per process nam using the execname
variable as the key to an aggregation:
syscall::write:entry { @counts[execname] = count(); }
The following example output shows the result of running this command, waiting for a few seconds, and pressing Control-C:
# dtrace -s writesbycmd.d
dtrace: script './writesbycmd.d' matched 1 probe^C
dtrace 1 cat 4 sed 9 head 9 grep 14 find 15 tail 25 mountd 28 expr 72 sh 291 tee 814 def.dir.flp 1996 make.bin 2010 #
Alternatively, you might want to further examine writes organized by both executable name and file descriptor. The file descriptor is the first argument to
write
(
2
)
, so the following example uses a key consisting of both execname
and arg0
:
syscall::write:entry { @counts[execname, arg0] = count(); }
Running this command results in a table with both executable name and file descriptor, as shown in the following example:
# dtrace -s writesbycmdfd.d
dtrace: script './writesbycmdfd.d' matched 1 probe^C
cat 1 58 sed 1 60 grep 1 89 tee 1 156 tee 3 156 make.bin 5 164 acomp 1 263 macrogen 4 286 cg 1 397 acomp 3 736 make.bin 1 880 iropt 4 1731 #
The following example displays the average time spent in the write system call, organized by process name. This example uses the avg
aggregating function, specifying the expression to average as the argument. The example averages the wall clock time spent in the system call:
syscall::write:entry { self->ts = timestamp; } syscall::write:return /self->ts/ { @time[execname] = avg(timestamp - self->ts); self->ts = 0; }
The following example output shows the result of running this command, waiting for a few seconds, and pressing Control-C:
# dtrace -s writetime.d
dtrace: script './writetime.d' matched 2 probes^C
iropt 31315 acomp 37037 make.bin 63736 tee 68702 date 84020 sh 91632 dtrace 159200 ctfmerge 321560 install 343300 mcs 394400 get 413695 ctfconvert 594400 bringover 1332465 tail 1335260 #
The average can be useful, but often does not provide sufficient detail to understand the distribution of data points. To understand the distribution in further detail, use the quantize
aggregating function as shown in the following example:
syscall::write:entry { self->ts = timestamp; } syscall::write:return /self->ts/ { @time[execname] = quantize(timestamp - self->ts); self->ts = 0; }
Because each line of output becomes a frequency distribution diagram, the output of this script is substantially longer than previous ones. The following example shows a selection of sample output:
lint value ------------- Distribution ------------- count 8192 | 0 16384 | 2 32768 | 0 65536 |@@@@@@@@@@@@@@@@@@@ 74 131072 |@@@@@@@@@@@@@@@ 59 262144 |@@@ 14 524288 | 0 acomp value ------------- Distribution ------------- count 4096 | 0 8192 |@@@@@@@@@@@@ 840 16384 |@@@@@@@@@@@ 750 32768 |@@ 165 65536 |@@@@@@ 460 131072 |@@@@@@ 446 262144 | 16 524288 | 0 1048576 | 1 2097152 | 0 iropt value ------------- Distribution ------------- count 4096 | 0 8192 |@@@@@@@@@@@@@@@@@@@@@@@ 4149 16384 |@@@@@@@@@@ 1798 32768 |@ 332 65536 |@ 325 131072 |@@ 431 262144 | 3 524288 | 2 1048576 | 1 2097152 | 0
Notice that the rows for the frequency distribution are always power-of-two values. Each rows indicates the count of the number of elements greater than or equal to the corresponding value, but less than the next larger row value. For example, the above output shows that iropt
had 4,149 writes taking between 8,192 nanoseconds and 16,383 nanoseconds, inclusive.
While quantize
is useful for getting quick insight into the data, you might want to examine a distribution across linear values instead. To display a linear value distribution, use the lquantize
aggregating function. The lquantize
function takes three arguments in addition to a D expression: a lower bound, an upper bound, and a step. For example, if you wanted to look at the distribution of writes by file descriptor, a power-of-two quantization would not be effective. Instead, use a linear quantization with a small range, as shown in the following example:
syscall::write:entry { @fds[execname] = lquantize(arg0, 0, 100, 1); }
Running this script for several seconds yields a large amount of information. The following example shows a selection of typical output:
mountd value ------------- Distribution ------------- count 11 | 0 12 |@ 4 13 | 0 14 |@@@@@@@@@@@@@@@@@@@@@@@@@ 70 15 | 0 16 |@@@@@@@@@@@@ 34 17 | 0 xemacs-20.4 value ------------- Distribution ------------- count 6 | 0 7 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 521 8 | 0 9 | 1 10 | 0 make.bin value ------------- Distribution ------------- count 0 | 0 1 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 3596 2 | 0 3 | 0 4 | 42 5 | 50 6 | 0 acomp value ------------- Distribution ------------- count 0 | 0 1 |@@@@@ 1156 2 | 0 3 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 6635 4 |@ 297 5 | 0 iropt value ------------- Distribution ------------- count 2 | 0 3 | 299 4 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 20144 5 | 0
You can also use the lquantize
aggregating function to aggregate on time since some point in the past. This technique allows you to observe a change in behavior over time. The following example displays the change in system call behavior over the lifetime of a process executing the
date
(
1
)
command:
syscall::exec:return, syscall::exece:return /execname == "date"/ { self->start = vtimestamp; } syscall:::entry /self->start/ { /* * We linearly quantize on the current virtual time minus our * process's start time. We divide by 1000 to yield microseconds * rather than nanoseconds. The range runs from 0 to 10 milliseconds * in steps of 100 microseconds; we expect that no date(1) process * will take longer than 10 milliseconds to complete. */ @a["system calls over time"] = lquantize((vtimestamp - self->start) / 1000, 0, 10000, 100); } syscall::rexit:entry /self->start/ { self->start = 0; }
The preceding script provides greater insight into system call behavior when many date ( 1 ) processes are executed. To see this result, run sh -c 'while true; do date >/dev/null; done' in one window, while executing the D script in another. The script produces a profile of the system call behavior of the date ( 1 ) command:
# dtrace -s dateprof.d
dtrace: script './dateprof.d' matched 218 probes^C
system calls over time value ------------- Distribution ------------- count < 0 | 0 0 |@@ 20530 100 |@@@@@@ 48814 200 |@@@ 28119 300 |@ 14646 400 |@@@@@ 41237 500 | 1259 600 | 218 700 | 116 800 |@ 12783 900 |@@@ 28133 1000 | 7897 1100 |@ 14065 1200 |@@@ 27549 1300 |@@@ 25715 1400 |@@@@ 35011 1500 |@@ 16734 1600 | 498 1700 | 256 1800 | 369 1900 | 404 2000 | 320 2100 | 555 2200 | 54 2300 | 17 2400 | 5 2500 | 1 2600 | 7 2700 | 0
This output provides a rough idea of the different phases of the
date
(
1
)
command with respect to the services required of the kernel. To better understand these phases, you might want to understand which system calls are being called when. If so, you could change the D script to aggregate on the variable probefunc
instead of a constant string.