- Last login: Fri Aug 31 09:37:55 on ttys015
- -bash-3.2$ man fts
- Warning: cannot open configuration file /private/etc/man.conf
- No manual entry for fts
- -bash-3.2$ man n fts
- Warning: cannot open configuration file /private/etc/man.conf
- No entry for fts in section n of the manual
- -bash-3.2$ man a fts
- Warning: cannot open configuration file /private/etc/man.conf
- No manual entry for a
- No manual entry for fts
- -bash-3.2$ man
- Warning: cannot open configuration file /private/etc/man.conf
- What manual page do you want?
- -bash-3.2$ bash
- bash-3.2$ ls
- Applications Documents Library Music bin projects sudo
- Desktop Downloads Movies Pictures dev script-dev workspace1
- bash-3.2$ man
- Warning: cannot open configuration file /private/etc/man.conf
- What manual page do you want?
- bash-3.2$ man a fts
- Warning: cannot open configuration file /private/etc/man.conf
- No manual entry for a
- No manual entry for fts
- bash-3.2$ man fts
- Warning: cannot open configuration file /private/etc/man.conf
- No manual entry for fts
- bash-3.2$ man 8 fts
- Warning: cannot open configuration file /private/etc/man.conf
- No entry for fts in section 8 of the manual
- bash-3.2$ manpages n fts
- bash: manpages: command not found
- bash-3.2$ man1 fts
- bash: man1: command not found
- bash-3.2$ cd /usr/share
- bash-3.2$ ls
- CSI emacs icu mecabra snmp vim
- CoreDuetDaemonConfig.bundle examples info misc tabset zoneinfo
- calendar file java php terminfo zoneinfo.default
- com.apple.languageassetd firmware kdrl.bundle pmenergy texinfo zsh
- cracklib germantok kpep ri thermald.bundle
- cups groff langid sandbox tokenizer
- dict hiutil locale screen ucupdate
- doc httpd man skel uucp
- bash-3.2$ cd examples
- bash-3.2$ ls
- DTTk
- bash-3.2$ cd D*
- bash-3.2$ ls
- bitesize_example.txt errinfo_example.txt iopattern_example.txt newproc_example.txt rwbypid_example.txt syscallbypid_example.txt
- cpuwalk_example.txt execsnoop_example.txt iopending_example.txt opensnoop_example.txt rwbytype_example.txt syscallbyproc_example.txt
- creatbyproc_example.txt fddist_example.txt iosnoop_example.txt pathopens_example.txt rwsnoop_example.txt syscallbysysc_example.txt
- dappprof_example.txt filebyproc_example.txt iotop_example.txt pidpersec_example.txt sampleproc_example.txt topsyscall_example.txt
- dapptrace_example.txt hotspot_example.txt kill_example.txt priclass_example.txt seeksize_example.txt topsysproc_example.txt
- dispqlen_example.txt iofile_example.txt lastwords_example.txt pridist_example.txt setuids_example.txt
- dtruss_example.txt iofileb_example.txt loads_example.txt procsystime_example.txt sigdist_example.txt
- bash-3.2$ cat *
- In this example, bitesize.d was run for several seconds then Ctrl-C was hit.
- As bitesize.d runs it records how processes on the system are accessing the
- disks - in particular the size of the I/O operation. It is usually desirable
- for processes to be requesting large I/O operations rather than taking many
- small "bites".
- The final report highlights how processes performed. The find command mostly
- read 1K blocks while the tar command was reading large blocks - both as
- expected.
- # bitesize.d
- Tracing... Hit Ctrl-C to end.
- ^C
- PID CMD
- 7110 -bash\0
- value ------------- Distribution ------------- count
- 512 | 0
- 1024 |@@@@@@@@@@@@@@@@@@@@@@@@@@ 2
- 2048 | 0
- 4096 |@@@@@@@@@@@@@ 1
- 8192 | 0
- 7110 sync\0
- value ------------- Distribution ------------- count
- 512 | 0
- 1024 |@@@@@ 1
- 2048 |@@@@@@@@@@ 2
- 4096 | 0
- 8192 |@@@@@@@@@@@@@@@@@@@@@@@@@ 5
- 16384 | 0
- 0 sched\0
- value ------------- Distribution ------------- count
- 1024 | 0
- 2048 |@@@ 1
- 4096 | 0
- 8192 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 10
- 16384 | 0
- 7109 find /\0
- value ------------- Distribution ------------- count
- 512 | 0
- 1024 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1452
- 2048 |@@ 91
- 4096 | 33
- 8192 |@@ 97
- 16384 | 0
- 3 fsflush\0
- value ------------- Distribution ------------- count
- 4096 | 0
- 8192 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 449
- 16384 | 0
- 7108 tar cf /dev/null /\0
- value ------------- Distribution ------------- count
- 256 | 0
- 512 | 70
- 1024 |@@@@@@@@@@ 1306
- 2048 |@@@@ 569
- 4096 |@@@@@@@@@ 1286
- 8192 |@@@@@@@@@@ 1403
- 16384 |@ 190
- 32768 |@@@ 396
- 65536 | 0
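The value column in these histograms follows DTrace's quantize() aggregation: each measured I/O size falls into the power-of-two bucket at or below it, and each row's @-bar is scaled against the largest bucket count. A rough Python sketch of that bucketing (the helper names here are illustrative, not part of DTrace):

```python
from collections import Counter

def quantize(values):
    """Bucket positive values into power-of-two bins, as DTrace's
    quantize() does: a value of 1500 lands in the 1024 row."""
    buckets = Counter()
    for v in values:
        b = 1
        while b * 2 <= v:
            b *= 2
        buckets[b] += 1
    return buckets

def render(buckets, width=40):
    """Print rows of @-bars scaled to the largest bucket count."""
    peak = max(buckets.values())
    for b in sorted(buckets):
        bar = "@" * (width * buckets[b] // peak)
        print(f"{b:>10} |{bar:<{width}} {buckets[b]}")

render(quantize([1024, 1024, 1500, 4096]))
```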
- The following is a demonstration of the cpuwalk.d script.
- cpuwalk.d is not that useful on a single CPU server,
- # cpuwalk.d
- Sampling... Hit Ctrl-C to end.
- ^C
- PID: 18843 CMD: bash
- value ------------- Distribution ------------- count
- < 0 | 0
- 0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 30
- 1 | 0
- PID: 8079 CMD: mozilla-bin
- value ------------- Distribution ------------- count
- < 0 | 0
- 0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 10
- 1 | 0
- The output above shows that PID 18843, "bash", was sampled on CPU 0 a total
- of 30 times (we sample at 1000 hz).
- The following is a demonstration of running cpuwalk.d with a 5 second
- duration. This is on a 4 CPU server running a multithreaded CPU bound
- application called "cputhread",
- # cpuwalk.d 5
- Sampling...
- PID: 3 CMD: fsflush
- value ------------- Distribution ------------- count
- 1 | 0
- 2 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 30
- 3 | 0
- PID: 12186 CMD: cputhread
- value ------------- Distribution ------------- count
- < 0 | 0
- 0 |@@@@@@@@@@ 4900
- 1 |@@@@@@@@@@ 4900
- 2 |@@@@@@@@@@ 4860
- 3 |@@@@@@@@@@ 4890
- 4 | 0
- As we are sampling at 1000 hz, the application cputhread is indeed running
- concurrently across all available CPUs. We measured the application on
- CPU 0 a total of 4900 times, on CPU 1 a total of 4900 times, etc. As there
- are around 5000 samples per CPU available in this 5 second 1000 hz sample,
- the application is using almost all of the available CPU capacity on this server.
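The arithmetic behind that conclusion can be sketched in Python (a hypothetical helper, not part of cpuwalk.d): at 1000 hz over 5 seconds each CPU yields about 5000 samples, so dividing a process's per-CPU sample counts by that figure approximates its per-CPU utilization.

```python
def cpu_utilization(samples_per_cpu, hz, seconds):
    """Approximate per-CPU utilization from profile sample counts."""
    available = hz * seconds            # samples possible per CPU
    return {cpu: count / available for cpu, count in samples_per_cpu.items()}

# the cputhread counts from the output above
util = cpu_utilization({0: 4900, 1: 4900, 2: 4860, 3: 4890}, hz=1000, seconds=5)
```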
- The following is a similar demonstration, this time running a multithreaded
- CPU bound application called "cpuserial" that has a poor use of locking
- such that the threads "serialise",
- # cpuwalk.d 5
- Sampling...
- PID: 12194 CMD: cpuserial
- value ------------- Distribution ------------- count
- < 0 | 0
- 0 |@@@ 470
- 1 |@@@@@@ 920
- 2 |@@@@@@@@@@@@@@@@@@@@@@@@@ 3840
- 3 |@@@@@@ 850
- 4 | 0
- In the above, we can see that this CPU bound application is not making
- efficient use of the CPU resources available, only reaching 3840 samples
- on CPU 2 out of a potential 5000. This problem was caused by a poor use
- of locks.
- The following is an example of the creatbyproc.d script,
- Here we run creatbyproc.d for several seconds,
- # ./creatbyproc.d
- dtrace: script './creatbyproc.d' matched 2 probes
- CPU ID FUNCTION:NAME
- 0 5438 creat64:entry touch /tmp/newfile
- 0 5438 creat64:entry sh /tmp/mpLaaOik
- 0 5438 creat64:entry sh /dev/null
- ^C
- In another window, the following commands were run,
- touch /tmp/newfile
- man ls
- The file creation activity caused by these commands can be seen in the
- output by creatbyproc.d
- The following is a demonstration of the dappprof command,
- This is the usage for version 0.60,
- # dappprof -h
- USAGE: dappprof [-cehoTU] [-u lib] { -p PID | command }
- -p PID # examine this PID
- -a # print all details
- -c # print syscall counts
- -e # print elapsed times (us)
- -o # print on cpu times
- -T # print totals
- -u lib # trace this library instead
- -U # trace all libraries + user funcs
- -b bufsize # dynamic variable buf size
- eg,
- dappprof df -h # run and examine "df -h"
- dappprof -p 1871 # examine PID 1871
- dappprof -ap 1871 # print all data
- The following shows running dappprof with the "banner hello" command.
- Elapsed and on-cpu times are printed (-eo), as well as counts (-c) and
- totals (-T),
- # dappprof -eocT banner hello
- # # ###### # # ####
- # # # # # # #
- ###### ##### # # # #
- # # # # # # #
- # # # # # # #
- # # ###### ###### ###### ####
- CALL COUNT
- __fsr 1
- main 1
- banprt 1
- banner 1
- banset 1
- convert 5
- banfil 5
- TOTAL: 15
- CALL ELAPSED
- banset 37363
- banfil 147407
- convert 149606
- banprt 423507
- banner 891088
- __fsr 1694349
- TOTAL: 3343320
- CALL CPU
- banset 7532
- convert 8805
- banfil 11092
- __fsr 15708
- banner 48696
- banprt 388853
- TOTAL: 480686
- The above output analyses user functions (the default). It makes it
- easy to identify which function is called the most (COUNT), which
- takes the most time (ELAPSED), and which consumes the most CPU (CPU).
- These times are totals across all calls to each function.
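dappprof builds these tables by recording a timestamp at each function entry and pairing it with the matching return. A rough Python sketch of the COUNT and ELAPSED aggregation (the event format is hypothetical; elapsed times are inclusive of called functions, and the separate on-CPU column would additionally need off-CPU accounting):

```python
def profile(events):
    """Aggregate (timestamp_us, 'entry'|'return', func) events into
    per-function call counts and total (inclusive) elapsed times."""
    stacks, counts, elapsed = {}, {}, {}
    for ts, kind, fn in events:
        if kind == "entry":
            stacks.setdefault(fn, []).append(ts)
        else:
            t0 = stacks[fn].pop()          # match return to latest entry
            counts[fn] = counts.get(fn, 0) + 1
            elapsed[fn] = elapsed.get(fn, 0) + (ts - t0)
    return counts, elapsed

events = [(0, "entry", "banner"), (5, "entry", "banfil"),
          (30, "return", "banfil"), (80, "return", "banner")]
counts, elapsed = profile(events)
```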
- The following is a demonstration of the dapptrace command,
- This is the usage for version 0.60,
- # dapptrace -h
- USAGE: dapptrace [-acdeholFLU] [-u lib] { -p PID | command }
- -p PID # examine this PID
- -a # print all details
- -c # print syscall counts
- -d # print relative times (us)
- -e # print elapsed times (us)
- -F # print flow indentation
- -l # print pid/lwpid
- -o # print CPU on cpu times
- -u lib # trace this library instead
- -U # trace all libraries + user funcs
- -b bufsize # dynamic variable buf size
- eg,
- dapptrace df -h # run and examine "df -h"
- dapptrace -p 1871 # examine PID 1871
- dapptrace -Fp 1871 # print using flow indents
- dapptrace -eop 1871 # print elapsed and CPU times
- The following is an example of the default output. We run dapptrace with
- the "banner hello" command,
- # dapptrace banner hi
- # # #
- # # #
- ###### #
- # # #
- # # #
- # # #
- CALL(args) = return
- -> __fsr(0x2, 0x8047D7C, 0x8047D88)
- <- __fsr = 122
- -> main(0x2, 0x8047D7C, 0x8047D88)
- -> banner(0x8047E3B, 0x80614C2, 0x8047D38)
- -> banset(0x20, 0x80614C2, 0x8047DCC)
- <- banset = 36
- -> convert(0x68, 0x8047DCC, 0x2)
- <- convert = 319
- -> banfil(0x8061412, 0x80614C2, 0x8047DCC)
- <- banfil = 57
- -> convert(0x69, 0x8047DCC, 0x2)
- <- convert = 319
- -> banfil(0x8061419, 0x80614CA, 0x8047DCC)
- <- banfil = 57
- <- banner = 118
- -> banprt(0x80614C2, 0x8047D38, 0xD27FB824)
- <- banprt = 74
- The default output shows user function calls. An entry is prefixed
- with a "->", and the return has a "<-".
- Here we run dapptrace with the -F for flow indent option,
- # dapptrace -F banner hi
- # # #
- # # #
- ###### #
- # # #
- # # #
- # # #
- CALL(args) = return
- -> __fsr(0x2, 0x8047D7C, 0x8047D88)
- <- __fsr = 122
- -> main(0x2, 0x8047D7C, 0x8047D88)
- -> banner(0x8047E3B, 0x80614C2, 0x8047D38)
- -> banset(0x20, 0x80614C2, 0x8047DCC)
- <- banset = 36
- -> convert(0x68, 0x8047DCC, 0x2)
- <- convert = 319
- -> banfil(0x8061412, 0x80614C2, 0x8047DCC)
- <- banfil = 57
- -> convert(0x69, 0x8047DCC, 0x2)
- <- convert = 319
- -> banfil(0x8061419, 0x80614CA, 0x8047DCC)
- <- banfil = 57
- <- banner = 118
- -> banprt(0x80614C2, 0x8047D38, 0xD27FB824)
- <- banprt = 74
- The above output illustrates the flow of the program, which functions
- call which other functions.
- Now the same command is run with -d to display relative timestamps,
- # dapptrace -dF banner hi
- # # #
- # # #
- ###### #
- # # #
- # # #
- # # #
- RELATIVE CALL(args) = return
- 2512 -> __fsr(0x2, 0x8047D7C, 0x8047D88)
- 2516 <- __fsr = 122
- 2518 -> main(0x2, 0x8047D7C, 0x8047D88)
- 2863 -> banner(0x8047E3B, 0x80614C2, 0x8047D38)
- 2865 -> banset(0x20, 0x80614C2, 0x8047DCC)
- 2872 <- banset = 36
- 2874 -> convert(0x68, 0x8047DCC, 0x2)
- 2877 <- convert = 319
- 2879 -> banfil(0x8061412, 0x80614C2, 0x8047DCC)
- 2882 <- banfil = 57
- 2883 -> convert(0x69, 0x8047DCC, 0x2)
- 2885 <- convert = 319
- 2886 -> banfil(0x8061419, 0x80614CA, 0x8047DCC)
- 2888 <- banfil = 57
- 2890 <- banner = 118
- 2892 -> banprt(0x80614C2, 0x8047D38, 0xD27FB824)
- 3214 <- banprt = 74
- The relative times are in microseconds since the program's invocation. Great!
- Even better is if we use the -eo options, to print elapsed times and on-cpu
- times,
- # dapptrace -eoF banner hi
- # # #
- # # #
- ###### #
- # # #
- # # #
- # # #
- ELAPSD CPU CALL(args) = return
- . . -> __fsr(0x2, 0x8047D7C, 0x8047D88)
- 41 4 <- __fsr = 122
- . . -> main(0x2, 0x8047D7C, 0x8047D88)
- . . -> banner(0x8047E3B, 0x80614C2, 0x8047D38)
- . . -> banset(0x20, 0x80614C2, 0x8047DCC)
- 29 6 <- banset = 36
- . . -> convert(0x68, 0x8047DCC, 0x2)
- 26 3 <- convert = 319
- . . -> banfil(0x8061412, 0x80614C2, 0x8047DCC)
- 25 2 <- banfil = 57
- . . -> convert(0x69, 0x8047DCC, 0x2)
- 23 1 <- convert = 319
- . . -> banfil(0x8061419, 0x80614CA, 0x8047DCC)
- 23 1 <- banfil = 57
- 309 28 <- banner = 118
- . . -> banprt(0x80614C2, 0x8047D38, 0xD27FB824)
- 349 322 <- banprt = 74
- Now it is easy to see which functions take the longest (elapsed), and
- which consume the most CPU cycles.
- The following demonstrates the -U option, to trace all libraries,
- # dapptrace -U banner hi
- # # #
- # # #
- ###### #
- # # #
- # # #
- # # #
- CALL(args) = return
- -> ld.so.1:_rt_boot(0x8047E34, 0x8047E3B, 0x0)
- -> ld.so.1:_setup(0x8047D38, 0x20AE4, 0x3)
- -> ld.so.1:setup(0x8047D88, 0x8047DCC, 0x0)
- -> ld.so.1:fmap_setup(0x0, 0xD27FB2E4, 0xD27FB824)
- <- ld.so.1:fmap_setup = 125
- -> ld.so.1:addfree(0xD27FD3C0, 0xC40, 0x0)
- <- ld.so.1:addfree = 65
- -> ld.so.1:security(0xFFFFFFFF, 0xFFFFFFFF, 0xFFFFFFFF)
- <- ld.so.1:security = 142
- -> ld.so.1:readenv_user(0x8047D88, 0xD27FB204, 0xD27FB220)
- -> ld.so.1:ld_str_env(0x8047E3E, 0xD27FB204, 0xD27FB220)
- <- ld.so.1:ld_str_env = 389
- -> ld.so.1:ld_str_env(0x8047E45, 0xD27FB204, 0xD27FB220)
- <- ld.so.1:ld_str_env = 389
- -> ld.so.1:ld_str_env(0x8047E49, 0xD27FB204, 0xD27FB220)
- <- ld.so.1:ld_str_env = 389
- -> ld.so.1:ld_str_env(0x8047E50, 0xD27FB204, 0xD27FB220)
- -> ld.so.1:strncmp(0x8047E53, 0xD27F7BEB, 0x4)
- <- ld.so.1:strncmp = 113
- -> ld.so.1:rd_event(0xD27FB1F8, 0x3, 0x0)
- [...4486 lines deleted...]
- -> ld.so.1:_lwp_mutex_unlock(0xD27FD380, 0xD27FB824, 0x8047C04)
- <- ld.so.1:_lwp_mutex_unlock = 47
- <- ld.so.1:rt_mutex_unlock = 34
- -> ld.so.1:rt_bind_clear(0x1, 0xD279ECC0, 0xD27FDB2C)
- <- ld.so.1:rt_bind_clear = 34
- <- ld.so.1:leave = 210
- <- ld.so.1:elf_bndr = 803
- <- ld.so.1:elf_rtbndr = 35
- The output was huge, around 4500 lines long. Function names are prefixed
- with their library name, eg "ld.so.1".
- This full output should be used with caution, as it enables so many probes
- it could well be a burden on the system.
- This is a demonstration of the dispqlen.d script,
- Here we run it on a single CPU desktop,
- # dispqlen.d
- Sampling... Hit Ctrl-C to end.
- ^C
- CPU 0
- value ------------- Distribution ------------- count
- < 0 | 0
- 0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1790
- 1 |@@@ 160
- 2 | 10
- 3 | 0
- The output shows the length of the dispatcher queue is mostly 0. This is
- evidence that the CPU is not very saturated. It does not indicate that the
- CPU is idle - as we are measuring the length of the queue, not what is
- on the CPU.
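dispqlen.d samples each CPU's dispatcher queue length at a fixed rate, so a simple saturation metric from the same samples is the fraction that found the queue non-empty. A hypothetical Python helper, fed the single-CPU counts above:

```python
def saturation(queue_samples):
    """Fraction of samples that found runnable threads waiting (queue > 0)."""
    return sum(1 for q in queue_samples if q > 0) / len(queue_samples)

# the single-CPU output above: 1790 samples at 0, 160 at 1, 10 at 2
samples = [0] * 1790 + [1] * 160 + [2] * 10
# saturation(samples) is roughly 0.09: the queue was non-empty ~9% of the time
```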
- Here it is run on a multi CPU server,
- # dispqlen.d
- Sampling... Hit Ctrl-C to end.
- ^C
- CPU 1
- value ------------- Distribution ------------- count
- < 0 | 0
- 0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1573
- 1 |@@@@@@@@@ 436
- 2 | 4
- 3 | 0
- CPU 4
- value ------------- Distribution ------------- count
- < 0 | 0
- 0 |@@@@@@@@@@@@@@@@@@@@@@ 1100
- 1 |@@@@@@@@@@@@@@@@@@ 912
- 2 | 1
- 3 | 0
- CPU 0
- value ------------- Distribution ------------- count
- < 0 | 0
- 0 |@@@@@@@@@@@@@@@@@ 846
- 1 |@@@@@@@@@@@@@@@@@@@@@@@ 1167
- 2 | 0
- CPU 5
- value ------------- Distribution ------------- count
- < 0 | 0
- 0 |@@@@@@@@ 397
- 1 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1537
- 2 |@@ 79
- 3 | 0
- The above output shows that threads are queueing up on CPU 5 much more than
- CPU 0.
- The following demonstrates the dtruss command - a DTrace version of truss.
- This version is designed to be less intrusive and safer than running truss.
- dtruss has many options. Here is the help for version 0.70,
- USAGE: dtruss [-acdefholL] [-t syscall] { -p PID | -n name | command }
- -p PID # examine this PID
- -n name # examine this process name
- -t syscall # examine this syscall only
- -a # print all details
- -c # print syscall counts
- -d # print relative times (us)
- -e # print elapsed times (us)
- -f # follow children
- -l # force printing pid/lwpid
- -o # print on cpu times
- -L # don't print pid/lwpid
- -b bufsize # dynamic variable buf size
- eg,
- dtruss df -h # run and examine "df -h"
- dtruss -p 1871 # examine PID 1871
- dtruss -n tar # examine all processes called "tar"
- dtruss -f test.sh # run test.sh and follow children
- For example, here we dtruss any process with the name "ksh" - the Korn shell,
- # dtruss -n ksh
- PID/LWP SYSCALL(args) = return
- 27547/1: llseek(0x3F, 0xE4E, 0x0) = 3662 0
- 27547/1: read(0x3F, "\0", 0x400) = 0 0
- 27547/1: llseek(0x3F, 0x0, 0x0) = 3662 0
- 27547/1: write(0x3F, "ls -l\n\0", 0x8) = 8 0
- 27547/1: fdsync(0x3F, 0x10, 0xFEC1D444) = 0 0
- 27547/1: lwp_sigmask(0x3, 0x20000, 0x0) = 0xFFBFFEFF 0
- 27547/1: stat64("/usr/bin/ls\0", 0x8047A00, 0xFEC1D444) = 0 0
- 27547/1: lwp_sigmask(0x3, 0x0, 0x0) = 0xFFBFFEFF 0
- [...]
- dtruss does not yet translate the output for each system call as fully as truss does.
- In the following example, syscall elapsed and overhead times are measured.
- Elapsed times represent the time from syscall start to finish; overhead
- times measure the time spent on the CPU,
- # dtruss -eon bash
- PID/LWP ELAPSD CPU SYSCALL(args) = return
- 3911/1: 41 26 write(0x2, "l\0", 0x1) = 1 0
- 3911/1: 1001579 43 read(0x0, "s\0", 0x1) = 1 0
- 3911/1: 38 26 write(0x2, "s\0", 0x1) = 1 0
- 3911/1: 1019129 43 read(0x0, " \001\0", 0x1) = 1 0
- 3911/1: 38 26 write(0x2, " \0", 0x1) = 1 0
- 3911/1: 998533 43 read(0x0, "-\0", 0x1) = 1 0
- 3911/1: 38 26 write(0x2, "-\001\0", 0x1) = 1 0
- 3911/1: 1094323 42 read(0x0, "l\0", 0x1) = 1 0
- 3911/1: 39 27 write(0x2, "l\001\0", 0x1) = 1 0
- 3911/1: 1210496 44 read(0x0, "\r\0", 0x1) = 1 0
- 3911/1: 40 28 write(0x2, "\n\001\0", 0x1) = 1 0
- 3911/1: 9 1 lwp_sigmask(0x3, 0x2, 0x0) = 0xFFBFFEFF 0
- 3911/1: 70 63 ioctl(0x0, 0x540F, 0x80F6D00) = 0 0
- A bash command was in another window, where the "ls -l" command was being
- typed. The keystrokes can be seen above, along with the long elapsed times
- (keystroke delays), and short overhead times (as the bash process blocks
- on the read and leaves the CPU).
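The relationship between the ELAPSD and CPU columns can be expressed as a blocked-time fraction (a hypothetical helper, not a dtruss feature): the difference between the two is time the process spent off-CPU, here waiting for keystrokes.

```python
def blocked_fraction(elapsed_us, cpu_us):
    """Portion of a syscall's elapsed time spent blocked off-CPU."""
    return (elapsed_us - cpu_us) / elapsed_us

# the first read() above: 1001579 us elapsed, only 43 us on-CPU
wait = blocked_fraction(1001579, 43)   # nearly all of it is keystroke delay
```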
- Now dtruss is put to the test. Here we truss a test program that runs several
- hundred smaller programs, which in turn generate thousands of system calls.
- First, as a "control" we run the program without a truss or dtruss running,
- # time ./test
- real 0m38.508s
- user 0m5.299s
- sys 0m25.668s
- Now we try truss,
- # time truss ./test 2> /dev/null
- real 0m41.281s
- user 0m0.558s
- sys 0m1.351s
- Now we try dtruss,
- # time dtruss ./test 2> /dev/null
- real 0m46.226s
- user 0m6.771s
- sys 0m31.703s
- In the above test, truss slowed the program from 38 seconds to 41, while
- dtruss slowed it from 38 seconds to 46 - slightly slower than truss...
- Now we try follow mode "-f". The test program does run several hundred
- smaller programs, so now there are plenty more system calls to track,
- # time truss -f ./test 2> /dev/null
- real 2m28.317s
- user 0m0.893s
- sys 0m3.527s
- Now we try dtruss,
- # time dtruss -f ./test 2> /dev/null
- real 0m56.179s
- user 0m10.040s
- sys 0m38.185s
- Wow, the difference is huge! truss slows the program from 38 to 148 seconds;
- but dtruss has only slowed the program from 38 to 56 seconds.
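Those overheads can be stated as percentage slowdowns, using the times measured above (the helper is hypothetical, just making the comparison explicit):

```python
def slowdown_pct(base_s, traced_s):
    """Percentage slowdown of a traced run relative to the untraced run."""
    return (traced_s - base_s) / base_s * 100

truss_follow  = slowdown_pct(38.508, 148.317)   # truss -f:  ~285% slower
dtruss_follow = slowdown_pct(38.508, 56.179)    # dtruss -f:  ~46% slower
```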
- This is an example of the errinfo program, which prints details on syscall
- failures.
- By default it "snoops" syscall failures and prints their details,
- # ./errinfo
- EXEC SYSCALL ERR DESC
- wnck-applet read 11 Resource temporarily unavailable
- Xorg read 11 Resource temporarily unavailable
- nautilus read 11 Resource temporarily unavailable
- Xorg read 11 Resource temporarily unavailable
- dsdm read 11 Resource temporarily unavailable
- Xorg read 11 Resource temporarily unavailable
- Xorg pollsys 4 interrupted system call
- mozilla-bin lwp_park 62 timer expired
- gnome-netstatus- ioctl 12 Not enough core
- mozilla-bin lwp_park 62 timer expired
- Xorg read 11 Resource temporarily unavailable
- mozilla-bin lwp_park 62 timer expired
- [...]
- This is useful for watching these events live, but the output can scroll
- off the screen rather rapidly.
- The "-c" option will instead count the number of errors. Hit Ctrl-C to stop
- the sample. For example,
- # ./errinfo -c
- Tracing... Hit Ctrl-C to end.
- ^C
- EXEC SYSCALL ERR COUNT DESC
- nscd fcntl 22 1 Invalid argument
- xscreensaver read 11 1 Resource temporarily unavailable
- inetd lwp_park 62 1 timer expired
- svc.startd lwp_park 62 1 timer expired
- svc.configd lwp_park 62 1 timer expired
- ttymon ioctl 25 1 Inappropriate ioctl for device
- gnome-netstatus- ioctl 12 2 Not enough core
- mozilla-bin lwp_kill 3 2 No such process
- mozilla-bin connect 150 5 operation now in progress
- svc.startd portfs 62 8 timer expired
- java_vm lwp_cond_wait 62 8 timer expired
- soffice.bin read 11 9 Resource temporarily unavailable
- gnome-terminal read 11 23 Resource temporarily unavailable
- mozilla-bin recv 11 26 Resource temporarily unavailable
- nautilus read 11 26 Resource temporarily unavailable
- gnome-settings-d read 11 26 Resource temporarily unavailable
- gnome-smproxy read 11 34 Resource temporarily unavailable
- gnome-panel read 11 42 Resource temporarily unavailable
- dsdm read 11 112 Resource temporarily unavailable
- metacity read 11 128 Resource temporarily unavailable
- mozilla-bin lwp_park 62 133 timer expired
- Xorg pollsys 4 147 interrupted system call
- wnck-applet read 11 179 Resource temporarily unavailable
- mozilla-bin read 11 258 Resource temporarily unavailable
- Xorg read 11 1707 Resource temporarily unavailable
- Ok, so Xorg has received 1707 errors of the same type for the read() syscall.
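The ERR and DESC columns are simply the failed syscall's errno value and its strerror() text. The same mapping can be reproduced in Python (EAGAIN is 11 on Solaris and Linux; the number differs on some other platforms):

```python
import errno
import os

def describe(err):
    """Return the human-readable description for an errno, as in DESC."""
    return os.strerror(err)

print(errno.EAGAIN, describe(errno.EAGAIN))  # the read() failures above
print(errno.EINTR, describe(errno.EINTR))    # the interrupted pollsys calls
```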
- The "-n" option lets us match on one type of process only. In the following
- we match processes that have the name "mozilla-bin",
- # ./errinfo -c -n mozilla-bin
- Tracing... Hit Ctrl-C to end.
- ^C
- EXEC SYSCALL ERR COUNT DESC
- mozilla-bin getpeername 134 1 Socket is not connected
- mozilla-bin recv 11 2 Resource temporarily unavailable
- mozilla-bin lwp_kill 3 2 No such process
- mozilla-bin connect 150 5 operation now in progress
- mozilla-bin lwp_park 62 207 timer expired
- mozilla-bin read 11 396 Resource temporarily unavailable
- The "-p" option lets us examine one PID only. The following example examines
- PID 1119,
- # ./errinfo -c -p 1119
- Tracing... Hit Ctrl-C to end.
- ^C
- EXEC SYSCALL ERR COUNT DESC
- Xorg pollsys 4 47 interrupted system call
- Xorg read 11 669 Resource temporarily unavailable
- The following is an example of execsnoop. As processes are executed their
- details are printed out. Another user was logged in running a few commands
- which can be viewed below,
- # ./execsnoop
- UID PID PPID ARGS
- 100 3008 2656 ls
- 100 3009 2656 ls -l
- 100 3010 2656 cat /etc/passwd
- 100 3011 2656 vi /etc/hosts
- 100 3012 2656 date
- 100 3013 2656 ls -l
- 100 3014 2656 ls
- 100 3015 2656 finger
- [...]
- In this example the command "man gzip" was executed. The output lets us
- see what the man command is actually doing,
- # ./execsnoop
- UID PID PPID ARGS
- 100 3064 2656 man gzip
- 100 3065 3064 sh -c cd /usr/share/man; tbl /usr/share/man/man1/gzip.1 |nroff -u0 -Tlp -man -
- 100 3067 3066 tbl /usr/share/man/man1/gzip.1
- 100 3068 3066 nroff -u0 -Tlp -man -
- 100 3066 3065 col -x
- 100 3069 3064 sh -c trap '' 1 15; /usr/bin/mv -f /tmp/mpoMaa_f /usr/share/man/cat1/gzip.1 2>
- 100 3070 3069 /usr/bin/mv -f /tmp/mpoMaa_f /usr/share/man/cat1/gzip.1
- 100 3071 3064 sh -c more -s /tmp/mpoMaa_f
- 100 3072 3071 more -s /tmp/mpoMaa_f
- ^C
- Execsnoop has other options,
- # ./execsnoop -h
- USAGE: execsnoop [-a|-A|-sv] [-c command]
- execsnoop # default output
- -a # print all data
- -A # dump all data, space delimited
- -s # include start time, us
- -v # include start time, string
- -c command # command name to snoop
- In particular the verbose option for human readable timestamps is
- very useful,
- # ./execsnoop -v
- STRTIME UID PID PPID ARGS
- 2005 Jan 22 00:07:22 0 23053 20933 date
- 2005 Jan 22 00:07:24 0 23054 20933 uname -a
- 2005 Jan 22 00:07:25 0 23055 20933 ls -latr
- 2005 Jan 22 00:07:27 0 23056 20933 df -k
- 2005 Jan 22 00:07:29 0 23057 20933 ps -ef
- 2005 Jan 22 00:07:29 0 23057 20933 ps -ef
- 2005 Jan 22 00:07:34 0 23058 20933 uptime
- 2005 Jan 22 00:07:34 0 23058 20933 uptime
- [...]
- It is also possible to match particular commands. Here we watch
- anyone using the vi command only,
- # ./execsnoop -vc vi
- STRTIME UID PID PPID ARGS
- 2005 Jan 22 00:10:33 0 23063 20933 vi /etc/passwd
- 2005 Jan 22 00:10:40 0 23064 20933 vi /etc/shadow
- 2005 Jan 22 00:10:51 0 23065 20933 vi /etc/group
- 2005 Jan 22 00:10:57 0 23066 20933 vi /.rhosts
- [...]
- The following is a demonstration of the fddist command,
- Here fddist is run for a few seconds on an idle workstation,
- Tracing reads and writes... Hit Ctrl-C to end.
- ^C
- EXEC: dtrace PID: 3288
- value ------------- Distribution ------------- count
- 0 | 0
- 1 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 2
- 2 | 0
- EXEC: mozilla-bin PID: 1659
- value ------------- Distribution ------------- count
- 3 | 0
- 4 |@@@@@@@@@@ 28
- 5 | 0
- 6 |@@@@@@@@@@@@@@@ 40
- 7 |@@@@@@@@@@@@@@@ 40
- 8 | 0
- EXEC: Xorg PID: 1532
- value ------------- Distribution ------------- count
- 22 | 0
- 23 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 57
- 24 | 0
- The above displays the usage pattern for process file descriptors.
- We can see the Xorg process (PID 1532) has made 57 reads or writes to
- its file descriptor 23.
- The pfiles(1) command can be used to help determine what file
- descriptor 23 actually is.
- The following is an example of the filebyproc.d script,
- # filebyproc.d
- dtrace: description 'syscall::open*:entry ' matched 2 probes
- CPU ID FUNCTION:NAME
- 0 14 open:entry gnome-netstatus- /dev/kstat
- 0 14 open:entry man /var/ld/ld.config
- 0 14 open:entry man /lib/libc.so.1
- 0 14 open:entry man /usr/share/man/man.cf
- 0 14 open:entry man /usr/share/man/windex
- 0 14 open:entry man /usr/share/man/man1/ls.1
- 0 14 open:entry man /usr/share/man/man1/ls.1
- 0 14 open:entry man /tmp/mpqea4RF
- 0 14 open:entry sh /var/ld/ld.config
- 0 14 open:entry sh /lib/libc.so.1
- 0 14 open:entry neqn /var/ld/ld.config
- 0 14 open:entry neqn /lib/libc.so.1
- 0 14 open:entry neqn /usr/share/lib/pub/eqnchar
- 0 14 open:entry tbl /var/ld/ld.config
- 0 14 open:entry tbl /lib/libc.so.1
- 0 14 open:entry tbl /usr/share/man/man1/ls.1
- 0 14 open:entry nroff /var/ld/ld.config
- [...]
- In the above example, the command "man ls" was run. Each file that was
- opened (or attempted) can be seen, along with the name of the program responsible.
- The following is a demonstration of the hotspot.d script.
- Here the script is run while a large file is copied from one filesystem
- (cmdk0 102,0) to another (cmdk0 102,3). We can see the file mostly resided
- around the 9000 to 10999 Mb range on the source disk (102,0), and was
- copied to the 0 to 999 Mb range on the target disk (102,3).
- # ./hotspot.d
- Tracing... Hit Ctrl-C to end.
- ^C
- Disk: cmdk0 Major,Minor: 102,3
- value ------------- Distribution ------------- count
- < 0 | 0
- 0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 418
- 1000 | 0
- Disk: cmdk0 Major,Minor: 102,0
- value ------------- Distribution ------------- count
- < 0 | 0
- 0 | 1
- 1000 | 5
- 2000 | 0
- 3000 | 0
- 4000 | 0
- 5000 | 0
- 6000 | 0
- 7000 | 0
- 8000 | 0
- 9000 |@@@@@ 171
- 10000 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1157
- 11000 | 0
- The following is a demonstration of the iofile.d script,
- Here we run it while a tar command is backing up /var/adm,
- # iofile.d
- Tracing... Hit Ctrl-C to end.
- ^C
- PID CMD TIME FILE
- 5206 tar 109 /var/adm/acct/nite
- 5206 tar 110 /var/adm/acct/sum
- 5206 tar 114 /var/adm/acct/fiscal
- 5206 tar 117 /var/adm/messages.3
- 5206 tar 172 /var/adm/sa
- 5206 tar 3605 /var/adm/messages.2
- 5206 tar 4548 /var/adm/spellhist
- 5206 tar 5769 /var/adm/exacct/brendan1task
- 5206 tar 6416 /var/adm/acct
- 5206 tar 7587 /var/adm/messages.1
- 5206 tar 8246 /var/adm/exacct/task
- 5206 tar 8320 /var/adm/pool
- 5206 tar 8973 /var/adm/pool/history
- 5206 tar 9183 /var/adm/exacct
- 3 fsflush 10882 <none>
- 5206 tar 11861 /var/adm/exacct/flow
- 5206 tar 12042 /var/adm/messages.0
- 5206 tar 12408 /var/adm/sm.bin
- 5206 tar 13021 /var/adm/sulog
- 5206 tar 19007 /var/adm/streams
- 5206 tar 21811 <none>
- 5206 tar 24918 /var/adm/exacct/proc
- In the above output, we can see that the tar command spent 24918 us (25 ms)
- waiting for disk I/O on the /var/adm/exacct/proc file.
- The following is a demonstration of the iofileb.d script,
- Here we run it while a tar command is backing up /var/adm,
- # ./iofileb.d
- Tracing... Hit Ctrl-C to end.
- ^C
- PID CMD KB FILE
- 29529 tar 56 /var/adm/sa/sa31
- 29529 tar 56 /var/adm/sa/sa03
- 29529 tar 56 /var/adm/sa/sa02
- 29529 tar 56 /var/adm/sa/sa01
- 29529 tar 56 /var/adm/sa/sa04
- 29529 tar 56 /var/adm/sa/sa27
- 29529 tar 56 /var/adm/sa/sa28
- 29529 tar 324 /var/adm/exacct/task
- 29529 tar 736 /var/adm/wtmpx
- In the above output, we can see that the tar command has caused 736 Kbytes
- of the /var/adm/wtmpx file to be read from disk. All of the Kbyte values
- measured are for disk activity.
- The following is a demonstration of the iopattern program,
- Here we run iopattern for a few seconds then hit Ctrl-C. There is a "dd"
- command running on this system to intentionally create heavy sequential
- disk activity,
- # iopattern
- %RAN %SEQ COUNT MIN MAX AVG KR KW
- 1 99 465 4096 57344 52992 23916 148
- 0 100 556 57344 57344 57344 31136 0
- 0 100 634 57344 57344 57344 35504 0
- 6 94 554 512 57344 54034 29184 49
- 0 100 489 57344 57344 57344 27384 0
- 21 79 568 4096 57344 46188 25576 44
- 4 96 431 4096 57344 56118 23620 0
- ^C
- In the above output we can see that the disk activity is mostly sequential.
- The disks are also pulling around 30 Mb during each sample, with a large
- average event size.
- The following demonstrates iopattern while running a "find" command to
- cause random disk activity,
- # iopattern
- %RAN %SEQ COUNT MIN MAX AVG KR KW
- 86 14 400 1024 8192 1543 603 0
- 81 19 455 1024 8192 1606 714 0
- 89 11 469 512 8192 1854 550 299
- 83 17 463 1024 8192 1782 806 0
- 87 13 394 1024 8192 1551 597 0
- 85 15 348 512 57344 2835 808 155
- 91 9 513 512 47616 2812 570 839
- 76 24 317 512 35840 3755 562 600
- ^C
- In the above output, we can see from the percentages that the disk events
- were mostly random. We can also see that the average event size is small -
- which makes sense if we are reading through many directory files.
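A common heuristic for this random/sequential classification, in the spirit of what iopattern reports, is to call a disk event sequential when it starts at the block where the previous event ended. A rough Python sketch with a hypothetical event format:

```python
def random_vs_sequential(events):
    """Classify (start_block, size_blocks) disk events: an event is
    sequential if it begins where the previous one ended.
    Returns (%random, %sequential) rounded to whole percentages."""
    ran = seq = 0
    prev_end = None
    for start, size in events:
        if prev_end is not None:
            if start == prev_end:
                seq += 1
            else:
                ran += 1
        prev_end = start + size
    total = ran + seq
    return round(100 * ran / total), round(100 * seq / total)

# three back-to-back events, a long seek, then one more sequential event
pcts = random_vs_sequential([(0, 8), (8, 8), (16, 8), (1000, 8), (1008, 8)])
```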
- iopattern has options. Here we print timestamps "-v" and measure every 10
- seconds,
- # iopattern -v 10
- TIME %RAN %SEQ COUNT MIN MAX AVG KR KW
- 2005 Jul 25 20:40:55 97 3 33 512 8192 1163 8 29
- 2005 Jul 25 20:41:05 0 0 0 0 0 0 0 0
- 2005 Jul 25 20:41:15 84 16 6 512 11776 5973 22 13
- 2005 Jul 25 20:41:25 100 0 26 512 8192 1496 8 30
- 2005 Jul 25 20:41:35 0 0 0 0 0 0 0 0
- ^C
- The following is a demonstration of the iopending tool,
- Here we run it with a sample interval of 1 second,
- # iopending 1
- Tracing... Please wait.
- 2006 Jan 6 20:21:59, load: 0.02, disk_r: 0 KB, disk_w: 0 KB
- value ------------- Distribution ------------- count
- < 0 | 0
- 0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1010
- 1 | 0
- 2006 Jan 6 20:22:00, load: 0.03, disk_r: 0 KB, disk_w: 0 KB
- value ------------- Distribution ------------- count
- < 0 | 0
- 0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1000
- 1 | 0
- 2006 Jan 6 20:22:01, load: 0.03, disk_r: 0 KB, disk_w: 0 KB
- value ------------- Distribution ------------- count
- < 0 | 0
- 0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1000
- 1 | 0
- ^C
- The iopending tool samples at 1000 Hz, and prints a distribution of how many
- disk events were "pending" completion. In the above example the disks are
- quiet - for all the samples there are zero disk events pending.
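The pending count itself is just a counter incremented when an I/O is issued and decremented when it completes; iopending samples that counter at 1000 Hz to build the distribution. A minimal Python sketch (the event-stream format is hypothetical):

```python
def pending_after_each(events):
    """Outstanding-I/O counter: +1 on an I/O start event, -1 on done."""
    pending, series = 0, []
    for e in events:
        pending += 1 if e == "start" else -1
        series.append(pending)
    return series

trace = ["start", "start", "done", "start", "done", "done"]
# sampling this counter at a fixed rate yields the distributions shown above
```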
- Now iopending is run with no arguments. It will default to an interval of 5
- seconds,
- # iopending
- Tracing... Please wait.
- 2006 Jan 6 19:15:41, load: 0.03, disk_r: 3599 KB, disk_w: 0 KB
- value ------------- Distribution ------------- count
- < 0 | 0
- 0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 4450
- 1 |@@@ 390
- 2 |@ 80
- 3 | 40
- 4 | 20
- 5 | 30
- 6 | 0
- ^C
- In the above output there was a little disk activity. For 390 samples there
- was 1 I/O event pending; for 80 samples there were 2, and so on.
- In the following example iopending is run during heavy disk activity. We
- print output every 10 seconds,
- # iopending 10
- Tracing... Please wait.
- 2006 Jan 6 20:58:07, load: 0.03, disk_r: 25172 KB, disk_w: 33321 KB
- value ------------- Distribution ------------- count
- < 0 | 0
- 0 |@@@@@@@@@ 2160
- 1 |@@@@@@@@@@@@@@@@@@@@@@@@@@@ 6720
- 2 |@@@@ 1000
- 3 | 50
- 4 | 30
- 5 | 20
- 6 | 10
- 7 | 10
- 8 | 10
- 9 | 0
- 2006 Jan 6 20:58:17, load: 0.05, disk_r: 8409 KB, disk_w: 12449 KB
- value ------------- Distribution ------------- count
- < 0 | 0
- 0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 7260
- 1 |@@@@@@@ 1700
- 2 |@ 300
- 3 | 0
- 4 | 10
- 5 | 10
- 6 | 10
- 7 | 20
- 8 | 0
- 9 | 0
- 10 | 0
- 11 | 0
- 12 | 0
- 13 | 0
- 14 | 0
- 15 | 0
- 16 | 0
- 17 | 10
- 18 | 20
- 19 | 0
- 20 | 0
- 21 | 0
- 22 | 0
- 23 | 0
- 24 | 0
- 25 | 0
- 26 | 0
- 27 | 0
- 28 | 0
- 29 | 0
- 30 | 0
- 31 | 10
- >= 32 |@@@ 650
- ^C
- In the first output, most of the time (67%) there was 1 event pending,
- and for a short time there were 8 events pending. In the second output we
- see many samples were off the scale - 650 samples at 32 or more pending
- events. For this sample I had typed "sync" in another window, which
- immediately queued many disk events that were eventually completed.
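- As a sanity check, the share of sampled time with at least one event
- pending can be totalled from the value/count pairs in the first output
- above:

```shell
# "pending count" pairs taken from the first 10-second iopending interval.
printf '%s\n' "0 2160" "1 6720" "2 1000" "3 50" "4 30" "5 20" "6 10" "7 10" "8 10" |
awk '{ total += $2; if ($1 > 0) busy += $2 }
     END { printf "I/O pending in %d%% of samples\n", busy * 100 / total }'
```

- So for that interval the disk had outstanding I/O for roughly 78% of the
- sampled time.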
- The following demonstrates iosnoop. It was run on a system that was
- fairly quiet until a tar command was run,
- # ./iosnoop
- UID PID D BLOCK SIZE COMM PATHNAME
- 0 0 W 1067 512 sched <none>
- 0 0 W 6496304 1024 sched <none>
- 0 3 W 6498797 512 fsflush <none>
- 0 0 W 1067 512 sched <none>
- 0 0 W 6496304 1024 sched <none>
- 100 443 R 892288 4096 Xsun /usr/openwin/bin/Xsun
- 100 443 R 891456 4096 Xsun /usr/openwin/bin/Xsun
- 100 15795 R 3808 8192 tar /usr/bin/eject
- 100 15795 R 35904 6144 tar /usr/bin/eject
- 100 15795 R 39828 6144 tar /usr/bin/env
- 100 15795 R 3872 8192 tar /usr/bin/expr
- 100 15795 R 21120 7168 tar /usr/bin/expr
- 100 15795 R 43680 6144 tar /usr/bin/false
- 100 15795 R 44176 6144 tar /usr/bin/fdetach
- 100 15795 R 3920 8192 tar /usr/bin/fdformat
- 100 15795 R 3936 8192 tar /usr/bin/fdformat
- 100 15795 R 4080 8192 tar /usr/bin/fdformat
- 100 15795 R 9680 3072 tar /usr/bin/fdformat
- 100 15795 R 4096 8192 tar /usr/bin/fgrep
- 100 15795 R 46896 6144 tar /usr/bin/fgrep
- 100 15795 R 4112 8192 tar /usr/bin/file
- 100 15795 R 4128 8192 tar /usr/bin/file
- 100 15795 R 4144 8192 tar /usr/bin/file
- 100 15795 R 21552 7168 tar /usr/bin/file
- 100 15795 R 4192 8192 tar /usr/bin/fmli
- 100 15795 R 4208 8192 tar /usr/bin/fmli
- 100 15795 R 4224 57344 tar /usr/bin/fmli
- 100 15795 R 4336 24576 tar /usr/bin/fmli
- 100 15795 R 695792 8192 tar <none>
- 100 15795 R 696432 57344 tar /usr/bin/fmli
- [...]
- The following are demonstrations of the iotop program,
- Here we run iotop with the -C option to not clear the screen, but instead
- provide a scrolling output,
- # iotop -C
- Tracing... Please wait.
- 2005 Jul 16 00:34:40, load: 1.21, disk_r: 12891 KB, disk_w: 1087 KB
- UID PID PPID CMD DEVICE MAJ MIN D BYTES
- 0 3 0 fsflush cmdk0 102 4 W 512
- 0 3 0 fsflush cmdk0 102 0 W 11776
- 0 27751 20320 tar cmdk0 102 16 W 23040
- 0 3 0 fsflush cmdk0 102 0 R 73728
- 0 0 0 sched cmdk0 102 0 R 548864
- 0 0 0 sched cmdk0 102 0 W 1078272
- 0 27751 20320 tar cmdk0 102 16 R 1514496
- 0 27751 20320 tar cmdk0 102 3 R 11767808
- 2005 Jul 16 00:34:45, load: 1.23, disk_r: 83849 KB, disk_w: 488 KB
- UID PID PPID CMD DEVICE MAJ MIN D BYTES
- 0 0 0 sched cmdk0 102 4 W 1536
- 0 0 0 sched cmdk0 102 0 R 131072
- 0 27752 20320 find cmdk0 102 0 R 262144
- 0 0 0 sched cmdk0 102 0 W 498176
- 0 27751 20320 tar cmdk0 102 3 R 11780096
- 0 27751 20320 tar cmdk0 102 5 R 29745152
- 0 27751 20320 tar cmdk0 102 4 R 47203328
- 2005 Jul 16 00:34:50, load: 1.25, disk_r: 22394 KB, disk_w: 2 KB
- UID PID PPID CMD DEVICE MAJ MIN D BYTES
- 0 27752 20320 find cmdk0 102 0 W 2048
- 0 0 0 sched cmdk0 102 0 R 16384
- 0 321 1 automountd cmdk0 102 0 R 22528
- 0 27752 20320 find cmdk0 102 0 R 1462272
- 0 27751 20320 tar cmdk0 102 5 R 17465344
- In the above output, we can see a tar command reading from the cmdk0
- disk, from several different slices (different minor numbers), with the last
- report focusing on 102,5 (an "ls -lL" in /dev/dsk shows the number-to-slice
- mappings).
- The disk_r and disk_w values give a summary of the overall activity in
- bytes.
- Bytes can be used as a yardstick to determine which process is keeping the
- disks busy; however, either of the delta times available from iotop is
- more accurate, as they take into account whether the activity is random
- or sequential.
- # iotop -Co
- Tracing... Please wait.
- 2005 Jul 16 00:39:03, load: 1.10, disk_r: 5302 KB, disk_w: 20 KB
- UID PID PPID CMD DEVICE MAJ MIN D DISKTIME
- 0 0 0 sched cmdk0 102 0 W 532
- 0 0 0 sched cmdk0 102 0 R 245398
- 0 27758 20320 find cmdk0 102 0 R 3094794
- 2005 Jul 16 00:39:08, load: 1.14, disk_r: 5268 KB, disk_w: 273 KB
- UID PID PPID CMD DEVICE MAJ MIN D DISKTIME
- 0 3 0 fsflush cmdk0 102 0 W 2834
- 0 0 0 sched cmdk0 102 0 W 263527
- 0 0 0 sched cmdk0 102 0 R 285015
- 0 3 0 fsflush cmdk0 102 0 R 519187
- 0 27758 20320 find cmdk0 102 0 R 2429232
- 2005 Jul 16 00:39:13, load: 1.16, disk_r: 602 KB, disk_w: 1238 KB
- UID PID PPID CMD DEVICE MAJ MIN D DISKTIME
- 0 3 0 fsflush cmdk0 102 4 W 200
- 0 3 0 fsflush cmdk0 102 6 W 260
- 0 3 0 fsflush cmdk0 102 0 W 883
- 0 27758 20320 find cmdk0 102 0 R 55686
- 0 3 0 fsflush cmdk0 102 0 R 317508
- 0 0 0 sched cmdk0 102 0 R 320195
- 0 0 0 sched cmdk0 102 0 W 571084
- [...]
- The disk time is in microseconds. In the first sample, we can see the find
- command caused a total of 3.094 seconds of disk time - the duration of the
- samples here is 5 seconds (the default), so it would be fair to say that
- the find command is keeping the disk 60% busy.
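- That utilisation figure is just the disk time divided by the interval; as a
- quick awk sketch (DISKTIME is in microseconds, the interval is 5 seconds):

```shell
# Disk utilisation from iotop -o output: DISKTIME over the sample interval.
awk 'BEGIN {
    disktime = 3094794        # find, from the first sample above (microseconds)
    interval = 5 * 1000000    # 5 second sample interval, in microseconds
    printf "disk busy: %.0f%%\n", disktime * 100 / interval
}'
```

- This prints "disk busy: 62%", in line with the rough figure above.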
- A new option for iotop is to print percentages ("-P"), which are based on
- disk I/O times and hence are a fair measurement of what is keeping the
- disks busy.
- # iotop -PC 1
- Tracing... Please wait.
- 2005 Nov 18 15:26:14, load: 0.24, disk_r: 13176 KB, disk_w: 0 KB
- UID PID PPID CMD DEVICE MAJ MIN D %I/O
- 0 2215 1663 bart cmdk0 102 0 R 85
- 2005 Nov 18 15:26:15, load: 0.25, disk_r: 5263 KB, disk_w: 0 KB
- UID PID PPID CMD DEVICE MAJ MIN D %I/O
- 0 2214 1663 find cmdk0 102 0 R 15
- 0 2215 1663 bart cmdk0 102 0 R 67
- 2005 Nov 18 15:26:16, load: 0.25, disk_r: 8724 KB, disk_w: 0 KB
- UID PID PPID CMD DEVICE MAJ MIN D %I/O
- 0 2214 1663 find cmdk0 102 0 R 10
- 0 2215 1663 bart cmdk0 102 0 R 71
- 2005 Nov 18 15:26:17, load: 0.25, disk_r: 7528 KB, disk_w: 0 KB
- UID PID PPID CMD DEVICE MAJ MIN D %I/O
- 0 2214 1663 find cmdk0 102 0 R 0
- 0 2215 1663 bart cmdk0 102 0 R 85
- 2005 Nov 18 15:26:18, load: 0.26, disk_r: 11389 KB, disk_w: 0 KB
- UID PID PPID CMD DEVICE MAJ MIN D %I/O
- 0 2214 1663 find cmdk0 102 0 R 2
- 0 2215 1663 bart cmdk0 102 0 R 80
- 2005 Nov 18 15:26:19, load: 0.26, disk_r: 22109 KB, disk_w: 0 KB
- UID PID PPID CMD DEVICE MAJ MIN D %I/O
- 0 2215 1663 bart cmdk0 102 0 R 76
- ^C
- In the above output, bart and find jostle for disk access as they create
- a database of file checksums. The command was,
- find / | bart create -I > /dev/null
- Note that the %I/O is in terms of 1 disk. A %I/O of, say, 200 is allowed -
- it would mean that effectively 2 disks were at 100%, or 4 disks at 50%, etc.
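- Since %I/O is per disk, a value over 100 simply divides across the spindles
- doing the work:

```shell
# A total %I/O of 200 across N disks averages 200/N percent per disk.
awk 'BEGIN { for (n = 2; n <= 4; n *= 2) printf "%d disks at %d%% each\n", n, 200 / n }'
```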
- This is an example of the kill.d DTrace script,
- # kill.d
- FROM COMMAND SIG TO RESULT
- 2344 bash 2 3117 0
- 2344 bash 9 12345 -1
- ^C
- In the above output, a kill -2 (Ctrl-C) was sent from the bash command
- to PID 3117. Then a kill -9 (SIGKILL) was sent to PID 12345, which
- returned "-1" for failure.
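- The RESULT column mirrors the return value of kill(2): 0 on success, -1 on
- failure. The same convention can be seen from a shell, using signal 0 to
- test deliverability without actually sending a signal (99999999 is assumed
- not to be a valid PID here):

```shell
kill -0 $$ && echo "result 0"                      # our own shell: succeeds
kill -0 99999999 2>/dev/null || echo "result -1"   # no such PID: fails
```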
- The following is a demonstration of the lastwords command,
- Here we run lastwords to catch syscalls from processes named "bash" as they
- exit,
- # ./lastwords bash
- Tracing... Waiting for bash to exit...
- 1091567219163679 1861 bash sigaction 0 0
- 1091567219177487 1861 bash sigaction 0 0
- 1091567219189692 1861 bash sigaction 0 0
- 1091567219202085 1861 bash sigaction 0 0
- 1091567219214553 1861 bash sigaction 0 0
- 1091567219226690 1861 bash sigaction 0 0
- 1091567219238786 1861 bash sigaction 0 0
- 1091567219251697 1861 bash sigaction 0 0
- 1091567219265770 1861 bash sigaction 0 0
- 1091567219294110 1861 bash gtime 42a7c194 0
- 1091567219428305 1861 bash write 5 0
- 1091567219451138 1861 bash setcontext 0 0
- 1091567219473911 1861 bash sigaction 0 0
- 1091567219516487 1861 bash stat64 0 0
- 1091567219547973 1861 bash open64 4 0
- 1091567219638345 1861 bash write 5 0
- 1091567219658886 1861 bash close 0 0
- 1091567219689094 1861 bash open64 4 0
- 1091567219704301 1861 bash fstat64 0 0
- 1091567219731796 1861 bash read 2fe 0
- 1091567219745541 1861 bash close 0 0
- 1091567219768536 1861 bash lwp_sigmask ffbffeff 0
- 1091567219787494 1861 bash ioctl 0 0
- 1091567219801338 1861 bash setpgrp 6a3 0
- 1091567219814067 1861 bash ioctl 0 0
- 1091567219825791 1861 bash lwp_sigmask ffbffeff 0
- 1091567219847778 1861 bash setpgrp 0 0
- TIME PID EXEC SYSCALL RETURN ERR
- In another window, a bash shell was executed and then exited normally. The
- last few system calls that the bash shell made can be seen above.
- In the following example we monitor the exit of bash shells again, but this
- time the bash shell sends itself a "kill -8",
- # ./lastwords bash
- Tracing... Waiting for bash to exit...
- 1091650185555391 1865 bash sigaction 0 0
- 1091650185567963 1865 bash sigaction 0 0
- 1091650185580316 1865 bash sigaction 0 0
- 1091650185592381 1865 bash sigaction 0 0
- 1091650185605046 1865 bash sigaction 0 0
- 1091650185618451 1865 bash sigaction 0 0
- 1091650185647663 1865 bash gtime 42a7c1e7 0
- 1091650185794626 1865 bash kill 0 0
- 1091650185836941 1865 bash lwp_sigmask ffbffeff 0
- 1091650185884145 1865 bash stat64 0 0
- 1091650185916135 1865 bash open64 4 0
- 1091650186005673 1865 bash write b 0
- 1091650186025782 1865 bash close 0 0
- 1091650186052002 1865 bash open64 4 0
- 1091650186067538 1865 bash fstat64 0 0
- 1091650186094289 1865 bash read 309 0
- 1091650186108086 1865 bash close 0 0
- 1091650186129965 1865 bash lwp_sigmask ffbffeff 0
- 1091650186149092 1865 bash ioctl 0 0
- 1091650186162614 1865 bash setpgrp 6a3 0
- 1091650186175457 1865 bash ioctl 0 0
- 1091650186187206 1865 bash lwp_sigmask ffbffeff 0
- 1091650186209514 1865 bash setpgrp 0 0
- 1091650186225307 1865 bash sigaction 0 0
- 1091650186238832 1865 bash getpid 749 0
- 1091650186260149 1865 bash kill 0 0
- 1091650186277925 1865 bash setcontext 0 0
- TIME PID EXEC SYSCALL RETURN ERR
- The last few system calls are different; we can see the kill system call
- before bash exits.
- The following is a demonstration of the loads.d script.
- Here we run both loads.d and the uptime command for comparison,
- # uptime
- 1:30am up 14 day(s), 2:27, 3 users, load average: 3.52, 3.45, 3.05
- # ./loads.d
- 2005 Jun 11 01:30:49, load average: 3.52, 3.45, 3.05
- Both have returned the same load average, confirming that loads.d is
- behaving as expected.
- The point of loads.d is to demonstrate fetching the same data as uptime
- does, in the DTrace language. It is not intended as a replacement for
- the uptime(1) command.
- The following is an example of the newproc.d script,
- # ./newproc.d
- dtrace: description 'proc:::exec-success ' matched 1 probe
- CPU ID FUNCTION:NAME
- 0 3297 exec_common:exec-success man ls
- 0 3297 exec_common:exec-success sh -c cd /usr/share/man; tbl /usr/share/man/man1/ls.1 |neqn /usr/share/lib/pub/
- 0 3297 exec_common:exec-success tbl /usr/share/man/man1/ls.1
- 0 3297 exec_common:exec-success neqn /usr/share/lib/pub/eqnchar -
- 0 3297 exec_common:exec-success nroff -u0 -Tlp -man -
- 0 3297 exec_common:exec-success col -x
- 0 3297 exec_common:exec-success sh -c trap '' 1 15; /usr/bin/mv -f/tmp/mpzIaOZF /usr/share/man/cat1/ls.1 2> /d
- 0 3297 exec_common:exec-success /usr/bin/mv -f /tmp/mpzIaOZF /usr/share/man/cat1/ls.1
- 0 3297 exec_common:exec-success sh -c more -s /tmp/mpzIaOZF
- 0 3297 exec_common:exec-success more -s /tmp/mpzIaOZF
- The above output was caught when running "man ls". This identifies all the
- commands responsible for processing the man page.
- The following are examples of opensnoop. File open events are traced
- along with some process details.
- This first example is of the default output. The commands "cat", "cal",
- "ls" and "uname" were run. The returned file descriptors (or -1 for error)
- are shown, along with the filenames.
- # ./opensnoop
- UID PID COMM FD PATH
- 100 3504 cat -1 /var/ld/ld.config
- 100 3504 cat 3 /usr/lib/libc.so.1
- 100 3504 cat 3 /etc/passwd
- 100 3505 cal -1 /var/ld/ld.config
- 100 3505 cal 3 /usr/lib/libc.so.1
- 100 3505 cal 3 /usr/share/lib/zoneinfo/Australia/NSW
- 100 3506 ls -1 /var/ld/ld.config
- 100 3506 ls 3 /usr/lib/libc.so.1
- 100 3507 uname -1 /var/ld/ld.config
- 100 3507 uname 3 /usr/lib/libc.so.1
- [...]
- Full command arguments can be fetched using -g,
- # ./opensnoop -g
- UID PID PATH FD ARGS
- 100 3528 /var/ld/ld.config -1 cat /etc/passwd
- 100 3528 /usr/lib/libc.so.1 3 cat /etc/passwd
- 100 3528 /etc/passwd 3 cat /etc/passwd
- 100 3529 /var/ld/ld.config -1 cal
- 100 3529 /usr/lib/libc.so.1 3 cal
- 100 3529 /usr/share/lib/zoneinfo/Australia/NSW 3 cal
- 100 3530 /var/ld/ld.config -1 ls -l
- 100 3530 /usr/lib/libc.so.1 3 ls -l
- 100 3530 /var/run/name_service_door 3 ls -l
- 100 3530 /usr/share/lib/zoneinfo/Australia/NSW 4 ls -l
- 100 3531 /var/ld/ld.config -1 uname -a
- 100 3531 /usr/lib/libc.so.1 3 uname -a
- [...]
- The verbose option prints human readable timestamps,
- # ./opensnoop -v
- STRTIME UID PID COMM FD PATH
- 2005 Jan 22 01:22:50 0 23212 df -1 /var/ld/ld.config
- 2005 Jan 22 01:22:50 0 23212 df 3 /lib/libcmd.so.1
- 2005 Jan 22 01:22:50 0 23212 df 3 /lib/libc.so.1
- 2005 Jan 22 01:22:50 0 23212 df 3 /platform/SUNW,Sun-Fire-V210/lib/libc_psr.so.1
- 2005 Jan 22 01:22:50 0 23212 df 3 /etc/mnttab
- 2005 Jan 22 01:22:50 0 23211 dtrace 4 /usr/share/lib/zoneinfo/Australia/NSW
- 2005 Jan 22 01:22:51 0 23213 uname -1 /var/ld/ld.config
- 2005 Jan 22 01:22:51 0 23213 uname 3 /lib/libc.so.1
- 2005 Jan 22 01:22:51 0 23213 uname 3 /platform/SUNW,Sun-Fire-V210/lib/libc_psr.so.1
- [...]
- Particular files can be monitored using -f. For example,
- # ./opensnoop -vgf /etc/passwd
- STRTIME UID PID PATH FD ARGS
- 2005 Jan 22 01:28:50 0 23242 /etc/passwd 3 cat /etc/passwd
- 2005 Jan 22 01:28:54 0 23243 /etc/passwd 4 vi /etc/passwd
- 2005 Jan 22 01:29:06 0 23244 /etc/passwd 3 passwd brendan
- [...]
- This example is of opensnoop running on a quiet system. We can see
- various daemons opening files,
- # ./opensnoop
- UID PID COMM FD PATH
- 0 253 nscd 5 /etc/user_attr
- 0 253 nscd 5 /etc/hosts
- 0 419 mibiisa 2 /dev/kstat
- 0 419 mibiisa 2 /dev/rtls
- 0 419 mibiisa 2 /dev/kstat
- 0 419 mibiisa 2 /dev/kstat
- 0 419 mibiisa 2 /dev/rtls
- 0 419 mibiisa 2 /dev/kstat
- 0 253 nscd 5 /etc/user_attr
- 0 419 mibiisa 2 /dev/kstat
- 0 419 mibiisa 2 /dev/rtls
- 0 419 mibiisa 2 /dev/kstat
- 0 174 in.routed 8 /dev/kstat
- 0 174 in.routed 8 /dev/kstat
- 0 174 in.routed 6 /dev/ip
- 0 419 mibiisa 2 /dev/kstat
- 0 419 mibiisa 2 /dev/rtls
- 0 419 mibiisa 2 /dev/kstat
- 0 293 utmpd 4 /var/adm/utmpx
- 0 293 utmpd 5 /var/adm/utmpx
- 0 293 utmpd 6 /proc/442/psinfo
- 0 293 utmpd 6 /proc/567/psinfo
- 0 293 utmpd 6 /proc/567/psinfo
- 0 293 utmpd 6 /proc/567/psinfo
- 0 293 utmpd 6 /proc/567/psinfo
- 0 293 utmpd 6 /proc/567/psinfo
- 0 293 utmpd 6 /proc/567/psinfo
- 0 293 utmpd 6 /proc/567/psinfo
- 0 293 utmpd 6 /proc/567/psinfo
- 0 293 utmpd 6 /proc/3013/psinfo
- 0 419 mibiisa 2 /dev/kstat
- 0 419 mibiisa 2 /dev/rtls
- 0 419 mibiisa 2 /dev/kstat
- [...]
- The following is a demonstration of the pathopens.d script,
- Here we run it for a few seconds then hit Ctrl-C,
- # pathopens.d
- Tracing... Hit Ctrl-C to end.
- ^C
- COUNT PATHNAME
- 1 /lib/libcmd.so.1
- 1 /export/home/root/DTrace/Dexplorer/dexplorer
- 1 /lib/libmd5.so.1
- 1 /lib/libaio.so.1
- 1 /lib/librt.so.1
- 1 /etc/security/prof_attr
- 1 /etc/mnttab
- 2 /devices/pseudo/devinfo@0:devinfo
- 2 /dev/kstat
- 2 /lib/libnvpair.so.1
- 2 /lib/libkstat.so.1
- 2 /lib/libdevinfo.so.1
- 2 /lib/libnsl.so.1
- 4 /lib/libc.so.1
- 4 /var/ld/ld.config
- 8 /export/home/brendan/Utils_solx86/setiathome-3.08.i386-pc-solaris2.6/outfile.sah
- In the above output, many of the files would have been opened using
- absolute pathnames. However the "dexplorer" file was opened using a relative
- pathname - and the pathopens.d script has correctly printed the full path.
- The above shows that the outfile.sah file was opened successfully 8 times.
- The following is a demonstration of the pidpersec.d script.
- Here the program is run on an idle system,
- # ./pidpersec.d
- TIME LASTPID PID/s
- 2005 Jun 9 22:15:09 3010 0
- 2005 Jun 9 22:15:10 3010 0
- 2005 Jun 9 22:15:11 3010 0
- 2005 Jun 9 22:15:12 3010 0
- 2005 Jun 9 22:15:13 3010 0
- ^C
- This shows that no new processes are being created.
- Now the script is run on a busy system, that is creating many processes
- (which happen to be short-lived),
- # ./pidpersec.d
- TIME LASTPID PID/s
- 2005 Jun 9 22:16:30 3051 13
- 2005 Jun 9 22:16:31 3063 12
- 2005 Jun 9 22:16:32 3073 10
- 2005 Jun 9 22:16:33 3084 11
- 2005 Jun 9 22:16:34 3096 12
- ^C
- Now we can see that there are over 10 new processes created each second.
- The value for lastpid confirms the rates printed.
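- The cross-check is simple: the differences between consecutive LASTPID
- values should roughly match the PID/s column,

```shell
# Deltas between consecutive LASTPID values from the output above.
printf '%s\n' 3051 3063 3073 3084 3096 |
awk 'NR > 1 { print $1 - prev } { prev = $1 }'
```

- which prints 12, 10, 11 and 12 - matching the rates reported for those
- seconds.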
- The following is a demonstration of the priclass.d script.
- The script was run for several seconds then Ctrl-C was hit. During
- this time, other processes in different scheduling classes were
- running.
- # ./priclass.d
- Sampling... Hit Ctrl-C to end.
- ^C
- IA
- value ------------- Distribution ------------- count
- 40 | 0
- 50 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 30
- 60 | 0
- SYS
- value ------------- Distribution ------------- count
- < 0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 4959
- 0 | 0
- 10 | 0
- 20 | 0
- 30 | 0
- 40 | 0
- 50 | 0
- 60 | 30
- 70 | 0
- 80 | 0
- 90 | 0
- 100 | 0
- 110 | 0
- 120 | 0
- 130 | 0
- 140 | 0
- 150 | 0
- 160 | 50
- >= 170 | 0
- RT
- value ------------- Distribution ------------- count
- 90 | 0
- 100 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 110
- 110 | 0
- TS
- value ------------- Distribution ------------- count
- < 0 | 0
- 0 |@@@@@@@@@@@@@@@ 2880
- 10 |@@@@@@@ 1280
- 20 |@@@@@ 990
- 30 |@@@@@ 920
- 40 |@@@@ 670
- 50 |@@@@ 730
- 60 | 0
- The output is quite interesting, and illustrates neatly the behaviour
- of different scheduling classes.
- The IA interactive class had 30 samples of a 50 to 59 priority, a fairly
- high priority. This class is used for interactive processes, such as
- the windowing system. I had clicked on a few windows to create this
- activity.
- The SYS system class had 4959 samples at a < 0 priority - the lowest,
- which was for the idle thread. There are a few samples at higher
- priorities, including some in the 160 to 169 range (the highest), which
- are for interrupt threads. The system class is used by the kernel.
- The RT real time class had 110 samples in the 100 to 109 priority range.
- This class is designed for real-time applications, those that must have
- a consistent response time regardless of other process activity. For that
- reason, the RT class trumps both TS and IA. I created these events by
- running "prstat -R" as root, which runs prstat in the real time class.
- The TS time sharing class is the default scheduling class for processes
- on a Solaris system. I ran an infinite shell loop to create heavy activity,
- "while :; do :; done", which shows a profile that leans towards lower
- priorities. This is deliberate behaviour from the time sharing class, which
- reduces the priority of CPU bound processes so that they interfere less
- with I/O bound processes. The result is more samples in the lower priority
- ranges.
- The following are demonstrations of the pridist.d script.
- Here we run pridist.d for a few seconds then hit Ctrl-C,
- # pridist.d
- Sampling... Hit Ctrl-C to end.
- ^C
- CMD: setiathome PID: 2190
- value ------------- Distribution ------------- count
- -5 | 0
- 0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 6629
- 5 | 0
- CMD: sshd PID: 9172
- value ------------- Distribution ------------- count
- 50 | 0
- 55 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 10
- 60 | 0
- CMD: mozilla-bin PID: 3164
- value ------------- Distribution ------------- count
- 40 | 0
- 45 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 20
- 50 | 0
- CMD: perl PID: 11544
- value ------------- Distribution ------------- count
- 10 | 0
- 15 |@@@@@@@@ 60
- 20 | 0
- 25 |@@@@@@@@@@@@@@@ 120
- 30 | 0
- 35 |@@@@@@@@@@ 80
- 40 | 0
- 45 |@@@@@ 40
- 50 | 0
- 55 |@@@ 20
- 60 | 0
- During this sample there was a CPU bound process called "setiathome"
- running, and a new CPU bound "perl" process was executed.
- perl, executing an infinite loop, begins with a high priority of 55 to 59,
- where it is sampled 20 times. pridist.d samples 1000 times per second,
- so this equates to roughly 20 ms. The perl process was also sampled for
- 40 ms at priority 45 to 49, for 80 ms at priority 35 to 39, down to 60 ms
- at priority 15 to 19 - at which point I hit Ctrl-C to end sampling.
- The output is spectacular as it matches the behaviour of the dispatcher
- table for the time sharing class perfectly!
- setiathome is running with the lowest priority, in the 0 to 4 range.
- ... ok, so when I say 20 samples equates to 20 ms, we know that's only an
- estimate. It really means that for 20 samples that process was the one on
- the CPU. In between the samples anything may have occurred (I/O bound
- processes will context switch off the CPU). DTrace can certainly be used
- to measure this based on scheduler events rather than samples (eg, cpudist),
- however DTrace can then sometimes consume a noticeable portion of the CPUs
- (for example, 2%).
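- The sample-to-time estimate is just division by the sampling rate:

```shell
# pridist.d samples at 1000 Hz, so each sample approximates 1 ms on-CPU.
awk 'BEGIN {
    hz = 1000
    samples = 80              # e.g. the 35 to 39 priority bucket for perl
    printf "~%d ms on-CPU\n", samples * 1000 / hz
}'
```

- This prints "~80 ms on-CPU" - an estimate, for the reasons given above.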
- The following is a longer sample. Again, I start a new CPU bound perl
- process,
- # pridist.d
- Sampling... Hit Ctrl-C to end.
- ^C
- CMD: setiathome PID: 2190
- value ------------- Distribution ------------- count
- -5 | 0
- 0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1820
- 5 | 0
- CMD: mozilla-bin PID: 3164
- value ------------- Distribution ------------- count
- 40 | 0
- 45 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 10
- 50 | 0
- CMD: bash PID: 9185
- value ------------- Distribution ------------- count
- 50 | 0
- 55 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 10
- 60 | 0
- CMD: perl PID: 11547
- value ------------- Distribution ------------- count
- -5 | 0
- 0 |@@@@@@@@@@@@@@@ 2020
- 5 |@@ 200
- 10 |@@@@@@@ 960
- 15 |@ 160
- 20 |@@@@@ 720
- 25 |@ 120
- 30 |@@@@ 480
- 35 |@ 80
- 40 |@@ 240
- 45 | 40
- 50 |@@ 240
- 55 | 10
- 60 | 0
- Now other behaviour can be observed as the perl process runs. The effect
- here is due to ts_maxwait triggering a priority boost to avoid CPU
- starvation; the priority is boosted to the 50 to 54 range, then decreases
- by 10 until it reaches 0 and another ts_maxwait boost is triggered. The
- process spends more time at lower priorities, as that is exactly how the
- TS dispatch table has been configured.
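- The boost-and-decay cycle can be sketched as a toy model. This assumes a
- flat decay of 10 per quantum and a boost straight back to 50, which only
- approximates the real TS dispatch table:

```shell
# Toy model of TS priority: decay each quantum, boost when starved.
awk 'BEGIN {
    pri = 50
    for (i = 0; i < 8; i++) {
        out = i ? out " " pri : pri
        pri -= 10             # priority drop after consuming a time quantum
        if (pri < 0) pri = 50 # ts_maxwait-style starvation boost
    }
    print out
}'
```

- This prints "50 40 30 20 10 0 50 40", the sawtooth pattern visible in the
- distributions above.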
- Now we run pridist.d for a considerable time,
- # pridist.d
- Sampling... Hit Ctrl-C to end.
- ^C
- CMD: setiathome PID: 2190
- value ------------- Distribution ------------- count
- -5 | 0
- 0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 3060
- 5 | 0
- CMD: mozilla-bin PID: 3164
- value ------------- Distribution ------------- count
- 40 | 0
- 45 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 20
- 50 | 0
- CMD: perl PID: 11549
- value ------------- Distribution ------------- count
- -5 | 0
- 0 |@@@@@@@@@@@@@@@@@@@ 7680
- 5 | 0
- 10 |@@@@@@@ 3040
- 15 | 70
- 20 |@@@@@@ 2280
- 25 | 120
- 30 |@@@@ 1580
- 35 | 80
- 40 |@@ 800
- 45 | 40
- 50 |@@ 800
- 55 | 20
- 60 | 0
- The process has settled into a pattern: 0 priority, ts_maxwait boost to 50,
- drop back to 0.
- Run "dispadmin -c TS -g" for a printout of the time sharing dispatcher table.
- The following shows running pridist.d on a completely idle system,
- # pridist.d
- Sampling... Hit Ctrl-C to end.
- ^C
- CMD: sched PID: 0
- value ------------- Distribution ------------- count
- -10 | 0
- -5 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1190
- 0 | 0
- Only the kernel "sched" was sampled. It would have been running the idle
- thread.
- The following is an unusual output that is worth mentioning,
- # pridist.d
- Sampling... Hit Ctrl-C to end.
- ^C
- CMD: sched PID: 0
- value ------------- Distribution ------------- count
- -10 | 0
- -5 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 940
- 0 | 0
- 5 | 0
- 10 | 0
- 15 | 0
- 20 | 0
- 25 | 0
- 30 | 0
- 35 | 0
- 40 | 0
- 45 | 0
- 50 | 0
- 55 | 0
- 60 | 0
- 65 | 0
- 70 | 0
- 75 | 0
- 80 | 0
- 85 | 0
- 90 | 0
- 95 | 0
- 100 | 0
- 105 | 0
- 110 | 0
- 115 | 0
- 120 | 0
- 125 | 0
- 130 | 0
- 135 | 0
- 140 | 0
- 145 | 0
- 150 | 0
- 155 | 0
- 160 | 0
- 165 | 10
- >= 170 | 0
- Here we have sampled the kernel running at a priority of 165 to 169. This
- is the interrupt priority range, and would be an interrupt servicing thread,
- e.g. for a network interrupt.
- This is a demonstration of the procsystime tool, which can give details
- on how processes make use of system calls.
- Here we run procsystime on processes which have the name "bash",
- # procsystime -n bash
- Hit Ctrl-C to stop sampling...
- ^C
- Elapsed Times for process bash,
- SYSCALL TIME (ns)
- setpgrp 27768
- gtime 28692
- lwp_sigmask 148074
- write 235814
- sigaction 553556
- ioctl 776691
- read 857401243
- By default procsystime prints elapsed times, the time from when the syscall
- was issued to its completion. In the above output, we can see the read()
- syscall took the most time for this process - 0.86 seconds for all the
- reads combined. This is because the read syscall is waiting for keystrokes.
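- Since the times are in nanoseconds, converting them to something readable
- is a one-liner:

```shell
# procsystime reports nanoseconds; divide by 1e9 for seconds.
awk 'BEGIN { printf "read: %.2f s elapsed\n", 857401243 / 1e9 }'
```

- which prints "read: 0.86 s elapsed" for the read total above.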
- Here we try the "-o" option to print CPU overhead times on "bash",
- # procsystime -o -n bash
- Hit Ctrl-C to stop sampling...
- ^C
- CPU Times for process bash,
- SYSCALL TIME (ns)
- setpgrp 6994
- gtime 8054
- lwp_sigmask 33865
- read 154895
- sigaction 259899
- write 343825
- ioctl 932280
- This identifies which syscall type from bash is consuming the most CPU time.
- This is ioctl, at 932 microseconds. Compare this output to the default in
- the first example - both are useful for different reasons; this CPU overhead
- output helps us see why processes are consuming a lot of sys time.
- This demonstrates using the "-a" for all details, this time with "ssh",
- # procsystime -a -n ssh
- Hit Ctrl-C to stop sampling...
- ^C
- Elapsed Times for processes ssh,
- SYSCALL TIME (ns)
- read 115833
- write 302419
- pollsys 114616076
- TOTAL: 115034328
- CPU Times for processes ssh,
- SYSCALL TIME (ns)
- read 82381
- pollsys 201818
- write 280390
- TOTAL: 564589
- Syscall Counts for processes ssh,
- SYSCALL COUNT
- read 4
- write 4
- pollsys 8
- TOTAL: 16
- Now we can see elapsed times, overhead times, and syscall counts in one
- report. Very handy. We can also see totals printed as "TOTAL:".
- procsystime also lets us just examine one PID. For example,
- # procsystime -p 1304
- Hit Ctrl-C to stop sampling...
- ^C
- Elapsed Times for PID 1304,
- SYSCALL TIME (ns)
- fcntl 7323
- fstat64 21349
- ioctl 190683
- read 238197
- write 1276169
- pollsys 1005360640
- Here is a longer example of running procsystime on mozilla,
- # procsystime -a -n mozilla-bin
- Hit Ctrl-C to stop sampling...
- ^C
- Elapsed Times for processes mozilla-bin,
- SYSCALL TIME (ns)
- readv 677958
- writev 1159088
- yield 1298742
- read 18019194
- write 35679619
- ioctl 108845685
- lwp_park 38090969432
- pollsys 65955258781
- TOTAL: 104211908499
- CPU Times for processes mozilla-bin,
- SYSCALL TIME (ns)
- yield 120345
- readv 398046
- writev 1117178
- lwp_park 8591428
- read 9752315
- write 29043460
- ioctl 37089349
- pollsys 189933470
- TOTAL: 276045591
- Syscall Counts for processes mozilla-bin,
- SYSCALL COUNT
- writev 3
- yield 9
- readv 58
- lwp_park 280
- write 1317
- read 1744
- pollsys 8268
- ioctl 16434
- TOTAL: 28113
- The following is a demonstration of the rwbypid.d script,
- Here we run it for a few seconds then hit Ctrl-C,
- # rwbypid.d
- Tracing... Hit Ctrl-C to end.
- ^C
- PID CMD DIR COUNT
- 11131 dtrace W 2
- 20334 sshd W 17
- 20334 sshd R 24
- 1532 Xorg W 69
- 1659 mozilla-bin R 852
- 1659 mozilla-bin W 1128
- 1532 Xorg R 1702
- In the above output, we can see that Xorg with PID 1532 has made 1702 reads.
- The following is an example of the rwbytype.d script.
- We run rwbytype.d for a few seconds then hit Ctrl-C,
- # rwbytype.d
- Tracing... Hit Ctrl-C to end.
- ^C
- PID CMD VTYPE DIR BYTES
- 1545 sshd chr W 1
- 10357 more chr R 30
- 2357 sshd chr W 31
- 10354 dtrace chr W 32
- 1545 sshd chr R 34
- 6778 bash chr W 44
- 1545 sshd sock R 52
- 405 poold reg W 68
- 1545 sshd sock W 136
- 10357 bash reg R 481
- 10356 find reg R 481
- 10355 bash reg R 481
- 10357 more reg R 1652
- 2357 sshd sock R 1664
- 10357 more chr W 96925
- 10357 more fifo R 97280
- 2357 sshd chr R 98686
- 10356 grep fifo W 117760
- 2357 sshd sock W 118972
- 10356 grep reg R 147645
- Here we can see that the grep process with PID 10356 read 147645 bytes
- from "regular" files. These are I/O bytes at the application level, so
- many of these reads may have been satisfied by the filesystem page cache
- rather than by disk.
- vnode file types are listed in /usr/include/sys/vnode.h, and give an idea of
- what the file descriptor refers to.
- The following is a demonstration of the rwsnoop program,
- Here we run it for about a second,
- # rwsnoop
- UID PID CMD D BYTES FILE
- 100 20334 sshd R 52 <unknown>
- 100 20334 sshd W 1 /devices/pseudo/clone@0:ptm
- 0 20320 bash W 1 /devices/pseudo/pts@0:12
- 100 20334 sshd R 2 /devices/pseudo/clone@0:ptm
- 100 20334 sshd W 52 <unknown>
- 0 2848 ls W 58 /devices/pseudo/pts@0:12
- 0 2848 ls W 68 /devices/pseudo/pts@0:12
- 0 2848 ls W 57 /devices/pseudo/pts@0:12
- 0 2848 ls W 67 /devices/pseudo/pts@0:12
- 0 2848 ls W 48 /devices/pseudo/pts@0:12
- 0 2848 ls W 49 /devices/pseudo/pts@0:12
- 0 2848 ls W 33 /devices/pseudo/pts@0:12
- 0 2848 ls W 41 /devices/pseudo/pts@0:12
- 100 20334 sshd R 429 /devices/pseudo/clone@0:ptm
- 100 20334 sshd W 468 <unknown>
- ^C
- The output scrolls rather fast. Above, we can see an ls command was run,
- and we can see each line as ls writes it. The "<unknown>" read/writes are
- socket activity, which has no corresponding filename.
- For a summary style output, use the rwtop program.
- If a particular program is of interest, the "-n" option can be used
- to match on process name. Here we match on "bash" during a login where
- the user uses the bash shell as their default,
- # rwsnoop -n bash
- UID PID CMD D BYTES FILE
- 100 2854 bash R 757 /etc/nsswitch.conf
- 100 2854 bash R 0 /etc/nsswitch.conf
- 100 2854 bash R 668 /etc/passwd
- 100 2854 bash R 980 /etc/profile
- 100 2854 bash W 15 /devices/pseudo/pts@0:14
- 100 2854 bash R 10 /export/home/brendan/.bash_profile
- 100 2854 bash R 867 /export/home/brendan/.bashrc
- 100 2854 bash R 980 /etc/profile
- 100 2854 bash W 15 /devices/pseudo/pts@0:14
- 100 2854 bash R 8951 /export/home/brendan/.bash_history
- 100 2854 bash R 8951 /export/home/brendan/.bash_history
- 100 2854 bash R 1652 /usr/share/lib/terminfo/d/dtterm
- 100 2854 bash W 41 /devices/pseudo/pts@0:14
- 100 2854 bash R 1 /devices/pseudo/pts@0:14
- 100 2854 bash W 1 /devices/pseudo/pts@0:14
- 100 2854 bash W 41 /devices/pseudo/pts@0:14
- 100 2854 bash R 1 /devices/pseudo/pts@0:14
- 100 2854 bash W 7 /devices/pseudo/pts@0:14
- In the above, various bash related files such as ".bash_profile" and
- ".bash_history" can be seen. The ".bashrc" is also read, as it was sourced
- from the .bash_profile.
- Extra options with rwsnoop allow us to print zone ID, project ID, timestamps,
- etc. Here we use "-v" to see the time printed, and match on "ps" processes,
- # rwsnoop -vn ps
- TIMESTR UID PID CMD D BYTES FILE
- 2005 Jul 24 04:23:45 0 2804 ps R 168 /proc/2804/auxv
- 2005 Jul 24 04:23:45 0 2804 ps R 336 /proc/2804/psinfo
- 2005 Jul 24 04:23:45 0 2804 ps R 1495 /etc/ttysrch
- 2005 Jul 24 04:23:45 0 2804 ps W 28 /devices/pseudo/pts.
- 2005 Jul 24 04:23:45 0 2804 ps R 336 /proc/0/psinfo
- 2005 Jul 24 04:23:45 0 2804 ps R 336 /proc/1/psinfo
- 2005 Jul 24 04:23:45 0 2804 ps R 336 /proc/2/psinfo
- 2005 Jul 24 04:23:45 0 2804 ps R 336 /proc/3/psinfo
- 2005 Jul 24 04:23:45 0 2804 ps R 336 /proc/218/psinfo
- 2005 Jul 24 04:23:45 0 2804 ps R 336 /proc/7/psinfo
- 2005 Jul 24 04:23:45 0 2804 ps R 336 /proc/9/psinfo
- 2005 Jul 24 04:23:45 0 2804 ps R 336 /proc/360/psinfo
- 2005 Jul 24 04:23:45 0 2804 ps R 336 /proc/91/psinfo
- 2005 Jul 24 04:23:45 0 2804 ps R 336 /proc/112/psinfo
- 2005 Jul 24 04:23:45 0 2804 ps R 336 /proc/307/psinfo
- 2005 Jul 24 04:23:45 0 2804 ps R 336 /proc/226/psinfo
- 2005 Jul 24 04:23:45 0 2804 ps R 336 /proc/242/psinfo
- 2005 Jul 24 04:23:45 0 2804 ps R 336 /proc/228/psinfo
- 2005 Jul 24 04:23:45 0 2804 ps R 336 /proc/243/psinfo
- 2005 Jul 24 04:23:45 0 2804 ps R 336 /proc/234/psinfo
- 2005 Jul 24 04:23:45 0 2804 ps R 336 /proc/119/psinfo
- 2005 Jul 24 04:23:45 0 2804 ps R 336 /proc/143/psinfo
- 2005 Jul 24 04:23:45 0 2804 ps R 336 /proc/361/psinfo
- 2005 Jul 24 04:23:45 0 2804 ps R 336 /proc/20314/psinfo
- 2005 Jul 24 04:23:45 0 2804 ps R 336 /proc/116/psinfo
- [...]
- The following is an example of the sampleproc program.
- Here we run sampleproc for a few seconds on a workstation,
- # ./sampleproc
- Sampling at 100 hertz... Hit Ctrl-C to end.
- ^C
- PID CMD COUNT
- 1659 mozilla-bin 3
- 109 nscd 4
- 2197 prstat 23
- 2190 setiathome 421
- PID CMD PERCENT
- 1659 mozilla-bin 0
- 109 nscd 0
- 2197 prstat 5
- 2190 setiathome 93
- The first table shows a count of how many times each process was sampled
- on the CPU. The second table gives this as a percentage.
- setiathome was on the CPU 421 times, which is 93% of the samples.
- The following is sampleproc running on a server with 4 CPUs. A bash shell
- is running in an infinite loop,
- # ./sampleproc
- Sampling at 100 hertz... Hit Ctrl-C to end.
- ^C
- PID CMD COUNT
- 10140 dtrace 1
- 28286 java 1
- 29345 esd 2
- 29731 esd 3
- 2 pageout 4
- 29733 esd 6
- 10098 bash 1015
- 0 sched 3028
- PID CMD PERCENT
- 10140 dtrace 0
- 28286 java 0
- 29345 esd 0
- 29731 esd 0
- 2 pageout 0
- 29733 esd 0
- 10098 bash 24
- 0 sched 74
- The bash shell was on the CPUs for 24% of the time, which is consistent
- with a CPU bound single threaded application on a 4 CPU server.
- The above sample was around 10 seconds long. During this time, there were
- around 4000 samples (checking the COUNT column); this is because
- 4000 = CPUs (4) * Hertz (100) * Seconds (10).
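The arithmetic above, and the COUNT-to-PERCENT conversion that sampleproc performs, can be sketched in a few lines. This is a hypothetical illustration, not part of the toolkit; the function names are made up.

```python
# Hypothetical sketch: reproduce the sample-count arithmetic above, and
# convert per-process on-CPU sample counts into percentages the way
# sampleproc's second table does.

def expected_samples(cpus, hertz, seconds):
    # total samples = number of CPUs * sampling rate * trace duration
    return cpus * hertz * seconds

def to_percent(counts):
    # counts: dict of CMD -> on-CPU sample count
    total = sum(counts.values())
    return {cmd: 100 * n // total for cmd, n in counts.items()}

print(expected_samples(4, 100, 10))   # 4000
# Counts from the first workstation example above:
print(to_percent({"setiathome": 421, "prstat": 23, "nscd": 4, "mozilla-bin": 3}))
# setiathome comes out at 93, prstat at 5, matching the PERCENT table.
```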
- The following are examples of seeksize.d.
- seeksize.d records the disk head seek size for each operation by process.
- This allows us to identify processes that are causing "random" disk
- access and those causing "sequential" disk access.
- It is desirable for processes to be accessing the disks in large
- sequential operations. By using seeksize.d and bitesize.d we can
- identify this behaviour.
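The histograms below come from DTrace's quantize() aggregation, which buckets each seek distance into power-of-two ranges. A hypothetical sketch of that bucketing (the function names here are invented for illustration):

```python
# Hypothetical sketch: bucket seek distances into power-of-two buckets,
# the same shape as the quantize() aggregation seeksize.d uses.
# A seek of 0 (sequential I/O) lands in the "0" bucket; a value v lands
# in bucket b where b <= v < 2*b.

from collections import Counter

def quantize_bucket(value):
    if value == 0:
        return 0
    bucket = 1
    while bucket * 2 <= value:
        bucket *= 2
    return bucket

def quantize(values):
    dist = Counter(quantize_bucket(v) for v in values)
    return dict(sorted(dist.items()))

# Mostly-sequential access: seeks of 0 dominate.
print(quantize([0, 0, 0, 0, 9, 17, 300]))   # {0: 4, 8: 1, 16: 1, 256: 1}
```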
- In this example we read through a large file by copying it to a
- remote server. Most of the seek sizes are zero, indicating sequential
- access - and we would expect good performance from the disks
- under these conditions,
- # ./seeksize.d
- Tracing... Hit Ctrl-C to end.
- ^C
- 22349 scp /dl/sol-10-b63-x86-v1.iso mars:\0
- value ------------- Distribution ------------- count
- -1 | 0
- 0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 726
- 1 | 0
- 2 | 0
- 4 | 0
- 8 |@ 13
- 16 | 4
- 32 | 0
- 64 | 0
- 128 | 2
- 256 | 3
- 512 | 4
- 1024 | 4
- 2048 | 3
- 4096 | 0
- 8192 | 3
- 16384 | 0
- 32768 | 1
- 65536 | 0
- In this example we run find. The disk operations are fairly scattered,
- as illustrated below by the volume of non-sequential reads,
- # ./seeksize.d
- Tracing... Hit Ctrl-C to end.
- ^C
- 22399 find /var/sadm/pkg/\0
- value ------------- Distribution ------------- count
- -1 | 0
- 0 |@@@@@@@@@@@@@ 1475
- 1 | 0
- 2 | 44
- 4 |@ 77
- 8 |@@@ 286
- 16 |@@ 191
- 32 |@ 154
- 64 |@@ 173
- 128 |@@ 179
- 256 |@@ 201
- 512 |@@ 186
- 1024 |@@ 236
- 2048 |@@ 201
- 4096 |@@ 274
- 8192 |@@ 243
- 16384 |@ 154
- 32768 |@ 113
- 65536 |@@ 182
- 131072 |@ 81
- 262144 | 0
- I found the following interesting. This time I gzipped the large file.
- While zipping, the process is reading from one location and writing
- to another. One might expect that, as the program toggles between
- reading from one location and writing to another, the seek distance
- would often be the same (depending on where UFS puts the new file),
- # ./seeksize.d
- Tracing... Hit Ctrl-C to end.
- ^C
- 22368 gzip sol-10-b63-x86-v1.iso\0
- value ------------- Distribution ------------- count
- -1 | 0
- 0 |@@@@@@@@@@@@ 353
- 1 | 0
- 2 | 0
- 4 | 0
- 8 | 7
- 16 | 4
- 32 | 2
- 64 | 4
- 128 | 14
- 256 | 3
- 512 | 3
- 1024 | 5
- 2048 | 1
- 4096 | 0
- 8192 | 3
- 16384 | 1
- 32768 | 1
- 65536 | 1
- 131072 | 1
- 262144 |@@@@@@@@ 249
- 524288 | 1
- 1048576 | 2
- 2097152 | 1
- 4194304 | 2
- 8388608 |@@@@@@@@@@@@@@@@@@ 536
- 16777216 | 0
- The following example compares the operation of "find" with "tar".
- Both read from the same location, and we would expect that
- both programs would generally need to do the same number of seeks
- to navigate the directory tree (depending on caching), with tar
- causing extra operations as it reads the file contents as well,
- # ./seeksize.d
- Tracing... Hit Ctrl-C to end.
- ^C
- PID CMD
- 22278 find /etc\0
- value ------------- Distribution ------------- count
- -1 | 0
- 0 |@@@@@@@@@@@@@@@@@@@@ 251
- 1 | 0
- 2 |@ 8
- 4 | 5
- 8 |@ 10
- 16 |@ 10
- 32 |@ 10
- 64 |@ 9
- 128 |@ 11
- 256 |@ 14
- 512 |@@ 20
- 1024 |@ 10
- 2048 | 6
- 4096 |@ 7
- 8192 |@ 10
- 16384 |@ 16
- 32768 |@@ 21
- 65536 |@@ 28
- 131072 |@ 7
- 262144 |@ 14
- 524288 | 6
- 1048576 |@ 15
- 2097152 |@ 7
- 4194304 | 0
- 22282 tar cf /dev/null /etc\0
- value ------------- Distribution ------------- count
- -1 | 0
- 0 |@@@@@@@@@@ 397
- 1 | 0
- 2 | 8
- 4 | 14
- 8 | 16
- 16 |@ 24
- 32 |@ 29
- 64 |@@ 99
- 128 |@@ 73
- 256 |@@ 78
- 512 |@@@ 109
- 1024 |@@ 62
- 2048 |@@ 69
- 4096 |@@ 73
- 8192 |@@@ 113
- 16384 |@@ 81
- 32768 |@@@ 111
- 65536 |@@@ 108
- 131072 |@ 49
- 262144 |@ 33
- 524288 | 20
- 1048576 | 13
- 2097152 | 7
- 4194304 | 5
- 8388608 |@ 30
- 16777216 | 0
- The following is an example of setuids.d. Login events in particular can
- be seen, along with use of the "su" command.
- # ./setuids.d
- UID SUID PPID PID PCMD CMD
- 0 100 3037 3040 in.telnetd login -p -h mars -d /dev/pts/12
- 100 0 3040 3045 bash su -
- 0 102 3045 3051 sh su - fred
- 0 100 3055 3059 sshd /usr/lib/ssh/sshd
- 0 100 3065 3067 in.rlogind login -d /dev/pts/12 -r mars
- 0 100 3071 3073 in.rlogind login -d /dev/pts/12 -r mars
- 0 102 3078 3081 in.telnetd login -p -h mars -d /dev/pts/12
- ^C
- The first line is a telnet login to the user brendan, UID 100. The parent
- command is "in.telnetd", the telnet daemon spawned by inetd, and the
- command that in.telnetd runs is "login".
- The second line shows UID 100 using the "su" command to become root.
- The third line has the root user using "su" to become fred, UID 102.
- The fourth line is an example of an ssh login.
- The fifth and sixth lines are examples of rsh and rlogin.
- The last line is another example of a telnet login for fred, UID 102.
- The following is a demonstration of the sigdist.d script.
- Here we run sigdist.d, and in another window we kill -9 a sleep process,
- # ./sigdist.d
- Tracing... Hit Ctrl-C to end.
- ^C
- SENDER RECIPIENT SIG COUNT
- sched dtrace 2 1
- sched bash 18 1
- bash sleep 9 1
- sched Xorg 14 55
- We can see the signal sent from bash to sleep. We can also see that Xorg
- has received 55 of signal 14. A "man -s3head signal" may help explain what
- signal 14 is (alarm clock).
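Rather than reading signal(3HEAD), the mapping from number to name can also be looked up programmatically. A hypothetical sketch (the signame helper is invented here; numbering is platform-dependent, but 14 is SIGALRM, the alarm-clock signal, on both Solaris and Linux):

```python
# Hypothetical sketch: map a signal number to its symbolic name.
# Signal numbering varies by platform, but 9 (SIGKILL) and 14 (SIGALRM)
# are consistent across Solaris and Linux.

import signal

def signame(num):
    try:
        return signal.Signals(num).name
    except ValueError:
        return "SIG#%d" % num   # no symbolic name on this platform

print(signame(9))    # SIGKILL - the signal bash sent to sleep above
print(signame(14))   # SIGALRM - the signal Xorg received 55 times
```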
- The following is a demonstration of the syscallbypid.d script,
- Here we run syscallbypid.d for a few seconds then hit Ctrl-C,
- # syscallbypid.d
- Tracing... Hit Ctrl-C to end.
- ^C
- PID CMD SYSCALL COUNT
- 11039 dtrace setcontext 1
- 11039 dtrace lwp_sigmask 1
- 7 svc.startd portfs 1
- 357 poold lwp_cond_wait 1
- 27328 java_vm lwp_cond_wait 1
- 1532 Xorg writev 1
- 11039 dtrace lwp_park 1
- 11039 dtrace schedctl 1
- 11039 dtrace mmap 1
- 361 sendmail pollsys 1
- 11039 dtrace fstat64 1
- 11039 dtrace sigaction 2
- 11039 dtrace write 2
- 361 sendmail lwp_sigmask 2
- 1659 mozilla-bin yield 2
- 11039 dtrace sysconfig 3
- 361 sendmail pset 3
- 20317 sshd read 4
- 361 sendmail gtime 4
- 20317 sshd write 4
- 27328 java_vm ioctl 6
- 11039 dtrace brk 8
- 1532 Xorg setcontext 8
- 1532 Xorg lwp_sigmask 8
- 20317 sshd pollsys 8
- 357 poold pollsys 13
- 1659 mozilla-bin read 16
- 20317 sshd lwp_sigmask 16
- 1532 Xorg setitimer 17
- 27328 java_vm pollsys 18
- 1532 Xorg pollsys 19
- 11039 dtrace p_online 21
- 1532 Xorg read 22
- 1659 mozilla-bin write 25
- 1659 mozilla-bin lwp_park 26
- 11039 dtrace ioctl 36
- 1659 mozilla-bin pollsys 155
- 1659 mozilla-bin ioctl 306
- In the above output, we can see that "mozilla-bin" with PID 1659 made the
- most system calls - 306 ioctl()s.
- The following is an example of the syscallbyproc.d script,
- # syscallbyproc.d
- dtrace: description 'syscall:::entry ' matched 228 probes
- ^C
- snmpd 1
- utmpd 2
- inetd 2
- nscd 7
- svc.startd 11
- sendmail 31
- poold 133
- dtrace 1720
- The above output shows that dtrace made the most system calls in this sample,
- 1720 syscalls.
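The aggregation behind this style of script is essentially counting events keyed by process name, as a D clause like "syscall:::entry { @num[execname] = count(); }" does, then printing the totals sorted ascending the way DTrace prints aggregations. A hypothetical Python sketch of the same idea (the event stream here is made-up stand-in data, not real probe output):

```python
# Hypothetical sketch: count events per process name and print them
# sorted ascending, mimicking how DTrace prints a count() aggregation
# keyed on execname.

from collections import Counter

# Assumed event stream of (execname, syscall) pairs - stand-in data.
events = [("dtrace", "ioctl")] * 5 + [("sendmail", "pollsys")] * 2 + [("nscd", "door")]

by_proc = Counter(name for name, _ in events)
for name, n in sorted(by_proc.items(), key=lambda kv: kv[1]):
    print("%-12s %6d" % (name, n))
```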
- The following is a demonstration of the syscallbysysc.d script,
- # syscallbysysc.d
- dtrace: description 'syscall:::entry ' matched 228 probes
- ^C
- fstat 1
- setcontext 1
- lwp_park 1
- schedctl 1
- mmap 1
- sigaction 2
- pset 2
- lwp_sigmask 2
- gtime 3
- sysconfig 3
- write 4
- brk 6
- pollsys 7
- p_online 558
- ioctl 579
- In the above output, the ioctl system call was the most common, occurring
- 579 times.
- The following is a demonstration of the topsyscall command,
- Here topsyscall is run with no arguments,
- # topsyscall
- 2005 Jun 13 22:13:21, load average: 1.24, 1.24, 1.22 syscalls: 1287
- SYSCALL COUNT
- getgid 4
- getuid 5
- waitsys 5
- xstat 7
- munmap 7
- sysconfig 8
- brk 8
- setcontext 8
- open 8
- getpid 9
- close 9
- resolvepath 10
- lwp_sigmask 22
- mmap 26
- lwp_park 43
- read 59
- write 72
- sigaction 113
- pollsys 294
- ioctl 520
- The screen updates every second, and continues until Ctrl-C is hit to
- end the program.
- In the above output we can see that the ioctl() system call occurred 520 times,
- pollsys() 294 times, and sigaction() 113 times.
- Here the command is run with a 10 second interval,
- # topsyscall 10
- 2005 Jun 13 22:15:35, load average: 1.21, 1.22, 1.22 syscalls: 10189
- SYSCALL COUNT
- writev 6
- close 7
- lseek 7
- open 7
- brk 8
- nanosleep 9
- portfs 10
- llseek 14
- lwp_cond_wait 21
- p_online 21
- gtime 27
- rusagesys 71
- setcontext 92
- lwp_sigmask 98
- setitimer 183
- lwp_park 375
- write 438
- read 551
- pollsys 3071
- ioctl 5144
- The following is a demonstration of the topsysproc program,
- Here we run topsysproc with no arguments,
- # topsysproc
- 2005 Jun 13 22:25:16, load average: 1.24, 1.23, 1.21 syscalls: 1347
- PROCESS COUNT
- svc.startd 1
- nscd 1
- setiathome 7
- poold 18
- sshd 21
- java_vm 35
- tput 49
- dtrace 56
- Xorg 108
- sh 110
- clear 122
- mozilla-bin 819
- The screen refreshes every second, which can be changed by specifying
- a different interval on the command line.
- In the above output we can see that processes with the name "mozilla-bin"
- made 819 system calls, while processes with the name "clear" made 122.
- Now topsysproc is run with a 15 second interval,
- # topsysproc 15
- 2005 Jun 13 22:29:43, load average: 1.19, 1.20, 1.20 syscalls: 15909
- PROCESS COUNT
- fmd 1
- inetd 2
- svc.configd 2
- gconfd-2 3
- miniserv.pl 3
- sac 6
- snmpd 6
- sshd 8
- automountd 8
- ttymon 9
- svc.startd 17
- nscd 21
- in.routed 37
- sendmail 41
- setiathome 205
- poold 293
- dtrace 413
- java_vm 529
- Xorg 1234
- mozilla-bin 13071