diff options
Diffstat (limited to 'tools/perf/Documentation')
-rw-r--r-- | tools/perf/Documentation/intel-pt.txt | 59 | ||||
-rw-r--r-- | tools/perf/Documentation/perf-config.txt | 5 | ||||
-rw-r--r-- | tools/perf/Documentation/perf-diff.txt | 5 | ||||
-rw-r--r-- | tools/perf/Documentation/perf-list.txt | 3 | ||||
-rw-r--r-- | tools/perf/Documentation/perf-record.txt | 16 | ||||
-rw-r--r-- | tools/perf/Documentation/perf-report.txt | 11 | ||||
-rw-r--r-- | tools/perf/Documentation/perf-stat.txt | 11 | ||||
-rw-r--r-- | tools/perf/Documentation/perf-trace.txt | 14 | ||||
-rw-r--r-- | tools/perf/Documentation/perf.data-directory-format.txt | 63 | ||||
-rw-r--r-- | tools/perf/Documentation/perf.txt | 2 |
10 files changed, 187 insertions, 2 deletions
diff --git a/tools/perf/Documentation/intel-pt.txt b/tools/perf/Documentation/intel-pt.txt index e0d9e7dd4f17..2cf2d9e9d0da 100644 --- a/tools/perf/Documentation/intel-pt.txt +++ b/tools/perf/Documentation/intel-pt.txt @@ -434,6 +434,56 @@ pwr_evt Enable power events. The power events provide information about "0" otherwise. +AUX area sampling option +------------------------ + +To select Intel PT "sampling" the AUX area sampling option can be used: + + --aux-sample + +Optionally it can be followed by the sample size in bytes e.g. + + --aux-sample=8192 + +In addition, the Intel PT event to sample must be defined e.g. + + -e intel_pt//u + +Samples on other events will be created containing Intel PT data e.g. the +following will create Intel PT samples on the branch-misses event, note the +events must be grouped using {}: + + perf record --aux-sample -e '{intel_pt//u,branch-misses:u}' + +An alternative to '--aux-sample' is to add the config term 'aux-sample-size' to +events. In this case, the grouping is implied e.g. + + perf record -e intel_pt//u -e branch-misses/aux-sample-size=8192/u + +is the same as: + + perf record -e '{intel_pt//u,branch-misses/aux-sample-size=8192/u}' + +but allows for also using an address filter e.g.: + + perf record -e intel_pt//u --filter 'filter * @/bin/ls' -e branch-misses/aux-sample-size=8192/u -- ls + +It is important to select a sample size that is big enough to contain at least +one PSB packet. If not a warning will be displayed: + + Intel PT sample size (%zu) may be too small for PSB period (%zu) + +The calculation used for that is: if sample_size <= psb_period + 256 display the +warning. When sampling is used, psb_period defaults to 0 (2KiB). + +The default sample size is 4KiB. + +The sample size is passed in aux_sample_size in struct perf_event_attr. The +sample size is limited by the maximum event size which is 64KiB. It is +difficult to know how big the event might be without the trace sample attached, +but the tool validates that the sample size is not greater than 60KiB. + + new snapshot option ------------------- @@ -487,8 +537,8 @@ their mlock limit (which defaults to 64KiB but is not multiplied by the number of cpus). In full-trace mode, powers of two are allowed for buffer size, with a minimum -size of 2 pages. In snapshot mode, it is the same but the minimum size is -1 page. +size of 2 pages. In snapshot mode or sampling mode, it is the same but the +minimum size is 1 page. The mmap size and auxtrace mmap size are displayed if the -vv option is used e.g. @@ -501,12 +551,17 @@ Intel PT modes of operation Intel PT can be used in 2 modes: full-trace mode + sample mode snapshot mode Full-trace mode traces continuously e.g. perf record -e intel_pt//u uname +Sample mode attaches a Intel PT sample to other events e.g. + + perf record --aux-sample -e intel_pt//u -e branch-misses:u + Snapshot mode captures the available data when a signal is sent e.g. perf record -v -e intel_pt//u -S ./loopy 1000000000 & diff --git a/tools/perf/Documentation/perf-config.txt b/tools/perf/Documentation/perf-config.txt index c599623a1f3d..c4dd23c4b478 100644 --- a/tools/perf/Documentation/perf-config.txt +++ b/tools/perf/Documentation/perf-config.txt @@ -561,6 +561,11 @@ trace.*:: trace.show_zeros:: Do not suppress syscall arguments that are equal to zero. + trace.tracepoint_beautifiers:: + Use "libtraceevent" to use that library to augment the tracepoint arguments, + "libbeauty", the default, to use the same argument beautifiers used in the + strace-like sys_enter+sys_exit lines. + llvm.*:: llvm.clang-path:: Path to clang. If omit, search it from $PATH. diff --git a/tools/perf/Documentation/perf-diff.txt b/tools/perf/Documentation/perf-diff.txt index d5cc15e651cf..f50ca0fef0a4 100644 --- a/tools/perf/Documentation/perf-diff.txt +++ b/tools/perf/Documentation/perf-diff.txt @@ -95,6 +95,11 @@ OPTIONS diff.compute config option. See COMPARISON METHODS section for more info. +--cycles-hist:: + Report a histogram and the standard deviation for cycles data. + It can help us to judge if the reported cycles data is noisy or + not. This option should be used with '-c cycles'. + -p:: --period:: Show period values for both compared hist entries. diff --git a/tools/perf/Documentation/perf-list.txt b/tools/perf/Documentation/perf-list.txt index 18ed1b0fceb3..6345db33c533 100644 --- a/tools/perf/Documentation/perf-list.txt +++ b/tools/perf/Documentation/perf-list.txt @@ -36,6 +36,9 @@ Enable debugging output. Print how named events are resolved internally into perf events, and also any extra expressions computed by perf stat. +--deprecated:: +Print deprecated events. By default the deprecated events are hidden. + [[EVENT_MODIFIERS]] EVENT MODIFIERS --------------- diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt index c6f9f31b6039..b23a4012a606 100644 --- a/tools/perf/Documentation/perf-record.txt +++ b/tools/perf/Documentation/perf-record.txt @@ -62,6 +62,9 @@ OPTIONS like this: name=\'CPU_CLK_UNHALTED.THREAD:cmask=0x1\'. - 'aux-output': Generate AUX records instead of events. This requires that an AUX area event is also provided. + - 'aux-sample-size': Set sample size for AUX area sampling. If the + '--aux-sample' option has been used, set aux-sample-size=0 to disable + AUX area sampling for the event. See the linkperf:perf-list[1] man page for more parameters. @@ -433,6 +436,12 @@ can be specified in a string that follows this option: In Snapshot Mode trace data is captured only when signal SIGUSR2 is received and on exit if the above 'e' option is given. +--aux-sample[=OPTIONS]:: +Select AUX area sampling. At least one of the events selected by the -e option +must be an AUX area event. Samples on other events will be created containing +data from the AUX area. Optionally sample size may be specified, otherwise it +defaults to 4KiB. + --proc-map-timeout:: When processing pre-existing threads /proc/XXX/mmap, it may take a long time, because the file may be huge. A time out is needed in such cases. @@ -571,6 +580,13 @@ config terms. For example: 'cycles/overwrite/' and 'instructions/no-overwrite/'. Implies --tail-synthesize. +--kcore:: +Make a copy of /proc/kcore and place it into a directory with the perf data file. + +--max-size=<size>:: +Limit the sample data max size, <size> is expected to be a number with +appended unit character - B/K/M/G + SEE ALSO -------- linkperf:perf-stat[1], linkperf:perf-list[1] diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt index 7315f155803f..8dbe2119686a 100644 --- a/tools/perf/Documentation/perf-report.txt +++ b/tools/perf/Documentation/perf-report.txt @@ -525,6 +525,17 @@ include::itrace.txt[] Configure time quantum for time sort key. Default 100ms. Accepts s, us, ms, ns units. +--total-cycles:: + When --total-cycles is specified, it supports sorting for all blocks by + 'Sampled Cycles%'. This is useful to concentrate on the globally hottest + blocks. In output, there are some new columns: + + 'Sampled Cycles%' - block sampled cycles aggregation / total sampled cycles + 'Sampled Cycles' - block sampled cycles aggregation + 'Avg Cycles%' - block average sampled cycles / sum of total block average + sampled cycles + 'Avg Cycles' - block average sampled cycles + include::callchain-overhead-calculation.txt[] SEE ALSO diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt index 930c51c01201..9431b8066fb4 100644 --- a/tools/perf/Documentation/perf-stat.txt +++ b/tools/perf/Documentation/perf-stat.txt @@ -217,6 +217,11 @@ core number and the number of online logical processors on that physical process Aggregate counts per monitored threads, when monitoring threads (-t option) or processes (-p option). +--per-node:: +Aggregate counts per NUMA nodes for system-wide mode measurements. This +is a useful mode to detect imbalance between NUMA nodes. To enable this +mode, use --per-node in addition to -a. (system-wide). + -D msecs:: --delay msecs:: After starting the program, wait msecs before measuring. This is useful to @@ -323,6 +328,12 @@ The output is SMI cycles%, equals to (aperf - unhalted core cycles) / aperf Users who wants to get the actual value can apply --no-metric-only. +--all-kernel:: +Configure all used events to run in kernel space. + +--all-user:: +Configure all used events to run in user space. + EXAMPLES -------- diff --git a/tools/perf/Documentation/perf-trace.txt b/tools/perf/Documentation/perf-trace.txt index 25b74fdb36fa..abc9b5d83312 100644 --- a/tools/perf/Documentation/perf-trace.txt +++ b/tools/perf/Documentation/perf-trace.txt @@ -42,6 +42,11 @@ OPTIONS Prefixing with ! shows all syscalls but the ones specified. You may need to escape it. +--filter=<filter>:: + Event filter. This option should follow an event selector (-e) which + selects tracepoint event(s). + + -D msecs:: --delay msecs:: After starting the program, wait msecs before measuring. This is useful to @@ -141,6 +146,10 @@ the thread executes on the designated CPUs. Default is to monitor all CPUs. Show all syscalls followed by a summary by thread with min, max, and average times (in msec) and relative stddev. +--errno-summary:: + To be used with -s or -S, to show stats for the errnos experienced by + syscalls, using only this option will trigger --summary. + --tool_stats:: Show tool stats such as number of times fd->pathname was discovered thru hooking the open syscall return + vfs_getname or via reading /proc/pid/fd, etc. @@ -219,6 +228,11 @@ the thread executes on the designated CPUs. Default is to monitor all CPUs. may happen, for instance, when a thread gets migrated to a different CPU while processing a syscall. +--libtraceevent_print:: + Use libtraceevent to print tracepoint arguments. By default 'perf trace' uses + the same beautifiers used in the strace-like enter+exit lines to augment the + tracepoint arguments. + --map-dump:: Dump BPF maps setup by events passed via -e, for instance the augmented_raw_syscalls living in tools/perf/examples/bpf/augmented_raw_syscalls.c. For now this diff --git a/tools/perf/Documentation/perf.data-directory-format.txt b/tools/perf/Documentation/perf.data-directory-format.txt new file mode 100644 index 000000000000..f37fbd29112e --- /dev/null +++ b/tools/perf/Documentation/perf.data-directory-format.txt @@ -0,0 +1,63 @@ +perf.data directory format + +DISCLAIMER This is not ABI yet and is subject to possible change + in following versions of perf. We will remove this + disclaimer once the directory format soaks in. + + +This document describes the on-disk perf.data directory format. + +The layout is described by HEADER_DIR_FORMAT feature. +Currently it holds only version number (0): + + HEADER_DIR_FORMAT = 24 + + struct { + uint64_t version; + } + +The current only version value 0 means that: + - there is a single perf.data file named 'data' within the directory. + e.g. + + $ tree -ps perf.data + perf.data + └── [-rw------- 25912] data + +Future versions are expected to describe different data files +layout according to special needs. + +Currently the only 'perf record' option to output to a directory is +the --kcore option which puts a copy of /proc/kcore into the directory. +e.g. + + $ sudo perf record --kcore uname + Linux + [ perf record: Woken up 1 times to write data ] + [ perf record: Captured and wrote 0.015 MB perf.data (9 samples) ] + $ sudo tree -ps perf.data + perf.data + ├── [-rw------- 23744] data + └── [drwx------ 4096] kcore_dir + ├── [-r-------- 6731125] kallsyms + ├── [-r-------- 40230912] kcore + └── [-r-------- 5419] modules + + 1 directory, 4 files + $ sudo perf script -v + build id event received for vmlinux: 1eaa285996affce2d74d8e66dcea09a80c9941de + build id event received for [vdso]: 8bbaf5dc62a9b644b4d4e4539737e104e4a84541 + build id event received for /lib/x86_64-linux-gnu/libc-2.28.so: 5b157f49586a3ca84d55837f97ff466767dd3445 + Samples for 'cycles' event do not have CPU attribute set. Skipping 'cpu' field. + Using CPUID GenuineIntel-6-8E-A + Using perf.data/kcore_dir/kcore for kernel data + Using perf.data/kcore_dir/kallsyms for symbols + perf 15316 2060795.480902: 1 cycles: ffffffffa2caa548 native_write_msr+0x8 (vmlinux) + perf 15316 2060795.480906: 1 cycles: ffffffffa2caa548 native_write_msr+0x8 (vmlinux) + perf 15316 2060795.480908: 7 cycles: ffffffffa2caa548 native_write_msr+0x8 (vmlinux) + perf 15316 2060795.480910: 119 cycles: ffffffffa2caa54a native_write_msr+0xa (vmlinux) + perf 15316 2060795.480912: 2109 cycles: ffffffffa2c9b7b0 native_apic_msr_write+0x0 (vmlinux) + perf 15316 2060795.480914: 37606 cycles: ffffffffa2f121fe perf_event_addr_filters_exec+0x2e (vmlinux) + uname 15316 2060795.480924: 588287 cycles: ffffffffa303a56d page_counter_try_charge+0x6d (vmlinux) + uname 15316 2060795.481067: 2261945 cycles: ffffffffa301438f kmem_cache_free+0x4f (vmlinux) + uname 15316 2060795.481643: 2172167 cycles: 7f1a48c393c0 _IO_un_link+0x0 (/lib/x86_64-linux-gnu/libc-2.28.so) diff --git a/tools/perf/Documentation/perf.txt b/tools/perf/Documentation/perf.txt index 401f0ed67439..3f37ded13f8c 100644 --- a/tools/perf/Documentation/perf.txt +++ b/tools/perf/Documentation/perf.txt @@ -24,6 +24,8 @@ OPTIONS data-convert - data convert command debug messages stderr - write debug output (option -v) to stderr in browser mode + perf-event-open - Print perf_event_open() arguments and + return value --buildid-dir:: Setup buildid cache directory. It has higher priority than |