author | Ingo Molnar <mingo@kernel.org> | 2018-06-07 07:18:51 +0200 |
---|---|---|
committer | Ingo Molnar <mingo@kernel.org> | 2018-06-07 07:18:51 +0200 |
commit | 2696ec4566f598ab483a6bebc4ec841b2efb88ec (patch) | |
tree | f4431b2d287084f53161e3de4ec5b292f4b6882a /tools | |
parent | d09a8e6f2c0a4fe3dcb85d21ea1069aa83152fe1 (diff) | |
parent | ac56aa4549cdfd9c56387b35e99e3c868cfc7bd0 (diff) |
Merge tag 'perf-core-for-mingo-4.18-20180606' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
perf stat:
- Display user and system time for workload targets (Jiri Olsa)
perf record:
- Enable arbitrary event names through the name= modifier (Alexey Budankov)
PowerPC:
- Add a python script for hypervisor call statistics (Ravi Bangoria)
Intel PT: (Adrian Hunter)
- Fix sync_switch INTEL_PT_SS_NOT_TRACING
- Fix decoding to accept CBR between FUP and corresponding TIP
- Fix MTC timing after overflow
- Fix "Unexpected indirect branch" error
perf test:
- record+probe_libc_inet_pton:
- To get the symbol table for dynamic shared objects on Ubuntu we need
to pass the -D/--dynamic command line option, unlike with the Fedora
distros (Arnaldo Carvalho de Melo)
- code-reading:
- Fix perf_env setup for PTI entry trampolines (Adrian Hunter)
- kmod-path:
- Add tests for vdso32 and vdsox32 (Adrian Hunter)
- Use header file util/debug.h (Thomas Richter)
perf annotate:
- Make the various UI backends (stdio, TUI, gtk) consistently use the
structs with annotation options as specified by the user (Arnaldo Carvalho de Melo)
- Move annotation specific knobs from the symbol_conf global kitchen
sink to the annotation option structs (Arnaldo Carvalho de Melo)
perf script:
- Add more PMU fields to python scripts event handler dict (Jin Yao)
Core:
- Fix misleading error for some unparsable events mentioning PMUs when
those are not involved in the problem (Jiri Olsa)
- Consider BSS symbols when processing /proc/kallsyms ('B' and 'b')
(Arnaldo Carvalho de Melo)
- Be more robust when trying to use per-symbol histograms: check for
unlikely but possible cases where the space for the histograms wasn't
allocated and print a debug message for such cases (Arnaldo Carvalho de Melo)
- Fix symbol and object code resolution for vdso32 and vdsox32 (Adrian Hunter)
- No need to check for NULL when passing pointers to foo__get() style
refcount grabbing helpers: just like in the kernel and with free(),
it's safe to pass a NULL pointer, which avoids having to check it before
each and every foo__get() call (Arnaldo Carvalho de Melo)
- Remove some dead code (quote.[ch]) (Arnaldo Carvalho de Melo)
- Remove some needless globals, making them local (Arnaldo Carvalho de Melo)
- Reduce usage of symbol_conf.use_callchain, using other means of
finding out if callchains are in use or available for specific events,
as we evolved this codebase to allow requesting callchains for just
a subset of the monitored events. In time it will help polish
recording and showing mixed sets across the various tools:
perf record -e cycles/call-graph=fp/,cache-misses/call-graph=dwarf/,instructions
(Arnaldo Carvalho de Melo)
- Consider PTI entry trampolines in map__rip_2objdump() (Adrian Hunter)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
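The BSS item in the Core list above can be illustrated with a small Python sketch. It reads /proc/kallsyms-style lines ("address type name [module]") and keeps text and data symbols, including the 'B' and 'b' types. The type letters follow the nm(1) convention; the helper names and the exact set of accepted letters are assumptions made for illustration and are not taken from perf's own filter.

```python
# Sketch only: filter /proc/kallsyms-style lines by symbol type.
# parse_kallsyms_line() and wanted_symbols() are hypothetical helper names.

# Symbol type letters as printed by nm(1) and /proc/kallsyms:
#   T/t = text, W/w = weak, D/d = initialized data, B/b = BSS.
FUNCTION_TYPES = set("TtWw")
OBJECT_TYPES = set("DdBb")  # assumed set; 'B' and 'b' are the BSS types


def parse_kallsyms_line(line):
    """Return (address, type, name, module) or None for a malformed line."""
    fields = line.split()
    if len(fields) < 3:
        return None
    try:
        addr = int(fields[0], 16)
    except ValueError:
        return None
    sym_type = fields[1]
    if len(sym_type) != 1:
        return None
    # Module symbols carry a trailing "[module]" field.
    module = fields[3].strip("[]") if len(fields) > 3 else None
    return addr, sym_type, fields[2], module


def wanted_symbols(lines):
    """Yield parsed entries whose type is a function or data/BSS symbol."""
    accepted = FUNCTION_TYPES | OBJECT_TYPES
    for line in lines:
        parsed = parse_kallsyms_line(line)
        if parsed is not None and parsed[1] in accepted:
            yield parsed
```

On a live system the input could be the lines of /proc/kallsyms; note that addresses may read as zero for unprivileged users, depending on the kptr_restrict setting.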
Diffstat (limited to 'tools')
59 files changed, 998 insertions, 427 deletions
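As a rough illustration of the 'perf stat' item in the commit message (user and system time for workload targets), the sketch below runs a command as a child process and collects its elapsed, user and system times. It assumes the user/sys figures come from the child's resource usage as returned by wait4(), which is also what time(1)-style tools report; the function names are hypothetical and this is not perf's code.

```python
# Sketch only: measure elapsed, user and system time of a workload.
# run_and_time() and format_timings() are hypothetical names.
import os
import sys
import time


def run_and_time(argv):
    """Run argv as a child process; return (elapsed, user, system) in seconds."""
    start = time.monotonic()
    pid = os.fork()
    if pid == 0:
        # Child: replace the process image; exit with 127 if exec fails.
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)
    # Parent: wait4() returns the child's rusage alongside its exit status.
    _, _status, rusage = os.wait4(pid, 0)
    elapsed = time.monotonic() - start
    return elapsed, rusage.ru_utime, rusage.ru_stime


def format_timings(elapsed, user, system):
    """Lay the three timings out one per line, nanosecond precision."""
    return (
        "%18.9f seconds time elapsed\n"
        "\n"
        "%18.9f seconds user\n"
        "%18.9f seconds sys" % (elapsed, user, system)
    )


if __name__ == "__main__":
    # Time the command given on the command line, or a trivial child if none.
    cmd = sys.argv[1:] or [sys.executable, "-c", "pass"]
    print(format_timings(*run_and_time(cmd)))
```

Using wait4() rather than a before/after getrusage(RUSAGE_CHILDREN) difference keeps the figures tied to the one child that was waited for.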
diff --git a/tools/perf/Documentation/perf-list.txt b/tools/perf/Documentation/perf-list.txt index 2549c34a7895..11300dbe35c5 100644 --- a/tools/perf/Documentation/perf-list.txt +++ b/tools/perf/Documentation/perf-list.txt @@ -124,7 +124,11 @@ The available PMUs and their raw parameters can be listed with For example the raw event "LSD.UOPS" core pmu event above could be specified as - perf stat -e cpu/event=0xa8,umask=0x1,name=LSD.UOPS_CYCLES,cmask=1/ ... + perf stat -e cpu/event=0xa8,umask=0x1,name=LSD.UOPS_CYCLES,cmask=0x1/ ... + + or using extended name syntax + + perf stat -e cpu/event=0xa8,umask=0x1,cmask=0x1,name=\'LSD.UOPS_CYCLES:cmask=0x1\'/ ... PER SOCKET PMUS --------------- diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt index cc37b3a4be76..04168da4268e 100644 --- a/tools/perf/Documentation/perf-record.txt +++ b/tools/perf/Documentation/perf-record.txt @@ -57,6 +57,9 @@ OPTIONS FP mode, "dwarf" for DWARF mode, "lbr" for LBR mode and "no" for disable callgraph. - 'stack-size': user stack size for dwarf mode + - 'name' : User defined event name. Single quotes (') may be used to + escape symbols in the name from parsing by shell and tool + like this: name=\'CPU_CLK_UNHALTED.THREAD:cmask=0x1\'. See the linkperf:perf-list[1] man page for more parameters. 
diff --git a/tools/perf/Documentation/perf-script-python.txt b/tools/perf/Documentation/perf-script-python.txt index 51ec2d20068a..0fb9eda3cbca 100644 --- a/tools/perf/Documentation/perf-script-python.txt +++ b/tools/perf/Documentation/perf-script-python.txt @@ -610,6 +610,32 @@ Various utility functions for use with perf script: nsecs_str(nsecs) - returns printable string in the form secs.nsecs avg(total, n) - returns average given a sum and a total number of values +SUPPORTED FIELDS +---------------- + +Currently supported fields: + +ev_name, comm, pid, tid, cpu, ip, time, period, phys_addr, addr, +symbol, dso, time_enabled, time_running, values, callchain, +brstack, brstacksym, datasrc, datasrc_decode, iregs, uregs, +weight, transaction, raw_buf, attr. + +Some fields have sub items: + +brstack: + from, to, from_dsoname, to_dsoname, mispred, + predicted, in_tx, abort, cycles. + +brstacksym: + items: from, to, pred, in_tx, abort (converted string) + +For example, +We can use this code to print brstack "from", "to", "cycles". + +if 'brstack' in dict: + for entry in dict['brstack']: + print "from %s, to %s, cycles %s" % (entry["from"], entry["to"], entry["cycles"]) + SEE ALSO -------- linkperf:perf-script[1] diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt index 3a822f308e6d..5dfe102fb5b5 100644 --- a/tools/perf/Documentation/perf-stat.txt +++ b/tools/perf/Documentation/perf-stat.txt @@ -310,20 +310,38 @@ Users who wants to get the actual value can apply --no-metric-only. 
EXAMPLES -------- -$ perf stat -- make -j +$ perf stat -- make - Performance counter stats for 'make -j': + Performance counter stats for 'make': - 8117.370256 task clock ticks # 11.281 CPU utilization factor - 678 context switches # 0.000 M/sec - 133 CPU migrations # 0.000 M/sec - 235724 pagefaults # 0.029 M/sec - 24821162526 CPU cycles # 3057.784 M/sec - 18687303457 instructions # 2302.138 M/sec - 172158895 cache references # 21.209 M/sec - 27075259 cache misses # 3.335 M/sec + 83723.452481 task-clock:u (msec) # 1.004 CPUs utilized + 0 context-switches:u # 0.000 K/sec + 0 cpu-migrations:u # 0.000 K/sec + 3,228,188 page-faults:u # 0.039 M/sec + 229,570,665,834 cycles:u # 2.742 GHz + 313,163,853,778 instructions:u # 1.36 insn per cycle + 69,704,684,856 branches:u # 832.559 M/sec + 2,078,861,393 branch-misses:u # 2.98% of all branches - Wall-clock time elapsed: 719.554352 msecs + 83.409183620 seconds time elapsed + + 74.684747000 seconds user + 8.739217000 seconds sys + +TIMINGS +------- +As displayed in the example above we can display 3 types of timings. +We always display the time the counters were enabled/alive: + + 83.409183620 seconds time elapsed + +For workload sessions we also display time the workloads spent in +user/system lands: + + 74.684747000 seconds user + 8.739217000 seconds sys + +Those times are the very same as displayed by the 'time' tool. 
CSV FORMAT ---------- diff --git a/tools/perf/arch/common.c b/tools/perf/arch/common.c index c6f373508a4f..82657c01a3b8 100644 --- a/tools/perf/arch/common.c +++ b/tools/perf/arch/common.c @@ -189,7 +189,7 @@ out_error: return -1; } -int perf_env__lookup_objdump(struct perf_env *env) +int perf_env__lookup_objdump(struct perf_env *env, const char **path) { /* * For live mode, env->arch will be NULL and we can use @@ -198,5 +198,5 @@ int perf_env__lookup_objdump(struct perf_env *env) if (env->arch == NULL) return 0; - return perf_env__lookup_binutils_path(env, "objdump", &objdump_path); + return perf_env__lookup_binutils_path(env, "objdump", path); } diff --git a/tools/perf/arch/common.h b/tools/perf/arch/common.h index 2d875baa92e6..2167001b18c5 100644 --- a/tools/perf/arch/common.h +++ b/tools/perf/arch/common.h @@ -4,8 +4,6 @@ #include "../util/env.h" -extern const char *objdump_path; - -int perf_env__lookup_objdump(struct perf_env *env); +int perf_env__lookup_objdump(struct perf_env *env, const char **path); #endif /* ARCH_PERF_COMMON_H */ diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c index da5704240239..5eb22cc56363 100644 --- a/tools/perf/builtin-annotate.c +++ b/tools/perf/builtin-annotate.c @@ -40,9 +40,8 @@ struct perf_annotate { struct perf_tool tool; struct perf_session *session; + struct annotation_options opts; bool use_tui, use_stdio, use_stdio2, use_gtk; - bool full_paths; - bool print_line; bool skip_missing; bool has_br_stack; bool group_set; @@ -162,12 +161,12 @@ static int hist_iter__branch_callback(struct hist_entry_iter *iter, hist__account_cycles(sample->branch_stack, al, sample, false); bi = he->branch_info; - err = addr_map_symbol__inc_samples(&bi->from, sample, evsel->idx); + err = addr_map_symbol__inc_samples(&bi->from, sample, evsel); if (err) goto out; - err = addr_map_symbol__inc_samples(&bi->to, sample, evsel->idx); + err = addr_map_symbol__inc_samples(&bi->to, sample, evsel); out: return err; @@ -249,7 
+248,7 @@ static int perf_evsel__add_sample(struct perf_evsel *evsel, if (he == NULL) return -ENOMEM; - ret = hist_entry__inc_addr_samples(he, sample, evsel->idx, al->addr); + ret = hist_entry__inc_addr_samples(he, sample, evsel, al->addr); hists__inc_nr_samples(hists, true); return ret; } @@ -289,10 +288,9 @@ static int hist_entry__tty_annotate(struct hist_entry *he, struct perf_annotate *ann) { if (!ann->use_stdio2) - return symbol__tty_annotate(he->ms.sym, he->ms.map, evsel, - ann->print_line, ann->full_paths, 0, 0); - return symbol__tty_annotate2(he->ms.sym, he->ms.map, evsel, - ann->print_line, ann->full_paths); + return symbol__tty_annotate(he->ms.sym, he->ms.map, evsel, &ann->opts); + + return symbol__tty_annotate2(he->ms.sym, he->ms.map, evsel, &ann->opts); } static void hists__find_annotations(struct hists *hists, @@ -343,7 +341,7 @@ find_next: /* skip missing symbols */ nd = rb_next(nd); } else if (use_browser == 1) { - key = hist_entry__tui_annotate(he, evsel, NULL); + key = hist_entry__tui_annotate(he, evsel, NULL, &ann->opts); switch (key) { case -1: @@ -390,8 +388,9 @@ static int __cmd_annotate(struct perf_annotate *ann) goto out; } - if (!objdump_path) { - ret = perf_env__lookup_objdump(&session->header.env); + if (!ann->opts.objdump_path) { + ret = perf_env__lookup_objdump(&session->header.env, + &ann->opts.objdump_path); if (ret) goto out; } @@ -476,6 +475,7 @@ int cmd_annotate(int argc, const char **argv) .ordered_events = true, .ordering_requires_timestamps = true, }, + .opts = annotation__default_options, }; struct perf_data data = { .mode = PERF_DATA_MODE_READ, @@ -503,9 +503,9 @@ int cmd_annotate(int argc, const char **argv) "file", "vmlinux pathname"), OPT_BOOLEAN('m', "modules", &symbol_conf.use_modules, "load module symbols - WARNING: use only with -k and LIVE kernel"), - OPT_BOOLEAN('l', "print-line", &annotate.print_line, + OPT_BOOLEAN('l', "print-line", &annotate.opts.print_lines, "print matching source lines (may be slow)"), - 
OPT_BOOLEAN('P', "full-paths", &annotate.full_paths, + OPT_BOOLEAN('P', "full-paths", &annotate.opts.full_path, "Don't shorten the displayed pathnames"), OPT_BOOLEAN(0, "skip-missing", &annotate.skip_missing, "Skip symbols that cannot be annotated"), @@ -516,13 +516,13 @@ int cmd_annotate(int argc, const char **argv) OPT_CALLBACK(0, "symfs", NULL, "directory", "Look for files with symbols relative to this directory", symbol__config_symfs), - OPT_BOOLEAN(0, "source", &symbol_conf.annotate_src, + OPT_BOOLEAN(0, "source", &annotate.opts.annotate_src, "Interleave source code with assembly code (default)"), - OPT_BOOLEAN(0, "asm-raw", &symbol_conf.annotate_asm_raw, + OPT_BOOLEAN(0, "asm-raw", &annotate.opts.show_asm_raw, "Display raw encoding of assembly instructions (default)"), - OPT_STRING('M', "disassembler-style", &disassembler_style, "disassembler style", + OPT_STRING('M', "disassembler-style", &annotate.opts.disassembler_style, "disassembler style", "Specify disassembler style (e.g. -M intel for intel syntax)"), - OPT_STRING(0, "objdump", &objdump_path, "path", + OPT_STRING(0, "objdump", &annotate.opts.objdump_path, "path", "objdump binary to use for disassembly and annotations"), OPT_BOOLEAN(0, "group", &symbol_conf.event_group, "Show event group information together"), diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c index 2126bfbcb385..307b3594525f 100644 --- a/tools/perf/builtin-c2c.c +++ b/tools/perf/builtin-c2c.c @@ -1976,7 +1976,7 @@ static int filter_cb(struct hist_entry *he) c2c_he = container_of(he, struct c2c_hist_entry, he); if (c2c.show_src && !he->srcline) - he->srcline = hist_entry__get_srcline(he); + he->srcline = hist_entry__srcline(he); calc_width(c2c_he); diff --git a/tools/perf/builtin-kvm.c b/tools/perf/builtin-kvm.c index 72e2ca096bf5..2b1ef704169f 100644 --- a/tools/perf/builtin-kvm.c +++ b/tools/perf/builtin-kvm.c @@ -1438,8 +1438,6 @@ static int kvm_events_live(struct perf_kvm_stat *kvm, goto out; } - symbol_conf.nr_events 
= kvm->evlist->nr_entries; - if (perf_evlist__create_maps(kvm->evlist, &kvm->opts.target) < 0) usage_with_options(live_usage, live_options); diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c index c0065923a525..99de91698de1 100644 --- a/tools/perf/builtin-probe.c +++ b/tools/perf/builtin-probe.c @@ -81,8 +81,7 @@ static int parse_probe_event(const char *str) params.target_used = true; } - if (params.nsi) - pev->nsi = nsinfo__get(params.nsi); + pev->nsi = nsinfo__get(params.nsi); /* Parse a perf-probe command into event */ ret = parse_perf_probe_command(str, pev); diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c index ad978e3ee2b8..cdb5b6949832 100644 --- a/tools/perf/builtin-report.c +++ b/tools/perf/builtin-report.c @@ -71,6 +71,7 @@ struct report { bool group_set; int max_stack; struct perf_read_values show_threads_values; + struct annotation_options annotation_opts; const char *pretty_printing_style; const char *cpu_list; const char *symbol_filter_str; @@ -136,26 +137,25 @@ static int hist_iter__report_callback(struct hist_entry_iter *iter, if (sort__mode == SORT_MODE__BRANCH) { bi = he->branch_info; - err = addr_map_symbol__inc_samples(&bi->from, sample, evsel->idx); + err = addr_map_symbol__inc_samples(&bi->from, sample, evsel); if (err) goto out; - err = addr_map_symbol__inc_samples(&bi->to, sample, evsel->idx); + err = addr_map_symbol__inc_samples(&bi->to, sample, evsel); } else if (rep->mem_mode) { mi = he->mem_info; - err = addr_map_symbol__inc_samples(&mi->daddr, sample, evsel->idx); + err = addr_map_symbol__inc_samples(&mi->daddr, sample, evsel); if (err) goto out; - err = hist_entry__inc_addr_samples(he, sample, evsel->idx, al->addr); + err = hist_entry__inc_addr_samples(he, sample, evsel, al->addr); } else if (symbol_conf.cumulate_callchain) { if (single) - err = hist_entry__inc_addr_samples(he, sample, evsel->idx, - al->addr); + err = hist_entry__inc_addr_samples(he, sample, evsel, al->addr); } else { - err = 
hist_entry__inc_addr_samples(he, sample, evsel->idx, al->addr); + err = hist_entry__inc_addr_samples(he, sample, evsel, al->addr); } out: @@ -181,11 +181,11 @@ static int hist_iter__branch_callback(struct hist_entry_iter *iter, rep->nonany_branch_mode); bi = he->branch_info; - err = addr_map_symbol__inc_samples(&bi->from, sample, evsel->idx); + err = addr_map_symbol__inc_samples(&bi->from, sample, evsel); if (err) goto out; - err = addr_map_symbol__inc_samples(&bi->to, sample, evsel->idx); + err = addr_map_symbol__inc_samples(&bi->to, sample, evsel); branch_type_count(&rep->brtype_stat, &bi->flags, bi->from.addr, bi->to.addr); @@ -561,7 +561,7 @@ static int report__browse_hists(struct report *rep) ret = perf_evlist__tui_browse_hists(evlist, help, NULL, rep->min_percent, &session->header.env, - true); + true, &rep->annotation_opts); /* * Usually "ret" is the last pressed key, and we only * care if the key notifies us to switch data file. @@ -946,12 +946,6 @@ parse_percent_limit(const struct option *opt, const char *str, return 0; } -#define CALLCHAIN_DEFAULT_OPT "graph,0.5,caller,function,percent" - -const char report_callchain_help[] = "Display call graph (stack chain/backtrace):\n\n" - CALLCHAIN_REPORT_HELP - "\n\t\t\t\tDefault: " CALLCHAIN_DEFAULT_OPT; - int cmd_report(int argc, const char **argv) { struct perf_session *session; @@ -960,6 +954,10 @@ int cmd_report(int argc, const char **argv) bool has_br_stack = false; int branch_mode = -1; bool branch_call_mode = false; +#define CALLCHAIN_DEFAULT_OPT "graph,0.5,caller,function,percent" + const char report_callchain_help[] = "Display call graph (stack chain/backtrace):\n\n" + CALLCHAIN_REPORT_HELP + "\n\t\t\t\tDefault: " CALLCHAIN_DEFAULT_OPT; char callchain_default_opt[] = CALLCHAIN_DEFAULT_OPT; const char * const report_usage[] = { "perf report [<options>]", @@ -989,6 +987,7 @@ int cmd_report(int argc, const char **argv) .max_stack = PERF_MAX_STACK_DEPTH, .pretty_printing_style = "normal", .socket_filter = -1, 
+ .annotation_opts = annotation__default_options, }; const struct option options[] = { OPT_STRING('i', "input", &input_name, "file", @@ -1078,11 +1077,11 @@ int cmd_report(int argc, const char **argv) "list of cpus to profile"), OPT_BOOLEAN('I', "show-info", &report.show_full_info, "Display extended information about perf.data file"), - OPT_BOOLEAN(0, "source", &symbol_conf.annotate_src, + OPT_BOOLEAN(0, "source", &report.annotation_opts.annotate_src, "Interleave source code with assembly code (default)"), - OPT_BOOLEAN(0, "asm-raw", &symbol_conf.annotate_asm_raw, + OPT_BOOLEAN(0, "asm-raw", &report.annotation_opts.show_asm_raw, "Display raw encoding of assembly instructions (default)"), - OPT_STRING('M', "disassembler-style", &disassembler_style, "disassembler style", + OPT_STRING('M', "disassembler-style", &report.annotation_opts.disassembler_style, "disassembler style", "Specify disassembler style (e.g. -M intel for intel syntax)"), OPT_BOOLEAN(0, "show-total-period", &symbol_conf.show_total_period, "Show a column with the sum of periods"), @@ -1093,7 +1092,7 @@ int cmd_report(int argc, const char **argv) parse_branch_mode), OPT_BOOLEAN(0, "branch-history", &branch_call_mode, "add last branch records to call history"), - OPT_STRING(0, "objdump", &objdump_path, "path", + OPT_STRING(0, "objdump", &report.annotation_opts.objdump_path, "path", "objdump binary to use for disassembly and annotations"), OPT_BOOLEAN(0, "demangle", &symbol_conf.demangle, "Disable symbol demangling"), diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c index 4dfdee668b0c..cbf39dab19c1 100644 --- a/tools/perf/builtin-sched.c +++ b/tools/perf/builtin-sched.c @@ -2143,7 +2143,7 @@ static void save_task_callchain(struct perf_sched *sched, return; } - if (!symbol_conf.use_callchain || sample->callchain == NULL) + if (!sched->show_callchain || sample->callchain == NULL) return; if (thread__resolve_callchain(thread, cursor, evsel, sample, @@ -2271,10 +2271,11 @@ static struct 
thread *get_idle_thread(int cpu) return idle_threads[cpu]; } -static void save_idle_callchain(struct idle_thread_runtime *itr, +static void save_idle_callchain(struct perf_sched *sched, + struct idle_thread_runtime *itr, struct perf_sample *sample) { - if (!symbol_conf.use_callchain || sample->callchain == NULL) + if (!sched->show_callchain || sample->callchain == NULL) return; callchain_cursor__copy(&itr->cursor, &callchain_cursor); @@ -2320,7 +2321,7 @@ static struct thread *timehist_get_thread(struct perf_sched *sched, /* copy task callchain when entering to idle */ if (perf_evsel__intval(evsel, sample, "next_pid") == 0) - save_idle_callchain(itr, sample); + save_idle_callchain(sched, itr, sample); } } @@ -2849,7 +2850,7 @@ static void timehist_print_summary(struct perf_sched *sched, printf(" CPU %2d idle entire time window\n", i); } - if (sched->idle_hist && symbol_conf.use_callchain) { + if (sched->idle_hist && sched->show_callchain) { callchain_param.mode = CHAIN_FOLDED; callchain_param.value = CCVAL_PERIOD; @@ -2933,8 +2934,7 @@ static int timehist_check_attr(struct perf_sched *sched, return -1; } - if (sched->show_callchain && - !(evsel->attr.sample_type & PERF_SAMPLE_CALLCHAIN)) { + if (sched->show_callchain && !evsel__has_callchain(evsel)) { pr_info("Samples do not have callchains.\n"); |