summaryrefslogtreecommitdiffstats
path: root/tools/perf/util/bpf-loader.c
</
AgeCommit message (Collapse)Author
2016-12-05perf clang: Compile BPF script using builtin clang supportWang Nan
After this patch, perf utilizes builtin clang support to build BPF script, no longer depend on external clang, but fallbacking to it if for some reason the builtin compiling framework fails. Test: $ type clang -bash: type: clang: not found $ cat ~/.perfconfig $ echo '#define LINUX_VERSION_CODE 0x040700' > ./test.c $ cat ./tools/perf/tests/bpf-script-example.c >> ./test.c $ ./perf record -v --dry-run -e ./test.c 2>&1 | grep builtin bpf: successfull builtin compilation $ Can't pass cflags so unable to include kernel headers now. Will be fixed by following commits. Committer notes: Make sure '-v' comes before the '-e ./test.c' in the command line otherwise the 'verbose' variable will not be set when the bpf event is parsed and thus the pr_debug indicating a 'successfull builtin compilation' will not be output, as the debug level (1) will be less than what 'verbose' has at that point (0). Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@fb.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Joe Stringer <joe@ovn.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/20161126070354.141764-16-wangnan0@huawei.com [ Spell check/reflow successfull pr_debug string ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-05perf llvm: Extract helpers in llvm-utils.cWang Nan
The following commits will use builtin clang to compile BPF scripts. llvm__get_kbuild_opts() and llvm__get_nr_cpus() are extracted to help building '-DKERNEL_VERSION_CODE' and '-D__NR_CPUS__' macros. Doing object dumping in bpf loader, so further builtin clang compiling needn't consider it. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Alexei Starovoitov <ast@fb.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Joe Stringer <joe@ovn.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/20161126070354.141764-7-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-10-24perf tools: Fix typo "No enough" to "Not enough"Alexander Alemayhu
The latter version occurs much more when running git grep. Signed-off-by: Alexander Alemayhu <alexander@alemayhu.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20161013161811.4939-1-alexander@alemayhu.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-08-23perf bpf: Fix typo: "ehough" -> "enough"Colin Ian King
Trivial typo fix in pr_debug message Signed-off-by: Colin King <colin.king@canonical.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: He Kuang <hekuang@huawei.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20160821141924.8056-1-colin.king@canonical.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-07-13perf bpf: Support BPF program attach to tracepointsWang Nan
To support 98b5c2c65c29 ("perf, bpf: allow bpf programs attach to tracepoints"), this patch allows BPF scripts to select tracepoints in their section name. Example: # cat test_tracepoint.c /*********************************************/ #include <uapi/linux/bpf.h> #define SEC(NAME) __attribute__((section(NAME), used)) SEC("raw_syscalls:sys_enter") int func(void *ctx) { /* * /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/format: * ... * field:long id; offset:8; size:8; signed:1; * ... * ctx + 8 select 'id' */ u64 id = *((u64 *)(ctx + 8)); if (id == 1) return 1; return 0; } SEC("_write=sys_write") int _write(void *ctx) { return 1; } char _license[] SEC("license") = "GPL"; int _version SEC("version") = LINUX_VERSION_CODE; /*********************************************/ # perf record -e ./test_tracepoint.c dd if=/dev/zero of=/dev/null count=5 5+0 records in 5+0 records out 2560 bytes (2.6 kB) copied, 6.2281e-05 s, 41.1 MB/s [ perf record: Woken up 1 times to write data ] # perf script dd 13436 [005] 1596.490869: raw_syscalls:sys_enter: NR 1 (1, 178d000, 200, 7ffe82470d60, ffffffffffffe020, fffff dd 13436 [005] 1596.490871: perf_bpf_probe:_write: (ffffffff812351e0) dd 13436 [005] 1596.490873: raw_syscalls:sys_enter: NR 1 (1, 178d000, 200, ffffffffffffe000, ffffffffffffe020, f dd 13436 [005] 1596.490874: perf_bpf_probe:_write: (ffffffff812351e0) dd 13436 [005] 1596.490876: raw_syscalls:sys_enter: NR 1 (1, 178d000, 200, ffffffffffffe000, ffffffffffffe020, f dd 13436 [005] 1596.490876: perf_bpf_probe:_write: (ffffffff812351e0) dd 13436 [005] 1596.490878: raw_syscalls:sys_enter: NR 1 (1, 178d000, 200, ffffffffffffe000, ffffffffffffe020, f dd 13436 [005] 1596.490879: perf_bpf_probe:_write: (ffffffff812351e0) dd 13436 [005] 1596.490881: raw_syscalls:sys_enter: NR 1 (1, 178d000, 200, ffffffffffffe000, ffffffffffffe020, f dd 13436 [005] 1596.490882: perf_bpf_probe:_write: (ffffffff812351e0) dd 13436 [005] 1596.490900: raw_syscalls:sys_enter: NR 1 (2, 7ffe8246e640, 1f, 40acb8, 7f44bac74700, 7f44baa4fba dd 13436 [005] 1596.490901: perf_bpf_probe:_write: (ffffffff812351e0) dd 13436 [005] 1596.490917: raw_syscalls:sys_enter: NR 1 (2, 7ffe8246e640, 1a, fffffffa, 7f44bac74700, 7f44baa4f dd 13436 [005] 1596.490918: perf_bpf_probe:_write: (ffffffff812351e0) dd 13436 [005] 1596.490932: raw_syscalls:sys_enter: NR 1 (2, 7ffe8246e640, 1a, fffffff9, 7f44bac74700, 7f44baa4f dd 13436 [005] 1596.490933: perf_bpf_probe:_write: (ffffffff812351e0) Committer note: Further testing: # trace --no-sys --event /home/acme/bpf/tracepoint.c cat /etc/passwd > /dev/null 0.000 raw_syscalls:sys_enter:NR 1 (1, 7f0490504000, c48, 7f0490503010, ffffffffffffffff, 0)) 0.006 perf_bpf_probe:_write:(ffffffff81241bc0)) # Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1468406646-21642-6-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-07-13perf bpf: Rename bpf__foreach_tev() to bpf__foreach_event()Wang Nan
Following commit will allow BPF script attach to tracepoints. bpf__foreach_tev() will iterate over all events, not only kprobes. Rename it to bpf__foreach_event(). Since only group and event are used by caller, there's no need to pass full 'struct probe_trace_event' to bpf_prog_iter_callback_t. Pass only these two strings. After this patch bpf_prog_iter_callback_t natually support tracepoints. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1468406646-21642-5-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-07-12tools: Introduce str_error_r()Arnaldo Carvalho de Melo
The tools so far have been using the strerror_r() GNU variant, that returns a string, be it the buffer passed or something else. But that, besides being tricky in cases where we expect that the function using strerror_r() returns the error formatted in a provided buffer (we have to check if it returned something else and copy that instead), breaks the build on systems not using glibc, like Alpine Linux, where musl libc is used. So, introduce yet another wrapper, str_error_r(), that has the GNU interface, but uses the portable XSI variant of strerror_r(), so that users rest asured that the provided buffer is used and it is what is returned. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-d4t42fnf48ytlk8rjxs822tf@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-06-06tools lib bpf: Rename set_private() to set_priv()Arnaldo Carvalho de Melo
For consistency with class__priv() elsewhere, and with the callback typedef for clearing those areas (e.g. bpf_map_clear_priv_t). Acked-by: Wang Nan <wangnan0@huawei.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/n/tip-rnbiyv27ohw8xppsgx0el3xb@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-06-06tools lib bpf: Make bpf_program__get_private() use IS_ERR()Arnaldo Carvalho de Melo
For consistency with bpf_map__priv() and elsewhere. Acked-by: Wang Nan <wangnan0@huawei.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/n/tip-x17nk5mrazkf45z0l0ahlmo8@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-06-06tools lib bpf: Remove _get_ from non-refcount method namesArnaldo Carvalho de Melo
The use of this term is not warranted here, we use it in the kernel sources and in tools/ for refcounting, so, for consistency, rename them. Acked-bu: Wang Nan <wangnan0@huawei.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/n/tip-4ya1ot2e2fkrz48ws9ebiofs@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-06-06tools lib bpf: Rename bpf_map__get_fd() to bpf_map__fd()Arnaldo Carvalho de Melo
For consistency, leaving "get" for reference counting. Acked-by: Wang Nan <wangnan0@huawei.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/n/tip-msy8sxfz9th6gl2xjeci2btm@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-06-06tools lib bpf: Use IS_ERR() reporting macros with bpf_map__get_def()Arnaldo Carvalho de Melo
And for consistency, rename it to bpf_map__def(), leaving "get" for reference counting. Also make it return a const pointer, as suggested by Wang. Acked-by: Wang Nan <wangnan0@huawei.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/n/tip-mer00xqkiho0ymg66b5i9luw@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-06-06tools lib bpf: Rename bpf_map__get_name() to bpf_map__name()Arnaldo Carvalho de Melo
For consistency, leaving "get" for reference counting. Acked-by: Wang Nan <wangnan0@huawei.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/n/tip-crnflv84ejyhpba933ec71gs@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-06-06tools lib bpf: Use IS_ERR() reporting macros with bpf_map__get_private()Arnaldo Carvalho de Melo
To try to, over time, consistently use the IS_ERR() interface instead of using two return values, i.e. the integer return value for an error and the pointer address to return the bpf_map->priv pointer. Also rename it to bpf__priv(), to leave the "get" term for reference counting. Noticed while working on using BPF for collecting non-integer syscall argument payloads (struct sockaddr in calls such as connect(), for instance), where we need to use BPF maps and thus generalise bpf__setup_stdout() to connect bpf_output events with maps in a bpf proggie. Acked-by: Wang Nan <wangnan0@huawei.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/n/tip-saypxyd6ptrct379jqgxx4bl@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-11perf bpf: Automatically create bpf-output event __bpf_stdout__Wang Nan
This patch removes the need to set a bpf-output event in cmdline. By referencing a map named '__bpf_stdout__', perf automatically creates an event for it. For example: # perf record -e ./test_bpf_trace.c usleep 100000 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.012 MB perf.data (2 samples) ] # perf script usleep 4639 [000] 261895.307826: 0 __bpf_stdout__: ffffffff810eb9a1 ... BPF output: 0000: 52 61 69 73 65 20 61 20 Raise a 0008: 42 50 46 20 65 76 65 6e BPF even 0010: 74 21 00 00 t!.. BPF string: "Raise a BPF event!" usleep 4639 [000] 261895.407883: 0 __bpf_stdout__: ffffffff8105d609 ... BPF output: 0000: 52 61 69 73 65 20 61 20 Raise a 0008: 42 50 46 20 65 76 65 6e BPF even 0010: 74 21 00 00 t!.. BPF string: "Raise a BPF event!" perf record -e ./test_bpf_trace.c usleep 100000 equals to: perf record -e bpf-output/no-inherit=1,name=__bpf_stdout__/ \ -e ./test_bpf_trace.c/map:__bpf_stdout__.event=__bpf_stdout__/ \ usleep 100000 Where test_bpf_trace.c is: /************************ BEGIN **************************/ #include <uapi/linux/bpf.h> struct bpf_map_def { unsigned int type; unsigned int key_size; unsigned int value_size; unsigned int max_entries; }; #define SEC(NAME) __attribute__((section(NAME), used)) static u64 (*ktime_get_ns)(void) = (void *)BPF_FUNC_ktime_get_ns; static int (*trace_printk)(const char *fmt, int fmt_size, ...) = (void *)BPF_FUNC_trace_printk; static int (*get_smp_processor_id)(void) = (void *)BPF_FUNC_get_smp_processor_id; static int (*perf_event_output)(void *, struct bpf_map_def *, int, void *, unsigned long) = (void *)BPF_FUNC_perf_event_output; struct bpf_map_def SEC("maps") __bpf_stdout__ = { .type = BPF_MAP_TYPE_PERF_EVENT_ARRAY, .key_size = sizeof(int), .value_size = sizeof(u32), .max_entries = __NR_CPUS__, }; static inline int __attribute__((always_inline)) func(void *ctx, int type) { char output_str[] = "Raise a BPF event!"; char err_str[] = "BAD %d\n"; int err; err = perf_event_output(ctx, &__bpf_stdout__, get_smp_processor_id(), &output_str, sizeof(output_str)); if (err) trace_printk(err_str, sizeof(err_str), err); return 1; } SEC("func_begin=sys_nanosleep") int func_begin(void *ctx) {return func(ctx, 1);} SEC("func_end=sys_nanosleep%return") int func_end(void *ctx) { return func(ctx, 2);} char _license[] SEC("license") = "GPL"; int _version SEC("version") = LINUX_VERSION_CODE; /************************* END ***************************/ Committer note: Testing with 'perf trace': # trace -e nanosleep --ev test_bpf_stdout.c usleep 1 0.007 ( 0.007 ms): usleep/729 nanosleep(rqtp: 0x7ffc5bbc5fe0) ... 0.007 ( ): __bpf_stdout__:Raise a BPF event!..) 0.008 ( ): perf_bpf_probe:func_begin:(ffffffff81112460)) 0.069 ( ): __bpf_stdout__:Raise a BPF event!..) 0.070 ( ): perf_bpf_probe:func_end:(ffffffff81112460 <- ffffffff81003d92)) 0.072 ( 0.072 ms): usleep/729 ... [continued]: nanosleep()) = 0 # Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1460128045-97310-5-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-11perf bpf: Clone bpf stdout events in multiple bpf scriptsWang Nan
This patch allows cloning bpf-output event configuration among multiple bpf scripts. If there exist a map named '__bpf_output__' and not configured using 'map:__bpf_output__.event=', this patch clones the configuration of another '__bpf_stdout__' map. For example, following command: # perf trace --ev bpf-output/no-inherit,name=evt/ \ --ev ./test_bpf_trace.c/map:__bpf_stdout__.event=evt/ \ --ev ./test_bpf_trace2.c usleep 100000 equals to: # perf trace --ev bpf-output/no-inherit,name=evt/ \ --ev ./test_bpf_trace.c/map:__bpf_stdout__.event=evt/ \ --ev ./test_bpf_trace2.c/map:__bpf_stdout__.event=evt/ \ usleep 100000 Signed-off-by: Wang Nan <wangnan0@huawei.com> Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1460128045-97310-4-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-22perf tools: Introduce bpf-output eventWang Nan
Commit a43eec304259 ("bpf: introduce bpf_perf_event_output() helper") adds a helper to enable a BPF program to output data to a perf ring buffer through a new type of perf event, PERF_COUNT_SW_BPF_OUTPUT. This patch enables perf to create events of that type. Now a perf user can use the following cmdline to receive output data from BPF programs: # perf record -a -e bpf-output/no-inherit,name=evt/ \ -e ./test_bpf_output.c/map:channel.event=evt/ ls / # perf script perf 1560 [004] 347747.086295: evt: ffffffff811fd201 sys_write ... perf 1560 [004] 347747.086300: evt: ffffffff811fd201 sys_write ... perf 1560 [004] 347747.086315: evt: ffffffff811fd201 sys_write ... ... Test result: # cat test_bpf_output.c /************************ BEGIN **************************/ #include <uapi/linux/bpf.h> struct bpf_map_def { unsigned int type; unsigned int key_size; unsigned int value_size; unsigned int max_entries; }; #define SEC(NAME) __attribute__((section(NAME), used)) static u64 (*ktime_get_ns)(void) = (void *)BPF_FUNC_ktime_get_ns; static int (*trace_printk)(const char *fmt, int fmt_size, ...) = (void *)BPF_FUNC_trace_printk; static int (*get_smp_processor_id)(void) = (void *)BPF_FUNC_get_smp_processor_id; static int (*perf_event_output)(void *, struct bpf_map_def *, int, void *, unsigned long) = (void *)BPF_FUNC_perf_event_output; struct bpf_map_def SEC("maps") channel = { .type = BPF_MAP_TYPE_PERF_EVENT_ARRAY, .key_size = sizeof(int), .value_size = sizeof(u32), .max_entries = __NR_CPUS__, }; SEC("func_write=sys_write") int func_write(void *ctx) { struct { u64 ktime; int cpuid; } __attribute__((packed)) output_data; char error_data[] = "Error: failed to output: %d\n"; output_data.cpuid = get_smp_processor_id(); output_data.ktime = ktime_get_ns(); int err = perf_event_output(ctx, &channel, get_smp_processor_id(), &output_data, sizeof(output_data)); if (err) trace_printk(error_data, sizeof(error_data), err); return 0; } char _license[] SEC("license") = "GPL"; int _version SEC("version") = LINUX_VERSION_CODE; /************************ END ***************************/ # perf record -a -e bpf-output/no-inherit,name=evt/ \ -e ./test_bpf_output.c/map:channel.event=evt/ ls / # perf script | grep ls ls 2242 [003] 347851.557563: evt: ffffffff811fd201 sys_write ... ls 2242 [003] 347851.557571: evt: ffffffff811fd201 sys_write ... Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Cody P Schafer <dev@codyps.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kirill Smelkov <kirr@nexedi.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1456132275-98875-11-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-22perf tools: Support setting different slots in a BPF map separatelyWang Nan
This patch introduces basic facilities to support config different slots in a BPF map one by one. array.nr_ranges and array.ranges are introduced into 'struct parse_events_term', where ranges is an array of indices range (start, length) which will be configured by this config term. nr_ranges is the size of the array. The array is passed to 'struct bpf_map_priv'. To indicate the new type of configuration, BPF_MAP_KEY_RANGES is added as a new key type. bpf_map_config_foreach_key() is extended to iterate over those indices instead of all possible keys. Code in this commit will be enabled by following commit which enables the indices syntax for array configuration. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Cody P Schafer <dev@codyps.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kirill Smelkov <kirr@nexedi.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1456132275-98875-8-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-22perf tools: Enable passing event to BPF objectWang Nan
A new syntax is added to the parser so that the user can access predefined perf events in BPF objects. After this patch, BPF programs for perf are finally able to utilize bpf_perf_event_read() introduced in commit 35578d798400 ("bpf: Implement function bpf_perf_event_read() that get the selected hardware PMU counter"). Test result: # cat test_bpf_map_2.c /************************ BEGIN **************************/ #include <uapi/linux/bpf.h> #define SEC(NAME) __attribute__((section(NAME), used)) struct bpf_map_def { unsigned int type; unsigned int key_size; unsigned int value_size; unsigned int max_entries; }; static int (*trace_printk)(const char *fmt, int fmt_size, ...) = (void *)BPF_FUNC_trace_printk; static int (*get_smp_processor_id)(void) = (void *)BPF_FUNC_get_smp_processor_id; static int (*perf_event_read)(struct bpf_map_def *, int) = (void *)BPF_FUNC_perf_event_read; struct bpf_map_def SEC("maps") pmu_map = { .type = BPF_MAP_TYPE_PERF_EVENT_ARRAY, .key_size = sizeof(int), .value_size = sizeof(int), .max_entries = __NR_CPUS__, }; SEC("func_write=sys_write") int func_write(void *ctx) { unsigned long long val; char fmt[] = "sys_write: pmu=%llu\n"; val = perf_event_read(&pmu_map, get_smp_processor_id()); trace_printk(fmt, sizeof(fmt), val); return 0; } SEC("func_write_return=sys_write%return") int func_write_return(void *ctx) { unsigned long long val = 0; char fmt[] = "sys_write_return: pmu=%llu\n"; val = perf_event_read(&pmu_map, get_smp_processor_id()); trace_printk(fmt, sizeof(fmt), val); return 0; } char _license[] SEC("license") = "GPL"; int _version SEC("version") = LINUX_VERSION_CODE; /************************* END ***************************/ Normal case: # echo "" > /sys/kernel/debug/tracing/trace # perf record -i -e cycles -e './test_bpf_map_2.c/map:pmu_map.event=cycles/' ls / [SNIP] [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.013 MB perf.data (7 samples) ] # cat /sys/kernel/debug/tracing/trace | grep ls ls-17066 [000] d... 938449.863301: : sys_write: pmu=1157327 ls-17066 [000] dN.. 938449.863342: : sys_write_return: pmu=1225218 ls-17066 [000] d... 938449.863349: : sys_write: pmu=1241922 ls-17066 [000] dN.. 938449.863369: : sys_write_return: pmu=1267445 Normal case (system wide): # echo "" > /sys/kernel/debug/tracing/trace # perf record -i -e cycles -e './test_bpf_map_2.c/map:pmu_map.event=cycles/' -a ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.811 MB perf.data (120 samples) ] # cat /sys/kernel/debug/tracing/trace | grep -v '18446744073709551594' | grep -v perf | head -n 20 [SNIP] # TASK-PID CPU# |||| TIMESTAMP FUNCTION # | | | |||| | | gmain-30828 [002] d... 2740551.068992: : sys_write: pmu=84373 gmain-30828 [002] d... 2740551.068992: : sys_write_return: pmu=87696 gmain-30828 [002] d... 2740551.068996: : sys_write: pmu=100658 gmain-30828 [002] d... 2740551.068997: : sys_write_return: pmu=102572 Error case 1: # perf record -e './test_bpf_map_2.c' ls / [SNIP] [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.014 MB perf.data ] # cat /sys/kernel/debug/tracing/trace | grep ls ls-17115 [007] d... 2724279.665625: : sys_write: pmu=18446744073709551614 ls-17115 [007] dN.. 2724279.665651: : sys_write_return: pmu=18446744073709551614 ls-17115 [007] d... 2724279.665658: : sys_write: pmu=18446744073709551614 ls-17115 [007] dN.. 2724279.665677: : sys_write_return: pmu=18446744073709551614 (18446744073709551614 is 0xfffffffffffffffe (-2)) Error case 2: # perf record -e cycles -e './test_bpf_map_2.c/map:pmu_map.event=evt/' -a event syntax error: '..ps:pmu_map.event=evt/' \___ Event not found for map setting Hint: Valid config terms: map:[<arraymap>].value=[value] map:[<eventmap>].event=[event] [SNIP] Error case 3: # ls /proc/2348/task/ 2348 2505 2506 2507 2508 # perf record -i -e cycles -e './test_bpf_map_2.c/map:pmu_map.event=cycles/' -p 2348 ERROR: Apply config to BPF failed: Cannot set event to BPF map in multi-thread tracing Error case 4: # perf record -e cycles -e './test_bpf_map_2.c/map:pmu_map.event=cycles/' ls / ERROR: Apply config to BPF failed: Doesn't support inherit event (Hint: use -i to turn off inherit) Error case 5: # perf record -i -e raw_syscalls:sys_enter -e './test_bpf_map_2.c/map:pmu_map.event=raw_syscalls:sys_enter/' ls ERROR: Apply config to BPF failed: Can only put raw, hardware and BPF output event into a BPF map Error case 6: # perf record -i -e './test_bpf_map_2.c/map:pmu_map.event=123/' ls / event syntax error: '.._map.event=123/' \___ Incorrect value type for map [SNIP] Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Cody P Schafer <dev@codyps.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com> Cc: Kirill Smelkov <kirr@nexedi.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1456132275-98875-7-git-send-email-wangnan0@huawei.com Signed-off-by: He Kuang <hekuang@huawei.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-22perf record: Apply config to BPF objects before recordingWang Nan
bpf__apply_obj_config() is introduced as the core API to apply object config options to all BPF objects. This patch also does the real work for setting values for BPF_MAP_TYPE_PERF_ARRAY maps by inserting value stored in map's private field into the BPF map. This patch is required because we are not always able to set all BPF config during parsing. Further patch will set events created by perf to BPF_MAP_TYPE_PERF_EVENT_ARRAY maps, which is not exist until perf_evsel__open(). bpf_map_foreach_key() is introduced to iterate over each key needs to be configured. This function would be extended to support more map types and different key settings. In perf record, before start recording, call bpf__apply_config() to turn on all BPF config options. Test result: # cat ./test_bpf_map_1.c /************************ BEGIN **************************/ #include <uapi/linux/bpf.h> #define SEC(NAME) __attribute__((section(NAME), used)) struct bpf_map_def { unsigned int type; unsigned int key_size; unsigned int value_size; unsigned int max_entries; }; static void *(*map_lookup_elem)(struct bpf_map_def *, void *) = (void *)BPF_FUNC_map_lookup_elem; static int (*trace_printk)(const char *fmt, int fmt_size, ...) = (void *)BPF_FUNC_trace_printk; struct bpf_map_def SEC("maps") channel = { .type = BPF_MAP_TYPE_ARRAY, .key_size = sizeof(int), .value_size = sizeof(int), .max_entries = 1, }; SEC("func=sys_nanosleep") int func(void *ctx) { int key = 0; char fmt[] = "%d\n"; int *pval = map_lookup_elem(&channel, &key); if (!pval) return 0; trace_printk(fmt, sizeof(fmt), *pval); return 0; } char _license[] SEC("license") = "GPL"; int _version SEC("version") = LINUX_VERSION_CODE; /************************* END ***************************/ # echo "" > /sys/kernel/debug/tracing/trace # ./perf record -e './test_bpf_map_1.c/map:channel.value=11/' usleep 10 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.012 MB perf.data ] # cat /sys/kernel/debug/tracing/trace # tracer: nop # # entries-in-buffer/entries-written: 1/1 #P:8 [SNIP] # TASK-PID CPU# |||| TIMESTAMP FUNCTION # | | | |||| | | usleep-18593 [007] d... 2394714.395539: : 11 # ./perf record -e './test_bpf_map_1.c/map:channel.value=101/' usleep 10 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.012 MB perf.data ] # cat /sys/kernel/debug/tracing/trace # tracer: nop # # entries-in-buffer/entries-written: 1/1 #P:8 [SNIP] # TASK-PID CPU# |||| TIMESTAMP FUNCTION # | | | |||| | | usleep-18593 [007] d... 2394714.395539: : 11 usleep-19000 [006] d... 2394831.057840: : 101 Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Cody P Schafer <dev@codyps.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kirill Smelkov <kirr@nexedi.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1456132275-98875-6-git-send-email-wangnan0@huawei.com Signed-off-by: He Kuang <hekuang@huawei.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-22perf bpf: Add API to set values to map entries in a bpf objectWang Nan
bpf__config_obj() is introduced as a core API to config BPF object after loading. One configuration option of maps is introduced. After this patch BPF object can accept assignments like: map:my_map.value=1234 (map.my_map.value looks pretty. However, there's a small but hard to fix problem related to flex's greedy matching. Please see [1]. Choose ':' to avoid it in a simpler way.) This patch is more complex than the work it does because the consideration of extension. In designing BPF map configuration, the following things should be considered: 1. Array indices selection: perf should allow user setting different value for different slots in an array, with syntax like: map:my_map.value[0,3...6]=1234; 2. A map should be set by different config terms, each for a part of it. For example, set each slot to the pid of a thread; 3. Type of value: integer is not the only valid value type. A perf counter can also be put into a map after commit 35578d798400 ("bpf: Implement function bpf_perf_event_read() that get the selected hardware PMU counter") 4. For a hash table, it should be possible to use a string or other value as a key; 5. It is possible that map configuration is unable to be setup during parsing. A perf counter is an example. Therefore, this patch does the following: 1. Instead of updating map element during parsing, this patch stores map config options in 'struct bpf_map_priv'. Following patches will apply those configs at an appropriate time; 2. Link map operations in a list so a map can have multiple config terms attached, so different parts can be configured separately; 3. Make 'struct bpf_map_priv' extensible so that the following patches can add new types of keys and operations; 4. Use bpf_obj_config__map_funcs array to support more map config options. Since the patch changing the event parser to parse BPF object config is relative large, I've put it in another commit. Code in this patch can be tested after applying the next patch. [1] http://lkml.kernel.org/g/564ED621.4050500@huawei.com Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Cody P Schafer <dev@codyps.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kirill Smelkov <kirr@nexedi.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1456132275-98875-4-git-send-email-wangnan0@huawei.com Signed-off-by: He Kuang <hekuang@huawei.com> [ Changes "maps:my_map.value" to "map:my_map.value", improved error messages ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-02-19perf bpf: Rename bpf_prog_priv__clear() to clear_prog_priv()Wang Nan
The name of bpf_prog_priv__clear() doesn't follow perf's naming convention. bpf_prog_priv__delete() seems to be a better name. However, bpf_prog_priv__delete() should be a method of 'struct bpf_prog_priv', but its first parameter is 'struct bpf_program'. It is callback from libbpf to clear priv structures when destroying a bpf program. It is actually a method of bpf_program (libbpf object), but bpf_program__ functions should be provided by libbpf. This patch removes the prefix of that function. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Cody P Schafer <dev@codyps.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kirill Smelkov <kirr@nexedi.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1455882283-79592-4-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-11-27perf bpf: Rename bpf config to program configWang Nan
Following patches are going to introduce BPF object level configuration to enable setting values into BPF maps. To avoid confusion, this patch renames existing 'config' in bpf-loader.c to 'program config'. Following patches would introduce 'object config'. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1448614067-197576-4-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-11-18perf bpf: Use same BPF program if arguments are identicalWang Nan
This patch allows creating only one BPF program for different 'probe_trace_event'(tev) entries generated by one 'perf_probe_event'(pev) if their prologues are identical. This is done by comparing the argument list of different tev instances, and the maps type of prologue and tev using a mapping array. This patch utilizes qsort to sort the tevs. After sorting, tevs with identical argument lists will be grouped together. Test result: Sample BPF program: #define SEC(NAME) __attribute__((section(NAME), used)) SEC("inlines=no;" "func=SyS_dup? oldfd") int func(void *ctx) { return 1; } It would probe at SyS_dup2 and SyS_dup3, obtaining oldfd as its argument. The following cmdline shows a BPF program being loaded into the kernel by perf: # perf record -e ./test_bpf_arg.c sleep 4 & sleep 1 && ls /proc/$!/fd/ -l | grep bpf-prog Before this patch: # perf record -e ./test_bpf_arg.c sleep 4 & sleep 1 && ls /proc/$!/fd/ -l | grep bpf-prog [1] 24858 lrwx------ 1 root root 64 Nov 14 04:09 3 -> anon_inode:bpf-prog lrwx------ 1 root root 64 Nov 14 04:09 4 -> anon_inode:bpf-prog ... After this patch: # perf record -e ./test_bpf_arg.c sleep 4 & sleep 1 && ls /proc/$!/fd/ -l | grep bpf-prog [1] 25699 lrwx------ 1 root root 64 Nov 14 04:10 3 -> anon_inode:bpf-prog ... Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1447749170-175898-3-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-11-18perf bpf: Generate prologue for BPF programsWang Nan
This patch generates a prologue for each 'struct probe_trace_event' for fetching arguments for BPF programs. After bpf__probe(), iterate over each program to check whether prologues are required. If none of the 'struct perf_probe_event' programs will attach to have at least one argument, simply skip preprocessor hooking. For those who a prologue is required, call bpf__gen_prologue() and paste the original instruction after the prologue. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1447675815-166222-12-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-11-18perf bpf: Add prologue for BPF programs for fetching argumentsHe Kuang
This patch generates a prologue for a BPF program which fetches arguments for it. With this patch, the program can have arguments as follow: SEC("lock_page=__lock_page page->flags") int lock_page(struct pt_regs *ctx, int err, unsigned long flags) { return 1; } This patch passes at most 3 arguments from r3, r4 and r5. r1 is still the ctx pointer. r2 is used to indicate if dereferencing was done successfully. This patch uses r6 to hold ctx (struct pt_regs) and r7 to hold stack pointer for result. Result of each arguments first store on stack: low address BPF_REG_FP - 24 ARG3 BPF_REG_FP - 16 ARG2 BPF_REG_FP - 8 ARG1 BPF_REG_FP high address Then loaded into r3, r4 and r5. The output prologue for offn(...off2(off1(reg)))) should be: r6 <- r1 // save ctx into a callee saved register r7 <- fp r7 <- r7 - stack_offset // pointer to result slot /* load r3 with the offset in pt_regs of 'reg' */ (r7) <- r3 // make slot valid r3 <- r3 + off1 // prepare to read unsafe pointer r2 <- 8 r1 <- r7 // result put onto stack call probe_read // read unsafe pointer jnei r0, 0, err // error checking r3 <- (r7) // read result r3 <- r3 + off2 // prepare to read unsafe pointer r2 <- 8 r1 <- r7 call probe_read jnei r0, 0, err ... /* load r2, r3, r4 from stack */ goto success err: r2 <- 1 /* load r3, r4, r5 with 0 */ goto usercode success: r2 <- 0 usercode: r1 <- r6 // restore ctx // original user code If all of arguments reside in register (dereferencing is not required), gen_prologue_fastpath() will be used to create fast prologue: r3 <- (r1 + offset of reg1) r4 <- (r1 + offset of reg2) r5 <- (r1 + offset of reg3) r2 <- 0 P.S. eBPF calling convention is defined as: * r0 - return value from in-kernel function, and exit value for eBPF program * r1 - r5 - arguments from eBPF program to in-kernel function * r6 - r9 - callee saved registers that in-kernel function will preserve * r10 - read-only frame pointer to access stack Committer note: At least testing if it builds and loads: # cat test_probe_arg.c struct pt_regs; __attribute__((section("lock_page=__lock_page page->flags"), used)) int func(struct pt_regs *ctx, int err, unsigned long flags) { return 1; } char _license[] __attribute__((section("license"), used)) = "GPL"; int _version __attribute__((section("version"), used)) = 0x40300; # perf record -e ./test_probe_arg.c usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.016 MB perf.data ] # perf evlist perf_bpf_probe:lock_page # Signed-off-by: He Kuang <hekuang@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Wang Nan <wangnan0@huawei.com> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1447675815-166222-11-git-send-email-wangnan0@huawei.com Signed-off-by: Wang Nan <wangnan0@huawei.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-11-18perf bpf: Allow BPF program config probing optionsWang Nan
By extending the syntax of BPF object section names, this patch allows users to config probing options like what they can do in 'perf probe'. The error message in 'perf probe' is also updated. Test result: For following BPF file test_probe_glob.c: # cat test_probe_glob.c __attribute__((section("inlines=no;func=SyS_dup?"), used)) int func(void *ctx) { return 1; } char _license[] __attribute__((section("license"), used)) = "GPL"; int _version __attribute__((section("version"), used)) = 0x40300; # # ./perf record -e ./test_probe_glob.c ls / ... [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.013 MB perf.data ] # ./perf evlist perf_bpf_probe:func_1 perf_bpf_probe:func After changing "inlines=no" to "inlines=yes": # ./perf record -e ./test_probe_glob.c ls / ... [ perf record: Woken up 2 times to write data ] [ perf record: Captured and wrote 0.013 MB perf.data ] # ./perf evlist perf_bpf_probe:func_3 perf_bpf_probe:func_2 perf_bpf_probe:func_1 perf_bpf_probe:func Then test 'force': Use following program: # cat test_probe_force.c __attribute__((section("func=sys_write"), used)) int funca(void *ctx) { return 1; } __attribute__((section("force=yes;func=sys_write"), used)) int funcb(void *ctx) { return 1; } char _license[] __attribute__((section("license"), used)) = "GPL"; int _version __attribute__((section("version"), used)) = 0x40300; # # perf record -e ./test_probe_force.c usleep 1 Error: event "func" already exists. Hint: Remove existing event by 'perf probe -d' or force duplicates by 'perf probe -f' or set 'force=yes' in BPF source. event syntax error: './test_probe_force.c' \___ Probe point exist. Try 'perf probe -d "*"' and set 'force=yes' (add -v to see detail) ... Then replace 'force=no' to 'force=yes': # vim test_probe_force.c # perf record -e ./test_probe_force.c usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.017 MB perf.data ] # perf evlist perf_bpf_probe:func_1 perf_bpf_probe:func # Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1447675815-166222-7-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-11-18perf bpf: Allow attaching BPF programs to modules symbolsWang Nan
By extending the syntax of BPF object section names, this patch allows users to attach BPF programs to symbols in modules. For example: SEC("module=i915;" "parse_cmds=i915_parse_cmds") int parse_cmds(void *ctx) { return 1; } The implementation is very simple: like what 'perf probe' does, for module, fill 'uprobe' field in 'struct perf_probe_event'. Other parts will be done automatically. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kaixu Xia <xiakaixu@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1447675815-166222-5-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-11-18perf bpf: Allow BPF program attach to uprobe eventsWang Nan
This patch adds a new syntax to the BPF object section name to support probing at uprobe event. Now we can use BPF program like this: SEC( "exec=/lib64/libc.so.6;" "libcwrite=__write" ) int libcwrite(void *ctx) { return 1; } Where, in section name of a program, before the main config string, we can use 'key=value' style options. Now the only option key is "exec", for uprobes. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1447675815-166222-4-git-send-email-wangnan0@huawei.com [ Changed the separator from \n to ; ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-11-06perf test: Add 'perf test BPF'Wang Nan
This patch adds BPF testcase for testing BPF event filtering. By utilizing the result of 'perf test LLVM', this patch compiles the eBPF sample program then test its ability. The BPF script in 'perf test LLVM' lets only 50% samples generated by epoll_pwait() to be captured. This patch runs that system call for 111 times, so the result should contain 56 samples. Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1446817783-86722-8-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-11-06perf bpf: Improve BPF related error messagesWang Nan
A series of bpf loader related error codes were introduced to help error reporting. Functions were improved to return these new error codes. Functions which return pointers were adjusted to encode error codes into return value using the ERR_PTR() interface. bpf_loader_strerror() was improved to convert these error messages to strings. It checks the error codes and calls libbpf_strerror() and strerror_r() accordingly, so caller don't need to consider checking the range of the error code. In bpf__strerror_load(), print kernel version of running kernel and the object's 'version' section to notify user how to fix his/her program. v1 -> v2: Use macro for error code. Fetch error message based on array index, eliminate for-loop. Print version strings. Before: # perf record -e ./test_kversion_nomatch_program.o sleep 1 event syntax error: './test_kversion_nomatch_program.o' \___ Failed to load program: Validate your program and check 'license'/'version' sections in your object SKIP After: # perf record -e ./test_kversion_nomatch_program.o ls event syntax error: './test_kversion_nomatch_program.o' \___ 'version' (4.4.0) doesn't match running kernel (4.3.0) SKIP Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1446818289-87444-1-git-send-email-wangnan0@huawei.com [ Add 'static inline' to bpf__strerror_prepare_load() when LIBBPF is disabled ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>