  1. May 28, 2019
    • tools include UAPI: Update copy of files related to new fspick, fsmount,... · fba29f18
      Arnaldo Carvalho de Melo authored
      tools include UAPI: Update copy of files related to new fspick, fsmount, fsconfig, fsopen, move_mount and open_tree syscalls
      
      Copy the headers changed by these csets:
      
        d8076bdb ("uapi: Wire up the mount API syscalls on non-x86 arches [ver #2]")
        9c8ad7a2 ("uapi, x86: Fix the syscall numbering of the mount API syscalls [ver #2]")
        cf3cba4a ("vfs: syscall: Add fspick() to select a superblock for reconfiguration")
        93766fbd ("vfs: syscall: Add fsmount() to create a mount for a superblock")
        ecdab150 ("vfs: syscall: Add fsconfig() for configuring and managing a context")
        24dcb3d9 ("vfs: syscall: Add fsopen() to prepare for superblock creation")
        2db154b3 ("vfs: syscall: Add move_mount(2) to move mounts around")
        a07b2000 ("vfs: syscall: Add open_tree(2) to reference or clone a mount")
      
      We need to create tables for all the flags arguments in the new
      syscalls, in followup patches.
      
      This silences these perf build warnings:
      
        Warning: Kernel ABI header at 'tools/include/uapi/linux/mount.h' differs from latest version at 'include/uapi/linux/mount.h'
        diff -u tools/include/uapi/linux/mount.h include/uapi/linux/mount.h
        Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
        diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
        Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h'
        diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-knpqr1u2ffvz6641056z2mwu@git.kernel.org
      
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf arm64: Fix mksyscalltbl when system kernel headers are ahead of the kernel · f95d050c
      Vitaly Chikunov authored
      
      When a host system has kernel headers that are newer than the kernel
      being compiled, mksyscalltbl fails with errors such as:
      
        <stdin>: In function 'main':
        <stdin>:271:44: error: '__NR_kexec_file_load' undeclared (first use in this function)
        <stdin>:271:44: note: each undeclared identifier is reported only once for each function it appears in
        <stdin>:272:46: error: '__NR_pidfd_send_signal' undeclared (first use in this function)
        <stdin>:273:43: error: '__NR_io_uring_setup' undeclared (first use in this function)
        <stdin>:274:43: error: '__NR_io_uring_enter' undeclared (first use in this function)
        <stdin>:275:46: error: '__NR_io_uring_register' undeclared (first use in this function)
        tools/perf/arch/arm64/entry/syscalls//mksyscalltbl: line 48: /tmp/create-table-xvUQdD: Permission denied
      
      mksyscalltbl is compiled with the default host includes, but is run
      with the includes from the kernel tree being compiled, causing some
      syscall numbers to be undeclared.
      
      Committer testing:
      
      Before this patch, my cross-build environment showed no build problems,
      but these new syscalls were missing from the syscalls.c generated from
      the unistd.h file, which is a bug. This patch fixes it:
      
      perfbuilder@6e20056ed532:/git/perf$ tail /tmp/build/perf/arch/arm64/include/generated/asm/syscalls.c
      	[292] = "io_pgetevents",
      	[293] = "rseq",
      	[294] = "kexec_file_load",
      	[424] = "pidfd_send_signal",
      	[425] = "io_uring_setup",
      	[426] = "io_uring_enter",
      	[427] = "io_uring_register",
      	[428] = "syscalls",
      };
      perfbuilder@6e20056ed532:/git/perf$ strings /tmp/build/perf/perf | egrep '^(io_uring_|pidfd_|kexec_file)'
      kexec_file_load
      pidfd_send_signal
      io_uring_setup
      io_uring_enter
      io_uring_register
      perfbuilder@6e20056ed532:/git/perf$
      $
      
      Well, there is that last "syscalls" thing, but that looks like some
      other bug.
      
      Signed-off-by: Vitaly Chikunov <vt@altlinux.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: Michael Petlan <mpetlan@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Hendrik Brueckner <brueckner@linux.ibm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kim Phillips <kim.phillips@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/r/20190521030203.1447-1-vt@altlinux.org
      
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf data: Fix 'strncat may truncate' build failure with recent gcc · 97acec7d
      Shawn Landden authored
      
      This strncat() is safe because the buffer was allocated with zalloc(),
      however gcc doesn't know that. Since the string always has 4 non-null
      bytes, just use memcpy() here.
      
          CC       /home/shawn/linux/tools/perf/util/data-convert-bt.o
        In file included from /usr/include/string.h:494,
                         from /home/shawn/linux/tools/lib/traceevent/event-parse.h:27,
                         from util/data-convert-bt.c:22:
        In function ‘strncat’,
            inlined from ‘string_set_value’ at util/data-convert-bt.c:274:4:
        /usr/include/powerpc64le-linux-gnu/bits/string_fortified.h:136:10: error: ‘__builtin_strncat’ output may be truncated copying 4 bytes from a string of length 4 [-Werror=stringop-truncation]
          136 |   return __builtin___strncat_chk (__dest, __src, __len, __bos (__dest));
              |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      
      Signed-off-by: Shawn Landden <shawn@git.icu>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      LPU-Reference: 20190518183238.10954-1-shawn@git.icu
      Link: https://lkml.kernel.org/n/tip-289f1jice17ta7tr3tstm9jm@git.kernel.org
      
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  2. May 19, 2019
  3. May 17, 2019
  4. May 16, 2019
    • selftests: pmtu.sh: Remove quotes around commands in setup_xfrm · 9a6c8bf9
      David Ahern authored
      
      The first command in setup_xfrm is failing, resulting in the test
      being skipped:
      
      + ip netns exec ns-B ip -6 xfrm state add src fd00:1::a dst fd00:1::b spi 0x1000 proto esp aead 'rfc4106(gcm(aes))' 0x0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f 128 mode tunnel
      + out=RTNETLINK answers: Function not implemented
      ...
        xfrm6 not supported
      TEST: vti6: PMTU exceptions                                         [SKIP]
        xfrm4 not supported
      TEST: vti4: PMTU exceptions                                         [SKIP]
      ...
      
      The setup command started failing when the run_cmd option was added.
      Removing the quotes fixes the problem:
      ...
      TEST: vti6: PMTU exceptions                                         [ OK ]
      TEST: vti4: PMTU exceptions                                         [ OK ]
      ...
      
      Fixes: 56490b62 ("selftests: Add debugging options to pmtu.sh")
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • libbpf: move logging helpers into libbpf_internal.h · d72386fe
      Andrii Nakryiko authored
      
      libbpf_util.h header was recently exposed as public as a dependency of
      xsk.h. In addition to memory barriers, it contained logging helpers,
      which are not supposed to be exposed. This patch moves those into
      libbpf_internal.h, which is kept as an internal header.
      
      Cc: Stanislav Fomichev <sdf@google.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Fixes: 7080da89 ("libbpf: add libbpf_util.h to header install.")
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • tools/bpftool: move set_max_rlimit() before __bpf_object__open_xattr() · ac4e0e05
      Yonghong Song authored
      
      For a host which has a lower rlimit for max locked memory (e.g., 64KB),
      the following error occurs in one of our production systems:
        # /usr/sbin/bpftool prog load /paragon/pods/52877437/home/mark.o \
          /sys/fs/bpf/paragon_mark_21 type cgroup/skb \
          map idx 0 pinned /sys/fs/bpf/paragon_map_21
        libbpf: Error in bpf_object__probe_name():Operation not permitted(1).
          Couldn't load basic 'r0 = 0' BPF program.
        Error: failed to open object file
      
      The cause is the low locked-memory limit in effect when
      bpf_object__probe_name(), called from __bpf_object__open_xattr(),
      probes whether the kernel supports program names.
      
      bpftool program load already tries to relax the mlock rlimit before
      bpf_object__load(). Move set_max_rlimit() before
      __bpf_object__open_xattr(), which fixes the issue here.
      
      Fixes: 47eff617 ("bpf, libbpf: introduce bpf_object__probe_caps to test BPF capabilities")
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • selftests/bpf: add test_sysctl and map_tests/tests.h to .gitignore · bca844a8
      Stanislav Fomichev authored
      
      Missing files are:
      * tools/testing/selftests/bpf/map_tests/tests.h - autogenerated
      * tools/testing/selftests/bpf/test_sysctl - binary
      
      Fixes: 51a0e301 ("bpf: Add BPF_MAP_TYPE_SK_STORAGE test to test_maps")
      Fixes: 1f5fa9ab ("selftests/bpf: Test BPF_CGROUP_SYSCTL")
      Signed-off-by: Stanislav Fomichev <sdf@google.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • perf stat: Support 'percore' event qualifier · 4fc4d8df
      Jin Yao authored
      
      With this patch, we can use the 'percore' event qualifier in perf-stat.
      
        root@skl:/tmp# perf stat -e cpu/event=0,umask=0x3,percore=1/,cpu/event=0,umask=0x3/ -a -A -I1000
          1.000773050 S0-C0   98,352,832 cpu/event=0,umask=0x3,percore=1/  (50.01%)
          1.000773050 S0-C1  103,763,057 cpu/event=0,umask=0x3,percore=1/  (50.02%)
          1.000773050 S0-C2  196,776,995 cpu/event=0,umask=0x3,percore=1/  (50.02%)
          1.000773050 S0-C3  176,493,779 cpu/event=0,umask=0x3,percore=1/  (50.02%)
          1.000773050 CPU0    47,699,641 cpu/event=0,umask=0x3/            (50.02%)
          1.000773050 CPU1    49,052,451 cpu/event=0,umask=0x3/            (49.98%)
          1.000773050 CPU2   102,771,422 cpu/event=0,umask=0x3/            (49.98%)
          1.000773050 CPU3   100,784,662 cpu/event=0,umask=0x3/            (49.98%)
          1.000773050 CPU4    43,171,342 cpu/event=0,umask=0x3/            (49.98%)
          1.000773050 CPU5    54,152,158 cpu/event=0,umask=0x3/            (49.98%)
          1.000773050 CPU6    93,618,410 cpu/event=0,umask=0x3/            (49.98%)
          1.000773050 CPU7    74,477,589 cpu/event=0,umask=0x3/            (49.99%)
      
      In this example, we count the event 'ref-cycles' per-core and per-CPU in
      one perf stat command-line. From the output, we can see:
      
        S0-C0 = CPU0 + CPU4
        S0-C1 = CPU1 + CPU5
        S0-C2 = CPU2 + CPU6
        S0-C3 = CPU3 + CPU7
      
      So the result is as expected (ignoring tiny differences).
      
      Note that the 'percore' event qualifier must be used with the '-A' option.
      
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1555077590-27664-4-git-send-email-yao.jin@linux.intel.com
      
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf stat: Factor out aggregate counts printing · 40480a81
      Jin Yao authored
      
      Move the aggregate counts printing to a new function
      print_counter_aggrdata, which will be used in following patches.
      
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1555077590-27664-3-git-send-email-yao.jin@linux.intel.com
      
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Add a 'percore' event qualifier · 064b4e82
      Jin Yao authored
      
      Add a 'percore' event qualifier, like cpu/event=0,umask=0x3,percore=1/,
      that sums up the event counts for both hardware threads in a core.
      
      We can already do this with --per-core, but it's often useful to do
      this together with other metrics that are collected per hardware thread.
      So we need to support this per-core counting on an event level.
      
      This can be implemented entirely in the user tool; no kernel support is needed.
      
       v4:
       ---
       1. Add Arnaldo's patch which updates the documentation for
          this new qualifier.
       2. Rebase to latest perf/core branch
      
       v3:
       ---
       Simplify the code according to Jiri's comments.
       Before:
         "return term->val.percore ? true : false;"
       Now:
         "return term->val.percore;"
      
       v2:
       ---
       Change the qualifier name from 'coresum' to 'percore' according to
       comments from Jiri and Andi.
      
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1555077590-27664-2-git-send-email-yao.jin@linux.intel.com
      
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf docs: Add description for stderr · 6cf62656
      Thomas Richter authored
      
      'perf report' displays recorded data on the screen and emits warnings
      and debug messages in the status line (last one on screen).
      
      perf can also write all debug messages to stderr instead of the
      status line.
      
      This is achieved with the following command:
      
        # ./perf --debug stderr=1 report -vvvvv -i ~/fast.data 2>/tmp/2
        # ll /tmp/2
        -rw-rw-r-- 1 tmricht tmricht 5420835 May  7 13:46 /tmp/2
        #
      
      The stderr=1 debug variable is not documented, so add it to the perf
      man page.
      
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Link: http://lkml.kernel.org/r/20190513080220.91966-1-tmricht@linux.ibm.com
      
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf intel-pt: Fix sample timestamp wrt non-taken branches · 1b6599a9
      Adrian Hunter authored
      
      The sample timestamp is updated to ensure that the timestamp represents
      the time of the sample and not a branch that the decoder is still
      walking towards. The sample timestamp is updated when the decoder
      returns, but the decoder does not return for non-taken branches. Update
      the sample timestamp in that case as well.
      
      Note that commit 3f04d98e ("perf intel-pt: Improve sample
      timestamp") was also a stable fix and appears, for example, in v4.4
      stable tree as commit a4ebb58fd124 ("perf intel-pt: Improve sample
      timestamp").
      
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: stable@vger.kernel.org # v4.4+
      Fixes: 3f04d98e ("perf intel-pt: Improve sample timestamp")
      Link: http://lkml.kernel.org/r/20190510124143.27054-4-adrian.hunter@intel.com
      
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf intel-pt: Fix improved sample timestamp · 61b6e08d
      Adrian Hunter authored
      
      The decoder uses its current timestamp in samples. Usually that is a
      timestamp that has already passed, but in some cases it is a timestamp
      for a branch that the decoder is walking towards, and consequently
      hasn't reached.
      
      The intel_pt_sample_time() function decides which is which, but did
      not handle TNT packets quite correctly.
      
      In the case of TNT, the timestamp applies to the first branch, so the
      decoder must first walk to that branch.
      
      That means intel_pt_sample_time() should return true for TNT, and this
      patch makes that change. However, if the first branch is a non-taken
      branch (i.e. a 'N'), then intel_pt_sample_time() needs to return false
      for subsequent taken branches in the same TNT packet.
      
      To handle that, introduce a new state INTEL_PT_STATE_TNT_CONT to
      distinguish the cases.
      
      Note that commit 3f04d98e ("perf intel-pt: Improve sample
      timestamp") was also a stable fix and appears, for example, in v4.4
      stable tree as commit a4ebb58fd124 ("perf intel-pt: Improve sample
      timestamp").
      
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: stable@vger.kernel.org # v4.4+
      Fixes: 3f04d98e ("perf intel-pt: Improve sample timestamp")
      Link: http://lkml.kernel.org/r/20190510124143.27054-3-adrian.hunter@intel.com
      
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf intel-pt: Fix instructions sampling rate · 7ba8fa20
      Adrian Hunter authored
      
      The timestamp used to determine whether an instruction sample is made
      is an estimate based on the number of instructions since the last known
      timestamp. A consequence is that it might go backwards, which results in
      extra samples. Change it so that a sample is only made when the
      timestamp goes forwards.
      
      Note this does not affect a sampling period of 0 or sampling periods
      specified as a count of instructions.
      
      Example:
      
       Before:
      
       $ perf script --itrace=i10us
       ls 13812 [003] 2167315.222583:       3270 instructions:u:      7fac71e2e494 __GI___tunables_init+0xf4 (/lib/x86_64-linux-gnu/ld-2.28.so)
       ls 13812 [003] 2167315.222667:      30902 instructions:u:      7fac71e2da0f _dl_cache_libcmp+0x2f (/lib/x86_64-linux-gnu/ld-2.28.so)
       ls 13812 [003] 2167315.222667:         10 instructions:u:      7fac71e2d9ff _dl_cache_libcmp+0x1f (/lib/x86_64-linux-gnu/ld-2.28.so)
       ls 13812 [003] 2167315.222667:          8 instructions:u:      7fac71e2d9ea _dl_cache_libcmp+0xa (/lib/x86_64-linux-gnu/ld-2.28.so)
       ls 13812 [003] 2167315.222667:         14 instructions:u:      7fac71e2d9ea _dl_cache_libcmp+0xa (/lib/x86_64-linux-gnu/ld-2.28.so)
       ls 13812 [003] 2167315.222667:          6 instructions:u:      7fac71e2d9ff _dl_cache_libcmp+0x1f (/lib/x86_64-linux-gnu/ld-2.28.so)
       ls 13812 [003] 2167315.222667:         14 instructions:u:      7fac71e2d9ff _dl_cache_libcmp+0x1f (/lib/x86_64-linux-gnu/ld-2.28.so)
       ls 13812 [003] 2167315.222667:          4 instructions:u:      7fac71e2dab2 _dl_cache_libcmp+0xd2 (/lib/x86_64-linux-gnu/ld-2.28.so)
       ls 13812 [003] 2167315.222728:      16423 instructions:u:      7fac71e2477a _dl_map_object_deps+0x1ba (/lib/x86_64-linux-gnu/ld-2.28.so)
       ls 13812 [003] 2167315.222734:      12731 instructions:u:      7fac71e27938 _dl_name_match_p+0x68 (/lib/x86_64-linux-gnu/ld-2.28.so)
       ...
      
       After:
       $ perf script --itrace=i10us
       ls 13812 [003] 2167315.222583:       3270 instructions:u:      7fac71e2e494 __GI___tunables_init+0xf4 (/lib/x86_64-linux-gnu/ld-2.28.so)
       ls 13812 [003] 2167315.222667:      30902 instructions:u:      7fac71e2da0f _dl_cache_libcmp+0x2f (/lib/x86_64-linux-gnu/ld-2.28.so)
       ls 13812 [003] 2167315.222728:      16479 instructions:u:      7fac71e2477a _dl_map_object_deps+0x1ba (/lib/x86_64-linux-gnu/ld-2.28.so)
       ...
      
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: stable@vger.kernel.org
      Fixes: f4aa0819 ("perf tools: Add Intel PT decoder")
      Link: http://lkml.kernel.org/r/20190510124143.27054-2-adrian.hunter@intel.com
      
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf regs x86: Add X86 specific arch__intr_reg_mask() · 6466ec14
      Kan Liang authored
      
      XMM registers can be collected on Icelake and later platforms.
      
      Add an x86-specific arch__intr_reg_mask(), which creates an event to
      check whether the kernel and hardware can collect XMM registers.
      
      Tested on Skylake, which doesn't support XMM register collection.
      Nothing changes.
      
         #perf record -I?
         available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9
         R10 R11 R12 R13 R14 R15
      
         Usage: perf record [<options>] [<command>]
          or: perf record [<options>] -- <command> [<options>]
      
          -I, --intr-regs[=<any register>]
                                sample selected machine registers on
         interrupt, use '-I?' to list register names
      
         #perf record -I
         [ perf record: Woken up 1 times to write data ]
         [ perf record: Captured and wrote 0.905 MB perf.data (2520 samples) ]
      
         #perf evlist -v
         cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type:
         IP|TID|TIME|CPU|PERIOD|REGS_INTR, read_format: ID, disabled: 1,
         inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3,
         sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol:
         1, bpf_event: 1, sample_regs_intr: 0xff0fff
      
      Tested on Icelake, which supports XMM register collection.
      
         #perf record -I?
         available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10
         R11 R12 R13 R14 R15 XMM0 XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7 XMM8 XMM9
         XMM10 XMM11 XMM12 XMM13 XMM14 XMM15
      
         Usage: perf record [<options>] [<command>]
          or: perf record [<options>] -- <command> [<options>]
      
          -I, --intr-regs[=<any register>]
                                sample selected machine registers on
         interrupt, use '-I?' to list register names
      
         #perf record -I
         [ perf record: Woken up 1 times to write data ]
         [ perf record: Captured and wrote 0.800 MB perf.data (318 samples) ]
      
         #perf evlist -v
         cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type:
         IP|TID|TIME|CPU|PERIOD|REGS_INTR, read_format: ID, disabled: 1,
         inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3,
         sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol:
         1, bpf_event: 1, sample_regs_intr: 0xffffffff00ff0fff
      
      Committer notes:
      
      Don't set attr.sample_period as a named struct init, as it is part of an
      unnamed union in 'struct perf_event_attr', and doing so breaks the build
      on older gcc versions, such as:
      
        gcc version 4.1.2 20080704 (Red Hat 4.1.2-55)
        gcc version 4.4.7 20120313 (Red Hat 4.4.7-23) (GCC)
      
        arch/x86/util/perf_regs.c: In function 'arch__intr_reg_mask':
        arch/x86/util/perf_regs.c:279: error: unknown field 'sample_period' specified in initializer
        cc1: warnings being treated as errors
        arch/x86/util/perf_regs.c:279: warning: missing braces around initializer
        arch/x86/util/perf_regs.c:279: warning: (near initialization for 'attr.<anonymous>')
      
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      [ Only on a lenovo t480s, a skylake machine, where the XMM registers didn't show up in -I?/--user-regs=? as expected ]
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/1557865174-56264-3-git-send-email-kan.liang@linux.intel.com
      
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf parse-regs: Add generic support for arch__intr/user_reg_mask() · af785e75
      Kan Liang authored
      
      There may be different register masks for use with intr or user on
      some platforms, e.g. Icelake.
      
      Add weak functions arch__intr_reg_mask() and arch__user_reg_mask() to
      return intr and user register mask respectively.
      
      Check mask before printing or comparing the register name.
      
      Generic code always returns PERF_REGS_MASK. No functional change.
      
      Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/1557865174-56264-2-git-send-email-kan.liang@linux.intel.com
      
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  5. May 15, 2019