  1. Feb 28, 2019
  2. Dec 18, 2018
    • printk: Add caller information to printk() output. · 15ff2069
      Tetsuo Handa authored
      Sometimes we want to print a series of printk() messages to consoles
      without being disturbed by concurrent printk() from interrupts and/or
      other threads. But we can't force printk() callers to use their local
      buffers because that would require them to make too many changes. Also, even
      buffering up to one line inside printk() might cause a failure to emit
      an important clue in a critical situation.
      
      Therefore, instead of trying to help with buffering, let's try to help
      with reconstructing messages by saving caller information at the time
      log_store() is called and adding it as "[T$thread_id]" or
      "[C$processor_id]" when printing to consoles.
      
      Some examples for console output:
      
        [    1.222773][    T1] x86: Booting SMP configuration:
        [    2.779635][    T1] pci 0000:00:01.0: PCI bridge to [bus 01]
        [    5.069193][  T268] Fusion MPT base driver 3.04.20
        [    9.316504][    C2] random: fast init done
        [   13.413336][ T3355] Initialized host personality
      
      Some examples for /dev/kmsg output:
      
        6,496,1222773,-,caller=T1;x86: Booting SMP configuration:
        6,968,2779635,-,caller=T1;pci 0000:00:01.0: PCI bridge to [bus 01]
         SUBSYSTEM=pci
         DEVICE=+pci:0000:00:01.0
        6,1353,5069193,-,caller=T268;Fusion MPT base driver 3.04.20
        5,1526,9316504,-,caller=C2;random: fast init done
        6,1575,13413336,-,caller=T3355;Initialized host personality
      
      Note that this patch changes the maximum length of messages which can be
      printed by printk() or written to the /dev/kmsg interface from 992 bytes
      to 976 bytes, based on the assumption that userspace won't try to write
      messages hitting that limit to the /dev/kmsg interface.
      
      Link: http://lkml.kernel.org/r/93f19e57-5051-c67d-9af4-b17624062d44@i-love.sakura.ne.jp
      
      
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: LKML <linux-kernel@vger.kernel.org>
      Cc: syzkaller <syzkaller@googlegroups.com>
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: Petr Mladek <pmladek@suse.com>
      15ff2069
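The /dev/kmsg examples in 15ff2069 show the caller as a `caller=T<id>` or `caller=C<id>` field in the record header, before the `;` that starts the message text. A minimal userspace sketch of pulling that field out of one record follows. The function name `parse_kmsg_caller` is hypothetical and not part of the patch; the sketch assumes the record layout shown in the examples above.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): extract the caller field
 * from one /dev/kmsg record such as
 *   "6,1353,5069193,-,caller=T268;Fusion MPT base driver 3.04.20".
 * On success returns 0 and stores 'T' (thread) or 'C' (processor) in *kind
 * and the numeric id in *id. Returns -1 if the header has no such field.
 */
int parse_kmsg_caller(const char *record, char *kind, unsigned int *id)
{
	const char *end = strchr(record, ';');      /* header ends at the first ';' */
	const char *p = strstr(record, "caller=");
	char k;
	unsigned int n;

	/* Ignore a "caller=" that only appears in the message text. */
	if (!end || !p || p > end)
		return -1;
	if (sscanf(p, "caller=%c%u", &k, &n) != 2)
		return -1;
	if (k != 'T' && k != 'C')
		return -1;

	*kind = k;
	*id = n;
	return 0;
}
```

A log post-processor could group records by the returned (kind, id) pair to reassemble a multi-line message that was interleaved with output from other contexts.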
  3. Oct 19, 2018
    • locking/lockdep: Make global debug_locks* variables read-mostly · 01a14bda
      Waiman Long authored
      
      Make the frequently used lockdep global variable debug_locks read-mostly.
      As debug_locks_silent is sometimes used together with debug_locks,
      it is also made read-mostly so that they can be close together.
      
      With false cacheline sharing, a cacheline contention problem can happen
      depending on what gets put into the same cacheline as debug_locks.
      
      Signed-off-by: Waiman Long <longman@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Link: http://lkml.kernel.org/r/1539913518-15598-2-git-send-email-longman@redhat.com
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      01a14bda
    • locking/lockdep: Fix debug_locks off performance problem · 9506a742
      Waiman Long authored
      
      It was found that when debug_locks was turned off because of a problem
      found by the lockdep code, the system performance could drop quite
      significantly when the lock_stat code was also configured into the
      kernel. For instance, parallel kernel build time on a 4-socket x86-64
      server nearly doubled.
      
      Further analysis into the cause of the slowdown traced back to the
      frequent call to debug_locks_off() from the __lock_acquired() function
      probably due to some inconsistent lockdep states with debug_locks
      off. The debug_locks_off() function did an unconditional atomic xchg
      to write a 0 value into debug_locks which had already been set to 0.
      This led to severe cacheline contention in the cacheline that held
      debug_locks.  As debug_locks is referenced in quite a few different
      places in the kernel, this greatly slowed down system performance.
      
      To prevent that thrashing of the debug_locks cacheline, lock_acquired()
      and lock_contended() now check the state of debug_locks before
      proceeding. The debug_locks_off() function is also modified to check
      debug_locks before calling __debug_locks_off().
      
      Signed-off-by: Waiman Long <longman@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Link: http://lkml.kernel.org/r/1539913518-15598-1-git-send-email-longman@redhat.com
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      9506a742
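The check-before-write idea described in 9506a742 can be illustrated in userspace with C11 atomics. This is only a sketch under assumptions: `debug_locks_sketch` and both function names are stand-ins invented here, and the real kernel code uses its own xchg primitive rather than `<stdatomic.h>`.

```c
#include <stdatomic.h>

/* Userspace stand-in for the kernel's debug_locks flag (1 = lockdep active). */
static atomic_int debug_locks_sketch = 1;

/*
 * Behaviour described as the problem: an unconditional exchange writes on
 * every call, so the cacheline is pulled exclusive even when the flag is
 * already 0.
 */
int debug_locks_off_unconditional(void)
{
	return atomic_exchange(&debug_locks_sketch, 0);
}

/*
 * Behaviour described as the fix: read first and only perform the atomic
 * write if the flag is still set, so repeated calls stay read-only.
 */
int debug_locks_off_checked(void)
{
	if (atomic_load(&debug_locks_sketch) == 0)
		return 0;
	return atomic_exchange(&debug_locks_sketch, 0);
}
```

Both variants return the previous value of the flag; the checked one simply avoids the store once the flag has been cleared, which is what keeps readers on other CPUs from losing the shared cacheline.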
  4. Oct 15, 2018
  5. Oct 12, 2018
  6. Oct 09, 2018
  7. Oct 08, 2018
    • netlink: Add strict version of nlmsg_parse and nla_parse · a5f6cba2
      David Ahern authored
      
      nla_parse is currently lenient on message parsing, allowing type to be 0
      or greater than max expected and only logging a message
      
          "netlink: %d bytes leftover after parsing attributes in process `%s'."
      
      if the netlink message has unknown data at the end after parsing. What this
      could mean is that the header at the front of the attributes is actually
      wrong and the parsing is shifted from what is expected.
      
      Add a new strict version that actually fails with EINVAL if there are any
      bytes remaining after the parsing loop completes, or if the attribute type
      is 0 or greater than max expected.
      
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Acked-by: Christian Brauner <christian@brauner.io>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a5f6cba2
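The difference between lenient and strict parsing described in a5f6cba2 can be sketched with a simplified attribute walker. Everything here is an assumption made for illustration: `struct attr_hdr`, `parse_attrs_sketch`, and the `strict` flag are not the kernel's API, and the header layout is only modelled on a length/type pair with 4-byte alignment.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified attribute header for this sketch: len covers header + payload. */
struct attr_hdr {
	uint16_t len;
	uint16_t type;
};

/*
 * Walk the attributes in buf[0..len). In lenient mode, unknown types and
 * trailing bytes are tolerated. In strict mode, a type of 0, a type above
 * maxtype, or any leftover bytes make the parse fail with -EINVAL.
 */
int parse_attrs_sketch(const uint8_t *buf, size_t len, uint16_t maxtype, int strict)
{
	while (len >= sizeof(struct attr_hdr)) {
		struct attr_hdr h;
		size_t step;

		memcpy(&h, buf, sizeof(h));
		if (h.len < sizeof(h) || h.len > len)
			break;                          /* malformed: stop walking */
		if (strict && (h.type == 0 || h.type > maxtype))
			return -EINVAL;

		/* Advance to the next 4-byte aligned attribute. */
		step = ((size_t)h.len + 3) & ~(size_t)3;
		if (step > len)
			step = len;
		buf += step;
		len -= step;
	}

	/* Leftover bytes suggest the header in front of the attributes was wrong. */
	if (len > 0 && strict)
		return -EINVAL;
	return 0;
}
```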
  8. Oct 05, 2018
    • vsprintf: Fix off-by-one bug in bstr_printf() processing dereferenced pointers · 62165600
      Steven Rostedt (VMware) authored
      
      The functions vbin_printf() and bstr_printf() are used by trace_printk() to
      try to keep the overhead down during printing. trace_printk() uses
      vbin_printf() at the time of execution, as it only scans the fmt string to
      record the printf values into the buffer, and then uses bstr_printf() to do
      the conversions to print the string based on the format and the saved
      values in the buffer.
      
      This is an issue for dereferenced pointers, as before commit 841a915d,
      the processing of the pointer could happen some time after the pointer value
      was recorded (reading the trace buffer). This means the processing of the
      value at a later time could show different results, or even crash the
      system, if the pointer no longer existed.
      
      Commit 841a915d addressed this by processing dereferenced pointers at
      the time of execution and saving the result in the ring buffer as a string.
      The bstr_printf() would then treat these pointers as normal strings, and
      print the value. But there was an off-by-one bug here, where after
      processing the argument, it moved the pointer only "strlen(arg)" bytes, which made
      the arg pointer not point to the next argument in the ring buffer, but
      instead point to the nul character of the last argument. This causes any
      values after a dereferenced pointer to be corrupted.
      
      Cc: stable@vger.kernel.org
      Fixes: 841a915d ("vsprintf: Do not have bprintf dereference pointers")
      Reported-by: Nikolay Borisov <nborisov@suse.com>
      Tested-by: Nikolay Borisov <nborisov@suse.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      62165600
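The off-by-one described in 62165600 is easy to reproduce with strings packed back to back in a buffer. The following is a minimal sketch, not the kernel code; both function names are invented here for illustration.

```c
#include <stddef.h>
#include <string.h>

/*
 * Buggy advance: skips only the characters of the current string, so the
 * returned offset lands on its terminating nul instead of the next argument.
 */
size_t next_arg_offset_buggy(const char *buf, size_t off)
{
	return off + strlen(buf + off);
}

/* Fixed advance: also skips the terminating nul. */
size_t next_arg_offset(const char *buf, size_t off)
{
	return off + strlen(buf + off) + 1;
}
```

With the buggy version, whatever is read next starts one byte early, which matches the symptom in the commit message: every value after a dereferenced pointer is corrupted.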
  9. Oct 04, 2018
    • ARM: 8800/1: use choice for kernel unwinders · f9b58e8c
      Stefan Agner authored
      
      While in theory multiple unwinders could be compiled in, it does
      not make sense in practice. Use a choice to make the unwinder
      selection mutually exclusive and mandatory.
      
      Even before this commit, it was not possible to deselect
      FRAME_POINTER. Remove the obsolete comment.
      
      Furthermore, to produce a meaningful backtrace with FRAME_POINTER
      enabled the kernel needs a specific function prologue:
          mov    ip, sp
          stmfd    sp!, {fp, ip, lr, pc}
          sub    fp, ip, #4
      
      To get the required prologue, gcc uses apcs and no-sched-prolog.
      These compiler options are not available on clang, and clang is not
      able to generate the required prologue. Make the FRAME_POINTER
      config symbol depend on !clang.
      
      Suggested-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Stefan Agner <stefan@agner.ch>
      Reviewed-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
      f9b58e8c
  10. Oct 02, 2018
    • netlink: add validation function to policy · 33188bd6
      Johannes Berg authored
      
      Add the ability to have an arbitrary validation function attached
      to a netlink policy that doesn't already use the validation_data
      pointer in another way.
      
      This can be useful to validate for example the content of a binary
      attribute, like in nl80211 the "(information) elements", which must
      be valid streams of "u8 type, u8 length, u8 value[length]".
      
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      33188bd6
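33188bd6 mentions binary attributes that must be valid streams of "u8 type, u8 length, u8 value[length]". A validation function for such a stream might look roughly like the sketch below. The name `ie_stream_is_valid` and its signature are assumptions made here; the callback signature the patch actually adds to the policy is not reproduced.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Returns 1 if data[0..len) is a well-formed sequence of
 * "u8 type, u8 length, u8 value[length]" elements, otherwise 0.
 * An empty stream is treated as valid in this sketch.
 */
int ie_stream_is_valid(const uint8_t *data, size_t len)
{
	while (len > 0) {
		size_t elen;

		if (len < 2)
			return 0;               /* truncated type/length header */
		elen = data[1];
		if (elen > len - 2)
			return 0;               /* value runs past the end */

		data += 2 + elen;
		len -= 2 + elen;
	}
	return 1;
}
```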
    • netlink: add attribute range validation to policy · 3e48be05
      Johannes Berg authored
      
      Without further bloating the policy structs, we can overload
      the `validation_data' pointer with a struct of s16 min, max
      and use those to validate ranges in NLA_{U,S}{8,16,32,64}
      attributes.
      
      It may sound strange to validate NLA_U32 with a s16 max, but
      in many cases NLA_U32 is used for enums etc. since there's no
      size benefit in using a smaller attribute width anyway, due
      to netlink attribute alignment; in cases like that it's still
      useful, particularly when the attribute really transports an
      enum value.
      
      Doing so lets us remove quite a bit of validation code, if we
      can be sure that these attributes aren't used by userspace in
      places where they're ignored today.
      
      To achieve all this, split the 'type' field and introduce a
      new 'validation_type' field which indicates what further
      validation (beyond the validation prescribed by the type of
      the attribute) is done. This currently allows for no further
      validation (the default), as well as min, max and range checks.
      
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3e48be05
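The split into a type and a separate validation type with s16 min and max, as described in 3e48be05, can be modelled in a few lines. This is a sketch under assumptions: the enum, the struct, and the function are named here for illustration only, and returning -ERANGE on failure is a choice made for the sketch rather than something stated in the commit message.

```c
#include <errno.h>
#include <stdint.h>

/* Which further check applies beyond the attribute's basic type. */
enum validation_type_sketch {
	VAL_NONE,       /* the default: no further validation */
	VAL_MIN,
	VAL_MAX,
	VAL_RANGE,
};

/* Hypothetical policy entry carrying the s16 bounds. */
struct range_policy_sketch {
	enum validation_type_sketch validation_type;
	int16_t min;
	int16_t max;
};

/* Check an already-decoded integer attribute value against the policy. */
int validate_range_sketch(const struct range_policy_sketch *pt, int64_t value)
{
	switch (pt->validation_type) {
	case VAL_NONE:
		return 0;
	case VAL_MIN:
		return value >= pt->min ? 0 : -ERANGE;
	case VAL_MAX:
		return value <= pt->max ? 0 : -ERANGE;
	case VAL_RANGE:
		return (value >= pt->min && value <= pt->max) ? 0 : -ERANGE;
	}
	return -EINVAL;
}
```

The s16 bounds are narrow on purpose: as the commit message notes, many wide attributes only carry small enum values, so a 16-bit limit is enough to reject out-of-range input.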
  11. Oct 01, 2018
    • lib/xz: Put CRC32_POLY_LE in xz_private.h · 242cdad8
      Joel Stanley authored
      
      This fixes a regression introduced by faa16bc4 ("lib: Use
      existing define with polynomial").
      
      The cleanup added a dependency on include/linux, which broke the PowerPC
      boot wrapper/decompresser when KERNEL_XZ is enabled:
      
        BOOTCC  arch/powerpc/boot/decompress.o
       In file included from arch/powerpc/boot/../../../lib/decompress_unxz.c:233,
                       from arch/powerpc/boot/decompress.c:42:
       arch/powerpc/boot/../../../lib/xz/xz_crc32.c:18:10: fatal error:
       linux/crc32poly.h: No such file or directory
        #include <linux/crc32poly.h>
                 ^~~~~~~~~~~~~~~~~~~
      
      The powerpc decompresser is a hairy corner of the kernel. Even while building
      a 64-bit kernel it needs to build a 32-bit binary and therefore avoid including
      files from include/linux.
      
      This allows users of the xz library to avoid including headers from
      'include/linux/' while still achieving the cleanup of the magic number.
      
      Fixes: faa16bc4 ("lib: Use existing define with polynomial")
      Reported-by: Meelis Roos <mroos@linux.ee>
      Reported-by: kbuild test robot <lkp@intel.com>
      Suggested-by: Christophe LEROY <christophe.leroy@c-s.fr>
      Signed-off-by: Joel Stanley <joel@jms.id.au>
      Tested-by: Meelis Roos <mroos@linux.ee>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      242cdad8
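The approach in 242cdad8, keeping a named polynomial constant without depending on 'include/linux/', can be sketched as a guarded local define feeding a table-driven CRC32. The `_sketch` names are invented here, and the code is a standalone illustration of the reflected CRC-32 algorithm with that polynomial, not a copy of the xz sources.

```c
#include <stddef.h>
#include <stdint.h>

/* Define the little-endian CRC32 polynomial locally if no header provided it. */
#ifndef CRC32_POLY_LE
#define CRC32_POLY_LE 0xedb88320
#endif

static uint32_t crc32_table_sketch[256];

/* Build the 256-entry lookup table from the polynomial. */
void crc32_init_sketch(void)
{
	uint32_t i, j, r;

	for (i = 0; i < 256; ++i) {
		r = i;
		for (j = 0; j < 8; ++j)
			r = (r >> 1) ^ (CRC32_POLY_LE & ~((r & 1) - 1));
		crc32_table_sketch[i] = r;
	}
}

/* Update a running CRC32 with size bytes from buf; start with crc = 0. */
uint32_t crc32_sketch(const uint8_t *buf, size_t size, uint32_t crc)
{
	crc = ~crc;
	while (size != 0) {
		crc = crc32_table_sketch[*buf++ ^ (crc & 0xFF)] ^ (crc >> 8);
		--size;
	}
	return ~crc;
}
```

Because the define is guarded by `#ifndef`, a build that does have the shared header keeps using that definition, while a restricted build such as a boot wrapper falls back to the local one.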
  12. Sep 28, 2018
  13. Sep 27, 2018
    • bpf: test_bpf: add init_net to dev for flow_dissector · 10081193
      Song Liu authored
      
      Latest changes in __skb_flow_dissect() assume skb->dev has valid nd_net.
      However, this is not true for test_bpf. As a result, test_bpf.ko crashes
      the system with the following stack trace:
      
      [ 1133.716622] BUG: unable to handle kernel paging request at 0000000000001030
      [ 1133.716623] PGD 8000001fbf7ee067
      [ 1133.716624] P4D 8000001fbf7ee067
      [ 1133.716624] PUD 1f6c1cf067
      [ 1133.716625] PMD 0
      [ 1133.716628] Oops: 0000 [#1] SMP PTI
      [ 1133.716630] CPU: 7 PID: 40473 Comm: modprobe Kdump: loaded Not tainted 4.19.0-rc5-00805-gca11cc92ccd2 #1167
      [ 1133.716631] Hardware name: Wiwynn Leopard-Orv2/Leopard-DDR BW, BIOS LBM12.5 12/06/2017
      [ 1133.716638] RIP: 0010:__skb_flow_dissect+0x83/0x1680
      [ 1133.716639] Code: 04 00 00 41 0f b7 44 24 04 48 85 db 4d 8d 14 07 0f 84 01 02 00 00 48 8b 43 10 48 85 c0 0f 84 e5 01 00 00 48 8b 80 a8 04 00 00 <48> 8b 90 30 10 00 00 48 85 d2 0f 84 dd 01 00 00 31 c0 b9 05 00 00
      [ 1133.716640] RSP: 0018:ffffc900303c7a80 EFLAGS: 00010282
      [ 1133.716642] RAX: 0000000000000000 RBX: ffff881fea0b7400 RCX: 0000000000000000
      [ 1133.716643] RDX: ffffc900303c7bb4 RSI: ffffffff8235c3e0 RDI: ffff881fea0b7400
      [ 1133.716643] RBP: ffffc900303c7b80 R08: 0000000000000000 R09: 000000000000000e
      [ 1133.716644] R10: ffffc900303c7bb4 R11: ffff881fb6840400 R12: ffffffff8235c3e0
      [ 1133.716645] R13: 0000000000000008 R14: 000000000000001e R15: ffffc900303c7bb4
      [ 1133.716646] FS:  00007f54e75d3740(0000) GS:ffff881fff5c0000(0000) knlGS:0000000000000000
      [ 1133.716648] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 1133.716649] CR2: 0000000000001030 CR3: 0000001f6c226005 CR4: 00000000003606e0
      [ 1133.716649] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 1133.716650] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [ 1133.716651] Call Trace:
      [ 1133.716660]  ? sched_clock_cpu+0xc/0xa0
      [ 1133.716662]  ? sched_clock_cpu+0xc/0xa0
      [ 1133.716665]  ? log_store+0x1b5/0x260
      [ 1133.716667]  ? up+0x12/0x60
      [ 1133.716669]  ? skb_get_poff+0x4b/0xa0
      [ 1133.716674]  ? __kmalloc_reserve.isra.47+0x2e/0x80
      [ 1133.716675]  skb_get_poff+0x4b/0xa0
      [ 1133.716680]  bpf_skb_get_pay_offset+0xa/0x10
      [ 1133.716686]  ? test_bpf_init+0x578/0x1000 [test_bpf]
      [ 1133.716690]  ? netlink_broadcast_filtered+0x153/0x3d0
      [ 1133.716695]  ? free_pcppages_bulk+0x324/0x600
      [ 1133.716696]  ? 0xffffffffa0279000
      [ 1133.716699]  ? do_one_initcall+0x46/0x1bd
      [ 1133.716704]  ? kmem_cache_alloc_trace+0x144/0x1a0
      [ 1133.716709]  ? do_init_module+0x5b/0x209
      [ 1133.716712]  ? load_module+0x2136/0x25d0
      [ 1133.716715]  ? __do_sys_finit_module+0xba/0xe0
      [ 1133.716717]  ? __do_sys_finit_module+0xba/0xe0
      [ 1133.716719]  ? do_syscall_64+0x48/0x100
      [ 1133.716724]  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      This patch fixes test_bpf by using init_net in the dummy dev.
      
      Fixes: d58e468b ("flow_dissector: implements flow dissector BPF hook")
      Reported-by: Eric Dumazet <edumazet@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Cc: Petar Penkov <ppenkov@google.com>
      Signed-off-by: Song Liu <songliubraving@fb.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      10081193
  14. Sep 26, 2018
  15. Sep 21, 2018
    • crypto: chacha20 - Fix chacha20_block() keystream alignment (again) · a5e9f557
      Eric Biggers authored
      
      In commit 9f480fae ("crypto: chacha20 - Fix keystream alignment for
      chacha20_block()"), I had missed that chacha20_block() can be called
      directly on the buffer passed to get_random_bytes(), which can have any
      alignment.  So, while my commit didn't break anything, it didn't fully
      solve the alignment problems.
      
      Revert my solution and just update chacha20_block() to use
      put_unaligned_le32(), so the output buffer need not be aligned.
      This is simpler, and on many CPUs it's the same speed.
      
      But, I kept the 'tmp' buffers in extract_crng_user() and
      _get_random_bytes() 4-byte aligned, since that alignment is actually
      needed for _crng_backtrack_protect() too.
      
      Reported-by: Stephan Müller <smueller@chronox.de>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      a5e9f557
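a5e9f557 switches chacha20_block() to put_unaligned_le32() so the output buffer need not be aligned. The idea behind such a helper can be shown with a portable byte-wise store. `put_le32_sketch` is a name made up for this illustration and is not the kernel's implementation, which may use a single store on CPUs that handle unaligned access.

```c
#include <stdint.h>

/*
 * Store a 32-bit value in little-endian byte order at p, which may have
 * any alignment. Writing one byte at a time never performs a misaligned
 * 32-bit access.
 */
void put_le32_sketch(uint32_t val, uint8_t *p)
{
	p[0] = (uint8_t)(val);
	p[1] = (uint8_t)(val >> 8);
	p[2] = (uint8_t)(val >> 16);
	p[3] = (uint8_t)(val >> 24);
}
```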
  16. Sep 19, 2018
    • netlink: add ethernet address policy types · b60b87fc
      Johannes Berg authored
      
      Commonly, ethernet addresses are just using a policy of
      	{ .len = ETH_ALEN }
      which leaves userspace free to send more data than it should,
      which may hide bugs.
      
      Introduce NLA_EXACT_LEN which checks for exact size, rejecting
      the attribute if it's not exactly that length. Also add
      NLA_EXACT_LEN_WARN which requires the minimum length and will
      warn on longer attributes, for backward compatibility.
      
      Use these to define NLA_POLICY_ETH_ADDR (new strict policy) and
      NLA_POLICY_ETH_ADDR_COMPAT (compatible policy with warning);
      these are used like this:
      
          static const struct nla_policy <name>[...] = {
              [NL_ATTR_NAME] = NLA_POLICY_ETH_ADDR,
              ...
          };
      
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b60b87fc
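The two length policies introduced in b60b87fc differ only in how they treat attributes that are longer than expected. A rough model of that decision follows. The enum, the function name, the warning text, and the -ERANGE return value are all assumptions made for this sketch, not the kernel's definitions.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Sketch-only stand-ins for the exact-length and warn-on-longer policies. */
enum len_policy_sketch {
	LEN_EXACT,
	LEN_EXACT_WARN,
};

/*
 * Check an attribute's payload length against the expected length.
 * LEN_EXACT rejects anything that is not exactly the expected size.
 * LEN_EXACT_WARN still requires at least the expected size, but accepts
 * longer attributes with a warning for backward compatibility.
 */
int check_len_sketch(enum len_policy_sketch type, size_t expected, size_t attrlen)
{
	if (attrlen == expected)
		return 0;
	if (type == LEN_EXACT_WARN && attrlen > expected) {
		fprintf(stderr, "attribute length %zu exceeds expected %zu\n",
			attrlen, expected);
		return 0;
	}
	return -ERANGE;
}
```

For an ethernet address the expected length would be 6 bytes, so a 8-byte attribute passes only under the warning variant, and a 4-byte attribute fails under both.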
    • netlink: add NLA_REJECT policy type · 568b742a
      Johannes Berg authored
      
      In some situations some netlink attributes may be used for output
      only (kernel->userspace) or may be reserved for future use. It's
      then helpful to be able to prevent userspace from using them in
      messages sent to the kernel, since they'd otherwise be ignored and
      any future use will become impossible if this happens.
      
      Add NLA_REJECT to the policy which does nothing but reject (with
      EINVAL) validation of any messages containing this attribute.
      Allow for returning a specific extended ACK error message in the
      validation_data pointer.
      
      While at it clear up the documentation a bit - the NLA_BITFIELD32
      documentation was added to the list of len field descriptions.
      
      Also, use NL_SET_BAD_ATTR() in one place where it's open-coded.
      
      The specific case I have in mind now is a shared nested attribute
      containing request/response data, and it would be pointless and
      potentially confusing to have userspace include response data in
      the messages that actually contain a request.
      
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      568b742a
  17. Sep 14, 2018
  18. Sep 10, 2018
  19. Sep 04, 2018
  20. Aug 30, 2018
  21. Aug 24, 2018
  22. Aug 22, 2018