- Mar 06, 2019
-
-
Andrey Ryabinin authored
Use-after-scope bug detection seems to be almost entirely useless for the Linux kernel. It has existed for over two years, but I've seen only one valid bug so far [1], and that bug was fixed before it was reported. There were some other use-after-scope reports, but they were false positives caused by things like incompatibility with the structleak plugin. This feature significantly increases stack usage, especially with GCC versions before 9, and causes a 32K stack overflow. It probably adds a performance penalty too. Given all that, let's remove the use-after-scope detector entirely. While preparing this patch I noticed that we mistakenly enable use-after-scope detection for the clang compiler regardless of the CONFIG_KASAN_EXTRA setting. This is also fixed now.
[1] http://lkml.kernel.org/r/<20171129052106.rhgbjhhis53hkgfn@wfg-t540p.sh.intel.com>
Link: http://lkml.kernel.org/r/20190111185842.13978-1-aryabinin@virtuozzo.com
Signed-off-by:
Andrey Ryabinin <aryabinin@virtuozzo.com> Acked-by: Will Deacon <will.deacon@arm.com> [arm64] Cc: Qian Cai <cai@lca.pw> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
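For reference, the class of bug the detector was meant to catch looks like the following; this is a minimal, self-contained userspace illustration, not the kernel bug referenced in [1] above:

    #include <stdio.h>

    /* A pointer to a block-scoped variable escapes its scope; dereferencing
     * it afterwards is a use-after-scope, which the KASAN instrumentation
     * detected by poisoning the stack slots of dead scopes. */
    int main(void)
    {
        int *p;
        {
            int x = 42;
            p = &x;         /* &x is only valid inside this block */
        }
        printf("%d\n", *p); /* use after scope: x's lifetime has ended */
        return 0;
    }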
-
- Mar 01, 2019
-
-
Arnd Bergmann authored
Building an arm64 allmodconfig kernel with clang results in over 140 warnings about overly large stack frames, the worst ones being:
drivers/gpu/drm/panel/panel-sitronix-st7789v.c:196:12: error: stack frame size of 20224 bytes in function 'st7789v_prepare'
drivers/video/fbdev/omap2/omapfb/displays/panel-tpo-td028ttec1.c:196:12: error: stack frame size of 13120 bytes in function 'td028ttec1_panel_enable'
drivers/usb/host/max3421-hcd.c:1395:1: error: stack frame size of 10048 bytes in function 'max3421_spi_thread'
drivers/net/wan/slic_ds26522.c:209:12: error: stack frame size of 9664 bytes in function 'slic_ds26522_probe'
drivers/crypto/ccp/ccp-ops.c:2434:5: error: stack frame size of 8832 bytes in function 'ccp_run_cmd'
drivers/media/dvb-frontends/stv0367.c:1005:12: error: stack frame size of 7840 bytes in function 'stv0367ter_algo'
None of these happen with gcc today, and almost all of these are the result of a single known issue in llvm. Hopefully it will eventually get fixed with the clang-9 release. In the meantime, the best idea I have is to turn off asan-stack for clang-8 and earlier, so we can produce a kernel that is safe to run. I have posted three patches that address the frame overflow warnings that are not addressed by turning off asan-stack, so in combination with this change, we get much closer to a clean allmodconfig build, which in turn is necessary to do meaningful build regression testing. It is still possible to turn on the CONFIG_ASAN_STACK option on all versions of clang, and it's always enabled for gcc, but when CONFIG_COMPILE_TEST is set, the option remains invisible, so allmodconfig and randconfig builds (which are normally done with a forced CONFIG_COMPILE_TEST) will still result in a mostly clean build.
Link: http://lkml.kernel.org/r/20190222222950.3997333-1-arnd@arndb.de
Link: https://bugs.llvm.org/show_bug.cgi?id=38809
Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Reviewed-by:
Qian Cai <cai@lca.pw> Reviewed-by:
Mark Brown <broonie@kernel.org> Acked-by:
Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Kostya Serebryany <kcc@google.com> Cc: Andrey Konovalov <andreyknvl@google.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- Feb 25, 2019
-
-
Anders Roxell authored
When running the BPF test suite the following splat occurs:
[ 415.930950] test_bpf: #0 TAX jited:0
[ 415.931067] BUG: assuming atomic context at lib/test_bpf.c:6674
[ 415.946169] in_atomic(): 0, irqs_disabled(): 0, pid: 11556, name: modprobe
[ 415.953176] INFO: lockdep is turned off.
[ 415.957207] CPU: 1 PID: 11556 Comm: modprobe Tainted: G W 5.0.0-rc7-next-20190220 #1
[ 415.966328] Hardware name: HiKey Development Board (DT)
[ 415.971592] Call trace:
[ 415.974069]  dump_backtrace+0x0/0x160
[ 415.977761]  show_stack+0x24/0x30
[ 415.981104]  dump_stack+0xc8/0x114
[ 415.984534]  __cant_sleep+0xf0/0x108
[ 415.988145]  test_bpf_init+0x5e0/0x1000 [test_bpf]
[ 415.992971]  do_one_initcall+0x90/0x428
[ 415.996837]  do_init_module+0x60/0x1e4
[ 416.000614]  load_module+0x1de0/0x1f50
[ 416.004391]  __se_sys_finit_module+0xc8/0xe0
[ 416.008691]  __arm64_sys_finit_module+0x24/0x30
[ 416.013255]  el0_svc_common+0x78/0x130
[ 416.017031]  el0_svc_handler+0x38/0x78
[ 416.020806]  el0_svc+0x8/0xc
Rework the test so that preemption is disabled around the BPF_PROG_RUN() loop.
Fixes: 568f1967 ("bpf: check that BPF programs run with preemption disabled")
Suggested-by:
Arnd Bergmann <arnd@arndb.de> Signed-off-by:
Anders Roxell <anders.roxell@linaro.org> Signed-off-by:
Daniel Borkmann <daniel@iogearbox.net>
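A minimal sketch of the shape of the fix described above; the loop bounds and variable names (fp, data, runs) are placeholders rather than the exact lib/test_bpf.c code:

    #include <linux/preempt.h>
    #include <linux/filter.h>

    /* BPF programs must run with preemption disabled, so wrap the
     * measurement loop accordingly (sketch, not the exact test code). */
    preempt_disable();
    for (i = 0; i < runs; i++)
        ret = BPF_PROG_RUN(fp, data);
    preempt_enable();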
-
- Feb 22, 2019
-
-
Herbert Xu authored
The rhashtable_walk_init function has been obsolete for more than two years. This patch finally converts its last users over to rhashtable_walk_enter and removes it. Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by:
Johannes Berg <johannes.berg@intel.com>
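The conversion pattern is roughly the following sketch; note that rhashtable_walk_enter() cannot fail, so the error handling needed by the old call simply goes away:

    /* before: could fail and took a gfp_t argument */
    err = rhashtable_walk_init(ht, &iter, GFP_KERNEL);
    if (err)
        return err;

    /* after: cannot fail */
    rhashtable_walk_enter(ht, &iter);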
-
- Feb 21, 2019
-
-
Colin Ian King authored
There are spelling mistakes in warning macro messages. Fix them. Signed-off-by:
Colin Ian King <colin.king@canonical.com> Acked-by:
Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Feb 15, 2019
-
-
David Howells authored
Fix the creation of shortcuts for which the length of the index key value is an exact multiple of the machine word size. The problem is that the code that blanks off the unused bits of the shortcut value malfunctions if the number of bits in the last word equals machine word size. This is due to the "<<" operator being given a shift of zero in this case, and so the mask that should be all zeros is all ones instead. This causes the subsequent masking operation to clear everything rather than clearing nothing. Ordinarily, the presence of the hash at the beginning of the tree index key makes the issue very hard to test for, but in this case, it was encountered due to a development mistake that caused the hash output to be either 0 (keyring) or 1 (non-keyring) only. This made it susceptible to the keyctl/unlink/valid test in the keyutils package. The fix is simply to skip the blanking if the shift would be 0. For example, an index key that is 64 bits long would produce a 0 shift and thus a 'blank' of all 1s. This would then be inverted and AND'd onto the index_key, incorrectly clearing the entire last word. Fixes: 3cb98950 ("Add a generic associative array implementation.") Signed-off-by:
David Howells <dhowells@redhat.com> Signed-off-by:
James Morris <james.morris@microsoft.com>
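A self-contained userspace illustration of the failure mode and the fix; the real code lives in lib/assoc_array.c, and the names here are purely illustrative:

    #include <stdio.h>

    /* Blank off the unused high bits of the last word of an index key.
     * "used" is the number of key bits occupying that word.  When the key
     * length is an exact multiple of the word size, used is 0, so
     * ~0UL << 0 == ~0UL and the old code cleared the whole word.  The fix
     * is simply to skip the blanking in that case. */
    static unsigned long blank_last_word(unsigned long word, unsigned int used)
    {
        if (used == 0)                        /* the fix: nothing to blank */
            return word;
        unsigned long blank = ~0UL << used;
        return word & ~blank;                 /* keep only the used low bits */
    }

    int main(void)
    {
        printf("%lx\n", blank_last_word(0xdeadbeefUL, 16)); /* 0xbeef     */
        printf("%lx\n", blank_last_word(0xdeadbeefUL, 0));  /* unchanged  */
        return 0;
    }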
-
Miguel Ojeda authored
The upcoming GCC 9 release extends the -Wmissing-attributes warnings (enabled by -Wall) to C and to aliases: it warns when an alias lacks a function attribute that its target has. In particular, it triggers here because crc32_le_base/__crc32c_le_base aren't __pure while their targets crc32_le/__crc32c_le are. These aliases are used by architectures as a fallback in accelerated versions of CRC32. See commit 9784d82d ("lib/crc32: make core crc32() routines weak so they can be overridden"). Therefore, being fallbacks, it is likely that even if the aliases were called from C, there wouldn't be any optimizations possible. Currently, the only user is arm64, which calls them from asm. Still, marking the aliases as __pure makes sense, is a good idea for documentation purposes and possible future optimizations, and also silences the warning. Acked-by:
Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by:
Laura Abbott <labbott@redhat.com> Signed-off-by:
Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
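A minimal, generic illustration of the GCC 9 -Wmissing-attributes behaviour described above (this is not the kernel's crc32 code; the function names are made up, and the quoted warning text is approximate):

    /* Compile with: gcc-9 -Wall -c alias.c */
    __attribute__((pure)) int crc_target(int x) { return x ^ (x >> 1); }

    /* warns, roughly: 'crc_alias' specifies less restrictive attribute
     * than its target 'crc_target': 'pure' [-Wmissing-attributes] */
    int crc_alias(int x) __attribute__((alias("crc_target")));

    /* silent: the alias carries the same attribute as its target */
    __attribute__((pure)) int crc_alias_ok(int x)
        __attribute__((alias("crc_target")));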
-
- Feb 14, 2019
-
-
Jiri Pirko authored
It is possible for an object that was originally a parent, but has 0 direct users, to no longer be considered a parent in the hints. The weight of such an object is then 0 and the current code ignores it. That's why the total number of hint objects might be lower than in the original objagg and the WARN_ON is hit. Fix this by treating a weight of 0 as valid. Fixes: 9069a381 ("lib: objagg: implement optimization hints assembly and use hints for object creation") Signed-off-by:
Jiri Pirko <jiri@mellanox.com> Reviewed-by:
Ido Schimmel <idosch@mellanox.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Dan Carpenter authored
We need to set the error message on this path otherwise some of the callers, such as test_hints_case(), print from an uninitialized pointer. We had a similar bug earlier and set "errmsg" to NULL in the caller, test_delta_action_item(). That code is no longer required so I have removed it. Fixes: 9069a381 ("lib: objagg: implement optimization hints assembly and use hints for object creation") Signed-off-by:
Dan Carpenter <dan.carpenter@oracle.com> Acked-by:
Jiri Pirko <jiri@mellanox.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Dan Carpenter authored
There is a typo here. We intended to check "objagg2" but we instead test "objagg" which is not an error pointer. Fixes: 9069a381 ("lib: objagg: implement optimization hints assembly and use hints for object creation") Signed-off-by:
Dan Carpenter <dan.carpenter@oracle.com> Acked-by:
Jiri Pirko <jiri@mellanox.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Dan Carpenter authored
We need to set the error code on this path otherwise we return ERR_PTR(0) which would result in a NULL dereference in the caller. Fixes: 9069a381 ("lib: objagg: implement optimization hints assembly and use hints for object creation") Signed-off-by:
Dan Carpenter <dan.carpenter@oracle.com> Acked-by:
Jiri Pirko <jiri@mellanox.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
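The general shape of this class of bug and its fix, as a sketch; the structure and function names are illustrative, not the actual objagg code:

    struct foo *foo_create(void)
    {
        int err = 0;
        struct foo *foo = kzalloc(sizeof(*foo), GFP_KERNEL);

        if (!foo) {
            err = -ENOMEM;    /* the missing assignment: without it the
                               * error exit returns ERR_PTR(0), i.e. NULL,
                               * which the caller then dereferences */
            goto err_alloc;
        }
        return foo;

    err_alloc:
        return ERR_PTR(err);
    }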
-
- Feb 08, 2019
-
-
Jiri Pirko authored
Count the number of roots and add it to the stats. It is handy for the library user to have this stat available, as it can act upon it without counting roots itself. Signed-off-by:
Jiri Pirko <jiri@mellanox.com> Signed-off-by:
Ido Schimmel <idosch@mellanox.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
Implement a simple greedy algorithm to find a more optimized root-delta tree for a given objagg instance. These "hints" can be used by a driver to: 1) check whether the hints are better (driver's choice) than the original objagg tree, by comparing the objagg stats with the hints stats; 2) create a new objagg instance which will construct the root-delta tree according to the passed hints. Currently, only a simple greedy algorithm is implemented: it basically picks the roots according to the maximal possible user count, including deltas. Signed-off-by:
Jiri Pirko <jiri@mellanox.com> Signed-off-by:
Ido Schimmel <idosch@mellanox.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
Signed-off-by:
Jiri Pirko <jiri@mellanox.com> Signed-off-by:
Ido Schimmel <idosch@mellanox.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Feb 01, 2019
-
-
Dan Carpenter authored
There is a copy-and-paste bug, so we set "config->test_driver" to NULL twice instead of setting "config->test_fs". Smatch complains that it leads to a double free: lib/test_kmod.c:840 __kmod_config_init() warn: 'config->test_fs' double freed Link: http://lkml.kernel.org/r/20190121140011.GA14283@kadam Fixes: d9c6a72d ("kmod: add test driver to stress test the module loader") Signed-off-by:
Dan Carpenter <dan.carpenter@oracle.com> Acked-by:
Luis Chamberlain <mcgrof@kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
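The copy-and-paste pattern looks roughly like this sketch of the config teardown in lib/test_kmod.c (the exact free helper used there may differ):

    kfree(config->test_driver);
    config->test_driver = NULL;

    kfree(config->test_fs);
    config->test_fs = NULL;   /* was: "config->test_driver = NULL" again,
                               * leaving test_fs pointing at freed memory,
                               * to be freed a second time later */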
-
- Jan 31, 2019
-
-
Bart Van Assche authored
The test_insert_dup() function from lib/test_rhashtable.c passes a pointer to a stack object to rhltable_init(). Allocate the hash table dynamically to avoid that the following is reported with object debugging enabled:
ODEBUG: object (ptrval) is on stack (ptrval), but NOT annotated.
WARNING: CPU: 0 PID: 1 at lib/debugobjects.c:368 __debug_object_init+0x312/0x480
Modules linked in:
EIP: __debug_object_init+0x312/0x480
Call Trace:
 ? debug_object_init+0x1a/0x20
 ? __init_work+0x16/0x30
 ? rhashtable_init+0x1e1/0x460
 ? sched_clock_cpu+0x57/0xe0
 ? rhltable_init+0xb/0x20
 ? test_insert_dup+0x32/0x20f
 ? trace_hardirqs_on+0x38/0xf0
 ? ida_dump+0x10/0x10
 ? jhash+0x130/0x130
 ? my_hashfn+0x30/0x30
 ? test_rht_init+0x6aa/0xab4
 ? ida_dump+0x10/0x10
 ? test_rhltable+0xc5c/0xc5c
 ? do_one_initcall+0x67/0x28e
 ? trace_hardirqs_off+0x22/0xe0
 ? restore_all_kernel+0xf/0x70
 ? trace_hardirqs_on_thunk+0xc/0x10
 ? restore_all_kernel+0xf/0x70
 ? kernel_init_freeable+0x142/0x213
 ? rest_init+0x230/0x230
 ? kernel_init+0x10/0x110
 ? schedule_tail_wrapper+0x9/0xc
 ? ret_from_fork+0x19/0x24
Cc: Thomas Graf <tgraf@suug.ch>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by:
Bart Van Assche <bvanassche@acm.org> Acked-by:
Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by:
David S. Miller <davem@davemloft.net>
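The fix amounts to the following pattern, shown as a sketch; the error handling and the exact name of the parameter struct in the test may differ:

    /* before: a stack object, which object debugging rejects     */
    /*   struct rhltable rhlt;                                     */

    struct rhltable *rhlt = kmalloc(sizeof(*rhlt), GFP_KERNEL);
    int err = -ENOMEM;

    if (!rhlt)
        return err;
    err = rhltable_init(rhlt, &test_rht_params_dup);
    if (err == 0) {
        /* ... run the duplicate-insert test ... */
        rhltable_destroy(rhlt);
    }
    kfree(rhlt);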
-
- Jan 20, 2019
-
-
Florian La Roche authored
If an input number x for int_sqrt64() has the highest bit set, then fls64(x) is 64. (1UL << 64) is an overflow and breaks the algorithm. Subtracting 1 is a better guess for the initial value of m anyway, and that's what is also done implicitly in int_sqrt() [*]. [*] Note how int_sqrt() uses __fls() with two underscores, which already returns the proper raw bit number. In contrast, int_sqrt64() used fls64(), which returns bit numbers illogically starting at 1, because of the error handling for the "no bits set" case. Will points out that the bug is probably due to a copy-and-paste error from the regular int_sqrt() case. Signed-off-by:
Florian La Roche <Florian.LaRoche@googlemail.com> Acked-by:
Will Deacon <will.deacon@arm.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
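A self-contained userspace reproduction of the initial-guess problem described above; fls64() is approximated with a compiler builtin here, and the kernel fix itself amounts to subtracting 1 before shifting:

    #include <stdint.h>
    #include <stdio.h>

    /* fls64() returns the 1-based index of the most significant set bit,
     * so for an input with bit 63 set it returns 64, and "1ULL << 64" is
     * undefined behaviour.  Shifting by (fls64(x) - 1) keeps the shift in
     * range, matching what __fls() gives int_sqrt() for free. */
    static int fls64_approx(uint64_t x)
    {
        return x ? 64 - __builtin_clzll(x) : 0;
    }

    int main(void)
    {
        uint64_t x = 1ULL << 63;
        /* buggy guess: 1ULL << (fls64_approx(x) & ~1U)  -> shift of 64, UB */
        uint64_t m = 1ULL << ((fls64_approx(x) - 1) & ~1U); /* fixed guess  */
        printf("m = %#llx\n", (unsigned long long)m);
        return 0;
    }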
-
- Jan 15, 2019
-
-
Ming Lei authored
Because blk_mq_get_driver_tag() may be called directly from blk_mq_dispatch_rq_list() without holding any lock, a hard IRQ may arrive and the above DEADLOCK is triggered. Commit ab53dcfb3e7b ("sbitmap: Protect swap_lock from hardirq") tries to fix this issue by using 'spin_lock_bh', which isn't enough because we complete requests from hardirq context directly in the multiqueue case. Cc: Clark Williams <williams@redhat.com> Fixes: ab53dcfb3e7b ("sbitmap: Protect swap_lock from hardirq") Cc: Jens Axboe <axboe@kernel.dk> Cc: Ming Lei <ming.lei@redhat.com> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by:
Ming Lei <ming.lei@redhat.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
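The shape of the change is roughly the following sketch; the lock lives in the per-word sbitmap structure, and the surrounding function and field names are approximate:

    /* spin_lock_bh() only keeps softirqs out, but with blk-mq a request can
     * complete from hard interrupt context, so the deferred-clear swap must
     * disable hard IRQs as well (sketch, names approximate): */
    unsigned long flags;

    spin_lock_irqsave(&sb->map[index].swap_lock, flags);
    /* ... swap the word with its cleared mask ... */
    spin_unlock_irqrestore(&sb->map[index].swap_lock, flags);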
-
- Jan 14, 2019
-
-
Matthew Wilcox authored
We do not currently check that the loop in xas_squash_marks() doesn't have an off-by-one error in it. It didn't, but a patch which introduced an off-by-one error wasn't caught by any existing test. Switch the roles of XA_MARK_1 and XA_MARK_2 to catch that bug. Reported-by:
Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by:
Matthew Wilcox <willy@infradead.org>
-
Steven Rostedt (VMware) authored
The swap_lock used by sbitmap has a chain with locks taken from softirq, but the swap_lock is not protected from being preempted by softirqs. A chain exists of: sbq->ws[i].wait -> dispatch_wait_lock -> swap_lock Where the sbq->ws[i].wait lock can be taken from softirq context, which means all locks below it in the chain must also be protected from softirqs. Reported-by:
Clark Williams <williams@redhat.com> Fixes: 58ab5e32 ("sbitmap: silence bogus lockdep IRQ warning") Fixes: ea86ea2c ("sbitmap: amortize cost of clearing bits") Cc: Jens Axboe <axboe@kernel.dk> Cc: Ming Lei <ming.lei@redhat.com> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by:
Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- Jan 07, 2019
-
-
Matthew Wilcox authored
xa_insert() should treat reserved entries as occupied, not as available. Also, it should treat requests to insert a NULL pointer as a request to reserve the slot. Add xa_insert_bh() and xa_insert_irq() for completeness. Signed-off-by:
Matthew Wilcox <willy@infradead.org>
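A sketch of the resulting semantics (the example() wrapper is illustrative and the exact error codes are elided; only the behaviour described above is shown):

    #include <linux/xarray.h>

    static DEFINE_XARRAY(xa);

    void example(void *p)
    {
        int err;

        err = xa_reserve(&xa, 1, GFP_KERNEL);     /* slot 1 is now reserved */
        err = xa_insert(&xa, 1, p, GFP_KERNEL);   /* fails: a reserved slot
                                                   * counts as occupied      */
        err = xa_insert(&xa, 2, NULL, GFP_KERNEL);/* reserves slot 2 instead
                                                   * of storing a real NULL  */
    }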
-
Matthew Wilcox authored
On m68k, statically allocated pointers may only be two-byte aligned. This clashes with the XArray's method for tagging internal pointers. Permit storing these pointers in single slots (ie not in multislots). Signed-off-by:
Matthew Wilcox <willy@infradead.org>
-
Matthew Wilcox authored
There were three problems with this API:
1. It took too many arguments; almost all users wanted to iterate over every element in the array rather than a subset.
2. It required that 'index' be initialised before use, and there's no realistic way to make GCC catch that.
3. 'index' and 'entry' were the opposite way round from every other member of the XArray APIs.
So split it into three different APIs:
xa_for_each(xa, index, entry)
xa_for_each_start(xa, index, entry, start)
xa_for_each_marked(xa, index, entry, filter)
Signed-off-by:
Matthew Wilcox <willy@infradead.org>
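Usage of the three iterators named above, with index and entry as plain locals in that order; do_something(), the start index 128, and XA_MARK_0 are just placeholders for this sketch:

    #include <linux/xarray.h>

    void walk(struct xarray *xa)
    {
        unsigned long index;
        void *entry;

        xa_for_each(xa, index, entry)                    /* every present entry */
            do_something(entry);

        xa_for_each_start(xa, index, entry, 128)         /* start at index 128  */
            do_something(entry);

        xa_for_each_marked(xa, index, entry, XA_MARK_0)  /* only marked entries */
            do_something(entry);
    }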
-
Matthew Wilcox authored
A regular xa_init_flags() puts all dynamically-initialised XArrays into the same locking class. That leads to lockdep believing that taking one XArray lock while holding another is a deadlock. It's possible to work around some of these situations with separate locking classes for irq/bh/regular XArrays and SINGLE_DEPTH_NESTING, but that's ugly, and it doesn't work for all situations (e.g. where we have completely unrelated XArrays). Signed-off-by:
Matthew Wilcox <willy@infradead.org>
-
Matthew Wilcox authored
0day picked up that I'd forgotten to add locking to this new test. Signed-off-by:
Matthew Wilcox <willy@infradead.org>
-
- Jan 06, 2019
-
-
Masahiro Yamada authored
Since commit 9c2af1c7 ("kbuild: add .DELETE_ON_ERROR special target"), the target file is automatically deleted on failure. The boilerplate code ... || { rm -f $@; false; } is unneeded. Signed-off-by:
Masahiro Yamada <yamada.masahiro@socionext.com>
-
Masahiro Yamada authored
Currently, CONFIG_JUMP_LABEL just means "I _want_ to use jump label". The jump label is controlled by HAVE_JUMP_LABEL, which is defined like this:
#if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_JUMP_LABEL)
# define HAVE_JUMP_LABEL
#endif
We can improve this by testing 'asm goto' support in Kconfig, then make JUMP_LABEL depend on CC_HAS_ASM_GOTO. Ugly #ifdef HAVE_JUMP_LABEL will go away, and CONFIG_JUMP_LABEL will match the real kernel capability. Signed-off-by:
Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Tested-by:
Sedat Dilek <sedat.dilek@gmail.com>
-
- Jan 05, 2019
-
-
Olof Johansson authored
Fixes build break on most ARM/ARM64 defconfigs:
lib/genalloc.c: In function 'gen_pool_add_virt':
lib/genalloc.c:190:10: error: implicit declaration of function 'vzalloc_node'; did you mean 'kzalloc_node'?
lib/genalloc.c:190:8: warning: assignment to 'struct gen_pool_chunk *' from 'int' makes pointer from integer without a cast [-Wint-conversion]
lib/genalloc.c: In function 'gen_pool_destroy':
lib/genalloc.c:254:3: error: implicit declaration of function 'vfree'; did you mean 'kfree'?
Fixes: 6862d2fc ('lib/genalloc.c: use vzalloc_node() to allocate the bitmap')
Cc: Huang Shijie <sjhuang@iluvatar.ai>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexey Skidanov <alexey.skidanov@intel.com>
Signed-off-by:
Olof Johansson <olof@lixom.net> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
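The fix is the obvious one-liner; a sketch of what it looks like in lib/genalloc.c:

    #include <linux/vmalloc.h>  /* vzalloc_node() and vfree(); without this,
                                 * the calls above are implicit declarations
                                 * and the int-to-pointer warning follows */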
-
- Jan 04, 2019
-
-
Huang Shijie authored
Some devices may have a large on-chip memory, e.g. more than 1G. In some cases nbytes may be bigger than 4M, which is the boundary of the memory buddy system (with the default 4K page size). So use vzalloc_node() to allocate the bitmap, and use vfree() to free it. Link: http://lkml.kernel.org/r/20181225015701.6289-1-sjhuang@iluvatar.ai Signed-off-by:
Huang Shijie <sjhuang@iluvatar.ai> Reviewed-by:
Andrew Morton <akpm@linux-foundation.org> Cc: Alexey Skidanov <alexey.skidanov@intel.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Yury Norov authored
Unlike the other tests, the test_find_next_and_bit() test uses tab formatting in its output and get_cycles() instead of ktime_get(). get_cycles() is not supported by some arches, so ktime_get() fits better in generic code. Fix that and some minor style issues, so the output looks like this:
Start testing find_bit() with random-filled bitmap
find_next_bit: 7142816 ns, 163282 iterations
find_next_zero_bit: 8545712 ns, 164399 iterations
find_last_bit: 6332032 ns, 163282 iterations
find_first_bit: 20509424 ns, 16606 iterations
find_next_and_bit: 4060016 ns, 73424 iterations
Start testing find_bit() with sparse bitmap
find_next_bit: 55984 ns, 656 iterations
find_next_zero_bit: 19197536 ns, 327025 iterations
find_last_bit: 65088 ns, 656 iterations
find_first_bit: 5923712 ns, 656 iterations
find_next_and_bit: 29088 ns, 1 iterations
Link: http://lkml.kernel.org/r/20181123174803.10916-1-ynorov@caviumnetworks.com
Signed-off-by:
Yury Norov <ynorov@caviumnetworks.com> Reviewed-by:
Andrew Morton <akpm@linux-foundation.org> Cc: "Norov, Yuri" <Yuri.Norov@cavium.com> Cc: Clement Courbet <courbet@google.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Alexey Skidanov authored
gen_pool_alloc_algo() uses different allocation functions implementing different allocation algorithms. With the gen_pool_first_fit_align() allocation function, the returned address should be aligned on the requested boundary. If the chunk start address isn't aligned on the requested boundary, the returned address isn't aligned either. The only way to get a properly aligned address is to initialize the pool with chunks aligned on the requested boundary. If we want to be able to allocate buffers aligned on different boundaries (for example, 4K, 1MB, ...), the chunk start address would have to be aligned on the maximum possible alignment. This happens because gen_pool_first_fit_align() looks for a properly aligned memory block without taking into account the chunk start address alignment. To fix this, we provide the chunk start address to gen_pool_first_fit_align() and change its implementation so that it starts looking for a properly aligned block at the appropriate offset (exactly as is done in CMA). Link: https://lkml.kernel.org/lkml/a170cf65-6884-3592-1de9-4c235888cc8a@intel.com Link: http://lkml.kernel.org/r/1541690953-4623-1-git-send-email-alexey.skidanov@intel.com Signed-off-by:
Alexey Skidanov <alexey.skidanov@intel.com> Reviewed-by:
Andrew Morton <akpm@linux-foundation.org> Cc: Logan Gunthorpe <logang@deltatee.com> Cc: Daniel Mentz <danielmentz@google.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Laura Abbott <labbott@redhat.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
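The core of the idea, as a self-contained userspace sketch; the in-kernel change threads the chunk start address through the algorithm callbacks, and the names here are illustrative only:

    #include <stdio.h>

    /* First offset inside a chunk starting at chunk_start at which an
     * address aligned to "align" (a power of two) can begin.  If the chunk
     * itself is unaligned, offset 0 is not a valid candidate. */
    static unsigned long first_aligned_offset(unsigned long chunk_start,
                                              unsigned long align)
    {
        return (align - (chunk_start & (align - 1))) & (align - 1);
    }

    int main(void)
    {
        printf("%lx\n", first_aligned_offset(0x1234, 0x1000)); /* 0xdcc */
        printf("%lx\n", first_aligned_offset(0x2000, 0x1000)); /* 0     */
        return 0;
    }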
-
Linus Torvalds authored
Originally, the rule used to be that you'd have to do access_ok() separately, and then user_access_begin() before actually doing the direct (optimized) user access. But experience has shown that people then decide not to do access_ok() at all, and instead rely on it being implied by other operations or similar. Which makes it very hard to verify that the access has actually been range-checked. If you use the unsafe direct user accesses, hardware features (either SMAP - Supervisor Mode Access Protection - on x86, or PAN - Privileged Access Never - on ARM) do force you to use user_access_begin(). But nothing really forces the range check. By putting the range check into user_access_begin(), we actually force people to do the right thing (tm), and the range check will be visible near the actual accesses. We have way too long a history of people trying to avoid them. Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
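The resulting usage pattern, as a sketch using the unsafe accessors; the fault label and variable names (uptr, val, efault) are illustrative:

    if (!user_access_begin(uptr, sizeof(*uptr)))  /* now performs the
                                                   * access_ok() range check */
        return -EFAULT;
    unsafe_get_user(val, uptr, efault);
    unsafe_put_user(val + 1, uptr, efault);
    user_access_end();
    return 0;

    efault:
    user_access_end();
    return -EFAULT;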
-
Linus Torvalds authored
Nobody has actually used the type (VERIFY_READ vs VERIFY_WRITE) argument of the user address range verification function since we got rid of the old racy i386-only code to walk page tables by hand. It existed because the original 80386 would not honor the write protect bit when in kernel mode, so you had to do COW by hand before doing any user access. But we haven't supported that in a long time, and these days the 'type' argument is a purely historical artifact. A discussion about extending 'user_access_begin()' to do the range checking resulted in this patch, because there is no way we're going to move the old VERIFY_xyz interface to that model. And it's best done at the end of the merge window when I've done most of my merges, so let's just get this done once and for all. This patch was mostly done with a sed-script, with manual fix-ups for the cases that weren't of the trivial 'access_ok(VERIFY_xyz' form. There were a couple of notable cases:
- csky still had the old "verify_area()" name as an alias.
- the iter_iov code had magical hardcoded knowledge of the actual values of VERIFY_{READ,WRITE} (not that they mattered, since nothing really used it)
- microblaze used the type argument for a debug printout
but other than those oddities this should be a total no-op patch. I tried to fix up all architectures, did fairly extensive grepping for access_ok() uses, and the changes are trivial, but I may have missed something. Any missed conversion should be trivially fixable, though. Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
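For reference, the mechanical conversion this patch performs; the vast majority of call sites were of this trivial form (buf and count are placeholder names):

    /* before */
    if (!access_ok(VERIFY_WRITE, buf, count))
        return -EFAULT;

    /* after */
    if (!access_ok(buf, count))
        return -EFAULT;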
-
- Dec 29, 2018
-
-
NeilBrown authored
gen_crc64table requires Linux include files to be installed in /usr/include/linux. This is a new requirement, so hosts that could previously build the kernel now cannot. gen_crc64table imposes this requirement by including <linux/swab.h>, but nothing from that header is actually used. So remove the #include, so that the Linux headers no longer need to be installed. Fixes: feba04fd ("lib: add crc64 calculation routines") Signed-off-by:
NeilBrown <neil@brown.name> Acked-by:
Coly Li <colyli@suse.de> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- Dec 28, 2018
-
-
Sri Krishna chowdary authored
Kmemleak scanning can be CPU intensive and can stall user tasks at times. To prevent this, add a config option, DEBUG_KMEMLEAK_AUTO_SCAN, to enable/disable the automatic scan on boot-up. Also protect first_run with DEBUG_KMEMLEAK_AUTO_SCAN, as it is meant only for the first automatic scan. Link: http://lkml.kernel.org/r/1540231723-7087-1-git-send-email-prpatel@nvidia.com Signed-off-by:
Sri Krishna chowdary <schowdary@nvidia.com> Signed-off-by:
Sachin Nikam <snikam@nvidia.com> Signed-off-by:
Prateek <prpatel@nvidia.com> Reviewed-by:
Catalin Marinas <catalin.marinas@arm.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Will Deacon authored
Whilst no architectures actually enable support for huge p4d mappings in the vmap area, the code that is implemented should be using break-before-make, as we do for pud and pmd huge entries. Link: http://lkml.kernel.org/r/1544120495-17438-6-git-send-email-will.deacon@arm.com Signed-off-by:
Will Deacon <will.deacon@arm.com> Reviewed-by:
Toshi Kani <toshi.kani@hpe.com> Cc: Chintan Pandya <cpandya@codeaurora.org> Cc: Toshi Kani <toshi.kani@hpe.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Michal Hocko <mhocko@suse.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Will Deacon authored
The current ioremap() code uses a phys_addr variable at each level of page table, which is confusingly offset by subtracting the base virtual address being mapped so that adding the current virtual address back on when iterating through the page table entries gives back the corresponding physical address. This is fairly confusing and results in all users of phys_addr having to add the current virtual address back on. Instead, this patch just updates phys_addr when iterating over the page table entries, ensuring that it's always up-to-date and doesn't require explicit offsetting. Link: http://lkml.kernel.org/r/1544120495-17438-5-git-send-email-will.deacon@arm.com Signed-off-by:
Will Deacon <will.deacon@arm.com> Tested-by:
Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by:
Sean Christopherson <sean.j.christopherson@intel.com> Cc: Chintan Pandya <cpandya@codeaurora.org> Cc: Toshi Kani <toshi.kani@hpe.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Michal Hocko <mhocko@suse.com> Cc: Sean Christopherson <sean.j.christopherson@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Will Deacon authored
The recently merged API for ensuring break-before-make on page-table entries when installing huge mappings in the vmalloc/ioremap region is fairly counter-intuitive, resulting in the arch freeing functions (e.g. pmd_free_pte_page()) being called even on entries that aren't present. This resulted in a minor bug in the arm64 implementation, giving rise to spurious VM_WARN messages. This patch moves the pXd_present() checks out into the core code, refactoring the callsites at the same time so that we avoid the complex conjunctions when determining whether or not we can put down a huge mapping. Link: http://lkml.kernel.org/r/1544120495-17438-2-git-send-email-will.deacon@arm.com Signed-off-by:
Will Deacon <will.deacon@arm.com> Reviewed-by:
Toshi Kani <toshi.kani@hpe.com> Suggested-by:
Linus Torvalds <torvalds@linux-foundation.org> Cc: Chintan Pandya <cpandya@codeaurora.org> Cc: Toshi Kani <toshi.kani@hpe.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Michal Hocko <mhocko@suse.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Wei Yang authored
The show_mem() function is used to print the system memory status when the user requests it or when a memory allocation fails. Generally, this is best-effort information, so any races with memory hotplug (or, very theoretically, an early initialization) should be tolerable, and the worst that could happen is to print an imprecise node state. Drop the resize lock because this is the only place which might hold the lock from interrupt context, and so all other callers may use a simple spinlock. Even though this doesn't solve any real issue, it makes the code easier to follow and a tiny bit more efficient. Link: http://lkml.kernel.org/r/20181129235532.9328-1-richard.weiyang@gmail.com Signed-off-by:
Wei Yang <richard.weiyang@gmail.com> Acked-by:
Michal Hocko <mhocko@suse.com> Reviewed-by:
Oscar Salvador <osalvador@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Arun KS authored
totalram_pages, zone->managed_pages and totalhigh_pages updates are protected by managed_page_count_lock, but readers never care about it. Convert these variables to atomic to avoid readers potentially seeing a store tear. This patch converts zone->managed_pages. Subsequent patches will convert totalram_pages and totalhigh_pages, and eventually managed_page_count_lock will be removed. The main motivation was that managed_page_count_lock handling was complicating things. It was discussed at length here, https://lore.kernel.org/patchwork/patch/995739/#1181785 So it seems better to remove the lock and convert the variables to atomic, with preventing potential store-to-read tearing as a bonus. Link: http://lkml.kernel.org/r/1542090790-21750-3-git-send-email-arunks@codeaurora.org Signed-off-by:
Arun KS <arunks@codeaurora.org> Suggested-by:
Michal Hocko <mhocko@suse.com> Suggested-by:
Vlastimil Babka <vbabka@suse.cz> Reviewed-by:
Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Reviewed-by:
David Hildenbrand <david@redhat.com> Acked-by:
Michal Hocko <mhocko@suse.com> Acked-by:
Vlastimil Babka <vbabka@suse.cz> Reviewed-by:
Pavel Tatashin <pasha.tatashin@soleen.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
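A sketch of what the conversion looks like for zone->managed_pages; the read accessor name follows the direction described above but is an assumption here, and the write site is schematic:

    /* struct zone: "unsigned long managed_pages" becomes an atomic_long_t,
     * so readers no longer risk a torn load (assumed helper name). */
    static inline unsigned long zone_managed_pages(struct zone *zone)
    {
        return (unsigned long)atomic_long_read(&zone->managed_pages);
    }

    /* writers, still confined to the init/hotplug paths: */
    atomic_long_add(count, &zone->managed_pages);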
-