  1. Jun 28, 2018
    • Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL · a11e1d43
      Linus Torvalds authored
      
      The poll() changes were not well thought out, and completely
      unexplained.  They also caused a huge performance regression, because
      "->poll()" was no longer a trivial file operation that just called down
      to the underlying file operations, but instead did at least two indirect
      calls.
      
      Indirect calls are sadly slow now with the Spectre mitigation, but the
      performance problem could at least be largely mitigated by changing the
      "->get_poll_head()" operation to just have a per-file-descriptor pointer
      to the poll head instead.  That gets rid of one of the new indirections.
      
      But that doesn't fix the new complexity that is completely unwarranted
      for the regular case.  The (undocumented) reason for the poll() changes
      was some alleged AIO poll race fixing, but we don't make the common case
      slower and more complex for some uncommon special case, so this all
      really needs way more explanations and most likely a fundamental
      redesign.
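
The cost argument above can be illustrated with a small userspace model. This is only a sketch: the struct layouts, the `demo_*` helpers, and the call counter are hypothetical and are not the kernel's definitions. It assumes the reverted design resolved a poll through two hooks, "->get_poll_head()" and "->poll_mask()", where the classic design needs a single "->poll()" call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these types are the kernel's. */
struct wait_head { int unused; };

struct file;

struct file_ops {
    /* Classic style: one indirect call returns the ready mask. */
    unsigned (*poll)(struct file *f);
    /* Reverted style: two hooks, called one after the other. */
    struct wait_head *(*get_poll_head)(struct file *f);
    unsigned (*poll_mask)(struct file *f);
};

struct file {
    const struct file_ops *ops;
    struct wait_head head;
    unsigned ready;         /* event mask the demo file reports */
    int indirect_calls;     /* calls made through function pointers */
};

static unsigned demo_poll(struct file *f)
{
    f->indirect_calls++;
    return f->ready;
}

static struct wait_head *demo_get_poll_head(struct file *f)
{
    f->indirect_calls++;
    return &f->head;
}

static unsigned demo_poll_mask(struct file *f)
{
    f->indirect_calls++;
    return f->ready;
}

static const struct file_ops demo_ops = {
    demo_poll, demo_get_poll_head, demo_poll_mask
};

/* Build a demo file that reports the given ready mask. */
struct file demo_file(unsigned ready)
{
    struct file f = { &demo_ops, { 0 }, ready, 0 };
    return f;
}

/* One indirect call per poll. */
unsigned poll_classic(struct file *f)
{
    assert(f != NULL && f->ops != NULL);
    return f->ops->poll(f);
}

/* Two indirect calls per poll: fetch the wait head, then the mask. */
unsigned poll_two_step(struct file *f)
{
    struct wait_head *head;

    assert(f != NULL && f->ops != NULL);
    head = f->ops->get_poll_head(f);
    if (head == NULL)
        return 0;
    /* A real implementation would queue the waiter on head here. */
    return f->ops->poll_mask(f);
}
```

Calling `poll_classic()` on a fresh `demo_file()` leaves `indirect_calls` at 1, while `poll_two_step()` leaves it at 2. Storing the wait head pointer directly in the file, as the message suggests, would remove the first of those two calls.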
      
      [ This revert is a revert of about 30 different commits, not reverted
        individually because that would just be unnecessarily messy  - Linus ]
      
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. Jun 27, 2018
  3. Jun 24, 2018
  4. Jun 23, 2018
  5. Jun 22, 2018
  6. Jun 21, 2018
  7. Jun 20, 2018
  8. Jun 18, 2018
  9. Jun 15, 2018
  10. Jun 14, 2018
    • Kbuild: rename HAVE_CC_STACKPROTECTOR config variable · d148eac0
      Masahiro Yamada authored
      
      HAVE_CC_STACKPROTECTOR should be selected by architectures with a stack
      canary implementation.  It is not about compiler support.
      
      For consistency with commit 050e9baa ("Kbuild: rename
      CC_STACKPROTECTOR[_STRONG] config variables"), remove 'CC_' from the
      config symbol.
      
      I moved the 'select' lines to keep the alphabetical sorting.
      
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Acked-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • dma-mapping: move all DMA mapping code to kernel/dma · cf65a0f6
      Christoph Hellwig authored
      
      Currently the code is split over various files with dma- prefixes in the
      lib/ and drivers/base directories, and the number of files keeps growing.
      Move them into a single directory to keep the code together and remove
      the file name prefixes.  To match the irq infrastructure this directory
      is placed under the kernel/ directory.
      
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • Kbuild: rename CC_STACKPROTECTOR[_STRONG] config variables · 050e9baa
      Linus Torvalds authored
      
      The changes to automatically test for working stack protector compiler
      support in the Kconfig files removed the special STACKPROTECTOR_AUTO
      option that picked the strongest stack protector that the compiler
      supported.
      
      That was all a nice cleanup - it makes no sense to have the AUTO case
      now that the Kconfig phase can just determine the compiler support
      directly.
      
      HOWEVER.
      
      It also meant that doing "make oldconfig" would now _disable_ the strong
      stackprotector if you had AUTO enabled, because in a legacy config file,
      the sane stack protector configuration would look like
      
        CONFIG_HAVE_CC_STACKPROTECTOR=y
        # CONFIG_CC_STACKPROTECTOR_NONE is not set
        # CONFIG_CC_STACKPROTECTOR_REGULAR is not set
        # CONFIG_CC_STACKPROTECTOR_STRONG is not set
        CONFIG_CC_STACKPROTECTOR_AUTO=y
      
      and when you ran this through "make oldconfig" with the Kbuild changes,
      it would ask you about the regular CONFIG_CC_STACKPROTECTOR (that had
      been renamed from CONFIG_CC_STACKPROTECTOR_REGULAR to just
      CONFIG_CC_STACKPROTECTOR), but it would think that the STRONG version
      used to be disabled (because it was really enabled by AUTO), and would
      disable it in the new config, resulting in:
      
        CONFIG_HAVE_CC_STACKPROTECTOR=y
        CONFIG_CC_HAS_STACKPROTECTOR_NONE=y
        CONFIG_CC_STACKPROTECTOR=y
        # CONFIG_CC_STACKPROTECTOR_STRONG is not set
        CONFIG_CC_HAS_SANE_STACKPROTECTOR=y
      
      That's dangerously subtle - people could suddenly find themselves with
      the weaker stack protector setup without even realizing.
      
      The solution here is to rename not just the old REGULAR stack
      protector option, but also the strong one.  This does that by just
      removing the CC_ prefix entirely for the user choices, because it really
      is not about the compiler support (the compiler support now instead
      automatically impacts _visibility_ of the options to users).
      
      This results in "make oldconfig" actually asking the user for their
      choice, so that we don't have any silent subtle security model changes.
      The end result would generally look like this:
      
        CONFIG_HAVE_CC_STACKPROTECTOR=y
        CONFIG_CC_HAS_STACKPROTECTOR_NONE=y
        CONFIG_STACKPROTECTOR=y
        CONFIG_STACKPROTECTOR_STRONG=y
        CONFIG_CC_HAS_SANE_STACKPROTECTOR=y
      
      where the "CC_" versions really are about internal compiler
      infrastructure, not the user selections.
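
Why the rename forces a prompt can be sketched with a toy model of "make oldconfig". The model below is an assumption-laden simplification: it only captures the rule that a symbol already present in the old config keeps its old answer silently, while a symbol the old config has never seen is asked about. Dependencies, choice groups and visibility are ignored, and `oldconfig_answer()` and `legacy_config` are hypothetical names, not Kconfig code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum answer { ANSWER_NO, ANSWER_YES, ANSWER_ASK };

struct old_entry {
    const char *symbol;   /* name without the CONFIG_ prefix */
    int enabled;          /* 1 for "=y", 0 for "is not set" */
};

/* The legacy config file quoted in the commit message. */
static const struct old_entry legacy_config[] = {
    { "HAVE_CC_STACKPROTECTOR",    1 },
    { "CC_STACKPROTECTOR_NONE",    0 },
    { "CC_STACKPROTECTOR_REGULAR", 0 },
    { "CC_STACKPROTECTOR_STRONG",  0 },
    { "CC_STACKPROTECTOR_AUTO",    1 },
};

static const size_t legacy_config_len =
    sizeof(legacy_config) / sizeof(legacy_config[0]);

/*
 * Toy oldconfig rule: reuse the old answer when the symbol is known,
 * otherwise ask the user.
 */
enum answer oldconfig_answer(const char *symbol)
{
    assert(symbol != NULL);
    for (size_t i = 0; i < legacy_config_len; i++) {
        if (strcmp(legacy_config[i].symbol, symbol) == 0)
            return legacy_config[i].enabled ? ANSWER_YES : ANSWER_NO;
    }
    return ANSWER_ASK;
}
```

In this model `oldconfig_answer("CC_STACKPROTECTOR_STRONG")` yields `ANSWER_NO` without any prompt, which is the silent downgrade described above, while the renamed `"STACKPROTECTOR_STRONG"` yields `ANSWER_ASK`.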
      
      Acked-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  11. Jun 11, 2018
  12. Jun 08, 2018
    • vfio/mdev: Check globally for duplicate devices · 002fe996
      Alex Williamson authored
      
      When we create an mdev device, we check for duplicates against the
      parent device and return -EEXIST if found, but the mdev device
      namespace is global since we'll link all devices from the bus.  We do
      catch this later in sysfs_do_create_link_sd() to return -EEXIST, but
      with it comes a kernel warning and stack trace for trying to create
      duplicate sysfs links, which makes it an undesirable response.
      
      Therefore we should really be looking for duplicates across all mdev
      parent devices, or as implemented here, against our mdev device list.
      Using mdev_list to prevent duplicates means that we can remove
      mdev_parent.lock, but in order not to serialize mdev device creation
      and removal globally, we add mdev_device.active which allows UUIDs to
      be reserved such that we can drop the mdev_list_lock before the mdev
      device is fully in place.
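
The reservation scheme can be sketched in userspace as follows. This is a minimal model under stated assumptions, not the driver code: the fixed-size table, the pthread mutex standing in for mdev_list_lock, and every `demo_*` name are hypothetical. It shows a UUID being reserved under the global lock with an `active` flag left clear, the lock being dropped while setup would continue, and a remove request seeing -EAGAIN until the device is marked active.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <string.h>

#define DEMO_MAX_DEVS 8
#define DEMO_UUID_LEN 37    /* 36 characters plus the terminator */

struct demo_mdev {
    char uuid[DEMO_UUID_LEN];
    bool used;      /* slot holds a reserved or live device */
    bool active;    /* set once creation has fully completed */
};

static struct demo_mdev demo_list[DEMO_MAX_DEVS];
static pthread_mutex_t demo_list_lock = PTHREAD_MUTEX_INITIALIZER;

/* Caller must hold demo_list_lock. */
static struct demo_mdev *demo_find_locked(const char *uuid)
{
    for (int i = 0; i < DEMO_MAX_DEVS; i++) {
        if (demo_list[i].used && strcmp(demo_list[i].uuid, uuid) == 0)
            return &demo_list[i];
    }
    return NULL;
}

/* Reserve the UUID globally; slow setup happens after the lock is dropped. */
int demo_reserve(const char *uuid)
{
    int ret = -ENOSPC;

    assert(uuid != NULL && strlen(uuid) < DEMO_UUID_LEN);
    pthread_mutex_lock(&demo_list_lock);
    if (demo_find_locked(uuid) != NULL) {
        ret = -EEXIST;          /* duplicate across all parents */
    } else {
        for (int i = 0; i < DEMO_MAX_DEVS; i++) {
            if (!demo_list[i].used) {
                strcpy(demo_list[i].uuid, uuid);
                demo_list[i].used = true;
                demo_list[i].active = false;
                ret = 0;
                break;
            }
        }
    }
    pthread_mutex_unlock(&demo_list_lock);
    return ret;
}

/* Mark the device fully created. */
int demo_activate(const char *uuid)
{
    struct demo_mdev *dev;
    int ret = -ENODEV;

    assert(uuid != NULL);
    pthread_mutex_lock(&demo_list_lock);
    dev = demo_find_locked(uuid);
    if (dev != NULL) {
        dev->active = true;
        ret = 0;
    }
    pthread_mutex_unlock(&demo_list_lock);
    return ret;
}

/* Remove a device; a device still being created reports -EAGAIN. */
int demo_remove(const char *uuid)
{
    struct demo_mdev *dev;
    int ret = 0;

    assert(uuid != NULL);
    pthread_mutex_lock(&demo_list_lock);
    dev = demo_find_locked(uuid);
    if (dev == NULL)
        ret = -ENODEV;
    else if (!dev->active)
        ret = -EAGAIN;
    else
        dev->used = false;
    pthread_mutex_unlock(&demo_list_lock);
    return ret;
}
```

Because only the short reserve and flag updates run under the lock, creation and removal of unrelated devices are not serialized for the duration of the full setup.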
      
      Two behavioral notes: first, mdev_parent.lock had the side-effect of
      serializing mdev create and remove ops per parent device.  This was
      an implementation detail, not an intentional guarantee provided to
      the mdev vendor drivers.  Vendor drivers can trivially provide this
      serialization internally if necessary.  Second, review comments note
      the new -EAGAIN behavior when the device, and in particular the remove
      attribute, becomes visible in sysfs.  If a remove is triggered prior
      to completion of mdev_device_create() the user will see a -EAGAIN
      error.  While the errno is different, receiving an error during this
      period is not; the previous implementation returned -ENODEV for the
      same condition.  Furthermore, the consistency to the user is improved
      in the case where mdev_device_remove_ops() returns an error.  Previously
      concurrent calls to mdev_device_remove() could see the device
      disappear with -ENODEV and then reappear if the removal failed.  Now a
      user would see -EAGAIN while the device is in this transitory state.
      
      Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
      Reviewed-by: Cornelia Huck <cohuck@redhat.com>
      Acked-by: Halil Pasic <pasic@linux.ibm.com>
      Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com>
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
    • dm: add writecache target · 48debafe
      Mikulas Patocka authored
      
      The writecache target caches writes on persistent memory or SSD.
      It is intended for databases or other programs that need extremely low
      commit latency.
      
      The writecache target doesn't cache reads because reads are supposed to
      be cached in page cache in normal RAM.
      
      If persistent memory isn't available this target can still be used in
      SSD mode.
      
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Colin Ian King <colin.king@canonical.com> # fix missing goto
      Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> # fix compilation issue with !DAX
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> # use msecs_to_jiffies
      Acked-by: Dan Williams <dan.j.williams@intel.com> # reworks to unify ARM and x86 flushing
      Signed-off-by: Mike Snitzer <msnitzer@redhat.com>
    • autofs: use autofs instead of autofs4 in documentation · b6bb226a
      Ian Kent authored
      Finally remove autofs4 references in the filesystems documentation.
      
      Link: http://lkml.kernel.org/r/152626709055.28589.416082809460051475.stgit@pluto.themaw.net
      
      Signed-off-by: Ian Kent <raven@themaw.net>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • autofs: rename autofs documentation files · 9005d833
      Ian Kent authored
      There are two files in Documentation/filesystems that should now use
      autofs rather than autofs4 in their names.
      
      Link: http://lkml.kernel.org/r/152626707957.28589.3325300375892913999.stgit@pluto.themaw.net
      
      Signed-off-by: Ian Kent <raven@themaw.net>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: memcg: allow lowering memory.swap.max below the current usage · be09102b
      Tejun Heo authored
      Currently an attempt to set swap.max to a value lower than the actual
      swap usage fails, which causes configuration problems as there's no way
      of lowering the configuration below the current usage short of turning
      off swap entirely.  This makes swap.max difficult to use and allows
      delegatees to lock the delegator out of reducing swap allocation.
      
      This patch updates swap_max_write() so that the limit can be lowered
      below the current usage.  It doesn't implement active reclaiming of swap
      entries for the following reasons.
      
      * mem_cgroup_swap_full() already tells the swap machinery to
        aggressively reclaim swap entries if the usage is above 50% of
        limit, so simply lowering the limit automatically triggers gradual
        reclaim.
      
      * Forcing back swapped out pages is likely to heavily impact the
        workload and mess up the working set.  Given that swap usually is a
        lot less valuable and less scarce, letting the existing usage
        dissipate over time through the above gradual reclaim and as they're
        faulted back in is likely the better behavior.
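
The interaction between the two points above can be modelled in a few lines. This is a sketch only: `struct demo_swap_counter` and the `demo_*` functions are hypothetical, and treating "above 50% of limit" as "usage times two reaches the limit" is an assumption about the exact threshold. It shows that accepting a limit below the current usage is enough to make the "swap is full" hint turn true, so no explicit reclaim pass is needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a per-cgroup swap counter, in pages. */
struct demo_swap_counter {
    uint64_t usage;
    uint64_t max;
};

/*
 * Accept any new limit, even one below the current usage. Existing
 * swap entries are left alone; only future charges are affected.
 */
void demo_swap_max_write(struct demo_swap_counter *c, uint64_t new_max)
{
    assert(c != NULL);
    c->max = new_max;
}

/*
 * Hint for aggressive swap-entry reclaim: true once usage reaches half
 * of the limit. Written without multiplying so a huge "unlimited" max
 * cannot overflow.
 */
bool demo_swap_full(const struct demo_swap_counter *c)
{
    assert(c != NULL);
    if (c->usage > c->max)
        return true;
    return c->usage >= c->max - c->usage;
}
```

With 40 pages in use and a limit of 100, `demo_swap_full()` is false. Lowering the limit to 60, or even to 30 (below the usage), succeeds and flips the hint to true, which is the gradual reclaim trigger the message relies on.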
      
      Link: http://lkml.kernel.org/r/20180523185041.GR1718769@devbig577.frc2.facebook.com
      
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Roman Gushchin <guro@fb.com>
      Acked-by: Rik van Riel <riel@surriel.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Shaohua Li <shli@fb.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • memcg: introduce memory.min · bf8d5d52
      Roman Gushchin authored
      The memory controller implements the memory.low best-effort memory
      protection mechanism, which works perfectly in many cases and allows
      protecting working sets of important workloads from sudden reclaim.
      
      But its semantics have a significant limitation: it works only as long as
      there is a supply of reclaimable memory.  This makes it pretty useless
      against any sort of slow memory leaks or memory usage increases.  This
      is especially true for swapless systems.  If swap is enabled, memory
      soft protection effectively postpones problems, allowing a leaking
      application to fill all swap area, which makes no sense.  The only
      effective way to guarantee the memory protection in this case is to
      invoke the OOM killer.
      
      It's possible to handle this case in userspace by reacting to MEMCG_LOW
      events; but there is still a place for a fail-safe in-kernel mechanism
      to provide stronger guarantees.
      
      This patch introduces the memory.min interface for cgroup v2 memory
      controller.  It works very similarly to memory.low (sharing the same
      hierarchical behavior), except that it's not disabled if there is no
      more reclaimable memory in the system.
      
      If a cgroup is not populated, its memory.min is ignored, because otherwise
      even the OOM killer wouldn't be able to reclaim the protected memory,
      and the system can stall.
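
The difference between the two protections can be summarised in a flat model. This is a hedged sketch, not the kernel's algorithm: it ignores the hierarchical distribution of protection that memory.min shares with memory.low, and the struct, enum and `demo_*` names are hypothetical. It assumes that usage at or below memory.min is never reclaimed, that usage at or below memory.low is reclaimed only when nothing unprotected is left, and that memory.min is ignored for an unpopulated cgroup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum demo_protection { DEMO_PROT_NONE, DEMO_PROT_LOW, DEMO_PROT_MIN };

/* Hypothetical flat view of one cgroup; all sizes in pages. */
struct demo_cgroup {
    unsigned long usage;
    unsigned long low;      /* memory.low */
    unsigned long min;      /* memory.min */
    bool populated;         /* has at least one task */
};

/* Classify how strongly the cgroup's memory is protected. */
enum demo_protection demo_protection(const struct demo_cgroup *cg)
{
    assert(cg != NULL);
    if (cg->usage == 0)
        return DEMO_PROT_NONE;          /* nothing to protect */
    if (cg->populated && cg->usage <= cg->min)
        return DEMO_PROT_MIN;           /* hard guarantee */
    if (cg->usage <= cg->low)
        return DEMO_PROT_LOW;           /* best effort */
    return DEMO_PROT_NONE;
}

/*
 * Decide whether reclaim may take pages from this cgroup.
 * other_reclaimable says whether unprotected memory still exists.
 */
bool demo_may_reclaim(const struct demo_cgroup *cg, bool other_reclaimable)
{
    switch (demo_protection(cg)) {
    case DEMO_PROT_MIN:
        return false;                   /* never, even under pressure */
    case DEMO_PROT_LOW:
        return !other_reclaimable;      /* only as a last resort */
    default:
        return true;
    }
}
```

In this model a populated cgroup below its memory.min stays untouched even when `other_reclaimable` is false, which is the case where the message expects the OOM killer to step in, whereas a cgroup protected only by memory.low becomes reclaimable.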
      
      [guro@fb.com: s/low/min/ in docs]
      Link: http://lkml.kernel.org/r/20180510130758.GA9129@castle.DHCP.thefacebook.com
      Link: http://lkml.kernel.org/r/20180509180734.GA4856@castle.DHCP.thefacebook.com
      
      Signed-off-by: Roman Gushchin <guro@fb.com>
      Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/docs: describe memory.low refinements · 7854207f
      Roman Gushchin authored
      Refine cgroup v2 docs after latest memory.low changes.
      
      Link: http://lkml.kernel.org/r/20180405185921.4942-4-guro@fb.com
      
      Signed-off-by: Roman Gushchin <guro@fb.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: introduce ARCH_HAS_PTE_SPECIAL · 3010a5ea
      Laurent Dufour authored
      Currently PTE special support is turned on in per-architecture
      header files.  Most of the time, it is defined in
      arch/*/include/asm/pgtable.h, sometimes depending on some other
      per-architecture static definition.
      
      This patch introduces a new configuration variable to manage this
      directly in the Kconfig files.  It will later replace
      __HAVE_ARCH_PTE_SPECIAL.
      
      Here are notes for some architectures where the definition of
      __HAVE_ARCH_PTE_SPECIAL is not obvious:
      
      arm:
      __HAVE_ARCH_PTE_SPECIAL is currently defined in
      arch/arm/include/asm/pgtable-3level.h, which is included by
      arch/arm/include/asm/pgtable.h when CONFIG_ARM_LPAE is set.
      So select ARCH_HAS_PTE_SPECIAL if ARM_LPAE.
      
      powerpc:
      __HAVE_ARCH_PTE_SPECIAL is defined in 2 files:
       - arch/powerpc/include/asm/book3s/64/pgtable.h
       - arch/powerpc/include/asm/pte-common.h
      The first one is included if (PPC_BOOK3S & PPC64) while the second is
      included in all the other cases.
      So select ARCH_HAS_PTE_SPECIAL all the time.
      
      sparc:
      __HAVE_ARCH_PTE_SPECIAL is defined if defined(__sparc__) &&
      defined(__arch64__), which are defined through the compiler in
      sparc/Makefile if !SPARC32, which I assume to be if SPARC64.
      So select ARCH_HAS_PTE_SPECIAL if SPARC64.
      
      There is no functional change introduced by this patch.
      
      Link: http://lkml.kernel.org/r/1523433816-14460-2-git-send-email-ldufour@linux.vnet.ibm.com
      
      Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
      Suggested-by: Jerome Glisse <jglisse@redhat.com>
      Reviewed-by: Jerome Glisse <jglisse@redhat.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Rich Felker <dalias@libc.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Palmer Dabbelt <palmer@sifive.com>
      Cc: Albert Ou <albert@sifive.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Christophe LEROY <christophe.leroy@c-s.fr>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>