- Apr 21, 2019
-
-
Jordan Crouse authored
Add documentation for the interconnect and interconnect-names bindings for the GPU node as detailed by bindings/interconnect/interconnect.txt. Signed-off-by:
Jordan Crouse <jcrouse@codeaurora.org> Reviewed-by:
Douglas Anderson <dianders@chromium.org> Reviewed-by:
Rob Herring <robh@kernel.org> Acked-by:
Georgi Djakov <georgi.djakov@linaro.org> Signed-off-by:
Rob Clark <robdclark@chromium.org>
-
- Apr 19, 2019
-
-
Jordan Crouse authored
The GMU should have two power domains defined: "cx" and "gx". "cx" is the actual power domain for the device and "gx" will be attached at runtime to manage reference counting on the GPU device in case of a GMU crash. Signed-off-by:
Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by:
Rob Clark <robdclark@chromium.org>
-
- Apr 07, 2019
-
-
Maxime Ripard authored
Commit fd73403a ("dt-bindings: arm: Add SMP enable-method for Milbeaut") added support for a new cpu enable-method, but did so using tabulations to ident. This is however invalid in the syntax, and resulted in a failure when trying to use that schemas for validation. Use spaces instead of tabs to indent to fix this. Fixes: fd73403a ("dt-bindings: arm: Add SMP enable-method for Milbeaut") Signed-off-by:
Maxime Ripard <maxime.ripard@bootlin.com> Reviewed-by:
Rob Herring <robh@kernel.org> Acked-by:
Sugaya Taichi <sugaya.taichi@socionext.com> Signed-off-by:
Olof Johansson <olof@lixom.net>
-
- Apr 06, 2019
-
-
Waiman Long authored
The output of the PSI files show a bunch of numbers with no unit. The psi.txt documentation file also does not indicate what units are used. One can only find out by looking at the source code. The units are percentage for the averages and useconds for the total. Make the information easier to find by documenting the units in psi.txt. Link: http://lkml.kernel.org/r/20190402193810.3450-1-longman@redhat.com Signed-off-by:
Waiman Long <longman@redhat.com> Acked-by:
Johannes Weiner <hannes@cmpxchg.org> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org> Cc: Tejun Heo <tj@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Dave Rodgman authored
For very short input data (0 - 1 bytes), lzo-rle was not behaving correctly. Fix this behaviour and update documentation accordingly. For zero-length input, lzo v0 outputs an end-of-stream marker only, which was misinterpreted by lzo-rle as a bitstream version number. Ensure bitstream versions > 0 require a minimum stream length of 5. Also fixes a bug in handling the tail for very short inputs when a bitstream version is present. Link: http://lkml.kernel.org/r/20190326165857.34613-1-dave.rodgman@arm.com Signed-off-by:
Dave Rodgman <dave.rodgman@arm.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- Apr 04, 2019
-
-
Stanislav Fomichev authored
Rename bpf_flow_dissector.txt to bpf_flow_dissector.rst and fix formatting. Also, link it from the Documentation/networking/index.rst. Tested with 'make htmldocs' to make sure it looks reasonable. Fixes: ae82899b ("flow_dissector: document BPF flow dissector environment") Signed-off-by:
Stanislav Fomichev <sdf@google.com> Acked-by:
Martin KaFai Lau <kafai@fb.com> Signed-off-by:
Daniel Borkmann <daniel@iogearbox.net>
-
- Apr 03, 2019
-
-
Stanislav Fomichev authored
Short doc on what BPF flow dissector should expect in the input __sk_buff and flow_keys. Signed-off-by:
Stanislav Fomichev <sdf@google.com> Signed-off-by:
Daniel Borkmann <daniel@iogearbox.net>
-
- Mar 29, 2019
-
-
Carlos Menin authored
By default, cells in DT are 32-bit in size. The driver reads "ti,mode" using the function of_property_read_u8() which causes the value to be read incorrectly in little-endian architectures if the size is not specified. Make it explicit in the binding documentation that this prorperty must be set as a 8-bit value. Signed-off-by:
Carlos Menin <menin@carlosaurelio.net> Reviewed-by:
Rob Herring <robh@kernel.org> Signed-off-by:
Guenter Roeck <linux@roeck-us.net>
-
- Mar 28, 2019
-
-
Paolo Bonzini authored
The documentation does not mention how to delete a slot, add the information. Reported-by:
Nathaniel McCallum <npmccallum@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
The series to add memcg accounting to KVM allocations[1] states: There are many KVM kernel memory allocations which are tied to the life of the VM process and should be charged to the VM process's cgroup. While it is correct to account KVM kernel allocations to the cgroup of the process that created the VM, it's technically incorrect to state that the KVM kernel memory allocations are tied to the life of the VM process. This is because the VM itself, i.e. struct kvm, is not tied to the life of the process which created it, rather it is tied to the life of its associated file descriptor. In other words, kvm_destroy_vm() is not invoked until fput() decrements its associated file's refcount to zero. A simple example is to fork() in Qemu and have the child sleep indefinitely; kvm_destroy_vm() isn't called until Qemu closes its file descriptor *and* the rogue child is killed. The allocations are guaranteed to be *accounted* to the process which created the VM, but only because KVM's per-{VM,vCPU} ioctls reject the ioctl() with -EIO if kvm->mm != current->mm. I.e. the child can keep the VM "alive" but can't do anything useful with its reference. Note that because 'struct kvm' also holds a reference to the mm_struct of its owner, the above behavior also applies to userspace allocations. Given that mucking with a VM's file descriptor can lead to subtle and undesirable behavior, e.g. memcg charges persisting after a VM is shut down, explicitly document a VM's lifecycle and its impact on the VM's resources. Alternatively, KVM could aggressively free resources when the creating process exits, e.g. via mmu_notifier->release(). However, mmu_notifier isn't guaranteed to be available, and freeing resources when the creator exits is likely to be error prone and fragile as KVM would need to ensure that it only freed resources that are truly out of reach. In practice, the existing behavior shouldn't be problematic as a properly configured system will prevent a child process from being moved out of the appropriate cgroup hierarchy, i.e. prevent hiding the process from the OOM killer, and will prevent an unprivileged user from being able to to hold a reference to struct kvm via another method, e.g. debugfs. [1]https://patchwork.kernel.org/patch/10806707/ Signed-off-by:
Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
KVM's API requires thats ioctls must be issued from the same process that created the VM. In other words, userspace can play games with a VM's file descriptors, e.g. fork(), SCM_RIGHTS, etc..., but only the creator can do anything useful. Explicitly reject device ioctls that are issued by a process other than the VM's creator, and update KVM's API documentation to extend its requirements to device ioctls. Fixes: 852b6d57 ("kvm: add device control API") Cc: <stable@vger.kernel.org> Signed-off-by:
Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Per Paolo[1], instantiating multiple VMs in a single process is legal; but this conflicts with KVM's API documentation, which states: The only supported use is one virtual machine per process, and one vcpu per thread. However, an earlier section in the documentation states: Only run VM ioctls from the same process (address space) that was used to create the VM. and: Only run vcpu ioctls from the same thread that was used to create the vcpu. This suggests that the conflicting documentation is simply an incorrect ordering of of words, i.e. what's really meant is that a virtual machine can't be shared across multiple processes and a vCPU can't be shared across multiple threads. Tweak the blurb on issuing ioctls to use a more assertive tone, and rewrite the "supported use" sentence to reference said blurb instead of poorly restating it in different terms. Opportunistically add missing punctuation. [1] https://lkml.kernel.org/r/f23265d4-528e-3bd4-011f-4d7b8f3281db@redhat.com Fixes: 9c1b96e3 ("KVM: Document basic API") Signed-off-by:
Sean Christopherson <sean.j.christopherson@intel.com> [Improve notes on asynchronous ioctl] Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
The cr4_pae flag is a bit of a misnomer, its purpose is really to track whether the guest PTE that is being shadowed is a 4-byte entry or an 8-byte entry. Prior to supporting nested EPT, the size of the gpte was reflected purely by CR4.PAE. KVM fudged things a bit for direct sptes, but it was mostly harmless since the size of the gpte never mattered. Now that a spte may be tracking an indirect EPT entry, relying on CR4.PAE is wrong and ill-named. For direct shadow pages, force the gpte_size to '1' as they are always 8-byte entries; EPT entries can only be 8-bytes and KVM always uses 8-byte entries for NPT and its identity map (when running with EPT but not unrestricted guest). Likewise, nested EPT entries are always 8-bytes. Nested EPT presents a unique scenario as the size of the entries are not dictated by CR4.PAE, but neither is the shadow page a direct map. To handle this scenario, set cr0_wp=1 and smap_andnot_wp=1, an otherwise impossible combination, to denote a nested EPT shadow page. Use the information to avoid incorrectly zapping an unsync'd indirect page in __kvm_sync_page(). Providing a consistent and accurate gpte_size fixes a bug reported by Vitaly where fast_cr3_switch() always fails when switching from L2 to L1 as kvm_mmu_get_page() would force role.cr4_pae=0 for direct pages, whereas kvm_calc_mmu_role_common() would set it according to CR4.PAE. Fixes: 7dcd5755 ("x86/kvm/mmu: check if tdp/shadow MMU reconfiguration is needed") Reported-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Tested-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
David Howells authored
Update the mount API docs to reflect recent changes to the code. Signed-off-by:
David Howells <dhowells@redhat.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- Mar 27, 2019
-
-
Erin Lo authored
This adds dt-binding documentation of uart for Mediatek MT8183 SoC Platform. Signed-off-by:
Erin Lo <erin.lo@mediatek.com> Acked-by:
Rob Herring <robh@kernel.org> Acked-by:
Matthias Brugger <matthias.bgg@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Wolfram Sang authored
If we use the "i2c-" prefix for the binding documentation file name, then it should match the file name of the driver, if possible. It is possible for this driver, so rename it. Signed-off-by:
Wolfram Sang <wsa@the-dreams.de>
-
Wolfram Sang authored
If we use the "i2c-" prefix for the binding documentation file name, then it should match the file name of the driver, if possible. It is possible for this driver, so rename it. Signed-off-by:
Wolfram Sang <wsa@the-dreams.de> Acked-by:
Maxime Ripard <maxime.ripard@bootlin.com>
-
Wolfram Sang authored
If we use the "i2c-" prefix for the binding documentation file name, then it should match the file name of the driver, if possible. It is possible for this driver, so rename it. Signed-off-by:
Wolfram Sang <wsa@the-dreams.de>
-
Wolfram Sang authored
If we use the "i2c-" prefix for the binding documentation file name, then it should match the file name of the driver, if possible. It is possible for this driver, so rename it. Signed-off-by:
Wolfram Sang <wsa@the-dreams.de>
-
Wolfram Sang authored
If we use the "i2c-" prefix for the binding documentation file name, then it should match the file name of the driver, if possible. It is possible for this driver, so rename it. Signed-off-by:
Wolfram Sang <wsa@the-dreams.de>
-
- Mar 26, 2019
-
-
Jesper Dangaard Brouer authored
Section 2.2.1 BTF_KIND_INT a bullet list was collapsed due to text reflow in commit 9ab5305d ("docs/btf: reflow text to fill up to 78 characters"). This patch correct the mistake. Also adjust next bullet list, which is used for comparison, to get rendered the same way. Fixes: 9ab5305d ("docs/btf: reflow text to fill up to 78 characters") Link: https://www.kernel.org/doc/html/latest/bpf/btf.html#btf-kind-int Signed-off-by:
Jesper Dangaard Brouer <brouer@redhat.com> Acked-by:
Andrii Nakryiko <andriin@fb.com> Signed-off-by:
Alexei Starovoitov <ast@kernel.org>
-
Christian Lamparter authored
This patch updates the qca8k's binding to document to the approach for using the internal mdio-bus of the supported qca8k switches. Reviewed-by:
Florian Fainelli <f.fainelli@gmail.com> Signed-off-by:
Christian Lamparter <chunkeey@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Christian Lamparter authored
In the example, the phy at phy@0 is clashing with the switch0@0 at the same address. Usually, the switches are accessible through pseudo PHYs which in case of the qca8k are located at 0x10 - 0x18. Reviewed-by:
Florian Fainelli <f.fainelli@gmail.com> Signed-off-by:
Christian Lamparter <chunkeey@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Mar 21, 2019
-
-
Fabrizio Castro authored
Document RZ/G2E (R8A774C0) SoC bindings. Signed-off-by:
Fabrizio Castro <fabrizio.castro@bp.renesas.com> Reviewed-by:
Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by:
Simon Horman <horms+renesas@verge.net.au> Reviewed-by:
Rob Herring <robh@kernel.org> Signed-off-by:
Marc Zyngier <marc.zyngier@arm.com>
-
- Mar 20, 2019
-
-
Jarkko Nikula authored
Add PCI ID for Intel Comet Lake PCH. Signed-off-by:
Jarkko Nikula <jarkko.nikula@linux.intel.com> Reviewed-by:
Jean Delvare <jdelvare@suse.de> Signed-off-by:
Wolfram Sang <wsa@the-dreams.de>
-
- Mar 19, 2019
-
-
Pablo Neira Ayuso authored
No direct transition from prerouting to forward hook, routing lookup needs to happen first. Fixes: 19b351f1 ("netfilter: add flowtable documentation") Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org>
-
Florian Fainelli authored
Provide an explanation of what is expected with respect to sending new versions of specific patches within a patch series, as well as what happens if an earlier patch series accidentally gets merged). Signed-off-by:
Florian Fainelli <f.fainelli@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Mar 18, 2019
-
-
Tobias Klauser authored
Use https and link to the patch directly. Signed-off-by:
Tobias Klauser <tklauser@distanz.ch> Acked-by:
Willem de Bruijn <willemb@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Randy Dunlap authored
Fix documentation markup warnings in snmp_counter.rst: Documentation/networking/snmp_counter.rst:416: WARNING: Title underline too short. Documentation/networking/snmp_counter.rst:684: WARNING: Bullet list ends without a blank line; unexpected unindent. Documentation/networking/snmp_counter.rst:693: WARNING: Title underline too short. Documentation/networking/snmp_counter.rst:707: WARNING: Bullet list ends without a blank line; unexpected unindent. Documentation/networking/snmp_counter.rst:712: WARNING: Bullet list ends without a blank line; unexpected unindent. Documentation/networking/snmp_counter.rst:722: WARNING: Title underline too short. Documentation/networking/snmp_counter.rst:733: WARNING: Bullet list ends without a blank line; unexpected unindent. Documentation/networking/snmp_counter.rst:736: WARNING: Bullet list ends without a blank line; unexpected unindent. Documentation/networking/snmp_counter.rst:739: WARNING: Bullet list ends without a blank line; unexpected unindent. Fixes: 80cc4950 ("net: Add part of TCP counts explanations in snmp_counters.rst") Fixes: 8e2ea53a ("add snmp counters document") Fixes: a6c7c7aa ("net: add document for several snmp counters") Signed-off-by:
Randy Dunlap <rdunlap@infradead.org> Cc: yupeng <yupeng0921@gmail.com>
-
- Mar 17, 2019
-
-
Masahiro Yamada authored
Currently, every arch/*/include/uapi/asm/Kbuild explicitly includes the common Kbuild.asm file. Factor out the duplicated include directives to scripts/Makefile.asm-generic so that no architecture would opt out of the mandatory-y mechanism. um is not forced to include mandatory-y since it is a very exceptional case which does not support UAPI. Signed-off-by:
Masahiro Yamada <yamada.masahiro@socionext.com>
-
- Mar 15, 2019
-
-
Sean Christopherson authored
The series to add memcg accounting to KVM allocations[1] states: There are many KVM kernel memory allocations which are tied to the life of the VM process and should be charged to the VM process's cgroup. While it is correct to account KVM kernel allocations to the cgroup of the process that created the VM, it's technically incorrect to state that the KVM kernel memory allocations are tied to the life of the VM process. This is because the VM itself, i.e. struct kvm, is not tied to the life of the process which created it, rather it is tied to the life of its associated file descriptor. In other words, kvm_destroy_vm() is not invoked until fput() decrements its associated file's refcount to zero. A simple example is to fork() in Qemu and have the child sleep indefinitely; kvm_destroy_vm() isn't called until Qemu closes its file descriptor *and* the rogue child is killed. The allocations are guaranteed to be *accounted* to the process which created the VM, but only because KVM's per-{VM,vCPU} ioctls reject the ioctl() with -EIO if kvm->mm != current->mm. I.e. the child can keep the VM "alive" but can't do anything useful with its reference. Note that because 'struct kvm' also holds a reference to the mm_struct of its owner, the above behavior also applies to userspace allocations. Given that mucking with a VM's file descriptor can lead to subtle and undesirable behavior, e.g. memcg charges persisting after a VM is shut down, explicitly document a VM's lifecycle and its impact on the VM's resources. Alternatively, KVM could aggressively free resources when the creating process exits, e.g. via mmu_notifier->release(). However, mmu_notifier isn't guaranteed to be available, and freeing resources when the creator exits is likely to be error prone and fragile as KVM would need to ensure that it only freed resources that are truly out of reach. In practice, the existing behavior shouldn't be problematic as a properly configured system will prevent a child process from being moved out of the appropriate cgroup hierarchy, i.e. prevent hiding the process from the OOM killer, and will prevent an unprivileged user from being able to to hold a reference to struct kvm via another method, e.g. debugfs. [1]https://patchwork.kernel.org/patch/10806707/ Signed-off-by:
Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Steve French authored
Also updated a comment describing use of the GlobalMid_Lock Signed-off-by:
Steve French <stfrench@microsoft.com>
-
- Mar 12, 2019
-
-
Mariusz Dabrowski authored
When the Partial Parity Log is enabled, circular buffer is used to store PPL data. Each write to RAID device causes overwrite of data in this buffer so some write_hint can be set to those request to help drives handle garbage collection. This patch adds new sysfs attribute which can be used to specify which write_hint should be assigned to PPL. Acked-by:
Guoqing Jiang <gqjiang@suse.com> Signed-off-by:
Mariusz Dabrowski <mariusz.dabrowski@intel.com> Signed-off-by:
Song Liu <songliubraving@fb.com>
-
Kent Overstreet authored
All existing users have been converted to generic radix trees Link: http://lkml.kernel.org/r/20181217131929.11727-8-kent.overstreet@gmail.com Signed-off-by:
Kent Overstreet <kent.overstreet@gmail.com> Acked-by:
Dave Hansen <dave.hansen@intel.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Eric Paris <eparis@parisplace.org> Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Paul Moore <paul@paul-moore.com> Cc: Pravin B Shelar <pshelar@ovn.org> Cc: Shaohua Li <shli@kernel.org> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Kent Overstreet authored
Very simple radix tree implementation that supports storing arbitrary size entries, up to PAGE_SIZE - upcoming patches will convert existing flex_array users to genradixes. The new genradix code has a much simpler API and implementation, and doesn't have a hard limit on the number of elements like flex_array does. Link: http://lkml.kernel.org/r/20181217131929.11727-5-kent.overstreet@gmail.com Signed-off-by:
Kent Overstreet <kent.overstreet@gmail.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Eric Paris <eparis@parisplace.org> Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Paul Moore <paul@paul-moore.com> Cc: Pravin B Shelar <pshelar@ovn.org> Cc: Shaohua Li <shli@kernel.org> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- Mar 11, 2019
-
-
xiaofeis authored
Add documentation for a new optional property local-mac-address which is described in ethernet.txt. Signed-off-by:
xiaofeis <xiaofeis@codeaurora.org> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Mar 08, 2019
-
-
Christophe Roullier authored
Syscfg clock is no more needed. Signed-off-by:
Christophe Roullier <christophe.roullier@st.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Christophe Roullier authored
Add properties to support all Phy config PHY_MODE (MII,GMII, RMII, RGMII) and in normal, PHY wo crystal (25Mhz), PHY wo crystal (50Mhz), No 125Mhz from PHY config. Signed-off-by:
Christophe Roullier <christophe.roullier@st.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Dave Rodgman authored
To prevent any issues with persistent data, separate lzo-rle from lzo so that it is treated as a separate algorithm, and lzo is still available. Link: http://lkml.kernel.org/r/20190205155944.16007-3-dave.rodgman@arm.com Signed-off-by:
Dave Rodgman <dave.rodgman@arm.com> Cc: David S. Miller <davem@davemloft.net> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Markus F.X.J. Oberhumer <markus@oberhumer.com> Cc: Matt Sealey <matt.sealey@arm.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <nitingupta910@gmail.com> Cc: Richard Purdie <rpurdie@openedhand.com> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: Sonny Rao <sonnyrao@google.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Dave Rodgman authored
Patch series "lib/lzo: run-length encoding support", v5. Following on from the previous lzo-rle patchset: https://lkml.org/lkml/2018/11/30/972 This patchset contains only the RLE patches, and should be applied on top of the non-RLE patches ( https://lkml.org/lkml/2019/2/5/366 ). Previously, some questions were raised around the RLE patches. I've done some additional benchmarking to answer these questions. In short: - RLE offers significant additional performance (data-dependent) - I didn't measure any regressions that were clearly outside the noise One concern with this patchset was around performance - specifically, measuring RLE impact separately from Matt Sealey's patches (CTZ & fast copy). I have done some additional benchmarking which I hope clarifies the benefits of each part of the patchset. Firstly, I've captured some memory via /dev/fmem from a Chromebook with many tabs open which is starting to swap, and then split this into 4178 4k pages. I've excluded the all-zero pages (as zram does), and also the no-zero pages (which won't tell us anything about RLE performance). This should give a realistic test dataset for zram. What I found was that the data is VERY bimodal: 44% of pages in this dataset contain 5% or fewer zeros, and 44% contain over 90% zeros (30% if you include the no-zero pages). This supports the idea of special-casing zeros in zram. Next, I've benchmarked four variants of lzo on these pages (on 64-bit Arm at max frequency): baseline LZO; baseline + Matt Sealey's patches (aka MS); baseline + RLE only; baseline + MS + RLE. Numbers are for weighted roundtrip throughput (the weighting reflects that zram does more compression than decompression). https://drive.google.com/file/d/1VLtLjRVxgUNuWFOxaGPwJYhl_hMQXpHe/view?usp=sharing Matt's patches help in all cases for Arm (and no effect on Intel), as expected. RLE also behaves as expected: with few zeros present, it makes no difference; above ~75%, it gives a good improvement (50 - 300 MB/s on top of the benefit from Matt's patches). Best performance is seen with both MS and RLE patches. Finally, I have benchmarked the same dataset on an x86-64 device. Here, the MS patches make no difference (as expected); RLE helps, similarly as on Arm. There were no definite regressions; allowing for observational error, 0.1% (3/4178) of cases had a regression > 1 standard deviation, of which the largest was 4.6% (1.2 standard deviations). I think this is probably within the noise. https://drive.google.com/file/d/1xCUVwmiGD0heEMx5gcVEmLBI4eLaageV/view?usp=sharing One point to note is that the graphs show RLE appears to help very slightly with no zeros present! This is because the extra code causes the clang optimiser to change code layout in a way that happens to have a significant benefit. Taking baseline LZO and adding a do-nothing line like "__builtin_prefetch(out_len);" immediately before the "goto next" has the same effect. So this is a real, but basically spurious effect - it's small enough not to upset the overall findings. This patch (of 3): When using zram, we frequently encounter long runs of zero bytes. This adds a special case which identifies runs of zeros and encodes them using run-length encoding. This is faster for both compression and decompresion. For high-entropy data which doesn't hit this case, impact is minimal. Compression ratio is within a few percent in all cases. This modifies the bitstream in a way which is backwards compatible (i.e., we can decompress old bitstreams, but old versions of lzo cannot decompress new bitstreams). Link: http://lkml.kernel.org/r/20190205155944.16007-2-dave.rodgman@arm.com Signed-off-by:
Dave Rodgman <dave.rodgman@arm.com> Cc: David S. Miller <davem@davemloft.net> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Markus F.X.J. Oberhumer <markus@oberhumer.com> Cc: Matt Sealey <matt.sealey@arm.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <nitingupta910@gmail.com> Cc: Richard Purdie <rpurdie@openedhand.com> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: Sonny Rao <sonnyrao@google.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-