Commit 223e23e8 authored by Will Deacon, committed by Catalin Marinas

arm64: lib: improve copy_page to deal with 128 bytes at a time


We want to avoid lots of different copy_page implementations, settling
for something that is "good enough" everywhere and hopefully easy to
understand and maintain whilst we're at it.

This patch reworks our copy_page implementation based on discussions
with Cavium on the list and benchmarking on Cortex-A processors so that:

  - The loop is unrolled to copy 128 bytes per iteration

  - The reads are offset so that we read from the next 128-byte block
    in the same iteration that we store the previous block

  - Explicit prefetch instructions are removed for now, since they hurt
    performance on CPUs with hardware prefetching

  - The loop exit condition is calculated at the start of the loop
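
As a rough illustration of the loop structure described above, here is a
minimal C sketch. It is not the arm64 assembly from this patch: the name
copy_page_sketch, the SKETCH_* constants and the 4096-byte page size are
assumptions made for the example, and a plain array stands in for the
registers that hold one 128-byte block.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants for this sketch; the kernel uses PAGE_SIZE. */
#define SKETCH_PAGE_SIZE   4096
#define SKETCH_BLOCK_BYTES 128
#define SKETCH_BLOCK_WORDS (SKETCH_BLOCK_BYTES / sizeof(uint64_t))

/* Copy one page, storing the previous 128-byte block while loading the next. */
static void copy_page_sketch(void *to, const void *from)
{
    uint64_t *dst = to;
    const uint64_t *src = from;
    uint64_t block[SKETCH_BLOCK_WORDS];  /* stands in for 16 registers */
    long remaining = SKETCH_PAGE_SIZE - SKETCH_BLOCK_BYTES;
    size_t i;

    /* Prime the pipeline: load the first block before entering the loop. */
    for (i = 0; i < SKETCH_BLOCK_WORDS; i++)
        block[i] = src[i];
    src += SKETCH_BLOCK_WORDS;

    do {
        /* The exit condition is computed at the start of the iteration. */
        remaining -= SKETCH_BLOCK_BYTES;

        /* Store the block loaded earlier and load the next one. */
        for (i = 0; i < SKETCH_BLOCK_WORDS; i++) {
            dst[i] = block[i];
            block[i] = src[i];
        }
        dst += SKETCH_BLOCK_WORDS;
        src += SKETCH_BLOCK_WORDS;
    } while (remaining > 0);

    /* Drain the pipeline: store the last block that was loaded. */
    for (i = 0; i < SKETCH_BLOCK_WORDS; i++)
        dst[i] = block[i];
}
```

Because the loads run one block ahead of the stores, the loop body runs
one time fewer than the number of blocks in the page, and the last block
is stored after the loop so that nothing is read past the end of the
source page.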

Signed-off-by: Will Deacon <will.deacon@arm.com>
Tested-by: Andrew Pinski <apinski@cavium.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
parent d5370f75