System freeze (GPU hang) with i915 driver on Intel UHD 620
The i915 driver causes a system freeze (GPU hang) on the kernel 5.4 series. Kernels 4.19 and 5.5 don't have this issue.
Here is a part of journal after GPU hang with kernel 5.4.6-2:
...
Dec 25 14:14:04 andrey-lenovo kernel: i915 0000:00:02.0: GPU HANG: ecode 9:1:0x00000000, hang on rcs0
Dec 25 14:14:04 andrey-lenovo kernel: GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
Dec 25 14:14:04 andrey-lenovo kernel: Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
Dec 25 14:14:04 andrey-lenovo kernel: drm/i915 developers can then reassign to the right component if it's not a kernel issue.
Dec 25 14:14:04 andrey-lenovo kernel: The GPU crash dump is required to analyze GPU hangs, so please always attach it.
Dec 25 14:14:04 andrey-lenovo kernel: GPU crash dump saved to /sys/class/drm/card0/error
Dec 25 14:14:04 andrey-lenovo kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Dec 25 14:14:04 andrey-lenovo kernel: [drm:gen8_reset_engines [i915]] *ERROR* rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
Dec 25 14:14:04 andrey-lenovo kernel: i915 0000:00:02.0: Resetting chip for hang on rcs0
Dec 25 14:14:04 andrey-lenovo kernel: [drm:gen8_reset_engines [i915]] *ERROR* rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
Dec 25 14:14:04 andrey-lenovo kernel: [drm:gen8_reset_engines [i915]] *ERROR* rcs0 reset request timed out: {request: 00000001, RESET_CTL: 00000001}
Dec 25 14:14:04 andrey-lenovo kernel: [drm] GuC communication enabled
Dec 25 14:14:04 andrey-lenovo kernel: i915 0000:00:02.0: GuC firmware i915/kbl_guc_33.0.0.bin version 33.0 submission:disabled
Dec 25 14:14:04 andrey-lenovo kernel: i915 0000:00:02.0: HuC firmware i915/kbl_huc_ver02_00_1810.bin version 2.0 authenticated:yes
...
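The kernel message above asks for the crash dump at /sys/class/drm/card0/error to be attached to any bug report. Since the system may need a hard reset afterwards, it can help to copy the dump to a regular file as soon as the machine responds again. A minimal sketch follows; the function name save_gpu_crash_dump and the default output filename are made up for illustration, and reading the sysfs file will probably require root.

```shell
# Hypothetical helper: copy the i915 error state to a regular file.
# $1 = destination file (default: a timestamped name in the current directory)
# $2 = source path (default: the sysfs path named in the kernel log)
save_gpu_crash_dump() {
    dest="${1:-gpu-crash-$(date +%Y%m%d-%H%M%S).txt}"
    src="${2:-/sys/class/drm/card0/error}"
    if [ ! -r "$src" ]; then
        echo "cannot read $src (try running as root)" >&2
        return 1
    fi
    # Write the dump to the destination and print the file name on success.
    cat "$src" > "$dest" && echo "$dest"
}
```

From a root shell this would be called as `save_gpu_crash_dump hang-5.4.6-2.txt`; the second argument exists only so the function can be tried against an ordinary file.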
I've tested kernels 5.4.10-1 and 5.4.11-1. GPU hangs still occur sometimes, but the system freezes only for a few seconds.
Jan 15 10:53:08 andrey-lenovo kernel: i915 0000:00:02.0: GPU HANG: ecode 9:1:0x85ff9bc3, in Xorg [1062], hang on rcs0
Jan 15 10:53:08 andrey-lenovo kernel: GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
Jan 15 10:53:08 andrey-lenovo kernel: Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
Jan 15 10:53:08 andrey-lenovo kernel: drm/i915 developers can then reassign to the right component if it's not a kernel issue.
Jan 15 10:53:08 andrey-lenovo kernel: The GPU crash dump is required to analyze GPU hangs, so please always attach it.
Jan 15 10:53:08 andrey-lenovo kernel: GPU crash dump saved to /sys/class/drm/card0/error
Jan 15 10:53:08 andrey-lenovo kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
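The journal excerpts in this thread differ mainly in the ecode field (9:1:0x00000000 versus 9:1:0x85ff9bc3), so listing just that field makes hangs across kernels easier to compare. A rough sketch, assuming the message format shown above; the function name gpu_hang_ecodes is hypothetical.

```shell
# Hypothetical helper: read kernel log text on stdin and print the ecode of
# every "GPU HANG" line, one per line, in the order they appear.
gpu_hang_ecodes() {
    sed -n 's/.*GPU HANG: ecode \([^,]*\),.*/\1/p'
}

# Usage, assuming a persistent journal so the previous boot is still available:
#   journalctl -k -b -1 | gpu_hang_ecodes
```

Piping the result through `sort | uniq -c` would give a count per ecode.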
I have also been experiencing this issue with CPU/GPU Intel i5-8250U/Intel UHD 620. Typically the hang requires a full reset, though as of 5.4.12 the system has been recoverable on occasion. Haven't had much uptime with 5.4.13 but will report back if the issue persists here and with later versions.
Using Git to bisect Linux sources from 219d54332a09 (Linux 5.4, i915 broken) to e42617b825f8 (Linux 5.5-rc1, i915 fixed) yielded 5f71c84038d3 as the first commit in which the error does not occur.
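Because this search looks for the first commit where the hang is gone, rather than the first commit that introduced it, git bisect's alternate terms keep the bookkeeping straight. The following is only a sketch of that workflow, not necessarily the exact commands used here: the term names broken and fixed are arbitrary labels, the function name start_fix_bisect is made up, and building and booting each candidate kernel is left out.

```shell
# Start a bisect that looks for the first commit where a bug is gone.
# $1 = revision known to show the hang, $2 = later revision known to be fixed.
# "broken" and "fixed" are labels chosen for this sketch.
start_fix_bisect() {
    git bisect start --term-old=broken --term-new=fixed &&
    git bisect broken "$1" &&
    git bisect fixed "$2"
}

# Usage inside a Linux source checkout (revisions from the comment above):
#   start_fix_bisect 219d54332a09 e42617b825f8
# Then build and boot the commit git checks out, try to provoke the hang,
# and report the result with one of:
#   git bisect broken    # hang still occurs
#   git bisect fixed     # hang no longer occurs
# Repeat until git names the first "fixed" commit, then clean up:
#   git bisect reset
```

Since the hang is intermittent, a commit should only be marked fixed after enough uptime to be reasonably sure; a wrong answer at any step sends the bisect down the wrong half of the history.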
Applying the diff from 219d54332a09 to 5f71c84038d3 to linux54 kernel sources (see i915.patch; git diff 219d54332a09 5f71c84038d3 -- drivers/gpu/drm/i915) fixes the error for me. Patch 0001-revert-drm-i915-cmdparser-use-explicit-goto-for-error-paths.patch needs to be removed for the diff to work (see linux54.patch).
Caveat: The patch is quite large and changes 52 files in drivers/gpu/drm/i915.
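Given the size of the diff, it may be worth checking how many files a patch touches and whether it applies cleanly before building. A small sketch under the assumption that the patch is in git's diff format; the function name patch_file_count is hypothetical.

```shell
# Hypothetical helper: count the files touched by a git-format patch on stdin
# by counting its "diff --git" header lines.
patch_file_count() {
    grep -c '^diff --git '
}

# Usage inside a Linux source checkout (revisions from the comment above):
#   git diff 219d54332a09 5f71c84038d3 -- drivers/gpu/drm/i915 > i915.patch
#   patch_file_count < i915.patch
# Dry run against the unpacked linux54 sources before applying for real:
#   git apply --check i915.patch
```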
So you didn't actually git-bisect to the commit which first broke it? Your patch is kinda half-baked as it doesn't pinpoint the issue. Some even report that i915 is now broken for them ...
Today I had a GPU hang again, on kernel 5.4.14-2. This time no hard reset was needed; the GPU returned to normal after about 15 seconds.
Jan 27 11:59:48 kernel: i915 0000:00:02.0: GPU HANG: ecode 9:1:0x00000000, hang on rcs0
Jan 27 11:59:48 kernel: GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
Jan 27 11:59:48 kernel: Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
Jan 27 11:59:48 kernel: drm/i915 developers can then reassign to the right component if it's not a kernel issue.
Jan 27 11:59:48 kernel: The GPU crash dump is required to analyze GPU hangs, so please always attach it.
Jan 27 11:59:48 kernel: GPU crash dump saved to /sys/class/drm/card0/error
Jan 27 11:59:48 kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Jan 27 11:59:51 kernel: iwlwifi 0000:73:00.0: Queue 11 is inactive on fifo 2 and stuck for 10000 ms. SW [232, 240] HW [162, 162] FH TRB=0x0a5a5a5a2
Jan 27 12:00:00 kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
I'll now try adding the i915 DRM code from 5.5 to the 5.4 series, applying only the differences: git diff 5.4.15 5.5 -- drivers/gpu/drm/i915, using the stable tree ...