Fix the execution of pacman hooks
Hello,
While doing a system update a couple of weeks ago, I've spotted a highly conspicuous error message:
( 3/18) Creating temporary files...
Assertion 'fd' failed at src/tmpfiles/tmpfiles.c:843, function fd_set_perms(). Aborting.
/usr/share/libalpm/scripts/systemd-hook: line 28: 1735 Aborted (core dumped) /usr/bin/systemd-tmpfiles --create
error: command failed to execute correctly
Here's also an excerpt from /var/log/pacman/log
(the stack trace was rather unusable, so I'll skip it):
[2021-07-11T07:25:44+0200] [ALPM] running '30-systemd-tmpfiles.hook'...
[2021-07-11T07:25:44+0200] [ALPM-SCRIPTLET] Assertion 'fd' failed at src/tmpfiles/tmpfiles.c:843, function fd_set_perms(). Aborting.
[2021-07-11T07:25:44+0200] [ALPM-SCRIPTLET] /usr/share/libalpm/scripts/systemd-hook: line 28: 1735 Aborted (core dumped) /usr/bin/systemd-tmpfiles --create
What made this even more conspicuous is that manually running systemd-tmpfiles --create
afterwards resulted in no errors. In a few words, executing /usr/share/libalpm/hooks/30-systemd-tmpfiles.hook
failed, but only when it was run from within pacman
. Also, I saw a few people complaining about the same error message on Manjaro and Arch forums, even with one complaint dating more than a few years ago, but nobody came through with a fix. Strange, isn't it?
After a detailed and rather lengthy investigation, it turned out that the root cause was twofold:
- Arch Linux ARM applies
0003-Revert-alpm_run_chroot-always-connect-parent2child-p.patch
that reverts oldpacman
commit1d6583a5
, for an unknown reason; this patch causes the error like clockwork. - The code in
lib/libalpm/util.c
that executes the hooks by forking a child has some bugs that allow the error to occur under certain circumstances.
Regarding the first bullet point, the patch from Arch Linux ARM creates a condition in which the file descriptor 0
is closed by calling close(0)
and left closed when the executed hook has no option NeedsTargets
specified, which is the case for our friend, /usr/share/libalpm/hooks/30-systemd-tmpfiles.hook
. As a result, the first open()
during the execution of the hook returns 0
as the file descriptor, because 0
is the currently lowest available value. It would all go unnoticed, but systemd-tmpfiles
performs assert(fd)
checks in src/tmpfiles/tmpfiles.c
that fail because fd
equals 0
.
Regarding the second bullet point, function _alpm_run_chroot()
that executes hooks in a fork()
ed child doesn't execute dup2()
properly, but instead executes close()
followed by dup2()
. The man page for dup2()
clearly states that such attempts to implement equivalent functionality must be avoided:
The dup2() system call performs the same task as dup(), but instead of using the lowest-numbered unused file descriptor, it uses the file descriptor number specified in newfd. In other words, the file descriptor newfd is adjusted so that it now refers to the same open file description as oldfd.
If the file descriptor newfd was previously open, it is closed before being reused; the close is performed silently (i.e., any errors during the close are not reported by dup2()).
The steps of closing and reusing the file descriptor newfd are performed atomically. This is important, because trying to implement equivalent functionality using close(2) and dup() would be subject to race conditions, whereby newfd might be reused between the two steps. Such reuse could happen because the main program is interrupted by a signal handler that allocates a file descriptor, or because a parallel thread allocates a file descriptor.
As a result, a condition can occur in which the file descriptor 0
is closed by calling close(0)
and left closed after the while
loop that fails to execute dup2()
by receiving EBUSY
, resulting in the original issue. On top of that, failed attempts to execute dup2()
should be treated as fatal errors instead of being silently ignored.
I've created pacman.patch that prevents the issues described in the second bullet point, while not applying 0003-Revert-alpm_run_chroot-always-connect-parent2child-p.patch
fixes the issues decribed in the first bullet point.