Have you tested this with recent versions of systemd-boot?
My current RK3399 boards are using systemd-boot and loading a modified (overclocked) fdt. Older sd-boot versions would not load the fdt at all, which caused various problems.
Not sure what you mean by ef00 partition type, do you mean mark it as an EFI partition? In that case, we need to make sure we only do it for devices that actually support proper EFI.
Only the /boot partition needs to have the ef00 type, to make bootctl happy.
This can be skipped as long as bootctl is run with the --graceful option.
And again, there is no need to worry about EFI support, as mainline U-Boot already provides it.
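For anyone unfamiliar with the ef00 / 0700 shorthand: these are gdisk type codes that stand for GPT partition type GUIDs. A minimal Python sketch of the check is below. The helper names (`is_esp`, `needs_retype`) are made up for illustration; only the two GUIDs are standard values.

```python
# gdisk type codes are shorthand for GPT partition type GUIDs.
GDISK_TYPE_GUIDS = {
    "0700": "EBD0A0A2-B9E5-4433-87C0-68B6B72699C7",  # Microsoft basic data
    "ef00": "C12A7328-F81F-11D2-BA4B-00A0C93EC93B",  # EFI System Partition
}
ESP_GUID = GDISK_TYPE_GUIDS["ef00"]


def is_esp(type_guid: str) -> bool:
    """Return True if the given partition type GUID marks an EFI System Partition."""
    return type_guid.strip().upper() == ESP_GUID


def needs_retype(gdisk_code: str) -> bool:
    """Hypothetical helper: does a partition with this gdisk code need changing to ef00?"""
    return not is_esp(GDISK_TYPE_GUIDS[gdisk_code.lower()])


print(needs_retype("0700"))
print(needs_retype("ef00"))
```

On a running system the GUID of each partition can be read with `lsblk -o NAME,PARTTYPE` and fed to `is_esp`.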
My RK3399 boards (RockPro64 and RockPi4B) are all currently booting using U-Boot -> systemd-boot -> linux.
All I need to do is remove the /boot/extlinux directory so that U-Boot searches for an EFI payload instead of extlinux.conf, then populate /boot/loader/ properly, just like I do on x86_64.
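As an illustration of what "populate /boot/loader/" means, here is a small Python sketch that renders one entry file of the kind that goes under /boot/loader/entries/. The keys (title, linux, initrd, devicetree, options) are the standard loader entry keys; every path, the title and the root= value are placeholders I made up, not the names any package actually installs.

```python
def render_entry(title, linux, initrd, options, devicetree=None):
    """Render the text of a systemd-boot loader entry (one 'key value' pair per line)."""
    lines = [f"title {title}", f"linux {linux}"]
    if devicetree:
        # Only useful with an sd-boot version that actually loads the device tree.
        lines.append(f"devicetree {devicetree}")
    lines.append(f"initrd {initrd}")
    lines.append(f"options {options}")
    return "\n".join(lines) + "\n"


# Placeholder values, paths are relative to the root of the /boot partition.
print(render_entry(
    title="Manjaro ARM",
    linux="/Image",
    initrd="/initramfs-linux.img",
    options="root=LABEL=ROOT rw",
    devicetree="/dtbs/rockchip/rk3399-rockpro64.dtb",
))
```

The output would be saved as something like /boot/loader/entries/manjaro.conf (again a made-up file name).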
I assume this could be done with a fairly simple patch to that file, or as you said, a config option to make it versatile.
I tried, but it's not that simple. It's mostly hardcoded macros, and I don't have a good idea of how to use Kconfig to change the order.
Upstream Archlinux mkinitcpio is going to support kernel version detection for aarch64 through pull request https://github.com/archlinux/mkinitcpio/pull/32
With that we can move to the same preset as x86_64 kernels, by just using the filenames.
Several things to mention:
Currently systemd-boot doesn't really load the device tree file, even though it has a "devicetree" key. I didn't notice that on RPI CM4, since the RPI firmware loads the device tree by itself, but I did notice it on RK3399, as it relies on U-Boot to load the device tree blob.
Thankfully systemd merged this feature very recently: https://github.com/systemd/systemd/pull/19417
My tests show it works, so we only need to wait for the next release, after which systemd-boot can handle all 3 files we need to load (kernel, initrd, device tree).
At least for RK3399, the pre-built image uses the 0700 partition type for its /boot vfat partition.
For systemd-boot to even copy its files, we need to change the type to ef00, which should be pretty simple.
Currently the U-Boot priority is extlinux > U-Boot script > EFI payload. The first hit boots without searching the rest.
To make the change of boot sequence smooth, we should change the boot priority to: U-Boot script (to allow private overrides) > EFI payload (the optional feature) > extlinux.
Unfortunately the order is hardcoded in include/config_distro_bootcmd.h.
I may need to introduce a config option that allows us to change the default priority.
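To make the "first hit wins" behaviour concrete, here is a small Python model of it. This is only an illustration, not U-Boot code: the method names and the `pick_boot_method` helper are invented for this sketch, and the real logic lives in the macros of include/config_distro_bootcmd.h.

```python
# Search orders as described above; the string labels are illustrative only.
CURRENT_ORDER = ["extlinux", "script", "efi"]
PROPOSED_ORDER = ["script", "efi", "extlinux"]


def pick_boot_method(available, order):
    """Return the first method in `order` that is present on the boot partition."""
    for method in order:
        if method in available:
            return method  # first hit boots, the rest is never searched
    return None


# A /boot partition that still ships extlinux.conf and also has an EFI payload.
available = {"extlinux", "efi"}
print(pick_boot_method(available, CURRENT_ORDER))
print(pick_boot_method(available, PROPOSED_ORDER))
```

With the current order the EFI payload is only reached once /boot/extlinux is gone, which is why I have to remove that directory by hand today.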
It's pretty simple to update the device tree to use the op1 opp tables.
Not sure if we should craft a patch to use the op1 tables for all RK3399 boards.
Or should we create a page teaching users how to create their own device-tree overlay and apply it through U-Boot?
Can we split the work into two different parts?
Migrate to EFI bootloader
Besides the existing direct U-Boot -> kernel booting, we keep the existing kernel/initrd naming but add support for an EFI bootloader like GRUB/systemd-boot, and find a way to migrate to the EFI bootloader with just a single kernel.
This should replace the "Find out which of our currently supported devices support EFI." item.
Find a way to support multiple kernels
This is the hard part though.
For the first part, there are still problems left to be solved before the 2nd part:
GRUB auto detection even for just single kernel
I guess for the single-kernel case, we can just update the auto-detection script to include the current kernel naming. This should be pretty simple to test on existing boards.
systemd-boot config for supported kernel
Not sure where the config file should belong though: the kernel package or systemd-boot? On Archlinux x86_64, we rely completely on the end user to write the config (as they have various different setups for their rootfs). I'm completely fine with doing the config manually, as I'm using LVM and thus the Manjaro ARM default one is useless to me.
But I'm pretty sure that's not what you guys want.
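For the single-kernel GRUB auto-detection idea above, the change boils down to teaching the detection step one more file name. A rough Python sketch of that matching follows. Both pattern lists are assumptions: the first is only loosely modelled on the x86-style names an auto-detection script looks for, and the second is a hypothetical addition for the current Arm naming; neither is copied from GRUB.

```python
from fnmatch import fnmatch

# Assumed x86-style kernel name patterns (illustrative, not taken from GRUB).
X86_STYLE_PATTERNS = ["vmlinuz-*", "vmlinux-*", "kernel-*"]
# Hypothetical extra patterns covering the current Arm kernel naming.
EXTRA_ARM_PATTERNS = ["Image", "Image-*"]


def detect_kernels(filenames, patterns):
    """Return the sorted file names that match any of the given glob patterns."""
    return sorted(n for n in filenames if any(fnmatch(n, p) for p in patterns))


# Example /boot listing of a single-kernel Arm install (made-up names).
boot_files = ["Image", "initramfs-linux.img", "dtbs"]

print(detect_kernels(boot_files, X86_STYLE_PATTERNS))
print(detect_kernels(boot_files, X86_STYLE_PATTERNS + EXTRA_ARM_PATTERNS))
```

With only the x86-style patterns nothing is detected; adding the extra pattern picks up the single Image file, which is all the single-kernel case needs.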
It turns out that the tegra 194 (Xavier) boards are enabled in neither linux-rc nor the archlinuxarm linux-aarch64 kernel.
It's a pity that, after all the hassle of booting into systemd-boot (the AARCH64 UEFI systemd-boot is re-enabled in the recent archlinuxarm repo), there turns out to be no device tree for the board.
It would be awesome to support Xavier boards as it would be the most powerful board among the publicly available SBCs.
(Yep, I'm too lazy to build that package on my existing RK3399, even with distcc it's still painfully slow)
Thanks.
We already established that this is not going to happen any time soon, since it requires changes in a lot of places for very little gain at this point. Probably with proper multi-kernel support.
Yeah, I'm just mentioning this GRUB thing for anyone who is interested in booting Xavier AGX using the upstream kernel (which I guess is already super niche).
As far as I know, xorg is in maintenance mode, meaning only bug fixes will get released. So it might never see the release of a 1.21 version.
I don't think that's the case. Although wayland has been getting a lot of attention recently, xorg is still a pretty important piece of the display stack, and since the Nvidia guys are still contributing to Xorg to enable Xavier, I think it's just a matter of time before the next release arrives with proper enablement for Xavier.
But I still have to admit, even with the strongest performance among all the boards I own, it's really a pain in the ass to work with Nvidia and the Xavier board.
It's already hard to get the experimental UEFI firmware working, and it still doesn't work as well as the U-Boot or RPI EDK2 UEFI firmware, as the systemd-boot problem shows.
The worst part is the strange KVM performance. Even with all the strong bare metal CPU performance and 32G RAM, PCIE4.0 x8 lanes and PCIE3.0 x4 NVME lanes, KVM is miserably slow (Confirmed it's running KVM, not qemu-tcg). It takes over 30s just to boot the kernel, while on CM4, the same VM takes less than 10s.
I really should spend the money on HoneyComb LX2 or even an Apple M1 device.
So please consider this Xavier enablement a very low priority work.
Sorry for the late reply, I spent too much time on RPI CM4 and have finally got time to turn back to Xavier.
Not sure about other Tegra devices, as they all seem to have a special Uboot/CK loader setup.
GRUB is working, except auto-detect
Xavier AGX has UEFI firmware that loads GRUB, and GRUB can load the initramfs and kernel without problem. However, GRUB has a long-standing problem auto-detecting Arm kernels, as we don't follow the kernel naming of x86_64.
This is something we need to worry about in the future though.
If using GRUB, the current kernel config should work fine, at least for Xavier AGX.
However, when using UEFI firmware we don't need any dtb files at all, so you may want to remove the DTB file for tegra194.
systemd-boot has no initramfs support
I have already notified Nvidia that their UEFI firmware seems to have something wrong with systemd-boot, which can't load the initramfs. (Other edk2-based UEFI firmware, or even systemd-boot chainloaded from U-Boot, all work without problem.)
This means that, if using systemd-boot, we have no initramfs support, thus some config options must be set to Y or there is no way to find the block devices:
Furthermore, if booting without initramfs, the following config must not be compiled into the kernel, or we will wait 60s for the firmware loading timeout:
Currently I'm sticking to systemd-boot, with the above modified kernel config, to boot from an NVMe drive (Xavier AGX has PCIE4.0 x4 lanes for the NVMe drive).
GPU support waiting for xorg-server development branch
Only the xorg-server 1.21 development branch supports tegra194 for its hardware-accelerated GPU. Thus currently only llvmpipe CPU rendering is working.
I haven't tried compiling xorg-server 1.21 yet, as the dependency chain looks pretty long. If there are any test packages for 1.21, I'm pretty happy to test.
Since it's already supported upstream, enabling it only requires adding "tegra" to ${GALLIUM}
Exactly the diff I'm using right now: diff
Will report back when the compiling is finished and tested.
That's great.
Closing the issue.
OK, an upstream bug.
Maybe we need some patch backported.
Confirmed that the git version already has the fix merged and compiles.
Although the Xavier AGX board compiles mesa pretty fast, it fails to build a certain part of the gallium llvm code.
[68/70] Compiling C++ object src/gallium/frontends/clover/libclllvm.a.p/llvm_invocation.cpp.o
FAILED: src/gallium/frontends/clover/libclllvm.a.p/llvm_invocation.cpp.o
c++ -Isrc/gallium/frontends/clover/libclllvm.a.p -Isrc/gallium/frontends/clover -I../mesa-20.3.4/src/gallium/frontends/clover -Iinclude -I../mesa-20.3.4/include -Isrc -I../mesa-20.3.4/src -I../mesa-20.3.4/src/gallium/include -Isrc/gallium/auxiliary -I../mesa-20.3.4/src/gallium/auxiliary -I/usr/include -fvisibility=hidden -fdiagnostics-color=always -DNDEBUG -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Wnon-virtual-dtor -std=c++14 -ffunction-sections -fdata-sections '-DPACKAGE_VERSION="20.3.4"' '-DPACKAGE_BUGREPORT="https://gitlab.freedesktop.org/mesa/mesa/-/issues"' -DUSE_ELF_TLS -DHAVE_ST_VDPAU -DENABLE_ST_OMX_BELLAGIO=1 -DENABLE_ST_OMX_TIZONIA=0 -DHAVE_X11_PLATFORM -DGLX_INDIRECT_RENDERING -DGLX_DIRECT_RENDERING -DGLX_USE_DRM -DHAVE_DRM_PLATFORM -DENABLE_SHADER_CACHE -DHAVE___BUILTIN_BSWAP32 -DHAVE___BUILTIN_BSWAP64 -DHAVE___BUILTIN_CLZ -DHAVE___BUILTIN_CLZLL -DHAVE___BUILTIN_CTZ -DHAVE___BUILTIN_EXPECT -DHAVE___BUILTIN_FFS -DHAVE___BUILTIN_FFSLL -DHAVE___BUILTIN_POPCOUNT -DHAVE___BUILTIN_POPCOUNTLL -DHAVE___BUILTIN_UNREACHABLE -DHAVE_FUNC_ATTRIBUTE_CONST -DHAVE_FUNC_ATTRIBUTE_FLATTEN -DHAVE_FUNC_ATTRIBUTE_MALLOC -DHAVE_FUNC_ATTRIBUTE_PURE -DHAVE_FUNC_ATTRIBUTE_UNUSED -DHAVE_FUNC_ATTRIBUTE_WARN_UNUSED_RESULT -DHAVE_FUNC_ATTRIBUTE_WEAK -DHAVE_FUNC_ATTRIBUTE_FORMAT -DHAVE_FUNC_ATTRIBUTE_PACKED -DHAVE_FUNC_ATTRIBUTE_RETURNS_NONNULL -DHAVE_FUNC_ATTRIBUTE_ALIAS -DHAVE_FUNC_ATTRIBUTE_NORETURN -DHAVE_FUNC_ATTRIBUTE_VISIBILITY -DHAVE_UINT128 -DUSE_GCC_ATOMIC_BUILTINS -DUSE_AARCH64_ASM -DMAJOR_IN_SYSMACROS -DHAVE_LINUX_FUTEX_H -DHAVE_ENDIAN_H -DHAVE_DLFCN_H -DHAVE_EXECINFO_H -DHAVE_SYS_SHM_H -DHAVE_STRTOF -DHAVE_MKOSTEMP -DHAVE_TIMESPEC_GET -DHAVE_MEMFD_CREATE -DHAVE_RANDOM_R -DHAVE_FLOCK -DHAVE_STRTOK_R -DHAVE_GETRANDOM -DHAVE_PROGRAM_INVOCATION_NAME -DHAVE_POSIX_MEMALIGN -DHAVE_DIRENT_D_TYPE -DHAVE_STRTOD_L -DHAVE_DLADDR -DHAVE_DL_ITERATE_PHDR -DHAVE_ZLIB -DHAVE_ZSTD -DHAVE_PTHREAD -DHAVE_PTHREAD_SETAFFINITY -DHAVE_LIBDRM -DLLVM_AVAILABLE 
'-DMESA_LLVM_VERSION_STRING="12.0.0"' -DLLVM_IS_SHARED=1 -DUSE_LIBGLVND=1 -DHAVE_WAYLAND_PLATFORM -DWL_HIDE_DEPRECATED -DHAVE_DRI3 -DHAVE_DRI3_MODIFIERS -DHAVE_GALLIUM_EXTRA_HUD=1 -DHAVE_LIBSENSORS=1 -Werror=return-type -Werror=empty-body -Wno-non-virtual-dtor -Wno-missing-field-initializers -Wno-format-truncation -fno-math-errno -fno-trapping-math -flifetime-dse=1 -Werror=format -Wformat-security -march=armv8-a -O2 -pipe -fstack-protector-strong -fno-plt -fexceptions -Wp,-D_FORTIFY_SOURCE=2 -Wformat -Werror=format-security -fstack-clash-protection -Wp,-D_GLIBCXX_ASSERTIONS -fPIC -pthread -D__STDC_CONSTANT_MACROS -D__STDC_LIMIT_MACROS -D_GNU_SOURCE -D__STDC_FORMAT_MACROS -Wno-ignored-attributes -DHAVE_CLOVER_ICD -DCL_TARGET_OPENCL_VERSION=300 -DCL_USE_DEPRECATED_OPENCL_1_0_APIS -DCL_USE_DEPRECATED_OPENCL_1_1_APIS -DCL_USE_DEPRECATED_OPENCL_1_2_APIS -DCL_USE_DEPRECATED_OPENCL_2_0_APIS -DCL_USE_DEPRECATED_OPENCL_2_1_APIS -DCL_USE_DEPRECATED_OPENCL_2_2_APIS '-DLIBCLC_INCLUDEDIR="/usr/include/"' '-DLIBCLC_LIBEXECDIR="/usr/share/clc/"' '-DCLANG_RESOURCE_DIR="/usr/lib/clang/12.0.0/include"' -MD -MQ src/gallium/frontends/clover/libclllvm.a.p/llvm_invocation.cpp.o -MF src/gallium/frontends/clover/libclllvm.a.p/llvm_invocation.cpp.o.d -o src/gallium/frontends/clover/libclllvm.a.p/llvm_invocation.cpp.o -c ../mesa-20.3.4/src/gallium/frontends/clover/llvm/invocation.cpp
In file included from ../mesa-20.3.4/src/gallium/frontends/clover/llvm/invocation.cpp:55:
../mesa-20.3.4/src/gallium/frontends/clover/llvm/metadata.hpp: In function 'std::string clover::llvm::get_type_kernel_metadata(const llvm::Function&, const string&)':
../mesa-20.3.4/src/gallium/frontends/clover/llvm/metadata.hpp:132:86: warning: 'unsigned int llvm::VectorType::getNumElements() const' is deprecated: Calling this function via a base VectorType is deprecated. Either call getElementCount() and handle the case where Scalable is true or cast to FixedVectorType. [-Wdeprecated-declarations]
132 | data += std::to_string(((::llvm::VectorType*)type)->getNumElements());
| ^
In file included from /usr/include/llvm/IR/DataLayout.h:26,
from /usr/include/llvm/IR/Module.h:25,
from ../mesa-20.3.4/src/gallium/frontends/clover/llvm/codegen.hpp:35,
from ../mesa-20.3.4/src/gallium/frontends/clover/llvm/invocation.cpp:52:
/usr/include/llvm/IR/DerivedTypes.h:535:10: note: declared here
535 | unsigned VectorType::getNumElements() const {
| ^~~~~~~~~~
../mesa-20.3.4/src/gallium/frontends/clover/llvm/invocation.cpp: In function 'std::unique_ptr<clang::CompilerInstance> {anonymous}::create_compiler_instance(const clover::device&, const string&, const std::vector<std::__cxx11::basic_string<char> >&, std::string&)':
../mesa-20.3.4/src/gallium/frontends/clover/llvm/invocation.cpp:233:55: error: cannot convert 'clang::PreprocessorOptions' to 'std::vector<std::__cxx11::basic_string<char> >&'
233 | c->getPreprocessorOpts(),
| ~~~~~~~~~~~~~~~~~~~~~~^~
| |
| clang::PreprocessorOptions
In file included from /usr/include/clang/Frontend/CompilerInstance.h:15,
from ../mesa-20.3.4/src/gallium/frontends/clover/llvm/codegen.hpp:37,
from ../mesa-20.3.4/src/gallium/frontends/clover/llvm/invocation.cpp:52:
/usr/include/clang/Frontend/CompilerInvocation.h:183:45: note: initializing argument 4 of 'static void clang::CompilerInvocation::setLangDefaults(clang::LangOptions&, clang::InputKind, const llvm::Triple&, std::vector<std::__cxx11::basic_string<char> >&, clang::LangStandard::Kind)'
183 | std::vector<std::string> &Includes,
| ~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~
ninja: build stopped: subcommand failed.
==> ERROR: A failure occurred in build().
Aborting...
Not sure if this is due to the tegra enablement or not.
Even without multi-kernel support, I don't believe a distro should rely on a board-specific boot sequence at all.
Especially considering that U-Boot is already an industry standard and has upstream RPI support.
That use case would be exactly what I want to avoid.
We're using an Arch-based distro, and Arch is really not the best distro for supporting multiple packages at (slightly) different versions.
So I don't really want to jump into that rabbit hole, but only to keep slightly different kernels with mostly compatible device trees.
But anyway, for now, the co-existence of multiple kernels can be a lower priority thing.
Changing the RPI boot flow from the current "FIRMWARE -> kernel" to "FIRMWARE -> Uboot -> linux" is more important, and should be easier to do. Then move to "FIRMWARE -> Uboot -> systemd-boot/GRUB" as the next step to unify the boot sequence.
That's why we want ACPI: then we can completely get rid of the device tree.
But if the upstream kernel and the vendor kernel differ too much in their device trees, we still have to go in that direction and allow multiple device trees.
I'm not sure how other boards work, but for the boards I have, at least I didn't see the need to bother too much:
RK3399
Fully upstream, nothing to bother
RPI4/CM4
At least linux-rpi4-mainline and linux can share a device tree, not sure about linux-rpi4
Amlogic A311D/S922
Upstream is good for almost everything, except the bootloader. Thus for device tree part, I don't think we need to bother too much about that at least.
Or do you have any example where the device tree can't be reused between upstream and vendor kernel?
In the worst case, we may need to separate the dtbs from the kernel and provide them as a separate package, so that we have one single copy of the device trees while having different kernel versions.
For other vendor kernels, their dtb directory also needs to have extra suffix though.
This means the board specific uboot needs to have extra search path for its fdt.
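The "extra search path" idea can be sketched in a few lines of Python. The directory names (including the suffixed `dtbs-vendor`), the `board.dtb` file name and the `find_fdt` helper are all hypothetical, picked only to show how a suffixed dtb directory would be found after the usual locations.

```python
from pathlib import Path
import tempfile


def find_fdt(boot_root, fdt_name, prefixes):
    """Return the first existing dtb under boot_root, trying each prefix in order."""
    for prefix in prefixes:
        candidate = Path(boot_root) / prefix.strip("/") / fdt_name
        if candidate.is_file():
            return candidate
    return None


def demo():
    """Build a throwaway /boot layout where only the vendor dtb directory exists."""
    with tempfile.TemporaryDirectory() as boot:
        vendor_dir = Path(boot) / "dtbs-vendor"
        vendor_dir.mkdir()
        (vendor_dir / "board.dtb").write_bytes(b"")
        # Usual locations first, the hypothetical suffixed directory last.
        hit = find_fdt(boot, "board.dtb", ["/", "/dtbs/", "/dtbs-vendor/"])
        return hit.relative_to(boot).as_posix() if hit else None


print(demo())
```

The order of the prefixes decides which copy wins when both the upstream and the vendor dtb are installed.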
Uboot puts its dtbs directly into the /boot directory.
linux-rpi4-mainline doesn't provide dtbs.
And linux puts its dtbs into /boot/dtbs.
At least for RPI it doesn't cause much trouble: its dtb is loaded by the first-stage firmware, so there is nothing to conflict with, as long as uboot-raspberrypi is installed.
No problem.
uboot-raspberrypi is not a problem, it puts dtbs directly into /boot.
linux puts dtbs into /boot/dtbs.
linux-rpi4-mainline contains no dtb file, as it relies on the dtbs provided by uboot-raspberrypi.