Metropolis needs to check that the rootfs (a.k.a. system partition) mounted by the kernel hasn't been tampered with, at least according to the built/shipped kernel. For that, we need to put more work into how we build images.
The image build flow currently is:
//metropolis/node:rootfs generates an erofs image containing all node userland code.
//third_party/linux builds a kernel with EFI stub that hardcodes its boot command line to console=ttyS0 root=PARTLABEL=METROPOLIS-SYSTEM rootfstype=erofs init=/init
//metropolis/node:image takes the above rootfs and EFI image and combines them into a disk image with an EFI system partition, the erofs image as its own partition, and an empty data partition for Metropolis to use. This is used by the //metropolis/test/launch code to actually run Metropolis, alongside a config protobuf.
We should generally uncouple these things, as currently we have some unwritten expectations about how things should work, but they effectively only work for our test code.
First, we need some way to enforce integrity on erofs system partition images (“Checksum” them), for example using dm-integrity or dm-verity. I don't know how these work, but this step should likely take an erofs image, checksum it (perhaps adding some supporting structures to it), and yield the resulting image and an accompanying checksum (probably in some small protobuf file).
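To make the “Checksum” step a bit more concrete, here is a minimal Python sketch of a hash tree over fixed-size blocks, which is the general idea behind dm-verity's root hash. This is only an illustration: the block size, the zero padding, the handling of odd levels and the function name `merkle_root` are all assumptions made for this sketch, and the result is not compatible with the real dm-verity on-disk format.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size for this sketch


def merkle_root(image: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Return a SHA-256 hash-tree root over fixed-size blocks of image.

    Simplified illustration only; this is not dm-verity's actual format.
    """
    if not image:
        raise ValueError("image must not be empty")
    # Pad the final block with zeros so every block has the same size.
    if len(image) % block_size:
        image += b"\x00" * (block_size - len(image) % block_size)
    # Leaf level: one digest per data block.
    level = [
        hashlib.sha256(image[i:i + block_size]).digest()
        for i in range(0, len(image), block_size)
    ]
    # Hash neighbouring digests pairwise until a single root remains.
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last digest on odd levels
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]
```

The returned root is the kind of value that would go into the small checksum file: changing any byte of the image changes the root.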
Then, we need a way to “Lock” a kernel so that it expects an erofs image with such-and-such integrity guarantees (i.e. a given checksum). This should be done without having to rebuild the kernel. @lorenz knows of some UEFI/Linux feature that should allow us to do this by adding a command line to an existing EFI payload. We shouldn't end up with something that itself isn't further checkable, so keeping to a single EFI image is IMO the best option.
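As a rough sketch of what “Lock” could compute, the snippet below derives a locked command line from the currently hardcoded one by appending the expected checksum. The parameter name `metropolis.roothash` is purely hypothetical (the real mechanism and parameter are still to be designed), and actually embedding the result into the EFI payload is not shown.

```python
# Command line currently hardcoded by //third_party/linux.
BASE_CMDLINE = (
    "console=ttyS0 root=PARTLABEL=METROPOLIS-SYSTEM "
    "rootfstype=erofs init=/init"
)


def lock_cmdline(base: str, root_hash: bytes,
                 param: str = "metropolis.roothash") -> str:
    """Return base with a (hypothetical) checksum parameter appended."""
    # Refuse to lock a command line that already pins a checksum.
    for arg in base.split():
        if arg.partition("=")[0] == param:
            raise ValueError(f"command line already contains {param}")
    return f"{base} {param}={root_hash.hex()}"
```

For example, `lock_cmdline(BASE_CMDLINE, checksum)` yields the original command line followed by `metropolis.roothash=<hex checksum>`.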
Finally, the kernel and erofs images must be “Joined” into a pair, so that we get an EFI binary and an erofs image that will work together. This can then be piped again into //metropolis/node:image (which perhaps should live in //metropolis/test/launch?).
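One thing “Join” could do is sanity-check that the two halves really belong together. The sketch below assumes the locked kernel command line carries the expected checksum in a hypothetical `metropolis.roothash=` parameter, and uses a plain SHA-256 over the whole image as a stand-in for whatever checksum scheme ends up being chosen; both are assumptions for illustration.

```python
import hashlib
from typing import Callable, Optional

PARAM = "metropolis.roothash"  # hypothetical parameter name


def expected_hash(cmdline: str, param: str = PARAM) -> Optional[str]:
    """Return the hex checksum pinned by the command line, if any."""
    for arg in cmdline.split():
        key, _, value = arg.partition("=")
        if key == param:
            return value
    return None


def sha256_hex(image: bytes) -> str:
    """Stand-in digest; the real scheme might be a dm-verity root hash."""
    return hashlib.sha256(image).hexdigest()


def pair_matches(cmdline: str, image: bytes,
                 digest: Callable[[bytes], str] = sha256_hex) -> bool:
    """Check that the kernel command line pins exactly this image."""
    pinned = expected_hash(cmdline)
    return pinned is not None and pinned == digest(image)
```

A build rule could fail early when `pair_matches` returns False, instead of producing a pair that would refuse to boot.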
I think these features should be part of the build process, and not an end-user tool, as users are unlikely to want to dynamically build such images outside of the build system - instead, they will either use a release from Monogon or do their own build if they want to run a patched/forked version of Metropolis. Perhaps later we will make up some sort of artifact format (i.e. tarball :) ) that contains both of these together as a single file for redistribution, but for now just a Bazel target with two outputs should be good enough.
Or, graphically:
.---------------------------.   .---------------------------.
| erofs image (no checksum) |   | kernel image (no cmdline) |
| //metropolis/node:rootfs  |   | //third_party/linux       |
'---------------------------'   '---------------------------'
          |                                       |
          | “Checksum”                            |
          :----------------------.                |
          |                      |                |
          V                      V                |
.-------------------------.  .----------------.   |
| erofs image (checksum)  |  | checksum proto |   |
'-------------------------'  '----------------'   |
          |                      |  .-------------'
          |                      |  | “Lock”
          |                      V  V
          |         .----------------------------.
          |         | kernel image (expects      |
          |         | erofs image with checksum) |
          |         '----------------------------'
          |                       |
          |  .--------------------'
          |  | “Join”
          V  V
.--------------------------------------.
| joined kernel EFI and erofs image    |
| //metropolis/node:joined.{efi,img} ? |
'--------------------------------------'
Note: I came up with all these names right now - these are not written in stone, and neither is this design. Whoever implements this should probably first write a short design document for this :).
Not in scope: further signing this joined pair into some sort of self-standing release.
Not in scope: building an installer or anything that is 'ready to use'.
c/node c/integrity research-and-design