Hans van Kranenburg on September 6, 2014
At Mendix, we use Debian GNU/Linux and Xen a lot to run Linux virtual machines as paravirtualized guests. One of the annoyances of doing so is that the kernel and initrd used to start a virtual machine have to be provided from outside the virtual machine itself. This is a classic chicken-and-egg problem: the virtual machine needs to be started by loading program code which ‘kicks it off’, but the code that needs to run, the right Linux kernel version, is hidden inside the virtual machine itself, which cannot be accessed as long as it’s not running, etc…
In this blog post, I’ll explain how a modified version of GRUB2 can be used to boot the kernel from inside the virtual machine instead.
When I started researching this topic, I quickly stumbled upon the Xen documentation that mentions pvgrub. PV-GRUB is a modified GRUB binary that is able to read a menu.lst file from the VM disk storage and chainload the kernel and initrd that are referenced from that configuration file. There’s even a pv-grub-menu package for Debian that will maintain a /boot/grub/menu.lst for you.
However, all of this uses the legacy version of GRUB, which has not been maintained or supported for about 10 years. The Xen pv-grub wiki page tells us: “Since pvgrub corresponds to GRUB legacy, it is not supported anymore by the GRUB developers. Until there is a pvgrub2…”. Hmz, pvgrub2? Will there be a new pv-grub that is based on GRUB2?
The first hit when doing a Google search for ‘pvgrub2’ is a mailing list post from November 2013, telling us: pvgrub2 is merged. Yay. Read it. This mailing list post shows a way to build a standalone grub2 image that can be used to boot a Xen domU, and that then searches for a /boot/grub/grub.cfg file in there.
The part about building grub2 itself, before being able to use grub-mkstandalone, can be skipped by just installing the grub-xen package from Debian, which is at version 2.02~beta2-11 at the time of writing.
In my Debian jessie/sid development environment, I can just do the following to create a grub2 standalone image that can be used to boot a domU, and which will search for a grub.cfg on any of its disks:
    ~/tmp/pvgrub2 1-$ cat boot/grub/grub.cfg
    search -s root -f /boot/grub/grub.cfg
    configfile /boot/grub/grub.cfg
    ~/tmp/pvgrub2 1-$ grub-mkstandalone -o grub-x86_64-xen -O x86_64-xen boot/grub/grub.cfg
    ~/tmp/pvgrub2 1-$ ll
    total 7676
    drwxr-xr-x 1 knorrie knorrie       8 Aug 24 15:16 boot
    -rw-r--r-- 1 knorrie knorrie 7858080 Sep  6 22:36 grub-x86_64-xen
The next step is to modify the Xen guest configuration file to boot pvgrub2 first instead of the Linux kernel itself.
A typical Xen guest configuration file contains lines like these:
    # Kernel
    kernel = "/usr/lib/linux-domu-kernels/vmlinuz-3.14-1-amd64-3.14.12-1"
    ramdisk = "/usr/lib/linux-domu-kernels/initrd.img-3.14-1-amd64-3.14.12-1"
    root = "/dev/xvda ro"
When using pvgrub2, these three lines can be replaced by a single line that points to the pvgrub2 binary:
    # pvgrub2
    kernel = "/somewhere/grub-x86_64-xen"
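For a guest that is already defined, the switch can be scripted. The snippet below is only a minimal sketch and not part of the setup described here: use_pvgrub2 is a hypothetical helper name, and it assumes that the kernel, ramdisk and root assignments each sit on their own line in the configuration file. It prints the adjusted configuration to standard output instead of editing the file in place.

```shell
#!/bin/sh
# Sketch: switch a Xen guest config from a dom0-side kernel to pvgrub2.
# use_pvgrub2 is a hypothetical helper, not an existing Xen tool.
# Arguments: $1 = guest config file, $2 = path to the pvgrub2 image.
use_pvgrub2() {
    cfg=$1
    image=$2
    # Drop the existing kernel/ramdisk/root assignments, keep everything else.
    sed -E '/^[[:space:]]*(kernel|ramdisk|root)[[:space:]]*=/d' "$cfg"
    # Append the single line that points at the pvgrub2 image.
    printf '# pvgrub2\nkernel = "%s"\n' "$image"
}
```

Redirecting the output to a new file and comparing it with the old one before replacing anything keeps the change easy to review.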
When booting a domU using this pvgrub2 configuration, we need to have a configured /boot/grub/grub.cfg present in the domU. Installing the grub-xen package in a Debian Jessie domU is one way to get this done.
Right here is where I ended up during the first night of my pvgrub2 adventure, trying to figure out why I had a virtual machine in my test environment that wasn’t doing anything except using 100% CPU, without showing anything on the console. 🙂
When looking at it a second time, I figured that the /boot/grub/grub.cfg file which gets built into the internal grub2 memdisk uses a configuration option to look for a /boot/grub/grub.cfg, which it will find in the internal grub2 memdisk, which will be loaded, and starts looking for a /boot/grub/grub.cfg, which will be found in the internal grub2 memdisk, which… etc…
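The loop described above can be made visible with a toy model. This is plain shell, not GRUB script, and it only mimics the lookup order: it assumes, as observed above, that the search reaches the memdisk before any guest disk. The names search_root, trace_boot and devices_with_cfg are made up for this illustration.

```shell
#!/bin/sh
# Toy model of the lookup loop in plain shell; this is not GRUB script.
# Assumption: the search reaches the memdisk before any guest disk,
# which matches the behaviour described above.
devices_with_cfg="memdisk xen/xvda"

# Model of 'search -s root -f /boot/grub/grub.cfg': first match wins.
search_root() {
    for dev in $devices_with_cfg; do
        echo "$dev"
        return 0
    done
}

# Follow the chain of configfile calls for at most $1 hops.
trace_boot() {
    max_hops=$1
    hop=0
    while [ "$hop" -lt "$max_hops" ]; do
        root=$(search_root)
        echo "hop $hop: configfile ($root)/boot/grub/grub.cfg"
        if [ "$root" != memdisk ]; then
            echo "reached guest config"
            return 0
        fi
        hop=$((hop + 1))
    done
    echo "still on memdisk after $max_hops hops"
}
```

Calling trace_boot 3 with the default device list never leaves the memdisk, while removing memdisk from devices_with_cfg lets the first hop land on the guest disk.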
If you’re interested in this topic, I encourage you to search through the grub-devel mailing list archives to find discussions about the above issue, and about why it might or might not be a good idea to use the search option (what if some unprivileged user can mkdir boot in another partition?), which leads into another discussion about what to use instead. Also make sure to read the ideas of Colin about a pvgrub2 that would chainload a Xen grub from the domU.
So, what would be needed to create a pvgrub2 standalone image that can be used right now? In our setup at Mendix, we always use separate disk partitions which get attached as complete block devices to the virtual machine. Usually the boot/root partition is /dev/xvda, and probably there’s an extra partition that holds application data or database data on /dev/xvdb.
What would happen if I just set the root to /dev/xvda instead of searching for it?
    ~/tmp/pvgrub2 0-$ cat boot/grub/grub.cfg
    root='(xen/xvda)'
    configfile /boot/grub/grub.cfg
    ~/tmp/pvgrub2 0-$ grub-mkstandalone -o grub-x86_64-xen -O x86_64-xen boot/grub/grub.cfg
    ~/tmp/pvgrub2 0-$ tree
    .
    ├── boot
    │   └── grub
    │       └── grub.cfg
    └── grub-x86_64-xen

    2 directories, 2 files
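If the boot disk is not always xvda, the embedded configuration can be generated per device. The following is a small sketch; explicit_root_cfg is a hypothetical helper name, and the device argument is simply pasted into the same two lines shown above.

```shell
#!/bin/sh
# Sketch: emit the embedded grub.cfg for a fixed Xen block device.
# explicit_root_cfg is a hypothetical helper; the device defaults to xvda,
# which is the disk used in this post.
explicit_root_cfg() {
    dev=${1:-xvda}
    printf "root='(xen/%s)'\n" "$dev"
    printf 'configfile /boot/grub/grub.cfg\n'
}
```

Writing the output to boot/grub/grub.cfg, for example with explicit_root_cfg xvda > boot/grub/grub.cfg, gives the same input file for grub-mkstandalone as in the listing above.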
Tadaa.wav! I get presented with an actual GRUB menu and when choosing an option, or when waiting for the timeout, the right Linux kernel version gets booted.
After completing the previous step, I went on to search for the best way to get a relevant /boot/grub/grub.cfg file installed on all sorts of virtual machines that we run. Most of them are Debian Wheezy, some of them are still Debian Squeeze, and some testing virtual machines are running Debian Jessie.
I started by trying to backport the current grub2 package from unstable to wheezy. Unfortunately, I ended up in quite a maze of compilation and test run errors, which I couldn’t get out of quickly, because I don’t know the grub source that well, and because I had to revert all kinds of dependencies to the older versions of software in wheezy, which made tests break.
Installing some version of the grub-xen package would solve the issue of generating a simple grub.cfg menu file, but since the pvgrub2 binary which we’ll be using to boot the domU is already a standalone image which does not need any grub binaries installed in the domU, this starts to sound like a bit of overkill.
When I showed all of this to my colleague Pim the next day, he responded with quite an out-of-the-box idea, which helped me to find a solution. “When you’re building the pvgrub2 binary, you already include configuration. Why not just include some minimal contents in there instead of loading it from a grub.cfg inside the domU?”.
Mind… blown. Why not just create a dedicated pvgrub2 binary that will just load the kernel and initrd that are usually referenced by symlinks in the root filesystem? I named it the ‘fire ze missile’ configuration. It just assumes the symlinks are there, and will load everything, without any requirement for any grub installation or configuration inside the virtual machine!
    ~/src/git/pvgrub2 (master) 1-$ cat xvda-fire-ze-missile.cfg
    root='(xen/xvda)'
    insmod xzio
    insmod gzio
    insmod btrfs
    insmod ext2
    echo 'Loading Linux ...'
    linux /vmlinuz root=/dev/xvda ro
    echo 'Loading initial ramdisk ...'
    initrd /initrd.img
    boot
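Since this profile simply assumes that /vmlinuz and /initrd.img exist, it can be worth checking a guest before switching it over. The following is a minimal sketch of such a check; check_boot_symlinks is a hypothetical helper name. It is meant to be run inside the guest. When pointed at a guest root mounted elsewhere, absolute symlink targets would be resolved against the host instead, so the result is only reliable for relative symlinks in that case.

```shell
#!/bin/sh
# Sketch: verify that a root filesystem has the two symlinks the
# fire-ze-missile profile relies on. check_boot_symlinks is a hypothetical
# helper. $1 is the root to inspect and defaults to / (run inside the guest).
check_boot_symlinks() {
    rootfs=${1:-/}
    status=0
    for name in vmlinuz initrd.img; do
        path=${rootfs%/}/$name
        # -e follows symlinks, so a dangling link counts as missing.
        if [ -e "$path" ]; then
            echo "ok: $path"
        else
            echo "missing or dangling: $path"
            status=1
        fi
    done
    return "$status"
}
```

A non-zero exit status means at least one of the two names would not resolve, and the guest would not boot with this profile.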
That day, I created a Debian package that supports building a couple of different pvgrub2 images, containing different configuration profiles. Right now, there are two of them: the fire-ze-missile one as shown above, which just fires off a Linux kernel and initrd by looking at the symlinks in the root filesystem, and another one, which looks at the /boot/grub/grub.cfg configuration file inside the domU, and which is currently used in our network to boot experimental configurations that, for example, use a btrfs root filesystem which is itself located on a subvolume, etc…
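The following is only a sketch of what building several profiles in one go could look like, not the contents of the actual package. The directory layout, the image naming scheme, the function names and the GRUB_MKSTANDALONE override are all assumptions made for this illustration; the grub-mkstandalone invocation itself is the one used earlier in this post.

```shell
#!/bin/sh
# Sketch of a multi-profile build. Layout and naming are hypothetical:
#   <profile_dir>/<name>.cfg  ->  <out_dir>/grub-<name>-x86_64-xen

# Derive the image file name from a profile file name.
image_name() {
    echo "grub-$(basename "$1" .cfg)-x86_64-xen"
}

# Build one image; $2 must be an absolute path because of the cd below.
build_profile() {
    cfg=$1
    out=$2
    stage=$(mktemp -d) || return 1
    mkdir -p "$stage/boot/grub"
    cp "$cfg" "$stage/boot/grub/grub.cfg"
    rc=0
    # Same grub-mkstandalone invocation as earlier in the post, run from the
    # staging directory. GRUB_MKSTANDALONE is an override added for this sketch.
    (cd "$stage" && "${GRUB_MKSTANDALONE:-grub-mkstandalone}" \
        -o "$out" -O x86_64-xen boot/grub/grub.cfg) || rc=$?
    rm -rf "$stage"
    return "$rc"
}

# Build an image for every *.cfg file in a directory.
build_all() {
    profile_dir=$1
    out_dir=$2
    for cfg in "$profile_dir"/*.cfg; do
        [ -e "$cfg" ] || continue
        build_profile "$cfg" "$out_dir/$(image_name "$cfg")" || return 1
    done
}
```

Staging each profile as boot/grub/grub.cfg in a scratch directory keeps the profile files under descriptive names in the source tree, while every image still gets its configuration at the path the standalone image expects.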