Bootstrapping for a new architecture

Bootstrapping any system is a very arduous process, and I will not be making an attempt to address every part of the procedure, only what is relevant to Exherbo and Paludis. It will frustrate you, and you will want to mutilate your computer’s internal organs if you are serious about doing this. Can’t say you weren’t warned.

For the time being, bootstrapping is really only supported for systems that already have an operating system which can compile Linux and things such as coreutils, gcc, et cetera. The first Exherbo systems were started from taking the half-dead bodies of Gentoo and Debian systems and revitalizing them with our programs, so any bootstrapping experience so far has been from the expectation of a toolchain and userland already in place in some form. Bootstrapping Linux and userland in general isn’t in the scope of this documentation.

You are expected to understand our toolchain organization, the way that packages pick up on what toolchain programs to use, and things such as CHOSTs beforehand. Study the layout of the system and the toolchain packages for tips. You may also be able to catch a developer in #exherbo and ask for tips and other pointers, if they’re feeling particularly selfless that day.

NOTE: If no one has ever actually used Exherbo on this platform, you should first discuss this with some of the developers a little. Supporting a platform is a lot of extra work, and you have probably noticed that we are a fairly small group of developers. If you talk it out with us, we’ll likely be willing to help you support this platform and guide you along the way.

Things like adding a new PLATFORM to arbor and making new profiles should be discussed on IRC first; send in patches once you have it figured out.

So, good luck. Take breaks when frustrated, and have some drinks nearby.

Requirements

Given that you’d be a fool to try and bootstrap Exherbo on a system without a distribution of some sort actually on it, these are the requirements that your host system should fulfill:

Preparing the host system

For the sake of convenience, $CHOST should be set. You should also add the path that our compiled things will be in to your $PATH.

$ export CHOST=armv7-unknown-linux-gnueabihf
$ export PATH=/usr/${CHOST}/bin:${PATH}
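Since every later step relies on these two variables, it can be worth checking them before building anything by hand. The following is a minimal sketch; check_bootstrap_env is a hypothetical helper name, not something provided by Exherbo or Paludis.

```shell
# Hypothetical helper (not part of Exherbo or Paludis): sanity-check the
# environment set up above before installing anything manually.
check_bootstrap_env() {
    # CHOST must be set and non-empty.
    if [ -z "${CHOST:-}" ]; then
        echo "CHOST is not set" >&2
        return 1
    fi
    # The CHOST-specific bin directory must be the first PATH entry.
    case "${PATH}" in
        "/usr/${CHOST}/bin"|"/usr/${CHOST}/bin":*) ;;
        *)
            echo "/usr/${CHOST}/bin is not first in PATH" >&2
            return 1
            ;;
    esac
    echo "ok: CHOST=${CHOST}"
}
```

Running check_bootstrap_env in the shell you intend to build from prints a short confirmation, or an error on stderr with a non-zero exit status.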

  1. Eclectic. Paludis requires eclectic, and eclectic can be installed before Paludis. This is fairly straightforward: get the latest version and install it manually. The dependencies are very light, and you already have them.

    $ curl -O https://dev.exherbo.org/distfiles/eclectic/eclectic-2.0.14.tar.xz
    $ tar xf eclectic-2.0.14.tar.xz
    $ cd eclectic-2.0.14
    $ ./autogen.bash
    $ ./configure --build=${CHOST} --host=${CHOST}          \
                  --prefix=/usr/${CHOST}                    \
                  --bindir=/usr/${CHOST}/bin                \
                  --sbindir=/usr/${CHOST}/bin               \
                  --libdir=/usr/${CHOST}/lib                \
                  --datadir=/usr/share                      \
                  --datarootdir=/usr/share                  \
                  --docdir=/usr/share/doc/eclectic-2.0.14   \
                  --infodir=/usr/share/info                 \
                  --mandir=/usr/share/man                   \
                  --sysconfdir=/etc                         \
                  --localstatedir=/var/lib
    $ make
    $ make install
  2. Paludis. Since we’re on the SCM’s cross branch right now, you should fetch that and install it. It’s pretty unlikely your distribution will have packages for Paludis, but it’ll very likely have packages for its dependencies; here’s a short list for convenience: autoconf 2.5, automake 1.15, C++ standard libraries and a C++ compiler, asciidoc, xmlto, htmltidy, libmagic (comes with file), pcrecpp (usually just called pcre), eclectic, wget, and rsync.

    You must make sure that you specify --with-default-distribution=exherbo; remember that Paludis is a multi-format package manager and also targets Gentoo.

    $ git clone -b cross git://git.exherbo.org/paludis/paludis.git
    $ cd paludis
    $ ./autogen.bash

    Paludis should be statically linked so that we don’t screw ourselves over later when we start messing with libstdc++ and such. It could be done dynamically but it’s a lot of extra work and not worth it.

    $ ./configure --build=${CHOST} --host=${CHOST}          \
                  --prefix=/usr/${CHOST}                    \
                  --bindir=/usr/${CHOST}/bin                \
                  --sbindir=/usr/${CHOST}/bin               \
                  --libdir=/usr/${CHOST}/lib                \
                  --datadir=/usr/share                      \
                  --datarootdir=/usr/share                  \
                  --docdir=/usr/share/doc/paludis-scm       \
                  --infodir=/usr/share/info                 \
                  --mandir=/usr/share/man                   \
                  --sysconfdir=/etc                         \
                  --localstatedir=/var/lib                  \
                  --disable-dependency-tracking             \
                  --enable-fast-install                     \
                  --disable-doxygen                         \
                  --disable-gtest                           \
                  --disable-pbins                           \
                  --disable-python                          \
                  --disable-ruby                            \
                  --disable-search-index                    \
                  --disable-stripper                        \
                  --disable-vim                             \
                  --disable-xml                             \
                  --with-default-distribution=exherbo       \
                  --with-config-framework=eclectic          \
                  --with-repositories=all                   \
                  --with-environments=paludis               \
                  --with-clients=cave                       \
                  --enable-static

    Notice how we used /usr/${CHOST}/bin instead of /usr/bin? Make sure you keep doing that whenever you have to install stuff manually.

    $ make
    $ make install

    Paludis needs the paludisbuild user and group to actually build stuff. Make sure it’s in the tty group too.

    $ groupadd -g 443 paludisbuild 
    $ useradd -d /var/tmp/paludis -G tty -g paludisbuild -u 103 paludisbuild

    If all goes well, you now have some unholy combination of Paludis and your host system’s package manager; only one will come back alive.

  3. Reconfigure and recompile eclectic so it picks up on Paludis being installed now.

  4. Now that you’ve got that out of the way, it’s time to make the Paludis configs. These should do well for your host; adjust for the differences in profiles and such.

    Keep in mind that I bootstrapped Exherbo from a Raspberry Pi 2; these CFLAGS are what work for the Raspberry Pi 2, and you will have to change them if your host isn’t one. Binaries compiled with these CFLAGS use hardfloat and the floating-point hardware in the RPi2, so they may not run on ARMv7 machines that lack the same hardware.

    /etc/paludis/bashrc

    CHOST="armv7-unknown-linux-gnueabihf"
    armv7_unknown_linux_gnueabihf_CFLAGS="-pipe -Os -g -march=native -mcpu=cortex-a7 -mfloat-abi=hard -mfpu=neon-vfpv4"
    armv7_unknown_linux_gnueabihf_CXXFLAGS="-pipe -Os -g -march=native -mcpu=cortex-a7 -mfloat-abi=hard -mfpu=neon-vfpv4"
    export PATH="/usr/${CHOST}/bin:${PATH}"

    /etc/paludis/general.conf

    world = ${root}/var/db/paludis/repositories/installed/world

    /etc/paludis/licences.conf

    */* *

    /etc/paludis/options.conf

    */* targets: armv7-unknown-linux-gnueabihf
    */* build_options: jobs=4 -recommended_tests symbols=preserve
    */* providers: -* links libressl pkgconf gawk
    */* -python -ruby -perl

    Symbols aren’t stripped because your stripper might be broken until you have the system put together. Save yourself the pain and just preserve them. We disabled it earlier during Paludis’ installation anyway.

    And no, we’re not running tests either. Adds extra dependencies, usually takes too long, and bootstrapping already takes long enough.

    It’s better to keep the providers specified here; you can deviate after you have a comfortable system actually set up.

    /etc/paludis/platforms.conf

    */* armv7 ~armv7

    /etc/paludis/repository.template

    format = %{repository_template_format}
    location = /var/db/paludis/repositories/%{repository_template_name}
    sync = %{repository_template_sync}

    /etc/paludis/repositories/

    ./accounts.conf

    format = accounts

    ./arbor.conf

    location = ${root}/var/db/paludis/repositories/arbor
    sync = git+https://git.exherbo.org/arbor.git
    profiles = ${location}/profiles/armv7/linux/gnueabihf
    format = e
    names_cache = ${root}/var/cache/paludis/names
    write_cache = ${root}/var/cache/paludis/metadata

    ./graveyard.conf

    format = unwritten
    location = ${root}/var/db/paludis/repositories/graveyard
    sync = git+https://git.exherbo.org/graveyard.git
    importance = -90

    ./installed.conf

    location = ${root}/var/db/paludis/repositories/installed
    format = exndbam
    names_cache = ${root}/var/cache/paludis/names
    split_debug_location = /usr/armv7-unknown-linux-gnueabihf/lib/debug
    tool_prefix = armv7-unknown-linux-gnueabihf-

    ./installed_accounts.conf

    format = installed-accounts
    handler = passwd

    ./repository.conf

    format = repository
    config_filename = /etc/paludis/repositories/%{repository_template_name}.conf
    config_template = /etc/paludis/repository.template

    ./unavailable-unofficial.conf

    format = unavailable
    name = unavailable-unofficial
    sync = tar+https://git.exherbo.org/exherbo_unofficial_repositories.tar.bz2
    location = ${root}/var/db/paludis/repositories/unavailable-unofficial
    importance = -100

    ./unavailable.conf

    format = unavailable
    name = unavailable
    sync = tar+https://git.exherbo.org/exherbo_repositories.tar.bz2
    location = ${root}/var/db/paludis/repositories/unavailable
    importance = -100

    ./unwritten.conf

    format = unwritten
    location = ${root}/var/db/paludis/repositories/unwritten
    sync = git+https://git.exherbo.org/unwritten.git
    importance = -100

    Now that that’s taken care of, you can run cave sync a few times and it’ll yell at you for not having certain directories and not having certain write permissions. Fix them and it’ll shut up.

  5. Adjust for differences in host system vs. Exherbo system

    There’s a pretty good chance your host system does the toolchain and other things differently from how we do it. This step varies from host to host, and you may have to adapt it to how yours is set up.

    Toolchain paths

    If the host doesn’t prefix all the toolchain programs, you should make symlinks so Paludis can find them. It’s better to do this in a temporary directory out of the way of system files so you don’t accidentally mess something up later on.

    Here’s how I did it on an Arch Linux ARM host:

    • Made symlinks for ar, as, cc, c++, cpp, gcc, g++, ld, nm, objcopy, objdump, pkg-config, ranlib, readelf:

      $ ls -lF /tmp/makeshift-tools
      lrwxrwxrwx 1 root root 11 Jun 29 00:05 armv7-unknown-linux-gnueabihf-ar -> /usr/bin/gcc-ar*
      lrwxrwxrwx 1 root root 11 Jun 29 00:05 armv7-unknown-linux-gnueabihf-as -> /usr/bin/as*
      lrwxrwxrwx 1 root root 11 Jun 29 00:05 armv7-unknown-linux-gnueabihf-c++ -> /usr/bin/g++*
      lrwxrwxrwx 1 root root 11 Jun 29 00:05 armv7-unknown-linux-gnueabihf-cc -> /usr/bin/gcc*
      lrwxrwxrwx 1 root root 11 Jun 29 00:05 armv7-unknown-linux-gnueabihf-cpp -> /usr/bin/cpp*
      lrwxrwxrwx 1 root root 11 Jun 29 00:05 armv7-unknown-linux-gnueabihf-g++ -> /usr/bin/g++*
      lrwxrwxrwx 1 root root 11 Jun 29 00:05 armv7-unknown-linux-gnueabihf-gcc -> /usr/bin/gcc*
      lrwxrwxrwx 1 root root 11 Jun 29 00:05 armv7-unknown-linux-gnueabihf-ld -> /usr/bin/ld*
      lrwxrwxrwx 1 root root 11 Jun 29 00:05 armv7-unknown-linux-gnueabihf-nm -> /usr/bin/nm*
      lrwxrwxrwx 1 root root 11 Jun 29 00:05 armv7-unknown-linux-gnueabihf-objcopy -> /usr/bin/objcopy*
      lrwxrwxrwx 1 root root 11 Jun 29 00:05 armv7-unknown-linux-gnueabihf-objdump -> /usr/bin/objdump*
      lrwxrwxrwx 1 root root 11 Jun 29 00:05 armv7-unknown-linux-gnueabihf-pkg-config -> /usr/bin/pkg-config*
      lrwxrwxrwx 1 root root 11 Jun 29 00:05 armv7-unknown-linux-gnueabihf-ranlib -> /usr/bin/ranlib*
      lrwxrwxrwx 1 root root 11 Jun 29 00:05 armv7-unknown-linux-gnueabihf-readelf -> /usr/bin/readelf*
    • Added an extra PATH to the Paludis bashrc; PATH="/tmp/makeshift-tools:${PATH}"

  6. A few extra configuration files:

    /etc/env.d/00basic

    PATH="/opt/bin"
    LDPATH="/usr/local/lib"
    MANPATH="/usr/local/share/man:/usr/share/man"
    INFOPATH="/usr/share/info"
    CVS_RSH="ssh"
    PAGER="/usr/host/bin/less"
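For the makeshift toolchain symlinks in step 5, a loop along the following lines could save some typing. This is only a sketch: make_tool_links is a hypothetical helper name, and the three special cases (ar, cc, c++) simply mirror the Arch Linux ARM listing shown in that step, so they may need adjusting on other hosts.

```shell
# Hypothetical helper: create CHOST-prefixed symlinks to unprefixed host tools.
# Usage: make_tool_links CHOST DEST_DIR HOST_BIN_DIR
make_tool_links() {
    chost=$1 dest=$2 hostbin=$3
    mkdir -p "${dest}"
    for tool in ar as cc c++ cpp gcc g++ ld nm objcopy objdump pkg-config ranlib readelf; do
        # Most tools map one-to-one; these three follow the listing in step 5.
        case "${tool}" in
            ar)  target=gcc-ar ;;
            cc)  target=gcc ;;
            c++) target=g++ ;;
            *)   target=${tool} ;;
        esac
        # -f replaces an existing link, so the function can be re-run safely.
        ln -sf "${hostbin}/${target}" "${dest}/${chost}-${tool}"
    done
}
```

For the setup described above, the call would be make_tool_links "${CHOST}" /tmp/makeshift-tools /usr/bin. Keeping the links in one directory makes them easy to remove again once the real toolchain is installed.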

Installing the system set

All packages should be installed in this order. Any dependencies which are needed should be resolved manually, which is painful. You’ll already have some of the dependencies, but we will be re-installing some of these since we want to make packages that we build not care about what the host has. It’s going to be messy.

You will have to add your platform to packages if they are masked by platform.

All packages should be installed with cave resolve -1z -0 '*/*' -x <package> so that we can tell cave to ignore dependencies and not do anything regarding updating other packages.
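Typing the same options for every package gets tedious. A minimal sketch of a wrapper follows; bootstrap_install and the CAVE override variable are hypothetical names introduced here for illustration and are not part of Paludis.

```shell
# Hypothetical wrapper around the cave invocation given above.
# Set CAVE=echo for a dry run that only prints the arguments.
bootstrap_install() {
    # $1 is the package; any further arguments are passed through to cave,
    # for example: --skip-until-phase merge
    pkg=$1
    shift
    "${CAVE:-cave}" resolve -1z -0 '*/*' -x "${pkg}" "$@"
}
```

With CAVE=echo set, the function prints the arguments it would pass instead of running cave, which is a cheap way to check a command before running it for real.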

Let’s begin.

Now, remove any binutils symlinks created earlier. You can use cave executables binutils to see what can be removed.

Next, glibc and gcc actually need glibc to compile, so you’ll have to copy over its headers and libraries from the host to /usr/${CHOST}/include and /usr/${CHOST}/lib. There are a lot of files that glibc actually provides, so rather than list what needs to be copied, I’ll provide the commands I used on my Arch Linux ARM host as an example.

$ mkdir -p /usr/${CHOST}/include
$ cp -nv $(pacman -Qlq glibc | grep 'lib/lib.*\.so') /usr/${CHOST}/lib/
$ for dir in $(pacman -Qlq glibc | grep 'include/.*/$'); do newdir=$(echo "$dir" | sed "s#/usr/include#/usr/${CHOST}/include#"); cp -vr "$dir" "$newdir"; done
$ pacman -Qlq glibc | grep 'include/.*\.h$' | grep -v 'include/.*/' | xargs cp -vt /usr/${CHOST}/include

At this point, you should remove any gcc-related symlinks, since we will be using the toolchain that we have compiled, instead of the host’s toolchain. You shouldn’t remove the host’s copy of the libraries though, or else things like gcc will fail because they can’t find the libraries they were linked with.

glibc needs the .o files from the host before we compile, or else it will fail soon after you start.

$ cp -nv /usr/lib/*.o /usr/${CHOST}/lib

If glibc is failing with a sunrpc/cross-rpcgen error, that’s because gcc is expecting the ld.so to exist at the host’s correct multiarch location; you need to copy the linker from the host system to /usr/${CHOST}/lib/ld-linux-armhf.so.3, or whichever linker the output of file /var/tmp/paludis/build/sys-libs-glibc-*/work/build/sunrpc/cross-rpcgen gives you.

$ cp -v /usr/lib/ld-* /usr/${CHOST}/lib
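To find out which linker a binary such as cross-rpcgen is asking for, the path can be pulled out of the output of file. The sketch below assumes the "interpreter /path," wording that recent file(1) releases print for dynamically linked ELF binaries; elf_interpreter is a hypothetical helper name, and it prints nothing if that wording is absent.

```shell
# Hypothetical helper: read file(1) output on stdin and print the dynamic
# linker the binary requests, assuming a ", interpreter /path," field.
elf_interpreter() {
    sed -n 's/.*, interpreter \([^,]*\),.*/\1/p'
}

# Intended use (not executed here):
#   file /var/tmp/paludis/build/sys-libs-glibc-*/work/build/sunrpc/cross-rpcgen | elf_interpreter
```

The printed path tells you which file name has to exist under /usr/${CHOST}/lib before resuming the build.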

Resume compiling glibc with --skip-until-phase compile.

Now, you’ll see some really scary errors from cave at the end of the install because you just installed a libstdc++ that it wasn’t linked with. You can ignore these. Since we compiled Paludis with --enable-static, the gods have not forsaken us.

coreutils will fail at the merging because you just overwrote env, so copy the host’s env binary to /usr/${CHOST}/bin. Then, run cave resolve -1z -0 '*/*' -x coreutils --skip-until-phase merge.

After this, cave will also complain that it couldn’t get the mtime for something in /var/db/paludis/repositories/installed/data/sys-apps---coreutils/; this is left over from the failed merge, so just remove the directory it mentions.

Now, you can remove the rest of the symlinks from earlier, if there are any. The toolchain is now entirely bootstrapped and we don’t need the host’s programs for building. Good work.

Now that we have a system without any recursive dependencies or cyclic dependencies, we can proceed with the wonderful luxury of dependency resolution.

Do cave resolve -1zx sys-apps/paludis.

Now that you have a fairly minimal, but self-hosting system, we can start cannibalizing the host’s stuff and remove/backup the binaries from the host. This includes making the system’s filesystem layout look like a regular cross system. Instead of removing the folders though, we’ll just rename them so we have backups if something goes wrong.

Be really careful with the commands below; a typo could mean having to repair the filesystem layout from another machine.
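Before running the renames, a pre-flight check can confirm that the essentials already exist under /usr/${CHOST}/bin. This is a minimal sketch; missing_tools is a hypothetical helper name, and the list of tools to check is up to you.

```shell
# Hypothetical pre-flight check: report tools missing from a directory.
# Usage: missing_tools DIR TOOL...
# Returns non-zero if at least one tool is absent or not executable.
missing_tools() {
    dir=$1
    shift
    status=0
    for tool in "$@"; do
        if [ ! -x "${dir}/${tool}" ]; then
            echo "missing: ${dir}/${tool}"
            status=1
        fi
    done
    return "${status}"
}
```

For example, missing_tools "/usr/${CHOST}/bin" ls mv ln bash should print nothing and return 0 before you move the host’s directories out of the way.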

$ ln -s ${CHOST} /usr/host
$ export PATH="/usr/host/bin"
$ mv /bin /oldbin
$ mv /sbin /oldsbin
$ mv /lib /oldlib
$ mv /usr/bin /usr/oldbin
$ mv /usr/sbin /usr/oldsbin
$ mv /usr/lib /usr/oldlib
$ mv /usr/include /usr/oldinclude

If at this point you can still run things like ls, mv, bash, and other things, this means everything is correct and you managed to bootstrap without any of the packages depending on the host’s things. Yay!

If you didn’t… something went wrong. Make sure you followed all the steps earlier correctly.

$ cd /usr
$ ln -s host/bin bin
$ ln -s host/include include
$ ln -s host/lib lib
$ ln -s host/libexec libexec
$ ln -s host/sbin sbin
$ cd host
$ ln -s bin sbin
$ cd /
$ ln -s usr/host/bin bin
$ ln -s usr/host/sbin sbin
$ ln -s usr/host/lib lib

skeleton-filesystem-layout is happy now, so we can do cave resolve -cx world and start repairing things that break there rather than resolving manually. It should be fairly straightforward from here on out. If you run into dependency cycles with pciutils, util-linux, etc. just disable [udev] and [systemd] for them, compile systemd, and then enable the options again.

Once you’re done with that, clean up the configuration files that are waiting on you in eclectic config, and start combing through leftover files from the host that should be removed, using cave print-unmanaged-files. Don’t blindly delete everything that it lists, since it’ll print some config files and other things that you shouldn’t get rid of. It will take a while to make the list, since it’s scanning all the directories under /.

At this point you pretty much have an unconfigured system that is like the stages. You should follow some of the configuration steps (locales, hostnames, etc.) listed in the install guide.

You’re now done bootstrapping, and you have a fully functional Exherbo system installed. Have fun. :) You may wish to create a stage using make-exherbo-stages from infra-scripts for distributing to others.


Copyright 2015 Kylie McClain