Reboots
Sometimes it is necessary to reboot hosts, for example when the
kernel is updated. Prometheus will warn about this with the
NeedsReboot alert, which looks like:
Servers running trixie needs to reboot
A newer kernel may have been released between the last apt update and the apt metrics refresh. So before running reboots, make sure all servers are up to date and have the latest kernel downloaded:
cumin '*' 'apt-get update && unattended-upgrades -v && systemctl start tpa-needrestart-prometheus-metrics.service'
Note that the above triggers an update of the Prometheus metrics, but
those metrics need to be scraped before the list of hosts from the fab
command below is fully up to date, so wait a minute or two before
launching that command to get the full list of hosts.
You can see the list of pending reboots with this Fabric task:
fab fleet.pending-reboots
See below for how to handle specific situations.
Full fleet reboot
This is the most likely scenario, especially when we were able to upgrade all of the servers to the same stable release of Debian.
In this case, the fastest way to run reboots is to reboot the Ganeti nodes with all of their contained instances, to clear out reboots for many servers at once, and then reboot the hosts that are not in Ganeti.
The fleet.reboot-fleet command will tell you whether it's worth it,
and might eventually be able to orchestrate the entire reboot on its
own. For now, this reboot is only partly automated.
Note that to make the reboots run more smoothly, you can temporarily modify your yubikey touch policy to remove the need to always confirm by touching the key.
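For example, if your key is managed with ykman, the touch policy on the OpenPGP authentication slot (typically the one used for SSH) can be temporarily relaxed. This is a sketch, assuming a ykman-managed YubiKey; adjust the slot and policy to your setup:

```shell
# Temporarily disable the touch requirement on the OpenPGP
# authentication slot, then restore it once the reboots are done.
# Assumes a ykman-managed key; adjust the slot to your configuration.
ykman openpgp keys set-touch aut off
# ... run the reboots ...
ykman openpgp keys set-touch aut on
```

Remember to restore the policy afterwards, as leaving touch confirmation disabled weakens the key's protection.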
So, typically, you'd do a Ganeti fleet reboot, then reboot remaining nodes. See below.
Testing reboots
A good reflex is to first reboot a single "canary" host as a test:
fab -H idle-fsn-01.torproject.org fleet.reboot-host
Rebooting Ganeti nodes
See the Ganeti reboot procedures for this procedure. Essentially, you run those two batches in parallel, paying close attention to the host list:
gnt-dal cluster:

fab -H dal-node-03.torproject.org,dal-node-02.torproject.org,dal-node-01.torproject.org fleet.reboot-host --no-ganeti-migrate

gnt-fsn cluster:

fab -H fsn-node-08.torproject.org,fsn-node-07.torproject.org,fsn-node-06.torproject.org,fsn-node-05.torproject.org,fsn-node-04.torproject.org,fsn-node-03.torproject.org,fsn-node-02.torproject.org,fsn-node-01.torproject.org fleet.reboot-host --no-ganeti-migrate
You want to avoid rebooting multiple mirrors at once. Ideally, the
fleet.reboot-fleet script would handle this for you, but it
doesn't right now. This can be done ad hoc: reboot the host, and pay
attention to which instances are rebooted. If too many mirrors are
rebooted at once, you can abort before the timeout
(control-c) and cancel the reboot by rerunning the reboot-host command
with the --kind cancel flag.
Note that the above assumes only two clusters are present; the host list might have changed since this documentation was written.
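As a sketch, canceling a pending reboot looks like this (the hostname is illustrative; use the node you scheduled the reboot on):

```shell
# Cancel a previously scheduled reboot on a Ganeti node; this also
# cascades down to the instances hosted on that node.
fab -H fsn-node-01.torproject.org fleet.reboot-host --kind cancel
```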
Remaining nodes
The Karma alert dashboard will, after a while, show remaining hosts that might have been missed by the above procedure, but you can get ahead of that by detecting physical hosts that are not covered by the Ganeti reboots with:
curl -s -G http://localhost:6785/pdb/query/v4 --data-urlencode 'query=inventory[certname] { facts.virtual = "physical" }' | jq -r '.[].certname' | grep -v -- -node- | sort
The above assumes you have the local "Cumin hack" to automatically forward port 6785 to PuppetDB's localhost:8080; otherwise, use this:
ssh -n -L 6785:localhost:8080 puppetdb-01.torproject.org &
You can also look for the virtual machines outside of Ganeti clusters:
ssh db.torproject.org \
"ldapsearch -H ldap://db.torproject.org -x -ZZ -b 'ou=hosts,dc=torproject,dc=org' \
'(|(physicalHost=hetzner-cloud)(physicalHost=safespring))' hostname \
| grep ^hostname | sed 's/hostname: //'"
You can list both with this LDAP query:
ssh db.torproject.org 'ldapsearch -H ldap://db.torproject.org -x -ZZ -b "ou=hosts,dc=torproject,dc=org" "(!(physicalHost=gnt-*))" hostname' | sed -n '/hostname/{s/hostname: //;p}' | grep -v ".*-node-[0-9]\+\|^#" | paste -sd ','
This, for example, will reboot all of those hosts in series:
fab -H $(ssh db.torproject.org 'ldapsearch -H ldap://db.torproject.org -x -ZZ -b "ou=hosts,dc=torproject,dc=org" "(!(physicalHost=gnt-*))" hostname' | sed -n '/hostname/{s/hostname: //;p}' | grep -v ".*-node-[0-9]\+\|^#" | paste -sd ',') fleet.reboot-host
We show how to list those hosts separately because you can also
reboot a select number of hosts in parallel with the
fleet.reboot-parallel command, which requires more thought about
which hosts to reboot than a normal, serial reboot does.
Do not reboot the entire fleet or all hosts blindly with the
reboot-parallel method: with a large number of hosts, the
interleaved output can be pretty confusing, and it may also reboot
multiple components that are redundant mirrors of each other, which
we try to avoid.
The reboot-parallel command works a little differently than other
reboot commands because the instances are passed as an argument. Here
are two examples:
fab fleet.reboot-parallel --instances ci-runner-x86-14.torproject.org,tb-build-03.torproject.org,dal-rescue-01.torproject.org,cdn-backend-sunet-02.torproject.org,hetzner-nbg1-01.torproject.org
The above is safe because there's only a handful (5) of servers and they don't have overlapping tasks (they're not mirrors of each other).
Rebooting a single host
If this is only a virtual machine, and the only one affected, it can
be rebooted directly. This can be done with the fabric-tasks task
fleet.reboot-host:
fab -H test-01.torproject.org,test-02.torproject.org fleet.reboot-host
By default, the script will wait 2 minutes between hosts; that should
be changed to 30 minutes if the hosts are part of a mirror network,
to give the monitoring system (mini-nag) time to rotate the hosts
in and out of DNS:
fab -H mirror-01.torproject.org,mirror-02.torproject.org fleet.reboot-host --delay-hosts 1800
If the host has an encrypted filesystem and is hooked up with Mandos, it
will come back up automatically. Otherwise, it might need a password
entered at boot time, either through the initramfs (if it has the
profile::fde class in Puppet) or manually, after the boot. That is
the case for the mandos-01 server itself, for example, as it
naturally can't unlock itself.
Note that you can cancel a reboot with --kind=cancel. This also
cascades down Ganeti nodes.
Batch rebooting multiple hosts
NOTE: this section has somewhat bit-rotten. It's kept only to document
the rebootPolicy but, in general, you should do a fleet-wide reboot
or single-host reboots.
IMPORTANT: before following this procedure, make sure that only a subset of the hosts need a restart. If all hosts need a reboot, it's likely going to be faster and easier to reboot the entire clusters at once, see the Ganeti reboot procedures instead.
NOTE: Reboots will tend to stop for user confirmation whenever packages get upgraded just before the reboot. To prevent the process from waiting for your manual input, it is suggested that upgrades are run first, using cumin. See how to run upgrades in the section above.
LDAP hosts have information about how they can be rebooted, in the
rebootPolicy field. Here are what the various fields mean:
justdoit - can be rebooted any time, with a 10 minute delay, possibly in parallel
rotation - part of a cluster where each machine needs to be rebooted one at a time, with a 30 minute delay for DNS to update
manual - needs to be done by hand or with a special tool (Fabric in the case of Ganeti, reboot-host in the case of KVM, nothing for Windows boxes)
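To check which policy applies to a given host, you can query LDAP directly, following the same pattern as the other queries in this document (the hostname here is illustrative):

```shell
# Show the rebootPolicy attribute for a single host
ssh db.torproject.org \
  "ldapsearch -H ldap://db.torproject.org -x -ZZ -b 'ou=hosts,dc=torproject,dc=org' \
   -LLL '(hostname=web-fsn-02.torproject.org)' rebootPolicy"
```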
Therefore, it's possible to selectively reboot some of those hosts in batches. Again, this is pretty rare: typically, you would either reboot only a single host or all hosts, in which case a cluster-wide reboot (with Ganeti, below) would be more appropriate.
This routine should be able to reboot all hosts with a rebootPolicy
defined to justdoit or rotation:
echo "rebooting 'justdoit' hosts with a 10-minute delay, every 2 minutes...."
fab -H $(ssh db.torproject.org 'ldapsearch -H ldap://db.torproject.org -x -ZZ -b ou=hosts,dc=torproject,dc=org -LLL "(rebootPolicy=justdoit)" hostname | awk "\$1 == \"hostname:\" {print \$2}" | sort -R' | paste -sd ',') fleet.reboot-host --delay-shutdown-minutes=10 --delay-hosts-seconds=120
echo "rebooting 'rotation' hosts with a 10-minute delay, every 30 minutes...."
fab -H $(ssh db.torproject.org 'ldapsearch -H ldap://db.torproject.org -x -ZZ -b ou=hosts,dc=torproject,dc=org -LLL "(rebootPolicy=rotation)" hostname | awk "\$1 == \"hostname:\" {print \$2}" | sort -R' | paste -sd ',') fleet.reboot-host --delay-shutdown-minutes=10 --delay-hosts-seconds=1800
As another example, this will reboot all hosts running Debian
bookworm, in random order:
fab -H $(ssh puppetdb-01.torproject.org "curl -s -G http://localhost:8080/pdb/query/v4 --data-urlencode 'query=inventory[certname] { facts.os.distro.codename = \"bookworm\" }'" | jq -r '.[].certname' | sort -R | paste -sd ',') fleet.reboot-host
And this will reboot all hosts with a pending kernel upgrade (note that this fact is updated only when the Puppet agent runs), again in random order:
fab -H $(ssh puppetdb-01.torproject.org "curl -s -G http://localhost:8080/pdb/query/v4 --data-urlencode 'query=inventory[certname] { facts.apt_reboot_required = true }'" | jq -r '.[].certname' | sort -R | paste -sd ',') fleet.reboot-host
And this will reboot all physical hosts with a pending reboot, in alphabetical order:
fab -H $(ssh puppetdb-01.torproject.org "curl -s -G http://localhost:8080/pdb/query/v4 --data-urlencode 'query=inventory[certname] { facts.apt_reboot_required = true and facts.virtual = \"physical\" }'" | jq -r '.[].certname' | sort | paste -sd ',') fleet.reboot-host
Userland reboots
systemd 254 (Debian 13 trixie and above) has a special command:
systemctl soft-reboot
That will "shut down and reboot userspace". As the manual page explains:
systemd-soft-reboot.service is a system service that is pulled in by soft-reboot.target and is responsible for performing a userspace-only reboot operation. When invoked, it will send the SIGTERM signal to any processes left running (but does not follow up with SIGKILL, and does not wait for the processes to exit). If the /run/nextroot/ directory exists (which may be a regular directory, a directory mount point or a symlink to either) then it will switch the file system root to it. It then reexecutes the service manager off the (possibly now new) root file system, which will enqueue a new boot transaction as in a normal reboot.
This can therefore be used to fix conditions where systemd itself needs to be restarted, or a lot of processes need to, but not the kernel.
This has not been tested, but could speed up some restart conditions.
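Since the command only exists in systemd 254 and later, a small guard like the following (a sketch, untested on our fleet) avoids failing on hosts still running an older release:

```shell
# Attempt a userspace-only reboot, but only if systemd is new
# enough (soft-reboot requires systemd >= 254, i.e. Debian trixie).
ver=$(systemctl --version | awk 'NR==1 {print $2}')
if [ "$ver" -ge 254 ]; then
    systemctl soft-reboot
else
    echo "systemd $ver does not support soft-reboot" >&2
fi
```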
Notifying users
Users should be notified when rebooting hosts. Normally, the
shutdown(1) command noisily prints warnings on terminals, which
gives a heads-up to connected users, but many users are not watching
interactive terminals. It is therefore important to notify users over
our chat rooms (currently IRC).
The reboot script can send notifications when rebooting hosts. For
that, credentials must be supplied, either through the HTTP_USER and
HTTP_PASSWORD environment variables, or (preferably) through a ~/.netrc
file. The file should look something like this:
machine kgb-bot.torproject.org login TPA password REDACTED
The password (REDACTED in the above line) is available on the bot
host (currently chives) in
/etc/kgb-bot/kgb.conf.d/client-repo-TPA.conf or in trocla, with the
profile::kgb_bot::repo::TPA.
To confirm this works before running reboots, run this Fabric task directly:
fab kgb.relay "test"
For example:
anarcat@angela:fabric-tasks$ fab kgb.relay "mic check"
INFO: mic check
... should result in:
16:16:26 <KGB-TPA> mic check
When rebooting, the users will see this in the #tor-admin channel:
13:13:56 <KGB-TPA> scheduled reboot on host web-fsn-02.torproject.org in 10 minutes
13:24:56 <KGB-TPA> host web-fsn-02.torproject.org rebooted
A heads up should be (manually) relayed in the #tor-project channel,
inviting users to follow that progress in #tor-admin.
Ideally, we would have a map of where each server should send
notifications. For example, the tb-build-* servers should notify
#tor-browser-dev. This would require a rather more convoluted
configuration, as each KGB "account" is bound to a single channel for
the moment...