Commit 1e93fc1

Merge pull request #463 from aanderse/switch_root
Add switch_root support for initramfs to real root transitions
2 parents d7fd5bc + 8e7d1b7 commit 1e93fc1

15 files changed

+653
-11
lines changed


doc/features.md

Lines changed: 23 additions & 0 deletions
@@ -288,4 +288,27 @@ Rescue mode can be disabled at build time with `configure --without-rescue`.
288288
See the [Rescue Mode](config/rescue.md) section for more information.
289289

290290

291+
**Switch Root**
292+
293+
Finit supports switching from an initramfs to a real root filesystem using
294+
the built-in `initctl switch-root` command. This allows Finit to serve as
295+
the init system in an initramfs for early boot tasks (LUKS unlock, LVM
296+
activation, network boot) before transitioning to the real root.
297+
298+
```bash
299+
# In initramfs, after mounting the real root:
300+
initctl switch-root /mnt/root
301+
```
302+
303+
The switch-root operation:
304+
305+
1. Runs the `HOOK_SWITCH_ROOT` hook for cleanup
306+
2. Stops all services gracefully
307+
3. Moves virtual filesystems (`/dev`, `/proc`, `/sys`, `/run`) to the new root
308+
4. Deletes initramfs contents to free memory
309+
5. Moves the new root to `/`, chroots into it, and execs the new init
310+
311+
See the [Switch Root](switchroot.md) section for complete documentation.
312+
313+
291314
[5]: https://en.wikipedia.org/wiki/Runlevel
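
Step 4 of the switch-root operation only makes sense when the old root lives in RAM. A minimal sketch of such a check, assuming a hypothetical helper named `is_ram_backed()` that is not taken from Finit's source, could compare the filesystem type reported by `statfs(2)` against the ramfs and tmpfs magic numbers:

```c
#include <sys/vfs.h>      /* statfs() */
#include <linux/magic.h>  /* RAMFS_MAGIC, TMPFS_MAGIC */

/*
 * Hypothetical helper, not taken from Finit's source.
 * Returns 1 if path lives on ramfs or tmpfs, 0 if not, -1 on error.
 */
int is_ram_backed(const char *path)
{
	struct statfs sfs;
	unsigned long type;

	if (statfs(path, &sfs))
		return -1;

	/* f_type is signed on some ABIs, so compare as unsigned */
	type = (unsigned long)sfs.f_type;

	return type == RAMFS_MAGIC || type == TMPFS_MAGIC;
}
```

A caller would typically run this on `/` before deleting anything, and skip the cleanup when the result is not 1.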

doc/index.md

Lines changed: 1 addition & 0 deletions
@@ -39,6 +39,7 @@ Features
3939
* [Cgroups v2](config/cgroups.md), both configuration and monitoring in [`initctl top`](initctl.md)
4040
* [Plugin support](plugins.md) for customization
4141
* Proper [rescue mode](config/rescue.md) with bundled `sulogin` for protected maintenance shell
42+
* [Switch root](switchroot.md) support for initramfs-to-real-root transitions
4243
* Integration with [watchdogd][] for full system supervision
4344
* [Logging](config/logging.md) to kernel ring buffer before `syslogd` has started, see the
4445
recommended [sysklogd][] project for complete logging integration

doc/initctl.md

Lines changed: 22 additions & 0 deletions
@@ -65,6 +65,7 @@ Commands:
6565
halt Halt system
6666
poweroff Halt and power off system
6767
suspend Suspend system
68+
switch-root NEWROOT [INIT] Switch to new root filesystem (initramfs only)
6869
6970
utmp show Raw dump of UTMP/WTMP db
7071
```
@@ -143,3 +144,24 @@ Apr 8 12:37:46 alpine authpriv.info dropbear[2300]: Exit (root) from <192.168.1
143144
Apr 8 15:02:11 alpine authpriv.info dropbear[2634]: Child connection from 192.168.121.1:48576
144145
Apr 8 15:02:12 alpine authpriv.notice dropbear[2634]: Password auth succeeded for 'root' from 192.168.121.1:48576
145146
```
147+
148+
149+
Switch Root
150+
-----------
151+
152+
The `switch-root` command is used when running Finit in an initramfs to
153+
transition to the real root filesystem. This is similar to the standalone
154+
`switch_root(8)` utility but integrated into Finit.
155+
156+
```
157+
initctl switch-root /mnt/root [/sbin/init]
158+
```
159+
160+
Requirements:
161+
162+
- Must be run during runlevel S (bootstrap) or runlevel 1
163+
- The new root must be a mount point (on a different device than `/`)
164+
- Can only be used when Finit is running as PID 1 in an initramfs
165+
166+
For complete documentation and usage examples, see the dedicated
167+
[Switch Root](switchroot.md) section.
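
The runlevel requirement can be sketched as a small predicate. This mirrors the condition in the `src/api.c` hunk of this commit; the numeric value given to `INIT_LEVEL` here is an assumption for illustration, and `switch_root_allowed()` is a hypothetical name:

```c
/* Assumption: runlevel S is represented by this numeric constant. */
#define INIT_LEVEL 10

/* Hypothetical helper: 1 if switch-root may proceed, otherwise 0. */
int switch_root_allowed(int runlevel)
{
	return runlevel == INIT_LEVEL || runlevel == 1;
}
```

Any other runlevel, for example 2, is rejected before the transition starts.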

doc/plugins.md

Lines changed: 10 additions & 0 deletions
@@ -205,6 +205,16 @@ hook points:
205205
new runlevel have been stopped. When the hook has completed,
206206
Finit continues to start all services in the new runlevel.
207207

208+
### Switch Root Hooks
209+
210+
* `HOOK_SWITCH_ROOT`, `hook/sys/switchroot`: Called when
211+
`initctl switch-root` is issued, before the transition begins. Use
212+
this hook to save state, unmount initramfs-only filesystems, or perform
213+
cleanup before switching to the new root. Only runs when Finit is
214+
operating as PID 1 in an initramfs.
215+
216+
See the [Switch Root](switchroot.md) section for more information.
217+
208218
### Shutdown Hooks
209219

210220
* `HOOK_NETWORK_DN`, `hook/net/down`: Called right after having changed

doc/switchroot.md

Lines changed: 148 additions & 0 deletions
@@ -0,0 +1,148 @@
1+
Switch Root
2+
===========
3+
4+
Finit supports switching from an initramfs to a real root filesystem
5+
using the `initctl switch-root` command. This is useful for systems
6+
that use an initramfs for early boot (LUKS, LVM, network boot, etc.)
7+
and need to transition to the real root before starting services.
8+
9+
10+
Usage
11+
-----
12+
13+
```sh
14+
initctl switch-root NEWROOT [INIT]
15+
```
16+
17+
- `NEWROOT`: Path to the mounted new root filesystem (e.g., `/mnt/root`)
18+
- `INIT`: Optional path to init on the new root (default: `/sbin/init`)
19+
20+
21+
Requirements
22+
------------
23+
24+
1. Must be run during runlevel S (bootstrap) or runlevel 1
25+
2. `NEWROOT` must be a mount point (on a different device than `/`)
26+
3. `INIT` must exist and be executable on the new root
27+
4. Finit must be running as PID 1 (in initramfs)
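
Requirements 2 and 3 can be checked with plain POSIX calls. The following is a sketch under stated assumptions, not Finit's actual validation code: both helper names are hypothetical, and the mount point test is approximated by comparing `st_dev` of `NEWROOT` with that of `/`.

```c
#include <stdio.h>     /* snprintf() */
#include <sys/stat.h>  /* stat(), S_ISDIR() */
#include <unistd.h>    /* access() */

/* Hypothetical helper: 1 if newroot is a directory on another device than "/". */
int on_other_device(const char *newroot)
{
	struct stat root, cand;

	if (stat("/", &root) || stat(newroot, &cand))
		return -1;
	if (!S_ISDIR(cand.st_mode))
		return 0;

	return cand.st_dev != root.st_dev;
}

/* Hypothetical helper: 1 if init exists below newroot and is executable. */
int init_is_executable(const char *newroot, const char *init)
{
	char path[4096];
	int len;

	len = snprintf(path, sizeof(path), "%s%s", newroot, init);
	if (len < 0 || (size_t)len >= sizeof(path))
		return -1;

	return access(path, X_OK) == 0;
}
```

With `/mnt/root` mounted from a separate device, `on_other_device("/mnt/root")` would return 1, and `init_is_executable("/mnt/root", "/sbin/init")` would confirm that the default init is present.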
28+
29+
30+
How It Works
31+
------------
32+
33+
1. Runs `HOOK_SWITCH_ROOT` for any cleanup scripts/plugins
34+
2. Runs `HOOK_SHUTDOWN` to notify plugins
35+
3. Stops all services and kills remaining processes
36+
4. Exits all plugins gracefully
37+
5. Moves `/dev`, `/proc`, `/sys`, `/run` to new root
38+
6. Deletes initramfs contents (if on tmpfs/ramfs) to free memory
39+
7. Moves new root mount to `/`
40+
8. Chroots to new root
41+
9. Reopens `/dev/console` for stdin/stdout/stderr
42+
10. Execs new init as PID 1
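
Steps 5 and 7 through 10 above follow the classic `switch_root(8)` pattern. The sketch below illustrates that pattern with standard Linux system calls. It is not Finit's `switch_root()` implementation: the function names are hypothetical, hooks, service shutdown and initramfs cleanup are left out, and the privileged part only works as root, as PID 1, with an initramfs as `/`.

```c
#define _DEFAULT_SOURCE   /* for chroot() with glibc */
#include <fcntl.h>        /* open() */
#include <stdio.h>        /* snprintf() */
#include <sys/mount.h>    /* mount(), MS_MOVE */
#include <unistd.h>       /* chdir(), chroot(), dup2(), execv() */

/* Join newroot and a mount such as "/dev" into buf; 0 on success, -1 if too long. */
int target_path(char *buf, size_t len, const char *newroot, const char *mnt)
{
	int n = snprintf(buf, len, "%s%s", newroot, mnt);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Illustrative only. Returns -1 on failure, does not return on success. */
int sketch_switch_root(const char *newroot, char *init)
{
	static const char *mounts[] = { "/dev", "/proc", "/sys", "/run" };
	char *argv[] = { init, NULL };
	char dst[4096];
	size_t i;
	int fd;

	/* Step 5: move the virtual filesystems below the new root */
	for (i = 0; i < sizeof(mounts) / sizeof(mounts[0]); i++) {
		if (target_path(dst, sizeof(dst), newroot, mounts[i]))
			return -1;
		if (mount(mounts[i], dst, NULL, MS_MOVE, NULL))
			return -1;
	}

	/* Steps 7-8: move the new root over "/" and chroot into it */
	if (chdir(newroot) || mount(newroot, "/", NULL, MS_MOVE, NULL))
		return -1;
	if (chroot(".") || chdir("/"))
		return -1;

	/* Step 9: reopen the console for stdin, stdout and stderr */
	fd = open("/dev/console", O_RDWR);
	if (fd >= 0) {
		dup2(fd, STDIN_FILENO);
		dup2(fd, STDOUT_FILENO);
		dup2(fd, STDERR_FILENO);
		if (fd > STDERR_FILENO)
			close(fd);
	}

	/* Step 10: replace this process with the new init */
	execv(init, argv);
	return -1;
}
```

Using `MS_MOVE` plus `chroot()` rather than `pivot_root()` is the usual choice here, since the initial rootfs of an initramfs cannot be pivoted away from.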
43+
44+
45+
Example: Initramfs finit.conf
46+
-----------------------------
47+
48+
Configuration file `/etc/finit.conf` in the initramfs:
49+
50+
```
51+
# /etc/finit.conf in initramfs
52+
53+
# Mount the real root filesystem
54+
run [S] name:mount-root /bin/mount /dev/sda1 /mnt/root -- Mounting root filesystem
55+
56+
# Switch to real root after mount completes
57+
run [S] name:switch-root /sbin/initctl switch-root /mnt/root -- Switching to real root
58+
```
59+
60+
For more complex setups (LUKS, LVM, etc.):
61+
62+
```
63+
# Unlock LUKS volume
64+
run [S] name:cryptsetup /sbin/cryptsetup open /dev/sda2 cryptroot -- Unlocking encrypted root
65+
66+
# Activate LVM
67+
run [S] name:lvm /sbin/lvm vgchange -ay -- Activating LVM volumes
68+
69+
# Mount root
70+
run [S] name:mount-root /bin/mount /dev/vg0/root /mnt/root -- Mounting root
71+
72+
# Switch root
73+
run [S] name:switch-root /sbin/initctl switch-root /mnt/root -- Switching to real root
74+
```
75+
76+
77+
Example: Using Runlevel 1 for Switch Root
78+
-----------------------------------------
79+
80+
For more complex initramfs setups where ordering of tasks becomes
81+
difficult in runlevel S, you can perform the switch-root in runlevel 1:
82+
83+
```
84+
# /etc/finit.conf in initramfs
85+
86+
# Start mdevd for device handling
87+
service [S] name:mdevd notify:s6 /sbin/mdevd -D %n -- Device event daemon
88+
run [S] name:coldplug <service/mdevd/ready> /sbin/mdevd-coldplug -- Coldplug devices
89+
90+
# Mount the real root filesystem (after devices are ready)
91+
run [S] name:mount-root <run/coldplug/success> /bin/mount /dev/sda1 /mnt/root -- Mounting root
92+
93+
# Transition to runlevel 1 after all S tasks complete
94+
# The switch-root runs cleanly in runlevel 1
95+
run [1] name:switch-root /sbin/initctl switch-root /mnt/root -- Switching to real root
96+
```
97+
98+
This approach separates the initramfs setup (runlevel S) from the
99+
switch-root operation (runlevel 1), making task ordering simpler.
100+
101+
102+
Hooks
103+
-----
104+
105+
The `HOOK_SWITCH_ROOT` hook runs before the switch begins. Use it for:
106+
107+
- Saving state to the new root
108+
- Unmounting initramfs-only mounts
109+
- Cleanup tasks
110+
111+
Plugins can register for `HOOK_SWITCH_ROOT` just like other hooks:
112+
113+
```c
114+
static void my_switch_root_hook(void *arg)
115+
{
116+
/* Cleanup before switch_root */
117+
}
118+
119+
static plugin_t plugin = {
120+
.name = "my-plugin",
121+
.hook[HOOK_SWITCH_ROOT] = {
122+
.cb = my_switch_root_hook
123+
}
124+
};
125+
126+
PLUGIN_INIT(plugin_init)
127+
{
128+
plugin_register(&plugin);
129+
}
130+
```
131+
132+
133+
Conditions
134+
----------
135+
136+
After switch_root, the new finit instance starts fresh. No conditions
137+
or state are preserved across the switch. The new finit will:
138+
139+
1. Re-read `/etc/finit.conf` from the new root
140+
2. Re-initialize all conditions
141+
3. Start services according to the new configuration
142+
143+
144+
See Also
145+
--------
146+
147+
- [switch_root(8)](https://man7.org/linux/man-pages/man8/switch_root.8.html) - util-linux switch_root utility
148+
- [Kernel initramfs documentation](https://docs.kernel.org/filesystems/ramfs-rootfs-initramfs.html)

mkdocs.yml

Lines changed: 1 addition & 0 deletions
@@ -79,6 +79,7 @@ nav:
7979
- Limitations: config/limitations.md
8080
- Usage:
8181
- Commands & Status: initctl.md
82+
- Switch Root: switchroot.md
8283
- Rebooting & Halting: commands.md
8384
- Command Line Options: cmdline.md
8485
- Managing Services: service.md

src/Makefile.am

Lines changed: 1 addition & 0 deletions
@@ -63,6 +63,7 @@ finit_SOURCES = api.c cgroup.c cgroup.h \
6363
exec.c finit.c finit.h \
6464
stty.c \
6565
helpers.c helpers.h \
66+
initramfs.c \
6667
iwatch.c iwatch.h \
6768
log.c log.h \
6869
mdadm.c mount.c \

src/api.c

Lines changed: 53 additions & 0 deletions
@@ -245,6 +245,46 @@ static void bypass_shutdown(void *unused)
245245
do_shutdown(halt);
246246
}
247247

248+
/*
249+
* Handle switch_root API command.
250+
* Parses data: "newroot\0newinit\0"
251+
* Sends ACK before attempting switch_root since it doesn't return on success.
252+
* Returns: result from switch_root() on failure, doesn't return on success.
253+
*/
254+
static int do_switch_root_api(int sd, struct init_request *rq)
255+
{
256+
char *newroot, *newinit = NULL;
257+
char *ptr;
258+
int result;
259+
260+
dbg("switch-root %s", rq->data);
261+
strterm(rq->data, sizeof(rq->data));
262+
263+
newroot = rq->data;
264+
ptr = strchr(newroot, '\0');
265+
if (ptr && ptr < rq->data + sizeof(rq->data) - 1) {
266+
ptr++;
267+
if (*ptr)
268+
newinit = ptr;
269+
}
270+
271+
/*
272+
* Send ACK first, since we won't return from
273+
* switch_root() on success.
274+
*/
275+
rq->cmd = INIT_CMD_ACK;
276+
if (write(sd, rq, sizeof(*rq)) != sizeof(*rq))
277+
dbg("Failed sending ACK to client");
278+
close(sd);
279+
280+
/* This does not return on success */
281+
result = switch_root(newroot, newinit);
282+
if (result)
283+
logit(LOG_ERR, "switch_root failed: %s", strerror(errno));
284+
285+
return result;
286+
}
287+
248288
static int do_reboot(int cmd, int timeout, char *buf, size_t len)
249289
{
250290
int rc = 1;
@@ -403,6 +443,15 @@ static void api_cb(uev_t *w, void *arg, int events)
403443
rq.cmd, rq.data);
404444
goto leave;
405445
}
446+
break;
447+
448+
case INIT_CMD_SWITCH_ROOT:
449+
if (runlevel != INIT_LEVEL && runlevel != 1) {
450+
warnx("switch-root only allowed in runlevel S or 1");
451+
goto done;
452+
}
453+
break;
454+
406455
default:
407456
break;
408457
}
@@ -503,6 +552,10 @@ static void api_cb(uev_t *w, void *arg, int events)
503552
result = do_reboot(rq.cmd, rq.sleeptime, rq.data, sizeof(rq.data));
504553
break;
505554

555+
case INIT_CMD_SWITCH_ROOT:
556+
do_switch_root_api(sd, &rq);
557+
goto leave;
558+
506559
case INIT_CMD_ACK:
507560
dbg("Client failed reading ACK");
508561
goto leave;
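
The `"newroot\0newinit\0"` payload that `do_switch_root_api()` parses can be illustrated in isolation. The sketch below is a standalone approximation, not the code used by `initctl` or Finit: `pack_switch_root()` and `unpack_switch_root()` are hypothetical names, and the buffer stands in for the `data` field of the request.

```c
#include <stddef.h>  /* size_t */
#include <string.h>  /* memchr(), memcpy(), strlen() */

/* Client side (hypothetical): pack "newroot\0newinit\0" into buf. */
int pack_switch_root(char *buf, size_t len, const char *newroot, const char *newinit)
{
	size_t rlen = strlen(newroot) + 1;
	size_t ilen = newinit ? strlen(newinit) + 1 : 1;

	if (rlen + ilen > len)
		return -1;

	memcpy(buf, newroot, rlen);
	if (newinit)
		memcpy(buf + rlen, newinit, ilen);
	else
		buf[rlen] = 0;	/* empty second string means "use the default" */

	return 0;
}

/* Server side (hypothetical): split buf back into its two strings. */
int unpack_switch_root(char *buf, size_t len, char **newroot, char **newinit)
{
	char *end;

	*newroot = NULL;
	*newinit = NULL;
	if (len == 0)
		return -1;

	buf[len - 1] = 0;		/* never read past the buffer */
	end = memchr(buf, 0, len);	/* terminator of newroot */
	*newroot = buf;
	if (end + 1 < buf + len && end[1])
		*newinit = end + 1;

	return buf[0] ? 0 : -1;
}
```

Forcing the last byte to NUL before scanning plays the same role as the `strterm()` call in the hunk above: a malformed request cannot make the parser run off the end of the buffer.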

src/cgroup.c

Lines changed: 32 additions & 6 deletions
@@ -34,6 +34,8 @@
3434
#endif
3535
#include <sys/mount.h>
3636
#include <sys/sysinfo.h> /* get_nprocs_conf() */
37+
#include <sys/vfs.h>
38+
#include <linux/magic.h>
3739

3840
#include "cgroup.h"
3941
#include "finit.h"
@@ -841,6 +843,7 @@ void cgroup_config(void)
841843
void cgroup_init(uev_ctx_t *ctx)
842844
{
843845
int opts = MS_NODEV | MS_NOEXEC | MS_NOSUID;
846+
int mounted = 0;
844847
char buf[80];
845848
FILE *fp;
846849
int fd;
@@ -851,14 +854,36 @@ void cgroup_init(uev_ctx_t *ctx)
851854
#endif
852855

853856
if (mount("none", FINIT_CGPATH, "cgroup2", opts, NULL)) {
854-
if (errno == ENOENT)
857+
if (errno == EBUSY) {
858+
/*
859+
* Already mounted - this happens after switch_root
860+
* when cgroups were moved from the initramfs.
861+
* Verify it's actually cgroup2 before proceeding.
862+
*/
863+
struct statfs sfs;
864+
865+
if (statfs(FINIT_CGPATH, &sfs) || sfs.f_type != CGROUP2_SUPER_MAGIC) {
866+
logit(LOG_ERR, "Mount point %s busy but not cgroup2", FINIT_CGPATH);
867+
avail = 0;
868+
return;
869+
}
870+
dbg("cgroup2 already mounted at %s, reusing", FINIT_CGPATH);
871+
} else if (errno == ENOENT) {
855872
logit(LOG_INFO, "Kernel does not support cgroups v2, disabling.");
856-
else if (errno == EPERM) /* Probably inside an unprivileged container */
873+
avail = 0;
874+
return;
875+
} else if (errno == EPERM) {
876+
/* Probably inside an unprivileged container */
857877
logit(LOG_INFO, "Not allowed to mount cgroups v2, disabling.");
858-
else
878+
avail = 0;
879+
return;
880+
} else {
859881
err(1, "Failed mounting cgroup v2");
860-
avail = 0;
861-
return;
882+
avail = 0;
883+
return;
884+
}
885+
} else {
886+
mounted = 1;
862887
}
863888
avail = 1;
864889

@@ -867,7 +892,8 @@ void cgroup_init(uev_ctx_t *ctx)
867892
if (!fp) {
868893
err(1, "Failed opening %s", FINIT_CGPATH "/cgroup.controllers");
869894
abort:
870-
umount(FINIT_CGPATH);
895+
if (mounted)
896+
umount(FINIT_CGPATH);
871897
avail = 0;
872898
return;
873899
}
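
The decision logic this hunk adds to `cgroup_init()` can be summarized as a pure function. This is a simplified model for illustration, with hypothetical names throughout. It takes the result of `mount(2)`, the resulting `errno`, and the outcome of the `statfs()` check against `CGROUP2_SUPER_MAGIC` as inputs, rather than performing those calls itself.

```c
#include <errno.h>  /* EBUSY */

/* Hypothetical outcome codes, not part of Finit. */
enum cg_state { CG_DISABLED, CG_MOUNTED, CG_REUSED };

/*
 * mount_rc:   return value of mount(2)
 * err:        errno after a failed mount
 * is_cgroup2: nonzero if statfs() reported CGROUP2_SUPER_MAGIC
 */
enum cg_state cgroup_mount_outcome(int mount_rc, int err, int is_cgroup2)
{
	if (mount_rc == 0)
		return CG_MOUNTED;	/* we mounted it, so we may unmount it */
	if (err == EBUSY && is_cgroup2)
		return CG_REUSED;	/* inherited via switch_root, leave it alone */

	return CG_DISABLED;		/* ENOENT, EPERM, or anything else */
}

/* Only a mount made by this process should be undone on the abort path. */
int should_umount_on_abort(enum cg_state state)
{
	return state == CG_MOUNTED;
}
```

The second helper corresponds to the new `mounted` flag: a cgroup2 hierarchy carried over from the initramfs is left in place even if a later setup step fails.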
