Compare commits

5 Commits

Author SHA1 Message Date
Daniel Barlow 07b92b5df3 more thought 2023-05-17 15:38:22 +01:00
Daniel Barlow aa3b635f61 bordervm: add sshd, usbutils 2023-05-17 15:38:22 +01:00
Daniel Barlow 648ac2eb7f Document jffs2, min-copy-closure, liminix-rebuild
Some of the code is now out of date w.r.t. some of the text
2023-05-17 15:38:22 +01:00
Daniel Barlow b1f4db00a0 add liminix-rebuild command 2023-05-17 15:38:22 +01:00
Daniel Barlow b0a0fdcfcc add "standard" module, which includes flashimage kexec & jffs2
most systems need most of these, so it makes writing the docs a
lot easier
2023-05-17 15:38:22 +01:00
13 changed files with 381 additions and 98 deletions

View File

@ -1533,3 +1533,148 @@ Sun Apr 23 18:24:34 BST 2023
- rotuer is not recognising when I set the hostname
- I may have forgotten the root password :-(
- why is hello world 70K unless hardeningDisable?
Fri Apr 28 20:51:52 BST 2023
To do nix-copy-closure we need nix-store, which is a symlink to nix,
which is
-rwxr-xr-x 1 dan users 2.3M Apr 28 21:08 nix
(stripped). This is a lot bigger than, say, a simple script to
loop through the closure of a derivation and copy only the store
folders that don't exist already.
* we'd like to only transmit the packages that aren't already present
* we'd like to use a single ssh connection
S: here is a list of package names
C: these are the names of the packages I want
S: here are the packages
while read -r f ; do
  test -d "$f" || echo "$f"
done
Tue May 2 21:53:08 BST 2023
1) we have a script that runs on the receiver, which
- accepts a list of store paths
- prints the missing store paths
- runs cpio -i < stdin
2) we need a script for the sender that
- refs=$(nix-store -q --references $1 && echo end)
- opens ssh connection
- print ssh $refs
- needed= capture result until "end" received
- find needed | cpio -o > ssh-connection
- close connection
3) to have a reasonable hope of testing this we should do it with qemu. It would be nice
if we could connect without faff to the qemu lan interface : either we do this by bringing up
another qemu vm (preferably with the host store shared, otherwise it has to build a mips cross
compiler/libc) or maybe we could do something unholy with ssh ProxyCommand
ssh -o ProxyCommand="socat - UDP4-DATAGRAM:230.0.0.1:1234,sourceport=1234,reuseaddr,ip-add-membership=230.0.0.1:127.0.0.1"
4) we haven't solved garbage collection, though I think "remove everything not in
nix-path-registration" might be what's needed there
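A minimal sketch of the receiver side from (1), under stated assumptions: the function name `print_missing_paths` is hypothetical, the sender is assumed to terminate its list with a line containing only `end` (as in step 2), and `-e` is used instead of `-d` so that store paths which are plain files are also recognised. It is not the actual min-copy-closure script.

```shell
# Hypothetical receiver-side helper: read store paths, one per line, until a
# line containing only "end"; print each path that does not already exist
# on this machine, then print "end" so the sender knows the reply is complete.
print_missing_paths() {
    local path
    while IFS= read -r path; do
        [ "$path" = "end" ] && break
        [ -e "$path" ] || printf '%s\n' "$path"
    done
    printf 'end\n'
}
```

After this returns, the same stdin could be handed to `cpio -i`, which is what lets the whole exchange share one connection.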
Wed May 3 22:01:19 BST 2023
Something weird is going on with qemu net device enumeration: when I
run it interactively I'm getting the access network (mac ending :02)
on eth0 and the lan (mac ending :01) on eth1, and if it's behaving the
same in CI then how come any of the tests work? vanilla-configuration.nix
definitely assumes lan=eth0
By switching from -device virtio-net-pci to -device virtio-net then
I get the desired behaviour back
Sat May 6 18:42:28 BST 2023
Next:
- package min-copy-closure
- see if we can use it on some output to copy the whole system closure
- post-copying symlink munging
- try it on a real device, see if it works for config file updates
- collect-garbage/delete-old-generation
Sun May 7 23:03:03 BST 2023
Shortly after all the work to reduce system closure size last time, I
tried adding the necessary packages to support nix-copy-closure and
saw it start building a complete C++ system with Boost. My fears that
this would lead to quite a large increase in the system size were, it
turned out, entirely founded.
So I wrote my own - or at least, a quite minimal substitute. The core
logic is simple - on the sender, we get the list of required packages,
then we check for the existence of `/nix/store/eeeeeee-foo` for
each of them on the target, and whatever's missing we send across the
link using cpio.
It sounds simple, and it should be simple, and in retrospect it _was_
simple. Along the way I went on a bit of a Qemu networking tangent and
learned quite a lot about the bash `coproc` command.
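For readers who have not met `coproc`, here is a small sketch of the two-way conversation it allows, assuming bash 4.1 or later. The co-process here is a stand-in that upper-cases its input; in the real tool it would presumably be the ssh connection, but that part is an assumption and is not shown.

```shell
# Start a co-process standing in for the remote end: it upper-cases each
# line it receives and writes it back.
coproc REMOTE { while IFS= read -r line; do printf '%s\n' "${line^^}"; done; }
remote_pid=$REMOTE_PID      # save now: bash unsets REMOTE_PID when it exits
to_remote=${REMOTE[1]}      # fd connected to the co-process's stdin
from_remote=${REMOTE[0]}    # fd connected to the co-process's stdout

# Send one request and read one reply over the same pair of pipes.
printf 'hello\n' >&"$to_remote"
IFS= read -r reply <&"$from_remote"
printf 'reply: %s\n' "$reply"

# Closing our write end gives the co-process EOF, so its loop ends.
exec {to_remote}>&-
wait "$remote_pid"
```

The point of the construct is that the script can both write to and read from one long-lived process, which is what a list/reply/archive exchange over a single connection needs.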
Tue May 9 21:06:53 BST 2023
General direction of my thoughts:
- get a baseline working rotuer system
- prove that min-copy-closure works with it
- refactor the crap out of it
- configurablise the bordervm usb ethernet setup
- when we have a good idea of how/whether min-copy-closure *actually*
works, declare "writeable filesystem" to be done
- start to get more of a feel for how the services/config hang together
? why does rotuer not have a hostname?
? how can we get a device hooked up to rotuer's lan port that we can
control remotely
Sun May 14 23:25:46 BST 2023
the outputs.systemConfiguration attribute builds a derivation
containing a single file bin/activate
_Presumably_, copying its closure will copy all the things, as
we already use it as the roots for jffs2 creation. However, there
is also a symlink created from /init at jffs2 creation
Mon May 15 21:32:38 BST 2023
Had a neat idea about using an overlayfs combining jffs2 and ramfs
to do upgrades that would otherwise be larger than the flash.
Could use "overlay merge" from https://github.com/kmxz/overlayfs-tools
Wed May 17 15:18:55 BST 2023
liminix-rebuild doesn't collect garbage (this is a missing feature, not
a bug). We think we can fix this using nix-path-registration: specifically,
by deleting anything not in it.
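A rough sketch of the "anything not in it" idea, under loud assumptions: the function name `list_unregistered` is hypothetical, and the registration file is assumed to contain each path to keep on a line by itself (the real file's format is not shown in these notes). It only prints candidates and deletes nothing.

```shell
# list_unregistered STORE_DIR KEEP_FILE
# Print every entry directly under STORE_DIR whose full path does not appear
# as a complete line in KEEP_FILE. Nothing is removed.
list_unregistered() {
    local store_dir=$1 keep_file=$2 entry
    for entry in "$store_dir"/*; do
        [ -e "$entry" ] || continue    # empty directory: the glob did not match
        grep -qxF -- "$entry" "$keep_file" || printf '%s\n' "$entry"
    done
}
```

Printing first and deleting in a separate step seems the safer order on a device where a wrong deletion means a reflash.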
What we're going to do: build a fresh system image for rotuer, then
dogfood liminix-rebuild until we've succeeded in getting it to
change its hostname
Also wondering if we should drop outputs.default, but maybe not
* systemConfiguration: used for updates
* vmroot: used for qemu
* flashimage: used for flashing
* tftproot: used for dev/test
As long as we're consistently setting the default output to whichever
is the appropriate "full production image" I think we're good.

View File

@ -75,6 +75,7 @@ in {
ExecStart = "${pkgs.tufted}/bin/tufted /home/liminix/liminix";
};
};
services.openssh.enable = true;
systemd.services.sshd.wantedBy = pkgs.lib.mkForce [ "multi-user.target" ];
virtualisation = {
@ -104,6 +105,7 @@ in {
socat
tufted
iptables
usbutils
];
security.sudo.wheelNeedsPassword = false;
networking = {

View File

@ -149,6 +149,7 @@ module(s) it uses. I have this segment in configuration.nix which you
may be able to adapt:
.. code-block:: nix
boot = {
kernelParams = [ "intel_iommu=on" ];
kernelModules = [

View File

@ -2,31 +2,101 @@ User Manual
###########
This manual is an early work in progress, not least because Liminix is
not yet ready for users who are not also developers.
not yet really ready for users who are not also developers. Your
feedback to improve it is very welcome.
Configuring for your use case
*****************************
Installation
************
You need to create a ``configuration.nix`` that describes your router
The Liminix installation process is not quite like installing NixOS on
a real computer, but some NixOS experience will nevertheless be
helpful in understanding it. The steps are as follows:
* Decide whether you want the device to be updatable in-place (there
are advantages and disadvantages), or if you are happy to generate
and flash a new image whenever changes are required.
* Create a :file:`configuration.nix` describing the system you want
* Build an image
* Flash it to the device
Choosing a flavour (read-only or updatable)
===========================================
Liminix installations come in two "flavours": read-only or in-place
updatable:
* a read-only installation can't be updated once it is flashed to your
device, and so must be reinstalled in its entirety every time you
want to change it. It uses the ``squashfs`` filesystem which has
very good compression ratios and so you can pack quite a lot of
useful stuff onto your device. This is good if you don't expect
to change it often.
* an updatable installation has a writable filesystem so that you can
update configuration, upgrade packages and install new packages over
the network after installation. This uses the `jffs2
<http://www.linux-mtd.infradead.org/doc/jffs2.html>`_ filesystem:
although it does compress the data, the need to support writes means
that it can't pack quite as small as squashfs, so you will not have
as much space to play with.
Updatability caveats
~~~~~~~~~~~~~~~~~~~~
At the time of writing this manual the read-only squashfs support is
much more mature. Consider also that it may not be possible to perform
"larger" updates in-place even if you do opt for updatability. If you
have (for example) an 11MB system on a 16MB device, you won't be able
to do an in-place update of something fundamental like the C library
(libc), as this will temporarily require 22MB to install all the
packages needing the new library before the packages using the old
library can be removed. A writable system will be more useful for
smaller updates such as installing a new package (perhaps you
temporarily need tcpdump to diagnose a network problem) or for
changing configuration files.
Note also that the kernel is not part of the filesystem so cannot be
updated this way. Kernel changes require a full reflash.
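The space arithmetic in the 11MB-on-16MB example can be written down as a small check. This is only a worst-case sketch: the function name `fits_in_place` is hypothetical, sizes are whole megabytes, and it assumes the old and new package sets share nothing, as in the libc example.

```shell
# fits_in_place FLASH_MB OLD_MB NEW_MB
# Succeed if the old and new packages can both be on flash at the same time,
# which is what an in-place update needs before the old ones are removed.
fits_in_place() {
    local flash=$1 old=$2 new=$3
    [ $(( old + new )) -le "$flash" ]
}

# The example from the text: 11MB installed, 11MB of replacements, 16MB flash.
if fits_in_place 16 11 11; then
    echo "update fits"
else
    echo "update needs a reflash"
fi
```

A small change such as adding one package passes the same check easily, for instance `fits_in_place 16 11 1`.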
Creating configuration.nix
==========================
You need to create a ``configuration.nix`` that describes your device
and the services that you want to run on it. Start by copying
``vanilla-configuration.nix`` and adjusting it, or look in the `examples`
directory for some pre-written configurations.
If you want to create a configuration that can be installed on
a hardware device, be sure to include the "flashimage" module.
``configuration.nix`` conventionally describes the packages, services,
user accounts etc of the device. It does not describe the hardware
itself, which is specified separately in the build command (as you
will see below).
.. code-block: nix
Your configuration may include modules: it probably *should*
include the ``standard`` module unless you understand what it
does and what happens if you leave it out.
imports = [
./modules/flashimage.nix
]
.. code-block:: nix
imports = [
./modules/standard.nix
]
configuration.rootfsType = "jffs2"; # or "squashfs"
Building and flashing
*********************
Building
========
An example command to build Liminix might look like this:
Build Liminix using the :file:`default.nix` in the project toplevel
directory, passing it arguments for configuration and hardware. For
example:
.. code-block:: console
@ -49,52 +119,8 @@ is a raw image file that can be written directly to the firmware flash
partition.
Flashing with :command:`flashcp`
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This requires an existing Liminix system, or perhaps some other
operating system on the device which provides the :command:`flashcp`
command. You need to locate the "firmware" partition, which you can do
with a combination of :command:`dmesg` output and the contents of
:file:`/proc/mtd`
**Don't do this on a device that's running on the same flash partition
as you're about to overwrite, otherwise you're likely to crash it. Use
kexecboot (see "Updates to running devices" below) first to reboot
into a RAM-based system.**
.. code-block:: console
<5>[ 0.469841] Creating 4 MTD partitions on "spi0.0":
<5>[ 0.474837] 0x000000000000-0x000000040000 : "u-boot"
<5>[ 0.480796] 0x000000040000-0x000000050000 : "u-boot-env"
<5>[ 0.487056] 0x000000050000-0x000000060000 : "art"
<5>[ 0.492753] 0x000000060000-0x000001000000 : "firmware"
# cat /proc/mtd
dev: size erasesize name
mtd0: 00040000 00001000 "u-boot"
mtd1: 00010000 00001000 "u-boot-env"
mtd2: 00010000 00001000 "art"
mtd3: 00fa0000 00001000 "firmware"
mtd4: 002a0000 00001000 "kernel"
mtd5: 00d00000 00001000 "rootfs"
Then you can copy the image to the device with :command:`ssh`
.. code-block:: console
build-machine$ tar chf - result/firmware.bin | \
ssh root@the-device tar -C /run -xvf -
and then connect to the device and run
.. code-block:: console
flashcp -v firmware.bin /dev/mtd3
Flashing
========
Flashing from OpenWrt (untested)
@ -107,7 +133,7 @@ If your device is running OpenWrt then it probably has the
mtd -r write /tmp/firmware_image.bin firmware
For more information, please see the `OpenWrt manual <https://openwrt.org/docs/guide-user/installation/sysupgrade.cli>`_
For more information, please see the `OpenWrt manual <https://openwrt.org/docs/guide-user/installation/sysupgrade.cli>`_ which may also contain (hardware-dependent) instructions on how to flash an image using the vendor firmware - perhaps even from a web interface.
Flashing from the boot monitor
@ -115,20 +141,26 @@ Flashing from the boot monitor
If you are prepared to open the device and have a TTL serial adaptor
of some kind to connect it to, you can probably flash it using U-Boot.
This is quite hardware-specific: please refer to the Developer Manual.
This is quite hardware-specific, and sometimes involves soldering:
please refer to the Developer Manual.
Updates to running devices
**************************
Flashing from an existing Liminix system with :command:`flashcp`
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To mitigate the risk of flashing a new configuration and potentially
rendering the device unresponsive if the configuration is unbootable or
doesn't bring up a network device, Liminix has a
"try before write" mode.
Flashing from an existing Liminix system is a two-step procedure.
First we reboot the device (using "kexec") into an "ephemeral"
RAM-based version of the new configuration, then when we're happy it
works we can flash the image - and if it doesn't work we can reboot
the device again and it will boot from the old image.
To test a configuration without writing it to flash, import the
``kexecboot`` module and build ``outputs.kexecboot`` instead of
Building the RAM-based image
............................
To create the ephemeral image, build ``outputs.kexecboot`` instead of
``outputs.default``. This generates a directory containing the root
filesystem image and kernel, along with an executable called `kexec`
and a `boot.sh` script that runs it with appropriate arguments.
@ -156,29 +188,111 @@ reboot - be sure to close all open files and finish anything else
you were doing first.*
If the new system crashes or is rebooted, then the device will revert
to the old configuration it finds in flash. Thus, by combining kexec
boot with a hardware watchdog you can try new images with very little
chance of bricking anything. When you are happy that the new
configuration is correct, build and flash a flashable image of it.
to the old configuration it finds in flash.
Building the second (permanent) image
.....................................
While running in the kexecboot system, you can copy the permanent
image to the device with :command:`ssh`
.. code-block:: console
build-machine$ tar chf - result/firmware.bin | \
ssh root@the-device tar -C /run -xvf -
Next you need to connect to the device and locate the "firmware"
partition, which you can do with a combination of :command:`dmesg`
output and the contents of :file:`/proc/mtd`
.. code-block:: console
<5>[ 0.469841] Creating 4 MTD partitions on "spi0.0":
<5>[ 0.474837] 0x000000000000-0x000000040000 : "u-boot"
<5>[ 0.480796] 0x000000040000-0x000000050000 : "u-boot-env"
<5>[ 0.487056] 0x000000050000-0x000000060000 : "art"
<5>[ 0.492753] 0x000000060000-0x000001000000 : "firmware"
# cat /proc/mtd
dev: size erasesize name
mtd0: 00040000 00001000 "u-boot"
mtd1: 00010000 00001000 "u-boot-env"
mtd2: 00010000 00001000 "art"
mtd3: 00fa0000 00001000 "firmware"
mtd4: 002a0000 00001000 "kernel"
mtd5: 00d00000 00001000 "rootfs"
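The lookup in the listing above can also be scripted. The following is a sketch only: the function name `mtd_device_for` is hypothetical, it assumes the four-column layout shown above, and it would not handle partition names containing spaces.

```shell
# mtd_device_for NAME [FILE]
# Print the /dev/mtdN path whose quoted name in FILE (default /proc/mtd)
# equals NAME; exit non-zero if no partition has that name.
mtd_device_for() {
    local name=$1 file=${2:-/proc/mtd}
    awk -v want="\"$name\"" '
        $NF == want { sub(/:$/, "", $1); print "/dev/" $1; found = 1; exit }
        END { exit !found }
    ' "$file"
}
```

On the device in this example, `mtd_device_for firmware` would print `/dev/mtd3`, so the flashing step could be written as `flashcp -v firmware.bin "$(mtd_device_for firmware)"`.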
Now run (in this example)
.. code-block:: console
flashcp -v firmware.bin /dev/mtd3
"I know my new image is good, can I skip the intermediate step?"
````````````````````````````````````````````````````````````````
In addition to giving you a chance to see if the new image works, this
two-step process ensures that you're not copying the new image over
the top of the active root filesystem. It might work, or it might
crash in surprising ways.
Module options (tbd)
**************
Updating an installed system (JFFS2)
************************************
Adding packages
===============
If your device is running a JFFS2 root filesystem, you can build
extra packages for it on your build system and copy them to the
device: any package in Nixpkgs or in the Liminix overlay is available
with the ``pkgs`` prefix:
.. code-block:: console
nix-build -I liminix-config=./my-configuration.nix \
--arg device "import ./devices/mydevice" -A pkgs.tcpdump
nix-shell -p min-copy-closure --run "min-copy-closure root@the-device result/"
Note that this only copies the package to the device: it doesn't update
any profile to add it to ``$PATH``
Rebuilding the system
=====================
:command:`liminix-rebuild` is the Liminix analogue of :command:`nixos-rebuild`, although its operation is a bit different because it expects to run on a build machine and then copy to the host device. Run it with the same ``liminix-config`` and ``device`` parameters as you would run :command:`nix-build`, and it will build any new/changed packages and then copy them to the device using SSH. For example:
.. code-block:: console
liminix-rebuild root@the-device -I liminix-config=./examples/rotuer.nix --arg device "import ./devices/gl-ar750"
Caveats
~~~~~~~
* it needs enough free space on the device for all the new
packages in addition to all the packages already on it, which may be
a problem if a lot of things have changed (e.g. a new version of
nixpkgs).
* it cannot upgrade the kernel, only userland
* it reboots the device!
Foo module
==========
Configuration Options
*********************
Module docs will go here. This part of the doc should be autogenerated.
Bar module
==========
Baz module
==========
Quuz net device
===============

View File

@ -31,10 +31,8 @@ in rec {
};
imports = [
../modules/tftpboot.nix
../modules/standard.nix
../modules/wlan.nix
../modules/flashimage.nix
../modules/kexecboot.nix
];
hostname = "arhcive";

View File

@ -34,8 +34,7 @@ in rec {
imports = [
../modules/wlan.nix
../modules/tftpboot.nix
../modules/flashimage.nix
../modules/standard.nix
];
hostname = "extneder";

View File

@ -33,9 +33,7 @@ in rec {
imports = [
../modules/wlan.nix
../modules/tftpboot.nix
../modules/flashimage.nix
../modules/jffs2.nix
../modules/standard.nix
];
rootfsType = "jffs2";

11
modules/standard.nix Normal file
View File

@ -0,0 +1,11 @@
{
# "standard" modules that aren't fundamentally required,
# but are probably useful in most common workflows and
# you should have to opt out of instead of into
imports = [
./tftpboot.nix
./kexecboot.nix
./flashimage.nix
./jffs2.nix
];
}

View File

@ -11,5 +11,6 @@
installPhase = ''
mkdir -p $out/bin
cp min-copy-closure.sh $out/bin/min-copy-closure
cp liminix-rebuild.sh $out/bin/liminix-rebuild
'';
}

View File

@ -0,0 +1,14 @@
#!/usr/bin/env bash
target_host=$1
shift
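if [ -z "$target_host" ] ; then
  echo "Usage: liminix-rebuild target-host params" >&2
  exit 1
fi
# stop here if the build fails, rather than copying nothing and rebooting
toplevel=$(nix-build "$@" -A outputs.systemConfiguration --no-out-link) || exit 1
min-copy-closure "$target_host" "$toplevel" || exit 1
# copy the contents of bin/ (activate, init) to the target's root
ssh "$target_host" cp -P "$toplevel"/bin/\* / || exit 1
ssh "$target_host" reboot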

View File

@ -7,6 +7,7 @@
{
writeText
, lib
, s6-init-bin
, stdenv
}:
let
@ -74,5 +75,6 @@ in attrset:
installPhase = ''
mkdir -p $out/bin
$STRIP --remove-section=.note --remove-section=.comment --strip-all makedevs -o $out/bin/activate
ln -s ${s6-init-bin}/bin/init $out/bin/init
'';
}

View File

@ -13,8 +13,6 @@ let
in {
imports = [
../../vanilla-configuration.nix
../../modules/squashfs.nix
../../modules/jffs2.nix
];
config = {
services.sshd = longrun {

View File

@ -4,7 +4,7 @@ let
inherit (pkgs.liminix.services) oneshot longrun bundle target;
in rec {
imports = [
./modules/tftpboot.nix
./modules/standard.nix
./modules/wlan.nix
];
services.loopback = config.hardware.networkInterfaces.lo;