Thu Sep 22 00:12:42 BST 2022

Making quite reasonable progress, though only running under emulation.
Since almost everything so far has been a recap of nixwrt, that's to
be expected.

The example config starts some services at boot, or at least attempts
to. Next we should

 - add some network config to run-qemu
 - implement udhcp and odhcp properly to write outputs
  and create resolv.conf and all that
 - write some kind of test so we can refactor the crap
 - not let the tests write random junk everywhere

Thu Sep 22 12:46:36 BST 2022

We can store outputs in the s6 scan directory, it seems:

> There is, however, a guarantee that s6-supervise will never touch subdirectories named data or env. So if you need to store user information in the service directory with the guarantee that it will never be mistaken for a configuration file, no matter the version of s6, you should store that information in the data or env subdirectories of the service directory.

https://skarnet.org/software/s6/servicedir.html
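
A minimal sketch of what writing an output might look like, assuming
one plain file per output under the `data` subdirectory of the service
directory. The name `write_output` and the file layout are made up for
illustration; the s6 docs only promise that `data` and `env` are left
alone.

```python
import os
import tempfile


def write_output(service_dir, name, value):
    """Atomically write one output value to <service_dir>/data/<name>."""
    data_dir = os.path.join(service_dir, "data")
    os.makedirs(data_dir, exist_ok=True)
    # Write to a temporary file in the same directory and then rename it,
    # so a reader never sees a half-written value.
    fd, tmp_path = tempfile.mkstemp(dir=data_dir)
    with os.fdopen(fd, "w") as f:
        f.write(value + "\n")
    final_path = os.path.join(data_dir, name)
    os.replace(tmp_path, final_path)
    return final_path
```

The rename step is there because consumers may read an output while the
producing service is still updating it.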

> process 'store/pj0b27l5728cypa5mmagz0q8ibzpik0h-execline-mips-unknown-linux-musl-2.9.0.1-bin/bin/execlineb' started with executable stack

https://skarnet.org/lists/skaware/1550.html


Thu Sep 22 16:14:49 BST 2022

what network peers do we want to model for testing?

- wan: pppoe
- wan: ip over ethernet, w/ dhcp service provided
- wan: l2tp over (ip over ethernet, w/ dhcp service provided)
- lan: something with a dhcp client

https://accel-ppp.readthedocs.io/en/latest/ could use this for testing
pppoe and l2tp?


Thu Sep 22 22:57:47 BST 2022

To build a nixos vm with accel-ppp installed (not yet configured)

  nix-build '<nixpkgs/nixos>' -A vm -I nixos-config=./tests/ppp-server-configuration.nix -o ppp-server
  QEMU_OPTS="-display none -serial mon:stdio -nographic" ./ppp-server/bin/run-nixos-vm

To test it's configured I thought I'd run it against an OpenWrt qemu
install, so, fun with qemu networking ensues. This config in ../openwrt-qemu
is using two multicast socket networks -

nix-shell -p qemu --run "./run.sh ./openwrt-22.03.0-x86-64-generic-kernel.bin openwrt-22.03.0-x86-64-generic-ext4-rootfs.img "

so hopefully we can spin up other VMs connected either to its lan or
its wan: *however* we do first need to configure its wan to use pppoe

uci set network.wan=interface
uci set network.wan.device='eth1'
uci set network.wan.proto='pppoe'
uci set network.wan.username='db123@a.1'
uci set network.wan.password='NotReallyTheSecret'

(it's ext4 so this will probably stick)


Fri Sep 23 10:27:22 BST 2022

* mcast=230.0.0.1:1234  : access (interconnect between router and isp)
* mcast=230.0.0.1:1235  : lan
* mcast=230.0.0.1:1236  : world (the internet)
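
A sketch of how a VM launcher could turn these network names into QEMU
arguments. The port numbers are the ones listed above; the helper name
`qemu_net_args` and the choice of virtio-net-pci as NIC model are
assumptions.

```python
# Multicast socket networks from the list above.
MCAST_GROUP = "230.0.0.1"
NETWORKS = {
    "access": 1234,  # interconnect between router and isp
    "lan": 1235,
    "world": 1236,   # the internet
}


def qemu_net_args(names, nic_model="virtio-net-pci"):
    """Return QEMU arguments attaching one NIC per named network."""
    args = []
    for name in names:
        port = NETWORKS[name]
        args += [
            "-netdev", f"socket,id={name},mcast={MCAST_GROUP}:{port}",
            "-device", f"{nic_model},netdev={name}",
        ]
    return args
```

A router VM would ask for ["access", "lan"], and an ISP-side VM for
["access", "world"].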


Sun Sep 25 20:56:28 BST 2022

TODO - bugs, missing bits, other infelicities as they occur to me:

DONE 1) shutdown doesn't work as it's using the busybox one not s6.

2) perhaps we shouldn't have process-based services like dhcp, ppp
implement "address provider interface" - instead have a separate
service for interface address that depends on the service and uses its
output

* ppp is not like dhcp because dhcp finds addresses for an existing
  interface but ppp makes a new one

3) when I killed ppp it restarted, but I don't think it reran
defaultroute which is supposed to depend on it. (Might be important
e.g. if we'd been assigned a different IP address). Investigate
semantics of s6-rc service dependencies

DONE 4) make the pppoe test run unattended

5) write a test for udhcp

6) squashfs size is ~ 14MB for a configuration with not much in it,
look for obvious wastes of space

7) some of the pppoe config should be moved into a ppp service

8) some of configuration.nix (e.g. defining routes) should be moved into
tools

DONE 9) split tools up instead of having it all one file

10) is it OK to depend on squashfs pseudofiles if we might want to
switch to ubifs? will there always be a squashfs underneath? might
we want to change the pseudofiles in an overlay?

11) haven't done (overlayfs) overlays at all

12) overlay.nix needs splitting up

13) upgrade ppp to something with an ipv6-up-script option

14) add ipv6 support generally

15) "ip address add" seems to magically recognise v4 vs v6 but
is that specified or fluke?

16) tighten up the module specs. (DONE) services.foo should be a s6-rc
service, (DONE) kernel config should be checked in some way

DONE 17) rename nixwrt references in kernel builder

18) maybe stop suffixing all the service names with .service

19) syslogd - use busybox or s6?

chat -s -S ogin:--ogin: root / "ip address show dev ppp0 | grep ppp0" 192.168.100.1  "/nix/store/*-s6-linux-init-*/bin/s6-linux-init-hpr -p"


Working towards a general goal of having a derivation we can
usefully run `nix path-info` on - or some other tool that will
tell us what's making the images big. The squashfs doesn't
have this information.

Towards that end (really? can't remember how ...) what would be a
way for packages to declare "I want to add files to /etc"? Is that
even a good idea?

Thinking we should turn s6-init-files back into a real derivation.

Tue Sep 27 00:31:45 BST 2022

> Thinking we should turn s6-init-files back into a real derivation.

This turns out to be Not That Simple, because it contains weird shit
(sticky bits and fifos).

Tue Sep 27 09:50:44 BST 2022

* allow modules to register activation scripts that are run on the
root filesystem once all packages are installed

  - do they run on build or on host? if we're upgrading in place
  how do we ship filesystem changes to the host?

or:

* allow modules to declare environment.*, use pseudofile on build and
create real files on host. will need to keep the implementation on
  host fairly simple because restricted environment

Tue Sep 27 16:14:18 BST 2022

TODO list is getting both longer and shorter, though longer on
average.

2) perhaps we shouldn't use process-based services like [ou]dhcp as
queryable endpoint for interface addresses (e.g. when adding routes).
Instead have a separate service for interface address that depends on
the *dhcp and uses its output

3) when I killed ppp it restarted, but I don't think it reran
defaultroute which is supposed to depend on it. (Might be important
e.g. if we'd been assigned a different IP address). Investigate
semantics of s6-rc service dependencies

4) figure out a nice way to fit ppp into this model as it actually
creates the interface instead of using an existing unconfigured one

5) write a test for udhcp

7) some of the pppoe config should be moved into a ppp service

11) haven't done (overlayfs) overlays at all

13) upgrade ppp to something with an ipv6-up-script option, move ppp and pppoe derivations into their own files

14) add ipv6 support generally

15) "ip address add" seems to magically recognise v4 vs v6 but
is that specified or fluke?

19) ship logs somehow to log collection system

21) dhcp, dns, hostap service for lan

22) support real hardware

Tue Sep 27 22:00:36 BST 2022

Found the cause of huge image size: rp-pppoe ships with scripts that
reference build-time packages, so we have x86-64 glibc in there

We don't need syslog just to accommodate ppp, there's an underdocumented
option for it to log to a file descriptor

Wed Sep 28 16:04:02 BST 2022

Based on https://unix.stackexchange.com/a/431953 if we can forge
ethernet packets we might be able to write tests for e.g. "is the vm
running a dhcp server"

Wed Sep 28 21:29:05 BST 2022

We can use Python "scapy" to generate dhcp request packets, and Python
'socket' module to send them encapsulated in UDP. Win

It's extremely janky python
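
For reference, a sketch of what the DHCP payload itself looks like.
This is not the scapy code: it builds the packet by hand with the
stdlib `struct` module so it stands alone, and it only covers the
BOOTP/DHCP payload, not the Ethernet/IP/UDP framing or the sending.
The function name is made up.

```python
import struct

DHCP_MAGIC_COOKIE = bytes([0x63, 0x82, 0x53, 0x63])


def dhcp_discover(mac, xid):
    """Build a minimal DHCPDISCOVER payload (BOOTP header plus options)."""
    header = struct.pack(
        "!BBBBIHH4s4s4s4s16s64s128s",
        1,           # op: BOOTREQUEST
        1,           # htype: Ethernet
        6,           # hlen: length of a MAC address
        0,           # hops
        xid,         # transaction id chosen by the client
        0,           # secs
        0x8000,      # flags: ask the server to broadcast its reply
        bytes(4),    # ciaddr
        bytes(4),    # yiaddr
        bytes(4),    # siaddr
        bytes(4),    # giaddr
        mac,         # chaddr, zero-padded to 16 bytes by struct
        bytes(64),   # sname
        bytes(128),  # file
    )
    options = bytes([53, 1, 1])  # option 53 (message type) = DISCOVER
    options += bytes([255])      # end-of-options marker
    return header + DHCP_MAGIC_COOKIE + options
```

A test could send this and check whether anything answers with an
offer, which is enough to say "the vm is running a dhcp server".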

Thu Sep 29 15:24:37 BST 2022

Two points to ponder

1) where service config depends on outputs of other services, we
do that rather ugly "$(cat ${output ....})" construct. Can we improve on
that? Maybe we could have some kind of tooling to read them as environment
variables ...

2) we have given no consideration yet to secrets. we want the secrets to
be not in the store; we want some way of refreshing them when they change
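
On point 1, a sketch of the "read them as environment variables" idea,
assuming each output is a small text file in a per-service outputs
directory. The function name, the directory layout and the naming rule
(uppercase, dashes to underscores) are all assumptions.

```python
import os


def read_outputs(outputs_dir, prefix=""):
    """Map each file in outputs_dir to an environment-style variable."""
    env = {}
    for name in sorted(os.listdir(outputs_dir)):
        path = os.path.join(outputs_dir, name)
        if not os.path.isfile(path):
            continue
        with open(path) as f:
            value = f.read().strip()
        # e.g. "ifname" becomes "IFNAME", "router-address" becomes
        # "ROUTER_ADDRESS"
        key = (prefix + name).upper().replace("-", "_")
        env[key] = value
    return env
```

A wrapper could merge the result into the environment before exec'ing
the service's run script, instead of each script doing its own cat.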

Sat Oct  1 14:24:21 BST 2022

The MAC80211_HWSIM kernel config creates virtual wlan[01] devices
which hostapd will work with, and a hwsim0 which we can use to monitor
(though not inject) traffic. Could we use this for wifi tests? How do
we make the guest hwsim0 visible to the host?


Sat Oct  1 18:41:31 BST 2022

virtual serial ports: I struggled with qemu for ages to get this to work.
You also need the unhelpfully named CONFIG_VIRTIO_CONSOLE option in
kconfig

QEMU_OPTIONS="-nodefaults  -chardev socket,path=/tmp/wlan,server=on,wait=off,id=wlan  -device virtio-serial-pci -device virtserialport,name=wlan,chardev=wlan"

Sun Oct  2 09:34:48 BST 2022

We could implement the secrets store as a service, then the secrets
are outputs.

Things we can do in qemu

1) make interface address service that depends on dhcp, instead of
  being set by it directly
2) check out restart behaviour of dependent services when depended-on
  service dies
3) pppd _creates_ an interface, work out how to fit it into this model
5) add bridge support for lan
8) upgrade ppp to something with an ipv6-up-script option, move ppp and pppoe derivations into their own files
9) get ipv6 address from pppoe
10) get ipv6 delegation from pppoe and add prefix to lan
11) support dhcp6 in dnsmasq, and advertise prefix on lan
12) firewalling and nat
 - default deny or zero trust?
14) write secrets holder as a service with outputs
20) should we check that references to outputs actually correspond with
  those provided by a service

Things we probably do on hardware

6) writable filesystem (ubifs?)
7) overlay with squashfs/ubifs - useful? think about workflows for
how this thing is installed
16) gl-ar750
17) mediatek device - gl-mt300 or whatever I have lying around
18) some kind of arm (banana pi router?)
19) should we give routeros a hardware ethernet and maybe an l2tp upstream,
 then we could dogfood the hardware devices.  we could run an l2tp service
 at mythic-beasts, got a /48 there



https://skarnet.org/software/s6/s6-fghack.html looks like a handy thing
we hope we'll never have to use

Sun Oct  2 22:22:17 BST 2022

> make interface address service that depends on dhcp, instead of being set by it directly

We can do this for dhcp, but we can't do it for ppp. Running the ppp service
creates a ppp[012n] interface and assigns it an ipv4 address and there's not
a whole lot we can easily do to unbundle that.

So

- the ppp service needs to behave as if it were a "link" service
- either it *also* needs to behave as an address service, or we could
  have an address service that subscribes to it and does nothing other than
  translate output formats

Note regarding that second bullet: at the moment the static address
service has no outputs anyway!


Tue Oct  4 22:43:02 BST 2022

While trying to make the TFTP workflow not awful I seem to have written
a TFTP server.


Thu Oct  6 19:26:40 BST 2022

We have a booting kernel on gl-ar750, but we aren't at a point that it can
find a root filesystem

I'd *like* to be able to use the same delivery mechanism (kernel uimage
concatenated monolithic


Sat Oct  8 11:12:09 BST 2022

We have it booting on hardware, mounting root fs, running getty :-)

For NixWRT TFTP boots we used a single image with both kernel and squashfs, and
relied on CONFIG_MTD_SPLIT_FIRMWARE to identify where the boundary was and create
/dev/mtdN devices at the right offsets so that the kernel could find the
squashfs

For Liminix we're not going to do that.

* CONFIG_MTD_SPLIT_FIRMWARE is only available in OpenWrt patches
* it's an uncomfortable level of automagic just to save us doing two TFTPs
  instead of one
* the generated image is anyway not the one we'd write to flash (has unneeded
   PHRAM support)
* it means we need to memmap out enough ram for the whole image inc kernel when really
  all we need to reserve is the rootfs bit


Sat Oct  8 11:23:08 BST 2022

"halt" and "reboot" don't work on gl-ar750

Sat Oct  8 13:10:00 BST 2022

Where do we go with this ar750?

- wired networking
- wifi


Sun Oct  9 09:57:35 BST 2022

We want to be able to package kernel modules as regular derivations, so that
they get added to the filesystem

This means they need access to kernel.modulesupport

This means  kernel.modulesupport needs to be in pkgs too?

This is fine, probably, but we'd like to avoid closing over vmlinux because
there's no need for it to be in the filesystem

Mon Oct 10 22:57:23 BST 2022

The problem is that kernel kconfig options are manipulated in the
liminix modules, which means that data must be (transitively) available
to modules, so they can't be regular packages as they're tied so tightly
to the exact config. Unless we define a second overlay that references
the configuration object, but my head hurts when I start to think about that
so maybe not.

Tue Oct 11 00:00:13 BST 2022

Building ag71xx (ethernet driver) as a module doesn't work because
it references a symbol ath79_pll_base in the kernel that hasn't been
marked with EXPORT_SYMBOL.

We could forge an object file that "declares" it with a gross and disgusting hack like this

$ echo > empty # not actually "empty", objcopy complains about that
$ grep ath79_pll_base /nix/store/jcc114cd13xa8aa4mil35rlnmxnlmv09-vmlinux-mips-unknown-linux-musl-modulesupport/System.map
ffffffff807b2094 B ath79_pll_base
$ mips-unknown-linux-musl-objcopy   -I binary -O elf32-big --add-section .bss=empty  --add-symbol ath79_pll_base=.bss:0x807b2094  empty f.o

I don't claim this is a good idea, just an idea. Thought was that we would not
have to declare its type this way. Also it might not work with kaslr
https://stackoverflow.com/a/68903503
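
The grep-then-objcopy step could be scripted along these lines. A
sketch only: the helper names are made up, and masking to the low 32
bits simply mirrors what the objcopy command above does by hand.

```python
def find_symbol(system_map_text, symbol):
    """Return (address, type) for symbol from System.map text, or None."""
    for line in system_map_text.splitlines():
        fields = line.split()
        # Each System.map line is "<hex address> <type letter> <name>".
        if len(fields) == 3 and fields[2] == symbol:
            return int(fields[0], 16), fields[1]
    return None


def low_32_bits(address):
    """Drop the sign-extended upper half of a 64-bit style address."""
    return address & 0xFFFFFFFF
```

For the System.map line shown above this yields 0x807b2094, the value
passed to --add-symbol.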


Backstory: why are we trying to build this as a module? because the
openwrt fork of it seems to be a bit more advanced than the mainline,
and I *suspect* that the mainline version doesn't work with our
openwrt-based device tree which has the mdio as a nested node inside
the ag71xx node - in mainline the driver seems to have all the mdio
stuff inline. So, could we build the openwrt driver without patching
the crap out of our kernel

Sun Oct 16 15:25:33 BST 2022

Executive decision: let's use the openwrt kernel (at least for
gl-ar750).  Mainline kernel doesn’t have devicetree support for this
device or the SoC it’s based on, and the OpenWrt dts for it doesn’t
have the same "compatible"s, which makes me think that an indefinite
amount of patching will be necessary to make dts/modules for one of
them work with a kernel for the other

As a result: now we have eth0 appearing, but not eth1?  Guessing we
need to add some kconfig for the switch

Mon Oct 17 21:23:37 BST 2022

we are spending ridiculous amounts of cpu/io time copying kernel source
trees from place to place, because we have kernel tree preparation
and actual building as two separate derivations.

I think the answer is to have a generic kernel build derivation
in the overlay, and then have the device overlays override it with
an additional phase to do openwrt patching or whatever else they
need to do.

Tue Oct 18 23:02:43 BST 2022

* previous TODO list is Aug 02, need to review
* dts is hardcoded to gl-ar750, that needs cleaning up
* figure out persistent addresses for ethernet
* fix halt/reboot
* "link" services have a "device" attribute, would much rather
  have everything referenced using outputs than having two
  different mechanisms for reading similar things
* Kconfig.local do we still need it?
* check all config instead of differentiating config/checkedConfig

Sun Feb  5 18:14:02 GMT 2023

We have resumed.
commit eb4efab6a215bf03cf5aab10d4ac909e83e9c148
Author: Daniel Barlow <dan@telent.net>
Date:   Sat Jan 28 23:18:28 2023 +0000


* find out what works
* add that stuff to hydra
* fix the rest
* add that stuff to hydra
* convert to flake
* check if routeros can be run interactively
* some per-device docs in a form that can be transcluded for website


ci builds

* each of the tests has hardcoded device/config/etc
* build an "empty" configuration for each target device
* build an unstable configuration for qemu


Wed Feb  8 16:52:22 GMT 2023

We have hydra builds for all the previously-working devices, though we
don't yet know if any of those builds actually boots or does anything
useful.

[DONE] Would be nice to clean up the run-qemu and connect-qemu scripts
and put them in the buildEnv

Some thought needed about how to hook up the gl-ar750 to the internets,
ideally in a way that mirrors typical real uses. AAISP have an L2TP
service, but I would prefer to use pppoe on the device, so how to
translate one to t'other on an intermediary/gateway machine?
https://www.rfc-archive.org/getrfc.php?rfc=3817#gsc.tab=0 exists
as an RFC but I can't find anything that actually implements it

Actual Documentation (e.g.  user and developer manuals) should live in
the liminix repo so it corresponds with the code, and can be rsynced
from there to the web site, maybe with a deploy hook or something.
Haven't decided what a good doc format is yet

If we create a flake for Hydra to run on, that _more or less_ means we
don't have any manual hydra jobset configuration to document.

There are still some tests that need adding to CI

[DONE] Should the per-device config be a module not an overlay? Given that
half of what's in it is kernel config (a module could set this)
and the rest is source tarball download specs (needs nixpkgs,
a module has this and could set it too) I wonder why it isn't already

[ALREADY DOES] Can we make Hydra report output sizes so we can plot closure size
trends and see if it all goes awful?

Thu Feb  9 08:14:39 GMT 2023

For better developer experience, I am thinking that either (1)
swap tasks 2 and 3 (writable filesystem before module system)
or (2) add NBD support so I can iterate on a real device without
full rebuilds every time


Fri Feb 10 06:18:25 PM GMT 2023

did the overlay->module thing

[DONE] Need to fix all the configuration around PHRAM, I can't see how it
would ever work

Sat Feb 11 14:37:45 GMT 2023

Consolidated TODO

* figure out persistent addresses for ethernet (?)
[SEEMS DONE] * fix halt/reboot
[DONE, NO] * Kconfig.local do we still need it?
[DONE] * check all config instead of differentiating config/checkedConfig

Things we can do in qemu

* "link" services have a "device" attribute, would much rather
  have everything referenced using outputs than having two
  different mechanisms for reading similar things
1) make interface address service that depends on dhcp, instead of
  being set by it directly
2) check out restart behaviour of dependent services when depended-on
  service dies
3) pppd _creates_ an interface, work out how to fit it into this model
5) add bridge support for lan
8) upgrade ppp to something with an ipv6-up-script option, move ppp and pppoe derivations into their own files
9) get ipv6 address from pppoe
10) get ipv6 delegation from pppoe and add prefix to lan
11) support dhcp6 in dnsmasq, and advertise prefix on lan
12) firewalling and nat
 - default deny or zero trust?
14) write secrets holder as a service with outputs
20) should we check that references to outputs actually correspond with
  those provided by a service
* Actual Documentation (e.g.  user and developer manuals)
* make a flake
* There are still some tests that need adding to CI

Things we probably do on hardware

[DONE] * dts is hardcoded to gl-ar750, that needs cleaning up
6) writable filesystem (ubifs?)
7) overlay with squashfs/ubifs - useful? think about workflows for
how this thing is installed
16) gl-ar750
[DONE] * decide how to hook up the gl-ar750 to the internets
17) mediatek device - gl-mt300 or whatever I have lying around
18) some kind of arm (banana pi router?)
[DONE DIFFERENTLY] 19) should we give routeros a hardware ethernet and maybe an l2tp upstream,
 then we could dogfood the hardware devices.  we could run an l2tp service
 at mythic-beasts, got a /48 there


Sat Feb 11 15:57:31 GMT 2023

The reason we would like to run PPPoE instead of L2TP on the "rotuer" device is

- closer to real world scenario
- means no need to run dhcp client on the wan interface before we
   even get to start the l2tpd


Sun Feb 12 14:57:28 GMT 2023

https://github.com/katalix/go-l2tp#kpppoed


Mon Feb 13 04:44:09 PM GMT 2023

if the gl-ar750 is connected to an ethernet card that linux is ignoring,
we're going to have to set up _some_ qemu thing just to run tftp from.

Tue Feb 14 17:59:34 GMT 2023

We should do a derivation that creates an ISO image and a qemu shell
script based on a configuration.nix, and put it in buildEnv. We'll
call it "borderNetVm" :

> A broadband remote access server (BRAS, B-RAS or BBRAS) routes
  traffic to and from broadband remote access devices such as digital
  subscriber line access multiplexers (DSLAM) on an Internet service
  provider's (ISP) network.[1][2] BRAS can also be referred to as a
  broadband network gateway or border network gateway (BNG).[3]

(for consistency we should rename the "access" qemu socket network to
match whatever we call this)

 rm border.qcow2 ; nix-shell --argstr liminix `pwd`  --argstr nixpkgs `pwd`/../nixpkgs  --argstr unstable `pwd`/../unstable-nixpkgs/ ci.nix -A buildEnv --run "run-border-vm"

Wed Feb 15 22:56:59 GMT 2023

configuration for border vm needs to come from somewhere so it's good
for more people than just me

- pci device for setting up the ethernet
- lns address
- uid so it can do 9p shares? do we need to map things here?

also need to document the host-side bits so that people can set up
their spare ethernet as vfio

next step for hacking is to figure out what I was doing with pppoe

Wed Feb 15 22:59:56 GMT 2023

docs ...

* introduction

* user guide
** how to build it
** how to flash it on your device
** what to put in configuration.nix
** modules

* developer guide
** building/running with qemu
*** emulated upstream
** building/running on hardware
*** run in place with TFTP
*** emulated upstream
** CI
** Roadmap
** Contributing



 nix-shell -p sphinx --run "make -C doc html"

https://francis.begyn.be/blog/nixos-home-router contains information about avahi reflector


Fri Feb 17 00:09:34 GMT 2023

   29 11.282085831 81.187.76.242 → 8.8.8.8      ICMP 106 Echo (ping) request  id=0x0187, seq=2/512, 4
   30 11.286314642 90.155.53.19 → 81.187.76.242 ICMP 78 Destination unreachable (Communication admin)

We're getting packets over the pppoe-l2tp relay thing. Just have to
work out now why we're not routing

Fri Feb 17 16:54:41 GMT 2023

Haha.  We weren't routing because we'd used the wrong CHAP password



Fri Feb 17 16:58:27 GMT 2023

This TODO is for nlnet task 1 and for bits of subsequent tasks that
are annoying enough that I might poke at them anyway:


1) gl-ar750, why do we get "ag71xx 19000000.eth: invalid MAC address, using random address"
2) gl-ar750, wifi
3) document services so I can remember how they work. Refer back to Oct 18 for notes that no longer make sense
4) check out restart behaviour of dependent services when depended-on service dies
5) pppd _creates_ an interface, work out how to fit it into this model
6) add bridge support for lan
7) upgrade ppp to something with an ipv6-up-script option, move ppp and pppoe derivations into their own files
8) get ipv6 address from pppoe
9) get ipv6 delegation from pppoe and add prefix to lan
10) support dhcp6 in dnsmasq, and advertise prefix on lan
11) firewalling and nat - default deny or zero trust?
13) should we check that references to outputs actually correspond with
  those provided by a service
14) make a flake?
15) see if there are other tests that need adding to CI
15a) is bordervm derivation tested?
18) gl-mt300a
19) gl-mt300n-v2
20) publish the manual using CI

12) write secrets holder as a service with outputs
16) writable filesystem (ubifs?)
17) overlay with squashfs/ubifs - useful? think about workflows for how this thing is installed


I could plug thinkpad into the gl-ar750 LAN port to dogfood the wired
networking

Sat Feb 18 14:26:45 GMT 2023

Apparently we're not currently doing anything special with busybox,
just using the default nixos build with the default applets.

We'd like to be able to say in modules which applets they need,
so that we build all necessary applets but don't waste any space.
But we don't want to build a busybox for each module because that
would be a big waste of space.

One option:
- add busybox configuration to `config` so that modules can maul it
- add a busybox module that builds it with union of all config and
 adds link in /bin
- make everything else look in /bin instead of referencing pkgs.busybox

It would be good if services could assert somehow that their required
config is present
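
The union-and-assert idea, sketched outside Nix to keep it short. The
module names and applet lists in the usage note are invented; the real
thing would live in the module system's option merging.

```python
def busybox_applets(module_requirements):
    """Union the applets requested by every module into one sorted list."""
    applets = set()
    for wanted in module_requirements.values():
        applets.update(wanted)
    return sorted(applets)


def missing_applets(required, built):
    """Return the applets a service needs that the build does not have."""
    return sorted(set(required) - set(built))
```

With requirements like {"dhcp": ["udhcpc", "ip"], "base": ["sh", "ip"]}
this builds one busybox with ip, sh and udhcpc, and a service can fail
the build early if missing_applets() is non-empty.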

Sat Feb 18 23:45:13 GMT 2023

# lsmod

cd /lib/modules/mac80211
insmod ./compat/compat.ko
insmod ./net/wireless/cfg80211.ko
insmod ./net/mac80211/mac80211.ko
insmod ./drivers/net/wireless/ath/ath.ko
insmod ./drivers/net/wireless/ath/ath9k/ath9k_hw.ko
insmod ./drivers/net/wireless/ath/ath9k/ath9k_common.ko
insmod ./drivers/net/wireless/ath/ath9k/ath9k.ko
insmod ./drivers/net/wireless/ath/ath10k/ath10k_core.ko
insmod ./drivers/net/wireless/ath/ath10k/ath10k_pci.ko

[21.344930] ath9k 18100000.wmac: failed to load calibration data from mtd device
[21.352728] ath: phy0: parsing configuration from OF node
[21.362576] ath: phy0: serialize_regmode is 0
[21.367092] ath: phy0: UNDEFINED -> AWAKE
[21.372051] ath: phy0: Trying EEPROM access at Address 0x03ff
[21.377999] ath: phy0: Trying EEPROM access at Address 0x0fff
[21.383940] ath: phy0: Trying EEPROM access at Address 0x01ff
[21.389879] ath: phy0: Trying OTP access at Address 0x03ff
[21.400396] Data bus error, epc == 8027964c, ra == 83125880
[21.406156] Oops[#1]:


Sun Feb 19 18:15:27 GMT 2023

We have ath9k listening for packets. To make this ready to use:

- need to load the modules
- enable bridging lan with wlan
- packet forwarding
- firewall


Mon Feb 20 20:41:17 GMT 2023

need to fix all the other broken ci jobs :-(

The wlan test is failing because we moved mac80211 to a module and
there's nothing running to insmod it

Wed Feb 22 18:17:17 GMT 2023

bridge is e2b3738d0f8c3f2fd76ebcef65612de502a7b121 but it's the wrong
way around: the master interface needs to be up whether or not all
of its children are, so members depend on master not vice versa
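
To make the direction concrete, a sketch using the stdlib `graphlib`
(Python 3.9+). The service names are invented; the point is only that
each member lists the bridge as a dependency, so the bridge is started
first and does not wait for any particular member.

```python
from graphlib import TopologicalSorter


def start_order(dependencies):
    """Order services so that each one starts after what it depends on.

    dependencies maps a service name to the services it depends on.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Members depend on the master (bridge), not the other way around.
BRIDGE_DEPENDENCIES = {
    "lan.bridge": [],
    "eth0.member": ["lan.bridge"],
    "wlan0.member": ["lan.bridge"],
}
```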

Next steps:
- re-implement bridge, enable bridging lan with wlan
- packet forwarding
- firewall
- ath10k
- ipv6

Fri Feb 24 23:37:56 GMT 2023

bridging wlan was made complex because can't add a device to a bridge
until it's operational, and wlan0 is not operational until hostapd
has churned awhile. Therefore, "waitup" listens for netlink messages
and notifies s6 readiness stuff

we have a firewall nft script but we're not running it on boot

we have forwarding but no dns, maybe because we haven't told
dnsmasq about any upstream servers

Sun Feb 26 21:08:47 GMT 2023

to add firmware we need to put files in /lib/firmware, which means
a module

i guess we should do that in the device module

we can create the firmware files as packages


for the cal data we would like to get it from the device MTD "art"
partition at boot time.

====from openwrt



case "$FIRMWARE" in
"ath10k/cal-pci-0000:00:00.0.bin")
        case $board in
        allnet,all-wap02860ac|\
        araknis,an-500-ap-i-ac|\
        araknis,an-700-ap-i-ac|\
        engenius,eap1200h|\
        engenius,enstationac-v1|\
        glinet,gl-x750|\
        watchguard,ap300)
                caldata_extract "art" 0x5000 0x844
                ath10k_patch_mac $(macaddr_add $(mtd_get_mac_binary art 0x0) 2)




caldata_extract part offset count
      caldata_dd $mtd /lib/firmware/$FIRMWARE $count $offset || \
                caldata_die "failed to extract calibration data from $mtd"
        dd if=$source of=$target iflag=skip_bytes,fullblock bs=$count skip=$offset count=1 2>/dev/null

=======

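part=$(basename $(dirname $(grep -l art /sys/class/mtd/*/name)))
# dd may not parse hex (GNU dd reads "0x844" as 0 times 844), so
# convert to decimal with shell arithmetic first
dd if=/dev/$part \
  of=/run/cal-pci-0000:00:00.0.bin iflag=skip_bytes,fullblock \
  bs=$((0x844)) skip=$((0x5000)) count=1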
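
The same extraction written out as a sketch in Python, mostly to pin
down what the offsets mean: skip 0x5000 bytes into the "art" partition
and copy the next 0x844 bytes. The function name is made up; offset
and count are the ones from the OpenWrt script above.

```python
def extract_caldata(source_path, target_path, offset=0x5000, count=0x844):
    """Copy count bytes starting at offset from source to target."""
    with open(source_path, "rb") as src:
        src.seek(offset)
        data = src.read(count)
    if len(data) != count:
        # A short read means the partition is smaller than expected.
        raise ValueError(f"wanted {count} bytes, got {len(data)}")
    with open(target_path, "wb") as dst:
        dst.write(data)
    return len(data)
```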

Mon Feb 27 22:46:37 GMT 2023

Found and fixed a bunch of things that were stopping ath10k from
working. The remaining problem is (I think) that insmod is not
synchronous, so "ip link set up dev wlan1" doesn't work immediately
after the module is inserted. Maybe we need another netlink thing
to wait until the interface is present.
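
A sketch of what waiting for an interface could look like. This one
polls sysfs rather than listening for netlink messages, which is a
deliberate simplification; the function names, the "present"
pseudo-state and the timeout defaults are assumptions.

```python
import os
import time


def interface_state(ifname, sysfs_root="/sys/class/net"):
    """Return 'absent', or the interface's operstate (e.g. 'up', 'down')."""
    path = os.path.join(sysfs_root, ifname, "operstate")
    try:
        with open(path) as f:
            return f.read().strip()
    except FileNotFoundError:
        return "absent"


def ifwait(ifname, wanted, timeout=30.0, interval=0.2,
           sysfs_root="/sys/class/net"):
    """Block until the interface reaches the wanted state.

    wanted is an operstate such as 'up', or 'present' to accept any
    state as long as the interface exists. Returns False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        state = interface_state(ifname, sysfs_root)
        if wanted == "present" and state != "absent":
            return True
        if state == wanted:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Something like ifwait("wlan1", "present") between the insmod and the
"ip link set up" would cover the race described above.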


Wed Mar  1 18:26:44 GMT 2023

ath10k works, but the wlan module loading stuff is quite kludgey

I wonder if wlan0, wlan1, eth0, eth1 etc should be defined per-device
- how does the application config know which devices exist? If we
decide to switch to some form of persistent device naming, the names
will differ from one device to the next. Perhaps the device should
also provide standard names where possible?

services.network.links = {
  lan = interface { ... };
  wan = interface { ... };
  wlan_24 = interface { ... };
  wlan_5 = interface { ... };
}

Thu Mar  2 22:45:11 GMT 2023

We have a flashable image!

Now we can use the gl-ar750 for internet access in the shed, we can
appropriate the other device that's in there and try Liminix on it

Fri Mar  3 23:08:58 GMT 2023

If we're going to unplug serial console from the gl-ar750 maybe we
should install an ssh server first.

0) set a root password
1) allow setting a root password from configuration.nix
(means defining config.users properly)
2) allow authorizedKeys per user
3) dropbear service
4) see if the wired lan works! :-)


Sat Mar  4 12:31:07 GMT 2023

To improve logging, each service should have its own s6-log service
which prefixes the service name onto the log line and then sends to
stdout

  https://skarnet.org/software/s6/servicedir.html
  https://skarnet.org/software/s6/s6-log.html

As far as I can tell, the `log` directory inside the service
directory should itself be a service directory for the s6-log
process that does this

.... hahaha no that doesn't work

s6-rc, for some reason, ignores the `log` directory and requires
that loggers be done with consumer-for and producer-for instead
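
Roughly what that means for the service definitions, sketched as the
files each s6-rc source directory would need. Only `type`,
`producer-for` and `consumer-for` are shown; run scripts are left out,
and the "-log" naming and the function name are assumptions.

```python
def logged_service(name, logger_suffix="-log"):
    """Return {service: {filename: contents}} for a service and its logger."""
    logger = name + logger_suffix
    return {
        # The service's stdout is piped to the logger ...
        name: {"type": "longrun\n", "producer-for": logger + "\n"},
        # ... and the logger declares which service feeds it.
        logger: {"type": "longrun\n", "consumer-for": name + "\n"},
    }
```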


Sat Mar  4 23:27:00 GMT 2023

notes for this week's news update

* ath10k kernel support and firmware

- 5GHz wifi works

- need to retrieve the firmware from a special partition on the
  device itself, so we do that using a service that the wlan
  interface depends on

* replace waitup with more generally useful ifwait

to make the ath10k load at boot, we need to insert the module and then
wait for it to do something or other in the background before we can
configure the interface. so we need something like waitup but
for presence not operational state

it turns out that a program that just waits for a particular interface
state and then exits is quite simple to add into run scripts and
we don't need all that notification-fd stuff anyway

* move FW_LOADER* config to modules/base

* rejig config a bit.
- device hardware characteristics are now under
  the `hardware` key and include the available network interfaces.
- options for users and groups are now defined a bit more
  specifically than "attrset", making it possible to e.g. set a
  root password
- dts is moved from `boot` to `hardware`


* now producing flashable images, so you can generate a liminix config
and write it to the device instead of having to boot using TFTP and
a serial console every time

* ssh support

* prefix logs with the service name

Sun Mar  5 22:51:21 GMT 2023

Added swconfig: it was a straight copy from nixwrt and hasn't changed
upstream since. But don't need it, because the lan port works fine
without it (I assume both lan ports and the cpu are all connected
untagged)

Mon Mar  6 09:42:33 GMT 2023

Today I plugged in the mt300a.

echo 17 >/sys/class/gpio/export
echo out >/sys/class/gpio/gpio17/direction


why are our images getting big

- lua links ncurses
- hostapd links openssl and sqlite
- nftables needs
  - iptables?
  - jansson? what is that?
  - libedit/readline
- ifwait needs bash


  File: result/squashfs
  Size: 10371072        Blocks: 20256      IO Block: 4096   regular file

with smaller nftables:    9617408        Blocks: 18784

hostapd without sqlite    9003008        Blocks: 17584

without bash:             8622080         Blocks: 16840      IO Block: 4096   regular file

without lua readline: bigger?!  8769536         Blocks: 17128      IO Block: 4096   regular file
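
To put those numbers side by side, here is the arithmetic as a small
shell helper. The function name is made up; the byte counts are the
ones recorded above.

```shell
# delta_kib BEFORE AFTER: size saved in KiB (negative means it grew)
delta_kib() {
  echo $(( ($1 - $2) / 1024 ))
}

delta_kib 10371072 9617408   # smaller nftables:       prints 736
delta_kib 9617408 9003008    # hostapd without sqlite: prints 600
delta_kib 9003008 8622080    # without bash:           prints 372
delta_kib 8622080 8769536    # lua without readline:   prints -144
```

So nftables and hostapd account for most of the saving, and dropping
readline from lua cost 144 KiB rather than saving anything.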


Mon Mar  6 20:57:49 GMT 2023

[    0.539992] mtk_soc_eth 10100000.ethernet: mdio-bus disabled
[   10.493918] platform regulatory.0: Direct firmware load for regulatory.db failed with error -2
[   10.502828] cfg80211: failed to load regulatory.db

Check in morning, but whichever port the ethernet cable is plugged into
is considered by the kernel as port 0 - which I think we should treat as
WAN

VLAN 1:
        vid: 1
        ports: 1 2 3 4 5 6t
VLAN 2:
        vid: 2
        ports: 0 6t

ip link add link eth0 name lan type vlan id 1
ip link add link eth0 name wan type vlan id 2

figure out how to add these to gl-mt300a device config
then extneder.nix can add a bridge

Tue Mar  7 20:13:15 GMT 2023

We need NTP or some other way to get accurate time

[done] Need to add regulatory.db somewhere standard, maybe modules/wlan?

Tue Mar  7 21:43:56 GMT 2023

When we get to phase 2, need to review how network interfaces and
their addresses interplay. It should be possible to have a network
interface and interrogate the addresses associated with it - esp
with ipv6 where there are multiple addresses for the device

This thought prompted by looking at the loopback interface, which is
a bundle of addresses and therefore we can't see what any of them are


Tue Mar  7 22:05:44 GMT 2023

[phase 1]
20) publish the manual using CI
30) document flashing process
31) go through all the unexpected dmesg and triage it
25) ntp or some other accurate time source

[phase 1.5]
26) ssh keys
8) get ipv6 address from pppoe
9) get ipv6 delegation from pppoe and add prefix to lan
10) support dhcp6 in dnsmasq, and advertise prefix on lan
11) firewalling and nat - default deny or zero trust?
7) upgrade ppp to something with an ipv6-up-script option, move ppp and pppoe derivations into their own files
32) set up iperf and do some performance measurement
35) also we need to check our wireless country code


[phase 2]
3) document services so I can remember how they work. Refer back to Oct 18 for notes that no longer make sense
4) check out restart behaviour of dependent services when depended-on service dies
13) check that references to outputs correspond with declared outputs
33) network interfaces vs the services that manage their addresses
34) write a short guide explaining how to use s6-svc

[phase n]
12) write secrets holder as a service with outputs
16) writable filesystem (ubifs?)
17) overlay with squashfs/ubifs - useful? think about workflows for how this thing is installed


dmesg lines to investigate for gl-mt300a:
[    0.467314] OF: Bad cell count for /palmbus@10000000/spi@b00/flash@0/partition
[    0.539709] mtk_soc_eth 10100000.ethernet: mdio-bus disabled  ?
[    8.778513] compat: loading out-of-tree module taints kernel.
[   17.686561] ieee80211 phy0: rt2800_wait_bbp_rf_ready: Error - BBP/RF register access failed, aborting
[   17.696025] ieee80211 phy0: rt2800_loft_iq_calibration: Warning - RF RX busy in LOFT IQ calibration
[   17.875147] ieee80211 phy0: rt2800_rxiq_calibration: Warning - Timeout waiting for MAC status in RXIQ calibration

for gl-ar750:
[    0.000000] Unknown kernel command line parameters "earlyprintk=serial,ttyS0", will be passed to user space.
[    0.416679] OF: Bad cell count for /ahb/spi@1f000000/flash@0/partitions
[    0.825495] ag71xx 19000000.eth: Could not connect to PHY device. Deferring probe.
[    1.632700] pci_bus 0000:00: root bus resource [mem 0x10000000-0x13ffffff]
[    1.639824] pci_bus 0000:00: root bus resource [io  0x0000]
[    1.645601] pci_bus 0000:00: No busn resource found for root bus, will use [bus 00-ff]
[   32.032326] ath10k_pci 0000:00:00.0: pdev param 0 not supported by firmware
[   36.627844] ath10k_pci 0000:00:00.0: failed to receive initialized event from target: 00000000


Fri Mar 10 13:17:56 GMT 2023

Lunchtime notes on images for real devices, vs ci.nix

* successfully building an image doesn't mean that the image boots
or does anything useful

* don't want to faff with serial wires on every device every time
to test it. so

* ideally, build ram-based images of rotuer, extneder, arhcive in CI
with a watchdog timer that will reboot if it can't see the network

* figure out how to boot into the new image from an ssh connection.  I
assume the challenging bit here is grabbing x MB of contiguous phys
mem after boot: I think we'd have to reserve it at _first_ boot and
then somehow copy into it before rebooting

An easier first goal might be a tool to flash from the shell command line,
but that runs a greater risk of bricking


Fri Mar 10 14:35:40 GMT 2023

programs.busybox = {
  enable = true;
  applets = [... ];
  config = {
  };
};

Fri Mar 10 23:49:04 GMT 2023

Well, we have the backup host config up and running - though haven't
plugged it back into its disk yet.

For task 1 what remains is

1) ntp sync
2) write up the flashing procedure
3) a video?


Sat Mar 11 13:58:20 GMT 2023

================== for video

what is liminix
- nix-based system for creating OS images for routers
- not "nixos on your router"
  - nixos-like module system,
  - musl for libc
  - s6/s6-rc for services
  - entirely cross-compiled

why am i making a video?
- unless you have a suitable spare device to install on,
  and you want to take it apart, it's currently hard to
  take liminix for a spin
- I have these things, so I can give you a tour

let's have a look at how the hardware's hooked up

web site & manual

a config file:
- observe the comments:
  - not going to spend ages on this because it's not in its final form.
  - as we get more configs for more use cases, we will
    get a better feel for what can be abstracted
  - that will come later: work so far has been on the
    hardware support side

to show that it builds, we're going to add a package. otherwise,
everything from this build is probably already cached

build the config

tftpboot on hardware

flash on hardware

show ci

show a qemu target

Mon Mar 13 22:46:46 GMT 2023

1) rsync on arhcive is failing because no nogroup group

  "/nix/store/gfzzl157r8xyp38lpcfxydkiiy6zrs3c-rsync-3.2.6/bin/rsync" "--verbose" "--stats" "--password-file" "/etc/nixos/secrets/arhcive-rsync" "-rltgoDz" "/var/spool/backup" "backup@arhcive.lan::srv/"
@ERROR: invalid gid nogroup
rsync error: error starting client-server protocol (code 5) at main.c(1837) [sender=3.2.6]

2) we can run findfs in a loop until the disk appears

3) still haven't decided how to do ntp but maybe we should just use
the busybox one

4) some way to do upgrades over the wire

- boot with reserved mem and a phram device at 110-128MB even in the
  flashable version

- watchdog timer in kernel

- kexec in kernel

- userland service to feed the dog as long as local network is up
 (may need to start it a couple of minutes after boot, how do we
  do that?)

- can we use flashcp on a phram mtd?

5) maybe setup a vhost for hydra or something

[nix-shell:~/t]$ wget --reject-regex '\?' -D localhost -N -r --exclude-directories=/api --level=2 --convert-links -e robots=off http://localhost:3003/jobset/liminix/build/

is almost a mirror
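
For item 2 above (running findfs in a loop until the disk appears), a
sketch of a generic retry helper. The name and argument order are
invented here, and the disk label in the usage note is hypothetical.

```shell
# retry_until TRIES DELAY CMD...: run CMD until it succeeds or TRIES runs out.
# Hypothetical helper; output of CMD is discarded.
retry_until() {
  tries=$1
  delay=$2
  shift 2
  while [ "$tries" -gt 0 ]; do
    "$@" >/dev/null 2>&1 && return 0
    tries=$((tries - 1))
    sleep "$delay"
  done
  return 1
}
```

Usage would be along the lines of `retry_until 60 1 findfs LABEL=backup-disk`
before trying to mount.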

Tue Mar 14 20:17:35 GMT 2023

- do we have a phram mtd? need config for size and location

- how do we set the boot device
 - for first boot need to boot real flash, use dtb, ignore bootloader args
 - for kexec, boot phram, specify args somehow (could rewrite dtb)

 => can use same kernel for both if we can give kexec a dtb with
    different params, which seems to be possible

so we need a module for the initial kernel to say
- create phram mtd
- boot from real mtd (will be index + 1)
- enable KEXEC in kernel
- add kexec-tools

and for the kernel we boot into
- most of the above
- except for the boot device
- create an output with objects that kexec(8) can parse

Could be same module for both with different outputs

what do we call this thing? "revertable"

Wed Mar 15 19:11:09 GMT 2023

"revertable" implies mtd support for the rootfs and a ramdisk at
defined location

"tftpboot" implies "revertable", because it will use the same ramdisk


Fri Mar 17 11:44:40 GMT 2023

- patch the kernel kexec code to pass DTB to new kernel unconditionally
- unpatch kernel to pass command line to kexec (breaks DTB passing)
- decide how we specify rootfs. doing it by number is awkward
  - may be phram
  - may be real mtd root
  - may be real mtd root but renumbered because phram exists

Wild idea: we could probably get rid of the need for declaring a phram
device in the first kernel, if we can use kexec to copy the squashfs into
physical ram. As far as I can see this is a simple(sic) matter of
specifying it as a segment, but we would have to extend kexec-tools
to do this and it's quite a niche option if we make it do all the
mtd setup.

kexec --dtb=foo.dtb --map-file=squashfs@0x120000


Sat Mar 18 18:02:26 GMT 2023
What if: we added derivations for "apply openwrt changes" as packages,
which could then be called from the kernel derivation's extraPatchPhase?
There could be one for generic and one for each openwrt target

Mon Mar 20 18:40:53 GMT 2023

- kexec patch is sent to mailing list, keep an eye for replies
- watchdog
- ntp
- rebuild images for live devices
- can we build a static busybox with flashcp applet and scp it
 to arhcive etc?
- [DONE] install mailman and hyperkitty on myhtic, create mailing list

Tue Mar 21 22:59:54 GMT 2023

I haven't found a way to arm the watchdog before userland runs, which
would be really nice: although there's WATCHDOG_HANDLE_BOOT_ENABLED
and WATCHDOG_OPEN_TIMEOUT it doesn't seem to be sufficient. Maybe
those options work only when the hardware watchdog is already
armed. It might not be completely awful insofar as any failure to
mount root usually results in panic anyway, so provided we start
watching early in boot then there's not a big window for anything
to go wrong

What should the watchdog service do?  Ideally we want something that
"ratchets" : can be started in early boot and signals health as long
as the system is starting up, then once the system is in "steady
state" it stops pinging as soon as any part of that steady state
becomes unhealthy. This feels like a refinement for a much later
phase though.

Maybe the health criteria might be
(sshd and lan services are running) or (time since boot < 120s)
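
That criterion as a shell predicate, as a sketch only. The function
names are invented, and how "sshd and lan services are running" gets
determined is deliberately left to the caller.

```shell
# uptime_secs: whole seconds since boot, from /proc/uptime
uptime_secs() {
  read -r up rest < /proc/uptime
  echo "${up%%.*}"
}

# should_feed UPTIME SERVICES_OK: succeed if the watchdog should be fed.
# SERVICES_OK is 0 when sshd and the lan services are running (assumption:
# the caller works that out somehow and passes the result in).
should_feed() {
  [ "$2" -eq 0 ] || [ "$1" -lt 120 ]
}
```

A feeder loop would call `should_feed "$(uptime_secs)" "$status"` each
interval and stop pinging the watchdog device as soon as it fails.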

Thu Mar 23 00:11:23 GMT 2023

tftpboot and (kexecboot || flashable) have incompatible DTB-finding
strategies which is painful if you add both modules and then
expect tftp booting to still work

Maybe we could patch the kernel to use some better strategy for when
to use/ignore the bootloader command line: e.g "only if it
contains the string 'liminix'". Could do this by patching
arch/mips/kernel/setup.c bootcmdline_init to

if(strstr(arcs_cmdline, "liminix") == NULL) arcs_cmdline[0] = '\0';
and then defining CONFIG_MIPS_CMDLINE_DTB_EXTEND. The
bootloader command line then needs to specify only the
_additional_ parameters that weren't in the DTB

(later: that turned out to be quite straightforward)

Fri Mar 24 23:45:12 GMT 2023

- add ntp support
- [DONE] expose hydra to internet
- check MAC address weirdness?
- call Task 1 "done"

Sun Mar 26 00:19:15 GMT 2023

Would be nice to have a flash.sh built in outputs.flashable

Sun Mar 26 15:27:14 BST 2023

Let's think about services and modules.

Module
+ can change global config
  * add users, groups etc
  * change kernel config
  * change busybox config
+ well-typed parameters
- is a "singleton": can't have the same module included twice
  with different config. e.g. can't have two hostap modules running on
  different wlan radios.
- can't express dependencies: a depends on b

suppose:

* modules add service functions to the config? then there's no
way to define a service while forgetting to import the module

* we use the lib.types stuff for service function arguments

* maybe we stop naming services.foo for every damn thing

* but remember, s6 services do need unique names


imports = [ ../modules/dhcp4 ];

services.dhcp4 = config.services.udhcp {
  interface = lan.device;
  options = {
    foo = true;
    bar = 42;
  };
  depends = [ services.some_other_thing ];
};

modules/dhcp4 udhcp fn needs to define a type for its argument, then
use something like

  if arg_type.check def.value then res
            else throw "The option value `${showOption loc}' in `${def.file}' is not a ${arg_type.name}."

(where def comes from I don't know yet)

Tue Mar 28 10:44:40 BST 2023

we should reserve the name "service" for actual instantiated
services. This means we need a name for the functions that
make services. "class", "template", "fn", "maker", "factory"?
And a namespace name so they're not interleaved with real
services, which sort of suggests they are packages

if we want to do services = {
 foo = longrun { ... };
 bar = longrun { ... };
}

without repeating the `name` as an attribute of the longrun,
then longrun can't return a derivation: it has to return some
function that accepts `name` as a parameter.

where services.a depends on services.b, at the time its builder is run
it needs to know what name s6-rc will use for service b

maybe an s6 service definition should be an attrset not a derivation.

maybe this is outside scope for phase 2

Tue Mar 28 13:22:06 BST 2023

Reading nixos/doc/manual/development/building-parts.chapter.md it
suggests to me that we should rename config.outputs to
config.system.outputs. The more general question here is whether it's
good to be augmenting a variable called "config" with all this
generated stuff that is patently not configuration - perhaps putting
it under a "system" key will keep it all in one place

Tue Mar 28 13:32:30 BST 2023

how should we handle filesystem state? e.g. resolvconf service

if a service provides a file at a known global pathname, it can't be
parametrised - it must be a singleton.

Tue Mar 28 20:25:20 BST 2023

wondering if we should swap phases 2 and 3. We can't really address
modules without addressing services, which is phase +n, whereas we
can tackle overlay/ubi whenever

nand flash may have bad blocks
nor flash (supposedly) doesn't

ubi provides erase counts and bad block remapping on top of the mtd
interface. this means we should avoid flashcp of a ubi image straight
onto (nand) mtd as we will lose the erase counts and bad block information
that UBI tracks.

overlayfs works on a filename basis, so might not be very effective :
any change that results in a new store path will mean the entire package
appears in two places. I think it's reasonable to offer squashfs or
ubifs without overlay.

open questions:

1) if uboot doesn't support UBI, we can't boot a kernel on a ubifs
so we need reserved space for the kernel.

- unless we add some padding after the kernel, every new kernel
that's bigger than its predecessor will trash the start of the
ubi space (and wipe out its erase count)
- This suggests we should build more stuff as modules and less as
compiled-in

2) once a device has had a ubi volume created on it, probably we want
to use ubi-aware tools to update that volume in future instead of a
whole new flash, because we wish to preserve erase counts. This means
running ubiformat --image-file=foo.ubi on the device instead of flashcp

we can add a "ubi-flashable" output that creates a .ubi image and
a flashcp image that wraps it, with instructions on which to use.

Fri Mar 31 22:13:54 BST 2023

Error: too small LEB size 3968, minimum is 15360

> This error means that you are trying to mount too small UBI
volume. Probably because your flash is too small? Try to use JFFS2,
then, because it suits small flashes better since it has much lower
space overhead. Indeed, UBIFS stores much more indexing information on
the flash media than JFFS2, so it has much higher overhead. Also, UBI
has some overhead (see here). Thus, if you have a small flash device
(e.g., about 64MiB), it makes sense to consider using JFFS2.


Argh. Oh well,

Sat Apr  1 15:27:39 BST 2023

There's limited value in recreating pseudofiles for jffs2 because
the system is writable - changes made to /bin, /dev etc in config.filesystem
should take effect on a running system.

Can we take inspiration from https://grahamc.com/blog/erase-your-darlings/ ?

in early boot:
 mount ramfs on /
 mount the writeable filesystem on /persist/
 bind mount /persist/nix on /nix
 run script to populate rootfs from pseudofiles
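
Roughly, as shell. This is a sketch under assumptions: it stages the new
root at /target rather than mounting directly on /, on the guess that it
runs from an initramfs, and the device and filesystem type are whatever
the caller passes in. The `run` wrapper and DRY_RUN are invented so the
sequence can be printed without being root.

```shell
# run CMD...: execute CMD, or only print it when DRY_RUN is set
run() {
  if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# early_boot DEVICE FSTYPE: DEVICE and FSTYPE describe the writable
# filesystem, e.g. /dev/mtdblock5 jffs2 (both hypothetical here)
early_boot() {
  run mkdir -p /target
  run mount -t ramfs none /target
  run mkdir -p /target/persist /target/nix
  run mount -t "$2" "$1" /target/persist
  run mount -o bind /target/persist/nix /target/nix
  # populating the rest of the root from pseudofiles would go here
}
```

With `DRY_RUN=1 early_boot /dev/mtdblock5 jffs2` it just lists the
commands in order.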

on a router, do we need _anything_ persistent that's outside the store?

 - state for dhcp leases and stuff
 - secrets
 - maybe, files that the user has downloaded

this will probably require initramfs. if we just use jffs2 as the rootfs and
don't worry about /persist, we can skip that step.




[ aside: I think we may be putting two busyboxes in the image:
see modules/s6/default.nix s6-init-scripts has buildInputs = [busybox];  ]

Mon Apr  3 18:34:26 BST 2023

suppose
- we boot the system with systemConfig=/nix/store/eeeeee-system
- the early-init script runs /target/$systemConfig/create-root /target
  after mounting /target
- then it runs chroot /target $systemConfig/bin/init "$@"

or maybe we could combine those steps?
or maybe it doesn't matter too much ...

Thu Apr  6 21:25:41 BST 2023

what now?

- put a jffs2 onto some hardware device
  - what do we do with uboot?
  - should we pad the kernel?
  - maybe kernel module support would be good if we're making it
    hard to do kernel updates
- try the nix-copy-closure thing and work out what else we don't know
- [done] detect endian correctly


to ask a different question, what else do we need to dogfood a router?

Sun Apr  9 10:06:08 BST 2023

- rename outputs.flashable to outputs.flashimage
- rename modules/flashable to modules/flashable_ro
- create outputs.flashable in modules/jffs2
- rename modules/jffs2 to modules/flashable_rw
- add enable config to both?
- enable kernel module compilation


Mon Apr 10 23:50:41 BST 2023

- initramfs parses /proc/cmdline to find root fs, might not play
nice with defaulting

- how to build kernel modules

- look at closure size, is it this big because we've broken it
or is jffs2 usually this much bigger than squashfs

- maybe squashfs with overlay might be better if we could
ensure hardlinks?

- maybe there's something like overlayfs but content-addressable?

Sat Apr 15 18:57:46 BST 2023

for the same configuration:

-r--r--r-- 1 root root 6066176 Jan  1  1970 /nix/store/0x271rg45mcjjgbma9wi31h1yd109fpy-frob-squashfs
-r--r--r-- 1 root root 12255232 Jan  1  1970 /nix/store/zx11adagcbzqsnqkyz5kgvr392vhlrpr-make-jffs2

may want to reconsider not using squashfs with overlay

Wed Apr 19 22:22:48 BST 2023

Where next?

Sun Apr 23 18:24:34 BST 2023

- we are down to ~ 11MB image for a barely functional (IPv4) router
  this is by avoiding all dependencies on openssl or gnutls
- rotuer is not recognising when I set the hostname
- I may have forgotten the root password :-(
- why is hello world 70K unless hardeningDisable?

Fri Apr 28 20:51:52 BST 2023

To do nix-copy-closure we need nix-store, which is a symlink to nix,
which is

-rwxr-xr-x 1 dan users 2.3M Apr 28 21:08 nix

(stripped). This is a lot bigger than, say, a simple script to
loop through the closure of a derivation and copy only the store
folders that don't exist already.

* we'd like to only transmit the packages that aren't already present

* we'd like to use a single ssh connection


S: here is a list of package names
C: these are the names of the packages I want
S: here are the packages

while read -r f ; do
  test -d "$f" || echo "$f"
done

Tue May  2 21:53:08 BST 2023

1) we have a script that runs on the receiver, which

 - accepts a list of store paths
 - prints the missing store paths
 - runs cpio -i < stdin

2) we need a script for the sender that

 - refs=$(nix-store -q --references $1 && echo end)
 - opens ssh connection
 - print ssh $refs
 - needed= capture result until "end" received
 - find needed | cpio -o  > ssh-connection
 - close connection

3) to have a reasonable hope of testing this we should do it with qemu. It would be nice
if we could connect without faff to the qemu lan interface : either we do this by bringing up
another qemu vm (preferably with the host store shared, otherwise it has to build a mips cross
compiler/libc) or maybe we could do something unholy with ssh ProxyCommand

ssh -o ProxyCommand="socat - UDP4-DATAGRAM:230.0.0.1:1234,sourceport=1234,reuseaddr,ip-add-membership=230.0.0.1:127.0.0.1"



4) we haven't solved garbage collection, though I think "remove everything not in
nix-path-registration" might be what's needed there
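
The receiver half of item 1 could be as small as this. It is a sketch:
the function name is invented, and the "end" sentinel is borrowed from
the sender notes in item 2.

```shell
# missing_paths: read store paths from stdin, one per line, until a line
# saying "end"; print the ones that don't exist here, then "end" again so
# the sender knows the list is complete. (Hypothetical helper.)
missing_paths() {
  while read -r path; do
    [ "$path" = end ] && break
    [ -e "$path" ] || echo "$path"
  done
  echo end
}
```

Anything the sender writes after its own "end" line is left unread, so
in principle the same stdin can then be handed to cpio -i.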

Wed May  3 22:01:19 BST 2023

Something weird is going on with qemu net device enumeration: when I
run it interactively I'm getting the access network (mac ending :02)
on eth0 and the lan (mac ending :01) on eth1, and if it's behaving the
same in CI then how come any of the tests work? vanilla-configuration.nix
definitely assumes lan=eth0

By switching from -device virtio-net-pci to -device virtio-net I get
the desired behaviour back

Sat May  6 18:42:28 BST 2023

Next:

- package min-copy-closure
- see if we can use it on some output to copy the whole system closure
- post-copying symlink munging
- try it on a real device, see if it works for config file updates
- collect-garbage/delete-old-generation


Sun May  7 23:03:03 BST 2023

Shortly after all the work to reduce system closure size last time, I
tried adding the necessary packages to support nix-copy-closure and
saw it start building a complete C++ system with Boost. My fears that
this would lead to quite a large increase in the system size were, it
turned out, entirely founded.

So I wrote my own - or at least, a quite minimal substitute. The core
logic is simple - on the sender, we get the list of required packages,
then we check for the existence of `/nix/store/eeeeeee-foo` for
each of them on the target, and whatever's missing we send across the
link using cpio.

It sounds simple, and it should be simple, and in retrospect it _was_
simple. Along the way I went on a bit of a Qemu networking tangent and
learned quite a lot about the bash `coproc` command

Tue May  9 21:06:53 BST 2023

General direction of my thoughts:

- get a baseline working rotuer system
- prove that min-copy-closure works with it
- refactor the crap out of it
- configurablise the bordervm usb ethernet setup
- when we have a good idea of how/whether min-copy-closure *actually*
 works, declare "writeable filesystem" to be done
- start to get more of a feel for how the services/config hang together

? why does rotuer not have a hostname?

? how can we get a device hooked up to rotuer's lan port that we can
control remotely

Sun May 14 23:25:46 BST 2023

the outputs.systemConfiguration attribute builds a derivation
containing a single file bin/activate

_Presumably_, copying its closure will copy all the things, as
we already use it as the roots for jffs2 creation. However, there
is also a symlink created from /init at jffs2 creation

Mon May 15 21:32:38 BST 2023

Had a neat idea about using an overlayfs combining jffs2 and ramfs
to do upgrades that would otherwise be larger than the flash.
Could use "overlay merge" from https://github.com/kmxz/overlayfs-tools

Wed May 17 15:18:55 BST 2023

liminix-rebuild doesn't collect garbage (this is a missing feature, not
a bug). We think we can fix this using nix-path-registration: specifically,
by deleting anything not in it.
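
A sketch of the "anything not in it" part, assuming the registration
data has first been boiled down to a file with one live store path per
line (the real nix-path-registration file has more in it than that).
The function name is invented, and it only prints candidates; it does
not delete anything.

```shell
# garbage_paths STORE KEEPFILE: print entries of STORE that are not listed
# in KEEPFILE. KEEPFILE is assumed to hold one full store path per line.
garbage_paths() {
  store=$1
  keep=$2
  for p in "$store"/*; do
    [ -e "$p" ] || continue          # empty store: glob didn't match
    grep -qxF "$p" "$keep" || echo "$p"
  done
}
```

Printing first and deleting as a separate step makes it easy to eyeball
the list before trusting it with rm.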

What we're going to do: build a fresh system image for rotuer, then
dogfood liminix-rebuild until we've succeeded in getting it to
change its hostname

Also wondering if we should drop outputs.default, but maybe not

* systemConfiguration: used for updates
* vmroot: used for qemu
* flashimage: used for flashing
* tftproot: used for dev/test

As long as we're consistently setting the default output to whichever
is the appropriate "full production image" I think we're good.

Wed May 17 22:45:40 BST 2023

Random thought: when we bind mount /target/persist/nix to /target/nix
we could make it read-only. worth doing?

Thu May 18 10:59:39 BST 2023

- liminix-rebuild can't find reboot: probably the PATH is just
  generally wrong for ssh sessions (maybe all non-login sessions?)

- need to copy path registration file somewhere useful and
  delete stuff not in it at the appropriate time. Would be safest
  to do that either late in the shutdown process before rebooting,
  or during boot.

Fri May 19 15:18:13 BST 2023

If we make min-collect-garbage - just a command you can run whenever -
that will be fine for current capabilities.  It won't work with the
theoretical overlayfs system, though: we need to copy-down from the
ramfs to real flash before rebooting, and that can't happen until
there's disk space to do it

Sat May 20 22:35:25 BST 2023

We have a working min-collect-garbage (seems to, anyway ...)

- having ssh host key wiped on reboot is sucky. maybe we can have
/persist/secrets and a service that looks there?

- find out what files ash sources on non-login shell startup
  [ set ENV=/etc/ashrc in parent process env ]

- services.default is suboptimal as there is no way to add to it
without wiping it

- decide whether to use liminix- or min- as our prefix for nixy
  commands

- should we move config.outputs  -> config.system.outputs ? see Mar 28

- less crap firewall

- add ipv6 support to rotuer

- create an l2tp configuration

- iperf and tuning

- wlan country code


dropbear weak hashes? https://github.com/mkj/dropbear/issues/138

Sun May 21 11:48:07 BST 2023

dropbear will generate host keys on first connection. It's (probably) good that the
key is generated on-device and also that we wait until there's some randomness.
It's not so good that it will only write the key to DSS_PRIV_FILENAME which is
hardcoded to /run

Sun May 21 17:27:31 BST 2023

What do we need for ipv6?

-  upgrade ppp to something with an ipv6-up-script option, move ppp and pppoe derivations into their own files
-  get ipv6 address from pppoe
-  get ipv6 delegation from DHCPv6
-  support dhcp6 in dnsmasq, and advertise prefix on lan
-  firewall settings

Sun May 21 21:30:17 BST 2023

Making hydra build the docs is straightforward, but making it
_publish_ the docs is outside scope, really. It can serve the files
but they're all text/plain

Should hydra push the docs to www.liminix.org or should www.liminix.org
pull?

TODO-at-some-point: assign uids and gids dynamically, somehow

Tue May 23 22:56:33 BST 2023

following the guidance at https://support.aa.net.uk/IPv6:

we run odhcp6c to do the router solicitation/advertisement dance


odhcp6c environment variables:

RA_ADDRESSES=
RA_REACHABLE=0
CER=
PASSTHRU=00170020200108b0000000000000000000002020200108b0000000000000000000002021
SERVER=fe80::203:97ff:fed6:0
RA_MTU=0
RA_ROUTES=::/0,fe80::203:97ff:fed6:0,65535,512
OPTION_1=00030001e4956e4ef2fa
NTP_FQDN=
OPTION_2=00030001000397d60000
RA_DOMAINS=
DOMAINS=
AFTR=
SIP_IP=
NTP_IP=
PREFIXES=2001:8b0:de3a:40dc::/64,7198,7198
RA_HOPLIMIT=64
RA_DNS=
RDNSS=2001:8b0::2020 2001:8b0::2021
SNTP_IP=
RA_RETRANSMIT=0
SIP_DOMAIN=
ADDRESSES=2001:8b0:1111:1111:0:ffff:51bb:4cf2/128,3598,7198

# ip -6 route |grep default
default via fe80::203:97ff:fed6:0 dev ppp0  metric 1024  expires 65211sec

presumably from RA_ROUTES but why is the metric apparently doubled?
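
To read the RA_ROUTES entry field by field, assuming the order given in
the odhcp6c README (destination, gateway, valid lifetime, metric). The
function name is made up.

```shell
# route_fields ENTRY: label the four comma-separated fields of one
# RA_ROUTES entry. Field order is an assumption taken from the odhcp6c README.
route_fields() (
  IFS=,
  set -f          # no globbing while the entry is split
  set -- $1
  echo "dest=$1 gateway=$2 valid=$3 metric=$4"
)

route_fields "::/0,fe80::203:97ff:fed6:0,65535,512"
# prints dest=::/0 gateway=fe80::203:97ff:fed6:0 valid=65535 metric=512
```

On that reading the advertised metric is 512 and the lifetime 65535,
against metric 1024 and 65211sec left on the route the kernel shows.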

Tue May 30 21:25:37 BST 2023

We have an odhcpc script that preserves the prefix delegation from the
ISP.  We need a service that notices whenever the state is
available/has changed, and updates the LAN IPv6 address.

The service can depend on odhcp

add inotify to packages
use writeFennelScript with that dep
see if it works
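
The address arithmetic that service needs is small. A shell sketch,
with assumptions: the PREFIXES format is as seen in the environment
dump on May 23, the router takes the ::1 address (a made-up policy),
and only prefixes written as "<network>::/<len>" are handled. Both
function names are invented.

```shell
# first_prefix PREFIXES: the prefix/length part of the first entry
first_prefix() {
  entry=${1%% *}        # first space-separated entry
  echo "${entry%%,*}"   # drop the lifetimes after the first comma
}

# lan_address PREFIX: hypothetical policy of giving the router ::1 in the
# prefix; only copes with prefixes of the form <network>::/<len>
lan_address() {
  net=${1%%::/*}
  len=${1##*/}
  echo "${net}::1/${len}"
}

lan_address "$(first_prefix "2001:8b0:de3a:40dc::/64,7198,7198")"
# prints 2001:8b0:de3a:40dc::1/64
```

The result is what would be handed to `ip address add ... dev <lan device>`.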

Wed May 31 23:33:00 BST 2023

We have a thing that sets ipv6 address on lan interface, yay us

A firewall would be a very good idea

Thu Jun  1 18:46:59 BST 2023

TODO for now:

- services.default is suboptimal as there is no way to add to it
without wiping it

- decide whether to use liminix- or min- as our prefix for nixy
  commands

- should we move config.outputs  -> config.system.outputs ? see Mar 28

- less crap firewall

- create an l2tp configuration

- iperf and tuning

- wlan country code

Thu Jun  1 21:26:37 BST 2023

how can a client machine "opt out" of using the firewall, to allow
incoming connections?  Most convenient would be to have a separate SSID
for grownups. Assuming it shows up as a separate wlan device, we can
write firewall rules to allow incoming connections on that interface
(can we? only if the packet is identifiable as destined for that interface)

https://www.rfc-editor.org/rfc/rfc6092.html
https://emailstuff.org/rfc/rfc7084

We could block incoming for slaac and dhcp addresses and permit it for
stable private addresses. If we were fairly sure that devices won't
ask for stable private addresses just for funsies.

https://wiki.archlinux.org/title/IPv6_#Stable_private_addresses



Fri Jun  2 14:42:43 BST 2023

I found a handy guide to nftables at https://ww.telent.net/2023/6/2/turning_the_nftables

Mon Jun  5 16:56:44 BST 2023


How are we going to do this firewall thing then?
I can see no reason to have more than one table per family, so let's
just name the tables after families

There is nothing in nftables for functionally grouping rules by
requirement that may touch multiple hooks/chains, so we need our own
abstraction - and we can't call it any name that nftables uses already
(so, not "ruleset"). rulegroup?

"policy" would be a good name except that it's already taken

"concern"? "requirement"? "feature"?

Mon Jun 19 20:45:48 BST 2023

why is chrony using libedit?

Thu Jun 22 09:52:57 BST 2023

- There is a lot more lua being installed (luac, docs, static
  libraries etc) than we really need.

- update user docs to include a list of supported targets

Thu Jun 22 23:43:06 BST 2023

- is there a sysfs to enable ipv6 forwarding?
- we haven't an ipv4 firewall yet
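
On the first question: it's a sysctl under /proc/sys rather than
anything in sysfs. A sketch, with an invented function name and a
PROC_SYS override that exists only so it can be tried without touching
the running system.

```shell
# enable_ipv6_forwarding: turn on forwarding for all interfaces.
# PROC_SYS is a made-up override for testing; normally /proc/sys.
enable_ipv6_forwarding() {
  echo 1 > "${PROC_SYS:-/proc/sys}/net/ipv6/conf/all/forwarding"
}
```

This is the same knob that `sysctl net.ipv6.conf.all.forwarding=1` sets.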


PATH=`echo /nix/store/*nftables*/bin`:$PATH
nft list ruleset

Thu Jun 22 23:58:58 BST 2023

Looks like we're missing at least one kernel config setting for
nftables. Would this be a good time to do a derivation for building
kernel modules?


Sun Jul  9 21:21:17 BST 2023


Tue Jul 11 22:10:17 BST 2023

- s6 cheatsheet, find or write
- could we have > 1 module add to services.default?
- odhcp should parse values from environ and write more files, to save readers
 from parsing it
- pkgs.liminix, who knows what that's for any more?
- interface.device, as a general rule, doesn't work because the
  device name may be known only at runtime (e.g. for ppp)
- iperf
- figure out wifi regdomain

Tue Jul 11 23:01:59 BST 2023

We can make services depend on kernel modules, however not on baked-in
kernel config

[from March: "Let's think about services and modules."]

Module
+ can change global config
  * add users, groups etc
  * change kernel config
  * change busybox config
+ well-typed parameters
- is a "singleton": can't have the same module included twice
  with different config. e.g. can't have two hostap modules running on
  different wlan radios.
- can't express dependencies: a depends on b

thought I had then was: modules provide services. requiring the
ppp module causes config.services.ppp to exist, so you can

services.default = [
  (config.services.ppp {
    tty = "17";
    baud = "57600";
    secrets = blah;
    })
  ...
]

this might work. also though we should find out how to do type checking on
service params

Wed Jul 12 23:23:02 BST 2023

https://github.com/NixOS/mobile-nixos/pull/406 // why mobile nixos uses
mobile.outputs instead of system.build

  suggests that system.build may not be a thing to blindly emulate

if a service is a derivation should we expect to want to be able to call
it with .override? maybe we want to override the package containing the
daemon it runs. How do we best pass service config as well?

Maybe a service template is a function that returns a derivation


imports = [ ./modules/pqmud.nix ];
services.mud = system.services.pqmud {
  realm = "A deep cavern";
  port = 4067;
  users = import ./allowed-users.nix;
  # etc etc
};
services.mudBeta = let mud =
  system.services.pqmud {
    realm = "A very deep cavern";
    port = 4068;
    users = import ./allowed-users.nix;
  };
  in mud.overrideAttrs { pqmud = pkgs.pqmudLatest; };

so we have

config.system.services # services provided by modules
config.system.outputs  # build artefacts of various types

the services provided by a module must be introspectable in some way
so that we can compile a list of service options per module

service parameters are defined using the module type system.
Something like this?

# mud.nix

system.services.pqmud = args :
  let t = {
    name = mkOption { type = types.str; };
    realm = mkOption { type = types.str; };
    port = mkOption { type = types.port; default = 12345; };
    users = mkOption { type = types.any; };
  };
  in assert isType (submodule { options = t; }) args; longrun {
     inherit (args) name;
     run = "${pkgs.pqmud}/bin/pqmud --port ${toString args.port} .....";
  };

Fri Jul 14 19:07:59 BST 2023

It works for pppoe, though typechecking error messages could be
better.

- We need to find a good place to keep the typeCheck function so that
everyone can use it.

- also the type_service type defn exists only locally in modules.nix,
and we would like to refer to it elsewhere

Thu Aug 10 21:46:36 BST 2023

to finish service/modules milestone

[done] there are some modules not using serviceDefn
- modules included by standard.nix should have all their options
  grouped together in docs
  how can we determine which they are? or maybe "modules that
   don't contain services" is an acceptable criterion
  maybe this is not actually an issue, if the modules are
   reasonably coherent. It looks odd now because base.nix is a mess
[done] print the module pathname so people know what to import
[done] docs don't print the examples
[check?] and seem to be getting the default wrong too
- decide what we deem to be "internal" (if anything)
   is `filesystem` internal, for example? or `busybox`? they're
   both mostly _used_ internally but may still be valuable to expose
[done] maybe document outputs separately or not at all?
[done] bridge to be one service instead of two?
[done] get rid of services/
- anything else in rotuer.nix that we should servicify
- services for liminix.networking
- a nice way to specify service dependencies
[done] - do another video

Mon Aug 21 20:02:55 BST 2023

a nice way to do dependencies would be something like

services.thething =
  let s = svc.thing { .... };
  in addDependencies s (with config.services; [otherthing yetanother]);

except that addDependencies is a really klunky name. dependsOn is very
slightly better? or maybe it could be a function of the derivation?

services.thething =
  (svc.thing { .... }).depends (with config.services; [otherthing yetanother]);

---

what does it mean to be dependent on an interface? that's it up? running?
has an address? has a collection of addresses?


  services.defaultroute4 = route {
    name = "defaultroute4";
    via = "$(output ${services.wan} address)";
    target = "default";
    dependencies = [ services.wan ];
  };

- this route requires the interface to have an address (if wan is an
  interface, anyway ...)

- but otoh a dhcp client doesn't want to wait for an address, because
   it is assigning the address.

should an address provider have "interface name" as an output?
is there a set of outputs that every address provider should have -
whether static, dhcp, pppoe?

maybe we're in decision paralysis and should just move forward with
what we know

Wed Aug 23 18:56:08 BST 2023

We may want to change the hardware device files to specify network
interface names not services. Otherwise hardware devices (boards)
depend on module-based-services, which is a bit weird.

Thu Aug 24 18:54:03 BST 2023

- we want network and bridge to be separate modules, because bridge
introduces extra kernel config

- bridge/service wants to create a network device ("ip link"),
using quite similar code as network/link.nix

- but bridge/service is a derivation: it has sight of pkgs but not
config

offtopic: useful s6-rc commands at https://www.skarnet.org/software/s6-rc/faq.html

Fri Aug 25 23:37:57 BST 2023

where we left off: bridge is a bundle, and bundles can't have outputs,
so how do we set the ifname of the bridge?

- ifname of the primary is set
- actually, most things that depend on the bridge really just depend
  on the primary anyway (it's OK if 1 <= n < #members are down)
- but *something* should depend on all the members

turns out maybe we needed two services after all?

Sun Aug 27 23:50:18 BST 2023

I've done enough to make rotuer build, but in the process
trashed vanilla-configuration as I entirely forgot we don't have
a dhcpv4 client service. Need to fix that ...

- anything else in rotuer.nix that we should servicify
- anything in vanilla-configuration ditto
- and arhcive (rsync, watchdog)
- services for liminix.networking
- tidy up the dependency handling in serviceDefn build
   (interface is fine, implementation is a bit brutal)
- write a blog entry

Mon Aug 28 16:58:49 BST 2023

- [done] ntp is not setting the time
- nftables syntax error

Thu Aug 31 23:53:54 BST 2023

- anything else in rotuer.nix that we should servicify
 [done] - packet forwarding
  - dhcp6 client
  - what to do with acquire-{wan,lan} scripts?
  - resolvconf
- [done] anything in vanilla-configuration ditto
  - packet forwarding
- and arhcive
  - [not doing] rsync
  - [done] watchdog
  - [done] mount
- nftables syntax error
- tidy up the dependency handling in serviceDefn build
   (interface is fine, implementation is a bit brutal)
- [done] services for liminix.networking
- [done] write a blog entry
- [done] ntp is not setting the time
- [done] static dhcp(6) lease support reqd for dogfooding

Sat Sep  2 21:35:41 BST 2023

Considerations for "mount" service: each filesystem needs to depend on
any mount points for its parent directories, and maybe also on other
services (e.g. filesystem modules, network devices, routes)

mountpoints = {
 mnt = {
   media = svc.mountpoint.build {
     fstype = "msdos";
     device = "/dev/sda1";
     options = [ ...];
   };
   archive = svc.mountpoint.build {
     fstype = "ext4";
     device = "/dev/sda2";
     options = [ ...];
     mountpoints = {
       remote = svc.mountpoint.build {
         fstype = "nfs";
         device = "doc.ic.ac.uk:/public";
       };
     };
   };
  };
}

services.somethingelse = svc.ftpd.build {
  # ...
  dependencies = [ mountpoints.mnt.archive ];
}

what don't we like about this? we have to walk the nested attrset in a
weird way, because the services may contain other mountpoints. Maybe
just keep it simple and do


services.mountpoints = bundle {
 name = "mountpoints";
 contents = [
   (svc.mountpoint.build {
     device = "/dev/sda2"; fstype = "ext4"; directory = "/mnt/isos";
   })
   (svc.mountpoint.build {
     device = "/dev/sdb1"; fstype = "msdos"; directory = "/mnt/backup";
     dependencies = [ load-vfat-module ];
   })
 ];
};
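
With the flat list, the "depends on the mount points for its parent
directories" part has to be worked out from the directory names. A rough
sketch of that calculation in Python (the real thing would be Nix; the
function names and the example directories are made up for illustration):

```python
import posixpath


def parent_mount(directory, all_dirs):
    """Return the nearest enclosing mount point of `directory`, or None."""
    candidate = posixpath.dirname(directory.rstrip("/"))
    while candidate not in ("", "/"):
        if candidate in all_dirs:
            return candidate
        candidate = posixpath.dirname(candidate)
    return None


def mount_dependencies(all_dirs):
    """Map each mount directory to the mount point it must wait for."""
    return {d: parent_mount(d, all_dirs) for d in all_dirs}


if __name__ == "__main__":
    # hypothetical directories, loosely following the nested example above
    dirs = ["/mnt/media", "/mnt/archive", "/mnt/archive/remote"]
    for directory, parent in mount_dependencies(dirs).items():
        print(directory, "depends on", parent)
```

This only covers the parent-directory dependencies; anything else (kernel
modules, network, routes) would still be listed by hand as in the bundle
above.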

Sun Sep  3 17:34:36 BST 2023

how to dogfood

DHCP6 server: static lease support
DHCP client and acquire-{lan-prefix,wan-address}

The emergency boot thingy in glinet u-boot won't help because it
expects to flash from its tftp request instead of booting it. So we
could use kexec instead except that the openwrt install doesn't have
it.  So we could swap the hardware devices, the only downside of that
being that then I don't have a test system any more. Or we could YOLO it.

Sun Sep  3 22:11:02 BST 2023

I think we should rejigger the documentation ...

- "getting started": worked example, building and installing Liminix
with a very simple config (wifi AP with ssh daemon)

- using modules
  - link to module reference

- creating custom services
  - longrun or oneshot
  - dependencies
  - outputs

- creating your own modules

- hacking on Liminix itself

- contributing

- external links and resources

- module reference

- hardware device reference

---

I think we might rename wlan_24 to wlan and wlan_5 to wlan1.
This is on the assumption that almost no device is 5GHz only, so
would make it easier to write a basic wlan example that works
both on 2.4GHz boards and dual radio boards

Mon Sep  4 23:15:26 BST 2023

If dhcpcd parsed the update-script output into separate files, half
the complexity of acquire-lan-prefix would go away. The other half is
because it subscribes to changes in the outputs instead of just
running once. Perhaps there's a better way to do that?

Could separate prefixes and addresses something like this...

outputs/prefix/2001\:8b0\:de3a\:40dc\:\:/prefix
outputs/prefix/2001\:8b0\:de3a\:40dc\:\:/length
outputs/prefix/2001\:8b0\:de3a\:40dc\:\:/preferred
outputs/prefix/2001\:8b0\:de3a\:40dc\:\:/valid
outputs/prefix/2001\:8b0\:de3a\:80\:\:/prefix
outputs/prefix/2001\:8b0\:de3a\:80\:\:/length
outputs/prefix/2001\:8b0\:de3a\:80\:\:/preferred
outputs/prefix/2001\:8b0\:de3a\:80\:\:/valid

the directory name is arbitrary as long as it's unique. Might even be better to
remove the colons

outputs/prefix/200108b0de3a40dc/valid

or we could adopt the MS convention and replace with hyphens

outputs/prefix/2001-8b0-de3a-40dc--/prefix
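
To compare the two naming options, here is a small Python sketch (the
helper names are invented, and the 64-bit default length is an assumption;
the real script would take it from the `length` output):

```python
import ipaddress


def hyphen_name(prefix: str) -> str:
    """Replace colons with hyphens, keeping the compressed notation."""
    return prefix.replace(":", "-")


def hex_name(prefix: str, length: int = 64) -> str:
    """Hex digits covering the first `length` bits, with no separators."""
    addr = ipaddress.IPv6Address(prefix)
    digits = f"{int(addr):032x}"
    return digits[: (length + 3) // 4]


if __name__ == "__main__":
    for p in ("2001:8b0:de3a:40dc::", "2001:8b0:de3a:80::"):
        print(hyphen_name(p), hex_name(p))
```

Note that the hex form zero-pads each group, so names sort and compare
predictably, whereas the hyphen form depends on how the address was
abbreviated.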

Also: we should write some kind of test for this...


Tue Sep  5 21:36:39 BST 2023

How do we set the cpu governor?


Fri Sep  8 21:26:36 BST 2023

We want a fennel thing that reads a filesystem tree into a nested
table. And a thing to diff two tables
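
Rough shape of what that might look like, sketched in Python rather than
Fennel (function names are invented; files are assumed to hold short text
values, as the outputs directories do):

```python
import os


def read_tree(root):
    """Nested dict: directories become dicts, files become their text."""
    tree = {}
    for entry in sorted(os.listdir(root)):
        path = os.path.join(root, entry)
        if os.path.isdir(path):
            tree[entry] = read_tree(path)
        else:
            with open(path) as f:
                tree[entry] = f.read().strip()
    return tree


def diff_trees(old, new, prefix=""):
    """Return (added, removed, changed) lists of slash-separated paths."""
    added, removed, changed = [], [], []
    for key in sorted(set(old) | set(new)):
        path = prefix + key
        if key not in old:
            added.append(path)
        elif key not in new:
            removed.append(path)
        elif isinstance(old[key], dict) and isinstance(new[key], dict):
            a, r, c = diff_trees(old[key], new[key], path + "/")
            added += a
            removed += r
            changed += c
        elif old[key] != new[key]:
            changed.append(path)
    return added, removed, changed
```

A whole added or removed subtree is reported once, by its top-level path,
which is probably what a subscriber wants ("this prefix went away").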

Sat Sep  9 22:40:50 BST 2023

Subscribers to odhcp6c outputs need to be able to tell which addresses
are new and which have been removed since the last run. Now that
odhcp-script produces parsed data, that means they need to compare
tables by value. Which is a faff.

What if the directory name were a hash of all the relevant fields
such that clients could just say "new directory, must be new address"

We can have literal prefix, then need to encode
length,preferred,valid,extra space-efficiently. I cannot currently see
any way to use whatever hashing Lua uses for its table lookups,
which is a bit disappointing, so we might have to make our own


https://gist.github.com/scheler/26a942d34fb5576a68c111b05ac3fabe
this is DJBHash, though doesn't appear to deal with integer overflow


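function hash(str)
    -- djb2: h = h * 33 + byte, truncated to 32 bits so the result
    -- does not depend on the width of Lua's integers
    local h = 5381

    for c in str:gmatch(".") do
        h = (((h << 5) + h) + string.byte(c)) & 0xffffffff
    end
    return h
end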
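
Sketch of the "literal prefix plus hash of the other fields" directory
name, in Python for ease of testing (`lease_dirname` and its argument
names are made up; the field list is the length/preferred/valid set
mentioned above, with "extra" left out):

```python
def djb2(data: bytes) -> int:
    """DJB hash (h = h * 33 + byte), truncated to 32 bits."""
    h = 5381
    for byte in data:
        h = ((h << 5) + h + byte) & 0xFFFFFFFF
    return h


def lease_dirname(prefix: str, length: int, preferred: int, valid: int) -> str:
    """Literal prefix (colons replaced) followed by a hash of the other fields."""
    fields = f"{length},{preferred},{valid}".encode()
    return f"{prefix.replace(':', '-')}-{djb2(fields):08x}"


if __name__ == "__main__":
    print(lease_dirname("2001:8b0:de3a:40dc::", 64, 7200, 7200))
```

A 32-bit hash can collide, so "new directory means new address" holds, but
"same directory means nothing changed" is only very probably true.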


Mon Sep 11 20:31:25 BST 2023

acquire-lan/wan-foo have no tests, and the test setup is a bit of a
faff as they are both waiting on the filesystem

also, testing lua scripts is a faff without splitting them into
script/module

am wondering if we could do some kind of convention that we only write
modules not scripts but something in the fennel->lua can call the
module's `run` method.

Tuesday

Here is a working shebang for write-fennel:

#!/nix/store/5iwv3h2jjbk2vib2bpwx3g9knpb02x3y-lua-5.3.6/bin/lua -e dofile(arg[0]).run()

Tue Sep 12 20:47:52 BST 2023

We don't handle unbound or stopped states in odhcp consumers. I think
probably we should do this in odhcp-script by deleting the outputs,
rather than making each consumer do it.

... turns out that odhcp6c itself unsets ADDRESSES and PREFIXES before
calling the script with "unbound", so maybe we don't need to do
anything special.

Wed Sep 13 17:55:33 BST 2023



@400000000000001f2723b3cb eth1.link.pppoe Script /nix/store/nyks8zl86dcp44k5sjcc76digrnfgm17-ip-up finished (pid 403), status = 0x0
@400000000000001f27b2db3b eth1.link.pppoe Script /nix/store/ds0lc4qd1zfiyxsva87rpplyr21awjh1-ip6-up finished (pid 404), status = 0x1

@400000000000001f30a7c5c5 /nix/store/v9ijgyywizqbbd9y73r2wifkxc0d1jjm-route-default-1a22c69d0e1f-up: line 4: input: not found
@400000000000001f31abf9b5 ip: command line is not complete, try "help"
@400000000000001f31ca1395 s6-rc: warning: unable to start service route-default-1a22c69d0e1f: command exited 1
@400000000000001f31f236b4 s6-rc: info: service route-default-d2586cf00da0 successfully started
@


Wed Sep 13 18:05:38 BST 2023

TODO

- service for dhcp6 client
- move acquire-{wan,lan} scripts out of examples/
- service for resolvconf
- nftables syntax error
- tidy up the dependency handling in serviceDefn build
   (interface is fine, implementation is a bit brutal)
- docs

considerations:

1) in some ways, we should be able to specify acquire-{wan,lan} as if
they were just additional addresses on the respective
interfaces. However, they're longruns so the implementation of
"address" doesn't really fit.

2) should they be bundled into a dhcp client service? I think the
answer is "no" because which of the dhcp config we want to
honour locally (and how) is policy not mechanism

svc.dhcp6c.client.build { interface = wan; };
svc.dhcp6c.address.build {
  inherit client;
  interface = lan;
};
svc.dhcp6c.address.build {
  inherit client;
  interface = wan;
};
svc.dhcp6c.prefix.build {
  inherit client;
  interface = lan;
  index = 1;    # default to first interface
};
svc.dhcp6c.prefix.build {
  inherit client;
  interface = vpn;
  index = 2;
};



Fri Sep 15 12:04:25 BST 2023

Qemu worked example provides dhcp and ssh service

Hardware worked example needs to be plugged into same lan as build
machine if we are going to tftp the image onto it - so it might be
awkward if we run dhcp on it

The device I have lying around is the A

How do we do the actual flash step? Assuming the device is running
stock firmware, from a laptop we can wifi to it and use the web ui to
upgrade

we can't build the hellonet config because it requires tftp

plug in mt300a
put stock firmware on it

Sun Sep 17 00:08:03 BST 2023

I don't think the user manual needs a full justification of why we
have the module/service split. Maybe we should have "decision records"
in the git tree instead

Sun Sep 17 16:44:31 BST 2023

Can we figure out which bits of the old doc are missing from the new
one and just transplant those? Then we can merge it sooner
instead of blocking on writing all the new stuff

Mon Sep 25 16:58:51 BST 2023

jffs2 on mt300a isn't finding root partition in initramfs, and it
seems to be because MTD_SPLIT_UIMAGE_FW isn't working

[    0.426792] spi spi0.0: force spi mode3
[    0.431305] spi-nor spi0.0: w25q128 (16384 Kbytes)
[    0.436322] 5 fixed-partitions partitions found on MTD device spi0.0
[    0.442875] OF: Bad cell count for /palmbus@10000000/spi@b00/flash@0/partitions
[    0.450400] OF: Bad cell count for /palmbus@10000000/spi@b00/flash@0/partitions
[    0.458208] OF: Bad cell count for /palmbus@10000000/spi@b00/flash@0/partitions
[    0.465751] OF: Bad cell count for /palmbus@10000000/spi@b00/flash@0/partitions
[    0.473522] Creating 5 MTD partitions on "spi0.0":
[    0.478466] 0x000000000000-0x000000030000 : "u-boot"
[    0.484447] 0x000000030000-0x000000040000 : "u-boot-env"
[    0.490888] 0x000000040000-0x000000050000 : "factory"
[    0.497110] 0x000000050000-0x000000fd0000 : "firmware"
[    0.596423] 0x000000ff0000-0x000001000000 : "art"
[    0.611508] gsw: setting port4 to ephy mode

with squashfs root it's the same but for the extra split partitions:

[    0.468715] Creating 5 MTD partitions on "spi0.0":
[    0.473653] 0x000000000000-0x000000030000 : "u-boot"
[    0.479652] 0x000000030000-0x000000040000 : "u-boot-env"
[    0.486085] 0x000000040000-0x000000050000 : "factory"
[    0.492318] 0x000000050000-0x000000fd0000 : "firmware"
[    0.499304] 2 uimage-fw partitions found on MTD device firmware
[    0.505457] Creating 2 MTD partitions on "firmware":
[    0.510543] 0x000000000000-0x000000260000 : "kernel"
[    0.516616] 0x000000260000-0x000000f80000 : "rootfs"
[    0.522570] mtd: device 5 (rootfs) set to be root filesystem
[    0.528565] 0x000000ff0000-0x000001000000 : "art"

turns out this is because the device thinks it has 4k erase block size
because MTD_SPI_NOR_USE_4K_SECTORS was set, and that was causing
mtdsplit to look in the wrong place for a root filesystem

Mon Sep 25 18:50:05 BST 2023

No, that wasn't it. Turned out to be an endianness-dependent check for
JFFS2 magic in mtdsplit.

setenv serverip 10.0.0.1
setenv ipaddr 10.0.0.8
tftp 0xa00000 result/uimage
bootm 0xa00000

Fri Sep 29 20:50:39 BST 2023

setenv bootargs 'liminix mtdparts=phram0:M(rootfs) phram.phram=phram0,0x40411f28,4194304,65536 memmap=4194304$0x40411f28 root=/dev/mtdblock0 console=ttyAMA0,115200 earlycon'

setenv serverip 10.0.0.1
setenv ipaddr 10.0.0.5
setenv bootargs 'liminix console=ttyS0,115200  panic=10 oops=panic earlycon=uart8250,mmio32,0x11002000 root=/dev/mtdblock0'
tftp 0x4007ff28 result/uimage ;  tftp 0x40432f28 result/rootfs
bootm 0x4007ff28

setenv bootargs 'liminix console=ttyS0,115200  panic=1 oops=panic earlycon  root=/dev/mtdblock0'

md 0x42ff0000

Cannot find regmap for /infracfg@10000000: -524.

# "console=ttyAMA0,38400 panic=10 oops=panic init=/bin/init loglevel=8 root=/dev/mtdblock0 rootfstype=squashfs fw_devlink=off"


40812468: 69726553 203a6c61 41424d41 304c5020    Serial: AMBA PL0
40812478: 55203131 20545241 76697264 3c0a7265    11 UART driver.<
40812488: 61723e33 706f6f6d 61203a73 6165726c    3>ramoops: alrea
40812498: 69207964 6974696e 7a696c61 3c0a6465    dy initialized.<
408124a8: 61723e34 706f6f6d 70203a73 65626f72    4>ramoops: probe
408124b8: 20666f20 66663234 30303030 6d61722e     of 42ff0000.ram
408124c8: 73706f6f 69616620 2064656c 68746977    oops failed with
408124d8: 72726520 2d20726f 3c0a3232 61433e33     error -22.<3>Ca
408124e8: 746f6e6e 6e696620 65722064 70616d67    nnot find regmap
408124f8: 726f6620 6e692f20 63617266 31406766     for /infracfg@1
40812508: 30303030 3a303030 32352d20 333c0a34    0000000: -524.<3
40812518: 6e61433e 20746f6e 646e6966 67657220    >Cannot find reg
MT7622>
40812528: 2070616d 20726f66 666e692f 66636172    map for /infracf
40812538: 30314067 30303030 203a3030 3432352d    g@10000000: -524
40812548: 3e333c0a 6e6e6143 6620746f 20646e69    .<3>Cannot find
40812558: 6d676572 66207061 2f20726f 72666e69    regmap for /infr
40812568: 67666361 30303140 30303030 2d203a30    acfg@10000000: -
40812578: 0a343235 433e333c 6f6e6e61 69662074    524.<3>Cannot fi
40812588: 7220646e 616d6765 6f662070 702f2072    nd regmap for /p
40812598: 63697265 31406766 32303030 3a303030    ericfg@10002000:
408125a8: 32352d20 313c0a34 616e553e 20656c62     -524.<1>Unable
408125b8: 68206f74 6c646e61 656b2065 6c656e72    to handle kernel
408125c8: 67617020 20676e69 75716572 20747365     paging request
408125d8: 76207461 75747269 61206c61 65726464    at virtual addre
408125e8: 66207373 66666666 66666666 66666666    ss fffffffffffff
408125f8: 0a656666 4d3e313c 61206d65 74726f62    ffe.<1>Mem abort


CONFIG_SERIAL_8250_FSL=y
CONFIG_SERIAL_8250_MT6577=y
CONFIG_SERIAL_8250_NR_UARTS=3
CONFIG_SERIAL_8250_RUNTIME_UARTS=3
CONFIG_SERIAL_DEV_BUS=y
CONFIG_SERIAL_DEV_CTRL_TTYPORT=y
CONFIG_SERIAL_MCTRL_GPIO=y
CONFIG_SERIAL_OF_PLATFORM=y

Mon Oct  2 10:17:04 BST 2023

We have a bootable aarch64 kernel for the Belkin, but it does not
understand the memmap= parameter we're using to protect the
phram image from being used as general memory.

One option is to add a reserved-memory stanza in the device tree,
using u-boot "fdt" command, but we don't know the fdt address in u-boot
because it doesn't have any commands to parse the image and set
variables pointing at the sub-components. (There's iminfo, but it's
only human-readable)

Second option is to amend the dtb in the tftpboot module: this
would mean regenerating the uimage

Third option: for tftpboot do we _have_ to use FIT? maybe we could
grab the fdt as a separate tftp transaction

we need to apply kernel patch 9401911f2d9f89035f7acebab16e72d43d1282fb
to avoid using ioremap on system RAM, which is not allowed on Arm


Tue Oct  3 14:15:38 BST 2023

Progress on Liminix ARM support. The device I'm starting with is
the Belkin RT3200 (also known as [Linksys E8450](https://openwrt.org/toh/linksys/e8450)) which seems to be a featureful piece of kit, and which I snagged for a
very good price on the Bay of E.

# Where are we right now?

* we can TFTP boot it to userland
* ethernet works

## What else needs doing?

* it has dual band wifi with many interesting features. I've built the

* we're only running in RAM, probably need to add some kernel config
  to support the flash
* initramfs support is not yet implemented
* the flash is NAND flash and it's quite large compared with the
  existing Liminix devices, so we're going to add UBIFS which will
  use it better than JFFS2 does
* all this work is on a branch and needs to be cleaned up a lot before
   I'm letting it into main

## What have we found?

There are some significant differences between this and the MIPS
devices (yes, other than an entirely different architecture), mostly
to do with "legacy boot" support or the lack thereof. For example:

* there aren't any options like `MIPS_RAW_APPENDED_DTB` to glom
  together a kernel and device tree (FDT), because the bootloader is
  expected to be able to provide a FDT following "standards".
  U-Boot will do this, provided that we use the newer "FIT" Uimage
  format which allows a kernel and DTB and initrd to be combined in
  the same container.  (Sadly we can't use FIT everywhere because a
  lot of MIPS devices use really old forks of U-Boot that don't
  understand it)

* for tftpboot, on MIPS we use the `memmap` kernel command line option
  to reserve some RAM for the root filesystem. On Arm there's no such
  option, so we have to add a
  [reserved-memory](https://github.com/torvalds/linux/blob/master/Documentation/devicetree/bindings/reserved-memory/reserved-memory.yaml)
  node in the device tree instead. Which means, given that we _only_
  want to do this when tftp booting (the memory is wasted
  otherwise), we have to rewrite the device tree in that
  scenario.

* then it turns out that phram doesn't (didn't) work anyway, because
  it calls ioremap() and [you can't use ioremap on system memory on ARM](https://lwn.net/Articles/409689/). In newer kernels this is
  fixed: there is a [conditional](https://github.com/torvalds/linux/blob/master/drivers/mtd/devices/phram.c#L127) here to use whichever of ioremap or
  memremap is appropriate for the memory passed to phram, but it looks
  non-trivial to backport so I've gone for a [much less sophisticated approach](https://gti.telent.net/dan/liminix/commit/f7cd9c2b6e6c99a228e066b09e3febcf71c63fa1#diff-d8e355f1b2dcde7378ebb40c92cdd2ce3125753c)

* we're using [DSA](https://www.kernel.org/doc/Documentation/networking/dsa/dsa.txt) instead of the OpenWrt [swconfig](https://openwrt.org/docs/techref/swconfig) program. There was actually surprisingly little work needed to adjust to this.


Other than that, it was mostly the usual process of "did the kernel
crash silently, or has it just been unable to open a console device?".
In this regard, *one neat trick*: even though U-Boot on this device
doesn't support pstore, we can use it anyway if we don't do compression.
Enable

```
        PSTORE = "y";
        PSTORE_RAM = "y";
        PSTORE_CONSOLE = "y";
        PSTORE_DEFLATE_COMPRESS = "n";
```

then boot with `panic=3 oops=panic`, then when it resets use the
U-Boot `md` command to see what happened:

```
MT7622> md 0x42ff0000
42ff0000: 43474244 00000000 00000ff4 3d3d3d3d    DBGC........====
42ff0010: 39342e31 36373031 500a442d 63696e61    1.491076-D.Panic
42ff0020: 50203123 31747261 3e363c0a 20505050    #1 Part1.<6>PPP
42ff0030: 20445342 706d6f43 73736572 206e6f69    BSD Compression
[....]
42ff0ca0: 20676e69 73756e75 6b206465 656e7265    ing unused kerne
42ff0cb0: 656d206c 79726f6d 3731203a 0a4b3832    l memory: 1728K.
42ff0cc0: 523e363c 2f206e75 74696e69 20736120    <6>Run /init as
42ff0cd0: 74696e69 6f727020 73736563 3e373c0a    init process.<7>
```
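
Reading the ASCII column gets old quickly. If you capture the serial
output, something like this Python sketch turns it back into a byte
stream. It assumes the words are 32-bit little-endian, which is what the
dump above suggests (`43474244` shows up as `DBGC`); `decode_md` is just a
name I made up.

```python
import re

# address, colon, then up to four 8-digit hex words; the ASCII column is ignored
MD_LINE = re.compile(r"^([0-9a-fA-F]{8}):((?: [0-9a-fA-F]{8}){1,4})")


def decode_md(dump: str) -> bytes:
    """Reassemble memory contents from U-Boot `md` output."""
    out = bytearray()
    for line in dump.splitlines():
        m = MD_LINE.match(line.strip())
        if not m:
            continue  # prompts, "[....]" elisions, blank lines
        for word in m.group(2).split():
            out += int(word, 16).to_bytes(4, "little")
    return bytes(out)


if __name__ == "__main__":
    sample = (
        "MT7622> md 0x42ff0000\n"
        "42ff0000: 43474244 00000000 00000ff4 3d3d3d3d    DBGC........====\n"
        "42ff0010: 39342e31 36373031 500a442d 63696e61    1.491076-D.Panic\n"
    )
    print(decode_md(sample).decode("ascii", errors="replace"))
```

It does not check that addresses are contiguous, so an elided section in
the middle of the capture is silently skipped.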

Wed Oct  4 21:08:44 BST 2023

By randomly including chunks of the openwrt config we have made it
find the mt7915e on the pcie bus. I just don't yet know which bit of
the openwrt config it was.

It doesn't actually work yet though. 5GHz wifi gets calibration data
from the flash, so it is not going to work unless (1) we reflash the
firmware partition, or (2) we find another way to provide it
calibration data.

https://forum.openwrt.org/t/belkin-rt3200-linksys-e8450-wifi-ax-discussion/94302/401

Sat Oct  7 22:56:40 BST 2023

We're almost ready to merge the rt3200 support (it's not finished
but it mostly won't break mips) except for the uimage module
which needs all that FIT stuff, and the tftpboot contortions
to amend the dtb for memmap

For legacy uimage

1) add commandline params to dtb
2) objcopy fdt into vmlinux.elf
3) strip to raw image and compress
4) mkimage


For FIT uimage

1) add commandline params to dtb
2) strip to raw image and compress
3) create its file
4) mkimage

Do we still want to handle the no-dtb case? what about
standards-compliant boot, where u-boot is providing the dtb? No option
to provide a commandline in that case, but maybe also no need to.

For tftpboot, am undecided. We could use the dtb rewriting thing
everywhere, in the interest of consistency.

Mon Oct  9 20:45:54 BST 2023

we bumped the kernel entry point to the 32MB mark, as

(1) when using jffs2 (big rootfs image) it was clobbering the dtb at
the end of the filesystem

(2) it should be 2MB aligned anyway and wasn't

However, this has given us the next problem:

OF: fdt: Reserved memory: failed to reserve memory for node 'secmon@43000000': base 0x00000000430B
OF: reserved mem: OVERLAP DETECTED!
phram-rootfs (0x0000000040400000--0x0000000053a31488) overlaps with ramoops@42ff0000 (0x000000004)
Zone ranges:
  DMA      [mem 0x0000000040000000-0x000000005fffffff]
  DMA32    empty
  Normal   empty
Movable zone start for each node

the end of that phram-rootfs region looks well sus.

[ turns out that decimal is not hex ]
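
For the record, the arithmetic, as a Python sketch. The addresses come
from the log above; that the intended size was the decimal number
13631488 (13MiB) is my reading of the end address, not something the
kernel printed.

```python
BASE = 0x40400000        # phram-rootfs start, from the log
LOGGED_END = 0x53A31488  # phram-rootfs end, from the log
RAMOOPS = 0x42FF0000     # start of the ramoops region it overlapped


def region_end(base: int, size: int) -> int:
    return base + size


# The size the kernel ended up with, written out as hex digits
size_digits = f"{LOGGED_END - BASE:x}"

intended = int(size_digits, 10)  # the same digits read as decimal
misread = int(size_digits, 16)   # the same digits read as hex

if __name__ == "__main__":
    print("size digits:", size_digits)
    print("as decimal: ", intended, "bytes, ends at", hex(region_end(BASE, intended)))
    print("as hex:     ", misread, "bytes, ends at", hex(region_end(BASE, misread)))
    print("decimal reading clear of ramoops:", region_end(BASE, intended) <= RAMOOPS)
```

Read as decimal the region stops well short of ramoops; read as hex it
runs straight through it.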

Tue Oct 10 21:37:31 BST 2023

UBI bleurgh ...

The DTB for this device, and/or the OpenWrt installer,  seems to
expect already that mtd4 is a UBI thing

mtdinfo -a says:
mtd4
Name:                           ubi
Type:                           nand
Eraseblock size:                131072 bytes, 128.0 KiB
Amount of eraseblocks:          1000 (131072000 bytes, 125.0 MiB)
Minimum input/output unit size: 2048 bytes
Sub-page size:                  2048 bytes
OOB size:                       64 bytes
Character device major/minor:   90:8
Bad blocks are allowed:         true
Device is writable:             true

<5>UBI: auto-attach mtd4
<5>ubi0: attaching mtd4
<5>ubi0: scanning is finished
<5>ubi0: attached mtd4 (name "ubi", size 125 MiB)
<5>ubi0: PEB size: 131072 bytes (128 KiB), LEB size: 126976 bytes
<5>ubi0: min./max. I/O unit sizes: 2048/2048, sub-page size 2048

we made a volume using "ubimkvol /dev/ubi0 -N liminix -S 825"

and from ubinfo -a we can see

0 ubootenv
1 ubootenv2
2 recovery
3 boot_backup
4 liminix
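
Sanity check on those numbers, as a Python sketch. The header sizes and
alignment rule are my understanding of the UBI on-flash layout (a 64 byte
EC header and a 64 byte VID header, each padded to the smallest write
unit), so treat that part as an assumption; the rest is from the mtdinfo
and ubi0 output above.

```python
UBI_EC_HDR_SIZE = 64   # bytes, erase-counter header (assumed)
UBI_VID_HDR_SIZE = 64  # bytes, volume-identifier header (assumed)

PEB_SIZE = 131072      # "Eraseblock size"
PEB_COUNT = 1000       # "Amount of eraseblocks"
MIN_IO = 2048          # "Minimum input/output unit size"
SUBPAGE = 2048         # "Sub-page size"
VOLUME_LEBS = 825      # from "ubimkvol ... -S 825"


def align_up(value: int, boundary: int) -> int:
    return (value + boundary - 1) // boundary * boundary


def leb_size(peb_size: int, min_io: int, subpage: int) -> int:
    """Usable bytes per eraseblock once both UBI headers are accounted for."""
    vid_hdr_offset = align_up(UBI_EC_HDR_SIZE, subpage)
    data_offset = align_up(vid_hdr_offset + UBI_VID_HDR_SIZE, min_io)
    return peb_size - data_offset


if __name__ == "__main__":
    leb = leb_size(PEB_SIZE, MIN_IO, SUBPAGE)
    print("LEB size:", leb)
    print("mtd4 size:", PEB_SIZE * PEB_COUNT)
    print("liminix volume:", VOLUME_LEBS * leb, "bytes")
```

The LEB size is the figure mkfs.ubifs will want when we build the image,
so it is worth having it derived rather than copied by hand.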

Now we could use "ubiupdatevol /dev/ubi0_4 /path/to/ubifs.img"
to put a ubifs on that volume, but obviously we'd have to boot the
device into Liminix somehow first.

Alternatively in the build environment we could use ubinize to
create the entire image that can be flashed to MTD, but

- this will overwrite erase counters.
- we have to give it a config file that describes all the volumes,
 and I'm guessing they need to match up with the existing ones
 otherwise we trash the uboot env


Wed Oct 11 17:37:09 BST 2023

We can write ubi volumes from u-boot.  Let's for the moment use
mkfs.ubifs and tftp those files to u-boot - we can figure out the
ubinize dance later

We need either

(a) to write an analogue of our jffs2 graft option for mkfs.ubifs, or
(b) to have a "cpio-like" mkfs.ubifs variant that reads filenames on stdin
  and writes only those, or
(c) to create a "staging" directory during build with all the store folders that need to go into the filesystem

although the least elegant, option (c) is the simplest and probably
not even slow, at least by comparison with unpacking the kernel source
tarball

we used
uboot> tftpboot 0x40400000 result/rootfs
uboot> ubi write 40400000 liminix $filesize

then can use ubifsmount ubi0:liminix ; ubifsls / to check that it wrote
something valid. To boot this:

setenv serverip 10.0.0.1
setenv ipaddr 10.0.0.8
setenv bootargs 'liminix console=ttyS0,115200 panic=10 oops=panic init=/bin/init loglevel=8 root=ubi0:liminix  rootfstype=ubifs fw_devlink=off'
tftpboot 0x4007ff28 result/uimage
bootm 0x4007ff28

The other thing we had to fix here is that activate wasn't being built
statically. Have to add -Xlinker -static to CFLAGS - I don't know if
this is a no-op on MIPS

Mon Oct 16 20:51:08 BST 2023

Here's a thing: the u-boot installed by openwrt on this device has a
ubifsload command, and it has a writable ubootenv. So instead of
having a separate partition for the kernel we could put the kernel in
the actual filesystem

I think we should do this by excluding flashimage and including some
other module (to be written) instead. ubimage or somesuch, perhaps.

So the image we wish to create is a ubifs with a kernel inside it in
/boot and we also need to change the u-boot env value of

boot_production=led $bootled_pwr on ; run ubi_read_production && bootm $loadaddr#$bootconf ; led $bootled_pwr off

so that it mounts the rootfs and finds /boot/uimage inside it

From uboot this is setenv and saveenv; from a running linux this is fw_setenv

Thu Oct 19 09:34:15 BST 2023

setenv bootargs 'liminix console=ttyS0,115200 panic=10 oops=panic init=/bin/init loglevel=8 root=ubi0:liminix  rootfstype=ubifs fw_devlink=off'


ubifsmount ubi0:liminix
ubifsload 4007ff28 boot/uimage
bootm 0x4007ff28


Thu Oct 19 23:11:17 BST 2023

Assuming you've done the openwrt installer to repartition the device,
what are the steps to install Liminix?

1) build rootfs, which incorporates kernel

2) from u-boot:

uboot> ubimkvol /dev/ubi0 -N liminix -S 825
uboot> tftpboot 0x40400000 result/rootfs
uboot> ubi write 40400000 liminix $filesize
uboot> setenv boot_production 'led $bootled_pwr on ; ubifsmount ubi0:liminix; ubifsload 4007ff28 boot/uimage; bootm 4007ff28'

What if we don't have a serial console? can we do all this from openwrt?

Fri Oct 27 23:21:08 BST 2023

setenv serverip 10.0.0.1
setenv ipaddr 10.0.0.8
setenv bootargs 'liminix earlyprintk earlycon=uart8250,mmio32,0xf1012000 ramoops.mem_address=0x8000000 ramoops.mem_size=0x40000 ramoops=max_reason=2 mem=128M earlycon=ttyS0 console=ttyS0,115200 panic=10 oops=panic init=/bin/init loglevel=8 root=ubi0:liminix  rootfstype=ubifs fw_devlink=off'
setenv bootargs 'liminix  ramoops.mem_address=0x8000000 ramoops.mem_size=0x40000 ramoops=max_reason=2 mem=128M console=ttyS0,115200 panic=10 oops=panic init=/bin/init loglevel=8 root=ubi0:liminix  rootfstype=ubifs fw_devlink=off'
tftpboot ${kernel_addr_r} result
bootm ${kernel_addr_r}


---
this is potentially worth checking out because we do have very slow
decompress speed

https://scm.linefinity.com/common/u-boot/commit/5818198e6a184963c6afc82178b23a64435ace6a

Commit 5bb2c550b1 ("arm: mvebu: Move internal registers in
arch_very_early_init() function") implemented code movement according to
(now incomplete) comments which resulted in semi-broken code.

The result is that I-cache is currently disabled for all Armada 38x boards
and maybe there are some other (unreported / undetected) issues.

[...] After this change lzmadec command with lzma image of 0x7000000 bytes is
doing decompression just 5 seconds. Before this change it was 30 seconds.


Mon Oct 30 21:03:55 GMT 2023

We have a kernel that boots on the Omnia, now we need to build a
rootfs. Given this device uses mmc for its primary storage, we should
use a block filesystem not a flash filesystem.


setenv serverip 10.0.0.1
setenv ipaddr 10.0.0.8
setenv bootargs 'liminix console=ttyS0,115200 panic=10 oops=panic init=/bin/init loglevel=8 root=/dev/mtdblock0 rootfstype=ext4fs fw_devlink=off mtdparts=phram0:22M(rootfs) phram.phram=phram0,0x1300000,23068672,65536 root=/dev/mtdblock0'
tftpboot 0x1000000 tftpboot/uimage ; tftpboot 0x1300000 tftpboot/dtb ; tftpboot 0x29b0000 tftpboot/dtb; tftpboot 0x29c0000 initrd.img


Sat Nov  4 12:22:37 GMT 2023

setenv serverip 10.0.0.1
setenv ipaddr 10.0.0.8
setenv bootargs 'liminix console=ttyS0,115200 panic=10 oops=panic init=/bin/init loglevel=8 root=/dev/mtdblock0 rootfstype=ext4 fw_devlink=off mtdparts=phram0:22M(rootfs) phram.phram=phram0,0x40300000,23068672,65536 root=/dev/mtdblock0'
tftpboot 0x40000800 tftpboot/uimage ; tftpboot 0x419b0800 tftpboot/dtb ; tftpboot 0x41a00000 initrd.img  ; tftpboot 0x40300000 tftpboot/rootfs
bootm 0x40000800  0x41a00000 0x419b0800

kernel  0x40000800 + 2af400 = "402afc00"
rootfs  0x40300000 + 15f9400 = "418f9400"
dtb     0x419b0800 + 4e00 = "419b5600"
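
The sums above can be checked mechanically. A minimal Python sketch,
using only the addresses and sizes from the table and the phram size
from bootargs (the helper names are made up for illustration), that
recomputes each end address and confirms no region runs into the next:

```python
# (name, start address, size in bytes), copied from the table above
regions = [
    ("kernel", 0x40000800, 0x2AF400),
    ("rootfs", 0x40300000, 0x15F9400),
    ("dtb",    0x419B0800, 0x4E00),
]

# phram window size as given in bootargs (22M)
PHRAM_SIZE = 23068672


def end_addresses(regions):
    """Map each region name to the first address past its end."""
    return {name: start + size for name, start, size in regions}


def find_overlaps(regions):
    """Return pairs of adjacent regions where the first overruns the second."""
    ordered = sorted(regions, key=lambda r: r[1])
    clashes = []
    for (n1, s1, z1), (n2, s2, _z2) in zip(ordered, ordered[1:]):
        if s1 + z1 > s2:
            clashes.append((n1, n2))
    return clashes


ends = end_addresses(regions)
for name, end in ends.items():
    print(f"{name:7s} ends at 0x{end:08x}")
print("overlaps:", find_overlaps(regions))
print("rootfs fits in phram:", 0x15F9400 <= PHRAM_SIZE)
```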

Sun Nov  5 00:01:56 GMT 2023

Open questions

1) using -device loader for qemu phram, how do we choose appropriate
start address for all architectures? we could try to unify this
with the tftpboot approach, but that would mean providing the dtb
ourselves somehow which seems silly when qemu already does it

2) flash erase block size for tftpboot phram, it needs to match the hardware
 - we don't use it at all for squashfs
 - for jffs2 it needs to match if tftpboot is to use same rootfs image
   as flash
 - we don't have any way to do a tftpboot ubifs
 - ext4 doesn't care

3) the rootfstype needs thought.

- for all but squashfs it implies an initramfs
- jffs2 can be flashed naively
- ubifs needs an installer to not clobber erase counts
- ext4 (for omnia) needs a block device and can't be flashed along
  with the kernel/initramfs because it's a separate disk. or maybe
  the omnia can load kernel from mmc?

phram plus mtd_block is ok for ext4 for qemu, so that's one thing
fewer to deal with

4) for tftpboot with a separate initramfs, is there any way to
make address selection easier?

5) ubifs needs a different set of flash parameters (PEB, LEB etc)

Tue Nov  7 23:19:10 GMT 2023

We can flash the turris omnia from a USB stick
https://docs.turris.cz/hw/omnia/rescue-modes/

it looks for omnia-medkit-latest.tar.gz and (I'm guessing) just
unpacks it onto the emmc.

Perhaps we need per-device installation instructions in the docs.

* rename defaultOutput to installerOutput
* attach some docs to the various options for installerOutput
* add a link from the device's rendered manual section to the
 relevant installer output doc

do we expect that outputs will build on each other? e.g.
turris omnia output is basically a tarball (other devices
might want this too) but with docs describing how to reset the
device and hold the button down until the led flashes four times
(other devices probably won't want this). maybe the model here
is that the turris output is a directory with a symlink to the
tarball and an informative README containing the instructions.
Although we also want the instructions in the manual where people
can read them before building anything.

Fri Nov 10 21:27:50 GMT 2023

Realising now that outputs and installers aren't the same thing.

e.g. flashimage can be installed from u-boot or from kexecboot

perhaps we distinguish between the

"installation image":
- firmware.bin
- tarball
- ubifs image
- kernel + rootfs

and "installer"
- kexecboot script
- u-boot script to flash squashfs image

Sun Nov 12 17:17:30 GMT 2023

What TODO?

- "does the kernel live on the filesystem" depends on the bootloader
  not the filesystem
   - could we implement this with a module that adds to config.filesystem ?
     it would depend on whether the bootloader can follow symlinks to
     files not in /boot (probably fine unless crossing filesystems)
   - the other question is how much futzing around in u-boot can/do we do
     to tell u-boot how to boot? for grown-up u-boot it's not a
     problem as we can saveenv but are there broken u-boots that would
     prevent this?

- kexecboot is unloved and documented in the wrong place. do we have a
  test for it even?
   - it won't work on aarch64 because it needs memmap=
   - hardcodes memory size, which we should probably work out dynamically

- how to put device name into the device docs
  maybe devices

- make config.boot.commandLine a single string
- finish omnia
- for installation on turris omnia we need tarball not ext4 image
  (but keep the ext4 image anyway for tftpboot and possibly kexecboot)
[done] - now we have lim.parseInt should we use it consistently?
- usefulness tiers for devices ("stable", "experimental", "wip")
- params for ubi(fs) are a mess
- create an l2tp configuration
- iperf and tuning
- wlan country code


Fri Nov 17 17:30:39 GMT 2023

kexec is fraught. I spent some time trying (unsuccessfully) to get a
kexecboot test running, but it doesn't work in qemu for the reason
that the kernel I built for qemu has SMP support but does not have
kexec_nonboot_cpu_func() - which is the needed function to stop
non-boot CPUs before jumping into the new kernel.  That this code is
all MIPS-specific (I have to assume that other architectures have
entirely different ways to stop non-boot CPUs?) is a bit worrisome: how
many other ways is kexec hardware-dependent?

We have two scenarios where we may want to use it:

1) the "installer": e.g. for UBI platforms, we want to plonk
a new ubifs on the device without clobbering the erase counters, which
means either doing it from U-Boot - needs serial connection -
or doing it from a Linux of some kind that is not running on the
filesystem we're toasting.

2) reinstalling after the initial install - this is a big deal for
squashfs where there is no other way to change data, and an only slightly
smaller deal for jffs2, where there isn't much room to change much
data.

Maybe instead of kexec we could do this by stopping services and then
pivot_root into a ramfs. We would, I assume, need to stop any processes
that have open files on the root fs, but we would need to have network
interfaces running. So we need a subset of services that run in recovery:
can make this a bundle

* mount a ramfs
* copy the closure of the bundle into the ramfs
* stop all processes (including init?)

sending pid 1 a signal FOO will cause it to run .s6-svscan/SIGFOO

Fri Nov 17 23:53:43 GMT 2023

So we need to extend .s6-svscan/finish to

if test -e /maintenance/bin/init; then
  cd /maintenance
  mount --bind /maintenance/ /
  chroot .
  exec /bin/init -D maintenance
fi


foreground {
  if { test -e /maintenance/bin/init }
    cd /maintenance
    foreground { mount --move /maintenance/ / }
    foreground { chroot . }
    redirfd -r 0 /dev/console
    redirfd -w 1 /dev/console
    fdmove -c 2 1
    emptyenv /bin/init -D maintenance
}
${s6-linux-init}/bin/s6-linux-init-hpr -fr


https://openwrt.org/docs/techref/sysupgrade

s6-svscanctl -t /run/service

Sun Nov 19 10:23:17 GMT 2023

# cat `type -p reboot`
#!/nix/store/0v3q2lnh7bwg0ldk24lzmsdnmidmdvm6-execline-mips-unknown-linux-musl-2.9.3.0-bin/bin/execlineb -S0
/nix/store/j41b85ccx0rmf7lm5g13zqb7fs68l8y2-s6-linux-init-mips-unknown-linux-musl-1.1.1.0-bin/bin/s6-linux-init-hpr -r \$@


s6-linux-init-hpr calls hpr_send("", 0)
then
hpr_shutdown(what, &tain_zero, 0))
which sends "Shpr"[what] to  SCANDIRFULL "/" SHUTDOWND_SERVICEDIR "/" SHUTDOWND_FIFO

which I assume is s6-linux-init-shutdownd.c

it calls prepare_shutdown on socket read, which sets deadline, grace_time.
later (when?) it calls

  run_stage3(basedir) ; # we can see this by adding a message to rc.shutdown
                        # this causes s6-rc services to be downed gently
  s6-rc -v2 -bDa change

  prepare_stage4(basedir, what)

    creates a file STAGE4_FILE  with the contents:
     s6-linux-init-umountall
     scripts/rc.shutdown.final
     s6-linux-init-hpr -f -r

  unsupervise_tree() ;

     goes through /run/service/*/supervise/control fifos
     except shutdownd and logger, sending
     "d" to each

     then does
     s6_svc_write(SCANDIRFULL "/" S6_SVSCAN_CTLDIR "/control", "an", 2)
     (this is a rescan not a terminate)

#define SCANDIRFULL S6_LINUX_INIT_TMPFS "/" S6_LINUX_INIT_SCANDIR
      (works out to be /run/service "/" ".s6-svscan")

ls
  kill(-1, SIGTERM) ;

s6-rc -v2 -bDa change
cd /run/service
for i in s6-linux-init-runleveld s6rc-oneshot-runner s6rc-fdholder eth* getty ; do  s6-svc -d /run/service/$i; done


s6-rc -v2 -bDa change
cd /run/service
for i in  s6-linux-init-runleveld s6rc-oneshot-runner s6rc-fdholder eth*  ; do  s6-svc -d /run/service/$i; done
s6-svscanctl -an /run/service

Wed Nov 22 22:01:02 GMT 2023

- define a subset of services that run in maintenance mode.
- write a command that copies the closure of this bundle into
  /run/maintenance
- create enough non-store filesystem (proc dev etc) to make it run

Thu Nov 23 00:09:44 GMT 2023

I was as surprised as anybody that this seems to work, at least
insofar as it has started a busybox sh process. there is a serious
deficit of symlinks to busybox, so almost no shell scripts work. and I
think we need an rc.init

It would be worth tidying up the main s6 run-image quite a lot before
we go further with this

We'd like to be able to reuse the s6 pseudofile structure (/etc/s6-rc
and /etc/s6-linux-init) but we can't make it a derivation because it's
pseudofiles (with funny permissions) not real files. Maybe we can
invoke the module as a function?

Fri Nov 24 23:29:48 GMT 2023

Turris TODO

- see if network works (eth[012], which is which?)
- wireless drivers
  [DONE] ath9k and ath10k, it's like old times
  https://docs.turris.cz/hw/omnia/omnia/#turris-omnia-wi-fi-6
  (note: other variant of the device has a MT7915AN, should we add
   support for that as well?)

- [DONE] feed the watchdog
   it looks like compiling watchdog support is sufficient to stop the
   thing from rebooting after three minutes, there is no need to actually
   feed it from userspace.

- "does the kernel live on the filesystem" depends on the bootloader
  not the filesystem
   - could we implement this with a module that adds to config.filesystem ?
     it would depend on whether the bootloader can follow symlinks to
     files not in /boot (probably fine unless crossing filesystems)
   - the other question is how much futzing around in u-boot can/do we do
     to tell u-boot how to boot? for grown-up u-boot it's not a
     problem as we can saveenv but are there broken u-boots that would
     prevent this?
   - perhaps we need different boot "recipes" - e.g. some device
     might want boot.scr and another something different

- create installable tarball and test
- gpio thingy for SFP switching
- iperf

- document the watchdog
- remove kexecboot? it's unloved and documented in the wrong place. do we have a
  test for it even?
   - it won't work on aarch64 because it needs memmap=
   - hardcodes memory size, which we should probably work out dynamically

- how to put device name into the device docs

- [WONT] make config.boot.commandLine a single string
  this sounds sensible but it just makes it harder to put useful comments
  against command line fragments so that we know why they're there

- usefulness tiers for devices ("stable", "experimental", "wip")
- params for ubi(fs) are a mess
- iperf and tuning
- wlan country code
- create an l2tp configuration

Sun Nov 26 15:37:07 GMT 2023

hatching a plan ... we could do "predictable" network interfaces like this:

 . add a devpath attr to network/link.nix
 . get the kernel-issued name from "/sys${devpath}/net"
 . use ip link set ${oldname} name ${newname}
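
A rough Python sketch of the lookup-then-rename step, assuming the
devpath has exactly one netdev under its net/ directory. The function
names and the example devpath are hypothetical, and the command is only
built, not run:

```python
import os
import tempfile


def kernel_ifname(devpath, sysfs_root="/sys"):
    """Return the kernel-issued interface name found under <sysfs_root><devpath>/net."""
    netdir = os.path.join(sysfs_root, devpath.lstrip("/"), "net")
    names = sorted(os.listdir(netdir))
    if len(names) != 1:
        raise RuntimeError(f"expected exactly one netdev in {netdir}, found {names}")
    return names[0]


def rename_command(devpath, newname, sysfs_root="/sys"):
    """Build (but do not run) the ip command that renames the interface."""
    return ["ip", "link", "set", kernel_ifname(devpath, sysfs_root), "name", newname]


# Demo against a fake sysfs tree so it can run anywhere.
# The devpath below is made up for illustration.
with tempfile.TemporaryDirectory() as fake_sys:
    devpath = "/devices/platform/soc/example-ethernet"
    os.makedirs(os.path.join(fake_sys, devpath.lstrip("/"), "net", "eth1"))
    print(" ".join(rename_command(devpath, "wan", fake_sys)))
```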

if we had the full iproute2 thing we could keep the old name as well:
# ip link property add dev wan altname eth1

maybe we could do this with lua/netlink? no support in there currently
for RTM_NEWLINKPROP though

Maybe we'll skip doing the altname. The attraction of it is that it
means the existing name isn't removed, so there's no possibility of a
race.

The kernel will allocate eth0 when asked for eth%d and there is no
eth0. This might be the case where eth0 previously existed but it
just got renamed to lan

Sun Nov 26 21:20:23 GMT 2023

The wrinkle here is ifwait: using netlink we can't wait for an
interface by devpath but have to do it by name - which is a problem if
the interface is not yet present, because there won't be a devpath
in which to look up the name until it is.

So we need a new flow

 - wait for devpath to exist
 - get the ifindex (which shouldn't change, even if the name does)
 - churn rtnetlink messages for that index

We don't want to poll the sysfs file, but we can check it whenever
we get a netlink message

Sun Nov 26 22:33:16 GMT 2023

There is no way to refer to the hardware device for a bridge interface
by sysfs path because it has none. This is probably true of other
virtual devices as well.

ls: cannot access '/sys/class/net/vbridge0/device': No such file or directory

Also, there is no way to refer to the _netdevice_ of a hardware
interface without also knowing its default name, which doesn't help us
if enumeration changes

ls /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/net/
enp1s0

So we should only be specifying devices by devpath if they're
hardware devices discovered by the kernel, not synthetic devices (that
we pick the name of anyway).

So maybe we don't need to rewrite ifwait, we just do it after renaming
the device

Wed Nov 29 21:28:37 GMT 2023

How do we name outputs?

filesystem image
with-boot filesystem image (e.g. ubifs for belkin)
tarball
with-boot tarball (e.g. omnia)
flashable combined image of kernel + filesystem (e.g. gl-mt*)
kernel + filesystem image + dtb + tftpboot glue
kernel + filesystem image + qemu script

we could add initramfs as a separate thing for tftpboot and qemu
(and FIT images) but it would mean not sharing a kernel with
the outputs that require embedded initramfs

we can enable with-boot variants of outputs by adding a
boot.loader config option. if we go that route, can we
use config options to drive the whole output thingy?
We could have a config option that just changes "defaultOutput"
but is that useful?

Maybe the question is: we can choose a different output at build time
rather than editing configuration - how often is this useful? I think
tftpboot vs installable is about the only case.

adding a /boot to a filesystem differs from making a combined
flashable image because it is a config change and not a composition
of two other already-built outputs.

we could have done tftpboot that way as well, but we chose to unpack
and repack the fdt so we don't have to build two kernels - and also so
that we can have both outputs from the same configuration without
editing any files.

for tftpboot we don't want to make the filesystem embed the kernel
if we need a separate kernel for booting (guessing we can't usually make
u-boot loopback mount a downloaded filesystem image). so that points to
not making it a config option,  _or_ to making the inclusion logic
(hardware wants a kernel in filesystem) && !(output == tftpboot)
which itself means the output somehow needs to be injected into the
config

nix-build  -I liminix-config=./examples/hello-from-mt300.nix  --arg device "import ./devices/my-device" --arg output=tftpboot

let's see if we can not do that?

repacking a ubifs to add /boot is awkward and unpleasant

Thu Nov 30 14:33:08 GMT 2023

~~We need a new boot-in-rootfs output which calls rootfs with
(config.filesystem  // { /boot ... }) when the device
boot type is extlinux~~

~~Do we need to put it in systemConfiguration as well? Yes,
otherwise liminix-rebuild won't install it~~

We may have a problem here actually: if /boot is only set up after
reboot, by adding a link while running the initramfs, how did the
bootloader  find the kernel to boot in the first place? So
we need /boot to exist and to point to the new kernel before rebooting
into it, so we need to create it as a real directory along with /nix/store
when making the filesystem,
instead of relying on activate which will be too late.

maybe we could extract the root directory structure creation
as a separate output from rootfs, then there is a single place
to put "and also add /boot"

we will need to update pkgs/min-copy-closure/liminix-rebuild.sh
to add /boot

we could make the contents of /boot a derivation and then
/boot itself is just a symlink to it. we would need to ensure that
the derivation is part of the system closure, though

Sat Dec  2 15:33:07 GMT 2023

- make rootfs the directory structure

Sun Dec  3 23:31:35 GMT 2023

Spent too much of the weekend first fighting run-liminix-vm.sh and
then rewriting it in Fennel, but we are now at the point that we can
boot u-boot in qemu. However, it maps the rootfs into high memory
where phram can find it, instead of putting it into a flash that
_qemu_ can see as flash, so u-boot is not able to boot the kernel or
at least not in a similar-to-hardware fashion. Once we've added that,
we need to write a test for boots-a-kernel-in-the-filesystem

Mon Dec  4 19:46:58 GMT 2023

We wanted a test that we are creating an image that u-boot can boot
using extlinux. Turns out that u-boot only has scripts to do this in
the case that the storage device has a partition table. Which is
representative of the Omnia mmc, but maybe not going to work for
jffs2/ubifs

(For ubifs it might be OK, there's some concept of partitions

ubifs_boot=if ubi part ${bootubipart} ${bootubioff} && ubifsmount ubi0:${bootubivol}; then devtype=ubi; devnum=ubi0; bootfstype=ubifs; distro_bootpart=${bootubivol}; run scan_dev_for_boot; ubifsumount; fi

)

So what do we need? a disk image with a partition table and an ext4fs
image in the only partition, and that partition to be bootable. Then
run-liminix-vm gets a --disk-image option which causes it to use
U-Boot instead of direct load (can we wrap it with something that sets
up the paths so it can find u-boot and qemu?)

ok, we're going to define outputs.diskimage which is like
outputs.flashimage but it has a partition table. Then we can make a
qemu configuration with an ext4 filesystem and defaultOutput="diskimage"
and config.boot.loader.extlinux.enable = true

sfdisk default behaviour for GPT partitioned disk is to start at sector 2048
(sectors are 512 bytes)

Device     Start   End Sectors Size Type
disk.bin1   2048 14335   12288   6M Linux filesystem
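
For reference, the sector numbers in that listing convert to byte
offsets like this. A small Python sketch, assuming 512-byte sectors as
stated above (the function name is made up):

```python
SECTOR_SIZE = 512  # bytes per sector, as noted above


def partition_bytes(start_sector, end_sector):
    """Return (byte offset, byte length) for an inclusive sector range."""
    sectors = end_sector - start_sector + 1
    return start_sector * SECTOR_SIZE, sectors * SECTOR_SIZE


offset, length = partition_bytes(2048, 14335)
print(f"partition starts at byte {offset} ({offset // 2**20}M)")
print(f"partition is {length} bytes ({length // 2**20}M)")
```

The byte offset is what a tool needs if it wants to write the ext4
image into the partition directly instead of going through a loop
device.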

Tue Dec  5 23:54:22 GMT 2023

Need mbr not gpt. At least, it takes up less space and it doesn't
inveigle us into EFI boot.

This is close to working except that it doesn't want to boot a uimage

Enter choice: 1
1:      Liminix
Retrieving file: /boot/initramfs
Retrieving file: /boot/kernel.gz
append: console=ttyAMA0 panic=10 oops=panic init=/bin/init loglevel=8 root=/dev/mtdblock0 rootfstype=ext4 fw_devlink=off
zimage: Bad magic!

Thu Dec  7 19:33:02 GMT 2023

virtio devices don't have standard major/minor so we can't create
device nodes for them at build time. Either we mount devtmpfs in the
initramfs (do we then have to move it to /target?) or we parse the
/sys/block/vda/vda1/dev node to get 253:1 or whatever
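
A sketch of the second option in Python, for illustration only (on
the device this would more likely be a few lines of shell in the
initramfs). The function names are hypothetical; the sysfs dev file
holds "major:minor" followed by a newline:

```python
def parse_dev(text):
    """Parse the contents of a sysfs 'dev' file into (major, minor)."""
    major, minor = text.strip().split(":")
    return int(major), int(minor)


def mknod_command(node_path, dev_text):
    """Build (but do not run) a mknod command for a block device node."""
    major, minor = parse_dev(dev_text)
    return ["mknod", node_path, "b", str(major), str(minor)]


# On a real system dev_text would come from reading /sys/block/vda/vda1/dev
print(" ".join(mknod_command("/dev/vda1", "253:1\n")))
```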

Fri Dec  8 16:36:07 GMT 2023

We'd like to remove the ugly special handling of qemu dtb. Here are
some thoughts

1) qemu provides its guest the correct DTB for the configured hardware.
This is a desirable thing and we wish more platforms did it, so would
like not to replace it with a static dtb if we can avoid that

2) so we need some kind of config option that says "platform provides the
device tree" (or alternatively, "we need a static dtb, platform doesn't
provide it").

config.boot.platformProvidesDeviceTree



Monday

so where are we? we've removed the need for every hardware device to
be able to build flashimage, because not every hw device (looking at you,
belkin/turris) works like that.

for the belkin we can have imports = [ modules/outputs/ubimage ]
and extlinux enabled

for the nwa from Raito we'd like to have imports = [ modules/outputs/ubimage ]
but the bootloader is mtd partition-based (or ubi volume? check?) -
so ubimage needs to know how to do that

perhaps we need an output for "smash together a kernel and a
filesystem image that does not also contain a kernel, and don't
put a partition table on the front"

diskimage {
  partitionType = "mtd" ;  # or "mbr" or maybe "gpt"
  partitions = [ o.uimage o.rootfs ];
}




for the turris we need to check but proceeding on the assumption
it wants a tarball with extlinux enabled

https://docs.turris.cz/geek/schnapps/schnapps/#export-and-import
https://wiki.turris.cz/en/howto/omnia_booting_from_external_storage

if we adopt this as our installation format then we are not
reformatting the flash and will keep the btrfs that the device
was shipped with.

https://forum.turris.cz/t/update-to-5-1-x-by-medkit/13986/12
suggests that we could install a custom medkit from the
vendor OS

=> btrsubvol mmc 0:1
ID 257 parent 5 name /@
ID 259 parent 5 name /@factory

there don't seem to be any other btr commands in u-boot

Tue Dec 12 14:38:53 GMT 2023

from the source code, to get to the various omnia recovery modes

uboot> setenv omnia_reset 3 # or 1..n
uboot> setenv boot_targets rescue
uboot> boot

// reset boot_targets to default value.

Tue Dec 12 22:44:34 GMT 2023

The hold-down-reset-until-n-leds-flash support depends quite heavily
on the post-boot Linux environment, in that it appears to be passing
omniarescue=3 rescue_mode=3 to the kernel command line -> pid 1
cmdline

On the other hand, it is described as being able to boot from usb
stick if there's a boot.scr on the usb stick, so maybe we just do
that. The installation process could then be "boot usb, dd the
disk image to mmc, reboot, remove usb, realise we got the wrong root=".
Hmm.

* Could we edit extlinux.conf for first boot? But bear in mind it's a
link to a store file.

* Could we have extlinux.conf point at mmc0 and somehow override it for the
usb stick boot?

* Could preinit try multiple root mounts until it gets one that works?

* maybe we could detect omniarescue on kernel command line and switch to
usb root?

* maybe outputs.usbstick could generate a customised rootfs image?
it might be unworkable to

(narrator: it boots from mmc0 first and usb stick second, so that's not
particularly useful)




Device 0: Vendor: SanDisk Rev: 1.00 Prod: Cruzer Blade
            Type: Removable Hard Disk
            Capacity: 7632.0 MB = 7.4 GB (15630336 x 512)
... is now current device
Scanning usb 0:1...
No EFI system partition
fdt_find_or_add_subnode: chosen: FDT_ERR_BADSTRUCTURE
ERROR: /chosen node create failed
 - must RESET the board to recover.

Thu Dec 14 15:32:39 GMT 2023

from the omnia rescue image, we have

## Loading kernel from FIT Image at 01700000 ...

     Load Address: 0x00800000
     Entry Point:  0x00800000


int lzmaBuffToBuffDecompress(unsigned char *outStream, SizeT *uncompressedSize,
                             const unsigned char *inStream, SizeT length)

    err = image_decomp(os.comp, load, os.image_start, os.type,
                           load_buf, image_buf, image_len,
                           CONFIG_SYS_BOOTM_LEN, &load_end);

configs/mvebu_db_armada8k_defconfig:CONFIG_SYS_BOOTM_LEN=0x800000
   default 0x4000000 if PPC || ARM64
   default 0x1000000 if X86 || ARCH_MX6 || ARCH_MX7
   default 0x800000

Fri Dec 15 19:21:53 GMT 2023

Let's put some English words on the page to explain the above
gibberish. Since I upgraded the U-Boot on my Turris Omnia, it has
stopped being able to tftpboot.

Uncompressing Kernel Image
lzma compressed: uncompress error 7
Must RESET board to recover

"uncompress error 7" means there is not enough space in the output
buffer, and the output buffer is set by CONFIG_SYS_BOOTM_LEN which is
8192k, smaller than the uncompressed 12104200 of our kernel.
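
The arithmetic, as a small Python sketch using only the numbers
already quoted (0x800000 from the Kconfig default, 12104200 for the
kernel); the function name is made up:

```python
CONFIG_SYS_BOOTM_LEN = 0x800000   # default from the Kconfig excerpt above
KERNEL_UNCOMPRESSED = 12104200    # bytes, from the note above


def bootm_shortfall(kernel_size, bootm_len=CONFIG_SYS_BOOTM_LEN):
    """Bytes by which the uncompressed kernel exceeds the bootm buffer."""
    return max(0, kernel_size - bootm_len)


shortfall = bootm_shortfall(KERNEL_UNCOMPRESSED)
print(f"buffer: {CONFIG_SYS_BOOTM_LEN // 1024}k")
print(f"short by {shortfall} bytes ({shortfall / 2**20:.2f}M)")
```

So the kernel would need to lose roughly three and a half megabytes to
fit, which is the figure to keep in mind for the options below.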

So how can we fix?

* one possibility is to do what the turris rescue mode does: build an
uncompressed uimage and then lzma the result.  u-boot can uncompress
the received file using lzmadec command. we'd want to do this without
breaking tftpboot on every other device that might not have an lzmadec
command

* is there bloat in the kernel we could trim? probably not 4MB of it

* we could build a custom u-boot with a bigger buffer. this _might_
not be a completely stupid idea as it's only the people prepared to
open the box that would be doing tftp workflows anyway, so provided
it's possible to replace u-boot without bricking it

* we could try building zimage instead of uimage and use bootz to
start it

Sat Dec 16 11:15:56 GMT 2023

there is another use case for weird tftpboot derivation, which is the
device Raito has ported to where you need to wave a magic chicken at
u-boot on each command line

Sat Dec 16 23:32:11 GMT 2023

Turns out that even when using an uncompressed uimage, u-boot runs the
code to check the decompressed size, so that doesn't help at all.  But
booting a zImage works fine. I am committing a first pass of
modules/outputs/tftpbootlz.nix which does this using a lot of
copy-paste and (ironically) no lzma stuff at all.

Sun Dec 17 16:25:30 GMT 2023

it's started failing to mount root on arm32 because it's not recognising
the reserved-memory and something is trashing the phram filesystem

        reserved-memory {

                phram-rootfs {
                        reg = <0x1400000 0x1900000>;
                        compatible = "phram";
                };
        };

Sun Dec 17 17:13:54 GMT 2023

* We need to write the fdt phram differently on 64 bit vs 32 bit
  (address-cells and size-cells should be 2 or 1)

* this might be why it wasn't working on mips (can we test this
somehow in qemu or do we need to plug a device in?)

qemu user-mode networking has a builtin tftp server.

so we need a test that builds the tftpboot target for each qemu arch,
and then does run-liminix-vm with the --lan and --u-boot options, then
drives it with expect


* maybe tftpboot[lz] could be reintegrated with the regular one somehow
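
To illustrate the first point above: the same phram region has to be
written as a different number of 32-bit cells depending on
address-cells and size-cells. A minimal Python sketch (function names
are hypothetical, and the 2/2 case is an assumption about what a 64 bit
parent node would declare):

```python
def split_cells(value, ncells):
    """Split an integer into ncells big-endian 32-bit cells."""
    if value < 0 or value >= 1 << (32 * ncells):
        raise ValueError(f"{value:#x} does not fit in {ncells} cell(s)")
    return [(value >> (32 * i)) & 0xFFFFFFFF for i in reversed(range(ncells))]


def reg_property(address, size, address_cells, size_cells):
    """Render a device tree reg property for the given cell counts."""
    cells = split_cells(address, address_cells) + split_cells(size, size_cells)
    return "reg = <" + " ".join(f"0x{c:x}" for c in cells) + ">;"


# 32 bit: one cell each, as in the reserved-memory node shown earlier
print(reg_property(0x1400000, 0x1900000, 1, 1))
# 64 bit: two cells each
print(reg_property(0x1400000, 0x1900000, 2, 2))
```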

Fri Dec 22 15:11:35 GMT 2023

We have a working test for tftpboot, on all platforms, which took a
while.

* tftpbootlz needs to be updated with what we learned, or merged
back into it

* omnia install

- build a stripped-down installer image which can be put on a
 usb stick
- from openwrt on the device, use fw_setenv to set boot order
- boot into the installer image,
 reformat the emmc as per requirements
- PREFIX=/mnt  liminix-rebuild
- profit

Could we liminix-rebuild into /some/prefix/nix/store? This would
actually be useful both for this boot-from-usb scenario and for
levitate. Looks like min-copy-closure doesn't need anything much on
the destination system except a working sshd and maybe an empty
/nix/store and /persist. Potentially we could even do it without
chroot (which would save running a second sshd), just add a prefix on
the paths

what's the fallback? we're not touching the turris rescue
system (which is in nor flash) so we can expect all the
knightrider modes to continue to work, and to be able to
boot back into that and restore the vendor os? I think so

Fri Dec 22 16:56:40 GMT 2023

If we're going to use fw_setenv to change boot order,
we could equally well boot from tftp and not need the
usb stick

fw_setenv boot_targets "tftp mmc0 nvme0 scsi0 usb0 pxe dhcp"
setenv bootcmd_tftp "echo TFTP BOOT"

how can we get a tftp boot into few enough characters to reasonably
put it in an environment variable?
- use dhcp?
- embed bootargs into the fdt

Fri Dec 22 21:10:53 GMT 2023

* dtb needs size and offset of uncompressed root filesystem to add
reserved-memory and cmdline params

* setting these in the dtb will change the size of the dtb

* am assuming that we don't want the kernel to relocate into
 ram that clashes with the root fs

* should we care about phys mem fragmentation?

conclusion:

hardware.loadAddress
		  <uncompressed/relocated kernel>
tftp.loadAddress
		  <space for uncompressed root>
		  uimage or zimage
		  dtb
		  compressed root


[ should we rename hardware.loadAddress to something that expresses
more clearly it's the _kernel_ load address? ]

[ another thing we need to do is stop building two kernels
 because the uimage and zimage derivations are different ]

Sat Dec 23 18:07:43 GMT 2023

Addendum: for a zimage we need the compressed kernel to be at the
highest address, otherwise it prints "Starting" and then hangs
indefinitely.

I believe this to be because the kernel decompressor sets up a stack
directly after the compressed payload, so will trash the fdt if it was
also there. The bug didn't exhibit on Turris Omnia with the same
layout, but maybe that was just luck.

Sat Dec 23 18:11:04 GMT 2023

Here is scope of work for Turris:

(I) we need to build a suitable tftpboot image for
recovery/install.

- disk partitioning tools and mkfs stuff
- kernel with all the filesystems
- dhcp client for connecting to wired network

(II) we need instructions for building the real system
and using min-copy-closure to copy and install the system
configuration of the real one into /mnt

(III) probably try the same recovery image as a USB stick

(IV) I've lost track of what we're doing with /boot, does that work?

(V) gpio thingy for SFP switching

(VI) iperf, performance testing

(VII) put device name and usefulness tiers ("stable", "experimental", "wip") in the docs

(VIII) params for ubi(fs) are a mess

(IX) wlan country code

Tue Dec 26 16:23:37 GMT 2023

I seem to have lost a chunk of notes here. Have added
systemConfiguration/bin/install which does the stuff to copy the right
files into /bin and wherever. There is currently no test for it though

We could further simplify liminix-rebuild by adding --reboot as a flag
to install

Tue Dec 26 21:38:43 GMT 2023

To be any use, the test needs to be end-to-end - as in, rather than
just checking some files are copied, test that the machine rebooted
successfully

Fri Dec 29 18:36:16 GMT 2023

Our test for liminix-rebuild uses qemu block device and ext4 instead
of phram because -device loader doesn't seem to survive a reboot.
And it needs some free space in the ext4 partition inside the
mbr image so that it can install new stuff. However, the
filesystem is sized to be near-full.

If the mbrimage output is to be much use, probably there should be
some way of telling it how big the disk is. Maybe it should use
hardware.flash.size?

UBI also does a bad job of integrating into the hardware.flash hierarchy
(but ubi is also more complicated as the ubi volumes are "nested" inside
an MTD partition)

To move forwards with this test I think I will make it not depend on
mbrimage for now, but we have to come back to this. Maybe importing
the mbrimage module provides new hardware.disk = { partitions, size etc}
config options.

Sun Dec 31 23:52:04 GMT 2023

https://developer.ridgerun.com/wiki/index.php/Setting_up_fw_printenv_to_modify_u-boot_environment_variables#Preparing_the_fw_env.config_file

can we extract the fw_env config data somehow to produce an appropriate
file for the device?

the device config needs to specify partition name and offset at minimum,
possibly also size.

we can create a service that writes the config based on those values. but
if we are to be using fw_setenv from the shell, there is no service
which depends on that service. whatever defines the service also needs
to add it to system.services so that the recovery system can specify it
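
A sketch of what such a service would have to compute, in Python for
illustration: look the partition up by name in /proc/mtd, then emit a
line in the usual fw_env.config column order (device, offset,
environment size, sector size). The partition names and sizes in the
sample are made up, and the function names are hypothetical:

```python
import re

# Sample /proc/mtd contents; names and sizes here are invented
SAMPLE_PROC_MTD = '''dev:    size   erasesize  name
mtd0: 000f0000 00010000 "u-boot"
mtd1: 00010000 00010000 "u-boot-env"
'''

MTD_LINE = re.compile(r'^(mtd\d+):\s+([0-9a-fA-F]+)\s+([0-9a-fA-F]+)\s+"(.*)"$')


def find_mtd(proc_mtd_text, partition_name):
    """Return (device path, erase size) for the named MTD partition."""
    for line in proc_mtd_text.splitlines():
        match = MTD_LINE.match(line)
        if match and match.group(4) == partition_name:
            return "/dev/" + match.group(1), int(match.group(3), 16)
    raise KeyError(partition_name)


def fw_env_config_line(proc_mtd_text, partition_name, offset, env_size):
    """Render one fw_env.config line: device, offset, env size, sector size."""
    device, erase_size = find_mtd(proc_mtd_text, partition_name)
    return f"{device} 0x{offset:x} 0x{env_size:x} 0x{erase_size:x}"


line = fw_env_config_line(SAMPLE_PROC_MTD, "u-boot-env", 0x0, 0x10000)
print(line)
```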

Sat Jan  6 12:30:27 GMT 2024

How do we min-copy-closure to the device when we don't have anything
hooked to the LAN port? It's rather easy to break the WAN  connection
when it involves going out to the internet and back

* Don't want to plug it into the actual lan because it's doing dhcp service
 and that is going to confuse

* the machine we're copying from is loaclhost

* we could do some kind of port forwarding thing? maybe a port forward on
  run-border-vm qemu user networking ...

* static route on loaclhost?

  sudo ip netns add test-lan
  sudo ip link set dev enp1s0 netns test-lan

  sudo ip link add veth-test-lan type veth peer veth1 netns test-lan
  sudo ip netns exec test-lan ip link add name br0 type bridge
  sudo ip netns exec test-lan ip link set veth1 master br0
  sudo ip netns exec test-lan ip link set enp1s0 master br0
  sudo ip netns exec test-lan /nix/store/dh66q9k402pwpmmgc983xwmwb3vvvjbr-busybox-1.36.1/bin/busybox udhcpc -i br0

then we could add a route to 10.8.0.1/32 with dev veth-test-lan ?

Sat Jan  6 20:52:45 GMT 2024

This is all beside the point right now because the _recovery_ system
does not run all this stuff - it just has a dhcp client on the lan
interface. We could plug it straight into the switch.


As we already just plugged it into enp1s0 on loaclhost, could we
do something to put it on the lan from there? add it to vbridge0?

Sun Jan  7 15:30:57 GMT 2024

Turns out we should have used a working ethernet cable.

Sun Jan  7 15:31:14 GMT 2024

OK, so

# on device
mount /dev/mmcblk0p1 /mnt
 [ take a snapshot if needed ]
 [ clear out the turrisos files ]
ls /mnt/@

# on build

$ nix-build -I liminix-config=./examples/rotuer.nix --arg device "import ./devices/turris-omnia" -A outputs.systemConfiguration
$ nix-shell --run "min-copy-closure -r /mnt/@  root@recovery.lan   result "

# on device

$ mkdir /mnt/@/persist
$ /mnt/@/nix/store/swf3vn9bzx198c0cwp6naq0glqa9192n-make-stuff-armv7l-unknown-linux-musleabihf/bin/install /mnt/@/

this fails because it tries to copy from the unprefixed nix
store. Also probably it should mkdir $prefix/persist. Also it needs to
create $prefix/boot: it's too late to do that with `activate`
because u-boot will need it to exist in order to load the initramfs
that runs activate

Thu Jan 11 23:36:47 GMT 2024

squashfs rootfsType doesn't rebuild when the kernel config is changed

Mon Jan 22 19:04:45 GMT 2024

setenv serverip 10.0.0.1
setenv ipaddr 10.0.0.8
compraddr=0x01000000
tftpboot ${compraddr} recovery.img.lzma
setexpr writeaddr ${filesize} + $compraddr
lzmadec ${compraddr} $writeaddr
usb start
usb dev 0
wdt dev watchdog@20300
wdt stop
usb write ${writeaddr} 0 ${filesize}


Thu Jan 25 11:55:36 GMT 2024

openwrt:
CONFIG_BROADCOM_PHY=m
CONFIG_FIXED_PHY=y
CONFIG_GENERIC_PHY=y
CONFIG_IP17XX_PHY=m    ?
CONFIG_MARVELL_PHY=y
CONFIG_MVSW61XX_PHY=y  ?
CONFIG_RTL8366RB_PHY=m ?
CONFIG_RTL8366S_PHY=m ?
CONFIG_RTL8367B_PHY=m ?
CONFIG_SWPHY=y
CONFIG_USB_PHY=y

CONFIG_FIXED_PHY=y
CONFIG_GENERIC_PHY=y
CONFIG_MARVELL_PHY=y
CONFIG_PHY_MVEBU_A3700_COMPHY=y
CONFIG_PHY_MVEBU_A38X_COMPHY=y
CONFIG_SWPHY=y
#

Sat Jan 27 18:14:13 GMT 2024

To make the recovery system (and tftpboot generally) more useful, it
would be good to resize the root fs on boot. Need to do this before
anything that writes to it

Mon Jan 29 21:50:59 GMT 2024

something is corrupted in the uncompressed rootfs


$ head -c $(printf "%d" 0x2be0000) rootfs | sha1sum
142571fe0436c18191727d1d4c2fd32163c1f2e1  -
=> sha1sum 0x1000000 2be0000
sha1 for 01000000 ... 03bdffff ==> 142571fe0436c18191727d1d4c2fd32163c1f2e1

but!

$  head -c $(printf "%d" 0x2bf0000) rootfs | sha1sum
7aa004ba87c6772bade491fbade164e2dfe100f9  -
=> sha1sum 0x1000000 2bf0000
sha1 for 01000000 ... 03beffff ==> 1a0923a94784d0c0b86006c5e6fff1649770dad3

something is trashing something in the range 03be0000 - 03beffff
or else it's not being decompressed properly
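
The prefix-hash comparison above narrows it down by hand, 64KiB at a
time. If both copies were available on the build machine (an
assumption: that would need some way of dumping the RAM contents back
out), the same search could be sketched in Python like this. The
function name and block size are illustrative.

```python
import hashlib


def first_bad_block(expected, actual, block_size=0x10000):
    """Return the offset of the first block whose SHA-1 differs, or None."""
    length = max(len(expected), len(actual))
    for offset in range(0, length, block_size):
        a = hashlib.sha1(expected[offset:offset + block_size]).digest()
        b = hashlib.sha1(actual[offset:offset + block_size]).digest()
        if a != b:
            return offset
    return None
```

The returned offset is relative to the start of the image, so it would
need the load address added to compare with the u-boot ranges.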

pxefile_addr_r=0x1900000
ramdisk_addr_r=0x2200000
    scriptaddr=0x1800000
fdt_addr_r=0x2000000
fdtcontroladdr=7fb19b30
fdtfile=armada-385-turris-omnia.dtb
     fdt_high=0x10000000
  initrd_high=0x10000000
kernel_addr_r=0x1000000
              0x1700000;
              0x10000000

Sun Feb  4 11:55:00 GMT 2024

restructuredtext headings:

https://devguide.python.org/documentation/markup/#sections


####### chapter (one per filename)
*******
=======
-------

Mon Feb  5 09:57:52 GMT 2024

Before calling the Omnia "done" I'd like to get it to the point that
I can actually use it as a CPE. This means

- writing something down about how we handle static addresses
  - hosts that need static ipv6 can configure it themselves as ::n
    where n is a small number. this won't clash with slaac
  - the `hosts` param to dnsmasq can specify static ipv4

- dealing with port forwards and allowed incoming in the firewall

- would be quite cool to run sniproxy instead of forwarding to
  loaclhost (extra credit)
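
For the static ipv6 item in the list above, the ::n convention is just
"the nth address in the prefix". A tiny Python sketch using the stdlib
ipaddress module; the prefix in the test is a documentation prefix, not
ours, and the function name is made up.

```python
import ipaddress


def static_address(prefix, n):
    """Return the ::n host address inside the given IPv6 prefix."""
    return ipaddress.IPv6Network(prefix)[n]
```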

Sat Feb 10 18:23:54 GMT 2024

ARGH KERNEL

You can't define CONFIG_NETFILTER=y in a monolithic kernel and expect
later to separately build some modules that use it, because there are
a bunch of symbols that only get defined if certain other CONFIG
options are set at the time that the monolithic kernel is built.

https://github.com/torvalds/linux/blob/master/net/netfilter/core.c#L689

Another example is
https://github.com/torvalds/linux/blob/master/include/linux/netdevice.h#L160
- if you decide after building the kernel that you're going to build
some wireless modules, you can't do that without rebuilding the kernel
so that it knows to expect them

The moral of the story seems to be: if you have a compiled Linux
kernel source tree and you change some symbol from "is not set" to m
and then run make modules, you cannot in general expect that newly
compiled module to work.

AP advertised VHT without HT, disabling HT/VHT/HE

TODO

- [done] support kernel version as parameter to builder pkgs/kernel/default.nix
- [done] extract the change in how module loading works from omnia device config,
  and fix the other thing that uses it
- [axed] wlan module to take 'backported' as a parameter
  half of the omnia conditionalConfig can go into the module
- [done] upgrade omnia to kernel v6
- figure out what mdns we need for local hostname resolution
  (maybe bridging lan/wlan)?
- [DONE] slow wifi because "AP advertised VHT without HT, disabling HT/VHT/HE"
- [DONE] add local domain to secrets
- run sniproxy instead of forwarding
- [test] forward some port to loaclhost 22 for inbound ipv4 ssh


Mon Feb 12 21:50:35 GMT 2024

# find  /run/service-state/dhcp6c.wan.link.pppoe/address/
/run/service-state/dhcp6c.wan.link.pppoe/address/
/run/service-state/dhcp6c.wan.link.pppoe/address/2001-8b0-1111-1111-0-ffff-51bb-4cf2_LFoo015bSsM
/run/service-state/dhcp6c.wan.link.pppoe/address/2001-8b0-1111-1111-0-ffff-51bb-4cf2_LFoo015bSsM/valid
/run/service-state/dhcp6c.wan.link.pppoe/address/2001-8b0-1111-1111-0-ffff-51bb-4cf2_LFoo015bSsM/preferred
/run/service-state/dhcp6c.wan.link.pppoe/address/2001-8b0-1111-1111-0-ffff-51bb-4cf2_LFoo015bSsM/len
/run/service-state/dhcp6c.wan.link.pppoe/address/2001-8b0-1111-1111-0-ffff-51bb-4cf2_LFoo015bSsM/address
#

valid 7199 preferred 3599

Tue Feb 13 19:44:57 GMT 2024

Before we put this back live, would be good to

[done] 1) move the leases file into /persist

I think we'll do /persist/service/<name>/ and change ssh to use the same
scheme.

we could put mkpersist() in serviceFns which would check for /persist
and return a directory in /persist/service/ or /run/service-state

(will something bad happen if we use /run/service-state? it will also
expose the thingy as an output, but whether it's accessible that way
will depend on whether there's a writable fs or not, which is unexpected)

: rename service-state to  /run/services/outputs
: on boot
:  if /persist
:    create /persist/services/state and symlink /run/services/state to it
:  else create /run/services/state


[done] 2) maybe change the local domain back to .lan?  setting up
  systemd-networkd with search domains is an awful faff

[done] 3) work out what to do with incoming ssh from wan

- For noetbook and thinkpad we have a vpn anyway so can expect to
  reach loaclhost directly using ipv6

- stop ssh from ever trying to get to our ipv4 address.
  - we could get rid of A record for loaclhost.telent.net but
    there are a bunch of CNAMES pointing at it for web servers.
  - we could reject incoming connections to tcp4 port 22 in firewall
    and then there is a clear signal to Dont Do That Then

- for emergency use, dnat ipv4 2200 and 2201 to rotuer and loaclhost

Tue Feb 13 22:31:03 GMT 2024

* the reason we can't reboot is that there is a service to add each
lan device to the bridge which does ifwait $dev running, which doesn't
return until there's something plugged in. So s6-rc hangs indefinitely
until the lan switch is fully populated. This is definitely a "next
milestone" thing.

* another example of "thing that depends on other thing but which it
  is actually OK if neither of them happen" might be "mount a
  filesystem if there is a usb mass storage device attached"

* I don't know if failover also fits into the model we don't quite
  have. LTE route depends on pppoe not being healthy

we can have services (or bundles) that aren't part of the default target,
and plumb them into events of some kind (netlink?) to bring them up/down?

we can use s6-rc instanced services:
https://skarnet.org/software/s6/instances.html

"s6-instance-create and s6-instance-delete are relatively expensive operations, because they have to recursively copy or delete directories and use the synchronization mechanism with the instance supervisor, compared to s6-instance-control which only has to send commands to already existing supervisors. If you are going to turn instances on and off on a regular basis, it is more efficient to keep the instance existing and control it with s6-instance-control than it is to repeatedly create and delete it. "

Probably we need something that reads netlink messages and converts
them to a format that we can use to control services. Is there a
benefit to using services here and not just running commands? it means
the system state change we desire will stay changed.

TODO items not to lose track of

- speed testing (iperf)
- make gl-ar750 tftpboot build again
- finish belkin
- install sniproxy
- is there something simple we can do to make it reboot again?
- turn rotuer,extneder examples into "profiles" that don't embed
  hardware specifics

Thu Feb 15 11:50:56 GMT 2024

1) to make tftpboot work with old bootm implementations we need

- compressed root
- uncompressed root
- kernel with dtb
 dtb needs to know where uncompressed rootfs is and how big

2) if the image is a zImage (arm32) or an Image (arm64) we have to stick
with the three-arg bootz, and the dtb has to be lower in ram than the kernel

Fri Feb 16 15:43:32 GMT 2024

DHCP6c refresh is still wrong. We get updates for an address that
hasn't changed prefix or length, when the expiry times have changed,
and we can't action that by remove;add because remove will wipe out
any routes through the interface but add won't put them back

We can use "change" for both adds and changes, but we need to know that
a change is not a delete

The "identity" of an address is the address itself: kernel won't
let you add the same address with two different prefixes.

Keeping it simple, we could call "change" on every address in the
new-addresses list and "del" on every address in old-addresses
that is no longer in new-addresses

If the upstream has changed length, "ip addr change" is ignored,
so it needs to be in deleted as well as added/changed
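
A Python sketch of that rule, to check the logic hangs together. It
assumes old and new are maps from address to prefix length and returns
argument lists rather than running anything. The function name is
hypothetical and lifetimes are left out for brevity.

```python
def address_commands(old, new, dev):
    """old/new map address -> prefix length. Return ip(8) argument lists."""
    commands = []
    for addr, length in old.items():
        # "del" if the address has gone away, or if its length changed
        # (because "change" would be ignored in that case)
        if addr not in new or new[addr] != length:
            commands.append(
                ["ip", "address", "del", f"{addr}/{length}", "dev", dev])
    for addr, length in new.items():
        # "change" covers both new addresses and refreshed ones
        commands.append(
            ["ip", "address", "change", f"{addr}/{length}", "dev", dev])
    return commands
```

An address that only had its expiry times refreshed gets a single
"change" and no "del", so routes through the interface survive.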

Fri Feb 16 19:37:08 GMT 2024

[    3.839775] cfg80211: module verification failed: signature and/or required key missing - tainting kernel
[    4.156952] ath10k_pci 0000:00:00.0: enabling device (0000 -> 0002)
[    4.165756] ath10k_pci 0000:00:00.0: pci irq legacy oper_irq_mode 1 irq_mode 0 reset_mode 0
[    4.399285] ath10k_pci 0000:00:00.0: qca9887 hw1.0 target 0x4100016d chip_id 0x004000ff sub 0000:0000
[    4.408906] ath10k_pci 0000:00:00.0: kconfig debug 1 debugfs 0 tracing 0 dfs 0 testmode 0
[    4.420096] ath10k_pci 0000:00:00.0: firmware ver 10.2.4-1.0-00047 api 5 features no-p2p,ignore-otp,skip-clock-init,mfp,allows-mesh-bcast crc32 62f7565f
[    4.467443] ath10k_pci 0000:00:00.0: board_file api 1 bmi_id N/A crc32 546cca0d
[    5.472096] ath10k_pci 0000:00:00.0: htt-ver 2.1 wmi-op 5 htt-op 2 cal file max-sta 128 raw 0 hwcrypto
[    5.585796] ath: EEPROM regdomain: 0x0
[    5.589712] ath: EEPROM indicates default country code should be used
[    5.596364] ath: doing EEPROM country->regdmn map search
[    5.601875] ath: country maps to regdmn code: 0x3a
[    5.606831] ath: Country alpha2 being used: US
[    5.611425] ath: Regpair used: 0x3a
[    6.742365] ath10k_pci 0000:00:00.0: pdev param 0 not supported by firmware
[    6.903389] random: hostapd: uninitialized urandom read (1027 bytes read)
[    8.169901] ath10k_pci 0000:00:00.0: pdev param 0 not supported by firmware
[   14.450193] ath10k_pci 0000:00:00.0: pdev param 0 not supported by firmware
[   15.518682] random: hostapd: uninitialized urandom read (1027 bytes read)
[   16.762697] ath10k_pci 0000:00:00.0: pdev param 0 not supported by firmware
[   23.030622] ath10k_pci 0000:00:00.0: pdev param 0 not supported by firmware
[


Tue Feb 27 23:16:27 GMT 2024

We made it a full week with rotuer running internet chez nous and no
need for an intervention, so I am happy to call it "production". There are
still things that need fixing but they're mostly within scope for
a services refresh

I have embarked on "profiles" by creating a wap.nix

I think we could have a service module for resolvconf

It would be good to build a wap.nix example for the belkin and we
could start looking at ubifs

I've lost a chunk of notes about using events to drive desired service
state. There is probably only going to be one udev listener, so
what if we have udev as a config key thusly

udev.rules = [
{
  match = {
    SUBSYSTEM = "rpmsg";
    ATTR.name = "DATA5_CNTL";
  };

  service = longrun {
    name = "lte-modem";
    run = "blah blah blah";
  };
}

# this one would be provided by the bridge module instead of
# adding bridge member services to the default target

{
  match = {
    SUBSYSTEM = "net";
    ID_PATH = "pci-0000:04:00.0";
    ATTR.operstate = "up";
  };

  service = oneshot {
    up = "ip link set dev $dev master $(output ${primary} ifname)";
    down = "ip link set dev $(output ${member} ifname) nomaster";
  };
}
];

This works for udev/sysfs, but we want a similar architecture(sic) for
user-generated target state so we could have services that run on e.g.
"is the ppp0 service healthy" or not. Probably there isn't a top-level
config key for each service though

services.wan = svc.pppoe.build   { .... };
services.lte = watcher.build {
  watching = services.wan;
  match =  {
    # an expression matching the outputs of the service
    # to be watched
    health = "failing";
  };
  service = oneshot {
    run = "start_lte_blah";
  };
}

thing is, we could use this syntax also for sysfs watches, but not vice versa

... but it's not quite the same because here we're doing static matches
on contents of files, whereas the udev one is a query expression on the
sysfs database. we might need that flexibility to implement "mount the
backup drive no matter _which_ damn sda_n_ device it appears as". I don't
know if there's the same need for service outputs - postulate the
existence of a collection of services which are all similar enough that
some other service can watch them all and do $something when one of
them changes state. Or a single service with very complicated outputs.
For example, something could watch the snmp database and update service
status depending on what it finds. Or something something mqtt...

we find that the "match" needs to be interpreted differently according
to the thing being watched. perhaps the service being watched needs to
provide a "watch me" interface somehow which accepts match criteria and
outputs a true/false. Something else then needs to


services.addmember = services.udev.watch {
  match = {
    SUBSYSTEM = "net";
    ID_PATH = "pci-0000:04:00.0";
    ATTR.operstate = "up";
  };

  service = oneshot {
    up = "ip link set dev $dev master $(output ${primary} ifname)";
    down = "ip link set dev $(output ${member} ifname) nomaster";
  };
}

Sat Mar  2 15:37:29 GMT 2024

Simply put, what I think it boils down to is that we want a service
which acts as an actuator or control switch for another service,
and will start/stop that controlled service according to some
criteria.

services.addmember = svc.network.ifwatch.build {
  interface = config.hardware.networkInterfaces.lan1;

  # this should be part of the definition not the params
  service = oneshot {
    name = "member-${bridge}-${interface}";
    up = "ip link set dev $dev master $(output ${primary} ifname)";
    down = "ip link set dev $(output ${member} ifname) nomaster";
  };
}

we could start by writing this. we need to adapt ifwait

Sun Mar  3 17:09:21 GMT 2024

this is annoyingly hard to test. the tests we'd like to write are

1) when it gets events that don't match the requirement, nothing happens
2) when it gets an event that should start the service, the
  service starts
3) when stop should stop
4) when start and already started, nothing happens
5) when stop and already stopped, nothing happens

what do we do if service fails to start? s6-rc will eventually reset it
to "down", I think: do we need to take action?
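
To make the five cases concrete, here is a Python sketch of the
decision logic with the service start/stop injected as callables, so
it can be tested without s6-rc. Every name in it is made up. It
assumes a start always succeeds, which is exactly the open question
above.

```python
class Trigger:
    """Starts/stops a controlled service as matching events arrive."""

    def __init__(self, wanted, start, stop):
        self.wanted = wanted    # event -> True, False, or None (irrelevant)
        self.start = start      # callable that brings the service up
        self.stop = stop        # callable that brings the service down
        self.running = False    # what we believe the service state is

    def handle(self, event):
        verdict = self.wanted(event)
        if verdict is None:
            return              # case 1: event doesn't concern us
        if verdict and not self.running:
            self.start()        # case 2 (case 4 falls through untouched)
            self.running = True
        elif not verdict and self.running:
            self.stop()         # case 3 (case 5 falls through untouched)
            self.running = False
```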

Mon Mar  4 20:46:55 GMT 2024

# relevant but not correct for this model: https://www.forked.net/forums/viewtopic.php?f=13&t=3490

# power on port 5
snmpset -v 1 -c private  192.168.5.14 .1.3.6.1.4.1.318.1.1.4.4.2.1.3.5 integer 1

# power off port 5
snmpset -v 1 -c private  192.168.5.14 .1.3.6.1.4.1.318.1.1.4.4.2.1.3.5 integer 2

# toggle off/on port 5
snmpset -v 1 -c private  192.168.5.14 .1.3.6.1.4.1.318.1.1.4.4.2.1.3.5 integer 3

Wed Mar  6 18:24:29 GMT 2024

What happens when we attempt to start the service but it fails?  We
assume the start was successful so we won't try and restart it again
next time we get an event that should cause it to start.

Thu Mar  7 11:48:26 GMT 2024

what next?

- fennel script needs to know where s6-rc is
- some nix syntax
- update bridge module members.nix to use the new thing

I can't find a ci derivation that uses the bridge.

Mon Mar 11 20:31:45 GMT 2024

Create a qemu config where wan and lan devices are bridged into a
single bridge

start qemu paused
Use qemu monitor commands to no-carrier the network devices
set_link virtio-net-pci.1 off
set_link virtio-net-pci.0 off

Boot the system

See if both devices are bridge members

See if reboot is possible

Use qemu monitor commands to enable the network devices
set_link virtio-net-pci.1 on
set_link virtio-net-pci.0 on

See if both devices are bridge members

disable again, check if back to starting position


Wed Mar 13 00:00:16 GMT 2024

aside: "trigger" is the least bad word I've thought of so far for
these services that stop/start other services

telent: yeah, in general 'ps afuxww' (or s6-ps -H :)) is the way to solve this, look for hung s6-rc processes and in particular their s6-svlisten1 children, where the command line will show what service is still waiting for readiness

Wed Mar 20 19:34:36 GMT 2024

Because I forgot how to rebuild rotuer, I think it is time to improve
support for out-of-tree configurations. So I've made
modules/profiles/gateway.nix and now I can copy rotuer.nix to
telent-nixos-config.

Probably I should make nix-build work on the top-level derivation
and install liminix-rebuild as a binary?

would be good if an out-of-tree config could specify the device
it was targeting?

Fri Mar 22 20:49:54 GMT 2024

Ideally liminix-rebuild could accept a configuration file that
specifies a liminix-config file, a target hostname (maybe plus ssh
port, credentials etc) and the device name. Not going to work on that
just now but it does mean we can punt on specifying the device inside the
liminix-config which is unreasonably circular.

Maybe we'll just chuck a makefile in telent-nixos-config

Fri Mar 22 22:14:32 GMT 2024

For the service failover milestone we said

a. A configuration demonstrating a service which is restarted when it crashes
b. A failover config where service B runs iff service A is unavailable
c. A config showing different pppd behaviour when interface is flakey (retry) vs ppp password is wrong (report error, wait for resolution)

Sun Mar 24 23:41:27 GMT 2024

TODO

1) make liminix-rebuild bounce only affected services instead of
  full reboot (what does it do about triggered services?)
2) sniproxy

3) see if arhcive still works. usb disk hotplug would be a good candidate for
switching to triggers

Mon Mar 25 19:35:47 GMT 2024

to make the liminix-rebuild thing restart only affected services, it needs to
know when the new service is not like the old one. By default it does not
restart a service with a changed up/down/run script unless the name has
also changed, so we need to figure out how to generate a "conversion"
file with the services that are different

pkgs/s6-rc-database/default.nix creates $out/compiled, we could add
$out/hashes to this

the other thing making this fun is that we will need to run `activate`
(which is usually done in preinit) otherwise the new configuration's
fhs directories won't exist.

so the plan would be

in liminix-rebuild, when reboot was not chosen,

- run activate
- compare  /run/s6-rc/compiled/hashes (old services) with
  /etc/s6-rc/compiled/hashes (new services)

- whenever both files have the same column 1 and different
column 2, add that name to restart list

(need to turn restarts.fnl into a lua script)

s6-rc-update /etc/s6-rc/compiled/hashes restarts
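
The comparison step, sketched in Python rather than Lua. This assumes
the hashes file is one service per line with whitespace between the
name (column 1) and the hash (column 2); that layout is a guess, and
the function names are hypothetical.

```python
def read_hashes(text):
    """Parse 'name hash' lines into a dict of name -> hash."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            table[fields[0]] = fields[1]
    return table


def services_to_restart(old_text, new_text):
    """Names present in both files whose hash has changed."""
    old, new = read_hashes(old_text), read_hashes(new_text)
    return sorted(name for name in old.keys() & new.keys()
                  if old[name] != new[name])
```

Services that only exist on one side are left out on purpose: added
and removed services are s6-rc-update's business anyway.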

Tue Mar 26 23:18:53 GMT 2024

activate overwrites /etc/s6-rc/compiled, which is a problem because
s6-rc-update expects to find the old compiled database here so that
it can know what to update

Maybe config.filesystem should specify /etc/s6-rc/compiled.new
and something in early boot could symlink /etc/s6-rc/compiled to it

Sat Mar 30 18:41:14 GMT 2024

soft restart doesn't restart services that are invoked by trigger,
because it has to do -p -u default so that it prunes services that
were in the old config but not the new one. Ideally we need somehow
to notify the trigger that it should respawn its service. Maybe
we could add triggers to the force restart list, if there's a way
to detect which they are?  don't want to do it by adding files in
the service state directory if there may be oneshot triggers. Can
there be oneshot triggers?

The hashes file is built when we build the service database, so we
could easily(?) add something in there to mark services that
need poking whenever there's a restart. It's not perfect because the
triggered services will be bounced unnecessarily, but remember that
the alternative is a reboot ...

Mon Apr  1 00:18:50 BST 2024

i) I don't know if digressing into remote log shipping is a tangent or
an important part of making services work well.

ii) Should there be a single "machine state" value for all of the
trigger services to reference, or is it better that each trigger
service has its own private state, or (third option) one state
per "state source"? We previously handwaved that a state source
is a service

services.addmember = services.udev.watch {
  match = {
    SUBSYSTEM = "net";
    ID_PATH = "pci-0000:04:00.0";
    ATTR.operstate = "up";
  };

  service = oneshot {
    up = "ip link set dev $dev master $(output ${primary} ifname)";
    down = "ip link set dev $(output ${member} ifname) nomaster";
  };
}


Tue Apr  2 19:55:25 BST 2024

We could do a test script for udev usb disk mounting, which uses the
qemu monitor to add/remove a disk.


./result/run.sh --flag -device --flag usb-ehci,id=xhci --flag -drive  --flag if=none,id=usbstick,format=raw,file=./stick.img

(qemu) device_add  usb-storage,bus=xhci.0,drive=usbstick

Fri Apr  5 17:11:46 BST 2024

1) write a fennel thing that reads from the udev rebroadcast socket
2) and can check sysfs for state
3) set up mdevd in liminix


Sat Apr  6 13:23:02 BST 2024

I wonder if we could replace preinit with an execline script? One for
the TODO stack


Sun Apr  7 14:03:29 BST 2024

1) we want to know what messages are sent from mdevd under various circumstances
 - actually, right now the only relevant circumstances are updown and inout

2) we might get a wider variety of messages from real hardware?

3) if we log the raw messages, pref. with timestamps, then we can
write tests for the parsing


therefore: write a program that opens the netlink socket and logs
all data received

----

what's the minimum we need here?  we need the inout test to open a
uevent socket and use uevents to update some state that says whether the
backup drive is plugged in

rather awkwardly, uevents don't have filesystem labels. so we also need
to run blkid to find the label of each partition, and ideally we do this
while the partition is present, not each time we get an event for it.

We have DEVNAME, DEVTYPE, SUBSYSTEM to indicate that a filesystem of interest
may be present, we should use that as a trigger to scan any known



add@/devices/pci0000:00/0000:00:13.0/usb1/1-1/1-1:1.0/host0/target0:0:0/0:0:0:0/block/sda/sda1
ACTION=add
DEVPATH=/devices/pci0000:00/0000:00:13.0/usb1/1-1/1-1:1.0/host0/target0:0:0/0:0:0:0/block/sda/sda1
SUBSYSTEM=block
MAJOR=8
MINOR=1
DEVNAME=sda1
DEVTYPE=partition
DISKSEQ=2
PARTN=1
SEQNUM=1528
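
For the parsing tests, a Python sketch of splitting one raw message
into its parts. It assumes the kernel wire format where the
"action@path" header and each KEY=VALUE field are NUL-terminated (the
dump above shows them one per line). The function name is made up.

```python
def parse_uevent(message):
    """Split a raw kernel uevent into (action, path, attributes)."""
    fields = [f for f in message.split(b"\0") if f]
    header = fields[0].decode()
    action, _, path = header.partition("@")
    attributes = {}
    for field in fields[1:]:
        key, _, value = field.decode().partition("=")
        attributes[key] = value
    return action, path, attributes
```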

Some disks on loaclhost and noetbook have PARTNAME field - I assume
this is because they're GPT disks. Would it actually be better to use this
field than grovelling for filesystem label?

Tue Apr  9 21:07:50 BST 2024

Having waited for the appropriately labelled disk to appear, we then
also have to communicate its path to the service that mounts it

- create a symlink
- or use an instanced service

Creating a symlink will be fine if we can pass the symlink name as
a param to fswait

Wed Apr 10 20:53:48 BST 2024

We think that fswait will evolve into a more general
waiting-for-uevents tool. Maybe we could provide the matchers on the
command line:

waituevent -l /dev/volumes/backup-disk -s mount-srv devtype=partition partname=backup-disk
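
A sketch of how those key=value matchers might be interpreted, in
Python for brevity. Upper-casing the key so that devtype matches the
DEVTYPE uevent field is an assumption, as are the function names.

```python
def parse_terms(args):
    """Turn ['devtype=partition', ...] into {'DEVTYPE': 'partition', ...}."""
    terms = {}
    for arg in args:
        key, sep, value = arg.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {arg!r}")
        terms[key.upper()] = value
    return terms


def matches(terms, attributes):
    """True if every term is present in the uevent attributes."""
    return all(attributes.get(k) == v for k, v in terms.items())
```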

Thu Apr 11 23:09:43 BST 2024

commit d3a2e3a4cb80b631df2ab79d463c2c4d1adef37b
commit 4a58cf9335116ce673fcf08f70f3bca921a4c9ad
commit afca6d4b63dd39062f02827b3c29e16904770216

Sun Apr 14 19:50:27 BST 2024

how to get this on to main:
 - make uevent-watcher package (it's fswait renamed)
 - make mount service use it
 - module for mdevd
 - add nellie (generalise for other netlink uses w/params pid/family/groups)

Mon Apr 15 19:59:43 BST 2024

plan:

introduce uevent-watcher command, update test to use it

make mount service use it




Tue Apr 16 18:59:25 BST 2024

Another idea for maybe-not-now: tftp local/peer addresses could be
provided as top-level params (e.g. to nix-build).

Wednesday

Here's irony: this doesn't work with arhcive when the disk is already
plugged at boot time because mdevd-coldplug has already run by the
time uevent-watch has started. Perhaps it turns out after all that we
do need to be looking at state not events? (Not that we'd ever really
believed otherwise but it hadn't been apparent so far)


options

- we could delay mdevd-coldplug to later in boot process. Don't know how:
  it would have to depend on all trigger services.  (Actually this
  is kind-of possible as they're all marked isTrigger)

- we could make uevent-watch look in sysfs for matches before it opens
  netlink socket. This would be an in-process recursive walk of sysfs
  reading all the uevent files, which may or may not be an improvement
  on having multiple mdevd-coldplug processes do a recursive walk
  of sysfs writing all the uevent files so that they trigger events
  for uevent-watch to pick up.

- we could construct some kind of queryable sysfs database so that
  all the watchers could use the same source of state, and we could
  make mdevd-coldplug depend on its presence.

Favouring option 3 as cleanest, and actually it doesn't need to read sysfs
if it gets all the coldplug events.


for each event
 parse into path and attributes
 paths[event.path] += event.attributes # only if add/change
 do something with indexes to make queries cheaper

a first guess for indices would be to index everything:

index[attribute.name][attribute.value] +=  event

and then throw away the indices that are useless: let's say, the
ones with more than 10% of events (we can tune this).

To query, look at query field names and first get the corresponding
index that (1) exists; (2) has the smallest number of values, then
scan through that looking for paths that match on the other fields

We don't need to be ludicrously fast because there is probably some
human event here that triggers this (disk added, network device
unplugged). We don't want to use all the RAM in the world, though.
Maybe 10% is too much.


We can probably TDD the hell out of this.

How should we provide the query interface? Needs to be some kind of
IPC, like a socket.

----

Could we do something quick in the meantime so I can make arhcive work
again? maybe add a second mdevd-coldplug oneshot that depends on our
mount service


Wed Apr 17 18:57:49 BST 2024

I hatched a plan (and forgot to save this file) to build a service
that subscribes to uevents and retains state so that other services
can know about things that happened before they started. I'm wondering
if it's really needed though, because there could be one process to
read the socket and start/stop *all* the udev triggered services.  Not
sure how we'd describe this in nix though: how do all the other
services

How we would do a uevent database service (sysfsq):

for each event e from socket

if e.action in (add, change)
  path[e.path] = e.attributes

if e.action == 'remove'
  path.remove e.path

(update-indices e)


(fn update-indices [event]
  for each k, v in event.attributes
    index.k.v += event)

we also want to not maintain indexes when there are so many values in
the index entry to make searching it worthless.

to retrieve, look at each criterion that has an index and choose the
index with fewest elements in the value. scan that index for the other
criteria

there are 813 uevent files in sysfs on arhcive, is this all overkill?
maybe we could simplify using a hardcoded stopword list - e.g. don't
have indices for MAJOR, MINOR
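
Pulling the pieces above together as a Python sketch (the real thing
would presumably be Fennel). It uses the hardcoded stopword list in
place of measuring index usefulness, and replaces a path's attributes
on add/change rather than merging them. The class and method names are
invented for illustration.

```python
from collections import defaultdict

STOPWORDS = {"MAJOR", "MINOR"}   # attributes not worth indexing


class DevDB:
    def __init__(self):
        self.paths = {}                                     # path -> attrs
        self.index = defaultdict(lambda: defaultdict(set))  # key -> value -> paths

    def _unindex(self, path):
        for key, value in self.paths.get(path, {}).items():
            if key not in STOPWORDS:
                self.index[key][value].discard(path)

    def feed(self, action, path, attributes):
        if action not in ("add", "change", "remove"):
            return                       # ignore bind/unbind etc. in this sketch
        self._unindex(path)
        if action == "remove":
            self.paths.pop(path, None)
            return
        self.paths[path] = dict(attributes)
        for key, value in attributes.items():
            if key not in STOPWORDS:
                self.index[key][value].add(path)

    def _lookup(self, key, value):
        return self.index.get(key, {}).get(value, set())

    def query(self, terms):
        indexed = [k for k in terms if k not in STOPWORDS]
        if indexed:
            # most specific term first: the one with fewest candidate paths
            best = min(indexed, key=lambda k: len(self._lookup(k, terms[k])))
            candidates = self._lookup(best, terms[best])
        else:
            candidates = self.paths.keys()   # nothing indexed: full scan
        return sorted(p for p in candidates
                      if all(self.paths[p].get(k) == v
                             for k, v in terms.items()))
```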

what are we going to use for querying? can't be netlink because that's
a shared medium (broadcast/multicast). unix dgram socket? alternative
would be to somehow use the filesystem as a database

Wed Apr 17 22:00:29 BST 2024

tests. assuming the sysfs setup from all-events.txt, we can write tests like

- there is a path for $foo
- the attributes are x, y, z

- when I add a device with $attributes, I can recall it
  - by path
  - by attribute value

- when I remove it again, I cannot access it by path or attributes

- when I add a device with $attributes major minor foo bar baz
  it is added to indices for foo bar baz but not major minor

- when I remove it, it can no longer be found by looking in any index

- when I query with multiple attributes, the search is performed
  using the most specific attribute (= the attribute whose
  value at this key has fewest elements)



I am still looking for ways to avoid doing this, but it is potentially
the first of several "database" services that triggers could want to
use so maybe it's an emerging pattern.


https://github.com/philanc/minisock useful? we could almost replace
nellie with it only not quite (it hardcodes 0 as the "protocol" param
to socket())

Fri Apr 19 20:55:22 BST 2024

We could have a service that's present only when a devdb entry is
present. For example mount_disk only runs when partlabel=foo

Or we could have a service that continues to run as the $somedatabase
service state changes and does different things depending on the
nature of those changes. For example, [I can't think of an example
now, but it was definitely an issue the other day, maybe I dreamt it]
I don't think this will be such an issue for devdb because there isn't
much in it that has continuously varying values. Maybe battery health
is the exception there

The step ahead we're thinking here is: how do clients do a request?  A
single one-off request for state is fine but chances are that a client
will do that to get initial state and then need to open a netlink
socket to get updates: well, if we can feed them the initial state
filtered for their needs why can't we send them the relevant updates
as well? This makes the database server design a bit more complicated
as it needs to remember each client and their subscriptions, and then
send only relevant updates to each subscribed client
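
To make "send only relevant updates" concrete, here is a minimal shell
sketch of the matching step. It assumes an event has been flattened to
space-separated key=value words and that a subscription is the same
kind of list; event_matches is a hypothetical name, and values
containing spaces would need a better representation than this:

```shell
# event_matches TERMS EVENT: succeed when every key=value word in
# TERMS also appears as a whole word in EVENT.
event_matches() {
  for term in $1; do
    case " $2 " in
      *" $term "*) ;;      # this term is satisfied, check the next
      *) return 1 ;;       # one missing term means no match
    esac
  done
  return 0
}
```

The server would run this once per (subscriber, event) pair and write
the event to the subscriber's socket only when it succeeds.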

* should a client be allowed multiple subscriptions on the same
connection?

* do we guarantee that every message sent is matching the subscription
or can we send other stuff as well if it makes implementation easier?
it might defeat the purpose a bit because it means the client also
needs to filter, but the client will anyway have to do some message
parsing so they can distinguish add from remove

* where do we start?

Sun Apr 21 13:31:48 BST 2024

We have the mechanics of it working (albeit implemented in the
simplest possible terms), we need to glue it to some I/O

1) open a netlink socket and read the events from it

2) "create a PF_UNIX socket of type SOCK_STREAM, and accept connections on it, then each time you accept a connection, you get a new fd"

- accept connection
- read terms from it
- register callback that writes event to connected socket

minisock has no support for "test if fd is ready" or "wait for [fds]
to become ready", either we need poll() or we could add a call for "is
this fd ready to read" and use coroutines. Fork minisock or add as
another library?

[ if we fork minisock we could expose the protocol param to Lua
so we could use it for netlink ]

Tue Apr 23 19:13:45 BST 2024

we could convert from minisock to lualinux. if we can also use that to
get rid of nellie and/or lfs, the size tradeoff is minimal

---

Is there some way we could test the devout event loop?

I can register a fd with a callback
when the fd is ready, my callback is called
when the callback returns true it remains registered
when the callback returns false it is unregistered and the fd is closed

loop.register
loop.registered?
loop.feed

Tue Apr 23 20:34:03 BST 2024

I think we could make the event loop abstraction leak less?
It's not actually a _loop_, all the actual GOTO 10 happens
outside of it

[X] 1) see if we can do netlink in lualinux
[X] 2) if so, convert it to lualinux
[X] 3) add netlink socket to event loop
[X] 4) make it send messages to subscribers
[X] 5) package it
[X] 6) make uevent-watcher use it instead of netlink directly
[X] 7) write an inout test variant that has the stick inserted
  at boot time already

I'm also thinking we could wrap the raw fds from lualinux into small
objects with read and close methods? It would make testing easier if
nothing else - also use of with-open.  Maybe do that in anoia.

when a subscriber connects we need to send them their matching current
state before subscribing them [ needs a test ]

figure out what event format the subscribers want? lua-ish or send the
same messages as udev would? If we're going to send the originals,
should we store them alongside the parsed, or reconstruct from parsed?

Sat Apr 27 21:52:11 BST 2024

We have a passing inout test. Next thing to do is try it on
the actual archive hardware

Next big thing is some kind of failovery service.  Almost-obvious
candidate is LTE failover with aaisp l2tp tunnel

Tue Apr 30 23:27:30 BST 2024

I want to connect my new ip camera to arthur without letting it reach the
internet, or the internet reach it.

we could plug it into a gl.inet box running dhcp server on lan
and client on wan, then use NAT to expose the camera's http and rtsp
ports on whatever address it has on the wan interface

Tue May  7 22:23:49 BST 2024

If we want to build a config with an l2tp upstream, it needs an
underlying dhcp interface not pppoe as we can't use the bordervm l2tp
account simultaneously. Having bordervm do dhcp might be quite useful
anyway for other applications, although it will have to double-nat to
the internet. We could give it an aaisp /64 and have routable ipv6 but
maybe that's a level of faff too high.

Given that we can build xl2tpd  and a service for it.


We're using the same l2tp account for thingy that we use to simulate ppp,
so we need an upstream which is not ppp

We need a less shit coldplug that copes with filenames containing spaces (!)

Fri May 10 00:33:14 BST 2024

Getting xl2tp hackily running turned out to be not a lot of work. However,
we need to figure out routing

- we need a route on lan device to the dns to lookup l2tp.aaisp.net.uk
- we need a route on lan device to l2tp.aaisp.net.uk

also it doesn't die when the tunnel closes, which is a bit shit

maybe this is where we lean into health check services

a health check service is just a service that watches another service
and kills it if it's not healthy.

for xl2tpd, "not healthy" is "there is no ppp process" or "there is no
tunnel" or "the tunnel has no sessions". I don't know how we
(robustly) test for no ppp process associated with the l2tp peer

when ppp quits, does the tunnel come down?
in xl2tpd.c child_handler we respond to sigchld by closing c->fd
and setting it to -1

Sat May 11 17:55:04 BST 2024

A better way to monitor the connection health would be to ping a
computer on the internet (preferably one that doesn't mind being
pinged).  If we combine autodial with "is $isp still there" then we
should have something fairly robust.

xl2tpd spawns pppd, we should equip it with config that writes the
ppp outputs (ip address etc) to the xl2tp service directory so
that it can be used like a regular ppp. This will also make
it possible to have the health check work by pinging the peer address

Sun May 12 22:33:09 BST 2024

sleep until the interface is probably up
failure counter  = 0
loop indefinitely
  get outputs/peer-address of watched ppp service
  ping it
  if ok
    reset failure counter
  else
    increment failure counter
  fi
  if failure counter > threshold
    bounce the ppp service
    exit, if previous action didn't do that already
  end
  sleep(check interval)
end loop
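
The same loop as a shell sketch. PING_CMD and BOUNCE_CMD are
placeholders invented here so the counting logic can be exercised
without a network or a real service; the threshold and interval values
are also assumptions:

```shell
THRESHOLD=${THRESHOLD:-3}
failures=0

# record_result STATUS: update the failure counter from a ping exit status
record_result() {
  if [ "$1" -eq 0 ]; then
    failures=0
  else
    failures=$((failures + 1))
  fi
}

# should_bounce: succeed once the counter has exceeded the threshold
should_bounce() {
  [ "$failures" -gt "$THRESHOLD" ]
}

# health_check_loop PEER_ADDRESS_FILE INTERVAL
health_check_loop() {
  while true; do
    # a missing outputs file gives an empty address, which counts as a failure
    peer=$(cat "$1" 2>/dev/null || true)
    if ${PING_CMD:-ping -c 1 -w 5} "$peer" >/dev/null 2>&1; then
      record_result 0
    else
      record_result 1
    fi
    if should_bounce; then
      ${BOUNCE_CMD:-echo bounce}
      return 0
    fi
    sleep "$2"
  done
}
```

It would be called as something like "health_check_loop
path/to/outputs/peer-address 10" after the initial sleep, with
BOUNCE_CMD set to whatever ends up restarting the ppp service.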


# ps ax | grep l2tp
   72 root      1316 S    s6-supervise l2tp.aaisp.net.uk.l2tp
   73 root      1316 S    s6-supervise l2tp.aaisp.net.uk.l2tp-log
  122 root      1428 S    {run.user} /bin/sh ./run.user l2tp.aaisp.net.uk.l2tp
 1099 root      1428 S    {run.user} /bin/sh ./run.user l2tp.aaisp.net.uk.l2tp
 1102 root      1104 S    {xl2tpd} /nix/store/i1bbqh7vybam03l6jzf4sm4np3k4ack5
 1115 root      1420 S    grep l2tp
# s6-rc -d change l2tp.aaisp.net.uk.l2tp
# ps ax | grep l2tp
   72 root      1316 S    s6-supervise l2tp.aaisp.net.uk.l2tp
   73 root      1316 S    s6-supervise l2tp.aaisp.net.uk.l2tp-log
  122 root      1428 S    {run.user} /bin/sh ./run.user l2tp.aaisp.net.uk.l2tp
 1102 root      1104 S    {xl2tpd} /nix/store/i1bbqh7vybam03l6jzf4sm4np3k4ack5
 1122 root      1420 S    grep l2tp


Mon May 13 19:45:59 BST 2024

We need to do the usb id switching dance thing for the lte modem.
At startup it's 12d1:14fe, which is "mass storage mode", although the
disks seem to disappear as soon as they appear which is weird

probably the mode switch should be triggered by device insertion

usb_modeswitch -v 12d1 -p 14fe --huawei-new-mode

https://github.com/pixelspark/tymodem?tab=readme-ov-file


Tue May 14 21:58:25 BST 2024

[ we didn't need this. the first form is the default, the second
is what something on the internet said we should change it to, the
third is setting it back to default ]

^SETPORT:A1,A2;12,1,16,A1,A2
AT^SETPORT="FF;12,16"
AT^SETPORT="A1,A2;12,1,16,A1,A2"



Wed May 15 21:55:11 BST 2024

we can use uevent-watch to look for devtype=usb_device product=12d1/14fe/102
and trigger a oneshot that runs usb-modeswitch

we can use uevent-watch to look for devtype=usb_device product=12d1/1506/102
and trigger a oneshot that runs the AT commands

if wwan0 is a triggered service how can dhcp depend on it? arse

- we can get reverse dependencies from s6-rc-db, so the semantics of
starting a triggered service could include starting everything it
enables

- we don't want to inadvertently start it on boot by putting it in the
global config.services

Thu May 16 09:09:44 BST 2024

we could do something cleverish with the config.services at build time
by stripping from it everything depending on a trigger. but then how
_do_ they get started? the intent of putting it in config.services
is that it will be started when conditions are suitable.

can we: go through each service in config.services, detect the trigger
that started it, and add it to a bundle named for that trigger?

we need something in the triggering service to mark the triggered
service as not-for-boot, and then to apply that transitively to
everything depending on it

I don't think we have common code for triggers, so either we need to
add some or put this marking in all of the current examples

Wed May 22 22:29:46 BST 2024

# cat /sys/bus/usb/devices/1-1.2:1.2/uevent
DEVTYPE=usb_interface
DRIVER=huawei_cdc_ncm
PRODUCT=12d1/1506/102
TYPE=0/0/0
INTERFACE=255/2/22
MODALIAS=usb:v12D1p1506d0102dc00dsc00dp00icFFisc02ip16in02

# cat /sys/bus/usb/devices/1-1.2:1.0/uevent
DEVTYPE=usb_interface
DRIVER=option
PRODUCT=12d1/1506/102
TYPE=0/0/0
INTERFACE=255/2/18
MODALIAS=usb:v12D1p1506d0102dc00dsc00dp00icFFisc02ip12in00

neither of these is a tty. however, there's a ttyUSB0
directory in there -

# cat /sys/bus/usb/devices/1-1.2:1.0/ttyUSB0/uevent
DRIVER=option1

# cat /sys/bus/usb/devices/1-1.2:1.0/ttyUSB0/tty/ttyUSB0/uevent
MAJOR=188
MINOR=0
DEVNAME=ttyUSB0


We can query devout using s6-ipcclient:

$ s6-ipcclient -v /run/devout.sock  sh -c "echo devtype=wwan >&7; cat <&6 >&2"


we're going to need to expose sysfs parent attributes in devout,
so that we can tie together vendor/product and tty name

- they don't come from netlink socket so we'll have to read the
  filesystem
- we don't know if we can cache them indefinitely or if they change
- we also will want to match on parent attributes
- per https://www.kernel.org/doc/html/v4.16/admin-guide/sysfs-rules.html
  we should do that by getting the parent directory, not by
  following symlinks
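
A sketch of "get the parent directory, don't follow symlinks": walk up
the device path one component at a time until some ancestor has the
attribute. find_attr is a hypothetical helper, and it takes the sysfs
mount point as a parameter so it can be pointed at a fake tree. It uses
cat for brevity, which is exactly the stdio-style read we'd want to
avoid in the real thing:

```shell
# find_attr ROOT PATH ATTR: print ATTR from PATH or from its nearest
# ancestor under ROOT; fail if no ancestor has it.
find_attr() {
  root=$1
  dir=$2
  attr=$3
  while :; do
    if [ -f "$root/$dir/$attr" ]; then
      cat "$root/$dir/$attr"
      return 0
    fi
    case "$dir" in
      */*) dir=${dir%/*} ;;   # strip the last path component
      *) return 1 ;;          # reached the top without finding it
    esac
  done
}
```

For example "find_attr /sys devices/.../ttyUSB0 idVendor" (path elided)
would tie a tty back to the usb device that owns it.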

I think we are going to document "a rule depending on attrs won't
trigger if those attrs change silently without a uevent". attrs for a
path are rescanned only when an event for that path (or a child?)
is received.

we need to avoid using stdio to read sysfs attributes because of the
need for atomicity

Sun May 26 12:24:45 BST 2024

cd /
mkdir -p /tmp/sys/devices
find /sys/devices -type f | while read -r F ; do
   D="/tmp/$(dirname "$F")"
   test -d "$D" || mkdir -p "$D"
   test -r "$F" && cat "$F" > "/tmp/$F"
done

Tue May 28 20:58:58 BST 2024

To make testing easier, we pass the /sys mount point as a param
to (sysfs) via (database)

This means that for event:match to work with attributes, the event
also has to point back to the database, or at least to sysfs. However
we'd like to do the attr tests after the regular uevent key tests
and the attrs tests after even that, because they're that much slower.

-----

Let's change it up.  The database contains kobjects not events. A
kobject knows its mountpoint as well as its path, and implements the
attr and attrs methods


Sun Jun  2 20:58:47 BST 2024

we have a new uevent-rule service that can start/stop its controlled
service based on uevent fields and sysfs attrs for the corresponding
device/any ancestor of its sysfs path

we need
- a way of starting _dependent_ services of the controlled service:
  we want to activate that whole sub-branch of the dependency tree

- a way to not attempt to start dependents of the controlled service
at boot time: we don't want to put them in the default bundle if
they depend on stuff that's part of the default service set.

what happens when a service depends on _two_ controlled services?
we want to start it only when _both_ of them are up. That discourages
any idea of creating a bundle for each controlled service and its
dependants

Tue Jun  4 19:07:44 BST 2024

we have to do it at runtime. I think we can add a file in the service
directory to identify that it's triggered (i.e. only run in response
to an event, we could use better terminology here)


for s in the reverse deps of $triggered
  unless any? (s1:  {  s1.triggered && s1.down  }  )
              s.dependencies
    start s

we can do this in ci, maybe adapt the inout test

two triggered services, A with a uevent that does happen
and B which doesn't

two dependent services; C depends on A only and D on A+B.

each should write an output
we check that the output from C is present but not the one from D
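
The rule being tested, as a shell sketch with the A/B/C/D example baked
in. The lists and deps_of are stand-ins for what would really come from
s6-rc-db and the live service state:

```shell
triggered="A B"   # services that only run in response to an event
up="A"            # A's uevent happened, B's did not

# deps_of SERVICE: stand-in for a dependency query
deps_of() {
  case $1 in
    C) echo "A" ;;
    D) echo "A B" ;;
  esac
}

# in_list WORD LIST: succeed if WORD is one of the words in LIST
in_list() {
  case " $2 " in
    *" $1 "*) return 0 ;;
  esac
  return 1
}

# can_start SERVICE: fail if any dependency is triggered and still down
can_start() {
  for d in $(deps_of "$1"); do
    if in_list "$d" "$triggered" && ! in_list "$d" "$up"; then
      return 1
    fi
  done
  return 0
}
```

With this data can_start succeeds for C and fails for D, which is the
outcome the CI test is meant to observe through the outputs.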



Tue Jun  4 22:22:20 BST 2024

We can't have dependencies of triggered services unless we expose the
triggered service instead of the watcher from the module that sets up
uevent-watch. otherwise the user will be declaring their dependencies
on the watcher and they will start when the watcher does. We need to
invert it.

Bother.

services.mount-foo = oneshot {
  name = "blah";
  up = "mount /dev/disk/by-partlabel/mydisk";
  runCondition = conditions.sysfs {
    terms = { partlabel = "blah"; };
    symlink = "/dev/disk/by-partlabel/mydisk";
  };
};

To process this, we need to

- remove it from the default bundle
- create a service that _is_ in the default bundle which
   starts this one when runCondition is true


How would it look for service conditions other than uevents?

Thu Jun  6 21:49:10 BST 2024

runCondition can be a function that accepts the service to be
watched and returns a service that does the watching. The
derivation calls this function and makes the watched service
depend on the watcher.

When adding services to the default bundle (wherever that happens, I
no longer remember) we need to check if the service has a controller
and add that instead of the service itself.

s6-ipcclient -v /run/devout.sock  sh -c "echo subsystem=tty attrs.idVendor=12d1 >&7; cat <&6 >&2"

Sun Jun  9 22:55:05 BST 2024

services which depend on services that have controllers should not be
added to default bundle, otherwise they will try to start the controlled
service before it's ready

Tue Jun 11 20:00:04 BST 2024

we can't look in /run/service/name  to see if it's controlled, because
if it's a oneshot that directory won't exist. likewise /run/s6-rc/servicedirs/

we could add the controller to the dependencies, then
controlled? = (any dependency is a controller)

what about generating a script at build time that has the knowledge
baked in?


to_start = reverse deps of service
print "to_start=$to_start"
for each service s1 in to_start
  if isControlled s1
     rd = reverse deps of s1
     print "test_stopped s1 && not_to_start="$not_to_start $rd"
print ''
for s in $to_start; do
  case " $not_to_start " in
    *" $s "*) true;;
    *) s6-rc -u change $s ;;
  esac
done
''

---

would it help to make an empty bundle called "controlled" and have all
controlled services depend on it? no, because the condition for
skipping a service is that it depends on a _different_ controlled
service that's down

s6-rc -u change $service
for s in $(s6-rc-db -d all-dependencies $service); do
  if (s6-rc-db all-dependencies $s | grep controlled); do
    # do nothing: if it's stopped, we don't want to start it,
    # and if it's already running there's no need to.
    # XXX what if it's running but its children are not?
    # XXX 2 this fails as written because _every_ service $s depends
    # on the controlled service $service
  else
    start s

"[service] Names cannot be duplicated and cannot contain a slash or a
newline; they can contain spaces and tabs, but using anything else
than alphanumerical characters, underscores and dashes is discouraged"

 would this be simpler in fennel?

(each [l (lines (io.popen (%. "s6-rc-db -d all-dependencies %s" service)))]
  (if (not (controlled-ancestor l))
    (tset to-start l true)))

Sat Jun 15 08:51:21 BST 2024

# do this at boot
for s in  $(s6-rc-db -d dependencies controlled); do
  mkdir -p /run/services/controlled
  touch  /run/services/controlled/$s
done


# now we can do this to start a service tree

for controlled in $(cd /run/services/controlled/ && echo *); do
  if down $controlled; then
    blocks="$blocks $controlled "
  fi
done

for s in $(s6-rc-db -d all-dependencies $service); do
  blocked=
  for dep in $(s6-rc-db all-dependencies $s); do
    case "$blocks" in
      *" $dep "*)
        # depends on a controlled service that is down: don't start
        blocked=1
        ;;
    esac
  done
  test -n "$blocked" || s6-rc -u change $s
done

Sun Jun 16 23:13:53 BST 2024

what we are trying to do is set up an l2tp by hostname

1) this means looking up the hostname in the dns
2) this means having a route to the dns server
3) this means parsing the space-separated list of dns servers
  provided by dhcp

we could write the servers each into their own file, but that
helps less than you'd think unless we give those files predictable
names
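
"Predictable names" could be as simple as numbering them. A sketch,
assuming the dhcp client hands us the servers as one space-separated
string; write_dns_servers and the directory layout are made up here:

```shell
# write_dns_servers DIR "ADDR1 ADDR2 ...": write each server address
# to DIR/0, DIR/1, ... in the order given.
write_dns_servers() {
  dir=$1
  mkdir -p "$dir"
  n=0
  for server in $2; do
    printf '%s\n' "$server" > "$dir/$n"
    n=$((n + 1))
  done
}
```

A consumer that only needs one server to route to can then read the
file named 0 without parsing anything.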

Thu Jun 20 10:16:52 BST 2024

now we have l2tp-over-wwan, we need to do the failover mechanism

- can't have both l2tp and pppoe running at once (at least for aaisp)
  because same creds used for both and starting l2tp will cause them
  to route all traffic to the l2tp instead of the FTTx

- we could have the wwan stick permanently configured and ready to go,
  as long as we're not actively using it unless the main connection is
  b0rked

- can we have the same odhcp stuff running and point it to either?
  maybe renaming the wan interface would be an easy-ish way to do this

we need some kind of health check on the main connection that will
bring up the backup if e.g. packet loss over x%. Or is lcp echo good
enough here? for multipath to the same backhaul, if some weird routing
cockup makes google unavailable from the main connection it will most
likely also be unavailable from the backup, so lcp echo is arguably better


on a side note, use of shell functions to get the output from another
service is a bit icky

Fri Jun 21 21:05:21 BST 2024

We can have a controller with two controlled services, which runs the
second one when the first one isn't working.

how do we connect the dependent services (dhcp pd, defaultroute, anything
else dependent on wan) to the correct upstream?

we can't use bundles because bundles just flatten to atomic services, there's
no either/or there

controller
  - main service
  - backup service
  - proxy service

The proxy service is running when one of the main or backup services is
up.  It provides all the outputs of whichever backend service is active

https://skarnet.org/software/s6/s6-svwait.html

proxy could use "s6-svwait -U -o main backup" to wait for one of the two
backend services, provded that both are longruns

so in the controller we start main-service, and if/when that fails start
backup-service. we run proxy-service if any of the backend services is
running, and use its outputs to indicate which.

the proxy could just symlink to the backing service outputs directory,
or it could copy and translate if the main and backup services have
different outputs, so that it presents a common interface. I'm not
sure proxy is the best name but I haven't thought of a better.

we can do a manual switch back to main-service by restarting the
controller.  we could do an automatic switch by adding logic to the
controller to make it restart itself.

perhaps the controller has an output that indicates which backend is
active, then the proxy just needs to look at that to figure which one to
use.

while true; do
  if s6-rc -u change $primary; then # will wait until succeeded, or exit 1 if timeout
    ln -sf $primary outputs/active
    s6-rc -u change $proxy
  elif s6-rc -u change $secondary;  then
    ln -sf $secondary outputs/active
    s6-rc -u change $proxy
  else
    rm outputs/active
    s6-rc -d change $proxy
  fi
  # wait for the backend to die (down cleanup will
  # remove outputs directory)
  while test -d outputs/active/.outputs; do
    inotifywait outputs/active/.outputs
  done
  rm outputs/active
  s6-rc -d change $proxy
done

this script will, when primary dies, attempt to start primary: if
it doesn't come up, start secondary

if the primary comes up and then goes down later, we'll start it
again - which isn't what we want. When the primary dies, we
want to try the secondary next

backends="primary secondary tertiary etc"
# keep a trailing space so the last name can be peeled off like the rest
rest="$backends "
while true ; do
  first="${rest%% *}"
  rest="${rest#* }"
  if test -n "$first"; then
    if s6-rc -u change $first; then
      ln -sf $first outputs/active
      s6-rc -u change $proxy

      while test -d outputs/active/.outputs; do
        inotifywait outputs/active/.outputs
      done
    fi
    rm outputs/active
    s6-rc -d change $proxy
  else
    rest="$backends "
  fi
done

in this version when the secondary dies then we try the third backend
(round-robin). are there circumstances where we'd rather retry the primary?
Presumably there are circumstances where we would _not_ rather
retry the primary, otherwise why are we even providing a tertiary?
If we could answer that question then we'd know.


Mon Jun 24 21:22:34 BST 2024

the controller needs to know the names of its backends, which is ugly
if they're computed names because we can't define the services themselves
first without their references to the controller

mutual recursion ... maybe it's time to understand how this fixpoint
thing works

Wed Jun 26 22:16:25 BST 2024

s6 will restart the pppoe service when it dies, and keep doing this
indefinitely - unless the ./finish script returns 125. Note that this
is only true for longruns, but it's not as though oneshots can die
anyway as there's no process to fail.
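
A sketch of how a ./finish script could use the 125 convention to give
up after a run of failures. finish_status is a hypothetical helper, the
limit is arbitrary, and the counter file is assumed to live under the
service directory's data subdirectory since s6-supervise leaves that
alone:

```shell
# finish_status COUNT_FILE LIMIT RUN_EXIT_CODE: print the exit status
# ./finish should return: 125 once LIMIT consecutive failures have
# been seen, otherwise 0 so the service is restarted as usual.
finish_status() {
  count_file=$1
  limit=$2
  code=$3
  if [ "$code" -eq 0 ]; then
    rm -f "$count_file"       # a clean exit resets the streak
    echo 0
    return 0
  fi
  count=$(cat "$count_file" 2>/dev/null || echo 0)
  count=$((count + 1))
  echo "$count" > "$count_file"
  if [ "$count" -ge "$limit" ]; then
    echo 125
  else
    echo 0
  fi
}
```

In ./finish this would be used as
exit "$(finish_status data/failures 5 "$1")"
since the first argument to the finish script is the exit code of ./run.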

Sat Jun 29 21:43:10 BST 2024

> s6-supervise says it restarts the supervised process when it exits
  "unless told not to"; however s6-rc talks about "failed
  transitions": if a s6-rc service doesn't signal readiness before
  timeout-up expires, it is stopped and won't be restarted.  I *think*
  the behaviour I am observing is that ./run may be invoked several
  times if it dies without ever signalling readiness, and then it's
  killed when the timeout is exceeded


... so ... that's OK, probably. pppoe will stop running after n
lcp-echoes time out

----

inotifywait apparently requires c++ and libgcc and transitively the
kitchen sink, which is a bit silly as we have linotify in lua. So
we should replace the failover scripty thing with a lua program

(table.concat rdepends ", ")


Fri Jul  5 21:21:18 BST 2024


1970-01-01 00:01:00.797696621 wan-switcher      blocks (        modem-modeswitch, modem-atz, wan.link.pppoe, 194.4.172.12.l2tp, wan-proxy     )       rdepends (      194.4.172.12.l2tp       )       start ( 194.4.172.12.l2tp       )


why is it starting l2tp when it should depend on having a route to the
l2tp server

Sat Jul  6 14:24:26 BST 2024

The logic for up-tree is not correct, as it assumes that the
requested service is itself ready to start (so excludes it from
the blocked list). If the requested service is dependent on
some other block, it should not be started.

[ I am confused. Isn't this what happens already? ]


@40000000000000441b51b24c wan-switcher  blocks (        modem-atz, modem-modeswitch, 194.4.172.12.l2tp, wan.link.pppoe, wan-proxy     )       rdepends (      194.4.172.12.l2tp       )       start ( 194.4.172.12.l2tp       )


# s6-rc-db all-dependencies  194.4.172.12.l2tp
route-05029a9e8e2c-ee8d76f34e9c
hostname
modem-atz
modem-modeswitch
wwan0.link
check-lns-address
resolve-l2tp-server
controlled
route-07d8f171cb5a-ee8d76f34e9c
wwan0.link.dhcpc
wwan0.link.dhcpc-log
194.4.172.12.l2tp-log
194.4.172.12.l2tp
s6rc-fdholder
s6rc-oneshot-runner

Wed Jul 10 23:37:00 BST 2024

I propose rewriting the admin section. Topics we need to cover

* building liminix given a configuration
* installing for the first time
  - refer to hardware section to find which of the following apply
  - installation from openwrt
  - installation from boot monitor
* upgrading
  - when you have a writable filesystem
  - using levitate
* using the running system (services, logs)

we also need to expunge all mention of kexec

and mention the upgrade choices in the Configuration section so
people don't build an unupgradable image and only find out later

Fri Jul 12 21:20:00 BST 2024

generalising the failover example:

- usb stick may or may not need a modeswitch
- may need a different chat script
- usb ids

Mon Jul 15 17:52:57 BST 2024

DONE 1) Should round-robin be a callService service or a function a la
longrun/oneshot, or even an overridable package?

DONE 2) maybe we should replace all liminix.callService with
 config.system.callService

3) for consistency, can we make the networking "primitives" into
services? answer: no.  the only thing left there is `ifup` which is a
function returning a string, not a derivation

Tue Jul 16 18:25:41 BST 2024

can we make the gateway profile able to use failover?  perhaps if we
add username and password as options to the pppoe service, then call
gateway with the pppoe service instead of building it _in_ the profile,
we can have gateways with other-than-pppoe for the wan

(for a straight lte uplink, could pass the wwan interface as wan)

Fri Aug  2 20:18:38 BST 2024

Some thoughts about secrets

1) clevis/tang is a mechanism to have encrypted local secrets that
can't be decrypted unless a particular network host is present. This
means we can reboot it unattended without having plaintext on the device,
but it doesn't address getting the secrets onto the device.

when is this useful?

- someone steals the router and doesn't steal the tang server, they can't have my secrets

when not?

- someone compromises the machine, gets root, and looks in /run
- doesn't address key rotation

2) hashicorp vault or something like it can download secrets from
a networked server - then we just plonk them in /run. But I don't want
to pay for vault itself, so "something like" are the key words there

3) Is sops relevant here? could we keep secrets in a big sops json/yaml
and serve the decrypted file over https?

==== so we have a plan, I think =====

1) a secrets service that retrieves the secrets from some directory
that isn't /nix/store and copies them to its outputs with minimal
permissions.  Services that need secrets can depend on the secrets
service and be restarted when secrets change

1b) a service that monitors changes in secrets (e.g. using inotify
on /run/services/outputs/foo/bar) potentially doesn't need to restart
when the secrets change, so how do we know which ones to pass over?
This is a performance optimisation not a correctness issue

2) as (1) but the secrets are encrypted and we use clevis to decrypt

3) a secrets service that fetches the decrypted secrets as a JSON file
using HTTPS.  We can use this ourselves with SOPS and it will be easy
to adapt for Vault users.

(What about fetching _encrypted_ secrets with https and then
decrypting locally? I don't think we can do this with clevis because
the machine that encrypts has to be the machine that decrypts (ICBW).)
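
For option (1), the "minimal permissions" part might be no more than a
restrictive umask around the copy. A sketch, where copy_secrets and
both directory arguments are placeholders:

```shell
# copy_secrets SRC DEST: copy the tree under SRC into DEST so that
# nothing created there is readable by group or other.
copy_secrets() {
  (
    # subshell, so the umask change doesn't leak into the caller
    umask 077
    mkdir -p "$2"
    cp -R "$1/." "$2/"
  )
}
```

Because cp is run without -p, the new files get the source mode masked
by the umask rather than a copy of the source permissions.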

Sat Aug  3 22:28:24 BST 2024

It would be useful maybe if the secrets could survive a reboot. Fetch
using HTTPS and then store locally using clevis. But actually all that
does is mean we've shifted from "can see the secrets server" to "can
see the tangd server" so maybe not especially useful

* We can't put passwords in the secrets unless we have a service that
will change /etc/passwd if they change *

we need a service that does the HTTPS call, parses the JSON response
and writes it as nested subdirectories

can we encode permissions and ownership into the file? should we?
will some other secret store (say, Vault) know how to encode permissions
in the same way as we arbitrarily choose to?


Tue Aug  6 18:41:16 BST 2024

I would like to know

- why the docs aren't being built
- min-copy-closure error:  "cpio: unsupported cpio format, use newc or crc"
- can we avoid rebuilding this crapload of build packages?

Wed Aug  7 18:36:09 BST 2024

* a lua script that downloads a json file and turns it into outputs
* a service that runs the script and then pauses for 30 minutes
* the service will be restarted when it exits

export SOPS_AGE_KEY=$(age -d key.age)  ; sops -a age1vearrjhv4x4cw6rfg2hdgqp46p4k673avezk3td5rd9ktrcrmslsljjsfq -e secrets.yaml > secrets.enc.yaml

EDITOR="emacs -nw" SOPS_AGE_KEY=$(age -d key.age)  sops  secrets.enc.yaml

Fri Aug  9 21:51:18 BST 2024

we have a service that periodically fetches a json and writes the values
to its outputs

we need to figure how to *use* that data

- services that can't look in a file for their secrets might need a config
file to be rewritten
- service may need restarting to pick up a changed secret
- maybe service accepts secrets using environment variables (see also
 previous point)

we already have  a mechanism for watching service output changes, it's the
thing we use for picking up dhcp6 config

it doesn't do the diff for you, you have to remember the old value and
see for yourself if the change is useful.
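
"Remember the old value and see for yourself" could look like this in
shell. on_change and the state file are hypothetical, not part of the
existing output-watching code:

```shell
# on_change STATE_FILE NEW_VALUE COMMAND...: run COMMAND only when
# NEW_VALUE differs from the value remembered in STATE_FILE (or when
# nothing has been remembered yet), then remember NEW_VALUE.
on_change() {
  state=$1
  new=$2
  shift 2
  old=$(cat "$state" 2>/dev/null || true)
  if [ ! -e "$state" ] || [ "$new" != "$old" ]; then
    printf '%s\n' "$new" > "$state"
    "$@"
  fi
}
```

The watcher would call it on every notification with the current value
of the output it cares about, and the restart (or whatever the action
is) only happens when that value has really changed.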

what we'd like is something like this:

svc.secret-watcher.build {
  source = config.services.secret-service;
  watch = ["wlan" "telent5"];
  service = svc.hostapd {
    params = {
      # ....
      wpa_passphrase = "$(output secret-watcher "wlan/telent5/wpa_passphrase")";
    };
  };
}

but output is a shell function, so how do we get this substituted into
the config file? something at runtime needs to rewrite the config file
into /run and interpolate the values.

the hostap service "run" script, before starting hostapd, needs to
copy the config file from the store into /run/somewhere and
interpolate secrets.

we could have a reasonably general command to do interpolation

echo 'wpa_passphrase={[ wpa_passphrase ]}' |  \
  patch-secrets /run/services/outputs/secrets-service/wlan/telent5 {[ ]} \
  > /run/services/state/${name}/hostapd.conf
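
A naive version of that interpolation step in plain shell, to show the
shape of it. interpolate is a stand-in name (not the patch-secrets
command itself), the delimiters are hardcoded to "{[ " and " ]}", and
it does no quoting or escaping of the substituted value at all:

```shell
# interpolate DIR: copy stdin to stdout, replacing each {[ name ]}
# with the contents of the file DIR/name.
interpolate() {
  dir=$1
  while IFS= read -r line || [ -n "$line" ]; do
    out=
    while :; do
      case "$line" in
        *'{[ '*' ]}'*)
          before=${line%%'{[ '*}     # text before the first placeholder
          rest=${line#*'{[ '}        # everything after its opening delimiter
          name=${rest%%' ]}'*}       # the placeholder name
          line=${rest#*' ]}'}        # remainder still to be scanned
          out="$out$before$(cat "$dir/$name")"
          ;;
        *) break ;;
      esac
    done
    printf '%s\n' "$out$line"
  done
}
```

Used the same way as the command above: pipe the template in, point it
at the directory holding the secret outputs, redirect the result to
somewhere under /run.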


The values might need quoting/escaping, and the quoting rules will
depend on the format of the file that needs to be generated. What if
we do an Erb-style thing and evaluate the bit inside quotes as
Lua - then we can provide any kind of escapes needed as lua functions

wpa_passphrase={[ string.format("%q", wpa_passphrase) ]}

We could for convenience provide squote(), dquote() etc functions
but the necessary rules for escaping might vary. How about
having shell() or json() or ? (what else? html?)  functions that
format and escape per the encoding rules for that language?


string.gsub(template_string, "%{%[(.-)%]%}", function(x)
  -- evaluate the captured text as a Lua expression in myenv
  return load("return " .. x, x, "t", myenv)()
end)

Sat Aug 10 23:43:15 BST 2024

Every service that can be configured with secrets (at least, that uses
a configuration file) will need to be altered to interpolate at
startup

Any service that passes params on the command line may be able to
use the "$(output " syntax still, but it does feel brittle (it always did)

will we see any kind of pattern emerge so that we can provide
secrets-interpolation for config files in one place instead of
everywhere?

svc.secret-watcher.build {
  source = config.services.secret-service;
  watch = ["wlan" "telent5"];
  service = svc.hostapd.build {
    params = {
      # ....
      wpa_passphrase = "{{ $(output secret-watcher "wlan/telent5/wpa_passphrase")";
    };
  };
}

how does the watcher communicate to the inner service that it needs secrets
from x place?

svc.secret-watcher.build {
  source = config.services.secret-service;
  watch = "wlan/telent5";
  service = svc.hostapd.build {
    secrets = config.services.secret-service;
    params = {
      # ....
      wpa_passphrase = "{{ $(output secret-watcher "wlan/telent5/wpa_passphrase")";
    };
  };
}

or something like

let
 secret = name: get-output config.services.secret-service name;
in svc.secret-watcher.build {
  watch = "wlan/telent5";
  service = svc.hostapd.build {
    params = {
      # ....
      wpa_passphrase = secret "wlan/telent5/wpa_passphrase";
    };
  };
}

which is transformed into some kind of attrset that the service can
interrogate and figure out how to interpolate? this would be an improvement
as the knowledge of what kind of quoting to use is within the service

A reasonable question would be what happens if we reference outputs
from more than one service. Honestly I'd be happy to not support it
but it's made quite easy by this form of syntax

Mon Aug 12 19:42:48 BST 2024

what about if when we build the output template we'd have something
like this:

wpa_passphrase={{
  json_quote(output("/nix/store/eeeee-servicename/.outputs", "foo/bar"))
}}

which it will get partly from its own knowledge and partly from
the thing that called it


let
  literal_or_output = o:
    if builtins.typeOf(o) == "string"
    then builtins.toJSON o
    else "output(${builtins.toJSON o.service}, ${builtins.toJSON o.path})"
in
''
wpa_passphrase={{
  json_quote(${literal_or_output(wpa_passphrase)})
}}
''

builtins.toJSON is not the "correct" quoting regime for Lua strings,
but it's sufficient for printable ascii, and using unprintable
characters in Nix strings is asking for trouble in the first place
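
A rough illustration of the two halves, in Python rather than Nix and Lua
so it can be run anywhere. literal_or_output plays the build-time role and
expand the startup-time role. Everything else is an assumption made for the
sketch: the read_output callback (standing in for reading a file under the
service's outputs directory), the render helper, and the use of eval with a
restricted namespace. None of it is how output-template actually works.

```python
import json
import re

def literal_or_output(value):
    """Build-time half: expression text for a literal or an output reference."""
    if isinstance(value, str):
        return json.dumps(value)
    return "output({}, {})".format(
        json.dumps(value["service"]), json.dumps(value["path"]))

def expand(template, read_output):
    """Startup-time half: replace each {{ expr }} with the value of expr."""
    env = {"__builtins__": {}, "output": read_output, "json_quote": json.dumps}
    return re.sub(
        r"\{\{(.*?)\}\}",
        lambda m: str(eval(m.group(1).strip(), env)),
        template,
        flags=re.S,
    )

def render(params, read_output):
    """Hypothetical helper: build a key=value template and expand it."""
    lines = ["%s={{ json_quote(%s) }}" % (key, literal_or_output(value))
             for key, value in params.items()]
    return expand("\n".join(lines), read_output)
```

The service only decides which quoting function to wrap around the value;
whether the value was a literal or an output reference is settled earlier.
A literal containing "}}" would confuse this naive regex.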

Tue Aug 13 18:37:59 BST 2024

next thing is secret-watcher service

svc.secret-watcher.build {
  watch = { service = config.services.secrets; path= "wlan/telent5"; };
  service = svc.hostapd.build {
    params = {
      # ....
      wpa_passphrase = { service = config.services.secrets; path = "wlan/telent5/wpa_passphrase"; };
    };
  };
  action = "restart"; # or "sighup" or "stop-start" or ?
}

we can implement this using the same output watching thing as acquire-*.fnl
use

- a fennel script that opens the service and calls events
- when an event path matches, do the action

[ should the watched service do this restart thing itself? ]

watch-outputs -r controlled-service watched-service path1 path2 ...
[-r or -R or -s n to send that signal number ]
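
The "when an event path matches, do the action" step might look roughly
like this. It is a Python sketch, not the fennel implementation: the
function names are made up, events are assumed to arrive as slash-separated
output paths, and "matches" is assumed to mean "is the watched path or
something beneath it".

```python
def matches(watched, changed):
    """True if `changed` is one of the watched paths or lies beneath one."""
    changed_parts = changed.strip("/").split("/")
    for path in watched:
        parts = path.strip("/").split("/")
        # Compare whole path components, so "wlan/telent5" does not
        # accidentally match "wlan/telent50".
        if changed_parts[:len(parts)] == parts:
            return True
    return False

def run_watcher(events, watched, action):
    """Call action() for each event whose path is watched; return the count."""
    fired = 0
    for path in events:
        if matches(watched, path):
            action()
            fired += 1
    return fired
```

Whether a change to a parent of a watched path should also count is left
open here.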


TODO stack

1) should we move all of the veneer-on-s6 scripts into a single package? s6-rc-up-tree, s6-rc-round-robin, watch-outputs, output-template

2) convert all writeFennelScript calls to writeFennel

3) implement if-modified-since in http-fstree

Wed Aug 14 23:00:12 BST 2024

we have a watch-outputs program, just need to hook it up to services
that need restarting

if we follow the pattern that health-check uses, define a service
that runs a script to do

 s6-svwait -U /run/service/${name}
 watch-outputs -r ${name} .....

and then insert it into the dependencies of the service that needs
restarting

Sat Aug 17 22:25:44 BST 2024

hostapd is wrapping itself in a watch-outputs, so it restarts
when the secrets change. TODO

[not worth it] 1) be smarter about the watched paths? e.g. find common prefixes?

[done] 2) don't need to wrap at all if there were no secrets

[done] 3) implement different kinds of restart

4) extend to other services
[why?]- dnsmasq
[done] - pppoe / l2tp
[done] - ssh keys

5) other sources
- local filesystem
- local filesystem with tang unlocking

6) should we send authorization header?

7) install on router

8) docs/video


Tue Aug 20 22:45:04 BST 2024

pppd is different because we do the stuff on the command line instead
of using a config file.  Though I suppose we could convert to a config
file if it makes it simpler to reuse the template code, and that would
mean that secrets were in the filesystem instead of exposed on the
command line

Wed Aug 21 23:28:41 BST 2024

We may need to patch dropbear to make it look for authorized keys in
somewhere under /run that we can control. Or we could have a separate
dropbearpubkeyagent service that overwrites those files when things
change (but only if home is writable, which it isn't). Or we could install
those files as symlinks to writable storage

https://github.com/fabriziobertocci/dropbear-epka useful?

Fri Aug 23 11:51:34 BST 2024

Wrote a patch to dropbear that permits us to -U /run/dropbear/authorized_keys/%n

We need to write dropbearpubkeyagent service, which listens alongside
the ssh service to create those files when secrets change. it doesn't need
to interact with the actual sshd, but we _do_ need to invoke the
sshd with -U if keys-from-secrets were requested

we need somewhere to specify the secrets path to the keys

sshd = svc.ssh.build {
  port = 2222;
  authorizedKeys = {
    service = config.services.secrets;
    path = "ssh/authorized_keys";
  };
}

will

 - start the pubkey watcher service
 - add it as a dependency of sshd service

vaguely uneasy about the difference between how we reference a
directory full of secrets here and how we reference a single static
secret in e.g pppoe. But maybe it's ok. the output reference just says
where the value is, it's up to the implementing service script to say
how it gets converted to useful form

How do we reconcile this with config.users, which also has ssh auth
keys? Maybe we just say it overrides.

What if someone provided static data for authorizedKeys?
(1) we would want it to be an attrset not a string
 (how do we distinguish an attrset from a secret reference, hmm?)

(2) we would convert it to /run/${name}/authorized_keys/ and use -U
 anyway

[done] - make ssh service accept keys as a param, use -U to point dropbear at them
[done] - turn replacable into a function which takes a param and returns
 service or path
[done] - replacable type definition  takes a param to indicate the "underlying"
type: i.e. an attr can be replacable int or replacable attrset, not
just replacable string
[done] - destructure args in ssh.nix
[done] - write fennel script that watches a secret ref and writes authorized
keys when it changes
[done] - update ssh service to start the watcher instead of constructing key files using echo

Sun Aug 25 19:20:56 BST 2024

5) other sources
- local filesystem
- local filesystem with tang unlocking

should we use a json here, or nested directories like the outputs directly?
I think json, then there's a single file to encrypt

6) should we send authorization header?

It's a form of protection against any random MOTP getting our secrets,
but it does mean the device has to be configured with a secret as well
as a URL. Is that OK?

7) install on rotuer

8) docs/video

9) we're not using luaposix on the host so maybe we can drop it in
write-fennel?

Sun Aug 25 21:52:23 BST 2024

It turns out that fetch-freebsd (and, therefore, http-fstree)
can fetch file: urls, so we don't need to do anything for local files
- except maybe rename that service?

Sun Aug 25 21:55:17 BST 2024

clevis-{en,de}crypt-tang are bash scripts that expect PATH to include
jose, curl, cat. Most of the hard work seems to be done by jose

Should we drag in bash (and curl ...) just to run these scripts?

most of what clevis-decrypt-tang is doing is calling jose repeatedly
to do base64 decoding and then json manipulation, then curl, then jose
again for some actual jwk stuff. I think we could mostly rewrite this
in fennel using rxi-json and fetch


Wed Aug 28 09:40:41 BST 2024

we have clevis-decrypt-tang but not encrypt

Wed Aug 28 21:36:47 BST 2024


new TODO

[done, needs testing] 1) to finish local secrets, we need a service
and script that gets the file, decrypts it and turns it to
outputs. Easiest way is to use a temp file in /run/${name} and then
use json-to-tree: there's no extra risk to having the plaintext json
there when it's in the same place anyway as fstree

1.5) and test the process and write some docs

2) perhaps we should use /run/services/var/${name} instead of /run/${name}
to avoid surprise conflicts. or we could use the existing mkstate?
mkstate is setting perms 2751 and I don't know if that's important,
but we want 0700 for secrets

[done] 3) http auth - we have netrc file support "for free", so to speak:
fetch-freebsd looks for $NETRC or $HOME/.netrc. If we put the auth
tokens in configuration, they will get embedded into the image and
this will protect against leaked http server logs but not much else.
Scenario: you have a LAN with untrusted devices on it, plus WAPs which
want to get their config from a server. If the server logs leak, other
LAN users still can't use the config URL to fetch your PPP auth data.
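
For item 1 above, the "plaintext json to outputs tree" step could be
sketched like this. Python stand-in for json-to-tree, not the real thing:
the function names are hypothetical, leaf values are assumed to become one
file each, and mkdtemp is used because it creates the directory 0700, which
is the permission wanted for secrets.

```python
import json
import os
import tempfile

def json_to_tree(obj, root):
    """Write a nested JSON object as directories (objects) and files (leaves)."""
    os.makedirs(root, mode=0o700, exist_ok=True)
    for name, value in obj.items():
        if "/" in name or name in ("", ".", ".."):
            raise ValueError("unsafe key: %r" % name)
        path = os.path.join(root, name)
        if isinstance(value, dict):
            json_to_tree(value, path)
        else:
            text = value if isinstance(value, str) else json.dumps(value)
            # Create the file readable by its owner only.
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
            with os.fdopen(fd, "w") as f:
                f.write(text)

def unpack_secrets(plaintext, parent=None):
    """Unpack decrypted JSON text into a fresh private directory."""
    root = tempfile.mkdtemp(dir=parent)  # created with mode 0700
    json_to_tree(json.loads(plaintext), root)
    return root
```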

I think it just comes down to docs/video now


-=----

docs!

to cover:

- outputs
  - what for
  - how to read?
    - one-off read in shell
    - monitoring in fennel
  - how to write

- secrets
  - sources
    - https
    - local/tang
  - supported services/attributes
    - how to add a new attribute
    - how to add a service
  - how it works (see outputs)


think this is mostly to go in Configuration. Is there anything for Admin?



Sat Aug 31 17:52:10 BST 2024

Still having trouble with tangc, which I think is just poor coding in
popen2.  Hard to test it without access on loaclhost to the tangd on
bordervm

 - use a hostname instead of an ip address
 - set the hostname somehow on loaclhost
    (actually we could just hardcode the url in tangc.fnl)
 - export 7654 in qemu somehow

Mon Sep  2 21:49:05 BST 2024

TODO - Things we haven't done yet, ideas to consider

[done] 1) improve popen2, maybe using coroutines for proper async chat

2) allow on-device not-in-store netrc so it could be kept in 0700.
Could just do authFile = "/mnt/store/blah"

[done] 3) we're not using luaposix on the host so maybe we can drop it in
write-fennel?

[done] 4) add nodefaultroute to default ppp-options

[done] 5) implement if-modified-since in json-to-fstree

6) clean up some copy-paste (e.g. literal_or_output or whatever we call it)
- [done] ppp variants are consolidated, but there's still more to do here

7) remove references to kexecboot

8) performance testing

9) revive ax3200 port and fix ubifs

10) rebuild our wifi APs and lenscap to use levitate and outboard secrets

11) [outside scope] secrets server on arthur, and oidc too?

12) remove errors from ersatz coldplug

13) teach anoia.svc how to write/remove .lock and state

14) log kernel messages

15) log shipping to something useful

16) standardise error messages in fennel. using assert() is not good
for errors like "the file is missing or can't be opened" because
the backtrace is voluminous and usually inaccurate

we could have something like check-ok which looks for the
common multiple-return (nil errmsg) pattern, we'd like a similar
one for the lualinux (nil errno) pattern.

17) fix with nixpkgs unstable

Wed Sep  4 21:45:07 BST 2024

blurb for audit:

Liminix is a Linux/Nix-based OS that can be flashed to consumer WiFi
routers of the kind that OpenWrt usually runs on (usually small MIPS
or ARM SBCs). Its USP is that because it's based on Nix, the
configuration of your device is based on a text file: there's no GUI
or other imperative interface allowing you to make changes that you
will forget you did six months later and have to recreate when you
update to a new version of the system.

tl;dr C, nftables, Lua (Fennel even better), shell, Nix. No specific
timeline from my end (unless nlnet have told you otherwise). IMO,
emphasis on network vulnerabilities rather than anything involving local
escalation: nobody is expected to be logged in locally except for
maintenance purposes in which case they're trusted by definition.

If you want to start by seeing it running, unless you have a
supported device then your best bet would be to build it for Qemu
<https://www.liminix.org/doc/tutorial.html#running-in-qemu>.  It boots
to a root console shell (there is no password on the serial console
because if you have that level of physical access on a real device
it's game over anyway) so take a look at the process list and
filesystem and generally poke around. The filesystem is read-only
unless you configure it otherwise.


To do a "static" audit: a rough breakdown of the contents, by volume, would look like this:

1) 95% of it is packages in the Nix package system (and the Linux
kernel).  Some of the packages are built with different compilation
options to produce smaller output, and in a few cases I've patched
them, so someone with C experience might be suited to look at those
patches.

```
[dan@loaclhost:~/src/liminix]$ find pkgs/ -name \*.patch
pkgs/kernel/phram-allow-cached-mappings.patch           # relevant to dev devices not production
pkgs/kernel/mips-malta-fdt-from-bootloader.patch        # for qemu only
pkgs/kernel/cmdline-cookie.patch
pkgs/dropbear/add-authkeyfile-option.patch
pkgs/u-boot/0002-virtio-init-for-malta.patch            # only used in tests
pkgs/u-boot/0001-add-ubifs-to-boot-targets.patch        # only used in tests
pkgs/xl2tpd-exit-on-close.patch
pkgs/qemu/arm-image-friendly-load-addr.patch            # only used in tests
pkgs/kernel-backport/gentree-writable-outputs.patch     # unused
pkgs/openwrt/make-mtdsplit-jffs2-endian-agnostic.patch
pkgs/mtdutils/0001-mkfs.jffs2-add-graft-option.patch    # can be removed
pkgs/kexec-map-file.patch                               # can be removed
```

dropbear (ssh) and xl2tpd (l2tp) are network-accessible. The kernel
is a high-impact target, but cmdline-cookie.patch is the only "production"
patch there so hopefully easy to review

Significant packages with custom config options:

* hostapd (configured for libtommath, internal TLS)
* nftables (--with-mini-gmp)
* openssl "no-threads" and patches to Configure to build on MIPS

My assumption with all of these is that the package authors wouldn't
provide these as configuration knobs if they weren't reasonably confident
they work as advertised, but I am willing to hear otherwise.

2) the device has a firewall using nftables. The user gets to choose
their firewall rules, but the default ruleset
https://gti.telent.net/dan/liminix/src/branch/main/modules/firewall/default-rules.nix
is based on RFC 6092 for IPv6 and "received wisdom" for IPv4: I would
very much like a second pair of eyes on this.

3) Code which is original to Liminix: as far as possible I've used a
high-level language (Fennel, which is a Lisp syntax that transpiles to
Lua) for "original" development. There is one C package
(pkgs/preinit) and some C glue to expose interfaces to Lua.

None of the original code listens to the network (except Unix-domain
sockets). At least, it was never intended to :-)

Highlights:

pkgs/devout : fills the same role as udev (listens to a kernel socket
 and a unix domain socket)

pkgs/json-to-fstree : does HTTP GET and POST requests, using a port of the
FreeBSD libfetch code (see pkgs/fetch-freebsd)

pkgs/tangc : is a transliteration from bash script to Fennel of
https://github.com/latchset/clevis/blob/master/src/pins/tang/clevis-decrypt-tang
and
https://github.com/latchset/clevis/blob/master/src/pins/tang/clevis-encrypt-tang

I call this one out specially because it's crypto-adjacent, but all the
actual cryptography happens in "jose" which it invokes as a subprocess

pkgs/min-copy-closure contains some shell scripts which leverage cpio
and ssh to update a running device over the network. My shell
scripting is probably worse than my C, so take a look

4) there is a mechanism to configure the device's secrets (PPP
password, ssh keys, etc) by fetching a JSON file from an HTTPS server,
and then generating configuration files for the various services that
use those secrets. This is mostly Fennel (so, Lua)


5) the init/service supervision system is based on s6/s6-rc (again, C
software). I don't know if this has ever had an external audit, but to
my eyes it looks like it's been written with security in mind.

Thu Sep  5 10:12:11 BST 2024

if-modified-since and fenceposts ...

we set the mtime of "." to last-modified on retrieve. what resolution
is the timestamp? empirically (using stat(1)), tmpfs seems to have
sub-second resolution, so no loss of data
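
The round trip can be sketched as below (Python, with hypothetical function
names; the real code is fennel). HTTP dates only have one-second
resolution, so as long as the mtime came from a Last-Modified header, the
value sent back as If-Modified-Since is exactly the one the server gave us
and there is no off-by-one to worry about on our side.

```python
import os
import tempfile
from email.utils import formatdate, parsedate_to_datetime

def record_last_modified(directory, last_modified_header):
    """Store the server's Last-Modified value as the directory's mtime."""
    when = parsedate_to_datetime(last_modified_header).timestamp()
    os.utime(directory, (when, when))

def if_modified_since(directory):
    """Build the If-Modified-Since value for the next request from the mtime."""
    return formatdate(os.stat(directory).st_mtime, usegmt=True)

def round_trip(header):
    """Show that a header value survives being stored as an mtime."""
    with tempfile.TemporaryDirectory() as d:
        record_last_modified(d, header)
        return if_modified_since(d)
```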

Thu Sep  5 11:15:24 BST 2024

how do we do deadlock-free popen2? need to use select or poll
on the input and output fds, and read/write a chunk when one of them is ready

(subprocess ["/usr/games/advent" "advent"]
  {
    :on-stdout #(print (ll.read %1))
    :on-stderr #(print "ERR" (ll.read %1))
    :on-stdin #(ll.write %1 "go north\n")
  })

for send/expect things, a single callback would be preferable, as long
as it is told which stream it is being called for

(subprocess ["/usr/games/advent" "advent"]
  (fn talk [stream fd]
    (match stream
      :out (print (ll.read fd))
      :err (print "ERR" (ll.read fd))
      :in (ll.write fd "go north\n"))))

because it can be hooked up to a coroutine. The coroutine is then
responsible for doing things in the right order to avoid letting buffers
fill up - probably this is just a matter of dealing with the
subprocess output before sending it more that it can choke on

what about partial writes? the coroutine is presumably keeping some
kind of state so it can check the return of ll.write when it updates that
state
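
The general shape of the select/poll loop, including partial writes, as a
Python sketch. This is not the lualinux API: chat is a made-up name, the
selectors module stands in for poll, and the "state" is just the slice of
input not yet written.

```python
import os
import selectors
import subprocess

def chat(argv, data, chunk=4096):
    """Feed `data` to argv's stdin and collect its stdout without deadlocking."""
    proc = subprocess.Popen(argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    os.set_blocking(proc.stdin.fileno(), False)
    sel = selectors.DefaultSelector()
    sel.register(proc.stdout, selectors.EVENT_READ)
    if data:
        sel.register(proc.stdin, selectors.EVENT_WRITE)
    else:
        proc.stdin.close()
    pending = memoryview(data)   # input the child has not accepted yet
    out = bytearray()
    while sel.get_map():         # loop until both fds are unregistered
        for key, _ in sel.select():
            if key.fileobj is proc.stdout:
                block = os.read(proc.stdout.fileno(), chunk)
                if block:
                    out += block
                else:            # EOF: the child closed its stdout
                    sel.unregister(proc.stdout)
            else:
                try:
                    # os.write may accept fewer bytes than offered
                    n = os.write(proc.stdin.fileno(), pending[:chunk])
                except BlockingIOError:
                    n = 0
                except BrokenPipeError:
                    n = len(pending)   # child went away; drop the rest
                pending = pending[n:]
                if not pending:
                    sel.unregister(proc.stdin)
                    proc.stdin.close()
    sel.close()
    proc.stdout.close()
    proc.wait()
    return bytes(out)
```

Reading whenever output is available, rather than only after all the input
has been written, is what stops both pipe buffers filling up at once.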


Fri Sep  6 19:57:37 BST 2024

video editing

* there are ppp credentials onscreen at 13:00

* should finish at 33:00


Sat Sep  7 22:21:12 BST 2024

what is causing these messages?

@400000000001620127a096b3 watch-for-modem-modeswitch /nix/store/4q3swc0mg28ja30anap4id3gics3h7fk-lua-tty-mips-unknown-linux-musl/bin/lua: ...lr-uevent-watch-mips-unknown-linux-musl/bin/uevent-watch:4: unexpected symbol near '/'
@400000000001620127a0ec7e watch-for-modem-modeswitch stack traceback:
@400000000001620127a11c42 watch-for-modem-modeswitch    [C]: in function 'dofile'
@400000000001620127a14dec watch-for-modem-modeswitch    (command line):1: in main chunk
@400000000001620127a41b1c watch-for-modem-modeswitch    [C]: in ?
@

Sun Sep  8 10:15:56 BST 2024

If we could produce logs in JSON then we could push them to zinc (or
elasticsearch, which has the same api). We'd like fields for
timestamp, message, pid, host.

* we can add host when we post to elasticsearch,  no need to repeat it
on every field

* there is no (sensible) way to get the pid of the other end of a pipe.
But we could print it from the sender before execing the process. But then
it'll only appear once instead of every entry. Maybe we could log the
logger pid as well, then we can correlate

TBH given that we already have to process the log lines to get them
into zinc, and that we already can unambiguously parse the log line
(provided we disallow whitespace in the service name, and we mandate
that the message is always the final field) there's not much value in
producing a different json.
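
To show that the existing format really is unambiguous, here is a sketch of
the parse (Python; function and field names are invented). It assumes each
line is "TAI64N-label service message" with single spaces after the first
two fields. The TAI64N label is 16 hex digits of seconds (offset by 2^62)
followed by 8 hex digits of nanoseconds; the leap-second offset between TAI
and UTC is ignored here.

```python
import json

TAI64_OFFSET = 1 << 62  # TAI64 labels add 2^62 to the second count

def parse_line(line):
    """Split a stamped log line into timestamp, service name and message."""
    stamp, service, message = line.rstrip("\n").split(" ", 2)
    if not (stamp.startswith("@") and len(stamp) == 25):
        raise ValueError("not a TAI64N label: %r" % stamp)
    return {
        "seconds": int(stamp[1:17], 16) - TAI64_OFFSET,
        "nanoseconds": int(stamp[17:25], 16),
        "service": service,
        "message": message,  # always the final field, may contain spaces
    }

def bulk_entry(index, record, host):
    """Format one record as a bulk-API action line plus a document line."""
    action = json.dumps({"index": {"_index": index}})
    document = json.dumps(dict(record, host=host))  # host added at post time
    return action + "\n" + document + "\n"
```

A line with an empty message would fail the three-way split, so a real
processor would need to decide what to do with those.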

Actually the logger pid probably won't help us tell when the service
has been restarted, because the logger won't be restarted at the same
time due to fdholder stuff

so perhaps there are no logging changes we can easily/reasonably make
and we should just write a log processor that ships to a collector.

- open connection to zinc (s6-tlsclient)
- send http headers
- while not eof(stdin)
 - read line
 - split fields
 - send command, send data

{ "index" : { "_index" : "olympics" } }
{"Year": 1896, "City": "Athens", "Sport": "Aquatics", "Discipline": "Swimming", "Athlete": "HAJOS, Alfred", "Country": "HUN", "Gender": "Men", "Event": "100M Freestyle", "Medal": "Gold", "Season": "summer"}


we can't calculate content-length. maybe we can use chunks

Transfer-Encoding: chunked

size-of-chunk-in-hex CRLF
chunk-data  CRLF

0 CRLF
CRLF
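
The framing above as a small Python sketch (helper names are made up).
Each chunk is its length in hex, CRLF, the data, CRLF; a zero-length chunk
followed by an empty line ends the body, so empty pieces must be skipped
rather than sent.

```python
LAST_CHUNK = b"0\r\n\r\n"

def encode_chunk(data):
    """Frame one non-empty piece of the body as an HTTP/1.1 chunk."""
    if not data:
        raise ValueError("a zero-length chunk would terminate the body")
    return b"%x\r\n" % len(data) + data + b"\r\n"

def chunked_body(pieces):
    """Yield framed chunks for each piece, then the terminating chunk."""
    for piece in pieces:
        if piece:
            yield encode_chunk(piece)
    yield LAST_CHUNK
```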

to generate test data:
$ nix-shell -p s6 --run " sort --random-sort ~/src/liminix/THOUGHTS.txt  | head -1 | sed 's/^/servicename /g' |tr -cd '[a-z0-9 ]' |  s6-tai64n"

Sun Sep  8 16:39:55 BST 2024

* how do we add incz to the logging infra and configure it?
* how do we get zinc on loaclhost to be visible to test lan (port forward
 on border?)
* shall we rig up a service on loaclhost so that zinc starts at boot?

Mon Sep  9 17:58:46 BST 2024

We can use this as a log processor. However, a log processor doesn't
ship the segment until the log writer has finished with it, therefore,
some latency is introduced.

We can write logs to the network as they are generated. However, what if:

- the network is not available
- the collector is not keeping up

s6-log says "if a processor fails, s6-log will try it again after some
cooldown time.".  Laurent says "if a processor fails [if you're using
-b] then the rotation cannot happen, and s6-log will stop reading
until the processor succeeds.  Without -b, logs keep accumulating in
RAM, and s6-log may crash if it runs oom before the processor
succeeds"

.... so, maybe we shouldn't use log processors here.

shipping finished log segments outside of the s6-log framework is
quite straightforward. the issue is how to send the in-progress log.
Challenges

- if we are sending the in-progress log, how _not_ to resend all the
same entries when the segment is rotated
- how do we know when the segment is rotated and should start reading
the new file

Maybe

1) we have a logshipper service that listens on a unix socket
2) s6-log is hooked to the logshipper-client logger, which checks
 for the unix socket and only writes data to it if it exists.
 Probably it should check periodically for the socket to exist
 and not just try it on every write
 (Would be good if it could tell whether the socket had a
 listener or not. Maybe abstract sockets)
 (and also writes to stdout so the s6 logging chain is unbroken)
3) when logshipper is ready, it reads all past log entries whose
 timestamps are from before when it started, and writes them.
 and/or it could write a cookie to the log, then it would know
 to stop reading the logs when it encounters its own cookie
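
The cookie variant of step 3, sketched in Python (names are hypothetical,
and the stored log is assumed to be available as an iterable of lines,
oldest first). The shipper writes a unique cookie to the log when it
starts, then replays the stored lines until it meets that cookie; anything
after it will arrive over the socket instead.

```python
import uuid

def make_cookie():
    """A marker unique to this run of the shipper."""
    return "logshipper-start " + uuid.uuid4().hex

def backlog(lines, cookie):
    """Yield stored lines up to, but not including, the line with our cookie."""
    for line in lines:
        if cookie in line:
            return
        yield line
```

If the cookie has been rotated away before the shipper gets to read it,
this replays everything it is given, which is the safe direction to fail in
but does mean duplicates.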

Mon Sep 16 19:54:35 BST 2024

incz won't work as-is because it uses stdin/stdout for communicating
with http and reads the logs from a filename.  unless we make it use
/proc/self/fd/3 for the filename? Even then, ideally we kind of want
to adapt it for streaming

Wed Sep 18 18:23:38 BST 2024

we can run

socat tcp-listen:19612,reuseaddr,fork stdout | s6-log -b /var/log/clients

on the log collection host (or use openssl-listen if you're going to set up ssl certs)

then our logshipper program can basically be "open socket and cat to s6-tcpclient" (though it would be better if we can add the hostname in the process)

let's make the logging script a config option


pipeline { s6-ipcserver -1 /run/uncaught-logs/shipping }
pipeline { s6-tcpclient loghost:19612 }
fdmove -c 1 7
cat

Wed Sep 18 20:32:54 BST 2024

s6-log will create its directory but the parent must exist. incidentally,
putting /run/uncaught-logs in pseudofiles is pointless because /run is a
mountpoint

ip addr add 10.0.2.15/24 dev lan

s6-rc -d change sshd; s6-rc -u change sshd;

Sun Sep 22 21:13:15 BST 2024

This works for the collector (but note that it collects logs from
*anywhere* that can write to that port, so please firewall responsibly)


  systemd.services."s6-log-collector" = {
    after = [ "network.target" ];
    wantedBy = [ "multi-user.target" ];
    serviceConfig = {
      Type = "exec";
      WorkingDirectory = "/var/log";
      ExecStart = ''
        ${pkgs.bash}/bin/sh -c "${pkgs.socat}/bin/socat tcp4-listen:17345,reuseaddr,fork stdout | ${pkgs.s6}/bin/s6-log -b /var/log/remote"
      '';
    };
  };


Tue Sep 24 18:14:42 BST 2024

"Raw" TCP is not the ideal transport for logs because I don't want the
whole internet able to write to my log server, and writing
only-from-the-LAN iptables rules is messy with a gazillion ipv6
addresses to account for.

SSL with client certificates would be nice, but there is the issue
of how to get the private key onto the device and sign it. My idea is

1) device generates a private key on first boot (or every boot, if no
persistent storage). Private key includes some field with a value that
was set at build time (PSK, effectively)

2) there is an API-driven CA signing thingy that the client can use to
get a cert based on their key. It checks the field for the presence
of the PSK. It should probably expose HTTPS only so that the client
can be sure it's getting signed by the correct CA


3) the log collector refuses connections unless the client is signed by the
local CA

This is quite a lot of work insofar as it would appear to require
writing the CA.

We could alternatively do something much more ad hoc where the client
just writes the PSK to the server when it opens the stream, before
sending any data. We'd have to write the server end, then, instead of
just using socat - but that is probably less work than an API-driven CA.
On the other hand, TLS logs would also be encrypted which is a good thing
if the LAN is not trusted.
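
For comparison, the server end of the ad hoc PSK idea is about this much
code (Python sketch; accept_stream and the one-line preamble format are
assumptions, and file-like objects stand in for the socket and the s6-log
pipe). Note this authenticates the client but, unlike TLS, encrypts
nothing, so the PSK itself crosses the wire in the clear.

```python
import hmac
import io

def accept_stream(stream, sink, psk):
    """Copy log lines from stream to sink only if the first line is the PSK."""
    first = stream.readline(256).rstrip(b"\r\n")  # bounded read of the preamble
    if not hmac.compare_digest(first, psk):       # constant-time comparison
        return False
    for line in stream:
        sink.write(line)
    return True

# Usage with in-memory streams standing in for the socket and the log pipe:
sink = io.BytesIO()
ok = accept_stream(io.BytesIO(b"sekrit\nline one\n"), sink, b"sekrit")
```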

Sat Sep 28 16:04:15 BST 2024

OK, so we wrote the CA.

To do HTTPS on the client we need

1) to generate a csr
2) to https it to the server
3) store the generated thingy as a service output

looking at x86-64 sizes for ballpark

-r-xr-xr-x 1 root root 987K Jan  1  1970 /nix/store/s45wy1ssim1dkxzligx09xjp4n0668i2-openssl-3.0.14-bin/bin/openssl
-r-xr-xr-x 1 root root 263K Jan  1  1970 /nix/store/z28bxdnsw2gr1xwx7qj6px9iz5sr84i9-lua-5.3.6-env/lib/lua/5.3/_openssl.so

suggesting that we'd use _less_ disk doing the whole thing in lua than
by shelling out to the openssl binary

Sun Sep 29 10:20:49 BST 2024

We need luaossl support for setting attributes in a CSR

https://www.rfc-editor.org/rfc/rfc2986#page-5

   Attributes { ATTRIBUTE:IOSet } ::= SET OF Attribute{{ IOSet }}

   CRIAttributes  ATTRIBUTE  ::= {
        ... -- add any locally defined attributes here -- }

   Attribute { ATTRIBUTE:IOSet } ::= SEQUENCE {
        type   ATTRIBUTE.&id({IOSet}),
        values SET SIZE(1..MAX) OF ATTRIBUTE.&Type({IOSet}{@type})
   }

I don't understand this 100% but it looks like the raw data is _not_
the same format as an x509 attribute. See e.g.
https://github.com/golang/go/commit/e78e654c1de0a7bfe0314d6954d42b046f14f1bb#diff-a789286d7e257f148c437404f8cf5d3379688597381ff13352e62ac406be295aL1712
in support of my hypothesis (background: iiuc, "critical" is a boolean flag,
but x509 attributes aren't allowed to be booleans)

https://en.wikipedia.org/wiki/Certificate_signing_request

  388:d=2  hl=2 l=  35 cons:   cont [ 0 ]
  390:d=3  hl=2 l=  33 cons:    SEQUENCE
  392:d=4  hl=2 l=   9 prim:     OBJECT            :challengePassword
  403:d=4  hl=2 l=  20 cons:     SET
  405:d=5  hl=2 l=  18 prim:      UTF8STRING        :loves labours lost
  425:d=1  hl=2 l=  13 cons:  SEQUENCE
  427:d=2  hl=2 l=   9 prim:   OBJECT            :sha256WithRSAEncryption


(csr:getAttribute "challengePassword")
(csr:getAttributeNames)
(csr:setAttribute "challengePassword" :IA5STRING ["loves labours lost"])

how do we know the asn1 type of the attribute values? it looks like they're
defined by the object: see e.g. https://www.rfc-editor.org/rfc/rfc2985#page-16

   A challenge-password attribute must have a single attribute value.

   ChallengePassword attribute values generated in accordance with this
   version of this document SHOULD use the PrintableString encoding
   whenever possible.  If internationalization issues make this
   impossible, the UTF8String alternative SHOULD be used.  PKCS #9-
   attribute processing systems MUST be able to recognize and process
   all string types in DirectoryString values.

crypto/asn1/tbl_standard.h:    {NID_pkcs9_challengePassword, 1, -1, PKCS9STRING_TYPE, 0},
include/openssl/asn1.h.in:# define PKCS9STRING_TYPE (DIRSTRING_TYPE|B_ASN1_IA5STRING)

I assume there's something in openssl that will do lookups in this table
to give us the type for the oid, then maybe something in luaossl that
would lua-ize it?

or we could put that burden on the caller, as x509.name:add does

Sun Sep 29 20:46:16 BST 2024


OBJ_txt2nid("challengePassword"); works with short/long names

the get call is complicated because there can be multiple
attributes with the same type. There probably aren't but ...

(csr:getAttribute "challengePassword") => multivals attr, index

(csr:getAttribute "challengePassword" index) => multivals attr, index

(csr:addAttribute "challengePassword" :IA5String ["loves labours"])

(csr:clearAttribute index)


Tue Oct  1 21:55:25 BST 2024

on server, we need to reconfigure socat to give it our CA cert and
expect peer to authenticate

on client, I am not sure if we can persuade s6-tlsclient to use the same
file for cert and private key.  perhaps certifix-client could write
two separate files with --out-key and --out-certificate


CAFILE=ca.crt KEYFILE=client.key CERTFILE=client.crt s6-tlsclient -k localhost -y -v localhost 19612 socat 'fd:6!!fd:7'  -


socat ssl-l:19612,reuseaddr,fork,cert=server-combined.pem,cafile=ca.crt  stdout

suggest creating /var/lib/s6-log-collector/{private,cert} on loaclhost
wherein we keep the server and ca keys, then socat and the signing
server can both see them


ip addr add 10.0.2.15/24 dev lan

Sat Oct  5 22:35:41 BST 2024

We had it working in a VM, and the service is installed on loaclhost

TODO

[done] 1) make a module-based service for client-cert
caCertificateFile
secretFile
subject
url

[done] 2) make the shipping service a consumer-for

[not by much] 3) can we reduce the verbiage in the shipping service somehow?

4) rebuild an actual device with all this stuff

Tue Oct  8 23:50:00 BST 2024

idea: create outputs.update which builds a systemConfiguration and
also a result/install.sh which does  min-copy-closure and
restart-services as per liminix-rebuild, then we don't have
to nix-shell liminix-rebuild

nix-build ../liminix/ -I liminix-config=hosts/rotuer.nix --argstr deviceName turris-omnia -A outputs.update-system  -o rotuer && ./rotuer/install.sh

this would make it much more straightforward to build a bunch of hosts
using a Makefile

idea 2: when a configuration contains levitate, something similar
but necessarily more "manual" to do the analogous thing


Sun Dec 15 18:55:55 GMT 2024

Where we left off with this, rotuer was crashing randomly or failing
to boot every time we tried to add log shipping, which is not very
ideal. I started doing something with logging to /dev/pmsg0
(CONFIG_PSTORE_PMSG) but I think (there seems not to be anything
written down :-( ) that the gl-ar750 kernel needs it added to kconfig and device tree

https://wiki.postmarketos.org/wiki/User:Knuxify/Enabling_pstore_and_ramoops

we could add a new hardware.dts.dtsi = [] option so that any module
could add a new chunk of dts. (Ideally we'd call it `includes`
but that conflicts with the existing use of `includes` to specify
search path. Maybe rename?)

would we ever use it except in a hardware device definition?
(Or user config?) I guess if we were consistent with names
then we could set up nodes in the device file with status="disabled"
and enable them in the module, except that dt doesn't consistently
use status and in fact there isn't one for reserved-memory

we could use global config to enable pstore_msg and check it in
the device module to enable the needed hw support

Tue Dec 17 23:39:28 GMT 2024

I think we can just stick a tee in the fallback logger pipeline that
writes to /dev/pmsg0

Need to check it's a circular buffer

do we want to do anything about recovering the log on boot?
- we could just copy it to /run/log
- if we have backfilling for shipped logs (we don't yet)
  then we might want to ship it - but that may result in duplicate
  logs if some of it was shipped before the crash

perhaps we should truncate pmsg0 on orderly shutdown? or maybe it's
good to see the late shutdown logs.

Thu Dec 19 13:40:39 GMT 2024

although we have PSTORE_foo in the omnia kconfig, I think this might
be just because I copied it from RT3200

Thu Dec 19 14:15:43 GMT 2024

Omnia is not in ci.nix, and it's not trivial to add it because there
is no output in the ci.nix configuration that makes sense for omnia.

... OK, fixed by adding system-configuration as an independent module
and importing in device config

Thu Dec 19 21:59:47 GMT 2024

The build-system shell script in outputs.systemConfiguration
is ugly and requires we do bad things to avoid sucking build
system stuff into the config

I propose we make it a separate derivation.

But first maybe we could improve some names

Sun Dec 22 14:23:02 GMT 2024

MT7622> echo $boot_default
if env exists flag_recover ; then else run bootcmd ; fi ; run boot_recovery ; setenv replacevol 1 ; run boot_tftp_forever
MT7622> echo $bootcmd
if pstore check ; then run boot_recovery ; else run boot_ubi ; fi
MT7622> echo $boot_ubi
ubi part ubi && run boot_production ; run boot_recovery
MT7622> echo $boot_production
led $bootled_pwr on ; run ubi_read_production && bootm $loadaddr#$bootconf ; led $bootled_pwr off
MT7622> echo $ubi_read_production
ubi read $loadaddr fit && iminfo $loadaddr && run ubi_prepare_rootfs
MT7622> echo $ubi_prepare_rootfs
if ubi check rootfs_data ; then else if env exists rootfs_data_max ; then ubi create rootfs_data $rootfs_data_max dynamic || ubi create rootfs_data - dynamic ; else ubi create rootfs_data - dynamic ; fi ; fi
MT7622> echo $bootconf
config-1
MT7622> run boot_ubi
UBI partition 'ubi' already selected
No size specified -> Using max size (126976)
Read 126976 bytes from volume fit to 0000000048000000

## Checking Image at 48000000 ...
Unknown image format!
No size specified -> Using max size (7491584)
Read 7491584 bytes from volume recovery to 0000000048000000
## Loading kernel from FIT Image at 48000000 ...
   Using 'config-1' configuration
   Trying 'kernel-1' kernel subimage
     Description:  ARM64 OpenWrt Linux-6.6.45

Sun Dec 22 17:39:56 GMT 2024

From the output above it looks like the device I have plugged in has
an openwrt "recovery" image but not a "production" image

It also looks like it will be quite hard work to persuade it to boot
from usb or anything. It doesn't have any of the extlinux stuff
but it does have uefi for what that's worth

default boot_production command reads a ubi volume called 'fit' and
calls bootm on what it finds

we could define boot_production to ubifsmount liminix; ubifs load
<addr> <filename> (which is a fit) and bootm it. *presumably* we could
do this from the openwrt recovery image

but could we install the whole system using said recovery image?  I
expect we could do, it only requires getting a tarball onto it and
unpacking it

however, extlinux is not going to be helpful
(actually it might be a bit, if we ask it to write the fit as
well as/instead of the individual files)

maybe we need separate concepts of "the filesystem contains
stuff we need for boot" and "the stuff we need is the stuff
that extlinux needs"

each bootloader makes an output called bootfiles, and
the bootablerootdir output copies from bootfiles

Mon Dec 23 18:28:50 GMT 2024

it might be worth moving ubi option decls into the hardware module, if
they're hardware-dependent

Tue Dec 24 16:14:50 GMT 2024

Where next?

a config for haproxy would be good

> a connection or request arrives on a frontend, then the
information carried with this request or connection are processed, and
at this point it is possible to write ACLs-based conditions making use
of these information to decide what backend will process the request.

http://docs.haproxy.org/3.1/intro.html#3.4.4

listen foo
	bind :443

frontend foo
	mode tcp

Thu Dec 26 14:27:26 GMT 2024

What's the plan?

1) build the updater target for rotuer
2) ssh forward through bordervm to install it
3) carve the nginx sni proxy into examples/module-sniproxy
  with a fat warning comment
3.5) should we add the examples to ci?
4) see if it builds
5) using curl on bordervm, see if it forwards
5000) swap the real rotuer hardware


Sun Dec 29 18:22:42 GMT 2024

To make sniproxy work it needs dns, which means it needs an upstream.
But if we move the bordervm cable from lan to wan, we won't be able to
rebuild over ssh unless it's sufficiently unbroken for pppoe to be working

... unless we add a "management" address to the wan interface along with pppoe

to permit wan stuff through the firewall, now that we've fixed the
firewall :embarrassed:, do

/nix/store/*nftables*/bin/nft insert rule table-ip input-ip4 position 19  iifname "wan" jump incoming-allowed-ip4

What else do we need to try rt3200 as prod rotuer?

- pstore logging
- log shipping (maybe via wan to bordervm for now)
- enable sniproxy module

zcat /proc/config.gz  | grep PSTORE

Tue Dec 31 22:45:33 GMT 2024

/nix/store/i1khbsqpyx020xrhvfbdazc1bnmirc72-kernel-aarch64-unknown-linux-musl-modulesupport/vmlinux


these 3 derivations will be built:
  /nix/store/fvxrvyx64cm86cc4na0584qj9xw6s406-kernel-aarch64-unknown-linux-musl.drv
  /nix/store/ksg93bd8x2z1xpfmj24ixm2srg547mjx-dtb-aarch64-unknown-linux-musl.drv
  /nix/store/n33ij1npzwjmlsw23sz6yhyrh9hgxwaw-kernel.image-aarch64-unknown-linux-musl.drv
building '/nix/store/fvxrvyx64cm86cc4na0584qj9xw6s406-kernel-aarch64-unknown-linux-musl.drv'...
zcat /proc/config.gz  | grep PSTORE

we think tftpboot works (check!)
build the FIT from the updater target with squashfs added in rotuer.nix
boot it with tftp by copy-pasting the boot.scr and changing it

setenv serverip 10.0.0.1
setenv ipaddr 10.0.0.8
tftpboot 0x4007ff28 r2/rootfs; tftpboot 0x41086f28 r2/image; tftpboot 0x4107ff28 r2/dtb
 bootm 0x41086f28 - 0x4107ff28

tftpboot works
tftpboot with outputs.uimage works
1458ed2cdeaa6bf4c02c6511a6d7369d  /nix/store/4n3jc4n7lsp73yc2ccnmgibc7079rxms-kernel.image-aarch64-unknown-linux-musl

$ grep fit /nix/store/ihi7vski37b9ph4xcyqpabymc5nsngjd-system-configuration-aarch64-unknown-linux-musl/etc/nix-store-paths
/nix/store/5jphs9mnb8hzhvwfi2z8h6cnaz6qx4y6-boot-fit
$ md5sum /nix/store/5jphs9mnb8hzhvwfi2z8h6cnaz6qx4y6-boot-fit/fit
1458ed2cdeaa6bf4c02c6511a6d7369d  /nix/store/5jphs9mnb8hzhvwfi2z8h6cnaz6qx4y6-boot-fit/fit

something is changing /persist/boot from a symlink to a directory and I don't know what
... plot twist: or is it? maybe it's ubimage that makes it a directory

rt3200 has no pmsg-size according to  dtc -I fs  /proc/device-tree/ -O dts

should be more like

reserved-memory {
		ramoops@03f00000 {
			compatible = "ramoops";
			reg = <0x03f00000 0x10000>;
			pmsg-size = <0x10000>;
		};
	};
};

Wed Jan  1 14:57:35 GMT 2025

At last I have working persistent logging.

I think we need to do something at boot time to move the persistent logs into
the regular s6 log thing

      foreground {
        redirfd -w 1 /run/prior-boot2
        elglob file /sys/fs/pstore/* cat $file
      }
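
A rough sketch of what that step does, in Python rather than execline
just to pin down the behaviour. The function name and the destination
path are made up for illustration; the only thing taken from above is
"concatenate everything in the pstore directory into one file".

```python
from pathlib import Path


def collect_pstore(pstore_dir: Path, dest: Path) -> int:
    """Concatenate every regular file in pstore_dir into dest.

    Files are taken in sorted name order, which is also the order a
    shell-style glob would give. Returns the number of files copied.
    """
    records = sorted(p for p in pstore_dir.iterdir() if p.is_file())
    with dest.open("wb") as out:
        for record in records:
            out.write(record.read_bytes())
    return len(records)
```

On the device this would be something like
collect_pstore(Path("/sys/fs/pstore"), Path("/run/prior-boot")), run
once before the regular logger starts.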

Thu Jan  2 14:31:11 GMT 2025

to change to a previous "generation" we could run any of
"/nix/store/*system-configuration*/bin/install /mnt" from a rescue
system. It would populate /boot and bin/activate

supposing we are in such a rescue system, how do we find *which*
system-configuration is the one we want to revert to? The derivation
should be pure, so if we're going to timestamp anything we have to do
that in the imperative step i.e. update.sh

perhaps a symlink from /persist/configuration/yyyymmddThhmmss -> /nix/store/eeee-blah
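
A sketch of the imperative half, assuming the timestamp is taken at
install time and is the only thing that distinguishes generations. The
function names are hypothetical and update.sh would do this in shell;
Python is used here only so the idea can be checked.

```python
import os
import time
from pathlib import Path
from typing import Optional


def record_generation(config_dir: Path, store_path: str,
                      now: Optional[float] = None) -> Path:
    """Create config_dir/<UTC timestamp> as a symlink to store_path."""
    stamp = time.strftime("%Y%m%dT%H%M%S", time.gmtime(now))
    config_dir.mkdir(parents=True, exist_ok=True)
    link = config_dir / stamp
    os.symlink(store_path, link)
    return link


def generations(config_dir: Path):
    """List (timestamp, store path) pairs, oldest first."""
    return [(p.name, os.readlink(p))
            for p in sorted(config_dir.iterdir())
            if p.is_symlink()]
```

Because the names sort lexically in time order, a rescue system only has
to list the directory to see which system-configuration was installed
when.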


Thu Jan  2 23:38:59 GMT 2025

Stuff we should tidy up:

1) all devices to set credible default output, rootfs, etc
2) expunge remaining references to kexecboot
3) dynamic uid assignment for users
4) lessen logging noise from firewall
5) update.sh --fast sends wlan services for a spin



Fri Jan  3 16:05:10 GMT 2025

* build actual rotuer config and install on rt3200
* find usb ethernet dongle for laptop
* plug it in?

Sat Jan  4 15:32:03 GMT 2025

didn't work so well, because the installer version is coupled with the
dtb and the dtb can't be upgraded without upgrading the kernel

kernel 6.6.67 builds and boots but seems to have no wired ethernet.
looking through the openwrt config changes ...

+CONFIG_MTD_UBI_NVMEM=y
+CONFIG_NVMEM_BLOCK=y
+CONFIG_NVMEM_LAYOUT_ADTRAN=y
+CONFIG_PHYLIB_LEDS=y
+CONFIG_MTK_REGULATOR_COUPLER=y
+CONFIG_FW_LOADER_SYSFS=y


Sun Jan  5 12:58:52 GMT 2025

We are running with rt3200 and everything appears to work :-)

Sun Jan  5 20:34:18 GMT 2025

what customization do we want from the firewall?

 - what's allowed wan->lan
 - what's allowed lan->wan
 - which dropped packets get logged or don't

plus fix whatever it was RoS found
plus stop hardcoding the interface names

 Q: if pppd _makes_ an interface, how do we know what the name of it
 is going to be before it's up and passing packets so that we can
 have the firewall active before it starts

 or could we have ppp service install a "drop everything" firewall
 before it starts? what if there's more than one upstream interface,
 they shouldn't wipe each other out

 so can we have a "default deny" firewall in which every allowed flow
 is qualified by the interface name, and any service that brings up an
 interface is required to add firewall rules for it according to its
 role

 or maybe the firewall service could watch for interfaces being added
 (and removed) and update the ruleset as appropriate for the interface
 role (lan, wan, dmz, management, guest, ???). but how does it know
 the role based on the interface name?

Tue Jan 14 19:33:28 GMT 2025

each interface can add its own chain when it comes up, and then we can
figure out some way to jump to the correct chain (vmaps) based on
interface name

% nft add map filter mydict { type ipv4_addr : verdict\; }
% nft add element filter mydict { 192.168.0.10 : drop, 192.168.0.11 : accept }
% nft add rule filter input ip saddr vmap @mydict

_but_ we might be better off declaring static "zones" (lan, world,
dmz, guest, etc etc) with a map for each, and then replace the hardcoded
interface names with a map lookup

% nft add map filter ifzone { type ifname : string ; }


zones  { type ifname: ipv4_addr\; }
 ...  and presumably we also need identical maps for nat and
 ...  any other chain type where we need to distinguish
 ...  inside from outside
% nft add element nat porttoip { 80 : 192.168.1.100, 8888 : 192.168.1.101 }

% nft add rule ip nat postrouting snat to tcp dport map @porttoip

Mon Jan 20 20:32:58 GMT 2025

1) maybe we can add a type="chain" or type="set" attribute to the
firewallgen input, then we can have sets in the default firewall rules

2) then we convert the default firewall rules to use sets instead of
hardcoding ifname

3) then find a nice place to hook "new interface is available" and
add it to the appropriate zone (separately for ip4 and ip6, gah)

4) and how do we find

# nft add element ip table-ip lan { int }
# nft add element ip table-ip wan { ppp0 }
# nft add element ip6 table-ip6 lan { int }
# nft add element ip6 table-ip6 wan { ppp0 }

Tue Jan 28 21:51:46 GMT 2025

Going back and forth on the firewall stuff, in respect of where to aim
for with "general" vs "useful". Specifically, if we have hooks
every time interfaces are added/removed that
expect some specific sets to (a) exist and (b) have particular
semantics, there is necessary coupling between those hooks and the
firewall definition. So, practically speaking, the "a new interface
appears" rules need to be bundled with the firewall ruleset

Which also means that the firewall needs to know which zone the
interface is assigned to, which is a problem if it can't tell from the
name (for example wg0 wg1 wg2 ... could be wan or lan or dmz or
anything)

So the service that owns the interface needs to communicate "another
one for the lan zone" to the firewall and it can't do that by adding
to sets directly unless it knows what the sets are called. Implying we
need an interface between the interface service that knows "new
interface ppp7 added in wan zone" and the firewall service that knows
how to accommodate this. For extra credit this actually should be more
like pubsub: the interface shouldn't really have to know the firewall
exists.

outputs?

- what if each interface wrote a "zone" output and the firewall
subscribed to them? would need the firewall to know which of all the
available services were interfaces

- a zone service that every interface in the zone depends on. it
doesn't do much in itself but it means the interface updown scripts
know a service directory where they can touch lan-zone/eth0 or
whatever. This could work.  The firewall service definition specifies
the zone services and uses inotify watcher thingies to update
interface sets when contents change.

Wed Jan 29 17:19:24 GMT 2025

1) make a zone service defn that can be instantiated for each zone.
it should create $output/interfaces

2) add a `zone` attribute to interface definitions, causing
- the zone service to be added to the dependencies
- the interface "up" script to include writing to the zone/interfaces output

2b) any other service that creates an interface (e.g. ppp) needs to also
have `zone` and do the same

3) firewallgen to be able to make sets

4) firewall service to watch the zone outputs

Fri Jan 31 17:11:16 GMT 2025

Do we need zone services? I think we could put zones in the outputs of
the firewall service?

Sun Feb  2 20:59:56 GMT 2025

What's the smallest first step?

 - [done] how can we make firewallgen output sets (or could we
   make the firewall service tack them on afterwards)

 - make a longrun that watches its own zones output and updates the
   appropriate sets

The sticking point is that if you give the firewall `rules` instead of
`extraRules` then the longrun may or may not work depending on (1)
whether you made the zone sets; (2) whether your rules use
them. Conclusion: if you supply `rules` then you also have to say
whether you want the longrun or not. So add a param
watchForInterfaceUpdates which defaults true

Mon Feb  3 21:12:55 GMT 2025

the thing that updates sets has to know they exist, so the interface watcher
service must live in the firewall module

the firewall service defn should return the firewall service after
adding the interface watcher as a dependency of it. Or: the watcher
should make the sets and then the firewall service could depend on _it_.
That would mean that the firewall service would fail if it used sets
that the watcher didn't make, is that good or bad or indifferent?

the interface services have to know about the watcher as well in order
to write into its outputs, so it can't be hidden inside the module

maybe the watcher service should _be_ the firewall service.

we could add a "notify" param to an interface which would be an output
reference to (the firewall service / zones / lan ) that the interface would
write its ifname into when the service is up

Wed Feb  5 00:14:29 GMT 2025

another thought: the firewall service could have params to say
which interface services are in which zones

we'd have to ensure that the interface services did not end up as
dependencies of the firewall

then the firewall could

- create the sets
- watch each interface service for the ifname output and add it to the right zone

Sun Feb  9 21:33:57 GMT 2025

nft update set @lan

echo 'flush set table-ip lan;  add element table-ip lan { eth0,lo }' | nft -f -

Tue Feb 11 18:30:09 GMT 2025

outstanding for 1.0:

1) security audit feedback

a) ask RoS if I can ship their report, with a response doc
 showing the commits that address each finding/non-finding
b) firewall rules: icmp rate limit, DNS, doc for icmpv6 packet dropping
c) look over env var inputs and parse them properly instead of
  string glommeration

2) docs:
 - for each device, add "finishedness" status and link to build status
 - generally read them over and spruce up
 - porting guide

3) some kconfig magic to generate minimal kconfig files so that
device modules don't end up as copy-pastes of the openwrt defconfig


---

apparently 5% of available bandwidth is a reasonable rate limit for
icmp

% nft add rule filter input limit rate over 10 mbytes/second drop

but nftables has no way to get interface bandwidth and indeed nor does
the device generally: the 1000Mb/s ethernet interface might be
connected to a 70Mb/s pppoe upstream and how would it know?  So the
site operator needs to say somewhere what the upstream bandwidth is.
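
Worked arithmetic for the 5% figure, assuming the operator states the
upstream rate in kbit/s (the unit and the function name are assumptions,
nothing in the firewall module does this yet). Integer maths throughout,
so there is no float rounding to worry about.

```python
def icmp_limit_bytes_per_second(upstream_kbit: int, percent: int = 5) -> int:
    """Return `percent` of the upstream rate, expressed in bytes/second."""
    bytes_per_second = upstream_kbit * 1000 // 8
    return bytes_per_second * percent // 100


def limit_rule(upstream_kbit: int) -> str:
    """Render the limit in the same form as the nft example above."""
    rate = icmp_limit_bytes_per_second(upstream_kbit)
    return f"limit rate over {rate} bytes/second drop"
```

So a 70Mb/s upstream gives 8,750,000 bytes/second, and 5% of that is
437,500 bytes/second; the 1000Mb/s ethernet figure would give 6,250,000.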

Sun Feb 16 22:16:29 GMT 2025

we probably didn't need to write that service, we could have used the
thing that makes templated config files _and_ if we somehow contrive
to write the interface bandwidth as an interface output we could get
that the same way

if only I could remember how it worked :-)

----

* watch-outputs watches only _one_ service and is called with a list of
outputs inside that service, so not exactly what we need. we can
extend it easily enough to watch multiple services using poll() if we
can figure out the syntax we want. Luckily all the places that call it
go through modules/secrets/subscriber.nix so it's easy enough to change
existing uses

we could do
watch-outputs -r foo /nix/store/blah/.outputs/ifname   /nix/store/eee/.outputs/ifname  ...

or
watch-outputs -r foo /nix/store/blah:ifname   /nix/store/eee:ifname  /nix/store/eee:bandwidth

or

watch-outputs -r foo /nix/store/blah:ifname   /nix/store/eee:ifname:bandwidth

which I quite like insofar as it's shorter but has no other real merit
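
For what it's worth, the colon forms are easy to parse, and the second
and third spellings can collapse to the same thing. A sketch in Python
(watch-outputs itself is not written in Python; parse_watch_args is a
made-up name), assuming store paths never contain a colon:

```python
def parse_watch_args(args):
    """Map each service path to the list of outputs to watch.

    Accepts "service:out1:out2" as well as repeated "service:out"
    arguments for the same service.
    """
    watched = {}
    for arg in args:
        service, *outputs = arg.split(":")
        if not outputs:
            raise ValueError(f"no outputs named for {service}")
        watched.setdefault(service, []).extend(outputs)
    return watched
```

Either spelling of the /nix/store/eee arguments above ends up as
{"/nix/store/eee": ["ifname", "bandwidth"]}, so supporting both costs
nothing.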

then we need to decide how to represent an output reference in a firewall rule.
Since each rule is basically text already, might just put the handlebars straight in

let qq = builtins.toJSON ;
in "icmp6 limit rate over {{ tonumber(output(${qq (intf "service")}, ${qq (intf "bandwidth")})) / 20 }} bytes/second drop"

probably we should do a separate rule for each interface in the wan zone

Sun Feb 23 00:34:34 GMT 2025

looks like we have no tests for anything involving watched services or subscribers,
or if we do I can't see what

Thu Feb 27 20:47:03 GMT 2025

- use output-template to write firewall rule file
- wrap firewall in svc.secrets.subscriber.build (c.f. e745991) with zones as
   watched services
- put the handlebars in the firewall config

we have uncommitted changes to watch-outputs that I'm reluctant to
commit until I have some way to see if they're working. the pppoe test
will check both firewall zones so _should_ start to fail with the
current watch-outputs (because only one service) and then pass when we
put the new one in

Fri Feb 28 01:00:03 GMT 2025

Well, it works at least well enough to pass the test. There is an awful hack
though, because nftables doesn't accept "elements = { }" as valid syntax
for a set with no elements, so we post-process the file to wipe those lines

I wonder if we could instead create the set empty and then use the "other"
nftables format to generate commands that add the elements. If it's
all in the same file (or included files) it will continue to be atomic
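
A sketch of generating that command-style file, assuming the sets have
already been declared (empty) by the main ruleset and that an empty zone
simply gets no "add element" line. The function name is hypothetical and
the table name is the one used in the notes above.

```python
def zone_set_script(table, zones):
    """Render flush/add commands for each zone set.

    zones maps zone name -> iterable of interface names. A zone with
    no interfaces is flushed and left empty, which sidesteps the
    "elements = { }" syntax problem.
    """
    lines = []
    for zone, ifnames in sorted(zones.items()):
        lines.append(f"flush set {table} {zone}")
        if ifnames:
            members = ", ".join(sorted(ifnames))
            lines.append(f"add element {table} {zone} {{ {members} }}")
    return "\n".join(lines) + "\n"
```

The output would be fed to nft -f in one go, so the flush and the add
for each set are applied together.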

Other options

- is the nftables json format any better? we will have to rebuild it
  with json support, may be bigger
- write lua bindings to libnftables

Fri Feb 28 23:31:06 GMT 2025

adding json would add 76 + 88k to the image, but I think it would also
mean we have to rewrite all the default rules in json format



with json
[dan@loaclhost:~/src/liminix]$ du  result/
20      result/share/doc/nftables/examples
24      result/share/doc/nftables
28      result/share/doc
12      result/share/man/man3
16      result/share/man/man5
44      result/share/man/man8
76      result/share/man
60      result/share/nftables
168     result/share
36      result/etc/nftables/osf
40      result/etc/nftables
44      result/etc
76      result/bin
8       result/include/nftables
12      result/include
8       result/lib/pkgconfig
1172    result/lib
1476    result/

[dan@loaclhost:~/src/liminix]$ du /nix/store/l0zsvldsskiv52b4c9b21ziq5z1qr7vn-jansson-mips-unknown-linux-musl-2.14/
84      /nix/store/l0zsvldsskiv52b4c9b21ziq5z1qr7vn-jansson-mips-unknown-linux-musl-2.14/lib
88      /nix/store/l0zsvldsskiv52b4c9b21ziq5z1qr7vn-jansson-mips-unknown-linux-musl-2.14/

without:
[dan@loaclhost:~/src/liminix]$ du  result/
20      result/share/doc/nftables/examples
24      result/share/doc/nftables
28      result/share/doc
12      result/share/man/man3
16      result/share/man/man5
44      result/share/man/man8
76      result/share/man
60      result/share/nftables
168     result/share
36      result/etc/nftables/osf
40      result/etc/nftables
44      result/etc
76      result/bin
8       result/include/nftables
12      result/include
8       result/lib/pkgconfig
1096    result/lib
1400    result/


Sat Mar  1 23:43:17 GMT 2025

I don't think json is going to help because either we'd have to do

 elements = map (f: "{{ lookup(f, \"ifname\") }}") zones.${zone}

and there would be null elements in the places for the interfaces that don't exist
yet, or we'd have to write actual json syntax at runtime, at which point why don't
we write the trad nftables syntax instead?

let's write a firewall .nftables file consisting of the zone set
elements plus an "include" directive for the rest of the firewall. NOTE THAT
we may still need to template the rest of the firewall if we want to have
other variables (rate limits) in it, because the rules for that need to be
inserted ahead of the rules for accepting icmp, and there's no way to
do that without

Sun Mar  9 21:46:30 GMT 2025

OK, we have updating firewall zones that are good enough to pass the test
(and may even work ...).  We need

* to write a rule with a rate limit for incoming icmp6 for each interface,
capping it to 5% of the interface bandwidth

* some place to say what the bandwidth is, which I am thinking we could do with
a passthru attribute of some kind on services

{
  run = "mdcnvdfngkj";
  name = "dtret";
  properties.bandwidthKbps = 80000;
}

and insert logic in the up/run commands to copy the properties to
outputs.  Or we could use `data` or `env` directories, which means
that the properties would be there even while the service is down, if
that's a concern, but would need a different way to look them up. And
also, would we allow them to change? What if there's some kind of interface
where we _can_ interrogate the bandwidth at runtime?

Mon Mar 10 21:00:17 GMT 2025

The idea that occurred to me at lunchtime is "what if we made the
(svc:output ) method fall back to properties if no output was present".
To do this, we'd have to

(1) arrange for /nix/store/eeee-service/.properties to exist

 - add properties attribute to service functions
 - write them to .properties in liminix-tools/services/builder.sh
 - make sure they get passed through when provided to all the service
    builder functions

[done] (2) pass the store directory to svc.open instead of  ..../.outputs

(3) make service:output look in both places

(4) write the damn firewall rule

Mon Mar 17 21:13:36 GMT 2025

Argh why is it never simple?

We need to write a rate-limiting firewall rule for each interface to
restrict icmp on that interface. This is not easy to reconcile with
putting them in default-rules because how do we generate multiple
array elements by config file templating?

There are two things in my mind now:

1) could we have some better way of manipulating the firewall rules
such that the rules from different modules are composable

this is complicated somewhat by ordering: if every rule in a chain is
"drop" or "accept" then it's easy to add another, but if the same
chain does first one then the other, doing the other and then the one
will not work

today we do e.g.

input-ip6
-> reject-bogons
-> accept non-bogus-icmp
-> process per-zone allowlist
-> allow established,related on wan
-> allow all on lan (so why did we need an allowlist?)

could we express this in a less sequential form? the
specification of what's allowed


input-ip6 for wan: is input-ip6 with the wan allowlist
input-ip6 for lan: is input-ip6 with the lan allowlist


input-ip6 for ppp0: is input-ip6 with the wan allowlist with a rate
limit for icmp

ssh module wishes to modify the allowlist for lan/wan/both so that
it includes port 22

am wondering if we could do default deny and _all_ the rules
(except for bogons) are allows

maybe we have the concept of "subtraction": a rule can be an allow
preceded by some number of drops which (at least by convention; this is
probably hard to enforce) are "carve outs" of the packets that are
being allowed.

... it's hard to express the forward-ipv6 in these terms, though.
we end up with "some drops and then multiple accepts"

we have
(and (not (or drop1 drop2 ...)) (or accept1 accept2 ...))

and to add ssh we need to break into the second clause instead of
composing at the top level

(and (not (or drop1 drop2 ...))
     (or accept1 accept2 accept-ssh...))


then the icmp bogons composite rule is "drop weird icmp and then
allow what's left"

(side note: maybe we could use a map to do interface name -> bandwidth
for rate limiting)




a composite rule might be a bunch of denies and then an allow
for anything the

2) some kinda syntax for referencing outputs (or properties) that's not
just string interpolation


----

I think we could address the immediate problem by writing a rule for
rate-limiting that looks up the rate in a map, and some maps (with
extraText) that get the rates from service properties? And that would
suffice for addressing the RoS audit, at least

Tue Mar 18 18:48:22 GMT 2025

Unless the interface exists, we do not (at least, may not) know its
name because that's an output. So the fact that it has a permanent
property is not per se terribly useful

limit rate 50000 bytes / minute accept


nft add rule ip nat postrouting snat to ip saddr map { 192.168.1.0/24 : 10.0.0.1, 192.168.2.0/24 : 10.0.0.2 }

nft add rule table-ip input-ip4 ip daddr 2.2.2.3/32 limit rate iifname map { "eth0": 10, "ppp0": 20 } kbytes/second   accept

nft add map 'table-ip intf-limits  { typeof 5000000 ;  elements = { lan: 50000000, ppp0: 3500000 } ; }'

OK, we can't do rate lookup in a map because the nftables grammar only
supports a numeric literal for limit_rate_bytes. so we're back to writing
a collection of rules, one for each interface with an ifname output,
that sets the limit for that interface

* we could do this all in one element of a rules list, with newlines
between each actual rule

* we could add extraText to the ruleset syntax - but does it go at the
start of the rules or the end or somewhere in the middle? this is
almost worse

* we could pick up where we left off on march 17 and redesign the
  firewall module

gonna be option 1 isn't it?
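
Option 1 might look something like this: one string, one rule per line,
one line per interface. This is a sketch in Python to show the shape of
the output; the real thing would come out of the templating, and both
the function name and the exact match expression are assumptions that
have not been run through nft.

```python
def icmp6_rate_limit_rules(limits):
    """Build a single rules-list element from ifname -> bytes/second.

    Each line drops icmpv6 arriving on that interface once it goes
    over the interface's limit.
    """
    return "\n".join(
        f'iifname "{ifname}" meta l4proto icmpv6 '
        f"limit rate over {rate} bytes/second drop"
        for ifname, rate in sorted(limits.items())
    )
```

The limits only need an entry for interfaces whose ifname output
currently exists, so an interface that is down just contributes no line.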