Advanced Namespace Tools blog 22 December 2016

Working on 9front's new TLS boot option

Plan 9 has always had an option for "tcp boot" - which means attaching to a root fileserver over the network at startup. 9front has recently added a "tls boot" option, which is similar, but sets up an encrypted connection using TLS. I'd like to get this working and supported in ANTS.

I have made a first attempt at adding support for it in the plan9rc bootup script, but haven't succeeded in making it work yet. However, I haven't succeeded with the standard bootrc script either, so I think I may have an auth configuration issue in my systems, seperate from anything the plan9rc script is doing. On the client system I see an error like:

mount: mount /root: tls error

And on the server side I see this error:

/bin/aux/trampoline: dial net!$fs!9fs: cs: can't translate address: dns: resource does not exist; negrcode

The server error looks familiar, from other authentication issues. Perhaps some information is missing from /lib/ndb/local. Looking at the /rc/bin/service/tcp17020 file, it seems like I might need to be running standard 9fs on port 564, and the port 17020 service will just wrap it in a tls tunnel.

So, I have now started standard fs listener on port 564 from fossil, and added fs= and auth= to the server ndb. After doing this, I was able to successfully use srvtls as a test from my client node, so I am optimistic that the boot might work. About to retest...and...success!

Cruft in plan9rc Boot Options

Adding support for the TLS boot option meant adding another case to the function called "dogetrootfs" in plan9rc. This is a crucial function, and it has become too long and overloaded. It is built as a large set of cases to switch($getrootfs) and it should probably be factored so that the logic for each case becomes its own fn. 9front has also dropped kfs from the distribution and with hjfs filling the role of a smaller and simpler fs, I don't see much need to keep support for it in 9front-ANTS. I'm also not confident that the logic for using cfs (a local cache file server for a remote connection) even works any more in 9front, I should probably test it and fix it if it is broken.

The plan9rc script was originally written for the Bell Labs version of Plan 9, and it was modified to work with 9front, without a comprehensive rewrite. In comparison to the standard 9front bootrc script, it is rather gracelessly written. (ANTS also has an ANTS-specific variant of the bootrc script, you can choose which script you want to use via the "bootcmd" variable in plan9.ini) There is support for interactive, non-interactive, and traditional bootup, which is implemented with quite a bit of mostly-duplicated code.

I have mixed feelings about trying to do comprehensive code improvement here: on one hand, the current script works well and I understand it, and there is a lot of wisdom in "if it ain't broke, don't fix it". On the other hand, cleanup work would make it more maintainable. It would be nice if it shared as much code as possible with the standard 9front bootrc, partly because users who aren't me are likely to find the current script "a bit much." Looking at the 9front base:

cpu% wc bootrc local.rc net.rc
    237     629    4010 bootrc
     75     178    1102 local.rc
     67     200    1324 net.rc
    379    1007    6436 total

In comparison, the ANTS startup scripts:

cpu% wc plan9rc initskel
    701    2040   15303 plan9rc
    170     524    3590 initskel
    871    2564   18893 total

Obviously the ANTS startup does more (it creates an independent namespace separate from standard userspace, supports fossil+venti boot and other variants) but I'm sure it doesn't need to use twice as many lines and three times as many total characters to do its work.