this includes the jump to conftrack, a custom-written configuration
library that'll hopefully be less annoying to deal with than conferer.
It's very much unstable & somewhat incomplete software for now, but
should hopefully reach a stable state soon (this deployment is thus
basically part of testing it).
It also means we can finally write camelCase in config keys without
having the config library fail on us!
it sometimes takes a long while to boot & signal being ready to systemd,
which will kill it after the timeout is reached, after which it's rinse
and repeat and yay for a boot loop.
this depends on a whole lot of imperative nonsense being done at the
same time, which i have done.
of special interest to anyone attempting to understand this is
https://docs.mattermost.com/deploy/postgres-migration.html
for the general shape of incompetence at work,
https://docs.mattermost.com/install/setting-up-socket-based-mattermost-database.html#with-unix-socket
for yet another interesting syntax for database connection strings, and
https://github.com/dimitri/pgloader/issues/782#issuecomment-502323324
for a truly astonishing take on how to do database migrations, which
unfortunately i have followed.
As far as I can tell, everything has kept working. Downtime was mostly
spent understanding the various connection string syntaxes and their
horribly buggy parsers.
Note for people with server access:
- i have kept the temporary files (including logs) around in
/persist/migration inside the container should we ever need them
again
- there's a zfs snapshot @pre-postgres with the old state
I have little idea what happened here, but this postgres is entirely
unused. The actual database is in mysql, and always has been — the
postgres does contain a mattermost database with the correct tables, but
these are empty.
there's little point in having it alert while people are working on the
config & test-deploying things; it's meant to remind us later, in case we
forget to commit the result.
this one's not connected to our SSO and intended for short-term use
only, after which it will be deleted again.
I've gone through at least some of mattermost's options to see how many
of these are actually relevant anymore. Some can be left out.
Unlike the other mattermost it also doesn't use any mysql.
this is not entirely accurate — the lastModified attribute of a flake's
self-input gives the date of the last commit, not the last deploy. But I
figure it's close enough and less obscure to check than reading in the
last date via nix-env.
inspired by: we did no server updates for two weeks.
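
A rough sketch of the idea, not the actual config. It assumes the flake's
`self` is passed into the module (e.g. via specialArgs), and the file name
under /etc is made up for illustration:

```nix
# Sketch only: `self` being a module argument and the file name
# "last-commit-timestamp" are both assumptions.
{ self, ... }:
{
  # lastModified is the Unix timestamp of the flake's last commit,
  # not of the last deploy; fall back to 0 if it is missing.
  environment.etc."last-commit-timestamp".text =
    toString (self.lastModified or 0);
}
```

A check can then compare that timestamp against the current time.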
This is the initial version for this year's run of absurd train
operations. I won't dare to call it a release for at least another month
or so, so no version number.
Changes done in our nixfiles:
- tracktrain now needs ntfy-sh so people (read: I) can get push
notifications if things break or at least look a little weird
- I removed the grafana instance; seems like somewhere in the last year
they changed how to host it under a sub-path (ours was at /metrics),
so it broke, and I'm not feeling any particular urge to fix it
- last year's database contents have been yoten
- also manually updated the gtfs (though I intend to implement logic
for fetching it in tracktrain, I first need to drag Ilztalbahn into
actually publishing up-to-date versions again)
this doesn't help us with anything yet, but it does at least mean that
this openssh now also listens on IPv6, which it didn't before.
(reaching the container from the outside still does not work)
this started with emily pointing out to me that it's possible to
generate IP addresses for containers in Nix (hence no need to worry
about ever having collisions, as we had before), but then I thought,
hey, while I'm at it, I can also write a little container module so we
have a little less repetition in our configs in general (and a more
reasonable place for our custom evalConfig than just keeping it around
in flake.nix).
See the option descriptions in modules/containers.nix for further
details.
Apart from giving all containers a new IP address (and also shiny new
IPv6 addresses), this should be a no-op for the actual built system.
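
To illustrate the general idea of generating addresses in Nix, here is a
minimal sketch. It is not what modules/containers.nix does; the function
name, the address range, and the index-based scheme are all assumptions
made for the example:

```nix
# Sketch only: addressesFor and the 192.168.100.0/24 range are hypothetical.
{ lib }:
let
  # Number the containers by their position in the sorted list of names,
  # so no two containers can end up with the same address.
  addressesFor = names:
    lib.listToAttrs (lib.imap1
      (i: name: lib.nameValuePair name {
        hostAddress = "192.168.100.1";
        localAddress = "192.168.100.${toString (i + 1)}";
      })
      (lib.sort (a: b: a < b) names));
in
addressesFor [ "forgejo" "mattermost" ]
```

With a scheme like this, adding or removing a container shifts the
addresses of the others, which only stops being a problem once nothing
refers to the addresses by hand.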
move the monit config out of mail.nix, and add two checks:
- has any systemd unit failed?
- is the currently deployed commit the tip of the main branch of
haccfiles?
this should hopefully help with our consistent onlyoffice-does-not-work-but-no-one-noticed
problems (yes, monit runs as root and can do that).
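
For illustration, the first check could look roughly like this. This is a
sketch, not the actual config; the check name is made up, and it uses
`systemctl is-system-running` as a stand-in for "has any unit failed":

```nix
# Sketch only: the check name "failed-units" is hypothetical.
{ pkgs, ... }:
{
  services.monit.enable = true;
  services.monit.config = ''
    # is-system-running exits non-zero if the system is degraded (i.e. at
    # least one unit has failed), but also while it is still booting.
    check program failed-units
      with path "${pkgs.systemd}/bin/systemctl is-system-running"
      if status != 0 then alert
  '';
}
```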
"then restart" will still send an alert if it restarted the unit (see monit's man page)
alps frequently fails to start (e.g. during a system activation script)
since either its configured imap or smtp servers are not reachable
yet (i.e. their process has not yet opened the corresponding port).
This should hopefully fix that behaviour:
- also set BindsTo, which (together with After) tells systemd to only
start alps once the required units have entered "active" state (not
just after it has started them)
- also require postfix to be present, since that provides smtp
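
In NixOS terms this is roughly the following. It is a sketch only; the
unit names dovecot2.service and postfix.service are assumptions about
which units provide imap and smtp here:

```nix
# Sketch only: the listed unit names are assumptions.
{ ... }:
{
  systemd.services.alps = {
    # After= orders alps behind these units ...
    after = [ "dovecot2.service" "postfix.service" ];
    # ... and BindsTo= additionally ties alps to them being active.
    bindsTo = [ "dovecot2.service" "postfix.service" ];
  };
}
```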
mostly just replacing strings to avoid confusion later on. Since our
containers are now ephemeral, renaming them is basically a non-issue
(though the files under /persist/containers & the uffd client name had
to be changed manually)
this removes usage of the nftnat module by rendering it into a static
nftables config. It's a no-op (modulo /etc/haccfiles) as far as nix is
concerned, hence the slightly off-putting whitespace of the multi-line
string.
This seems to me to be a better approach than just bundling the module,
since we only use it for two things (giving the containers network
access & forwarding port 22 to forgejo), which to me doesn't press for
using a custom module we can't really maintain on our own.
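
For readers who haven't seen a static ruleset in NixOS, the general shape
is something like the sketch below. Interface names and addresses are
placeholders, not our real values, and the actual rendered config differs:

```nix
# Sketch only: "enp1s0" and 192.168.100.2 are hypothetical placeholders.
{ ... }:
{
  networking.nftables.enable = true;
  networking.nftables.ruleset = ''
    table ip nat {
      chain prerouting {
        type nat hook prerouting priority -100; policy accept;
        # forward port 22 on the host to a container
        iifname "enp1s0" tcp dport 22 dnat to 192.168.100.2:22
      }
      chain postrouting {
        type nat hook postrouting priority 100; policy accept;
        # give the containers network access
        iifname "ve-*" oifname "enp1s0" masquerade
      }
    }
  '';
}
```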
the bind mount module has been tweaked in a couple ways:
- rename hexchen.* to hacc.*
- rename bindmount to bindMount to make it consistent with usage in
the nixpkgs container module
- add a hacc.bindToPersist option as shorthand for prepending /persist
to a path via bind mount
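
A shorthand option of this kind could be written roughly as follows. This
is a sketch under assumptions, not the module itself: the option type and
the direct mapping onto fileSystems are guesses made for illustration:

```nix
# Sketch only: the type and the fileSystems mapping are assumptions.
{ config, lib, ... }:
{
  options.hacc.bindToPersist = lib.mkOption {
    type = lib.types.listOf lib.types.str;
    default = [ ];
    description = "Paths to bind-mount from the same path under /persist.";
  };

  # Each listed path becomes a bind mount backed by /persist<path>.
  config.fileSystems = lib.genAttrs config.hacc.bindToPersist (path: {
    device = "/persist${path}";
    fsType = "none";
    options = [ "bind" ];
  });
}
```

With this, `hacc.bindToPersist = [ "/var/lib/example" ];` would mount
/persist/var/lib/example at /var/lib/example.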
the nopersist module has been shortened a little by moving
service-specific things which are used once out into the individual
service files, and removing those which we don't need at all (this also
means we get to lose a mkForce or two in case of mismatches between
hexchen's and our current config).
this is a slightly cursed workaround; see the comment.
Alternatively, we could pass in the $src attribute of that derivation
via callPackage (passing it through all the way from flake.nix), but tbh
that sounds like too much effort rn.
Have fun with confusingly long paths in the nix store 🙃
we decided to:
- get rid of unused packages
- simplify the directory layout since we only have one host anyways
- move our docs (such as they are) in-tree