README: more skyflake

Astro 2022-12-03 01:10:15 +01:00
parent 2621bd671f
commit 70377149b2
1 changed files with 37 additions and 18 deletions


@ -139,30 +139,29 @@ so the following is all that is needed on a MicroVM-hosting server:
microvm -Ru $hostname
```
## High Availability Deployment on Nomad
First, stop and delete `/var/lib/microvm/$NAME` where the
systemd-managed MicroVMs live, or move the state to
`/glusterfs/fast/microvms/$NAME`.
```sh
nix run .#nomad-$NAME
```
# Cluster deployment with Skyflake
## About
Skyflake provides Hyperconverged Infrastructure to run NixOS MicroVMs
on a cluster. Our setup unifies networking with one bridge per
VLAN. Persistent storage is replicated with Glusterfs.
[Skyflake](https://github.com/astro/skyflake) provides Hyperconverged
Infrastructure to run NixOS MicroVMs on a cluster. Our setup unifies
networking with one bridge per VLAN. Persistent storage is replicated
with Glusterfs.
Recognize MicroVMs for Skyflake by modules containing
`self.nixosModules.cluster-options`.
## Deploying
## User interface
Push our repo to any machine on the cluster, preferably to Hydra
We use the less-privileged `c3d2@` user for deployment. This flake's
name on the cluster is `config`. Other flakes can coexist under the same
user so that we can run separately developed projects like
*dump-dvb*. *leon* and potentially other users can deploy Flakes and
MicroVMs without name clashes.
### Deploying
Push our repo to any machine in the cluster, preferably to Hydra
because it just builds stuff and probably has most packages already in
store.
@ -175,10 +174,23 @@ select NixOS systems by the branches you push to.
git push c3d2@hydra.serv.zentralwerk.org:config HEAD:mucbot HEAD:sdrweb
```
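The push above can also be assembled from a list of system names. The following is a minimal sketch, assuming that each branch name equals the NixOS system name as in the example; the helper name `deploy_cmd` is hypothetical, and it only prints the command instead of running it.

```shell
# Build one "HEAD:<branch>" refspec per NixOS system name and print the
# resulting git push command (dry run, nothing is pushed).
# The remote is copied from the example above; deploy_cmd is a made-up name.
deploy_cmd() {
  remote="c3d2@hydra.serv.zentralwerk.org:config"
  refspecs=""
  for system in "$@"; do
    refspecs="$refspecs HEAD:$system"
  done
  printf 'git push %s%s\n' "$remote" "$refspecs"
}
```

Calling `deploy_cmd mucbot sdrweb` prints the same command as shown above, which can be reviewed before pasting it into a shell.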
## Debugging
### Updating
**TODO:** how would you like it?
### MicroVM status
```bash
ssh c3d2@hydra.serv.zentralwerk.org status
```
## Debugging for cluster admins
### Glusterfs
Glusterfs holds our MicroVMs' state. The Glusterfs volumes *must always
be mounted* on every node, otherwise we end up with a split brain.
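A quick way to check the mounts on a node could look like this. It is only a sketch: the helper name `check_mounts` is hypothetical, and the paths to pass in (for example `/glusterfs/fast` and `/glusterfs/big`) are assumed from the paths mentioned elsewhere in this README. It relies on `mountpoint` from util-linux.

```shell
# Report every given path that is not currently a mount point.
# Returns non-zero if at least one of the paths is not mounted.
check_mounts() {
  missing=0
  for path in "$@"; do
    if ! mountpoint -q "$path"; then
      echo "NOT MOUNTED: $path" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `check_mounts /glusterfs/fast /glusterfs/big` stays silent and returns 0 when both paths are mounted.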
```bash
gluster volume info
gluster volume status
@ -205,9 +217,15 @@ systemctl restart /glusterfs/big
nomad server members
```
Nomad *servers* **coordinate** the cluster.
Nomad *clients* **run** the tasks.
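To look at both roles in one go, something like the following could be used. This is a sketch, not part of Skyflake: the wrapper name `nomad_overview` is hypothetical, and it only combines the `nomad server members` and `nomad node status` subcommands of the Nomad CLI.

```shell
# Print both halves of the cluster: servers first, then client nodes.
nomad_overview() {
  echo "== servers (coordinate the cluster) =="
  nomad server members
  echo "== clients (run the tasks) =="
  nomad node status
}
```

Run `nomad_overview` on any cluster node where the `nomad` CLI can reach the local agent.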
#### Browse in the terminal
Use `wander` and `damon`
[wander](https://github.com/robinovitch61/wander) and
[damon](https://github.com/hashicorp/damon) are nice TUIs that are
preinstalled on our cluster nodes.
#### Browse with a browser
@ -221,7 +239,8 @@ Then, visit https://localhost:4646 for full klickibunti.
#### Reset the Nomad state on a node
After upgrades, Nomad servers may fail to rejoin the cluster. Do:
After upgrades, Nomad servers may fail to rejoin the cluster. Do this
to make a *Nomad server* behave like a newborn:
```shell
systemctl stop nomad