README: doc skyflake deployment

Astro 2022-11-30 21:00:36 +01:00
parent 877d9f04c7
commit 6ce8454fd9
1 changed files with 80 additions and 0 deletions

@@ -149,6 +149,86 @@ systemd-managed MicroVMs live, or move the state to
nix run .#nomad-$NAME
```
# Cluster deployment with Skyflake
## About
Skyflake provides Hyperconverged Infrastructure to run NixOS MicroVMs
on a cluster. Our setup unifies networking with one bridge per
VLAN. Persistent storage is replicated with Glusterfs.
You can recognize the MicroVMs that run on Skyflake by their modules,
which include `self.nixosModules.cluster-options`.
## Deploying
Push our repo to any machine on the cluster, preferably to Hydra:
it only builds stuff and probably has most packages in its store
already.
You don't deploy all MicroVMs at once. Instead, Skyflake allows you to
select NixOS systems by the branches you push to.
**Example:** deploy hosts `mucbot` and `sdrweb`
```bash
git push c3d2@hydra.serv.zentralwerk.org:config HEAD:mucbot HEAD:sdrweb
```
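When deploying several hosts at once, the refspecs can be generated from a list of host names. This is a minimal sketch, assuming a hypothetical helper `skyflake_refspecs` (not part of Skyflake) and the remote from the example above. It only prints the push command so that it can be reviewed first.

```bash
# Hypothetical helper: turn host names into "HEAD:<host>" refspecs.
skyflake_refspecs() {
  for host in "$@"; do
    printf 'HEAD:%s\n' "$host"
  done
}

# Print the push command for review instead of running it.
echo git push c3d2@hydra.serv.zentralwerk.org:config $(skyflake_refspecs mucbot sdrweb)
```

Drop the leading `echo` to actually push.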
## Debugging
### Glusterfs
```bash
gluster volume info
gluster volume status
```
#### Restart glusterd
```bash
systemctl restart glusterd
```
#### Remount volumes
```bash
systemctl restart /glusterfs/fast
systemctl restart /glusterfs/big
```
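After remounting, it may help to confirm that both paths are mount points again. This is a sketch, assuming `mountpoint` from util-linux is available on the node; `check_mounts` is a hypothetical helper name.

```bash
# Report every given path that is not currently a mount point.
check_mounts() {
  status=0
  for path in "$@"; do
    if ! mountpoint -q "$path" 2>/dev/null; then
      echo "not mounted: $path"
      status=1
    fi
  done
  return $status
}

check_mounts /glusterfs/fast /glusterfs/big || echo "some volumes are missing"
```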
### Nomad
#### Check the cluster state
```shell
nomad server members
```
#### Browse in the terminal
Use `wander` and `damon`
#### Browse with a browser
First, tunnel TCP port `:4646` from a cluster server:
```bash
ssh -L 4646:localhost:4646 root@server10.cluster.zentralwerk.org
```
Then, visit https://localhost:4646 for the full klickibunti (the graphical web UI).
#### Reset the Nomad state on a node
After upgrades, Nomad servers may fail to rejoin the cluster. If so, delete the node's Raft state:
```shell
systemctl stop nomad
rm -rf /var/lib/nomad/server/raft/
systemctl start nomad
```
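Because `rm -rf` is irreversible, a more cautious variant could move the Raft directory aside instead of deleting it. This is only a sketch: `move_raft_aside` is a hypothetical helper, and it assumes Nomad recreates the directory on start just as it would after the deletion above.

```bash
# Hypothetical helper: rename the Raft directory instead of deleting it,
# so the old state can be inspected or restored later.
move_raft_aside() {
  raft_dir="${1%/}"
  backup="$raft_dir.bak.$(date +%Y%m%d%H%M%S)"
  mv "$raft_dir" "$backup" && echo "$backup"
}

# Intended use on an affected node (run as root):
#   systemctl stop nomad
#   move_raft_aside /var/lib/nomad/server/raft/
#   systemctl start nomad
```

The function prints the backup path, so the old state can be removed once the node has rejoined the cluster.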
# Secrets management
## Secrets management with PGP