README: doc skyflake deployment
parent 877d9f04c7
commit 6ce8454fd9
80 README.md
@@ -149,6 +149,86 @@ systemd-managed MicroVMs live, or move the state to
nix run .#nomad-$NAME
```

# Cluster deployment with Skyflake

## About

Skyflake provides hyperconverged infrastructure for running NixOS MicroVMs
on a cluster. Our setup unifies networking with one bridge per
VLAN. Persistent storage is replicated with GlusterFS.

MicroVMs deployed through Skyflake can be recognized by their modules,
which include `self.nixosModules.cluster-options`.

## Deploying

Push our repo to any machine on the cluster, preferably to Hydra:
it builds things anyway and probably has most packages already in
its store.

You don't deploy all MicroVMs at once. Instead, Skyflake lets you
select NixOS systems by the branches you push to. Each branch is
named after the host it deploys.

**Example:** deploy hosts `mucbot` and `sdrweb`

```bash
git push c3d2@hydra.serv.zentralwerk.org:config HEAD:mucbot HEAD:sdrweb
```
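
The push above takes one `HEAD:<host>` refspec per host. As a minimal sketch, the list can be generated from host names. The helper name `skyflake_refspecs` is hypothetical and not part of this repo. The final line only prints the resulting command for review instead of running it.

```shell
# Hypothetical helper (not part of this repo): print one "HEAD:<host>"
# refspec per argument, one per line.
skyflake_refspecs() {
  local host
  for host in "$@"; do
    printf 'HEAD:%s\n' "$host"
  done
}

# Print the push command for review; drop the leading "echo" to really push.
# The unquoted $(...) is intentional: it splits the output into one word
# per refspec.
echo git push c3d2@hydra.serv.zentralwerk.org:config $(skyflake_refspecs mucbot sdrweb)
```

For `mucbot` and `sdrweb` this prints the same command as the example above.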

## Debugging

### GlusterFS

```bash
gluster volume info
gluster volume status
```

#### Restart glusterd

```bash
systemctl restart glusterd
```

#### Remount volumes

```bash
systemctl restart /glusterfs/fast
systemctl restart /glusterfs/big
```

### Nomad

#### Check the cluster state

```shell
nomad server members
```

#### Browse in the terminal

Use `wander` and `damon`.

#### Browse with a browser

First, tunnel TCP port `4646` from a cluster server:

```bash
ssh -L 4646:localhost:4646 root@server10.cluster.zentralwerk.org
```

Then, visit https://localhost:4646 for the full klickibunti (the graphical web UI).

#### Reset the Nomad state on a node

After upgrades, Nomad servers may fail to rejoin the cluster. If that
happens, stop Nomad, delete its Raft state, and start it again:

```shell
systemctl stop nomad
rm -rf /var/lib/nomad/server/raft/
systemctl start nomad
```
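
Because the middle step deletes state, it may help to wrap the three commands so that a failure stops the sequence. The following is only a sketch: the function name `reset_nomad_raft` and the `DRY_RUN` variable are assumptions introduced here, not part of this repo or of Nomad.

```shell
# Hypothetical wrapper (not part of this repo).
# DRY_RUN defaults to 1: the commands are printed, not executed.
reset_nomad_raft() {
  local raft_dir=/var/lib/nomad/server/raft/
  local run=""
  if [ "${DRY_RUN:-1}" = "1" ]; then
    run=echo
  fi
  # "&&" stops the sequence at the first failing step, so the Raft
  # directory is only removed after "systemctl stop nomad" succeeded.
  $run systemctl stop nomad &&
    $run rm -rf "$raft_dir" &&
    $run systemctl start nomad
}
```

Calling `reset_nomad_raft` prints the three commands for review. Run `DRY_RUN=0 reset_nomad_raft` as root on the affected node to execute them.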

# Secrets management

## Secrets management with PGP