Ephemeral Castle Operational Cheat Sheet
This page provides a quick reference for common commands and maintenance tasks within the ephemeral-castle repository.
Cluster Lifecycle (tazlab-k8s)
Run these from clusters/tazlab-k8s/proxmox/:
| Action | Command |
|---|---|
| Full Bootstrap | ./create.sh |
| Full Teardown | ./destroy.sh |
| Force VM Delete | ./nuclear-wipe.sh |
| Check Logs | ls -ltr logs/ |
Manual Terragrunt Operations
To apply changes to a specific layer without a full rebirth:
Export Secrets (from TazPod):

    export INFISICAL_CLIENT_ID="$(tr -d "'\" " < ~/secrets/infisical-client-id)"
    export INFISICAL_CLIENT_SECRET="$(tr -d "'\" " < ~/secrets/infisical-client-secret)"
    export PROXMOX_TOKEN_ID="$(tr -d "'\" " < ~/secrets/proxmox-token-id)"
    export PROXMOX_TOKEN_SECRET="$(tr -d "'\" " < ~/secrets/proxmox-token-secret)"
    export GITHUB_TOKEN="$(tr -d "'\" " < ~/secrets/github-token)"

Navigate to Layer:

    cd clusters/tazlab-k8s/live/<layer-name>

Execute:

    terragrunt plan
    terragrunt apply --non-interactive --auto-approve
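The five exports above share one pattern: read a file under ~/secrets/, strip quotes and spaces, and export the result. The following is a minimal sketch of that pattern as a loop. The helper names `load_secret` and `export_secrets` and the `SECRETS_DIR` override are hypothetical and not part of the repository; the variable-to-file mapping is taken from the commands above.

```shell
# Hypothetical helper: print a secret file with quotes and spaces removed,
# mirroring the tr -d "'\" " filter used in the manual exports.
load_secret() {
  tr -d "'\" " < "$1"
}

# Hypothetical helper: for each "VAR=file" argument, export VAR from
# $SECRETS_DIR/file (SECRETS_DIR is an assumed override, default ~/secrets).
# Stops with a non-zero status at the first unreadable file.
export_secrets() {
  local dir="${SECRETS_DIR:-$HOME/secrets}" pair var file
  for pair in "$@"; do
    var="${pair%%=*}"
    file="${pair#*=}"
    if [ ! -r "$dir/$file" ]; then
      echo "missing secret file: $dir/$file" >&2
      return 1
    fi
    export "$var=$(load_secret "$dir/$file")"
  done
}

# Usage, with the same mapping as the manual exports:
#   export_secrets \
#     INFISICAL_CLIENT_ID=infisical-client-id \
#     INFISICAL_CLIENT_SECRET=infisical-client-secret \
#     PROXMOX_TOKEN_ID=proxmox-token-id \
#     PROXMOX_TOKEN_SECRET=proxmox-token-secret \
#     GITHUB_TOKEN=github-token
```

Failing on the first missing file avoids running terragrunt with a partially populated environment.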
Networking (Tailscale)
| Action | Command |
|---|---|
| Start / Join Tailnet | AGENTS.ctx/tools/tailscale/start.sh |
| Apply ACL/OAuth | cd tailscale/ && ./setup.sh |
| Check Peers | tailscale status |
| Ping Vault | tailscale ping lushycorp-vault |
| Ping Cluster | tailscale ping tazlab-k8s-control-plane-01 |
Note: start.sh is run from the workspace root, and it launches tailscaled in the background so the shell returns immediately while the daemon initializes.
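Because start.sh returns before the daemon is ready, a command issued immediately afterwards may fail. A minimal polling sketch follows; the `wait_for` helper and the `WAIT_INTERVAL` variable are hypothetical, and the usage assumes that `tailscale status` exits non-zero until the daemon is reachable.

```shell
# Hypothetical helper: run a command repeatedly until it succeeds or the
# attempt budget is exhausted. Sleeps WAIT_INTERVAL seconds (default 1)
# after each failed attempt. Output of the probed command is discarded.
wait_for() {
  local attempts="$1" i=0
  shift
  while [ "$i" -lt "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "${WAIT_INTERVAL:-1}"
  done
  return 1
}

# Usage from the workspace root (assumption: `tailscale status` fails
# while tailscaled is still initializing):
#   AGENTS.ctx/tools/tailscale/start.sh
#   wait_for 30 tailscale status && tailscale ping lushycorp-vault
```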
Vault Runtime (Hetzner)
Run these from runtimes/lushycorp-vault/hetzner/:
| Action | Command |
|---|---|
| Create/Restore | ./create.sh |
| Nuclear Destroy | ./destroy.sh |
| Build golden image | ./golden-image/scripts/build-golden-image.sh --snapshot-name "<name>" |
Ansible Playbooks
Run from runtimes/lushycorp-vault/hetzner/ansible/:
| Playbook | When to use |
|---|---|
| tailscale-bootstrap.yml | First-time Tailscale join on new VM (public IP) |
| common.yml | Podman + package verification (tailnet) |
| vault-runtime-install.yml | Runtime installation, config, service setup |
| vault-runtime-converge.yml | Classification, restore/init, unseal, health |
| vault-runtime-post.yml | Admin token, snapshot backup, TazPod persistence |
The old monolithic vault-s3-backup-recovery.yml was split into three playbooks (install/converge/post) to improve observability and allow per-stage timing.
Vault Runtime Preflight
Before ./create.sh on the Hetzner Vault runtime:
- Hostnet+TUN mode: Tailscale runs natively inside the container via tazpod-tailscale-up. The old userspace start.sh path is no longer needed. create.sh auto-detects the Tailscale socket (TUN socket preferred).
- Ensure TazPod vault is unlocked: tazpod unlock
- Verify the canonical bootstrap files if the run depends on restore or remote-durability continuity:

      ls ~/secrets/lushycorp-vault/init.json \
         ~/secrets/lushycorp-vault/unseal-keys.json \
         ~/secrets/lushycorp-vault/root-token.txt \
         ~/secrets/lushycorp-vault/admin-token.txt \
         ~/secrets/lushycorp-vault/admin-token.json

- Remember that /workspace/.tazpod/vault/vault.tar.aes alone is not enough for runtime classification; the playbook reads the decrypted canonical files under ~/secrets/lushycorp-vault/.
- Check phase logs under logs/:
  - *-10-terraform.log
  - *-20-public-bootstrap.log
  - *-30-tailscale-validation.log
  - *-40-transport-switch.log
  - *-50-podman-verification.log
  - *-60-vault-runtime-install.log
  - *-70-vault-runtime-converge.log
  - *-80-vault-runtime-post.log
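The file check in the list above can be wrapped in a small guard that fails loudly before ./create.sh starts. This is a sketch under stated assumptions: the function name `check_bootstrap_files` is hypothetical, and treating an empty file as missing is an extra assumption not stated in this page.

```shell
# Hypothetical preflight guard: verify that every canonical bootstrap file
# exists and is non-empty (the non-empty requirement is an assumption).
# Prints each offending path to stderr; returns 1 if any file fails the check.
check_bootstrap_files() {
  local dir="${1:-$HOME/secrets/lushycorp-vault}" missing=0 f
  for f in init.json unseal-keys.json root-token.txt admin-token.txt admin-token.json; do
    if [ ! -s "$dir/$f" ]; then
      echo "missing or empty: $dir/$f" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage from runtimes/lushycorp-vault/hetzner/:
#   check_bootstrap_files && ./create.sh
```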
Timing
With hostnet+TUN mode and the split playbooks, the full create.sh cycle completes in ~344s (down from ~1200s):
| Phase | Duration (s) |
|---|---|
| Terraform (VM) | 4 |
| Public Bootstrap (SSH+Tailscale) | 16 |
| Tailscale Validation | 1 |
| Transport Switch | 9 |
| Podman Verify (common) | 11 |
| Vault Runtime Install | 175 |
| Vault Runtime Converge | 90 |
| Vault Runtime Post | 38 |
| TOTAL | 344 |
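To see where the time goes, the table can be reduced to per-phase shares of the total. The sketch below hard-codes the durations from the table; the `phase_shares` helper and the pipe-separated input format are hypothetical choices for illustration only.

```shell
# Phase durations in seconds, copied from the timing table.
phases='Terraform (VM)|4
Public Bootstrap (SSH+Tailscale)|16
Tailscale Validation|1
Transport Switch|9
Podman Verify (common)|11
Vault Runtime Install|175
Vault Runtime Converge|90
Vault Runtime Post|38'

# Hypothetical helper: read "name|seconds" lines on stdin and print each
# phase with its percentage of the total, followed by the total itself.
phase_shares() {
  awk -F'|' '
    { name[NR] = $1; sec[NR] = $2; total += $2 }
    END {
      for (i = 1; i <= NR; i++)
        printf "%-34s %4d s %5.1f%%\n", name[i], sec[i], 100 * sec[i] / total
      printf "%-34s %4d s\n", "TOTAL", total
    }'
}

printf '%s\n' "$phases" | phase_shares
```

The install phase accounts for roughly half of the run, so it is the first place to look when optimizing further.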
Destroy/Create Warning
./destroy.sh removes the Hetzner server and local Terraform outputs, but it does not clear the S3 remote-durability layer. That means a subsequent ./create.sh can legitimately hit the T0 + H0 + S1 hard-fail matrix branch if the operator-side canonical bootstrap files are absent while remote S3 remains coherent.
Common Debugging Tools
Talos OS
- Check Dashboard: talosctl dashboard --talosconfig clusters/tazlab-k8s/proxmox/configs/talosconfig
- Get Config: talosctl get machineconfig
Kubernetes
- Access Cluster: kubectl --kubeconfig clusters/tazlab-k8s/proxmox/configs/kubeconfig get nodes
- Flux Status: flux get kustomizations
- ESO Logs: kubectl logs -n external-secrets -l app.kubernetes.io/name=external-secrets
Proxmox
- List VMs: qm list (on the Proxmox host)
- Check Task Log: Look at the Proxmox Web UI “Tasks” pane.
See Also
- Detail: Create/Destroy Detail
- Detail: Golden Image Detail
- Detail: Proxmox Cluster Detail
- Hub: Ephemeral Castle