Ephemeral Castle: Create/Destroy Detail
Level 3 (Detail) — Hetzner Vault runtime provisioning pipeline.
Concept
The Hetzner Vault runtime is provisioned through an 8-stage pipeline (create.sh) and torn down by a 5-step destroy process (destroy.sh). Each create stage writes its own timestamped log file under logs/; the destroy run writes a single log.
Create Pipeline
File: runtimes/lushycorp-vault/hetzner/create.sh (228 lines)
Stage 1: Terraform (VM)
Log: *-10-terraform.log
- Command: cd terraform/ && terraform init && terraform apply
- Variables: hcloud_token, image_id, image_name
- Creates: Hetzner VPS from golden image, firewall, initial network
- Outputs: server IP, public inventory file (inventory.public.ini)
Stage 2: Public Bootstrap (SSH + Tailscale)
Log: *-20-public-bootstrap.log
- Waits for SSH reachability (up to 30 attempts, 10s apart)
- Runs Ansible playbook tailscale-bootstrap.yml via public IP
- Tailscale OAuth credentials passed as env vars
- Installs Tailscale on the VM, registers with tags tag:tazlab-vault, tag:vault-api
- Hostname registered as lushycorp-vault
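The SSH reachability wait can be sketched as a generic retry helper. This is a minimal illustration, not the code in create.sh: the function name retry, the SSH options, and the variables SSH_KEY and SERVER_IP are assumptions introduced here.

```shell
# Hypothetical helper (not taken from create.sh):
# retry <max_attempts> <delay_seconds> <command...>
# Runs the command until it succeeds or the attempts are exhausted.
retry() {
  local max_attempts="$1" delay="$2" attempt=1
  shift 2
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Do not sleep after the final failed attempt.
    if [ "$attempt" -lt "$max_attempts" ]; then
      sleep "$delay"
    fi
    attempt=$((attempt + 1))
  done
  return 1
}

# Usage with the documented limits (30 attempts, 10s apart).
# SSH_KEY and SERVER_IP are placeholder variables:
# retry 30 10 ssh -o BatchMode=yes -o ConnectTimeout=5 -i "$SSH_KEY" "root@$SERVER_IP" true
```

The same helper would cover the Stage 4 check with retry 6 5, assuming that stage uses the same kind of loop.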
Stage 3: Tailscale Validation
Log: *-30-tailscale-validation.log
- Verifies operator-side tailscale status and tailscale ip -4
- Runs validate-device-tags.sh to confirm lushycorp-vault has required tags
- Outputs tag validation JSON to configs/tailscale-tags.json
Stage 4: Transport Switch
Log: *-40-transport-switch.log
- render-tailscale-inventory.sh generates Ansible inventory with tailnet IP
- Verifies SSH over Tailscale transport (up to 6 retries, 5s apart)
- This is the critical transition from public-IP SSH to tailnet-only SSH
Stage 5: Podman Verify
Log: *-50-podman-verification.log
- Runs ansible-playbook common.yml over tailnet transport
- Verifies Podman is installed and functional on the target VM
Stage 6: Vault Runtime Install
Log: *-60-vault-runtime-install.log
- Runs ansible-playbook vault-runtime-install.yml over tailnet
- Vault binary + config installation, systemd unit setup, golden image validation
Stage 7: Vault Runtime Converge
Log: *-70-vault-runtime-converge.log
- Runs ansible-playbook vault-runtime-converge.yml over tailnet
- Classification, PKI generation, init/restore, unseal, health verification
Stage 8: Vault Runtime Post
Log: *-80-vault-runtime-post.log
- Runs ansible-playbook vault-runtime-post.yml over tailnet
- Admin token persistence, snapshot backup, TazPod encrypted archive refresh and S3 push
Timing Summary
After completion, create.sh prints a timing table showing elapsed seconds per stage.
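A per-stage timing table of this kind can be produced with a small wrapper around each stage. The sketch below is an assumption about the approach, not the actual implementation: run_stage, print_timing_table, and the TIMINGS variable are hypothetical names, and stage labels must not contain spaces.

```shell
# Hypothetical sketch (names are not from create.sh).
# TIMINGS accumulates one "label seconds" line per stage.
TIMINGS=""

run_stage() {
  # run_stage <label> <command...>
  local label="$1" start end rc=0
  shift
  start=$(date +%s)
  "$@" || rc=$?
  end=$(date +%s)
  TIMINGS="${TIMINGS}${label} $((end - start))
"
  return "$rc"
}

print_timing_table() {
  printf '%-28s %8s\n' "STAGE" "SECONDS"
  printf '%s' "$TIMINGS" | while read -r label secs; do
    printf '%-28s %8s\n' "$label" "$secs"
  done
}

# Example (the stage function name is a placeholder):
# run_stage 10-terraform stage_terraform
# print_timing_table
```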
Preflight Checks
The script validates before starting:
- TazPod vault is unlocked (/workspace/.tazpod/vault/vault.tar.aes)
- Hetzner token exists (~/secrets/hetzner-token)
- Tailscale OAuth credentials exist
- SSH key exists (~/secrets/ssh/lushycorp-vault/id_ed25519)
- Golden image env file exists (configs/golden-image.env)
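The file-based checks above could look roughly like the following. This is a sketch under assumptions: require_file and preflight are hypothetical names, the vault check is reduced to a file-existence test (the real script may verify the unlocked state differently), and the Tailscale OAuth check is omitted because its location is not stated here.

```shell
# Hypothetical preflight sketch (function names are not from create.sh).
require_file() {
  # require_file <description> <path>
  if [ ! -f "$2" ]; then
    echo "preflight: missing $1: $2" >&2
    return 1
  fi
}

preflight() {
  local failed=0
  # Assumption: presence of the archive path is treated as the vault check.
  require_file "TazPod vault archive" "/workspace/.tazpod/vault/vault.tar.aes" || failed=1
  require_file "Hetzner token" "$HOME/secrets/hetzner-token" || failed=1
  require_file "SSH key" "$HOME/secrets/ssh/lushycorp-vault/id_ed25519" || failed=1
  require_file "golden image env file" "configs/golden-image.env" || failed=1
  return "$failed"
}

# Usage: preflight || exit 1
```

Collecting all failures before returning lets the operator see every missing prerequisite in one run rather than fixing them one at a time.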
Transport: Hostnet+TUN Auto-Detection
create.sh auto-detects the Tailscale socket at startup: if /dev/net/tun is available and a TUN-mode socket exists at /var/run/tailscale/tailscaled.sock, it uses that path. Otherwise it falls back to the old userspace socket under AGENTS.ctx/tools/tailscale/state/. The inventory renderer (render-tailscale-inventory.sh) also auto-detects TUN mode and generates direct-tailnet SSH (no ProxyCommand) when available.
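The socket selection described above can be illustrated as follows. This is a hedged sketch: detect_tailscale_socket is a hypothetical name, the fallback socket filename (tailscaled.sock under the state directory) is an assumption, and the test for "TUN mode" is simplified to checking that the device node and the socket both exist.

```shell
# Hypothetical sketch of the auto-detection (not the code in create.sh).
# All three paths can be overridden so the logic is testable without tailscaled.
detect_tailscale_socket() {
  local tun_dev="${1:-/dev/net/tun}"
  local tun_sock="${2:-/var/run/tailscale/tailscaled.sock}"
  # Assumed filename; the source only names the directory.
  local fallback_sock="${3:-AGENTS.ctx/tools/tailscale/state/tailscaled.sock}"
  if [ -c "$tun_dev" ] && [ -S "$tun_sock" ]; then
    echo "$tun_sock"
  else
    echo "$fallback_sock"
  fi
}

# Usage (tailscale accepts a --socket flag):
# tailscale --socket="$(detect_tailscale_socket)" status
```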
Timing
With hostnet+TUN mode and split playbooks, the full create cycle completes in ~344s (down from ~1200s with the old monolithic playbook). Typical per-stage timing: Terraform 4s, Public Bootstrap 16s, Tailscale Validation 1s, Transport Switch 9s, Podman Verify 11s, Vault Install 175s, Vault Converge 90s, Vault Post 38s.
Destroy Pipeline
File: runtimes/lushycorp-vault/hetzner/destroy.sh (188 lines)
Log: *-90-destroy.log
1. Delete Hetzner server (by name)
2. Wait for server to disappear (30 attempts, 2s apart)
3. Delete Hetzner firewall (by name)
4. Delete Tailscale device (by hostname via OAuth API)
5. Cleanup local artifacts (inventories, terraform state)
Destroy Order
- delete_server_fast: Finds server by name (lushycorp-vault) via hcloud server list, deletes all matching IDs
- wait_server_gone: Polls until server is no longer listed
- delete_firewall_if_present: Finds and deletes firewall named <server_name>-bootstrap
- delete_tailscale_device: Obtains OAuth token, lists devices, finds device by hostname lushycorp-vault, deletes it via Tailscale API
- cleanup_local_artifacts: Removes generated inventories, terraform state, runtime metadata, tag validation JSON
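The polling step can be sketched like this. It is an illustration only: the function body is not copied from destroy.sh, list_server_names is a hypothetical wrapper, and the hcloud output options shown are an assumption about how the listing is made.

```shell
# Hypothetical sketch of the polling logic (not the code in destroy.sh).
# list_server_names is an assumed wrapper around the hcloud CLI.
list_server_names() {
  hcloud server list -o noheader -o columns=name
}

wait_server_gone() {
  # wait_server_gone <server_name> [max_attempts] [delay_seconds]
  local name="$1" max="${2:-30}" delay="${3:-2}" attempt=1 names
  while [ "$attempt" -le "$max" ]; do
    # A failed listing is reported as an error, not treated as "gone".
    names=$(list_server_names) || return 2
    if ! printf '%s\n' "$names" | grep -qxF -- "$name"; then
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  return 1
}

# Usage with the documented limits (30 attempts, 2s apart):
# wait_server_gone lushycorp-vault 30 2
```

Waiting for the server to disappear before deleting the firewall matters if the firewall is still attached to the server at that point, which is presumably why the steps run in this order.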
Important: No S3 Clearing
destroy.sh does not clear the S3 remote durability layer. After destroy, S3 still contains the lineage and snapshots. A subsequent create.sh will hit the T0 + H0 + S1 hard-fail if the operator-side canonical bootstrap files are absent.
See Also
- Detail: Ansible Vault Detail
- Topic: Rebirth Protocol
- Topic: Bootstrap and Restore
- Hub: Ephemeral Castle