Entity: Ephemeral Castle (Infrastructure Layer)

This page is the repository hub for the provider-specific infrastructure layer.

Overview

ephemeral-castle/ is the provider-specific bootstrap layer. It provisions VMs (Proxmox/Talos), networking (Tailscale), and storage (Longhorn), bootstraps Flux, and stands up the Hetzner Vault runtime. After bootstrap, Flux manages the cluster declaratively from tazlab-k8s.

Repository Structure

ephemeral-castle/
├── clusters/tazlab-k8s/           # Proxmox/Talos cluster (active)
│   ├── proxmox/                   # Lifecycle scripts (create/destroy/nuclear-wipe)
│   │   └── configs/               # GENERATED: kubeconfig, talosconfig
│   ├── live/                      # Terragrunt layers
│   │   ├── env.hcl                # Cluster variables (source of truth)
│   │   ├── terragrunt.hcl         # Root config
│   │   ├── secrets/               # Layer 1: bootstrap credentials (from ~/secrets/ env vars)
│   │   ├── platform/              # Layer 2: Proxmox VMs + Talos (certSANs, JWT issuer baked in)
│   │   ├── engine/                # Layer 3: ESO + CoreDNS + vault secrets + github/tailscale tokens
│   │   ├── networking/            # Layer 4a: MetalLB
│   │   ├── gitops/                # Layer 4b: Flux bootstrap
│   │   ├── storage/               # Layer 5: Longhorn + S3 backup
│   │   └── gcp-services/          # Standalone: AlloyDB
│   ├── modules/                   # Reusable Terraform modules (6)
│   └── manifests/                 # Static Helm values
├── runtimes/
│   └── lushycorp-vault/hetzner/   # Vault runtime (active)
│       ├── create.sh              # 8-stage provisioning pipeline (~344s)
│       ├── destroy.sh             # Ordered teardown + state cleanup
│       ├── terraform/             # Hetzner VM + firewall
│       ├── ansible/               # Ansible roles (vault-runtime, common) + 3 playbooks (install/converge/post)
│       ├── golden-image/          # Image builder pipeline (v1-v4)
│       └── configs/               # Golden image env, runtime metadata
├── tailscale/                     # Tailnet ACL + OAuth IaC (Terraform)
├── hermes/                        # Hermes Agent LXC deployment
│   ├── create.sh                  # 7-phase orchestrator (~298s)
│   ├── destroy.sh                 # Backup + destroy
│   ├── cycle.sh                   # Full destroy/create with timing
│   ├── terraform/                 # LXC container resource + mount_point
│   └── ansible/                   # 4 roles (baseline, agent, configure, verify)
├── templates/                     # Cluster blueprint + gitops example
└── docs/                          # Architecture docs
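
The layer numbering in the tree comments suggests an apply order for the directories under live/. The sketch below derives that order from the labels alone. It is an illustration, not the repository's actual wiring: the real dependencies live in the Terragrunt configuration, which this page does not show, and treating 4a as coming before 4b is an assumption (the two may be independent of each other). The names LAYERS, sort_key, and apply_order are hypothetical.

```python
import re

# Layer labels copied from the comments in the repository tree.
# "gcp-services" is marked Standalone there, so it has no place in the chain.
LAYERS = {
    "secrets": "Layer 1",
    "platform": "Layer 2",
    "engine": "Layer 3",
    "networking": "Layer 4a",
    "gitops": "Layer 4b",
    "storage": "Layer 5",
    "gcp-services": "Standalone",
}

LIVE_DIR = "clusters/tazlab-k8s/live"


def sort_key(label):
    """Turn 'Layer 4a' into (4, 'a'); return None for unnumbered layers."""
    match = re.fullmatch(r"Layer (\d+)([a-z]?)", label)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


def apply_order(layers):
    """Return layer directories in ascending layer order, skipping standalone ones."""
    keyed = [(sort_key(label), name) for name, label in layers.items()]
    numbered = [(key, name) for key, name in keyed if key is not None]
    return [f"{LIVE_DIR}/{name}" for _key, name in sorted(numbered)]


if __name__ == "__main__":
    for path in apply_order(LAYERS):
        print(path)
```

Run as a script, this lists the six numbered layer directories from secrets through storage and leaves gcp-services out.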

Quick Facts

Property           Value
Repository         ephemeral-castle/
Proxmox endpoint   192.168.1.200:8006
K8s API VIP        192.168.1.210:6443
MetalLB range      192.168.1.240-250
Traefik LB         192.168.1.240
Talos version      v1.12.0
K8s version        v1.35.0
CNI                Flannel
Topology           1 CP + 1 Worker
JWKS endpoint      lushycorp-apiserver-proxy.magellanic-gondola.ts.net/openid/v1/jwks (via ProxyGroup)
oidc-reviewer CRB  ✅ Created; binds system:service-account-issuer-discovery to system:unauthenticated
Vault runtime      Hetzner CX23, Nuremberg
Vault FQDN         lushycorp-vault.magellanic-gondola.ts.net
Vault tailnet IP   100.82.13.87
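
The LAN addresses above can be cross-checked against each other. The following is a minimal sketch using only the Python standard library. It assumes that "192.168.1.240-250" is an inclusive last-octet shorthand, and that the API VIP and the Proxmox endpoint are meant to sit outside the MetalLB pool; neither assumption is stated on this page. The helper name expand_range is hypothetical.

```python
import ipaddress

# Values copied from the Quick Facts table.
METALLB_RANGE = "192.168.1.240-250"
TRAEFIK_LB = "192.168.1.240"
K8S_API_VIP = "192.168.1.210"
PROXMOX_HOST = "192.168.1.200"


def expand_range(spec):
    """Expand 'a.b.c.START-END' into a list of IPv4Address objects (inclusive).

    Assumes the part after the dash replaces only the last octet.
    """
    start_text, end_octet = spec.split("-")
    start = ipaddress.IPv4Address(start_text)
    prefix = start_text.rsplit(".", 1)[0]
    end = ipaddress.IPv4Address(f"{prefix}.{end_octet}")
    if end < start:
        raise ValueError(f"range end precedes start: {spec}")
    return [ipaddress.IPv4Address(n) for n in range(int(start), int(end) + 1)]


pool = expand_range(METALLB_RANGE)
checks = {
    "Traefik LB is inside the MetalLB pool":
        ipaddress.IPv4Address(TRAEFIK_LB) in pool,
    "K8s API VIP is outside the MetalLB pool":
        ipaddress.IPv4Address(K8S_API_VIP) not in pool,
    "Proxmox endpoint is outside the MetalLB pool":
        ipaddress.IPv4Address(PROXMOX_HOST) not in pool,
}

if __name__ == "__main__":
    print(f"pool size: {len(pool)}")
    for name, ok in checks.items():
        print(f"{'ok  ' if ok else 'FAIL'} {name}")
```

Under these assumptions the pool holds 11 addresses, and Traefik takes the first of them.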

Canonical Starting Pages for Agents

Cluster Infrastructure

Vault Runtime (Hetzner)

Networking

Details

Known Issues

TD      Area                Summary
TD-016  Vault runtime       Quadlet .container generator unreliable on Podman 4.3.1; forced Type=simple systemd fallback
TD-020  Vault connectivity  lushycorp-api.ts.tazlab.net is runtime-defined TLS alias, not Tailscale-owned MagicDNS
TD-021  Vault recovery      Bootstrap anchor gap: canonical files absent even when S3 lineage coherent
TD-023  Proxmox host        BIOS/WoL auto-power-on not configured; after AC power loss, requires physical button press to restart (2026-04-29)
TD-024  Operator transport  Tailscale userspace-networking + tailscale nc transport bug (Closed). Resolved by hostnet+TUN mode. Create pipeline now ~344s.
TD-028  Bootstrap           Engine layer still creates unused Infisical resources (waste, confusion); needs cleanup after Vault migration fully validated
TD-029  Bootstrap           User-managed CoreDNS deployed by Terraform engine layer; single point of failure if engine fails
TD-030  Bootstrap           Post-bootstrap PGO→Vault password sync creates window before Grafana can start; eliminated by Phase 1 Vault Injector

Relationships

See Also