Entity: Ephemeral Castle (Infrastructure Layer)
This page is the repository hub for the provider-specific infrastructure layer.
Overview
ephemeral-castle/ is the provider-specific bootstrap layer. It provisions the VMs (Proxmox/Talos), networking (Tailscale, MetalLB), and storage (Longhorn), bootstraps Flux, and builds the Hetzner Vault runtime. After bootstrap, Flux manages the cluster declaratively from tazlab-k8s.
Repository Structure
ephemeral-castle/
├── clusters/tazlab-k8s/ # Proxmox/Talos cluster (active)
│ ├── proxmox/ # Lifecycle scripts (create/destroy/nuclear-wipe)
│ │ └── configs/ # GENERATED: kubeconfig, talosconfig
│ ├── live/ # Terragrunt layers
│ │ ├── env.hcl # Cluster variables (source of truth)
│ │ ├── terragrunt.hcl # Root config
│ │ ├── secrets/ # Layer 1: bootstrap credentials (from ~/secrets/ env vars)
│ │ ├── platform/ # Layer 2: Proxmox VMs + Talos (certSANs, JWT issuer baked in)
│ │ ├── engine/ # Layer 3: ESO + CoreDNS + vault secrets + github/tailscale tokens
│ │ ├── networking/ # Layer 4a: MetalLB
│ │ ├── gitops/ # Layer 4b: Flux bootstrap
│ │ ├── storage/ # Layer 5: Longhorn + S3 backup
│ │ └── gcp-services/ # Standalone: AlloyDB
│ ├── modules/ # Reusable Terraform modules (6)
│ └── manifests/ # Static Helm values
├── runtimes/
│ └── lushycorp-vault/hetzner/ # Vault runtime (active)
│   ├── create.sh # 8-stage provisioning pipeline (~344s)
│   ├── destroy.sh # Ordered teardown + state cleanup
│   ├── terraform/ # Hetzner VM + firewall
│   ├── ansible/ # Ansible roles (vault-runtime, common) + 3 playbooks (install/converge/post)
│   ├── golden-image/ # Image builder pipeline (v1-v4)
│   └── configs/ # Golden image env, runtime metadata
├── tailscale/ # Tailnet ACL + OAuth IaC (Terraform)
├── hermes/ # Hermes Agent LXC deployment
│ ├── create.sh # 7-phase orchestrator (~298s)
│ ├── destroy.sh # Backup + destroy
│ ├── cycle.sh # Full destroy/create with timing
│ ├── terraform/ # LXC container resource + mount_point
│ └── ansible/ # 4 roles (baseline, agent, configure, verify)
├── templates/ # Cluster blueprint + gitops example
└── docs/ # Architecture docs
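The layer numbering under live/ implies an apply order. The sketch below models it as a small dependency graph and derives one valid apply order and the matching teardown order. The layer names come from the tree; the dependency edges are an assumption inferred from the layer numbers (4a and 4b both after layer 3, layer 5 after both), not read from the actual terragrunt.hcl files. The standalone gcp-services layer is left out.

```python
from graphlib import TopologicalSorter

# Layer names are taken from live/ in the tree above. The edges are an
# ASSUMPTION inferred from the layer numbers, not from terragrunt.hcl.
LAYER_DEPS = {
    "secrets": set(),                     # Layer 1
    "platform": {"secrets"},              # Layer 2
    "engine": {"platform"},               # Layer 3
    "networking": {"engine"},             # Layer 4a
    "gitops": {"engine"},                 # Layer 4b
    "storage": {"networking", "gitops"},  # Layer 5
}


def apply_order(deps):
    """Return one valid apply order: each layer comes after its dependencies."""
    return list(TopologicalSorter(deps).static_order())


def destroy_order(deps):
    """Return the teardown order, which is the apply order reversed."""
    return list(reversed(apply_order(deps)))


if __name__ == "__main__":
    print("apply:  ", " -> ".join(apply_order(LAYER_DEPS)))
    print("destroy:", " -> ".join(destroy_order(LAYER_DEPS)))
```

Layers 4a and 4b have no edge between them in this model, so their relative position in the output is not significant.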
Quick Facts
| Property | Value |
|---|---|
| Repository | ephemeral-castle/ |
| Proxmox endpoint | 192.168.1.200:8006 |
| K8s API VIP | 192.168.1.210:6443 |
| MetalLB range | 192.168.1.240-250 |
| Traefik LB | 192.168.1.240 |
| Talos version | v1.12.0 |
| K8s version | v1.35.0 |
| CNI | Flannel |
| Topology | 1 CP + 1 Worker |
| JWKS endpoint | lushycorp-apiserver-proxy.magellanic-gondola.ts.net/openid/v1/jwks (via ProxyGroup) |
| oidc-reviewer CRB | ✅ Created — binds system:service-account-issuer-discovery to system:unauthenticated |
| Vault runtime | Hetzner CX23, Nuremberg |
| Vault FQDN | lushycorp-vault.magellanic-gondola.ts.net |
| Vault tailnet IP | 100.82.13.87 |
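The addresses in the table constrain each other: the Traefik LB address should fall inside the MetalLB range, while the K8s API VIP and the Proxmox endpoint should stay outside it. A minimal sketch of that sanity check follows, using only values copied from the table. The helper names are hypothetical, and the range parser assumes the short `a.b.c.X-Y` form used above.

```python
import ipaddress

# Values copied from the Quick Facts table.
METALLB_RANGE = "192.168.1.240-250"
TRAEFIK_LB = "192.168.1.240"
K8S_API_VIP = "192.168.1.210"
PROXMOX_HOST = "192.168.1.200"


def parse_range(spec):
    """Parse 'a.b.c.X-Y' into (first, last) IPv4Address objects.

    Assumes the short form used in the table, where only the last
    octet differs between the start and the end of the range.
    """
    start_text, last_octet = spec.split("-")
    first = ipaddress.IPv4Address(start_text)
    prefix = start_text.rsplit(".", 1)[0]
    last = ipaddress.IPv4Address(f"{prefix}.{last_octet}")
    if last < first:
        raise ValueError(f"range end precedes start: {spec}")
    return first, last


def in_range(ip, spec):
    """Return True if ip lies inside the inclusive range described by spec."""
    first, last = parse_range(spec)
    return first <= ipaddress.IPv4Address(ip) <= last


def pool_size(spec):
    """Return the number of addresses in the inclusive range."""
    first, last = parse_range(spec)
    return int(last) - int(first) + 1


if __name__ == "__main__":
    print("Traefik LB in pool:", in_range(TRAEFIK_LB, METALLB_RANGE))
    print("API VIP in pool:", in_range(K8S_API_VIP, METALLB_RANGE))
    print("Pool size:", pool_size(METALLB_RANGE))
```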
Canonical Starting Pages for Agents
Cluster Infrastructure
- Architecture — Philosophy, Terragrunt structure
- Terragrunt Layers — 6 sequential layers
- Rebirth Protocol — Full create/destroy lifecycle
Vault Runtime (Hetzner)
- Vault Architecture — Unseal, connectivity, hostname evolution
- Vault Bootstrap & Restore — Classification matrix, S3 lineage
- Create/Destroy Detail — 8-stage pipeline
- Ansible Vault Detail — Full task orchestration
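As a rough illustration of what a staged pipeline with timing looks like, the sketch below runs named stages in order, stops at the first failure, and records how long each stage took. It is not the logic of create.sh: the function name, the stage names in the usage note, and the stop-on-first-failure behaviour are all assumptions made for the example.

```python
import time


def run_stages(stages, clock=time.monotonic):
    """Run (name, callable) stages in order and stop at the first failure.

    Returns (ok, timings), where timings is a list of (name, seconds)
    for every stage that was started, including a failed one.
    """
    timings = []
    for name, step in stages:
        started = clock()
        try:
            step()
        except Exception as exc:
            # Record the failed stage's duration, then abort the pipeline.
            timings.append((name, clock() - started))
            print(f"stage {name} failed: {exc}")
            return False, timings
        timings.append((name, clock() - started))
    return True, timings


def report(timings):
    """Format per-stage durations plus a total line."""
    lines = [f"{name}: {seconds:.1f}s" for name, seconds in timings]
    total = sum(seconds for _, seconds in timings)
    lines.append(f"total: {total:.1f}s")
    return "\n".join(lines)
```

A caller would pass its own stages, for example `run_stages([("terraform", apply_vm), ("ansible", converge)])`, where both callables are placeholders for whatever each stage actually does.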
Networking
- Tailnet Security — ACLs, tags, OAuth
- Tailscale Bridge — Talos System Extension
Details
- Terraform Modules Detail — 6 modules
- Terragrunt Layers Detail — Dependency chain
- Note: the persistent layers (vault-jwt-config, vault-db-config) live under live/persistent/ and are applied before the platform layer
- Ansible Scripts Detail — 4 shell scripts
- Systemd Detail — 5 systemd units
- Golden Image Detail — Builder pipeline
- Proxmox Cluster Detail — Create/destroy/nuclear-wipe
Known Issues
| TD | Area | Summary |
|---|---|---|
| TD-016 | Vault runtime | Quadlet .container generator is unreliable on Podman 4.3.1; forced a fallback to a Type=simple systemd unit |
| TD-020 | Vault connectivity | lushycorp-api.ts.tazlab.net is a runtime-defined TLS alias, not Tailscale-owned MagicDNS |
| TD-021 | Vault recovery | Bootstrap anchor gap: canonical files are absent even when the S3 lineage is coherent |
| TD-023 | Proxmox host | BIOS/WoL auto-power-on is not configured; after AC power loss, the host requires a physical button press to restart (2026-04-29) |
| TD-024 | Operator transport | Tailscale userspace-networking + tailscale nc transport bug (Closed). Resolved by switching to hostnet+TUN mode. The create pipeline now takes ~344s. |
| TD-028 | Bootstrap | Engine layer still creates unused Infisical resources, which is wasteful and confusing; needs cleanup once the Vault migration is fully validated |
| TD-029 | Bootstrap | User-managed CoreDNS is deployed by the Terraform engine layer, making it a single point of failure if the engine layer fails |
| TD-030 | Bootstrap | Post-bootstrap PGO→Vault password sync leaves a window during which Grafana cannot start; eliminated by the Phase 1 Vault Injector |
Relationships
- GitOps layer: tazlab-k8s
- Operator environment: tazpod
- Semantic memory: mnemosyne-mcp-server
- AI agent service: hermes — Hermes Agent LXC deployment inside this repo