Ephemeral Castle Cluster Bootstrap
Scope
This page documents the active tazlab-k8s one-shot bootstrap flow inside ephemeral-castle/.
Current Synthesis
clusters/tazlab-k8s/proxmox/create.sh is the high-level rebirth orchestrator. It mints bootstrap credentials, applies the Terragrunt foundation layers, then drives Flux convergence and post-bootstrap validation.
No Ansible, no vault-configurator pod. All Vault configuration is applied through the Terraform Vault provider, connecting directly over Tailscale.
Bootstrap Sequence
1. Secret Minting
- resolve Infisical, Proxmox, GitHub, Tailscale operator secrets from /home/tazpod/secrets
- read Vault root token + CA cert from ~/secrets/lushycorp-vault/
- read bootstrap token from ~/secrets/bootstrap-token.txt
- validate VAULT_CA_CRT is non-empty before Phase 1
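The non-empty check in the last step can be sketched as a small guard function. This is a minimal sketch, not the actual code in create.sh: the helper name require_nonempty and the CA file name in the usage comment are assumptions made for illustration.

```shell
# Hypothetical helper (not taken from create.sh): fail fast when a
# required bootstrap value is empty.
# $1 = human-readable name of the value, $2 = the value itself
require_nonempty() {
    if [ -z "$2" ]; then
        echo "ERROR: $1 is empty; aborting before Phase 1" >&2
        return 1
    fi
    return 0
}

# Assumed usage inside the orchestrator (the file name ca.crt is illustrative):
#   VAULT_CA_CRT="$(cat "$HOME/secrets/lushycorp-vault/ca.crt")"
#   require_nonempty VAULT_CA_CRT "$VAULT_CA_CRT" || exit 1
```

Failing here keeps a missing certificate from surfacing later as an opaque TLS error inside a Terragrunt layer.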
2. Terragrunt Foundation (Phase 1)
- secrets: tls_private_key for the SA signing keypair (outputs the public and private key)
- persistent/vault-jwt-config: JWT auth backend configured on the persistent Vault (root token)
- platform: Proxmox VMs + Talos machine config (with serviceAccount.key)
- cluster health check (CM status + node Ready)
- engine: namespace + random_password x4 + Secret Adoption + vault secrets (vault-ca-cert, vault-eso-token, tailscale-operator-oauth)
- networking + gitops + storage in parallel after engine
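The parallel fan-out after engine can be sketched with background jobs and wait. This is an illustrative sketch under assumptions: run_parallel_layers and apply_layer are hypothetical names, and the per-layer log file naming is a guess; only the layer names and the log directory /workspace/logs/dag-fix/ come from this page.

```shell
# run_parallel_layers LOG_DIR LAYER...
# Starts one background job per layer, writes each job's output to
# LOG_DIR/<layer>.log, then waits for all jobs. Returns non-zero if
# any layer failed. apply_layer is a hypothetical function that the
# caller must define to apply a single layer.
run_parallel_layers() {
    log_dir=$1
    shift
    mkdir -p "$log_dir"
    pids=""
    for layer in "$@"; do
        apply_layer "$layer" >"$log_dir/$layer.log" 2>&1 &
        pids="$pids $!"
    done
    rc=0
    for pid in $pids; do
        # wait returns the exit status of that specific job
        wait "$pid" || rc=1
    done
    return "$rc"
}

# Assumed usage (the live/<layer> directory layout is an assumption):
#   apply_layer() { (cd "live/$1" && terragrunt apply -auto-approve); }
#   run_parallel_layers /workspace/logs/dag-fix networking gitops storage
```

Waiting on each PID individually, rather than a bare wait, is what lets the function report a failure in any single layer.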
3. Flux Reconciliation (Phase 2)
- reconcile flux-system source
- reconcile core infrastructure kustomizations (namespaces cert-manager traefik tailscale VSO configs instances apps)
- kubectl wait kustomization/apps-static -n flux-system --for=condition=Ready --timeout=600s
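Phase 2 can be sketched as a short function with a dry-run switch. This is a hedged sketch, not the code in create.sh: the function names run and flux_phase2 are hypothetical, the source is assumed to be a GitRepository named flux-system, and the kustomization names are passed in by the caller because their exact names are not spelled out here.

```shell
# run CMD...: execute CMD, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# flux_phase2 KUSTOMIZATION...
# Reconciles the source, then each named kustomization in order,
# then blocks until apps-static reports Ready.
flux_phase2() {
    # Assumption: the source is a GitRepository called flux-system.
    run flux reconcile source git flux-system -n flux-system || return 1
    for ks in "$@"; do
        run flux reconcile kustomization "$ks" -n flux-system || return 1
    done
    run kubectl wait kustomization/apps-static -n flux-system \
        --for=condition=Ready --timeout=600s
}

# Assumed usage (kustomization names here are illustrative):
#   DRY_RUN=1; export DRY_RUN
#   flux_phase2 namespaces cert-manager traefik
```

The dry-run mode makes it possible to review the exact command sequence without touching a cluster.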
4. Post-Bootstrap Validation
- Privilege enforcement: master pod wait, pg_is_in_recovery() polling, ALTER ROLE
- Vault database engine: vault secrets enable -path=database database, auto-import, terragrunt apply vault-db-config
- Smoke test: vault read database/creds/grafana
- LB IP wait: Traefik LoadBalancer IP
- check-blog.sh: HTTPS + Hugo marker string
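Several of the validation steps above are waits (master pod, pg_is_in_recovery(), LoadBalancer IP). A generic polling helper is one way to express them. This is a minimal sketch: poll_until and primary_ready are hypothetical names, and the pod variable, namespace choice, and psql invocation in the usage comment are assumptions rather than what create.sh actually runs.

```shell
# poll_until TRIES DELAY CMD...
# Runs CMD until it succeeds, at most TRIES times, sleeping DELAY
# seconds between attempts. Returns 0 on the first success and 1 if
# every attempt failed.
poll_until() {
    tries=$1
    delay=$2
    shift 2
    attempt=1
    while :; do
        if "$@"; then
            return 0
        fi
        if [ "$attempt" -ge "$tries" ]; then
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Assumed usage: wait until PostgreSQL reports it is not in recovery.
# MASTER_POD and the tazlab-db namespace are illustrative here.
#   primary_ready() {
#       kubectl exec -n tazlab-db "$MASTER_POD" -- \
#           psql -U postgres -tAc 'SELECT pg_is_in_recovery()' | grep -qx f
#   }
#   poll_until 30 5 primary_ready
```

Bounding the number of attempts keeps a stuck database from hanging the whole rebirth run indefinitely.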
5. Convergence (Post-create)
- 21/21 Flux kustomizations True (automatic)
- VSO reconciles VaultDynamicSecret without manual intervention
Current DAG Order
secrets -> vault-jwt-config -> vault-db-config -> platform -> cluster health -> engine -> [networking + gitops + storage] (parallel) -> Flux convergence
Important Implementation Details
networking, gitops, and storage layers run in parallel after engine
vault-jwt-config and vault-db-config are persistent layers (prevent_destroy=true, preserved by destroy.sh)
terragrunt cache: the root include has extra_arguments init_reconfigure + disable_dependency_optimization = true
the tailscale, external-secrets, and tazlab-db namespaces are created by the k8s-engine module (before Flux)
create.sh logs parallel layer output to /workspace/logs/dag-fix/
check-blog.sh verifies HTTPS and looks for the Hugo marker
precision-test.sh and stress-test.sh exercise the rebirth cycle
Source Basis
- clusters/tazlab-k8s/proxmox/create.sh
- clusters/tazlab-k8s/proxmox/destroy.sh
- clusters/tazlab-k8s/proxmox/check-blog.sh
- clusters/tazlab-k8s/live/terragrunt.hcl