TazLab K8s: Flux Kustomizations Detail
Level 3 (Detail) — All 15 Flux Kustomizations with exact spec, dependencies, and health checks.
Concept
Flux watches the clusters/tazlab-k8s/ directory for Kustomization resources. Each Kustomization points to a path in the repository and defines its dependencies, interval, and post-build substitution. Flux applies them in dependency order.
DAG Overview
infrastructure-operators-namespaces ──┬── infrastructure-bridge ────────── infrastructure-configs ──┬── infrastructure-instances ── infrastructure-auth
infrastructure-operators-core ────────┘                                                             ├── apps-static (hugo-blog)
infrastructure-operators-data ────────┴── infrastructure-monitoring                                 ├── apps-static-wiki (hugo-wiki)
                                                                                                    ├── apps-data (mnemosyne-mcp)
infrastructure-tailscale ─────────────── infrastructure-operators-tailscale ── infrastructure-tailscale-dns
Note: The Tailscale chain (infrastructure-tailscale → infrastructure-operators-tailscale → infrastructure-tailscale-dns) is an independent branch: its Kustomizations depend only on one another, not on any core infrastructure Kustomization. Tailscale DNS resolution is orthogonal to cluster core operations.
Kustomization Inventory
All Kustomizations share:
- namespace: flux-system
- sourceRef: kind: GitRepository, name: flux-system
- postBuild.substituteFrom → ConfigMap cluster-vars (provisioned by ephemeral-castle)
- interval: 1h
- prune: true
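The shared fields can be sketched as a manifest skeleton. This is a minimal sketch rather than the repository's actual file: the `metadata.name` and `path` values are placeholders, and the `apiVersion` assumes the Flux GA Kustomization API.

```yaml
# Skeleton of the fields shared by every Kustomization in this inventory.
# metadata.name and spec.path are placeholders; apiVersion is an assumption.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: example-kustomization      # placeholder
  namespace: flux-system
spec:
  interval: 1h
  prune: true
  path: ./example/path             # placeholder
  sourceRef:
    kind: GitRepository
    name: flux-system
  postBuild:
    substituteFrom:
      - kind: ConfigMap
        name: cluster-vars         # provisioned by ephemeral-castle
```

The per-Kustomization tables below only list what differs from this skeleton (path, wait, dependsOn, timeout, retryInterval, healthChecks).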
1. infrastructure-operators-namespaces
| Field | Value |
|---|---|
| Path | ./infrastructure/operators/namespaces |
| Wait | true |
| HealthChecks | DaemonSet kube-flannel (kube-system), Deployment coredns (kube-system) |
| DependsOn | none (root) |
| Timeout | 5m |
Declares the ai-agents namespace. Other namespaces (cert-manager, traefik, monitoring, tazlab-db, hugo-blog, hugo-wiki, dex, auth, reloader, cloudflare-ddns) are declared inline in each operator’s folder, following the Option A.1 pattern.
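Taken together with the shared fields, the table above might translate into a manifest like the following. This is a sketch assembled from the listed values, not a copy of the repository file; the `apiVersion` values are assumptions.

```yaml
# Sketch of the root Kustomization with explicit health checks.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: infrastructure-operators-namespaces
  namespace: flux-system
spec:
  interval: 1h
  timeout: 5m
  prune: true
  wait: true
  path: ./infrastructure/operators/namespaces
  sourceRef:
    kind: GitRepository
    name: flux-system
  # No dependsOn: this is a root of the DAG.
  healthChecks:
    - apiVersion: apps/v1
      kind: DaemonSet
      name: kube-flannel
      namespace: kube-system
    - apiVersion: apps/v1
      kind: Deployment
      name: coredns
      namespace: kube-system
  postBuild:
    substituteFrom:
      - kind: ConfigMap
        name: cluster-vars
```

One point worth verifying against the Flux documentation: it states that `.spec.healthChecks` is ignored when `.spec.wait` is `true`, so only one of the two settings may actually take effect here.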
2. infrastructure-operators-core
| Field | Value |
|---|---|
| Path | ./infrastructure/operators/core |
| Wait | true |
| DependsOn | none (root) |
| Timeout | 10m |
| RetryInterval | 2m |
Installs all core operators via HelmReleases:
- cert-manager (v1.16.2)
- traefik (v34.0.0)
- reloader (v1.2.1)
- dex
- auth (OAuth2 Proxy)
- cloudflare-ddns
- tazlab-db namespace declaration
- hugo-blog namespace declaration
- hugo-wiki namespace declaration
3. infrastructure-operators-data
| Field | Value |
|---|---|
| Path | ./infrastructure/operators/data |
| Wait | true |
| DependsOn | none (root) |
| Timeout | 10m |
| RetryInterval | 2m |
Installs:
- postgres-operator (Crunchy PGO v5.7.2)
4. infrastructure-tailscale
| Field | Value |
|---|---|
| Path | ./infrastructure/tailscale |
| Wait | true |
| HealthChecks | Deployment tailscale-operator (tailscale) |
| DependsOn | none (root) |
| Timeout | 5m |
| RetryInterval | 1m |
Creates the tailscale namespace, provisions the OAuth ExternalSecret (k8s_operator client), and defines the HelmRepository for https://pkgs.tailscale.com/helmcharts. Layer 1 of the Tailscale Operator 3-layer DAG.
5. infrastructure-operators-tailscale
| Field | Value |
|---|---|
| Path | ./infrastructure/operators/tailscale |
| Wait | true |
| DependsOn | infrastructure-tailscale |
| Timeout | 5m |
| RetryInterval | 1m |
Installs the Tailscale Operator HelmRelease (v1.96.5, tailscale/tailscale-operator). Layer 2 of the 3-layer DAG — only applies after the namespace, Secret, and HelmRepository exist.
6. infrastructure-tailscale-dns
| Field | Value |
|---|---|
| Path | ./infrastructure/tailscale-dns |
| Wait | true |
| DependsOn | infrastructure-operators-tailscale |
| Timeout | 5m |
| RetryInterval | 1m |
Deploys the hostNetwork CoreDNS relay DaemonSet (port 5353) for magellanic-gondola.ts.net resolution, static ClusterIP Service (10.96.0.101), and patches the coredns ConfigMap with a tailnet forwarding zone. Layer 3 — only applies after the Operator CRDs are available.
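The layering is expressed purely through `dependsOn`. A minimal sketch of Layers 2 and 3, built from the tables above (Layer 1 has no `dependsOn`; the `apiVersion` is an assumption):

```yaml
# Layer 2: the Operator HelmRelease, applied only after Layer 1 is Ready.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: infrastructure-operators-tailscale
  namespace: flux-system
spec:
  interval: 1h
  retryInterval: 1m
  timeout: 5m
  prune: true
  wait: true
  path: ./infrastructure/operators/tailscale
  sourceRef:
    kind: GitRepository
    name: flux-system
  dependsOn:
    - name: infrastructure-tailscale
---
# Layer 3: DNS relay resources, applied only after the Operator is Ready.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: infrastructure-tailscale-dns
  namespace: flux-system
spec:
  interval: 1h
  retryInterval: 1m
  timeout: 5m
  prune: true
  wait: true
  path: ./infrastructure/tailscale-dns
  sourceRef:
    kind: GitRepository
    name: flux-system
  dependsOn:
    - name: infrastructure-operators-tailscale
```

The shared `postBuild.substituteFrom` block is omitted here for brevity.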
7. infrastructure-bridge
| Field | Value |
|---|---|
| Path | ./infrastructure/cluster-bridge |
| Wait | true |
| DependsOn | infrastructure-operators-core, infrastructure-operators-namespaces |
| Timeout | 5m |
Aggregator for ./infrastructure/bridge. Configures:
- IngressClass traefik
- ClusterIssuer tazlab-issuer (Let’s Encrypt prod, HTTP01)
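For orientation, a Let’s Encrypt production issuer with an HTTP01 solver bound to the traefik IngressClass typically looks like the sketch below. The email address and the account-key Secret name are hypothetical placeholders, not values taken from the repository.

```yaml
# Sketch of an ACME ClusterIssuer using the HTTP01 challenge via Traefik.
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: tazlab-issuer
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory   # Let's Encrypt prod
    email: admin@example.com                # placeholder
    privateKeySecretRef:
      name: tazlab-issuer-account-key       # hypothetical Secret name
    solvers:
      - http01:
          ingress:
            ingressClassName: traefik
```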
8. infrastructure-monitoring
| Field | Value |
|---|---|
| Path | ./infrastructure/operators/monitoring |
| Wait | not set (Flux default: false) |
| DependsOn | infrastructure-operators-namespaces |
| Timeout | 10m |
| RetryInterval | 2m |
Installs:
- kube-prometheus-stack (HelmRelease)
- metrics-server
- Grafana dashboards as ConfigMaps (cluster-health, nodes-pro)
- Grafana Ingress
- flux-secret-sync (syncs Grafana credential from ExternalSecret)
9. infrastructure-configs
| Field | Value |
|---|---|
| Path | ./infrastructure/configs |
| Wait | true |
| DependsOn | infrastructure-bridge |
| Timeout | 5m |
Deploys all ExternalSecrets and static configuration:
- cert-manager: Cloudflare API token + DNS01 ClusterIssuer
- wildcard-tls: *.tazlab.net TLS cert via ExternalSecret
- hugo-wiki: namespace-specific ExternalSecrets
- tazlab-db: S3 backup credentials
- dex: OIDC client credentials
- ai-agents: OpenClaw gateway / Telegram / OpenAI / ElevenLabs tokens
- github-external-secret: GitHub token for Flux image automation
10. infrastructure-instances
| Field | Value |
|---|---|
| Path | ./infrastructure/cluster-instances |
| Wait | false |
| DependsOn | infrastructure-configs, infrastructure-operators-data |
| Timeout | 10m |
| RetryInterval | 2m |
| SubstituteFrom (extra) | Secret grafana-bootstrap-secret |
Aggregator for ./infrastructure/instances plus all 4 image automation pipelines. Uses wait: false because some workloads (tazlab-db, longhorn, apps with init containers) can take variable time to become ready. Kubernetes handles Pending/Init states naturally.
Deploys:
- tazlab-db: PostgresCluster (Crunchy PGO, 1 replica, 4Gi, S3 backup)
- traefik: LoadBalancer Service (IP 192.168.1.240)
- longhorn: Ingress + Service
- dex: Deployment + Service + Ingress + ConfigMap
- pgadmin: Deployment + PVC + Service
- homepage: Dashboard UI
- cloudflare-ddns: Deployment + ExternalSecret
Plus image automation for:
- hugo-blog
- hugo-wiki
- mnemosyne-mcp
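The two distinguishing features of this Kustomization, `wait: false` and the extra `substituteFrom` source, could be written as follows. This is a sketch reconstructed from the table; the `apiVersion` and the ordering of the `substituteFrom` entries are assumptions.

```yaml
# Sketch: non-blocking Kustomization with an additional substitution source.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: infrastructure-instances
  namespace: flux-system
spec:
  interval: 1h
  retryInterval: 2m
  timeout: 10m
  prune: true
  wait: false                      # do not gate readiness on workload health
  path: ./infrastructure/cluster-instances
  sourceRef:
    kind: GitRepository
    name: flux-system
  dependsOn:
    - name: infrastructure-configs
    - name: infrastructure-operators-data
  postBuild:
    substituteFrom:
      - kind: ConfigMap
        name: cluster-vars
      - kind: Secret
        name: grafana-bootstrap-secret
```

With `wait: false`, the Kustomization is reported Ready once its manifests are applied, so dependents such as infrastructure-auth are not held back by slow-starting pods.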
11. apps-static (hugo-blog)
| Field | Value |
|---|---|
| Path | ./apps/cluster/hugo-blog |
| Wait | not set (Flux default: false) |
| DependsOn | infrastructure-configs |
| Timeout | 5m |
Deploys hugo-blog from apps/base/hugo-blog/:
- nginx serving static Hugo build
- Certificate CR for blog.tazlab.net
- Traefik Ingress + middlewares
- Redirect middleware (tazlab.net → blog.tazlab.net)
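A redirect of this kind is usually expressed with Traefik's `redirectRegex` middleware. The sketch below is illustrative only: the middleware name, the namespace, and the exact regex are assumptions, and the `apiVersion` assumes the Traefik v3 CRD group.

```yaml
# Sketch of an apex-to-blog redirect middleware (names are hypothetical).
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: redirect-apex-to-blog      # hypothetical name
  namespace: hugo-blog             # assumed namespace
spec:
  redirectRegex:
    regex: ^https?://tazlab\.net/(.*)
    # "$1" rather than "${1}": Flux postBuild substitution treats ${...}
    # as a variable reference, so the brace form would need escaping.
    replacement: https://blog.tazlab.net/$1
    permanent: true
```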
12. apps-static-wiki (hugo-wiki)
| Field | Value |
|---|---|
| Path | ./apps/cluster/hugo-wiki |
| Wait | not set (Flux default: false) |
| DependsOn | infrastructure-configs |
| Timeout | 5m |
Deploys hugo-wiki from apps/base/hugo-wiki/:
- nginx serving static Hugo wiki build
- Ingress for wiki.tazlab.net
- Wildcard TLS via ExternalSecret (same cert chain as blog)
13. apps-data (mnemosyne-mcp)
| Field | Value |
|---|---|
| Path | ./apps/cluster/mnemosyne-mcp |
| Wait | not set (Flux default: false) |
| DependsOn | infrastructure-configs |
| Timeout | 5m |
Deploys mnemosyne-mcp from apps/base/mnemosyne-mcp/:
- Go MCP server (tazzo/mnemosyne-mcp)
- LoadBalancer Service port 8004 → 8080
- ExternalSecret for GEMINI_API_KEY
- wait-for-db initContainer patch
- RBAC for secret reading
- Reloader annotation on tazlab-db-pguser-mnemosyne + mnemosyne-mcp-secrets
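The Reloader annotation in the last bullet can be sketched as a strategic-merge patch fragment. The Deployment name is assumed to match the app name, and only the annotation is shown.

```yaml
# Sketch: restart the Deployment when either Secret changes.
# The Deployment name is an assumption; other fields are omitted.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mnemosyne-mcp
  annotations:
    secret.reloader.stakater.com/reload: "tazlab-db-pguser-mnemosyne,mnemosyne-mcp-secrets"
```

Listing the Secrets explicitly limits restarts to changes in those two objects, instead of reacting to every Secret the pod references.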
| Field | Value |
|---|---|
| Wait | true |
| DependsOn | infrastructure-configs |
| Timeout | 5m |
| RetryInterval | 1m |
- PVCs for config and workspace (5Gi + 10Gi, tazlab-storage)
- MetalLB LoadBalancer IP (192.168.1.242)
- ExternalSecrets for gateway token, telegram, OpenAI, ElevenLabs
15. infrastructure-auth
| Field | Value |
|---|---|
| Path | ./infrastructure/auth |
| Wait | true |
| DependsOn | infrastructure-instances |
| Timeout | 5m |
Deploys:
- OAuth2 Proxy Deployment + Service + Ingress
- ForwardAuth middleware
- RBAC for the oauth2-proxy ServiceAccount
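A ForwardAuth middleware in front of OAuth2 Proxy commonly takes the shape below. This is a hedged sketch: the middleware name, the Service address, and the forwarded header list are assumptions (4180 is OAuth2 Proxy's default port), and the `apiVersion` assumes the Traefik v3 CRD group.

```yaml
# Sketch of a Traefik ForwardAuth middleware delegating to OAuth2 Proxy.
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: forward-auth               # hypothetical name
  namespace: auth
spec:
  forwardAuth:
    # Hypothetical in-cluster address of the oauth2-proxy Service.
    address: http://oauth2-proxy.auth.svc.cluster.local:4180/oauth2/auth
    trustForwardHeader: true
    authResponseHeaders:           # assumed header set
      - X-Auth-Request-User
      - X-Auth-Request-Email
```

An Ingress or IngressRoute then references this middleware to require authentication before a request reaches the protected service.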
Wait Policy Truth Table
| Kustomization | wait | healthChecks | Effect |
|---|---|---|---|
| infrastructure-operators-namespaces | true | flannel + coredns | Blocks dependents until CNI + DNS ready |
| infrastructure-operators-core | true | — | Blocks until core HelmReleases installed |
| infrastructure-operators-data | true | — | Blocks until PGO installed |
| infrastructure-tailscale | true | tailscale-operator | Blocks until Layer 1 ready |
| infrastructure-operators-tailscale | true | — | Blocks until Layer 2 HelmRelease installed |
| infrastructure-tailscale-dns | true | — | Blocks until Layer 3 DNS resources ready |
| infrastructure-bridge | true | — | Blocks until IngressClass + Issuer ready |
| infrastructure-monitoring | not set (false) | — | Ready once applied; no health gating |
| infrastructure-configs | true | — | Blocks until secrets available |
| infrastructure-instances | false | — | Non-blocking: pods handle Pending/Init naturally |
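| apps-static | not set (false) | — | Ready once applied; no health gating |
| apps-static-wiki | not set (false) | — | Ready once applied; no health gating |
| apps-data | not set (false) | — | Ready once applied; no health gating |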
| infrastructure-auth | true | — | Blocks until Dex + OAuth2 healthy |
DAG Integrity Rules
- Namespaces must exist before their operators are installed
- ClusterIssuer / IngressClass must exist before ExternalSecrets can reference them
- ExternalSecrets must be available before workloads that consume them
- PGO must be installed before PostgresCluster CRs are applied
- Infrastructure instances must be applied before the auth layer that depends on them
- Never apply instances before their operator is ready; dependsOn enforces this
See Also
- Parent hub: tazlab-k8s
- Sibling topics: Flux DAG, Repository Mapping
- Sibling details: Image Automation Detail
- Reference: Kustomization Example