Tailscale: Operator Connectivity

Level 2 (Topic) — Userspace daemon inside TazPod, host network + TUN mode, auth key minting, restart requirement.

Concept

The operator runs Tailscale inside the TazPod container. Two modes exist:

  • Legacy: userspace-networking daemon managed by AGENTS.ctx/tools/tailscale/start.sh (deprecated).
  • Current: host network + TUN mode (--network host, --cap-add NET_ADMIN, /dev/net/tun), started by the in-image helper tazpod-tailscale-up.

TUN mode is now the default. It avoids the tailscale nc transport bugs that caused apt install workloads to fail over the userspace path.

Userspace Mode

In legacy mode, Tailscale runs with --tun=userspace-networking because the Docker container has no access to /dev/net/tun, which kernel-mode WireGuard requires. This limits raw throughput but is sufficient for SSH, API access, and mesh connectivity.

sudo -n tailscaled --tun=userspace-networking \
  --state="${STATE_FILE}" --socket="${SOCKET_FILE}"

State is persisted in AGENTS.ctx/tools/tailscale/state/ to survive container restarts.

Auth Key Minting

start.sh mints a short-lived auth key before bringing the interface up:

  1. Reads the OAuth client credentials from ~/secrets/tailscale-oauth-client-id and ~/secrets/tailscale-oauth-client-secret
  2. Calls POST https://api.tailscale.com/api/v2/oauth/token to exchange them for an access token (the OAuth client-credentials grant is a POST, not a GET)
  3. Calls POST https://api.tailscale.com/api/v2/tailnet/-/keys to create a 1-hour auth key carrying the tag tag:tazpod
  4. Falls back to a Tailscale API key if the OAuth client is not configured

The key is reusable, ephemeral, and pre-authorized.
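
The minting steps above can be sketched roughly as follows. This is an illustration, not the contents of start.sh: the function names build_key_request and mint_auth_key are hypothetical, jq is assumed to be installed for reading the JSON responses, and the request body follows the public Tailscale key-creation API. The API-key fallback is omitted.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the auth key minting flow; not the real start.sh.

API="https://api.tailscale.com/api/v2"

# Build the JSON body for a reusable, ephemeral, pre-authorized key.
# Defaults: tag:tazpod and a 3600-second (1-hour) lifetime.
build_key_request() {
  local tag="${1:-tag:tazpod}" ttl="${2:-3600}"
  printf '{"capabilities":{"devices":{"create":{"reusable":true,"ephemeral":true,"preauthorized":true,"tags":["%s"]}}},"expirySeconds":%s}\n' \
    "$tag" "$ttl"
}

# Exchange the OAuth client credentials for an access token, then create the key.
# Prints the auth key on stdout. Requires network access, curl, and jq.
mint_auth_key() {
  local id secret token
  id="$(cat "$HOME/secrets/tailscale-oauth-client-id")" || return 1
  secret="$(cat "$HOME/secrets/tailscale-oauth-client-secret")" || return 1

  # curl -d sends a form-encoded POST to the token endpoint.
  token="$(curl -fsS -d "client_id=${id}" -d "client_secret=${secret}" \
    "${API}/oauth/token" | jq -r '.access_token')" || return 1

  curl -fsS -X POST \
    -H "Authorization: Bearer ${token}" \
    -H "Content-Type: application/json" \
    -d "$(build_key_request)" \
    "${API}/tailnet/-/keys" | jq -r '.key'
}
```

A caller would typically capture the result with AUTH_KEY="$(mint_auth_key)" and pass it to tailscale up --auth-key="$AUTH_KEY". Keeping the payload in its own function makes the key properties (reusable, ephemeral, pre-authorized, 1 hour) easy to inspect without touching the network.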

After Container Restart

The daemon can be restarted manually:

AGENTS.ctx/tools/tailscale/start.sh

Since 2026-05-27, .bashrc also auto-starts the daemon: on first shell entry, if tailscale status fails (socket not responding), it launches start.sh in the background via setsid. Later shells skip the launch because the singleton check passes; they only run update-hosts.sh silently to refresh /etc/hosts.
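
The auto-start behaviour described above might look roughly like this .bashrc fragment. It is a sketch under assumptions: the function names are hypothetical, update-hosts.sh is assumed to be reachable on PATH (the source does not give its location), and the real singleton check may differ.

```shell
# Hypothetical .bashrc fragment; names and paths other than start.sh are assumptions.
START_SH="${START_SH:-AGENTS.ctx/tools/tailscale/start.sh}"
UPDATE_HOSTS_SH="${UPDATE_HOSTS_SH:-update-hosts.sh}"

# Singleton check: succeeds only when the daemon socket answers.
ts_is_up() {
  tailscale status >/dev/null 2>&1
}

# Decide what a new shell should do: "start" the daemon or "refresh" /etc/hosts.
autostart_action() {
  if ts_is_up; then
    echo refresh
  else
    echo start
  fi
}

# Act on the decision without blocking or cluttering the interactive shell.
tazpod_autostart() {
  case "$(autostart_action)" in
    start)
      # setsid detaches start.sh from the terminal; the subshell hides job output.
      ( setsid "$START_SH" >/dev/null 2>&1 </dev/null & )
      ;;
    refresh)
      "$UPDATE_HOSTS_SH" >/dev/null 2>&1
      ;;
  esac
}
```

Calling tazpod_autostart at the end of .bashrc would reproduce the described flow: the first shell launches the daemon, later shells only refresh the hosts file.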

The daemon log is at AGENTS.ctx/tools/tailscale/logs/tailscaled.log.

Verification

After daemon start, verify connectivity:

tailscale status
tailscale ping lushycorp-vault

Host Network + TUN Mode (Current Default)

Since TazPod commit cf0fabf, the container is created with:

  • --network host — container shares the host network stack directly, no Docker bridge
  • --cap-add NET_ADMIN — required for /dev/net/tun access
  • /dev/net/tun device mapped into the container

The start.sh script handles startup:

  • runs tailscaled in kernel/TUN mode when /dev/net/tun available (auto-detected), falls back to --tun=userspace-networking otherwise
  • uses --stateful-filtering=false (critical fix): Tailscale v1.66+ enables stateful filtering by default, which drops UDP DNS packets from Docker bridge interfaces (172.17.0.0/16) to MagicDNS (100.100.100.100). TCP DNS worked because the handshake establishes a tracked connection.
  • uses --accept-dns=false — prevents tailscale from overriding /etc/resolv.conf inside the container, avoiding conflicts with Docker DNS
  • runs update-hosts.sh after joining the tailnet, which syncs all MagicDNS names (*.magellanic-gondola.ts.net) to /etc/hosts via tailscale status --json
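
The startup sequence in this list can be sketched as a dry run that only prints the commands it would execute. The function names tun_flag and plan_startup are hypothetical, the placeholders $STATE_FILE, $SOCKET_FILE, and $AUTH_KEY stand in for values the real script supplies, and passing --stateful-filtering=false to tailscale up is an assumption about where the script sets that preference.

```shell
# Hypothetical dry-run sketch of the startup logic; not the real script.

# Choose the tailscaled --tun argument.
# Prints nothing when a TUN character device exists (kernel/TUN mode is the
# tailscaled default), otherwise prints the userspace fallback flag.
tun_flag() {
  local dev="${1:-/dev/net/tun}"
  if [ -c "$dev" ]; then
    echo ""
  else
    echo "--tun=userspace-networking"
  fi
}

# Print the commands a startup script might run, in order.
plan_startup() {
  local dev="${1:-/dev/net/tun}" flag
  flag="$(tun_flag "$dev")"
  # ${flag:+$flag } expands to the flag plus a space only when flag is non-empty.
  echo "sudo -n tailscaled ${flag:+$flag }--state=\$STATE_FILE --socket=\$SOCKET_FILE"
  echo "tailscale up --auth-key=\$AUTH_KEY --accept-dns=false --stateful-filtering=false"
  echo "update-hosts.sh"
}
```

Running plan_startup inside a container without /dev/net/tun shows the userspace fallback on the first line. Note that the device check alone does not prove TUN mode will work; the container also needs NET_ADMIN, as listed above.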

Socket Auto-Detection

create.sh and tazpod-tailscale-up auto-detect the Tailscale socket path:

  • TUN mode: socket at the default system path (/var/run/tailscale/tailscaled.sock)
  • Userspace fallback: socket at AGENTS.ctx/tools/tailscale/state/tailscaled.sock (via TAILSCALE_SOCKET)

The render-tailscale-inventory.sh script also auto-detects TUN availability:

  • when /dev/net/tun exists: generates direct-tailnet SSH inventory (no ProxyCommand)
  • when only bridge/userspace mode: falls back to tailscale nc ProxyCommand
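
The inventory fallback can be illustrated with an ssh_config-style stanza. The output format of render-tailscale-inventory.sh is not shown in this document, so the stanza layout and the function name render_ssh_host are assumptions; only the tailscale nc ProxyCommand itself comes from the text above.

```shell
# Hypothetical sketch: emit an ssh_config-style entry for one tailnet host.
# With a TUN device the host is reached directly; without one, SSH is tunnelled
# through "tailscale nc", where %h and %p are ssh's host and port tokens.
render_ssh_host() {
  local host="$1" dev="${2:-/dev/net/tun}"
  echo "Host ${host}"
  if [ ! -c "$dev" ]; then
    echo "  ProxyCommand tailscale nc %h %p"
  fi
}
```

For example, render_ssh_host lushycorp-vault prints a bare Host entry in TUN mode and adds the ProxyCommand line in bridge or userspace mode.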

DNS Fix — Root Cause (2026-05-28)

Tailscale v1.66+ enables stateful filtering by default. In Docker, DNS UDP packets from container bridge interfaces (172.17.0.0/16) to MagicDNS resolver (100.100.100.100) are seen as external traffic and dropped. TCP DNS worked because the TCP handshake established a tracked connection, but UDP-only DNS queries failed silently.

Fix chain:

  1. --stateful-filtering=false on tailscale up (it is a client preference, not a tailscaled daemon flag) → allows UDP DNS from any interface
  2. --accept-dns=false on tailscale up → prevents /etc/resolv.conf conflicts with Docker
  3. update-hosts.sh → populates /etc/hosts with MagicDNS names from tailscale status --json

Result: curl, the vault CLI, getent hosts, and tailscale ping all resolve *.magellanic-gondola.ts.net natively inside the container.
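
Step 3 of the fix chain can be sketched as a filter that turns tailscale status --json into hosts-file lines. This is not the real update-hosts.sh: the function name status_to_hosts is hypothetical, jq is assumed to be installed, and only the first Tailscale IP of each node is used.

```shell
# Hypothetical sketch of the name extraction behind update-hosts.sh.
# Reads `tailscale status --json` on stdin and prints one "IP name" line per
# node (self plus peers), stripping the trailing dot from each DNS name.
status_to_hosts() {
  jq -r '
    [.Self] + [.Peer[]?]
    | .[]
    | select((.DNSName // "") != "" and ((.TailscaleIPs // []) | length) > 0)
    | "\(.TailscaleIPs[0]) \(.DNSName | rtrimstr("."))"
  '
}
```

A typical invocation would be tailscale status --json | status_to_hosts, with the output merged into /etc/hosts by whatever mechanism the real script uses. Nodes without a DNS name or address are skipped rather than written as broken entries.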

Performance Impact

The full create pipeline (Hetzner Vault) dropped from ~1200s to ~344s after switching to TUN mode. Transport-layer timeouts and UNREACHABLE errors are eliminated.

See Also