Source:
oceanid/cluster/README.md| ✏️ Edit on GitHub
Oceanid Cluster Bootstrap
This Pulumi project bootstraps the K3s Kubernetes cluster with foundational infrastructure components. It manages in-cluster resources and requires kubeconfig access.
Scope
Managed Resources:
- Flux CD: GitOps continuous deployment
- Pulumi Kubernetes Operator (PKO): In-cluster Pulumi deployments
- Cloudflare Tunnel (cloudflared): Secure k8s API ingress
- Node Tunnels: Per-node cloudflared daemonsets (optional)
- Image Automation: Automated dependency updates
- Label Studio: Annotation platform (k8s deployment)
- Triton Adapter: ML inference bridge
NOT Managed Here:
- Cloud resources (DNS, Access, databases) → See
../cloud/ - Application workloads → Managed by Flux in
../clusters/
Important: Deployment Model
This stack CANNOT run in GitHub Actions due to kubeconfig requirements.
Local Deployment (Current)
cd cluster/
# Install dependencies
pnpm install
# Build TypeScript
pnpm build
# Preview changes
pulumi preview
# Deploy changes
pulumi up
Prerequisites:
- SSH tunnel to K3s control plane (see CLAUDE.md)
- Valid kubeconfig at
~/.kube/k3s-config.yaml - Pulumi CLI authenticated
Self-Hosted Deployment (Future)
For automated cluster bootstrap, use Pulumi Deployments with a self-hosted agent:
# On a machine with cluster access (e.g., tethys control plane)
# 1. Install Pulumi Deployments agent
pulumi deployments agent install \
--token $(op read "op://Infrastructure/Pulumi Agent/token") \
--pool oceanid-cluster
# 2. Configure agent to run on startup
sudo systemctl enable pulumi-deployments-agent
sudo systemctl start pulumi-deployments-agent
# 3. Update stack settings in Pulumi Cloud console
# Settings → Deployments → Enable
# Deployment pool: oceanid-cluster
# Trigger: Git push to main
# Work directory: cluster/
Benefits:
- Automated deployments on git push
- Audit trail in Pulumi Cloud
- No local CLI dependencies
- Centralized secret management
Tradeoffs:
- Agent must maintain cluster access
- Requires paid Pulumi tier (or free tier quota)
- Additional operational complexity
See Pulumi Deployments docs for setup details.
Stack Configuration
Stack: ryan-taylor/oceanid-cluster/prod
ESC Environment: default/oceanid-cluster (shared with cloud project)
Required Configuration
# Node IP addresses
pulumi config set tethysIp 157.173.210.123
pulumi config set styxIp 191.101.1.3
# Cloudflare tunnel credentials
pulumi config set --secret cloudflare_tunnel_token <token>
pulumi config set --secret cloudflare_node_tunnel_token <token>
# GitHub token for Flux
pulumi config set --secret github_token <token>
# Database connection (from cloud stack)
pulumi config set --secret postgres_url <url>
CI Guard
This stack includes a runtime guard to prevent accidental execution in GitHub Actions:
if (process.env.CI === "true" && process.env.GITHUB_ACTIONS === "true") {
throw new Error("CLUSTER STACK CANNOT RUN IN GITHUB ACTIONS");
}
If you see this error in CI, you likely meant to modify the cloud/ stack instead.
Architecture
Bootstrap Flow
- Flux CD: Bootstraps GitOps controller watching
clusters/tethys/ - PKO: Enables in-cluster Pulumi operations (future use)
- Cloudflare Tunnel: Exposes k8s API at
k3s.boathou.se - Node Tunnels: Optional per-node access (GPU, wildcard pods)
- Image Automation: Watches registries for dependency updates
- Applications: Label Studio, Triton adapter, etc.
Dependencies
Flux (GitOps engine)
↓
PKO (in-cluster IaC)
↓
Cloudflare Tunnel (ingress)
↓
Applications (Label Studio, Triton)
Stack Outputs
export const kubeconfigPath: string // Path to kubeconfig file
export const fluxNamespace: string // flux-system
export const pkoNamespace: string // pulumi-system
export const cloudflareNamespace: string // cloudflared
export const labelStudioHostname: string // label.boathou.se
export const migrationStatus: object // Component migration state
View outputs:
pulumi stack output kubeconfigPath
pulumi stack output labelStudioHostname
GitOps Workflow
This stack bootstraps GitOps; applications deploy via Flux:
- Modify manifests in
clusters/tethys/ - Commit and push to
main - Flux reconciles automatically (default: 1 minute)
- Monitor:
flux get all -A
Force reconciliation:
flux reconcile source git flux-system
flux reconcile kustomization apps
Common Operations
Update K3s Node Configuration
# Edit node IPs or labels in src/index.ts
vim src/index.ts
# Build and deploy
pnpm build
pulumi up
Add New Cloudflare Ingress Rule
# Edit extraIngress in src/index.ts
# Add to CloudflareTunnel instantiation
# Deploy
pnpm build
pulumi up
Rotate Secrets
# Update in Pulumi config (encrypts automatically)
pulumi config set --secret github_token <new_token>
# Apply changes
pulumi up
Troubleshooting
"Unable to connect to cluster"
# Verify SSH tunnel is active
lsof -ti:16443
# If not, establish tunnel
ssh -L 16443:localhost:6443 tethys -N &
# Verify kubeconfig
export KUBECONFIG=~/.kube/k3s-config.yaml
kubectl get nodes
"Resource already exists"
If Pulumi fails with duplicate resources:
# Import existing resource
pulumi import <type> <name> <id>
# Or delete from state if no longer needed
pulumi state delete <URN>
Flux Not Reconciling
# Check Flux health
flux check
# View reconciliation logs
kubectl logs -n flux-system deploy/source-controller
kubectl logs -n flux-system deploy/kustomize-controller
# Force reconciliation
flux reconcile source git flux-system --with-source
Migration Notes
2025-01-30: Cloud resources migrated to separate oceanid-cloud stack:
- Cloudflare DNS records →
cloud/ - CrunchyBridge database →
cloud/ - Cloudflare Access apps →
cloud/
This stack now focuses exclusively on k8s cluster bootstrap.
See Also
- Cloud Infrastructure - DNS, Access, databases
- CLAUDE.md - Cluster connection guide