Deployment and Usage Guide for buroa/k8s-gitops
1. Prerequisites
Before deploying this GitOps-managed Kubernetes cluster, ensure you have the following installed and configured:
Core Infrastructure:
- Physical or virtual machines (minimum 3 nodes recommended) for a Talos Linux cluster.
- Network access between cluster nodes and to the internet.
- A dedicated storage solution (NAS recommended for media, optional for core cluster storage).
Required Software & Tools:
- talosctl – CLI for managing Talos Linux.
- kubectl – Kubernetes command-line tool.
- flux – CLI for Flux v2.
- age – Encryption tool for secrets (used with Mozilla SOPS).
- sops – Secrets encryption/decryption tool.
Accounts & Services:
- A GitHub account with a repository fork of buroa/k8s-gitops (or your own repository created from the onedr0p/cluster-template template).
- A GitHub Personal Access Token with repo and workflow permissions.
- A Cloudflare account (if using Cloudflare Tunnels and DNS).
- A domain name with DNS managed by Cloudflare (or another provider supported by external-dns).
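Before going further, it can help to confirm that the CLIs listed above are on your PATH. The following is a minimal sketch; the function name missing_tools is an illustrative choice and not part of the repository.

```shell
#!/bin/sh
# Print the name of every tool passed as an argument that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Check the tools named in the prerequisites list.
missing=$(missing_tools talosctl kubectl flux age sops)
if [ -n "$missing" ]; then
  echo "Missing tools:"
  echo "$missing"
else
  echo "All required tools found."
fi
```

The check only verifies that each binary exists; it does not compare versions.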
2. Installation
Step 1: Clone and Prepare the Repository
# Fork the template repository (recommended starting point)
# https://github.com/onedr0p/cluster-template
# Clone your forked repository
git clone https://github.com/your-username/your-cluster-repo.git
cd your-cluster-repo
Step 2: Bootstrap Talos Linux Cluster
- Install Talos Linux on each node:
  - Download the Talos Linux ISO and boot each machine.
  - Generate a Talos configuration:
      talosctl gen config your-cluster-name https://<control-plane-ip>:6443
  - Apply the configuration to each node (controlplane.yaml for control-plane nodes, worker.yaml for workers):
      talosctl apply-config --insecure --nodes <node-ip> --file controlplane.yaml
      talosctl apply-config --insecure --nodes <node-ip> --file worker.yaml
  - Bootstrap the cluster (run once, against a single control-plane node):
      talosctl bootstrap --nodes <control-plane-ip>
- Retrieve kubeconfig:
    talosctl kubeconfig --nodes <control-plane-ip> .
    export KUBECONFIG=$(pwd)/kubeconfig
    kubectl get nodes
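With several nodes, the apply-config step is repetitive. The sketch below loops over node addresses per role and defaults to a dry run that only prints the commands. The helper names (run, apply_role), the DRY_RUN variable, and the 192.0.2.x addresses are assumptions made for illustration; the talosctl flags are the ones used above.

```shell
#!/bin/sh
# Defaults to dry-run; set DRY_RUN=0 to actually execute the commands.
DRY_RUN=${DRY_RUN:-1}

# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Apply one machine config file to every node address that follows it.
apply_role() {
  file=$1
  shift
  for ip in "$@"; do
    run talosctl apply-config --insecure --nodes "$ip" --file "$file"
  done
}

# Placeholder addresses; replace with your own node IPs.
apply_role controlplane.yaml 192.0.2.11 192.0.2.12 192.0.2.13
apply_role worker.yaml 192.0.2.21
```

Reviewing the printed commands first makes it harder to send a worker config to a control-plane node by mistake.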
Step 3: Bootstrap Flux
With your Kubernetes cluster running and kubectl configured, bootstrap Flux to manage the GitOps deployment:
flux bootstrap github \
--owner=your-username \
--repository=your-cluster-repo \
--branch=main \
--path=./cluster \
--personal
This installs Flux in your cluster and configures it to sync with your Git repository.
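flux bootstrap github reads the GitHub token from the GITHUB_TOKEN environment variable, so a short pre-flight check can catch a missing token before anything is installed. This is a sketch under that assumption; bootstrap_preflight is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical pre-flight check to run before `flux bootstrap github`.
bootstrap_preflight() {
  if [ -z "${GITHUB_TOKEN:-}" ]; then
    echo "GITHUB_TOKEN is not set" >&2
    return 1
  fi
  if ! command -v flux >/dev/null 2>&1; then
    echo "flux CLI not found on PATH" >&2
    return 1
  fi
  # Verifies CLI and cluster prerequisites without installing anything.
  flux check --pre
}
```

A typical use would be to export GITHUB_TOKEN with your Personal Access Token, call bootstrap_preflight, and run the bootstrap command only if it returns 0.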
3. Configuration
Environment and Repository Structure
The repository is organized into GitOps directories:
- cluster/ – Base cluster configuration (Flux, core apps).
- apps/ – Deployments for user applications.
- infrastructure/ – System-level applications (Cilium, Rook, monitoring).
Required Secrets and Variables
Secrets are managed with SOPS and age. You must generate and configure encryption keys:
- Generate an age keypair:
    age-keygen -o age.agekey
  The public key is printed to the terminal; keep the private key file secure.
- Configure SOPS in the repository. Create or update .sops.yaml in the repository root (the dot before yaml is escaped so that it matches a literal "."):
    creation_rules:
      - path_regex: .*\.yaml
        encrypted_regex: ^(data|stringData)$
        age: <your-age-public-key>
- Encrypt secrets. Place sensitive data (API tokens, passwords) in YAML files under cluster/apps/*/secret.yaml and encrypt them:
    sops --encrypt --in-place secret.yaml
- Set GitHub repository secrets (for GitHub Actions). In your GitHub repository, add the following secrets:
  - AGE_PRIVATE_KEY – The private age key for decryption in CI.
  - GH_TOKEN – GitHub Personal Access Token for Renovate and runner authentication.
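The key-generation and SOPS configuration steps can be scripted so the public key is never copied by hand. This sketch assumes the key file contains the "# public key: ..." comment line that age-keygen writes; the function names age_public_key and write_sops_config are illustrative. The dot in path_regex is escaped here so it matches a literal ".".

```shell
#!/bin/sh
# Extract the public key from the "# public key: ..." comment in an age key file.
age_public_key() {
  sed -n 's/^# public key: //p' "$1"
}

# Write a SOPS config that encrypts only data/stringData in YAML files.
# $1 = age public key, $2 = output path (defaults to .sops.yaml).
write_sops_config() {
  pubkey=$1
  outfile=${2:-.sops.yaml}
  cat > "$outfile" <<EOF
creation_rules:
  - path_regex: .*\.yaml
    encrypted_regex: ^(data|stringData)\$
    age: $pubkey
EOF
}

# Example, from the repository root:
#   write_sops_config "$(age_public_key age.agekey)"
```

Restricting encrypted_regex to data and stringData keeps the rest of each Secret manifest readable in diffs.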
Customizing for Your Environment
- Update cluster/flux/flux-system/gotk-components.yaml and gotk-sync.yaml to point to your repository.
- Modify cluster/apps/*/kustomization.yaml files to adjust configurations (such as ingress hosts and resource limits).
- Adjust infrastructure/ configurations (such as storage classes in Rook and network policies in Cilium) to match your hardware and network.
4. Build & Run
This is a GitOps-managed infrastructure; there is no traditional "build" step. Changes are made via Git commits and synchronized automatically by Flux.
Local Development/Testing:
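- Use kustomize to preview manifests:
    kustomize build ./cluster/apps/cert-manager
- Validate Kubernetes manifests:
    kubeval --strict ./cluster/apps/cert-manager/*.yaml
- Preview what Flux would change on the cluster. flux reconcile has no dry-run mode; flux diff compares local manifests against the live state instead:
    flux diff kustomization flux-system --path ./cluster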
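To preview every application rather than one directory at a time, the build can be looped over all directories that contain a kustomization.yaml. The sketch below takes the build command as a parameter so the loop itself can be tried without kustomize installed; kustomize_dirs and build_all are hypothetical helper names, and directory names are assumed to contain no whitespace.

```shell
#!/bin/sh
# List every directory under a root that contains a kustomization.yaml.
kustomize_dirs() {
  find "$1" -type f -name kustomization.yaml -exec dirname {} \; | sort
}

# Run the given build command against each directory and report the result.
# $1 = root directory, remaining arguments = build command.
# Returns 1 if any directory fails to build.
build_all() {
  root=$1
  shift
  failed=0
  for dir in $(kustomize_dirs "$root"); do
    if "$@" "$dir" >/dev/null 2>&1; then
      echo "ok   $dir"
    else
      echo "FAIL $dir"
      failed=1
    fi
  done
  return "$failed"
}

# Example:
#   build_all ./cluster/apps kustomize build
```

Because build_all returns non-zero on any failure, it can also serve as a simple pre-commit or CI gate.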
5. Deployment
Deployment is continuous and automated via Flux. Once bootstrapped:
- Push configuration changes to Git:
    git add .
    git commit -m "feat: add new application"
    git push origin main
- Flux syncs automatically at the interval configured on its GitRepository and Kustomization resources, or you can trigger a sync manually:
    flux reconcile source git flux-system
    flux reconcile kustomization flux-system
- Monitor the deployment:
    flux get kustomizations --watch
    kubectl get pods -A
Suggested Platform: This setup is designed for bare-metal or VM-based homelabs running Talos Linux. It can be adapted to cloud Kubernetes services (EKS, GKE, AKS) by replacing the Talos installation with a managed cluster and adjusting storage/network components accordingly.
6. Troubleshooting
Common Issues and Solutions
Flux fails to sync with "decryption failed" error:
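- Ensure AGE_PRIVATE_KEY is correctly set in GitHub Secrets (for CI) and that the same private key is available locally. Flux decrypts inside the cluster, so the key must also exist there as the Secret referenced by the Kustomization's spec.decryption.secretRef.
- Verify the SOPS configuration file uses the correct public key.
- Decrypt manually to test: sops --decrypt secret.yaml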
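A quick way to check the public-key point is to compare the recipient recorded inside an encrypted file with your own public key. This sketch assumes the usual SOPS YAML layout, where each age recipient appears as a "recipient:" entry under sops.age; the function names are illustrative.

```shell
#!/bin/sh
# Print the age recipients recorded in a SOPS-encrypted YAML file.
sops_recipients() {
  sed -n 's/^[[:space:]]*-\{0,1\}[[:space:]]*recipient:[[:space:]]*//p' "$1"
}

# Succeed only if the file was encrypted for the given public key.
# $1 = encrypted file, $2 = age public key.
encrypted_for() {
  sops_recipients "$1" | grep -qxF "$2"
}

# Example:
#   encrypted_for secret.yaml "<your-age-public-key>" || echo "key mismatch"
```

If the recipient does not match, re-encrypt the file after correcting the age entry in .sops.yaml.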
Talos nodes not joining the cluster:
- Check network connectivity and firewall rules (ports 50000, 50001, 6443).
- Validate Talos configuration files match the node roles (controlplane vs worker).
- Inspect the kubelet logs: talosctl logs --nodes <node-ip> kubelet
Applications not deploying after sync:
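- Check Flux logs. Flux runs as several controllers (source-controller, kustomize-controller, helm-controller) rather than a single deployment, so start with the one that applies manifests: kubectl logs -n flux-system deployment/kustomize-controller
- Verify kustomization health: flux get kustomizations
- Look for resource conflicts or missing CRDs: kubectl get events -A --sort-by='.lastTimestamp'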
Renovate not creating PRs:
- Ensure the Renovate workflow file (renovate.yaml) is present in .github/workflows/.
- Check GitHub Actions logs for Renovate runs.
- Verify that GH_TOKEN has sufficient permissions.
Persistent storage issues with Rook:
- Confirm disks are available and not mounted on the host.
- Check Rook operator logs: kubectl logs -n rook-ceph deployment/rook-ceph-operator
- Validate storage class creation: kubectl get storageclass
Ingress/access problems:
- Ensure envoy-gateway or your ingress controller pods are running.
- Verify DNS records in Cloudflare (or your DNS provider) point to the correct tunnel or load balancer IP.
- Check Cloudflare Tunnel connectivity: kubectl logs -n cloudflared deployment/cloudflared
For further assistance, join the Home Operations Discord.