Deployment and Usage Guide for szinn/k8s-homelab
1. Prerequisites
Before deploying this Kubernetes home lab configuration, ensure you have the following:
Core Technologies:
- A Kubernetes cluster running Talos Linux (v1.6+ recommended)
- FluxCD v2.x for GitOps management
- Renovate for automated dependency updates
- GitHub account with repository access
Required Tools:
- kubectl configured to access your cluster
- flux CLI installed (brew install fluxcd/tap/flux or see the Flux installation docs)
- talosctl for Talos cluster management (brew install siderolabs/talos/talosctl)
- task (go-task) for automation scripts (brew install go-task/tap/go-task)
- sops for secret management (brew install sops)
- age for encryption (brew install age)
- pre-commit for Git hooks (brew install pre-commit)
Infrastructure Requirements:
- Multiple nodes (as per hardware specs: 3 control plane nodes, 3 worker nodes minimum)
- Network storage (NFS server recommended, like Synology NAS)
- Ubiquiti networking equipment (optional but recommended for full functionality)
- Your own GitHub fork of szinn/k8s-homelab
2. Installation
Fork and Clone the Repository
# Fork the repository on GitHub first, then clone your fork
git clone https://github.com/YOUR_USERNAME/k8s-homelab.git
cd k8s-homelab
# Install pre-commit hooks
pre-commit install
pre-commit install-hooks
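Before continuing, it can save time to confirm all the required CLIs from the Prerequisites section are actually on your PATH. A minimal pre-flight sketch (adjust the tool list to match your setup):

```shell
#!/bin/sh
# Verify that each required CLI is installed; report anything missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "ok: $tool"
    else
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return $missing
}

check_tools kubectl flux talosctl task sops age pre-commit \
  || echo "install the missing tools before continuing"
```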
Bootstrap FluxCD
# Ensure your kubectl context points to your Talos cluster
kubectl config get-contexts
# Bootstrap Flux with your GitHub repository
flux bootstrap github \
--owner=YOUR_USERNAME \
--repository=k8s-homelab \
--branch=main \
--path=./cluster \
--personal
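If you bootstrap more than one cluster, the flags above can be assembled from environment variables. A sketch that only prints the command so you can review it before running (GITHUB_USER is assumed to be exported, falling back to a placeholder):

```shell
#!/bin/sh
# Assemble the flux bootstrap command from environment variables and print it
# for review; run it manually once it looks right.
GITHUB_USER="${GITHUB_USER:-YOUR_USERNAME}"
BOOTSTRAP_CMD="flux bootstrap github \
  --owner=${GITHUB_USER} \
  --repository=k8s-homelab \
  --branch=main \
  --path=./cluster \
  --personal"
echo "$BOOTSTRAP_CMD"
```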
Initialize the Cluster Structure
# Apply the cluster base configuration (kustomize directory, so use -k)
kubectl apply -k cluster/base/flux-system/
# Apply core components
kubectl apply -k cluster/core/
3. Configuration
Environment Setup
Create a .env file based on the template:
cp .env.example .env
Edit the .env file with your specific values:
# Cluster Configuration
CLUSTER_NAME=your-cluster-name
CLUSTER_DOMAIN=your-domain.local
GITHUB_USER=your-github-username
GITHUB_TOKEN=your-github-token
# Network Configuration
METALLB_IP_RANGE=192.168.1.240-192.168.1.250
EXTERNALDNS_ZONE=your-domain.com
# Storage Configuration
NFS_SERVER=192.168.1.100
NFS_PATH=/path/to/nfs/share
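A missing or empty variable tends to surface late, as a failed reconcile, so it can help to validate the .env file up front. A minimal sketch, assuming the variable names shown above:

```shell
#!/bin/sh
# Verify that every required variable is set to a non-empty value in .env.
ENV_FILE="${1:-.env}"
REQUIRED="CLUSTER_NAME CLUSTER_DOMAIN GITHUB_USER GITHUB_TOKEN METALLB_IP_RANGE NFS_SERVER NFS_PATH"

validate_env() {
  file="$1"
  rc=0
  for var in $REQUIRED; do
    # Match VAR= followed by at least one character at the start of a line.
    if grep -q "^${var}=..*" "$file"; then
      echo "ok: $var"
    else
      echo "missing or empty: $var" >&2
      rc=1
    fi
  done
  return $rc
}

if [ -f "$ENV_FILE" ]; then
  validate_env "$ENV_FILE" || echo "fix the variables above before continuing"
fi
```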
Secret Management with SOPS
1. Generate an age keypair:
   age-keygen -o key.txt
2. Configure SOPS with your age key:
   export SOPS_AGE_KEY_FILE=$(pwd)/key.txt
3. Update the .sops.yaml file with your public key (note the escaped dot in the regex):
   creation_rules:
     - path_regex: .*\.yaml
       age: your-public-age-key-here
4. Encrypt secrets:
   sops -e -i cluster/core/secrets/example-secret.yaml
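Before encryption, a secret is just a plain Kubernetes manifest. A hypothetical example-secret.yaml (name and values are illustrative) that the sops command above would encrypt in place, replacing each value with an encrypted payload:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: example-secret
  namespace: default
type: Opaque
stringData:
  # Plain-text value; after `sops -e -i`, only the values are encrypted,
  # so keys and structure remain readable in Git diffs.
  password: change-me
```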
Customizing Applications
Edit the application configurations in cluster/apps/:
- Modify cluster/apps/kustomization.yaml to enable/disable applications
- Adjust resource requests/limits in application manifests
- Configure ingress rules for your domain
- Update storage classes for your specific storage backend
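Enabling or disabling an application then comes down to commenting its entry in the kustomization. A sketch with hypothetical app directories (the actual names under cluster/apps/ will differ):

```yaml
# cluster/apps/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - home-assistant    # deployed
  - jellyfin          # deployed
  # - frigate         # commented out = not deployed
```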
4. Build & Run
Local Development
For local testing and development:
# Validate Kubernetes manifests
task validate
# Preview what Flux would change before it reconciles
flux diff kustomization flux-system --path ./cluster
# Test specific components
task test:apps
Cluster Operations
Use the provided Taskfile for common operations:
# List available tasks
task --list
# Update all components
task update
# Check cluster status
task status
# Backup cluster configuration
task backup
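If you add your own automation, go-task definitions follow this shape. A hypothetical fragment (not the repository's actual Taskfile), using the v3 schema:

```yaml
# Taskfile.yaml (go-task v3 schema)
version: "3"

tasks:
  status:
    desc: Show cluster and Flux health
    cmds:
      - kubectl get nodes
      - flux get kustomizations
```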
5. Deployment
Production Deployment Workflow
1. Prepare Your Infrastructure:
   - Provision Talos Linux nodes (3 control plane, 3+ workers)
   - Configure network storage (NFS, Ceph, or other)
   - Set up network infrastructure (VLANs, firewall rules)
2. Initialize Talos Cluster:
   # Generate Talos configuration
   talosctl gen config your-cluster-name https://your-control-plane-ip:6443
   # Apply configuration to nodes
   talosctl apply-config --insecure --nodes <node-ip> --file controlplane.yaml
   talosctl apply-config --insecure --nodes <node-ip> --file worker.yaml
   # Bootstrap the cluster
   talosctl bootstrap --nodes <control-plane-ip>
   # Configure kubeconfig
   talosctl kubeconfig --nodes <control-plane-ip>
3. Deploy GitOps Pipeline:
   # Bootstrap Flux (as shown in Installation section)
   flux bootstrap github ...
   # Monitor deployment progress
   flux get all --watch
4. Verify Deployment:
   # Check all pods are running
   kubectl get pods -A
   # Check Flux reconciliations
   flux reconcile source git flux-system
Platform Recommendations
Based on the project's architecture:
- Primary Platform: Bare metal with Talos Linux (as used in the reference implementation)
- Alternative Platforms:
- VMware ESXi with Talos VMs
- Proxmox VE with Talos VMs
- Equinix Metal or other bare metal providers
- Cloud providers that support custom images (AWS, GCP, Azure) running Talos VMs
Scaling Considerations
- Start small if needed (1 control plane node and 2 workers will run, though without control-plane HA)
- Add nodes as needed using Talos machine configurations
- Monitor resource usage via included Prometheus/Grafana stack
- Adjust application resource requests based on monitoring data
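Adjusting requests/limits means editing the container spec in the relevant application manifest. A sketch of the fields involved (the values are illustrative starting points; tune them against your monitoring data):

```yaml
# Fragment of a container spec inside a Deployment
resources:
  requests:
    cpu: 100m        # what the scheduler reserves for the pod
    memory: 256Mi
  limits:
    memory: 512Mi    # container is OOM-killed if it exceeds this
```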
6. Troubleshooting
Common Issues
Flux Not Syncing:
# Check Flux status
flux get all
# Check logs
flux logs --kind=Kustomization --name=flux-system
# Force reconciliation
flux reconcile source git flux-system
Talos Node Issues:
# Check node status
talosctl -n <node-ip> version
talosctl -n <node-ip> containers
# Reset a node
talosctl -n <node-ip> reset --graceful=false --reboot=true
Application Deployment Failures:
# Check specific application
kubectl get pods -n <namespace>
kubectl describe pod <pod-name> -n <namespace>
kubectl logs <pod-name> -n <namespace>
# Check PersistentVolumeClaims
kubectl get pvc -A
Network Issues:
# Check MetalLB
kubectl get pods -n metallb-system
kubectl logs -l app.kubernetes.io/name=metallb -n metallb-system
# Check CoreDNS
kubectl get pods -n kube-system -l k8s-app=kube-dns
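If the MetalLB pods are healthy but LoadBalancer Services stay pending, check that an address pool and advertisement exist. A minimal sketch using the metallb.io/v1beta1 CRDs and the IP range from the .env example:

```yaml
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: default-pool
  namespace: metallb-system
spec:
  addresses:
    - 192.168.1.240-192.168.1.250
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: default-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - default-pool
```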
Monitoring and Logging
Access the included monitoring stack:
- Grafana: kubectl port-forward -n monitoring svc/grafana 3000:3000 (default credentials: admin/prom-operator)
- AlertManager: configure notifications in cluster/core/monitoring/alertmanager-config.yaml
- Loki logs: query via Grafana's Explore tab or kubectl logs -n logging loki-0
Recovery Procedures
Cluster Recovery:
# If Flux is broken, re-run the bootstrap (it is safe to repeat)
flux bootstrap github ...
# Restore from backup
task restore
# Recreate Talos cluster from scratch
talosctl reset --graceful=false --reboot=true --nodes <all-nodes>
Secret Recovery:
# If the age key is compromised, generate a new key, update .sops.yaml,
# then rotate the data keys on every encrypted file
sops --rotate --in-place cluster/core/secrets/*.yaml
# Note: if the private key is lost outright, encrypted secrets cannot be
# decrypted; restore key.txt from backup or recreate and re-encrypt them
Data Recovery:
- Use Velero backups (if configured)
- Restore from NFS snapshots
- Recreate PVCs from backup storage
Getting Help
- Check existing GitHub issues: https://github.com/szinn/k8s-homelab/issues
- Join Kubernetes/Talos/Flux community Slack/Discord channels
- Review referenced template: onedr0p/cluster-template
- Search for similar implementations on kubesearch.dev