How to Deploy & Use home-ops

1. Prerequisites

Before deploying the home-ops Kubernetes cluster, make sure you have the following:

Hardware/Infrastructure:

  • Physical or virtual machines for Kubernetes nodes (minimum 3 recommended for high availability)
  • Separate storage server with ZFS support for NFS/SMB shares and backups
  • Network infrastructure with proper routing and firewall configurations

Software & Tools:

  • Talos Linux for Kubernetes deployment
  • kubectl command-line tool
  • flux CLI version 2.x
  • Git for version control
  • GitHub account with repository access
  • SOPS for encrypting secrets committed to Git
  • age for the key pair that SOPS encrypts with
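
A quick way to confirm the command-line tools are installed is to look each one up on the PATH. The following is a minimal sketch, not part of the repository: the function name missing_tools is made up for illustration, and talosctl is assumed to be the client used to manage Talos Linux.

```shell
#!/bin/sh
# Print the name of every tool passed as an argument that is not on the PATH.
# missing_tools is a hypothetical helper, not something home-ops ships.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Tool list taken from the prerequisites above (talosctl assumed for Talos).
missing=$(missing_tools talosctl kubectl flux git sops age)
if [ -n "$missing" ]; then
  printf 'Missing tools:\n%s\n' "$missing" >&2
else
  echo "All required tools found."
fi
```

The check only proves that a binary exists. It does not verify versions, so confirm separately that the flux CLI is a 2.x release.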

Accounts & Services:

  • GitHub account (for repository hosting and GitHub Actions)
  • Cloudflare account (for DNS and tunnel management)
  • 1Password account (for secrets management via 1Password Connect)
  • Domain name with DNS managed by Cloudflare

2. Installation

Step 1: Clone the Repository

git clone https://github.com/onedr0p/home-ops.git
cd home-ops

Step 2: Bootstrap Kubernetes Cluster with Talos

  1. Install Talos Linux on your nodes following the official documentation
  2. Configure Talos machine configurations for your hardware
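  3. Apply the machine configurations, bootstrap the cluster from a single control plane node, then fetch the kubeconfig:
talosctl apply-config --insecure --nodes <node-ip> --file controlplane.yaml
talosctl apply-config --insecure --nodes <node-ip> --file worker.yaml
talosctl bootstrap --nodes <control-plane-ip> --endpoints <control-plane-ip>
talosctl kubeconfig --nodes <control-plane-ip> --endpoints <control-plane-ip>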

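Step 3: Install Flux CD (optional)

flux bootstrap in Step 4 installs these controllers itself, so run this step only if you want Flux on the cluster without bootstrapping it from Git:
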
flux install \
  --components=source-controller,kustomize-controller,helm-controller,notification-controller \
  --namespace=flux-system

Step 4: Bootstrap Flux with GitOps

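# flux reads the personal access token from GITHUB_TOKEN
export GITHUB_TOKEN=<your-personal-access-token>

# --owner must be an account you can push to, for example your fork's owner
flux bootstrap github \
  --owner=<your-github-user> \
  --repository=home-ops \
  --branch=main \
  --path=./cluster/base/flux-system \
  --personal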
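
After bootstrapping, the controllers can take a few minutes to become healthy, so a single check right away may fail. The sketch below shows one way to poll until a command succeeds. The retry helper is a hypothetical convenience function, not something Flux or this repository provides, and the attempt count and delay in the usage comments are arbitrary.

```shell
#!/bin/sh
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
# Returns 0 on success and 1 once the attempts are exhausted.
retry() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while ! "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "Gave up after $attempt attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Example usage (left commented out because it needs a live cluster):
#   retry 30 10 flux check
#   retry 30 10 kubectl get nodes
```

Polling with a fixed delay is the simplest option. Where a command has its own wait or timeout option, that is usually the better choice.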

3. Configuration

Environment Setup

Create necessary namespaces and configure base resources:

# Apply base configurations
kubectl apply -k cluster/base

# Apply core components
kubectl apply -k cluster/core

Secrets Management

This project uses SOPS with age for secrets encryption:

  1. Generate an age key pair:
age-keygen -o age.agekey
  2. Print the public key:
age-keygen -y age.agekey
  3. Update .sops.yaml with your public key:
creation_rules:
  - path_regex: .*\.yaml
    encrypted_regex: ^(data|stringData)$
    age: "your-public-key-here"
  4. Encrypt secrets:
sops --encrypt --in-place secrets/encrypted-secret.yaml
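
The encrypted_regex rule decides which keys of a manifest SOPS encrypts, leaving fields such as kind and metadata readable in Git. The sketch below approximates that selection with grep -E on a list of top-level keys from a typical Secret manifest. It assumes grep's extended regular expressions behave like SOPS's own regex engine for this simple pattern, and keys_to_encrypt is an illustrative name only.

```shell
#!/bin/sh
# Print only the key names that the encrypted_regex pattern would select.
# This is an approximation using grep -E, not a call into SOPS itself.
keys_to_encrypt() {
  grep -E '^(data|stringData)$'
}

# Top-level keys of a typical Kubernetes Secret manifest.
printf '%s\n' apiVersion kind metadata type stringData | keys_to_encrypt
# prints: stringData
```

Because the pattern is anchored with ^ and $, a key such as database is left alone even though it starts with data.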

Required API Keys and Configuration

Cloudflare:

  • API token with DNS and Zone permissions
  • Tunnel credentials for cloudflared

1Password Connect:

  • 1Password Connect API token
  • Vault UUID for secrets storage

GitHub:

  • Personal access token with repo permissions
  • Webhook secret for Flux notifications

DNS Configuration:

  • Configure Cloudflare as your DNS provider
  • Set up A records pointing to your ingress controller
  • Configure tunnel for secure ingress
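
Before running any command that needs these credentials, it can help to confirm they are present in the shell. This is a minimal sketch that assumes the tokens are exported as environment variables. GITHUB_TOKEN is the variable flux bootstrap github reads, while CLOUDFLARE_API_TOKEN and OP_CONNECT_TOKEN are placeholder names chosen for illustration, as is the missing_env helper.

```shell
#!/bin/sh
# Print the name of every listed environment variable that is unset or empty.
# missing_env is a hypothetical helper, not part of home-ops.
missing_env() {
  for name in "$@"; do
    # Indirect lookup: read the value of the variable whose name is in $name.
    eval "value=\${$name:-}"
    [ -n "$value" ] || printf '%s\n' "$name"
  done
}

# Variable names other than GITHUB_TOKEN are assumptions for this example.
unset_vars=$(missing_env GITHUB_TOKEN CLOUDFLARE_API_TOKEN OP_CONNECT_TOKEN)
if [ -n "$unset_vars" ]; then
  printf 'Unset variables:\n%s\n' "$unset_vars" >&2
fi
```

In a setup that pulls everything from 1Password Connect, only the Connect token itself may need to exist outside the cluster.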

4. Build & Run

Local Development

For testing configurations locally:

  1. Create a local Kubernetes cluster with k3d:
k3d cluster create home-ops-test --servers 1 --agents 2
  2. Run the Flux pre-flight checks against it:
flux check --pre
  3. Render the kustomizations without applying them:
kubectl kustomize cluster/core --enable-alpha-plugins
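
Rendering a single directory catches errors only in that directory. The sketch below finds every directory under a given path that contains a kustomization file and renders each one in turn. Both function names are hypothetical, and the --enable-alpha-plugins flag is left out for brevity, so add it back if your kustomizations rely on plugins.

```shell
#!/bin/sh
# List each directory under $1 that contains a kustomization file.
find_kustomizations() {
  find "$1" -type f \( -name kustomization.yaml -o -name kustomization.yml \) \
    -exec dirname {} \; | sort -u
}

# Render every kustomization under $1 and report which ones fail.
# Needs kubectl, so it is only defined here and not called.
render_all() {
  find_kustomizations "$1" | while IFS= read -r dir; do
    if kubectl kustomize "$dir" >/dev/null; then
      echo "ok      $dir"
    else
      echo "FAILED  $dir"
    fi
  done
}

# Example usage: render_all cluster/core
```

Rendering does not contact the cluster, so it can run on a laptop or in CI before changes are merged.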

Production Deployment

The repository is designed for production use with:

  1. GitOps Workflow: All changes are made via Git commits
  2. Automated Sync: Flux automatically applies changes from the repository
  3. Health Checks: Automated health checking and remediation
  4. Backup Integration: VolSync for data protection

5. Deployment

Platform Recommendations

The tech stack is best suited to the following platforms:

Primary Platform:

  • Bare metal with Talos Linux (as used in the repository)
  • Minimum 3 nodes for high availability
  • Separate storage server for persistent data

Alternative Platforms:

  • Proxmox VE with Talos VMs
  • VMware ESXi with Talos virtual machines
  • Cloud providers (Hetzner, AWS, GCP) with Talos support

Deployment Steps

  1. Infrastructure Provisioning:
# Using the cluster template
git clone https://github.com/onedr0p/cluster-template.git
cd cluster-template
# Customize for your environment
  2. Core Services Deployment:
# Apply networking (Cilium)
kubectl apply -f cluster/core/cilium/

# Apply service mesh (Istio)
kubectl apply -f cluster/core/istio/

# Apply storage (Rook)
kubectl apply -f cluster/core/rook/
  3. Application Deployment: Applications are organized by namespace:
# Deploy media applications
kubectl apply -k cluster/apps/media/

# Deploy monitoring stack
kubectl apply -k cluster/apps/monitoring/

# Deploy networking applications
kubectl apply -k cluster/apps/networking/
  4. Automation Setup:
  • Renovate is configured via .github/renovate.yaml
  • GitHub Actions workflows are in .github/workflows/
  • Flux notifications are configured for monitoring

6. Troubleshooting

Common Issues and Solutions

Flux Not Syncing:

# Check Flux status
flux get all --all-namespaces

# Trigger a reconciliation and wait for the result
flux reconcile kustomization flux-system --with-source

# View logs
kubectl logs -n flux-system deployment/kustomize-controller

Talos Node Issues:

# Check node status
talosctl -n <node-ip> version
talosctl -n <node-ip> containers

# Reset the node as a last resort (destructive: wipes the node)
talosctl -n <node-ip> reset

Certificate Problems:

# Check cert-manager status
kubectl get certificates -A
kubectl describe certificate <cert-name> -n <namespace>

# Check ClusterIssuers
kubectl get clusterissuers

Storage Issues:

# Check Rook Ceph status
kubectl get cephclusters -n rook-ceph
kubectl describe cephcluster rook-ceph -n rook-ceph

# Check PVC status
kubectl get pvc -A
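
On a cluster with many claims, the full PVC listing is easier to scan once healthy entries are filtered out. This sketch reads the output of kubectl get pvc -A --no-headers on standard input and keeps only the claims whose STATUS column is not Bound. It assumes the default column order (NAMESPACE, NAME, STATUS), and unbound_pvcs is an illustrative name.

```shell
#!/bin/sh
# Read `kubectl get pvc -A --no-headers` output on stdin and print
# namespace/name (status) for every claim that is not Bound.
unbound_pvcs() {
  awk '$3 != "Bound" { print $1 "/" $2 " (" $3 ")" }'
}

# Example usage (needs a live cluster):
#   kubectl get pvc -A --no-headers | unbound_pvcs
```

A claim stuck in Pending usually points at the storage class or the Rook Ceph cluster, so kubectl describe pvc on the reported claim is a reasonable next step.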

Network Connectivity:

# Check Cilium status
kubectl get pods -n kube-system -l k8s-app=cilium
cilium status

# Check Istio sidecar injection
kubectl get pods -n <namespace> -l istio.io/rev=<revision>

Secrets Not Loading:

# Check External Secrets status
kubectl get externalsecrets -A
kubectl describe externalsecret <name> -n <namespace>

# Verify 1Password Connect connectivity
kubectl logs -n external-secrets deployment/1password-connect

Monitoring and Logs

  • Access Grafana dashboard for cluster metrics
  • Check Prometheus alerts for issues
  • View logs through Loki or directly via kubectl
  • Use k9s for interactive cluster management

Recovery Procedures

  1. Cluster Recovery: Restore the control plane from an etcd snapshot (taken with talosctl etcd snapshot, restored with talosctl bootstrap --recover-from)
  2. Data Recovery: Utilize VolSync backups from secondary storage
  3. GitOps Recovery: Flux can be re-bootstrapped from the Git repository
  4. Secrets Recovery: All secrets are stored in 1Password and can be re-injected

For additional help, join the Home Operations Discord community.