GitOps

Key Takeaways for AI & Readers
  • Git as Single Source of Truth: GitOps uses Git repositories as the single, authoritative source for all infrastructure and application configuration. Every change is versioned, auditable, and reversible through standard Git operations (commits, branches, pull requests).
  • Pull Model vs Push Model: The GitOps pull model, where an in-cluster operator (ArgoCD, Flux) continuously pulls desired state from Git and applies it to the cluster, is fundamentally safer than the traditional push model (CI pipeline runs kubectl apply). The pull model eliminates the need to expose cluster credentials to external CI systems.
  • Continuous Reconciliation: GitOps operators do not just apply changes once -- they run a continuous reconciliation loop that compares the desired state in Git against the actual cluster state. Any drift (manual changes, failed deployments, resource deletions) is automatically detected and can be self-healed.
  • ArgoCD Application CRD: ArgoCD models each deployment as an Application custom resource that points to a Git repository path. It supports Helm, Kustomize, plain YAML, and Jsonnet, with configurable sync policies, health checks, and pruning behavior.
  • Secrets Management: Storing secrets in Git requires encryption. Solutions include Sealed Secrets (encrypt with a cluster-specific key), SOPS (encrypt with cloud KMS), and External Secrets Operator (sync secrets from external vaults like AWS Secrets Manager or HashiCorp Vault).
  • Multi-Cluster and Scale: ArgoCD ApplicationSets and Flux's multi-cluster support enable managing hundreds of clusters from a single control plane, using generators to dynamically create applications across clusters, environments, or teams.

[Diagram: Git Repo (replicas: 3) → ArgoCD → Cluster]

GitOps is a set of practices that uses Git as the single source of truth for declarative infrastructure and application configuration. Rather than imperatively running commands against a cluster, you describe the desired state in Git, and a dedicated operator running inside the cluster ensures that reality matches what Git says.

1. GitOps Principles

GitOps rests on four core principles:

Declarative: The entire system (infrastructure and applications) is described declaratively. Kubernetes manifests, Helm charts, and Kustomize overlays are declarative by nature, making Kubernetes an ideal platform for GitOps.

Versioned and Immutable: All desired state is stored in Git, providing a complete audit trail, the ability to roll back to any previous state, and the ability to reproduce any environment from a specific commit.

Pulled Automatically: An operator running inside the cluster (ArgoCD, Flux) continuously pulls the desired state from Git and applies it. No external system needs credentials to the cluster.

Continuously Reconciled: The operator does not just apply changes when they happen. It continuously compares the cluster's actual state to the desired state in Git. Any drift -- whether from manual kubectl edit commands, failed rollouts, or resource corruption -- is detected and corrected.

2. The Pull Model vs The Push Model

Traditional CI/CD (Push)

  1. Developer commits code.
  2. CI system (Jenkins, GitHub Actions) builds a container image.
  3. CI system runs kubectl apply -f deploy.yaml or helm upgrade.

Problems with the push model:

  • The CI system needs cluster credentials (kubeconfig), creating a security risk. If the CI system is compromised, the attacker has direct cluster access.
  • The cluster state can diverge from Git if anyone runs kubectl edit or kubectl delete manually. The CI system only pushes on commits -- it does not continuously reconcile.
  • There is no self-healing. If a resource is deleted manually, the CI system does not notice until the next deployment.

GitOps (Pull)

  1. Developer commits code.
  2. CI builds an image and commits the new image tag to a config repository (e.g., updates values.yaml or a Kustomize overlay).
  3. GitOps operator (ArgoCD) running inside the cluster detects the change in Git.
  4. ArgoCD pulls the new configuration and applies it to the cluster.

Advantages of the pull model:

  • No cluster credentials leave the cluster. ArgoCD runs inside the cluster and has a ServiceAccount with the necessary RBAC permissions.
  • Continuous reconciliation means the cluster always matches Git. Manual changes are detected as drift and can be automatically reverted.
  • Full audit trail through Git history. Every deployment is a Git commit, reviewable via pull requests.
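
Step 2 of the pull workflow (CI commits the new image tag to the config repository) might look like the following Kustomize overlay. This is a minimal sketch: the image name `myorg/frontend` and the tag `3f9c2ab` are placeholders, and the overlay path is assumed to match the one used elsewhere in this article.

```yaml
# apps/frontend/overlays/production/kustomization.yaml
# Hypothetical overlay; the image name and tag are placeholders.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base               # shared manifests for all environments
images:
  - name: myorg/frontend     # image name as referenced in the base manifests
    newTag: 3f9c2ab          # the only line CI needs to rewrite per build
```

A CI job could update the tag with `kustomize edit set image myorg/frontend=myorg/frontend:<new-tag>` and commit the result. The GitOps operator then picks up the commit, so CI never needs cluster credentials.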

3. ArgoCD Deep Dive

ArgoCD is one of the most widely adopted GitOps operators for Kubernetes. It provides a declarative Application CRD, a web UI, SSO integration, RBAC, and support for Helm, Kustomize, plain YAML, and Jsonnet.

Application CRD

The core abstraction in ArgoCD is the Application -- a custom resource that defines the relationship between a Git repository (source) and a Kubernetes cluster/namespace (destination):

# argocd-application.yaml
# Deploy the frontend application from a Git repository using Kustomize
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: frontend
  namespace: argocd # ArgoCD Applications live in the argocd namespace
  finalizers:
    - resources-finalizer.argocd.argoproj.io # Cascade delete application resources
spec:
  project: default # ArgoCD project for RBAC grouping
  source:
    repoURL: https://github.com/myorg/k8s-config.git
    targetRevision: main # Branch, tag, or commit SHA
    path: apps/frontend/overlays/production # Path within the repo
  destination:
    server: https://kubernetes.default.svc # In-cluster
    namespace: production
  syncPolicy:
    automated:
      prune: true # Delete resources removed from Git
      selfHeal: true # Revert manual cluster changes to match Git
      allowEmpty: false # Do not sync if source is empty (safety check)
    syncOptions:
      - CreateNamespace=true # Create namespace if it does not exist
      - PrunePropagationPolicy=foreground # Wait for dependents to be deleted
      - PruneLast=true # Prune resources after all other syncs
    retry:
      limit: 3 # Retry failed syncs up to 3 times
      backoff:
        duration: 5s
        factor: 2
        maxDuration: 3m
  ignoreDifferences: # Ignore fields managed by other controllers
    - group: apps
      kind: Deployment
      jsonPointers:
        - /spec/replicas # Ignore replicas managed by HPA

Sync Policies

ArgoCD sync behavior is highly configurable:

  • Manual sync: ArgoCD detects drift and shows the Application as "OutOfSync," but waits for a human to click "Sync" in the UI or run argocd app sync.
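  • Automated sync: ArgoCD automatically syncs when the desired state in Git changes (for example, a new commit on the tracked revision). By default, changes made directly to the live cluster do not trigger an automated sync; that behavior requires selfHeal.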
  • selfHeal: true: Reverts manual cluster changes. If someone runs kubectl scale deployment frontend --replicas=1, ArgoCD detects the drift and restores the replica count from Git.
  • prune: true: Deletes resources that exist in the cluster but are no longer in Git. Without this, removing a manifest from Git does not delete the resource from the cluster.
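
For contrast with the automated policy, here is a sketch of an Application that uses manual sync. It assumes a hypothetical `frontend-staging` Application and a `staging` namespace; the only structural difference from an automated Application is that `syncPolicy.automated` is absent.

```yaml
# argocd-application-manual.yaml
# Hypothetical staging Application that waits for a human-triggered sync.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: frontend-staging     # assumed name, for illustration only
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/myorg/k8s-config.git
    targetRevision: main
    path: apps/frontend/overlays/staging
  destination:
    server: https://kubernetes.default.svc
    namespace: staging       # assumed namespace
  syncPolicy:
    # No "automated" block: ArgoCD reports OutOfSync but does not apply changes.
    syncOptions:
      - CreateNamespace=true
```

With this policy, changes are applied only when someone clicks "Sync" in the UI or runs `argocd app sync frontend-staging`.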

Health Checks

ArgoCD understands the health of Kubernetes resources beyond just "exists/does not exist":

  • Deployments: Healthy when all replicas are available and the rollout is complete.
  • StatefulSets: Healthy when all replicas are ready and at the current revision.
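  • Services: A Service of type LoadBalancer is healthy once status.loadBalancer.ingress contains at least one hostname or IP. Other Service types are treated as healthy once they exist.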
  • PersistentVolumeClaims: Healthy when bound to a PV.
  • Custom Resources: ArgoCD supports custom health checks defined in Lua scripts.
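
A custom health check is registered in the argocd-cm ConfigMap under a key of the form `resource.customizations.health.<group>_<kind>`. The sketch below assumes a hypothetical `Widget` custom resource in the `example.com` API group that reports a `Ready` condition; adapt the condition names to whatever your CRD actually publishes.

```yaml
# argocd-cm.yaml (fragment)
# Hypothetical health check for an assumed example.com/Widget custom resource.
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
  labels:
    app.kubernetes.io/part-of: argocd
data:
  resource.customizations.health.example.com_Widget: |
    hs = {}
    if obj.status ~= nil and obj.status.conditions ~= nil then
      for i, condition in ipairs(obj.status.conditions) do
        if condition.type == "Ready" and condition.status == "True" then
          hs.status = "Healthy"
          hs.message = condition.message
          return hs
        end
        if condition.type == "Ready" and condition.status == "False" then
          hs.status = "Degraded"
          hs.message = condition.message
          return hs
        end
      end
    end
    -- No Ready condition reported yet
    hs.status = "Progressing"
    hs.message = "Waiting for Widget to report readiness"
    return hs
```

The script receives the live resource as `obj` and returns a table with `status` and an optional `message`.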

App of Apps Pattern

For managing many Applications, ArgoCD supports a pattern where a "root" Application manages other Application manifests stored in Git:

# root-application.yaml
# A single Application that manages all other Applications
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: root
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/myorg/k8s-config.git
    targetRevision: main
    path: argocd/applications # Directory containing Application YAMLs
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd
  syncPolicy:
    automated:
      prune: true
      selfHeal: true

The argocd/applications/ directory contains one YAML file per Application. Adding a new Application is as simple as adding a YAML file and pushing to Git.
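
As an illustration, one such child file might look like this. The file name, the `backend` Application, and its overlay path are assumptions made for the example.

```yaml
# argocd/applications/backend.yaml
# Hypothetical child Application picked up by the root Application.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: backend
  namespace: argocd          # must be created where ArgoCD watches for Applications
spec:
  project: default
  source:
    repoURL: https://github.com/myorg/k8s-config.git
    targetRevision: main
    path: apps/backend/overlays/production   # assumed overlay path
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
```

Because the root Application has `prune: true`, deleting this file from Git also removes the child Application from the cluster.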

ApplicationSets

ApplicationSets provide a more powerful alternative to App of Apps for generating Applications dynamically using generators:

# applicationset-clusters.yaml
# Automatically create an Application for each registered cluster
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: platform-services
  namespace: argocd
spec:
  generators:
    - clusters:
        selector:
          matchLabels:
            env: production # Only target production clusters
  template:
    metadata:
      name: 'platform-{{name}}' # {{name}} is the cluster name
    spec:
      project: platform
      source:
        repoURL: https://github.com/myorg/k8s-config.git
        targetRevision: main
        path: 'platform/overlays/{{metadata.labels.region}}'
      destination:
        server: '{{server}}' # Cluster API server URL
        namespace: platform
      syncPolicy:
        automated:
          prune: true
          selfHeal: true

Generators include: clusters (generate per cluster), git (generate per directory or file in a repo), list (static list), matrix (combine two generators), and merge (overlay generators).
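
To show a second generator, the sketch below uses the git directory generator to create one Application per directory under `apps/`. It assumes every application directory contains an `overlays/production` path; the ApplicationSet name is hypothetical.

```yaml
# applicationset-git-directories.yaml
# Hypothetical ApplicationSet: one Application per directory matching apps/*
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: apps-production      # assumed name
  namespace: argocd
spec:
  generators:
    - git:
        repoURL: https://github.com/myorg/k8s-config.git
        revision: main
        directories:
          - path: apps/*     # matches apps/frontend, apps/backend, ...
  template:
    metadata:
      name: '{{path.basename}}'   # e.g. "frontend"
    spec:
      project: default
      source:
        repoURL: https://github.com/myorg/k8s-config.git
        targetRevision: main
        path: '{{path}}/overlays/production'   # assumes this overlay exists
      destination:
        server: https://kubernetes.default.svc
        namespace: production
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
```

With this in place, adding a new directory under `apps/` is enough to get a new Application, without writing an Application manifest by hand.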

4. Flux Overview

Flux is the other major GitOps operator, now a CNCF graduated project. It takes a more modular, composable approach than ArgoCD, which ships as a single integrated product.

Flux consists of independent controllers:

  • source-controller: Watches Git repositories, Helm repositories, OCI registries, and S3 buckets for changes.
  • kustomize-controller: Reconciles Kustomization resources, applying manifests from Git sources.
  • helm-controller: Reconciles HelmRelease resources, installing and upgrading Helm charts.
  • notification-controller: Handles inbound webhooks (from Git providers) and outbound notifications (Slack, Teams, PagerDuty).

# flux-gitrepository.yaml
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: k8s-config
  namespace: flux-system
spec:
  interval: 1m # Poll every 1 minute
  url: https://github.com/myorg/k8s-config.git
  ref:
    branch: main
  secretRef:
    name: git-credentials
---
# flux-kustomization.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: frontend
  namespace: flux-system
spec:
  interval: 5m # Reconcile every 5 minutes
  sourceRef:
    kind: GitRepository
    name: k8s-config
  path: ./apps/frontend/overlays/production
  prune: true # Delete resources removed from Git
  healthChecks: # Wait for resources to be healthy after sync
    - apiVersion: apps/v1
      kind: Deployment
      name: frontend
      namespace: production
  timeout: 3m
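
The helm-controller follows the same pattern with a HelmRepository source and a HelmRelease. The sketch below is an assumption-laden example: it installs the ingress-nginx chart, the version range and values are placeholders, and the `helm.toolkit.fluxcd.io/v2` API version applies to recent Flux releases (older installations may still serve a beta version).

```yaml
# flux-helmrelease.yaml
# Hypothetical example: chart version range and values are placeholders.
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
  name: ingress-nginx
  namespace: flux-system
spec:
  interval: 1h               # How often to refresh the chart index
  url: https://kubernetes.github.io/ingress-nginx
---
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: ingress-nginx
  namespace: flux-system
spec:
  interval: 10m              # How often to reconcile the release
  targetNamespace: ingress-nginx
  install:
    createNamespace: true    # Create the target namespace on first install
  chart:
    spec:
      chart: ingress-nginx
      version: "4.x"         # Assumed semver range; pin as appropriate
      sourceRef:
        kind: HelmRepository
        name: ingress-nginx
  values:
    controller:
      replicaCount: 2        # Placeholder value
```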

5. Repository Structure Patterns

Monorepo

All application and infrastructure manifests live in a single repository:

k8s-config/
├── apps/
│   ├── frontend/
│   │   ├── base/
│   │   └── overlays/
│   │       ├── dev/
│   │       ├── staging/
│   │       └── production/
│   └── backend/
│       ├── base/
│       └── overlays/
├── platform/
│   ├── cert-manager/
│   ├── ingress-nginx/
│   └── monitoring/
└── argocd/
    └── applications/

Pros: Single place to see everything. Atomic commits that span applications. Simpler RBAC (one repo to manage).

Cons: Repository grows large. All teams commit to the same repo. Fine-grained access control requires CODEOWNERS files.

Multi-Repo

Each team or application has its own config repository.

Pros: Independent release cycles. Team-scoped access control. Smaller, focused repositories.

Cons: Harder to maintain consistency. Cross-cutting changes require commits to multiple repos. More ArgoCD Applications to manage.

6. Secrets in GitOps

Secrets are the biggest challenge in GitOps -- you cannot commit plaintext secrets to Git, but your GitOps operator needs them in the cluster. Three common solutions:

Sealed Secrets

The Sealed Secrets controller runs in the cluster and generates a public/private key pair. You encrypt secrets locally with the public key using kubeseal, and the encrypted SealedSecret is safe to commit to Git. The controller in the cluster decrypts it and creates a regular Secret.

# Encrypt a secret for committing to Git
kubeseal --format yaml < my-secret.yaml > my-sealed-secret.yaml
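
The file produced by kubeseal has roughly the following shape. This is an illustrative sketch: the resource name and namespace are assumptions, and the ciphertext is shown as a placeholder rather than real output.

```yaml
# my-sealed-secret.yaml (shape only; ciphertext is a placeholder)
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: db-credentials       # assumed name
  namespace: production      # assumed namespace
spec:
  encryptedData:
    password: "<ciphertext produced by kubeseal>"
  template:
    metadata:
      name: db-credentials   # name of the Secret the controller will create
      namespace: production
    type: Opaque
```

By default the ciphertext is bound to the Secret's name and namespace, so a SealedSecret copied to a different namespace will not decrypt.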

SOPS (Secrets Operations)

SOPS (originally developed at Mozilla, now a CNCF project) encrypts the values, not the keys, in YAML files using a cloud KMS (AWS KMS, GCP KMS, Azure Key Vault), age, or PGP. Flux has native SOPS integration. ArgoCD supports SOPS via plugins.
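
A repository-level `.sops.yaml` tells SOPS which files to encrypt and with which key, and a Flux Kustomization can decrypt them during reconciliation. In the sketch below, the path pattern, the KMS key ARN, and the `secrets` Kustomization are all placeholders, and the cluster is assumed to have IAM access to the KMS key.

```yaml
# .sops.yaml (repository root)
# Hypothetical rules: the path pattern and KMS ARN are placeholders.
creation_rules:
  - path_regex: .*/secrets/.*\.yaml$
    encrypted_regex: ^(data|stringData)$   # encrypt only the Secret payload fields
    kms: arn:aws:kms:us-east-1:111122223333:key/REPLACE-ME
---
# flux-kustomization-secrets.yaml
# Hypothetical Kustomization that decrypts SOPS-encrypted manifests.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: secrets              # assumed name
  namespace: flux-system
spec:
  interval: 10m
  sourceRef:
    kind: GitRepository
    name: k8s-config
  path: ./apps/frontend/secrets   # assumed path
  prune: true
  decryption:
    provider: sops           # kustomize-controller decrypts before applying
```

Restricting encryption to `data` and `stringData` keeps metadata readable, so pull request diffs still show which Secret changed.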

External Secrets Operator

The External Secrets Operator (ESO) syncs secrets from external secret stores (AWS Secrets Manager, HashiCorp Vault, GCP Secret Manager) into Kubernetes Secrets. You commit an ExternalSecret manifest to Git that references a secret path in the external store, and ESO creates the corresponding Secret in the cluster.

# external-secret.yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: db-credentials
  namespace: production
spec:
  refreshInterval: 1h # Re-sync from the secret store every hour
  secretStoreRef:
    name: aws-secrets-manager
    kind: ClusterSecretStore
  target:
    name: db-credentials # The Kubernetes Secret to create
    creationPolicy: Owner
  data:
    - secretKey: username # Key in the Kubernetes Secret
      remoteRef:
        key: production/database # Path in AWS Secrets Manager
        property: username # Field within the secret
    - secretKey: password
      remoteRef:
        key: production/database
        property: password
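
The ExternalSecret above refers to a ClusterSecretStore named `aws-secrets-manager`, which must be defined separately. A minimal sketch follows, assuming the operator authenticates through a Kubernetes ServiceAccount mapped to an IAM role; the region and the ServiceAccount name and namespace are placeholders.

```yaml
# cluster-secret-store.yaml
# Hypothetical store definition: region and ServiceAccount are placeholders.
apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: aws-secrets-manager   # referenced by secretStoreRef.name
spec:
  provider:
    aws:
      service: SecretsManager
      region: us-east-1        # assumed region
      auth:
        jwt:
          serviceAccountRef:
            name: external-secrets        # assumed ServiceAccount
            namespace: external-secrets   # assumed namespace
```

Only references to secrets live in Git with this approach; the secret values themselves never leave the external store except to be written into the cluster.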

7. Drift Detection and Remediation

GitOps drift detection is one of the most valuable features of the pull model:

  • ArgoCD drift detection: ArgoCD compares the live cluster state against the rendered manifests from Git on a polling interval of about 3 minutes by default (configurable via timeout.reconciliation in argocd-cm). Drift is displayed in the UI with a diff view.
  • Self-healing: With selfHeal: true, ArgoCD automatically reverts drift. This prevents manual changes from persisting and ensures the cluster always matches Git.
  • Alerting on drift: Configure ArgoCD notifications to send Slack/PagerDuty alerts when drift is detected, even if self-healing is enabled. This helps identify team members making manual changes.
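
The polling interval and a drift alert might be configured as sketched below. The 300s interval, the trigger name `on-out-of-sync`, the template name, and the message text are assumptions for illustration, and a notification service such as Slack is assumed to be configured separately.

```yaml
# argocd-cm.yaml (fragment)
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
  labels:
    app.kubernetes.io/part-of: argocd
data:
  timeout.reconciliation: 300s   # assumed value; the default is about 3 minutes
---
# argocd-notifications-cm.yaml (fragment)
# Hypothetical custom trigger and template for drift alerts.
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-notifications-cm
  namespace: argocd
data:
  trigger.on-out-of-sync: |
    - when: app.status.sync.status == 'OutOfSync'
      send: [app-out-of-sync]
  template.app-out-of-sync: |
    message: |
      Application {{.app.metadata.name}} is OutOfSync with Git.
```

An Application opts in through an annotation of the form `notifications.argoproj.io/subscribe.on-out-of-sync.slack: <channel>`, where the channel name is whatever your team uses.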

Common Pitfalls

  • Not enabling pruning: Without prune: true, deleting a resource from Git does not delete it from the cluster. Over time, orphaned resources accumulate and cause confusion.
  • Committing plaintext secrets to Git: Even in a private repository, this is a security risk. Use Sealed Secrets, SOPS, or External Secrets Operator.
  • Ignoring the image update problem: ArgoCD does not update image tags in Git by itself. You need a separate mechanism -- ArgoCD Image Updater, Flux Image Automation, or a CI pipeline that commits new tags to the config repo.
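  • Self-healing conflicts with HPA: If your Git manifests specify replicas: 3 and HPA scales to 10, ArgoCD with selfHeal will revert to 3. Either omit replicas from the manifest, or use ignoreDifferences to exclude /spec/replicas from the diff and add the RespectIgnoreDifferences=true sync option so that a sync does not reset the field.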
  • Monorepo sync performance: A large monorepo with many paths can slow ArgoCD sync times. Use targeted refresh and webhook-based triggers instead of polling.
  • Mixing imperative and declarative: If you use kubectl apply alongside ArgoCD, the two will fight over resource ownership. Commit to one approach.
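
For the HPA pitfall, the relevant part of an Application spec could look like the fragment below. It is a sketch rather than a complete manifest: the `source`, `destination`, and `project` fields are omitted, and the Deployment is assumed to be managed by an HPA.

```yaml
# Fragment of an Application spec (source/destination omitted for brevity)
spec:
  ignoreDifferences:
    - group: apps
      kind: Deployment
      jsonPointers:
        - /spec/replicas               # HPA owns this field
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - RespectIgnoreDifferences=true  # also leave ignored fields alone during sync
```

Without RespectIgnoreDifferences, ignoreDifferences only hides the field from the diff; a sync triggered by any other change would still apply the replica count from Git.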

Best Practices

  1. Separate app code and config repos: Keep application source code in one repository and Kubernetes manifests in a separate config repository. This decouples application release cycles from config changes and prevents CI builds from triggering on config-only changes.
  2. Use Kustomize overlays for environments: Define base manifests once and use Kustomize overlays for environment-specific values (replicas, resource limits, image tags). This reduces duplication.
  3. Require pull request reviews for production changes: Protect the main branch of your config repository with branch protection rules. All production changes should be peer-reviewed.
  4. Pin image tags, never use :latest: Mutable tags like :latest make deployments non-reproducible. Use immutable tags (commit SHAs or semantic versions) so a Git revert actually reverts the deployment.
  5. Monitor ArgoCD health: ArgoCD itself is a critical component. Monitor its controller metrics, repo server latency, and application sync status.
  6. Use Projects for multi-tenancy: ArgoCD Projects restrict which repositories and clusters each team can deploy to, providing multi-tenancy guardrails.
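
A Project for a single team might be sketched as follows. The project name and description are hypothetical, and the repository and namespace are taken from the examples earlier in this article.

```yaml
# appproject-team-frontend.yaml
# Hypothetical Project restricting one team to one repo and one namespace.
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: team-frontend        # assumed project name
  namespace: argocd
spec:
  description: Frontend team deployments
  sourceRepos:
    - https://github.com/myorg/k8s-config.git   # only this repo may be deployed
  destinations:
    - server: https://kubernetes.default.svc
      namespace: production                     # only this namespace may be targeted
  clusterResourceWhitelist: []   # no cluster-scoped resources allowed
```

Applications opt in by setting `spec.project: team-frontend`; ArgoCD then rejects any source or destination outside these lists.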

What's Next?

  • Crossplane: Extend GitOps from application deployments to infrastructure provisioning -- manage cloud resources through the same Git-based workflow.
  • CRDs & Operators: Understand the operator pattern that ArgoCD and Flux are built on, and how CRDs enable declarative management of anything.
  • Disaster Recovery: GitOps complements disaster recovery by maintaining a complete, version-controlled record of your cluster's desired state in Git. Restoring the declared resources is largely a matter of pointing a new ArgoCD instance at the same repository; persistent data and secrets held outside Git still need their own backups.