CRDs & Operators
- API Extensibility: Custom Resource Definitions (CRDs) allow you to add your own object types (like Database, Certificate, or KafkaCluster) to the Kubernetes API without modifying the Kubernetes source code. Once registered, custom resources work with kubectl, RBAC, admission webhooks, and every other Kubernetes-native tool.
- The Operator Pattern: Combining a CRD with a custom controller creates an Operator -- software that encodes human operational knowledge into automated reconciliation loops. The controller watches for changes to custom resources and takes action to make the real world match the declared desired state.
- Declarative Operations: Operators automate complex Day 2 operations (backups, failover, scaling, upgrades, certificate rotation) that would otherwise require manual intervention or custom scripts. They turn operational runbooks into code.
- Ecosystem of Operators: The Kubernetes ecosystem includes hundreds of production-grade operators -- Prometheus Operator for monitoring, cert-manager for TLS certificates, Strimzi for Kafka, CloudNativePG for PostgreSQL, and many more. Before building your own, check if one already exists.
- Building Operators: Frameworks like kubebuilder and operator-sdk provide scaffolding, code generation, and testing utilities to build production-quality operators in Go, with support for Rust, Java, and Python via alternative SDKs.
Kubernetes is extensible by design. You are not limited to the built-in resource types like Pods, Services, and Deployments. You can teach Kubernetes entirely new concepts by defining your own resources and building controllers to act on them.
1. Custom Resource Definitions (CRDs)
A CRD registers a new resource type with the Kubernetes API server. Once registered, you can create, read, update, and delete instances of that resource using the same tools and APIs you use for any built-in resource.
Creating a CRD
# crd-database.yaml
# Register a new "Database" resource type in the Kubernetes API
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: databases.example.com # Must be <plural>.<group>
spec:
  group: example.com # API group
  names:
    kind: Database # PascalCase singular
    plural: databases # Lowercase plural (used in URLs)
    singular: database # Lowercase singular
    shortNames:
      - db # kubectl get db
    categories:
      - all # Include in 'kubectl get all'
  scope: Namespaced # Namespaced or Cluster
  versions:
    - name: v1alpha1
      served: true # Accept API requests for this version
      storage: true # Store in etcd using this version
      schema:
        openAPIV3Schema: # Validation schema
          type: object
          properties:
            spec:
              type: object
              required:
                - engine
                - version
                - storage
              properties:
                engine:
                  type: string
                  enum: ["postgres", "mysql", "mongodb"]
                  description: "Database engine type"
                version:
                  type: string
                  description: "Engine version (e.g., '15' for Postgres)"
                storage:
                  type: string
                  pattern: "^[0-9]+(Gi|Ti)$"
                  description: "Storage size (e.g., '100Gi')"
                replicas:
                  type: integer
                  minimum: 1
                  maximum: 7
                  default: 1
                  description: "Number of replicas for HA"
                backup:
                  type: object
                  properties:
                    enabled:
                      type: boolean
                      default: true
                    schedule:
                      type: string
                      default: "0 2 * * *"
            status:
              type: object
              properties:
                phase:
                  type: string
                  enum: ["Provisioning", "Running", "Failed", "Deleting"]
                endpoint:
                  type: string
                conditions:
                  type: array
                  items:
                    type: object
                    properties:
                      type:
                        type: string
                      status:
                        type: string
                      lastTransitionTime:
                        type: string
                        format: date-time
                      reason:
                        type: string
                      message:
                        type: string
      subresources:
        status: {} # Enable the /status subresource
      additionalPrinterColumns: # Customize kubectl get output
        - name: Engine
          type: string
          jsonPath: .spec.engine
        - name: Version
          type: string
          jsonPath: .spec.version
        - name: Status
          type: string
          jsonPath: .status.phase
        - name: Endpoint
          type: string
          jsonPath: .status.endpoint
        - name: Age
          type: date
          jsonPath: .metadata.creationTimestamp
After applying this CRD, kubectl treats Database as a first-class citizen:
kubectl apply -f crd-database.yaml
kubectl get databases # or kubectl get db
kubectl describe database my-db
kubectl delete database my-db
Custom Resource Instances
Once the CRD is registered, you create instances just like any other resource:
# my-database.yaml
# Create a Database instance -- the operator will provision the actual database
apiVersion: example.com/v1alpha1
kind: Database
metadata:
  name: orders-db
  namespace: production
spec:
  engine: postgres
  version: "15"
  storage: 100Gi
  replicas: 3
  backup:
    enabled: true
    schedule: "0 */6 * * *" # Backup every 6 hours
The API server validates this against the OpenAPI schema defined in the CRD. If engine is set to "redis" (not in the enum), or storage does not match the pattern, the request is rejected.
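To make the schema rules concrete, here is a minimal Go sketch that applies the same checks by hand. It is only an illustration of what the schema expresses, not how the API server implements validation; DatabaseSpec and validateSpec are hypothetical names, and the sketch assumes defaulting (replicas: 1) has already been applied.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// DatabaseSpec is a hypothetical Go mirror of the spec fields in the example CRD.
type DatabaseSpec struct {
	Engine   string
	Version  string
	Storage  string
	Replicas int
}

// storagePattern is the same regular expression used in the CRD schema.
var storagePattern = regexp.MustCompile(`^[0-9]+(Gi|Ti)$`)

// allowedEngines mirrors the enum declared for spec.engine.
var allowedEngines = map[string]bool{"postgres": true, "mysql": true, "mongodb": true}

// validateSpec applies the rules the schema declares: required fields,
// the engine enum, the storage pattern, and the replica bounds.
func validateSpec(s DatabaseSpec) error {
	if s.Engine == "" || s.Version == "" || s.Storage == "" {
		return errors.New("engine, version, and storage are required")
	}
	if !allowedEngines[s.Engine] {
		return fmt.Errorf("unsupported engine %q", s.Engine)
	}
	if !storagePattern.MatchString(s.Storage) {
		return fmt.Errorf("storage %q does not match %s", s.Storage, storagePattern)
	}
	if s.Replicas < 1 || s.Replicas > 7 {
		return fmt.Errorf("replicas %d is outside the range 1 to 7", s.Replicas)
	}
	return nil
}

func main() {
	valid := DatabaseSpec{Engine: "postgres", Version: "15", Storage: "100Gi", Replicas: 3}
	invalid := DatabaseSpec{Engine: "redis", Version: "7", Storage: "100Gi", Replicas: 1}
	fmt.Println("valid spec:", validateSpec(valid))
	fmt.Println("invalid spec:", validateSpec(invalid))
}
```

The point of declaring these rules in the CRD rather than in controller code is that bad objects are rejected at admission time, before the controller ever sees them.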
The Status Subresource
The status subresource is critical for the operator pattern. It separates the user's desired state (spec) from the system's observed state (status):
- Users and tools update spec (via kubectl apply, ArgoCD, etc.).
- The controller updates status (via the /status subresource endpoint).
- RBAC can grant users permission to update spec but not status, and grant the controller permission to update status but not spec.
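With the status subresource enabled, writes to /status change only the status stanza and do not increment metadata.generation, so a controller can tell a real spec change from its own status update. Without it, a status write by the controller is indistinguishable from a spec edit, and a controller that reacts to every update event can keep re-triggering itself in an endless loop.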
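One common way to use this split is to compare metadata.generation (which Kubernetes bumps on spec changes when the status subresource is enabled) with an observedGeneration value that the controller records in status. The following is a small in-memory sketch of that idea; the Database type and the helper functions are hypothetical stand-ins, not the API server's behavior in full.

```go
package main

import "fmt"

// Database is a hypothetical, flattened stand-in for a custom resource.
type Database struct {
	Generation         int64  // metadata.generation: bumped on spec changes only
	Replicas           int    // a spec field
	Phase              string // a status field
	ObservedGeneration int64  // status field recording the last generation handled
}

// updateSpec simulates a user edit: a real spec change bumps the generation.
func updateSpec(db *Database, replicas int) {
	if db.Replicas != replicas {
		db.Replicas = replicas
		db.Generation++
	}
}

// recordStatus simulates a controller write through /status:
// the generation is left untouched.
func recordStatus(db *Database, phase string) {
	db.Phase = phase
	db.ObservedGeneration = db.Generation
}

// needsReconcile reports whether the spec changed since the controller last acted.
func needsReconcile(db Database) bool {
	return db.ObservedGeneration != db.Generation
}

func main() {
	db := Database{Generation: 1, Replicas: 1}
	fmt.Println("new object, needs reconcile:", needsReconcile(db))

	recordStatus(&db, "Running")
	fmt.Println("after status write, needs reconcile:", needsReconcile(db))

	updateSpec(&db, 3)
	fmt.Println("after spec edit, needs reconcile:", needsReconcile(db))
}
```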
Finalizers
Finalizers allow controllers to perform cleanup before a resource is deleted. When a resource has a finalizer, Kubernetes sets the deletionTimestamp but does not actually remove the resource until all finalizers are removed.
This is essential for operators that manage external resources. When you delete a Database custom resource, the operator needs time to deprovision the actual database, delete snapshots, or clean up DNS records before the Kubernetes object is garbage collected.
# The operator adds a finalizer when it creates the database
metadata:
  finalizers:
    - databases.example.com/cleanup
The controller's reconciliation loop checks for deletionTimestamp:
- If the resource is being deleted (has a deletionTimestamp), perform cleanup (delete the real database, remove DNS records).
- Once cleanup is complete, remove the finalizer from the resource.
- Kubernetes then garbage collects the resource.
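The deletion flow above can be sketched with a tiny in-memory store. This is a simplified model of the behavior, assuming a hypothetical Store type; it is not the API server's implementation.

```go
package main

import (
	"fmt"
	"time"
)

// Object is a hypothetical stand-in for the metadata of a custom resource.
type Object struct {
	Name              string
	Finalizers        []string
	DeletionTimestamp *time.Time
}

// Store is a toy in-memory replacement for the API server's storage.
type Store map[string]*Object

// Delete marks the object for deletion. The object is only removed
// once no finalizers remain.
func (s Store) Delete(name string) {
	obj, ok := s[name]
	if !ok {
		return
	}
	if obj.DeletionTimestamp == nil {
		now := time.Now()
		obj.DeletionTimestamp = &now
	}
	s.collect(name)
}

// RemoveFinalizer is what the controller calls after cleanup has finished.
func (s Store) RemoveFinalizer(name, finalizer string) {
	obj, ok := s[name]
	if !ok {
		return
	}
	kept := obj.Finalizers[:0]
	for _, f := range obj.Finalizers {
		if f != finalizer {
			kept = append(kept, f)
		}
	}
	obj.Finalizers = kept
	s.collect(name)
}

// collect removes the object if it is marked for deletion and has no finalizers.
func (s Store) collect(name string) {
	obj := s[name]
	if obj.DeletionTimestamp != nil && len(obj.Finalizers) == 0 {
		delete(s, name)
	}
}

func main() {
	const finalizer = "databases.example.com/cleanup"
	s := Store{"orders-db": {Name: "orders-db", Finalizers: []string{finalizer}}}

	s.Delete("orders-db")
	_, present := s["orders-db"]
	fmt.Println("present after delete request:", present)

	// The controller would deprovision the real database here, then:
	s.RemoveFinalizer("orders-db", finalizer)
	_, present = s["orders-db"]
	fmt.Println("present after finalizer removal:", present)
}
```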
2. The Operator Pattern
A CRD by itself only stores data in etcd -- it does not do anything. To make custom resources active, you need a controller that watches for changes and acts on them.
CRD + Custom Controller = Operator
The controller runs the reconciliation loop:
- Watch: The controller subscribes to change events for its custom resource type.
- Compare: On each event (or periodic re-sync), it compares the desired state (spec) with the actual state of the world.
- Act: If there is a difference, the controller takes action to converge the actual state toward the desired state.
- Update Status: The controller updates the resource's status to reflect the current observed state.
This loop runs continuously and is level-triggered (not edge-triggered). This means the controller does not rely on seeing every individual event -- it simply compares desired vs. actual state on each reconciliation. If an event is missed, the next periodic re-sync catches the drift.
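A minimal sketch of a level-triggered reconcile step, assuming hypothetical Spec and World types where the only managed property is a replica count. Note that the function never looks at what changed; it only looks at the current desired and actual state.

```go
package main

import "fmt"

// Spec is the desired state declared by the user (hypothetical type).
type Spec struct{ Replicas int }

// World is the actual state of the managed system (hypothetical type).
type World struct{ Replicas int }

// reconcile converges world toward spec and reports what it did.
// It depends only on the current state, never on the event history,
// so calling it again after convergence is a no-op.
func reconcile(spec Spec, world *World) string {
	switch {
	case world.Replicas < spec.Replicas:
		added := spec.Replicas - world.Replicas
		world.Replicas = spec.Replicas
		return fmt.Sprintf("scaled up by %d", added)
	case world.Replicas > spec.Replicas:
		removed := world.Replicas - spec.Replicas
		world.Replicas = spec.Replicas
		return fmt.Sprintf("scaled down by %d", removed)
	default:
		return "no change"
	}
}

func main() {
	world := &World{Replicas: 1}

	// Suppose the spec went 1 -> 3 -> 5 and the controller missed the
	// intermediate event. A single reconcile against the latest spec suffices.
	fmt.Println(reconcile(Spec{Replicas: 5}, world))

	// A periodic re-sync with nothing to do is harmless.
	fmt.Println(reconcile(Spec{Replicas: 5}, world))
}
```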
3. Popular Operators
Prometheus Operator
The Prometheus Operator manages Prometheus, Alertmanager, and related monitoring components. It introduces CRDs like:
- ServiceMonitor: Defines which Services to scrape for metrics. The operator watches for ServiceMonitor instances and automatically reconfigures Prometheus to scrape new targets.
- PrometheusRule: Defines alerting and recording rules. Adding a new alerting rule is as simple as applying a YAML manifest.
- Prometheus: Declares a Prometheus instance with specific configuration (retention, storage, replicas).
cert-manager
cert-manager automates TLS certificate management. It introduces:
- Certificate: Declares that a TLS certificate is needed for a specific DNS name.
- Issuer / ClusterIssuer: Defines how certificates are obtained (Let's Encrypt ACME, Vault, self-signed).
When you create a Certificate resource, cert-manager automatically requests the certificate from the issuer, stores it in a Kubernetes Secret, and renews it before expiration.
Strimzi (Apache Kafka)
Strimzi manages Apache Kafka clusters on Kubernetes. It introduces:
- Kafka: Declares a complete Kafka cluster (brokers, ZooKeeper/KRaft, topic configuration).
- KafkaTopic: Declares Kafka topics with partition counts and replication factors.
- KafkaUser: Manages Kafka ACLs and authentication.
The Strimzi operator handles rolling upgrades, broker rebalancing, and TLS certificate management for the entire Kafka cluster.
CloudNativePG (PostgreSQL)
CloudNativePG manages PostgreSQL clusters with:
- Cluster: Declares a PostgreSQL cluster with primary and replica instances, automated failover, continuous backup to S3/GCS, and point-in-time recovery.
4. Building Operators with kubebuilder
kubebuilder is a widely used framework for building Kubernetes operators in Go. It provides scaffolding, code generation, and integration with the controller-runtime library.
# Initialize a new operator project
kubebuilder init --domain example.com --repo github.com/myorg/database-operator
# Create a new API (CRD + Controller)
kubebuilder create api --group infra --version v1alpha1 --kind Database
This generates:
- api/v1alpha1/database_types.go -- The Go structs defining the CRD schema (DatabaseSpec, DatabaseStatus).
- internal/controller/database_controller.go -- The reconciliation logic.
- config/crd/ -- CRD YAML generated from the Go types and their kubebuilder marker comments.
The core of your operator is the Reconcile function:
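// Simplified reconciliation logic (internal/controller/database_controller.go).
// deprovisionDatabase and ensureDatabase are helper methods that are not shown here.
package controller

import (
	"context"
	"time"

	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
	"sigs.k8s.io/controller-runtime/pkg/log"

	infrav1alpha1 "github.com/myorg/database-operator/api/v1alpha1"
)

const cleanupFinalizer = "infra.example.com/cleanup"

func (r *DatabaseReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	logger := log.FromContext(ctx)

	// 1. Fetch the Database custom resource
	var database infrav1alpha1.Database
	if err := r.Get(ctx, req.NamespacedName, &database); err != nil {
		// The object may already be gone; that is not an error
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}

	// 2. Check if being deleted (finalizer pattern)
	if !database.DeletionTimestamp.IsZero() {
		if controllerutil.ContainsFinalizer(&database, cleanupFinalizer) {
			// Perform cleanup: delete the real database, remove DNS records
			if err := r.deprovisionDatabase(ctx, &database); err != nil {
				return ctrl.Result{}, err
			}
			// Remove finalizer so Kubernetes can delete the object
			controllerutil.RemoveFinalizer(&database, cleanupFinalizer)
			if err := r.Update(ctx, &database); err != nil {
				return ctrl.Result{}, err
			}
		}
		return ctrl.Result{}, nil
	}

	// 3. Ensure finalizer exists
	if !controllerutil.ContainsFinalizer(&database, cleanupFinalizer) {
		controllerutil.AddFinalizer(&database, cleanupFinalizer)
		if err := r.Update(ctx, &database); err != nil {
			return ctrl.Result{}, err
		}
	}

	// 4. Provision or update the database
	endpoint, err := r.ensureDatabase(ctx, &database)
	if err != nil {
		database.Status.Phase = "Failed"
		// Report the failure in status, but return the original error for retry
		if statusErr := r.Status().Update(ctx, &database); statusErr != nil {
			logger.Error(statusErr, "unable to record Failed phase")
		}
		return ctrl.Result{}, err
	}

	// 5. Update status
	database.Status.Phase = "Running"
	database.Status.Endpoint = endpoint
	if err := r.Status().Update(ctx, &database); err != nil {
		return ctrl.Result{}, err
	}

	// 6. Requeue after 5 minutes to check health
	return ctrl.Result{RequeueAfter: 5 * time.Minute}, nil
}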
operator-sdk is an alternative framework that wraps kubebuilder with additional features like OLM (Operator Lifecycle Manager) integration, Ansible-based operators, and Helm-based operators for teams that do not want to write Go code.
5. Conversion Webhooks
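When you evolve your CRD schema from v1alpha1 to v1beta1 and the two versions differ in structure, you need a conversion webhook to translate between them. (If the schemas are identical, the default None strategy, which only rewrites the apiVersion field, is enough.) The API server calls the webhook whenever a client reads or writes a resource in a version other than the one stored in etcd.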
This allows you to make breaking changes to your API while maintaining backward compatibility. Existing resources stored as v1alpha1 are automatically converted to v1beta1 when requested.
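As an illustration of what the conversion logic itself might look like, suppose (hypothetically) that v1beta1 replaces the string field storage with an integer field storageGiB. The sketch below shows only the field mapping; a real webhook receives and returns ConversionReview objects over HTTPS, which is omitted here, and all type and function names are assumptions.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// SpecV1Alpha1 uses a string such as "100Gi" for storage.
type SpecV1Alpha1 struct {
	Engine  string
	Storage string
}

// SpecV1Beta1 is a hypothetical newer shape with storage as an integer in GiB.
type SpecV1Beta1 struct {
	Engine     string
	StorageGiB int
}

// toV1Beta1 converts the old shape to the new one.
func toV1Beta1(in SpecV1Alpha1) (SpecV1Beta1, error) {
	var multiplier int
	var digits string
	switch {
	case strings.HasSuffix(in.Storage, "Gi"):
		multiplier, digits = 1, strings.TrimSuffix(in.Storage, "Gi")
	case strings.HasSuffix(in.Storage, "Ti"):
		multiplier, digits = 1024, strings.TrimSuffix(in.Storage, "Ti")
	default:
		return SpecV1Beta1{}, fmt.Errorf("unsupported storage value %q", in.Storage)
	}
	n, err := strconv.Atoi(digits)
	if err != nil {
		return SpecV1Beta1{}, fmt.Errorf("invalid storage value %q: %w", in.Storage, err)
	}
	return SpecV1Beta1{Engine: in.Engine, StorageGiB: n * multiplier}, nil
}

// toV1Alpha1 converts back so that older clients keep working.
func toV1Alpha1(in SpecV1Beta1) SpecV1Alpha1 {
	return SpecV1Alpha1{Engine: in.Engine, Storage: strconv.Itoa(in.StorageGiB) + "Gi"}
}

func main() {
	newer, err := toV1Beta1(SpecV1Alpha1{Engine: "postgres", Storage: "2Ti"})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", newer)
	fmt.Printf("%+v\n", toV1Alpha1(newer))
}
```

Conversions should round-trip without losing information. In this sketch "2Ti" comes back as "2048Gi", which is the same quantity written differently; whether that is acceptable is a design decision for the API author.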
6. When to Use CRDs vs ConfigMaps
CRDs and ConfigMaps both store configuration, but they serve very different purposes:
| Aspect | CRD | ConfigMap |
|---|---|---|
| Purpose | Extend the API with new resource types | Store configuration data for pods |
| Schema Validation | OpenAPI v3 schema, enforced by API server | None (arbitrary key-value pairs) |
| RBAC | Rules can target the custom resource type on its own | Rules target the shared configmaps resource, narrowed only by object name |
| Status | Supports status subresource | No status concept |
| Controllers | Designed to be watched by controllers | Not designed for controller patterns |
| kubectl | Full CRUD with custom printer columns | Generic key-value display |
Use CRDs when: You are building an operator, need schema validation, need to represent a managed entity with lifecycle semantics, or need status tracking.
Use ConfigMaps when: You need to pass configuration data to a pod (environment variables, config files), and the configuration does not represent a managed entity.
Common Pitfalls
- Not enabling the status subresource: Without it, users can overwrite the status field in their YAML, and the controller's own status writes are hard to tell apart from spec changes, which invites reconciliation loops. Always enable subresources.status.
- Missing validation schemas: Without a strict OpenAPI schema, malformed objects reach the controller and cause errors there instead of being rejected by the API server. Define schemas with enums, patterns, and required fields.
- Forgetting finalizers for external resources: If your operator creates external resources (cloud databases, DNS records), you must use finalizers to ensure cleanup on deletion. Without finalizers, deleting the CR leaves orphaned external resources.
- Not handling concurrent modifications: Multiple writers (other controllers, users, or an earlier reconciliation working from a stale cache) can race on the same resource. Use resourceVersion for optimistic concurrency and handle Conflict errors by re-fetching and retrying.
- Overly broad RBAC for the controller: Grant your controller only the permissions it needs. A controller that manages Databases does not need cluster-admin.
- Not using owner references: When your controller creates child resources (Pods, Services, PVCs), set the custom resource as the owner. This ensures child resources are garbage collected when the parent is deleted.
Best Practices
- Follow the Kubernetes API conventions: Use spec for desired state, status for observed state, and metadata for object metadata. Users expect consistent patterns.
- Make reconciliation idempotent: The reconcile function may be called multiple times for the same resource. Each call should produce the same result.
- Use conditions in status: Follow the standard Kubernetes condition pattern (type, status, reason, message, lastTransitionTime) for communicating resource health.
- Version your CRD API: Use v1alpha1 for experimental APIs, v1beta1 for stable-but-evolving APIs, and v1 for stable APIs. Implement conversion webhooks when evolving.
- Write integration tests: Use envtest (from controller-runtime) to test your operator against a real API server without needing a full cluster.
- Publish your CRD schema: Include the CRD YAML in your Helm chart or kustomize base so users can install it alongside the operator.
What's Next?
- Crossplane: Crossplane is built entirely on CRDs and the operator pattern, extending Kubernetes to manage cloud infrastructure.
- GitOps (ArgoCD): ArgoCD itself is an operator (it watches Application CRDs) and can manage the deployment of your custom operators and their CRDs.
- Security Policies: Use RBAC to control who can create, update, and delete your custom resources, and admission webhooks to enforce policies beyond what OpenAPI schemas support.