Security Policies
- Defense in Depth: Kubernetes security requires multiple overlapping layers -- network policies for traffic segmentation, RBAC for access control, Pod Security Standards for workload hardening, image scanning for supply chain security, and audit logging for forensics. No single layer is sufficient.
- Network Policies as Firewalls: Network Policies control pod-to-pod and pod-to-external traffic at L3/L4. By default, all pods can communicate with all other pods. Implementing default-deny policies and then explicitly allowing required traffic paths is the foundation of cluster network security.
- Pod Security Standards (PSS): PSS replaced the deprecated PodSecurityPolicies and provides three built-in security profiles (Privileged, Baseline, Restricted) enforced at the namespace level via labels. The Restricted profile prevents running as root, disables privilege escalation, drops all capabilities, and requires a seccomp profile.
- RBAC Least Privilege: Kubernetes RBAC should follow the principle of least privilege -- grant only the permissions each identity (user, service account, operator) actually needs. Avoid cluster-admin except for break-glass scenarios, and audit bindings regularly.
- Image and Supply Chain Security: Secure your container supply chain by scanning images for vulnerabilities, enforcing that only images from trusted registries are deployed, and signing images with Sigstore/cosign to verify provenance.
- Audit Everything: Kubernetes audit logging records all API requests. Enable it to detect unauthorized access, investigate incidents, and meet compliance requirements.
Securing a Kubernetes cluster requires a layered approach -- no single mechanism provides complete protection. This guide covers the essential security layers for production clusters, from network segmentation to supply chain integrity.
1. Defense in Depth Strategy
The principle of defense in depth means that even if one security layer is bypassed, others continue to protect the system. In Kubernetes, the key layers are:
- Network layer: Network Policies control which pods can communicate.
- Authentication and authorization: RBAC controls who can access which resources through the API.
- Workload hardening: Pod Security Standards restrict what containers can do (run as root, mount host paths, use privileged mode).
- Image security: Vulnerability scanning and image signing prevent deployment of compromised or unvetted images.
- Runtime security: Tools like Falco detect anomalous behavior at runtime (unexpected process execution, file access, network connections).
- Audit logging: API audit logs provide a forensic trail of all cluster operations.
2. Network Policies Deep Dive
By default, all pods in a Kubernetes cluster can communicate with all other pods across all namespaces. This is convenient for development but dangerous in production. Network Policies act as firewalls that control traffic between pods.
Network Isolation Example
Consider a simple 3-tier application plus a "malicious" pod in the same namespace. Without any policy, the malicious pod can reach the backend directly. Applying a Network Policy isolates the backend so that it accepts traffic only from the frontend.
Default Deny All Ingress
The single most impactful security measure you can apply to a namespace is a default deny policy. This blocks all ingress traffic to all pods in the namespace, and then you explicitly allow only the traffic paths that are needed.
# default-deny-ingress.yaml
# Block all incoming traffic to pods in this namespace by default
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: production
spec:
  podSelector: {}   # Empty selector = all pods in namespace
  policyTypes:
  - Ingress         # Only affects ingress; egress is unaffected
  # No ingress rules = deny all ingress
Default Deny All Egress
# default-deny-egress.yaml
# Block all outgoing traffic from pods in this namespace by default
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-egress
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Egress
  # No egress rules = deny all egress
Important: A default deny on egress also blocks DNS lookups (port 53). You almost always need to pair it with an explicit DNS egress rule.
Allow DNS Egress
# allow-dns-egress.yaml
# Allow all pods to resolve DNS (required when using default deny egress)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
Allow Traffic from Specific Namespace
# allow-frontend-to-backend.yaml
# Allow only pods labeled 'app: frontend' in the production namespace
# to reach pods labeled 'app: backend' on port 8080
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend          # Apply to backend pods
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:    # ANDed with the podSelector below (same 'from' entry)
        matchLabels:
          kubernetes.io/metadata.name: production
      podSelector:
        matchLabels:
          app: frontend     # Allow only frontend pods
    ports:
    - protocol: TCP
      port: 8080
Allow Traffic from Monitoring Namespace
# allow-monitoring-scrape.yaml
# Allow Prometheus to scrape metrics from all pods on port 9090
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-prometheus-scrape
  namespace: production
spec:
  podSelector: {}           # All pods in production namespace
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: monitoring
      podSelector:          # ANDed with the namespaceSelector above
        matchLabels:
          app: prometheus
    ports:
    - protocol: TCP
      port: 9090            # Metrics port
Deny All External Traffic (ipBlock)
# deny-external-egress.yaml
# Allow only cluster-internal traffic, block all external egress
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-external-egress
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Egress
  egress:
  - to:
    - ipBlock:
        cidr: 10.0.0.0/8        # Allow cluster-internal traffic
    - ipBlock:
        cidr: 172.16.0.0/12     # Allow private network ranges
    - ipBlock:
        cidr: 192.168.0.0/16
  - to:                         # Allow DNS separately
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
    ports:
    - protocol: UDP
      port: 53
Note: Network Policies require a CNI plugin that supports them (Calico, Cilium, Weave Net). Plugins without policy support, such as kubenet or plain Flannel, do not enforce Network Policies -- they are silently ignored.
3. Pod Security Standards (PSS)
Pod Security Standards replaced the deprecated PodSecurityPolicies (removed in Kubernetes 1.25) and provide three predefined security profiles enforced via namespace labels:
Three Security Levels
Privileged: No restrictions. For system-level workloads that need full host access (CNI plugins, CSI drivers, logging DaemonSets). Never use for application workloads.
Baseline: Prevents known privilege escalations. Disallows hostNetwork, hostPID, hostIPC, privileged containers, and most hostPath mounts. Suitable for most workloads that do not require special privileges.
Restricted: Maximum security. Everything in Baseline plus: must run as non-root, must drop ALL capabilities, must disallow privilege escalation, and must set a seccomp profile (RuntimeDefault or Localhost). A read-only root filesystem is not mandated by the profile but is a recommended hardening step on top. This is the target for all application workloads.
# namespace-restricted.yaml
# Enforce the restricted Pod Security Standard on this namespace
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted   # Reject pods that violate
    pod-security.kubernetes.io/enforce-version: latest
    pod-security.kubernetes.io/audit: restricted     # Log violations
    pod-security.kubernetes.io/audit-version: latest
    pod-security.kubernetes.io/warn: restricted      # Warn on kubectl apply
    pod-security.kubernetes.io/warn-version: latest
A pod that complies with the Restricted profile looks like this:
# secure-pod.yaml
# A pod that complies with the PSS Restricted profile
apiVersion: v1
kind: Pod
metadata:
  name: secure-app
  namespace: production
spec:
  securityContext:
    runAsNonRoot: true                  # Pod-level: must run as non-root
    seccompProfile:
      type: RuntimeDefault              # Enable seccomp for syscall filtering
  containers:
  - name: app
    image: myregistry.com/app:v1.2.3
    securityContext:
      allowPrivilegeEscalation: false   # Cannot gain more privileges
      readOnlyRootFilesystem: true      # Cannot write to the container filesystem
      runAsNonRoot: true                # Container must run as non-root
      capabilities:
        drop:
        - ALL                           # Drop all Linux capabilities
    volumeMounts:
    - name: tmp
      mountPath: /tmp                   # Writable /tmp via emptyDir
  volumes:
  - name: tmp
    emptyDir: {}                        # emptyDir for temporary files
4. RBAC Security Hardening
Kubernetes RBAC (Role-Based Access Control) controls who can perform which actions on which resources. Production clusters require careful RBAC configuration:
Principle of Least Privilege
- Never grant cluster-admin to application ServiceAccounts. Most workloads need zero RBAC permissions (they do not interact with the Kubernetes API).
- Use Roles (namespaced) instead of ClusterRoles whenever possible to limit scope.
- Avoid wildcards (*) in resources or verbs. Be explicit about what each role can do.
- Audit RoleBindings and ClusterRoleBindings regularly. Use kubectl auth can-i --list --as=system:serviceaccount:ns:sa to verify permissions.
# rbac-developer.yaml
# A Role that gives developers read-only access to their namespace
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: developer
  namespace: team-alpha
rules:
- apiGroups: ["", "apps", "batch"]
  resources: ["pods", "deployments", "services", "jobs", "configmaps"]
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources: ["pods/log"]
  verbs: ["get"]   # View logs but not exec into pods
# Explicitly NO: pods/exec, secrets, or any write verbs
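A Role grants nothing until it is bound to a subject. A RoleBinding along the following lines attaches the developer Role above to a group; the group name team-alpha-devs is illustrative and would come from your identity provider:

```yaml
# rbac-developer-binding.yaml
# Bind the 'developer' Role to a developer group (group name is illustrative)
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: developer-binding
  namespace: team-alpha
subjects:
- kind: Group
  name: team-alpha-devs              # Hypothetical group from your IdP
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: developer                    # The Role defined above
  apiGroup: rbac.authorization.k8s.io
```

Because the binding is namespaced, members of the group get read-only access in team-alpha and nowhere else.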
ServiceAccount Best Practices
- Disable automounting of ServiceAccount tokens unless the pod actually needs to talk to the Kubernetes API:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-app
  namespace: production
automountServiceAccountToken: false   # Do not mount API token
- Use short-lived tokens: Kubernetes 1.22+ automatically uses bound, time-limited tokens via the TokenRequest API instead of long-lived tokens stored in Secrets.
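For pods that genuinely need API access, you can still keep token exposure low by mounting a bound token explicitly through a projected volume instead of relying on automounting. A minimal sketch (the expirationSeconds value is illustrative; 600 is the minimum Kubernetes accepts):

```yaml
# pod-projected-token.yaml
# Mount a short-lived, bound ServiceAccount token via a projected volume
apiVersion: v1
kind: Pod
metadata:
  name: api-client
  namespace: production
spec:
  serviceAccountName: my-app
  containers:
  - name: app
    image: myregistry.com/app:v1.2.3
    volumeMounts:
    - name: api-token
      mountPath: /var/run/secrets/tokens   # App reads the token from here
  volumes:
  - name: api-token
    projected:
      sources:
      - serviceAccountToken:
          path: token
          expirationSeconds: 600   # Kubelet rotates the token before expiry
```

The kubelet refreshes the projected token automatically, so a stolen copy becomes useless within minutes.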
5. Image Security
Vulnerability Scanning
Scan container images for known CVEs before deployment and continuously in the registry:
- Trivy: Open-source scanner from Aqua Security. Run in CI/CD pipelines to block images with critical vulnerabilities.
- Grype: Open-source scanner from Anchore. Fast and supports multiple package ecosystems.
- Cloud provider scanners: ECR image scanning, GCR vulnerability scanning, ACR Defender.
Enforcing Trusted Registries
Use admission controllers (OPA Gatekeeper, Kyverno) to enforce that only images from your private registry are deployed:
# kyverno-restrict-registries.yaml
# Only allow images from trusted registries
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: restrict-image-registries
spec:
  validationFailureAction: Enforce
  rules:
  - name: validate-registries
    match:
      any:
      - resources:
          kinds:
          - Pod
    validate:
      message: "Images must come from our private registry"
      pattern:
        spec:
          containers:
          - image: "myregistry.com/*"
          =(initContainers):           # Checked only if initContainers exist
          - image: "myregistry.com/*"
Image Signing with Sigstore/cosign
Sign your images in CI/CD and verify signatures before deployment to ensure images have not been tampered with and originate from your build pipeline:
# Sign an image in CI/CD
cosign sign --key cosign.key myregistry.com/app:v1.2.3
# Verify a signature
cosign verify --key cosign.pub myregistry.com/app:v1.2.3
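Signature verification can also be enforced at admission time, so unsigned images never reach a node. A sketch of a Kyverno verifyImages rule (the registry pattern is from the examples above; the public key body is a placeholder for your cosign.pub):

```yaml
# kyverno-verify-signatures.yaml
# Reject pods whose images lack a valid cosign signature
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: verify-image-signatures
spec:
  validationFailureAction: Enforce
  rules:
  - name: check-signature
    match:
      any:
      - resources:
          kinds:
          - Pod
    verifyImages:
    - imageReferences:
      - "myregistry.com/*"       # Only verify images from our registry
      attestors:
      - entries:
        - keys:
            publicKeys: |-       # Paste the contents of cosign.pub here
              -----BEGIN PUBLIC KEY-----
              ...
              -----END PUBLIC KEY-----
```

With this in place, cosign sign in CI and Kyverno at admission form a closed loop: only images your pipeline signed can run.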
6. Runtime Security
Runtime security tools monitor container behavior at execution time and detect anomalies:
Falco (CNCF incubating project) uses eBPF or kernel modules to monitor system calls. It detects behaviors like:
- A container spawns an unexpected shell (/bin/bash, /bin/sh)
- A process reads sensitive files (/etc/shadow, /etc/passwd)
- An outbound network connection is made to an unusual IP
- A container writes to a directory that should be read-only
Falco rules can trigger alerts (Slack, PagerDuty) or even kill the offending pod via a response engine.
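Falco detections are written as YAML rules. A minimal sketch of the shell-detection case, using Falco's built-in container and spawned_process macros (rule name and tags are illustrative):

```yaml
# falco-shell-rule.yaml
# Alert when an interactive shell starts inside a container
- rule: Shell Spawned in Container
  desc: Detect a shell process starting inside any container
  condition: spawned_process and container and proc.name in (bash, sh, zsh)
  output: "Shell spawned in container (user=%user.name container=%container.name command=%proc.cmdline)"
  priority: WARNING
  tags: [container, shell]
```

The output template fields (%user.name, %container.name, %proc.cmdline) are interpolated at event time, giving responders immediate context in the alert.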
7. CIS Kubernetes Benchmark
The CIS Kubernetes Benchmark is a set of security best practices published by the Center for Internet Security. It covers:
- Control plane configuration (API server flags, etcd encryption, audit logging)
- Worker node configuration (kubelet authentication, file permissions)
- Policies (RBAC, Pod Security, Network Policies)
- Secrets management
Use automated scanners like kube-bench to audit your cluster against the CIS benchmark:
# Run kube-bench to check CIS compliance
kubectl apply -f https://raw.githubusercontent.com/aquasecurity/kube-bench/main/job.yaml
kubectl logs job/kube-bench
8. Supply Chain Security
Securing the software supply chain means verifying the integrity and provenance of every component that runs in your cluster:
- SBOM (Software Bill of Materials): Generate SBOMs for your container images using Syft or Trivy. SBOMs document every package and dependency in an image.
- Provenance attestations: Use the SLSA (Supply-chain Levels for Software Artifacts) framework to generate build provenance attestations that prove where and how an image was built.
- Dependency scanning: Regularly scan application dependencies (npm, pip, go modules) for known vulnerabilities.
- Base image updates: Automate base image updates (Dependabot, Renovate) and rebuild downstream images when base images receive security patches.
9. Audit Logging
Kubernetes audit logging records all requests to the API server, providing a forensic trail for security investigations and compliance:
# audit-policy.yaml
# Define which API events to log and at what detail level
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
# Log all requests to Secrets at the Metadata level (do not log the secret body)
- level: Metadata
  resources:
  - group: ""
    resources: ["secrets"]
# Log all changes (create, update, patch, delete) at the RequestResponse level
- level: RequestResponse
  verbs: ["create", "update", "patch", "delete"]
# Log everything else at the Metadata level
- level: Metadata
Ship audit logs to a central log aggregation system (Elasticsearch, Splunk, CloudWatch) for analysis and alerting. Set retention policies that meet your compliance requirements.
10. Production Security Checklist
Use this checklist as a baseline for securing production clusters:
- Network Policies: Default deny in all application namespaces with explicit allow rules
- PSS: Restricted profile enforced on all application namespaces
- RBAC: No unnecessary cluster-admin bindings; ServiceAccount token automounting disabled where not needed
- Image scanning: All images scanned in CI/CD; critical CVEs block deployment
- Registry policy: Only images from approved registries can be deployed
- Secrets: Encrypted at rest (EncryptionConfiguration); not stored in Git as plaintext
- Audit logging: Enabled and shipped to central logging with alerting on suspicious activity
- etcd encryption: Enabled at rest for Secrets
- API server: Anonymous auth disabled, insecure port disabled, NodeRestriction admission enabled
- Runtime security: Falco or equivalent monitoring container behavior
- CIS benchmark: Regular automated audits with kube-bench
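The etcd-encryption item in the checklist is configured through an EncryptionConfiguration file passed to the API server via --encryption-provider-config. A minimal sketch (the key value is a placeholder; generate a real one, e.g. 32 random bytes base64-encoded, and never commit it):

```yaml
# encryption-config.yaml
# Encrypt Secrets at rest in etcd using AES-CBC
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
- resources:
  - secrets
  providers:
  - aescbc:
      keys:
      - name: key1
        secret: <base64-encoded-32-byte-key>   # Placeholder -- use your own key
  - identity: {}   # Fallback so pre-existing unencrypted data stays readable
```

Provider order matters: the first provider encrypts new writes, while later providers (here identity) are only tried when decrypting existing data.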
Common Pitfalls
- Assuming Network Policies are enforced: Network Policies only work if your CNI plugin supports them. If you use the default kubenet or a CNI that does not support policies, your policies are silently ignored. Verify with a test: create a deny-all policy and confirm that traffic is actually blocked.
- Forgetting that Network Policy selectors use AND within a rule but OR across rules: Within a single from entry, podSelector and namespaceSelector are ANDed. Multiple entries in the from array are ORed. This subtlety causes many misconfigurations.
- Not blocking egress: Most teams focus on ingress policies but forget egress. A compromised pod can exfiltrate data to any external IP without egress restrictions.
- Running as root by default: Many container images default to running as root. Enforce runAsNonRoot: true and build images with a non-root USER directive.
- Long-lived ServiceAccount tokens: In older Kubernetes versions, SA tokens were long-lived secrets. Ensure you are on 1.22+ and using bound, time-limited tokens.
- Over-permissive ClusterRoles: A ClusterRole with resources: ["*"] and verbs: ["*"] is equivalent to cluster-admin. Audit all ClusterRoles for excessive permissions.
Best Practices
- Start with default deny Network Policies: Apply default deny ingress and egress to every application namespace on day one, then add explicit allow rules as needed.
- Enforce PSS Restricted for all application namespaces: Use Baseline only for workloads that genuinely need relaxed restrictions (e.g., logging DaemonSets that need hostPath).
- Rotate secrets and certificates regularly: Use cert-manager for TLS certificates and External Secrets Operator with rotation policies for application secrets.
- Run security scanners continuously: Do not just scan once in CI -- continuously re-scan running images as new CVEs are published.
- Separate namespaces by trust boundary: Do not mix trusted and untrusted workloads in the same namespace. Use namespaces as security boundaries with appropriate Network Policies and RBAC.
- Use admission controllers for policy enforcement: OPA Gatekeeper or Kyverno can enforce policies that PSS cannot (e.g., required labels, resource limits, image registries).
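As an example of a policy that PSS cannot express, a Kyverno rule can require resource limits on every container. A minimal sketch (policy and rule names are illustrative):

```yaml
# kyverno-require-limits.yaml
# Reject pods whose containers omit CPU or memory limits
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-resource-limits
spec:
  validationFailureAction: Enforce
  rules:
  - name: check-limits
    match:
      any:
      - resources:
          kinds:
          - Pod
    validate:
      message: "All containers must set CPU and memory limits"
      pattern:
        spec:
          containers:
          - resources:
              limits:
                memory: "?*"   # Any non-empty value accepted
                cpu: "?*"
```

The "?*" wildcard in a Kyverno pattern matches any non-empty value, so the rule enforces that the fields exist without dictating specific limits.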
What's Next?
- Service Mesh: Service meshes provide mTLS encryption and L7 authorization policies that complement the L3/L4 segmentation of Network Policies.
- Chaos Engineering: Validate that your security policies (Network Policies, RBAC restrictions) actually work by testing them with controlled experiments.
- CRDs & Operators: Tools like cert-manager, Kyverno, and External Secrets Operator are all built on the CRD/Operator pattern -- understand how they work under the hood.