Multi-Tenancy: Sharing the Cluster

Key Takeaways
  • Cost Efficiency: Running one shared cluster is significantly cheaper than running separate clusters per team. Multi-tenancy enables this sharing while maintaining isolation, but the isolation model must match your trust boundaries.
  • Soft vs. Hard Multi-Tenancy: Soft multi-tenancy (namespace-based isolation) is suitable for trusted teams within an organization. Hard multi-tenancy (virtual clusters, separate control planes) is required for untrusted tenants like SaaS customers.
  • Three Pillars of Isolation: Namespaces provide logical separation of resources. ResourceQuotas enforce resource limits per tenant to prevent noisy-neighbor problems. NetworkPolicies control network traffic between tenant namespaces.
  • RBAC Per Namespace: Role-based access control scoped to namespaces ensures tenants can only see and manage their own resources. ClusterRoles grant cross-namespace access and must be carefully restricted.
  • Advanced Isolation: Hierarchical Namespaces Controller (HNC) enables namespace hierarchies with inherited policies. vcluster creates virtual Kubernetes clusters inside namespaces for strong isolation without the cost of separate physical clusters.
  • Cost Allocation: Label-based cost attribution with Kubecost/OpenCost enables showback or chargeback per tenant, creating financial accountability for resource consumption.

Running one large cluster is significantly cheaper and easier to operate than running 100 small ones. You avoid duplicating control plane costs, shared services (monitoring, logging, ingress controllers), and operational overhead. A single EKS control plane costs ~$73/month — running a separate cluster per team at 20 teams means $1,460/month just for control planes before any workloads run.
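As a quick sanity check on those numbers (a sketch; the $73/month figure is the EKS control-plane list price and will vary by provider and region):

```shell
# Control-plane spend: 20 per-team clusters vs. one shared cluster
TEAMS=20
PRICE=73                                      # USD/month per EKS control plane
echo "separate: \$$((TEAMS * PRICE))/month"   # separate: $1460/month
echo "shared:   \$${PRICE}/month"             # shared:   $73/month
```

The gap widens further once you count the duplicated monitoring, logging, and ingress stacks each cluster would need.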

Multi-tenancy is the practice of hosting multiple teams, users, or applications on a single shared Kubernetes cluster while maintaining appropriate isolation between them. The challenge is ensuring that one tenant's workloads cannot affect another tenant's security, performance, or cost.

1. Soft vs. Hard Multi-Tenancy

The right isolation model depends on the trust relationship between tenants.

Soft Multi-Tenancy (Trusted Tenants)

Used when tenants are teams within the same organization who share a trust boundary. You trust that developers are not malicious, but you want to prevent accidental interference.

[Diagram: two tenant namespaces on one cluster — team-a (quota: 4 vCPU / 8Gi) and team-b (quota: 2 vCPU / 4Gi) — separated by an active NetworkPolicy.]

Multi-tenancy in Kubernetes is achieved via logical isolation. By combining Namespaces, ResourceQuotas, and NetworkPolicies, you can securely host multiple teams on one physical cluster.

Soft multi-tenancy relies on:

  • Namespaces for logical separation
  • RBAC for access control
  • ResourceQuotas for resource limits
  • NetworkPolicies for network isolation
  • Pod Security Admission for container security

This model is appropriate for:

  • Engineering teams within a company sharing a development or staging cluster
  • Different application environments (frontend, backend, data) within the same organization
  • Internal platforms where all developers are trusted employees

Hard Multi-Tenancy (Untrusted Tenants)

Used when tenants are external customers or untrusted parties who may attempt to break isolation intentionally. Namespace-level isolation is insufficient because:

  • Kernel exploits: A container escape affects the underlying node, which is shared with other tenants.
  • Resource exhaustion: Even with ResourceQuotas, a tenant can exhaust node-level resources (disk I/O, network bandwidth, kernel connections) that quotas do not cover.
  • Metadata service access: On cloud providers, pod access to the instance metadata service can expose node-level IAM credentials.

Hard multi-tenancy requires one or more of:

  • Virtual clusters (vcluster) — isolated control planes within a shared cluster
  • Separate physical clusters per tenant
  • Kata Containers / gVisor — VM-level or user-space kernel isolation for pods
  • Node isolation — dedicated node pools per tenant with strict scheduling
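To make the sandboxing option concrete, here is a sketch of how gVisor is wired in via a RuntimeClass. This assumes the runsc handler is already installed and registered with the container runtime on (ideally dedicated) nodes; the pod and namespace names are illustrative:

```yaml
# RuntimeClass exposing the gVisor runtime
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: gvisor
handler: runsc            # must match the handler name configured in containerd
---
# A tenant pod opting into the sandboxed runtime
apiVersion: v1
kind: Pod
metadata:
  name: sandboxed-workload
  namespace: team-untrusted
spec:
  runtimeClassName: gvisor   # pod runs under a user-space kernel, not the host kernel
  containers:
  - name: app
    image: nginx:1.27
```

Pods without `runtimeClassName` continue to use the default runtime, so sandboxing can be enforced selectively per tenant (e.g., via an admission policy).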

2. Namespace-Based Isolation

Namespaces are the fundamental building block of multi-tenancy. Each tenant gets one or more namespaces, and access is restricted via RBAC, ResourceQuotas, and NetworkPolicies.

Creating a Tenant Namespace

apiVersion: v1
kind: Namespace
metadata:
  name: team-payments
  labels:
    tenant: payments
    cost-center: cc-1234
    environment: production
    # Pod Security Admission
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/enforce-version: latest
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/warn-version: latest

3. ResourceQuotas Per Tenant

ResourceQuota objects enforce hard limits on the total resources a namespace can consume. This prevents the "noisy neighbor" problem where one team's workloads starve others of resources.

apiVersion: v1
kind: ResourceQuota
metadata:
  name: tenant-quota
  namespace: team-payments
spec:
  hard:
    # Compute limits
    requests.cpu: "16"              # 16 CPU cores total
    requests.memory: 32Gi           # 32 GiB memory total
    limits.cpu: "32"
    limits.memory: 64Gi
    # Object count limits
    pods: "50"                      # max 50 pods
    services: "20"                  # max 20 services
    services.loadbalancers: "2"     # max 2 load balancers (cost control)
    persistentvolumeclaims: "10"    # max 10 PVCs
    secrets: "50"                   # max 50 secrets
    configmaps: "50"                # max 50 configmaps
    count/deployments.apps: "20"    # max 20 deployments

LimitRange for Per-Pod Defaults

Without a LimitRange, developers can create pods without resource requests, which bypasses ResourceQuota enforcement and leads to unpredictable scheduling. LimitRange sets defaults and constraints:

apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: team-payments
spec:
  limits:
  - type: Container
    default:              # default limits if developer omits them
      cpu: 500m
      memory: 512Mi
    defaultRequest:       # default requests if developer omits them
      cpu: 100m
      memory: 128Mi
    max:                  # maximum any single container can request
      cpu: 4
      memory: 8Gi
    min:                  # minimum (prevents trivially small pods)
      cpu: 10m
      memory: 16Mi
  - type: PersistentVolumeClaim
    max:
      storage: 100Gi      # max PVC size
    min:
      storage: 1Gi
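With this LimitRange in place, a workload that omits its resources block is still admitted, with the defaults injected at admission time. A sketch (deployment name and image are illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
  namespace: team-payments
spec:
  replicas: 2
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
      - name: api
        image: ghcr.io/example/api:1.0   # illustrative image
        # No resources block: the LimitRange injects
        # requests of 100m/128Mi and limits of 500m/512Mi,
        # so the pod still counts against the ResourceQuota.
```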

4. NetworkPolicies for Network Isolation

By default, all pods in a Kubernetes cluster can communicate with all other pods, regardless of namespace. NetworkPolicies restrict this to enforce network isolation between tenants.

Default Deny All Traffic

Start with a deny-all policy in each tenant namespace, then explicitly allow what is needed:

# Deny all ingress and egress by default
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: team-payments
spec:
  podSelector: {}          # applies to all pods in the namespace
  policyTypes:
  - Ingress
  - Egress
---
# Allow pods within the same namespace to talk to each other
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace
  namespace: team-payments
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector: {}      # any pod in this namespace
---
# Allow egress to DNS (required for service discovery)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
  namespace: team-payments
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
---
# Allow egress to the internet (for external API calls)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-external-egress
  namespace: team-payments
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - ipBlock:
        cidr: 0.0.0.0/0
        except:
        - 10.0.0.0/8       # block access to other cluster pods
        - 172.16.0.0/12
        - 192.168.0.0/16

Allow Access to Shared Services

Tenants often need access to shared services (monitoring, ingress, databases). Explicitly allow this:

# Allow ingress from the ingress controller namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress-controller
  namespace: team-payments
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: ingress-nginx

5. RBAC Per Namespace

RBAC scoped to namespaces ensures tenants can only see and manage their own resources.

# Role for tenant developers (scoped to their namespace)
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: tenant-developer
  namespace: team-payments
rules:
# Full access to workload resources
- apiGroups: ["apps"]
  resources: ["deployments", "statefulsets", "daemonsets", "replicasets"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
# Full access to core resources
- apiGroups: [""]
  resources: ["pods", "services", "configmaps", "persistentvolumeclaims"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
# Read-only access to events for debugging
- apiGroups: [""]
  resources: ["events"]
  verbs: ["get", "list", "watch"]
# Read-only access to pod logs
- apiGroups: [""]
  resources: ["pods/log"]
  verbs: ["get", "list"]
# Exec into pods for debugging (remove for stricter security)
- apiGroups: [""]
  resources: ["pods/exec"]
  verbs: ["create"]
# Access to secrets (limit to specific secrets if possible)
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
# HPA management
- apiGroups: ["autoscaling"]
  resources: ["horizontalpodautoscalers"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
# Bind the Role to the team's group
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: payments-team-developers
  namespace: team-payments
subjects:
- kind: Group
  name: "payments-developers"   # from your identity provider (OIDC)
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: tenant-developer
  apiGroup: rbac.authorization.k8s.io

What to Restrict

  • ClusterRoles: Do not grant ClusterRoles to tenants unless absolutely necessary. ClusterRoles provide cross-namespace access.
  • Secrets in other namespaces: Tenants should never be able to read secrets outside their namespace.
  • Node access: Tenants should not be able to get nodes, create daemonsets across the cluster, or access the Kubernetes API at the cluster level.
  • Namespace creation: Only platform admins should create namespaces. Tenants operate within assigned namespaces.
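A useful complement to the developer Role above is a read-only binding for stakeholders, built on the cluster's built-in view ClusterRole. Because it is granted through a RoleBinding, it still applies only within the tenant namespace (the group name is illustrative):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: payments-team-viewers
  namespace: team-payments
subjects:
- kind: Group
  name: "payments-stakeholders"   # illustrative group from your IdP
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: view          # built-in read-only role, namespace-scoped via the RoleBinding
  apiGroup: rbac.authorization.k8s.io
```

Note that a ClusterRole referenced from a RoleBinding does not grant cluster-wide access; only ClusterRoleBindings do.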

6. Hierarchical Namespaces Controller (HNC)

For organizations with complex team structures, the Hierarchical Namespaces Controller enables parent-child namespace relationships. Policy objects defined in a parent namespace (RoleBindings by default; other types such as NetworkPolicies can be configured for propagation) are inherited by child namespaces automatically.

# Create a subnamespace under team-payments
apiVersion: hnc.x-k8s.io/v1alpha2
kind: SubnamespaceAnchor
metadata:
  name: payments-staging
  namespace: team-payments    # parent namespace

This creates payments-staging as a child of team-payments. RoleBindings from team-payments — plus any other object types configured for propagation, such as NetworkPolicies — are automatically copied into payments-staging.

Use cases for HNC:

  • A team needs separate namespaces for staging and production but wants consistent policies.
  • An organization has departments with sub-teams that need inherited but customizable policies.
  • Self-service namespace creation within a defined hierarchy (team leads can create subnamespaces without cluster admin involvement).

7. vcluster: Virtual Clusters

For stronger isolation without the cost of separate physical clusters, vcluster creates virtual Kubernetes clusters inside regular namespaces. Each vcluster has its own:

  • API server
  • Controller manager
  • Its own data store (embedded etcd or a lightweight embedded database)
  • A separate set of namespaces, RBAC objects, and CRDs

But it shares the host cluster's:

  • Nodes and compute capacity
  • Networking (CNI)
  • Storage (CSI)

# Install the vcluster CLI
curl -L -o vcluster "https://github.com/loft-sh/vcluster/releases/latest/download/vcluster-linux-amd64"
chmod +x vcluster && sudo mv vcluster /usr/local/bin/

# Create a virtual cluster for a tenant
vcluster create tenant-acme \
  --namespace vcluster-acme \
  --set vcluster.resources.limits.cpu=4 \
  --set vcluster.resources.limits.memory=8Gi

# Connect to the virtual cluster
vcluster connect tenant-acme --namespace vcluster-acme
# Now kubectl commands target the virtual cluster
kubectl get namespaces   # shows the vcluster's namespaces, not the host's

When to use vcluster:

  • SaaS platforms where each customer needs their own Kubernetes experience.
  • CI/CD pipelines that need an isolated cluster per build but cannot afford real clusters.
  • Development environments that mirror production cluster architecture.
  • Tenants that need to install their own CRDs, admission controllers, or operators.

8. Cost Allocation Per Tenant

Multi-tenancy requires clear cost attribution. Without it, there is no accountability for resource consumption.

Implementation Pattern

  1. Require tenant and cost-center labels on all namespaces and workloads (enforce with Kyverno or Gatekeeper).
  2. Deploy Kubecost or OpenCost for cost visibility.
  3. Generate monthly reports per tenant showing compute, storage, and network costs.
  4. Implement showback (show teams their costs) as a first step, then graduate to chargeback (actually billing teams).

# Kyverno policy to require tenant labels
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-tenant-labels
spec:
  validationFailureAction: Enforce
  rules:
  - name: require-labels-on-namespace
    match:
      any:
      - resources:
          kinds:
          - Namespace
    exclude:
      any:
      - resources:
          namespaces:
          - kube-system
          - kube-public
          - kube-node-lease
    validate:
      message: "Namespaces must have 'tenant' and 'cost-center' labels."
      pattern:
        metadata:
          labels:
            tenant: "?*"
            cost-center: "?*"

9. Shared Services Patterns

Multi-tenant clusters typically have shared services that all tenants consume:

Shared Service              | Namespace        | Access Pattern
Ingress controller          | ingress-nginx    | Tenants create Ingress/HTTPRoute in their namespace
Monitoring (Prometheus)     | monitoring       | Tenants get Grafana dashboards scoped to their namespace
Logging (FluentBit + Loki)  | logging          | Tenants query logs filtered by their namespace
Cert-Manager                | cert-manager     | Tenants create Certificate resources in their namespace
External Secrets Operator   | external-secrets | Tenants create ExternalSecret in their namespace

Shared services are managed by the platform team and exposed to tenants via well-defined interfaces (CRDs, services). Tenants should not need access to the shared service namespaces.
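For example, letting a shared Prometheus scrape tenant workloads requires only a narrow ingress rule inside the tenant namespace, without granting the tenant any access to the monitoring namespace in return. A sketch (the metrics port is illustrative and depends on your workloads):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-prometheus-scrape
  namespace: team-payments
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: monitoring
    ports:
    - protocol: TCP
      port: 8080          # illustrative metrics port
```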

10. When to Use Multi-Cluster Instead

Multi-tenancy within a single cluster has limits. Consider separate clusters when:

  • Compliance requirements mandate physical separation (e.g., PCI DSS, HIPAA with strict tenant isolation).
  • Blast radius: A cluster-level failure (etcd corruption, control plane outage, CVE in a shared component) affects all tenants. Separate clusters contain the blast radius.
  • Different Kubernetes versions: Tenants need different Kubernetes versions or different cluster configurations (e.g., different CNIs or admission controllers).
  • Regulatory boundaries: Data residency laws require workloads in specific regions or jurisdictions.
  • Trust model: Tenants are truly adversarial (external customers) and you cannot risk kernel-level escapes.

For multi-cluster management, use tools like Cluster API (cluster lifecycle management), Rancher (multi-cluster management), or ArgoCD with ApplicationSets (GitOps across clusters).

Managing at Scale: The App-of-Apps Pattern

How do you manage 50+ tenant namespaces? You don't apply YAML manually.

  • GitOps (ArgoCD/Flux): Define each tenant as an "Application" in a Git repo.
  • App-of-Apps: A master Application that generates other Applications.
    • Repo Structure: /tenants/team-a, /tenants/team-b
    • Automation: Adding a new folder /tenants/team-c automatically provisions the namespace, RBAC, Quotas, and NetworkPolicies.
  • Self-Service: Tenants submit a Pull Request to the platform repo to request a new namespace.
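The folder-per-tenant pattern maps directly onto an ArgoCD ApplicationSet with a Git directory generator: each directory under /tenants/ becomes one Application. A sketch (repository URL and paths are illustrative):

```yaml
# One Application per tenant folder in the platform repo
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: tenants
  namespace: argocd
spec:
  generators:
  - git:
      repoURL: https://github.com/example/platform-repo.git   # illustrative
      revision: main
      directories:
      - path: tenants/*          # e.g. tenants/team-a, tenants/team-b
  template:
    metadata:
      name: "tenant-{{path.basename}}"
    spec:
      project: default
      source:
        repoURL: https://github.com/example/platform-repo.git
        targetRevision: main
        path: "{{path}}"
      destination:
        server: https://kubernetes.default.svc
        namespace: "{{path.basename}}"
      syncPolicy:
        automated:
          prune: true            # removing the folder deprovisions the tenant
          selfHeal: true
```

Merging a PR that adds tenants/team-c creates the Application, which in turn applies the namespace, quota, RBAC, and NetworkPolicy manifests in that folder.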

Complete Tenant Onboarding Example

Here is a comprehensive YAML that sets up a new tenant with all isolation controls:

# 1. Namespace with PSA and labels
apiVersion: v1
kind: Namespace
metadata:
  name: team-checkout
  labels:
    tenant: checkout
    cost-center: cc-5678
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/warn: restricted
---
# 2. ResourceQuota
apiVersion: v1
kind: ResourceQuota
metadata:
  name: quota
  namespace: team-checkout
spec:
  hard:
    requests.cpu: "8"
    requests.memory: 16Gi
    limits.cpu: "16"
    limits.memory: 32Gi
    pods: "30"
    services.loadbalancers: "1"
---
# 3. LimitRange
apiVersion: v1
kind: LimitRange
metadata:
  name: limits
  namespace: team-checkout
spec:
  limits:
  - type: Container
    default:
      cpu: 200m
      memory: 256Mi
    defaultRequest:
      cpu: 100m
      memory: 128Mi
---
# 4. Default deny NetworkPolicy
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
  namespace: team-checkout
spec:
  podSelector: {}
  policyTypes: [Ingress, Egress]
---
# 5. Allow same-namespace and DNS
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-internal-and-dns
  namespace: team-checkout
spec:
  podSelector: {}
  policyTypes: [Ingress, Egress]
  ingress:
  - from:
    - podSelector: {}
  egress:
  - to:
    - podSelector: {}
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
    ports:
    - protocol: UDP
      port: 53
---
# 6. RBAC
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: checkout-developers
  namespace: team-checkout
subjects:
- kind: Group
  name: "checkout-team"
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: edit   # built-in role: broad read/write access, scoped to this namespace
  apiGroup: rbac.authorization.k8s.io

Common Pitfalls

  1. Relying solely on namespaces for isolation: Namespaces provide logical separation but not security isolation. Without NetworkPolicies and RBAC, pods in different namespaces can freely communicate and developers can access resources in other namespaces.
  2. Forgetting DNS-based network policy: Even with ingress deny rules, pods can resolve service names in other namespaces via DNS. Use egress policies to limit which namespaces a tenant can connect to.
  3. Not setting LimitRange defaults: Without defaults, developers can create pods without resource requests, bypassing ResourceQuota enforcement entirely.
  4. Over-allocating quotas: If the sum of all tenant quotas exceeds cluster capacity, tenants can collectively exhaust cluster resources when several of them approach their limits at the same time. Set quotas to sum to 80-90% of total cluster capacity.
  5. Ignoring shared resource contention: ResourceQuotas do not cover node-level resources like disk I/O, kernel connections, or PID limits. These can still cause noisy-neighbor issues.
  6. Granting cluster-scoped RBAC to tenants: ClusterRoles and ClusterRoleBindings bypass namespace isolation. Audit all cluster-level bindings regularly.
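Pitfall 4 is easy to catch with a small script. A sketch with illustrative numbers — in practice, read the figures from kubectl get resourcequota --all-namespaces and the nodes' allocatable CPU:

```shell
# Sum of tenant CPU quotas vs. cluster capacity (illustrative values, in cores)
QUOTAS="16 8 12 20"   # requests.cpu from each tenant's ResourceQuota
CAPACITY=64           # total allocatable CPU across all nodes

TOTAL=0
for q in $QUOTAS; do TOTAL=$((TOTAL + q)); done
PCT=$((100 * TOTAL / CAPACITY))

echo "quota total: ${TOTAL} cores (${PCT}% of capacity)"   # quota total: 56 cores (87% of capacity)
if [ "$PCT" -le 90 ]; then
  echo "OK: within the 80-90% guideline"
else
  echo "WARN: quotas over-allocated relative to capacity"
fi
```

Running this as a CI check on the platform repo catches over-allocation before a new tenant quota is merged.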

Best Practices

  1. Start with soft multi-tenancy for internal teams. Graduate to vcluster or multi-cluster only when the trust model or compliance requirements demand it.
  2. Automate tenant onboarding — create a Helm chart, Kustomize overlay, or custom controller that provisions all namespace resources (Quota, LimitRange, NetworkPolicy, RBAC, PSA labels) from a single tenant definition.
  3. Default-deny NetworkPolicies in every tenant namespace. Explicitly allow only necessary traffic.
  4. Enforce PSA Baseline minimum in all tenant namespaces. Use Restricted for sensitive workloads.
  5. Require cost-center labels for financial accountability.
  6. Use Hierarchical Namespaces for large organizations with sub-teams to reduce operational toil.
  7. Monitor cross-namespace traffic to detect unexpected communication between tenants.
  8. Review RBAC bindings quarterly — permissions tend to accumulate over time.
  9. Set PodDisruptionBudgets per tenant to prevent cluster operations (node drains, autoscaling) from disproportionately affecting a single tenant.
  10. Document the shared responsibility model — clearly define what the platform team manages vs. what the tenant team is responsible for.
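The PodDisruptionBudget from practice 9 is only a few lines per workload. A sketch (the workload label is illustrative):

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-pdb
  namespace: team-payments
spec:
  minAvailable: 1       # keep at least one replica up during voluntary disruptions
  selector:
    matchLabels:
      app: api          # illustrative workload label
```

With this in place, node drains and cluster-autoscaler scale-downs wait rather than evicting a tenant's last healthy replica.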

What's Next?