
Cluster Autoscaler (Scale the Nodes)

Key Takeaways for AI & Readers
  • Infrastructure Scaling: Unlike HPA, which adds or removes Pods, Cluster Autoscaler adds or removes nodes (typically cloud VMs) when Pods no longer fit on the existing capacity.
  • Trigger Mechanism: It monitors for Pending Pods with "insufficient resources" and requests new nodes from the cloud provider.
  • Cost Optimization: The autoscaler automatically "drains" and removes underutilized nodes to minimize cloud infrastructure costs.
  • Next-Gen Scaling: Karpenter offers a faster, more flexible alternative by provisioning nodes sized to the pending Pods instead of relying on rigid node groups.

We've covered HPA (scaling the number of Pods) and VPA (scaling each Pod's resource requests). But what happens if you need 100 Pods and your nodes are full? You need to scale the infrastructure itself.

1. Automatic Node Provisioning

[Interactive demo: a "Desired Pods" control, shown at 3, with three Pods running on Node 1. Raising the count beyond what the existing nodes can hold adds a new node.]
Cluster Autoscaler detects when Pods cannot be scheduled due to lack of resources and automatically provisions a new Node from the cloud provider (for example, AWS or GCP).

2. How it works

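The Cluster Autoscaler (CA) watches for Pods that the scheduler has marked "Unschedulable". The decision is based on the Pods' resource requests, not on the live CPU or memory usage of the nodes.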

  1. Detect: A Pod is Pending because no node has enough unreserved CPU or memory for its requests.
  2. Request: CA asks the cloud provider to grow a node group (for example, an AWS Auto Scaling group or a GCP managed instance group).
  3. Create: A new VM is started and joins the cluster as a Node.
  4. Schedule: The pod is finally placed on the new node.
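
The four steps above can be sketched as a small simulation. This is a toy model, not the real Cluster Autoscaler code: the `Pod`, `Node`, `schedule`, and `autoscale` names, the first-fit placement, and the default node size are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    name: str
    cpu_millicores: int   # CPU request
    memory_mib: int       # memory request


@dataclass
class Node:
    name: str
    cpu_millicores: int   # allocatable CPU
    memory_mib: int       # allocatable memory
    pods: list = field(default_factory=list)

    def fits(self, pod):
        """True if the pod's requests fit into the unreserved capacity."""
        used_cpu = sum(p.cpu_millicores for p in self.pods)
        used_mem = sum(p.memory_mib for p in self.pods)
        return (pod.cpu_millicores <= self.cpu_millicores - used_cpu
                and pod.memory_mib <= self.memory_mib - used_mem)


def schedule(pods, nodes):
    """First-fit placement; returns the pods that stay Pending."""
    pending = []
    for pod in pods:
        target = next((n for n in nodes if n.fits(pod)), None)
        if target is None:
            pending.append(pod)
        else:
            target.pods.append(pod)
    return pending


def autoscale(pods, nodes, node_cpu=2000, node_mem=4096, max_nodes=10):
    """Add nodes (hypothetical fixed size) until nothing is Pending."""
    pending = schedule(pods, nodes)                      # 1. Detect
    while pending and len(nodes) < max_nodes:
        # 2. Request + 3. Create: stand-in for growing the node group
        nodes.append(Node(f"node-{len(nodes) + 1}", node_cpu, node_mem))
        still_pending = schedule(pending, nodes)         # 4. Schedule
        if len(still_pending) == len(pending):
            # Nothing fit even on an empty node, so more nodes will not help.
            nodes.pop()
            break
        pending = still_pending
    return pending
```

For example, starting with one 2000m / 4096 MiB node and five Pods that each request 500m / 512 MiB, the first four Pods fill the node's CPU and the fifth triggers one new node. A Pod that requests more than an empty node offers stays Pending, which is why the sketch checks fit before keeping the new node.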

3. Scaling Down

CA also works in reverse. If the Pods on a Node request less than a utilization threshold of its capacity (50% by default) for a sustained period (10 minutes by default), and those Pods can be rescheduled elsewhere, CA drains the node and terminates the VM to save money.
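
The scale-down check can be sketched roughly as follows. The two constants mirror the defaults of the `--scale-down-utilization-threshold` and `--scale-down-unneeded-time` flags; everything else (`NodeState`, `utilization`, `can_remove`, and the aggregate "spare capacity" test) is a simplifying assumption. The real autoscaler simulates rescheduling each Pod individually and honors constraints such as PodDisruptionBudgets.

```python
from dataclasses import dataclass

SCALE_DOWN_UTILIZATION_THRESHOLD = 0.5   # fraction of allocatable that is requested
SCALE_DOWN_UNNEEDED_SECONDS = 10 * 60    # how long a node must stay underused


@dataclass
class NodeState:
    name: str
    allocatable_cpu: int    # millicores
    requested_cpu: int      # sum of Pod CPU requests on this node
    allocatable_mem: int    # MiB
    requested_mem: int      # sum of Pod memory requests on this node
    unneeded_seconds: int   # time spent below the threshold so far


def utilization(node):
    """Requested / allocatable, taking the larger of the CPU and memory ratios."""
    return max(node.requested_cpu / node.allocatable_cpu,
               node.requested_mem / node.allocatable_mem)


def can_remove(node, spare_cpu_elsewhere, spare_mem_elsewhere):
    """Simplified check: underused long enough, and its Pods fit on other nodes."""
    underused = utilization(node) < SCALE_DOWN_UTILIZATION_THRESHOLD
    long_enough = node.unneeded_seconds >= SCALE_DOWN_UNNEEDED_SECONDS
    pods_fit_elsewhere = (node.requested_cpu <= spare_cpu_elsewhere
                          and node.requested_mem <= spare_mem_elsewhere)
    return underused and long_enough and pods_fit_elsewhere
```

Note that utilization here is computed from requests, so a node whose Pods are idle but reserve most of its capacity is not a removal candidate.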

4. Modern Alternative: Karpenter

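Karpenter, an open-source node provisioner originally built by AWS, is an alternative that typically adds capacity faster than the standard Cluster Autoscaler. Instead of resizing predefined "Groups" of nodes, it looks at the pending Pods, picks an instance type that fits them, and launches it directly through the cloud provider's API.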
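
The idea of sizing a node to the pending Pods can be illustrated with a short sketch. This is not Karpenter's actual algorithm, which performs more elaborate bin-packing across many instance types. The `CATALOG` entries, their prices, and the `pick_instance` helper are hypothetical, and the sketch assumes all pending Pods should land on a single new node.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceType:
    name: str
    cpu_millicores: int
    memory_mib: int
    hourly_price: float


# Hypothetical catalog; real instance names and prices differ per provider.
CATALOG = [
    InstanceType("small", 2000, 4096, 0.05),
    InstanceType("medium", 4000, 8192, 0.10),
    InstanceType("large", 8000, 16384, 0.20),
]


def pick_instance(pending, catalog=CATALOG):
    """Return the cheapest type that holds all pending (cpu, mem) requests, or None."""
    need_cpu = sum(cpu for cpu, _ in pending)
    need_mem = sum(mem for _, mem in pending)
    candidates = [t for t in catalog
                  if t.cpu_millicores >= need_cpu and t.memory_mib >= need_mem]
    return min(candidates, key=lambda t: t.hourly_price, default=None)
```

With three pending Pods that each request 1000m / 2048 MiB, the total is 3000m / 6144 MiB, so the sketch selects the hypothetical "medium" type rather than adding whatever size a fixed node group would dictate.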