Serverless Kubernetes (Knative)
- Scale-to-Zero Capabilities: Knative extends Kubernetes to support serverless workloads, automatically scaling applications down to zero replicas when idle and back up on demand.
- Cost Optimization: By terminating idle pods, Knative frees cluster resources that would otherwise sit unused, making it ideal for infrequent tasks or APIs with low traffic.
- Serving and Eventing: Knative provides separate components for managing request-driven services (Serving) and event-driven architectures (Eventing).
- Cold Start Trade-off: The primary disadvantage of serverless on Kubernetes is the "cold start," where the first request after scaling to zero incurs higher latency while a new pod spins up.
Standard Kubernetes pods are meant to be always-on. But what about a task that only runs once a day? Or an API that only gets 5 requests an hour? Paying for idle pods is a waste of money.
Knative extends Kubernetes to provide a serverless experience, including the ability to scale down to zero.
1. Scale-to-Zero
When no requests are coming in, Knative terminates all pods. When a new request arrives, the Activator buffers the request, spins up a pod, and then forwards the traffic.
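To make this concrete, here is a minimal sketch of a Knative Service. The name `hello` and the image URL are placeholders; scale-to-zero is the default behavior, so no extra configuration is needed to enable it (the annotation shown only caps the maximum replica count).

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello                    # hypothetical service name
spec:
  template:
    metadata:
      annotations:
        # Cap the autoscaler; scaling to zero is already the default
        autoscaling.knative.dev/max-scale: "5"
    spec:
      containers:
        - image: gcr.io/example/hello:latest   # placeholder image
          ports:
            - containerPort: 8080
```

Deploying this with `kubectl apply` gives you an HTTP-addressable service that disappears entirely when idle and comes back on the next request.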
2. Key Components
Knative Serving
Handles the scale-to-zero logic, traffic splitting (e.g., canary releases), and point-in-time snapshots of your code and configuration (Revisions).
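As a sketch of traffic splitting, the `traffic` block below sends 90% of requests to a pinned Revision and canaries 10% onto the latest one. The revision name `hello-00001` stands in for whatever name Knative actually generated for your deployment.

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello
spec:
  template:
    spec:
      containers:
        - image: gcr.io/example/hello:v2   # placeholder image for the new version
  traffic:
    # 90% of traffic stays on the old, pinned Revision
    - revisionName: hello-00001            # placeholder revision name
      percent: 90
    # 10% canaries onto the latest Revision
    - latestRevision: true
      percent: 10
```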
Knative Eventing
Allows you to build "Event-Driven" architectures. For example: "When a file is uploaded to S3, trigger this Pod to process it."
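A sketch of that pattern in Knative Eventing: a Trigger subscribes a Service to events of a given CloudEvents type on a Broker. The event type `com.example.file.uploaded` and the service name `file-processor` are made up for illustration; in practice a source (e.g., an S3 bridge) would emit the events onto the Broker.

```yaml
apiVersion: eventing.knative.dev/v1
kind: Trigger
metadata:
  name: file-uploaded-trigger          # hypothetical trigger name
spec:
  broker: default
  filter:
    attributes:
      # Only deliver events with this CloudEvents type (made-up example)
      type: com.example.file.uploaded
  subscriber:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: file-processor             # hypothetical Knative Service
```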
3. Cold Starts
The main trade-off of serverless is the cold start: because the first request has to wait for the image to be pulled and the pod to start, the initial latency can be high (1-5 seconds).
Pro Tip: Use very small images (such as statically compiled Go or Rust binaries) to minimize cold start times.
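If a service cannot tolerate cold starts at all, another option is to keep one replica warm with the `min-scale` annotation, at the cost of giving up scale-to-zero for that service. A minimal sketch, with a hypothetical service name and placeholder image:

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: latency-sensitive-api          # hypothetical service name
spec:
  template:
    metadata:
      annotations:
        # Always keep one pod warm; trades idle cost for zero cold starts
        autoscaling.knative.dev/min-scale: "1"
    spec:
      containers:
        - image: gcr.io/example/api:latest   # placeholder image
```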