
eBPF: High-Performance Networking

Key Takeaways
  • Technology: eBPF (extended Berkeley Packet Filter) runs sandboxed, JIT-compiled programs directly in the Linux kernel without modifying kernel source or loading kernel modules.
  • Performance: Replaces the sequential O(n) rule traversal of iptables with O(1) hash-map lookups, eliminating the scalability bottleneck that appears at thousands of Services.
  • Cilium Architecture: Cilium is the leading eBPF-based CNI, composed of an agent (per-node datapath programming), an operator (cluster-wide coordination), and Hubble (flow-level observability).
  • Use Cases: Identity-based network policies, transparent WireGuard encryption, kube-proxy replacement, bandwidth management, host firewall, and sidecar-free service mesh.
  • Adoption Signal: If your cluster has more than 500 Services, or you need L7 visibility without a sidecar mesh, eBPF networking delivers measurable gains in latency, throughput, and operational simplicity.
[Diagram: a packet's path from the network card to a Pod. iptables must scan a list of rules sequentially, so as the cluster grows, per-connection latency increases linearly (O(n)).]

1. What Is eBPF and Why It Matters for Kubernetes

eBPF (extended Berkeley Packet Filter) is a technology that allows you to run small, verified programs directly inside the Linux kernel. These programs attach to kernel hooks -- network socket operations, tracepoints, kprobes, and XDP (eXpress Data Path) ingress points -- and execute at near-native speed without the overhead of context-switching to user space.

In the context of Kubernetes, eBPF transforms the kernel into a programmable data plane. Every packet arriving at a node can be inspected, rewritten, load-balanced, or dropped by an eBPF program that has been JIT-compiled to machine code. The kernel verifier guarantees that these programs terminate, do not access invalid memory, and cannot crash the kernel, making them safe for production use.

Traditional Kubernetes networking relies on iptables or ipvs rules managed by kube-proxy. These approaches worked well for small clusters, but they struggle at scale. Every Service with a ClusterIP generates multiple iptables rules, and the kernel must traverse those rules sequentially for every connection. eBPF eliminates this bottleneck entirely by performing lookups against in-kernel hash maps in constant time.
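
The scaling difference can be modeled in plain Python. This is a userspace sketch, not kernel code: the rule list stands in for an iptables chain, and the dict stands in for an in-kernel eBPF hash map; all addresses and names are synthetic.

```python
# Userspace model of the two lookup strategies (illustrative only;
# the real datapaths live in the kernel).

# iptables-style: an ordered rule list, scanned until a match -- O(n).
rules = [(f"10.96.{i // 256}.{i % 256}", 80, f"backend-{i}")
         for i in range(5000)]

def iptables_lookup(vip, port):
    for rule_vip, rule_port, backend in rules:   # sequential scan
        if rule_vip == vip and rule_port == port:
            return backend
    return None

# eBPF-style: a hash map keyed on (VIP, port) -- O(1) average case.
service_map = {(vip, port): backend for vip, port, backend in rules}

def ebpf_lookup(vip, port):
    return service_map.get((vip, port))

# Both resolve the same backend; only the lookup cost differs.
assert iptables_lookup("10.96.19.135", 80) == ebpf_lookup("10.96.19.135", 80)
```

With 5,000 entries, the scan touches thousands of rules in the worst case while the map lookup touches one bucket; that gap is the whole argument of the next section.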

The eBPF Execution Model

When an eBPF program is loaded, the kernel verifier performs static analysis to ensure safety. The program is then JIT-compiled to native instructions for the host architecture (x86_64, ARM64). eBPF programs communicate with user space through maps -- key-value data structures stored in kernel memory. Cilium uses these maps to store Service endpoints, network policies, connection tracking entries, and identity-to-IP mappings.

User Space         Kernel Space
+-----------+      +------------------+
| Cilium    | ---> | eBPF Program     |
| Agent     |      | (JIT-compiled)   |
|           | <--- |                  |
| (updates  |      | eBPF Maps        |
|  maps)    |      | (hash tables,    |
+-----------+      |  LPM tries, etc.)|
                   +------------------+
                            |
                     +------v------+
                     | XDP / TC    |
                     | Hook Points |
                     +-------------+
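
The LPM (longest-prefix-match) tries in the diagram hold CIDR-keyed entries such as policy rules. A simplified Python model of the lookup semantics (a plain loop stands in for the in-kernel trie; the CIDRs and verdicts are invented):

```python
import ipaddress

# CIDR -> verdict entries, as an LPM-style policy map might hold them.
policies = {
    "10.0.0.0/8": "allow",
    "10.1.0.0/16": "deny",
    "0.0.0.0/0": "default-deny",
}

def lpm_lookup(ip):
    """Return the verdict of the most specific (longest) matching prefix."""
    best = None
    for cidr, verdict in policies.items():
        net = ipaddress.ip_network(cidr)
        if ipaddress.ip_address(ip) in net:
            if best is None or net.prefixlen > best[0].prefixlen:
                best = (net, verdict)
    return best[1] if best else None

assert lpm_lookup("10.1.2.3") == "deny"    # the /16 wins over the /8
assert lpm_lookup("10.2.2.3") == "allow"   # only the /8 and /0 match
```

An actual LPM trie answers this in O(prefix length) instead of scanning every entry, which is why it is the map type of choice for CIDR rules.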

2. eBPF vs. iptables: A Direct Comparison

Understanding the practical differences between eBPF and iptables is critical for making an informed architectural decision.

iptables works by maintaining chains of rules in the netfilter framework. When a packet arrives, it walks through the chain sequentially until a match is found. In a cluster with 5,000 Services (each with 3 endpoints), kube-proxy generates roughly 25,000+ iptables rules. Every new connection must traverse a significant portion of this list.

eBPF replaces this with hash-map lookups. A Service VIP and port are used as a key to look up the backend endpoints in O(1) time. Connection tracking is handled by eBPF maps rather than conntrack tables, and NAT is performed inline within the eBPF program.
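
A hedged sketch of how per-flow state keeps this cheap: the first packet of a connection selects a backend and records it in a CT map keyed on the 5-tuple, so every later packet of that flow is a single map hit. The hash-based backend choice below is illustrative; the real datapath may use random or Maglev selection.

```python
import hashlib

# Service map and connection-tracking map, modeled as dicts.
service_backends = {("10.96.0.10", 443): ["10.0.1.5", "10.0.2.7", "10.0.3.9"]}
ct_map = {}  # (src_ip, src_port, vip, vport, proto) -> chosen backend

def handle_packet(src_ip, src_port, vip, vport, proto="TCP"):
    key = (src_ip, src_port, vip, vport, proto)
    if key in ct_map:                      # established flow: one CT hit
        return ct_map[key]
    backends = service_backends[(vip, vport)]
    # Hash the 5-tuple so the same flow always picks the same backend.
    h = int(hashlib.sha256(repr(key).encode()).hexdigest(), 16)
    backend = backends[h % len(backends)]
    ct_map[key] = backend                  # record for the rest of the flow
    return backend

first = handle_packet("10.0.9.1", 40000, "10.96.0.10", 443)
again = handle_packet("10.0.9.1", 40000, "10.96.0.10", 443)
assert first == again  # flow stickiness via the CT map
```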

Aspect                   iptables                              eBPF (Cilium)
Rule Complexity          O(n) sequential traversal             O(1) hash-map lookup
Rule Update              Full chain replacement (atomic swap)  Individual map entry update
Connection Tracking      Kernel conntrack module               eBPF-native CT maps
Latency at 10K Services  ~3.5ms per connection setup           ~0.2ms per connection setup
CPU Overhead             Scales linearly with rule count       Constant regardless of scale
Visibility               iptables -L (hard to parse)           Hubble flow logs with identity context

3. Cilium Architecture

Cilium is the most widely adopted eBPF-based CNI and networking platform for Kubernetes. It consists of three core components.

Cilium Agent (DaemonSet)

The Cilium agent runs as a DaemonSet on every node. It is responsible for:

  • Compiling and loading eBPF programs into the kernel for each endpoint (pod) on the node.
  • Managing eBPF maps that store Service backends, network policy rules, identity mappings, and connection tracking state.
  • Allocating IP addresses to pods via its built-in IPAM or by delegating to the cloud provider's IPAM (e.g., AWS ENI mode).
  • Implementing kube-proxy replacement by watching the Kubernetes API for Service and Endpoint changes and programming the datapath accordingly.
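
The last point can be sketched as an event loop that translates Service/Endpoint changes into single map updates. This is a hypothetical simplification of the agent's behavior; the event shapes below are invented for illustration.

```python
# Minimal model of the agent's datapath-programming loop:
# Kubernetes watch events in, eBPF-map-style updates out.
service_map = {}  # (cluster_ip, port) -> list of backend (ip, port)

def handle_event(event):
    kind, svc = event["type"], event["service"]
    key = (svc["clusterIP"], svc["port"])
    if kind in ("ADDED", "MODIFIED"):
        service_map[key] = svc["endpoints"]   # one map entry touched
    elif kind == "DELETED":
        service_map.pop(key, None)

handle_event({"type": "ADDED", "service": {
    "clusterIP": "10.96.0.20", "port": 80,
    "endpoints": [("10.0.1.2", 8080), ("10.0.1.3", 8080)]}})
handle_event({"type": "MODIFIED", "service": {
    "clusterIP": "10.96.0.20", "port": 80,
    "endpoints": [("10.0.1.3", 8080)]}})

assert service_map[("10.96.0.20", 80)] == [("10.0.1.3", 8080)]
```

The key property is that each event touches only its own entry; nothing is rebuilt, which is the contrast with kube-proxy's chain regeneration drawn in section 6.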

Cilium Operator (Deployment)

The operator handles cluster-wide responsibilities that should not be duplicated across every node:

  • IPAM coordination in multi-pool or cloud-provider modes (allocating CIDRs from the cloud API).
  • CiliumIdentity garbage collection -- cleaning up stale security identities that no longer map to running pods.
  • CiliumNode synchronization across the cluster for node-to-node tunneling or direct routing configuration.
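
Identity garbage collection amounts to a set difference between allocated identities and those still referenced by running pods. A toy sketch (identity numbers and label strings are invented):

```python
# Toy model of CiliumIdentity garbage collection: drop identities
# that no running pod references anymore.
allocated_identities = {101: "app=frontend", 102: "app=api", 103: "app=batch"}
running_pod_labels = {"app=frontend", "app=api"}  # the app=batch pods are gone

def gc_identities(allocated, in_use):
    """Remove and return identities whose labels no pod carries."""
    stale = [i for i, labels in allocated.items() if labels not in in_use]
    for i in stale:
        del allocated[i]
    return stale

removed = gc_identities(allocated_identities, running_pod_labels)
assert removed == [103]
assert 103 not in allocated_identities
```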

Hubble (Observability)

Hubble provides deep, identity-aware flow visibility by reading events from the eBPF datapath.

  • Hubble Relay aggregates flow data from all nodes and exposes a cluster-wide gRPC API.
  • Hubble UI renders a service dependency map and allows filtering flows by namespace, label, verdict (forwarded/dropped), and HTTP method.
  • Hubble CLI (hubble observe) enables real-time flow inspection from the terminal.

4. Cilium Features in Depth

Identity-Based Network Policies

Traditional NetworkPolicy objects match on IP addresses or CIDR blocks. Cilium assigns a security identity to each set of pods with the same labels. Network policies are enforced based on these identities, not IPs. When a pod is rescheduled to a different node and gets a new IP, the identity stays the same, and the policy is enforced without any rule updates.

# Cilium Network Policy: Allow frontend to talk to api on port 443
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-frontend-to-api
  namespace: production
spec:
  endpointSelector:
    matchLabels:
      app: api              # Apply to pods labeled app=api
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: frontend   # Only allow traffic from frontend pods
      toPorts:
        - ports:
            - port: "443"
              protocol: TCP
          rules:
            http:           # L7 policy: only allow GET and POST
              - method: "GET"
              - method: "POST"
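
The identity model behind such a policy can be expressed in a few lines of Python: identity derives from labels alone, so a rescheduled pod with a new IP keeps its identity and the policy needs no update. The numbering scheme below is invented for illustration.

```python
# Labels -> stable numeric identity; IPs play no role in matching.
_identities = {}

def identity_for(labels):
    key = tuple(sorted(labels.items()))
    if key not in _identities:
        _identities[key] = 100 + len(_identities)  # invented numbering
    return _identities[key]

pod_before = {"ip": "10.0.1.5", "labels": {"app": "frontend"}}
pod_after  = {"ip": "10.0.7.9", "labels": {"app": "frontend"}}  # rescheduled

# Same labels -> same identity, despite the new IP.
assert identity_for(pod_before["labels"]) == identity_for(pod_after["labels"])
```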

Transparent Encryption

Cilium supports WireGuard and IPsec for transparent pod-to-pod encryption. When enabled, all traffic between nodes is encrypted without any application changes. WireGuard is generally preferred for its simplicity and performance (it operates at the kernel level with minimal overhead).

# Enable WireGuard encryption in Cilium Helm values
encryption:
  enabled: true
  type: wireguard
# WireGuard key rotation is handled automatically by Cilium

Bandwidth Management

Cilium uses eBPF-based Earliest Departure Time (EDT) rate limiting instead of the older tc-tbf (token bucket filter) approach. This provides more accurate rate limiting with lower CPU overhead and better burst handling.

# Annotate a pod to limit egress bandwidth
apiVersion: v1
kind: Pod
metadata:
  name: bandwidth-limited-app
  annotations:
    kubernetes.io/egress-bandwidth: "10M" # 10 Mbit/s egress limit
spec:
  containers:
    - name: app
      image: nginx:1.27
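
The EDT idea itself is simple enough to sketch: each packet is stamped with an earliest departure time of previous_departure + packet_size / rate, and the NIC releases it no sooner. A simplified userspace model (integer microseconds, so the arithmetic is exact):

```python
def pace(packet_sizes, rate_bps):
    """Assign each packet an earliest departure timestamp in microseconds.

    packet_sizes: packet lengths in bytes; rate_bps: limit in bits/s.
    """
    t_us, schedule = 0, []
    for size in packet_sizes:
        schedule.append(t_us)
        # The next packet may not leave before this one has "drained".
        t_us += size * 8 * 1_000_000 // rate_bps
    return schedule

# Four 1250-byte packets (10,000 bits each) at 10 Mbit/s leave 1 ms apart.
assert pace([1250, 1250, 1250, 1250], 10_000_000) == [0, 1000, 2000, 3000]
```

Unlike a token bucket, there is no shared bucket state to contend on: the timestamp rides with the packet, which is where the lower CPU overhead comes from.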

Host Firewall

Cilium can enforce network policies on the host itself (not just pods), providing a unified security model that protects the node OS, the kubelet, and host-networked pods.

5. Replacing kube-proxy with Cilium

One of Cilium's most impactful features is its ability to fully replace kube-proxy. This eliminates thousands of iptables rules and centralizes Service load-balancing in the eBPF datapath.

# Helm values for Cilium with kube-proxy replacement
kubeProxyReplacement: true      # Full kube-proxy replacement
k8sServiceHost: "api.k8s.local" # API server endpoint
k8sServicePort: "6443"

loadBalancer:
  algorithm: maglev             # Consistent hashing for backends
  mode: dsr                     # Direct Server Return for lower latency

bpf:
  masquerade: true              # BPF-based masquerading (replaces iptables SNAT)
  tproxy: true                  # Transparent proxy for L7 policies
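
The maglev setting above refers to Maglev consistent hashing: each backend gets a pseudo-random preference order over a fixed-size lookup table, and backends claim slots round-robin. A compact Python sketch of the table construction (toy table size; real implementations use a large prime such as 65537, and the hash choices here are illustrative):

```python
import hashlib

def _h(s, seed):
    """Deterministic integer hash (stand-in for Maglev's two hash functions)."""
    return int(hashlib.sha256(f"{seed}:{s}".encode()).hexdigest(), 16)

def maglev_table(backends, size=13):  # size must be prime
    offsets = {b: _h(b, 1) % size for b in backends}
    skips = {b: _h(b, 2) % (size - 1) + 1 for b in backends}
    table, nexts, filled = [None] * size, {b: 0 for b in backends}, 0
    while filled < size:
        for b in backends:           # round-robin: each backend claims its
            while True:              # next preferred, still-empty slot
                j = nexts[b]
                slot = (offsets[b] + j * skips[b]) % size
                nexts[b] += 1
                if table[slot] is None:
                    table[slot] = b
                    filled += 1
                    break
            if filled == size:
                break
    return table

table = maglev_table(["10.0.1.5", "10.0.2.7", "10.0.3.9"])
assert None not in table  # every slot owned, shares differ by at most one
```

A packet's flow hash modulo the table size then picks a backend; because each backend's preference sequence is stable, adding or removing one backend disturbs only a small fraction of the slots.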

To deploy a new cluster without kube-proxy entirely:

# Initialize the cluster without kube-proxy
kubeadm init --skip-phases=addon/kube-proxy

# Install Cilium with kube-proxy replacement enabled
helm install cilium cilium/cilium \
  --namespace kube-system \
  --set kubeProxyReplacement=true \
  --set k8sServiceHost="10.0.0.10" \
  --set k8sServicePort="6443"

6. Performance Benchmarks

Real-world benchmarks consistently demonstrate the advantages of eBPF-based networking. The following numbers are representative of commonly published results (actual numbers vary by hardware, kernel version, and workload):

  • Throughput (TCP_STREAM, pod-to-pod, same node): Cilium eBPF achieves within 2-3% of host networking throughput, whereas iptables-based CNIs typically show 8-15% overhead.
  • Latency (TCP_RR, pod-to-pod, cross-node): At 10,000 Services, Cilium maintains sub-0.3ms P99 connection setup time. iptables-based setups can exceed 3-5ms.
  • Connection rate (TCP_CRR): Cilium handles 30-40% more new connections per second than iptables at scale due to the elimination of conntrack contention.
  • Rule update time: Adding a new Service backend in Cilium updates a single map entry (~microseconds). kube-proxy must regenerate and atomically swap an entire iptables chain (~hundreds of milliseconds at scale).
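
The last bullet can be demonstrated in miniature: the kube-proxy model regenerates the entire rule set on any change, while the map model touches one entry. An illustrative Python comparison (synthetic service names and IPs):

```python
# Miniature model of the two update strategies.
services = {f"svc-{i}": [f"10.0.{i}.1"] for i in range(1000)}

# iptables-style: regenerate every rule, then swap in the whole chain.
def rebuild_all(services):
    return [(svc, backend) for svc, eps in sorted(services.items())
            for backend in eps]

# eBPF-style: update only the affected map entry.
service_map = {svc: list(eps) for svc, eps in services.items()}

def update_one(svc, endpoints):
    service_map[svc] = endpoints

rules_before = rebuild_all(services)
services["svc-42"].append("10.0.42.2")            # one backend added...
rules_after = rebuild_all(services)               # ...O(total rules) work
update_one("svc-42", ["10.0.42.1", "10.0.42.2"])  # ...vs. O(1) work

assert len(rules_after) == len(rules_before) + 1
assert service_map["svc-42"] == ["10.0.42.1", "10.0.42.2"]
```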

7. Hubble for Observability

Hubble gives platform engineers the kind of visibility that previously required dedicated network monitoring appliances or intrusive sidecar proxies.

# Watch all dropped packets in the production namespace
hubble observe --namespace production --verdict DROPPED

# Observe HTTP flows to a specific service
hubble observe --namespace production \
  --to-label app=api-gateway \
  --protocol HTTP \
  -o json

# Get a flow summary for the last hour
hubble observe --since 1h --namespace production -o compact

Hubble metrics can be exported to Prometheus, enabling dashboards and alerting on network-layer signals such as DNS error rates, TCP retransmissions, and HTTP error codes -- without any application instrumentation.

8. Common Pitfalls

  1. Kernel version requirements. Cilium requires Linux kernel 4.19.57+ at minimum, but many features (WireGuard, BBR congestion control, EDT bandwidth management) require 5.10+. Always check the Cilium compatibility matrix before upgrading.

  2. Running Cilium alongside kube-proxy. If you enable kubeProxyReplacement: true, you must ensure kube-proxy is actually removed. Running both leads to conflicting conntrack entries and intermittent connection failures.

  3. MTU misconfiguration with tunneling. When using VXLAN or Geneve encapsulation mode, Cilium reduces the pod MTU by 50-100 bytes. If your underlying network MTU is already non-standard, you may need to explicitly set bpf.mtu in the Helm values.

  4. Forgetting to enable Hubble. Hubble is disabled by default. Without it, troubleshooting Cilium network policies is significantly harder. Always enable Hubble Relay and the UI in non-production environments.

  5. CiliumIdentity leaks. In large clusters with high pod churn, the Cilium operator may fall behind on garbage-collecting stale identities. Monitor the cilium_identity_count metric and ensure the operator has sufficient resources.

  6. Misunderstanding L7 policy overhead. Enabling HTTP-aware policies requires Cilium to proxy the connection through an Envoy instance (per-node, not per-pod). This adds latency and CPU overhead. Use L7 policies selectively, not as a blanket default.

9. What's Next?

  • CNI Deep Dive: Understand how Cilium fits into the broader CNI landscape, including overlay vs. native routing modes. See CNI Deep Dive.
  • Network Policies: Learn the standard Kubernetes NetworkPolicy resource before exploring CiliumNetworkPolicy extensions.
  • Service Mesh: Explore Cilium's sidecar-free service mesh capabilities, which use eBPF for mTLS, retries, and L7 traffic management without injecting Envoy sidecars.
  • Security: Combine eBPF networking with Tetragon for runtime security enforcement -- detecting and blocking suspicious system calls at the kernel level.
  • Hands-on: Deploy Cilium in a Kind or k3d cluster using the Cilium CLI (cilium install) and explore Hubble flows to build intuition before rolling out to production.