OpenTelemetry: Modern Observability
- Open Standards: OTel provides a vendor-neutral standard for telemetry data (Metrics, Logs, Traces), preventing vendor lock-in. You instrument once and export to any backend.
- OTel Collector: Acts as a central proxy to receive, process, and export telemetry data to multiple backends simultaneously. It consists of receivers (ingest data), processors (transform data), and exporters (send data to backends).
- Context Propagation: Automatically links distributed requests across microservices using Trace IDs via W3C Trace Context headers, enabling deep debugging of complex multi-service request flows.
- Auto-Instrumentation: For supported languages (Java, Python, Node.js, .NET, Go), OTel can automatically instrument applications without code changes using the OpenTelemetry Operator and language-specific agents.
- Deployment Patterns: The Collector can be deployed as a DaemonSet (one per node, for collecting node-level telemetry) or as a Deployment (centralized gateway, for processing and exporting).
- Three Pillars: OTel unifies metrics (numeric measurements over time), logs (discrete events), and traces (request flows across services) under a single framework.
Historically, every monitoring tool (Datadog, New Relic, Jaeger, Splunk) had its own proprietary agent and SDK. If you wanted to switch tools, you had to re-instrument your entire application. This vendor lock-in was expensive and frustrating.
OpenTelemetry (OTel) solves this by providing a single, open standard for collecting Metrics, Logs, and Traces. It is a CNCF project (the second most active after Kubernetes itself) and has become the de facto standard for cloud-native observability.
The Three Pillars of Observability
Before diving into OTel's architecture, let's clarify what it collects:
- Traces: A trace represents a single request's journey through your distributed system. It consists of spans, each representing a unit of work (an HTTP handler, a database query, a cache lookup). Traces answer: "Why was this request slow?"
- Metrics: Numeric measurements collected over time (request count, latency histograms, CPU usage, queue depth). Metrics answer: "Is the system healthy right now?"
- Logs: Discrete events with timestamps and context. Logs answer: "What exactly happened at this moment?"
OTel's power lies in correlating all three. A trace can link to the metrics that spiked during a slow request and the log entries generated along the way.
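As a plain-Python sketch of that correlation (the dicts and field names here are illustrative stand-ins, not the real OTel data model — exemplars are how metrics actually carry trace links):

```python
import json
import secrets
import time

# A shared trace_id is what lets a backend correlate all three signals.
trace_id = secrets.token_hex(16)  # 32 hex chars, like a real OTel trace ID

span = {"trace_id": trace_id, "name": "GET /api/orders", "duration_ms": 912}
log_entry = {"trace_id": trace_id, "level": "warn", "msg": "slow query", "ts": time.time()}
metric = {"name": "http.server.duration", "value_ms": 912, "exemplar_trace_id": trace_id}

# All three signals point back at the same request.
assert span["trace_id"] == log_entry["trace_id"] == metric["exemplar_trace_id"]
print(json.dumps(span))
```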
Unified Tracing
Visualize a single request flowing through multiple microservices.
OTel vs. Vendor-Specific SDKs
| Feature | Vendor SDK (e.g., Datadog) | OpenTelemetry |
|---|---|---|
| Lock-in | Tied to one vendor | Export to any backend |
| Switching cost | Re-instrument everything | Change a YAML config |
| Community | Single company | CNCF, 1000+ contributors |
| Coverage | Vendor-specific features | Broad but may lag vendor features |
| Cost | Vendor pricing | Open source (backends may cost) |
OTel does not replace your observability backend. You still need Prometheus for metrics storage, Jaeger or Tempo for trace storage, and Loki or Elasticsearch for logs. OTel replaces the collection and instrumentation layer, decoupling your application from your observability vendor.
OTel Collector Architecture
The Collector is the core of any OTel deployment. It is a vendor-agnostic proxy that receives telemetry data, processes it, and exports it to one or more backends.
┌─────────────────────────────────────────────┐
│ OTel Collector │
│ │
│ Receivers ──▶ Processors ──▶ Exporters │
│ (OTLP, (batch, (Prometheus, │
│ Prometheus, filter, Jaeger, │
│ Jaeger, transform, OTLP, │
│ Zipkin) memory_limiter) Loki) │
└─────────────────────────────────────────────┘
Receivers
Receivers ingest data into the Collector. Common receivers:
- otlp: The native OTel protocol (gRPC and HTTP). This is the recommended receiver for OTel SDKs.
- prometheus: Scrapes Prometheus-format metrics endpoints.
- jaeger: Accepts Jaeger-format traces.
- zipkin: Accepts Zipkin-format traces.
- filelog: Reads log files from disk.
- k8s_events: Collects Kubernetes events as logs.
Processors
Processors transform data between receiving and exporting:
- batch: Batches telemetry data to reduce export overhead. Almost always used.
- memory_limiter: Prevents the Collector from consuming too much memory.
- filter: Drops unwanted telemetry (e.g., health check traces).
- attributes: Adds, removes, or modifies attributes on spans and metrics.
- k8sattributes: Enriches telemetry with Kubernetes metadata (pod name, namespace, node).
- transform: Applies complex transformations using the OTel Transformation Language (OTTL).
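As a hedged sketch of what an OTTL transform looks like (the attribute name and pattern are illustrative assumptions, not a recommendation for every deployment):

```yaml
processors:
  transform:
    trace_statements:
      - context: span
        statements:
          # Redact a sensitive query parameter from span URLs before export
          - replace_pattern(attributes["http.url"], "user_id=[^&]+", "user_id=***")
```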
Exporters
Exporters send data to backends:
- otlp/otlphttp: Sends to any OTLP-compatible backend (Grafana Tempo, Honeycomb, Datadog).
- prometheus: Exposes a Prometheus scrape endpoint.
- prometheusremotewrite: Pushes metrics to Prometheus-compatible storage (Thanos, Cortex, Mimir).
- jaeger: Historically sent traces to Jaeger. It has been removed from recent Collector releases; Jaeger now accepts OTLP directly, so use the otlp exporter instead.
- loki: Sends logs to Grafana Loki.
- debug: Prints telemetry to stdout (for development only).
Collector Configuration YAML
Here is a production-ready Collector configuration:
# otel-collector-config.yaml
receivers:
otlp:
protocols:
grpc:
endpoint: 0.0.0.0:4317 # gRPC receiver
http:
endpoint: 0.0.0.0:4318 # HTTP receiver
prometheus:
config:
scrape_configs:
- job_name: 'kubernetes-pods'
kubernetes_sd_configs:
- role: pod
processors:
batch:
timeout: 5s # Flush every 5 seconds
send_batch_size: 1024 # Or when batch reaches 1024 items
memory_limiter:
check_interval: 1s
limit_mib: 512 # Hard limit
spike_limit_mib: 128 # Spike allowance
k8sattributes:
extract:
metadata:
- k8s.namespace.name
- k8s.pod.name
- k8s.deployment.name
- k8s.node.name
filter:
traces:
span:
- 'attributes["http.route"] == "/healthz"' # Drop health checks
exporters:
otlphttp:
endpoint: http://tempo.monitoring:4318 # Traces to Grafana Tempo
prometheusremotewrite:
endpoint: http://mimir.monitoring/api/v1/push # Metrics to Mimir
loki:
endpoint: http://loki.monitoring:3100/loki/api/v1/push # Logs to Loki
service:
pipelines:
traces:
receivers: [otlp]
processors: [memory_limiter, k8sattributes, filter, batch]
exporters: [otlphttp]
metrics:
receivers: [otlp, prometheus]
processors: [memory_limiter, k8sattributes, batch]
exporters: [prometheusremotewrite]
logs:
receivers: [otlp]
processors: [memory_limiter, k8sattributes, batch]
exporters: [loki]
The service.pipelines section wires everything together. Each pipeline defines which receivers feed into which processors and exporters. You can have separate pipelines for traces, metrics, and logs with different processing chains.
Deploying the Collector: DaemonSet vs. Deployment
DaemonSet (Agent Mode)
Runs one Collector pod per node. Ideal for collecting node-level telemetry (host metrics, container logs) and receiving data from applications on the same node:
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: otel-collector-agent
namespace: monitoring
spec:
selector:
matchLabels:
app: otel-collector-agent
template:
metadata:
labels:
app: otel-collector-agent
spec:
containers:
- name: collector
image: otel/opentelemetry-collector-contrib:0.96.0
args: ["--config=/conf/otel-collector-config.yaml"]
ports:
- containerPort: 4317 # gRPC OTLP
- containerPort: 4318 # HTTP OTLP
resources:
requests:
cpu: 100m
memory: 256Mi
limits:
memory: 512Mi
volumeMounts:
- name: config
mountPath: /conf
volumes:
- name: config
configMap:
name: otel-collector-config
Deployment (Gateway Mode)
Runs a centralized set of Collector replicas. Applications send telemetry to this gateway service. Ideal for centralized processing, tail-based sampling, and multi-cluster aggregation:
apiVersion: apps/v1
kind: Deployment
metadata:
name: otel-collector-gateway
namespace: monitoring
spec:
replicas: 3 # Scale based on telemetry volume
selector:
matchLabels:
app: otel-collector-gateway
template:
metadata:
labels:
app: otel-collector-gateway
spec:
containers:
- name: collector
image: otel/opentelemetry-collector-contrib:0.96.0
args: ["--config=/conf/otel-collector-config.yaml"]
resources:
requests:
cpu: 500m
memory: 1Gi
limits:
memory: 2Gi
A common pattern is to use both: DaemonSet agents collect and forward to a Deployment gateway that performs heavier processing before exporting to backends.
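A minimal sketch of the agent side of that pattern (the gateway Service name `otel-collector-gateway` is an assumption for illustration):

```yaml
# Agent (DaemonSet) config fragment: do light processing, then forward
# everything over OTLP to the in-cluster gateway.
exporters:
  otlp:
    endpoint: otel-collector-gateway.monitoring:4317
    tls:
      insecure: true  # in-cluster traffic; enable TLS for production
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [otlp]
```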
Auto-Instrumentation
The OTel Operator for Kubernetes can automatically inject instrumentation into your applications without code changes. It works by modifying pod specs at admission time (via a mutating webhook) to inject language-specific agents:
apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
name: auto-instrumentation
namespace: my-app
spec:
exporter:
endpoint: http://otel-collector.monitoring:4317
propagators:
- tracecontext # W3C Trace Context
- baggage
java:
image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-java:latest
python:
image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-python:latest
nodejs:
image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-nodejs:latest
Then annotate your pods to opt in:
apiVersion: apps/v1
kind: Deployment
metadata:
name: my-java-app
spec:
template:
metadata:
annotations:
instrumentation.opentelemetry.io/inject-java: "true" # Auto-instrument Java
spec:
containers:
- name: app
image: my-java-app:latest
Supported annotations:
- instrumentation.opentelemetry.io/inject-java: "true"
- instrumentation.opentelemetry.io/inject-python: "true"
- instrumentation.opentelemetry.io/inject-nodejs: "true"
- instrumentation.opentelemetry.io/inject-dotnet: "true"
- instrumentation.opentelemetry.io/inject-go: "true" (requires eBPF, experimental)
Context Propagation
Context propagation is what makes distributed tracing work. When Service A calls Service B, it includes trace context in HTTP headers. The W3C Trace Context standard uses two headers:
traceparent: 00-<trace-id>-<span-id>-<trace-flags>
tracestate: <vendor-specific-data>
Example:
traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
OTel SDKs automatically propagate these headers. When you see a trace in Jaeger or Tempo, every span from every service is linked because they all share the same trace-id.
For non-HTTP communication (gRPC, messaging queues), OTel propagates context through metadata fields or message headers specific to each protocol.
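To make the header format above concrete, here is a small stdlib-only sketch of building and parsing a traceparent value (OTel SDKs do this for you; `make_traceparent` and `parse_traceparent` are hypothetical helper names):

```python
import re
import secrets

def make_traceparent() -> str:
    """Build a W3C traceparent header: version-traceid-spanid-flags."""
    trace_id = secrets.token_hex(16)  # 16 random bytes -> 32 hex chars
    span_id = secrets.token_hex(8)    # 8 random bytes -> 16 hex chars
    return f"00-{trace_id}-{span_id}-01"  # flags 01 = sampled

def parse_traceparent(header: str) -> dict:
    """Split a traceparent header into its four fields."""
    m = re.fullmatch(
        r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})", header
    )
    if not m:
        raise ValueError(f"malformed traceparent: {header!r}")
    version, trace_id, span_id, flags = m.groups()
    return {
        "version": version,
        "trace_id": trace_id,
        "span_id": span_id,
        "sampled": bool(int(flags, 16) & 1),  # bit 0 is the sampled flag
    }

parsed = parse_traceparent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
print(parsed["trace_id"])  # 4bf92f3577b34da6a3ce929d0e0e4736
```

Every service that receives this header reuses the trace_id and generates a new span_id for its own work, which is exactly why all spans in Jaeger or Tempo line up under one trace.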
Integrating with the Observability Stack
Prometheus (Metrics)
The Collector can both scrape Prometheus endpoints and export metrics in Prometheus format:
# In the Collector config
receivers:
prometheus:
config:
scrape_configs:
- job_name: 'app-metrics'
scrape_interval: 15s
static_configs:
- targets: ['my-app:8080']
Jaeger (Traces)
Jaeger natively accepts OTLP, and the dedicated jaeger exporter has been removed from recent Collector releases, so export traces with the otlp exporter:
exporters:
  otlp/jaeger:
    endpoint: jaeger-collector.monitoring:4317
    tls:
      insecure: true # For dev; use TLS in production
Grafana (Visualization)
Grafana connects to the backends (Prometheus, Tempo, Loki) to visualize metrics, traces, and logs in unified dashboards. The correlation between the three signals is what makes OTel powerful: click on a spike in a metrics graph, jump to the traces that caused it, and view the logs from those specific requests.
Common Pitfalls
- Forgetting the memory_limiter processor: Without it, the Collector can consume unbounded memory during traffic spikes and get OOMKilled. Always include it as the first processor in every pipeline.
- Not using the batch processor: Sending telemetry one item at a time is extremely inefficient. The batch processor reduces network overhead by 10-100x.
- Exporting everything: High-cardinality attributes (user IDs, request IDs) in metrics cause cardinality explosions. Use the filter and attributes processors to remove unnecessary data before exporting.
- Using the debug exporter in production: It writes to stdout and can overwhelm logging infrastructure. Use it only during development.
- Misconfiguring the OTLP endpoint in applications: Applications must point to the Collector's receiver port (4317 for gRPC, 4318 for HTTP), not the exporter's backend port.
- Ignoring Collector resource limits: The Collector processes all telemetry data. Size it appropriately based on your cluster's telemetry volume. A good starting point is 256Mi-512Mi per agent, 1-2Gi per gateway.
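The cardinality-explosion pitfall is easy to quantify: each unique label combination on a metric becomes its own time series, so cardinalities multiply. A small illustration (the label sets are invented for the example):

```python
from itertools import product

# Hypothetical label sets for an http_requests_total-style counter.
routes = ["/api/orders", "/api/users", "/healthz"]   # 3 values
status_codes = ["200", "404", "500"]                 # 3 values
user_ids = [f"user-{i}" for i in range(10_000)]      # high cardinality!

# Series count is the product of each label's cardinality.
safe_series = len(list(product(routes, status_codes)))
exploded_series = len(list(product(routes, status_codes, user_ids)))
print(safe_series, exploded_series)  # 9 vs 90000
```

Adding one user-ID label turned 9 series into 90,000 — which is why IDs belong on spans and logs, not on metric labels.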
Best Practices
- Start with auto-instrumentation: Get basic traces working without code changes, then add manual spans for business-critical operations.
- Use the Collector, do not export directly: Applications should send telemetry to a local Collector (DaemonSet agent), not directly to backends. The Collector provides buffering, retry, and processing.
- Enrich with Kubernetes metadata: The k8sattributes processor automatically adds pod name, namespace, deployment, and node to every telemetry item, making filtering and correlation much easier.
- Sample intelligently: In high-traffic systems, collect 100% of error traces and a sample of successful ones. Use tail-based sampling in the Collector gateway for best results.
- Version your Collector config: Store Collector configuration in a ConfigMap managed by GitOps. Changes to telemetry pipelines should go through the same review process as application code.
- Monitor the Collector itself: The Collector exposes its own metrics at /metrics. Alert on dropped spans, memory usage, and export failures.
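A sketch of what such an alert might look like as a Prometheus rule (the exact self-telemetry metric name can vary by Collector version and scrape configuration, so verify it against your /metrics output):

```yaml
groups:
  - name: otel-collector
    rules:
      - alert: OtelCollectorExportFailures
        # Collector self-telemetry counter for spans that failed to export
        expr: rate(otelcol_exporter_send_failed_spans_total[5m]) > 0
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "OTel Collector is failing to export spans"
```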
What's Next?
- Observability: Learn the broader observability story including Prometheus metrics, log aggregation, and alerting.
- Troubleshooting: Use traces and logs to debug application issues systematically.
- Health Checks: Configure probes that integrate with your observability pipeline for automated health monitoring.
- Dev Experience: Set up development workflows that include local observability for faster debugging cycles.