title: Sampling
impact: HIGH
tags: sampling, head-sampling, tail-sampling, probabilistic-sampler, load-balancing

Sampling

Sampling reduces the volume of trace data by retaining only a subset of traces. It is a practical necessity in large distributed systems, but every sampling strategy involves trade-offs.

Application SDKs should use the AlwaysOn sampler (the default) and export every span. All sampling decisions belong in the Collector or observability pipeline, where they can be changed centrally without redeploying applications. See SDK sampling guidance for why.

Decision table

| Requirement | Approach | Collector component |
| --- | --- | --- |
| Reduce trace volume by a fixed percentage | Head sampling | probabilisticsamplerprocessor |
| Keep all errors, slow traces, and a baseline of normal traces | Tail sampling | tailsamplingprocessor |
| Route all spans of a trace to the same Collector instance | Trace-aware load balancing | loadbalancingexporter |
| Generate accurate RED metrics from all traces before sampling | Metric materialization | signaltometricsconnector |

Head sampling

Head sampling discards traces by a fixed probability, evaluated deterministically from the trace ID. Every Collector instance in the fleet reaches the same decision for the same trace without coordination.

Use the probabilisticsamplerprocessor in the traces pipeline:

processors:
  probabilistic_sampler:
    sampling_percentage: 10

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, probabilistic_sampler]
      exporters: [otlp]

Head sampling is simple to deploy and scales horizontally, but it has a fundamental limitation: the decision is made before the outcome of the request is known. At low sampling rates, errors, latency spikes, and rarely exercised code paths are likely to be missed.

Use head sampling only when tail sampling is too complex for the environment, or as a coarse volume reduction before tail sampling.
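
When head sampling is used as a pre-reduction step in front of tail sampling (described in the next section), the probabilistic sampler runs on the agent tier before the load-balancing exporter. Because its decision is derived deterministically from the trace ID, every agent reaches the same verdict for a given trace, so the traces that survive still arrive at the gateway complete. A minimal sketch; the 50 percent rate is an illustrative value:

processors:
  probabilistic_sampler:
    sampling_percentage: 50 # illustrative: halve volume before the tail-sampling gateway

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, probabilistic_sampler]
      exporters: [loadbalancing]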

Tail sampling

Tail sampling collects all spans for a trace, then decides whether the trace is worth keeping based on its full content. It can retain errors, slow requests, unusual traces, and a configurable baseline of normal traffic.

Architecture

Tail sampling requires all spans of a trace to converge on a single Collector instance, which calls for a two-tier architecture:

Applications → Agent (DaemonSet) → Gateway (sampling layer) → Backend
                     ↑                        ↑
              loadbalancingexporter    tailsamplingprocessor
              routes by trace ID      buffers and decides
  1. Agent tier (DaemonSet): receives spans from local applications via the OTLP receiver. Uses the loadbalancingexporter to consistently hash the trace ID and route all spans of a trace to the same gateway instance.
  2. Gateway tier (Deployment): runs the tailsamplingprocessor, which buffers spans, evaluates sampling policies, and either forwards or drops the trace.

Agent configuration

The agent uses the loadbalancingexporter instead of a standard OTLP exporter for traces. Metrics and logs bypass the sampling layer and are exported directly.

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  memory_limiter:
    check_interval: 1s
    limit_mib: 410
    spike_limit_mib: 100

exporters:
  loadbalancing:
    protocol:
      otlp:
        tls:
          insecure: true # Validate if this is needed, and avoid if possible
    resolver:
      dns:
        hostname: otel-collector-gateway-headless.otel.svc.cluster.local
        port: "4317"
  otlp:
    endpoint: <OTLP_ENDPOINT>
    headers:
      Authorization: "Bearer ${env:DASH0_AUTH_TOKEN}"

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter]
      exporters: [loadbalancing]
    metrics:
      receivers: [otlp]
      processors: [memory_limiter]
      exporters: [otlp]
    logs:
      receivers: [otlp]
      processors: [memory_limiter]
      exporters: [otlp]

The loadbalancingexporter uses DNS to discover gateway instances. Create a headless Service (with clusterIP: None) for the gateway so that DNS returns individual pod IPs rather than a virtual IP.

apiVersion: v1
kind: Service
metadata:
  name: otel-collector-gateway-headless
  namespace: otel
spec:
  clusterIP: None
  ports:
    - name: otlp-grpc
      port: 4317
      targetPort: otlp-grpc
      protocol: TCP
  selector:
    app.kubernetes.io/name: otel-collector
    app.kubernetes.io/component: gateway

Gateway configuration

The gateway runs the tailsamplingprocessor with policies that determine which traces to keep.

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

processors:
  memory_limiter:
    check_interval: 1s
    limit_mib: 1638
    spike_limit_mib: 400
  tail_sampling:
    decision_wait: 30s
    num_traces: 100000
    expected_new_traces_per_sec: 1000
    policies:
      - name: errors
        type: status_code
        status_code:
          status_codes:
            - ERROR
      - name: slow-traces
        type: latency
        latency:
          threshold_ms: 1000
      - name: baseline
        type: probabilistic
        probabilistic:
          sampling_percentage: 10

exporters:
  otlp:
    endpoint: <OTLP_ENDPOINT>
    headers:
      Authorization: "Bearer ${env:DASH0_AUTH_TOKEN}"
    compression: gzip
    sending_queue:
      enabled: true
      num_consumers: 10
      queue_size: 5000
      storage: file_storage

extensions:
  file_storage:
    directory: /var/lib/otelcol/queue

service:
  extensions: [file_storage]
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, tail_sampling]
      exporters: [otlp]

Policy design

Design sampling policies around how interesting a trace is, not just whether it contains errors.

| Policy type | What it keeps | When to use |
| --- | --- | --- |
| status_code | Traces with at least one ERROR span | Always: errors are high-signal |
| latency | Traces exceeding a duration threshold | Always: slow requests indicate problems |
| probabilistic | A random baseline of normal traces | Always: a baseline is needed for comparison |
| string_attribute | Traces matching specific attribute values | When specific operations or tenants are high-priority |
| rate_limiting | A fixed number of traces per second | When you need predictable volume limits |

Combine multiple policies. The tailsamplingprocessor evaluates all policies and keeps a trace if any policy matches.
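
For example, a string_attribute policy and a rate_limiting policy can be appended to the gateway's policy list. This is a sketch: the attribute key and values are hypothetical and should be replaced with whatever marks high-priority traffic in your system.

      - name: premium-tenants
        type: string_attribute
        string_attribute:
          key: tenant.tier # hypothetical attribute marking high-priority traffic
          values:
            - premium
      - name: volume-cap
        type: rate_limiting
        rate_limiting:
          spans_per_second: 1000 # illustrative ceiling

Because policies are OR-combined, the rate_limiting policy only caps traces it admits itself; traces already kept by the error, latency, or baseline policies are unaffected.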

Sizing the tail sampling buffer

The tailsamplingprocessor buffers spans in memory while waiting for traces to complete. Distributed traces have no end-of-trace marker — the processor waits for decision_wait seconds, then makes a decision with whatever spans have arrived.

| Setting | Description | Guidance |
| --- | --- | --- |
| decision_wait | Time to wait before making a sampling decision | Start with 30s. Increase for environments with long-running operations, but longer waits increase memory usage. |
| num_traces | Maximum number of traces held in memory | Set based on expected trace throughput and decision_wait. If exceeded, the oldest traces are force-decided. |
| expected_new_traces_per_sec | Hint for internal sizing | Set to the approximate number of new traces per second so internal structures can be pre-allocated. |

Memory usage scales linearly with decision_wait * traces_per_second * average_spans_per_trace * average_span_size. Set the Collector container memory limit accordingly, and set memory_limiter.limit_mib to 80 percent of that limit.
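
As an illustrative calculation (every number here is an assumption, not a measurement): at 1,000 new traces per second, a 30-second decision_wait, 20 spans per trace, and roughly 1 KiB per span:

  30 s × 1,000 traces/s            = 30,000 traces buffered at any time
  30,000 traces × 20 spans/trace   = 600,000 spans
  600,000 spans × ~1 KiB/span      ≈ 600 MiB of buffered span data

Allowing for processing overhead on top of the raw span data, this fits a 2 GiB container memory limit with memory_limiter.limit_mib set to 1638 (80 percent of 2048 MiB), which is the sizing used in the gateway configuration above.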

Operational complexity

Tail sampling adds significant operational complexity:

  • The two tiers must be scaled and monitored independently.
  • Consistent hashing must remain stable during gateway scaling events. DNS-based discovery (used by the loadbalancingexporter) is eventually consistent — scaling the gateway may temporarily route spans of the same trace to different instances.
  • The sampling layer is stateful. Gateway pods cannot be terminated without losing buffered spans. Use preStop hooks and graceful shutdown periods to drain in-flight data; a sketch follows this list.
  • Cross-availability-zone traffic increases networking costs. The loadbalancingexporter routes by trace ID, not by zone, so spans frequently cross zone boundaries.
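
A minimal sketch of the graceful-shutdown settings for the gateway Deployment. The 30- and 60-second values are illustrative, and the preStop command assumes the container image ships a sleep binary, which minimal Collector images may not include.

# Fragment of the gateway Deployment's pod template
spec:
  template:
    spec:
      terminationGracePeriodSeconds: 60 # time allowed for the Collector to drain after SIGTERM
      containers:
        - name: otel-collector
          lifecycle:
            preStop:
              exec:
                # preStop runs before SIGTERM is sent, keeping the pod serving
                # while it is removed from the headless Service endpoints
                command: ["sleep", "30"]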

Materialize RED metrics before sampling

Accurate RED metrics cannot be computed from sampled traces — head sampling skews counts, tail sampling overrepresents errors. Always generate metrics from all spans before any sampling occurs.

See RED metrics from traces for connector configuration, histogram type selection, and a full configuration example combining RED metric materialization with tail sampling.
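
As a minimal wiring sketch of that ordering on the gateway (the connector's own configuration is omitted; see the referenced rule), the OTLP receiver fans out to two trace pipelines, so the signaltometrics connector sees every span while tail sampling runs independently:

service:
  pipelines:
    traces/metrics:
      receivers: [otlp]
      processors: [memory_limiter]
      exporters: [signaltometrics] # sees all spans, before any sampling
    traces/sampling:
      receivers: [otlp]
      processors: [memory_limiter, tail_sampling]
      exporters: [otlp]
    metrics/red:
      receivers: [signaltometrics]
      exporters: [otlp]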

Anti-patterns

  • Sampling in the SDK. Configuring TraceIdRatioBased or other non-AlwaysOn samplers in the application loses traces before their outcome is known. Use AlwaysOn in all SDKs and sample in the Collector.
  • Tail sampling without RED metric materialization. Dashboards and alerts that depend on trace-derived metrics show skewed data. Always generate metrics from spans before the sampling step — see RED metrics from traces.
  • Single-tier tail sampling. Running tailsamplingprocessor on agents (DaemonSet) without a loadbalancingexporter means each agent only sees spans from local pods. Traces that span multiple nodes are split across agents, and sampling decisions are made on incomplete data.
  • Short decision_wait with long-running operations. A decision_wait of 10 seconds misses spans from batch jobs or async workflows that take minutes. Traces are force-decided before all spans arrive, leading to incomplete or incorrect sampling decisions.
  • Gateway without a headless Service. The loadbalancingexporter needs individual pod IPs from DNS. A regular ClusterIP Service returns a virtual IP, defeating consistent hashing.
