Batching Configuration Performance Impact
In high-throughput ingestion pipelines, the placement of the batch processor in the OpenTelemetry Collector pipeline significantly affects throughput, CPU utilization, and memory consumption.
Poor batching placement can lead to:
Excessive CPU load from per-record processing
Higher memory pressure from holding unbatched data longer than necessary
Reduced throughput due to inefficient export batching
The optimal batching strategy is workload-specific; without careful tuning, throughput can drop by thousands of telemetry items per second.
The batch processor groups telemetry signals into batches to reduce per-export overhead.
Two main placement strategies
Early Batching – Batch immediately after the receiver:
Receiver → Batch Processor → Other Processors → Exporter
Pros:
Downstream processors work with fewer, larger payloads
Reduced CPU context switching downstream
Cons:
Batch-wide transformations can be more expensive if data is later dropped
Potential for higher memory usage if downstream processors expand batches
Late Batching – Batch just before export:
Receiver → Other Processors → Batch Processor → Exporter
Pros:
Processors handle smaller, more granular payloads
No batching work is wasted on items that get filtered out during processing
Cons:
Less benefit from batching in earlier stages; higher per-record overhead in processing
Observed Performance Differences
From benchmarking across varied workloads:
Early batching tends to improve throughput by ~10% in simple pipelines and reduce CPU usage for high-volume, low-complexity telemetry.
Late batching performs better in complex pipelines with multiple processing stages, avoiding expensive batch-wide operations on data that will be dropped or heavily modified.
Batch size and timeout settings (send_batch_max_size, timeout) directly influence both throughput and resource usage.
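As a concrete starting point, a batch processor block exposing these settings might look like the sketch below; the values are illustrative placeholders rather than tuned recommendations and should be benchmarked against your own workload:

processors:
  batch:
    send_batch_size: 8192       # preferred batch size; a batch is sent once it reaches this many items
    send_batch_max_size: 16384  # hard upper bound; larger batches are split before export
    timeout: 2s                 # flush whatever has accumulated after this interval, even if the batch is not full

Larger batch sizes and longer timeouts generally lower per-export overhead, but they also hold data in memory longer, which raises memory pressure.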
Solution
Choose batching placement based on processing complexity and data volume:
Workload Type                                     | Recommended Placement | Reason
High-volume, low-complexity (minimal processing)  | Early Batching        | Minimizes per-record overhead early in the pipeline
Complex, multi-processor pipelines                | Late Batching         | Avoids unnecessary batch-wide transformations
Early Batching Example:
service:
  pipelines:
    logs:
      receivers: [my_receiver]
      processors: [batch, memory_limiter, attributes]
      exporters: [otlphttp]
Late Batching Example:
service:
  pipelines:
    logs:
      receivers: [my_receiver]
      processors: [memory_limiter, attributes, batch]
      exporters: [otlphttp]
What's in store for future versions?
The batch processor is being deprecated in favor of exporter-native batching.
While current deployments can benefit from placement tuning, future optimization will focus on exporter-level batch settings, so re-benchmarking will be required after migration.
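For orientation only, exporter-native batching is configured on the exporter itself rather than as a pipeline processor. The sketch below uses the field names from the experimental exporterhelper batcher (batcher, flush_timeout, min_size_items, max_size_items); these names have changed between Collector releases, so treat this as an assumption to verify against the version you migrate to:

exporters:
  otlphttp:
    endpoint: https://collector.example.com:4318  # placeholder endpoint
    batcher:
      enabled: true           # enable exporter-native batching
      flush_timeout: 2s       # flush a partial batch after this interval
      min_size_items: 4096    # wait for at least this many items before exporting
      max_size_items: 8192    # split batches larger than this before export

Because batching then happens at the exporter boundary, the early-versus-late placement trade-off above no longer applies, and tuning shifts to these exporter-level settings.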