Configure Collector Queue Size

The Bindplane OTel Collector (BDOT) buffers outgoing telemetry in a sending queue before delivering it to a destination. Knowing how the queue works — and how to size it — is the key to controlling memory consumption, preventing disk bloat, and avoiding silent data loss when a destination becomes unreachable.

How the Sending Queue Works

Every destination in a Bindplane configuration exposes a sending queue that sits between the pipeline and the export call. When the destination endpoint is slow or temporarily unavailable, the queue absorbs the backlog so the rest of the pipeline can keep running.

Receiver → Processors → [Sending Queue] → Destination

The queue holds batches of telemetry, not individual spans or log records. One queue slot = one batch produced by the batch processor upstream.

Two queue types are available:

| Type | Storage | Survives collector restart | Best for |
| --- | --- | --- | --- |
| In-memory (default) | RAM | No; data is lost on crash or restart | Low-latency, non-critical telemetry |
| Persistent (Pebble / Bolt) | Local disk | Yes | Critical telemetry, gateway collectors |

The in-memory queue is enabled by default. To survive restarts you must explicitly configure a persistent queue backend (Pebble is recommended for new deployments).

Key Parameters

These parameters appear on every destination's Advanced settings panel.

| Parameter | Default | Description |
| --- | --- | --- |
| sending_queue_enabled | true | Enables the sending queue. Disable only when the destination is always reachable and you want zero buffering overhead. |
| sending_queue_queue_size | 5000 | Maximum number of batches held in the queue. When the queue is full, new batches are dropped. |
| sending_queue_num_consumers | 10 | Number of concurrent goroutines reading from the queue and sending to the destination. Higher values increase throughput at the cost of more CPU and open connections. |

Estimating Memory Usage

For the in-memory queue, the worst-case RAM consumed is approximately:

max_memory ≈ queue_size × avg_batch_size

Example: the default queue_size of 5000 with a typical 100 KB batch:

5000 × 100 KB ≈ 500 MB

In practice, the queue drains continuously, so average memory usage is a fraction of the worst case. However, plan for the worst case when sizing collector nodes.
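The worst-case estimate is plain multiplication, so it is easy to script when comparing candidate settings. The following is a minimal Python sketch, not part of Bindplane; the function name `worst_case_queue_bytes` is hypothetical, and decimal units (1 KB = 1000 bytes) are an assumption chosen so the numbers match the 100 KB example.

```python
def worst_case_queue_bytes(queue_size: int, avg_batch_bytes: int) -> int:
    """Upper bound on bytes held by a full queue (one slot per batch)."""
    if queue_size < 0 or avg_batch_bytes < 0:
        raise ValueError("queue_size and avg_batch_bytes must be non-negative")
    return queue_size * avg_batch_bytes


KB = 1000       # assumption: decimal units
MB = 1000 * KB

# Default queue_size of 5000 with a 100 KB average batch.
estimate = worst_case_queue_bytes(queue_size=5000, avg_batch_bytes=100 * KB)
print(f"{estimate / MB:.0f} MB")  # 500 MB
```

Measure the real average batch size on your own pipeline before relying on the result; 100 KB is only the illustrative figure used in this section.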

Reducing memory pressure:

  • Lower queue_size (e.g., 500–1000) on memory-constrained hosts.

  • Reduce batch size in the batch processor (send_batch_max_size).

  • Switch to a persistent queue to move storage off the heap.

Estimating Disk Usage (Persistent Queue)

For a persistent queue, the same formula applies, but the cost is disk space rather than RAM:

max_disk ≈ queue_size × avg_batch_size

The persistent queue backend (Pebble) compacts old entries automatically. Monitor actual disk growth and set queue_size so the queue never fills a partition.
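One way to sanity-check a persistent queue size is to compare the worst case against the free space on the partition that holds the queue. The sketch below uses Python's standard `shutil.disk_usage`; the function name `queue_fits_on_disk`, the storage path, and the 50% headroom factor are assumptions for illustration, and the calculation ignores any on-disk overhead added by the storage backend.

```python
import shutil


def queue_fits_on_disk(queue_size: int, avg_batch_bytes: int,
                       storage_dir: str, max_fraction: float = 0.5) -> bool:
    """Return True if a full queue stays within max_fraction of free space."""
    worst_case = queue_size * avg_batch_bytes
    free_bytes = shutil.disk_usage(storage_dir).free
    return worst_case <= free_bytes * max_fraction


# Hypothetical check against the partition of the current directory.
print(queue_fits_on_disk(5000, 100_000, "."))
```

Point `storage_dir` at the directory actually used for the persistent queue; the current directory is only a placeholder.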

What Happens When the Queue Is Full

When queue_size is reached and the destination is still unavailable:

  1. The queue stops accepting new batches.

  2. Back-pressure propagates upstream — receivers and processors stall.

  3. If back-pressure cannot be absorbed, telemetry is dropped and a warning is logged: "Dropping data because sending_queue is full".

Increase queue_size to absorb longer outages, or investigate why the destination is not keeping up (network issues, throttling, insufficient num_consumers).
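To get a feel for how long a given queue_size can absorb an outage, it can help to model the queue as a bounded buffer. The following Python sketch is a simplified model, not the collector's implementation: it assumes a total outage (nothing is exported), a constant hypothetical batch rate, and no back-pressure, so every batch arriving at a full queue is counted as dropped.

```python
from collections import deque


def simulate_outage(queue_size: int, batches_per_second: int, outage_seconds: int):
    """Model a bounded queue while nothing can be exported.

    Returns (queued, dropped) batch counts at the end of the outage.
    """
    queue = deque()
    dropped = 0
    for _ in range(outage_seconds * batches_per_second):
        if len(queue) < queue_size:
            queue.append(object())  # placeholder for one batch
        else:
            dropped += 1  # queue full: the new batch is lost
    return len(queue), dropped


def seconds_until_full(queue_size: int, batches_per_second: float) -> float:
    """How long an empty queue can absorb a total outage."""
    return queue_size / batches_per_second


# Hypothetical load: 20 batches per second during a 5-minute outage.
print(simulate_outage(5000, 20, 300))  # (5000, 1000)
print(seconds_until_full(5000, 20))    # 250.0
```

Under these assumptions the default queue covers 250 seconds of a 300-second outage, and the last 50 seconds of batches are lost.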

Retry on Failure vs. Queue Size

The sending queue and retry-on-failure work together but serve different purposes:

| Setting | Purpose |
| --- | --- |
| Sending queue | Holds batches while the destination is unreachable |
| Retry on failure | Re-attempts individual failed export calls with exponential backoff |

Both are enabled by default. The queue provides the buffer; retry on failure governs how aggressively the collector retries each dequeued batch before giving up.
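Exponential backoff means the wait between retries grows by a fixed factor until it reaches a cap. The Python sketch below illustrates the general technique only; the parameter names and the values used are illustrative assumptions, not confirmed Bindplane defaults, and real implementations typically add random jitter, which is omitted here to keep the output deterministic.

```python
def backoff_schedule(initial_interval: float, multiplier: float,
                     max_interval: float, max_elapsed_time: float):
    """Yield wait times for successive retries of one failed export.

    Stops once the next wait would push total waiting past max_elapsed_time.
    """
    elapsed = 0.0
    interval = initial_interval
    while elapsed + interval <= max_elapsed_time:
        yield interval
        elapsed += interval
        interval = min(interval * multiplier, max_interval)


# Illustrative values only (seconds).
waits = list(backoff_schedule(initial_interval=5, multiplier=2,
                              max_interval=30, max_elapsed_time=120))
print(waits)  # [5, 10, 20, 30, 30]
```

While a batch waits between retries it continues to occupy capacity, which is why the retry settings and queue_size should be considered together.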

Recommended queue sizes by scenario:

| Scenario | Recommended queue_size | Notes |
| --- | --- | --- |
| Memory-constrained edge agent (< 512 MB RAM) | 250–500 | Pair with a small send_batch_max_size |
| Standard agent (1–4 GB RAM) | 1000–2000 | Fits most single-host workloads |
| Gateway collector (high throughput) | 5000–10000 | Use a persistent queue to avoid RAM exhaustion |
| Critical telemetry (must not lose data on restart) | Any, with persistent queue enabled | Pebble is recommended |

Increase num_consumers (e.g., 20–50) on gateway collectors that send to low-latency destinations to maximize export throughput.
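A rough way to reason about num_consumers is that each consumer can complete at most one export per round trip, so the throughput ceiling is about num_consumers divided by the export latency. The Python sketch below encodes that rule of thumb; the function names and the 250 ms latency are hypothetical, and the model ignores CPU limits, connection limits, and destination throttling.

```python
import math


def max_batches_per_second(num_consumers: int, export_latency_seconds: float) -> float:
    """Approximate export ceiling: each consumer sends one batch per round trip."""
    return num_consumers / export_latency_seconds


def consumers_needed(batches_per_second: float, export_latency_seconds: float) -> int:
    """Concurrent exports required to keep pace with the incoming batch rate."""
    return max(1, math.ceil(batches_per_second * export_latency_seconds))


# Hypothetical destination with a 250 ms export round trip.
print(max_batches_per_second(10, 0.25))  # 40.0
print(max_batches_per_second(50, 0.25))  # 200.0
print(consumers_needed(100, 0.25))       # 25
```

If the incoming batch rate exceeds the ceiling for long, the queue grows until it is full, regardless of queue_size.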

What You Should Do Next

  1. Check queue_size against available RAM: queue_size × avg_batch_size ≤ available_memory × 0.5

  2. Enable a persistent queue if data must survive a collector restart

  3. Monitor queue depth using internal telemetry

  4. Watch collector logs for "Dropping data because sending_queue is full" warnings

  5. For gateway deployments, prefer Pebble over the default in-memory queue
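The memory check in step 1 can be rearranged to give the largest queue_size that fits a memory budget. The following Python sketch does that; the function name `max_queue_size` is hypothetical, the 0.5 factor comes from the rule in step 1, and decimal megabytes and a 100 KB average batch are assumptions for the example.

```python
def max_queue_size(available_memory_bytes: int, avg_batch_bytes: int,
                   memory_fraction: float = 0.5) -> int:
    """Largest queue_size with queue_size * avg_batch_bytes <= available * fraction."""
    if avg_batch_bytes <= 0:
        raise ValueError("avg_batch_bytes must be positive")
    return int(available_memory_bytes * memory_fraction // avg_batch_bytes)


MB = 1000 * 1000  # assumption: decimal units

# Hypothetical hosts with 100 MB and 1000 MB available to the collector.
print(max_queue_size(100 * MB, 100_000))   # 500
print(max_queue_size(1000 * MB, 100_000))  # 5000
```

Treat the result as an upper bound and leave room for the rest of the pipeline, which also consumes memory.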
