Google Cloud Storage Pub/Sub Event

The Google Cloud Pub/Sub Event Receiver consumes Google Cloud Storage (GCS) event notifications through a Pub/Sub subscription and emits the contents of each referenced GCS object as the string body of a log record.

Supported Platforms

Platform
Supported

Linux

Windows

macOS

Available in the Bindplane Distro for OpenTelemetry Collector v1.95.0+.

Prerequisites

  • A Google Cloud project with access to Cloud Storage and Pub/Sub.

  • A Pub/Sub subscription configured to receive GCS event notifications.

  • Ensure the collector has permission to pull and acknowledge messages from the Pub/Sub subscription.

  • Ensure the collector has permission to read objects from the GCS bucket.

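  • If using Application Default Credentials (ADC), ensure credentials are available in the collector's environment, for example through the GOOGLE_APPLICATION_CREDENTIALS environment variable or a service account attached to the host.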

How It Works

  1. The receiver pulls messages from a Pub/Sub subscription for GCS event notifications.

  2. When a notification is received, the receiver downloads the GCS object.

  3. The receiver reads the object into the body of a new log record.

  4. Messages undergo two-layer deduplication:

    • Batch-level: Within a single Pull response, duplicates keyed by (bucket, object, generation) are acknowledged immediately.

    • Cross-batch: A time-bounded tracker (configurable via dedup_ttl) catches sequential duplicates arriving in separate Pull responses, accounting for GCS's at-least-once delivery semantics.

  5. If a GCS object is not found (HTTP 404), the message is nacked so that it can be redelivered or routed to a dead-letter queue (DLQ).

  6. If a permission denied error (HTTP 403) is encountered, the message is likewise nacked for redelivery or DLQ processing.

  7. Large objects are handled via offset-based resumption when Redis offset storage is enabled, allowing arbitrarily large files to be processed without memory exhaustion.
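The two deduplication layers in step 4 can be sketched roughly as follows. This is a minimal illustration, not the receiver's actual implementation: the names `TTLTracker` and `dedupe_batch` are hypothetical, `ttl_seconds` stands in for `dedup_ttl`, and treating cross-batch duplicates the same way as batch-level ones (returning them for acknowledgement) is an assumption.

```python
import time
from typing import Dict, Iterable, List, Optional, Tuple

# Deduplication key named in step 4: (bucket, object, generation).
Key = Tuple[str, str, int]


class TTLTracker:
    """Cross-batch layer (hypothetical): remembers keys for ttl_seconds."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl_seconds = ttl_seconds
        self._seen: Dict[Key, float] = {}  # key -> expiry timestamp

    def seen_recently(self, key: Key, now: Optional[float] = None) -> bool:
        """Return True if key was recorded within the TTL; otherwise record it."""
        now = time.monotonic() if now is None else now
        # Drop expired entries so the tracker stays time-bounded.
        self._seen = {k: exp for k, exp in self._seen.items() if exp > now}
        if key in self._seen:
            return True
        self._seen[key] = now + self.ttl_seconds
        return False


def dedupe_batch(
    keys: Iterable[Key],
    tracker: TTLTracker,
    now: Optional[float] = None,
) -> Tuple[List[Key], List[Key]]:
    """Split one Pull response into (keys to process, duplicate keys)."""
    in_batch = set()
    to_process: List[Key] = []
    duplicates: List[Key] = []
    for key in keys:
        if key in in_batch:
            # Batch-level duplicate: same key twice in one Pull response.
            duplicates.append(key)
        elif tracker.seen_recently(key, now):
            # Cross-batch duplicate: key was seen in an earlier Pull response.
            duplicates.append(key)
        else:
            to_process.append(key)
        in_batch.add(key)
    return to_process, duplicates
```

Keying on the generation means that a new version of the same object is treated as a distinct event, while a redelivered notification for the same version is dropped.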

Configuration Fields

| Field | Type | Default | Required | Description |
| --- | --- | --- | --- | --- |
| project_id | string | | true | The Google Cloud project ID that contains the Pub/Sub subscription. |
| subscription_id | string | | true | The Pub/Sub subscription ID that receives GCS event notifications. |
| workers | int | 5 | false | The number of concurrent workers to process events. |
| max_extension | duration | 3600 | false | The maximum total time (in seconds) for which the receiver will extend the ack deadline for a message being processed. After this duration, the message becomes eligible for redelivery. |
| max_log_size | int | 1048576 | false | The maximum size in bytes for a single log record. Logs exceeding this size will be split into chunks. |
| max_logs_emitted | int | 1000 | false | The maximum number of log records to emit in a batch. A higher number will result in fewer batches, but more memory. |
| bucket_name_filter | string | | false | When set, the source will only emit logs for bucket names that match the specified regex. |
| object_key_filter | string | | false | When set, the source will only emit logs for object names that match the specified regex. |
| enable_offset_storage | bool | false | false | When enabled, the current position into an object will be saved to Redis, and reading will resume from where it left off after a collector restart. |
| redis_hostname | string | | false | The hostname or IP address of the Redis server used for offset storage. |
| redis_port | int | 6379 | false | The port number of the Redis server used for offset storage. |
| redis_password | string | | false | The password for the Redis server used for offset storage. |
| redis_database | int | 0 | false | The Redis database number to use for offset storage. |
| redis_expiration | int | 0 | false | The expiration time (in seconds) for offset storage in Redis. |
| redis_tls_enabled | bool | false | false | Whether to enable TLS for the Redis connection. |
| redis_tls_ca_file | string | | false | The path to the CA file for Redis TLS connection. |
| redis_tls_cert_file | string | | false | The path to the client certificate file for Redis TLS connection. |
| redis_tls_key_file | string | | false | The path to the client key file for Redis TLS connection. |
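To make the effect of bucket_name_filter, object_key_filter, and max_log_size more concrete, here is a small sketch of how such settings could behave. The helper names are hypothetical, and two details are assumptions rather than documented behavior: the filters are applied with unanchored matching (`re.search`), and oversized content is split on plain byte boundaries.

```python
import re
from typing import List, Optional


def passes_filters(
    bucket: str,
    object_name: str,
    bucket_name_filter: Optional[str] = None,
    object_key_filter: Optional[str] = None,
) -> bool:
    """Return True when both optional regex filters accept the event.

    Assumption: unanchored matching; the receiver may match differently.
    """
    if bucket_name_filter and not re.search(bucket_name_filter, bucket):
        return False
    if object_key_filter and not re.search(object_key_filter, object_name):
        return False
    return True


def split_into_chunks(data: bytes, max_log_size: int = 1048576) -> List[bytes]:
    """Split content into pieces of at most max_log_size bytes.

    Assumption: simple byte-boundary splitting, for illustration only.
    """
    if max_log_size <= 0:
        raise ValueError("max_log_size must be positive")
    return [data[i:i + max_log_size] for i in range(0, len(data), max_log_size)]
```

For example, `passes_filters("prod-logs", "app/2024/01.json", r"^prod-", r"\.json$")` accepts the event, while the same call with the bucket `"dev-logs"` rejects it. With both filters unset, every event passes.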

Example Configuration

Component Telemetry

This component emits telemetry that provides insight into how it is performing. By default, the collector exposes these metrics at localhost:8888/metrics.

| Metric Name | Type | Description |
| --- | --- | --- |
| otelcol_gcsevent.batch_size | histogram | The number of logs in a batch. |
| otelcol_gcsevent.objects_handled | counter | The number of GCS objects processed by the receiver. |
| otelcol_gcsevent.failures | counter | The number of failures encountered while processing GCS objects. |
| otelcol_gcsevent.parse_errors | counter | The number of individual log records skipped due to parse errors within a GCS object. |
| otelcol_gcsevent.dlq_file_not_found_errors | counter | The number of file not found errors that triggered DLQ processing. |
| otelcol_gcsevent.dlq_iam_errors | counter | The number of IAM permission denied errors that triggered DLQ processing. |
| otelcol_gcsevent.dlq_unsupported_file_errors | counter | The number of unsupported file type errors that triggered DLQ processing. |
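One way to inspect these metrics is to read the endpoint and keep only this receiver's series. The sketch below assumes the endpoint serves the Prometheus text format, that dots in metric names appear as underscores there (exact names and suffixes may differ), and that samples carry no trailing timestamp. The sample text and the function names are hypothetical.

```python
from typing import Dict
from urllib.request import urlopen

# Hypothetical sample of scraped text, for illustration only.
SAMPLE = """\
# TYPE otelcol_gcsevent_objects_handled counter
otelcol_gcsevent_objects_handled 42
# TYPE otelcol_gcsevent_failures counter
otelcol_gcsevent_failures 3
otelcol_process_uptime 100
"""


def gcsevent_samples(text: str, prefix: str = "otelcol_gcsevent") -> Dict[str, float]:
    """Return {series: value} for sample lines whose name starts with prefix."""
    samples: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The value is the last space-separated token (no timestamp assumed).
        series, _, value = line.rpartition(" ")
        if series.startswith(prefix):
            samples[series] = float(value)
    return samples


def fetch_metrics(url: str = "http://localhost:8888/metrics") -> str:
    """Read the metrics endpoint; requires a running collector."""
    with urlopen(url, timeout=5) as response:
        return response.read().decode("utf-8")
```

Against a running collector, `gcsevent_samples(fetch_metrics())` would return the receiver's current values. A rising DLQ error counter alongside a flat objects_handled counter would suggest checking bucket permissions or object lifecycle rules.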
