# Formatting Cluster Level Events from Kubernetes for Querying in ClickStack

Using the **Process Kubernetes Cluster Events for ClickHouse** blueprint, you can filter routine controller reconciliation events, reduce cardinality from object UIDs, map event types to severity levels, and deduplicate warnings, preparing cluster events for efficient ClickHouse storage and alerting.

### Overview

The Kubernetes events blueprint preprocesses cluster-level events from the Kubernetes Events API before they reach ClickHouse. It solves specific challenges in Kubernetes event analytics:

* **Controller Noise Filtering**: Controllers continuously reconcile desired state. Events like `ScalingReplicaSet`, `SuccessfulCreate`, `SuccessfulDelete`, `Scheduled`, and `Pulled` generate high volume with low signal. The blueprint filters these routine Normal events.
* **Cardinality Explosion Prevention**: Kubernetes object UIDs (pod UIDs, deployment UIDs, node UIDs) are globally unique and create massive cardinality in ClickHouse's map columns, degrading query performance. The blueprint removes these identifiers.
* **Severity Mapping**: Kubernetes events carry type (Normal/Warning) but lack OTLP severity levels. The blueprint maps event types and reasons to SeverityText and SeverityNumber, enabling efficient SQL queries like `WHERE SeverityText = 'WARN'`.
* **Warning Deduplication**: Controllers can emit repeated warnings every few seconds during resource pressure or failed rollouts. The blueprint collapses these into single entries with event counts.

### Bindplane Configuration

To deploy this blueprint in your Bindplane instance:

1. Navigate to [Blueprints](https://app.bindplane.com/p/01J06XSD7F4KT3D0XDE2VQDAR5/blueprints) and choose the **Process Kubernetes Cluster Events for ClickHouse** Blueprint. Save it to your Library.
2. Open the **Processors** section in any of your Bindplane configurations.
3. Modify the Blueprint to fit your specific dataset and requirements.
4. Ensure that the data is formatted correctly by comparing the [Snapshot](https://docs.bindplane.com/feature-guides/snapshots) to the [Live Preview](https://docs.bindplane.com/feature-guides/live-preview), validating that the pre-processed data looks as expected.
5. Save your changes and roll out the configuration update to production.

#### Resource Detection Setup

The first processor in the pipeline detects resource attributes based on your cloud environment. Update the detector to match your infrastructure:

* **GCP**: Set `detectors: ["gcp"]` (default)
* **AWS EKS**: Set `detectors: ["eks"]`
* **AWS EC2**: Set `detectors: ["ec2"]`
* **Azure AKS**: Set `detectors: ["aks"]`
* **Azure**: Set `detectors: ["azure"]`

Select the detector matching your Kubernetes cluster's hosting environment. This ensures ClickHouse stores accurate resource metadata in the ResourceAttributes map column.
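
As a sketch, the detector selection maps onto an OpenTelemetry `resourcedetection` processor configuration along these lines (the `timeout` and `override` values shown are illustrative assumptions, not blueprint defaults):

```yaml
processors:
  resourcedetection:
    # Swap "gcp" for "eks", "ec2", "aks", or "azure" to match your environment.
    detectors: ["gcp"]
    # Illustrative settings; tune for your cluster.
    timeout: 2s
    override: false
```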

#### Processing Pipeline

The blueprint applies these transformations in sequence:

**Resource Enrichment**: Detects and adds resource attributes (cloud provider, account, region, availability zone) based on your environment. These attributes populate ClickHouse's ResourceAttributes column for multi-cloud deployments.

**Controller Noise Filtering**: Normal events from Kubernetes system components are excluded:

* `ScalingReplicaSet` – ReplicaSet scaling during deployments
* `SuccessfulCreate` / `SuccessfulDelete` – Object creation/deletion
* `Scheduled` – Pod scheduling to nodes
* `Started` / `Created` / `Pulled` – Container lifecycle events
* `SawCompletedJob` – Job completion
* `Killing` – Pod termination
* `SuccessfulRescale` – HPA scaling

These events are routine and add minimal operational value. Warning events, the only other event type the Kubernetes Events API emits, are always preserved.
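
A hedged sketch of this filtering using the collector's `filter` processor with an OTTL condition (the processor name and the attribute keys `k8s.event.type` / `k8s.event.reason` are assumptions; match them to the keys your events actually carry):

```yaml
processors:
  filter/k8s-controller-noise:
    logs:
      log_record:
        # Records matching a condition are DROPPED; Warning events never match.
        - |
          attributes["k8s.event.type"] == "Normal" and
          IsMatch(attributes["k8s.event.reason"],
            "^(ScalingReplicaSet|SuccessfulCreate|SuccessfulDelete|Scheduled|Started|Created|Pulled|SawCompletedJob|Killing|SuccessfulRescale)$")
```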

**High-Cardinality UID Removal**: Kubernetes object UIDs are deleted:

* Pod UIDs, ReplicaSet UIDs, Deployment UIDs
* StatefulSet, DaemonSet, Job, CronJob UIDs
* Node UIDs, Namespace UIDs, Service UIDs

Removing UIDs prevents fragmentation in ClickHouse's LogAttributes and ResourceAttributes map columns.
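
A minimal sketch of UID removal with a `transform` processor and OTTL's `delete_key` (the attribute key names are illustrative assumptions; extend the list to cover every UID key present in your events):

```yaml
processors:
  transform/drop-uids:
    log_statements:
      - context: log
        statements:
          - delete_key(attributes, "k8s.pod.uid")
          - delete_key(attributes, "k8s.replicaset.uid")
          - delete_key(attributes, "k8s.deployment.uid")
          - delete_key(attributes, "k8s.node.uid")
          - delete_key(attributes, "k8s.namespace.uid")
```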

**Metadata Normalization**: Standard fields are set if missing:

* `service.name` defaults to `kubernetes`
* `k8s.cluster.name` defaults to `default` (update to your actual cluster name if desired)
* `k8s.namespace.name` defaults to `default`

These defaults ensure consistent Clickhouse schema across all cluster events.
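
The set-if-missing behavior can be sketched with conditional OTTL `set` statements in a `transform` processor (processor naming is an assumption; the attribute keys and defaults come from the list above):

```yaml
processors:
  transform/defaults:
    log_statements:
      - context: resource
        statements:
          # Only set each field when it is absent, so real values are never overwritten.
          - set(attributes["service.name"], "kubernetes") where attributes["service.name"] == nil
          - set(attributes["k8s.cluster.name"], "default") where attributes["k8s.cluster.name"] == nil
          - set(attributes["k8s.namespace.name"], "default") where attributes["k8s.namespace.name"] == nil
```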

**Event Type to Severity Mapping**: Kubernetes event types and reasons are mapped to OTLP severity:

* Normal events → `SeverityText="INFO"`, `SeverityNumber=9`
* Warning events → `SeverityText="WARN"`, `SeverityNumber=13`
* Events with reasons containing "error", "failed", "backoff", or "crash" → `SeverityText="ERROR"`, `SeverityNumber=17`

This mapping enables native ClickHouse severity filtering without parsing string fields.
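
A sketch of the mapping as OTTL statements, assuming the event type and reason live in `k8s.event.type` / `k8s.event.reason` attributes (key names are assumptions; the severity values match the list above):

```yaml
processors:
  transform/severity:
    log_statements:
      - context: log
        statements:
          - set(severity_text, "INFO")  where attributes["k8s.event.type"] == "Normal"
          - set(severity_number, 9)     where attributes["k8s.event.type"] == "Normal"
          - set(severity_text, "WARN")  where attributes["k8s.event.type"] == "Warning"
          - set(severity_number, 13)    where attributes["k8s.event.type"] == "Warning"
          # Error-like reasons escalate regardless of event type.
          - set(severity_text, "ERROR") where IsMatch(attributes["k8s.event.reason"], "(?i)(error|failed|backoff|crash)")
          - set(severity_number, 17)    where IsMatch(attributes["k8s.event.reason"], "(?i)(error|failed|backoff|crash)")
```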

**Warning Deduplication**: Warning events are deduplicated within 60-second windows. Repeated warnings (e.g., "PodCrashLooping" during cascading failures) are collapsed into single entries with an `event_count` field, reducing row count in ClickHouse.
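
For orientation, deduplication of this shape is typically configured with the Bindplane agent's log deduplication processor; a hedged sketch (field names are assumptions to verify against your agent version):

```yaml
processors:
  logdedup:
    # 60-second window, matching the blueprint's default.
    interval: 60s
    # Attribute recording how many duplicates were collapsed; "event_count"
    # is assumed here to line up with the ClickHouse queries in this guide.
    log_count_attribute: event_count
```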

**Batching**: Events are batched for efficient ingestion.

### Customizing the Blueprint

Adjust the blueprint for your deployment:

* **Preserve Routine Events**: Remove or modify the "Filter K8s Controller Noise" processor if you need complete event visibility
* **Adjust Cluster Name**: Update the default `k8s.cluster.name` value to match your actual cluster name for easier filtering in ClickHouse
* **Change Dedup Window**: Modify the warning deduplication `interval` (currently 60 seconds) to tune sensitivity
* **Add Custom Filters**: Insert additional processors to filter other high-volume events specific to your environment

### Querying Deduplicated Events in ClickHouse

When working with deduplicated events, include the `event_count` field:

```sql
SELECT
    LogAttributes['k8s.event.reason'] AS reason,
    LogAttributes['involved_object.name'] AS object_name,
    LogAttributes['event_count'] AS event_count,
    count() AS unique_events
FROM otel_logs
WHERE SeverityText = 'WARN'
  AND ResourceAttributes['k8s.cluster.name'] = 'production'
GROUP BY reason, object_name, event_count
ORDER BY unique_events DESC
```

This query shows warning events grouped by reason and object, with `event_count` indicating repetition frequency.
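
Because each deduplicated row can represent many underlying events, sum `event_count` rather than counting rows when you need true occurrence totals. A sketch, assuming the default `otel_logs` schema where map values are strings (hence the `toUInt64OrZero` cast):

```sql
SELECT
    LogAttributes['k8s.event.reason'] AS reason,
    sum(toUInt64OrZero(LogAttributes['event_count'])) AS total_occurrences,
    count() AS stored_rows
FROM otel_logs
WHERE SeverityText = 'WARN'
GROUP BY reason
ORDER BY total_occurrences DESC
```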

### Integration with ClickHouse

Configure your ClickHouse exporter to align with the blueprint:

* **Batch size**: 5000-10000 rows
* **Timeout**: 5+ seconds for adequate batching
* **Connection pooling**: Enabled for throughput
* **Retry logic**: Enabled for network resilience
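
A hedged sketch of an exporter and batch configuration in this spirit, using the contrib ClickHouse exporter (the endpoint, database, and table names are placeholder assumptions for your deployment):

```yaml
exporters:
  clickhouse:
    endpoint: tcp://clickhouse:9000
    database: otel
    logs_table_name: otel_logs
    timeout: 10s
    retry_on_failure:
      enabled: true

processors:
  batch:
    send_batch_size: 8000  # within the 5000-10000 row guidance above
    timeout: 5s
```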

The blueprint's filtering and deduplication typically reduce raw event volume by 60-80%, significantly lowering storage costs.

### Monitoring Pipeline Effectiveness

Track these metrics to ensure healthy operation:

* **Filter Effectiveness**: Normal events should represent 40-70% of raw event volume (these are filtered)
* **Dedup Rate**: Warning events with `event_count > 1` indicate successful deduplication
* **Cardinality Reduction**: UID field removal prevents LogAttributes from fragmenting
* **Severity Distribution**: Verify that mapped events have consistent SeverityText/SeverityNumber values
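
The severity distribution check can be run as a quick query sketch against the assumed `otel_logs` table; rows with an empty `SeverityText` or a zero `SeverityNumber` indicate events that escaped the mapping:

```sql
SELECT
    SeverityText,
    SeverityNumber,
    count() AS events
FROM otel_logs
WHERE Timestamp > now() - INTERVAL 1 HOUR
GROUP BY SeverityText, SeverityNumber
ORDER BY events DESC
```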

{% hint style="info" %}
**NOTE**

This Blueprint has been tested against standard data patterns. You may need to adjust the configuration to match your specific data format.
{% endhint %}
