Storing Security Events in ClickHouse using Bindplane

Security and audit logs are critical for compliance and threat detection, but they require careful handling to balance forensic completeness with operational efficiency.

Using the Process Security Logs for ClickHouse blueprint, you can preserve authentication events and security signals while masking PII, normalizing timestamps, and deduplicating repeated alerts, optimizing your SIEM data for ClickHouse analytics.

Overview

The security logs blueprint preprocesses audit and security logs before ingestion into ClickHouse, addressing SIEM-specific requirements:

  • Security Signal Preservation: The blueprint protects authentication, authorization, and intrusion detection logs regardless of severity level, ensuring no security events are lost in filtering.

  • PII Masking with Forensic Retention: Sensitive data such as passwords, API keys, and emails is hashed (not censored), allowing forensic analysis while protecting personal information. User IDs and IP addresses are preserved in hashed form for correlation.

  • ECS Normalization: Security event fields are normalized to the Elastic Common Schema (ECS) standard, enabling consistent querying across diverse security sources (firewalls, identity providers, SIEM platforms, cloud audit logs).

  • Alert Deduplication: During security incidents, repeated alerts can flood the system. The blueprint deduplicates within 60-second windows while preserving the key forensic fields (event type, source IP, user name, action) needed for investigation.

Bindplane Configuration

To implement this blueprint in your Bindplane deployment:

  1. Navigate to Blueprints and choose the Process Security Logs for ClickHouse Blueprint. Save it to your Library.

  2. Open the Processors section in any of your Bindplane configurations.

  3. Modify the Blueprint to fit your specific dataset and requirements.

  4. Ensure the data is formatted correctly by comparing the Snapshot to the Live Preview, and verify that pre-processing produces the expected output.

  5. Save your changes and roll out the configuration update to production.

Processing Pipeline

The blueprint applies these transformations in sequence:

JSON Parsing: Security log bodies are parsed from JSON into structured attributes, enabling field-level access for downstream processors.

Priority-Based Filtering: The blueprint filters low-priority informational events while preserving all authentication, authorization, intrusion detection, and malware alerts:

  • Drops INFO-level events unless they're authentication-related or marked with security.keep="true"

  • Always preserves events with event.category in: authentication, authorization, intrusion, malware

  • Always preserves events with event.type in: login, logout, access, denied, failed, blocked

This ensures critical security signals are never lost due to severity filtering.
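As an illustration, the filtering decision can be sketched in Python. This is a hypothetical model of the logic, not the blueprint's actual processor configuration; the attribute names follow the ECS-style fields described above:

```python
# Hypothetical sketch of the priority-based filter; the real blueprint
# implements this inside Bindplane processors.
KEEP_CATEGORIES = {"authentication", "authorization", "intrusion", "malware"}
KEEP_TYPES = {"login", "logout", "access", "denied", "failed", "blocked"}

def keep_event(attrs: dict) -> bool:
    """Return True if the log record should be preserved."""
    if attrs.get("event.category") in KEEP_CATEGORIES:
        return True
    if attrs.get("event.type") in KEEP_TYPES:
        return True
    if attrs.get("security.keep") == "true":
        return True
    # Otherwise, drop low-priority informational events
    return attrs.get("severity_text", "INFO") != "INFO"

print(keep_event({"severity_text": "INFO", "event.category": "authentication"}))  # True
print(keep_event({"severity_text": "INFO", "event.type": "metrics"}))             # False
```

Note that security checks run before the severity check, which is why authentication events survive even at INFO level.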

PII Masking with Hashing: Sensitive data is hashed using SHA-3 rather than censored, preserving forensic value:

  • Passwords, secrets, and credentials → hashed

  • Emails, phone numbers, SSNs, credit cards → hashed

  • Authorization tokens and API keys → hashed

Forensically important fields are preserved:

  • user.id and user.name – hashed for correlation

  • source.ip and destination.ip – hashed for traffic pattern analysis

  • event.action and event.outcome – preserved for investigating access attempts

Hashing allows you to link failed authentication attempts from the same user or IP while protecting actual PII.
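The correlation property follows from hashing being deterministic, as this minimal Python sketch shows (illustrative only; the blueprint performs the hashing in its processors):

```python
import hashlib

def mask(value: str) -> str:
    """Deterministically hash a sensitive value with SHA-3 so records from
    the same user or IP can be correlated without storing the raw PII."""
    return hashlib.sha3_256(value.encode("utf-8")).hexdigest()

# The same input always yields the same digest, so repeated failed logins
# from one IP share one masked value:
a = mask("203.0.113.7")
print(a == mask("203.0.113.7"))   # True: correlatable
print(a == mask("198.51.100.9"))  # False: distinct sources stay distinct
```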

Non-Security Cardinality Reduction: High-cardinality fields unrelated to security analysis are removed:

  • Request IDs, correlation IDs, transaction IDs

  • Session IDs and container identifiers

  • Kubernetes pod UIDs and process IDs

  • HTTP cookies

Security-relevant fields (user, IP, event type) are preserved despite potential cardinality.
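A minimal sketch of this step, with a hypothetical field list (the blueprint's actual drop list may differ):

```python
# High-cardinality fields that add storage cost without security value
DROP_KEYS = {"request_id", "correlation_id", "transaction_id",
             "session_id", "container_id", "k8s.pod.uid", "process.pid",
             "http.cookies"}
# Security-relevant fields that are kept despite their cardinality
KEEP_KEYS = {"user.name", "source.ip", "event.type"}

def reduce_cardinality(attrs: dict) -> dict:
    return {k: v for k, v in attrs.items()
            if k in KEEP_KEYS or k not in DROP_KEYS}

print(reduce_cardinality(
    {"request_id": "abc-123", "user.name": "a1b2", "event.type": "login"}))
```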

ECS Field Normalization: Standard security event fields receive defaults:

  • event.kind defaults to event

  • event.category defaults to authentication

  • event.outcome defaults to unknown

  • service.name defaults to unknown (update to your service name)

  • severity_text defaults to INFO

ECS normalization ensures consistent field naming across heterogeneous security sources.
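The defaulting behavior can be sketched as follows; the key point is that defaults never overwrite values already present:

```python
ECS_DEFAULTS = {
    "event.kind": "event",
    "event.category": "authentication",
    "event.outcome": "unknown",
    "service.name": "unknown",   # replace with your real service name
    "severity_text": "INFO",
}

def apply_ecs_defaults(attrs: dict) -> dict:
    """Fill in missing ECS fields without overwriting existing values."""
    for key, default in ECS_DEFAULTS.items():
        attrs.setdefault(key, default)
    return attrs

print(apply_ecs_defaults({"event.outcome": "failure"})["event.outcome"])  # failure
print(apply_ecs_defaults({})["event.kind"])                               # event
```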

Failed Auth Enrichment: Failed authentication attempts are automatically enriched:

  • Events matching event.type = login/authentication with event.outcome = failure/denied/failed receive:

    • security.signal="auth_failure"

    • security.priority="high"

This enrichment simplifies querying for security incidents.
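The enrichment rule amounts to a simple conditional tag, sketched here in Python (illustrative; the blueprint applies it as a processor):

```python
def enrich_failed_auth(attrs: dict) -> dict:
    """Tag failed login/authentication events so they are easy to query."""
    if (attrs.get("event.type") in {"login", "authentication"}
            and attrs.get("event.outcome") in {"failure", "denied", "failed"}):
        attrs["security.signal"] = "auth_failure"
        attrs["security.priority"] = "high"
    return attrs

print(enrich_failed_auth({"event.type": "login", "event.outcome": "failed"}))
```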

Alert Deduplication: Repeated security alerts (events with security.signal or categories like intrusion/malware/threat) are deduplicated within 60-second windows. The deduplication preserves key forensic fields:

  • event.category, event.type, event.action

  • source.ip, user.name

Duplicate occurrences are recorded in an alert_count field, allowing investigation of attack patterns without log flooding.
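The windowed deduplication can be modeled like this (a hypothetical sketch of the logic; the actual processor runs inside the Bindplane pipeline):

```python
WINDOW_SECONDS = 60
FORENSIC_KEYS = ("event.category", "event.type", "event.action",
                 "source.ip", "user.name")

def dedup(events):
    """events: iterable of (timestamp_seconds, attrs) pairs.
    Returns unique alerts; suppressed duplicates bump alert_count."""
    seen = {}   # forensic-key tuple -> (window_start, emitted record)
    out = []
    for ts, attrs in events:
        key = tuple(attrs.get(k) for k in FORENSIC_KEYS)
        if key in seen and ts - seen[key][0] < WINDOW_SECONDS:
            seen[key][1]["alert_count"] += 1   # duplicate within window
        else:
            record = dict(attrs, alert_count=1)
            seen[key] = (ts, record)
            out.append(record)
    return out

alerts = dedup([(0,  {"event.type": "denied", "source.ip": "h1"}),
                (10, {"event.type": "denied", "source.ip": "h1"}),
                (70, {"event.type": "denied", "source.ip": "h1"})])
print(len(alerts), alerts[0]["alert_count"])  # 2 2
```

The first two events fall in one 60-second window and collapse into a single alert with alert_count=2; the third opens a new window.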

Lower-Latency Batching: Security logs use smaller batches (2000-5000 rows) with a 3-second timeout, prioritizing rapid alerting over maximum throughput.

Customizing the Blueprint

Adjust the blueprint for your environment:

  • Update Service Names: Modify the service.name default to match your actual services for easier filtering

  • Adjust Dedup Window: Change the 60-second deduplication window if your security tools have different alert patterns

  • Extend Security Signals: Add custom processors to enrich additional event types with security signals (e.g., marking suspicious API calls)

  • Modify Redaction Strategy: Change from hashing to censoring (asterisks) if your compliance requirements forbid even hashed PII retention

  • Add Custom Filters: Insert processors to exclude domain-specific noise (e.g., automated health checks from monitoring tools)

Querying Security Data in ClickHouse

Query deduplicated security alerts while accounting for repetition:
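For example, an illustrative ClickHouse query, assuming the default OpenTelemetry ClickHouse exporter schema (an otel_logs table with a LogAttributes map); adjust the table and column names to your schema:

```sql
-- Top alert sources, weighted by suppressed duplicates (alert_count)
SELECT
    LogAttributes['event.category'] AS category,
    LogAttributes['source.ip']      AS src_ip,
    sum(toUInt64OrZero(LogAttributes['alert_count'])) AS total_alerts
FROM otel_logs
WHERE LogAttributes['security.signal'] != ''
GROUP BY category, src_ip
ORDER BY total_alerts DESC
LIMIT 20;
```

Summing alert_count rather than counting rows recovers the true alert volume after deduplication.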

Find failed authentication attempts:
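An illustrative query, again assuming the default OTel exporter schema (table and column names may differ in your deployment); the security.signal tag added by the enrichment step makes this a simple filter:

```sql
-- Recent failed authentication attempts (user and IP are stored hashed)
SELECT
    Timestamp,
    LogAttributes['user.name']     AS user_hash,
    LogAttributes['source.ip']     AS src_ip_hash,
    LogAttributes['event.outcome'] AS outcome
FROM otel_logs
WHERE LogAttributes['security.signal'] = 'auth_failure'
  AND Timestamp > now() - INTERVAL 1 HOUR
ORDER BY Timestamp DESC;
```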

Integration with ClickHouse

Configure your ClickHouse exporter for security data:

  • Batch size: 2000-5000 rows (smaller than general logs for lower latency)

  • Timeout: 3 seconds (shorter than general logs for rapid alerting)

  • Retry logic: Enabled with exponential backoff for reliability

  • Connection pooling: Enabled for steady throughput

The lower batch size and timeout ensure security alerts reach analysts quickly while still maintaining efficient batching.
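As a sketch, the equivalent collector-level settings might look like the following illustrative YAML; exact keys and values depend on your Bindplane and OpenTelemetry Collector versions:

```yaml
# Illustrative fragment only; verify against your collector's documentation.
processors:
  batch:
    send_batch_size: 2000        # lower than general logs for latency
    send_batch_max_size: 5000
    timeout: 3s                  # flush quickly for rapid alerting
exporters:
  clickhouse:
    endpoint: tcp://clickhouse:9000   # hypothetical endpoint
    retry_on_failure:
      enabled: true                   # exponential backoff on failures
```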

Compliance and Audit Trail

The blueprint helps maintain compliance through:

  • PII Protection: Sensitive data is hashed, complying with GDPR, CCPA, and other data protection regulations

  • Audit Trail: Preserved forensic fields (user.name, source.ip, event.action, event.outcome) enable incident investigation without storing raw PII

  • Field Normalization: ECS schema compliance simplifies audit log analysis across multi-vendor environments

Monitoring Pipeline Health

Track these metrics to ensure proper security log processing:

  • Filter Retention Rate: Security-relevant logs should never be filtered; verify 100% of authentication and malware events pass through

  • Alert Dedup Rate: Monitor alert_count distribution to detect attack patterns and alert fatigue

  • Latency: Monitor end-to-end latency from ingestion to ClickHouse (target: < 3 seconds for critical alerts)

  • Hash Consistency: Verify that the same user/IP consistently hashes to the same value for correlation


NOTE

This Blueprint has been tested against standard data patterns. You may need to adjust the configuration to match your specific data format.
