Install BDOT Collector in AWS ECS Fargate

Deploy BDOT Collector on AWS ECS Fargate for a serverless collector deployment with CloudWatch logging and a configurable number of collector instances.

This guide walks you through deploying BDOT Collector on AWS ECS using the Fargate launch type. Fargate provides serverless compute for containers, so you do not need to manage EC2 instances for your collectors.

Prerequisites

Before starting, ensure you have:

  • AWS CLI v2.x installed and configured with appropriate permissions

  • Valid AWS account with permissions to create ECS, VPC, and IAM resources

  • Bindplane Server running and accessible (self-hosted or cloud)

  • Collector secret key from your Bindplane Server

  • Basic understanding of AWS ECS, VPC, and container concepts

Quick Deployment with CloudFormation

For a quick deployment, you can use the provided CloudFormation template that creates all the necessary infrastructure automatically. This is the recommended approach for most users.

CloudFormation Template

The following CloudFormation template creates all the required AWS resources for BDOT Collector on ECS Fargate:

AWSTemplateFormatVersion: '2010-09-09'
Description: 'BDOT Collector on AWS ECS Fargate with a dedicated VPC'

Parameters:
  CollectorSecretKey:
    Type: String
    Description: BDOT Collector secret key from Bindplane Server
    NoEcho: true
  
  OpampEndpoint:
    Type: String
    Description: OpAMP endpoint URL
    Default: 'wss://app.bindplane.com/v1/opamp'
    AllowedPattern: '^(ws|wss)://.*'
  
  CollectorImage:
    Type: String
    Description: BDOT Collector Docker image
    Default: 'ghcr.io/observiq/bindplane-agent:1.84.0'
  
  Environment:
    Type: String
    Description: Environment name (used for resource naming)
    Default: prod
    AllowedValues: [dev, staging, prod]
  
  DesiredCount:
    Type: Number
    Description: Desired number of Bindplane collector instances
    Default: 1
    MinValue: 1
    MaxValue: 10

Resources:
  # VPC and Networking
  VPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: 10.0.0.0/16
      EnableDnsHostnames: true
      EnableDnsSupport: true
      Tags:
        - Key: Name
          Value: !Sub '${Environment}-bindplane-collector-vpc'

  # Internet Gateway
  InternetGateway:
    Type: AWS::EC2::InternetGateway
    Properties:
      Tags:
        - Key: Name
          Value: !Sub '${Environment}-bindplane-collector-igw'

  InternetGatewayAttachment:
    Type: AWS::EC2::VPCGatewayAttachment
    Properties:
      InternetGatewayId: !Ref InternetGateway
      VpcId: !Ref VPC

  # Public Subnets
  PublicSubnet1:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VPC
      AvailabilityZone: !Select [0, !GetAZs '']
      CidrBlock: 10.0.1.0/24
      MapPublicIpOnLaunch: true
      Tags:
        - Key: Name
          Value: !Sub '${Environment}-bindplane-collector-public-1'

  PublicSubnet2:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VPC
      AvailabilityZone: !Select [1, !GetAZs '']
      CidrBlock: 10.0.2.0/24
      MapPublicIpOnLaunch: true
      Tags:
        - Key: Name
          Value: !Sub '${Environment}-bindplane-collector-public-2'

  # Route Tables
  PublicRouteTable:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref VPC
      Tags:
        - Key: Name
          Value: !Sub '${Environment}-bindplane-collector-public-rt'

  DefaultPublicRoute:
    Type: AWS::EC2::Route
    DependsOn: InternetGatewayAttachment
    Properties:
      RouteTableId: !Ref PublicRouteTable
      DestinationCidrBlock: 0.0.0.0/0
      GatewayId: !Ref InternetGateway

  PublicSubnet1RouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      RouteTableId: !Ref PublicRouteTable
      SubnetId: !Ref PublicSubnet1

  PublicSubnet2RouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      RouteTableId: !Ref PublicRouteTable
      SubnetId: !Ref PublicSubnet2

  # Security Groups
  CollectorSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Security group for BDOT Collector
      VpcId: !Ref VPC
      # Ingress below is open to all IPv4 addresses; restrict CidrIp to trusted ranges in production
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 4317
          ToPort: 4317
          CidrIp: 0.0.0.0/0
          Description: OTLP gRPC
        - IpProtocol: tcp
          FromPort: 4318
          ToPort: 4318
          CidrIp: 0.0.0.0/0
          Description: OTLP HTTP
        - IpProtocol: tcp
          FromPort: 13133
          ToPort: 13133
          CidrIp: 0.0.0.0/0
          Description: Health check
        - IpProtocol: tcp
          FromPort: 55679
          ToPort: 55679
          CidrIp: 0.0.0.0/0
          Description: ZPages debugging
      Tags:
        - Key: Name
          Value: !Sub '${Environment}-bindplane-collector-sg'

  # IAM Roles
  TaskExecutionRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service: ecs-tasks.amazonaws.com
            Action: sts:AssumeRole
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy
      Tags:
        - Key: Name
          Value: !Sub '${Environment}-bindplane-collector-execution-role'

  TaskRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service: ecs-tasks.amazonaws.com
            Action: sts:AssumeRole
      Tags:
        - Key: Name
          Value: !Sub '${Environment}-bindplane-collector-task-role'

  # CloudWatch Log Group
  LogGroup:
    Type: AWS::Logs::LogGroup
    Properties:
      LogGroupName: !Sub '/ecs/${Environment}-bindplane-collector'
      RetentionInDays: 30

  # ECS Cluster
  ECSCluster:
    Type: AWS::ECS::Cluster
    Properties:
      ClusterName: !Sub '${Environment}-bindplane-collector-cluster'
      CapacityProviders:
        - FARGATE
      DefaultCapacityProviderStrategy:
        - CapacityProvider: FARGATE
          Weight: 1

  # ECS Task Definition
  TaskDefinition:
    Type: AWS::ECS::TaskDefinition
    Properties:
      Family: !Sub '${Environment}-bindplane-collector'
      NetworkMode: awsvpc
      RequiresCompatibilities:
        - FARGATE
      Cpu: 512
      Memory: 1024
      ExecutionRoleArn: !GetAtt TaskExecutionRole.Arn
      TaskRoleArn: !GetAtt TaskRole.Arn
      ContainerDefinitions:
        - Name: bdot-collector
          Image: !Ref CollectorImage
          PortMappings:
            - ContainerPort: 4317
              Protocol: tcp
            - ContainerPort: 4318
              Protocol: tcp
            - ContainerPort: 13133
              Protocol: tcp
            - ContainerPort: 55679
              Protocol: tcp
          Environment:
            - Name: OPAMP_ENDPOINT
              Value: !Ref OpampEndpoint
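            # Plain environment values are visible in the task definition;
            # consider the Secrets property with AWS Secrets Manager for production
            - Name: OPAMP_SECRET_KEY
              Value: !Ref CollectorSecretKey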
            - Name: OPAMP_LABELS
              Value: !Sub 'environment=${Environment},platform=aws-ecs-fargate'
            - Name: MANAGER_YAML_PATH
              Value: /etc/otel/storage/manager.yaml
          LogConfiguration:
            LogDriver: awslogs
            Options:
              awslogs-group: !Ref LogGroup
              awslogs-region: !Ref AWS::Region
              awslogs-stream-prefix: ecs

  # ECS Service
  ECSService:
    Type: AWS::ECS::Service
    Properties:
      ServiceName: !Sub '${Environment}-bindplane-collector-service'
      Cluster: !Ref ECSCluster
      TaskDefinition: !Ref TaskDefinition
      DesiredCount: !Ref DesiredCount
      LaunchType: FARGATE
      NetworkConfiguration:
        AwsvpcConfiguration:
          Subnets:
            - !Ref PublicSubnet1
            - !Ref PublicSubnet2
          SecurityGroups:
            - !Ref CollectorSecurityGroup
          AssignPublicIp: ENABLED

Outputs:
  VPCId:
    Description: VPC ID
    Value: !Ref VPC
    Export:
      Name: !Sub '${Environment}-bindplane-collector-vpc-id'

  ECSClusterName:
    Description: ECS Cluster Name
    Value: !Ref ECSCluster
    Export:
      Name: !Sub '${Environment}-bindplane-collector-cluster-name'

  ServiceName:
    Description: ECS Service Name
    Value: !GetAtt ECSService.Name
    Export:
      Name: !Sub '${Environment}-bindplane-collector-service-name'

  TaskDefinitionArn:
    Description: ECS Task Definition ARN
    Value: !Ref TaskDefinition
    Export:
      Name: !Sub '${Environment}-bindplane-collector-task-definition-arn'

Deploy with CloudFormation

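Save the template above as bindplane-collector-ecs-fargate.yaml, then deploy it as a CloudFormation stack, supplying your collector secret key as the CollectorSecretKey parameter.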
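A minimal sketch of the deployment using the AWS CLI. The function name deploy_collector_stack and the stack name you pass to it are illustrative, not part of the product. CAPABILITY_IAM is required because the template creates IAM roles.

```shell
# Deploy the saved template as a CloudFormation stack.
# Usage: deploy_collector_stack <stack-name> <collector-secret-key>
deploy_collector_stack() {
  aws cloudformation deploy \
    --template-file bindplane-collector-ecs-fargate.yaml \
    --stack-name "${1:?stack name required}" \
    --capabilities CAPABILITY_IAM \
    --parameter-overrides \
      "CollectorSecretKey=${2:?collector secret key required}" \
      "Environment=prod" \
      "DesiredCount=1"
}
```

For example, deploy_collector_stack prod-bindplane-collector "$SECRET_KEY" creates or updates the stack. Add an OpampEndpoint override when you use a self-hosted Bindplane Server.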

Architecture Overview

The deployment includes:

  • ECS Fargate Cluster: Serverless compute for BDOT Collector

  • VPC with Public Subnets: Network isolation with internet access

  • Security Groups: Controlled access to collector ports

  • CloudWatch: Monitoring and logging

  • Scaling: Configurable number of collector instances (DesiredCount), with optional Application Auto Scaling

Container Architecture

The ECS task runs a single container:

BDOT Collector (ghcr.io/observiq/bindplane-agent:1.84.0)

  • OpenTelemetry collector on ports 4317 (gRPC), 4318 (HTTP), 13133 (health), 55679 (ZPages)

  • Connects to Bindplane Server via OpAMP protocol

  • Automatically receives configurations from Bindplane Server

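  • Stores its manager.yaml configuration at /etc/otel/storage/manager.yaml (MANAGER_YAML_PATH); on Fargate this is task-local storage unless you mount a volume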

Manual Deployment Steps

If you prefer to understand each component or need custom configurations, you can follow the manual deployment steps below.

Step 1: Set Up AWS Infrastructure

Important: Follow these steps in order, as later steps depend on resources created in earlier steps.

1.1 Create VPC and Networking
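The networking from the CloudFormation template (VPC, internet gateway, two public subnets, and a public route table) could be created with the AWS CLI roughly as follows. This is a sketch under assumptions: the function name and the choice to print the resulting IDs are illustrative, and the CIDR ranges are copied from the template.

```shell
# Create the VPC, internet gateway, two public subnets, and a public route.
# Usage: create_collector_network <az-1> <az-2>
# Prints: "<vpc-id> <subnet-1-id> <subnet-2-id>"
create_collector_network() {
  az1="${1:?first availability zone required}"
  az2="${2:?second availability zone required}"

  vpc_id=$(aws ec2 create-vpc --cidr-block 10.0.0.0/16 \
    --query 'Vpc.VpcId' --output text) || return 1
  aws ec2 modify-vpc-attribute --vpc-id "$vpc_id" \
    --enable-dns-hostnames '{"Value":true}' >/dev/null

  igw_id=$(aws ec2 create-internet-gateway \
    --query 'InternetGateway.InternetGatewayId' --output text) || return 1
  aws ec2 attach-internet-gateway --vpc-id "$vpc_id" \
    --internet-gateway-id "$igw_id" >/dev/null

  subnet1_id=$(aws ec2 create-subnet --vpc-id "$vpc_id" \
    --cidr-block 10.0.1.0/24 --availability-zone "$az1" \
    --query 'Subnet.SubnetId' --output text) || return 1
  subnet2_id=$(aws ec2 create-subnet --vpc-id "$vpc_id" \
    --cidr-block 10.0.2.0/24 --availability-zone "$az2" \
    --query 'Subnet.SubnetId' --output text) || return 1

  # Route all outbound traffic from both subnets through the internet gateway.
  rtb_id=$(aws ec2 create-route-table --vpc-id "$vpc_id" \
    --query 'RouteTable.RouteTableId' --output text) || return 1
  aws ec2 create-route --route-table-id "$rtb_id" \
    --destination-cidr-block 0.0.0.0/0 --gateway-id "$igw_id" >/dev/null
  aws ec2 associate-route-table --route-table-id "$rtb_id" \
    --subnet-id "$subnet1_id" >/dev/null
  aws ec2 associate-route-table --route-table-id "$rtb_id" \
    --subnet-id "$subnet2_id" >/dev/null

  echo "$vpc_id $subnet1_id $subnet2_id"
}
```

Keep the printed IDs; the security group and ECS service steps need them.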

1.2 Create Security Group
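A sketch of the security group step, opening the same four collector ports as the template. The function name and the group name are illustrative. Unlike the template, this version takes the allowed source CIDR as an argument so you can avoid opening the ports to 0.0.0.0/0.

```shell
# Create a security group and allow the collector ports from one CIDR range.
# Usage: create_collector_security_group <vpc-id> <allowed-cidr>
# Prints the new security group ID.
create_collector_security_group() {
  vpc_id="${1:?VPC ID required}"
  allowed_cidr="${2:?allowed CIDR required}"

  sg_id=$(aws ec2 create-security-group \
    --group-name prod-bindplane-collector-sg \
    --description "Security group for BDOT Collector" \
    --vpc-id "$vpc_id" \
    --query 'GroupId' --output text) || return 1

  # 4317 OTLP gRPC, 4318 OTLP HTTP, 13133 health check, 55679 ZPages
  for port in 4317 4318 13133 55679; do
    aws ec2 authorize-security-group-ingress \
      --group-id "$sg_id" --protocol tcp --port "$port" \
      --cidr "$allowed_cidr" >/dev/null || return 1
  done

  echo "$sg_id"
}
```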

Step 2: Create ECS Resources

2.1 Create IAM Roles
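The two roles from the template (a task execution role with the AmazonECSTaskExecutionRolePolicy managed policy, and an empty task role) might be created like this. The function name and the role names are assumptions chosen to mirror the template's naming.

```shell
# Create the task execution role and the task role for the collector.
# Usage: create_collector_roles <environment>
create_collector_roles() {
  env_name="${1:?environment name required}"
  # Trust policy that lets ECS tasks assume the role.
  trust_policy='{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"Service":"ecs-tasks.amazonaws.com"},"Action":"sts:AssumeRole"}]}'

  aws iam create-role \
    --role-name "${env_name}-bindplane-collector-execution-role" \
    --assume-role-policy-document "$trust_policy" >/dev/null || return 1
  aws iam attach-role-policy \
    --role-name "${env_name}-bindplane-collector-execution-role" \
    --policy-arn arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy || return 1

  aws iam create-role \
    --role-name "${env_name}-bindplane-collector-task-role" \
    --assume-role-policy-document "$trust_policy" >/dev/null
}
```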

2.2 Create CloudWatch Log Group
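A short sketch of the log group step, using the 30-day retention from the template. The function name is illustrative.

```shell
# Create the collector log group and keep logs for 30 days.
# Usage: create_collector_log_group <log-group-name>
create_collector_log_group() {
  log_group="${1:?log group name required}"
  aws logs create-log-group --log-group-name "$log_group" || return 1
  aws logs put-retention-policy --log-group-name "$log_group" \
    --retention-in-days 30
}
```

The template names the group /ecs/prod-bindplane-collector for the prod environment.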

2.3 Create ECS Cluster
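A sketch of the cluster step, with FARGATE as the default capacity provider as in the template. The function name is illustrative.

```shell
# Create an ECS cluster that defaults to the FARGATE capacity provider.
# Usage: create_collector_cluster <cluster-name>
create_collector_cluster() {
  aws ecs create-cluster \
    --cluster-name "${1:?cluster name required}" \
    --capacity-providers FARGATE \
    --default-capacity-provider-strategy capacityProvider=FARGATE,weight=1
}
```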

Step 3: Create Task Definition
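The task definition from the template could be registered from the CLI as sketched below. The function name, the argument order, and the hard-coded prod names are assumptions; the image, ports, and environment variables are copied from the template. The secret key is spliced into JSON as-is, so this sketch assumes it contains no quotes or backslashes.

```shell
# Register a Fargate task definition for the collector.
# Usage: register_collector_task <execution-role-arn> <task-role-arn> <secret-key> <region>
register_collector_task() {
  exec_role_arn="${1:?execution role ARN required}"
  task_role_arn="${2:?task role ARN required}"
  secret_key="${3:?collector secret key required}"
  region="${4:?AWS region required}"

  # Container definition as JSON; %s placeholders take the key and region.
  container_defs=$(printf '[
  {
    "name": "bdot-collector",
    "image": "ghcr.io/observiq/bindplane-agent:1.84.0",
    "essential": true,
    "portMappings": [
      {"containerPort": 4317, "protocol": "tcp"},
      {"containerPort": 4318, "protocol": "tcp"},
      {"containerPort": 13133, "protocol": "tcp"},
      {"containerPort": 55679, "protocol": "tcp"}
    ],
    "environment": [
      {"name": "OPAMP_ENDPOINT", "value": "wss://app.bindplane.com/v1/opamp"},
      {"name": "OPAMP_SECRET_KEY", "value": "%s"},
      {"name": "OPAMP_LABELS", "value": "environment=prod,platform=aws-ecs-fargate"},
      {"name": "MANAGER_YAML_PATH", "value": "/etc/otel/storage/manager.yaml"}
    ],
    "logConfiguration": {
      "logDriver": "awslogs",
      "options": {
        "awslogs-group": "/ecs/prod-bindplane-collector",
        "awslogs-region": "%s",
        "awslogs-stream-prefix": "ecs"
      }
    }
  }
]' "$secret_key" "$region")

  aws ecs register-task-definition \
    --family prod-bindplane-collector \
    --network-mode awsvpc \
    --requires-compatibilities FARGATE \
    --cpu 512 --memory 1024 \
    --execution-role-arn "$exec_role_arn" \
    --task-role-arn "$task_role_arn" \
    --container-definitions "$container_defs"
}
```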

Step 4: Create ECS Service
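A sketch of the service step, assuming the task definition family prod-bindplane-collector already exists and that you have the subnet and security group IDs from Step 1. The function name and the service name are illustrative.

```shell
# Create the Fargate service that runs the collector task.
# Usage: create_collector_service <cluster> <subnet-1-id> <subnet-2-id> <security-group-id>
create_collector_service() {
  cluster="${1:?cluster name required}"
  subnet1="${2:?first subnet ID required}"
  subnet2="${3:?second subnet ID required}"
  sg_id="${4:?security group ID required}"

  aws ecs create-service \
    --cluster "$cluster" \
    --service-name prod-bindplane-collector-service \
    --task-definition prod-bindplane-collector \
    --desired-count 1 \
    --launch-type FARGATE \
    --network-configuration "awsvpcConfiguration={subnets=[$subnet1,$subnet2],securityGroups=[$sg_id],assignPublicIp=ENABLED}"
}
```

assignPublicIp=ENABLED matches the template: tasks in public subnets need a public IP to pull the image and reach the Bindplane Server.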

Configuration and Management

Connecting to Bindplane Server

  1. Get your collector secret key from your Bindplane Server:

    • Navigate to Agents → Install Agent

    • Choose Linux platform

    • Copy the secret-key

  2. Update the OpAMP endpoint in your task definition:

    • For Bindplane Cloud: wss://app.bindplane.com/v1/opamp

    • For self-hosted: ws://your-server:3001/v1/opamp (or wss:// with TLS)

  3. Update the task definition with your secret key:
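For a manual deployment, one way to apply a changed endpoint or secret key is to register a new task definition revision and then roll the service onto it. The sketch below covers only the rollout; the function name is illustrative. For a CloudFormation deployment, redeploying the stack with new OpampEndpoint or CollectorSecretKey parameter values has the same effect.

```shell
# Point the service at a task definition (family or family:revision)
# and start a new deployment so running tasks are replaced.
# Usage: roll_collector_service <cluster> <service> <task-definition>
roll_collector_service() {
  aws ecs update-service \
    --cluster "${1:?cluster name required}" \
    --service "${2:?service name required}" \
    --task-definition "${3:?task definition required}" \
    --force-new-deployment
}
```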

Scaling Collectors

Manual Scaling
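Manual scaling amounts to changing the service's desired count. A minimal sketch, with an illustrative function name:

```shell
# Set the number of running collector tasks.
# Usage: scale_collectors <cluster> <service> <desired-count>
scale_collectors() {
  aws ecs update-service \
    --cluster "${1:?cluster name required}" \
    --service "${2:?service name required}" \
    --desired-count "${3:?desired count required}"
}
```

For the CloudFormation deployment with Environment=prod, the cluster is prod-bindplane-collector-cluster and the service is prod-bindplane-collector-service. Note that a later stack update resets the count to the DesiredCount parameter.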

Auto Scaling with Application Auto Scaling
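A sketch of target-tracking auto scaling on average CPU utilization. The function name, the policy name, the 1 to 10 task range (taken from the template's DesiredCount bounds), and the 70% target are assumptions to adjust for your load.

```shell
# Register the service with Application Auto Scaling and attach a
# target-tracking policy on average CPU utilization.
# Usage: enable_collector_autoscaling <cluster> <service>
enable_collector_autoscaling() {
  resource_id="service/${1:?cluster name required}/${2:?service name required}"

  aws application-autoscaling register-scalable-target \
    --service-namespace ecs \
    --scalable-dimension ecs:service:DesiredCount \
    --resource-id "$resource_id" \
    --min-capacity 1 --max-capacity 10 || return 1

  aws application-autoscaling put-scaling-policy \
    --service-namespace ecs \
    --scalable-dimension ecs:service:DesiredCount \
    --resource-id "$resource_id" \
    --policy-name collector-cpu-target-tracking \
    --policy-type TargetTrackingScaling \
    --target-tracking-scaling-policy-configuration \
      '{"TargetValue":70.0,"PredefinedMetricSpecification":{"PredefinedMetricType":"ECSServiceAverageCPUUtilization"}}'
}
```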

Monitoring and Logging

CloudWatch Logs
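Collector output goes to the log group configured in the task definition. A sketch for following it with AWS CLI v2; the function name is illustrative and the default group name assumes Environment=prod.

```shell
# Stream collector logs from the last hour and keep following new events.
# Usage: tail_collector_logs [log-group-name]
tail_collector_logs() {
  aws logs tail "${1:-/ecs/prod-bindplane-collector}" --since 1h --follow
}
```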

CloudWatch Metrics

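ECS publishes service metrics to CloudWatch:

  • CPU and memory utilization (published automatically)

  • Task count and health (available with Container Insights enabled on the cluster)

  • Network I/O (available with Container Insights enabled on the cluster)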

Health Checks

The collector exposes a health check endpoint on port 13133 and ZPages debugging pages on port 55679. From inside the task, these are:

  • Health endpoint: http://localhost:13133/

  • ZPages debugging: http://localhost:55679/
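The health endpoint can be probed with curl. In this sketch the function name is illustrative; the host defaults to localhost, which only works from inside the task, so pass the task's IP address when probing from elsewhere (the security group must allow port 13133 from your source address).

```shell
# Probe the collector health endpoint; exits non-zero on HTTP errors or timeout.
# Usage: check_collector_health [host]
check_collector_health() {
  curl --fail --silent --show-error --max-time 5 \
    "http://${1:-localhost}:13133/"
}
```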

TLS Configuration

For Self-Hosted Bindplane with TLS

If your Bindplane Server uses TLS with a custom CA:

Then update your task definition to include:

Troubleshooting

Common Issues

Collector Not Appearing in Bindplane

  1. Check OpAMP endpoint: Ensure the endpoint URL is correct

  2. Verify secret key: Make sure the secret key matches your Bindplane Server

  3. Check network connectivity: Ensure the collector can reach the Bindplane Server

  4. Review logs: Check CloudWatch logs for connection errors

High CPU/Memory Usage

  1. Scale up resources: Increase CPU/memory in task definition

  2. Scale out: Add more collector instances

  3. Optimize configuration: Review collector configuration for efficiency

Network Issues

  1. Check security groups: Ensure all required ports are open

  2. Verify VPC configuration: Ensure subnets have internet access

  3. Test connectivity: Confirm outbound access to the OpAMP endpoint; use VPC endpoints if needed

Best Practices

  1. Resource Sizing: Start with 512 CPU units (0.5 vCPU) and 1024 MiB of memory, then adjust based on load

  2. Scaling: Use Application Auto Scaling for automatic scaling

  3. Monitoring: Set up CloudWatch alarms for key metrics

  4. Security: Use IAM roles with minimal required permissions

  5. Logging: Enable detailed logging for troubleshooting

  6. Updates: Regularly update collector image versions

Cleanup

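To remove all resources created by this guide, delete the CloudFormation stack. If you deployed manually, delete the resources in reverse order of creation (service, task definition, cluster, log group, IAM roles, security group, networking).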
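For the CloudFormation deployment, a sketch of the cleanup; the function name is illustrative and the stack name is whatever you chose at deploy time.

```shell
# Delete the stack and wait until deletion finishes.
# Usage: delete_collector_stack <stack-name>
delete_collector_stack() {
  stack_name="${1:?stack name required}"
  aws cloudformation delete-stack --stack-name "$stack_name" &&
    aws cloudformation wait stack-delete-complete --stack-name "$stack_name"
}
```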

Next Steps

After successfully deploying your BDOT Collector:

  1. Verify connection in your Bindplane Server UI

  2. Create configurations for your collectors

  3. Set up monitoring and alerting

  4. Configure auto-scaling based on your needs

  5. Review security settings and access controls
