Deploy with Docker

Docker is the fastest way to deploy Bufstream, whether you need an in-memory broker for testing and development or a persistent environment using an existing storage bucket and Postgres database.

For production Kubernetes deployments with Helm, we provide deployment guides for AWS, Google Cloud, and Azure. For a full-stack local environment, we also provide a Docker Compose example.

In-memory deployment

You can run a Bufstream broker suitable for local testing and development with one line:

sh
docker run --network host bufbuild/bufstream serve --inmemory

This creates an ephemeral instance listening for Kafka connections on port 9092 and admin API requests on port 9089. Once it's stopped, all data is lost.
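
To verify the broker is up, you can connect with any Kafka-compatible client. As a quick check, assuming you have kcat installed, listing cluster metadata should show a single broker:

sh
# Ask the broker for its cluster metadata over the Kafka protocol.
kcat -b localhost:9092 -L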

Deploying with existing resources

For a long-lived broker, Bufstream requires the following:

  1. Object storage such as AWS S3, Google Cloud Storage, or Azure Blob Storage.
  2. A metadata storage service such as PostgreSQL.

Follow the instructions below to run Bufstream with an existing storage bucket and Postgres database.

Create a bufstream.yaml file providing bucket configuration and Postgres connection information:

bufstream.yaml

yaml
storage:
  provider: S3
  region: <region>
  bucket: <bucket-name>
  access_key_id:
    string: <S3 access key id>
  secret_access_key:
    string: <S3 secret access key>
postgres:
  dsn:
    string: postgresql://<postgres-user>:<postgres-password>@<postgres-host>:<postgres-port>/<database-name>

It's never a good idea to commit credentials, so be sure to follow your organization's policies before adding configuration files like bufstream.yaml to version control.
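
One simple safeguard, assuming this directory is a Git repository, is to keep the file out of version control entirely:

sh
# Exclude the credentials file from version control (assumes a Git repo).
echo 'bufstream.yaml' >> .gitignore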

Now that you have a configuration file, use Docker to start Bufstream. This command uses -v to mount bufstream.yaml into the container and the --config flag to point bufstream serve at it.

text
$ docker run \
    -v $PWD/bufstream.yaml:/bufstream.yaml \
    --network host \
    bufbuild/bufstream serve \
    --config /bufstream.yaml

This creates a broker listening for Kafka connections on port 9092 and admin API requests on port 9089. It's safe to stop this instance — all of its topic data is stored in the bucket you configured, and its metadata state is stored in Postgres.
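
To see the persistence in action, you can produce a message, restart the container, and read the message back. A minimal sketch using kcat, assuming topic auto-creation is enabled in your setup (otherwise, create the topic first):

sh
# Produce a single message to a test topic...
echo 'hello bufstream' | kcat -b localhost:9092 -t smoke-test -P
# ...then consume it from the beginning, exiting at the end of the topic.
kcat -b localhost:9092 -t smoke-test -C -e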

Bucket permissions

Bufstream uses an S3 bucket for object storage and needs permission to perform the following operations on it:

  • s3:GetObject: Read existing objects
  • s3:PutObject: Create new objects
  • s3:DeleteObject: Remove old objects according to retention and compaction rules
  • s3:ListBucket: List objects in the bucket
  • s3:AbortMultipartUpload: Abort multipart uploads that can't complete

For more information about S3 bucket permissions and actions, consult the AWS S3 documentation.
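
As an illustrative sketch only (the role name, policy name, and bucket are placeholders, and your organization may prefer managed policies or infrastructure-as-code), these permissions could be granted with the AWS CLI:

sh
# Attach an inline policy with Bufstream's required S3 actions.
# Object-level actions target the bucket's contents; s3:ListBucket targets the bucket itself.
aws iam put-role-policy \
    --role-name bufstream-broker \
    --policy-name bufstream-s3-access \
    --policy-document '{
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "s3:GetObject",
            "s3:PutObject",
            "s3:DeleteObject",
            "s3:AbortMultipartUpload"
          ],
          "Resource": "arn:aws:s3:::<bucket-name>/*"
        },
        {
          "Effect": "Allow",
          "Action": "s3:ListBucket",
          "Resource": "arn:aws:s3:::<bucket-name>"
        }
      ]
    }'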

Postgres role

Bufstream needs full access to its Postgres database so that it can manage its metadata schema.
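
As a minimal sketch, assuming a dedicated database and role both named bufstream (the host and password are placeholders), you could provision this access with psql:

sh
# Create a dedicated role, then a database it owns; ownership grants full access.
psql -h <postgres-host> -U postgres -c "CREATE ROLE bufstream WITH LOGIN PASSWORD '<postgres-password>';"
psql -h <postgres-host> -U postgres -c "CREATE DATABASE bufstream OWNER bufstream;"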

Network ports

If you're not running Bufstream locally, the following ports need to be open to allow Kafka clients and admin API requests to connect:

  • Kafka traffic: Defaults to 9092. Change this by setting kafka.address.port in bufstream.yaml.
  • Admin API traffic: Defaults to 9089. Change this by setting kafka.admin_address.port in bufstream.yaml.
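
If you'd rather publish individual ports than use --network host, Docker's -p flag maps them. A sketch using the default ports:

sh
# Publish the Kafka and admin API ports instead of sharing the host's network.
docker run \
    -p 9092:9092 \
    -p 9089:9089 \
    -v $PWD/bufstream.yaml:/bufstream.yaml \
    bufbuild/bufstream serve \
    --config /bufstream.yaml

Because Kafka clients reconnect to the address the broker advertises, a port mapping like this may also require configuring the broker's publicly reachable address in bufstream.yaml, depending on your network.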

Other considerations

For additional configuration topics like instance types and sizes, metadata storage configuration, and autoscaling, see Cluster recommendations.

When running in Kubernetes, Bufstream supports workload identity federation within AWS, GCP, or Azure. It also supports GCP Cloud SQL IAM users. Refer to the cloud provider deployment guides for more information.

Deploying with Docker Compose

We also provide a full-stack Docker Compose example that sets up MinIO, PostgreSQL, and Bufstream for you.