Deploy with Docker
Docker is the fastest way to deploy Bufstream, whether you need an in-memory broker for testing and development or a persistent environment using an existing storage bucket and Postgres database.
For production Kubernetes deployments with Helm, we provide deployment guides for AWS, Google Cloud, and Azure. For a full-stack local environment, we also provide a Docker Compose example.
In-memory deployment
You can run a Bufstream broker suitable for local testing and development with one line:
```sh
$ docker run --network host bufbuild/bufstream serve --inmemory
```
This creates an ephemeral instance listening for Kafka connections on port 9092 and admin API requests on port 9089. Once it's stopped, all data is lost.
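To verify that the broker is reachable, you can point any Kafka client at it. For example, if you happen to have kcat installed (shown here only as an optional check, not part of Bufstream), listing cluster metadata should return the broker:

```sh
# List brokers and topics from the local Bufstream instance
kcat -b localhost:9092 -L
```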
Deploying with existing resources
For a long-lived broker, Bufstream requires the following:
- Object storage such as AWS S3, Google Cloud Storage, or Azure Blob Storage.
- A metadata storage service such as PostgreSQL.
Follow the instructions below to run Bufstream with an existing storage bucket and Postgres database.
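If you don't already have these resources and only want to walk through the steps locally, one option is to run a disposable Postgres instance with Docker. The container name, password, and database below are placeholders, not Bufstream requirements:

```sh
# Throwaway Postgres for local experimentation only
docker run -d --name bufstream-postgres \
  -e POSTGRES_PASSWORD=changeme \
  -e POSTGRES_DB=bufstream \
  -p 5432:5432 \
  postgres:16
```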
Create a `bufstream.yaml` file providing bucket configuration and Postgres connection information:
```yaml
# bufstream.yaml
storage:
  provider: S3
  region: <region>
  bucket: <bucket-name>
  access_key_id:
    string: <S3 access key id>
  secret_access_key:
    string: <S3 secret access key>
postgres:
  dsn:
    string: postgresql://<postgres-user>:<postgres-password>@<postgres-host>:<postgres-port>/<database-name>
```
It's never a good idea to commit credentials, so be sure to follow your organization's policies before adding configuration files like `bufstream.yaml` to version control.
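If a copy of the file with real credentials lives inside a repository checkout, one simple guard is to exclude it from version control:

```
# .gitignore: keep the local Bufstream config with credentials out of the repository
bufstream.yaml
```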
Now that you have a configuration file, use Docker to start Bufstream. Note that this command uses `-v` to mount the `bufstream.yaml` file and the `--config` flag to specify the file for `bufstream serve`.
```sh
$ docker run \
    -v $PWD/bufstream.yaml:/bufstream.yaml \
    --network host \
    bufbuild/bufstream serve \
    --config /bufstream.yaml
```
This creates a broker listening for Kafka connections on port 9092 and admin API requests on port 9089. It's safe to stop this instance: all of its topic data is stored in the bucket you configured, and its metadata state is stored in Postgres.
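As a quick smoke test, you can produce and consume a message with a Kafka client such as kcat. The topic name here is just an example, and you may need to create the topic first if your cluster doesn't allow auto-creation:

```sh
# Produce a single message to a test topic
echo "hello bufstream" | kcat -b localhost:9092 -t smoke-test -P

# Read it back, exiting at the end of the partition
kcat -b localhost:9092 -t smoke-test -C -e
```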
Bucket permissions
Bufstream uses an S3 bucket for object storage and needs permission to perform the following operations on it:
- `s3:GetObject`: Read existing objects
- `s3:PutObject`: Create new objects
- `s3:DeleteObject`: Remove old objects according to retention and compaction rules
- `s3:ListBucket`: List objects in the bucket
- `s3:AbortMultipartUpload`: Abort multipart uploads that won't succeed
For more information about S3 bucket permissions and actions, consult the AWS S3 documentation.
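As an illustration only (not an official policy), an IAM policy granting exactly these actions might look like the following; replace `<bucket-name>` with the bucket from your configuration:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": "arn:aws:s3:::<bucket-name>"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:AbortMultipartUpload"
      ],
      "Resource": "arn:aws:s3:::<bucket-name>/*"
    }
  ]
}
```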
Postgres role
Bufstream needs full access to its Postgres database so that it can manage its metadata schema.
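One way to provision this (a sketch only; the role, password, and database names are placeholders) is to create a dedicated role that owns a dedicated database:

```sql
-- Create a dedicated role and give it its own database
CREATE ROLE bufstream WITH LOGIN PASSWORD 'changeme';
CREATE DATABASE bufstream OWNER bufstream;
```

The `postgres.dsn` value in `bufstream.yaml` would then point at that database, for example `postgresql://bufstream:changeme@<postgres-host>:5432/bufstream`.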
Network ports
If you're not running Bufstream locally, the following ports need to be open to allow Kafka clients and admin API requests to connect:
- Kafka traffic: Defaults to port 9092. Change this by setting `kafka.address.port` in `bufstream.yaml`.
- Admin API traffic: Defaults to port 9089. Change this by setting `kafka.admin_address.port` in `bufstream.yaml` (see the sketch after this list).
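Based on the key paths above, overriding both ports in `bufstream.yaml` would look roughly like this (a sketch, shown with the default values):

```yaml
kafka:
  address:
    port: 9092
  admin_address:
    port: 9089
```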
Other considerations
For additional configuration topics like instance types and sizes, metadata storage configuration, and autoscaling, see Cluster recommendations.
When running in Kubernetes, Bufstream supports workload identity federation within AWS, GCP, or Azure. It also supports GCP Cloud SQL IAM users. Refer to cloud provider deployment guides for more information.
Deploying with Docker Compose
We also provide a full-stack Docker Compose example that sets up MinIO, PostgreSQL, and Bufstream for you.