Inbox listener configuration
Configure the inbox listener
The inbox listener processes recordings from the inbox bucket to the lake bucket for indexing, storage, and access.
It uses a combination of compute, memory, and disk resources and can be scaled horizontally (see Autoscaling). In many cases, the default configuration is appropriate and does not need to be changed.
Local scratch storage
If your Primary Site is expected to process unindexed MCAP or ROS 1 BAG files, consider configuring your inbox listener deployment with local scratch storage. We recommend allocating at least three times as much capacity as your largest expected input file size. If local scratch storage is not configured, or if processing an input might exceed its capacity, the inbox listener will use the lake bucket as scratch storage instead, which is usually slower.
Scratch storage can be configured in your Helm values file:
inboxListener:
  deployment:
    localScratch:
      enabled: true
      capacity: "107374182400" # a string specifying a whole number of bytes (here, 100 GiB)
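As a rough illustration of the sizing recommendation above, the following sketch computes a capacity string from the largest expected input file. The function name `scratch_capacity_bytes` and the 50 GiB example size are assumptions made for illustration; they are not part of the chart.

```python
GIB = 1024 ** 3  # bytes in one gibibyte


def scratch_capacity_bytes(largest_input_bytes: int, factor: int = 3) -> str:
    """Return a capacity string of `factor` times the largest expected input.

    The value is returned as a string holding a whole number of bytes,
    matching the format of the `capacity` field in the values file.
    """
    if largest_input_bytes <= 0:
        raise ValueError("largest_input_bytes must be positive")
    if factor < 1:
        raise ValueError("factor must be at least 1")
    return str(largest_input_bytes * factor)


# Hypothetical example: the largest expected input file is 50 GiB.
print(scratch_capacity_bytes(50 * GIB))  # 161061273600
```

The factor of three is the recommended minimum; a larger factor leaves more headroom.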
Processing very large files
While we support processing very large files, we generally recommend avoiding large (50GB+) files to reduce the chance of upload failures or processing issues. If you have tooling that generates very large files, consider splitting them up into smaller chunks. AWS and GCP both have a file size limit of 5TB. Azure supports files up to 4TB.
If your Primary Site is hosted on AWS and you expect to process files larger than 1TB, you will need to increase the copy part size that is used to upload the data to your lake bucket.
The default copy part size is 100MB. We recommend increasing it to roughly max file size / 10000, which keeps the copy within the 10,000-part limit of an S3 multipart upload. For example, if you expect to process files up to 4TB, you should set the copy part size to around 400MB (or 419430400 bytes).
The copy part size can be set in your Helm values file:
inboxListener:
  deployment:
    env:
      # AWS_COPY_PART_SIZE_BYTES: size in bytes of each part in a
      # multipart copy (default 100MB)
      - name: AWS_COPY_PART_SIZE_BYTES
        value: "104857600"
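The "max file size / 10000" rule above can be worked through with a small sketch. The helper name `copy_part_size_bytes` is hypothetical, and the choice to never go below the 100MB default is an assumption made here rather than documented behavior.

```python
MAX_PARTS = 10_000               # divisor from the sizing rule above
DEFAULT_PART_SIZE = 104_857_600  # default copy part size (100MB) in bytes


def copy_part_size_bytes(max_file_bytes: int) -> int:
    """Return a copy part size, in bytes, for the largest expected file.

    Divides the file size by MAX_PARTS, rounding up with integer
    arithmetic, and never returns less than the default part size.
    """
    if max_file_bytes <= 0:
        raise ValueError("max_file_bytes must be positive")
    needed = (max_file_bytes + MAX_PARTS - 1) // MAX_PARTS  # ceiling division
    return max(DEFAULT_PART_SIZE, needed)


# A 4TB file (4 * 10**12 bytes) needs parts of at least 400,000,000 bytes.
print(copy_part_size_bytes(4 * 10**12))    # 400000000
# A 500GB file fits within the default part size.
print(copy_part_size_bytes(500 * 10**9))   # 104857600
```

The result is a lower bound; rounding up to a convenient value such as 419430400 bytes, as in the example above, also satisfies the rule.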