Inbox listener configuration

Configure the inbox listener

The inbox listener processes recordings from the inbox bucket and writes them to the lake bucket for indexing, storage, and access.

It uses a combination of compute, memory, and disk resources and can be scaled horizontally (see Autoscaling). In many cases, the default configuration is appropriate and does not need to be changed.

Local scratch storage

If your Primary Site is expected to process unindexed MCAP or ROS 1 BAG files, consider configuring your inbox listener deployment with local scratch storage. We recommend allocating at least three times the size of your largest expected input file. If local scratch storage is not configured, or if processing an input might exceed its capacity, the inbox listener will use the lake bucket as scratch storage instead, which is usually slower.

Scratch storage can be configured in your helm values file:

inboxListener:
  deployment:
    localScratch:
      enabled: true
      capacity: "107374182400" # 100 GiB, as a string specifying a whole number of bytes.
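
As a worked example of the three-times guideline, suppose your largest expected input file is 50 GiB (a hypothetical figure chosen only for illustration). Three times that is 150 GiB, or 150 x 1073741824 = 161061273600 bytes, so the values might look like this sketch:

```yaml
inboxListener:
  deployment:
    localScratch:
      enabled: true
      # Assumed largest input file: 50 GiB (hypothetical).
      # 3 x 50 GiB = 150 GiB = 150 x 1073741824 bytes.
      capacity: "161061273600"
```

Larger inputs than the one you sized for can still be processed, but they fall back to the lake bucket for scratch storage.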

Processing very large files

While we support processing very large files, we generally recommend avoiding large (50GB+) files to reduce the chance of upload failures or processing issues. If you have tooling that generates very large files, consider splitting them up into smaller chunks. AWS and GCP both have a file size limit of 5TB. Azure supports files up to 4TB.

If your Primary Site is hosted on AWS and you expect to process files larger than 1TB, you will need to increase the copy part size used to upload data to your lake bucket.

The default copy part size is 100MB. Because S3 multipart uploads are limited to 10,000 parts, we recommend increasing this to roughly max file size / 10000. For example, if you expect to process files up to 4TB, set the copy part size to around 400MB (or 419430400 bytes).

This can be done in your helm values file:

inboxListener:
  deployment:
    env:
      # AWS_COPY_PART_SIZE_BYTES: size in bytes of each part in a multipart
      # copy (default 100MB)
      - name: AWS_COPY_PART_SIZE_BYTES
        value: "104857600"
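
For the 4TB case described above, the same setting with the suggested part size might look like the following sketch. The value is the 400MB figure from the text (400 x 1048576 = 419430400 bytes). The sketch assumes `env` sits under `deployment`, as `localScratch` does, and quotes the value because Kubernetes environment variable values are strings.

```yaml
inboxListener:
  deployment:
    env:
      # Sized for files up to about 4TB: max file size / 10000 is roughly 400MB.
      # 400 x 1048576 bytes = 419430400 bytes.
      - name: AWS_COPY_PART_SIZE_BYTES
        value: "419430400"
```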

Mitigating failures caused by lake bucket rate limits

When the inbox listener encounters rate-limit errors, it retries the import a limited number of times. If you import many recordings concurrently, processing may still fail because of rate-limit errors from your object storage provider. To mitigate this, reduce the maximum number of worker threads used to write files to the lake bucket in your helm values file:

inboxListener:
  deployment:
    env:
      - name: MAX_FINALIZATION_WORKER_THREADS
        # Defaults to 40 concurrent writer threads. When set to 0, one worker is used per input file topic.
        value: "40"
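
The snippet above shows the default. A reduced setting might look like the sketch below. The value 10 is purely illustrative, not a recommended figure; a reasonable approach is to lower the value step by step until the rate-limit failures stop. As before, placing `env` under `deployment` is an assumption based on the `localScratch` example.

```yaml
inboxListener:
  deployment:
    env:
      - name: MAX_FINALIZATION_WORKER_THREADS
        # Hypothetical reduced value: fewer concurrent writers means fewer
        # simultaneous requests to the lake bucket. The default is 40.
        value: "10"
```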
