Autoscaling

Use the following HorizontalPodAutoscaler (HPA) definitions to autoscale your self-hosted Primary Site services. Treat the replica counts and CPU targets as starting points and adjust them for your specific workload.

Stream service

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: stream-service
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: stream-service
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80

Site controller

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: site-controller
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: site-controller
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80

Inbox listener

As of release 0.0.47, the Helm chart supports built-in autoscaling as an option. It requires KEDA to be installed in the cluster before it is enabled. Before switching to the built-in autoscaling, disable or remove any previous inbox-listener autoscaling setup.

Installing KEDA

Install KEDA from its Helm chart before installing or upgrading with built-in autoscaling enabled.

helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda --namespace keda --create-namespace

Configuring autoscaling

The Helm chart exposes the following values, shown with their defaults, to adjust autoscaling settings:

inboxListener:
  autoscaling:
    enabled: false
    # minReplicas can be raised if incoming files take longer than desired to start processing
    # 1 is a good default for almost all use-cases
    minReplicas: 1
    # maxReplicas can be raised if you constantly have a very large number of incoming files to process
    # it should be set to a value that allows your site to process incoming files at peak load
    maxReplicas: 10
    # This value, supplied as a duration string (https://pkg.go.dev/time#ParseDuration), determines how long a pod will
    # wait for new work items. It is unlikely that this value needs to be changed. It should only be set when
    # using this built-in autoscaling.
    maxWaitForWork: "30s"

To enable autoscaling, run helm upgrade or helm install with the flag --set inboxListener.autoscaling.enabled=true, or set inboxListener.autoscaling.enabled to true in your values YAML file.
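As a minimal sketch of the values-file route, the fragment below overrides only the enabled flag and leaves minReplicas, maxReplicas, and maxWaitForWork at their chart defaults. Where you keep this fragment (for example, a values file passed to helm with -f) depends on your own deployment setup and is an assumption here.

```yaml
# Values override: turn on the built-in, KEDA-based inbox listener autoscaling.
# All other autoscaling settings keep the chart defaults listed above.
inboxListener:
  autoscaling:
    enabled: true
```

KEDA must already be installed in the cluster when this override is applied.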

The built-in autoscaling is unique in that it is based on a KEDA ScaledJob. With this type of scaling, KEDA creates a new inbox listener pod whenever there are outstanding inbox data files to process. If there are none, the created pods exit after a configurable timeout, maxWaitForWork. KEDA ensures that between minReplicas and maxReplicas pods are running.

Alternatives

If desired, inbox-listener can instead be scaled with a HorizontalPodAutoscaler and a custom metric. For this, configure Prometheus to expose the metric to Kubernetes through the custom metrics API, and ensure that built-in autoscaling is disabled (the default value).

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: inbox-listener
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: inbox-listener
  minReplicas: 1
  maxReplicas: 3
  metrics:
    - type: Object
      object:
        describedObject:
          kind: Namespace
          name: foxglove
          apiVersion: v1
        metric:
          name: foxglove_data_platform_site_controller_unleased_pending_import_count
        target:
          type: AverageValue
          averageValue: 2
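
The HPA above can only read this metric if something serves it through the Kubernetes custom metrics API. One common way to do that is the Prometheus Adapter project, which this page does not prescribe. The rule below is a sketch for that adapter's configuration file under two assumptions: Prometheus already scrapes the site controller, and the series carries a namespace label. Verify both against your own setup before using it.

```yaml
# Prometheus Adapter rule (assumed tooling, not part of the Foxglove chart).
rules:
  # Select the pending-import series; assumes it has a "namespace" label.
  - seriesQuery: 'foxglove_data_platform_site_controller_unleased_pending_import_count{namespace!=""}'
    resources:
      overrides:
        # Map the Prometheus "namespace" label to the Kubernetes Namespace resource,
        # matching describedObject.kind: Namespace in the HPA.
        namespace: {resource: "namespace"}
    name:
      # Expose the metric under its original name.
      matches: "^(.*)$"
      as: "${1}"
    # max avoids double counting if several replicas report the same total
    # (an assumption about how the metric is reported).
    metricsQuery: 'max(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)'
```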
