Autoscaling

Use the following HorizontalPodAutoscaler (HPA) definitions to autoscale your self-hosted Primary Site services – be sure to revise configurations based on your specific workload.

Stream service

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: stream-service
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: stream-service
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80

Site controller

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: site-controller
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: site-controller
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80

Inbox listener

As of release 0.0.47, the Helm chart supports built-in autoscaling as an option. KEDA must be installed in the cluster before this option is enabled. Before using the built-in autoscaling, disable or remove any previous inbox-listener autoscaling setup.

Installing KEDA

Install KEDA from its Helm chart before installing or upgrading with built-in autoscaling enabled.

helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda --namespace keda --create-namespace

Configuring autoscaling

The following Helm chart values and defaults are now exposed to adjust autoscaling settings:

inboxListener:
  autoscaling:
    enabled: false
    # minReplicas can be raised if time to start processing incoming files is slower than desired
    # 1 is a good default for almost all use-cases
    minReplicas: 1
    # maxReplicas can be raised if you constantly have a very large number of incoming files to process
    # it should be set to a value that allows your site to process incoming files at peak load
    maxReplicas: 10
    # This value, supplied as a duration string (https://pkg.go.dev/time#ParseDuration), determines how long a pod will
    # wait for new work items. It is unlikely that this value should be changed. The value should only be set when
    # using this autoscaling.
    maxWaitForWork: "30s"

To enable autoscaling, run helm upgrade or helm install with the flag --set inboxListener.autoscaling.enabled=true or enable it in your values YAML file.
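As a minimal sketch of the values-file route, the fragment below overrides only the keys listed above. The file name `values.yaml` is an assumption, and the `maxReplicas` override is illustrative rather than a recommendation.

```yaml
# values.yaml (file name is an assumption; use your own values file)
inboxListener:
  autoscaling:
    # Turn on the built-in, KEDA-based autoscaling (chart default is false)
    enabled: true
    # Optional, illustrative override; omit to keep the chart default of 10
    maxReplicas: 20
```

Keys that are omitted, such as minReplicas and maxWaitForWork, keep the chart defaults shown above.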

The built-in autoscaling is unique in that it is based on a KEDA ScaledJob. With this type of scaling, KEDA creates a new inbox listener pod whenever there are outstanding inbox data files to process. If there are none, the pods exit after a configurable timeout, maxWaitForWork. KEDA ensures that between minReplicas and maxReplicas pods are running.
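To make the ScaledJob model more concrete, here is a hand-written sketch of what a KEDA ScaledJob can look like. It is not the manifest the Helm chart renders. The resource name, container image, Prometheus address, trigger type, and the mapping of minReplicas and maxReplicas onto minReplicaCount and maxReplicaCount are all assumptions made for illustration.

```yaml
# Illustrative only: NOT the manifest produced by the Helm chart.
apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: inbox-listener-example          # hypothetical name
spec:
  minReplicaCount: 1                    # assumed counterpart of minReplicas
  maxReplicaCount: 10                   # assumed counterpart of maxReplicas
  pollingInterval: 30                   # seconds between trigger checks
  jobTargetRef:
    template:
      spec:
        restartPolicy: Never
        containers:
          - name: inbox-listener
            image: example.invalid/inbox-listener:placeholder   # hypothetical image
  triggers:
    - type: prometheus                  # assumed trigger; the chart may use another
      metadata:
        serverAddress: http://prometheus.example.svc:9090       # hypothetical address
        query: sum(foxglove_site_controller_unleased_pending_import_count)
        threshold: "1"
```

Unlike an HPA, which resizes a long-running Deployment, a ScaledJob starts Jobs whose pods run to completion, which matches the behavior described above where idle pods exit after maxWaitForWork.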

Alternatives

If desired, inbox-listener can instead be scaled with a HorizontalPodAutoscaler and a custom metric. This requires configuring Prometheus to expose its metrics to Kubernetes, typically through a custom metrics adapter. In this case, ensure that built-in autoscaling is disabled (the default).

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: inbox-listener
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: inbox-listener
  minReplicas: 1
  maxReplicas: 3
  metrics:
    - type: Object
      object:
        describedObject:
          kind: Namespace
          name: foxglove
          apiVersion: v1
        metric:
          name: foxglove_site_controller_unleased_pending_import_count
        target:
          type: AverageValue
          averageValue: 2
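One possible way to expose the metric to Kubernetes is the prometheus-adapter project, which serves Prometheus series through the custom metrics API. The rule below is a sketch in that adapter's configuration format. It assumes the series carries a `namespace` label, which the source does not confirm; adjust the label and aggregation to match your Prometheus setup.

```yaml
# Sketch of a prometheus-adapter rule; assumes the series has a "namespace" label.
rules:
  - seriesQuery: 'foxglove_site_controller_unleased_pending_import_count{namespace!=""}'
    resources:
      overrides:
        # Map the Prometheus "namespace" label to the Kubernetes Namespace resource,
        # so the HPA can reference the metric on "kind: Namespace".
        namespace:
          resource: namespace
    name:
      # Expose the metric under its original name
      matches: "^(.*)$"
      as: "${1}"
    metricsQuery: 'sum(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)'
```

With a rule like this in place, the HPA above would read the metric for the foxglove namespace and, because the target type is AverageValue, divide it by the current number of pods before comparing it with the target of 2.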