Autoscaling
Use the following HorizontalPodAutoscaler (HPA) definitions to autoscale your self-hosted Primary Site services. Adjust the replica counts and utilization targets to suit your specific workload.
Stream service
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: stream-service
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: stream-service
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80
Site controller
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: site-controller
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: site-controller
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80
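A Utilization target is calculated as a percentage of each container's CPU request, so these HPAs can only act when the containers in the target Deployment declare one. The fragment below is a hypothetical sketch of where that request sits in a Deployment's pod template. The container name and the 500m figure are placeholders, not values taken from the chart, and your installation may already set requests elsewhere.

```yaml
# Hypothetical fragment of a Deployment pod template (not taken from the chart)
spec:
  template:
    spec:
      containers:
        - name: stream-service    # placeholder container name
          resources:
            requests:
              cpu: "500m"         # placeholder; averageUtilization: 80 would then target roughly 400m per pod
```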
Inbox listener
As of release 0.0.47, the Helm chart supports built-in autoscaling as an option. It requires KEDA to be installed in the cluster before it is enabled. Before using it, disable or remove any existing inbox-listener autoscaling setup.
Installing KEDA
Install KEDA from its Helm chart before you install or upgrade with built-in autoscaling enabled.
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda --namespace keda --create-namespace
Configuring autoscaling
The following Helm chart values and defaults are now exposed to adjust autoscaling settings:
inboxListener:
  autoscaling:
    enabled: false
    # minReplicas can be raised if time to start processing incoming files is slower than desired
    # 1 is a good default for almost all use-cases
    minReplicas: 1
    # maxReplicas can be raised if you constantly have a very large number of incoming files to process
    # it should be set to a value that allows your site to process incoming files at peak load
    maxReplicas: 10
    # This value, supplied as a duration string (https://pkg.go.dev/time#ParseDuration), determines how long a pod will
    # wait for new work items. It is unlikely that this value should be changed. The value should only be set when
    # using this autoscaling.
    maxWaitForWork: "30s"
To enable autoscaling, run helm upgrade or helm install with the flag --set inboxListener.autoscaling.enabled=true, or enable it in your values YAML file.
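For the values-file route, a minimal sketch might look like the following. The keys are the ones listed in the defaults above; the file name values.yaml is only an assumption about how your overrides are organized.

```yaml
# values.yaml (hypothetical file name)
inboxListener:
  autoscaling:
    # Turns on the built-in, KEDA-based autoscaling (KEDA must already be installed)
    enabled: true
    # Chart defaults, repeated here only to make them explicit
    minReplicas: 1
    maxReplicas: 10
```

The file can then be passed to helm upgrade or helm install with the -f flag.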
The built-in autoscaling is unique in that it is based on a KEDA ScaledJob. This type of scaling creates a new inbox listener pod whenever there are outstanding inbox data files to process. If there are none, the pods that were created exit after a configurable timeout, maxWaitForWork. KEDA ensures that between minReplicas and maxReplicas pods are running.
Alternatives
If desired, inbox-listener can instead be scaled with a HorizontalPodAutoscaler and a custom metric. This requires configuring Prometheus to expose its metrics to Kubernetes. In this case, ensure that built-in autoscaling is disabled (the default value).
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: inbox-listener
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: inbox-listener
  minReplicas: 1
  maxReplicas: 3
  metrics:
    - type: Object
      object:
        describedObject:
          kind: Namespace
          name: foxglove
          apiVersion: v1
        metric:
          name: foxglove_site_controller_unleased_pending_import_count
        target:
          type: AverageValue
          averageValue: 2
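An Object metric like the one above is only visible to the HPA if something serves it through the Kubernetes custom metrics API. One common way to do that is the Prometheus Adapter project, which this guide does not otherwise cover. The rule below is a sketch in the adapter's configuration file format, under two assumptions that you should verify against your own Prometheus data: that the series carries a namespace label, and that summing it per namespace is the right aggregation.

```yaml
# Sketch of a Prometheus Adapter rule (assumptions noted inline)
rules:
  - seriesQuery: 'foxglove_site_controller_unleased_pending_import_count{namespace!=""}'
    resources:
      overrides:
        # Assumes the series has a "namespace" label identifying the Kubernetes namespace
        namespace: {resource: "namespace"}
    name:
      # Expose the metric under its original name, as referenced by the HPA
      matches: "^(.*)$"
      as: "${1}"
    # Assumes a per-namespace sum is the intended aggregation
    metricsQuery: 'sum(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)'
```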