Development Guide

This guide walks you through building a simple data backend for a remote data loader.

The Rust remote data loader example is a great starting point to understand and build your backend. You can clone and extend that project to fit your needs, or build your own using whatever stack suits your team. This guide uses excerpts from that example throughout.

Context

To recap, the data backend is the HTTP service that serves data to the remote data loader.

It consists of:

  • A manifest endpoint, which provides a set of data endpoint URLs to load from.
  • Data endpoints. These provide visualization data on request.

The request flow is: the Foxglove app sends a request to the remote data loader, the loader fetches the manifest from your manifest endpoint, and then requests MCAP data from each data endpoint URL in the manifest (or serves it from cache).

Manifest endpoint

The manifest endpoint handler is the first piece of backend code that runs on every interaction.

It serves two purposes:

  1. To provide a manifest of your source data to the remote data loader.
  2. To validate that the user may access all of that source data.
note

The manifest endpoint is the source of truth for auth: it must not return a data endpoint URL if the requestor is not authorized for that source. Even if the data endpoint itself would deny access to that user, the remote data loader may still serve that data out of cache without ever contacting the endpoint.

Manifest format

The manifest is a JSON document with two top-level properties:

  1. name: A human-readable name for the recording being visualized. This name is what the user sees in the title bar of their app.
  2. sources: A list of data sources.

Each source has these properties:

  • url: A URL that serves the MCAP data for this source. This may be a relative URL; relative URLs are interpreted relative to the manifest endpoint. The URL does not need to be on the same domain or service as the manifest endpoint or the other source URLs.
  • id: An idempotency key for this source data. This is an explicit cache key that the remote data loader uses when caching the source. If this property is not defined, the url property is used as the cache key.
  • startTime and endTime: timestamps of the earliest and latest message in the MCAP data. These can be estimates. However, startTime must not be later than the earliest message, and endTime must not be earlier than the latest. These set the time span of the playback bar in the Foxglove app.
  • topics and schemas: The set of MCAP topics and schemas served by this source.
  • supportsRangeRequests: true if the data URL serves a static MCAP file and supports HTTP range requests. If this is set to true, the startTime, endTime, topics and schemas properties are not required and are ignored.
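Putting it together, a manifest for a single static MCAP file might look like the following (the url and id values are illustrative):

```json
{
  "name": "Flight abc123",
  "sources": [
    {
      "url": "https://data.example.com/flights/abc123.mcap",
      "id": "flight-v1-abc123",
      "supportsRangeRequests": true
    }
  ]
}
```

Because supportsRangeRequests is true, the startTime, endTime, topics, and schemas properties are omitted.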

A full JSON Schema is available here.

note

The schema IDs used in the topics and schemas entries do not need to match those served by the data URL.

Code example

See the manifest handler as reference for this section.

    if let Err(status) = check_auth(&headers, &params) {
        return status.into_response();
    }

The first thing the manifest endpoint does is check user credentials. This check must ensure that the caller is allowed to access all of the data it lists in sources.
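The check_auth helper comes from the example project; the details depend entirely on your auth scheme. As an illustrative sketch (all names and the token policy are hypothetical), a bearer-token check might look like:

```rust
use std::collections::HashMap;

/// Hypothetical sketch: validate a bearer token and confirm the caller
/// may read the requested flight. Returns an HTTP status code on failure.
fn check_auth(headers: &HashMap<String, String>, flight_id: &str) -> Result<(), u16> {
    // Reject requests with no credentials at all.
    let token = headers
        .get("authorization")
        .and_then(|v| v.strip_prefix("Bearer "))
        .ok_or(401u16)?;
    // Placeholder policy: a real backend would verify the token and look up
    // the user's permissions for this specific flight.
    if token == "valid-token" && flight_id.starts_with("flight-") {
        Ok(())
    } else {
        Err(403) // authenticated but not authorized for this source
    }
}
```

The key point is that a failed check must short-circuit before any source URLs are emitted.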

    let mut channels = ChannelSet::new();
    channels.insert::<Vector3>("/demo");
    let (topics, schemas) = channels.into_topics_and_schemas();

Then, construct the set of Foxglove channels used in this recording. The ChannelSet helps map those channels into manifest types.

    let query = serde_urlencoded::to_string(&params).unwrap();
    let source = StreamedSource {
        url: format!("{DATA_ROUTE}?{query}"),
        id: Some(format!("flight-v1-{query}")),
        topics,
        schemas,
        start_time: params.start_time,
        end_time: params.end_time,
    };
    let manifest = Manifest {
        name: Some(format!("Flight {}", params.flight_id)),
        sources: vec![DataSource::Streamed(source)],
    };

    Json(manifest).into_response()

Finally, serialize the manifest. Details worth noting include:

  • The id field must be unique to this source content; otherwise incorrect data may be served from cache. The example reuses the query string as part of this ID so that cached data for different query parameters never clashes. It also includes a version number, which should be incremented whenever the data handling code changes.
  • The start and end times come directly from query parameters. You may want to instead calculate these from the underlying data. The SQLite example data backend demonstrates this approach.
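Computing the time span from the underlying data amounts to taking the minimum and maximum log times across all messages. A minimal sketch, assuming nanosecond timestamps (the function name is hypothetical):

```rust
/// Hypothetical sketch: derive the manifest's start and end times from
/// the log timestamps (nanoseconds) of the messages in a recording.
fn time_span(timestamps: &[u64]) -> Option<(u64, u64)> {
    let start = *timestamps.iter().min()?;
    let end = *timestamps.iter().max()?;
    // startTime must not be later than the earliest message and endTime
    // must not be earlier than the latest, so min/max satisfy both bounds exactly.
    Some((start, end))
}
```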

Data endpoint

A data endpoint serves MCAP data to the remote data loader. You can use the Foxglove SDK to serialize MCAP in a stream, so you don't have to hold the entire recording in memory.

Let's walk through the example code to see this in action:

    // Check auth.
    if let Err(status) = check_auth(&headers, &params) {
        return status.into_response();
    }

The data endpoint checks auth information for requests. This is necessary but not sufficient to ensure only authorized users read this data, because the remote data loader might serve it from cache without hitting this endpoint. The manifest endpoint is the source of truth for auth.

    // Construct a stream.
    let (mut handle, mcap_stream) = create_mcap_stream();

    // Declare channels.
    let channel = handle.channel_builder("/demo").build::<Vector3>();

The mcap_stream writes MCAP data to the HTTP response body. You use the handle with Foxglove SDK Channels to write messages to this stream.

    // Spawn a task to stream data asynchronously rather than buffering it all up front.
    tokio::spawn(async move {

The data handler must return a Response object before the MCAP data is finished serializing, or the entire MCAP will be buffered in memory. To avoid this, the serialization work is done in a separate tokio task.

    channel.log_with_time(
        &Vector3 {
            x: inner.timestamp() as f64,
            y: 0.0,
            z: 0.0,
        },
        inner,
    );

    const FLUSH_THRESHOLD: usize = 1024 * 1024;
    if handle.buffer_size() >= FLUSH_THRESHOLD
        && let Err(e) = handle.flush().await
    {
        tracing::error!(%e, "flush failed");
        return;
    }

While writing messages, periodically flush buffered data to the response stream. This serves two purposes: the client receives data incrementally instead of all at once, and memory usage stays bounded instead of growing with the entire recording.

warning

Messages must be written in ascending log time order. The Foxglove app may render incorrectly if data is provided out-of-order.
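If your storage does not already return messages in log-time order, sort them before writing. A minimal sketch, using a hypothetical message record with a nanosecond log time:

```rust
/// Hypothetical message record with a log time in nanoseconds.
struct Message {
    log_time: u64,
    payload: Vec<u8>,
}

/// Sort messages into ascending log-time order before writing them
/// to the MCAP stream.
fn sort_by_log_time(messages: &mut [Message]) {
    messages.sort_by_key(|m| m.log_time);
}
```

For large recordings you would instead order messages at the query layer (e.g. an ORDER BY on the timestamp column) rather than sorting in memory.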

    // Finalize the streamed MCAP and ensure it is sent to the client.
    if let Err(e) = handle.close().await {
        tracing::error!(%e, "error closing MCAP stream");
    }

When you're done serializing messages, finalize the stream. This writes the summary to the HTTP response body and finishes it.

Local development

To test out your backend in development, use minikube and in-memory caching to deploy a remote data loader locally.

Setup

First, start minikube:

minikube start

Verify kubectl is hooked up to minikube:

kubectl config current-context

Prepare Foxglove namespace

Create a namespace to install the remote data loader into.

kubectl create namespace foxglove

Once this completes, install the chart.

Install the remote data loader

Prepare a values.yaml file to configure the installation:

globals:
  manifestEndpoint: <your local manifest endpoint>
  cache:
    enabled: true
    storageProvider: in_memory
    bucketName: foxglove-cache
  # You can disable OAuth 2.0 authentication during development if there is no provider
  # readily available.
  disableAuth: true

remoteDataLoader:
  autoscaling:
    enabled: true
  deployment:
    env:
      # Developing locally, so the data backend will only be accessible over HTTP.
      - name: "ALLOW_HTTP_SOURCES"
        value: "true"

When you're happy with your values.yaml, install the helm chart:

helm repo add foxglove https://helm-charts.foxglove.dev
helm repo update
helm upgrade --install \
  foxglove-remote-data-loader foxglove/remote-data-loader \
  --namespace foxglove \
  --values ./values.yaml

Once the Helm chart has been deployed, validate that you can reach the remote data loader ingress with curl:

export MINIKUBE_ADDR="http://$(minikube ip)/"
curl $MINIKUBE_ADDR -v

You can now use a helper script like the following to open a visualization in Foxglove.

#!/usr/bin/env python3
import os
from urllib.parse import urlencode
import webbrowser


query = urlencode({
    "ds": "remote-data-loader",
    "ds.dataLoaderUrl": os.environ["MINIKUBE_ADDR"],
    "ds.flightId": "abc123",
})

webbrowser.open(f"https://app.foxglove.dev/~/view?{query}")

Next steps

To authenticate your users with an OAuth2 provider, see the auth guide.

Once your data backend is ready, deploy your remote data loader to production.