Understanding Distributional’s Platform Architecture

Written by

Renaud Bourassa

Adaptive testing is critical to supporting production-grade AI applications at scale. It lets AI teams use behavioral distributions to build a comprehensive definition of an app’s desired behavior and refine that definition over time, so they can quantifiably detect and understand deviations from that behavior.

Using Distributional’s platform, customers can productize higher-value applications and keep them in production while minimizing risk to the business, confident that these applications behave, and will continue to behave, as desired.

Distributional provides an enterprise platform for adaptive testing. To give AI teams the scale and performance they need to continuously test, understand, and improve application behavior, the platform has to handle large, context-rich datasets and perform fast, analytic-style processing of that data. Let’s go into more detail about how the platform is architected to support this.

Platform architecture

Importantly, Distributional’s platform does not try to replace customers’ existing data storage systems. It is purposefully designed to integrate with existing data infrastructure, where the raw logs and LLM traces for the AI applications are already stored. On a customer-defined ingestion schedule, Distributional processes this data efficiently in batch, with no inherent limit on scale. Batch processing lets the platform run much more complex analytics against the data while preserving its rich context, deriving a comprehensive set of metrics and statistics about the applications and calculating how these change over time.

Additionally, unlike systems designed for pure logging and monitoring, Distributional lets users update or change the data they provide. They can add missing data points and retry the processing job, expand the type of data and context provided, or even recalculate metrics based on data from a past point in time.
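To make the idea of updating data and recalculating metrics concrete, here is a minimal sketch in Python. It is not Distributional’s implementation or SDK: the function name, the per-day record layout, and the choice of a mean as the metric are all assumptions made for illustration. The point is only that when metrics are derived from the full batch of raw records, backfilling a missing data point and rerunning the job yields corrected metrics for that past date.

```python
from statistics import mean


def compute_daily_metrics(records_by_day: dict[str, list[float]]) -> dict[str, float]:
    """Recompute one metric (here, the mean score) per day from the raw records.

    Hypothetical helper: the record layout and the metric are illustrative only.
    """
    return {day: mean(values) for day, values in records_by_day.items() if values}


# Hypothetical raw scores keyed by day, as they might sit in existing storage.
raw = {
    "2024-01-01": [0.75, 1.0],
    "2024-01-02": [0.5],
}

before = compute_daily_metrics(raw)

# A missing data point for a past day is added, and the batch job is rerun.
raw["2024-01-02"].append(1.0)
after = compute_daily_metrics(raw)
```

Because the job reads the raw records each time instead of appending to a running total, rerunning it after a correction is safe and replaces the earlier result for that day.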

Since access to this data is critical, customers deploy and manage the Distributional platform in their private cloud environment. This prevents Distributional from becoming yet another technology silo within a customer’s overall architecture and allows the platform to live where the data already resides, avoiding duplicate systems of record and concerns about moving data outside of the existing secure environment. To support this, we purposefully architected the platform with as few dependencies as possible and leveraged industry-standard systems to make it as easy as possible to deploy and manage.

At a high level, the platform consists of an API service, a UI service, a worker service, a messaging queue, and a database. All three services run on Kubernetes, with Redis and PostgreSQL used for the messaging queue and database, respectively. The data is stored in an object store, with support for AWS S3, Google Cloud Storage (GCS), and Azure Blob Storage. This results in only four external dependencies (Kubernetes, Redis, PostgreSQL, and an object store), all of which are industry standards our customers are already familiar with.

The core of the platform is the API service. Users interact with it through the user interface or, for more programmatic access, through the SDK. When the API service is tasked with a workload such as computing metrics on raw data or evaluating different tests, the messaging queue relays the work to the worker service for processing. The platform can scale out these compute resources or adjust the type of compute as needed, depending on the size of the job and the number of concurrent requests. For users, this design abstracts away the complexity of managing storage and compute resources, so they can focus on getting the answers they need from their application’s data.
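The hand-off between the API service, the messaging queue, and the worker service can be sketched with Python’s standard library. This is an illustrative sketch only, not the platform’s code: an in-process queue.Queue stands in for Redis, threads stand in for worker pods on Kubernetes, and the Job type, run_jobs function, and mean metric are hypothetical names chosen for the example.

```python
import queue
import threading
from dataclasses import dataclass
from statistics import mean


@dataclass
class Job:
    """A unit of work enqueued by the (hypothetical) API layer."""
    job_id: str
    values: list[float]


def worker(jobs: queue.Queue, results: dict[str, float]) -> None:
    """Pull jobs off the queue until a None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:  # sentinel: no more work for this worker
            break
        # Stand-in for "computing metrics on raw data".
        results[job.job_id] = mean(job.values)


def run_jobs(job_list: list[Job], num_workers: int = 2) -> dict[str, float]:
    """Enqueue jobs and process them with a configurable number of workers."""
    jobs: queue.Queue = queue.Queue()
    results: dict[str, float] = {}
    threads = [
        threading.Thread(target=worker, args=(jobs, results))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    for job in job_list:
        jobs.put(job)
    for _ in threads:
        jobs.put(None)  # one sentinel per worker so every thread exits
    for t in threads:
        t.join()
    return results


results = run_jobs(
    [Job("a", [1.0, 2.0, 3.0]), Job("b", [10.0])],
    num_workers=3,
)
```

Raising num_workers is the in-process analogue of scaling out the worker service: the code that enqueues work does not change, which is the decoupling the queue is there to provide.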

If you’re interested in learning more about Distributional’s platform and architecture, check out the full paper. If you’re interested in trying out Distributional’s adaptive testing platform, reach out to the team and we’d be happy to get you set up.