Architecture
Observo's architecture is designed with a dual-plane approach, comprising the Manager (control plane) and Site (data plane). Before delving into the specifics, let's define some key terminology.
Terminology
Manager: The control plane for customers to manage and optimize their observability data. This platform serves as the primary interface for customers to configure how Observo processes and optimizes their observability data within their environments. Through the web interface, customers can set up data sources, configure optimization tools (transforms), and determine the destinations for the processed data.
Site: A collection of services installed within a Kubernetes cluster located in the customer's VPC. Each deployment of these services constitutes a single site, with an Agent operating within each site. The Agent manages data pipeline processing and coordinates the exchange of control plane data with the Manager, executing actions as needed. The Agent and all necessary services are installed during the site creation process.
Source: A service or endpoint that generates observability data. Examples include AWS S3, Kafka, OpenTelemetry, and Syslog.
Destination: A target system where data is sent after being processed by Observo. Typical destinations include observability tools, SIEM systems, or storage solutions, such as AWS S3, Splunk, Logz.io, Elasticsearch, Logstash, and HTTP servers.
Pipeline: A data conduit that transfers data from a source to one or more destinations. A pipeline begins with a data source, like AWS S3 or Kafka, and includes one or more intermediate processing steps. These steps may involve data transformation, filtering, or aggregation to enhance or optimize the data. The processed data is then routed to one or more destinations, such as storage systems, analytics platforms, or other tools used for further storage or analysis. The primary goal of a pipeline is to enable efficient data movement and processing, allowing users to derive insights and make informed decisions based on the processed data.
Overview

Observo Manager
The Manager functions as the central hub for control, providing customers with the tools to efficiently manage and optimize their observability data. Serving as the primary interface, it allows customers to define the processing and enrichment of their observability data across both public and private cloud environments. In scenarios where the private cloud deployment model is preferred, the Manager, along with essential services, can also be deployed within the customer's private cloud infrastructure.
Using Observo’s intuitive user interface, customers can configure their observability data sources, set up optimization policies for data processing, and define the destinations for the processed data within their environment. This interface provides authenticated users with the ability to create and oversee Sources, Destinations, and Pipelines, along with gaining valuable insights into telemetry data.
Observo Site
The Site functions as the data plane for an organization’s telemetry data, comprising a suite of services deployed within a Kubernetes cluster hosted in the customer's Virtual Private Cloud (VPC). Each Kubernetes cluster represents a unique site, and several deployment flavors are available: managed services such as Amazon EKS or Google Kubernetes Engine (GKE), or self-managed clusters such as Kubernetes running on AWS EC2 instances. Single-node deployment on EC2 or similar is also available.
In every cluster, an Agent manages the data pipelines and exchanges control plane data with the Manager. The Agent, along with the other Observo services, is installed as a central element of the Site creation process. Note that both the Site and the Manager can run on the customer's preferred flavor of Kubernetes cluster within their public or private cloud infrastructure, providing a range of deployment choices.
The Site is responsible for ingesting data from customer-defined sources, optimizing it, and transmitting it to one or more destinations. A core Observo principle is that all observability data processing occurs within the customer's local environment, preserving the integrity and confidentiality of the customer's data. This design keeps the customer's data secure within their network perimeter.
Pipeline
A Pipeline serves as a data conduit, facilitating the movement of data from a Source to one or more Destinations. It begins with a data source, such as AWS S3 or Kafka, and encompasses intermediate processing steps involving data transformation, filtering, or aggregation to enhance or optimize the data. The processed data is subsequently routed to one or more destinations, including blob and index storage, SIEM, analytics platforms, or other storage or analysis tools. The primary objective of a pipeline is to facilitate the efficient movement and processing of data, enabling users to derive insights and make informed decisions based on the processed data.
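To make the pipeline stages concrete, the sketch below models a pipeline as a chain of transforms between a source and its destinations. All names here (`drop_debug_logs`, `add_site_tag`, the destination labels) are hypothetical illustrations, not Observo's actual API:

```python
# Minimal sketch of a pipeline: source -> transforms -> destinations.
# All names are illustrative, not Observo's actual API.

def drop_debug_logs(events):
    # Filtering step: discard low-value DEBUG events.
    return [e for e in events if e.get("level") != "DEBUG"]

def add_site_tag(events, site="us-east-1"):
    # Enrichment step: tag each event with the originating site.
    return [{**e, "site": site} for e in events]

def run_pipeline(source_events, transforms, destinations):
    data = source_events
    for transform in transforms:
        data = transform(data)
    # Route the processed data to every configured destination.
    return {dest: data for dest in destinations}

events = [
    {"level": "DEBUG", "msg": "cache miss"},
    {"level": "ERROR", "msg": "disk full"},
]
routed = run_pipeline(events, [drop_debug_logs, add_site_tag], ["s3", "splunk"])
```

Note how the filtering step reduces volume before the data fans out, which is the source of the optimization a pipeline provides: every destination receives the already-reduced, enriched stream.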
Observo supports the following deployment environments for Site:
Amazon Web Services (AWS)
Google Cloud Platform (GCP)
Azure Cloud
Virtual Machine
Availability, Scalability and Fault Tolerance
High Availability
Observo's architecture is designed to ensure high availability and resilience. In the event of a Manager failure, the Site can continue to operate without interruption. The Site, which performs the core data processing tasks, is capable of functioning independently of the Manager. This decoupling ensures that data collection, processing, and routing are not compromised even if the Manager is temporarily unavailable. This design provides robustness, allowing the system to maintain its critical functions and ensuring continuous data workflow.
Scalability
Observo utilizes Kubernetes to manage scaling, ensuring that the Site can dynamically adjust to fluctuations in data volume. We've implemented custom auto scaling strategies to scale the core data plane up or down based on real-time load. This elasticity ensures optimal performance and resource utilization. Furthermore, Kubernetes enhances resilience by automatically recreating pods on healthy nodes if any nodes fail, maintaining continuous and uninterrupted data processing and routing.
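The details of Observo's custom autoscaling strategies are not described here, but the core idea of load-based scaling can be sketched with the proportional rule Kubernetes' Horizontal Pod Autoscaler uses. The function and parameter names below are assumptions for illustration:

```python
# Hedged sketch of a load-based scaling decision (not Observo's actual
# algorithm): choose a replica count proportional to observed load,
# clamped to configured bounds.
import math

def desired_replicas(current_load, target_load_per_replica,
                     min_replicas=1, max_replicas=20):
    # Proportional rule, as in the Kubernetes HPA:
    # desired = ceil(current_load / target_per_replica)
    desired = math.ceil(current_load / target_load_per_replica)
    return max(min_replicas, min(max_replicas, desired))
```

For example, at 9,000 events/sec with a target of 1,000 events/sec per replica, the Site would scale to 9 replicas; when load drops, the same rule scales it back down toward the minimum.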
Dynamic Request Throttling
Observo dynamically tunes request concurrency to handle variations in data volume and the performance of destinations (like Splunk or Elasticsearch). By dynamically adjusting the number of concurrent requests based on factors such as current load, destination response times, destination response codes and error rates, Observo can optimize throughput while avoiding overloading the target systems. This adaptive mechanism ensures efficient use of resources and maintains the performance and reliability of data delivery even under fluctuating conditions.
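One common way to implement this kind of adaptive concurrency control is an AIMD (additive-increase, multiplicative-decrease) loop driven by the destination's response codes and latency. The sketch below is an assumption about the general technique, not Observo's published algorithm:

```python
# Illustrative AIMD-style concurrency controller (an assumption, not
# Observo's actual implementation): raise concurrency additively on
# success, cut it multiplicatively when the destination pushes back.

class ConcurrencyController:
    def __init__(self, initial=8, minimum=1, maximum=128):
        self.concurrency = initial
        self.minimum = minimum
        self.maximum = maximum

    def record(self, status_code, latency_ms, latency_budget_ms=500):
        # Treat rate-limit (429) and server errors (5xx) as overload
        # signals, and slow responses as early-warning backpressure.
        overloaded = status_code == 429 or status_code >= 500
        slow = latency_ms > latency_budget_ms
        if overloaded or slow:
            # Back off quickly when the destination signals pressure.
            self.concurrency = max(self.minimum, self.concurrency // 2)
        else:
            # Probe for more throughput one request at a time.
            self.concurrency = min(self.maximum, self.concurrency + 1)
        return self.concurrency
```

The asymmetry is deliberate: backing off sharply protects a struggling destination such as Splunk or Elasticsearch, while the slow additive probe recovers throughput without immediately re-triggering the overload.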
Fault Tolerance
Observo Site is built with fault tolerance at its core, ensuring that data processing remains uninterrupted even during hardware or software failures. The Observo Site scales horizontally as more nodes are added to the Kubernetes cluster it is deployed on. This intrinsically ensures that if any nodes fail, the Site can redistribute data processing pods across the remaining nodes. Additionally, the Site's decoupling from the Manager and dynamic resource management via Kubernetes contribute to fault resilience, allowing for automated recovery and minimizing any impact on data workflows. This design guarantees high availability and continuous data processing, maintaining system reliability and data integrity under adverse conditions.