Overview

Fleet Management offers a centralized, scalable solution for managing telemetry data collection across Windows, Linux, and macOS environments, using lightweight, OTel-based Edge Collectors to simplify deployment and configuration. It supports real-time filtering, enrichment, and routing of logs, metrics, and traces, ensuring cost-efficient and consistent observability across hybrid infrastructures.

Purpose

This section walks through onboarding and managing agents in the Fleet Management system across Windows, Linux, and macOS environments. It covers source creation, agent configuration, installation, validation, and cleanup, addressing the challenges of collecting log, metric, and trace telemetry at scale. The centralized, OTel-based architecture keeps deployment and maintenance scalable and consistent, and detailed instructions for testing and updating configurations help optimize data flow and improve observability. As part of the product documentation, this section is intended to help security and DevOps teams manage telemetry data efficiently while reducing operational cost and complexity.

The Modern Data Collection Challenge

Enterprise observability and security architectures face significant challenges when collecting telemetry at scale. With thousands of VMs, bare-metal hosts, and containerized services spread across hybrid environments, most teams rely on a fragmented patchwork of syslog daemons, Fluentd/Fluent Bit nodes, other open-source agents or collectors, and proprietary vendor forwarders.

What starts as a well-intentioned data collection plan quickly turns into a maintenance nightmare:

  • Configs drift across environments

  • Agents need per-host updates

  • Infrastructure and network costs balloon because unfiltered, unstructured data leaves the edge

  • Insecure open-source agents and collectors pose significant security risks

Fleet Management Architecture

Edge Collector is tightly integrated with the Observo AI platform. Architecturally, it sits upstream of the core AI-powered Data Pipeline and handles collection, filtering, and lightweight processing at the source. It provides a unified agent framework that is centrally configured, highly scalable, and built on modern telemetry standards like OpenTelemetry (OTel)—while delivering full fleet visibility, control, security, and performance.

Fleet Management consists of three elements:

  • Agents – Lightweight, OTel-based collectors deployed at the edge

  • Fleets – Logical groupings of agents such as "prod-linux-east" or "win-dev"

  • Configurations – Declarative policies that define what to collect, how to transform it, and where to route it

This architecture enables scalable telemetry collection that can be managed declaratively, much as Kubernetes manages compute. Instead of hand-tuning syslog.conf on 10,000 endpoints, you define a configuration once and push it everywhere.
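The "define once, push everywhere" idea can be illustrated with a minimal Python sketch. It is not the Fleet Manager API: the class names, fields, and the `apply` method are hypothetical and exist only to show how the three elements relate. The fleet name is taken from the examples above; the policy values are made up.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Configuration:
    """Declarative policy: what to collect, how to transform it, where to route it."""
    name: str
    collect: tuple[str, ...]
    transform: tuple[str, ...]
    route_to: str


@dataclass
class Agent:
    """One edge collector on one host (hypothetical model)."""
    host: str
    config: Optional[Configuration] = None


@dataclass
class Fleet:
    """Logical grouping of agents, such as "prod-linux-east"."""
    name: str
    agents: list[Agent] = field(default_factory=list)

    def apply(self, config: Configuration) -> int:
        """Assign one configuration to every agent; return how many agents changed."""
        updated = 0
        for agent in self.agents:
            if agent.config != config:
                agent.config = config
                updated += 1
        return updated


# Illustrative values only; none of these names come from the product.
fleet = Fleet("prod-linux-east", [Agent(f"host-{i}") for i in range(3)])
syslog_policy = Configuration(
    name="syslog-baseline",
    collect=("/var/log/syslog",),
    transform=("drop_debug",),
    route_to="example-source",
)

first_push = fleet.apply(syslog_policy)   # every agent receives the policy
second_push = fleet.apply(syslog_policy)  # re-applying the same policy changes nothing
```

Because the policy is a value rather than a per-host file, applying it again is a no-op, which is the property that makes declarative management practical at large host counts.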

Observo AI’s Fleet Manager provides a centralized control plane for managing telemetry data collection across large, distributed environments. Fleets represent logical groupings of (Edge) Agents installed on Windows, Linux, or macOS nodes. These (Edge) Agents are configured using OTel-based configurations, supporting both Full Raw OTel and Receiver OTel modes, enabling flexible and scalable data collection strategies.

Fleet Management System Characteristics

The Observo AI Fleet Management system (Figure 1) provides a centralized, scalable solution for managing telemetry data collection across Windows, Linux, and macOS environments. Leveraging lightweight, OTel-based Edge Collectors, it ensures consistent observability, simplifies deployment, and optimizes data flow across hybrid infrastructures. Below are the key characteristics of the system:

  • Single Fleet Manager per Site: Each Site is managed by one Fleet Manager Console, serving as the centralized control plane for overseeing telemetry data collection and agent management.

  • Fleet Manager to Fleets: A Fleet Manager can manage one or more Fleets, enabling logical grouping of agents such as "prod-linux-east" or "win-dev" for streamlined administration.

  • Fleets to (Edge) Agents: Each Fleet can manage multiple Edge Agents, which are lightweight, OTel-based collectors deployed on individual hosts (Windows, Linux, or macOS).

  • Single Configuration per (Edge) Agent: Each Edge Agent operates with a single Configuration, defining data collection, filtering, and routing policies.

  • Flexible Configuration Structure: A Configuration can include single or multiple component parts, such as receivers, processors, and exporters, supporting Full Raw OTel or Receiver-Only modes for tailored telemetry processing.

  • Multiple Configurations per Fleet: A Fleet can support multiple Configurations, allowing for flexible and declarative management of telemetry policies across agents.

  • Pipeline Source to (Edge) Agents: A Pipeline Source consumes telemetry data from a single Edge Agent, ensuring targeted data ingestion and processing. An Edge Agent can connect to multiple Pipeline Sources.

  • Multiple Pipelines per Source: A Pipeline Source can be associated with multiple Pipelines, enabling scalable and cross-site telemetry data routing.
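The "Flexible Configuration Structure" item above can be sketched as a Python dictionary plus a small consistency check. The layout mirrors the standard OpenTelemetry Collector configuration (top-level receivers, processors, and exporters, wired together under service.pipelines). Whether Fleet Manager accepts exactly this shape is an assumption, the endpoint is a placeholder, and `validate_config` is a hypothetical helper, not a product function.

```python
import copy


def validate_config(config: dict) -> list[str]:
    """Return a list of problems; an empty list means the structure is consistent."""
    problems = []
    pipelines = config.get("service", {}).get("pipelines", {})
    if not pipelines:
        problems.append("no pipelines defined")
    for pipeline_name, pipeline in pipelines.items():
        # Every component a pipeline references must be declared at the top level.
        for section in ("receivers", "processors", "exporters"):
            declared = config.get(section, {})
            for component in pipeline.get(section, []):
                if component not in declared:
                    problems.append(
                        f"{pipeline_name}: undeclared {section[:-1]} '{component}'"
                    )
        # A pipeline needs a data source and a destination; processors are optional.
        for required in ("receivers", "exporters"):
            if not pipeline.get(required):
                problems.append(f"{pipeline_name}: needs at least one {required[:-1]}")
    return problems


# Assumed example in OpenTelemetry Collector layout; values are placeholders.
log_config = {
    "receivers": {"filelog": {"include": ["/var/log/*.log"]}},
    "processors": {"batch": {}},
    "exporters": {"otlp": {"endpoint": "collector.example.internal:4317"}},
    "service": {
        "pipelines": {
            "logs": {
                "receivers": ["filelog"],
                "processors": ["batch"],
                "exporters": ["otlp"],
            }
        }
    },
}

# A deliberately broken copy: the pipeline references a processor nobody declared.
broken_config = copy.deepcopy(log_config)
broken_config["service"]["pipelines"]["logs"]["processors"].append("filter")

good_result = validate_config(log_config)
broken_result = validate_config(broken_config)
```

A check of this kind is cheap to run before a configuration is pushed to a Fleet, since a dangling component reference would otherwise surface only when an agent tries to load it.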

This architecture supports real-time filtering, enrichment, and routing of logs, metrics, and traces at the edge, reducing data volume and operational costs. Configurations are managed declaratively via the Fleet Manager Console, with automated deployment scripts ensuring seamless scalability across thousands of nodes.

Figure 1: Fleet Management Data Flow

Once deployed, Edge Agents collect raw telemetry (logs, metrics, traces, and events) directly from hosts. Before transmitting data, agents apply real-time filtering, enrichment, and parsing at the edge, which reduces data volume so that only valuable telemetry reaches downstream systems. The processed telemetry is forwarded to a designated Source within an Observo AI Site, where it enters the AI-powered pipeline for advanced transformation, optimization, and enrichment.
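The edge-side filtering and enrichment described above can be sketched in a few lines of Python. This illustrates the idea only, not how Edge Agents are implemented: the record shape, the severity names, and the `process_at_edge` function are all assumptions, and parsing is left out for brevity.

```python
from typing import Iterable, Iterator, Optional

# Assumed severity ordering, used only for this illustration.
SEVERITY_RANK = {"debug": 0, "info": 1, "warn": 2, "error": 3}


def process_at_edge(
    records: Iterable[dict],
    min_severity: str = "info",
    static_fields: Optional[dict] = None,
) -> Iterator[dict]:
    """Drop records below min_severity, then add static context to the rest."""
    threshold = SEVERITY_RANK[min_severity]
    extra = dict(static_fields or {})
    for record in records:
        # Records with a missing or unknown severity are treated as "info".
        rank = SEVERITY_RANK.get(record.get("severity", "info"), SEVERITY_RANK["info"])
        if rank < threshold:
            continue  # filtered out: this record never leaves the host
        # Enrichment: fields already present on the record win over static ones.
        yield {**extra, **record}


raw_records = [
    {"severity": "debug", "body": "cache probe"},
    {"severity": "error", "body": "disk full"},
    {"severity": "info", "body": "login ok"},
]

forwarded = list(
    process_at_edge(raw_records, "info", {"fleet": "prod-linux-east"})
)
```

In this run the debug record is dropped before transmission and the two remaining records carry the fleet name, so less data crosses the network and what does cross it arrives with context attached.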

The optimized telemetry is then routed to targeted destinations, including SIEM platforms, observability tools, and data lakes, supporting a wide range of operational and security use cases. This approach simplifies deployment, reduces operational costs, and ensures consistent observability across hybrid infrastructures—all managed declaratively via the Fleet Manager Console.
