Disk-Based Buffering
Overview
Observo Pipeline provides disk-based buffering capabilities to handle backpressure events and ensure data reliability during destination outages. When a destination becomes unavailable, the pipeline can temporarily store data on disk and automatically resume forwarding once the destination is available again.
How It Works
When a backpressure event occurs (such as a destination outage or slowdown), the pipeline responds as follows (a minimal sketch of the flow appears after this list):
Data Storage: Observo Pipeline engages a backpressure queue that writes incoming data to disk
Temporary Buffering: Data accumulates in the disk buffer during the outage period
Automatic Recovery: Once the destination becomes available, the pipeline automatically forwards buffered data from disk
Continuous Operation: The pipeline maintains data flow and prevents data loss during temporary disruptions
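The following is a minimal sketch of this flow, not Observo's actual implementation: it assumes a simple append-only JSON-lines file as the on-disk queue, and the send_to_destination stub and file name are hypothetical stand-ins used to simulate an outage and recovery.

```python
import json
import os

BUFFER_PATH = "backpressure_buffer.jsonl"   # hypothetical on-disk queue file
destination_available = False               # simulated destination outage


def send_to_destination(event: dict) -> bool:
    """Stand-in for the real destination client; fails while the outage lasts."""
    if not destination_available:
        return False
    print("delivered:", event)
    return True


def handle_event(event: dict) -> None:
    """Deliver directly, or spill the event to disk when the destination is down."""
    if not send_to_destination(event):
        with open(BUFFER_PATH, "a", encoding="utf-8") as f:
            f.write(json.dumps(event) + "\n")   # persisted, survives restarts


def flush_buffer() -> None:
    """Replay buffered events once the destination recovers."""
    if not os.path.exists(BUFFER_PATH):
        return
    with open(BUFFER_PATH, encoding="utf-8") as f:
        pending = [json.loads(line) for line in f if line.strip()]
    remaining = [e for e in pending if not send_to_destination(e)]
    if remaining:
        with open(BUFFER_PATH, "w", encoding="utf-8") as f:
            f.writelines(json.dumps(e) + "\n" for e in remaining)
    else:
        os.remove(BUFFER_PATH)


handle_event({"msg": "event during outage"})    # spilled to the disk buffer
destination_available = True                    # destination comes back online
flush_buffer()                                  # buffered data is forwarded
```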
Getting Started
Navigate to Destinations
Select the destination where you want to enable disk-based buffering
Go to Buffering Configuration
Fill in the required fields as shown below:
Buffering Configuration
Buffer Type: Select where buffered data will be stored during backpressure events. This configuration supports two storage mechanisms:
Memory: Utilizes RAM for faster buffer operations but offers no persistence guarantees. Data is lost if the pipeline restarts. Suitable for scenarios prioritizing speed over durability.
Disk: Persists data to disk storage, providing durability across pipeline restarts and ensuring data survives system failures. Recommended for production environments where data retention is critical.
Note: This document covers the configuration details for disk-based buffering.
Max Bytes Size: The maximum number of bytes allowed in the buffer.
The minimum allowed value is 268435488 bytes (approximately 256 MB).
This setting determines the total capacity of the disk buffer. Once this threshold is reached, the "When Full" policy determines how the pipeline handles additional incoming data.
Example: a value of 1073741824 configures a 1 GB buffer; once roughly 1 GB of buffered data has accumulated on disk, the "When Full" policy below takes effect.
When Full: Determines how the pipeline handles incoming events when the buffer reaches its maximum capacity.
You have two options (a short sketch contrasting them follows the descriptions below):
1. Block (Default)
Behavior: Stops accepting new data when the buffer is full
Impact: Creates upstream backpressure, causing data sources to reduce their event transmission rate until buffer space becomes available
Use Case: When data integrity is critical and every event must be preserved
Best For: Mission-critical logs, compliance data, financial transactions
2. Drop Newest
Behavior: Discards new incoming data when buffer capacity is exhausted
Impact: Maintains consistent pipeline throughput by sacrificing newer events
Use Case: When maintaining pipeline throughput is more important than preserving every data point
Best For: High-volume metrics, non-critical telemetry, sampling scenarios
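The snippet below is an illustrative sketch, not Observo's implementation: a small bounded in-memory class stands in for the buffer to show how the two policies respond when capacity is exhausted. The class name and policy labels are hypothetical.

```python
from collections import deque


class BoundedBuffer:
    """Toy stand-in for the destination buffer, illustrating the two policies."""

    def __init__(self, capacity: int, when_full: str = "block"):
        self.events = deque()
        self.capacity = capacity
        self.when_full = when_full   # "block" or "drop_newest" (hypothetical labels)

    def offer(self, event) -> str:
        if len(self.events) < self.capacity:
            self.events.append(event)
            return "buffered"
        if self.when_full == "block":
            # Block: the source is signaled to slow down until space frees up,
            # so no data is lost but upstream throughput drops temporarily.
            return "blocked (source must wait and retry)"
        # Drop Newest: the incoming event is discarded, preserving throughput
        # at the cost of losing the newest data.
        return "dropped"


buf = BoundedBuffer(capacity=2, when_full="drop_newest")
print([buf.offer(e) for e in ("a", "b", "c")])   # ['buffered', 'buffered', 'dropped']
```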
Configuration Example
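As an illustrative example, the settings below describe one possible disk-buffering configuration. The field names mirror the UI labels above; the dictionary form is only a convenient notation for this sketch, not Observo's actual configuration format.

```python
# Illustrative only: field names mirror the UI labels above; values are examples.
disk_buffer_config = {
    "buffer_type": "disk",          # persist buffered data to disk for durability
    "max_bytes_size": 1073741824,   # 1 GB capacity (minimum allowed is 268435488)
    "when_full": "block",           # apply backpressure instead of dropping data
}
```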
Best Practices
Sizing Your Buffer
Assess Data Volume: Calculate your average data ingestion rate (bytes/second)
Estimate Outage Duration: Consider typical outage windows for your destinations (see the sizing sketch below)
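A rough capacity estimate multiplies these two figures and adds headroom for spikes. The calculation below is an illustrative sketch with hypothetical numbers, not a prescribed formula.

```python
# Illustrative sizing calculation; the rate, outage window, and headroom are examples.
ingestion_rate_bytes_per_sec = 5 * 1024 * 1024   # average ingest of 5 MiB/s
expected_outage_seconds = 30 * 60                # plan for a 30-minute outage
headroom = 1.5                                   # safety margin for traffic spikes

max_bytes_size = int(ingestion_rate_bytes_per_sec * expected_outage_seconds * headroom)
print(max_bytes_size)   # 14155776000 bytes, roughly 13.2 GiB
```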
Choosing the Right "When Full" Policy
Choose Block when:
Data loss is unacceptable
Compliance or audit requirements mandate complete data retention
You can tolerate temporary pipeline pauses
Downstream systems can handle delayed data delivery
Choose Drop Newest when:
Real-time data flow is the highest priority
Your use case can tolerate some data loss during peak load periods
Preventing upstream source blockage is critical
You're working with high-volume, less critical data streams
Troubleshooting
Buffer Frequently Full
Symptoms: The buffer reaches capacity often, causing incoming data to be blocked or dropped
Solutions:
Increase Max Bytes Size
Optimize destination performance
Review data volume patterns for spikes
Disk Space Issues
Symptoms: Disk space exhaustion, buffer write failures
Solutions:
Ensure sufficient disk space (3-4x the buffer size is recommended; see the check below)
Configure appropriate buffer size for available disk resources
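One way to sanity-check the 3-4x recommendation is to compare the configured buffer size against the free space on the volume that holds the buffer. The sketch below is illustrative; the mount point and buffer size are hypothetical values.

```python
import shutil

buffer_max_bytes = 1073741824          # example: a 1 GB buffer
buffer_volume = "/"                    # hypothetical mount point holding the buffer

free_bytes = shutil.disk_usage(buffer_volume).free
if free_bytes < 3 * buffer_max_bytes:  # 3-4x the buffer size is recommended
    print("Warning: less than 3x the buffer size is free on", buffer_volume)
else:
    print("Sufficient disk headroom for the configured buffer")
```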