Sample
The Sample function allows you to reduce the volume of events passing through your Observo AI pipeline by sampling a subset of events. This is useful for reducing costs, improving performance, or focusing on a representative subset of data.
Purpose
The Sample function is typically used in scenarios where you want to:
Reduce the volume of logs or metrics sent to downstream systems.
Debug or analyze a subset of data without processing the entire dataset.
Optimize pipeline performance by reducing the load.
Usage
Select the Sample transform. Add a Name (required) and a Description (optional).
General Configuration:
Bypass Transform: Defaults to disabled. When enabled, this transform is bypassed entirely and events pass through without any modifications.
Add Filter Conditions: Defaults to disabled. When enabled, events are filtered by conditions: only events for which the conditions evaluate to true are processed; all others bypass this transform. Conditions are built from AND/OR logic using the "+Rule" and "+Group" buttons.
Sample: Enabled: Defaults to disabled, meaning no events are evaluated. Toggle to Enabled to process events and feed data to the downstream Transforms.
Sample Configuration Rules: A set of sampling rules to evaluate against events. One rule is added by default. Click the Add button to add another rule, each with the following inputs:
Enabled: Defaults to enabled, meaning the rule evaluates all events. Toggle off to prevent the rule from processing events and feeding data to the downstream Transforms.
Conditions: Defaults to empty. When set, only events for which the conditions evaluate to true are processed by this rule; all others bypass it. Conditions are built from AND/OR logic using the "+Rule" and "+Group" buttons.
Sample Rate: The rate at which events will be forwarded, expressed as 1/N. For example, rate = 10 means 1 out of every 10 events will be forwarded and the rest will be dropped.
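The 1/N semantics can be sketched with a simple counter-based sampler. This is an illustration of the rate arithmetic only, not Observo AI's actual selection strategy:

```python
def sample(events, rate):
    """Forward 1 out of every `rate` events; drop the rest.

    A minimal counter-based sketch of 1/N sampling; the real
    transform may choose which event in each window to keep
    differently (e.g. at random).
    """
    kept = []
    for i, event in enumerate(events):
        if i % rate == 0:  # keep the first event of each window of `rate`
            kept.append(event)
    return kept

# With rate = 10, 1,000 events yield exactly 100 forwarded events.
print(len(sample(list(range(1000)), 10)))  # 100
```
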
Example
Uniform Sampling
Scenario: Sample 10% of events randomly. Assume 1,000 events enter the transform.
Sample: Enabled
Sample Configurations:
Enabled: Enabled
Conditions: N/A
Sample Rate: 10
Results: 10% of events are randomly kept, so about 100 events are forwarded from the original 1,000.
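The uniform sampling scenario above can be simulated with an independent random keep/drop decision per event, each kept with probability 1/rate. This is a hedged sketch of the behavior, not the product's implementation:

```python
import random

def uniform_sample(events, rate, seed=None):
    """Keep each event independently with probability 1/rate."""
    rng = random.Random(seed)
    return [e for e in events if rng.random() < 1.0 / rate]

# Over 1,000 events at rate = 10, roughly 100 events survive;
# the exact count varies with the random seed.
kept = uniform_sample(list(range(1000)), 10, seed=42)
print(len(kept))
```

Unlike the counter-based variant, the output size here is only approximately 1/rate of the input, which is typical of random sampling.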
Consistent Sampling by Field
Scenario: Sample 20% of events consistently in the production environment. Assume 1,000 events enter the transform.
Sample: Enabled
Sample Configurations:
Enabled: Enabled
Conditions: AND: environment equals production
Sample Rate: 5
Results: Based on the condition, 80% of the matching events are dropped. This means only 200 events are kept from the original 1,000 events.
Techniques
Random Sampling: Each log entry has an equal probability of being chosen. This method is effective when logs are relatively uniform, and there is no need for bias in selection.
Consistent Sampling: Groups logs by one or more fields (such as environment) and samples within each group. This method ensures that every value of the specified field is represented in the sample.
Limitations
Sampling reduces the volume of data, which may result in the loss of some events. Ensure that sampling is appropriate for your use case.
Related Functions
Filter Event: Apply conditions to filter data before or after removing fields.
Throttle: Limit the rate of events passing through the pipeline.