Sample

The Sample function allows you to reduce the volume of events passing through your Observo AI pipeline by sampling a subset of events. This is useful for reducing costs, improving performance, or focusing on a representative subset of data.

Purpose

The Sample function is typically used in scenarios where you want to:

  • Reduce the volume of logs or metrics sent to downstream systems.

  • Debug or analyze a subset of data without processing the entire dataset.

  • Optimize pipeline performance by reducing the load.

Usage

Select the Sample transform, then add a Name (required) and a Description (optional).

General Configuration:

  • Bypass Transform: Defaults to disabled. When enabled, this transform is bypassed entirely, allowing events to pass through without any modifications.

  • Add Filter Conditions: Defaults to disabled. When enabled, events are filtered through conditions: only events for which the conditions evaluate to true are processed; all others bypass this transform. Conditions are built from AND/OR logic using the "+Rule" and "+Group" buttons.

Sample: Enabled: Defaults to disabled, meaning no events are evaluated. Toggle to Enabled to process events and feed data to the downstream Transforms.

Sample Configuration Rules: The set of sampling rules to evaluate. The first rule is added by default; click the Add button to add additional rules. Each rule has the following inputs:

  • Enabled: Defaults to enabled, meaning the rule evaluates all events. Toggle Enabled off to skip this rule and stop it from feeding data to the downstream Transforms.

  • Conditions: Defaults to empty. When set, events are filtered through conditions: only events for which the conditions evaluate to true are processed; all others bypass this transform. Conditions are built from AND/OR logic using the "+Rule" and "+Group" buttons.

  • Sample Rate: The rate at which events are forwarded, expressed as 1/N. For example, a rate of 10 means 1 out of every 10 events is forwarded and the rest are dropped.
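The 1/N forwarding behavior above can be sketched with a simple counter. This is a hypothetical illustration, not Observo AI's actual implementation:

```python
def sample_one_in_n(events, rate):
    """Forward 1 out of every `rate` events and drop the rest.

    Hypothetical helper for illustration only.
    """
    return [event for i, event in enumerate(events) if i % rate == 0]

events = list(range(1000))
kept = sample_one_in_n(events, 10)
print(len(kept))  # 100
```

With a rate of 10, exactly 1 in 10 events survives, so 1,000 input events yield 100 forwarded events.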

Example

Uniform Sampling

Scenario: Sample 10% of events randomly. Assume 1,000 events enter the transform.

The Sample transform is configured as follows:

| Setting | Value |
| --- | --- |
| Entire Transform | Enabled |
| Enable This Part of Transform | Enabled |
| Conditions | N/A |
| Sample Rate | 10 |

Results: Only 10% of events are randomly kept. This means only 100 events are kept from the original 1,000 events.
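Uniform random sampling at this rate can be sketched as each event independently having a 1-in-10 chance of being kept. The seed and helper below are illustrative assumptions, not part of the product:

```python
import random

random.seed(42)  # seeded only to make the sketch repeatable

events = list(range(1000))
rate = 10
# Each event independently has a 1-in-`rate` chance of being forwarded.
kept = [e for e in events if random.randrange(rate) == 0]
print(len(kept))  # close to 100; the exact count varies with the seed
```

Because each decision is random, the kept count hovers around 100 rather than landing on it exactly; the counter-based variant gives an exact 1-in-N split.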

Consistent Sampling by Field

Scenario: Sample 20% of events consistently in the production environment. Assume 1,000 events enter the transform.

The Sample transform is configured as follows:

| Setting | Value |
| --- | --- |
| Entire Transform | Enabled |
| Enable This Part of Transform | Enabled |
| Sample Rate | 5 |

Conditions:

| Condition | Label | Label Condition | Value |
| --- | --- | --- | --- |
| AND | environment | equals | production |

Results: Events matching the condition are sampled at a rate of 1/5, so 80% of them are dropped. This means only 200 events are kept from the original 1,000 events.
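The condition-plus-rate behavior in this example can be sketched as below. The field name, helper, and bypass behavior are assumptions drawn from the descriptions above, not Observo AI's actual code:

```python
def sample_with_condition(events, rate):
    # Hypothetical sketch: events matching the condition
    # (environment == "production") are sampled at 1/rate;
    # all other events bypass the transform unchanged.
    kept, matched = [], 0
    for event in events:
        if event.get("environment") == "production":
            if matched % rate == 0:
                kept.append(event)
            matched += 1
        else:
            kept.append(event)  # condition not met: bypass, keep as-is
    return kept

events = [{"environment": "production", "id": i} for i in range(1000)]
print(len(sample_with_condition(events, 5)))  # 200
```

With 1,000 matching events and a rate of 5, the sketch keeps 200 events, matching the scenario's result; non-matching events would pass through untouched.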

Techniques

  • Random Sampling: Each log entry has an equal probability of being chosen. This method is effective when logs are relatively uniform, and there is no need for bias in selection.

  • Consistent Sampling: Groups logs by one or more fields, such as environment, and samples within each group. This method ensures that every value of the specified field remains represented in the sample.
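The grouped technique above can be sketched as per-group counters, so each distinct field value is sampled independently and none disappears from the output. This is a hypothetical illustration; the field name and helper are assumptions:

```python
from collections import defaultdict

def grouped_sample(events, field, rate):
    # Stratified sketch: keep 1 out of every `rate` events within each
    # distinct value of `field`, so every group stays represented.
    counters = defaultdict(int)
    kept = []
    for event in events:
        key = event.get(field)
        if counters[key] % rate == 0:
            kept.append(event)
        counters[key] += 1
    return kept

events = ([{"environment": "production"}] * 800 +
          [{"environment": "staging"}] * 200)
kept = grouped_sample(events, "environment", 10)
print(len(kept))  # 80 production + 20 staging = 100
```

Unlike pure random sampling, a low-volume group (here, staging) can never be dropped entirely, because each group keeps its own 1-in-N counter.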

Limitations

  • Sampling reduces the volume of data, which may result in the loss of some events. Ensure that sampling is appropriate for your use case.

Additional Resources

  • Filter Event: Apply conditions to filter data before or after sampling.

  • Throttle: Limit the rate of events passing through the pipeline.
