Filter Events
The Filter Events function in Observo AI allows you to selectively include or exclude events from your data pipeline based on defined conditions. This helps reduce noise, improve data quality, and optimize storage and processing costs.
Purpose
Use the Filter Events function to control which events flow through your pipeline by:
Removing unwanted data such as debug logs, health checks, or test environment events
Isolating specific events for targeted processing, routing, or analysis
Reducing data volume to lower ingestion and storage costs
Improving data quality by filtering out noisy or irrelevant events
Meeting compliance requirements by excluding sensitive or regulated data
How It Works
The Filter Events function evaluates each incoming event against your defined conditions:
Condition Evaluation: Each event is tested against your filter conditions
Drop Data Setting: Determines whether matching events are kept or removed
Drop Data OFF (default): Events that match conditions pass through; others are blocked
Drop Data ON: Events that match conditions are dropped; others pass through
Event Processing: Allowed events continue to downstream transforms and destinations
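The decision logic above can be sketched in a few lines. This is a minimal illustration, not the product API; the function name and parameters are assumptions, and `matches` stands in for your configured filter conditions.

```python
# Minimal sketch of the Filter Events decision logic. The function name and
# parameters are illustrative, not part of the Observo AI product API;
# `matches` stands in for the configured filter conditions.

def filter_event(event: dict, matches, enabled: bool = True, drop_data: bool = False) -> bool:
    """Return True if the event should continue downstream."""
    if not enabled:
        return True                # Enabled OFF: bypass evaluation entirely
    matched = matches(event)
    if drop_data:
        return not matched         # Drop Data ON: matching events are dropped
    return matched                 # Drop Data OFF: only matching events pass

# Include mode: keep only critical-severity logs
is_critical = lambda e: e.get("log", {}).get("level") == "critical"
assert filter_event({"log": {"level": "critical"}}, is_critical) is True
assert filter_event({"log": {"level": "info"}}, is_critical) is False
```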
Configuration
To configure the Filter Events function:
Select Filter Events transform from the function library
Add a Name (required) and Description (optional)
Configure the filter settings:
Filter Settings
Enabled
Default: ON (enabled)
Purpose: Controls whether the filter actively evaluates events
When ON: All events are evaluated against filter conditions
When OFF: All events bypass this transform without evaluation
Drop Data
Default: OFF (disabled)
Purpose: Determines whether matching events are kept or removed
When OFF: Events matching conditions pass through (include mode)
When ON: Events matching conditions are dropped (exclude mode)
Filter Conditions
Default: Empty
Purpose: Define the logic that determines which events are affected
Options: Build conditions using:
+Rule: Add a single condition (field, operator, value)
+Group: Add a nested group of conditions with AND/OR logic
Operators: Select from the list of available operators; the options depend on the field type.
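The +Rule / +Group structure can be thought of as a small recursive evaluator. The node layout, operator names, and helper below are assumptions for illustration only, not Observo AI's internal representation.

```python
# Illustrative sketch of evaluating nested +Rule / +Group conditions.
# The node layout and operator names are assumptions, not the product's
# internal representation.

OPERATORS = {
    "equals": lambda actual, expected: actual == expected,
    "contains": lambda actual, expected: expected in (actual or ""),
}

def get_field(event, path):
    """Resolve a dotted field path such as 'log.level'."""
    for key in path.split("."):
        if not isinstance(event, dict):
            return None
        event = event.get(key)
    return event

def evaluate(node, event):
    if "rules" in node:  # a +Group node: combine children with AND/OR
        results = (evaluate(child, event) for child in node["rules"])
        return all(results) if node["logic"] == "AND" else any(results)
    # a +Rule node: field, operator, value
    return OPERATORS[node["operator"]](get_field(event, node["field"]), node["value"])

condition = {"logic": "AND", "rules": [
    {"field": "log.level", "operator": "equals", "value": "info"},
    {"field": "log.source", "operator": "contains", "value": "api"},
]}
event = {"log": {"level": "info", "source": "api-gateway"}}
assert evaluate(condition, event) is True
```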
Usage
Include Mode (Drop Data OFF)
Use this mode when you want to keep only specific events.
The filter acts as a whitelist: only events matching your conditions pass through.
Common Use Cases:
Keep only error or critical severity logs
Process events from specific sources or applications
Include only production environment data
Filter events within a specific time range or value threshold
Example: Keep only critical severity events
Drop Data: OFF
Conditions: log.level equals "critical"
Result: Only critical logs pass through; all other severity levels are blocked
Exclude Mode (Drop Data ON)
Use this mode when you want to remove specific events.
The filter acts as a blacklist: events matching your conditions are dropped.
Common Use Cases:
Remove debug or verbose logging
Exclude health check or monitoring probe events
Drop test environment data
Filter out events with null or empty values
Example: Remove debug logs
Drop Data: ON
Conditions: log.level equals "debug"
Result: Debug logs are dropped; all other severity levels pass through
Examples
Example 1: Drop Events with Specific Pattern (Single Condition)
Scenario: Drop events where the log.message field starts with "Bad" and contains the word "entry".
Configuration:
Enabled
ON
Drop Data
ON
Filter Conditions:
OR
log.message
matches regex
^Bad.+entry.+$
Result: Any log.message that starts with "Bad" and contains "entry" (with at least one character before and after "entry") is dropped from the pipeline.
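Before enabling a regex filter, it helps to sanity-check the pattern against sample messages locally. The snippet below uses Python's `re` module; this assumes a Python-compatible regex dialect, and Observo AI's exact dialect may differ.

```python
import re

# Sanity-check the pattern from Example 1. Note that ^Bad.+entry.+$ requires
# at least one character between "Bad" and "entry" and at least one after
# "entry", so "Bad entry" alone does not match.
pattern = re.compile(r"^Bad.+entry.+$")

assert pattern.match("Bad config entry found") is not None  # dropped
assert pattern.match("Good entry here") is None             # kept: no "Bad" prefix
assert pattern.match("Bad entry") is None                   # kept: nothing after "entry"
```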
Example 2: Drop Events with Multiple Conditions (AND Logic)
Sample Log Event:
{
"log": {
"level": "info",
"message": "Request processed successfully",
"source": "api"
}
}
Scenario: Drop events that are both "info" level AND from "api" source.
Configuration:
Enabled
ON
Drop Data
ON
Filter Conditions:
AND
log.level
equals
info
AND
log.source
contains
api
Result: Events with log.level = "info" AND log.source containing "api" are dropped. Both conditions must be true for the event to be dropped.
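The AND evaluation in this example can be spot-checked against the sample event using plain Python stand-ins for the equals and contains operators:

```python
# Checking Example 2's conditions against the sample log event, with plain
# Python expressions standing in for the equals / contains operators.
event = {"log": {"level": "info",
                 "message": "Request processed successfully",
                 "source": "api"}}

level_is_info = event["log"]["level"] == "info"   # equals
source_has_api = "api" in event["log"]["source"]  # contains
drop = level_is_info and source_has_api           # AND: both must hold

assert drop is True  # this event would be dropped
```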
Example 3: Separate PAN Traffic & Threat Logs
Scenario: You have a mixed stream of Palo Alto Networks (PAN) logs containing both Traffic and Threat events. You want to route them separately for different processing and destinations.
Solution: Create two Filter Events transforms in your pipeline:
Transform 1: Get Threat Events
Configuration:
Enabled
ON
Drop Data
OFF
Filter Conditions:
AND
palo_alto.log_type
equals
THREAT
Sample Output:
{
"appname": "pan",
"facility": "lpr",
"hostname": "cgen",
"palo_alto": {
"future_use1": "1",
"log_subtype": "end",
"log_type": "THREAT",
"receive_time": "2025/02/20 16:42:00",
"serial_number": "007051000113358",
"version": "0"
},
"severity": "alert",
"source_ip": "192.168.3.48",
"timestamp": "2025-02-20T16:42:00.735Z"
}
Transform 2: Get Traffic Events
Configuration:
Enabled
ON
Drop Data
OFF
Filter Conditions:
AND
palo_alto.log_type
equals
TRAFFIC
Sample Output:
{
"appname": "pan",
"facility": "lpr",
"hostname": "cgen",
"palo_alto": {
"Action": "allow",
"log_type": "TRAFFIC",
"packets": "42",
"packets_in": "0"
},
"severity": "alert",
"source_ip": "192.168.3.48",
"timestamp": "2025-02-20T16:41:53.851Z"
}
Benefits of Separation:
Apply custom transformations specific to each log type (enrichment, reduction, sampling)
Route events to different destinations (e.g., Splunk for Threat, S3 for Traffic)
Support compliance and retention policies based on log category
Improve performance and data fidelity
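Conceptually, the two include-mode filters in Example 3 split one stream into two branches. The sketch below mirrors that behavior in plain Python; the function name and destination comments are illustrative, not part of the product.

```python
# Sketch of Example 3: two include-mode (Drop Data OFF) filters splitting a
# mixed PAN stream into Threat and Traffic branches. The function name and
# destinations are illustrative.

def split_pan_stream(events):
    threat, traffic = [], []
    for event in events:
        log_type = event.get("palo_alto", {}).get("log_type")
        if log_type == "THREAT":     # Transform 1 passes only this branch
            threat.append(event)
        elif log_type == "TRAFFIC":  # Transform 2 passes only this branch
            traffic.append(event)
    return threat, traffic

events = [
    {"palo_alto": {"log_type": "THREAT"}},
    {"palo_alto": {"log_type": "TRAFFIC"}},
]
threat, traffic = split_pan_stream(events)
# threat could then route to e.g. Splunk, traffic to e.g. S3
```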
Best Practices
1. Test Before Enabling in Production
Always test your filter conditions in a development or staging environment before deploying to production. This prevents accidental data loss from overly aggressive filters.
2. Start with Include Mode (Drop Data OFF)
When defining new filters, start with Drop Data OFF to explicitly specify what you want to keep.
3. Use Specific Conditions
Be as specific as possible in your filter conditions to avoid unintended consequences:
Avoid overly broad filters (e.g., dropping all "info" logs without considering source)
Combine multiple conditions using AND logic for precision
4. Combine with Other Optimization Functions
Maximize cost savings and efficiency by combining Filter Events with complementary functions:
Sample Function
Use Sample Function to retain only a percentage of high-volume log streams while maintaining visibility for analysis.
Remove Fields Function
Use Remove Fields Function to drop unnecessary fields from events, reducing log size and ingestion costs.
Dedupe Function
Use Dedupe Function to eliminate duplicate events based on specified fields, improving storage efficiency.
5. Use Regex Carefully
While regex patterns are powerful, they can impact performance:
Keep regex patterns simple and efficient
Test regex performance with expected data volumes
Consider using simpler operators (equals, contains) when possible
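To illustrate the last point: a literal, unanchored pattern is equivalent to a plain substring check, so the simpler "contains" operator gives the same result without regex-engine overhead (sketched here in Python for comparison only):

```python
import re

# A literal, unanchored regex and a plain substring test give the same
# answer; the substring check ("contains") is the simpler choice.
message = "health check ping from probe-42"

regex_match = bool(re.search(r"health check", message))  # regex operator
contains_match = "health check" in message               # contains operator

assert regex_match and contains_match
```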
Limitations
Data Recovery
Dropped events cannot be recovered once they are removed from the pipeline. Always ensure your filter conditions are correct before enabling in production.
Performance Impact
Complex filter conditions may impact pipeline performance, especially:
Deeply nested condition groups
Multiple regex pattern matching
High-cardinality field evaluations
Very large numbers of OR conditions
Field Availability
Filter conditions can only reference fields that exist in events at the time of filtering:
Cannot filter on fields created by downstream transforms
Field names must exactly match (case-sensitive)
Nested fields require proper path notation
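The path-notation and case-sensitivity points can be illustrated with a small lookup helper; the helper itself is an assumption for illustration, not the product's implementation.

```python
# Illustrative dotted-path lookup for nested fields (e.g. "log.level").
# Field names are case-sensitive, and a missing field resolves to nothing,
# so a condition referencing it cannot match.

def get_field(event, path):
    current = event
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None  # field absent at filtering time
        current = current[key]
    return current

event = {"log": {"level": "info"}}
assert get_field(event, "log.level") == "info"
assert get_field(event, "Log.Level") is None   # case-sensitive: lookup fails
assert get_field(event, "log.missing") is None # field does not exist yet
```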
Troubleshooting
Problem: No Events Passing Through
Solutions:
Temporarily disable the filter (toggle Enabled OFF) to verify upstream data flow
Review sample events to confirm field names and values
Check the Drop Data setting matches your intent (include vs. exclude mode)
Simplify conditions to test one at a time
Use the pipeline preview or test mode to see which conditions are matching
Problem: Unexpected Events Being Dropped
Solutions:
Review the exact operators used (equals vs. contains, etc.)
Test conditions against known event samples
Add null checks for optional fields
Verify data types match expected values
Check for typos in field names or values
Problem: Filter Not Evaluating
Solutions:
Verify Enabled toggle is ON
Check that filter conditions are properly configured and saved
Review pipeline flow to ensure transform is in correct position
Check for any configuration errors or validation warnings
Problem: Performance Degradation
Solutions:
Simplify condition logic where possible
Reduce the use of regex; use simpler operators when possible
Move filter earlier in pipeline to reduce downstream processing
Optimize regex patterns for performance
Consider splitting complex filters into multiple stages
Related Functions
Enhance your data pipeline by combining Filter Events with these related functions:
Sample Function: Retain a percentage of events for representative analysis of high-volume streams
Remove Fields Function: Drop unnecessary fields to reduce event size and storage costs
Dedupe Function: Remove duplicate events based on specified fields
Add Fields Function: Add computed or static fields before filtering
Rename Fields Function: Standardize field names across different log sources