Cloudflare
Integrate Cloudflare as a source in the Observo AI platform using either the Splunk HEC Source or the S3 Source. This allows high-volume HTTP, DNS, and security logs to be securely streamed into Observo AI for intelligent routing, enrichment, and real-time analysis.
Purpose
Cloudflare's Logpush exports granular edge telemetry in near real-time. Observo AI receives this data via:
Splunk HEC for low-latency ingestion
AWS S3 for cost-effective storage and batch analytics
Observo AI processes these logs using AI-driven filters, enrichments, and destination rules—helping enterprises reduce SIEM costs, detect threats faster, and maintain audit visibility.
Prerequisites
Before configuring the Cloudflare integration in Observo AI, ensure the following requirements are met so that data ingestion proceeds smoothly:
Observo AI Platform Setup:
The Observo AI platform must be installed and operational, with support for either Splunk HEC Source or S3 Source configurations.
Verify that the platform supports common data formats such as JSON and compressed formats (gzip, parquet), as Cloudflare logs are typically delivered in these formats.
Ensure proper network connectivity and firewall configurations to allow traffic from Cloudflare's egress IP ranges.
Cloudflare Account Requirements:
An active Cloudflare account with Logpush feature enabled (available on Pro, Business, and Enterprise plans).
Administrative access or API token permissions to configure Logpush jobs in the Cloudflare dashboard.
Access to Cloudflare Analytics & Logs section for configuration and monitoring.
Authentication and Security:
For HEC Integration: Prepare OAuth2 tokens or API keys for secure HTTP endpoint access.
For S3 Integration: Configure AWS IAM roles or access keys with appropriate S3 bucket permissions.
Valid TLS certificates (TLS 1.2+ required) with proper CA chain for secure data transmission.
IP allowlisting capabilities to restrict access to Cloudflare's official egress IP ranges.
Network and Connectivity:
Ensure Observo AI can receive HTTPS traffic on designated ports (typically 8088 or 10088 for HEC).
For S3 integration, verify read access to the designated S3 bucket with proper IAM policies.
Configure load balancers or reverse proxies for high availability and security hardening.
Establish proper DNS resolution for all endpoints involved in the integration.
Observo AI Platform
Must support Splunk HEC or S3 Source with JSON parsing
Verify TLS 1.2+ support and compression handling
Cloudflare Account
Active account with Logpush enabled and admin access
Pro/Business/Enterprise plans required for full features
Authentication
OAuth2/API tokens for HEC or AWS IAM for S3
Rotate credentials every 90 days minimum
Network Security
HTTPS endpoints with proper TLS and IP restrictions
Use trusted CA certificates, avoid self-signed in production
Integration Options
Observo AI supports two primary integration methods for Cloudflare log ingestion, each with distinct advantages for different enterprise requirements:
Option 1: Integration via Splunk HEC
Best for: Real-time analytics, immediate alerting, and low-latency requirements
Advantages:
Real-time log streaming with minimal latency
Direct authentication through secure tokens
Immediate data availability for analysis and alerting
Simplified architecture with fewer components
Considerations:
Requires stable HTTPS endpoint with high availability
Direct exposure to internet traffic requiring robust security measures
Real-time processing demands higher resource allocation
Option 2: AWS S3 Bucket Integration
Best for: High-volume environments, cost optimization, and batch processing
Advantages:
Cost-effective for large log volumes with S3 storage economics
Built-in redundancy and durability through AWS S3
Flexible processing schedules and batch optimization
Natural integration with existing AWS infrastructure
Considerations:
Slight delay in data availability due to polling intervals
Additional AWS costs for S3 storage and API calls
Requires AWS infrastructure and IAM management
Integration
Integration Option 1: Cloudflare → Observo AI via Splunk HEC
This section outlines the configuration for direct HTTPS integration using Splunk HEC source.
Observo AI Configuration:
Log in to Observo AI:
Navigate to the Sources tab
Click the Add Source button and select Create New
Choose Splunk HEC from the list of available sources to begin configuration
Refer to the Splunk HEC source documentation for configuration details
Cloudflare Configuration:
Access Cloudflare Dashboard:
Navigate to Analytics → Logs → Logpush
Click Create Job to begin configuration
Dataset Selection:
Choose appropriate dataset(s):
HTTP requests for web traffic analysis
Firewall events for security monitoring
DNS queries for DNS analytics
Multiple datasets can be configured with separate jobs
Destination Configuration:
Type: HTTPS
URL:
<Observo push source endpoint>/services/collector/raw
Retrieve the push source endpoint from the Observo UI for the Splunk HEC source you created in step 1.
Headers: Add
Authorization: <Your Auth Code>
Ensure the same auth code is configured in the Observo Splunk HEC source.
Compression: Enable gzip compression
Advanced Settings:
Frequency: Real-time or batch (recommended: real-time for security events)
Format: JSON (recommended) or CSV
Field Selection: Choose relevant fields based on use case requirements
Filtering: Apply filters to reduce unnecessary log volume
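As an alternative to the dashboard steps above, Logpush jobs can also be created through Cloudflare's v4 API. The sketch below prepares such a request in Python; the zone ID, API token, Observo endpoint, and auth code are placeholders, and the exact job-body fields (`destination_conf`, `output_options`) should be verified against Cloudflare's current Logpush API reference.

```python
import json
import urllib.request

def build_logpush_job(observo_endpoint: str, auth_code: str) -> dict:
    """Build the request body for a Cloudflare Logpush job that pushes
    HTTP request logs to an Observo AI Splunk HEC raw endpoint."""
    return {
        "name": "observo-http-requests",   # any unique job label
        "dataset": "http_requests",        # or firewall_events, dns_logs, ...
        # For HTTPS destinations, custom headers are passed as
        # header_* query parameters on destination_conf.
        "destination_conf": (
            f"{observo_endpoint}/services/collector/raw"
            f"?header_Authorization={auth_code}"
        ),
        "output_options": {"output_type": "ndjson", "timestamp_format": "rfc3339"},
        "enabled": True,
    }

def create_job(zone_id: str, api_token: str, body: dict) -> urllib.request.Request:
    """Prepare (not send) the POST against the Cloudflare v4 API.
    Dispatch with urllib.request.urlopen(...) once the values are real."""
    return urllib.request.Request(
        f"https://api.cloudflare.com/client/v4/zones/{zone_id}/logpush/jobs",
        data=json.dumps(body).encode(),
        headers={"Authorization": f"Bearer {api_token}",
                 "Content-Type": "application/json"},
        method="POST",
    )
```

The request is only constructed here so the sketch runs without credentials; sending it requires a valid zone ID and an API token with Logpush edit permissions.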
Integration Option 2: Cloudflare → S3 → Observo AI via S3 Source
This section outlines the configuration for S3-based integration using batch processing.
AWS S3 Configuration:
Create S3 Bucket:
Refer to AWS S3 documentation for more details.
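Cloudflare's Logpush service needs write access to the bucket. A sketch of a bucket policy granting it, assuming the path pattern used later in this guide; the bucket name is a placeholder, and the Cloudflare AWS account ID shown should be verified against Cloudflare's current Logpush documentation before use.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowCloudflareLogpushWrite",
      "Effect": "Allow",
      "Principal": {"AWS": "arn:aws:iam::391854517948:root"},
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::your-cf-logs-bucket/logs/cloudflare/*"
    }
  ]
}
```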
Cloudflare Configuration:
Logpush Job Creation:
Navigate to Analytics → Logs → Logpush → Create Job
Select desired dataset for logging
S3 Destination Setup:
Destination Type: Amazon S3
Bucket:
<S3 bucket name created in step 1>
Region: Match your S3 bucket region
Path Pattern:
logs/cloudflare/{YYYY}/{MM}/{DD}/{HH}/
Filename Pattern:
cf-logs-{TIMESTAMP}-{BATCH_ID}.json.gz
Format and Compression:
Format: JSON or Parquet (JSON recommended for flexibility)
Compression: gzip (recommended for bandwidth optimization)
Field Selection: Configure based on analytical requirements
Observo AI S3 Source Configuration:
S3 Source Setup:
Navigate to Sources → Add Source → S3
Refer to the AWS S3 source documentation for configuration details
Test Configuration
For HEC Integration:
Save the configuration in the Observo AI interface
Use curl to test the HEC endpoint with sample Cloudflare log data
Verify token authentication and TLS connectivity
Monitor Observo AI logs for successful ingestion
Validate log parsing and field extraction in the Analytics tab
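The curl test in step 2 can equally be scripted. A minimal Python sketch that gzips a sample Cloudflare-style record and prepares a POST against the HEC raw endpoint; the endpoint URL and auth code are placeholders, and the sample fields are a small subset of Cloudflare's HTTP requests dataset.

```python
import gzip
import json
import urllib.request

# Illustrative record using a few Cloudflare HTTP request log fields.
SAMPLE_EVENT = {
    "ClientIP": "203.0.113.10",
    "ClientRequestHost": "example.com",
    "ClientRequestMethod": "GET",
    "ClientRequestURI": "/",
    "EdgeResponseStatus": 200,
    "EdgeStartTimestamp": "2024-01-01T00:00:00Z",
}

def build_test_request(endpoint: str, auth_code: str, event: dict) -> urllib.request.Request:
    """Prepare a gzip-compressed POST mimicking a Logpush delivery.
    Send it with urllib.request.urlopen(...) against a live endpoint."""
    body = gzip.compress(json.dumps(event).encode())
    return urllib.request.Request(
        f"{endpoint}/services/collector/raw",
        data=body,
        headers={
            "Authorization": auth_code,   # same value configured on the Observo source
            "Content-Encoding": "gzip",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

A 2xx response from the endpoint, followed by the event appearing in the Analytics tab, confirms token authentication, TLS connectivity, and parsing in one pass.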
For S3 Integration:
Create test files in the S3 bucket with sample Cloudflare data
Verify IAM permissions and bucket access
Monitor S3 source polling and file processing
Confirm automatic JSON parsing and field extraction
Validate data flow through to downstream systems
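For step 1 above, a test object matching the path and filename patterns from the Logpush configuration can be generated locally and then uploaded to the bucket (e.g. with the AWS CLI). The record fields below are a small illustrative subset, not the full Cloudflare schema.

```python
import gzip
import json
import time
import uuid
from pathlib import Path

def write_sample_log(directory: Path) -> Path:
    """Write a gzipped NDJSON file shaped like a Cloudflare Logpush batch,
    using the cf-logs-{TIMESTAMP}-{BATCH_ID}.json.gz filename pattern."""
    records = [
        {"ClientIP": "203.0.113.10", "EdgeResponseStatus": 200,
         "ClientRequestURI": "/", "EdgeStartTimestamp": "2024-01-01T00:00:00Z"},
        {"ClientIP": "198.51.100.7", "EdgeResponseStatus": 403,
         "ClientRequestURI": "/admin", "EdgeStartTimestamp": "2024-01-01T00:00:01Z"},
    ]
    name = f"cf-logs-{int(time.time())}-{uuid.uuid4().hex[:8]}.json.gz"
    path = directory / name
    # Logpush delivers newline-delimited JSON: one record per line.
    payload = "\n".join(json.dumps(r) for r in records).encode()
    path.write_bytes(gzip.compress(payload))
    return path
```

Once the S3 source picks the object up, both records should surface in Observo AI with their fields extracted, confirming decompression and JSON parsing end to end.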
Scenario Troubleshooting
HEC Integration Issues:
401 Unauthorized: Verify HEC token validity and IP allowlist configuration
TLS Handshake Failures: Check certificate validity, CN/SAN matching, and TLS version compatibility
Connection Timeouts: Validate network connectivity, firewall rules, and load balancer configuration
High Latency: Optimize compression settings, review network path, and check processing capacity
S3 Integration Issues:
Access Denied: Verify IAM role permissions, bucket policies, and cross-account access
Files Not Processed: Check file patterns, path prefixes, and polling interval configuration
Parsing Errors: Validate JSON format, compression handling, and field extraction rules
Duplicate Processing: Ensure checkpointing is enabled and functioning correctly
Common Issues for Both Methods:
Log Format Mismatches: Use Cloudflare's field reference and test with sample data
Volume Overload: Implement rate limiting, batch processing optimization, and capacity scaling
Authentication Expiry: Establish token rotation procedures and monitoring
Network Security: Regularly update Cloudflare IP ranges and monitor for unauthorized access
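Cloudflare publishes its egress ranges at https://www.cloudflare.com/ips-v4 and https://www.cloudflare.com/ips-v6. A small sketch for checking a source address against a downloaded copy of that list; the sample ranges below are illustrative placeholders, not the current published list.

```python
import ipaddress

def in_allowlist(source_ip: str, cidrs: list[str]) -> bool:
    """Return True if source_ip falls inside any allowlisted CIDR block."""
    addr = ipaddress.ip_address(source_ip)
    # Mixed v4/v6 comparisons simply evaluate to False, so one list
    # may hold both families.
    return any(addr in ipaddress.ip_network(c) for c in cidrs)

# Replace with the ranges fetched from Cloudflare's published lists.
SAMPLE_RANGES = ["198.51.100.0/24", "203.0.113.0/24"]
```

Running this periodically against freshly fetched ranges helps catch allowlist drift before legitimate Logpush traffic starts getting rejected.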
Security Best Practices
Authentication and Access Control
Token Management: Rotate HEC tokens every 90 days minimum, use strong token generation
IP Restrictions: Maintain updated allowlists with official Cloudflare egress IP ranges
Multi-Factor Authentication: Enable MFA for all administrative access to logging configurations
Principle of Least Privilege: Grant minimal necessary permissions for service accounts
Transport Security
TLS Configuration: Use TLS 1.2 minimum, prefer TLS 1.3 for enhanced security
Certificate Management: Use trusted CA certificates, avoid self-signed certificates in production
Cipher Suites: Restrict to modern, secure cipher suites, disable legacy protocols
Certificate Monitoring: Implement automated certificate expiry monitoring and renewal
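Automated expiry monitoring can be as simple as a scheduled TLS probe. A minimal sketch using only the standard library; the hostname is whatever endpoint receives Logpush traffic, and the alerting threshold is a tunable, not a platform default.

```python
import ssl
import socket
from datetime import datetime, timezone

def days_until_expiry(not_after: str) -> float:
    """Days remaining given a cert's notAfter field,
    e.g. 'Jan  1 00:00:00 2099 GMT' (negative if already expired)."""
    expires = datetime.strptime(
        not_after, "%b %d %H:%M:%S %Y %Z"
    ).replace(tzinfo=timezone.utc)
    return (expires - datetime.now(timezone.utc)).total_seconds() / 86400

def check_endpoint(host: str, port: int = 443) -> float:
    """Fetch the peer certificate over TLS and return days until expiry."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return days_until_expiry(tls.getpeercert()["notAfter"])
```

Alerting when the returned value drops below, say, 30 days leaves room to renew before Logpush deliveries start failing TLS validation.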
Network Security
Firewall Rules: Configure strict ingress rules allowing only necessary traffic
DDoS Protection: Implement rate limiting and DDoS mitigation at infrastructure level
Load Balancer Security: Use WAF rules and health checks for additional protection
Network Segmentation: Isolate logging infrastructure from other network segments
Data Protection
Encryption at Rest: Enable encryption for S3 buckets and local storage
Data Retention: Implement appropriate retention policies based on compliance requirements
Access Logging: Monitor and log all access to logging infrastructure and data
Data Masking: Apply data masking for sensitive information in log streams
Troubleshooting
If issues arise with the Cloudflare Integration in Observo AI, use the following comprehensive steps to diagnose and resolve them:
Configuration Validation
Endpoint Verification: Ensure all URLs, ports, and paths are correctly specified and accessible
Authentication Check: Verify tokens, certificates, and credentials are valid and not expired
Format Validation: Confirm log formats match expected JSON structure and field mappings
Network Connectivity: Test connectivity using tools like curl, telnet, or AWS CLI
Common Error Messages and Resolutions
"401 Unauthorized"
Invalid or expired HEC token
Verify token validity in Cloudflare and Observo AI configuration
"403 Forbidden"
IP not in allowlist or insufficient permissions
Update IP allowlist with current Cloudflare ranges, check IAM policies
"SSL Handshake Failed"
Certificate or TLS version mismatch
Verify certificate validity, check TLS version compatibility
"Connection Timeout"
Network connectivity or firewall issues
Test network path, review firewall rules, check load balancer health
"Access Denied (S3)"
IAM permissions or bucket policy issue
Verify IAM role permissions and S3 bucket policies
"JSON Parse Error"
Log format mismatch or corruption
Validate sample logs against expected JSON schema
"Rate Limit Exceeded"
Too many requests or large log volume
Adjust rate limits, implement backoff strategies, optimize batch sizes
"Certificate Expired"
TLS certificate has expired
Renew certificate and update configuration
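For the rate-limit case above, a capped exponential backoff with jitter is the usual remedy. A minimal sketch; the base delay and cap are tunables, not Observo AI or Cloudflare defaults.

```python
import random

def backoff_delays(attempts: int, base: float = 1.0, cap: float = 30.0,
                   jitter: bool = True) -> list[float]:
    """Delays (seconds) before each retry: base * 2^n, capped, with
    optional full jitter so clients don't retry in lockstep."""
    delays = []
    for n in range(attempts):
        d = min(base * (2 ** n), cap)
        delays.append(random.uniform(0, d) if jitter else d)
    return delays
```

With the defaults, six attempts without jitter yield 1, 2, 4, 8, 16, then 30 seconds; enabling jitter spreads each wait uniformly below that ceiling.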
Monitoring and Validation
Log Volume Monitoring: Track expected vs. actual log volume to identify gaps
Error Rate Analysis: Monitor ingestion error rates and patterns
Latency Measurement: Measure end-to-end latency from Cloudflare to Observo AI
Data Quality Checks: Validate field extraction, timestamp parsing, and data completeness
Performance Optimization
Compression Settings: Optimize compression levels for bandwidth vs. CPU trade-offs
Batch Size Tuning: Adjust batch sizes for optimal throughput and latency
Polling Frequency: Balance polling frequency with API costs and latency requirements
Resource Scaling: Monitor CPU, memory, and network utilization for scaling decisions
Advanced Troubleshooting
Network Trace Analysis: Use packet capture tools to analyze network-level issues
Log Analysis: Enable debug logging for detailed troubleshooting information
Health Check Implementation: Implement comprehensive health checks for all components
Backup Procedures: Establish procedures for failover and data recovery scenarios