Splunk HEC
The Splunk HEC Source in Observo AI facilitates secure, scalable ingestion of Splunk data via HTTP/S, supporting real-time analytics and observability with optional TLS encryption and HEC token authentication.
Purpose
The Splunk HEC Source in Observo AI enables the ingestion of high-volume Splunk data, such as transaction logs, via HTTP/S for real-time observability and analytics. It supports secure data transfer with optional TLS encryption and HEC token authentication, ensuring compliance with enterprise security standards. Designed for scalability, it integrates with load balancers to handle large-scale data from Splunk forwarders, as exemplified by FinServer Insights for financial transaction monitoring. The configuration allows flexible parsing and archival options to support advanced analytics and long-term data retention.
Prerequisites
Before configuring the Splunk HEC Source in Observo AI, ensure the following requirements are met to facilitate seamless data ingestion:
Observo AI Platform Setup:
The Observo AI Site must be installed and available.
Network and Connectivity:
Ensure Observo AI can receive Splunk HEC data over HTTP/S. The default port range is [10000-10200] for HTTP/S.
Check firewall rules, proxy settings, or VPC configurations that may affect connectivity to the specified port.
If using TLS, ensure certificates are properly configured, including CA certificates, host certificates, and private key files for secure communication.
Authentication (Optional):
Prepare one of the following authentication methods as needed:
Certificate-Based Authentication: For TLS-enabled sources, provide paths to the CA certificate, host certificate, and private key files.
HEC Authentication Token: Provide a valid Splunk HEC token for client authentication. Tokens can be generated in Splunk and added to Observo AI.
No Authentication: If TLS is disabled and no authorization tokens are configured, no additional credentials are needed.
Load Balancer (Optional):
For high-volume Splunk HEC data, configure a load balancer such as HAProxy, nginx, or AWS ELB to distribute traffic across Observo AI worker nodes to prevent CPU strain on a single node.
If using a load balancer, consider enabling Proxy Protocol v1 or v2 to preserve the original sender IP address in the X-Forwarded-For header.
| Requirement | Description | Notes |
| --- | --- | --- |
| Network | Connectivity to HTTP/S ports | Default port range is [10000-10200]; check firewall/proxy settings |
| Authentication | TLS certificate or HEC token setup if enabled | Provide CA, host cert, and key files for TLS, or HEC token for authentication |
| Load Balancer | Optional for high-volume HTTP/S data | Use HAProxy, nginx, or AWS ELB; enable Proxy Protocol if needed |
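To make the Proxy Protocol option more concrete: in version 1 of the protocol, the load balancer prepends a single text line to each TCP connection, and that line carries the original client address. The sketch below parses such a line. It only illustrates the wire format; the names `ProxyInfo` and `parse_proxy_v1` are hypothetical, the addresses are made up, and nothing here describes how Observo AI handles the header internally.

```python
from typing import NamedTuple


class ProxyInfo(NamedTuple):
    family: str    # "TCP4" or "TCP6"
    src_ip: str    # original client address
    dst_ip: str    # address the client connected to
    src_port: int
    dst_port: int


def parse_proxy_v1(line: bytes) -> ProxyInfo:
    """Parse one Proxy Protocol v1 header line (terminated by CRLF)."""
    if not line.endswith(b"\r\n"):
        raise ValueError("header must end with CRLF")
    parts = line[:-2].decode("ascii").split(" ")
    # This sketch handles only the TCP4/TCP6 forms of the header.
    if len(parts) != 6 or parts[0] != "PROXY" or parts[1] not in ("TCP4", "TCP6"):
        raise ValueError("not a TCP4/TCP6 Proxy Protocol v1 header")
    _, family, src_ip, dst_ip, src_port, dst_port = parts
    return ProxyInfo(family, src_ip, dst_ip, int(src_port), int(dst_port))


# Example header as a load balancer might send it (illustrative addresses).
info = parse_proxy_v1(b"PROXY TCP4 203.0.113.7 10.0.0.5 56324 10002\r\n")
print(info.src_ip, info.src_port)
```

Version 2 of the protocol carries the same information in a binary header, so it needs a different parser.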
Integration
The Integration section outlines configurations for the Observo AI Splunk HEC Source. To configure Splunk HEC as a source in Observo AI, follow these steps to set up and test the data flow:
Log in to Observo AI:
Navigate to the Sources tab.
Click the "Add Source" button and select "Create New".
Choose "Splunk HEC" from the list of available sources to begin configuration.
General Settings:
Source Type: Splunk HEC.
Name: A unique identifier for the source, such as splunk-hec-source-1.
Description (Optional): Provide a description for the source.
Socket Address: Socket address to listen for connections on, in host:port format. The port should be in the range [10000-10200].
Example: 0.0.0.0:10000
Store HEC token (False): If set to true, the HEC token will be added to each event. This is useful if the event is being written to a Splunk HEC destination.
Valid Authorization Tokens (Add): List of valid authorization tokens. If the token is not in the list, the event will be dropped. If the list is empty, all events will be accepted.
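Two of the General Settings rules above are easy to get wrong: the Socket Address format with its port range, and the token filter, where an empty list means "accept everything". The following is a minimal sketch of both rules as stated in this document. The function names are hypothetical and the code is not Observo AI's implementation.

```python
from typing import List, Optional, Tuple

PORT_MIN, PORT_MAX = 10000, 10200  # port range stated in this document


def parse_socket_address(address: str) -> Tuple[str, int]:
    """Split "host:port" and check the port against the documented range."""
    host, sep, port_text = address.rpartition(":")
    if not sep or not host or not port_text.isdigit():
        raise ValueError(f"expected host:port, got {address!r}")
    port = int(port_text)
    if not PORT_MIN <= port <= PORT_MAX:
        raise ValueError(f"port {port} is outside [{PORT_MIN}-{PORT_MAX}]")
    return host, port


def is_token_accepted(token: Optional[str], valid_tokens: List[str]) -> bool:
    """An empty list accepts every event; otherwise the token must be listed."""
    if not valid_tokens:
        return True
    return token in valid_tokens


print(parse_socket_address("0.0.0.0:10000"))
print(is_token_accepted("unknown-token", ["finance-prod-token"]))
```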
TLS Options (Optional):
TLS Enable (False): Toggle to Enable for secure HTTP/S communication.
CA File Path: Absolute path to an additional CA certificate file for TLS.
The certificate must be in the DER or PEM (X.509) format.
Additionally, the certificate can be provided as an inline string in PEM format.
CRT File Path: Absolute path to a certificate file to identify the server.
The certificate must be in DER, PEM (X.509), or PKCS#12 format.
Additionally, the certificate can be provided as an inline string in PEM format.
If this is set, and is not a PKCS#12 archive, "Private Key File Path" must also be set.
Private Key File Path: Absolute path to a private key file.
The key must be in DER or PEM (PKCS#8) format.
Additionally, the key can be provided as an inline string in PEM format.
Private Key Password: Passphrase used to unlock the encrypted key file.
Verify Client Certificates: Enables client certificate verification by a trusted CA.
For components that create a server, this requires that the client connections have a valid client certificate.
For components that initiate requests, this validates that the upstream has a valid certificate.
If enabled, certificates must not be expired and must be issued by a trusted issuer.
This verification operates in a hierarchical manner, checking that the leaf certificate (the certificate presented by the client/server) is not only valid, but that the issuer of that certificate is also valid, and so on until the verification process reaches a root certificate.
Verify Client Hostname: Enable to ensure the client hostname matches the certificate hostname.
If enabled, the hostname used to connect to the remote host must be present in the TLS certificate presented by the remote host, either as the Common Name or as an entry in the Subject Alternative Name extension.
Only relevant for outgoing connections.
SNI Server Name: (Empty) Server name to use when using Server Name Indication (SNI).
Supported ALPN protocols (Add): Declare the supported ALPN protocols, which are used during communication with peers.
They are prioritized in the order defined.
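As a rough analogy for how the TLS options above fit together, the sketch below builds a server-side TLS context with Python's standard `ssl` module. The mapping from Observo AI field names to `ssl` calls is an assumption made for illustration only, and `build_server_tls_context` is a hypothetical helper. Python's `ssl` module loads PEM files only, so DER or PKCS#12 material would need converting first.

```python
import ssl
from typing import Optional, Sequence


def build_server_tls_context(
    crt_file: Optional[str] = None,       # "CRT File Path"
    key_file: Optional[str] = None,       # "Private Key File Path"
    key_password: Optional[str] = None,   # "Private Key Password"
    ca_file: Optional[str] = None,        # "CA File Path"
    verify_client_certificates: bool = False,
    alpn_protocols: Sequence[str] = (),   # listed in priority order
) -> ssl.SSLContext:
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    if crt_file:
        # Server identity: certificate plus its (optionally encrypted) key.
        ctx.load_cert_chain(certfile=crt_file, keyfile=key_file, password=key_password)
    if ca_file:
        # Additional CA used to validate certificates presented by clients.
        ctx.load_verify_locations(cafile=ca_file)
    if verify_client_certificates:
        # Reject clients that do not present a certificate chaining to a trusted CA.
        ctx.verify_mode = ssl.CERT_REQUIRED
    if alpn_protocols:
        ctx.set_alpn_protocols(list(alpn_protocols))
    return ctx


ctx = build_server_tls_context(
    verify_client_certificates=True,
    alpn_protocols=["h2", "http/1.1"],
)
print(ctx.verify_mode)
```

In a real deployment the certificate, key, and CA paths would be supplied; they are left out of the usage line so the sketch runs without any files on disk.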
Acknowledgement Settings (Optional):
Enabled: (False) Enables end-to-end acknowledgements.
Max Idle Time in Seconds: The maximum time in seconds that an idle connection will be kept open.
Ack Idle Cleanup: If set to true, the idle connections will be cleaned up after the max idle time.
Max Number of Ack Channels: The maximum number of Splunk HEC channels clients can use in this Source.
Max Pending Acks: The maximum number of pending acknowledgements across all channels for the Source.
Max Pending Acks Per Channel: The maximum number of pending acknowledgements per channel for the Source.
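For context on what a "channel" and a "pending ack" are: in Splunk's HEC acknowledgement protocol, a client tags its requests with a channel identifier in the `X-Splunk-Request-Channel` header, receives an `ackId` for each accepted request, and later polls `/services/collector/ack` to ask whether those IDs have been acknowledged. The sketch below builds such a poll request and interprets a response, assuming this source follows the same protocol. The helper names and the example URL are hypothetical, and the authorization value is passed through verbatim.

```python
import json
import urllib.request
import uuid
from typing import Iterable, List


def build_ack_query(base_url: str, authorization: str, channel: str,
                    ack_ids: List[int]) -> urllib.request.Request:
    """Build (but do not send) a request asking for the status of ack IDs."""
    body = json.dumps({"acks": ack_ids}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/services/collector/ack",
        data=body,
        method="POST",
        headers={
            "Authorization": authorization,
            "X-Splunk-Request-Channel": channel,
            "Content-Type": "application/json",
        },
    )


def all_acked(response_body: bytes, ack_ids: Iterable[int]) -> bool:
    """True if every requested ack ID is reported as acknowledged."""
    acks = json.loads(response_body)["acks"]
    return all(acks.get(str(ack_id), False) for ack_id in ack_ids)


channel = str(uuid.uuid4())  # one channel per client
request = build_ack_query("https://observo.example.com:10002", "<token>", channel, [0, 1])
print(request.full_url)
print(all_acked(b'{"acks": {"0": true, "1": false}}', [0, 1]))
```

Each ack ID that has been issued but not yet confirmed counts as pending, which is what the Max Pending Acks limits bound.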
Advanced Settings:
Max connection age in seconds: The maximum amount of time a connection may exist before it is closed by sending a Connection: close header on the HTTP response.
Set this to a large value like 100000000 to "disable" this feature.
Only applies to HTTP/0.9, HTTP/1.0, and HTTP/1.1 requests.
A random jitter configured by "Max connection age jitter factor" is added to the specified duration to spread out connection storms.
Max connection age jitter factor: The factor by which to jitter the "Max connection age in seconds" value.
A value of 0.1 means that the actual duration will be between 90% and 110% of the specified maximum duration.
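The jitter arithmetic can be checked with a few lines of code. This sketch computes the range implied by a given maximum age and jitter factor, following the 90% to 110% example above. Drawing the actual value uniformly from that range is an assumption, since this document does not state the distribution, and both function names are hypothetical.

```python
import random
from typing import Tuple


def jitter_bounds(max_age_secs: float, jitter_factor: float) -> Tuple[float, float]:
    """Return the (shortest, longest) connection age for the given settings."""
    spread = max_age_secs * jitter_factor
    return max_age_secs - spread, max_age_secs + spread


def jittered_age(max_age_secs: float, jitter_factor: float) -> float:
    """Pick one connection age; the uniform distribution is an assumption."""
    low, high = jitter_bounds(max_age_secs, jitter_factor)
    return random.uniform(low, high)


low, high = jitter_bounds(43200, 0.05)
print(f"connections close after roughly {low:.0f} to {high:.0f} seconds")
```

With a 12-hour maximum age and a factor of 0.05, connections would close somewhere between about 11.4 and 12.6 hours after they were opened.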
Parser Config:
Enable Source Log parser: (False)
Toggle the Enable Source Log parser switch to enable parsing.
Select the appropriate parser from the Source Log Parser dropdown.
Add additional parsers as needed.
Pattern Extractor:
Refer to Observo AI’s Pattern Extractor documentation for details on configuring pattern-based data extraction.
Archival Destination:
Toggle the Enable Archival on Source switch to enable archival.
Under Archival Destination, select a destination from the list of Archival Destinations (Required).
Save and Test Configuration:
Save the configuration settings.
Send sample Splunk HEC data, for example via a Splunk forwarder or a curl command, and verify ingestion in Observo AI by monitoring the Analytics tab in the target pipeline.
Sending a Formatted Event
curl https://<observo_site_url>:<port>/services/collector/event -H 'Authorization: <token>' -d '{"event":"Lets go Observo!","index":"main"}'
Sending a Raw Event
curl https://<observo_site_url>:<port>/services/collector/raw -H 'Authorization: <token>' -d 'this is a sample raw event'
Troubleshooting Details:
Add --verbose to the end of the curl command to gather step-by-step details of the connection process.
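When curl is not available, the same two test requests can be assembled with Python's standard library. The sketch below mirrors the curl commands above: it posts to the same two paths and passes the authorization value through unchanged. The helper names and the example host are hypothetical. Note that Splunk's own HEC convention writes the header value as `Splunk <token>`; whether that prefix is needed here depends on how the tokens are configured.

```python
import json
import urllib.request
from typing import Optional


def build_event_request(base_url: str, authorization: str, event: str,
                        index: Optional[str] = None) -> urllib.request.Request:
    """Build a formatted-event request for /services/collector/event."""
    payload = {"event": event}
    if index is not None:
        payload["index"] = index
    return urllib.request.Request(
        url=f"{base_url}/services/collector/event",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={"Authorization": authorization},
    )


def build_raw_request(base_url: str, authorization: str,
                      raw_text: str) -> urllib.request.Request:
    """Build a raw-event request for /services/collector/raw."""
    return urllib.request.Request(
        url=f"{base_url}/services/collector/raw",
        data=raw_text.encode("utf-8"),
        method="POST",
        headers={"Authorization": authorization},
    )


def send(request: urllib.request.Request, timeout: float = 10.0) -> int:
    """Send the request and return the HTTP status code (performs network I/O)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


request = build_event_request("https://observo.example.com:10000", "<token>",
                              "Lets go Observo!", index="main")
print(request.full_url, request.data)
# status = send(request)  # uncomment to actually send the event
```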
Example Scenarios
FinServer Insights, a fictitious company specializing in financial analytics, aims to integrate Observo AI with Splunk HEC to process high-volume transaction logs for large financial institutions. The setup ensures secure data ingestion, high availability with load balancing, and compliance with enterprise security standards.
Standard Splunk HEC Source Setup
Here is a standard Splunk HEC Source configuration example. Only the required sections and their associated field updates are displayed in the tables below:
General Settings
| Field | Value | Description |
| --- | --- | --- |
| Source Type | Splunk HEC | Select "Splunk HEC" from the list of available sources in Observo AI. |
| Name | splunk-hec-finance-prod-1 | Unique identifier for the production Splunk HEC source. |
| Description | Splunk HEC source for financial transaction logs in production | Provides context for the source's purpose. |
| Socket Address | 0.0.0.0:10002 | Host:port format; port within range [10000-10200], behind a load balancer. |
| Store HEC Token | True | Adds HEC token to each event for downstream Splunk HEC destinations. |
| Valid Authorization Tokens | finance-prod-token-9876-zyxw-5432-kjhg | Only events with this token are accepted; it aligns with enterprise security policy. |
TLS Configuration
| Field | Value | Description |
| --- | --- | --- |
| TLS Enable | True | Mandatory for secure HTTPS communication in compliance with enterprise security standards. |
| CA File Path | /etc/observo/certs/enterprise-ca.pem | Path to enterprise CA certificate in PEM format, issued by internal PKI. |
| CRT File Path | /etc/observo/certs/finance-hec-server-cert.pem | Path to server certificate in PEM format, signed by enterprise CA. |
| Private Key File Path | /etc/observo/certs/finance-hec-server-key.pem | Path to private key in PEM format, encrypted for security. |
| Private Key Password | FinSecureKey2025! | Passphrase to unlock the encrypted private key, managed via enterprise secrets vault. |
| Verify Client Certificates | True | Enforces client certificate verification to ensure only trusted clients send data. |
| Verify Client Hostname | True | Validates client hostname against certificate to prevent spoofing. |
| SNI Server Name | hec.finance-prod.example.com | Server name for SNI, matching enterprise DNS configuration. |
| Supported ALPN Protocols | h2, http/1.1 | Prioritizes HTTP/2 for performance, with HTTP/1.1 fallback for compatibility. |
Advanced Settings
| Field | Value | Description |
| --- | --- | --- |
| Max Connection Age in Seconds | 43200 | Sets connection duration to 12 hours to balance stability and resource management. |
| Max Connection Age Jitter Factor | 0.05 | Limits jitter to ±5% to minimize connection storms in high-traffic enterprise environments. |
Additional Enterprise Configuration Notes
Load Balancer Setup: Deploy an AWS ELB to distribute high-volume Splunk HEC traffic across multiple Observo Sites. Enable Proxy Protocol v2 to preserve client IP addresses in the X-Forwarded-For header for audit compliance.
Acknowledgement Settings: Configure for reliability:
Enabled: True
Max Idle Time in Seconds: 600
Ack Idle Cleanup: True
Max Number of Ack Channels: 100
Max Pending Acks: 10000
Max Pending Acks Per Channel: 100
Parser Config: Enable Source Log Parser and select a JSON parser to handle structured financial transaction logs. Add a custom parser for proprietary log formats if needed.
Pattern Extractor: Configure to extract fields like transaction_id, account_number, and timestamp for analytics, per Observo AI’s Pattern Extractor documentation.
Archival Destination: Enable and select an S3-based archival destination for long-term storage, complying with financial data retention policies.
Testing: Validate the configuration by sending a sample event, for example with the curl commands listed under Save and Test Configuration.
Monitoring: Monitor ingestion in the Observo AI Analytics tab for throughput, event counts, and errors. Set up alerts for "Server is busy (503)" errors and adjust the Active request limit in Advanced Settings if needed.
Troubleshooting: If "No data received" errors occur, verify load balancer health checks and VPC firewall rules. For "Invalid token" errors, ensure the token matches the Splunk HEC sender configuration.
Troubleshooting
If issues arise with the Splunk HEC Source in Observo AI, use the following steps to diagnose and resolve them:
Verify Configuration Settings:
Ensure that the Socket Address (host and port) and the HEC endpoint path the sender uses, such as /services/collector/event or /services/collector/raw, match the source configuration.
Confirm that the configured port, within the range [10000-10200] for HTTP/S, is open and accessible.
Verify that the HEC token matches the sender’s configuration.
Check Network Connectivity:
Verify that firewall rules, proxy settings, or VPC configurations allow traffic to the specified HTTP/S port.
Test connectivity using tools like curl or telnet to ensure the Observo AI instance can receive data on the configured port.
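If neither curl nor telnet is installed on the sending host, a plain TCP connection attempt answers the same question: is anything listening on the configured port, and does the network let the connection through? The sketch below is one way to check this with Python's standard library. The function name `can_connect` and the host in the usage line are hypothetical placeholders.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


# Replace the host and port with the values from the source's Socket Address.
print(can_connect("127.0.0.1", 10000, timeout=1.0))
```

A successful connection only shows that the port is reachable. It does not validate TLS settings or the HEC token, which the later steps cover.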
Validate TLS Configuration:
For TLS-enabled sources, ensure CA, certificate, and key files are correctly specified and accessible.
Check for TLS version mismatches, such as a sender using TLS 1.2 while Observo AI expects TLS 1.3. Adjust TLS settings or disable "Verify Client Hostname" if certificate issues occur.
Monitor Logs and Data:
Verify data ingestion by monitoring the Analytics tab in the Observo AI pipeline for throughput and event counts.
Check Observo AI for errors related to parsing, buffering, or connection issues.
Common Error Messages:
"No data received": Ensure the Splunk HEC sender is pointing to the correct IP/port and that network connectivity is not blocked. Verify the sender’s protocol (HTTP/S) matches the source configuration.
"Invalid message format": Confirm that Splunk HEC messages are valid JSON, raw, or S2S events.
"Server is busy (503)": Increase the Active request limit in Advanced Settings to handle more simultaneous HEC requests.
"Invalid token": Ensure the HEC token in the sender matches the token configured in Observo AI. Get the HEC token from the Splunk UI → Data Inputs → HTTP Tokens
| Issue | Possible Cause | Resolution |
| --- | --- | --- |
| No data received | Incorrect IP/port or network block | Verify sender configuration and network connectivity |
| Invalid message format | Non-compliant HEC messages | Verify sender configuration |
| Server is busy (503) | Insufficient active request limit | Increase Active request limit in Advanced Settings |
| Invalid token | Mismatched HEC token | Verify token matches sender configuration |
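When chasing a message-format problem, it can help to check a payload locally before sending it. The sketch below applies two rules from Splunk's HEC event format: the body is one or more JSON objects placed back to back, and each object needs an `event` key. The function name `check_hec_event_payload` is hypothetical, and the exact conditions under which Observo AI reports "Invalid message format" are not stated in this document, so treat this as a first-pass check only.

```python
import json
from typing import List


def check_hec_event_payload(body: str) -> List[str]:
    """Return a list of problems found; an empty list means the payload looks valid."""
    problems: List[str] = []
    decoder = json.JSONDecoder()
    pos, count = 0, 0
    while pos < len(body):
        # Skip whitespace between objects; raw_decode does not do this itself.
        if body[pos].isspace():
            pos += 1
            continue
        try:
            obj, pos = decoder.raw_decode(body, pos)
        except json.JSONDecodeError as exc:
            problems.append(f"invalid JSON at offset {exc.pos}: {exc.msg}")
            break
        count += 1
        if not isinstance(obj, dict):
            problems.append(f"item {count} is not a JSON object")
        elif "event" not in obj:
            problems.append(f'item {count} has no "event" key')
    if count == 0 and not problems:
        problems.append("payload is empty")
    return problems


print(check_hec_event_payload('{"event":"Lets go Observo!","index":"main"}'))
print(check_hec_event_payload('{"index":"main"}'))
```

Payloads sent to the raw endpoint are free-form text and do not need this check.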
Resources
For additional guidance and detailed information, refer to the following resources:
External References:
Splunk HTTP Event Collector Documentation: Official Splunk documentation on HEC configuration and usage.
Send data to HTTP Event Collector: Splunk Documentation
Splunk REST API Endpoints: Details on Splunk HEC endpoints for JSON, raw, and S2S events.
Best Practices:
Use HTTPS with TLS for secure and reliable data delivery unless the sender only supports HTTP.
Configure separate Splunk HEC Sources for different data types, such as JSON and raw events, to enhance processing efficiency.
Enable Splunk HEC acknowledgments for senders that require acks to prevent TCP connection exhaustion, but verify successful ingestion using HTTP 200 responses rather than acks.