Splunk HEC

The Splunk HEC Source in Observo AI facilitates secure, scalable ingestion of Splunk data via HTTP/S, supporting real-time analytics and observability with optional TLS encryption and HEC token authentication.

Purpose

The Splunk HEC Source in Observo AI enables the ingestion of high-volume Splunk data, such as transaction logs, via HTTP/S for real-time observability and analytics. It supports secure data transfer with optional TLS encryption and HEC token authentication, ensuring compliance with enterprise security standards. Designed for scalability, it integrates with load balancers to handle large-scale data from Splunk forwarders, as exemplified by FinServer Insights for financial transaction monitoring. The configuration allows flexible parsing and archival options to support advanced analytics and long-term data retention.

Prerequisites

Before configuring the Splunk HEC Source in Observo AI, ensure the following requirements are met to facilitate seamless data ingestion:

  • Observo AI Platform Setup:

    • The Observo AI Site must be installed and available.

  • Network and Connectivity:

    • Ensure Observo AI can receive Splunk HEC data over HTTP/S. The default port range is [10000-10200] for HTTP/S.

    • Check firewall rules, proxy settings, or VPC configurations that may affect connectivity to the specified port.

    • If using TLS, ensure certificates are properly configured, including CA certificates, host certificates, and private key files for secure communication.

  • Authentication (Optional):

    • Depending on whether TLS and token validation are used, prepare one of the following authentication methods:

      • Certificate-Based Authentication: Provide paths to CA certificate, host certificate, and private key files if required.

      • No Authentication: If TLS is disabled, no additional credentials are needed.

      • HEC Authentication Token: Provide a valid Splunk HEC token for client authentication. Tokens can be generated in Splunk and added to Observo AI.

  • Load Balancer (Optional):

    • For high-volume Splunk HEC data, configure a load balancer such as HAProxy, nginx, or AWS ELB to distribute traffic across Observo AI worker nodes to prevent CPU strain on a single node.

    • If using a load balancer, consider enabling Proxy Protocol v1 or v2 to preserve the original sender IP address in the X-Forwarded-For header.

| Prerequisite | Description | Notes |
| --- | --- | --- |
| Network | Connectivity to HTTP/S ports | Default port range is [10000-10200]; check firewall/proxy settings |
| Authentication | TLS certificate or HEC token setup if enabled | Provide CA, host cert, key files for TLS, or HEC token for authentication |
| Load Balancer | Optional for high-volume HTTP/S data | Use HAProxy, nginx, or AWS ELB; enable Proxy Protocol if needed |
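
The network prerequisite can be checked from a sender host before any HEC traffic is configured. The following is a minimal Python sketch, not an Observo AI tool: the helper names `port_in_default_range` and `can_connect` are hypothetical. A successful TCP connection only shows that the port is reachable through firewalls and proxies, not that the source accepts HEC requests.

```python
import socket

# Default HTTP/S port range stated in the prerequisites.
DEFAULT_PORT_RANGE = range(10000, 10201)


def port_in_default_range(port: int) -> bool:
    """Return True if the port falls inside the documented default range."""
    return port in DEFAULT_PORT_RANGE


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

For example, `can_connect("<observo_site_url>", 10000)` returning False usually points to a firewall, proxy, or VPC rule, or to a source that is not yet listening.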

Integration

The Integration section outlines configurations for the Observo AI Splunk HEC Source. To configure Splunk HEC as a source in Observo AI, follow these steps to set up and test the data flow:

  1. Log in to Observo AI:

    • Navigate to the Sources tab.

    • Click the "Add Source" button and select "Create New".

    • Choose "Splunk HEC" from the list of available sources to begin configuration.

  2. General Settings:

    • Source Type: Splunk HEC.

    • Name: A unique identifier for the source, such as splunk-hec-source-1.

    • Description (Optional): Provide a description for the source.

    • Socket Address: Socket address to listen for connections on, in host:port format. The port must be in the range [10000-10200].

      Example

      0.0.0.0:10000

    • Store HEC token (False): If set to true, the HEC token will be added to each event. This is useful if the event is being written to a Splunk HEC destination.

    • Valid Authorization Tokens (Add): List of valid authorization tokens. If an event's token is not in the list, the event is dropped. If the list is empty, all events are accepted.
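
The Socket Address format and the token rule from the General Settings can be modeled in a few lines. This is an illustrative Python sketch of the rules as described in this section, not Observo AI's implementation; the function names `parse_socket_address` and `token_accepted` are hypothetical.

```python
from typing import Optional, Sequence, Tuple

# Port range given for the Socket Address field.
PORT_MIN, PORT_MAX = 10000, 10200


def parse_socket_address(address: str) -> Tuple[str, int]:
    """Split 'host:port' and check the port against the documented range."""
    host, sep, port_text = address.rpartition(":")
    if not sep or not host or not (port_text.isascii() and port_text.isdigit()):
        raise ValueError(f"expected host:port, got {address!r}")
    port = int(port_text)
    if not PORT_MIN <= port <= PORT_MAX:
        raise ValueError(f"port {port} is outside [{PORT_MIN}-{PORT_MAX}]")
    return host, port


def token_accepted(token: Optional[str], valid_tokens: Sequence[str]) -> bool:
    """Model the documented rule: an empty token list accepts every event."""
    if not valid_tokens:
        return True
    return token in valid_tokens
```

Note the consequence of the empty-list rule: leaving Valid Authorization Tokens empty means any sender that can reach the port is accepted, so production sources would normally list at least one token.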

  3. TLS Options (Optional):

    • TLS Enable (False): Toggle to Enable for secure HTTP/S communication.

    • CA File Path: Absolute path to an additional CA certificate file for TLS.

      • The certificate must be in the DER or PEM (X.509) format.

      • Additionally, the certificate can be provided as an inline string in PEM format.

    • CRT File Path: Absolute path to a certificate file to identify the server.

      • The certificate must be in DER, PEM (X.509), or PKCS#12 format.

      • Additionally, the certificate can be provided as an inline string in PEM format.

      • If this is set, and is not a PKCS#12 archive, "Private Key File Path" must also be set.

    • Private Key File Path: Absolute path to a private key file.

      • The key must be in DER or PEM (PKCS#8) format.

      • Additionally, the key can be provided as an inline string in PEM format.

    • Private Key Password: Passphrase used to unlock the encrypted key file.

    • Verify Client Certificates: Enables client certificate verification by a trusted CA.

      • For components that create a server, this requires that the client connections have a valid client certificate.

      • For components that initiate requests, this validates that the upstream has a valid certificate.

      • If enabled, certificates must not be expired and must be issued by a trusted issuer.

      • This verification operates in a hierarchical manner, checking that the leaf certificate (the certificate presented by the client/server) is not only valid, but that the issuer of that certificate is also valid, and so on until the verification process reaches a root certificate.

    • Verify Client Hostname: Enable to ensure the client hostname matches the certificate hostname.

      • If enabled, the hostname used to connect to the remote host must be present in the TLS certificate presented by the remote host, either as the Common Name or as an entry in the Subject Alternative Name extension.

      • Only relevant for outgoing connections.

    • SNI Server Name: (Empty) Server name to use when using Server Name Indication (SNI).

    • Supported ALPN protocols (Add): Declare the supported ALPN protocols, which are used during communication with peers.

      • They are prioritized in the order defined.
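
On the sending side, the TLS options above translate into a client TLS context. The sketch below uses Python's standard `ssl` module and assumes a sender written in Python; the function name `build_sender_tls_context` and its parameters are hypothetical, and the file paths are whatever your PKI provides.

```python
import ssl
from typing import Optional, Sequence


def build_sender_tls_context(
    ca_file: Optional[str] = None,
    client_cert: Optional[str] = None,
    client_key: Optional[str] = None,
    key_password: Optional[str] = None,
    alpn_protocols: Sequence[str] = ("http/1.1",),
) -> ssl.SSLContext:
    """Build a client-side context for a sender talking to a TLS-enabled source."""
    # The default context verifies the server certificate chain and hostname.
    # ca_file adds a private CA (PEM) when the server cert is not publicly signed.
    context = ssl.create_default_context(cafile=ca_file)
    if client_cert:
        # Needed when the source has "Verify Client Certificates" enabled.
        context.load_cert_chain(client_cert, keyfile=client_key, password=key_password)
    # ALPN protocols are offered in the order given. Only offer protocols
    # the sending HTTP client can actually speak.
    context.set_alpn_protocols(list(alpn_protocols))
    return context
```

The returned context can be passed to `urllib.request.urlopen(..., context=context)` or to any HTTP client that accepts an `ssl.SSLContext`.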

  4. Acknowledgement Settings (Optional):

    • Enabled: (False) Enables end-to-end acknowledgements.

    • Max Idle Time in Seconds: The maximum time in seconds that an idle connection will be kept open.

    • Ack Idle Cleanup: If set to true, the idle connections will be cleaned up after the max idle time.

    • Max Number of Ack Channels: The maximum number of Splunk HEC channels clients can use in this Source.

    • Max Pending Acks: The maximum number of pending acknowledgements across all channels for the Source.

    • Max Pending Acks Per Channel: The maximum number of pending acknowledgements per channel for the Source.
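
When acknowledgements are enabled, a sender typically tags each request with a channel identifier and later asks which ack IDs have been confirmed. The sketch below follows Splunk's HEC indexer acknowledgment conventions (the `X-Splunk-Request-Channel` header and the `/services/collector/ack` endpoint); that the Observo AI source exposes the same endpoint and response shape is an assumption to verify in your deployment. The helper names are hypothetical.

```python
import json
import urllib.request
import uuid
from typing import Dict, Iterable, List


def new_channel_id() -> str:
    """Create a channel identifier; HEC channels are conventionally GUIDs."""
    return str(uuid.uuid4())


def build_ack_query(
    base_url: str, auth_header: str, channel: str, ack_ids: Iterable[int]
) -> urllib.request.Request:
    """Build (but do not send) a request asking for the status of ack IDs."""
    body = json.dumps({"acks": list(ack_ids)}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/services/collector/ack",
        data=body,
        method="POST",
        headers={
            "Authorization": auth_header,
            "X-Splunk-Request-Channel": channel,
            "Content-Type": "application/json",
        },
    )


def pending_ack_ids(ack_response: bytes, asked: Iterable[int]) -> List[int]:
    """Return the ack IDs that the response does not yet report as true."""
    statuses: Dict[str, bool] = json.loads(ack_response).get("acks", {})
    return [ack_id for ack_id in asked if not statuses.get(str(ack_id), False)]
```

A sender that keeps polling without ever clearing its pending IDs will eventually run into the Max Pending Acks limits, so it is worth bounding the number of unacknowledged requests per channel on the client side as well.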

  5. Advanced Settings:

    • Max connection age in seconds: The maximum amount of time a connection may exist before it is closed by sending a Connection: close header on the HTTP response.

      • Set this to a large value such as 100000000 to effectively disable this feature.

      • Only applies to HTTP/0.9, HTTP/1.0, and HTTP/1.1 requests.

      • A random jitter configured by "Max connection age jitter factor" is added to the specified duration to spread out connection storms.

    • Max connection age jitter factor: The factor by which to jitter the "Max connection age in seconds" value.

      • A value of 0.1 means that the actual duration will be between 90% and 110% of the specified maximum duration.
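
The jitter arithmetic can be made concrete with a short calculation. This Python sketch only reproduces the range described above (maximum age plus or minus the jitter factor); drawing the actual value uniformly from that range is an assumption, and the function names are hypothetical.

```python
import random
from typing import Optional, Tuple


def jitter_bounds(max_age_seconds: float, jitter_factor: float) -> Tuple[float, float]:
    """Return the (shortest, longest) connection age for the given settings."""
    if max_age_seconds <= 0 or not 0 <= jitter_factor <= 1:
        raise ValueError("max age must be positive and jitter factor within [0, 1]")
    spread = max_age_seconds * jitter_factor
    return max_age_seconds - spread, max_age_seconds + spread


def jittered_age(
    max_age_seconds: float,
    jitter_factor: float,
    rng: Optional[random.Random] = None,
) -> float:
    """Pick one connection age from the jitter range (uniform draw assumed)."""
    low, high = jitter_bounds(max_age_seconds, jitter_factor)
    return (rng or random).uniform(low, high)
```

With a maximum age of 43200 seconds and a factor of 0.05, the bounds work out to roughly 41040 and 45360 seconds, so connections opened at the same moment close over a window of about 72 minutes instead of all at once.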

  6. Parser Config:

    • Enable Source Log parser: (False)

    • Toggle the Enable Source Log parser switch to enable parsing.

      • Select the appropriate parser from the Source Log Parser dropdown.

      • Add additional parsers as needed.

  7. Pattern Extractor:

    • Refer to Observo AI’s Pattern Extractor documentation for details on configuring pattern-based data extraction.

  8. Archival Destination:

    • Toggle the Enable Archival on Source switch to enable archival.

    • Under Archival Destination, select a destination from the list of Archival Destinations (Required).

  9. Save and Test Configuration:

    • Save the configuration settings.

    • Send sample Splunk HEC data, for example from a Splunk forwarder or with a curl command, and verify ingestion in Observo AI by monitoring the Analytics tab in the target pipeline.

    • Sending a Formatted Event

      • curl https://<observo_site_url>:<port>/services/collector/event -H 'Authorization: <token>' -d '{"event":"Lets go Observo!","index":"main"}'

    • Sending a Raw Event

      • curl https://<observo_site_url>:<port>/services/collector/raw -H 'Authorization: <token>' -d 'this is a sample raw event'

    • Troubleshooting Details:

      • Add --verbose to the curl command to see step-by-step details of the connection process, including the TLS handshake and the HTTP response.
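
The formatted-event curl command can also be expressed in Python for scripted tests. This is a minimal sketch using only the standard library; the function names are hypothetical, and the Authorization value is passed through exactly as given. Splunk clients conventionally send it as `Splunk <token>`, so use whichever form your source is configured to accept.

```python
import json
import urllib.request
from typing import Any, Optional


def build_hec_event_request(
    base_url: str, auth_header: str, event: Any, index: Optional[str] = None
) -> urllib.request.Request:
    """Build a POST to /services/collector/event without sending it."""
    payload = {"event": event}
    if index is not None:
        payload["index"] = index
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/services/collector/event",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={"Authorization": auth_header, "Content-Type": "application/json"},
    )


def send(request: urllib.request.Request, timeout: float = 10.0) -> int:
    """Send the request and return the HTTP status code.

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses, so a
    rejected token or a busy server surfaces as an exception.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Separating request construction from sending makes it easy to inspect the URL, headers, and body before any traffic reaches the source.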

Example Scenarios

FinServer Insights, a fictitious company specializing in financial analytics, aims to integrate Observo AI with Splunk HEC to process high-volume transaction logs for large financial institutions. The setup ensures secure data ingestion, high availability with load balancing, and compliance with enterprise security standards.

Standard Splunk HEC Source Setup

Here is a standard Splunk HEC Source configuration example. Only the required sections and their associated field updates are displayed in the tables below:

General Settings

| Field | Value | Notes |
| --- | --- | --- |
| Source Type | Splunk HEC | Select "Splunk HEC" from the list of available sources in Observo AI. |
| Name | splunk-hec-finance-prod-1 | Unique identifier for the production Splunk HEC source. |
| Description | Splunk HEC source for financial transaction logs in production | Provides context for the source's purpose. |
| Socket Address | 0.0.0.0:10002 | Host:port format; port within range [10000-10200], behind a load balancer. |
| Store HEC Token | True | Adds HEC token to each event for downstream Splunk HEC destinations. |
| Valid Authorization Tokens | finance-prod-token-9876-zyxw-5432-kjhg | Only events with this token are accepted; it aligns with enterprise security policy. |

TLS Configuration

| Field | Value | Notes |
| --- | --- | --- |
| TLS Enable | True | Mandatory for secure HTTPS communication in compliance with enterprise security standards. |
| CA File Path | /etc/observo/certs/enterprise-ca.pem | Path to enterprise CA certificate in PEM format, issued by internal PKI. |
| CRT File Path | /etc/observo/certs/finance-hec-server-cert.pem | Path to server certificate in PEM format, signed by enterprise CA. |
| Private Key File Path | /etc/observo/certs/finance-hec-server-key.pem | Path to private key in PEM format, encrypted for security. |
| Private Key Password | FinSecureKey2025! | Passphrase to unlock the encrypted private key, managed via enterprise secrets vault. |
| Verify Client Certificates | True | Enforces client certificate verification to ensure only trusted clients send data. |
| Verify Client Hostname | True | Validates client hostname against certificate to prevent spoofing. |
| SNI Server Name | hec.finance-prod.example.com | Server name for SNI, matching enterprise DNS configuration. |
| Supported ALPN Protocols | h2, http/1.1 | Prioritizes HTTP/2 for performance, with HTTP/1.1 fallback for compatibility. |

Advanced Settings

| Field | Value | Notes |
| --- | --- | --- |
| Max Connection Age in Seconds | 43200 | Sets connection duration to 12 hours to balance stability and resource management. |
| Max Connection Age Jitter Factor | 0.05 | Limits jitter to ±5% to minimize connection storms in high-traffic enterprise environments. |

Additional Enterprise Configuration Notes

  • Load Balancer Setup: Deploy an AWS ELB to distribute high-volume Splunk HEC traffic across multiple Observo Sites. Enable Proxy Protocol v2 to preserve client IP addresses in the X-Forwarded-For header for audit compliance.

  • Acknowledgement Settings: Configure for reliability:

    • Enabled: True

    • Max Idle Time in Seconds: 600

    • Ack Idle Cleanup: True

    • Max Number of Ack Channels: 100

    • Max Pending Acks: 10000

    • Max Pending Acks Per Channel: 100

  • Parser Config: Enable Source Log Parser and select a JSON parser to handle structured financial transaction logs. Add a custom parser for proprietary log formats if needed.

  • Pattern Extractor: Configure to extract fields like transaction_id, account_number, and timestamp for analytics, per Observo AI’s Pattern Extractor documentation.

  • Archival Destination: Enable and select an S3-based archival destination for long-term storage, complying with financial data retention policies.

  • Testing: Validate the configuration by sending a sample event, for example with one of the curl commands from the Save and Test Configuration step.

  • Monitoring: Monitor ingestion in the Observo AI Analytics tab for throughput, event counts, and errors. Set up alerts for "Server is busy (503)" errors and adjust the Active request limit in Advanced Settings if needed.

  • Troubleshooting: If "No data received" errors occur, verify load balancer health checks and VPC firewall rules. For "Invalid token" errors, ensure the token matches the Splunk HEC sender configuration.

Troubleshooting

If issues arise with the Splunk HEC Source in Observo AI, use the following steps to diagnose and resolve them:

  • Verify Configuration Settings:

    • Ensure fields like Input ID, Socket Address, Port, and Splunk HEC Endpoint match the sender's configuration.

    • Confirm that the configured HTTP/S port, within the default range [10000-10200], is open and accessible.

    • Verify that the HEC token matches the sender’s configuration.

  • Check Network Connectivity:

    • Verify that firewall rules, proxy settings, or VPC configurations allow traffic to the specified HTTP/S port.

    • Test connectivity using tools like curl or telnet to ensure the Observo AI instance can receive data on the configured port.

  • Validate TLS Configuration:

    • For TLS-enabled sources, ensure CA, certificate, and key files are correctly specified and accessible.

    • Check for TLS version mismatches, such as a sender using TLS 1.2 while Observo AI expects TLS 1.3. If certificate issues occur, adjust the TLS settings or disable "Verify Client Hostname".

  • Monitor Logs and Data:

    • Verify data ingestion by monitoring the Analytics tab in the Observo AI pipeline for throughput and event counts.

    • Check Observo AI for errors related to parsing, buffering, or connection issues.

  • Common Error Messages:

    • "No data received": Ensure the Splunk HEC sender is pointing to the correct IP/port and that network connectivity is not blocked. Verify the sender’s protocol (HTTP/S) matches the source configuration.

    • "Invalid message format": Confirm that Splunk HEC messages are valid JSON, raw, or S2S events.

    • "Server is busy (503)": Increase the Active request limit in Advanced Settings to handle more simultaneous HEC requests.

    • "Invalid token": Ensure the HEC token in the sender matches the token configured in Observo AI. Get the HEC token from the Splunk UI → Data Inputs → HTTP Tokens.

| Issue | Possible Cause | Resolution |
| --- | --- | --- |
| No data received | Incorrect IP/port or network block | Verify sender configuration and network connectivity |
| Invalid message format | Non-compliant HEC messages | Verify sender configuration |
| Server is busy (503) | Insufficient active request limit | Increase Active request limit in Advanced Settings |
| Invalid token | Mismatched HEC token | Verify token matches sender configuration |
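
For "Invalid message format" errors, it can help to validate payloads on the sender side before they are posted to the event endpoint. The sketch below checks that a body consists of one or more JSON objects, each with an "event" key, which is the shape Splunk documents for the HEC event endpoint. Whether the Observo AI source applies exactly the same checks is an assumption, and the function name `parse_hec_events` is hypothetical.

```python
import json
from typing import Any, Dict, List


def parse_hec_events(body: str) -> List[Dict[str, Any]]:
    """Parse one or more JSON objects sent back to back, as HEC batches allow."""
    decoder = json.JSONDecoder()
    events: List[Dict[str, Any]] = []
    text = body.strip()
    position = 0
    while position < len(text):
        # raw_decode raises json.JSONDecodeError (a ValueError) on bad JSON.
        obj, end = decoder.raw_decode(text, position)
        if not isinstance(obj, dict) or "event" not in obj:
            raise ValueError(f'object at offset {position} has no "event" key')
        events.append(obj)
        # Skip any whitespace between back-to-back objects.
        position = end
        while position < len(text) and text[position].isspace():
            position += 1
    if not events:
        raise ValueError("body contains no events")
    return events
```

Raw events sent to the raw endpoint are free-form text and would not pass through a check like this.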

