Splunk HEC Metrics

The Observo AI Splunk HEC Metrics destination enables secure transmission of metric data to Splunk’s HTTP Event Collector for centralized monitoring and analysis, supporting Gzip compression, customizable namespaces, and secure token authentication.

Purpose

The Observo AI Splunk HEC Metrics destination enables the transmission of metric data to Splunk's HTTP Event Collector (HEC) for centralized monitoring and analysis. This integration facilitates advanced querying, visualization, and performance tracking within Splunk, allowing organizations to consolidate and analyze metrics for actionable insights and operational intelligence.

Prerequisites

Before configuring the Splunk HEC Metrics destination in Observo AI, ensure the following requirements are met to facilitate seamless data ingestion:

Observo AI Platform Setup

  • Observo AI Site: Ensure that the Observo AI Site is installed and operational.

Splunk HTTP Event Collector (HEC) Configuration

  • HEC Token: Generate a token in Splunk to authenticate incoming data.

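  • HEC Endpoint URL: Determine the Splunk HEC endpoint URL, typically in the format https://<splunk-host>:8088/services/collector. When you configure the destination's Endpoint field, enter only the base URL (scheme, host, and port); the Splunk API path is added automatically.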

  • HEC Port: Ensure that the Splunk HEC is configured to listen on the appropriate port (default is 8088).

  • HEC Input Settings: Configure the Splunk HEC input in Splunk UI to accept metric data from Observo AI, setting the appropriate source type and index.
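To make the token and endpoint requirements more concrete, the following sketch builds (but does not send) one metric in the JSON shape that Splunk documents for metrics over HEC, using the multiple-measurement form (metric_name:<name> keys) supported by recent Splunk versions. The host, token, index, and metric names are placeholders, and build_hec_metric_request is a hypothetical helper for illustration, not part of Observo AI or Splunk.

```python
import json
import time


def build_hec_metric_request(base_url, token, metric_name, value,
                             index=None, host=None, dimensions=None,
                             timestamp=None):
    """Return (url, headers, body) for one metric destined for Splunk HEC.

    Hypothetical helper: nothing is sent, the caller decides how to POST it.
    """
    # Measurements use "metric_name:<name>" keys; other keys are dimensions.
    fields = {f"metric_name:{metric_name}": value}
    fields.update(dimensions or {})

    payload = {
        "time": time.time() if timestamp is None else timestamp,
        "event": "metric",  # marks the payload as a metric data point
        "fields": fields,
    }
    if index is not None:
        payload["index"] = index
    if host is not None:
        payload["host"] = host

    url = base_url.rstrip("/") + "/services/collector"
    headers = {
        "Authorization": f"Splunk {token}",  # HEC token authentication
        "Content-Type": "application/json",
    }
    return url, headers, json.dumps(payload).encode("utf-8")


url, headers, body = build_hec_metric_request(
    "https://splunk.example.com:8088",        # placeholder host
    "00000000-0000-0000-0000-000000000000",   # placeholder token
    "cpu.usage", 42.5,
    index="metrics_example", host="web-01",
    dimensions={"region": "us-east"},
    timestamp=1700000000,
)
```

Posting the resulting body with these headers to the URL is one way to confirm that the token and index accept metric data before pointing Observo AI at them.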

Network and Connectivity

  • HTTPS Access: Ensure that Observo AI can communicate with the Splunk HEC endpoint over HTTPS (port 443 or 8088).

  • Firewall Rules: If using firewalls or network security groups, configure them to allow outbound traffic from Observo AI to the Splunk HEC endpoint.
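A quick way to check the network requirements above is to attempt a plain TCP connection from the host that runs Observo AI. The sketch below is a minimal illustration using only the Python standard library; the function names are hypothetical, and a successful connection only shows that the port is reachable, not that HEC accepts the token.

```python
import socket
from urllib.parse import urlsplit


def endpoint_host_port(endpoint):
    """Extract (host, port) from a HEC base URL, defaulting the port by scheme."""
    parts = urlsplit(endpoint)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"not an http(s) URL: {endpoint!r}")
    default_port = 443 if parts.scheme == "https" else 80
    return parts.hostname, parts.port or default_port


def can_connect(endpoint, timeout_secs=5.0):
    """Return True if a TCP connection to the endpoint can be opened."""
    host, port = endpoint_host_port(endpoint)
    try:
        with socket.create_connection((host, port), timeout=timeout_secs):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

For example, can_connect("https://splunk.example.com:8088") (a placeholder host) returning False usually points to a firewall rule, a DNS problem, or HEC listening on a different port.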

Integration

To configure Splunk HEC Metrics as a destination in Observo AI, follow these steps:

  1. Access Observo AI Destinations:

    • Navigate to the Destinations tab in the Observo AI interface.

    • Click on the "Add Destination" button and select "Create New".

    • Choose "Splunk HEC Metrics" from the list of available destinations.

  2. General Settings:

    • Name: Provide a unique identifier for the destination such as splunk-hec-metrics-dest-1.

    • Description (Optional): Add a description for the destination.

    • Endpoint: The base URL of the Splunk instance. The scheme (http or https) must be specified. No path should be included since the paths defined by the Splunk API are used.

      Examples

      https://http-inputs-hec.splunkcloud.com

      https://hec.splunk.com:8088

      http://example.com

    • Default Token: Default Splunk HEC token. If an event has a token set in its secrets (splunk_hec_token), it will prevail over the one set here.

    • Default Namespace: The namespace to write metrics to when a metric does not specify one. A namespace is a period-separated string that logically groups metrics and organizes them in the Splunk Metrics Store. It can contain up to 256 characters and only alphanumeric characters, periods, and underscores.

    • Host key: Overrides the name of the field from which the hostname sent to Splunk HEC is read. Default: host

    • Index: Specify the Splunk index where the metric data should be stored.

      Examples

      {{ host }}

      custom_index
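The Endpoint and Default Namespace rules above can be checked before saving the destination. The following is a minimal sketch of such a check; validate_endpoint and validate_namespace are hypothetical helpers that encode only the rules stated in this section, and Observo AI may apply additional validation of its own.

```python
import re
from urllib.parse import urlsplit

# Alphanumeric characters, periods, and underscores only.
_NAMESPACE_RE = re.compile(r"[A-Za-z0-9._]+")


def validate_endpoint(endpoint):
    """Return a list of problems with a base URL (an empty list means OK)."""
    problems = []
    parts = urlsplit(endpoint)
    if parts.scheme not in ("http", "https"):
        problems.append("scheme must be http or https")
    if not parts.hostname:
        problems.append("host is missing")
    if parts.path not in ("", "/"):
        problems.append("path must be omitted")
    return problems


def validate_namespace(namespace):
    """Return a list of problems with a namespace (an empty list means OK)."""
    problems = []
    if len(namespace) > 256:
        problems.append("longer than 256 characters")
    if not _NAMESPACE_RE.fullmatch(namespace):
        problems.append("only alphanumerics, periods, and underscores are allowed")
    return problems
```

Under these rules, https://hec.splunk.com:8088 passes, while https://hec.splunk.com:8088/services/collector is rejected because it includes a path.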

  3. Acknowledgement (Optional):

    • Acknowledgements Enabled (False): Whether end-to-end acknowledgements are enabled. When enabled, any source connected to this destination that supports end-to-end acknowledgements waits for events to be acknowledged by the sink before acknowledging them at the source.

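    • Acknowledgements Indexer Acknowledgements Enabled (True): Whether the destination uses Splunk HEC indexer acknowledgements for end-to-end acknowledgements. Once the maximum number of pending acknowledgements from events sent to the Splunk HEC collector is reached, the sink begins applying backpressure.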

    • Acknowledgements Query Interval (Seconds): The amount of time in seconds to wait between queries to the Splunk HEC indexer acknowledgement endpoint. Default: 10

    • Acknowledgements Retry Limit: Maximum number of times an acknowledgement ID will be queried for its status. Default: 30
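The interaction between the query interval and the retry limit can be sketched as a polling loop. This is an illustration under stated assumptions, not Observo AI's implementation: wait_for_acks is a hypothetical function, and query_fn stands in for a call to Splunk's indexer acknowledgement endpoint (/services/collector/ack), which reports a true or false status per acknowledgement ID.

```python
import time


def wait_for_acks(ack_ids, query_fn, query_interval_secs=10,
                  retry_limit=30, sleep_fn=time.sleep):
    """Poll until every ack ID is confirmed or its query budget runs out.

    query_fn(ids) must return {ack_id: bool}. In a real client it would
    POST {"acks": ids} to the HEC acknowledgement endpoint (assumption).
    Returns (confirmed, gave_up) as two sets.
    """
    remaining = {ack_id: retry_limit for ack_id in ack_ids}
    confirmed, gave_up = set(), set()

    while remaining:
        statuses = query_fn(sorted(remaining))
        for ack_id in list(remaining):
            if statuses.get(ack_id):
                confirmed.add(ack_id)
                del remaining[ack_id]
            else:
                remaining[ack_id] -= 1
                if remaining[ack_id] <= 0:
                    gave_up.add(ack_id)  # retry limit exhausted
                    del remaining[ack_id]
        if remaining:
            sleep_fn(query_interval_secs)  # wait before the next query

    return confirmed, gave_up
```

With the defaults listed above (10-second interval, limit of 30), an ID that is never acknowledged would be abandoned after roughly five minutes in this sketch.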

  4. Request Configuration:

    • Request Concurrency: Configuration for outbound request concurrency. Default: Adaptive concurrency.

      Options

      Adaptive concurrency: Adjusts parallelism based on system load.

      A fixed concurrency of 1: Processes only one task at a time.

    • Request Rate Limit Duration Secs: The time window used for the rate_limit_num option. Default: 1.

    • Request Rate Limit Num: The maximum number of requests allowed within the rate_limit_duration_secs time window.

    • Request Retry Attempts: The maximum number of retries to make for failed requests. The default represents an unlimited number of retries. Default: Unlimited.

    • Request Retry Initial Backoff Secs: The amount of time to wait in seconds before attempting the first retry for a failed request. After the first retry has failed, the Fibonacci sequence is used to select future backoffs. Default: 1.

    • Request Retry Max Duration Secs: The maximum amount of time to wait between retries. Default: 3600.

    • Request Timeout Secs: The time a request waits before being aborted. Do not lower this value below the service’s internal timeout, as this could create orphaned requests and duplicate data downstream. Default: 60.
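The retry settings above combine into a backoff schedule: waits grow along the Fibonacci sequence from the initial backoff and are capped at the maximum duration. The sketch below illustrates that idea; fibonacci_backoffs is a hypothetical function, and whether the real sequence begins 1, 1, 2 or 1, 2, 3 is an assumption here, since the source does not specify it.

```python
def fibonacci_backoffs(initial_backoff_secs=1, max_duration_secs=3600,
                       attempts=10):
    """Return the wait in seconds before each of the first `attempts` retries.

    Assumption: the sequence starts with the initial backoff twice
    (1, 1, 2, 3, ...) and every wait is capped at max_duration_secs.
    """
    waits = []
    current, following = initial_backoff_secs, initial_backoff_secs
    for _ in range(attempts):
        waits.append(min(current, max_duration_secs))
        current, following = following, current + following
    return waits
```

Under this assumption and the defaults above, the first eight waits are 1, 1, 2, 3, 5, 8, 13, and 21 seconds, and no wait ever exceeds 3600 seconds.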

  5. Batching Requirements:

    • Batch Max Bytes (Increment as needed): The maximum size of a batch that will be processed by a sink. This is based on the uncompressed size of the batched events, before they are serialized / compressed.

    • Batch Max Events (Increment as needed): The maximum number of events in a batch before it is flushed.

    • Batch Timeout Seconds (Increment as needed): The maximum age of a batch before it is flushed. Default: 1
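The three batching limits act as independent flush triggers: whichever is reached first closes the batch. The following sketch shows one way that logic can be arranged. Batcher is a hypothetical class written for illustration, and the max_bytes and max_events defaults are borrowed from the example scenario later on this page rather than from documented product defaults.

```python
import time


class Batcher:
    """Collects encoded events and flushes on size, count, or age."""

    def __init__(self, flush_fn, max_bytes=2_097_152, max_events=1000,
                 timeout_secs=1.0, clock=time.monotonic):
        self.flush_fn = flush_fn
        self.max_bytes = max_bytes      # uncompressed size limit
        self.max_events = max_events
        self.timeout_secs = timeout_secs
        self.clock = clock
        self._events = []
        self._bytes = 0
        self._started_at = None

    def add(self, event: bytes):
        # Flush first if this event would push the batch past max_bytes.
        if self._events and self._bytes + len(event) > self.max_bytes:
            self.flush()
        if not self._events:
            self._started_at = self.clock()  # batch age starts here
        self._events.append(event)
        self._bytes += len(event)
        if len(self._events) >= self.max_events:
            self.flush()

    def tick(self):
        """Call periodically; flushes a batch that has reached its max age."""
        if self._events and self.clock() - self._started_at >= self.timeout_secs:
            self.flush()

    def flush(self):
        if self._events:
            self.flush_fn(self._events)
            self._events, self._bytes, self._started_at = [], 0, None
```

A larger byte or event limit generally means fewer, bigger requests, while a shorter timeout reduces latency when traffic is light.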

  6. TLS Configuration (Optional):

    • TLS Enabled (False): Whether or not to require TLS for incoming or outgoing connections. When enabled and used for incoming connections, an identity certificate is also required. See tls.crt_file for more information.

    • TLS CA: The CA certificate provided as an inline string in PEM format.

      Example

      /etc/certs/ca.crt

    • TLS CRT: The certificate as a string in PEM format.

      Example

      /etc/certs/tls.crt

    • TLS Key: Absolute path to a private key file used to identify this server. The key must be in DER or PEM (PKCS#8) format. Additionally, the key can be provided as an inline string in PEM format.

      Example

      /etc/certs/tls.key

    • TLS Key Pass: Passphrase used to unlock the encrypted key file. This has no effect unless key_file is set.

      Examples

      ${KEY_PASS_ENV_VAR}

      PassWord1

    • TLS Verify Hostname (False): Enables hostname verification. The hostname used to connect to the remote host must be present in the TLS certificate presented by the remote host, either as the Common Name or as an entry in the Subject Alternative Name extension. Only relevant for outgoing connections. Do NOT set this to false unless you understand the risks of not verifying the remote hostname.

    • TLS Verify Certificate (False): Enables certificate verification. Certificates must be valid in terms of not being expired, and being issued by a trusted issuer. This verification operates in a hierarchical manner, checking validity of the certificate, the issuer of that certificate and so on until reaching a root certificate. Relevant for both incoming and outgoing connections. Do NOT set this to false unless you understand the risks of not verifying the validity of certificates.
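For readers who want to see what the two verification switches mean in practice, the sketch below maps them onto Python's standard ssl module. It illustrates client-side TLS in general and is not how Observo AI is implemented; build_tls_context is a hypothetical helper, and the file arguments correspond only loosely to the TLS CA, TLS CRT, TLS Key, and TLS Key Pass fields.

```python
import ssl


def build_tls_context(verify_certificate=True, verify_hostname=True,
                      ca_file=None, crt_file=None, key_file=None,
                      key_pass=None):
    """Build a client-side SSLContext mirroring the TLS options above."""
    # Secure defaults: certificate and hostname checks are both on.
    context = ssl.create_default_context(cafile=ca_file)

    if crt_file:
        # Client certificate and private key, for example for mutual TLS.
        context.load_cert_chain(crt_file, keyfile=key_file, password=key_pass)

    if not verify_hostname or not verify_certificate:
        # Python requires hostname checks to be switched off before
        # certificate verification can be disabled.
        context.check_hostname = False
    if not verify_certificate:
        context.verify_mode = ssl.CERT_NONE

    return context
```

In Python, hostname checking cannot stay enabled once certificate verification is disabled, which is why the sketch turns both off together in that case.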

  7. Buffering Configuration (Optional):

    • Buffer Type: Specifies the buffering mechanism for event delivery.

      Options

      Memory: High-performance, in-memory buffering.

        • Max Events: The maximum number of events allowed in the buffer. Default: 500

        • When Full: Event handling behavior when the buffer is full. Default: Block

          - Block: Wait for free space in the buffer. This applies backpressure up the topology, signalling that sources should slow down the acceptance/consumption of events. No data is lost, but data will pile up at the edge.

          - Drop Newest: Drop the event instead of waiting for free space in the buffer. The event is intentionally dropped. This mode is typically used when performance is the highest priority and it is preferable to temporarily lose events rather than slow down the acceptance/consumption of events.

      Disk: Lower-performance, less costly, on-disk buffering.

        • Max Bytes Size: The maximum size of the buffer in bytes. Must be at least 268435488.

        • When Full: Event handling behavior when the buffer is full. Default: Block

          - Block: Wait for free space in the buffer. This applies backpressure up the topology, signalling that sources should slow down the acceptance/consumption of events. No data is lost, but data will pile up at the edge.

          - Drop Newest: Drop the event instead of waiting for free space in the buffer. The event is intentionally dropped. This mode is typically used when performance is the highest priority and it is preferable to temporarily lose events rather than slow down the acceptance/consumption of events.
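The difference between the two When Full behaviors can be illustrated with a bounded in-memory queue. This is a simplified sketch built on Python's standard queue module; EventBuffer and its policy names are hypothetical and stand in for the Block and Drop Newest options described above.

```python
import queue


class EventBuffer:
    """Bounded in-memory buffer with a configurable when-full policy."""

    def __init__(self, max_events=500, when_full="block"):
        if when_full not in ("block", "drop_newest"):
            raise ValueError(f"unknown when_full policy: {when_full!r}")
        self._queue = queue.Queue(maxsize=max_events)
        self._when_full = when_full
        self.dropped = 0

    def push(self, event):
        """Add an event; returns False if it was dropped."""
        if self._when_full == "block":
            # Waits until a consumer frees space: this is the backpressure.
            self._queue.put(event)
            return True
        try:
            self._queue.put_nowait(event)
            return True
        except queue.Full:
            self.dropped += 1  # the incoming (newest) event is discarded
            return False

    def pop(self):
        """Remove and return the oldest buffered event."""
        return self._queue.get()
```

With the block policy, a producer calling push on a full buffer simply waits, so upstream components slow down. With drop_newest, the producer never waits, at the cost of the events counted in dropped.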

  8. Advanced Settings (Optional):

    • Compression: Compression algorithm to use for the request body. Default: No compression

      Options

      Gzip compression: DEFLATE compression with headers for file storage.

      No compression: Data stored and transmitted in original form.

      Zlib compression: DEFLATE format with minimal wrapper and checksums.

      Snappy compression: Prioritizes speed over compression ratio and complexity.

      Zstd compression: Fast compression with good ratio and dictionaries.

    • Source: The source of events sent to this sink. This is typically the filename the logs originated from. If unset, the Splunk collector will set it.

      Examples

      {{ file }}

      /var/log/syslog

      UDP:514

    • Sourcetype: The sourcetype of events sent to this sink. If unset, Splunk will default to httpevent.

      Examples

      {{ sourcetype }}

      _json
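Several fields above (Index, Source, and Sourcetype) accept either a literal value such as custom_index or a template such as {{ host }} that is filled in from each event. The sketch below shows the general idea of such a substitution. It is an illustration only: render_template is a hypothetical function, and how Observo AI treats a missing field is not stated here, so raising an error is simply this sketch's choice.

```python
import re

# Matches {{ field }} with optional spaces inside the braces.
_TEMPLATE_FIELD = re.compile(r"\{\{\s*([A-Za-z0-9_.]+)\s*\}\}")


def render_template(template, event):
    """Replace each {{ field }} in `template` with the value from `event`."""
    def substitute(match):
        field = match.group(1)
        if field not in event:
            # Assumption: a missing field is treated as an error.
            raise KeyError(f"event has no field {field!r}")
        return str(event[field])

    return _TEMPLATE_FIELD.sub(substitute, template)
```

A literal value contains no template markers and passes through unchanged, so the same field can hold either form.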

  9. Save and Test Configuration:

    • Save the configuration settings.

    • Send sample metric data to verify that it reaches the specified Splunk index.

Example Scenarios

UrbanTrend Retail Co., a fictitious mid-sized retail chain specializing in fashion and accessories, uses Observo AI to manage telemetry data from its e-commerce platform, point-of-sale (POS) systems, and inventory management applications. To centralize monitoring and gain actionable insights into system performance, UrbanTrend integrates Observo AI with Splunk’s HTTP Event Collector (HEC) to send metrics data. This integration enables advanced querying, visualization, and performance tracking of transaction volumes, application response times, and inventory updates, supporting operational efficiency and enhanced customer experience.

Standard Splunk HEC Metrics Destination Setup

Here is a standard Splunk HEC Metrics Destination configuration example. Only the relevant sections and their associated field values are displayed in the tables below:

General Settings

Field
Value
Description

Name

urbantrend-splunk-hec-metrics

Unique identifier for the Splunk HEC Metrics destination.

Description

Send retail metrics to Splunk for monitoring

Optional description of the destination's purpose.

Endpoint

https://hec.urbantrend.splunkcloud.com

Base URL of the Splunk HEC instance for metrics ingestion.

Site

urbantrend.splunkcloud.com

Splunk site for the HEC endpoint, aligning with the enterprise’s Splunk setup.

Default Token

a1b2c3d4-e5f6-7890-abcd-1234567890ef

Default Splunk HEC token for authenticating HTTP requests (securely stored).

Default Namespace

retail_metrics

Namespace for organizing metrics in Splunk Metrics Store (e.g., retail_metrics.transactions).

Host key

hostname

Field in metrics metadata identifying the host, overriding default 'host'.

Index

retail_metrics_index

Splunk index where metrics data is stored for querying and visualization.

Acknowledgement

Field
Value
Description

Acknowledgements Enabled

True

Enables end-to-end acknowledgements to ensure data delivery to Splunk.

Acknowledgements Indexer Acknowledgements Enabled

True

Uses Splunk HEC indexer acknowledgements; backpressure is applied once the maximum number of pending acknowledgements is reached.

Acknowledgements Query Interval (Seconds)

10

Queries Splunk HEC for acknowledgement status every 10 seconds.

Acknowledgements Retry Limit

30

Queries acknowledgement status up to 30 times before giving up.

Request Configuration

Field
Value
Description

Request Concurrency

Adaptive concurrency

Adjusts parallelism based on system load for optimal performance.

Request Rate Limit Duration Secs

1

1-second time window for rate limiting requests.

Request Rate Limit Num

200

Maximum of 200 requests allowed within the 1-second window.

Request Retry Attempts

5

Retries failed requests up to 5 times to ensure reliable delivery.

Request Retry Initial Backoff Secs

1

Waits 1 second before the first retry, using Fibonacci for subsequent retries.

Request Retry Max Duration Secs

1800

Maximum 30-minute wait between retries to prevent excessive delays.

Request Timeout Secs

60

60-second timeout for HTTP requests to avoid orphaned requests.

TLS Configuration

Field
Value
Description

TLS Enabled

True

Requires TLS for secure outgoing connections to Splunk HEC.

TLS CA

-----BEGIN CERTIFICATE-----...

Inline PEM-formatted CA certificate for verifying Splunk’s server.

TLS CRT

-----BEGIN CERTIFICATE-----...

Certificate in PEM format for secure connections.

TLS Key

-----BEGIN PRIVATE KEY-----...

Private key in PEM format for secure connections (securely stored).

TLS Key Pass

RetailSecure2025

Passphrase to unlock the encrypted key file.

TLS Verify Hostname

True

Verifies that the hostname in Splunk’s certificate matches hec.urbantrend.splunkcloud.com.

TLS Verify Certificate

True

Ensures certificates are valid and issued by a trusted authority.

Batching Configuration

Field
Value
Description

Batch Max Bytes

2097152

Maximum batch size of 2MB (uncompressed) to balance throughput and efficiency.

Batch Max Events

1000

Maximum of 1000 events per batch before flushing to Splunk.

Batch Timeout Seconds

1

Flushes batches after 1 second to ensure timely delivery.

Buffering Configuration

Field
Value
Description

Buffer Type

Memory

Uses high-performance in-memory buffering for metrics delivery.

Max Events

2000

Limits buffer to 2000 events to manage memory usage.

When Full

Block

Applies backpressure when the buffer is full, preventing data loss.

Advanced Settings

Field
Value
Description

Compression

Gzip compression

Uses Gzip to reduce data transfer size, optimizing bandwidth usage.

Source

/var/log/retail_metrics

Specifies the source of metrics data, aligning with Splunk conventions.

Sourcetype

_json

Sets the sourcetype to JSON for proper parsing in Splunk.
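To give a feel for what the Gzip compression setting in this example buys, the sketch below compresses a batch of synthetic metric payloads with Python's standard gzip module and compares sizes. The metric and dimension names are invented for the UrbanTrend scenario, compress_batch is a hypothetical helper, and real savings depend on how repetitive the actual data is.

```python
import gzip
import json


def compress_batch(events):
    """Serialize events as concatenated JSON objects and gzip the result."""
    raw = "".join(json.dumps(event) for event in events).encode("utf-8")
    compressed = gzip.compress(raw)
    headers = {"Content-Encoding": "gzip"}  # tells the receiver how to decode
    return raw, compressed, headers


# Synthetic batch: names and values are placeholders, not real data.
events = [
    {"event": "metric",
     "fields": {"metric_name:transactions.count": n, "store": "store-17"}}
    for n in range(1000)
]
raw, compressed, headers = compress_batch(events)
ratio = len(compressed) / len(raw)  # fraction of the original size
```

Because metric batches repeat the same keys and dimension values many times, they tend to compress well, which is the motivation for enabling compression on bandwidth-constrained links.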

Troubleshooting

If issues arise with the Splunk HEC Metrics destination in Observo AI, use the following steps to diagnose and resolve them:

Verify Configuration Settings

  • Ensure that the HEC Endpoint URL, Token, Index, and Source Type are correctly entered and match the Splunk setup.

  • Confirm that the Splunk HEC input is enabled and configured to accept metric data from Observo AI.

Check Authentication

  • Verify that the HEC Token is valid and has the necessary permissions to write to the specified index.

  • Ensure that the token has not expired or been revoked.

Monitor Logs

  • Check Observo AI’s Notifications tab for errors or warnings related to data transmission.

  • In the Splunk interface, search the specified index to confirm data arrival.

Validate Data Format and Schema

  • Ensure that the metric data sent from Observo AI matches the expected format and schema in Splunk.

  • If using custom source types, verify that they are properly configured in Splunk.

Network and Connectivity

  • Ensure that Observo AI can reach the Splunk HEC endpoint over the network.

  • If using firewalls or proxies, verify their configurations to allow necessary traffic.

Common Error Messages

  • "Authorization failed": Indicates invalid or missing HEC Token. Verify the token's validity and permissions.

  • "Index not found": Check that the specified index exists in Splunk and that the token has write permissions.

  • "No data ingested": Confirm that metric data is being sent and matches the expected format.
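When a request is rejected, Splunk HEC replies with a small JSON body containing a numeric code and a text message, which is often more precise than the messages listed above. The sketch below maps a few of those codes to hints. The code meanings follow Splunk's HEC troubleshooting documentation as best understood here, so confirm them against your Splunk version; explain_hec_response is a hypothetical helper and the hint wording is this sketch's own.

```python
import json

# Codes and meanings are assumptions to verify against your Splunk version.
HEC_CODE_HINTS = {
    1: "Token disabled: enable the token in Splunk's HEC settings.",
    2: "Token is required: no token was sent with the request.",
    3: "Invalid authorization: check the Authorization header format.",
    4: "Invalid token: the token is not known to this Splunk instance.",
    5: "No data: the request body was empty.",
    6: "Invalid data format: the body is not valid HEC JSON.",
    7: "Incorrect index: check the index name and the token's allowed indexes.",
}


def explain_hec_response(body):
    """Turn a HEC JSON response body into a short troubleshooting hint."""
    try:
        reply = json.loads(body)
    except json.JSONDecodeError:
        return "Response is not JSON; check that the URL points at HEC."
    if not isinstance(reply, dict):
        return "Unexpected response shape; check that the URL points at HEC."

    code = reply.get("code")
    if code == 0:
        return "Success"
    hint = HEC_CODE_HINTS.get(code)
    return hint or f"Unrecognized HEC code {code}: {reply.get('text', '')}"
```

For instance, a reply reporting code 7 suggests that the index named in the destination does not exist or is not permitted for the token.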

Test Data Flow

  • Send sample metric data from Observo AI and verify its ingestion in Splunk.

  • Use Splunk's search functionality to locate and analyze the ingested metrics.

