Splunk HEC Logs

The Observo AI Splunk HEC Logs destination enables secure transmission of security and telemetry data to Splunk’s HTTP Event Collector for centralized log analytics, supporting JSON encoding, Gzip compression, and customizable indexing with secure token authentication.

Purpose

The Observo AI Splunk HEC Logs destination enables the transmission of security and telemetry data to Splunk's HTTP Event Collector (HEC) for centralized log analytics and monitoring. This integration facilitates advanced querying, visualization, and threat detection within Splunk, allowing organizations to consolidate and analyze data for actionable insights and operational intelligence.

Prerequisites

Before configuring the Splunk HEC Logs destination in Observo AI, ensure the following requirements are met to facilitate seamless data ingestion:

Observo AI Platform Setup

  • Observo AI Site: Ensure that the Observo AI Site is installed and operational.

HTTP Event Collector (Configuration)

  • Global Settings:

    • All Tokens: Enabled

    • Enable SSL (Optional): To have HEC listen and communicate over HTTPS rather than HTTP

    • HTTP Port Number: Enter the port number for the Splunk HEC endpoint (default 8088)

  • HEC Endpoint URL: The Global Settings (SSL/Port) and Splunk deployment type determine the Splunk HEC endpoint URL

    • Example: https://<Splunk-Host>:<HEC-Port>

  • HEC Token: Generate a token in Splunk to authenticate incoming data

    • HEC Input Settings: Configure the Splunk HEC input in Splunk to accept data from Observo AI

      • Ensure the data’s target index is in the Select Indexes allowed list.

  • Splunk HEC Collector Considerations

    • Token-Specific Routing: Each HEC token's configuration dictates which indexes it can access. Changing one token's settings does not affect others.

    • Default Index: If you define a default index, it applies only to that HEC token. If the data specifies a different index, make sure that index is in the Selected Indexes list for the target HEC token.

Network and Connectivity

  • HTTP/HTTPS Access: Ensure that Observo AI can communicate with the Splunk HEC endpoint and HEC port over the configured scheme (HTTP or HTTPS).

  • Firewall Rules: If using firewalls or network security groups, configure them to allow outbound traffic from Observo AI to the Splunk HEC endpoint.

Integration

To configure Splunk HEC Logs as a destination in Observo AI, follow these steps:

  1. Access Observo AI Destinations:

    • Navigate to the Destinations tab in the Observo AI interface.

    • Click on the "Add Destination" button and select "Create New".

    • Choose "Splunk HEC Logs" from the list of available destinations.

  2. General Settings:

    • Name: Provide a unique identifier for the destination (e.g., splunk-hec-dest-1).

    • Description (Optional): Add a description for the destination.

    • Endpoint: The base URL of the Splunk instance. The scheme (http or https) must be specified. No path should be included since the paths defined by the Splunk API are used.

      Examples

      https://http-inputs-hec.splunkcloud.com

      https://hec.splunk.com:8088

      http://example.com

    • Default Token: Default Splunk HEC token. If an event has a token set in its secrets (splunk_hec_token), it will prevail over the one set here.

    • Default Namespace: Specifies the default namespace to write metrics to when a namespace is not specified in the metric. The namespace is a string that represents the logical grouping of metrics and is used to organize metrics in the Splunk Metrics Store. It is a period-separated string of up to 256 characters and may contain only alphanumeric characters, periods, and underscores.

    • Host Key: Overrides the name of the log field used to grab the hostname to send to Splunk HEC. Default: host

    • Index: The name of the index to send events to. If not specified, the default index defined within Splunk is used.

      Examples

      {{ host }}

      custom_index

    • Timestamp Key: Overrides the name of the log field used to grab the timestamp to send to Splunk HEC. When set to "" (an empty string), no timestamp is set in the events sent to Splunk HEC.

      Examples

      timestamp

      time
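Taken together, these General Settings determine how each record is wrapped into an HEC event payload. The sketch below is illustrative only (the function name and wrapping logic are assumptions, not Observo AI's implementation); it follows Splunk's standard event-endpoint envelope, where host, time, index, source, and sourcetype are metadata fields sent alongside the event body:

```python
import json

def build_hec_event(record, host_key="host", timestamp_key="timestamp",
                    index=None, source=None, sourcetype=None):
    # Wrap a log record in a Splunk HEC event envelope. The Host Key and
    # Timestamp Key settings name the record fields used for the "host"
    # and "time" metadata; Index/Source/Sourcetype are optional overrides.
    envelope = {"event": record}
    if host_key and host_key in record:
        envelope["host"] = record[host_key]
    if timestamp_key and timestamp_key in record:
        envelope["time"] = record[timestamp_key]
    if index:
        envelope["index"] = index
    if source:
        envelope["source"] = source
    if sourcetype:
        envelope["sourcetype"] = sourcetype
    return json.dumps(envelope)
```

For example, with Index set to custom_index, the envelope carries "index": "custom_index", and Splunk routes the event to that index, provided the token's Selected Indexes list allows it.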

  3. Acknowledgement (Optional):

    • Acknowledgements Enabled (False): Whether or not end-to-end acknowledgements are enabled. When enabled, any source connected to this sink that supports end-to-end acknowledgements will wait for events to be acknowledged by the sink before acknowledging them at the source.

    • Acknowledgements Indexer Acknowledgements Enabled (True): The maximum number of pending acknowledgements from events sent to the Splunk HEC collector. Once this limit is reached, the sink begins applying backpressure.

    • Acknowledgements Query Interval (Seconds): The amount of time in seconds to wait between queries to the Splunk HEC indexer acknowledgement endpoint. Default: 10

    • Acknowledgements Retry Limit: Maximum number of times an acknowledgement ID will be queried for its status. Default: 30
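The four settings above describe a polling loop against Splunk's indexer acknowledgement endpoint. A minimal sketch of that loop, assuming a caller-supplied query_acks function (a stand-in for the real HTTP call to /services/collector/ack, not an Observo AI API):

```python
def wait_for_acks(query_acks, ack_ids, query_interval=10, retry_limit=30,
                  sleep=None):
    # Poll until every ack ID is confirmed, waiting query_interval seconds
    # between queries and giving up after retry_limit rounds of queries.
    # query_acks(ids) must return {ack_id: bool} for the requested IDs.
    pending = set(ack_ids)
    for _ in range(retry_limit):
        if not pending:
            break
        statuses = query_acks(sorted(pending))
        pending -= {ack_id for ack_id, done in statuses.items() if done}
        if pending and sleep is not None:
            sleep(query_interval)
    return pending  # IDs never confirmed; these events would be retried or failed
```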

  4. Encoding:

    • Encoding Codec: The codec to use for encoding events. Default: JSON Encoding.

      Options
      Sub-Options

      JSON Encoding

      • Pretty JSON (False): Format JSON with indentation and line breaks for better readability.
      • Encoding Avro Schema (Optional): The Avro schema. Example: { "type": "record", "name": "log", "fields": [{ "name": "message", "type": "string" }] }
      • Encoding Metric Tag Values (Select): Controls how metric tag values are encoded.
        - Tag values will be exposed as single strings (default)
        - Tags exposed as arrays of strings
        Note: When set to single, only the last non-bare value of tags will be displayed with the metric. When set to full, all metric tags will be exposed as separate assignments.
      • Encoding Timestamp Format (Select):
        - RFC3339 format
        - UNIX format
      • Fields to exclude from serialization (Add): Transformations to prepare an event for serialization; a list of fields that are excluded from the encoded event. Example: message.payload

      logfmt Encoding

      • Encoding Avro Schema (Optional): The Avro schema. Example: { "type": "record", "name": "log", "fields": [{ "name": "message", "type": "string" }] }
      • Encoding Metric Tag Values (Select): Controls how metric tag values are encoded.
        - Tag values will be exposed as single strings (default)
        - Tags exposed as arrays of strings
        Note: When set to single, only the last non-bare value of tags will be displayed with the metric. When set to full, all metric tags will be exposed as separate assignments.
      • Encoding Timestamp Format (Select):
        - RFC3339 format
        - UNIX format
      • Fields to exclude from serialization (Add): Transformations to prepare an event for serialization; a list of fields that are excluded from the encoded event. Example: message.payload

      Apache Avro Encoding

      • Avro Schema: Specify the Apache Avro schema definition for serializing events. Example: { "type": "record", "name": "log", "fields": [{ "name": "message", "type": "string" }] }
      • Encoding Avro Schema (Optional): The Avro schema. Example: { "type": "record", "name": "log", "fields": [{ "name": "message", "type": "string" }] }
      • Encoding Metric Tag Values (Select): Controls how metric tag values are encoded.
        - Tag values will be exposed as single strings (default)
        - Tags exposed as arrays of strings
        Note: When set to single, only the last non-bare value of tags will be displayed with the metric. When set to full, all metric tags will be exposed as separate assignments.
      • Encoding Timestamp Format (Select):
        - RFC3339 format
        - UNIX format
      • Fields to exclude from serialization (Add): Transformations to prepare an event for serialization; a list of fields that are excluded from the encoded event. Example: message.payload

      Newline Delimited JSON Encoding

      • Encoding Avro Schema (Optional): The Avro schema. Example: { "type": "record", "name": "log", "fields": [{ "name": "message", "type": "string" }] }
      • Encoding Metric Tag Values (Select): Controls how metric tag values are encoded.
        - Tag values will be exposed as single strings (default)
        - Tags exposed as arrays of strings
        Note: When set to single, only the last non-bare value of tags will be displayed with the metric. When set to full, all metric tags will be exposed as separate assignments.
      • Encoding Timestamp Format (Select):
        - RFC3339 format
        - UNIX format
      • Fields to exclude from serialization (Add): Transformations to prepare an event for serialization; a list of fields that are excluded from the encoded event. Example: message.payload

      No encoding

      • Encoding Avro Schema (Optional): The Avro schema. Example: { "type": "record", "name": "log", "fields": [{ "name": "message", "type": "string" }] }
      • Encoding Metric Tag Values (Select): Controls how metric tag values are encoded.
        - Tag values will be exposed as single strings (default)
        - Tags exposed as arrays of strings
        Note: When set to single, only the last non-bare value of tags will be displayed with the metric. When set to full, all metric tags will be exposed as separate assignments.
      • Encoding Timestamp Format (Select):
        - RFC3339 format
        - UNIX format
      • Fields to exclude from serialization (Add): Transformations to prepare an event for serialization; a list of fields that are excluded from the encoded event. Example: message.payload

      Plain text encoding

      • Encoding Avro Schema (Optional): The Avro schema. Example: { "type": "record", "name": "log", "fields": [{ "name": "message", "type": "string" }] }
      • Encoding Metric Tag Values (Select): Controls how metric tag values are encoded.
        - Tag values will be exposed as single strings (default)
        - Tags exposed as arrays of strings
        Note: When set to single, only the last non-bare value of tags will be displayed with the metric. When set to full, all metric tags will be exposed as separate assignments.
      • Encoding Timestamp Format (Select):
        - RFC3339 format
        - UNIX format
      • Fields to exclude from serialization (Add): Transformations to prepare an event for serialization; a list of fields that are excluded from the encoded event. Example: message.payload

      Parquet

      • Include Raw Log (False): Capture the complete log message as an additional field (observo_record) apart from the given schema. Example: In addition to the Parquet schema, there will be a field named "observo_record" in the Parquet file.
      • Parquet Schema: Enter the parquet schema for encoding. Example:
        message root {
          optional binary stream;
          optional binary time;
          optional group kubernetes {
            optional binary pod_name;
            optional binary pod_id;
            optional binary docker_id;
            optional binary container_hash;
            optional binary container_image;
            optional group labels {
              optional binary pod-template-hash;
            }
          }
        }
      • Encoding Avro Schema (Optional): The Avro schema. Example: { "type": "record", "name": "log", "fields": [{ "name": "message", "type": "string" }] }
      • Encoding Metric Tag Values (Select): Controls how metric tag values are encoded.
        - Tag values will be exposed as single strings (default)
        - Tags exposed as arrays of strings
        Note: When set to single, only the last non-bare value of tags will be displayed with the metric. When set to full, all metric tags will be exposed as separate assignments.
      • Encoding Timestamp Format (Select):
        - RFC3339 format
        - UNIX format
      • Fields to exclude from serialization (Add): Transformations to prepare an event for serialization; a list of fields that are excluded from the encoded event. Example: message.payload

      Common Event Format (CEF)

      • CEF Device Event Class ID: Provide a unique identifier for categorizing the type of event (maximum 1023 characters). Example: login-failure
      • CEF Device Product: Specify the product name that generated the event (maximum 63 characters). Example: Log Analyzer
      • CEF Device Vendor: Specify the vendor name that produced the event (maximum 63 characters). Example: Observo
      • CEF Device Version: Specify the version of the product that generated the event (maximum 31 characters). Example: 1.0.0
      • CEF Extensions (Add): Define custom key-value pairs for additional event data fields in CEF format.
      • CEF Name: Provide a human-readable description of the event (maximum 512 characters). Example: cef.name
      • CEF Severity: Indicate the importance of the event with a value from 0 (lowest) to 10 (highest). Example: 5
      • CEF Version (Select): Specify which version of the CEF specification to use for formatting.
        - CEF specification version 0.1
        - CEF specification version 1.x
      • Encoding Avro Schema (Optional): The Avro schema. Example: { "type": "record", "name": "log", "fields": [{ "name": "message", "type": "string" }] }
      • Encoding Metric Tag Values (Select): Controls how metric tag values are encoded.
        - Tag values will be exposed as single strings (default)
        - Tags exposed as arrays of strings
        Note: When set to single, only the last non-bare value of tags will be displayed with the metric. When set to full, all metric tags will be exposed as separate assignments.
      • Encoding Timestamp Format (Select):
        - RFC3339 format
        - UNIX format
      • Fields to exclude from serialization (Add): Transformations to prepare an event for serialization; a list of fields that are excluded from the encoded event. Example: message.payload

      CSV Format

      • CSV Fields (Add): Specify the field names to include as columns in the CSV output and their order. Examples: timestamp, host, message
      • CSV Buffer Capacity (Optional): Set the internal buffer size (in bytes) used when writing CSV data. Example: 8192
      • CSV Delimiter (Optional): Set the character that separates fields in the CSV output. Example: ,
      • Enable Double Quote Escapes (True): When enabled, quotes in field data are escaped by doubling them. When disabled, an escape character is used instead.
      • CSV Escape Character (Optional): Set the character used to escape quotes when double-quote escaping is disabled.
      • CSV Quote Character (Optional): Set the character used for quoting fields in the CSV output. Example: "
      • CSV Quoting Style (Optional): Control when field values should be wrapped in quote characters. Options:
        - Always quote all fields
        - Quote only when necessary
        - Never use quotes
        - Quote all non-numeric fields
      • Encoding Avro Schema (Optional): The Avro schema. Example: { "type": "record", "name": "log", "fields": [{ "name": "message", "type": "string" }] }
      • Encoding Metric Tag Values (Select): Controls how metric tag values are encoded.
        - Tag values will be exposed as single strings (default)
        - Tags exposed as arrays of strings
        Note: When set to single, only the last non-bare value of tags will be displayed with the metric. When set to full, all metric tags will be exposed as separate assignments.
      • Encoding Timestamp Format (Select):
        - RFC3339 format
        - UNIX format
      • Fields to exclude from serialization (Add): Transformations to prepare an event for serialization; a list of fields that are excluded from the encoded event. Example: message.payload

      Protocol Buffers

      • Protobuf Message Type: Specify the fully qualified message type name for Protobuf serialization. Example: package.Message
      • Protobuf Descriptor File: Specify the path to the compiled protobuf descriptor file (.desc). Example: /path/to/descriptor.desc
      • Encoding Avro Schema (Optional): The Avro schema. Example: { "type": "record", "name": "log", "fields": [{ "name": "message", "type": "string" }] }
      • Encoding Metric Tag Values (Select): Controls how metric tag values are encoded.
        - Tag values will be exposed as single strings (default)
        - Tags exposed as arrays of strings
        Note: When set to single, only the last non-bare value of tags will be displayed with the metric. When set to full, all metric tags will be exposed as separate assignments.
      • Encoding Timestamp Format (Select):
        - RFC3339 format
        - UNIX format
      • Fields to exclude from serialization (Add): Transformations to prepare an event for serialization; a list of fields that are excluded from the encoded event. Example: message.payload

      Graylog Extended Log Format (GELF)

      • Encoding Avro Schema (Optional): The Avro schema. Example: { "type": "record", "name": "log", "fields": [{ "name": "message", "type": "string" }] }
      • Encoding Metric Tag Values (Select): Controls how metric tag values are encoded.
        - Tag values will be exposed as single strings (default)
        - Tags exposed as arrays of strings
        Note: When set to single, only the last non-bare value of tags will be displayed with the metric. When set to full, all metric tags will be exposed as separate assignments.
      • Encoding Timestamp Format (Select):
        - RFC3339 format
        - UNIX format
      • Fields to exclude from serialization (Add): Transformations to prepare an event for serialization; a list of fields that are excluded from the encoded event. Example: message.payload
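The practical difference between the JSON and newline-delimited JSON codecs, and the effect of the exclude-fields option, can be sketched as follows. This is a simplified illustration handling top-level field names only, not the sink's actual encoder (which also supports dotted paths such as message.payload):

```python
import json

def encode_events(events, codec="json", exclude=()):
    # Drop excluded top-level fields, then serialize the batch.
    cleaned = [{k: v for k, v in e.items() if k not in exclude}
               for e in events]
    if codec == "json":
        return json.dumps(cleaned)                        # one JSON array
    if codec == "ndjson":
        return "\n".join(json.dumps(e) for e in cleaned)  # one object per line
    raise ValueError(f"unsupported codec: {codec}")
```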

  5. Request Configuration:

    • Request Concurrency: Configuration for outbound request concurrency. Default: Adaptive concurrency.

      • Adaptive concurrency: Adjusts parallelism based on system load

      • A fixed concurrency of 1: Processes one task at a time only

    • Request Rate Limit Duration Secs: The time window used for the rate_limit_num option. Default: 1.

    • Request Rate Limit Num: The maximum number of requests allowed within the rate_limit_duration_secs time window.

    • Request Retry Attempts: The maximum number of retries to make for failed requests. The default represents an infinite number of retries. Default: Unlimited.

    • Request Retry Initial Backoff Secs: The amount of time to wait in seconds before attempting the first retry for a failed request. After the first retry has failed, the Fibonacci sequence is used to select future backoffs. Default: 1.

    • Request Retry Max Duration Secs: The maximum amount of time to wait between retries. Default: 3600.

    • Request Timeout Secs: The time a request waits before being aborted. It is recommended that this value is not lowered below the service's internal timeout, as this could create orphaned requests and duplicate data downstream. Default: 60.
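Under these defaults, retry waits grow along a Fibonacci sequence seeded by the initial backoff and capped at the maximum duration. A small sketch of that schedule (illustrative only, not Observo AI's actual retry scheduler):

```python
def retry_backoffs(initial=1.0, max_duration=3600.0, attempts=10):
    # First retry waits `initial` seconds; later waits follow a Fibonacci
    # sequence seeded with that value, never exceeding `max_duration`.
    waits, a, b = [], initial, initial
    for _ in range(attempts):
        waits.append(min(a, max_duration))
        a, b = b, a + b
    return waits
```

With the defaults above, the first six waits are 1, 1, 2, 3, 5, and 8 seconds; once a wait would exceed the maximum duration, it stays pinned there.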

  6. Batching Requirements:

    • Batch Max Bytes (Increment as needed): The maximum size of a batch that will be processed by a sink. This is based on the uncompressed size of the batched events, before they are serialized / compressed.

    • Batch Max Events (Increment as needed): The maximum size of a batch before it is flushed.

    • Batch Timeout Seconds (Increment as needed): The maximum age of a batch before it is flushed. Default: 1
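The three batching limits above act as independent flush triggers: size, event count, and age. A minimal sketch of that logic (an illustration, not Observo AI's implementation; the uncompressed size here is approximated via JSON length):

```python
import json
import time

class Batcher:
    # Flush a batch when it reaches max_bytes or max_events, or when
    # maybe_flush_on_timer() finds the batch older than timeout_secs.
    def __init__(self, max_bytes=1_000_000, max_events=1000,
                 timeout_secs=1.0, clock=time.monotonic):
        self.max_bytes, self.max_events = max_bytes, max_events
        self.timeout, self.clock = timeout_secs, clock
        self.events, self.size, self.started = [], 0, None

    def push(self, event):
        if self.started is None:
            self.started = self.clock()
        self.events.append(event)
        self.size += len(json.dumps(event))  # pre-serialization size estimate
        if self.size >= self.max_bytes or len(self.events) >= self.max_events:
            return self.flush()
        return None

    def maybe_flush_on_timer(self):
        if self.started is not None and self.clock() - self.started >= self.timeout:
            return self.flush()
        return None

    def flush(self):
        batch, self.events, self.size, self.started = self.events, [], 0, None
        return batch
```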

  7. TLS Configuration (Optional):

    • TLS CA: Provide the CA certificate in PEM format.

    • TLS Certificate: Provide the client certificate in PEM format.

    • TLS Key: Provide the private key in PEM format.

    • TLS Key Passphrase: If the key is encrypted, provide the passphrase.

    • Verify Certificate (False): Enables certificate verification. Certificates must be valid: not expired, and issued by a trusted issuer. Verification operates hierarchically, checking the validity of the certificate, then the issuer of that certificate, and so on until reaching a root certificate. Relevant for both incoming and outgoing connections. Do NOT set this to false unless you understand the risks of not verifying the validity of certificates.

    • Verify Hostname: Enables hostname verification. If enabled, the hostname used to connect to the remote host must be present in the TLS certificate presented by the remote host, either as the Common Name or as an entry in the Subject Alternative Name extension. Only relevant for outgoing connections. Do NOT set this to false unless you understand the risks of not verifying the remote hostname.

  8. Buffering Configuration (Optional):

    • Buffer Type: Specifies the buffering mechanism for event delivery.

      • Memory: High-performance, in-memory buffering.

        Max Events: The maximum number of events allowed in the buffer. Default: 500

        When Full: Event handling behavior when the buffer is full. Default: Block
        - Block: Wait for free space in the buffer. This applies backpressure up the topology, signalling that sources should slow down the acceptance/consumption of events. No data is lost, but data will pile up at the edge.
        - Drop Newest: Drop the event instead of waiting for free space in the buffer. The event is intentionally dropped. This mode is typically used when performance is the highest priority and it is preferable to temporarily lose events rather than slow the acceptance/consumption of events.

      • Disk: Lower-performance, less costly, on-disk buffering.

        Max Bytes Size: The maximum buffer size in bytes. Must be at least 268435488.

        When Full: Event handling behavior when the buffer is full. Default: Block
        - Block: Wait for free space in the buffer. This applies backpressure up the topology, signalling that sources should slow down the acceptance/consumption of events. No data is lost, but data will pile up at the edge.
        - Drop Newest: Drop the event instead of waiting for free space in the buffer. The event is intentionally dropped. This mode is typically used when performance is the highest priority and it is preferable to temporarily lose events rather than slow the acceptance/consumption of events.
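The two when-full behaviors can be sketched as a single push routine over a bounded buffer. This is illustrative only: a real buffer blocks the producer rather than returning a status, but the return values make the backpressure-versus-drop trade-off visible:

```python
from collections import deque

def buffer_push(buf, event, max_events, when_full="block"):
    # Returns "ok" if the event was buffered, "would_block" if the caller
    # should apply backpressure (Block), or "dropped" (Drop Newest).
    if len(buf) < max_events:
        buf.append(event)
        return "ok"
    if when_full == "block":
        return "would_block"  # real buffer would wait for free space
    return "dropped"          # drop_newest: event intentionally discarded
```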

  9. Advanced Settings (Optional):

    • Compression: Compression algorithm to use for the request body. Default: No compression

      • Gzip compression: DEFLATE compression with headers for file storage

      • No compression: Data stored and transmitted in original form

      • Zlib compression: DEFLATE format with minimal wrapper and checksums

    • Endpoint Target: Splunk HEC endpoint configuration.

      Options

      • Event endpoint (Metadata sent with event payload)

      • Raw endpoint (Metadata as query parameters)

    • Source: The source of events sent to this sink. This is typically the filename the logs originated from. If unset, the Splunk collector will set it.

      Examples

      {{ file }}

      /var/log/syslog

      UDP:514

    • Sourcetype: The sourcetype of events sent to this sink. If unset, Splunk will default to httpevent.

      Examples

      {{ sourcetype }}

      _json
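Several settings above (Index, Source, Sourcetype) accept {{ field }} templates that resolve against each event. A sketch of that resolution, with the caveat that the exact template syntax the sink supports is an assumption here:

```python
import re

def resolve_template(template, event):
    # Replace each {{ field }} placeholder with the event's value for
    # that field, or an empty string if the field is absent.
    def sub(match):
        return str(event.get(match.group(1).strip(), ""))
    return re.sub(r"\{\{\s*([^}]+?)\s*\}\}", sub, template)
```

For example, with Sourcetype set to {{ sourcetype }}, an event carrying sourcetype "_json" is sent with that sourcetype, while an event missing the field falls back to Splunk's default.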

  10. Save and Test Configuration:

    • Save the configuration settings.

    • Send sample data to verify that it reaches the specified Splunk index.

Example Scenarios

The Observo AI Splunk HEC Logs Destination makes it easy to send logs from different sources directly to Splunk. Its flexible setup lets you customize and optimize data pipelines for smooth and efficient data ingestion. It's great for teams that want to centralize data in Splunk for monitoring, analytics, or security.

You can include Splunk-specific fields like sourcetype, index, or source to improve Splunk ingestion and search. The setup uses HTTPS, supports compression, and is optimized for real-time data delivery.

Standard Splunk HEC Logs Destination Setup

Here is a standard Splunk HEC Logs Destination configuration example. Only the required sections and their associated field updates are displayed in the table below:

General Settings (example inputs)

  • Name: SplunkHECExample

  • Endpoint: https://<Splunk-Host>:<HEC-Port>

  • Default Token: Enter the Splunk HEC token from your Splunk instance (found in Splunk under Settings > Data Inputs > HTTP Event Collector).

  • Host Key: host

  • Index: {{ index }}

  • Timestamp Key: timestamp

Encoding (example inputs)

  • Encoding Codec: JSON Encoding

  • Encoding Metric Tag Values: Tag values will be exposed as single strings

  • Encoding Timestamp Format: UNIX format

Advanced Settings (example inputs)

  • Compression: No compression

  • Endpoint Target: Event endpoint (Metadata sent with event payload)

  • Source: {{ source }}

  • Sourcetype: {{ sourcetype }}

Test Data Flow

  • Send sample data from Observo AI and verify its ingestion in Splunk, then use Splunk's search functionality to locate and analyze the ingested data

  • Test the Splunk HEC endpoint directly:

    • curl "https://<Splunk-Host>:<HEC-Port>/services/collector/event" \
        -H "Authorization: Splunk <Splunk HEC Token>" \
        -d '{"event": "Hello, world!", "sourcetype": "manual"}'

Troubleshooting

If issues arise with the Splunk HEC Logs destination in Observo AI, use the following steps to diagnose and resolve them:

Verify Configuration Settings

  • Ensure that the HEC Endpoint URL, Token, Index, and Source Type are correctly entered and match the Splunk setup.

  • Confirm that the Splunk HEC input is enabled and configured to accept data from Observo AI (Splunk HEC Global Settings and Token Settings).

Check Authentication

  • Verify that the HEC Token is valid and has the necessary permissions to write to the specified index.

  • Ensure that the token has not expired or been revoked.

Monitor Logs

  • Check Observo AI’s Monitoring > Pipeline Errors view for errors or warnings related to data transmission.

  • In the Splunk interface, search the specified index to confirm data arrival.

Validate Data Format and Schema

  • Ensure that the data sent from Observo AI matches the expected format and schema in Splunk.

  • If using custom source types, verify that they are properly configured in Splunk.

Network and Connectivity

  • Ensure that Observo AI can reach the Splunk HEC endpoint over the network.

  • If using firewalls or proxies, verify their configurations to allow necessary traffic.

Common Error Messages

  • "Authorization failed": Indicates invalid or missing HEC Token. Verify the token's validity and permissions.

  • "Index not found": Check that the specified index exists in Splunk and that the token has write permissions.

  • "No data ingested": Confirm that data is being sent and matches the expected format.

