Elasticsearch

The Observo AI Elasticsearch destination sends log and event data to Elasticsearch clusters for advanced search, analytics, and visualization. It supports customizable encoding, secure authentication (Basic or AWS), and Gzip compression for efficient data integration.

Purpose

The Observo AI Elasticsearch destination enables the transmission of log and event data to Elasticsearch clusters, facilitating advanced search, analytics, and visualization capabilities. This integration allows organizations to harness Elasticsearch's powerful indexing and querying features for comprehensive observability and operational insights.

Prerequisites

Before configuring the Elasticsearch destination in Observo AI, ensure the following requirements are met:

Observo AI Platform Setup

  • Observo AI Site: Ensure that the Observo AI Site is installed and operational.

Elasticsearch Cluster Configuration

  • Elasticsearch Endpoint URL: Determine the Elasticsearch endpoint URL, typically in the format http://<elasticsearch-host>:9200.

  • Authentication Credentials: If authentication is enabled, obtain the necessary username and password or API key.

  • Index Settings: Decide on the index or index pattern where the data will be stored.

  • TLS Configuration: If using HTTPS, ensure that the necessary TLS certificates are in place.

Network and Connectivity

  • HTTP/HTTPS Access: Ensure that Observo AI can communicate with the Elasticsearch endpoint over HTTP or HTTPS.

  • Firewall Rules: If using firewalls or network security groups, configure them to allow outbound traffic from Observo AI to the Elasticsearch endpoint.

Integration

To configure Elasticsearch as a destination in Observo AI, follow these steps:

  1. Access Observo AI Destinations:

    • Navigate to the Destinations tab in the Observo AI interface.

    • Click on the "Add Destination" button and select "Create New".

    • Choose "Elasticsearch" from the list of available destinations.

  2. General Settings:

    • Name: Provide a unique identifier for the destination, such as elasticsearch-dest-1.

    • Description (Optional): Add a description for the destination.

    • Elasticsearch Endpoint (Add as needed): The Elasticsearch endpoints to send logs to. Each endpoint must contain an HTTP scheme, and may specify a hostname or IP address and port.

      Examples

      https://127.0.0.1:9200

      http://my-elasticsearch-endpoint

    • Mode: Determines which Elasticsearch Bulk API indexing mode to use.

      Options

      • Bulk: Batch process multiple operations in a single request

      • Data Stream: Continuous flow of timestamped data, optimized for ingestion

    • Id Key: The name of the event key that should map to Elasticsearch’s _id field. By default, the _id field is not set, which allows Elasticsearch to set it automatically. Setting your own Elasticsearch IDs can impact performance.

      Examples

      id

      _id

    • Pipeline: The name of the ingest pipeline to apply in Elasticsearch.
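
The Bulk mode above uses Elasticsearch's Bulk API, which expects a newline-delimited JSON (NDJSON) body in which each document is preceded by an action line. A minimal sketch of how events could be rendered into such a payload, and how an Id Key (here assumed to be id) maps onto the _id field; the helper name is illustrative, not an Observo internal:

```python
import json

def to_bulk_ndjson(events, index, id_key=None):
    """Render events as an Elasticsearch Bulk API payload (NDJSON).

    Each event becomes an action line followed by the document itself.
    If id_key is set, its value is promoted to the document _id,
    mirroring this destination's "Id Key" option.
    """
    lines = []
    for event in events:
        action = {"index": {"_index": index}}
        if id_key is not None and id_key in event:
            action["index"]["_id"] = event[id_key]
        lines.append(json.dumps(action))
        lines.append(json.dumps(event))
    return "\n".join(lines) + "\n"  # the Bulk API requires a trailing newline

payload = to_bulk_ndjson(
    [{"id": "evt-1", "message": "login ok"}],
    index="app-logs",
    id_key="id",
)
```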

  3. Authentication (Optional):

    • Auth Strategy: The authentication strategy to use. Choose between the following mechanisms:

      Options

      • Amazon OpenSearch Service-specific authentication

      • HTTP Basic Authentication

      • No selection: choose this option if you are using API-token-based authentication. The token must be specified as an HTTP header.

    • Auth Access Key Id: The AWS access key ID.

      Example

      AKIAIOSFODNN7EXAMPLE

    • Auth Secret Access Key: The AWS secret access key.

      Example

      wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY

    • Auth User: Basic authentication username.

      Examples

      ${ELASTICSEARCH_USERNAME}

      username

    • Auth Password: Basic authentication password.

      Examples

      ${ELASTICSEARCH_PASSWORD}

      password

    • Auth Region: The AWS region to send STS requests to. Defaults to the configured region for the service itself.

      Example

      us-west-2

    • Auth Assume Role: The ARN of an IAM role to assume.

      Example

      arn:aws:iam::123456789098:role/my_role

    • Auth Load Timeout Secs: Timeout for successfully loading any credentials, in seconds. Relevant when the default credentials chain or assume_role is used.

      Example

      30

    • Auth IMDS Connect Timeout Seconds: Connect timeout for IMDS.

    • Auth IMDS Max Attempts: Number of IMDS retries for fetching tokens and metadata.

    • Auth IMDS Read Timeout Seconds: Read timeout for IMDS.
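
For HTTP Basic Authentication, the Auth User and Auth Password values are ultimately combined into a standard Authorization header. A small sketch of that encoding (the function name is illustrative):

```python
import base64

def basic_auth_header(user, password):
    """Build the Authorization header that HTTP Basic Authentication sends.

    The Auth User / Auth Password settings produce a header of this form;
    an API token would instead be passed as its own HTTP header.
    """
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}

headers = basic_auth_header("username", "password")
```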

  4. Acknowledgement (Optional):

    • Acknowledgements Enabled (Disabled): Whether or not end-to-end acknowledgements are enabled. When enabled, any source connected to this sink that supports end-to-end acknowledgements will wait for events to be acknowledged by the sink before acknowledging them at the source.

  5. Encoding (Optional):

    • Fields to Exclude (Add): List any fields that should be excluded from the serialized payload. Default: host

      Examples

      fields1

      date

      host

    • Encoding Timestamp Format: Specify the timestamp format (default is RFC 3339).

      Options

      • RFC 3339 timestamp: Human-readable date-time format with timezone (ISO 8601-based)

      • Unix timestamp: Seconds since January 1, 1970 (UTC epoch)
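
The Encoding options above control which fields survive serialization and how timestamps are rendered. A sketch of field exclusion and the two timestamp formats, assuming events carry a datetime under a timestamp key (the helper is illustrative):

```python
import json
from datetime import datetime, timezone

def encode_event(event, exclude=("host",), timestamp_format="rfc3339"):
    """Serialize an event, dropping excluded fields and formatting the
    timestamp as either RFC 3339 or a Unix epoch, as in the Encoding step."""
    out = {k: v for k, v in event.items() if k not in exclude}
    ts = out.get("timestamp")
    if isinstance(ts, datetime):
        if timestamp_format == "rfc3339":
            out["timestamp"] = ts.isoformat()       # e.g. 2024-01-01T00:00:00+00:00
        else:
            out["timestamp"] = int(ts.timestamp())  # seconds since the UTC epoch
    return json.dumps(out)

event = {"host": "web-1", "message": "ok",
         "timestamp": datetime(2024, 1, 1, tzinfo=timezone.utc)}
rfc3339 = encode_event(event)
unix = encode_event(event, timestamp_format="unix")
```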

  6. Bulk Mode Configuration:

    • Bulk Index: The name of the index to write events to. Relevant when Mode=Bulk.

      Examples

      application-{{ application_id }}-%Y-%m-%d

      {{ index }}

    • Bulk Action: Action to use when making requests to the Elasticsearch Bulk API. Currently, Observo supports only index and create; the update and delete actions are not supported. Relevant when Mode=Bulk.

      Options

      • Create: Adds a new document only if it doesn’t exist

      • Index: Adds or replaces a document with the same ID
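
Bulk Index templates mix {{ field }} placeholders with strftime date specifiers. A hedged sketch of how such a template could be expanded; the render_index helper is illustrative, not Observo's implementation:

```python
import re
from datetime import datetime, timezone

def render_index(template, event, now=None):
    """Expand a Bulk Index template such as
    "application-{{ application_id }}-%Y-%m-%d": {{ field }} placeholders
    are filled from the event, strftime specifiers from the event time."""
    now = now or datetime.now(timezone.utc)
    # Substitute {{ field }} placeholders with event values
    rendered = re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        lambda m: str(event.get(m.group(1), "")),
        template,
    )
    # Then expand strftime date specifiers like %Y-%m-%d
    return now.strftime(rendered)

name = render_index(
    "application-{{ application_id }}-%Y-%m-%d",
    {"application_id": "checkout"},
    now=datetime(2024, 5, 17, tzinfo=timezone.utc),
)
```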

  7. Request Configurations:

    • Request Concurrency: Configuration for outbound request concurrency. Default: Adaptive concurrency.

      Options

      • Adaptive concurrency: Adjusts parallelism based on system load

      • A fixed concurrency of 1: Processes one task at a time only

    • Request Rate Limit Duration Secs: The time window used for the rate_limit_num option. Default: 1.

    • Request Rate Limit Num: The maximum number of requests allowed within the rate_limit_duration_secs time window. Default: Unlimited.

    • Request Retry Attempts: The maximum number of retries to make for failed requests. The default represents an infinite number of retries. Default: Unlimited.

    • Request Retry Initial Backoff Secs: The amount of time to wait, in seconds, before attempting the first retry for a failed request. After the first retry has failed, the Fibonacci sequence is used to select future backoffs. Default: 1.

    • Request Retry Max Duration Secs: The maximum amount of time to wait between retries. Default: 3600.

    • Request Timeout Secs: The time a request waits before being aborted. It is recommended that this value is not lowered below the service’s internal timeout, as this could create orphaned requests, and duplicate data downstream. Default: 60.
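
The retry options above describe an initial backoff followed by Fibonacci growth, capped by Request Retry Max Duration Secs. The resulting wait schedule can be computed as follows (a sketch of the described behavior, not Observo's internal code):

```python
def backoff_schedule(initial=1, max_duration=3600, attempts=10):
    """Compute successive retry waits: the first retry waits `initial`
    seconds, subsequent waits follow the Fibonacci sequence, and every
    wait is capped at `max_duration` seconds."""
    waits = []
    a, b = initial, initial
    for _ in range(attempts):
        waits.append(min(a, max_duration))
        a, b = b, a + b  # advance the Fibonacci sequence
    return waits

schedule = backoff_schedule(initial=1, max_duration=30, attempts=10)
```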

  8. Data Stream Mode Configuration:

    • Data Stream Auto Routing (True): Automatically routes events by deriving the data stream name from event fields. The data stream name is <type>-<dataset>-<namespace>, where each value comes from the data_stream configuration field of the same name. If enabled, the values of the Data Stream Type, Data Stream Dataset, and Data Stream Namespace event fields are used when present; otherwise, the values set here in the configuration are used.

    • Data Stream Dataset: The data stream dataset used to construct the data stream at index time. Default: generic

      Examples

      generic

      nginx

      {{ service }}

    • Data Stream Namespace: The data stream namespace used to construct the data stream at index time. Default: default

      Example

      {{ environment }}

    • Data Stream Sync Fields (True): Automatically adds and syncs the data_stream.* event fields if they are missing from the event. This ensures that fields match the name of the data stream that is receiving events.

    • Data Stream Type: The data stream type used to construct the data stream at index time. Default: log

      Examples

      metrics

      synthetics

      {{ type }}

  9. Batching Requirements (Optional):

    • Batch Max Bytes (Increment as needed): The maximum size of a batch that will be processed by a sink. This is based on the uncompressed size of the batched events, before they are serialized / compressed.

    • Batch Max Events (Increment as needed): The maximum size of a batch before it is flushed.

    • Batch Timeout Seconds (Increment as needed): The maximum age of a batch before it is flushed. Default: 1
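
The three batching limits interact: a batch is flushed as soon as any one of max bytes, max events, or the timeout is reached. A simplified model of that logic (not Observo's actual implementation; byte size is approximated via JSON length):

```python
import json
import time

class Batcher:
    """Accumulate events and flush when any batching limit is hit:
    max bytes (uncompressed size), max events, or batch timeout."""

    def __init__(self, max_bytes=10_485_760, max_events=1000, timeout_secs=1.0):
        self.max_bytes = max_bytes
        self.max_events = max_events
        self.timeout_secs = timeout_secs
        self.events, self.bytes, self.started = [], 0, None

    def add(self, event):
        if self.started is None:
            self.started = time.monotonic()  # batch age starts at first event
        self.events.append(event)
        self.bytes += len(json.dumps(event).encode("utf-8"))

    def should_flush(self):
        if not self.events:
            return False
        return (self.bytes >= self.max_bytes
                or len(self.events) >= self.max_events
                or time.monotonic() - self.started >= self.timeout_secs)

b = Batcher(max_events=2)
b.add({"message": "a"})
b.add({"message": "b"})
```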

  10. AWS Configuration (Optional):

    • AWS Endpoint: Custom endpoint for use with AWS-compatible services.

      Example

      http://127.0.0.0:5000/path/to/service

    • AWS Region: The AWS Region of the target service.

      Example

      us-east-1

  11. TLS Configuration (Optional):

    • TLS CA: Provide the CA certificate in PEM format.

    • TLS Certificate: Provide the client certificate in PEM format.

    • TLS Key: Provide the private key in PEM format.

    • TLS Key Passphrase: If the key is encrypted, provide the passphrase.

    • Verify Certificate: Enable or disable certificate verification.

    • Verify Hostname: Enable or disable hostname verification.

  12. Buffering Configuration:

    • Buffer Type: Specifies the buffering mechanism for event delivery.

      Options

      • Memory: High-performance, in-memory buffering.

        • Max Events: The maximum number of events allowed in the buffer. Default: 500

        • When Full: Event handling behavior when the buffer is full. Default: Block

          • Block: Wait for free space in the buffer. This applies backpressure up the topology, signalling that sources should slow down the acceptance/consumption of events. No data is lost, but data piles up at the edge.

          • Drop Newest: Drop the event instead of waiting for free space in the buffer. The event is intentionally dropped. This mode is typically used when performance is the highest priority and it is preferable to temporarily lose events rather than slow the acceptance/consumption of events.

      • Disk: Lower-performance, less costly, on-disk buffering.

        • Max Bytes Size: The maximum number of bytes allowed in the buffer. Must be at least 268435488.

        • When Full: Same Block and Drop Newest behavior as described for the Memory buffer. Default: Block

  13. Advanced Settings (Optional):

    • Metrics Timezone: The name of the timezone to apply to timestamp conversions that do not contain an explicit time zone. The time zone name may be any name in the TZ database, or local to indicate system local time.

      Examples

      local

      America/New_York

      EST5EDT

    • Metrics Host Tag: Name of the tag in the metric to use for the source host.

      Examples

      host

      hostname

    • Api Version: The API version of Elasticsearch.

      Options

      • Auto-detect the API version

      • Elasticsearch 6.x API

      • Elasticsearch 7.x API

      • Elasticsearch 8.x API

    • Compression: Compression configuration. All compression algorithms use the default compression level unless otherwise specified. Default: No compression.

      Options

      • Gzip compression: Widely used DEFLATE-based compression format

      • No compression: No compression applied to data

      • Zlib compression: DEFLATE-based, lightweight compression library

    • Distribution Retry Initial Backoff Secs (Increment as needed): Initial delay between attempts to reactivate endpoints once they become unhealthy.

    • Distribution Retry Max Duration Secs (Increment as needed): Maximum delay between attempts to reactivate endpoints once they become unhealthy.

    • Doc Type: The doc_type for your index data. Only relevant for Elasticsearch <= 6.X. Deprecated for version >= 7.0. Default: _doc

    • Metrics Metric Tag Values: Controls how metric tag values are encoded.

      Options

      • Tags exposed as single strings: When set to single, only the last non-bare value of tags is displayed with the metric.

      • Tags exposed as arrays of strings: When set to full, all metric tags are exposed as separate assignments.

    • Query (add as needed): A query string parameter and its value to add to the query string.

      Example

      key: X-Powered-By, value: Observo

    • Request Retry Partial (False): Whether to retry successful requests containing partial failures. To avoid duplicates in Elasticsearch, use the id_key option.

    • Suppress Type Name (False): Whether to send the type field to Elasticsearch. Deprecated in Elasticsearch 7.x and removed in Elasticsearch 8.x. If enabled, the doc_type option will be ignored.

    • Rejection Reporting: Elasticsearch may reject some events due to internal constraints, such as non-adherence to the schema. Turning rejection reporting on (temporarily) can help isolate the cause of rejections. Smaller batch sizes often make corner cases easier to debug while keeping overhead in check.

      Options

      • Report stats but drop request and response payloads

      • Report response payload but drop request (significant overhead)

      • Report both request and response (very high overhead)
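
When Gzip compression is selected, the serialized payload is compressed before transmission and the request carries a matching Content-Encoding header. A sketch using Python's standard gzip module (the function name is illustrative):

```python
import gzip
import json

def compress_payload(events, compression="gzip"):
    """Serialize events to NDJSON and optionally Gzip-compress the body,
    as the Compression option does before sending. Returns the body and
    the headers a compressed request would carry."""
    body = ("\n".join(json.dumps(e) for e in events) + "\n").encode("utf-8")
    if compression == "gzip":
        return gzip.compress(body), {"Content-Encoding": "gzip"}
    return body, {}

raw, raw_headers = compress_payload([{"m": "x"}], compression="none")
packed, headers = compress_payload([{"m": "x" * 1000}])
```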

  14. Save and Test Configuration:

    • Save the configuration settings.

    • Send sample data to verify that it reaches the specified Elasticsearch index.

Example Scenarios

Apex Financial Services, a fictitious enterprise in the financial services sector, specializes in wealth management and transaction processing. To enhance observability and gain actionable insights into their transaction logs and customer interaction data, Apex decides to integrate their Observo AI platform with an Elasticsearch cluster. This integration will enable advanced search, analytics, and visualization of their financial data, helping them monitor market trends, detect anomalies, and ensure regulatory compliance. Below is the detailed configuration process for setting up Elasticsearch as a destination in Observo AI, based on the provided documentation, with all required fields specified.

Standard Elasticsearch Destination Setup

Here is a standard Elasticsearch destination configuration example. Only the required sections and their associated field updates are shown below:

General Settings

  • Name: apex-es-transactions (unique identifier for the Elasticsearch destination)

  • Description: Elasticsearch destination for transaction logs and customer interactions (optional description for clarity)

  • Elasticsearch Endpoint: https://es-cluster.apexfin.com:9200 (the secure endpoint for the Elasticsearch cluster)

  • Mode: Bulk (uses the Elasticsearch Bulk API to batch multiple operations in a single request)

  • Id Key: transaction_id (maps to Elasticsearch’s _id field for unique transaction identification)

  • Pipeline: apex-transaction-pipeline (the ingest pipeline to apply for data preprocessing in Elasticsearch)

Authentication

  • Auth Strategy: HTTP Basic Authentication (uses username and password for secure access)

  • Auth User: apex_admin (username for Elasticsearch authentication)

  • Auth Password: ${ELASTICSEARCH_PASSWORD} (password stored in an environment variable for security)

Encoding

  • Fields to Exclude: host, client_ip (excludes sensitive fields from the serialized payload)

  • Encoding Timestamp Format: RFC 3339 timestamp (human-readable ISO 8601-based format with timezone)

Bulk Mode Configuration

  • Bulk Index: transactions-%Y-%m-%d (dynamic index name based on date for daily transaction logs)

  • Bulk Action: Index (adds or replaces documents with the same transaction_id)

Request Configuration

  • Request Concurrency: Adaptive concurrency (adjusts parallelism based on system load for optimal performance)

  • Request Rate Limit Duration Secs: 1 (time window for rate limiting; default)

  • Request Rate Limit Num: 100 (maximum requests allowed within the time window)

  • Request Retry Attempts: 3 (maximum retries for failed requests)

  • Request Retry Initial Backoff Secs: 1 (initial wait time before retrying a failed request)

  • Request Retry Max Duration Secs: 3600 (maximum wait time between retries; default)

  • Request Timeout Secs: 60 (time before a request is aborted; default)

Data Stream Mode Configuration

  • Data Stream Auto Routing: True (automatically derives the data stream name from event fields)

  • Data Stream Dataset: transactions (dataset used to construct the data stream name)

  • Data Stream Namespace: production (namespace for the data stream, reflecting the environment)

  • Data Stream Sync Fields: True (ensures data_stream.* fields match the receiving data stream)

  • Data Stream Type: log (type used to construct the data stream name)

Batching Configuration

  • Batch Max Bytes: 10485760 (maximum batch size of 10 MB for uncompressed events)

  • Batch Max Events: 1000 (maximum number of events in a batch before flushing)

  • Batch Timeout Seconds: 1 (maximum age of a batch before flushing; default)

TLS Configuration

  • TLS CA: /certs/apex_ca.pem (path to the CA certificate in PEM format)

  • TLS Certificate: /certs/apex_client_cert.pem (path to the client certificate in PEM format)

  • TLS Key: /certs/apex_client_key.pem (path to the private key in PEM format)

  • TLS Key Passphrase: ${TLS_KEY_PASSPHRASE} (passphrase for the encrypted private key, stored securely)

  • Verify Certificate: True (enables certificate verification for secure communication)

  • Verify Hostname: True (enables hostname verification for added security)

Buffering Configuration

  • Buffer Type: Memory (uses high-performance, in-memory buffering)

  • Max Events: 500 (maximum number of events in the buffer; default)

  • When Full: Block (applies backpressure to wait for free space, preventing data loss)

Advanced Settings

  • Metrics Timezone: America/New_York (timezone for timestamp conversions, matching Apex’s primary location)

  • Metrics Host Tag: hostname (tag used for the source host in metrics)

  • Api Version: Auto-detect (automatically detects the Elasticsearch API version)

  • Compression: Gzip compression (applies Gzip compression to reduce data transfer size)

  • Distribution Retry Initial Backoff Secs: 1 (initial delay for retrying unhealthy endpoints)

  • Distribution Retry Max Duration Secs: 3600 (maximum delay for retrying unhealthy endpoints)

  • Doc Type: _doc (default document type for Elasticsearch; relevant for <= 6.x)

  • Metrics Metric Tag Values: Tags exposed as arrays of strings (exposes all metric tags as separate assignments)

  • Query: X-Powered-By: Observo (adds a query string parameter to identify the source)

  • Request Retry Partial: False (does not retry requests with partial failures, avoiding duplicates)

  • Suppress Type Name: False (sends the type field to Elasticsearch; relevant for <= 6.x)

  • Rejection Reporting: Report stats but drop request and response payloads (reports stats to help debug rejections with minimal overhead)

Test Configuration

  • Save the configuration in the Observo AI interface.

  • Send sample transaction data (e.g., a mock transaction log) to verify ingestion.

  • Use Elasticsearch’s search functionality to confirm that data appears in the transactions-%Y-%m-%d index.

  • Monitor Observo AI’s Notifications tab for any errors or warnings.

Scenario Troubleshooting

  • Authentication Issues: Verify that apex_admin and the password stored in ${ELASTICSEARCH_PASSWORD} are valid and have write permissions.

  • Index Not Found: Ensure the transactions-%Y-%m-%d index pattern is correctly configured in Elasticsearch.

  • Network Issues: Confirm that Observo AI can reach https://es-cluster.apexfin.com:9200 and that firewall rules allow outbound HTTPS traffic.

  • Data Format Errors: Validate that transaction logs match the expected schema and that the apex-transaction-pipeline ingest pipeline is correctly set up.

This configuration enables Apex Financial Services to efficiently stream and analyze their financial data in Elasticsearch, supporting their goals of monitoring market trends, detecting anomalies, and ensuring regulatory compliance.

Troubleshooting

If issues arise with the Elasticsearch destination in Observo AI, use the following steps to diagnose and resolve them:

Verify Configuration Settings

  • Ensure that the Elasticsearch Endpoint URL, Authentication Credentials, and Index are correctly entered and match the Elasticsearch setup.

  • Confirm that the Elasticsearch cluster is operational and accessible.

Check Authentication

  • Verify that the provided credentials are valid and have the necessary permissions to write to the specified index.

  • Ensure that the credentials have not expired or been revoked.

Monitor Logs

  • Check Observo AI’s Notifications tab for errors or warnings related to data transmission.

  • In the Elasticsearch interface, search the specified index to confirm data arrival.

Validate Data Format and Schema

  • Ensure that the data sent from Observo AI matches the expected format and schema in Elasticsearch.

  • If using custom mappings, verify that they are properly configured in Elasticsearch.

Network and Connectivity

  • Ensure that Observo AI can reach the Elasticsearch endpoint over the network.

  • If using firewalls or proxies, verify their configurations to allow necessary traffic.

Common Error Messages

  • "Authentication failed": Indicates invalid or missing credentials. Verify the credentials' validity and permissions.

  • "Index not found": Check that the specified index exists in Elasticsearch and that the credentials have write permissions.

  • "Error in creation of index": If you encounter this error, this is because the index that Observo is writing to does not exist. To fix this issue, do one of the following:

    • Create the index in Elasticsearch

    • Give create_index permissions to Observo.

  • "No data ingested": Confirm that data is being sent and matches the expected format.

  • "Error in writing document to Index": Indicates that the index's limit of total fields [1000] has been exceeded while adding new fields. Increase the index.mapping.total_fields.limit setting on the index in Elasticsearch to resolve this issue.

Test Data Flow

  • Send sample data from Observo AI and verify its ingestion in Elasticsearch.

  • Use Elasticsearch's search functionality to locate and analyze the ingested data.

Resources

For additional guidance and detailed information, refer to the following resources:

  • Observo AI Elasticsearch Documentation: Comprehensive guide to configuring Elasticsearch destination in Observo AI.

  • Elasticsearch Documentation: Instructions for setting up and managing Elasticsearch clusters.

  • Observo AI Support: Contact support for assistance with configuration and troubleshooting.

  • Elasticsearch Community: Engage with the Elasticsearch community for best practices and solutions.
