Datadog Metrics

The Observo AI Datadog Metrics destination enables seamless transmission of metrics data to Datadog for centralized monitoring, visualization, and alerting, supporting Gzip compression, secure API key authentication, and customizable tags for efficient system performance analysis.

Purpose

The Observo AI Datadog Metrics destination enables users to send metrics data to Datadog for centralized monitoring, visualization, and alerting. This destination integrates seamlessly with Datadog’s platform, allowing organizations to leverage Datadog’s powerful metrics analysis capabilities to track system performance and application health.

Prerequisites

Before configuring the Datadog Metrics destination in Observo AI, ensure the following requirements are met to facilitate seamless data ingestion:

  • Datadog Account:

    • Create a Datadog account if one does not already exist. This account serves as the hub for your metrics data.

    • Ensure the account is active and configured to accept metrics data.

    • Note the Datadog site (for example, us1, us3, or eu1) that corresponds to your account’s region.

  • API Key:

    • Generate a Datadog API key in the Datadog platform to authenticate metrics data ingestion (Generate an API Key in Datadog).

    • Navigate to Organization Settings > API Keys in the Datadog UI, create a new key, and securely store its value.

    • Ensure the API key has permissions to send metrics to Datadog.

  • Network Access:

    • Verify that the Observo AI instance can communicate with the Datadog metrics intake endpoint (for example, api.datadoghq.com for US1).

    • Check for firewall rules or network policies that may block outbound HTTPS traffic to Datadog’s metrics intake endpoint on port 443.

  • Datadog Tags (Optional):

    • Prepare any tags you want to apply to metrics for filtering and organization in Datadog, such as env:production or service:webapp.

    • Tags can be configured in Observo AI to align with Datadog’s tagging conventions for metrics.

| Prerequisite | Description | Notes |
| --- | --- | --- |
| Datadog Account | Hub for metrics data | Must be active and configured for metrics |
| API Key | Authenticates metrics data ingestion | Securely store the API key |
| Network Access | Enables communication with Datadog | Ensure HTTPS connectivity to Datadog metrics endpoint |
| Datadog Tags | Organizes metrics in Datadog | Optional, but recommended for filtering |
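
The network prerequisite can be checked from the host that runs Observo AI before any configuration is saved. The following is a minimal sketch, not an Observo AI feature: it assumes the intake host follows the api.<site> pattern that the text gives for US1, and the function names intake_host and can_reach are illustrative.

```python
import socket


def intake_host(site: str = "datadoghq.com") -> str:
    """Return the API host for a Datadog site (assumed `api.<site>` pattern)."""
    return f"api.{site.strip().lower()}"


def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

Calling can_reach(intake_host("datadoghq.eu")) returns True only when an outbound TCP connection on port 443 can be opened. A successful connection shows that no firewall rule blocks the port; it does not validate the API key or any proxy settings.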

Integration

The Integration section outlines default configurations for the Datadog Metrics destination. To tailor the setup to your environment, consult the Configuration Parameters section in the Datadog documentation for advanced options.

To configure Datadog Metrics as a destination in Observo AI, follow these steps:

  1. Log in to Observo AI:

    • Navigate to the Destinations tab.

    • Click the Add Destinations button and select Create New.

    • Choose Datadog Metrics from the list of available destinations to begin configuration.

  2. General Settings:

    • Name: Add a unique identifier, such as datadog-metrics-1.

    • Description (Optional): Provide a description for the destination.

    • Site (Optional): The Datadog site to send observability data to. Default: datadoghq.com

      Examples

      us3.datadoghq.com

      datadoghq.eu

    • Default Api Key (Optional): The default Datadog API key to use in authentication of HTTP requests. If an event has a Datadog API key set explicitly in its metadata, it will take precedence over this setting.

      Examples

      ${DATADOG_API_KEY_ENV_VAR}

      ef8d5de700e7989468166c40fc8a0ccd

    • Default Namespace (Optional): Specify the default namespace to use for metrics. If an event has a namespace set explicitly in its metadata, it will take precedence over this setting.

      Example

      service_name

  3. Request Configuration (Optional):

    • Request Concurrency: Configuration for outbound request concurrency. Default: Adaptive concurrency.

      | Options | Description |
      | --- | --- |
      | Adaptive concurrency | Adjusts parallelism based on system load |
      | A fixed concurrency of 1 | Processes one task at a time only |

    • Request Rate Limit Duration Secs: The time window used for the rate_limit_num option. Default: 1.

    • Request Rate Limit Num: The maximum number of requests allowed within the rate_limit_duration_secs time window.

    • Request Retry Attempts: The maximum number of retries to make for failed requests. The default represents an infinite number of retries. Default: Unlimited.

    • Request Retry Initial Backoff Secs: The amount of time to wait, in seconds, before attempting the first retry for a failed request. If the first retry fails, the Fibonacci sequence is used to select subsequent backoffs. Default: 1.

    • Request Retry Max Duration Secs: The maximum amount of time to wait between retries. Default: 3600.

    • Request Timeout Secs: The time a request waits before being aborted. It is recommended that this value is not lowered below the service’s internal timeout, as this could create orphaned requests, and duplicate data downstream. Default: 60.

  4. TLS Configuration (Optional):

    • TLS Enabled (False): Whether or not to require TLS for incoming or outgoing connections. When enabled and used for incoming connections, an identity certificate is also required. See tls.crt_file for more information.

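    • TLS CA: The CA certificate, given either as an absolute path to a certificate file or as an inline string in PEM format.

      Example

      /etc/certs/ca.crt

    • TLS CRT: The certificate, given either as an absolute path to a certificate file or as an inline string in PEM format.

      Example

      /etc/certs/tls.crt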

    • TLS Key: Absolute path to a private key file used to identify this server. The key must be in DER or PEM (PKCS#8) format. Additionally, the key can be provided as an inline string in PEM format.

      Example

      /etc/certs/tls.key

    • TLS Key Pass: Passphrase used to unlock the encrypted key file. This has no effect unless TLS Key is set.

      Examples

      ${KEY_PASS_ENV_VAR}

      PassWord1

    • TLS Verify Hostname (False): Enables hostname verification. The hostname used to connect to the remote host must be present in the TLS certificate presented by the remote host, either as the Common Name or as an entry in the Subject Alternative Name extension. Only relevant for outgoing connections. Do NOT set this to false unless you understand the risks.

    • TLS Verify Certificate (False): Enables certificate verification. Certificates must be valid in terms of not being expired, and being issued by a trusted issuer. This verification operates in a hierarchical manner, checking validity of the certificate, the issuer of that certificate and so on until reaching a root certificate. Relevant for both incoming and outgoing connections. Do NOT set this to false unless you understand the risks of not verifying the validity of certificates.

  5. Batching Requirements (Default):

    • Batch Max Bytes (Increment as needed): The maximum size of a batch that will be processed by a sink. This is based on the uncompressed size of the batched events, before they are serialized / compressed.

    • Batch Max Events (Increment as needed): The maximum number of events in a batch before it is flushed.

    • Batch Timeout Seconds (Increment as needed): The maximum age of a batch before it is flushed. Default: 1

  6. Buffering Configuration (Optional):

    • Buffer Type: Specifies the buffering mechanism for event delivery.

      Options

      • Memory: High-performance, in-memory buffering.

        • Max Events: The maximum number of events allowed in the buffer. Default: 500

        • When Full: Event handling behavior when the buffer is full. Default: Block

          • Block: Wait for free space in the buffer. This applies backpressure up the topology, signalling that sources should slow down the acceptance/consumption of events. No data is lost, but data will pile up at the edge.

          • Drop Newest: Drop the event instead of waiting for free space in the buffer. The event is intentionally dropped. This mode is typically used when performance is the highest priority and it is preferable to temporarily lose events rather than slow down the acceptance/consumption of events.

      • Disk: Lower-performance, less costly, on-disk buffering.

        • Max Bytes Size: The maximum size of the buffer in bytes. Must be at least 268435488.

        • When Full: Event handling behavior when the buffer is full. Default: Block

          • Block: Wait for free space in the buffer. This applies backpressure up the topology, signalling that sources should slow down the acceptance/consumption of events. No data is lost, but data will pile up at the edge.

          • Drop Newest: Drop the event instead of waiting for free space in the buffer. The event is intentionally dropped. This mode is typically used when performance is the highest priority and it is preferable to temporarily lose events rather than slow down the acceptance/consumption of events.

  7. Advanced Settings (Optional):

    • Endpoint (Optional): The endpoint to send observability data to. The endpoint must contain an HTTP scheme, and may specify a hostname or IP address and port. If set, overrides the site option.

      Examples

      http://example.com:12345

    • Enable Proxy (False): Defines whether to use a proxy to connect to Datadog. If set to true, the proxy settings must be configured. If enabled:

    • Proxy HTTP Endpoint: Specify the HTTP proxy endpoint.

      Example

      http://proxy.example.com:8080

    • Proxy HTTPS Endpoint: Specify the HTTPS proxy endpoint.

      Example

      https://proxy.example.com:8080

    • Proxy Bypass List (Add as needed): Hosts to avoid connecting through the proxy.

      Example

      *.internal.example.com

  8. Save and Test Configuration:

    • Save the configuration settings in Observo AI.

    • Send sample metrics data and verify that it appears in the Datadog Metrics Explorer under the specified namespace and tags.
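
A few constraints stated in the steps above can be checked before saving: the Endpoint must contain an HTTP scheme, a Disk buffer must be at least 268435488 bytes, and enabling the proxy requires proxy settings. The following is a minimal sketch of such a pre-flight check. The dictionary keys (name, endpoint, buffer_type, and so on) are hypothetical names chosen for illustration and are not Observo AI's actual configuration schema.

```python
from urllib.parse import urlparse

MIN_DISK_BUFFER_BYTES = 268_435_488  # minimum stated for the Disk buffer type


def check_settings(settings: dict) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []

    if not settings.get("name"):
        problems.append("name is required")

    endpoint = settings.get("endpoint")
    if endpoint:
        parsed = urlparse(endpoint)
        # The endpoint must carry an HTTP scheme and a hostname or IP address.
        if parsed.scheme not in ("http", "https") or not parsed.hostname:
            problems.append("endpoint must include an http(s) scheme and a host")

    if settings.get("buffer_type") == "disk":
        if settings.get("buffer_max_bytes", 0) < MIN_DISK_BUFFER_BYTES:
            problems.append(f"disk buffer must be at least {MIN_DISK_BUFFER_BYTES} bytes")

    if settings.get("enable_proxy"):
        if not (settings.get("proxy_http") or settings.get("proxy_https")):
            problems.append("proxy is enabled but no proxy endpoint is set")

    return problems
```

For example, check_settings({"name": "datadog-metrics-1", "endpoint": "example.com:12345"}) reports the missing scheme, while the same call with "http://example.com:12345" returns an empty list.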

Example Scenarios

UrbanTrend Retail Co., a fictitious mid-sized retail chain specializing in fashion and accessories, uses Observo AI to manage its IT infrastructure data, including metrics from point-of-sale systems, e-commerce platforms, and inventory management applications. To centralize monitoring and gain insights into system performance, UrbanTrend integrates Observo AI with Datadog to send metrics data for visualization and alerting. This integration enables real-time tracking of transaction volumes, application response times, and inventory updates, supporting operational efficiency and customer satisfaction.

Standard Datadog Metrics Destination Setup

Here is a standard Datadog Metrics destination configuration example. Only the required sections and their associated field updates are displayed in the tables below:

General Settings

| Field | Value | Description |
| --- | --- | --- |
| Name | urbantrend-datadog-metrics | Unique identifier for the Datadog Metrics destination. |
| Description | Send retail metrics to Datadog for monitoring | Optional description of the destination's purpose. |
| Site | datadoghq.com | Datadog site corresponding to the US1 region for metrics ingestion. |
| Default Api Key | ef8d5de700e7989468166c40fc8a0ccd | Default Datadog API key for authenticating HTTP requests (securely stored). |
| Default Namespace | retail_metrics | Default namespace for metrics, ensuring clear organization in Datadog. |
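
Both Default Api Key and Default Namespace are fallbacks: a value set explicitly in an event's metadata takes precedence. The sketch below illustrates that rule. It is not Observo AI's implementation, the function names are hypothetical, and joining the namespace to the metric name with a dot is an assumption about how the namespace appears in Datadog.

```python
def effective(event_value, default):
    """Event metadata wins; otherwise fall back to the destination default."""
    return event_value if event_value else default


def qualified_metric_name(name, event_namespace=None, default_namespace=None):
    """Prefix the metric name with the effective namespace, if there is one.

    The "." separator is an assumption made for illustration.
    """
    namespace = effective(event_namespace, default_namespace)
    return f"{namespace}.{name}" if namespace else name
```

With the settings in the table above, a metric named checkout.latency that carries no namespace of its own would be looked up in the Metrics Explorer as retail_metrics.checkout.latency under this assumption.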

Request Configuration

| Field | Value | Description |
| --- | --- | --- |
| Request Concurrency | Adaptive concurrency | Adjusts parallelism based on system load for optimal performance. |
| Request Rate Limit Duration Secs | 1 | 1-second time window for rate limiting requests. |
| Request Rate Limit Num | 100 | Maximum of 100 requests allowed within the 1-second window. |
| Request Retry Attempts | 3 | Retries failed requests up to 3 times to ensure reliable delivery. |
| Request Retry Initial Backoff Secs | 1 | Waits 1 second before the first retry, using Fibonacci for subsequent retries. |
| Request Retry Max Duration Secs | 3600 | Maximum 1-hour wait between retries to prevent excessive delays. |
| Request Timeout Secs | 60 | 60-second timeout for HTTP requests to avoid orphaned requests. |
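
To see how the retry settings interact, the waits between attempts can be sketched as a Fibonacci sequence that starts at the initial backoff and is capped at the maximum duration. This is an illustration only: the exact sequence Observo AI uses (for example, whether it starts 1, 1, 2 or 1, 2, 3) is an assumption, and the function names are hypothetical.

```python
from itertools import islice


def fibonacci_backoffs(initial_secs=1, max_secs=3600):
    """Yield Fibonacci-style waits, scaled by initial_secs and capped at max_secs."""
    current, following = initial_secs, initial_secs
    while True:
        yield min(current, max_secs)
        current, following = following, current + following


def retry_schedule(attempts, initial_secs=1, max_secs=3600):
    """Return the waits, in seconds, before each of the first `attempts` retries."""
    return list(islice(fibonacci_backoffs(initial_secs, max_secs), attempts))
```

Under these assumptions, the three retries configured above wait 1, 1, and 2 seconds, so a request that keeps failing is abandoned after roughly 4 seconds of backoff plus the request timeouts. With unlimited retries, the cap of 3600 seconds is what eventually bounds each wait.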

TLS Configuration

| Field | Value | Description |
| --- | --- | --- |
| TLS Enabled | True | Requires TLS for secure outgoing connections to Datadog. |
| TLS CA | -----BEGIN CERTIFICATE-----... | Inline PEM-formatted CA certificate for verifying Datadog's server. |
| TLS CRT | -----BEGIN CERTIFICATE-----... | Certificate in PEM format for secure connections. |
| TLS Key | -----BEGIN PRIVATE KEY-----... | Private key in PEM format for secure connections (securely stored). |
| TLS Key Pass | PassWord1 | Passphrase to unlock the encrypted key file. |
| TLS Verify Hostname | True | Verifies that the hostname in Datadog’s certificate matches the host that Observo AI connects to. |
| TLS Verify Certificate | True | Ensures certificates are valid and issued by a trusted authority. |
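
The two verification switches map onto familiar TLS client behavior. As a rough analogy, and not a description of Observo AI's internals, the sketch below builds a Python ssl context with the same two options. The function name build_tls_context and its parameters are illustrative.

```python
import ssl
from typing import Optional


def build_tls_context(
    verify_certificate: bool = True,
    verify_hostname: bool = True,
    ca_pem: Optional[str] = None,
) -> ssl.SSLContext:
    """Build a client-side TLS context mirroring the two verification settings.

    When ca_pem is given, only that inline PEM CA is trusted; otherwise the
    system's default CA certificates are loaded.
    """
    context = ssl.create_default_context(cadata=ca_pem)
    # Python cannot check hostnames without verifying certificates, so
    # hostname checking is switched off first whenever either option is off.
    if not verify_hostname or not verify_certificate:
        context.check_hostname = False
    if not verify_certificate:
        context.verify_mode = ssl.CERT_NONE
    return context
```

The defaults correspond to the values in the table above: both checks enabled. A client certificate and key stored as files could be attached with context.load_cert_chain(certfile, keyfile, password), which is where a passphrase such as TLS Key Pass would be supplied.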

Batching Configuration

| Field | Value | Description |
| --- | --- | --- |
| Batch Max Bytes | 1048576 | Maximum batch size of 1MB (uncompressed) to balance throughput and efficiency. |
| Batch Max Events | 1000 | Maximum of 1000 events per batch before flushing to Datadog. |
| Batch Timeout Seconds | 1 | Flushes batches after 1 second to ensure timely delivery. |

Buffering Configuration

| Field | Value | Description |
| --- | --- | --- |
| Buffer Type | Memory | Uses high-performance in-memory buffering for metrics delivery. |
| Max Events | 500 | Limits buffer to 500 events to manage memory usage. |
| When Full | Block | Applies backpressure when the buffer is full, preventing data loss. |
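
The batching and buffering values above combine into two simple decisions: when a batch is flushed, and what happens to a new event when the buffer is full. The sketch below expresses both as plain functions. It is a simplified model for reasoning about the settings, not Observo AI's implementation; the function names and the "wait" and "dropped" return values are hypothetical.

```python
from collections import deque


def should_flush(event_count, byte_count, age_secs,
                 max_events=1000, max_bytes=1_048_576, timeout_secs=1.0):
    """A batch is flushed as soon as any one of the three limits is reached."""
    return (
        event_count >= max_events
        or byte_count >= max_bytes  # uncompressed size of the batched events
        or age_secs >= timeout_secs
    )


def offer(buffer: deque, event, max_events=500, when_full="block"):
    """Try to add an event to a bounded in-memory buffer.

    Returns "accepted" when there is room. When the buffer is full, returns
    "dropped" for the drop_newest policy, or "wait" for the block policy,
    meaning the caller must pause (backpressure) and try again later.
    """
    if len(buffer) < max_events:
        buffer.append(event)
        return "accepted"
    if when_full == "drop_newest":
        return "dropped"
    return "wait"
```

With the example values, a quiet pipeline flushes on the 1-second timeout, while a busy one flushes at 1000 events or 1 MB, whichever is reached first. The Block policy trades throughput for completeness; Drop Newest does the opposite.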

Advanced Settings

| Field | Value | Description |
| --- | --- | --- |
| Endpoint | https://api.datadoghq.com | Datadog metrics intake endpoint for sending observability data. |
| Enable Proxy | True | Enables proxy usage for secure and controlled network communication. |
| Proxy HTTP Endpoint | http://proxy.urbantrend.com:8080 | HTTP proxy endpoint for routing requests through the corporate network. |
| Proxy HTTPS Endpoint | https://proxy.urbantrend.com:8080 | HTTPS proxy endpoint for secure request routing. |
| Proxy Bypass List | *.internal.urbantrend.com | Bypasses proxy for internal UrbanTrend domains to optimize performance. |
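
A wildcard entry such as *.internal.urbantrend.com raises the question of which hosts actually skip the proxy. The sketch below shows one plausible reading, using shell-style wildcard matching on the host name. The matching rules Observo AI applies are not specified in this document, so treat the behavior, and the function name bypasses_proxy, as assumptions.

```python
from fnmatch import fnmatchcase


def bypasses_proxy(host: str, bypass_list: list) -> bool:
    """Return True if the host matches any pattern in the bypass list.

    Matching is case-insensitive and uses shell-style wildcards, so "*"
    matches any run of characters, including dots.
    """
    host = host.lower()
    return any(fnmatchcase(host, pattern.lower()) for pattern in bypass_list)
```

Under this reading, inventory.internal.urbantrend.com bypasses the proxy, while api.datadoghq.com is still routed through it, which is the intent of the example: only internal traffic avoids the proxy. Note that the bare name internal.urbantrend.com does not match the pattern, because the pattern requires a leading label and dot.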

Troubleshooting

If issues arise with the Datadog Metrics destination, use the following steps to diagnose and resolve them:

  • Verify Configuration Settings:

    • Ensure all fields, such as Site, Default Api Key, and Default Namespace, are correctly entered and match the Datadog account configuration.

    • Confirm that the Site matches your account’s region (for example, us1 or eu1).

  • Check Authentication:

    • Verify that the API key is valid and has not been revoked or expired.

    • Regenerate the API key in Datadog if necessary and update the Observo AI configuration.

  • Monitor Metrics:

    • Check Observo AI logs for errors or warnings related to metrics data transmission to Datadog.

    • In the Datadog platform, navigate to Metrics > Explorer to confirm that metrics are arriving with the expected namespace and tags.

  • Validate Network Connectivity:

    • Ensure that the Observo AI instance can reach the Datadog metrics intake endpoint (for example, api.datadoghq.com).

    • Check for firewall rules or network policies blocking HTTPS traffic on port 443.

  • Test Data Flow:

    • Send sample metrics data through Observo AI and monitor its arrival in Datadog’s Metrics Explorer.

    • Use the Analytics tab in the targeted Observo AI pipeline to monitor data volume and ensure expected throughput.

  • Check Quotas and Limits:

    • Verify that the Datadog account is not hitting metrics ingestion limits or quotas (Datadog Metrics Quotas and Limits).

    • Adjust batching settings, such as Batch Max Bytes and Batch Max Events, if backpressure or slow data transfer occurs.

| Issue | Possible Cause | Resolution |
| --- | --- | --- |
| Metrics not appearing in Datadog | Incorrect API key or Datadog Site | Verify API key and site in configuration |
| Authentication errors | Expired or invalid API key | Regenerate API key and update configuration |
| Connection failures | Network or firewall issues | Check network policies and HTTPS connectivity |
| Slow metrics transfer | Backpressure or rate limiting | Adjust batching settings or check Datadog quotas |
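
When authentication is the suspected cause, the API key can be checked directly against Datadog, independently of Observo AI. The sketch below calls Datadog's key validation endpoint (GET /api/v1/validate with the DD-API-KEY header) using only the Python standard library. The function names are illustrative, and the site argument is assumed to be the same value entered in the Site field.

```python
import json
import urllib.error
import urllib.request


def build_validate_request(api_key: str, site: str = "datadoghq.com") -> urllib.request.Request:
    """Build the GET request for Datadog's API key validation endpoint."""
    url = f"https://api.{site}/api/v1/validate"
    return urllib.request.Request(url, headers={"DD-API-KEY": api_key}, method="GET")


def api_key_is_valid(api_key: str, site: str = "datadoghq.com", timeout: float = 10.0) -> bool:
    """Return True if Datadog accepts the key, False if it rejects it with 403.

    Other failures (network errors, unexpected status codes) are raised so
    they are not mistaken for an invalid key.
    """
    request = build_validate_request(api_key, site)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return bool(json.load(response).get("valid"))
    except urllib.error.HTTPError as error:
        if error.code == 403:
            return False
        raise
```

A False result points to the key or the site; a raised network error points to connectivity, which separates the first three rows of the table above. Run the check from the same host and network path that Observo AI uses so the result reflects its view of Datadog.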
