AWS Kinesis Firehose

The Observo AI AWS Kinesis Firehose source enables seamless ingestion of logs, metrics, and events from Amazon Kinesis Data Firehose delivery streams into the Observo AI platform for real-time monitoring, analytics, and security, supporting formats like JSON, CSV, and Parquet.

Purpose

The purpose of the Observo AI AWS Kinesis Firehose source is to enable users to ingest data from Amazon Kinesis Data Firehose delivery streams into the Observo AI platform for analysis and processing. It supports formats like JSON, CSV, and Parquet, facilitating the collection of data such as logs, metrics, or events sent via HTTP endpoints. This integration helps organizations streamline data pipelines, enhance observability, and support use cases like real-time monitoring, analytics, and security by processing data from AWS Kinesis Firehose efficiently.

Prerequisites

Before configuring the AWS Kinesis Firehose source in Observo AI, ensure the following requirements are met to facilitate seamless data ingestion:

  • Observo AI Platform Setup:

    • The Observo AI platform must be installed and operational, with support for AWS Kinesis Firehose as a data source.

    • Verify that the platform supports common data formats such as JSON, CSV, or Parquet. Additional formats may require specific parser configurations.

  • AWS Account and Permissions:

    • An active AWS account with access to the target Kinesis Data Firehose delivery stream is required.

    • The Kinesis Data Firehose delivery stream must be configured to send data to an HTTP endpoint provided by Observo AI.

    • Required IAM permissions:

      • For Kinesis Data Firehose: firehose:PutRecord, firehose:PutRecordBatch, and firehose:DescribeDeliveryStream.

      • For accessing the Observo AI endpoint: Ensure the Firehose delivery stream has permissions to send HTTP requests to the Observo AI endpoint.

  • Authentication:

    • Prepare one of the following authentication methods:

      • Auto Authentication: Use IAM roles, shared credentials, or environment variables.

      • Manual Authentication: Provide an AWS access key and secret key.

      • Secret Authentication: Use a stored secret within Observo AI's secure storage.

  • Network and Connectivity:

    • Ensure Observo AI can receive data from the Kinesis Data Firehose delivery stream endpoint (e.g., firehose.<region>.amazonaws.com).

    • Check for proxy settings, firewall rules, or VPC endpoint configurations that may affect connectivity to AWS services or the Observo AI endpoint.
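As a sanity check, the stream-side permissions listed above can be exercised with a short script. This is a minimal sketch assuming boto3 is installed and AWS credentials are configured; the client is injected so the helpers can be tested without live AWS access, and the stream name and region shown are placeholders, not values from this guide.

```python
# Sketch: exercise the Firehose permissions listed above.
# The client is passed in so this can run against a stub or a real boto3 client.
def stream_status(client, stream_name: str) -> str:
    """Return the delivery stream status (uses firehose:DescribeDeliveryStream)."""
    desc = client.describe_delivery_stream(DeliveryStreamName=stream_name)
    return desc["DeliveryStreamDescription"]["DeliveryStreamStatus"]

def send_test_record(client, stream_name: str, data: bytes) -> None:
    """Send a single test event (uses firehose:PutRecord)."""
    client.put_record(DeliveryStreamName=stream_name, Record={"Data": data})

# Example wiring against real AWS (uncomment with valid credentials;
# "my-delivery-stream" and "us-east-1" are placeholders):
# import boto3
# fh = boto3.client("firehose", region_name="us-east-1")
# print(stream_status(fh, "my-delivery-stream"))  # an active stream reports "ACTIVE"
```

If either call raises an access-denied error, revisit the IAM permissions listed above before configuring the source.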

| Prerequisite | Description | Notes |
| --- | --- | --- |
| Observo AI Platform | Must be installed and support Kinesis Firehose sources | Verify support for JSON, CSV, Parquet, etc.; additional parsers may be needed |
| AWS Account | Active account with Kinesis Firehose access | Configure Firehose to send data to Observo AI endpoint |
| IAM Permissions | Required for Firehose and endpoint access | See permissions list above |
| Authentication | Auto, Manual, or Secret | Prepare credentials accordingly |
| Network | Connectivity to AWS Firehose and Observo AI endpoint | Check VPC endpoints, proxies, and firewalls |

Integration

The Integration section outlines the configuration of the AWS Kinesis Firehose source. To configure AWS Kinesis Firehose as a source in Observo AI, follow these steps to set up and test the data flow:

  1. Log in to Observo AI:

    • Navigate to the Sources tab.

    • Click the Add Source button and select Create New.

    • Choose AWS Kinesis Firehose from the list of available sources to begin configuration.

  2. General Settings:

    • Name: A unique identifier for the source such as kinesis-firehose-source-1.

    • Description (Optional): Provide a description for the source.

    • Socket address: The socket address to listen on for connections, in the format host:port. The port must be in the range [1-65535].

      Example

      0.0.0.0:10000

    • Store Access Key (False): Whether to store the AWS Firehose access key in event secrets. If set to true and an incoming request contains an access key sent by AWS Firehose, the key is kept in the event secrets as “aws_kinesis_firehose_access_key”.
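The host:port format described above can be validated with a short sketch (illustrative only, not part of the Observo AI platform):

```python
# Sketch: validate a "host:port" socket address as described above.
def parse_socket_address(addr: str) -> tuple[str, int]:
    host, sep, port_str = addr.rpartition(":")
    if not sep or not host or not port_str.isdigit():
        raise ValueError(f"expected host:port, got {addr!r}")
    port = int(port_str)
    if not 1 <= port <= 65535:
        raise ValueError(f"port {port} is outside the range [1-65535]")
    return host, port

print(parse_socket_address("0.0.0.0:10000"))  # ('0.0.0.0', 10000)
```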

  3. Framing (Optional):

    • Framing Delimiter (Empty): The character that delimits byte sequences.

    • Framing Max Length (None): The maximum length of the byte buffer.

    • Framing Method (Empty): The framing method.

      | Options | Description |
      | --- | --- |
      | Byte Frames | Byte frames are passed through as-is according to the underlying I/O boundaries (for example, split between messages or stream segments). |
      | Character Delimited | Byte frames which are delimited by a chosen character. |
      | Length Delimited | Byte frames which are prefixed by an unsigned big-endian 32-bit integer indicating the length. |
      | Newline Delimited | Byte frames which are delimited by a newline character. |
      | Octet Counting | Byte frames according to the octet counting format. |

    • Framing Newline Delimited Max Length (Empty): The maximum length of the byte buffer. This length does not include the trailing delimiter. By default, there is no maximum length enforced. If events are malformed, this can lead to additional resource usage as events continue to be buffered in memory, and can potentially lead to memory exhaustion in extreme cases. If there is a risk of processing malformed data, such as logs with user-controlled input, consider setting the maximum length to a reasonably large value as a safety net. This will ensure that processing is not truly unbounded.

    • Framing Octet Counting Max Length (Empty): The maximum length of the byte buffer.
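To illustrate the framing methods above, here is a simplified model of newline-delimited and length-delimited decoding. This is a sketch for intuition only, not Observo AI's implementation; real framing is incremental and enforces the max-length limits described above.

```python
import struct

# Newline Delimited: frames are split on "\n"; the trailing delimiter is not
# part of the frame. (Simplified: empty frames are dropped.)
def newline_frames(buf: bytes) -> list[bytes]:
    return [frame for frame in buf.split(b"\n") if frame]

# Length Delimited: each frame is prefixed by an unsigned big-endian 32-bit
# integer giving the payload length.
def length_delimited_frames(buf: bytes) -> list[bytes]:
    frames, offset = [], 0
    while offset + 4 <= len(buf):
        (length,) = struct.unpack_from(">I", buf, offset)
        offset += 4
        frames.append(buf[offset:offset + length])
        offset += length
    return frames

print(newline_frames(b"first\nsecond\n"))  # [b'first', b'second']
data = struct.pack(">I", 5) + b"hello" + struct.pack(">I", 5) + b"world"
print(length_delimited_frames(data))       # [b'hello', b'world']
```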

  4. Advanced Settings:

    • Record Compression: The compression scheme used to decompress records within the Firehose message. Some services, such as AWS CloudWatch Logs, compress events with gzip before sending them to AWS Kinesis Firehose. This option can be used to automatically decompress them before forwarding them to the next component. Note that this is different from the Content encoding option of the Firehose HTTP endpoint destination, which controls the content encoding of the entire HTTP request. Default: auto

      | Options | Description |
      | --- | --- |
      | auto | Automatically tries to identify the compression scheme by examining the file signature, also referred to as magic bytes. If decompression with the detected format fails, the record is passed through unchanged. If you know the records are consistently gzip encoded (e.g., from AWS CloudWatch Logs), set this field to gzip to reject any non-gzipped records. |
      | gzip | GZIP compression |
      | none | Uncompressed |

    • Access Keys: An optional list of access keys to authenticate requests against. AWS Kinesis Firehose can be configured to pass along a user-configurable access key with each request. If keys are configured here, they must match the value configured in Firehose; if the list is empty, all requests are allowed. (Add as needed)

      Examples

      A94A8FE5CCB236761C4C08

      A94A8FE5CCB193441C4C08
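The auto compression behavior described above, sniffing the gzip magic bytes and passing unrecognized records through unchanged, can be sketched as follows. This is a simplified model for intuition, not Observo AI's implementation.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # gzip file signature ("magic bytes")

def decompress_record(record: bytes, scheme: str = "auto") -> bytes:
    if scheme == "none":
        return record
    if scheme == "gzip":
        # Strict mode: raises on non-gzipped input, i.e. rejects the record.
        return gzip.decompress(record)
    # "auto": detect gzip by its signature; pass through unchanged on failure.
    if record[:2] == GZIP_MAGIC:
        try:
            return gzip.decompress(record)
        except OSError:
            pass
    return record

print(decompress_record(gzip.compress(b'{"log": "hello"}')))  # b'{"log": "hello"}'
print(decompress_record(b"plain text"))                       # b'plain text'
```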

  5. Parser Config:

    • Enable Source Log Parser: Disabled by default. Toggle Enable Source Log Parser to enable it, then select an appropriate parser from the Source Log Parser dropdown.

    • Add additional parsers as needed. Events are processed by each parser in the order they are defined.

  6. Pattern Extractor:

  7. Archival Destination:

    • Toggle the Enable Archival on Source switch to enable archival.

    • Under Archival Destination, select from the list of Archival Destinations (Required)

  8. Save and Test Configuration:

    • Save the configuration settings in Observo AI.

    • Send sample data through the Kinesis Data Firehose delivery stream and verify ingestion in the Analytics tab to confirm data flow.
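For background, Kinesis Data Firehose delivers to HTTP endpoints as a JSON request body whose records carry base64-encoded data (per AWS's HTTP endpoint delivery request format; the endpoint's response must echo the request's requestId). The sketch below decodes such a body; the field shape follows AWS's documented format, and the sample values are made up.

```python
import base64
import json

# Sketch: decode the records from a Firehose HTTP endpoint delivery request
# body. Each record's "data" field is base64-encoded.
def decode_firehose_body(body: str) -> list[bytes]:
    payload = json.loads(body)
    return [base64.b64decode(record["data"]) for record in payload["records"]]

# Sample body (values are illustrative, not real request data):
sample = json.dumps({
    "requestId": "example-request-id",
    "timestamp": 1700000000000,
    "records": [{"data": base64.b64encode(b'{"msg": "hi"}').decode()}],
})
print(decode_firehose_body(sample))  # [b'{"msg": "hi"}']
```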

Example Scenarios

The Observo AI AWS Kinesis Firehose Source makes it easy to ingest streaming telemetry data delivered via Amazon Kinesis Data Firehose. Its flexible configuration enables you to efficiently parse, enrich, and route real-time data through your pipelines. This is ideal for teams that need to process continuously streamed logs and metrics for monitoring, analytics, or security use cases.

Standard AWS Kinesis Firehose Source Setup

Here is a standard AWS Kinesis Firehose Source configuration example. Only the required sections and their associated field updates are displayed in the table below:

| General Settings | Example Inputs |
| --- | --- |
| Name | my-s3-fire-hose |
| Socket Address | 0.0.0.0:8000 |

Troubleshooting

If issues arise with the AWS Kinesis Firehose source in Observo AI, use the following steps to diagnose and resolve them:

  • Verify Configuration Settings:

    • Ensure all fields, such as Delivery Stream Name, Region, and Endpoint, are correctly entered and match the AWS and Observo AI setup.

    • Confirm that the Firehose delivery stream is configured to send data to the Observo AI HTTP endpoint.

  • Check Authentication:

    • Verify the authentication method:

      • For Auto authentication, ensure IAM roles, shared credentials, or environment variables are correctly configured.

      • For Manual authentication, check that the access key and secret key are valid.

      • For Secret authentication, confirm the secret is accessible in Observo AI.

  • Validate Permissions:

    • Ensure the credentials have the required permissions: firehose:PutRecord, firehose:PutRecordBatch, and firehose:DescribeDeliveryStream.

    • Verify that the Firehose delivery stream has permissions to send HTTP requests to the Observo AI endpoint.

  • Network and Connectivity:

    • Check for firewall rules, proxy settings, or VPC endpoint configurations that may block access to firehose.<region>.amazonaws.com or the Observo AI endpoint.

    • Test connectivity using the AWS CLI or tools like curl with similar proxy configurations to verify access.

  • Common Error Messages:

    • "Inaccessible host": May indicate DNS issues or firewall restrictions. Ensure endpoints are reachable and check DNS settings.

    • "Missing credentials": Verify that the authentication method is correctly configured. For IAM roles, ensure the role is assumed correctly.

    • "Delivery stream does not exist": Check the delivery stream name and ensure there are no certificate validation issues. Consider adding CA certificates if needed.

  • Monitor Logs and Data:

    • Verify that data is being ingested by monitoring the Firehose delivery stream and Observo AI endpoint activity.

    • Use the Analytics tab in the targeted Observo AI pipeline to monitor data volume and ensure expected throughput.

    • Check Observo AI logs for errors or warnings related to data ingestion from the Firehose source.
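When diagnosing connectivity, a quick TCP reachability probe can complement curl or the AWS CLI. A minimal sketch; the host and port are placeholders for the Observo AI listener or an AWS endpoint reached through your proxy/VPC configuration.

```python
import socket

# Sketch: return True if a TCP connection to host:port succeeds within the
# timeout, False otherwise (DNS failures, refusals, and timeouts all count
# as unreachable).
def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example (placeholder values): can_connect("0.0.0.0", 10000)
```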

| Issue | Possible Cause | Resolution |
| --- | --- | --- |
| Data not ingested | Incorrect delivery stream or endpoint configuration | Verify Firehose configuration and Observo AI endpoint |
| Authentication errors | Invalid credentials or role | Check authentication method and permissions |
| Connectivity issues | Firewall or proxy blocking access | Test network connectivity and VPC endpoints |
| "Inaccessible host" | DNS or firewall issues | Ensure endpoints are reachable and check DNS |
| "Missing credentials" | Authentication misconfiguration | Verify IAM roles or manual credentials |
| "Delivery stream does not exist" | Incorrect stream name or certificate issues | Check stream name and certificate settings |

