Azure Event Hub

Azure Event Hubs is Microsoft's data streaming service for collecting and processing large volumes of event data. It supports real-time data ingestion from various sources using the Kafka protocol.

Purpose

The Observo AI Azure Event Hub source enables the ingestion of real-time data streams from Azure Event Hubs into the Observo AI platform for observability and analytics. It allows organizations to process high-throughput event data, such as logs or telemetry, for monitoring and threat detection. This integration facilitates scalable, secure, and efficient data collection from Azure’s event streaming service.

Prerequisites

Before configuring the Azure Event Hubs source in Observo AI, ensure the following requirements are met to facilitate seamless data ingestion:

  • Observo AI Platform Setup:

    • The Observo AI Site must be installed and available.

    • Verify the expected data formats, such as JSON or other structured data from Event Hubs.

  • Azure Event Hubs Configuration:

    • An active Azure subscription with an Event Hubs namespace and an event hub created (Create an Event Hub).

    • Required permissions:

      • The account or application used by Observo AI must have read access to the event hub, typically via a Shared Access Signature (SAS) policy or Microsoft Entra ID role with “Azure Event Hubs Data Receiver” permissions (Azure Event Hubs Documentation).

  • Authentication:

    • Prepare one of the following authentication methods:

      • Basic Authentication: Use a connection string in the format Endpoint=sb://<FQDN>/;SharedAccessKeyName=<key-name>;SharedAccessKey=<key-value>.

      • OAuth Authentication: Use Microsoft Entra ID with a Client ID, Tenant ID, and Scope (e.g., https://<Event-Hubs-Namespace-Host-name>/.default).

  • Network and Connectivity:

    • Ensure Observo AI can communicate directly with Azure Event Hubs, as HTTP proxies are not supported.

    • Adjust firewall rules to allow traffic over TCP, typically on port 9093, for the binary protocol used by Azure Event Hubs (Azure Event Hubs Overview).

| Prerequisite | Description | Notes |
| --- | --- | --- |
| Observo AI Platform | Must support Azure Event Hubs | Verify data format compatibility |
| Azure Event Hubs | Active namespace and event hub | Create via Azure portal |
| Permissions | Read access to event hub | Use SAS or Entra ID roles |
| Authentication | Basic or OAuth | Prepare connection string or Entra ID credentials |
| Network | Direct TCP connectivity | Allow port 9093, no HTTP proxies |
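
Before pasting a connection string into the source configuration, it can help to confirm that it matches the Basic Authentication format listed in the prerequisites. The following is a minimal sketch, not part of Observo AI: the function name `parse_connection_string` is hypothetical, and it checks only the three fields named above.

```python
def parse_connection_string(conn_str: str) -> dict[str, str]:
    """Split an Event Hubs connection string into its key/value parts.

    Hypothetical helper for illustration; it validates only the three
    fields named in the prerequisites.
    """
    parts: dict[str, str] = {}
    for segment in conn_str.strip().split(";"):
        if not segment:
            continue  # tolerate a trailing semicolon
        # Split on the first "=" only, since keys are often base64 and
        # may themselves end in "=" padding.
        key, sep, value = segment.partition("=")
        if not sep:
            raise ValueError(f"malformed segment: {segment!r}")
        parts[key] = value

    required = ("Endpoint", "SharedAccessKeyName", "SharedAccessKey")
    missing = [name for name in required if not parts.get(name)]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    if not parts["Endpoint"].startswith("sb://"):
        raise ValueError("Endpoint must start with sb://")
    return parts


parsed = parse_connection_string(
    "Endpoint=sb://myeventhubs.servicebus.windows.net/;"
    "SharedAccessKeyName=RootManageSharedAccessKey;"
    "SharedAccessKey=yourkey"
)
print(parsed["Endpoint"])
```

A string that lacks one of the three fields raises `ValueError`, which is usually easier to diagnose than an authentication failure later in the pipeline.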

Integration

The Integration section outlines default configurations. To configure Azure Event Hubs as a source in Observo AI, follow these steps to set up and test the data flow:

  1. Log in to Observo AI:

    • Navigate to Sources Tab

    • Click the “Add Sources” button and select “Create New”.

    • Choose “Azure Event Hubs” from the list of available sources to begin configuration.

  2. General Settings:

    • Name: A unique identifier for the source such as event-hubs-source-1.

    • Description (Optional): Description for the source

    • Event Hubs Namespace Endpoint: Your Event Hubs namespace endpoint in the format <namespace-name>.servicebus.windows.net:9093

      Example

      myeventhubs.servicebus.windows.net:9093

    • Consumer Group: The Event Hub consumer group to use. For the default consumer group ($Default), use $$Default to escape the $ character.

      Examples

      $$Default

      myconsumergroup

    • Event Hub Name (Add): The name of the event hub within the namespace.

      Examples

      myeventhub1

      myeventhub2

  3. SASL Authentication:

    • SASL Enabled (True): Must be enabled for Azure Event Hubs authentication.

    • SASL Mechanism: Must be set to PLAIN for Azure Event Hubs.

    • Connection String: Your Event Hubs connection string. Find this in the Azure Portal under your Event Hubs namespace's 'Shared access policies'. The connection string should follow this format:

      Example

      Endpoint=sb://myeventhubs.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=yourkey

    • SASL Username: Should be set to $$ConnectionString for Azure Event Hubs.

  4. TLS Configurations (Optional):

    • TLS Options Ca File: Absolute path to CA certificate file. The certificate must be in the DER or PEM (X.509) format. Additionally, the certificate can be provided as an inline string in PEM format.

      Example

      /etc/certs/ca.crt

    • TLS Options Crt File: Absolute path to a certificate file used to identify this server. The certificate must be in DER, PEM (X.509), or PKCS#12 format. Additionally, the certificate can be provided as an inline string in PEM format. If this is set and is not a PKCS#12 archive, TLS Key must also be set.

      Example

      /etc/certs/tls.crt

    • TLS Enabled (False): Whether to require TLS for incoming/outgoing connections. When enabled for incoming connections, a TLS certificate is also required. See TLS Options Crt File for more information.

    • TLS Verify Certificate (False): Enables certificate verification. Certificates must be valid in terms of not being expired, and being issued by a trusted issuer. This verification operates in a hierarchical manner, checking validity of the certificate, the issuer of that certificate and so on until reaching a root certificate. Relevant for both incoming and outgoing connections. Do NOT set this to false unless you understand the risks of not verifying the validity of certificates.

    • TLS Key: Absolute path to a private key file used to identify this server. The key must be in DER or PEM (PKCS#8) format. Additionally, the key can be provided as an inline string in PEM format.

      Example

      /etc/certs/tls.key

    • TLS Verify Hostname (False): Enables hostname verification. The hostname used to connect to the remote host must be present in the TLS certificate presented by the remote host, either as the Common Name or as an entry in the Subject Alternative Name extension. Only relevant for outgoing connections. It is NOT recommended to set this to false unless you understand the risks.

  5. Advanced Settings:

    • Librdkafka Options (Add): A librdkafka configuration option. Default: {"security.protocol": "sasl_ssl"}

    • Headers Key: Overrides the name of the log field used to add the headers to each event.

      Example

      headers

    • Key Field: Overrides the name of the log field used to add the message key to each event. The value will be the message key of the Kafka message itself.

      Example

      message_key

    • Topic Key: Overrides the name of the log field used to add the topic to each event. The value will be the topic from which the Kafka message was consumed.

      Example

      topic

    • Partition Key: Overrides the name of the log field used to add the partition to each event. The value will be the partition from which the Kafka message was consumed.

      Example

      partition

    • Offset Key: Overrides the name of the log field used to add the offset to each event. The value will be the offset of the Kafka message itself.

      Example

      offset

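    • Commit Interval in Milliseconds (5000): How often consumer offsets are committed.

    • Fetch Wait Max in Milliseconds (100): Maximum time the broker may wait to fill a fetch response.

    • Session Timeout in Milliseconds (10000): Consumer group session timeout.

    • Socket Timeout in Milliseconds (60000): Timeout for network requests. Defaults to 60 seconds.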

    • Auto Offset Reset: If offsets for consumer groups do not exist, set them using this strategy. Default: largest

      Examples

      smallest

      earliest

      beginning

      largest

      latest

      end

      error

  6. Parser Config:

    • Enable Source Log parser: (False)

    • Toggle Enable Source Log parser Switch to enable

      • Select appropriate Parser from the Source Log Parser dropdown

      • Add additional Parsers as needed

  7. Pattern Extractor:

  8. Archival Destination:

    • Toggle Enable Archival on Source Switch to enable

    • Under Archival Destination, select from the list of Archival Destinations (Required)

  9. Save and Test Configuration:

    • Save the configuration settings.

    • Verify that data is being ingested from the event hub.
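
Because the source connects over the Kafka protocol and exposes librdkafka options, the settings in steps 2, 3, and 5 correspond roughly to a Kafka consumer configuration. The sketch below shows one plausible mapping using standard librdkafka property names. How Observo AI maps its fields internally is not stated here, so treat the function names `unescape_dollars` and `build_consumer_config`, and the mapping itself, as assumptions for illustration.

```python
def unescape_dollars(value: str) -> str:
    """Turn the escaped "$$" entered in the form into a literal "$"."""
    return value.replace("$$", "$")


def build_consumer_config(
    endpoint: str,
    consumer_group: str,
    connection_string: str,
    auto_offset_reset: str = "largest",
) -> dict[str, object]:
    """Assemble a librdkafka-style consumer config (assumed mapping)."""
    return {
        "bootstrap.servers": endpoint,             # Event Hubs Namespace Endpoint
        "group.id": unescape_dollars(consumer_group),  # Consumer Group
        "security.protocol": "sasl_ssl",           # default Librdkafka Options
        "sasl.mechanism": "PLAIN",                 # SASL Mechanism
        "sasl.username": unescape_dollars("$$ConnectionString"),  # SASL Username
        "sasl.password": connection_string,        # Connection String
        "auto.commit.interval.ms": 5000,           # Commit Interval
        "fetch.wait.max.ms": 100,                  # Fetch Wait Max
        "session.timeout.ms": 10000,               # Session Timeout
        "socket.timeout.ms": 60000,                # Socket Timeout
        "auto.offset.reset": auto_offset_reset,    # Auto Offset Reset
    }


config = build_consumer_config(
    endpoint="myeventhubs.servicebus.windows.net:9093",
    consumer_group="$$Default",
    connection_string=(
        "Endpoint=sb://myeventhubs.servicebus.windows.net/;"
        "SharedAccessKeyName=RootManageSharedAccessKey;"
        "SharedAccessKey=yourkey"
    ),
)
print(config["group.id"], config["sasl.username"])
```

The event hub name plays the role of the Kafka topic and is therefore passed when subscribing rather than in this dictionary.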

Example Scenarios

GridNova, a fictitious utility enterprise, integrates Azure Event Hubs to stream real-time smart meter telemetry into Observo for grid monitoring and anomaly detection, supporting the deployment of millions of smart devices. The configuration uses the eastus region, SASL authentication, and TLS for secure ingestion of JSON-formatted data.

Standard Azure Event Hub Source Setup

Here is a standard Azure Event Hub Source configuration example. Only the required sections and their associated field updates are displayed in the tables below:

General Settings

| Field | Value | Notes |
| --- | --- | --- |
| Name | azure-event-hub-novagrid-2 | Unique identifier for the Azure Event Hubs source, indicating GridNova’s second smart meter telemetry ingestion. |
| Description | Ingest smart meter telemetry from Azure Event Hubs for GridNova grid monitoring | Optional, provides context for the source’s purpose. |
| Event Hubs Namespace Endpoint | novagrid-meterhub.servicebus.windows.net:9093 | Namespace endpoint for the Event Hubs, located in the eastus region. |
| Consumer Group | $$Default | Default consumer group, escaped as required for Azure Event Hubs. |
| Event Hub Name | smartmeter-telemetry | Name of the event hub within the namespace, containing smart meter data. |

SASL Authentication

| Field | Value | Notes |
| --- | --- | --- |
| SASL Enabled | True | Required for Azure Event Hubs authentication, enables SASL protocol. |
| SASL Mechanism | PLAIN | Required mechanism for Azure Event Hubs, ensuring compatibility. |
| Connection String | Endpoint=sb://novagrid-meterhub.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=xyz1234567890abcdefEXAMPLEKEY | SAS connection string from Azure Portal, providing read access to the event hub. |
| SASL Username | $$ConnectionString | Required setting for Azure Event Hubs, indicating use of the connection string. |

TLS Configurations

| Field | Value | Notes |
| --- | --- | --- |
| TLS Options Ca File | -----BEGIN CERTIFICATE----- MIID... (PEM format) | Inline PEM string for the CA certificate, ensuring secure communication with Event Hubs. |
| TLS Options Crt File | -----BEGIN CERTIFICATE----- MIIC... (PEM format) | Inline PEM string for the client certificate to identify Observo AI. |
| TLS Enabled | True | Enabled to require TLS for secure connections to the Event Hubs endpoint. |
| TLS Verify Certificate | True | Enables certificate verification to ensure the server’s certificate is valid and trusted. |
| TLS Key | -----BEGIN PRIVATE KEY----- MIIE... (PEM format) | Inline PEM string for the private key, securely stored. |
| TLS Verify Hostname | True | Ensures the hostname (novagrid-meterhub.servicebus.windows.net) matches the certificate. |

Advanced Settings

| Field | Value | Notes |
| --- | --- | --- |
| Librdkafka Options | {"security.protocol": "sasl_ssl"} | Default, ensures SASL over SSL for secure communication. |
| Headers Key | headers | Default, adds message headers to each event. |
| Key Field | message_key | Default, adds the Kafka message key to each event. |
| Topic Key | topic | Default, adds the topic name to each event. |
| Partition Key | partition | Default, adds the partition number to each event. |
| Offset Key | offset | Default, adds the message offset to each event. |
| Commit Interval in Milliseconds | 5000 | Default, commits offsets every 5 seconds to prevent data loss. |
| Fetch Wait Max in Milliseconds | 100 | Default, waits 100ms for batch fetching, optimizing throughput. |
| Session Timeout in Milliseconds | 10000 | Default, ensures stable consumer group sessions. |
| Socket Timeout in Milliseconds | 60000 | Default, 60-second timeout for socket operations, ensuring reliability. |
| Auto Offset Reset | earliest | Starts from the earliest offset if no consumer group offsets exist, ensuring no data loss for historical analysis. |
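
To make the Advanced Settings more concrete, the sketch below shows how the Headers Key, Key Field, Topic Key, Partition Key, and Offset Key names could end up as fields on an ingested event. It illustrates the idea only and is not Observo AI's implementation: the function `annotate_event`, the `message` field holding the raw payload, and the sample meter values are all hypothetical.

```python
def annotate_event(payload, *, key, headers, topic, partition, offset,
                   field_names=None):
    """Attach Kafka message metadata to an event under configurable names."""
    # Defaults mirror the example values shown for each setting.
    names = {
        "headers_key": "headers",
        "key_field": "message_key",
        "topic_key": "topic",
        "partition_key": "partition",
        "offset_key": "offset",
    }
    names.update(field_names or {})  # apply any overrides
    return {
        "message": payload,  # hypothetical field name for the raw payload
        names["headers_key"]: dict(headers),
        names["key_field"]: key,
        names["topic_key"]: topic,
        names["partition_key"]: partition,
        names["offset_key"]: offset,
    }


event = annotate_event(
    '{"meter_id": "m-001", "kwh": 1.25}',  # made-up telemetry record
    key="m-001",
    headers={"source": "meter-gateway"},
    topic="smartmeter-telemetry",
    partition=3,
    offset=1042,
)
print(sorted(event))
```

Passing `field_names={"topic_key": "event_hub"}` would store the topic under `event_hub` instead, which is the kind of override these settings describe.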

Troubleshooting

If issues arise with the Azure Event Hubs source in Observo AI, use the following steps to diagnose and resolve them:

  • Verify Configuration Settings:

    • Ensure all fields, such as Event Hub Name, Connection String, and Client ID, are correctly entered and match the Azure setup.

    • Confirm that the SASL mechanism and authentication method align with your Azure configuration.

  • Connection Failures:

    • Verify connection string format

    • Check network connectivity

    • Validate TLS settings if enabled

    • Ensure proper SASL configuration

  • Consumer Group Issues:

    • Check for consumer group conflicts

    • Verify consumer group permissions

  • Check Authentication:

    • For Basic Authentication, verify that the connection string is valid and includes the correct SharedAccessKeyName and SharedAccessKey.

    • For OAuth, ensure the Client ID, Tenant ID, and Scope are correct, and the application has the “Azure Event Hubs Data Receiver” role.

    • Verify connection string is valid and not expired

    • Check SASL username is set to $$ConnectionString

    • Validate shared access policy permissions

  • Monitor Logs:

    • Check Observo AI’s Logs tab for errors or warnings related to data ingestion.

    • Use Azure’s diagnostic tools to verify that events are being sent to the event hub (Azure Monitor for Event Hubs).

  • Validate Connectivity:

    • Ensure no HTTP proxies are interfering, as Azure Event Hubs uses a TCP-based binary protocol.

    • Verify that firewall rules allow traffic on port 9093.

  • Common Error Messages:

    • “Not authorized to access topics: [Topic authorization failed]”: Indicates the username or connection string lacks read permissions. Verify the SAS policy or Entra ID role assignments.

    • Connectivity Issues: Check network settings and ensure Observo AI can reach the Event Hubs endpoint.

    • Data Not Ingested: Confirm that the event hub is receiving data and that the Consumer Group setting matches an existing consumer group.

  • Test Data Flow:

    • Capture real-time events and verify ingestion.

    • Use the Analytics tab in the targeted Observo AI pipeline to monitor data volume and ensure expected throughput.

| Issue | Possible Cause | Resolution |
| --- | --- | --- |
| Data not ingested | Incorrect Event Hubs Namespace Endpoint or Event Hub Name | Verify configuration settings |
| Authorization errors | Invalid connection string or permissions | Check SAS policy or Entra ID roles |
| Connectivity issues | Firewall or proxy blocking TCP traffic | Allow port 9093, remove proxies |
| “Topic authorization failed” | Missing read permissions | Update SAS or Entra ID permissions |
| Slow data transfer | Consumer group misconfiguration | Verify Consumer Group and partition settings |
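
For the connectivity checks above, a quick way to rule out firewall problems is to attempt a plain TCP connection to the namespace endpoint on port 9093 from the host where Observo AI runs. The sketch below uses only the Python standard library. The function name `can_reach` is hypothetical, and a successful connection shows only that the port is reachable, not that authentication will succeed.

```python
import socket


def can_reach(host: str, port: int = 9093, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host name and connects directly,
        # without going through any HTTP proxy settings.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

For example, `can_reach("myeventhubs.servicebus.windows.net")` returning False from the Observo AI host points to a DNS, firewall, or routing problem rather than a SASL or permissions issue.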

Resources

For additional guidance and detailed information, refer to the following resources:
