Azure Blob Storage Receiver
A source that reads logs from Azure Blob Storage containers, triggered either by Azure Event Hub notifications when new blobs are created or by scheduled polling via directory traversal of the blob containers.
When the mode is set to Traversal, the receiver reads and processes files stored in Azure Blob Storage containers. It handles two types of files: regular files, which are processed in full whenever they change, and append files (such as logs), which are read incrementally as new content is added. Both gzip-compressed and plain log files are supported.
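The incremental handling of append files can be illustrated with a short sketch. This is pure Python with hypothetical names and a local file-like object standing in for an Azure append blob; it shows only the offset-tracking idea, not the actual receiver implementation:

```python
import io

class AppendFileTracker:
    """Track a per-file byte offset so only newly appended content is read.

    Illustrative stand-in for the receiver's append-file handling; a local
    BytesIO replaces a real Azure append blob.
    """
    def __init__(self):
        self.offsets = {}  # file name -> last byte offset already read

    def read_new_content(self, name, fileobj):
        offset = self.offsets.get(name, 0)
        fileobj.seek(offset)
        new_data = fileobj.read()
        self.offsets[name] = offset + len(new_data)
        return new_data

# Usage: the second read returns only the newly appended line.
log = io.BytesIO(b"line1\n")
tracker = AppendFileTracker()
first = tracker.read_new_content("app.log", log)   # b"line1\n"
log.seek(0, io.SEEK_END)
log.write(b"line2\n")                              # simulate an append
second = tracker.read_new_content("app.log", log)  # b"line2\n"
```

Regular files, by contrast, would reset the offset and be reprocessed in full whenever their content changes.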
Purpose
The Observo AI Azure Blob Storage Receiver source enables the ingestion of data from Azure Blob Storage containers into the Observo AI platform for processing and analysis. It supports various file formats, such as JSON or CSV, allowing organizations to leverage stored data for observability and analytics. This integration facilitates efficient data retrieval from Azure’s scalable storage service for monitoring and insights.
Prerequisites
Before configuring the Azure Blob Storage Receiver source in Observo AI, ensure the following requirements are met to facilitate seamless data ingestion:
Observo AI Platform Setup:
The Observo AI Site must be installed and operational.
Verify that the platform can process expected file formats, such as JSON, CSV, or Parquet, if applicable.
Azure Storage Account:
An active Azure subscription with a storage account and at least one container created (Create a Storage Account).
The storage account must be accessible to Observo AI, either publicly or via private endpoints or firewall rules.
Required permissions:
The account or application used by Observo AI must have read access to the container, typically via a Shared Access Signature (SAS) token, Storage Account Key, or Microsoft Entra ID role with “Storage Blob Data Reader” permissions (Azure Blob Storage Access Control).
Authentication:
Prepare one of the following authentication methods:
Storage Account Key: Use the access key from the storage account’s “Access keys” section.
Shared Access Signature (SAS): Generate a SAS token with read permissions for the container (Create SAS Token).
OAuth (Microsoft Entra ID): Use a Client ID, Tenant ID, Client Secret, and Scope (e.g., https://storage.azure.com/.default) for authentication.
Network and Connectivity:
Ensure Observo AI can communicate with Azure Blob Storage endpoints, typically over HTTPS (port 443).
If using private endpoints or firewall rules, configure them to allow access from Observo AI (Azure Private Link for Blob Storage).
If a proxy is used, ensure it supports HTTPS traffic to Azure endpoints.
Summary of prerequisites:

| Requirement | Summary | Details |
| --- | --- | --- |
| Observo AI Platform | Must support Azure Blob Storage | Verify file format compatibility |
| Azure Storage Account | Active account with container | Create via Azure portal |
| Permissions | Read access to container | Use SAS, Storage Account Key, or Entra ID |
| Authentication | Key, SAS, or OAuth | Prepare credentials accordingly |
| Network | HTTPS connectivity | Allow port 443, configure private endpoints if needed |
Integration
To configure the Azure Blob Storage Receiver as a source in Observo AI, follow these steps to set up and test the data flow:
Log in to Observo AI:
Click the “Add Sources” button and select “Create New”.
Choose “Azure Blob Storage Receiver” from the list of available sources to begin configuration.
General Settings:
Name: A unique identifier for the source, such as blob-storage-source-1.
Description: Optional description for the source.
Mode (Optional): Determines whether to use direct Azure Blob directory traversal or rely on Event Hub for blob notifications. Default: Use Event-Hub for Blob Notifications.
| Options | Description |
| --- | --- |
| EventHub | Use EventHub for Blob Notifications |
| Traversal | Use Directory Traversal for Blobs |
Event Hub Endpoint: The Azure Event Hub endpoint that triggers on the Blob Create event. The receiver subscribes to the events published by Azure Blob Storage and delivered through Azure Event Hub. When it receives a Blob Create event, it reads the logs or traces from the corresponding blob and deletes the blob after processing. Required only when the mode is EventHub. See the Trigger Azure Event Hub on Blob Creation section for further details.
Authentication Method: The authentication method to use when connecting to Azure Blob Storage. Default: Connection String
| Options | Description |
| --- | --- |
| Connection String (Default) | Use a connection string for authentication. |
| Service Principal (must be selected) | Use a service principal for authentication. |
Connection String (Default): The connection string to use when connecting to Azure Blob Storage.
Example: DefaultEndpointsProtocol=https;AccountName=accountName;AccountKey=+idLkHYcL0MUWIKYHm2j4Q==;EndpointSuffix=core.windows.net
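The connection string is a semicolon-separated list of key=value pairs. A minimal pure-Python sketch of how such a string can be parsed and checked before use (the account key below is the placeholder from the example above, not a real credential):

```python
def parse_connection_string(conn_str):
    """Split an Azure-style connection string into a dict.

    Values such as AccountKey may themselves contain '=', so each
    pair is split only on the first '='.
    """
    parts = {}
    for pair in conn_str.strip().split(";"):
        if not pair:
            continue
        key, _, value = pair.partition("=")
        parts[key] = value
    return parts

# Placeholder string from this document.
conn = ("DefaultEndpointsProtocol=https;AccountName=accountName;"
        "AccountKey=+idLkHYcL0MUWIKYHm2j4Q==;EndpointSuffix=core.windows.net")
parsed = parse_connection_string(conn)

# A quick sanity check that the required fields are present.
missing = {"AccountName", "AccountKey"} - parsed.keys()
print(parsed["AccountName"], missing)  # accountName set()
```

Splitting only on the first `=` matters because base64-encoded account keys routinely end in `=` padding.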
Service Principal (If selected):
Tenant ID: The tenant ID of the service principal to use when connecting to Azure Blob Storage. Example: ${tenant_id}
Client ID: The client ID of the service principal to use when connecting to Azure Blob Storage. Example: ${client_id}
Client Secret: The Client Secret of the service principal to use when connecting to Azure Blob Storage. Example: ${env:CLIENT_SECRET}
Storage Account URL: The URL of the storage account to use when connecting to Azure Blob Storage.
Azure Cloud: Defines which Azure Cloud to use when using the service_principal authentication method. Options: Azure Cloud or Azure US Government.
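The Azure Cloud option effectively selects the endpoint suffix used in the Storage Account URL. A small sketch of that relationship (pure Python; the suffixes are the documented blob endpoints for the Azure public and US Government clouds, and the function name is illustrative):

```python
# Blob endpoint suffix per Azure cloud.
BLOB_SUFFIXES = {
    "AzureCloud": "blob.core.windows.net",
    "AzureUSGovernment": "blob.core.usgovcloudapi.net",
}

def storage_account_url(account_name, cloud="AzureCloud"):
    """Build the storage account URL for the given Azure cloud."""
    return f"https://{account_name}.{BLOB_SUFFIXES[cloud]}"

url = storage_account_url("techcorpstorage")
gov_url = storage_account_url("techcorpstorage", "AzureUSGovernment")
print(url)      # https://techcorpstorage.blob.core.windows.net
print(gov_url)  # https://techcorpstorage.blob.core.usgovcloudapi.net
```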
Advanced Settings:
Logs Container Name (Optional): Name of the blob container with the logs.
Max Event Size (Optional): Maximum size of a single event, expressed as a human-readable size (e.g., 1MB). Default: 1MB
(Internal Queue) Batch size in events: Batch size of the intermediate queue. Default: 100
(Internal Queue) Max batch size in events: Maximum batch size of the intermediate queue. Default: 200
(Internal Queue) Flush timeout: Flush timeout for the intermediate queue. Default: 1s
Processing Settings: Define JavaScript expressions for field extraction or select a Pipeline/Pack for data transformation.
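The internal-queue settings interact as follows: events accumulate until the batch size is reached or the flush timeout elapses, whichever comes first. A simplified pure-Python sketch of that behavior (hypothetical class name; not the receiver's actual implementation):

```python
import time

class BatchingQueue:
    """Flush when the batch size is reached or the flush timeout elapses."""
    def __init__(self, batch_size=100, flush_timeout=1.0):
        self.batch_size = batch_size
        self.flush_timeout = flush_timeout
        self.events = []
        self.last_flush = time.monotonic()
        self.flushed = []  # batches handed downstream

    def add(self, event):
        self.events.append(event)
        if len(self.events) >= self.batch_size:  # size-based flush
            self.flush()

    def tick(self):
        """Called periodically; flushes if the timeout has elapsed."""
        if self.events and time.monotonic() - self.last_flush >= self.flush_timeout:
            self.flush()

    def flush(self):
        if self.events:
            self.flushed.append(self.events)
            self.events = []
        self.last_flush = time.monotonic()

q = BatchingQueue(batch_size=3, flush_timeout=0.05)
for i in range(3):
    q.add(i)          # third add triggers a size-based flush
q.add(99)
time.sleep(0.06)
q.tick()              # timeout-based flush of the remaining event
print(q.flushed)      # [[0, 1, 2], [99]]
```

Lowering the flush timeout reduces ingestion latency at the cost of smaller, more frequent batches.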
Parser Config:
Enable Source Log Parser (Default: disabled):
Toggle the Enable Source Log Parser switch to enable it.
Select the appropriate parser from the Source Log Parser dropdown.
Add additional parsers as needed.
Pattern Extractor:
See Pattern Extractor for details.
Archival Destination:
Toggle the Enable Archival on Source switch to enable archival.
Under Archival Destination, select from the list of archival destinations (required).
Save and Test Configuration:
Save the configuration settings.
Use Observo AI’s testing tools to verify that data is being ingested from the specified container.
Example Scenarios
TechCorp, a fictitious technology enterprise, wants to integrate their Azure Blob Storage container, which stores JSON log files, into the Observo platform for monitoring and analytics. They have an Azure storage account named "techcorpstorage" with a container called "logs-container" that holds the log files. They use a Connection String for authentication and rely on an Azure Event Hub to trigger blob notifications. The configuration will be set up to handle events up to 1MB, with specific batch sizes and flush timeouts for processing.
Standard Azure Blob Storage Source Setup
Here is a standard Azure Blob Storage Source configuration example. Only the required sections and their associated field updates are displayed in the table below:
General Settings

| Setting | Value | Description |
| --- | --- | --- |
| Name | blob-storage-techcorp-logs | Unique identifier for the source, e.g., indicating TechCorp’s log ingestion. |
| Description | Ingest JSON logs from TechCorp’s Azure Blob Storage for monitoring | Optional, provides context for the source’s purpose. |
| Storage Account Name | techcorpstorage | The name of the Azure storage account containing the logs. |
| Mode | Use Event-Hub for Blob Notifications | Default mode, relying on Event Hub to trigger on Blob Create events. |
| Event Hub Endpoint | Endpoint=sb://techcorpeventhub.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=abc123xyz== | Azure Event Hub endpoint configured to trigger on Blob Create events. |
| Authentication Method | Connection String | Default method for connecting to Azure Blob Storage. |
| Connection String | DefaultEndpointsProtocol=https;AccountName=techcorpstorage;AccountKey=+xyzLkHYcL0MUWIKYHm2j4Q==;EndpointSuffix=core.windows.net | The connection string for authenticating with the Azure storage account. |

Advanced Settings

| Setting | Value | Description |
| --- | --- | --- |
| Logs Container Name | logs-container | Name of the blob container storing the JSON log files. |
| Max Event Size | 1MB | Buffer size for events, set to the default of 1MB as specified. |
| (Internal Queue) Batch size in events | 100 | Default batch size for intermediate queues. |
| (Internal Queue) Max batch size in events | 200 | Maximum batch size for intermediate queues, set to default. |

Processing Settings

| Setting | Value | Description |
| --- | --- | --- |
| Processing Settings | Default JavaScript expressions | Define JavaScript expressions for field extraction or select a Pipeline/Pack; default used here. |
| (Internal Queue) Flush timeout | 1s | Default flush timeout for the intermediate queue. |
Troubleshooting
If issues arise with the Azure Blob Storage Receiver source in Observo AI, use the following steps to diagnose and resolve them:
Verify Configuration Settings:
Ensure all fields (e.g., Storage Account Name, Container, Blob Path, Authentication) are correctly entered and match the Azure setup.
Confirm that the container exists and contains blobs matching the Filename Filter.
Check Authentication:
For Storage Account Key, verify the key is valid and not rotated.
For SAS Token, ensure it has read permissions and is not expired.
For OAuth, confirm the Client ID, Tenant ID, Client Secret, and Scope are correct, and the application has “Storage Blob Data Reader” permissions.
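One of the checks above, SAS token expiry, is easy to automate: a SAS token embeds its expiry in the `se` parameter (ISO 8601, UTC). A pure-Python sketch of that check (the token below is a made-up example with a past expiry, not a real signature):

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs

def sas_expired(sas_token, now=None):
    """Return True if the SAS token's 'se' (signed expiry) is in the past."""
    params = parse_qs(sas_token.lstrip("?"))
    expiry = datetime.fromisoformat(params["se"][0].replace("Z", "+00:00"))
    now = now or datetime.now(timezone.utc)
    return now >= expiry

# Hypothetical token fragment with an already-past expiry.
token = "sv=2022-11-02&se=2024-01-01T00:00:00Z&sp=r&sig=placeholder"
print(sas_expired(token))  # True (expiry is in the past)
```

If the token has expired, regenerate it in the Azure portal with read permissions on the container and update the source configuration.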
Monitor Logs:
Check Observo AI’s Logs tab for errors or warnings related to data ingestion.
Use Azure’s Storage Analytics or Azure Monitor to verify blob access and activity (Azure Storage Analytics).
Validate Connectivity:
Ensure Observo AI can reach Azure Blob Storage endpoints over HTTPS (port 443).
If using private endpoints or firewall rules, verify they are correctly configured (Azure Private Link).
Common Error Messages:
“Authorization failure”: Indicates invalid credentials or insufficient permissions. Verify the Storage Account Key, SAS Token, or Entra ID role assignments.
“Container not found”: Check the container name and ensure it exists in the storage account.
“No data ingested”: Confirm blobs exist in the container and match the Filename Filter. Check the Polling Interval for delays.
Test Data Flow:
Capture real-time events and verify ingestion.
Use the Analytics tab in the targeted Observo AI pipeline to monitor data volume and ensure expected throughput.
| Issue | Cause | Resolution |
| --- | --- | --- |
| Data not ingested | Incorrect container or filter | Verify container name and Filename Filter |
| Authorization errors | Invalid or expired credentials | Check Storage Account Key, SAS, or OAuth settings |
| Connectivity issues | Firewall or private endpoint issues | Allow HTTPS on port 443, verify endpoints |
| “Container not found” | Incorrect container name | Confirm container exists in storage account |
| “Authorization failure” | Missing permissions | Update permissions or regenerate credentials |
Trigger Azure Event Hub on Blob Creation
Azure Event Hubs is a scalable event ingestion service that can process millions of events per second. It is commonly used for real-time analytics, event-driven architectures, and data streaming. One powerful use case is triggering an Event Hub endpoint when a new blob is created in Azure Blob Storage. This setup enables real-time processing of data as soon as it is uploaded to storage.
Prerequisites
Before walking through the steps to configure Azure Event Hub to trigger on a Blob Create event, ensure you have the following:
Azure Subscription: You need an active Azure subscription. Create an Azure account if you don’t have one.
Azure Blob Storage Account: A storage account with a container to store blobs. Create a Blob Storage account.
Azure Event Hub: An Event Hub namespace and Event Hub instance. Create an Event Hub.
Azure Event Grid: Used to route Blob Create events to the Event Hub. Learn more about Event Grid.
Step 1: Create an Azure Event Hub
Go to the Azure Portal.
Navigate to Event Hubs and click + Create.
Provide a Namespace Name, select a Pricing Tier, and choose a Resource Group.
Click Review + Create, then Create.
Once the namespace is created, go to the namespace and create an Event Hub instance.
Reference: Create an Event Hub Namespace
Step 2: Enable Event Grid on Azure Blob Storage
Go to your Azure Blob Storage Account in the Azure Portal.
Navigate to Events under the Settings section.
Click + Event Subscription to create a new subscription.
Reference: Enable Event Grid on Blob Storage
Step 3: Configure the Event Subscription
Event Subscription Details:
Provide a Name for the subscription.
For Event Schema, select Event Grid Schema.
Topic Details:
Set System Topic Name to a meaningful name (e.g., BlobCreateTopic).
Event Types:
Select Blob Created as the event type. You can deselect other event types if not needed.
Endpoint Details:
Set Endpoint Type to Event Hub.
Select the Event Hub namespace and instance you created earlier.
Click Create to finalize the event subscription.
Reference: Create an Event Grid Subscription
Step 4: Verify the Configuration
Upload a file to the Azure Blob Storage container.
Go to the Event Hub in the Azure Portal.
Use the Metrics or Live Events feature to verify that the Blob Create event is being routed to the Event Hub.
Reference: Monitor Event Hubs
Step 5: Obtain the Event Hub Endpoint String
To configure the OTEL receiver, you’ll need the Event Hub connection string. Here’s how to obtain it:
Go to your Event Hub Namespace in the Azure Portal.
Navigate to Shared Access Policies under the Settings section.
Click on the policy (e.g., RootManageSharedAccessKey) or create a new one.
Copy the Connection String from the policy.
The connection string will look like this:
Endpoint=sb://oteldata.servicebus.windows.net/;SharedAccessKeyName=otelhubbpollicy;SharedAccessKey=mPJVubIK5dJ6mLfZo1ucsdkLysLSQ6N7kddvsIcmoEs=;EntityPath=otellhub
Reference: Get Event Hub Connection String
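The endpoint string bundles the namespace endpoint, policy name, key, and entity path. A pure-Python sketch of extracting those fields, e.g. to find the hostname that firewalls must allow (the shared access key below is a placeholder):

```python
def parse_event_hub_endpoint(endpoint_str):
    """Split an Event Hub connection string into its named parts.

    Split only on the first '=' of each pair, since base64 keys
    can end in '=' padding.
    """
    fields = {}
    for pair in endpoint_str.split(";"):
        key, _, value = pair.partition("=")
        fields[key] = value
    return fields

ep = ("Endpoint=sb://oteldata.servicebus.windows.net/;"
      "SharedAccessKeyName=otelhubbpollicy;"
      "SharedAccessKey=PLACEHOLDER_KEY=;EntityPath=otellhub")
fields = parse_event_hub_endpoint(ep)

# Hostname the network path must allow (AMQP/HTTPS to the namespace).
host = fields["Endpoint"].removeprefix("sb://").rstrip("/")
print(host, fields["EntityPath"])  # oteldata.servicebus.windows.net otellhub
```

`EntityPath` names the Event Hub instance; if it is absent from the copied string, the instance must be configured separately in the receiver.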