AWS S3
AWS S3 is a scalable object storage solution for unstructured data, such as text, binary data, logs, and media files. It is commonly used for data archiving, backup, and analytics. This document outlines the parameters required for configuring AWS S3 as a destination for event storage.
Purpose
Observo AI’s AWS S3 destination is designed to create a cost-effective, high-performance data lake for security and observability data. It can store enriched and normalized telemetry in such formats as Parquet, making it analytics-ready for querying and long-term use. This destination supports compliance needs and integrates with tools using the Open Cybersecurity Schema Framework (OCSF). It also enables advanced features like natural language search and dynamic routing to simplify security operations.
Prerequisites
Before configuring the AWS S3 destination in Observo AI, ensure the following requirements are met to facilitate seamless data export:
Observo AI Platform Setup:
The Observo AI platform must be installed and operational, with support for AWS S3 as a data destination.
If exporting data in Parquet format (.parquet, .parq, .pqt), verify that the platform supports this format, potentially requiring specific configurations.
AWS Account and Permissions:
An active AWS account with access to the target S3 buckets is required.
Required IAM permissions for S3:
s3:PutObject
s3:ListBucket
s3:GetBucketLocation (to determine the bucket's region)
If using server-side encryption, additional permissions such as kms:GenerateDataKey and kms:Decrypt may be required for AWS KMS keys.
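The permissions above can be expressed as an IAM policy. A minimal sketch follows; the bucket name, account ID, and KMS key ID are placeholders to replace with your own values, and the KMS statement is only needed when SSE-KMS is used:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ObservoS3Write",
      "Effect": "Allow",
      "Action": ["s3:PutObject"],
      "Resource": "arn:aws:s3:::my-bucket/*"
    },
    {
      "Sid": "ObservoS3Bucket",
      "Effect": "Allow",
      "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
      "Resource": "arn:aws:s3:::my-bucket"
    },
    {
      "Sid": "ObservoKmsOptional",
      "Effect": "Allow",
      "Action": ["kms:GenerateDataKey", "kms:Decrypt"],
      "Resource": "arn:aws:kms:us-east-1:123456789012:key/your-key-id"
    }
  ]
}
```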
Authentication:
Prepare one of the following authentication methods:
Auto Authentication: Use IAM roles, shared credentials, environment variables, or a JSON credentials file.
Manual Authentication: Provide an AWS access key and secret key.
Secret Authentication: Use a stored secret within Observo AI's secure storage.
Network and Connectivity:
Ensure Observo AI can communicate with AWS S3 services. If using VPC endpoints for S3, verify their configuration.
Check for proxy settings or firewall rules that may affect connectivity to AWS endpoints.
Quick reference:
Observo AI Platform: must be installed and support S3 destinations; verify Parquet support if needed.
AWS Account: active account with S3 access; ensure the bucket exists and is accessible.
IAM Permissions: required for S3 operations; include KMS permissions if using encryption.
Authentication: Auto, Manual, or Secret; prepare credentials accordingly.
Network: connectivity to AWS services; check VPC endpoints and proxies.
Integration
To configure AWS S3 as a destination in Observo AI, follow these steps to set up and test the data flow:
Log in to Observo AI:
Navigate to the Destinations tab.
Click the Add Destination button and select Create New.
Choose AWS S3 from the list of available destinations to begin configuration.
General Settings:
Name: Provide a unique identifier for the destination, e.g., s3-destination-1.
Description (Optional): Add a description for the destination.
Bucket: Enter the name of the target S3 bucket.
Example: my-bucket
Region: Specify the AWS region of the S3 bucket.
Example: us-east-1
Key Prefix: Prefix to apply to all object keys. Useful for partitioning objects. Must end in / to act as a directory path. Default: %Y/%m/%d
Examples:
date=%F/hour=%H
year=%Y/month=%m/day=%d
application_id={{ application_id }}/date=%F
%Y/%m/%d
date=%F
Filename Append UUID to Timestamp (True): Whether to append a UUID v4 token to the end of the object key's timestamp portion. This ensures object-key uniqueness in high-throughput use cases.
Example: for object key `date=2022-07-18/1658176486`, setting this field to `true` results in an object key like `date=2022-07-18/1658176486-30f6652c-71da-4f9f-800d-a1189c47c547`.
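The Key Prefix, epoch-seconds timestamp, and UUID suffix combine into a single object key. A minimal sketch of that assembly; `build_object_key` is an illustrative helper, not part of the platform:

```python
import time
import uuid
from datetime import datetime, timezone


def build_object_key(prefix_format, append_uuid, now=None):
    """Sketch: strftime-expanded key prefix, followed by the epoch-seconds
    timestamp, optionally followed by a UUID v4 suffix for uniqueness."""
    ts = time.time() if now is None else now
    dt = datetime.fromtimestamp(ts, tz=timezone.utc)
    key = dt.strftime(prefix_format) + str(int(ts))
    if append_uuid:
        key += "-" + str(uuid.uuid4())  # UUID v4 token appended to timestamp
    return key


# e.g. date=2022-07-18/1658176486-30f6652c-71da-4f9f-800d-a1189c47c547
print(build_object_key("date=%F/", append_uuid=True, now=1658176486))
```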
ACL: Canned ACL to apply to the created objects. Select one of the options:
Authenticated Users Read access: read access granted to authenticated AWS users
EC2 readable: allows Amazon EC2 instances to read objects
FULL_CONTROL for object and bucket owner: grants full control to both object and bucket owners
Read only for bucket owner: only the bucket owner can read objects
Logs Writeable Bucket: allows write access for S3 log delivery
Bucket/Object Owner All Access: grants full access to object and bucket owner
AllUsers readable: public read access for everyone on the internet
AllUsers Read Write: public read/write access for everyone globally
Encoding:
Encoding Codec: The codec to use for encoding events. Default: JSON Encoding.
All codecs share two sub-options:
Encoding Metric Tag Values (Select): Controls how metric tag values are encoded. Options: tag values exposed as single strings (default), or tags exposed as arrays of strings. Note: when set to single, only the last non-bare value of tags is displayed with the metric; when set to full, all metric tags are exposed as separate assignments.
Encoding Timestamp Format (Select): RFC3339 format or UNIX format.
Codec-specific sub-options:
JSON Encoding
Pretty JSON (False): Format JSON with indentation and line breaks for better readability.
logfmt Encoding
No codec-specific sub-options.
Apache Avro Encoding
Avro Schema: Specify the Apache Avro schema definition for serializing events. Example: { "type": "record", "name": "log", "fields": [{ "name": "message", "type": "string" }] }
Newline Delimited JSON Encoding
No codec-specific sub-options.
No encoding
No codec-specific sub-options.
Plain text encoding
No codec-specific sub-options.
Parquet
Include Raw Log (False): Capture the complete log message as an additional field (observo_record) apart from the given schema. When enabled, the Parquet file contains a field named "observo_record" in addition to the Parquet schema.
Parquet Schema: Enter the Parquet schema for encoding. Example:
message root {
  optional binary stream;
  optional binary time;
  optional group kubernetes {
    optional binary pod_name;
    optional binary pod_id;
    optional binary docker_id;
    optional binary container_hash;
    optional binary container_image;
    optional group labels {
      optional binary pod-template-hash;
    }
  }
}
Common Event Format (CEF)
CEF Device Event Class ID: A unique identifier for categorizing the type of event (maximum 1023 characters). Example: login-failure
CEF Device Product: The product name that generated the event (maximum 63 characters). Example: Log Analyzer
CEF Device Vendor: The vendor name that produced the event (maximum 63 characters). Example: Observo
CEF Device Version: The version of the product that generated the event (maximum 31 characters). Example: 1.0.0
CEF Extensions (Add): Custom key-value pairs for additional event data fields in CEF format.
CEF Name: A human-readable description of the event (maximum 512 characters). Example: cef.name
CEF Severity: The importance of the event, from 0 (lowest) to 10 (highest). Example: 5
CEF Version (Select): The CEF specification version to use for formatting: 0.1 or 1.x.
CSV Format
CSV Fields (Add): The field names to include as columns in the CSV output, in order. Examples: timestamp, host, message
CSV Buffer Capacity (Optional): The internal buffer size (in bytes) used when writing CSV data. Example: 8192
CSV Delimiter (Optional): The character that separates fields in the CSV output. Example: ,
Enable Double Quote Escapes (True): When enabled, quotes in field data are escaped by doubling them. When disabled, an escape character is used instead.
CSV Escape Character (Optional): The character used to escape quotes when double-quote escaping is disabled.
CSV Quote Character (Optional): The character used for quoting fields in the CSV output. Example: "
CSV Quoting Style (Optional): Controls when field values are wrapped in quote characters. Options: always quote all fields, quote only when necessary, never use quotes, quote all non-numeric fields.
Protocol Buffers
Protobuf Message Type: The fully qualified message type name for Protobuf serialization. Example: package.Message
Protobuf Descriptor File: The path to the compiled Protobuf descriptor file (.desc). Example: /path/to/descriptor.desc
Graylog Extended Log Format (GELF)
No codec-specific sub-options.
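To make the JSON-based codec options concrete, a small sketch of newline-delimited JSON versus the Pretty JSON sub-option; the event fields are invented for illustration:

```python
import json

# Two sample events (fields invented for illustration)
events = [
    {"timestamp": "2022-07-18T20:34:46Z", "host": "web-1", "message": "login ok"},
    {"timestamp": "2022-07-18T20:34:47Z", "host": "web-2", "message": "login failed"},
]

# Newline Delimited JSON: one compact JSON object per line
ndjson = "\n".join(json.dumps(e, separators=(",", ":")) for e in events)

# Pretty JSON sub-option: indentation and line breaks for readability
pretty = json.dumps(events[0], indent=2)

print(ndjson)
```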
Request Configuration (Optional):
Request Concurrency: Configuration for outbound request concurrency. Default: Adaptive concurrency.
Options:
Adaptive concurrency: adjusts parallelism based on system load
A fixed concurrency of 1: processes one task at a time only
Request Rate Limit Duration Secs: The time window used for the rate_limit_num option. Default: 1.
Request Rate Limit Num: The maximum number of requests allowed within the rate_limit_duration_secs time window.
Request Retry Attempts: The maximum number of retries to make for failed requests. The default represents an infinite number of retries. Default: Unlimited.
Request Retry Initial Backoff Secs: The amount of time to wait in seconds before attempting the first retry for a failed request. After the first retry has failed, the Fibonacci sequence is used to select future backoffs. Default: 1.
Request Retry Max Duration Secs: The maximum amount of time to wait between retries. Default: 3600.
Request Timeout Secs: The time a request waits before being aborted. It is recommended that this value is not lowered below the service’s internal timeout, as this could create orphaned requests, and duplicate data downstream. Default: 60.
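The retry settings above can be sketched as a Fibonacci backoff schedule capped at the maximum duration; `backoff_schedule` is an illustrative helper, not platform code:

```python
def backoff_schedule(initial=1, max_duration=3600, attempts=10):
    """Fibonacci backoff sketch: the first wait is the initial backoff and
    each later wait is the sum of the previous two, capped at max_duration."""
    waits = []
    a, b = initial, initial
    for _ in range(attempts):
        waits.append(min(a, max_duration))
        a, b = b, a + b
    return waits


print(backoff_schedule(initial=1, max_duration=3600, attempts=8))
# [1, 1, 2, 3, 5, 8, 13, 21]
```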
TLS Configuration (Optional):
TLS CA: Provide the CA certificate in PEM format.
TLS CRT: Provide the client certificate in PEM format.
TLS Key: Provide the private key in PEM format.
Verify Certificate (False): Enables certificate verification. Certificates must be valid in terms of not being expired, and being issued by a trusted issuer. This verification operates in a hierarchical manner, checking validity of the certificate, the issuer of that certificate and so on until reaching a root certificate. Relevant for both incoming and outgoing connections. Do NOT set this to false unless you understand the risks of not verifying the validity of certificates.
Verify Hostname: Enables hostname verification. If enabled, the hostname used to connect to the remote host must be present in the TLS certificate presented by the remote host, either as the Common Name or as an entry in the Subject Alternative Name extension. Only relevant for outgoing connections. Do NOT set this to false unless you understand the risks of not verifying the remote hostname.
Batching Requirements (Default):
Batch Max Bytes: The maximum size of a batch before it is flushed. Default: 100000000
Batch Max Events: The maximum number of events in a batch before it is flushed. Default: 1000
Batch Timeout Secs: The maximum age of a batch before it is flushed. Default: 300
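The three batching thresholds interact as "whichever triggers first". A sketch of that logic; `Batcher` is an illustrative stand-in for the platform's internal batching, not its actual implementation:

```python
import time


class Batcher:
    """Sketch: a batch flushes when it reaches max_bytes, max_events,
    or timeout_secs of age, whichever happens first."""

    def __init__(self, max_bytes=100_000_000, max_events=1000, timeout_secs=300):
        self.max_bytes, self.max_events, self.timeout_secs = max_bytes, max_events, timeout_secs
        self.events, self.size, self.started = [], 0, None

    def add(self, event):
        """Add an encoded event; return True if the batch should flush now."""
        if self.started is None:
            self.started = time.monotonic()
        self.events.append(event)
        self.size += len(event)
        return (self.size >= self.max_bytes
                or len(self.events) >= self.max_events
                or time.monotonic() - self.started >= self.timeout_secs)


b = Batcher(max_bytes=1_000, max_events=3, timeout_secs=300)
print(b.add(b"x" * 100))  # False: under every threshold
print(b.add(b"x" * 100))  # False
print(b.add(b"x" * 100))  # True: hit max_events
```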
Acknowledgments:
Acknowledgements Enabled (False): Whether end-to-end acknowledgements are enabled. When enabled, any source connected to this sink that supports end-to-end acknowledgements will wait for events to be acknowledged by the sink before acknowledging them at the source.
Framing (Default):
Framing Character Delimited Delimiter: The ASCII (7-bit) character that delimits byte sequences. Default: Empty
Framing Method: The framing method. Default: Newline Delimited. Options:
Raw Event data (not delimited): no framing is applied; best when each event is self-contained.
Single Character Delimited: each event is separated by a specific single character (ASCII value).
Prefixed with Byte Length: each event is prefixed with its byte length, ensuring precise separation between events.
Newline Delimited: each event is followed by a newline character (\n), which is commonly used for logging formats.
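The framing methods can be sketched as follows. The 4-byte big-endian length prefix is an assumption for illustration, not the platform's documented wire format:

```python
import struct


def frame_newline(events):
    # Newline Delimited: each event is followed by a newline byte.
    return b"".join(e + b"\n" for e in events)


def frame_length_prefixed(events):
    # Prefixed with Byte Length: each event is preceded by its length.
    # A 4-byte big-endian prefix is assumed here for illustration.
    return b"".join(struct.pack(">I", len(e)) + e for e in events)


events = [b'{"msg":"a"}', b'{"msg":"bb"}']
print(frame_newline(events))
```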
Authentication (Optional):
Auth Access Key Id: Enter the AWS access key ID.
Example: AKIAIOSFODNN7EXAMPLE
Auth Secret Access Key: Enter the AWS secret access key.
Example: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
Auth Assume Role: Enter the ARN of an IAM role to assume.
Example: arn:aws:iam::123456789098:role/my_role
Auth Region: Enter the AWS region to send STS requests to. Defaults to the configured region for the service itself.
Example: us-east-1
Auth Load Timeout Secs: Timeout in seconds for successfully loading any credentials. Relevant when the default credentials chain or assume_role is used.
Example: 30
Auth Imds Connect Timeout Seconds (Optional): Connect timeout for IMDS. Default: Empty
Auth Imds Max Attempts: Enter the number of IMDS retries for fetching tokens and metadata. Default: None
Auth Imds Read Timeout Seconds: Read timeout for IMDS. Default: None
Buffering Configuration (Optional):
Buffer Type: Specifies the buffering mechanism for event delivery. Options:
Memory: high-performance, in-memory buffering.
Max Events: The maximum number of events allowed in the buffer. Default: 500
When Full: Event handling behavior when the buffer is full. Default: Block
Block: wait for free space in the buffer. This applies backpressure up the topology, signalling that sources should slow down the acceptance/consumption of events. No data is lost, but data piles up at the edge.
Drop Newest: drop the incoming event instead of waiting for free space. The event is intentionally dropped. Typically used when performance is the highest priority and temporarily losing events is preferable to slowing the acceptance/consumption of events.
Disk: lower-performance, less costly, on-disk buffering.
Max Bytes Size: The maximum buffer size in bytes. Must be at least 268435488.
When Full: same Block / Drop Newest behavior as described for the memory buffer.
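The two When Full behaviors can be sketched as follows. `Buffer` is an illustrative model, not the platform's implementation; real blocking applies backpressure upstream rather than returning a flag:

```python
from collections import deque


class Buffer:
    """Sketch of a bounded event buffer with the two When Full behaviors:
    'block' rejects the add so the caller waits and retries (backpressure),
    'drop_newest' silently discards the incoming event."""

    def __init__(self, max_events=500, when_full="block"):
        self.q = deque()
        self.max_events, self.when_full = max_events, when_full
        self.dropped = 0

    def push(self, event):
        if len(self.q) < self.max_events:
            self.q.append(event)
            return True
        if self.when_full == "drop_newest":
            self.dropped += 1  # event intentionally dropped
            return False
        return False  # 'block': caller must wait for free space and retry


buf = Buffer(max_events=2, when_full="drop_newest")
for e in range(4):
    buf.push(e)
print(len(buf.q), buf.dropped)  # 2 2
```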
Advanced Settings (Optional):
Endpoint: Custom endpoint for use with AWS-compatible services.
Example: http://127.0.0.0:5000/path/to/service
Compression: Compression algorithm to use for the request body. Default: Gzip compression. Options:
Gzip compression: DEFLATE compression with headers, for file storage
No compression: data stored and transmitted in original form
Zlib compression: DEFLATE format with minimal wrapper and checksums
Filename Extension: The filename extension to use in the object key. This overrides setting the extension based on the configured compression.
Example: json
Filename Time Format: The timestamp format for the time component of the object key. By default, object keys are appended with a timestamp (in epoch seconds) reflecting when the objects are sent to S3. The resulting object key is the key prefix followed by the formatted timestamp, e.g. date=2022-07-18/1658176486. Supports strftime specifiers. Default: %s
Example: %s
Content Encoding: Overrides what content encoding has been applied to the object. Directly comparable to the Content-Encoding HTTP header. If not specified, the compression scheme used dictates this value.
Example: gzip
Content Type: Overrides the MIME type of the object. Directly comparable to the Content-Type HTTP header. If not specified, the compression scheme used dictates this value. When compression is set to none, the value text/x-log is used.
Example: application/gzip
Grant Full Control: Grants READ, READ_ACP, and WRITE_ACP permissions on the created objects to the named grantee. This allows the grantee to read the created objects and their metadata, as well as read and modify the ACL on the created objects.
Examples:
79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be
http://acs.amazonaws.com/groups/global/AllUsers
Grant Read: Grants READ permissions on the created objects to the named grantee. This allows the grantee to read the created objects and their metadata.
Examples:
79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be
http://acs.amazonaws.com/groups/global/AllUsers
Grant Read Acp: Grants READ_ACP permissions on the created objects to the named grantee. This allows the grantee to read the ACL on the created objects.
Examples:
79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be
http://acs.amazonaws.com/groups/global/AllUsers
Grant Write Acp: Grants WRITE_ACP permissions on the created objects to the named grantee. This allows the grantee to modify the ACL on the created objects.
Examples:
79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be
http://acs.amazonaws.com/groups/global/AllUsers
Server Side Encryption: The server-side encryption algorithm used when storing these objects. Options:
AES-256 Encryption (SSE-S3)
AES-256 Encryption managed by AWS KMS (SSE-KMS / SSE-C)
Ssekms Key Id: Specifies the ID of the AWS Key Management Service (AWS KMS) symmetric customer master key (CMK) used for the created objects. Only applies when server_side_encryption is configured to use KMS. If not specified, Amazon S3 uses the AWS managed CMK to protect the data.
Example: abcd1234
Storage Class: The S3 storage class for the created objects. Default: Standard Redundancy. Options:
Glacier Deep Archive: lowest-cost, long-term archival storage
Glacier Flexible Retrieval: low-cost archive with flexible access speeds
Intelligent Tiering: automatically moves data between cost tiers
Infrequently Accessed (Single Availability Zone): low-cost storage in one availability zone
Reduced Redundancy: lower durability and lower cost, for easily reproduced data
Standard Redundancy: high-durability, multi-zone general-purpose storage
Infrequently Accessed: low-cost, high-durability storage for less frequently accessed data
Tags: A list of tag key-value pairs. (Add key-value pairs as needed)
Save and Test Configuration:
Save the configuration settings.
Send sample data to the S3 bucket and verify that it is stored correctly.
Example Scenarios
HealthCarePlus, a fictitious healthcare enterprise, manages a network of hospitals and telehealth services, generating extensive patient data, compliance logs, and audit trails in JSON and Parquet formats. To support regulatory compliance and long-term data analysis, HealthCarePlus exports this telemetry to an Amazon S3 bucket named healthcareplus-data-archive using the Observo AI platform. The bucket resides in the AWS region us-east-1, and an IAM role with the necessary permissions secures data writes. The configuration below follows the required fields from the Integration section above, enabling HealthCarePlus to centralize data for compliance and analytics.
Standard AWS S3 Destination Setup
Here is a standard AWS S3 Destination configuration example. Only the required sections and their associated field updates are displayed in the table below:
General Settings
Name: healthcareplus-s3-archive (unique identifier for the S3 destination)
Description: Export patient and compliance data to S3 for HealthCarePlus (optional description)
Bucket: healthcareplus-data-archive (name of the target S3 bucket)
Region: us-east-1 (AWS region of the S3 bucket)
Key Prefix: year=%Y/month=%m/day=%d/ (partitions objects by year, month, and day)
ACL: Bucket/Object Owner All Access (grants full access to bucket and object owner)
Encoding
Encoding Codec: Parquet (encodes events in Parquet format for structured data)
Parquet Include Raw Log: True (includes the complete log as the observo_record field)
Parquet Schema: message root { optional binary patient_id; optional binary timestamp; optional binary event_type; optional binary diagnosis_code; } (Parquet schema for patient data)
Encoding Metric Tag Values: Single (exposes metric tag values as single strings)
Encoding Timestamp Format: RFC3339 (formats timestamps in RFC3339 format)
Request Configuration
Request Concurrency: Adaptive concurrency (adjusts parallelism based on system load)
Request Rate Limit Duration Secs: 1 (time window for rate limiting)
Request Rate Limit Num: 1000 (maximum requests within the time window)
Request Retry Attempts: 3 (maximum retries for failed requests)
Request Retry Initial Backoff Secs: 1 (initial wait time before the first retry)
Request Retry Max Duration Secs: 3600 (maximum wait time between retries)
Request Timeout Secs: 60 (time before aborting a request)
TLS Configuration
TLS CA: /opt/observo/certs/ca.crt (CA certificate for server verification)
TLS CRT: /opt/observo/certs/healthcareplus.crt (client certificate for authentication)
TLS Key: /opt/observo/certs/healthcareplus.key (private key for authentication)
Verify Certificate: True (enables certificate verification)
Verify Hostname: True (verifies the hostname in the TLS certificate)
Batching Configuration
Batch Max Bytes: 100000000 (maximum batch size, 100 MB, before flushing)
Batch Max Events: 1000 (maximum number of events in a batch)
Batch Timeout Secs: 300 (maximum age of a batch before flushing)
Acknowledgements
Acknowledgements Enabled: True (enables end-to-end acknowledgements for data delivery)
Framing
Framing Character Delimited Delimiter: Empty (not used, as newline-delimited framing is selected)
Framing Method: Newline Delimited (frames events with newline characters)
Authentication
Auth Access Key Id: AKIAHEALTHCAREPLUS123 (AWS access key ID for authentication)
Auth Secret Access Key: wJalrXUtnHEALTHCAREPLUSKEY (AWS secret access key for authentication)
Auth Assume Role: arn:aws:iam::123456789012:role/healthcareplus-s3-role (IAM role ARN for S3 access)
Auth Region: us-east-1 (AWS region for STS requests)
Auth Load Timeout Secs: 30 (timeout for loading credentials)
Auth Imds Connect Timeout Seconds: 5 (connect timeout for IMDS)
Auth Imds Max Attempts: 3 (number of IMDS retries for fetching tokens)
Auth Imds Read Timeout Seconds: 5 (read timeout for IMDS)
Buffering Configuration
Buffer Type: Disk (uses disk-based buffering for reliability)
Max Bytes Size: 268435488 (maximum buffer size, 256 MB)
When Full: Block (applies backpressure when the buffer is full)
Advanced Settings
Endpoint: None (uses standard AWS S3 endpoints)
Compression: Gzip compression (applies Gzip compression to the request body)
Filename Extension: .parquet (Parquet extension for object keys)
Filename Time Format: %s (timestamps in seconds since the Unix epoch)
Content Encoding: gzip (Gzip content encoding)
Content Type: application/x-parquet (Parquet MIME type)
Grant Full Control: None (no additional grantees for full control)
Grant Read: None (no additional grantees for read access)
Grant Read Acp: None (no additional grantees for read ACL access)
Grant Write Acp: None (no additional grantees for write ACL access)
Server Side Encryption: AES-256 Encryption (SSE-S3) (S3-managed AES-256 encryption)
Ssekms Key Id: None (not used, as SSE-S3 is selected)
Storage Class: Standard Redundancy (high-durability, multi-zone storage)
Tags: environment=production, data_type=healthcare (key-value pairs for object metadata)
Additional Configuration
Save and Test: Save the configuration and send sample patient data to the healthcareplus-data-archive bucket.
Verify data presence in the S3 bucket using the Observo AI Analytics tab to confirm successful data flow.
Outcome
With this configuration, HealthCarePlus successfully exports patient data, compliance logs, and audit trails to its S3 bucket via Observo AI, enabling secure, long-term storage for regulatory compliance and advanced analytics, thereby enhancing operational efficiency and data governance.
Troubleshooting
If issues arise with the AWS S3 destination in Observo AI, use the following steps to diagnose and resolve them:
Verify Configuration Settings:
Ensure all fields, such as Bucket Name, Region, and Authentication, are correctly entered and match the AWS setup.
Confirm that the S3 bucket exists and is accessible in the specified region.
Check Authentication:
For Auto Authentication, verify that IAM roles, shared credentials, or environment variables are correctly configured.
For Manual Authentication, ensure the access key and secret key are valid.
For Secret Authentication, confirm the secret is accessible in Observo AI.
Validate Permissions:
Ensure the credentials have the required permissions:
s3:PutObject, s3:ListBucket, s3:GetBucketLocation.
If using KMS encryption, verify kms:GenerateDataKey and kms:Decrypt permissions.
Check that the IAM role (if used) is correctly assumed.
Network and Connectivity:
Check for firewall rules, VPC endpoint configurations, or proxy settings that may block access to AWS S3 services.
Test connectivity using the AWS CLI with similar proxy configurations to verify access to S3.
Common Error Messages:
"Access Denied": Indicates insufficient permissions. Verify IAM permissions for the bucket and KMS keys (if used).
"Bucket does not exist": Check the bucket name and region. Ensure there are no certificate validation issues.
"Inaccessible host": May indicate TLS version mismatches or DNS issues. Ensure the host supports the required TLS version and check DNS settings.
Monitor Data:
Verify that data is being written to the S3 bucket by checking the bucket contents.
Use the Observo AI Analytics tab to monitor data volume and ensure expected throughput.
Quick reference:
Data not written: likely an incorrect bucket name or region. Verify the bucket name and region.
Authentication errors: invalid credentials or role. Check the authentication method and permissions.
Connectivity issues: firewall or proxy blocking access. Test network connectivity and VPC endpoints.
"Access Denied": insufficient permissions. Verify IAM permissions for S3 and KMS.
"Bucket does not exist": incorrect bucket name or certificate issues. Check the bucket name and certificate settings.
"Inaccessible host": TLS or DNS issues. Ensure TLS compatibility and check DNS.
Resources
For additional guidance and detailed information, refer to the following resources:
AWS Documentation:
Amazon S3 Bucket Configuration: Guide to configuring S3 buckets.
Amazon S3 Permissions: Details on S3 authentication and permissions.
AWS KMS Encryption: Information on configuring server-side encryption with KMS.
Best Practices:
Refer to general best practices for integrating S3 with data streaming platforms, such as optimizing bucket organization, enabling versioning, and using lifecycle policies for cost optimization.