GCP Cloud Storage
The GCP Cloud Storage configuration allows you to write events into Google Cloud Storage buckets. This destination supports compression, batching, TLS, and customizable metadata for objects. Below are the detailed configuration parameters to set up a GCP Cloud Storage destination.
Purpose
The Observo AI GCP Cloud Storage destination enables users to send telemetry data, including logs, metrics, and traces, to Google Cloud Storage for scalable, cost-effective storage and further analysis. This destination supports flexible data formats and integrates seamlessly with Google Cloud's ecosystem, allowing organizations to centralize telemetry data for observability, compliance, and analytics purposes.
Prerequisites
Before configuring the GCP Cloud Storage destination in Observo AI, ensure the following requirements are met:
Google Cloud Project:
A Google Cloud project must be created and linked to your GCP Cloud Storage instance. It’s recommended to use a dedicated project for isolation, but an existing project can be used if permissions are correctly configured (Create a Google Cloud Project).
The Cloud Storage API must be enabled in the project (Enable Cloud Storage API).
Configure Essential Contacts for notifications to receive updates from Google Cloud (Manage Notification Contacts).
Authentication:
Set up authentication using a service account with the "Storage Object Admin" role to allow Observo AI to write to GCP Cloud Storage buckets (Service Accounts).
Obtain a service account JSON key file for authentication (Creating and Managing Service Account Keys).
Optionally, configure Google Cloud Identity or a third-party Identity Provider (IdP) for enhanced security (Configure Cloud Identity, Configure Third-Party IdP).
GCP Cloud Storage Bucket:
Ensure an active GCP Cloud Storage bucket is available for data storage. The bucket must be accessible and properly configured for write operations (Creating Storage Buckets).
Verify the bucket’s region aligns with your performance and compliance requirements.
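Before wiring the service account into Observo AI, it can help to sanity-check the downloaded JSON key file. The sketch below is illustrative only (the helper name and required-key set are assumptions, not part of the Observo AI product); it verifies the file has the shape of a Google service account key using only the standard library.

```python
# Sketch: sanity-check a downloaded service account key file before
# configuring the destination. Helper name and key set are illustrative.
import json

REQUIRED_KEYS = {"type", "project_id", "private_key", "client_email"}

def check_service_account_key(path):
    """Return the client email if the file looks like a valid
    service account key; otherwise raise ValueError."""
    with open(path) as f:
        key = json.load(f)
    missing = REQUIRED_KEYS - key.keys()
    if missing or key.get("type") != "service_account":
        raise ValueError(f"not a service account key; missing: {sorted(missing)}")
    return key["client_email"]
```

A check like this catches the common mistake of pointing Credentials Path at an OAuth client secret or an API key file instead of a service account key.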
Integration
To configure GCP Cloud Storage as a destination in Observo AI, follow these steps:
Log in to Observo AI:
Navigate to the Destinations tab.
Click the Add Destinations button and select Create New.
Choose GCP Cloud Storage from the list of available destinations to begin configuration.
General Settings:
Name: Add a unique identifier such as gcp-cloud-storage-1.
Description (Optional): Provide a description for the destination.
Bucket: The GCS bucket name.
Example: my-bucket
Api Key (Optional): An API key. Either an API key or a path to a service account credentials JSON file can be specified. If both are unset, the GOOGLE_APPLICATION_CREDENTIALS environment variable is checked for a filename. If no filename is found, an attempt is made to fetch the instance service account of the compute instance the program is running on. Outside of a GCE instance, you must supply either an API key or a service account credentials JSON file.
Credentials Path: Path to a service account credentials JSON file. Either an API key or a path to a service account credentials JSON file can be specified. If both are unset, the GOOGLE_APPLICATION_CREDENTIALS environment variable is checked for a filename. If no filename is found, an attempt is made to fetch the instance service account of the compute instance the program is running on. Outside of a GCE instance, you must supply either an API key or a service account credentials JSON file.
Example: /my/path/credentials.json
Compression (Optional): Compression configuration. All compression algorithms use the default compression level unless otherwise specified. Default: No compression
Options:
- Gzip compression: Widely used DEFLATE-based compression format
- No compression: No compression applied to data
- Zlib compression: DEFLATE-based, lightweight compression library
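Both supported algorithms are DEFLATE-based and differ only in framing, so they compress batched events similarly. The sketch below (illustrative, using the standard library rather than anything Observo-specific) shows the round-trip behavior of each option on a repetitive event batch:

```python
# Sketch: effect of the two supported compression options on a batch of
# newline-delimited JSON events, at the default compression level.
import gzip
import json
import zlib

events = [{"message": "login-failure", "host": "web-1"}] * 100
payload = "\n".join(json.dumps(e) for e in events).encode()

gzipped = gzip.compress(payload)  # DEFLATE with gzip framing
zlibbed = zlib.compress(payload)  # DEFLATE with zlib framing

# Both round-trip losslessly and shrink the repetitive payload.
assert gzip.decompress(gzipped) == payload
assert zlib.decompress(zlibbed) == payload
assert len(gzipped) < len(payload)
```

Gzip is usually the safer default here because downstream tools (BigQuery, gsutil, most log processors) recognize the .gz framing directly.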
Acl (Optional): The Predefined ACL to apply to created objects. For more information, see Predefined ACLs. Default: Bucket/object private to project
Options:
- Bucket/object can be read by authenticated users: Any authenticated GCP user can read the object
- Object and bucket owner granted OWNER permission: The bucket owner and the object owner have full control (owner access) over the object
- Object is private to bucket owner: Only the bucket owner can access the object
- Bucket/object are private: Both the bucket and object are private to the owner
- Bucket/object private to project: Access is restricted to the project and its members
- Bucket/object can be read publicly: Anyone can access the object without authentication
Filename Append UUID to Timestamp (False): Whether or not to append a UUID v4 token to the end of the object key’s timestamp portion. This ensures unique object names in high-performance use cases.
ExampleFor object key `date=2022-07-18/1658176486`, setting this field to `true` would result in an object key that looked like `date=2022-07-18/1658176486-30f6652c-71da-4f9f-800d-a1189c47c547`.
Filename Time Format: The timestamp format for the time component of the object key. By default, object keys are appended with a timestamp reflecting when the objects are sent to Cloud Storage, so the resulting object key is the key prefix joined with the formatted timestamp, for example date=2022-07-18/1658176486. That key corresponds to a key_prefix of date=%F/, the timestamp Mon Jul 18 2022 20:34:46 GMT+0000, and a filename_time_format of %s, which renders timestamps as seconds since the Unix epoch. Supports the common strftime specifiers found in most languages. When set to an empty string, no timestamp is appended to the key prefix. Default: %s
Example: %s
Key Prefix: Prefix to apply to all object keys. Useful for partitioning objects. Must end in / to act as a directory path. Default: year=%Y/month=%m/day=%d/
Examples:
date=%F/hour=%H/
year=%Y/month=%m/day=%d/
application_id={{ application_id }}/date=%F/
%Y/%m/%d/
date=%F/
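The three filename settings above combine into a single object key. The sketch below is an illustration of that composition, not the destination's actual implementation; the function name and parameters are hypothetical.

```python
# Sketch: how Key Prefix (strftime-expanded), Filename Time Format
# (%s by default), and the optional UUID v4 suffix combine into the
# final object key. Names here are illustrative only.
import uuid
from datetime import datetime, timezone

def object_key(when, key_prefix="year=%Y/month=%m/day=%d/",
               time_format="%s", append_uuid=False):
    prefix = when.strftime(key_prefix)
    # %s (seconds since the Unix epoch) is handled explicitly because
    # strftime support for it is platform-dependent.
    if time_format == "%s":
        timestamp = str(int(when.timestamp()))
    else:
        timestamp = when.strftime(time_format)
    suffix = f"-{uuid.uuid4()}" if append_uuid else ""
    return f"{prefix}{timestamp}{suffix}"

when = datetime(2022, 7, 18, 20, 34, 46, tzinfo=timezone.utc)
print(object_key(when, key_prefix="date=%Y-%m-%d/"))
# date=2022-07-18/1658176486  (date=%Y-%m-%d/ is equivalent to date=%F/)
```

With append_uuid enabled, a UUID v4 is appended after the timestamp, matching the `date=2022-07-18/1658176486-30f6652c-...` example above.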
Storage Class (Optional): The storage class for created objects. For more information, see the storage classes documentation. Default: Standard
Options:
- Archive: Cheapest; for data that is rarely accessed (long-term storage)
- Coldline: Low-cost storage for infrequently accessed data, available within milliseconds
- Nearline: Suitable for data that is accessed less than once a month
- Standard: For frequently accessed data, offering low latency and high availability
Acknowledgement (False):
Acknowledgements Enabled (False): Whether or not end-to-end acknowledgements are enabled. When enabled, any connected source that supports end-to-end acknowledgements will wait for events to be acknowledged by the destination before acknowledging them at the source.
Encoding:
Encoding Codec: The codec to use for encoding events. Default: JSON Encoding.
Options (every codec also accepts the common sub-options listed at the end):

JSON Encoding
- Pretty JSON (False): Format JSON with indentation and line breaks for better readability.

logfmt Encoding
- No codec-specific sub-options.

Apache Avro Encoding
- Avro Schema: Specify the Apache Avro schema definition for serializing events. Example: { "type": "record", "name": "log", "fields": [{ "name": "message", "type": "string" }] }

Newline Delimited JSON Encoding
- No codec-specific sub-options.

No encoding
- No codec-specific sub-options.

Plain text encoding
- No codec-specific sub-options.

Parquet
- Include Raw Log (False): Capture the complete log message in an additional field (observo_record) alongside the given schema; in addition to the Parquet schema's fields, the Parquet file will contain a field named "observo_record".
- Parquet Schema: Enter the Parquet schema for encoding. Example: message root { optional binary stream; optional binary time; optional group kubernetes { optional binary pod_name; optional binary pod_id; optional binary docker_id; optional binary container_hash; optional binary container_image; optional group labels { optional binary pod-template-hash; } } }

Common Event Format (CEF)
- CEF Device Event Class ID: A unique identifier for categorizing the type of event (maximum 1023 characters). Example: login-failure
- CEF Device Product: The product name that generated the event (maximum 63 characters). Example: Log Analyzer
- CEF Device Vendor: The vendor name that produced the event (maximum 63 characters). Example: Observo
- CEF Device Version: The version of the product that generated the event (maximum 31 characters). Example: 1.0.0
- CEF Extensions (Add): Custom key-value pairs for additional event data fields in CEF format.
- CEF Name: A human-readable description of the event (maximum 512 characters). Example: cef.name
- CEF Severity: The importance of the event, from 0 (lowest) to 10 (highest). Example: 5
- CEF Version (Select): The version of the CEF specification to use for formatting: CEF specification version 0.1 or CEF specification version 1.x.

CSV Format
- CSV Fields (Add): The field names to include as columns in the CSV output, in order. Examples: timestamp, host, message
- CSV Buffer Capacity (Optional): The internal buffer size (in bytes) used when writing CSV data. Example: 8192
- CSV Delimiter (Optional): The character that separates fields in the CSV output. Example: ,
- Enable Double Quote Escapes (True): When enabled, quotes in field data are escaped by doubling them. When disabled, an escape character is used instead.
- CSV Escape Character (Optional): The character used to escape quotes when Enable Double Quote Escapes is disabled.
- CSV Quote Character (Optional): The character used for quoting fields in the CSV output. Example: "
- CSV Quoting Style (Optional): Controls when field values are wrapped in quote characters: always quote all fields, quote only when necessary, never use quotes, or quote all non-numeric fields.

Protocol Buffers
- Protobuf Message Type: The fully qualified message type name for Protobuf serialization. Example: package.Message
- Protobuf Descriptor File: The path to the compiled protobuf descriptor file (.desc). Example: /path/to/descriptor.desc

Graylog Extended Log Format (GELF)
- No codec-specific sub-options.

Common sub-options (available for all codecs):
- Encoding Avro Schema (Optional): The Avro schema. Example: { "type": "record", "name": "log", "fields": [{ "name": "message", "type": "string" }] }
- Encoding Metric Tag Values (Select): Controls how metric tag values are encoded: tag values exposed as single strings (default), or tags exposed as arrays of strings. When set to single, only the last non-bare value of tags is displayed with the metric. When set to full, all metric tags are exposed as separate assignments.
- Encoding Timestamp Format (Select): RFC3339 format or UNIX format.
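To make the most common choice concrete, the sketch below (standard library only, not Observo-specific code) shows what the Newline Delimited JSON codec produces for a small batch of events:

```python
# Sketch: the output shape of Newline Delimited JSON (NDJSON) encoding,
# one compact JSON object per line.
import json

events = [
    {"message": "login-failure", "severity": 5},
    {"message": "login-success", "severity": 1},
]

ndjson = "\n".join(json.dumps(e, separators=(",", ":")) for e in events)
print(ndjson)
# {"message":"login-failure","severity":5}
# {"message":"login-success","severity":1}

# Decoding is line-by-line, so a partially written object stays parseable
# up to the last complete line.
decoded = [json.loads(line) for line in ndjson.splitlines()]
assert decoded == events
```

This line-per-event shape is what tools such as BigQuery's newline-delimited JSON loader expect, which is why NDJSON pairs naturally with the Newline Delimited framing default described below.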
Request Configuration (Optional):
Request Concurrency: Configuration for outbound request concurrency. Default: Adaptive concurrency.
Options:
- Adaptive concurrency: Adjusts parallelism based on system load
- A fixed concurrency of 1: Processes one task at a time
Request Rate Limit Duration Secs: The time window used for the rate_limit_num option. Default: 1.
Request Rate Limit Num: The maximum number of requests allowed within the rate_limit_duration_secs time window. Default: Unlimited.
Request Retry Attempts: The maximum number of retries to make for failed requests. The default represents an infinite number of retries. Default: Unlimited.
Request Retry Initial Backoff Secs: The amount of time to wait in seconds before attempting the first retry for a failed request. After the first retry has failed, the Fibonacci sequence is used to select future backoffs. Default: 1.
Request Retry Max Duration Secs: The maximum amount of time to wait between retries. Default: 3600.
Request Timeout Secs: The time a request waits before being aborted. It is recommended that this value is not lowered below the service’s internal timeout, as this could create orphaned requests, and duplicate data downstream. Default: 60.
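The retry settings above can be pictured as a wait schedule. The sketch below is an illustration of Fibonacci backoff under the stated defaults; it assumes the sequence starts from the configured initial backoff and is capped by Request Retry Max Duration Secs (the exact seeding in the product is not documented here).

```python
# Sketch: a Fibonacci retry backoff schedule, seeded from the initial
# backoff and capped at the maximum retry duration. Illustrative only.
def backoff_schedule(attempts, initial=1.0, max_duration=3600.0):
    """Yield the wait (in seconds) before each retry attempt."""
    a, b = initial, initial
    for _ in range(attempts):
        yield min(a, max_duration)
        a, b = b, a + b

print(list(backoff_schedule(6)))
# [1.0, 1.0, 2.0, 3.0, 5.0, 8.0]
```

The cap matters: without it, waits would exceed an hour after roughly 19 retries, which is why Request Retry Max Duration Secs defaults to 3600.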
Batching Requirements (Default):
Batch Timeout Secs: The maximum age of a batch before it is flushed. Default: 1
Batch Max Bytes: The maximum size of a batch that will be processed by a sink. This is based on the uncompressed size of the batched events, before they are serialized / compressed. Default: Empty
Batch Max Events: The maximum size of a batch before it is flushed. Default: Empty
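The three batching limits interact as "flush on whichever trips first". The sketch below is purely illustrative of that interaction (class and method names are hypothetical, not the destination's implementation):

```python
# Sketch: a batch is flushed when ANY of timeout, max bytes, or max
# events is reached; unset byte/event limits simply never trip.
import time

class Batcher:
    def __init__(self, timeout_secs=1.0, max_bytes=None, max_events=None):
        self.timeout_secs = timeout_secs
        self.max_bytes = max_bytes      # uncompressed, pre-serialization size
        self.max_events = max_events
        self.events, self.size, self.opened = [], 0, None

    def add(self, encoded_event):
        if self.opened is None:
            self.opened = time.monotonic()  # batch age starts at first event
        self.events.append(encoded_event)
        self.size += len(encoded_event)

    def should_flush(self):
        if not self.events:
            return False
        return ((time.monotonic() - self.opened) >= self.timeout_secs
                or (self.max_bytes is not None and self.size >= self.max_bytes)
                or (self.max_events is not None and len(self.events) >= self.max_events))
```

With the default one-second timeout and no size limits, small objects are written frequently; raising Batch Max Bytes or Batch Max Events trades object-creation overhead for latency.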
Framing (Default):
Framing Method: The framing method. Default: Newline Delimited
Options:
- Raw Event data (not delimited): No framing is applied. This method is best when each event is self-contained.
- Single Character Delimited: Each event is separated by a specific single character (ASCII value)
- Prefixed with Byte Length: Each event is prefixed with its byte length, ensuring precise separation between events
- Newline Delimited: Each event is followed by a newline character (\n), which is commonly used for logging formats.
Framing Character Delimited Delimiter: The ASCII (7-bit) character that delimits byte sequences. Default: (Empty)
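The practical difference between the two most distinct framing methods is whether event payloads may contain the delimiter. The sketch below is illustrative only; in particular, the 4-byte big-endian length prefix is an assumption for demonstration, not the destination's documented wire format.

```python
# Sketch: newline-delimited vs. byte-length-prefixed framing for the
# same two encoded events. The 4-byte big-endian prefix is assumed.
import struct

events = [b'{"message":"a"}', b'{"message":"b"}']

# Newline Delimited (the default): events joined by \n.
newline_framed = b"\n".join(events)

# Prefixed with Byte Length: each event preceded by its length, so
# payloads may safely contain newlines.
length_framed = b"".join(struct.pack(">I", len(e)) + e for e in events)

def read_length_framed(buf):
    out, i = [], 0
    while i < len(buf):
        (n,) = struct.unpack_from(">I", buf, i)
        out.append(buf[i + 4:i + 4 + n])
        i += 4 + n
    return out

assert read_length_framed(length_framed) == events
```

Newline-delimited framing is the right default for line-oriented consumers; length-prefixed framing is preferable when events can legitimately contain newlines (stack traces, multi-line messages).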
TLS Configuration (Optional):
TLS CA: Provides the CA (Certificate Authority) certificate in PEM format. This certificate is used to verify the authenticity of the server being connected to during a TLS handshake. If not provided, the system will use the default CA certificates available on the host machine.
TLS CRT: The TLS certificate (in PEM format) used to authenticate the client with the GCS endpoint. This is part of the mutual TLS (mTLS) configuration if you are using client authentication.
TLS Key Pass: The passphrase used to unlock the encrypted private key corresponding to the TLS certificate (TLS CRT). The key and certificate are used together to authenticate the client when establishing a secure connection.
Examples:
${KEY_PASS_ENV_VAR}
PassWord1
TLS Verify Certificate (False): Enables certificate verification. Certificates must be valid: not expired and issued by a trusted issuer. Verification operates hierarchically, checking the certificate, then its issuer, and so on until a root certificate is reached. Relevant for both incoming and outgoing connections. Do NOT set this to false unless you understand the risks of not verifying the validity of certificates.
TLS Verify Hostname (False): Enables hostname verification. The hostname used to connect to the remote host must be present in the TLS certificate presented by the remote host, either as the Common Name or as an entry in the Subject Alternative Name extension. Only relevant for outgoing connections. Setting this to false is NOT recommended unless you understand the risks.
Advanced Settings (Optional):
Filename Extension: The filename extension to use in the object key. If not specified, the extension is determined by the compression scheme used. For example, with Gzip compression you may set this to .gz, or with Parquet encoding to .parquet. The extension helps identify the format of the files stored in the GCS bucket. Default: None
Metadata (Add as needed): A key/value pair. Allows you to specify additional metadata for each object stored in GCS. Metadata is key-value pairs that can store useful information, such as the source of the data or the encoding format used. This metadata is included with the object and can be queried or used for auditing and monitoring purposes.
Save and Test Configuration:
Save the configuration settings.
Test the connection to verify that Observo AI can successfully write data to the specified GCP Cloud Storage bucket.
Example Scenarios
FinSecure, a financial services enterprise, manages a vast portfolio of transactional data, compliance logs, and fraud detection metrics generated by its trading platforms and customer banking systems. To ensure regulatory compliance and enable advanced analytics, FinSecure wants to send this telemetry data, stored in JSON and Parquet formats, to a Google Cloud Storage (GCS) bucket named finsecure-telemetry-data using the Observo AI platform. The bucket is configured within a dedicated Google Cloud project, finsecure-project-2025, with a service account assigned the "Storage Object Admin" role for secure write operations. The configuration below follows the required fields from the Integration section, enabling FinSecure to centralize data for observability, compliance, and fraud analysis.
Standard GCP Cloud Storage Destination Setup
Here is a standard GCP Cloud Storage destination configuration example. Only the required sections and their associated field values are shown below:

General Settings
- Name: finsecure-gcs-telemetry (unique identifier for the GCS destination)
- Description: Store transactional and compliance telemetry in GCS for FinSecure
- Bucket: finsecure-telemetry-data (GCS bucket name for data storage)
- Api Key: None (service account credentials are used instead)
- Credentials Path: /opt/observo/credentials/finsecure-service-account.json (path to the service account JSON key file)
- Compression: Gzip (applies Gzip compression to stored objects)
- Acl: Bucket/object private to project (restricts access to the project and its members)
- Filename Append UUID to Timestamp: True (appends a UUID to the timestamp for unique object keys)
- Filename Time Format: %s (timestamps in seconds since the Unix epoch)
- Key Prefix: year=%Y/month=%m/day=%d/ (partitions objects by year, month, and day)
- Storage Class: Nearline (suitable for infrequently accessed compliance data)

Acknowledgement
- Acknowledgements Enabled: True (enables end-to-end acknowledgements for data delivery)

Encoding
- Encoding Codec: Parquet (encodes events in Parquet format for structured data)
- Parquet Include Raw Log: True (includes the complete log as the observo_record field)
- Parquet Schema: message root { optional binary transaction_id; optional binary timestamp; optional binary account_id; optional binary amount; optional binary fraud_score; } (Parquet schema for transactional data)
- Encoding Avro Schema: { "type": "record", "name": "log", "fields": [{ "name": "message", "type": "string" }, { "name": "transaction_id", "type": "string" }, { "name": "timestamp", "type": "string" }, { "name": "account_id", "type": "string" }, { "name": "amount", "type": "double" }, { "name": "fraud_score", "type": "double" }] } (Avro schema for additional serialization)
- Encoding Metric Tag Values: Single (exposes metric tag values as single strings)
- Encoding Timestamp Format: RFC3339 (formats timestamps in RFC3339 format)

Request Configuration
- Request Concurrency: Adaptive concurrency (adjusts parallelism based on system load)
- Request Rate Limit Duration Secs: 1 (time window for rate limiting)
- Request Rate Limit Num: 1000 (maximum requests within the time window)
- Request Retry Attempts: 3 (maximum retries for failed requests)
- Request Retry Initial Backoff Secs: 1 (initial wait time before the first retry)
- Request Retry Max Duration Secs: 3600 (maximum wait time between retries)
- Request Timeout Secs: 60 (time before aborting a request)

Batching Configuration
- Batch Timeout Secs: 1 (maximum age of a batch before flushing)
- Batch Max Bytes: 10485760 (maximum batch size, 10 MB, before flushing)
- Batch Max Events: 1000 (maximum number of events in a batch)

Framing
- Framing Method: Newline Delimited (frames events with newline characters)
- Framing Character Delimited Delimiter: Empty (not used with newline-delimited framing)

TLS Configuration
- TLS CA: /opt/observo/certs/ca.crt (path to the CA certificate for server verification)
- TLS CRT: /opt/observo/certs/finsecure.crt (path to the client certificate for mTLS)
- TLS Key: /opt/observo/certs/finsecure.key (path to the private key for mTLS)
- TLS Key Pass: FinSecure2025 (passphrase to unlock the encrypted key file)
- TLS Verify Certificate: True (enables certificate verification)
- TLS Verify Hostname: True (verifies the hostname in the TLS certificate)

Advanced Settings
- Filename Extension: .parquet (specifies the Parquet extension for object keys)
- Metadata: source=finsecure, encoding=parquet (adds metadata for auditing and querying)
Additional Configuration
Save and Test: Save the configuration and send sample transactional data to the finsecure-telemetry-data bucket. Verify data presence in the GCS bucket using the Observo AI Analytics tab to confirm successful data flow.
Outcome
With this configuration, FinSecure successfully sends transactional data, compliance logs, and fraud detection metrics to its GCS bucket via Observo AI, enabling real-time fraud analysis, regulatory compliance, and optimized financial operations through centralized, scalable storage and advanced analytics.
Troubleshooting
If you encounter issues with the GCP Cloud Storage destination, use the following steps to diagnose and resolve them:
Verify Service Account Permissions:
Ensure the service account has the "Storage Object Admin" role. Check the IAM page in the Google Cloud Console and enable the "Include Google-provided role grants" option to view service accounts such as [email protected].
Check Connection Status:
In the Observo AI interface, verify the destination’s connection status to confirm it is active.
Review Logs:
Check Observo AI logs for errors or warnings related to data transmission to GCP Cloud Storage.
Validate Bucket Configuration:
Confirm the bucket exists, is accessible, and matches the specified region and name.
Check Data Format:
Ensure the selected encoding format (such as JSON or Parquet) is compatible with downstream processes.
Proxy Configuration:
If using a proxy, verify the proxy settings are correctly configured (Proxy Configuration).
Test Data Flow:
Send sample data and verify it appears in the GCP Cloud Storage bucket.
Monitor Data Volume:
Use the Analytics tab in the Observo AI pipeline to monitor data volume and ensure expected throughput.
Common issues and resolutions:
- Data not reaching bucket: Incorrect service account credentials. Verify the JSON key file and permissions.
- Connection errors: Cloud Storage API not enabled or wrong region. Enable the Cloud Storage API and confirm the bucket region.
- Serialization errors: Incorrect encoding format. Ensure the correct codec (such as JSON or Parquet) is selected.
- Slow data transfer: Backpressure or rate limiting. Adjust batching settings or check GCP quotas.
Resources
For additional guidance and detailed information, refer to the following resources: