Socket

The Socket Source in Observo AI enables the ingestion of raw TCP/UDP data from systems or clients, facilitating real-time analysis and processing of unstructured or semi-structured logs, events, or metrics for enhanced observability and analytics.

Purpose

The Socket source lets users ingest raw TCP or UDP data into the Observo AI platform from systems or clients that send logs, events, or metrics to a designated hostname and port. It collects unstructured or semi-structured data in real time, which helps organizations streamline data pipelines, improve observability, and support monitoring, analytics, and security use cases across diverse sources.

Prerequisites

Before configuring the Socket source in Observo AI, ensure the following requirements are met to facilitate seamless data ingestion:

  • Observo AI Platform Setup:

    • The Observo AI platform must be installed and operational, with support for the Socket source.

    • Verify that the platform supports raw TCP or UDP data ingestion, as Socket sources typically deliver unstructured or semi-structured data. Additional parsers may be needed for custom processing.

  • Socket Configuration:

    • Identify the hostname or IP address (such as localhost or 0.0.0.0) and the port number on which the Socket source will listen for incoming TCP or UDP data.

    • Ensure the sending system or client is configured to transmit data to the specified Observo AI Socket endpoint.

  • Authentication:

    • Prepare authentication settings if required:

      • Shared Secret (authToken): Optionally, define a shared secret (authToken) to be provided by clients in a header for secure access.

      • Determine if unauthenticated access is permitted or if a token is mandatory.

  • Network and Connectivity:

    • Ensure Observo AI can listen on the specified hostname/IP and port for incoming TCP connections or UDP datagrams.

    • Check for firewall rules, proxy settings, or VPC endpoint configurations that may affect connectivity to the Socket endpoint.

| Prerequisite | Description | Notes |
| --- | --- | --- |
| Observo AI Platform | Must be installed and support Socket source | Verify support for raw TCP data; additional parsers may be needed |
| Socket Configuration | Hostname/IP and port for incoming TCP data | Ensure clients are configured to send to the specified endpoint |
| Authentication | Optional shared secret for secure access | Define authToken if required; decide on unauthenticated access |
| Network | Connectivity for the Socket endpoint | Check firewall, proxies, and VPC endpoints for accessibility |
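When identifying the listen address, it can help to check the value before entering it in the UI. The following is a minimal sketch, assuming the host:port format and the 10000-10200 port range described under General Settings; the function name `parse_socket_address` is hypothetical and not part of Observo AI.

```python
def parse_socket_address(address: str, min_port: int = 10000, max_port: int = 10200):
    """Split a host:port string and check the port against the allowed range.

    The default range mirrors the one stated in this document; adjust it if
    your deployment differs.
    """
    # rpartition splits on the last colon, so the port is always the final part.
    host, sep, port_text = address.rpartition(":")
    if not sep or not host:
        raise ValueError(f"expected host:port, got {address!r}")
    if not port_text.isdecimal():
        raise ValueError(f"port must be a number, got {port_text!r}")
    port = int(port_text)
    if not min_port <= port <= max_port:
        raise ValueError(f"port {port} is outside {min_port}-{max_port}")
    return host, port
```

Calling `parse_socket_address("0.0.0.0:10010")` returns the host and port as a tuple, while an address such as `localhost:80` raises `ValueError` because the port falls outside the range.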

Integration

The Integration section outlines the configurations for the Socket source. To configure the Socket source in Observo AI, follow these steps to set up and test the data flow:

  1. Log in to Observo AI:

    • Navigate to the Sources tab.

    • Click the Add Source button and select Create New.

    • Choose Socket from the list of available sources to begin configuration.

  2. General Settings:

    • Name: A unique identifier for the source, such as socket-source-1.

    • Description (Optional): Provide a description for the source.

    • Socket Address: The address on which to listen for connections, in host:port format. The port must be in the range 10000-10200.

      Example

      0.0.0.0:10010

    • The type of socket to use: Select one of the following options:

      Listen on a TCP Port

      Listen on a UDP Port

    • Decoding Codec: The supported Codecs.

    | Option | Description |
    | --- | --- |
    | Raw Bytes | Processes log messages as uninterpreted binary data without any parsing or structure. The raw byte stream is passed through unchanged, allowing for custom processing downstream. Useful when you need complete control over message interpretation or when dealing with proprietary log formats. |
    | GELF | Graylog Extended Log Format. A structured logging format that uses JSON to encode log messages with standardized fields like version, host, timestamp, level, and message. Designed specifically for centralized logging systems and provides better searchability and filtering capabilities compared to traditional syslog formats. |
    | JSON | Parses log messages formatted as JSON objects, allowing for structured logging with arbitrary fields and nested data. Provides flexibility for applications to include custom metadata and enables efficient querying and analysis of log data. Messages must be valid JSON to be properly decoded. |
    | Syslog (RFC 3164 or RFC 5424) | RFC 3164: The original BSD syslog format that uses a simple structure with priority value, timestamp, hostname, and message. Messages follow the format: <priority>timestamp hostname message. This is the traditional syslog format widely supported by legacy systems and network devices. RFC 5424: The newer standardized syslog format that includes structured data elements and UTF-8 support. Messages follow the format: <priority>version timestamp hostname app-name proc-id msg-id [structured-data] message. Provides better internationalization, more precise timestamps, and support for structured metadata within log messages. |
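To make the structured codecs concrete, the Python sketch below builds one sample message for each of JSON, GELF, and the two syslog formats. All helper names and field values (host names, application names, message text) are invented for illustration and are not part of Observo AI. The GELF helper assumes GELF 1.1, where `version`, `host`, and `short_message` are the required fields.

```python
import json
from datetime import datetime, timezone

def json_message(fields: dict) -> str:
    """Serialize arbitrary fields as a compact, single-line JSON object."""
    return json.dumps(fields, separators=(",", ":"))

def gelf_message(host: str, short_message: str, level: int = 6) -> str:
    """Build a minimal GELF 1.1 payload with a Unix-epoch timestamp."""
    return json.dumps({
        "version": "1.1",
        "host": host,
        "short_message": short_message,
        "timestamp": datetime.now(timezone.utc).timestamp(),
        "level": level,
    })

def syslog_priority(facility: int, severity: int) -> int:
    """Syslog priority value: facility * 8 + severity."""
    return facility * 8 + severity

def rfc3164_message(facility, severity, when, hostname, tag, text) -> str:
    """<priority>timestamp hostname message, with the BSD 'Mmm dd hh:mm:ss' stamp."""
    pri = syslog_priority(facility, severity)
    # RFC 3164 pads single-digit days with a space rather than a zero.
    stamp = f"{when.strftime('%b')} {when.day:2d} {when.strftime('%H:%M:%S')}"
    return f"<{pri}>{stamp} {hostname} {tag}: {text}"

def rfc5424_message(facility, severity, when, hostname, app, procid, msgid, text) -> str:
    """<priority>version timestamp hostname app-name proc-id msg-id structured-data message."""
    pri = syslog_priority(facility, severity)
    # Millisecond-precision UTC timestamp; "-" marks empty structured data.
    stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"
    return f"<{pri}>1 {stamp} {hostname} {app} {procid} {msgid} - {text}"
```

Sending the output of one of these helpers to the source with the matching codec selected is a quick way to confirm that events are decoded into the expected fields.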

  3. Framing (Optional):

    • Framing Method: How the incoming byte stream is split into individual events. Default: Single Character Delimited

      Options

      Single Character Delimited

      Raw Event data (not delimited)

      Prefixed with an unsigned big-endian 32-bit integer indicating the length

      Predefined with Byte Length according to the octet counting format

      Newline Delimited

      Netflow

    • Character Delimiter: The character delimiter to use when framing method is set to character_delimited.

      Example

      |

    • Netflow Max Packet Size: The maximum Netflow packet size. Relevant when the framing method is set to Netflow.

      Example

      65536 (default)
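The framing method has to match what the sending client actually puts on the wire. The Python sketch below shows how a client might frame one event for four of the methods listed above. The function names are hypothetical, and the octet-counting helper assumes the RFC 6587 layout (decimal length, a space, then the message), which this document does not spell out.

```python
import struct

def frame_newline(payload: bytes) -> bytes:
    """Newline Delimited: each event ends with a line feed."""
    return payload + b"\n"

def frame_character(payload: bytes, delimiter: bytes = b"|") -> bytes:
    """Single Character Delimited: each event ends with the configured character."""
    if len(delimiter) != 1:
        raise ValueError("delimiter must be exactly one byte")
    return payload + delimiter

def frame_length_prefixed(payload: bytes) -> bytes:
    """Prefix the event with its length as an unsigned big-endian 32-bit integer."""
    return struct.pack(">I", len(payload)) + payload

def frame_octet_counting(payload: bytes) -> bytes:
    """Octet counting (assumed RFC 6587): ASCII decimal length, space, message."""
    return str(len(payload)).encode("ascii") + b" " + payload

def split_length_prefixed(stream: bytes) -> list:
    """Receiver-side view: split a buffer of complete length-prefixed frames."""
    frames, offset = [], 0
    while offset + 4 <= len(stream):
        (size,) = struct.unpack_from(">I", stream, offset)
        offset += 4
        frames.append(stream[offset:offset + size])
        offset += size
    return frames
```

Delimiter-based framing is simple but breaks if the delimiter can appear inside an event, whereas the two length-based methods tolerate arbitrary payload bytes.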

  4. TLS Configuration (Optional):

    • TLS Enabled (Default: False): Toggle to enable TLS. Relevant when mode is set to tcp.

    • TLS CA File (Default: Empty): Absolute path to an additional CA certificate file. Relevant when mode is set to tcp.

    • TLS Crt File (Default: Empty): Absolute path to a certificate file that identifies the server. Relevant when mode is set to tcp.

    • TLS Key File (Default: Empty): Absolute path to a private key file. Relevant when mode is set to tcp.

    • Private Key Password: Passphrase to unlock the encrypted key file, if applicable. Relevant when mode is set to tcp.

    • Verify Client Certificates (Default: False): Enable to verify client certificates. When enabled, the certificate must be issued by a trusted issuer. Relevant when mode is set to tcp.

    • Verify Client Hostname (Default: False): Enable to verify the client hostname. When enabled, the client hostname must match the certificate hostname. Relevant when mode is set to tcp.

  5. Advanced Settings (Optional):

    • Max number of concurrent TCP connections: Empty (Default)

    • The maximum buffer size, in bytes, of incoming messages (relevant when mode is set to udp): Empty (Default)

  6. Parser Config:

    • Enable Source Log Parser: (False)

    • Toggle Enable Source Log Parser Switch to enable.

    • Select appropriate Parser from the Source Log Parser dropdown.

    • Add additional Parsers as needed.

  7. Pattern Extractor:

    • Refer to Observo AI's Pattern Extractor documentation for details on configuring pattern-based data extraction.

  8. Archival Destination:

    • Toggle Enable Archival on Source Switch to enable.

    • Under Archival Destination, select from the list of Archival Destinations (Required).

  9. Save and Test Configuration:

    • Save the configuration settings in Observo AI.

    • Send sample TCP or UDP data, matching the configured socket type, to the configured Socket endpoint and verify ingestion in the Analytics tab.
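For the final test step, any client that can open a socket will do. The following is a minimal Python sketch of such a test client using only the standard library; the function names `send_tcp` and `send_udp` are hypothetical, and the host and port you pass in are whatever was configured as the Socket Address.

```python
import socket

def send_tcp(host: str, port: int, payload: bytes, timeout: float = 5.0) -> None:
    """Open a TCP connection, send the payload, and close the connection."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(payload)

def send_udp(host: str, port: int, payload: bytes) -> None:
    """Send the payload as a single UDP datagram (no delivery guarantee)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
```

For example, `send_tcp("127.0.0.1", 10010, b'{"message": "hello"}\n')` would deliver one newline-terminated JSON event to a source listening on 0.0.0.0:10010 on the same machine. The payload must use the framing and codec selected in the earlier steps, otherwise the event may be dropped or left unparsed.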

Example Scenarios

NexGen Energy, a fictitious renewable energy company, integrates wind turbine telemetry data via a TCP socket into Observo AI for real-time analytics and predictive maintenance. The configuration uses a TCP listener on monitor.nexgenenergy.com:10010 with TLS enabled for secure ingestion of JSON-formatted telemetry data.

Standard Socket Source Setup

Here is a standard Socket Source configuration example. Only the required sections and their associated field updates are displayed in the tables below:

General Settings

| Field | Value | Notes |
| --- | --- | --- |
| Name | socket-tcp-nexgen-1 | Unique identifier for the Socket source, indicating NexGen Energy’s telemetry ingestion. |
| Description | Ingest wind turbine telemetry via TCP for NexGen Energy analytics | Optional, provides context for the source’s purpose. |
| Socket Address | monitor.nexgenenergy.com:10010 | Host and port for the Observo AI TCP listener, matching the client’s configuration. |
| The type of socket to use | Listen on a TCP Port | Specifies TCP as the protocol for reliable data transmission. |
| Decoding Codec | JSON | Selected to parse JSON-formatted telemetry data from wind turbine sensors. |

TLS Configuration

| Field | Value | Notes |
| --- | --- | --- |
| TLS Enabled | True | Enabled to require TLS for secure TCP connections, ensuring encrypted data transfer. |
| TLS CA File | -----BEGIN CERTIFICATE----- MIID... (PEM format) | Inline PEM string for the CA certificate, verifying the client’s certificate. |
| TLS Crt File | -----BEGIN CERTIFICATE----- MIIC... (PEM format) | Inline PEM string for the Observo AI server certificate, identifying the TCP listener. |
| TLS Key File | -----BEGIN PRIVATE KEY----- MIIE... (PEM format) | Inline PEM string for the private key, securely stored for the server certificate. |
| Private Key Password | NexGenKey2025! | Passphrase to unlock the encrypted key file, securely managed. |
| Verify Client Certificates | True | Enables verification of the client’s certificate, ensuring it’s valid and trusted. |
| Verify Client Hostname | True | Ensures the client hostname matches the certificate, enhancing security. |

Framing

| Field | Value | Notes |
| --- | --- | --- |
| Framing Method | Newline Delimited | Selected to match JSON data delimited by newlines, common for telemetry streams. |
| Character Delimiter | (Empty) | Not used, as the framing method is newline-delimited. |
| Netflow Max Packet Size | (Empty) | Not applicable, as Netflow is not used. |

Test Configuration:

  • Click “Save” to store the configuration settings in Observo AI.

  • Send sample JSON telemetry data (e.g., simulated turbine metrics) to monitor.nexgenenergy.com:10010 via TCP. Verify ingestion by monitoring the Analytics tab in the Observo AI pipeline for event counts and throughput.
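One way to produce the sample telemetry for this scenario is a small Python client that sends newline-delimited JSON over TLS and presents a client certificate, since Verify Client Certificates is enabled. This is a sketch under stated assumptions: the telemetry field names (`turbine_id`, `rpm`, `power_kw`), the function names, and the certificate file paths are all hypothetical, and the real certificates would come from NexGen Energy’s security team.

```python
import json
import socket
import ssl

def build_telemetry_line(turbine_id: str, rpm: float, power_kw: float) -> bytes:
    """Encode one telemetry record as a newline-terminated JSON line."""
    record = {"turbine_id": turbine_id, "rpm": rpm, "power_kw": power_kw}
    return json.dumps(record).encode("utf-8") + b"\n"

def send_over_tls(host, port, payload, ca_file, cert_file, key_file, key_password=None):
    """Send the payload over a mutually authenticated TLS connection.

    ca_file verifies the server certificate; cert_file and key_file are the
    client's own identity, presented because the server verifies clients.
    """
    context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    context.load_cert_chain(certfile=cert_file, keyfile=key_file, password=key_password)
    with socket.create_connection((host, port), timeout=10) as raw:
        # server_hostname enables SNI and hostname checking against the certificate.
        with context.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(payload)

# Example call (paths are placeholders, not real files):
# send_over_tls("monitor.nexgenenergy.com", 10010,
#               build_telemetry_line("T-07", 14.2, 1850.0),
#               "ca.pem", "client.pem", "client.key")
```

The trailing newline on each record matters here because the Framing Method in this scenario is Newline Delimited.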

Notes:

  • TLS Configuration: PEM certificate/key strings are placeholders; actual values must be provided by NexGen Energy’s security team, securely stored in Observo AI. TLS Enabled, Verify Client Certificates, and Verify Client Hostname are set to True for production-grade security.

  • Client Configuration: NexGen Energy’s IoT devices or telemetry systems must be configured to send JSON data to monitor.nexgenenergy.com:10010 over TCP, with TLS certificates matching Observo AI’s setup.

  • Network: Ensure firewall rules allow TCP traffic on port 10010 to monitor.nexgenenergy.com. For high-volume scenarios, consider a TCP-aware load balancer (e.g., AWS NLB) to distribute connections.

  • Troubleshooting: If issues occur (e.g., “Connection refused” or “Authentication failed”), verify the socket address, port, and TLS certificate paths. Use tools like netcat (nc monitor.nexgenenergy.com 10010) or openssl s_client for connectivity and TLS debugging, and check Observo AI’s logs for errors, as per the Troubleshooting section.

  • Resources: Refer to Observo AI documentation and external resources like “The Ultimate Guide to TCP/IP” and “What is TLS Encryption?” for additional guidance.

  • NexGen Energy Context: This configuration supports real-time ingestion of wind turbine telemetry, enabling predictive maintenance and anomaly detection, complementing other Observo AI sources for a unified observability pipeline.

Troubleshooting

If issues arise with the Socket source in Observo AI, use the following steps to diagnose and resolve them:

  • Verify Configuration Settings:

    • Ensure all fields, such as Address, Port, and IP Allowlist Regex, are correctly entered and match the client’s setup.

    • Confirm header settings and authToken match the expected format and value from the sending client.

  • Check Authentication:

    • Verify the authToken is valid and correctly provided in the header by the client, if enabled.

    • Check Observo AI logs for authentication failure errors if unauthenticated access is not permitted.

  • Validate Network Connectivity:

    • Check for firewall rules, proxy settings, or VPC endpoint configurations that may block access to the Socket endpoint.

    • Test connectivity using tools like netcat (nc) or telnet to the configured address and port to verify access.

  • Common Error Messages:

    • "Connection refused": Indicates the port is not open or Observo AI is not listening on the specified address. Verify Address and Port settings and check firewall rules.

    • "Authentication failed": Ensure the authToken in the header matches the configured value in Observo AI.

    • "Timeout on connection": Check the Socket Idle Timeout and network latency; consider increasing timeout values if needed.

  • Monitor Logs and Data:

    • Verify that data is being ingested by monitoring the Socket endpoint activity.

    • Use the Analytics tab in the targeted Observo AI pipeline to monitor data volume and ensure expected throughput.

    • Check Observo AI logs for errors or warnings related to data ingestion from the Socket source.

| Issue | Possible Cause | Resolution |
| --- | --- | --- |
| Data not ingested | Incorrect address or port config | Verify Address and Port settings |
| Authentication errors | Invalid or missing authToken | Check authToken in header and configuration |
| Connectivity issues | Firewall or proxy blocking access | Test network connectivity and check firewall rules |
| "Connection refused" | Port not open or service not listening | Ensure Observo AI is listening on the correct address/port |
| "Authentication failed" | Misconfigured authToken | Verify authToken matches client header |
| "Timeout on connection" | Network latency or low timeout setting | Increase timeout values or check network |
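Where netcat or telnet is not available, the same connectivity check can be scripted. The Python sketch below attempts a TCP connection and reports which of the cases above applies; the function name `probe_tcp` and its return strings are hypothetical. It only tests TCP reachability and says nothing about UDP listeners or authentication.

```python
import socket

def probe_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Try to open a TCP connection and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # Nothing is listening on that address/port, or a firewall rejected it.
        return "refused"
    except socket.timeout:
        # No answer within the timeout, often a firewall silently dropping packets.
        return "timeout"
    except OSError as exc:
        # DNS failures, unreachable networks, and similar errors.
        return f"error: {exc}"
```

A result of "refused" points at the listener or address settings, while "timeout" more often points at firewall or routing rules between the client and Observo AI.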

Resources

For additional guidance and detailed information, refer to the following resources:
