Proofpoint Logs

The Proofpoint Logs Source in Observo AI enables the ingestion of JSON-formatted email security logs, events, and metrics from Proofpoint’s SIEM API, supporting real-time threat monitoring, user activity analysis, and compliance reporting.

Purpose

The Proofpoint Logs source lets you ingest log data from Proofpoint's SIEM API into the Observo AI platform for analysis and processing. It collects logs, events, and metrics related to email security, threat detection, and user activity, typically in JSON format. Processing this data in near real time helps organizations streamline data pipelines, improve observability, and support security monitoring, threat analysis, and compliance.

Prerequisites

Before configuring the Proofpoint Logs source in Observo AI, ensure the following requirements are met to facilitate seamless data ingestion:

  • Observo AI Platform Setup:

    • The Observo AI platform must be installed and operational, with support for the Proofpoint Logs source.

    • Verify that the platform supports common data formats such as JSON, as Proofpoint logs are typically delivered in this format. Additional parsers may be needed for custom processing.

  • Proofpoint API Access:

    • An active Proofpoint tenant must be available to send log data to Observo AI.

    • Obtain the Proofpoint API endpoint URL, such as https://tap-api-v2.proofpoint.com/v2/siem/all, and the required authentication credentials (a service principal and secret key) from the Proofpoint admin portal.

  • Authentication:

    • Prepare one of the following authentication methods:

      • Service Principal and Secret: Obtain a service principal and secret key from the Proofpoint admin portal for secure access to the SIEM API.

      • Secret Authentication: Use a stored secret within Observo AI's secure storage for credentials.

  • Network and Connectivity:

    • Ensure Observo AI can communicate with the Proofpoint SIEM API endpoint, such as https://tap-api-v2.proofpoint.com/v2/siem/all.

    • Check for proxy settings, firewall rules, or VPC endpoint configurations that may affect connectivity to the Proofpoint API.

| Prerequisite | Description | Notes |
| --- | --- | --- |
| Observo AI Platform | Must be installed and support Proofpoint Logs | Verify support for JSON; additional parsers may be needed |
| Proofpoint API Access | Active Proofpoint tenant for log data submission | Obtain API endpoint URL and credentials from Proofpoint admin portal |
| Authentication | Service Principal and Secret or Secret Auth | Prepare credentials as required by the SIEM API |
| Network | Connectivity to the Proofpoint API endpoint | Check VPC endpoints, proxies, and firewalls |

Integration

To configure the Proofpoint Logs source in Observo AI, follow these steps to set up and test the data flow:

  1. Log in to Observo AI:

    • Navigate to the Sources tab.

    • Click the Add Source button and select Create New.

    • Choose Proofpoint Logs from the list of available sources to begin configuration.

  2. General Settings:

    • Name: A unique identifier for the source, such as proofpoint-logs-source-1.

    • Description (Optional): Provide a description for the source.

    • Endpoint: The Proofpoint SIEM API endpoint to collect data from. Supports templating with $LAST_VALUE$ when using checkpointing. Default: https://tap-api-v2.proofpoint.com/v2/siem/all

      Examples

      https://tap-api-v2.proofpoint.com/v2/siem/all

      https://tap-api-v2.proofpoint.com/v2/siem/all?format=json&sinceSeconds=3600

    • Collection Interval (Optional): Duration between consecutive data collection requests. Default: 60s

      Examples

      10s

      1m

    • Headers (Add as needed): Headers to include in the HTTP request. Use the format {key: value}.

  3. Authentication (Optional):

    • Service Principal: Enter the service principal obtained from the Proofpoint admin portal.

    • Secret: Enter the secret key for authentication.

    • Token URL: URL to get the token. Leave empty.

    • Scopes (Add as needed): Scopes to request for authentication. Leave empty.

    • Authentication Headers (Add as needed): Headers to include in the authentication HTTP request. Use the format {key: value}.

  4. Checkpoint:

    • Enable Checkpoint (False): Enable incremental log collection using checkpointing.

    • Tracking Column: JSON path to the field used for tracking progress such as 'timestamp'. The value from the last log entry will be used. Default: created_timestamp

      Examples

      timestamp

      message.time

      data.created_at

    • Initial Value: Starting value for the tracking column. Will be used for the first collection.

      Example

      2025-04-06T00:00:00Z

  5. Pagination:

    • Enable Pagination (False): Enable pagination support for handling paginated responses.

    • Pagination Type: Type of pagination to use. Default: Page-Based

      Options

      Page-Based: For traditional page numbers

      Attribute-Based: For cursor or token-based pagination

    • Page Parameter Name: Query parameter name for the page number. Default: page Example: page results in ?page=1

      Examples

      page

      page_number

      pageNum

    • Size Parameter Name: Query parameter name for the page size. Default: size Example: size results in ?size=50

      Examples

      size

      limit

      page_size

    • Page Size: Number of records to request per page. Default: 50

      Examples

      50

      100

      200

    • Start Page: Page number to start pagination from. Works in conjunction with zero-based setting. Default: 0

      Examples

      0

      1

    • Maximum Pages: Maximum number of pages to retrieve in one collection cycle. Set to 0 for unlimited.

      Examples

      50

      100

      0

    • Total Pages Path: JSON path to total pages count in response. Example: meta.total_pages for {"meta": {"total_pages": 5}}

      Examples

      meta.total_pages

      pagination.pages

      page_info.total

    • Total Count Path: JSON path to total record count in response. Example: meta.total for {"meta": {"total": 150}}

      Examples

      meta.total

      pagination.total_records

      count

    • Zero-Based Indexing (False): If true, page numbering starts at 0. If false, it starts at 1.

    • Response Attributes (Add as needed): JSON paths to attributes in the response body that contain next page information. Default: meta.pagination.after Example: meta.nextCursor for cursor-based pagination

      Examples:

      meta.nextCursor

      pagination.nextPage

      links.next

    • Header Attributes (Add as needed): Names of HTTP headers that contain next page information. Default: Link, as used by GitHub-style pagination. Example: X-Next-Page

      Examples:

      X-Next-Page

      X-Next-Cursor

      Link

    • Request Interval: Time to wait between pagination requests. Use a duration string like '100ms' or '1s'. Default: 100ms

      Examples

      100ms

      500ms

      1s

  6. TLS Configuration (Optional):

    • CA File: The CA certificate provided as an inline string in PEM format.

    • Include System CA Certs Pool (True): Include the system CA certificates pool in the list of CAs used to verify the server certificate.

    • Cert File: Path to the TLS cert to use for TLS required connections.

    • Key File: Path to the TLS key to use for TLS required connections.

    • Insecure (True): Skip TLS verification when connecting to the endpoint. This is insecure and should not be used in production.

    • Insecure Skip Verify (True): Use TLS but do not verify the server certificate.

    • Server Name Override: The server name to use to verify the hostname on the returned certificates.

  7. Advanced Settings (Optional):

    • Proxy URL: URL of the proxy server to use when connecting to the endpoint.

    • Read Buffer Size: Size of the read buffer in bytes.

    • Write Buffer Size: Size of the write buffer in bytes.

    • Timeout: Timeout for the HTTP request. Use a number followed by a unit, such as '30s' or '1m'. Default: 10s

    • Compression: Compression algorithm to use for the request body. Select one.

    • Max Idle Connections: Maximum number of idle connections to keep open to the endpoint.

    • Idle Connection Timeout: Timeout for idle connections to the endpoint. Use a number followed by a unit, such as '30s' or '1m'.

    • HTTP 2 Read Idle Timeout: Timeout for HTTP/2 read idle connections to the endpoint. Use a number followed by a unit, such as '30s' or '1m'.

    • HTTP 2 Read Ping Timeout: Timeout for HTTP/2 read ping connections to the endpoint. Use a number followed by a unit, such as '30s' or '1m'.

    • Method: HTTP request method to use for requests. Supports GET and POST. Default: GET

    • Body: Request body for POST method. Supports templating with $LAST_VALUE$ when using checkpointing.

      Examples

      {"query": "fetch_logs", "from": "$LAST_VALUE$"}

    • Response Log Path: JSON path to logs array in responses. Leave empty if the response is a direct array of logs.

      Examples

      data

      resource.logs

  8. Parser Config:

    • Enable Source Log Parser: (False)

    • Toggle Enable Source Log Parser Switch to enable.

    • Select appropriate Parser from the Source Log Parser dropdown.

    • Add additional Parsers as needed.

  9. Pattern Extractor:

    • Refer to Observo AI's Pattern Extractor documentation for details on configuring pattern-based data extraction.

  10. Archival Destination:

    • Toggle Enable Archival on Source Switch to enable.

    • Under Archival Destination, select from the list of Archival Destinations (Required).

  11. Save and Test Configuration:

    • Save the configuration settings in Observo AI.

    • Allow at least one collection interval to elapse (or generate test email activity in Proofpoint), then verify ingestion in the Analytics tab.
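Several fields in the steps above (Collection Interval, Request Interval, Timeout) take duration strings such as 10s, 1m, or 100ms. The sketch below shows one way such values could be interpreted. The helper name parse_duration_ms and the accepted unit set (ms, s, m, h) are assumptions made for illustration; the parser inside Observo AI may accept other forms.

```python
import re

# Hypothetical helper. The unit set is inferred from the examples in this guide.
_UNIT_TO_MS = {"ms": 1, "s": 1_000, "m": 60_000, "h": 3_600_000}
_DURATION = re.compile(r"(\d+)(ms|s|m|h)")


def parse_duration_ms(text: str) -> int:
    """Convert a duration string such as '100ms' or '1m' to milliseconds."""
    match = _DURATION.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    value, unit = match.groups()
    return int(value) * _UNIT_TO_MS[unit]


if __name__ == "__main__":
    for example in ("10s", "1m", "100ms"):
        print(example, "->", parse_duration_ms(example), "ms")
```

Validating a value this way before saving the source can catch typos such as a missing unit.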

Example Scenarios

Horizon Insurance Group, a fictitious enterprise in the insurance sector, provides property and casualty insurance. To strengthen its cybersecurity posture and meet regulatory requirements, Horizon wants to ingest email security logs from Proofpoint’s SIEM API into its Observo AI platform. This integration enables real-time monitoring of email threats, user activity analysis, and compliance reporting. The configuration for the Proofpoint Logs source is described below.

Standard Proofpoint Logs Source Setup

Here is a standard Proofpoint Logs Source configuration example. Only the required sections and their associated field updates are displayed in the tables below:

General Settings

| Field | Value | Description |
| --- | --- | --- |
| Name | proofpoint-email-logs | Unique identifier for the Proofpoint Logs source. |
| Description | Source for ingesting Proofpoint email security logs | Optional description for clarity. |
| Endpoint | https://tap-api-v2.proofpoint.com/v2/siem/all?format=json&sinceSeconds=3600 | Proofpoint SIEM API endpoint with JSON format and 1-hour lookback. |
| Collection Interval | 30s | Data collection requests every 30 seconds for near-real-time ingestion. |
| Headers | Content-Type: application/json | Specifies JSON content type for HTTP requests. |
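The Endpoint value above is the default URL plus two query parameters. As a small illustration, the following sketch assembles it with the Python standard library. The function name build_endpoint is hypothetical; the format and sinceSeconds parameters are taken from the example URL in this guide.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit

BASE_ENDPOINT = "https://tap-api-v2.proofpoint.com/v2/siem/all"


def build_endpoint(base: str, since_seconds: int, fmt: str = "json") -> str:
    """Append the format and sinceSeconds query parameters to the base endpoint."""
    parts = urlsplit(base)
    query = urlencode({"format": fmt, "sinceSeconds": since_seconds})
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


if __name__ == "__main__":
    print(build_endpoint(BASE_ENDPOINT, 3600))
```

Building the URL from parts rather than by string concatenation keeps the query string correctly encoded if other parameters are added later.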

Authentication

| Field | Value | Description |
| --- | --- | --- |
| Service Principal | horizon_proofpoint_sp | Service principal obtained from Proofpoint admin portal. |
| Secret | ${PROOFPOINT_SECRET_KEY} | Secret key stored in Observo AI’s secure storage. |
| Token URL | (empty) | Left empty as no token URL is required for service principal auth. |
| Scopes | (empty) | Left empty as no scopes are required for SIEM API access. |
| Authentication Headers | Authorization: Basic [Base64-encoded horizon_proofpoint_sp:${PROOFPOINT_SECRET_KEY}] | Base64-encoded credentials for HTTP Basic Authentication. |
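The Authentication Headers row describes a standard HTTP Basic Authentication header: the service principal and secret joined by a colon, then Base64-encoded. The sketch below shows how that value can be computed. The function name basic_auth_header and the placeholder secret are illustrative assumptions; in practice the secret comes from secure storage rather than source code.

```python
import base64
from typing import Dict


def basic_auth_header(principal: str, secret: str) -> Dict[str, str]:
    """Build an HTTP Basic Authorization header from a principal and secret."""
    raw = f"{principal}:{secret}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


if __name__ == "__main__":
    # "example-secret" is a placeholder, not a real credential.
    header = basic_auth_header("horizon_proofpoint_sp", "example-secret")
    print(header["Authorization"])
```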

Checkpoint

| Field | Value | Description |
| --- | --- | --- |
| Enable Checkpoint | True | Enables incremental log collection using checkpointing. |
| Tracking Column | timestamp | JSON path to the timestamp field for tracking progress. |
| Initial Value | 2025-07-01T00:00:00Z | Starting value for the first collection, aligned with UTC. |
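To make the checkpoint settings more concrete, here is a minimal sketch of how incremental collection can work: the $LAST_VALUE$ placeholder is replaced with the current checkpoint, and after each collection the checkpoint advances to the tracking column of the last log entry. The helper names (get_path, render, next_checkpoint) and the sample records are assumptions for illustration, not Observo AI internals.

```python
from typing import Any, Mapping, Sequence

PLACEHOLDER = "$LAST_VALUE$"


def get_path(record: Mapping[str, Any], path: str) -> Any:
    """Follow a dotted path such as 'message.time' through nested dicts."""
    value: Any = record
    for key in path.split("."):
        value = value[key]
    return value


def render(template: str, last_value: str) -> str:
    """Substitute the checkpoint value into an endpoint or body template."""
    return template.replace(PLACEHOLDER, last_value)


def next_checkpoint(
    logs: Sequence[Mapping[str, Any]], tracking_column: str, current: str
) -> str:
    """Use the tracking value of the last log entry, or keep the current one."""
    if not logs:
        return current
    return str(get_path(logs[-1], tracking_column))


if __name__ == "__main__":
    checkpoint = "2025-07-01T00:00:00Z"  # Initial Value from the table above
    body = render('{"query": "fetch_logs", "from": "$LAST_VALUE$"}', checkpoint)
    print(body)
    # Invented sample records standing in for an API response.
    logs = [
        {"timestamp": "2025-07-01T00:05:00Z"},
        {"timestamp": "2025-07-01T00:09:30Z"},
    ]
    checkpoint = next_checkpoint(logs, "timestamp", checkpoint)
    print(checkpoint)
```

Keeping the previous checkpoint when a collection returns no logs avoids resetting progress during quiet periods.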

Pagination

| Field | Value | Description |
| --- | --- | --- |
| Enable Pagination | True | Enables pagination for handling large API responses. |
| Pagination Type | Page-Based | Uses traditional page numbers for pagination. |
| Page Parameter Name | page | Query parameter for page number (e.g., ?page=1). |
| Size Parameter Name | size | Query parameter for page size (e.g., ?size=100). |
| Page Size | 100 | Requests 100 records per page for efficient processing. |
| Start Page | 1 | Begins pagination from page 1 (non-zero-based). |
| Maximum Pages | 50 | Limits collection to 50 pages per cycle to manage load. |
| Total Pages Path | meta.total_pages | JSON path to total pages in response (e.g., {"meta": {"total_pages": 10}}). |
| Total Count Path | meta.total_records | JSON path to total record count in response. |
| Zero-Based Indexing | False | Page numbering starts at 1. |
| Response Attributes | meta.nextCursor | JSON path to next page cursor for pagination. |
| Header Attributes | Link | HTTP header that contains next page information. |
| Request Interval | 200ms | Waits 200 milliseconds between pagination requests. |
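The page-based settings above can be read as a simple loop: request a page, append its records, and stop when the reported total pages or the Maximum Pages limit is reached. The sketch below illustrates that logic against a stand-in fetch function. The names collect_pages and fake_fetch, and the response shape ({"data": [...], "meta": {"total_pages": N}}), are assumptions chosen to match the example paths in this guide; they are not a description of Observo AI's implementation or of Proofpoint's actual responses.

```python
from typing import Any, Callable, Dict, List

Page = Dict[str, Any]


def collect_pages(
    fetch: Callable[[Dict[str, int]], Page],
    page_param: str = "page",
    size_param: str = "size",
    page_size: int = 100,
    start_page: int = 1,
    max_pages: int = 50,
) -> List[Any]:
    """Fetch pages until the reported total or max_pages is reached (0 = unlimited)."""
    records: List[Any] = []
    page = start_page
    fetched = 0
    while max_pages == 0 or fetched < max_pages:
        response = fetch({page_param: page, size_param: page_size})
        records.extend(response.get("data", []))
        fetched += 1
        # If the total is missing, this sketch stops after the first page.
        total_pages = response.get("meta", {}).get("total_pages", 0)
        if fetched >= total_pages:
            break
        page += 1
    return records


def fake_fetch(params: Dict[str, int]) -> Page:
    """Stand-in for an HTTP call: serves 250 invented records in 1-based pages."""
    all_records = list(range(250))
    start = (params["page"] - 1) * params["size"]
    return {
        "data": all_records[start:start + params["size"]],
        "meta": {"total_pages": 3},
    }


if __name__ == "__main__":
    print(len(collect_pages(fake_fetch)))
    print(len(collect_pages(fake_fetch, max_pages=2)))
```

A real collector would also pause for the Request Interval between requests; that delay is omitted here to keep the sketch short.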

TLS Configuration

| Field | Value | Description |
| --- | --- | --- |
| CA File | -----BEGIN CERTIFICATE-----\nMIID...==\n-----END CERTIFICATE----- | Inline CA certificate in PEM format for Proofpoint API. |
| Include System CA Certs Pool | True | Includes system CA certificates for verification. |
| Cert File | /certs/horizon_client_cert.pem | Path to the client TLS certificate. |
| Key File | /certs/horizon_client_key.pem | Path to the client TLS private key. |
| Insecure | False | Ensures TLS verification is enabled for production security. |
| Insecure Skip Verify | False | Verifies the server certificate for secure communication. |
| Server Name Override | tap-api-v2.proofpoint.com | Server name for hostname verification in TLS; matches the host in the Endpoint. |
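For readers who want to see how these TLS options relate to one another, the following sketch maps them onto Python's ssl module. This mapping is an assumption made for illustration (Observo AI's own TLS handling is not described in this guide), and build_tls_context is a hypothetical name.

```python
import ssl
from typing import Optional


def build_tls_context(
    ca_pem: Optional[str] = None,
    include_system_cas: bool = True,
    cert_file: Optional[str] = None,
    key_file: Optional[str] = None,
    insecure_skip_verify: bool = False,
) -> ssl.SSLContext:
    """Create a client TLS context from settings similar to the table above."""
    if include_system_cas:
        # Loads the system CA pool and enables certificate and hostname checks.
        context = ssl.create_default_context()
    else:
        # Verification stays on, but only explicitly loaded CAs are trusted.
        context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if ca_pem:
        context.load_verify_locations(cadata=ca_pem)  # inline PEM string
    if cert_file and key_file:
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if insecure_skip_verify:
        # Not for production: disables hostname and certificate verification.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context


if __name__ == "__main__":
    secure = build_tls_context()
    print(secure.verify_mode == ssl.CERT_REQUIRED, secure.check_hostname)
```

In this model, the Server Name Override would correspond to the server_hostname argument passed when wrapping a socket with the context.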

Advanced Settings

| Field | Value | Description |
| --- | --- | --- |
| Proxy URL | http://proxy.horizonins.com:8080 | Proxy server for API connectivity. |
| Read Buffer Size | 8192 | Read buffer size of 8 KB for HTTP responses. |
| Write Buffer Size | 8192 | Write buffer size of 8 KB for HTTP requests. |
| Timeout | 30s | HTTP request timeout set to 30 seconds. |
| Compression | Gzip | Uses Gzip compression for request body to optimize bandwidth. |
| Max Idle Connections | 10 | Keeps up to 10 idle connections open to the endpoint. |
| Idle Connection Timeout | 60s | Closes idle connections after 60 seconds. |
| HTTP 2 Read Idle Timeout | 30s | Timeout for HTTP/2 read idle connections. |
| HTTP 2 Read Ping Timeout | 15s | Timeout for HTTP/2 read ping connections. |
| Method | GET | Uses the GET method for Proofpoint SIEM API requests. |
| Body | (empty) | Left empty as the GET method does not require a request body. |
| Response Log Path | data | JSON path to logs array in API response (e.g., {"data": [...] }). |
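The Response Log Path tells the source where the array of log entries sits inside each response. A minimal sketch of that lookup follows; extract_logs is a hypothetical helper and the sample responses are invented for illustration.

```python
from typing import Any, List


def extract_logs(response: Any, log_path: str = "") -> List[Any]:
    """Return the logs array at a dotted path, or the response itself if no path."""
    value = response
    if log_path:
        for key in log_path.split("."):
            value = value[key]
    if not isinstance(value, list):
        raise ValueError(f"expected a list at path {log_path!r}")
    return value


if __name__ == "__main__":
    print(extract_logs({"data": [{"id": 1}, {"id": 2}]}, "data"))
    print(extract_logs({"resource": {"logs": [{"id": 3}]}}, "resource.logs"))
    print(extract_logs([{"id": 4}]))  # response is already a direct array
```

A mismatch between this path and the actual response structure is a common reason for a source that connects successfully but ingests no events.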

Test Configuration

  • Save the configuration in the Observo AI interface.

  • Allow at least one collection interval to elapse (or generate test email activity in Proofpoint), then verify ingestion in the Analytics tab.

  • Monitor Observo AI logs for errors and confirm data throughput matches expected email log volume.

  • Use Proofpoint’s admin portal to cross-check log delivery to the API.

Scenario Troubleshooting

  • Authentication Errors: Verify that horizon_proofpoint_sp and ${PROOFPOINT_SECRET_KEY} are valid in the Proofpoint admin portal and have SIEM API access.

  • Connectivity Issues: Ensure the proxy at http://proxy.horizonins.com:8080 allows traffic to https://tap-api-v2.proofpoint.com. Test with curl or Postman.

  • Data Not Ingested: Confirm the data path in Response Log Path matches the API response structure and that the JSON parser is enabled.

  • Request Timeout: Increase the Timeout to 60s if network latency is high or check proxy performance.

  • Inaccessible Host: Verify TLS 1.3 compatibility with Proofpoint’s API and check DNS resolution for tap-api-v2.proofpoint.com.

This configuration enables Horizon Insurance Group to securely ingest Proofpoint email security logs into Observo AI, supporting real-time threat monitoring and compliance.

Troubleshooting

If issues arise with the Proofpoint Logs source in Observo AI, use the following steps to diagnose and resolve them:

  • Verify Configuration Settings:

    • Ensure all fields, such as Endpoint, Service Principal, Secret Key, and parser settings, are correctly entered and match the Proofpoint API setup.

    • Confirm the HTTP method, such as GET, aligns with the Proofpoint SIEM API requirements.

  • Check Authentication:

    • Verify the service principal and secret key are valid, not expired, and have permissions to access the Proofpoint SIEM API.

    • For secret authentication, confirm the secret is accessible in Observo AI's secure storage.

    • Check Observo AI logs for authentication failure errors.

  • Validate Network Connectivity:

    • Check for firewall rules, proxy settings, or VPC endpoint configurations that may block access to the Proofpoint API endpoint.

    • Test connectivity using tools like curl or Postman with similar proxy configurations to verify access.

  • Common Error Messages:

    • "Inaccessible host": May indicate TLS version mismatches, such as TLS 1.3 issues, or DNS problems. Ensure the Proofpoint endpoint supports the required TLS version and check DNS settings.

    • "Authentication failed": Verify that the service principal and secret key are correct and have the necessary permissions for the SIEM API.

    • "Request timeout": Check the Timeout setting and network latency; consider increasing the timeout value.

  • Monitor Logs and Data:

    • Verify that data is being ingested by monitoring the Proofpoint Logs endpoint activity.

    • Use the Analytics tab in the targeted Observo AI pipeline to monitor data volume and ensure expected throughput.

    • Check Observo AI logs for errors or warnings related to data ingestion from the Proofpoint Logs source.

| Issue | Possible Cause | Resolution |
| --- | --- | --- |
| Data not ingested | Incorrect URL or parser configuration | Verify URL and parser settings |
| Authentication errors | Invalid or expired credentials | Check service principal and secret key validity |
| Connectivity issues | Firewall or proxy blocking access | Test network connectivity and VPC endpoints |
| "Inaccessible host" | TLS or DNS issues | Ensure TLS compatibility and check DNS |
| "Authentication failed" | Misconfigured credentials | Verify auth method and permissions |
| "Request timeout" | Network latency or low timeout setting | Increase Timeout or check network |
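When reproducing a failure outside Observo AI, for example with a small test script, it can help to sort low-level errors into the categories listed above. The sketch below does this for Python's standard exception types. The mapping between these exceptions and the messages shown by Observo AI is an assumption for illustration, and classify_error is a hypothetical name.

```python
import socket
import ssl


def classify_error(exc: BaseException) -> str:
    """Map a Python network exception to a troubleshooting category."""
    if isinstance(exc, socket.gaierror):
        return "Inaccessible host (DNS)"
    if isinstance(exc, ssl.SSLError):
        return "Inaccessible host (TLS)"
    if isinstance(exc, (TimeoutError, socket.timeout)):
        return "Request timeout"
    if isinstance(exc, ConnectionError):
        return "Connectivity issue"
    return "Unclassified"


if __name__ == "__main__":
    print(classify_error(socket.gaierror(-2, "Name or service not known")))
    print(classify_error(TimeoutError("timed out")))
    print(classify_error(ConnectionRefusedError("refused")))
```

The checks run from most to least specific, since all of these exception types derive from OSError.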

