Exception Summarization

The Exception Summarization Optimizer in Observo AI summarizes exception data by combining the lines of a multiline exception into a single event.

Purpose

The Exception Summarization Optimizer is designed to analyze and condense large volumes of exception data into concise, actionable summaries. It helps organizations quickly identify patterns, root causes, and key insights from error logs to improve troubleshooting efficiency. By leveraging AI, the transform enhances observability, reduces noise, and accelerates incident resolution.

Usage

Select the Exception Summarization Optimizer transform. Add a Name (required) and a Description (optional).

General Configuration:

  • Bypass Transform: Defaults to disabled. When enabled, this transform is bypassed entirely and events pass through without any modifications.

  • Add Filter Conditions: Defaults to disabled. When enabled, only events that satisfy the filter conditions are processed; all other events bypass this transform. Build conditions with AND/OR logic using the "+Rule" and "+Group" buttons.

Exception Summarization: IdentifyPodOrHost (pulldown):

  • Identify Log Groups: Used to identify log groups. Defaults to empty. Click the Add button to add a new field:

    • Field Name: Field name whose value should be matched for identifying Log Groups.

    Examples:
    kubernetes.component.app
    • Regex/Substring for matching field values: You can provide a string or regex to match against.

    Examples:
    stage.*
    content-app-prod
    • Use as Regex: Interpret the matcher as Regex. Defaults to disabled. Toggle on to enable.

  • Pod or Host columns: Add column names which help uniquely identify a pod/host. Defaults to empty. Click the Add button to add a new field.

Example:
kubernetes.pod_id
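
The IdentifyPodOrHost settings above can be pictured with a small sketch. This is not Observo AI's implementation; the helper names (`get_field`, `in_log_group`, `pod_key`) are hypothetical, and it assumes that a regex matcher may match anywhere in the field value while a non-regex matcher is a plain substring test.

```python
import re
from typing import Any, List, Optional, Tuple


def get_field(event: dict, path: str) -> Optional[Any]:
    """Resolve a dotted path such as 'kubernetes.pod_id' in a nested dict."""
    current: Any = event
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def in_log_group(event: dict, field: str, matcher: str, use_regex: bool) -> bool:
    """Return True when the field value matches the configured matcher.

    Assumption: regex matchers use search semantics, substring matchers
    use containment. The product may behave differently.
    """
    value = get_field(event, field)
    if not isinstance(value, str):
        return False
    if use_regex:
        return re.search(matcher, value) is not None
    return matcher in value


def pod_key(event: dict, columns: List[str]) -> Tuple[Any, ...]:
    """Build the key that identifies one pod/host from the configured columns."""
    return tuple(get_field(event, column) for column in columns)


event = {"kubernetes": {"component": {"app": "stage-content-app"}, "pod_id": "pod-1"}}

print(in_log_group(event, "kubernetes.component.app", "stage.*", use_regex=True))            # True
print(in_log_group(event, "kubernetes.component.app", "content-app-prod", use_regex=False))  # False
print(pod_key(event, ["kubernetes.pod_id"]))                                                 # ('pod-1',)
```

Events that share the same key are the ones whose lines can belong to the same multiline exception, which is why the Pod or Host columns should identify a single log stream.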

ExceptionSummarizationConfigs (pulldown):

  • Field Name which contains raw log: Column which contains the raw log.

Examples:
log
  • Max Events: The maximum number of events to group together. Defaults to 30.

  • Flush Time (seconds): The maximum amount of time, in seconds, to wait before flushing events to the destination. Defaults to 30 seconds.

  • Regex for identifying start of logline: Regex for identifying start of logline for a multiline log.

Examples:
^[^\s]
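
To illustrate how these three settings could interact, here is a minimal sketch of grouping lines from one pod. It is an illustration under stated assumptions, not the transform's actual code: `group_multiline` is a hypothetical function, arrival times are passed in explicitly instead of using a real timer, and a group is closed when the next line matches the start regex, when the group already holds Max Events lines, or when Flush Time has elapsed since the group was opened.

```python
import re


def group_multiline(timed_lines, start_pattern, max_events, flush_seconds):
    """Group (arrival_time, line) pairs into multiline records."""
    start = re.compile(start_pattern)
    groups, current, opened_at = [], [], None
    for arrived_at, line in timed_lines:
        should_close = current and (
            start.search(line)                          # next logline begins
            or len(current) >= max_events               # Max Events reached
            or arrived_at - opened_at >= flush_seconds  # Flush Time elapsed
        )
        if should_close:
            groups.append(current)
            current = []
        if not current:
            opened_at = arrived_at
        current.append(line)
    if current:
        groups.append(current)
    return groups


lines = [
    (0.0, "2024-02-15 10:30:00 ERROR NullPointerException"),
    (0.1, "    at com.example.PaymentService.processPayment(PaymentService.java:125)"),
    (0.2, "    at com.example.PaymentController.handlePayment(PaymentController.java:45)"),
    (5.0, "2024-02-15 10:30:05 INFO retrying"),
]

groups = group_multiline(lines, r"^\d{4}-\d{2}-\d{2}", max_events=100, flush_seconds=30)
print([len(g) for g in groups])  # [3, 1]
```

In this sketch, a Max Events value that is too small splits a long stack trace across several groups, and a long Flush Time delays delivery of the last exception in a quiet stream.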

Examples

Exception Summarization

Scenario: Summarize Kubernetes pod logs (log.message) for log groups identified by kubernetes.component.app, grouped by the kubernetes.pod_id and kubernetes.container.name fields.

ExceptionSummarizationConfigs (Pulldown)

  • Field Name which contains raw log: log.message

  • Max Events: 100

  • Flush Time (seconds): 30

  • Regex for identifying start of logline: ^\\d{4}-\\d{2}-\\d{2}

IdentifyPodOrHost (Pulldown)

  • Identify Log Groups

    • Field Name: kubernetes.component.app

    • Regex/Substring for matching field values: kubernetes.component.app

    • Use as Regex: true

  • Pod or Host columns: kubernetes.pod_id, kubernetes.container.name

Input

[
  {
    "timestamp": "2024-02-15T10:30:00Z",
    "log": {
      "message": "2024-02-15 10:30:00 ERROR NullPointerException: Cannot invoke \"String.length()\" because \"input\" is null"
    },
    "kubernetes": {
      "component": {
        "app": "payment-service-prod"
      },
      "pod_id": "payment-pod-abc123",
      "container": {
        "name": "payment-container"
      }
    }
  },
  {
    "timestamp": "2024-02-15T10:30:00Z",
    "log": {
      "message": "    at com.example.PaymentService.processPayment(PaymentService.java:125)"
    },
    "kubernetes": {
      "component": {
        "app": "payment-service-prod"
      },
      "pod_id": "payment-pod-abc123",
      "container": {
        "name": "payment-container"
      }
    }
  },
  {
    "timestamp": "2024-02-15T10:30:00Z",
    "log": {
      "message": "    at com.example.PaymentController.handlePayment(PaymentController.java:45)"
    },
    "kubernetes": {
      "component": {
        "app": "payment-service-prod"
      },
      "pod_id": "payment-pod-abc123",
      "container": {
        "name": "payment-container"
      }
    }
  }
]

Output

{
  "ExceptionSummarization": true,
  "kubernetes": {
    "component": {
      "app": "payment-service-prod"
    },
    "container": {
      "name": "payment-container"
    },
    "pod_id": "payment-pod-abc123"
  },
  "log": {
    "message": "2024-02-15 10:30:00 ERROR NullPointerException: Cannot invoke \"String.length()\" because \"input\" is null at com.example.PaymentService.processPayment(PaymentService.java:125) at com.example.PaymentController.handlePayment(PaymentController.java:45)"
  },
  "timestamp": "2024-02-15T10:30:00Z"
}

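Results: The three input events that make up the multiline exception are combined into a single event. The log.message field holds the exception line followed by its stack trace lines, the timestamp and the Kubernetes component, pod ID, and container name are preserved, and the ExceptionSummarization field is set to true.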
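
The merge in this example can be approximated in a few lines of Python. This is a rough sketch rather than the transform's real logic: `summarize` is a hypothetical helper, and joining the trimmed lines with a single space is an assumption inferred from the output shown above.

```python
import copy


def summarize(events, raw_log_field):
    """Collapse the events of one multiline exception into a single event."""
    path = raw_log_field.split(".")

    def read(event):
        node = event
        for key in path:
            node = node[key]
        return node

    # Start from the first event so its timestamp and metadata are kept.
    merged = copy.deepcopy(events[0])
    node = merged
    for key in path[:-1]:
        node = node[key]
    # Assumption: lines are trimmed and joined with one space.
    node[path[-1]] = " ".join(read(event).strip() for event in events)
    merged["ExceptionSummarization"] = True
    return merged


events = [
    {"timestamp": "2024-02-15T10:30:00Z",
     "log": {"message": "2024-02-15 10:30:00 ERROR NullPointerException"}},
    {"timestamp": "2024-02-15T10:30:00Z",
     "log": {"message": "    at com.example.PaymentService.processPayment(PaymentService.java:125)"}},
]

result = summarize(events, "log.message")
print(result["log"]["message"])
```

The input events are left unchanged because the first event is deep-copied before the merged message is written.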

Exception Summarization Optimizer Best Practices

Here are the top best practices for Exception Summarization:

  1. Precisely Identify Log Groups: Configure the IdentifyPodOrHost settings by specifying the correct field (e.g. kubernetes.component.app), matching criteria (regex or substring like "payment-service.*"), and unique pod/host identifiers (e.g. kubernetes.pod_id and kubernetes.container.name) to ensure that only related logs are grouped together.

  2. Specify the Raw Log Field Accurately: In the ExceptionSummarizationConfigs, set the “Field Name which contains raw log” to the appropriate column (e.g. log.message) so that the transform correctly targets the multiline exception data.

  3. Set Sensible Grouping Limits: Define a reasonable value for “Max Events” to control how many log entries are aggregated together and use “Flush Time (seconds)” to balance between processing latency and completeness of the grouped exception.

  4. Implement Effective Regex for Logline Start: Use a well-crafted regex (for example, "^\d{4}-\d{2}-\d{2}") to accurately detect the beginning of each logline in a multiline exception, ensuring that the summary correctly reconstructs the complete exception message.

  5. Test with Realistic Examples: Leverage the provided input/output examples to validate your configuration, ensuring that the transform correctly aggregates exception details (such as exception type, stacktrace length, and root cause) before deploying into production.
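
Practices 4 and 5 can be combined by checking a candidate start-of-logline regex against sample lines before it goes into the transform. The snippet below is a minimal sketch using Python's `re` module; `start_flags` and `SAMPLE_LINES` are hypothetical names, and the regex engine used by the transform may differ from Python's in edge cases.

```python
import re

SAMPLE_LINES = [
    "2024-02-15 10:30:00 ERROR IllegalStateException: payment failed",
    "    at com.example.PaymentService.processPayment(PaymentService.java:125)",
    "Caused by: java.lang.NullPointerException: input is null",
]


def start_flags(pattern, lines):
    """Report, per line, whether the pattern marks it as the start of a logline."""
    compiled = re.compile(pattern)
    return [compiled.search(line) is not None for line in lines]


# A date-anchored pattern treats only the first line as a new logline.
print(start_flags(r"^\d{4}-\d{2}-\d{2}", SAMPLE_LINES))  # [True, False, False]

# "Any non-whitespace first character" also fires on the "Caused by:" line,
# which would split one exception into two groups.
print(start_flags(r"^[^\s]", SAMPLE_LINES))  # [True, False, True]
```

Running candidate patterns over a handful of real log lines, including "Caused by:" continuations, shows quickly whether a pattern would split an exception that should stay together.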

