Sensitive Data Mask

The Sensitive Data Mask function in Observo AI allows you to obfuscate sensitive information in your data streams.

Purpose

This function helps you comply with data privacy regulations and protect sensitive data from unauthorized access. It masks or obfuscates sensitive values within log entries, using either built-in templates or regular expressions that you define to identify what to redact.

Usage

Select the Sensitive Data Mask transform, then add a Name (required) and a Description (optional).

General Configuration:

  • Bypass Transform: Defaults to disabled. When enabled, this transform is bypassed entirely and events pass through without any modification.

  • Add Filter Conditions: Defaults to disabled. When enabled, only events that satisfy the conditions are processed by this transform; all other events bypass it. Build conditions with AND/OR logic using the "+Rule" and "+Group" buttons.
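The way these two settings gate an event can be sketched in a few lines. This is only an illustration, not Observo AI's implementation: the `Rule`, `Group`, and `should_process` names are hypothetical, and rules are assumed to be simple equality checks.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Union


@dataclass
class Rule:
    """Hypothetical single condition: the event field must equal `expected`."""
    field_name: str
    expected: Any

    def matches(self, event: Dict[str, Any]) -> bool:
        return event.get(self.field_name) == self.expected


@dataclass
class Group:
    """Hypothetical group of rules or nested groups joined by AND or OR."""
    operator: str  # "AND" or "OR"
    children: List[Union["Rule", "Group"]] = field(default_factory=list)

    def matches(self, event: Dict[str, Any]) -> bool:
        results = (child.matches(event) for child in self.children)
        return all(results) if self.operator == "AND" else any(results)


def should_process(event: Dict[str, Any], bypass: bool,
                   conditions: Optional[Group] = None) -> bool:
    """Return True if the transform should mask this event."""
    if bypass:               # Bypass Transform enabled: nothing is processed
        return False
    if conditions is None:   # Add Filter Conditions disabled: everything is processed
        return True
    return conditions.matches(event)
```

With a condition such as `Group("OR", [Rule("event_type", "error"), Rule("event_type", "payment")])`, only error and payment events would be masked; every other event would pass through unchanged.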

Enabled: Defaults to enabled, meaning the transform evaluates all events. Toggle Enabled off to stop this transform from processing events and feeding data to downstream transforms.

Masking Rules: Set of event fields to evaluate and mask. The first rule is added by default. Click the Add button to add another rule, each with the following inputs:

  • Field to Mask: Name of the event field to which the regex and masking rule are applied.

  • Prefix to Match: [Optional] A prefix string that is expected to appear before the sensitive information in the field. For example, with the prefix "SSN", a field value such as "SSN: 123-45-6789" matches: the number "123-45-6789" is masked while "SSN" is preserved.

  • Match With: The matching method. Choose "Sensitive Data Template" to pick from the list of built-in templates, or "Custom Regex" to define your own regular expression to match against the field value.

    • Regex: A regular expression to match against the field value. Any text that the expression matches is masked.

    • Templates [enum]: The templates to match against the field value.

Template options and sub-options:

      • Secrets & Credentials: AWS Access Key ID, AWS Secret Access Key, AWS Account ID, GCP Service Account Key ID, Password, API Key, Token, Credential, Secret, Passphrase, JWT Token, SSH Private Key, URL Basic

      • Credit Card & Banking: Credit Card Numbers (Visa), Credit Card Numbers (Generic), Credit Card Numbers (MasterCard), Credit Card Numbers (American Express), Credit Card Numbers (Discover), Bank Account Numbers, Routing Number (US), Routing Number (Canada), Routing Number (UK), Swift Code, IBAN Code

      • Personal Identifiable Info: Social Security Numbers (SSN), Email Addresses, Phone Numbers (US Format), Dates (MM/DD/YYYY format), Passport Numbers, Drivers License Numbers (US), National Identification Numbers (US), Tax Identification Numbers (TIN/SSN), Health Insurance Numbers (US), Vehicle Identification Numbers (VIN), Medical Record Numbers, Birthday, Address, US Address with Lowercase States, US ZIP Codes (5-digit or 5+4 format)

      • Network & Device: IP Addresses (IPv4), IP Addresses (IPv6)

  • Action On Match [required, enum]: One of the following options:

    • Redact: Replacement Text [string, required]: Specify the replacement text to use when redacting the field value.

    • MD5

    • SHA1

    • SHA2

    • SHA3

    • SeaHash
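The following sketch shows how Prefix to Match and Action On Match might combine. It is not the product's implementation and rests on several assumptions: the `mask_value` helper and `HASHERS` table are hypothetical, a hash action is assumed to replace the match with its hex digest, SHA2 and SHA3 are assumed to mean the 256-bit variants, and the prefix is assumed to tolerate a colon or whitespace before the value. SeaHash is omitted because Python's standard library does not provide it.

```python
import hashlib
import re

# Assumed mapping from action names to hashlib constructors.
HASHERS = {
    "MD5": hashlib.md5,
    "SHA1": hashlib.sha1,
    "SHA2": hashlib.sha256,     # assumption: 256-bit SHA-2
    "SHA3": hashlib.sha3_256,   # assumption: 256-bit SHA-3
}


def mask_value(value, pattern, action, prefix="", replacement=""):
    """Mask each match of `pattern` in `value`.

    If `prefix` is given, only matches that directly follow the prefix
    (plus an optional colon or whitespace) are masked; the prefix is kept.
    """
    if prefix:
        regex = re.compile("(" + re.escape(prefix) + r"[:\s]*)(" + pattern + ")")
    else:
        regex = re.compile("()(" + pattern + ")")

    def substitute(match):
        kept, sensitive = match.group(1), match.group(2)
        if action == "Redact":
            return kept + replacement
        digest = HASHERS[action](sensitive.encode("utf-8")).hexdigest()
        return kept + digest

    return regex.sub(substitute, value)


SSN_PATTERN = r"\d{3}-\d{2}-\d{4}"  # simplified stand-in for a template
print(mask_value("SSN: 123-45-6789", SSN_PATTERN, "Redact",
                 prefix="SSN", replacement="XXX-XX-XXXX"))
# prints: SSN: XXX-XX-XXXX
```

Redaction discards the original value, whereas a hash keeps equal inputs mapped to equal outputs, which can be useful when masked values still need to be correlated downstream.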

Examples

Examples require that Enabled is toggled on.

Mask IPv4 Addresses in a Message Field

Scenario: Redact IPv4 addresses that appear in the message field.

Masking Rules

  • Field to Mask: message

  • Prefix to Match: [empty]

  • Match With: Sensitive Data Template

  • Template: IP Addresses (IPv4) from Network & Device

  • Action on Match: Redact

  • Replacement Text: REDACTED_IP

Input: The third line of the input contains an IPv4 address.

{"timestamp": "2023-08-16T00:00:05Z", "event_type": "event_type_1", "message": "This is event 1"}
{"timestamp": "2023-08-16T00:00:12Z", "event_type": "event_type_3", "message": "This is event 2"}
{"timestamp": "2023-08-16T00:00:20Z", "event_type": "event_type_4", "message": "This is error event 3 from ip: 10.121.20.54"}

Output: The IPv4 address in the third input line is matched and replaced with the mask string "REDACTED_IP". The other lines are scanned and sent to the output unchanged.

{"timestamp": "2023-08-16T00:00:05Z", "event_type": "event_type_1", "message": "This is event 1"}
{"timestamp": "2023-08-16T00:00:12Z", "event_type": "event_type_3", "message": "This is event 2"}
{"timestamp": "2023-08-16T00:00:20Z", "event_type": "event_type_4", "message": "This is error event 3 from ip: REDACTED_IP"}

Results: The sensitive IPv4 address is redacted.
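To check this example outside the product, the same behavior can be approximated in a few lines of Python. This is a rough sketch only: the `redact_ipv4` helper is hypothetical, and the regular expression is a simplified stand-in for the IP Addresses (IPv4) template, whose actual pattern is not documented here.

```python
import json
import re

# Simplified IPv4 pattern (assumption): four dot-separated groups of 1-3 digits.
# It does not check that each group is in the range 0-255.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def redact_ipv4(line, field="message", replacement="REDACTED_IP"):
    """Parse one JSON event and redact IPv4 addresses in a single field."""
    event = json.loads(line)
    if isinstance(event.get(field), str):
        event[field] = IPV4.sub(replacement, event[field])
    return json.dumps(event)


line = ('{"timestamp": "2023-08-16T00:00:20Z", "event_type": "event_type_4", '
        '"message": "This is error event 3 from ip: 10.121.20.54"}')
print(json.loads(redact_ipv4(line))["message"])
# prints: This is error event 3 from ip: REDACTED_IP
```

Only the configured field is touched; events without an IPv4 address in that field come back with the same content.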

Mask Credit Card Numbers in Payload

Scenario: Mask credit card numbers in the payload field.

Masking Rules

  • Field to Mask: payload

  • Prefix to Match: [empty]

  • Match With: Sensitive Data Template

  • Template: Credit Card & Banking

  • Action on Match: Redact

  • Replacement Text: MASKED

Input:

{"timestamp": "2023-08-16T00:05:30Z", "event_type": "payment", "payload": "Payment made with credit card 1234-5678-9012-3456"}
{"timestamp": "2023-08-16T00:10:45Z", "event_type": "payment", "payload": "Payment processed. Card number: 9876-5432-1098-7654"}
{"timestamp": "2023-08-16T00:15:20Z", "event_type": "error", "payload": "Error in transaction with card number: 1111-2222-3333-4444"}

Output: Credit card numbers in the payload field are matched and replaced with the mask string "MASKED".

{"timestamp": "2023-08-16T00:05:30Z", "event_type": "payment", "payload": "Payment made with credit card MASKED"}
{"timestamp": "2023-08-16T00:10:45Z", "event_type": "payment", "payload": "Payment processed. Card number: MASKED"}
{"timestamp": "2023-08-16T00:15:20Z", "event_type": "error", "payload": "Error in transaction with card number: MASKED"}

Results: The credit card numbers are completely masked out to address data privacy concerns.
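A comparable sketch for this example processes a batch of events rather than a single line. The pattern below is an assumption standing in for the Credit Card & Banking templates: it accepts 16 digits in four groups, optionally separated by dashes or spaces, and performs no checksum validation. The `mask_cards` name is hypothetical.

```python
import re

# Assumed generic card pattern: 4 groups of 4 digits, optional "-" or " " between groups.
CARD = re.compile(r"\b\d{4}[- ]?\d{4}[- ]?\d{4}[- ]?\d{4}\b")


def mask_cards(events, field="payload", replacement="MASKED"):
    """Return copies of the events with card numbers in `field` replaced."""
    masked = []
    for event in events:
        event = dict(event)  # copy so the caller's events are not modified
        if isinstance(event.get(field), str):
            event[field] = CARD.sub(replacement, event[field])
        masked.append(event)
    return masked


events = [
    {"event_type": "payment", "payload": "Payment made with credit card 1234-5678-9012-3456"},
    {"event_type": "error", "payload": "Error in transaction with card number: 1111-2222-3333-4444"},
]
for event in mask_cards(events):
    print(event["payload"])
# prints: Payment made with credit card MASKED
# prints: Error in transaction with card number: MASKED
```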

Masking an Entire Field

Scenario: Mask out the entire email field.

Masking Rules

  • Field to Mask: email

  • Prefix to Match: [empty]

  • Match With: Sensitive Data Template

  • Template: Email Addresses from Personal Identifiable Info

  • Action on Match: Redact

  • Replacement Text: ***************

Input:

"email": "[email protected]"

Output:

"email": "***************"

Results: The entire email address field is masked out based on PII concerns.

Use Cases

  • Data Privacy Compliance: Ensure sensitive data like PII (Personally Identifiable Information) is obfuscated to comply with regulations like GDPR or HIPAA.

  • Log Anonymization: Mask sensitive information in logs before storing or analyzing them.

  • Data Sharing: Share data with third parties while protecting sensitive information.

Best Practices

  • Always test the function on sample data to ensure the masking behavior meets your requirements.

  • Combine this function with other Observo AI transforms to create comprehensive data processing pipelines:

    • Filter Event: Apply conditions to filter data before or after masking.

    • Extract Regex: Employ regular expressions on specific fields.

    • Hash Replace: Replace a field with a hash value.
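One way to act on the first best practice is to run sample values through a masking rule and then scan the results with a deliberately looser detector, so that formats the rule misses show up as leaks. The sketch below is illustrative only: `STRICT_SSN`, `LOOSE_SSN`, `mask_ssn`, and `find_leaks` are hypothetical names, and both patterns are assumptions rather than the product's templates.

```python
import re

# Pattern used for masking (assumption): dash-separated SSN only.
STRICT_SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# Looser detector used only for testing: also accepts spaces or no separator.
LOOSE_SSN = re.compile(r"\b\d{3}[- ]?\d{2}[- ]?\d{4}\b")


def mask_ssn(text, replacement="MASKED"):
    """Stand-in for the masking rule under test."""
    return STRICT_SSN.sub(replacement, text)


def find_leaks(samples, mask, detector):
    """Return masked samples that still contain something the detector matches."""
    return [masked for masked in map(mask, samples) if detector.search(masked)]


samples = [
    "SSN: 123-45-6789",
    "order 42 shipped",
    "ssn 123 45 6789",   # space-separated variant the strict rule misses
]
print(find_leaks(samples, mask_ssn, LOOSE_SSN))
# prints: ['ssn 123 45 6789']
```

A non-empty result indicates that the rule needs a broader pattern or an additional rule before it is used on production data.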
