---
title: "Advanced Data Scrubbing"
description: "Learn about using Advanced Data Scrubbing as an alternative way to redact sensitive information just before it is saved in Sentry."
url: https://docs.sentry.io/security-legal-pii/scrubbing/advanced-datascrubbing/
---

# Advanced Data Scrubbing

In addition to [using hooks in your SDK](https://docs.sentry.io/platform-redirect.md?next=%2Fdata-management%2Fsensitive-data%2F) or our [server-side data scrubbing features](https://docs.sentry.io/security-legal-pii/scrubbing//server-side-scrubbing.md) to redact sensitive data, Advanced Data Scrubbing is an alternative way to redact sensitive information just before it is saved in Sentry. It allows you to:

* Define custom regular expressions to match on sensitive data
* Detailed tuning on which parts of an event to scrub
* Partial removal or hashing of sensitive data instead of deletion

##### Rule Precedence

Advanced Data Scrubbing rules take precedence over other Server-Side Data Scrubbing settings. Specifically, any advanced rule will apply regardless of whether the matched field is in Safe Fields or not.

## [A Basic Example](https://docs.sentry.io/security-legal-pii/scrubbing/advanced-datascrubbing.md#a-basic-example)

Navigate to your project- or organization-settings, click *Security and Privacy*, then *Advanced Data Scrubbing*.

1. Click on *Add Rule*. You will be presented with a new dialog.
2. Select *Mask* as *Method*.
3. Select *Credit card numbers* as *Data Type*.
4. Enter `$string` as *Source*.

As soon as you press *Save*, Sentry will attempt to find all credit card numbers in your events going forward, and replace them with a series of `******`.

For a more verbose tutorial check out [this blogpost](https://blog.sentry.io/2020/07/02/sentry-data-wash-now-offering-advanced-scrubbing/).

Rules generally consist of three parts:

* A [*Method*](https://docs.sentry.io/security-legal-pii/scrubbing/advanced-datascrubbing.md#methods): What to do.
* A [*Data Type*](https://docs.sentry.io/security-legal-pii/scrubbing/advanced-datascrubbing.md#data-types): What to look for.
* A [*Source*](https://docs.sentry.io/security-legal-pii/scrubbing/advanced-datascrubbing.md#sources): Where to look.

## [Methods](https://docs.sentry.io/security-legal-pii/scrubbing/advanced-datascrubbing.md#methods)

* *Remove*: Remove the entire field. We may choose to either set it to `null`, remove it entirely, or replace it with an empty string depending on technical constraints.
* *Mask*: Replace all characters with `*`.
* *Hash*: Replace the matched substring with a hashed value.
* *Replace*: Replace the matched substring with a constant *placeholder* value (defaulting to `[Filtered]`).

## [Data Types](https://docs.sentry.io/security-legal-pii/scrubbing/advanced-datascrubbing.md#data-types)

* *Regex Matches*: Custom regular expression. For example: `[a-zA-Z0-9]+`. Some notes:

  * Do not write `/[a-zA-Z0-9]+/g`, as that will search for a literal `/` and `/g`.
  * For case-insensitivity, prefix your regex with `(?i)`.
  * If you're trying to use one of the popular regex "IDEs" like [regex101.com](https://regex101.com/) select the Rust dialect.
  * Escape using `\`, e.g. `\*` is a literal `*`. This works for any of the following characters: `\.+*?()|[]{}^$`.
  * If you want to replace a substring of a specific search string, wrap it in `()` and check "Only replace first capture match".

* *Credit Card Numbers*: Any substrings that look like credit card numbers.

* *Password Fields*: Any substrings that look like they may contain passwords. Any string that mentions passwords, auth tokens or credentials, any variable that is called `password` or `auth`.

* *IP Addresses*: Any substrings that look like valid IPv4 or IPv6 addresses.

* *IMEI Numbers*: Any substrings that look like an IMEI or IMEISV.

* *Email Addresses*

* *UUIDs*

* *PEM Keys*: Any substrings that look like the content of a PEM-keyfile.

* *Auth in URLs*: Usernames and passwords in URLs like `https://user:pass@example.com/foo`.

* *US social security numbers*: 9-digit social security numbers for the USA.

* *Usernames in filepaths*: For example `myuser` in `/Users/myuser/file.txt`, `C:\Users\myuser\file.txt`, `C:\Documents and Settings\myuser\file.txt`, `/home/myuser/file.txt`, ...

* *MAC Addresses*

* *Anything*: Matches any value. This is useful if you want to remove a certain JSON key by path using [*Sources*](https://docs.sentry.io/security-legal-pii/scrubbing/advanced-datascrubbing.md#sources) regardless of the value.

##### Sentry does not know what your code does

Sentry does not know if a local variable that looks like a credit card number actually is one. As such, you need to expect not only false-positives but also false-negatives. [*Sources*](https://docs.sentry.io/security-legal-pii/scrubbing/advanced-datascrubbing.md#sources) can help you in limiting the scope in which your rule runs.

## [Sources](https://docs.sentry.io/security-legal-pii/scrubbing/advanced-datascrubbing.md#sources)

Selectors allow you to restrict rules to certain parts of the event. This is useful to unconditionally remove certain data by event attribute, and can also be used to conservatively test rules on real data. A few examples:

* `**` to scrub [all default event PII fields](https://docs.sentry.io/security-legal-pii/scrubbing//server-side-scrubbing/event-pii-fields.md) (other fields, like the span description, require specific selectors)
* `$error.value` to scrub in the exception message
* `$message` to scrub the event-level log message
* `extra.'My Value'` to scrub the key `My Value` in "Additional Data"
* `extra.**` to scrub everything in "Additional Data"
* `$http.headers.x-custom-token` to scrub the request header `X-Custom-Token`
* `$user.ip_address` to scrub the user's IP address
* `$frame.vars.foo` to scrub a stack trace frame variable called `foo`
* `contexts.device.timezone` or `contexts.culture.timezone` to scrub a key from the Device context
* `tags.server_name` to scrub the tag `server_name`

All key names are treated case-insensitively.

### [Using an event ID to auto-complete sources](https://docs.sentry.io/security-legal-pii/scrubbing/advanced-datascrubbing.md#using-an-event-id-to-auto-complete-sources)

Above the *Source* input field you will find another input field for an event ID. Providing a value there allows for better auto-completion of arbitrary *Additional Data* fields and variable names.

The event ID is purely optional and the value is not saved as part of your settings. Data scrubbing settings always apply to all new events within a project/organization (going forward).

### [Advanced source names](https://docs.sentry.io/security-legal-pii/scrubbing/advanced-datascrubbing.md#advanced-source-names)

Data scrubbing always works on the raw event payload. Keep in mind that some fields in the UI may be called differently in the JSON schema. When looking at an event there should always be a link called "JSON" present that allows you to see what the data scrubber sees.

For example, what is called "Additional Data" in the UI is called `extra` in the event payload. To remove a specific key called `foo`, you would write:

```bash
[Remove] [Anything] from [extra.foo]
```

Another example. Sentry knows about two kinds of error messages: the exception message, and the top-level log message. Here is an example of how such an event payload as sent by the SDK (and downloadable from the UI) would look like:

```json
{
  "logentry": {
    "formatted": "Failed to roll out the dinglebop"
  },
  "exception": {
    "values": [
      {
        "type": "ZeroDivisionError",
        "value": "integer division or modulo by zero"
      }
    ]
  }
}
```

Since the "error message" is taken from the `exception`'s `value`, and the "message" is taken from `logentry`, we would have to write the following to remove both from the event:

```bash
[Remove] [Anything] from [exception.values.*.value]
[Remove] [Anything] from [logentry.formatted]
```

### [Boolean Logic](https://docs.sentry.io/security-legal-pii/scrubbing/advanced-datascrubbing.md#boolean-logic)

You can combine sources using boolean logic.

* Prefix with `!` to invert the source. `foo` matches the JSON key `foo`, while `!foo` matches everything but `foo`.
* Build the conjunction (AND) using `&&`, such as: `foo && !extra.foo` to match the key `foo` except when inside of `extra`.
* Build the disjunction (OR) using `||`, such as: `foo || bar` to match `foo` or `bar`.

### [Wildcards](https://docs.sentry.io/security-legal-pii/scrubbing/advanced-datascrubbing.md#wildcards)

* `**` matches all subpaths, so that `foo.**` matches all JSON keys within `foo`.
* `*` matches a single path item, so that `foo.*` matches all JSON keys one level below `foo`.

When using `**` selectors in scrubbing rules, be aware that they only apply to the [default event PII fields](https://docs.sentry.io/security-legal-pii/scrubbing/server-side-scrubbing/event-pii-fields.md). Fields not on that list will **not** be scrubbed automatically, even if matched by a `**` selector.

For example, given the following event payload:

```json
{
  "exception": {
    "values": [
      {
        "type": "Error",
        "value": "An error occurred",
        "stacktrace": {
          "frames": [
            {
              "filename": "test/example/script.js",
              "abs_path": "test/example/script.js?id=12345"
            }
          ]
        }
      }
    ]
  }
}
```

Using the scrubbing rule `[Mask] [Anything] from [$frame.**]` will not scrub the `filename` or `abs_path` fields, since they're not included in the default PII field list.

To scrub those fields, use one of the following rules:

```bash
[Mask] [Anything] from [$frame.*]
[Mask] [Anything] from [$frame.filename]
[Mask] [Anything] from [$frame.abs_path]
```

### [Value Types](https://docs.sentry.io/security-legal-pii/scrubbing/advanced-datascrubbing.md#value-types)

Select subsections by JSON-type using the following:

* `$string`: Matches any string value
* `$number`: Matches any integer or float value
* `$datetime`: Matches any field in the event that represents a timestamp
* `$array`: Matches any JSON array value
* `$object`: Matches any JSON object

Select known parts of the schema using the following:

* `$error`: Matches a single exception instance. Alias for `exception.values.*`
* `$stack`: Matches a stack trace instance. Alias for `stacktrace || $error.stacktrace || $thread.stacktrace`
* `$frame`: Matches a frame in a stack trace. Alias for `$stacktrace.frames.*`
* `$http`: Matches the HTTP request context of an event. Alias for `request`
* `$user`: Matches the user context of an event. Alias for `user`
* `$message`: Matches the top-level log message. Alias for `$logentry.formatted`
* `$logentry`: Matches the `logentry` attribute of an event. Alias for `logentry`
* `$thread`: Matches a single thread instance. Alias for `threads.values.*`
* `$breadcrumb`: Matches a single breadcrumb. Alias for `breadcrumbs.values.*`
* `$span`: Matches a [trace span](https://docs.sentry.io/product/sentry-basics/tracing/distributed-tracing.md#spans). Alias for `spans.*`
* `$sdk`: Matches the SDK context. Alias for `sdk`

Select attachment and parts of attachments, see [Attachment Scrubbing](https://docs.sentry.io/security-legal-pii/scrubbing/attachment-scrubbing.md) for details.

* `$attachments`: Root selector for attachments.
* `$minidump`: Selector for [minidump](https://docs.sentry.io/platforms/native/guides/minidumps.md) attachments.
* `$binary`: Matches all binary data fields in attachments.

### [Escaping Special Characters](https://docs.sentry.io/security-legal-pii/scrubbing/advanced-datascrubbing.md#escaping-special-characters)

If the object key you want to match contains whitespace or special characters, you can use quotes to escape it:

```bash
[Remove] [Anything] from [extra.'my special value']
```

This matches the key `my special value` in *Additional Data*.

To escape `'` (single quote) within the quotes, replace it with `''` (two quotes):

```bash
[Remove] [Anything] from [extra.'my special '' value']
```

This matches the key `my special ' value` in *Additional Data*.

## [Known Limitations of server-side data scrubbing](https://docs.sentry.io/security-legal-pii/scrubbing/advanced-datascrubbing.md#known-limitations-of-server-side-data-scrubbing)

The following limitations generally apply to all server-side data scrubbing, be it basic Safe Fields usage or Advanced Data Scrubbing.

* Hashing, masking or replacing a JSON object, array or number (anything that is not a string) cannot be done in all circumstances as it would change the JSON type of the value and violate assumptions Sentry's internals make about the data schema. Data scrubbing will ignore the *Method* in those cases and always remove/replace with `null` as that is always safe.

* Sentry's internals require that the event user's IP address must either be `null` or a valid IPv4/IPv6 address. If you're trying to hash, mask or replace IP addresses, data scrubbing will move the replacement value into the user ID (if one is not already set) in order to avoid breaking this requirement while still providing useful data for the Users count on an issue.

* In stack traces, scrubbing works on file paths but not on a file's base name. This would violate assumptions in the processing pipeline resulting in a poor user experience. Instead, you can scrub a file's base name in the SDK itself, using the [`RewriteFrames` integration](https://docs.sentry.io/platforms/javascript/configuration/integrations/rewriteframes.md) or [`beforeSend`](https://docs.sentry.io/platforms/javascript/configuration/filtering.md).

## [Data Scrubbing for Logs](https://docs.sentry.io/security-legal-pii/scrubbing/advanced-datascrubbing.md#data-scrubbing-for-logs)

[Logs](https://docs.sentry.io/product/explore/logs.md) will honor the default [Server-Side Data Scrubbing rules](https://docs.sentry.io/security-legal-pii/scrubbing/server-side-scrubbing.md#geographic-information) applied to your projects. To add advanced data scrubbing rules for Logs, start by choosing the dataset.

Deep wild cards ('\*\*') or $string / $number value selectors will apply across all datasets. For specific data types you can select the log attribute that needs to be scrubbed.

You can also edit existing scrubbing rules
