feat(global-filters): Introduce global generic filters (#3161)
This PR adds support for global generic filters.

Generic filters are a new schema of inbound filters with generic
conditions to avoid creating a new filter for each use case.
Project configs already support [project-specific generic
filters](ee3036c).
Global generic filters have been designed to be extendable,
future-compatible, independent, and flexible to work well in isolation
but also with project generic filters.

For event processing, Relay uses both the project config of the event
and the global config. Before running the filters, Relay merges the
generic filters from the two configs and applies them to the event. If
a filter matches, Relay drops the event and generates an outcome;
otherwise, it forwards the event to the next processing step.
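
A minimal sketch of that decision, not Relay's actual code: `Event`, `Filter`, `Decision`, and `apply_filters` are hypothetical names, the condition is reduced to a single string equality, and reporting the matching filter ID in the outcome is an assumption.

```rust
/// Hypothetical, simplified event: only the field the sample condition inspects.
pub struct Event {
    pub exception: String,
}

/// Hypothetical generic filter with an equality condition on the exception.
pub struct Filter {
    pub id: String,
    pub is_enabled: bool,
    pub matches_exception: String,
}

/// Result of running the merged filters over one event.
#[derive(Debug, PartialEq)]
pub enum Decision {
    /// The event is dropped; the matching filter ID is kept for the outcome (assumption).
    Drop { filter_id: String },
    /// No filter matched; the event moves on to the next processing step.
    Forward,
}

/// Walks the already-merged filters in priority order and stops at the first match.
pub fn apply_filters(filters: &[Filter], event: &Event) -> Decision {
    filters
        .iter()
        .filter(|f| f.is_enabled)
        .find(|f| f.matches_exception == event.exception)
        .map(|f| Decision::Drop { filter_id: f.id.clone() })
        .unwrap_or(Decision::Forward)
}
```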

## Why

Currently, adding a new generic filter results in an increase in the
size of project configs (`config_size * number_of_filters`). This can
result in a significant increase in memory usage in all the services
that store project configs (`increase_per_project *
number_of_projects`), especially Relay. Incorporating generic filters
into global configs allows Relay to apply these filters similarly with
minimal overhead.

There are additional, less important goals that this approach also
achieves: not increasing network usage, not impacting project config
computation time, etc.

## Previous work

Generic inbound filters were introduced in project configs in
ee3036c.
They are currently not enabled on `sentry` as supporting these filters
in the global config was prioritized first.

## Protocol

Global generic filters have the same protocol as project generic
filters, but they exist in the global config. A new optional `filters`
key is added as a top-level key in global configs. This object expects
the following properties:
- `version`: the version of the generic filters, for future-compatibility, as
a `u16`.
- `filters`: a list of generic filters sorted by descending priority,
with the same protocol as with the project generic filters.

Example of a full protocol:
```json
{
	// ... other global config keys ...
	"filters": {
		"version": 1,
		"filters": [
			{
				"id": "sample-errors",
				"isEnabled": true,
				"condition": {
					"op": "eq",
					"name": "event.exceptions",
					"value": "Sample Error"
				}
			}
		]
	}
}
```

## Decisions and trade-offs

### Protocol

To support external Relays with future compatibility and make updates
safely:
- `version` was introduced so that Relay can skip filters in a version
it does not support.
- When filters are unsupported, Relay forwards events upstream without
extracting metrics. This allows the next updated Relay in the chain
that supports the filters to run them and, if the event is kept,
extract metrics from it.
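
The version gate could look roughly like the sketch below. `MAX_SUPPORTED_VERSION`, `Action`, and `action_for_version` are hypothetical names, and treating every version up to the maximum as supported is an assumption, not something this PR states.

```rust
/// Highest generic filters version understood by this hypothetical Relay build.
pub const MAX_SUPPORTED_VERSION: u16 = 1;

/// What Relay does with an event, depending on filter support.
#[derive(Debug, PartialEq)]
pub enum Action {
    /// The filters are understood and can be applied locally.
    RunFilters,
    /// The filters are too new: forward upstream and leave metric extraction
    /// to the next Relay in the chain that supports them.
    ForwardWithoutMetrics,
}

/// Assumption: any version up to the maximum is supported.
pub fn action_for_version(version: u16) -> Action {
    if version <= MAX_SUPPORTED_VERSION {
        Action::RunFilters
    } else {
        Action::ForwardWithoutMetrics
    }
}
```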

Relay merges the configs prioritizing the project config. Each filter
has an `isEnabled` flag to maximize flexibility and allow:
- Defining a filter for a single project in its config.
- Defining a filter for all projects in the global config.
- Defining a filter for most projects by enabling it in the global
config and disabling it in the applicable project configs.
- Defining a filter for some projects by disabling it in the global
config and enabling it in the applicable project configs.
This is the simplest alternative to support all the use cases.
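
As an illustration of those four cases, here is a small sketch of a merge that prioritizes the project config. `Filter` and `merge` are hypothetical and conditions are omitted; the real merge in Relay may be structured differently.

```rust
use std::collections::HashSet;

/// Hypothetical filter entry; the condition is left out for brevity.
#[derive(Clone, Debug, PartialEq)]
pub struct Filter {
    pub id: String,
    pub is_enabled: bool,
}

/// Merges project and global filters, with project entries shadowing global
/// entries that share the same ID. Disabled entries are dropped only after
/// shadowing, so a disabled project entry also switches off the global one.
pub fn merge(project: &[Filter], global: &[Filter]) -> Vec<Filter> {
    let mut seen = HashSet::new();
    project
        .iter()
        .chain(global.iter())
        .filter(|f| seen.insert(f.id.clone()))
        .filter(|f| f.is_enabled)
        .cloned()
        .collect()
}
```

With this shape, a project opts out of a global filter by repeating its ID with `isEnabled: false`, and opts in to a globally disabled filter by repeating it with `isEnabled: true`.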

### Relay implementation details

Relay types the filters internally as an ordered map with custom
(de)serialization, keeping the order in the protocol, to benefit from:
- `O(1)` retrieval of filters given the ID.
- Removing duplicated filters.
Removing duplicates is a prerequisite to merge the filters from both
configs properly. Since the list of filters in the payload is expected
to be sorted by descending priority, only the first occurrence of a
filter is selected and the rest are discarded during deserialization.
Discarded filters are not forwarded downstream.
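
The following std-only sketch shows the idea of an insertion-ordered map that keeps the first occurrence of each ID. This PR adds the `indexmap` crate as a dependency for this purpose; `OrderedFilters` and its methods are hypothetical stand-ins built from a `Vec` plus a `HashMap` index.

```rust
use std::collections::HashMap;

/// Hypothetical filter entry; the condition is left out for brevity.
#[derive(Clone, Debug, PartialEq)]
pub struct Filter {
    pub id: String,
    pub is_enabled: bool,
}

/// Insertion-ordered collection with lookup by filter ID.
#[derive(Default)]
pub struct OrderedFilters {
    entries: Vec<Filter>,
    index: HashMap<String, usize>,
}

impl OrderedFilters {
    /// Builds the map from a priority-sorted list, keeping only the first
    /// occurrence of each ID and discarding later duplicates.
    pub fn from_list(list: Vec<Filter>) -> Self {
        let mut map = Self::default();
        for filter in list {
            if !map.index.contains_key(&filter.id) {
                map.index.insert(filter.id.clone(), map.entries.len());
                map.entries.push(filter);
            }
        }
        map
    }

    /// Looks a filter up by ID in expected constant time.
    pub fn get(&self, id: &str) -> Option<&Filter> {
        self.index.get(id).map(|&i| &self.entries[i])
    }

    /// Iterates in the original (priority) order.
    pub fn iter(&self) -> std::slice::Iter<'_, Filter> {
        self.entries.iter()
    }
}
```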

`filters` is typed as an `ErrorBoundary` instead of an `Option`:
- Errors during deserialization indicate breaking changes. Relay skips
metric extraction and dynamic sampling in these cases and leaves them
to the next Relay in the chain.
- Note: processing Relays run dynamic sampling and extract metrics
even if filters are broken.
- An empty entry in a payload is deserialized as an empty list of
filters, effectively a noop.
- A list of filters in a supported version is appropriately deserialized
and applied.
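
To make the three cases concrete, here is a simplified stand-in. This `ErrorBoundary` stores the error as a `String` and is not Relay's actual type; `from_payload` is a hypothetical helper that mimics what deserialization would produce, and the filter list is reduced to filter IDs.

```rust
/// Simplified stand-in for Relay's `ErrorBoundary`; the error is a plain string here.
#[derive(Debug)]
pub enum ErrorBoundary<T> {
    Err(String),
    Ok(T),
}

impl<T> ErrorBoundary<T> {
    /// Returns the value if deserialization succeeded.
    pub fn ok(&self) -> Option<&T> {
        match self {
            ErrorBoundary::Err(_) => None,
            ErrorBoundary::Ok(value) => Some(value),
        }
    }
}

/// Hypothetical, reduced config: filters are represented by their IDs only.
#[derive(Debug, Default, PartialEq)]
pub struct GenericFiltersConfig {
    pub version: u16,
    pub filters: Vec<String>,
}

/// Maps the three payload cases onto the boundary: a missing entry becomes an
/// empty config (a noop), a valid entry is kept, and a broken entry is kept as
/// an error so that callers can tell "no filters" apart from "unreadable filters".
pub fn from_payload(
    entry: Option<Result<GenericFiltersConfig, String>>,
) -> ErrorBoundary<GenericFiltersConfig> {
    match entry {
        None => ErrorBoundary::Ok(GenericFiltersConfig::default()),
        Some(Ok(config)) => ErrorBoundary::Ok(config),
        Some(Err(error)) => ErrorBoundary::Err(error),
    }
}
```

An `Option` would collapse the missing and broken cases into `None`, which is the distinction the error boundary preserves.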
iker-barriocanal authored Mar 1, 2024
1 parent 18a197e commit 2800e95
Showing 16 changed files with 1,249 additions and 90 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -8,6 +8,7 @@
- Parametrize transaction in dynamic sampling context. ([#3141](https://github.com/getsentry/relay/pull/3141))
- Adds ReplayVideo envelope-item type. ([#3105](https://github.com/getsentry/relay/pull/3105))
- Parse & scrub span description for supabase. ([#3153](https://github.com/getsentry/relay/pull/3153), [#3156](https://github.com/getsentry/relay/pull/3156))
- Introduce generic filters in global configs. ([#3161](https://github.com/getsentry/relay/pull/3161))

**Bug Fixes**:

2 changes: 2 additions & 0 deletions Cargo.lock


1 change: 1 addition & 0 deletions Cargo.toml
@@ -25,6 +25,7 @@ futures = { version = "0.3", default-features = false, features = ["std"] }
insta = { version = "1.31.0", features = ["json", "redactions", "ron"] }
hash32 = "0.3.1"
hashbrown = "0.14.3"
indexmap = "2.0.0"
itertools = "0.10.5"
once_cell = "1.13.1"
parking_lot = "0.12.1"
3 changes: 2 additions & 1 deletion relay-dynamic-config/Cargo.toml
@@ -23,7 +23,7 @@ relay-event-normalization = { path = "../relay-event-normalization" }
relay-filter = { path = "../relay-filter" }
relay-log = { path = "../relay-log" }
relay-pii = { path = "../relay-pii" }
-relay-protocol = { path = "../relay-protocol" }
+relay-protocol = { path = "../relay-protocol" }
relay-quotas = { path = "../relay-quotas" }
relay-sampling = { path = "../relay-sampling" }
serde = { workspace = true }
@@ -33,3 +33,4 @@ smallvec = { workspace = true }
[dev-dependencies]
insta = { workspace = true }
similar-asserts = { workspace = true }
indexmap = { workspace = true }
38 changes: 38 additions & 0 deletions relay-dynamic-config/src/global.rs
@@ -5,10 +5,13 @@ use std::path::Path;

use relay_base_schema::metrics::MetricNamespace;
use relay_event_normalization::MeasurementsConfig;
use relay_filter::GenericFiltersConfig;
use relay_quotas::Quota;
use serde::{Deserialize, Serialize};
use serde_json::Value;

use crate::ErrorBoundary;

/// A dynamic configuration for all Relays passed down from Sentry.
///
/// Values shared across all projects may also be included here, to keep
@@ -23,6 +26,12 @@ pub struct GlobalConfig {
    /// Quotas that apply to all projects.
    #[serde(skip_serializing_if = "Vec::is_empty")]
    pub quotas: Vec<Quota>,
    /// Configuration for global inbound filters.
    ///
    /// These filters are merged with generic filters in project configs before
    /// applying.
    #[serde(skip_serializing_if = "is_err_or_empty")]
    pub filters: ErrorBoundary<GenericFiltersConfig>,
    /// Sentry options passed down to Relay.
    #[serde(
        deserialize_with = "default_on_error",
@@ -46,6 +55,21 @@ impl GlobalConfig {
            Ok(None)
        }
    }

    /// Returns the generic inbound filters.
    pub fn filters(&self) -> Option<&GenericFiltersConfig> {
        match &self.filters {
            ErrorBoundary::Err(_) => None,
            ErrorBoundary::Ok(f) => Some(f),
        }
    }
}

fn is_err_or_empty(filters_config: &ErrorBoundary<GenericFiltersConfig>) -> bool {
    match filters_config {
        ErrorBoundary::Err(_) => true,
        ErrorBoundary::Ok(config) => config.version == 0 && config.filters.is_empty(),
    }
}

/// All options passed down from Sentry to Relay.
@@ -246,6 +270,20 @@ mod tests {
"namespace": null
}
],
"filters": {
"version": 1,
"filters": [
{
"id": "myError",
"isEnabled": true,
"condition": {
"op": "eq",
"name": "event.exceptions",
"value": "myError"
}
}
]
},
"options": {
"profiling.profile_metrics.unsampled_profiles.enabled": true
}
12 changes: 6 additions & 6 deletions relay-dynamic-config/src/project.rs
@@ -3,7 +3,7 @@ use relay_event_normalization::{
BreakdownsConfig, MeasurementsConfig, PerformanceScoreConfig, SpanDescriptionRule,
TransactionNameRule,
};
-use relay_filter::FiltersConfig;
+use relay_filter::ProjectFiltersConfig;
use relay_pii::{DataScrubbingConfig, PiiConfig};
use relay_quotas::Quota;
use relay_sampling::SamplingConfig;
@@ -32,8 +32,8 @@ pub struct ProjectConfig {
#[serde(skip_serializing_if = "Option::is_none")]
pub grouping_config: Option<Value>,
/// Configuration for filter rules.
-#[serde(skip_serializing_if = "FiltersConfig::is_empty")]
-pub filter_settings: FiltersConfig,
+#[serde(skip_serializing_if = "ProjectFiltersConfig::is_empty")]
+pub filter_settings: ProjectFiltersConfig,
/// Configuration for data scrubbers.
#[serde(skip_serializing_if = "DataScrubbingConfig::is_disabled")]
pub datascrubbing_settings: DataScrubbingConfig,
@@ -109,7 +109,7 @@ impl Default for ProjectConfig {
trusted_relays: vec![],
pii_config: None,
grouping_config: None,
-filter_settings: FiltersConfig::default(),
+filter_settings: ProjectFiltersConfig::default(),
datascrubbing_settings: DataScrubbingConfig::default(),
event_retention: None,
quotas: Vec::new(),
@@ -154,8 +154,8 @@ pub struct LimitedProjectConfig {
pub allowed_domains: Vec<String>,
pub trusted_relays: Vec<PublicKey>,
pub pii_config: Option<PiiConfig>,
-#[serde(skip_serializing_if = "FiltersConfig::is_empty")]
-pub filter_settings: FiltersConfig,
+#[serde(skip_serializing_if = "ProjectFiltersConfig::is_empty")]
+pub filter_settings: ProjectFiltersConfig,
#[serde(skip_serializing_if = "DataScrubbingConfig::is_disabled")]
pub datascrubbing_settings: DataScrubbingConfig,
#[serde(skip_serializing_if = "Option::is_none")]
1 change: 1 addition & 0 deletions relay-filter/Cargo.toml
@@ -12,6 +12,7 @@ publish = false
[dependencies]
ipnetwork = "0.20.0"
once_cell = { workspace = true }
indexmap = { workspace = true }
regex = { workspace = true }
relay-common = { path = "../relay-common" }
relay-event-schema = { path = "../relay-event-schema" }