1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -11,6 +11,7 @@

- Add InstallableBuild and SizeAnalysis data categories. ([#5084](https://github.com/getsentry/relay/pull/5084))
- Add dynamic PII derivation to `metastructure`. ([#5107](https://github.com/getsentry/relay/pull/5107))
- Add negation pattern matching. ([#5116](https://github.com/getsentry/relay/pull/5116))
Review comment (Member):

Wrong section now.


**Internal**:

31 changes: 30 additions & 1 deletion relay-pattern/src/lib.rs
@@ -368,6 +368,8 @@ enum MatchStrategy {
Static(bool),
/// The pattern is complex and needs to be evaluated using [`wildmatch`].
Wildmatch(Tokens),
/// The pattern is negated: it matches when [`wildmatch`] does not match the remaining tokens.
NegatedWildmatch(Tokens),
Review comment (Member) on lines +371 to +372:

That doesn't seem correct: a `Static` literal can also be inverted. We now have implicit behavior where a negated pattern is always a wildmatch, which also removes a lot of the performance benefits we get from factoring out simple patterns.

You can instead track this property on the `Pattern` itself and just invert the `is_match` there.
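
The suggestion above can be sketched roughly as follows. This is a minimal illustration, not the crate's actual code: `Strategy` and `NegatablePattern` are hypothetical stand-ins for `MatchStrategy` and `Pattern`, and only literal and `prefix*` patterns are modeled. The point is that the strategy is chosen from the pattern without the leading `!`, so a negated literal still gets the cheap literal comparison.

```rust
/// Simplified, hypothetical stand-in for `MatchStrategy`; only two variants are modeled.
#[derive(Debug)]
enum Strategy {
    /// The pattern is a plain literal.
    Literal(String),
    /// The pattern has the shape `prefix*`.
    Prefix(String),
}

impl Strategy {
    fn is_match(&self, haystack: &str) -> bool {
        match self {
            Strategy::Literal(literal) => haystack == literal,
            Strategy::Prefix(prefix) => haystack.starts_with(prefix.as_str()),
        }
    }
}

/// Hypothetical pattern type that stores negation next to the strategy.
#[derive(Debug)]
struct NegatablePattern {
    negated: bool,
    strategy: Strategy,
}

impl NegatablePattern {
    fn new(pattern: &str) -> Self {
        // A leading `!` only flips the result; it never influences the strategy.
        let (negated, rest) = match pattern.strip_prefix('!') {
            Some(rest) => (true, rest),
            None => (false, pattern),
        };
        // Assumption for this sketch: a trailing `*` is the only wildcard form.
        let strategy = match rest.strip_suffix('*') {
            Some(prefix) => Strategy::Prefix(prefix.to_owned()),
            None => Strategy::Literal(rest.to_owned()),
        };
        Self { negated, strategy }
    }

    fn is_match(&self, haystack: &str) -> bool {
        // Invert the strategy's answer exactly when the pattern is negated.
        self.strategy.is_match(haystack) != self.negated
    }
}
```

With this shape, `NegatablePattern::new("!foo")` keeps the `Literal` strategy and `NegatablePattern::new("!foo@*")` keeps the `Prefix` strategy; neither falls back to a general wildcard match.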

// Possible future optimizations for `Any` variations:
// Examples: `??`. `??suffix`, `prefix??` and `?contains?`.
}
@@ -384,6 +386,7 @@ impl MatchStrategy {
[Token::Wildcard, Token::Literal(literal), Token::Wildcard] => {
Self::Contains(std::mem::take(literal))
}
[Token::Negated, ..] => Self::NegatedWildmatch(tokens),
_ => Self::Wildmatch(tokens),
};

@@ -399,6 +402,9 @@ impl MatchStrategy {
MatchStrategy::Contains(contains) => match_contains(contains, haystack, options),
MatchStrategy::Static(matches) => *matches,
MatchStrategy::Wildmatch(tokens) => wildmatch::is_match(haystack, tokens, options),
MatchStrategy::NegatedWildmatch(tokens) => {
!wildmatch::is_match(haystack, tokens, options)
}
}
}
}
@@ -500,6 +506,10 @@ impl<'a> Parser<'a> {
}

fn parse(&mut self) -> Result<(), ErrorKind> {
if self.advance_if(|c| c == '!') {
self.push_token(Token::Negated);
};
Review comment:

Bug: Negated Token Incorrectly Affects Wildmatch

The `Token::Negated` meta-token is incorrectly passed into the wildmatch algorithm. It should be stripped before wildmatch evaluation, as its inclusion causes incorrect matching, especially for patterns like `!`, which incorrectly match non-empty strings but not empty ones.
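
One way to keep the meta-token out of the matcher is to split it off the token slice first and apply the inversion afterwards. The sketch below is hypothetical: `Tok`, `split_negation`, and `toy_match` are simplified illustrations and not the crate's `Token` type or `wildmatch` implementation.

```rust
/// Hypothetical, reduced token set for illustration.
#[derive(Debug, PartialEq)]
enum Tok {
    Negated,
    Literal(String),
    Wildcard,
}

/// Splits a leading `Tok::Negated` off the token list.
fn split_negation(tokens: &[Tok]) -> (bool, &[Tok]) {
    match tokens {
        [Tok::Negated, rest @ ..] => (true, rest),
        _ => (false, tokens),
    }
}

/// Toy matcher for literals and `*`, using plain recursion with backtracking.
fn toy_match(tokens: &[Tok], haystack: &str) -> bool {
    match tokens {
        [] => haystack.is_empty(),
        [Tok::Literal(literal), rest @ ..] => haystack
            .strip_prefix(literal.as_str())
            .is_some_and(|tail| toy_match(rest, tail)),
        // Try every split point, including the end of the haystack.
        [Tok::Wildcard, rest @ ..] => haystack
            .char_indices()
            .map(|(i, _)| i)
            .chain(std::iter::once(haystack.len()))
            .any(|i| toy_match(rest, &haystack[i..])),
        // The matcher never expects the meta-token; the caller strips it.
        [Tok::Negated, ..] => false,
    }
}

/// Strips the negation marker, matches the remaining tokens, then inverts if needed.
fn is_match(tokens: &[Tok], haystack: &str) -> bool {
    let (negated, rest) = split_negation(tokens);
    toy_match(rest, haystack) != negated
}
```

The matcher itself stays unaware of negation, so no `Token::Negated => true` arm is needed inside the matching loop.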


Review comment (Member) on lines +509 to +511:

I think we should make this behavior configurable via Options and only enable it for releases for now.

We can then just use a TypedPattern with special options for that one glob.
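
A rough sketch of what gating the behavior behind an option could look like. The `allow_negation` field and the `split_pattern` helper are assumptions made for illustration; they are not existing fields or functions of the crate's `Options`.

```rust
/// Hypothetical options struct; `allow_negation` is an assumed flag.
#[derive(Clone, Copy, Debug, Default)]
struct Options {
    /// Treat a leading `!` as negation instead of a literal character.
    allow_negation: bool,
}

/// Returns whether the pattern is negated, plus the pattern text left to parse.
fn split_pattern(pattern: &str, options: Options) -> (bool, &str) {
    match pattern.strip_prefix('!') {
        Some(rest) if options.allow_negation => (true, rest),
        // Without the option, `!` stays part of the pattern as a literal.
        _ => (false, pattern),
    }
}
```

With the option off, a pattern such as `!*!*.md` is passed through unchanged, so existing globs keep their current meaning; only callers that opt in (for example the release glob mentioned above) would see negation.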


while let Some(c) = self.advance() {
match c {
'?' => self.push_token(Token::Any(NonZeroUsize::MIN)),
@@ -671,6 +681,7 @@ impl<'a> Parser<'a> {
/// - A [`Token::Any`] is never followed by [`Token::Any`].
/// - A [`Token::Literal`] is never followed by [`Token::Literal`].
/// - A [`Token::Class`] is never empty.
/// - A [`Token::Negated`], if present, is always the first token.
#[derive(Clone, Debug, Default)]
struct Tokens(Vec<Token>);

@@ -761,6 +772,8 @@ enum Token {
Any(NonZeroUsize),
/// The wildcard token `*`.
Wildcard,
/// The token `!`.
Negated,
/// A class token `[abc]` or its negated variant `[!abc]`.
Class { negated: bool, ranges: Ranges },
/// A list of nested alternate tokens `{a,b}`.
@@ -960,6 +973,7 @@ mod tests {
MatchStrategy::Contains(_) => "Contains",
MatchStrategy::Static(_) => "Static",
MatchStrategy::Wildmatch(_) => "Wildmatch",
MatchStrategy::NegatedWildmatch(_) => "NegatedWildmatch",
};
assert_eq!(
kind,
@@ -1585,7 +1599,7 @@ mod tests {
assert_pattern!("1.18.[!0-4].*", "1.18.5.");
assert_pattern!("1.18.[!0-4].*", "1.18.5.aBc");
assert_pattern!("1.18.[!0-4].*", NOT "1.18.3.abc");
assert_pattern!("!*!*.md", "!foo!.md"); // no `!` outside of character classes
assert_pattern!("*!*.md", "foo!.md"); // no `!` outside of character classes
assert_pattern!("foo*foofoo*foobar", "foofoofooxfoofoobar");
assert_pattern!("foo*fooFOO*fOobar", "fooFoofooXfoofooBAR", i);
assert_pattern!("[0-9]*a", "0aaaaaaaaa", i);
@@ -1936,4 +1950,19 @@ mod tests {
assert!(!patterns.is_match("foo"));
assert!(patterns.is_match("bar"));
}

#[test]
fn test_pattern_negation() {
let patterns = Patterns::builder().add("!foo@*").unwrap().take();

assert!(patterns.is_match("[email protected]"));
assert!(patterns.is_match("[email protected]"));
assert!(patterns.is_match("foo"));
assert!(patterns.is_match("barfoo@"));

// foo@ is never matched.
assert!(!patterns.is_match("[email protected]"));
assert!(!patterns.is_match("[email protected]"));
assert!(!patterns.is_match("foo@anything"));
}
}
1 change: 1 addition & 0 deletions relay-pattern/src/wildmatch.rs
@@ -53,6 +53,7 @@ where
t_next += 1;

match token {
Token::Negated => true,
Token::Literal(literal) => match M::is_prefix(h_current, literal) {
Some(n) => advance!(n),
// The literal does not match, but it may match after backtracking.