Speed up event match regex evaluation for big messages #5008

Merged on Jan 25, 2022 (1 commit)

Contributor

@SpiritCroc SpiritCroc commented Jan 20, 2022

`regex.containsMatchIn()` for `.*@room.*` can take significantly longer
than checking for `@room`. Some real-world events I received took
around 15 seconds to evaluate, which significantly slowed down sync
parsing.

Leading and trailing stars do not change the result of
`containsMatchIn()`, however: it matches in exactly the same cases as
when they are omitted.

For testing purposes, I sent myself 5000 words of Lorem Ipsum
(not containing any `@room`).
Without this change, the regex evaluation takes about 16 seconds.
With this change, it takes significantly less than a second.

Signed-off-by: Tobias Büttner [email protected]
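
The comparison described above can be reproduced roughly with the sketch below. It is not the code from this PR: `buildFiller` and `timeContains` are hypothetical helper names, the filler text is a stand-in for the Lorem Ipsum message, and the measured times will depend heavily on the device and regex engine.

```kotlin
import kotlin.system.measureTimeMillis

// Hypothetical helper: builds filler text with the given number of words.
// The text never contains "@room".
fun buildFiller(words: Int): String =
    (1..words).joinToString(" ") { "lorem" }

// Hypothetical helper: runs containsMatchIn() for the pattern on the text
// and returns the match result together with the elapsed milliseconds.
fun timeContains(pattern: String, text: String): Pair<Boolean, Long> {
    var found = false
    val elapsed = measureTimeMillis {
        found = Regex(pattern).containsMatchIn(text)
    }
    return found to elapsed
}
```

Calling `timeContains(".*@room.*", buildFiller(5000))` and `timeContains("@room", buildFiller(5000))` should return the same boolean in both cases; only the elapsed time is expected to differ.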

Pull Request Checklist

@bmarty bmarty added the Z-Community-PR Issue is solved by a community member's PR label Jan 21, 2022
val modPattern = if (pattern.hasSpecialGlobChar())
    // Regex.containsMatchIn() is way faster without leading and trailing
    // stars, that don't make any difference for the evaluation result
    pattern.removePrefix("*").removeSuffix("*").simpleGlobToRegExp()
Contributor

TIL! 💯

Not a blocker for the PR (assuming this is working as intended), but this class would be a great candidate for some unit tests.

Member

We have some here!
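
To illustrate why stripping the outer stars is safe, here is a minimal sketch of the idea. The project's actual `hasSpecialGlobChar()` and `simpleGlobToRegExp()` are not shown in this thread, so `globToRegexSketch` is a hypothetical stand-in that assumes `*` maps to `.*`, `?` maps to `.`, and every other character is matched literally. `containsGlobMatch` is likewise an illustrative name.

```kotlin
// Hypothetical stand-in for simpleGlobToRegExp(); the real implementation
// may differ. Assumes '*' -> ".*", '?' -> ".", everything else literal.
fun String.globToRegexSketch(): String {
    val sb = StringBuilder()
    for (c in this) {
        when (c) {
            '*' -> sb.append(".*")
            '?' -> sb.append('.')
            else -> sb.append(Regex.escape(c.toString()))
        }
    }
    return sb.toString()
}

// Illustrative helper: drops one leading and one trailing star before
// converting, then checks whether the value contains a match anywhere.
fun containsGlobMatch(pattern: String, value: String): Boolean {
    val stripped = pattern.removePrefix("*").removeSuffix("*")
    return Regex(stripped.globToRegexSketch()).containsMatchIn(value)
}
```

Because `containsMatchIn()` already searches at every position in the input, a leading or trailing `.*` can only match what would otherwise be skipped, so removing it changes the work done but not the result.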

Member

@bmarty bmarty left a comment


I'll merge the PR. Ideally we should have a unit test covering this specific case with @room; I will add it later.
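
A test for the `@room` case might look roughly like the following table of inputs and expected results. This is only a sketch: `matchesRoomMention`, `roomCases`, and `failingRoomCases` are hypothetical names, the matcher is a plain `Regex("@room")` rather than the project's condition class, and a real test would use the project's test framework instead of a helper that returns failures.

```kotlin
// Hypothetical matcher under test; stands in for the project's real
// event match evaluation, which is not shown in this thread.
fun matchesRoomMention(body: String): Boolean =
    Regex("@room").containsMatchIn(body)

// Message bodies paired with whether an @room match is expected.
val roomCases: List<Pair<String, Boolean>> = listOf(
    "@room" to true,
    "hello @room, meeting now" to true,
    "line one\n@room on line two" to true,
    "no mention here" to false,
    "room without the at sign" to false
)

// Returns the bodies whose actual result differs from the expectation;
// an empty list means every case passed.
fun failingRoomCases(): List<String> =
    roomCases
        .filter { (body, expected) -> matchesRoomMention(body) != expected }
        .map { it.first }
```

Adding a long body without `@room` to the table would also guard against the slow path described in this PR, although asserting on timing in a unit test tends to be fragile.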

@bmarty bmarty merged commit 63b3def into element-hq:develop Jan 25, 2022
Labels
Z-Community-PR Issue is solved by a community member's PR

3 participants