ref(abuse) Add more predefined querylog queries #3489
Add two new queries to help narrow down which referrers might be causing the abuse. To use these queries, you should already know which cluster is experiencing the abuse and at what time the abuse started.

The idea behind the queries is: take the half hour before the abuse started and the half hour after. Compare the referrers between those two time periods, and surface the ones that had the highest percentage change from before to after.

The purpose of these queries is not to automatically find the smoking gun, but to at least remove some of the noise and show some actionable signal that can be followed up on to confirm whether one of the referrers actually is the source of the problem.

I tested these queries on some known cases of abuse, and they did surface the referrers that turned out to be the problem.
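For readers who want the comparison spelled out, here is a minimal Python sketch of the before/after ranking described above. It is not the query added in this PR: the function name `referrer_percent_change`, the list-of-referrers input, and the choice to treat a referrer that is absent from the "before" window as having a count of 1 are all assumptions made for illustration.

```python
from collections import Counter


def referrer_percent_change(before, after, top_n=10):
    """Rank referrers by percentage change in request count.

    `before` and `after` are iterables of referrer names, one entry per
    query seen in each window (a hypothetical input shape).
    """
    before_counts = Counter(before)
    after_counts = Counter(after)

    changes = []
    for referrer, after_count in after_counts.items():
        # Assumption: a referrer unseen before the abuse started is given
        # a baseline of 1 so the division is always defined.
        before_count = max(before_counts.get(referrer, 0), 1)
        pct = (after_count - before_count) * 100.0 / before_count
        changes.append((referrer, pct))

    # Largest increase first, since those are the referrers to follow up on.
    changes.sort(key=lambda item: item[1], reverse=True)
    return changes[:top_n]
```

With 10 requests from a referrer before and 40 after, the sketch reports a 300% change for it. As in the description above, a high value is only a lead to investigate, not proof that the referrer is the source of the abuse.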
Codecov Report
Base: 91.66% // Head: 91.66%

Coverage Diff:

| | master | #3489 | +/- |
|---|---|---|---|
| Coverage | 91.66% | 91.66% | |
| Files | 715 | 715 | |
| Lines | 33472 | 33476 | +4 |
| Hits | 30681 | 30685 | +4 |
| Misses | 2791 | 2791 | |
cool!
WHERE timestamp >= toDateTime('2022-12-07T21:55:00')
AND timestamp <= toDateTime('2022-12-07T22:20:00')
- WHERE timestamp >= toDateTime('2022-12-07T21:55:00')
- AND timestamp <= toDateTime('2022-12-07T22:20:00')
+ WHERE timestamp >= toDateTime(<time_str_after_abuse>)
+ AND timestamp <= toDateTime(<time_str_after_abuse + 30 mins>)
Making these explicit descriptive parameters may be easier for the user
I filed a ticket to make this a feature in the tool, but I do think this is still an improvement
I deliberately left these as they are, since I find it easier to have an actual example to edit. Otherwise I have to remind myself of the exact date format ClickHouse expects.
I would like to have our tools ultimately work using parameters though, in a similar way to how saved queries work on Redash. Users fill out a form with the values they want and we template them into the query.
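As a rough illustration of that templating idea, the sketch below fills a start time and a computed end time into the `WHERE` clause from this PR. Everything here is hypothetical: `render_window`, `WINDOW_TEMPLATE`, and `CLICKHOUSE_FORMAT` are names invented for the example, and the tool has no such feature today. The date format simply mirrors the literals already used in the queries.

```python
from datetime import datetime, timedelta

# Mirrors the literal format used in the predefined queries,
# e.g. '2022-12-07T21:24:00'.
CLICKHOUSE_FORMAT = "%Y-%m-%dT%H:%M:%S"

# Hypothetical template: only the time window is parameterized.
WINDOW_TEMPLATE = (
    "WHERE timestamp >= toDateTime('{start}')\n"
    "AND timestamp <= toDateTime('{end}')"
)


def render_window(template, start, minutes=30):
    """Fill in a window that begins at `start` and lasts `minutes`."""
    end = start + timedelta(minutes=minutes)
    # Both values come from datetime objects formatted here, so no raw
    # user-supplied string is spliced into the SQL.
    return template.format(
        start=start.strftime(CLICKHOUSE_FORMAT),
        end=end.strftime(CLICKHOUSE_FORMAT),
    )


print(render_window(WINDOW_TEMPLATE, datetime(2022, 12, 7, 21, 24)))
```

Taking a `datetime` rather than a string means the user never has to remember the date format, which is the concern raised above, while the rendered query still looks like the concrete example.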
up to you, PR is approved though. Thanks for adding this query!
WHERE timestamp >= toDateTime('2022-12-07T21:24:00')
AND timestamp <= toDateTime('2022-12-07T21:54:00')
- WHERE timestamp >= toDateTime('2022-12-07T21:24:00')
- AND timestamp <= toDateTime('2022-12-07T21:54:00')
+ WHERE timestamp >= toDateTime(<time_str_before_abuse>)
+ AND timestamp <= toDateTime(<time_str_before_abuse + 30mins>)