perf(charts): improve performance on GET list #9619
        "Slice.datasource_type == 'table')",
        remote_side="SqlaTable.id",
        lazy="joined",
    )
Setting this relation avoids making one extra query per row, but it will not support showing the datasource for the deprecated Druid source (and it will still issue an outer join).
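The trade-off described in this comment can be sketched with the standard-library `sqlite3` module instead of SQLAlchemy. This is only an illustration under assumed names: the `slices`, `tables`, and `druid_datasources` tables, their columns, and the sample rows are hypothetical stand-ins, not Superset's real schema. It contrasts one lookup per row with a single outer join restricted to `datasource_type = 'table'`.

```python
import sqlite3

# Hypothetical toy schema, loosely modelled on charts and their datasources.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tables (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE druid_datasources (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE slices (
        id INTEGER PRIMARY KEY,
        slice_name TEXT,
        datasource_id INTEGER,
        datasource_type TEXT
    );
    INSERT INTO tables VALUES (1, 'birth_names'), (2, 'flights');
    INSERT INTO druid_datasources VALUES (1, 'legacy_events');
    INSERT INTO slices VALUES
        (1, 'Girls', 1, 'table'),
        (2, 'Boys', 1, 'table'),
        (3, 'Delays', 2, 'table'),
        (4, 'Legacy chart', 1, 'druid');
    """
)

# Which table holds the datasource for each datasource_type (trusted constants).
LOOKUP = {"table": "tables", "druid": "druid_datasources"}


def list_per_row(conn):
    """One query for the charts, then one datasource query per chart."""
    charts = conn.execute(
        "SELECT slice_name, datasource_id, datasource_type FROM slices ORDER BY id"
    ).fetchall()
    queries = 1
    result = []
    for slice_name, ds_id, ds_type in charts:
        row = conn.execute(
            f"SELECT name FROM {LOOKUP[ds_type]} WHERE id = ?", (ds_id,)
        ).fetchone()
        queries += 1
        result.append((slice_name, row[0] if row else None))
    return result, queries


def list_joined(conn):
    """A single query: outer join that only matches 'table' datasources."""
    rows = conn.execute(
        """
        SELECT s.slice_name, t.name
        FROM slices AS s
        LEFT OUTER JOIN tables AS t
          ON s.datasource_id = t.id AND s.datasource_type = 'table'
        ORDER BY s.id
        """
    ).fetchall()
    return rows, 1
```

In this sketch the joined version needs one query regardless of the number of charts, while the Druid-backed row comes back with no datasource name, which is the limitation the comment points out.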
What will the experience for the user be if they are primarily using the deprecated Druid connector?
Updated the PR description with an example
No impact now
Codecov Report
@@ Coverage Diff @@
## master #9619 +/- ##
==========================================
- Coverage 65.71% 65.68% -0.03%
==========================================
Files 574 574
Lines 30135 30138 +3
Branches 3066 3066
==========================================
- Hits 19802 19797 -5
- Misses 10149 10157 +8
Partials 184 184
I think Airbnb might have opinions about this. @john-bodley would this mess with your users' workflow?
@dpgaspar regarding the comment,
this isn't quite true: although we encourage environments to use Druid SQL (rather than Druid NoSQL), we still support Druid NoSQL, and thus I'm not certain whether we should merge this PR. I sense this PR shows the potential performance wins of fully deprecating the Druid NoSQL connector, i.e., there are numerous other places in the code base where the logic is complex and/or requires additional joins because no foreign key exists between the
@john-bodley I had the wrong impression regarding Druid NoSQL. I have adapted this PR so that we can have the best of both worlds: we still get the performance boost when fetching charts outside of Druid NoSQL, and an extra query is issued for each Druid NoSQL chart on the queried page (as before). @willbarrett
LGTM.
SUMMARY
This API endpoint was issuing a couple of extra queries for each row: one on ab_user to resolve the created-by field, and another to fetch the correct datasource. Since Druid support outside of SQLAlchemy is deprecated, this PR makes an outer join with SqlaTable.

The idea for optimizing and avoiding a query per row: when a @property is exposed as a column and the method itself references a model column, make sure that column is declared in list_columns so it is prefetched; otherwise SQLAlchemy will issue the extra query.

Local times are:
Before:
(timing) ChartRestApi.get_list.time = 130ms - 180ms
After:
(timing) ChartRestApi.get_list.time = 25ms - 50ms
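The prefetch idea from the summary can be illustrated with a toy model of deferred column loading. This is not SQLAlchemy's or Flask-AppBuilder's actual implementation; `LazyChart`, `list_charts`, the `label` property, and the three-column `slices` table are all hypothetical names chosen for the sketch. A column missing from the listed columns is fetched on first access, which costs one query per row.

```python
import sqlite3


class LazyChart:
    """Toy row object: columns not loaded up front are fetched on first access."""

    query_count = 0  # number of extra per-row queries issued so far

    def __init__(self, conn, pk, loaded):
        self._conn = conn
        self._pk = pk
        self._loaded = dict(loaded)

    def column(self, name):
        if name not in self._loaded:
            # Deferred load: one extra query for this row and this column.
            # Column names are trusted constants in this sketch.
            LazyChart.query_count += 1
            row = self._conn.execute(
                f"SELECT {name} FROM slices WHERE id = ?", (self._pk,)
            ).fetchone()
            self._loaded[name] = row[0]
        return self._loaded[name]

    @property
    def label(self):
        # Hypothetical property that references two model columns.
        return f"{self.column('slice_name')} ({self.column('viz_type')})"


def list_charts(conn, list_columns):
    """Select only the requested columns (plus the primary key) in one query."""
    cols = ["id"] + [c for c in list_columns if c != "id"]
    rows = conn.execute(
        f"SELECT {', '.join(cols)} FROM slices ORDER BY id"
    ).fetchall()
    return [LazyChart(conn, row[0], zip(cols, row)) for row in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE slices (id INTEGER PRIMARY KEY, slice_name TEXT, viz_type TEXT);
    INSERT INTO slices VALUES (1, 'Girls', 'table'), (2, 'Boys', 'bar'), (3, 'Delays', 'line');
    """
)


def count_extra_queries(list_columns):
    """Render every label and report how many deferred loads were needed."""
    LazyChart.query_count = 0
    labels = [chart.label for chart in list_charts(conn, list_columns)]
    return labels, LazyChart.query_count
```

Calling `count_extra_queries(["slice_name"])` triggers one deferred load per row for `viz_type`, while `count_extra_queries(["slice_name", "viz_type"])` produces the same labels with no extra queries.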
Druid charts are displayed normally, but if any are present, an extra query is issued for each one.
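A minimal sketch of that fallback, assuming the list query has already filled in the datasource name for table-backed charts via the outer join. `ChartRow`, `resolve_datasources`, and the `fetch_druid_name` callback are hypothetical names for illustration and do not come from the PR.

```python
from typing import Callable, List, NamedTuple, Optional, Tuple


class ChartRow(NamedTuple):
    slice_name: str
    datasource_type: str
    datasource_id: int
    joined_name: Optional[str]  # set by the outer join for 'table' rows only


def resolve_datasources(
    rows: List[ChartRow],
    fetch_druid_name: Callable[[int], Optional[str]],
) -> Tuple[List[Tuple[str, Optional[str]]], int]:
    """Use the joined name when present; look up Druid rows one by one."""
    extra_queries = 0
    resolved = []
    for row in rows:
        name = row.joined_name
        if row.datasource_type == "druid":
            # Stand-in for the extra per-chart query described above.
            name = fetch_druid_name(row.datasource_id)
            extra_queries += 1
        resolved.append((row.slice_name, name))
    return resolved, extra_queries
```

With this shape, a page that contains no Druid charts costs no additional lookups, and each Druid chart adds exactly one.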
ADDITIONAL INFORMATION
REVIEWERS
@nytai