Skip to content

Conversation

@bryanck
Copy link
Contributor

@bryanck bryanck commented Jan 22, 2021

SUMMARY

This PR adds implicit ordering for the line chart and the word cloud chart. Currently, if a line chart has a group by, the series is limited in number, and there is no sort specified, the series selected will be arbitrary rather than the top series based on the metric. Similarly, the word cloud chart, if limited, will select an arbitrary set of results, rather than the top results based on the metric.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TEST PLAN

Create a line chart with group by, limit the series number, and the top N series should be shown. Similarly with the word cloud chart, apply a limit, and the top results should be displayed rather than an arbitrary set.

ADDITIONAL INFORMATION

  • Has associated issue:
  • Changes UI
  • Requires DB Migration.
  • Confirm DB Migration upgrade and downgrade tested.
  • Introduces new feature or API
  • Removes existing feature or API

@bryanck bryanck changed the title Add implicit orderby to certain queries fix: Add implicit orderby to certain queries Jan 22, 2021
@codecov-io
Copy link

codecov-io commented Jan 22, 2021

Codecov Report

Merging #12674 (ae10fdd) into master (734fb75) will decrease coverage by 0.20%.
The diff coverage is 75.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #12674      +/-   ##
==========================================
- Coverage   66.85%   66.64%   -0.21%     
==========================================
  Files        1018     1018              
  Lines       49776    49800      +24     
  Branches     4869     4877       +8     
==========================================
- Hits        33278    33191      -87     
- Misses      16375    16482     +107     
- Partials      123      127       +4     
Flag Coverage Δ
cypress 50.58% <ø> (-0.38%) ⬇️
javascript 60.94% <ø> (+0.01%) ⬆️
python 63.80% <75.00%> (-0.21%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
superset/charts/commands/data.py 94.82% <66.66%> (-1.54%) ⬇️
superset/viz.py 60.06% <80.00%> (+0.06%) ⬆️
superset/db_engines/hive.py 0.00% <0.00%> (-85.72%) ⬇️
superset/db_engine_specs/hive.py 54.61% <0.00%> (-29.24%) ⬇️
...et-frontend/src/SqlLab/reducers/getInitialState.js 33.33% <0.00%> (-16.67%) ⬇️
superset-frontend/src/reduxUtils.ts 70.88% <0.00%> (-8.87%) ⬇️
...rontend/src/SqlLab/components/TabbedSqlEditors.jsx 76.58% <0.00%> (-5.07%) ⬇️
superset-frontend/src/SqlLab/actions/sqlLab.js 58.59% <0.00%> (-3.83%) ⬇️
superset-frontend/src/SqlLab/reducers/sqlLab.js 38.42% <0.00%> (-1.66%) ⬇️
superset-frontend/src/CRUD/CollectionTable.tsx 61.60% <0.00%> (-1.01%) ⬇️
... and 11 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 734fb75...ae10fdd. Read the comment docs.

@junlincc junlincc added viz:charts:line Related to the Line chart viz:charts:wordcloud Related to the Wordcloud chart labels Jan 23, 2021
@junlincc junlincc requested a review from villebro January 23, 2021 06:10
Comment on lines +85 to +87
for query in self._query_context.queries:
if query.row_limit and query.metrics and not query.orderby:
query.orderby = [(query.metrics[0], False)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I understand the use case and agree with it from an end user perspective, always adding orderbys to queries can have a significant performance impact on queries referencing large tables. Going forward we want to make less implicit assumptions about ordering and let the viz' explicitly request ordering if it's needed. Therefore I feel this logic should be moved to the individual plugins rather than apply it on all queries.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds reasonable, however the behavior of returning unordered results in these cases seems to be a regression, as this wasn't happening in v0.37. Perhaps the viz components changed?

Copy link
Contributor Author

@bryanck bryanck Jan 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If no one else has a problem with this, you can go ahead and close it and we'll patch it on our side for now, until the components are updated (I'll look into that).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC, word cloud was using the legacy API pre 0.37. Maybe that's why?

I added a “sort by metric” option for Sankey not long ago (disabled by default), maybe you can take a similar approach?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bryanck originally this regression was caused by #11153 which removed the implicit orderby from chart data requests. I agree with the fix on a chart-by-chart basis, and definitely think it makes sense on word cloud. The problem with implicitly assuming an orderby clause is that it will cause unnecessary query overhead in cases where it is potentially not required, and sometimes cause unwanted side-effects. One side-effect that was noticed by this implicit ordering was on timeseries charts, where ordering by metric caused "scanning" from top to bottom if the limit was reached (=the observations with the lowest metric values didn't get through). In the case of a timeseries chart it feels more natural to order the results by series, not metrics, so as to make sure the request contains as many full series as possible.

Again I'd urge adding this fix to each chart separately, and also adding a sort control to charts where one might be needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. IMHO I feel like the best course of actions is to revert #11153, also revert #12661 by me a few days ago, close this PR, and then open a ticket to move order by to the charts.

@ktmud
Copy link
Member

ktmud commented Feb 15, 2021

Supersedes by #13058 and #13057

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

new:contributor The author is a new contributor size/S viz:charts:line Related to the Line chart viz:charts:wordcloud Related to the Wordcloud chart

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants