Skip to content

Comments

Allow user to customize query cost estimate#8470

Merged
betodealmeida merged 3 commits intoapache:masterfrom
lyft:VIZ-1098
Nov 4, 2019
Merged

Allow user to customize query cost estimate#8470
betodealmeida merged 3 commits intoapache:masterfrom
lyft:VIZ-1098

Conversation

@betodealmeida
Copy link
Member

CATEGORY

Choose one

  • Bug Fix
  • Enhancement (new features, refinement)
  • Refactor
  • Add tests
  • Build / Development Environment
  • Documentation

SUMMARY

#8172 introduced the ability to estimate query costs before running them. Currently, this is supported by Presto, and it returns a relative CPU and network cost.

Ideally, we want to show the cost as dollars, time or percentile, instead of showing relative costs. Since the relationship between the relative cost and the actual cost varies from infrastructure to infrastructure, we need to provide users a way of customizing the query cost estimate.

Here's a hypothetical example where we use a coefficient to compute the query cost in dollars based on CPU and network cost:

def presto_query_cost_formatter(result):
    cpu_coefficient = 1e-12
    network_coefficient = 1e-12

    cost = 0
    for row in result:
        cost += row.get("cpuCost", 0) * cpu_coefficient
        cost += row.get("networkCost", 0) * network_coefficient

    return [{"Cost": f"US$ {cost:.2f}"}]


DEFAULT_FEATURE_FLAGS = {
    "ESTIMATE_QUERY_COST": True,
    "QUERY_COST_FORMATTER": {"presto": presto_query_cost_formatter},
}

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

Screen Shot 2019-10-29 at 1 36 24 PM

TEST PLAN

ADDITIONAL INFORMATION

  • Has associated issue:
  • Changes UI
  • Requires DB Migration.
  • Confirm DB Migration upgrade and downgrade tested.
  • Introduces new feature or API
  • Removes existing feature or API

REVIEWERS

@khtruong

Copy link
Member

@etr2460 etr2460 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one documentation request!

docs/sqllab.rst Outdated

.. code-block:: python

def presto_query_cost_formatter(result):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we add a type to this so that people know what valid params there are? Or add that in a comment somewhere?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done! Sorry, I'm still not fully used to types in Python, and I keep forgetting them. :-P

@codecov-io
Copy link

Codecov Report

Merging #8470 into master will increase coverage by 0.09%.
The diff coverage is 23.52%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #8470      +/-   ##
==========================================
+ Coverage   66.41%   66.51%   +0.09%     
==========================================
  Files         449      449              
  Lines       22567    22607      +40     
  Branches     2367     2367              
==========================================
+ Hits        14987    15036      +49     
+ Misses       7442     7433       -9     
  Partials      138      138
Impacted Files Coverage Δ
superset/views/core.py 71.94% <0%> (ø) ⬆️
superset/db_engine_specs/presto.py 22.64% <20%> (+0.2%) ⬆️
superset/db_engine_specs/base.py 80.59% <66.66%> (-0.14%) ⬇️
superset/db_engine_specs/oracle.py 66.66% <0%> (-25%) ⬇️
superset/db_engine_specs/mssql.py 48.83% <0%> (-9.5%) ⬇️
superset/db_engine_specs/impala.py 59.09% <0%> (-5.91%) ⬇️
superset/db_engine_specs/sqlite.py 66.66% <0%> (-4.31%) ⬇️
superset/db_engine_specs/bigquery.py 42.66% <0%> (-2.41%) ⬇️
superset/db_engine_specs/postgres.py 83.33% <0%> (-0.54%) ⬇️
superset/connectors/sqla/models.py 84.92% <0%> (+0.04%) ⬆️
... and 10 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1adf742...f5be80f. Read the comment docs.

Copy link
Member

@etr2460 etr2460 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, thanks for the docs!

@betodealmeida betodealmeida merged commit 338a2b1 into apache:master Nov 4, 2019
graceguo-supercat pushed a commit that referenced this pull request Nov 13, 2019
* Allow user to customize query estimate

* Add docs; run black

* Update docs with types
@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 0.36.0 First shipped in 0.36.0 labels Feb 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels size/M 🚢 0.36.0 First shipped in 0.36.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants