Skip to content

Comments

[fix] Re-cleanup legacy filters#8523

Merged
john-bodley merged 1 commit intoapache:masterfrom
john-bodley:john-bodley--remove-erroneous-legacy-filters-from-form-data
Nov 12, 2019
Merged

[fix] Re-cleanup legacy filters#8523
john-bodley merged 1 commit intoapache:masterfrom
john-bodley:john-bodley--remove-erroneous-legacy-filters-from-form-data

Conversation

@john-bodley
Copy link
Member

CATEGORY

Choose one

  • Bug Fix
  • Enhancement (new features, refinement)
  • Refactor
  • Add tests
  • Build / Development Environment
  • Documentation

SUMMARY

Whilst spelunking through our production data in preparation to migrate slices from the Druid native JSON-based API to SQLAlchemy I discovered that a non-trivial amount of slices (~ 10%) still contained the legacy filter fields in the form-data; filters, having, having_filters, and where in addition to adhoc_filters (which takes precedence). Though these fields are not problematic it adds cruft to the form-data and makes analyses unnecessarily more complex.

These fields should have been removed in this migration (the convert_legacy_filters_into_adhoc method deletes the old fields) and after trying several flows I wasn't able to determine how these fields were persisting (note loading/saving a slice with legacy filters will replace them with ad-hoc).

This PR performs the following:

  1. Re-applies the previous migration logic. The hypothesis is that maybe when the previous migration was run the code was in a state where these fields persisted.
  2. Updates the example data to remove the obsolete fields and updates the filters to be ad-hoc.

TEST PLAN

Ran the DB upgrade on a dump of our production data and verified that the legacy filters were no longer present in the form-data.

ADDITIONAL INFORMATION

  • Has associated issue:
  • Changes UI
  • Requires DB Migration.
  • Confirm DB Migration upgrade and downgrade tested.
  • Introduces new feature or API
  • Removes existing feature or API

REVIEWERS

to: @betodealmeida @graceguo-supercat @michellethomas @mistercrunch @villebro

Copy link
Member

@etr2460 etr2460 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one question regarding the migration

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe only do this if the parameters are different? And it might make sense to do a session.commit() after each iteration instead of one big one at the end. Not sure how a db with 200k slices would handle this

Copy link
Member Author

@john-bodley john-bodley Nov 7, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@etr2460 I can definitely make the change to only update the params if they differ which the ORM will track. In terms of the commit I tested this with Airbnb's production database which has ~ 200k records. Batching is preferred over single record commits. Note this pattern is defined in other migrations.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Chiming in here, I also prefer one big commit at the end to make sure the migration doesn't die half way through, leaving the backend half migrated. If it takes long then so be it. Re: the comment about only updating rows with changed params; definitely agree, and it should speed up the commit at the end.

@john-bodley john-bodley force-pushed the john-bodley--remove-erroneous-legacy-filters-from-form-data branch from 860a59b to f9458c1 Compare November 7, 2019 18:02
@john-bodley
Copy link
Member Author

john-bodley commented Nov 7, 2019

@etr2460 and @villebro I addressed your comments.

Copy link
Member

@etr2460 etr2460 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, but you probably need to remake the migration since another one has gone in since you opened your pr

@john-bodley
Copy link
Member Author

@etr2460 the last migration is defined by this commit which is the defined down revision for the migration in this PR.

@john-bodley john-bodley force-pushed the john-bodley--remove-erroneous-legacy-filters-from-form-data branch from f9458c1 to 88c40a4 Compare November 12, 2019 19:45
@etr2460
Copy link
Member

etr2460 commented Nov 12, 2019

ignore me, this looks good

@codecov-io
Copy link

codecov-io commented Nov 12, 2019

Codecov Report

Merging #8523 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #8523   +/-   ##
=======================================
  Coverage   66.82%   66.82%           
=======================================
  Files         450      450           
  Lines       22721    22721           
  Branches     2366     2366           
=======================================
  Hits        15183    15183           
  Misses       7400     7400           
  Partials      138      138
Impacted Files Coverage Δ
superset/examples/random_time_series.py 20% <ø> (ø) ⬆️
superset/examples/country_map.py 24.32% <ø> (ø) ⬆️
superset/examples/birth_names.py 100% <ø> (ø) ⬆️
superset/examples/multiformat_time_series.py 15.38% <ø> (ø) ⬆️
superset/examples/deck.py 11.11% <ø> (ø) ⬆️
superset/examples/long_lat.py 23.07% <ø> (ø) ⬆️
superset/examples/energy.py 100% <ø> (ø) ⬆️
superset/examples/unicode_test_data.py 100% <ø> (ø) ⬆️
superset/examples/world_bank.py 100% <ø> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a58b392...88c40a4. Read the comment docs.

@john-bodley john-bodley merged commit 7bfa24d into apache:master Nov 12, 2019
@john-bodley john-bodley deleted the john-bodley--remove-erroneous-legacy-filters-from-form-data branch November 12, 2019 20:29
graceguo-supercat pushed a commit that referenced this pull request Nov 13, 2019
@dpgaspar dpgaspar added the v0.35 label Dec 20, 2019
@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 0.36.0 First shipped in 0.36.0 labels Feb 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels size/L v0.35 🚢 0.36.0 First shipped in 0.36.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants