[fix] Re-cleanup legacy filters #8523
etr2460
left a comment
one question regarding the migration
maybe only do this if the parameters are different? And it might make sense to do a session.commit() after each iteration instead of one big one at the end. Not sure how a db with 200k slices would handle this
@etr2460 I can definitely make the change to only update the params if they differ, which the ORM will track. In terms of the commit, I tested this with Airbnb's production database, which has ~200k records. Batching is preferred over single-record commits. Note this pattern is defined in other migrations.
Chiming in here, I also prefer one big commit at the end to make sure the migration doesn't die half way through, leaving the backend half migrated. If it takes long then so be it. Re: the comment about only updating rows with changed params; definitely agree, and it should speed up the commit at the end.
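The two points agreed on in this thread (write a row only when its params actually change, and commit once at the end so a failure leaves nothing half migrated) can be sketched as follows. This is a minimal illustration and not the migration in this PR: it uses the standard-library sqlite3 module in place of the SQLAlchemy session, and the table name `slices`, the column name `params`, and the helper name `upgrade` are assumptions made for the example.

```python
import json
import sqlite3

# Legacy form-data keys named in the PR summary.
LEGACY_KEYS = ("filters", "having", "having_filters", "where")


def upgrade(conn: sqlite3.Connection) -> int:
    """Strip legacy filter keys from every slice; return the number of rows rewritten.

    Assumes a hypothetical table `slices(id, params)` where `params` holds JSON text.
    """
    updated = 0
    rows = conn.execute(
        "SELECT id, params FROM slices WHERE params IS NOT NULL"
    ).fetchall()

    for slice_id, params in rows:
        try:
            form_data = json.loads(params)
        except ValueError:
            continue  # leave malformed JSON untouched
        if not isinstance(form_data, dict):
            continue

        cleaned = {k: v for k, v in form_data.items() if k not in LEGACY_KEYS}

        # Only issue an UPDATE when something was actually removed.
        if cleaned != form_data:
            conn.execute(
                "UPDATE slices SET params = ? WHERE id = ?",
                (json.dumps(cleaned, sort_keys=True), slice_id),
            )
            updated += 1

    # One commit at the end: if the loop raises, none of the updates are committed.
    conn.commit()
    return updated
```

Skipping unchanged rows keeps the final commit small, which is the speed-up mentioned above for a database with roughly 200k records.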
860a59b to f9458c1
etr2460
left a comment
lgtm, but you probably need to remake the migration since another one has gone in since you opened your pr
f9458c1 to 88c40a4
ignore me, this looks good
Codecov Report
@@           Coverage Diff           @@
##           master    #8523   +/-   ##
=======================================
  Coverage   66.82%   66.82%
=======================================
  Files         450      450
  Lines       22721    22721
  Branches     2366     2366
=======================================
  Hits        15183    15183
  Misses       7400     7400
  Partials      138      138
CATEGORY
Choose one
SUMMARY
Whilst spelunking through our production data in preparation to migrate slices from the Druid native JSON-based API to SQLAlchemy, I discovered that a non-trivial number of slices (~10%) still contained the legacy filter fields in the form-data: filters, having, having_filters, and where, in addition to adhoc_filters (which takes precedence). Though these fields are not problematic, they add cruft to the form-data and make analyses unnecessarily more complex.

These fields should have been removed in this migration (the convert_legacy_filters_into_adhoc method deletes the old fields), and after trying several flows I wasn't able to determine how these fields were persisting (note that loading/saving a slice with legacy filters will replace them with ad-hoc filters).

This PR performs the following:
TEST PLAN
Ran the DB upgrade on a dump of our production data and verified that the legacy filters were no longer present in the form-data.
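A check along the lines of this test plan could look like the sketch below. It is illustrative only: the helper name `slices_with_legacy_filters` and the input shape (a mapping from slice id to the raw params JSON string) are assumptions, not part of the PR.

```python
import json

# Legacy form-data keys named in the PR summary.
LEGACY_KEYS = frozenset({"filters", "having", "having_filters", "where"})


def slices_with_legacy_filters(params_by_id):
    """Return {slice_id: [legacy keys still present]} for offending slices.

    `params_by_id` is a hypothetical mapping of slice id to params JSON text
    (or None), e.g. as read from a database dump after the upgrade.
    """
    offenders = {}
    for slice_id, params in params_by_id.items():
        if not params:
            continue
        try:
            form_data = json.loads(params)
        except ValueError:
            continue  # malformed JSON is out of scope for this check
        if not isinstance(form_data, dict):
            continue
        found = sorted(form_data.keys() & LEGACY_KEYS)
        if found:
            offenders[slice_id] = found
    return offenders
```

An empty result after the upgrade would correspond to the outcome reported here, namely that no legacy filters remain in the form-data.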
ADDITIONAL INFORMATION
REVIEWERS
to: @betodealmeida @graceguo-supercat @michellethomas @mistercrunch @villebro