chore(dao/command): Add transaction decorator to try to enforce "unit of work"#24969
superset/daos/chart.py
Outdated
We're really inconsistent with our error handling. The BaseDAO.delete method wraps all SQLAlchemyError errors as DAODeleteFailedError whereas here they are left as is.
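To make the inconsistency concrete, here is a minimal sketch of the two behaviours. It does not use Superset or SQLAlchemy: `StorageError` stands in for `SQLAlchemyError`, `FakeSession` is a hypothetical session that always fails, and the two `delete_*` functions are illustrative, not the real DAO methods.

```python
class StorageError(Exception):
    """Stand-in for sqlalchemy.exc.SQLAlchemyError (assumption for this sketch)."""


class DAODeleteFailedError(Exception):
    """Stand-in for the DAO-level error named in the comment above."""


class FakeSession:
    """Hypothetical session whose delete always fails at the storage layer."""

    def delete(self, item):
        raise StorageError("constraint violated")


def delete_wrapped(session, item):
    # Style attributed to BaseDAO.delete: callers only see a DAO-level error.
    try:
        session.delete(item)
    except StorageError as ex:
        raise DAODeleteFailedError(str(ex)) from ex


def delete_unwrapped(session, item):
    # Style described for the method under review: the raw error leaks out.
    session.delete(item)


def error_name(func):
    """Return the name of the exception type that func raises, if any."""
    try:
        func(FakeSession(), object())
    except Exception as ex:
        return type(ex).__name__
    return None
```

A caller that only catches the DAO-level error handles the first variant but not the second, which is the inconsistency being pointed out.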
superset/examples/energy.py
Outdated
Obvious comment and thus not needed.
superset/examples/helpers.py
Outdated
superset/examples/world_bank.py
Outdated
@michael-s-molina what's your thinking about flushing? Sadly there's no DAO used here, so the flush is required given that entry.id is used on the next line. In general, though, should the DAO flush, or should it be left up to the caller, who is context aware?
@john-bodley @michael-s-molina should we create a DAO for the Key Value entities, and leave permalinks et al as the commands? This could clarify things. I can do that refactor if needed.
@michael-s-molina what's your thinking about flushing? Sadly there's no DAO used here, so the flush is required given that entry.id is used on the next line. In general, though, should the DAO flush, or should it be left up to the caller, who is context aware?
There's no right or wrong answer but I prefer to execute flush operations only when necessary to minimize database load. That means that the command is responsible to call flush when necessary.
@john-bodley @michael-s-molina should we create a DAO for the Key Value entities, and leave permalinks et al as the commands? This could clarify things. I can do that refactor if needed.
This would definitely improve things and make the code more similar to the rest of the application.
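As a rough illustration of "the command flushes only when it needs to", here is a sketch using the standard-library `sqlite3` module as a stand-in for the ORM session. It is an analogy, not Superset code: emitting the INSERT inside an open transaction plays the role of `db.session.flush()`, and the table and function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE key_value (id INTEGER PRIMARY KEY, value TEXT)")


def add_entry(value):
    """DAO-like helper: sends the INSERT to the database but never commits."""
    cursor = conn.execute("INSERT INTO key_value (value) VALUES (?)", (value,))
    return cursor.lastrowid


def create_entry_command(value):
    """Command-like caller: needs the id immediately, commits once at the end."""
    entry_id = add_entry(value)
    # The id is already assigned although the transaction is still open.
    id_known_before_commit = conn.in_transaction and entry_id is not None
    conn.commit()
    return entry_id, id_known_before_commit
```

The point is only that the primary key becomes available once the statement has been sent, so the caller that needs `entry.id` can decide when that round trip is worth paying for.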
@john-bodley @michael-s-molina here's a PR to convert the KV commands into a DAO: #29344
superset/commands/sql_lab/execute.py
Outdated
@michael-s-molina here's a better example where db.session.flush() is called in the command given that it's not invoked by the underlying DAO.
tests/unit_tests/utils/lock_tests.py
Outdated
This isn't needed for these tests and causes issues when used with a nested transaction when we want to roll back to a previous SAVEPOINT.
We do need to ensure that the lock is committed to the metastore somehow. But let me do the key value DAO refactor first, that might help clean up this test. In the interim, feel free to relax/disable the test if needed.
I think I disagree with this change (I may not have been able to accurately communicate why this is needed). But no worries, I will address this in #29344 after this PR lands and try to document the logic better.
As the name suggests the one_or_none() method could return None.
This mimics the logic in the code.
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@             Coverage Diff             @@
##           master   #24969       +/-   ##
===========================================
+ Coverage   60.48%   83.89%   +23.40%
===========================================
  Files        1931      518     -1413
  Lines       76236    37468    -38768
  Branches     8568        0     -8568
===========================================
- Hits        46114    31434    -14680
+ Misses      28017     6034    -21983
+ Partials     2105        0     -2105
This test only seems to fail for the test-postgres-presto workflow.
Unrelated comment: at some point we should replace Presto with Trino, as that's really where the broader community is right now.
michael-s-molina left a comment
Thank you for all the hard work here @john-bodley. Even though we were not able to fully implement SIP-99B, this PR is a step in the right direction and removes a lot of unnecessary code. I left some first-pass comments:
try:
    result = func(*args, **kwargs)
    db.session.commit()  # pylint: disable=consider-using-transaction
Because we were not able to use begin_nested here, do you see any point where previously we had only a flush that could potentially be rolled back, and now we have a @transaction which will effectively commit? Something like:
Previously:
    CommandA:
        try:
            do_something()
            CommandB()
            commit()
        except Exception:
            rollback()
    CommandB:
        do_something()
        flush()
Now:
    @transaction
    CommandA:
        do_something()
        CommandB()
    @transaction
    CommandB:
        do_something()
@michael-s-molina given that these @transaction decorators are defined at the "unit of work" level I think we're ok, i.e., I'm not sure where we ever had nested commands where one never committed and the outer explicitly rolled back.
I think for now we should consider commands as the unit of work, meaning we should assume they always commit at the end. If this is not the case we should probably introduce a sort-of notion of a sub-command, that doesn't commit. But let's leave that for a follow-up.
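The scenario raised in this thread can be reproduced with a small sketch. This is not the decorator from the PR: it assumes a commit-on-success, rollback-on-error wrapper (matching the `try` / `commit()` fragment quoted above), uses a `sqlite3` connection in place of `db.session`, and the command names are hypothetical.

```python
import functools
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (msg TEXT)")


def transaction(func):
    """Commit if the wrapped callable succeeds; roll back and re-raise otherwise."""

    @functools.wraps(func)
    def wrapped(*args, **kwargs):
        try:
            result = func(*args, **kwargs)
            conn.commit()
            return result
        except Exception:
            conn.rollback()
            raise

    return wrapped


@transaction
def command_b():
    conn.execute("INSERT INTO log VALUES ('b')")


@transaction
def command_a(fail_after_b):
    conn.execute("INSERT INTO log VALUES ('a')")
    command_b()  # the inner decorator commits here, including row 'a'
    if fail_after_b:
        raise RuntimeError("late failure in A")


def committed_rows():
    return [row[0] for row in conn.execute("SELECT msg FROM log ORDER BY rowid")]


try:
    command_a(fail_after_b=True)
except RuntimeError:
    pass

print(committed_rows())  # ['a', 'b']: the outer rollback has nothing left to undo
```

Without savepoints (for example `begin_nested`), the outer rollback cannot undo work the inner command has already committed, which is why treating each command as the unit of work, and introducing a non-committing sub-command if needed, is the suggested direction.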
superset/commands/database/update.py
Outdated
    database.set_sqlalchemy_uri(database.sqlalchemy_uri)
    ssh_tunnel = self._handle_ssh_tunnel(database)
    self._refresh_catalogs(database, original_database_name, ssh_tunnel)
except SSHTunnelError:  # pylint: disable=try-except-raise
I believe in this case you don't need the try/except, as there's no event logging or anything else in the except block.
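A short sketch of why the block is redundant: an `except` clause that only re-raises behaves the same as having no `try` at all. The exception class and function names here are stand-ins, not the Superset ones.

```python
class SSHTunnelError(Exception):
    """Stand-in for the SSH tunnel exception named in the diff."""


def open_tunnel():
    raise SSHTunnelError("tunnel refused")


def update_with_reraise():
    try:
        open_tunnel()
    except SSHTunnelError:  # only re-raises, so it adds nothing
        raise


def update_without_try():
    open_tunnel()


def raised_type(func):
    """Return the exception type that func raises, or None."""
    try:
        func()
    except Exception as ex:
        return type(ex)
    return None
```

Both variants surface the same exception to the caller, so the simpler one is preferable unless the handler logs or cleans up.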
villebro left a comment
This is a HUGE step in the right direction, and finally introduces a coherent pattern for dealing with complex ORM handling during the request lifecycle. Given that this fundamentally changes how the backend operates, I fear there may be significant risk of regressions here. However, those should be easy to fix now that we have consistent flushing, committing and rolling back. If nothing else, these potential regressions will highlight critical gaps in our test coverage. Therefore, I feel the benefits of this change far outweigh the interim regression risks it introduces.
commands =
    superset db upgrade
    superset init
    superset load-test-users
Random observation that's not directly related to this PR: I've always felt it's weird that the core application has functionality for loading test users. I feel at some point we should break that out into the test suite.
for pvm in pvms:
    pvms_dict[(pvm.permission, pvm.view_menu)].append(pvm)
duplicates = [v for v in pvms_dict.values() if len(v) > 1]
len(duplicates)
}
command = v1.ImportDashboardsCommand(contents, overwrite=True)
command.run()
command.run()
michael-s-molina left a comment
Thank you @john-bodley for addressing the comments. I agree with @villebro that the benefits greatly outweigh the risks here.
SUMMARY
This is a PR I've had on the back-burner for many months, but have struggled with on numerous occasions—often in part due to the flakey/delicate tests (and their associated frameworks). The initial desire was to fulfill the approach outlined in [SIP-99B] Proposal for (re)defining a "unit of work", but alas I failed, in part due to the challenges trying to untangle Superset logic which inherently is not overly conducive to adhering to the construct that a command should serve as a "unit of work".
Why is that? It's complicated, but asynchronous logic does not help, given that a Celery task running within the confines of another command needs to read a previously committed state given the READ COMMITTED isolation level. Issues like this could likely be overcome by having two commands, prepare and execute, as opposed to a single execute command.
The TL;DR is this PR should likely be interpreted as the first phase of SIP-99B. The general framework holds, i.e., DAOs no longer commit and a transaction decorator is used to wrap any command which performs either an INSERT, UPDATE, or DELETE.
Finally, I apologize for the size of the PR. I struggled to downsize the footprint, but once you start enforcing that DAOs should not commit, the number of files touched begins to snowball.
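The visibility problem mentioned for asynchronous work can be sketched with two connections to the same database. This uses a temporary SQLite file purely for illustration: SQLite's isolation semantics differ from READ COMMITTED on a typical metastore, and the `job` table and helper names are hypothetical. The only point shown is that a second connection (think: a worker) does not see rows the first connection has not yet committed.

```python
import os
import sqlite3
import tempfile


def visible_rows(path):
    """Count rows as seen from a brand-new connection (a stand-in for a worker)."""
    reader = sqlite3.connect(path)
    try:
        return reader.execute("SELECT COUNT(*) FROM job").fetchall()[0][0]
    finally:
        reader.close()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "metastore.db")
        writer = sqlite3.connect(path)
        writer.execute("CREATE TABLE job (id INTEGER PRIMARY KEY, payload TEXT)")
        writer.commit()

        # "Prepare" step: the row is written but the transaction is still open.
        writer.execute("INSERT INTO job (payload) VALUES ('SELECT 1')")
        before = visible_rows(path)

        # Only after the commit can the other connection read the row.
        writer.commit()
        after = visible_rows(path)
        writer.close()
        return before, after
```

Under these assumptions, a task dispatched before the commit would find nothing to work on, which is the motivation for committing in a prepare command before handing off to an execute command.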
Regrettably my time (for now) working on Apache Superset is likely drawing to a close, so for completeness I thought there was merit in sharing the incremental diff for what I was hoping to achieve in case @michael-s-molina @villebro et al. wanted to carry the baton on.
BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
TESTING INSTRUCTIONS
CI.
ADDITIONAL INFORMATION