Skip to content

chore(dao/command): Add transaction decorator to try to enforce "unit of work"#24969

Merged
john-bodley merged 1 commit intoapache:masterfrom
john-bodley:john-bodley--dao-nested-session
Jun 28, 2024
Merged

chore(dao/command): Add transaction decorator to try to enforce "unit of work"#24969
john-bodley merged 1 commit intoapache:masterfrom
john-bodley:john-bodley--dao-nested-session

Conversation

@john-bodley
Copy link
Member

@john-bodley john-bodley commented Aug 11, 2023

SUMMARY

This is a PR I've had on the back-burner for many months, but have struggled with on numerous occasions—often in part due to the flakey/delicate tests (and their associated frameworks). The initial desire was to fulfill the approach outlined in [SIP-99B] Proposal for (re)defining a "unit of work", but alas I failed, in part due to the challenges trying to untangle Superset logic which inherently is not overly conducive to adhering to the construct that a command should serve as a "unit of work".

Why is that? It's complicated, but asynchronous logic does not help given that a Celery task running within the confines of another command needs to read a previously committed state given the READ COMMITTED isolation level. Issues like this could likely be overcome by having two commands—prepare and execute—as opposed to a single execute command.

The TL;DR is this PR should likely be interpreted as the first phase of SIP-99B. The general framework holds, i.e., DAOs no longer commit and a transaction decorator is used to wrap any command which perform either an INSERT, UPDATE, or DELETE.

Finally, I apologize for the size of the PR. I struggled to downside the footprint, but once you start enforcing that DAOs should not commit, then the files which touched begins to snowball.

Regrettably my time (for now) working on Apache Superset is likely drawing to a close, so for completeness I thought there was merit in sharing the incremental diff for what I was hoping to achieve in case @michael-s-molina @villebro et al. wanted to carry the baton on.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

CI.

ADDITIONAL INFORMATION

  • Has associated issue: [SIP-99B] Proposal for (re)defining a "unit of work" #25108
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@john-bodley john-bodley changed the title John bodley dao nested session chore(dao): Use nested sessions Aug 11, 2023
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're really inconsistent with our error handling. The BaseDAO.delete method wraps all SQLAlchemyError errors as DAODeleteFailedError whereas here they are left as is.

@john-bodley john-bodley force-pushed the john-bodley--dao-nested-session branch 2 times, most recently from de2c324 to 37f0b24 Compare August 16, 2023 23:54
@john-bodley john-bodley force-pushed the john-bodley--dao-nested-session branch from 37f0b24 to 4e51d4d Compare August 19, 2023 05:53
@john-bodley john-bodley force-pushed the john-bodley--dao-nested-session branch from 4e51d4d to b5c2e81 Compare August 19, 2023 05:55
@john-bodley john-bodley force-pushed the john-bodley--dao-nested-session branch 4 times, most recently from 7b201dd to ef7bcd1 Compare January 30, 2024 01:32
@john-bodley john-bodley changed the title chore(dao): Use nested sessions chore(dao/command): Use nested sessions Jan 30, 2024
@john-bodley john-bodley force-pushed the john-bodley--dao-nested-session branch 11 times, most recently from f03d7b0 to 10dc755 Compare January 31, 2024 00:38
@john-bodley john-bodley force-pushed the john-bodley--dao-nested-session branch from 10dc755 to 092aa45 Compare February 13, 2024 05:08
@github-actions github-actions bot added the api Related to the REST API label Feb 13, 2024
@john-bodley john-bodley force-pushed the john-bodley--dao-nested-session branch 2 times, most recently from 82de210 to 7b651ef Compare February 13, 2024 23:16
@john-bodley john-bodley force-pushed the john-bodley--dao-nested-session branch from c2e2027 to 19afeb3 Compare May 21, 2024 19:56
Copy link
Member Author

@john-bodley john-bodley May 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Obvious comment and thus not needed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Obvious comment.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Obvious comment.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Obvious comment.

@john-bodley john-bodley force-pushed the john-bodley--dao-nested-session branch from 19afeb3 to 69a6e66 Compare June 12, 2024 16:15
@john-bodley john-bodley force-pushed the john-bodley--dao-nested-session branch from 69a6e66 to c71a3cb Compare June 13, 2024 18:57
@john-bodley john-bodley force-pushed the john-bodley--dao-nested-session branch 3 times, most recently from 51ce868 to ff38e15 Compare June 18, 2024 23:09
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@michael-s-molina what's your thinking about flushing? Sadly there's no DAO used here and thus it's required given than on the next line the entry.id is used, but in general should the DAO flush or should it be left up to the caller who is context aware?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@john-bodley @michael-s-molina should we create a DAO for the Key Value entities, and leave permalinks et al as the commands? This could clarify things. I can do that refactor if needed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@michael-s-molina what's your thinking about flushing? Sadly there's no DAO used here and thus it's required given than on the next line the entry.id is used, but in general should the DAO flush or should it be left up to the caller who is context aware?

There's no right or wrong answer but I prefer to execute flush operations only when necessary to minimize database load. That means that the command is responsible to call flush when necessary.

@john-bodley @michael-s-molina should we create a DAO for the Key Value entities, and leave permalinks et al as the commands? This could clarify things. I can do that refactor if needed.

This would definitely improve things and make the code more similar to rest of the application.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@john-bodley @michael-s-molina here's a PR to convert the KV commands into a DAO: #29344

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@michael-s-molina here's a better example where db.session.flush() is called in the command given that it's not invoked by the underlying DAO.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't needed for these tests and causes issues when used with a nested transaction when we want to rollback to a previous SAVEPOINT.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do need to ensure that the lock is committed to the metastore somehow. But let me do the key value DAO refactor first, that might help clean up this test. In the interim, feel free to relax/disable the test if needed.

Copy link
Member

@villebro villebro Jun 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I disagree with this change (I may not have been able to accurately communicate why this is needed). But no worries, I will address this in #29344 after this PR lands and try to document the logic better.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the name suggests the one_or_none() method could return None.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This mimics the logic in the code.

@codecov
Copy link

codecov bot commented Jun 26, 2024

Codecov Report

Attention: Patch coverage is 89.57816% with 42 lines in your changes missing coverage. Please review.

Project coverage is 83.89%. Comparing base (76d897e) to head (51cb1f6).
Report is 1094 commits behind head on master.

Files with missing lines Patch % Lines
superset/extensions/pylint.py 0.00% 8 Missing ⚠️
superset/cli/update.py 0.00% 3 Missing ⚠️
superset/daos/tag.py 50.00% 3 Missing ⚠️
superset/cli/examples.py 0.00% 2 Missing ⚠️
superset/cli/main.py 0.00% 2 Missing ⚠️
superset/cli/test.py 0.00% 2 Missing ⚠️
superset/commands/database/ssh_tunnel/update.py 81.81% 2 Missing ⚠️
superset/commands/importers/v1/examples.py 0.00% 2 Missing ⚠️
superset/sqllab/sql_json_executer.py 33.33% 2 Missing ⚠️
superset/commands/dataset/duplicate.py 96.66% 1 Missing ⚠️
... and 15 more
Additional details and impacted files
@@             Coverage Diff             @@
##           master   #24969       +/-   ##
===========================================
+ Coverage   60.48%   83.89%   +23.40%     
===========================================
  Files        1931      518     -1413     
  Lines       76236    37468    -38768     
  Branches     8568        0     -8568     
===========================================
- Hits        46114    31434    -14680     
+ Misses      28017     6034    -21983     
+ Partials     2105        0     -2105     
Flag Coverage Δ
hive 49.10% <47.89%> (-0.06%) ⬇️
javascript ?
mysql 77.41% <88.83%> (?)
postgres 77.52% <89.08%> (?)
presto 53.72% <47.89%> (-0.08%) ⬇️
python 83.89% <89.57%> (+20.40%) ⬆️
sqlite 76.99% <86.10%> (?)
unit 59.65% <56.32%> (+2.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test only seems to fail for the test-postgres-presto workflow.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unrelated comment: at some point we should replace Presto with Trino, as that's really where the broader community is at right now..

Copy link
Member

@michael-s-molina michael-s-molina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for all the hard work here @john-bodley. Even though we were not able to fully implement SIP-99B, this PR is a step in the right direction and removes a lot of unnecessary code. I left some first-pass comments:


try:
result = func(*args, **kwargs)
db.session.commit() # pylint: disable=consider-using-transaction
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we were not able to use begin_nested here, do you see any point where previously we had only a flush that could be potentially rollbacked and now we have a @transaction which will effectively commit? Something like:

Previously:

CommandA:
   try:
      do_something()
      CommandB()
      commit()
   except Exception:
      rollback()

CommandB:
   do_something()
   flush()

Now:

@transaction
CommandA:
   do_something()
   CommandB()
 
@transaction
CommandB:
   do_something()

Copy link
Member Author

@john-bodley john-bodley Jun 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@michael-s-molina given that these @transaction decorators are defined at the "unit of work" level I think we're ok, i.e., I'm not sure where we ever had nested commands where one never committed and the outer explicitly rolled back.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for now we should consider commands as the unit of work, meaning we should assume they always commit at the end. If this is not the case we should probably introduce a sort-of notion of a sub-command, that doesn't commit. But let's leave that for a follow-up.

database.set_sqlalchemy_uri(database.sqlalchemy_uri)
ssh_tunnel = self._handle_ssh_tunnel(database)
self._refresh_catalogs(database, original_database_name, ssh_tunnel)
except SSHTunnelError: # pylint: disable=try-except-raise
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe in this case you don't need the try/catch as there's no event logging or anything in the catch block.

Copy link
Member

@villebro villebro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a HUGE step in the right direction, and finally introduces a coherent pattern for dealing with complex ORM handling during the request lifecycle. Given that this fundamentally changes how the backend operates, I fear there may be significant risk for regressions here. However, those should be easy to fix now that we have consistent flushing, committing and rollbacking. If nothing else, these potential regrssions will highlight critical gaps in our test coverage. Therefore, I feel the benefits of this change far outweigh the intermediate regression risks it introduces.

commands =
superset db upgrade
superset init
superset load-test-users
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Random observation that's not directly related to this PR: I've always felt it's weird that the core application has functionality for loading test users. I feel at some point we should break that out into the test suite.

for pvm in pvms:
pvms_dict[(pvm.permission, pvm.view_menu)].append(pvm)
duplicates = [v for v in pvms_dict.values() if len(v) > 1]
len(duplicates)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What on earth was this? 🤔


try:
result = func(*args, **kwargs)
db.session.commit() # pylint: disable=consider-using-transaction
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for now we should consider commands as the unit of work, meaning we should assume they always commit at the end. If this is not the case we should probably introduce a sort-of notion of a sub-command, that doesn't commit. But let's leave that for a follow-up.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unrelated comment: at some point we should replace Presto with Trino, as that's really where the broader community is at right now..

}
command = v1.ImportDashboardsCommand(contents, overwrite=True)
command.run()
command.run()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice

Copy link
Member

@villebro villebro Jun 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I disagree with this change (I may not have been able to accurately communicate why this is needed). But no worries, I will address this in #29344 after this PR lands and try to document the logic better.

Copy link
Member

@michael-s-molina michael-s-molina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @john-bodley for addressing the comments. I agree with @villebro that the benefits greatly outweigh the risks here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api Related to the REST API 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels change:backend Requires changing the backend size/XXL 🚢 4.1.0 First shipped in 4.1.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants