sdk/db: do not hold the lock on Close #29097
@fairclothjm I may totally be missing something here, but this change feels unsafe...
What happens if there are two concurrent requests to Close the same database? If we drop the lock for the call to Close, we run the risk of having the g.instances reference changed underneath us and potentially panic, which doesn't seem like something we'd want.
Sorry for the drive-by review, but this seemed a little fishy so I figured I'd throw in my 2c. Hopefully I'm wrong and this is fine to do!
@fairclothjm thanks for putting this up! I think there are a couple of tweaks to get the locking logic right here but it shouldn't take too much.
@mpalmi good spot on the missing unlocks on return.
I wonder if there is a way we can still get the safety of defer by default 🤔. I'll think about this more but just fixing the two early returns for now should be fine.
What happens if there are two concurrent requests to Close the same database?
Great thought. I think that's safe (assuming we clean up the return unlocks) because:
- The first caller to get the lock removes the DB implementation from the map before it unlocks.
- The second caller will not find it and so returns the error fmt.Errorf("no database instance found").
- Then either the first call succeeds, and the DB is closed and removed,
- or the first call's Close fails, so it re-grabs the lock and puts the implementation back. This might not be necessary, but it preserves the old behaviour in case anything relied on being able to retry Close after a transient error instead of just leaking the resources, so it seems lower risk.
I think the one thing in that that could be a problem is that we should probably not blindly put back the broken one on error, just in case a new plugin instance has been started in the interim with the same ID and we end up replacing it with the old broken instance.
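To make the sequence above concrete, here is a minimal sketch of the locking pattern being discussed. It is not the code from this PR: `registry`, `closer`, `closeFunc`, `take`, `putBackIfAbsent`, and `closeInstance` are hypothetical names chosen for illustration, and only the error text is taken from the thread. Each critical section lives in a small helper, so every lock is still released with defer, while the potentially slow Close call runs with no lock held.

```go
package main

import (
	"errors"
	"sync"
)

// closer is a hypothetical stand-in for the database plugin interface.
type closer interface {
	Close() error
}

// closeFunc is a hypothetical adapter that lets a plain function act as a
// closer; it exists only to make the sketch easy to exercise.
type closeFunc func() error

func (f closeFunc) Close() error { return f() }

var errNotFound = errors.New("no database instance found")

// registry is a hypothetical stand-in for the type that owns the instances map.
type registry struct {
	mu        sync.Mutex
	instances map[string]closer
}

// take removes and returns the instance for id while holding the lock.
// A concurrent second caller finds nothing and gets ok == false.
func (r *registry) take(id string) (closer, bool) {
	r.mu.Lock()
	defer r.mu.Unlock()
	inst, ok := r.instances[id]
	if ok {
		delete(r.instances, id)
	}
	return inst, ok
}

// putBackIfAbsent restores inst only if no newer instance has been
// registered under the same id in the meantime.
func (r *registry) putBackIfAbsent(id string, inst closer) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, exists := r.instances[id]; !exists {
		r.instances[id] = inst
	}
}

// closeInstance calls Close without holding the lock, so a slow Close
// cannot block other users of the registry.
func (r *registry) closeInstance(id string) error {
	inst, ok := r.take(id)
	if !ok {
		return errNotFound
	}
	err := inst.Close() // lock is not held here
	if err != nil {
		// Keep the old retry behaviour, but never overwrite a newer instance.
		r.putBackIfAbsent(id, inst)
	}
	return err
}
```

Splitting the lock sections into helpers is one possible way to keep "defer by default"; an inline Lock/Unlock pair would work too, as long as every early return unlocks.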
@divyapola5 Hi, not sure how I accidentally added you as a reviewer here. Sorry about that!
Nice JM. I think this addresses all the feedback so far around lock safety and the edge case around overriding a newer instance when we put it back on a failure.
Description
This fixes a bug between the Vault gRPC client and the plugin gRPC server. It is related to slow database session End calls, which can be caused by a slow network or by database limits. A slow database session close is enough to block goroutines and prevent a new database config from being created, because the plugin's Type() call hits an internal timeout. This produces the following error in Vault: