-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Master thread safe loader #54878
Master thread safe loader #54878
Conversation
3f78b25
to
3d3e18d
Compare
I'm getting this error on all the tests now:
|
It looks like #49269 is an issue again even though all it's fixes are present in master |
The threadsafe loader has trouble representing undefined data types in the yamldumper (similar to #43694). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's more than the thread proxy stuff right?
Separate changes?
This is all the threadsafe loader + other changes required to make tests pass that failed with the threadsafe loader |
On this PR the
|
Verified, skipping |
f49179d
to
0c66e3b
Compare
Is this issue is a problem again? saltstack/salt-ci-images#504 |
Found it! This is the line of code causing all the trouble on mac: |
8f16c70
to
2e52575
Compare
The last commit did it, there might be a problem with the macosxmojave build or with the version of the grp module, but for now I think everything is working :) |
re-run full centos7 |
re-run full opensuse15 |
Since this is such a large PR we'll break it up into smaller PRs to get it into salt. The msgpack stuff will come out first. |
5077f29
to
ac9faf9
Compare
re-run full all |
The centos7 test failure is a serious issue with this PR:
|
7b50486
@Akm0d I believe this is being addressed in a more comprehensive way for the proxy minion work @garethgreenaway is working on. We should hold off on this to see how that work pans out. |
What does this PR do?
Merge PR from develop #50655 which was a new version of #39948, #48598, and #50648 so the description has been copied from there. This PR will hopefully resolve the problems that existed in the original PRs. The code is identical to #50648, but this PR is based on the develop branch instead of the 2018.3.3 branch because Jenkins obviously cannot run tests for branches based on the 2018.3.3 branch.
There was a race condition in the salt loader when injecting global values (e.g. "pillar" or "salt") into modules. One effect of this race condition was that in a setup with multiple threads, some threads may see pillar data intended for other threads or the pillar data seen by a thread might even change spuriously.
There have been earlier attempts to fix this problem (#27937, #29397). These patches tried to fix the problem by storing the dictionary that keeps the relevant data in a thread-local variable and referencing this thread-local variable from the variables that are injected into the modules.
These patches did not fix the problem completely because they only work when a module is loaded through a single loader instance only. When there is more than one loader, there is more than one thread-local variable and the variable injected into a module is changed to point to another thread-local variable when the module is loaded again. Thus, the problem resurfaced while working on #39670.
This patch attempts to solve the problem from a slightly different angle, complementing the earlier patches: The value injected into the modules now is a proxy that internally uses a thread-local variable to decide to which object it points. This means that when loading a module
again through a different loader (possibly passing different pillar data), the data is actually only changed in the thread in which the loader is used. Other threads are not affected by such a change.
This means that it will work correctly in the current situation where loaders are possibly created by many different modules and these modules do not necessary know in which context they are executed. Thus it is much more flexible and reliable than the more explicit approach
used by the two earlier patches.
What issues does this PR fix or reference?
This PR fixes problems that surfaced when developing the parallel runners feature (#39670), but is also related to problems reported earlier (#23373). The problems described in #29028 might also be related, though this needs further confirmation.
Previous Behavior
Changes to pillar data in one thread might have influenced pillar data in a different thread. Whether or when this problem actually appeared was unpredictable (due to the nature of a race condition).
New Behavior
Changes to pillar data in one thread will never affect the pillar data in a different thread.
Tests written?
Yes
Regression Potential
The change to the loader code itself is rather small, but it affects a central component. So if there is a problem, the impact might affect nearly all components.