Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Move event emitting off the main thread to avoid deadlocks #1314

Merged
merged 6 commits into from
Feb 13, 2025

Conversation

sideshowcoder
Copy link
Contributor

All code based on work by @chrfwow, based on #1299.

My current understanding is that when stacking event emitting inside an EventProvider, when using sychronization the EventProvider can deadlock, to avoid this move the event emitting of the main thread. Similar to how this is handled in EventSupport.

Still requires cleanup and moving the tests to a better location, I'm still not sure I 100% understand the reason for the deadlock, but wanted to get the work started by porting the testcase provided by @chrfwow.

@sideshowcoder sideshowcoder requested a review from a team as a code owner February 3, 2025 21:07
@sideshowcoder sideshowcoder marked this pull request as draft February 3, 2025 21:07
@sideshowcoder sideshowcoder changed the title WIP: Move event emitting off the main thread to avoid deadlocks fix: Move event emitting off the main thread to avoid deadlocks Feb 3, 2025
@sideshowcoder sideshowcoder force-pushed the sideshowcoder/issue/1299 branch 2 times, most recently from 04e35fa to 20ba91c Compare February 3, 2025 21:15
@beeme1mr beeme1mr marked this pull request as ready for review February 4, 2025 12:05
Copy link
Contributor

@chrfwow chrfwow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good so far, but I have a few suggestions

toddbaert added a commit to open-feature/community that referenced this pull request Feb 4, 2025
@chrfwow has been contributing significantly to our Java implementations:

- finally hook change in SDK: open-feature/java-sdk#1246
- flag metadata support in the flagd java provider: open-feature/java-sdk-contrib#1122

And most recently identified a deadlock in the Java SDK, opened an issue describing it, and has been helping to guide the fix by a new contributor: open-feature/java-sdk#1314

I'm nominating them for Java approver. @chrfwow please approve or 👍 this PR for signal your interest.

Signed-off-by: Todd Baert <[email protected]>
@sideshowcoder sideshowcoder force-pushed the sideshowcoder/issue/1299 branch from b9f3afe to 8c36616 Compare February 4, 2025 16:25
@toddbaert
Copy link
Member

This all makes sense to me. I see 2 test failures which I'm not really surprised about. It's possible that there's actually bugs here, but more likely it's inter-test interference or test-only race conditions revealed by this change. You can try running the failing tests individually and seeing if they work as expected.

@open-feature open-feature deleted a comment from liran-p Feb 4, 2025
public abstract class EventProvider implements FeatureProvider {
private EventProviderListener eventProviderListener;
private final ExecutorService emitterExecutor = Executors.newCachedThreadPool(runnable -> {
final Thread thread = new Thread(runnable);
thread.setDaemon(true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

daemon can have it stopped before emitting the remaining events and lose data.
The shutdown() method which is later called can gracefully shut it down with a timeout.

@open-feature open-feature deleted a comment from liran-p Feb 4, 2025
@sideshowcoder sideshowcoder requested a review from chrfwow February 5, 2025 13:46
Copy link
Contributor

@chrfwow chrfwow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your changes look good, just make sure to fix the failing tests then this PR can be merged I believe 🚀

@sideshowcoder
Copy link
Contributor Author

sideshowcoder commented Feb 6, 2025

I am looking at the test failures locally but the seem to be passing, when run on their own, I continue to check what makes the CI run fail, I think there is some interaction going on. Just wanted to quickly update that I'm looking at this 👍

@sideshowcoder sideshowcoder force-pushed the sideshowcoder/issue/1299 branch 2 times, most recently from 26df9e7 to 2b953e0 Compare February 10, 2025 12:45
Philipp Fehre and others added 3 commits February 12, 2025 15:31
When stacking event emitting inside an EventProvider, when using sychronization
the EventProvider can deadlock, to avoid this move the event emitting of the
main thread.

Signed-off-by: Philipp Fehre <[email protected]>
Test provider should respect the init-delay during all test

Signed-off-by: Philipp Fehre <[email protected]>
With the events being executed on a different thread, we need to wait to make
sure the thread is scheduled to have the events emitted.

Signed-off-by: Philipp Fehre <[email protected]>
@sideshowcoder sideshowcoder force-pushed the sideshowcoder/issue/1299 branch from 5c57681 to 121c99a Compare February 12, 2025 14:32
@@ -265,6 +265,8 @@
<artifactId>maven-surefire-plugin</artifactId>
<version>3.5.2</version>
<configuration>
<forkCount>1</forkCount>
<reuseForks>false</reuseForks>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok this is obviously not a solution, but I want to know what the reason for the test failures are, are they actual issues ore is it cross test interactions? It seems we are suffering from global setup issues due to the Singleton that is OpenFeatureApi. @toddbaert any suggestion how to get around this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which tests are failing? Today we merged something which increases the stability of the spectests

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's a test annotation to run some tests in isolation, which may be the best solution here. If you cannot narrow down the exact suite I'd support you adding forkCount 1 for now, and I can look into it later (or somebody else).

If you want to keep digging at it yourself, you're more than welcome, but it's not 100% related to your improvement, so I don't want to force you to do it if you are sure it's a bad test interaction.

Up to you!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ Execution(SAME_THREAD) annotation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah thanks! Yes it is always the same tests so I'm gonna annotate them and remove this global one. And then this should be done.

Copy link
Contributor Author

@sideshowcoder sideshowcoder Feb 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok so I tried this out locally and got some fixes but still have some failing tests, with annotations it is not possible to get the same isolation as with no reuse for forks. I am gonna spent more time tomorrow on this and if I can't resolve it I would suggest to move this to a followup PR?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm in favor of merging as is and following up with improvements in a separate PR. @toddbaert @liran2000 any concerns about that?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No objections.

Signed-off-by: Philipp Fehre <[email protected]>
@sideshowcoder sideshowcoder force-pushed the sideshowcoder/issue/1299 branch from 121c99a to eb10367 Compare February 12, 2025 16:10
Copy link

codecov bot commented Feb 12, 2025

Codecov Report

Attention: Patch coverage is 57.14286% with 6 lines in your changes missing coverage. Please review.

Project coverage is 93.22%. Comparing base (08c38fb) to head (69e056f).
Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
...c/main/java/dev/openfeature/sdk/EventProvider.java 57.14% 5 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #1314      +/-   ##
============================================
- Coverage     93.51%   93.22%   -0.30%     
- Complexity      463      467       +4     
============================================
  Files            43       43              
  Lines          1110     1121      +11     
  Branches         89       90       +1     
============================================
+ Hits           1038     1045       +7     
- Misses           44       48       +4     
  Partials         28       28              
Flag Coverage Δ
unittests 93.22% <57.14%> (-0.30%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@chrfwow chrfwow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lines not covered by tests are fine for me

@toddbaert toddbaert merged commit c33ac2d into open-feature:main Feb 13, 2025
9 checks passed
@toddbaert
Copy link
Member

@sideshowcoder thanks for all your efforts on this! Well done. It will be available in our next release.

Consider joining the org: open-feature/community#427

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants