-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Replace failing PolicyStepsRegistry sanity check assert with a test #85346
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace failing PolicyStepsRegistry sanity check assert with a test #85346
Conversation
It doesn't work as written because it expects the cachedSteps map to be the same between the call to the put and the call to getCachedStep -- but of course another thread could have put a new policy, which results in PolicyStepsRegistry#update being called and the cache being cleared (in which case the second call returns null).
|
Pinging @elastic/es-data-management (Team:Data Management) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for fixing this Joe
| } | ||
| } finally { | ||
| // tell the other thread we're finished | ||
| done.set(true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Super minor nit, but in the event this is the very last test run in a suite (or it's just run by itself), it's possible our thread leak detector will complain since we aren't doing a thread.join() on the thread we spawn before to wait for it to be completely done. Do you think we should add that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure thing, I'll get that in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
8a0f7ad to
0b4e18c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for fixing this Joe
| LifecycleAction action = randomValueOtherThan(MigrateAction.DISABLED, () -> randomFrom(phase.getActions().values())); | ||
| Step step = randomFrom(action.toSteps(client, phaseName, MOCK_STEP_KEY, null)); | ||
| Step actualStep = registry.getStep(indexMetadata, step.getKey()); | ||
| assertThat(actualStep.getKey(), equalTo(step.getKey())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit confused as to what this tests. There'll be a new step added, by-passing the cache, on every iteration - I'm quite confused specifically as we don't have any steps defined/registered before, yet we do setup some metadatas (both IndexMetadata and IndexLifecycleMetadata). Apologies if I'm missing something very obvious but could we document the intent here? (I'm guessing we want to populate the cache? )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough -- I spent a bit of time stressing over this exact point. It's a little bit tricky, I don't think you're missing something obvious. I'll add some comments and let's see where that gets us.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Follow up to #84588, which was itself a follow up to #82316
I wanted this assert in there out of an abundance of caution, but indeed the other change in #84588 to
PolicyStepsRegistryTestswas a fair test of the intended behavior change on that PR.This PR moves the stress test to actual test code. I've run it with
-Dtests.iters=10000and it's fine (with the assert remaining it fails the sanity check (returningnullrather than the expected step) in the neighborhood of 20% of the time as measured with-Dtests.iters=100). The test takes about a tenth of a second to run on a warmed up jvm on my box.Closes #84979
Closes #85036
Closes #85075
Closes #85175