-
Notifications
You must be signed in to change notification settings - Fork 25.6k
[Transform] Don't search if all indices are unchanged between checkpoints #77204
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Transform] Don't search if all indices are unchanged between checkpoints #77204
Conversation
…ints When every index that a transform is configured to search has remained completely unchanged between checkpoints the transform should not do a search at all. Following elastic#75839 there was a problem where the scenario of all indices being unchanged between checkpoints could cause an empty list of indices to be searched, which Elasticsearch treats as meaning _all_ indices. This change should prevent that happening in future. Fixes elastic#77137
|
Pinging @elastic/ml-core (Team:ML) |
|
Labelled This proved to be very hard to test within the existing framework for testing continuous transforms using an external cluster. That framework repeatedly stops and starts the continuous transforms. Doing this appears to make the transform go through a different code path that never passed through the search avoidance code. Any advice on how to add this without adding a completely new continuous transform test framework would be gratefully received. Alternatively, maybe the QA tests can be used to guard against any regressions. |
| SearchRequest searchRequest = namedSearchRequest.v2(); | ||
| // We want to treat a request to search 0 indices as a request to do nothing, not a request to search all indices | ||
| if (searchRequest.indices().length == 0) { | ||
| logger.debug("[{}] Search optimized to noop; request [{}]", getJobId(), searchRequest); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| logger.debug("[{}] Search optimized to noop; request [{}]", getJobId(), searchRequest); | |
| logger.debug(() -> new ParameterizedMessage("[{}] Search optimized to noop; request [{}]", getJobId(), searchRequest)); |
Serializing the search request to string could be non trivial
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, it would be good to log which named search was optimized namedSearchRequest.v1()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The debug() method is calling https://github.com/apache/logging-log4j2/blob/85b564a49f420fbcb9ac8abd3a20a252303d8198/log4j-api/src/main/java/org/apache/logging/log4j/spi/AbstractLogger.java#L449 which takes Object arguments. They only have their toString() methods called if logging at the level is enabled. As far as I can see the ParameterizedMessage approach is only necessary when you want to log both a parameterized message and an exception with stack trace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AH! Cool! Well, forget about the message supplier stuff, but having the search name would be nice.
| listener.onResponse(null); | ||
| return; | ||
| } | ||
| logger.trace("searchRequest: [{}]", searchRequest); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| logger.trace("searchRequest: [{}]", searchRequest); | |
| logger.trace(() -> new ParameterizedMessage("searchRequest: [{}]", searchRequest)); |
|
|
||
| sourceBuilder.query(queryBuilder); | ||
| logger.debug(() -> new ParameterizedMessage("[{}] Querying for data: {}", getJobId(), sourceBuilder)); | ||
| logger.debug("[{}] Querying {} for data: {}", getJobId(), request.indices(), sourceBuilder); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I always thought that the logger would serialize the strings unless we provided a message provider.
Consequently, if we weren't on debug level, we are serializing the sourceBuilder unnecessarily.
| sourceBuilder.query(filteredQuery); | ||
|
|
||
| logger.debug("[{}] Querying for changes: {}", getJobId(), sourceBuilder); | ||
| logger.debug("[{}] Querying {} for changes: {}", getJobId(), request.indices(), sourceBuilder); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ParameterizedMessage?
|
@elasticmachine update branch |
I did a manual test instead that proves the changes of this PR work. However, there might also be some other bug that the automated tests aren't revealing because they repeatedly stop and restart the "continuous" transforms. This is what I did:
The transform worked for its initial checkpoint. The dodgy thing was that the second checkpoint that validated this bug fix didn't take place for over 8 minutes: This seems strange to me given that both Then the next two times we avoided searching all indices were 16 minutes later and 6 minutes after that: |
…ints (elastic#77204) When every index that a transform is configured to search has remained completely unchanged between checkpoints the transform should not do a search at all. Following elastic#75839 there was a problem where the scenario of all indices being unchanged between checkpoints could cause an empty list of indices to be searched, which Elasticsearch treats as meaning _all_ indices. This change should prevent that happening in future. Fixes elastic#77137
💚 Backport successful
|
…ints (#77204) (#77245) When every index that a transform is configured to search has remained completely unchanged between checkpoints the transform should not do a search at all. Following #75839 there was a problem where the scenario of all indices being unchanged between checkpoints could cause an empty list of indices to be searched, which Elasticsearch treats as meaning _all_ indices. This change should prevent that happening in future. Fixes #77137
I think this is by design, because transform should stop early if there are no changes. I wonder why #77137 passed this pre-check. I will try your repro steps, but enable trace logging, if my suspicion is correct the log should tell us, see https://github.com/elastic/elasticsearch/blob/master/x-pack/plugin/transform/src/main/java/org/elasticsearch/xpack/transform/transforms/TransformIndexer.java#L374: The reason why after 8 and 16 minutes it found something should be future timestamps in the sample data. However, I am surprised that you saw: Something is fishy. |
|
After enabling trace logging: This is expected. I think the real problem sits in #75839 and the failures reported in #77310 is probably connected to the change in #75839. |
| // We want to treat a request to search 0 indices as a request to do nothing, not a request to search all indices | ||
| if (searchRequest.indices().length == 0) { | ||
| logger.debug("[{}] Search request [{}] optimized to noop; searchRequest [{}]", getJobId(), name, searchRequest); | ||
| listener.onResponse(null); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The introduction of a magic value null is dangerous, what if the client returns null. It seems better to return an empty result set, so that null causes the transform to fail as before.
|
I think the bug is a disconnection between sequence id's and timestamp. Let's take checkpoint A with: and checkpoint B with: The change in #75839 calculates the diff between A and B purely on sequence id's. Because they are equal However we must take the timestamp and A could contain documents that are part of the checkpoints w.r.t. to the seq id, but not queried because of |
When every index that a transform is configured to search has
remained completely unchanged between checkpoints the transform
should not do a search at all.
Following #75839 there was a problem where the scenario of all
indices being unchanged between checkpoints could cause an empty
list of indices to be searched, which Elasticsearch treats as
meaning all indices. This change should prevent that happening
in future.
Fixes #77137