-
Notifications
You must be signed in to change notification settings - Fork 2.3k
Prefer adaptive replica selection over awareness attribute based routing #1107
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
✅ DCO Check Passed 0503cd7c1b98d65e536d5cac7b191484d6416e04 |
|
✅ Gradle Wrapper Validation success 0503cd7c1b98d65e536d5cac7b191484d6416e04 |
|
❌ Gradle Precommit failure 0503cd7c1b98d65e536d5cac7b191484d6416e04 |
|
✅ DCO Check Passed 4fc97670452ee9cb95dcde03f91c4f1f1a6fef14 |
|
✅ Gradle Wrapper Validation success 4fc97670452ee9cb95dcde03f91c4f1f1a6fef14 |
|
❌ Gradle Precommit failure 4fc97670452ee9cb95dcde03f91c4f1f1a6fef14 |
|
✅ Gradle Wrapper Validation success 80ec765b2fc6f3da05ada355b4adc82731a3e4ba |
|
✅ DCO Check Passed 80ec765b2fc6f3da05ada355b4adc82731a3e4ba |
|
✅ Gradle Precommit success 80ec765b2fc6f3da05ada355b4adc82731a3e4ba |
server/src/test/java/org/opensearch/cluster/routing/OperationRoutingAwarenessTests.java
Outdated
Show resolved
Hide resolved
|
@dvjyothsna could you please include rally perf numbers as well here or in the parent issue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nknize @AmiStrn will this be considered backward compatibility breaking? This is doing 2 things
- Getting rid of the deprecated system property based override mechanism to toggle routing preferences and replacing it with a cluster setting
- Changing the default value from false (of now removed system property) to true (of new cluster setting) - end result is ARS gets enabled everywhere
A nice side effect is bugs like elastic/elasticsearch#53054 will get addressed with default=true
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is breaking imo.
for (1) this is fine, I think we should be doing this. Backporting to 1.1 can be done with a better deprecation message.
for (2) I would say the same, however, I noticed some bugs in the ARS implementation were addressed in Elasticsearch since the fork. These should be checked before making this the default.
one example of a bug we forked to opensearch: elastic/elasticsearch#65838
(this is a mix up in the formula implementation so I wouldn't worry about any copyright stuff on that one).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AmiStrn are you saying we should change the default in this PR back to old behavior before merging?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The previous deprecation message was "searches will not be routed based on awareness attributes starting in version 8.0.0" which tells me that the intent was for awareness attribute based routing was planned to be removed in version 8.0.0 (i.e. OpenSearch 2.0). It appears to me that this change keeps the awareness attribute-based routing feature (though sets the default to disabled) and removes the deprecation message. Should we just be removing awareness attributed-based routing in OpenSearch 2.0 all together?
I don't have much of any context on this particular issue, but my bias is towards making things simple when possible, so I would suggest completely removing the feature if at all possible. Changing the default is a breaking change so we might as go all the way unless there is a really good reason to keep the setting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andrross The idea is to keep the setting to have a mechanism to enable it in future if needed and also monitor the effectiveness before removing the setting. I agree that it is a breaking change which is why its a good idea to have the setting to toggle before completely removing it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the setting is temporary while we monitor the effectiveness of the new default, then shouldn't there be a deprecation message if a user configures cluster.search.ignore_awareness_attributes since it will go away in the future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear, I think we should really consider removing the setting. We're confident enough to make the new behavior the default and the setting was deprecated in the previous major version. However, I will defer to the opinion of folks with more knowledge of this specific feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andrross Based on initial tests using rally the results looks promising. But we want to keep a setting to control the feature to toggle it for couple of releases with enabled as default. This will help to see if it is causing any regression with the new behavior or not and have control on it. We can add the deprecation message in later releases when we feel comfortable with this default value and plan to remove the setting completely.
As far as existing deprecation message is concerned I would think it anyways doesn't make sense from opensearch perspective because of version 8.0.0. We can choose to deprecate it in later releases of opensearch rather than sticking with elastic behavior.
|
Performance numbers from rally tests All tests are performed on domain with 6 data nodes
Latency(ms)
Throughput(req/s)
|
server/src/test/java/org/opensearch/cluster/routing/OperationRoutingAwarenessTests.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/opensearch/cluster/routing/OperationRoutingAwarenessTests.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/opensearch/cluster/routing/OperationRoutingTests.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/opensearch/cluster/routing/OperationRoutingTests.java
Outdated
Show resolved
Hide resolved
|
Can one of the admins verify this patch? |
|
✅ Gradle Wrapper Validation success eb3591f6dedb13ade0985190a0c6259335a261dc |
|
✅ DCO Check Passed eb3591f6dedb13ade0985190a0c6259335a261dc |
|
✅ Gradle Precommit success eb3591f6dedb13ade0985190a0c6259335a261dc |
|
✅ Gradle Wrapper Validation success eb3591f6dedb13ade0985190a0c6259335a261dc |
|
✅ DCO Check Passed eb3591f6dedb13ade0985190a0c6259335a261dc |
|
✅ Gradle Precommit success eb3591f6dedb13ade0985190a0c6259335a261dc |
malpani
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good to me! just left a nit and will approve once branching conflicts are addressed
server/src/main/java/org/opensearch/common/settings/ClusterSettings.java
Outdated
Show resolved
Hide resolved
|
✅ Gradle Wrapper Validation success e47baa6aa19e81f7df952ad1c719f861dbf99ea1 |
|
@malpani Addressed comments and rebased changes. Please take a look |
|
❌ Gradle Precommit failure e47baa6aa19e81f7df952ad1c719f861dbf99ea1 |
|
❌ Gradle Check failure e47baa6aa19e81f7df952ad1c719f861dbf99ea1 |
|
✅ Gradle Precommit success 822028510a8d2645e981f96c41f77c759b9b4d9e |
|
❌ Gradle Check failure 822028510a8d2645e981f96c41f77c759b9b4d9e |
|
✅ Gradle Wrapper Validation success 59ca8f37fb8ccd31f3beec85f307da24353d0dc9 |
|
✅ Gradle Precommit success 59ca8f37fb8ccd31f3beec85f307da24353d0dc9 |
|
✅ Gradle Check success 59ca8f37fb8ccd31f3beec85f307da24353d0dc9 |
kkhatua
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM +1
|
For backporting to 1.x do we just flip the default back and add the deprecation message again? |
|
@dvjyothsna given the performance test data here, i am inclined to backport as is. |
|
Sorry to be late to this, @dvjyothsna can you please rebase the PR and @andrross can help merge it? |
Signed-off-by: jyothsna <[email protected]>
Signed-off-by: jyothsna <[email protected]>
Signed-off-by: jyothsna <[email protected]>
Signed-off-by: jyothsna <[email protected]>
Signed-off-by: jyothsna <[email protected]>
|
I'm not so sure about backporting this change as is. It is more than just changing a default because it also changes how the configuration option is set. For example, if an existing 1.x user has explicitly configured the system property, then that setting would be ignored if we backport the change since this removes the system property and replaces it with a cluster setting. |
I am also wondering why we need to backport this change instead of keeping it in current release only ? |
|
@dblock Regarding your backport question...on 1.x you have to specifically opt-in to get the "prefer adaptive replica selection" behavior and there is a deprecation message saying that the new behavior would be the default in the next major version. This PR makes the new behavior the default in version 2.0+. Therefore I don't think it is appropriate to backport this change since it is essentially a breaking change that was planned for the next major version. Let me know if you disagree. |
|
I agree that back porting by either changing/not changing the default doesn't make sense. Will keep this feature from current version |
Adaptive Replica Selection was introduced to pick the best copy of the shard to route the shard query thus reducing the response latency. Currently the awareness attributes are always given preference over the Adaptive Replica Selection during shard query routing. In this CR, a new cluster setting cluster.search.ignore_awareness_attributes is introduced to ignore the awareness attributes. By default, it is set to true. The idea is that ARS is enabled by default and the users can reap benefits of ARS.
Signed-off-by: jyothsna [email protected]
Description
Introduced a new setting
cluster.search.ignore_awareness_attributesto ignore the awareness properties during search and use Adaptive Replica SelectionIssues Resolved
#760
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.