Fast refresh indices should use search shards by kingherc · Pull Request #113478 · elastic/elasticsearch

kingherc · 2024-09-24T15:38:05Z

Fast refresh indices should now behave like non-fast-refresh indices in how they execute (m)gets and searches, i.e., they should use the search shards.

For BWC, we define a new transport version. We expect search shards to be upgraded first, before promotable shards. Until the cluster is fully upgraded, the promotable shards (whether upgraded or not) will still receive and execute gets/searches locally.

Relates ES-9573
Relates ES-9579
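The backward-compatibility rule in the description can be sketched roughly as follows. This is only an illustration under stated assumptions: the class, the enum, the integer version ids, and the constant `SEARCH_SHARDS_FOR_FAST_REFRESH` are hypothetical stand-ins and are not taken from the Elasticsearch code base. The real transport version constant is not shown on this page.

```java
import java.util.List;

// Hypothetical sketch; none of these names come from the Elasticsearch code base.
final class FastRefreshRoutingSketch {
    // Stand-in for the new transport version defined by the PR (the real id is not shown here).
    static final int SEARCH_SHARDS_FOR_FAST_REFRESH = 2;

    enum Target { PROMOTABLE_SHARD, SEARCH_SHARD }

    // Search shards serve gets/searches for fast refresh indices only once
    // every node in the cluster is on the new transport version.
    static Target route(List<Integer> nodeTransportVersions) {
        int min = nodeTransportVersions.stream().mapToInt(Integer::intValue).min().orElseThrow();
        return min >= SEARCH_SHARDS_FOR_FAST_REFRESH ? Target.SEARCH_SHARD : Target.PROMOTABLE_SHARD;
    }

    public static void main(String[] args) {
        System.out.println(route(List.of(1, 1))); // nothing upgraded yet
        System.out.println(route(List.of(2, 1))); // search node upgraded, primary still old
        System.out.println(route(List.of(2, 2))); // fully upgraded cluster
    }
}
```

In this model, only the fully upgraded cluster routes to the search shard. The two mixed or old configurations keep executing on the promotable shard, which matches the description's "until the cluster is fully upgraded" wording.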

elasticsearchmachine · 2024-09-30T14:35:32Z

Pinging @elastic/es-distributed (Team:Distributed)

arteam

LGTM! Looking forward to simplifying things with ES-9563 after this PR is successfully rolled out.

henningandersen

I have a few comments/questions.

.../org/elasticsearch/action/admin/indices/refresh/TransportUnpromotableShardRefreshAction.java

server/src/main/java/org/elasticsearch/action/get/TransportGetAction.java

henningandersen · 2024-10-03T09:54:11Z

server/src/main/java/org/elasticsearch/action/support/replication/PostWriteRefresh.java

-                    // Fast refresh indices do not depend on the unpromotables being refreshed
-                    boolean fastRefresh = IndexSettings.INDEX_FAST_REFRESH_SETTING.get(indexShard.indexSettings().getSettings());
-                    if (location != null && (indexShard.routingEntry().isSearchable() == false && fastRefresh == false)) {
+                    if (location != null && indexShard.routingEntry().isSearchable() == false) {


This fixes it for future refreshes after the indexing node is upgraded. But it does not guarantee immediate availability of the latest state on the search node. So we risk some seconds of non-realtime GET requests going backwards during such an upgrade? I think real-time GET requests will be saved by the wait-for generation, is that also your understanding?
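To make the effect of the diff concrete, the two versions of the condition can be compared side by side. This is a minimal sketch: the boolean parameters are hypothetical replacements for `location != null`, `indexShard.routingEntry().isSearchable()`, and the `INDEX_FAST_REFRESH_SETTING` value. What the guarded branch does (presumably refreshing the unpromotable shards, as the removed comment suggests) is an assumption.

```java
// Hypothetical sketch of the condition changed in the diff; the parameters replace the real shard objects.
final class PostWriteRefreshConditionSketch {
    // Condition before the change: fast refresh indices never entered the guarded branch.
    static boolean before(boolean hasLocation, boolean searchable, boolean fastRefresh) {
        return hasLocation && (searchable == false && fastRefresh == false);
    }

    // Condition after the change: the fast refresh setting no longer matters.
    static boolean after(boolean hasLocation, boolean searchable) {
        return hasLocation && searchable == false;
    }

    public static void main(String[] args) {
        // A write with a translog location on a non-searchable (indexing) shard.
        for (boolean fastRefresh : new boolean[] { false, true }) {
            System.out.println("fastRefresh=" + fastRefresh
                + " before=" + before(true, false, fastRefresh)
                + " after=" + after(true, false));
        }
    }
}
```

The only input for which the two versions differ is a fast refresh index on a non-searchable shard, where the old condition was false and the new one is true.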


server/src/main/java/org/elasticsearch/index/cache/bitset/BitsetFilterCache.java

server/src/main/java/org/elasticsearch/action/get/TransportShardMultiGetAction.java

kingherc

Thanks @henningandersen for the feedback! Feel free to review again.


kingherc · 2024-10-03T13:21:31Z



The reasoning here is that this code runs on the primary/indexing node, and that the indexing node will indeed be upgraded after the search nodes.

> But it does not guarantee immediate availability of the latest state on the search node.

Doesn't our upgrade process guarantee that, since search nodes are upgraded first?

> So we risk some seconds of non-realtime GET requests going backwards during such an upgrade?

A non-realtime GET coordinated by an old search node will go to the primary to execute.
A non-realtime GET coordinated by a new search node, with an old primary node, will go to the primary to execute.
A non-realtime GET coordinated by a new search node on a fully upgraded cluster will be executed on the search node, as is done for non-fast-refresh indices, which should be fine as well. Not sure I see when/why it might go backwards?

> I think real-time GET requests will be saved by the wait-for generation, is that also your understanding?

A real-time GET coordinated by an old search node will go to the primary to execute.
A real-time GET coordinated by a new search node, with an old primary node, will go to the primary to execute.
A real-time GET coordinated by a new search node on a fully upgraded cluster will be executed on the search node, as is done for non-fast-refresh indices, which should use wait-for generation if necessary.

Please tell me if you see any corner cases I might have missed or not considered. It might be useful to think about the above combinations also for searches/mgets, but I believe it should be a similar story for them as well.
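For readers unfamiliar with the "wait-for generation" idea mentioned above, here is a rough sketch of how such waiting might be modeled. It assumes the search shard delays a real-time read until it has refreshed to at least a required generation. The class and method names are hypothetical and this is not the Elasticsearch implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch: lets readers wait until a shard has refreshed to a given generation.
final class GenerationWaiterSketch {
    private long refreshedGeneration = 0;
    private final TreeMap<Long, List<CompletableFuture<Long>>> waiters = new TreeMap<>();

    // Completes immediately if the shard is already at the required generation, otherwise later.
    synchronized CompletableFuture<Long> waitForGeneration(long required) {
        if (refreshedGeneration >= required) {
            return CompletableFuture.completedFuture(refreshedGeneration);
        }
        CompletableFuture<Long> future = new CompletableFuture<>();
        waiters.computeIfAbsent(required, k -> new ArrayList<>()).add(future);
        return future;
    }

    // Called when the shard has refreshed to a newer generation; releases all satisfied waiters.
    void onRefreshed(long generation) {
        List<CompletableFuture<Long>> ready = new ArrayList<>();
        synchronized (this) {
            if (generation <= refreshedGeneration) {
                return;
            }
            refreshedGeneration = generation;
            NavigableMap<Long, List<CompletableFuture<Long>>> satisfied = waiters.headMap(generation, true);
            satisfied.values().forEach(ready::addAll);
            satisfied.clear(); // clearing the view also removes the entries from the backing map
        }
        // Complete outside the lock so callbacks cannot block other callers.
        ready.forEach(future -> future.complete(generation));
    }

    public static void main(String[] args) {
        GenerationWaiterSketch shard = new GenerationWaiterSketch();
        CompletableFuture<Long> get = shard.waitForGeneration(3);
        shard.onRefreshed(2);
        System.out.println(get.isDone()); // false: generation 2 is too old for this read
        shard.onRefreshed(3);
        System.out.println(get.isDone()); // true: the read can now proceed
    }
}
```

Under this model a read never observes state older than the generation it asked for, which is the property the thread relies on for real-time GETs on a fully upgraded cluster.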


henningandersen

Looks good, main issue remaining is the BitsetFilterCache.


henningandersen · 2024-10-04T09:54:18Z



I think you are right that it works out. The upgrade will force a relocation, which forces a flush, bringing things back into order. Thanks.


As recognized while reviewing PR elastic#113478, the bitset filter cache was wrongly loaded eagerly only for fast refresh indices on index nodes. However, it should be loaded eagerly for any index that can be searched. This PR fixes this.

henningandersen

LGTM.


elasticsearchmachine · 2024-10-07T18:16:18Z

💚 Backport successful

Status	Branch	Result
✅	8.x


ywangd · 2024-10-08T03:48:04Z

IIUC, it is not absolutely necessary to backport this PR to 8.16, since the change affects serverless only and serverless works on the main branch only? I am trying to understand the reason here in case it applies to any future work. Thanks!

kingherc · 2024-10-08T08:33:25Z

Hi @ywangd, I think I backported it because I saw that the transport versions are still on the 8 major version, so it somehow made sense to me that this should be backported. But no, it was indeed not necessary to backport it. It has nothing to do with any future work, nor does it affect stateful.

ywangd · 2024-10-08T12:57:26Z

Thanks for the explanation. 🙏


* Fast refresh indices should use RCO

Depends on core ES PR elastic#113478 .

Fast refresh indices should now use search shards, by using the new Refresh Cost Optimizations to refresh them. That means they can now start using search shards for gets/searches as well. The only difference now for fast refresh indices vs non fast refresh indices, is that the former have a lower min & default refresh interval (1s vs 5s) & are excluded from refresh throttling.

Note that for BWC, we expect search shards to be upgraded first, before indexing shards. Until the cluster is fully upgraded, the promotable shards (whether upgraded or not) will still receive and execute gets/searches locally. An upgraded indexing node will refresh unpromotables (which must be upgraded search nodes).

With the later , we will re-add checks that indexing shards should not receive a get/search.

Closes Closes

* Disable flush throttler for the upgrade test

* PR comments

---------

Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>

kingherc added the >non-issue, :Distributed/Engine, and Team:Distributed labels Sep 24, 2024

kingherc self-assigned this Sep 24, 2024

elasticsearchmachine added the v9.0.0 label Sep 24, 2024

kingherc force-pushed the non-issue/ES-9573-fast-refresh-rco branch 3 times, most recently from 3f87579 to f1ff18a Compare September 25, 2024 16:21

kingherc marked this pull request as ready for review September 30, 2024 14:35

kingherc requested review from arteam and henningandersen September 30, 2024 14:35

kingherc requested a review from Tim-Brooks September 30, 2024 14:38

arteam approved these changes Oct 1, 2024


kingherc requested a review from pxsalehi October 2, 2024 08:30

henningandersen reviewed Oct 3, 2024


kingherc force-pushed the non-issue/ES-9573-fast-refresh-rco branch from f1ff18a to 40df9da Compare October 3, 2024 13:30

elasticsearchmachine added the serverless-linked label Oct 3, 2024

kingherc commented Oct 3, 2024


kingherc requested review from JVerwolf and henningandersen October 3, 2024 13:30

kingherc force-pushed the non-issue/ES-9573-fast-refresh-rco branch from 8f85a6c to da342b0 Compare October 4, 2024 08:39

kingherc added auto-backport-and-merge v8.16.0 labels Oct 4, 2024

henningandersen reviewed Oct 4, 2024


kingherc requested a review from original-brownbear October 4, 2024 10:03

Revert comment

e11e283

mark-vieira added the auto-backport label and removed the auto-backport-and-merge label Oct 4, 2024

mark-vieira removed the auto-backport-and-merge label Oct 4, 2024

Merge remote-tracking branch 'kingherc/main' into non-issue/ES-9573-fast-refresh-rco

28781a5

kingherc mentioned this pull request Oct 7, 2024

Fix bitset filter cache loading in Stateless #114191

Merged

henningandersen approved these changes Oct 7, 2024


Merge remote-tracking branch 'kingherc/main' into non-issue/ES-9573-fast-refresh-rco

9ec611e

kingherc merged commit 4990276 into elastic:main Oct 7, 2024

kingherc mentioned this pull request Oct 7, 2024

[8.x] Fast refresh indices should use search shards (#113478) #114259

Merged

kingherc mentioned this pull request Oct 17, 2024

Revert fast refresh using search shards #115019

Merged

carlosdelest mentioned this pull request Oct 18, 2024

Change synonyms index auto-expand replicas to 0-1 #115078

Merged

kingherc mentioned this pull request Nov 12, 2024

Fast refresh indices to use search shards #116658

Merged


7 participants
