Always error out if CCS expression shows up when CCS is not supported#139009

Merged
smalyshev merged 37 commits into elastic:main from smalyshev:ccs-not-supported-fix
Jan 6, 2026

Conversation

@smalyshev
Contributor

@smalyshev smalyshev commented Dec 3, 2025

This patch decouples ignore_unavailable from the "CCS not supported" error raised when resolving an index expression. IndexNameExpressionResolver cannot resolve remote expressions, and for most code paths the right thing to do is to return an error when such an expression is supplied. For some code paths it may be OK to ignore such expressions, at least for now, but in either case ignore_unavailable should have nothing to do with it.

This patch tries to preserve the existing functionality in most places that misuse IndexNameExpressionResolver by sending it remote indices, but we'd need to go back and fix those eventually.

Closes: #138987
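
To illustrate the intended behavior, here is a minimal Java sketch. It is not the actual Elasticsearch implementation: the class `LocalOnlyResolverSketch`, its methods, the simple `:` check for remote expressions, and the exception types and messages are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a local-only resolver; NOT the real IndexNameExpressionResolver.
class LocalOnlyResolverSketch {

    // Assumption: a ':' marks a cluster-qualified (remote) expression such as "remote1:logs".
    static boolean isRemote(String expression) {
        return expression.indexOf(':') >= 0;
    }

    // Resolves exact local index names only (no wildcards, to keep the sketch short).
    static List<String> resolve(List<String> expressions, List<String> localIndices, boolean ignoreUnavailable) {
        List<String> resolved = new ArrayList<>();
        for (String expression : expressions) {
            if (isRemote(expression)) {
                // Thrown regardless of ignoreUnavailable: this is the decoupling the patch describes.
                throw new IllegalArgumentException("remote expression not supported here: [" + expression + "]");
            }
            if (localIndices.contains(expression)) {
                resolved.add(expression);
            } else if (ignoreUnavailable == false) {
                // ignoreUnavailable only governs missing LOCAL indices.
                throw new NoSuchElementException("no such index [" + expression + "]");
            }
        }
        return resolved;
    }
}
```

Under these assumptions, `resolve(List.of("logs", "missing"), List.of("logs"), true)` silently drops the missing local index, while `resolve(List.of("logs", "remote1:logs"), List.of("logs"), true)` throws even though `ignoreUnavailable` is `true`.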

@elasticsearchmachine
Collaborator

Hi @smalyshev, I've created a changelog YAML for you.

@smalyshev smalyshev marked this pull request as ready for review December 9, 2025 23:26
@elasticsearchmachine elasticsearchmachine added the Team:Search Foundations label Dec 9, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-foundations (Team:Search Foundations)

@elasticsearchmachine
Collaborator

Hi @smalyshev, I've updated the changelog YAML for you.

@smalyshev smalyshev requested a review from quux00 December 9, 2025 23:37
@smalyshev
Contributor Author

Almost every API is using the IndexNameExpressionResolver. So, this has the potential of breaking a lot of requests that in the past would have returned a partial response. Do you understand my concern?

I totally understand your concern. You can use this API in a variety of ways though. The thing is, if you pass a remote index to it, it is supposed to throw, and it will never resolve it properly. The latter has always been the case, so if some API uses it and expects it to resolve remote indices, that API is already broken. Fortunately, I didn't find, and the tests do not reveal, any cases where this really happens - except maybe the indices_boost case, where it's marginal (and it is currently broken with remote indices, unfortunately). So I think other APIs use it the way it was supposed to be used - to resolve only local indices. The exceptions are some APIs that I am fixing in this pull request, which were kind of sloppy about this and relied on just silencing the errors instead of reporting them.

Now, can there be more corner cases where other APIs are relying on the same thing and aren't covered by tests? There could be. But that means they are already broken now, and the only way we can detect it (instead of users just silently getting wrong results in their responses) is to provide an identifiable error where it happens, and then when this error pops up, fix it. Short of reimplementing IndexNameExpressionResolver to actually feature full support for remote indices, which will be a considerable lift, I see no other way of handling it.

@gmarouli
Contributor

Now, can there be more corner cases where other APIs are relying on the same thing and aren't covered by tests? There could be. But that means they are already broken now, and the only way we can detect it (instead of users just silently getting wrong results in their responses) is to provide an identifiable error where it happens, and then when this error pops up, fix it.

I am not saying that this "ignoring of remote indices" behaviour was tested and documented, but it is how it has been working and in the code it looked like a conscious choice. We can revise that choice (fwiw I also find it cleaner to consider this behaviour a bug).

However, because it is working like this now, and it is possible that users have remote expressions that have flown under the radar because of ignore_unavailable: true, we should consider the trade-offs before proceeding, that's all.

@quux00
Contributor

quux00 commented Dec 17, 2025

I am not saying that this "ignoring of remote indices" behaviour was tested and documented, but it is how it has been working and in the code it looked like a conscious choice. We can revise that choice (fwiw I also find it cleaner to consider this behaviour a bug).

However, because it is working like this now, and it is possible that users have remote expressions that have flown under the radar because of ignore_unavailable: true, we should consider the trade-offs before proceeding, that's all.

@gmarouli - I created the issue for this as a bug, not a breaking change, after discussing it with Najwa and Jason Tedor. Given the surface area of the change that you and Stas are highlighting here, I spoke with Jason again today about it and our question is - is there a usage from a user that wouldn't be considered incorrect behavior to start with?

For example, are all the cases where users would now start getting an error (with this code change) ones where they are referencing a remote index in an endpoint that is not cross-cluster enabled, such as GET logs,remote1:logs/_mapping? If yes, and they added ignore_unavailable=true just to bypass an error, that is incorrect usage to start with, so we should address it both on the backend and the client side (adjust their queries). Or is there a case you know of where the user is correctly using an API and might now get an error with the changes in this PR?

@elasticsearchmachine
Collaborator

Hi @smalyshev, I've updated the changelog YAML for you.

@gmarouli
Contributor

I spoke with Jason again today about it and our question is - is there a usage from a user that wouldn't be considered incorrect behaviour to start with?

I do not know why this was implemented like this and there were no documented use cases in the code as far as I know. I do not know what we have communicated to the users about this either. So, my answer is I do not know, and I do not have someone in mind to redirect you to.

is there a case you know of where the user is correctly using an API and might now get an error with the changes in this PR

I doubt this, but I cannot say it with certainty. In the past when fixing a bug that would change a silent fail to an error, we considered the element of surprise to some users, and that could influence the choice of fixing the bug or letting it be. This is what I am asking, to double check that the benefit of fixing this behaviour outweighs the possibility of surprising some users.

Contributor

@gmarouli gmarouli left a comment

Apologies for the late code review, I thought I had submitted it 😶‍🌫️

public static final NodeFeature SEARCH_WITH_NO_DIMENSIONS_BUGFIX = new NodeFeature("search.vectors.no_dimensions_bugfix");
public static final NodeFeature SEARCH_RESCORE_SCRIPT = new NodeFeature("search.rescore.script");
public static final NodeFeature NEGATIVE_FUNCTION_SCORE_BAD_REQUEST = new NodeFeature("search.negative.function.score.bad.request");
public static final NodeFeature INDICES_BOOST_REMOTE_INDEX_FIX = new NodeFeature("search.indices_boost_remote_index_fix");
Contributor

All the APIs that use index resolution? Or can use index boosts?

I was referring to the APIs that use index resolution but their behaviour changes with this.

That could be a lot of them, and I am not sure how to practically do that.

Since this is only relevant for testing, I would add it to the APIs you are testing: in this case the mapping API, index boosts, and any others you choose to test.

Unless I am mistaken, API capabilities were the recommended way to capture API (and behaviour) changes. Do you think that's not the case?

@smalyshev smalyshev requested review from gmarouli and quux00 January 5, 2026 16:48

/**
* Filter out remote index expressions.
* TODO: SQL Metadata commands currently do not support remote indices.
Contributor

Do we need to handle this in this PR? If not, how urgent is the follow-up?

Contributor Author

The tests in x-pack/plugin/sql/qa/server/src/main/resources/multi-cluster-with-security/multi-cluster-command-sys.csv-spec do not support remote aliases, so I don't think we need to fix it here. As for fixing it in general, I think we need to ask the Analytics team, but given that it has been there for a while and nobody has complained, it's probably not very high priority.
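
For context, the lenient path that the Javadoc under review describes ("Filter out remote index expressions") could look roughly like the sketch below. This is not the code from the PR: the class `RemoteExpressionFilterSketch`, its methods, and the `:` check for remote expressions are hypothetical and only show the idea of dropping remote expressions before handing the rest to a local-only resolver.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper; NOT the actual code added to the SQL plugin in this PR.
class RemoteExpressionFilterSketch {

    // Assumption: a ':' marks a cluster-qualified (remote) expression such as "remote1:logs".
    static boolean isRemote(String expression) {
        return expression.indexOf(':') >= 0;
    }

    // Keeps only local expressions, preserving their original order.
    static List<String> localOnly(List<String> expressions) {
        return expressions.stream()
            .filter(expression -> isRemote(expression) == false)
            .collect(Collectors.toList());
    }
}
```

With this sketch, `localOnly(List.of("logs", "remote1:logs", "metrics-*"))` keeps `logs` and `metrics-*`, so a caller that cannot handle remote indices never passes them to the resolver.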

Contributor

@quux00 quux00 left a comment

LGTM. Let's go ahead with merging.

@smalyshev smalyshev enabled auto-merge (squash) January 6, 2026 22:30
@smalyshev smalyshev merged commit 628906c into elastic:main Jan 6, 2026
35 checks passed
@smalyshev smalyshev deleted the ccs-not-supported-fix branch January 6, 2026 23:47
szybia added a commit to szybia/elasticsearch that referenced this pull request Jan 7, 2026
* upstream/main: (191 commits)
  Overall Decision for Deciders prioritizes THROTTLE (elastic#140237)
  Apply group by all logic not only to top-level aggregates (elastic#140248)
  [ES|QL] Refactor MV_UNION and MV_INTERSECTION to use shared set operation helper (elastic#139982)
  Avoid reading entire bloom filter file on reader open (elastic#139374)
  Mark bloom filter files for random access (elastic#139375)
  Ensure that the buffer used for ES93BloomFilterStoredFieldsFormat is zeroed (elastic#139034)
  Add busy assertion to avoid race condition for testStalledShardMigrationProperlyDetected (elastic#140230)
  Remove line number check for testTransitiveFindsDeepCallChain (elastic#140228)
  Allow a slight difference in rescored docs (elastic#139931)
  Mute org.elasticsearch.xpack.inference.integration.AuthorizationTaskExecutorIT testCreatesEisChatCompletion_DoesNotRemoveEndpointWhenNoLongerAuthorized elastic#138480
  Start exchange sink fetchers concurrently (elastic#140196)
  Allow allocation to replacement target node on vacate completion (elastic#140150)
  Ignore JNA cleaner threads in SecureHdfsRepositoryAnalysisRestIT (elastic#139925)
  DeterministicQueue refactor and enhancement (elastic#140151)
  Always error out if CCS expression shows up when CCS is not supported (elastic#139009)
  Use IllegalArgumentException over RepositoryException for readonly-repository checks (elastic#140200)
  Guard promql capabilities in AnalyzerTests (elastic#140232)
  [Inference API] Fix flaky AuthorizationTaskExecutorIT tests (elastic#139978)
  Cleaning up exitable vector value impls (elastic#140190)
  [Inference API] Fix auth exception listener not called bug (elastic#139966)
  ...
sidosera pushed a commit to sidosera/elasticsearch that referenced this pull request Jan 7, 2026
…elastic#139009)

* Always error out if CCS expression shows up when CCS is not supported

Labels

>bug :Search Foundations/CCS Team:Search Foundations v9.4.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Endpoints that are not cross-cluster enabled should always error out with qualified index expressions

4 participants