Skip to content

Add ability to limit fields added during Mapper#merge#102936

Merged
elasticsearchmachine merged 50 commits intoelastic:mainfrom
felixbarny:mapping-merge-limit-fields
Jan 23, 2024
Merged

Add ability to limit fields added during Mapper#merge#102936
elasticsearchmachine merged 50 commits intoelastic:mainfrom
felixbarny:mapping-merge-limit-fields

Conversation

@felixbarny
Copy link
Copy Markdown
Member

@felixbarny felixbarny commented Dec 4, 2023

This is extracting a piece of independent functionality from #96235

This is the result of this discussion: #96235 (comment)

The mapping merge executed on the master node is the only authoritative place where we have a consistent view of the cluster state as it's guaranteed to be executed on a single thread on the master node. Therefore, there can be no race conditions and we exactly know how many fields are left before hitting the field limit.

Therefore, this is the best place to add as many fields as possible without going over the limit if index.mapping.total_fields.ignore_dynamic_beyond_limit is set to true in the index settings.

This change makes it possible to limit the number of fields added during mapping merger. However, that feature isn't actually used outside of tests.

In #96235, the feature to limit fields during the merge process is used to guarantee that the cluster state update never fails due to dynamic mapping updates that exceed the field limit in case ignore_dynamic_beyond_limit is enabled.

This works because after each dynamic mapping update, the indexing request gets retried (there's no change to that logic in this or related PRs) so that the document parsing takes the new mapping into account.

This only applies to dynamic mapping updates aka auto-updates. When explicitly trying to update the mapping with too many fields, the put mapping operation will still fail, even if ignore_dynamic_beyond_limit is enabled.

Depends on:

This is extracting a piece of independent functionality from elastic#96235
@felixbarny felixbarny added >non-issue :Search Foundations/Mapping Index mappings, including merging and defining field types labels Dec 4, 2023
@elasticsearchmachine elasticsearchmachine added external-contributor Pull request authored by a developer outside the Elasticsearch team v8.12.0 Team:Search Meta label for search team labels Dec 4, 2023
@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-search (Team:Search)

Comment thread server/src/main/java/org/elasticsearch/index/mapper/ObjectMapper.java Outdated
@felixbarny
Copy link
Copy Markdown
Member Author

felixbarny commented Dec 6, 2023

@elasticmachine update branch

@felixbarny felixbarny requested a review from javanna December 6, 2023 09:20
felixbarny added a commit that referenced this pull request Jan 22, 2024
Encapsulates some mechanical changes from #102936
Copy link
Copy Markdown
Contributor

@javanna javanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did another round. Thanks for all the progress you made. Don't get discouraged by the amount of comments, these are all details around refining, polishing and adding some tests. We are very close!

Could you also add docs to clarify design choices: why the field limit is checked as part of merge. I could see future readers getting confused about that. And describe how it works, meaning it is already applied in the data node, and if certain fields are allowed there, they are going to be allowed when the master does the merge of the mappings, but there are scenarios where data nodes allowed addition of fields that the master will reject. Am I getting this right? Could you expand on the mechanics of this, perhaps in the javadocs of MapperMergeContext?

Comment thread server/src/main/java/org/elasticsearch/index/mapper/Mapper.java
Comment thread server/src/main/java/org/elasticsearch/index/mapper/MapperMergeContext.java Outdated
Comment thread server/src/main/java/org/elasticsearch/index/mapper/MapperMergeContext.java Outdated
Comment thread server/src/main/java/org/elasticsearch/index/mapper/ParsedDocument.java Outdated
Comment thread server/src/main/java/org/elasticsearch/index/mapper/MapperMergeContext.java Outdated
Comment thread server/src/main/java/org/elasticsearch/index/mapper/ObjectMapper.java Outdated
Comment thread server/src/main/java/org/elasticsearch/index/mapper/ObjectMapper.java Outdated
Comment thread server/src/main/java/org/elasticsearch/index/mapper/MapperMergeContext.java Outdated
Comment thread server/src/main/java/org/elasticsearch/index/mapper/ObjectMapper.java Outdated
@felixbarny
Copy link
Copy Markdown
Member Author

Could you also add docs to clarify design choices: why the field limit is checked as part of merge. I could see future readers getting confused about that. And describe how it works, meaning it is already applied in the data node, and if certain fields are allowed there, they are going to be allowed when the master does the merge of the mappings, but there are scenarios where data nodes allowed addition of fields that the master will reject. Am I getting this right? Could you expand on the mechanics of this, perhaps in the javadocs of MapperMergeContext?

I've documented most of that in the other PR. See also https://github.com/elastic/elasticsearch/pull/96235/files#diff-32c1e36544e0c0ef775f9e26576ef7df0504cc05466e513ba7b143947c234784R590

Copy link
Copy Markdown
Contributor

@javanna javanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a last round, this look really good. My final comments are mainly on strengthening the tests further and a tiny bit more docs. Thanks for all the work. No need for another review round, merge when ready.

Comment thread server/src/main/java/org/elasticsearch/index/mapper/RootObjectMapper.java Outdated
Comment thread server/src/main/java/org/elasticsearch/index/mapper/MapperService.java Outdated
Comment thread server/src/main/java/org/elasticsearch/index/mapper/Mapping.java Outdated
@felixbarny felixbarny added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Jan 23, 2024
@elasticsearchmachine elasticsearchmachine merged commit 30b8af2 into elastic:main Jan 23, 2024
@felixbarny felixbarny deleted the mapping-merge-limit-fields branch January 23, 2024 17:16
elasticsearchmachine pushed a commit that referenced this pull request Feb 2, 2024
)

Adds a new `index.mapping.total_fields.ignore_dynamic_beyond_limit`
index setting.

When set to `true`, new fields are added to the mapping as long as the
field limit (`index.mapping.total_fields.limit`) is not exceeded. Fields
that would exceed the limit are not added to the mapping, similar to
`dynamic: false`.  Ignored fields are added to the `_ignored` metadata
field.

Relates to #89911

To make this easier to review, this is split into the following PRs: -
[x] #102915 - [x]
#102936 - [x]
#104769

Related but not a prerequisite: - [ ]
#102885
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) external-contributor Pull request authored by a developer outside the Elasticsearch team >non-issue :Search Foundations/Mapping Index mappings, including merging and defining field types Team:Search Meta label for search team v8.13.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants