Skip to content

Simplify MergeResult#buildMergedMappers#104578

Merged
felixbarny merged 2 commits intoelastic:mainfrom
felixbarny:build-merged-mappers-refactor
Jan 22, 2024
Merged

Simplify MergeResult#buildMergedMappers#104578
felixbarny merged 2 commits intoelastic:mainfrom
felixbarny:build-merged-mappers-refactor

Conversation

@felixbarny
Copy link
Copy Markdown
Member

@felixbarny felixbarny commented Jan 19, 2024

Encapsulates some mechanical changes from #102936

@felixbarny felixbarny requested a review from javanna January 19, 2024 10:06
@elasticsearchmachine elasticsearchmachine added external-contributor Pull request authored by a developer outside the Elasticsearch team v8.13.0 needs:triage Requires assignment of a team area label labels Jan 19, 2024
@saikatsarkar056 saikatsarkar056 added the :Search Foundations/Mapping Index mappings, including merging and defining field types label Jan 19, 2024
@elasticsearchmachine elasticsearchmachine added Team:Search Meta label for search team and removed needs:triage Requires assignment of a team area label labels Jan 19, 2024
@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-search (Team:Search)

@felixbarny felixbarny changed the title Encapsulate some mechanical changes from #102936 Simplify MergeResult#buildMergedMappers Jan 22, 2024
if (mergedMappers == null) {
mergedMappers = new HashMap<>(existing.mappers);
}
mergedMappers.put(merged.simpleName(), merged);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could the put stay where it is? in every loop iteration we only ever put a single field, so we should be ok leaving a single put call, that adds the current merged instance to the map? That put call needs to be wrapped with a check that ensures we do not go beyond the limit?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we can do that. See also #102936 (comment)

It's important to decrement the field budget before calling merge on the sub fields. I think we should have the put calls as close as possible to the place where we decrement the field budget.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the latest changes in #102936, that no longer touches these lines, I still don't follow :)

The put is still within the loop, even if after the conditional instead of within the conditional . The only merge call I see in this method is performed within the conditional, before the put (before and after your change). What am I missing that makes this part of the changes necessary?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've actually reverted this in #102936 now (see c1b4258). Now that I've removed the Consumer methods from MapperMergeContext and there's only decrementFieldBudgetIfPossible which doesn't directly add the merged mapper to the map it think it's viable to not have the put in the if/else.

What's still important though is that we decrement the budget for the shallow object mapper in org.elasticsearch.index.mapper.ObjectMapper.MergeResult#truncateObjectMapper before calling shallowObjectMapper.merge. Otherwise, if we merged the object before decrementing the budget, we wouldn't have capacity to add the object itself.

Copy link
Copy Markdown
Contributor

@javanna javanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a question about moving the put call, I may not see why it needs moving. LGTM otherwise.

@felixbarny felixbarny merged commit cd38c3a into elastic:main Jan 22, 2024
@felixbarny felixbarny deleted the build-merged-mappers-refactor branch January 22, 2024 12:03
elasticsearchmachine pushed a commit that referenced this pull request Jan 23, 2024
This is extracting a piece of independent functionality from
#96235

This is the result of this discussion:
#96235 (comment)

The mapping merge executed on the master node is the only authoritative
place where we have a consistent view of the cluster state as it's
guaranteed to be executed on a single thread on the master node.
Therefore, there can be no race conditions and we exactly know how many
fields are left before hitting the field limit.

Therefore, this is the best place to add as many fields as possible
without going over the limit if
`index.mapping.total_fields.ignore_dynamic_beyond_limit` is set to
`true` in the index settings.

This change makes it possible to limit the number of fields added during
mapping merger. However, that feature isn't actually used outside of
tests.

In #96235, the feature to
limit fields during the merge process is used to guarantee that the
cluster state update never fails due to dynamic mapping updates that
exceed the field limit in case `ignore_dynamic_beyond_limit` is enabled.

This works because after each dynamic mapping update, the indexing
request gets retried (there's no change to that logic in this or related
PRs) so that the document parsing takes the new mapping into account.

This only applies to dynamic mapping updates aka auto-updates. When
explicitly trying to update the mapping with too many fields, the put
mapping operation will still fail, even if `ignore_dynamic_beyond_limit`
is enabled.

Depends on: - [x] #104512 -
[x] #104578
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

external-contributor Pull request authored by a developer outside the Elasticsearch team >non-issue :Search Foundations/Mapping Index mappings, including merging and defining field types Team:Search Meta label for search team v8.13.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants