@jordan-powers
Contributor

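This PR updates the FlattenedFieldMapper to use binary doc values instead of sorted set doc values for flattened fields in indices that use the TSDB codec. Flattened fields using the default codec continue to use sorted set doc values.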

@jordan-powers jordan-powers self-assigned this Jan 7, 2026
@jordan-powers jordan-powers added the >feature and :StorageEngine/Mapping (the storage-related side of mappings) labels Jan 7, 2026
@elasticsearchmachine
Collaborator

Hi @jordan-powers, I've created a changelog YAML for you.

@elasticsearchmachine
Collaborator

Hi @jordan-powers, I've updated the changelog YAML for you.

@jordan-powers jordan-powers marked this pull request as ready for review January 8, 2026 20:38
@elasticsearchmachine
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

Member

@martijnvg martijnvg left a comment

Thanks @jordan-powers, this looks good.

Before this gets merged, maybe first open a PR that adds a rolling upgrade test for the flattened field type (similar to TextRollingUpgradeIT)?

jordan-powers added a commit to jordan-powers/elasticsearch that referenced this pull request Jan 10, 2026
This PR adds a test to compare the synthetic source produced by flattened
fields using the TSDB codec against flattened fields not using that
codec. The test generates 32 random documents, indexes them into two
indices (one using the codec, the other not using the codec), then
retrieves the documents from the two indices and compares them.

This is a test for elastic#140246, since once that PR is merged, flattened
fields using the TSDB codec will use binary doc values while flattened
fields using the default codec will continue to use sorted set doc
values.
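
The test approach described in this commit message (generate random documents, write them through two storage paths, read them back, compare) can be sketched roughly as follows. This is only an illustration of the pattern, not the actual test: the class name, both round-trip helpers, and the document shape are hypothetical stand-ins, and no Elasticsearch APIs are used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.TreeMap;
import java.util.function.UnaryOperator;

// Hypothetical sketch of a differential round-trip test; not the real Elasticsearch test.
public final class DifferentialRoundTripSketch {

    /** Generates {@code count} random flat documents; the fixed seed keeps runs reproducible. */
    public static List<Map<String, String>> randomDocs(int count, long seed) {
        Random random = new Random(seed);
        List<Map<String, String>> docs = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            Map<String, String> doc = new TreeMap<>();
            int fields = 1 + random.nextInt(4); // every document has at least one field
            for (int f = 0; f < fields; f++) {
                doc.put("key" + random.nextInt(8), "value" + random.nextInt(100));
            }
            docs.add(doc);
        }
        return docs;
    }

    /** Stand-in for the first index (assumption): stores a plain sorted copy. */
    public static Map<String, String> copyRoundTrip(Map<String, String> doc) {
        return new TreeMap<>(doc);
    }

    /** Stand-in for the second index (assumption): encodes entries as "key\0value", then decodes. */
    public static Map<String, String> encodedRoundTrip(Map<String, String> doc) {
        List<String> encoded = new ArrayList<>();
        for (Map.Entry<String, String> e : doc.entrySet()) {
            encoded.add(e.getKey() + '\0' + e.getValue());
        }
        Map<String, String> decoded = new TreeMap<>();
        for (String entry : encoded) {
            int sep = entry.indexOf('\0');
            decoded.put(entry.substring(0, sep), entry.substring(sep + 1));
        }
        return decoded;
    }

    /** Returns the positions of documents whose two round trips differ. */
    public static List<Integer> mismatches(
        List<Map<String, String>> docs,
        UnaryOperator<Map<String, String>> first,
        UnaryOperator<Map<String, String>> second
    ) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < docs.size(); i++) {
            if (!first.apply(docs.get(i)).equals(second.apply(docs.get(i)))) {
                result.add(i);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map<String, String>> docs = randomDocs(32, 42L);
        // Both stand-ins are lossless, so no mismatches are expected: prints []
        System.out.println(mismatches(docs,
            DifferentialRoundTripSketch::copyRoundTrip,
            DifferentialRoundTripSketch::encodedRoundTrip));
    }
}
```

Comparing the two read-back results against each other, rather than against a hand-written expectation, is what lets randomly generated documents be used as input.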
jordan-powers added a commit that referenced this pull request Jan 12, 2026
@jordan-powers
Contributor Author

Opened #140611 to add a rolling upgrade test for flattened fields.

Member

@martijnvg martijnvg left a comment

Thanks Jordan! LGTM

* This class wraps the field data that is built directly on the keyed flattened field,
* and filters out values whose prefix doesn't match the requested key.
*/
public class BinaryKeyedFlattenedLeafFieldData implements LeafFieldData {
Member

final?
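
For readers unfamiliar with the keyed flattened field, the prefix filtering that the javadoc under review describes can be sketched roughly like this. It is a minimal illustration, not the code from this PR: the class and method names are hypothetical, and the `key\0value` encoding with a NUL separator is an assumption about how keyed entries are laid out.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of key-prefix filtering; not the class added in this PR.
public final class KeyedValueFilterSketch {

    // Assumed separator between the key and the value of a keyed entry.
    static final char SEPARATOR = '\0';

    /** Encodes a key/value pair as UTF-8 bytes of "key\0value". */
    public static byte[] keyed(String key, String value) {
        return (key + SEPARATOR + value).getBytes(StandardCharsets.UTF_8);
    }

    /** Returns the values whose key matches {@code key}, with the key prefix stripped. */
    public static List<String> valuesForKey(List<byte[]> keyedValues, String key) {
        // Including the separator in the prefix keeps "host.name" from matching "host.namespace".
        byte[] prefix = (key + SEPARATOR).getBytes(StandardCharsets.UTF_8);
        List<String> result = new ArrayList<>();
        for (byte[] entry : keyedValues) {
            if (startsWith(entry, prefix)) {
                int valueLength = entry.length - prefix.length;
                result.add(new String(entry, prefix.length, valueLength, StandardCharsets.UTF_8));
            }
        }
        return result;
    }

    private static boolean startsWith(byte[] bytes, byte[] prefix) {
        if (bytes.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (bytes[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<byte[]> doc = List.of(
            keyed("host.name", "web-1"),
            keyed("host.os", "linux"),
            keyed("host.name", "web-2"));
        System.out.println(valuesForKey(doc, "host.name")); // prints [web-1, web-2]
    }
}
```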

}
}

public static class BinaryKeyedFlattenedFieldData implements IndexFieldData<LeafFieldData> {
Member

final?

@jordan-powers jordan-powers enabled auto-merge (squash) January 15, 2026 15:51
@jordan-powers jordan-powers merged commit 1c23ba4 into elastic:main Jan 15, 2026
36 checks passed
jordan-powers added a commit to jordan-powers/elasticsearch that referenced this pull request Jan 20, 2026
As of elastic#140246, flattened fields might use either binary or sorted set
doc values, and the associated synthetic field loader can handle either
format. This patch updates the name and javadoc of that field loader to
document this behavior.
jordan-powers added a commit that referenced this pull request Jan 21, 2026
spinscale pushed a commit to spinscale/elasticsearch that referenced this pull request Jan 21, 2026
…c#140489)
spinscale pushed a commit to spinscale/elasticsearch that referenced this pull request Jan 21, 2026
This PR updates the FlattenedFieldMapper to use binary doc values instead
of sorted set doc values
spinscale pushed a commit to spinscale/elasticsearch that referenced this pull request Jan 21, 2026
3 participants