Store flattened field data in binary doc values #140246
Hi @jordan-powers, I've created a changelog YAML for you.
…s-2' into flattened-field-binary-doc-values-2
server/src/main/java/org/elasticsearch/index/fielddata/MultiValuedSortedBinaryDocValues.java
server/src/test/java/org/elasticsearch/index/mapper/flattened/FlattenedFieldMapperTests.java
Hi @jordan-powers, I've updated the changelog YAML for you.
Pinging @elastic/es-storage-engine (Team:StorageEngine)
martijnvg
left a comment
Thanks @jordan-powers this looks good.
Before this gets merged, maybe first open a pr that adds a rolling upgrade test for the flattened field type? (Similar to TextRollingUpgradeIT?)
server/src/main/java/org/elasticsearch/index/mapper/MultiValuedBinaryDocValuesField.java
server/src/test/java/org/elasticsearch/search/aggregations/support/ValuesSourceConfigTests.java
This PR adds a test to compare the synthetic source produced by flattened fields using the TSDB codec against flattened fields not using that codec. The test generates 32 random documents, indexes them into two indices (one using the codec, the other not using the codec), then retrieves the documents from the two indices and compares them. This is a test for elastic#140246, since once that PR is merged, flattened fields using the TSDB codec will use binary doc values while flattened fields using the default codec will continue to use sorted set doc values.
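The shape of the comparison described above can be sketched in plain Java. This is not the test from the PR: `roundTripSortedSet` and `roundTripBinary` are hypothetical in-memory stand-ins for the two indices and do not reflect how Elasticsearch encodes doc values. The sketch only illustrates the pattern of generating 32 random documents, passing each through both paths, and comparing the results.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.TreeSet;
import java.util.stream.Collectors;

public class SyntheticSourceComparisonSketch {
    // Hypothetical stand-in for the sorted-set path: values are deduplicated and sorted on write.
    static List<String> roundTripSortedSet(List<String> values) {
        return new ArrayList<>(new TreeSet<>(values));
    }

    // Hypothetical stand-in for the binary path: values are kept as written,
    // then deduplicated and sorted when read back.
    static List<String> roundTripBinary(List<String> values) {
        List<String> stored = new ArrayList<>(values);
        return stored.stream().distinct().sorted().collect(Collectors.toList());
    }

    // Generates `count` documents, each holding 1 to 4 values drawn from a small pool
    // so that duplicates are likely.
    static List<List<String>> randomDocs(int count, long seed) {
        Random random = new Random(seed);
        List<List<String>> docs = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            int valueCount = 1 + random.nextInt(4);
            List<String> values = new ArrayList<>();
            for (int j = 0; j < valueCount; j++) {
                values.add("v" + random.nextInt(5));
            }
            docs.add(values);
        }
        return docs;
    }

    // Counts documents whose two round trips disagree.
    static int countMismatches(List<List<String>> docs) {
        int mismatches = 0;
        for (List<String> doc : docs) {
            if (!roundTripSortedSet(doc).equals(roundTripBinary(doc))) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        List<List<String>> docs = randomDocs(32, 42L);
        System.out.println("mismatches: " + countMismatches(docs));
    }
}
```

A fixed seed keeps a failure reproducible; a real test would report the differing document rather than only a count.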
Opened #140611 to add a rolling upgrade test for flattened fields.
martijnvg
left a comment
Thanks Jordan! LGTM
 * This class wraps the field data that is built directly on the keyed flattened field,
 * and filters out values whose prefix doesn't match the requested key.
 */
public class BinaryKeyedFlattenedLeafFieldData implements LeafFieldData {
final?
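The Javadoc quoted in this thread describes filtering keyed values by prefix. A minimal sketch of that idea follows, assuming each stored value has the form key, separator, value. The class name, the `valuesForKey` helper, and the choice of `'\0'` as separator are assumptions made for illustration, and the sketch works on plain strings rather than on doc values.

```java
import java.util.ArrayList;
import java.util.List;

public class KeyedPrefixFilterSketch {
    // Assumed separator between key and value; hypothetical for this sketch.
    static final char SEPARATOR = '\0';

    // Returns the values stored under `key`, dropping entries whose prefix does not match.
    static List<String> valuesForKey(List<String> keyedValues, String key) {
        // Including the separator in the prefix stops "host" from matching "hostname".
        String prefix = key + SEPARATOR;
        List<String> result = new ArrayList<>();
        for (String keyed : keyedValues) {
            if (keyed.startsWith(prefix)) {
                result.add(keyed.substring(prefix.length()));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> stored = List.of(
            "host" + SEPARATOR + "web-1",
            "hostname" + SEPARATOR + "x",
            "region" + SEPARATOR + "eu"
        );
        System.out.println(valuesForKey(stored, "host")); // prints [web-1]
    }
}
```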
    }
}

public static class BinaryKeyedFlattenedFieldData implements IndexFieldData<LeafFieldData> {
final?
As of elastic#140246, flattened fields might use either binary or sorted set doc values, and the associated synthetic field loader can handle either format. This patch updates the name and javadoc of that field loader to document this behavior.
…c#140489) This PR adds a test to compare the synthetic source produced by flattened fields using the TSDB codec against flattened fields not using that codec. The test generates 32 random documents, indexes them into two indices (one using the codec, the other not using the codec), then retrieves the documents from the two indices and compares them. This is a test for elastic#140246, since once that PR is merged, flattened fields using the TSDB codec will use binary doc values while flattened fields using the default codec will continue to use sorted set doc values.
This PR updates the FlattenedFieldMapper to use binary doc values instead of sorted set doc values.