Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LUCENE-8594: DV update are broken for updates on new field #516

Merged
merged 1 commit into from
Dec 6, 2018

Conversation

s1monw
Copy link
Member

@s1monw s1monw commented Dec 6, 2018

A segmemnt written with Lucene70Codec failes if it ties to update
a DV field that didn't exist in the index before it was upgraded to
Lucene80Codec. We bake the DV format into the FieldInfo when it's used
the first time and therefor never go to the codec if we need to update.
yet on a field that didn't exist before and was added during an indexing
operation we have to consult the coded and get an exception.
This change fixes this issue and adds the relevant bwc tests.

@s1monw
Copy link
Member Author

s1monw commented Dec 6, 2018

this was found in elastic/elasticsearch#36286

@s1monw
Copy link
Member Author

s1monw commented Dec 6, 2018

@jpountz I pushed changes

Copy link
Member

@mikemccand mikemccand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch!

@@ -66,7 +67,7 @@ public PostingsFormat getPostingsFormatForField(String field) {
private final DocValuesFormat docValuesFormat = new PerFieldDocValuesFormat() {
@Override
public DocValuesFormat getDocValuesFormatForField(String field) {
throw new IllegalStateException("This codec should only be used for reading, not writing");
return defaultDVFormat;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh this is because the DV updates must also be written with the old codec?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's a requirement, but I like using the old format better:

  • it's probably easier to debug, as all fields would be using the same format
  • given that PFDVFormat creates different files per format, forcing the new format would increase the number of files for old segments

A segmemnt written with Lucene70Codec failes if it ties to update
a DV field that didn't exist in the index before it was upgraded to
Lucene80Codec. We bake the DV format into the FieldInfo when it's used
the first time and therefor never go to the codec if we need to update.
yet on a field that didn't exist before and was added during an indexing
operation we have to consult the coded and get an exception.
This change fixes this issue and adds the relevant bwc tests.
@s1monw s1monw force-pushed the fix_dv_update_on_old_index branch from 7cb255d to aaa64d7 Compare December 6, 2018 19:16
@asfgit asfgit merged commit aaa64d7 into apache:master Dec 6, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants