@cbuescher
Member

Currently we don't allow retrieving metadata fields through the fields option in search; instead we throw
an error in this case. In #78828 we started to enable this for "_id" if the field is explicitly requested.
This PR adds the _index and _version metadata fields, which are internally stored as doc values, to
the list of fields that can be explicitly retrieved.

Relates to #75836

@cbuescher cbuescher added >enhancement :Search/Search Search-related issues that do not fall into other categories v8.0.0 v7.16.0 labels Oct 13, 2021
@cbuescher cbuescher requested a review from romseygeek October 13, 2021 09:51
@elasticmachine elasticmachine added the Team:Search Meta label for search team label Oct 13, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-search (Team:Search)

@cbuescher cbuescher requested a review from jtibshirani October 13, 2021 09:51
}

@Override
public IndexFieldData.Builder fielddataBuilder(String fullyQualifiedIndexName, Supplier<SearchLookup> searchLookup) {
Member Author

I had to implement this method so that fetching _version works; the default implementation in MappedFieldType throws an error. I'm not sure whether this has undesired effects elsewhere, though.

Contributor

This will allow _version to be used in aggregations, scripts, etc. This seems okay to me, but maybe worth checking with the team quickly to see if anyone has concerns?

Member Author

Good point, will check

Member Author

I didn't hear any immediate concerns, and we already support "index" or "id" in similar cases, so I'll go ahead with this.

@cbuescher
Member Author

With this change, there are five field mappers left that still throw an UnsupportedOperationException when asked for a value fetcher: _source, _seq_no, _field_names, _nested_path and _feature from the RankFeatureMetaFieldMapper.
I think all of them are unlikely to be requested via fields, even with an alias pointing to them, but we might want to discuss whether we should keep throwing an error when that happens or whether we are okay with silently ignoring these fields when something tries to fetch them. We can discuss this in a follow-up though.
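
To make the two options concrete, here is a minimal sketch of the trade-off, not code from this PR. The class `UnsupportedFetchPolicy` and both method names are hypothetical; only the five field names and the exception message format come from this conversation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper, for illustration only; nothing like this exists in Elasticsearch.
final class UnsupportedFetchPolicy {

    // The five metadata fields listed in the comment above.
    static final Set<String> UNFETCHABLE = Set.of("_source", "_seq_no", "_field_names", "_nested_path", "_feature");

    // Option 1: keep failing the request as soon as an unfetchable field is asked for.
    static List<String> resolveStrict(List<String> requested) {
        for (String name : requested) {
            if (UNFETCHABLE.contains(name)) {
                throw new UnsupportedOperationException("Cannot fetch values for internal field [" + name + "].");
            }
        }
        return new ArrayList<>(requested);
    }

    // Option 2: silently drop unfetchable fields and fetch the rest.
    static List<String> resolveLenient(List<String> requested) {
        List<String> fetchable = new ArrayList<>();
        for (String name : requested) {
            if (UNFETCHABLE.contains(name) == false) {
                fetchable.add(name);
            }
        }
        return fetchable;
    }
}
```

The strict variant surfaces a bad request immediately; the lenient variant never fails but can hide a mistyped or misconfigured alias from the user.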

Contributor

@romseygeek romseygeek left a comment

LGTM!

XContentBuilder builder = JsonXContent.contentBuilder().startObject();
builder.field("field", "value");
builder.endObject();
SourceToParse source = new SourceToParse(index, "id", BytesReference.bytes(builder), XContentType.JSON, "", Map.of());
Contributor

nit: use source() here as well?

Member Author

I wanted to randomize the index name since that's the value under test here. All the existing source() methods use test as a fixed index name and I didn't want to add another method. Do you think it's worth it, or do you think we don't need the randomization?

Contributor

OIC, no, randomization is good - maybe add a comment explaining why we're using an explicit SourceToParse call?

@cbuescher
Member Author

@elasticmachine run elasticsearch-ci/part-1

  @Override
  public ValueFetcher valueFetcher(SearchExecutionContext context, String format) {
-     throw new UnsupportedOperationException("Cannot fetch values for internal field [" + name() + "].");
+     return new DocValueFetcher(docValueFormat(format, null), context.getForField(this));
Contributor

I think we could do something even simpler here since the value is constant. We could return a direct implementation of ValueFetcher that always returns fullyQualifiedIndexName.
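
A rough sketch of what that suggestion could look like, under stated assumptions: `SimpleValueFetcher` is a hypothetical, simplified stand-in for Elasticsearch's `ValueFetcher` (the real interface takes lookup arguments that are not modeled here), and `ConstantIndexNameFetcher` is an illustrative name, not a class from this PR.

```java
import java.util.List;

// Hypothetical, simplified stand-in for Elasticsearch's ValueFetcher interface.
interface SimpleValueFetcher {
    List<Object> fetchValues(int docId);
}

// Returns the same index name for every document, so no doc values lookup is needed.
final class ConstantIndexNameFetcher implements SimpleValueFetcher {

    private final List<Object> values;

    ConstantIndexNameFetcher(String fullyQualifiedIndexName) {
        // Built once; the immutable list can be shared across all hits.
        this.values = List.of(fullyQualifiedIndexName);
    }

    @Override
    public List<Object> fetchValues(int docId) {
        return values;
    }
}
```

Because every document in a given index shares the same _index value, the fetcher can ignore the document entirely and avoid loading field data.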


@cbuescher cbuescher merged commit bf15ccc into elastic:master Oct 14, 2021
cbuescher pushed a commit to cbuescher/elasticsearch that referenced this pull request Oct 14, 2021
cbuescher pushed a commit that referenced this pull request Oct 14, 2021
Currently we exclude metadata fields from being looked up using the fields option in search.
However, as issues like #75836 show, they can still be retrieved, e.g. via aliases, and then fetching
their values causes errors.
With this change, we enable retrieval of metadata fields (like `_id`, `_ignored` etc.) using the fields
option when the field is explicitly requested. We still continue to exclude any metadata field from
matching wildcard patterns, but they should be retrievable via an exact name or if there is an alias
definition with a path to a metadata field.
This change adds support for the `_id`, `_routing`, `_ignored`, `_index` and `_version` fields in particular.

Backport of #78828, #78981 and #79042
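
The matching rule described above can be illustrated with a small sketch. This is not the Elasticsearch implementation: `FieldPatternResolver`, `resolve` and `simpleMatch` are hypothetical names, alias resolution is left out, and only `*` is treated as a wildcard.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only; alias handling is not modeled.
final class FieldPatternResolver {

    // Exact names may resolve to metadata fields; wildcard patterns never do.
    static List<String> resolve(String pattern, List<String> mappedFields, Set<String> metadataFields) {
        boolean isWildcard = pattern.contains("*");
        List<String> matches = new ArrayList<>();
        for (String field : mappedFields) {
            if (isWildcard) {
                if (metadataFields.contains(field) == false && simpleMatch(pattern, field)) {
                    matches.add(field);
                }
            } else if (field.equals(pattern)) {
                matches.add(field);
            }
        }
        return matches;
    }

    // Translates a pattern with '*' wildcards into a regex, quoting everything else.
    static boolean simpleMatch(String pattern, String value) {
        String[] parts = pattern.split("\\*", -1);
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                regex.append(".*");
            }
            regex.append(Pattern.quote(parts[i]));
        }
        return Pattern.matches(regex.toString(), value);
    }
}
```

With mapped fields `_id`, `_index`, `_version`, `title` and `user_id`, the pattern `*id` would resolve only to `user_id`, while the exact name `_index` would still resolve to `_index`.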

Labels

>enhancement :Search/Search Search-related issues that do not fall into other categories Team:Search Meta label for search team v7.16.0 v8.0.0-beta1
