Skip to content

Conversation

@romseygeek
Copy link
Contributor

@romseygeek romseygeek commented Mar 8, 2021

Alias fields allow you to correct errors in mappings or to unify mappings from
multiple sources by adding a redirection from one field name to another. However,
as currently implemented they have the drawbacks of a concrete field: they cannot
be deleted, and they apply to all clients. They can also only refer directly to
concrete fields and cannot link to other aliases.

This commit adds a new concept of alias fields as runtime fields. Like any other runtime
field they can be defined in the runtime section of an index's mappings, or in a search
request; they can be updated or deleted without restriction; and they can refer to
other runtime fields as well as to concrete fields, with some loop detection to prevent
stack overflows. To support this functionality, RuntimeField's asMappedFieldType()
method is extended to take a field type lookup function.

@romseygeek romseygeek added >feature :Search/Search Search-related issues that do not fall into other categories v8.0.0 v7.13.0 labels Mar 8, 2021
@romseygeek romseygeek self-assigned this Mar 8, 2021
@elasticmachine elasticmachine added the Team:Search Meta label for search team label Mar 8, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search (Team:Search)

@romseygeek
Copy link
Contributor Author

I got inspired over the weekend after the search meeting on Friday and came up with this, what do people think?

Copy link
Member

@javanna javanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some initial comments. I also have a couple of meta points:

  • this introduces a bit of complexity on the runtime fields existing structure, and aliases can already be defined on the runtime section through a simple script, more or less? If we do implement a specialized alias field on the runtime section, I wonder if these's a way to make it fit in the existing design without structural changes, similar to what we did for script-less runtime fields.
  • what plan could we make to remove the existing alias field mapper if any?

@romseygeek
Copy link
Contributor Author

aliases can already be defined on the runtime section through a simple script, more or less?

Well, sort of... but only for fields with types that we've implemented runtime script fields for (so not text fields, for example). And even for those the script isn't that simple, because you need to take into account empty and/or multiple values:

"runtime": {
  "not-quite-an-alias": {
    "type": "long",
    "script" : "for (long v : doc['field']) { emit(v); }"
  }
}

vs

"runtime": {
  "alias" : {
    "type": "alias",
    "path": "field"
  }
}

what plan could we make to remove the existing alias field mapper if any?

I think internally it's pretty easy to make FieldAliasMapper return a runtime alias and deal with that in the MappingLookup constructor, which lets us then remove all the special-casing elsewhere. Then we add a deprecation warning for any mappings that use an alias field in properties, refuse to create them in 8.x and remove entirely in 9.x

@jtibshirani
Copy link
Contributor

I'm excited about this effort! I understand it's on pause now, so am holding off on a detailed review. For context, here are two issues I think this change would address:

@romseygeek
Copy link
Contributor Author

I've updated this to the latest master - the changes that @javanna made in #70695 incorporate a lot of what I had initially included here, so this ends up being a much smaller change.

I have not made any attempt to change the existing FieldAliasMapper to use this functionality as well. I don't think it will be tricky, but it will require another look at dynamic field type lookups as this alias implementation does not support flattened fields.

@romseygeek romseygeek requested a review from javanna April 26, 2021 09:44
assertEquals("Loop in field resolution detected: alias-loop1->alias-loop2->alias-loop1", e.getMessage());
}

/*
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This commented-out test illustrates the problem with dynamic lookups that would be fixed by #72024

MappedFieldType runtimeFieldType = runtimeField.asMappedFieldType();
//this will override concrete fields with runtime fields that have the same name
fullNameToFieldType.put(runtimeFieldType.name(), runtimeFieldType);
this.runtimeFields.put(runtimeField.name(), runtimeField);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit confused about why we need to introduce the distinction between ordinary fields and runtime fields in here. An alias can alias any field, what difference does it make if it's a runtime field or not?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's because asMappedFieldType() takes a MappingLookup now, so we can't resolve this at construction time.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, maybe another reason to try and change asMappedFieldType then.

@romseygeek
Copy link
Contributor Author

I've simplified the lookup logic considerably:

  • runtime MappedFieldTypes are resolved when FieldTypeLookup or SearchExecutionContext are built
  • aliases defined in the runtime section can only refer to fields in the properties section
  • aliases defined in a query can refer to anything in runtime or in properties, but can't refer to other fields defined in the query

This removes the need for loop detection. It has the downside that you can't alias to a path within a dynamic field any more, but that wasn't previously supported so we aren't losing much.

Copy link
Contributor

@jtibshirani jtibshirani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The approach looks simpler now, it seems like a good trade-off to only allow aliases to refer to concrete fields. My last high-level comment: It'd be nice to add some light docs in this PR that explain the behavior and restrictions.

* @return the {@link MappedFieldType} backing this runtime field
*/
MappedFieldType asMappedFieldType();
MappedFieldType asMappedFieldType(Function<String, MappedFieldType> lookup);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not super happy about this method. I added it as I thought we needed it for the object runtime field, but it turns out we won't use it...

After thinking about this more, having some method like this makes sense to me in general. It seems like RuntimeField is a counterpart to FieldMapper in that it contains a mapping definition and is able to produce a MappedFieldType. I wonder if we could rename the method to fieldType to better fit with FieldMapper?

As for the function argument, so far I don't see a cleaner approach. To me it's not too strange that runtime fields have access to the concrete field types here. @romseygeek pointed out that we've brainstormed new types of runtime fields that refer to concrete fields -- such as one that combines the contents of multiple text fields like a virtual copy_to.

public MappedFieldType asMappedFieldType(Function<String, MappedFieldType> lookup) {
MappedFieldType ft = lookup.apply(path);
if (ft == null) {
throw new IllegalStateException("Cannot resolve alias [" + name + "]: path [" + path + "] does not exist");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small comment, instead of saying 'does not exist' we could say something like 'the field [path] either does not exist or is a runtime field'?

@romseygeek
Copy link
Contributor Author

For docs, I'd like to ask @lockewritesdocs for his opinion on where to add them. The runtime.asciidoc page currently assumes that all runtime fields are defined with a script, and I think will need some reworking so that docs for this new runtime type fit in.

@lockewritesdocs
Copy link
Contributor

For docs, I'd like to ask @lockewritesdocs for his opinion on where to add them. The runtime.asciidoc page currently assumes that all runtime fields are defined with a script, and I think will need some reworking so that docs for this new runtime type fit in.

@romseygeek, I think that we should introduce aliases for runtime fields through a new section in both Mapping a runtime field and Defining runtime fields in a search request, since aliases for runtime fields are supported in both instances. If you can provide some examples and explanatory text, I can create a PR to update the docs in those sections.

Copy link
Member

@javanna javanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a couple of comments. I think this is close, we only need to iron out a couple of design details. One thing that I am not sold on is supporting aliases to other runtime fields only in the search request. I think if you can only alias a concrete field, the behaviour should be consistent no matter where the alias is defined. I can imagine that users may want us to support aliasing a runtime field, but runtime fields are editable, and defined through a script, so the alias functionality really applies to concrete fields that cannot be changed.


for (RuntimeField runtimeField : runtimeFields) {
MappedFieldType runtimeFieldType = runtimeField.asMappedFieldType();
//this will override concrete fields with runtime fields that have the same name
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you remove this comment on purpose?

if (runtimeMappings.containsKey(name)) {
return runtimeMappings.get(name);
}
return mappingLookup.getFieldType(name);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we gain anything with this change?

MappedFieldType runtimeFieldType = runtimeField.asMappedFieldType();
//this will override concrete fields with runtime fields that have the same name
fullNameToFieldType.put(runtimeFieldType.name(), runtimeFieldType);
MappedFieldType resolved = runtimeField.asMappedFieldType(fullNameToFieldType::get);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thinking out loud: if we use asMappedFieldType only for aliases, given that all existing runtime fields return this, should we consider adding the notion of alias to RuntimeField through a method that returns a String and return by default the same as the field name, and when it differs look it up directly in FieldTypeLookup? Maybe it is cryptic, not too sure, but it would make it more specific to aliases (which is both good and bad I guess), would remove the need to delegate to the runtime field to look up the aliased field type. I do wonder if it should be the responsibility of the field type to look up the aliases field among all fields. We may or may not need something more generic at some point in the future but we can always refactor things later when needed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one more thing: fullNameToFieldType does not include the previously resolved alias field mapper, so we are sure that an alias runtime field never points to an alias field defined under properties?

if (fieldType instanceof DynamicFieldType) {
dynamicFieldTypes.put(aliasName, (DynamicFieldType) fieldType);
RuntimeField aliasField = new AliasRuntimeField(aliasName, path);
MappedFieldType resolved = aliasField.asMappedFieldType(fullNameToFieldType::get);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means that the internal representation of an alias field as it existed so far is the same as the runtime alias field, but this has no effect on how it gets serialized and sent out throughout the nodes and stored in the cluster state, correct?

@elasticsearchmachine
Copy link
Collaborator

Hi @romseygeek, I've created a changelog YAML for you.

@javanna
Copy link
Member

javanna commented Jun 23, 2022

Heya, I opened an issue for this so we track the need for it: see #87969 . I assume that this PR is outdated and can't be brought up-to-date, hence closing.

@javanna javanna closed this Jun 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

>feature :Search/Search Search-related issues that do not fall into other categories Team:Search Meta label for search team

Projects

None yet

Development

Successfully merging this pull request may close these issues.