Skip to content

Conversation

@romseygeek
Copy link
Contributor

Flattened field mappers have a specialised lookup class to handle
the fact that their MappedFieldTypes are dynamic and generated
on-the-fly, rather than being registered up front. The way this is
implemented means that this lookup class also has to be aware of
field aliases, which is blocking our attempt to re-implement aliases
as runtime fields.

This commit removes the specialised lookup class and moves
dynamic lookup handling directly into FieldTypeLookup. If a field
containing dots does not have a MappedFieldType directly registered
against it, the FieldTypeLookup will progressively chop dotted parts
from the field name and try and find a registered parent with the
shortened name; if such a parent exists, then the lookup calls a
new childFieldType() method on MappedFieldType, passing the
part of the field that was removed to find the parent.

@romseygeek romseygeek added :Search Foundations/Mapping Index mappings, including merging and defining field types >refactoring v8.0.0 v7.14.0 labels Apr 21, 2021
@romseygeek romseygeek self-assigned this Apr 21, 2021
@elasticmachine elasticmachine added the Team:Search Meta label for search team label Apr 21, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search (Team:Search)

@romseygeek
Copy link
Contributor Author

I think implementing dynamic lookup in this way will also help in future for runtime object fields, as we can register the parent object and handle subfields within the runtime implementation.

Copy link
Contributor

@jtibshirani jtibshirani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally liked how this was factored before. I actually moved logic to its own class in #52091 as I thought it made flattened fields more modular, and kept FieldTypeLookup more readable. I also tried not to introduce flattened field concerns on the main FieldMapper, it's not really a general capability of field types that they produce a subfield for any key.

Could you explain more how this helps convert field aliases to runtime fields? I wonder if we could make DynamicKeyFieldTypeLookup unaware of aliases while still maintaining the modularity?

@romseygeek
Copy link
Contributor Author

Could you explain more how this helps convert field aliases to runtime fields?

Let's say we have a flattened field foo, and we want to add an alias to it bar -> foo. When we look up bar.field, FieldTypeLookup will first check its list of registered mappings, and it won't find it there because the .field suffix is dynamic. So we instead pass things to the dynamic lookup. For it to be able to resolve bar.field to foo.field, it also needs to know about aliases, so the way things stand now, we pass the list of aliases to the DynamicFieldTypeLookup constructor.

Enter runtime fields. #70065 implements aliases by adding an extra step to the resolving process. If you call 'get' and it refers to a runtime field, then the field type lookup will ask the runtime field to return a MappedFieldType. This could be a script field implementation, or it could be another existing MappedFieldType, or something else entirely. We don't know about aliases up-front anymore, because they're implemented by having a RuntimeField register itself under the alias name, and then having it return the resolved MappedFieldType during get(). And because we don't know about them up-front, we can't pass them to the dynamic field mapper.

it's not really a general capability of field types that they produce a subfield for any key.

This is true now, but object script fields are probably going to end up using this functionality as well. And DynamicFieldTypeLookup doesn't know anything about runtime fields. So I felt that merging the two lookup types made the most sense here, rather than adding extra complications to DynamicFieldTypeLookup.

@romseygeek
Copy link
Contributor Author

From some other discussions today regarding nested fields, I think this would also come in handy to make nested fields more modular - we can register the nested root in the fieldtypelookup, and then return child mapped field types that can wrap their queries with the correct nested filters.

@romseygeek romseygeek requested a review from jtibshirani April 29, 2021 10:04
@romseygeek
Copy link
Contributor Author

After some discussion with @jtibshirani I've removed the new method on MappedFieldType and instead created a new interface, DynamicFieldType, which can create child field types; RootFlattenedFieldType implements this interface, and we detect any field types implementing this in FieldTypeLookup and add special handling for them.

Copy link
Contributor

@jtibshirani jtibshirani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me overall, just left a few comments.

import org.elasticsearch.index.analysis.NamedAnalyzer;

/**
* A field mapper that supports lookup of dynamic sub-keys. If the field mapper is named 'my_field',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like we dropped this javadoc, it'd be nice to move it to DynamicFieldType instead.

assertNull(lookup.get("object.child"));
}

public void testParentPathChecks() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this method be removed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops, yes, gone.

this.maxKeyDepth = getMaxKeyDepth(mappers, aliasToConcreteName);
}

/**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could keep these method comments too?

return getDynamicField(field);
}

private MappedFieldType getDynamicField(String field) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was there a motivation for changing the lookup approach? The old approach seemed more streamlined -- this one seems to do several passes through the string (in longestPossibleParent, then the contains and lastIndexOf calls in a loop below).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why I re-implemented this, but the original approach is clearly more efficient. I've updated.

Copy link
Contributor

@jtibshirani jtibshirani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a last comment, once that's addressed this looks ready to merge.


this.dynamicKeyLookup = new DynamicKeyFieldTypeLookup(dynamicKeyMappers, aliasToConcreteName);
// for testing
String longestPossibleParent(String path) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is now just used in testMaxDynamicKeyDepth and should be removed. It looks like we need to rework this test.

Copy link
Member

@javanna javanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a small comment, LGTM though.

int dotIndex = -1;
int fieldDepth = -1;

while (true) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For some reason this loop makes me nervous that we may do more work than needed. Effectively we could stop once we encounter an object. But maybe that should not be a concern.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Effectively we could stop once we encounter an object

Not quite - you can have a dynamic field nested inside an object. That's why we calculate the maxParentPathDots field, because once you've got past that you know that there are no dynamic roots that could match the path.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but you go backwards analyzing the path, and you can't have a dynamic field pointing to an object, right? so once you find an object, you should be done and there is no need to look at its parent and so on? Am I missing something?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comment is off, because when you look up field types, you can not find an object :) so I think this is a non-issue, like I said I am nervous that we go ahead and analyze the path when it's not needed, but I am not sure that would be a problem and how to avoid it.

@romseygeek romseygeek merged commit 8c28ec2 into elastic:master May 4, 2021
@romseygeek romseygeek deleted the mapper/dynamicfieldtypelookup branch May 4, 2021 09:34
romseygeek added a commit that referenced this pull request May 4, 2021
Flattened field mappers have a specialised lookup class to handle
the fact that their MappedFieldTypes are dynamic and generated
on-the-fly, rather than being registered up front. The way this is
implemented means that this lookup class also has to be aware of
field aliases, which is blocking our attempt to re-implement aliases
as runtime fields.

This commit merges dynamic field lookup handling into the standard
FieldTypeLookup class.  When the lookup class is built, it checks each
MappedFieldType being registered to see if it implements a new
DynamicFieldType interface, and stores these in a separate map.
If a field containing dots does not have a field type directly registered
against it, we check if a dynamic field type matches one of its dot-
delimited prefixes, and if so we then return the result of calling
`DynamicFieldType.getChildFieldType()` with the remainder of the
path.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

>refactoring :Search Foundations/Mapping Index mappings, including merging and defining field types Team:Search Meta label for search team v7.14.0 v8.0.0-alpha1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants