Merge dynamic field type lookup into FieldTypeLookup #72024

romseygeek · 2021-04-21T14:52:56Z

Flattened field mappers have a specialised lookup class to handle
the fact that their MappedFieldTypes are dynamic and generated
on-the-fly, rather than being registered up front. The way this is
implemented means that this lookup class also has to be aware of
field aliases, which is blocking our attempt to re-implement aliases
as runtime fields.

This commit removes the specialised lookup class and moves
dynamic lookup handling directly into FieldTypeLookup. If a field
containing dots does not have a MappedFieldType directly registered
against it, the FieldTypeLookup will progressively chop dotted parts
from the field name and try to find a registered parent with the
shortened name; if such a parent exists, then the lookup calls a
new childFieldType() method on MappedFieldType, passing the
part of the field that was removed to find the parent.
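
As a rough illustration of the lookup described above, here is a minimal sketch. The classes `SketchFieldType`, `SketchFlattenedType`, and `PrefixLookupSketch` are hypothetical stand-ins, not the Elasticsearch classes, and the `childFieldType(String)` signature is an assumption based only on the description.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for MappedFieldType; only what the sketch needs.
class SketchFieldType {
    final String name;

    SketchFieldType(String name) {
        this.name = name;
    }

    // Assumed shape of the proposed hook: ordinary types produce no children.
    SketchFieldType childFieldType(String childPath) {
        return null;
    }
}

// Hypothetical flattened-style type that produces a child type for any key.
class SketchFlattenedType extends SketchFieldType {
    SketchFlattenedType(String name) {
        super(name);
    }

    @Override
    SketchFieldType childFieldType(String childPath) {
        return new SketchFieldType(name + "." + childPath);
    }
}

class PrefixLookupSketch {
    private final Map<String, SketchFieldType> registered = new HashMap<>();

    void register(SketchFieldType type) {
        registered.put(type.name, type);
    }

    SketchFieldType get(String field) {
        SketchFieldType direct = registered.get(field);
        if (direct != null) {
            return direct;
        }
        // Chop one dotted part at a time from the end of the name.
        int dot = field.lastIndexOf('.');
        while (dot > 0) {
            SketchFieldType parent = registered.get(field.substring(0, dot));
            if (parent != null) {
                // Hand the removed suffix to the registered parent.
                return parent.childFieldType(field.substring(dot + 1));
            }
            dot = field.lastIndexOf('.', dot - 1);
        }
        return null;
    }
}
```

With a `SketchFlattenedType` registered as `foo`, looking up `foo.a.b` misses on `foo.a.b` and `foo.a`, finds `foo`, and asks it for the child `a.b`. A non-dynamic parent simply returns null.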

elasticmachine · 2021-04-21T14:53:00Z

Pinging @elastic/es-search (Team:Search)

romseygeek · 2021-04-21T14:53:51Z

I think implementing dynamic lookup in this way will also help in future for runtime object fields, as we can register the parent object and handle subfields within the runtime implementation.

jtibshirani

I personally liked how this was factored before. I actually moved the logic to its own class in #52091, as I thought it made flattened fields more modular and kept FieldTypeLookup more readable. I also tried not to introduce flattened field concerns on the main FieldMapper; it's not really a general capability of field types that they produce a subfield for any key.

Could you explain more how this helps convert field aliases to runtime fields? I wonder if we could make DynamicKeyFieldTypeLookup unaware of aliases while still maintaining the modularity?

romseygeek · 2021-04-21T16:06:52Z

> Could you explain more how this helps convert field aliases to runtime fields?

Let's say we have a flattened field foo, and we want to add an alias to it, bar -> foo. When we look up bar.field, FieldTypeLookup will first check its list of registered mappings, and it won't find it there because the .field suffix is dynamic. So we instead pass things to the dynamic lookup. For it to be able to resolve bar.field to foo.field, it also needs to know about aliases, so the way things stand now, we pass the list of aliases to the DynamicKeyFieldTypeLookup constructor.

Enter runtime fields. #70065 implements aliases by adding an extra step to the resolving process. If you call 'get' and it refers to a runtime field, then the field type lookup will ask the runtime field to return a MappedFieldType. This could be a script field implementation, or it could be another existing MappedFieldType, or something else entirely. We don't know about aliases up-front anymore, because they're implemented by having a RuntimeField register itself under the alias name, and then having it return the resolved MappedFieldType during get(). And because we don't know about them up-front, we can't pass them to the dynamic field mapper.
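
To make the problem concrete, here is a small sketch of an alias that resolves lazily inside `get()`. Everything in it is hypothetical (`AliasSketchLookup`, `registerAlias`, and the choice to consult runtime fields first are assumptions for illustration); it is not how #70065 is written.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for MappedFieldType.
class AliasSketchFieldType {
    final String name;

    AliasSketchFieldType(String name) {
        this.name = name;
    }
}

class AliasSketchLookup {
    private final Map<String, AliasSketchFieldType> concrete = new HashMap<>();
    // Hypothetical runtime fields: each one produces a field type only when asked.
    private final Map<String, Function<AliasSketchLookup, AliasSketchFieldType>> runtime = new HashMap<>();

    void registerConcrete(AliasSketchFieldType type) {
        concrete.put(type.name, type);
    }

    // The alias registers itself under its own name; the target is resolved at get() time.
    // Alias loops are not handled in this sketch.
    void registerAlias(String alias, String target) {
        runtime.put(alias, lookup -> lookup.get(target));
    }

    AliasSketchFieldType get(String field) {
        Function<AliasSketchLookup, AliasSketchFieldType> resolver = runtime.get(field);
        if (resolver != null) {
            return resolver.apply(this);
        }
        return concrete.get(field);
    }
}
```

Here `get("bar")` returns the type registered as `foo`, but nothing outside the resolver knows that `bar` points at `foo`, so there is no alias list to hand to a separate dynamic lookup, and `get("bar.field")` finds nothing.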

> it's not really a general capability of field types that they produce a subfield for any key.

This is true now, but object script fields are probably going to end up using this functionality as well. And DynamicKeyFieldTypeLookup doesn't know anything about runtime fields. So I felt that merging the two lookup types made the most sense here, rather than adding extra complications to DynamicKeyFieldTypeLookup.

romseygeek · 2021-04-22T16:39:17Z

From some other discussions today regarding nested fields, I think this would also come in handy to make nested fields more modular: we can register the nested root in the FieldTypeLookup, and then return child mapped field types that can wrap their queries with the correct nested filters.

romseygeek · 2021-04-29T10:06:41Z

After some discussion with @jtibshirani I've removed the new method on MappedFieldType and instead created a new interface, DynamicFieldType, which can create child field types; RootFlattenedFieldType implements this interface, and we detect any field types implementing this in FieldTypeLookup and add special handling for them.
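
A minimal sketch of that shape, with hypothetical names throughout: `DemoDynamicFieldType` stands in for the new interface, `DemoRootFlattenedType` for RootFlattenedFieldType, and `DemoFieldTypeLookup` for FieldTypeLookup. The method signature is assumed, and the special handling shown is only one way it could look.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for MappedFieldType.
class DemoFieldType {
    private final String name;

    DemoFieldType(String name) {
        this.name = name;
    }

    String name() {
        return name;
    }
}

// Hypothetical version of the interface: only types that opt in can create children.
interface DemoDynamicFieldType {
    DemoFieldType getChildFieldType(String path);
}

// Stand-in for the flattened root type, the one implementor mentioned in the thread.
class DemoRootFlattenedType extends DemoFieldType implements DemoDynamicFieldType {
    DemoRootFlattenedType(String name) {
        super(name);
    }

    @Override
    public DemoFieldType getChildFieldType(String path) {
        return new DemoFieldType(name() + "." + path);
    }
}

class DemoFieldTypeLookup {
    private final Map<String, DemoFieldType> fullNameToFieldType = new HashMap<>();
    private final Map<String, DemoDynamicFieldType> dynamicFieldTypes = new HashMap<>();

    DemoFieldTypeLookup(Collection<DemoFieldType> types) {
        for (DemoFieldType type : types) {
            fullNameToFieldType.put(type.name(), type);
            // Detect dynamic types at build time and keep them in their own map.
            if (type instanceof DemoDynamicFieldType) {
                dynamicFieldTypes.put(type.name(), (DemoDynamicFieldType) type);
            }
        }
    }

    DemoFieldType get(String field) {
        DemoFieldType direct = fullNameToFieldType.get(field);
        if (direct != null) {
            return direct;
        }
        // Test each dot-delimited prefix against the dynamic roots only.
        for (int dot = field.indexOf('.'); dot >= 0; dot = field.indexOf('.', dot + 1)) {
            DemoDynamicFieldType parent = dynamicFieldTypes.get(field.substring(0, dot));
            if (parent != null) {
                return parent.getChildFieldType(field.substring(dot + 1));
            }
        }
        return null;
    }
}
```

The point of the interface is that the base field type carries no child-producing method at all; ordinary field types never enter the dynamic map, so a path such as `title.raw` under a plain `title` field resolves to nothing.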

jtibshirani

This looks good to me overall, just left a few comments.

jtibshirani · 2021-05-01T01:24:29Z

server/src/main/java/org/elasticsearch/index/mapper/DynamicKeyFieldMapper.java

-import org.elasticsearch.index.analysis.NamedAnalyzer;
-
-/**
- * A field mapper that supports lookup of dynamic sub-keys. If the field mapper is named 'my_field',


It looks like we dropped this javadoc, it'd be nice to move it to DynamicFieldType instead.

jtibshirani · 2021-05-01T01:31:42Z

server/src/test/java/org/elasticsearch/index/mapper/FieldTypeLookupTests.java

+        assertNull(lookup.get("object.child"));
+    }
+
+    public void testParentPathChecks() {


Can this method be removed?

oops, yes, gone.

jtibshirani · 2021-05-01T01:33:24Z

server/src/main/java/org/elasticsearch/index/mapper/DynamicKeyFieldTypeLookup.java

-        this.maxKeyDepth = getMaxKeyDepth(mappers, aliasToConcreteName);
-    }
-
-    /**


We could keep these method comments too?

jtibshirani · 2021-05-01T01:52:58Z

server/src/main/java/org/elasticsearch/index/mapper/FieldTypeLookup.java

+        return getDynamicField(field);
+    }
+
+    private MappedFieldType getDynamicField(String field) {


Was there a motivation for changing the lookup approach? The old approach seemed more streamlined -- this one seems to do several passes through the string (in longestPossibleParent, then the contains and lastIndexOf calls in a loop below).

I'm not sure why I re-implemented this, but the original approach is clearly more efficient. I've updated.

jtibshirani

I left a last comment, once that's addressed this looks ready to merge.

jtibshirani · 2021-05-03T05:11:08Z

server/src/main/java/org/elasticsearch/index/mapper/FieldTypeLookup.java


-        this.dynamicKeyLookup = new DynamicKeyFieldTypeLookup(dynamicKeyMappers, aliasToConcreteName);
+    // for testing
+    String longestPossibleParent(String path) {


I think this is now just used in testMaxDynamicKeyDepth and should be removed. It looks like we need to rework this test.

javanna

I left a small comment, LGTM though.

javanna · 2021-05-04T08:03:40Z

server/src/main/java/org/elasticsearch/index/mapper/FieldTypeLookup.java

+        int dotIndex = -1;
+        int fieldDepth = -1;
+
+        while (true) {


For some reason this loop makes me nervous that we may do more work than needed. Effectively we could stop once we encounter an object. But maybe that should not be a concern.

> Effectively we could stop once we encounter an object

Not quite - you can have a dynamic field nested inside an object. That's why we calculate the maxParentPathDots field, because once you've got past that you know that there are no dynamic roots that could match the path.
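
A sketch of how such a bound can cap the work, assuming the bound means "the largest number of dots in any dynamic root's name". That meaning, the class `BoundedPrefixScanSketch`, and both method names are assumptions for illustration, not the code under review.

```java
import java.util.Set;

class BoundedPrefixScanSketch {

    // Largest number of dots found in any dynamic root name (assumed meaning of the bound).
    static int maxDots(Set<String> dynamicRoots) {
        int max = 0;
        for (String root : dynamicRoots) {
            int dots = (int) root.chars().filter(c -> c == '.').count();
            max = Math.max(max, dots);
        }
        return max;
    }

    // Returns the dynamic root that is a dot-delimited prefix of path, or null if none is.
    static String findDynamicRoot(Set<String> dynamicRoots, int maxDots, String path) {
        int dotIndex = -1;
        // prefixDots is the number of dots inside the prefix currently being tested.
        for (int prefixDots = 0; prefixDots <= maxDots; prefixDots++) {
            dotIndex = path.indexOf('.', dotIndex + 1);
            if (dotIndex < 0) {
                return null; // no more prefixes to test
            }
            String prefix = path.substring(0, dotIndex);
            if (dynamicRoots.contains(prefix)) {
                return prefix;
            }
        }
        // Any longer prefix has more dots than every dynamic root, so it cannot match.
        return null;
    }
}
```

With roots `flat` and `obj.flat` the bound is 1, so a path like `a.b.c.d.e` is abandoned after testing `a` and `a.b`, while `obj.flat.key` still finds the dynamic root nested inside the object.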

But you go backwards when analyzing the path, and you can't have a dynamic field pointing to an object, right? So once you find an object, you should be done, and there is no need to look at its parent and so on. Am I missing something?

My comment is off, because when you look up field types, you cannot find an object :) So I think this is a non-issue. Like I said, I am nervous that we go ahead and analyze the path when it's not needed, but I am not sure whether that would be a problem or how to avoid it.

Flattened field mappers have a specialised lookup class to handle the fact that their MappedFieldTypes are dynamic and generated on-the-fly, rather than being registered up front. The way this is implemented means that this lookup class also has to be aware of field aliases, which is blocking our attempt to re-implement aliases as runtime fields.

This commit merges dynamic field lookup handling into the standard FieldTypeLookup class. When the lookup class is built, it checks each MappedFieldType being registered to see if it implements a new DynamicFieldType interface, and stores these in a separate map. If a field containing dots does not have a field type directly registered against it, we check if a dynamic field type matches one of its dot-delimited prefixes, and if so we then return the result of calling `DynamicFieldType.getChildFieldType()` with the remainder of the path.

Merge dynamic field type lookup into FieldTypeLookup

4ba25b6

romseygeek added the :Search Foundations/Mapping, >refactoring, v8.0.0, and v7.14.0 labels Apr 21, 2021

romseygeek requested review from javanna and jtibshirani April 21, 2021 14:52

romseygeek self-assigned this Apr 21, 2021

elasticmachine added the Team:Search label Apr 21, 2021

Merge remote-tracking branch 'origin/master' into mapper/dynamicfield…

d9cbecb

…typelookup

jtibshirani reviewed Apr 21, 2021

romseygeek mentioned this pull request Apr 26, 2021

Add aliases as runtime fields #70065

Closed

romseygeek added 2 commits April 28, 2021 18:34

Merge remote-tracking branch 'origin/master' into mapper/dynamicfield…

63d427b

…typelookup

Introduce DynamicFieldType

538831f

romseygeek requested a review from jtibshirani April 29, 2021 10:04

romseygeek added 2 commits April 29, 2021 12:01

tidy up source lookup

3385c36

Merge remote-tracking branch 'origin/master' into mapper/dynamicfield…

63e9cb0

…typelookup

jtibshirani reviewed May 1, 2021

romseygeek added 2 commits May 1, 2021 11:57

Merge remote-tracking branch 'origin/master' into mapper/dynamicfield…

828214d

…typelookup

feedback

3630a3c

jtibshirani approved these changes May 3, 2021

Merge remote-tracking branch 'origin/master' into mapper/dynamicfield…

f0de3fe

…typelookup

javanna approved these changes May 4, 2021

remove unused method, rework tests

673ae38

romseygeek merged commit 8c28ec2 into elastic:master May 4, 2021

romseygeek deleted the mapper/dynamicfieldtypelookup branch May 4, 2021 09:34

jakelandis added v8.0.0-alpha1 and removed v8.0.0 labels Jul 26, 2021

5 participants