Skip to content

Compare with "is distinct" in array_distinct, array_intersect#559

Merged
findepi merged 1 commit intotrinodb:masterfrom
findepi:findepi/array-distinct
May 15, 2019
Merged

Compare with "is distinct" in array_distinct, array_intersect#559
findepi merged 1 commit intotrinodb:masterfrom
findepi:findepi/array-distinct

Conversation

@findepi
Copy link
Member

@findepi findepi commented Mar 28, 2019

Addresses #560 for array_distinct, array_intersect.

Copy link
Member

@martint martint left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor comments

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may want to add a test with repeated entries in one (or both) of the arrays.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a good pattern, as it will be a megamorphic callsite. What we generally want to do is assemble a methodhandle via methodhandle combinators. The VM is pretty good at generating specialized call trees without profile pollution for those. It's probably not trivial to do for the code that uses this (the array_intersect function + TypedSet combo), so, at a minimum, I'd prefer for this to be done directly in typed set to discourage (or at least not encourage) unwitting developers from reusing this method in other places.

Also, this function is weird. It's not clear why a function that computes whether two elements are distinct from each other needs a function that computes whether the elements are distinct from each other. :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inlined

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding methodHandle dependencies in the methods that get called row per row is a bit of an anti-pattern. It typically means the code is turning around and doing methodHandle.invoke at some point, which become polymorphic when the function is used in a variety of contexts. A better approach is to get the dependencies during the "get implementation" call, which should assemble a methodhandle chain or generate specialized bytecode.

It doesn't matter for this specific instance since it's hard to assemble the methodhandle chain with the TypedSet in the middle, but I'm just pointing it out as a general comment (so no need to change anything here).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left as is

@findepi findepi force-pushed the findepi/array-distinct branch from e1d421f to 89ffab7 Compare May 13, 2019 09:35
@findepi findepi force-pushed the findepi/array-distinct branch from d2c2416 to c4d90ca Compare May 14, 2019 08:41
@findepi findepi merged commit d1a50dc into trinodb:master May 15, 2019
@findepi findepi deleted the findepi/array-distinct branch May 15, 2019 10:42
@findepi findepi added this to the 312 milestone May 15, 2019
@findepi findepi mentioned this pull request May 15, 2019
6 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

2 participants