Skip to content

Add equals and hashcode methods to ConnectorExpression#1712

Merged
martint merged 1 commit intotrinodb:masterfrom
phd3:connector-expr-hash
Oct 12, 2019
Merged

Add equals and hashcode methods to ConnectorExpression#1712
martint merged 1 commit intotrinodb:masterfrom
phd3:connector-expr-hash

Conversation

@phd3
Copy link
Copy Markdown
Member

@phd3 phd3 commented Oct 10, 2019

No description provided.

@cla-bot cla-bot bot added the cla-signed label Oct 10, 2019
@phd3
Copy link
Copy Markdown
Member Author

phd3 commented Oct 10, 2019

@martint excluded the types since I feel that the rest of it should uniquely define the ConnectorExpression objects. Please let me know if you think otherwise.

@martint
Copy link
Copy Markdown
Member

martint commented Oct 10, 2019

The type of the leaves is important. For constants, it's implied by the object that's stored, but for variables, there's nothing to distinguish when they have the same name. In practice, it probably won't matter much, as the expressions will be processed and considered within the same scope. The type should be the same, otherwise it's a bug. It will depend on what the equality semantics are being used for.

@findepi
Copy link
Copy Markdown
Member

findepi commented Oct 10, 2019

Why do we need equals in ConnectoExpression?
I am very concerns about "almost value objects" that are considering only some fields in equals.
We have this problem with table handles (see #1448).

I would recommend implementing Equivalence instead of adding this.
(either using com.google.common.base.Equivalence directly or mimicking it in SPI)

Even if our end goal is to really have equals in these classes, adding it as equivalence first is much safer way.
Once you add equals() there is no effective way to find code that depends on it, so it's hard to change/remove it.

@phd3
Copy link
Copy Markdown
Member Author

phd3 commented Oct 10, 2019

@findepi @martint I want to extract the unique fielddereferences and unique variables in all the input connectorexpressions in Hive applyProjection. This helps decide the new columnhandles we want to generate. That requires these methods to be implemented. I see your point about the equivalence method, but I don’t think it’d help here.

Another option is to convert the variables and fielddereferences into a different representation using strings and integers, before doing such operations.

@findepi
Copy link
Copy Markdown
Member

findepi commented Oct 10, 2019

I see your point about the equivalence method, but I don’t think it’d help here.

Thanks for considering. Why it wouldn't help?

@phd3
Copy link
Copy Markdown
Member Author

phd3 commented Oct 11, 2019

@findepi I created a WIP #1720 for which I plan to use the ConnectorExpression equalities. (HiveMetadata#applyProjection method to be specific)

I think you're right, we can make use of Equivalence here. Can you elaborate a bit on your thoughts? Do you mean that we define an equivalence and use Equivalence.wrap whenever we want to use the ConnectorExpression in a set/map or simple equality comparison. In that case, do we have different equivalence objects defined for different implementations?

@findepi
Copy link
Copy Markdown
Member

findepi commented Oct 11, 2019

@phd3 maybe I am overcomplicating this.
If you do equals that's fully value-based (considers all fields), then I am ok with this.
From what I see, you consider all fields except of type.

@martint what are you thoughts?

@phd3 phd3 force-pushed the connector-expr-hash branch from f349340 to a09ead5 Compare October 12, 2019 01:55
@phd3
Copy link
Copy Markdown
Member Author

phd3 commented Oct 12, 2019

@findepi I agree with your point. There isn't any additional benefit of not considering types for comparison. so I think we should include it for simplicity and to also avoid any incorrect behavior in the future.

@martint
Copy link
Copy Markdown
Member

martint commented Oct 12, 2019

Let's add the type as well for simplicity. It will make it easier to reason about the behavior of the code. There's a small flaw in the current structure of ConnectorExpression with respect to how it handles types (in particular, it won't work naturally once we add functions and lambda expressions), but we'll tackle that later. At that point I expect the equality semantics that consider all fields to be exactly what we want.

@phd3
Copy link
Copy Markdown
Member Author

phd3 commented Oct 12, 2019

I've added types in the equality check.

@phd3 phd3 requested a review from martint October 12, 2019 03:10
@martint martint merged commit 013a22c into trinodb:master Oct 12, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

3 participants