Contributor

@joan38 joan38 commented Apr 4, 2016

What changes were proposed in this pull request?

Implement hashCode and equals together in several classes so that the corresponding scalastyle check can be enabled.
This is a first batch; I will continue implementing them, but I wanted to hear your thoughts first.

@joan38 joan38 force-pushed the SPARK-6429-HashCode-Equals branch from d276daa to aefff62 Compare April 4, 2016 21:55

SparkQA commented Apr 4, 2016

Test build #54897 has finished for PR 12157 at commit aefff62.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@joan38 joan38 force-pushed the SPARK-6429-HashCode-Equals branch from aefff62 to 6681b0e Compare April 4, 2016 22:14

SparkQA commented Apr 4, 2016

Test build #54899 has finished for PR 12157 at commit 6681b0e.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@joan38 joan38 force-pushed the SPARK-6429-HashCode-Equals branch from 6681b0e to d867d13 Compare April 4, 2016 22:46

SparkQA commented Apr 4, 2016

Test build #54904 has finished for PR 12157 at commit d867d13.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@joan38 joan38 force-pushed the SPARK-6429-HashCode-Equals branch from d867d13 to 8ca6d43 Compare April 4, 2016 23:51

SparkQA commented Apr 5, 2016

Test build #54911 has finished for PR 12157 at commit 8ca6d43.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

Hm, super.equals() delegates to the default in Object which requires reference equality. I don't think we can have that. Although defining these in an abstract class is dicey, I agree it should go hand in hand with hashCode at least and should just define equality based on index.

Member

Actually is there any subclass that relies on this default implementation? If so, I think it also needs to check its own class vs the class of the argument. If not, we could remove this.

Contributor Author

If so, do you mean using the canEqual approach?
If not, do you mean removing both equals and hashCode then?

Member

If all the subclasses override these methods (and some implement some custom logic), then this isn't used, and maybe it's simpler to omit it. If this stays, yes, you're right that it really has to check the class of itself vs the argument too.
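For readers unfamiliar with the canEqual approach mentioned in this thread, here is a minimal sketch of the pattern. The classes `Point` and `ColoredPoint` are hypothetical and are not taken from the Spark code base; the point is only to show how a base class and a subclass can both define equals without breaking symmetry.

```scala
// Hypothetical classes for illustration; not part of Spark.
class Point(val x: Int, val y: Int) {
  // Subclasses override this to refuse comparison with plain Points.
  def canEqual(other: Any): Boolean = other.isInstanceOf[Point]

  override def equals(other: Any): Boolean = other match {
    case that: Point => that.canEqual(this) && x == that.x && y == that.y
    case _           => false
  }

  override def hashCode(): Int = 31 * x + y
}

class ColoredPoint(x: Int, y: Int, val color: String) extends Point(x, y) {
  // A ColoredPoint only agrees to be compared with another ColoredPoint.
  override def canEqual(other: Any): Boolean = other.isInstanceOf[ColoredPoint]

  override def equals(other: Any): Boolean = other match {
    case that: ColoredPoint =>
      that.canEqual(this) && super.equals(that) && color == that.color
    case _ => false
  }

  override def hashCode(): Int = 31 * super.hashCode() + color.hashCode
}
```

With this shape, a `Point` and a `ColoredPoint` with the same coordinates are unequal in both directions, which is the class check being asked for above.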

Member

srowen commented Apr 5, 2016

This doesn't enable the style check yet, right?

Contributor Author

joan38 commented Apr 5, 2016

Not yet. I wanted some feedback first, before implementing it the wrong way everywhere.
I will push a new version soon with your comments and more (if not the rest).
Once all done I will push with the style check enabled.

Member

srowen commented Apr 12, 2016

@joan38 what do you think about moving forward with the style check, and at least the changes that are uncontroversial here? Some of these are good fixes.

@joan38 joan38 force-pushed the SPARK-6429-HashCode-Equals branch from 8ca6d43 to 45e816a Compare April 12, 2016 21:27
Contributor Author

joan38 commented Apr 12, 2016

Sure, I was busy with another PR.
Do you want to drop the changes to all Partition subtypes as well, or is this good as of commit 87e3be0?


SparkQA commented Apr 12, 2016

Test build #55649 has finished for PR 12157 at commit 45e816a.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@joan38 joan38 force-pushed the SPARK-6429-HashCode-Equals branch from 45e816a to 87e3be0 Compare April 12, 2016 22:32

SparkQA commented Apr 12, 2016

Test build #55656 has finished for PR 12157 at commit 87e3be0.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

srowen commented Apr 13, 2016

I'm wary of giving equals semantics to partitions, since the current semantics in this commit seem incorrect: partition 13 from one RDD is not equal to partition 13 from another. Since it's not technically wrong to implement hashCode without equals, seems like we can be conservative and not make those changes. Adding hashCode is good, as is a style check if possible, as are the other changes.
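To illustrate why overriding hashCode alone is safe, here is a small sketch. `DemoPartition` is a hypothetical stand-in, not Spark's `Partition` trait, and the `rddId` field is an assumption made for the example. The contract only requires that equal objects share a hash code; it does not require that objects with the same hash code be equal.

```scala
// Hypothetical stand-in for a partition class; this is not Spark's Partition trait.
class DemoPartition(val rddId: Int, val index: Int) {
  // Only hashCode is overridden; equals remains reference equality from AnyRef.
  override def hashCode(): Int = index
}

object DemoPartitionCheck {
  val a = new DemoPartition(rddId = 1, index = 13)
  val b = new DemoPartition(rddId = 2, index = 13)

  // Same index, so same hash code, yet the two objects are not equal.
  val sameHash: Boolean = a.hashCode == b.hashCode
  val areEqual: Boolean = a == b
}
```

Here partition 13 of one RDD and partition 13 of another collide in a hash table but are never treated as the same key, which matches the conservative behaviour described above.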

@joan38 joan38 force-pushed the SPARK-6429-HashCode-Equals branch from 87e3be0 to 9e8085d Compare April 13, 2016 08:07

SparkQA commented Apr 13, 2016

Test build #55703 has finished for PR 12157 at commit 9e8085d.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


SparkQA commented Apr 13, 2016

Test build #55704 has finished for PR 12157 at commit ebc512b.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


SparkQA commented Apr 13, 2016

Test build #55706 has finished for PR 12157 at commit 650ae02.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

I don't really mind this, but I think this is overkill when there are 2-3 fields. This could be 31 * x.hashCode() + y.hashCode()
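As a sketch of the simpler form suggested here, the following hypothetical two-field class (`HostPort` is an invented name, not a Spark class) combines its fields with a single prime multiplier. It assumes `host` is never null.

```scala
// Hypothetical two-field class, used only to illustrate the formula.
class HostPort(val host: String, val port: Int) {
  override def equals(other: Any): Boolean = other match {
    case that: HostPort => host == that.host && port == that.port
    case _              => false
  }

  // Multiply the first field's hash by a prime, then add the second field's hash.
  // Assumes host is non-null.
  override def hashCode(): Int = 31 * host.hashCode + port.hashCode
}
```

For more than a handful of fields, a fold over the field hashes may read better, but for two or three fields the explicit expression is short and easy to check against equals.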

Member

srowen commented Apr 13, 2016

Jenkins retest this please


SparkQA commented Apr 13, 2016

Test build #55718 has finished for PR 12157 at commit 650ae02.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

I think the test failure is due to this patch, and it's an NPE somewhere. It could be because a field you're using in a hashCode() is null, which seems possible here. Instead of calling .hashCode() on the field, use Objects.hashCode, which handles null.
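A minimal sketch of the null-safe variant follows. The class `Split` and its fields are hypothetical; `java.util.Objects.hashCode` is the standard JDK method, which returns 0 for a null argument instead of throwing.

```scala
import java.util.Objects

// Hypothetical class whose fields may legitimately be null.
class Split(val path: String, val host: String) {
  // Scala's == is null-safe, so no explicit null checks are needed here.
  override def equals(other: Any): Boolean = other match {
    case that: Split => path == that.path && host == that.host
    case _           => false
  }

  // Objects.hashCode returns 0 for null instead of throwing a NullPointerException.
  override def hashCode(): Int = 31 * Objects.hashCode(path) + Objects.hashCode(host)
}
```

Calling `path.hashCode` directly would throw when `path` is null, which is the kind of failure suspected in the test run above.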

Contributor Author

Good point, I will fix that and rerun CI.

Member

This could have been left as an import, but that alone isn't worth changing.

Member

srowen commented Apr 18, 2016

I think it's a good change because it lets us enforce the fairly important practice of defining equals/hashCode together. This actually forces it to be explicit in all cases where either of the two is defined, which is IMHO a good thing, as it's something that's easy to get subtly wrong.

The new equals() methods don't change behavior; existing hashCode() methods have the same behavior; new hashCode() methods look consistent with equals(). And tests pass. That LGTM.

My remaining comments are just a nit about implementation of the hash codes, and multiplying by a prime number.

@joan38 joan38 force-pushed the SPARK-6429-HashCode-Equals branch from 58b799e to eb5615f Compare April 18, 2016 17:52

SparkQA commented Apr 18, 2016

Test build #56104 has finished for PR 12157 at commit eb5615f.

  • This patch fails from timeout after a configured wait of 250m.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor Author

joan38 commented Apr 18, 2016

Jenkins retest this please


SparkQA commented Apr 19, 2016

Test build #56152 has finished for PR 12157 at commit eb5615f.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


SparkQA commented Apr 19, 2016

Test build #2824 has finished for PR 12157 at commit eb5615f.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor Author

joan38 commented Apr 19, 2016

[error] (streaming-flume-sink/*:mimaFindBinaryIssues) java.lang.ArrayIndexOutOfBoundsException: 1497

Jenkins retest this please

Member

srowen commented Apr 19, 2016

Jenkins retest this please


SparkQA commented Apr 19, 2016

Test build #56217 has finished for PR 12157 at commit eb5615f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor Author

joan38 commented Apr 19, 2016

@srowen Thanks. All good

Member

srowen commented Apr 20, 2016

I think we've got just two more things to change: a) a rebase, and b) using prime numbers as multipliers everywhere. I can't see anything else then.

@joan38 joan38 force-pushed the SPARK-6429-HashCode-Equals branch from eb5615f to 02b397e Compare April 20, 2016 19:20

SparkQA commented Apr 20, 2016

Test build #56396 has finished for PR 12157 at commit 02b397e.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@joan38 joan38 force-pushed the SPARK-6429-HashCode-Equals branch from 02b397e to 8ce5135 Compare April 21, 2016 01:08

SparkQA commented Apr 21, 2016

Test build #56451 has finished for PR 12157 at commit 8ce5135.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

}

class MockSplitInfo(host: String) extends SplitInfo(null, host, null, 1, null) {
+  override def hashCode(): Int = Random.nextInt()
Contributor Author

I've added this so that it matches the equals behaviour.

Member

This is wrong now though. hashCode has to be deterministic and always return the same value. There is nothing wrong with always returning 0. The problem is actually with the equals method, but, it won't matter here.
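The difference can be sketched with two hypothetical key classes (neither is the `MockSplitInfo` from the test suite, and their equals methods are assumptions made for the example). A random hash code breaks hash-based collections because the bucket chosen at insertion is unlikely to match the bucket probed at lookup; a constant hash code is inefficient but still correct.

```scala
import scala.collection.mutable
import scala.util.Random

// Hypothetical classes for illustration only.
class UnstableKey(val host: String) {
  override def equals(other: Any): Boolean = other match {
    case that: UnstableKey => host == that.host
    case _                 => false
  }
  // Violates the contract: the value changes on every call.
  override def hashCode(): Int = Random.nextInt()
}

class ConstantKey(val host: String) {
  override def equals(other: Any): Boolean = other match {
    case that: ConstantKey => host == that.host
    case _                 => false
  }
  // Legal but slow: every instance hashes to the same value.
  override def hashCode(): Int = 0
}

object ConstantKeyCheck {
  val set = mutable.HashSet[ConstantKey](new ConstantKey("host1"))
  // Lookup works because the hash code is stable and equals compares hosts.
  val found: Boolean = set.contains(new ConstantKey("host1"))
}
```

With `UnstableKey`, the same lookup would fail almost every time even though equals reports the two keys as equal.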

Contributor Author

Makes sense.


SparkQA commented Apr 21, 2016

Test build #56525 has finished for PR 12157 at commit f11b112.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


 class MockSplitInfo(host: String) extends SplitInfo(null, host, null, 1, null) {
-  override def hashCode(): Int = Random.nextInt()
+  override def hashCode(): Int = 0
Member

Since you need to re-run the tests anyway -- also remove the unneeded import of scala.util.Random now


SparkQA commented Apr 21, 2016

Test build #56538 has finished for PR 12157 at commit ba5633c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

srowen commented Apr 21, 2016

LGTM. Thanks for sticking with it. If there are no more comments today I'll merge.

Member

srowen commented Apr 22, 2016

Merged to master

@asfgit asfgit closed this in bf95b8d Apr 22, 2016