
Conversation

@huaxingao (Contributor)

What changes were proposed in this pull request?

In the algorithms that support instance weights, add checks to make sure the instance weight is not negative.
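
For illustration, a hedged usage sketch of the new behavior (assuming an existing SparkSession `spark`; LinearRegression is used only as one example of an estimator with a `weightCol`, and the data is made up): fitting with a negative weight is expected to fail fast with an IllegalArgumentException instead of training on invalid input.

import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.ml.regression.LinearRegression

// Toy data: the second row carries an (invalid) negative instance weight.
val df = spark.createDataFrame(Seq(
  (1.0, 1.0, Vectors.dense(1.0, 2.0)),
  (2.0, -0.5, Vectors.dense(2.0, 3.0))
)).toDF("label", "weight", "features")

// With the added checks, this fit is expected to throw an
// IllegalArgumentException from require() because of the -0.5 weight.
new LinearRegression().setWeightCol("weight").fit(df)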

Why are the changes needed?

Instance weights have to be >= 0.0.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Manually tested

@SparkQA commented May 23, 2020

Test build #123027 has finished for PR 28621 at commit 4ba0b1d.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@huaxingao (Contributor, Author)

retest this please

@SparkQA commented May 23, 2020

Test build #123030 has finished for PR 28621 at commit 4ba0b1d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen (Member) left a comment

Looks generally fine if this consistently checks it in one place, and does so without generating more Spark jobs.


dataset.select(col($(labelCol)).cast(DoubleType), w, col($(featuresCol))).rdd.map {
  case Row(label: Double, weight: Double, features: Vector) =>
    require(weight >= 0.0, "illegal weight value: " + weight + " weight must be >= 0.0")

I was going to wonder if we should require weights to be positive, but I don't think we do now. I guess we tend to catch it when we check whether all weights are 0 (sum = 0).
BTW you can use string interpolation in these error messages.
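
For instance, the same message using interpolation (exact wording illustrative):

require(weight >= 0.0, s"illegal weight value: $weight. weight must be >= 0.0")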

@zhengruifeng (Contributor)

nit: What about adding a checkNonNegative (double -> double) udf in ml.functions and using it in .ml? I guess it may be a little simpler.
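
A rough sketch of what that could look like; the helper name, its placement in ml.functions, and the message wording are only a suggestion, not the PR's implementation:

import org.apache.spark.sql.functions.udf

// Hypothetical helper (e.g. in org.apache.spark.ml.functions): validates and
// passes the weight through, so it can be applied directly to the weight column.
val checkNonNegativeWeight = udf { weight: Double =>
  require(weight >= 0.0, s"illegal weight value: $weight. weight must be >= 0.0")
  weight
}

// Usage sketch at a call site:
// dataset.select(col($(labelCol)).cast(DoubleType), checkNonNegativeWeight(w), col($(featuresCol)))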

@keypointt (Contributor)

Nit: add a common util method to generate the error string? The strings seem duplicated, and we would need to update each of them if anything changes in the future :)
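
For instance, a shared helper along these lines (name and placement hypothetical):

// Hypothetical shared util so the error text is defined in one place.
private[ml] def invalidWeightMessage(weight: Double): String =
  s"illegal weight value: $weight. weight must be >= 0.0"

// require(weight >= 0.0, invalidWeightMessage(weight))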

@SparkQA commented May 25, 2020

Test build #123078 has finished for PR 28621 at commit e1b4422.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen (Member) commented May 25, 2020

Weird, errors like org.scalatest.exceptions.TestFailedException: Expected [5.080000000000123,-0.600000000000043] and [-0.96,1.7] to be within 0.001 using absolute tolerance for all elements. Let me retest

@srowen (Member) commented May 25, 2020

Jenkins test this please

@SparkQA commented May 25, 2020

Test build #123088 has finished for PR 28621 at commit e1b4422.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented May 25, 2020

Test build #123094 has finished for PR 28621 at commit b2a2166.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen (Member) commented May 27, 2020

Merged to master

@srowen closed this in 50492c0 May 27, 2020
@huaxingao (Contributor, Author)

Thank you all!

@huaxingao deleted the weight_check branch May 27, 2020 15:19